Issue loading trajectory with pyemma.coordinates.source when topology file changes #1541
Comments
Hi, I can reproduce this error locally. I believe it's a problem with the topology cache. (This cache is there because it accelerates loading large sets of trajectories with the same topology.) As far as I can see, it is implemented using an LRU cache here: PyEMMA/pyemma/coordinates/util/patches.py, lines 40 to 52 in 8c2bc84.
From this code it also becomes clear why your workaround works. Unfortunately, I don't think the cache can be turned off, as there is no option for it in the config. Maybe @clonker knows how to fix the cache?
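To illustrate the suspected mechanism, here is a minimal sketch of a cache keyed only by file name. It does not use PyEMMA's actual code: `load_topology_cached` and `write_fake_pdb` are hypothetical stand-ins, and the "topology" is just a count of ATOM lines.

```python
import functools
import os
import tempfile


@functools.lru_cache(maxsize=32)
def load_topology_cached(path):
    """Hypothetical stand-in for a topology loader: counts ATOM records.

    The cache key is only the path string, so a changed file is not noticed.
    """
    with open(path) as fh:
        return sum(1 for line in fh if line.startswith("ATOM"))


def write_fake_pdb(path, n_atoms):
    """Write a toy file with n_atoms ATOM lines (not a valid PDB)."""
    with open(path, "w") as fh:
        fh.write("ATOM\n" * n_atoms)


workdir = tempfile.mkdtemp()
pdb = os.path.join(workdir, "X.pdb")

write_fake_pdb(pdb, 10)
first = load_topology_cached(pdb)   # cache miss: the file is read

write_fake_pdb(pdb, 18)             # same name, new contents
second = load_topology_cached(pdb)  # cache hit on the path: stale result
```

In this sketch `second` still reflects the old 10-atom file. Passing an already loaded topology object instead of a path would sidestep a path-keyed lookup like this one, which would explain why the workaround helps.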
I'll take a look at it. 🙂 Thanks for reporting this @AnjaConev!
Thank you both for a quick response!
Definitely sounds like a potential fix! Thanks 🚀
Opted for extending the LRU cache key by a hash of the first MB of the file contents plus the last-modified and creation dates; that should be unique enough 🙂
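A rough sketch of that idea, for readers following along. This is not the actual patch: `file_fingerprint`, `_load_topology`, and `load_topology` are hypothetical names, and the loader again just counts ATOM lines. Note that on Unix `st_ctime` is the time of the last metadata change rather than a true creation time.

```python
import functools
import hashlib
import os
import tempfile


def file_fingerprint(path, n_bytes=1024 * 1024):
    """Return (hash of the first n_bytes, mtime, ctime) for the file at path."""
    st = os.stat(path)
    with open(path, "rb") as fh:
        digest = hashlib.sha256(fh.read(n_bytes)).hexdigest()
    # st_ctime_ns is the metadata-change time on Unix, creation time on Windows.
    return (digest, st.st_mtime_ns, st.st_ctime_ns)


@functools.lru_cache(maxsize=32)
def _load_topology(path, fingerprint):
    """Hypothetical loader; fingerprint only takes part in the cache key."""
    with open(path) as fh:
        return sum(1 for line in fh if line.startswith("ATOM"))


def load_topology(path):
    """Cached load that misses the cache whenever the file's fingerprint changes."""
    return _load_topology(path, file_fingerprint(path))


def write_fake_pdb(path, n_atoms):
    """Write a toy file with n_atoms ATOM lines (not a valid PDB)."""
    with open(path, "w") as fh:
        fh.write("ATOM\n" * n_atoms)


workdir = tempfile.mkdtemp()
pdb = os.path.join(workdir, "X.pdb")

write_fake_pdb(pdb, 10)
before = load_topology(pdb)

write_fake_pdb(pdb, 18)      # same name, new contents, new fingerprint
after = load_topology(pdb)   # cache miss, so the new file is read
```

The trade-off is one `stat` call and a read of up to 1 MB per lookup, which is small next to parsing a large topology file.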
Nice approach 😄
Hello,
Thanks for this amazing package - it has been very fun to work with and very reliable!
I have encountered a strange issue when loading the trajectory with pyemma.coordinates.source.
(PyEMMA version 2.5.7 on Debian GNU/Linux 9)
Here is the example of what is happening:
I have a trajectory and a reference PDB file (X.xtc and X.pdb). They both have 10 atoms.
I load the trajectory for the first time with:
pyemma.coordinates.source("./X.xtc", top="./X.pdb")
Everything works as expected.
Later in my workflow I have new files with the same names (X.xtc, X.pdb), but they now refer to a new trajectory with 18 atoms.
I make the same call to load this new trajectory:
pyemma.coordinates.source("./X.xtc", top="./X.pdb")
It now fails with the error:
ValueError: xyz must be shape (Any, 10, 3). You supplied (1, 18, 3)
If I reload the package and call source again with the new files it works.
It seems like the PDB file name gets cached somewhere and pyemma.coordinates.source still thinks that we are dealing with the old X.pdb.
I was able to work around this issue without reloading the package by running:
import mdtraj
import pyemma
mdtraj_top = mdtraj.load("./X.pdb").topology
pyemma.coordinates.source("./X.xtc", top=mdtraj_top)
The code for reproducing this example and the corresponding files are in example.zip
I just wanted to report this as it might be an unexpected behavior and I was stuck for a while on this issue.
My python environment: pip_list.txt
Full error message:
