-
Notifications
You must be signed in to change notification settings - Fork 677
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed the run() method of analysis.hydrogenbonds.hbond_analysis method to use atom indices instead of ids #2572
Conversation
Fixed the attribute error (MDAnalysis#2396). Added test cases that check the ids and indices of atoms in a hydrogen bond. Co-authored-by: p-j-smith <[email protected]>
Codecov Report
@@ Coverage Diff @@
## develop #2572 +/- ##
===========================================
- Coverage 90.71% 90.70% -0.02%
===========================================
Files 174 174
Lines 23555 23554 -1
Branches 3073 3073
===========================================
- Hits 21369 21365 -4
- Misses 1565 1569 +4
+ Partials 621 620 -1
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor issues, see comments.
Please also update CHANGELOG as this fixes an issue.
testsuite/MDAnalysisTests/analysis/test_hydrogenbonds_analysis.py
Outdated
Show resolved
Hide resolved
testsuite/MDAnalysisTests/analysis/test_hydrogenbonds_analysis.py
Outdated
Show resolved
Hide resolved
Tidied up the mock universe in TestHydrogenBondAnalysisMock. Using numpy asserts rather than bare asserts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe replace hasattr(u, 'bonds')
by hasattr(u._topology, 'bonds')
for performance reasons (see comment below). Since I don't know how long the analysis usually takes (per frame), I don't know if this is an issue. So I leave the decision up to you. If you decide to follow my suggestion, the reason should be commented in the code.
The bonds lookup is because it actually calculates them too right? I think
that example is why I wanted to make Topology first class, so you could do
“‘bonds’ in u.topology”, else there’s not a clean/canonical way to check
for existence of attributes.
…On Sun, Mar 8, 2020 at 21:14, Johannes Zeman ***@***.***> wrote:
***@***.**** requested changes on this pull request.
Maybe replace hasattr(u, 'bonds') by hasattr(u._topology, 'bonds') for
performance reasons (see comment). Since I don't know how long the analysis
usually takes (per frame), I don't know if this is an issue. So I leave the
decision up to you. If you decide to follow my suggestion, the reason
should be commented in the code.
------------------------------
In package/MDAnalysis/analysis/hydrogenbonds/hbond_analysis.py
<#2572 (comment)>
:
> @@ -408,7 +409,7 @@ def _get_dh_pairs(self):
# If donors_sel is not provided, use topology to find d-h pairs
if not self.donors_sel:
if not (hasattr(self.u, 'bonds') and len(self.u.bonds) != 0):
I also posted
<#2396 (comment)>
this in the related issue for reference:
hasattr(u, 'bonds') is an *incredibly slow* test because it involves a
lot of object instantiations under the hood. hasattr(u._topology, 'bonds')
is *much* faster:
import MDAnalysis as mda
from MDAnalysis.tests.datafiles import GRO, TPR
u = mda.Universe(GRO) # universe *without* bonds
%timeit hasattr(u, 'bonds')
# 758 µs ± 4.33 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit hasattr(u._topology, 'bonds')
# 216 ns ± 2.88 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
u = mda.Universe(TPR) # universe *with* bonds
%timeit hasattr(u, 'bonds')
# 88.8 ms ± 524 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) <== AAARGH!
%timeit hasattr(u._topology, 'bonds')
# 86.4 ns ± 0.266 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
You see that in the latter example, hasattr(u, 'bonds') is *a million
times slower* than hasattr(u._topology, 'bonds'). This may be irrelevant
for a setup routine that is run only once, but when being called in a loop, hasattr(u,
'bonds') is a nice way of very thoroughly killing performance. 😁
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2572?email_source=notifications&email_token=ACGSGB5L4LVQ27RQ5Y6P7DTRGQDEPA5CNFSM4K7F64K2YY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOCYNNRQA#pullrequestreview-370858176>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACGSGB7WAQRDWCWR4RY5NXTRGQDEPANCNFSM4K7F64KQ>
.
|
It is a million times faster to access. See MDAnalysis#2396 (comment)
@zemanj are you happy with the changes? I'll leave it to you to shepherd the PR to the final merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks awesome, just found some minor thingies.
Your tests are really neat, especially the ASCII art 😁👍
testsuite/MDAnalysisTests/analysis/test_hydrogenbonds_analysis.py
Outdated
Show resolved
Hide resolved
testsuite/MDAnalysisTests/analysis/test_hydrogenbonds_analysis.py
Outdated
Show resolved
Hide resolved
testsuite/MDAnalysisTests/analysis/test_hydrogenbonds_analysis.py
Outdated
Show resolved
Hide resolved
Thank you @bieniekmateusz @p-j-smith! |
Changes made in this Pull Request:
analysis.hydrogenbonds.hbond_analysis.HydrogenBondsAnalysis
currently stores the ids of atoms in a hydrogen bond, rather than the indices. The documentation states, and the helper methods assume, that it is atom indices that are stored.PR Checklist
Co-authored-by: @p-j-smith [email protected]