Fixed the run() method of analysis.hydrogenbonds.hbond_analysis method to use atom indices instead of ids #2572

bieniekmateusz · 2020-03-01T16:00:29Z

Changes made in this Pull Request:

The run() method of analysis.hydrogenbonds.hbond_analysis.HydrogenBondsAnalysis currently stores the ids of atoms in a hydrogen bond, rather than the indices. The documentation states, and the helper methods assume, that it is atom indices that are stored.
We have fixed this and added test cases that check the ids and indices of atoms in a hydrogen bond.
Also fixes "AttributeError when using MDAnalysis.analysis.hydrogenbonds.hbond_analysis with topology file without bonds information" (#2396).
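To illustrate why storing ids instead of indices matters, here is a minimal sketch in plain Python. The `ToyAtom` class and `build_atoms` helper are hypothetical and are not part of MDAnalysis; they only assume that an index is a 0-based position in the container, while an id is the serial number taken from the topology file (often 1-based).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyAtom:
    """Hypothetical atom record, not the MDAnalysis Atom class."""
    index: int  # 0-based position in the container
    id: int     # serial number as written in the topology file
    name: str


def build_atoms(names, ids):
    """Pair each name with a file id and assign 0-based indices."""
    return [
        ToyAtom(index=i, id=atom_id, name=name)
        for i, (name, atom_id) in enumerate(zip(names, ids))
    ]


# Assumed example data: ids start at 1, as in many topology formats.
atoms = build_atoms(["O", "H1", "H2", "OW"], [1, 2, 3, 4])
donor = atoms[0]

# Looking the donor up again by its index returns the same atom ...
by_index = atoms[donor.index].name
# ... but using its id as if it were an index lands on the next atom.
by_id = atoms[donor.id].name
```

In this toy setup, any helper that treats a stored id as an index silently picks the wrong atom, which is the kind of mismatch the added tests are meant to catch.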

PR Checklist

Tests?
Docs?
CHANGELOG updated?
Issue raised/referenced?

Co-authored-by: @p-j-smith [email protected]

Fixed the attribute error (MDAnalysis#2396). Added test cases that check the ids and indices of atoms in a hydrogen bond. Co-authored-by: p-j-smith <[email protected]>

codecov · 2020-03-01T16:51:11Z

Codecov Report

Merging #2572 into develop will decrease coverage by 0.01%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop    #2572      +/-   ##
===========================================
- Coverage    90.71%   90.70%   -0.02%     
===========================================
  Files          174      174              
  Lines        23555    23554       -1     
  Branches      3073     3073              
===========================================
- Hits         21369    21365       -4     
- Misses        1565     1569       +4     
+ Partials       621      620       -1

Impacted Files	Coverage Δ
...DAnalysis/analysis/hydrogenbonds/hbond_analysis.py	`97.29% <100.00%> (+1.84%)`	⬆️
coordinates/base.py	`94.33% <0.00%> (-0.63%)`	⬇️
util.py	`88.06% <0.00%> (-0.02%)`	⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update da67789...2801f2d.

orbeckst

Minor issues, see comments.

Please also update CHANGELOG as this fixes an issue.

package/MDAnalysis/analysis/hydrogenbonds/hbond_analysis.py

testsuite/MDAnalysisTests/analysis/test_hydrogenbonds_analysis.py

Tidied up the mock universe in TestHydrogenBondAnalysisMock. Using numpy asserts rather than bare asserts.

Co-authored-by: p-j-smith <[email protected]>

zemanj

Maybe replace hasattr(u, 'bonds') by hasattr(u._topology, 'bonds') for performance reasons (see comment below). Since I don't know how long the analysis usually takes (per frame), I don't know if this is an issue. So I leave the decision up to you. If you decide to follow my suggestion, the reason should be commented in the code.

package/MDAnalysis/analysis/hydrogenbonds/hbond_analysis.py

richardjgowers · 2020-03-09T08:41:14Z

The bonds lookup is slow because it actually calculates them too, right? I think that example is why I wanted to make Topology first class, so you could do 'bonds' in u.topology; otherwise there's no clean/canonical way to check for the existence of attributes.
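A membership check of that kind could look roughly like the sketch below. `ToyTopology` is a hypothetical class written for illustration only; it does not reflect how the MDAnalysis Topology is implemented, and it only assumes that topology attributes are stored under string names.

```python
class ToyTopology:
    """Hypothetical container of named topology attributes."""

    def __init__(self, **attrs):
        # Keep only what was actually read from the file.
        self._attrs = dict(attrs)

    def __contains__(self, name):
        # Enables: 'bonds' in topology
        return name in self._attrs


with_bonds = ToyTopology(names=["O", "H1"], bonds=[(0, 1)])
without_bonds = ToyTopology(names=["O", "H1"])

has_bonds = "bonds" in with_bonds
lacks_bonds = "bonds" in without_bonds
```

The check is a plain dictionary lookup, so it never has to construct any bond objects.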

…

On Sun, Mar 8, 2020 at 21:14, Johannes Zeman ***@***.***> wrote:

***@***.**** requested changes on this pull request.

Maybe replace hasattr(u, 'bonds') by hasattr(u._topology, 'bonds') for performance reasons (see comment). Since I don't know how long the analysis usually takes (per frame), I don't know if this is an issue. So I leave the decision up to you. If you decide to follow my suggestion, the reason should be commented in the code.

In package/MDAnalysis/analysis/hydrogenbonds/hbond_analysis.py <#2572 (comment)>:

    @@ -408,7 +409,7 @@ def _get_dh_pairs(self):
    # If donors_sel is not provided, use topology to find d-h pairs
    if not self.donors_sel:
        if not (hasattr(self.u, 'bonds') and len(self.u.bonds) != 0):

I also posted <#2396 (comment)> this in the related issue for reference:

hasattr(u, 'bonds') is an *incredibly slow* test because it involves a lot of object instantiations under the hood. hasattr(u._topology, 'bonds') is *much* faster:

    import MDAnalysis as mda
    from MDAnalysis.tests.datafiles import GRO, TPR

    u = mda.Universe(GRO)  # universe *without* bonds
    %timeit hasattr(u, 'bonds')
    # 758 µs ± 4.33 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    %timeit hasattr(u._topology, 'bonds')
    # 216 ns ± 2.88 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

    u = mda.Universe(TPR)  # universe *with* bonds
    %timeit hasattr(u, 'bonds')
    # 88.8 ms ± 524 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) <== AAARGH!
    %timeit hasattr(u._topology, 'bonds')
    # 86.4 ns ± 0.266 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

You see that in the latter example, hasattr(u, 'bonds') is *a million times slower* than hasattr(u._topology, 'bonds'). This may be irrelevant for a setup routine that is run only once, but when being called in a loop, hasattr(u, 'bonds') is a nice way of very thoroughly killing performance. 😁
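The general mechanism behind the slowdown described in the quoted comment can be sketched in plain Python: `hasattr` works by trying to get the attribute, so if the attribute is a property that builds objects on demand, the whole construction runs just to answer yes or no. `ToyTopology` and `ToyUniverse` below are hypothetical stand-ins, not the MDAnalysis classes, and the counter exists only to make the effect visible.

```python
class ToyTopology:
    """Hypothetical topology that stores bonds only if they were read."""

    def __init__(self, bonds=None):
        if bonds is not None:
            self.bonds = bonds  # plain attribute, cheap to look up


class ToyUniverse:
    """Hypothetical universe exposing bonds through a lazy property."""

    def __init__(self, topology):
        self._topology = topology
        self.bond_builds = 0  # counts how often the expensive path ran

    @property
    def bonds(self):
        # Simulates building bond objects on every access.
        self.bond_builds += 1
        raw = self._topology.bonds  # AttributeError if no bond data
        return [tuple(pair) for pair in raw]


u = ToyUniverse(ToyTopology(bonds=[(0, 1), (0, 2)]))

cheap = hasattr(u._topology, "bonds")  # simple attribute lookup
builds_after_cheap = u.bond_builds

costly = hasattr(u, "bonds")           # runs the property getter
builds_after_costly = u.bond_builds

# Without bond data the getter raises AttributeError, so hasattr is False.
no_bonds = ToyUniverse(ToyTopology())
missing = hasattr(no_bonds, "bonds")
```

Both checks give the same answer here, but only the second one pays for constructing the bond list, which is why the cost grows with the number of bonds in the system.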

It is a million times faster to access. See MDAnalysis#2396 (comment)

orbeckst · 2020-03-11T23:52:08Z

@zemanj are you happy with the changes? I'll leave it to you to shepherd the PR to the final merge.

zemanj

Looks awesome, just found some minor thingies.
Your tests are really neat, especially the ASCII art 😁👍

package/CHANGELOG

package/MDAnalysis/analysis/hydrogenbonds/hbond_analysis.py

testsuite/MDAnalysisTests/analysis/test_hydrogenbonds_analysis.py

zemanj · 2020-03-16T21:42:45Z

Thank you @bieniekmateusz @p-j-smith!

bieniekmateusz · 2020-03-16T21:42:57Z

> @zemanj are you happy with the changes? I'll leave it to you to shepherd the PR to the final merge.

Thanks @zemanj, would you mind merging it when you get a chance? Thanks, Mat

Fixed the run() method to use atom indices instead of ids.

ee38518

Fixed the attribute error (MDAnalysis#2396). Added test cases that check the ids and indices of atoms in a hydrogen bond. Co-authored-by: p-j-smith <[email protected]>

orbeckst requested changes Mar 5, 2020

Paul Smith added 3 commits March 6, 2020 10:29

Replaced bare Exception with NoDataError.

3a8e187

Added a test for _get_dh_pairs().

06f9fa2

Tidied up the mock universe in TestHydrogenBondAnalysisMock. Using numpy asserts rather than bare asserts.

Updated the CHANGELOG.

00d0db1

orbeckst mentioned this pull request Mar 6, 2020

AttributeError when using MDAnalysis.analysis.hydrogenbonds.hbond_analysis with topology file without bonds information #2396

Closed

bieniekmateusz and others added 2 commits March 7, 2020 19:07

Fixed the test case for the NoDataError when bond info is missing.

9be38c9

Co-authored-by: p-j-smith <[email protected]>

Merge branch 'develop' into hbondtests

920e8dc

zemanj requested changes Mar 8, 2020

package/MDAnalysis/analysis/hydrogenbonds/hbond_analysis.py

Using u._topology.bonds rather than u.bonds.

9389ecf

It is a million times faster to access. See MDAnalysis#2396 (comment)

orbeckst approved these changes Mar 9, 2020

orbeckst mentioned this pull request Mar 9, 2020

hydrogenbond analysis should check for missing bonds in the same way as the serial code MDAnalysis/pmda#118

Open

orbeckst assigned zemanj Mar 11, 2020

zemanj requested changes Mar 12, 2020

bieniekmateusz and others added 2 commits March 13, 2020 11:09

Minor refactoring (MDAnalysis#2572)

6efffec

Merge branch 'develop' into hbondtests

2801f2d

bieniekmateusz mentioned this pull request Mar 13, 2020

Consistent Autocorrelation and Intermittency - Across Different Analysis #2256

Merged

4 tasks

zemanj approved these changes Mar 16, 2020

zemanj merged commit 7cb4184 into MDAnalysis:develop Mar 16, 2020

fiona-naughton added defect Component-Analysis labels Sep 26, 2023
