-
Notifications
You must be signed in to change notification settings - Fork 233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Terachem #1788
Terachem #1788
Conversation
142c6bb
to
c388681
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made some comments to see whether you agree. They are mostly for the terachem Parser. for the moments of inertia, I could n't recreate the Gaussian output for CH4 manually. I will keep looking into that.
arkane/logs/terachem.py
Outdated
if len(coords) == 0 or len(numbers) == 0 or len(masses) == 0 \ | ||
or ((len(coords) != num_of_atoms or len(numbers) != num_of_atoms or len(masses) != num_of_atoms) | ||
and num_of_atoms is not None): | ||
raise LogError(f'Unable to read atoms from TeraChem geometry output file {self.path}.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happened if someone gives geometry optimization output file as output? I think it better to suggest user give frequency calculation, output.geomatry or .xyz or if parse could n't find a Since there is .xyz, ouput.geomatry and frequency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
excellent suggestion, added
arkane/logs/terachem.py
Outdated
# Matrix element rows | ||
for j in range(n_rows): # for j in range(i*6, Nrows): | ||
line = f.readline() | ||
while len(line.split()) != 7: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will happen if you have an odd number of atoms eg. 11. you will have a 33*33 matrix and the last block will only have 3 columns, which means while loop will exit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's add as a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch! test added
arkane/logs/terachem.py
Outdated
if 'Scan Cycle' in lines[line_index]: | ||
# example: '-=#=- Scan Cycle 7/36 -=#=-' | ||
v_index += 1 | ||
if 'FINAL ENERGY:' in lines[line_index]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to grab "Optimized Energy" value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see Optimized Energy
in a scan log file. Let's talk about it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!! Comments were addressed, and fixup commits were added
arkane/logs/terachem.py
Outdated
if len(coords) == 0 or len(numbers) == 0 or len(masses) == 0 \ | ||
or ((len(coords) != num_of_atoms or len(numbers) != num_of_atoms or len(masses) != num_of_atoms) | ||
and num_of_atoms is not None): | ||
raise LogError(f'Unable to read atoms from TeraChem geometry output file {self.path}.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
excellent suggestion, added
arkane/logs/terachem.py
Outdated
if 'Scan Cycle' in lines[line_index]: | ||
# example: '-=#=- Scan Cycle 7/36 -=#=-' | ||
v_index += 1 | ||
if 'FINAL ENERGY:' in lines[line_index]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see Optimized Energy
in a scan log file. Let's talk about it
arkane/logs/terachem.py
Outdated
# Matrix element rows | ||
for j in range(n_rows): # for j in range(i*6, Nrows): | ||
line = f.readline() | ||
while len(line.split()) != 7: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch! test added
958b630
to
059fb6f
Compare
Remaining things to do:
|
ff781c0
to
3c2f5dd
Compare
@dranasinghe, I addressed all the comments we discussed (online and offline), please take a look when you get a chance. |
@goldmanm, I implemented in TeraChem the rotor scan checks we discussed, please take a look and let me know what you think. Feel free to replicate for Gaussian and/or make suggestions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank for addressing the issues. The code looks good. :)
The part about rotor scans looks good. I had one comment about it, listed below. I've implemented the gaussian changes, and I'm just waiting on the more substantial bugfix pr #1810 to go through to reduce merge conflicts. |
arkane/logs/terachem.py
Outdated
logging.info(' Assuming {0} is the output from a TeraChem PES scan...'.format(os.path.basename(self.path))) | ||
|
||
v_list.append(v_list[0]) # TeraChem does not repeat the first scan point (unlike Gaussian), append it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we need to repeat the data at the end? I think adding data should be avoided if possible. The only reason I can think of is how we define 'angles', will have a point at the end.
A workaround might be to not add the extra datapoint and to replace the angle definition on line 310 so that it does not have the 2$\pi$ angle.
angles = np.linspace(0, 2 * math.pi * (len(v_list) -1) / len(v_list), len(v_list), np.float64)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for example, if there are 4 data points (90 degree spacing), the angles output will be 0, 0.5$\pi$,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found a way to repeat the last point in TeraChem instead, I'll change these lines. We do it as a quality assurance method of checking that a 360 scan gives almost the same energy as the initial point (checked in ARC)
Codecov Report
@@ Coverage Diff @@
## master #1788 +/- ##
==========================================
- Coverage 43.02% 41.85% -1.17%
==========================================
Files 80 82 +2
Lines 21099 21574 +475
Branches 5516 5653 +137
==========================================
- Hits 9077 9030 -47
- Misses 11004 11539 +535
+ Partials 1018 1005 -13
Continue to review full report at Codecov.
|
@mjohnson541, thanks for the comments, I reverted the style changes (rebased and squashed). That commit was not removed, though, it instead applies a consistent style to all of Arkan'es files (now keeping the #### lines). The only exception is common.py where the #### lines are deleted, but added in a subsequent commit. Sorry for that, it just created an annoying conflict otherwise. |
2d3f042
to
0995767
Compare
@mjohnson541, I think all comments were addressed. This branch is now rebased. Please feel free to merge unless there are additional concerns. |
Soon we'll add support for many additional ESS, for better organization and readability, these files were relocated.
And restructured this function
And removed existing determine_qm_software tests from GaussianTest
Motivation or Problem
We'd like to expand the electronic structure software Arkane supports
Description of Changes
Added full support for TeraChem.
I couldn't figure out how to parse the moments of inertia out of a TeraChem log file (I think they might not be reported, although they are certainly calculated and used internally), so some general functionalities for calculating them from xyz were added to common.py. Note that neither RDKit nor OB were used to have the capability to calculate moments of inertia for TSs as well. These functions are used in statmech if no rotational mode exists in conformer.modes. Could be useful in the future for additional ESS. I verified that the results are in agreement with those Gaussian would have calculated for a few species, and added tests.
Sine we have additional future plans to expand ESS support in Arkane (Orca, Psi4), we'll end up with many ESS log parsing and respective tests files under the
arkane
folder. For this to not get out of hand, alogs
subdir was created for all ESS log files and their tests, including theLog
parent class.Testing
Plenty of tests were added.
Reviewer Tips
See that the code makes sense, verify that the tests are comprehensive and run them.
I'm tagging @dranasinghe as the reviewer and @mjohnson541 as the merger.