Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File open fix #471

Merged
merged 21 commits into from
Jan 8, 2019
Merged

File open fix #471

merged 21 commits into from
Jan 8, 2019

Conversation

justinGilmer
Copy link
Contributor

Brought up in #469, tempfile.mkstemp was being used for the generation of temporary files for feeding to PACKMOL.

However, the files were not being closed, which led to eventual program crashes after the ulimit of the process was reached.

Unlike TemporaryFile(), the user of mkstemp() is responsible for deleting the temporary file when done with it.

To fix this, mkstemp was replaced with NamedTemporaryFile

This is a higher-level interface to mkstemp and has increased benefits of garbage collection, etc.

The files are also now under try finally blocks, which ensures the files are closed and deleted whenever the program finishes, goes out of scope, or is interrupted.

`tempfile.mkstemp` provides a low level interface for interacting
with temporary files. However, garbage collection (closing files,
deleting files after use) must be manually taken care of.

Packing.py uses temp files when packing or solvating a system,
these files are never closed. This can cause the program to
reach the limit of open files for a process set by the OS level
`ulimit`.

These changes migrate from using `mkstemp` to a higher level interface
`tempfile.NamedTemporaryFile`, providing the same type of temporary
file, but with saner garbage collection and scope.

The sections where temp files are made are now contained in `try
finally` sections, to ensure proper file closure and deletion when the
program finishes, or is interrupted.
NamedTemporaryFile provides a file-like object to interact with
compared to mkstemp.

To get the full path, the `name` attribute of the NamedTemporaryFile
is needed.
@justinGilmer justinGilmer added the WIP Work in Progress, not ready for merging label Nov 14, 2018
@mattwthompson
Copy link
Member

We should definitely strive to abstract this logic so as to not have so much repeated code. I was originally hoping this could all be contained within _run_packmol. I wonder if solvated_pdb and its equivalents even need to be created before _run_packmol? The public functions just need a path to a file to be included in input_text, not necessarily a file that exists at that point.

@justinGilmer
Copy link
Contributor Author

We should definitely strive to abstract this logic so as to not have so much repeated code.

Completely agree, looks like I have the logic working now. It properly deletes files when they are no longer needed and closes them.

I was originally hoping this could all be contained within _run_packmol. I wonder if solvated_pdb and its equivalents even need to be created before _run_packmol?

I am less familiar with this code at the moment. So I decided to just emulate current behavior. But this can definitely cleaned up.

@justinGilmer justinGilmer self-assigned this Nov 14, 2018
@justinGilmer
Copy link
Contributor Author

I am not sure how to accurately validate this using unit tests, however. I verified it on my local machine, maybe if a few other people could also do that we can use that as a confirmation at the moment?

Also open to other suggestions!

@justinGilmer
Copy link
Contributor Author

Ive abstracted away the file generation and the updated topology generation to separate functions. @mattwthompson How does that look? If you have any other recommendations, let me know. Hopefully we can get this merged soon.

@mattwthompson
Copy link
Member

Help me understand the try/finally blocks? Why can't we just close the files after calling packmol and building the topology?


finally:
filled_pdb.close()
compound_pdb.close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this logic will only close the last compound in the list of compounds stored in the compound argument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good catch since I am only closing one file. However, I am now curious how Python handles it. If I check tmp before and after running the mwe.py from #469 the amount of files are the same.

Maybe that finally clause also cleans up the other files?

I guess the cleanest way to make sure all of these files are closed would be to append all these file objects to a list and then iterate through and delete them afterwards.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That result doesn't seem intuitive to me, isn't a new compound_pdb object being generated each iteration?

This also means a unit test, if we can manage one, would need to have a list of compounds instead of a single one.

@justinGilmer
Copy link
Contributor Author

The try/finally blocks make sure that no matter the condition (minus a power failure, I think) that the code will be executed. Here is a pretty good description from the docs.

The try statement has another optional clause which is intended to define clean-up actions that must be executed under all circumstances.

What could happen without try/finally if the program was killed in an improper way would leave pdb files cluttering the tmp directory. Normally not an issue for many computers, since restarting will flush that dir. But for systems that are expected to stay online for a long time, this can lead to files left open and not being deleted.

mattwthompson and others added 14 commits January 7, 2019 11:15
Using Microsoft's VSCode generates an addtional directory
and files to assist the various plugins, save states, etc.

This is not needed for other users.

This has been added to the .gitignore file to prevent these files from
being committed erroneously.
* Fixes issues with packmol input file

Also reports error based on process code instead of output,
which prevented report of error about input issues.

* Bump to version 0.8.1

* Update changelog to 0.8.1

* Small formatting nits
* Run `activate base` before `conda build`

See ContinuumIO/anaconda-issues#10211 (comment)

* Update changelog
* Make foyer an optional dependency

* Split dependencies into required and required for development

* Update developer requirements

* Marked appropriate tests to check if Foyer is installed

* Remove duplicate line

* Remove openbabel and gsd from Appveyor testing

* Attempt to fix coverage dependency issue (mosdef-hub#466)

* Attempt to fix coverage dependency issue

Currently, the most recent version of python-coveralls requires
`coverage==4.0.3`, while pytest-cov requires `Coverage>=4.4`.

The current fix seems to be to pin `pytest-cov` to a previous
version. This can be changed once:
z4r/python-coveralls#66

has been resolved.

* Hotfix for MDTraj MOL2 file issues

MDTraj has merged a fix for MOL2 file reading
mdtraj/mdtraj#1378

However, it has not been included in a new release on
`conda` yet.

Pinning to an older version without the MOL2 fixes
currently.

* Forgot to update Appveyor build script

* Bump to version 0.8.1

* Update changelog to 0.8.1

* Fixes issues with packmol input files (mosdef-hub#474)

* Fixes issues with packmol input file

Also reports error based on process code instead of output,
which prevented report of error about input issues.

* Bump to version 0.8.1

* Update changelog to 0.8.1

* Small formatting nits

* Run `activate base` before `conda build`

See ContinuumIO/anaconda-issues#10211 (comment)

* Update changelog

* Re-pin mdtraj and pytest-cov
NamedTemporaryFile provides a file-like object to interact with
compared to mkstemp.

To get the full path, the `name` attribute of the NamedTemporaryFile
is needed.
a test for `fill_region` was packing a very small volume (~2nm^3)
with many water molecules. This led to parts of the molecules outside
the defined region. The region has been increased.
Recently, the tests on travis for the example notebooks
are failing due to some internal error with pytest-ignore-flaky
plugin.

It seems like the feature can be replicated well enough
using @pytest.mark.xfail with the strict setting to False.
@justinGilmer
Copy link
Contributor Author

Everything is now passing. The diffs are a bit hard to look at, but assuming you dont seen any major changes @mattwthompson , I will edit the changelog and then merge.

@mattwthompson
Copy link
Member

The only thing that comes to mind is that, despite the tests passing as-is, we're not explicitly testing that the files are being closed. If this is doable we should have a unit test that covers that.

@justinGilmer
Copy link
Contributor Author

Im not entirely sure how we can test for that, aside from manual inspection of someones /tmp of this commit compared to a previous commit lacking this update.

An alternative viewpoint would be that we dont have to test for that. Since the os and tempfile modules are tested themselves by their developers. We might be able to argue that the lack of failures for too many files open now is sufficient. Its not the best solution, but it migth be one of the only real ways?

@mattwthompson
Copy link
Member

Well I definitely can't think of a way to test it; unless it becomes an issue again we can leave it be.

@justinGilmer justinGilmer merged commit 2bff0dd into mosdef-hub:master Jan 8, 2019
mikemhenry pushed a commit to mikemhenry/mbuild that referenced this pull request Jan 28, 2019
* Fix too many files being left open in packing.py

`tempfile.mkstemp` provides a low level interface for interacting
with temporary files. However, garbage collection (closing files,
deleting files after use) must be manually taken care of.

Packing.py uses temp files when packing or solvating a system,
these files are never closed. This can cause the program to
reach the limit of open files for a process set by the OS level
`ulimit`.

These changes migrate from using `mkstemp` to a higher level interface
`tempfile.NamedTemporaryFile`, providing the same type of temporary
file, but with saner garbage collection and scope.

The sections where temp files are made are now contained in `try
finally` sections, to ensure proper file closure and deletion when the
program finishes, or is interrupted.

* Syntax changes

NamedTemporaryFile provides a file-like object to interact with
compared to mkstemp.

To get the full path, the `name` attribute of the NamedTemporaryFile
is needed.

* remove nested try-finally blocks

* Forgot to pass in path for the solvate method.

* Move file creation to method, move topology generation to file.

* Bump to version 0.8.1

* Update changelog to 0.8.1

* Update gitignore file for VSCode
Using Microsoft's VSCode generates an addtional directory
and files to assist the various plugins, save states, etc.

This is not needed for other users.

This has been added to the .gitignore file to prevent these files from
being committed erroneously.

* Fixes issues with packmol input files (mosdef-hub#474)

* Fixes issues with packmol input file

Also reports error based on process code instead of output,
which prevented report of error about input issues.

* Bump to version 0.8.1

* Update changelog to 0.8.1

* Small formatting nits

* Run `activate base` before `conda build` in Appveyor (mosdef-hub#477)

* Run `activate base` before `conda build`

See ContinuumIO/anaconda-issues#10211 (comment)

* Update changelog

* GSD files now include 1,4 special pairs for use in OPLS (mosdef-hub#473)

* Update dependency requirements (mosdef-hub#457)

* Make foyer an optional dependency

* Split dependencies into required and required for development

* Update developer requirements

* Marked appropriate tests to check if Foyer is installed

* Remove duplicate line

* Remove openbabel and gsd from Appveyor testing

* Attempt to fix coverage dependency issue (mosdef-hub#466)

* Attempt to fix coverage dependency issue

Currently, the most recent version of python-coveralls requires
`coverage==4.0.3`, while pytest-cov requires `Coverage>=4.4`.

The current fix seems to be to pin `pytest-cov` to a previous
version. This can be changed once:
z4r/python-coveralls#66

has been resolved.

* Hotfix for MDTraj MOL2 file issues

MDTraj has merged a fix for MOL2 file reading
mdtraj/mdtraj#1378

However, it has not been included in a new release on
`conda` yet.

Pinning to an older version without the MOL2 fixes
currently.

* Forgot to update Appveyor build script

* Bump to version 0.8.1

* Update changelog to 0.8.1

* Fixes issues with packmol input files (mosdef-hub#474)

* Fixes issues with packmol input file

Also reports error based on process code instead of output,
which prevented report of error about input issues.

* Bump to version 0.8.1

* Update changelog to 0.8.1

* Small formatting nits

* Run `activate base` before `conda build`

See ContinuumIO/anaconda-issues#10211 (comment)

* Update changelog

* Re-pin mdtraj and pytest-cov

* Syntax changes

NamedTemporaryFile provides a file-like object to interact with
compared to mkstemp.

To get the full path, the `name` attribute of the NamedTemporaryFile
is needed.

* Ensure all files generated in the for loops are closed and unlinked

* Increase the box dims for testing fill_region

a test for `fill_region` was packing a very small volume (~2nm^3)
with many water molecules. This led to parts of the molecules outside
the defined region. The region has been increased.

* Remove support for pytest-ignore-flaky

Recently, the tests on travis for the example notebooks
are failing due to some internal error with pytest-ignore-flaky
plugin.

It seems like the feature can be replicated well enough
using @pytest.mark.xfail with the strict setting to False.

* Update CI as well to remove 'pytest-ignore-flaky'

* Change the overlap parameter for a test, was causing floating point errors in PACKMOL

* Changelog.md

* Include information about the  in changelog.md
mattwthompson added a commit to ftiet/mbuild that referenced this pull request Jan 29, 2019
justinGilmer pushed a commit that referenced this pull request Mar 11, 2019
* Initial implementation of fill_sphere packing function

* Do some sanity checks on `sphere` argument

Note that there is no `mb.Sphere` class like `mb.Box`, so a
`_validate_sphere` function is not so sensible.

* Add test to ensure all particles are in the sphere

* Small formatting changes

* Merge changes from #471

* Allow for sphere arguments to be tuples

* Switch from PDB to XYZ files #422

* Test for some bad arguments

* Test for some more bad args for coverage
umesh-timalsina referenced this pull request in GOMC-WSU/MoSDeF-GOMC Mar 22, 2022
* Fix too many files being left open in packing.py

`tempfile.mkstemp` provides a low level interface for interacting
with temporary files. However, garbage collection (closing files,
deleting files after use) must be manually taken care of.

Packing.py uses temp files when packing or solvating a system,
these files are never closed. This can cause the program to
reach the limit of open files for a process set by the OS level
`ulimit`.

These changes migrate from using `mkstemp` to a higher level interface
`tempfile.NamedTemporaryFile`, providing the same type of temporary
file, but with saner garbage collection and scope.

The sections where temp files are made are now contained in `try
finally` sections, to ensure proper file closure and deletion when the
program finishes, or is interrupted.

* Syntax changes

NamedTemporaryFile provides a file-like object to interact with
compared to mkstemp.

To get the full path, the `name` attribute of the NamedTemporaryFile
is needed.

* remove nested try-finally blocks

* Forgot to pass in path for the solvate method.

* Move file creation to method, move topology generation to file.

* Bump to version 0.8.1

* Update changelog to 0.8.1

* Update gitignore file for VSCode
Using Microsoft's VSCode generates an addtional directory
and files to assist the various plugins, save states, etc.

This is not needed for other users.

This has been added to the .gitignore file to prevent these files from
being committed erroneously.

* Fixes issues with packmol input files (#474)

* Fixes issues with packmol input file

Also reports error based on process code instead of output,
which prevented report of error about input issues.

* Bump to version 0.8.1

* Update changelog to 0.8.1

* Small formatting nits

* Run `activate base` before `conda build` in Appveyor (#477)

* Run `activate base` before `conda build`

See ContinuumIO/anaconda-issues#10211 (comment)

* Update changelog

* GSD files now include 1,4 special pairs for use in OPLS (#473)

* Update dependency requirements (#457)

* Make foyer an optional dependency

* Split dependencies into required and required for development

* Update developer requirements

* Marked appropriate tests to check if Foyer is installed

* Remove duplicate line

* Remove openbabel and gsd from Appveyor testing

* Attempt to fix coverage dependency issue (#466)

* Attempt to fix coverage dependency issue

Currently, the most recent version of python-coveralls requires
`coverage==4.0.3`, while pytest-cov requires `Coverage>=4.4`.

The current fix seems to be to pin `pytest-cov` to a previous
version. This can be changed once:
z4r/python-coveralls#66

has been resolved.

* Hotfix for MDTraj MOL2 file issues

MDTraj has merged a fix for MOL2 file reading
mdtraj/mdtraj#1378

However, it has not been included in a new release on
`conda` yet.

Pinning to an older version without the MOL2 fixes
currently.

* Forgot to update Appveyor build script

* Bump to version 0.8.1

* Update changelog to 0.8.1

* Fixes issues with packmol input files (#474)

* Fixes issues with packmol input file

Also reports error based on process code instead of output,
which prevented report of error about input issues.

* Bump to version 0.8.1

* Update changelog to 0.8.1

* Small formatting nits

* Run `activate base` before `conda build`

See ContinuumIO/anaconda-issues#10211 (comment)

* Update changelog

* Re-pin mdtraj and pytest-cov

* Syntax changes

NamedTemporaryFile provides a file-like object to interact with
compared to mkstemp.

To get the full path, the `name` attribute of the NamedTemporaryFile
is needed.

* Ensure all files generated in the for loops are closed and unlinked

* Increase the box dims for testing fill_region

a test for `fill_region` was packing a very small volume (~2nm^3)
with many water molecules. This led to parts of the molecules outside
the defined region. The region has been increased.

* Remove support for pytest-ignore-flaky

Recently, the tests on travis for the example notebooks
are failing due to some internal error with pytest-ignore-flaky
plugin.

It seems like the feature can be replicated well enough
using @pytest.mark.xfail with the strict setting to False.

* Update CI as well to remove 'pytest-ignore-flaky'

* Change the overlap parameter for a test, was causing floating point errors in PACKMOL

* Changelog.md

* Include information about the  in changelog.md
umesh-timalsina referenced this pull request in GOMC-WSU/MoSDeF-GOMC Mar 22, 2022
* Initial implementation of fill_sphere packing function

* Do some sanity checks on `sphere` argument

Note that there is no `mb.Sphere` class like `mb.Box`, so a
`_validate_sphere` function is not so sensible.

* Add test to ensure all particles are in the sphere

* Small formatting changes

* Merge changes from #471

* Allow for sphere arguments to be tuples

* Switch from PDB to XYZ files #422

* Test for some bad arguments

* Test for some more bad args for coverage
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug WIP Work in Progress, not ready for merging
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants