Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rearrange folders according to Python packaging best practices #250

Merged
merged 12 commits into from
Aug 31, 2023

Conversation

yantosca
Copy link
Contributor

@yantosca yantosca commented Aug 9, 2023

Name and institution

Bob Yantosca
Harvard + GCST

Overview

This PR (currently a draft) is the companion PR to #249. We propose rearranging the GCPy folder structure from the current structure:

gcpy
    |
    gcpy
    |
    benchmark
    |       |
    |      modules
    |
    examples
            |
            several  other subdirectories not listed

to:

gcpy
    |
    gcpy
        |
        benchmark
        |       |
        |       modules
        |
        examples
               |
               several other subdirectories not listed

Related issues

NOTES

We have now arranged the GCPy folder structure so that:

(1) gcpy/gcpy              --> gcpy/src/gcpy
(2) gcpy/benchmark         --> gcpy/src/benchmark
(3) gcpy/benchmark/modules --> gcpy/src/gcpy/benchmark/modules
(4) gcpy/examples          --> gcpy/src/gcpy/examples
(5) gcpy/examples/*        --> gcpy/src/gcpy/examples/*

Also note, the file gcpy/benchmark has now been renamed to
gcpy/benchmark_funcs.py, in order to break a namespace collision
with the new gcpy/benchmark folder.  Nothing else other than
the move has been changed.

Updated the CHANGELOG.md accordingly.  Will test to make sure that
benchmark plots are generated properly.

TODO: Move benchmark.py and related config files out of gcpy/src/gcpy
and into gcpy/src/benchmark

TODO: Break up gcpy/src/benchmark_funcs.py into smaller modules,
arranged by purpose.  For example, group all the functions that
print out emissions and mass tables into a single module, etc.

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca yantosca added category: Feature Request New feature or request topic: Structural Modifications Related to GCPy structural modifications (as opposed to scientific updates) labels Aug 9, 2023
@yantosca yantosca self-assigned this Aug 9, 2023
We have now moved gcpy/src/gcpy to gcpy/gcpy, which seems to avoid
some issues with importing modules.

You can now set your PYTHONPATH to the top-level folder:

export PYTHONPATH=/path/to/gcpy

and then you should be able to

import gcpy
from gcpy import benchmark

Signed-off-by: Bob Yantosca <[email protected]>
We have added __init__.py files in the various subfolders of
python/gcpy/gcpy to facilitate importing everything with an
"import gcpy" statement.

NOTE: To avoid namespace collisions, we have renamed:
examples/xarray --> examples/xarray_examples
examples/yaml   --> examples/yaml_examples

Signed-off-by: Bob Yantosca <[email protected]>
gcpy/examples/*.py
- Move code to a routine protected by if __name__ == "__main__",
  which will prevent code from being executed upon import
- Now use "from gcpy import ___" where expedient
- Now use plt.cm.get_cmap("viridis") instead of plt.cm.viridis
- Added code updates suggested by Pylint such as
  - Make variable names conform to snake_case
  - Remove extraneous imports
  - Trim trailing whitespace
  - Use "except ___ as exc:" and "raise ___ from exc"

CHANGELOG.md
- Updated accordingly

Signed-off-by: Bob Yantosca <[email protected]>
gcpy/benchmark/run_benchmark.py
- In the if __name__ == "__main__" block call main(sys.argv), which
  will pass command-line arguments to main
- Replace main() with main(argv), which accepts sys.argv from
  the command line

Signed-off-by: Bob Yantosca <[email protected]>
gcpy/examples/compare_diags.py
- In the if __name__ == "__main__" block call main(sys.argv), which
  will pass command-line arguments to main
- Replace main() with main(argv), which accepts sys.argv from
  the command line

CHANGELOG.md
- Updated accordingly

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca
Copy link
Contributor Author

yantosca commented Aug 9, 2023

I have made the following updates:

  1. The PYTHONPATH variable can now be set in your .bashrc as follows:

    export PYTHONPATH=/path/to/gcpy
  2. Moved gcpy/benchmark to gcpy/gcpy/benchmark

  3. Moved gcpy/examples to gcpy/gcpy/examples.

  4. Added __init__.py files in subdirectories of gcpy/gcpy.

  5. Make sure scripts are protected with an if __name__ == "__main__": block to prevent routines from executing upon import.

  6. The gcpy/benchmark/run_benchmark.py script can now be invoked with:

    from gcpy.benchmark import run_benchmark
    run_benchmark.main("1yr_fullchem_benchmark.yml")
  7. The gcpy/examples/diagnostics/compare_diags.py script can now be invoked with:

    from gcpy.examples import compare_diags
    compare_diags.main("input.yml")

The yaml_examples only contained species2wiki.py, which is now
obsolete now that we do not use the wiki to display species
database info.  Removed gcpy/examples/yaml_examples and updated
the gcpy/examples/__init__.py folder accordingly.

Signed-off-by: Bob Yantosca <[email protected]>
gcpy/examples/plotting/plot_comparisons.py
gcpy/examples/plotting/plot_single_panel.py
gcpy/examples/timeseries/mda8_o3_timeseries.py
gcpy/examples/working_with_files/*.py
- Add a main() function (or rename an existing function to main)
  so that these can be callable with .main()

Signed-off-by: Bob Yantosca <[email protected]>
gcpy/benchmark_modules/modules/run_1yr_fullchem_benchmark.py
gcpy/benchmark_modules/modules/run_1yr_tt_benchmark.py
- Changed permission chmod 755 to chmod 644 (since these are now
  not usually run directly but through benchmark/run_benchmark.py)

Signed-off-by: Bob Yantosca <[email protected]>
CHANGELOG.md
- Added blurb about gcpy/benchmark/modules*.py scripts permissions
  being changed to chmod 644

Signed-off-by: Bob Yantosca <[email protected]>
This merge brings updates from PR #251 (Update gcpy_env environment
to install with MambaForge and to use latest package versions; Update
scripts accordingly, by @yantosca) into the feature/move-folders branch.

This will facilitate merging into dev at a later time.

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca
Copy link
Contributor Author

I have merged the updates from PR #251 (now in the dev branch) into the feature/move-folders branch. This will allow us to develop on top of PR #251.

@yantosca
Copy link
Contributor Author

Note. this should be merged after PR #246

@yantosca yantosca marked this pull request as ready for review August 17, 2023 21:23
@yantosca yantosca requested review from lizziel and msulprizio August 17, 2023 21:23
@yantosca yantosca added this to the 1.4.0 milestone Aug 17, 2023
@yantosca yantosca linked an issue Aug 28, 2023 that may be closed by this pull request
@yantosca
Copy link
Contributor Author

PR https://github.com/geoschem/gc-cloud-infrastructure/pull/51 should be merged concurrently with this PR, in order to avoid breaking the cloud benchmarking workflow.

CHANGELOG.md Outdated
@@ -39,6 +41,12 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
- Routine `print_totals` now prints small and/or large numbers in scientific notation
- Truncate names in benchmark & emissions tables to improve readability
- Add TransportTracers species names to `gcpy/emissions_*.yml` files
- Folder `gcpy/gcpy` is now `gcpy/src/gcpy`, for adherence to Python packaging standards
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still the case or are we not using the src directory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch @msulprizio. I'll edit the changelog. If we use gcpy/src then we'd have to add path/to/gcpy/src to PYTHONPATH, which is a little ugly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now removed in commit cf6fc12

@@ -388,7 +388,7 @@ def run_benchmark(config, bmk_year_ref, bmk_year_dev):

# Create plots
print("\nCreating plots for annual mean")
bmk.make_benchmark_conc_plots(
bmkf.make_benchmark_conc_plots(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a typo - bmk is still used above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @msulprizio. I'll fix this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now fixed in commit cf6fc12

gcpy/gcpy/benchmark/modules/run_1yr_fullchem_benchmark.py
- Fix typo, "bmkf" -> "bmk" in call to make_benchmark_conc_plots

CHANGELOG.md
- Removed note about gcpy/src/gcpy, this is no longer correct.

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca yantosca requested a review from msulprizio August 30, 2023 18:59
@yantosca
Copy link
Contributor Author

@msulprizio, I will merge this locally and make sure I can plot benchmarks before pushing.

@yantosca yantosca merged commit 04bc173 into dev Aug 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: Feature Request New feature or request topic: Structural Modifications Related to GCPy structural modifications (as opposed to scientific updates)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE REQUEST] Move gcpy/benchmark.py to the benchmark/ folder
2 participants