Parse .idx files #109

Open
JackKelly opened this issue Oct 15, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@JackKelly

Motivation

The grib-rs project looks great! Thanks so much for all your work on grib-rs!

I'm interested in lazily opening petabyte-sized GRIB datasets on cloud object storage. For example, the NOAA Open Data Dissemination (NODD) project has released 59 petabytes of data to public cloud object storage.

I've recently started a Rust project called hypergrib. hypergrib won't implement the decoding of GRIB messages. Instead, hypergrib will allow users to lazily open huge GRIB datasets (with millions of GRIB files).

Essential to this plan is the ability to interpret the .idx files that are published with many large GRIB datasets. A classic .idx file might look like this:

1:0:d=2017010100:HGT:10 mb:anl:ENS=low-res ctl
2:50487:d=2017010100:TMP:10 mb:anl:ENS=low-res ctl
3:70653:d=2017010100:RH:10 mb:anl:ENS=low-res ctl
4:81565:d=2017010100:UGRD:10 mb:anl:ENS=low-res ctl

Each line is colon-separated. The columns are: message number, byte offset, NWP init datetime (prefixed with "d="), parameter name, vertical level, product type, and ensemble member.
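
For illustration, here is a minimal sketch of parsing one such line in Rust. It assumes exactly the seven colon-separated columns shown above; the `IdxRecord` struct and `parse_idx_line` function are hypothetical names and are not part of grib-rs. Other .idx flavours may lay out their columns differently, so treat this as a starting point only.

```rust
/// One record from a `.idx` file.
/// Hypothetical type for illustration; not part of grib-rs.
#[derive(Debug, PartialEq)]
pub struct IdxRecord {
    pub message_number: u32,
    pub byte_offset: u64,
    /// NWP init datetime with the "d=" prefix stripped, e.g. "2017010100".
    pub init_datetime: String,
    pub parameter: String,
    pub level: String,
    pub product_type: String,
    /// Ensemble member with the "ENS=" prefix stripped, if present.
    pub ensemble_member: Option<String>,
}

/// Parse a single `.idx` line. Returns `None` if a required column is
/// missing or a numeric column does not parse.
pub fn parse_idx_line(line: &str) -> Option<IdxRecord> {
    // Split into at most 7 fields so that any ':' in the final column is kept.
    let mut fields = line.trim_end().splitn(7, ':');
    let message_number = fields.next()?.parse::<u32>().ok()?;
    let byte_offset = fields.next()?.parse::<u64>().ok()?;
    let init_datetime = fields.next()?.strip_prefix("d=")?.to_string();
    let parameter = fields.next()?.to_string();
    let level = fields.next()?.to_string();
    let product_type = fields.next()?.to_string();
    // Assumption: the ensemble column is optional and starts with "ENS=".
    let ensemble_member = fields
        .next()
        .and_then(|s| s.strip_prefix("ENS="))
        .map(str::to_string);
    Some(IdxRecord {
        message_number,
        byte_offset,
        init_datetime,
        parameter,
        level,
        product_type,
        ensemble_member,
    })
}
```

The datetime is kept as a string here to avoid pulling in a date-time crate; a real implementation would probably parse it into a proper timestamp type.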

Does grib-rs already implement the ability to decode .idx files?

Proposed Solution

To quickly decode the parameter name, I presume we'd want a HashMap keyed on the name? I must admit I haven't fully understood the codegen part of grib-rs, so I'm not sure what the value of the HashMap would be.
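
One possible shape for that lookup, sketched under assumptions: the map is keyed on the abbreviation string, and the value is a hypothetical `ParamInfo` struct holding the GRIB2 discipline, category, and parameter number plus a human-readable name and unit. Neither `ParamInfo` nor `build_param_table` comes from grib-rs, and the four hard-coded entries (taken from GRIB2 Code Table 4.2) are only there to make the example self-contained; a real table would be generated or loaded from a data file.

```rust
use std::collections::HashMap;

/// Hypothetical value type for the lookup; not grib-rs's API.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ParamInfo {
    pub discipline: u8,
    pub category: u8,
    pub number: u8,
    pub name: &'static str,
    pub unit: &'static str,
}

/// Build a small abbreviation -> parameter table.
/// Only the four abbreviations from the example .idx file are included.
pub fn build_param_table() -> HashMap<&'static str, ParamInfo> {
    HashMap::from([
        (
            "HGT",
            ParamInfo { discipline: 0, category: 3, number: 5, name: "Geopotential height", unit: "gpm" },
        ),
        (
            "TMP",
            ParamInfo { discipline: 0, category: 0, number: 0, name: "Temperature", unit: "K" },
        ),
        (
            "RH",
            ParamInfo { discipline: 0, category: 1, number: 1, name: "Relative humidity", unit: "%" },
        ),
        (
            "UGRD",
            ParamInfo { discipline: 0, category: 2, number: 2, name: "u-component of wind", unit: "m s-1" },
        ),
    ])
}
```

With this layout, `build_param_table().get("TMP")` returns the numeric identifiers needed to match an .idx entry against a GRIB2 product definition, and an unknown abbreviation simply yields `None`.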

Additional Context

No response

@JackKelly JackKelly added the enhancement New feature or request label Oct 15, 2024
@JackKelly
Author

A quick update: I've created a small crate called grib_tables which loads the GDAL CSV files into memory and allows users to get the GRIB parameter details from the abbreviation string (e.g. "TMP") or the GRIB numeric ID. Comments very welcome!
