✨Streaming utils for zipping and reading/writing to S3 #7186
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #7186 +/- ##
==========================================
+ Coverage 87.00% 87.04% +0.03%
==========================================
Files 1667 1669 +2
Lines 64764 64768 +4
Branches 1096 1115 +19
==========================================
+ Hits 56351 56380 +29
+ Misses 8100 8070 -30
- Partials 313 318 +5
…zipping-of-s3-content
last thing. please double check download_fileobj
Cool stuff! Thanks a lot for the effort! I would suggest adding some RAM checks, and perhaps also some disk space checks, to your tests.
thanks
What do these changes do?
These changes bring a set of utilities that allow us to create zip archives on the fly and stream them to S3 as they are being created. The idea is to use a constant amount of RAM and no disk space.
How does this work? A request to upload a zip archive to S3 is created. As chunks of the archive are requested by the uploader, the streaming zip utility requests chunks of the input files on the fly and composes the archive. It provides pieces of the archive to the S3 uploader as soon as they are available.
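As an illustration of the idea only (this is not the code in this PR), the sketch below builds a zip archive lazily with the Python standard library. The names `stream_zip` and `_ChunkSink` are hypothetical. It assumes each input file is available as an iterable of byte chunks, and it stores entries uncompressed to keep the example short. The archive pieces are yielded as soon as `zipfile` emits them, so at most one input chunk plus the zip headers is held in memory at a time.

```python
import io
import zipfile
from typing import Iterable, Iterator, List, Tuple


class _ChunkSink:
    """Write-only, non-seekable file-like object that buffers written bytes.

    Hypothetical helper: zipfile writes into it, the generator drains it.
    """

    def __init__(self) -> None:
        self._chunks: List[bytes] = []

    def write(self, data) -> int:
        self._chunks.append(bytes(data))
        return len(data)

    def flush(self) -> None:
        pass

    def drain(self) -> bytes:
        data = b"".join(self._chunks)
        self._chunks.clear()
        return data


def stream_zip(files: Iterable[Tuple[str, Iterable[bytes]]]) -> Iterator[bytes]:
    """Yield pieces of a zip archive while the input chunks are consumed."""
    sink = _ChunkSink()
    # The sink has no seek()/tell(), so zipfile falls back to streaming mode
    # and writes sizes and CRC in a data descriptor after each entry.
    with zipfile.ZipFile(sink, mode="w", compression=zipfile.ZIP_STORED) as zf:
        for name, chunks in files:
            # force_zip64: entry sizes are not known in advance.
            with zf.open(name, mode="w", force_zip64=True) as entry:
                for chunk in chunks:
                    entry.write(chunk)
                    piece = sink.drain()
                    if piece:
                        yield piece
            piece = sink.drain()  # data descriptor of the finished entry
            if piece:
                yield piece
    piece = sink.drain()  # central directory, written when the archive closes
    if piece:
        yield piece


if __name__ == "__main__":
    inputs = [("a.txt", [b"hello ", b"world"]), ("b.bin", [b"\x00" * 10])]
    archive_bytes = b"".join(stream_zip(inputs))
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        print(archive.namelist())
```

In a real upload, each yielded piece would be handed to the S3 uploader instead of being joined in memory; the join above is only there to verify that the result is a readable archive.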
Have a look at /home/silenthk/work/pr-osparc-stream-zipping-of-s3-content/packages/aws-library/tests/test_s3_client.py::test_workflow_compress_s3_objects_and_local_files_in_a_single_archive_then_upload_to_s3 for a full working workflow.
Progress bar support has also been added. Progress is sent based on the data read from the input streams.
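A minimal sketch of progress reporting driven by the input side, assuming inputs are iterables of byte chunks and that the total size is known up front. The names `with_progress` and `on_progress` are hypothetical and do not come from this PR.

```python
from typing import Callable, Iterable, Iterator


def with_progress(
    chunks: Iterable[bytes],
    total_size: int,
    on_progress: Callable[[float], None],
) -> Iterator[bytes]:
    """Pass chunks through unchanged and report the fraction read so far."""
    bytes_read = 0
    for chunk in chunks:
        bytes_read += len(chunk)
        # Progress reflects data read from the input, not data uploaded.
        on_progress(bytes_read / total_size if total_size else 1.0)
        yield chunk


if __name__ == "__main__":
    reports = []
    data = [b"a", b"bc", b"d"]
    consumed = list(with_progress(data, total_size=4, on_progress=reports.append))
    print(consumed, reports)
```

Because the wrapper is itself an iterator, it can be placed around any input stream without changing the code that consumes the chunks.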
Bonus: renamed _filemanager.py, which created confusion, to filemanager_utils.py
Related issue/s
How to test
Dev-ops checklist