Initial datasets setup #155

Merged
merged 3 commits into from
Apr 29, 2022

ChristophAlt
Collaborator

@ChristophAlt ChristophAlt commented Apr 29, 2022

  • Add a top-level ./datasets folder, similar to HF datasets, to host all dataset loading scripts, dummy data, and dummy documents (will be added later).
  • Add tests that check that all dataset loading scripts in ./datasets are properly formatted and work as expected, e.g. that they can be loaded both locally and from the hub.

- datasets are now located in /datasets, similar to HF datasets
- tests now automatically check for correct dataset script formatting and correct loading
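
To illustrate the kind of check described above, here is a minimal sketch of a layout and syntax check for dataset loading scripts. It is not the code from this PR: the helper names `find_dataset_scripts` and `check_dataset_script` are hypothetical, and it assumes the HF-datasets-style layout `<root>/<name>/<name>.py`. It only verifies that each script exists and parses as Python; it does not load anything locally or from the hub.

```python
import ast
from pathlib import Path


def find_dataset_scripts(root):
    """Return {dataset_name: expected_script_path} for each sub-directory of root.

    Assumption: HF-style layout <root>/<name>/<name>.py. The expected path is
    returned even if the file does not exist, so the caller can report it.
    """
    root = Path(root)
    return {
        d.name: d / f"{d.name}.py"
        for d in sorted(root.iterdir())
        if d.is_dir() and not d.name.startswith((".", "_"))
    }


def check_dataset_script(script_path):
    """Return a list of problems for one loading script (empty if none found)."""
    script_path = Path(script_path)
    if not script_path.is_file():
        return [f"missing loading script: {script_path}"]
    try:
        # Parse only; the script is never imported or executed here.
        ast.parse(script_path.read_text(encoding="utf-8"))
    except SyntaxError as exc:
        return [f"syntax error in {script_path}: {exc.msg}"]
    return []
```

In a test suite, one could iterate over `find_dataset_scripts("datasets")` and assert that `check_dataset_script` returns an empty list for every entry, so that each dataset directory is reported individually.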
@ChristophAlt
Collaborator Author

ChristophAlt commented Apr 29, 2022

Sorry for the noise, somehow black or isort had a change of personality and wanted to reformat everything related to datasets. 😄

@ChristophAlt ChristophAlt requested a review from ArneBinder April 29, 2022 14:30
Owner

@ArneBinder ArneBinder left a comment

This looks very good.
Disclaimer: I don't have a good understanding of tests/data/test_dataset_common.py and tests/data/test_dataset_scripts.py.

@ChristophAlt ChristophAlt merged commit 248d619 into main Apr 29, 2022
@ChristophAlt ChristophAlt deleted the datasets branch May 1, 2022 10:39

2 participants