-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rename Bunch keys #44
Conversation
skada/datasets/_base.py
Outdated
@@ -232,8 +232,8 @@ def pack( | |||
target = np.concatenate(ys) | |||
sample_domain = np.concatenate(sample_domains) | |||
return (data, target, sample_domain) if return_X_y else Bunch( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should also change the variable nalmes in the for loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid confusion with the X and y in the for loop, I reuse the names Xs and ys instead of data and target respectively
It seems those we mentioned in the docs as well (in the README file). |
Plus sample generator functions have documentation (not sure if they implement packing locally but worth checking out). |
The documentation for |
Same for Office31 datasets. |
It seems the |
@@ -470,4 +470,7 @@ def _load_office31( | |||
data.append(content.astype(loader_spec.dtype)) | |||
files['data'] = np.vstack(data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just set X and y here no need to add stuff and pop it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't. The files
object comes from the load_files
function, which is a sklearn function. Thus the keys 'data' and 'target' exist from the beginning.
Thus at some point we would have to remove these keys to not have duplicates with the keys 'X' and 'y'.
Thus the pop methods
#25
DomainAwareDataset
def