-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] Modification source_target_merge function behaviours #71
[MRG] Modification source_target_merge function behaviours #71
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #71 +/- ##
==========================================
+ Coverage 88.54% 88.83% +0.29%
==========================================
Files 41 41
Lines 2601 2678 +77
==========================================
+ Hits 2303 2379 +76
- Misses 298 299 +1 ☔ View full report in Codecov by Sentry. |
@@ -243,3 +243,56 @@ def source_target_split( | |||
return list(chain.from_iterable( | |||
(a[source_idx], a[~source_idx]) for a in arrays | |||
)) | |||
|
|||
|
|||
def source_target_merge( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should match split API: accept as many pairs of 'things' as necessary, alongside with a named argument 'sample_domain'.
So I can do something like
X, y = source_target_merge(X_source, X_target, y_source, y_target, sample_domain=sample_domain)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean :
X, y, sample_domain = source_target_merge(X_source, X_target, y_source, y_target, sample_domain)
insterad?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In source_target_split()
we don't output back the sample domain, just the splitted arrays. Makes sense to do the same here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, as we have this function to pair up with the split functionality, we shouldn't encourage swapping sample_domain to a new one, even though it's technically possible (it would mean that we have to return it with every operation). Moreover, it might easily lead to 'accidental' loss of the information when having multiple sources/targets.
Now flake8 should run automatically on PR
For now
Would love your feedbacks |
Regarding pairs, you are right, it might be better with pairs. But it's common to have pairs of ndarrays as an input type... So it might also be a bit confusing. Let's keep the current implementation to see how it works in practice. We will be able to change it if necessary. |
@YanisLalou is this one ready for review? |
Almost, there's one codevoc test that I still need to implement + check if I've managed to handle all possible cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall comment, let's replace 'duo' with 'pair'. I mentioned it multiple times in different places in the code but certainly not everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work, @YanisLalou!
Issue: #61
Change the name of
source_target_merge
Allow an N dimension array as an input for
source_target_merge
Add test cases
For
source_target_merge
: