[MRG] Regression label for 2d classification data generation #69

BuenoRuben · 2024-02-01T13:40:43Z

In this branch, I added the 'regression' label to both _generate_data_2d_classif and _generate_data_2d_classif_subspace in skada/datasets/_samples_generator.py, added a new example for this label (examples/datasets/plot_shifted_dataset_regression.py), and added a new test, test_make_shifted_datasets_regression, in skada/datasets/tests/test_samples_generator.py.

The main bug that was fixed came from the y.astype(int) in the return statement, which truncated the float values to integers when using regression labels.
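The effect of that cast can be reproduced in a few lines. This is only a minimal sketch, not the code from this branch: `finalize_labels` is a hypothetical helper name, and the label strings are assumed to match the ones used in this PR ('binary', 'multiclass', 'regression').

```python
import numpy as np


def finalize_labels(y, label):
    """Hypothetical helper: cast to int only for class labels."""
    y = np.asarray(y)
    if label == "regression":
        # Continuous targets must stay floating point.
        return y.astype(float)
    # Class labels ('binary', 'multiclass') are safe to store as integers.
    return y.astype(int)


y_reg = np.array([0.12, 0.87, 1.5])

# Unconditional cast: the fractional part of every target is dropped.
y_cast = y_reg.astype(int)

# Conditional cast: regression targets are returned unchanged.
y_kept = finalize_labels(y_reg, "regression")

print(y_cast)
print(y_kept)
```

With the conditional cast, the classification labels keep an integer dtype while the regression targets keep their original values.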

tgnassou

I think the code I gave you also had a regression mode for the subspace dataset; it would be nice to have it too.

examples/datasets/plot_shifted_dataset_regression.py

skada/datasets/tests/test_samples_generator.py

tgnassou · 2024-02-01T14:25:42Z

Your tests are not passing


codecov · 2024-02-02T14:07:08Z

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (5fe1df9) 86.82% compared to head (37aa429) 87.16%.
Report is 1 commits behind head on main.

Files	Patch %	Lines
skada/datasets/_samples_generator.py	95.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #69      +/-   ##
==========================================
+ Coverage   86.82%   87.16%   +0.34%     
==========================================
  Files          38       38              
  Lines        2398     2439      +41     
==========================================
+ Hits         2082     2126      +44     
+ Misses        316      313       -3



tgnassou · 2024-02-07T13:52:53Z

skada/datasets/_samples_generator.py

    elif label == 'multiclass':
        y = np.zeros(n1)
        for i in range(4):
            y = np.concatenate((y, (i + 1) * np.ones(n2)), 0)
+            y = y.astype(int)
+    elif label == 'regression':
+        # create label y with gaussian distribution


It would be nice to be able to set mu and Sigma1 freely, so please expose them as parameters of the function with default values.

Also rename them to mu_regression and sigma_regression.
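A rough sketch of what such a signature could look like is given here. It rests on several assumptions: the diff only says the label is created "with gaussian distribution", so evaluating a Gaussian density at each sample is one possible reading, not necessarily what the branch does. The function name `gaussian_regression_labels` and the default values (zero mean, identity covariance) are hypothetical; only the parameter names `mu_regression` and `sigma_regression` come from the comment above.

```python
import numpy as np


def gaussian_regression_labels(x, mu_regression=None, sigma_regression=None):
    """Hypothetical sketch: label each sample with a Gaussian density value.

    x has shape (n_samples, n_features). The defaults (zero mean, identity
    covariance) are illustrative assumptions, not the values used in skada.
    """
    x = np.asarray(x, dtype=float)
    d = x.shape[1]
    mu = np.zeros(d) if mu_regression is None else np.asarray(mu_regression, dtype=float)
    sigma = np.eye(d) if sigma_regression is None else np.asarray(sigma_regression, dtype=float)

    diff = x - mu
    # Quadratic form diff_i^T Sigma^{-1} diff_i for every row i.
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(sigma), diff)
    # Normalization constant of a d-dimensional Gaussian.
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma))
    return np.exp(-0.5 * maha) / norm


x_demo = np.array([[0.0, 0.0], [1.0, 1.0]])
y_default = gaussian_regression_labels(x_demo)
y_shifted = gaussian_regression_labels(x_demo, mu_regression=[1.0, 1.0])
```

Keeping both parameters optional means existing callers are unaffected, while a caller who wants a different target landscape can pass their own mean and covariance.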

tgnassou · 2024-02-07T13:53:14Z

skada/datasets/_samples_generator.py

+        for i in range(k):
+            y = np.concatenate((y, (i + 1) * np.ones(n1//k)), 0)
+            y = y.astype(int)
+    elif label == 'regression':


BuenoRuben added 3 commits February 1, 2024 12:01

added regression labels

8b1ca1c

added a new test for regression label

45e2b95

added an example of using the regression label, plus bug fixes

f9bf3bb

The main bug that was fixed came from the y.astype(int) in the return statement, which truncated the float values to integers when using regression labels.

BuenoRuben changed the title ~~Regression label for 2d classification data generation~~ [WIP] Regression label for 2d classification data generation Feb 1, 2024

BuenoRuben changed the title ~~[WIP] Regression label for 2d classification data generation~~ [MRG] Regression label for 2d classification data generation Feb 1, 2024

tgnassou reviewed Feb 1, 2024

examples/datasets/plot_shifted_dataset_regression.py (outdated)

skada/datasets/tests/test_samples_generator.py (outdated)

BuenoRuben changed the title ~~[MRG] Regression label for 2d classification data generation~~ [WIP] Regression label for 2d classification data generation Feb 1, 2024

BuenoRuben added 6 commits February 1, 2024 15:40

changed the plot to have only one colorbar

5b0071a

solved an issue with test_make_shifted_datasets_regression

1ccf347

The test was using a method that has since been changed, which raised errors.

removed two tests that did not make sense

f98308f

The two removed tests checked that the y-values were between 0 and 1, which is not necessarily the case in regression.

changed the size of the colorbar

2dcecc3

I just changed the size of the colorbar so that the plots next to it look better.

use label instead of binary in _generate_data_2d_classif_subspace

d2efd6a

_generate_data_2d_classif_subspace now uses the label passed as a parameter instead of always using "binary". Additionally, the example for the regression label now uses the subspace shift.

made multiclass usable for subspace shift

e62d884

The main issue was that the generated y values did not have the correct size (not the same as the X values).
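The size requirement mentioned in the last commit can be stated as a simple invariant: one label per generated sample. The snippet below is only an illustration of that invariant under assumed block sizes. `block_labels` is a hypothetical helper and does not reproduce the actual fix in _generate_data_2d_classif_subspace.

```python
import numpy as np


def block_labels(n_blocks, n_per_block):
    """Hypothetical helper: label i repeated n_per_block times, i = 0..n_blocks-1."""
    return np.repeat(np.arange(n_blocks), n_per_block)


rng = np.random.default_rng(0)
n_blocks, n_per_block = 4, 5

# Stand-in for the generated samples: n_blocks groups of n_per_block 2D points.
X = rng.normal(size=(n_blocks * n_per_block, 2))
y = block_labels(n_blocks, n_per_block)

# The invariant the commit is about: y and X must have the same length.
assert y.shape[0] == X.shape[0]
```

A check of this kind in the test suite would catch a mismatch between the number of samples and the number of labels for any label type.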

BuenoRuben changed the title ~~[WIP] Regression label for 2d classification data generation~~ [MRG] Regression label for 2d classification data generation Feb 2, 2024

corrected a typo

4a0fe87

BuenoRuben and others added 10 commits February 2, 2024 15:19

added a new test

59a4179

This test should cover the change to _generate_data_2d_classif_subspace when using the 'multiclass' or 'regression' label.

updated a test

694bd6b

With subset shift, the values are half as large in the default case.

updated the test

5fb4162

This was needed because of the previous changes.

Update test_samples_generator.py

7d3cb62

Update test_samples_generator.py

ae9b73f

Update test_samples_generator.py

acc749a

made the code follow linter's standards

229a47a

corrected some mistakes

615e61c

Merge branch 'main' into regressions

5f91936

changes for flake8

37aa429

tgnassou reviewed Feb 7, 2024


tgnassou merged commit 8e8fc7b into scikit-adaptation:main Feb 7, 2024
7 checks passed
