Weights normalization in ReweightDensity #112
Hi everyone,

I wonder why the weights are normalized in ReweightDensity, cf. skada/skada/_reweight.py, line 108 (commit 9e7905d).

After some tests, I observed that the strange behavior of ReweightDensity in the comparison example is due to this weight normalization, cf. https://scikit-adaptation.github.io/auto_examples/plot_method_comparison.html#sphx-glr-auto-examples-plot-method-comparison-py

Maybe we should remove the normalization, or at least divide by the mean instead of the sum (dividing by the sum makes the weights depend on the dataset size, which should be avoided, I think).
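For context, here is a hedged sketch of the kind of density-ratio reweighting under discussion (hypothetical code and parameters, not the actual skada implementation): source samples are weighted by an estimated target/source density ratio, and the last line is the normalization in question.

```python
# Hypothetical sketch of density-ratio reweighting, not skada's code.
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
X_src = rng.normal(0.0, 1.0, size=(200, 1))  # source samples
X_tgt = rng.normal(0.5, 1.0, size=(300, 1))  # shifted target samples

kde_src = KernelDensity(bandwidth=0.5).fit(X_src)
kde_tgt = KernelDensity(bandwidth=0.5).fit(X_tgt)

# Importance weight of each source sample: p_tgt(x) / p_src(x),
# computed from the log-densities returned by score_samples.
weights = np.exp(kde_tgt.score_samples(X_src) - kde_src.score_samples(X_src))

weights /= weights.sum()  # the sum-normalization questioned in this issue
```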
You are right: comparing the weights obtained with this method to those from the others, it is the only one whose weights are normalized this way (all the other reweighting methods have some weights larger than 1), so it might be better not to normalize them.
From the code, it seems like the normalization here is specifically meant to produce a density. Right?
Maybe it is a mistake, and we shouldn't produce a density in both cases.
Hi. Moreover, from a scikit-learn implementation perspective, uniform weighting is equivalent to all weights being equal to one, I think (i.e., the mean of the weights should be equal to one). Otherwise, you may run into numerical issues when the number of source samples is large (the weights become close to zero). From my point of view, there is also an issue with the current implementation of …
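A minimal sketch of the numerical point above, with made-up raw weights: dividing by the sum makes the typical weight shrink like 1/n as the source set grows, while dividing by the mean keeps it on the order of one.

```python
import numpy as np

rng = np.random.default_rng(0)
for n in (100, 10_000, 1_000_000):
    w = rng.uniform(0.5, 1.5, size=n)  # made-up raw importance weights

    w_sum = w / w.sum()    # sums to 1: typical weight ~ 1/n, vanishes with n
    w_mean = w / w.mean()  # averages to 1: scale independent of n

    print(f"n={n:>9}  sum-normalized max={w_sum.max():.1e}  "
          f"mean-normalized max={w_mean.max():.2f}")
```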
I agree with you that this reweighting with just the probabilities is not correct, nor the one from the papers (I would cite the review by Sugiyama instead of Gan, though ;)). OK for weights that sum to n instead of 1, because the weight scale indeed changes the proper parameters (C or alpha) in the sklearn models. I just ran into this problem when implementing the JDOT classifier.
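To illustrate the point about C and alpha, here is a small check one can run with scikit-learn's Ridge (an illustration added here, not from the thread): multiplying every sample weight by a constant c is equivalent to dividing alpha by c, so the overall weight scale silently shifts the effective regularization strength.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)
w = rng.uniform(0.5, 1.5, size=50)

c = 100.0  # global rescaling of the sample weights
# Scaling all weights by c with alpha fixed ...
coef_a = Ridge(alpha=1.0).fit(X, y, sample_weight=c * w).coef_
# ... matches keeping the weights and dividing alpha by c.
coef_b = Ridge(alpha=1.0 / c).fit(X, y, sample_weight=w).coef_

print(np.allclose(coef_a, coef_b))  # True
```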
Also, about the Reweight adapters: we should maybe rename them without "density" everywhere. "Density" is used in the methods that use a density estimator, but it would make sense to drop "density" for the Gaussian one (it is implied in a way) and the discriminator one. We did not use "density" for KMM either. As long as Reweight is in the name, it is enough, I think.
Ok, thank you @rflamary.
Closing this issue, fixed in #118.