
[MRG] PredictionEntropyScorer output negative scores #63

Merged

Conversation

YanisLalou
Collaborator

@YanisLalou YanisLalou commented Jan 29, 2024

Issue: PredictionEntropyScorer outputs negative scores.
Solution: In PredictionEntropyScorer we compute the following equation (page 3 of the paper https://arxiv.org/pdf/1711.10288.pdf):
$$E(X_T) = - \sum_{x_t \in T} \langle f(x_t; \theta), \log f(x_t; \theta) \rangle$$

In the code we forgot to use the `greater_is_better` flag, and we used a double minus sign in the formula.
Also adds a test case checking that all computed scores are > 0.
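For reference, a minimal standalone sketch of the quantity being fixed (hypothetical function name, plain Python, not the actual skada code): per-sample Shannon entropy is non-negative by construction, so the scorer should return it as-is and mark the direction with `greater_is_better=False` rather than negating twice.

```python
import math

def prediction_entropy(proba):
    """Mean Shannon entropy over per-sample probability vectors.

    Each per-sample entropy -sum_k p_k * log(p_k) is >= 0, so the
    returned value is always >= 0; "lower is better" is expressed
    via a greater_is_better=False flag on the scorer, not by
    flipping signs in the formula itself.
    """
    entropies = [
        -sum(p * math.log(p) for p in row if p > 0)  # 0 * log 0 := 0
        for row in proba
    ]
    return sum(entropies) / len(entropies)
```

For instance, a maximally uncertain binary prediction `[[0.5, 0.5]]` yields $\log 2 \approx 0.693$, and a one-hot prediction yields exactly 0.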


codecov bot commented Jan 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (c50867e) 88.83% compared to head (2ed1c2c) 88.95%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #63      +/-   ##
==========================================
+ Coverage   88.83%   88.95%   +0.12%     
==========================================
  Files          41       41              
  Lines        2678     2708      +30     
==========================================
+ Hits         2379     2409      +30     
  Misses        299      299              


@YanisLalou YanisLalou changed the title PredictionEntropyScorer output negative scores [TO_REVIEW] PredictionEntropyScorer output negative scores Jan 29, 2024
tgnassou
tgnassou previously approved these changes Jan 31, 2024
@YanisLalou
Collaborator Author

Actually, there was another mistake. We now directly compute the mean of the per-sample entropies.
However, something is still bothering me. The outputted score is now bounded by the maximum of $-x\log x$ with $x$ a probability. Since that maximum is $-\frac{1}{e}\log\frac{1}{e} = \frac{1}{e}$, we get $score \in [0, \frac{1}{e}]$ and not $\in [0, 1]$ as we could expect for a score.
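A quick numeric check of the bound discussed above (a throwaway snippet, not project code): the per-class term $-x\log x$ peaks at $x = 1/e$, where it equals $1/e \approx 0.368$.

```python
import math

# Maximize t(x) = -x * log(x) on a fine grid over (0, 1).
grid = [i / 10000 for i in range(1, 10000)]
best_x = max(grid, key=lambda x: -x * math.log(x))
peak = -best_x * math.log(best_x)

print(best_x, peak)  # both close to 1/e ≈ 0.3679
```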

@tgnassou
Collaborator

I reread the paper. For the entropy scorer, we want to minimize the entropy, so `greater_is_better` is `False` here. And entropy is not a score, which is why it is not bounded in $[0, 1]$.

@tgnassou
Collaborator

Maybe we can change the name.

@YanisLalou
Collaborator Author

Yes, my bad, you're right: `greater_is_better` is `False` here.
So yes, we should change the name,
OR we could scale the entropy to lie in $[0, 1]$ and call it a score (we kind of already do that by computing a mean instead of a sum).
Honestly, I prefer the second option, since it will be easier to plot the results next to other scores bounded in $[0, 1]$, like accuracy, recall, precision...

@tgnassou tgnassou dismissed their stale review January 31, 2024 13:53

Modification of the PR

@tgnassou
Collaborator

I think the best thing is to put a sum instead of the mean and implement what is done in the paper. I think the mean was a mistake. We can plot it like a loss.

@YanisLalou
Collaborator Author

Not a fan of removing the mean. If we do that, the output will be proportional to the number of samples, so it will be hard to compare the effectiveness of methods across different datasets.
(Plus, if we're treating this class as a loss: PyTorch losses compute `mean(L)` by default, not `sum(L)`.)
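To illustrate the scaling point (a toy sketch with a hypothetical helper, not project code): with a sum reduction the value grows linearly with the number of samples, while the mean is invariant to dataset size.

```python
import math

def entropy(proba, reduction):
    """Total or mean entropy of per-sample probability vectors."""
    per_sample = [-sum(p * math.log(p) for p in row if p > 0) for row in proba]
    if reduction == "sum":
        return sum(per_sample)
    return sum(per_sample) / len(per_sample)

small = [[0.5, 0.5]] * 10     # 10 maximally uncertain samples
large = [[0.5, 0.5]] * 1000   # same distribution, 100x more samples

# entropy(large, "sum") is 100x entropy(small, "sum"),
# while the means are identical.
```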

@YanisLalou
Collaborator Author

Would love to have the thoughts of @kachayev @rflamary

@tgnassou
Collaborator

Yeah, maybe a `reduction` parameter, so we can match the paper if wanted, with options `None` or `'mean'`.

@rflamary
Collaborator

Good idea to handle both (state in the doc which one corresponds to the paper).
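A hedged sketch of what such a `reduction` parameter could look like (hypothetical name and signature; the merged skada code may differ): `'sum'` matches the paper's definition, `'mean'` keeps values comparable across dataset sizes, and `'none'` returns the per-sample entropies unreduced.

```python
import math

def prediction_entropy_score(proba, reduction="mean"):
    """Per-sample prediction entropies, optionally reduced.

    reduction='sum'  -> the paper's definition E(X_T)
    reduction='mean' -> size-invariant average entropy
    reduction='none' -> raw per-sample values
    """
    per_sample = [-sum(p * math.log(p) for p in row if p > 0) for row in proba]
    if reduction == "none":
        return per_sample
    if reduction == "sum":
        return sum(per_sample)
    if reduction == "mean":
        return sum(per_sample) / len(per_sample)
    raise ValueError(f"unknown reduction: {reduction!r}")
```

Documenting in the docstring that `'sum'` is the paper's formulation, as suggested above, keeps both behaviors available without ambiguity.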

@kachayev
Collaborator

That's an interesting question...

Usually, I'm a 'theory absolutist' when it comes to the questions like this one. If we call it 'entropy', it should perform 'sum'. Otherwise it's no longer entropy.

In PyTorch, the 'reduction' parameter for losses is typically (as it should be) a 'batch-related' setting: it only tells the system what to do with the fact that the loss is typically a per-item thing, which means that for a batch we have an array of those, while gradient descent requires a scalar. Thus it provides 'sum' or 'mean' options for how to collapse the array of values into a scalar. This doesn't change the nature of the loss itself, just the mini-batch GD. Also, from the gradient perspective it's just a matter of a multiplicative scalar.

You are right that with a larger number of samples the max entropy of the system increases. As it should... The hypothesis that 'entropy divided by the number of samples' makes for a better comparison between different methods is interesting, though by no means obvious or intuitive.

@kachayev
Collaborator

kachayev commented Feb 1, 2024

Also,

> $\in [0, 1]$ as we could expect for a score

Where is this stated for sklearn scores to be from a particular range? I'm not sure I've seen this requirement.

@YanisLalou
Collaborator Author

> Also,
>
> > $\in [0, 1]$ as we could expect for a score
>
> Where is this stated for sklearn scores to be from a particular range? I'm not sure I've seen this requirement.

It's not a requirement, but to the best of my recollection, every class ending in 'score'/'scorer' is upper-bounded by 1; otherwise, they are referred to as 'loss', 'error', and so on.
I'm not 100% sure of that, but I quickly looked at https://scikit-learn.org/stable/modules/model_evaluation.html and it seems consistent with what I'm saying.

@rflamary rflamary changed the title [TO_REVIEW] PredictionEntropyScorer output negative scores [MRG] PredictionEntropyScorer output negative scores Feb 19, 2024
@YanisLalou YanisLalou requested a review from rflamary February 20, 2024 13:48
@rflamary rflamary merged commit bf4a427 into scikit-adaptation:main Feb 20, 2024
7 checks passed