Add a test to check that Evaluator evaluations match transformers examples #163
Conversation
The documentation is not available anymore as the PR was closed or merged.
Hi @fxmarty, thanks for working on this! That does indeed look useful, kind of a compatibility test between `Trainer` and `Evaluator`.
I have been thinking about this some more: although this is a great tool for debugging the …
Agreed, it's probably a bit overkill, although checking against the PyTorch examples ensures that we keep matching the behavior of the Trainer / PyTorch example scripts in the future. If you would prefer a fixed value in the test, let me know and I will edit accordingly!
After discussing with @LysandreJik I think we can add it. It will be an interesting signal if this fails. Thanks a lot for adding this! I left a few minor comments, let me know if you have any questions.
I don't think we need subfolders at this point; you can just create a new file in `tests` (e.g. `test_trainer_evaluator_parity.py`).
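As a rough illustration of the suggestion above, here is a minimal sketch of what such a `tests/test_trainer_evaluator_parity.py` could contain. The checkpoint (`distilbert-base-uncased-finetuned-sst-2-english`), the IMDB slice, and the column/label choices are assumptions for illustration, not details taken from the PR itself.

```python
# Sketch of a Trainer vs. Evaluator parity test (assumed checkpoint and dataset).
import numpy as np
from datasets import load_dataset
from evaluate import evaluator, load
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)


def test_text_classification_parity(tmp_path):
    model_name = "distilbert-base-uncased-finetuned-sst-2-english"  # assumed checkpoint
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    raw_dataset = load_dataset("imdb", split="test[:64]")  # small slice for speed
    metric = load("accuracy")

    # Reference numbers from Trainer.evaluate(), mirroring the PyTorch example scripts.
    def preprocess(examples):
        return tokenizer(examples["text"], truncation=True)

    def compute_metrics(eval_pred):
        logits, labels = eval_pred
        return metric.compute(predictions=np.argmax(logits, axis=-1), references=labels)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=str(tmp_path)),
        eval_dataset=raw_dataset.map(preprocess, batched=True),
        tokenizer=tokenizer,
        compute_metrics=compute_metrics,
    )
    trainer_accuracy = trainer.evaluate()["eval_accuracy"]

    # Same evaluation through the Evaluator / pipeline path.
    evaluator_results = evaluator("text-classification").compute(
        model_or_pipeline=model,
        tokenizer=tokenizer,
        data=raw_dataset,
        metric=metric,
        input_column="text",
        label_column="label",
        label_mapping={"NEGATIVE": 0, "POSITIVE": 1},  # matches this checkpoint's labels
    )

    assert np.isclose(trainer_accuracy, evaluator_results["accuracy"])
```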
LGTM! 🚀
Basically just to check that `Trainer` and `Evaluator` return the same evaluations. This was useful for me when working on Optimum to debug slight differences between evaluations with pipelines vs. `Trainer` (https://github.com/fxmarty/optimum/tree/tests-benchmark/tests/benchmark/onnxruntime), where depending on the task you need to be careful with the kwargs passed to the pipeline to match the output.

Let me know if you think such tests for each task are useful or not. If not, I will just add `TokenClassificationEvaluator`, `QuestionAnsweringEvaluator` and `ImageProcessingEvaluator` without the tests.
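For context on the "careful with the kwargs" remark, here is a hypothetical sketch of passing a pre-built pipeline to the evaluator so that tokenization lines up with what the Trainer-based example scripts do. The checkpoint, dataset slice, and `truncation=True` setting are assumptions for illustration, not details from the PR.

```python
# Hypothetical illustration: the example scripts tokenize with truncation, so a
# pipeline handed to the Evaluator may need the same setting to produce
# identical metrics on long inputs.
from datasets import load_dataset
from evaluate import evaluator, load
from transformers import pipeline

model_name = "distilbert-base-uncased-finetuned-sst-2-english"  # assumed checkpoint

# Build the pipeline explicitly so tokenizer kwargs (here, truncation) are under our control.
pipe = pipeline("text-classification", model=model_name, truncation=True)

results = evaluator("text-classification").compute(
    model_or_pipeline=pipe,
    data=load_dataset("imdb", split="test[:64]"),
    metric=load("accuracy"),
    label_mapping={"NEGATIVE": 0, "POSITIVE": 1},
)
print(results["accuracy"])
```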