
Pass metric init_kwargs from the Evaluators to metrics #351

Conversation

NimaBoscarino
Contributor

Some metrics require arguments at init-time to load them appropriately, such as the HONEST metric. This PR adds a metric_init_kwargs option to the base Evaluator, which then passes those kwargs down to the load_metric method.
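Roughly, the usage I have in mind is the sketch below (the task, model, dataset, and the specific kwarg are just placeholders to show where the new option would go; metric_init_kwargs would simply be forwarded to evaluate.load):

```python
from evaluate import evaluator

task_evaluator = evaluator("text-classification")

# metric_init_kwargs (the option added in this PR) would be forwarded to
# evaluate.load(metric, **metric_init_kwargs) inside the evaluator.
results = task_evaluator.compute(
    model_or_pipeline="distilbert-base-uncased-finetuned-sst-2-english",
    data="imdb",
    metric="honest",                    # placeholder metric
    metric_init_kwargs={"lang": "en"},  # illustrative; the exact kwarg depends on the metric
)
```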

Not quite sure what the best way to write tests for this is, since there isn't a suite specifically for the base evaluator. Let me know if you have any suggestions!

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Nov 9, 2022

The documentation is not available anymore as the PR was closed or merged.

@lvwerra
Member

lvwerra commented Nov 10, 2022

You can pass an already loaded metric to the evaluator - wouldn't that solve this issue?
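I.e. something along these lines (a sketch; the model, dataset, and metric are just examples, and the comment shows where a metric's init args would go):

```python
import evaluate
from evaluate import evaluator

# Load the metric yourself, with whatever init arguments it needs,
# e.g. evaluate.load("honest", "en") for a metric that takes a config.
metric = evaluate.load("accuracy")

task_evaluator = evaluator("text-classification")
results = task_evaluator.compute(
    model_or_pipeline="distilbert-base-uncased-finetuned-sst-2-english",
    data="imdb",
    metric=metric,  # pass the loaded module instead of a string name
)
```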

@NimaBoscarino
Contributor Author

NimaBoscarino commented Nov 10, 2022

Ah good point, I should've clarified – this is specifically to make the eventual Evaluation Suite (#337) easier to use. Originally I'd done this for the JSON-only version of the Evaluation Suite (#302) since there wasn't a way to pass init_kwargs to metrics there, but even with the suite-as-code version it's still a bit awkward to have to load metrics above the suite and then pass them down.

Especially if I have multiple tasks to run that need the same metric but with different configurations, I'd end up loading multiple versions of the metric under a number of different variable names, which makes the resulting code confusing to read IMO.
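Concretely, the pattern I'm trying to avoid looks something like this (a sketch; the metric and its configurations are placeholders):

```python
import evaluate

# Several copies of the same metric, loaded with different configurations,
# each needing its own variable name just so it can be wired into the suite.
honest_en = evaluate.load("honest", "en")
honest_es = evaluate.load("honest", "es")

# ... each one then has to be passed down into the right task by hand.
```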

EDIT: I guess the other option is to have the Evaluation Suite receive the metric_init_kwargs and load the metric in there before calling task_evaluator.compute.

@lvwerra
Member

lvwerra commented Nov 14, 2022

If we need a metric with custom init args we could pass them in the SubTask definition, right? E.g.
"metric": evaluate.load("metric", important_setting=1) (it doesn't need to be a string). I think it would be good to keep the signatures of EvaluationSuite and Evaluator as clean as possible, and if there is already a way to achieve the goal then we shouldn't create multiple ways to do the same thing. That being said, if it's a blocker for something you want to do we can add it :)

@NimaBoscarino
Contributor Author

Oh good point, I hadn't considered passing something other than a dict of strings to args_for_task 😅 I'll try that out, and if it looks like it works alright then I'll close this PR.

@NimaBoscarino
Contributor Author

Confirmed! Doing it that way works, so this PR isn't needed. Closing this!
