Integration of automatic translation evaluation into model evaluation tools #32
Comments
@hist0613 Hello. May I ask how we could run the …

@hist0613 Thank you for the detailed explanation!

@kangsuhyun-yanolja Thank you for the comment! Regarding item 1-2 in particular, I think we need to handle it now. We're already using a language detector, so it would be better to trust its output; then users won't have to select between two options. I'll create an issue about it.
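As a rough sketch of what trusting the detector could look like (assuming a detector along the lines of the langdetect package; the detector actually used in the project isn't named in this thread):

```python
# Infer the source language automatically instead of asking the user to pick it.
# langdetect is an assumption here; the project may use a different detector.
from langdetect import detect


def infer_language(text: str) -> str:
    """Return an ISO 639-1 code (e.g. 'en', 'ko') guessed from the text."""
    return detect(text)


print(infer_language("안녕하세요, 번역 품질을 평가해 주세요."))  # -> 'ko'
```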
Currently, there is a need for an automated tool that simplifies translation evaluation. The tool should be able to assess the accuracy and quality of translations produced by various models. A potential solution could be to integrate this functionality into an existing framework such as lm_evaluation_harness, or to build a standalone service.

This service could accept input in formats such as CSV or JSONL, giving users a straightforward way to obtain evaluation results. Those results could become a requirement for models aiming to participate in Arena.
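As a rough illustration of the standalone route, here is a minimal sketch that reads a JSONL file of translation records and reports a corpus-level BLEU score. The field names (`source`, `translation`, `reference`), the JSONL schema, and the choice of sacrebleu as the metric library are assumptions for illustration, not a decided design.

```python
# Minimal sketch of a standalone evaluator: score a JSONL file of translations
# against references. Schema and metric choice are illustrative assumptions.
import json
import sys

import sacrebleu  # third-party: pip install sacrebleu


def evaluate_jsonl(path: str) -> float:
    """Read {"source", "translation", "reference"} records and return corpus BLEU."""
    hypotheses, references = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            hypotheses.append(record["translation"])
            references.append(record["reference"])
    # corpus_bleu takes the hypotheses and a list of reference streams.
    bleu = sacrebleu.corpus_bleu(hypotheses, [references])
    return bleu.score


if __name__ == "__main__":
    print(f"BLEU: {evaluate_jsonl(sys.argv[1]):.2f}")
```

A real tool would presumably also accept CSV, support model-based metrics (e.g. COMET or an LLM judge), and report results in whatever format Arena admission requires.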