First of all, thank you for your high-quality open-source project. We are very interested in your EQ bench and creative writing bench, as they share many similarities with our subjective evaluations in OpenCompass. I would like to know if you would be willing to integrate these two benches into OpenCompass to enable a more diverse range of evaluations?
Here is the link to OpenCompass: https://github.com/open-compass/opencompass
And here is a demo of subjective evaluation in OpenCompass: https://github.com/open-compass/opencompass/blob/main/configs/eval_subjective_alignbench.py
Hello! Glad you like the benchmarks. I'm more than happy for them to be included in your OpenCompass eval suite. I don't have much free time at the moment to integrate them myself, but I can reassess in about a month. Otherwise, if you want to get started on it, I can answer questions and assist when I have time. :)
Hi! I will also try to integrate them when I have some free time. After integration, you can host your leaderboard here: https://hub.opencompass.org.cn/home
just like this