Currently, retry values such as the number of tries before aborting and the delay before retrying are part of the openai-compatible API backend code.
However, different providers have different limits, which can also differ per model (groq, for example), so it would be good to set the retry behavior in an external file rather than in code that should stay unchanged across projects and experiments.
The retry library (https://github.com/invl/retry) also allows for more dynamic retrying, so a retry configuration could handle different return codes/payloads differently, making it possible to adjust retrying dynamically to the provider, API limits, or model.
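To make the proposal concrete, here is a minimal sketch of how external retry settings could be resolved per provider and model. The file layout, the key names (`default`, `providers`, `models`) and both helper functions are assumptions for illustration, not existing clembench code. The setting names `tries`, `delay` and `backoff` are chosen to mirror the parameters of the retry library's decorator, so the resolved values could be passed on to it; the loop below only stands in for that so the example has no third-party dependency.

```python
import json
import time

# Hypothetical contents of an external file such as retry_config.json.
# "some-model" is a placeholder, not a real model name.
EXAMPLE_CONFIG = """
{
  "default": {"tries": 3, "delay": 2, "backoff": 2},
  "providers": {
    "groq": {
      "default": {"tries": 5, "delay": 10, "backoff": 1},
      "models": {"some-model": {"tries": 8}}
    }
  }
}
"""


def resolve_retry_settings(config, provider, model):
    """Merge global defaults, provider defaults and model overrides, in that order."""
    settings = dict(config.get("default", {}))
    provider_cfg = config.get("providers", {}).get(provider, {})
    settings.update(provider_cfg.get("default", {}))
    settings.update(provider_cfg.get("models", {}).get(model, {}))
    return settings


def call_with_retry(func, settings, exceptions=(Exception,), sleep=time.sleep):
    """Call func, retrying on the given exceptions according to the settings."""
    tries = settings.get("tries", 1)
    delay = settings.get("delay", 0)
    backoff = settings.get("backoff", 1)
    for attempt in range(1, tries + 1):
        try:
            return func()
        except exceptions:
            if attempt == tries:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= backoff  # grow the wait between attempts


config = json.loads(EXAMPLE_CONFIG)
```

With this layout, `resolve_retry_settings(config, "groq", "some-model")` picks up the groq defaults and the model-specific `tries`, while an unlisted provider falls back to the global defaults. Handling specific return codes or payloads differently would need an extra mapping in the file, which is left out here.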
Maybe it makes sense to have a configuration file for each benchmark run that includes info such as:
- max tokens to generate
- delay time, number of tries
sherzod-hakimov changed the title from "[backends] External retry settings for openai-compatible API backend" to "[backends] Benchmark configuration file to change settings such as openai-compatible API backend" on Aug 7, 2024
Right, one consolidated configuration file for a specific benchmark run could hold these kinds of settings as well. It might also allow for specific pre-set combinations of instances and models, as I've seen in student projects using clembench. These are currently done as shell scripts, which some potential users might not be familiar with.
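As a rough illustration of what such a per-run file could look like, the sketch below loads one JSON document holding the generation limit, the retry settings, and a list of pre-set model/game combinations. All field names (`max_tokens`, `retry_tries`, `retry_delay`, `runs`), the `RunConfig` class and the placeholder model and game names are hypothetical; none of this is an existing clembench format.

```python
import json
from dataclasses import dataclass, field


@dataclass
class RunConfig:
    """Hypothetical settings for one benchmark run."""
    max_tokens: int = 100        # max tokens to generate
    retry_tries: int = 3         # number of tries before aborting
    retry_delay: float = 2.0     # seconds to wait before retrying
    runs: list = field(default_factory=list)  # pre-set model/game combinations


def load_run_config(text):
    """Parse a JSON string into a RunConfig; unknown keys raise TypeError."""
    return RunConfig(**json.loads(text))


# Placeholder content standing in for a file such as run_config.json.
EXAMPLE_RUN = """
{
  "max_tokens": 250,
  "retry_tries": 5,
  "runs": [
    {"model": "model-a", "game": "game-x"},
    {"model": "model-b", "game": "game-x"}
  ]
}
"""

run_config = load_run_config(EXAMPLE_RUN)
for run in run_config.runs:
    print(f"{run['model']} on {run['game']} (max_tokens={run_config.max_tokens})")
```

A file like this could replace the shell scripts for pre-set combinations, since the list of runs lives next to the settings that apply to them. Fields left out of the file keep their defaults.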