🚀 AgentPool has separate levers for cooldownPeriodSeconds and scalingPeriodSeconds #341
Comments
Hi @starlightromero, could you please provide more context on what exactly you are asking re
@sheneska another way to put it might be to have separate cooldowns for scale-out vs. scale-in. It's of particular concern when scaling to zero, because no agents will be launched until the cooldown period expires, no matter how big the queue is. Right now, we work around it with a very short cooldown period (about 1 minute), so that if there are no agents it takes at most a minute to launch one. The downside is that agents disappear very quickly after a run, and relaunching one takes a bit of time, which adds delay to the next run. Ideally, after being launched, an agent sticks around for a while, say 30 minutes, so that subsequent runs have an available agent to use. But if we set the cooldown to 30 minutes, the pool scales to zero, and 2 minutes later we have another run, that run will wait 28 minutes before another agent is launched. So the ability to have asymmetrical cooldown times would be especially helpful: we want to scale out quickly in response to load, and scale in more slowly to reduce latency for runs that start close together in time.
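The 28-minute figure in the comment above is simple arithmetic. Here is a minimal Python sketch of it, assuming (as the comment describes) that no scaling event is allowed until the cooldown has elapsed since the previous one. The function name and the 60-second scale-out value are hypothetical, chosen only for illustration; they are not part of the operator's API.

```python
def wait_before_scale_out(seconds_since_last_scaling: int, cooldown_seconds: int) -> int:
    """Seconds a queued run waits until the next scaling event is allowed.

    Assumes scaling is blocked until `cooldown_seconds` have passed since
    the previous scaling event (here, the scale-in to zero).
    """
    return max(0, cooldown_seconds - seconds_since_last_scaling)


# Shared 30-minute cooldown: the pool scaled to zero and a run arrives 2 minutes later.
shared_wait = wait_before_scale_out(2 * 60, 30 * 60)

# Asymmetric cooldowns (hypothetical): scale-out is gated by a 60-second cooldown,
# while the 30-minute value would apply to scale-in only.
asymmetric_wait = wait_before_scale_out(2 * 60, 60)

print(shared_wait // 60)   # minutes waited with a single shared cooldown
print(asymmetric_wait)     # seconds waited with a short scale-out cooldown
```

With a shared cooldown the run waits 28 minutes; with a separate, short scale-out cooldown it does not wait on the cooldown at all.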
It's important to me to be able to manage time to scale up independently from time to scale down, so that I can better manage the tradeoff between cost control and user experience.
Thanks for the additional context. It's really helping us understand the impact of this potential change. We've included it as a candidate for our next round of planning.
Description
Currently, cooldownPeriodSeconds sets the time to wait between scaling events. It would be useful if the time to wait between scaling events could be decoupled from the time the agents stick around after a run. I propose that scalingPeriodSeconds be the time to wait between scaling events, and that cooldownPeriodSeconds be the time to wait after a run before starting scalingPeriodSeconds.
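One possible reading of the proposal, sketched in Python. This is an interpretation, not the operator's implementation. It assumes that scalingPeriodSeconds gates every scaling event and that cooldownPeriodSeconds additionally requires the pool to have been idle since the last run before scaling in. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PoolTimers:
    cooldown_period_seconds: int  # idle time required after the last run before scale-in
    scaling_period_seconds: int   # minimum gap between any two scaling events


def scaling_gap_elapsed(t: PoolTimers, now: int, last_scaling_event: int) -> bool:
    """True once enough time has passed since the previous scaling event."""
    return now - last_scaling_event >= t.scaling_period_seconds


def can_scale_out(t: PoolTimers, now: int, last_scaling_event: int) -> bool:
    """Scale-out depends only on the (short) scaling period."""
    return scaling_gap_elapsed(t, now, last_scaling_event)


def can_scale_in(t: PoolTimers, now: int, last_scaling_event: int,
                 last_run_finished: int) -> bool:
    """Scale-in also requires the agents to have been idle for the cooldown."""
    idle_long_enough = now - last_run_finished >= t.cooldown_period_seconds
    return idle_long_enough and scaling_gap_elapsed(t, now, last_scaling_event)


# Example values taken from the discussion: 30-minute cooldown, 1-minute scaling period.
timers = PoolTimers(cooldown_period_seconds=1800, scaling_period_seconds=60)
```

Under this reading, a run that arrives 2 minutes after the last scaling event can trigger a scale-out immediately, while idle agents are kept for 30 minutes after their last run.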
Potential YAML Configuration
References
N/A