General publisher plugin error handling questions #964
Comments
I am going to let someone like @jcooklin or @tjmcs comment on #1. On #2 I would vote no, since the workflow may give up once its deadline passes. For #2 and #3, we have an RFC being written now for optionally buffering both processor and publisher calls for a configurable period of time, with retry. This would prevent workflow data from being lost and allow reentry into processor and publisher calls that time out. This should solve your problem without making the publisher maintain more state. We don't expect the new RFC to be very difficult to implement.
The setting for failures is 10. Right now it's set in rest here. It would be pretty low-hanging fruit if we wanted to make it configurable.
@IRCody Looks like it's a very easy fix to make it configurable. I am wondering whether it would make sense to allow never failing the task by passing in -1.
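To make the -1 idea concrete, here is a minimal Go sketch of a configurable failure limit. It is not Snap's scheduler code: the `failureTracker` type, its fields, and the `record` method are hypothetical names, and it assumes the limit counts consecutive failures and that a successful run resets the count.

```go
package main

// failureTracker counts consecutive failed runs of one task.
// Hypothetical type for illustration only; not part of Snap.
type failureTracker struct {
	limit       int // consecutive failures allowed before disabling; -1 means never disable
	consecutive int // failures seen since the last successful run
}

// record registers the outcome of one run and reports whether the
// task should now be disabled.
func (f *failureTracker) record(ok bool) bool {
	if ok {
		// Assumption: any success resets the failure streak.
		f.consecutive = 0
		return false
	}
	f.consecutive++
	// A negative limit turns the check off entirely.
	return f.limit >= 0 && f.consecutive >= f.limit
}
```

With `limit: 10` the tenth failure in a row disables the task, matching the current fixed setting; with `limit: -1` the task keeps running no matter how many publishes fail.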
#967 has been created to capture #1.

For #2, I feel it might make sense for some plugins to implement their own internal retry, separate from the fact that the framework will retry a task some

@iceycake: Related to #3, we've had someone else recently request spooling on failure (process/publish), so I just added a separate issue for it (#966). Any comments you have on #966 would be greatly appreciated.
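For the plugin-side retry mentioned for #2, a rough Go sketch of what an internal retry loop inside a publisher could look like follows. Everything here is an assumption for illustration: `publishWithRetry`, `errTimeout`, and the doubling backoff are hypothetical and not part of the Snap plugin API, and the total retry time would still need to fit inside the workflow deadline.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTimeout stands in for a timeout reported by the TSDB client.
// Hypothetical sentinel error for illustration only.
var errTimeout = errors.New("publish timed out")

// publishWithRetry calls publish up to attempts times, sleeping between
// tries and doubling the wait each time. It returns nil on the first
// success, or the last error wrapped with the attempt count.
func publishWithRetry(publish func() error, attempts int, backoff time.Duration) error {
	if attempts < 1 {
		attempts = 1 // always try at least once
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = publish(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(backoff) // no sleep after the final attempt
			backoff *= 2
		}
	}
	return fmt.Errorf("publish failed after %d attempts: %w", attempts, err)
}
```

A publisher could wrap its write call as `publishWithRetry(func() error { return send(batch) }, 3, 500*time.Millisecond)`, where `send` and `batch` are placeholders for the plugin's own write function and metric batch. Only if all attempts fail does the error reach the framework and count toward the task's failure limit.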
Hey @iceycake, thank you again for your great questions! I just wanted to complete the loop on this issue and make sure we answered everything before closing. The first point was resolved by issue #967. The second was addressed by @jcooklin above. And for the third, I would suggest continuing the conversation over on issue #966. Please feel free to add your comments there or open up a new issue if we missed something. Thanks again for your questions and for contributing to Snap!
I am creating a task that collects some system metrics using the built-in plugins and publishes the metrics to TSDB. Due to the extremely high load on the TSDB cluster, we occasionally get timeouts from the TSDB publisher.
Some questions: