Heartbeats: new and improved #669

Merged
merged 4 commits into from
Feb 20, 2019
cicdw
Member

@cicdw cicdw commented Feb 20, 2019

Thanks for contributing to Prefect!

Please describe your work and make sure your PR:

  • adds new tests (if appropriate)
  • updates CHANGELOG.md (if appropriate)
  • updates docstrings for any new functions or function arguments, including docs/outline.toml for API reference docs (if appropriate)

What does this PR change?

This PR updates heartbeats to ensure they are called multiple times on a fixed interval. Moreover, I updated the heartbeat tests to be more robust and truly test this functionality across executors, as well as in conjunction with timeouts.

Why is this PR important?

I began pulling this thread (pun intended) when wondering if heartbeats worked with timeouts, because they both do some multithreading / multiprocessing work under the hood. As I dove deeper, I realized that threading.Timer calls a function one time after a specified interval, so I fixed that by overriding the run method of threading.Timer.
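To illustrate the point about `threading.Timer` firing only once, here is a minimal sketch of a repeating variant that overrides `run`. It is not necessarily the code merged in this PR; the names `RepeatingTimer` and `count_beats` are hypothetical and exist only for this example.

```python
import threading
import time


class RepeatingTimer(threading.Timer):
    """Hypothetical name: a Timer that keeps firing until cancel() is called."""

    def run(self):
        # The stock Timer.run waits once on self.finished and calls the
        # function a single time. Here we loop instead: Event.wait returns
        # False on each timeout and True once cancel() has set the event.
        while not self.finished.wait(self.interval):
            self.function(*self.args, **self.kwargs)


def count_beats(interval=0.05, duration=0.3):
    """Run a heartbeat for `duration` seconds and return how often it fired."""
    beats = []
    timer = RepeatingTimer(interval, lambda: beats.append(time.monotonic()))
    timer.daemon = True  # a sidecar thread should not keep the process alive
    timer.start()
    time.sleep(duration)
    timer.cancel()  # sets the finished event, which ends the loop in run()
    timer.join()
    return len(beats)
```

With the unmodified `threading.Timer`, the same helper would record at most one beat, no matter how long `duration` is.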

Doing this work reminded me that the DaskExecutor(processes=True) does not support task timeouts as I've implemented them. This is not a problem, but it reminds me that we need to ensure our deployment executors are set up to match the full functionality in Core. In this case, the "deployed" analogue to processes=False is the --no-nanny flag on the dask workers.

Member

@jlowin jlowin left a comment

Glad you caught this issue. I have a nagging feeling that we need to find a universal method for implementing these "sidecar" threads (timeout, heartbeat, potentially progress updates, etc.).

@cicdw
Member Author

cicdw commented Feb 20, 2019

@jlowin I'm confident I understand how to implement sidecar / daemon threads safely now; however, timeouts can't be implemented as such because they need to interrupt the main process if it reaches the timeout limit.
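To make the distinction concrete, here is a rough sketch of a timeout that actually interrupts the running code, which a passive daemon thread cannot do. It assumes a Unix system and that it is called from the main thread (a requirement of `signal.signal`). It is one illustrative approach, not a description of how Prefect implements timeouts; `TaskTimeout` and `run_with_timeout` are hypothetical names.

```python
import signal
import time


class TaskTimeout(Exception):
    """Hypothetical exception raised when the time limit is exceeded."""


def run_with_timeout(fn, seconds):
    """Call fn() and raise TaskTimeout if it runs longer than `seconds`.

    Assumption: Unix only, and must be called from the main thread.
    """

    def _on_alarm(signum, frame):
        # The handler executes in the main thread, so raising here
        # interrupts whatever fn() is currently doing.
        raise TaskTimeout(f"exceeded {seconds}s")

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return fn()
    finally:
        # Always disarm the timer and restore the previous handler.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)
```

A heartbeat thread only has to run alongside the task, whereas this helper has to reach into the task's own control flow, which is why the two cannot share the same sidecar mechanism.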

@cicdw cicdw merged commit 40ad55f into master Feb 20, 2019
@cicdw cicdw deleted the heartbeats branch February 20, 2019 15:29
cicdw added a commit that referenced this pull request Dec 9, 2021
Orion release code publish step
2 participants