countme: Trigger on boot and bi-weekly with random delay #3041

Merged Dec 1, 2021 (1 commit)

Conversation

travier
Member

@travier travier commented Jul 30, 2021

The current timer will trigger weekly with a potential delay of a week,
which means that if the delay is big enough, an entire week will be
skipped and we will miss counting.

Instead, let's trigger on boot and every three days with a random delay
of one day so that we are sure that we at least trigger once per
counting window.

This does not introduce a risk of over-counting, as the logic for that
check is in rpm-ostree itself.
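
For illustration, a minimal sketch of a timer unit matching the schedule described above. This is not copied from the PR diff: the choice of `OnUnitInactiveSec=` (rather than `OnUnitActiveSec=`) and the `[Install]` section are assumptions, and the 5 minute boot delay comes from the follow-up documentation commit referenced later in this thread.

```ini
[Timer]
# Fire once shortly after boot (5 minutes per the follow-up docs commit)
OnBootSec=5m
# Assumed directive: fire again three days after the previous run finished
OnUnitInactiveSec=3d
# Spread each trigger by a random delay of up to one day
RandomizedDelaySec=1d

[Install]
WantedBy=timers.target
```

Because the interval plus the maximum delay stays well under a week, at least one trigger should land in every weekly counting window while the machine is running.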

@travier
Member Author

travier commented Jul 30, 2021

Untested yet. Will do that next week.

@cgwalters
Member

Hmm. In theory we could use something like GNetworkMonitor to reliably react to network connectivity rather than failing.

In other words, if the network isn't available we sleep until it is, rather than exiting with failure.

That said, the network availability check may not be 100% accurate, so we may want unit restarts anyway. But I think it'd be good to avoid restarting on all failures and restart only if we detect a network issue. (OK well, I guess for this service the only real issues can be network ones.)

But it looks like systemd offers RestartForceExitStatus= which would allow us to e.g. exit with a specific code like 42 to signal systemd to restart us only in this case - but not if we e.g. panic due to a configuration error.
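
A rough sketch of what that suggestion could look like in the service unit. The exit code 42 is the example value from the comment above; `RestartSec=1h` is an assumption based on the PR's original "retry hourly" title, and none of this is taken from the actual diff.

```ini
[Service]
# Restart only when the process exits with this specific code,
# which the binary would use to signal a network failure (hypothetical)
RestartForceExitStatus=42
# Assumed retry interval
RestartSec=1h
```

Any other non-zero exit, such as a panic on a configuration error, would leave the unit in a failed state without a restart.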

@travier
Member Author

travier commented Aug 12, 2021

I've tried to figure out how to set a different return code but haven't been able to do that yet. Will keep looking next week.

@travier
Member Author

travier commented Sep 20, 2021

Updated with suggestion from #3041 (comment). Still untested. Will have to figure out a way to test that.

@travier
Member Author

travier commented Sep 20, 2021

I'm not completely sold on how I implemented this, but it is the simplest approach I could find short of defining a new Error type.

cgwalters
cgwalters previously approved these changes Sep 21, 2021
jlebon
jlebon previously approved these changes Sep 22, 2021
@jlebon
Member

jlebon commented Sep 22, 2021

ext.config.rpm-ostree-countme failing in CI

@travier
Member Author

travier commented Sep 23, 2021

> ext.config.rpm-ostree-countme failing in CI

Hum, this fails before calling the code here so this is extra weird. Re-testing.

@dustymabe
Member

2nd commit LGTM - thanks for stepping through that with me @travier

cgwalters
cgwalters previously approved these changes Nov 23, 2021
@travier
Member Author

travier commented Nov 24, 2021

OK, this keeps failing in CI and I don't understand why so I'm trying to split the changes to figure out the root issue.

@travier
Member Author

travier commented Nov 24, 2021

rpm-ostree-countme.service: Service has RestartForceStatus= set, which isn't allowed for Type=oneshot services. Refusing.
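
For context, a fragment of the kind that produces this message. The `ExecStart=` line is an assumption and not copied from this PR's diff; the point is only the combination of the two directives that systemd rejects.

```ini
[Service]
Type=oneshot
# Assumed command line
ExecStart=rpm-ostree countme
# Refused at load time when combined with Type=oneshot
RestartForceExitStatus=42
```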

@travier travier changed the title from "countme: Retry hourly on failure" to "countme: Trigger on boot and bi-weekly with random delay" Nov 24, 2021
@travier
Member Author

travier commented Nov 24, 2021

Re-purposing this PR for the "second" change from @dustymabe until I find another fix. We might have to go with the initial suggestion from #3033, where we move the trigger logic into the code itself instead of relying on systemd.

Member

@dustymabe dustymabe left a comment

LGTM

@dustymabe
Member

anything else needed for this?

@cgwalters cgwalters merged commit 06f24d5 into coreos:main Dec 1, 2021
@travier travier deleted the countme-retry branch December 1, 2021 16:39
travier added a commit to travier/rpm-ostree that referenced this pull request Feb 3, 2022
- Update link to official DNF Configuration Reference documentation.
- Mention that the timer is now triggered on boot after 5 minutes and
  then bi-weekly, with a random delay in both cases.

See: coreos#3041
cgwalters pushed a commit that referenced this pull request Feb 3, 2022
- Update link to official DNF Configuration Reference documentation.
- Mention that the timer is now triggered on boot after 5 minutes and
  then bi-weekly, with a random delay in both cases.

See: #3041