Prometheus and PeerLastSeen out of sync #11

Open
t-lin opened this issue Jun 15, 2020 · 1 comment
Labels
bug (Something isn't working), investigate (This doesn't seem right; triage and analyze)

Comments

@t-lin
Member

t-lin commented Jun 15, 2020

After running for a long while, Prometheus ends up holding metrics with 0 RTT for peers that do not exist in ping-monitor's PeerLastSeen data structure. Conversely, ping-monitor occasionally logs error messages when PeerLastSeen's expiry mechanism tries to delete gauges that do not exist in Prometheus.

This can be seen by calling hl-cli list periodically over a long period of time. A few metrics with 0 RTT appear in the dashboard even though the corresponding peers do not exist within the PeerLastSeen structure.

Attempted to wrap all access to the Prometheus client code in ping-monitor in a mutex, but that doesn't seem to fix it.
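One thing that may be worth ruling out: a mutex that guards only the Prometheus calls still leaves a window between the PeerLastSeen update and the gauge update, so an expiry can interleave with an observation. The sketch below is not ping-monitor's code. `Registry`, `Observe`, `Expire` and `InSync` are hypothetical names, the gauges are modelled as a plain map instead of the Prometheus client, and timestamps are plain Unix seconds to keep it small. It only illustrates mutating both structures inside the same critical section.

```go
package main

import (
	"fmt"
	"sync"
)

// Registry is a hypothetical type that keeps the last-seen table and the
// gauge values behind ONE mutex, so they can only change together.
type Registry struct {
	mu       sync.Mutex
	lastSeen map[string]int64   // peer ID -> last-seen time (Unix seconds)
	gauges   map[string]float64 // peer ID -> RTT; stand-in for Prometheus gauges
}

func NewRegistry() *Registry {
	return &Registry{
		lastSeen: make(map[string]int64),
		gauges:   make(map[string]float64),
	}
}

// Observe records a ping result. Both maps are updated under the same lock.
func (r *Registry) Observe(peer string, rtt float64, nowUnix int64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.lastSeen[peer] = nowUnix
	r.gauges[peer] = rtt
}

// Expire removes every peer not seen within ttlSeconds from BOTH maps and
// returns how many peers were removed.
func (r *Registry) Expire(nowUnix, ttlSeconds int64) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	removed := 0
	for peer, seen := range r.lastSeen {
		if nowUnix-seen > ttlSeconds {
			delete(r.lastSeen, peer)
			delete(r.gauges, peer)
			removed++
		}
	}
	return removed
}

// InSync reports whether both maps hold exactly the same set of peers.
func (r *Registry) InSync() bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if len(r.lastSeen) != len(r.gauges) {
		return false
	}
	for peer := range r.lastSeen {
		if _, ok := r.gauges[peer]; !ok {
			return false
		}
	}
	return true
}

func main() {
	r := NewRegistry()
	r.Observe("peerA", 12.5, 0)
	r.Observe("peerB", 8.0, 50)
	// At t=70s with a 60s TTL, only peerA has expired.
	fmt.Println(r.Expire(70, 60), r.InSync())
}
```

If the two structures are already updated under a single lock, this is not the cause and the drift comes from somewhere else.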

@t-lin added the bug and investigate labels Jun 15, 2020
@t-lin
Member Author

t-lin commented Jun 15, 2020

Easiest "solution" (more of a workaround) is to simply have a separate goroutine sync the two every x minutes.
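A minimal sketch of what such a sync goroutine could look like. This is not the project's code. `GaugeStore`, `PeerTable`, `Reconcile`, `SyncLoop` and `memGauges` are hypothetical names, and the gauge store is an in-memory stand-in instead of the real Prometheus client. It treats PeerLastSeen as the source of truth and drops any gauge whose peer is no longer in it.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// GaugeStore is a hypothetical wrapper around the Prometheus gauges.
type GaugeStore interface {
	Peers() []string         // peer IDs that currently have a gauge
	Delete(peer string) bool // true if a gauge was actually removed
}

// PeerTable is a hypothetical stand-in for PeerLastSeen.
type PeerTable struct {
	mu       sync.Mutex
	lastSeen map[string]time.Time
}

func NewPeerTable(peers ...string) *PeerTable {
	t := &PeerTable{lastSeen: make(map[string]time.Time)}
	for _, p := range peers {
		t.lastSeen[p] = time.Now()
	}
	return t
}

func (t *PeerTable) Has(peer string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	_, ok := t.lastSeen[peer]
	return ok
}

// Reconcile deletes every gauge whose peer is missing from the table and
// returns the peer IDs it removed.
func Reconcile(peers *PeerTable, gauges GaugeStore) []string {
	var removed []string
	for _, id := range gauges.Peers() {
		if !peers.Has(id) && gauges.Delete(id) {
			removed = append(removed, id)
		}
	}
	return removed
}

// SyncLoop runs Reconcile on every tick until stop is closed.
func SyncLoop(interval time.Duration, peers *PeerTable, gauges GaugeStore, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			Reconcile(peers, gauges)
		case <-stop:
			return
		}
	}
}

// memGauges is an in-memory GaugeStore used only for this demo.
type memGauges struct {
	mu  sync.Mutex
	rtt map[string]float64
}

func (m *memGauges) Peers() []string {
	m.mu.Lock()
	defer m.mu.Unlock()
	ids := make([]string, 0, len(m.rtt))
	for id := range m.rtt {
		ids = append(ids, id)
	}
	return ids
}

func (m *memGauges) Delete(peer string) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	_, ok := m.rtt[peer]
	delete(m.rtt, peer)
	return ok
}

func main() {
	peers := NewPeerTable("peerA")
	gauges := &memGauges{rtt: map[string]float64{"peerA": 12.5, "peerB": 0}}
	fmt.Println(Reconcile(peers, gauges)) // peerB has a gauge but no table entry
}
```

In the real program the loop would be started with something like `go SyncLoop(5*time.Minute, peers, gauges, stop)`. A peer that is first seen between the `Has` check and the `Delete` call could lose its gauge until the next ping, which is part of why this is only a workaround.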
