feat: Instructions for setting up Grafana/Prometheus for monitoring local lotus node #11276
This PR also includes a location for our Grafana dashboards, which we should maintain in the repo.
This is really nice! 👏 I will open an issue in Lotus-Docs as well, so that we can port this guide over there once it lands in a stable release!
Suggested a couple of typo fixes. The title of the PR can also be updated to align with the PR checklist.
This is awesome, thank you fridrcik!
Apart from the typos, this LGTM!
Working well on Arch Linux too!
Related Issues
#10888
Proposed Changes
This PR adds documentation for installing and setting up Prometheus, Grafana and node_exporter to get a fully working monitoring system running against a locally running lotus node. Currently the instructions have been tested on Mac M1 and Ubuntu Linux.
This PR includes a pre-configured Prometheus configuration as well as an initial dashboard I created for investigating where time is spent in ApplyBlocks, which is executed as part of ExecuteTipSet. We should aim to have dashboards targeted at each type of user (miners, rpc providers, core devs, etc.) and maintain them inside the Lotus codebase (currently using metrics/grafana as the location for that). I leave that up to future PRs.
Test plan
After following the installation readme, you should get something like (showing the default lotus metric dashboard):

And if you set up node_exporter, you get a rich dashboard for viewing your system metrics:

Future work
Although the monitoring system described here should give a good overview and analysis, there is still a lot more we can do to extend it with more capabilities (especially after we migrate to OpenTelemetry, see #11268):
Checklist
Before you mark the PR ready for review, please make sure that:
<PR type>: <area>: <change being made>
fix: mempool: Introduce a cache for valid signatures
PR type: fix, feat, build, chore, ci, docs, perf, refactor, revert, style, test
area, e.g. api, chain, state, market, mempool, multisig, networking, paych, proving, sealing, wallet, deps
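As a rough illustration of the title format in the checklist, the sketch below checks a title against `<PR type>: <area>: <change being made>`. The function name `check_pr_title` and the regular expression are hypothetical and are not part of the Lotus tooling. Only the PR type is validated against the list above, since the areas are given as examples rather than as a closed set.

```python
import re
from typing import List

# PR types as listed in the checklist.
PR_TYPES = {
    "fix", "feat", "build", "chore", "ci", "docs",
    "perf", "refactor", "revert", "style", "test",
}

# Hypothetical pattern for "<PR type>: <area>: <change being made>".
TITLE_RE = re.compile(r"^(?P<type>[a-z]+): (?P<area>[^:]+): (?P<change>.+)$")


def check_pr_title(title: str) -> List[str]:
    """Return a list of problems found in the title (empty if none)."""
    match = TITLE_RE.match(title)
    if match is None:
        return ["title does not match '<PR type>: <area>: <change being made>'"]
    problems = []
    if match.group("type") not in PR_TYPES:
        problems.append("unknown PR type: " + match.group("type"))
    # The area is not checked: the checklist only gives example areas.
    return problems


if __name__ == "__main__":
    print(check_pr_title("fix: mempool: Introduce a cache for valid signatures"))
```

Under this sketch, a title with a type but no area (such as this PR's original title) is reported as not matching the format, which is consistent with the reviewer's note that the title could be updated.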