add prometheus monitoring foundation #5736
Should we replace all Datadog metrics with Prometheus?
@rodrigok I cast my vote 100% for replacing Datadog with Prometheus. I think this will let us fine-tune what we want / need to monitor within Rocket.Chat itself too.
@rodrigok 100% vote also here to replace Datadog 😄
One gotcha to share when hooking up Prometheus: for a multi-instance install, you need to connect directly to each instance (not through the load balancer) to get a meaningful metrics feed.
If we're going to add more stats this way, then let's move them all to a new
Should we go so specific as to
How generic is the data returned?
@graywolf336 I think you're right. At the very least it should move to:
Not quite an "api" ... it is a data feed that is accessed via http / https
While I can understand what it is you're trying to do, it should be implemented in a manner that is extensible and easily supports additional metric systems beyond the ones that support the Prometheus format. Before this makes it into production, I would like to see the changes this pull request added changed to use internal the cc: @engelgabriel
@Sing-Li what other systems use the same data feed format as Prometheus?
I think that the API should be a wrapper around our own stats collector, and not yet another stats collecting engine. This way we can support multiple systems, and even show some basic data and graphs on the admin panel.
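The "wrapper around our own stats collector" idea could look roughly like the sketch below. It is only an illustration: `StatsCollector`, `exporters`, and the metric names are hypothetical and do not exist in Rocket.Chat, and the Graphite plaintext format is used merely as an example of a second output target.

```javascript
// Hypothetical internal collector; names are illustrative, not Rocket.Chat APIs.
class StatsCollector {
  constructor() {
    this.counters = new Map();
  }

  // Add `amount` to the named counter, creating it on first use.
  increment(name, amount = 1) {
    this.counters.set(name, (this.counters.get(name) || 0) + amount);
  }

  // Return a plain-object copy so exporters cannot mutate internal state.
  snapshot() {
    return Object.fromEntries(this.counters);
  }
}

// Each exporter is a pure function from a snapshot to an output string.
const exporters = {
  // JSON, e.g. for basic graphs on an admin panel.
  json: (snapshot) => JSON.stringify(snapshot),
  // Graphite plaintext protocol: "<path> <value> <unix-seconds>" per line.
  graphite: (snapshot, unixSeconds) =>
    Object.entries(snapshot)
      .map(([name, value]) => `${name} ${value} ${unixSeconds}\n`)
      .join(''),
};

const stats = new StatsCollector();
stats.increment('messages.sent');
stats.increment('messages.sent');
stats.increment('users.created');

const asJson = exporters.json(stats.snapshot());
const asGraphite = exporters.graphite(stats.snapshot(), 1700000000);
```

Supporting another monitoring system would then mean adding one more entry to `exporters`, without touching the code that records the stats.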
@engelgabriel I agree 👍
@graywolf336 As stated previously, as a single-endpoint data feed, you are free to locate it anywhere. It is not an API. @engelgabriel it actually is. The client part of Prometheus does not care how metrics are collected. It is just a data structure (a buffer) that is updated and then exported/rendered to the feed. How you collect those metrics is absolutely flexible and not dictated in any way - so if @graywolf336 chooses to use Prometheus solves the instrumentation and monitoring problem elegantly, and allows for minimal disturbance to existing systems for its implementation - hence its growing popularity.
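A minimal sketch of the "buffer that is updated and then rendered to the feed" idea follows. It deliberately does not use the real Prometheus client library: the `Counter` class, `renderFeed`, and the metric name `rocketchat_messages_sent_total` are assumptions made for illustration, and the output follows the Prometheus text exposition format (`# HELP`, `# TYPE`, then the sample line).

```javascript
// Illustrative only: a hand-rolled counter, not the real Prometheus client.
class Counter {
  constructor(name, help) {
    this.name = name;
    this.help = help;
    this.value = 0;
  }

  // Collection side: any code path may call this whenever an event occurs.
  inc(amount = 1) {
    if (amount < 0) {
      throw new Error('counters can only increase');
    }
    this.value += amount;
  }
}

// Rendering side: turn the current values into Prometheus text format.
function renderFeed(counters) {
  return counters
    .map((c) => `# HELP ${c.name} ${c.help}\n# TYPE ${c.name} counter\n${c.name} ${c.value}\n`)
    .join('');
}

// Hypothetical metric name, chosen for this example.
const messagesSent = new Counter(
  'rocketchat_messages_sent_total',
  'Accumulated number of messages sent'
);
messagesSent.inc();
messagesSent.inc();

// This string is what an HTTP handler would return to the scraper.
const feed = renderFeed([messagesSent]);
```

The point of the split is that `inc()` can be called from anywhere, while `renderFeed` only runs when the feed is scraped.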
@engelgabriel as most of the metrics enumerated in #726 are counters, and the https://moovel.github.io/teamchatviz/ site referenced in #3824 are time series graphs ... this is almost a classic use-case for a Prometheus-Grafana pipeline. But now I do understand (thx 4 the offline talk) that there are some plans already in motion to accomplish some part of this using REST APIs and other means. So indeed it is your call on the approach - to avoid duplication of effort and resources.
RocketChat.sendMessage user, message, room
RocketChat.metrics.messagesSent.inc()
This line broke several things which rely on the return result of RocketChat.sendMessage.
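One plausible cause is CoffeeScript's implicit return: a function returns its last expression, so a counter increment placed after the `RocketChat.sendMessage` call replaces the value callers used to receive. The JavaScript sketch below illustrates this under that assumption; `sendMessage`, `metrics`, `sendBroken`, and `sendFixed` are hypothetical stand-ins, not the real Rocket.Chat code.

```javascript
// Stand-ins for illustration; not the real Rocket.Chat objects.
const metrics = {
  messagesSent: {
    value: 0,
    inc() {
      this.value += 1;
    },
  },
};

function sendMessage(user, message, room) {
  return { user, room, msg: message };
}

// Mirrors what CoffeeScript would compile to: the last expression is
// returned, so callers receive the result of inc() (undefined here).
function sendBroken(user, message, room) {
  sendMessage(user, message, room);
  return metrics.messagesSent.inc();
}

// Capture the result first, count, then return the original value.
function sendFixed(user, message, room) {
  const result = sendMessage(user, message, room);
  metrics.messagesSent.inc();
  return result;
}

const broken = sendBroken('alice', 'hi', 'general');
const fixed = sendFixed('alice', 'hi', 'general');
```

In the CoffeeScript source, the equivalent fix would be to assign the result of `RocketChat.sendMessage` to a variable, call `inc()`, and end the function with that variable.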
I propose we remove the sendMessage hook here and work on a PR to make use of the statistics collection we are already creating. That way we don't adversely affect things.
Add support for Prometheus monitoring (https://prometheus.io/)
This can become our own foundation to instrument everything. A first step in realizing #5730
Only one sample metric has been added: the accumulated number of messages sent
But more can be readily added
Prometheus exposes useful default Node.js metrics
There are excellent add-on modules that can expose GC metrics, Docker metrics, as well as OS metrics.
Used the following prometheus.yml for testing and development: