[Metricbeat] Create kube-state-metrics metricset, including all objects #14306
Conversation
// ExtendedInfoMetric obtains info labels from the given metric and puts them
// into events matching all the key labels present in the metric
Do I understand correctly that the difference between this and InfoMetric is that we use it to catch all labels?
Yes, re-reading the comment, that is not clear.
Will improve it, thanks.
Also, I think most existing uses of InfoMetric could use this extended version out of the box, so one option was to evolve the current InfoMetric instead of creating a new one.
That would need a lot of testing, which I'm not in the mood for, but I don't see a reason why we should leave some labels coming from prometheus unpushed. That might be something to check for each existing use of InfoMetric if this extended type gets into this PR.
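To make the distinction being discussed concrete, here is a minimal sketch (not the actual Beats helper API; function and field names are illustrative): the plain behavior keeps only labels present in the mapping, while the extended variant also stores any non-mapped label under a configurable placement key.

```go
package main

import "fmt"

// collectLabels is a simplified illustration of the difference discussed
// above: with storeNonMapped=false only mapped labels reach the event
// (InfoMetric-like); with storeNonMapped=true every remaining label is
// stored under a placement prefix (ExtendedInfoMetric-like).
func collectLabels(labels, mapping map[string]string, storeNonMapped bool, placement string) map[string]string {
	out := map[string]string{}
	for k, v := range labels {
		if field, ok := mapping[k]; ok {
			out[field] = v // label is explicitly mapped to an event field
		} else if storeNonMapped {
			out[placement+"."+k] = v // catch-all for non-mapped labels
		}
	}
	return out
}

func main() {
	labels := map[string]string{"namespace": "default", "team": "obs"}
	mapping := map[string]string{"namespace": "kubernetes.namespace"}

	fmt.Println(collectLabels(labels, mapping, false, "labels")) // mapped labels only
	fmt.Println(collectLabels(labels, mapping, true, "labels"))  // mapped + non-mapped
}
```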
I had the same thought: we should consider moving all existing metricsets to the new behavior. This could perhaps be a parameter for InfoMetric.
Changing the signature for InfoMetric could be done, but I would rather attempt to merge both InfoMetric and ExtendedInfoMetric in a different PR.
The current InfoMetric option func parameter is intended for post-processing and does not play well with the intended behaviour here; I don't think this is doable without changing the current signature.
WDYT?
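The tension described above can be sketched as follows (a simplified illustration with made-up names, not the helper's real signatures): option funcs run after the event is built, so configuration that changes *how* the event is built in the first place does not fit the same parameter.

```go
package main

import "fmt"

// metricOption is the functional-options pattern the comment refers to:
// each option post-processes an event that has already been built.
type metricOption func(event map[string]interface{})

// infoMetric builds a base event, then applies the options afterwards.
// Configuration affecting label collection would need to run *before*
// this point, which is why it doesn't fit the option-func parameter.
func infoMetric(opts ...metricOption) map[string]interface{} {
	event := map[string]interface{}{"present": true}
	for _, opt := range opts {
		opt(event) // post-processing only
	}
	return event
}

func main() {
	withModule := func(e map[string]interface{}) { e["module"] = "kubernetes" }
	fmt.Println(infoMetric(withModule))
}
```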
That sounds good to me, as long as we keep the plan for unifying them to avoid confusion
// if label for this metric is not found at the label mappings but
// it is configured to store any labels found, make it so
// TODO dedot
labels.Put(labelsLocation+"."+k, v)
I think prometheus labels don't contain dots?
That's interesting, I will need to test it: prometheus is copying labels + values from kubernetes, which might contain dots.
I'm not sure if it encodes dots into something else. I'll keep the TODO until I test.
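For reference, the TODO above would likely look like the usual Beats dedot: rewrite dots to underscores before putting the key into the event, so a Kubernetes label key such as `app.kubernetes.io/name` doesn't collide with Elasticsearch's dotted-field semantics. This is a sketch of that idea, not the helper's actual implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// dedot rewrites dots in a label key to underscores. Prometheus label
// *names* must match [a-zA-Z_][a-zA-Z0-9_]*, so exporters usually sanitize
// them already, but label *values* copied from Kubernetes can still carry
// dots, so this stays useful as a defensive step.
func dedot(key string) string {
	return strings.ReplaceAll(key, ".", "_")
}

func main() {
	fmt.Println(dedot("app.kubernetes.io/name")) // app_kubernetes_io/name
}
```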
- module: kubernetes
  enabled: true
  filter:
    blacklist:
What would be the reasoning for a blacklist instead of using the drop_event processor?
Pros
The blacklist happens before processing, so it avoids including items in the mapping object, which should save memory + CPU + time.
In fact, I was wondering how to introduce this same feature for all prometheus metrics, but I don't want to abstract it until it is proven to be worth it for this metricset.
To me it feels better to write configuration to receive everything but configmaps and daemonsets like:

    metricset:
      kube-state-metrics:
        i-want-all-but:
          - configmaps
          - daemonsets

than:

    metricset:
      kube-state-metrics:
        processors:
          once-processed-remove:
            - events-that-match
              event.somekey equals kube_configmap_*
            - events-that-match
              event.somekey equals kube_daemonset_*

Cons
Not using drop_event is not the Beats standard, so this might be a one-off unless we find it valuable for other metrics. That said, you can still use drop_event even if blacklisting makes it into this PR.
I wonder how often people will want to disable these. What would be the expected behavior for resources that are not present in k8s? This will probably report none or only a few documents.
I agree with the doubts about introducing features no one asked for.
I think there is a total of 28 groups, among them replicationcontrollers, signingcertificaterequests, storageclasses... and the default would be enabling them all. But as soon as a user notices information that is not useful for their use case, or that might bring redundant information (sometimes pv/pvc), and that keeps on using storage, it looks like a candidate to be filtered.

What would be the expected behavior for resources that are not present in k8s?

- not present at prometheus --> no data is processed nor sent
- present and boring (storageclasses, whatever the user considers boring) --> constant data with no value

The use case for this would be the same as the drop_event processor, but besides saving storage at ES, it saves resources + time, and it groups using familiar names.
I'm leaning towards dropping the blacklist and whitelist filtering. It would be very welcome here, but I don't want to go forward until discussing it out of band with the team.
SGTM, we can always add them in a followup PR
This is looking good! Left a few questions.
Re-use of label names at different metrics
That hasn't been an issue so far, since each label was unique per family, or, if repeated, it had the same meaning and was placed at the same location in the output event. At kube-state-metrics we find these 2 families:
Retrieving a resource that contains both families using a single fetch could be done using these labels as keys
Which would group families that match the values for those labels. The problem is that while namespace, pod and container can be placed at:
The reason label is not that generic, so this does not sound that good:
Ideally this would be placed at a different location in each event.

Solution proposal
In order to do this, probably the most straightforward solution is to add an optional field to the label mapping (creating a new label type) that adds context about the metric it is being retrieved from. This way we would be able to register label mappings for elements with the same name to different event locations:

Alternatives
Dismiss the single metricset for kube-state-metrics and use the previous approach of one metricset per metric group.

@exekias waiting for your decision
Shouldn't this be
I'm wondering if this should be more something like this. If I understand things correctly there can only be one status+reason pair at the same time?
@exekias continuing on the last conversation. To be able to extract context from labels belonging to metrics, the proposal would be to declare the labels inside of the metric map, much closer to mapping the exact structure of what we find at the prometheus resource. The two examples below are metric families that define 2 labels with the same name, one of them
This proposal is compatible with the previous metrics mapping specification, and uses its objects. An ExtendedMetric object is added that allows defining labels inside the metric scope, as they are informed at prometheus. A real world example can be found at https://github.com/kubernetes/kube-state-metrics/blob/master/docs/pod-metrics.md, look for
Looking forward to your feedback.
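A minimal sketch of what metric-scoped labels could look like (type and field names here are illustrative, not the helper's actual API): two families can each map a label named "reason" to a different event location because the mapping lives inside the metric definition.

```go
package main

import "fmt"

// labelMap maps a prometheus label name to an event field.
type labelMap map[string]string

// extendedMetric carries its own label mappings, so the same label name
// can land at different event locations depending on the metric.
type extendedMetric struct {
	field  string   // event field for the metric value
	labels labelMap // metric-scoped label mappings
}

// mapMetric builds a flat event from a metric value plus its labels,
// using only the mappings declared on this metric.
func mapMetric(m extendedMetric, value float64, labels map[string]string) map[string]interface{} {
	event := map[string]interface{}{m.field: value}
	for name, field := range m.labels {
		if v, ok := labels[name]; ok {
			event[field] = v
		}
	}
	return event
}

func main() {
	// The same prometheus label "reason", mapped per metric:
	terminated := extendedMetric{field: "status.terminated", labels: labelMap{"reason": "status.terminated_reason"}}
	waiting := extendedMetric{field: "status.waiting", labels: labelMap{"reason": "status.waiting_reason"}}

	fmt.Println(mapMetric(terminated, 1, map[string]string{"reason": "OOMKilled"}))
	fmt.Println(mapMetric(waiting, 1, map[string]string{"reason": "CrashLoopBackOff"}))
}
```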
This sounds reasonable; I wonder if the mapping may get too complex in some cases. Right now for most metricsets it's really straightforward, i.e.: beats/metricbeat/module/kubernetes/state_container/state_container.go Lines 47 to 71 in 2df19e3
Would this change mean that we need to define labels for every single p.Metric there? It could be useful to explore how this one would end up looking.
It would move labels inside metrics, so some caveats:
As done before with the filters, I can move on and work on a tentative implementation and see how this would look.

Alternative
A very ugly one: when finding this reuse of family-scoped labels, decide to use only one of them and discard the less important metric.
Sounds like having a definition like that per metric family (or arbitrary metric name prefix) could be a good compromise? That way you can have a definition like the one I linked for each family. WDYT?
Not sure if I follow. There are ways to make the proposal look like that, defining labels somewhere else and then referencing them from the metric definition, making it a one-liner. I'd suggest discussing out of band if that suits you.
Discarding the proposal for now. Will bring it back if needed later.
Hitting another scenario:
I'm considering the first 3 as primary keys (keyLabels, as of the prometheus helper at metricbeat). I don't think there is currently a way of doing this, is there? If not, I guess the non-disruptive solution would be to add a param at the
@exekias agree?
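For context, the keyLabels mechanism being referred to joins metrics from different families into one event when their key label values all match. A simplified illustration of that grouping (not the helper's real implementation):

```go
package main

import (
	"fmt"
	"strings"
)

// groupKey derives a join key from the configured key labels (for example
// namespace, pod, container). Metrics from different families that produce
// the same key would be merged into a single event.
func groupKey(labels map[string]string, keys []string) string {
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, labels[k])
	}
	return strings.Join(parts, "/")
}

func main() {
	keys := []string{"namespace", "pod", "container"}
	a := map[string]string{"namespace": "default", "pod": "web-1", "container": "nginx", "reason": "OOMKilled"}
	b := map[string]string{"namespace": "default", "pod": "web-1", "container": "nginx"}
	// Both metrics land in the same event because their key labels match:
	fmt.Println(groupKey(a, keys) == groupKey(b, keys)) // true
}
```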
Another scenario:
this family is used to show memory and cpu depending on the
Where do you get the info about a container being an init container?
That's upon the metric:
see #14306 (comment)
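As noted, the init-container information comes from the metric name itself, not from a label: kube-state-metrics exposes init containers through separate `kube_pod_init_container_*` families. A minimal sketch of detecting that:

```go
package main

import (
	"fmt"
	"strings"
)

// isInitContainer distinguishes init-container families from regular
// container families by the metric name prefix, since kube-state-metrics
// carries this information in the family name rather than in a label.
func isInitContainer(metricName string) bool {
	return strings.HasPrefix(metricName, "kube_pod_init_container_")
}

func main() {
	fmt.Println(isInitContainer("kube_pod_init_container_resource_requests")) // true
	fmt.Println(isInitContainer("kube_pod_container_resource_requests"))      // false
}
```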
Halting work on this. Read here: #14682 (comment). We can probably follow the same path we planned, but gaining some extra experience by creating a number of separate metricsets for each kube-state-metrics group.
Create kube-state-metrics metricset for the kubernetes module.
Implements #12938
Fixes #13986
Fixes #11624
Related to #7058
THIS IS WIP
Pre-release review
This PR modifies the prometheus fetcher, adds a pre-processing filter, and uses more than 20 kubernetes objects.
To make it easier to review there will be at least 3 milestones, one of which is already here.
Milestones
Milestone 1
Milestone 1 is in review status
Prometheus

`ExtendedInfoMetric`: Based on the `InfoMetric` type at the prometheus package, it accepts a `Configuration` object that allows an option to include any label at the metric that hasn't been mapped explicitly.

The underlying `commonMetric` object has been modified: it holds a `configuration` item, and `MetricProcessingOptions` (an array of functions to apply to already retrieved elements) is included inside this configuration object. Existing client code does not need to be updated. As an alternative, creating a function signature that included a `configuration` and `metricProcessingOption` as parameters did not seem right, since the latter sounds like part of the former.

Pre-process filtering:
The proposal implemented here for this filtering is to use either a whitelist or a blacklist of items. Docs included in the PR show how it would work.
The main caveat would be that we are making up names for each group that can be filtered (in fact each is inferred from the `kube_<name>_` prefix from kube-state-metrics, and I would say that is intuitive). The alternative would be to use some pattern that matches the source metric name, so instead of filtering for `cronjob` the user would write `kube_cronjob_*`. To my eyes the alternative does not look good.

First 2 implementations of groups

`certificatessigningrequest` and `configmaps` have been added to get a feel of how the code will look. This is also fully working if you want to test.
Milestone 2 is not started yet
Milestone 3
Milestone 3 is not started yet