Bug: Shuffle sharding is not working as desired for ingesters #10531
Comments
Hi, can you provide some more information? Can you find the specific series which have disappeared? After that, you can try to find which ingesters hosted those series by using tools/grpcurl-query-ingesters and confirm whether the series belong to any ingester.
I will try to get more data on the series which disappeared, but we reduced the shard size a couple of times and every time we saw a reduction in the number of series on reads. Is there an easy way to verify which ingesters are queried for a certain query?
There are traces which include all ingesters involved in a query. See this article to configure tracing collection: https://grafana.com/docs/mimir/latest/configure/configure-tracing/
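For reference, a minimal sketch of what that setup can look like, assuming the Jaeger-client environment variables described in the linked doc (the agent host is a placeholder and the values are illustrative):

```
# Set on the querier (and any other component you want traced).
export JAEGER_AGENT_HOST=<jaeger-agent-host>   # placeholder for your Jaeger agent
export JAEGER_SAMPLER_TYPE=const               # sample every request while debugging
export JAEGER_SAMPLER_PARAM=1
```

Once traces are collected, the trace for a single query should list every ingester the querier fanned out to.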
Thanks for linking that. In addition to the above option, is there a metric which can tell us whether shuffle sharding is enabled on the read path?
I don't think so. If it is enabled on the write path, then it is enabled on the read path too. So, you can compare the number of ingesters with series for a tenant (…)
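As an illustration, one way to do that comparison is an instant query against the Prometheus instance that scrapes your Mimir pods; the metric and label names below are assumptions and may differ in your setup:

```
# Count how many ingesters report in-memory series created for the tenant.
# <metamonitoring-prometheus> and <tenant-id> are placeholders.
curl -sG 'http://<metamonitoring-prometheus>/api/v1/query' \
  --data-urlencode 'query=count(count by (pod) (cortex_ingester_memory_series_created_total{user="<tenant-id>"}))'
```

Running it before and after the shard size change should show the number of ingesters holding the tenant's series converging towards the new shard size.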
What is the bug?
Hi Team,
We are currently on Grafana Mimir 2.14.3, and we were trying to decrease the tenant shard size by following these steps in the runbook.
We noticed a discrepancy in the number of metrics as soon as we tried to reduce the tenant shard size for a tenant after disabling shuffle sharding on the read path (
-querier.shuffle-sharding-ingesters-enabled=false
). This flag was modified under the querier block and was verified by SSHing into the querier pods. Shuffle sharding is not working as desired.
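For context, a sketch of the two settings involved, shown in CLI-flag form under the assumption that both are configured as arguments (the per-tenant shard size is more commonly set via runtime overrides):

```
# Disable shuffle sharding on the read path, so queriers fan out to all ingesters:
-querier.shuffle-sharding-ingesters-enabled=false
# Ingester tenant shard size (41 -> 16 in this report):
-distributor.ingestion-tenant-shard-size=16
```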
How to reproduce it?
Compare the total metric count for the tenant at a given point in time, before and after modifying the shard size (a sketch of how to run the comparison is below):
count({__name__=~".+"})
Attaching screenshots for before/after modifying the shard size; the shard size was modified from 41 to 16 for the tenant.
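As a sketch, the comparison can be run as an instant query against the Mimir query frontend at a fixed timestamp; the endpoint, tenant ID, and timestamp are placeholders:

```
# Same query, evaluated at a fixed point in time, before and after the shard size change.
curl -sG 'http://<mimir-endpoint>/prometheus/api/v1/query' \
  -H 'X-Scope-OrgID: <tenant-id>' \
  --data-urlencode 'query=count({__name__=~".+"})' \
  --data-urlencode 'time=2025-01-01T00:00:00Z'
```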
What did you think would happen?
For a given point in time, the metric count shouldn't change, as the querier should be able to query all ingesters.
What was your environment?
Kubernetes
Any additional context to share?