This repository has been archived by the owner on May 16, 2023. It is now read-only.

[elasticsearch] readiness probe failing at startup when cluster status is yellow #215

Closed
naseemkullah opened this issue Jul 8, 2019 · 12 comments


@naseemkullah
Contributor

Hi,

I tried keeping more and more indices and eventually got a circuit breaker error, so I increased RAM and heap size.

I don't know if the above is related, but the Elasticsearch pods now fail the readiness probe. If I remove the readiness probe, things seem to work fine: all logs get collected and stored in Elasticsearch.

Any idea what might be going on?

@naseemkullah
Contributor Author

I reduced the readiness probe to just the http check, and it works:

        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - |
              #!/usr/bin/env bash -e
              # If the node is starting up wait for the cluster to be ready (request params: 'wait_for_status=green&timeout=1s' )
              # Once it has started only check that the node itself is responding
              START_FILE=/tmp/.es_start_file

              http () {
                  local path="${1}"
                  if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
                    BASIC_AUTH="-u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
                  else
                    BASIC_AUTH=''
                  fi
                  curl -XGET -s -k --fail ${BASIC_AUTH} http://127.0.0.1:9200${path}
              }

              echo 'Elasticsearch is already running, lets check the node is healthy'
              http "/"

So something must have been wrong with the other part, namely the branch taken when START_FILE does not exist:

else
                    echo 'Waiting for elasticsearch cluster to become ready (request params: "{{ .Values.clusterHealthCheckParams }}" )'
                    if http "/_cluster/health?{{ .Values.clusterHealthCheckParams }}" ; then
                        touch ${START_FILE}
                        exit 0
                    else
                        echo 'Cluster is not yet ready (request params: "{{ .Values.clusterHealthCheckParams }}" )'
                        exit 1
                    fi
                fi

@naseemkullah
Contributor Author

Would it be possible to make the readiness probe simpler, something like this?
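For illustration, a bare-bones HTTP-only probe along those lines might look like the following. This is only a sketch (port 9200 and plain HTTP are assumed, and it ignores the basic-auth case the current script handles), not the chart's actual template:

    readinessProbe:
      httpGet:
        path: /            # only checks that this node answers HTTP
        port: 9200         # assumed HTTP port
      initialDelaySeconds: 10
      periodSeconds: 10
      failureThreshold: 3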

@naseemkullah
Contributor Author

naseemkullah commented Jul 9, 2019

It seems to work now, not sure how... If all continues well I will close this tomorrow at the end of the day. But I'm still curious about the multi-line bash script health check when we could have something like the above link?

@naseemkullah naseemkullah changed the title [elasticsearch] readiness probe failing [elasticsearch] readiness probe failing at startup when cluster status is yellow Jul 9, 2019
@naseemkullah
Contributor Author

OK, I deleted a pod to test, and it is not starting up with a ready status. The issue is that if the cluster status is yellow, the pod never becomes ready; but if the pod is already ready and the cluster status then goes yellow, it remains ready.

IMHO this is not ideal: either the pod should be ready regardless of a yellow status, or it should be not ready whenever the status is yellow (I much prefer the first of the two options), be it at startup or during runtime.

@naseemkullah
Contributor Author

After a lot of messing around, I've come to see that setting clusterHealthCheckParams: "local=true" yields much better results. @Crazybus May I suggest either simplifying the readiness probe to something like https://github.com/helm/charts/blob/8d0cd3ceee4a8d761375ec45e112e01467fb4e4e/stable/elasticsearch/values.yaml#L237-L242 or at the very least having a default value of clusterHealthCheckParams: "local=true"?
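For reference, that override is just a values.yaml entry (sketch below; the key is the one this chart already exposes):

    # values.yaml override -- relax the probe to a node-local health check
    clusterHealthCheckParams: "local=true"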

@Crazybus
Contributor

Crazybus commented Jul 9, 2019

Waiting for the cluster to become green is very important. It's needed to make sure that Elasticsearch will remain available during upgrades to the statefulset and to the Kubernetes nodes themselves. This is combined with the default pod disruption budget to make sure that only 1 pod is ever unavailable at any time. This is the only safe way to do updates because you can't assume how many primaries and replicas each index has, or assume where each copy is.
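To make that concrete, the disruption budget being described is roughly the following (a sketch with assumed resource name and labels, not the chart's rendered output):

    apiVersion: policy/v1beta1          # policy/v1 on newer clusters
    kind: PodDisruptionBudget
    metadata:
      name: elasticsearch-master-pdb    # assumed name
    spec:
      maxUnavailable: 1                 # never allow more than one ES pod to be down voluntarily
      selector:
        matchLabels:
          app: elasticsearch-master     # assumed label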

I don't know if the above is related, but the Elasticsearch pods now fail the readiness probe. If I remove the readiness probe, things seem to work fine: all logs get collected and stored in Elasticsearch.

Could you go into more detail about what your setup looks like, and what problem you are having? If you made a change to the statefulset, only 1 pod should be updated at a time and all data should remain available during the entire restart. If more than 1 pod became unready then it sounds like a problem; if the other pods in the cluster were still ready then it sounds like everything was functioning as expected.

After a lot of messing around, I've come to see that setting clusterHealthCheckParams: "local=true" yields much better results.

I would really discourage you from doing this. The setting was made configurable so that the cluster-aware health check can be disabled for client-only nodes. Doing this on any other kind of (stateful) node is not recommended and will lead to downtime. If your cluster really can't survive losing a single pod then it sounds like there is another issue; changing this setting is only masking the actual problem.
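If you do run coordinating-only ("client") nodes from this chart, that is the only group where the override is intended to live, along these lines (a sketch; the roles layout is assumed to match the chart's values):

    # client-values.yaml -- coordinating-only nodes (sketch)
    roles:
      master: "false"
      data: "false"
      ingest: "false"
    clusterHealthCheckParams: "local=true"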

@naseemkullah
Contributor Author

naseemkullah commented Jul 9, 2019

Thanks @Crazybus for taking the time to shed light on this and explain.

As for my setup, it is trying to keep 30 days' worth of logs. Each index is roughly 150GB.
I've got the default 3-pod ES cluster, each pod with 1Ti of storage, 8GB of RAM (heap size 4GB), and 3 cores. I might need to increase one or more of these resources as I collect more open indices; I'm at about 14 indices now (14 days' worth of logs).

What I noticed with the health check as is: if I delete 1 pod, after many hours it is still not ready. To me that is not normal.

I will put the health check param clusterHealthCheckParams back to its default value of "wait_for_status=green&timeout=1s" and delete a pod again, to see whether it respawns and becomes ready at some point. Stay tuned!

@Crazybus
Contributor

What does the output of curl localhost:9200/_cluster/health?pretty look like over time while it is recovering? And also curl localhost:9200/_cat/indices. For the cluster health output I'm mostly curious to see whether it's just taking a long time for the shards to recover and replicate, or whether it really is stuck for some reason. For _cat/indices I want to see how many replicas are available and what the actual size of each of them is.
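If it helps to capture that over time, a loop like the following (plain shell, run from inside one of the pods) is enough; the 10-second interval is just an example:

    # poll the two outputs mentioned above (sketch)
    while true; do
      date
      curl -s 'localhost:9200/_cluster/health?pretty'
      curl -s 'localhost:9200/_cat/indices?v'
      sleep 10
    done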

There are a few reasons this can be slow; fixing why it is slow is the real issue here, because you always want to wait for the cluster to become green again during restarts. Some things you should be looking at:

  1. The default value for index.unassigned.node_left.delayed_timeout is 1m, which in my experience can sometimes lead to unnecessary resyncing of shards during normal restarts; setting this to something higher like 5 minutes can prevent this from happening (see the sketch after this list). https://www.elastic.co/guide/en/elasticsearch/reference/7.2/delayed-allocation.html explains this in more detail.
  2. The default settings for shard recovery are quite low to prioritize performance over recovery time. For large shards you may want to look at https://www.elastic.co/guide/en/elasticsearch/reference/current/recovery.html to optimize this.
  3. You might be running out of disk space on one node when shards are being reallocated. Based on the size of your nodes and the size of your indices, it is also possible that you are hitting the disk watermark level that stops shard allocation.
  4. General load of the cluster: the heap size you have for that amount of data sounds a bit off. If the cluster is really overloaded, that could be delaying the recovery time too.
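To make points 1 and 2 concrete, both settings can be adjusted over the REST API, roughly like this (a sketch; the values are examples, not recommendations):

    # 1. give a restarting node 5 minutes to come back before its shards are reallocated
    curl -XPUT 'localhost:9200/_all/_settings' -H 'Content-Type: application/json' -d '
    { "settings": { "index.unassigned.node_left.delayed_timeout": "5m" } }'

    # 2. raise the recovery throttle so large shards copy faster (example value)
    curl -XPUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '
    { "transient": { "indices.recovery.max_bytes_per_sec": "100mb" } }'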

https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-recovery.html can also give some useful information about what might be going wrong if things really aren't recovering at all.
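To see whether recovery is actually progressing rather than stuck, the recovery APIs give a per-shard view, e.g. (again just a sketch):

    # detailed recovery state for all indices (the API behind the docs page above)
    curl -s 'localhost:9200/_recovery?pretty&active_only=true'

    # or a compact table of only the recoveries currently running
    curl -s 'localhost:9200/_cat/recovery?v&active_only=true'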

@naseemkullah
Contributor Author

Hi @Crazybus , sorry for the delay in getting back to you.

Once again many thanks for such a thorough explanation.

I've deleted a pod (out of the 3 pods, there are always 2 with high CPU consumption; I deleted one of those).

elasticsearch-master-0                   833m         5403Mi
elasticsearch-master-1                   992m         5597Mi
elasticsearch-master-2                   33m          4849Mi

And it was ready quite quickly, just under 4 minutes.

elasticsearch-master-1                   1/1     Running           0          3m52s

I did the same with the pod using the least CPU of the 3, and it came back even quicker, just over a minute and a half:

elasticsearch-master-2                   1/1     Running           0          97s

I'm not sure how to reproduce what I was experiencing when I opened the issue. It might have to do with me having allocated insufficient RAM at the time, or it might not. In any case:

What does the output of curl localhost:9200/_cluster/health?pretty look like over time while it is recovering?

It was interesting to see the unassigned shards be assigned as I ran this command continuously.

As for indices:

[root@elasticsearch-master-1 elasticsearch]# curl localhost:9200/_cat/indices
green open logstash-2019.07.10       1 1 236854622    0 242.5gb 120.1gb
green open logstash-2019.07.02      1 1  26520081    0  39.2gb  19.6gb
green open logstash-2019.07.11       1 1 209584780    0 226.3gb 112.9gb
green open elastalert_status_past    1 1         0    0    566b    283b
green open logstash-2019.07.04       1 1 144564621    0 159.7gb  79.8gb
green open .kibana_2                 1 1         8    1   115kb  57.5kb
green open .kibana_1                 1 1         3    0  24.3kb  12.1kb
green open logstash-2019.07.03       1 1 125173755    0 130.1gb    65gb
green open .tasks                    1 1         1    0  12.7kb   6.3kb
green open logstash-1970.01.01       1 1     10418    0   6.2mb   3.4mb
green open logstash-2019.07.14       1 1 137594229  850 168.9gb  87.2gb
green open logstash-2019.07.07       1 1 137330913    0 149.1gb  74.5gb
green open elastalert_status_silence 1 1       443    0 149.7kb  74.8kb
green open logstash-2019.07.09       1 1 225163646 1187 227.4gb 115.2gb
green open logstash-2019.07.01       1 1  22732295    0  33.1gb  16.5gb
green open logstash-2099.01.01       1 1         2    0  39.2kb  19.6kb
green open .kibana_3                       11    2 228.4kb 114.2kb
green open elastalert_status         1 1       437    0 859.8kb 429.9kb
green open logstash-2019.07.06       1 1 160486714    0 171.7gb  85.8gb
green open elastalert_status_status  1 1     23462    0   6.2mb   3.1mb
green open elastalert_status_error   1 1        28    0 163.8kb  70.5kb
green open logstash-2019.06.30       1 1  24835289    0  38.1gb    19gb
green open logstash-2096.03.29       1         2    0  39.2kb  19.6kb
green open logstash-2019.07.12       1 1 203525463    0 215.8gb 107.5gb
green open logstash-2019.07.05      1 1 175578592    0 186.4gb  93.2gb
green open logstash-2019.07.08       1 1 165905396    0 174.4gb  87.2gb
green open .kibana_task_manager      1 1         2    0  25.5kb  12.7kb
green open logstash-2019.07.13       1 1 161275939    0 179.3gb  89.6gb

Could you please tell me at first glance why you may think the heap size is off?

General load of the cluster, the heap size you have for that amount of data sounds a bit off. If the cluster is really overloaded that could be delaying the recovery time too.

This issue can be closed as it works now!

@Crazybus
Contributor

Could you please tell me at first glance why you may think the heap size is off?

It was just a hunch based on your heap size and amount of data. I would need to see some monitoring data from the cluster to have a better idea. However, it really depends on so many factors and on how the cluster is being used and queried. If things are performing well then there is nothing to worry about. Here is an old blog post that goes into some more detail about what can influence this: https://www.elastic.co/blog/a-heap-of-trouble

@Nimesh36

Nimesh36 commented Mar 2, 2021

I was having the same issue. Earlier I had set a password shorter than 20 characters, but after setting a 20-character password the pod status turned ready within 100s.

@tranvansang

After a lot of messing around, I've come to see that setting clusterHealthCheckParams: "local=true" yields much better results. @Crazybus May I suggest either simplifying the readiness probe to something like https://github.com/helm/charts/blob/8d0cd3ceee4a8d761375ec45e112e01467fb4e4e/stable/elasticsearch/values.yaml#L237-L242 or at the very least having a default value clusterHealthCheckParams: "local=true" ?

Thanks for your hint. My cluster now goes green.

The default params don't work.

