status check went boom #826
Comments
I have the same problem in an OpenShift bare metal cluster.
I have the same issue on Azure Kubernetes 1.20.7. I had a similar issue in 6.4.0, and it persists in 8.3.2. Kubernetes External Secrets version: 8.3.2
This happened to me again at 18:36. The Vault cluster was just fine and I could log in. External Secrets simply stopped polling. This is affecting production systems. Please advise.
This happened again at 10:01 am.
I'm not following; this particular log line is to be expected even in a healthy cluster or setup. The operator watches an HTTP stream, and for some reason those streams are not being closed cleanly by the Kubernetes API, so as a precaution the operator restarts the watch every 60 seconds (or whatever is configured). Otherwise the operator could hang forever, not knowing whether nothing has happened or whether no new events will ever be published on the stream. For more background see #362.
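For anyone unfamiliar with that pattern, here is a minimal TypeScript sketch of the periodic watch restart described above. The names (`watchWithPeriodicRestart`, `openStream`, `WATCH_RESTART_INTERVAL_MS`) are illustrative placeholders, not the operator's actual code; it only shows why a recurring restart is expected behaviour in a healthy setup rather than an error.

```typescript
type WatchEvent = { type: string; object: unknown };

interface WatchStream {
  close(): void;
}

// Assumed 60-second default, configurable in practice.
const WATCH_RESTART_INTERVAL_MS = 60_000;

// `openStream` stands in for whatever call opens an HTTP watch against the Kubernetes API.
function watchWithPeriodicRestart(
  openStream: (onEvent: (event: WatchEvent) => void) => WatchStream,
  onEvent: (event: WatchEvent) => void,
): void {
  const stream = openStream(onEvent);

  // The API server can stop sending events without cleanly closing the stream, and the
  // client cannot tell "nothing changed" apart from "the stream is dead". Restarting on
  // a timer bounds how long the operator can hang on a stale stream.
  setTimeout(() => {
    stream.close();
    watchWithPeriodicRestart(openStream, onEvent); // open a fresh watch and repeat
  }, WATCH_RESTART_INTERVAL_MS);
}
```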
It's hard to tell what goes wrong from this, and it's odd that there's no error being logged, since this particular log line (status check went boom) does try to log an error 😕
Yes, this is the last log line before KES simply stops upserting. There are NO errors output for me to provide you. The poller simply stops polling. This is happening DAILY at this point. We need a fix or I'm going to have to stop using this engine.
I also observed this issue in bare metal Kubernetes clusters with Vault. Some secrets are no longer updated, with no error besides "status check went boom". This also happened once with a completely new ExternalSecret, which was not updated at all (no error message in the custom resource). Kubernetes version: v1.20.11
I wish I had an easy fix 😄
Can someone provide metrics for KES around the time when it stops polling? I'm particularly interested in cpu/mem/netstat and
I couldn't find an issue using the aws provider. I let it run for 3 days, with 1/10/100/500
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days. |
This issue was closed because it has been stalled for 30 days with no activity. |
Possibly same issue as #362.
Running v8.3.0, installed via Helm chart, on GKE (1.20.8-gke.900).
The cluster gets bootstrapped by an automated pipeline, and external-secrets is expected to retrieve a single secret from Google Secret Manager. It works about 50% of the time.
When it doesn't work, I'm seeing the following lines repeated in the pod logs:
Deleting the pod solves the problem, but I have not checked whether the problem recurs.
I never saw this problem before upgrading from 6.2.0 to 8.3.0.