Problem updating the external secret for s390x-knative #1452
Comments
Looks like KES just stopped again for some reason. The log output right before it gets stuck is different from normal operation:

{"level":30,"message_time":"2022-02-06T07:10:29.761Z","pid":18,"hostname":"kubernetes-external-secrets-5487fc7656-cwf77","payload":{},"msg":"starting poller for test-pods/s390x-cluster1"}
{"level":30,"message_time":"2022-02-06T07:10:39.751Z","pid":18,"hostname":"kubernetes-external-secrets-5487fc7656-cwf77","payload":{},"msg":"running poll on the secret test-pods/s390x-cluster1"}
{"level":30,"message_time":"2022-02-06T07:10:39.767Z","pid":18,"hostname":"kubernetes-external-secrets-5487fc7656-cwf77","payload":{},"msg":"fetching secret kubeconfig from GCP Secret for project s390x-knative with version latest"}
{"level":30,"message_time":"2022-02-06T07:10:39.767Z","pid":18,"hostname":"kubernetes-external-secrets-5487fc7656-cwf77","payload":{},"msg":"fetching secret ko-docker-repository from GCP Secret for project s390x-knative with version latest"}
{"level":30,"message_time":"2022-02-06T07:10:39.767Z","pid":18,"hostname":"kubernetes-external-secrets-5487fc7656-cwf77","payload":{},"msg":"fetching secret registry-certificate from GCP Secret for project s390x-knative with version latest"}
{"level":30,"message_time":"2022-02-06T07:10:39.767Z","pid":18,"hostname":"kubernetes-external-secrets-5487fc7656-cwf77","payload":{},"msg":"fetching secret knative01-ssh from GCP Secret for project s390x-knative with version latest"}
{"level":30,"message_time":"2022-02-06T07:10:39.767Z","pid":18,"hostname":"kubernetes-external-secrets-5487fc7656-cwf77","payload":{},"msg":"fetching secret docker-config from GCP Secret for project s390x-knative with version latest"}
{"level":30,"message_time":"2022-02-06T07:10:39.797Z","pid":18,"hostname":"kubernetes-external-secrets-5487fc7656-cwf77","payload":{},"msg":"getting secret test-pods/s390x-cluster1"}
{"level":30,"message_time":"2022-02-06T07:10:39.808Z","pid":18,"hostname":"kubernetes-external-secrets-5487fc7656-cwf77","payload":{},"msg":"updating secret test-pods/s390x-cluster1"}
{"level":30,"message_time":"2022-02-06T07:10:39.844Z","pid":18,"hostname":"kubernetes-external-secrets-5487fc7656-cwf77","payload":{},"msg":"stopping poller for test-pods/s390x-cluster1"}
{"level":30,"message_time":"2022-02-06T07:10:39.845Z","pid":18,"hostname":"kubernetes-external-secrets-5487fc7656-cwf77","payload":{},"msg":"starting poller for test-pods/s390x-cluster1"}
{"level":30,"message_time":"2022-02-06T07:10:43.284Z","pid":18,"hostname":"kubernetes-external-secrets-5487fc7656-cwf77","payload":{},"msg":"Stopping watch stream for namespace * due to event: END"}
{"level":30,"message_time":"2022-02-06T07:10:43.288Z","pid":18,"hostname":"kubernetes-external-secrets-5487fc7656-cwf77","payload":{},"msg":"stopping poller for test-pods/s390x-cluster1"}
{"level":30,"message_time":"2022-02-06T07:11:43.289Z","pid":18,"hostname":"kubernetes-external-secrets-5487fc7656-cwf77","payload":{},"msg":"No watch event for 60000 ms, restarting watcher for *"} It looks like KES in the service cluster is experiencing similar issues: {"level":30,"message_time":"2022-01-27T00:08:22.938Z","pid":18,"hostname":"kubernetes-external-secrets-85b98f665c-r26tz","payload":{},"msg":"starting poller for prow-monitoring/grafana"}
{"level":30,"message_time":"2022-01-27T00:08:32.927Z","pid":18,"hostname":"kubernetes-external-secrets-85b98f665c-r26tz","payload":{},"msg":"running poll on the secret prow-monitoring/grafana"}
{"level":30,"message_time":"2022-01-27T00:08:32.936Z","pid":18,"hostname":"kubernetes-external-secrets-85b98f665c-r26tz","payload":{},"msg":"fetching secret knative-prow__prow-monitoring__grafana from GCP Secret for project knative-tests with version latest"}
{"level":30,"message_time":"2022-01-27T00:08:32.959Z","pid":18,"hostname":"kubernetes-external-secrets-85b98f665c-r26tz","payload":{},"msg":"getting secret prow-monitoring/grafana"}
{"level":30,"message_time":"2022-01-27T00:08:32.967Z","pid":18,"hostname":"kubernetes-external-secrets-85b98f665c-r26tz","payload":{},"msg":"skipping secret prow-monitoring/grafana upsert, objects are the same"}
{"level":30,"message_time":"2022-01-27T00:08:32.976Z","pid":18,"hostname":"kubernetes-external-secrets-85b98f665c-r26tz","payload":{},"msg":"stopping poller for prow-monitoring/grafana"}
{"level":30,"message_time":"2022-01-27T00:08:32.976Z","pid":18,"hostname":"kubernetes-external-secrets-85b98f665c-r26tz","payload":{},"msg":"starting poller for prow-monitoring/grafana"}
{"level":30,"message_time":"2022-01-27T00:08:37.252Z","pid":18,"hostname":"kubernetes-external-secrets-85b98f665c-r26tz","payload":{},"msg":"Stopping watch stream for namespace * due to event: END"}
{"level":30,"message_time":"2022-01-27T00:08:37.255Z","pid":18,"hostname":"kubernetes-external-secrets-85b98f665c-r26tz","payload":{},"msg":"stopping poller for prow-monitoring/grafana"}
{"level":30,"message_time":"2022-01-27T00:09:37.255Z","pid":18,"hostname":"kubernetes-external-secrets-85b98f665c-r26tz","payload":{},"msg":"No watch event for 60000 ms, restarting watcher for *"} Looking at the upstream repo, this appears to be an issue others are experiencing as well and the only solution being recommended is migrating to an alternative tool. external-secrets/kubernetes-external-secrets#826 (comment) I'll restart the pods for now, but we may need to look into external-secrets/kubernetes-external-secrets#864. It is strange that we've only seen this affect the Knative Prow instance. |
Thanks for looking into the issue and restarting the pod. I will try to figure out what makes KES operations on this secret get stuck.
No meaningful findings toward a solution on my side. Until the migration to ESO is done, would it be possible to schedule the operator for a restart on a regular basis, during the time period when it is least used? Or to place a monitor on a metric that would tell us when it gets stuck? Any suggestion from your side to work around this problem for a while would be welcome!
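If a scheduled restart ends up being the stopgap, one minimal sketch is a CronJob that runs a rollout restart of the KES deployment during a quiet window. Every name below (the namespace, the schedule, the kes-restarter service account) is an assumption, and the service account would additionally need RBAC allowing it to patch deployments, which is not shown here.

```sh
# Sketch only: all names, the namespace, and the schedule are assumptions.
# The "kes-restarter" ServiceAccount needs RBAC to patch deployments (not shown).
kubectl apply -f - <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: kes-scheduled-restart
  namespace: kubernetes-external-secrets
spec:
  schedule: "0 3 * * *"   # assumed low-traffic window, daily at 03:00 UTC
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: kes-restarter
          restartPolicy: Never
          containers:
          - name: restart
            image: bitnami/kubectl:latest
            command:
            - kubectl
            - rollout
            - restart
            - deployment/kubernetes-external-secrets
EOF
```

A monitor on the managed Secret's age (or on whatever sync metrics the KES deployment exposes) would be less disruptive, but the CronJob is the simpler stopgap.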
According to the logs, KES is updating the test-pods/s390x-cluster1 secret.
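One way to verify this independently of the KES logs is to compare the in-cluster Secret against Secret Manager directly. A minimal sketch, assuming a data key of kubeconfig (inferred from the log lines above; the real key mapping in the ExternalSecret may differ):

```sh
# Sketch only: the data key "kubeconfig" is inferred from the KES log lines;
# check the ExternalSecret spec for the real key names.
kubectl -n test-pods get secret s390x-cluster1 \
  -o jsonpath='{.data.kubeconfig}' | base64 -d | sha256sum

gcloud secrets versions access latest \
  --secret=kubeconfig --project=s390x-knative | sha256sum

# Matching hashes mean KES propagated the latest value.
```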
I'll restart the pod, but I wouldn't expect a change in behavior this time since the KES pod is not stuck like it was before.
Oh, good to know. Thanks for restarting the pod. Based on the log (today's) you showed me, it looks like everything is working fine. 😉 But according to the log from the 25th on my side, the secret was not updated (maybe a transient failure? 🤔). After that I stopped updating the secret; I should have kept updating it until you looked into the pod. Sorry. Again, thanks a lot! Have a great start to the week!
I will close this issue because the implementation that makes CI work independently of whether KES works has been finished.
This is the same issue as reported in #1441
Since 2022-02-06 07:30:00 UTC, updates to the following external secrets have not been working.
https://github.com/GoogleCloudPlatform/oss-test-infra/blob/master/prow/knative/cluster/build/600-kubernetes_external_secrets.yaml
@cjwagner Could you look into this again?
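For anyone triaging this, a quick first check could be the ExternalSecret objects themselves, assuming KES still records sync information in their .status; the resource name below is inferred from the poller names in the logs.

```sh
# Sketch only: assumes KES records sync information (e.g. a last-sync
# timestamp) in the ExternalSecret .status; dump the object and look there.
kubectl -n test-pods get externalsecret s390x-cluster1 -o yaml

# Events on the affected object can also show failed upserts.
kubectl -n test-pods get events --field-selector involvedObject.name=s390x-cluster1
```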