-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
handle EFA security group eni leakage #534
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for this fix 👏
Can you add some more detail about how this leak is happening? What is creating the ENI(s) and not cleaning them up? What is special about the EFA stack vs. our existing leaked ENI cleanup? aws-k8s-tester/kubetest2/internal/deployers/eksapi/infra.go Lines 296 to 298 in 72ebc98
|
The EFA Security group is created in the unmanaged-nodegroup-efa.yaml which is a customized security group specified in the node launch template, so it will be created and deleted during unmanaged nodegroup stack creation and deletion. The leaked eni attached to the EFA Security group comes from node group ASG instance. aws-k8s-tester/kubetest2/internal/deployers/eksapi/templates/unmanaged-nodegroup-efa.yaml Lines 547 to 566 in 72ebc98
The reason why our old leaked eni clean up doesn't work as expected is because we are using customized security group in node launch template instead of the default cluster security group. It need to be deleted during the node group stack deletion after ASG tear down, but the old leaked eni clean up is handled during cluster cleaned up after the node group stack already teared down. |
The reason why we need customized security group instead of default cluster security group is because the EFA-enabled security group is required. According to this doc for EFA and MPI https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/efa-start.html#efa-start-security, we need security group with special inbound and outbound rules for EFA |
In commit 3, I use the default cluster security group for EFA by updating the EFA required ingress and egress in it instead of creating a new security group for EFA in node group stack. And the ENI leakage handler in infra.go we already have can handle eni leakage in the default security group for us. test logs
|
Issue #, if available:
To resolve above issue, we can use the default cluster security group and update the EFA required ingress and egress in it instead of creating a new security group for EFA in node group stack. And the eni leakage handler in infra.go we already have can handle eni leakage in the default security group for us.
Test in local below are logs in reverse order: