MAGNUM: Cluster autoscaler support for manually removed node from node group #7776
Labels
area/cluster-autoscaler
area/provider/magnum
kind/feature
Description:
When a node is manually removed from a node group, the Magnum cluster autoscaler can remove all nodes from that node group before it removes the correct node.
Version:
Cluster Autoscaler: v1.27.5, v1.29.5
Cloud Provider: Magnum
Current Behavior:
The Magnum cluster autoscaler fails to retrieve the ID of a manually removed node when it resizes the node group during cleanup.
Expected Behavior:
The Magnum autoscaler should be able to retrieve the ID of a manually deleted node, which in this case is a fake node with the prefix openstack:/// rather than fake:///.
https://github.com/kubernetes/autoscaler/blob/cluster-autoscaler-release-1.27/cluster-autoscaler/cloudprovider/magnum/magnum_manager_impl.go#L379-L386
Reproduction Steps:
Create a Magnum cluster (Kubernetes 1.27 or 1.29) with the cluster autoscaler (v1.27.5 or v1.29.5) running at log level 5.
Add a node group to it.
Add a workload that scales the node group to, say, 4 nodes.
Manually remove the first node in the node group (with openstack server delete ...).
The autoscaler then starts cleaning up random nodes, because it resizes the node group without passing the IDs of the nodes to remove.
In a pessimistic case like the one above, this may remove all nodes in the node group.
Describe the solution you'd like:
Add support for fake nodes with the prefix openstack:/// here:
https://github.com/kubernetes/autoscaler/blob/cluster-autoscaler-release-1.27/cluster-autoscaler/cloudprovider/magnum/magnum_manager_impl.go#L379-L386
For example, such nodes could be handled by an additional utility function, parseFakeProviderIDDeletedNode.
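A minimal sketch of what a helper named parseFakeProviderIDDeletedNode could look like. It assumes that the provider ID of a manually deleted node has the form openstack:///<server-id>. The signature, return values, error handling and the sample IDs are assumptions made for illustration; this is not the snippet proposed in the issue and not the upstream implementation.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// openstackProviderPrefix is the provider ID prefix named in the issue.
const openstackProviderPrefix = "openstack:///"

// parseFakeProviderIDDeletedNode is a hypothetical helper: it returns the
// part of the provider ID that follows "openstack:///", assumed here to be
// the server ID. It returns an error for any other prefix or an empty ID.
func parseFakeProviderIDDeletedNode(providerID string) (string, error) {
	if !strings.HasPrefix(providerID, openstackProviderPrefix) {
		return "", fmt.Errorf("provider ID %q does not have prefix %q",
			providerID, openstackProviderPrefix)
	}
	id := strings.TrimPrefix(providerID, openstackProviderPrefix)
	if id == "" {
		return "", errors.New("provider ID has no server ID after the prefix")
	}
	return id, nil
}

func main() {
	// Placeholder provider IDs, not taken from a real cluster.
	samples := []string{"openstack:///example-server-id", "fake:///placeholder"}
	for _, providerID := range samples {
		id, err := parseFakeProviderIDDeletedNode(providerID)
		if err != nil {
			fmt.Println("skipping:", err)
			continue
		}
		fmt.Println("server ID to remove:", id)
	}
}
```

Returning an error instead of an empty string would let the caller refuse to issue a resize when an ID cannot be resolved, rather than resizing without any node IDs.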
Additional context:
After looking at the code in the main branch, I see that this case will also occur there.