23.10.22

When capi controller manager net/http fails to clean up machines due to a netsplit... whose fault is it?

I recently found out (well, I think) that when the CAPI controller is netsplit off from a worker VM, it might fail to reconcile the vSphere resources associated with the workload cluster, which results in zombie Machine resources that continue floating around in the cluster.

I think this is a potential bug, because it seems to me that capi-controller, even if it can't access the control plane of a workload cluster, should be deleting or updating stale Machines that are still reported as "Running"...
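To make the "stale but still Running" idea concrete, here is a rough sketch of the kind of check I have in mind. It is not how capi-controller works today and it does not talk to a real cluster: the machine records are plain dicts, and the `last_seen` field, the `max_age` threshold, and the machine names are all hypothetical, made up just for illustration.

```python
from datetime import datetime, timedelta, timezone


def find_stale_machines(machines, now, max_age=timedelta(minutes=10)):
    """Return names of machines that report "Running" but have not been
    successfully contacted within max_age.

    `last_seen` is a hypothetical field, not a real Machine status field.
    """
    stale = []
    for m in machines:
        if m["phase"] == "Running" and now - m["last_seen"] > max_age:
            stale.append(m["name"])
    return stale


# Made-up sample data: one healthy machine, one zombie, one already deleting.
now = datetime(2022, 1, 1, 12, 0, tzinfo=timezone.utc)
machines = [
    {"name": "wl-md-0-abc", "phase": "Running", "last_seen": now - timedelta(minutes=2)},
    {"name": "wl-md-0-def", "phase": "Running", "last_seen": now - timedelta(hours=3)},
    {"name": "wl-md-0-ghi", "phase": "Deleting", "last_seen": now - timedelta(hours=3)},
]

print(find_stale_machines(machines, now))  # ['wl-md-0-def']
```

Only the machine that claims to be "Running" and has been unreachable for longer than the threshold gets flagged; whether such a machine should then be deleted or just marked with a different state is exactly the open question.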

Filed the issue here: https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/issues/1660

