CFE Failover - moving Alias IP ranges #26

leeanntsab · 2024-12-12T19:16:45Z

Describe the bug

This is not necessarily a bug, but rather an issue a customer is experiencing in GCP.

The customer deployed two BIG-IP instances in GCP using a GDM template. CFE is deployed using tags/labels to identify the resources for failover. The customer recently experienced a network outage in GCP, which triggered CFE to fail over from peer A to peer B, but the failover did not complete. In the GCP console, the customer observed that the alias IP ranges were removed successfully from the failed instance but were never added to the new active instance. In the GCP logs, the customer saw "invalid fingerprint" errors during the failed failover event. Per Google support, this means that there are multiple requests to the resource and/or the resource is being used by another process. In addition, we observed this error in restnoded: "IP x.x.x.x is already being used by another resource".
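For context on the fingerprint error: the Compute Engine API documents a `fingerprint` field on each network interface, and an update is rejected when the fingerprint sent with it is out of date. The following is only a minimal sketch of how a caller could re-read the fingerprint and retry. It is not CFE's code. `read_interface`, `write_interface`, and `StaleFingerprintError` are hypothetical stand-ins for whatever client calls and error type are actually in use.

```python
import time
from typing import Callable, Dict, List


class StaleFingerprintError(Exception):
    """Hypothetical stand-in for the provider's 'invalid fingerprint' error."""


def set_alias_ranges(
    read_interface: Callable[[], Dict],
    write_interface: Callable[[Dict], None],
    alias_ranges: List[str],
    max_attempts: int = 5,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Write alias ranges, re-reading the fingerprint before every attempt.

    Returns the number of attempts that were needed.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        attempt += 1
        # Always fetch the interface again so the fingerprint is current.
        current = read_interface()
        body = {
            "fingerprint": current["fingerprint"],
            "aliasIpRanges": [{"ipCidrRange": r} for r in alias_ranges],
        }
        try:
            write_interface(body)
            return attempt
        except StaleFingerprintError:
            # Another writer changed the interface between read and write.
            if attempt >= max_attempts:
                raise
            sleep(delay_seconds)
```

The point of the sketch is that the fingerprint is read inside the loop, so a retry never reuses a value that a concurrent request has already invalidated.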

Expected behavior

The expected behavior is that the CFE process moves the alias IP ranges from the failed instance to the new active instance by first removing the entries from the failed instance and then adding them to the new active instance.
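To make the expected end state concrete, here is a small sketch that computes what each instance's alias IP list should look like after a successful move. The function name, its parameters, and the example addresses are illustrative assumptions and are not taken from CFE.

```python
from typing import List, Tuple


def plan_alias_move(
    failed_ranges: List[str],
    active_ranges: List[str],
    ranges_to_move: List[str],
) -> Tuple[List[str], List[str]]:
    """Return (failed_after, active_after) for a completed failover."""
    moving = set(ranges_to_move)
    # Step 1: the failed instance gives up every range being moved.
    failed_after = [r for r in failed_ranges if r not in moving]
    # Step 2: the new active instance gains them, without duplicates.
    active_after = list(active_ranges) + [
        r for r in ranges_to_move if r not in active_ranges
    ]
    return failed_after, active_after


failed_after, active_after = plan_alias_move(
    failed_ranges=["10.0.1.10/32", "10.0.1.50/32"],
    active_ranges=[],
    ranges_to_move=["10.0.1.10/32"],
)
```

In the behavior reported here, only the first step takes effect: the failed instance ends up without the range, and the new active instance never receives it.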

Current behavior

The second half of this process does not happen. The customer observes that the alias IP entries are removed from the failed instance but are never added to the new active instance, so traffic never fails over to the peer.

Possible solution

While troubleshooting with the customer and Matt (internal), it was found that the issue appears to be a timing mechanism inside GCP that governs when an alias IP can be moved. To allow for this extra time, the customer suggests adding a sleep or delay timer of about 30 seconds to the CFE script before it adds the alias IPs to the new peer.
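The suggested delay could be sketched roughly as follows. This is an illustration of the idea only, not a patch against CFE. `remove_from_old`, `add_to_new`, and `AddressInUseError` are hypothetical placeholders, and the 30-second default simply mirrors the value proposed above.

```python
import time
from typing import Callable, List


class AddressInUseError(Exception):
    """Hypothetical stand-in for 'IP ... is already being used by another resource'."""


def move_with_settle_delay(
    remove_from_old: Callable[[List[str]], None],
    add_to_new: Callable[[List[str]], None],
    ranges: List[str],
    settle_seconds: float = 30.0,
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Remove ranges from the old peer, wait, then add them to the new peer.

    Returns the number of add attempts that were needed.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    remove_from_old(ranges)
    attempt = 0
    while True:
        attempt += 1
        # Give the platform time to release the ranges before each add.
        sleep(settle_seconds)
        try:
            add_to_new(ranges)
            return attempt
        except AddressInUseError:
            if attempt >= max_attempts:
                raise
```

A fixed sleep lengthens every failover by the full delay, so a shorter `settle_seconds` combined with a few attempts may be a reasonable trade-off. Whether that suits a given deployment is an open question.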

Steps to reproduce

You can reproduce this by configuring two BIG-IP instances as an active/standby pair with CFE in GCP, using any of the GDM templates. Assign one or more alias IP ranges to the instances, then force the current active instance to standby. You should then observe the behavior described above.

mikeshimkus · 2024-12-31T18:32:53Z

@leeanntsab Please open a case with F5 if this is still an issue (if you haven't already).
