Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ip route conflicts and race conditions when resolving best routes in clientNetwork.getBestRouteFromStatuses() #2066

Open
nazarewk opened this issue May 28, 2024 · 4 comments
Labels
bug Something isn't working cloud routes

Comments

@nazarewk
Copy link
Contributor

nazarewk commented May 28, 2024

Describe the problem

I have noticed the client-side route selection feature so during today's Netbird workshop for our team I have enabled both routes clashing on 10.4/16 and later today teammates started reporting problems with accessing the primary system so I switched the secondary off and dug into the code.

First thing I have noticed using ip route metric was superseded by:

The method is scoped to *clientNetwork, which I guess does not consider a possibility of another NetworkID providing the same CIDR (10.4/16 in our case) resulting in a conflict/race condition on the created ip route entry.

To Reproduce

Steps to reproduce the behavior:

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

Expected behavior

Either of:

  • only the best/highest metric route across all networks is selected
  • create the ip route with the metric argument (seems to be netlink.Route.Priority) which would properly prioritize routes on the ip route level even in case of a clash
    • watch our not to deregister the wrong route during updates

Are you using NetBird Cloud?

Yes

NetBird version

0.27.7

NetBird status -d output:

If applicable, add the `netbird status -d' command output.

Screenshots

If applicable, add screenshots to help explain your problem.

Additional context

Add any other context about the problem here.

@nazarewk
Copy link
Contributor Author

nazarewk commented May 28, 2024

I've noticed that I was connected to the primary network and the ip route entry disappeared completely after i disabled the secondary network in the web ui.

I had to do netbird down && netbird up for the client to re-register the primary route.

@nazarewk
Copy link
Contributor Author

nazarewk commented May 28, 2024

a single clientNetwork seems to represent a very specific <network-id>-<network-range> grouping:

  1. clientNetworks map[route.HAUniqueID]*clientNetwork
  2. func GetHAUniqueID(input *Route) HAUniqueID {
    return HAUniqueID(string(input.NetID) + "-" + input.Network.String())
    }

making it impossible to know about other input.NetID providing the same input.Network.String() on completely different site/datacenter

@nazarewk
Copy link
Contributor Author

theoretically using input.Network.String() instead of HAUniqueID here could help with my issue:

for _, newRoute := range newRoutes {
haID := route.GetHAUniqueID(newRoute)
if !ownNetworkIDs[haID] {
if !isPrefixSupported(newRoute.Network) {
continue
}
newClientRoutesIDMap[haID] = append(newClientRoutesIDMap[haID], newRoute)
}
}

but I am not sure about the wider consequences

@nazarewk
Copy link
Contributor Author

nazarewk commented May 28, 2024

I've noticed that I was connected to the primary network and the ip route entry disappeared completely after i disabled the secondary network in the web ui.

I had to do netbird down && netbird up for the client to re-register the primary route.

@mlsmaycon noted that this part of the issue (removing routes still in use) will be fixed by #1943 , since it is a big refactor of (previously) linked code in this issue it will be worth analyzing and revisiting again after #1943 is merged

@nazarewk nazarewk changed the title Possible ip route conflicts after switching ip route metric to clientNetwork.getBestRouteFromStatuses() ip route conflicts and race conditions when resolving best routes in clientNetwork.getBestRouteFromStatuses() May 28, 2024
@bcmmbaga bcmmbaga added bug Something isn't working routes cloud and removed triage-needed labels May 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cloud routes
Projects
None yet
Development

No branches or pull requests

2 participants