mirror of
https://github.com/outbackdingo/firezone.git
synced 2026-01-27 10:18:54 +00:00
In #6032, we attempted to fix routing loops for Windows and did so successfully for UDP packets. For TCP sockets, we believed that binding the socket to an interface would be enough to prevent routing loops. This assumption is wrong [0]:

> On Windows, a call to bind() affects card selection only incoming traffic, not outgoing traffic.
>
> Thus, on a client running in a multi-homed system (i.e., more than one interface card), it's the network stack that selects the card to use, and it makes its selection based solely on the destination IP, which in turn is based on the routing table. A call to bind() will not affect the choice of the card in any way.

On most of our testing machines, this problem didn't surface. It turns out, however, that on some machines, especially those with WiFi cards, the routes added on the system conflict with each other. In particular, with the Internet resource active, we also add a catch-all route that we _want_ to have the highest priority, i.e. Windows SHOULD send all traffic to our TUN device, except for traffic that we generate ourselves, such as TCP connections to the portal or UDP packets sent to gateways, relays or DNS servers.

It appears that on some systems, mostly those with Ethernet adapters, Windows picks the "correct" interface for our socket and sends traffic via that interface, but on other systems it doesn't.

TCP sockets are only used for the WebSocket connection to the portal. Without that connection, Firezone completely stops working because we can't send any control messages.

To reliably fix this issue, we need to add a dedicated route for the target IP of each TCP socket that is more specific than the Internet resource route (`0.0.0.0/0`) but otherwise identical. We do this as part of creating a new TCP socket. This route is for the _default_ interface and thus doesn't get removed automatically when Firezone exits. We implement a RAII guard that attempts to drop the route on a best-effort basis.
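
To illustrate why a more specific route wins over the catch-all and how a guard can clean it up, here is a minimal sketch against an in-memory routing table. It does not touch the Windows routing APIs and is not the code from this change: `Route`, `RoutingTable`, `RouteGuard`, the interface names and the `/32` prefix length are all assumptions made for illustration.

```rust
use std::cell::RefCell;
use std::net::Ipv4Addr;
use std::rc::Rc;

/// Hypothetical stand-in for one entry of the OS routing table.
#[derive(Debug, Clone, PartialEq)]
struct Route {
    dest: Ipv4Addr,
    prefix_len: u8, // assumed to be in 0..=32
    interface: &'static str,
}

/// Hypothetical in-memory routing table; the real one lives in the OS.
#[derive(Default)]
struct RoutingTable {
    routes: Vec<Route>,
}

impl RoutingTable {
    /// Longest-prefix match: among all matching routes, the most specific wins.
    fn lookup(&self, ip: Ipv4Addr) -> Option<&Route> {
        self.routes
            .iter()
            .filter(|r| matches(r, ip))
            .max_by_key(|r| r.prefix_len)
    }
}

/// Returns true if `ip` falls inside the network described by `route`.
fn matches(route: &Route, ip: Ipv4Addr) -> bool {
    let mask = if route.prefix_len == 0 {
        0 // 0.0.0.0/0 matches every address
    } else {
        u32::MAX << (32 - u32::from(route.prefix_len))
    };
    (u32::from(ip) & mask) == (u32::from(route.dest) & mask)
}

/// RAII guard: adds a host route on creation and removes it again on drop.
struct RouteGuard {
    table: Rc<RefCell<RoutingTable>>,
    route: Route,
}

impl RouteGuard {
    /// Adds a /32 route for `dest` via `interface` (assumed: the default interface).
    fn add(table: Rc<RefCell<RoutingTable>>, dest: Ipv4Addr, interface: &'static str) -> Self {
        let route = Route { dest, prefix_len: 32, interface };
        table.borrow_mut().routes.push(route.clone());
        RouteGuard { table, route }
    }
}

impl Drop for RouteGuard {
    fn drop(&mut self) {
        // Best effort: if the table is unavailable, skip the cleanup
        // instead of panicking inside `drop`.
        if let Ok(mut table) = self.table.try_borrow_mut() {
            table.routes.retain(|r| r != &self.route);
        }
    }
}
```

While a `RouteGuard` for the portal IP is alive, `lookup` for that IP resolves to the guard's interface and every other destination still resolves to the catch-all route. Once the guard is dropped, the portal IP falls back to the catch-all as well.
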
Despite this RAII guard, the route can linger if Firezone is forced to exit or exits in an otherwise unclean way. To avoid lingering routes, we always delete all routing table entries matching the IP of the portal just before we add a new one.

Fixes: #6591.

[0]: https://forums.codeguru.com/showthread.php?487139-Socket-binding-with-routing-table&s=a31637836c1bf7f0bc71c1955e47bdf9&p=1891235#post1891235

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Foo Bar <foo@bar.com>
Co-authored-by: conectado <gabrielalejandro7@gmail.com>