firezone

mirror of https://github.com/outbackdingo/firezone.git synced 2026-01-27 18:18:55 +00:00

Author	SHA1	Message	Date
Thomas Eizinger	b7dc897eea	refactor(rust): introduce `libs/` directory (#10964 ) The current Rust workspace isn't as consistent as it could be. To make navigation a bit easier, we move a few crates around. Generally, we follow the idea that entry-points should be at the top-level. `rust/` now looks like this (directories only): ``` . ├── cli # Firezone CLI ├── client-ffi # Entry point for Apple & Android ├── gateway # Gateway ├── gui-client # GUI client ├── headless-client # Headless client ├── libs # Library crates ├── relay # Relay ├── target # Compile artifacts ├── tests # Crates for testing └── tools # Local tools ``` To further enforce this structure, we also drop the `firezone-` prefix from all crates that are not top-level binary crates.	2025-11-25 10:59:11 +00:00
Jamil	0ccd4bbf24	feat(ci): enable relay eBPF offloading (#10160 ) In CI, eBPF in driver mode actually functions just fine with no changes to our existing tests, given we apply a few workarounds and bugfixes: - The interface learning mechanism had two flaws: (1) it only learned per-CPU, which meant the risk for a missing entry grew as the core count of the relay host grew, and (2) it did not filter for unicast IPs, so it picked up broadcast and link-local addresses, causing cross-relay paths to fail occasionally - The `relay-relay` candidate where the two relays are the same relay causes packet drops / loops in the Docker bridge setup, and possibly in GCP too. I'm not sure this is a valid path that solves a real connectivity issue in the wild. I can understand relay-relay paths where two relays are different hosts, and the client and gateway both talk over their TURN channel to each other (i.e. WireGuard is blocked in each of their networks), but I can't think of an advantage for a relay-relay candidate where the traffic simply hairpins (or is dropped) off the nearest switch. This has been now detected with a new `PacketLoop` error that triggers whenever source_ip == dest_ip. - The relays in CI need a common next-hop to talk to for the MAC address swapping to work. A simple router service is added which functions as a basic L3 router (no NAT) that allows the MAC swapping to work. - The `veth` driver has some peculiar requirements to allow it to function with XDP_TX. If you send a packet out of one interface of a veth pair with XDP_TX, you need to either make sure both interfaces have GRO enabled, or you need to attach a dummy XDP program that simply does XDP_PASS to the other interface so that the sk_buff is allocated before going up the stack to the Docker bridge. The GRO method was unreliable and didn't work in our case, causing massive packet delays and unpredictable bursts that prevented ICE from working, so we use the XDP_PASS method instead. 
A simple docker image is built and lives at https://github.com/firezone/xdp-pass to handle this. Related: #10138 Related: #10260	2025-08-31 23:37:03 +00:00
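The two checks described in #10160 (learning only unicast addresses, and rejecting packets whose source and destination IP match) can be sketched in plain user-space Rust. This is a minimal illustration, not the relay's eBPF code: only the `PacketLoop` name comes from the commit message, while `RouteError`, `is_learnable_unicast` and `check_no_loop` are hypothetical names.

```rust
use std::net::Ipv4Addr;

/// Hypothetical error type; only the `PacketLoop` name appears in the commit message.
#[derive(Debug, PartialEq, Eq)]
pub enum RouteError {
    /// Source and destination IP are identical, so the packet would hairpin.
    PacketLoop,
}

/// Returns `true` for addresses that are plausible unicast targets.
///
/// Assumption: "unicast" here means not unspecified, not limited broadcast,
/// not multicast and not link-local. Subnet-directed broadcasts are NOT
/// detected because that would require knowing the prefix length.
pub fn is_learnable_unicast(ip: Ipv4Addr) -> bool {
    !(ip.is_unspecified() || ip.is_broadcast() || ip.is_multicast() || ip.is_link_local())
}

/// Rejects packets that would be sent back to the address they came from.
pub fn check_no_loop(src: Ipv4Addr, dst: Ipv4Addr) -> Result<(), RouteError> {
    if src == dst {
        return Err(RouteError::PacketLoop);
    }
    Ok(())
}
```

In a real XDP program the equivalent checks would operate on raw header bytes rather than on `Ipv4Addr`.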
Thomas Eizinger	c70c88c856	build(deps): upgrade to opentelemetry 0.30 (#10239 )	2025-08-21 22:47:39 +00:00
Thomas Eizinger	70a930e45d	chore(relay): use existing `ebpf` module import (#10202 )	2025-08-17 23:45:36 +00:00
Jamil	b07fa341cf	feat(relay): XDP driver (native) mode for gVNIC (#10177 ) This updates our eBPF module to use DRV_MODE for less CPU overhead and better performance for all same-stack TURN relaying. Notably, gVNIC does not seem to support the `bpf_xdp_adjust_head` helper, so unfortunately we need to extend / shrink the packet tail and move the payload instead. Comprehensive benchmarks have not been performed, but early results show that we can saturate about 1 Gbps per E2 core on GCP: ``` [SUM] 0.00-30.04 sec 3.16 GBytes 904 Mbits/sec 12088 sender [SUM] 0.00-30.00 sec 3.12 GBytes 894 Mbits/sec receiver ``` This is with 64 TCP streams. More streams will better utilize all available RX queues, and lead to better performance. Related: #10138 Fixes: #8633	2025-08-17 15:04:19 +00:00
Thomas Eizinger	d6805d7e48	chore(rust): bump to Rust 1.88 (#9714 ) Rust 1.88 has been released and brings with it a quite exciting feature: let-chains! It allows us to mix-and-match `if` and `let` expressions, therefore often reducing the "right-drift" of the relevant code, making it easier to read. Rust 1.88 also comes with a new clippy lint that warns when creating a mutable reference from an immutable pointer. Attempting to fix this revealed that this is exactly what we are doing in the eBPF kernel. Unfortunately, it doesn't seem to be possible to design this in a way that is both accepted by the borrow-checker AND by the eBPF verifier. Hence, we simply make the function `unsafe` and document for the programmer what needs to be upheld.	2025-07-12 06:42:50 +00:00
Thomas Eizinger	cee4be9e24	build(deps): bump Rust dependencies (#9192 ) A mass upgrade of our Rust dependencies. Most crucially, these remove several duplicated dependencies from our tree. - The Tauri plugins have been stuck on `windows v0.60` for a while. They are now updated to use `windows v0.61` which is what the rest of our dependency tree uses. - By bumping `axum`, we can also bump `reqwest`, which removes a few more duplicated dependencies. - By removing `env_logger`, we can get rid of a few dependencies.	2025-05-22 13:15:01 +00:00
Thomas Eizinger	37529803ce	build(rust): bump otel ecosystem crates to 0.29 (#9029 )	2025-05-05 12:33:07 +00:00
Thomas Eizinger	0079f76ebd	fix(eBPF): store allocation port-range in big-endian (#8804 ) Any communication between user-space and the eBPF kernel happens via maps. The keys and values in these maps are serialised to bytes, meaning the endianness of how these values are encoded matters! When debugging why the eBPF kernels were not performing as much as we thought they would, I noticed that only very small packets were getting relayed. In particular, only packets encoded as channel-data packets were getting unwrapped correctly. The reverse didn't happen at all. Turning the log-level up to TRACE did reveal that we do in fact see these packets but they don't get handled. Here is the relevant section that handles these packets: `74ccf8e0b2/rust/relay/ebpf-turn-router/src/main.rs (L127-L151)` We can see the `trace!` log in the logs and we know that it should be handled by the first `if`. But for some reason it doesn't. x86 systems like the machines running in GCP are typically little-endian. Network-byte ordering is big-endian. My current theory is that we are comparing the port range with the wrong endianness and therefore, this branch never gets hit, causing the relaying to be offloaded to user space. By storing the fields within `Config` in byte-arrays, we can take explicit control over which endianness is used to store these fields.	2025-04-18 04:51:40 +00:00
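The endianness theory from #8804 can be illustrated with a small user-space sketch. The `Config` below is a hypothetical stand-in for the struct shared with the eBPF program: its field names, the `contains` helper and `misread_port` are assumptions made for illustration. It stores the port range as big-endian byte arrays and shows what happens when wire bytes are read with the wrong byte order.

```rust
/// Hypothetical stand-in for the `Config` shared between user space and the
/// eBPF program. The ports are kept as byte arrays so the byte order is explicit.
#[derive(Clone, Copy, Debug)]
pub struct Config {
    lowest_allocation_port: [u8; 2],
    highest_allocation_port: [u8; 2],
}

impl Config {
    /// Stores both bounds in big-endian (network) byte order.
    pub fn new(lowest: u16, highest: u16) -> Self {
        Self {
            lowest_allocation_port: lowest.to_be_bytes(),
            highest_allocation_port: highest.to_be_bytes(),
        }
    }

    /// Checks whether a destination port, given as raw wire bytes, is in range.
    pub fn contains(&self, dst_port_wire: [u8; 2]) -> bool {
        let port = u16::from_be_bytes(dst_port_wire);
        let lo = u16::from_be_bytes(self.lowest_allocation_port);
        let hi = u16::from_be_bytes(self.highest_allocation_port);
        (lo..=hi).contains(&port)
    }
}

/// Demonstrates the suspected bug: interpreting network-order bytes as
/// little-endian yields a different port number.
pub fn misread_port(wire: [u8; 2]) -> u16 {
    u16::from_le_bytes(wire)
}
```

For example, port 50000 is `[0xC3, 0x50]` on the wire; read as little-endian it becomes 0x50C3 (20675), which falls outside a 49152 to 65535 range, so a range check on the misread value would never match.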
Thomas Eizinger	a942dee723	chore(eBPF): don't count channel data header as relayed bytes (#8590 )	2025-04-01 04:31:06 +00:00
Thomas Eizinger	1d0ecf94b8	feat(relay): record metrics about bytes relayed via eBPF (#8556 ) Perf events are designed to be an extremely efficient way of transferring data from an eBPF kernel to the user-space program. To monitor how much traffic we are actually relaying via eBPF, we introduce a dedicated `STATS` map that is a `PerfEventArray`. The events from that array are read asynchronously in user-space and fed into our OTEL metrics. They will show up in our Google Cloud metrics as `data_relayed_ebpf_bytes`. We already have a metric for the total relayed bytes. That counter is renamed to `data_relayed_userspace_bytes` so we can clearly differentiate the two.	2025-03-31 21:57:31 +00:00
Thomas Eizinger	a4851ee76f	feat(relay): implement the reverse IPv4 eBPF code path (#8544 ) This PR implements the "reverse path" of handling TURN traffic, i.e. UDP datagrams that arrive on an allocation port and need to be wrapped in a channel-data message to be sent to the TURN client. In order to achieve that, I had to rewrite most of the TURN code to not use the `etherparse` crate. I couldn't quite figure out the details but the eBPF verifier rejected my code in mysterious ways that I didn't understand. Commenting out random code-paths seemed to make it happy but all code-paths combined caused an error. Eventually, I decided that we simply have to use less abstractions to implement the same logic. All the "parsing" code is now using types inspired by `network-types`. The only modification here is that we use byte-arrays within our structs in order to directly receive them in big-endian ordering. `network-types` uses `u16`s and `u32`s which get interpreted as little-endian on x86. Instead of converting around between the endianness, constructing those values where we want them using the right endianness is deemed much simpler. I opened an issue with upstream which - if accepted - will allow us to remove our own structs and instead depend on upstream again. I also had to aggressively add `#[inline(always)]` to several functions, otherwise the compiler would not optimise away our function calls, causing the linker and / or eBPF verifier to fail. This PR also fixes numerous bugs that I've found in the already existing eBPF code. The number of bugs makes me question how this has been working so far at all! - We did not swap the Ethernet source and destination MAC address when re-routing the packet. The integration-test didn't catch this because it only operates on the loopback interface. Further testing on staging should allow us to confirm that this is indeed working now. - The UDP checksum update did not incorporate the new src and dst port. 
The integration-test didn't catch that because it has UDP checksumming disabled. We need to have that disabled in the test because UDP checksumming is typically offloaded to the NIC and packets on the loopback interface never leave the device. Related: https://github.com/vadorovsky/network-types/issues/32. Related: #7518	2025-03-31 12:32:35 +00:00
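The two bug fixes listed in #8544 (swapping the Ethernet MAC addresses and folding the new ports into the UDP checksum) can be sketched as follows. This is a user-space illustration under stated assumptions, not the code from `ebpf-turn-router`: all function names are hypothetical, and the incremental update follows the generic RFC 1624 formula `HC' = ~(~HC + ~m + m')`, which may differ from how the relay actually computes it.

```rust
/// Folds a 32-bit accumulator into a 16-bit ones' complement sum.
fn fold(mut sum: u32) -> u16 {
    while sum > 0xFFFF {
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    sum as u16
}

/// Full ones' complement checksum over 16-bit words (reference implementation).
pub fn checksum(words: &[u16]) -> u16 {
    !fold(words.iter().map(|w| u32::from(*w)).sum::<u32>())
}

/// Incrementally updates a checksum after one 16-bit word changed
/// (RFC 1624, equation 3).
pub fn update_checksum(old_check: u16, old_word: u16, new_word: u16) -> u16 {
    let sum = u32::from(!old_check) + u32::from(!old_word) + u32::from(new_word);
    !fold(sum)
}

/// Swaps destination (bytes 0..6) and source (bytes 6..12) MAC addresses
/// of an Ethernet header in place; the EtherType (bytes 12..14) is untouched.
pub fn swap_macs(eth: &mut [u8; 14]) {
    let (dst, rest) = eth.split_at_mut(6);
    dst.swap_with_slice(&mut rest[..6]);
}
```

When both ports change, `update_checksum` is applied once per changed word, and the result should match a full recomputation over the rewritten words.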
Thomas Eizinger	ae157bce12	fix(relay): turn regression tests back on (#8541 ) As part of iterating on #8496, the API of `relay::Server` had changed and I had commented out the regression tests to move quicker. In later iterations, those API changes were reverted but I forgot to uncomment them.	2025-03-31 08:55:26 +00:00
Thomas Eizinger	afa6814ab4	chore(relay): ignore eBPF integration test (#8543 ) This needs elevated privileges to run. Our current pattern for these is to set them as ignored. In CI, we run all tests, including the ignored ones.	2025-03-29 01:49:43 +00:00
Thomas Eizinger	3c7ac084c0	feat(relay): MVP for routing channel data message in eBPF kernel (#8496 ) ## Abstract This pull-request implements the first stage of off-loading routing of TURN data channel messages to the kernel via an eBPF XDP program. In particular, the eBPF kernel implemented here only handles the decapsulation of IPv4 data channel messages into their embedded UDP payload. Implementation of other data paths, such as the receiving of UDP traffic on an allocation and wrapping it in a TURN channel data message is deferred to a later point for reasons explained further down. As it stands, this PR implements the bare minimum for us to start experimenting and benefiting from eBPF. It is already massive as it is due to the infrastructure required for actually doing this. Let's dive into it! ## A refresher on TURN channel-data messages TURN specifies a channel-data message for relaying data between two peers. A channel data message has a fixed 4-byte header: - The first two bytes specify the channel number - The second two bytes specify the length of the encapsulated payload Like all TURN traffic, channel data messages run over UDP by default, meaning this header sits at the very front of the UDP payload. This will be important later. After making an allocation with a TURN server (i.e. reserving a port on the TURN server's interfaces), a TURN client can bind channels on that allocation. As such, channel numbers are scoped to a client's allocation. Channel numbers are allocated by the client within a given range (0x4000 - 0x4FFF). When binding a channel, the client specifies the remote's peer address that they'd like the data sent on the channel to be sent to. Given this setup, when a TURN server receives a channel data message, it first looks at the sender's IP + port to infer the allocation (a client can only ever have 1 allocation at a time). 
Within that allocation, the server then looks for the channel number and retrieves the target socket address from that. The allocation itself is a port on the relay's interface. With that, we can now "unpack" the payload of the channel data message and rewrite it to the new receiver: - The new source IP can be set from the old dst IP (when operating in user-space mode this is irrelevant because we are working with the socket API). - The new source port is the client's allocation. - The new destination IP is retrieved from the mapping retrieved via the channel number. - The new destination port is retrieved from the mapping retrieved via the channel number. Last but not least, all that is left is removing the channel data header from the UDP payload and we can send out the packet. In other words, we need to cut off the first 4 bytes of the UDP payload. ## User-space relaying At present, we implement the above flow in user-space. This is tricky to do because we need to bind _many_ sockets, one for each possible allocation port (of which there can be 16383). The actual work to be done on these packets is also extremely minimal. All we do is cut off (or add on) the data-channel header. Benchmarks show that we spend pretty much all of our time copying data between user-space and kernel-space. Cutting this out should give us a massive increase in performance. ## Implementing an eBPF XDP TURN router eBPF has been shown to be a very efficient way of speeding up a TURN server [0]. After many failed experiments (e.g. using TC instead of XDP) and countless rabbit-holes, we have also arrived at the design documented within the paper. Most notably: - The eBPF program is entirely optional. We try to load it on startup, but if that fails, we will simply use the user-space mode. - Retaining the user-space mode is also important because under certain circumstances, the eBPF kernel needs to pass on the packet, for example, when receiving IPv4 packets with options. 
Those make the header dynamically-sized which makes further processing difficult because the eBPF verifier disallows indexing into the packet with data derived from the packet itself. - In order to add/remove the channel-data header, we shift the packet headers backwards / forwards and leave the payload in place as the packet headers are constant in size and can thus easily and cheaply be copied out. In order to perform the relaying flow explained above, we introduce maps that are shared with user-space. These maps go from a tuple of (client-socket, channel-number) to a tuple of (allocation-port, peer-socket) and thus give us all the data necessary to rewrite the packet. ## Integration with our relay Last but not least, to actually integrate the eBPF kernel with our relay, we need to extend the `Server` with two more events so we can learn when channel bindings are created and when they expire. Using these events, we can then update the eBPF maps accordingly and therefore influence the routing behaviour in the kernel. ## Scope What is implemented here is only one of several possible data paths. Implementing the others isn't conceptually difficult but it does increase the scope. Landing something that already works allows us to gain experience running it in staging (and possibly production). Additionally, I've hit some issues with the eBPF verifier when adding more codepaths to the kernel. I expect those to be possible to resolve given sufficient debugging but I'd like to do so after merging this. --- Depends-On: #8506 Depends-On: #8507 Depends-On: #8500 Resolves: #8501 [0]: https://dl.acm.org/doi/pdf/10.1145/3609021.3609296	2025-03-27 10:59:40 +00:00
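The decapsulation flow described in #8496 (parse the 4-byte channel-data header, look up the route by client socket and channel number, then rewrite source and destination) can be sketched in user-space Rust. This is only an illustration: a `HashMap` stands in for the shared eBPF map, and `Route`, `ChannelMap`, `Rewritten`, `parse_channel_data` and `decapsulate` are hypothetical names that do not appear in the commit.

```rust
use std::collections::HashMap;
use std::net::{Ipv4Addr, SocketAddrV4};

/// Channel numbers are allocated by the client within 0x4000 - 0x4FFF.
pub const CHANNEL_MIN: u16 = 0x4000;
pub const CHANNEL_MAX: u16 = 0x4FFF;
/// Two bytes of channel number followed by two bytes of payload length.
pub const HEADER_LEN: usize = 4;

/// Splits a UDP payload into (channel number, encapsulated payload).
/// Returns `None` if the header is truncated, the channel number is out of
/// range, or the declared length exceeds the available bytes.
pub fn parse_channel_data(udp_payload: &[u8]) -> Option<(u16, &[u8])> {
    let header = udp_payload.get(..HEADER_LEN)?;
    let channel = u16::from_be_bytes([header[0], header[1]]);
    let len = usize::from(u16::from_be_bytes([header[2], header[3]]));
    if !(CHANNEL_MIN..=CHANNEL_MAX).contains(&channel) {
        return None;
    }
    let payload = udp_payload.get(HEADER_LEN..HEADER_LEN + len)?;
    Some((channel, payload))
}

/// Value side of the map: (allocation-port, peer-socket).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Route {
    pub allocation_port: u16,
    pub peer: SocketAddrV4,
}

/// Key side of the map: (client-socket, channel-number).
pub type ChannelMap = HashMap<(SocketAddrV4, u16), Route>;

/// The rewritten datagram: new addresses plus the payload without its header.
#[derive(Debug, PartialEq, Eq)]
pub struct Rewritten<'a> {
    pub src: SocketAddrV4,
    pub dst: SocketAddrV4,
    pub payload: &'a [u8],
}

/// Returns `None` when the packet cannot be handled here, which corresponds
/// to passing it on to the user-space relay.
pub fn decapsulate<'a>(
    map: &ChannelMap,
    client: SocketAddrV4,
    relay_ip: Ipv4Addr, // the old destination IP becomes the new source IP
    udp_payload: &'a [u8],
) -> Option<Rewritten<'a>> {
    let (channel, payload) = parse_channel_data(udp_payload)?;
    let route = map.get(&(client, channel))?;
    Some(Rewritten {
        src: SocketAddrV4::new(relay_ip, route.allocation_port),
        dst: route.peer,
        payload,
    })
}
```

Entries would be inserted into and removed from the map when the `Server` reports that a channel binding was created or has expired. The real XDP program shifts the packet headers rather than returning a borrowed slice.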

15 Commits