Data has shown that we still do a significant amount of relaying in userspace. Which candidate pair establishes first matters: if an IPv6-to-IPv4 path establishes first, we could often pick that path, which would bypass the eBPF relaying altogether. To address this, we now perform address translation when relaying so that these paths are covered too.

Preliminary benchmarking on Azure shows around 1.5 Gbps for a single client-gateway path, scaling linearly with the number of clients up to the core count. On GCP, performance will be a fraction of that because we need to attach the program in SKB_MODE (generic), as the `gve` driver there does not support the `bpf_xdp_adjust_head` call we need.

To keep the verifier happy (and the verifier error trace log usable) throughout this large refactor, we unfortunately had to drop down to pointer arithmetic. In return, we have full control over (and visibility into) how the bytes are loaded, stored, and copied. Each struct or abstraction adds a little overhead on the stack, which pushed us over the 512-byte limit. Since we generally load only one set of packet headers onto the stack and then copy them into their new locations, our actual stack usage should be well below the 512-byte limit.

Further performance analysis is required to push past the current per-core limit of 1.5 Gbps. This, along with CI support for integration-testing these codepaths, is left for a later date, as this PR is already quite large and needs to soak for a bit in a live environment before we push to prod.

Fixes #10192
Rust development guide
Firezone uses Rust for all data plane components. This directory contains the Linux and Windows clients, and low-level networking implementations related to STUN/TURN.
We target the latest stable release of Rust using rust-toolchain.toml.
If you are using rustup, that is automatically handled for you.
Otherwise, ensure you have the latest stable version of Rust installed.
Reading Client logs
The Client logs are written as JSONL for machine-readability.
To make them more human-friendly, pipe them through jq like this:
cd path/to/logs # e.g. `$HOME/.cache/dev.firezone.client/data/logs` on Linux
cat *.log | jq -r '"\(.time) \(.severity) \(.message)"'
This results in output such as:
2024-04-01T18:25:47.237661392Z INFO started log
2024-04-01T18:25:47.238193266Z INFO GIT_VERSION = 1.0.0-pre.11-35-gcc0d43531
2024-04-01T18:25:48.295243016Z INFO No token / actor_name on disk, starting in signed-out state
2024-04-01T18:25:48.295360641Z INFO null
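As a minimal sketch, the filter above can be wrapped in a small shell function that keeps only one severity and avoids printing `null` for entries without a message. The function name `filter_logs` and the `-` placeholder are hypothetical choices made for illustration; the field names (`time`, `severity`, `message`) are the ones used in the filter above.

```shell
# Format JSONL log lines from stdin, keeping only entries whose
# "severity" equals the first argument.
# Entries with a missing or null "message" print "-" instead of "null".
filter_logs() {
  jq -r --arg level "$1" \
    'select(.severity == $level) | "\(.time) \(.severity) \(.message // "-")"'
}
```

It could then be used like the original command, for example `cat *.log | filter_logs INFO`.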
Benchmarking on Linux
The recommended way to benchmark any of the Rust components is the Linux perf utility.
For example, to attach to a running application, do:
- Ensure the binary you are profiling is compiled with the release profile.
- sudo perf record -g --freq 10000 --pid $(pgrep <your-binary>)
- Run the speed test or whatever load-inducing task you want to measure.
- sudo perf script > profile.perf
- Open profiler.firefox.com and load profile.perf
Instead of attaching to a process with --pid, you can also specify the path to the executable directly.
That is useful if you want to capture perf data for a test or a micro-benchmark.
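The executable-path variant might look like the sketch below. It reuses the perf flags from the steps above; the function name `profile_binary` is hypothetical, and the sketch assumes you pass the path to an already-built release binary (plus any arguments it needs) yourself.

```shell
# Record a profile by launching the given executable under perf,
# then convert the recording for profiler.firefox.com.
# Usage: profile_binary <path-to-executable> [args...]
profile_binary() {
  # Everything after "--" is the command that perf starts and samples.
  sudo perf record -g --freq 10000 -- "$@"
  # Reads perf.data from the current directory and writes profile.perf.
  sudo perf script > profile.perf
}
```

Once the command exits, profile.perf can be loaded into profiler.firefox.com as described above.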