With this patch, we sample a list of DNS resources on each test run and create a "TCP service" for each of their addresses. Using this list of resources, we then change the `SendTcpPayload` transition to `ConnectTcp` and establish TCP connections to these services using `smoltcp`. For now, we don't send any data on these connections, but we do set the keep-alive interval to 5s, meaning `smoltcp` itself will keep these connections alive. We also set the timeout to 30s and, after each transition in a test run, assert that all TCP sockets are still in their expected state:

- `ESTABLISHED` for most of them.
- `CLOSED` for all sockets where we ended up sampling an IPv4 address but the DNS resource only supports IPv6 addresses (or vice-versa). In these cases, we use the ICMP error sent by the Gateway to assert that the socket is `CLOSED`. Unfortunately, `smoltcp` currently does not handle ICMP messages for its sockets, so we have to call `abort` ourselves.

Overall, this should assert that regardless of whether we roam networks, switch relays or otherwise disrupt the underlying connection, the tunneled TCP connection stays alive.

In order to make this work, I had to tweak the timeouts when we are on-demand refreshing allocations. This only happens in one particular case: when the portal gives us new relays, we refresh all _other_ relays to make sure they are still present. In other words, all relays that we didn't remove and didn't just add but still had in memory are refreshed. This is important for cases where we are network-partitioned from the portal whilst relays are deployed or otherwise reset their state. Instead of the previous 8s max elapsed time of the exponential backoff, as we use for other requests, we now send only a single message with a 1s timeout there. With the increased ICE timeout of 15s, a TCP connection with a 30s timeout would otherwise not survive such an event. This is because it takes the above-mentioned 8s for us to remove a non-functioning relay, all whilst trying to establish a new connection (which then incurs its own ICE timeout). With the reduced 1s timeout on the on-demand refresh, we detect the disappeared relay much more quickly and can immediately establish a new connection via one of the new relays. As always with reduced timeouts, this can create false positives if the relay doesn't reply within 1s for some reason.

Resolves: #9531
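As a rough sketch of the socket configuration described above, assuming `smoltcp`'s `tcp::Socket` API (buffer sizes and the helper names are illustrative, not taken from the patch):

```rust
use smoltcp::socket::tcp;
use smoltcp::time::Duration;

/// Illustrative helper: create a TCP socket configured like the test sockets
/// described above (5s keep-alive interval, 30s timeout).
fn make_test_socket() -> tcp::Socket<'static> {
    let rx = tcp::SocketBuffer::new(vec![0u8; 4096]);
    let tx = tcp::SocketBuffer::new(vec![0u8; 4096]);
    let mut socket = tcp::Socket::new(rx, tx);

    // With a keep-alive interval set, `smoltcp` emits keep-alive probes
    // on the established connection by itself.
    socket.set_keep_alive(Some(Duration::from_secs(5)));
    // If the peer stays silent for longer than this, `smoltcp` aborts the connection.
    socket.set_timeout(Some(Duration::from_secs(30)));

    socket
}

/// Illustrative assertion run after each transition.
fn assert_expected_state(socket: &mut tcp::Socket<'_>, expect_closed: bool) {
    if expect_closed {
        // `smoltcp` does not react to ICMP errors for its sockets, so the test
        // calls `abort` itself before asserting `CLOSED`.
        socket.abort();
        assert_eq!(socket.state(), tcp::State::Closed);
    } else {
        assert_eq!(socket.state(), tcp::State::Established);
    }
}
```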
# Rust development guide
Firezone uses Rust for all data plane components. This directory contains the Linux and Windows clients, and low-level networking implementations related to STUN/TURN.
We target the last stable release of Rust using `rust-toolchain.toml`.
If you are using `rustup`, that is handled for you automatically.
Otherwise, ensure you have the latest stable version of Rust installed.
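For reference, a `rust-toolchain.toml` that pins a channel generally looks like the following; this is illustrative only, and the file in this directory is authoritative:

```toml
# rust-toolchain.toml (illustrative; the real file in this directory
# defines the toolchain actually used)
[toolchain]
channel = "stable"
```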
## Reading Client logs
The Client logs are written as JSONL for machine readability.
To make them more human-friendly, pipe them through `jq` like this:
```bash
cd path/to/logs # e.g. `$HOME/.cache/dev.firezone.client/data/logs` on Linux
cat *.log | jq -r '"\(.time) \(.severity) \(.message)"'
```
Resulting in, e.g.
```
2024-04-01T18:25:47.237661392Z INFO started log
2024-04-01T18:25:47.238193266Z INFO GIT_VERSION = 1.0.0-pre.11-35-gcc0d43531
2024-04-01T18:25:48.295243016Z INFO No token / actor_name on disk, starting in signed-out state
2024-04-01T18:25:48.295360641Z INFO null
```
## Benchmarking on Linux
The recommended way to benchmark any of the Rust components is the Linux `perf` utility.
For example, to attach to a running application, do:
- Ensure the binary you are profiling is compiled with the `release` profile.
- `sudo perf record -g --freq 10000 --pid $(pgrep <your-binary>)`
- Run the speed test or whatever load-inducing task you want to measure.
- `sudo perf script > profile.perf`
- Open profiler.firefox.com and load `profile.perf`.
Instead of attaching to a process with `--pid`, you can also pass the path to the executable directly.
That is useful if you want to capture perf data for a test or a micro-benchmark.
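For example, a hypothetical invocation might look like this (the binary path and test name are illustrative, not actual targets in this repository):

```bash
# Profile a release binary directly instead of attaching via --pid.
sudo perf record -g --freq 10000 ./target/release/<your-binary>

# Or profile a pre-built test binary: build it first, then run it under perf.
cargo test --release --no-run
sudo perf record -g --freq 10000 ./target/release/deps/<your-test-binary> <test-name>
```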