firezone/rust
Thomas Eizinger d244a99c58 feat(connlib): always use all candidates (#9979)
In #6876, we added functionality that would only make use of new remote
candidates whilst we haven't nominated a socket yet with the remote. The
reason was that, in the described edge-case where relays reboot or get
replaced whilst the client is partitioned from the portal (or we
experience a connection hiccup), only one of the two peers, i.e. Client
or Gateway, would migrate to the new relay, leaving the other one in an
inconsistent state.

Looking at recent customer logs, I've been seeing a lot of these
messages:

> Unknown connection or socket has already been nominated

For this particular customer, these are then very quickly followed by
ICE timeouts, leaving the connection unusable.

Considering that, I no longer think that the above change was a good
idea and we should instead always make use of all candidates that we are
given. What we are seeing is that in deployment scenarios where the
latency between Client and Gateway is very low (5-10ms) yet the
latency to the portal is higher (~30-50ms), we trigger a race condition
where we temporarily nominate a _peer-reflexive_ candidate pair
instead of a regular one. This happens because with such a low-latency
link, Client and Gateway are _faster_ at exchanging several
STUN bindings than the control plane is at delivering all the
candidates.

Due to the functionality added in #6876, this then results in us not
accepting the candidates. It further appears that a nominated
peer-reflexive candidate does not provide a stable connection which is
why we then run into an ICE timeout, requiring Firezone to establish a
new connection only to have the same thing happen again.

This is very disruptive for the user experience as the connection only
works for a few moments at a time.
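
The difference between the two behaviours can be sketched roughly as follows. This is a toy model for illustration only, not connlib's or str0m's actual code: `Connection`, `CandidatePolicy` and `add_remote_candidate` are hypothetical names, and candidates are reduced to plain socket addresses.

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

/// Hypothetical policy switch, used here only to contrast the two behaviours.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CandidatePolicy {
    /// Behaviour described for #6876: ignore new candidates after nomination.
    OnlyBeforeNomination,
    /// Behaviour described in this change: always use every candidate.
    Always,
}

/// Toy stand-in for a connection's ICE bookkeeping (assumption, not a real type).
#[derive(Debug, Default)]
pub struct Connection {
    /// Whether a candidate pair (possibly a peer-reflexive one) is nominated.
    pub nominated: bool,
    /// Remote candidates that were accepted so far.
    pub remote_candidates: HashSet<SocketAddr>,
}

impl Connection {
    /// Returns `false` if the candidate was dropped because of the policy.
    pub fn add_remote_candidate(
        &mut self,
        candidate: SocketAddr,
        policy: CandidatePolicy,
    ) -> bool {
        if policy == CandidatePolicy::OnlyBeforeNomination && self.nominated {
            // The race: a pair was nominated before the control plane
            // delivered this candidate, so it is never used.
            return false;
        }

        self.remote_candidates.insert(candidate);
        true
    }
}
```

With `OnlyBeforeNomination`, a candidate that arrives after an early nomination is lost. With `Always`, it is still added and can be checked.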

With #9793, we have actually added a feature that is also at play here.
Now that we don't immediately act on an ICE timeout, it is actually
possible for both Client and Gateway to migrate a connection to a
different relay, should the one that they are using get disconnected. In
#9793, we added a timeout of 2s for this.

To make this fully work, we need to patch str0m to transition to
`Checking` early. Presently, str0m would directly transition from
`Disconnected` to `Connected` in this case which in some of the
high-latency scenarios that we are testing in CI is not enough to
recover the connection within 2s. By transitioning to `Checking` early,
we abort this timer.
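
The interaction between the 2s timeout and the early transition to `Checking` could look roughly like the sketch below. It is a simplified model under stated assumptions: `DisconnectTimer`, `GRACE_PERIOD` and `IceState` are hypothetical names rather than str0m's or connlib's real types, and treating `Connected` as also clearing the timer is an assumption.

```rust
use std::time::{Duration, Instant};

/// Simplified ICE connection states (hypothetical; modelled on the names above).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum IceState {
    Checking,
    Connected,
    Disconnected,
}

/// Grace period before an ICE timeout is acted upon (2s, as stated for #9793).
pub const GRACE_PERIOD: Duration = Duration::from_secs(2);

/// Tracks when a disconnected connection should finally be given up.
#[derive(Debug, Default)]
pub struct DisconnectTimer {
    deadline: Option<Instant>,
}

impl DisconnectTimer {
    /// Feed every ICE state change into the timer.
    pub fn on_state_change(&mut self, state: IceState, now: Instant) {
        match state {
            // Start the grace period, keeping an already running deadline.
            IceState::Disconnected => {
                self.deadline.get_or_insert(now + GRACE_PERIOD);
            }
            // Moving to `Checking` early aborts the timer, giving the new
            // candidates time to be tested. Assumption: `Connected` clears it too.
            IceState::Checking | IceState::Connected => self.deadline = None,
        }
    }

    /// Whether the grace period elapsed without any recovery.
    pub fn is_expired(&self, now: Instant) -> bool {
        self.deadline.is_some_and(|deadline| now >= deadline)
    }
}
```

In this model, a connection that goes `Disconnected` and only reports `Connected` after more than 2s would expire, whereas one that reports `Checking` within the grace period does not.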

Related: https://github.com/algesten/str0m/pull/676
2025-07-24 01:35:54 +00:00

Rust development guide

Firezone uses Rust for all data plane components. This directory contains the Linux and Windows clients, and low-level networking implementations related to STUN/TURN.

We target the latest stable release of Rust using `rust-toolchain.toml`. If you are using `rustup`, that is automatically handled for you. Otherwise, ensure you have the latest stable version of Rust installed.

Reading Client logs

The Client logs are written as JSONL for machine-readability.

To make them more human-friendly, pipe them through jq like this:

cd path/to/logs  # e.g. `$HOME/.cache/dev.firezone.client/data/logs` on Linux
cat *.log | jq -r '"\(.time) \(.severity) \(.message)"'

Resulting in, e.g.

2024-04-01T18:25:47.237661392Z INFO started log
2024-04-01T18:25:47.238193266Z INFO GIT_VERSION = 1.0.0-pre.11-35-gcc0d43531
2024-04-01T18:25:48.295243016Z INFO No token / actor_name on disk, starting in signed-out state
2024-04-01T18:25:48.295360641Z INFO null

Benchmarking on Linux

The recommended way to benchmark any of the Rust components is Linux's `perf` utility. For example, to attach to a running application, do:

  1. Ensure the binary you are profiling is compiled with the release profile.
  2. Run `sudo perf record -g --freq 10000 --pid $(pgrep <your-binary>)`.
  3. Run the speed test or whatever load-inducing task you want to measure.
  4. Run `sudo perf script > profile.perf`.
  5. Open profiler.firefox.com and load `profile.perf`.

Instead of attaching to a process with `--pid`, you can also specify the path to the executable directly. That is useful if you want to capture perf data for a test or a micro-benchmark.