Despite our efforts in #8912, the current implementation still does not do enough to maintain packet ordering across GSO batches. At present, we very aggressively batch packets of the same length together. This is too eager when we consider packet flows such as the following:

```
09:03:49.585143 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [.], seq 1:1229, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 1228
09:03:49.585151 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [P.], seq 1229:2063, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 834
09:03:49.585157 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [P.], seq 2063:3094, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 1031
09:03:49.585187 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [.], seq 3094:4322, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 1228
09:03:49.585188 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [P.], seq 4322:5156, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 834
09:03:49.585227 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [.], seq 5156:6384, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 1228
09:03:49.585228 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [P.], seq 6384:7612, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 1228
09:03:49.585230 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [P.], seq 7612:8249, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 637
09:03:49.585846 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [.], seq 8249:9477, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 1228
09:03:49.585851 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [P.], seq 9477:10705, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 1228
```

As we can see here, the remote sends us packet batches of varying lengths:

- 1228, 834
- 1031
- 1228, 834
- 1228, 1228, 637
- 1228, 1228

1228 represents a "full" TCP packet, so any packet following a full packet SHOULD be grouped into the same GSO batch. Currently, we batch all the 1228 packets together and ignore the fact that there were smaller packets in between those that belong together.

To mitigate this, we refactor the `GsoQueue` to remove the `segment_size` from the binning key of our map and instead group batches only by their source, destination and ECN information. Within such a connection, we then keep an ordered list of batches. A new batch is started if the length differs or we have previously pushed a packet that isn't of the length of the batch, therefore signalling the end of the batch. A sketch of this rule is included at the end of this description.

The result looks very promising (this is loading `blog.firezone.dev` via the `lynx` browser from within the headless-client docker container, so going through a Gateway running this PR):

|main|this PR|
|---|---|
|||

Related: #8899
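To make the batching rule above more concrete, here is a minimal sketch of how such a queue could be structured. It is not the actual `GsoQueue` code from this PR: `ConnectionKey`, `Batch`, `enqueue` and the `Ecn` alias are hypothetical names, and the detail that a single shorter packet may still join a batch as its trailing segment (and then closes it) is an assumption derived from the grouping listed above.

```rust
// A minimal sketch of the grouping rule described above, not the actual
// `GsoQueue` implementation; all names here are hypothetical.
use std::collections::HashMap;
use std::net::SocketAddr;

/// Placeholder for the ECN bits carried alongside each packet.
type Ecn = u8;

/// Binning key without the segment size: only source, destination and ECN.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct ConnectionKey {
    src: SocketAddr,
    dst: SocketAddr,
    ecn: Ecn,
}

/// One GSO batch: every segment is `segment_size` bytes, except for an
/// optional shorter trailing segment which closes the batch.
struct Batch {
    segment_size: usize,
    payload: Vec<u8>,
    closed: bool,
}

#[derive(Default)]
struct GsoQueue {
    /// Per-connection list of batches, kept in the order packets arrived.
    batches: HashMap<ConnectionKey, Vec<Batch>>,
}

impl GsoQueue {
    fn enqueue(&mut self, key: ConnectionKey, packet: &[u8]) {
        let batches = self.batches.entry(key).or_default();

        match batches.last_mut() {
            // The last batch is still open and the packet fits: append it.
            // A shorter packet is allowed once as the trailing segment but
            // then signals the end of the batch.
            Some(batch) if !batch.closed && packet.len() <= batch.segment_size => {
                batch.payload.extend_from_slice(packet);
                if packet.len() < batch.segment_size {
                    batch.closed = true;
                }
            }
            // Otherwise (no batch yet, the previous batch has ended, or the
            // packet is larger than the current segment size): start a new
            // batch so ordering across differently-sized packets is kept.
            _ => batches.push(Batch {
                segment_size: packet.len(),
                payload: packet.to_vec(),
                closed: false,
            }),
        }
    }
}
```

Under these assumptions, the capture above splits into exactly the batches listed earlier: 1228+834, 1031, 1228+834, 1228+1228+637 and 1228+1228, in that order.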
# Rust development guide
Firezone uses Rust for all data plane components. This directory contains the Linux and Windows clients, and low-level networking implementations related to STUN/TURN.
We target the latest stable release of Rust using `rust-toolchain.toml`.
If you are using `rustup`, that is automatically handled for you.
Otherwise, ensure you have the latest stable version of Rust installed.
## Reading Client logs
The Client logs are written as JSONL for machine readability.
To make them more human-friendly, pipe them through `jq` like this:
```
cd path/to/logs # e.g. `$HOME/.cache/dev.firezone.client/data/logs` on Linux
cat *.log | jq -r '"\(.time) \(.severity) \(.message)"'
```
Resulting in, e.g.:

```
2024-04-01T18:25:47.237661392Z INFO started log
2024-04-01T18:25:47.238193266Z INFO GIT_VERSION = 1.0.0-pre.11-35-gcc0d43531
2024-04-01T18:25:48.295243016Z INFO No token / actor_name on disk, starting in signed-out state
2024-04-01T18:25:48.295360641Z INFO null
```
## Benchmarking on Linux
The recommended way to benchmark any of the Rust components is Linux's `perf` utility.
For example, to attach to a running application, do:
- Ensure the binary you are profiling is compiled with the `release` profile.
- `sudo perf record -g --freq 10000 --pid $(pgrep <your-binary>)`
- Run the speed test or whatever load-inducing task you want to measure.
- `sudo perf script > profile.perf`
- Open [profiler.firefox.com](https://profiler.firefox.com) and load `profile.perf`.
Instead of attaching to a process with `--pid`, you can also specify the path to the executable directly.
That is useful if you want to capture perf data for a test or a micro-benchmark.
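For example, a command along these lines should work (the binary path here is only an illustrative placeholder; point it at whichever test or benchmark executable you built):

```
sudo perf record -g --freq 10000 -- ./target/release/<your-binary>
```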