Commit Graph

15 Commits

Thomas Eizinger
d718c5de8e fix(connlib): retry packets on IO error 5 (#10279)
Unfortunately, it isn't very easy to detect whether a socket supports
GSO on Linux. Hence, `quinn-udp` simply probes for its support by trying
to send GSO batches and effectively disables GSO by setting the
`max_gso_segments` state variable to 1 if it encounters either EINVAL
(-22) or EIO (-5).

For EINVAL, `quinn-udp` has an internal retry mechanism. For EIO, the
`Transmit` which is passed to `quinn-udp` needs to be re-chunked and
thus cannot be automatically retried.

To avoid dropping packets, we therefore add a one-off retry step to
sending a datagram whenever we hit EIO on Linux or Android. If the error
was due to GSO not being supported, the 2nd attempt should succeed and,
going forward, even the first one should, until we roam the socket (at
which point this state variable gets reset).
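
A minimal sketch of that retry shape (the `try_send` closure and its
name are illustrative, not the actual connlib code):

```rust
use std::io;

// One-off retry on EIO (os error 5). `try_send` is a hypothetical
// closure that re-chunks the `Transmit` according to the current
// `max_gso_segments` value and performs the actual send.
fn send_with_retry(mut try_send: impl FnMut() -> io::Result<()>) -> io::Result<()> {
    match try_send() {
        // By the time we see EIO, quinn-udp has already clamped
        // `max_gso_segments` to 1, so the retry sends without GSO.
        Err(e) if e.raw_os_error() == Some(5) => try_send(),
        result => result,
    }
}
```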

These packet drops have been causing flakiness in CI ever since we
merged the eBPF tests. Those disable checksum offloading, which appears
to trigger these errors.
2025-09-02 21:31:57 +00:00
Thomas Eizinger
e84bdc5566 refactor(connlib): periodically record queue depths (#10242)
Instead of recording the queue depths on every event-loop tick, we now
record them once a second by setting a Gauge. Not only is that a simpler
instrument to work with, it is also significantly more performant: the
current version takes up quite a bit of CPU time when metrics are
enabled.
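
A minimal sketch of the once-a-second shape, assuming the depth lives in
a shared atomic and `record` is a gauge-style callback (both
illustrative, not connlib's actual types):

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::time::Duration;

// Sample the queue depth once a second instead of on every tick.
fn spawn_depth_reporter(depth: Arc<AtomicU64>, record: impl Fn(u64) + Send + 'static) {
    std::thread::spawn(move || loop {
        record(depth.load(Ordering::Relaxed));
        std::thread::sleep(Duration::from_secs(1));
    });
}
```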

Resolves: #10237
2025-09-02 02:57:36 +00:00
Thomas Eizinger
4e11112d9b feat(connlib): improve throughput on higher latencies (#10231)
It turns out that multi-threaded access to the TUN device on the
Gateway causes packet reordering, which makes the TCP congestion
controller throttle the connection. Additionally, the default TX queue
length of a TUN device on Linux is only 500 packets.

With just a single thread and an increased TX queue length, we get a
throughput performance of just over 1 GBit/s for a 20ms link between
Client and Gateway with basically no packet drops:

```
Connecting to host 172.20.0.110, port 5201
[  5] local 100.79.130.70 port 49546 connected to 172.20.0.110 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   116 MBytes   977 Mbits/sec    0   6.40 MBytes       
[  5]   1.00-2.00   sec   137 MBytes  1.15 Gbits/sec    0   6.40 MBytes       
[  5]   2.00-3.00   sec   134 MBytes  1.13 Gbits/sec    0   6.40 MBytes       
[  5]   3.00-4.00   sec   136 MBytes  1.14 Gbits/sec   47   6.40 MBytes       
[  5]   4.00-5.00   sec   137 MBytes  1.15 Gbits/sec    0   6.40 MBytes       
[  5]   5.00-6.00   sec   138 MBytes  1.16 Gbits/sec    0   6.40 MBytes       
[  5]   6.00-7.00   sec   138 MBytes  1.15 Gbits/sec    0   6.40 MBytes       
[  5]   7.00-8.00   sec   138 MBytes  1.15 Gbits/sec    0   6.40 MBytes       
[  5]   8.00-9.00   sec   138 MBytes  1.16 Gbits/sec    0   6.40 MBytes       
[  5]   9.00-10.00  sec   138 MBytes  1.15 Gbits/sec    0   6.40 MBytes       
[  5]  10.00-11.00  sec   139 MBytes  1.17 Gbits/sec    0   6.40 MBytes       
[  5]  11.00-12.00  sec   139 MBytes  1.17 Gbits/sec    0   6.40 MBytes       
[  5]  12.00-13.00  sec   136 MBytes  1.14 Gbits/sec    0   6.40 MBytes       
[  5]  13.00-14.00  sec   139 MBytes  1.17 Gbits/sec    0   6.40 MBytes       
[  5]  14.00-15.00  sec   140 MBytes  1.17 Gbits/sec    0   6.40 MBytes       
[  5]  15.00-16.00  sec   138 MBytes  1.16 Gbits/sec    0   6.40 MBytes       
[  5]  16.00-17.00  sec   137 MBytes  1.15 Gbits/sec    0   6.40 MBytes       
[  5]  17.00-18.00  sec   139 MBytes  1.17 Gbits/sec    0   6.40 MBytes       
[  5]  18.00-19.00  sec   138 MBytes  1.16 Gbits/sec    0   6.40 MBytes       
[  5]  19.00-20.00  sec   136 MBytes  1.14 Gbits/sec    0   6.40 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-20.00  sec  2.67 GBytes  1.15 Gbits/sec   47             sender
[  5]   0.00-20.02  sec  2.67 GBytes  1.15 Gbits/sec                  receiver

iperf Done.

```
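
For a rough idea of the queue-length change, here is a hedged sketch
that shells out to `ip(8)`; the device name and new length are
illustrative, and the PR itself may configure the device differently:

```rust
use std::process::Command;

// Raise the TUN device's TX queue length from the Linux default of
// 500 packets. Device name and new length are illustrative.
fn raise_txqueuelen() -> std::io::Result<()> {
    let status = Command::new("ip")
        .args(["link", "set", "dev", "tun-firezone", "txqueuelen", "10000"])
        .status()?;
    if !status.success() {
        return Err(std::io::Error::other("`ip link set` failed"));
    }
    Ok(())
}
```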

For further debugging in the future, we are now recording the send and
receive queue depths of both the TUN device and the UDP sockets. Neither
of those turned out to be full in my testing, which leads me to conclude
that no buffer inside Firezone is too small here.

Related: #7452

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
2025-08-20 23:08:56 +00:00
Thomas Eizinger
301d2137e5 refactor(windows): share src IP cache across UDP sockets (#9976)
When looking through customer logs, we see a lot of "Resolved best route
outside of tunnel" messages. Those get logged every time we need to
rerun our re-implementation of Windows' weighting algorithm that decides
which source interface / IP a packet should be sent from.

Currently, this gets cached in every socket instance, so for the
peer-to-peer socket, it is only computed once per destination IP.
However, for DNS queries, we make a new socket for every query: using a
new source port for each DNS query is recommended to avoid
fingerprinting. A new socket also means that we need to re-run this
algorithm every time we make a DNS query, which is why we see this log
so often.

To fix this, we need to share this cache across all UDP sockets. Cache
invalidation is one of the hardest problems in computer science and this
instance is no different. This cache needs to be reset every time we
roam as that changes the weighting of which source interface to use.

To achieve this, we extend the `SocketFactory` trait with a `reset`
method. This method is called whenever we roam and can then reset a
shared cache inside the `UdpSocketFactory`. The "source IP resolver"
function that is passed to the UDP socket now simply accesses this
shared cache and inserts a new entry when it needs to resolve the IP.
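
A hedged sketch of that shape (trait and field names follow the
description above but are not the exact connlib definitions):

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::sync::{Arc, Mutex};

trait SocketFactory {
    // Called whenever we roam; clears state derived from the old network.
    fn reset(&self);
}

struct UdpSocketFactory {
    // destination IP -> best source IP, shared across all UDP sockets.
    src_ip_by_dst: Arc<Mutex<HashMap<IpAddr, IpAddr>>>,
}

impl SocketFactory for UdpSocketFactory {
    fn reset(&self) {
        if let Ok(mut cache) = self.src_ip_by_dst.lock() {
            cache.clear();
        }
    }
}
```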

As an added benefit, this may speed up DNS queries on Windows a bit
(although I haven't benchmarked it). It should certainly drastically
reduce the number of syscalls we make on Windows.
2025-07-24 01:36:53 +00:00
Thomas Eizinger
f98fcca542 refactor(connlib): directly implement async fn (#9806)
At present, and as a result of how `connlib` evolved, we still implement
a `Poll`-based function for receiving data on our UDP socket. Ever since
we moved to dedicated threads for the UDP socket, we can directly
"block" on receiving datagrams and don't have to poll the socket.

This simplifies the implementation a fair bit. Additionally, it made me
realise that we currently don't expose any errors on the UDP socket.
Likely, those will be ephemeral, but surfacing them is still better than
completely silencing them.
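
A minimal sketch of the resulting shape, using `tokio`'s `UdpSocket` as
a stand-in for connlib's own socket wrapper:

```rust
use tokio::net::UdpSocket;

// Directly await datagrams instead of implementing a poll-based
// function; errors are logged instead of being silently dropped.
async fn recv_loop(socket: UdpSocket) {
    let mut buf = vec![0u8; u16::MAX as usize];
    loop {
        match socket.recv_from(&mut buf).await {
            Ok((len, from)) => {
                // Hand `&buf[..len]` to packet processing here.
                let _ = (len, from);
            }
            Err(e) => tracing::warn!("Failed to receive UDP datagram: {e}"),
        }
    }
}
```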
2025-07-10 13:54:44 +00:00
Jamil
e1ac9e4912 fix(rust): relax assertion on cloudflare tcp response (#9506)
In #9498, the Cloudflare response was updated to match what appears to
be a transient change on their end. It looks like this has changed
again, so to prevent this from breaking CI in the future, we relax the
assertion.
2025-06-10 19:51:18 -07:00
Thomas Eizinger
1fa345aa5e test(rust): adapt response from Cloudflare proxy (#9498)
It appears that Cloudflare changed the response that it is sending for
the 1.1.1.1 IP so we need to adapt our integration test for packet loops
in order to make CI pass.
2025-06-10 14:18:07 +00:00
Thomas Eizinger
f11a902b3d refactor(rust): move dns-control to bin-shared (#9023)
Currently, the platform-specific code for controlling DNS resolution on
a system sits in `firezone-headless-client`. This code is also used by
the GUI client. This creates a weird compile-time dependency from the
GUI client to the headless client.

For other components that have platform-specific implementations, we use
the `firezone-bin-shared` crate. As a first step of resolving the
compile-time dependency, we move the `dns_control` module to
`firezone-bin-shared`.
2025-05-06 01:29:09 +00:00
Thomas Eizinger
e031dfdb4a refactor(connlib): introduce our own bufferpool crate (#8928)
We have been using buffer pools for a while all over `connlib` as a way
to efficiently use heap-allocated memory. This PR harmonizes the usage
of buffer pools across the codebase by introducing a dedicated
`bufferpool` crate. This crate offers a convenient and easy-to-use API
for all the things we (currently) need from buffer pools. As a nice
bonus of having it all in one place, we can now also track metrics for
how many buffers we currently have allocated.
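
The gist of such a pool, as a hedged sketch (an illustrative API, not
the actual `bufferpool` crate; a real pool would additionally count live
buffers to feed the `system.buffer.count` metric shown below):

```rust
use std::sync::{Arc, Mutex};

// Recycle buffers instead of re-allocating on every packet.
#[derive(Clone)]
struct Pool {
    free: Arc<Mutex<Vec<Vec<u8>>>>,
    buffer_size: usize,
}

impl Pool {
    fn new(buffer_size: usize) -> Self {
        Self { free: Arc::default(), buffer_size }
    }

    fn pull(&self) -> Vec<u8> {
        let recycled = self.free.lock().ok().and_then(|mut f| f.pop());
        recycled.unwrap_or_else(|| vec![0; self.buffer_size])
    }

    fn put_back(&self, buf: Vec<u8>) {
        if let Ok(mut free) = self.free.lock() {
            free.push(buf);
        }
    }
}
```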

An example output from the local metrics exporter looks like this:

```
Name         : system.buffer.count
Description  : The number of buffers allocated in the pool.
Unit         : {buffers}
Type         : Sum
Sum DataPoints
Monotonic    : false
Temporality  : Cumulative
DataPoint #0
	StartTime    : 2025-04-29 12:41:25.278436
	EndTime      : 2025-04-29 12:42:25.278088
	Value        : 96
	Attributes   :
		 ->  system.buffer.pool.name: udp-socket-v6
		 ->  system.buffer.pool.buffer_size: 65535
DataPoint #1
	StartTime    : 2025-04-29 12:41:25.278436
	EndTime      : 2025-04-29 12:42:25.278088
	Value        : 7
	Attributes   :
		 ->  system.buffer.pool.buffer_size: 131600
		 ->  system.buffer.pool.name: gso-queue
DataPoint #2
	StartTime    : 2025-04-29 12:41:25.278436
	EndTime      : 2025-04-29 12:42:25.278088
	Value        : 128
	Attributes   :
		 ->  system.buffer.pool.name: udp-socket-v4
		 ->  system.buffer.pool.buffer_size: 65535
DataPoint #3
	StartTime    : 2025-04-29 12:41:25.278436
	EndTime      : 2025-04-29 12:42:25.278088
	Value        : 8
	Attributes   :
		 ->  system.buffer.pool.buffer_size: 1336
		 ->  system.buffer.pool.name: ip-packet
DataPoint #4
	StartTime    : 2025-04-29 12:41:25.278436
	EndTime      : 2025-04-29 12:42:25.278088
	Value        : 9
	Attributes   :
		 ->  system.buffer.pool.buffer_size: 1336
		 ->  system.buffer.pool.name: snownet
```

Resolves: #8385
2025-04-30 08:52:18 +00:00
Thomas Eizinger
b3746b330f refactor(connlib): spawn dedicated threads for UDP sockets (#7590)
Correctly implementing asynchronous IO is notoriously hard. In order to
not drop packets in the process, one has to ensure a given socket is
ready to accept packets, buffer them if it is not, suspend everything
else until the socket is ready, and then continue.

Until now, we did this because it was the only option to run the UDP
sockets on the same thread as the actual packet processing. That in turn
was motivated by wanting to pass around references to the received
packets for processing. Rust's borrow-checker does not allow passing
such references between threads, which forced us to keep the sockets on
the same thread as the packet processing.

Like we already did in other places in `connlib`, this can be solved
through the use of buffer pools. Using a buffer pool, we can use heap
allocations to store the received packets without having to make a new
allocation every time we read new packets. Instead, we can have a
dedicated thread that is connected to `connlib`'s packet processing
thread via two channels (one for inbound and one for outbound packets).
These channels are bounded, which ensures backpressure is maintained in
case one of the two threads lags behind. These bounds also mean that we
have at most N buffers from the buffer pool in-flight (where N is the
capacity of the channel).

Within those dedicated threads, we can then use `async/await` notation
to suspend the entire task when a socket isn't ready for sending.
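
A hedged sketch of the topology (capacities, buffer types, and names are
illustrative, and only the inbound direction is shown):

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

// Bounded channels provide backpressure: `send` blocks once the
// channel holds N buffers, so at most N pooled buffers are in flight.
fn spawn_socket_thread(read_datagram: impl Fn() -> Vec<u8> + Send + 'static) -> Receiver<Vec<u8>> {
    const N: usize = 128;
    let (inbound_tx, inbound_rx) = sync_channel::<Vec<u8>>(N);

    thread::spawn(move || loop {
        // In the real code, `read_datagram` fills a buffer from the
        // pool; a second channel carries outbound packets the other way.
        if inbound_tx.send(read_datagram()).is_err() {
            break; // packet-processing side has shut down
        }
    });

    inbound_rx
}
```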

Resolves: #8000
2025-04-14 06:18:06 +00:00
Thomas Eizinger
58fe527b0e feat(connlib): mirror ECN bits on TUN device (#8511)
From the perspective of any application, Firezone is a layer-3 network
and will thus use the host's networking stack to form IP packets for
whichever application protocol is in use (UDP, TCP, etc). These packets
then get encapsulated into UDP packets by Firezone and sent to a
Gateway.

As a result of this design, the IP headers seen by the networking stacks
of the Client and the receiving service are not visible to any
intermediary along the network path between Client and Gateway.

In case this network path is congested and middleboxes such as routers
need to drop packets, they will look at the ECN bits in the IP header
(of the UDP packet generated by a Client or Gateway) and flip a bit in
case the previous value indicated support for ECN (`0b01` or `0b10`).
When received by a network stack that supports ECN, seeing `0b11` means
that the network path is congested and that it must reduce its
send/receive windows (or otherwise throttle the connection).

At present, this doesn't work with Firezone because of the
aforementioned encapsulation of IP packets. To support ECN, we need to
therefore:

- Copy ECN bits from a received IP packet to the datagram that
encapsulates it: This ensures that if the Client's network stack supports
ECN, we mirror that support on the wire.
- Copy ECN bits from a received datagram to the IP packet that is sent to
the TUN device: This ensures that if the "Congestion Experienced" bit
gets set along the network path between Client and Gateway, we reflect
that accordingly on the IP packet emitted by the TUN device, as sketched
below.
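
A small sketch of the bit manipulation involved (not the actual connlib
code): the ECN field occupies the low two bits of the IPv4 TOS / IPv6
traffic-class byte.

```rust
const ECN_MASK: u8 = 0b11;
const CE: u8 = 0b11; // "Congestion Experienced"

// Copy the ECN field from one TOS / traffic-class byte to another,
// leaving the DSCP bits of the target untouched.
fn copy_ecn(from: u8, to: u8) -> u8 {
    (to & !ECN_MASK) | (from & ECN_MASK)
}

#[test]
fn ce_survives_decapsulation() {
    // CE set on the outer datagram ends up on the inner IP packet.
    assert_eq!(copy_ecn(CE, 0b000000_00) & ECN_MASK, CE);
}
```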

Resolves: #3758

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
2025-03-26 20:55:51 +00:00
Thomas Eizinger
84a2c275ca build(rust): upgrade to Rust 1.85 and Edition 2024 (#8240)
Updates our codebase to the 2024 Edition. For highlights on what
changes, see the following blogpost:
https://blog.rust-lang.org/2025/02/20/Rust-1.85.0.html
2025-03-19 02:58:55 +00:00
Thomas Eizinger
96170be082 fix(gui-client): mitigate deadlock when shutting down TUN device (#8268)
In #8159, we introduced a regression that could lead to a deadlock when
shutting down the TUN device. Whilst we did close the channel prior to
awaiting the thread to exit, we failed to notice that _another_ instance
of the sender could be alive as part of an internally stored "sending
permit" within the `PollSender` in case another packet is queued for
sending. We need to explicitly call `abort_send` to free that.
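
A hedged sketch of the fix (`PollSender` and its methods are from
`tokio-util`; the surrounding shutdown shape is illustrative):

```rust
use tokio_util::sync::PollSender;

// Dropping our sender alone is not enough: `PollSender` may hold a
// reserved permit that keeps the channel open. Release it explicitly
// so the worker thread observes a closed channel and can exit.
fn close_outbound(mut outbound: PollSender<Vec<u8>>) {
    outbound.abort_send();
    outbound.close();
    drop(outbound);
}
```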

Judging from the comment and a prior bug, this shutdown logic has been
buggy before. To further avoid this deadlock, we introduce two changes:

- The worker threads only receive a `Weak` reference to the
`wintun::Session`
- We move all device-related state into a dedicated `TunState` struct
that we can drop prior to joining the threads

The combination of these features means that all strong references to
channels and the session are definitely dropped without having to wait
for anything. To provide a clean and synchronous shutdown, we wait for
at most 5s on the worker threads. If they haven't exited by then, we log
a warning and exit anyway.

This should greatly reduce the risk of future bugs here because the
session (and thus the WinTUN device) gets shut down in any case, and so
at worst, we have a few zombie threads around.

Resolves: #8265
2025-02-26 00:46:12 +00:00
Thomas Eizinger
48ba2869a8 chore(rust): ban the use of .unwrap except in tests (#7319)
Using the clippy lint `unwrap_used`, we can automatically lint against
all uses of `.unwrap()` on `Result` and `Option`. This actually turns
up quite a few results. In most cases, they are invariants that can't
actually be hit; for these, we change them to `Option`. In other cases,
they can actually be hit, for example if the user supplies an invalid
log-filter.

Activating this lint ensures the compiler will yell at us every time we
use `.unwrap` to double-check whether we do indeed want to panic here.
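
A hedged sketch of what the lint enforces (the exact activation
mechanism in the repo may differ, e.g. a workspace lint table):

```rust
#![deny(clippy::unwrap_used)]

// With the lint active, `input.parse::<u8>().unwrap()` would be
// rejected; fallible user input has to be handled explicitly.
fn parse_level(input: &str) -> Result<u8, std::num::ParseIntError> {
    input.parse()
}

#[cfg(test)]
mod tests {
    #![allow(clippy::unwrap_used)] // tests are exempt from the ban

    #[test]
    fn parses() {
        assert_eq!(super::parse_level("3").unwrap(), 3);
    }
}
```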

Resolves: #7292.
2024-11-13 03:59:22 +00:00
Reactor Scram
5a44151bba test(bin-shared): improve network notifier test (#6676)
On Windows, the network notifier always notifies once at startup. We
make the DNS notifier and the Linux implementation match this behavior,
and we assert it in the unit test.

Part of a yak shave towards removing Tauri.
2024-09-13 14:53:13 +00:00