Commit Graph

21 Commits

Author SHA1 Message Date
Thomas Eizinger
46cdbbcc23 fix(connlib): use a buffer pool for the GSO queue (#7749)
Within `connlib`, we read batches of IP packets and process them at
once. Each encrypted packet is appended to a buffer shared with other
packets of the same length. Once the batch is successfully processed,
all of these buffers are written out using GSO to the network. This
allows UDP operations to be much more efficient because not every packet
has to traverse the entire syscall hierarchy of the operating system.

Until now, these buffers got re-allocated on every batch. This is pretty
wasteful and leads to a lot of repeated allocations. Measurements show
that most of the time, we only have a handful of packets with different
segments lengths _per batch_. For example, just booting up the
headless-client and running a speedtest showed that only 5 of these
buffers are were needed at one time.

By introducing a buffer pool, we can reuse these buffers between batches
and avoid reallocating them.

Related: #7747.
2025-01-13 19:24:52 +00:00
Thomas Eizinger
40ff26ce1a chore: remove commented out import (#7539) 2024-12-17 21:00:20 +00:00
Thomas Eizinger
dd6b52b236 chore(rust): share edition key via workspace table (#7451) 2024-12-03 00:28:06 +00:00
Thomas Eizinger
0a6554122a feat(connlib): utilise GSO for UDP sockets (#7210)
## Context

At present, `connlib` sends UDP packets one at a time. Sending a packet
requires us to make a syscall which is quite expensive. Under load, i.e.
during a speedtest, syscalls account for over 50% of our CPU time [0].
In order to improve this situation, we need to somehow make use of GSO
(generic segmentation offload). With GSO, we can send multiple packets
to the same destination in a single syscall.

The tricky question here is, how can we achieve having multiple UDP
packets ready at once so we can send them in a single syscall? Our TUN
interface only feeds us packets one at a time and `connlib`'s state
machine is single-threaded. Additionally, we currently only have a
single `EncryptBuffer` in which the to-be-sent datagram sits.

## 1. Stack-allocating encrypted IP packets

As a first step, we get rid of the single `EncryptBuffer` and instead
stack-allocate each encrypted IP packet. Due to our small MTU, these
packets are only around 1300 bytes. Stack-allocating that requires a few
memcpy's but those are in the single-digit % range in the terms of CPU
time performance hit. That is nothing compared to how much time we are
spending on UDP syscalls. With the `EncryptBuffer` out the way, we can
now "freely" move around the `EncryptedPacket` structs and - technically
- we can have multiple of them at the same time.

## 2. Implementing GSO

The GSO interface allows you to pass multiple packets **of the same
length and for the same destination** in a single syscall, meaning we
cannot just batch-up arbitrary UDP packets. Counterintuitively, making
use of GSO requires us to do more copying: In particular, we change the
interface of `Io` such that "sending" a packet performs essentially a
lookup of a `BytesMut`-buffer by destination and packet length and
appends the payload to that packet.

## 3. Batch-read IP packets

In order to actually perform GSO, we need to process more than a single
IP packet in one event-loop tick. We achieve this by batch-reading up to
50 IP packets from the mpsc-channel that connects `connlib`'s main
event-loop with the dedicated thread that reads and writes to the TUN
device. These reads and writes happen concurrently to `connlib`'s packet
processing. Thus, it is likely that by the time `connlib` is ready to
process another IP packet, multiple have been read from the device and
are sitting in the channel. Batch-processing these IP packets means that
the buffers in our `GsoQueue` are more likely to contain more than a
single datagram.

Imagine you are running a file upload. The OS will send many packets to
the same destination IP and likely max MTU to the TUN device. It is
likely, that we read 10-20 of these packets in one batch (i.e. within a
single "tick" of the event-loop). All packets will be appended to the
same buffer in the `GsoQueue` and on the next event-loop tick, they will
all be flushed out in a single syscall.

## Results

Overall, this results in a significant reduction of syscalls for sending
UDP message. In [1], we spend only a total of 16% of our CPU time in
`udpv6_sendmsg` whereas in [0] (main), we spent a total of 34%. Do note
that these numbers are relative to the total CPU time spent per program
run and thus can't be compared directly (i.e. you cannot just do 34 - 16
and say we now spend 18% less time sending UDP packets). Nevertheless,
this appears to be a great improvement.

In terms of throughput, we achieve a ~60% improvement in our benchmark
suite. That one is running on localhost though so it might not
necessarily be reflect like that in a real network.

[0]: https://share.firefox.dev/4hvoPju
[1]: https://share.firefox.dev/4frhCPv
2024-12-02 01:09:44 +00:00
Thomas Eizinger
2c26fc9c0e ci: lint Rust dependencies using cargo deny (#7390)
One of Rust's promises is "if it compiles, it works". However, there are
certain situations in which this isn't true. In particular, when using
dynamic typing patterns where trait objects are downcast to concrete
types, having two versions of the same dependency can silently break
things.

This happened in #7379 where I forgot to patch a certain Sentry
dependency. A similar problem exists with our `tracing-stackdriver`
dependency (see #7241).

Lastly, duplicate dependencies increase the compile-times of a project,
so we should aim for having as few duplicate versions of a particular
dependency as possible in our dependency graph.

This PR introduces `cargo deny`, a linter for Rust dependencies. In
addition to linting for duplicate dependencies, it also enforces that
all dependencies are compatible with an allow-list of licenses and it
warns when a dependency is referred to from multiple crates without
introducing a workspace dependency. Thanks to existing tooling
(https://github.com/mainmatter/cargo-autoinherit), transitioning all
dependencies to workspace dependencies was quite easy.

Resolves: #7241.
2024-11-22 00:17:28 +00:00
Thomas Eizinger
f34f596d8d chore(connlib): print GRO metadata in wire::net::recv log (#7353)
Previously, we printed only the size of each individual packet in the
`wire::net` logs. This makes it impossible to tell whether or not GRO
was used to receive this packet. The total number of bytes can still be
computed by calculating `num_packets * segment_size + trailing_bytes`.
Thus, the new log is strictly superior.
2024-11-15 16:11:03 +00:00
Thomas Eizinger
9712942caa chore(connlib): add logs and error handling on bad stride (#7339)
In order to narrow down #7332, we are now checking the stride length and
report oddities such as `string > len` as an error to the event-loop.
2024-11-14 02:34:17 +00:00
dependabot[bot]
0dd93fcfe6 build(deps): Bump tokio from 1.40.0 to 1.41.0 in /rust (#7169)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.40.0 to 1.41.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/tokio-rs/tokio/releases">tokio's
releases</a>.</em></p>
<blockquote>
<h2>Tokio v1.41.0</h2>
<h1>1.41.0 (Oct 22th, 2024)</h1>
<h3>Added</h3>
<ul>
<li>metrics: stabilize <code>global_queue_depth</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6854">#6854</a>,
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6918">#6918</a>)</li>
<li>net: add conversions for unix <code>SocketAddr</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6868">#6868</a>)</li>
<li>sync: add <code>watch::Sender::sender_count</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6836">#6836</a>)</li>
<li>sync: add <code>mpsc::Receiver::blocking_recv_many</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6867">#6867</a>)</li>
<li>task: stabilize <code>Id</code> apis (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6793">#6793</a>,
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6891">#6891</a>)</li>
</ul>
<h3>Added (unstable)</h3>
<ul>
<li>metrics: add H2 Histogram option to improve histogram granularity
(<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6897">#6897</a>)</li>
<li>metrics: rename some histogram apis (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6924">#6924</a>)</li>
<li>runtime: add <code>LocalRuntime</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6808">#6808</a>)</li>
</ul>
<h3>Changed</h3>
<ul>
<li>runtime: box futures larger than 16k on release mode (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6826">#6826</a>)</li>
<li>sync: add <code>#[must_use]</code> to <code>Notified</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6828">#6828</a>)</li>
<li>sync: make <code>watch</code> cooperative (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6846">#6846</a>)</li>
<li>sync: make <code>broadcast::Receiver</code> cooperative (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6870">#6870</a>)</li>
<li>task: add task size to tracing instrumentation (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6881">#6881</a>)</li>
<li>wasm: enable <code>cfg_fs</code> for <code>wasi</code> target (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6822">#6822</a>)</li>
</ul>
<h3>Fixed</h3>
<ul>
<li>net: fix regression of abstract socket path in unix socket (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6838">#6838</a>)</li>
</ul>
<h3>Documented</h3>
<ul>
<li>io: recommend <code>OwnedFd</code> with <code>AsyncFd</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6821">#6821</a>)</li>
<li>io: document cancel safety of <code>AsyncFd</code> methods (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6890">#6890</a>)</li>
<li>macros: render more comprehensible documentation for
<code>join</code> and <code>try_join</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6814">#6814</a>,
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6841">#6841</a>)</li>
<li>net: fix swapped examples for <code>TcpSocket::set_nodelay</code>
and <code>TcpSocket::nodelay</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6840">#6840</a>)</li>
<li>sync: document runtime compatibility (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6833">#6833</a>)</li>
</ul>
<p><a
href="https://redirect.github.com/tokio-rs/tokio/issues/6793">#6793</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/6793">tokio-rs/tokio#6793</a>
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6808">#6808</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/6808">tokio-rs/tokio#6808</a>
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6810">#6810</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/6810">tokio-rs/tokio#6810</a>
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6814">#6814</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/6814">tokio-rs/tokio#6814</a>
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6821">#6821</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/6821">tokio-rs/tokio#6821</a>
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6822">#6822</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/6822">tokio-rs/tokio#6822</a>
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6826">#6826</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/6826">tokio-rs/tokio#6826</a>
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6828">#6828</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/6828">tokio-rs/tokio#6828</a>
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6833">#6833</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/6833">tokio-rs/tokio#6833</a>
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6836">#6836</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/6836">tokio-rs/tokio#6836</a>
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6838">#6838</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/6838">tokio-rs/tokio#6838</a>
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6840">#6840</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/6840">tokio-rs/tokio#6840</a></p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="01e04daaa1"><code>01e04da</code></a>
chore: prepare Tokio v1.41.0 (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6917">#6917</a>)</li>
<li><a
href="92ccadeb3c"><code>92ccade</code></a>
runtime: fix stability feature flags for docs (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6909">#6909</a>)</li>
<li><a
href="fbfeb9a68a"><code>fbfeb9a</code></a>
metrics: rename <code>*_poll_count_*</code> to
<code>*_poll_time_*</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6924">#6924</a>)</li>
<li><a
href="da745ff335"><code>da745ff</code></a>
metrics: add H2 Histogram option to improve histogram granularity (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6897">#6897</a>)</li>
<li><a
href="ce1c74f1cc"><code>ce1c74f</code></a>
metrics: fix deadlock in injection_queue_depth_multi_thread test (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6916">#6916</a>)</li>
<li><a
href="28c9a14a2e"><code>28c9a14</code></a>
metrics: rename <code>injection_queue_depth</code> to
<code>global_queue_depth</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6918">#6918</a>)</li>
<li><a
href="32e0b4325f"><code>32e0b43</code></a>
ci: freeze FreeBSD and wasm-unknown-unknown on rustc 1.81 (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6911">#6911</a>)</li>
<li><a
href="1656d8e231"><code>1656d8e</code></a>
sync: add <code>mpsc::Receiver::blocking_recv_many</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6867">#6867</a>)</li>
<li><a
href="c9e998e4b3"><code>c9e998e</code></a>
ci: print the correct sort order of the dictionary on failure (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6905">#6905</a>)</li>
<li><a
href="512e9decfb"><code>512e9de</code></a>
rt: add LocalRuntime (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6808">#6808</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/tokio-rs/tokio/compare/tokio-1.40.0...tokio-1.41.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=tokio&package-manager=cargo&previous-version=1.40.0&new-version=1.41.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-10-31 04:45:09 +00:00
Thomas Eizinger
b7bef6d062 chore(rust): use new try_send APIs in quinn-udp (#7185)
With the recent lobbying effort in `quinn-udp`, we were able to get
`try_send` APIs for the UDP socket that doesn't silence any errors while
sending datagrams. Originally, the reasoning in `quinn-udp` was that
because UDP is an unreliable protocol anyway, errors don't need to be
surfaced because there must be upper-level mechanisms for retrying
messages. Whilst that is true, getting immediate feedback that something
isn't working can also be very beneficial. For example, if you don't
have proper IPv6 connectivity on a socket, the syscall will immediately
fail with `DestinationUnreachable`.

Within Firezone, we use these UDP sockets to send all kinds of messages,
including DNS queries to upstream servers. In case that doesn't work,
failing instantly allows us to send a SERVFAIL error back to the OS
right away instead of having to wait for a timeout.

Additionally, `quinn-udp` logs these send errors on WARN which cause
unnecessary noise in Sentry.

Resolves: #6353.
2024-10-30 16:17:52 +00:00
Thomas Eizinger
73eebd2c4d refactor(rust): consistently record errors as tracing::Value (#7104)
Our logging library, `tracing` supports structured logging. This is
useful because it preserves the more than just the string representation
of a value and thus allows the active logging backend(s) to capture more
information for a particular value.

In the case of errors, this is especially useful because it allows us to
capture the sources of a particular error.

Unfortunately, recording an error as a tracing value is a bit cumbersome
because `tracing::Value` is only implemented for `&dyn
std::error::Error`. Casting an error to this is quite verbose. To make
it easier, we introduce two utility functions in `firezone-logging`:

- `std_dyn_err`
- `anyhow_dyn_err`

Tracking errors as correct `tracing::Value`s will be especially helpful
once we enable Sentry's `tracing` integration:
https://docs.rs/sentry-tracing/latest/sentry_tracing/#tracking-errors
2024-10-22 04:46:26 +00:00
Thomas Eizinger
9302331881 refactor(connlib): create new UDP socket for each DNS query (#6999)
This extracts the initial refactoring required for #6944. Currently,
`connlib` sends all DNS queries over the same UDP socket as all the p2p
traffic for gateways and relays. In an earlier design of `connlib`, we
already did something similar as we are doing here but using
`hickory_resolver` for the actual DNS resolution.

Instead of depending on hickory, we implement DNS resolution ourselves
by sending a UDP DNS query to the mapped upstream DNS server. There are
no retries, instead, we rely on the original DNS client to retry in case
a packet gets lost on the way.

Modelling recursive DNS queries as explicit events from the
`ClientState` is necessary for implement DNS over TCP and DNS over
HTTPS. In both cases, the query to the upstream server isn't as simple
as emitting a `Transmit`. By modelling the query as an `async fn` within
`Io`, it will be possible to perform them all in one place.

Resolves: #6297.
2024-10-11 22:33:22 +00:00
Thomas Eizinger
a9f515a453 chore(rust): use #[expect] instead of #[allow] (#6692)
The `expect` attribute is similar to `allow` in that it will silence a
particular lint. In addition to `allow` however, `expect` will fail as
soon as the lint is no longer emitted. This ensures we don't end up with
stale `allow` attributes in our codebase. Additionally, it provides a
way of adding a `reason` to document, why the lint is being suppressed.
2024-09-16 13:51:12 +00:00
Thomas Eizinger
0939068492 chore(connlib): report UDP socket buffer sizes (#6564)
The actual size of the send and receive buffers is OS-dependent. To aid
debugging with customer-submitted logs, we now print the size of the
send and receive buffers of each UDP socket.
2024-09-12 04:00:50 +00:00
Reactor Scram
f507a01f9f fix(windows): prevent routing loops for TCP connections (#6584)
In #6032, we attempted to fix routing loops for Windows and did so
successfully for UDP packets. For TCP sockets, we believed that binding
the socket to an interface is enough to prevent routing loops. This
assumptions is wrong.

> On Windows, a call to bind() affects card selection only incoming
traffic, not outgoing traffic.
>
> Thus, on a client running in a multi-homed system (i.e., more than one
interface card), it's the network stack that selects the card to use,
and it makes its selection based solely on the destination IP, which in
turn is based on the routing table. A call to bind() will not affect the
choice of the card in any way.

On most of our testing machines, this problem didn't surface but it
turns out that on some machines, especially with WiFi cards there is a
conflict between the routes added on the system. In particular, with the
Internet resource active, we also add a catch-all route that we _want_
to have the most priority, i.e. Windows SHOULD send all traffic to our
TUN device. Except for traffic that we generate, like TCP connections to
the portal or UDP packets sent to gateways, relays or DNS servers.

It appears that on some systems, mostly with Ethernet adapters, Windows
picks the "correct" interface for our socket and sends traffic via that
but on other systems, it doesn't. TCP sockets are only used for the
WebSocket connection to the portal. Without that one, Firezone
completely stops working because we can't send any control messages.

To reliably fix this issue, we need to add a dedicated route for the
target IP of each TCP socket that is more specific than the Internet
resource route (`0.0.0.0/0`) but otherwise identical. We do this as part
of creating a new TCP socket. This route is for the _default_ interface
and thus, doesn't get automatically removed when Firezone exits.

We implement a RAII guard that attempts to drop the route on a
best-effort basis. Despite this RAII guard, this route can linger around
in case Firezone is being forced to exit or exits in otherwise unclean
ways. To avoid lingering routes, we always delete all routing table
entries matching the IP of the portal just before we are about to add
one.

Fixes: #6591.

[0]:
https://forums.codeguru.com/showthread.php?487139-Socket-binding-with-routing-table&s=a31637836c1bf7f0bc71c1955e47bdf9&p=1891235#post1891235

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Foo Bar <foo@bar.com>
Co-authored-by: conectado <gabrielalejandro7@gmail.com>
2024-09-05 06:17:28 +00:00
Thomas Eizinger
e3688a475e refactor(connlib): only buffer 1 unsent packet if socket is busy (#6563)
Currently, we buffer UDP packets whenever the socket is busy and try to
flush them out at a later point. This requires allocations and is tricky
to get right.

In order to solve both of these problems, we refactor `snownet` to
return us an `EncryptedPacket` instead of a `Transmit`. An
`EncryptedPacket` is an indirection-abstraction that can be turned into
a `Transmit` given an `EncryptBuffer`. This combination of types allows
us to hold on to the `EncryptedPacket` (which does not contain any
references itself) in the `io` component whilst we are waiting for the
socket to be ready to send again.

This means we will immediately suspend the event loop in case the socket
is no longer ready for sending and resend the datagram in the
`EncryptBuffer` once we get re-polled.
2024-09-04 16:59:33 +00:00
Thomas Eizinger
b13e52b124 build(deps): converge Rust quinn dependency (#6314) 2024-08-16 02:06:18 +00:00
Thomas Eizinger
128d0eb407 feat(connlib): transparently forward non-resources DNS queries (#6181)
Currently, `connlib` depends on `hickory-resolver` to perform DNS
queries for non-resources. This is unnecessary. Instead of buffering the
original UDP DNS query, consulting hickory to resolve the name and
mapping the response back, we can simply take the UDP payload and send
it via our protected socket directly to the original upstream DNS
server.

This ensures `connlib` is as transparent as possible for DNS queries for
non-resources. Additionally, it removes a lot of error handling and
other cruft that we currently have to perform because we are using
hickory. For example, hickory will automatically retry a DNS query after
a certain timeout. However, the OS / client talking to `connlib` will
also retry after a certain timeout because it is making DNS queries over
an unreliable transport (UDP). It is thus unnecessary for us to do that
internally.

To correctly test this change, our test-suite needed some refactoring.
Specifically, DNS servers are now modelled as dedicated `Host`s that can
receive (UDP) traffic.

Lastly, we can remove our dependency on `hickory-proto` and
`hickory-resolver` everywhere and only use `domain` for parsing DNS
messages.

Resolves: #6141.
Related: #6033.
Related: #4800. (Impossible to happen with this design)
2024-08-07 08:54:49 +00:00
dependabot[bot]
bd49298240 build(deps): Bump tokio from 1.38.0 to 1.39.2 in /rust (#6082)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.38.0 to 1.39.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/tokio-rs/tokio/releases">tokio's
releases</a>.</em></p>
<blockquote>
<h2>Tokio v1.39.2</h2>
<h1>1.39.2 (July 27th, 2024)</h1>
<p>This release fixes a regression where the <code>select!</code> macro
stopped accepting expressions that make use of temporary lifetime
extension. (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6722">#6722</a>)</p>
<p><a
href="https://redirect.github.com/tokio-rs/tokio/issues/6722">#6722</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/6722">tokio-rs/tokio#6722</a></p>
<h2>Tokio v1.39.1</h2>
<h1>1.39.1 (July 23rd, 2024)</h1>
<p>This release reverts &quot;time: avoid traversing entries in the time
wheel twice&quot; because it contains a bug. (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6715">#6715</a>)</p>
<p><a
href="https://redirect.github.com/tokio-rs/tokio/issues/6715">#6715</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/6715">tokio-rs/tokio#6715</a></p>
<h2>Tokio v1.39.0</h2>
<h1>1.39.0 (July 23rd, 2024)</h1>
<ul>
<li>This release bumps the MSRV to 1.70. (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6645">#6645</a>)</li>
<li>This release upgrades to mio v1. (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6635">#6635</a>)</li>
<li>This release upgrades to windows-sys v0.52 (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6154">#6154</a>)</li>
</ul>
<h3>Added</h3>
<ul>
<li>io: implement <code>AsyncSeek</code> for <code>Empty</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6663">#6663</a>)</li>
<li>metrics: stabilize <code>num_alive_tasks</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6619">#6619</a>,
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6667">#6667</a>)</li>
<li>process: add <code>Command::as_std_mut</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6608">#6608</a>)</li>
<li>sync: add <code>watch::Sender::same_channel</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6637">#6637</a>)</li>
<li>sync: add
<code>{Receiver,UnboundedReceiver}::{sender_strong_count,sender_weak_count}</code>
(<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6661">#6661</a>)</li>
<li>sync: implement <code>Default</code> for <code>watch::Sender</code>
(<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6626">#6626</a>)</li>
<li>task: implement <code>Clone</code> for <code>AbortHandle</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6621">#6621</a>)</li>
<li>task: stabilize <code>consume_budget</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6622">#6622</a>)</li>
</ul>
<h3>Changed</h3>
<ul>
<li>io: improve panic message of <code>ReadBuf::put_slice()</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6629">#6629</a>)</li>
<li>io: read during write in <code>copy_bidirectional</code> and
<code>copy</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6532">#6532</a>)</li>
<li>runtime: replace <code>num_cpus</code> with
<code>available_parallelism</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6709">#6709</a>)</li>
<li>task: avoid stack overflow when passing large future to
<code>block_on</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6692">#6692</a>)</li>
<li>time: avoid traversing entries in the time wheel twice (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6584">#6584</a>)</li>
<li>time: support <code>IntoFuture</code> with <code>timeout</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6666">#6666</a>)</li>
<li>macros: support <code>IntoFuture</code> with <code>join!</code> and
<code>select!</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6710">#6710</a>)</li>
</ul>
<h3>Fixed</h3>
<ul>
<li>docs: fix docsrs builds with the fs feature enabled (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6585">#6585</a>)</li>
<li>io: only use short-read optimization on known-to-be-compatible
platforms (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6668">#6668</a>)</li>
<li>time: fix overflow panic when using large durations with
<code>Interval</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6612">#6612</a>)</li>
</ul>
<h3>Added (unstable)</h3>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="f602eae499"><code>f602eae</code></a>
chore: prepare Tokio v1.39.2 (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6730">#6730</a>)</li>
<li><a
href="438def7957"><code>438def7</code></a>
macros: allow temporary lifetime extension in select (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6722">#6722</a>)</li>
<li><a
href="ee8d4d1b05"><code>ee8d4d1</code></a>
chore: fix ci failures (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6725">#6725</a>)</li>
<li><a
href="3297052763"><code>3297052</code></a>
ci: test Quinn in CI (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6719">#6719</a>)</li>
<li><a
href="f8fe0ffb23"><code>f8fe0ff</code></a>
chore: prepare Tokio v1.39.1 (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6716">#6716</a>)</li>
<li><a
href="47210a8e6e"><code>47210a8</code></a>
time: revert &quot;avoid traversing entries in the time wheel
twice&quot; (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6715">#6715</a>)</li>
<li><a
href="29545d9037"><code>29545d9</code></a>
runtime: ignore many_oneshot_futures test for alt scheduler (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6712">#6712</a>)</li>
<li><a
href="48e35c11d9"><code>48e35c1</code></a>
chore: release Tokio v1.39.0 (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6711">#6711</a>)</li>
<li><a
href="dd1d37167d"><code>dd1d371</code></a>
macros: accept <code>IntoFuture</code> args for macros (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6710">#6710</a>)</li>
<li><a
href="6a1a7b1591"><code>6a1a7b1</code></a>
chore: prepare tokio-macros v2.4.0 (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/6707">#6707</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/tokio-rs/tokio/compare/tokio-1.38.0...tokio-1.39.2">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=tokio&package-manager=cargo&previous-version=1.38.0&new-version=1.39.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-07-30 20:45:35 +00:00
Gabi
c3a45f53df fix(connlib): prevent routing loops on windows (#6032)
In `connlib`, traffic is sent through sockets via one of three ways:

1. Direct p2p traffic between clients and gateways: For these, we always
explicitly set the source IP (and thus interface).
2. UDP traffic to the relays: For these, we let the OS pick an
appropriate source interface.
3. WebSocket traffic over TCP to the portal: For this too, we let the OS
pick the source interface.

For (2) and (3), it is possible to run into routing loops, depending on
the routes that we have configured on the TUN device.

In Linux, we can prevent routing loops by marking a socket [0] and
repeating the mark when we add routes [1]. Packets sent via a marked
socket won't be routed by a rule that contains this mark. On Android, we
can do something similar by "protecting" a socket via a syscall on the
Java side [2].

On Windows, routing works slightly different. There, the source
interface is determined based on a computed metric [3] [4]. To prevent
routing loops on Windows, we thus need to find the "next best" interface
after our TUN interface. We can achieve this with a combination of
several syscalls:

1. List all interfaces on the machine
2. Ask Windows for the best route on each interface, except our TUN
interface.
3. Sort by Windows' routing metric and pick the lowest one (lower is
better).

Thanks to the abstraction of `SocketFactory` that we already previously
introduced, Integrating this into `connlib` isn't too difficult:

1. For TCP sockets, we simply resolve the best route after creating the
socket and then bind it to that local interface. That way, all packets
will always going via that interface, regardless of which routes are
present on our TUN interface.
2. UDP is connection-less so we need to decide per-packet, which
interface to use. "Pick the best interface for me" is modelled in
`connlib` via the `DatagramOut::src` field being `None`.
- To ensure those packets don't cause a routing loop, we introduce a
"source IP resolver" for our `UdpSocket`. This function gets called
every time we need to send a packet without a source IP.
- For improved performance, we cache these results. The Windows client
uses this source IP resolver to use the above devised strategy to find a
suitable source IP.
- In case the source IP resolution fails, we don't send the packet. This
is important, otherwise, the kernel might choose our TUN interface again
and trigger a routing loop.

The last remark to make here is that this also works for connection
roaming. The TCP socket gets thrown away when we reconnect to the
portal. Thus, the new socket will pick the new best interface as it is
re-created. The UDP sockets also get thrown away as part of roaming.
That clears the above cache which is what we want: Upon roaming, the
best interface for a given destination IP will likely have changed.

[0]:
59014a9622/rust/headless-client/src/linux.rs (L19-L29)
[1]:
59014a9622/rust/bin-shared/src/tun_device_manager/linux.rs (L204-L224)
[2]:
59014a9622/rust/connlib/clients/android/src/lib.rs (L535-L549)
[3]:
https://learn.microsoft.com/en-us/previous-versions/technet-magazine/cc137807(v=msdn.10)?redirectedfrom=MSDN
[4]:
https://learn.microsoft.com/en-us/windows-server/networking/technologies/network-subsystem/net-sub-interface-metric

Fixes: #5955.

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-07-29 22:25:42 +00:00
Thomas Eizinger
59014a9622 refactor(connlib): encapsulate UDP and TCP sockets (#6028)
As part of debugging full-route tunneling on Windows, we discovered that
we need to always explicitly choose the interface through which we want
to send packets, otherwise Windows may cause a routing loop by routing
our packets back into the TUN device.

We already have a `SocketFactory` abstraction in `connlib` that is used
by each platform to customise the setup of each socket to prevent
routing loops.

So far, this abstraction directly returns tokio sockets which don't
allow us to intercept the actual sending of packets. For some of our
traffic, i.e. the UDP packets exchanged with relays, we don't specify a
source address. To make full-route work on Windows, we need to intercept
these packets and explicitly set the source address.

To achieve that, we introduce dedicated `TcpSocket` and `UdpSocket`
structs within `socket-factory`. With this in place, we will be able to
add Windows-conditional code to looks up and sets the source address of
outgoing UDP packets. For TCP sockets, the lookup will happen prior to
connecting to the address and used to bind to the correct interface.

Related: #2667.
Related: #5955.
2024-07-25 04:28:46 +00:00
Gabi
5b0aaa6f81 fix(connlib): protect all sockets from routing loops (#5797)
Currently, only connlib's UDP sockets for sending and receiving STUN &
WireGuard traffic are protected from routing loops. This is was done via
the `Sockets::with_protect` function. Connlib has additional sockets
though:

- A TCP socket to the portal.
- UDP & TCP sockets for DNS resolution via hickory.

Both of these can incur routing loops on certain platforms which becomes
evident as we try to implement #2667.

To fix this, we generalise the idea of "protecting" a socket via a
`SocketFactory` abstraction. By allowing the different platforms to
provide a specialised `SocketFactory`, anything Linux-based can give
special treatment to the socket before handing it to connlib.

As an additional benefit, this allows us to remove the `Sockets`
abstraction from connlib's API again because we can now initialise it
internally via the provided `SocketFactory` for UDP sockets.

---------

Signed-off-by: Gabi <gabrielalejandro7@gmail.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-07-16 00:40:05 +00:00