Commit Graph

63 Commits

Author SHA1 Message Date
Thomas Eizinger
46cdbbcc23 fix(connlib): use a buffer pool for the GSO queue (#7749)
Within `connlib`, we read batches of IP packets and process them at
once. Each encrypted packet is appended to a buffer shared with other
packets of the same length. Once the batch is successfully processed,
all of these buffers are written out using GSO to the network. This
allows UDP operations to be much more efficient because not every packet
has to traverse the entire syscall hierarchy of the operating system.

Until now, these buffers got re-allocated on every batch. This is pretty
wasteful and leads to a lot of repeated allocations. Measurements show
that most of the time, we only have a handful of packets with different
segment lengths _per batch_. For example, just booting up the
headless-client and running a speedtest showed that only 5 of these
buffers were needed at one time.

By introducing a buffer pool, we can reuse these buffers between batches
and avoid reallocating them.
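
As a rough sketch of the idea (illustrative names only, not connlib's
actual types), a pool keyed by segment length could look like this:

```rust
use std::collections::HashMap;

/// Illustrative buffer pool, keyed by GSO segment length.
#[derive(Default)]
struct BufferPool {
    free: HashMap<usize, Vec<Vec<u8>>>,
}

impl BufferPool {
    /// Hand out a buffer for the given segment length, reusing one if possible.
    fn acquire(&mut self, segment_len: usize) -> Vec<u8> {
        self.free
            .get_mut(&segment_len)
            .and_then(|buffers| buffers.pop())
            .unwrap_or_else(|| Vec::with_capacity(segment_len * 64)) // room for a batch
    }

    /// Return a buffer once its batch has been flushed to the socket.
    fn release(&mut self, segment_len: usize, mut buffer: Vec<u8>) {
        buffer.clear(); // keep the allocation, drop the contents
        self.free.entry(segment_len).or_default().push(buffer);
    }
}
```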

Related: #7747.
2025-01-13 19:24:52 +00:00
Thomas Eizinger
956bbbfd91 fix(gateway): translate ICMPv6's PacketTooBig error (#7567)
IPv6 treats fragmentation and MTU errors differently than IPv4. Rather
than requiring fragmentation on each hop of a routing path,
fragmentation needs to happen at the packet source and failure to route
a packet triggers an ICMPv6 `PacketTooBig` error.

These need to be translated back through our NAT64 implementation of the
Gateway. Due to the size difference in the headers of IPv4 and IPv6, the
available MTU to the IPv4 packet is 20 bytes _less_ than the MTU
reported by the ICMP error. IPv6 headers are always 40 bytes, meaning if
the MTU is reported as e.g. 1200 on the IPv6 side, we need to only offer
1180 to the IPv4 end of the application. Once the new MTU is then
honored, the packets translated by our NAT64 implementation will still
conform to the required MTU of 1200, despite the overhead introduced by
the translation.
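
The adjustment itself is plain header arithmetic; an illustrative sketch:

```rust
const IPV6_HEADER_LEN: u16 = 40;
const IPV4_HEADER_LEN: u16 = 20; // without options

/// The MTU we may advertise to the IPv4 side so that the packet still fits
/// once NAT46 swaps the 20-byte IPv4 header for a 40-byte IPv6 header.
fn ipv4_mtu(reported_ipv6_mtu: u16) -> u16 {
    reported_ipv6_mtu - (IPV6_HEADER_LEN - IPV4_HEADER_LEN)
}

fn main() {
    assert_eq!(ipv4_mtu(1200), 1180);
}
```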

Resolves: #7515.
2024-12-22 12:09:14 +00:00
Thomas Eizinger
bc2febed99 fix(connlib): use correct constant for truncating DNS responses (#7551)
In case an upstream DNS server responds with a payload that exceeds the
available buffer space of an IP packet, we need to truncate the
response. Currently, this truncation uses the **wrong** constant to
check for the maximum allowed length. Instead of the
`MAX_DATAGRAM_PAYLOAD`, we actually need to check against a limit that
is less than the MTU, as the IP layer and the UDP layer both add
overhead.

To fix this, we introduce such a constant and provide additional
documentation on the remaining ones to hopefully avoid future errors.
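
For a feel of the arithmetic (the constant names here are illustrative,
not necessarily the ones this commit introduces):

```rust
const MTU: usize = 1280;
const IPV6_HEADER: usize = 40; // worst case of the two IP versions
const UDP_HEADER: usize = 8;

/// The largest DNS payload that still fits into one UDP packet at our MTU.
const MAX_UDP_DNS_PAYLOAD: usize = MTU - IPV6_HEADER - UDP_HEADER;

fn main() {
    assert_eq!(MAX_UDP_DNS_PAYLOAD, 1232);
}
```
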
2024-12-19 17:15:43 +00:00
Thomas Eizinger
d39b6ff1b9 chore(gateway): don't log errors for untranslatable packets (#7541)
Certain packets cannot be translated as part of NAT64/46. The RFC says
to "Silently drop" those. Currently, we log all errors that happens
during the translation and don't follow this guideline.

Most of these "silently drop" errors are related to ICMP types that
cannot be represented in the other version such as ICMPv6 Neighbor
Solicitation.

To fix this, we introduce a new error type in the `ip_packet` module:
`ImpossibleTranslation`. For convenience reasons, we carry that one
through all layers as an `anyhow::Error` and test at the very top of the
event-loop whether the root cause of the error is such a failed
translation. If so, we ignore the error and move on. This isn't as
type-safe as it could be but it is much easier to implement.
Additionally, the risk of a bug here (i.e. if we stop emitting this
error within the IP packet translation layer) is merely that the log
will pop up again.
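
A sketch of that root-cause check, assuming the `anyhow` crate mentioned
above (the error type's shape is simplified here):

```rust
use std::fmt;

#[derive(Debug)]
struct ImpossibleTranslation;

impl fmt::Display for ImpossibleTranslation {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("packet cannot be translated")
    }
}

impl std::error::Error for ImpossibleTranslation {}

fn handle(result: anyhow::Result<()>) {
    if let Err(e) = result {
        // Walk to the root of the error chain; "silently drop" translation failures.
        if e.root_cause().downcast_ref::<ImpossibleTranslation>().is_some() {
            return;
        }
        eprintln!("failed to process packet: {e:#}");
    }
}
```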

Resolves: #7516.
2024-12-18 20:35:08 +00:00
Thomas Eizinger
aa8c53a20d refactor(rust): use a buffer pool for network packets (#7489)
In order to achieve concurrency within `connlib`, we needed to create a
way for IP packets to own the piece of memory they are sitting in. This
allows us to concurrently read IP packets and then batch-process them
(as opposed to having a dedicated buffer and referencing it). At the moment,
those IP packets are allocated on the stack. With a size of ~1300 bytes
that isn't very large but still causes _some_ amount of copying.

We can avoid this copying by relying on a buffer pool:

1. When reading a new IP packet, we request a new buffer from the pool.
2. When the IP packet gets dropped, the buffer gets returned to the
pool.

This allows us to reuse an allocation for a packet once it finished
processing, resulting in less CPU time spent on copying around memory.

This causes us to make more _individual_ heap-allocations in the
beginning: Each packet being processed by `connlib` is allocated on
the heap somewhere. At some point during the lifetime of the tunnel,
this will settle in an ideal state where we have allocated enough slots
to cover new packets whilst also reusing memory from packets that
finished processing already.

The actual `IpPacket` data type is now just a pointer. As a result, the
channels to and from the TUN thread (where we were holding multiple of
these packets) are now significantly smaller, leading to roughly the
same memory usage overall.

In my local testing on Linux, the client still only uses about ~15MB of
RAM even with multiple concurrent speedtests running.
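
The return-on-drop mechanics could look roughly like this (a sketch with
assumed names, not the actual `IpPacket` internals):

```rust
use std::sync::{Arc, Mutex};

#[derive(Default)]
struct Pool {
    free: Mutex<Vec<Vec<u8>>>,
}

/// A buffer that owns its memory and hands the allocation back on drop.
struct PooledBuffer {
    data: Vec<u8>,
    pool: Arc<Pool>,
}

impl Pool {
    fn acquire(self: &Arc<Self>, capacity: usize) -> PooledBuffer {
        let data = self
            .free
            .lock()
            .unwrap()
            .pop()
            .unwrap_or_else(|| Vec::with_capacity(capacity));

        PooledBuffer { data, pool: Arc::clone(self) }
    }
}

impl Drop for PooledBuffer {
    fn drop(&mut self) {
        // Return the allocation to the pool instead of freeing it.
        self.pool.free.lock().unwrap().push(std::mem::take(&mut self.data));
    }
}
```
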
2024-12-16 01:02:17 +00:00
Thomas Eizinger
a0efc4cfdc fix(connlib): don't fail NAT64 on invalid IPv4 DSCP value (#7479)
As per the RFC, the IPv6 traffic class should be 1-to-1 translated to
the IPv4 DSCP value. However, it appears that not all values here are
valid. In particular, when attempting to reach GitHub over IPv6, we
receive an IPv6 packet that has a traffic class value of 72 which is
out-of-range for the IPv4 DSCP value, resulting in the following error
on the Gateway:

```
Failed to translate packet: NAT64 failed: Error '72' is too big to be a 'IPv4 DSCP (Differentiated Services Code Point)' (maximum allowed value is '63')
```

The bigger scope of this issue is that this causes the ICMP packets
returned to the client to be dropped which means that `ssh` spawned by
`git` doesn't learn that the IPv6 address assigned by Firezone is not
actually routable.
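
One way to avoid failing the translation is to mask the value into DSCP's
6-bit range; a sketch (whether connlib masks, clamps, or zeroes the value
is not shown here):

```rust
/// DSCP is only 6 bits wide, so its maximum value is 63.
fn usable_dscp(ipv6_traffic_class: u8) -> u8 {
    ipv6_traffic_class & 0b0011_1111 // e.g. 72 becomes 8 instead of an error
}
```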

Related: #7476.
2024-12-11 19:03:37 +00:00
Thomas Eizinger
dd6b52b236 chore(rust): share edition key via workspace table (#7451) 2024-12-03 00:28:06 +00:00
Thomas Eizinger
9073bddaef fix(gateway): translate ICMP destination unreachable errors (#7398)
## Context

The Gateway implements a stateful NAT that translates the destination IP
and source protocol of every packet that targets a DNS resource IP. This
is necessary because the IPs for DNS resources are generated on the
client without actually performing a DNS lookup; instead, the client always
generates 4 IPv4 and 4 IPv6 addresses. On the Gateway, these IPs are
then assigned in a round-robin fashion to the actual IPs that the domain
resolves to, necessitating a NAT64/46 translation in case a domain only
resolves to IPs of one family.

A domain may resolve to a set of IPs but not all of these IPs may be
routable. Whilst an arguably poor practice by the domain administrator,
routing problems can occur for all kinds of reasons and are well handled
on the wider Internet.

When an IP packet cannot be routed further, the current routing node
generates an ICMP error describing the routing failure and sends it back
to the original sender. ICMP is a layer 4 protocol itself, same as TCP
and UDP. As such, sending out a UDP packet may result in receiving an
ICMP response. In order to allow the sender to learn which packet
failed to route, the ICMP error embeds parts of the original packet in
its payload [0] [1].

The Gateway's NAT table uses parts of the layer 4 protocol as part of
its key; the UDP and TCP source port and the ICMP echo request
identifier (further referred to as "source protocol"). An ICMP error
message doesn't have any of these, meaning the lookup in the NAT table
currently fails and the ICMP error is silently dropped.

A lot of software implements a happy-eyeballs approach and probes for
IPv6 and IPv4 connectivity simultaneously. The absence of the ICMP
errors confuses that algorithm as it detects the packet loss and starts
retransmitting instead of giving up.

## Solution

Upon receiving an ICMP error on the Gateway, we now extract the
partially embedded packet in the ICMP error payload. We use the
destination IP and source protocol of _that_ packet for the lookup in
the NAT table. This returns us the original (client-assigned)
destination IP and source protocol. In order for the Gateway's NAT to be
transparent, we need to patch the packet embedded in the ICMP error to
use the original destination and source protocol. We also have to
account for the fact that the original packet may have been translated
with NAT64/46 and translate it back. Finally, we generate an ICMP error
with the appropriate code and embed the patched packet in its payload.
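
Schematically, the lookup keys off the packet embedded in the ICMP payload
rather than the error packet itself (all names below are hypothetical):

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Hypothetical NAT key: destination IP plus "source protocol" (a UDP/TCP
/// source port or an ICMP echo identifier).
#[derive(PartialEq, Eq, Hash)]
struct NatKey {
    dst: IpAddr,
    source_protocol: u16,
}

struct NatTable {
    // translated (gateway-side) key -> original (client-assigned) key
    entries: HashMap<NatKey, NatKey>,
}

impl NatTable {
    /// For an ICMP error, build the key from the *embedded* packet's fields.
    fn lookup_icmp_error(&self, embedded_dst: IpAddr, embedded_src_proto: u16) -> Option<&NatKey> {
        self.entries.get(&NatKey {
            dst: embedded_dst,
            source_protocol: embedded_src_proto,
        })
    }
}
```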

## Test implementation

To test that this works for all kinds of combinations, we extend
`tunnel_test` to sample a list of unreachable IPs from all IPs sampled
for DNS resources. Upon receiving a packet for one of these IPs, the
Gateway will send an ICMP error back instead of invoking its regular
echo reply logic. On the client-side, upon receiving an ICMP error, we
extract the originally failed packet from the body and treat it as a
successful response.

This may seem a bit hacky at first but is actually how operating systems
would treat ICMP errors as well. For example, a `TcpSocket::connect`
call (triggering a TCP SYN packet) may fail with an IO error if we
receive an ICMP error packet. Thus, in a way, the original packet got
answered, just not with what we expected.

In addition, by treating these ICMP errors as responses to the original
packet, we automatically perform other assertions on them, like ensuring
that they come from the right IP address, that there are no unexpected
packets etc.

## Test alternatives

It is tricky to solve this in other ways in the test suite because at
the time of generating a packet for a DNS resource, we don't know the
actual IP that is being targeted by a certain proxy IP unless we'd start
reimplementing the round-robin algorithm employed by the Gateway. To
"test" the transparency of the NAT, we'd like to avoid knowing about
these implementation details in the test.

## Future work

In this PR, we currently only deal with "Destination Unreachable" ICMP
errors. There are other ICMP messages such as ICMPv6's `PacketTooBig` or
`ParameterProblem`. We should eventually handle these as well. They are
being deferred because translating those between the different IP
versions is only partially implemented and would thus require more work.
The most pressing need is to translate destination unreachable errors to
enable happy-eyeballs algorithms to work correctly.

Resolves: #5614.
Resolves: #6371.

[0]: https://www.rfc-editor.org/rfc/rfc792
[1]: https://www.rfc-editor.org/rfc/rfc4443#section-3.1
2024-12-02 23:07:41 +00:00
Thomas Eizinger
2c26fc9c0e ci: lint Rust dependencies using cargo deny (#7390)
One of Rust's promises is "if it compiles, it works". However, there are
certain situations in which this isn't true. In particular, when using
dynamic typing patterns where trait objects are downcast to concrete
types, having two versions of the same dependency can silently break
things.

This happened in #7379 where I forgot to patch a certain Sentry
dependency. A similar problem exists with our `tracing-stackdriver`
dependency (see #7241).

Lastly, duplicate dependencies increase the compile-times of a project,
so we should aim for having as few duplicate versions of a particular
dependency as possible in our dependency graph.

This PR introduces `cargo deny`, a linter for Rust dependencies. In
addition to linting for duplicate dependencies, it also enforces that
all dependencies are compatible with an allow-list of licenses and it
warns when a dependency is referred to from multiple crates without
introducing a workspace dependency. Thanks to existing tooling
(https://github.com/mainmatter/cargo-autoinherit), transitioning all
dependencies to workspace dependencies was quite easy.

Resolves: #7241.
2024-11-22 00:17:28 +00:00
Thomas Eizinger
48ba2869a8 chore(rust): ban the use of .unwrap except in tests (#7319)
Using the clippy lint `unwrap_used`, we can automatically lint against
all uses of `.unwrap()` on `Result` and `Option`. This turns up quite a
few results actually. In most cases, they are invariants that can't
actually be hit. For these, we change them to `Option`. In other cases,
they can actually be hit. For example, if the user supplies an invalid
log-filter.

Activating this lint ensures the compiler will yell at us every time we
use `.unwrap` to double-check whether we do indeed want to panic here.
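
For illustration, this is roughly what the lint enforces (the exact lint
level and configuration in the repository may differ):

```rust
#![deny(clippy::unwrap_used)]

fn parse_port(input: &str) -> Option<u16> {
    // `input.parse().unwrap()` would now be rejected by clippy; the failure
    // case has to be handled (or panicking justified) explicitly.
    input.parse().ok()
}

#[cfg(test)]
mod tests {
    #[test]
    #[allow(clippy::unwrap_used)] // panicking is fine in tests
    fn parses_valid_port() {
        assert_eq!(super::parse_port("53").unwrap(), 53);
    }
}
```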

Resolves: #7292.
2024-11-13 03:59:22 +00:00
Thomas Eizinger
0e20f7d086 chore(connlib): better error reporting for invalid IP packets (#7320)
Currently, we don't report very detailed errors when we fail to parse
certain IP packets. With this patch, we use `Result` in more places and
also extend the validation of IP packets to:

a) enforce a length of at most 1280 bytes. This should already be the
case due to our MTU but bad things may happen if that is off for some
reason
b) validate the entire IP packet instead of just its header
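
A sketch of such a validation step, assuming `etherparse` (which the
codebase uses) and the 1280-byte bound from above:

```rust
const MAX_PACKET_SIZE: usize = 1280; // our MTU

fn validate_packet(buf: &[u8]) -> anyhow::Result<()> {
    anyhow::ensure!(
        buf.len() <= MAX_PACKET_SIZE,
        "packet of {} bytes exceeds MTU of {MAX_PACKET_SIZE}",
        buf.len()
    );

    // Parse the *entire* packet, not just the IP header.
    etherparse::SlicedPacket::from_ip(buf)
        .map_err(|e| anyhow::anyhow!("malformed IP packet: {e}"))?;

    Ok(())
}
```
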
2024-11-12 19:46:32 +00:00
dependabot[bot]
7e4e190cd6 build(deps): Bump test-strategy from 0.3.1 to 0.4.0 in /rust (#7308)
Bumps [test-strategy](https://github.com/frozenlib/test-strategy) from
0.3.1 to 0.4.0.
Changelog: https://github.com/frozenlib/test-strategy/compare/v0.3.1...v0.4.0


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-11-11 21:26:41 +00:00
Thomas Eizinger
62cb32b7a3 chore(gateway): report more tunnel errors to event-loop (#7299)
Currently, the Gateway's state machine functions for processing packets
use type signatures that only return `Option`. Any errors while
processing packets are logged internally. This makes it difficult to
consistently log these errors.

We refactor these functions to return `Result<Option<T>>` in most cases,
indicating that they may fail for various reasons and also sometimes
succeed without producing an output.

This allows us to consistently log these errors in the event-loop.
Logging them on WARN or ERROR would be too spammy though. In order to
still be alerted about some of these, we use the `telemetry_event!`
macro which samples them at a rate of 1%. This will alert us about cases
that happen often and allows us to handle them explicitly.
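
The shape of the refactored signatures, sketched on a stand-in function:

```rust
/// Illustrative: `Result<Option<T>>` separates "failed" from "succeeded
/// without producing an output packet".
fn handle_packet(packet: &[u8]) -> anyhow::Result<Option<Vec<u8>>> {
    if packet.is_empty() {
        anyhow::bail!("empty packet"); // a real error: logged by the event-loop
    }

    if packet[0] & 0xF0 != 0x40 && packet[0] & 0xF0 != 0x60 {
        return Ok(None); // not IPv4/IPv6: handled (dropped) without output
    }

    Ok(Some(packet.to_vec())) // produced an output packet
}
```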

Once this is deployed to staging, I will monitor the alerts in Sentry to
ensure we won't get spammed with events from customers on the next
release.
2024-11-11 03:50:27 +00:00
Thomas Eizinger
271c480357 fix(connlib): don't attempt to encrypt too large packets (#7263)
When encrypting packets, we need to reserve a buffer within which
boringtun will encrypt the IP packet. Unfortunately, `boringtun` panics
if that buffer is not big enough, which essentially brings all of
`connlib` down.

Really, we should never see a packet that is too large and ideally, we
enforce this at compile-time by creating different variants of
`IpPacket` that are sized accordingly. That is a large refactoring so
until then, we simply discard them instead of panicking.
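
A sketch of the guard (the size bound and the WireGuard overhead below are
assumptions for illustration):

```rust
const MAX_PACKET_SIZE: usize = 1280; // assumed device MTU
const WG_OVERHEAD: usize = 32; // assumed WireGuard data-message overhead

/// Returns `None` for packets that would not fit into the encryption buffer,
/// instead of letting the encryption layer panic.
fn encrypt_guarded(packet: &[u8], out: &mut [u8]) -> Option<usize> {
    if packet.len() > MAX_PACKET_SIZE || out.len() < packet.len() + WG_OVERHEAD {
        return None; // discard: better than bringing all of connlib down
    }
    // ... hand `packet` to boringtun for encryption into `out` here ...
    Some(packet.len() + WG_OVERHEAD)
}
```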

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2024-11-05 04:17:21 +00:00
dependabot[bot]
34882eb689 build(deps): Bump etherparse from 0.15.0 to 0.16.0 in /rust (#7167)
Bumps [etherparse](https://github.com/JulianSchmid/etherparse) from
0.15.0 to 0.16.0.
Release highlights (v0.16.0): adds IP packet defragmentation support via
the new `etherparse-defrag` module.

Full changelog: https://github.com/JulianSchmid/etherparse/compare/v0.15.0...v0.16.0


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-10-31 04:46:57 +00:00
Gabi
dc97b9040d fix(connlib): large upstream dns message (#7183)
If EDNS(0) doesn't work correctly, DNS servers might respond with messages
bigger than our maximum UDP size.

In that case, we need to truncate those messages when forwarding the
response back to the interface and expect the OS to retry with TCP.

Otherwise we aren't able to allocate a packet big enough for this.
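
A minimal sketch of the truncation on the raw DNS wire format (the real
code would likely go through a DNS library such as `domain`):

```rust
const MAX_UDP_DNS_PAYLOAD: usize = 1232; // assumed limit, see the MTU discussion
const DNS_HEADER_LEN: usize = 12;

/// Cut an oversized response down to its header and set the TC bit so the
/// OS retries the query over TCP.
fn truncate_response(msg: &mut Vec<u8>) {
    if msg.len() <= MAX_UDP_DNS_PAYLOAD || msg.len() < DNS_HEADER_LEN {
        return;
    }
    msg.truncate(DNS_HEADER_LEN);
    msg[2] |= 0b0000_0010; // TC flag in the upper flags byte
    for count in &mut msg[4..DNS_HEADER_LEN] {
        *count = 0; // zero the QD/AN/NS/AR counts to match the empty body
    }
}
```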

Fixes #7121

---------

Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-10-30 04:02:14 +00:00
Thomas Eizinger
6eecfc0cfb fix: replace panics with Result for IP packets (#7135)
My theory for this issue is that we receive a UDP DNS response from an
upstream server that is bigger than our MTU and thus forwarding it
fails.

This PR doesn't fix that issue by itself but only mitigates the actual
panic. To properly fix the underlying issue, we need to parse the DNS
message, truncate it, and set the TC bit.

Related: #7121.
2024-10-23 16:25:12 +00:00
Gabi
2976081bc0 chore(connlib): use tcp and udp packets for proptests (#7064)
Currently, tests only send ICMP packets back and forth. To expand our
coverage and later permit us to cover filters and resource picking, this
PR implements sending UDP and TCP packets as part of that logic too.

To keep this PR simpler at this stage, TCP packets don't track an actual
TCP connection; we only assert that they are forwarded back and forth.
This will be fixed in a future PR by emulating TCP sockets.

We also unify how we handle CIDR resources, DNS resources, and
non-resources to reduce the number of transitions.

Fixes #7003

---------

Signed-off-by: Gabi <gabrielalejandro7@gmail.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-10-22 01:21:40 +00:00
Thomas Eizinger
9de1119b69 feat(connlib): support DNS over TCP (#6944)
At present, `connlib` only supports DNS over UDP on port 53. Responses
over UDP are size-constrained by the IP MTU and thus not all DNS
responses fit into a UDP packet. RFC9210 therefore mandates that all DNS
resolvers must also support DNS over TCP to overcome this limitation
[0].

Handling UDP packets is easy; handling TCP streams is more difficult
because we need to effectively implement a valid TCP state machine.

Building on top of a lot of earlier work (linked in issue), this is
relatively easy because we can now simply import
`dns_over_tcp::{Client,Server}` which do the heavy lifting of sending
and receiving the correct packets for us.

The main aspects of the integration that are worth pointing out are:

- We can handle at most 10 concurrent DNS TCP connections _per defined
resolver_. The assumption here is that most applications will first
query for DNS records over UDP and only fall back to TCP if the response
is truncated. Additionally, we assume that clients will close the TCP
connections once they no longer need them.
- Errors on the TCP stream to an upstream resolver result in `SERVFAIL`
responses to the client.
- All TCP connections to upstream resolvers get reset when we roam, all
currently ongoing queries will be answered with `SERVFAIL`.
- Upon network reset (i.e. roaming), we also re-allocate new local ports
for all TCP sockets, similar to our UDP sockets.
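
For reference, DNS over TCP frames each message with a two-byte big-endian
length prefix (RFC 1035, section 4.2.2); the encoding side is as simple as:

```rust
/// Frame a DNS message for transmission over TCP.
fn frame_for_tcp(msg: &[u8]) -> Vec<u8> {
    let mut framed = Vec::with_capacity(2 + msg.len());
    framed.extend_from_slice(&(msg.len() as u16).to_be_bytes());
    framed.extend_from_slice(msg);
    framed
}
```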

Resolves: #6140.

[0]: https://www.ietf.org/rfc/rfc9210.html#section-3-5
2024-10-18 03:40:50 +00:00
Thomas Eizinger
d8cc4c7161 chore(rust): use latest main of smoltcp (#7062)
The last released version of `smoltcp` is `0.11.0`. That version is
almost a year old. Since then, an important "bug" got fixed in the IPv6
handling code of `smoltcp`.

In order to route packets to our interface, we define a dummy IPv4 and
IPv6 address and create catch-all routes with our interface as the
gateway. Together with `set_any_ip(true)`, this makes `smoltcp` accept
any packet we pass to it. This is necessary because we don't directly
connect `smoltcp` to the TUN device but rather have an `InMemoryDevice`
where we explicitly feed certain packets to it.

In the last released version, `smoltcp` only performs the above logic
for IPv4. For IPv6, the additional check for "do we have a route that
this packet matches" doesn't exist and thus no IPv6 traffic is accepted
by `smoltcp`.

Extracted out of #6944.
2024-10-16 02:15:26 +00:00
Thomas Eizinger
c21bd18b62 refactor(connlib): explicitly define UDP DNS server resources (#7043)
Currently, in our tests, traffic that is targeted at a resource is
handled "in-line" on the gateway. This doesn't really represent how the
real world works. In the real world, the gateway uses the IP forwarding
functionality of the Linux kernel and the corresponding NAT to send the
IP packet to the actual resource.

We don't want to implement this forwarding and NAT in the tests.
However, our testing harness is about to get more sophisticated. We will
be sending TCP DNS queries with #6944 and we want to test TCP and its
traffic filters with #7003.

The state of those TCP sockets needs to live _somewhere_.

If we "correctly" model this and introduce some kind of `HashMap` with
`dyn Resource` in `TunnelTest`, then we will have to actually implement
NAT for those packets to ensure that e.g. the SYN-ACK of a TCP handshake
makes it back to the correct(!) gateway.

That is rather cumbersome.

This PR suggests taking a shortcut there by associating the resources
with each gateway individually. At present, all we have are UDP DNS
servers. Those don't actually have any connection state themselves but
putting them in place gives us a framework for where we can put
connection-specific state. Most importantly, these resources MUST NOT
hold application-specific state. Instead, that state needs to be kept in
`ReferenceState` or `TunnelState` and passed in by reference, as we do
here for the DNS records.

This has effectively the same behaviour as correctly translating IP
packets back and forth between resources and gateways. The packets
"emitted" by a particular `UdpDnsServerResource` will always go back to
the correct gateway.
2024-10-16 01:10:38 +00:00
Thomas Eizinger
cd2dea7846 chore: add sans-IO DNS-over-TCP implementation (#6997)
This splits out the actual DNS server from #6944 into a separate crate.
At present, it only contains a DNS server. Later, we will likely add a
DNS client to it as well because the proptests and connlib itself will
need a user-space DNS TCP client.

The implementation uses `smoltcp` but that is entirely encapsulated. The
`Server` struct exposes only a high-level interface for

- feeding inbound packets as well as retrieving outbound packets
- retrieving parsed DNS queries and sending DNS responses

Related: #6140.
2024-10-10 21:05:12 +00:00
Thomas Eizinger
355726db7a refactor(connlib): clarify design of p2p control protocol (#6980)
This incorporates the feedback from #6939 after a discussion with
@conectado. We agreed that the protocol should be more event-based,
where each message has its own event type. Events MAY appear in pairs or
other cardinality combinations, meaning semantically they could be seen
as requests and responses. In general though, due to the unreliable
nature of IP, it is better to view them as events. Events are typically
designed to be idempotent which is important to make this protocol work.
Using events also means it is not as easy to fall into the "trap" of
modelling requests / responses on the control protocol level.
2024-10-09 00:45:30 +00:00
Thomas Eizinger
027ef60ded feat(connlib): introduce FZ p2p control protocol (#6939)
At present, `connlib` utilises the portal as a signalling layer for any
kind of control message that needs to be exchanged between clients and
gateways. For anything regard to connectivity, this is crucial: Before
we have a direct connection to the gateway, we don't really have a
choice other than using the portal as a "relay" to e.g. exchange address
candidates for ICE.

However, once a direct connection has been established, exchanging
information directly with the gateway is faster and removes the portal
as a potential point of failure for the data plane.

For DNS resources, `connlib` intercepts all DNS requests on the client
and assigns its own IPs within the CG-NAT range to all domains that are
configured as resources. Thus, all packets targeting DNS resources will
have one of these IPs set as their destination. The gateway needs to
learn about all the IPs that have been assigned to a certain domain by
the client and perform NAT. We call this concept "DNS resource NAT".

Currently, the domain + the assigned IPs are sent together with the
`allow_access` or `request_connection` message via the portal. The new
control protocol defined in #6732 purposely excludes this information
and only authorises traffic to the entire resource which could also be a
wildcard-DNS resource.

To exchange the assigned IPs for a certain domain with the gateway, we
introduce our own p2p control protocol built on top of IP. All control
protocol messages are sent through the tunnel and thus encrypted at all
times. They are differentiated from regular application traffic as
follows:

- IP src is set to the unspecified IPv6 address (`::`)
- IP dst is set to the unspecified IPv6 address (`::`)
- IP protocol is set to reserved (`0xFF`)

The combination of all three should never appear as regular traffic.
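
A sketch of that discriminator and the fixed header (treating the
non-kind header bytes as reserved is an assumption of this sketch):

```rust
use std::net::Ipv6Addr;

const CONTROL_PROTOCOL: u8 = 0xFF; // the reserved IP protocol number

/// A packet belongs to the control protocol iff src and dst are `::`
/// and the IP protocol is the reserved value.
fn is_control_packet(src: Ipv6Addr, dst: Ipv6Addr, protocol: u8) -> bool {
    src.is_unspecified() && dst.is_unspecified() && protocol == CONTROL_PROTOCOL
}

/// Fixed 8-byte header; the first byte denotes the message kind.
fn encode_header(kind: u8) -> [u8; 8] {
    let mut header = [0u8; 8];
    header[0] = kind;
    header
}
```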

To ensure forwards-compatibility, the control protocol utilises a fixed
8-byte header where the first byte denotes the message kind. In this
current design, there is no concept of a request or response in the
wire-format. Each message is unidirectional and the fact that the two
messages we define in here appear in tandem is purely by convention. We
use the IPv6 payload length to determine the total length of the packet.
The payloads are JSON-encoded. Message types are free to choose whichever
encoding they'd like.

This protocol is sent through the WireGuard tunnel, meaning we are
effectively limited by our device MTU of 1280, otherwise we'd have to
implement fragmentation. For the messages that set up the DNS resource
NAT, we are below this limit:

- UUIDs are 16 bytes
- Domain names are at most 255 bytes
- IPv6 addresses are 16 bytes * 4
- IPv4 addresses are 4 bytes * 4

Including the JSON serialisation overhead, this results in a total
maximum payload size of 402 bytes, which is well below our MTU.

Finally, another thing to consider here is that IP is unreliable,
meaning each use of this protocol needs to make sure that:

- It is resilient against message re-ordering
- It is resilient against packet loss

The details of how this is ensured for setting up the DNS resource NAT
are left to #6732.
2024-10-08 05:43:14 +00:00
Thomas Eizinger
0ef4b50913 refactor(ip-packet): be precise about length of payload (#6938)
The `len` specified in the constructor of `IpPacket` is user-provided.
Technically, that one can be longer than the actual packet. To make sure
we only ever pass out the precise payload of the IP packet, we read the
length from the IP header and cut the slice at the specified length.

For #6461, we will build a control protocol on top of IP that runs
through the WireGuard tunnel. Reading the exact length of the payload is
important for that.
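
Schematically, the length now comes from the IP header rather than the
caller (a simplified sketch without full validation):

```rust
/// Cut `buf` down to the exact packet length declared in its IP header.
fn exact_packet(buf: &[u8]) -> Option<&[u8]> {
    match *buf.first()? >> 4 {
        4 => {
            // IPv4: "total length" (header + payload) lives at bytes 2..4.
            let total = u16::from_be_bytes([*buf.get(2)?, *buf.get(3)?]) as usize;
            buf.get(..total)
        }
        6 => {
            // IPv6: "payload length" at bytes 4..6 excludes the 40-byte header.
            let payload = u16::from_be_bytes([*buf.get(4)?, *buf.get(5)?]) as usize;
            buf.get(..40 + payload)
        }
        _ => None,
    }
}
```
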
2024-10-07 22:48:52 +00:00
Thomas Eizinger
9f3f171b3d feat(connlib): implement TRACE logging for DNS (#6907)
When debugging DNS-related issues, it is useful to see all DNS queries
that go into `connlib` and the responses that we generate. Analogous to
the `wire::net` and `wire::dev` TRACE logs, we introduce `wire::dns`
which logs incoming queries and the responses on TRACE. The output looks
like this:

```
2024-10-02T00:16:47.522847Z TRACE wire::dns::qry: A     caldav.fastmail.com qid=55845
2024-10-02T00:16:47.522926Z TRACE wire::dns::qry: AAAA  caldav.fastmail.com qid=56277
2024-10-02T00:16:47.531347Z TRACE wire::dns::res: AAAA  caldav.fastmail.com => [] qid=56277 rcode=NOERROR
2024-10-02T00:16:47.538984Z TRACE wire::dns::res: A     caldav.fastmail.com => [103.168.172.46 | 103.168.172.61] qid=55845 rcode=NOERROR
2024-10-02T00:16:47.580237Z TRACE wire::dns::qry: HTTPS cloudflare-dns.com qid=21518
2024-10-02T00:16:47.580338Z TRACE wire::dns::qry: A     example.org qid=35459
2024-10-02T00:16:47.580364Z TRACE wire::dns::qry: AAAA  example.org qid=60073
2024-10-02T00:16:47.580699Z TRACE wire::dns::qry: AAAA  ipv4only.arpa qid=17280
2024-10-02T00:16:47.580782Z TRACE wire::dns::qry: A     ipv4only.arpa qid=47215
2024-10-02T00:16:47.581134Z TRACE wire::dns::qry: A     detectportal.firefox.com qid=34970
2024-10-02T00:16:47.581261Z TRACE wire::dns::qry: AAAA  detectportal.firefox.com qid=39505
2024-10-02T00:16:47.609502Z TRACE wire::dns::res: AAAA  example.org => [2606:2800:21f:cb07:6820:80da:af6b:8b2c] qid=60073 rcode=NOERROR
2024-10-02T00:16:47.609640Z TRACE wire::dns::res: AAAA  ipv4only.arpa => [] qid=17280 rcode=NOERROR
2024-10-02T00:16:47.610407Z TRACE wire::dns::res: A     ipv4only.arpa => [192.0.0.170 | 192.0.0.171] qid=47215 rcode=NOERROR
2024-10-02T00:16:47.617952Z TRACE wire::dns::res: HTTPS cloudflare-dns.com => [1  alpn=h3,h2 ipv4hint=104.16.248.249,104.16.249.249 ipv6hint=2606:4700::6810:f8f9,2606:4700::6810:f9f9] qid=21518 rcode=NOERROR
2024-10-02T00:16:47.631124Z TRACE wire::dns::res: A     example.org => [93.184.215.14] qid=35459 rcode=NOERROR
2024-10-02T00:16:47.640286Z TRACE wire::dns::res: AAAA  detectportal.firefox.com => [detectportal.prod.mozaws.net. | prod.detectportal.prod.cloudops.mozgcp.net. | 2600:1901:0:38d7::] qid=39505 rcode=NOERROR
2024-10-02T00:16:47.641332Z TRACE wire::dns::res: A     detectportal.firefox.com => [detectportal.prod.mozaws.net. | prod.detectportal.prod.cloudops.mozgcp.net. | 34.107.221.82] qid=34970 rcode=NOERROR
2024-10-02T00:16:48.737608Z TRACE wire::dns::qry: AAAA  myfiles.fastmail.com qid=52965
2024-10-02T00:16:48.737710Z TRACE wire::dns::qry: A     myfiles.fastmail.com qid=5114
2024-10-02T00:16:48.745282Z TRACE wire::dns::res: AAAA  myfiles.fastmail.com => [] qid=52965 rcode=NOERROR
2024-10-02T00:16:49.027932Z TRACE wire::dns::res: A     myfiles.fastmail.com => [103.168.172.46 | 103.168.172.61] qid=5114 rcode=NOERROR
2024-10-02T00:16:49.190054Z TRACE wire::dns::qry: HTTPS github.com qid=64696
2024-10-02T00:16:49.190171Z TRACE wire::dns::qry: A     github.com qid=11912
2024-10-02T00:16:49.190502Z TRACE wire::dns::res: A     github.com => [100.96.0.1 | 100.96.0.2 | 100.96.0.3 | 100.96.0.4] qid=11912 rcode=NOERROR
2024-10-02T00:16:49.190619Z TRACE wire::dns::qry: A     github.com qid=63366
2024-10-02T00:16:49.190730Z TRACE wire::dns::res: A     github.com => [100.96.0.1 | 100.96.0.2 | 100.96.0.3 | 100.96.0.4] qid=63366 rcode=NOERROR
```

As with the other filters, seeing both queries and responses can be
achieved with `RUST_LOG=wire::dns=trace`. If you are only interested in
the responses, you can activate a more specific log filter using
`RUST_LOG=wire::dns::res=trace`. All responses also print the original
query that they are answering.

Resolves: #6862.
2024-10-02 21:19:06 +00:00
Thomas Eizinger
29bc276bf2 refactor(connlib): parallelise TUN operations (#6673)
Currently, `connlib` is entirely single-threaded. This allows us to
reuse a single buffer for processing IP packets and makes reasoning about
the packet-processing code very simple. Being single-threaded also means
we can only make use of a single CPU core and all operations have to be
sequential.

Analyzing `connlib` using `perf` shows that we spend 26% of our CPU time
writing packets to the TUN interface [0]. Because we are
single-threaded, `connlib` cannot do anything else during this time. If
we could offload the writing of these packets to a different thread,
`connlib` could already process the next packet while the current one is
writing.

Packets that we send to the TUN interface arrive as encrypted WG
packets over UDP and get decrypted into a (currently) shared buffer.
Moving the writing to a different thread implies that we need more of
these buffers so that the next packet(s) can be decrypted into them.

To avoid IP fragmentation, we set the maximum IP MTU to 1280 bytes on
the TUN interface. That actually isn't very big and easily fits into a
stackframe. The default stack size for threads is 2MB [1].

Instead of creating more buffers and cycling through them, we can also
simply stack-allocate our IP packets. This incurs some overhead from
copying packets but it is only ~3.5% [2] (This was measured without a
separate thread). With stack-allocated packets, almost all
lifetime-annotations go away which in itself is already a welcome
ergonomics boost. Stack-allocated packets also means we can simply spawn
a new thread for the packet processing. This thread is connected with
two channels to connlib's main thread. The capacity of 1000 packets will
at most consume an additional 3.5 MB of memory which is fine even on our
most-constrained devices such as iOS.
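
A sketch of the thread-plus-channel layout (types and bounds simplified):

```rust
use std::sync::mpsc;
use std::thread;

const MTU: usize = 1280;

/// A stack-allocated packet: no lifetimes, cheap to move across threads.
struct Packet {
    buf: [u8; MTU],
    len: usize,
}

/// Spawn a dedicated writer thread, connected via a bounded channel of
/// 1000 packets (1280 B x 1000 = ~1.3 MB of buffered data per channel).
fn spawn_tun_writer() -> mpsc::SyncSender<Packet> {
    let (tx, rx) = mpsc::sync_channel::<Packet>(1000);
    thread::spawn(move || {
        for packet in rx {
            let _bytes = &packet.buf[..packet.len]; // write these to the TUN device
        }
    });
    tx
}
```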

[0]: https://share.firefox.dev/3z78CzD
[1]: https://doc.rust-lang.org/std/thread/#stack-size
[2]: https://share.firefox.dev/3Bf4zla

Resolves: #6653.
Resolves: #5541.
2024-09-26 03:03:35 +00:00
Thomas Eizinger
d09335b31f chore(connlib): log dynamic packet header on "unknown resource" (#6766) 2024-09-19 00:21:43 +00:00
Thomas Eizinger
8bac75bd49 fix(connlib): forward PTR queries for non-resources (#6765)
When encountering a PTR query, `connlib` checks if the query is for a
Firezone-managed resource and resolves it to the correct IP. If it isn't
for a DNS resource, we should forward the query to the upstream
resolver.

This isn't what is currently happening though. Instead of forwarding the
query, we bail early from `StubResolver::handle` and thus attempt to
route the packet through the tunnel. This however fails because the DNS
query was targeted at `connlib`'s stub resolver address which never
corresponds to a resource IP.

When TRACE logs were activated, this resulted in several entries such
as

> Unknown resource dst=100.100.111.1

To ensure this doesn't regress, we now generate PTR and MX record
queries in `tunnel_test`. We don't assert the response of those but we
do assert that we always get a response. The inclusion of MX records
asserts that unknown query types get correctly forwarded.

Resolves: #6749.
2024-09-18 22:46:26 +00:00
Thomas Eizinger
e34f36df7e chore(connlib): be more verbose when probing DNS packets (#6751)
Currently, checking whether a packet is a DNS query has multiple silent
exit paths. This makes DNS problems difficult to debug because the
packets will be treated as if they have to get routed through the
tunnel.

This is also something we should fix but that isn't done in this PR: If
we know that a packet is for connlib's DNS stub resolver, we should
never route it through the tunnel. Currently, this isn't possible to
express with the type signature of our DNS module and requires more
refactoring.

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2024-09-18 14:27:23 +00:00
Thomas Eizinger
a9f515a453 chore(rust): use #[expect] instead of #[allow] (#6692)
The `expect` attribute is similar to `allow` in that it will silence a
particular lint. In addition to `allow` however, `expect` will fail as
soon as the lint is no longer emitted. This ensures we don't end up with
stale `allow` attributes in our codebase. Additionally, it provides a
way of adding a `reason` to document, why the lint is being suppressed.
2024-09-16 13:51:12 +00:00
Thomas Eizinger
7adbf9c6af refactor(connlib): remove pnet_packet (#6659)
As the final step in removing `pnet_packet`, we need to introduce `-Mut`
equivalent slices for UDP, TCP and ICMP packets. As a starting point,
introducing `UdpHeaderSliceMut` and `TcpHeaderSliceMut` is fairly
trivial. The ICMP variants are a bit trickier because those are
different for IPv4 and IPv6. Additionally, ICMP for IPv4 is quite
complex because it can have a variable header length. Moreover, for
both variants, the values in byte range 5-8 are semantically different
depending on the ICMP code.

This requires us to design an API that balances ergonomics and
correctness. Technically, an ICMP identifier and sequence can only be
set if the ICMP code is "echo request" or "echo reply". However, adding
an additional parsing step to guarantee this in the type system is quite
verbose.

The trade-off implemented in this PR allows us to write directly to
bytes 5-8 using the `set_identifier` and `set_sequence` functions. To
catch errors early, these functions have debug-assertions built in that
ensure that the packet is indeed an ICMP echo packet.
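
Sketched with the offsets from RFC 792 / RFC 4443 (the assertion logic
here is simplified):

```rust
/// ICMP echo messages carry the identifier at bytes 4..6 and the sequence
/// number at bytes 6..8 (0-indexed), directly after type, code and checksum.
fn set_identifier(icmp: &mut [u8], identifier: u16) {
    // Echo types: ICMPv4 reply = 0, request = 8; ICMPv6 request = 128, reply = 129.
    debug_assert!(matches!(icmp[0], 0 | 8 | 128 | 129), "not an ICMP echo message");
    icmp[4..6].copy_from_slice(&identifier.to_be_bytes());
}

fn set_sequence(icmp: &mut [u8], sequence: u16) {
    debug_assert!(matches!(icmp[0], 0 | 8 | 128 | 129), "not an ICMP echo message");
    icmp[6..8].copy_from_slice(&sequence.to_be_bytes());
}
```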

Resolves: #6366.
2024-09-11 23:52:48 +00:00
Thomas Eizinger
133c2565b2 refactor(connlib): merge IpPacket and MutableIpPacket (#6652)
Currently, we have two structs for representing IP packets: `IpPacket`
and `MutableIpPacket`. As the name suggests, they mostly differ in
mutability. This design was originally inspired by the `pnet_packet`
crate which we based our `IpPacket` on. With subsequent iterations, we
added more and more functionality onto our `IpPacket`, like NAT64 &
NAT46 translation. As a result of that, the `MutableIpPacket` is no
longer directly based on `pnet_packet` but instead just keeps an
internal buffer.

This duplication can be resolved by merging the two structs into a
single `IpPacket`. We do this by first replacing all usages of
`IpPacket` with `MutableIpPacket`, deleting `IpPacket` and renaming
`MutableIpPacket` to `IpPacket`. The final design now has different
`self`-receivers: Some functions take `&self`, some `&mut self` and some
consume the packet using `self`.

This results in a more ergonomic usage of `IpPacket` across the codebase
and deletes a fair bit of code. It also takes us one step closer towards
using `etherparse` for all our IP packet interaction-needs. Lastly, I am
currently exploring a performance-optimisation idea that stack-allocates
all IP packets and for that, the current split between `IpPacket` and
`MutableIpPacket` does not really work.

Related: #6366.
2024-09-11 22:32:49 +00:00
Thomas Eizinger
578363a7fe refactor(ip-packet): introduce etherparse (#6524)
This PR introduces the `etherparse` dependency for parsing and
generating IP packets.

Using `etherparse`, we can implement the NAT46 & NAT64 implementations
for the gateway more elegantly because it allows us to parse the IP and
protocol headers into a static and much richer representation. The
conversion to the IPv4/IPv6 equivalent is then just a question of
transforming one data structure into another and writing it to the
correct place in the buffer.

We extract this functionality into dedicated `nat64` and `nat46`
modules.

Furthermore, we implement the various functions in `ip_packet::make`
using `etherparse` too. Following that, we also overhaul the NAT
translation tests that we have in `ip_packet::proptests`. Those now use
the more low-level `consume_to_ipX` APIs which makes the tests more
ergonomic to write.

In the future, we should upstream `Ipv4HeaderSliceMut` and
`Ipv6HeaderSliceMut` to `etherparse`.

Moving all of this functionality to `etherparse` will make it easier to
write tests that involve more IP packets as well as customise the
behaviour of our NAT.

Related: #5614.
Related: #6371.
Related: #6353.
2024-09-04 20:01:01 +00:00
Thomas Eizinger
a4fef5f6e7 test(connlib): index ICMP packets by custom payload (#6438)
In the `tunnel_test` test suite, we send ICMP requests with arbitrary
sequence numbers and identifiers. Due to the NAT implementation of the
gateway, the sequence number and identifier chosen by the client are not
necessarily the same as the ones sent to the resource. Thus, it is
impossible to correlate the ICMP packets sent by the client with the
ones arriving at the gateway.

Currently, our test suite thus relies on the ordering of packets to
match them up and assert properties on them, like whether they target
the correct resource. As soon as we want to send multiple packets
concurrently, this order is not necessarily stable.

ICMP echo requests can contain an arbitrary payload. We utilise this
payload to embed a random u64 that acts as the unique identifier of an
ICMP request. This allows us to correlate the packets arriving at the
gateway with the ones sent by the client, making the test suite more
robust and ready for handling concurrent ICMP packets.
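
The payload tagging itself is trivial; a sketch:

```rust
/// Embed a random u64 in the echo payload; identifier and sequence number
/// may be rewritten by the gateway's NAT, but the payload passes through intact.
fn tag_payload(unique_id: u64) -> [u8; 8] {
    unique_id.to_be_bytes()
}

fn read_tag(payload: &[u8]) -> Option<u64> {
    Some(u64::from_be_bytes(payload.get(..8)?.try_into().ok()?))
}
```
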
2024-08-27 04:15:39 +00:00
Thomas Eizinger
3b56664e02 test(rust): ensure deterministic proptests (#6319)
For quite a while now, we have been making extensive use of
property-based testing to ensure `connlib` works as intended. The idea
of proptests is that - given a certain seed - we deterministically
sample test inputs and assert properties on a given function.

If the test fails, `proptest` prints the seed which can then be added to
a regressions file to iterate on the test case and fix it. It is quite
obvious that non-determinism in how the test input gets generated is no
bueno and reduces the value we get out of these tests a fair bit.

The `HashMap` and `HashSet` data structures are known to be
non-deterministic in their iteration order. This causes non-determinism
during the input generation because we make use of a lot of maps and
sets to gradually build up the test input. We fix all uses of `HashMap`
and `HashSet` by replacing them with `BTreeMap` and `BTreeSet`.

To ensure this doesn't regress, we refactor `tunnel_test` to not make
use of proptest's macros and instead, we initialise and run the test
ourselves. This allows us to dump the sampled state and transitions into
a file per test run. In CI, we then run a 2nd iteration of all
regression tests and compare the sampled state and transitions with the
previous run. They must match byte-for-byte.

Finally, to discourage use of non-deterministic iteration, we ban the
use of the iteration functions on `HashMap` and `HashSet` across the
codebase. This doesn't catch iteration in a `for`-loop but it is better
than not linting against it at all.

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-08-16 23:15:58 +00:00
Thomas Eizinger
4ae64f0257 fix(connlib): index forwarded DNS queries by ID + socket (#6233)
When forwarding DNS queries, we need to remember the original source
socket in order to send the response back. Previously, this mapping was
indexed by the DNS query ID. As it turns out, at least Windows doesn't
have a global DNS query ID counter and may reuse them across different
DNS servers. If that happens and two of these queries overlap, then we
match the wrong responses together.

In the best case, this produces bad DNS results on the client. In the
worst case, those queries were for DNS servers with different IP
versions in which case we triggered a panic in connlib further down the
stack where we created the IP packet for the response.

To fix this, we first and foremost remove the explicit `panic!` from the
`make::` functions in `ip-packet`. Originally, these functions were only
used in tests but we started to use them in production code too and
unfortunately forgot about this panic. By introducing a `Result`, all
call-sites are made aware that this can fail.

Second, we fix the actual indexing into the data structure for forwarded
DNS queries to also include the DNS server's socket. This ensures we
don't treat the DNS query IDs as globally unique.
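
The indexing change, sketched with hypothetical names:

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

/// In-flight forwarded queries, keyed by (query ID, upstream server) since
/// query IDs alone are not globally unique across servers.
#[derive(Default)]
struct ForwardedQueries {
    inflight: HashMap<(u16, SocketAddr), SocketAddr>, // value: original client socket
}

impl ForwardedQueries {
    fn on_query(&mut self, id: u16, upstream: SocketAddr, client: SocketAddr) {
        self.inflight.insert((id, upstream), client);
    }

    /// Returns the client socket to send the response back to, if we know it.
    fn on_response(&mut self, id: u16, upstream: SocketAddr) -> Option<SocketAddr> {
        self.inflight.remove(&(id, upstream))
    }
}
```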

Third, we replace the panicking path in
`try_handle_forwarded_dns_response` with a log statement, meaning if the
above assumption turns out wrong for some reason, we still don't panic
and simply don't handle the packet.
2024-08-09 07:01:57 +00:00
Thomas Eizinger
128d0eb407 feat(connlib): transparently forward non-resources DNS queries (#6181)
Currently, `connlib` depends on `hickory-resolver` to perform DNS
queries for non-resources. This is unnecessary. Instead of buffering the
original UDP DNS query, consulting hickory to resolve the name and
mapping the response back, we can simply take the UDP payload and send
it via our protected socket directly to the original upstream DNS
server.

This ensures `connlib` is as transparent as possible for DNS queries for
non-resources. Additionally, it removes a lot of error handling and
other cruft that we currently have to perform because we are using
hickory. For example, hickory will automatically retry a DNS query after
a certain timeout. However, the OS / client talking to `connlib` will
also retry after a certain timeout because it is making DNS queries over
an unreliable transport (UDP). It is thus unnecessary for us to do that
internally.

To correctly test this change, our test-suite needed some refactoring.
Specifically, DNS servers are now modelled as dedicated `Host`s that can
receive (UDP) traffic.

Lastly, we can remove our dependency on `hickory-proto` and
`hickory-resolver` everywhere and only use `domain` for parsing DNS
messages.

Resolves: #6141.
Related: #6033.
Related: #4800. (Impossible to happen with this design)
2024-08-07 08:54:49 +00:00
dependabot[bot]
6fa6c08bf9 build(deps): Bump pnet_packet from 0.34.0 to 0.35.0 in /rust (#5396)
Bumps [pnet_packet](https://github.com/libpnet/libpnet) from 0.34.0 to
0.35.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/libpnet/libpnet/releases">pnet_packet's
releases</a>.</em></p>
<blockquote>
<h2>v0.35.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Update license field following SPDX 2.1 license expression standard
by <a href="https://github.com/frisoft"><code>@​frisoft</code></a> in <a
href="https://redirect.github.com/libpnet/libpnet/pull/633">libpnet/libpnet#633</a></li>
<li>transport: Add option to set ECN on the TransportSender socket. by
<a href="https://github.com/hawkinsw"><code>@​hawkinsw</code></a> in <a
href="https://redirect.github.com/libpnet/libpnet/pull/685">libpnet/libpnet#685</a></li>
<li>Fix failing tests by <a
href="https://github.com/Paul-weqe"><code>@​Paul-weqe</code></a> in <a
href="https://redirect.github.com/libpnet/libpnet/pull/676">libpnet/libpnet#676</a></li>
<li>remove the repetitive word by <a
href="https://github.com/cuishuang"><code>@​cuishuang</code></a> in <a
href="https://redirect.github.com/libpnet/libpnet/pull/672">libpnet/libpnet#672</a></li>
<li>Add apple tvos support by <a
href="https://github.com/lcruz99"><code>@​lcruz99</code></a> in <a
href="https://redirect.github.com/libpnet/libpnet/pull/652">libpnet/libpnet#652</a></li>
<li>Adding vxlan to pnet_packet by <a
href="https://github.com/stevedoyle"><code>@​stevedoyle</code></a> in <a
href="https://redirect.github.com/libpnet/libpnet/pull/654">libpnet/libpnet#654</a></li>
<li>Add ICMP Destination unreachable Next-hop MTU by <a
href="https://github.com/fabi321"><code>@​fabi321</code></a> in <a
href="https://redirect.github.com/libpnet/libpnet/pull/662">libpnet/libpnet#662</a></li>
<li>Update ARP example to also support IPv6 via NDP by <a
href="https://github.com/tgross35"><code>@​tgross35</code></a> in <a
href="https://redirect.github.com/libpnet/libpnet/pull/642">libpnet/libpnet#642</a></li>
<li>Ensure BPF read is 4-byte aligned by <a
href="https://github.com/frankplow"><code>@​frankplow</code></a> in <a
href="https://redirect.github.com/libpnet/libpnet/pull/655">libpnet/libpnet#655</a></li>
<li>Expose the various values in the TcpOption structure for external
program access by <a
href="https://github.com/rikonaka"><code>@​rikonaka</code></a> in <a
href="https://redirect.github.com/libpnet/libpnet/pull/640">libpnet/libpnet#640</a></li>
<li>Definition for ethernet flow control packets. by <a
href="https://github.com/AJMansfield"><code>@​AJMansfield</code></a> in
<a
href="https://redirect.github.com/libpnet/libpnet/pull/649">libpnet/libpnet#649</a></li>
<li>Expose set_ecn on unix only by <a
href="https://github.com/mrmonday"><code>@​mrmonday</code></a> in <a
href="https://redirect.github.com/libpnet/libpnet/pull/689">libpnet/libpnet#689</a></li>
<li>datalink(linux): add feature to pass the fd (socket) to ::channel()
by <a href="https://github.com/Martichou"><code>@​Martichou</code></a>
in <a
href="https://redirect.github.com/libpnet/libpnet/pull/584">libpnet/libpnet#584</a></li>
<li>Added DNS protocol support by <a
href="https://github.com/tomDev5"><code>@​tomDev5</code></a> in <a
href="https://redirect.github.com/libpnet/libpnet/pull/678">libpnet/libpnet#678</a></li>
<li>linux: use poll api instead of select inorder to support fd &gt;
1024. Fixes <a
href="https://redirect.github.com/libpnet/libpnet/issues/612">#612</a>
and <a
href="https://redirect.github.com/libpnet/libpnet/issues/639">#639</a>
by <a
href="https://github.com/nemosupremo"><code>@​nemosupremo</code></a> in
<a
href="https://redirect.github.com/libpnet/libpnet/pull/681">libpnet/libpnet#681</a></li>
</ul>
New Contributors:

- @frisoft made their first contribution in libpnet/libpnet#633
- @hawkinsw made their first contribution in libpnet/libpnet#685
- @Paul-weqe made their first contribution in libpnet/libpnet#676
- @cuishuang made their first contribution in libpnet/libpnet#672
- @lcruz99 made their first contribution in libpnet/libpnet#652
- @stevedoyle made their first contribution in libpnet/libpnet#654
- @fabi321 made their first contribution in libpnet/libpnet#662
- @tgross35 made their first contribution in libpnet/libpnet#642
- @frankplow made their first contribution in libpnet/libpnet#655
- @AJMansfield made their first contribution in libpnet/libpnet#649
- @tomDev5 made their first contribution in libpnet/libpnet#678
- @nemosupremo made their first contribution in libpnet/libpnet#681
Full Changelog: https://github.com/libpnet/libpnet/compare/v0.34.0...v0.35.0
Commits:

- 97ece70 Release v0.35.0
- 49c8c68 Merge pull request #681 from ionosnetworks/feat/linux-poll-api
- 07526a7 Merge pull request #678 from tomDev5/dns
- b319ca2 fixed dns code
- a3a46e6 removed BooleanField for u1
- 7086ed2 dns layer in pnet
- 14a01ff Merge pull request #584 from Martichou/raw_socket
- bd4c8b0 datalink(linux): add feature to pass the fd (socket) to ::channel()
- 28e9de4 Merge pull request #689 from mrmonday/ecn-unix-only
- 01eee25 Expose set_ecn on unix only.
- Additional commits are viewable in the compare view: https://github.com/libpnet/libpnet/compare/v0.34.0...v0.35.0

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-20 04:38:45 +00:00
Thomas Eizinger
c92dd559f7 chore(rust): format Cargo.toml using cargo-sort (#5851) 2024-07-12 04:57:22 +00:00
Thomas Eizinger
d95193be7d test(connlib): introduce dynamic number of gateways to tunnel_test (#5823)
Currently, `tunnel_test` exercises a lot of code paths within connlib
already by adding & removing resources, roaming the client and sending
ICMP packets. Yet, it does all of this with just a single gateway,
whereas in production, we are very likely using more than one gateway.

To capture these other code paths, we now sample between 1 and 3
gateways and randomly assign each added resource to one of them, which
makes us hit the code paths that select between different gateways.

Most importantly, the reference implementation has barely any knowledge
about those individual connections. Instead, it is implemented in
terms of connectivity to resources.
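
A minimal sketch of this kind of sampling with proptest, using
hypothetical names (the real `tunnel_test` strategies are more
involved):

```rust
use proptest::prelude::*;

// Sample between 1 and 3 gateways, then assign each of `num_resources`
// resources to one of them by index.
fn gateway_assignments(num_resources: usize) -> impl Strategy<Value = (usize, Vec<usize>)> {
    (1..=3usize).prop_flat_map(move |num_gateways| {
        proptest::collection::vec(0..num_gateways, num_resources)
            .prop_map(move |assignments| (num_gateways, assignments))
    })
}

proptest! {
    #[test]
    fn every_resource_is_assigned_to_a_sampled_gateway(
        (num_gateways, assignments) in gateway_assignments(5)
    ) {
        for gateway_idx in &assignments {
            prop_assert!(*gateway_idx < num_gateways);
        }
    }
}
```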
2024-07-11 23:42:46 +00:00
Thomas Eizinger
1aa95ed17e fix(connlib): be explicit about unsupported ICMP types (#5611)
Our NAT table uses TCP & UDP ports for its entries. To correctly handle
ICMP requests and responses, we use the ICMP identifier in those
packets. All other ICMP messages are currently unsupported.

The error paths for accessing these fields, i.e. ports for UDP/TCP and
the identifier for ICMP, currently conflate two different errors:

- Unsupported IP payload: the packet is neither TCP, UDP, nor ICMP
- Unsupported ICMP type: the packet is not an ICMP request or response

This makes certain logs look worse than they are because we say
"Unsupported IP protocol: Icmpv6". To avoid this, we create a dedicated
error variant that calls out the unsupported ICMP type.

Fixes: #5594.
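
A minimal sketch of the distinction, with hypothetical names (connlib's
actual error types differ):

```rust
use std::fmt;

// Separate the two failure modes so logs can call out the real cause.
#[derive(Debug)]
enum ExtractIdError {
    /// The IP payload is neither TCP, UDP, nor ICMP.
    UnsupportedIpProtocol(u8),
    /// The payload is ICMP, but not an echo request/reply, so it
    /// carries no identifier we could use as a NAT table key.
    UnsupportedIcmpType(u8),
}

impl fmt::Display for ExtractIdError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::UnsupportedIpProtocol(proto) => write!(f, "Unsupported IP protocol: {proto}"),
            Self::UnsupportedIcmpType(ty) => write!(f, "Unsupported ICMP type: {ty}"),
        }
    }
}
```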
2024-06-28 01:13:25 +00:00
Thomas Eizinger
8cb3659636 chore(connlib): implement some missing ICMP conversions (#5475)
So far, our packet translation only implemented the bare minimum for
ICMP to work. There are a few things left that haven't been dealt with.
This PR adds additional conversions where doing so was easy.

There are still some left that require more elaborate mangling of the
packet, like updating pointer fields.
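
For illustration, a rough sketch of the kind of type mapping involved,
using well-known values from RFC 792 and RFC 4443 (not connlib's actual
code):

```rust
// Map an ICMPv4 message type to its ICMPv6 counterpart, where one
// exists. Codes, checksums and (for Parameter Problem) pointer fields
// need additional mangling that is elided here.
fn icmpv4_type_to_icmpv6(v4_type: u8) -> Option<u8> {
    match v4_type {
        8 => Some(128),  // Echo Request
        0 => Some(129),  // Echo Reply
        3 => Some(1),    // Destination Unreachable
        11 => Some(3),   // Time Exceeded
        12 => Some(4),   // Parameter Problem (requires pointer translation)
        _ => None,       // everything else has no direct equivalent
    }
}
```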
2024-06-24 23:48:14 +00:00
Gabi
2ea6a5d07e feat(gateway): NAT & mangling for DNS resources (#5354)
As part of #4994, the IP translation and mangling of packets to and from
DNS resources is moved to the gateway. This PR represents the
"gateway-half" of the required changes.

Eventually, the client will send a list of proxy IPs that it assigned
for a certain DNS resource. The gateway assigns each proxy IP to a real
IP and mangles outgoing and incoming traffic accordingly. There are a
number of things that we need to take care of as part of that:

- We need to implement NAT to correctly route traffic. Our NAT table
maps from source port* and destination IP to an assigned port* and real
IP. We say port* because that is only true for UDP and TCP. For ICMP, we
use the identifier.
- We need to translate between IPv4 and IPv6 in case a DNS resource
e.g. only resolves to IPv6 addresses but the client gave out an IPv4
proxy address to the application. This translation was added in #5364
and is now being used here.

This PR is backwards-compatible because current clients don't send
any IPs to the gateway. No proxy IPs means we cannot do any translation
and thus, packets are simply routed through as-is, which is what
current clients expect.
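
A simplified sketch of such a NAT table, with hypothetical names (the
real implementation differs):

```rust
use std::collections::HashMap;
use std::net::IpAddr;

// Key: (source port*, proxy/destination IP); value: (assigned port*,
// real IP). "port*" is the UDP/TCP port, or the ICMP identifier.
#[derive(Default)]
struct NatTable {
    outbound: HashMap<(u16, IpAddr), (u16, IpAddr)>,
    inbound: HashMap<(u16, IpAddr), (u16, IpAddr)>,
}

impl NatTable {
    fn translate_outgoing(
        &mut self,
        src_port: u16,
        proxy_ip: IpAddr,
        assigned_port: u16,
        real_ip: IpAddr,
    ) -> (u16, IpAddr) {
        let mapping = *self
            .outbound
            .entry((src_port, proxy_ip))
            .or_insert((assigned_port, real_ip));
        // Remember the reverse direction so replies can be un-mangled.
        self.inbound.insert(mapping, (src_port, proxy_ip));
        mapping
    }

    fn translate_incoming(&self, port: u16, real_ip: IpAddr) -> Option<(u16, IpAddr)> {
        self.inbound.get(&(port, real_ip)).copied()
    }
}
```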

---------

Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-06-19 01:15:27 +00:00
Gabi
8cc28499e9 chore(connlib): implement IP translation according to RFC6145 (#5364)
As part of #4994, we need to translate IP packets between IPv4 and IPv6.
This PR introduces the `ConvertiblePacket` abstraction that implements
this.
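
As a rough sketch of the shape of such an abstraction (hypothetical
signatures; the real `ConvertiblePacket` API differs):

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

// Translating between IP versions rebuilds the header (IPv4 headers
// are 20 bytes without options, IPv6 headers are 40 bytes), recomputes
// checksums and maps ICMP types. The caller supplies the source and
// destination addresses for the target version because they cannot be
// derived from the packet alone.
trait ConvertiblePacketSketch: Sized {
    fn consume_to_ipv4(self, src: Ipv4Addr, dst: Ipv4Addr) -> Option<Vec<u8>>;
    fn consume_to_ipv6(self, src: Ipv6Addr, dst: Ipv6Addr) -> Option<Vec<u8>>;
}
```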
2024-06-14 21:33:07 +00:00
Jamil
7e533c42f8 refactor: Split releases for Clients and Gateways (#5287)
- Removes version numbers from infra components (elixir/relay)
- Removes version bumping from Rust workspace members that don't get
published
- Splits release publishing into `gateway-`, `headless-client-`, and
`gui-client-`
- Removes auto-deploying new infrastructure when a release is published.
Use the Deploy Production workflow instead.

Fixes #4397
2024-06-10 16:47:49 +00:00
Thomas Eizinger
4117639cf4 fix(connlib): reply with SERVFAIL on DNS query errors (#5263)
Currently, we simply drop a DNS query if we can't fulfill it. Because
DNS is based on UDP, which is unreliable, a downstream system will
re-send a DNS query if it doesn't receive an answer within a certain
timeout window.

Instead of dropping queries, we now reply with `SERVFAIL`, indicating to
the client that we can't fulfill that DNS query. The intent is that this
will stop any kind of automated retry-loop and surface an error to the
user.

Related: #4800.
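
A minimal sketch of the idea on raw bytes (a real implementation would
use proper DNS types rather than manual bit-twiddling): a SERVFAIL
response can reuse the query, flip the QR bit and set RCODE = 2.

```rust
// Turn a raw DNS query into a SERVFAIL response. The DNS header is
// 12 bytes; QR is the top bit of byte 2 and RCODE is the low nibble
// of byte 3 (2 = SERVFAIL).
fn servfail_response(query: &[u8]) -> Option<Vec<u8>> {
    if query.len() < 12 {
        return None; // too short to be a DNS message
    }
    let mut response = query.to_vec();
    response[2] |= 0b1000_0000; // QR = 1: this is a response
    response[3] = (response[3] & 0b1111_0000) | 2; // RCODE = 2
    Some(response)
}
```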

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-06-08 02:12:46 +00:00
Thomas Eizinger
63d7c35717 test(connlib): send DNS queries to non-resources (#5168)
Currently, `tunnel_test` only sends DNS queries to a client's configured
DNS resources. However, connlib receives _all_ DNS requests made on a
system and forwards them to the originally set upstream resolvers in
case they are for non-resources.

To capture the code paths for forwarding these DNS queries, we introduce
a `global_dns_records` strategy that pre-fills the `ReferenceState`
with DNS records that are not DNS resources. Thus, when sampling a
domain to query, we might pick one that is not a DNS resource.

The expectation here is that this query still resolves (we assert that
we don't have any unanswered DNS queries). In addition, we introduce a
`Transition` to send an ICMP packet to such a resolved address. In a
real system, these packets wouldn't get routed to connlib but, if they
were, we still want to assert that connlib doesn't route them.

There is a special case where the chosen DNS server is actually a CIDR
resource. In that case, the DNS packet gets lost and we use it to
initiate a connection to the corresponding gateway. A repeated query to
such a DNS server then actually gets sent via the tunnel to the
gateway. As such, we need to generate a DNS response, similarly to how
we need to send an ICMP reply.

This allows us to add a few more useful assertions to the test: correct
mangling of the source and destination ports of UDP packets.
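
A minimal sketch of such a strategy with proptest (hypothetical names
and shapes; the real `ReferenceState` strategies differ):

```rust
use std::collections::HashMap;
use std::net::IpAddr;

use proptest::prelude::*;

// Pre-fill the reference state with DNS records that are *not*
// resources, so sampled queries sometimes exercise the "forward to
// the upstream resolver" path.
fn global_dns_records() -> impl Strategy<Value = HashMap<String, Vec<IpAddr>>> {
    proptest::collection::hash_map(
        "[a-z]{3,10}\\.example\\.com",                    // domain
        proptest::collection::vec(any::<IpAddr>(), 1..4), // resolved IPs
        1..5,                                             // number of records
    )
}
```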
2024-06-04 05:52:22 +00:00
Gabi
499edd2dd4 chore(connlib): fix echo request and reply packets (#5169)
When creating an echo request or reply packet using pnet, it operates
on the whole packet, since the identifier and sequence number are part
of the ICMP header, not the payload.

Those fields aren't accessible unless the packet is converted to an
echo request or reply packet, because the interpretation of that part
of the header depends on the specific type of packet.
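
For illustration, a small sketch using pnet's generated accessors
(assuming the `pnet_packet` API; error handling elided):

```rust
use pnet_packet::icmp::{echo_request::EchoRequestPacket, IcmpPacket, IcmpTypes};

// The generic `IcmpPacket` does not expose identifier/sequence; we
// have to reinterpret the same bytes as an `EchoRequestPacket` first.
fn echo_request_id_and_seq(icmp_bytes: &[u8]) -> Option<(u16, u16)> {
    let icmp = IcmpPacket::new(icmp_bytes)?;
    if icmp.get_icmp_type() != IcmpTypes::EchoRequest {
        return None;
    }
    let echo = EchoRequestPacket::new(icmp_bytes)?;
    Some((echo.get_identifier(), echo.get_sequence_number()))
}
```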
2024-05-31 11:15:46 +00:00
Thomas Eizinger
ce929e1204 test(connlib): resolve DNS resources in tunnel_test (#5083)
Currently, `tunnel_test` only sends ICMPs to CIDR resources. We also
want to test certain properties with regard to DNS resources. In
particular, we want to test:

- Given a DNS resource, can we query it for an IP?
- Can we send an ICMP packet to the resolved IP?
- Is the mapping of proxy IP to upstream IP stable?

To achieve this, we sample a list of `IpAddr` whenever we add a DNS
resource to the state. We also add the transition
`SendQueryToDnsResource`. As the name suggests, this one simulates a DNS
query coming from the system for one of our resources. We simulate A and
AAAA queries and take note of the addresses that connlib returns to us
for the queries.

Lastly, as part of `SendICMPPacketToResource`, we may now also sample
from the list of IPs that connlib gave us for a domain and send an ICMP
packet to one of them.

There is one caveat in this test that I'd like to point out: At the
moment, the exact mapping of proxy IP to real IP is an implementation
detail of connlib. As a result, I don't know which proxy IP I need to
use in order to ping a particular "real" IP. This presents an issue in
the assertions: Upon the first ICMP packet, I cannot assert what the
expected destination is. Instead, I need to "remember" it. In case we
send another ICMP packet to the same resource and happen to sample the
same proxy IP, we can then assert that the mapping did not change.
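
The "remember, then assert" pattern from the last paragraph, as a small
sketch with hypothetical types:

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::net::IpAddr;

// Record the proxy-IP-to-real-IP mapping on first observation and
// assert that it stays stable on every later one.
#[derive(Default)]
struct ProxyIpMappings {
    observed: HashMap<IpAddr, IpAddr>,
}

impl ProxyIpMappings {
    fn observe(&mut self, proxy_ip: IpAddr, real_ip: IpAddr) {
        match self.observed.entry(proxy_ip) {
            Entry::Vacant(v) => {
                v.insert(real_ip); // first packet: just remember
            }
            Entry::Occupied(o) => {
                assert_eq!(*o.get(), real_ip, "proxy IP mapping changed");
            }
        }
    }
}
```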
2024-05-31 04:44:30 +00:00