The actual size of the send and receive buffers is OS-dependent. To aid
debugging with customer-submitted logs, we now print the size of the
send and receive buffers of each UDP socket.
As the final step in removing `pnet_packet`, we need to introduce `-Mut`
equivalent slices for UDP, TCP and ICMP packets. As a starting point,
introducing `UdpHeaderSliceMut` and `TcpHeaderSliceMut` is fairly
trivial. The ICMP variants are a bit trickier because those are
different for IPv4 and IPv6. Additionally, ICMP for IPv4 is quite
complex because it can have a variable header length. Moreover, for
both variants, the values in bytes 5-8 are semantically different
depending on the ICMP code.
This requires us to design an API that balances ergonomics and
correctness. Technically, an ICMP identifier and sequence can only be
set if the ICMP code is "echo request" or "echo reply". However, adding
an additional parsing step to guarantee this in the type system is quite
verbose.
The trade-off implemented in this PR allows us to directly write to
bytes 5-8 using the `set_identifier` and `set_sequence` functions. To
catch errors early, these functions have debug-assertions built in that
ensure that the packet is indeed an ICMP echo packet.
Resolves: #6366.
Currently, we have two structs for representing IP packets: `IpPacket`
and `MutableIpPacket`. As the name suggests, they mostly differ in
mutability. This design was originally inspired by the `pnet_packet`
crate which we based our `IpPacket` on. With subsequent iterations, we
added more and more functionality onto our `IpPacket`, like NAT64 &
NAT46 translation. As a result of that, the `MutableIpPacket` is no
longer directly based on `pnet_packet` but instead just keeps an
internal buffer.
This duplication can be resolved by merging the two structs into a
single `IpPacket`. We do this by first replacing all usages of
`IpPacket` with `MutableIpPacket`, deleting `IpPacket` and renaming
`MutableIpPacket` to `IpPacket`. The final design now has different
`self`-receivers: Some functions take `&self`, some `&mut self` and some
consume the packet using `self`.
This results in a more ergonomic usage of `IpPacket` across the codebase
and deletes a fair bit of code. It also takes us one step closer towards
using `etherparse` for all our IP packet interaction needs. Lastly, I am
currently exploring a performance-optimisation idea that stack-allocates
all IP packets and for that, the current split between `IpPacket` and
`MutableIpPacket` does not really work.
Related: #6366.
Why:
* `trust.firezone.dev` is actually being hosted by `trust.oneleet.com`
which means Oneleet needs to issue the cert for `trust.firezone.dev` and
can't use the Google CA used for the rest of `firezone.dev`.
Closes #6661
The lifetime of the returned packet is actually already `'static`,
meaning we don't need to call `to_owned`.
Related: #6366.
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: firezone <firezone@firezones-MacBook-Air.local>
Self-hosted users often forget to deploy relays. Without relays,
`snownet` cannot establish any connections because we never figure out
our server-reflexive address and local host address. Even if relays are
configured, if STUN / TURN is blocked, we may end up with no relays.
In that case, any newly created connection will very likely fail unless
new TURN servers are added within the 10s timeout that we have when
waiting for candidates. To make it easier to detect these situations, we
log a warning if we see that a new connection is being created without
any active relays.
One may argue that we should just disallow the connection altogether,
i.e. return a `Result`. Yet, this situation happens so rarely that
having to handle this `Result` further up the stack is quite the
ergonomic hit.
As measured by running `perf` on our tests, a big part of why they are
slow is that we are calling `handle_timeout` basically on every
iteration of the `advance` loop in the test. As in production,
there is no need to do that. Instead, we only call `handle_timeout` of a
particular component (client, gateway or relay) if they indicate that
they have something they are waiting for (as defined by `poll_timeout`).
Simply doing that makes the tests fail for certain scenarios where we
handle IP packets that aren't meant for the tunnel (such as DNS queries
or STUN messages for the relay / ICE agent). To fix this, we call
`handle_timeout` whenever `encapsulate` returns `None`. This is fairly
common across sans-IO systems: When a function that usually 1-to-1
transforms a packet instead handles it internally, it must have changed
internal state. To make code-organisation easier, `handle_timeout` is
treated as the "work-horse" of a sans-IO system: It is where all the
code goes that needs to perform processing upon multiple conditions.
Making this change drops the runtime of `tunnel_test` from ~33s to ~12s
on my machine (tests compiled with opt-level 2).
Documents profiling instructions that I've figured out over the last
couple of days. Since Rust 1.79, the standard library is compiled with
frame pointers enabled [0]. Grabbing stack-trace information from the
frame pointer makes profiling much easier because the data is just there
in-line. Using debug information (via `dwarf`) is also possible but
requires post-processing of the performance profile with `addr2line`
(`perf script` does that automatically). This can take multiple minutes
or longer, depending on the sampling frequency of the captured
performance data. This makes benchmarking almost infeasible because the
feedback loop is simply too long. Using frame pointers is a much nicer
experience.
The downside is that the applications themselves also need to be
compiled with frame pointers. We achieve that by setting the appropriate
compiler option in `.cargo/config.toml`. Ubuntu [1], Fedora [2] and Arch
[3] also ship all of their code with frame pointers enabled. Also, tech
giants such as Google & Meta have been running their systems with frame
pointers on-by-default for years [4].
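The setting we add can be sketched as follows; `-C force-frame-pointers=yes` is the standard rustc flag, though the exact contents of our `.cargo/config.toml` may include more.

```toml
# .cargo/config.toml
[build]
rustflags = ["-C", "force-frame-pointers=yes"]
```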
[0]:
https://blog.rust-lang.org/2024/06/13/Rust-1.79.0.html#frame-pointers-enabled-in-standard-library-builds
[1]:
https://www.brendangregg.com/blog/2024-03-17/the-return-of-the-frame-pointers.html
[2]: https://pagure.io/fesco/issue/2923
[3]: https://gitlab.archlinux.org/archlinux/rfcs/-/merge_requests/26
[4]:
https://www.brendangregg.com/blog/2024-03-17/the-return-of-the-frame-pointers.html
Creating a new span is fairly expensive when it happens as part of a hot
function. Decapsulating packets in `snownet` is such a hot function.

Previously, we created a new span for every packet that we decrypted,
which accounted for ~3% of spent CPU time. We can optimise this and
remove some duplication by creating the span early and simply entering
it every time we want it to be active. This results in `boringtun`'s
`decapsulate` being the most expensive function that happens in
`decapsulate`.

On a system with only a single IP stack (either V4 or V6), we will only
have a single socket. When the system gets busy, the `send` function is
extremely hot for obvious reasons. With only a single socket active, we
allocated a lot of strings and errors here that ended up not being used
at all. This accounts for about 1% of CPU time spent during a speedtest.
When CIDR resources get added or removed, we need to update the routing
table on the clients to redirect traffic for these resources to the TUN
device. Currently, this is done in a separate event from the remaining
`TunConfig` tracked in `connlib`. Having this in a separate event makes
it hard to diff whether anything meaningful changed about the TUN
device. Additionally, changes to these routes are currently not tested
in `tunnel_test`.
Not having this code tested already caused bugs previously, such as
#6387.
To fix these things, we:
- Add the IPv4 and IPv6 routes to the `TunConfig` tracked in `connlib`
- Track the expected routes in `RefClient`
- Assert that we don't emit `TunConfigUpdated` events without any actual
changes
Fixes: #6423.
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.209 to
1.0.210.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/serde-rs/serde/releases">serde's
releases</a>.</em></p>
<blockquote>
<h2>v1.0.210</h2>
<ul>
<li>Support serializing and deserializing <code>IpAddr</code> and
<code>SocketAddr</code> in no-std mode on Rust 1.77+ (<a
href="https://redirect.github.com/serde-rs/serde/issues/2816">#2816</a>,
thanks <a
href="https://github.com/MathiasKoch"><code>@MathiasKoch</code></a>)</li>
<li>Make <code>serde::ser::StdError</code> and
<code>serde::de::StdError</code> equivalent to
<code>core::error::Error</code> on Rust 1.81+ (<a
href="https://redirect.github.com/serde-rs/serde/issues/2818">#2818</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="89c4b02bf3"><code>89c4b02</code></a>
Release 1.0.210</li>
<li><a
href="eeb8e44cda"><code>eeb8e44</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2818">#2818</a>
from dtolnay/coreerror</li>
<li><a
href="785c2d9605"><code>785c2d9</code></a>
Stabilize no-std StdError trait</li>
<li><a
href="d549f048e1"><code>d549f04</code></a>
Reformat parse_ip_impl definition and calls</li>
<li><a
href="4c0dd63011"><code>4c0dd63</code></a>
Delete attr support from core::net deserialization macros</li>
<li><a
href="26fb134165"><code>26fb134</code></a>
Relocate cfg attrs out of parse_ip_impl and parse_socket_impl</li>
<li><a
href="07e614b52b"><code>07e614b</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2817">#2817</a>
from dtolnay/corenet</li>
<li><a
href="b1f899fbe8"><code>b1f899f</code></a>
Delete doc(cfg) attribute from impls that are supported in no-std</li>
<li><a
href="b4f860e627"><code>b4f860e</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2816">#2816</a>
from MathiasKoch/chore/core-net</li>
<li><a
href="d940fe1b49"><code>d940fe1</code></a>
Reuse existing Buf wrapper as replacement for std::io::Write</li>
<li>Additional commits viewable in <a
href="https://github.com/serde-rs/serde/compare/v1.0.209...v1.0.210">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
When the addressDescription looks like a URI, we highlight it with
markdown and set its click handler to open the URL when clicked. We were
also making it bold, which seems to break macOS's menubar width
calculation, causing the field to truncate (typically after `https://`)
instead of expanding the container to contain it.
By removing the bold formatting, the container is properly sized, fixing
the display issue.
# Before
<img width="553" alt="Screenshot 2024-09-07 at 10 44 37 AM"
src="https://github.com/user-attachments/assets/4596bf54-918d-4a59-81d6-a18e436da5ad">
# After
<img width="569" alt="Screenshot 2024-09-07 at 10 45 38 AM"
src="https://github.com/user-attachments/assets/0400731f-e189-4416-a670-d5c3b314d71b">
On Android, when we receive the `onLost` callback for the VPN, we
disconnect the tunnel, since that's the only way to detect that the user
has disconnected the VPN from outside Firezone; otherwise, we would kill
the network connection when that happens.
But `onLost` was also called when we call `establish` on the tunnel to
recreate it and there's no obvious way to distinguish between both
cases.
So we previously just stopped monitoring `onLost` while we were
executing `establish`. The problem with this approach was that
`onLost` might sometimes arrive delayed, after `establish` has already
been called.
Therefore, to fix this, we changed the `onLost` callback to delay the
execution of `session.disconnect` by 2 seconds. If within this grace
period we get a `linkPropertiesChanged` for our interface, which is
always called after the network is re-created, we cancel the execution;
otherwise, we go ahead and disconnect the session.
This PR reverts the commit that moves our IPv6 address to a separate
subdomain (deploying that would cause prod downtime) and simply removes
the check that causes redirect loops.
This moves about 2/3rds of the code from `firezone-gui-client` to
`firezone-gui-client-common`.
I tested it on aarch64 Windows and cycled through sign-in and sign-out
and closing and re-opening the GUI process while the IPC service stays
running. IPC and updates each get their own MPSC channel in this, so I
wanted to be sure it didn't break.
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
In #6032, we attempted to fix routing loops for Windows and did so
successfully for UDP packets. For TCP sockets, we believed that binding
the socket to an interface is enough to prevent routing loops. This
assumption is wrong [0]:
> On Windows, a call to bind() affects card selection only incoming
traffic, not outgoing traffic.
>
> Thus, on a client running in a multi-homed system (i.e., more than one
interface card), it's the network stack that selects the card to use,
and it makes its selection based solely on the destination IP, which in
turn is based on the routing table. A call to bind() will not affect the
choice of the card in any way.
On most of our testing machines, this problem didn't surface but it
turns out that on some machines, especially those with WiFi cards, there
is a
conflict between the routes added on the system. In particular, with the
Internet resource active, we also add a catch-all route that we _want_
to have the highest priority, i.e. Windows SHOULD send all traffic to our
TUN device. Except for traffic that we generate, like TCP connections to
the portal or UDP packets sent to gateways, relays or DNS servers.
It appears that on some systems, mostly with Ethernet adapters, Windows
picks the "correct" interface for our socket and sends traffic via that
but on other systems, it doesn't. TCP sockets are only used for the
WebSocket connection to the portal. Without that one, Firezone
completely stops working because we can't send any control messages.
To reliably fix this issue, we need to add a dedicated route for the
target IP of each TCP socket that is more specific than the Internet
resource route (`0.0.0.0/0`) but otherwise identical. We do this as part
of creating a new TCP socket. This route is for the _default_ interface
and thus doesn't get automatically removed when Firezone exits.
We implement a RAII guard that attempts to drop the route on a
best-effort basis. Despite this RAII guard, this route can linger around
in case Firezone is being forced to exit or exits in otherwise unclean
ways. To avoid lingering routes, we always delete all routing table
entries matching the IP of the portal just before we are about to add
one.
Fixes: #6591.
[0]:
https://forums.codeguru.com/showthread.php?487139-Socket-binding-with-routing-table&s=a31637836c1bf7f0bc71c1955e47bdf9&p=1891235#post1891235
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Foo Bar <foo@bar.com>
Co-authored-by: conectado <gabrielalejandro7@gmail.com>
This PR introduces the `etherparse` dependency for parsing and
generating IP packets.
Using `etherparse`, we can implement the NAT46 & NAT64 implementations
for the gateway more elegantly because it allows us to parse the IP and
protocol headers into a static and much richer representation. The
conversion to the IPv4/IPv6 equivalent is then just a question of
transforming one data structure into another and writing it to the
correct place in the buffer.
We extract this functionality into dedicated `nat64` and `nat46`
modules.
Furthermore, we implement the various functions in `ip_packet::make`
using `etherparse` too. Following that, we also overhaul the NAT
translation tests that we have in `ip_packet::proptests`. Those now use
the more low-level `consume_to_ipX` APIs which makes the tests more
ergonomic to write.
In the future, we should upstream `Ipv4HeaderSliceMut` and
`Ipv6HeaderSliceMut` to `etherparse`.
Moving all of this functionality to `etherparse` will make it easier to
write tests that involve more IP packets as well as customise the
behaviour of our NAT.
Related: #5614.
Related: #6371.
Related: #6353.
Closes #6576
This recreates the callback channel on every connect / disconnect cycle,
to prevent this sequence:
1. Start connlib
2. Fail in `make_tun`
3. Spend several seconds doing platform-specific things
4. Stop connlib (since `make_tun` failed)
5. Come back to the main loop to find a bunch of queued-up callbacks
even though connlib is supposed to be stopped.
Instead we get:
5\. Come back to the main loop and we've dropped the callback receiver,
so any callbacks that connlib sent while we were busy are either dropped
or not even sent.