firezone

mirror of https://github.com/outbackdingo/firezone.git synced 2026-01-28 02:18:50 +00:00

Author	SHA1	Message	Date
Thomas Eizinger	3f4e004a48	fix(connlib): don't recreate DNS resource NAT for failed domains (#9064 ) Before a Client can send packets to a DNS resource, the Gateway must first set up a NAT table between the IPs assigned by the Client and the IPs the domain actually resolves to. This is what we call the DNS resource NAT. The communication for this process happens over IP through the tunnel, which is an unreliable transport. To ensure that this works reliably even in the presence of packet loss on the wire, the Client uses an idempotent algorithm where it tracks the state of the NAT for each domain that it has ever assigned IPs for (i.e. received an A or AAAA query from an application). This algorithm ensures that if we don't hear anything back from the Gateway within 2s, another packet for setting up the NAT is sent as soon as we receive _any_ DNS query. This design balances efficiency (we don't try forever) with reliability (we always check all of them). In case a domain does not resolve at all or there are resolution errors, the Gateway replies with `NatStatus::Inactive`. At present, the Client doesn't handle this in any particular way other than logging that it was not able to successfully set up the NAT. The combination of the above results in an undesirable behaviour: If an application queries a domain without A and AAAA records once, we will keep retrying forever to resolve it upon every other DNS query issued to the system. To fix this, we introduce `dns_resource_nat::State::Failed`. Entries in this state are ignored as part of the above algorithm and only recreated when explicitly told to do so, which we only do when we receive another DNS query for this domain. To handle the increased complexity around this system, we extract it into its own component and add a fleet of unit tests for its behaviour.	2025-05-09 15:04:21 +00:00
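The retry rule described in #9064 can be sketched as a small state machine. This is a minimal illustration, not connlib's actual component: only `State::Failed` and the 2s timeout come from the commit message, while `DnsResourceNat`, `Pending`, `Active` and all method names are hypothetical.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-domain state; only `Failed` is named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum State {
    /// A setup packet was sent at `sent_at`; no reply from the Gateway yet.
    Pending { sent_at: Instant },
    /// The Gateway confirmed the NAT.
    Active,
    /// The Gateway replied with `NatStatus::Inactive`.
    Failed,
}

/// Timeout after which an unanswered setup packet is resent (2s per the commit).
pub const RETRY_AFTER: Duration = Duration::from_secs(2);

#[derive(Default)]
pub struct DnsResourceNat {
    domains: HashMap<String, State>,
}

impl DnsResourceNat {
    /// An application queried `domain`: explicitly (re)start the handshake,
    /// which is the only way out of `Failed`.
    pub fn on_query_for(&mut self, domain: &str, now: Instant) {
        self.domains
            .insert(domain.to_owned(), State::Pending { sent_at: now });
    }

    /// Record the Gateway's answer for `domain`.
    pub fn on_gateway_reply(&mut self, domain: &str, ok: bool) {
        if let Some(state) = self.domains.get_mut(domain) {
            *state = if ok { State::Active } else { State::Failed };
        }
    }

    /// Called on _any_ DNS query: returns the domains whose setup packet
    /// must be resent. `Active` and `Failed` entries are skipped.
    pub fn domains_to_retry(&mut self, now: Instant) -> Vec<String> {
        let mut retry = Vec::new();
        for (domain, state) in self.domains.iter_mut() {
            let current = *state;
            if let State::Pending { sent_at } = current {
                if now.duration_since(sent_at) >= RETRY_AFTER {
                    *state = State::Pending { sent_at: now };
                    retry.push(domain.clone());
                }
            }
        }
        retry.sort();
        retry
    }
}
```

A domain in `Failed` never shows up in `domains_to_retry`; it only re-enters the algorithm through `on_query_for`.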
Thomas Eizinger	fa790b231a	fix(gateway): respond with SERVFAIL for missing nameserver (#9061 ) When we implemented #8350, we chose an error handling strategy that would shut down the Gateway in case we didn't have a nameserver selected for handling those SRV and TXT queries. At the time, this was deemed to be sufficiently rare to be an adequate strategy. We have since learned that this can indeed happen when the Gateway starts without network connectivity, which is quite common when using tools such as terraform to provision infrastructure. In #9060, we fix this by re-evaluating the fastest nameserver on a timer. This however doesn't change the error handling strategy when we don't have a working nameserver at all. It is practically impossible to have a working Gateway that is nonetheless unable to select a nameserver. We read them from `/etc/resolv.conf`, which is what `libc` also uses to resolve the domain we connect to for the WebSocket. A working WebSocket connection is required for us to establish connections to Clients, which in turn is a precursor to us receiving DNS queries from a Client. It causes unnecessary complexity to have a code path that can potentially terminate the Gateway, yet is practically unreachable. To fix this situation, we remove this code path and instead reply with a DNS SERVFAIL error. --------- Signed-off-by: Thomas Eizinger <thomas@eizinger.io> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-05-09 05:55:48 +00:00
Thomas Eizinger	ac339ff63b	fix(gateway): evaluate fastest nameserver every 60s (#9060 ) Currently, the Gateway reads all nameservers from `/etc/resolv.conf` on startup and evaluates the fastest one to use for SRV and TXT DNS queries that are forwarded by the Client. If the machine just booted and we do not have Internet connectivity just yet, this fails, which leaves the Gateway in a state where it cannot fulfill those queries. In order to ensure we always use the fastest one and to self-heal from such situations, we add a 60s timer that refreshes this state. Currently, this will not re-read the nameservers from `/etc/resolv.conf` but still use the same IPs read on startup.	2025-05-09 03:38:35 +00:00
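The timer-driven refresh from #9060 could look roughly like the sketch below. It assumes a hypothetical `probe` callback that returns a round-trip time (or `None` if the nameserver is unreachable); the `Nameservers` type and its methods are invented for illustration and the candidate list is fixed at construction, mirroring the note that `/etc/resolv.conf` is not re-read.

```rust
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// Refresh interval taken from the commit title.
pub const REFRESH_INTERVAL: Duration = Duration::from_secs(60);

pub struct Nameservers {
    /// IPs read once on startup; never re-read in this sketch.
    candidates: Vec<IpAddr>,
    fastest: Option<IpAddr>,
    last_evaluated: Option<Instant>,
}

impl Nameservers {
    pub fn new(candidates: Vec<IpAddr>) -> Self {
        Self {
            candidates,
            fastest: None,
            last_evaluated: None,
        }
    }

    pub fn fastest(&self) -> Option<IpAddr> {
        self.fastest
    }

    /// Re-evaluates the fastest nameserver if the interval has elapsed.
    /// `probe` is a hypothetical callback: RTT on success, `None` if unreachable.
    pub fn handle_timeout(&mut self, now: Instant, probe: impl Fn(IpAddr) -> Option<Duration>) {
        let due = match self.last_evaluated {
            None => true,
            Some(last) => now.duration_since(last) >= REFRESH_INTERVAL,
        };
        if !due {
            return;
        }

        self.fastest = self
            .candidates
            .iter()
            .copied()
            .filter_map(|ip| probe(ip).map(|rtt| (rtt, ip)))
            .min_by_key(|(rtt, _)| *rtt)
            .map(|(_, ip)| ip);
        self.last_evaluated = Some(now);
    }
}
```

If every probe fails at boot, `fastest()` stays `None` until the next refresh succeeds, which is the self-healing behaviour the commit describes.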
Thomas Eizinger	33d5c32f35	fix(gateway): truncate payload of ICMP errors (#9059 ) When the Gateway is handed an IP packet for a DNS resource that it cannot route, it sends back an ICMP unreachable error. According to RFC 792 [0] (for ICMPv4) and RFC 4443 [1] (for ICMPv6), parts of the original packet should be included in the ICMP error payload to allow the sending party to correlate what could not be sent. For ICMPv4, the RFC says: ``` Internet Header + 64 bits of Data Datagram The internet header plus the first 64 bits of the original datagram's data. This data is used by the host to match the message to the appropriate process. If a higher level protocol uses port numbers, they are assumed to be in the first 64 data bits of the original datagram's data. ``` For ICMPv6, the RFC says: ``` As much of invoking packet as possible without the ICMPv6 packet exceeding the minimum IPv6 MTU ``` [0]: https://datatracker.ietf.org/doc/html/rfc792 [1]: https://datatracker.ietf.org/doc/html/rfc4443#section-3.1	2025-05-09 01:38:31 +00:00
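The two truncation rules quoted in #9059 translate into simple length arithmetic. The functions below are a sketch with hypothetical names, operating on the raw bytes of the original packet; they are not the Gateway's actual code.

```rust
/// Minimum IPv6 MTU; an ICMPv6 error must not exceed it (RFC 4443, section 3.1).
const IPV6_MIN_MTU: usize = 1280;
/// Fixed IPv6 header of the ICMPv6 error packet itself.
const IPV6_HEADER_LEN: usize = 40;
/// ICMPv6 header (type, code, checksum, 4 unused bytes).
const ICMPV6_HEADER_LEN: usize = 8;

/// ICMPv4 (RFC 792): original internet header plus the first 64 bits
/// (8 bytes) of the original datagram's data.
pub fn icmpv4_error_payload(original: &[u8]) -> &[u8] {
    // The IHL field is the low nibble of the first byte, counted in 32-bit words.
    let header_len = original.first().map_or(0, |b| usize::from(b & 0x0f) * 4);
    let len = (header_len + 8).min(original.len());
    &original[..len]
}

/// ICMPv6 (RFC 4443): as much of the invoking packet as fits without the
/// error packet exceeding the minimum IPv6 MTU.
pub fn icmpv6_error_payload(original: &[u8]) -> &[u8] {
    let max = IPV6_MIN_MTU - IPV6_HEADER_LEN - ICMPV6_HEADER_LEN;
    &original[..original.len().min(max)]
}
```

For an IPv4 packet with a 20-byte header this keeps 28 bytes, enough to cover the port numbers of a TCP or UDP header.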
Thomas Eizinger	18ec6c6860	refactor(rust): move service implementation to GUI client (#9045 ) The module and crate structure around the GUI client and its background service are currently a mess of circular dependencies. Most of the service implementation actually sits in `firezone-headless-client` because the headless-client and the service share certain modules. We have recently moved most of these to `firezone-bin-shared`, which is the correct place for these modules. In order to move the background service to `firezone-gui-client`, we need to untangle a few more things in the GUI client. Those are done commit-by-commit in this PR. With that out of the way, we can finally move the service module to the GUI client, where it should actually live given that it has nothing to do with the headless client. As a result, the headless-client is - as one would expect - really just a thin wrapper around connlib itself and is reduced down to 4 files with this PR. To make things more consistent in the GUI client, we move the `main.rs` file also into `bin/`. By convention, `bin/` is where you define binaries if a crate has more than one. cargo will then build all of them. Eventually, we can optimise the compile-times for `firezone-gui-client` by splitting it into multiple crates: - Shared structs like IPC messages - Background service - GUI client This will be useful because it allows re-compiling the GUI client alone if nothing in `connlib` changes and vice versa. Resolves: #6913 Resolves: #5754	2025-05-08 13:22:09 +00:00
dependabot[bot]	bea57c02c4	build(deps): bump libc from 0.2.171 to 0.2.172 in /rust (#9031 ) Bumps [libc](https://github.com/rust-lang/libc) from 0.2.171 to 0.2.172. Release notes for 0.2.172 (2025-04-14), sourced from libc's changelog (https://github.com/rust-lang/libc/blob/0.2.172/CHANGELOG.md); PR numbers in this list refer to rust-lang/libc. Added: - Android: Add `getauxval` for 32-bit targets (#4338) - Android: Add `if_tun.h` ioctls (#4379) - Android: Define `SO_BINDTOIFINDEX` (#4391) - Cygwin: Add `posix_spawn_file_actions_add[f]chdir[_np]` (#4387) - Cygwin: Add new socket options (#4350) - Cygwin: Add statfs & fcntl (#4321) - FreeBSD: Add `filedesc` and `fdescenttbl` (#4327) - Glibc: Add unstable support for _FILE_OFFSET_BITS=64 (#4345) - Hermit: Add `AF_UNSPEC` (#4344) - Hermit: Add `AF_VSOCK` (#4344) - Illumos, NetBSD: Add `timerfd` APIs (#4333) - Linux: Add `_IO`, `_IOW`, `_IOR`, `_IOWR` to the exported API (#4325) - Linux: Add `tcp_info` to uClibc bindings (#4347) - Linux: Add further BPF program flags (#4356) - Linux: Add missing INPUT_PROP_XXX flags from `input-event-codes.h` (#4326) - Linux: Add missing TLS bindings (#4296) - Linux: Add more constants from `seccomp.h` (#4330) - Linux: Add more glibc `ptrace_sud_config` and related `PTRACE_ET_SYSCALL_USER_DISPATCH_CONFIG`. (#4386) - Linux: Add new netlink flags (#4288) - Linux: Define ioctl codes on more architectures (#4382) - Linux: Add missing `pthread_attr_setstack` (#4349) - Musl: Add missing `utmpx` API (#4332) - Musl: Enable `getrandom` on all platforms (#4346) - NuttX: Add more signal constants (#4353) - QNX: Add QNX 7.1-iosock and 8.0 to list of additional cfgs (#4169) - QNX: Add support for alternative Neutrino network stack `io-sock` (#4169) - Redox: Add more `sys/socket.h` and `sys/uio.h` definitions (#4388) - Solaris: Temporarily define `O_DIRECT` and `SIGINFO` (#4348) - Solarish: Add `secure_getenv` (#4342) - VxWorks: Add missing `d_type` member to `dirent` (#4352) - VxWorks: Add missing signal-related constants (#4352) - VxWorks: Add more error codes (#4337) Deprecated: - FreeBSD: Deprecate `TCP_PCAP_OUT` and `TCP_PCAP_IN` (#4381) Fixed: - Cygwin: Fix member types of `statfs` (#4324) - Cygwin: Fix tests (#4357) - Hermit: Make `AF_INET = 3` (#4344) - Musl: Fix the syscall table on RISC-V-32 (#4335) - Musl: Fix the value of `SA_ONSTACK` on RISC-V-32 (#4335) - VxWorks: Fix a typo in the `waitpid` parameter name (#4334) Commits: see the compare view at https://github.com/rust-lang/libc/compare/0.2.171...0.2.172 --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Thomas Eizinger <thomas@eizinger.io>	2025-05-06 01:26:26 +00:00
Thomas Eizinger	37529803ce	build(rust): bump otel ecosystem crates to 0.29 (#9029 )	2025-05-05 12:33:07 +00:00
Thomas Eizinger	2d802edf6a	fix(connlib): _always_ use one IP stack for relayed connections (#9018 ) At the moment, Firezone already attempts to prefer the same IP stack across relayed connections all the way through to the Gateway. This is achieved with a feature in str0m implemented in https://github.com/algesten/str0m/pull/640 where the `IceAgent` computes the local preference of an added candidate such that an IPv4 candidate allocated over an IPv4 network has a higher preference than an IPv6 candidate allocated over an IPv4 network. If a candidate gets accepted by the local agent, it is signaled to the remote via our control protocol. The remote peer then adds the candidate as a remote candidate and the ICE process starts by pairing them with local ones and testing connectivity. Currently, str0m's API consumes the candidate and only returns a `bool` indicating whether it should be signaled to the remote. This means the local preference computed as part of `add_local_candidate` is not reflected in the priority of the candidate sent to the remote. As a result, if the controlled agent (i.e. the Gateway) is behind symmetric NAT and therefore only has relay candidates available, the preference of IPv4 over IPv6 candidates on an IPv4 network is lost. This is what we are seeing in #8998. This changes with https://github.com/algesten/str0m/pull/650 being merged to `main`, which we are updating to with this PR. Now, `add_local_candidate` returns an `Option<&Candidate>` which has been modified with the local preference of the `IceAgent` around the preferred IP stack of relay candidates. As such, the priority calculated and signaled to the remote embeds this information and will be taken into account by the controlling agent (i.e. the Client) when nominating a pair. Resolves: #8998	2025-05-05 01:11:28 +00:00
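To see why the local preference has to be embedded in the signaled priority, it helps to look at the candidate priority formula from RFC 8445, section 5.1.2.1. The sketch below implements only that formula; the concrete local-preference values used for "same stack" versus "cross stack" relay candidates are assumptions for illustration and are not taken from str0m.

```rust
/// Candidate priority per RFC 8445, section 5.1.2.1:
/// priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID).
///
/// `type_pref` must be in 0..=126, `local_pref` in 0..=65535 and
/// `component_id` in 1..=256.
pub fn candidate_priority(type_pref: u32, local_pref: u32, component_id: u32) -> u32 {
    (type_pref << 24) + (local_pref << 8) + (256 - component_id)
}

/// Hypothetical local preferences: a relay candidate whose address family
/// matches the network it was allocated over ranks above one that does not.
pub const LOCAL_PREF_SAME_STACK: u32 = 65535;
pub const LOCAL_PREF_CROSS_STACK: u32 = 65534;

/// RFC 8445 recommends a type preference of 0 for relayed candidates.
pub const TYPE_PREF_RELAY: u32 = 0;
```

With relay candidates on both sides the type preference is identical, so the local preference is the only term that can distinguish an IPv4 from an IPv6 relay candidate. If the remote only ever sees a priority computed without it, that distinction is gone.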
Jamil	6e0e7343ba	chore: release Apple & Gateway with ECN fix (#9013 )	2025-05-02 00:16:40 -07:00
Thomas Eizinger	0aab954fa9	fix(connlib): never clear ECT from IP packets (#9009 ) ECN information is helpful to allow the congestion controllers to more easily fine-tune their send and receive windows. When a Firezone Client receives an IP packet where the ECN bits signal an ECN capable transport, we mirror this bit on the UDP datagram that carries the encrypted IP packet. When receiving a datagram with ECN bits set, the Gateway will then apply these bits to the decrypted IP packet and pass it along towards its destination. This implementation is unfortunately a bit too naive. Not all devices on the Internet support ECN and therefore, we may receive a datagram that has its ECN bits cleared when the ECN bits on the inner IP packet still signal an ECN capable transport. In this case, we should _not_ override the ECN bits and instead pass the IP packet along as is. Network devices along the path between Gateway and Resource may still use these ECN bits to signal congestion. We fix this by making the `with_ecn` function on `IpPacket` private. It is not meant to be used outside of the module. We supersede it with a `with_ecn_from_transport` function that implements the above logic. --------- Signed-off-by: Thomas Eizinger <thomas@eizinger.io> Co-authored-by: Jamil <jamilbk@users.noreply.github.com>	2025-05-02 05:28:19 +00:00
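A rough sketch of the decision behind `with_ecn_from_transport` follows. Only the first rule (a cleared outer codepoint must not overwrite the inner one) is stated in the commit message; the handling of the remaining combinations is an assumption made for illustration, and `Ecn` and `ecn_from_transport` are stand-in names rather than the real `IpPacket` API.

```rust
/// The four ECN codepoints of the IP header (RFC 3168).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ecn {
    NonEct = 0b00,
    Ect1 = 0b01,
    Ect0 = 0b10,
    Ce = 0b11,
}

/// Computes the codepoint for the decrypted inner packet, given the
/// codepoint observed on the UDP datagram that carried it.
pub fn ecn_from_transport(inner: Ecn, transport: Ecn) -> Ecn {
    match transport {
        // From the commit: a device on the path may have cleared the outer
        // bits, so never clear ECT on the inner packet because of it.
        Ecn::NonEct => inner,
        // Assumption: congestion seen on the tunnel path is propagated,
        // but only to packets that declared an ECN-capable transport.
        Ecn::Ce if inner != Ecn::NonEct => Ecn::Ce,
        // Assumption: in every other case the inner codepoint wins.
        _ => inner,
    }
}
```

Keeping the raw setter private and exposing only a function like this makes it hard to accidentally overwrite the inner bits from outside the module.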
Thomas Eizinger	0fd14b993b	chore(connlib): buffer most recent TCP SYN (#9004 ) When establishing connections that take longer than the TCP RTO, we may see duplicate TCP SYNs. Those have different timestamps from each other but are otherwise equal. To provide more accurate timing information to the TCP stack, we now keep the latest TCP SYN around instead of the very first one.	2025-05-02 01:06:14 +00:00
Thomas Eizinger	ea5709e8da	chore(rust): initialise OTEL with useful metadata (#8945 ) Once we start collecting metrics across various Clients and Gateways, these metrics need to be tagged with the correct `service.name`, `service.version` as well as an instance ID to differentiate metrics from different instances.	2025-05-01 05:19:07 +00:00
Thomas Eizinger	8dd794d8c8	chore(gateway): record metrics about dropped packets (#8942 ) When a NAT session expires or other unallowed traffic is routed to the Gateway, we drop these packets. It will be useful to learn how often that actually happens and why the packets got dropped. To do so, we add a counter metric for these packets. --------- Signed-off-by: Thomas Eizinger <thomas@eizinger.io> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-04-30 18:24:10 +00:00
Thomas Eizinger	6f11568c8c	fix(connlib): move `wire::dev::recv` log to right location (#8944 ) I don't understand why but in the current location, this log simply doesn't show up for anything other than UDP packets. If we move it up, it will actually log all packets.	2025-04-30 13:45:51 +00:00
Thomas Eizinger	e031dfdb4a	refactor(connlib): introduce our own `bufferpool` crate (#8928 ) We have been using buffer pools for a while all over `connlib` as a way to efficiently use heap-allocated memory. This PR harmonizes the usage of buffer pools across the codebase by introducing a dedicated `bufferpool` crate. This crate offers a convenient and easy-to-use API for all the things we (currently) need from buffer pools. As a nice bonus of having it all in one place, we can now also track metrics of how many buffers we have currently allocated. An example output from the local metrics exporter looks like this: ``` Name : system.buffer.count Description : The number of buffers allocated in the pool. Unit : {buffers} Type : Sum Sum DataPoints Monotonic : false Temporality : Cumulative DataPoint #0 StartTime : 2025-04-29 12:41:25.278436 EndTime : 2025-04-29 12:42:25.278088 Value : 96 Attributes : -> system.buffer.pool.name: udp-socket-v6 -> system.buffer.pool.buffer_size: 65535 DataPoint #1 StartTime : 2025-04-29 12:41:25.278436 EndTime : 2025-04-29 12:42:25.278088 Value : 7 Attributes : -> system.buffer.pool.buffer_size: 131600 -> system.buffer.pool.name: gso-queue DataPoint #2 StartTime : 2025-04-29 12:41:25.278436 EndTime : 2025-04-29 12:42:25.278088 Value : 128 Attributes : -> system.buffer.pool.name: udp-socket-v4 -> system.buffer.pool.buffer_size: 65535 DataPoint #3 StartTime : 2025-04-29 12:41:25.278436 EndTime : 2025-04-29 12:42:25.278088 Value : 8 Attributes : -> system.buffer.pool.buffer_size: 1336 -> system.buffer.pool.name: ip-packet DataPoint #4 StartTime : 2025-04-29 12:41:25.278436 EndTime : 2025-04-29 12:42:25.278088 Value : 9 Attributes : -> system.buffer.pool.buffer_size: 1336 -> system.buffer.pool.name: snownet ``` Resolves: #8385	2025-04-30 08:52:18 +00:00
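The idea of a pool that both recycles buffers and counts how many it has allocated can be sketched in a few lines. This is not the API of the `bufferpool` crate from #8928; `BufferPool`, `Buffer`, `pull` and `allocated` are hypothetical names, and the counter stands in for the `system.buffer.count` metric.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

struct Inner {
    /// Buffers that were returned and can be handed out again.
    free: Mutex<Vec<Vec<u8>>>,
    /// Total number of buffers ever allocated by this pool.
    allocated: AtomicUsize,
    buffer_size: usize,
}

#[derive(Clone)]
pub struct BufferPool {
    inner: Arc<Inner>,
}

/// A buffer that returns itself to its pool when dropped.
pub struct Buffer {
    data: Option<Vec<u8>>,
    pool: Arc<Inner>,
}

impl BufferPool {
    pub fn new(buffer_size: usize) -> Self {
        Self {
            inner: Arc::new(Inner {
                free: Mutex::new(Vec::new()),
                allocated: AtomicUsize::new(0),
                buffer_size,
            }),
        }
    }

    /// Reuses a free buffer if one exists, otherwise allocates a new one.
    pub fn pull(&self) -> Buffer {
        let reused = self.inner.free.lock().unwrap().pop();
        let data = reused.unwrap_or_else(|| {
            self.inner.allocated.fetch_add(1, Ordering::Relaxed);
            vec![0u8; self.inner.buffer_size]
        });
        Buffer {
            data: Some(data),
            pool: Arc::clone(&self.inner),
        }
    }

    /// The value a metrics exporter would report for this pool.
    pub fn allocated(&self) -> usize {
        self.inner.allocated.load(Ordering::Relaxed)
    }
}

impl Buffer {
    pub fn as_mut_slice(&mut self) -> &mut [u8] {
        self.data.as_mut().expect("buffer is present until drop")
    }
}

impl Drop for Buffer {
    fn drop(&mut self) {
        if let Some(data) = self.data.take() {
            self.pool.free.lock().unwrap().push(data);
        }
    }
}
```

Because every allocation goes through `pull`, a single counter per pool is enough to produce a per-pool data point like the ones in the exporter output above.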
Thomas Eizinger	f7df445924	fix(gateway): don't invalidate active NAT sessions (#8937 ) Whenever the Gateway is instructed to (re)create the NAT for a DNS resource, it performs a DNS query and then overwrites the existing entries in the NAT table. Depending on how the DNS records are defined, this may lead to a very bad user experience where connections are cut regularly. In particular, if a service utilises round-robin DNS where a DNS query only ever returns a single entry yet that entry may change as soon as the TTL expires, all connections for this particular DNS resource for a Client get cut. To fix this, we now first check for active NAT sessions for a given proxy IP and only replace it if we don't have an open NAT session. The NAT sessions have a TTL of 1 minute, meaning there needs to be at least 1 outgoing packet from the Client every minute to keep it open.	2025-04-30 06:58:58 +00:00
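The "only replace if no session is open" rule from #8937 can be illustrated with a small table keyed by proxy IP. The 1 minute TTL comes from the commit message; `NatTable` and its methods are hypothetical and track just one timestamp per proxy IP, which is simpler than a real per-flow NAT table.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// NAT session TTL (1 minute per the commit message).
pub const SESSION_TTL: Duration = Duration::from_secs(60);

#[derive(Default)]
pub struct NatTable {
    /// proxy IP -> (resolved IP, time of the last outgoing packet, if any)
    entries: HashMap<IpAddr, (IpAddr, Option<Instant>)>,
}

impl NatTable {
    /// Refreshes the session for `proxy` when the Client sends a packet.
    pub fn record_outgoing_packet(&mut self, proxy: IpAddr, now: Instant) {
        if let Some((_, last_packet)) = self.entries.get_mut(&proxy) {
            *last_packet = Some(now);
        }
    }

    /// Applies a fresh DNS result for `proxy`. Returns `false` and keeps the
    /// old mapping if a session on this proxy IP is still active.
    pub fn upsert(&mut self, proxy: IpAddr, resolved: IpAddr, now: Instant) -> bool {
        let active = matches!(
            self.entries.get(&proxy),
            Some((_, Some(last))) if now.duration_since(*last) < SESSION_TTL
        );
        if active {
            return false;
        }
        self.entries.insert(proxy, (resolved, None));
        true
    }

    pub fn resolve(&self, proxy: IpAddr) -> Option<IpAddr> {
        self.entries.get(&proxy).map(|(ip, _)| *ip)
    }
}
```

With round-robin DNS, a changed record is picked up only after the Client has been silent on that proxy IP for a full TTL, so established connections survive.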
Jamil	2650d81444	chore: release clients with GSO fix (#8936 )	2025-04-29 23:52:43 -07:00
Thomas Eizinger	c75b6c6641	feat(connlib): record the number of IO errors as a metric (#8934 ) It will be interesting to learn for example, how many installations have no IPv6 connectivity as those will encounter `NetworkUnreachable` errors. We categorise the errors by IO direction and IP stack which will allow us to deduce this information.	2025-04-30 05:24:55 +00:00
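Categorising IO errors by direction and IP stack, as #8934 describes, amounts to a counter with two attributes. The sketch below uses a plain `HashMap` as a stand-in for a real metrics counter; `IoErrorCounter`, `Direction` and `IpStack` are hypothetical names.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Direction {
    Inbound,
    Outbound,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum IpStack {
    V4,
    V6,
}

/// Stand-in for a counter metric with `direction` and `ip stack` attributes.
#[derive(Default)]
pub struct IoErrorCounter {
    counts: HashMap<(Direction, IpStack), u64>,
}

impl IoErrorCounter {
    /// Records one IO error; the IP stack is derived from the peer address.
    pub fn record(&mut self, direction: Direction, peer: SocketAddr) {
        let stack = if peer.is_ipv4() {
            IpStack::V4
        } else {
            IpStack::V6
        };
        *self.counts.entry((direction, stack)).or_insert(0) += 1;
    }

    pub fn get(&self, direction: Direction, stack: IpStack) -> u64 {
        self.counts.get(&(direction, stack)).copied().unwrap_or(0)
    }
}
```

An installation without IPv6 connectivity would then show up as a steadily growing `(Outbound, V6)` count while the IPv4 counts stay flat.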
Thomas Eizinger	6dc5f85cc5	fix(connlib): don't buffer when recreating DNS resource NAT (#8935 ) In order to detect changes to DNS records of DNS resources, `connlib` will recreate the DNS resource NAT whenever it receives a query for a DNS resource. The way we implemented this was by clearing the local state of the DNS resource NAT, which triggered us to perform the handshake with the Gateway again upon the next packet for this resource. The Gateway would then perform the DNS query and respond back when this was finished. In order to not drop any packets, `connlib` has a buffer where it keeps the packets that are arriving in the meantime. This works reasonably well when the connection is first set up because we are only buffering a TCP SYN or equivalent handshake packet. Yet, when the connection is in full use and the application just so happens to make another DNS query, we halt the entire flow of packets until this is confirmed again. To prevent high memory use, the buffer for these packets is constrained to 32 packets, which is nowhere near enough when a connection is actively transferring data (like a file upload). In most cases, the DNS query on the Gateway will yield the exact same results because the records haven't changed. Thus, there is no reason for us to actually halt the flow of these packets when we are _recreating_ the DNS resource NAT. That way, this handshake happens in parallel to the actual packet flow and does not interrupt anything in the happy path case.	2025-04-30 04:26:49 +00:00
Thomas Eizinger	d19d20da51	fix(connlib): send IO errors from UDP threads to event-loop (#8933 ) With #7590, we've moved all UDP IO operations to a separate thread. As a result, some of the error handling of IO errors within the Client's and Gateway's event-loop no longer applied as those are now captured within the respective thread. To fix this, we extend the type-signature of the receive-channel to also allow for errors and use that to send back errors from sending AND receiving UDP datagrams.	2025-04-30 02:00:41 +00:00
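Extending the receive channel's item type to carry errors, as #8933 describes, can be shown with the standard library alone. In this sketch the "UDP thread" just replays a list of prepared results instead of touching a socket; `Datagram` and `spawn_udp_thread` are hypothetical names.

```rust
use std::io;
use std::sync::mpsc;
use std::thread;

#[derive(Debug)]
pub struct Datagram {
    pub payload: Vec<u8>,
}

/// Spawns a stand-in for the UDP IO thread. Instead of reading from a socket,
/// it forwards the prepared `results`, errors included, to the event-loop.
pub fn spawn_udp_thread(
    results: Vec<io::Result<Datagram>>,
) -> mpsc::Receiver<io::Result<Datagram>> {
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        for result in results {
            // The channel item is a `Result`, so IO errors travel the same
            // path as successfully received datagrams.
            if tx.send(result).is_err() {
                break; // The event-loop dropped its receiver.
            }
        }
    });

    rx
}
```

On the receiving side, the event-loop matches on each item and can apply its existing error handling to the `Err` variant, regardless of which thread hit the error.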
Thomas Eizinger	2ba7a87899	feat(connlib): add FFI for changing log-level on MacOS (#8927 ) This isn't plugged into anything yet on the Swift side but lays the foundation for changing the log-level at runtime without having to sign the user out.	2025-04-29 13:51:46 +00:00
Thomas Eizinger	66b7ca6f7f	fix(connlib): ensure we don't mistake SYN-ACK for SYN (#8922 ) This shouldn't matter because we are only using the `UniquePacketBuffer` on the client and not on the Gateway where SYN-ACK packets would be sent from. To be fully correct though, we need to also compare the ACK flag of the two packets.	2025-04-29 04:17:18 +00:00
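The comparison described in #8922 can be illustrated with a minimal sketch. The `TcpFlags` type and `same_handshake_step` function below are hypothetical names chosen for illustration and are not taken from `connlib`; the sketch only shows why comparing the SYN flag alone is not enough to tell a SYN from a SYN-ACK.

```rust
/// Hypothetical, simplified view of the two TCP flags relevant here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TcpFlags {
    pub syn: bool,
    pub ack: bool,
}

/// Two packets count as the same handshake step only if BOTH flags match.
/// Checking `syn` alone would treat a SYN and a SYN-ACK as equal.
pub fn same_handshake_step(a: TcpFlags, b: TcpFlags) -> bool {
    a.syn == b.syn && a.ack == b.ack
}
```

With this check, a SYN (`syn: true, ack: false`) and a SYN-ACK (`syn: true, ack: true`) are no longer considered duplicates of each other.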
Thomas Eizinger	fde8d08423	fix(connlib): maintain packet order across GSO batches (#8920 ) Despite our efforts in #8912, the current implementation still does not do enough to maintain packet ordering across GSO batches. At present, we very aggressively batch packets of the same length together. This however is too eager when we consider packet flows such as the following: ``` 9:03:49.585143 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [.], seq 1:1229, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 1228 09:03:49.585151 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [P.], seq 1229:2063, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 834 09:03:49.585157 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [P.], seq 2063:3094, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 1031 09:03:49.585187 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [.], seq 3094:4322, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 1228 09:03:49.585188 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [P.], seq 4322:5156, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 834 09:03:49.585227 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [.], seq 5156:6384, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 1228 09:03:49.585228 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [P.], seq 6384:7612, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 1228 09:03:49.585230 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [P.], seq 7612:8249, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 637 09:03:49.585846 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [.], seq 8249:9477, ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 1228 09:03:49.585851 IP 10.128.15.241.3000 > 100.69.109.138.53474: Flags [P.], seq 9477:10705, 
ack 524, win 249, options [nop,nop,TS val 3862031964 ecr 1928356896], length 1228 ``` As we can see here, the remote sends us packet batches of varying lengths: - 1228, 834 - 1031 - 1228, 834 - 1228, 1228, 637 - 1228, 1228 1228 represents a "full" TCP packet so any packet following a full packet SHOULD be grouped together into a GSO batch. Currently, we are batching all the 1228 packets together and we ignore the fact that there were actually smaller-sized packets in between those that belong together. To mitigate this, we refactor the `GsoQueue` to remove the `segment_size` from the binning key of our map and instead only group batches by their source, destination and ECN information. Within such a connection, we then create an ordered list of batches. A new batch is started if the length differs or we have previously pushed a packet that isn't of the length of the batch, therefore signalling the end of the batch. The result here looks very promising (this is loading `blog.firezone.dev` via the `lynx` browser from within the headless-client docker container, so going through a Gateway running this PR): \|main\|this PR\| \|---\|---\| \|![Screenshot From 2025-04-29 10-32-00](https://github.com/user-attachments/assets/ba0535e4-1df9-4601-a2d7-ba099ba2313f)\|![image](https://github.com/user-attachments/assets/ab2ccec7-ce96-4305-8514-2e43d82ecc7d)\| Related: #8899	2025-04-29 00:50:23 +00:00
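The batching rule from #8920 can be sketched as follows. This is not the actual `GsoQueue` code: `batch_lengths` is a hypothetical helper that only looks at packet lengths of a single connection, and it assumes (as GSO permits) that a batch may end with exactly one packet shorter than its segment size, after which the batch is closed.

```rust
/// Groups packet lengths (in arrival order, for one connection) into
/// GSO-style batches. Hypothetical sketch, not connlib's implementation.
pub fn batch_lengths(lengths: &[usize]) -> Vec<Vec<usize>> {
    let mut batches: Vec<Vec<usize>> = Vec::new();
    // Whether the most recent batch may still accept packets.
    let mut open = false;

    for &len in lengths {
        // The segment size of a batch is the length of its first packet.
        let segment_size = batches.last().and_then(|b| b.first().copied());

        match segment_size {
            // Same length as the batch: keep appending.
            Some(size) if open && len == size => {
                if let Some(last) = batches.last_mut() {
                    last.push(len);
                }
            }
            // Shorter packet: it is the tail of this batch and closes it.
            Some(size) if open && len < size => {
                if let Some(last) = batches.last_mut() {
                    last.push(len);
                }
                open = false;
            }
            // No batch yet, batch closed, or a longer packet: start a new batch.
            _ => {
                batches.push(vec![len]);
                open = true;
            }
        }
    }

    batches
}
```

Feeding the lengths from the capture above (1228, 834, 1031, 1228, 834, 1228, 1228, 637, 1228, 1228) through this helper reproduces the five groups listed in the commit message, in their original order.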
Thomas Eizinger	e0faddf43f	chore(connlib): maintain order within a single GSO batch (#8912 ) Generic Segmentation Offload (GSO) is a clever way of reducing the number of syscalls made when you want to send a lot of packets with the same length to the same recipient. The way this works is that the packets are concatenated and passed to the kernel as a single packet together with the `segment_size` as an out-of-band argument. The component managing this batching in `connlib` is called `GsoQueue`. In #8772, we made the order in which these batches are sent to the kernel explicit by prioritising batches with smaller segments. What we overlooked with that strategy is that in a particular GSO batch, the last packet is actually allowed to be of a different length. For example, say the user is downloading an image of 4500 bytes. With our MTU of 1280, we have a payload size of 1252. This results in three fully-filled packets and one packet of 744 bytes. With the change in #8772, the small packet of 744 bytes will be transferred first, followed by the "train" of fully filled packets. To fix this, we flip the order here and transfer batches of larger sizes first. The original problem we attempted to mitigate in #8772 no longer exists now that we merged #7590. We will simply suspend now if the UDP socket isn't ready contrary to dropping the next batch. By flipping the order here, we guarantee that batches with a larger size are sent before batches with a smaller size. This should also imply that the encapsulated IP packets of e.g. an image arrive in the correct order (with the smallest packet last as it is part of a smaller batch). What we don't guarantee with this is that there won't be any other IP packets sent "in the middle" of such a batch. This shouldn't be a problem though as we are simply interleaving packets of different TCP / UDP connections with each other which already happens on the regular Internet anyway.	2025-04-28 06:53:43 +00:00
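The arithmetic in #8912 only works out for a transfer of 4500 bytes with a 1252-byte payload per packet, which the sketch below reproduces. `segment_lengths` is a hypothetical helper for illustration and is not part of `connlib`.

```rust
/// Splits `total` bytes into packets carrying at most `payload_size` bytes.
/// All packets are full except possibly the last one.
pub fn segment_lengths(total: usize, payload_size: usize) -> Vec<usize> {
    assert!(payload_size > 0, "payload size must be non-zero");

    // Fully-filled packets first ...
    let mut lengths = vec![payload_size; total / payload_size];

    // ... followed by one shorter packet for the remainder, if any.
    let remainder = total % payload_size;
    if remainder > 0 {
        lengths.push(remainder);
    }

    lengths
}
```

For `segment_lengths(4500, 1252)` this yields three packets of 1252 bytes followed by one of 744 bytes, which is why sending the smaller batch first puts the tail of the transfer ahead of its body.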
Thomas Eizinger	ac5e44d5d0	feat(connlib): request larger buffers for UDP sockets (#8731 ) Sufficiently large receive buffers are important to sustain high-throughput as latency increases. If the receive buffer in the kernel is too small, packets need to be dropped on arrival. Firefox uses 1MB in its QUIC stack [0]. `quic-go` recommends to set send and receive buffers to 7.5 MB [1]. Power users of Firezone are likely receiving a lot more traffic than the average Firefox user (especially with Internet Resource activated) so setting it to 10 MB seems reasonable. Sending packets is likely not as critical because we have back-pressure through our system such that we will stop reading IP packets when we cannot write to our UDP socket. The UDP socket is sitting in a separate thread and those threads are connected with dedicated queues which act as another buffer. However, as the data below shows, some systems have really small send buffers which are currently likely a speed bottleneck because we need to suspend writing so frequently. Assuming a 50ms latency, the bandwidth-delay product tells us that we can (in theory) saturate a 1.6 Gbps link with a 10MB receive buffer (assuming the OS also has large enough buffer sizes in its TCP or QUIC stack): ``` 80 Mb / 0.05s = 1600Mbps ``` Experiments and research [2] show the following: \|OS\|Receive buffer (default)\|Receive buffer (this PR)\|Send buffer (default)\|Send buffer (this PR)\| \|---\|---\|---\|---\|---\| \|Windows\|65KB\|10MB\|65KB\|1MB\| \|MacOS\|786KB\|8MB\|9KB\|1MB\| \|Linux\|212KB\|212KB\|212KB\|212KB\| With the exception of Linux, the OSes appear to be quite generous with how big they allow receive buffers to be. On Linux, these limits can be changed by setting the `net.core.rmem_max` and `net.core.wmem_max` parameters using `sysctl`. Most of our users are on Windows and MacOS, meaning they immediately benefit from this without having to change any system settings. 
Larger client-side UDP receive buffers are critical for any "download" scenario which is likely the majority of use cases that Firezone is used for. On Windows, increasing this receive buffer almost doubles the throughput in an iperf3 download test. [0]: https://github.com/mozilla/neqo/pull/2470 [1]: https://github.com/quic-go/quic-go/wiki/UDP-Buffer-Sizes [2]: https://unix.stackexchange.com/a/424381 --------- Signed-off-by: Thomas Eizinger <thomas@eizinger.io> Co-authored-by: Jamil <jamilbk@users.noreply.github.com>	2025-04-22 06:52:33 +00:00
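The bandwidth-delay calculation in #8731 can be written out as a small helper. `max_throughput_mbps` is a hypothetical function for illustration; it uses decimal units (1 MB = 1,000,000 bytes), matching the `80 Mb / 0.05s` figure in the commit message.

```rust
/// Upper bound on throughput (in Mbps) that a receive buffer of
/// `buffer_bytes` can sustain at a latency of `latency_ms`, per the
/// bandwidth-delay product: throughput = buffer size / latency.
pub fn max_throughput_mbps(buffer_bytes: u64, latency_ms: u64) -> u64 {
    assert!(latency_ms > 0, "latency must be non-zero");

    let buffer_bits = buffer_bytes * 8;
    // bits per millisecond equals kilobits per second.
    let kbps = buffer_bits / latency_ms;

    kbps / 1000
}
```

With a 10 MB buffer and 50 ms latency this gives 1600 Mbps, the figure quoted above. Plugging in the 212 KB Linux default from the table shows why that limit is worth raising for high-throughput links.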
Thomas Eizinger	d2e275be56	feat(connlib): classify UDP traffic by protocol (#8886 ) It creates a bit of duplication with code that we have in `snownet` but it is code that is unlikely to change because the protocols are already standardised. Contrary to recording the port, the cardinality of these protocols is fixed to a much smaller range which will allow us to safely record these metrics in an actual time-series database further down the line whilst still reasoning about how much traffic we are sending over TURN, as STUN or as WireGuard.	2025-04-22 01:35:38 +00:00
Jamil	5db8e20f3b	chore: release Apple and GUI clients (#8882 ) - Apple clients 1.4.12 - GUI clients 1.4.11	2025-04-21 21:45:16 +00:00
Jamil	368ace2c6e	ci: Release Android 1.4.7 (#8878 ) App is live on Play store.	2025-04-21 21:12:27 +00:00
Thomas Eizinger	4c5fd9b256	feat(connlib): prefer relay candidates of same IP version (#8798 ) When calculating preferences for candidates, `str0m` currently always prefers IPv6 over IPv4. This is as per the ICE spec. However, this can lead to sub-optimal situations when a connection ends up using a TURN server. TURN allows a client to allocate an IPv4 and an IPv6 address in the same allocation. This makes it possible for e.g. an IPv4-only client to connect to an IPv6-only peer as long as the TURN server runs in dual-stack AND the client requests an IPv6 address in addition to an IPv4 address with the `ADDITIONAL-ADDRESS-FAMILY` attribute. Assume that a client sits behind symmetric NAT and therefore needs to rely on a TURN server to communicate with its peers. The TURN server as well as all the peers operate in dual-stack mode. The current priority calculation will yield a communication path that uses IPv4 to talk to the TURN server (as that is the only one available) but due to the preference ordering of IPv6 over IPv4, will use an IPv6 path to the peer, despite the peer also supporting IPv4. This isn't a problem per se but makes our life unnecessarily difficult. Our TURN servers use eBPF to efficiently deal with TURN's channel-data messages. This however is at present only implemented for the IPv4 <> IPv4 and IPv6 <> IPv6 path. Implementing the other paths is possible but complicates the eBPF code because we need to also translate IP headers between versions and not just update the source and destination IPs. We have since patched `str0m` to extend the `Candidate::relayed` constructor to also take a `base` address which is - similar to the other candidate types - the address the client is sending from in order to use this candidate. In the context of relayed candidates, this is the address the client is using to talk to the TURN server. 
We can use this information in the candidate's priority calculation to prefer candidates that allow traffic to remain within one IP version, i.e. if the client talks to the TURN server over IPv4, the candidate with an allocated IPv4 address will have a higher priority than the one with the IPv6 address because we are applying a "punishment" factor as part of the local-preference component in the priority formula. Staying within the same IP version whilst relaying traffic allows our TURN servers to use their eBPF kernel which results in a better UX due to lower latency and higher throughput. The final candidate ordering is ultimately decided by the controlling ICE agent which in our case is the Firezone Client. Thus, we don't necessarily need to update Gateways in order to test / benefit from this. Building a Client with this patch included should be enough to benefit from this change. Related: https://github.com/algesten/str0m/pull/640 Related: https://github.com/algesten/str0m/pull/644	2025-04-20 22:41:56 +00:00
Thomas Eizinger	92534ae4ec	test(connlib): ensure we have gaps in port ranges (#8862 ) This should fix the flaky proptest.	2025-04-20 11:02:55 +00:00
Thomas Eizinger	ae4b7d9d08	test(connlib): correctly assert in `Io` unit-test (#8861 ) Not sure what I was smoking when I wrote this test but the current assertion makes no sense for what we actually want to test. As the test name says, we want to assert that if we are given an `Instant` in the past, we do in fact return a more recent one and therefore what is returned in `Input::Timeout` is at least as recent as `now`.	2025-04-19 22:17:51 +00:00
Jamil	5669c83835	ci: Bump Apple clients to 1.4.11 (#8848 ) Includes a fix for auto-starting on launch when other VPN clients have been connected previously.	2025-04-19 11:45:42 +00:00
Jamil	a2e32a4918	ci: Bump apple to 1.4.10 to ship PKG (#8797 ) This publishes the 1.4.10 permalinks for the PKG download.	2025-04-17 15:13:44 +00:00
Jamil	aab691a67f	ci: Release Apple clients 1.4.9 (#8793 ) These contain the recent UDP thread enhancements.	2025-04-15 20:14:43 +00:00
Jamil	743f5fdfeb	ci: bump clients/gateway to ship write improvements (#8792 ) Signed-off-by: Jamil <jamilbk@users.noreply.github.com> Co-authored-by: Thomas Eizinger <thomas@eizinger.io>	2025-04-15 06:21:23 +00:00
Thomas Eizinger	901207b274	chore(rust): remove stale error context (#8787 ) Minor oversight from #8783. We accidentally retained this `.context` even though there are now multiple error paths from the `Eventloop`, not just portal connection errors.	2025-04-14 20:35:11 -07:00
Thomas Eizinger	7f5a81cc5a	chore(rust-ffi): log non-authentication errors on error (#8785 ) In the FFI layer, it is tricky to decide what we should do with errors. On the one hand, logging and returning errors is an anti-pattern because it may lead to duplicate logs. In this particular case however, it is useful to log the error on the Rust side because it allows our Sentry integration to capture and include the DEBUG logs prior to this one which may add crucial context.	2025-04-15 02:28:09 +00:00
Thomas Eizinger	7c2163ddf4	fix(connlib): fail event-loops if UDP threads stop (#8783 ) The UDP socket threads added in #7590 are designed to never exit. UDP sockets are stateless and therefore any error condition on them should be isolated to sending / receiving a particular datagram. It is however possible that code panics which will shut down the threads irrecoverably. In this unlikely event, `connlib`'s event-loop would keep spinning and spam the log with "UDP socket stopped". There is no good way on how we can recover from such a situation automatically, so we just quit `connlib` in that case and shut everything down. To model this new error path, we refactor the `DisconnectError` to be internally backed by `anyhow`.	2025-04-15 02:27:37 +00:00
Thomas Eizinger	b3746b330f	refactor(connlib): spawn dedicated threads for UDP sockets (#7590 ) Correctly implementing asynchronous IO is notoriously hard. In order to not drop packets in the process, one has to ensure a given socket is ready to accept packets, buffer them if that is not the case, suspend everything else until the socket is ready and then continue. Until now, we did all of this because running the UDP sockets on the same thread as the actual packet processing was the only option. That in turn was motivated by wanting to pass around references of the received packets for processing. Rust's borrow-checker does not allow passing references between threads, which forced us to have the sockets on the same thread as the packet processing. Like we already did in other places in `connlib`, this can be solved through the use of buffer pools. Using a buffer pool, we can use heap allocations to store the received packets without having to make a new allocation every time we read new packets. Instead, we can have a dedicated thread that is connected to `connlib`'s packet processing thread via two channels (one for inbound and one for outbound packets). These channels are bounded, which ensures backpressure is maintained in case one of the two threads lags behind. These bounds also mean that we have at most N buffers from the buffer pool in-flight (where N is the capacity of the channel). Within those dedicated threads, we can then use `async/await` notation to suspend the entire task when a socket isn't ready for sending. Resolves: #8000	2025-04-14 06:18:06 +00:00
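The two-channel design from #7590 can be illustrated with the standard library's bounded `sync_channel`. This is a simplified sketch and not `connlib`'s implementation: `spawn_io_thread` is a hypothetical name, plain `Vec<u8>` stands in for pooled buffers, and the "socket" is replaced by a loop that echoes outbound datagrams back as inbound ones.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::thread;

/// Spawns a dedicated IO thread connected to the caller via two bounded
/// channels: one for outbound and one for inbound datagrams.
pub fn spawn_io_thread(capacity: usize) -> (SyncSender<Vec<u8>>, Receiver<Vec<u8>>) {
    // Bounded channels: `send` blocks once `capacity` items are queued,
    // which is what provides backpressure between the two threads.
    let (outbound_tx, outbound_rx) = sync_channel::<Vec<u8>>(capacity);
    let (inbound_tx, inbound_rx) = sync_channel::<Vec<u8>>(capacity);

    thread::spawn(move || {
        // Stand-in for a UDP socket: echo each outbound datagram back.
        for datagram in outbound_rx {
            if inbound_tx.send(datagram).is_err() {
                // The processing side hung up; stop the IO thread.
                break;
            }
        }
    });

    (outbound_tx, inbound_rx)
}
```

Because each channel holds at most `capacity` items, the number of buffers in flight is bounded in the same way the commit message describes for the buffer pool.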
Thomas Eizinger	859aa3cee0	feat(connlib): add context to event-loop errors (#8773 ) This should make it easier to diagnose any error returned from the event-loop.	2025-04-14 00:07:27 +00:00
Thomas Eizinger	19d954c76c	fix(connlib): prioritise GSO batches with smaller segments (#8772 ) In order to implement GSO in `connlib`, we opted for an approach where packets of the same length are being appended to a buffer. Each of these buffers is then sent to the kernel in a single syscall, which drastically decreases the per-packet overhead of syscalls and therefore improves performance. Within `connlib` itself, we prioritise control-protocol associated packets over tunnel traffic. The idea here is that even under high-load, we want to ensure that STUN probes between the peers and to the relays are sent in a timely manner. Failing to send these probes results in a false-positive detection of a lost connection because `connlib`'s internal state uses timeouts to detect such situations. Despite processing the packets itself in a timely manner, it is still possible that they get delayed depending on which order they get flushed to the socket. This order is currently non-deterministic because `GsoQueue` uses a `HashMap` internally and when accessing the batched-together datagrams, we just access it via `iter_mut`. To fix this, we use a `BTreeMap` instead and explicitly define the `Key` to start with the `segment_size` field. As a result, entries within the `BTreeMap` will be sorted ascending by `segment_size` (i.e. the size of individual packets within the batch). Packets of smaller size are more likely to be control messages like STUN binding requests or TURN messages to the relays for managing allocations. By sorting the map explicitly, we ensure that if the UDP socket is ready to send, we flush out these messages first before moving on to bigger packets such as the ones containing (more likely) WireGuard data messages.	2025-04-14 00:04:39 +00:00
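The ordering trick from #8772 relies on how derived `Ord` compares struct fields in declaration order. The sketch below is an illustration only: the `Key` fields other than `segment_size` and the `flush_order` helper are assumptions and may not match the real `GsoQueue`.

```rust
use std::collections::BTreeMap;
use std::net::SocketAddr;

/// Hypothetical map key. Derived `Ord` compares fields top to bottom,
/// so placing `segment_size` first makes it the primary sort key.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Key {
    pub segment_size: usize,
    pub dst: SocketAddr,
}

/// Returns the segment sizes in the order the batches would be flushed.
/// A `BTreeMap` iterates in ascending key order, unlike a `HashMap`.
pub fn flush_order(queue: &BTreeMap<Key, Vec<u8>>) -> Vec<usize> {
    queue.keys().map(|key| key.segment_size).collect()
}
```

Inserting a 1228-byte batch before a 28-byte one still yields the 28-byte batch first on iteration, which is the deterministic "small packets first" behaviour this commit describes.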
Thomas Eizinger	439da65180	chore(connlib): log all tunnel errors on WARN (#8764 ) Currently, errors encountered as part of operating the tunnel are non-fatal and only logged on `TRACE` in order to not flood the logs. Recent improvements around how the event loop operates made it such that we actually emit a lot less errors and ideally there should be 0. Therefore we can now employ a much more strict policy and log all errors here on `WARN` in order to get Sentry alerts.	2025-04-13 01:35:37 +00:00
Thomas Eizinger	289bd35e4c	feat(connlib): add packet counter metrics (#8752 ) This PR adds opentelemetry-based packet counter metrics to `connlib`. By default, the collection of these metrics is disabled. Without a registered metrics-provider, gathering these metrics is effectively a no-op. It will still incur 1 or 2 function calls per packet but that should be negligible compared to other operations such as encryption / decryption. With this system in place, we can in the future add more metrics to make debugging easier.	2025-04-12 08:35:26 +00:00
Thomas Eizinger	25267b18c8	feat(connlib): flush UDP and TUN concurrently (#8737 ) Upon each tick of the event loop `connlib` first attempts to flush pending UDP packets to the socket, followed by packets queued for sending out on the TUN device. In case the UDP socket is busy, we suspend the event loop until we can send more packets there. This isn't quite as efficient as we can be. Whilst waiting for the UDP socket, we can still write packets to the TUN device. With this patch, we attempt to do both. In case either of them couldn't quite finish their work, we still return `Poll::Pending` to signal the event loop to suspend, preventing us from accepting more work than we can handle.	2025-04-10 04:56:59 +00:00
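The control flow described in #8737 can be sketched with `std::task::Poll`. `flush_all` and its closure parameters are hypothetical stand-ins for the UDP and TUN flush routines. The point of the sketch is only that the TUN flush is attempted even when the UDP flush is still pending.

```rust
use std::task::Poll;

/// Attempts both flushes on every call and reports `Ready` only if both
/// completed. Hypothetical sketch; the closures stand in for real IO.
pub fn flush_all(
    mut flush_udp: impl FnMut() -> Poll<()>,
    mut flush_tun: impl FnMut() -> Poll<()>,
) -> Poll<()> {
    let udp = flush_udp();
    // Not short-circuited: the TUN device is flushed even if UDP is busy.
    let tun = flush_tun();

    if udp.is_ready() && tun.is_ready() {
        Poll::Ready(())
    } else {
        // Signals the event loop to suspend instead of accepting more work.
        Poll::Pending
    }
}
```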
Thomas Eizinger	6d6db3346d	test(connlib): increase grace period for unit test (#8738 ) This test appears to be sometimes flaky in CI, likely due to noisy neighbours.	2025-04-10 03:51:01 +00:00
Thomas Eizinger	6eab29a770	feat(connlib): supply multiple buffers to UDP socket (#8733 ) At present, `connlib` uses `quinn-udp`'s GRO functionality to read multiple UDP packets within a single syscall. We are however only passing a single buffer and a single `RecvMeta` to the `recv` function. As a result, the function is limited to giving us only packets that originate from one particular IP. By supplying multiple buffers (and their corresponding `RecvMeta`s), we can now read packets from up to 10 different IPs at once within a single syscall. To obtain multiple buffers, we need to split the provided buffer into equal chunks. To ensure that each buffer can still hold several packets, we increase the buffer size to 1MB. It is expected that this increases throughput, especially on Gateways which receive UDP packets from many different IPs. --------- Signed-off-by: Thomas Eizinger <thomas@eizinger.io> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-04-09 05:17:46 +00:00
Thomas Eizinger	dce5ab9178	build(deps): bump Rust to 1.86 (#8636 )	2025-04-03 21:14:08 +00:00
Thomas Eizinger	bac5cfa4cb	fix(connlib): set idle timer to be longer than ICE timeout (#8612 ) Our idle connection detection works based on incoming and outgoing packets, whichever one happened later. If we have not received or sent packets for longer than `MAX_IDLE`, we transition into idle mode where we configure our ICE agent to only send binding requests every 60 seconds. Our ICE timeout in non-idle mode is just north of 10 seconds (the formula is a bit tricky so don't have the accurate number). This can cause a problem whenever a Gateway disappears. We leave the idle mode as soon as we send a packet through the Gateway. Thus, what we intended to happen is that, as long as you keep trying to connect to the Gateway, we will leave the idle mode, increase our rate of STUN bindings through the ICE agent and detect within ~10s that the Gateway is gone. What actually happens is that, IF whatever resource you are trying to talk to is a DNS resource (which is very likely) and the application starts off with a DNS query, then we will reset the local DNS resource NAT state and ping the Gateway to set up the NAT again (we do this to ensure we don't have stale DNS entries on the Gateway). This message is only sent once and all other packets are buffered. Thus, the connection will go back to idle before the newly sent STUN binding requests can determine that the connection is actually broken. Resolves: #8551	2025-04-02 07:03:35 +00:00
Thomas Eizinger	3c7ac084c0	feat(relay): MVP for routing channel data message in eBPF kernel (#8496 ) ## Abstract This pull-request implements the first stage of off-loading routing of TURN data channel messages to the kernel via an eBPF XDP program. In particular, the eBPF kernel implemented here only handles the decapsulation of IPv4 data channel messages into their embedded UDP payload. Implementation of other data paths, such as the receiving of UDP traffic on an allocation and wrapping it in a TURN channel data message is deferred to a later point for reasons explained further down. As it stands, this PR implements the bare minimum for us to start experimenting and benefiting from eBPF. It is already massive as it is due to the infrastructure required for actually doing this. Let's dive into it! ## A refresher on TURN channel-data messages TURN specifies a channel-data message for relaying data between two peers. A channel data message has a fixed 4-byte header: - The first two bytes specify the channel number - The second two bytes specify the length of the encapsulated payload Like all TURN traffic, channel data messages run over UDP by default, meaning this header sits at the very front of the UDP payload. This will be important later. After making an allocation with a TURN server (i.e. reserving a port on the TURN server's interfaces), a TURN client can bind channels on that allocation. As such, channel numbers are scoped to a client's allocation. Channel numbers are allocated by the client within a given range (0x4000 - 0x4FFF). When binding a channel, the client specifies the remote's peer address that they'd like the data sent on the channel to be sent to. Given this setup, when a TURN server receives a channel data message, it first looks at the sender's IP + port to infer the allocation (a client can only ever have 1 allocation at a time). 
Within that allocation, the server then looks for the channel number and retrieves the target socket address from that. The allocation itself is a port on the relay's interface. With that, we can now "unpack" the payload of the channel data message and rewrite it to the new receiver: - The new source IP can be set from the old dst IP (when operating in user-space mode this is irrelevant because we are working with the socket API). - The new source port is the client's allocation. - The new destination IP is retrieved from the mapping retrieved via the channel number. - The new destination port is retrieved from the mapping retrieved via the channel number. Last but not least, all that is left is removing the channel data header from the UDP payload and we can send out the packet. In other words, we need to cut off the first 4 bytes of the UDP payload. ## User-space relaying At present, we implement the above flow in user-space. This is tricky to do because we need to bind _many_ sockets, one for each possible allocation port (of which there can be 16383). The actual work to be done on these packets is also extremely minimal. All we do is cut off (or add on) the data-channel header. Benchmarks show that we spend pretty much all of our time copying data between user-space and kernel-space. Cutting this out should give us a massive increase in performance. ## Implementing an eBPF XDP TURN router eBPF has been shown to be a very efficient way of speeding up a TURN server [0]. After many failed experiments (e.g. using TC instead of XDP) and countless rabbit-holes, we have also arrived at the design documented within the paper. Most notably: - The eBPF program is entirely optional. We try to load it on startup, but if that fails, we will simply use the user-space mode. - Retaining the user-space mode is also important because under certain circumstances, the eBPF kernel needs to pass on the packet, for example, when receiving IPv4 packets with options. 
Those make the header dynamically-sized which makes further processing difficult because the eBPF verifier disallows indexing into the packet with data derived from the packet itself. - In order to add/remove the channel-data header, we shift the packet headers backwards / forwards and leave the payload in place as the packet headers are constant in size and can thus easily and cheaply be copied out. In order to perform the relaying flow explained above, we introduce maps that are shared with user-space. These maps go from a tuple of (client-socket, channel-number) to a tuple of (allocation-port, peer-socket) and thus give us all the data necessary to rewrite the packet. ## Integration with our relay Last but not least, to actually integrate the eBPF kernel with our relay, we need to extend the `Server` with two more events so we can learn, when channel bindings are created and when they expire. Using these events, we can then update the eBPF maps accordingly and therefore influence the routing behaviour in the kernel. ## Scope What is implemented here is only one of several possible data paths. Implementing the others isn't conceptually difficult but it does increase the scope. Landing something that already works allows us to gain experience running it in staging (and possibly production). Additionally, I've hit some issues with the eBPF verifier when adding more codepaths to the kernel. I expect those to be possible to resolve given sufficient debugging but I'd like to do so after merging this. --- Depends-On: #8506 Depends-On: #8507 Depends-On: #8500 Resolves: #8501 [0]: https://dl.acm.org/doi/pdf/10.1145/3609021.3609296	2025-03-27 10:59:40 +00:00
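The channel-data handling described in #8496 can be sketched in user-space Rust. This is not the eBPF program from the PR: `parse_channel_data`, `route`, and the `HashMap`-based `ChannelMap` are hypothetical stand-ins that mirror the 4-byte header layout and the (client-socket, channel-number) to (allocation-port, peer-socket) mapping explained above. Header fields are read in network byte order (big-endian).

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

/// (client socket, channel number) -> (allocation port, peer socket).
pub type ChannelMap = HashMap<(SocketAddr, u16), (u16, SocketAddr)>;

/// Splits a UDP payload into channel number and encapsulated payload.
/// Returns `None` if the header is truncated, the channel number is outside
/// the range quoted above (0x4000 - 0x4FFF) or the length field exceeds the data.
pub fn parse_channel_data(udp_payload: &[u8]) -> Option<(u16, &[u8])> {
    let header = udp_payload.get(..4)?;
    let channel = u16::from_be_bytes([header[0], header[1]]);
    let length = usize::from(u16::from_be_bytes([header[2], header[3]]));

    if !(0x4000..=0x4FFF).contains(&channel) {
        return None;
    }

    // "Cut off" the first 4 bytes and keep exactly `length` payload bytes.
    let payload = udp_payload.get(4..4 + length)?;

    Some((channel, payload))
}

/// Looks up where a channel-data message from `client` must be sent.
/// Returns the source port (the allocation), the destination and the payload.
pub fn route<'a>(
    map: &ChannelMap,
    client: SocketAddr,
    udp_payload: &'a [u8],
) -> Option<(u16, SocketAddr, &'a [u8])> {
    let (channel, payload) = parse_channel_data(udp_payload)?;
    let (allocation_port, peer) = map.get(&(client, channel)).copied()?;

    Some((allocation_port, peer, payload))
}
```

Messages that fail either step return `None`, which corresponds to the case where the packet is passed on to the regular user-space path.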
Thomas Eizinger	19c5bc530a	feat(gateway): deprecate the NAT64 module (#8383 ) At present, the Gateway implements a NAT64 conversion that can convert IPv4 packets to IPv6 and vice versa. Doing this efficiently creates a fair amount of complexity within our `ip-packet` crate. In addition, routing ICMP errors back through our NAT is also complicated by this because we may have to translate the packet embedded in the ICMP error as well. The NAT64 module was originally conceived as a result of the new stub resolver-based DNS architecture. When the Client resolves IPs for a domain, it doesn't know whether the domain will actually resolve to IPv4 AND IPv6 addresses so it simply assigns 4 of each to every domain. Thus, when receiving an IPv6 packet for such a DNS resource, the Gateway may only have IPv4 addresses available and can therefore not route the packet (unless it translates it). This problem is not novel. In fact, an IP being unroutable or a particular route disappearing happens all the time on the Internet. ICMP was conceived to handle this problem and it is doing a pretty good job at it. We can make use of that and simply return an ICMP unreachable error back to the client whenever it picks an IP that we cannot map to one that we resolved. In this PR, we leave all of the NAT64 code intact and only add a feature-flag that - when active - sends aforementioned ICMP error. While offline (and thus also for our tests), the feature-flag evaluates to false. It is however set to `true` in the backend, meaning on staging and later in production, we will send these ICMP errors. Once this is rolled out and indeed proving to be working as intended, we can simplify our codebase and rip out the NAT64 module. At that point, we will also have to adapt the test-suite.	2025-03-27 01:01:37 +00:00

1083 Commits