firezone

mirror of https://github.com/outbackdingo/firezone.git synced 2026-01-27 10:18:54 +00:00

Author	SHA1	Message	Date
Thomas Eizinger	d5be185ae4	chore(rust): remove telemetry spans and events (#9634 ) Originally, we introduced these to gather some data from logs / warnings that we considered to be too spammy. We've since merged a burst-protection that will at most submit the same event once every 5 minutes. The data from the telemetry spans themselves has not been used at all.	2025-06-25 17:15:57 +00:00
Thomas Eizinger	4be73da21c	fix(gateway): reply with cookie when rate limit is hit (#9657 ) WireGuard implements a rate-limit mechanism that kicks in when the number of handshake initiations exceeds a certain limit. This is important because handshakes involve asymmetric cryptography and are computationally expensive. To prevent DoS attacks where other peers repeatedly ask for new handshakes, the rate limiter implements a cookie mechanism where - when under load - the remote peer needs to include a given cookie in new handshakes. This cookie is tied to the peer's IP address to prevent it from being reused by other peers. Up until now, we have not been passing the sender's IP address to `boringtun` and therefore, the only option when the rate limit was hit was to error with `UnderLoad`. By passing the source IP of the packet, `boringtun` can engage in the cookie-reply mechanism and therefore avoid the `UnderLoad` error. Resolves: #9643	2025-06-24 11:33:38 +00:00
Thomas Eizinger	91edd11a47	feat(gateway): send `$identify` event with account-slug (#9658 ) When we receive the `account_slug` from the portal, the Gateway now sends a `$identify` event to PostHog. This will allow us to target Gateways with feature-flags based on the account they are connected to.	2025-06-24 11:31:56 +00:00
Thomas Eizinger	3c0e866e77	feat(connlib): listen on 52625 by default (#9593 ) Presently, `connlib` always just lets the OS pick a random port for our UDP socket. This works well in many cases but has the downside that IF network admins would like to aid in the process of establishing direct connections, they cannot open a specific port because it is always random. It doesn't cost us anything to try and bind to a particular port (here 52625) and fall back to a random one if something is listening there. The port 52625 was chosen because: - It is within the ephemeral port range and will therefore never be registered to anything else. - It is a palindrome and therefore easy to remember. - When typing FIRE on a phone keypad, you get the numbers 3473. 52625 is the port at the offset 3473 from the start of the ephemeral port range. In order for this port to be useful in establishing direct connections, we generate optimistic candidates based on existing remote candidates by combining the IP of all server-reflexive candidates with the port of all host candidates. This patch deliberately does not publicly announce this feature in the docs or the changelog so we can first gather experience with it in our own test environment. Resolves: #9559	2025-06-24 08:41:08 +00:00
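The bind-then-fall-back behaviour and the optimistic candidates described in #9593 can be sketched roughly as follows. This is a minimal illustration using only the standard library, not `connlib`'s actual code: the function names are hypothetical, and the start of the ephemeral range is assumed to be 49152 (the IANA dynamic range).

```rust
use std::io;
use std::net::{Ipv4Addr, SocketAddr, UdpSocket};

/// Preferred port named in the commit message.
const PREFERRED_PORT: u16 = 52625;

/// Assumed start of the ephemeral (dynamic) port range.
const EPHEMERAL_START: u16 = 49152;

/// Tries to bind to `preferred` first. If that fails (for example because
/// something else is listening there), asks the OS for a random port.
fn bind_with_fallback(preferred: u16) -> io::Result<UdpSocket> {
    let ip = Ipv4Addr::UNSPECIFIED;
    match UdpSocket::bind(SocketAddr::from((ip, preferred))) {
        Ok(socket) => Ok(socket),
        Err(_) => UdpSocket::bind(SocketAddr::from((ip, 0))),
    }
}

/// Offset of `port` from the start of the ephemeral range, if it lies within it.
fn ephemeral_offset(port: u16) -> Option<u16> {
    port.checked_sub(EPHEMERAL_START)
}

/// Combines the IP of every server-reflexive candidate with the port of
/// every host candidate, skipping duplicates.
fn optimistic_candidates(server_reflexive: &[SocketAddr], host: &[SocketAddr]) -> Vec<SocketAddr> {
    let mut out = Vec::new();
    for srflx in server_reflexive {
        for h in host {
            let candidate = SocketAddr::new(srflx.ip(), h.port());
            if !out.contains(&candidate) {
                out.push(candidate);
            }
        }
    }
    out
}
```

A second call to `bind_with_fallback` in the same process would be expected to land on a different, OS-chosen port, which is the situation the fallback is meant to cover.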
Thomas Eizinger	a91dda139f	feat(connlib): only conditionally hash firezone ID (#9633 ) A bit of legacy that we have inherited around our Firezone ID is that the ID stored on the user's device is sha'd before being passed to the portal as the "external ID". This makes it difficult to correlate IDs in Sentry and PostHog with the data we have in the portal. For Sentry and PostHog, we submit the raw UUID stored on the user's device. As a first step in overcoming this, we embed an "external ID" in those services as well IF the provided Firezone ID is a valid UUID. This will allow us to immediately correlate those events. As a second step, we automatically generate all new Firezone IDs for the Windows and Linux Client as `hex(sha256(uuid))`. These won't parse as valid UUIDs and therefore will be submitted as is to the portal. As a third step, we update all documentation around generating Firezone IDs to use `uuidgen | sha256` instead of just `uuidgen`. This is effectively the equivalent of (2) but for the Headless Client and Gateway where the Firezone ID can be configured via environment variables. Resolves: #9382 --------- Signed-off-by: Thomas Eizinger <thomas@eizinger.io> Co-authored-by: Jamil <jamilbk@users.noreply.github.com>	2025-06-24 07:05:48 +00:00
Thomas Eizinger	686918f1d1	chore(rust): bump str0m (#9591 ) The latest `main` of str0m undoes a breaking change in the constructor of `Candidate::relayed` by flipping the parameters back. This will make it easier to upgrade to the latest release once it is out.	2025-06-24 06:57:55 +00:00
Thomas Eizinger	1bd3d2a382	chore(gateway): remove NAT64/46 module (#9626 ) This has been disabled for several releases now and is not causing any problems in production. We can therefore safely remove it. It is about time we do this because our tests are actually still testing the variant without the feature flag and therefore deviate from what we do in production. We therefore have to convert the tests as well. Doing so uncovered a minor problem in our ICMP error parsing code: We attempted to parse the payload of an ICMP error as a fully-valid layer 4 header (e.g. TCP header or UDP header). However, per the RFC a node only needs to embed the first 8 bytes of the original packet's payload in an ICMPv4 error. That is not enough to parse a valid TCP header as those are at least 20 bytes. I don't expect this to be a huge problem in production right now though. We only use this code to parse ICMP errors arriving on the Gateway and I _think_ most devices actually include more than 8 bytes. This only surfaced because we are very strict with only embedding exactly 8 bytes when we generate an ICMP error. Additionally, we change our ICMP errors to be sent from the resource IP rather than the Gateway's TUN device. Given that we perform NAT on these IPs anyway, I think this can still be argued to be RFC-conformant. The _proxy_ IP which we are trying to contact can be reached but it cannot be routed further. Therefore the destination is unreachable, yet the source of this error is the proxy IP itself. I think this is actually more correct than sending the packets from the Gateway's TUN device because the TUN device itself is not a routing hop per se: its IP won't ever show up in the routing path.	2025-06-24 06:48:30 +00:00
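To illustrate the parsing problem from #9626: the source and destination ports sit in the first four bytes of both a TCP and a UDP header, so they can be recovered from an 8-byte prefix even though a full TCP header cannot. The sketch below is a hypothetical helper, not the Gateway's actual parser.

```rust
/// Extracts (source port, destination port) from the leading bytes of a
/// layer 4 header embedded in an ICMP error.
///
/// Only the first four bytes are needed, so this works even when the sender
/// embedded no more than 8 bytes of the original layer 4 data.
fn parse_ports(l4_prefix: &[u8]) -> Option<(u16, u16)> {
    // `get(..4)` returns `None` instead of panicking on short input.
    let bytes = l4_prefix.get(..4)?;
    let src = u16::from_be_bytes([bytes[0], bytes[1]]);
    let dst = u16::from_be_bytes([bytes[2], bytes[3]]);
    Some((src, dst))
}
```

Anything beyond the ports (TCP flags, options, checksums) would need more bytes than an ICMPv4 error is guaranteed to carry, so a tolerant parser stops here.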
Thomas Eizinger	950afd9b2d	chore(gateway): set account-slug in telemetry context (#9545 ) This PR adds an optional field `account_slug` to the Gateway's init message. If populated, we will use this field to set the account-slug in the telemetry context. This will allow us to know which customers a particular Sentry issue is related to.	2025-06-23 18:52:39 +00:00
Thomas Eizinger	c8a4a20818	feat(snownet): increase ICE timeout (#9569 ) Some of our users are facing issues on what looks to be very unreliable network connections. At present, we consider a connection dead if we don't receive a response within 9.25 seconds. Cutting a connection and re-establishing it _should_ not be a problem in general and TCP connections happening through Firezone should resume gracefully. Further work on whether that is actually the case is due in #9531. Until then, we increase the ICE timeout to ~15s. Related: #9526	2025-06-18 22:16:32 +00:00
Thomas Eizinger	650cf893ba	feat(snownet): decrease idle connection ICE timeout (#9570 ) Any well-behaved NAT should keep the port mappings of an established UDP connection open for 120s, even without seeing any traffic. Not all NATs in the wild are well-behaved though and a discarded port mapping causes connectivity loss for customers. To combat these situations, we decrease the timer for STUN probes on idle connections from 60s to 25s. Related: #9526	2025-06-18 16:53:26 +00:00
Thomas Eizinger	d3ff59ab84	chore(rust): bump str0m (#9564 ) The recent changes to str0m include a bug fix for network constellations where both peers are behind symmetric NAT and therefore need a relay-relay candidate pair to succeed. In the current version, such candidate pairs would erroneously be rejected as redundant with host candidates. Fixes: #9514	2025-06-17 22:04:13 +00:00
Thomas Eizinger	f3dcd06115	chore(snownet): document current ICE timeouts with tests (#9558 ) This ensures we always know what the ICE timeouts of the agent are. With the backoff implemented in the agent, it is not trivial to compute this from the input parameters.	2025-06-17 21:38:08 +00:00
Jamil	805ba085c2	fix(connlib): re-add resource if ip_stack changes (#9372 ) In #9300, we added logic to control whether we emit A and/or AAAA records for a DNS resource based on the `ip_stack` property of the `Resource` struct. Unfortunately this didn't take updates into account when the client was signed in, so updating a DNS resource's ip_stack failed to update the client's local Resource copy. To fix this, we determine if `resource_addressability_changed` which is true if the resource's address, or ip_stack, has changed meaningfully. If so, we remove the resource prior to evaluating the remaining logic of the `resource_created_or_updated` handler, which in turn causes the resource to be re-added, effectively updating its ip_stack. Related: https://github.com/firezone/firezone/pull/9300#issuecomment-2932365798	2025-06-03 03:00:19 +00:00
Thomas Eizinger	218c711789	fix(connlib): don't hard-fail if buffer increase is rejected (#9366 ) When `connlib` creates new UDP sockets for the p2p traffic, it tries to increase the send and receive buffers for improved performance. Failure to do so currently results in `connlib` failing to start entirely. This is unnecessarily harsh; we can simply log a warning instead and move on.	2025-06-02 15:20:58 +00:00
Thomas Eizinger	29f8dd8688	fix(connlib): block until UDP thread has been set up (#9363 ) Internally, `connlib` spawns a new thread for handling IO on the UDP socket. In order to make sure that this thread is operational, we intended to block `connlib`'s main thread until the setup of the UDP thread has successfully completed. Unfortunately, this isn't quite the case because we already send an `Ok(())` value into the channel once we've successfully bound the socket. Following the binding, we also try to increase the maximum buffer size of the socket. Even though the intention here was to also log this error, the error value sent into the channel there is never read because we only ever read one value from the `error_tx` channel. To fix this, we move the sending of the `Ok(())` value to the very bottom of the UDP thread's setup, just before we kick it off. Whilst this does not fix the actual issue as to why the setup of the UDP thread fails, these changes will at least surface the error.	2025-06-02 12:37:38 +00:00
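The pattern described in #9363 (report success only after every setup step has completed, and read exactly one result on the spawning side) might look like the following sketch. The helper names and the `set_nonblocking` call, which stands in for any further fallible setup step, are assumptions for illustration and not `connlib`'s code.

```rust
use std::io;
use std::net::UdpSocket;
use std::sync::mpsc;
use std::thread;

/// Spawns the IO thread and blocks until it reports the outcome of its setup.
fn spawn_udp_thread(addr: String) -> io::Result<thread::JoinHandle<()>> {
    let (setup_tx, setup_rx) = mpsc::sync_channel::<io::Result<()>>(1);

    let handle = thread::spawn(move || {
        let socket = match setup_socket(&addr) {
            Ok(socket) => socket,
            Err(e) => {
                // The single value read by the spawner is the error.
                let _ = setup_tx.send(Err(e));
                return;
            }
        };

        // Success is only sent once *all* setup steps are done.
        let _ = setup_tx.send(Ok(()));
        run_io_loop(socket);
    });

    match setup_rx.recv() {
        Ok(Ok(())) => Ok(handle),
        Ok(Err(e)) => Err(e),
        Err(_) => Err(io::Error::new(
            io::ErrorKind::Other,
            "UDP thread exited during setup",
        )),
    }
}

/// All fallible setup lives here so a failure in any step is reported.
fn setup_socket(addr: &str) -> io::Result<UdpSocket> {
    let socket = UdpSocket::bind(addr)?;
    socket.set_nonblocking(true)?; // stand-in for further setup steps
    Ok(socket)
}

/// Placeholder for the real IO loop; returns immediately in this sketch.
fn run_io_loop(_socket: UdpSocket) {}
```

Because the channel carries exactly one value per spawn, an error in a late setup step can no longer be hidden behind an earlier `Ok(())`.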
Thomas Eizinger	e05c98bfca	ci: update to new `cargo sort` release (#9354 ) The latest release now also sorts workspace dependencies, as well as different dependency sections. Keeping these things sorted reduces the chances of merge conflicts when multiple PRs edit these files.	2025-06-02 02:01:09 +00:00
Thomas Eizinger	02638582fe	feat(connlib): allow controlling IP stack per DNS resource (#9300 ) With this patch, `connlib` exposes a new, optional field `ip_stack` within the resource description of each DNS resource that controls the supported IP stack. By default, the IP stack is set to `Dual` to preserve the current behaviour. When set to `IPv4Only` or `IPv6Only`, `connlib` will not assign any IPv6 or IPv4 addresses, respectively, when receiving DNS queries for such a resource. The DNS query will still be answered successfully with NOERROR (and not NXDOMAIN) but the list of IPs will be empty. This is useful to e.g. allow sys-admins to disable IPv6 for resources with buggy clients such as the MongoDB atlas driver. The MongoDB driver does not correctly handle happy-eyeballs and instead fails the connection early on any connection error. Additionally, customers operating in IPv6-exclusive networks can disable IPv4 addresses with this setting. Related: https://jira.mongodb.org/browse/NODE-4678 Related: #9042 Related: #8892	2025-05-31 00:27:59 +00:00
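A rough sketch of how an `ip_stack` setting could filter the addresses handed out for a DNS resource follows. The variant names come from the commit message; the enum definition and the `filter_by_stack` function are hypothetical and only illustrate the idea.

```rust
use std::net::IpAddr;

/// Supported IP stack of a DNS resource; `Dual` preserves the old behaviour.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum IpStack {
    #[default]
    Dual,
    IPv4Only,
    IPv6Only,
}

/// Keeps only the addresses permitted by `stack`.
///
/// An empty result corresponds to a NOERROR response without any records,
/// rather than NXDOMAIN.
fn filter_by_stack(stack: IpStack, addrs: &[IpAddr]) -> Vec<IpAddr> {
    addrs
        .iter()
        .copied()
        .filter(|addr| match stack {
            IpStack::Dual => true,
            IpStack::IPv4Only => addr.is_ipv4(),
            IpStack::IPv6Only => addr.is_ipv6(),
        })
        .collect()
}
```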
Thomas Eizinger	e6f13a124a	fix(connlib): optimise logging of activated CIDR resources (#9293 ) Instead of always logging when CIDR resources change, we add an additional condition to the already existing `Activated resource` log that suppresses it in case the currently active CIDR resource is actively routing traffic. Resolves: #9281	2025-05-29 02:19:33 +00:00
dependabot[bot]	82d097baa0	build(deps): bump domain from 0.10.4 to 0.11.0 in /rust (#9274 ) Bumps [domain](https://github.com/nlnetlabs/domain) from 0.10.4 to 0.11.0. Release notes for 0.11.0 (released 2025-05-21), sourced from domain's releases and changelog; issue numbers refer to nlnetlabs/domain. Breaking changes: - FIX: Use base 16 per RFC 4034 for the DS digest, not base 64. (#423) - FIX: NSEC3 salt strings should only be accepted if within the salt size limit. (#431) - Stricter RFC 1035 compliance by default in the `Zonefile` parser. (#477) - Rename {DigestAlg, Nsec3HashAlg, SecAlg, ZonemdAlg} to {DigestAlgorithm, Nsec3HashAlgorithm, SecurityAlgorithm, ZonemdAlgorithm}. New: - Added `HashCompressor`, an unlimited name compressor that uses a hash map rather than a tree. (#396) - Changed `fmt::Display` for `HINFO` records to show a quoted string. (#421) - Added support for `NAPTR` record type. (#427 by @weilence) - Added initial fuzz testing support for some types via a new `arbitrary` feature (not enabled by default). (#441) - Added `StubResolver::add_connection()` to allow adding a connection to the running resolver. In combination with `ResolvConf::new()` this can also be used to control the connections made when testing code that uses the stub resolver. (#440) - Added `ZonefileFmt` trait for printing records as zonefiles. (#379, #446, #463) Bug fixes: - NSEC records should include themselves in the generated bitmap. (#417) - Trailing double quote wrongly preserved when parsing record data. (#470, #472) - Don't error with unexpected end of entry for RFC 3597 RDATA of length zero. (#475) Unstable features: - New unstable feature `unstable-crypto` that enables cryptography support for features that do not rely on secret keys. This feature needs either or both of the features `ring` and `openssl`. (#416) - New unstable feature `unstable-crypto-sign` that enables cryptography support including features that rely on secret keys. This feature needs either or both of the features `ring` and `openssl`. (#416) - New unstable feature `unstable-client-cache` that enables the client transport cache. The reason is that the client cache uses the `moka` crate. - New unstable feature `unstable-new` that introduces a new API for all of domain (currently only with `base`, `rdata`, and `edns` modules). - `unstable-server-transport`: the trait `SingleService`, a simplified service trait for requests that should generate a single response; the trait `ComposeReply` and an implementation of it (`ReplyMessage`) to assist in capturing EDNS(0) options that should be included in a response message; adapters to implement `Service` for `SingleService` and `SingleService` for `SendRequest`; conversion of a `Request` to a `RequestMessage`; a sample query router, called `QnameRouter`, that routes requests based on the QNAME field in the request. (#353) - `unstable-client-transport`: introduce timeout option in multi_stream, improve probing in redundant, restructure configuration for multi_stream and redundant (#424); introduce a load balancer client transport that tries to distribute requests equally over upstream transports (#425); the client cache now has its own feature `unstable-client-cache`. - `unstable-sign`: add key lifecycle management (#459); add support for adding NSEC3 records when signing; add support for ZONEMD. ... (truncated) Commits: see https://github.com/nlnetlabs/domain/compare/v0.10.4...v0.11.0 Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-05-28 10:37:16 +00:00
Thomas Eizinger	6165555add	build(deps): bump Rust to 1.87.0 (#9159 )	2025-05-16 01:58:17 +00:00
Thomas Eizinger	ce06996a14	fix(connlib): allow more than one host candidate per IP version (#9147 ) Currently, on machines that have multiple routable egress interfaces, `connlib` may bounce between the two instead of settling on one. This happens because we have a dedicated `CandidateSet` that we use to filter out "duplicate" candidates of the same type. Doing that is important because if the other party is behind a symmetric NAT, they will send us many server-reflexive candidates that all only differ by their port; none of them will actually be routable though. To prevent sending many of these candidates to the remote, we first gather them locally in our `CandidateSet` and de-duplicate them.	2025-05-15 02:08:53 +00:00
Thomas Eizinger	09191549eb	test(rust): don't delay delivering scheduled `Transmit`s (#9129 ) To simulate varying network conditions in our tests, each `Host` in our test network has an "inbox" that contains all incoming network packets with an added latency. When another host sends a packet, the packet gets added to the inbox. Internally, the inbox has a binary heap that sorts incoming `Transmit`s by their latency and only delivers them to the node when that delay is up. Currently, this delivery doesn't always happen because we fail to take into account the timestamp of when the next `Transmit` is due when we figure out what to do next. Instead of just looking at the inner state via `poll_transmit`, we now also consult the inbox of messages as to when the next message is due and wake up at the correct time. Not doing this caused our state machine to think that packets got dropped because `REFRESH` messages to the relays were timing out. Resolves: #9118	2025-05-14 05:37:05 +00:00
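The inbox described in #9129 can be approximated with a min-heap keyed by delivery time. The sketch below is a simplified stand-in for the test harness: the `Inbox` type, its methods and `next_wakeup` are hypothetical names, and packets are plain byte vectors rather than `Transmit`s.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::time::Instant;

/// Packets waiting to be delivered, ordered by their delivery time.
struct Inbox {
    // `Reverse` turns the max-heap into a min-heap: earliest delivery first.
    pending: BinaryHeap<Reverse<(Instant, Vec<u8>)>>,
}

impl Inbox {
    fn new() -> Self {
        Inbox { pending: BinaryHeap::new() }
    }

    /// Schedules `packet` for delivery at `deliver_at` (send time plus latency).
    fn push(&mut self, deliver_at: Instant, packet: Vec<u8>) {
        self.pending.push(Reverse((deliver_at, packet)));
    }

    /// When the next packet becomes due, if any.
    fn next_due(&self) -> Option<Instant> {
        self.pending.peek().map(|Reverse((at, _))| *at)
    }

    /// Pops the next packet only if its delay is up at `now`.
    fn pop_due(&mut self, now: Instant) -> Option<Vec<u8>> {
        let due = self.next_due()? <= now;
        if due {
            self.pending.pop().map(|Reverse((_, packet))| packet)
        } else {
            None
        }
    }
}

/// The earlier of the state machine's own timeout and the inbox's next
/// delivery; ignoring the inbox here is the bug the commit describes.
fn next_wakeup(state_machine_timeout: Option<Instant>, inbox: &Inbox) -> Option<Instant> {
    match (state_machine_timeout, inbox.next_due()) {
        (Some(a), Some(b)) => Some(a.min(b)),
        (a, b) => a.or(b),
    }
}
```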
Thomas Eizinger	45924eb90b	fix(connlib): ignore scopes for IPv6 link-local addresses (#9115 ) To send UDP DNS queries to upstream DNS servers, we have a `UdpSocket::handshake` function that turns a UDP socket into a single-use object where exactly one datagram is expected from the address we send a message to. The way this is enforced is via an equality check. It appears that this equality check fails if users run an upstream DNS server on a link-local IPv6 address within a setup that utilises IPv6 scopes. At the time when we receive the response, the packet has already been successfully routed back to us so we should accept it, even if we didn't specify a scope as the destination address.	2025-05-13 13:33:28 +00:00
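The relaxed equality check from #9115 could be expressed along these lines. This is a sketch under the assumption that only the IP and port should be compared for link-local IPv6 peers; the function names are hypothetical and the actual `UdpSocket::handshake` code may differ.

```rust
use std::net::{Ipv6Addr, SocketAddr};

/// Whether `ip` lies in the link-local unicast range fe80::/10.
fn is_link_local(ip: &Ipv6Addr) -> bool {
    (ip.segments()[0] & 0xffc0) == 0xfe80
}

/// Compares the address we sent to with the address a reply came from.
///
/// For link-local IPv6 addresses, the scope ID (and flow info) are ignored:
/// the reply has already been routed back to us, so a differing scope should
/// not cause it to be rejected. Everything else uses plain equality.
fn same_peer_ignoring_scope(expected: SocketAddr, actual: SocketAddr) -> bool {
    match (expected, actual) {
        (SocketAddr::V6(e), SocketAddr::V6(a)) if is_link_local(e.ip()) => {
            e.ip() == a.ip() && e.port() == a.port()
        }
        (e, a) => e == a,
    }
}
```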
Thomas Eizinger	b8738448df	refactor(connlib): forward error from source IP resolver (#9116 ) In order to avoid routing loops on Windows, our UDP and TCP sockets in `connlib` embed a "source IP resolver" that finds the "next best" interface after our TUN device according to Windows' routing metrics. This ensures that packets don't get routed back into our TUN device. Currently, errors during this process are only logged on TRACE and therefore not visible in Sentry. We fix this by moving around some of the function interfaces and forwarding the error from the source IP resolver together with some context of the destination IP.	2025-05-13 13:33:15 +00:00
Thomas Eizinger	945fed8e9d	chore(phoenix-channel): downgrade log about dropped messages (#9092 ) This can easily happen if we are briefly disconnected from the portal. It is not the end of the world and not worth creating Sentry alerts for. Originally, this was intended to be a way of detecting "bad connectivity" but that didn't really work.	2025-05-12 11:40:40 +00:00
Thomas Eizinger	f01fd4ddf6	fix(connlib): clear pending sockets on DNS server re-creation (#9093 ) Our DNS over TCP implementation uses `smoltcp` which requires us to manage sockets individually, i.e. there is no such thing as a listening socket. Instead, we have to create multiple sockets and rotate through them. Whenever we receive new DNS servers from the host app, we throw away all of those sockets and create new ones. The way we refer to these sockets internally is via `smoltcp`'s `SocketHandle`. These are just indices into a `Vec` and this access can panic when it is out of range. Normally that doesn't happen because such a `SocketHandle` is only created when the socket is created and therefore, each `SocketHandle` in existence should be valid. What we overlooked is that these sockets get destroyed and re-created when we call `set_listen_addresses` which happens when the host app tells us about new DNS servers. In that case, sockets that we had just received a query on and are waiting for a response have their handles stored in a temporary `HashMap`. Attempting to send back a response for one of those queries will then either fail with an error that the socket is not in the right state or - worse - panic with an out of bounds error if we previously had more listen addresses than we have now. To fix this, we need to clear this map of pending queries every time we call `set_listen_addresses`.	2025-05-12 11:39:59 +00:00
Thomas Eizinger	7e4fe68485	fix(connlib): take into account header overhead for GSO (#9088 ) When calculating the maximum size of the UDP payload we can send in a single syscall, we need to take into account the overhead of the IP and UDP headers.	2025-05-12 11:36:10 +00:00
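The header accounting from #9088 amounts to a small piece of arithmetic. The sketch below assumes a budget of 65,535 bytes for the whole IP packet, fixed header sizes without IPv4 options or IPv6 extension headers, and a hypothetical function name; the limits `connlib` actually applies may differ.

```rust
use std::net::IpAddr;

/// Assumed upper bound for one IP packet, headers included.
const MAX_IP_PACKET: usize = 65_535;
/// Size of a UDP header.
const UDP_HEADER: usize = 8;

/// Fixed IP header size for the destination's address family.
fn ip_header_len(dst: IpAddr) -> usize {
    match dst {
        IpAddr::V4(_) => 20, // IPv4 header without options
        IpAddr::V6(_) => 40, // IPv6 fixed header
    }
}

/// How many segments of `segment_size` bytes fit into one GSO batch once
/// the IP and UDP header overhead has been subtracted from the budget.
fn max_gso_segments(dst: IpAddr, segment_size: usize) -> usize {
    if segment_size == 0 {
        return 0;
    }
    let payload_budget = MAX_IP_PACKET - ip_header_len(dst) - UDP_HEADER;
    payload_budget / segment_size
}
```

With 1285-byte segments, for instance, ignoring the headers would suggest 51 segments (51 × 1285 = 65,535), whereas the header-aware calculation yields 50.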
Jamil	537295d8a3	fix(rust): Downgrade fastest nameserver to DEBUG (#9071 ) These run every minute and add a lot of noise to the logs. ``` May 11 18:21:14 gateway-z1w4 firezone-gateway[2007]: 2025-05-11T18:21:14.154Z INFO firezone_tunnel::io::nameserver_set: Evaluating fastest nameserver ips={127.0.0.53} May 11 18:21:14 gateway-z1w4 firezone-gateway[2007]: 2025-05-11T18:21:14.155Z INFO firezone_tunnel::io::nameserver_set: Evaluated fastest nameserver fastest=127.0.0.53 May 11 18:22:14 gateway-z1w4 firezone-gateway[2007]: 2025-05-11T18:22:14.154Z INFO firezone_tunnel::io::nameserver_set: Evaluating fastest nameserver ips={127.0.0.53} May 11 18:22:14 gateway-z1w4 firezone-gateway[2007]: 2025-05-11T18:22:14.155Z INFO firezone_tunnel::io::nameserver_set: Evaluated fastest nameserver fastest=127.0.0.53 May 11 18:23:14 gateway-z1w4 firezone-gateway[2007]: 2025-05-11T18:23:14.153Z INFO firezone_tunnel::io::nameserver_set: Evaluating fastest nameserver ips={127.0.0.53} May 11 18:23:14 gateway-z1w4 firezone-gateway[2007]: 2025-05-11T18:23:14.155Z INFO firezone_tunnel::io::nameserver_set: Evaluated fastest nameserver fastest=127.0.0.53 May 11 18:24:14 gateway-z1w4 firezone-gateway[2007]: 2025-05-11T18:24:14.154Z INFO firezone_tunnel::io::nameserver_set: Evaluating fastest nameserver ips={127.0.0.53} May 11 18:24:14 gateway-z1w4 firezone-gateway[2007]: 2025-05-11T18:24:14.155Z INFO firezone_tunnel::io::nameserver_set: Evaluated fastest nameserver fastest=127.0.0.53 May 11 18:25:14 gateway-z1w4 firezone-gateway[2007]: 2025-05-11T18:25:14.153Z INFO firezone_tunnel::io::nameserver_set: Evaluating fastest nameserver ips={127.0.0.53} ```	2025-05-12 01:58:17 +00:00
Thomas Eizinger	5566f1847f	refactor(rust): move crates into a more sensical hierarchy (#9066 ) The current `rust/` directory is a bit of a wild-west in terms of how the crates are organised. Most of them are simply at the top-level when in reality, they are all `connlib`-related. The Apple and Android FFI crates - which are entrypoints in the Rust code are defined several layers deep. To improve the situation, we move around and rename several crates. The end result is that all top-level crates / directories are: - Either entrypoints into the Rust code, i.e. applications such as Gateway, Relay or a Client - Or crates shared across all those entrypoints, such as `telemetry` or `logging`	2025-05-12 01:04:17 +00:00
Thomas Eizinger	3f4e004a48	fix(connlib): don't recreate DNS resource NAT for failed domains (#9064 ) Before a Client can send packets to a DNS resource, the Gateway must first set up a NAT table between the IPs assigned by the Client and the IPs the domain actually resolves to. This is what we call the DNS resource NAT. The communication for this process happens over IP through the tunnel which is an unreliable transport. To ensure that this works reliably even in the presence of packet loss on the wire, the Client uses an idempotent algorithm where it tracks the state of the NAT for each domain that it has ever assigned IPs for (i.e. received an A or AAAA query from an application). This algorithm ensures that if we don't hear anything back from the Gateway within 2s, another packet for setting up the NAT is sent as soon as we receive _any_ DNS query. This design balances efficiency (we don't try forever) with reliability (we always check all of them). In case a domain does not resolve at all or there are resolution errors, the Gateway replies with `NatStatus::Inactive`. At present, the Client doesn't handle this in any particular way other than logging that it was not able to successfully set up the NAT. The combination of the above results in an undesirable behaviour: If an application queries a domain without A and AAAA records once, we will keep retrying forever to resolve it upon every other DNS query issued to the system. To fix this, we introduce `dns_resource_nat::State::Failed`. Entries in this state are ignored as part of the above algorithm and only recreated when explicitly told to do so which we only do when we receive another DNS query for this domain. To handle the increased complexity around this system, we extract it into its own component and add a fleet of unit tests for its behaviour.	2025-05-09 15:04:21 +00:00
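The retry rule from 3f4e004a48 can be sketched as a small state machine. Only the `Failed` state and the 2s timeout come from the commit message; `Pending`, `Confirmed`, `DnsResourceNat` and all method names are assumptions made for this illustration and not the actual `dns_resource_nat` API.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-domain NAT state; only `Failed` is named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Pending { sent_at: Instant },
    Confirmed,
    Failed,
}

/// Resend the setup packet if the Gateway has not answered within this time.
const RETRY_AFTER: Duration = Duration::from_secs(2);

#[derive(Default)]
pub struct DnsResourceNat {
    entries: HashMap<String, State>,
}

impl DnsResourceNat {
    /// Explicit (re)creation, e.g. because an application queried `domain` again.
    /// This is the only way out of `Failed`.
    pub fn recreate(&mut self, domain: &str, now: Instant) {
        self.entries
            .insert(domain.to_owned(), State::Pending { sent_at: now });
    }

    /// The Gateway answered: an inactive NAT moves the entry to `Failed`.
    pub fn on_gateway_reply(&mut self, domain: &str, active: bool) {
        let state = if active { State::Confirmed } else { State::Failed };
        self.entries.insert(domain.to_owned(), state);
    }

    /// Called on any DNS query: returns the domains whose setup packet
    /// should be sent again. `Confirmed` and `Failed` entries are skipped.
    pub fn domains_to_retry(&mut self, now: Instant) -> Vec<String> {
        let mut retry = Vec::new();
        for (domain, state) in self.entries.iter_mut() {
            if let State::Pending { sent_at } = state {
                if now.duration_since(*sent_at) >= RETRY_AFTER {
                    *sent_at = now;
                    retry.push(domain.clone());
                }
            }
        }
        retry.sort(); // Deterministic order for callers and tests.
        retry
    }
}
```

Keeping `Failed` as a separate state, rather than removing the entry, lets the component distinguish "never seen" from "seen and known not to resolve".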
Thomas Eizinger	fa790b231a	fix(gateway): respond with SERVFAIL for missing nameserver (#9061 ) When we implemented #8350, we chose an error handling strategy that would shut down the Gateway in case we didn't have a nameserver selected for handling those SRV and TXT queries. At the time, this was deemed to be sufficiently rare to be an adequate strategy. We have since learned that this can indeed happen when the Gateway starts without network connectivity which is quite common when using tools such as terraform to provision infrastructure. In #9060, we fix this by re-evaluating the fastest nameserver on a timer. This however doesn't change the error handling strategy when we don't have a working nameserver at all. It is practically impossible to have a working Gateway while being unable to select a nameserver. We read them from `/etc/resolv.conf` which is what `libc` also uses to resolve the domain we connect to for the WebSocket. A working WebSocket connection is required for us to establish connections to Clients, which in turn is a precursor to us receiving DNS queries from a Client. It causes unnecessary complexity to have a code path that can potentially terminate the Gateway, yet is practically unreachable. To fix this situation, we remove this code path and instead reply with a DNS SERVFAIL error. --------- Signed-off-by: Thomas Eizinger <thomas@eizinger.io> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-05-09 05:55:48 +00:00
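To make the SERVFAIL reply from fa790b231a concrete, here is a minimal sketch that turns a raw DNS query into a SERVFAIL response by editing the header flags defined in RFC 1035. The function name `servfail_response` is hypothetical, and the Gateway most likely builds its response through a DNS library rather than by patching bytes.

```rust
/// Builds a SERVFAIL response by echoing the query and patching the header.
/// Returns `None` if the input is shorter than the 12-byte DNS header.
pub fn servfail_response(query: &[u8]) -> Option<Vec<u8>> {
    const HEADER_LEN: usize = 12;
    if query.len() < HEADER_LEN {
        return None;
    }

    let mut response = query.to_vec();
    // Byte 2: QR | Opcode (4 bits) | AA | TC | RD. Set QR to mark this as a response.
    response[2] |= 0x80;
    // Byte 3: RA | Z (3 bits) | RCODE (4 bits). RCODE 2 is "Server failure".
    response[3] = (response[3] & 0xF0) | 0x02;

    Some(response)
}
```

The query ID and the question section are left untouched so the Client can match the error to its original query.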
Thomas Eizinger	ac339ff63b	fix(gateway): evaluate fastest nameserver every 60s (#9060 ) Currently, the Gateway reads all nameservers from `/etc/resolv.conf` on startup and evaluates the fastest one to use for SRV and TXT DNS queries that are forwarded by the Client. If the machine just booted and we do not have Internet connectivity just yet, this fails which leaves the Gateway in a state where it cannot fulfill those queries. In order to ensure we always use the fastest one and to self-heal from such situations, we add a 60s timer that refreshes this state. Currently, this will not re-read the nameservers from `/etc/resolv.conf` but still use the same IPs read on startup.	2025-05-09 03:38:35 +00:00
Thomas Eizinger	33d5c32f35	fix(gateway): truncate payload of ICMP errors (#9059 ) When the Gateway is handed an IP packet for a DNS resource that it cannot route, it sends back an ICMP unreachable error. According to RFC 792 [0] (for ICMPv4) and RFC 4443 [1] (for ICMPv6), parts of the original packet should be included in the ICMP error payload to allow the sending party to correlate what could not be sent. For ICMPv4, the RFC says: ``` Internet Header + 64 bits of Data Datagram The internet header plus the first 64 bits of the original datagram's data. This data is used by the host to match the message to the appropriate process. If a higher level protocol uses port numbers, they are assumed to be in the first 64 data bits of the original datagram's data. ``` For ICMPv6, the RFC says: ``` As much of invoking packet as possible without the ICMPv6 packet exceeding the minimum IPv6 MTU ``` [0]: https://datatracker.ietf.org/doc/html/rfc792 [1]: https://datatracker.ietf.org/doc/html/rfc4443#section-3.1	2025-05-09 01:38:31 +00:00
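The two truncation rules quoted in 33d5c32f35 can be written down as a short sketch. The function names are hypothetical, and the ICMPv6 limit assumes an error packet without IPv6 extension headers (40-byte IPv6 header plus 8-byte ICMPv6 header inside the 1280-byte minimum MTU).

```rust
/// Minimum IPv6 MTU that the ICMPv6 error must not exceed.
const MIN_IPV6_MTU: usize = 1280;
const IPV6_HEADER_LEN: usize = 40;
const ICMPV6_HEADER_LEN: usize = 8;

/// ICMPv4: the original internet header plus the first 64 bits (8 bytes) of its data.
pub fn icmpv4_error_payload(original: &[u8]) -> &[u8] {
    // The IHL field is the low nibble of the first byte, counted in 32-bit words.
    let header_len = original.first().map_or(0, |b| usize::from(b & 0x0f) * 4);
    let len = (header_len + 8).min(original.len());
    &original[..len]
}

/// ICMPv6: as much of the original packet as fits without the error
/// exceeding the minimum IPv6 MTU.
pub fn icmpv6_error_payload(original: &[u8]) -> &[u8] {
    let max = MIN_IPV6_MTU - IPV6_HEADER_LEN - ICMPV6_HEADER_LEN;
    &original[..original.len().min(max)]
}
```

For a standard 20-byte IPv4 header this yields 28 bytes, which is enough to cover the port numbers of a TCP or UDP header.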
Thomas Eizinger	18ec6c6860	refactor(rust): move service implementation to GUI client (#9045 ) The module and crate structure around the GUI client and its background service are currently a mess of circular dependencies. Most of the service implementation actually sits in `firezone-headless-client` because the headless-client and the service share certain modules. We have recently moved most of these to `firezone-bin-shared` which is the correct place for these modules. In order to move the background service to `firezone-gui-client`, we need to untangle a few more things in the GUI client. Those are done commit-by-commit in this PR. With that out of the way, we can finally move the service module to the GUI client, where it should actually live given that it has nothing to do with the headless client. As a result, the headless-client is - as one would expect - really just a thin wrapper around connlib itself and is reduced down to 4 files with this PR. To make things more consistent in the GUI client, we move the `main.rs` file also into `bin/`. By convention `bin/` is where you define binaries if a crate has more than one. cargo will then build all of them. Eventually, we can optimise the compile-times for `firezone-gui-client` by splitting it into multiple crates: - Shared structs like IPC messages - Background service - GUI client This will be useful because it allows re-compiling only the GUI client if nothing in `connlib` changes and vice versa. Resolves: #6913 Resolves: #5754	2025-05-08 13:22:09 +00:00
dependabot[bot]	bea57c02c4	build(deps): bump libc from 0.2.171 to 0.2.172 in /rust (#9031 ) Bumps [libc](https://github.com/rust-lang/libc) from 0.2.171 to 0.2.172. See the [release notes](https://github.com/rust-lang/libc/releases), the [changelog](https://github.com/rust-lang/libc/blob/0.2.172/CHANGELOG.md) and the [compare view](https://github.com/rust-lang/libc/compare/0.2.171...0.2.172) for details. --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Thomas Eizinger <thomas@eizinger.io>	2025-05-06 01:26:26 +00:00
Thomas Eizinger	37529803ce	build(rust): bump otel ecosystem crates to 0.29 (#9029 )	2025-05-05 12:33:07 +00:00
Thomas Eizinger	2d802edf6a	fix(connlib): _always_ use one IP stack for relayed connections (#9018 ) At the moment, Firezone already attempts to prefer the same IP stack across relayed connections all the way through to the Gateway. This is achieved with a feature in str0m implemented in https://github.com/algesten/str0m/pull/640 where the `IceAgent` computes the local preference of an added candidate such that an IPv4 candidate allocated over an IPv4 network has a higher preference than an IPv6 candidate allocated over an IPv4 network. If a candidate gets accepted by the local agent, it is signaled to the remote via our control protocol. The remote peer then adds the candidate as a remote candidate and the ICE process starts by pairing them with local ones and testing connectivity. Currently, str0m's API consumes the candidate and only returns a `bool` indicating whether it should be signaled to the remote. This means the local preference computed as part of `add_local_candidate` is not reflected in the priority of the candidate sent to the remote. As a result, if the controlled agent (i.e. the Gateway) is behind symmetric NAT and therefore only has relay candidates available, the preference of IPv4 over IPv6 candidates on an IPv4 network is lost. This is what we are seeing in #8998. This changes with https://github.com/algesten/str0m/pull/650 being merged to `main` which we are updating to with this PR. Now, `add_local_candidate` returns an `Option<&Candidate>` which has been modified with the local preference of the `IceAgent` around the preferred IP stack of relay candidates. As such, the priority calculated and signaled to the remote embeds this information and will be taken into account by the controlling agent (i.e. the Client) when nominating a pair. Resolves: #8998	2025-05-05 01:11:28 +00:00
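Why the local preference has to survive signaling follows from the candidate priority formula in RFC 8445, section 5.1.2.1. The sketch below implements only that formula; the function name and the concrete preference values in the usage note are assumptions for illustration and are not taken from str0m.

```rust
/// Candidate priority as defined in RFC 8445, section 5.1.2.1:
/// 2^24 * type preference + 2^8 * local preference + (256 - component ID).
///
/// `type_preference` must be in 0..=126 and `component_id` in 1..=256.
pub fn ice_priority(type_preference: u8, local_preference: u16, component_id: u16) -> u32 {
    debug_assert!(type_preference <= 126);
    debug_assert!((1..=256).contains(&component_id));

    (u32::from(type_preference) << 24)
        + (u32::from(local_preference) << 8)
        + (256 - u32::from(component_id))
}
```

Two relay candidates share the same type preference, so only the local preference separates them: with a type preference of 0, `ice_priority(0, 65535, 1)` is larger than `ice_priority(0, 65534, 1)`. If the signaled priority is computed without the agent's adjusted local preference, that ordering is not visible to the remote peer.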
Jamil	6e0e7343ba	chore: release Apple & Gateway with ECN fix (#9013 )	2025-05-02 00:16:40 -07:00
Thomas Eizinger	0aab954fa9	fix(connlib): never clear ECT from IP packets (#9009 ) ECN information is helpful to allow the congestion controllers to more easily fine-tune their send and receive windows. When a Firezone Client receives an IP packet where the ECN bits signal an ECN capable transport, we mirror this bit on the UDP datagram that carries the encrypted IP packet. When receiving a datagram with ECN bits set, the Gateway will then apply these bits to the decrypted IP packet and pass it along towards its destination. This implementation is unfortunately a bit too naive. Not all devices on the Internet support ECN and therefore, we may receive a datagram that has its ECN bits cleared when the ECN bits on the inner IP packet still signal an ECN capable transport. In this case, we should _not_ override the ECN bits and instead pass the IP packet along as is. Network devices along the path between Gateway and Resource may still use these ECN bits to signal congestion. We fix this by making the `with_ecn` function on `IpPacket` private. It is not meant to be used outside of the module. We supersede it with a `with_ecn_from_transport` function that implements the above logic. --------- Signed-off-by: Thomas Eizinger <thomas@eizinger.io> Co-authored-by: Jamil <jamilbk@users.noreply.github.com>	2025-05-02 05:28:19 +00:00
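The rule from 0aab954fa9 can be sketched as a pure function over the four ECN codepoints. This is a simplified reading of the commit message, not the actual `with_ecn_from_transport` implementation: the `Ecn` enum and both function names are hypothetical, and the sketch works on plain values instead of an `IpPacket`.

```rust
/// The four ECN codepoints (lowest two bits of the IPv4 TOS / IPv6 traffic class byte).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ecn {
    NonEct = 0b00,
    Ect1 = 0b01,
    Ect0 = 0b10,
    Ce = 0b11,
}

/// Decides the ECN codepoint of the decrypted inner packet.
///
/// If the transport datagram arrived with its ECN bits cleared, the inner
/// packet keeps whatever it had; otherwise the transport's bits are applied.
pub fn ecn_from_transport(inner: Ecn, transport: Ecn) -> Ecn {
    match transport {
        Ecn::NonEct => inner,
        marked => marked,
    }
}

/// Writes an ECN codepoint into a TOS / traffic class byte, preserving the DSCP bits.
pub fn with_ecn_bits(tos: u8, ecn: Ecn) -> u8 {
    (tos & 0b1111_1100) | ecn as u8
}
```

With this rule, a path that strips ECN from the outer datagram no longer removes ECT from the inner packet, while a congestion mark (`Ce`) set along the tunnel path is still propagated.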
Thomas Eizinger	0fd14b993b	chore(connlib): buffer most recent TCP SYN (#9004 ) When establishing connections that take longer than the TCP RTO, we may see duplicate TCP SYNs. Those have different timestamps from each other but are otherwise equal. To provide more accurate timing information to the TCP stack, we now keep the latest TCP SYN around instead of the very first one.	2025-05-02 01:06:14 +00:00
Thomas Eizinger	ea5709e8da	chore(rust): initialise OTEL with useful metadata (#8945 ) Once we start collecting metrics across various Clients and Gateways, these metrics need to be tagged with the correct `service.name`, `service.version` as well as an instance ID to differentiate metrics from different instances.	2025-05-01 05:19:07 +00:00
Thomas Eizinger	8dd794d8c8	chore(gateway): record metrics about dropped packets (#8942 ) When a NAT session expires or other unallowed traffic is routed to the Gateway, we drop these packets. It will be useful to learn how often that actually happens and why the packets got dropped. To do so, we add a counter metric for these packets. --------- Signed-off-by: Thomas Eizinger <thomas@eizinger.io> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-04-30 18:24:10 +00:00
Thomas Eizinger	6f11568c8c	fix(connlib): move `wire::dev::recv` log to right location (#8944 ) I don't understand why but in the current location, this log simply doesn't show up for anything other than UDP packets. If we move it up, it will actually log all packets.	2025-04-30 13:45:51 +00:00
Thomas Eizinger	e031dfdb4a	refactor(connlib): introduce our own `bufferpool` crate (#8928 ) We have been using buffer pools for a while all over `connlib` as a way to efficiently use heap-allocated memory. This PR harmonizes the usage of buffer pools across the codebase by introducing a dedicated `bufferpool` crate. This crate offers a convenient and easy-to-use API for all the things we (currently) need from buffer pools. As a nice bonus of having it all in one place, we can now also track metrics of how many buffers we have currently allocated. An example output from the local metrics exporter looks like this: ``` Name : system.buffer.count Description : The number of buffers allocated in the pool. Unit : {buffers} Type : Sum Sum DataPoints Monotonic : false Temporality : Cumulative DataPoint #0 StartTime : 2025-04-29 12:41:25.278436 EndTime : 2025-04-29 12:42:25.278088 Value : 96 Attributes : -> system.buffer.pool.name: udp-socket-v6 -> system.buffer.pool.buffer_size: 65535 DataPoint #1 StartTime : 2025-04-29 12:41:25.278436 EndTime : 2025-04-29 12:42:25.278088 Value : 7 Attributes : -> system.buffer.pool.buffer_size: 131600 -> system.buffer.pool.name: gso-queue DataPoint #2 StartTime : 2025-04-29 12:41:25.278436 EndTime : 2025-04-29 12:42:25.278088 Value : 128 Attributes : -> system.buffer.pool.name: udp-socket-v4 -> system.buffer.pool.buffer_size: 65535 DataPoint #3 StartTime : 2025-04-29 12:41:25.278436 EndTime : 2025-04-29 12:42:25.278088 Value : 8 Attributes : -> system.buffer.pool.buffer_size: 1336 -> system.buffer.pool.name: ip-packet DataPoint #4 StartTime : 2025-04-29 12:41:25.278436 EndTime : 2025-04-29 12:42:25.278088 Value : 9 Attributes : -> system.buffer.pool.buffer_size: 1336 -> system.buffer.pool.name: snownet ``` Resolves: #8385	2025-04-30 08:52:18 +00:00
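As a rough illustration of what a pool with an allocation counter looks like, here is a minimal sketch using only the standard library. It is not the API of the `bufferpool` crate introduced in e031dfdb4a; `BufferPool`, `Buffer`, `pull` and `allocated` are hypothetical names, and the counter merely plays the role of the `system.buffer.count` metric.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

struct Inner {
    buffer_size: usize,
    free: Mutex<Vec<Vec<u8>>>,
    /// Total number of buffers ever allocated by this pool.
    allocated: AtomicUsize,
}

/// Hypothetical pool handing out fixed-size, heap-allocated buffers.
pub struct BufferPool {
    inner: Arc<Inner>,
}

/// A buffer that returns itself to the pool when dropped.
pub struct Buffer {
    data: Option<Vec<u8>>,
    pool: Arc<Inner>,
}

impl BufferPool {
    pub fn new(buffer_size: usize) -> Self {
        let inner = Inner {
            buffer_size,
            free: Mutex::new(Vec::new()),
            allocated: AtomicUsize::new(0),
        };
        Self { inner: Arc::new(inner) }
    }

    /// Reuses a free buffer if one exists, otherwise allocates a new one.
    pub fn pull(&self) -> Buffer {
        let reused = self.inner.free.lock().unwrap().pop();
        let data = reused.unwrap_or_else(|| {
            self.inner.allocated.fetch_add(1, Ordering::Relaxed);
            vec![0; self.inner.buffer_size]
        });
        Buffer { data: Some(data), pool: Arc::clone(&self.inner) }
    }

    /// The value a metrics exporter could report for this pool.
    pub fn allocated(&self) -> usize {
        self.inner.allocated.load(Ordering::Relaxed)
    }
}

impl Buffer {
    pub fn as_mut_slice(&mut self) -> &mut [u8] {
        self.data.as_deref_mut().expect("buffer is present until drop")
    }
}

impl Drop for Buffer {
    fn drop(&mut self) {
        if let Some(data) = self.data.take() {
            self.pool.free.lock().unwrap().push(data);
        }
    }
}
```

Because buffers go back to the free list on drop, the counter only grows when demand exceeds the number of buffers already in circulation.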
Thomas Eizinger	f7df445924	fix(gateway): don't invalidate active NAT sessions (#8937 ) Whenever the Gateway is instructed to (re)create the NAT for a DNS resource, it performs a DNS query and then overwrites the existing entries in the NAT table. Depending on how the DNS records are defined, this may lead to a very bad user experience where connections are cut regularly. In particular, if a service utilises round-robin DNS where a DNS query only ever returns a single entry yet that entry may change as soon as the TTL expires, all connections for this particular DNS resource for a Client get cut. To fix this, we now first check for active NAT sessions for a given proxy IP and only replace it if we don't have an open NAT session. The NAT sessions have a TTL of 1 minute, meaning there needs to be at least 1 outgoing packet from the Client every minute to keep it open.	2025-04-30 06:58:58 +00:00
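The check described in f7df445924 can be sketched as follows. The 1 minute TTL comes from the commit message; `NatTable` and its methods are hypothetical and track a single timestamp per proxy IP, which is a simplification of whatever session tracking the Gateway actually uses.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// A NAT session stays open for this long after the last outgoing packet.
const NAT_SESSION_TTL: Duration = Duration::from_secs(60);

/// Hypothetical NAT table for one Client and one DNS resource.
#[derive(Default)]
pub struct NatTable {
    /// Proxy IP (assigned by the Client) -> IP the domain resolved to.
    mapping: HashMap<IpAddr, IpAddr>,
    /// Proxy IP -> time of the last outgoing packet from the Client.
    last_outgoing: HashMap<IpAddr, Instant>,
}

impl NatTable {
    pub fn on_outgoing_packet(&mut self, proxy_ip: IpAddr, now: Instant) {
        self.last_outgoing.insert(proxy_ip, now);
    }

    fn has_active_session(&self, proxy_ip: IpAddr, now: Instant) -> bool {
        self.last_outgoing
            .get(&proxy_ip)
            .map_or(false, |last| now.duration_since(*last) < NAT_SESSION_TTL)
    }

    /// Applies a freshly resolved IP. An existing entry is only replaced
    /// if there is no open NAT session for `proxy_ip`.
    /// Returns whether the mapping was written.
    pub fn set_mapping(&mut self, proxy_ip: IpAddr, resolved_ip: IpAddr, now: Instant) -> bool {
        if self.mapping.contains_key(&proxy_ip) && self.has_active_session(proxy_ip, now) {
            return false;
        }
        self.mapping.insert(proxy_ip, resolved_ip);
        true
    }

    pub fn translate(&self, proxy_ip: IpAddr) -> Option<IpAddr> {
        self.mapping.get(&proxy_ip).copied()
    }
}
```

With round-robin DNS, a re-resolution that returns a different address is ignored while traffic is flowing and takes effect once the session has been idle for longer than the TTL.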
Jamil	2650d81444	chore: release clients with GSO fix (#8936 )	2025-04-29 23:52:43 -07:00
Thomas Eizinger	c75b6c6641	feat(connlib): record the number of IO errors as a metric (#8934 ) It will be interesting to learn for example, how many installations have no IPv6 connectivity as those will encounter `NetworkUnreachable` errors. We categorise the errors by IO direction and IP stack which will allow us to deduce this information.	2025-04-30 05:24:55 +00:00
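Categorising IO errors by direction and IP stack, as the commit above describes, might look like the sketch below. The types `Direction`, `IpStack` and `IoErrorCounter` are hypothetical, and a plain `HashMap` stands in for whatever metrics instrument `connlib` actually uses.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

/// Whether the error occurred while receiving or sending (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Direction {
    Inbound,
    Outbound,
}

/// Which IP stack the failing socket operation used (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum IpStack {
    V4,
    V6,
}

/// Stand-in for a metrics counter with two attributes.
#[derive(Default)]
struct IoErrorCounter {
    counts: HashMap<(Direction, IpStack), u64>,
}

impl IoErrorCounter {
    /// Count one IO error, deriving the IP stack from the peer address.
    fn record(&mut self, direction: Direction, peer: SocketAddr) {
        let stack = if peer.is_ipv4() { IpStack::V4 } else { IpStack::V6 };
        *self.counts.entry((direction, stack)).or_default() += 1;
    }

    /// Number of errors seen for one (direction, stack) combination.
    fn get(&self, direction: Direction, stack: IpStack) -> u64 {
        self.counts.get(&(direction, stack)).copied().unwrap_or(0)
    }
}
```

With this split, an installation without IPv6 connectivity would show up as a high outbound count for `IpStack::V6` while the `IpStack::V4` counts stay low.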
Thomas Eizinger	6dc5f85cc5	fix(connlib): don't buffer when recreating DNS resource NAT (#8935 ) In order to detect changes to DNS records of DNS resources, `connlib` will recreate the DNS resource NAT whenever it receives a query for a DNS resource. The way we implemented this was by clearing the local state of the DNS resource NAT, which triggered us to perform the handshake with the Gateway again upon the next packet for this resource. The Gateway would then perform the DNS query and respond back when this was finished. In order to not drop any packets, `connlib` has a buffer where it keeps the packets that are arriving in the meantime. This works reasonably well when the connection is first set up because we are only buffering a TCP SYN or equivalent handshake packet. Yet, when the connection is in full use, and the application just so happens to make another DNS query, we halt the entire flow of packets until this is confirmed again. To prevent high memory use, the buffer for these packets is constrained to 32 packets which is nowhere near enough when a connection is actively transferring data (like a file upload). In most cases, the DNS query on the Gateway will yield the exact same results because the records haven't changed. Thus, there is no reason for us to actually halt the flow of these packets when we are _recreating_ the DNS resource NAT. That way, this handshake happens in parallel to the actual packet flow and does not interrupt anything in the happy path case.	2025-04-30 04:26:49 +00:00
Thomas Eizinger	d19d20da51	fix(connlib): send IO errors from UDP threads to event-loop (#8933 ) With #7590, we've moved all UDP IO operations to a separate thread. As a result, some of the error handling of IO errors within the Client's and Gateway's event-loop no longer applied as those are now captured within the respective thread. To fix this, we extend the type-signature of the receive-channel to also allow for errors and use that to send back errors from sending AND receiving UDP datagrams.	2025-04-30 02:00:41 +00:00
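The idea of widening the receive channel so that it also carries errors can be illustrated with the standard library alone. This is a sketch under stated assumptions: `Datagram`, `spawn_io_thread` and `run_event_loop` are hypothetical names, the IO outcomes are simulated rather than read from a real socket, and `connlib`'s actual channel type may differ.

```rust
use std::io;
use std::net::SocketAddr;
use std::sync::mpsc;
use std::thread;

/// Hypothetical datagram type for illustration.
struct Datagram {
    from: SocketAddr,
    payload: Vec<u8>,
}

/// The channel item is a `Result`, so the IO thread can report failures
/// to the event loop instead of swallowing them.
type UdpEvent = io::Result<Datagram>;

/// Spawn a thread that forwards (here: simulated) IO outcomes to the event loop.
fn spawn_io_thread(outcomes: Vec<UdpEvent>) -> mpsc::Receiver<UdpEvent> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        for outcome in outcomes {
            // Stop if the event loop has gone away.
            if tx.send(outcome).is_err() {
                break;
            }
        }
    });
    rx
}

/// Drain the channel; returns the received byte count and the error kinds seen.
fn run_event_loop(rx: mpsc::Receiver<UdpEvent>) -> (usize, Vec<io::ErrorKind>) {
    let mut bytes = 0;
    let mut errors = Vec::new();
    for event in rx {
        match event {
            Ok(datagram) => {
                let _sender = datagram.from; // A real loop would route by sender.
                bytes += datagram.payload.len();
            }
            // Error handling lives in the event loop again.
            Err(e) => errors.push(e.kind()),
        }
    }
    (bytes, errors)
}
```

Because the error travels through the same channel as the data, its ordering relative to the surrounding datagrams is preserved.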
Thomas Eizinger	2ba7a87899	feat(connlib): add FFI for changing log-level on MacOS (#8927 ) This isn't plugged into anything yet on the Swift side but lays the foundation for changing the log-level at runtime without having to sign the user out.	2025-04-29 13:51:46 +00:00

1112 Commits