firezone

mirror of https://github.com/outbackdingo/firezone.git synced 2026-01-27 18:18:55 +00:00

Author	SHA1	Message	Date
Thomas Eizinger	e81dc452f7	refactor(connlib): use a lock-free queue for the buffer pool (#9989 ) We use several buffer pools across `connlib` that are all backed by the same buffer-pool library. Within that library, we currently use another object-pool library to provide the actual pooling functionality. Benchmarking has shown that we spend quite a bit of time (a few % of total CPU time) fighting for the lock to either add or remove a buffer from the pool. This is unnecessary. By using a queue, we can remove buffers from the front and add buffers at the back, both of which can be implemented in a lock-free way such that they don't contend. Using the well-known `crossbeam-queue` library, we have such a queue directly available. I wasn't able to directly measure a performance gain in terms of throughput. What we can measure though, is how much time we spend dealing with our buffer pool vs everything else. If we compare the `perf` outputs that were recorded during an `iperf` run each, we can see that we spend about 60% less time dealing with the buffer pool than we did before. [Screenshots: `perf` output before and after] The number in the thousands on the left is how often the respective function was the currently executing function during the profiling run. Resolves: #9972	2025-07-28 21:39:11 +00:00
Thomas Eizinger	55304b3d2a	refactor(snownet): learn host candidates from TURN traffic (#9998 ) Presently, for each UDP packet that we process in `snownet`, we check if we have already seen this local address of ours and if not, add it to our list of host candidates. This is a safe way of ensuring that we consider all addresses that we receive data on as ones that we tell our peers that they should try and contact us on. Performance profiling has shown that hashing the socket address of each packet that is coming in is quite wasteful. We spend about 4-5% of our main thread time doing this. For comparison, decrypting packets is only about 30%. Most of the time, we will already know about this address and therefore, spending all this CPU time is completely pointless. At the same time though, we need to be sure that we do discover our local address correctly. Inspired by STUN, we therefore move this responsibility to the `allocation` module. The `allocation` module is responsible for interacting with our TURN servers and will yield server-reflexive and relay candidates as a result. It also knows what the local address is that it received traffic on, so we simply extend it to yield host candidates in addition to server-reflexive and relay candidates. 
On my local machine, this bumps us across the 3.5 Gbits/sec mark: ``` Connecting to host 172.20.0.110, port 5201 [ 5] local 100.93.174.92 port 57890 connected to 172.20.0.110 port 5201 [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-1.00 sec 319 MBytes 2.67 Gbits/sec 18 548 KBytes [ 5] 1.00-2.00 sec 413 MBytes 3.46 Gbits/sec 4 884 KBytes [ 5] 2.00-3.00 sec 417 MBytes 3.50 Gbits/sec 4 1.10 MBytes [ 5] 3.00-4.00 sec 425 MBytes 3.56 Gbits/sec 415 785 KBytes [ 5] 4.00-5.00 sec 430 MBytes 3.60 Gbits/sec 154 820 KBytes [ 5] 5.00-6.00 sec 434 MBytes 3.64 Gbits/sec 251 793 KBytes [ 5] 6.00-7.00 sec 436 MBytes 3.66 Gbits/sec 123 811 KBytes [ 5] 7.00-8.00 sec 435 MBytes 3.65 Gbits/sec 2 788 KBytes [ 5] 8.00-9.00 sec 423 MBytes 3.55 Gbits/sec 0 1.06 MBytes [ 5] 9.00-10.00 sec 433 MBytes 3.63 Gbits/sec 8 1017 KBytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-20.00 sec 8.21 GBytes 3.53 Gbits/sec 1728 sender [ 5] 0.00-20.00 sec 8.21 GBytes 3.53 Gbits/sec receiver iperf Done. ```	2025-07-28 21:38:39 +00:00
Thomas Eizinger	9c71026416	chore(connlib): gate more trace logs on `debug_assertions` (#10026 ) These are otherwise hit pretty often in the hot-path and slow packet routing down because tracing needs to evaluate whether it should log the statement.	2025-07-28 21:38:23 +00:00
Thomas Eizinger	fb9a142a9e	chore(snownet): add back span in `handle_timeout` (#10025 ) Whilst entering and leaving a span for every packet is very expensive, doing the same whenever we make timeout related changes is just fine. Thus, we re-introduce a span removed in #9949 but only for the `handle_timeout` function. This gives us the context of the connection ID for not just our own logs, but also the ones from `boringtun`.	2025-07-28 04:14:39 +00:00
Thomas Eizinger	bfa77bf7fc	chore(snownet): log connection ID in more places (#10023 ) With the removal of the span in #9949, we now need to explicitly log the connection ID in a few more places to have the necessary context.	2025-07-28 02:01:01 +00:00
Thomas Eizinger	ce5650b554	fix(snownet): compare `preshared_key` on connection upsert (#9999 ) By chance, I've discovered in a CI failure that we won't be able to handshake a new session if the `preshared_key` changes. This makes a lot of sense. The `preshared_key` needs to be the same on both ends as it is a shared secret that gets mixed into the Noise handshake. In the following sequence of events, we would thus previously run into a "failed to decrypt handshake packet" scenario: 1. Client requests a connection. 2. Gateway authorizes the connection. 3. Portal restarts / gets deployed. To my knowledge, this will rotate the `preshared_key` to a new secret. Restarting the portal also cuts all WebSockets and therefore, the Gateway's response never arrives. 4. Client reconnects to the WebSocket, requests a new connection. 5. Gateway reuses the local connection but this connection still uses the old `preshared_key`! 6. Client needs to wait for the Gateway's ICE timeout before it can establish a new connection. How exactly (3) happens doesn't matter. There are probably other conditions under which the WebSocket connections get cut and we cannot complete our connection handshake.	2025-07-25 21:14:58 +00:00
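The upsert check described in this commit can be sketched as follows. This is a minimal illustration, not `snownet`'s actual code: the `Connection` struct, the `Upsert` enum, the `u64` connection ID and the choice to replace the connection on a key mismatch are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical outcome of an upsert; not a `snownet` type.
#[derive(Debug, PartialEq)]
enum Upsert {
    Created,
    Reused,
    Replaced,
}

/// Hypothetical connection state, reduced to the shared secret.
struct Connection {
    preshared_key: [u8; 32],
}

/// Reuse an existing connection only if the preshared key still matches.
/// Assumption: on a mismatch we replace the connection so that both ends
/// handshake with the same secret again.
fn upsert(conns: &mut HashMap<u64, Connection>, id: u64, preshared_key: [u8; 32]) -> Upsert {
    let same_key = conns.get(&id).map(|c| c.preshared_key == preshared_key);

    match same_key {
        Some(true) => Upsert::Reused,
        Some(false) => {
            conns.insert(id, Connection { preshared_key });
            Upsert::Replaced
        }
        None => {
            conns.insert(id, Connection { preshared_key });
            Upsert::Created
        }
    }
}
```

With this shape, step 5 of the sequence above would yield `Upsert::Replaced` instead of silently reusing a connection that holds the old key.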
Thomas Eizinger	f55c61c7cb	fix(snownet): always update `last_activity` idle timer (#10000 ) Previously, our idle timer was only driven by incoming and outgoing packets. To detect whether the tunnel is idle, we checked whether either the last incoming or last outgoing packet was more than 20s ago. For one, having two timestamps here is unnecessarily complex. We can simply combine them and always update this timestamp as `last_activity`. Two, recently, we have started to also take into account not only packets but other changes to the tunnel, such as an upsert of the connection or adding a new candidate. What we failed to do though, is update these timestamps because their variable name was related to packets and not to any activity. The problem with not updating these timestamps however is that we will very quickly move out of "connected" back to "idle" because the old timestamps are still more than 20s ago. Hence, the previous fixes of moving out of idle on new candidates and connection upsert were ineffective. By combining and renaming the timestamps, it is now much more obvious that we need to update this timestamp in the respective handler functions which then grants us another 20s of non-idling. This is important for e.g. connection upserts to ensure the Gateway runs into an ICE timeout within a short amount of time, should there be something wrong with the connection that the Client just upserted.	2025-07-25 15:03:18 +00:00
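A minimal sketch of the single-timestamp idle timer described above. The 20s threshold comes from the commit message; the `Connection` struct and its method names are hypothetical and only illustrate that every kind of activity refreshes the same `last_activity` field.

```rust
use std::time::{Duration, Instant};

/// Idle threshold taken from the commit message.
const IDLE_TIMEOUT: Duration = Duration::from_secs(20);

/// Hypothetical connection state with one combined activity timestamp.
struct Connection {
    last_activity: Instant,
}

impl Connection {
    fn new(now: Instant) -> Self {
        Self { last_activity: now }
    }

    /// Incoming and outgoing packets count as activity.
    fn on_packet(&mut self, now: Instant) {
        self.last_activity = now;
    }

    /// So does an upsert of the connection ...
    fn on_upsert(&mut self, now: Instant) {
        self.last_activity = now;
    }

    /// ... and a newly added candidate.
    fn on_new_candidate(&mut self, now: Instant) {
        self.last_activity = now;
    }

    /// Idle once the last activity is more than 20s in the past.
    fn is_idle(&self, now: Instant) -> bool {
        now.duration_since(self.last_activity) > IDLE_TIMEOUT
    }
}
```

Because every handler writes the same field, an upsert at second 21 grants the connection another 20s before `is_idle` returns `true` again.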
Thomas Eizinger	d00c3b58cd	refactor(connlib): only enable `wire` logs in debug builds (#10002 ) As profiling shows, even if the log target isn't enabled, simply checking whether or not it is enabled is a significant performance hit. By guarding these behind `debug_assertions`, I was able to almost achieve 3.75 Gbits/s locally (when rebased onto #9998). Obviously, this doesn't quite translate into real-world improvements but it is nonetheless a welcome improvement. ``` Connecting to host 172.20.0.110, port 5201 [ 5] local 100.93.174.92 port 34678 connected to 172.20.0.110 port 5201 [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-1.00 sec 401 MBytes 3.37 Gbits/sec 14 644 KBytes [ 5] 1.00-2.00 sec 448 MBytes 3.76 Gbits/sec 3 976 KBytes [ 5] 2.00-3.00 sec 453 MBytes 3.80 Gbits/sec 43 979 KBytes [ 5] 3.00-4.00 sec 449 MBytes 3.77 Gbits/sec 21 911 KBytes [ 5] 4.00-5.00 sec 452 MBytes 3.79 Gbits/sec 4 1.15 MBytes [ 5] 5.00-6.00 sec 451 MBytes 3.78 Gbits/sec 81 1.01 MBytes [ 5] 6.00-7.00 sec 445 MBytes 3.73 Gbits/sec 39 705 KBytes [ 5] 7.00-8.00 sec 436 MBytes 3.66 Gbits/sec 3 1016 KBytes [ 5] 8.00-9.00 sec 460 MBytes 3.85 Gbits/sec 1 956 KBytes [ 5] 9.00-10.00 sec 453 MBytes 3.80 Gbits/sec 0 1.19 MBytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 4.34 GBytes 3.73 Gbits/sec 209 sender [ 5] 0.00-10.00 sec 4.34 GBytes 3.73 Gbits/sec receiver ``` I didn't want to remove the `wire` logs entirely because they are quite useful for debugging. However, they are also exactly this: A debugging tool. In a production build, we are very unlikely to turn these on which makes `debug_assertions` a good tool for keeping these around without interfering with performance.	2025-07-25 12:24:25 +00:00
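The `debug_assertions` guard can be sketched like this. It is an illustration only: `trace_packet` and the `Vec<String>` sink are hypothetical stand-ins for a `wire` log statement, and the example does not use the `tracing` crate that `connlib` relies on.

```rust
/// Hypothetical stand-in for a `wire` log statement on the packet hot path.
/// Returns `true` if the log statement was evaluated.
fn trace_packet(len: usize, sink: &mut Vec<String>) -> bool {
    // `cfg!(debug_assertions)` is a compile-time constant. In release builds
    // the condition is `false`, so the compiler can drop the whole branch,
    // including the "is this target enabled?" check a logger would perform.
    if cfg!(debug_assertions) {
        sink.push(format!("routing packet of {len} bytes"));
        return true;
    }

    false
}
```

In a debug build the call records one line; in a release build it compiles down to returning `false` without touching the sink.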
Thomas Eizinger	e5ee8e3572	fix(connlib): wait for sockets to be closed before rebinding (#9996 ) Our `ThreadedUdpSocket` uses a background thread for the actual socket operation. It merely represents a handle to send and receive from these sockets but not the socket itself. Dropping the handle will shut down the background thread but that is an asynchronous operation. In order to be sure that we can rebind the same port, we need to wait for the background thread to stop. We thus add a `Drop` implementation for the `ThreadedUdpSocket` that waits for its background thread to disappear before it continues. Resolves: #9992	2025-07-25 03:09:13 +00:00
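A minimal sketch of a handle whose `Drop` implementation waits for its background thread. `SocketHandle` is a hypothetical stand-in for `ThreadedUdpSocket`; the worker here owns no socket and only blocks on a channel, and the `stopped` flag exists purely so the behaviour can be observed.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{self, Sender};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical handle to a background worker thread.
struct SocketHandle {
    /// Dropping the sender tells the worker to stop.
    shutdown: Option<Sender<()>>,
    worker: Option<JoinHandle<()>>,
}

impl SocketHandle {
    fn spawn(stopped: Arc<AtomicBool>) -> Self {
        let (tx, rx) = mpsc::channel::<()>();

        let worker = thread::spawn(move || {
            // A real worker would own the UDP socket. Here we only block
            // until every sender is gone, then record that we have exited.
            while rx.recv().is_ok() {}
            stopped.store(true, Ordering::SeqCst);
        });

        Self {
            shutdown: Some(tx),
            worker: Some(worker),
        }
    }
}

impl Drop for SocketHandle {
    fn drop(&mut self) {
        // Close the channel first, otherwise `join` would block forever.
        drop(self.shutdown.take());

        // Wait until the worker (and anything it owns) is really gone.
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}
```

Once `drop(handle)` returns, the worker has finished, which is the guarantee needed before rebinding the same port.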
Thomas Eizinger	9133d46bbd	fix(snownet): don't log unknown packet for disconnected relay (#9961 ) Currently, packets for allocations, i.e. from relays are parsed inside the `Allocation` struct. We have one of those structs for each relay that `snownet` is talking to. When we disconnect from a relay because it is e.g. not responding, then we deallocate this struct. As a result, messages that arrive from this relay can no longer be handled. This can happen when the response time is longer than our timeout. These packets then fall-through and end up being logged as "packet has unknown format". To prevent this, we make the signature on `Allocation` strongly-typed and expect a fully parsed `Message` to be given to us. This allows us to parse the message early and discard it with a DEBUG log in case we don't have the necessary local state to handle it. The functionality here is essentially the same, we just change the level this is logged at from WARN to DEBUG. We have to make one additional adjustment to make this work: Guard all messages to be parsed by any `Allocation` to come from port 3478. This is the assigned port that all relays are expected to listen on. If we don't have any local state for a given address, we cannot decide whether it is a STUN message for an agent or a STUN message for a relay that we have disconnected from. Therefore, we need to de-multiplex based on the source port.	2025-07-25 00:32:43 +00:00
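The source-port de-multiplexing can be sketched as below. Port 3478 is taken from the commit message; the `Dispatch` enum, `dispatch_stun` and the set of connected relays are hypothetical names for this example and do not mirror `snownet`'s types.

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

/// Port that all relays are expected to listen on (from the commit message).
const RELAY_PORT: u16 = 3478;

/// Hypothetical routing decision for an already parsed STUN message.
#[derive(Debug, PartialEq)]
enum Dispatch {
    /// Hand the message to the allocation of a relay we are connected to.
    Allocation,
    /// From a relay we no longer track: drop it with a DEBUG log.
    DiscardAtDebug,
    /// Not from the relay port: let the ICE agent handle it.
    Agent,
}

fn dispatch_stun(from: SocketAddr, connected_relays: &HashSet<SocketAddr>) -> Dispatch {
    if from.port() != RELAY_PORT {
        return Dispatch::Agent;
    }

    if connected_relays.contains(&from) {
        Dispatch::Allocation
    } else {
        Dispatch::DiscardAtDebug
    }
}
```

The source port is the only signal available once the local state for a relay is gone, which is why the check comes before the lookup.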
Thomas Eizinger	aebfcd56eb	fix(connlib): resend candidates on connection upsert (#9986 ) Due to network partitions between the Client and the Portal, it is possible that a Client requests a new connection, then disconnects from the portal and re-requests the connection once it is reconnected. On the Gateway, we would have already authorized the first request and initialised our ICE agents with our local candidates. The second time around, the connection would be reused. The Client however has lost its state and therefore, we need to tell it our candidates again. --------- Signed-off-by: Thomas Eizinger <thomas@eizinger.io>	2025-07-24 21:01:50 +00:00
Thomas Eizinger	cbe114bddc	fix(connlib): clear join requests on reconnect (#9985 ) Room join requests on the portal are only valid whilst we have a WebSocket connection. To make sure the portal processes all our requests correctly, we need to hold all other messages back while we are waiting to join the room. If the connection flaps while we are waiting to join a room, we may have a lingering join request that never gets fulfilled and thus blocks the sending of messages forever. --------- Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>	2025-07-24 20:41:26 +00:00
Thomas Eizinger	f9721a1da6	fix(snownet): only idle when we are fully connected (#9987 ) Now that we are capable of migrating a connection to another relay with #9979, our test suite exposed an edge-case: If we are in the middle of migrating a connection, it could be that the idle timer triggers because we have not seen any application traffic in the last 20s. Moving to idle mode drastically reduces the number of STUN bindings we send and if this happens whilst we are still checking candidates, the nomination doesn't happen in time for our boringtun handshake to succeed. Thus, we add a condition to our idle timer to not trigger unless ICE has completed and reports us as `connected`.	2025-07-24 12:37:47 +00:00
Thomas Eizinger	d7b9ecb60b	feat(gateway): update expiry of access authorizations on init (#9975 ) Resolves: #9971	2025-07-24 06:36:56 +00:00
Thomas Eizinger	301d2137e5	refactor(windows): share src IP cache across UDP sockets (#9976 ) When looking through customer logs, we see a lot of "Resolved best route outside of tunnel" messages. Those get logged every time we need to rerun our re-implementation of Windows' weighting algorithm as to which source interface / IP a packet should be sent from. Currently, this gets cached in every socket instance so for the peer-to-peer socket, this is only computed once per destination IP. However, for DNS queries, we make a new socket for every query. Using a new source port for DNS queries is recommended to avoid fingerprinting of DNS queries. Using a new socket also means that we need to re-run this algorithm every time we make a DNS query which is why we see this log so often. To fix this, we need to share this cache across all UDP sockets. Cache invalidation is one of the hardest problems in computer science and this instance is no different. This cache needs to be reset every time we roam as that changes the weighting of which source interface to use. To achieve this, we extend the `SocketFactory` trait with a `reset` method. This method is called whenever we roam and can then reset a shared cache inside the `UdpSocketFactory`. The "source IP resolver" function that is passed to the UDP socket now simply accesses this shared cache and inserts a new entry when it needs to resolve the IP. As an added benefit, this may speed up DNS queries on Windows a bit (although I haven't benchmarked it). It should certainly drastically reduce the number of syscalls we make on Windows.	2025-07-24 01:36:53 +00:00
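A minimal sketch of a source-IP cache that is shared between sockets and cleared on roaming. `SrcIpCache` and `get_or_resolve` are hypothetical names; the real change extends the `SocketFactory` trait with `reset`, and the resolver closure stands in for the Windows route lookup.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::sync::{Arc, Mutex};

/// Hypothetical cache mapping a destination IP to the source IP to send from.
/// Cloning is cheap and every clone sees the same entries.
#[derive(Clone, Default)]
struct SrcIpCache {
    inner: Arc<Mutex<HashMap<IpAddr, IpAddr>>>,
}

impl SrcIpCache {
    /// Return the cached source IP or run the (expensive) resolver once.
    fn get_or_resolve(&self, dst: IpAddr, resolve: impl FnOnce(IpAddr) -> IpAddr) -> IpAddr {
        let mut map = self.inner.lock().unwrap();

        *map.entry(dst).or_insert_with(|| resolve(dst))
    }

    /// Called whenever we roam: the routing weights may have changed.
    fn reset(&self) {
        self.inner.lock().unwrap().clear();
    }
}
```

Each short-lived DNS socket would hold a clone of the cache, so only the first query to a destination pays for the route lookup until the next `reset`.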
Thomas Eizinger	d244a99c58	feat(connlib): always use all candidates (#9979 ) In #6876, we added functionality that would only make use of new remote candidates whilst we haven't nominated a socket yet with the remote. The reason for that was because in the described edge-case where relays reboot or get replaced whilst the client is partitioned from the portal (or we experience a connection hiccup), only one of the two peers, i.e. Client or Gateway would migrate to the new relay, leaving the other one in an inconsistent state. Looking at recent customer logs, I've been seeing a lot of these messages: > Unknown connection or socket has already been nominated For this particular customer, these are then very quickly followed by ICE timeouts, leaving the connection unusable. Considering that, I no longer think that the above change was a good idea and we should instead always make use of all candidates that we are given. What we are seeing is that in deployment scenarios where the latency link between Client and Gateway is very short (5-10ms) yet the latency to the portal is longer (~30-50ms), we trigger a race condition where we are temporarily nominating a _peer-reflexive_ candidate pair instead of a regular one. This happens because with such a short latency link, Client and Gateway are _faster_ in sending back and forth several STUN bindings than the control plane is in delivering all the candidates. Due to the functionality added in #6876, this then results in us not accepting the candidates. It further appears that a nominated peer-reflexive candidate does not provide a stable connection which is why we then run into an ICE timeout, requiring Firezone to establish a new connection only to have the same thing happen again. This is very disruptive for the user experience as the connection only works for a few moments at a time. With #9793, we have actually added a feature that is also at play here. 
Now that we don't immediately act on an ICE timeout, it is actually possible for both Client and Gateway to migrate a connection to a different relay, should the one that they are using get disconnected. In #9793, we added a timeout of 2s for this. To make this fully work, we need to patch str0m to transition to `Checking` early. Presently, str0m would directly transition from `Disconnected` to `Connected` in this case which in some of the high-latency scenarios that we are testing in CI is not enough to recover the connection within 2s. By transitioning to `Checking` early, we abort this timer. Related: https://github.com/algesten/str0m/pull/676	2025-07-24 01:35:54 +00:00
Thomas Eizinger	ecb2bbc86b	feat(gateway): allow updating expiry of access authorization (#9973 ) Resolves: #9966	2025-07-23 07:25:36 +00:00
Thomas Eizinger	fafe2c43ea	fix(connlib): update the current socket when in idle mode (#9977 ) In case we received a newly nominated socket from `str0m` whilst our connection was in idle mode, we mistakenly did not apply that and kept using the old one. ICE would still be functioning in this case because `str0m` would have updated its internal state but we would be sending packets into Nirvana. I don't think that this is likely to be hit in production though as it would be quite unusual to receive a new nomination whilst the connection was completely idle.	2025-07-23 05:28:21 +00:00
Thomas Eizinger	091d5b56e0	refactor(snownet): don't `memmove` every packet (#9907 ) When encrypting IP packets, `snownet` needs to prepare a buffer where the encrypted packet is going to end up. Depending on whether we are sending data via a relayed connection or direct, this buffer needs to be offset by 4 bytes to allow for the 4-byte channel-data header of the TURN protocol. At present, we always first encrypt the packet and then on-demand move the packet by 4 bytes to the left if we don't need to send it via a relay. Internally, this translates to a `memmove` call which actually turns out to be very cheap (I couldn't measure a speed difference between this and `main`). All of this code has grown historically though, so I figured it is better to clean it up a bit: we first evaluate whether we have a direct or relayed connection and based on that, write the encrypted packet directly to the front of the buffer or offset it by 4 bytes.	2025-07-23 00:38:39 +00:00
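The "decide first, then write at the right offset" approach can be sketched as follows. Everything here is illustrative: `encapsulate` is a hypothetical function, and the XOR loop is only a placeholder for the real encryption done by `boringtun`. The 4-byte header layout (2-byte channel number, 2-byte length, both big-endian) follows the TURN channel-data format.

```rust
/// Length of the TURN channel-data header.
const CHANNEL_DATA_HEADER_LEN: usize = 4;

/// Write the "encrypted" payload into `buf`, leaving room for the
/// channel-data header only when sending via a relay.
///
/// Panics if `buf` is too small. The XOR is a placeholder, not encryption.
fn encapsulate<'b>(payload: &[u8], relay_channel: Option<u16>, buf: &'b mut [u8]) -> &'b [u8] {
    // Decide on the offset up front so the packet never has to be moved.
    let offset = if relay_channel.is_some() {
        CHANNEL_DATA_HEADER_LEN
    } else {
        0
    };
    let end = offset + payload.len();

    for (dst, src) in buf[offset..end].iter_mut().zip(payload) {
        *dst = *src ^ 0xAA;
    }

    if let Some(channel) = relay_channel {
        buf[0..2].copy_from_slice(&channel.to_be_bytes());
        buf[2..4].copy_from_slice(&(payload.len() as u16).to_be_bytes());
    }

    &buf[..end]
}
```

For a direct connection the ciphertext starts at byte 0; for a relayed one the same ciphertext starts at byte 4 behind the header.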
Thomas Eizinger	3e6fc8fda7	refactor(rust): use spinlock-based buffer pool (#9951 ) Profiling has shown that using a spinlock-based buffer pool is marginally (~1%) faster than the mutex-based one because it resolves contention quicker.	2025-07-22 23:22:48 +00:00
Thomas Eizinger	6ae074005f	refactor(connlib): don't check for enabled event (#9950 ) Profiling has shown that checking whether the level is enabled is actually more expensive than checking whether the packet is a DNS packet. This improves performance by about 3%.	2025-07-22 17:41:45 +00:00
Thomas Eizinger	71e6b56654	feat(snownet): remove "connection ID" span (#9949 ) At present, `snownet` uses a `tracing::Span` to attach the connection ID to various log messages. This requires the span to be entered and exited on every packet. Whilst profiling Firezone, I noticed that it takes between 10% and 20% of CPU time on the main thread. Previously, this wasn't a bottleneck as other parts of Firezone were not yet as optimised. With some changes earlier this year of a dedicated UDP thread and better GSO, this does appear to be a bottleneck now. On `main`, I am currently getting the following numbers on my local machine: ``` Connecting to host 172.20.0.110, port 5201 [ 5] local 100.85.16.226 port 42012 connected to 172.20.0.110 port 5201 [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-1.00 sec 251 MBytes 2.11 Gbits/sec 16 558 KBytes [ 5] 1.00-2.00 sec 287 MBytes 2.41 Gbits/sec 6 800 KBytes [ 5] 2.00-3.00 sec 284 MBytes 2.38 Gbits/sec 2 992 KBytes [ 5] 3.00-4.00 sec 287 MBytes 2.41 Gbits/sec 3 1.12 MBytes [ 5] 4.00-5.00 sec 290 MBytes 2.44 Gbits/sec 0 1.27 MBytes [ 5] 5.00-6.00 sec 300 MBytes 2.52 Gbits/sec 2 1.40 MBytes [ 5] 6.00-7.00 sec 295 MBytes 2.47 Gbits/sec 2 1.52 MBytes [ 5] 7.00-8.00 sec 304 MBytes 2.55 Gbits/sec 3 1.63 MBytes [ 5] 8.00-9.00 sec 290 MBytes 2.44 Gbits/sec 49 1.21 MBytes [ 5] 9.00-10.00 sec 288 MBytes 2.41 Gbits/sec 24 1023 KBytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.81 GBytes 2.41 Gbits/sec 107 sender [ 5] 0.00-10.00 sec 2.81 GBytes 2.41 Gbits/sec receiver ``` With this patch applied, the throughput goes up significantly: ``` Connecting to host 172.20.0.110, port 5201 [ 5] local 100.85.16.226 port 41402 connected to 172.20.0.110 port 5201 [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-1.00 sec 315 MBytes 2.64 Gbits/sec 7 619 KBytes [ 5] 1.00-2.00 sec 363 MBytes 3.05 Gbits/sec 11 847 KBytes [ 5] 2.00-3.00 sec 379 MBytes 3.18 Gbits/sec 1 1.07 MBytes [ 5] 
3.00-4.00 sec 384 MBytes 3.22 Gbits/sec 44 981 KBytes [ 5] 4.00-5.00 sec 377 MBytes 3.16 Gbits/sec 116 911 KBytes [ 5] 5.00-6.00 sec 378 MBytes 3.17 Gbits/sec 3 1.10 MBytes [ 5] 6.00-7.00 sec 377 MBytes 3.16 Gbits/sec 48 929 KBytes [ 5] 7.00-8.00 sec 374 MBytes 3.14 Gbits/sec 151 947 KBytes [ 5] 8.00-9.00 sec 382 MBytes 3.21 Gbits/sec 36 833 KBytes [ 5] 9.00-10.00 sec 375 MBytes 3.14 Gbits/sec 1 1.06 MBytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 3.62 GBytes 3.11 Gbits/sec 418 sender [ 5] 0.00-10.00 sec 3.61 GBytes 3.10 Gbits/sec receiver ``` Resolves: #9948	2025-07-22 17:40:33 +00:00
Thomas Eizinger	4292ca7ae8	test(connlib): fix failing proptest (#9864 ) This essentially bumps just the boringtun dependency to include https://github.com/firezone/boringtun/pull/104.	2025-07-22 13:30:47 +00:00
Thomas Eizinger	35cd96b481	fix(phoenix-channel): fail connection in invalid peer cert (#9946 ) When being presented an invalid peer certificate, there is no reason why we should retry the connection, it is unlikely to fix itself. Plus, the certificate may get / be cached and a restart of the application is necessary. Resolves: #9944	2025-07-21 04:08:45 +00:00
Thomas Eizinger	318ce24403	fix(connlib): resend `AssignedIps` on traffic for DNS resource (#9904 ) This was exposed by #9846. It is being added here as a dedicated PR because the compatibility tests would fail or at least be flaky for the latest client release so we cannot add the integration test right away.	2025-07-19 05:26:41 +00:00
Thomas Eizinger	93ca701896	chore(snownet): check remote key and creds on connection upsert (#9902 )	2025-07-18 08:43:34 +00:00
Thomas Eizinger	c8760d87ae	chore(connlib): log remote address on decapsulation error (#9903 )	2025-07-18 07:48:41 +00:00
Thomas Eizinger	3e71a91667	feat(gateway): revoke unlisted authorizations upon `init` (#9896 ) When receiving an `init` message from the portal, we will now revoke all authorizations not listed in the `authorizations` list of the `init` message. We (partly) test this by introducing a new transition in our proptests that de-authorizes a certain resource whilst the Gateway is simulated to be partitioned. It is difficult to test that we cannot make a connection once that has happened because we would have to simulate a malicious client that knows about resources / connections or ignores the "remove resource" message. Testing this is deferred to a dedicated task. We do test that we hit the code path of revoking the resource authorization and because the other resources keep working, we also test that we are at least not revoking the wrong ones. Resolves: #9892	2025-07-17 19:04:54 +00:00
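Revoking everything that the `init` message does not list can be sketched with a `retain` over the active authorizations. The key and value types are assumptions for the example: a `(ClientId, ResourceId)` pair as key and a `u64` expiry as value are hypothetical and do not reflect the Gateway's real data model.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical identifiers for the example.
type ClientId = u32;
type ResourceId = u32;

/// Remove every authorization that is not in `listed` and return the
/// revoked keys (sorted, so the result is deterministic).
fn revoke_unlisted(
    active: &mut HashMap<(ClientId, ResourceId), u64>,
    listed: &HashSet<(ClientId, ResourceId)>,
) -> Vec<(ClientId, ResourceId)> {
    let mut revoked: Vec<_> = active
        .keys()
        .filter(|key| !listed.contains(*key))
        .copied()
        .collect();
    revoked.sort();

    // Keep only what the portal still lists.
    active.retain(|key, _| listed.contains(key));

    revoked
}
```

Authorizations that remain listed are left untouched, which matches the property the proptests check: the other resources keep working.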
Thomas Eizinger	a6ffdd2654	feat(snownet): reduce rekey-attempt-time to 15s (#9891 ) From Sentry reports and user-submitted logs, we know that it is possible for Client and Gateway to de-sync in regards to what each other's public key is. In such a scenario, ICE will succeed in making a connection but `boringtun` will fail to handshake a tunnel. By default, `boringtun` tries for 90s to handshake a session before it gives up and expires it. In Firezone, the ICE agent takes care of establishing connectivity whereas `boringtun` itself just encrypts and decrypts packets. As such, if ICE is working, we know that packets aren't getting lost but instead, there must be some other issue as to why we cannot establish a session. To improve the UX in these error cases, we reduce the rekey-attempt-time to 15s. This roughly matches our ICE timeout. Those 15s count from the moment we send the first handshake which is just after ICE completes. Thus we can be sure that after at most 15s, we either have a working WireGuard session or the connection gets cleaned up. Related: #9890 Related: #9850	2025-07-17 00:50:31 +00:00
Thomas Eizinger	116b518700	fix(snownet): discard channel-data messages from old allocations (#9885 ) When we invalidate or discard an allocation, it may happen that a relay still sends channel-data messages to us. We don't recognize those and will therefore attempt to parse them as WireGuard packets, ultimately ending in a "Packet has unknown format" error. To avoid this, we check if the packet is a valid channel-data message even if we presently don't have an allocation on the relay that is sending us the packet. In those cases, we can stop processing the packet, thus avoiding these errors from being logged.	2025-07-16 05:57:44 +00:00
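A minimal sketch of a stateless "is this a channel-data message?" check. The function name is hypothetical and the exact validation in `snownet` may differ. The layout (2-byte channel number, 2-byte payload length, both big-endian) and the channel range 0x4000 to 0x7FFF follow RFC 5766; RFC 8656 narrows the range to 0x4000 to 0x4FFF, so the upper bound here is an assumption.

```rust
/// Length of the TURN channel-data header.
const HEADER_LEN: usize = 4;

/// Returns `true` if `packet` is shaped like a TURN channel-data message.
/// This needs no allocation state, so it also works for relays we have
/// already disconnected from.
fn looks_like_channel_data(packet: &[u8]) -> bool {
    if packet.len() < HEADER_LEN {
        return false;
    }

    let channel = u16::from_be_bytes([packet[0], packet[1]]);
    let length = usize::from(u16::from_be_bytes([packet[2], packet[3]]));

    // Channel numbers as allowed by RFC 5766 (assumption: the wider range).
    let valid_channel = (0x4000..=0x7FFF).contains(&channel);
    // The declared payload must fit into what we actually received.
    let fits = packet.len() - HEADER_LEN >= length;

    valid_channel && fits
}
```

A WireGuard packet starts with a small message type byte (1 to 4), so it never falls into the channel range and continues down the normal parsing path.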
Thomas Eizinger	29f81c64ff	fix(snownet): wake idle connection on upsert (#9879 ) When a connection is in idle-mode, it only sends a STUN request every 25 seconds. If the Client disconnects e.g. due to a network partition, it may send a new connection intent later. If the Gateway's connection is still around then because it was in idle mode, it won't send any candidates to the remote, making the Client's connection fail with "no candidates received". To alleviate this, we wake a connection out of idle mode every time it is being upserted. This ensures that the connection will fail within 15s IF the above scenario happens, allowing the Client to reconnect within a much shorter time-frame. Note that attempting to repair such a connection is likely pointless. It is much safer to discard it and let them both establish a new connection. Related: #9862	2025-07-15 14:16:27 +00:00
Thomas Eizinger	0f1c5f2818	refactor(relay): simplify auth module (#9873 ) Whilst looking through the auth module of the relay, I noticed that we unnecessarily convert back and forth between expiry timestamps and username formats when we could just be using the already parsed version.	2025-07-15 14:14:51 +00:00
Thomas Eizinger	ffcb269c8b	chore(connlib): add "wake reason" to `poll_timeout` (#9876 ) In order to debug timer interactions, it is useful to know when and why connlib wants to be woken to perform tasks.	2025-07-15 13:58:06 +00:00
Thomas Eizinger	5141817134	feat(connlib): add `reason` argument to `reset` API (#9878 ) In order to provide more detailed logs as to why `connlib`'s network state is being reset, we add a `reason` parameter that gets logged. Resolves: #9867	2025-07-15 13:48:33 +00:00
Thomas Eizinger	f5425ac8e4	fix(snownet): fail connection on handshake decryption errors (#9850 ) As per the WireGuard paper, `boringtun` tries to handshake with the remote peer for 90s before it gives up. This timeout is important because when a session is discarded due to e.g. missing replies, WireGuard attempts to handshake a new session. Without this timeout, we would then try to handshake a session forever. Unfortunately, `boringtun` does not distinguish a missing handshake response from a bad one. Decryption errors whilst decoding a handshake response are simply passed up to the upper layer, in our case `snownet`. I am not sure how we can actually fail to decrypt a handshake but the pattern we are seeing in customer logs is that this happens over and over again, so there is no point in having `boringtun` retry the handshake. Therefore, we immediately fail the connection when this happens. Failed connections are immediately removed, triggering the Client to send a new connection-intent to the portal. Such a new connection intent will then sync-up the state between Client and Gateway so both of them use the most recent public key. Resolves: #9845	2025-07-14 13:22:23 +00:00
Thomas Eizinger	66455ab0ef	feat(gateway): translate TimeExceeded ICMP messages (#9812 ) In the DNS resource NAT table, we track parts of the layer 4 protocol of the connection in order to map packets back to the correct proxy IP in case multiple DNS names resolve to the same real IP. The involvement of layer 4 means we need to perform some packet inspection in case we receive ICMP errors from an upstream router. Presently, the only ICMP error we handle here is destination unreachable. Those are generated e.g. when we are trying to contact an IPv6 address but we don't have an IPv6 egress interface. An additional error that we want to handle here is "time exceeded": Time exceeded is sent when the TTL of a packet reaches 0. Typically, TTLs are set high enough such that the packet makes it to its destination. When using tools such as `tracepath` however, the TTL is specifically only incremented one-by-one in order to resolve the exact hops a packet is taking to a destination. Without handling the time exceeded ICMP error, using `tracepath` through Firezone is broken because the packets get dropped at the DNS resource NAT. 
With this PR, we generalise the functionality of detecting destination unreachable ICMP errors to also handle time-exceeded errors, allowing tools such as `tracepath` to somewhat work: ``` ❯ sudo docker compose exec --env RUST_LOG=info -it client /bin/sh -c 'tracepath -b example.com' 1?: [LOCALHOST] pmtu 1280 1: 100.82.110.64 (100.82.110.64) 0.795ms 1: 100.82.110.64 (100.82.110.64) 0.593ms 2: example.com (100.96.0.1) 0.696ms asymm 45 3: example.com (100.96.0.1) 5.788ms asymm 45 4: example.com (100.96.0.1) 7.787ms asymm 45 5: example.com (100.96.0.1) 8.412ms asymm 45 6: example.com (100.96.0.1) 9.545ms asymm 45 7: example.com (100.96.0.1) 7.312ms asymm 45 8: example.com (100.96.0.1) 8.779ms asymm 45 9: example.com (100.96.0.1) 9.455ms asymm 45 10: example.com (100.96.0.1) 14.410ms asymm 45 11: example.com (100.96.0.1) 24.244ms asymm 45 12: example.com (100.96.0.1) 31.286ms asymm 45 13: no reply 14: example.com (100.96.0.1) 303.860ms asymm 45 15: no reply 16: example.com (100.96.0.1) 135.616ms (This broken router returned corrupted payload) asymm 45 17: no reply 18: example.com (100.96.0.1) 161.647ms asymm 45 19: no reply 20: no reply 21: no reply 22: example.com (100.96.0.1) 238.066ms reached Resume: pmtu 1280 hops 22 back 45 ``` We say "somewhat work" because due to the NAT that is in place for DNS resources, the output does not disclose the intermediary hops beyond the Gateway. --------- Co-authored-by: Antoine Labarussias <antoinelabarussias@gmail.com>	2025-07-12 21:09:48 +00:00
Thomas Eizinger	16facd394e	chore(rust): bump str0m (#9852 ) The latest version of str0m includes a fix for a bug that would result in an immediate ICE timeout if a remote candidate was added prior to a local candidate. We mitigated this in #9793 to make Firezone overall more resilient towards sudden changes in the ICE connection state. As a defense-in-depth measure, we also fixed this issue in str0m by not transitioning to `Disconnected` if we haven't even formed any candidate pairs yet. Diff: `2153bf0385...3d6e3d2f27`	2025-07-12 20:55:07 +00:00
Thomas Eizinger	47c9922131	test(connlib): don't attempt to listen on port 0 for TCP socket (#9851 )	2025-07-12 14:29:34 +00:00
Thomas Eizinger	d6805d7e48	chore(rust): bump to Rust 1.88 (#9714 ) Rust 1.88 has been released and brings with it a quite exciting feature: let-chains! It allows us to mix-and-match `if` and `let` expressions, therefore often reducing the "right-drift" of the relevant code, making it easier to read. Rust 1.88 also comes with a new clippy lint that warns when creating a mutable reference from an immutable pointer. Attempting to fix this revealed that this is exactly what we are doing in the eBPF kernel. Unfortunately, it doesn't seem to be possible to design this in a way that is both accepted by the borrow-checker AND by the eBPF verifier. Hence, we simply make the function `unsafe` and document for the programmer what needs to be upheld.	2025-07-12 06:42:50 +00:00
Thomas Eizinger	55eaa7cdc7	test(connlib): establish real TCP connections in proptests (#9814 ) With this patch, we sample a list of DNS resources on each test run and create a "TCP service" for each of their addresses. Using this list of resources, we then change the `SendTcpPayload` transition to `ConnectTcp` and establish TCP connections using `smoltcp` to these services. For now, we don't send any data on these connections but we do set the keep-alive interval to 5s, meaning `smoltcp` itself will keep these connections alive. We also set the timeout to 30s and after each transition in a test-run, we assert that all TCP sockets are still in their expected state: - `ESTABLISHED` for most of them. - `CLOSED` for all sockets where we ended up sampling an IPv4 address but the DNS resource only supports IPv6 addresses (or vice-versa). In these cases, we use the ICMP error sent by the Gateway to assert that the socket is `CLOSED`. Unfortunately, `smoltcp` currently does not handle ICMP messages for its sockets, so we have to call `abort` ourselves. Overall, this should assert that regardless of whether we roam networks, switch relays or do other kinds of stuff with the underlying connection, the tunneled TCP connection stays alive. In order to make this work, I had to tweak the timeouts when we are on-demand refreshing allocations. This only happens in one particular case: When we are being given new relays by the portal, we refresh all _other_ relays to make sure they are still present. In other words, all relays that we didn't remove and didn't just add but still had in-memory are refreshed. This is important for cases where we are network-partitioned from the portal whilst relays are deployed or reset their state otherwise. Instead of the previous 8s max elapsed time of the exponential backoff like we have it for other requests, we now only use a single message with a 1s timeout there. 
With the increased ICE timeout of 15s, a TCP connection with a 30s timeout would otherwise not survive such an event. This is because it takes the above mentioned 8s for us to remove a non-functioning relay, all whilst trying to establish a new connection (which also incurs its own ICE timeout then). With the reduced timeout on the on-demand refresh of 1s, we detect the disappeared relay much quicker and can immediately establish a new connection via one of the new ones. As always with reduced timeouts, this can create false-positives if the relay doesn't reply within 1s for some reason. Resolves: #9531	2025-07-11 15:10:22 +00:00
Thomas Eizinger	520dd0aa31	feat(gateway): respond with ICMP error for filtered packets (#9816 ) When defining a resource, a Firezone admin can define traffic filters to only allow traffic on certain TCP and/or UDP ports and/or restrict traffic on the ICMP protocol. Presently, when a packet is filtered out on the Gateway, we simply drop it. Dropping packets means the sending application can only react to timeouts and has no other means of error handling. ICMP was conceived to deal with these kinds of situations. In particular, the "destination unreachable" type has a dedicated code for filtered packets: "Communication administratively prohibited". Instead of just dropping the not-allowed packet, we now send back an ICMP error with this particular code set, thus informing the sending application that the packet did not get lost but was in fact not routed for policy reasons. When setting a traffic filter that does not allow TCP traffic, attempting to `curl` such a resource now results in the following: ``` ❯ sudo docker compose exec --env RUST_LOG=info -it client /bin/sh -c 'curl -v example.com' * Host example.com:80 was resolved. * IPv6: fd00:2021:1111:8000::, fd00:2021:1111:8000::1, fd00:2021:1111:8000::2, fd00:2021:1111:8000::3 * IPv4: 100.96.0.1, 100.96.0.2, 100.96.0.3, 100.96.0.4 * Trying [fd00:2021:1111:8000::]:80... * connect to fd00:2021:1111:8000:: port 80 from fd00:2021:1111::1e:7658 port 34560 failed: Permission denied * Trying [fd00:2021:1111:8000::1]:80... * connect to fd00:2021:1111:8000::1 port 80 from fd00:2021:1111::1e:7658 port 34828 failed: Permission denied * Trying [fd00:2021:1111:8000::2]:80... * connect to fd00:2021:1111:8000::2 port 80 from fd00:2021:1111::1e:7658 port 44314 failed: Permission denied * Trying [fd00:2021:1111:8000::3]:80... * connect to fd00:2021:1111:8000::3 port 80 from fd00:2021:1111::1e:7658 port 37628 failed: Permission denied * Trying 100.96.0.1:80... 
* connect to 100.96.0.1 port 80 from 100.66.110.26 port 53780 failed: Host is unreachable * Trying 100.96.0.2:80... * connect to 100.96.0.2 port 80 from 100.66.110.26 port 60748 failed: Host is unreachable * Trying 100.96.0.3:80... * connect to 100.96.0.3 port 80 from 100.66.110.26 port 38378 failed: Host is unreachable * Trying 100.96.0.4:80... * connect to 100.96.0.4 port 80 from 100.66.110.26 port 49866 failed: Host is unreachable * Failed to connect to example.com port 80 after 9 ms: Could not connect to server * closing connection #0 curl: (7) Failed to connect to example.com port 80 after 9 ms: Could not connect to server ```	2025-07-11 13:54:41 +00:00
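As a rough illustration of the error described in this commit, the sketch below assembles an ICMPv4 "destination unreachable" message with code 13 ("Communication administratively prohibited", RFC 1812). The function names are hypothetical and this is not Firezone's implementation; the caller is assumed to pass the portion of the offending datagram that should be echoed back (its IP header plus leading payload bytes). The ICMPv6 counterpart is type 1 with code 1 and additionally needs a pseudo-header checksum, which is omitted here.

```rust
const ICMPV4_DEST_UNREACHABLE: u8 = 3;
const ICMPV4_CODE_ADMIN_PROHIBITED: u8 = 13;

/// Internet checksum (RFC 1071): one's complement of the one's-complement sum
/// of all 16-bit words, with an odd trailing byte padded with zero.
fn internet_checksum(data: &[u8]) -> u16 {
    let mut sum: u32 = 0;
    for chunk in data.chunks(2) {
        let word = match chunk {
            [hi, lo] => u16::from_be_bytes([*hi, *lo]),
            [hi] => u16::from_be_bytes([*hi, 0]),
            _ => unreachable!("chunks(2) yields one or two bytes"),
        };
        sum += u32::from(word);
    }
    // Fold the carries back into the lower 16 bits.
    while (sum >> 16) != 0 {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    !(sum as u16)
}

/// Build the ICMPv4 message (without the outer IP header) for a filtered packet.
/// `original` is the part of the rejected datagram to include in the error.
fn admin_prohibited_icmpv4(original: &[u8]) -> Vec<u8> {
    // Layout: type (1), code (1), checksum (2), unused (4), then the original bytes.
    let mut msg = vec![
        ICMPV4_DEST_UNREACHABLE,
        ICMPV4_CODE_ADMIN_PROHIBITED,
        0, 0, // checksum placeholder
        0, 0, 0, 0, // unused
    ];
    msg.extend_from_slice(original);
    let checksum = internet_checksum(&msg);
    msg[2..4].copy_from_slice(&checksum.to_be_bytes());
    msg
}
```

Recomputing the checksum over a finished message yields zero, which is a convenient way to sanity-check the result.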
Thomas Eizinger	06f703a0b5	feat(telemetry): log use of `map-enobufs-to-wouldblock` (#9829 ) In order to better track how well our `ENOBUFS` mitigation is working, we should log the use of our feature flag to PostHog. This will give us some stats on how often this is happening. That, combined with the lack of error reports, should give us good confidence in permanently enabling this behaviour.	2025-07-11 13:32:11 +00:00
Thomas Eizinger	9c4e71a68f	chore(connlib): improve error message for filtered packets (#9833 ) When a packet gets filtered because we are unable to evaluate the source protocol (i.e. TCP/UDP/ICMP), the current error message misleadingly says that the packet got filtered because the protocol is not supported. The truth however is that we were never able to apply the filter in the first place. This is a subtle difference that is quite important when debugging filtered packets. To improve this, we add an error message to the stack here.	2025-07-11 13:24:55 +00:00
Thomas Eizinger	8e5ce66810	feat(gateway): don't apply traffic filters to ICMP errors (#9834 ) Firezone uses ICMP errors to signal to client applications that e.g. a certain IP is not reachable. This happens for example if a DNS resource only resolves to IPv4 addresses yet the client application attempted to use an IPv6 proxy address to connect to it. In the presence of traffic filters for such a resource that does _not_ allow ICMP, we currently filter out these ICMP errors because - well - ICMP traffic is not allowed! However, even in the presence of ICMP traffic being allowed, we would fail to evaluate this filter because the ICMP error packet is not an ICMP echo reply and therefore doesn't have an ICMP identifier. We require this in the DNS resource NAT to identify "connections" and NAT them correctly. The same L4 component is used to evaluate the traffic filters. ICMP errors are critical to many usage scenarios and algorithms like happy-eyeballs. Dropping them usually results in weird behaviour as client applications can then only react to timeouts.	2025-07-11 13:20:37 +00:00
Thomas Eizinger	13c8c70750	fix(connlib): treat `ENOBUFS` as `EWOULDBLOCK` (#9798 ) Socket APIs across operating systems vary in how they handle back-pressure. In most cases, a non-blocking socket should return `EWOULDBLOCK` when it cannot send a given datagram and would have to block to wait for resources to free up. It appears that macOS doesn't always behave like that. In particular, we are seeing error logs from a few users where sending a datagram fails with > No buffer space available (os error 55) Digging through `libc`, I've found that this error is known as `ENOBUFS` [0]. There are reports on the Apple developer forum [1] that recommend retrying when this error happens. It is however unclear as to whether it is entirely safe to map this error to `EWOULDBLOCK`. Other non-blocking event-loop implementations [2] appear to do that but we don't know whether it is fully correct. At present, Firezone's behaviour here is to drop the packet. This means the host networking stack has to fall-back to running into a timeout and re-send the packet. This very likely negatively impacts the UX for the users hitting this. In order to validate this assumption, we implement a feature-flag. This allows us to ship this code but switch back to the old behaviour, should it negatively impact how Firezone behaves. In particular, if the assumption that mapping `ENOBUFS` to `EWOULDBLOCK` is safe turns out wrong and `kqueue` does in fact not signal readiness when more buffers are available, then we may have missing wake-ups which would lead to a further delay in datagrams being sent. [0]: `8e6f36c6ba/src/unix/bsd/apple/mod.rs (L2998)` [1]: https://developer.apple.com/forums/thread/42334 [2]: `aac866f399/src/unix/stream.c (L820)`	2025-07-10 17:51:16 +00:00
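The mapping described in this commit could look roughly like the following sketch, which uses only the standard library. The function name and the `flag_enabled` parameter are hypothetical stand-ins for the feature flag; the constant 55 is taken from the "os error 55" message quoted above and applies to macOS only (other platforms number `ENOBUFS` differently).

```rust
use std::io;

/// `ENOBUFS` as reported on macOS ("os error 55" in the commit message).
/// Assumption: this sketch hard-codes the macOS value instead of using `libc`.
const ENOBUFS_MACOS: i32 = 55;

/// Turn `ENOBUFS` into a `WouldBlock` error so the caller retries the send
/// once the socket signals readiness, instead of dropping the datagram.
/// With the flag disabled, the original error is passed through unchanged.
fn map_enobufs_to_wouldblock(err: io::Error, flag_enabled: bool) -> io::Error {
    if flag_enabled && err.raw_os_error() == Some(ENOBUFS_MACOS) {
        return io::Error::from(io::ErrorKind::WouldBlock);
    }
    err
}
```

Keeping the flag check next to the mapping makes it easy to fall back to the previous drop-the-packet behaviour if the missed wake-up concern turns out to be real.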
Thomas Eizinger	7689402c50	chore(snownet): print packets of unknown format (#9818 ) When receiving UDP packets that we cannot decode we log an error. In order to identify, whether we might have bugs in our decoding logic, we now also print the hex-encoding of the packet for further analysis on DEBUG.	2025-07-10 15:11:54 +00:00
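For illustration, hex-encoding an undecodable packet for a debug log can be done with the standard library alone. The helper name is hypothetical; the commit does not say how the encoding is produced in `snownet`, and it may well use a crate instead.

```rust
use std::fmt::Write;

/// Render a packet as lowercase hex, two characters per byte.
fn to_hex(packet: &[u8]) -> String {
    let mut out = String::with_capacity(packet.len() * 2);
    for byte in packet {
        // Writing into a `String` is infallible, so the `expect` never triggers.
        write!(out, "{byte:02x}").expect("writing to a String cannot fail");
    }
    out
}
```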
Thomas Eizinger	0c151a2a96	chore(gateway): include ID of unknown peer in error message (#9819 ) This will help with diagnosing issues in Sentry.	2025-07-10 14:32:05 +00:00
Thomas Eizinger	f98fcca542	refactor(connlib): directly implement `async fn` (#9806 ) At present, and as a result of how `connlib` evolved, we still implement a `Poll`-based function for receiving data on our UDP socket. Ever since we moved to dedicated threads for the UDP socket, we can directly "block" on receiving datagrams and don't have to poll the socket. This simplifies the implementation a fair bit. Additionally, it made me realise that we currently don't expose any errors on the UDP socket. Likely, those will be ephemeral but exposing them is still better than completely silencing them.	2025-07-10 13:54:44 +00:00
Thomas Eizinger	237bd62b20	fix(snownet): don't generate candidates of mixed IP version (#9804 ) When we shipped the feature of optimistic server-reflexive candidates, we failed to add a check to only combine address and base such that they are the same IP version. This is not harmful but unnecessary noise.	2025-07-07 22:47:40 +00:00
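The missing check amounts to comparing the IP version of the address and the base before combining them. A minimal sketch, assuming both are available as `std::net::SocketAddr` values; the function names are hypothetical and the actual candidate construction in `snownet` is not shown in the commit message:

```rust
use std::net::SocketAddr;

/// True if both socket addresses use the same IP version.
fn same_ip_version(addr: SocketAddr, base: SocketAddr) -> bool {
    addr.is_ipv4() == base.is_ipv4()
}

/// Combine every candidate address with every base, skipping mixed-version pairs.
fn srflx_pairs(addrs: &[SocketAddr], bases: &[SocketAddr]) -> Vec<(SocketAddr, SocketAddr)> {
    addrs
        .iter()
        .flat_map(|addr| bases.iter().map(move |base| (*addr, *base)))
        .filter(|(addr, base)| same_ip_version(*addr, *base))
        .collect()
}
```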
Thomas Eizinger	e5fb6adbb4	fix(connlib): always signal server-reflexive candidates (#9802 ) When we create a new connection, we seed the local ICE agent with all known local candidates, i.e. host addresses and allocations on relays. Server-reflexive candidates are never added to the local agent because you cannot send directly from a server-reflexive address. Instead, an agent sends from the _base_ of a server-reflexive candidate which in turn is known as a host candidate. The server-reflexive candidate is however signaled to the remote so it can try and send packets to it. Those will then be mapped by the NAT to our host candidate. In case we have just performed a network reset, our own server-reflexive candidate may not be known yet and therefore the seeding doesn't add any candidates. With no candidates being seeded, we also can't signal them to the remote. For candidates discovered later in this process, the signalling happens as part of adding them to the local agent. Because server-reflexive candidates are not added to the local agent, we currently miss out on signaling those to the remote IF they weren't already present when the ICE agent got created. This scenario can happen right after a network reset. In practice, it shouldn't be much of an issue though. As soon as we start sending from our host candidate, the remote will create a peer-reflexive candidate for it. It is however cleaner to directly send the server-reflexive candidate once we discover it.	2025-07-07 22:46:46 +00:00

1173 Commits