Commit Graph

626 Commits

Author SHA1 Message Date
Jamil
5cb6d278d1 chore(deps): Bump next-hubspot to 2.0.0 (#9220)
This dependency had some breaking changes for 2.0.0 which required
updating some variable names from imports.

Supersedes #8991
2025-05-24 22:01:34 +00:00
Jamil
a7054b8f40 ci: Bump apple to 1.4.15 (#9217) 2025-05-24 12:51:27 +00:00
Thomas Eizinger
67d11b1e01 fix(gui-client): don't reset favourites when settings change (#9211)
The GUI client currently has a bug that resets the favourites and the
status of the Internet Resource every time the advanced settings are
saved. This happens because those fields are annotated with
`#[serde(default)]` and are thus initialised to their default value when
the struct is deserialised from the frontend.

To mitigate this, we introduce a new `GeneralSettings` struct that holds
the status of the Internet Resource and the list of favourites. When a
client starts up, it will try to migrate the existing advanced settings
into the new split of general and advanced settings.
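
A minimal sketch of what such a migration could look like. All struct and field names other than `GeneralSettings` are hypothetical and chosen for illustration only; the real settings structs likely differ:

```rust
/// Hypothetical legacy shape: everything lived in one struct.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct LegacyAdvancedSettings {
    pub api_url: String,
    pub log_filter: String,
    pub favourite_resources: Vec<String>,
    pub internet_resource_enabled: Option<bool>,
}

/// Settings the frontend edits and saves (hypothetical fields).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct AdvancedSettings {
    pub api_url: String,
    pub log_filter: String,
}

/// State that must survive saving the advanced settings.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct GeneralSettings {
    pub favourite_resources: Vec<String>,
    pub internet_resource_enabled: Option<bool>,
}

/// Splits the legacy struct into the two new ones without losing data.
pub fn migrate(legacy: LegacyAdvancedSettings) -> (GeneralSettings, AdvancedSettings) {
    let general = GeneralSettings {
        favourite_resources: legacy.favourite_resources,
        internet_resource_enabled: legacy.internet_resource_enabled,
    };
    let advanced = AdvancedSettings {
        api_url: legacy.api_url,
        log_filter: legacy.log_filter,
    };
    (general, advanced)
}
```

Because the frontend only ever deserialises `AdvancedSettings` in this sketch, a missing field there can no longer reset the favourites.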
2025-05-23 17:39:58 +00:00
Jamil
c61f3ed238 docs: Update Apple changelog with recent additions (#9207) 2025-05-22 21:31:22 +00:00
Jamil
06551c80c8 fix(website): Use dummy when mixpanel token is blank (#9209)
This fixes an issue on dev if the `NEXT_PUBLIC_MIXPANEL_TOKEN` env var
is not available.
2025-05-22 21:08:11 +00:00
Jamil
af7eaa8cc9 chore: release GUI client 1.4.14 (#9197) 2025-05-21 23:23:45 +00:00
Thomas Eizinger
042d03af2a feat(gui-client): polish Linux bundling (#9181)
Tauri's `deb` and `rpm` bundlers have support for configuring maintainer
scripts. We can therefore just use those instead of tearing apart the
`deb` file that it creates and rebuilding it ourselves.

Our `rpm` packaging is currently completely broken as well. I couldn't
get it to work on CentOS 9 at all due to missing dependencies, likely
introduced by our move to Tauri v2. It installs fine on CentOS 10
though, assuming that the user has the EPEL repository installed which
provides the WebView dependency. I extended the docs to reflect this.

Hence, with this PR, we drop support for CentOS 9 and now require CentOS
10. This allows us to remove a lot of cruft from our bundling process
and instead entirely rely on the Tauri provided bundler.

Lastly, for consistency with other platforms, the name of the
application in places like app drawers has been changed from "Firezone
Client" to just "Firezone".

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
2025-05-20 15:34:16 +00:00
Thomas Eizinger
1bdba3601a feat(gui-client): rename IPC service to Tunnel service (#9154)
The name IPC service is not very descriptive. By nature of being
separate processes, we need to use IPC to communicate between them. The
important thing is that the service process has control over the tunnel.
Therefore, we rename everything to "Tunnel service".

The only part that is not changed are historic changelog entries.

Resolves: #9048
2025-05-19 09:52:06 +00:00
Thomas Eizinger
7f4b20ab7f feat(gui-client): show "Welcome" screen on 2nd app launch (#9136)
Resolves: #8352.
2025-05-15 08:20:24 +00:00
Thomas Eizinger
ce06996a14 fix(connlib): allow more than one host candidate per IP version (#9147)
Currently, on machines that have multiple routable egress interfaces,
`connlib` may bounce between the two instead of settling on one. This
happens because we have a dedicated `CandidateSet` that we use to filter
out "duplicate" candidates of the same type. Doing that is important
because if the other party is behind a symmetric NAT, they will send us
many server-reflexive candidates that all only differ by their port,
none of them will actually be routable though.

To prevent sending many of these candidates to the remote, we first
gather them locally in our `CandidateSet` and de-duplicate them.
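
A rough sketch of such a de-duplicating set, under the assumption that host candidates are kept per socket address while server-reflexive candidates are collapsed per IP. The `Kind` and `Candidate` types here are hypothetical stand-ins, not `connlib`'s or `str0m`'s actual types:

```rust
use std::collections::HashSet;
use std::net::{IpAddr, SocketAddr};

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Kind {
    Host,
    ServerReflexive,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Candidate {
    pub kind: Kind,
    pub addr: SocketAddr,
}

#[derive(Default)]
pub struct CandidateSet {
    /// Every distinct host address is kept, so several interfaces can coexist.
    hosts: HashSet<SocketAddr>,
    /// Server-reflexive candidates are tracked by IP only, ignoring the port.
    server_reflexive_ips: HashSet<IpAddr>,
}

impl CandidateSet {
    /// Returns `true` if the candidate is new and should be sent to the remote.
    pub fn insert(&mut self, candidate: Candidate) -> bool {
        match candidate.kind {
            Kind::Host => self.hosts.insert(candidate.addr),
            Kind::ServerReflexive => self.server_reflexive_ips.insert(candidate.addr.ip()),
        }
    }
}
```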
2025-05-15 02:08:53 +00:00
Jamil
13a26326af fix(website): Remove dupe changelog entry (#9138) 2025-05-14 16:37:26 +00:00
Thomas Eizinger
b7451fcdae chore: release Gateway 1.4.9 (#9132) 2025-05-14 06:39:03 +00:00
Thomas Eizinger
a7ef588d86 chore: release headless client 1.4.8 (#9131) 2025-05-14 06:17:29 +00:00
Thomas Eizinger
5a4e72954f chore: release GUI client 1.4.13 (#9130) 2025-05-14 06:09:01 +00:00
Jamil
f19702f53e feat(apple): Allow user-configurable account slug (#9119)
Now that configuration is persisted in a more reasonable fashion, we can
expose a new `General` section to the Settings, allowing the user to
configure an account slug.

This field will automatically be populated upon the first sign in, so
that subsequent sign-ins will take the user directly to the account sign
in page.


Fixes #5119 
Related #8919

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-14 04:17:29 +00:00
Jamil
53b505e748 chore(website): Remove outdated battlecard (#9117)
This is outdated and probably doesn't send a strong message in its
current form.
2025-05-13 15:41:06 +00:00
Thomas Eizinger
407a67cb40 docs: add changelog entries for several issues (#9113)
As part of going through the changes since the last Client and Gateway
releases, I noticed that for several of the things we fixed, it might be
worth adding changelog entries.
2025-05-13 13:35:02 +00:00
Thomas Eizinger
c93a3d710a fix(gui-client): don't panic during setup hook (#9112)
As part of launching the Tauri GUI client, we need to observe a specific
initialisation order. In particular, we need to wait until Tauri sends
us a `RunEvent::Ready` before we can initialise things like the tray
menu.

To make this more convenient, Tauri offers a so-called "setup hook" that
can be set on the app builder. Unfortunately, Tauri internally panics if
this provided setup-hook returns an `Err`. Removing this is tracked
upstream: https://github.com/tauri-apps/tauri/issues/12815.

Until this is fixed, we stop using this "setup hook" and instead spawn
our own task that performs this work. This task needs to wait until
Tauri is ready. To achieve that, we introduce an additional mpsc channel
that sends a notification every time we receive a `RunEvent::Ready`.
That should only happen once. We only read from the receiver once, which
is why we ignore the error on the sending side in case the receiver has
already been dropped.
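
The pattern can be sketched with the standard library alone. This uses `std::sync::mpsc` and a plain thread to stay self-contained, which is an assumption; the actual client may use a different channel type and an async task. The function name is made up for illustration:

```rust
use std::sync::mpsc;
use std::thread;

/// Simulates `ready_events` ready notifications and reports whether the
/// setup work ran. Stand-in for the real event loop.
pub fn run_with_ready_signal(ready_events: usize) -> bool {
    let (ready_tx, ready_rx) = mpsc::channel::<()>();

    // The setup task reads from the receiver exactly once and then drops it.
    let setup = thread::spawn(move || {
        // `recv` returns `Err` if the sender went away without ever sending.
        ready_rx.recv().is_ok()
    });

    for _ in 0..ready_events {
        // Ignore the error: the receiver may already have been dropped
        // after handling the first notification.
        let _ = ready_tx.send(());
    }
    drop(ready_tx);

    setup.join().expect("setup thread panicked")
}
```

Sending more than one notification is harmless here because a failed `send` is discarded rather than propagated.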

Resolves: #9101
2025-05-13 04:02:42 +00:00
Thomas Eizinger
ac339ff63b fix(gateway): evaluate fastest nameserver every 60s (#9060)
Currently, the Gateway reads all nameservers from `/etc/resolv.conf` on
startup and evaluates the fastest one to use for SRV and TXT DNS queries
that are forwarded by the Client. If the machine just booted and we do
not have Internet connectivity just yet, this fails which leaves the
Gateway in a state where it cannot fulfill those queries.

In order to ensure we always use the fastest one and to self-heal from
such situations, we add a 60s timer that refreshes this state.
Currently, this will **not** re-read the nameservers from
`/etc/resolv.conf` but still use the same IPs read on startup.
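
As an illustration only, selecting the fastest nameserver from a set of probe results and deciding when a refresh is due could look like this. The function names and the shape of the probe results are assumptions, not the Gateway's actual code:

```rust
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// Interval after which the fastest nameserver is re-evaluated.
pub const REFRESH_INTERVAL: Duration = Duration::from_secs(60);

/// Picks the nameserver with the lowest round-trip time.
/// A probe result of `None` means the nameserver did not answer.
pub fn fastest_nameserver(probes: &[(IpAddr, Option<Duration>)]) -> Option<IpAddr> {
    probes
        .iter()
        .filter_map(|(ip, rtt)| rtt.map(|rtt| (*ip, rtt)))
        .min_by_key(|(_, rtt)| *rtt)
        .map(|(ip, _)| ip)
}

/// Returns `true` once at least `REFRESH_INTERVAL` has passed since `last`.
pub fn refresh_due(last: Instant, now: Instant) -> bool {
    now.duration_since(last) >= REFRESH_INTERVAL
}
```

If every probe fails, `fastest_nameserver` returns `None`, and the next refresh gets another chance to recover.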
2025-05-09 03:38:35 +00:00
Thomas Eizinger
33d5c32f35 fix(gateway): truncate payload of ICMP errors (#9059)
When the Gateway is handed an IP packet for a DNS resource that it
cannot route, it sends back an ICMP unreachable error. According to RFC
792 [0] (for ICMPv4) and RFC 4443 [1] (for ICMPv6), parts of the
original packet should be included in the ICMP error payload to allow
the sending party to correlate what could not be sent.

For ICMPv4, the RFC says:

```
Internet Header + 64 bits of Data Datagram

The internet header plus the first 64 bits of the original
datagram's data.  This data is used by the host to match the
message to the appropriate process.  If a higher level protocol
uses port numbers, they are assumed to be in the first 64 data
bits of the original datagram's data.
```

For ICMPv6, the RFC says:

```
As much of invoking packet as possible without the ICMPv6 packet exceeding the minimum IPv6 MTU
```
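
A minimal sketch of both truncation rules, assuming the ICMPv6 error is sent without extension headers (40 bytes of IPv6 header plus 8 bytes of ICMPv6 header). The function names are illustrative and not taken from the Gateway code:

```rust
/// Minimum MTU every IPv6 link must support.
pub const IPV6_MIN_MTU: usize = 1280;
const IPV6_HEADER_LEN: usize = 40;
const ICMPV6_HEADER_LEN: usize = 8;

/// ICMPv4: the original IP header plus the first 64 bits (8 bytes) of data.
pub fn icmpv4_error_payload(original: &[u8]) -> &[u8] {
    // The low nibble of the first byte is the header length in 32-bit words.
    let header_len = original
        .first()
        .map(|byte| usize::from(byte & 0x0f) * 4)
        .unwrap_or(0);
    let len = (header_len + 8).min(original.len());
    &original[..len]
}

/// ICMPv6: as much of the packet as fits without exceeding the minimum MTU.
pub fn icmpv6_error_payload(original: &[u8]) -> &[u8] {
    let max_len = IPV6_MIN_MTU - IPV6_HEADER_LEN - ICMPV6_HEADER_LEN;
    &original[..original.len().min(max_len)]
}
```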

[0]: https://datatracker.ietf.org/doc/html/rfc792
[1]: https://datatracker.ietf.org/doc/html/rfc4443#section-3.1
2025-05-09 01:38:31 +00:00
Thomas Eizinger
005b6fe863 feat(windows): optimise network change detection (#9021)
Presently, the network change detection on Windows is very naive and
simply emits a change event every time _anything_ changes. We can
optimise this and therefore improve the start-up time of Firezone by:

- Filtering out duplicate events
- Filtering out network change events for our own network adapter

This reduces the number of network change events to 1 during startup. As
far as I can tell from the code comments in this area, we explicitly
send this one to ensure we don't run into a race condition whilst we are
starting up.
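
The two filters can be sketched as follows. The `Event` type and its fields are hypothetical; the real implementation works on Windows network APIs that are not modelled here:

```rust
/// Hypothetical, simplified network change event.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Event {
    pub adapter: String,
    pub connected: bool,
}

pub struct ChangeFilter {
    own_adapter: String,
    last_emitted: Option<Event>,
}

impl ChangeFilter {
    pub fn new(own_adapter: impl Into<String>) -> Self {
        Self {
            own_adapter: own_adapter.into(),
            last_emitted: None,
        }
    }

    /// Returns `true` if the event should be forwarded as a network change.
    pub fn should_emit(&mut self, event: &Event) -> bool {
        // Changes to our own adapter are caused by us and carry no news.
        if event.adapter == self.own_adapter {
            return false;
        }
        // An event identical to the previous one is a duplicate.
        if self.last_emitted.as_ref() == Some(event) {
            return false;
        }
        self.last_emitted = Some(event.clone());
        true
    }
}
```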

Resolves: #8905
2025-05-06 00:23:27 +00:00
Thomas Eizinger
ea475c721a docs(website): update changelog for latest releases (#9015)
In #9013, we forgot to update the changelogs for Apple Clients and the
Gateway.
2025-05-02 13:16:28 +00:00
Jamil
6e0e7343ba chore: release Apple & Gateway with ECN fix (#9013) 2025-05-02 00:16:40 -07:00
Thomas Eizinger
513e0a400c docs(website): update Apple changelog (#9011) 2025-05-02 05:55:25 +00:00
Thomas Eizinger
0aab954fa9 fix(connlib): never clear ECT from IP packets (#9009)
ECN information is helpful to allow the congestion controllers to more
easily fine-tune their send and receive windows. When a Firezone Client
receives an IP packet where the ECN bits signal an ECN capable
transport, we mirror this bit on the UDP datagram that carries the
encrypted IP packet.

When receiving a datagram with ECN bits set, the Gateway will then apply
these bits to the decrypted IP packet and pass it along towards its
destination.

This implementation is unfortunately a bit too naive. Not all devices on
the Internet support ECN and therefore, we may receive a datagram that
has its ECN bits cleared when the ECN bits on the inner IP packet still
signal an ECN capable transport. In this case, we should _not_ override
the ECN bits and instead pass the IP packet along as is. Network devices
along the path between Gateway and Resource may still use these ECN bits
to signal congestion.

We fix this by making the `with_ecn` function on `IpPacket` private. It
is not meant to be used outside of the module. We supersede it with a
`with_ecn_from_transport` function that implements the above logic.
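
One plausible reading of that logic, written as a free function over a hypothetical `Ecn` enum. It only covers the case described above and makes no claim about how the real `with_ecn_from_transport` treats other combinations:

```rust
/// The four ECN codepoints from RFC 3168.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ecn {
    /// Not ECN-capable transport (00).
    NonEct,
    /// ECN-capable transport, codepoint ECT(1) (01).
    Ect1,
    /// ECN-capable transport, codepoint ECT(0) (10).
    Ect0,
    /// Congestion experienced (11).
    Ce,
}

/// Computes the ECN bits for the decrypted IP packet.
///
/// If the transport datagram arrived with its ECN bits cleared, the inner
/// packet's bits are left untouched. Otherwise the transport's bits win.
pub fn ecn_from_transport(inner: Ecn, transport: Ecn) -> Ecn {
    match transport {
        Ecn::NonEct => inner,
        other => other,
    }
}
```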

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2025-05-02 05:28:19 +00:00
Thomas Eizinger
ec4cd898ba chore: release Gateway v1.4.7 (#8943) 2025-04-30 13:37:32 +00:00
Thomas Eizinger
96998a43ae docs(website): add missing changelog entry for Apple Clients (#8938) 2025-04-30 07:14:33 +00:00
Thomas Eizinger
f7df445924 fix(gateway): don't invalidate active NAT sessions (#8937)
Whenever the Gateway is instructed to (re)create the NAT for a DNS
resource, it performs a DNS query and then overwrites the existing
entries in the NAT table. Depending on how the DNS records are defined,
this may lead to a very bad user experience where connections are cut
regularly.

In particular, if a service utilises round-robin DNS where a DNS query
only ever returns a single entry yet that entry may change as soon as
the TTL expires, all connections for this particular DNS resource for a
Client get cut.

To fix this, we now first check for active NAT sessions for a given
proxy IP and only replace it if we don't have an open NAT session. The
NAT sessions have a TTL of 1 minute, meaning there needs to be at least
1 outgoing packet from the Client every minute to keep it open.
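
A simplified sketch of this check. The `NatTable` type and its methods are hypothetical and track only one resolved IP per proxy IP, which is an assumption made to keep the example short:

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// A NAT session expires after one minute without outgoing packets.
pub const NAT_SESSION_TTL: Duration = Duration::from_secs(60);

#[derive(Default)]
pub struct NatTable {
    /// Proxy IP -> resolved IP of the DNS resource.
    mapping: HashMap<IpAddr, IpAddr>,
    /// Proxy IP -> time of the last outgoing packet from the Client.
    last_outgoing: HashMap<IpAddr, Instant>,
}

impl NatTable {
    pub fn record_outgoing(&mut self, proxy: IpAddr, now: Instant) {
        self.last_outgoing.insert(proxy, now);
    }

    fn has_active_session(&self, proxy: IpAddr, now: Instant) -> bool {
        self.last_outgoing
            .get(&proxy)
            .is_some_and(|last| now.duration_since(*last) < NAT_SESSION_TTL)
    }

    /// Applies a fresh DNS result. Returns `false` if the existing entry
    /// was kept because a session is still open.
    pub fn update(&mut self, proxy: IpAddr, resolved: IpAddr, now: Instant) -> bool {
        if self.mapping.contains_key(&proxy) && self.has_active_session(proxy, now) {
            return false;
        }
        self.mapping.insert(proxy, resolved);
        true
    }

    pub fn resolve(&self, proxy: IpAddr) -> Option<IpAddr> {
        self.mapping.get(&proxy).copied()
    }
}
```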
2025-04-30 06:58:58 +00:00
Jamil
2650d81444 chore: release clients with GSO fix (#8936) 2025-04-29 23:52:43 -07:00
Thomas Eizinger
6dc5f85cc5 fix(connlib): don't buffer when recreating DNS resource NAT (#8935)
In order to detect changes to DNS records of DNS resources, `connlib`
will recreate the DNS resource NAT whenever it receives a query for a
DNS resource. The way we implemented this was by clearing the local
state of the DNS resource NAT, which triggered us to perform the
handshake with the Gateway again upon the next packet for this resource.
The Gateway would then perform the DNS query and respond back when this
was finished.

In order to not drop any packets, `connlib` has a buffer where it keeps
the packets that are arriving in the meantime. This works reasonably
well when the connection is first set up because we are only buffering a
TCP SYN or equivalent handshake packet. Yet, when the connection is in
full use, and the application just so happens to make another DNS query,
we halt the entire flow of packets until this is confirmed again. To
prevent high memory use, the buffer for these packets is constrained to
32 packets which is nowhere near enough when a connection is actively
transferring data (like a file upload).

In most cases, the DNS query on the Gateway will yield the exact same
results because the records haven't changed. Thus, there is no reason
for us to actually halt the flow of these packets when we are
_recreating_ the DNS resource NAT. That way, this handshake happens in
parallel to the actual packet flow and does not interrupt anything in
the happy path case.
2025-04-30 04:26:49 +00:00
Thomas Eizinger
122d84cfa2 fix(connlib): recreate log file if it got deleted (#8926)
Currently, when `connlib`'s log file gets deleted, we write logs into
nirvana until the corresponding process gets restarted. This is painful
for users to do because they need to restart the IPC service or Network
Extension. Instead, we can simply check if the log file exists prior to
writing to it and re-create it if it doesn't.
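
A minimal sketch of the idea using only the standard library. `LogWriter` is a hypothetical type, not the actual appender used by `connlib`:

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

fn open_append(path: &Path) -> io::Result<File> {
    OpenOptions::new().create(true).append(true).open(path)
}

pub struct LogWriter {
    path: PathBuf,
    file: File,
}

impl LogWriter {
    pub fn open(path: impl Into<PathBuf>) -> io::Result<Self> {
        let path = path.into();
        let file = open_append(&path)?;
        Ok(Self { path, file })
    }

    pub fn write_line(&mut self, line: &str) -> io::Result<()> {
        // If the file was deleted, the old handle points at nothing
        // visible on disk, so open a fresh file at the same path.
        if !self.path.exists() {
            self.file = open_append(&self.path)?;
        }
        writeln!(self.file, "{line}")
    }
}
```

The extra existence check costs one file system lookup per write, which is the trade-off for not requiring a process restart.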

Resolves: #6850
Related: #7569
2025-04-29 13:05:02 +00:00
Thomas Eizinger
bbc9c29d5d docs(website): add changelog for #8920 (#8923) 2025-04-29 10:23:48 +00:00
Thomas Eizinger
ad9a453aa1 feat(linux-client): reduce number of TUN threads to 1 (#8914)
Having multiple threads for reading and writing the TUN device can cause
packet re-orderings on the client. All other clients only use a single
TUN thread, so aligning this value means a more consistent behaviour of
Firezone across all platforms.
2025-04-28 12:25:27 +00:00
Jamil
f181a3245b chore(website): Remove old docs (#8895)
These confuse users and are horribly outdated.

Fixes #8528
2025-04-23 15:24:09 +00:00
Thomas Eizinger
ac5e44d5d0 feat(connlib): request larger buffers for UDP sockets (#8731)
Sufficiently large receive buffers are important to sustain
high-throughput as latency increases. If the receive buffer in the
kernel is too small, packets need to be dropped on arrival.

Firefox uses 1MB in its QUIC stack [0]. `quic-go` recommends to set send
and receive buffers to 7.5 MB [1]. Power users of Firezone are likely
receiving a lot more traffic than the average Firefox user (especially
with Internet Resource activated) so setting it to 10 MB seems
reasonable. Sending packets is likely not as critical because we have
back-pressure through our system such that we will stop reading IP
packets when we cannot write to our UDP socket. The UDP socket is
sitting in a separate thread and those threads are connected with
dedicated queues which act as another buffer. However, as the data below
shows, some systems have really small send buffers which are currently
likely a speed bottleneck because we need to suspend writing so
frequently.

Assuming a 50ms latency, the bandwidth-delay product tells us that we
can (in theory) saturate a 1.6 Gbps link with a 10MB receive buffer
(assuming the OS also has large enough buffer sizes in its TCP or QUIC
stack):

```
80 Mb / 0.05s = 1600Mbps
```
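
The same calculation as a small helper, to make it easy to plug in other buffer sizes. The function name is made up for illustration, and decimal units are assumed (10 MB = 10,000,000 bytes):

```rust
/// Upper bound on throughput (in Mbps) that a receive buffer of
/// `buffer_bytes` can sustain at a round-trip time of `rtt_ms`.
/// `rtt_ms` must be non-zero.
pub fn max_throughput_mbps(buffer_bytes: u64, rtt_ms: u64) -> u64 {
    let buffer_bits = buffer_bytes * 8;
    // bits per second = bits / (rtt_ms / 1000)
    let bits_per_second = buffer_bits * 1000 / rtt_ms;
    bits_per_second / 1_000_000
}
```

For comparison, a 212 KB buffer at the same 50ms latency caps out at just under 34 Mbps (the integer division above truncates this to 33).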

Experiments and research [2] show the following:

|OS|Receive buffer (default)|Receive buffer (this PR)|Send buffer (default)|Send buffer (this PR)|
|---|---|---|---|---|
|Windows|65KB|10MB|65KB|1MB|
|MacOS|786KB|8MB|9KB|1MB|
|Linux|212KB|212KB|212KB|212KB|

With the exception of Linux, the OSes appear to be quite generous with
how big they allow receive buffers to be. On Linux, these limits can be
changed by setting the `net.core.rmem_max` and `net.core.wmem_max`
parameters using `sysctl`.

Most of our users are on Windows and MacOS, meaning they immediately
benefit from this without having to change any system settings. Larger
client-side UDP receive buffers are critical for any "download" scenario
which is likely the majority of use cases that Firezone is used for.

On Windows, increasing this receive buffer almost doubles the throughput
in an iperf3 download test.

[0]: https://github.com/mozilla/neqo/pull/2470
[1]: https://github.com/quic-go/quic-go/wiki/UDP-Buffer-Sizes
[2]: https://unix.stackexchange.com/a/424381

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2025-04-22 06:52:33 +00:00
Jamil
5db8e20f3b chore: release Apple and GUI clients (#8882)
- Apple clients 1.4.12
- GUI clients 1.4.11
2025-04-21 21:45:16 +00:00
Jamil
368ace2c6e ci: Release Android 1.4.7 (#8878)
App is live on Play store.
2025-04-21 21:12:27 +00:00
Thomas Eizinger
4c5fd9b256 feat(connlib): prefer relay candidates of same IP version (#8798)
When calculating preferences for candidates, `str0m` currently always
prefers IPv6 over IPv4. This is as per the ICE spec. However, this can
lead to sub-optimal situations when a connection ends up using a TURN
server.

TURN allows a client to allocate an IPv4 and an IPv6 address in the same
allocation. This makes it possible for e.g. an IPv4-only client to
connect to an IPv6-only peer as long as the TURN server runs in
dual-stack AND the client requests an IPv6 address in addition to an
IPv4 address with the `ADDITIONAL-ADDRESS-FAMILY` attribute.

Assume that a client sits behind symmetric NAT and therefore needs to
rely on a TURN server to communicate with its peers. The TURN server as
well as all the peers operate in dual-stack mode.

The current priority calculation will yield a communication path that
uses IPv4 to talk to the TURN server (as that is the only one available)
but due to the preference ordering of IPv6 over IPv4, will use an IPv6
path to the peer, despite the peer also supporting IPv4.

This isn't a problem per-se but makes our life unnecessarily difficult.
Our TURN servers use eBPF to efficiently deal with TURN's channel-data
messages. This however is at present only implemented for the IPv4 <>
IPv4 and IPv6 <> IPv6 path. Implementing the other paths is possible but
complicates the eBPF code because we need to also translate IP headers
between versions and not just update the source and destination IPs.

We have since patched `str0m` to extend the `Candidate::relayed`
constructor to also take a `base` address which is - similar to the
other candidate types - the address the client is sending from in order
to use this candidate. In the context of relayed candidates, this is the
address the client is using to talk to the TURN server. We can use this
information in the candidate's priority calculation to prefer candidates
that allow traffic to remain within one IP version, i.e. if the client
talks to the TURN server over IPv4, the candidate with an allocated IPv4
address will have a higher priority than the one with the IPv6 address
because we are applying a "punishment" factor as part of the
local-preference component in the priority formula.
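
To illustrate, here is the ICE priority formula from RFC 8445 with a penalty applied to the local preference of cross-version relayed candidates. The penalty value and function names are hypothetical, and the sketch ignores any other inputs `str0m` feeds into the local preference (such as its general IPv6-over-IPv4 ordering):

```rust
use std::net::IpAddr;

/// RFC 8445 recommends a type preference of 0 for relayed candidates.
const RELAY_TYPE_PREFERENCE: u32 = 0;
const MAX_LOCAL_PREFERENCE: u32 = 65_535;
/// Hypothetical "punishment" for leaving the IP version of the base address.
const CROSS_VERSION_PENALTY: u32 = 1_000;

/// priority = 2^24 * type pref + 2^8 * local pref + (256 - component ID).
/// `component_id` is expected to be in the range 1..=256.
pub fn ice_priority(type_pref: u32, local_pref: u32, component_id: u32) -> u32 {
    (type_pref << 24) + (local_pref << 8) + (256 - component_id)
}

/// `base` is the address used to talk to the TURN server,
/// `allocated` is the relayed address handed out by the TURN server.
pub fn relayed_priority(base: IpAddr, allocated: IpAddr, component_id: u32) -> u32 {
    let same_version = base.is_ipv4() == allocated.is_ipv4();
    let local_pref = if same_version {
        MAX_LOCAL_PREFERENCE
    } else {
        MAX_LOCAL_PREFERENCE - CROSS_VERSION_PENALTY
    };
    ice_priority(RELAY_TYPE_PREFERENCE, local_pref, component_id)
}
```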

Staying within the same IP version whilst relaying traffic allows our
TURN servers to use their eBPF kernel which results in a better UX due
to lower latency and higher throughput.

The final candidate ordering is ultimately decided by the controlling
ICE agent which in our case is the Firezone Client. Thus, we don't
necessarily need to update Gateways in order to test / benefit from
this. Building a Client with this patch included should be enough to
benefit from this change.

Related: https://github.com/algesten/str0m/pull/640
Related: https://github.com/algesten/str0m/pull/644
2025-04-20 22:41:56 +00:00
Thomas Eizinger
f7f6e3885d docs(website): remove duplicate init (#8860)
Resolves: #8858
2025-04-19 22:09:06 +00:00
Jamil
5669c83835 ci: Bump Apple clients to 1.4.11 (#8848)
Includes a fix for auto-starting on launch when other VPN clients have
been connected previously.
2025-04-19 11:45:42 +00:00
Jamil
4c1379a6bf fix(apple): Force enable VPN configuration on autoStart (#8814)
If another VPN has been activated on the system while Firezone is
active, Apple OSes will deactivate our configuration, and never
reactivate it.

We knew this already, and always activated the configuration when
starting during the sign in flow, but failed to also do this when
autoStarting on launch.

This PR ensures that during autoStart, we re-enable the
configuration as well.

Fixes #8813
2025-04-18 18:00:44 +00:00
Jamil
a2e32a4918 ci: Bump apple to 1.4.10 to ship PKG (#8797)
This publishes the 1.4.10 permalinks for the PKG download.
2025-04-17 15:13:44 +00:00
Jamil
fc7b6e3fb0 feat(ci): Publish installer PKG for macOS standalone (#8795)
Microsoft Intune's DMG provisioner currently fails unexpectedly when
trying to provision our published DMG file with the error:

> The DMG file couldn't be mounted for installation. Check the DMG file
if the error persists. (0x87D30139)

I ran the following verification commands locally, which all passed:

```
hdiutil verify -verbose <dmg>
hdiutil imageinfo -verbose <dmg>
hdiutil hfsanalyze -verbose <dmg>
hdiutil checksum -type SHA256 -verbose <dmg>
hdiutil info -verbose
hdiutil pmap -verbose <dmg>
```

So the issue appears to be most likely that Intune doesn't like the
`/Applications` shortcut in the DMG. This is a UX feature to make it
easy to drag the application to the /Applications folder upon opening the
DMG.

So we're publishing a PKG in addition to the DMG, which should be a
more reliable artifact for MDMs to use.

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
2025-04-16 16:21:40 +00:00
Thomas Eizinger
4cf36cd8bd docs(kb): update path to Gateway to new location (#8794)
In #8480, we changed the location that `firezone-gateway` gets
downloaded to but forgot to update the knowledgebase with the new path.

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
2025-04-16 13:20:28 +00:00
Jamil
aab691a67f ci: Release Apple clients 1.4.9 (#8793)
These contain the recent UDP thread enhancements.
2025-04-15 20:14:43 +00:00
Jamil
743f5fdfeb ci: bump clients/gateway to ship write improvements (#8792)
Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2025-04-15 06:21:23 +00:00
Thomas Eizinger
282fdb96ea chore: fixup changelog for latest releases (#8788)
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-14 20:41:47 -07:00
Thomas Eizinger
b3746b330f refactor(connlib): spawn dedicated threads for UDP sockets (#7590)
Correctly implementing asynchronous IO is notoriously hard. In order to
not drop packets in the process, one has to ensure a given socket is
ready to accept packets, buffer them if that is not the case, suspend
everything else until the socket is ready and then continue.

Until now, we did this because it was the only option to run the UDP
sockets on the same thread as the actual packet processing. That in turn
was motivated by wanting to pass around references of the received
packets for processing. Rust's borrow-checker does not allow passing
references between threads which forced us to have the sockets on the
same thread as the packet processing.

Like we already did in other places in `connlib`, this can be solved
through the use of buffer pools. Using a buffer pool, we can use heap
allocations to store the received packets without having to make a new
allocation every time we read new packets. Instead, we can have a
dedicated thread that is connected to `connlib`'s packet processing
thread via two channels (one for inbound and one for outbound packets).
These channels are bounded, which ensures backpressure is maintained in
case one of the two threads lags behind. These bounds also mean that we
have at most N buffers from the buffer pool in-flight (where N is the
capacity of the channel).
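
A simplified sketch of the inbound half of this design, using standard-library threads and bounded channels. The pool is modelled as a channel of free buffers, which is an assumption made for brevity; the function name and the simulated packets are illustrative:

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Feeds `packets` through a dedicated IO thread into the calling
/// (processing) thread. Returns the number of packets and bytes seen.
pub fn process_packets(packets: Vec<Vec<u8>>, capacity: usize) -> (usize, usize) {
    let capacity = capacity.max(1);

    // The pool: exactly `capacity` reusable buffers exist in total.
    let (free_tx, free_rx) = sync_channel::<Vec<u8>>(capacity);
    for _ in 0..capacity {
        free_tx.send(Vec::with_capacity(1500)).expect("pool has room");
    }

    // Bounded channel from the IO thread to the processing thread.
    let (inbound_tx, inbound_rx) = sync_channel::<Vec<u8>>(capacity);

    let io_thread = thread::spawn(move || {
        for packet in packets {
            // Blocks while all buffers are in flight: this is the backpressure.
            let Ok(mut buffer) = free_rx.recv() else { break };
            buffer.clear();
            buffer.extend_from_slice(&packet); // stands in for a socket read
            if inbound_tx.send(buffer).is_err() {
                break;
            }
        }
    });

    let (mut count, mut bytes) = (0, 0);
    for buffer in inbound_rx {
        count += 1;
        bytes += buffer.len();
        // Hand the buffer back to the pool; the IO thread may have exited.
        let _ = free_tx.send(buffer);
    }

    io_thread.join().expect("IO thread panicked");
    (count, bytes)
}
```

No allocation happens per packet once the pool is filled, and at most `capacity` buffers are ever in flight.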

Within those dedicated threads, we can then use `async/await` notation
to suspend the entire task when a socket isn't ready for sending.

Resolves: #8000
2025-04-14 06:18:06 +00:00
Thomas Eizinger
e0f94824df fix(gateway): default to 1 TUN thread on single-core systems (#8765)
On single-core systems, spawning more than one TUN thread results in
contention that hurts performance more than it helps.

Resolves: #8760
2025-04-13 01:54:04 +00:00
Thomas Eizinger
132487c29e fix(connlib): correctly compute the GSO batch size (#8754)
We are currently naively chunking our buffer into `segment_size *
max_gso_segments()`. `max_gso_segments` is by default 64. Assuming we
processed several IP packets, this would quickly balloon to a size that
the kernel cannot handle. For example, during an `iperf3` run, we
receive _a lot_ of packets at maximum MTU size (1280). With the overhead
that we are adding to the packet, this results in a UDP payload size of
1320.

```
1320 x 64 = 84480
```

That is way too large for the kernel to handle and it will fail the
`sendmsg` call with `EMSGSIZE`. Unfortunately, this error wasn't
surfaced because `quinn_udp` handles it internally because it can also
happen as a result of MTU probes.
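
One way to cap the batch size is sketched below. It assumes the limit is the maximum UDP payload over IPv4 (65,507 bytes); the constant and function name are illustrative, and the actual fix may use a different bound:

```rust
/// Assumed upper bound: 65,535 bytes minus 20 (IPv4 header) and 8 (UDP header).
pub const MAX_UDP_PAYLOAD: usize = 65_507;

/// Largest batch (in bytes) of equally sized segments that still fits
/// into a single `sendmsg` call under the assumed bound.
pub fn gso_batch_size(segment_size: usize, max_gso_segments: usize) -> usize {
    if segment_size == 0 {
        return 0;
    }
    let segments_that_fit = MAX_UDP_PAYLOAD / segment_size;
    segment_size * segments_that_fit.min(max_gso_segments)
}
```

With a segment size of 1320 and 64 allowed segments, this yields 49 segments (64,680 bytes) instead of the 84,480 bytes from the naive calculation.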

We've already patched `quinn_udp` in the past to move the handling of
more quinn-specific errors to the infallible `send` function. The same
is being done for this error in
https://github.com/quinn-rs/quinn/pull/2199.

Resolves: #8699
2025-04-12 13:10:43 +00:00