The current Rust workspace isn't as consistent as it could be. To make
navigation a bit easier, we move a few crates around. Generally, we
follow the idea that entry-points should be at the top-level. `rust/`
now looks like this (directories only):
```
.
├── cli # Firezone CLI
├── client-ffi # Entry point for Apple & Android
├── gateway # Gateway
├── gui-client # GUI client
├── headless-client # Headless client
├── libs # Library crates
├── relay # Relay
├── target # Compile artifacts
├── tests # Crates for testing
└── tools # Local tools
```
To further enforce this structure, we also drop the `firezone-` prefix
from all crates that are not top-level binary crates.
Right now, connlib hands out a `BiMap` of sentinel IPs <> upstream
servers whenever it emits a `TunInterfaceUpdated` event. This `BiMap`
internally uses two `HashMap`s. The iteration order of `HashMap`s is
non-deterministic, so we lose the order in which the upstream / system
resolvers were originally passed to us.
To prevent that, we now emit a dedicated `DnsMapping` type that does not
expose its internal data structure but only getters for retrieving the
sentinel and upstream servers. Internally, it uses a `Vec` to store this
mapping and thus retains the original order. This is asserted as part of
our proptests by comparing the resulting `Vec`s.
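A minimal sketch of what such an order-preserving mapping could look like (the actual connlib type may differ; the getters and field names here are assumptions):
```rust
use std::net::IpAddr;

/// Order-preserving DNS mapping: each entry pairs a sentinel IP with the
/// upstream resolver it forwards to. Backed by a `Vec`, so iteration order
/// matches the order in which the resolvers were originally configured.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DnsMapping {
    inner: Vec<(IpAddr, IpAddr)>, // (sentinel, upstream)
}

impl DnsMapping {
    pub fn new(pairs: Vec<(IpAddr, IpAddr)>) -> Self {
        Self { inner: pairs }
    }

    /// The upstream resolver behind a given sentinel IP, if any.
    pub fn upstream(&self, sentinel: IpAddr) -> Option<IpAddr> {
        self.inner
            .iter()
            .find(|(s, _)| *s == sentinel)
            .map(|(_, u)| *u)
    }

    /// All sentinel IPs, in their original order.
    pub fn sentinels(&self) -> impl Iterator<Item = IpAddr> + '_ {
        self.inner.iter().map(|(s, _)| *s)
    }

    /// All upstream resolvers, in their original order.
    pub fn upstreams(&self) -> impl Iterator<Item = IpAddr> + '_ {
        self.inner.iter().map(|(_, u)| *u)
    }
}
```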
This fix is preceded by a few refactorings that encapsulate the code for
creating and updating this DNS mapping.
Resolves: #8439
Rust 1.91 has been released and brings with it a few new lints that we
need to tidy up. In addition, it also stabilizes `BTreeMap::extract_if`:
A really nifty std-lib function that allows us to conditionally take
elements from a map. We need that in a bunch of places.
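As a rough illustration (not one of our actual call-sites), `extract_if` takes a key range plus a predicate and drains the matching entries while leaving the rest in place:
```rust
use std::collections::BTreeMap;

fn main() {
    let mut peers: BTreeMap<u32, &str> = BTreeMap::from([
        (1, "connected"),
        (2, "idle"),
        (3, "idle"),
    ]);

    // Drain all idle peers in a single pass; connected peers stay in the map.
    let idle: Vec<u32> = peers
        .extract_if(.., |_, state| *state == "idle")
        .map(|(id, _)| id)
        .collect();

    assert_eq!(idle, vec![2, 3]);
    assert_eq!(peers.len(), 1);
}
```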
Bumps [secrecy](https://github.com/iqlusioninc/crates) from 0.8.0 to
0.10.3.
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
Upon moving the version string out of PKG_VERSION and Cargo.toml, we
lost the version-bump automation. To avoid more bugs here in the future,
we now check for the version marker across all Git-tracked files,
regardless of their extension.
Fixes #10748
---------
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
When working on the `client-ffi` module on a Linux or Windows machine,
we currently see a lot of "unused code" warnings. We could feature-gate
the remaining functions too but that would result in not having
code-completion on those platforms at all.
To make working on this module more ergonomic, we add a dummy
constructor for the session.
As far as I can tell, the `async_runtime` config option doesn't exist in
UniFFI, hence we remove that.
Whilst going through the UniFFI docs, I also noticed that there is an
Android-specific flag that we can toggle on. Effectively, this uses the
shared
[`SystemCleaner`](https://developer.android.com/reference/android/system/SystemCleaner)
instead of a per-thread one, which is supposed to be more performant.
Finally, using immutable records seems like a good idea as mutating any
FFI-originated field is not going to be reflected in connlib's state.
Preventing that at compile-time has a good chance of reducing bugs.
In the spirit of making Firezone as robust as possible, we make the FFI
calls infallible and complete as much of the task as possible. For
example, we don't fail `setDns` entirely just because we cannot parse a
single DNS server's IP.
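As a sketch of that principle (function name is illustrative, not the actual FFI surface; logging via the `tracing` crate is assumed), we parse what we can and skip the rest:
```rust
use std::net::IpAddr;

/// Parses as many DNS servers as possible instead of failing the whole call
/// when a single entry is malformed.
fn parse_dns_servers(raw: &[String]) -> Vec<IpAddr> {
    raw.iter()
        .filter_map(|s| match s.parse::<IpAddr>() {
            Ok(ip) => Some(ip),
            Err(e) => {
                // Log and skip rather than propagating the error across the FFI boundary.
                tracing::warn!("Skipping invalid DNS server `{s}`: {e}");
                None
            }
        })
        .collect()
}
```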
Resolves: #10611
This PR eliminates JSON-based communication across the FFI boundary,
replacing it with proper
uniffi-generated types for improved type safety, performance, and
reliability. We replace JSON string parameters with native uniffi types
for:
- Resources (DNS, CIDR, Internet)
- Device information
- DNS server lists
- Network routes (CIDR representation)
We also get rid of JSON serialisation in the Swift client's IPC in
favour of PropertyList-based serialisation.
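To give a flavour of the change (field and type names here are assumptions, not the exact `client-ffi` definitions), routes and DNS servers become plain uniffi records instead of JSON strings:
```rust
uniffi::setup_scaffolding!();

/// A CIDR route as it crosses the FFI boundary.
#[derive(uniffi::Record)]
pub struct Cidr {
    pub address: String,
    pub prefix: u8,
}

/// The TUN interface configuration handed to the host app, previously a JSON string.
#[derive(uniffi::Record)]
pub struct TunConfig {
    pub ipv4_routes: Vec<Cidr>,
    pub ipv6_routes: Vec<Cidr>,
    pub dns_servers: Vec<String>,
}
```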
Fixes: https://github.com/firezone/firezone/issues/9548
---------
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
Building on top of #10507, setting the initial Internet Resource state
is a piece of cake. All we need to do is thread a boolean variable
through to all call-sites of `Session::connect`. Without the need for
the Internet Resource's ID, we can simply pass in the boolean that is
saved in the configuration of each client.
Resolves: #10255
Instead of the generic "disable any kind of resource" functionality that
connlib currently exposes, we now provide an API to only enable /
disable the Internet Resource. This is a lot simpler to deal with and
reason about than the previous system, especially when it comes to the
proptests. Those need to model connlib's behaviour correctly across its
entire API surface, which makes them unnecessarily complex if we only
ever use the `set_disabled_resources` API with a single resource.
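Sketched in isolation (type and field names are assumptions, not connlib's actual state type), the surface shrinks to a single toggle:
```rust
/// A minimal sketch of the new, narrower API.
pub struct ClientState {
    internet_resource_enabled: bool,
}

impl ClientState {
    /// Replaces the previous `set_disabled_resources` call that took a whole
    /// set of resource IDs: only the Internet Resource can be toggled now.
    pub fn set_internet_resource_enabled(&mut self, enabled: bool) {
        self.internet_resource_enabled = enabled;
    }

    pub fn is_internet_resource_enabled(&self) -> bool {
        self.internet_resource_enabled
    }
}
```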
In preparation for #4789, I want to extend the proptests to cover
traffic filters (#7126). This will make them a fair bit more
complicated, so any prior removal of complexity is appreciated.
Simplifying the implementation here is also a good starting point to fix
#10255. Not implicitly enabling the Internet Resource when it gets added
should be quite simple after this change.
Finally, resolving #8885 should also be quite easy. We just need to
store the state of the Internet Resource once per API URL instead of
globally.
Resolves: #8404
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
In order to allow the portal to more easily classify what kind of
component is connecting, we extend the `get_user_agent` header to
include a component type instead of the generic `connlib/`.
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
In #10076, connlib gained the ability to gracefully close connections
between peers. The Gateway already uses this when it is being gracefully
shut down, such as during an upgrade. This allows Clients to immediately
fail over to a different Gateway instead of waiting for an ICE timeout.
When a Client signs out, we currently just drop all the state, resulting
in an ICE timeout on the Gateway ~15 seconds later. This makes it
difficult for us to analyze whether an ICE timeout in the logs
represents an actual problem where a network connection got cut or
whether the Client simply signed out.
Whilst not watertight, attempting to gracefully close our connections
when the Client signs out is better than nothing, so we implement this
here.
All Clients use the `Session` abstraction from `client-shared`, which
spawns the event-loop into a dedicated task.
- For the Linux and Windows GUI client, the tunnel service's existing
tokio runtime instance is used for this.
- For Android and Apple, we create a dedicated, single-threaded runtime
instance for connlib.
- For the headless client, we also reuse the binary's existing tokio
runtime instance.
In case of Android, Apple and the headless client, this means we need to
ensure the tokio runtime instance stays alive long enough to actually
complete the graceful shutdown task. We achieve this by draining the
`EventStream` returned from `Session`. The `EventStream` is a wrapper
around a channel connected to the event-loop. This stream only finishes
once the event-loop is entirely dropped (and therefore completed the
graceful shutdown) as it holds the sender-end of the channel.
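A sketch of that draining loop (the real `EventStream` type differs; this uses a plain tokio channel to illustrate the mechanism):
```rust
use tokio::sync::mpsc;

#[derive(Debug)]
enum Event {
    Disconnected,
}

/// Keeps polling until the sender (held by the event-loop) is dropped, i.e.
/// until the graceful shutdown has fully completed.
async fn drain_events(mut events: mpsc::Receiver<Event>) {
    while let Some(event) = events.recv().await {
        // Forward events to the host app as usual.
        println!("event: {event:?}");
    }
    // `None` means the event-loop is gone; the runtime may now shut down.
}
```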
In case of the Linux and Windows GUI client, the runtime outlives the
`Session` because it is scoped to the entire tunnel process. Therefore,
no additional measures are necessary there to ensure the graceful
shutdown task completes.
Instead of recording the queue depths on every event-loop tick, we now
record them once a second by setting a Gauge. Not only is that a simpler
instrument to work with, but it is also significantly more performant.
The current version - when metrics are enabled - takes up quite a bit of
CPU time.
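A rough sketch of the sampling loop (shown with the `metrics` crate's gauge API for illustration; the actual instrument and metric name may differ):
```rust
use std::time::Duration;

/// Samples the queue depth once a second instead of on every event-loop tick.
async fn record_queue_depth(queue_len: impl Fn() -> usize) {
    let mut interval = tokio::time::interval(Duration::from_secs(1));

    loop {
        interval.tick().await;
        metrics::gauge!("event_loop_queue_depth").set(queue_len() as f64);
    }
}
```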
Resolves: #10237
With the introduction of the pre-resolved Sentry host, all Firezone
clients now require Internet access on startup. That is a significant usability
hit that we can easily fix by simply falling back to resolving the host
on-demand.
We always end up allowing this lint when it pops up, so we might as well
allow it for the whole repo. Most of the time, the reason for too many
arguments is a borrow-checker limitation of Rust where mutable
references need to be tracked explicitly.
Right now, the Client event-loops have a channel with a capacity of 1000
items for sending new resource lists and TUN device updates to the host
app. This is unnecessary because we only ever care about the latest
version of these. Intermediate updates that the host app doesn't process
are effectively irrelevant.
We've had an issue before where a bug in the portal caused us to receive
many updates to resources which ended up crashing Client apps because
this channel filled up.
To be more resilient on this front, we refactor the Client event loop to
use a `watch` channel for this. Watch channels only retain the last
value that got sent into them.
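A small sketch of the pattern (resource type simplified to `String` here):
```rust
use tokio::sync::watch;

#[tokio::main]
async fn main() {
    // The channel always holds exactly one value: the latest resource list.
    let (tx, mut rx) = watch::channel(Vec::<String>::new());

    // The event-loop overwrites the value on every update; nothing queues up.
    tx.send(vec!["resource-a".into()]).unwrap();
    tx.send(vec!["resource-a".into(), "resource-b".into()]).unwrap();

    // The host app only ever observes the most recent list.
    rx.changed().await.unwrap();
    assert_eq!(rx.borrow_and_update().len(), 2);
}
```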
Our Sentry client needs to resolve DNS before being able to send logs or
errors to the backend. Currently, this DNS resolution happens on-demand
as we don't take any control of the underlying HTTP client.
In addition, this will use HTTP/1.1 by default which isn't as efficient
as it could be, especially with concurrent requests.
Finally, if we ever decide to proxy all Sentry traffic through our own
domain, we have to take control of the underlying client anyway.
To resolve all of the above, we create a custom `TransportFactory` where
we reuse the existing `ReqwestHttpTransport` but provide an already
configured `reqwest::Client` that always uses HTTP/2 with a
pre-configured set of DNS records for the given ingest host.
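A sketch of such a client (host name and address are placeholders; the real configuration lives in the transport factory):
```rust
use std::net::SocketAddr;

/// Builds a client that skips on-demand DNS for the ingest host and speaks
/// HTTP/2 from the start.
fn build_ingest_client() -> reqwest::Result<reqwest::Client> {
    // Placeholder address; the real code would use the pre-resolved records.
    let addr: SocketAddr = "203.0.113.10:443".parse().expect("valid socket address");

    reqwest::Client::builder()
        // Pin the ingest host to a known address instead of resolving on demand.
        .resolve("sentry.example.com", addr)
        // Always negotiate HTTP/2 rather than upgrading from HTTP/1.1.
        .http2_prior_knowledge()
        .build()
}
```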
By default, dropping a `tokio` runtime waits until all tasks have
finished. The tasks we spawn within `connlib` can have complex
dependencies with each other. To ensure that we can shut down in any
case and don't hang, we apply a timeout of 1s to the runtime.
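Concretely, this amounts to something like the following sketch, using tokio's `shutdown_timeout` instead of relying on `Drop`:
```rust
use std::time::Duration;

/// Shuts the runtime down, waiting at most one second for tasks to wind up.
fn shutdown(runtime: tokio::runtime::Runtime) {
    runtime.shutdown_timeout(Duration::from_secs(1));
}
```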
Applying a filter globally to the entire subscriber means it filters
events for all layers. This prevents the Sentry layer from uploading
DEBUG logs even when it is configured to do so.
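A sketch of the per-layer setup (the layers and levels shown are illustrative; the actual subscriber wires up more layers):
```rust
use tracing_subscriber::filter::LevelFilter;
use tracing_subscriber::layer::SubscriberExt as _;
use tracing_subscriber::util::SubscriberInitExt as _;
use tracing_subscriber::Layer as _;

fn init_logging() {
    tracing_subscriber::registry()
        // stdout stays at INFO ...
        .with(tracing_subscriber::fmt::layer().with_filter(LevelFilter::INFO))
        // ... while the Sentry layer may receive DEBUG events.
        .with(sentry_tracing::layer().with_filter(LevelFilter::DEBUG))
        .init();
}
```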
At present, our primary indicator as to whether telemetry is active is
whether we have a Sentry session. For our analytics events however, we
currently require passing in the Firezone ID and API url again. This
makes it difficult to send analytics events from areas of the code that
don't have this information available.
To still allow for that, we integrate the `analytics` module more
tightly with the Sentry session. This allows us to drop two parameters
from the `$identify` event and also means we now respect the
`NO_TELEMETRY` setting for these events except for `new_session`. This
event is sent regardless because it allows us to track how many on-prem
installations of Firezone are out there.
Sentry has a new "Logs" feature where we can stream logs directly to
Sentry. Doing this for all Clients and Gateways would be way too much
data to collect though.
In order to aid debugging from customer installations, we add a
PostHog-managed feature flag that - if set to `true` - enables the
streaming of logs to Sentry. This feature flag is evaluated every time
the telemetry context is initialised:
- For all FFI usages of connlib, this happens every time a new session
is created.
- For the Windows/Linux Tunnel service, this also happens every time we
create a new session.
- For the Headless Client and Gateway, it happens on startup and
afterwards, every minute. The feature-flag context itself is only
checked every 5 minutes though so it might take up to 5 minutes before
this takes effect.
The default value - like all feature flags - is `false`. Therefore, if
there is any issue with the PostHog service, we will fall back to the
previous behaviour where logs are simply stored locally.
Resolves: #9600
In order to more easily target customers with certain feature flags, we
include the `account_slug` in the `$identify` event to PostHog. This
will allow us to create Cohorts in PostHog and enable / disable feature
flags for all installations of Firezone for a particular customer.
To make our FFI layer between Android and Rust safer, we adopt the
UniFFI tool from Mozilla. UniFFI allows us to create a dedicated crate
(here `client-ffi`) that contains Rust structs annotated with various
attributes. These macros then generate code at compile time that is
built into the shared object. Using a dedicated CLI from the UniFFI
project, we can then generate Kotlin bindings from this shared object.
The primary motivation for this effort is memory safety across the FFI
boundary. Most importantly, we want to ensure that:
- The session pointer is not used after it has been free'd
- Disconnecting the session frees the pointer
- Freeing the session does not happen as part of a callback as that
triggers a cyclic dependency on the Rust side (callbacks are executed on
a runtime and that runtime is dropped as part of dropping the session)
To achieve all of these goals, we move away from callbacks altogether.
UniFFI has great support for async functions. We leverage this support
to expose a `suspend fun` to Android that returns `Event`s. These events
map to the current callback functions. Internally, these events are read
from a channel with a capacity of 1000 events. It is therefore not very
time-critical that the app reads from this channel. `connlib` will
happily continue even if the channel is full. 1000 events should be more
than sufficient though in case the host app cannot immediately process
them. We don't send events very often after all.
This event-based design has major advantages: It allows us to make use
of `AutoCloseable` on the Kotlin side, meaning the `session` pointer is
only ever accessed as part of a `use` block and automatically closed
(and therefore free'd) at the end of the block.
To communicate with the session, we introduce a `TunnelCommand`, which
represents all actions that the host app can send to `connlib`. These
are passed through a channel to the `suspend fun`, which continuously
listens for events and commands.
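To make the shape of this API concrete (names and variants here are illustrative, not the exact `client-ffi` definitions):
```rust
uniffi::setup_scaffolding!();

/// Events connlib reports back to the host app, replacing the old callbacks.
#[derive(uniffi::Enum)]
pub enum Event {
    TunInterfaceUpdated { ipv4: String, ipv6: String },
    Disconnected { reason: String },
}

/// Commands the host app can send to connlib.
#[derive(uniffi::Enum)]
pub enum TunnelCommand {
    SetDns { servers: Vec<String> },
    Disconnect,
}

#[derive(uniffi::Object)]
pub struct Session {
    events: tokio::sync::Mutex<tokio::sync::mpsc::Receiver<Event>>,
    commands: tokio::sync::mpsc::Sender<TunnelCommand>,
}

#[uniffi::export]
impl Session {
    /// Suspends on the Kotlin side until connlib emits the next event;
    /// returns `None` once the session has shut down.
    pub async fn next_event(&self) -> Option<Event> {
        self.events.lock().await.recv().await
    }

    /// Forwards a command to the event-loop.
    pub async fn send_command(&self, cmd: TunnelCommand) {
        let _ = self.commands.send(cmd).await;
    }
}
```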
Resolves: #9499
Related: #3959
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>