On UNIX platforms, `connlib` has dedicated threads that read from and
write to the TUN device. These threads are connected to `connlib`'s
main thread via bounded channels: one in each direction. When these
channels are full, `connlib`'s main thread suspends and stops reading
network packets from the sockets in order to maintain back-pressure:
reading more packets from the sockets would most likely mean sending
more packets out the TUN device.
When debugging #7763, it became apparent that _something_ must be wrong
with these channels: somehow, we either consider them full or aren't
emptying them, and as a result, we don't read _any_ network packets
from our sockets.
To maintain back-pressure here, we currently use our own `AtomicWaker`
construct that is shared with the TUN thread(s). This is unnecessary:
we can instead convert the `flume::Sender` into a
`flume::async::SendSink` and thereby access a `poll`-based interface
directly.
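A minimal sketch of the idea, with `IpPacket` as a stand-in for connlib's packet type; `flume`'s `SendSink` implements `futures::Sink`, so the back-pressure hook comes for free:

```rust
use std::pin::Pin;
use std::task::{Context, Poll};

use flume::r#async::SendSink;
use futures::Sink;

type IpPacket = Vec<u8>;

struct TunChannel {
    sink: Pin<Box<SendSink<'static, IpPacket>>>,
}

impl TunChannel {
    fn new(capacity: usize) -> (Self, flume::Receiver<IpPacket>) {
        let (tx, rx) = flume::bounded(capacity);

        let channel = Self {
            sink: Box::pin(tx.into_sink()),
        };

        (channel, rx)
    }

    /// `Poll::Pending` while the channel is full; `flume` registers the waker
    /// itself, so the main thread resumes once the TUN thread drains packets.
    fn poll_send_ready(&mut self, cx: &mut Context<'_>) -> Poll<()> {
        match self.sink.as_mut().poll_ready(cx) {
            Poll::Ready(_) => Poll::Ready(()), // `Err` means the TUN thread is gone; caller shuts down
            Poll::Pending => Poll::Pending,
        }
    }
}
```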
At present, the file logger for all Rust code starts each logfile with
`connlib.`. This is very confusing when exporting the logs from the GUI
client because even the logs from the client itself will start with
`connlib.`. To fix this, we make the base file name of the log file
configurable.
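A minimal sketch of what this can look like, assuming `tracing-appender`'s rolling writer (the actual logger setup differs):

```rust
use std::path::Path;

use tracing_appender::rolling::{RollingFileAppender, Rotation};

/// Each component passes its own base name ("gui-client", "gateway", ...)
/// instead of everything defaulting to `connlib`.
fn file_logger(directory: &Path, base_name: &str) -> RollingFileAppender {
    RollingFileAppender::builder()
        .rotation(Rotation::HOURLY)
        .filename_prefix(base_name) // e.g. yields `gui-client.2024-01-01-10`
        .build(directory)
        .expect("log directory should be writable")
}
```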
For a while now, `connlib` has been calling these two callbacks right
after each other because the internal event already bundles all the
information about the TUN device. With this PR, we merge the two
callbacks in the layers above `connlib` as well.
Resolves: #6182.
Reading from and writing to the TUN device within `connlib` happens in
separate threads. The tasks running within these threads are connected
to the rest of `connlib` via channels. When the application shuts down,
these threads also need to exit. Currently, we attempt to detect this
from within the task when these channels close. There is a race
condition here because we attempt to read from the TUN device before
reading from the channels. We treat read & write errors on the TUN
device as non-fatal, so we loop around and attempt to read from it
again, causing an infinite loop and log spam.
To fix this, we swap the order in which we evaluate the two concurrent
tasks: the first task to be polled is now the channel for outbound
packets; only if that one is empty do we attempt to read new packets
from the TUN device. This is also better from a back-pressure point of
view: we should flush out our local buffers of already-processed
packets before taking on "new work".
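A minimal sketch of the reordered loop (all types are simplified stand-ins, not connlib's actual API); `biased` fixes the polling order so the outbound channel is always checked first:

```rust
use tokio::sync::mpsc;

/// Stand-in for the real device handle.
struct Tun;

impl Tun {
    async fn read(&mut self) -> std::io::Result<Vec<u8>> {
        Ok(Vec::new())
    }

    fn write(&mut self, _packet: &[u8]) -> std::io::Result<()> {
        Ok(())
    }
}

async fn tun_loop(mut tun: Tun, mut outbound: mpsc::Receiver<Vec<u8>>) {
    loop {
        let to_write = tokio::select! {
            biased; // poll in declaration order instead of randomly

            maybe_packet = outbound.recv() => match maybe_packet {
                Some(packet) => Some(packet),
                None => break, // channel closed => application is shutting down
            },
            result = tun.read() => {
                match result {
                    Ok(packet) => { /* hand the packet to connlib's main loop */ drop(packet); }
                    Err(e) => tracing::warn!("Failed to read from TUN device: {e}"), // non-fatal
                }

                None
            },
        };

        if let Some(packet) = to_write {
            if let Err(e) = tun.write(&packet) {
                tracing::warn!("Failed to write to TUN device: {e}"); // non-fatal
            }
        }
    }
}
```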
As a defense-in-depth strategy, we also attempt to detect the particular
error from the tokio runtime when it is being shut down and exit the
task.
Resolves: #7601.
Related: https://github.com/tokio-rs/tokio/issues/7056.
In order for Sentry to parse our releases as semver, they need to be in
the form of `package@version` [0]. Without this, the "Mark this issue
as resolved in the _next_ version" feature doesn't work properly
because Sentry compares versions by when it first saw them instead of
parsing the semver string itself. We test versions prior to releasing
them, meaning Sentry learns about a 1.4.0 version before it is actually
released. This causes false-positive "regressions" even though the
issues are fixed in a later (as per semver) release.
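A minimal sketch with the `sentry` Rust SDK: `sentry::release_name!()` expands to exactly this `package@version` form, built from the crate's own name and version.

```rust
/// `release_name!()` expands to `"<CARGO_PKG_NAME>@<CARGO_PKG_VERSION>"`,
/// which Sentry parses as semver.
fn init_telemetry(dsn: &str) -> sentry::ClientInitGuard {
    sentry::init((
        dsn,
        sentry::ClientOptions {
            release: sentry::release_name!(),
            ..Default::default()
        },
    ))
}
```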
This creates some redundancy with the different DSNs that we are
already using. I think it would make sense to consider merging the two
projects we have for the GUI client, for example: that is really just
one project that happens to run as two binaries.
For all other projects, I think the separation still makes sense
because we may, for example, add Sentry to the "host" applications of
Android and macOS/iOS as well. For those, we would reuse the DSN and
thus funnel the issues into the same Sentry project.
As per Sentry's docs, releases are organisation-wide and therefore need
a package identifier to be grouped correctly.
[0]:
https://docs.sentry.io/platforms/javascript/configuration/releases/#bind-the-version
## Context
At present, we only have a single thread that reads and writes to the
TUN device on all platforms. On Linux, it is possible to open the file
descriptor of a TUN device multiple times by setting the
`IFF_MULTI_QUEUE` option using `ioctl`. Using multi-queue, we can then
spawn multiple threads that concurrently read and write to the TUN
device. This is critical for achieving a better throughput.
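A minimal sketch of what opening an additional queue looks like (constants spelled out; the real implementation lives in our TUN abstraction and differs in detail). Calling this N times with the same interface name yields N file descriptors that can be serviced by N threads:

```rust
use std::os::fd::{FromRawFd, OwnedFd};

const TUNSETIFF: libc::c_ulong = 0x4004_54ca; // _IOW('T', 202, int)

fn open_tun_queue(name: &str) -> std::io::Result<OwnedFd> {
    let fd = unsafe { libc::open(c"/dev/net/tun".as_ptr(), libc::O_RDWR) };
    if fd < 0 {
        return Err(std::io::Error::last_os_error());
    }

    let mut ifr: libc::ifreq = unsafe { std::mem::zeroed() };
    for (dst, src) in ifr.ifr_name.iter_mut().zip(name.as_bytes()) {
        *dst = *src as libc::c_char;
    }
    // `IFF_MULTI_QUEUE` is what allows opening the same device multiple times.
    ifr.ifr_ifru.ifru_flags = (libc::IFF_TUN | libc::IFF_NO_PI | libc::IFF_MULTI_QUEUE) as i16;

    if unsafe { libc::ioctl(fd, TUNSETIFF, &ifr) } < 0 {
        let err = std::io::Error::last_os_error();
        unsafe { libc::close(fd) };
        return Err(err);
    }

    Ok(unsafe { OwnedFd::from_raw_fd(fd) })
}
```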
## Solution
`IFF_MULTI_QUEUE` is Linux-only and therefore applies to the
headless-client, the GUI client on Linux and the Gateway (it may also
be possible on Android; I haven't tried). As such, we first need to
change our internal abstractions a bit and move the creation of the TUN
thread into the `Tun` abstraction itself. For this, we change the
interface of `Tun` to the following:
- `poll_recv_many`: An API inspired by tokio's `mpsc::Receiver`, where
multiple items in a channel can be batch-received.
- `poll_send_ready`: Mimics the API of `Sink` to check whether more
items can be written.
- `send`: Mimics the API of `Sink` to actually send an item.
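In sketch form, the reworked interface looks roughly like this (simplified; `IpPacket` is a stand-in for connlib's packet type):

```rust
use std::task::{Context, Poll};

type IpPacket = Vec<u8>;

trait Tun: Send + 'static {
    /// Batch-receive up to `max` packets into `buf`, in the spirit of tokio's
    /// `mpsc::Receiver::poll_recv_many`; returns how many were received.
    fn poll_recv_many(
        &mut self,
        cx: &mut Context<'_>,
        buf: &mut Vec<IpPacket>,
        max: usize,
    ) -> Poll<usize>;

    /// `Sink`-style readiness check; `Pending` exerts back-pressure when the
    /// device (or its queue) cannot accept more packets.
    fn poll_send_ready(&mut self, cx: &mut Context<'_>) -> Poll<std::io::Result<()>>;

    /// `Sink`-style send; only valid after `poll_send_ready` returned
    /// `Ready(Ok(()))`.
    fn send(&mut self, packet: IpPacket) -> std::io::Result<()>;
}
```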
With these APIs in place, we can implement various (performance)
improvements for the different platforms.
- On Linux, this allows us to spawn multiple threads that read from and
write to the TUN device and send all packets into the same channel. The
`Io` component of `connlib` then uses `poll_recv_many` to read batches
of up to 100 packets at once. This ties in well with #7210 because we
can then use GSO to send the encrypted packets to the OS in single
syscalls.
- On Windows, we already have a dedicated recv thread because `WinTun`'s
most convenient API uses blocking IO. As such, we can now also tie into
that by batch-receiving from this channel.
- In addition to using multiple threads, this API now also uses correct
readiness checks on Linux, Darwin and Android to uphold back-pressure
in case we cannot write to the TUN device.
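For illustration, the batch-receiving consumer can look like this with tokio's `mpsc` (the real channel type may differ); the limit of 100 mirrors the batch size named above:

```rust
use std::task::{Context, Poll};
use tokio::sync::mpsc;

/// All TUN threads send into one channel; a single poll can surface up to
/// 100 packets, which can then be encrypted and flushed in few (GSO) syscalls.
fn poll_tun_batch(
    rx: &mut mpsc::Receiver<Vec<u8>>,
    cx: &mut Context<'_>,
    buf: &mut Vec<Vec<u8>>,
) -> Poll<usize> {
    buf.clear();

    rx.poll_recv_many(cx, buf, 100) // `Ready(0)` means all TUN threads exited
}
```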
## Configuration
Local testing has shown that 2 threads give the best performance for a
local `iperf3` run. I suspect this is because there is only so much
traffic that a single application (i.e. `iperf3`) can generate. With
more than 2 threads, the throughput actually drops drastically because
`connlib`'s main thread is too busy with lock-contention and triggering
`Waker`s for the TUN threads (which mostly idle around if there are 4+
of them). I've made the thread count configurable on the Gateway
though, so we can experiment with this during concurrent speedtests etc.
In addition, switching `connlib` to a single-threaded tokio runtime
further increased the throughput, presumably due to less task and
context switching.
## Results
Local testing with `iperf3` shows some very promising results. We now
achieve a throughput of 2+ Gbit/s.
```
Connecting to host 172.20.0.110, port 5201
Reverse mode, remote host 172.20.0.110 is sending
[ 5] local 100.80.159.34 port 57040 connected to 172.20.0.110 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 274 MBytes 2.30 Gbits/sec
[ 5] 1.00-2.00 sec 279 MBytes 2.34 Gbits/sec
[ 5] 2.00-3.00 sec 216 MBytes 1.82 Gbits/sec
[ 5] 3.00-4.00 sec 224 MBytes 1.88 Gbits/sec
[ 5] 4.00-5.00 sec 234 MBytes 1.96 Gbits/sec
[ 5] 5.00-6.00 sec 238 MBytes 2.00 Gbits/sec
[ 5] 6.00-7.00 sec 229 MBytes 1.92 Gbits/sec
[ 5] 7.00-8.00 sec 222 MBytes 1.86 Gbits/sec
[ 5] 8.00-9.00 sec 223 MBytes 1.87 Gbits/sec
[ 5] 9.00-10.00 sec 217 MBytes 1.82 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 2.30 GBytes 1.98 Gbits/sec 22247 sender
[ 5] 0.00-10.00 sec 2.30 GBytes 1.98 Gbits/sec receiver
iperf Done.
```
This is a pretty solid improvement over what is in `main`:
```
Connecting to host 172.20.0.110, port 5201
[ 5] local 100.65.159.3 port 56970 connected to 172.20.0.110 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 90.4 MBytes 758 Mbits/sec 1800 106 KBytes
[ 5] 1.00-2.00 sec 93.4 MBytes 783 Mbits/sec 1550 51.6 KBytes
[ 5] 2.00-3.00 sec 92.6 MBytes 777 Mbits/sec 1350 76.8 KBytes
[ 5] 3.00-4.00 sec 92.9 MBytes 779 Mbits/sec 1800 56.4 KBytes
[ 5] 4.00-5.00 sec 93.4 MBytes 783 Mbits/sec 1650 69.6 KBytes
[ 5] 5.00-6.00 sec 90.6 MBytes 760 Mbits/sec 1500 73.2 KBytes
[ 5] 6.00-7.00 sec 87.6 MBytes 735 Mbits/sec 1400 76.8 KBytes
[ 5] 7.00-8.00 sec 92.6 MBytes 777 Mbits/sec 1600 82.7 KBytes
[ 5] 8.00-9.00 sec 91.1 MBytes 764 Mbits/sec 1500 70.8 KBytes
[ 5] 9.00-10.00 sec 92.0 MBytes 771 Mbits/sec 1550 85.1 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 917 MBytes 769 Mbits/sec 15700 sender
[ 5] 0.00-10.00 sec 916 MBytes 768 Mbits/sec receiver
iperf Done.
```
In order to release the new control protocol to users, we need to bump
the versions of the clients to 1.4.0. The portal has a version gate to
only select gateways with version >= 1.4.0 for clients >= 1.4.0. Thus,
bumping these versions can only happen once testing has completed and
the gateway has actually been released as 1.4.0.
Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
Our `phoenix-channel` component is responsible for maintaining a
WebSocket connection to the portal. In case that connection fails, we
want to reconnect to it using an exponential backoff, eventually giving
up after a certain amount of time.
Unfortunately, the code we have today doesn't quite do that. An
`ExponentialBackoff` has a setting for the `max_elapsed_time`:
regardless of how many times and how often we retry something, we won't
ever keep retrying for longer than this amount of time. For the Relay,
this is set to 15 minutes. For the other components, it is either
indefinite (Gateway, headless-client) or very long (30 days for
Android, 1 day for Apple).
The point in time from which this duration is counted is when the
`ExponentialBackoff` is **constructed**, which translates to when we
**first** connected to the portal. As a result, our backoff would
immediately fail on the first error if more than `max_elapsed_time` has
passed since we first connected. For most components, this codepath is
not relevant because the `max_elapsed_time` is so long. For the Relay
however, it is only 15 minutes, so chances are the Relay would
immediately fail (and get rebooted) on the first connection error with
the portal.
To fix this, we now lazily create the `ExponentialBackoff` on the first
error.
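A sketch of the lazy construction, assuming the `backoff` crate and a hypothetical `Reconnect` wrapper; the clock inside `ExponentialBackoff` starts at construction, so we only construct it once the first error actually occurs:

```rust
use std::time::Duration;

use backoff::backoff::Backoff;
use backoff::{ExponentialBackoff, ExponentialBackoffBuilder};

struct Reconnect {
    backoff: Option<ExponentialBackoff>,
}

impl Reconnect {
    fn next_delay(&mut self) -> Option<Duration> {
        let backoff = self.backoff.get_or_insert_with(|| {
            ExponentialBackoffBuilder::new()
                .with_max_elapsed_time(Some(Duration::from_secs(15 * 60))) // e.g. the Relay's 15 minutes
                .build()
        });

        backoff.next_backoff() // `None` once `max_elapsed_time` has passed since construction
    }

    /// On a successful (re)connection, drop the backoff so the next failure
    /// starts a fresh window.
    fn on_connected(&mut self) {
        self.backoff = None;
    }
}
```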
This bug has some interesting consequences: when a Relay reboots, it
loses all of its state, i.e. allocations, channel bindings, available
nonces, the stamp-secret etc. Thus, all credentials and state that got
distributed to Clients and Gateways get invalidated, causing
disconnects from the Relay. We have observed these alerts in Sentry for
a while and couldn't explain them. Most likely, this bug is the root
cause: whilst a Relay is disconnected, the portal also cannot detect
its presence and pro-actively inform Clients and Gateways to no longer
use this Relay.
One of Rust's promises is "if it compiles, it works". However, there are
certain situations in which this isn't true. In particular, when using
dynamic typing patterns where trait objects are downcast to concrete
types, having two versions of the same dependency can silently break
things.
This happened in #7379 where I forgot to patch a certain Sentry
dependency. A similar problem exists with our `tracing-stackdriver`
dependency (see #7241).
Lastly, duplicate dependencies increase the compile-times of a project,
so we should aim for having as few duplicate versions of a particular
dependency as possible in our dependency graph.
This PR introduces `cargo deny`, a linter for Rust dependencies. In
addition to linting for duplicate dependencies, it also enforces that
all dependencies are compatible with an allow-list of licenses and it
warns when a dependency is referred to from multiple crates without
introducing a workspace dependency. Thanks to existing tooling
(https://github.com/mainmatter/cargo-autoinherit), transitioning all
dependencies to workspace dependencies was quite easy.
Resolves: #7241.
I think I finally understood and correctly traced where the use of ANSI
escape codes comes from. It turns out the `with_ansi` switch on
`tracing_subscriber::fmt::Layer` is what you want to toggle. From
there, it trickles down to the `Writer`, which we can then test for in
our `Format`.
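A minimal sketch of the downstream check, assuming a hypothetical helper inside our custom `Format`: whatever was configured via `with_ansi` is queryable on the `Writer` instead of having to be guessed.

```rust
use tracing_subscriber::fmt::format::Writer;

fn maybe_dimmed(writer: &Writer<'_>, text: &str) -> String {
    if writer.has_ansi_escapes() {
        format!("\x1b[2m{text}\x1b[0m") // dimmed via ANSI escape codes
    } else {
        text.to_owned() // plain text, e.g. for log files
    }
}
```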
Resolves: #7284.
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Using the clippy lint `unwrap_used`, we can automatically lint against
all uses of `.unwrap()` on `Result` and `Option`. This actually turns
up quite a few results. In most cases, they are invariants that can't
actually be hit; for these, we change them to `Option`. In other cases,
they can actually be hit, for example if the user supplies an invalid
log-filter.
Activating this lint ensures the compiler will yell at us every time we
use `.unwrap` to double-check whether we do indeed want to panic here.
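An illustrative case of such a fallible user input: with `#![deny(clippy::unwrap_used)]` active, parsing a log-filter with `.unwrap()` gets flagged and has to propagate the error instead.

```rust
use tracing_subscriber::EnvFilter;

/// Before: `EnvFilter::try_new(input).unwrap()`, which panics on bad input.
/// After: the error reaches the caller, who can report it to the user.
fn parse_log_filter(input: &str) -> Result<EnvFilter, tracing_subscriber::filter::ParseError> {
    EnvFilter::try_new(input)
}
```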
Resolves: #7292.
"Just let it crash" is terrible advice for software that is shipped to
end users. Where possible, we should use proper error handling and only
fail the current function / task that is active, e.g. drop a particular
packet instead of failing all of connlib. We more or less already do
that.
Activating the clippy lint `unwrap_in_result` surfaced a few more places
where we panic despite being in a function that is fallible already.
These cases can easily be converted to not panic and return an error
instead.
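A hypothetical before/after for `clippy::unwrap_in_result`: the function is already fallible, so there is no reason to panic inside it.

```rust
/// Before: `Ok(s.parse::<u16>().unwrap())`, a panic hiding in a fallible fn.
/// After: propagate the error to the caller.
fn parse_port(s: &str) -> Result<u16, std::num::ParseIntError> {
    s.parse()
}
```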
Sentry has a feature called the "User context" which allows us to
assign events to individual users. This in turn gives us statistics in
Sentry on how many users are affected by a certain issue.
Unfortunately, Sentry's user context cannot be built up step-by-step
but has to be set as a whole. To achieve this, we need to slightly
refactor `Telemetry` so that it is no longer `clone`d and is instead
passed around by mutable reference.
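A sketch of setting the whole user context in one go; `account_id` is a hypothetical field collected by `Telemetry`:

```rust
fn set_user_context(account_id: String) {
    sentry::configure_scope(|scope| {
        // The user has to be set as a whole, not field by field.
        scope.set_user(Some(sentry::protocol::User {
            id: Some(account_id),
            ..Default::default()
        }));
    });
}
```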
Resolves: #7248.
Related: https://github.com/getsentry/sentry-rust/issues/706.
`sentry`'s transport layer appears to use blocking IO for flushing
events. Performing blocking IO within a future that is running on a
tokio worker thread causes this operation to hang and eventually time
out after 5 seconds. As a result, many events - especially traces -
don't get flushed to Sentry when an app is being shut down.
To fix this, we make `Telemetry::stop` an `async fn` and offload the
flushing to a task on tokio's thread-pool for blocking IO.
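A sketch of the offloading, using the `sentry` crate's hub/client API:

```rust
use std::time::Duration;

/// Flushing blocks, so run it on tokio's blocking thread-pool instead of
/// inside an async worker thread.
async fn stop_telemetry() {
    let _ = tokio::task::spawn_blocking(|| {
        if let Some(client) = sentry::Hub::current().client() {
            client.flush(Some(Duration::from_secs(5)));
        }
    })
    .await;
}
```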
As a first step towards integrating Sentry into the Android app, we
launch the Sentry Rust agent as soon as a `connlib` session starts up.
At a later point, we can also integrate Sentry into the Android app
itself using the Java / Kotlin SDK.
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
This PR introduces a custom logging format for all Rust-components. It
is more or less a copy of `tracing_subscriber::fmt::format::Compact`
with the main difference that span-names don't get logged.
Spans are super useful because they allow us to record contextual
values, like the current connection ID, for a certain scope. What is IMO
less useful about them is that in the default formatter configuration,
active spans cause a right-drift of the actual log message.
The actual log message is still what most accurately describes what
`connlib` is currently doing. Spans only add contextual information
that the reader may use to further understand what is happening. This
optional nature of spans IMO means that they should come _after_ the
actual log message.
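A condensed sketch of the idea (not the actual formatter): the event's own message is written first, followed by the fields recorded on the active spans, so spans never right-shift the message.

```rust
use std::fmt;

use tracing::{Event, Subscriber};
use tracing_subscriber::fmt::format::Writer;
use tracing_subscriber::fmt::{FmtContext, FormatEvent, FormatFields, FormattedFields};
use tracing_subscriber::registry::LookupSpan;

struct MessageFirst;

impl<S, N> FormatEvent<S, N> for MessageFirst
where
    S: Subscriber + for<'a> LookupSpan<'a>,
    N: for<'a> FormatFields<'a> + 'static,
{
    fn format_event(
        &self,
        ctx: &FmtContext<'_, S, N>,
        mut writer: Writer<'_>,
        event: &Event<'_>,
    ) -> fmt::Result {
        // The actual log message (and the event's own fields) come first ...
        ctx.field_format().format_fields(writer.by_ref(), event)?;

        // ... followed by the contextual fields of each active span,
        // without printing the span names themselves.
        if let Some(scope) = ctx.event_scope() {
            for span in scope.from_root() {
                let ext = span.extensions();
                if let Some(fields) = ext.get::<FormattedFields<N>>() {
                    if !fields.is_empty() {
                        write!(writer, " {}", fields)?;
                    }
                }
            }
        }

        writeln!(writer)
    }
}
```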
Resolves: #7014.
With the new control protocol specified in #6461, the client will no
longer initiate new connections. Instead, the credentials are generated
deterministically by the portal based on the gateway's and the client's
public key. For as long as they use the same public key, they also have
the same in-memory state which makes creating connections idempotent.
What we didn't consider in the new design at first is that when clients
roam, they discard all connections but keep the same private key. As a
result, the portal would generate the same ICE credentials which means
the gateway thinks it can reuse the existing connection when new flows
get authorized. The client however discarded all connections (and
rotated its ports and maybe IPs), meaning the previous candidates sent
to the gateway are no longer valid and connectivity fails.
We fix this by also rotating the private keys upon reset. Rotating the
keys by itself isn't enough though: we also need to propagate the new
public key all the way "over" to the phoenix-channel component, which
lives separately from connlib's data plane.
To achieve this, we change `PhoenixChannel` to now start in the
"disconnected" state and require an explicit `connect` call. In
addition, the `LoginUrl` constructed by various components now acts
merely as a "prototype", which may require additional data to construct
a fully valid URL. In the case of client and gateway, this is the public
key of the `Node`. This additional parameter needs to be passed to
`PhoenixChannel` in the `connect` call, thus forming a type-safe
contract that ensures we never attempt to connect without providing a
public key.
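In sketch form (all types are simplified stand-ins for the real ones), the contract looks like this: the `LoginUrl` prototype cannot become a full URL without the missing piece, and `connect` is the only way to supply it.

```rust
struct PublicKey(String);

struct LoginUrlPrototype {
    base: String,
}

impl LoginUrlPrototype {
    /// Only a complete set of parameters yields a usable URL.
    fn finish(self, public_key: &PublicKey) -> String {
        format!("{}&public_key={}", self.base, public_key.0)
    }
}

struct PhoenixChannel {
    url: LoginUrlPrototype,
}

impl PhoenixChannel {
    /// Starts disconnected; connecting without a public key does not compile.
    fn connect(self, public_key: PublicKey) {
        let url = self.url.finish(&public_key);
        // ... open the WebSocket to `url` ...
        let _ = url;
    }
}
```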
For the relay, this doesn't apply.
Lastly, this allows us to tidy up the code a bit by:
a) generating the `Node`'s private key from the existing RNG
b) removing `ConnectArgs` which only had two members left
Related: #6461.
Related: #6732.
The `connlib-shared` crate has become a bit of a dependency magnet
without a clear purpose. It hosts utilities like `get_user_agent`,
messages for the client and gateway to communicate with the portal and
domain types like `ResourceId`.
To create a better dependency structure in our workspace, we repurpose
`connlib-shared` as a `connlib-model` crate. Its purpose is to host
domain-specific model types that multiple crates may want to use. For
that purpose, we rename the `callbacks::ResourceDescription` type to
`ResourceView`, designating that this is a _view_ onto a resource as
seen by `connlib`. The message types, which currently double up as the
connlib-internal model, thus become an implementation detail of
`firezone-tunnel` and shouldn't be used for anything else.
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
We want to associate additional device information with device
verification; these are all parameters that try to uniquely identify
the hardware.
For that reason, we read system information and send it as part of the
query params to the portal. That way, the portal can store this
information when the device is verified and match against it later on.
These are the parameters according to each platform:
| Platform | Query Field | Field Meaning |
|----------|-------------|---------------|
| macOS | `device_serial` | Hardware's Serial |
| macOS | `device_uuid` | Hardware's UUID |
| iOS | `identifier_for_vendor` | Identifier for vendor, resets only on uninstall/install |
| Android | `firebase_installation_id` | Firebase installation ID, resets only on uninstall/install |
| Windows | `device_serial` | Motherboard's Serial |
| Linux | `device_serial` | Motherboard's Serial |
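For illustration, attaching one of these identifiers with the `url` crate might look like this (`device_serial` per the table above):

```rust
fn append_device_serial(portal_url: &mut url::Url, serial: &str) {
    portal_url
        .query_pairs_mut()
        .append_pair("device_serial", serial);
}
```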
Fixes #6837
The `expect` attribute is similar to `allow` in that it silences a
particular lint. Unlike `allow` however, `expect` will fail as soon as
the lint is no longer emitted. This ensures we don't end up with stale
`allow` attributes in our codebase. Additionally, it provides a way of
adding a `reason` to document why the lint is being suppressed.
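A minimal illustration on a hypothetical function:

```rust
// Unlike `allow`, this errors once the lint stops firing, and `reason`
// documents why the suppression exists in the first place.
#[expect(clippy::unused_async, reason = "kept async for API compatibility")]
async fn send_noop() {}
```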
This publishes the 1.3.0 clients and gateways so that Internet Resources
will work.
The feature is still disabled for the Stripe plans until we publish the
launch post. Select customers have the feature enabled.
Closes #2667
For quite a while now, we have been making extensive use of
property-based testing to ensure `connlib` works as intended. The idea
of proptests is that - given a certain seed - we deterministically
sample test inputs and assert properties on a given function.
If the test fails, `proptest` prints the seed which can then be added to
a regressions file to iterate on the test case and fix it. It is quite
obvious that non-determinism in how the test input gets generated is no
bueno and reduces the value we get out of these tests a fair bit.
The `HashMap` and `HashSet` data structures are known to be
non-deterministic in their iteration order. This causes non-determinism
during the input generation because we make use of a lot of maps and
sets to gradually build up the test input. We fix all uses of `HashMap`
and `HashSet` by replacing them with `BTreeMap` and `BTreeSet`.
To ensure this doesn't regress, we refactor `tunnel_test` to not make
use of proptest's macros and instead initialise and run the test
ourselves. This allows us to dump the sampled state and transitions into
a file per test run. In CI, we then run a 2nd iteration of all
regression tests and compare the sampled state and transitions with the
previous run. They must match byte-for-byte.
Finally, to discourage use of non-deterministic iteration, we ban the
use of the iteration functions on `HashMap` and `HashSet` across the
codebase. This doesn't catch iteration in a `for`-loop but it is better
than not linting against it at all.
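For illustration, the difference in a nutshell:

```rust
use std::collections::BTreeMap;

/// `BTreeMap` iterates in key order, so input generation that samples from it
/// is stable across runs; `HashMap`'s order depends on a randomised hasher.
fn first_host(hosts: &BTreeMap<u32, String>) -> Option<&String> {
    hosts.values().next() // always the value of the smallest key
}
```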
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
With the upgrade to 0.23, `tokio-tungstenite` pulls in `rustls` 0.23,
which supports multiple crypto providers. By default, it uses the
`aws-lc-rs` provider; the previous default was `ring`.
This PR bumps the necessary versions and installs the `ring` crypto
provider at the beginning of each application, before connlib starts.
We do this as early as possible to make it obvious that it only needs
to happen once per process.
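A minimal sketch of the installation, assuming `rustls` 0.23's provider API; this is meant to be the first thing `main` does:

```rust
fn init_crypto_provider() {
    rustls::crypto::ring::default_provider()
        .install_default()
        .expect("`install_default` must only be called once per process");
}
```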
Resolves: #5380.
Currently, `connlib` can only handle "simple" DNS wildcards where `*`
matches any number of subdomains, including zero, and `?` matches a
single subdomain.
With this PR, we expand `connlib`'s capabilities to allow for much more
complex matching of domains that more closely resembles glob patterns:
- `**` matches any number of subdomains. This supersedes the previous
`*` operator.
- `*` matches a single subdomain. This supersedes the previous `?`
operator.
- `?` matches a single character. This wasn't possible before.
- Additionally, any of these can be combined. Previously, only `*` or
`?` was allowed and they were only accepted at the front of the domain
name pattern.
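For example, under the new rules, `**.example.com` matches `example.com`, `a.example.com` and `a.b.example.com`; `*.example.com` matches `a.example.com` but not `a.b.example.com`; and `app-?.example.com` matches `app-1.example.com` but not `app-10.example.com`.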
Resolves: #5056.
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Most of `connlib-shared` exists only for historical reasons. The
`Tunnel` has since been decoupled from the `Callbacks` and most error
variants on `ConnlibError` are not actually used.
This allows us to move a few things around and trim down `ConnlibError`
to just the variants that actually cause a call to `on_disconnect`.
Moving everything related to `proptest`s to `firezone-tunnel` also
requires us to delete the specialisation for printing IDs in a shorter
format during the tests. That is a bit unfortunate but was always kind
of a hack. I'd rather make progress on getting rid of `connlib-shared`
though and perhaps re-introduce that feature once the messages are fully
moved into the tunnel.
Related: #4470.
Setting up a logger is something that pretty much every entrypoint
needs to do, be it a test, a shared library embedded in another app or
a standalone application. Thus, it makes sense to introduce a dedicated
crate that bundles together how we want to do logging.
This allows us to introduce convenience functions like
`firezone_logging::test` which let you construct a logger for a test as
a one-liner.
Crucially though, introducing `firezone-logging` gives us a place to
store a default log directive that silences very noisy crates. When
looking into a problem, it is common to start by simply setting the
log-filter to `debug`. Without further action, this floods the output
with logs from crates like `netlink_proto` on Linux. It is very unlikely
that those are the logs that you want to see. Without a preset filter,
the only alternative here is to explicitly turn off the log filter for
`netlink_proto` by typing something like
`RUST_LOG=netlink_proto=off,debug`. Especially when debugging issues
with customers, this is annoying.
Log filters can be overridden, i.e. a 2nd directive that matches the
exact same scope overrides a previous one. Thus, with this design it is
still possible to activate certain logs at runtime, even if they are
silenced by default.
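A sketch of such a preset, with `netlink_proto` as the example from above; the user's directives come last, so an explicit `netlink_proto=debug` at runtime overrides the `off` for that same scope:

```rust
use tracing_subscriber::EnvFilter;

fn env_filter(user_directives: &str) -> EnvFilter {
    EnvFilter::try_new(format!("netlink_proto=off,{user_directives}"))
        .unwrap_or_else(|_| EnvFilter::new("info")) // fall back on invalid input
}
```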
I'd expect `firezone-logging` to attract more functionality in the
future. For example, we want to support re-loading of log-filters on
other platforms. Additionally, where logs get stored could also be
defined in this crate.
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Currently, `connlib` depends on `hickory-resolver` to perform DNS
queries for non-resources. This is unnecessary. Instead of buffering the
original UDP DNS query, consulting hickory to resolve the name and
mapping the response back, we can simply take the UDP payload and send
it via our protected socket directly to the original upstream DNS
server.
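In sketch form (socket setup and the "protected" aspect elided), the forwarding really is just a relay of the payload:

```rust
use std::net::SocketAddr;
use tokio::net::UdpSocket;

/// The client's original DNS payload is relayed verbatim: no parsing, no
/// buffering, no internal retries.
async fn forward_dns_query(
    protected_socket: &UdpSocket,
    upstream: SocketAddr,
    query_payload: &[u8],
) -> std::io::Result<()> {
    protected_socket.send_to(query_payload, upstream).await?;

    Ok(())
}
```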
This ensures `connlib` is as transparent as possible for DNS queries
for non-resources. Additionally, it removes a lot of error handling and
other cruft that we currently have to maintain because we are using
hickory. For example, hickory will automatically retry a DNS query after
a certain timeout. However, the OS / client talking to `connlib` will
also retry after a certain timeout because it is making DNS queries over
an unreliable transport (UDP). It is thus unnecessary for us to do that
internally.
To correctly test this change, our test-suite needed some refactoring.
Specifically, DNS servers are now modelled as dedicated `Host`s that can
receive (UDP) traffic.
Lastly, we can remove our dependency on `hickory-proto` and
`hickory-resolver` everywhere and only use `domain` for parsing DNS
messages.
Resolves: #6141.
Related: #6033.
Related: #4800. (Impossible to happen with this design)
Builds on top of #6164.
Part of the effort towards
https://github.com/firezone/firezone/issues/6074.
Prepares connlib to call `setDisableResource` from Android.
Furthermore, we add a `disablable` parameter for resources, which
defaults to `false` for now. In the future, the portal will set it for
the Internet Resource, and further down the line it may be used for
other resources. The `disablable` parameter only affects the UI.
Connection roaming within `connlib` has changed a fair bit since we
introduced the `reconnect` function. The new implementation is
essentially a hard reset of all state within `connlib`. Renaming this
function across all layers makes this more obvious.
Resolves: #6038.
As part of debugging full-route tunneling on Windows, we discovered
that we always need to explicitly choose the interface through which we
want to send packets; otherwise, Windows may cause a routing loop by
routing our packets back into the TUN device.
We already have a `SocketFactory` abstraction in `connlib` that is used
by each platform to customise the setup of each socket to prevent
routing loops.
So far, this abstraction directly returns tokio sockets which don't
allow us to intercept the actual sending of packets. For some of our
traffic, i.e. the UDP packets exchanged with relays, we don't specify a
source address. To make full-route work on Windows, we need to intercept
these packets and explicitly set the source address.
To achieve that, we introduce dedicated `TcpSocket` and `UdpSocket`
structs within `socket-factory`. With this in place, we will be able to
add Windows-conditional code that looks up and sets the source address
of outgoing UDP packets. For TCP sockets, the lookup will happen prior
to connecting to the address and be used to bind to the correct
interface.
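A sketch of the TCP half, with a hypothetical `source_address_for` standing in for the route lookup:

```rust
use std::net::{IpAddr, Ipv4Addr, SocketAddr};
use tokio::net::{TcpSocket, TcpStream};

/// Look up the source address for the destination first, bind to it, then
/// connect; binding pins the interface and avoids the routing loop.
async fn connect_via_best_interface(dst: SocketAddr) -> std::io::Result<TcpStream> {
    let src = source_address_for(&dst)?;

    let socket = TcpSocket::new_v4()?;
    socket.bind(SocketAddr::new(src, 0))?;

    socket.connect(dst).await
}

fn source_address_for(_dst: &SocketAddr) -> std::io::Result<IpAddr> {
    // Windows-conditional in the real code, e.g. based on the routing table.
    Ok(IpAddr::V4(Ipv4Addr::new(192, 0, 2, 1))) // documentation-range placeholder
}
```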
Related: #2667.
Related: #5955.