firezone

mirror of https://github.com/outbackdingo/firezone.git synced 2026-01-27 10:18:54 +00:00

Author	SHA1	Message	Date
Thomas Eizinger	f5779ff921	chore: release Gateway, headless-client and GUI client (#7903 ) This bumps the versions of Gateway, headless-client and the GUI client as well as updates the respective changelogs. These have been released today: - https://github.com/firezone/firezone/releases/tag/gui-client-1.4.1 - https://github.com/firezone/firezone/releases/tag/gateway-1.4.3 - https://github.com/firezone/firezone/releases/tag/headless-client-1.4.1 It is all done in one PR to avoid merge conflicts within the updates of the Makefile.	2025-01-28 16:17:58 +00:00
Thomas Eizinger	c6492d4832	fix(rust): don't start all log files with `connlib.` (#7853 ) At present, the file logger for all Rust code starts each logfile with `connlib.`. This is very confusing when exporting the logs from the GUI client because even the logs from the client itself will start with `connlib.`. To fix this, we make the base file name of the log file configurable.	2025-01-28 01:35:05 +00:00
Thomas Eizinger	e50b719d5c	refactor(headless-client): remove `FIREZONE_TOKEN` CLI arg (#7770 ) The current CLI of the headless-client allows passing the token as a positional parameter in addition to an env variable. This can be very confusing if you make a spelling error in the _command_ that you are trying to pass to the CLI, i.e. `standalone`. A misspelled command will be interpreted as the token to use to connect to the portal without any warning that it is similar to a command. The env variable `FIREZONE_TOKEN` is completely ignored in that case. To fix this, we remove the ability to pass the token via stdin. The token should instead be set via an env variable or read from a file at `FIREZONE_TOKEN_PATH`. --------- Signed-off-by: Thomas Eizinger <thomas@eizinger.io> Co-authored-by: Jamil <jamilbk@users.noreply.github.com>	2025-01-21 14:22:54 +00:00
Thomas Eizinger	ed5285268d	refactor: merge `on_update_routes` and `on_set_interface_config` (#7699 ) For a while now, `connlib` has been calling these two callbacks right after each other because the internal event already bundles all the information about the TUN device. With this PR, we merge the two callback functions also in layers above `connlib` itself. Resolves: #6182.	2025-01-08 18:26:40 +00:00
Thomas Eizinger	c595bb872d	chore(telemetry): use same DSN for GUI and IPC service (#7667 ) The application-split itself doesn't really warrant having two different Sentry projects. 1. The location of the panic / log already tells us, which component is failing. 2. Both of the projects are built with Rust so the same "platform" setting applies. 3. Reducing the number of Sentry projects makes things easier to manage. 4. The binaries are started as independent processes, so the two Sentry contexts don't interfere. What we should keep in mind is that one instance of an application will now log into Sentry twice using the same DSN. I _think_ this means that the number of sessions listed in Sentry will be double the number of actual client-runs. The same is true for the Apple client though and once we integrate Sentry for Android, the same will apply there so relative to each other, those numbers still make sense.	2025-01-05 18:26:45 +00:00
Thomas Eizinger	9789eb5353	refactor(gui-client): de-duplicate code for deleting subkey (#7574 ) The current way this is implemented is a bit tricky to read. By splitting out a dedicated function and adding some logging, it becomes more apparent what we do here. --------- Signed-off-by: Thomas Eizinger <thomas@eizinger.io>	2024-12-29 08:45:50 +00:00
Thomas Eizinger	6f0b471652	fix(gui-client): ensure IPC errors are mapped correctly (#7554 ) In order to make sure that the correct error message is displayed to users, we need to preserve error information as much as possible.	2024-12-23 11:44:09 +00:00
Thomas Eizinger	c7a4221cb7	fix(gui-client): flush telemetry events on IPC service exit (#7557 ) Due to how we currently initialise telemetry in the IPC service, I think we are missing out on events when it _exits_ due to an error because we don't explicitly stop the telemetry session. We have alerts from a fair few users in Sentry where the IPC service appears to stop / disappear but there are no corresponding events for the IPC service.	2024-12-19 20:06:47 +00:00
Thomas Eizinger	bc2febed99	fix(connlib): use correct constant for truncating DNS responses (#7551 ) In case an upstream DNS server responds with a payload that exceeds the available buffer space of an IP packet, we need to truncate the response. Currently, this truncation uses the wrong constant to check for the maximum allowed length. Instead of the `MAX_DATAGRAM_PAYLOAD`, we actually need to check against a limit that is less than the MTU as the IP layer and the UDP layer both add an overhead. To fix this, we introduce such a constant and provide additional documentation on the remaining ones to hopefully avoid future errors.	2024-12-19 17:15:43 +00:00
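The length check described in bc2febed99 can be sketched as follows. This is a minimal illustration, not `connlib`'s code: the MTU value, all constant and function names, and the choice to budget for the IPv6 header are assumptions made for the example. The header sizes are the fixed IPv4 (no options), IPv6 (no extension headers) and UDP header lengths.

```rust
/// Assumed MTU of the TUN device; a hypothetical value for this sketch.
const MTU: usize = 1280;
/// IPv4 header without options.
const IPV4_HEADER: usize = 20;
/// Fixed IPv6 header without extension headers.
const IPV6_HEADER: usize = 40;
/// UDP header.
const UDP_HEADER: usize = 8;

/// Largest UDP payload that fits into a single IP packet of `mtu` bytes.
const fn max_udp_payload(mtu: usize, ip_header: usize) -> usize {
    mtu - ip_header - UDP_HEADER
}

/// Hypothetical limit for DNS responses. It budgets for the larger IPv6
/// header so the same limit is safe for both IP versions.
const MAX_DNS_PAYLOAD: usize = max_udp_payload(MTU, IPV6_HEADER);

/// Returns the (possibly shortened) response and whether it was cut.
/// Only the length check is shown; a real implementation would also set
/// the TC bit and cut at a record boundary.
fn truncate_dns_response(response: &[u8]) -> (&[u8], bool) {
    if response.len() <= MAX_DNS_PAYLOAD {
        return (response, false);
    }
    (&response[..MAX_DNS_PAYLOAD], true)
}
```

The point of the sketch is that the limit is derived from the MTU minus both header overheads, rather than from a constant that describes the datagram payload alone.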
Thomas Eizinger	b63061994d	chore(headless-client): release version 1.4.0 (#7495 ) Headless Client 1.4.0 has been released (https://github.com/firezone/firezone/releases/tag/headless-client-1.4.0). This PR updates the changelog and version numbers accordingly.	2024-12-13 07:10:11 +00:00
Thomas Eizinger	f0c2bfa6eb	chore(gui-client): release version 1.4.0 (#7496 ) GUI Client 1.4.0 has been released (https://github.com/firezone/firezone/releases/tag/gui-client-1.4.0). This PR updates the changelog and versions accordingly.	2024-12-13 04:41:49 +00:00
Thomas Eizinger	81f71cba62	fix(telemetry): use `package@version` notation for releases (#7466 ) In order for Sentry to parse our releases as semver, they need to be in the form of `package@version` [0]. Without this, the feature of "Mark this issue as resolved in the _next_ version" doesn't work properly because Sentry compares the versions as to when it first saw them vs parsing the semver string itself. We test versions prior to releasing them, meaning Sentry learns about a 1.4.0 version before it is actually released. This causes false-positive "regressions" even though they are fixed in a later (as per semver) release. This creates some redundancy with the different DSNs that we are already using. I think it would make sense to consider merging the two projects we have for the GUI client for example. That is really just one project that happens to run as two binaries. For all other projects, I think the separation still makes sense because we e.g. may add Sentry to the "host" applications of Android and MacOS/iOS as well. For those, we would reuse the DSN and thus funnel the issues into the same Sentry project. As per Sentry's docs, releases are organisation-wide and therefore need a package identifier to be grouped correctly. [0]: https://docs.sentry.io/platforms/javascript/configuration/releases/#bind-the-version	2024-12-09 05:04:45 +00:00
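The `package@version` notation from 81f71cba62 amounts to a small string format. A minimal sketch, assuming a hypothetical helper name and example package name (neither is taken from the repository):

```rust
/// Builds a release identifier in the `package@version` form.
/// `release_name` is a hypothetical helper for this sketch.
fn release_name(package: &str, version: &str) -> String {
    format!("{package}@{version}")
}
```

In a Cargo build, the arguments could come from `env!("CARGO_PKG_NAME")` and `env!("CARGO_PKG_VERSION")`, which Cargo sets at compile time. Whether the project derives the package part this way is not stated in the commit.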
Thomas Eizinger	90cf191a7c	feat(linux): multi-threaded TUN device operations (#7449 )

## Context

At present, we only have a single thread that reads and writes to the TUN device on all platforms. On Linux, it is possible to open the file descriptor of a TUN device multiple times by setting the `IFF_MULTI_QUEUE` option using `ioctl`. Using multi-queue, we can then spawn multiple threads that concurrently read and write to the TUN device. This is critical for achieving a better throughput.

## Solution

`IFF_MULTI_QUEUE` is a Linux-only thing and therefore only applies to headless-client, GUI-client on Linux and the Gateway (it may also be possible on Android, I haven't tried). As such, we need to first change our internal abstractions a bit to move the creation of the TUN thread to the `Tun` abstraction itself. For this, we change the interface of `Tun` to the following:

- `poll_recv_many`: An API, inspired by tokio's `mpsc::Receiver` where multiple items in a channel can be batch-received.
- `poll_send_ready`: Mimics the API of `Sink` to check whether more items can be written.
- `send`: Mimics the API of `Sink` to actually send an item.

With these APIs in place, we can implement various (performance) improvements for the different platforms.

- On Linux, this allows us to spawn multiple threads to read and write from the TUN device and send all packets into the same channel. The `Io` component of `connlib` then uses `poll_recv_many` to read batches of up to 100 packets at once. This ties in well with #7210 because we can then use GSO to send the encrypted packets in single syscalls to the OS.
- On Windows, we already have a dedicated recv thread because `WinTun`'s most-convenient API uses blocking IO. As such, we can now also tie into that by batch-receiving from this channel.
- In addition to using multiple threads, this API now also uses correct readiness checks on Linux, Darwin and Android to uphold backpressure in case we cannot write to the TUN device.

## Configuration

Local testing has shown that 2 threads give the best performance for a local `iperf3` run. I suspect this is because there is only so much traffic that a single application (i.e. `iperf3`) can generate. With more than 2 threads, the throughput actually drops drastically because `connlib`'s main thread is too busy with lock-contention and triggering `Waker`s for the TUN threads (which mostly idle around if there are 4+ of them). I've made it configurable on the Gateway though so we can experiment with this during concurrent speedtests etc.

In addition, switching `connlib` to a single-threaded tokio runtime further increased the throughput. I suspect due to less task / context switching.

## Results

Local testing with `iperf3` shows some very promising results. We now achieve a throughput of 2+ Gbit/s.

```
Connecting to host 172.20.0.110, port 5201
Reverse mode, remote host 172.20.0.110 is sending
[  5] local 100.80.159.34 port 57040 connected to 172.20.0.110 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   274 MBytes  2.30 Gbits/sec
[  5]   1.00-2.00   sec   279 MBytes  2.34 Gbits/sec
[  5]   2.00-3.00   sec   216 MBytes  1.82 Gbits/sec
[  5]   3.00-4.00   sec   224 MBytes  1.88 Gbits/sec
[  5]   4.00-5.00   sec   234 MBytes  1.96 Gbits/sec
[  5]   5.00-6.00   sec   238 MBytes  2.00 Gbits/sec
[  5]   6.00-7.00   sec   229 MBytes  1.92 Gbits/sec
[  5]   7.00-8.00   sec   222 MBytes  1.86 Gbits/sec
[  5]   8.00-9.00   sec   223 MBytes  1.87 Gbits/sec
[  5]   9.00-10.00  sec   217 MBytes  1.82 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  2.30 GBytes  1.98 Gbits/sec  22247   sender
[  5]   0.00-10.00  sec  2.30 GBytes  1.98 Gbits/sec          receiver

iperf Done.
```

This is a pretty solid improvement over what is in `main`:

```
Connecting to host 172.20.0.110, port 5201
[  5] local 100.65.159.3 port 56970 connected to 172.20.0.110 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  90.4 MBytes   758 Mbits/sec  1800    106 KBytes
[  5]   1.00-2.00   sec  93.4 MBytes   783 Mbits/sec  1550   51.6 KBytes
[  5]   2.00-3.00   sec  92.6 MBytes   777 Mbits/sec  1350   76.8 KBytes
[  5]   3.00-4.00   sec  92.9 MBytes   779 Mbits/sec  1800   56.4 KBytes
[  5]   4.00-5.00   sec  93.4 MBytes   783 Mbits/sec  1650   69.6 KBytes
[  5]   5.00-6.00   sec  90.6 MBytes   760 Mbits/sec  1500   73.2 KBytes
[  5]   6.00-7.00   sec  87.6 MBytes   735 Mbits/sec  1400   76.8 KBytes
[  5]   7.00-8.00   sec  92.6 MBytes   777 Mbits/sec  1600   82.7 KBytes
[  5]   8.00-9.00   sec  91.1 MBytes   764 Mbits/sec  1500   70.8 KBytes
[  5]   9.00-10.00  sec  92.0 MBytes   771 Mbits/sec  1550   85.1 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   917 MBytes   769 Mbits/sec  15700   sender
[  5]   0.00-10.00  sec   916 MBytes   768 Mbits/sec          receiver

iperf Done.
```
	2024-12-05 00:18:20 +00:00
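The "several reader threads feed one channel, the consumer drains it in batches" pattern from 90cf191a7c can be sketched with the standard library alone. This is an illustration under stated assumptions, not the `Tun` interface itself: `recv_many`, `spawn_readers` and `MAX_BATCH` are hypothetical names, `std::sync::mpsc` stands in for the async channel, and the "packets" are dummy byte vectors rather than reads from a TUN queue.

```rust
use std::sync::mpsc::{sync_channel, Receiver, TryRecvError};
use std::thread;

/// Batch size mentioned in the commit message (up to 100 packets at once).
const MAX_BATCH: usize = 100;

/// Moves up to `max` already-queued items into `buf` without blocking and
/// returns how many were received.
fn recv_many<T>(rx: &Receiver<T>, buf: &mut Vec<T>, max: usize) -> usize {
    let before = buf.len();
    while buf.len() - before < max {
        match rx.try_recv() {
            Ok(item) => buf.push(item),
            // Nothing queued right now, or all senders are gone.
            Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => break,
        }
    }
    buf.len() - before
}

/// Spawns `num_threads` producer threads that all send into one bounded
/// channel. The bound provides backpressure: producers block when the
/// consumer falls behind.
fn spawn_readers(num_threads: usize, per_thread: usize) -> Receiver<Vec<u8>> {
    let (tx, rx) = sync_channel(1024);
    for id in 0..num_threads {
        let tx = tx.clone();
        thread::spawn(move || {
            for seq in 0..per_thread {
                // A real reader thread would read from its own TUN queue here.
                if tx.send(vec![id as u8, seq as u8]).is_err() {
                    break; // The consumer went away.
                }
            }
        });
    }
    rx
}
```

A consumer would call `recv_many(&rx, &mut buf, MAX_BATCH)` in its loop and hand the whole batch to the next stage, which is what makes batched sends possible downstream.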
Thomas Eizinger	48bd0f9804	chore: bump client versions to 1.4.0 (#7092 ) In order to release the new control protocol to users, we need to bump the versions of the clients to 1.4.0. The portal has a version gate to only select gateways with version >= 1.4.0 for clients >= 1.4.0. Thus, bumping these versions can only happen once testing has completed and the gateway has actually been released as 1.4.0. Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>	2024-12-04 19:48:51 +00:00
Thomas Eizinger	dd6b52b236	chore(rust): share edition key via workspace table (#7451 )	2024-12-03 00:28:06 +00:00
Thomas Eizinger	f81f8b2ed7	fix(gui-client): don't share log-directives via file system (#7445 ) At present, the GUI client shares the current log-directives with the IPC service via the file system. Supposedly, this has been done to allow the IPC service to start back up with the same log filter as before. This behaviour appears to be buggy though as we are receiving a fair number of error reports where this file is not writable. Instead of relying on the file system to communicate, we send the current log-directives to the IPC service as soon as we start up. The IPC service then uses the file system to cache that log string and re-applies it on the next startup. This way, no two programs need to read / write the same file. The IPC service runs with higher privileges, so this should resolve the permission errors we are seeing in Sentry.	2024-12-02 23:28:43 +00:00
Thomas Eizinger	4f92a0d7ca	refactor(gui-client): tidy up GUI controller code (#7444 ) This PR intends to be a pure refactoring, i.e. no behaviour change. It simplifies a few aspects of the GUI controller event-loop by getting rid of the `select!` macro. We also remove some indirection of the `gui_controller::Builder`.	2024-12-02 20:07:44 +00:00
Thomas Eizinger	e833cb4f30	fix(rust): don't log and return `DisconnectError`s (#7416 ) These will be handled by whoever sits on the other side of the channel. Logging these here as well causes duplicate logs and error reports to Sentry.	2024-12-01 20:22:29 +00:00
Thomas Eizinger	932f6791fb	fix(phoenix-channel): lazily create backoff timer (#7414 ) Our `phoenix-channel` component is responsible for maintaining a WebSocket connection to the portal. In case that connection fails, we want to reconnect to it using an exponential backoff, eventually giving up after a certain amount of time. Unfortunately, the code we have today doesn't quite do that. An `ExponentialBackoff` has a setting for the `max_elapsed_time`. Regardless of how many and how often we retry something, we won't ever wait longer than this amount of time. For the Relay, this is set to 15min. For other components it's indefinite (Gateway, headless-client), or very long (30 days for Android, 1 day for Apple). The point in time from which this duration is counted is when the `ExponentialBackoff` is constructed which translates to when we first connected to the portal. As a result, our backoff would immediately fail on the first error if it has been longer than `max_elapsed_time` since we first connected. For most components, this codepath is not relevant because the `max_elapsed_time` is so long. For the Relay however, that is only 15 minutes so chances are, the Relay would immediately fail (and get rebooted) on the first connection error with the portal. To fix this, we now lazily create the `ExponentialBackoff` on the first error. This bug has some interesting consequences: When a relay reboots, it loses all its state, i.e. allocations, channel bindings, available nonces etc, stamp-secret. Thus, all credentials and state that got distributed to Clients and Gateways get invalidated, causing disconnects from the Relay. We have observed these alerts in Sentry for a while and couldn't explain them. Most likely, this is the root cause for those because whilst a Relay disconnects, the portal also cannot detect its presence and pro-actively inform Clients and Gateways to no longer use this Relay.	2024-11-29 20:19:11 +00:00
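The difference between an eagerly and a lazily created backoff in 932f6791fb can be sketched like this. The `Backoff` type below is a simplified stand-in written for this example, not the `ExponentialBackoff` type the commit refers to, and `Connection`, `on_error` and `on_connected` are hypothetical names. Time is passed in explicitly so the behaviour can be checked without sleeping.

```rust
use std::time::{Duration, Instant};

/// Simplified stand-in for an exponential backoff with a `max_elapsed_time`.
struct Backoff {
    started_at: Instant,
    max_elapsed_time: Duration,
    next_delay: Duration,
}

impl Backoff {
    fn new(now: Instant, max_elapsed_time: Duration) -> Self {
        Self {
            started_at: now,
            max_elapsed_time,
            next_delay: Duration::from_secs(1),
        }
    }

    /// Returns the next delay, or `None` once `max_elapsed_time` has passed
    /// since the backoff was constructed.
    fn next_backoff(&mut self, now: Instant) -> Option<Duration> {
        if now.duration_since(self.started_at) > self.max_elapsed_time {
            return None;
        }
        let delay = self.next_delay;
        self.next_delay *= 2;
        Some(delay)
    }
}

/// Hypothetical connection state that creates its backoff lazily.
struct Connection {
    max_elapsed_time: Duration,
    backoff: Option<Backoff>,
}

impl Connection {
    fn new(max_elapsed_time: Duration) -> Self {
        Self { max_elapsed_time, backoff: None }
    }

    /// The clock for `max_elapsed_time` starts at the first error,
    /// not at the time the connection was first established.
    fn on_error(&mut self, now: Instant) -> Option<Duration> {
        let max = self.max_elapsed_time;
        self.backoff
            .get_or_insert_with(|| Backoff::new(now, max))
            .next_backoff(now)
    }

    /// A successful reconnect discards the backoff so the next failure
    /// starts from a fresh one.
    fn on_connected(&mut self) {
        self.backoff = None;
    }
}
```

With a 15 minute limit and a first error one hour after connecting, a `Backoff` constructed at connect time returns `None` immediately, while `Connection::on_error` returns a one second delay.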
Thomas Eizinger	c6e7e6192e	build(rust): bump Rust to 1.83 (#7409 ) Rust 1.83 comes with a bunch of new lints for elidible lifetimes. Those also trigger in the generated code of `derivative`. That crate is actually unmaintained so we replace our usages of it with `derive_more`.	2024-11-29 01:04:06 +00:00
Thomas Eizinger	2c26fc9c0e	ci: lint Rust dependencies using `cargo deny` (#7390 ) One of Rust's promises is "if it compiles, it works". However, there are certain situations in which this isn't true. In particular, when using dynamic typing patterns where trait objects are downcast to concrete types, having two versions of the same dependency can silently break things. This happened in #7379 where I forgot to patch a certain Sentry dependency. A similar problem exists with our `tracing-stackdriver` dependency (see #7241). Lastly, duplicate dependencies increase the compile-times of a project, so we should aim for having as few duplicate versions of a particular dependency as possible in our dependency graph. This PR introduces `cargo deny`, a linter for Rust dependencies. In addition to linting for duplicate dependencies, it also enforces that all dependencies are compatible with an allow-list of licenses and it warns when a dependency is referred to from multiple crates without introducing a workspace dependency. Thanks to existing tooling (https://github.com/mainmatter/cargo-autoinherit), transitioning all dependencies to workspace dependencies was quite easy. Resolves: #7241.	2024-11-22 00:17:28 +00:00
Thomas Eizinger	c93391e8fd	chore(headless-client): setup logging earlier (#7385 ) Logging needs to be set up as early as possible to ensure we capture log messages such as `Starting telemetry`.	2024-11-20 01:30:37 +00:00
Thomas Eizinger	86ada01828	fix(gui-client): initialise `sentry-tracing` for IPC service (#7363 ) It was already a bit sus that we didn't receive as many errors in Sentry from the IPC service as from the GUI client. Turns out that we forgot to initialise our `sentry_layer` there. Additionally, we also didn't initialise the `LogTracer`, meaning we didn't capture logs from the `log` crate which is used by some of the dependencies, for example `wintun`.	2024-11-18 22:40:01 +00:00
Thomas Eizinger	24f7ba530d	refactor(gui-client): add more context to connection failures (#7364 ) Adding more context to these errors makes it easier to identify, which of the operations fails. In addition, we remove some usages of the "log and return" anti-pattern to avoid duplicate reports of the same issue.	2024-11-18 18:16:16 +00:00
Thomas Eizinger	2b3469954a	chore(headless-client): allow disabling telemetry (#7350 ) I've started to set this in my local env to not spam Sentry with events while I am developing.	2024-11-15 08:14:36 +00:00
Thomas Eizinger	0cb96f5a18	chore(gui-client): publish version 1.3.13 (#7346 )	2024-11-15 06:52:38 +00:00
Thomas Eizinger	4fc7e62ba8	chore(headless-client): publish version 1.3.7 (#7348 )	2024-11-15 05:39:39 +00:00
Thomas Eizinger	8c5a5fa690	chore(rust): correctly disable ANSI escapes globally (#7336 ) I think I finally understood and correctly traced, where the use of ANSI escape codes came from. It turns out, the `with_ansi` switch on `tracing_subscriber::fmt::Layer` is what you want to toggle. From there, it trickles down to the `Writer` which we can then test for in our `Format`. Resolves: #7284. --------- Signed-off-by: Thomas Eizinger <thomas@eizinger.io>	2024-11-14 05:00:53 +00:00
Thomas Eizinger	48ba2869a8	chore(rust): ban the use of `.unwrap` except in tests (#7319 ) Using the clippy lint `unwrap_used`, we can automatically lint against all uses of `.unwrap()` on `Result` and `Option`. This turns up quite a few results actually. In most cases, they are invariants that can't actually be hit. For these, we change them to `Option`. In other cases, they can actually be hit. For example, if the user supplies an invalid log-filter. Activating this lint ensures the compiler will yell at us every time we use `.unwrap` to double-check whether we do indeed want to panic here. Resolves: #7292.	2024-11-13 03:59:22 +00:00
Jamil	6f7f6a4f34	style: Enforce code style across all supported languages using Prettier (#7322 ) This ensures that we run prettier across all supported filetypes to check for any formatting / style inconsistencies. Previously, it was only run for files in the website/ directory using a deprecated pre-commit plugin. The benefit to keeping this in our pre-commit config is that devs can optionally run these checks locally with `pre-commit run --config .github/pre-commit-config.yaml`. --------- Signed-off-by: Jamil <jamilbk@users.noreply.github.com> Co-authored-by: Thomas Eizinger <thomas@eizinger.io>	2024-11-13 00:19:15 +00:00
Thomas Eizinger	9e9dfd5e97	chore(gui-client): downgrade warning to debug (#7313 ) With a retry-mechanism in place, there is no need to log a warning when `connect_to_service` fails. Instead, we just log this on DEBUG and continue trying. If it fails after all attempts, the entire function will bail out and we will receive a Sentry event from error handling higher up the callstack.	2024-11-12 03:54:49 +00:00
Thomas Eizinger	ad4eea29ff	chore(rust): don't panic in fallible functions (#7298 ) "Just let it crash" is terrible advice for software that is shipped to end users. Where possible, we should use proper error handling and only fail the current function / task that is active, e.g. drop a particular packet instead of failing all of connlib. We more or less already do that. Activating the clippy lint `unwrap_in_result` surfaced a few more places where we panic despite being in a function that is fallible already. These cases can easily be converted to not panic and return an error instead.	2024-11-11 23:55:23 +00:00
Thomas Eizinger	0dc078876b	refactor(gui-client): capture error sources when connect fails (#7303 ) When `connlib` fails to establish a session, the GUI client currently only captures the top-level error within `connect_to_firezone` because it uses `.to_string()` for all errors. Unfortunately, that doesn't print any of the sources of an error. To conveniently capture all sources, we can use `anyhow` and its alternate formatting using `format!("{e:#}")` (notice the `#`). Not all errors within `connect_to_firezone` should be captured like this however. Certain IO errors, in particular when trying to resolve the domain of the portal, need to be captured separately because they may resolve by themselves if we gain connectivity again. This is important, otherwise we discard the users token when they boot-up a machine without internet access yet Firezone is auto-starting. To make this more ergonomic, we trim down `IpcServiceError` to two variants: The IO variant we need to special-case and everything else. This allows us to create `From` impls which "do the right thing" by capturing more error information using `anyhow`'s alternate formatting.	2024-11-11 22:52:14 +00:00
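The difference between `.to_string()` and a formatting that includes all error sources, as discussed in 0dc078876b, can be shown with the standard library alone. `format_chain` below is a hypothetical helper that walks `Error::source()` and joins the messages with `": "`, which mirrors the shape of `anyhow`'s alternate (`{:#}`) output. The two error types and their messages are invented for the example.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical low-level error, e.g. a failed DNS lookup.
#[derive(Debug)]
struct ResolveFailed;

impl fmt::Display for ResolveFailed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to resolve portal domain")
    }
}

impl Error for ResolveFailed {}

/// Hypothetical top-level error that wraps the low-level one.
#[derive(Debug)]
struct ConnectFailed {
    source: ResolveFailed,
}

impl fmt::Display for ConnectFailed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to connect to Firezone")
    }
}

impl Error for ConnectFailed {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Joins an error and all of its sources into one line.
fn format_chain(error: &dyn Error) -> String {
    let mut out = error.to_string();
    let mut source = error.source();
    while let Some(cause) = source {
        out.push_str(": ");
        out.push_str(&cause.to_string());
        source = cause.source();
    }
    out
}
```

Calling `.to_string()` on a `ConnectFailed` yields only the top-level message, whereas `format_chain` also includes the cause. With `anyhow`, no such helper is needed because the alternate formatting already walks the chain.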
Thomas Eizinger	488c599d5b	chore(telemetry): capture Firezone ID and account in user ctx (#7310 ) Sentry has a feature called the "User context" which allows us to assign events to individual users. This in turn will give us statistics in Sentry, how many users are affected by a certain issue. Unfortunately, Sentry's user context cannot be built-up step-by-step but has to be set as a whole. To achieve this, we need to slightly refactor `Telemetry` to not be `clone`d and instead passed around by mutable reference. Resolves: #7248. Related: https://github.com/getsentry/sentry-rust/issues/706.	2024-11-11 19:50:14 +00:00
Jamil	1dda915376	ci: Publish new clients (#7291 ) Fixes the roaming bug.	2024-11-08 22:58:06 +00:00
Thomas Eizinger	cdd3e4d25c	fix(headless-client): don't fuse futures outside of the loop (#7287 ) When waiting on multiple futures concurrently within a loop, it is important that they all get re-created whenever one of them resolves. Currently, due to the `.fuse` call, the SIGHUP signal can only be sent once and future signals get ignored. As a more general fix, I swapped the `futures::select!` macro to the `tokio::select!` macro which allows referencing these futures without pinning and fusing. Ideally, we'd not use any of these macros here and write our own eventloop but that is a larger refactoring.	2024-11-08 05:01:37 +00:00
Thomas Eizinger	e261cb3c27	chore: remove `git_version!` (#7270 ) Reading the Git version requires the entire Git repository to be present, including all tags. The tags are only created _after_ the artifact is being built, when we publish the release. Therefore, these tags are never included in the actual released binary. For Sentry, we use the `CARGO_PKG_VERSION` variable instead. This doesn't tell us whether somebody built a client from source and then used it so there could be some confusion in Sentry events. It is quite unlikely that this happens though so for the majority of Sentry alerts, this will give us the correct version. For the Android client, we also depend on the `GITHUB_SHA` env variable at compile-time. We do the same thing for the GUI client here. Resolves: #6925.	2024-11-07 22:56:17 +00:00
Jamil	71fbfab2d5	fix(gui-client): Include rust files when replacing version sentinels (#7278 ) Fixes an issue where the ipc_service was stuck reporting 1.3.10.	2024-11-06 19:25:56 +00:00
Thomas Eizinger	47e45a3cf3	chore(telemetry): improve telemetry spans and events (#7206 ) DNS resolution is a critical part of `connlib`. If it is slow for whatever reason, users will notice this. To make sure we notice as well, we add `telemetry` spans to the client's and gateway's DNS resolution. For the client, this applies to all DNS queries that we forward to the upstream servers. For the gateway, this applies to all DNS resources. In addition to those IO operations, we also instrument the `match_resource_linear` function. This function operates in `O(n)` of all defined DNS resources. It _should_ be fast enough to not create an impact but it can't hurt to measure this regardless. Lastly, we also instrument `refresh_translations` on the gateway. Refreshing the DNS resolution of a DNS resource should really only happen, when the previous IP addresses become stale yet the user is still trying to send traffic to them. We don't actually have any data on how often that happens. By instrumenting it, we can gather some of this data. To make sure that none of these telemetry events and spans hurt the end-user performance, we introduce macros to `firezone-logging` that sample the creation of these events and spans at a rate of 1%. I ran a flamegraph and none of these even showed up. The most critical one here is probably the `match_resource_linear` span because it happens on every DNS query. Resolves: #7198. --------- Signed-off-by: Thomas Eizinger <thomas@eizinger.io>	2024-11-06 01:17:57 +00:00
Thomas Eizinger	78ebad13ab	chore(rust): log more errors as `tracing::Value`s (#7208 ) Logging these as structured values gives us a better stacktrace in Sentry (assuming the errors themselves make proper use of defining an error-chain).	2024-11-05 14:36:47 +00:00
Thomas Eizinger	5564e578fe	fix(telemetry): flush sentry.io events in dedicated task (#7205 ) `sentry`'s transport layer appears to be using blocking IO for flushing events. Performing blocking IO within a future that is running on a worker-thread of tokio causes this operation to hang and eventually time-out after 5 seconds. As a result, many events - especially traces - don't get flushed to sentry when an app is being shut down. To fix this, we make `Telemetry::stop` an `async fn` and offload the flushing to a task on tokio's thread-pool for blocking IO.	2024-11-01 15:52:09 +00:00
Thomas Eizinger	88404c3148	chore: publish `headless-client` v1.3.5 (#7191 ) Signed-off-by: Thomas Eizinger <thomas@eizinger.io>	2024-10-31 20:49:24 +00:00
Reactor Scram	51250faa0d	chore(telemetry): make the firezone device ID a context not a tag (#7179 ) Closes #7175 Also fixes a bug with the initialization order of Tokio and Sentry. Previously: 1. Start Tokio, executor threads inherit main thread context 2. Load device ID and set it on the main telemetry hub Now: 1. Load device ID and set it on the main telemetry hub 2. Start Tokio, executor threads inherit main thread context The context and possibly tags didn't seem to propagate from the main hub if we set them after the worker threads spawned. Based on this understanding, the IPC service process is still wrong, but a fix will have to wait, because telemetry in the IPC service is more complicated than in the GUI process.	2024-10-30 21:27:17 +00:00
Thomas Eizinger	82fcad0a3b	refactor(rust): only send `telemetry` spans to Sentry (#7153 ) With the introduction of the `tracing-sentry` integration in #7105, we started sending tracing spans to Sentry. By default, all spans with level INFO and above get sampled at the configured rate and sent to Sentry. This results in a lot of useless transaction in Sentry because we use INFO level spans in multiple places in connlib to attach contextual information like the current connection ID. This PR introduces the concept of `telemetry` spans which - similar to the `telemetry` log target in #7147 - qualifies a span for being sent to Sentry. By convention, these are also defined as requiring the TRACE level. This ensures we won't ever see them as part of regular log output.	2024-10-24 20:25:26 +00:00
Thomas Eizinger	ee30368970	refactor(connlib): simplify error handling on crash (#7134 ) The `fmt::Display` implementation of `tokio::task::JoinError` already does exactly what we do here: Extracting the panic message if there is one. Thus, we can simplify this code by just moving the `JoinError` into the `DisconnectError` as its source.	2024-10-23 16:13:39 +00:00
Thomas Eizinger	4020756e7f	chore: remove accidentally committed debugging code (#7130 )	2024-10-23 03:37:25 +00:00
Reactor Scram	2e51274ab0	fix(rust/gui-client): fix the version reported by the IPC service to the portal (#7123 ) Closes #7122 It had been reporting the Headless Client version, since the IPC service is built as part of the Headless Client crate. Now it's corrected from 1.3.5 to 1.3.10	2024-10-22 20:30:00 +00:00
Thomas Eizinger	0b25e34ebe	fix(headless-client): stop telemetry while `connlib` is active (#7109 ) Flushing events to Sentry requires us to be able to resolve domain names. This is only possible while connlib is active or completely disabled. Without this, stopping telemetry pretty much always times out for me on my local machine when using the headless-client.	2024-10-22 16:08:29 +00:00
dependabot[bot]	1c7ffb79ce	build(deps): Bump serde_json from 1.0.129 to 1.0.132 in /rust (#7114 ) Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.129 to 1.0.132. Release notes (sourced from [serde_json's releases](https://github.com/serde-rs/json/releases)): 1.0.132: - Improve binary size and compile time for JSON array and JSON object deserialization by about 50% (#1205) - Improve performance of JSON array and JSON object deserialization by about 8% (#1206) 1.0.131: - Implement Deserializer and IntoDeserializer for `Map<String, Value>` and `&Map<String, Value>` (#1135, thanks @swlynch99) 1.0.130: - Support converting and deserializing `Number` from i128 and u128 (#1141, thanks @druide) Commits: - `86d933c` Release 1.0.132 - `f45b422` Merge pull request #1206 from dtolnay/hasnext - `f2082d2` Clearer order of comparisons - `0f54a1a` Handle early return sooner on eof in seq or map - `2a4cb44` Rearrange 'match peek' - `4cb90ce` Merge pull request #1205 from dtolnay/hasnext - `b71ccd2` Reduce duplicative instantiation of logic in SeqAccess and MapAccess - `a810ba9` Release 1.0.131 - `0d084c5` Touch up PR 1135 - `b4954a9` Merge pull request #1135 from swlynch99/map-deserializer - Additional commits viewable in [compare view](https://github.com/serde-rs/json/compare/1.0.129...1.0.132) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-10-22 15:42:09 +00:00
Thomas Eizinger	73eebd2c4d	refactor(rust): consistently record errors as `tracing::Value` (#7104 ) Our logging library, `tracing`, supports structured logging. This is useful because it preserves more than just the string representation of a value and thus allows the active logging backend(s) to capture more information for a particular value. In the case of errors, this is especially useful because it allows us to capture the sources of a particular error. Unfortunately, recording an error as a tracing value is a bit cumbersome because `tracing::Value` is only implemented for `&dyn std::error::Error`. Casting an error to this is quite verbose. To make it easier, we introduce two utility functions in `firezone-logging`: - `std_dyn_err` - `anyhow_dyn_err` Tracking errors as correct `tracing::Value`s will be especially helpful once we enable Sentry's `tracing` integration: https://docs.rs/sentry-tracing/latest/sentry_tracing/#tracking-errors	2024-10-22 04:46:26 +00:00

225 Commits