Commit Graph

248 Commits

Author SHA1 Message Date
Thomas Eizinger
f882edb3bd feat(gui-client): configure IPC service to log to stdout (#8219)
On Linux, logs sent to stdout from a systemd-service are automatically
captured by `journald`. This is where most admins expect logs to be and
frankly, doing any kind of debugging of Firezone is much easier if you
can do `journalctl -efu firezone-client-ipc.service` in a terminal and
check what the IPC service is doing.

On Windows, stdout from a service is (unfortunately) ignored.

To achieve this and also allow dynamically changing the log-filter, I
had to introduce a (long-overdue) abstraction over tracing's "reload"
layer that allows us to combine multiple reload-handles into one.
Unfortunately, neither the `reload::Layer` nor the `reload::Handle`
implement `Clone`, which makes this unnecessarily difficult.

Related: #8173
2025-02-23 00:23:29 +00:00
Thomas Eizinger
6f68b97558 chore(gui-client): release v1.4.6 (#8211) 2025-02-20 04:25:38 +00:00
Thomas Eizinger
7ea17c144a refactor(gui-client): de-duplicate logging of IPC message errors (#8157) 2025-02-17 14:21:52 +00:00
Thomas Eizinger
72782b8389 fix(gui-client): update telemetry context on new session (#8152)
Every time we start a new session, our telemetry context potentially
changes, i.e. the user may sign into a new account. This should ensure
that both the IPC service and the GUI always use the most up-to-date
`account_slug` as part of Sentry events. In addition, this will also set
the `account_slug` for clients that just signed in. Previously, the
`account_slug` would only get populated on the next start of the client.
2025-02-17 03:29:08 +00:00
Thomas Eizinger
8f0db6ad47 fix(connlib): run all callbacks on a separate thread (#8126)
At present, `connlib` communicates with its host app via callbacks.
These callbacks are executed synchronously as part of `connlib`s
event-loop, meaning `connlib` cannot do anything else whilst the
callback is executing in the host app. Additionally, this callback runs
within the `Future` that represents `connlib` and thus runs on a `tokio`
worker thread.

Attempting to interact with the session from within the callback can
lead to panics, for example when `Session::disconnect` is called which
uses `Runtime::block_on`. This isn't allowed by `tokio`: You cannot
block on the execution of an async task from within one of the worker
threads.

To solve both of these problems, we introduce a thread-pool of size 1
that is responsible for executing `connlib` callbacks. Not only does
this allow `connlib` to perform more work such as routing packets or
process portal messages, it also means that it is not possible for the
host app to cause these panics within the `tokio` runtime because the
callbacks run on a different thread.
2025-02-14 06:54:35 +00:00
Jamil
5afeb30f6f ci: Bump GUI clients to 1.4.5 (#8113) 2025-02-12 20:56:27 +00:00
Thomas Eizinger
1847e8407a chore: release Headless Client v1.4.3 (#8093) 2025-02-11 14:10:13 +00:00
Thomas Eizinger
6093199ee3 chore: release GUI Client v1.4.4 (#8092) 2025-02-11 14:09:34 +00:00
Thomas Eizinger
7dcda1dc74 fix(windows): silence 0x800706D9 when DNS deactivation fails (#8085)
The error code we see here means "There are no more endpoints available
from the endpoint mapper." This has something to do with Windows'
internal RPC communication between components. DNS deactivation is on a
best-effort basis and it appears that everything else is working just
fine, despite this error.

It appears to happen when we shut down our own service, so perhaps it is
just a race condition.
2025-02-11 05:38:37 +00:00
Jamil
e59aa0c93f chore: Hide internal commands/flags in headless clients (#8055)
These are just noise for the user and only used internally in Firezone.
2025-02-10 22:38:31 +00:00
Thomas Eizinger
a0c96f7899 refactor(windows): don't fail install on missing IPC service (#8072)
In order to test changes to the IPC service on Windows more easily, the
IPC service binary offers an `install` command that installs a new
"Debug" IPC service. Prior to that, the previous is uninstalled.

This doesn't work if one doesn't have a previous "Debug" IPC service so
the `install` command only works for devs that have at least run it once
with that part of the function commented out. To improve this Dev UX, we
don't abort if we can't uninstall the previous one.
2025-02-10 22:01:01 +00:00
Jamil
eb3c269d05 ci: Publish headless client 1.4.2 (#8080)
Publishes the headless client 1.4.2, now with Windows support.

Resolves: #3782
2025-02-10 19:10:36 +00:00
dependabot[bot]
5013f9b9c6 build(deps): bump sd-notify from 0.4.3 to 0.4.5 in /rust (#8078)
Bumps [sd-notify](https://github.com/lnicola/sd-notify) from 0.4.3 to
0.4.5.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/lnicola/sd-notify/blob/master/CHANGELOG.md">sd-notify's
changelog</a>.</em></p>
<blockquote>
<h2>[0.4.5] - 2025-01-18</h2>
<h3>Fixed</h3>
<ul>
<li>fixed a dubious transmute between different slice types</li>
</ul>
<h2>[0.4.4] - 2025-01-16</h2>
<h3>Added</h3>
<ul>
<li>added <code>NotifyState::MonotonicUsec</code>, for use with
<code>Type=notify-reload</code></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="70a941baf1"><code>70a941b</code></a>
Bump to 0.4.5</li>
<li><a
href="6958ce12e4"><code>6958ce1</code></a>
Merge pull request <a
href="https://redirect.github.com/lnicola/sd-notify/issues/15">#15</a>
from tbu-/pr_slice_transmute</li>
<li><a
href="1e938f2fd5"><code>1e938f2</code></a>
Use <code>slice::from_raw_parts</code> instead of
<code>mem::transmute</code></li>
<li><a
href="cb4459a4bb"><code>cb4459a</code></a>
Prepare for new release</li>
<li><a
href="8eb2c5cab3"><code>8eb2c5c</code></a>
Add NotifyState::MonotonicUsec and helper</li>
<li><a
href="5462699164"><code>5462699</code></a>
Add NotifyState::MonotonicUsec and helper</li>
<li><a
href="6990e3733f"><code>6990e37</code></a>
Fix clippy warnings</li>
<li>See full diff in <a
href="https://github.com/lnicola/sd-notify/compare/v0.4.3...v0.4.5">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=sd-notify&package-manager=cargo&previous-version=0.4.3&new-version=0.4.5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-10 10:48:29 +00:00
Jamil
b8852b3e7a ci: attach Windows headless client to release (#8041)
This publishes the windows headless client using the same convention set
forth by the linux headless client.

Docs and website changes will come in a subsequent PR.

Related: #3782
Resolves: #8046
2025-02-08 13:51:56 +00:00
Thomas Eizinger
6f70ebe2b8 fix(gui-client): graceful shutdown of IPC service on Windows (#8034)
When the IPC service is shutdown gracefully (i.e. purposely), we send a
`TerminatingGracefully` message of the IPC channel. This allows the GUI
to handle this case differently from the a crash.

On Linux, this is achieved by reacting to signals that are sent to the
IPC process. Windows however doesn't send any signals to services.
Instead, we get an event that we are being shutdown.

Currently, this event is handled separately from the signal handler and
the signal handler does nothing on Windows. To make this more uniform
and allow graceful shutdown of the IPC service on Windows, we introduce
a 2nd constructor to the `Terminate` signal abstraction that is already
hooked up with the correct logic here.
2025-02-06 14:21:39 +00:00
Thomas Eizinger
d2e9b09874 refactor(rust): stringify errors early (#8033)
As it turns out, the effort in #7104 was not a good idea. By logging
errors as values, most of our Sentry reports all have the same title and
thus cannot be differentiated from within the overview at all. To fix
this, we stringify errors with all their sources whenever they got
logged. This ensures log messages are unique and all Sentry issues will
have a useful title.
2025-02-06 14:18:35 +00:00
Thomas Eizinger
3118c3b9cf chore: release GUI client v1.4.3 (#8019) 2025-02-04 22:13:58 +00:00
Thomas Eizinger
855f89ec89 chore(windows): log user that service is running as (#8002)
In order to make debugging permission issues easier, log the elevation
status and the user as part of booting up the IPC service.
2025-02-03 06:16:48 +00:00
Thomas Eizinger
f2725df922 fix(gui-client): don't fail DNS control on nameservers (#7972)
As part of fixing #6777, we added a code path to explicitly set
nameservers on the tunnel interface. This was necessary to make DNS
resources work under WSL. Exactly this code also seems to be causing
issues where this particular registry key is not available on a users
machine.

DNS resources outside of WSL will also work without this, therefore we
make this part optional to at least have something working on the users
machine.

Related: #7962.
2025-02-03 05:48:58 +00:00
Thomas Eizinger
a94c611c05 chore(gui-client): improve logging when applying NRPT rules (#7967)
Should help in further debugging problems like #7962 in the future.
2025-02-01 21:12:23 +00:00
Jamil
2683fa5242 chore: Release GUI client 1.4.2 (#7964)
Releasing the Windows 1.4.2 client with the interface name fix.
2025-01-30 09:46:24 -08:00
Thomas Eizinger
a2a92a52b4 refactor(gui-client): simplify IPC message handling code (#7955)
At present, the GUI client uses a separate task for reading messages
from the IPC connection and forwards them to another channel. The other
end of this channel is then used within the controller to actually react
to IPC messages.

We can simplify this by removing the intermediary task and processing
the messages from the IPC connection directly.
2025-01-30 17:26:19 +00:00
Thomas Eizinger
96daf5c1b2 refactor(gui-client): unify UDS connection errors (#7957)
Both of these branches lead to the same `Error::Other` variant. Might as
well not have different branches then.
2025-01-30 17:03:33 +00:00
Thomas Eizinger
f5779ff921 chore: release Gateway, headless-client and GUI client (#7903)
This bumps the versions of Gateway, headless-client and the GUI client
as well as updates the respective changelogs. These have been released
today:

- https://github.com/firezone/firezone/releases/tag/gui-client-1.4.1
- https://github.com/firezone/firezone/releases/tag/gateway-1.4.3
-
https://github.com/firezone/firezone/releases/tag/headless-client-1.4.1

It is all done in one PR to avoid merge conflicts within the updates of
the Makefile.
2025-01-28 16:17:58 +00:00
Thomas Eizinger
c6492d4832 fix(rust): don't start all log files with connlib. (#7853)
At present, the file logger for all Rust code starts each logfile with
`connlib.`. This is very confusing when exporting the logs from the GUI
client because even the logs from the client itself will start with
`connlib.`. To fix this, we make the base file name of the log file
configurable.
2025-01-28 01:35:05 +00:00
Thomas Eizinger
e50b719d5c refactor(headless-client): remove FIREZONE_TOKEN CLI arg (#7770)
The current CLI of the headless-client allows passing the token as a
positional parameter in addition to an env variable. This can be very
confusing if you make a spelling error in the _command_ that you are
trying to pass to the CLI, i.e. `standalone`. A misspelled command will
be interpreted as the token to use to connect to the portal without any
warning that it is similar to a command. The env variable
`FIREZONE_TOKEN` is completely ignored in that case.

To fix this, we remove the ability to pass the token via stdin. The
token should instead be set via en env variable or read from a file at
`FIREZONE_TOKEN_PATH`.

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2025-01-21 14:22:54 +00:00
Thomas Eizinger
ed5285268d refactor: merge on_update_routes and on_set_interface_config (#7699)
For a while now, `connlib` has been calling these two callbacks right
after each other because the internal event already bundles all the
information about the TUN device. With this PR, we merge the two
callback functions also in layers above `connlib` itself.

Resolves: #6182.
2025-01-08 18:26:40 +00:00
Thomas Eizinger
c595bb872d chore(telemetry): use same DSN for GUI and IPC service (#7667)
The application-split itself doesn't really warrant having two different
Sentry projects.

1. The location of the panic / log already tells us, which component is
failing.
2. Both of the projects are built with Rust so the same "platform"
setting applies.
3. Reducing the number of Sentry projects makes things easier to manage.
4. The binaries are started as independent processes, so the two Sentry
contexts don't interfere.

What we should keep in mind is that one instance of an application will
now log into Sentry twice using the same DSN. I _think_ this means that
the number of sessions listed in Sentry will be double the number of
actual client-runs. The same is true for the Apple client though and
once we integrate Sentry for Android, the same will apply there so
relative to each other, those numbers still make sense.
2025-01-05 18:26:45 +00:00
Thomas Eizinger
9789eb5353 refactor(gui-client): de-duplicte code for deleting subkey (#7574)
The current way this is implemented is a bit tricky to read. By
splitting out a dedicated function and adding some logging, it becomes
more apparent what we do here.

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
2024-12-29 08:45:50 +00:00
Thomas Eizinger
6f0b471652 fix(gui-client): ensure IPC errors are mapped correctly (#7554)
In order to make sure that the correct error message is displayed to
users, we need to preserve error information as much as possible.
2024-12-23 11:44:09 +00:00
Thomas Eizinger
c7a4221cb7 fix(gui-client): flush telemetry events on IPC service exit (#7557)
Due to how we currently initialise telemetry in the IPC service, I think
we are missing out on events when it _exits_ due to an error because we
don't explicitly stop the telemetry session. We have alerts from a fair
few users in Sentry where the IPC service appears to stop / disappear
but there are no corresponding events for the IPC service.
2024-12-19 20:06:47 +00:00
Thomas Eizinger
bc2febed99 fix(connlib): use correct constant for truncating DNS responses (#7551)
In case an upstream DNS server responds with a payload that exceeds the
available buffer space of an IP packet, we need to truncate the
response. Currently, this truncation uses the **wrong** constant to
check for the maximum allowed length. Instead of the
`MAX_DATAGRAM_PAYLOAD`, we actually need to check against a limit that
is less than the MTU as the IP layer and the UDP layer both add an
overhead.

To fix this, we introduce such a constant and provide additional
documentation on the remaining ones to hopefully avoid future errors.
2024-12-19 17:15:43 +00:00
Thomas Eizinger
b63061994d chore(headless-client): release version 1.4.0 (#7495)
Headless Client 1.4.0 has been released
(https://github.com/firezone/firezone/releases/tag/headless-client-1.4.0).
This PR updates the changelog and version numbers accordingly.
2024-12-13 07:10:11 +00:00
Thomas Eizinger
f0c2bfa6eb chore(gui-client): release version 1.4.0 (#7496)
GUI Client 1.4.0 has been released
(https://github.com/firezone/firezone/releases/tag/gui-client-1.4.0).
This PR updates the changelog and versions accordingly.
2024-12-13 04:41:49 +00:00
Thomas Eizinger
81f71cba62 fix(telemetry): use package@version notation for releases (#7466)
In order for Sentry to parse our releases as semver, they need to be in
the form of `package@version` [0]. Without this, the feature of "Mark
this issue as resolved in the _next_ version" doesn't work properly
because Sentry compares the versions as to when it first saw them vs
parsing the semver string itself. We test versions prior to releasing
them, meaning Sentry learns about a 1.4.0 version before it is actually
released. This causes false-positive "regressions" even though they are
fixed in a later (as per semver) release.

This create some redundancy with the different DSNs that we are already
using. I think it would make sense to consider merging the two projects
we have for the GUI client for example. That is really just one project
that happens to run as two binaries.

For all other projects, I think the separation still makes sense because
we e.g. may add Sentry to the "host" applications of Android and
MacOS/iOS as well. For those, we would reuse the DSN and thus funnel the
issues into the same Sentry project.

As per Sentry's docs, releases are organisation-wide and therefore need
a package identifier to be grouped correctly.

[0]:
https://docs.sentry.io/platforms/javascript/configuration/releases/#bind-the-version
2024-12-09 05:04:45 +00:00
Thomas Eizinger
90cf191a7c feat(linux): multi-threaded TUN device operations (#7449)
## Context

At present, we only have a single thread that reads and writes to the
TUN device on all platforms. On Linux, it is possible to open the file
descriptor of a TUN device multiple times by setting the
`IFF_MULTI_QUEUE` option using `ioctl`. Using multi-queue, we can then
spawn multiple threads that concurrently read and write to the TUN
device. This is critical for achieving a better throughput.

## Solution

`IFF_MULTI_QUEUE` is a Linux-only thing and therefore only applies to
headless-client, GUI-client on Linux and the Gateway (it may also be
possible on Android, I haven't tried). As such, we need to first change
our internal abstractions a bit to move the creation of the TUN thread
to the `Tun` abstraction itself. For this, we change the interface of
`Tun` to the following:

- `poll_recv_many`: An API, inspired by tokio's `mpsc::Receiver` where
multiple items in a channel can be batch-received.
- `poll_send_ready`: Mimics the API of `Sink` to check whether more
items can be written.
- `send`: Mimics the API of `Sink` to actually send an item.

With these APIs in place, we can implement various (performance)
improvements for the different platforms.

- On Linux, this allows us to spawn multiple threads to read and write
from the TUN device and send all packets into the same channel. The `Io`
component of `connlib` then uses `poll_recv_many` to read batches of up
to 100 packets at once. This ties in well with #7210 because we can then
use GSO to send the encrypted packets in single syscalls to the OS.
- On Windows, we already have a dedicated recv thread because `WinTun`'s
most-convenient API uses blocking IO. As such, we can now also tie into
that by batch-receiving from this channel.
- In addition to using multiple threads, this API now also uses correct
readiness checks on Linux, Darwin and Android to uphold backpressure in
case we cannot write to the TUN device.

## Configuration

Local testing has shown that 2 threads give the best performance for a
local `iperf3` run. I suspect this is because there is only so much
traffic that a single application (i.e. `iperf3`) can generate. With
more than 2 threads, the throughput actually drops drastically because
`connlib`'s main thread is too busy with lock-contention and triggering
`Waker`s for the TUN threads (which mostly idle around if there are 4+
of them). I've made it configurable on the Gateway though so we can
experiment with this during concurrent speedtests etc.

In addition, switching `connlib` to a single-threaded tokio runtime
further increased the throughput. I suspect due to less task / context
switching.

## Results

Local testing with `iperf3` shows some very promising results. We now
achieve a throughput of 2+ Gbit/s.

```
Connecting to host 172.20.0.110, port 5201
Reverse mode, remote host 172.20.0.110 is sending
[  5] local 100.80.159.34 port 57040 connected to 172.20.0.110 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   274 MBytes  2.30 Gbits/sec
[  5]   1.00-2.00   sec   279 MBytes  2.34 Gbits/sec
[  5]   2.00-3.00   sec   216 MBytes  1.82 Gbits/sec
[  5]   3.00-4.00   sec   224 MBytes  1.88 Gbits/sec
[  5]   4.00-5.00   sec   234 MBytes  1.96 Gbits/sec
[  5]   5.00-6.00   sec   238 MBytes  2.00 Gbits/sec
[  5]   6.00-7.00   sec   229 MBytes  1.92 Gbits/sec
[  5]   7.00-8.00   sec   222 MBytes  1.86 Gbits/sec
[  5]   8.00-9.00   sec   223 MBytes  1.87 Gbits/sec
[  5]   9.00-10.00  sec   217 MBytes  1.82 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  2.30 GBytes  1.98 Gbits/sec  22247             sender
[  5]   0.00-10.00  sec  2.30 GBytes  1.98 Gbits/sec                  receiver

iperf Done.
```

This is a pretty solid improvement over what is in `main`:

```
Connecting to host 172.20.0.110, port 5201
[  5] local 100.65.159.3 port 56970 connected to 172.20.0.110 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  90.4 MBytes   758 Mbits/sec  1800    106 KBytes
[  5]   1.00-2.00   sec  93.4 MBytes   783 Mbits/sec  1550   51.6 KBytes
[  5]   2.00-3.00   sec  92.6 MBytes   777 Mbits/sec  1350   76.8 KBytes
[  5]   3.00-4.00   sec  92.9 MBytes   779 Mbits/sec  1800   56.4 KBytes
[  5]   4.00-5.00   sec  93.4 MBytes   783 Mbits/sec  1650   69.6 KBytes
[  5]   5.00-6.00   sec  90.6 MBytes   760 Mbits/sec  1500   73.2 KBytes
[  5]   6.00-7.00   sec  87.6 MBytes   735 Mbits/sec  1400   76.8 KBytes
[  5]   7.00-8.00   sec  92.6 MBytes   777 Mbits/sec  1600   82.7 KBytes
[  5]   8.00-9.00   sec  91.1 MBytes   764 Mbits/sec  1500   70.8 KBytes
[  5]   9.00-10.00  sec  92.0 MBytes   771 Mbits/sec  1550   85.1 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   917 MBytes   769 Mbits/sec  15700             sender
[  5]   0.00-10.00  sec   916 MBytes   768 Mbits/sec                  receiver

iperf Done.
```
2024-12-05 00:18:20 +00:00
Thomas Eizinger
48bd0f9804 chore: bump client versions to 1.4.0 (#7092)
In order to release the new control protocol to users, we need to bump
the versions of the clients to 1.4.0. The portal has a version gate to
only select gateways with version >= 1.4.0 for clients >= 1.4.0. Thus,
bumping these versions can only happen once testing has completed and
the gateway has actually been released as 1.4.0.

Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
2024-12-04 19:48:51 +00:00
Thomas Eizinger
dd6b52b236 chore(rust): share edition key via workspace table (#7451) 2024-12-03 00:28:06 +00:00
Thomas Eizinger
f81f8b2ed7 fix(gui-client): don't share log-directives via file system (#7445)
At present, the GUI client shares the current log-directives with the
IPC service via the file system. Supposedly, this has been done to allow
the IPC service to start back up with the same log filter as before.
This behaviour appears to be buggy though as we are receiving a fair
number of error reports where this file is not writable.

Instead of relying on the file system to communicate, we send the
current log-directives to the IPC service as soon as we start up. The
IPC service then uses the file system as a cache that log string and
re-apply it on the next startup. This way, no two programs need to read
/ write the same file. The IPC service runs with higher privileges, so
this should resolve the permission errors we are seeing in Sentry.
2024-12-02 23:28:43 +00:00
Thomas Eizinger
4f92a0d7ca refactor(gui-client): tidy up GUI controller code (#7444)
This PR intends to be a pure refactoring, i.e. no behaviour change. It
simplifies a few aspects of the GUI controller event-loop by getting rid
of the `select!` macro. We also remove some indirection of the
`gui_controller::Builder`.
2024-12-02 20:07:44 +00:00
Thomas Eizinger
e833cb4f30 fix(rust): don't log and return DisconnectErrors (#7416)
These will be handled by whoever sits on the other side of the channel.
Logging these here as well causes duplicate logs and error reports to
Sentry.
2024-12-01 20:22:29 +00:00
Thomas Eizinger
932f6791fb fix(phoenix-channel): lazily create backoff timer (#7414)
Our `phoenix-channel` component is responsible for maintaining a
WebSocket connection to the portal. In case that connection fails, we
want to reconnect to it using an exponential backoff, eventually giving
up after a certain amount of time.

Unfortunately, the code we have today doesn't quite do that. An
`ExponentialBackoff` has a setting for the `max_elapsed_time`.
Regardless of how many and how often we retry something, we won't ever
wait longer than this amount of time. For the Relay, this is set to
15min. For other components its indefinite (Gateway, headless-client),
or very long (30 days for Android, 1 day for Apple).

The point in time from which this duration is counted is when the
`ExponentialBackoff` is **constructed** which translates to when we
**first** connected to the portal. As a result, our backoff would
immediately fail on the first error if it has been longer than
`max_elapsed_time` since we first connected. For most components, this
codepath is not relevant because the `max_elapsed_time` is so long. For
the Relay however, that is only 15 minutes so chances are, the Relay
would immediately fail (and get rebooted) on the first connection error
with the portal.

To fix this, we now lazily create the `ExponentialBackoff` on the first
error.

This bug has some interesting consequences: When a relay reboots, it
looses all its state, i.e. allocations, channel bindings, available
nonces etc, stamp-secret. Thus, all credentials and state that got
distributed to Clients and Gateways get invalidated, causing disconnects
from the Relay. We have observed these alerts in Sentry for a while and
couldn't explain them. Most likely, this is the root cause for those
because whilst a Relay disconnects, the portal also cannot detect its
presence and pro-actively inform Clients and Gateways to no longer use
this Relay.
2024-11-29 20:19:11 +00:00
Thomas Eizinger
c6e7e6192e build(rust): bump Rust to 1.83 (#7409)
Rust 1.83 comes with a bunch of new lints for elidible lifetimes. Those
also trigger in the generated code of `derivative`. That crate is
actually unmaintained so we replace our usages of it with `derive_more`.
2024-11-29 01:04:06 +00:00
Thomas Eizinger
2c26fc9c0e ci: lint Rust dependencies using cargo deny (#7390)
One of Rust's promises is "if it compiles, it works". However, there are
certain situations in which this isn't true. In particular, when using
dynamic typing patterns where trait objects are downcast to concrete
types, having two versions of the same dependency can silently break
things.

This happened in #7379 where I forgot to patch a certain Sentry
dependency. A similar problem exists with our `tracing-stackdriver`
dependency (see #7241).

Lastly, duplicate dependencies increase the compile-times of a project,
so we should aim for having as few duplicate versions of a particular
dependency as possible in our dependency graph.

This PR introduces `cargo deny`, a linter for Rust dependencies. In
addition to linting for duplicate dependencies, it also enforces that
all dependencies are compatible with an allow-list of licenses and it
warns when a dependency is referred to from multiple crates without
introducing a workspace dependency. Thanks to existing tooling
(https://github.com/mainmatter/cargo-autoinherit), transitioning all
dependencies to workspace dependencies was quite easy.

Resolves: #7241.
2024-11-22 00:17:28 +00:00
Thomas Eizinger
c93391e8fd chore(headless-client): setup logging earlier (#7385)
Logging needs to be set up as early as possible to ensure we capture log
messages such as `Starting telemetry`.
2024-11-20 01:30:37 +00:00
Thomas Eizinger
86ada01828 fix(gui-client): initialise sentry-tracing for IPC service (#7363)
It was already a bit sus that we didn't receive as many errors in Sentry
from the IPC service as from the GUI client. Turns out that we forgot to
initialise our `sentry_layer` there. Additionally, we also didn't
initialise the `LogTracer`, meaning we didn't capture logs from the
`log` crate which is used by some of the dependencies, for example
`wintun`.
2024-11-18 22:40:01 +00:00
Thomas Eizinger
24f7ba530d refactor(gui-client): add more context to connection failures (#7364)
Adding more context to these errors makes it easier to identify, which
of the operations fails. In addition, we remove some usages of the "log
and return" anti-pattern to avoid duplicate reports of the same issue.
2024-11-18 18:16:16 +00:00
Thomas Eizinger
2b3469954a chore(headless-client): allow disabling telemetry (#7350)
I've started to set this in my local env to not spam Sentry with events
while I am developing.
2024-11-15 08:14:36 +00:00
Thomas Eizinger
0cb96f5a18 chore(gui-client): publish version 1.3.13 (#7346) 2024-11-15 06:52:38 +00:00
Thomas Eizinger
4fc7e62ba8 chore(headless-client): publish version 1.3.7 (#7348) 2024-11-15 05:39:39 +00:00