As a follow-up from #7959, we can now simplify the error handling a fair
bit as all codepaths that can fail in the client are threaded back to
the main function.
This configures the GUI client to log to journald in addition to files.
For better or worse, this logs all events such that structured
information is preserved, e.g. all additional fields next to the message
are also saved as fields in the journal. By default, when viewing the
logs via `journalctl`, those fields are not displayed. This makes the
default output of `journalctl` for the Firezone GUI not as useful as it
could be. Fixing that is left to a later stage.
Related: #8173
On Linux, logs sent to stdout from a systemd-service are automatically
captured by `journald`. This is where most admins expect logs to be and
frankly, doing any kind of debugging of Firezone is much easier if you
can do `journalctl -efu firezone-client-ipc.service` in a terminal and
check what the IPC service is doing.
On Windows, stdout from a service is (unfortunately) ignored.
To achieve this and also allow dynamically changing the log-filter, I
had to introduce a (long-overdue) abstraction over tracing's "reload"
layer that allows us to combine multiple reload-handles into one.
Unfortunately, neither the `reload::Layer` nor the `reload::Handle`
implement `Clone`, which makes this unnecessarily difficult.
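One possible shape for such an abstraction, as a minimal sketch only: it does not use `tracing` itself, and `FilterReloadHandle`, `merge` and `fake_layer` are hypothetical names. Each reload operation is type-erased into a closure behind an `Arc`, which makes the combined handle cloneable. In real code, each closure would presumably wrap one `reload::Handle`.

```rust
use std::sync::{Arc, Mutex};

/// Type-erased reload operation: takes the new directives, reports failure as a string.
type ReloadFn = Arc<dyn Fn(&str) -> Result<(), String> + Send + Sync>;

/// Hypothetical combined handle. `Clone` is cheap because every entry is an `Arc`.
#[derive(Clone, Default)]
pub struct FilterReloadHandle {
    inner: Vec<ReloadFn>,
}

impl FilterReloadHandle {
    pub fn new(reload: impl Fn(&str) -> Result<(), String> + Send + Sync + 'static) -> Self {
        let reload: ReloadFn = Arc::new(reload);
        Self { inner: vec![reload] }
    }

    /// Combines two handles into one that reloads both.
    pub fn merge(mut self, other: Self) -> Self {
        self.inner.extend(other.inner);
        self
    }

    /// Applies the new directives to every wrapped handle, stopping at the first error.
    pub fn reload(&self, directives: &str) -> Result<(), String> {
        for handle in &self.inner {
            handle(directives)?;
        }
        Ok(())
    }
}

/// Stand-in for a real layer: "reloading" just overwrites a shared string.
fn fake_layer(filter: Arc<Mutex<String>>) -> impl Fn(&str) -> Result<(), String> + Send + Sync {
    move |directives| {
        let mut guard = filter.lock().map_err(|e| e.to_string())?;
        *guard = directives.to_owned();
        Ok(())
    }
}

fn main() {
    let stdout_filter = Arc::new(Mutex::new(String::from("info")));
    let file_filter = Arc::new(Mutex::new(String::from("info")));

    let handle = FilterReloadHandle::new(fake_layer(stdout_filter.clone()))
        .merge(FilterReloadHandle::new(fake_layer(file_filter.clone())));

    // A single call updates both "layers".
    match handle.reload("debug") {
        Ok(()) => println!("both filters reloaded"),
        Err(e) => eprintln!("failed to reload: {e}"),
    }
}
```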
Related: #8173
On Linux desktops, we install a dedicated `.desktop` file that is
responsible for handling our deep-links for sign-in. This desktop entry
is not meant to be launched manually and therefore should be hidden from
the application menus.
Every time we start a new session, our telemetry context potentially
changes, i.e. the user may sign into a new account. This should ensure
that both the IPC service and the GUI always use the most up-to-date
`account_slug` as part of Sentry events. In addition, this will also set
the `account_slug` for clients that just signed in. Previously, the
`account_slug` would only get populated on the next start of the client.
Alternative to #8128. If the user dismissed the unlock prompt or has
their keyring otherwise misconfigured, it is still useful to allow them
to sign in. They just won't stay signed in across reboots of the device.
When the IPC service gets terminated gracefully, the user must have
initiated some kind of action, be it an upgrade or an explicit "Stop the
service". In that case, there is no point in displaying an alert with an
info / error message as the user already knows that they are stopping
Firezone. In order to not fatigue the user with alerts, we exit the GUI
with a toast notification when the IPC service shuts down gracefully.
Toast notifications do not grab the user's attention, allowing them to
continue what they are doing while still being notified that their
Firezone client is now disconnected.
Fixes: #6232.
As it turns out, the effort in #7104 was not a good idea. By logging
errors as values, most of our Sentry reports all have the same title and
thus cannot be differentiated from within the overview at all. To fix
this, we stringify errors with all their sources whenever they get
logged. This ensures log messages are unique and all Sentry issues will
have a useful title.
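Stringifying an error together with its sources can be sketched with the standard library alone. The helper name `err_with_src` and the `WithContext` type are assumptions made for illustration; the actual helper may look different.

```rust
use std::error::Error;
use std::fmt;

/// Formats an error followed by all of its sources, separated by `: `.
pub fn err_with_src(e: &(dyn Error + 'static)) -> String {
    let mut msg = e.to_string();
    let mut source = e.source();
    while let Some(s) = source {
        msg.push_str(": ");
        msg.push_str(&s.to_string());
        source = s.source();
    }
    msg
}

/// Illustrative error type that carries a message and an optional source.
#[derive(Debug)]
struct WithContext {
    msg: &'static str,
    source: Option<Box<dyn Error + 'static>>,
}

impl fmt::Display for WithContext {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.msg)
    }
}

impl Error for WithContext {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.source.as_deref()
    }
}

fn main() {
    let inner = WithContext { msg: "permission denied", source: None };
    let outer = WithContext {
        msg: "failed to write log filter",
        source: Some(Box::new(inner)),
    };

    // Prints "failed to write log filter: permission denied"
    println!("{}", err_with_src(&outer));
}
```

Because the full chain ends up in the message, two errors that share an outer context but differ in their root cause produce different log lines.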
At present, the GUI client uses a monolithic `Error` enum that
represents all kinds of errors. Some of them are unused (see #7956).
Others are only used during startup, like the `deep_link` and
`WebViewNotInstalled` variants. This makes it difficult to write correct
error handling code.
In addition to removing certain variants in #7965, this PR refactors the
`run::gui` function to not depend on this `Error` at all. Instead, we
use `anyhow::Result` and probe for particular errors that we want to
special-case. This is a bit less type-safe because there is no source
code-level connection between the source site that emits an error and
the error handling code.
In the worst case, any regression here is "just" a slight degradation in
UX: We will show a generic error dialog instead of a tailored message.
This risk is deemed acceptable in exchange for an easier to understand
control flow.
At present, the GUI client uses a separate task for reading messages
from the IPC connection and forwards them to another channel. The other
end of this channel is then used within the controller to actually react
to IPC messages.
We can simplify this by removing the intermediary task and processing
the messages from the IPC connection directly.
At present, the file logger for all Rust code starts each logfile with
`connlib.`. This is very confusing when exporting the logs from the GUI
client because even the logs from the client itself will start with
`connlib.`. To fix this, we make the base file name of the log file
configurable.
Currently the GUI Client exits if `update-desktop-database` cannot be
executed after deep-links were registered. On non-Ubuntu systems (or
more generally non-Debian) this will fail since the command does not
exist and prevent the GUI Client from starting.
This PR just ignores any command-not-found error, ensuring the command
still has to succeed on Debian/Ubuntu machines.
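A minimal sketch of this behaviour, assuming the command is spawned via `std::process::Command` (the helper name `run_if_installed` is hypothetical): a missing binary surfaces as `io::ErrorKind::NotFound` when spawning, which we treat as success, while a non-zero exit status is still an error.

```rust
use std::io;
use std::process::Command;

/// Runs `program` with `args`. A missing binary is ignored, any other failure is reported.
fn run_if_installed(program: &str, args: &[&str]) -> io::Result<()> {
    match Command::new(program).args(args).status() {
        Ok(status) if status.success() => Ok(()),
        // The command exists but failed: this still has to be an error.
        Ok(status) => Err(io::Error::new(
            io::ErrorKind::Other,
            format!("`{program}` exited with {status}"),
        )),
        // The command does not exist on this system: nothing to do.
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(()),
        Err(e) => Err(e),
    }
}

fn main() {
    // Placeholder command name; the real call site would run `update-desktop-database`.
    match run_if_installed("some-command-that-may-not-exist", &[]) {
        Ok(()) => println!("command succeeded or is not installed"),
        Err(e) => eprintln!("command failed: {e}"),
    }
}
```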
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: oddlama <oddlama@oddlama.org>
Firezone's authentication scheme uses deep-links to transfer the secret
token from the browser-based login flow to the application. Such a
deep-link can be opened multiple times, even if we are already signed
in. In such a case, and in any other where we don't have a pending
sign-in request, we currently generate an error.
This is unnecessary as we can simply discard the token received from the
deep-link.
Similar to #7497, when we receive a `ConnectResult`, we can simply
silently bail out of the function and not change our state instead of
printing a loud warning.
Windows appears to randomly fail to update the tray menu. There is
nothing we can do about that. Hence, we downgrade these errors to debug
and make the functions infallible, reducing the complexity for the
caller.
The communication between the GUI client, the IPC service and `connlib`
is asynchronous. As such, it may happen that the state machines run out
of sync. Receiving a `TunnelReady` despite not being in the right state
for that is no concern and can be handled gracefully.
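Handling an out-of-sync message gracefully could look like the sketch below. The `Status` enum and `Controller` are simplified, hypothetical stand-ins for the real state machine; only the `TunnelReady` name comes from the text above.

```rust
/// Simplified, hypothetical connection state of the GUI controller.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Disconnected,
    WaitingForTunnel,
    TunnelReady,
}

struct Controller {
    status: Status,
}

impl Controller {
    /// Handles a `TunnelReady` message. Returns whether the state changed.
    fn on_tunnel_ready(&mut self) -> bool {
        match self.status {
            Status::WaitingForTunnel => {
                self.status = Status::TunnelReady;
                true
            }
            // Out of sync with the IPC service: ignore the message instead of failing.
            Status::Disconnected | Status::TunnelReady => {
                eprintln!("DEBUG ignoring `TunnelReady` in state {:?}", self.status);
                false
            }
        }
    }
}

fn main() {
    let mut controller = Controller { status: Status::Disconnected };

    // The user signed out while the tunnel was still being set up.
    let changed = controller.on_tunnel_ready();
    println!("state changed: {changed}, status: {:?}", controller.status);
}
```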
In most cases, the caller of this function already handles failure
gracefully by logging it. From Sentry alerts, we can see that if
this fails, there isn't much we can do about it and most likely, the
next refresh will work again (this has only happened a single time).
Logging this on `debug` is good enough: if something doesn't work and we
need to reproduce it, or if something really bad happens, we will see it
in the breadcrumbs of another Sentry event.
In order for Sentry to parse our releases as semver, they need to be in
the form of `package@version` [0]. Without this, the feature of "Mark
this issue as resolved in the _next_ version" doesn't work properly
because Sentry orders the versions by when it first saw them instead of
parsing the semver string itself. We test versions prior to releasing
them, meaning Sentry learns about a 1.4.0 version before it is actually
released. This causes false-positive "regressions" even though they are
fixed in a later (as per semver) release.
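To illustrate the difference between "first seen" order and semver order, here is a small std-only sketch. The package name `gui-client` and both helper functions are assumptions for the example; the parser only handles plain `major.minor.patch` versions.

```rust
/// Builds a release identifier of the form `package@version`.
fn sentry_release(package: &str, version: &str) -> String {
    format!("{package}@{version}")
}

/// Parses `major.minor.patch` into a tuple that sorts like semver.
/// Pre-release and build metadata are deliberately not handled in this sketch.
fn parse_version(version: &str) -> Option<(u64, u64, u64)> {
    let mut parts = version.split('.').map(|p| p.parse::<u64>().ok());
    let triple = (parts.next()??, parts.next()??, parts.next()??);

    // Reject trailing components such as `1.4.0.1`.
    parts.next().is_none().then_some(triple)
}

fn main() {
    // Order in which the versions were first reported: 1.4.0 was tested before 1.3.5 shipped.
    let first_seen = ["1.4.0", "1.3.5"];

    let mut by_semver = first_seen.to_vec();
    by_semver.sort_by_key(|v| parse_version(v));

    println!("first seen: {first_seen:?}, by semver: {by_semver:?}");
    println!("release: {}", sentry_release("gui-client", "1.4.0"));
}
```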
This creates some redundancy with the different DSNs that we are already
using. I think it would make sense to consider merging the two projects
we have for the GUI client for example. That is really just one project
that happens to run as two binaries.
For all other projects, I think the separation still makes sense because
we e.g. may add Sentry to the "host" applications of Android and
MacOS/iOS as well. For those, we would reuse the DSN and thus funnel the
issues into the same Sentry project.
As per Sentry's docs, releases are organisation-wide and therefore need
a package identifier to be grouped correctly.
[0]: https://docs.sentry.io/platforms/javascript/configuration/releases/#bind-the-version
In order to release the new control protocol to users, we need to bump
the versions of the clients to 1.4.0. The portal has a version gate to
only select gateways with version >= 1.4.0 for clients >= 1.4.0. Thus,
bumping these versions can only happen once testing has completed and
the gateway has actually been released as 1.4.0.
Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
At present, the GUI client shares the current log-directives with the
IPC service via the file system. Supposedly, this has been done to allow
the IPC service to start back up with the same log filter as before.
This behaviour appears to be buggy though as we are receiving a fair
number of error reports where this file is not writable.
Instead of relying on the file system to communicate, we send the
current log-directives to the IPC service as soon as we start up. The
IPC service then uses the file system to cache that log string and
re-applies it on the next startup. This way, no two programs need to read
/ write the same file. The IPC service runs with higher privileges, so
this should resolve the permission errors we are seeing in Sentry.
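The caching side of this can be sketched as two small functions owned by a single process. The function names and the cache location are hypothetical; the point is that only the IPC service ever touches the file, and a missing file simply means "no cached filter yet".

```rust
use std::path::Path;
use std::{fs, io};

/// Persists the log directives received from the GUI so they survive a restart.
fn save_log_filter(path: &Path, directives: &str) -> io::Result<()> {
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    fs::write(path, directives)
}

/// Loads the cached directives. A missing file is not an error.
fn load_log_filter(path: &Path) -> io::Result<Option<String>> {
    match fs::read_to_string(path) {
        Ok(contents) => Ok(Some(contents.trim().to_owned())),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}

fn main() -> io::Result<()> {
    // Hypothetical cache location, chosen only for this example.
    let dir = std::env::temp_dir().join(format!("log-filter-example-{}", std::process::id()));
    let path = dir.join("log-filter");

    // On startup: fall back to a default if nothing has been cached yet.
    let filter = load_log_filter(&path)?.unwrap_or_else(|| "info".to_owned());
    println!("starting with filter `{filter}`");

    // Later: the GUI sends new directives over IPC and we cache them.
    save_log_filter(&path, "debug")?;
    println!("cached: {:?}", load_log_filter(&path)?);

    fs::remove_dir_all(&dir)
}
```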
This PR intends to be a pure refactoring, i.e. no behaviour change. It
simplifies a few aspects of the GUI controller event-loop by getting rid
of the `select!` macro. We also remove some indirection of the
`gui_controller::Builder`.
Rust 1.83 comes with a bunch of new lints for elidable lifetimes. Those
also trigger in the generated code of `derivative`. That crate is
actually unmaintained so we replace our usages of it with `derive_more`.
One of Rust's promises is "if it compiles, it works". However, there are
certain situations in which this isn't true. In particular, when using
dynamic typing patterns where trait objects are downcast to concrete
types, having two versions of the same dependency can silently break
things.
This happened in #7379 where I forgot to patch a certain Sentry
dependency. A similar problem exists with our `tracing-stackdriver`
dependency (see #7241).
Lastly, duplicate dependencies increase the compile-times of a project,
so we should aim for having as few duplicate versions of a particular
dependency as possible in our dependency graph.
This PR introduces `cargo deny`, a linter for Rust dependencies. In
addition to linting for duplicate dependencies, it also enforces that
all dependencies are compatible with an allow-list of licenses and it
warns when a dependency is referred to from multiple crates without
introducing a workspace dependency. Thanks to existing tooling
(https://github.com/mainmatter/cargo-autoinherit), transitioning all
dependencies to workspace dependencies was quite easy.
Resolves: #7241.
It was already a bit sus that we didn't receive as many errors in Sentry
from the IPC service as from the GUI client. Turns out that we forgot to
initialise our `sentry_layer` there. Additionally, we also didn't
initialise the `LogTracer`, meaning we didn't capture logs from the
`log` crate which is used by some of the dependencies, for example
`wintun`.
Currently, some errors are double-logged when we show them to the user
because of the `tracing::error!` statements within the generation of the
user-friendly error message for the error dialog.
To get rid of these, we generalise the `show_error_dialog` function to
take just the message and move the generation of the message to a
function on the `Error` itself. This also allows us to split out a
separate error type that is only used for the elevation check, thereby
reducing the complexity of the other error enum.
I think I finally understood and correctly traced where the use of ANSI
escape codes came from. It turns out, the `with_ansi` switch on
`tracing_subscriber::fmt::Layer` is what you want to toggle. From there,
it trickles down to the `Writer` which we can then test for in our
`Format`.
Resolves: #7284.
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Using the clippy lint `unwrap_used`, we can automatically lint against
all uses of `.unwrap()` on `Result` and `Option`. This turns up quite a
few results actually. In most cases, they are invariants that can't
actually be hit. For these, we change them to `Option`. In other cases,
they can actually be hit. For example, if the user supplies an invalid
log-filter.
Activating this lint ensures the compiler will yell at us every time we
use `.unwrap` to double-check whether we do indeed want to panic here.
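As an illustration of the "can actually be hit" case, the sketch below enables the lint on a module and returns a `Result` for user-supplied input instead of unwrapping. The `filter` module, `Level` enum and `parse_level` function are invented for the example and much simpler than a real log-filter parser.

```rust
// With this attribute, `cargo clippy` rejects any `.unwrap()` inside the module.
#[deny(clippy::unwrap_used)]
mod filter {
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum Level {
        Debug,
        Info,
        Warn,
    }

    /// Parses a user-supplied log level. Invalid input is reported, not panicked on.
    pub fn parse_level(input: &str) -> Result<Level, String> {
        match input.trim().to_ascii_lowercase().as_str() {
            "debug" => Ok(Level::Debug),
            "info" => Ok(Level::Info),
            "warn" => Ok(Level::Warn),
            other => Err(format!("invalid log level: `{other}`")),
        }
    }
}

fn main() {
    // User input can be anything, so the caller decides how to recover.
    let level = filter::parse_level("verbose").unwrap_or_else(|e| {
        eprintln!("{e}, falling back to `info`");
        filter::Level::Info
    });

    println!("using level {level:?}");
}
```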
Resolves: #7292.
All warnings triggered events in Sentry. This particular warning is of
no concern, it simply means that the user clicked on "Sign out" while we
were trying to set up the tunnel.
Resolves: #7250.
Windows has some funny behaviour where creating the deep-link server
sometimes fails and we have to try again. Currently, each of these
operations is logged as a warning when it would actually succeed later.
These create unnecessary Sentry alerts.
If we run out of attempts to create the deep-link server (currently 10),
the entire function fails which will be logged as an error further down.
The last 500 INFO and DEBUG logs will be captured as breadcrumbs
together with the event, meaning we still get to see those error
messages on why it failed to create the deep-link server.
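The retry behaviour described above can be sketched generically. The `retry` helper is hypothetical, and `eprintln!` stands in for a `debug`-level log statement: intermediate failures are only logged quietly, and only the final failure is returned to the caller, which can then log it as an error.

```rust
use std::fmt::Display;

/// Calls `op` up to `max_attempts` times (at least once) and returns the first success.
/// Intermediate failures are logged quietly; the last failure is returned to the caller.
fn retry<T, E: Display>(
    max_attempts: u32,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            Err(e) if attempt < max_attempts => {
                // Stand-in for a debug-level log that does not trigger an alert.
                eprintln!("DEBUG attempt {attempt}/{max_attempts} failed: {e}");
                attempt += 1;
            }
            Err(e) => return Err(e),
        }
    }
}

fn main() {
    // Simulated operation that fails twice before succeeding.
    let result: Result<&str, String> = retry(10, |attempt| {
        if attempt < 3 {
            Err(format!("pipe busy (attempt {attempt})"))
        } else {
            Ok("deep-link server created")
        }
    });

    match result {
        Ok(msg) => println!("{msg}"),
        Err(e) => eprintln!("ERROR giving up: {e}"),
    }
}
```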
Resolves: #7238.