Commit Graph

27 Commits

Author SHA1 Message Date
Thomas Eizinger
d4aafcaf41 fix(gui-client): don't fail on repeated deep-links (#7572)
Firezone's authentication scheme uses deep-links to transfer the secret
token from the browser-based login flow to the application. Such a
deep-link can be opened multiple times, even if we are already signed
in. In that case, and in any other case where we don't have a pending
sign-in request, we currently generate an error.

This is unnecessary as we can simply discard the token received from the
deep-link.
2024-12-23 16:39:46 +00:00
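
The behaviour described in d4aafcaf41 could be sketched roughly as follows. This is a minimal illustration, not the GUI client's actual code: the `Session` enum and `handle_deep_link` function are hypothetical names, and the real controller tracks more state than this.

```rust
/// Hypothetical sign-in state of the GUI client.
#[derive(Debug)]
enum Session {
    SignedOut,
    /// The browser was opened and we are waiting for the deep-link.
    AwaitingDeepLink,
    SignedIn { token: String },
}

/// Handles a deep-link carrying `token`.
///
/// Returns `true` if the token completed a pending sign-in. In every other
/// state the token is simply dropped and `false` is returned; no error is
/// raised.
fn handle_deep_link(session: &mut Session, token: String) -> bool {
    if !matches!(*session, Session::AwaitingDeepLink) {
        // No pending sign-in request: discard the token.
        return false;
    }
    *session = Session::SignedIn { token };
    true
}
```

Opening the same deep-link a second time leaves the session untouched, which is the point of the change.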
Thomas Eizinger
8cecdc6906 fix(gui-client): ignore ConnectResult in wrong state (#7499)
Similar to #7497, when we receive a `ConnectResult`, we can simply
silently bail out of the function and not change our state instead of
printing a loud warning.
2024-12-16 01:02:05 +00:00
Thomas Eizinger
3c2c01c44c chore(gui-client): don't warn when tray menu updates fail (#7510)
Windows appears to randomly fail to update the tray menu. There is
nothing we can do about that. Hence, we downgrade these errors to debug
and make the functions infallible, reducing the complexity for the
caller.
2024-12-13 17:55:00 +00:00
Thomas Eizinger
b5f25da5ac fix(gui-client): remove error about unexpected TunnelReady (#7497)
The communication between the GUI client, the IPC service and `connlib`
is asynchronous. As such, the state machines may run out of sync.
Receiving a `TunnelReady` despite not being in the right state for it
is no concern and can be handled gracefully.
2024-12-13 05:52:34 +00:00
Thomas Eizinger
5d5e5ab0b1 fix(gui-client): make tray menu refresh infallible (#7498)
In most cases, the caller of this function already handled the case of
it failing gracefully by logging. From Sentry alerts, we can see that if
this fails, there isn't much we can do about it and most likely, the
next refresh will work again (this has only happened a single time).

Logging this on `debug` is good enough in case something doesn't work
and we need to reproduce it, or something really bad happens and we need
to see it in the breadcrumbs of another Sentry event.
2024-12-13 04:54:41 +00:00
Thomas Eizinger
81f71cba62 fix(telemetry): use package@version notation for releases (#7466)
In order for Sentry to parse our releases as semver, they need to be in
the form of `package@version` [0]. Without this, the "Mark this issue as
resolved in the _next_ version" feature doesn't work properly because
Sentry orders versions by when it first saw them instead of parsing the
semver string itself. We test versions prior to releasing
them, meaning Sentry learns about a 1.4.0 version before it is actually
released. This causes false-positive "regressions" even though they are
fixed in a later (as per semver) release.

This creates some redundancy with the different DSNs that we are already
using. I think it would make sense to consider merging the two projects
we have for the GUI client for example. That is really just one project
that happens to run as two binaries.

For all other projects, I think the separation still makes sense because
we e.g. may add Sentry to the "host" applications of Android and
MacOS/iOS as well. For those, we would reuse the DSN and thus funnel the
issues into the same Sentry project.

As per Sentry's docs, releases are organisation-wide and therefore need
a package identifier to be grouped correctly.

[0]:
https://docs.sentry.io/platforms/javascript/configuration/releases/#bind-the-version
2024-12-09 05:04:45 +00:00
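
As an illustration of the `package@version` notation from 81f71cba62, a release identifier could be assembled like this. The helper name `sentry_release` and the package name used in the usage note are made up for the example; only the `package@version` shape comes from the commit message.

```rust
/// Builds a release identifier in the `package@version` form, so the
/// version part can be parsed as semver.
fn sentry_release(package: &str, version: &str) -> String {
    format!("{package}@{version}")
}
```

For instance, `sentry_release("gui-client", "1.4.0")` yields `gui-client@1.4.0`.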
Thomas Eizinger
f81f8b2ed7 fix(gui-client): don't share log-directives via file system (#7445)
At present, the GUI client shares the current log-directives with the
IPC service via the file system. Supposedly, this has been done to allow
the IPC service to start back up with the same log filter as before.
This behaviour appears to be buggy though as we are receiving a fair
number of error reports where this file is not writable.

Instead of relying on the file system to communicate, we send the
current log-directives to the IPC service as soon as we start up. The
IPC service then uses the file system to cache that log string and
re-applies it on the next startup. This way, no two programs need to read
/ write the same file. The IPC service runs with higher privileges, so
this should resolve the permission errors we are seeing in Sentry.
2024-12-02 23:28:43 +00:00
Thomas Eizinger
4f92a0d7ca refactor(gui-client): tidy up GUI controller code (#7444)
This PR intends to be a pure refactoring, i.e. no behaviour change. It
simplifies a few aspects of the GUI controller event-loop by getting rid
of the `select!` macro. We also remove some indirection of the
`gui_controller::Builder`.
2024-12-02 20:07:44 +00:00
Thomas Eizinger
19dbff51f5 chore(gui-client): don't warn on sign-out while raising tunnel (#7327)
All warnings triggered events in Sentry. This particular warning is of
no concern: it simply means that the user clicked on "Sign out" while we
were trying to set up the tunnel.

Resolves: #7250.
2024-11-13 00:15:49 +00:00
Thomas Eizinger
0dc078876b refactor(gui-client): capture error sources when connect fails (#7303)
When `connlib` fails to establish a session, the GUI client currently
only captures the top-level error within `connect_to_firezone` because
it uses `.to_string()` for all errors. Unfortunately, that doesn't print
any of the sources of an error.

To conveniently capture all sources, we can use `anyhow` and its
alternate formatting using `format!("{e:#}")` (notice the `#`). Not all
errors within `connect_to_firezone` should be captured like this
however. Certain IO errors, in particular when trying to resolve the
domain of the portal, need to be captured separately because they may
resolve by themselves if we gain connectivity again. This is important,
otherwise we discard the user's token when they boot up a machine without
internet access yet Firezone is auto-starting.

To make this more ergonomic, we trim down `IpcServiceError` to two
variants: The IO variant we need to special-case and everything else.
This allows us to create `From` impls which "do the right thing" by
capturing more error information using `anyhow`'s alternate formatting.
2024-11-11 22:52:14 +00:00
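
To show why `.to_string()` loses information, as 0dc078876b describes, here is a std-only sketch that walks an error's `source()` chain by hand. The commit itself uses `anyhow`'s alternate formatting (`{e:#}`), which joins the chain in a similar `outer: inner` style. `ConnectError` and `format_chain` are hypothetical names introduced for this example.

```rust
use std::error::Error;
use std::fmt;
use std::io;

/// Hypothetical top-level error that wraps an IO error as its source.
#[derive(Debug)]
struct ConnectError {
    source: io::Error,
}

impl fmt::Display for ConnectError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Only the top-level message; the source is deliberately not printed.
        write!(f, "failed to connect to portal")
    }
}

impl Error for ConnectError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Joins an error and all of its sources with `: `.
fn format_chain(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut current = err.source();
    while let Some(cause) = current {
        out.push_str(": ");
        out.push_str(&cause.to_string());
        current = cause.source();
    }
    out
}
```

With an inner IO error of "dns lookup failed", `to_string()` gives only `failed to connect to portal`, while `format_chain` gives `failed to connect to portal: dns lookup failed`.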
Thomas Eizinger
a5e20064dc refactor(gui-client): downgrade temporary error (#7304)
If we only temporarily fail to connect to the portal, we don't need to
report this as a warning.

Resolves: #7251.
2024-11-11 19:51:42 +00:00
Thomas Eizinger
488c599d5b chore(telemetry): capture Firezone ID and account in user ctx (#7310)
Sentry has a feature called the "User context" which allows us to assign
events to individual users. This in turn gives us statistics in
Sentry on how many users are affected by a certain issue.

Unfortunately, Sentry's user context cannot be built-up step-by-step but
has to be set as a whole. To achieve this, we need to slightly refactor
`Telemetry` to not be `clone`d and instead passed around by mutable
reference.

Resolves: #7248.
Related: https://github.com/getsentry/sentry-rust/issues/706.
2024-11-11 19:50:14 +00:00
Thomas Eizinger
e261cb3c27 chore: remove git_version! (#7270)
Reading the Git version requires the entire Git repository to be
present, including all tags. The tags are only created _after_ the
artifact has been built, when we publish the release. Therefore, these
tags are never included in the actual released binary.

For Sentry, we use the `CARGO_PKG_VERSION` variable instead. This
doesn't tell us whether somebody built a client from source and then
used it so there could be some confusion in Sentry events. It is quite
unlikely that this happens though so for the majority of Sentry alerts,
this will give us the correct version.

For the Android client, we also depend on the `GITHUB_SHA` env variable
at compile-time. We do the same thing for the GUI client here.

Resolves: #6925.
2024-11-07 22:56:17 +00:00
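
A rough sketch of reading version information at compile time, in the spirit of e261cb3c27. It uses `option_env!` rather than `env!` so that it also compiles when the variables are absent (for example outside of Cargo or CI); whether the real code does the same is an assumption, and `build_info` is a hypothetical helper.

```rust
/// Crate version as set by Cargo at compile time, if any.
const PKG_VERSION: Option<&str> = option_env!("CARGO_PKG_VERSION");

/// Commit hash as set by GitHub Actions at compile time, if any.
const GIT_SHA: Option<&str> = option_env!("GITHUB_SHA");

/// Describes the build using only compile-time environment variables,
/// without needing the Git repository or its tags.
fn build_info() -> String {
    let version = PKG_VERSION.unwrap_or("unknown");
    match GIT_SHA {
        Some(sha) => format!("{version} ({sha})"),
        None => version.to_string(),
    }
}
```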
Thomas Eizinger
2f3fe751bf chore(gui-client): log entire error when connlib fails (#7273)
The `error_msg` here is already a user-friendly string because we are
also showing it to the user in an error message. These can be entirely
different errors so we should display them as different messages. This
will allow Sentry to group them together correctly.
2024-11-06 19:49:23 +00:00
Thomas Eizinger
78ebad13ab chore(rust): log more errors as tracing::Values (#7208)
Logging these as structured values gives us a better stacktrace in
Sentry (assuming the errors themselves make proper use of defining an
error-chain).
2024-11-05 14:36:47 +00:00
Reactor Scram
51250faa0d chore(telemetry): make the firezone device ID a context not a tag (#7179)
Closes #7175 

Also fixes a bug with the initialization order of Tokio and Sentry.

Previously:
1. Start Tokio, executor threads inherit main thread context
2. Load device ID and set it on the main telemetry hub

Now:
1. Load device ID and set it on the main telemetry hub
2. Start Tokio, executor threads inherit main thread context

The context and possibly tags didn't seem to propagate from the main hub
if we set them after the worker threads spawned.

Based on this understanding, the IPC service process is still wrong, but
a fix will have to wait, because telemetry in the IPC service is more
complicated than in the GUI process.

2024-10-30 21:27:17 +00:00
Thomas Eizinger
0825055ff2 fix(rust/gui-client): allow GUI process to read the firezone-id file from disk (#6987)
Closes #6989

- The tunnel daemon (IPC service) now explicitly sets the ID file's
perms to 0o640, even if the file already exists.
- The GUI error is now non-fatal. If the file can't be read, we just
won't get the device ID in Sentry.
- More specific error message when the GUI fails to read the ID file

We attempted to set the tunnel daemon's umask, but this caused the smoke
tests to fail. Fixing the regression is more urgent than getting the
smoke tests to match local debugging.

---------

Co-authored-by: _ <ReactorScram@users.noreply.github.com>
2024-10-09 20:04:24 +00:00
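
The two behaviours listed in 0825055ff2 (forcing the mode to 0o640 even for an existing file, and treating an unreadable file as non-fatal) might look roughly like this on Unix. The function names are hypothetical and this is not the daemon's actual code.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Daemon side: writes the ID and explicitly sets the mode to 0o640,
/// regardless of what the mode was before or what the umask is.
fn write_id_file(path: &Path, id: &str) -> io::Result<()> {
    fs::write(path, id)?;
    fs::set_permissions(path, fs::Permissions::from_mode(0o640))
}

/// GUI side: a failed read is not fatal, we just go without the ID.
fn read_id_file(path: &Path) -> Option<String> {
    match fs::read_to_string(path) {
        Ok(contents) => Some(contents.trim().to_string()),
        Err(_) => None,
    }
}
```

Setting the permissions after every write, instead of only on creation, is what covers files left behind by an earlier version with stricter permissions.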
Reactor Scram
b3d9cebe53 chore(rust/telemetry): add firezone ID (formerly device ID) to sentry as a tag (#6946)
This makes it easier to ignore random issues from my dev system.

Also added OS tag (`linux` or `windows`) since that doesn't seem to be a
default for Sentry.

```[tasklist]
- [ ] Bikeshed the name `firezone_id` since it'll be hard to change later
```

2024-10-07 20:13:48 +00:00
Thomas Eizinger
be250f1e00 refactor(connlib): repurpose connlib-shared as connlib-model (#6919)
The `connlib-shared` crate has become a bit of a dependency magnet
without a clear purpose. It hosts utilities like `get_user_agent`,
messages for the client and gateway to communicate with the portal and
domain types like `ResourceId`.

To create a better dependency structure in our workspace, we repurpose
`connlib-shared` as a `connlib-model` crate. Its purpose is to host
domain-specific model types that multiple crates may want to use. For
that purpose, we rename the `callbacks::ResourceDescription` type to
`ResourceView`, designating that this is a _view_ onto a resource as
seen by `connlib`. The message types which currently double up as
connlib-internal model thus become an implementation detail of
`firezone-tunnel` and shouldn't be used for anything else.

---------

Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-10-03 14:47:58 +00:00
Reactor Scram
fd9724a3a3 refactor(rust/gui-client): remove borrows from part of the system tray code (#6916)
Extracted from #6838

This leads to extra cloning of strings, but if there are fewer than 1,000
Resources and the tray doesn't update often, it should be fine. We can
sample performance with sentry.io if we're worried.
2024-10-03 14:14:04 +00:00
Reactor Scram
05acdd5a03 fix(gui-client): defer GUI exit until tunnel closes (#6874)
Closes #6873

The issue seems to be a race between flushing Sentry in the GUI process
and shutting down Firezone in the tunnel daemon (IPC service).

With this change, the GUI waits to hear `DisconnectedGracefully` from
the tunnel daemon before flushing Sentry, and the issue is prevented.

Adding the new state and new IPC message required small changes in
several places.

---------

Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-10-01 16:01:43 +00:00
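
The ordering from 05acdd5a03 (wait for `DisconnectedGracefully`, then flush and exit) can be sketched with a blocking std channel. The real client is asynchronous, so this only illustrates the ordering; the `IpcMessage` enum here is a hypothetical two-variant subset and `wait_for_graceful_disconnect` is a made-up helper.

```rust
use std::sync::mpsc::Receiver;

/// Hypothetical subset of the messages the tunnel daemon sends to the GUI.
#[derive(Debug, PartialEq)]
enum IpcMessage {
    TunnelReady,
    DisconnectedGracefully,
}

/// Blocks until the daemon confirms the disconnect, skipping other messages.
///
/// Returns `false` if the channel closed first, i.e. the daemon went away
/// without confirming.
fn wait_for_graceful_disconnect(from_daemon: &Receiver<IpcMessage>) -> bool {
    from_daemon
        .iter()
        .any(|msg| msg == IpcMessage::DisconnectedGracefully)
}
```

The GUI would call this before flushing telemetry and exiting, so the flush no longer races with the tunnel shutdown.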
Reactor Scram
d2a8155ba7 fix(rust/client): set sentry release version and environment correctly (#6855)
Closes #6854 


- Sets release version from the GUI Client / Headless Client version
instead of the `firezone-telemetry` version
- Set environment to "production" and "staging" for well-known API URLs,
and "self-hosted" for others, since environments in Sentry can't have
slashes in them
- Sets API URL as a tag
- Sets release to `unit test` for unit testing `firezone-telemetry`
itself, since it has no good version number

2024-09-30 16:24:39 +00:00
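
The environment mapping from d2a8155ba7 could be expressed as a small lookup. The concrete URLs below are assumptions made for illustration (the commit only speaks of "well-known API URLs"), and `sentry_environment` is a hypothetical name.

```rust
/// Maps an API URL to an environment name that contains no slashes.
fn sentry_environment(api_url: &str) -> &'static str {
    // Ignore a trailing slash so both spellings of a URL map the same way.
    match api_url.trim_end_matches('/') {
        // Assumed production URL.
        "wss://api.firezone.dev" => "production",
        // Assumed staging URL.
        "wss://api.firez.one" => "staging",
        // Anything else is treated as a self-hosted deployment.
        _ => "self-hosted",
    }
}
```

The full API URL can still be attached separately as a tag, as the commit describes.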
Reactor Scram
05a2b28d9f feat(rust/gui-client): add sentry.io error reporting (#6782)
Refs #6138 

Sentry is always enabled for now. In the near future we'll make it
opt-out per device and opt-in per org (see #6138 for details)

- Replaces the `crash_handling` module
- Catches panics in GUI process, tunnel daemon, and Headless Client
- Added a couple "breadcrumbs" to play with that feature
- User ID is not set yet
- Environment is set to the API URL, e.g. `wss://api.firezone.dev`
- Reports panics from the connlib async task
- Release should be automatically pulled from the Cargo version which we
automatically set in the version Makefile

Example screenshot of sentry.io with a caught panic:

<img width="861" alt="image"
src="https://github.com/user-attachments/assets/c5188d86-10d0-4d94-b503-3fba51a21a90">
2024-09-27 16:34:54 +00:00
Reactor Scram
ab66a8fec7 refactor(rust/gui-client): use builder pattern for Controller (#6825)
This makes it easy to add more fields to `Controller` without making
them all public.

This is factored out from https://github.com/firezone/firezone/pull/6782

---------

Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-09-27 14:24:50 +00:00
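
A generic sketch of the builder pattern that ab66a8fec7 refers to. The fields shown are invented for the example and do not reflect the real `Controller`; the point is only that fields stay private while construction remains flexible.

```rust
/// Hypothetical controller with private fields.
pub struct Controller {
    api_url: String,
    log_filter: String,
}

impl Controller {
    pub fn api_url(&self) -> &str {
        &self.api_url
    }

    pub fn log_filter(&self) -> &str {
        &self.log_filter
    }
}

/// Collects the settings step by step, then produces a `Controller`.
pub struct Builder {
    api_url: String,
    log_filter: String,
}

impl Builder {
    /// Starts with the one required setting and a default log filter.
    pub fn new(api_url: impl Into<String>) -> Self {
        Self {
            api_url: api_url.into(),
            log_filter: "info".to_string(),
        }
    }

    /// Overrides the default log filter.
    pub fn log_filter(mut self, filter: impl Into<String>) -> Self {
        self.log_filter = filter.into();
        self
    }

    pub fn build(self) -> Controller {
        Controller {
            api_url: self.api_url,
            log_filter: self.log_filter,
        }
    }
}
```

Adding another field then means one more builder method with a default, without touching existing call sites.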
Thomas Eizinger
a9f515a453 chore(rust): use #[expect] instead of #[allow] (#6692)
The `expect` attribute is similar to `allow` in that it will silence a
particular lint. Unlike `allow`, however, `expect` will fail as soon as
the lint is no longer emitted. This ensures we don't end up with stale
`allow` attributes in our codebase. Additionally, it provides a way of
adding a `reason` to document why the lint is being suppressed.
2024-09-16 13:51:12 +00:00
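
A small example of the attribute discussed in a9f515a453, assuming a toolchain where `#[expect]` is stable (Rust 1.81 or later). The function and its unused parameter are invented for the illustration.

```rust
/// The `verbose` parameter is intentionally unused, so `unused_variables`
/// fires and the expectation is fulfilled. Once `verbose` is actually used,
/// the compiler reports the expectation as unfulfilled, prompting us to
/// remove the attribute.
#[expect(unused_variables, reason = "`verbose` is reserved for a later change")]
fn describe(name: &str, verbose: bool) -> String {
    format!("resource {name}")
}
```

With a plain `#[allow(unused_variables)]`, the attribute would silently stay in place after `verbose` starts being used.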
Gabi
bb2b0197e7 fix(tauri): don't fail on ipc message when no internet resource (#6622)
Fixes: #6620.
2024-09-06 10:45:53 -07:00
Reactor Scram
5eab912f60 refactor(rust/gui-client): begin isolating Tauri from our code (#6593)
This moves about 2/3rds of the code from `firezone-gui-client` to
`firezone-gui-client-common`.

I tested it in aarch64 Windows and cycled through sign-in and sign-out
and closing and re-opening the GUI process while the IPC service stays
running. IPC and updates each get their own MPSC channel in this, so I
wanted to be sure it didn't break.

---------

Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-09-05 17:42:45 +00:00