With the removal of the NAT64/46 modules, we can now simplify the
internals of our `IpPacket` struct. The requirements for our `IpPacket`
struct are somewhat delicate.
On the one hand, we don't want to be overly restrictive in our parsing /
validation code because there is a lot of broken software out there that
doesn't necessarily follow RFCs. Hence, we want to be as lenient as
possible in what we accept.
On the other hand, we do need to verify certain aspects of the packet,
like the payload lengths. At the moment, we are somewhat too lenient
there which causes errors on the Gateway where we have to NAT or
otherwise manipulate the packets. See #9567 or #9552 for example.
To fix this, we make the parsing in the `IpPacket` constructor more
restrictive. If it is a UDP, TCP or ICMP packet, we attempt to fully
parse its headers and validate the payload lengths.
This parsing allows us to then rely on the integrity of the packet as
part of the implementation. This does create several code paths that can
in theory panic but in practice, should be impossible to hit. To ensure
that this does in fact not happen, we also tackle an issue that is long
overdue: Fuzzing.
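As a rough illustration of the kind of length validation described above, here is a minimal sketch for UDP only. It is not the actual `IpPacket` code: `validate_udp` and `UdpError` are hypothetical names, the function operates on a bare UDP datagram (header plus payload), and requiring the declared length to match exactly is an assumption of this sketch.

```rust
/// Hypothetical error type for this sketch; not part of the real `IpPacket` API.
#[derive(Debug, PartialEq)]
enum UdpError {
    TooShort,
    LengthMismatch { declared: usize, actual: usize },
}

const UDP_HEADER_LEN: usize = 8;

/// Checks that the UDP length field agrees with the bytes we actually hold
/// and returns the payload slice on success.
fn validate_udp(datagram: &[u8]) -> Result<&[u8], UdpError> {
    if datagram.len() < UDP_HEADER_LEN {
        return Err(UdpError::TooShort);
    }

    // Bytes 4..6 hold the length of header + payload in network byte order.
    let declared = u16::from_be_bytes([datagram[4], datagram[5]]) as usize;

    // Assumption of this sketch: the declared length must match exactly.
    if declared < UDP_HEADER_LEN || declared != datagram.len() {
        return Err(UdpError::LengthMismatch {
            declared,
            actual: datagram.len(),
        });
    }

    // After the checks above, this slice operation cannot panic.
    Ok(&datagram[UDP_HEADER_LEN..])
}
```

Once a packet has passed a check like this, later code can index into the payload without re-validating, which is exactly the kind of "cannot panic in practice" path a fuzzer is well suited to exercise.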
Resolves: #6667
Resolves: #9567
Resolves: #9552
The tunnel service creates the Firezone ID upon start-up. With recent
changes to the GUI client, we now require reading the ID file when
starting the GUI client.
This exposes a race condition in our smoke-tests where we start them
both at roughly the same time.
To fix this, we sleep for 500ms after starting the tunnel process.
The name IPC service is not very descriptive. By nature of being
separate processes, we need to use IPC to communicate between them. The
important thing is that the service process has control over the tunnel.
Therefore, we rename everything to "Tunnel service".
The only parts left unchanged are historic changelog entries.
Resolves: #9048
The way the GUI client currently handles the commands and flags provided
via the CLI is somewhat confusing. There are various helper functions
that get called from the same place. We duplicate setup like the `tokio`
runtime in multiple places and the loggers also get initialised all over
the place.
To streamline this, we move global setup like `tokio` and telemetry to
the top-layer. From there, we delegate to a `try_main` function which
handles the various CLI commands. The default path from here is to run
the gui by delegating to the `gui` module. If not, we bail out early.
This structure is significantly easier to understand and provides error
and telemetry handling in a single place.
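The shape of this structure can be sketched with the standard library alone. Everything below is illustrative: the `Command` variants, `CliError`, `parse` and `top_level` are hypothetical names, and the comments mark where the real `tokio` and telemetry setup would go.

```rust
use std::fmt;

/// Hypothetical subset of CLI commands; the real client defines its own.
#[derive(Debug, PartialEq)]
enum Command {
    Gui,
    SmokeTest,
}

#[derive(Debug, PartialEq)]
struct CliError(String);

impl fmt::Display for CliError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// Maps raw arguments to a command; no arguments means "run the GUI".
fn parse(args: &[&str]) -> Result<Command, CliError> {
    match args {
        [] => Ok(Command::Gui),
        ["smoke-test"] => Ok(Command::SmokeTest),
        other => Err(CliError(format!("unknown arguments: {:?}", other))),
    }
}

/// Handles the CLI commands and returns a short status for this sketch.
fn try_main(args: &[&str]) -> Result<&'static str, CliError> {
    match parse(args)? {
        // Non-default commands finish here.
        Command::SmokeTest => Ok("smoke test finished"),
        // Default path: this is where the `gui` module would take over.
        Command::Gui => Ok("gui exited"),
    }
}

/// Stand-in for the body of `fn main`; returns the process exit code.
fn top_level(args: &[&str]) -> i32 {
    // Global setup (async runtime, logging, telemetry) would happen once, here.
    match try_main(args) {
        Ok(_) => 0,
        Err(e) => {
            // The single place where errors are logged and reported.
            eprintln!("error: {}", e);
            1
        }
    }
}
```

Because every command funnels through `try_main`, the `Err` arm in `top_level` is the only spot that needs to know how errors are reported.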
One of Rust's promises is "if it compiles, it works". However, there are
certain situations in which this isn't true. In particular, when using
dynamic typing patterns where trait objects are downcast to concrete
types, having two versions of the same dependency can silently break
things.
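To make the failure mode concrete, here is a small sketch in which two modules stand in for two versions of the same crate. The names `dep_v1`, `dep_v2`, `Event` and `handle` are made up for illustration. The types look identical, but the compiler treats them as unrelated, so the downcast fails at runtime without any compile error.

```rust
use std::any::Any;

// Two modules standing in for two versions of one dependency.
mod dep_v1 {
    #[derive(Debug)]
    pub struct Event(pub u32);
}

mod dep_v2 {
    #[derive(Debug)]
    pub struct Event(pub u32);
}

/// Code compiled against "v2" that tries to recover its own concrete type.
fn handle(value: &dyn Any) -> Option<u32> {
    // Returns `None` for any type other than `dep_v2::Event`,
    // including the structurally identical `dep_v1::Event`.
    value.downcast_ref::<dep_v2::Event>().map(|event| event.0)
}
```

Calling `handle(&dep_v2::Event(7))` yields `Some(7)`, while `handle(&dep_v1::Event(7))` yields `None`: the program compiles, but the value is silently dropped.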
This happened in #7379 where I forgot to patch a certain Sentry
dependency. A similar problem exists with our `tracing-stackdriver`
dependency (see #7241).
Lastly, duplicate dependencies increase the compile-times of a project,
so we should aim for having as few duplicate versions of a particular
dependency as possible in our dependency graph.
This PR introduces `cargo deny`, a linter for Rust dependencies. In
addition to linting for duplicate dependencies, it also enforces that
all dependencies are compatible with an allow-list of licenses and it
warns when a dependency is referred to from multiple crates without
introducing a workspace dependency. Thanks to existing tooling
(https://github.com/mainmatter/cargo-autoinherit), transitioning all
dependencies to workspace dependencies was quite easy.
Resolves: #7241.
Closes #6989
- The tunnel daemon (IPC service) now explicitly sets the ID file's
perms to 0o640, even if the file already exists.
- The GUI error is now non-fatal. If the file can't be read, we just
won't get the device ID in Sentry.
- More specific error message when the GUI fails to read the ID file
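The first two bullet points could be sketched as follows on Unix. This is not the actual client code: `ensure_id_file` and `read_id` are hypothetical helpers, and the error message wording is invented for the example.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Creates the ID file if it is missing and always (re)applies mode 0o640.
fn ensure_id_file(path: &Path, id: &str) -> io::Result<()> {
    if !path.exists() {
        fs::write(path, id)?;
    }

    // Set the mode explicitly so that a pre-existing file with
    // different permissions is corrected as well.
    fs::set_permissions(path, fs::Permissions::from_mode(0o640))
}

/// Non-fatal read: on failure we log and carry on without a device ID.
fn read_id(path: &Path) -> Option<String> {
    match fs::read_to_string(path) {
        Ok(contents) => Some(contents.trim().to_owned()),
        Err(e) => {
            eprintln!(
                "Failed to read device ID from `{}`: {}",
                path.display(),
                e
            );
            None
        }
    }
}
```

Returning `Option<String>` instead of an error keeps the caller's start-up path free of any failure handling for the ID file.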
We attempted to set the tunnel daemon's umask, but this caused the smoke
tests to fail. Fixing the regression is more urgent than getting the
smoke tests to match local debugging.
---------
Co-authored-by: _ <ReactorScram@users.noreply.github.com>
Refs #6138
Sentry is always enabled for now. In the near future we'll make it
opt-out per device and opt-in per org (see #6138 for details).
- Replaces the `crash_handling` module
- Catches panics in GUI process, tunnel daemon, and Headless Client
- Added a couple "breadcrumbs" to play with that feature
- User ID is not set yet
- Environment is set to the API URL, e.g. `wss://api.firezone.dev`
- Reports panics from the connlib async task
- Release should be automatically pulled from the Cargo version which we
automatically set in the version Makefile
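For readers unfamiliar with how panic reporting works in general, the mechanism can be sketched with the standard library's panic hook. This is deliberately not the `sentry` crate's API: `REPORTS` and `install_panic_reporter` are hypothetical stand-ins that collect messages in memory instead of sending them anywhere.

```rust
use std::panic;
use std::sync::Mutex;

/// Stand-in for a reporting backend; a real client would send these upstream.
static REPORTS: Mutex<Vec<String>> = Mutex::new(Vec::new());

/// Installs a process-wide hook that records the message of every panic.
fn install_panic_reporter() {
    panic::set_hook(Box::new(|info| {
        // Ordinary `panic!` calls carry a `&str` or a `String` payload.
        let payload = info.payload();
        let message = payload
            .downcast_ref::<&str>()
            .map(|s| s.to_string())
            .or_else(|| payload.downcast_ref::<String>().cloned())
            .unwrap_or_else(|| "unknown panic".to_owned());

        REPORTS.lock().unwrap().push(message);
    }));
}
```

The hook runs before unwinding starts, so the message is recorded even if the panicking thread is never joined.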
Example screenshot of sentry.io with a caught panic:
<img width="861" alt="image"
src="https://github.com/user-attachments/assets/c5188d86-10d0-4d94-b503-3fba51a21a90">
Synthetic replication for #6791.
The diff for the fix will probably be short, so I wanted this diff for
the test to be reviewed separately.
In your normal terminal: `cargo build -p firezone-gui-client -p
gui-smoke-test`
With sudo / admin powers: `./target/debug/gui-smoke-test.exe
--manual-tests`
Some customers _must_ have hit this, it's so easy to trigger.
I can't add it to the CI smoke test because there's no portal in CI
during the smoke test, unless we use Staging.
When `snownet` was first being developed, these tests ensured that
hole-punching as well as connectivity via a relay work correctly. We
have since added extensive tests that ensure connectivity works in many
scenarios via `tunnel_test`. `tunnel_test` does not (yet) have a
simulated NAT so hole-punching itself is not covered by that.
UDP hole-punching is shockingly trivial though because all you need to
do is send UDP packets to the same socket that the other party is
sending from. This isn't done by our own code but rather by str0m's
implementation of ICE (as long as we add the correct candidates).
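The core idea can be sketched with two plain UDP sockets. This is only an illustration on loopback, where no NAT is involved, and it has nothing to do with str0m's actual ICE implementation; `exchange` is a hypothetical helper.

```rust
use std::io;
use std::net::UdpSocket;
use std::time::Duration;

/// Two peers each send from the socket they also receive on,
/// addressed to the socket the other peer is sending from.
fn exchange() -> io::Result<(Vec<u8>, Vec<u8>)> {
    let a = UdpSocket::bind("127.0.0.1:0")?;
    let b = UdpSocket::bind("127.0.0.1:0")?;

    // Avoid blocking forever if a datagram gets lost.
    a.set_read_timeout(Some(Duration::from_secs(2)))?;
    b.set_read_timeout(Some(Duration::from_secs(2)))?;

    // Behind a real NAT, these outgoing packets are what open the
    // mapping that lets the peer's packets back in.
    a.send_to(b"from a", b.local_addr()?)?;
    b.send_to(b"from b", a.local_addr()?)?;

    let mut buf = [0u8; 64];

    let (len, _) = a.recv_from(&mut buf)?;
    let received_by_a = buf[..len].to_vec();

    let (len, _) = b.recv_from(&mut buf)?;
    let received_by_b = buf[..len].to_vec();

    Ok((received_by_a, received_by_b))
}
```

The important detail is that neither side opens a second socket: sending and receiving share one local address, which is the address the peer targets.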
The `snownet-tests` themselves are quite fragile because they need to
set up their own event loop and manually construct an IP packet. They
haven't caught a single bug to my knowledge so I am proposing to delete
them for ease of maintenance.
For example, in
https://github.com/firezone/firezone/actions/runs/10449965474/job/28948590058?pr=6335
the tests fail because we no longer directly force a handshake when the
connection is established. This is unnecessary now because the buffered
intent packet will directly force a handshake from the client to the
gateway. Yet, the `snownet-tests` event loop would need adjusting to also do
that.