This is needed so that we can auto-update the systemd unit file, either
manually, or with a package manager like `apt`. We don't want users
cut-and-pasting these together on every update, and we don't want
machines doing it. Making the file updatable means we can make security
fixes to it easily.
Upon receiving a SIGTERM, we immediately disconnect from the websocket
connection to the portal and set a flag that we are shutting down.
Once we are disconnected from the portal and no longer have an active
allocations, we exit with 0. A repeated SIGTERM signal will interrupt
this process and force the relay to shutdown.
Disconnecting from the portal will (eventually) trigger a message to
clients and gateways that this relay should no longer be used. Thus,
depending on the timeout our supervisor has configured after sending
SIGTERM, the relay will continue all TURN operations until the number of
allocations drops to 0.
Currently, we also allow clients to make new allocations and refreshing
existing allocations. In the future, it may make sense to implement a
dedicated status code and refuse `ALLOCATE` and `REFRESH` messages
whilst we are shutting down.
Related: #4548.
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
Since we already have apps published, we need the ability to decouple
the versions of components from each other so that we can run CI and
publish them independently.
This is the first step. The next step would be decoupling releases so
that they're for individual components.
refs #4397
Unfortunately I had to keep `linux-client` to get the compatibility
tests to pass. #4578 aims to remove that package.
Please add to this list if you think of anything:
```[tasklist]
# Things that may break that CI/CD won't catch
- [ ] Github release artifacts
- [ ] Knowledge base
- [ ] Docker images
- [ ] Docker containers
- [ ] Existing `linux-client` users
- [ ] Anything that downloads ghcr artifacts
- [ ] Nix (Not sure if it's built in CI. It had a merge conflict)
```
Refs #4515, and #3712, #3782
I think this is what Thomas and I agreed on in Slack / Github
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
The clients, gateway and relay all employ an internal design that is
based on an eventloop. This gives us a lot of control in how various IO
components interact with each other. Great control also comes with a
source of bugs, the latest of which made the relay busy-loop once it
started relaying some traffic.
Eventloops are notoriously hard to unit-test because they compose
various IO bits together. Instead of writing unit tests, we can go and
assert the process state after the performance tests. Those generate a
fair bit of load on all our components but after that, they should
suspend.
The most effective tests survive even large refactorings and for that,
they need to be coded against a stable API / property. Asserting that
the process sleeps when it is idle from an application PoV is such a
property.
Related: #4511.
(After GA)
This adds a unit test for the Unix domain sockets that I intend to use
for process splitting on Linux.
The length-prefixed encoding and decoding are copied from `subzone`, but
most of that code will not be re-used since it's Windows-specific and
also specific to a Chromium-like process model, which won't work for
Firezone.
This is part of a yak shave towards CI testing of #3812
Moving the DNS control method out of `docker-compose.yml` and up to the
integration tests themselves allows us to test these scenarios:
- `systemd-resolved`
- `etc-resolv-conf`
- `systemd-resolved` but we're in a container where that won't work, so
we should gracefully degrade to just allowing IP/CIDR resources
Refs #3713
With this, the deb package for the Linux GUI Client contains a build of
the Linux CLI Client, at `/usr/bin/firezone-client-tunnel`. Future PRs
can add IPC to the code.
There is also a Windows stub, since Windows will eventually need a
tunnel process and a CLI Client.
In the future we might need to move or rename things, since the CLI
Clients and tunnel binaries for both Linux and Windows may all share
code or at least architecture. For now there is a slight duplication
with this being built as both "Firezone Client Tunnnel" and "Firezone
Linux Client"
On some OSes (Debian 12) the script fails to get the correct version to
download (likely because of `sed` version), so this simplifies things a
bit.
---------
Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
- [x] Updated log level string for client and gateways to info or higher
- [x] Update logs to hide DNS information
I also removed `hickory_resolve` errors which could contain sensitive
info from our general error and hide the logs that specifically relates
to them.
@bmanifold double checking that the log levels in the gateway's `*.tf`
files are just used for our own gateways.
Also, the relays still have `debug`, since only we see that I think that
makes sense but double checking with @jamilbk
Fixes: #3618.
---------
Signed-off-by: Gabi <gabrielalejandro7@gmail.com>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
This adds an integration test that downloads a 10MB file from a server
and simulates the client roaming to another network while the download
is active.
We use a DNS resource for this to ensure it also doesn't take too long
in that case. DNS resources are what most users will be using and we
clear some internal DNS caches on connection failures. Hence, using a
DNS resource here is a somewhat roundabout way to test that we aren't
failing and re-establishing the connection but migrate it to a new
network path.
Running as sudo / root causes a lot of problems for GUI programs, so
we're unwinding that. In this case we can go back to using Tauri's "open
URL" function, which is great.
Closes#4103
Refs #3713
Affects #3972 - I was finally able to debug it because it came up
constantly during this PR
Refs #3713
```[tasklist]
### Before merging
- [ ] Is 'firezone-client-tunnel' okay for the binary name?
- [ ] Using a library and building it as two binaries is correct, right? `cargo run -p firezone-client-tunnel` takes 1 second. `cargo run -p firezone-gui-client --bin firezone-client-tunnel` takes 1m42s because it builds all the GUI deps.
```
I thought this was going to use `cargo-deb` but it was actually easy
with the Tauri deb bundling we already use.
```[tasklist]
### Before merging
- [x] Make sure every file in the Tauri deb is also in our deb (e.g. icons)
```
AppImages won't work with process splitting. (#3713)
As far as I can tell, they just produce one binary. Internally they use
FUSE or something to mount a squashfs image, but that image won't be
able to hook into systemd and run with root permissions and everything.
I don't think it's practical, and Tauri's AppImage bundling doesn't have
the features for it.
Even their deb bundler doesn't have any way to specify a path for a
daemon to be installed. The sidecar feature only seems intended for the
GUI app to call, not anything else on the system.
(There is such a thing as installing AppImages, but I don't think it's
worth pursuing - We should just do debs)
I think I used `-v` wrong here. I meant it to negate a regular grep, but
it will probably always return true since it negates the matching, not
the exit code.
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Right now it only works on my dev VM, not on my test VMs, due to #4053
and #4103, but it passes tests and should be safe to merge.
There's one doc fix and one script fix which are unrelated and could be
their own PRs, but they'd be tiny, so I left them in here.
Ref #4106 and #3713 for the plan to fix all this by splitting the tunnel
process off so that the GUI runs as a normal user.
Followup from #4100:
- Add `perf/relay` and `debug/relay` etc data plane images in
`firezone-staging`.
- The `perf` images are `debug` stage images and have tooling installed,
but use release binaries.
- The `debug` images are `debug` binaries inside `debug` images
- `firezone-prod` contains only release binaries -- these image names
haven't changed
Fixes some issues encountered after the merge of #4049
- Fix performance tests to only run using base_ref and head_ref to avoid
dependence on `main`
- Fixes some typos
- Prevents a catch-22 condition where breaking compatibility meant we
wouldn't be able to deploy production
Previously, we had a dedicated timer for this within the tunnel
implementation. Now that we have control over the internals of our
connection via `snownet`, we can timeout the connection if we don't
receive a candidate from the remote within 10s.
Closes#3815
Changes that are breaking (but these aren't in production so it should
be okay)
- Windows, renaming `device_id.json` to `firezone-id.json` to match the
rest of the code
- Linux GUI, storing the firezone-id under `/var/lib` instead of under
`$HOME`
- Linux GUI, bails out if not run with `sudo --preserve-env` by
detecting `$HOME == root` or `$USER != root`
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Refs #3230
It looks like we need to sign the internal exe before it gets bundled
too. We can use `beforeBundleCommand` to do so.
Soon, Tauri should have native support for this exact scenario:
https://github.com/tauri-apps/tauri/pull/8718
Closes#3798
- After the crash test, the Windows smoke test runs `minidump-stackwalk`
to print a stack trace:
https://github.com/firezone/firezone/actions/runs/8100801373/job/22139592883#step:11:770
- This acts as runnable documentation for getting stack traces on
Windows, and it should flunk the test if anything in the crash handling
to stack trace pipeline is broken
- I also updated the comment in the code since the minidump PR I was
waiting on was put into their newest release
Some recent changes to the Rust part of the codebase made it quite
difficult to locally build the project due to tauri's heavy dependencies
on WebKitGTK and other native libraries.
I tried working around this on my local (nix) machine and found it quite
difficult. The cleanest way here is to make use of what Nix calls
"devshells" which give you an environment specifically for hacking on
your project.
Unfortunately, these files need to be tracked in version control and
cannot be ignored (at least I've not found a way to do that). Given that
we already have a lot of clutter in our repository, I put them under
`scripts/nix`.
They are generally useful. I also added a `.envrc` file which
automatically launches the dev-shell. As a result, you have a shell
ready to go with all your dependencies as soon as you `cd` into our
repository (assuming you use `direnv` and it is hooked up with your
shell).
I didn't really want to have any of my local setup leak into the repo
because I think apart from me and @conectado, nobody is using nix, thus
I hope this minimal footprint is an okay compromise.
The CI tests aren't running for Linux just yet.
This organizes the well-known directories used on Linux and Windows for
logs, config, etc., and adds them to the (unused) Linux smoke test
Waiting on #3727
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
I will need to set up the same paths for Linux, (#3734) and I want an
automated test to make sure everything gets into the right directories.
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
~~Highlights the issue hypothesized in #3666~~
This tests that restarting a Relay won't cause sustained downtime.
Sleeps have been removed as they shouldn't necessary -- removing them
will better catch race conditions.
(Waiting on #3721)
Ubuntu is headless by default and needs `xvfb` to run Tauri in CI, hence
the difference.
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
- Lower UDP bandwidth to 50M -- this fixes intermittent file descriptor
issues because we overload iperf3 for more than 5 seconds
- Simplify iperf3 to the minimum set that makes tests reliable