Commit Graph

228 Commits

Author SHA1 Message Date
Reactor Scram
2f6f2ef260 test(linux-client): check if we can add the user to a group in a CI test (#4600)
Refs #4513

The next step after this is to use this to test security in the Linux
IPC code, it should reject any IPC commands from users not in the
`firezone` group.
2024-04-17 20:40:27 +00:00
Reactor Scram
1f2821415f chore(linux): ask systemd to limit our privileges (#4630)
Should drop our `systemd-analyze security` level from 9.7 to about 2.5.
We could go a little further, but it would take a lot more effort, and
this is a good starting point.

```[tasklist]
# Before review
- [x] Remove unused trap function in Bash
- [x] Remove `systemd-analyze` call
```
2024-04-17 16:11:29 +00:00
Reactor Scram
cdf2bc8838 refactor(test): use 'set -euox' instead of manual echos (#4637)
I wasn't aware of `set x` when I wrote this, and it looks good in the
other test scripts.

I'm not sourcing `lib.sh` yet, because I don't happen to need any
functions from it. I have other draft PRs that will probably end up
using it.
2024-04-16 17:36:43 +00:00
Jamil
05386b8b4b chore(ci): Use netstat instead of ss for release image tests (#4640)
Fixes #4636
2024-04-16 11:14:52 -06:00
Reactor Scram
7bc1d51b0f test(linux-client): separate the token from the systemd unit file (#4626)
This is needed so that we can auto-update the systemd unit file, either
manually, or with a package manager like `apt`. We don't want users
cut-and-pasting these together on every update, and we don't want
machines doing it. Making the file updatable means we can make security
fixes to it easily.
2024-04-15 20:38:49 +00:00
Thomas Eizinger
be1a719e2c chore(relay): perform graceful shutdown upon receiving SIGTERM (#4552)
Upon receiving a SIGTERM, we immediately disconnect from the websocket
connection to the portal and set a flag that we are shutting down.

Once we are disconnected from the portal and no longer have an active
allocations, we exit with 0. A repeated SIGTERM signal will interrupt
this process and force the relay to shutdown.

Disconnecting from the portal will (eventually) trigger a message to
clients and gateways that this relay should no longer be used. Thus,
depending on the timeout our supervisor has configured after sending
SIGTERM, the relay will continue all TURN operations until the number of
allocations drops to 0.

Currently, we also allow clients to make new allocations and refreshing
existing allocations. In the future, it may make sense to implement a
dedicated status code and refuse `ALLOCATE` and `REFRESH` messages
whilst we are shutting down.

Related: #4548.

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2024-04-12 08:45:08 +00:00
Thomas Eizinger
26494b0e34 ci: reduce duplication in integration tests (#4583)
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2024-04-11 23:01:12 +00:00
Jamil
6720ab5bc1 chore(clients): Bump Apple to 1.0.2; Android 1.0.1 (#4590)
CI won't pass for these builds without these bumps because the versions
are already published.
2024-04-11 22:34:17 +00:00
Jamil
539431d9a3 chore(ci): Allow versioning components separately (#4493)
Since we already have apps published, we need the ability to decouple
the versions of components from each other so that we can run CI and
publish them independently.

This is the first step. The next step would be decoupling releases so
that they're for individual components.

refs #4397
2024-04-11 13:38:03 +00:00
Reactor Scram
3a67eacfbe refactor(linux-client): replace client-tunnel with headless-client which is the same thing (#4516)
Unfortunately I had to keep `linux-client` to get the compatibility
tests to pass. #4578 aims to remove that package.

Please add to this list if you think of anything:

```[tasklist]
# Things that may break that CI/CD won't catch
- [ ] Github release artifacts
- [ ] Knowledge base 
- [ ] Docker images
- [ ] Docker containers
- [ ] Existing `linux-client` users
- [ ] Anything that downloads ghcr artifacts
- [ ] Nix (Not sure if it's built in CI. It had a merge conflict)
```

Refs #4515, and #3712, #3782

I think this is what Thomas and I agreed on in Slack / Github

---------

Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-04-10 22:01:55 +00:00
Jamil
7d88e28872 chore(ci): Configure relay with new IP on restart tests (#4571)
See https://firezonehq.slack.com/archives/C0575SD66E5/p1712726575563089
2024-04-10 08:45:38 +00:00
Thomas Eizinger
8d49452668 ci: assert that nothing busy loops after the perf tests (#4546)
The clients, gateway and relay all employ an internal design that is
based on an eventloop. This gives us a lot of control in how various IO
components interact with each other. Great control also comes with a
source of bugs, the latest of which made the relay busy-loop once it
started relaying some traffic.

Eventloops are notoriously hard to unit-test because they compose
various IO bits together. Instead of writing unit tests, we can go and
assert the process state after the performance tests. Those generate a
fair bit of load on all our components but after that, they should
suspend.

The most effective tests survive even large refactorings and for that,
they need to be coded against a stable API / property. Asserting that
the process sleeps when it is idle from an application PoV is such a
property.

Related: #4511.
2024-04-09 07:09:50 +00:00
Thomas Eizinger
3951bafb60 chore(nix): add Rust nightly dev-shell and cargo-udeps (#4474) 2024-04-08 12:06:01 +00:00
Jamil
09532ea845 chore(ci): Add portal and relay downtime DNS resource tests (#4517)
Tests that DNS still works in the client with established connections
after the portal and/or relay go down.
2024-04-08 09:43:59 +00:00
Reactor Scram
74a81b2a56 test(gui-client): unit test for Linux IPC (#4277)
(After GA)

This adds a unit test for the Unix domain sockets that I intend to use
for process splitting on Linux.

The length-prefixed encoding and decoding are copied from `subzone`, but
most of that code will not be re-used since it's Windows-specific and
also specific to a Chromium-like process model, which won't work for
Firezone.
2024-04-02 19:34:24 +00:00
Reactor Scram
1e4ed7bad6 refactor(ci): move DNS control method up to docker-compose.yml (#4341)
This is part of a yak shave towards CI testing of #3812 

Moving the DNS control method out of `docker-compose.yml` and up to the
integration tests themselves allows us to test these scenarios:

- `systemd-resolved`
- `etc-resolv-conf`
- `systemd-resolved` but we're in a container where that won't work, so
we should gracefully degrade to just allowing IP/CIDR resources
2024-04-02 17:11:29 +00:00
Reactor Scram
023c885967 refactor(linux-client): extract all code to firezone-client-tunnel (#4448)
Refs #3713 

With this, the deb package for the Linux GUI Client contains a build of
the Linux CLI Client, at `/usr/bin/firezone-client-tunnel`. Future PRs
can add IPC to the code.

There is also a Windows stub, since Windows will eventually need a
tunnel process and a CLI Client.

In the future we might need to move or rename things, since the CLI
Clients and tunnel binaries for both Linux and Windows may all share
code or at least architecture. For now there is a slight duplication
with this being built as both "Firezone Client Tunnnel" and "Firezone
Linux Client"
2024-04-02 16:59:29 +00:00
Jamil
7c369e5b39 fix(gateway): Fix systemd gateway install script (#4407)
On some OSes (Debian 12) the script fails to get the correct version to
download (likely because of `sed` version), so this simplifies things a
bit.

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
2024-03-31 15:56:24 +00:00
Jamil
c30138b38e chore(connlib): Remove atomicwrites and tokio::fs from apple compile path (#4395)
Fixes #4377 


Manually verified by running `nm` on the resulting binaries. I'll open
another PR to handle #4393

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-03-29 21:01:53 +00:00
Jamil
16337d57f3 refactor(connlib): Reduce log noisiness for GA (#4381)
Fixes #4380 
Fixes #4379
2024-03-28 20:51:59 +00:00
Gabi
24e0641871 chore: set rust log level to info for gateways and client (#4319)
- [x] Updated log level string for client and gateways to info or higher
- [x] Update logs to hide DNS information

I also removed `hickory_resolve` errors which could contain sensitive
info from our general error and hide the logs that specifically relates
to them.

@bmanifold double checking that the log levels in the gateway's `*.tf`
files are just used for our own gateways.

Also, the relays still have `debug`, since only we see that I think that
makes sense but double checking with @jamilbk

Fixes: #3618.

---------

Signed-off-by: Gabi <gabrielalejandro7@gmail.com>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-03-27 01:39:12 +00:00
Thomas Eizinger
18033eafec ci: ensure roaming between networks doesn't abort file download (#4213)
This adds an integration test that downloads a 10MB file from a server
and simulates the client roaming to another network while the download
is active.

We use a DNS resource for this to ensure it also doesn't take too long
in that case. DNS resources are what most users will be using and we
clear some internal DNS caches on connection failures. Hence, using a
DNS resource here is a somewhat roundabout way to test that we aren't
failing and re-establishing the connection but migrate it to a new
network path.
2024-03-26 05:44:59 +00:00
Reactor Scram
64f0427ef4 ci(gui-client): hide the Linux GUI deb since it's not ready yet (#4258)
It's still in the CI artifacts for easy testing, but there's no point
letting users see it since it's in the middle of the process split
re-architect
2024-03-21 23:49:34 +00:00
Reactor Scram
7fece80006 refactor(gui-client): refuse to ever be elevated on Linux (#4232)
Running as sudo / root causes a lot of problems for GUI programs, so
we're unwinding that. In this case we can go back to using Tauri's "open
URL" function, which is great.

Closes #4103
Refs #3713
Affects #3972 - I was finally able to debug it because it came up
constantly during this PR
2024-03-21 14:42:48 +00:00
Reactor Scram
b0904e382a chore: add crate for privileged Linux tunnel process (#4229)
Refs #3713 

```[tasklist]
### Before merging
- [ ] Is 'firezone-client-tunnel' okay for the binary name?
- [ ] Using a library and building it as two binaries is correct, right? `cargo run -p firezone-client-tunnel` takes 1 second. `cargo run -p firezone-gui-client --bin firezone-client-tunnel` takes 1m42s because it builds all the GUI deps.
```
2024-03-21 14:06:55 +00:00
Reactor Scram
ada9d896cf chore(gui-client): remove unused env var (#4234)
Must have been a hack to run the smoke test in CI, and was never
actually hooked up.
2024-03-20 22:20:29 +00:00
Reactor Scram
e05cbbe0a0 build(gui-client/linux): include an empty firezone-tunnel binary with the Tauri deb package (#4220)
I thought this was going to use `cargo-deb` but it was actually easy
with the Tauri deb bundling we already use.

```[tasklist]
### Before merging
- [x] Make sure every file in the Tauri deb is also in our deb (e.g. icons)
```
2024-03-20 14:11:41 +00:00
Reactor Scram
651ea3ae00 build(gui-client/linux): make sure debug symbols get uploaded for the Linux GUI client (#4217)
- Split up CI artifacts into "exe", "pkg", and "syms" so it's easy to
check they're being uploaded. This shouldn't affect published artifacts
- Set `strip = "none"` which seems to be necessary to get the debug
symbols in Linux, although they still end up in the exe and not the dwp
file 🤔 don't know why
- Test Linux stacktrace in CI

Stacktrace examples:
- On Linux we at least get function names, but we aren't getting line
numbers for some reason
https://github.com/firezone/firezone/actions/runs/8350493514/job/22857032124#step:10:268
- On Windows we also get line numbers, as before
https://github.com/firezone/firezone/actions/runs/8350493514/job/22857033367#step:11:351

I didn't test downloading the files and doing a stacktrace locally, but
I have batched that up for whenever I do a big manual test of the
CD-produced release artifacts:
https://github.com/firezone/firezone/issues/3887
2024-03-19 22:18:03 +00:00
Reactor Scram
74026d8b13 build(gui-client): disable AppImage bundling (#4216)
AppImages won't work with process splitting. (#3713)

As far as I can tell, they just produce one binary. Internally they use
FUSE or something to mount a squashfs image, but that image won't be
able to hook into systemd and run with root permissions and everything.
I don't think it's practical, and Tauri's AppImage bundling doesn't have
the features for it.

Even their deb bundler doesn't have any way to specify a path for a
daemon to be installed. The sidecar feature only seems intended for the
GUI app to call, not anything else on the system.

(There is such a thing as installing AppImages, but I don't think it's
worth pursuing - We should just do debs)
2024-03-19 17:26:25 +00:00
Reactor Scram
3ced2b3a20 ci(linux): fix incorrect use of grep -v (#4151)
I think I used `-v` wrong here. I meant it to negate a regular grep, but
it will probably always return true since it negates the matching, not
the exit code.

---------

Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-03-15 19:06:57 +00:00
Thomas Eizinger
62e082d47a refactor(connlib): make {Client,Gateway}State SANS-IO (#4096)
Resolves: #3929.
2024-03-14 23:44:36 +00:00
Reactor Scram
52cde610e1 feat(linux): make deep link auth work (#4102)
Right now it only works on my dev VM, not on my test VMs, due to #4053
and #4103, but it passes tests and should be safe to merge.

There's one doc fix and one script fix which are unrelated and could be
their own PRs, but they'd be tiny, so I left them in here.

Ref #4106 and #3713 for the plan to fix all this by splitting the tunnel
process off so that the GUI runs as a normal user.
2024-03-13 18:11:04 +00:00
Jamil
574585d146 chore(ci): Add debug/ and perf/ prefix to some images (#4104)
Followup from #4100:


- Add `perf/relay` and `debug/relay` etc data plane images in
`firezone-staging`.
- The `perf` images are `debug` stage images and have tooling installed,
but use release binaries.
- The `debug` images are `debug` binaries inside `debug` images
- `firezone-prod` contains only release binaries -- these image names
haven't changed
2024-03-12 20:27:32 +00:00
Jamil
391150f0e1 chore(ci): Fix new issues in cd.yml (#4085)
Fixes some issues encountered after the merge of #4049 

- Fix performance tests to only run using base_ref and head_ref to avoid
dependence on `main`
- Fixes some typos
- Prevents a catch-22 condition where breaking compatibility meant we
wouldn't be able to deploy production
2024-03-12 02:06:19 +00:00
Thomas Eizinger
ea53ae7a55 feat(snownet): timeout connections if we don't receive a candidate within 10s (#3790)
Previously, we had a dedicated timer for this within the tunnel
implementation. Now that we have control over the internals of our
connection via `snownet`, we can timeout the connection if we don't
receive a candidate from the remote within 10s.
2024-03-09 08:03:57 +00:00
Reactor Scram
7211e88338 feat(linux-client): generate firezone-id (device ID) automatically if it's not provided at launch (#3920)
Closes #3815 

Changes that are breaking (but these aren't in production so it should
be okay)

- Windows, renaming `device_id.json` to `firezone-id.json` to match the
rest of the code
- Linux GUI, storing the firezone-id under `/var/lib` instead of under
`$HOME`
- Linux GUI, bails out if not run with `sudo --preserve-env` by
detecting `$HOME == root` or `$USER != root`

---------

Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-03-08 16:13:59 +00:00
Jamil
f358f824a1 chore(devops): Make some vars optional in systemd install script (#4017) 2024-03-06 17:18:25 -08:00
Jamil
92261be9e0 chore(devops): Use separate script to install systemd gateway (#4016)
This prevents us from backslack escape hell when trying to expose this
script in different contexts.

Needed as a pre-req to #4011
2024-03-06 17:04:22 -08:00
Jamil
9cab250696 chore(windows): Sign internal exe using beforeBundleCommand (#3994)
Refs #3230 

It looks like we need to sign the internal exe before it gets bundled
too. We can use `beforeBundleCommand` to do so.

Soon, Tauri should have native support for this exact scenario:
https://github.com/tauri-apps/tauri/pull/8718
2024-03-06 16:00:54 +00:00
Jamil
19e833262f chore(windows): Sign windows exe too (#3992)
Fixes #3230
2024-03-05 22:35:24 -08:00
Reactor Scram
f11edae097 test(windows): run the smoke test twice to shake out issues in the script (#3889)
I may use this for #3867 , or I may do that in-process, either way this
could be handy.
2024-03-04 23:15:58 +00:00
Reactor Scram
789d2160de refactor(linux): make a place for reverting /etc/resolv.conf (#3822)
Makes progress towards #3817

---------

Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-03-01 19:09:01 +00:00
Reactor Scram
c5aab6ebba test(windows client): Add stack trace printing to smoke test (#3813)
Closes #3798 

- After the crash test, the Windows smoke test runs `minidump-stackwalk`
to print a stack trace:
https://github.com/firezone/firezone/actions/runs/8100801373/job/22139592883#step:11:770
- This acts as runnable documentation for getting stack traces on
Windows, and it should flunk the test if anything in the crash handling
to stack trace pipeline is broken
- I also updated the comment in the code since the minidump PR I was
waiting on was put into their newest release
2024-02-29 20:41:35 +00:00
Reactor Scram
ca95dcdf77 feat(gui-client): make all modules Linux-friendly (#3737)
Splits up platform-specific modules into `linux.rs` and `windows.rs`
submodules, or `imp` modules within the parent module, so that we can
compile all the platform-independent code (e.g. Tauri calls,
`run_controller`) for Linux.

This will show the Tauri GUI on Linux:


![image](https://github.com/firezone/firezone/assets/13400041/320a4b91-1a37-48dd-94b9-cc280cc09854)

---------

Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-02-28 22:09:56 +00:00
Thomas Eizinger
8d652cb96c chore: add nix scripts (#3771)
Some recent changes to the Rust part of the codebase made it quite
difficult to locally build the project due to tauri's heavy dependencies
on WebKitGTK and other native libraries.

I tried working around this on my local (nix) machine and found it quite
difficult. The cleanest way here is to make use of what Nix calls
"devshells" which give you an environment specifically for hacking on
your project.

Unfortunately, these files need to be tracked in version control and
cannot be ignored (at least I've not found a way to do that). Given that
we already have a lot of clutter in our repository, I put them under
`scripts/nix`.

They are generally useful. I also added a `.envrc` file which
automatically launches the dev-shell. As a result, you have a shell
ready to go with all your dependencies as soon as you `cd` into our
repository (assuming you use `direnv` and it is hooked up with your
shell).

I didn't really want to have any of my local setup leak into the repo
because I think apart from me and @conectado, nobody is using nix, thus
I hope this minimal footprint is an okay compromise.
2024-02-27 23:56:46 +00:00
Jamil
2ed6b3d07f chore(connlib): Tune log filters to enable debug in dev and info for gateway deployments (#3788)
Refs #3618

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-02-27 23:35:08 +00:00
Jamil
6a896af638 chore(repo): Move other dotfiles to reduce directory size of root (#3780)
Brings README content further up for our repo visitors.
2024-02-27 17:23:17 +00:00
Thomas Eizinger
67aeb009e9 chore: move markdown files into docs/ directory (#3773)
Apart from the LICENSE, GitHub supports detecting all of these files
also within a `docs/` directory. This includes the README!
2024-02-27 01:12:57 +00:00
Reactor Scram
fd31152106 refactor(ci): enable Linux do-nothing GUI builds (but not tests) in CI/CD, extract scripts for that (#3735)
Builds a do-nothing `return 0` Linux client to make sure the CI/CD
scripts are set up and producing AppImage / deb bundles as expected.


![image](https://github.com/firezone/firezone/assets/13400041/7d2d8f02-adde-4b1b-89ec-02aaf112ac48)

---------

Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-02-23 17:57:39 +00:00
Reactor Scram
7825710a69 refactor(GUI clients): extract known_dirs module (#3734)
The CI tests aren't running for Linux just yet.
This organizes the well-known directories used on Linux and Windows for
logs, config, etc., and adds them to the (unused) Linux smoke test

Waiting on #3727

---------

Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-02-23 17:31:35 +00:00