```[tasklist]
### Tasks
- [x] Check the GUI saves its settings file
- [x] Check the IPC service writes the device ID to disk
- [x] Check the GUI writes a log file (skipped - we already check if the exported zip has any files in it)
- [x] Run the crash file through `minidump-stackwalk`
- [x] Reach feature parity with the original smoke tests
- [x] Ready for review
- [x] Finish #5452
- [ ] Start on #5453
```
Currently, the smoke tests rebuild the `dump_syms` and
`minidump-stackwalk` tools from scratch every time, which is slow,
especially on Windows.
We can speed this up by utilising the `taiki-e/install-action` GitHub
action, which discovers and downloads the latest binary releases of those
projects and installs them into `$PATH`.
I think those binaries might also be cached as part of the Rust cache
action (https://github.com/Swatinem/rust-cache) so the visible speed-up
is only within a few seconds and comes from the binaries not being
re-built inside the script.
Caching those binaries on GitHub still requires us to build them at
least once and also rebuild them in case the cache gets invalidated.
Hence I still think this is a good idea on its own.
When the `tunnel_test` fails, it generates a lot of output because it
keeps printing the backtrace over and over. This makes it difficult to
access the input seed to the test. Copying this seed into a local
environment is the first step in debugging this, at which point the
backtrace can be enabled locally.
We also disable the `verbose: 1` config option. Users can always set
that using the `PROPTEST_VERBOSE` env variable.
With an increased number of tests that make use of `proptest`, executing
`cargo test` takes longer and longer (almost 6 minutes on `main` currently:
https://github.com/firezone/firezone/actions/runs/9278428694/job/25529425068#step:5:1407).
By compiling them with optimisations, we can drastically cut down the
execution time with only a small penalty in compilation speed, as those
should be cached in CI.
This is similar to #4097 and #4585 but for the entire `ClientState` and
`GatewayState`. We also do it in the context of a property-based test
with the vision that we can deterministically explore a large space of
state transitions and see where our main property breaks: Being able to
send an ICMP packet from the client to the gateway.
In other words, we now correctly pass all the `Transmit`s back and forth
between the components as if they had received them from the network. Due
to the nature of property-based tests, this already exercises a very
large input space. For example, if the client does not have an IPv6
socket and the gateway doesn't have an IPv4 socket, this test already
checks whether we then correctly fall back to using a relay (because the
allocation we make on the relay is the only network path where the STUN
requests pass through).
What this does not (yet) do is set up a proper network topology. The
`dispatch_transmit` function will happily "route" a `Transmit` from e.g.
the client to the gateway even if they are in different subnets. In
other words, these tests assume that the actual network itself works and
we can exchange UDP packets between the components.
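To illustrate what "routing without a topology" means, here is a minimal sketch, not the actual test harness. The `Transmit`, `Component` and `SimNetwork` types below are hypothetical stand-ins invented for this example; only the name `dispatch_transmit` comes from the PR. Delivery is decided purely by looking up the destination socket address, so a client in `10.0.0.0/8` can reach a gateway in `192.168.0.0/16` without any reachability check.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

/// Hypothetical stand-in for a datagram that one component wants to send.
#[derive(Debug, Clone, PartialEq)]
struct Transmit {
    src: SocketAddr,
    dst: SocketAddr,
    payload: Vec<u8>,
}

/// The components taking part in the simulated test (assumed names).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Component {
    Client,
    Gateway,
}

/// A "network" that only knows which component owns which socket address.
#[derive(Default)]
struct SimNetwork {
    sockets: HashMap<SocketAddr, Component>,
    inboxes: HashMap<Component, Vec<Transmit>>,
}

impl SimNetwork {
    /// Registers `addr` as a socket owned by `component`.
    fn bind(&mut self, addr: SocketAddr, component: Component) {
        self.sockets.insert(addr, component);
    }

    /// Delivers `transmit` to whichever component owns `transmit.dst`.
    ///
    /// There is deliberately no subnet or reachability check here: any
    /// known destination address is considered reachable from any source.
    /// Returns `None` if nobody is bound to the destination.
    fn dispatch_transmit(&mut self, transmit: Transmit) -> Option<Component> {
        let target = *self.sockets.get(&transmit.dst)?;
        self.inboxes.entry(target).or_default().push(transmit);
        Some(target)
    }
}
```

A real topology would replace the plain address lookup with a check of whether `src` can actually reach `dst`.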
For now, we only send ICMPs to CIDR resources. As a next step, we can
extend this to DNS resources by sending DNS queries for our DNS
resources and then sending an ICMP to the resolved IP.
It typically takes about 1 minute to run in CI. We don't have any leads
on fixing this issue, and it may be a regression in a recent release of
WebView2. https://github.com/firezone/firezone/pull/4935
This will fix an issue with `linux-group` and `token-path` that happens
when I try to split up the binaries.
```[tasklist]
### Before merging
- [x] Fix linux-group. That stub-ipc-client command doesn't even exist anymore
```
```[tasklist]
### Before merging
- [x] Remove file extension `.txt`
- [x] Wait for `linux-group` test to go green on `main` (#4692)
- [x] *all* compatibility tests must be green on this branch
```
Closes #4664
Closes #4665
~~The compatibility tests are expected to fail until the next release is
cut, for the same reasons as in #4686~~
The compatibility test must be handled somehow, otherwise it'll turn
main red.
`linux-group` was moved out of integration / compatibility testing, but
the DNS tests do need the whole Docker + portal setup, so that one can't
move.
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
Closes #4669
This should stop the problem of `linux-group` failing because it tries
to test an older release that doesn't have the right CLI features.
---------
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
(After GA)
This adds a unit test for the Unix domain sockets that I intend to use
for process splitting on Linux.
The length-prefixed encoding and decoding are copied from `subzone`, but
most of that code will not be re-used since it's Windows-specific and
also specific to a Chromium-like process model, which won't work for
Firezone.
This catches two of the mutants, according to `cargo-mutants`.
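For reference, here is a minimal sketch of what length-prefixed framing can look like. It is not the code copied from `subzone`: the 4-byte little-endian prefix, the `MAX_MSG_LEN` limit and the `write_msg` / `read_msg` names are all assumptions made for this example. It is written against `Read` / `Write`, so it should work with `std::os::unix::net::UnixStream` as well as with an in-memory buffer.

```rust
use std::io::{self, Read, Write};

/// Upper bound on a single message. This limit is an assumption of the
/// sketch; it keeps a corrupt prefix from triggering a huge allocation.
const MAX_MSG_LEN: u32 = 1024 * 1024;

/// Writes `payload` preceded by its length as a little-endian `u32`.
fn write_msg<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_MSG_LEN as usize {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "message too long",
        ));
    }
    let len = payload.len() as u32;
    w.write_all(&len.to_le_bytes())?;
    w.write_all(payload)
}

/// Reads one length-prefixed message written by `write_msg`.
fn read_msg<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    // Read the 4-byte prefix first so we know how much payload to expect.
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf)?;
    let len = u32::from_le_bytes(len_buf);
    if len > MAX_MSG_LEN {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "length prefix too large",
        ));
    }
    let mut payload = vec![0u8; len as usize];
    r.read_exact(&mut payload)?;
    Ok(payload)
}
```

Because the reader always knows how many bytes belong to the current message, several messages can be sent back to back on the same stream without a delimiter.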
~~Unfortunately since `cargo test` runs in one process, it's
all-or-nothing for sudo, this will run all unit tests as sudo.~~
(This explanation is not exactly correct, `cargo test` does run _a_
subprocess, but still, there is no way to request sudo or non-sudo
runners for specific tests, since it's just an environment variable, and
since many tests run in parallel in different threads of the same
process.)
Here it is passing in Linux:
https://github.com/firezone/firezone/actions/runs/8382799272/job/22957555987#step:5:3160
And Windows:
https://github.com/firezone/firezone/actions/runs/8382799272/job/22957558003#step:5:1006
```[tasklist]
### Before merging
- [x] Try `#[ignore]` attribute
- [x] Fail gracefully if `sudo` isn't available
```
Closes #3699 if successful
Ref #3972
I don't understand why it started working. There are at least three
possibilities:
- Some unrelated change in the last few weeks fixed it (Maybe bumping
Tauri to 1.6.1? https://github.com/firezone/firezone/pull/3881)
- It was a bug in the GitHub CI runner image that they fixed
- It's an awful race condition and adding `tracing::debug!` fixed it
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
- Runs release asset builds simultaneously with `deploy-staging`. Those
don't depend on each other.
- Prevents running some build workflows in CD because they're run
already in the PR and in the merge group, and the risk of semantic
conflict is negligible
- Run `release` assets in staging
- Adds `compatibility_tests`: **To successfully introduce a breaking
change in the control / data plane APIs, you must now "Merge as
Administrator"**
- Since `CI` is no longer run on `main`, caching needed to be refactored
to make sense again
- Since `CI` is no longer run on `main`, the Elixir
`migrations_and_seeds_test` had to be rewritten. This now tests
migrations using `git checkout` instead of importing `main`'s DB dump.
- Move tauri builds to its own workflow so we can trigger Linux and
Windows builds manually on an adhoc basis like we do for the Swift and
Kotlin builds
- Add a new `hotfix` workflow that will run `compatibility_tests` with
the latest published images
- Add `workflow_dispatch` to trigger `CD` manually for testing purposes
(cc @ReactorScram)
Refs #3995
Closes #3815
Changes that are breaking (but these aren't in production so it should
be okay)
- Windows, renaming `device_id.json` to `firezone-id.json` to match the
rest of the code
- Linux GUI, storing the firezone-id under `/var/lib` instead of under
`$HOME`
- Linux GUI, bails out if not run with `sudo --preserve-env` by
detecting `$HOME == root` or `$USER != root`
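
The `sudo --preserve-env` check from the last bullet could look roughly like the sketch below. This is an illustration only: the `check_sudo_env` name is hypothetical, and reading "`$HOME == root`" as "`$HOME` is `/root`" is an assumption on my part. The function takes the two values as arguments so it can be tested without touching the real environment; a caller would pass in `std::env::var("HOME")` and `std::env::var("USER")`.

```rust
/// Checks that the process looks like it was started via
/// `sudo --preserve-env`: we must be root, but `$HOME` must still point
/// at the invoking user's home directory.
///
/// ASSUMPTION: root's home directory is `/root`.
fn check_sudo_env(home: &str, user: &str) -> Result<(), String> {
    if user != "root" {
        // Not running as root at all, so `sudo` was not used.
        return Err(format!("must run as root, but $USER is `{user}`"));
    }
    if home == "/root" {
        // Running as root with root's own home: the environment was
        // reset, i.e. `--preserve-env` was missing.
        return Err("$HOME is root's home; run with `sudo --preserve-env`".to_string());
    }
    Ok(())
}
```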
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Builds off #3905 and uses the GH actions cache for tauri builds in order
to get around the `crate-type` problem sccache has with Tauri apps.
Fixes #3456
(Waiting on #3721)
Ubuntu is headless by default and needs `xvfb` to run Tauri in CI, hence
the difference.
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
This may cause conflicts with all my other PRs but it has to happen.
```[tasklist]
- [ ] Update test names in branch protection (I don't think I have perms for this)
```
This prevents duplication for different Tauri jobs like building the
release packages vs testing a debug build with mock keyring.
```[tasklist]
- [ ] Fix branch protection rules for changed tests
```
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Closes #3534
I'm running it in another job in parallel. It doesn't work in release
mode for Windows reasons, and I'm not sure how to share it with the
`cargo test` jobs.
So the overall time for `ci.yml` is only 11 minutes, which seems
typical. However it is using up another CI runner unit to build the
whole Tauri app from scratch in debug mode.
```[tasklist]
- [x] Skip the WebView2 error dialog if we're in smoke-test mode
- [x] Research if there's any GitHub action to install WebView2 in the runner
```
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Gabi <gabrielalejandro7@gmail.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
This should be faster than the Intel runners. Seems to be at least twice
as fast for uncached builds compared to `ubuntu-22.04`.
- [x] ~~Move elixir checks to `macos-14`~~ can't; Depends on `docker`
and `erlef/setup-beam`
- [x] Add macOS targets to rust checks
- [x] Move swift build to macos-14
- [x] Move kotlin build to macos-14
- [x] Name all jobs that are required for merge group to not depend on
job config
- [x] Update PR branch protection rules
`firezone-connection` was a working title that I never really quite
liked. Here is a proposal to rebrand it to `snownet`. That is a lot more
concise and derived from the fact that we are establishing a network of
connections using ICE.
I tested this by temporarily putting panics in `test_ipc_manager` and
`test_ipc_worker`.
It looks like, if a process crashes, Windows will clean up its named
pipe, and the process waiting on the other side of the named pipe will
get an error.
This is good but it's not air-tight - ~~We could still have a situation
where a worker process locks up, and the main process crashes, and the
worker process then leaks.~~ #3311 will fix that
For that case I'll try this
https://stackoverflow.com/questions/53208/how-do-i-automatically-destroy-child-processes-in-windows
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Fulfills #2997
cd.yml changes are always blind so it may break the draft release when
it goes into main. Just let me know.
I should probably just switch it to Bash so it's easier to test.