To improve supply-chain security, reference all GitHub Actions by the
commit hash of the released tag. GitHub recommends doing this for
third-party actions
(https://docs.github.com/en/actions/security-for-github-actions/security-guides/security-hardening-for-github-actions#using-third-party-actions).
In order to make our CI more deterministic, I opted to do it for all our
actions. This means any change to our workflow configuration requires a
source code change and thus passing CI on our end.
Dependabot will automatically open PRs to bump these hashes and update
the version comment next to them.
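For illustration, a pinned reference looks like the following sketch;
the SHA is a placeholder, not a real release hash:

```yaml
steps:
  # The comment after the hash records the human-readable version;
  # Dependabot bumps the hash and the comment together.
  - name: Checkout
    uses: actions/checkout@0000000000000000000000000000000000000000 # v4
```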
Resolves: #2497.
In order for Sentry to parse our releases as semver, they need to be in
the form of `package@version` [0]. Without this, the "Mark this issue as
resolved in the _next_ version" feature doesn't work properly, because
Sentry orders versions by when it first saw them instead of parsing the
semver string itself. We test versions prior to releasing them, meaning
Sentry learns about a 1.4.0 version before it is actually released. This
causes false-positive "regressions" even though the issues are fixed in
a later (as per semver) release.
This creates some redundancy with the different DSNs that we are already
using. I think it would make sense to consider merging the two projects
we have for the GUI client for example. That is really just one project
that happens to run as two binaries.
For all other projects, I think the separation still makes sense because
we e.g. may add Sentry to the "host" applications of Android and
macOS/iOS as well. For those, we would reuse the DSN and thus funnel the
issues into the same Sentry project.
As per Sentry's docs, releases are organisation-wide and therefore need
a package identifier to be grouped correctly.
[0]:
https://docs.sentry.io/platforms/javascript/configuration/releases/#bind-the-version
Explicitly creating the Sentry release allows us to associate the
commits since the last release with the new one. This might help us to
identify potential sources of regressions. For the current releases,
I've set them manually to ensure that this automation has something to
pick up on for the next release.
The releases will already exist prior to this change because they are
automatically created when a client / gateway first logs in with a
certain version.
What this change does is mark the release as "finalized" and set the
commit range accordingly.
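A minimal sketch of the corresponding CI step, assuming a
`gui-client@<version>` release name and a `SENTRY_AUTH_TOKEN` secret
(the org slug and version source are illustrative):

```yaml
- name: Finalize Sentry release
  env:
    SENTRY_AUTH_TOKEN: ${{ secrets.SENTRY_AUTH_TOKEN }}
    SENTRY_ORG: firezone # illustrative org slug
  run: |
    # Use the package@version convention so Sentry can parse the semver.
    version="gui-client@${{ github.ref_name }}"
    sentry-cli releases new "$version"
    # Associate all commits since the previous release with this one.
    sentry-cli releases set-commits "$version" --auto
    sentry-cli releases finalize "$version"
```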
Resolves: #7358.
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
This ensures that we run prettier across all supported filetypes to check
for any formatting / style inconsistencies. Previously, it was only run
for files in the website/ directory using a deprecated pre-commit
plugin.
The benefit to keeping this in our pre-commit config is that devs can
optionally run these checks locally with `pre-commit run --config
.github/pre-commit-config.yaml`.
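A sketch of what such a hook can look like in
`.github/pre-commit-config.yaml` (the exact entry we use may differ):

```yaml
repos:
  - repo: local
    hooks:
      - id: prettier
        name: prettier
        # --ignore-unknown makes prettier skip filetypes it doesn't
        # support, so the hook can run across the whole tree.
        entry: prettier --check --ignore-unknown
        language: node
        additional_dependencies: ["prettier"]
        types: [text]
```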
---------
Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
Closes #4883
Refs #7005
Adds support for Ubuntu 24.04, drops support for Ubuntu 20.04
Known issues:
- On Ubuntu 22.04, sometimes GNOME shows the wrong tray icon
- On Ubuntu 24.04, the first time you open the tray menu, GNOME takes a
long time to open it.
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Refs #6138
Sentry is always enabled for now. In the near future we'll make it
opt-out per device and opt-in per org (see #6138 for details)
- Replaces the `crash_handling` module
- Catches panics in the GUI process, tunnel daemon, and Headless Client
- Adds a couple of "breadcrumbs" to play with that feature
- User ID is not set yet
- Environment is set to the API URL, e.g. `wss://api.firezone.dev`
- Reports panics from the connlib async task
- Release should be pulled automatically from the Cargo version, which
we set in the version Makefile
Example screenshot of sentry.io with a caught panic:
<img width="861" alt="image"
src="https://github.com/user-attachments/assets/c5188d86-10d0-4d94-b503-3fba51a21a90">
This moves about 2/3rds of the code from `firezone-gui-client` to
`firezone-gui-client-common`.
I tested this on aarch64 Windows, cycling through sign-in and sign-out
and closing and re-opening the GUI process while the IPC service stays
running. IPC and updates each get their own MPSC channel in this change,
so I wanted to be sure it didn't break.
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
The different implementations of `Tun` are the last platform-specific
code within `firezone-tunnel`. By introducing a dedicated crate and a
`Tun` trait, we can move this code into (platform-specific) leaf crates:
- `connlib-client-android`
- `connlib-client-apple`
- `firezone-bin-shared`
Related: #4473.
---------
Co-authored-by: Not Applicable <ReactorScram@users.noreply.github.com>
Closes #5601
It looks like we can hit 100+ Mbps in theory. This covers Wintun, Tokio,
and Windows OS overhead. It doesn't cover the cryptography or anything
in connlib itself.
The code is kinda messy, but I'm not sure how to clean it up, so I'll
just leave it for review.
This test should fail if there are any regressions in #5598: it fails if
any packet is dropped or if the speed is under 100 Mbps.
```[tasklist]
### Tasks
- [x] Use `ip_packet::make`
- [x] Switch to `cargo bench`
- [x] Extract windows ARM PR
- [x] Clean up wintun.dll install code
- [x] Re-request review
```
This fixes a CI bug where the dialyzer cache was not being scoped to the
elixir version, causing cache issues that fail CI jobs.
This also tidies up the cache key: the Elixir deps cache is now scoped
by runner arch as well, and the key makes clear what it references.
https://github.com/firezone/firezone/actions/runs/9877195625
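For illustration, a cache key scoped the way this change intends could
look like the following (the `versions` step outputs and the PLT path
are illustrative):

```yaml
- uses: actions/cache@v4
  with:
    path: priv/plts
    # Scope by OS, runner arch, and the exact Elixir/OTP versions so a
    # version bump can't restore an incompatible PLT.
    key: dialyzer-${{ runner.os }}-${{ runner.arch }}-elixir-${{ steps.versions.outputs.elixir }}-otp-${{ steps.versions.outputs.otp }}-${{ hashFiles('mix.lock') }}
```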
This would make it a little easier to replicate prod issues on old
releases.
```[tasklist]
### Tasks
- [x] Add comment to changelog
- [x] Check Vercel preview
- [x] Request review
- [x] Update arches link
- [x] `apt-get update`
- [x] Re-request review
```
```[tasklist]
### Tasks
- [x] Check the GUI saves its settings file
- [x] Check the IPC service writes the device ID to disk
- [x] Check the GUI writes a log file (skipped - we already check if the exported zip has any files in it)
- [x] Run the crash file through `minidump-stackwalk`
- [x] Reach feature parity with the original smoke tests
- [x] Ready for review
- [x] Finish #5452
- [ ] Start on #5453
```
Unfortunately I had to keep `linux-client` to get the compatibility
tests to pass. #4578 aims to remove that package.
Please add to this list if you think of anything:
```[tasklist]
### Things that may break that CI/CD won't catch
- [ ] GitHub release artifacts
- [ ] Knowledge base
- [ ] Docker images
- [ ] Docker containers
- [ ] Existing `linux-client` users
- [ ] Anything that downloads ghcr artifacts
- [ ] Nix (Not sure if it's built in CI. It had a merge conflict)
```
Refs #4515, #3712, #3782
I think this is what Thomas and I agreed on in Slack / GitHub
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
(After GA)
This adds a unit test for the Unix domain sockets that I intend to use
for process splitting on Linux.
The length-prefixed encoding and decoding are copied from `subzone`, but
most of that code will not be re-used since it's Windows-specific and
also specific to a Chromium-like process model, which won't work for
Firezone.
On the domain side, this PR extends `Domain.Repo` with filtering,
pagination, and ordering, along with some convention changes, and
removes code that is no longer needed now that we have the filtering.
This required touching pretty much all contexts and code, but I went
through all public functions and added missing tests to make sure
nothing will break.
On the web side, I've introduced a `<.live_table />` which is as close
as possible to being a drop-in replacement for the regular `<.table />`
(but requires structuring the LiveView module differently due to assigns
anyway). I've updated all the listing tables to use it.
Fixes some issues encountered after the merge of #4049
- Fixes performance tests to only run using `base_ref` and `head_ref` to
avoid a dependence on `main`
- Fixes some typos
- Prevents a catch-22 condition where breaking compatibility meant we
wouldn't be able to deploy production
- Runs release asset builds simultaneously with `deploy-staging`; those
don't depend on each other
- Prevents running some build workflows in CD because they already run
in the PR and in the merge group, and the risk of a semantic conflict is
negligible
- Runs `release` assets in staging
- Adds `compatibility_tests`: **To successfully introduce a breaking
change in the control / data plane APIs, you must now "Merge as
Administrator"**
- Since `CI` is no longer run on `main`, caching needed to be refactored
to make sense again
- Since `CI` is no longer run on `main`, the Elixir
`migrations_and_seeds_test` had to be rewritten. It now tests migrations
using `git checkout` instead of importing `main`'s DB dump.
- Moves Tauri builds to their own workflow so we can trigger Linux and
Windows builds manually on an ad-hoc basis, like we do for the Swift and
Kotlin builds
- Adds a new `hotfix` workflow that will run `compatibility_tests` with
the latest published images
- Adds `workflow_dispatch` to trigger `CD` manually for testing purposes
(cc @ReactorScram); see the sketch below
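For the last point, a sketch of the trigger change (the other events on
the `CD` workflow are omitted here):

```yaml
on:
  push:
    branches: [main]
  # Allows triggering CD manually from the Actions tab for testing.
  workflow_dispatch:
```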
Refs #3995
Builds off #3905 and uses the GitHub Actions cache for Tauri builds in
order to get around the `crate-type` problem sccache has with Tauri
apps.
Fixes #3456
This prevents duplication for different Tauri jobs like building the
release packages vs testing a debug build with mock keyring.
```[tasklist]
- [ ] Fix branch protection rules for changed tests
```
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Our caches in GitHub Actions are hopelessly overflowing, and for the
Kotlin and Swift jobs, we don't seem to be doing a particularly good job
of caching the build outputs because those jobs take forever.
Instead of the GitHub Actions cache, this PR configures `sccache` for
all Rust compilation commands and uses a GCP bucket to store the
artifacts.
This speeds up some of the builds a fair bit. Android now finishes in
~6 minutes.
Apart from the self-hosted macOS 14 runner, the Swift jobs are slow, but
still a lot faster than what we currently have.
Windows seems to be quite slow at compiling / fetching artefacts and is
negatively impacted by this change because artefacts now have to be
fetched from the bucket.
Overall, I think this is a net-positive though and should be much easier
to maintain going forward.
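A sketch of the relevant configuration, assuming a dedicated bucket
(name illustrative); `SCCACHE_GCS_BUCKET` and `SCCACHE_GCS_RW_MODE` are
sccache's documented GCS settings:

```yaml
env:
  # Route all rustc invocations through sccache.
  RUSTC_WRAPPER: sccache
  SCCACHE_GCS_BUCKET: firezone-sccache # illustrative bucket name
  SCCACHE_GCS_RW_MODE: READ_WRITE
```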
---------
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
This patch-set aims to make several improvements to our CI caching:
1. Use the registry as a build cache: We push a separate image to our
Docker registry at GCP that contains the cache layers. This happens for
every PR and for `main`. As a result, we can restore from **both**,
which should make repeated runs of CI on an individual PR faster and
give new PRs a good baseline cache from `main`. See
https://docs.docker.com/build/ci/github-actions/cache/#registry-cache
for details, and the first sketch below the list. As a nice side-effect,
this allows us to use the 10 GB we have on GitHub Actions for other
jobs.
2. We make better use of `restore-keys` by also attempting to restore
the cache when the fingerprint of our lockfiles doesn't match (see the
second sketch below the list). This is useful for CI runs that upgrade
dependencies: those will restore a cache that doesn't quite match but is
still useful, which is better[^1] than not hitting the cache at all.
3. There were two tiny bugs in our Swift and Android builds:
    a. We used `rustup show` in the wrong directory and thus did not
actually install the toolchain properly.
    b. We used `shared-key` instead of `key` for the
https://github.com/Swatinem/rust-cache action and thus did not
differentiate between jobs properly.
4. Our Dockerfile for Rust had a bug where it did not copy the
`rust-toolchain.toml` file into the `chef` layer and thus also did not
use the correct toolchain.
5. We remove the dedicated Gradle cache because the build action already
comes with a cache configuration:
https://github.com/firezone/firezone/actions/runs/6416847209/job/17421412150#step:10:25
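First sketch (for point 1): how the registry cache can be wired up with
`docker/build-push-action`; the registry, image, and tag names are
illustrative:

```yaml
- uses: docker/build-push-action@v5
  with:
    push: true
    tags: example-registry/firezone/api:${{ github.sha }}
    # Restore from both the PR-specific cache and the main baseline.
    cache-from: |
      type=registry,ref=example-registry/firezone/cache/api:main
      type=registry,ref=example-registry/firezone/cache/api:pr-${{ github.event.number }}
    # mode=max also caches intermediate layers, not just the final image.
    cache-to: type=registry,ref=example-registry/firezone/cache/api:pr-${{ github.event.number }},mode=max
```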
[^1]: Over time, this may mean that our caches grow a bit. In an ideal
world, we automatically remove files from the caches that haven't been
used in a while. The cache action we use for Rust does that
automatically:
https://github.com/Swatinem/rust-cache?tab=readme-ov-file#cache-details.
For the caches that don't, we can just purge everything every now and
then as a workaround.
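Second sketch (for point 2): `restore-keys` lets a near-miss fall back
to the most recent cache with the same prefix (path and key prefix
illustrative):

```yaml
- uses: actions/cache@v4
  with:
    path: ~/.cargo
    key: cargo-${{ runner.os }}-${{ hashFiles('**/Cargo.lock') }}
    # If the lockfile changed, restore the newest cache with this prefix
    # instead of starting cold.
    restore-keys: |
      cargo-${{ runner.os }}-
```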
---------
Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>