Commit Graph

950 Commits

Author SHA1 Message Date
Jamil
c6b0b0a922 ci: Release 1.3.0 for Internet Resource (#6503)
This publishes the 1.3.0 clients and gateways so that Internet Resources
will work.

The feature is still disabled for the Stripe plans until we publish the
launch post. Select customers have the feature enabled.

Closes #2667
2024-08-30 01:21:34 -07:00
Jamil
c66f0c15c0 ci: Draft bump 1.3.0 clients (#6470)
- Internet resources
2024-08-29 23:33:02 -07:00
Jamil
ac8ae14c51 fix(ci): Fix update-release-draft name so we can require it in branch protection rules (#6484)
This makes the job name deterministic so we can add it to branch
protection rules.
2024-08-29 10:06:53 -07:00
Reactor Scram
0f62c08070 ci: Bump GUI to 1.2.2 (#6481)
- No known issues from the knowledge base were fixed
- I confirmed on the Windows laptop that the fix for #6469 is in this
MSI.
- The changelog looks good in the Vercel preview

---------

Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-08-29 15:26:22 +00:00
Jamil
36fc4cb593 fix(ci): Always build debug images - release binaries with debug tools (#6474)
This will always build images we can use for last-minute compatibility
tests, even if the merge group was bypassed.

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
2024-08-28 16:58:40 -07:00
Thomas Eizinger
35017537c7 feat(gateway): allow out-of-order allow_access requests (#6403)
Currently, the gateway requires a strict ordering of first receiving a
`request_connection` message, following by multiple `allow_access`
messages. Additionally, access can be granted as part of the initial
`request_connection` message too.

This isn't an ideal design. Setting up a new connection is infallible,
all we need to do is send our ICE credentials back to the client.
However, untangling that will require a bit more effort.

Starting with #6335, following this strict order on the client is a more
difficult. Whilst we can send them in order, it is harder to maintain
those ordering guarantees across all our systems.

To avoid this, we change the gateway to perform an upsert for its local
ACLs for a client. In case that an `allow_access` call would somehow get
to the gateway earlier, we can simply already create the `Peer` and only
set up the actual connection later.

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2024-08-28 13:10:06 +00:00
Jamil
ea33b7868f ci: Bump GUI to 1.2.1 (#6462) 2024-08-27 22:19:26 -07:00
Jamil
2b030d801d feat(android): Bundle GITHUB_SHA into Android client (#6405)
Closes #6400 


<img width="659" alt="Screenshot 2024-08-21 at 11 24 16 PM"
src="https://github.com/user-attachments/assets/c1240406-4dda-41df-a36e-1ed9e9b0895a">
2024-08-27 05:17:22 +00:00
Jamil
84a981f668 refactor(ci): Remove browser-based integration tests (#6435)
Fixes a new issue with puppeteer, chromium 128, and Alpine 3.20 that's
causing failing browser tests.

See more: https://github.com/puppeteer/puppeteer/issues/12189

Failure:

https://github.com/firezone/firezone/actions/runs/10549430305/job/29224528663?pr=6391

Unfortunately, puppeteer's embedded browser doesn't seem to want to run
in Alpine:


https://github.com/firezone/firezone/actions/runs/10563167497/job/29265175731?pr=6435#step:6:56


Fixing this is proving very difficult since we can't seem to use
puppeteer with the latest Alpine images, so I questioned the need to
have these in at all. These tests were added at a time where the DNS
mappings were brittle, so we wanted to verify that relayed and direct
connections held up as we deployed.

This is no longer the case, and we also now have much more unit test
coverage around these things, so given the pain of maintaining these
(and the lack of a current solution to the above), they are removed.

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
2024-08-26 20:01:00 +00:00
Thomas Eizinger
095358dd4a ci: set GITHUB_TOKEN For cargo-binstall (#6420)
`install-action` uses `cargo-binstall` as a fallback. That binary
contacts GitHub which may run into rate-limit without being
authenticated. In that case, we will install manually which takes very
long.

Resolves: #6374.
2024-08-23 04:01:40 +00:00
Jamil
0994bd145a feat(apple): Build GITHUB_SHA into Apple clients (#6406)
Closes #6401 

<img width="1012" alt="Screenshot 2024-08-21 at 11 52 31 PM"
src="https://github.com/user-attachments/assets/3012d088-97cb-4a82-8a8f-b2a398865755">

![Screenshot 2024-08-22 at 12 05
44 AM](https://github.com/user-attachments/assets/5e1209f9-e8fa-4453-9bdd-9f40339649b4)
2024-08-22 20:49:57 +00:00
Jamil
c8eed59387 ci: Release 1.2.0 (#6395)
Releasing 1.2.0 to unblock portal deploy! Some of these have already
been published.
2024-08-22 00:18:27 +00:00
Thomas Eizinger
b2e8ccbb49 chore: delete snownet-tests (#6359)
When `snownet` was first being developed, these tests ensured that
hole-punching as well as connectivity via a relayed works correctly. We
have since added extensive tests that ensure connectivity works in many
scenarios via `tunnel_test`. `tunnel_test` does not (yet) have a
simulated NAT so hole-punching itself is not covered by that.

UDP hole-punching is shockingly trivial though because all you need to
do is send UDP packets to the same socket that the other party is
sending from. This isn't done by our own code but rather by str0m's
implement of ICE (as long as we add the correct candidates).

The `snownet-tests` themselves are quite fragile because they need to
set up their own event loop and manually construct an IP packet. They
haven't caught a single bug to my knowledge so I am proposing to delete
them for ease of maintenance.

For example, in
https://github.com/firezone/firezone/actions/runs/10449965474/job/28948590058?pr=6335
the tests fail because we no longer directly force a handshake when the
connection is established. This is unnecessary now because the buffered
intent packet will directly force a handshake from the client to the
gateway. Yet, `snownet-tests` event loop would need adjusting to also do
that.
2024-08-20 03:40:54 +00:00
Thomas Eizinger
504e823a02 ci: assert that we sample certain transitions (#6339)
It happened in the past that we screwed up the `preconditions` of the
state machine test such that no more transitions were sampled that
actually send packets. To protect against this, we use the newly
introduced logs and grep for certain transitions.

In the future, we can consider emitting a more structured output, like
writing all testcases to a DB and run more complex queries against it to
ensure that certain cases are covered.
2024-08-19 22:40:43 +00:00
Jamil
01552cfaec ci: Skip publish images workflow for gui client (#6240)
It doesn't make sense to run this workflow when publishing GUI clients.

Fixes https://github.com/firezone/firezone/actions/runs/10315532876
2024-08-19 04:24:55 +00:00
Thomas Eizinger
3b56664e02 test(rust): ensure deterministic proptests (#6319)
For quite a while now, we have been making extensive use of
property-based testing to ensure `connlib` works as intended. The idea
of proptests is that - given a certain seed - we deterministically
sample test inputs and assert properties on a given function.

If the test fails, `proptest` prints the seed which can then be added to
a regressions file to iterate on the test case and fix it. It is quite
obvious that non-determinism in how the test input gets generated is no
bueno and reduces the value we get out of these tests a fair bit.

The `HashMap` and `HashSet` data structures are known to be
non-deterministic in their iteration order. This causes non-determinism
during the input generation because we make use of a lot of maps and
sets to gradually build up the test input. We fix all uses of `HashMap`
and `HashSet` by replacing them with `BTreeMap` and `BTreeSet`.

To ensure this doesn't regress, we refactor `tunnel_test` to not make
use of proptest's macros and instead, we initialise and run the test
ourselves. This allows us to dump the sampled state and transitions into
a file per test run. In CI, we then run a 2nd iteration of all
regression tests and compare the sampled state and transitions with the
previous run. They must match byte-for-byte.

Finally, to discourage use of non-deterministic iteration, we ban the
use of the iteration functions on `HashMap` and `HashSet` across the
codebase. This doesn't catch iteration in a `for`-loop but it is better
than not linting against it at all.

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-08-16 23:15:58 +00:00
Thomas Eizinger
7c70850217 feat(connlib): allow glob patterns for matching domain names (#5901)
Currently, `connlib` can only handle "simple" DNS wildcards where `*`
matches any number of subdomains, including zero and `?` matches a
single subdomain.

With this PR, we expand `connlib'`s capabilities to allow for a much
more complex matching of domains that more closely resembles glob
patterns:

- `**` matches any number of subdomains. This supersedes the previous
`*` operator.
- `*` matches a single subdomain. This supersedes the previous `?`
operator.
- `?` matches a single character. This wasn't possible before.
- Additionally, any of these can be combined. Previously, only `*` or
`?` was allowed and they were only accepted at the front of the domain
name pattern.

Resolves: #5056.

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
2024-08-15 01:30:53 +00:00
Jamil
296ca4ad4d ci: Bump Clients and Gateways to fix NAT / allocation issues (#6287)
Bump all Clients and Gateways due to #6265 being fixed.

---------

Co-authored-by: Not Applicable <ReactorScram@users.noreply.github.com>
2024-08-13 21:58:12 +00:00
Thomas Eizinger
eb91a052c3 chore(rust): group testing crates into a tests/ directory (#6257)
Resolves: #5695.

Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-08-12 17:17:01 +00:00
Thomas Eizinger
47a447c65a chore: prepare hotfix release for Tauri & headless clients (#6235) 2024-08-09 08:28:25 +00:00
Jamil
67ae8ff380 ci: publish Gateway 1.1.4 (#6228)
Publishes the `ENABLE_MASQUERADE` removal.
2024-08-09 03:45:26 +00:00
Jamil
a6ba9868dd ci: Revert bumps to 1.2 (#6227)
We need these at 1.1 until ready to release.
2024-08-08 18:34:39 -07:00
Jamil
096ddfe7c5 ci: bump gui/headless to 1.1.10 (#6221)
To publish the mpsc channel fix.

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-08-08 16:20:20 +00:00
Jamil
406426c59f fix(ci): Fix underscores / dashes typo from #6208 (#6212)
Fix underscores / dashes typo from #6208
2024-08-07 12:58:15 -07:00
Jamil
0c6cd4a804 fix(ci): Add http test server image specifiers to CI (#6208)
- Adds `http_test_server_image` to inputs so that it gets set properly
for CI (`debug`) and CD (`perf`)
- Updates `dev` -> `debug` in docker-compose.yml to fix pulls
- Fixes issue with seeds and relevant docs from #6205
2024-08-07 12:15:00 -07:00
Jamil
51e0b61c9c chore: Bump all clients and gateway versions (#6149)
Includes major fixes https://github.com/firezone/firezone/pull/6143 and
https://github.com/firezone/firezone/pull/6117
2024-08-02 01:12:49 -07:00
Reactor Scram
23161ec840 chore(gui-client): release 1.1.8 (#6136)
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-08-01 21:58:18 +00:00
Jamil
9d8a15ebee ci: Use the same version of buildx for building, tagging, and merging images (#6066)
In debugging https://firezone.statuspage.io/incidents/3vjmjmbh92mw, we
realized that we use potentially different versions of buildx. This PR
fixes that.
2024-07-30 15:05:09 +00:00
Thomas Eizinger
5687befc9d ci: use correct service name in docker-compose.yml (#6055)
The compose service I defined is called `otel` not `otlp`. With this fix
in place, the relay successfully connects to the OTLP exporter.

it is worthwhile noting that the connection to the OTLP exporter itself
is not critical for relay operation. Even if it fails, it won't affect
the actual data plane. I do think it makes sense to still have a working
OTLP exporter in the compose definition. As it makes it easier to test
whether the ingestion of metrics and traces works as expected.
2024-07-27 02:48:08 +00:00
Reactor Scram
6862213cc2 fix(headless-client/linux): only notify systemd that we're up after Resources are available (#6026)
Closes #5912

Before this, I had the `--exit` CLI flag and the `sd_notify` call
hanging off the wrong callback.
2024-07-26 18:53:08 +00:00
Thomas Eizinger
b2298392e6 ci: only post benchmark results on alerts (#6030)
It appears that the configuration via env variables doesn't work as
expected. This PR changes bencher's config to use commandline arguments.
With that, the `--branch-start-point` actually takes effect and copies
over the thresholds configured on bencher for the `main` branch.

With the thresholds in place, we can configure bencher to only alert us
if a threshold is exceeded and otherwise be quiet and not post a
comment.
2024-07-25 04:28:34 +00:00
Jamil
7034344334 ci: Sleep 3 seconds after upping services for integration tests (#6019)
Fixes #4921
2024-07-24 15:49:39 +00:00
Thomas Eizinger
7c8bbd550b test(connlib): introduce network latency to tunnel_test (#5948)
Currently, `tunnel_test` executes all actions within the same `Instant`,
i.e. time is never advanced by itself. The difficulty with advancing
time compared to other actions like sending packets is that all
time-related actions "overlap". In other words, all timers within
connlib advance at the same time. This makes it difficult to model the
expected behaviour after a certain amount of time has passed as we'd
effectively need to model all timers and their relation to particular
actions (like resending of connection intents or STUN requests).

Instead of only advancing time by itself, we can model some aspect of it
by introducing latency on network messages. This allows us to define a
range of an "acceptable" network latency within everything is expected
to work.

Whilst this doesn't cover all failure cases, it gives us a solid
foundation of parameters within which we should not expect any
operational problems.
2024-07-24 04:01:50 +00:00
Thomas Eizinger
a8aafc9e14 ci: use bencher.dev for continuous benchmarking (#5915)
Currently, we have a homegrown benchmark suite that reports results of
the iperf runs within CI by comparing a run on `main` with the current
branch.

These comments are noisy because they happen on every PR, regardless of
the performance results. As a result, they tend to be skimmed over by
devs and not actually considered. To properly track performance, we need
to record benchmark results over time and use statistics to detect
regressions.

https://bencher.dev does exactly that. it supports various benchmark
harnesses to automatically collect benchmarks. For our case, we simply
use the generic JSON adapter to extract the relevant metrics from the
iperf results and report them to the bencher backend.

With these metrics in place, bencher can plot the results over time, and
alert us in the case of regressions using thresholds based on
statistical tests.

Resolves: #5818.

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-07-24 01:22:17 +00:00
Andrew Dryga
50318ae1d2 chore(ci): Do not run terraform plan in PRs when there are no changes (#5964) 2024-07-23 09:18:49 -06:00
Reactor Scram
75529ea799 chore(rust): bump nightly version used for checking unused deps (#5918)
This version was a few months old and started throwing errors about
features that stabilized since then.

e.g.
https://github.com/firezone/firezone/actions/runs/10011089436/job/27673759249

```
error[E0658]: use of unstable library feature 'proc_macro_byte_character'
   --> /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/proc-macro2-1.0.86/src/wrapper.rs:871:21
    |
871 |                     proc_macro::Literal::byte_character(byte)
    |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: see issue #115268 <https://github.com/rust-lang/rust/issues/115268> for more information
    = help: add `#![feature(proc_macro_byte_character)]` to the crate attributes to enable
    = note: this compiler was built on 2024-03-25; consider upgrading it if it is out of date

error[E0658]: use of unstable library feature 'proc_macro_c_str_literals'
   --> /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/proc-macro2-1.0.86/src/wrapper.rs:898:21
    |
898 |                     proc_macro::Literal::c_string(string)
    |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: see issue #119750 <https://github.com/rust-lang/rust/issues/119750> for more information
    = help: add `#![feature(proc_macro_c_str_literals)]` to the crate attributes to enable
    = note: this compiler was built on 2024-03-25; consider upgrading it if it is out of date

For more information about this error, try `rustc --explain E0658`.
error: could not compile `proc-macro2` (lib) due to 2 previous errors
```
2024-07-19 22:32:51 +00:00
Reactor Scram
6b1b14dc2c chore(gui-client): release GUI Client 1.1.7 (#5897)
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-07-17 22:23:43 +00:00
Thomas Eizinger
aa279d7731 ci: never tolerate warnings in Rust code (#5893)
Our Rust CI runs various jobs in different configurations of packages
and / or features. Currently, only the clippy job denies warnings which
makes it possible that some code still generates warnings under
particular configurations.

To ensure we always fail on warnings, we set a global env var to deny
warnings for all Rust CI jobs.

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-07-17 22:22:12 +00:00
Reactor Scram
a8ece49d9e chore: bump GUI to 1.1.6 (#5862)
I started a playbook for publishing GUI releases, I didn't see any other
one around.

I think there's a middle step I'm not clear on:

1. Open this PR and get it approved
2. Do something? Publish the draft release maybe? Run a special CI
workflow?
3. Merge this PR to update the changelog and bump the versions in Git

```[tasklist]
### Tasks
```
2024-07-12 18:45:56 +00:00
Reactor Scram
64e0b71b77 feat(gui-client): set a different tray icon when signed out (#5817)
Closes #5810 

```[tasklist]
### Tasks
- [x] Try not to set the icon every time we change Resources
- [x] Get production icons
- [x] Add changelog comment
- [x] Add CI stress test that sets the icon 10,000 times
- [x] Open for review
- [x] Repair changelog
- [ ] Merge
```

---------

Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-07-11 20:50:44 +00:00
Reactor Scram
78f1c7c519 test(firezone-tunnel/windows): Test Windows upload speed in CI (#5607)
Closes #5601
It looks like we can hit 100+ Mbps in theory. This covers Wintun, Tokio,
and Windows OS overhead. It doesn't cover the cryptography or anything
in connlib itself.

The code is kinda messy but I'm not sure how to clean it up so I'll just
leave it for review.

This test should fail if there's any regressions in #5598.

It fails if any packet is dropped or if the speed is under 100 Mbps

```[tasklist]
### Tasks
- [x] Use `ip_packet::make`
- [x] Switch to `cargo bench`
- [x] Extract windows ARM PR
- [x] Clean up wintun.dll install code
- [x] Re-request review
```
2024-07-10 19:09:45 +00:00
Jamil
446d24a761 ci: Fix scoping dialyzer cache to elixir version (#5825)
This fixes a CI bug where the dialyzer cache was not being scoped to the
elixir version, causing cache issues that fail CI jobs.

This also performs some tidying up of the cache key to scope it by
runner arch too for elixir deps, and make clear what the cache key
references.

https://github.com/firezone/firezone/actions/runs/9877195625
2024-07-10 18:01:32 +00:00
Thomas Eizinger
79b14d4399 ci: don't build optimised Rust tests (#5805)
In #5786, we massively increase the performance of `tunnel_test` and
thus, it is no longer necessary to build all tests using optimisation
level 1. Windows is very slow in compiling Rust and forcing it to
compile with optimisations doesn't help that.

On `main`, the compile phase takes ~ **8min**:
https://github.com/firezone/firezone/actions/runs/9847792756/job/27188488313#step:5:968

With this patch, the compile phase takes ~**6min**:
https://github.com/firezone/firezone/actions/runs/9849448280/job/27193128597?pr=5805#step:5:967
2024-07-09 13:17:07 +00:00
Jamil
ef3b4e5dfe feat(linux-gui): Bump GUI to 1.1.5 for arm64 support (#5800) 2024-07-08 21:58:10 -07:00
Jamil
cd1b46c8f5 fix(ci): Install GH CLI on arm runners (#5802)
`main` failure:

https://github.com/firezone/firezone/actions/runs/9847918080/job/27190842443

Opened an issue:
https://github.com/actions/runner-images/issues/10192

gh cli instructions:

https://github.com/cli/cli/blob/trunk/docs/install_linux.md#debian-ubuntu-linux-raspberry-pi-os-apt
2024-07-09 02:56:24 +00:00
Reactor Scram
e0326be807 ci(gui-client/linux): see if we can build the GUI Client for ARM (#5793)
This would make it a little easier to replicate prod issues on old
releases

```[tasklist]
### Tasks
- [x] Add comment to changelog
- [x] Check Vercel preview
- [x] Request review
- [x] Update arches link
- [x] `apt-get update`
- [x] Re-request review
```
2024-07-08 21:30:48 +00:00
Jamil
1b7338e5c3 fix(website): fix sha of deployed portal (#5782)
Needs a storage key, not an env var to read.
2024-07-06 17:25:00 -07:00
Jamil
e39ce22b36 chore: Publish new linux/windows clients (#5767)
Adds the DNS fix.
2024-07-05 13:19:30 -07:00
Reactor Scram
d0f68fc133 test(gui-client): multi-process smoke test for GUI + IPC service (#5672)
```[tasklist]
### Tasks
- [x] Check the GUI saves its settings file
- [x] Check the IPC service writes the device ID to disk
- [x] Check the GUI writes a log file (skipped - we already check if the exported zip has any files in it)
- [x] Run the crash file through `minidump-stackwalk`
- [x] Reach feature parity with the original smoke tests
- [x] Ready for review
- [x] Finish #5452
- [ ] Start on #5453 
```
2024-07-04 21:10:31 +00:00
Jamil
086c730aaf chore: Bump clients to 1.1.2 for DNS record type forward (#5703)
Apps are already in review with App Stores
2024-07-04 01:31:26 +00:00