Commit Graph

40 Commits

Author SHA1 Message Date
Andrew
74e10b512a Bump Alpine version for Rust dockerfiles 2024-06-24 14:18:57 -06:00
Jamil
16bc9d943b fix(infra): Bump base images to resolve CVEs (#5515)
Fixes the CVEs here:

https://alpinelinux.org/posts/Alpine-3.17.8-3.18.7-3.19.2-released.html

I discovered these while browsing our Google Artifact Registry.
2024-06-24 16:56:55 +00:00
Reactor Scram
1cf10f0c3f chore(rust): bump to Rust 1.79 (#5356)
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2024-06-16 22:06:18 +00:00
Reactor Scram
b1dde546ab chore(rust): update to Rust 1.78 (#5006)
```[tasklist]
### Before merging
- [x] Apple smoke test
- [x] Android smoke test
```
2024-05-17 14:08:35 +00:00
Gabi
adc0bb73f7 test(client): add reconnection tests from a client using a headless browser (#4569)
Considered using Elixir and Rust to write the tests.

For Elixir, `wallaby` doesn't seem to have a way to attach to an
existing `chromium` instance, launching it each time, which makes it
hard to coordinate with the relay restart.

For Rust, we considered `thirtyfour`, which would be very nice since we
could test both Firefox and Chrome, but each time it connects to the
instance it launches a new session, making it hard to test the DNS cache
behavior.

We also considered `chrome_headless` for Rust. It needs a small patch to
prevent it from closing the browser after `Drop`, but it still presents a
problem, since it has no easy way to tell whether loading a page has
succeeded. There are some workarounds, such as retrieving the title, that
we could have used, but after some testing they are quite finicky and we
don't want that for CI.

So I ended up settling for TypeScript but I'm open to other options, or
a fix for the previous ones!

There are some modifications still incoming for this PR around the test
name, and the sleep in the middle of the test doesn't look good, so I
will probably add some retries. The gist is here, though; I will keep it
in draft until we expect it to be passing.

So feel free to do some initial reviews.

Note: the number of lines changed is greatly exaggerated by
`package.lock`

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-04-20 06:57:07 +00:00
Thomas Eizinger
4972e49b34 ci: run assertions inside docker container (#4680)
As part of #4568, we are adding a 2nd relay, which exposed some
shortcomings of the current process state assertions: because they were
running outside the docker containers, they listed all relays as soon
as there were multiple.
2024-04-18 23:48:42 +00:00
Jamil
d79992bd1e build(rust): Use Rust base image and bump to 1.77 (#4401)
These customizations were from before we used `cargo cross` for all
architectures in CI.

1.77.1 has been tested to work with the following clients:

- [x] Apple
- [x] Android
- [x] Windows
2024-03-31 15:01:27 +00:00
Jamil
1cfa80399e fix(connlib): Don't roll log files (#4390)
Fixes #4377 
Closes #3910 

If we decide to implement diagnostic log collection in the future it
will be opt-in and use something like Sentry.
2024-03-29 04:24:24 +00:00
Jamil
16337d57f3 refactor(connlib): Reduce log noisiness for GA (#4381)
Fixes #4380 
Fixes #4379
2024-03-28 20:51:59 +00:00
Jamil
63c546eb45 chore(docker): Fix docker image local builds (#4127)
Fixes an artifact leftover from the refactor.

Fixes #4122
2024-03-14 00:06:10 +00:00
Jamil
6575e0ca26 chore(ci): Refactor CI to use prod images in staging and prevent accidental hotfix breakages (#4049)
- Runs release asset builds simultaneously with `deploy-staging`. Those
don't depend on each other.
- Prevents running some build workflows in CD because they're run
already in the PR and in the merge group, and the risk of semantic
conflict is negligible
- Run `release` assets in staging
- Adds `compatibility_tests`: **To successfully introduce a breaking
change in the control / data plane APIs, you must now "Merge as
Administrator"**
- Since `CI` is no longer run on `main`, caching needed to be refactored
to make sense again
- Since `CI` is no longer run on `main`, the Elixir
`migrations_and_seeds_test` had to be rewritten. This now tests
migrations using `git checkout` instead of importing `main`'s DB dump.
- Move Tauri builds to their own workflow so we can trigger Linux and
Windows builds manually on an ad hoc basis like we do for the Swift and
Kotlin builds
- Add a new `hotfix` workflow that will run `compatibility_tests` with
the latest published images
- Add `workflow_dispatch` to trigger `CD` manually for testing purposes
(cc @ReactorScram)


Refs #3995
2024-03-11 20:01:34 +00:00
Reactor Scram
e4d828b9b1 ci(docker): bump Rust from 1.74 to 1.76 (#4033)
This was accidentally missed in #3632. Jamil said it may fix
`cargo-chef`'s caching once merged.

Part of #3995
2024-03-07 19:08:23 +00:00
Reactor Scram
4ecdde3653 ci: change cargo chef call so it will ignore the GUI client (#3740)
I don't know much about `cargo chef`, so I gave this its own PR in case
I'm doing something that'll subtly break it.
I've run into this problem on some branches and not others, where it's
trying to build all the Tauri / glib stuff even though the Docker image
won't need it:
https://github.com/firezone/firezone/actions/runs/8012206575/job/21887478015#step:7:1175
2024-02-23 16:08:21 +00:00
Gabi
781810f918 feat(dev): add dev yml for rust development (#3670)
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-02-23 00:25:25 +00:00
Reactor Scram
830302af43 test(linux): Low-risk changes to prepare for Linux DNS support (#3625)
This splits off the easy parts from #3605.

- Add quotes around `PHOENIX_SECURE_COOKIES` because my local
`docker-compose` considers unquoted 'false' to be a schema error: env
vars are strings or numbers, not bools, it says
- Create `test.httpbin.docker.local` container in a new subnet so it can
be used as a DNS resource without the existing CIDR resource picking it
up
- Add resources and policies to `seeds.exs` per #3342
- Fix warning about `CONNLIB_LOG_UPLOAD_INTERVAL_SECS` not being set
- Add `resolv-conf` dep and unit tests to `firezone-tunnel` and
`firezone-linux-client`
- Impl `on_disconnect` in the Linux client with `tracing::error!`
- Add comments

```[tasklist]
- [x] (failed) Confirm that the client container actually does stop faster this way
- [x] Wait for tests to pass
- [x] Mark as ready for review
```
2024-02-12 19:04:51 +00:00
Thomas Eizinger
66c85e28b0 feat(connection): use STUN to generate server-reflexive candidate (#3268)
Currently, `firezone-connection` can only handle connections on a LAN.
By using a STUN server, we can discover our public IP and attempt
a direct, hole-punched connection across multiple subnets.
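
To illustrate how a STUN server reveals the public address, here is a minimal Python sketch of the RFC 5389 wire format: it encodes a Binding request and decodes the XOR-MAPPED-ADDRESS attribute from a success response. It is not the `firezone-connection` implementation; the function names are hypothetical, only IPv4 is handled, and no network I/O is performed.

```python
import ipaddress
import struct

# Constants defined by RFC 5389.
MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020
FAMILY_IPV4 = 0x01


def build_binding_request(transaction_id: bytes) -> bytes:
    """Encode a STUN Binding request with no attributes (20-byte header)."""
    if len(transaction_id) != 12:
        raise ValueError("transaction id must be 12 bytes")
    # Header: message type, message length (0, no attributes), magic cookie.
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + transaction_id


def parse_xor_mapped_address(response: bytes, transaction_id: bytes):
    """Return (ip, port) from a Binding success response, or None."""
    if len(response) < 20:
        return None
    msg_type, msg_len, cookie = struct.unpack("!HHI", response[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    if response[8:20] != transaction_id:
        return None  # response belongs to a different request
    offset, end = 20, min(len(response), 20 + msg_len)
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", response[offset:offset + 4])
        value = response[offset + 4:offset + 4 + attr_len]
        if (attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8
                and value[1] == FAMILY_IPV4):
            xport, xaddr = struct.unpack("!HI", value[2:8])
            # Port is XORed with the top 16 bits of the cookie,
            # the address with the whole cookie.
            port = xport ^ (MAGIC_COOKIE >> 16)
            addr = xaddr ^ MAGIC_COOKIE
            return str(ipaddress.IPv4Address(addr)), port
        # Attribute values are padded to a multiple of 4 bytes.
        offset += 4 + attr_len + (-attr_len % 4)
    return None
```

The address returned by the server is the one a peer on another network would see, which is what makes it usable as a server-reflexive candidate. A real client would send the request over UDP with a random transaction id (for example `os.urandom(12)`) and handle retransmissions.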
2024-01-19 04:11:07 +00:00
Jamil
df3953983c fix(ci): Fix publish step to publish multi-arch images for public use (#3287)
* Remove `--pull-tags`
* Correctly build and push multi-arch images for public use
* re-revert Fix POSIX shell issue
* re-revert Fix Gateways masquerading for wireless interfaces
2024-01-17 18:03:27 -08:00
Thomas Eizinger
2e4dd9943b feat: dynamically configure network & redis for LAN integration test (#3286)
This also uses the docker healthcheck again for the redis container.
2024-01-17 22:11:29 +00:00
Jamil Bou Kheir
1d80af79bc Revert docker-init.sh 2024-01-17 03:45:39 -08:00
Jamil
3cb54e54d2 revert(ci): Revert Dockerfile to use alpine&musl (#3279) 2024-01-17 03:11:30 -08:00
Jamil
b5e591dfd3 fix(ci): Revert runtime to musl (#3278)
Turns out #3276 was only part of the problem. After that was fixed, the
issue did turn out to be the statically-linked libc runtime. Staging was
using dynamic linking and so didn't hit the issue.

This reverts back to musl which has been tested as @AndrewDryga noted.
2024-01-17 02:58:26 -08:00
Jamil
6c72447b4f fix(rust): Use -n for POSIX shells to handle building for different TARGETs (#3270) 2024-01-16 17:52:30 -08:00
Jamil
b8e2a59570 fix(connlib): Use debian:12-slim for Rust base image (#3243)
Fixes #3215
2024-01-16 01:53:32 +00:00
Andrew Dryga
52b284abd9 Terraform improvements for production (#2873) 2023-12-11 19:41:01 -06:00
Gabi
aec5b97012 Add performance tests for client-gateway communication (#2655) 2023-11-17 00:32:34 -06:00
Andrew Dryga
8b8881f415 Make CodeQL a part of CI workflow (#2492) 2023-10-23 16:16:09 -06:00
Gabi
cc65a63c63 Update Dockerfile (#2490)
When moving from Debian to Alpine we stopped installing `curl`, but it's
needed to get the public IPv4 and IPv6 addresses of the relay in
`docker-init.sh`.

Signed-off-by: Gabi <gabrielalejandro7@gmail.com>
2023-10-23 18:44:39 +00:00
Jamil
fa57d66965 Publish Releases (#2344)
- rebuild and publish gateway and relay binaries to currently drafted
release
- re-tag current relay/gateway images and push to ghcr.io

Stacked on #2341 to prevent conflicts

Fixes #2223 
Fixes #2205 
Fixes #2202
Fixes #2239 

~~Still TODO: `arm64` images and binaries...~~ Edit: added via
`cross-rs`
2023-10-20 14:20:43 -07:00
Thomas Eizinger
2cfe7befef refactor(connlib): remove ControlSignal (#2321) 2023-10-18 17:28:04 +11:00
Thomas Eizinger
ecae222674 fix(rust): install toolchain in base layer (#2258)
Copying the `rust-toolchain.toml` file in is one thing but if we want to
avoid repeatedly installing it, we should do that in the same layer too.

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
2023-10-06 14:12:58 -06:00
Thomas Eizinger
9a41983447 ci: optimize caching further (#2246)
This patch-set aims to make several improvements to our CI caching:

1. Use of registry as build cache: Pushes a separate image to our docker
registry at GCP that contains the cache layers. This happens for every
PR & main. As a result, we can restore from **both** which should make
repeated runs of CI on an individual PR faster and give us a good
baseline cache for new PRs from `main`. See
https://docs.docker.com/build/ci/github-actions/cache/#registry-cache
for details. As a nice side-effect, this allows us to use the 10 GB we
have on GitHub actions for other jobs.
2. We make better use of `restore-keys` by also attempting to restore
the cache if the fingerprint of our lockfiles doesn't match. This is
useful for CI runs that upgrade dependencies. Those will restore a cache
that is still useful although it doesn't quite match. That is better[^1]
than not hitting the cache at all.
3. There were two tiny bugs in our Swift and Android builds:
   a. We used `rustup show` in the wrong directory and thus did not
   actually install the toolchain properly.
   b. We used `shared-key` instead of `key` for the
   https://github.com/Swatinem/rust-cache action and thus did not
   differentiate between jobs properly.
4. Our Dockerfile for Rust had a bug where it did not copy in the
`rust-toolchain.toml` file in the `chef` layer and thus also did not use
the correct toolchain.
5. We remove the dedicated gradle cache because the build action already
comes with a cache configuration:
https://github.com/firezone/firezone/actions/runs/6416847209/job/17421412150#step:10:25

[^1]: Over time, this may mean that our caches grow a bit. In an ideal
world, we automatically remove files from the caches that haven't been
used in a while. The cache action we use for Rust does that
automatically:
https://github.com/Swatinem/rust-cache?tab=readme-ov-file#cache-details.
As a workaround, we can just purge all caches every now and then.
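
The `restore-keys` fallback described in point 2 can be pictured with a small Python sketch. This is a simplified model of how a cache lookup with fallback prefixes behaves, not the actual GitHub Actions implementation; `CacheEntry`, `restore`, and the key names in the usage note are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class CacheEntry:
    key: str
    created_at: int  # hypothetical creation timestamp; larger means newer


def restore(entries: Sequence[CacheEntry], key: str,
            restore_keys: Sequence[str]) -> Optional[CacheEntry]:
    """Pick the cache to restore: an exact hit first, then prefix fallbacks."""
    # 1. An exact match on the primary key always wins.
    for entry in entries:
        if entry.key == key:
            return entry
    # 2. Otherwise try the primary key and then each restore key, in order,
    #    as a prefix and take the newest cache that matches.
    for prefix in [key, *restore_keys]:
        matches = [e for e in entries if e.key.startswith(prefix)]
        if matches:
            return max(matches, key=lambda e: e.created_at)
    # 3. Nothing matched: the job starts with a cold cache.
    return None
```

With caches `rust-linux-aaa` (older) and `rust-linux-bbb` (newer), a run whose lockfile fingerprint produces the key `rust-linux-zzz` and the restore key `rust-linux-` would restore `rust-linux-bbb`: not an exact match, but still a useful starting point.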

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2023-10-05 06:26:56 -07:00
Jamil
80234f9c71 Github Actions cache on main and scope caches for all languages/runtimes (#2233) 2023-10-04 17:29:04 -07:00
Jamil
c4c6f3e4ca refactor(portal): Don't pin session token to user_agent or remote_ip (#2195)
Removing the check to get Rust PRs to pass.

**Note**: #2182 was dependent on this one, and has since merged into
this one.
2023-09-30 07:40:57 -07:00
Jamil
a98f30a8dd fix(ci): Fix flaky integration tests (#2190) 2023-09-29 01:12:29 -07:00
Thomas Eizinger
6681301166 fix(relay): use system cert store for root certificates (#1999) 2023-09-08 01:32:48 -06:00
Gabi
7d0e0acfe9 fix(connlib): assorted fixes (#1953)
* single stack ipv6/ipv4
* set mtu for linux connlib
* add iperf3 resource on dev docker-compose

---------

Signed-off-by: Gabi <gabrielalejandro7@gmail.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2023-08-28 23:47:00 +00:00
Andrew Dryga
9e17352fd6 Deploy relays (#1706)
Will finish once #1705 is merged and stable.

cc @thomaseizinger
2023-08-08 17:15:33 -05:00
Gabi
720b2f8cd9 Fix/docker compose up (#1705)
This PR fixes `docker compose up`. It doesn't get the test client ->
resource flow working, but it prevents anything from erroring at startup.

This fixes:
* tokens (use the correct token for the client user agent we are using)
* randomize `name_suffix` at start up for connlib (we will eventually
allow options to set it manually)
* remove port ranges for relay (see firezone/product#613)
2023-06-28 18:48:33 +00:00
Gabi
1d50883dbd rust: fix dockerfile for building multiple images in parallel (#1699)
When using `docker compose build`, or any other way of building Docker
images in parallel, the way the cache worked in the Rust Dockerfile
made the caches of different images overlap and corrupt each other. We
add `locked`, which prevents multiple writers to the same cache, to fix
this behaviour.
2023-06-26 13:46:20 -06:00
Gabi
e9be4b9ef5 connlib: moves it to the main firezone library
This brings connlib from its own separate repo into Firezone's monorepo.

On top of bringing in connlib, we also add and unify the Dockerfile for all
Rust binaries and add a docker-compose file that can run a headless client, a
relay and a gateway, which will eventually test the whole flow between a
client and a resource. For this to work we also incorporated some Elixir
scripts to generate portal tokens for those components.
2023-06-23 16:39:58 -06:00