Commit Graph

18 Commits

Author SHA1 Message Date
Jamil
84a981f668 refactor(ci): Remove browser-based integration tests (#6435)
Fixes a new issue with puppeteer, chromium 128, and Alpine 3.20 that's
causing failing browser tests.

See more: https://github.com/puppeteer/puppeteer/issues/12189

Failure:

https://github.com/firezone/firezone/actions/runs/10549430305/job/29224528663?pr=6391

Unfortunately, puppeteer's embedded browser doesn't seem to want to run
in Alpine:


https://github.com/firezone/firezone/actions/runs/10563167497/job/29265175731?pr=6435#step:6:56


Fixing this is proving very difficult since we can't seem to use
puppeteer with the latest Alpine images, so I questioned the need to
have these in at all. These tests were added at a time where the DNS
mappings were brittle, so we wanted to verify that relayed and direct
connections held up as we deployed.

This is no longer the case, and we also now have much more unit test
coverage around these things, so given the pain of maintaining these
(and the lack of a current solution to the above), they are removed.

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
2024-08-26 20:01:00 +00:00
Thomas Eizinger
7159ffb34b ci: timeout curl requests after 30s (#5537)
Currently, we rely on curl's default timeout when connecting to a
resource. This is problematic because the `direct-dns` and `relayed-dns`
integration tests check that a certain resource _isn't_ accessible and
this test currently waits for 5 minutes to assert that.

We can shorten this and thus every CI by passing a `--connect-timeout`
to `curl`.

See
https://github.com/firezone/firezone/actions/runs/9656570163/job/26634409843#step:6:445
for an example CI run on `main`.
2024-06-25 06:07:13 +00:00
Jamil
0b83b12fd2 ci: bootstrap browser test harness if missing (#4767)
Should be a less brittle fix to the problem of testing release images
for `compat-tests` with the browser harness.
2024-04-24 17:02:47 +00:00
Gabi
adc0bb73f7 test(client): add reconnection tests from a client using a headless browser (#4569)
Considered using Elixir and Rust to write the tests.

For Elixir, `wallaby` doesn't seem to have a way to attach to an
existing `chromium` instance, launching it each time, which makes it
hard to coordinate with the relay restart.

For Rust we considered `thirtyfour` which would be very nice since we
could test both firefox and chrome but each time it connects to the
instance it launches a new session making it hard to test the DNS cache
behavior.

We also considered `chrome_headless` for Rust it needs a small patch to
prevent it from closing the browser after `Drop` but it still presents a
problem, since it has no easy way to retrieve if loading a page has
succeeded. There are some workarounds such as retrieving the title that
we could have used but after some testing they are quite finnicky and we
don't want that for CI.

So I ended up settling for TypeScript but I'm open to other options, or
a fix for the previous ones!

There are some modifications still incoming for this PR, around the test
name and that sleep in the middle of the test doesn't look good so I
will probably add some retries, but the gist is here, will keep it in
draft until we expect it to be passing.

So feel free to do some initial reviews.

Note: the number of lines changed is greatly exaggerated by
`package.lock`

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-04-20 06:57:07 +00:00
Thomas Eizinger
51089b89e7 feat(connlib): smoothly migrate relayed connections (#4568)
Whenever we receive a `relays_presence` message from the portal, we
invalidate the candidates of all now disconnected relays and make
allocations on the new ones. This triggers signalling of new candidates
to the remote party and migrates the connection to the newly nominated
socket.

This still relies on #4613 until we have #4634.

Resolves: #4548.

---------

Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2024-04-20 06:16:35 +00:00
Reactor Scram
7081c71c10 chore(linux-client): allow custom token path (#4666)
```[tasklist]
# Before merging
- [x] Remove file extension `.txt`
- [x] Wait for `linux-group` test to go green on `main` (#4692)
- [x] *all* compatibility tests must be green on this branch
```

Closes #4664 
Closes #4665 

~~The compatibility tests are expected to fail until the next release is
cut, for the same reasons as in #4686~~

The compatibility test must be handled somehow, otherwise it'll turn
main red.
`linux-group` was moved out of integration / compatibility testing, but
the DNS tests do need the whole Docker + portal setup, so that one can't
move.

---------

Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-04-19 18:50:24 +00:00
Thomas Eizinger
4972e49b34 ci: run assertions inside docker container (#4680)
As part of #4568, we are adding a 2nd relay which showed some
short-comings of the current process state assertions because they were
running outside the docker containers, thus listing all relays as soon
as there are multiple.
2024-04-18 23:48:42 +00:00
Reactor Scram
e7a4a83e3d chore(linux): only allow IPC connections from members of the firezone group (#4628)
```[tasklist]
### Before merging
- [x] Update KB
```

Maybe not a feature since Linux IPC isn't available to users yet?

I think it's okay if the new `linux-group` test fails in compatibility,
since it wasn't implemented at all back then.

Closes #4659
Closes #4660

---------

Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-04-17 21:42:29 +00:00
Thomas Eizinger
be1a719e2c chore(relay): perform graceful shutdown upon receiving SIGTERM (#4552)
Upon receiving a SIGTERM, we immediately disconnect from the websocket
connection to the portal and set a flag that we are shutting down.

Once we are disconnected from the portal and no longer have an active
allocations, we exit with 0. A repeated SIGTERM signal will interrupt
this process and force the relay to shutdown.

Disconnecting from the portal will (eventually) trigger a message to
clients and gateways that this relay should no longer be used. Thus,
depending on the timeout our supervisor has configured after sending
SIGTERM, the relay will continue all TURN operations until the number of
allocations drops to 0.

Currently, we also allow clients to make new allocations and refreshing
existing allocations. In the future, it may make sense to implement a
dedicated status code and refuse `ALLOCATE` and `REFRESH` messages
whilst we are shutting down.

Related: #4548.

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2024-04-12 08:45:08 +00:00
Thomas Eizinger
26494b0e34 ci: reduce duplication in integration tests (#4583)
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2024-04-11 23:01:12 +00:00
Thomas Eizinger
8d49452668 ci: assert that nothing busy loops after the perf tests (#4546)
The clients, gateway and relay all employ an internal design that is
based on an eventloop. This gives us a lot of control in how various IO
components interact with each other. Great control also comes with a
source of bugs, the latest of which made the relay busy-loop once it
started relaying some traffic.

Eventloops are notoriously hard to unit-test because they compose
various IO bits together. Instead of writing unit tests, we can go and
assert the process state after the performance tests. Those generate a
fair bit of load on all our components but after that, they should
suspend.

The most effective tests survive even large refactorings and for that,
they need to be coded against a stable API / property. Asserting that
the process sleeps when it is idle from an application PoV is such a
property.

Related: #4511.
2024-04-09 07:09:50 +00:00
Jamil
09532ea845 chore(ci): Add portal and relay downtime DNS resource tests (#4517)
Tests that DNS still works in the client with established connections
after the portal and/or relay go down.
2024-04-08 09:43:59 +00:00
Reactor Scram
74a81b2a56 test(gui-client): unit test for Linux IPC (#4277)
(After GA)

This adds a unit test for the Unix domain sockets that I intend to use
for process splitting on Linux.

The length-prefixed encoding and decoding are copied from `subzone`, but
most of that code will not be re-used since it's Windows-specific and
also specific to a Chromium-like process model, which won't work for
Firezone.
2024-04-02 19:34:24 +00:00
Thomas Eizinger
62e082d47a refactor(connlib): make {Client,Gateway}State SANS-IO (#4096)
Resolves: #3929.
2024-03-14 23:44:36 +00:00
Jamil
19a7bac4ae chore(ci): enforce shellscript formatting and style (#3679)
Noticed that we all have different styles of writing scripts :-).

This PR adds linting to our shell scripts to standardize on formatting,
catch common issues and/or possible security bugs.

For editor setup:
- Ensure [`shellcheck`](https://github.com/koalaman/shellcheck) and
[`shfmt`](https://github.com/mvdan/sh) are in your `PATH`
- Configure `shfmt` with indentation of `4`, otherwise it uses tabs by
default.
[Here](https://github.com/jamilbk/nvim/blob/master/init.vim#L159) is how
you can do that with Vim and
[here](https://marketplace.visualstudio.com/items?itemName=mkhl.shfmt)
is how for VScode.

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Brian Manifold <bmanifold@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Andrew Dryga <andrew@dryga.com>
Co-authored-by: Gabi <gabrielalejandro7@gmail.com>
2024-02-21 01:01:32 +00:00
Jamil
20dc0cf1e9 refactor(ci): Use curl for connectivity tests in CI (#3674)
It would be good to run tests with a TCP protocol like `http` to catch
things like MTU and port issues.
2024-02-16 22:48:13 +00:00
Jamil
9054f70995 refactor(ci): simplify dns resources in ci (#3653)
Attempt at cleaning a couple things I missed in code review.

The old httpbin resource wasn't being used anyhow, so I just deduped
them and updated things in a couple other places that had drifted.

Hopefully this fixes the [flaky
CI](https://github.com/firezone/firezone/actions/runs/7918422653/job/21616835910)
2024-02-15 23:50:12 +00:00
Thomas Eizinger
e47c1766bf ci: move tests to bash scripts (#3648)
This improves maintenance because we can now use a regular matrix for
the integration tests and one can locally use tools like shellcheck or a
`bash-lsp` during development.

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2024-02-14 13:55:28 +00:00