Google Cloud Artifact Registry and Cloud Storage are a significant cost.
GitHub, on the other hand, is completely free because we are a public
repository. Hence, it makes sense to ditch GCP for GHCR.
To do this, we move all "staging" artifacts to GHCR. These will then be
used in the infra repo to push to GCP for deploys; we probably still
want pulls from our infra to hit GCP and not GitHub.
One big caveat of this is that we potentially lose sccache, so I'll be
checking the compile time of this PR and looking for alternatives that
don't involve such a massive cloud bill.
When receiving an `init` message from the portal, we will now revoke all
authorizations not listed in the `authorizations` list of the `init`
message.
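Conceptually, the new behaviour amounts to a set-intersection on the Gateway's state. A minimal sketch, assuming hypothetical types (the real firezone state machine differs):

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical stand-ins for the Gateway's real types; illustrative only.
type ResourceId = u64;
struct Authorization;

struct Gateway {
    authorizations: BTreeMap<ResourceId, Authorization>,
}

impl Gateway {
    /// On `init`, drop every authorization the portal no longer lists.
    fn handle_init(&mut self, authorized: &BTreeSet<ResourceId>) {
        self.authorizations.retain(|id, _| authorized.contains(id));
    }
}
```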
We (partly) test this by introducing a new transition in our proptests
that de-authorizes a certain resource whilst the Gateway is simulated to
be partitioned. It is difficult to test that we cannot make a connection
once that has happened, because we would have to simulate a malicious
client that knows about resources / connections or ignores the "remove
resource" message.
Testing this is deferred to a dedicated task. We do test that we hit the
code path of revoking the resource authorization, and because the other
resources keep working, we also verify that we are at least not revoking
the wrong ones.
Resolves: #9892
In Docker environments, applying iptables rules to filter
container-to-container traffic on the Docker bridge network is not
reliable, leading to direct connections being established in our relayed
tests. To fix this, we insert the rules directly from the client
container itself.
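Conceptually, the fix looks something like the following sketch; the chain, the target, and the idea of shelling out from a Rust harness are assumptions for illustration, not the actual test code:

```rust
use std::process::Command;

/// Drop all direct traffic to the given peer from *inside* the client
/// container, forcing the connection over the relay.
fn block_direct_traffic(peer_ip: &str) -> std::io::Result<()> {
    let status = Command::new("iptables")
        .args(["-I", "OUTPUT", "-d", peer_ip, "-j", "DROP"])
        .status()?;
    assert!(status.success(), "failed to insert iptables rule");
    Ok(())
}
```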
---------
Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
The current Git tag for releases of the Apple client is out of line with
the naming of the rest of the repository. Ideally, the tag would be
renamed to `apple-client-X.Y.Z`, as it represents the version for both
the macOS and iOS clients.
I am not familiar enough with the redirect system on our website to
confidently do this without breaking anything, so the easiest fix
here is to employ the same hack we already use for Sentry, where we
special-case the `macos-client` tag.
Resolves: #9871
In the DNS resource NAT table, we track parts of the layer-4 protocol of
the connection in order to map packets back to the correct proxy IP in
case multiple DNS names resolve to the same real IP. The involvement of
layer 4 means we need to perform some packet inspection when we
receive ICMP errors from an upstream router.
Presently, the only ICMP error we handle here is "destination
unreachable". Those are generated e.g. when we are trying to contact an
IPv6 address but don't have an IPv6 egress interface. An additional
error that we want to handle here is "time exceeded":
"Time exceeded" is sent when the TTL of a packet reaches 0. Typically,
TTLs are set high enough that the packet makes it to its destination.
When using tools such as `tracepath`, however, the TTL is deliberately
incremented one by one in order to resolve the exact hops a packet
takes to its destination. Without handling the "time exceeded" ICMP
error, using `tracepath` through Firezone is broken because the packets
get dropped at the DNS resource NAT.
With this PR, we generalise the detection of "destination unreachable"
ICMP errors to also handle "time exceeded" errors, allowing tools such
as `tracepath` to somewhat work:
```
❯ sudo docker compose exec --env RUST_LOG=info -it client /bin/sh -c 'tracepath -b example.com'
1?: [LOCALHOST] pmtu 1280
1: 100.82.110.64 (100.82.110.64) 0.795ms
1: 100.82.110.64 (100.82.110.64) 0.593ms
2: example.com (100.96.0.1) 0.696ms asymm 45
3: example.com (100.96.0.1) 5.788ms asymm 45
4: example.com (100.96.0.1) 7.787ms asymm 45
5: example.com (100.96.0.1) 8.412ms asymm 45
6: example.com (100.96.0.1) 9.545ms asymm 45
7: example.com (100.96.0.1) 7.312ms asymm 45
8: example.com (100.96.0.1) 8.779ms asymm 45
9: example.com (100.96.0.1) 9.455ms asymm 45
10: example.com (100.96.0.1) 14.410ms asymm 45
11: example.com (100.96.0.1) 24.244ms asymm 45
12: example.com (100.96.0.1) 31.286ms asymm 45
13: no reply
14: example.com (100.96.0.1) 303.860ms asymm 45
15: no reply
16: example.com (100.96.0.1) 135.616ms (This broken router returned corrupted payload) asymm 45
17: no reply
18: example.com (100.96.0.1) 161.647ms asymm 45
19: no reply
20: no reply
21: no reply
22: example.com (100.96.0.1) 238.066ms reached
Resume: pmtu 1280 hops 22 back 45
```
We say "somewhat work" because due to the NAT that is in place for DNS
resources, the output does not disclose the intermediary hops beyond the
Gateway.
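Conceptually, the generalisation treats both ICMP error types as one category of messages that carry an embedded copy of the offending packet. A minimal sketch, assuming a hypothetical `Icmpv4Type` enum rather than firezone's actual parsing types:

```rust
/// Hypothetical ICMPv4 message types; real parsers use their own
/// representations.
enum Icmpv4Type {
    EchoRequest,
    EchoReply,
    DestinationUnreachable,
    TimeExceeded,
}

/// Both error kinds embed the IP header of the offending packet plus at
/// least its first 8 payload bytes, so a single code path can inspect
/// the embedded layer-4 ports and reverse the DNS resource NAT mapping.
fn carries_embedded_packet(ty: &Icmpv4Type) -> bool {
    matches!(
        ty,
        Icmpv4Type::DestinationUnreachable | Icmpv4Type::TimeExceeded
    )
}
```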
Co-authored-by: Antoine Labarussias <antoinelabarussias@gmail.com>
---------
Co-authored-by: Antoine Labarussias <antoinelabarussias@gmail.com>
With this patch, we sample a list of DNS resources on each test run and
create a "TCP service" for each of their addresses. Using this list of
resources, we then change the `SendTcpPayload` transition to
`ConnectTcp` and establish TCP connections to these services using
`smoltcp`.
For now, we don't send any data on these connections, but we do set the
keep-alive interval to 5s, meaning `smoltcp` itself will keep these
connections alive. We also set the timeout to 30s, and after each
transition in a test run, we assert that all TCP sockets are still in
their expected state (see the sketch after this list):
- `ESTABLISHED` for most of them.
- `CLOSED` for all sockets where we ended up sampling an IPv4 address
but the DNS resource only supports IPv6 addresses (or vice versa). In
these cases, we use the ICMP error sent by the Gateway to assert that
the socket is `CLOSED`. Unfortunately, `smoltcp` currently does not
handle ICMP messages for its sockets, so we have to call `abort`
ourselves.
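Here is a minimal sketch of that socket configuration using `smoltcp`'s TCP API; buffer sizes and the surrounding harness are illustrative, not the actual proptest code:

```rust
use smoltcp::socket::tcp;
use smoltcp::time::Duration;

/// Create a TCP socket configured like the ones in the proptests.
fn new_test_socket<'a>() -> tcp::Socket<'a> {
    let rx = tcp::SocketBuffer::new(vec![0; 4096]);
    let tx = tcp::SocketBuffer::new(vec![0; 4096]);
    let mut socket = tcp::Socket::new(rx, tx);

    // smoltcp emits keep-alives itself once the connection is established.
    socket.set_keep_alive(Some(Duration::from_secs(5)));
    // Without any (keep-alive) traffic for 30s, the socket transitions
    // to CLOSED.
    socket.set_timeout(Some(Duration::from_secs(30)));

    socket
}

// After each transition, the assertions are along these lines:
//   assert_eq!(socket.state(), tcp::State::Established);
// or, where the sampled IP version isn't supported by the resource,
// we abort manually because smoltcp ignores ICMP errors for its sockets:
//   socket.abort();
```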
Overall, this should assert that regardless of whether we roam networks,
switch relays, or otherwise disturb the underlying connection, the
tunneled TCP connection stays alive.
In order to make this work, I had to tweak the timeouts for on-demand
refreshes of allocations. These only happen in one particular case: when
we are given new relays by the portal, we refresh all _other_ relays to
make sure they are still present. In other words, all relays that we
didn't remove and didn't just add but still had in memory are refreshed.
This is important for cases where we are network-partitioned from the
portal whilst relays are deployed or otherwise reset their state.
Instead of the exponential backoff with an 8s max elapsed time that we
use for other requests, we now send only a single message with a 1s
timeout. With the increased ICE timeout of 15s, a TCP connection with a
30s timeout would otherwise not survive such an event: it takes the
above-mentioned 8s for us to remove a non-functioning relay, all whilst
trying to establish a new connection (which then incurs its own ICE
timeout).
With the reduced 1s timeout on the on-demand refresh, we detect the
disappeared relay much quicker and can immediately establish a new
connection via one of the new relays. As always with reduced timeouts,
this can create false positives if the relay doesn't reply within 1s for
some reason.
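The timing budget works out roughly as follows (constants are the figures quoted above, not the actual code):

```rust
use std::time::Duration;

const ON_DEMAND_REFRESH_TIMEOUT: Duration = Duration::from_secs(1);
const ICE_TIMEOUT: Duration = Duration::from_secs(15);
const TCP_SOCKET_TIMEOUT: Duration = Duration::from_secs(30);

fn main() {
    // After a relay reset: detect the dead relay within the refresh
    // timeout, then run one fresh ICE negotiation over a new relay.
    let reconnect = ON_DEMAND_REFRESH_TIMEOUT + ICE_TIMEOUT;

    // 1s + 15s = 16s, comfortably below the 30s TCP socket timeout.
    // With the previous 8s backoff, relay removal plus one or more ICE
    // attempts could eat the entire 30s budget.
    assert!(reconnect < TCP_SOCKET_TIMEOUT);
}
```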
Resolves: #9531
The tunnel service creates the Firezone ID upon start-up. With recent
changes to the GUI client, the ID file now needs to be read when the
GUI client starts.
This exposes a race condition in our smoke tests, where we start them
both at roughly the same time.
To fix this, we sleep for 500ms after starting the tunnel process.
Bumps
[taiki-e/install-action](https://github.com/taiki-e/install-action) from
2.52.6 to 2.55.3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/taiki-e/install-action/releases">taiki-e/install-action's
releases</a>.</em></p>
<blockquote>
<h2>2.55.3</h2>
<ul>
<li>Update <code>dprint@latest</code> to 0.50.1.</li>
</ul>
<h2>2.55.2</h2>
<ul>
<li>
<p>Update <code>zizmor@latest</code> to 1.11.0.</p>
</li>
<li>
<p>Update <code>cargo-dinghy@latest</code> to 0.8.1.</p>
</li>
</ul>
<h2>2.55.1</h2>
<ul>
<li>
<p>Update <code>vacuum@latest</code> to 0.17.1.</p>
</li>
<li>
<p>Update <code>typos@latest</code> to 1.34.0.</p>
</li>
</ul>
<h2>2.55.0</h2>
<ul>
<li>
<p>Support <code>vacuum</code>. (<a
href="https://redirect.github.com/taiki-e/install-action/pull/1016">#1016</a>,
thanks <a
href="https://github.com/jayvdb"><code>@jayvdb</code></a>)</p>
</li>
<li>
<p>Update <code>cargo-shear@latest</code> to 1.3.2.</p>
</li>
</ul>
<h2>2.54.3</h2>
<ul>
<li>Update <code>cargo-careful@latest</code> to 0.4.8.</li>
</ul>
<h2>2.54.2</h2>
<ul>
<li>
<p>Update <code>rclone@latest</code> to 1.70.2.</p>
</li>
<li>
<p>Update <code>zizmor@latest</code> to 1.10.0.</p>
</li>
</ul>
<h2>2.54.1</h2>
<ul>
<li>
<p>Update <code>wasmtime@latest</code> to 34.0.1.</p>
</li>
<li>
<p>Update <code>cargo-tarpaulin@latest</code> to 0.32.8.</p>
</li>
<li>
<p>Update <code>knope@latest</code> to 0.21.0.</p>
</li>
</ul>
<h2>2.54.0</h2>
<ul>
<li>
<p>Add <code>cyclonedx</code> (<a
href="https://redirect.github.com/taiki-e/install-action/pull/1000">#1000</a>,
thanks <a
href="https://github.com/jayvdb"><code>@jayvdb</code></a>)</p>
</li>
<li>
<p>Update <code>wasmtime@latest</code> to 34.0.0.</p>
</li>
<li>
<p>Update <code>rclone@latest</code> to 1.70.1.</p>
</li>
<li>
<p>Update <code>cargo-binstall@latest</code> to 1.14.1.</p>
</li>
<li>
<p>Update <code>release-plz@latest</code> to 0.3.136.</p>
</li>
</ul>
<h2>2.53.2</h2>
<ul>
<li>
<p>Fix <code>cargo-nextest</code> installation failure on Ubuntu 24.04
due to HTTP 403 error on requests to crates.io. (<a
href="https://redirect.github.com/taiki-e/install-action/pull/1007">#1007</a>)</p>
</li>
<li>
<p>Update <code>rclone@latest</code> to 1.70.0.</p>
</li>
</ul>
<h2>2.53.1</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="9ca1734d89"><code>9ca1734</code></a>
Release 2.55.3</li>
<li><a
href="03194083f7"><code>0319408</code></a>
Update <code>dprint@latest</code> to 0.50.1</li>
<li><a
href="078fd1effe"><code>078fd1e</code></a>
Release 2.55.2</li>
<li><a
href="70afd9d53f"><code>70afd9d</code></a>
Update <code>zizmor@latest</code> to 1.11.0</li>
<li><a
href="1e57335387"><code>1e57335</code></a>
Update <code>cargo-dinghy@latest</code> to 0.8.1</li>
<li><a
href="491d37bbaa"><code>491d37b</code></a>
Release 2.55.1</li>
<li><a
href="8d74873246"><code>8d74873</code></a>
Update <code>vacuum@latest</code> to 0.17.1</li>
<li><a
href="d85c2f7865"><code>d85c2f7</code></a>
Update <code>typos@latest</code> to 1.34.0</li>
<li><a
href="e70e8600a5"><code>e70e860</code></a>
Release 2.55.0</li>
<li><a
href="407c37f889"><code>407c37f</code></a>
Update changelog</li>
<li>Additional commits viewable in <a
href="1cefd1553b...9ca1734d89">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
The `expires_at` column on the `flows` table was never used outside of
the context in which the flow was created in the Client Channel. This
ephemeral state, which is created in the `Domain.Flows.authorize_flow/4`
function, is never read from the DB in any meaningful capacity, so it
can be safely removed.
The `expire_flows_for` family of functions now simply reads the needed
fields from the flows table in order to broadcast `{:expire_flow,
flow_id, client_id, resource_id}` directly to the subscribed entities.
This PR is step 1 in removing the reliance on `Flows` to manage
ephemeral access state. In a subsequent PR, we will change the structure
of the state kept in the channel PIDs such that reliance on the Flows
table is no longer necessary.
Additionally, in a few places we were referencing a `Flows.Show` view
that was never available in production, so this dead code has been
removed.
Lastly, the `flows` table subscription and associated hook processing
have been completely removed, as they are no longer needed. In #9667 we
implemented logic to remove publications from removed table
subscriptions, so we can expect a couple of ingest warnings when we
deploy this: the `Hooks.Flows` processor no longer exists, and the WAL
data may have lingering flows records in the queue. These can be safely
ignored.
[`actionlint`](https://github.com/rhysd/actionlint) is a static analysis
tool for GitHub workflows and actions. It detects various issues ahead
of time and runs `shellcheck` on all `run` blocks. It is worth noting
that this does **not** lint the contents of composite actions, so we
still need to be vigilant when working with those.
This has been disabled for several releases now and is not causing any
problems in production. We can therefore safely remove it.
It is about time we do this because our tests are actually still testing
the variant without the feature flag and therefore deviate from what we
do in production, so we have to convert the tests as well. Doing so
uncovered a minor problem in our ICMP error parsing code: we attempted
to parse the payload of an ICMP error as a fully valid layer-4 header
(e.g. a TCP or UDP header). However, per the RFC, a node only needs to
embed the first 8 bytes of the original packet's payload in an ICMPv4
error. That is not enough to parse a valid TCP header, as those are at
least 20 bytes.
I don't expect this to be a huge problem in production right now,
though. We only use this code to parse ICMP errors arriving on the
Gateway, and I _think_ most devices actually include more than 8 bytes.
This only surfaced because we are very strict about embedding exactly
8 bytes when we generate an ICMP error.
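The fix boils down to only reading the fields that are guaranteed to be present. A minimal sketch, not firezone's actual parser:

```rust
/// Extract the layer-4 source and destination ports from the payload
/// embedded in an ICMPv4 error. RFC 792 only guarantees the original IP
/// header plus the first 8 payload bytes, which covers the ports for
/// both TCP and UDP but not a full 20-byte TCP header.
fn embedded_l4_ports(embedded_payload: &[u8]) -> Option<(u16, u16)> {
    let src = u16::from_be_bytes([*embedded_payload.first()?, *embedded_payload.get(1)?]);
    let dst = u16::from_be_bytes([*embedded_payload.get(2)?, *embedded_payload.get(3)?]);

    Some((src, dst))
}
```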
Additionally, we change our ICMP errors to be sent from the resource IP
rather than the Gateway's TUN device. Given that we perform NAT on these
IPs anyway, I think this can still be argued to be RFC-conformant. The
_proxy_ IP which we are trying to contact can be reached, but it cannot
be routed further: the destination is unreachable, yet the source of
this error is the proxy IP itself. I think this is actually more correct
than sending the packets from the Gateway's TUN device because the TUN
device itself is not a routing hop per se; its IP won't ever show up in
the routing path.
This wasn't the issue - the issue was that @firezone-bot needed access
to the firezone/winget-pkgs repo.
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
To make releases even smoother, this PR creates a bit of automation
that bumps the versions via the `scripts/bump-versions.sh` script and
opens a PR for it.
When working on the Swift codebase, I noticed that running the formatter
produced a massive diff. This PR re-formats the Swift code with `swift
format . --recursive --in-place` and adds a CI check to enforce it going
forward.
Resolves: #9534
---------
Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>