Bumps the okhttp group in /kotlin/android with 2 updates:
[com.squareup.okhttp3:okhttp](https://github.com/square/okhttp) and
[com.squareup.okhttp3:logging-interceptor](https://github.com/square/okhttp).
Updates `com.squareup.okhttp3:okhttp` from 4.12.0 to 5.1.0
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/square/okhttp/blob/master/CHANGELOG.md">com.squareup.okhttp3:okhttp's
changelog</a>.</em></p>
<blockquote>
<h2>Version 5.1.0</h2>
<p><em>2025-07-07</em></p>
<ul>
<li>
<p>New: <code>Response.peekTrailers()</code>. When we changed
<code>Response.trailers()</code> to block instead of
throwing in 5.0.0, we inadvertently removed the ability for callers to
peek the trailers
(by catching the <code>IllegalStateException</code> if they weren't
available). This new API restores that
capability.</p>
</li>
<li>
<p>Fix: Don't crash on <code>trailers()</code> if the response doesn't
have a body. We broke [Retrofit] users
who read the trailers on the <code>raw()</code> OkHttp response, after
its body was decoded.</p>
</li>
</ul>
<h2>Version 5.0.0</h2>
<p><em>2025-07-02</em></p>
<p>This is our first stable release of OkHttp since 2023. Here's the
highlights if you're upgrading
from OkHttp 4.x:</p>
<p><strong>OkHttp is now packaged as separate JVM and Android
artifacts.</strong> This allows us to offer
platform-specific features and optimizations. If your build system
handles [Gradle module metadata],
this change should be automatic.</p>
<p><strong>MockWebServer has a new coordinate and package name.</strong>
We didn’t like that our old artifact
depends on JUnit 4 so the new one doesn’t. It also has a better API
built on immutable values. (We
intend to continue publishing the old <code>okhttp3.mockwebserver</code>
artifact so there’s no urgency to
migrate.)</p>
<table>
<thead>
<tr>
<th align="left">Coordinate</th>
<th align="left">Package Name</th>
<th align="left">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">com.squareup.okhttp3:mockwebserver3:5.0.0</td>
<td align="left">mockwebserver3</td>
<td align="left">Core module. No JUnit dependency!</td>
</tr>
<tr>
<td align="left">com.squareup.okhttp3:mockwebserver3-junit4:5.0.0</td>
<td align="left">mockwebserver3.junit4</td>
<td align="left">Optional JUnit 4 integration.</td>
</tr>
<tr>
<td align="left">com.squareup.okhttp3:mockwebserver3-junit5:5.0.0</td>
<td align="left">mockwebserver3.junit5</td>
<td align="left">Optional JUnit 5 integration.</td>
</tr>
<tr>
<td align="left">com.squareup.okhttp3:mockwebserver:5.0.0</td>
<td align="left">okhttp3.mockwebserver</td>
<td align="left">Obsolete. Depends on JUnit 4.</td>
</tr>
</tbody>
</table>
<p><strong>OkHttp now supports Happy Eyeballs ([RFC 8305][rfc_8305]) for
IPv4+IPv6 networks.</strong> It attempts
both IPv6 and IPv4 connections concurrently, keeping whichever connects
first.</p>
<p><strong>We’ve improved our Kotlin APIs.</strong> You can skip the
builder:</p>
<pre lang="kotlin"><code>val request = Request(
url = "https://cash.app/".toHttpUrl(),
)
</code></pre>
<p><strong>OkHttp now supports [GraalVM].</strong></p>
<p>Here’s what has changed since 5.0.0-alpha.17:</p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="d2dd180697"><code>d2dd180</code></a>
Prepare for release 5.1.0.</li>
<li><a
href="61a87359f6"><code>61a8735</code></a>
New Response.peekTrailers() API (<a
href="https://redirect.github.com/square/okhttp/issues/8921">#8921</a>)</li>
<li><a
href="66844010f7"><code>6684401</code></a>
Update dependency gradle to v8.14.3 (<a
href="https://redirect.github.com/square/okhttp/issues/8915">#8915</a>)</li>
<li><a
href="7adb2b637c"><code>7adb2b6</code></a>
Update junit-framework monorepo (<a
href="https://redirect.github.com/square/okhttp/issues/8914">#8914</a>)</li>
<li><a
href="e41ff18df8"><code>e41ff18</code></a>
Link to new mockwebserver artifacts (<a
href="https://redirect.github.com/square/okhttp/issues/8911">#8911</a>)</li>
<li><a
href="0ff87513e2"><code>0ff8751</code></a>
Remove Graal init tracing (<a
href="https://redirect.github.com/square/okhttp/issues/8909">#8909</a>)</li>
<li><a
href="b9a2560e56"><code>b9a2560</code></a>
Run graal on master (<a
href="https://redirect.github.com/square/okhttp/issues/8907">#8907</a>)</li>
<li><a
href="8339524463"><code>8339524</code></a>
Remove ExperimentalOkHttpApi references (<a
href="https://redirect.github.com/square/okhttp/issues/8908">#8908</a>)</li>
<li><a
href="ce29ef6182"><code>ce29ef6</code></a>
Fix graal tests (<a
href="https://redirect.github.com/square/okhttp/issues/8906">#8906</a>)</li>
<li><a
href="85796896c3"><code>8579689</code></a>
Don't force a response body read on all trailers (<a
href="https://redirect.github.com/square/okhttp/issues/8904">#8904</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/square/okhttp/compare/parent-4.12.0...parent-5.1.0">compare
view</a></li>
</ul>
</details>
<br />
Updates `com.squareup.okhttp3:logging-interceptor` from 4.12.0 to 5.1.0
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/square/okhttp/blob/master/CHANGELOG.md">com.squareup.okhttp3:logging-interceptor's
changelog</a>.</em></p>
<blockquote>
<h2>Version 5.1.0</h2>
<p><em>2025-07-07</em></p>
<ul>
<li>
<p>New: <code>Response.peekTrailers()</code>. When we changed
<code>Response.trailers()</code> to block instead of
throwing in 5.0.0, we inadvertently removed the ability for callers to
peek the trailers
(by catching the <code>IllegalStateException</code> if they weren't
available). This new API restores that
capability.</p>
</li>
<li>
<p>Fix: Don't crash on <code>trailers()</code> if the response doesn't
have a body. We broke [Retrofit] users
who read the trailers on the <code>raw()</code> OkHttp response, after
its body was decoded.</p>
</li>
</ul>
<h2>Version 5.0.0</h2>
<p><em>2025-07-02</em></p>
<p>This is our first stable release of OkHttp since 2023. Here's the
highlights if you're upgrading
from OkHttp 4.x:</p>
<p><strong>OkHttp is now packaged as separate JVM and Android
artifacts.</strong> This allows us to offer
platform-specific features and optimizations. If your build system
handles [Gradle module metadata],
this change should be automatic.</p>
<p><strong>MockWebServer has a new coordinate and package name.</strong>
We didn’t like that our old artifact
depends on JUnit 4 so the new one doesn’t. It also has a better API
built on immutable values. (We
intend to continue publishing the old <code>okhttp3.mockwebserver</code>
artifact so there’s no urgency to
migrate.)</p>
<table>
<thead>
<tr>
<th align="left">Coordinate</th>
<th align="left">Package Name</th>
<th align="left">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">com.squareup.okhttp3:mockwebserver3:5.0.0</td>
<td align="left">mockwebserver3</td>
<td align="left">Core module. No JUnit dependency!</td>
</tr>
<tr>
<td align="left">com.squareup.okhttp3:mockwebserver3-junit4:5.0.0</td>
<td align="left">mockwebserver3.junit4</td>
<td align="left">Optional JUnit 4 integration.</td>
</tr>
<tr>
<td align="left">com.squareup.okhttp3:mockwebserver3-junit5:5.0.0</td>
<td align="left">mockwebserver3.junit5</td>
<td align="left">Optional JUnit 5 integration.</td>
</tr>
<tr>
<td align="left">com.squareup.okhttp3:mockwebserver:5.0.0</td>
<td align="left">okhttp3.mockwebserver</td>
<td align="left">Obsolete. Depends on JUnit 4.</td>
</tr>
</tbody>
</table>
<p><strong>OkHttp now supports Happy Eyeballs ([RFC 8305][rfc_8305]) for
IPv4+IPv6 networks.</strong> It attempts
both IPv6 and IPv4 connections concurrently, keeping whichever connects
first.</p>
<p><strong>We’ve improved our Kotlin APIs.</strong> You can skip the
builder:</p>
<pre lang="kotlin"><code>val request = Request(
url = "https://cash.app/".toHttpUrl(),
)
</code></pre>
<p><strong>OkHttp now supports [GraalVM].</strong></p>
<p>Here’s what has changed since 5.0.0-alpha.17:</p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="d2dd180697"><code>d2dd180</code></a>
Prepare for release 5.1.0.</li>
<li><a
href="61a87359f6"><code>61a8735</code></a>
New Response.peekTrailers() API (<a
href="https://redirect.github.com/square/okhttp/issues/8921">#8921</a>)</li>
<li><a
href="66844010f7"><code>6684401</code></a>
Update dependency gradle to v8.14.3 (<a
href="https://redirect.github.com/square/okhttp/issues/8915">#8915</a>)</li>
<li><a
href="7adb2b637c"><code>7adb2b6</code></a>
Update junit-framework monorepo (<a
href="https://redirect.github.com/square/okhttp/issues/8914">#8914</a>)</li>
<li><a
href="e41ff18df8"><code>e41ff18</code></a>
Link to new mockwebserver artifacts (<a
href="https://redirect.github.com/square/okhttp/issues/8911">#8911</a>)</li>
<li><a
href="0ff87513e2"><code>0ff8751</code></a>
Remove Graal init tracing (<a
href="https://redirect.github.com/square/okhttp/issues/8909">#8909</a>)</li>
<li><a
href="b9a2560e56"><code>b9a2560</code></a>
Run graal on master (<a
href="https://redirect.github.com/square/okhttp/issues/8907">#8907</a>)</li>
<li><a
href="8339524463"><code>8339524</code></a>
Remove ExperimentalOkHttpApi references (<a
href="https://redirect.github.com/square/okhttp/issues/8908">#8908</a>)</li>
<li><a
href="ce29ef6182"><code>ce29ef6</code></a>
Fix graal tests (<a
href="https://redirect.github.com/square/okhttp/issues/8906">#8906</a>)</li>
<li><a
href="85796896c3"><code>8579689</code></a>
Don't force a response body read on all trailers (<a
href="https://redirect.github.com/square/okhttp/issues/8904">#8904</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/square/okhttp/compare/parent-4.12.0...parent-5.1.0">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
When presented with an invalid peer certificate, there is no reason to
retry the connection: the certificate is unlikely to fix itself. Plus, the
certificate may get (or already be) cached, in which case a restart of
the application is necessary.
Resolves: #9944
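A minimal sketch of that decision logic, with hypothetical error and function names (the real connlib error types differ):

```rust
/// Hypothetical error type for illustration; the real connlib error enum differs.
#[derive(Debug)]
enum ConnectError {
    Timeout,
    InvalidPeerCertificate,
}

/// Transient failures are worth retrying; a bad certificate is not, because
/// it will not fix itself without user or admin intervention.
fn is_retryable(e: &ConnectError) -> bool {
    match e {
        ConnectError::Timeout => true,
        ConnectError::InvalidPeerCertificate => false,
    }
}

fn main() {
    assert!(is_retryable(&ConnectError::Timeout));
    assert!(!is_retryable(&ConnectError::InvalidPeerCertificate));
}
```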
These parameters should be tuned to how long we expect "normal" queries
to take against the SQL instance. For smaller instances, "normal"
queries may take longer than 500ms, so we need to be able to configure
these via our Terraform configuration.
If not specified, the same defaults are used as before.
Related: https://github.com/firezone/infra/pull/82
Google Cloud Artifact Registry and Cloud Storage are a significant cost.
GitHub, on the other hand, is completely free due to our being a public
repository. Hence, it makes sense to ditch GCP for GHCR.
To do this, we move all "staging" artifacts to GHCR. These will then be
used in the infra repo to push to GCP for deploys - we probably still
want pulls for our infra to hit GCP and not GitHub.
One big element of this is that we potentially lose sccache, so I'll be
checking the compile time of this PR and looking for alternatives that
don't involve such a massive cloud bill.
This was exposed by #9846. It is being added here as a dedicated PR
because the compatibility tests would fail or at least be flaky for the
latest client release, so we cannot add the integration test right away.
When changes occur in the Firezone DB that trigger side effects, we need
some mechanism to broadcast and handle these.
Before, the system we used was:
- Each process subscribes to a myriad of topics related to data it wants
to receive. In some cases it would subscribe to new topics based on
received events from existing topics (i.e. flows in the gateway
channel), and sometimes in a loop. It would then need to be sure to
_unsubscribe_ from these topics
- Handle the side effect in the `after_commit` hook of the Ecto function
call after it completes
- Broadcast only a simple (thin) event message with a DB id
- In the receiver, use the id(s) to re-evaluate, or lookup one or many
records associated with the change
- After the lookup completes, `push` the relevant message(s) to the
LiveView, `client` pid, or `gateway` pid in their respective channel
processes
This system had a number of drawbacks ranging from scalability issues to
undesirable access bugs:
1. The `after_commit` callback, on each App node, is not globally
ordered. Since we broadcast a thin event schema and read from the DB to
hydrate each event, this meant we had a `read after write` problem in
our event architecture, leading to the potential for lost updates. Case
in point: if a policy is updated from `resource_id-1` to
`resource_id-2`, and then back to `resource_id-1`, it's possible that,
given the right amount of delay, the gateway channel will receive two
`reject_access` events for `resource_id-1`, as opposed to one for
`resource_id-1` and one for `resource_id-2`, leading to the potential
for unauthorized access.
1. It was very difficult to ensure that the correct topics were being
subscribed to and unsubscribed from, and the correct number of times,
leading to maintenance issues for other engineers.
1. We had a nasty N+1 query problem whenever memberships were added or
removed: essentially all access related to that membership (so all
Policies touching its actor group) had to be re-evaluated and broadcast.
This meant that any bulk addition or deletion of memberships would
generate so many queries that they'd time out or consume the entire
connection pool.
1. We had no durability for side-effect processing. In some places, we
were iterating over many returned records to send broadcasts.
Broadcasting is not a zero-time operation; each call takes a small
amount of CPU time to copy the message into the receiver's mailbox. If
we deployed while this was happening, the state update would be lost
forever. If this was a `reject_access` for a Gateway, the Gateway would
never remove access for that particular flow.
1. On each flow authorization, we needed to hit `us-east1` not only to
"authorize" the flow, but to log it as well. This incurs latency
especially for users in other parts of the world, which happens on
_each_ connection setup to a new resource.
1. Since we read and re-authorize access due to the thin events
broadcast from side effects, we risk hitting thundering-herd problems
(see the N+1 query problem above) where a single DB change could result
in all receivers hitting the DB at once to "hydrate" their
processing.
1. If an administrator modifies the DB directly, or if we need to run a
DB migration that involves side effects, those side effects will be lost, because the
side effect triggers happened in `after_commit` hooks that are only
available when querying the DB through Ecto. Manually deleting (or
resurrecting) a policy, for example, would not have updated any
connected clients or gateways with the new state.
To fix all of the above, we move to the system introduced in this PR:
- All changes are now serialized (for free) by Postgres and broadcasted
as a single event stream
- The number of topics has been reduced to just one, the `account_id` of
an account. All receivers subscribe to this one topic for the lifetime
of their pid and then only filter the events they want to act upon,
ignoring all other messages
- The events themselves have been turned into "fat" structs based on the
schemas they represent. By making them properly typed, we can apply things
like the existing Policy authorizer functions to them as if we had just
fetched them from the DB.
- All flow creation now happens in memory and does not need to incur
a DB hit in `us-east1` to proceed.
- Since clients and gateways now track state in a push-based manner from
the DB, this means very few actual DB queries are needed to maintain
state in the channel procs, and it also means we can be smarter about
when to send `resource_deleted` and `resource_created_or_updated`
appropriately, since we can always diff between what the client _had_
access to, and what they _now_ have access to.
- All DB operations, whether they happen from the application code, a
`psql` prompt, or even via Google SQL Studio in the GCP console, will
trigger the _same_ side effects.
- We now use a replication consumer based off Postgres logical decoding
of the write-ahead log using a _durable slot_. This means that Postgres
will retain _all events_ until they are acknowledged, giving us the
ability to ensure at-least-once processing semantics for our system.
Today, the ACK simply means "did we broadcast this event successfully?".
But in the future, we can assert that replies are received before we
acknowledge the event as processed back to Postgres.
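The portal itself is Elixir, but the subscribe-once-and-filter pattern described above can be sketched in Rust with illustrative event and field names:

```rust
use std::sync::mpsc;

/// Illustrative "fat" events; the real structs mirror the Ecto schemas.
#[derive(Debug, Clone)]
enum Event {
    PolicyUpdated { resource_id: u64 },
    MembershipDeleted { actor_id: u64 },
}

fn main() {
    // One stream per account: Postgres serializes the WAL, so events arrive
    // in commit order without any coordination between app nodes.
    let (tx, rx) = mpsc::channel::<Event>();

    tx.send(Event::PolicyUpdated { resource_id: 2 }).unwrap();
    tx.send(Event::MembershipDeleted { actor_id: 7 }).unwrap();
    drop(tx);

    // A receiver subscribes once, for the lifetime of its process, and only
    // acts on the events it cares about, ignoring all other messages.
    for event in rx {
        match event {
            Event::PolicyUpdated { resource_id } => {
                println!("re-evaluate access for resource {resource_id}");
            }
            _ => {}
        }
    }
}
```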
The tests in this PR have been updated to pass given the refactor.
However, since we are tracking more state now in the channel procs, it
would be a good idea to add more tests for those edge cases. That is
saved as a later PR because (1) this one is already huge, and (2) we
need to get this out to staging to smoke test everything anyhow.
Fixes: #9908
Fixes: #9909
Fixes: #9910
Fixes: #9900
Related: #9501
When receiving an `init` message from the portal, we will now revoke all
authorizations not listed in the `authorizations` list of the `init`
message.
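Conceptually, this is a set difference between what the Gateway currently authorizes and what the `init` message lists; a sketch with illustrative types:

```rust
use std::collections::HashSet;

/// Everything the Gateway currently authorizes but the portal no longer
/// lists in the `init` message must be revoked.
fn stale_authorizations(active: &HashSet<u64>, from_init: &HashSet<u64>) -> Vec<u64> {
    active.difference(from_init).copied().collect()
}

fn main() {
    let active = HashSet::from([1, 2, 3]);
    let from_init = HashSet::from([1, 3]);
    assert_eq!(stale_authorizations(&active, &from_init), vec![2]);
}
```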
We (partly) test this by introducing a new transition in our proptests
that de-authorizes a certain resource whilst the Gateway is simulated to
be partitioned. It is difficult to test that we cannot make a connection
once that has happened because we would have to simulate a malicious
client that knows about resources / connections or ignores the "remove
resource" message.
Testing this is deferred to a dedicated task. We do test that we hit the
code path of revoking the resource authorization and because the other
resources keep working, we also test that we are at least not revoking
the wrong ones.
Resolves: #9892
From Sentry reports and user-submitted logs, we know that it is possible
for Client and Gateway to de-sync with regard to what each other's public
key is. In such a scenario, ICE will succeed in making a connection but
`boringtun` will fail to handshake a tunnel. By default, `boringtun`
tries for 90s to handshake a session before it gives up and expires it.
In Firezone, the ICE agent takes care of establishing connectivity
whereas `boringtun` itself just encrypts and decrypts packets. As such,
if ICE is working, we know that packets aren't getting lost but instead,
there must be some other issue as to why we cannot establish a session.
To improve the UX in these error cases, we reduce the rekey-attempt-time
to 15s. This roughly matches our ICE timeout. Those 15s count from the
moment we send the first handshake which is just after ICE completes.
Thus we can be sure that after at most 15s, we either have a working
WireGuard session or the connection gets cleaned up.
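In code, this amounts to lowering the WireGuard `Rekey-Attempt-Time` constant; a sketch of the intent (the actual knob in `boringtun` may be named or plumbed differently):

```rust
use std::time::Duration;

/// The WireGuard paper specifies Rekey-Attempt-Time = 90s. We lower it to
/// roughly match our ICE timeout so that a key mismatch fails fast instead
/// of stalling the connection for a minute and a half.
const REKEY_ATTEMPT_TIME: Duration = Duration::from_secs(15);

fn main() {
    assert_eq!(REKEY_ATTEMPT_TIME.as_secs(), 15);
}
```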
Related: #9890
Related: #9850
At present, the `direct-download-roaming-network` integration test is a
bit odd. It uses curl's `--retry` switch to retry the download after it
fails. However, what we want to show with this integration test
is that a TCP connection can survive network roaming. We can show that
successfully but only if we specify the `--keepalive-time` option,
otherwise the download stalls.
From inspecting the network logs, this is because `curl` simply waits
for more data to be downloaded. After a network reset, the connection
however is gone and the _client_ (in this case `curl`) needs to send at
least 1 packet to re-establish the connection. By using the keep-alive
option, we can send such a packet and the download completes
successfully.
In Docker environments, applying iptables rules to filter
container-container traffic on the Docker bridged network is not
reliable, leading to direct connections being established in our relayed
tests. To fix this, we insert the rules directly from the client
container itself.
---------
Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
When we invalidate or discard an allocation, it may happen that a relay
still sends channel-data messages to us. We don't recognize those and
will therefore attempt to parse them as WireGuard packets, ultimately
ending in a "Packet has unknown format" error.
To avoid this, we check if the packet is a valid channel-data message
even if we presently don't have an allocation on the relay that is
sending us the packet. In those cases, we can stop processing the
packet, thus preventing these errors from being logged.
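A sketch of such a check, assuming it operates on the leading byte of the datagram: per RFC 5766, ChannelData channel numbers fall in 0x4000..=0x7FFF, while STUN messages start with two zero bits and WireGuard packets with a message type of 1..=4, so the ranges cannot collide.

```rust
/// A TURN ChannelData message starts with a channel number in the range
/// 0x4000..=0x7FFF (RFC 5766), so its leading byte is 0x40..=0x7F.
fn looks_like_channel_data(packet: &[u8]) -> bool {
    matches!(packet.first(), Some(b) if (0x40..=0x7F).contains(b))
}

fn main() {
    assert!(looks_like_channel_data(&[0x40, 0x00, 0x00, 0x04]));
    // A WireGuard handshake initiation starts with message type 1.
    assert!(!looks_like_channel_data(&[0x01, 0x00, 0x00, 0x00]));
}
```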
As a followup to #9882, we need to ensure that `jsonb` columns that have
value data other than strings are not decoded as jsonb. An example of
when this happens is when Postgres sends an `:unchanged_toast` to
indicate the data hasn't changed.
In #9664, we introduced the `Domain.struct_from_params/2` function which
converts a set of params containing string keys into a provided struct
representing a schema module. This is used to broadcast actual structs
pertaining to WAL data as opposed to simple string encodings of the
data.
The problem is that this function was a bit too naive and failed to
properly cast embedded schemas, resulting in all embedded schemas on the
root struct being `nil` or `[]`.
To fix this, we need to do two things:
1. We now decode JSON/JSONB fields from binaries (strings) into actual
lists and maps in the replication consumer module for downstream
processors to use
2. We update our `struct_from_params/2` function to properly cast
embedded schemas from these lists and maps using Ecto.Changeset's
`apply_changes` function, which uses the same logic to instantiate the
schemas as if we were saving a form or API request.
Lastly, tests are added to ensure this works under various scenarios,
including nested embedded schemas which we use in some places.
Fixes #9835
---------
Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
When a connection is in idle-mode, it only sends a STUN request every 25
seconds. If the Client disconnects e.g. due to a network partition, it
may send a new connection intent later. If the Gateway's connection is
still around (because it was in idle mode), it won't send any
candidates to the remote, making the Client's connection fail with "no
candidates received".
To alleviate this, we wake a connection out of idle mode every time it
is being upserted. This ensures that the connection will fail within 15s
IF the above scenario happens, allowing the Client to reconnect within a
much shorter time-frame.
Note that attempting to repair such a connection is likely pointless. It
is much safer to discard it and let them both establish a new
connection.
Related: #9862
Whilst looking through the auth module of the relay, I noticed that we
unnecessarily convert back and forth between expiry timestamps and
username formats when we could just be using the already parsed version.
Applying a filter globally to the entire subscriber means it filters
events for all layers. This prevents the Sentry layer from uploading
DEBUG logs even if it is configured to do so.
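Assuming the relay uses `tracing_subscriber`, the fix looks roughly like attaching the filter to the individual layer rather than to the whole registry:

```rust
use tracing_subscriber::{fmt, layer::SubscriberExt, util::SubscriberInitExt, EnvFilter, Layer};

fn main() {
    tracing_subscriber::registry()
        // The filter applies to the stdout layer only...
        .with(fmt::layer().with_filter(EnvFilter::new("info")))
        // ...so another layer (e.g. Sentry) added here would still
        // receive DEBUG events.
        .init();
}
```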
In #9870, the password generation algorithm was broken. The correct
order of the elements in the hash is: expiry, stamp_secret, salt. The
relay expects this order when it re-generates the password to validate
the message.
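A sketch of the corrected ordering, with an illustrative digest and encoding (the relay's actual scheme may differ):

```rust
use base64::Engine;
use sha2::{Digest, Sha256};

/// Both sides must feed the hash in the same order: expiry first, then
/// stamp_secret, then salt. Re-ordering any of these on one side makes
/// password validation fail on the other.
fn generate_password(expiry_unix: u64, stamp_secret: &[u8], salt: &[u8]) -> String {
    let mut hasher = Sha256::new();
    hasher.update(expiry_unix.to_be_bytes());
    hasher.update(stamp_secret);
    hasher.update(salt);
    base64::engine::general_purpose::STANDARD.encode(hasher.finalize())
}

fn main() {
    println!("{}", generate_password(1_752_000_000, b"stamp", b"salt"));
}
```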
Due to a different bug in our CI system, we weren't actually checking
for warnings / errors in our perf-test suite:
https://github.com/firezone/firezone/actions/runs/16285038111/job/45982241021#step:9:66
The current Git tag for releases of the Apple client is out of line with
the naming of the rest of the repository. Ideally, the tag would be renamed
to `apple-client-X.Y.Z` as it represents the version for both the macOS
and iOS client.
I am not familiar enough with the redirect system on our website to
confidently do this without breaking anything, so the easiest fix
here is to employ the same hack we already use for Sentry, where we
special-case the `macos-client` tag.
Resolves: #9871
Bumps [rustls](https://github.com/rustls/rustls) from 0.23.28 to
0.23.29.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="4e0b5fed17"><code>4e0b5fe</code></a>
Bump version to 0.23.29</li>
<li><a
href="b8540790dc"><code>b854079</code></a>
Propagate context for webpki signature algorithm errors</li>
<li><a
href="c84675e34b"><code>c84675e</code></a>
key_schedule: minimise lifetime of resumption secret</li>
<li><a
href="788b0df122"><code>788b0df</code></a>
key_schedule: erase master secret in traffic state</li>
<li><a
href="d2c64f0416"><code>d2c64f0</code></a>
key_schedule: separate ops not using current secret</li>
<li><a
href="e5998cd100"><code>e5998cd</code></a>
key_schedule: add state for derivations before finish</li>
<li><a
href="9620bec130"><code>9620bec</code></a>
tls13::key_schedule: move <code>KeySchedule</code> struct down</li>
<li><a
href="373ad888e2"><code>373ad88</code></a>
tls13::key_schedule: move <code>SecretKind</code> down</li>
<li><a
href="efa2066469"><code>efa2066</code></a>
Improve compactness of Debug impl for extensions</li>
<li><a
href="a5433a154b"><code>a5433a1</code></a>
Correct calculation of ServerHello ECH confirmation</li>
<li>Additional commits viewable in <a
href="https://github.com/rustls/rustls/compare/v/0.23.28...v/0.23.29">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [clap](https://github.com/clap-rs/clap) from 4.5.40 to 4.5.41.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/clap-rs/clap/blob/master/CHANGELOG.md">clap's
changelog</a>.</em></p>
<blockquote>
<h2>[4.5.41] - 2025-07-09</h2>
<h3>Features</h3>
<ul>
<li>Add <code>Styles::context</code> and
<code>Styles::context_value</code> to customize the styling of
<code>[default: value]</code> like notes in the <code>--help</code></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="92fcd83b76"><code>92fcd83</code></a>
chore: Release</li>
<li><a
href="aca91b99c1"><code>aca91b9</code></a>
docs: Update changelog</li>
<li><a
href="8434510cee"><code>8434510</code></a>
Merge pull request <a
href="https://redirect.github.com/clap-rs/clap/issues/5869">#5869</a>
from tw4452852/patch-1</li>
<li><a
href="33b1fc304e"><code>33b1fc3</code></a>
fix(complete): Fix env leakage in elvish dynamic completion</li>
<li><a
href="e5f1f4884c"><code>e5f1f48</code></a>
chore: Release</li>
<li><a
href="9466a552fb"><code>9466a55</code></a>
docs: Update changelog</li>
<li><a
href="d74b793512"><code>d74b793</code></a>
Merge pull request <a
href="https://redirect.github.com/clap-rs/clap/issues/5865">#5865</a>
from gifnksm/nushell-completion-value-types</li>
<li><a
href="ecbc775d3b"><code>ecbc775</code></a>
fix(nu): Set argument type based on <code>ValueHint</code></li>
<li><a
href="6784054536"><code>6784054</code></a>
Merge pull request <a
href="https://redirect.github.com/clap-rs/clap/issues/5857">#5857</a>
from epage/empty</li>
<li><a
href="cca5f32b3a"><code>cca5f32</code></a>
test(complete): Show empty option-value behavior</li>
<li>Additional commits viewable in <a
href="https://github.com/clap-rs/clap/compare/clap_complete-v4.5.40...clap_complete-v4.5.41">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [zbus](https://github.com/dbus2/zbus) from 5.7.1 to 5.8.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/dbus2/zbus/releases">zbus's
releases</a>.</em></p>
<blockquote>
<h2>🔖 zbus 5.8.0</h2>
<ul>
<li>✨ <code>interface</code> macro now supports write-only
properties.</li>
<li>✨ Copy attributes over to <code>receive_*_changed</code> and
<code>cached_*</code> methods in <code>proxy</code>.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="7d8e935927"><code>7d8e935</code></a>
Merge pull request <a
href="https://redirect.github.com/dbus2/zbus/issues/1425">#1425</a> from
zeenix/zb-release</li>
<li><a
href="da0ca55c28"><code>da0ca55</code></a>
🔖 zb,zm: Release 5.8.0</li>
<li><a
href="be41117c4b"><code>be41117</code></a>
Merge pull request <a
href="https://redirect.github.com/dbus2/zbus/issues/1424">#1424</a> from
zeenix/zv-release</li>
<li><a
href="dda4f376e4"><code>dda4f37</code></a>
🔖 zv,zd: Release 5.6.0</li>
<li><a
href="747c64505c"><code>747c645</code></a>
⬆️ micro: Update blocking to v1.6.2 (<a
href="https://redirect.github.com/dbus2/zbus/issues/1423">#1423</a>)</li>
<li><a
href="d01e893a8b"><code>d01e893</code></a>
⬆️ micro: Update tokio to v1.46.1 (<a
href="https://redirect.github.com/dbus2/zbus/issues/1422">#1422</a>)</li>
<li><a
href="8250c5357e"><code>8250c53</code></a>
⬆️ micro: Update libfuzzer-sys to v0.4.10 (<a
href="https://redirect.github.com/dbus2/zbus/issues/1421">#1421</a>)</li>
<li><a
href="7ab8fa67ee"><code>7ab8fa6</code></a>
Merge pull request <a
href="https://redirect.github.com/dbus2/zbus/issues/1420">#1420</a> from
dbus2/renovate/tokio-1.x-lockfile</li>
<li><a
href="36fde484aa"><code>36fde48</code></a>
⬆️ Update tokio to v1.46.0</li>
<li><a
href="f9870cde4a"><code>f9870cd</code></a>
Merge pull request <a
href="https://redirect.github.com/dbus2/zbus/issues/1419">#1419</a> from
zeenix/fix-zv-regression</li>
<li>Additional commits viewable in <a
href="https://github.com/dbus2/zbus/compare/zbus-5.7.1...zbus-5.8.0">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Why:
* Adding more BEAM VM metrics to give us better insight into how our
BEAM cluster is running, since we're in the middle of making some
moderately large architectural changes to the application.
As a followup to #9856, after talking with @bmanifold, we determined that
using the public_key as the username for TURN credentials is a safer bet
because:
- It's by definition public and therefore does not need to be obfuscated
- It's shorter-lived than the token, especially for the gateway
- It essentially represents the data plane connection for client/gateway
and naturally rotates along with the key state for those
When giving TURN credentials to clients and gateways, it's important
that they remain consistent across hiccups in the portal connection so
that relayed connections are not interrupted during a deploy, or if the
user's internet is flaky, or the GCP load balancer decides to disconnect
the client/gateway.
Prior to this PR, that was not the case because we essentially tied TURN
credentials, required for data plane packet flows, to the WebSocket
connection, a control plane element. This happened because we generated
random `expires_at` and `salt` elements on _each_ connection to the
portal.
Instead, what we do now is make these reproducible and tied to the auth
token by hashing then base64-encoding it. The expiry is tied to the
auth-token's expiry.
Fixes #9856
As per the WireGuard paper, `boringtun` tries to handshake with the
remote peer for 90s before it gives up. This timeout is important
because when a session is discarded due to e.g. missing replies,
WireGuard attempts to handshake a new session. Without this timeout, we
would then try to handshake a session forever.
Unfortunately, `boringtun` does not distinguish a missing handshake
response from a bad one. Decryption errors whilst decoding a handshake
response are simply passed up to the upper layer, in our case `snownet`.
I am not sure how we can actually fail to decrypt a handshake, but the
pattern we are seeing in customer logs is that this happens over and
over again, so there is no point in having `boringtun` retry the
handshake. Therefore, we immediately fail the connection when this
happens.
Failed connections are immediately removed, triggering the client to send
a new connection-intent to the portal. Such a new connection intent will
then sync-up the state between Client and Gateway so both of them use
the most recent public key.
Resolves: #9845
As a first step in preparation for sending OTEL metrics from Clients and
Gateways to a cloud-hosted OTEL collector, we extend the CLI of the
Gateway with configuration options to provide a gRPC endpoint to an OTEL
collector.
If `FIREZONE_METRICS` is set to `otel-collector` and an endpoint is
configured via `OTLP_GRPC_ENDPOINT`, we will report our metrics to that
collector.
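The env var names come from the text above; the dispatch itself can be sketched like this (return type and function name are illustrative):

```rust
/// Returns the OTLP gRPC endpoint to export metrics to, if any.
fn metrics_endpoint() -> Option<String> {
    match std::env::var("FIREZONE_METRICS").as_deref() {
        Ok("otel-collector") => std::env::var("OTLP_GRPC_ENDPOINT").ok(),
        _ => None,
    }
}

fn main() {
    match metrics_endpoint() {
        Some(endpoint) => println!("exporting metrics to {endpoint}"),
        None => println!("metrics export disabled"),
    }
}
```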
The future plan for extending this is such that if `FIREZONE_METRICS` is
set to `otel-collector` (which will likely be the default) and no
`OTLP_GRPC_ENDPOINT` is set, then we will use our own, hosted OTEL
collector and report metrics IF the `export-metrics` feature-flag is set
to `true`.
This is a similar integration to the one we built for streaming logs to
Sentry. We can therefore enable it at a similar granularity as we do
with the logs and e.g. only enable it for the `firezone` account to
start with.
In the meantime, customers can already make use of those metrics if they'd
like by using the current integration.
Resolves: #1550
Related: #7419
---------
Co-authored-by: Antoine Labarussias <antoinelabarussias@gmail.com>
When failing to create the TUN device, the error messages are currently
pretty bare. Add a bit more context so users can more easily
self-diagnose what is wrong.
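For example, using `anyhow` (a hypothetical sketch; the real code paths and messages differ):

```rust
use anyhow::{Context, Result};
use std::fs::{File, OpenOptions};

/// Wrap the raw OS error with enough context for users to self-diagnose
/// (missing permissions, missing /dev/net/tun, ...).
fn open_tun() -> Result<File> {
    OpenOptions::new()
        .read(true)
        .write(true)
        .open("/dev/net/tun")
        .context("failed to open /dev/net/tun; are we running with CAP_NET_ADMIN?")
}

fn main() {
    if let Err(e) = open_tun() {
        eprintln!("{e:#}"); // Prints the whole chain: our hint plus the OS error.
    }
}
```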
In the DNS resource NAT table, we track parts of the layer 4 protocol of
the connection in order to map packets back to the correct proxy IP in
case multiple DNS names resolve to the same real IP. The involvement of
layer 4 means we need to perform some packet inspection in case we
receive ICMP errors from an upstream router.
Presently, the only ICMP error we handle here is destination
unreachable. Those are generated e.g. when we are trying to contact an
IPv6 address but we don't have an IPv6 egress interface. An additional
error that we want to handle here is "time exceeded":
Time exceeded is sent when the TTL of a packet reaches 0. Typically,
TTLs are set high enough such that the packet makes it to its
destination. When using tools such as `tracepath`, however, the TTL is
deliberately incremented one by one in order to resolve the exact hops a
packet takes to a destination. Without handling the time
exceeded ICMP error, using `tracepath` through Firezone is broken
because the packets get dropped at the DNS resource NAT.
With this PR, we generalise the functionality of detecting destination
unreachable ICMP errors to also handle time-exceeded errors, allowing
tools such as `tracepath` to somewhat work:
```
❯ sudo docker compose exec --env RUST_LOG=info -it client /bin/sh -c 'tracepath -b example.com'
1?: [LOCALHOST] pmtu 1280
1: 100.82.110.64 (100.82.110.64) 0.795ms
1: 100.82.110.64 (100.82.110.64) 0.593ms
2: example.com (100.96.0.1) 0.696ms asymm 45
3: example.com (100.96.0.1) 5.788ms asymm 45
4: example.com (100.96.0.1) 7.787ms asymm 45
5: example.com (100.96.0.1) 8.412ms asymm 45
6: example.com (100.96.0.1) 9.545ms asymm 45
7: example.com (100.96.0.1) 7.312ms asymm 45
8: example.com (100.96.0.1) 8.779ms asymm 45
9: example.com (100.96.0.1) 9.455ms asymm 45
10: example.com (100.96.0.1) 14.410ms asymm 45
11: example.com (100.96.0.1) 24.244ms asymm 45
12: example.com (100.96.0.1) 31.286ms asymm 45
13: no reply
14: example.com (100.96.0.1) 303.860ms asymm 45
15: no reply
16: example.com (100.96.0.1) 135.616ms (This broken router returned corrupted payload) asymm 45
17: no reply
18: example.com (100.96.0.1) 161.647ms asymm 45
19: no reply
20: no reply
21: no reply
22: example.com (100.96.0.1) 238.066ms reached
Resume: pmtu 1280 hops 22 back 45
```
We say "somewhat work" because due to the NAT that is in place for DNS
resources, the output does not disclose the intermediary hops beyond the
Gateway.
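In terms of the wire format, the generalisation boils down to accepting one more ICMP type on each IP version (a sketch; the real code also parses the embedded original packet):

```rust
/// ICMP errors whose embedded original packet we inspect to reverse the
/// DNS resource NAT. Type numbers per RFC 792 and RFC 4443:
/// ICMPv4: Destination Unreachable = 3, Time Exceeded = 11.
/// ICMPv6: Destination Unreachable = 1, Time Exceeded = 3.
fn handles_icmpv4_error(icmp_type: u8) -> bool {
    matches!(icmp_type, 3 | 11)
}

fn handles_icmpv6_error(icmp_type: u8) -> bool {
    matches!(icmp_type, 1 | 3)
}

fn main() {
    assert!(handles_icmpv4_error(11)); // `tracepath` relies on Time Exceeded.
    assert!(handles_icmpv6_error(3));
}
```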
Co-authored-by: Antoine Labarussias <antoinelabarussias@gmail.com>
The latest version of str0m includes a fix for a bug where adding a
remote candidate prior to a local candidate would result in an immediate
ICE timeout. We mitigated this in #9793 to make Firezone overall more
resilient towards sudden changes in the ICE connection state.
As a defense-in-depth measure, we also fixed this issue in str0m by not
transitioning to `Disconnected` if we haven't even formed any candidate
pairs yet.
Diff:
2153bf0385...3d6e3d2f27
I am no longer able to compile `jemalloc` on my system in a debug build.
It fails with the following error:
```
src/malloc_io.c: In function ‘buferror’:
src/malloc_io.c:107:16: error: returning ‘char *’ from a function with return type ‘int’ makes integer from pointer without a cast [-Wint-conversion]
107 | return strerror_r(err, buf, buflen);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
This appears to be a problem with modern versions of clang/gcc. I
believe this started happening when I recently upgraded my system. The
upstream [`jemalloc`](https://github.com/jemalloc/jemalloc) repository
is now archived and thus unmaintained. I am not sure if we ever measured
a significant benefit in using `jemalloc`.
Related: https://github.com/servo/servo/issues/31059
Rust 1.88 has been released and brings with it a quite exciting feature:
let-chains! They allow us to mix and match `if` and `let` expressions,
often reducing the rightward drift of the relevant code and making it
easier to read.
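For example (edition 2024), a `let` binding and a boolean condition can now share a single `if`:

```rust
// Before Rust 1.88 this required nesting an `if` inside the `if let`;
// now the binding and the condition sit in one flat chain.
fn first_even(values: &[Option<i32>]) -> Option<i32> {
    for value in values {
        if let Some(v) = value
            && *v % 2 == 0
        {
            return Some(*v);
        }
    }
    None
}

fn main() {
    assert_eq!(first_even(&[None, Some(3), Some(4)]), Some(4));
}
```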
Rust 1.88 also comes with a new clippy lint that warns when creating a
mutable reference from an immutable pointer. Attempting to fix this
revealed that this is exactly what we are doing in the eBPF kernel.
Unfortunately, it doesn't seem to be possible to design this in a way
that is both accepted by the borrow-checker AND by the eBPF verifier.
Hence, we simply make the function `unsafe` and document for the
programmer what needs to be upheld.