Attempting to refresh an allocation is the only idempotent way in TURN
to test whether one has an active allocation. As such, logging this on
WARN is too aggressive.
Resolves: #7481.
At present, `connlib` will always drop all IP packets until a connection
is established and the DNS resource NAT is created. This causes an
unnecessary delay until the connection is working because we need to
wait for retransmission timers of the host's network stack to resend
those packets.
With the new idempotent control protocol, it is now much easier to
buffer these packets and send them to the gateway once the connection is
established.
The buffer sizes are chosen somewhat conservatively to ensure we don't
consume a lot of memory. The hypothesis here is that every protocol -
even if the transport layer is unreliable like UDP - will start with a
handshake involving only one or at most a few packets and waiting for a
reply before sending more. Thus, as long as we can set up a connection
quicker than the re-transmit timer in the host's network stack,
buffering those packets should result in no packet loss. Typically,
setting up a new connection takes at most 500ms which should be fast
enough to not trigger any re-transmits.
Resolves: #3246.
On the one hand, learning about in which edgecases our software fails is
useful and thus having telemetry also active for self-hosted users is
beneficial. On the other hand, we have neither control nor a contact to
those self-hosted and whatever they are doing might spam our Sentry
account with errors that we can't do anything about.
To mitigate this, we disable telemetry for self-hosted users with the
next release.
Once we have more resources, we can consider enabling this again.
As per the RFC, the IPv6 traffic class should be 1-to-1 translated to
the IPv4 DSCP value. However, it appears that not all values here are
valid. In particular, when attempting to reach GitHub over IPv6, we
receive an IPv6 packet that has a traffic class value of 72 which is
out-of-range for the IPv4 DSCP value, resulting in the following error
on the Gateway:
```
Failed to translate packet: NAT64 failed: Error '72' is too big to be a 'IPv4 DSCP (Differentiated Services Code Point)' (maximum allowed value is '63')
```
The bigger scope of this issue is that this causes the ICMP packets
returned to the client to be dropped which means that `ssh` spawned by
`git` doesn't learn that the IPv6 address assigned by Firezone is not
actually routable.
Related: #7476.
In order for Sentry to parse our releases as semver, they need to be in
the form of `package@version` [0]. Without this, the feature of "Mark
this issue as resolved in the _next_ version" doesn't work properly
because Sentry compares the versions as to when it first saw them vs
parsing the semver string itself. We test versions prior to releasing
them, meaning Sentry learns about a 1.4.0 version before it is actually
released. This causes false-positive "regressions" even though they are
fixed in a later (as per semver) release.
This create some redundancy with the different DSNs that we are already
using. I think it would make sense to consider merging the two projects
we have for the GUI client for example. That is really just one project
that happens to run as two binaries.
For all other projects, I think the separation still makes sense because
we e.g. may add Sentry to the "host" applications of Android and
MacOS/iOS as well. For those, we would reuse the DSN and thus funnel the
issues into the same Sentry project.
As per Sentry's docs, releases are organisation-wide and therefore need
a package identifier to be grouped correctly.
[0]:
https://docs.sentry.io/platforms/javascript/configuration/releases/#bind-the-version
From within the FFI code, we have no control over the lifecycle of the
host application and `connect` may be called multiple times from within
the same process. Therefore, we cannot rely on the global logger state
to **not** be set when `connect` gets called.
To fix this, we cache the handles for the file logger and a
reload-handle for the log filter in a `static` variable. This allows us
to apply the new log-filter of a repeated `connect` call to the existing
logger, even if `connect` is called multiple times from the same
process.
## Context
At present, we only have a single thread that reads and writes to the
TUN device on all platforms. On Linux, it is possible to open the file
descriptor of a TUN device multiple times by setting the
`IFF_MULTI_QUEUE` option using `ioctl`. Using multi-queue, we can then
spawn multiple threads that concurrently read and write to the TUN
device. This is critical for achieving a better throughput.
## Solution
`IFF_MULTI_QUEUE` is a Linux-only thing and therefore only applies to
headless-client, GUI-client on Linux and the Gateway (it may also be
possible on Android, I haven't tried). As such, we need to first change
our internal abstractions a bit to move the creation of the TUN thread
to the `Tun` abstraction itself. For this, we change the interface of
`Tun` to the following:
- `poll_recv_many`: An API, inspired by tokio's `mpsc::Receiver` where
multiple items in a channel can be batch-received.
- `poll_send_ready`: Mimics the API of `Sink` to check whether more
items can be written.
- `send`: Mimics the API of `Sink` to actually send an item.
With these APIs in place, we can implement various (performance)
improvements for the different platforms.
- On Linux, this allows us to spawn multiple threads to read and write
from the TUN device and send all packets into the same channel. The `Io`
component of `connlib` then uses `poll_recv_many` to read batches of up
to 100 packets at once. This ties in well with #7210 because we can then
use GSO to send the encrypted packets in single syscalls to the OS.
- On Windows, we already have a dedicated recv thread because `WinTun`'s
most-convenient API uses blocking IO. As such, we can now also tie into
that by batch-receiving from this channel.
- In addition to using multiple threads, this API now also uses correct
readiness checks on Linux, Darwin and Android to uphold backpressure in
case we cannot write to the TUN device.
## Configuration
Local testing has shown that 2 threads give the best performance for a
local `iperf3` run. I suspect this is because there is only so much
traffic that a single application (i.e. `iperf3`) can generate. With
more than 2 threads, the throughput actually drops drastically because
`connlib`'s main thread is too busy with lock-contention and triggering
`Waker`s for the TUN threads (which mostly idle around if there are 4+
of them). I've made it configurable on the Gateway though so we can
experiment with this during concurrent speedtests etc.
In addition, switching `connlib` to a single-threaded tokio runtime
further increased the throughput. I suspect due to less task / context
switching.
## Results
Local testing with `iperf3` shows some very promising results. We now
achieve a throughput of 2+ Gbit/s.
```
Connecting to host 172.20.0.110, port 5201
Reverse mode, remote host 172.20.0.110 is sending
[ 5] local 100.80.159.34 port 57040 connected to 172.20.0.110 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 274 MBytes 2.30 Gbits/sec
[ 5] 1.00-2.00 sec 279 MBytes 2.34 Gbits/sec
[ 5] 2.00-3.00 sec 216 MBytes 1.82 Gbits/sec
[ 5] 3.00-4.00 sec 224 MBytes 1.88 Gbits/sec
[ 5] 4.00-5.00 sec 234 MBytes 1.96 Gbits/sec
[ 5] 5.00-6.00 sec 238 MBytes 2.00 Gbits/sec
[ 5] 6.00-7.00 sec 229 MBytes 1.92 Gbits/sec
[ 5] 7.00-8.00 sec 222 MBytes 1.86 Gbits/sec
[ 5] 8.00-9.00 sec 223 MBytes 1.87 Gbits/sec
[ 5] 9.00-10.00 sec 217 MBytes 1.82 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 2.30 GBytes 1.98 Gbits/sec 22247 sender
[ 5] 0.00-10.00 sec 2.30 GBytes 1.98 Gbits/sec receiver
iperf Done.
```
This is a pretty solid improvement over what is in `main`:
```
Connecting to host 172.20.0.110, port 5201
[ 5] local 100.65.159.3 port 56970 connected to 172.20.0.110 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 90.4 MBytes 758 Mbits/sec 1800 106 KBytes
[ 5] 1.00-2.00 sec 93.4 MBytes 783 Mbits/sec 1550 51.6 KBytes
[ 5] 2.00-3.00 sec 92.6 MBytes 777 Mbits/sec 1350 76.8 KBytes
[ 5] 3.00-4.00 sec 92.9 MBytes 779 Mbits/sec 1800 56.4 KBytes
[ 5] 4.00-5.00 sec 93.4 MBytes 783 Mbits/sec 1650 69.6 KBytes
[ 5] 5.00-6.00 sec 90.6 MBytes 760 Mbits/sec 1500 73.2 KBytes
[ 5] 6.00-7.00 sec 87.6 MBytes 735 Mbits/sec 1400 76.8 KBytes
[ 5] 7.00-8.00 sec 92.6 MBytes 777 Mbits/sec 1600 82.7 KBytes
[ 5] 8.00-9.00 sec 91.1 MBytes 764 Mbits/sec 1500 70.8 KBytes
[ 5] 9.00-10.00 sec 92.0 MBytes 771 Mbits/sec 1550 85.1 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 917 MBytes 769 Mbits/sec 15700 sender
[ 5] 0.00-10.00 sec 916 MBytes 768 Mbits/sec receiver
iperf Done.
```
In order to release the new control protocol to users, we need to bump
the versions of the clients to 1.4.0. The portal has a version gate to
only select gateways with version >= 1.4.0 for clients >= 1.4.0. Thus,
bumping these versions can only happen once testing has completed and
the gateway has actually been released as 1.4.0.
Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
Building on top of the gateway PR (#6941), this PR transitions the
clients to the new control protocol. Clients are **not**
backwards-compatible with old gateways. As a result, a certain customer
environment MUST have at least one gateway with the above PR running in
order for clients to be able to establish connections.
With this transition, Clients send explicit events to Gateways whenever
they assign IPs to a DNS resource name. The actual assignment only
happens once and the IPs then remain stable for the duration of the
client session.
When the Gateway receives such an event, it will perform a DNS
resolution of the requested domain name and set up the NAT between the
assigned proxy IPs and the IPs the domain actually resolves to. In order
to support self-healing of any problems that happen during this process,
the client will send an "Assigned IPs" event every time it receives a
DNS query for a particular domain. This in turn will trigger another DNS
resolution on the Gateway. Effectively, this means that DNS queries for
DNS resources propagate to the Gateway, triggering a DNS resolution
there. In case the domain resolves to the same set of IPs, no state is
changed to ensure existing connections are not interrupted.
With this new functionality in place, we can delete the old logic around
detecting "expired" IPs. This is considered a bugfix as this logic isn't
currently working as intended. It has been observed multiple times that
the Gateway can loop on this behaviour and resolving the same domain
over and over again. The only theoretical "incompatibility" here is that
pre-1.4.0 clients won't have access to this functionality of triggering
DNS refreshes on a Gateway 1.4.2+ Gateway. However, as soon as this PR
merges, we expect all admins to have already upgraded to a 1.4.0+
Gateway anyway which already mandates clients to be on 1.4.0+.
Resolves: #7391.
Resolves: #6828.
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.210 to
1.0.215.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/serde-rs/serde/releases">serde's
releases</a>.</em></p>
<blockquote>
<h2>v1.0.215</h2>
<ul>
<li>Produce warning when multiple fields or variants have the same
deserialization name (<a
href="https://redirect.github.com/serde-rs/serde/issues/2855">#2855</a>,
<a
href="https://redirect.github.com/serde-rs/serde/issues/2856">#2856</a>,
<a
href="https://redirect.github.com/serde-rs/serde/issues/2857">#2857</a>)</li>
</ul>
<h2>v1.0.214</h2>
<ul>
<li>Implement IntoDeserializer for all Deserializers in serde::de::value
module (<a
href="https://redirect.github.com/serde-rs/serde/issues/2568">#2568</a>,
thanks <a
href="https://github.com/Mingun"><code>@Mingun</code></a>)</li>
</ul>
<h2>v1.0.213</h2>
<ul>
<li>Fix support for macro-generated <code>with</code> attributes inside
a newtype struct (<a
href="https://redirect.github.com/serde-rs/serde/issues/2847">#2847</a>)</li>
</ul>
<h2>v1.0.212</h2>
<ul>
<li>Fix hygiene of macro-generated local variable accesses in
serde(with) wrappers (<a
href="https://redirect.github.com/serde-rs/serde/issues/2845">#2845</a>)</li>
</ul>
<h2>v1.0.211</h2>
<ul>
<li>Improve error reporting about mismatched signature in
<code>with</code> and <code>default</code> attributes (<a
href="https://redirect.github.com/serde-rs/serde/issues/2558">#2558</a>,
thanks <a
href="https://github.com/Mingun"><code>@Mingun</code></a>)</li>
<li>Show variant aliases in error message when variant deserialization
fails (<a
href="https://redirect.github.com/serde-rs/serde/issues/2566">#2566</a>,
thanks <a
href="https://github.com/Mingun"><code>@Mingun</code></a>)</li>
<li>Improve binary size of untagged enum and internally tagged enum
deserialization by about 12% (<a
href="https://redirect.github.com/serde-rs/serde/issues/2821">#2821</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="8939af48fe"><code>8939af4</code></a>
Release 1.0.215</li>
<li><a
href="fa5d58cd00"><code>fa5d58c</code></a>
Use ui test syntax that does not interfere with rustfmt</li>
<li><a
href="1a3cf4b3c1"><code>1a3cf4b</code></a>
Update PR 2562 ui tests</li>
<li><a
href="7d96352e96"><code>7d96352</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2857">#2857</a>
from dtolnay/collide</li>
<li><a
href="111ecc5d8c"><code>111ecc5</code></a>
Update ui tests for warning on colliding aliases</li>
<li><a
href="edd6fe954b"><code>edd6fe9</code></a>
Revert "Add checks for conflicts for aliases"</li>
<li><a
href="a20e9249c5"><code>a20e924</code></a>
Revert "pacify clippy"</li>
<li><a
href="b1353a99cd"><code>b1353a9</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2856">#2856</a>
from dtolnay/dename</li>
<li><a
href="c59e876bb3"><code>c59e876</code></a>
Produce a separate warning for every colliding name</li>
<li><a
href="7f1e697c0d"><code>7f1e697</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2855">#2855</a>
from dtolnay/namespan</li>
<li>Additional commits viewable in <a
href="https://github.com/serde-rs/serde/compare/v1.0.210...v1.0.215">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Fixes a similar issue as #7443 where we were deleting the
`IPHONEOS_DEPLOYMENT_TARGET` variable in our Rust build script, which
caused lots of warnings about building for a different OS than being
linked against.
At present, the GUI client shares the current log-directives with the
IPC service via the file system. Supposedly, this has been done to allow
the IPC service to start back up with the same log filter as before.
This behaviour appears to be buggy though as we are receiving a fair
number of error reports where this file is not writable.
Instead of relying on the file system to communicate, we send the
current log-directives to the IPC service as soon as we start up. The
IPC service then uses the file system as a cache that log string and
re-apply it on the next startup. This way, no two programs need to read
/ write the same file. The IPC service runs with higher privileges, so
this should resolve the permission errors we are seeing in Sentry.
Bumps [@tauri-apps/api](https://github.com/tauri-apps/tauri) from 2.0.3
to 2.1.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/tauri-apps/tauri/releases"><code>@tauri-apps/api</code>'s
releases</a>.</em></p>
<blockquote>
<h2><code>@tauri-apps/api</code> v2.1.1</h2>
<!-- raw HTML omitted -->
<pre><code>No known vulnerabilities found
</code></pre>
<!-- raw HTML omitted -->
<h2>[2.1.1]</h2>
<h3>Bug Fixes</h3>
<ul>
<li><a
href="7f81f05236"><code>5e9435487</code></a>
(<a
href="https://redirect.github.com/tauri-apps/tauri/pull/11639">#11645</a>
by <a href="https://github.com/dgerhardt">dgerhardt</a>) Fix regression
in <code>toLogical</code> and <code>toPhysical</code> for position types
in <code>dpi</code> module returning incorrect <code>y</code>
value.</li>
<li><a
href="e8a50f6d76"><code>e8a50f6d7</code></a>
(<a
href="https://redirect.github.com/tauri-apps/tauri/pull/11645">#11645</a>)
Fix integer values of <code>BasDirectory.Home</code> and
<code>BaseDirectory.Font</code> regression which broke path APIs in
JS.</li>
</ul>
<!-- raw HTML omitted -->
<pre><code>> @tauri-apps/api@2.1.1 npm-publish
/home/runner/work/tauri/tauri/packages/api
> pnpm build && cd ./dist && pnpm publish --access
public --loglevel silly --no-git-checks
<p>> <code>@tauri-apps/api</code><a
href="https://github.com/2"><code>@2</code></a>.1.1 build
/home/runner/work/tauri/tauri/packages/api
> rollup -c --configPlugin typescript</p>
<p>[36m
[1m./src/app.ts, ./src/core.ts, ./src/dpi.ts, ./src/event.ts,
./src/image.ts, ./src/index.ts, ./src/menu.ts, ./src/mocks.ts,
./src/path.ts, ./src/tray.ts, ./src/webview.ts, ./src/webviewWindow.ts,
./src/window.ts[22m → [1m./dist, ./dist[22m...[39m
[32mcreated [1m./dist, ./dist[22m in [1m1.3s[22m[39m
[36m
[1msrc/index.ts[22m →
[1m../../crates/tauri/scripts/bundle.global.js[22m...[39m
[32mcreated [1m../../crates/tauri/scripts/bundle.global.js[22m in
[1m1.8s[22m[39m
npm verbose cli /opt/hostedtoolcache/node/20.18.0/x64/bin/node
/opt/hostedtoolcache/node/20.18.0/x64/bin/npm
npm info using npm@10.8.2
npm info using node@v20.18.0
npm silly config
load:file:/opt/hostedtoolcache/node/20.18.0/x64/lib/node_modules/npm/npmrc
npm silly config load:file:/tmp/6ad0a380b775be1a5293f739102d3639/.npmrc
npm silly config load:file:/home/runner/work/_temp/.npmrc
npm silly config
load:file:/opt/hostedtoolcache/node/20.18.0/x64/etc/npmrc
npm verbose title npm publish tauri-apps-api-2.1.1.tgz
npm verbose argv "publish" "--ignore-scripts"
"tauri-apps-api-2.1.1.tgz" "--access"
"public" "--loglevel" "silly"
"--no-git-checks"
npm verbose logfile logs-max:10
dir:/home/runner/.npm/_logs/2024-11-11T14_49_08_178Z-
npm verbose logfile
/home/runner/.npm/_logs/2024-11-11T14_49_08_178Z-debug-0.log
npm verbose publish [ 'tauri-apps-api-2.1.1.tgz' ]
npm silly logfile done cleaning log files
npm notice
npm notice 📦 <code>@tauri-apps/api</code><a
href="https://github.com/2"><code>@2</code></a>.1.1
npm notice Tarball Contents
</tr></table>
</code></pre></p>
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="ef2592b5a8"><code>ef2592b</code></a>
Apply Version Updates From Current Changes (<a
href="https://redirect.github.com/tauri-apps/tauri/issues/11646">#11646</a>)</li>
<li><a
href="7f81f05236"><code>7f81f05</code></a>
chore: rename change file</li>
<li><a
href="e8a50f6d76"><code>e8a50f6</code></a>
fix(core): hard code <code>BaseDirectory</code> integer values to avoid
regressions when...</li>
<li><a
href="5e94354875"><code>5e94354</code></a>
fix(api/dpi): fix toLogical and toPhysical for positions (<a
href="https://redirect.github.com/tauri-apps/tauri/issues/11639">#11639</a>)</li>
<li><a
href="0fcef3f941"><code>0fcef3f</code></a>
docs: document vanilla JS import alternative (<a
href="https://redirect.github.com/tauri-apps/tauri/issues/11632">#11632</a>)</li>
<li><a
href="86f22f0ec9"><code>86f22f0</code></a>
apply version updates (<a
href="https://redirect.github.com/tauri-apps/tauri/issues/11440">#11440</a>)</li>
<li><a
href="3f6f07a1b8"><code>3f6f07a</code></a>
chore(deps): update <code>wry</code> to <code>0.47</code> and
<code>tao</code> to <code>0.30.6</code> (<a
href="https://redirect.github.com/tauri-apps/tauri/issues/11627">#11627</a>)</li>
<li><a
href="60e86d5f6e"><code>60e86d5</code></a>
fix(cli): <code>android dev</code> not working on Windows without
<code>--host</code> (<a
href="https://redirect.github.com/tauri-apps/tauri/issues/11624">#11624</a>)</li>
<li><a
href="b28435860c"><code>b284358</code></a>
chore(deps) Update Rust crate thiserror to v2 (dev) (<a
href="https://redirect.github.com/tauri-apps/tauri/issues/11604">#11604</a>)</li>
<li><a
href="229d7f8e22"><code>229d7f8</code></a>
fix(core): fix child webviews on macOS and Windows treated as full
webview wi...</li>
<li>Additional commits viewable in <a
href="https://github.com/tauri-apps/tauri/compare/@tauri-apps/api-v2.0.3...@tauri-apps/api-v2.1.1">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Context
The Gateway implements a stateful NAT that translates the destination IP
and source protocol of every packet that targets a DNS resource IP. This
is necessary because the IPs for DNS resources are generated on the
client without actually performing a DNS lookup, instead it always
generates 4 IPv4 and 4 IPv6 addresses. On the Gateway, these IPs are
then assigned in a round-robin fashion to the actual IPs that the domain
resolves to, necessitating a NAT64/46 translation in case a domain only
resolves to IPs of one family.
A domain may resolve to a set of IPs but not all of these IPs may be
routable. Whilst an arguably poor practise of the domain administrator,
routing problems can occur for all kinds of reasons and are well handled
on the wider Internet.
When an IP packet cannot be routed further, the current routing node
generates an ICMP error describing the routing failure and sends it back
to the original sender. ICMP is a layer 4 protocol itself, same as TCP
and UDP. As such, sending out a UDP packet may result in receiving an
ICMP response. In order to allow the sender to learn, which packet
failed to route, the ICMP error embeds parts of the original packet in
its payload [0] [1].
The Gateway's NAT table uses parts of the layer 4 protocol as part of
its key; the UDP and TCP source port and the ICMP echo request
identifier (further referred to as "source protocol"). An ICMP error
message doesn't have any of these, meaning the lookup in the NAT table
currently fails and the ICMP error is silently dropped.
A lot of software implements a happy-eyeballs approach and probs for
IPv6 and IPv4 connectivity simulataneously. The absence of the ICMP
errors confuses that algorithm as it detects the packet loss and starts
retransmits instead of giving up.
## Solution
Upon receiving an ICMP error on the Gateway, we now extract the
partially embedded packet in the ICMP error payload. We use the
destination IP and source protocol of _that_ packet for the lookup in
the NAT table. This returns us the original (client-assigned)
destination IP and source protocol. In order for the Gateway's NAT to be
transparent, we need to patch the packet embedded in the ICMP error to
use the original destination and source protocol. We also have to
account for the fact that the original packet may have been translated
with NAT64/46 and translate it back. Finally, we generate an ICMP error
with the appropriate code and embed the patched packet in its payload.
## Test implementation
To test that this works for all kind of combinations, we extend
`tunnel_test` to sample a list of unreachable IPs from all IPs sampled
for DNS resources. Upon receiving a packet for one of these IPs, the
Gateway will send an ICMP error back instead of invoking its regular
echo reply logic. On the client-side, upon receiving an ICMP error, we
extract the originally failed packet from the body and treat it as a
successful response.
This may seem a bit hacky at first but is actually how operating systems
would treat ICMP errors as well. For example, a `TcpSocket::connect`
call (triggering a TCP SYN packet) may fail with an IO error if we
receive an ICMP error packet. Thus, in a way, the original packet got
answered, just not with what we expected.
In addition, by treating these ICMP errors as responses to the original
packet, we automatically perform other assertions on them, like ensuring
that they come from the right IP address, that there are no unexpected
packets etc.
## Test alternatives
It is tricky to solve this in other ways in the test suite because at
the time of generating a packet for a DNS resource, we don't know the
actual IP that is being targeted by a certain proxy IP unless we'd start
reimplementing the round-robin algorithm employed by the Gateway. To
"test" the transparency of the NAT, we'd like to avoid knowing about
these implementation details in the test.
## Future work
In this PR, we currently only deal with "Destination Unreachable" ICMP
errors. There are other ICMP messages such as ICMPv6's `PacketTooBig` or
`ParameterProblem`. We should eventually handle these as well. They are
being deferred because translating those between the different IP
versions is only partially implemented and would thus require more work.
The most pressing need is to translate destination unreachable errors to
enable happy-eyeballs algorithms to work correctly.
Resolves: #5614.
Resolves: #6371.
[0]: https://www.rfc-editor.org/rfc/rfc792
[1]: https://www.rfc-editor.org/rfc/rfc4443#section-3.1
This PR intends to be a pure refactoring, i.e. no behaviour change. It
simplifies a few aspects of the GUI controller event-loop by getting rid
of the `select!` macro. We also remove some indirection of the
`gui_controller::Builder`.
This is another attempt at fixing #7386. Previous PR was #7379. The
difference is, this time it works! In the following screenshot,
`handle_input` is a currently active span.

I had to make some patches to Sentry, most notably:
- https://github.com/getsentry/sentry-rust/pull/708
- https://github.com/getsentry/sentry-rust/pull/712
The way we configure Sentry is quite tricky:
First and foremost, we need to understand that the `tracing` adapter for
Sentry has a `span_filter` configuration. When a span gets filtered out
there, the rest of `sentry-tracing` never sees the data in that span.
Thus, in order to capture variables from spans, we need to have a fairly
generous span filter. In this PR, we change this span filter to include
all spans except those on TRACE level.
Secondly, by default, the Sentry SDK doesn't send any spans to the
backend, i.e. the sampling rate is 0. Previously, we set the sampling
rate to 1.0 because the `span_filter` was already filtering out all
non-telemetry spans. A telemetry span is a concept that we invented. It
is a span that gets sampled at _creation_ time with a probability of 1%.
This is useful because creating a lot of spans is also expensive, so we
don't want to do it e.g. on a per-packet basis. With just these
configuration options, we now have a problem: We don't want to submit
all spans to Sentry but we need the `span_filter` to allow all spans
otherwise we can't capture the contextual fields from the span in
breadcrumbs. Luckily, the Sentry SDK has another configuration option:
`traces_sampler`.
The `traces_sampler` gets to compute a sampling rate for each individual
span. This allows us to discard all spans from being sent to Sentry
unless they are `telemetry` spans.
Resolves: #7386.
`MACOSX_DEPLOYMENT_TARGET` is a standard env var read by gcc and rustc
that determines which version of macOS to target binaries for. This
variable was being removed inadvertently in our rust build script.
Exposing this var fixes a slew of warnings about object files being
built for a newer macOS target than being linked that were showing up in
Xcode for some time now.
Hasn't caused any issues thus far from what I can tell, but possibly
related to #7442
## Context
At present, `connlib` sends UDP packets one at a time. Sending a packet
requires us to make a syscall which is quite expensive. Under load, i.e.
during a speedtest, syscalls account for over 50% of our CPU time [0].
In order to improve this situation, we need to somehow make use of GSO
(generic segmentation offload). With GSO, we can send multiple packets
to the same destination in a single syscall.
The tricky question here is, how can we achieve having multiple UDP
packets ready at once so we can send them in a single syscall? Our TUN
interface only feeds us packets one at a time and `connlib`'s state
machine is single-threaded. Additionally, we currently only have a
single `EncryptBuffer` in which the to-be-sent datagram sits.
## 1. Stack-allocating encrypted IP packets
As a first step, we get rid of the single `EncryptBuffer` and instead
stack-allocate each encrypted IP packet. Due to our small MTU, these
packets are only around 1300 bytes. Stack-allocating that requires a few
memcpy's but those are in the single-digit % range in the terms of CPU
time performance hit. That is nothing compared to how much time we are
spending on UDP syscalls. With the `EncryptBuffer` out the way, we can
now "freely" move around the `EncryptedPacket` structs and - technically
- we can have multiple of them at the same time.
## 2. Implementing GSO
The GSO interface allows you to pass multiple packets **of the same
length and for the same destination** in a single syscall, meaning we
cannot just batch-up arbitrary UDP packets. Counterintuitively, making
use of GSO requires us to do more copying: In particular, we change the
interface of `Io` such that "sending" a packet performs essentially a
lookup of a `BytesMut`-buffer by destination and packet length and
appends the payload to that packet.
## 3. Batch-read IP packets
In order to actually perform GSO, we need to process more than a single
IP packet in one event-loop tick. We achieve this by batch-reading up to
50 IP packets from the mpsc-channel that connects `connlib`'s main
event-loop with the dedicated thread that reads and writes to the TUN
device. These reads and writes happen concurrently to `connlib`'s packet
processing. Thus, it is likely that by the time `connlib` is ready to
process another IP packet, multiple have been read from the device and
are sitting in the channel. Batch-processing these IP packets means that
the buffers in our `GsoQueue` are more likely to contain more than a
single datagram.
Imagine you are running a file upload. The OS will send many packets to
the same destination IP and likely max MTU to the TUN device. It is
likely, that we read 10-20 of these packets in one batch (i.e. within a
single "tick" of the event-loop). All packets will be appended to the
same buffer in the `GsoQueue` and on the next event-loop tick, they will
all be flushed out in a single syscall.
## Results
Overall, this results in a significant reduction of syscalls for sending
UDP message. In [1], we spend only a total of 16% of our CPU time in
`udpv6_sendmsg` whereas in [0] (main), we spent a total of 34%. Do note
that these numbers are relative to the total CPU time spent per program
run and thus can't be compared directly (i.e. you cannot just do 34 - 16
and say we now spend 18% less time sending UDP packets). Nevertheless,
this appears to be a great improvement.
In terms of throughput, we achieve a ~60% improvement in our benchmark
suite. That one is running on localhost though so it might not
necessarily be reflect like that in a real network.
[0]: https://share.firefox.dev/4hvoPju
[1]: https://share.firefox.dev/4frhCPv
A client may have lost its state and therefore "probe" the relay whether
or not is still has an allocation. If it does, it will react to the
error, delete it and make a new one. This is no reason to print a
warning on the relay side.
Windows appears to sometimes send ICMP (error?) packets to our DNS
resolver IPs. There is nothing we can do with these but the current code
path logs them as a warning because we expect everything that isn't TCP
to be UDP.
With this patch, we slightly change the parsing logic to first attempt
extracting the UDP packet and fail only with a DEBUG log if it isn't.
Our `phoenix-channel` component is responsible for maintaining a
WebSocket connection to the portal. In case that connection fails, we
want to reconnect to it using an exponential backoff, eventually giving
up after a certain amount of time.
Unfortunately, the code we have today doesn't quite do that. An
`ExponentialBackoff` has a setting for the `max_elapsed_time`.
Regardless of how many and how often we retry something, we won't ever
wait longer than this amount of time. For the Relay, this is set to
15min. For other components its indefinite (Gateway, headless-client),
or very long (30 days for Android, 1 day for Apple).
The point in time from which this duration is counted is when the
`ExponentialBackoff` is **constructed** which translates to when we
**first** connected to the portal. As a result, our backoff would
immediately fail on the first error if it has been longer than
`max_elapsed_time` since we first connected. For most components, this
codepath is not relevant because the `max_elapsed_time` is so long. For
the Relay however, that is only 15 minutes so chances are, the Relay
would immediately fail (and get rebooted) on the first connection error
with the portal.
To fix this, we now lazily create the `ExponentialBackoff` on the first
error.
This bug has some interesting consequences: When a relay reboots, it
looses all its state, i.e. allocations, channel bindings, available
nonces etc, stamp-secret. Thus, all credentials and state that got
distributed to Clients and Gateways get invalidated, causing disconnects
from the Relay. We have observed these alerts in Sentry for a while and
couldn't explain them. Most likely, this is the root cause for those
because whilst a Relay disconnects, the portal also cannot detect its
presence and pro-actively inform Clients and Gateways to no longer use
this Relay.
Rust 1.83 comes with a bunch of new lints for elidible lifetimes. Those
also trigger in the generated code of `derivative`. That crate is
actually unmaintained so we replace our usages of it with `derive_more`.
With this PR, we reduce some of the warnings emitted by the relay. If we
can only partially fulfill an allocation, we now only emit a warning.
Similarly, if we receive a repeated SIGTERM signal, we shut down
successfully (i.e. exit with code 0) instead of failing the event-loop.
During normal operation, we wait for all allocations to expire before we
shut down. On CI however, the relay gets shutdown much earlier so this
would generate unnecessary errors.
Receiving another SIGTERM is a user-initiated action so we shouldn't
fail as a result but instead just comply with it.
Our relays aren't semver-versioned like other components. So for the
version reported to Sentry, we use the current Git SHA. This one is only
available as an ENV variable because we are building within a docker
container using `cargo cross`. By default, no env variables are passed
through to the container. To fix this, we need to add a configuration
file that explicitly opts-in to the necessary ENV variable.
We ignore some advisories of unmaintained crates flagged by `cargo
deny`. As long as the crates work for us, there is not much reason to
directly remove them, especially if it requires upstream effort. We will
get rid of these as they cause problems.
To avoid having to look up what they correspond to, we add a comment to
each line.
Bumps
[tauri-winrt-notification](https://github.com/tauri-apps/winrt-notification)
from 0.6.0 to 0.7.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/tauri-apps/winrt-notification/releases">tauri-winrt-notification's
releases</a>.</em></p>
<blockquote>
<h2>tauri-winrt-notification v0.7.0</h2>
<p>Updating crates.io index
Locking 25 packages to latest compatible versions
Adding quick-xml v0.31.0 (latest: v0.37.0)
Adding windows-strings v0.1.0 (latest: v0.2.0)</p>
<!-- raw HTML omitted -->
<pre><code>Fetching advisory database from
`https://github.com/RustSec/advisory-db.git`
Loaded 664 security advisories (from /home/runner/.cargo/advisory-db)
Updating crates.io index
Scanning Cargo.lock for vulnerabilities (25 crate dependencies)
</code></pre>
<!-- raw HTML omitted -->
<h2>[0.7.0]</h2>
<ul>
<li><a
href="987f44fe47"><code>987f44f</code></a>
(<a
href="https://redirect.github.com/tauri-apps/winrt-notification/pull/37">#37</a>
by <a
href="https://github.com/tauri-apps/winrt-notification/../../iKineticate"><code>@iKineticate</code></a>)
Added progress bar APIs, <code>Toast::progress</code> and
<code>Toast::set_progress</code></li>
</ul>
<!-- raw HTML omitted -->
<pre><code>Updating crates.io index
Packaging tauri-winrt-notification v0.7.0
(/home/runner/work/winrt-notification/winrt-notification)
Updating crates.io index
Packaged 31 files, 98.2KiB (44.3KiB compressed)
Uploading tauri-winrt-notification v0.7.0
(/home/runner/work/winrt-notification/winrt-notification)
Uploaded tauri-winrt-notification v0.7.0 to registry `crates-io`
note: waiting for `tauri-winrt-notification v0.7.0` to be available at
registry `crates-io`.
You may press ctrl-c to skip waiting; the crate should be available
shortly.
Published tauri-winrt-notification v0.7.0 at registry `crates-io`
</code></pre>
<!-- raw HTML omitted -->
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/tauri-apps/winrt-notification/blob/dev/CHANGELOG.md">tauri-winrt-notification's
changelog</a>.</em></p>
<blockquote>
<h2>[0.7.0]</h2>
<ul>
<li><a
href="987f44fe47"><code>987f44f</code></a>
(<a
href="https://redirect.github.com/tauri-apps/winrt-notification/pull/37">#37</a>
by <a
href="https://github.com/tauri-apps/winrt-notification/../../iKineticate"><code>@iKineticate</code></a>)
Added progress bar APIs, <code>Toast::progress</code> and
<code>Toast::set_progress</code></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="98d351ba24"><code>98d351b</code></a>
Publish New Versions (<a
href="https://redirect.github.com/tauri-apps/winrt-notification/issues/38">#38</a>)</li>
<li><a
href="987f44fe47"><code>987f44f</code></a>
feat: add progress bar support (<a
href="https://redirect.github.com/tauri-apps/winrt-notification/issues/37">#37</a>)</li>
<li>See full diff in <a
href="https://github.com/tauri-apps/winrt-notification/compare/tauri-winrt-notification-v0.6...tauri-winrt-notification-v0.7">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.132 to
1.0.133.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/serde-rs/json/releases">serde_json's
releases</a>.</em></p>
<blockquote>
<h2>v1.0.133</h2>
<ul>
<li>Implement From<[T; N]> for serde_json::Value (<a
href="https://redirect.github.com/serde-rs/json/issues/1215">#1215</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="0903de449c"><code>0903de4</code></a>
Release 1.0.133</li>
<li><a
href="2b65ca0949"><code>2b65ca0</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/json/issues/1215">#1215</a>
from dtolnay/fromarray</li>
<li><a
href="4e5f985958"><code>4e5f985</code></a>
Implement From<[T; N]> for Value</li>
<li><a
href="2ccb5b67ca"><code>2ccb5b6</code></a>
Disable question_mark clippy lint in lexical test</li>
<li><a
href="a11f5f2bc4"><code>a11f5f2</code></a>
Resolve unnecessary_map_or clippy lints</li>
<li><a
href="07f280a79c"><code>07f280a</code></a>
Wrap PR 1213 to 80 columns</li>
<li><a
href="75ed44722d"><code>75ed447</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/json/issues/1213">#1213</a>
from djmitche/safety-comment</li>
<li><a
href="73011c0b2b"><code>73011c0</code></a>
Add a safety comment to unsafe block</li>
<li><a
href="be2198a54d"><code>be2198a</code></a>
Prevent upload-artifact step from causing CI failure</li>
<li><a
href="7cce517f53"><code>7cce517</code></a>
Raise minimum version for preserve_order feature to Rust 1.65</li>
<li>Additional commits viewable in <a
href="https://github.com/serde-rs/json/compare/v1.0.132...v1.0.133">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
In addition to monitoring clients and gateways, it is also useful to
monitor relays in the same way. This gives us alerts on ERROR and WARN
messages logged by the relay as well as panics.