In order to release the new control protocol to users, we need to bump
the versions of the clients to 1.4.0. The portal has a version gate to
only select gateways with version >= 1.4.0 for clients >= 1.4.0. Thus,
bumping these versions can only happen once testing has completed and
the gateway has actually been released as 1.4.0.
Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
Building on top of the gateway PR (#6941), this PR transitions the
clients to the new control protocol. Clients are **not**
backwards-compatible with old gateways. As a result, a given customer
environment MUST have at least one gateway running the above PR in
order for clients to be able to establish connections.
With this transition, Clients send explicit events to Gateways whenever
they assign IPs to a DNS resource name. The actual assignment only
happens once and the IPs then remain stable for the duration of the
client session.
When the Gateway receives such an event, it will perform a DNS
resolution of the requested domain name and set up the NAT between the
assigned proxy IPs and the IPs the domain actually resolves to. In order
to support self-healing of any problems that happen during this process,
the client will send an "Assigned IPs" event every time it receives a
DNS query for a particular domain. This in turn will trigger another DNS
resolution on the Gateway. Effectively, this means that DNS queries for
DNS resources propagate to the Gateway, triggering a DNS resolution
there. If the domain resolves to the same set of IPs, no state is
changed, ensuring that existing connections are not interrupted.
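As a rough sketch, the Gateway-side handling could look like this (all
names here are illustrative, not connlib's actual API):

```rust
use std::collections::{BTreeSet, HashMap};
use std::net::IpAddr;

// Illustrative state: which real IPs a domain currently resolves to.
struct DnsResourceNat {
    resolved: HashMap<String, BTreeSet<IpAddr>>,
}

impl DnsResourceNat {
    // Called for every "Assigned IPs" event coming from a client.
    fn on_assigned_ips(
        &mut self,
        domain: String,
        proxy_ips: Vec<IpAddr>,
        resolve: impl Fn(&str) -> BTreeSet<IpAddr>, // fresh DNS resolution
    ) {
        let new_ips = resolve(&domain);

        // Same set of IPs: leave all state untouched so that existing
        // connections are not interrupted.
        if self.resolved.get(&domain) == Some(&new_ips) {
            return;
        }

        // Otherwise, re-map the stable proxy IPs onto the new real IPs.
        self.resolved.insert(domain, new_ips);
        let _ = proxy_ips; // actual NAT entry setup elided in this sketch
    }
}
```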
With this new functionality in place, we can delete the old logic around
detecting "expired" IPs. This is considered a bugfix because that logic
isn't currently working as intended: it has been observed multiple times
that the Gateway can get stuck in a loop, resolving the same domain over
and over again. The only theoretical "incompatibility" here is that
pre-1.4.0 clients won't be able to trigger DNS refreshes on a 1.4.2+
Gateway. However, by the time this PR merges, we expect all admins to
have already upgraded to a 1.4.0+ Gateway anyway, which already mandates
clients to be on 1.4.0+.
Resolves: #7391.
Resolves: #6828.
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.210 to
1.0.215.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/serde-rs/serde/releases">serde's
releases</a>.</em></p>
<blockquote>
<h2>v1.0.215</h2>
<ul>
<li>Produce warning when multiple fields or variants have the same
deserialization name (<a
href="https://redirect.github.com/serde-rs/serde/issues/2855">#2855</a>,
<a
href="https://redirect.github.com/serde-rs/serde/issues/2856">#2856</a>,
<a
href="https://redirect.github.com/serde-rs/serde/issues/2857">#2857</a>)</li>
</ul>
<h2>v1.0.214</h2>
<ul>
<li>Implement IntoDeserializer for all Deserializers in serde::de::value
module (<a
href="https://redirect.github.com/serde-rs/serde/issues/2568">#2568</a>,
thanks <a
href="https://github.com/Mingun"><code>@Mingun</code></a>)</li>
</ul>
<h2>v1.0.213</h2>
<ul>
<li>Fix support for macro-generated <code>with</code> attributes inside
a newtype struct (<a
href="https://redirect.github.com/serde-rs/serde/issues/2847">#2847</a>)</li>
</ul>
<h2>v1.0.212</h2>
<ul>
<li>Fix hygiene of macro-generated local variable accesses in
serde(with) wrappers (<a
href="https://redirect.github.com/serde-rs/serde/issues/2845">#2845</a>)</li>
</ul>
<h2>v1.0.211</h2>
<ul>
<li>Improve error reporting about mismatched signature in
<code>with</code> and <code>default</code> attributes (<a
href="https://redirect.github.com/serde-rs/serde/issues/2558">#2558</a>,
thanks <a
href="https://github.com/Mingun"><code>@Mingun</code></a>)</li>
<li>Show variant aliases in error message when variant deserialization
fails (<a
href="https://redirect.github.com/serde-rs/serde/issues/2566">#2566</a>,
thanks <a
href="https://github.com/Mingun"><code>@Mingun</code></a>)</li>
<li>Improve binary size of untagged enum and internally tagged enum
deserialization by about 12% (<a
href="https://redirect.github.com/serde-rs/serde/issues/2821">#2821</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="8939af48fe"><code>8939af4</code></a>
Release 1.0.215</li>
<li><a
href="fa5d58cd00"><code>fa5d58c</code></a>
Use ui test syntax that does not interfere with rustfmt</li>
<li><a
href="1a3cf4b3c1"><code>1a3cf4b</code></a>
Update PR 2562 ui tests</li>
<li><a
href="7d96352e96"><code>7d96352</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2857">#2857</a>
from dtolnay/collide</li>
<li><a
href="111ecc5d8c"><code>111ecc5</code></a>
Update ui tests for warning on colliding aliases</li>
<li><a
href="edd6fe954b"><code>edd6fe9</code></a>
Revert "Add checks for conflicts for aliases"</li>
<li><a
href="a20e9249c5"><code>a20e924</code></a>
Revert "pacify clippy"</li>
<li><a
href="b1353a99cd"><code>b1353a9</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2856">#2856</a>
from dtolnay/dename</li>
<li><a
href="c59e876bb3"><code>c59e876</code></a>
Produce a separate warning for every colliding name</li>
<li><a
href="7f1e697c0d"><code>7f1e697</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2855">#2855</a>
from dtolnay/namespan</li>
<li>Additional commits viewable in <a
href="https://github.com/serde-rs/serde/compare/v1.0.210...v1.0.215">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Fixes a similar issue as #7443 where we were deleting the
`IPHONEOS_DEPLOYMENT_TARGET` variable in our Rust build script, which
caused lots of warnings about object files being built for a different
OS version than the one being linked against.
At present, the GUI client shares the current log-directives with the
IPC service via the file system. Supposedly, this has been done to allow
the IPC service to start back up with the same log filter as before.
This behaviour appears to be buggy though as we are receiving a fair
number of error reports where this file is not writable.
Instead of relying on the file system to communicate, we send the
current log-directives to the IPC service as soon as we start up. The
IPC service then uses the file system as a cache for that log string and
re-applies it on the next startup. This way, no two programs need to read
/ write the same file. The IPC service runs with higher privileges, so
this should resolve the permission errors we are seeing in Sentry.
Bumps [@tauri-apps/api](https://github.com/tauri-apps/tauri) from 2.0.3
to 2.1.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/tauri-apps/tauri/releases"><code>@tauri-apps/api</code>'s
releases</a>.</em></p>
<blockquote>
<h2><code>@tauri-apps/api</code> v2.1.1</h2>
<!-- raw HTML omitted -->
<pre><code>No known vulnerabilities found
</code></pre>
<!-- raw HTML omitted -->
<h2>[2.1.1]</h2>
<h3>Bug Fixes</h3>
<ul>
<li><a
href="7f81f05236"><code>5e9435487</code></a>
(<a
href="https://redirect.github.com/tauri-apps/tauri/pull/11639">#11645</a>
by <a href="https://github.com/dgerhardt">dgerhardt</a>) Fix regression
in <code>toLogical</code> and <code>toPhysical</code> for position types
in <code>dpi</code> module returning incorrect <code>y</code>
value.</li>
<li><a
href="e8a50f6d76"><code>e8a50f6d7</code></a>
(<a
href="https://redirect.github.com/tauri-apps/tauri/pull/11645">#11645</a>)
Fix integer values of <code>BaseDirectory.Home</code> and
<code>BaseDirectory.Font</code> regression which broke path APIs in
JS.</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="ef2592b5a8"><code>ef2592b</code></a>
Apply Version Updates From Current Changes (<a
href="https://redirect.github.com/tauri-apps/tauri/issues/11646">#11646</a>)</li>
<li><a
href="7f81f05236"><code>7f81f05</code></a>
chore: rename change file</li>
<li><a
href="e8a50f6d76"><code>e8a50f6</code></a>
fix(core): hard code <code>BaseDirectory</code> integer values to avoid
regressions when...</li>
<li><a
href="5e94354875"><code>5e94354</code></a>
fix(api/dpi): fix toLogical and toPhysical for positions (<a
href="https://redirect.github.com/tauri-apps/tauri/issues/11639">#11639</a>)</li>
<li><a
href="0fcef3f941"><code>0fcef3f</code></a>
docs: document vanilla JS import alternative (<a
href="https://redirect.github.com/tauri-apps/tauri/issues/11632">#11632</a>)</li>
<li><a
href="86f22f0ec9"><code>86f22f0</code></a>
apply version updates (<a
href="https://redirect.github.com/tauri-apps/tauri/issues/11440">#11440</a>)</li>
<li><a
href="3f6f07a1b8"><code>3f6f07a</code></a>
chore(deps): update <code>wry</code> to <code>0.47</code> and
<code>tao</code> to <code>0.30.6</code> (<a
href="https://redirect.github.com/tauri-apps/tauri/issues/11627">#11627</a>)</li>
<li><a
href="60e86d5f6e"><code>60e86d5</code></a>
fix(cli): <code>android dev</code> not working on Windows without
<code>--host</code> (<a
href="https://redirect.github.com/tauri-apps/tauri/issues/11624">#11624</a>)</li>
<li><a
href="b28435860c"><code>b284358</code></a>
chore(deps) Update Rust crate thiserror to v2 (dev) (<a
href="https://redirect.github.com/tauri-apps/tauri/issues/11604">#11604</a>)</li>
<li><a
href="229d7f8e22"><code>229d7f8</code></a>
fix(core): fix child webviews on macOS and Windows treated as full
webview wi...</li>
<li>Additional commits viewable in <a
href="https://github.com/tauri-apps/tauri/compare/@tauri-apps/api-v2.0.3...@tauri-apps/api-v2.1.1">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Context
The Gateway implements a stateful NAT that translates the destination IP
and source protocol of every packet that targets a DNS resource IP. This
is necessary because the IPs for DNS resources are generated on the
client without actually performing a DNS lookup; instead, the client
always generates 4 IPv4 and 4 IPv6 addresses. On the Gateway, these IPs are
then assigned in a round-robin fashion to the actual IPs that the domain
resolves to, necessitating a NAT64/46 translation in case a domain only
resolves to IPs of one family.
A domain may resolve to a set of IPs, yet not all of those IPs may be
routable. Whilst this is arguably poor practice on the part of the domain
administrator, routing problems can occur for all kinds of reasons and
are handled well on the wider Internet.
When an IP packet cannot be routed further, the current routing node
generates an ICMP error describing the routing failure and sends it back
to the original sender. ICMP is a layer 4 protocol itself, same as TCP
and UDP. As such, sending out a UDP packet may result in receiving an
ICMP response. In order to allow the sender to learn which packet
failed to route, the ICMP error embeds parts of the original packet in
its payload [0] [1].
The Gateway's NAT table uses parts of the layer 4 protocol as part of
its key; the UDP and TCP source port and the ICMP echo request
identifier (further referred to as "source protocol"). An ICMP error
message doesn't have any of these, meaning the lookup in the NAT table
currently fails and the ICMP error is silently dropped.
A lot of software implements a happy-eyeballs approach and probes for
IPv6 and IPv4 connectivity simultaneously. The absence of the ICMP
errors confuses that algorithm: it detects the packet loss and starts
retransmitting instead of giving up.
## Solution
Upon receiving an ICMP error on the Gateway, we now extract the
partially embedded packet in the ICMP error payload. We use the
destination IP and source protocol of _that_ packet for the lookup in
the NAT table. This returns us the original (client-assigned)
destination IP and source protocol. In order for the Gateway's NAT to be
transparent, we need to patch the packet embedded in the ICMP error to
use the original destination and source protocol. We also have to
account for the fact that the original packet may have been translated
with NAT64/46 and translate it back. Finally, we generate an ICMP error
with the appropriate code and embed the patched packet in its payload.
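In pseudo-Rust, the lookup at the heart of this is roughly the
following (the types and helper are made up for illustration; patching
the embedded packet and reversing NAT64/46 are elided):

```rust
use std::collections::HashMap;
use std::net::IpAddr;

// Illustrative NAT key: destination IP plus "source protocol" (UDP/TCP
// source port or ICMP echo identifier).
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct NatKey {
    dst: IpAddr,
    source_protocol: u16,
}

// The *embedded* packet, not the outer ICMP message, drives the lookup.
fn translate_icmp_error(
    nat: &HashMap<NatKey, NatKey>,
    embedded: NatKey, // parsed from the ICMP error's payload
) -> Option<NatKey> {
    // Returns the original (client-assigned) destination IP and source
    // protocol; the caller patches the embedded packet with these values
    // and wraps it in a freshly generated ICMP error.
    nat.get(&embedded).copied()
}
```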
## Test implementation
To test that this works for all kinds of combinations, we extend
`tunnel_test` to sample a list of unreachable IPs from all IPs sampled
for DNS resources. Upon receiving a packet for one of these IPs, the
Gateway will send an ICMP error back instead of invoking its regular
echo reply logic. On the client-side, upon receiving an ICMP error, we
extract the originally failed packet from the body and treat it as a
successful response.
This may seem a bit hacky at first but is actually how operating systems
would treat ICMP errors as well. For example, a `TcpSocket::connect`
call (triggering a TCP SYN packet) may fail with an IO error if we
receive an ICMP error packet. Thus, in a way, the original packet got
answered, just not with what we expected.
In addition, by treating these ICMP errors as responses to the original
packet, we automatically perform other assertions on them, like ensuring
that they come from the right IP address, that there are no unexpected
packets etc.
## Test alternatives
It is tricky to solve this in other ways in the test suite because, at
the time of generating a packet for a DNS resource, we don't know the
actual IP that is being targeted by a certain proxy IP unless we were to
reimplement the round-robin algorithm employed by the Gateway. To
"test" the transparency of the NAT, we'd like to avoid knowing about
these implementation details in the test.
## Future work
In this PR, we currently only deal with "Destination Unreachable" ICMP
errors. There are other ICMP messages such as ICMPv6's `PacketTooBig` or
`ParameterProblem`. We should eventually handle these as well. They are
being deferred because translating those between the different IP
versions is only partially implemented and would thus require more work.
The most pressing need is to translate destination unreachable errors to
enable happy-eyeballs algorithms to work correctly.
Resolves: #5614.
Resolves: #6371.
[0]: https://www.rfc-editor.org/rfc/rfc792
[1]: https://www.rfc-editor.org/rfc/rfc4443#section-3.1
This PR intends to be a pure refactoring, i.e. no behaviour change. It
simplifies a few aspects of the GUI controller event-loop by getting rid
of the `select!` macro. We also remove some indirection of the
`gui_controller::Builder`.
This is another attempt at fixing #7386. The previous PR was #7379. The
difference is, this time it works: `handle_input` now shows up as a
currently active span.
I had to make some patches to Sentry, most notably:
- https://github.com/getsentry/sentry-rust/pull/708
- https://github.com/getsentry/sentry-rust/pull/712
The way we configure Sentry is quite tricky:
First and foremost, we need to understand that the `tracing` adapter for
Sentry has a `span_filter` configuration. When a span gets filtered out
there, the rest of `sentry-tracing` never sees the data in that span.
Thus, in order to capture variables from spans, we need to have a fairly
generous span filter. In this PR, we change this span filter to include
all spans except those on TRACE level.
Secondly, by default, the Sentry SDK doesn't send any spans to the
backend, i.e. the sampling rate is 0. Previously, we set the sampling
rate to 1.0 because the `span_filter` was already filtering out all
non-telemetry spans. A telemetry span is a concept that we invented. It
is a span that gets sampled at _creation_ time with a probability of 1%.
This is useful because creating a lot of spans is also expensive, so we
don't want to do it e.g. on a per-packet basis. With just these
configuration options, we now have a problem: we don't want to submit
all spans to Sentry, but we need the `span_filter` to allow all spans,
otherwise we can't capture the contextual fields from the span in
breadcrumbs. Luckily, the Sentry SDK has another configuration option:
`traces_sampler`.
The `traces_sampler` gets to compute a sampling rate for each individual
span. This allows us to discard all spans from being sent to Sentry
unless they are `telemetry` spans.
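Putting it together, the configuration looks roughly like this sketch
(the "telemetry" span name is our own convention, not Sentry API):

```rust
use std::sync::Arc;
use tracing::Level;

// Only spans named "telemetry" are submitted; everything else is sampled
// away but still visible to `sentry-tracing` for breadcrumbs.
let _guard = sentry::init(sentry::ClientOptions {
    traces_sampler: Some(Arc::new(|ctx: &sentry::TransactionContext| {
        if ctx.name() == "telemetry" {
            1.0 // already sampled at creation time with 1% probability
        } else {
            0.0 // never sent to Sentry
        }
    })),
    ..Default::default()
});

// Generous span filter: keep everything except TRACE-level spans so
// their fields remain available as breadcrumb context.
let sentry_layer = sentry_tracing::layer()
    .span_filter(|meta| *meta.level() != Level::TRACE);
```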
Resolves: #7386.
`MACOSX_DEPLOYMENT_TARGET` is a standard env var read by gcc and rustc
that determines which version of macOS to target binaries for. This
variable was being removed inadvertently in our rust build script.
Exposing this var fixes a slew of warnings, which had been showing up in
Xcode for some time, about object files being built for a newer macOS
target than the one being linked against.
This hasn't caused any issues thus far from what I can tell, but it is
possibly related to #7442.
## Context
At present, `connlib` sends UDP packets one at a time. Sending a packet
requires us to make a syscall, which is quite expensive. Under load, i.e.
during a speedtest, syscalls account for over 50% of our CPU time [0].
In order to improve this situation, we need to somehow make use of GSO
(generic segmentation offload). With GSO, we can send multiple packets
to the same destination in a single syscall.
The tricky question here is: how can we have multiple UDP packets ready
at once so we can send them in a single syscall? Our TUN
interface only feeds us packets one at a time and `connlib`'s state
machine is single-threaded. Additionally, we currently only have a
single `EncryptBuffer` in which the to-be-sent datagram sits.
## 1. Stack-allocating encrypted IP packets
As a first step, we get rid of the single `EncryptBuffer` and instead
stack-allocate each encrypted IP packet. Due to our small MTU, these
packets are only around 1300 bytes. Stack-allocating them requires a few
memcpy's, but the performance hit of those is in the single-digit
percentage range of CPU time. That is nothing compared to how much time we are
spending on UDP syscalls. With the `EncryptBuffer` out of the way, we can
now "freely" move around the `EncryptedPacket` structs and - technically
- we can have multiple of them at the same time.
## 2. Implementing GSO
The GSO interface allows you to pass multiple packets **of the same
length and for the same destination** in a single syscall, meaning we
cannot just batch-up arbitrary UDP packets. Counterintuitively, making
use of GSO requires us to do more copying: In particular, we change the
interface of `Io` such that "sending" a packet performs essentially a
lookup of a `BytesMut`-buffer by destination and packet length and
appends the payload to that packet.
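A minimal sketch of that idea (names and types are illustrative, not
connlib's actual `Io` interface):

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

use bytes::BytesMut;

// Datagrams for the same destination and of the same length share one
// buffer; the whole buffer is later flushed in a single GSO sendmsg call.
#[derive(Default)]
struct GsoQueue {
    buffers: HashMap<(SocketAddr, usize), BytesMut>,
}

impl GsoQueue {
    fn enqueue(&mut self, dst: SocketAddr, payload: &[u8]) {
        self.buffers
            .entry((dst, payload.len()))
            .or_default()
            .extend_from_slice(payload); // an extra copy, but it saves a syscall
    }
}
```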
## 3. Batch-read IP packets
In order to actually perform GSO, we need to process more than a single
IP packet in one event-loop tick. We achieve this by batch-reading up to
50 IP packets from the mpsc-channel that connects `connlib`'s main
event-loop with the dedicated thread that reads and writes to the TUN
device. These reads and writes happen concurrently to `connlib`'s packet
processing. Thus, it is likely that by the time `connlib` is ready to
process another IP packet, multiple have been read from the device and
are sitting in the channel. Batch-processing these IP packets means that
the buffers in our `GsoQueue` are more likely to contain more than a
single datagram.
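Sketched out, one event-loop tick then looks roughly like this (channel
and handler names are hypothetical):

```rust
// Pull up to 50 packets that the TUN reader thread has already queued.
let mut batch: Vec<IpPacket> = Vec::with_capacity(50);
let n = tun_receiver.recv_many(&mut batch, 50).await;

for packet in batch.drain(..n) {
    // Each processed packet appends its encrypted datagram to the
    // `GsoQueue`, keyed by destination and length.
    handle_tun_input(packet);
}

// One flush per tick: all same-destination, same-size datagrams leave
// in a single GSO syscall.
gso_queue.flush();
```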
Imagine you are running a file upload. The OS will send many packets to
the same destination IP, likely at max MTU, to the TUN device. It is
likely that we read 10-20 of these packets in one batch (i.e. within a
single "tick" of the event-loop). All packets will be appended to the
same buffer in the `GsoQueue` and on the next event-loop tick, they will
all be flushed out in a single syscall.
## Results
Overall, this results in a significant reduction of syscalls for sending
UDP messages. In [1], we spend only a total of 16% of our CPU time in
`udpv6_sendmsg` whereas in [0] (main), we spent a total of 34%. Do note
that these numbers are relative to the total CPU time spent per program
run and thus can't be compared directly (i.e. you cannot just do 34 - 16
and say we now spend 18% less time sending UDP packets). Nevertheless,
this appears to be a great improvement.
In terms of throughput, we achieve a ~60% improvement in our benchmark
suite. That benchmark runs on localhost though, so it might not
necessarily be reflected like that in a real network.
[0]: https://share.firefox.dev/4hvoPju
[1]: https://share.firefox.dev/4frhCPv
A client may have lost its state and therefore "probe" the relay to see
whether or not it still has an allocation. If it does, it will react to
the error, delete the allocation and make a new one. This is no reason to print a
warning on the relay side.
Windows appears to sometimes send ICMP (error?) packets to our DNS
resolver IPs. There is nothing we can do with these, but the current code
path logs them as a warning because we expect everything that isn't TCP
to be UDP.
With this patch, we slightly change the parsing logic to first attempt
extracting the UDP packet and fail only with a DEBUG log if it isn't.
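The reordered logic is roughly the following (the helper names are
hypothetical, not the actual connlib functions):

```rust
// Try UDP first; anything else (e.g. Windows' stray ICMP packets) is
// logged at DEBUG instead of WARN.
fn handle_packet_to_dns_resolver(packet: IpPacket) {
    if let Some(tcp) = packet.as_tcp() {
        handle_tcp_dns_query(tcp);
        return;
    }

    match packet.as_udp() {
        Some(udp) => handle_udp_dns_query(udp),
        None => tracing::debug!("Unexpected non-UDP packet to DNS resolver"),
    }
}
```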
Our `phoenix-channel` component is responsible for maintaining a
WebSocket connection to the portal. In case that connection fails, we
want to reconnect to it using an exponential backoff, eventually giving
up after a certain amount of time.
Unfortunately, the code we have today doesn't quite do that. An
`ExponentialBackoff` has a setting for `max_elapsed_time`: regardless of
how many times and how often we retry something, we won't keep retrying
for longer than this total amount of time. For the Relay, this is set to
15 minutes. For other components it is indefinite (Gateway,
headless-client) or very long (30 days for Android, 1 day for Apple).
The point in time from which this duration is counted is when the
`ExponentialBackoff` is **constructed**, which translates to when we
**first** connected to the portal. As a result, our backoff would
immediately fail on the first error if it has been longer than
`max_elapsed_time` since we first connected. For most components, this
codepath is not relevant because the `max_elapsed_time` is so long. For
the Relay however, that is only 15 minutes, so chances are the Relay
would immediately fail (and get rebooted) on the first connection error
with the portal.
To fix this, we now lazily create the `ExponentialBackoff` on the first
error.
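A sketch of the fix, using the `backoff` crate's API (the surrounding
struct is illustrative):

```rust
use std::time::Duration;

use backoff::backoff::Backoff;
use backoff::ExponentialBackoff;

struct Reconnect {
    backoff: Option<ExponentialBackoff>,
}

impl Reconnect {
    fn next_delay(&mut self) -> Option<Duration> {
        self.backoff
            // Constructing the backoff here starts the `max_elapsed_time`
            // clock at the first error, not at the first connect.
            .get_or_insert_with(ExponentialBackoff::default)
            .next_backoff() // `None` once `max_elapsed_time` is exceeded
    }

    fn on_connected(&mut self) {
        self.backoff = None; // reset after a successful (re)connect
    }
}
```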
This bug has some interesting consequences: When a relay reboots, it
loses all its state, i.e. allocations, channel bindings, available
nonces, stamp-secret, etc. Thus, all credentials and state that got
distributed to Clients and Gateways get invalidated, causing disconnects
from the Relay. We have observed these alerts in Sentry for a while and
couldn't explain them. Most likely, this is the root cause for those
alerts: whilst a Relay is disconnected, the portal also cannot detect its
presence and pro-actively inform Clients and Gateways to no longer use
this Relay.
Rust 1.83 comes with a bunch of new lints for elidable lifetimes. Those
also trigger in the generated code of `derivative`. That crate is
actually unmaintained so we replace our usages of it with `derive_more`.
With this PR, we reduce some of the warnings emitted by the relay. If we
can only partially fulfill an allocation, we now only emit a warning.
Similarly, if we receive a repeated SIGTERM signal, we shut down
successfully (i.e. exit with code 0) instead of failing the event-loop.
During normal operation, we wait for all allocations to expire before we
shut down. On CI however, the relay gets shut down much earlier, so this
would generate unnecessary errors.
Receiving another SIGTERM is a user-initiated action so we shouldn't
fail as a result but instead just comply with it.
Our relays aren't semver-versioned like other components. So for the
version reported to Sentry, we use the current Git SHA. The SHA is only
available as an ENV variable because we are building within a Docker
container using `cargo cross`. By default, no ENV variables are passed
through to the container. To fix this, we need to add a configuration
file that explicitly opts in to the necessary ENV variable.
We ignore some advisories of unmaintained crates flagged by `cargo
deny`. As long as the crates work for us, there is not much reason to
directly remove them, especially if it requires upstream effort. We will
get rid of these as they cause problems.
To avoid having to look up what they correspond to, we add a comment to
each line.
Bumps
[tauri-winrt-notification](https://github.com/tauri-apps/winrt-notification)
from 0.6.0 to 0.7.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/tauri-apps/winrt-notification/releases">tauri-winrt-notification's
releases</a>.</em></p>
<blockquote>
<h2>tauri-winrt-notification v0.7.0</h2>
<p>Updating crates.io index
Locking 25 packages to latest compatible versions
Adding quick-xml v0.31.0 (latest: v0.37.0)
Adding windows-strings v0.1.0 (latest: v0.2.0)</p>
<!-- raw HTML omitted -->
<pre><code>Fetching advisory database from
`https://github.com/RustSec/advisory-db.git`
Loaded 664 security advisories (from /home/runner/.cargo/advisory-db)
Updating crates.io index
Scanning Cargo.lock for vulnerabilities (25 crate dependencies)
</code></pre>
<!-- raw HTML omitted -->
<h2>[0.7.0]</h2>
<ul>
<li><a
href="987f44fe47"><code>987f44f</code></a>
(<a
href="https://redirect.github.com/tauri-apps/winrt-notification/pull/37">#37</a>
by <a
href="https://github.com/tauri-apps/winrt-notification/../../iKineticate"><code>@iKineticate</code></a>)
Added progress bar APIs, <code>Toast::progress</code> and
<code>Toast::set_progress</code></li>
</ul>
<!-- raw HTML omitted -->
<pre><code>Updating crates.io index
Packaging tauri-winrt-notification v0.7.0
(/home/runner/work/winrt-notification/winrt-notification)
Updating crates.io index
Packaged 31 files, 98.2KiB (44.3KiB compressed)
Uploading tauri-winrt-notification v0.7.0
(/home/runner/work/winrt-notification/winrt-notification)
Uploaded tauri-winrt-notification v0.7.0 to registry `crates-io`
note: waiting for `tauri-winrt-notification v0.7.0` to be available at
registry `crates-io`.
You may press ctrl-c to skip waiting; the crate should be available
shortly.
Published tauri-winrt-notification v0.7.0 at registry `crates-io`
</code></pre>
<!-- raw HTML omitted -->
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/tauri-apps/winrt-notification/blob/dev/CHANGELOG.md">tauri-winrt-notification's
changelog</a>.</em></p>
<blockquote>
<h2>[0.7.0]</h2>
<ul>
<li><a
href="987f44fe47"><code>987f44f</code></a>
(<a
href="https://redirect.github.com/tauri-apps/winrt-notification/pull/37">#37</a>
by <a
href="https://github.com/tauri-apps/winrt-notification/../../iKineticate"><code>@iKineticate</code></a>)
Added progress bar APIs, <code>Toast::progress</code> and
<code>Toast::set_progress</code></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="98d351ba24"><code>98d351b</code></a>
Publish New Versions (<a
href="https://redirect.github.com/tauri-apps/winrt-notification/issues/38">#38</a>)</li>
<li><a
href="987f44fe47"><code>987f44f</code></a>
feat: add progress bar support (<a
href="https://redirect.github.com/tauri-apps/winrt-notification/issues/37">#37</a>)</li>
<li>See full diff in <a
href="https://github.com/tauri-apps/winrt-notification/compare/tauri-winrt-notification-v0.6...tauri-winrt-notification-v0.7">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.132 to
1.0.133.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/serde-rs/json/releases">serde_json's
releases</a>.</em></p>
<blockquote>
<h2>v1.0.133</h2>
<ul>
<li>Implement <code>From&lt;[T; N]&gt;</code> for <code>serde_json::Value</code> (<a
href="https://redirect.github.com/serde-rs/json/issues/1215">#1215</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="0903de449c"><code>0903de4</code></a>
Release 1.0.133</li>
<li><a
href="2b65ca0949"><code>2b65ca0</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/json/issues/1215">#1215</a>
from dtolnay/fromarray</li>
<li><a
href="4e5f985958"><code>4e5f985</code></a>
Implement From<[T; N]> for Value</li>
<li><a
href="2ccb5b67ca"><code>2ccb5b6</code></a>
Disable question_mark clippy lint in lexical test</li>
<li><a
href="a11f5f2bc4"><code>a11f5f2</code></a>
Resolve unnecessary_map_or clippy lints</li>
<li><a
href="07f280a79c"><code>07f280a</code></a>
Wrap PR 1213 to 80 columns</li>
<li><a
href="75ed44722d"><code>75ed447</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/json/issues/1213">#1213</a>
from djmitche/safety-comment</li>
<li><a
href="73011c0b2b"><code>73011c0</code></a>
Add a safety comment to unsafe block</li>
<li><a
href="be2198a54d"><code>be2198a</code></a>
Prevent upload-artifact step from causing CI failure</li>
<li><a
href="7cce517f53"><code>7cce517</code></a>
Raise minimum version for preserve_order feature to Rust 1.65</li>
<li>Additional commits viewable in <a
href="https://github.com/serde-rs/json/compare/v1.0.132...v1.0.133">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
In addition to monitoring clients and gateways, it is also useful to
monitor relays in the same way. This gives us alerts on ERROR and WARN
messages logged by the relay as well as panics.
By removing the `#[tokio::main]` macro, we can ensure that telemetry is
initialised as early as possible.
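A minimal sketch of the pattern (the `init_telemetry` and `run`
functions are placeholders):

```rust
fn main() -> anyhow::Result<()> {
    init_telemetry(); // runs before the async runtime even exists

    tokio::runtime::Builder::new_multi_thread()
        .enable_all()
        .build()?
        .block_on(run())
}

fn init_telemetry() {
    // placeholder for the actual telemetry / Sentry initialisation
}

async fn run() -> anyhow::Result<()> {
    // the actual event-loop lives here
    Ok(())
}
```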
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
When `proptest` discovers a test failure, it will attempt to "shrink"
the input to identify what exactly causes the issue. How this is done
depends on the data type but mostly uses techniques such as binary
search to be efficient. Not every input within our tests is relevant for
a failure. For example, which ID we have sampled for a client or a
gateway doesn't at all affect whether or not the test will fail.
`proptest` doesn't know that though, so it will still happily spend
time and cycles on shrinking, figuring out the minimal difference in IDs
(which is 1, because they have to be different). This is a huge waste of
time for no benefit. We are getting much more value out of `proptest` if
it tries to adjust other things such as the transitions involved in a
test, how many gateways and relays there are etc.
By marking the strategies for the IDs and private keys with `no_shrink`,
we can achieve that.
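For illustration, an ID strategy could be marked like this (the concrete
strategy is made up, only `no_shrink` is the point):

```rust
use proptest::prelude::*;
use uuid::Uuid;

// The concrete value of an ID never decides whether a test fails, so
// shrinking it is wasted effort.
fn client_id() -> impl Strategy<Value = Uuid> {
    any::<u128>().prop_map(Uuid::from_u128).no_shrink()
}
```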
When generating random input data in property-based tests, we have to
ensure that the data conforms to certain criteria. For example, IP
addresses must not be multicast or unspecified addresses and they must
not be within our reserved IP ranges.
Currently, we ensure this using "filtering", which is a pretty poor
technique [0]: on very big test-runs (i.e. > 30000 test cases), too many
local rejections lead to the test suite being aborted early. To improve
on this, we refactor the generation of IPs to automatically exclude all
IPs within certain ranges.
[0]:
https://proptest-rs.github.io/proptest/proptest/tutorial/filtering.html
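To illustrate the difference under assumed reserved ranges (the concrete
ranges here are examples, not our actual ones):

```rust
use std::net::Ipv4Addr;

use proptest::prelude::*;

// Filtering rejects samples after the fact; on large runs, too many
// local rejections abort the test suite.
fn host_ip_filtered() -> impl Strategy<Value = Ipv4Addr> {
    any::<u32>()
        .prop_map(Ipv4Addr::from)
        .prop_filter("reserved", |ip| !ip.is_multicast() && !ip.is_unspecified())
}

// Generating by construction never rejects: sample an offset into a
// range that is valid to begin with (10.0.0.0/8 is just an example).
fn host_ip_by_construction() -> impl Strategy<Value = Ipv4Addr> {
    (0u32..=0x00FF_FFFF).prop_map(|host| Ipv4Addr::from(0x0A00_0000 | host))
}
```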
One of Rust's promises is "if it compiles, it works". However, there are
certain situations in which this isn't true. In particular, when using
dynamic typing patterns where trait objects are downcast to concrete
types, having two versions of the same dependency can silently break
things.
This happened in #7379 where I forgot to patch a certain Sentry
dependency. A similar problem exists with our `tracing-stackdriver`
dependency (see #7241).
Lastly, duplicate dependencies increase the compile-times of a project,
so we should aim for having as few duplicate versions of a particular
dependency as possible in our dependency graph.
This PR introduces `cargo deny`, a linter for Rust dependencies. In
addition to linting for duplicate dependencies, it also enforces that
all dependencies are compatible with an allow-list of licenses and it
warns when a dependency is referred to from multiple crates without
introducing a workspace dependency. Thanks to existing tooling
(https://github.com/mainmatter/cargo-autoinherit), transitioning all
dependencies to workspace dependencies was quite easy.
Resolves: #7241.
This switches our `sentry-tracing` dependency to a fork that includes
https://github.com/getsentry/sentry-rust/pull/708. Recording our span
fields with breadcrumbs is important to provide accurate context of the
message. Without the span fields, the messages give us a lot less
information.
Since the last release, the open issue on `flush` having a flipped
return value got fixed as well.
In order to avoid processing relay responses that somehow got
altered on the network path, we now use the client's `password` as a
shared secret for the relay to also authenticate its responses. This
means that not all message can be authenticated. In particular, BINDING
requests will still be unauthenticated.
Performing this validation now requires every component that crafts
input to the `Allocation` to include a valid `MessageIntegrity`
attribute. This is somewhat problematic for the regression tests of the
relay and the unit tests of `Allocation`. In both cases, we implement
workarounds so we don't have to actually compute a valid
`MessageIntegrity`. This is deemed acceptable because:
- Both of these are just tests.
- We do test the validation path using `tunnel_test` because there we
run an actual relay.
When debugging issues related to our TURN allocation code, we sometimes
only have the logs that this code submitted to Sentry. As part of the event,
we submit the last 500 debug logs as breadcrumbs to give more context to
the error.
Unconditionally printing the attributes of each request-response pair
will help us more easily diagnose why certain errors happen.
Bumps [clap](https://github.com/clap-rs/clap) from 4.5.20 to 4.5.21.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/clap-rs/clap/releases">clap's
releases</a>.</em></p>
<blockquote>
<h2>v4.5.21</h2>
<h2>[4.5.21] - 2024-11-13</h2>
<h3>Fixes</h3>
<ul>
<li><em>(parser)</em> Ensure defaults are filled in on error with
<code>ignore_errors(true)</code></li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/clap-rs/clap/blob/master/CHANGELOG.md">clap's
changelog</a>.</em></p>
<blockquote>
<h2>[4.5.21] - 2024-11-13</h2>
<h3>Fixes</h3>
<ul>
<li><em>(parser)</em> Ensure defaults are filled in on error with
<code>ignore_errors(true)</code></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="03d722625a"><code>03d7226</code></a>
chore: Release</li>
<li><a
href="3df70fb2b6"><code>3df70fb</code></a>
docs: Update changelog</li>
<li><a
href="3266c36abf"><code>3266c36</code></a>
Merge pull request <a
href="https://redirect.github.com/clap-rs/clap/issues/5691">#5691</a>
from epage/custom</li>
<li><a
href="951762db57"><code>951762d</code></a>
feat(complete): Allow any OsString-compatible type to be a
CompletionCandidate</li>
<li><a
href="bb6493e890"><code>bb6493e</code></a>
feat(complete): Offer - as a path option</li>
<li><a
href="27b348dbcb"><code>27b348d</code></a>
refactor(complete): Simplify ArgValueCandidates code</li>
<li><a
href="49b8108f8c"><code>49b8108</code></a>
feat(complete): Add PathCompleter</li>
<li><a
href="82a360aa54"><code>82a360a</code></a>
feat(complete): Add ArgValueCompleter</li>
<li><a
href="47aedc6906"><code>47aedc6</code></a>
fix(complete): Ensure paths are sorted</li>
<li><a
href="431e2bc931"><code>431e2bc</code></a>
test(complete): Ensure ArgValueCandidates get filtered</li>
<li>Additional commits viewable in <a
href="https://github.com/clap-rs/clap/compare/clap_complete-v4.5.20...clap_complete-v4.5.21">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
In the latest version, we added a warning log to str0m when the maximum
number of candidate pairs is exceeded:
https://github.com/algesten/str0m/pull/587.
We only ever add the candidates of a single relay to an agent (2
candidates), plus at most 2 server-reflexive candidates and at most 2
host candidates. Unless there is a bug like what we fixed in #7334,
exceeding the default number of candidate _pairs_ (100) should never
happen.
In case it does, the newly added `warn` log in `str0m` will trigger a
Sentry alert.
It was already a bit suspicious that we didn't receive as many errors in Sentry
from the IPC service as from the GUI client. Turns out that we forgot to
initialise our `sentry_layer` there. Additionally, we also didn't
initialise the `LogTracer`, meaning we didn't capture logs from the
`log` crate which is used by some of the dependencies, for example
`wintun`.
Due to https://github.com/getsentry/sentry-rust/issues/702, errors which
are embedded as `tracing::Value` unfortunately get silently discarded
when reported as part of Sentry "Event"s and not "Exception"s.
The design idea of these telemetry events is that they aren't fatal
errors so we don't need to treat them with the highest priority. They
may also appear quite often, so to save performance and bandwidth, we
sample them at a rate of 1% at creation time.
In order to not lose the context of these errors, we instead format them
into the message. This makes them completely identical to the `debug!`
logs which we have on every call-site of `telemetry_event!`, which
prompted me to make that `debug!` log an implicit part of
`telemetry_event!`.
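Conceptually, the macro now behaves like this sketch (not the actual
implementation):

```rust
macro_rules! telemetry_event {
    ($($arg:tt)*) => {{
        // The debug log every call-site used to write by hand is now
        // emitted implicitly ...
        tracing::debug!($($arg)*);
        // ... followed by recording a telemetry event, sampled at 1%
        // at creation time (elided here).
    }};
}
```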
Resolves: #7343.