Closes#5464
These were silently broken, it was exporting an empty zip and passing
the test anyway. So this PR will cause the test to fail if the zip
wasn't fully exported, and then it will fix the export.
In our NAT table on the gateway, we try to first pick the external port
as the one on the packet that we want to translate. This makes that port
mapping consistent between NAT sessions in the majority of cases. In
case the port is taken, we iterate through two chained `Range`s that end
up cycling the entire port range.
[`RangeFrom`](https://doc.rust-lang.org/std/ops/struct.RangeFrom.html)
has a somewhat unexpected behaviour in regards to exhaustived ranges:
They panic when trying to access the next element. To avoid this, we
explicitly end the first range at `u16::MAX` which makes it an empty
range in case the source port is `u16::MAX`.
Without this, a < 1.1.0 client connecting to a > 1.1.0 gateway (i.e.
current main) causes lots of very strange logs that say:
> Assigned translation proxy_ip=X.X.X.X real_ip=X.X.X.X
Where X.X.X.X are the same IP.
Currently, we always emit a connection intent whenever we see a DNS
query for a domain of one of our DNS resources. However, especially for
wildcard DNS resources, we are very likely already connected to the
corresponding gateway. In that case, sending a connection intent
triggers another handshake with the portal only to learn that - surprise
- we should reuse a connection that we already have to that gateway.
We can short-circuit this by checking if we are already connected to the
gateway for this resource and directly requested access for the domain
name in question. We reuse the same event here as we do for refreshing
DNS resources. At a later stage, we should rename this to something else
to make this clearer.
Co-authored-by: Gabi <gabrielalejandro7@gmail.com>
This turns out to break things because we can no longer associate a
working but outdated IP with the DNS resource. Putting this up here in
case we want to merge a fix before we decide on a different one.
Reverts: #5435.
Extracted from https://github.com/firezone/firezone/pull/5426
- Replace `new` and `new_for_test` for IPC servers with `enum ServiceId`
- Rename `debug_command_setup` to `setup_stdout_logging`
It turned out there is no clever way to hide other platforms from
`cargo-mutants`, I thought I had such a way
Whenever we resolve a domain name to real IPs, we assign one proxy IP
per resolved IP. In case the DNS records for that domain actually
changed, we only appended the new proxy IPs to the list we assigned to
that domain.
If a domain no longer resolves to a certain IP, we should clear the
assigned proxy IP and stop returning in DNS responses. To achieve this,
we first remove all proxy IPs from our mapping of IP -> domain and then
add all _current_ proxy IPs back to the map.
When a user sends the first packet to a resource, we generate a
"connection intent" and consult the portal, which gateway to use for
this resource. This process is throttled to only generate a new intent
every 2s.
Once we know, which gateway to use for a certain resource, we initiate a
connection via snownet. This involves an OFFER-ANSWER handshake with the
gateway. A connection for which we have sent an offer and have not yet
received an answer is what we call a "pending connection".
In case the connection setup takes longer than 2s, we will generate
another connection intent which can point to the same gateway that we
are currently setting up a connection with.
Currently, encountering a "pending connection" during another connection
setup is treated as an error which results in some state being
cleaned-up / removed. This is where the bug surfaces: If we remove the
state for a resource as a result of a 2nd connection intent and then
receive the response of the first one, we will be left with no state
that knows about this resource.
We fix this by refactoring `create_or_reuse_connection` to be atomic in
regards to its state changes: All checks that fail the function are
moved to the top which means there is no state to clean up in case of an
error. Additionally, we model the case of a "pending connection" using
an `Option` to not flood the logs with "pending connection" warnings as
those are expected during normal operation.
Fixes: #5385
As part of #4994, the IP translation and mangling of packets to and from
DNS resources is moved to the gateway. This PR represents the
"gateway-half" of the required changes.
Eventually, the client will send a list of proxy IPs that it assigned
for a certain DNS resource. The gateway assigns each proxy IP to a real
IP and mangles outgoing and incoming traffic accordingly. There are a
number of things that we need to take care of as part of that:
- We need to implement NAT to correctly route traffic. Our NAT table
maps from source port* and destination IP to an assigned port* and real
IP. We say port* because that is only true for UDP and TCP. For ICMP, we
use the identifier.
- We need to translate between IPv4 and IPv6 in case a DNS resource e.g.
only resolves to IPv6 addresses but the client gave out an IPv4 proxy
address to the application. This translation is was added in #5364 and
is now being used here.
This PR is backwards-compatible because currently, clients don't send
any IPs to the gateway. No proxy IPs means we cannot do any translation
and thus, packets are simply routed through as is which is what the
current clients expect.
---------
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
When we attempt to establish a connection to a gateway for a DNS
resource, the gateway must resolve the requested domain name before it
can accept the connection. Currently, this timeout is set to 60s which
is much longer than the client's connection timeout.
DNS resolution is typically a very fast protocol so reducing this
timeout to 5s should be safe. In addition, we add a compile-time
assertion that this timeout must be less than the client's connection
timeout.
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
This is a funny one. `cargo test -p firezone-headless-client -p
firezone-gui-client` actually passes, because the GUI client uses the
pipes feature, and Cargo apparently just does one build for both
packages. But if you build the headless Client by itself, it fails to
build.
I think this caused `cargo-mutants` to consider all its headless Client
mutants to be unviable, and so it didn't show coverage for that package.
Part of a yak shave to profile startup time for reducing it on Windows
#5026
Median of 3 runs:
- Windows 11 aarch64 Parallels VM - 4.8 s
- Windows 11 x86_64 laptop - 3.1 s (I thought it used to be slower)
- Windows Server 2022 VM - 22.2 s
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
## The problem
To find the correct peer for a given resource we keep a map of
`resource_id -> gateway_id` in the client state called
`resources_gateways`.
For CIDR resource connlib when sees a packet it does the following
steps:
1. Find the packet's corresponding resource
2. Find the resource corresponding gateway
3. Find the peer corresponding to the gateway, if none, request
access/connection
The problem was that when roaming, we didn't cleanup the map between
`resource_id -> gateway_id` so if after disconnecting with a gateway we
created a new connection due to a another resource, in step 3, connlib
would find a connected gateway and not request access.
This would cause the client to send unallowed packets to the gateway.
## Steps to reproduce
1. Open the client
2. Ping a CIDR resource on a gateway
3. roam and wait until disconnection
4. Ping a different resource on the same gateway
5. Ping the same CIDR resource as in step 2
This will result in no reply for step 5
## The fix
Cleanup the `resource -> gateway` map after disconnecting with a
gateway.
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
Closes#5042
Smoke test plan:
- Install on a before-Firezone VM
- Confirm logs default to `str0m=warn,info`
- Set log filter to `debug` in GUI
- Restart IPC service
- Confirm logs are `debug`
- Clear settings back to default
- Restart IPC service
- Confirm logs are `str0m=warn,info`
Directions to apply new log level:
1. Put the new log filter in
2. Click "Apply"
3. Quit Firezone Client
4. Right-click on the Start Menu and click "Terminal (Admin)" to open a
Powershell prompt
5. Run `Restart-Service -Name FirezoneClientIpcService` (on Linux, `sudo
systemctl restart firezone-client-ipc.service`)
6. Re-open Firezone Client
```[tasklist]
- [x] Log the log filter maybe
- [x] Use `atomicwrites` to write the file
- [x] (cancelled) ~~Make the GUI write the file on boot if it's not there (saves a step when upgrading from older versions)~~
- [x] Windows smoke test
- [x] Fix permissions on `/var/lib/dev.firezone.client/config`
- [x] Fix Linux IPC service not loading the log filter file
- [x] Linux smoke test
- [ ] Make sure it's okay that users in `firezone-client` can change the device ID
- [ ] Update user guides to include restarting the computer or IPC service after updating the log level?
```
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Closes#4765
It turns out that if I don't join the worker thread explicitly it messes
up wintun a lot. I wonder if I should report that as a bug or what. It's
kind of our fault for keeping a handle to the `Session` alive in the
thread.
```[tasklist]
- [x] Move the debug command from `gui-client` to `headless-client`
- [x] Move the initialization out of `firezone-tunnel`, revert the `pub` changes, use `anyhow::Context`
```
---------
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
Bumps [tokio-tungstenite](https://github.com/snapview/tokio-tungstenite)
from 0.21.0 to 0.23.0.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/snapview/tokio-tungstenite/blob/master/CHANGELOG.md">tokio-tungstenite's
changelog</a>.</em></p>
<blockquote>
<h1>0.23.0</h1>
<ul>
<li>Update <code>tungstenite</code> to <code>0.23.0</code>.</li>
<li>Disable default features on TLS crates.</li>
</ul>
<h1>0.22.0</h1>
<ul>
<li>Update TLS dependencies.</li>
<li><del>Update <code>tungstenite</code> to match
<code>0.22.0</code>.</del></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/snapview/tokio-tungstenite/commits">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Gabi <gabrielalejandro7@gmail.com>
It turns out they were all unused, but I like having a place to keep
them for new features.
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Some of the debug-assertions in the relay are a bit too strict.
Specifically, if an allocation times out because it is not refreshed, we
also clean-up all channel bindings associated with that allocation. Yet,
if an existing channel binding has already been removed earlier, it will
no longer be present in the respective map.
This isn't an issue at all. We can simply change the debug-assertion to
only compare what used to be present in the map. What really matters is
that the item we just removed does in fact point to the data that we are
expecting.
Related: #5355.
When creating the `Transition` strategy, we are currently repeating the
same pattern again and again: We want to conditionally add a strategy if
one or more parts of our state are not empty.
We reduce this duplication with a custom `CompositeStrategy` that offers
a `with_if_not_empty` chainable method to only construct a strategy if
the given input element is not empty.
To make this usable across several usecases, we define an `IsEmpty`
helper trait that is implemented for `Vec`s, `Option`s and tuples.
If we end up sampling filters that don't have any gaps, we cannot create
filters for all the gaps. Thus, we need to shortcut this strategy to
create an empty set of filters in case we don't have any gaps.
Fixes: #5345.
Currently, we use `sample::Index` and `sample::Selector` to
deterministically select parts of our state. Originally, this was done
because I did not yet fully understand, how `proptest-state-machine`
works.
The available transitions are always sampled from the current state,
meaning we can directly use `sample::select` to pick an element like an
IP address from a list. This has several advantages:
- The transitions are more readable when debug-printed because they now
contain the actual data that is being used.
- I _think_ this results in better shrinking because `sample::select`
will perform a binary search for the problematic value.
- We can more easily implement transitions that _remove_ state.
Currently, we cannot remove things from the `ReferenceState` because the
system-under-test would also have to index into the `ReferenceState` as
part of executing its transition. By directly embedding all necessary
information in the transition, this is much simpler.
Previously, we asserted at the end of `TunnelTest::apply`.
`proptest-state-machine` offers a dedicated function for checking
invariants which only gives you a regular reference. That is a good
thing to enforce as we don't want our assertions to change state.
Bumps
[@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node)
from 20.14.0 to 20.14.2.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Reading through more of the `proptest` library, I came across the
`Selector` concept. It is more generic than the `sample::Index` and
allows us to directly pick from anything that is an `IntoIterator`.
This greatly simplifies a lot of the code in `tunnel_test`. In order
(pun intended) to make things deterministic, we migrate all maps and
sets to `BTreeMap`s and `BTreeSets` which have a deterministic ordering
of their contents, thus avoiding additional sorting.
Detecting blocked STUN traffic is somewhat tricky. What we can observe
is not receiving any responses from a relay (neither on IPv4 nor IPv6).
Once an `Allocation` gives up retrying requests with a relay (after
60s), we now de-allocate the `Allocation` and print the following
message:
> INFO snownet::node: Disconnecting from relay; no response received. Is
STUN blocked? id=613f68ac-483e-4e9d-bf87-457fd7223bf6
I chose to go with the wording of "disconnecting from relay" as
sysdamins likely don't have any clue of what an "Allocation" is. The
error message is specific to a relay though so it could also be emitted
if a relay is down for > 60s or not responding for whatever reason.
Resolves: #5281.
Bumps [crash-handler](https://github.com/EmbarkStudios/crash-handling)
from 0.6.1 to 0.6.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/EmbarkStudios/crash-handling/releases">crash-handler's
releases</a>.</em></p>
<blockquote>
<h2>crash-handler-0.6.2</h2>
<h3>Added</h3>
<ul>
<li><a
href="https://redirect.github.com/EmbarkStudios/crash-handling/pull/86">PR#86</a>
(carrying on from <a
href="https://redirect.github.com/EmbarkStudios/crash-handling/pull/85">PR#85</a>)
added support for <a
href="https://learn.microsoft.com/en-us/windows/win32/debug/vectored-exception-handling">vectored
exception handlers</a> on Windows, which can catch heap corruption
exceptions that the vanilla exception handler cannot catch. Thanks <a
href="https://github.com/h3r2tic">Tom!</a>!</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="45a469c86e"><code>45a469c</code></a>
chore: Release</li>
<li><a
href="d4d6f25cce"><code>d4d6f25</code></a>
chore: Release</li>
<li><a
href="7818928239"><code>7818928</code></a>
Update CHANGELOGs</li>
<li><a
href="e524a897c2"><code>e524a89</code></a>
Add heap corruption exception handling (<a
href="https://redirect.github.com/EmbarkStudios/crash-handling/issues/86">#86</a>)</li>
<li><a
href="065f3dd9c1"><code>065f3dd</code></a>
chore: Release</li>
<li><a
href="37e56acd3f"><code>37e56ac</code></a>
Update (<a
href="https://redirect.github.com/EmbarkStudios/crash-handling/issues/83">#83</a>)</li>
<li><a
href="3b77c9b00d"><code>3b77c9b</code></a>
chore: Release</li>
<li>See full diff in <a
href="https://github.com/EmbarkStudios/crash-handling/compare/crash-handler-0.6.1...crash-handler-0.6.2">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>