If EDNS0 doesn't work correctly, DNS servers might respond with
messages bigger than our maximum UDP size.
In that case we need to truncate those messages when forwarding the
response back to the interface and expect the OS to retry with TCP.
Otherwise we aren't able to allocate a packet big enough for the
response.
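As a rough sketch (not the literal code in this PR), truncating an
oversized response and flagging it could look like this; it assumes the
response is a well-formed DNS message and that `max_udp_size` is at
least the 12-byte DNS header:

```rust
/// Sketch: truncate an oversized UDP DNS response so the OS stub resolver
/// retries over TCP. A real implementation would re-encode the message and
/// drop whole records instead of cutting mid-record.
fn truncate_dns_response(response: &[u8], max_udp_size: usize) -> Vec<u8> {
    if response.len() <= max_udp_size {
        return response.to_vec();
    }

    let mut truncated = response[..max_udp_size].to_vec();

    // The TC (truncation) flag is bit 0x02 of the third header byte.
    truncated[2] |= 0x02;

    truncated
}
```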
Fixes #7121
---------
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
We introduced a boolean bug in #7163 that caused us to attempt to add
host candidates much more often than necessary. This spammed the logs at
DEBUG level but was otherwise harmless.
During normal operation, we should never lose connectivity to the set of
assigned relays in a client or gateway. In the presence of odd network
conditions and partitions however, it is possible that we disconnect
from a relay that is in fact only temporarily unavailable. Without an
explicit mechanism to retrieve new relays, this means that both clients
and gateways can end up with no relays at all. For clients, this can be
fixed by either roaming or signing out and in again. For gateways, this
can only be fixed by a restart!
Without connected relays, no connections can be established. With #7163,
we will at least be able to still establish direct connections. Yet,
that isn't good enough and we need a mechanism for restoring full
connectivity in such a case.
When creating a new connection, we already sample one of our relays and
assign it to this particular connection. This ensures that we don't
create an excessive amount of candidates for each individual connection.
Currently, this selection is allowed to be silently fallible. With this
PR, we make this a hard error and bubble it all the way up to the
client's and gateway's event-loop. There, we initiate a reconnect
to the portal as a compensating action. Reconnecting to the portal means
we will receive another `init` message that allows us to reconnect the
relays.
Due to the nature of this implementation, this fix may only apply with a
certain delay from when we actually lost connectivity to the last relay.
However, this design has the advantage that we don't have to introduce
an additional state within `snownet`: Connections now simply fail to
establish and the next one soon after _should_ succeed again because we
will have received a new `init` message.
Resolves: #7162.
As part of maintaining an allocation, we also perform STUN with our
relays to discover our server-reflexive address. At the moment, these
candidates are scoped to an `Allocation`. This is unnecessarily
restrictive. Similar to host candidates, server-reflexive candidates
entirely depend on the socket you send data from and are thus
independent of the allocation's state.
During normal operation, this doesn't really matter because all relay
traffic is sent through the same sockets so all `Allocation`s end up
with the same server-reflexive candidates. Where this does matter is
when we disconnect from relays for one reason or another (for example:
#7162). The fact that all but host-candidates are scoped to
`Allocation`s means that without `Allocation`s, we cannot make any new
connections, not even direct ones. This is unnecessarily restrictive and
causes bugs within `Allocation` to have a bigger blast radius than
necessary.
With this PR, we keep server-reflexive candidates in the same set as
host candidates. This allows us to at least establish direct connections
in case something is wrong with the relays or our state tracking of
relays on the client side.
In order to make Firezone more mobile-friendly, waking up the CPU less
often is key. In #6845, we introduced a low-power mode into `snownet`
that sends STUN messages on a longer interval if the connection is idle.
Whilst poking around `boringtun` as part of integrating our fork into the
main codebase, I noticed that we are updating `boringtun`'s timers every
second - even on idle connections.
This PR applies the same idea of #6845 to the timers within `Node`: Idle
connections get "woken" less and if all connections are idle, we avoid
waking the `Node` altogether (unless we need to refresh allocations /
channels).
Calling `handle_timeout` less often revealed an issue in the tests where
we didn't fully process the state changes after invalidating a candidate
from the remote. To fix this, we now call `handle_timeout` directly
after `{add,remove}_remote_candidate`. This isn't super clean because at
first glance, it looks like `handle_timeout` should just be part of the
add/remove candidate function. It is quite common for sans-IO designs to
require calling `handle_timeout` after state has been changed. In
`{Client,Server}Node`, we do it implicitly so that we don't have to do
it in the tests and the event-loop.
It would be great to test this in some automated fashion but I haven't
figured out how yet. I did temporarily add an `info!` log to the
event-loop of the client and with this patch applied, the entire
event-loop goes to sleep on idle connections. It still does get woken
every now and then but no longer every second!
As a first step towards integrating Sentry into the Android app, we launch
the Sentry Rust agent as soon as a `connlib` session starts up. At a
later point, we can also integrate Sentry into the Android app itself
using the Java / Kotlin SDK.
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
This starts up telemetry together with each `connlib` session. At a
later point, we can also integrate the native Swift SDK into the MacOS /
iOS app to catch non-connlib specific problems.
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
With #7105, all ERROR events from `tracing` get logged as exceptions in
Sentry and all WARN events get logged as "messages". We don't want to
fill up the user's hard drive with logs, which means we have to be
somewhat conservative about what gets logged at INFO and above (with
INFO being the default log level). There are certain events though where
it would be useful to know how often they happen because too many of
them can indicate a problem.
To solve this problem, we introduce a dedicated `telemetry` log target
that the tracing-sentry integration layer watches for. Events for the
`telemetry` log target that get logged at TRACE will be sampled at a
rate of 1% and submitted as messages to Sentry.
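Hypothetical usage at a call-site (the event message here is made up;
the sampling itself happens in the integration layer, not at the
call-site):

```rust
// Events logged at TRACE under the dedicated `telemetry` target are picked up
// by the tracing-sentry integration layer, sampled and submitted as Sentry
// messages. Regular TRACE events are unaffected.
fn on_unexpected_stun_response() {
    tracing::trace!(target: "telemetry", "Received STUN response without a matching request");
}
```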
Now that we have Sentry integrated with `tracing`, using `warn!` logs a
bit more liberally allows us to detect edge-cases that customers might
run into.
All the logs touched in this PR represent some kind of problem that it
would be good to know about.
Using a trait means the call-site of the log message will always be the
`log_unwrap` module, despite the `#[track_caller]` annotation. That
attribute unfortunately only works with `std::panic::Location`, which
`tracing` isn't using.
Macros will be evaluated earlier and thus the messages will show up with
the correct module name.
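A minimal sketch of the macro approach (the real `log_unwrap!` may look
different):

```rust
// Because `tracing`'s macros capture `module_path!()` at the expansion site,
// expanding `log_unwrap!` inside the caller's code attributes the event to the
// caller's module, which a trait method cannot do even with `#[track_caller]`.
macro_rules! log_unwrap {
    ($result:expr) => {
        match $result {
            Ok(value) => Some(value),
            Err(error) => {
                // Expands in the caller's module, so the event carries the correct module path.
                tracing::warn!("Operation failed: {error}");
                None
            }
        }
    };
}

fn caller() -> Option<u16> {
    log_unwrap!("not-a-port".parse::<u16>())
}
```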
Applications may query domains for HTTPS RR using the HTTPS record type.
`connlib` operates on OSI layer 3 and thus can only hand out IPs for the
particular domains. The correct way to signal this to applications is to
answer the HTTPS query with NOERROR and return an empty set of records.
[RFC9460](https://www.rfc-editor.org/rfc/rfc9460.html#name-client-behavior)
says the following:
> 4. If one or more "compatible" ([Section
8](https://www.rfc-editor.org/rfc/rfc9460.html#mandatory)) ServiceMode
records are returned, these represent the alternative endpoints. Sort
the records by ascending SvcPriority.
> 5. Otherwise, SVCB resolution has failed, and the list of available
endpoints is empty.
This implies that returning no records is valid behaviour and forces the
client to consider the HTTPS DNS query as failed and query for A / AAAA
records instead (if it didn't do so via happy-eyeballs already).
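A minimal sketch of such an empty NOERROR reply, built directly from the
raw query bytes for illustration (connlib's actual implementation goes
through its DNS message types; assumes a well-formed query of at least
12 bytes):

```rust
/// Turn a raw HTTPS/SVCB query into a NOERROR response with an empty answer
/// section, signalling the client to fall back to A / AAAA lookups.
fn empty_noerror_response(query: &[u8]) -> Vec<u8> {
    let mut response = query.to_vec();

    // Header byte 2: set the QR bit to mark the message as a response.
    response[2] |= 0x80;
    // Header byte 3: clear the RCODE nibble, i.e. NOERROR.
    response[3] &= 0xF0;
    // ANCOUNT (bytes 6-7) and NSCOUNT (bytes 8-9) stay zero: no answer records.
    response[6..10].fill(0);

    response
}
```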
My theory for this issue is that we receive a UDP DNS response from an
upstream server that is bigger than our MTU and thus forwarding it
fails.
This PR doesn't fix that issue by itself but only mitigates the actual
panic. To properly fix the underlying issue, we need to parse the DNS
message, truncate it and set the TC bit.
Related: #7121.
The `fmt::Display` implementation of `tokio::task::JoinError` already
does exactly what we do here: Extracting the panic message if there is
one. Thus, we can simplify this code by just moving the `JoinError`
into the `DisconnectError` as its source.
For simplicity's sake I assumed that any packet using port 53 would be a
DNS packet, but I forgot to exclude it from the range of possible ports
used.
This can cause spurious CI failures.
Using the `sentry-tracing` integration, we can automatically capture
events based on what we log via `tracing`. The mapping is defined as
follows:
- ERROR: Gets captured as a fatal error
- WARN: Gets captured as a message
- INFO: Gets captured as a breadcrumb
- `_`: Does not get captured at all
If telemetry isn't active / configured, this integration does nothing.
It is therefore safe to just always enable it.
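Assuming `sentry-tracing`'s `layer().event_filter(..)` API, the mapping
above roughly corresponds to a filter like this (sketch, not the literal
integration code):

```rust
use sentry_tracing::EventFilter;
use tracing::Level;

fn event_filter(metadata: &tracing::Metadata<'_>) -> EventFilter {
    match *metadata.level() {
        Level::ERROR => EventFilter::Exception, // captured as a fatal error
        Level::WARN => EventFilter::Event,      // captured as a message
        Level::INFO => EventFilter::Breadcrumb, // captured as a breadcrumb
        _ => EventFilter::Ignore,               // not captured at all
    }
}

// Plugged in via `sentry_tracing::layer().event_filter(event_filter)`.
```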
Our logging library, `tracing`, supports structured logging. This is
useful because it preserves more than just the string representation
of a value and thus allows the active logging backend(s) to capture more
information for a particular value.
In the case of errors, this is especially useful because it allows us to
capture the sources of a particular error.
Unfortunately, recording an error as a tracing value is a bit cumbersome
because `tracing::Value` is only implemented for `&dyn
std::error::Error`. Casting an error to this is quite verbose. To make
it easier, we introduce two utility functions in `firezone-logging`:
- `std_dyn_err`
- `anyhow_dyn_err`
Tracking errors as correct `tracing::Value`s will be especially helpful
once we enable Sentry's `tracing` integration:
https://docs.rs/sentry-tracing/latest/sentry_tracing/#tracking-errors
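In essence, the helpers are just coercions to the error trait object
(exact signatures in `firezone-logging` may differ slightly; the usage
below is illustrative):

```rust
pub fn std_dyn_err(e: &(impl std::error::Error + 'static)) -> &(dyn std::error::Error + 'static) {
    e
}

pub fn anyhow_dyn_err(e: &anyhow::Error) -> &(dyn std::error::Error + 'static) {
    e.as_ref()
}

// Usage: the error (and its source chain) is recorded as a structured value.
fn log_bind_error(e: &std::io::Error) {
    tracing::warn!(error = std_dyn_err(e), "Failed to bind UDP socket");
}
```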
Currently, tests only send ICMP packets back and forth. To expand our
coverage and later allow us to cover filters and resource picking, this
PR implements sending UDP and TCP packets as part of that logic too.
To keep this PR simpler at this stage, TCP packets don't track an actual
TCP connection, only that they are forwarded back and forth; this will
be fixed in a future PR by emulating TCP sockets.
We also unify how we handle CIDR/DNS/Non Resources to reduce the number
of transitions.
Fixes #7003
---------
Signed-off-by: Gabi <gabrielalejandro7@gmail.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
When we query upstream DNS servers through the tunnel via TCP DNS, we
will always be successful in establishing a tunnel, regardless of how
many concurrent queries we send because the TCP stack will keep
re-trying. Thus, tracking which resources we are connected to after
sending a bunch of DNS queries needs to be split by UDP and TCP.
For UDP, only the "first" resource will be connected. However, with
concurrent TCP and UDP DNS queries, "first" isn't necessarily the order
in which we send the queries because with TCP DNS, one packet doesn't
equate to one query anymore.
This is quite hacky but it will get completely deleted once we buffer
packets during the connection setup.
This PR implements the new idempotent control protocol for the gateway.
We retain backwards-compatibility with old clients to allow admins to
perform a disruption-free update to the latest version.
With this new control protocol, we are moving the responsibility of
exchanging the proxy IPs we assigned to DNS resources to a p2p protocol
between client and gateway. As a result, wildcard DNS resources only get
authorized on the first access. Accessing a new domain within the same
resource will thus no longer require a roundtrip to the portal.
Overall, users will see a greatly decreased connection setup latency. On
top of that, the new protocol will allow us to more easily implement
packet buffering which will be another UX boost for Firezone.
At present, `connlib` only supports DNS over UDP on port 53. Responses
over UDP are size-constrained by the IP MTU and thus not all DNS
responses fit into a UDP packet. RFC9210 therefore mandates that all DNS
resolvers must also support DNS over TCP to overcome this limitation
[0].
Handling UDP packets is easy, handling TCP streams is more difficult
because we need to effectively implement a valid TCP state machine.
Building on top of a lot of earlier work (linked in issue), this is
relatively easy because we can now simply import
`dns_over_tcp::{Client,Server}` which do the heavy lifting of sending
and receiving the correct packets for us.
The main aspects of the integration that are worth pointing out are:
- We can handle at most 10 concurrent DNS TCP connections _per defined
resolver_. The assumption here is that most applications will first
query for DNS records over UDP and only fall back to TCP if the response
is truncated. Additionally, we assume that clients will close the TCP
connections once they no longer need them.
- Errors on the TCP stream to an upstream resolver result in `SERVFAIL`
responses to the client.
- All TCP connections to upstream resolvers get reset when we roam; all
currently ongoing queries will be answered with `SERVFAIL`.
- Upon network reset (i.e. roaming), we also re-allocate new local ports
for all TCP sockets, similar to our UDP sockets.
Resolves: #6140.
[0]: https://www.ietf.org/rfc/rfc9210.html#section-3-5
Currently, any failure in the `StubResolver` while processing a query
results only in a log but no response. For UDP DNS queries, that isn't
too bad because the client will simply retry. With the upcoming support
for TCP DNS queries in #6944, we should really always reply with a
message.
This PR refactors the handling of DNS messages to always generate a
reply. In the case of an error, we reply with SERVFAIL.
This opens up a few more refactorings where we can now collapse the
handling of some branches into the same code path. As part of that, I
noticed the
recurring need for "unwrapping" a `Result<(), E>` and logging the error.
To make that easier, I introduced an extension trait that does exactly
that.
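A sketch of what such an extension trait could look like (name and exact
behaviour in the real code may differ):

```rust
trait ResultExt {
    /// Logs the error (if any) and discards the result.
    fn log_err(self, context: &str);
}

impl<E: std::fmt::Display> ResultExt for Result<(), E> {
    fn log_err(self, context: &str) {
        if let Err(e) = self {
            tracing::warn!("{context}: {e}");
        }
    }
}

// Usage: instead of `if let Err(e) = send_reply(..) { tracing::warn!(..) }`:
// send_reply(query, response).log_err("Failed to send DNS reply");
```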
In #6960, we split up the filter handling by resource type. This
uncovered an issue that was already known and fixed in the portal in
#6962:
Upstream DNS servers must not be in any of the reserved IP ranges.
Related: #7060.
When forwarding UDP DNS queries through the tunnel, `connlib` needs to
mangle the IP header to set upstream as the correct destination of the
packet. In order to know which ones need to be mangled back, we keep a
map of query ID to timestamp. This isn't good enough as the added
proptest failure case shows. If there are two concurrent queries with
the same query ID to different resolvers, we only handle one of those
and don't mangle the 2nd one.
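The implied fix is to key the in-flight queries by upstream resolver and
query ID together; a simplified sketch:

```rust
use std::collections::HashMap;
use std::net::SocketAddr;
use std::time::Instant;

/// Tracks forwarded UDP DNS queries so responses can be mangled back.
/// Keying by (upstream, query ID) keeps two concurrent queries with the same
/// ID but different resolvers apart.
#[derive(Default)]
struct ForwardedQueries {
    inner: HashMap<(SocketAddr, u16), Instant>,
}

impl ForwardedQueries {
    fn on_query(&mut self, upstream: SocketAddr, query_id: u16, now: Instant) {
        self.inner.insert((upstream, query_id), now);
    }

    fn on_response(&mut self, upstream: SocketAddr, query_id: u16) -> Option<Instant> {
        self.inner.remove(&(upstream, query_id))
    }
}
```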
This buffer is effectively limited by the maximum size of our IP packets
(which is guided by our interface MTU). Passing a length is
unnecessarily abstract.
For implementing DNS over TCP, we will need to encapsulate packets that
are emitted by the `dns_over_tcp::Client` which requires creating such a
buffer on the fly.
In the future, we should probably consider also stack-allocating all our
`Transmit`s so we can get rid of passing around this buffer altogether.
Currently, in our tests, traffic that is targeted at a resource is
handled "in-line" on the gateway. This doesn't really represent how the
real world works. In the real world, the gateway uses the IP forwarding
functionality of the Linux kernel and the corresponding NAT to send the
IP packet to the actual resource.
We don't want to implement this forwarding and NAT in the tests.
However, our testing harness is about to get more sophisticated. We will
be sending TCP DNS queries with #6944 and we want to test TCP and its
traffic filters with #7003.
The state of those TCP sockets needs to live _somewhere_.
If we "correctly" model this and introduce some kind of `HashMap` with
`dyn Resource` in `TunnelTest`, then we will have to actually implement
NAT for those packets to ensure that e.g. the SYN-ACK of a TCP handshake
makes it back to the correct(!) gateway.
That is rather cumbersome.
This PR suggests taking a shortcut there by associating the resources
with each gateway individually. At present, all we have are UDP DNS
servers. Those don't actually have any connection state themselves but
putting them in place gives us a framework for where we can put
connection-specific state. Most importantly, these resources MUST NOT
hold application-specific state. Instead, that state needs to be kept in
`ReferenceState` or `TunnelState` and passed in by reference, as we do
here for the DNS records.
This has effectively the same behaviour as correctly translating IP
packets back and forth between resources and gateways. The packets
"emitted" by a particular `UdpDnsServerResource` will always go back to
the correct gateway.
Previously, when a gateway checked if a packet was allowed through, it
used the real IP of the DNS resource that the client was trying to
communicate with.
The problem with this was that if there's an overlapping CIDR resource
with the real IP this would allow a user to send packets using the
filters for that resource instead of the DNS resource filters.
This can be confusing for users as packets can flow unexpectedly to the
resources even if the filter doesn't permit it, so we use the IP of the
packet before we translate it to the real IP to match the filters.
This doesn't change the security of this feature as a user can just
change the IP of the packet to the dst of the DNS or the CIDR resource
according to what they need.
Fixes #6806
Bumps [lru](https://github.com/jeromefroe/lru-rs) from 0.12.4 to 0.12.5.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/jeromefroe/lru-rs/blob/master/CHANGELOG.md">lru's
changelog</a>.</em></p>
<blockquote>
<h2><a
href="https://github.com/jeromefroe/lru-rs/tree/0.12.5">v0.12.5</a> -
2024-10-30</h2>
<ul>
<li>Upgrade hashbrown dependency to 0.15.</li>
</ul>
</blockquote>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Currently, we have a lot of stupid code to forward data from the
`{Client,Gateway}Tunnel` interface to `{Client,Gateway}State`. Recent
refactorings such as #6919 made it possible to get rid of this
forwarding layer by directly exposing `&mut TRoleState`.
To maintain some type-privacy, several functions are made generic to
accept `impl Into` or `impl TryInto`.
UDP DNS queries for upstream resolvers that happen to be resources need
to be sent through the tunnel. For that to work correctly, `connlib`
needs to rewrite the IP header such that the destination IP points to
the actual address of the DNS server.
Currently, this happens rather "late" in the processing of the packets,
i.e. after `try_handle_dns` has returned (where that decision is
actually made). This is rather confusing and also forces us to re-parse
the packet as a DNS packet at a later stage.
To avoid this, we move the main functionality of
`maybe_mangle_dns_query_to_cidr_resource` into the branch where
`connlib`'s stub DNS resolver tells us that the query needs to be
forwarded via the tunnel.
With the upcoming support of TCP DNS queries, we will have a 2nd source
of IP packets that need to go through the tunnel: Packets emitted from
our internal TCP stack. Attempting to perform the same post-processing
on these TCP packets as we do with UDP is rather confusing, which is why
we want to remove this step from the `encapsulate` function.
Resolves: #5391.
Within `connlib`, the `encapsulate` and `decapsulate` functions on
`ClientState` and `GatewayState` are the entrypoint for sending and
receiving network traffic. For example, IP packets read from the TUN
device are processed using these functions.
Not all packets / traffic passed to these functions is meant to be
encrypted. Some of it is TURN traffic with relays, some of it is DNS
traffic that we intercept.
To clarify this, we rename these functions to `handle_tun_input` and
`handle_network_input`.
As part of this clarification, we also call `handle_timeout` in case we
don't emit a decrypted IP packet when handling network input. Once we
support DNS over TCP (#6944), some IP packets sent through the tunnel
will originate from DNS servers that we forwarded queries to. In that
case, those responses will be handled by `connlib`'s internal TCP stack
and thus not produce a decrypted IP packet. To correctly advance the
state in this case, we mirror what we already do for `handle_tun_input`
and call `handle_timeout` if `handle_network_input` yields `None`.
When handling DNS queries, `connlib` tries to be as transparent as
possible. For this reason, we byte-for-byte forward the DNS response
from the upstream resolver to the original source socket. In #6999, we
started modelling these DNS queries as explicit tasks in preparation for
DNS over TCP and DNS over HTTPS.
As part of that, we create a DNS response for _every_ IO error we
encounter as part of the recursive query. This includes timeouts, i.e.
when we don't receive a response at all. That actually breaks the rule
of "be a transparent DNS proxy".
In this PR, we slightly refactor the handling of the DNS response to
explicitly match on `io::ErrorKind::TimedOut` to not send a packet back,
thus mirroring the behaviour the DNS client would encounter without
Firezone being active.
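Conceptually, the error handling now distinguishes timeouts from other
IO errors (sketch; `servfail_response` is a hypothetical helper):

```rust
use std::io;

fn handle_recursive_result(result: io::Result<Vec<u8>>) -> Option<Vec<u8>> {
    match result {
        Ok(response) => Some(response),
        // No reply on timeout: the client sees the same silence it would without Firezone.
        Err(e) if e.kind() == io::ErrorKind::TimedOut => None,
        // Any other IO error is surfaced to the client as SERVFAIL.
        Err(_) => Some(servfail_response()),
    }
}

fn servfail_response() -> Vec<u8> {
    todo!("build a DNS response with RCODE=SERVFAIL for the original query")
}
```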
When performing recursive DNS queries over UDP, `connlib` needs to
remember the original source socket a particular query came from in
order to send the response back to the correct socket. Until now, this
was tracked in a separate `HashMap`, indexed by upstream server and
query ID.
When DNS queries are being retried, they may be resent using the same
query ID, causing "Unknown query" logs if the retry happens on a shorter
interval than the timeout of our recursive query.
We are already tracking a bunch of metadata alongside the actual
query, meaning we can just as easily add the original source socket to
that as well.
Once we add TCP DNS queries, we will need to track the handle of the TCP
socket in a similar manner.
This PR introduces a custom logging format for all Rust-components. It
is more or less a copy of `tracing_subscriber::fmt::format::Compact`
with the main difference that span-names don't get logged.
Spans are super useful because they allow us to record contextual
values, like the current connection ID, for a certain scope. What is IMO
less useful about them is that in the default formatter configuration,
active spans cause a right-drift of the actual log message.
The actual log message is still what most accurately describes what
`connlib` is currently doing. Spans only add contextual information that
the reader may use to further understand what is happening. This
optional nature of the utility of spans IMO means that they should come
_after_ the actual log message.
Resolves: #7014.
This extracts the initial refactoring required for #6944. Currently,
`connlib` sends all DNS queries over the same UDP socket as all the p2p
traffic for gateways and relays. In an earlier design of `connlib`, we
already did something similar to what we are doing here but using
`hickory_resolver` for the actual DNS resolution.
Instead of depending on hickory, we implement DNS resolution ourselves
by sending a UDP DNS query to the mapped upstream DNS server. There are
no retries, instead, we rely on the original DNS client to retry in case
a packet gets lost on the way.
Modelling recursive DNS queries as explicit events from the
`ClientState` is necessary for implementing DNS over TCP and DNS over
HTTPS. In both cases, the query to the upstream server isn't as simple
as emitting a `Transmit`. By modelling the query as an `async fn` within
`Io`, it will be possible to perform them all in one place.
Resolves: #6297.
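A simplified sketch of such an `async fn` (socket setup, buffer size and
timeout value are assumptions, not connlib's actual values):

```rust
use std::io;
use std::net::SocketAddr;
use tokio::net::UdpSocket;
use tokio::time::{timeout, Duration};

/// Forwards a single raw DNS query to the mapped upstream server and awaits one
/// response. There are no retries; the original DNS client is expected to retry.
async fn forward_udp_dns_query(upstream: SocketAddr, query: &[u8]) -> io::Result<Vec<u8>> {
    let socket = UdpSocket::bind("0.0.0.0:0").await?;
    socket.connect(upstream).await?;
    socket.send(query).await?;

    let mut buffer = vec![0u8; 2048];
    let len = timeout(Duration::from_secs(5), socket.recv(&mut buffer))
        .await
        .map_err(|_| io::Error::new(io::ErrorKind::TimedOut, "upstream DNS query timed out"))??;

    buffer.truncate(len);

    Ok(buffer)
}
```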
Currently, we emit a single TRACE log for each DNS resource entry that
doesn't match. This is quite spammy and often not needed.
When debugging DNS resources, it is useful to know which resources we
are matching against. To balance this, we now build a list of all DNS
resource domain patterns that we have and log a single "No resources
matched" log with that list in case none match.
With the latest version of `proptest-state-machine`, we no longer need
to use their traits because `Sequential::new` is now exposed. This makes
the overall setup less magical because there is less indirection.
This incorporates the feedback from #6939 after a discussion with
@conectado. We agreed that the protocol should be more event-based,
where each message has its own event type. Events MAY appear in pairs or
other cardinality combinations, meaning semantically they could be seen
as requests and responses. In general though, due to the unreliable
nature of IP, it is better to view them as events. Events are typically
designed to be idempotent which is important to make this protocol work.
Using events also means it is not as easy to fall into the "trap" of
modelling requests / responses on the control protocol level.
To correctly handle overlapping CIDR resources, we need to recompute
which ones are active. The `RoamClient` transition was missing that
despite the system-under-test actually doing that bit.
Adding the necessary function call fixes the two regression seeds
detected in CI.
Resolves: #6924.
Do we want to track 401s in sentry? If we see a lot of them, something
is likely wrong but I guess there is some level of 401s that users will
just run into.
Is there a way of marking these as "might not be a really bad error"?
---------
Co-authored-by: Not Applicable <ReactorScram@users.noreply.github.com>
At present, `connlib` utilises the portal as a signalling layer for any
kind of control message that needs to be exchanged between clients and
gateways. For anything regarding connectivity, this is crucial: Before
we have a direct connection to the gateway, we don't really have a
choice other than using the portal as a "relay" to e.g. exchange address
candidates for ICE.
However, once a direct connection has been established, exchanging
information directly with the gateway is faster and removes the portal
as a potential point of failure for the data plane.
For DNS resources, `connlib` intercepts all DNS requests on the client
and assigns its own IPs within the CG-NAT range to all domains that are
configured as resources. Thus, all packets targeting DNS resources will
have one of these IPs set as their destination. The gateway needs to
learn about all the IPs that have been assigned to a certain domain by
the client and perform NAT. We call this concept "DNS resource NAT".
Currently, the domain + the assigned IPs are sent together with the
`allow_access` or `request_connection` message via the portal. The new
control protocol defined in #6732 purposely excludes this information
and only authorises traffic to the entire resource which could also be a
wildcard-DNS resource.
To exchange the assigned IPs for a certain domain with the gateway, we
introduce our own p2p control protocol built on top of IP. All control
protocol messages are sent through the tunnel and thus encrypted at all
times. They are differentiated from regular application traffic as
follows:
- IP src is set to the unspecified IPv6 address (`::`)
- IP dst is set to the unspecified IPv6 address (`::`)
- IP protocol is set to reserved (`0xFF`)
The combination of all three should never appear as regular traffic.
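In other words, classifying an incoming packet as p2p control traffic
boils down to a check like this (sketch over already-parsed IPv6 header
fields):

```rust
use std::net::Ipv6Addr;

/// IP protocol number "Reserved" (0xFF), used to mark p2p control messages.
const CONTROL_PROTOCOL: u8 = 0xFF;

fn is_p2p_control_packet(src: Ipv6Addr, dst: Ipv6Addr, next_header: u8) -> bool {
    src == Ipv6Addr::UNSPECIFIED
        && dst == Ipv6Addr::UNSPECIFIED
        && next_header == CONTROL_PROTOCOL
}
```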
To ensure forwards-compatibility, the control protocol utilises a fixed
8-byte header where the first byte denotes the message kind. In this
current design, there is no concept of a request or response in the
wire-format. Each message is unidirectional and the fact that the two
messages we define in here appear in tandem is purely by convention. We
use the IPv6 payload length to determine the total length of the packet.
The payloads are JSON-encoded. Message types are free to choose whichever
encoding they'd like.
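A sketch of the framing (treating the seven non-kind header bytes as
reserved is an assumption here):

```rust
const HEADER_LEN: usize = 8;

/// Prepends the fixed 8-byte header: one byte for the message kind, the rest reserved.
fn encode_message(kind: u8, json_payload: &[u8]) -> Vec<u8> {
    let mut message = Vec::with_capacity(HEADER_LEN + json_payload.len());
    message.push(kind);
    message.extend_from_slice(&[0u8; 7]);
    message.extend_from_slice(json_payload);
    message
}

/// Splits an IPv6 payload into message kind and JSON payload.
/// The total length comes from the IPv6 payload length field.
fn decode_message(ipv6_payload: &[u8]) -> Option<(u8, &[u8])> {
    if ipv6_payload.len() < HEADER_LEN {
        return None;
    }

    Some((ipv6_payload[0], &ipv6_payload[HEADER_LEN..]))
}
```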
This protocol is sent through the WireGuard tunnel, meaning we are
effectively limited by our device MTU of 1280; otherwise we'd have to
implement fragmentation. For the messages of setting up the DNS resource
NAT, we are below this limit:
- UUIDs are 16 bytes
- Domain names are at most 255 bytes
- IPv6 addresses are 16 bytes * 4
- IPv4 addresses are 4 bytes * 4
Including the JSON serialisation overhead, this results in a total
maximum payload size of 402 bytes, which is well below our MTU.
Finally, another thing to consider here is that IP is unreliable,
meaning each use of this protocol needs to make sure that:
- It is resilient against message re-ordering
- It is resilient against packet loss
The details of how this is ensured for setting up the DNS resource NAT
are left to #6732.
One of the key differences of the new control protocol designed in #6461
is that creating new connections is idempotent. We achieve this by
having the portal generate the ICE credentials and the preshared-key for
the WireGuard tunnel. As long as the ICE credentials don't change, we
don't need to make a new connection.
For `snownet`, this means we are deprecating the previous APIs for
making connections. The client-side APIs will have to stay around until
we merge the client-part of the new control protocol. The server-side
APIs will have to stay around until we remove backwards-compatibility
from the gateway.
As part of #6732, we will be using the tunnel for the p2p control
protocol to setup the DNS resource NAT on the gateway. These messages
will be immediately queued after creating the connection, before ICE is
finished. In order to avoid additional retries within `firezone_tunnel`,
we directly attempt to encapsulate these packets.
`boringtun` internally has a buffer for these because prior to having a
WireGuard session, we don't actually have the necessary keys to encrypt
a packet. Thus, the packet passed here won't actually end up in the
referenced buffer of the `Connecting` state. Instead, if we haven't
already initiated a WireGuard handshake, attempting to encapsulate a
packet will trigger a WireGuard handshake initiation and _that_ is the
packet we need to buffer.
Dropping this packet would require us to wait until `boringtun`
retransmits it as part of the handshake timeout, causing an unnecessary
delay in the connection setup.
We call the `add_ips_with_resource` function with a list of `IpAddr` or
`IpNetwork`s. To make this more ergonomic for the caller, we can accept
an iterator that converts the items on the fly.
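A sketch of the signature change, using a hypothetical stand-in for the
real `IpNetwork` type:

```rust
use std::net::IpAddr;

/// Hypothetical stand-in for the real `IpNetwork` type.
#[derive(Debug, Clone, Copy)]
struct IpNetwork {
    addr: IpAddr,
    prefix: u8,
}

impl From<IpAddr> for IpNetwork {
    fn from(addr: IpAddr) -> Self {
        let prefix = match addr {
            IpAddr::V4(_) => 32,
            IpAddr::V6(_) => 128,
        };

        Self { addr, prefix }
    }
}

/// Accepting `impl Into<IpNetwork>` items lets callers pass `IpAddr`s or
/// `IpNetwork`s directly, converting on the fly instead of collecting first.
fn add_ips_with_resource<T>(ips: impl IntoIterator<Item = T>)
where
    T: Into<IpNetwork>,
{
    for ip in ips {
        let network: IpNetwork = ip.into();
        let _ = network; // Insert into the routing table in the real implementation.
    }
}
```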
To reduce the TTFB, we immediately force a WireGuard handshake as soon
as ICE completes. Currently, both the client and the gateway do this
which is somewhat counter-productive as there can only be one active
session.
With this patch, only the client will force a new WireGuard handshake as
soon as ICE completes.
With the new control protocol specified in #6461, the client will no
longer initiate new connections. Instead, the credentials are generated
deterministically by the portal based on the gateway's and the client's
public key. For as long as they use the same public key, they also have
the same in-memory state which makes creating connections idempotent.
What we didn't consider in the new design at first is that when clients
roam, they discard all connections but keep the same private key. As a
result, the portal would generate the same ICE credentials which means
the gateway thinks it can reuse the existing connection when new flows
get authorized. The client however discarded all connections (and
rotated its ports and maybe IPs), meaning the previous candidates sent
to the gateway are no longer valid and connectivity fails.
We fix this by also rotating the private keys upon reset. Rotating the
keys itself isn't enough, we also need to propagate the new public key
all the way "over" to the phoenix channel component which lives
separately from connlib's data plane.
To achieve this, we change `PhoenixChannel` to now start in the
"disconnected" state and require an explicit `connect` call. In
addition, the `LoginUrl` constructed by various components now acts
merely as a "prototype", which may require additional data to construct
a fully valid URL. In the case of client and gateway, this is the public
key of the `Node`. This additional parameter needs to be passed to
`PhoenixChannel` in the `connect` call, thus forming a type-safe
contract that ensures we never attempt to connect without providing a
public key.
For the relay, this doesn't apply.
Lastly, this allows us to tidy up the code a bit by:
a) generating the `Node`'s private key from the existing RNG
b) removing `ConnectArgs` which only had two members left
Related: #6461.
Related: #6732.