Commit Graph

19 Commits

Author SHA1 Message Date
Thomas Eizinger
a4fef5f6e7 test(connlib): index ICMP packets by custom payload (#6438)
In the `tunnel_test` test suite, we send ICMP requests with arbitrary
sequence numbers and identifiers. Due to the NAT implementation of the
gateway, the sequence number and identifier chosen by the client are not
necessarily the same as the ones sent to the resource. Thus, it is
impossible to correlate the ICMP packets sent by the client with the
ones arriving at the gateway.

Currently, our test suite thus relies on the ordering of packets to
match them up and assert properties on them, like whether they target
the correct resource. As soon as we want to send multiple packets
concurrently, this order is not necessarily stable.

ICMP echo requests can contain an arbitrary payload. We utilise this
payload to embed a random u64 that acts as the unique identifier of an
ICMP request. This allows us to correlate the packets arriving at the
gateway with the ones sent by the client, making the test suite more
robust and ready for handling concurrent ICMP packets.
2024-08-27 04:15:39 +00:00
Thomas Eizinger
3b56664e02 test(rust): ensure deterministic proptests (#6319)
For quite a while now, we have been making extensive use of
property-based testing to ensure `connlib` works as intended. The idea
of proptests is that - given a certain seed - we deterministically
sample test inputs and assert properties on a given function.

If the test fails, `proptest` prints the seed which can then be added to
a regressions file to iterate on the test case and fix it. It is quite
obvious that non-determinism in how the test input gets generated is no
bueno and reduces the value we get out of these tests a fair bit.

The `HashMap` and `HashSet` data structures are known to be
non-deterministic in their iteration order. This causes non-determinism
during the input generation because we make use of a lot of maps and
sets to gradually build up the test input. We fix all uses of `HashMap`
and `HashSet` by replacing them with `BTreeMap` and `BTreeSet`.

To ensure this doesn't regress, we refactor `tunnel_test` to not make
use of proptest's macros and instead, we initialise and run the test
ourselves. This allows us to dump the sampled state and transitions into
a file per test run. In CI, we then run a 2nd iteration of all
regression tests and compare the sampled state and transitions with the
previous run. They must match byte-for-byte.

Finally, to discourage use of non-deterministic iteration, we ban the
use of the iteration functions on `HashMap` and `HashSet` across the
codebase. This doesn't catch iteration in a `for`-loop but it is better
than not linting against it at all.

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-08-16 23:15:58 +00:00
Thomas Eizinger
4ae64f0257 fix(connlib): index forwarded DNS queries by ID + socket (#6233)
When forwarding DNS queries, we need to remember the original source
socket in order to send the response back. Previously, this mapping was
indexed by the DNS query ID. As it turns out, at least Windows doesn't
have a global DNS query ID counter and may reuse them across different
DNS servers. If that happens and two of these queries overlap, then we
match the wrong responses together.

In the best case, this produces bad DNS results on the client. In the
worst case, those queries were for DNS servers with different IP
versions in which case we triggered a panic in connlib further down the
stack where we created the IP packet for the response.

To fix this, we first and foremost remove the explicit `panic!` from the
`make::` functions in `ip-packet`. Originally, these functions were only
used in tests but we started to use them in production code too and
unfortunately forgot about this panic. By introducing a `Result`, all
call-sites are made aware that this can fail.

Second, we fix the actual indexing into the data structure for forwarded
DNS queries to also include the DNS server's socket. This ensures we
don't treat the DNS query IDs as globally unique.

Third, we replace the panicking path in
`try_handle_forwarded_dns_response` with a log statement, meaning if the
above assumption turns out wrong for some reason, we still don't panic
and simply don't handle the packet.
2024-08-09 07:01:57 +00:00
Thomas Eizinger
128d0eb407 feat(connlib): transparently forward non-resources DNS queries (#6181)
Currently, `connlib` depends on `hickory-resolver` to perform DNS
queries for non-resources. This is unnecessary. Instead of buffering the
original UDP DNS query, consulting hickory to resolve the name and
mapping the response back, we can simply take the UDP payload and send
it via our protected socket directly to the original upstream DNS
server.

This ensures `connlib` is as transparent as possible for DNS queries for
non-resources. Additionally, it removes a lot of error handling and
other cruft that we currently have to perform because we are using
hickory. For example, hickory will automatically retry a DNS query after
a certain timeout. However, the OS / client talking to `connlib` will
also retry after a certain timeout because it is making DNS queries over
an unreliable transport (UDP). It is thus unnecessary for us to do that
internally.

To correctly test this change, our test-suite needed some refactoring.
Specifically, DNS servers are now modelled as dedicated `Host`s that can
receive (UDP) traffic.

Lastly, we can remove our dependency on `hickory-proto` and
`hickory-resolver` everywhere and only use `domain` for parsing DNS
messages.

Resolves: #6141.
Related: #6033.
Related: #4800. (Impossible to happen with this design)
2024-08-07 08:54:49 +00:00
Thomas Eizinger
1aa95ed17e fix(connlib): be explicit about unsupported ICMP types (#5611)
Our NAT table uses TCP & UDP ports for its entries. To correctly handle
ICMP requests and responses, we use the ICMP identifier in those
packets. All other ICMP messages are currently unsupported.

The errors paths for accessing these fields, i.e. ports for UDP/TCP and
identifier for ICMP currently conflate two different errors:

- Unsupported IP payload: it is neither TCP, UDP or ICMP
- Unsupported ICMP type: it is not an ICMP request or response

This makes certain logs look worse than they are because we say
"Unsupported IP protocol: Icmpv6". To avoid this, we create a dedicated
error variant that calls out the unsupported ICMP type.

Fixes: #5594.
2024-06-28 01:13:25 +00:00
Thomas Eizinger
8cb3659636 chore(connlib): implement some missing ICMP conversions (#5475)
So far, our packet translation only implemented the bare-minimum for
ICMP to work. There are a few things left that haven't been dealt with.
This PR adds additional conversions where it was easy.

There are still some left that require more elaborate mangling of the
packet, like updating pointer fields.
2024-06-24 23:48:14 +00:00
Gabi
2ea6a5d07e feat(gateway): NAT & mangling for DNS resources (#5354)
As part of #4994, the IP translation and mangling of packets to and from
DNS resources is moved to the gateway. This PR represents the
"gateway-half" of the required changes.

Eventually, the client will send a list of proxy IPs that it assigned
for a certain DNS resource. The gateway assigns each proxy IP to a real
IP and mangles outgoing and incoming traffic accordingly. There are a
number of things that we need to take care of as part of that:

- We need to implement NAT to correctly route traffic. Our NAT table
maps from source port* and destination IP to an assigned port* and real
IP. We say port* because that is only true for UDP and TCP. For ICMP, we
use the identifier.
- We need to translate between IPv4 and IPv6 in case a DNS resource e.g.
only resolves to IPv6 addresses but the client gave out an IPv4 proxy
address to the application. This translation is was added in #5364 and
is now being used here.

This PR is backwards-compatible because currently, clients don't send
any IPs to the gateway. No proxy IPs means we cannot do any translation
and thus, packets are simply routed through as is which is what the
current clients expect.

---------

Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-06-19 01:15:27 +00:00
Gabi
8cc28499e9 chore(connlib): implement IP translation according to RFC6145 (#5364)
As part of #4994, we need to translate IP packets between IPv4 and IPv6.
This PR introduces the `ConvertiblePacket` abstraction that implements
this.
2024-06-14 21:33:07 +00:00
Thomas Eizinger
4117639cf4 fix(connlib): reply with SERVFAIL on DNS query errors (#5263)
Currently, we simply drop a DNS query if we can't fulfill it. Because
DNS is based on UDP which is unreliable, a downstream system will
re-send a DNS query if it doesn't receive an answer within a certain
timeout window.

Instead of dropping queries, we now reply with `SERVFAIL`, indicating to
the client that we can't fulfill that DNS query. The intent is that this
will stop any kind of automated retry-loop and surface an error to the
user.

Related: #4800.

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-06-08 02:12:46 +00:00
Thomas Eizinger
63d7c35717 test(connlib): send DNS queries to non-resources (#5168)
Currently, `tunnel_test` only sends DNS queries to a client's configured
DNS resources. However, connlib receives _all_ DNS requests made on a
system and forwards them to the originally set upstream resolvers in
case they are for non-resources.

To capture the code paths for forwarding these DNS queries, we introduce
a `global_dns_records` strategies that pre-fills the `ReferenceState`
with DNS records that are not DNS resources. Thus, when sampling for a
domain to query, we might pick one that is not a DNS resource.

The expectation here is that this query still resolves (we assert that
we don't have any unanswered DNS queries). In addition, we introduce a
`Transition` to send an ICMP packet to such a resolved address. In a
real system, these wouldn't get routed to connlib but if they were, we
still want to assert that they don't get routed.

There is a special-case where the chosen DNS server is actually a CIDR
resource. In that case, the DNS packet gets lost and we use it to
trigger initiate a connection to the corresponding gateway. In that
case, a repeated query to such a DNS server actually gets sent via the
tunnel to the gateway. As such, we need generate a DNS response
similarly to how we need to send an ICMP reply.

This allows us to add a few more useful assertions to the test: Correct
mangling of source and destination port of UDP packets.
2024-06-04 05:52:22 +00:00
Gabi
499edd2dd4 chore(connlib): fix echo request and reply packets (#5169)
When creating an echo request or reply packet using pnet it uses the
whole packet since the identifier and sequence is part of the icmp
header not the payload.

Those fields aren't accessible unless the packet is converted to an echo
request or reply because the interpretation of that header field depends
on the specific type of packet.
2024-05-31 11:15:46 +00:00
Thomas Eizinger
ce929e1204 test(connlib): resolve DNS resources in tunnel_test (#5083)
Currently, `tunnel_test` only sends ICMPs to CIDR resources. We also
want to test certain properties in regards to DNS resources. In
particular, we want to test:

- Given a DNS resource, can we query it for an IP?
- Can we send an ICMP packet to the resolved IP?
- Is the mapping of proxy IP to upstream IP stable?

To achieve this, we sample a list of `IpAddr` whenever we add a DNS
resource to the state. We also add the transition
`SendQueryToDnsResource`. As the name suggests, this one simulates a DNS
query coming from the system for one of our resources. We simulate A and
AAAA queries and take note of the addresses that connlib returns to us
for the queries.

Lastly, as part of `SendICMPPacketToResource`, we now may also sample
from a list of IPs that connlib gave us for a domain and send an ICMP
packet to that one.

There is one caveat in this test that I'd like to point out: At the
moment, the exact mapping of proxy IP to real IP is an implementation
detail of connlib. As a result, I don't know which proxy IP I need to
use in order to ping a particular "real" IP. This presents an issue in
the assertions: Upon the first ICMP packet, I cannot assert what the
expected destination is. Instead, I need to "remember" it. In case we
send another ICMP packet to the same resource and happen to sample the
same proxy IP, we can then assert that the mapping did not change.
2024-05-31 04:44:30 +00:00
Thomas Eizinger
9c1af37c85 chore(ip-packet): model ICMP packets (#5147)
An `IpPacket` may contain an ICMP or ICMPv6 packet. To extract metadata
like the sequence number of identifier from it, we need to be able to
parse an `IpPacket`'s payload into the appropriate packets.

Extracted out of #5083.

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
2024-05-29 01:08:13 +00:00
Thomas Eizinger
994b25bf71 test(connlib): ensure ICMP replies make it back to the client (#5104)
In order to exercise all codepaths of connlib, we need to send traffic
in both directions. This patch sends ICMP replies from the gateway to
the client upon receipt of an ICMP request.
2024-05-24 00:24:19 +00:00
Thomas Eizinger
92676f0f53 test(connlib): simulate IO in state machine tests (#4728)
This is similar to #4097 and #4585 but for the entire `ClientState` and
`GatewayState`. We also do it in the context of a property-based test
with the vision that we can deterministically explore a large space of
state transitions and see where our main property breaks: Being able to
send an ICMP packet from the client to the gateway.

In other words, we now correctly pass all the `Transmit`s back and forth
between the components as if they would receive it from the network. Due
to the nature of property-based tests, this already exercises a very
large input space. For example, if the client does not have an IPv6
socket and the gateway doesn't have an IPv4 socket, this test already
checks whether we then correctly fall back to using a relay (because the
allocation we make on the relay is the only network path where the STUN
requests pass through).

What this does not (yet) do is set up a proper network topology. The
`dispatch_transmit` function will happily "route" a `Transmit` from e.g.
the client to the gateway even if they are in different subnets. In
other words, these tests assume that the actual network itself works and
we can exchange UDP packets between the components.

For now, we only send ICMPs to CIDR resources. As a next step, we can
extend this to DNS resources by sending DNS queries for our DNS
resources and then sending an ICMP to the resolved IP.
2024-05-22 23:10:58 +00:00
Gabi
941b18b2a8 chore(connlib): set_payload is called for udp packets for tests (#5018)
came up while working on #4994
2024-05-16 23:55:12 +00:00
Gabi
68ece0a940 feat(connlib): traffic filtering (#4779)
This implements traffic filtering on the gateway. Filters are set on the
portal, per-resource, in an allow-list manner.

If no filters exist for a given resource all packets are allowed,
otherwise only packets that matches port/protocol for the filters are
allowed, otherwise they are dropped.

Filters can be either TCP, UDP or ICMP. For the first 2 multiple ports
can be given. Furthermore, multiple filters can exists for the same
resource.

To be able to add and remove filters with the same IP/CIDR we keep
around the whole list of filters for any given peer using an ID map and
recalculate the IP each time something is added is removed.

This allows us to remove filters and simply recalculate the allowlist
for each IP.

Furthermore, for any IP, all rules apply, meaning if there are multiple
IPs that apply for a resource all port/protocol combinations for that IP
will apply.

This works well right now for DNS resources, since access is requested
by DNS name, then the resource for that DNS name will arrive at the
gateway, and the port filtering will apply given that resource(and any
other resource with the same IP).

However, since the client has no idea of the filters, it can't request
the resource access based on the port/protocol combination and we are
still using the most specific("longest match") IP. This will mean that
for overlapping CIDR resources, only the rules for the most specific
will be used, even if the gateway supports applying them all, since it
will not have the other resources. This will be solved in #4789.

It can also lead to some weirdness, let's say that you have 10.0.0.0/24
-> TCP/80 and 10.0.0.0/16 -> TCP/443 for your user.

The user tries to access 10.0.0.1, and will then only be allowed port
80. At some point the user might access 10.1.0.1 and it will be allowed
port 443. But from that point on, the user will be allowed to access 80
and 443 in 10.0.0.1 because the rules correctly work on the gateway, the
problem is the client side. Again, #4789 will fix this.

Left for next PRs (in tentative order!):

- #4792 
- #4789 

Depends on: #4773.
Resolves #2030.
Resolves #4791.

---------

Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
2024-05-07 19:47:49 +00:00
Thomas Eizinger
e387e3e13d chore(ip-packet): address PR feedback (#4721)
Addressing feedback from #4702.
2024-04-22 16:32:54 +00:00
Thomas Eizinger
3669f010c4 chore: extract common ip-packet crate (#4702)
With the introduction of `snownet`, we temporarily duplicated the
`IpPacket` abstraction from `firezone-tunnel` because there was no
common place to put it. Overtime, these have grown in size and we needed
to convert back and forth between time. Lately, we've also been adding
more tests to both `snownet` and `firezone-tunnel` that needed to create
`IpPacket`s as test data.

This seems like an appropriate time to do away with this duplication by
introducing a dedicated crate that acts as a facade for the
`pnet_packet` crate, extending it with the functionality that we need.

Resolves: #3926.

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2024-04-19 15:05:29 +00:00