Resolves firezone/product#619
This additionally removes `ErrorType`:
- `on_error` is now used exclusively for recoverable errors, and no
longer has an `error_type` parameter.
- `on_disconnect` now has an optional `error` parameter, which specifies
the fatal error that caused the disconnect, if any (see the sketch after
this list).
- Replaced the connlib dependency with the `rust/connlib/clients/android/lib`
project
- Added `rust-android-gradle` to android project
- Set the `cargo build` target directory to
`rust/connlib/clients/android/lib/build/cargo-target`
- Moved the `logger`, `session`, and `vpn` classes into their own
packages.
- Added `SessionCallback` contract for the session callbacks.
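For illustration, the revised callback surface could look roughly like the
following Rust sketch; the trait and parameter names are illustrative, not
connlib's exact FFI types:

```rust
/// A hedged sketch of the revised callbacks (names are illustrative).
trait SessionCallbacks {
    /// Recoverable errors only; there is no `error_type` parameter any more.
    fn on_error(&self, error: &str);

    /// Called when the session ends; `error` is present only if a fatal
    /// error caused the disconnect.
    fn on_disconnect(&self, error: Option<&str>);
}

struct LoggingCallbacks;

impl SessionCallbacks for LoggingCallbacks {
    fn on_error(&self, error: &str) {
        eprintln!("recoverable error: {error}");
    }

    fn on_disconnect(&self, error: Option<&str>) {
        match error {
            Some(fatal) => eprintln!("disconnected due to fatal error: {fatal}"),
            None => eprintln!("disconnected cleanly"),
        }
    }
}
```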
---------
Signed-off-by: Pratik Velani <pratikvelani@gmail.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
This follows up on the discussion in #1744 and brings connlib in line
with the callback revisions outlined in firezone/product#586.
(It also adds some logging to the Apple bridge that was helpful when
testing this)
---------
Co-authored-by: Roopesh Chander <roop@roopc.net>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
This PR improves the build process for the macOS / iOS apps by building
connlib as part of the macOS / iOS app build.
Fixes firezone/product#625.
This is how the build would work after this PR:
- `build-rust.sh` creates `libconnlib.a` for the appropriate target
triples only; lipo is not used. When creating macOS debug builds, it’s
built only for the native architecture.
- The network extension targets in the Xcode project set the cargo
target dir as a library search path, so that the Xcode build for a
target triple can pick up the appropriate `libconnlib.a` at link time.
Swift code reorganizations:
- connlib’s Adapter has moved to the main app
- connlib’s CallbackHandler’s logic has moved to Adapter, which is set
as CallbackHandler’s delegate. The CallbackHandler serves as an
interface to receive callbacks from the FFI; since it has to change
whenever the FFI changes, it remains in the connlib directory. When the
Rust FFI changes, we can modify the CallbackHandler class as part of the
same PR and leave the delegate unchanged, so the app continues to build
without errors.
- `Connlib.xcodeproject` and build scripts for building
`Connlib.xcframework` are removed
- Connlib headers and Swift files are copied to
`FirezoneNetworkExtension/Connlib` as part of the build process, and
used from there.
Rust build changes:
- The rust target dir remains the same, but it’s now used to set
`CARGO_TARGET_DIR` (rather than being passed explicitly as
`--target-dir`), so that the same target dir can be used to populate
Xcode’s library search paths
- The `build.rs` for connlib-apple had lots of code to build Swift code
as part of the Rust build. This PR reverts it to the previous simple
version. With this PR, building connlib-apple (i.e. running
`build-rust.sh`) only builds the Rust code.
- We don't set `cargo:rerun-if-env-changed=CONNLIB_MOCK` because it's
not required.
- The Rust CI job for building connlib-apple is removed. It's built when
the macOS / iOS apps are built in swift.yml. This means that with this
PR, connlib-apple is tested only when `rust/connlib/**` changes, not
when `rust/**` changes. Is that ok?
Other changes not directly related to the build process change but part
of this PR:
- There’s a cleanup script: `./cleanup.sh`
- Fixed a typo in `swift-pass-checks.yml`: “paths-ginore”
Previously, we would access the state around allocations from different
places. This actually led to a minor memory leak where we wouldn't clean
up the `allocations_by_port` table. We refactor the code slightly to
avoid this.
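A minimal sketch of the refactoring idea, with illustrative field and type
names rather than the relay's actual ones: allocation state is only ever
removed through a single function, so the port index cannot be forgotten.

```rust
use std::collections::HashMap;

struct Allocation {
    port: u16,
}

#[derive(Default)]
struct Allocations {
    by_id: HashMap<u64, Allocation>,
    by_port: HashMap<u16, u64>,
}

impl Allocations {
    fn insert(&mut self, id: u64, allocation: Allocation) {
        self.by_port.insert(allocation.port, id);
        self.by_id.insert(id, allocation);
    }

    /// The only way to delete an allocation; both maps stay consistent.
    fn remove(&mut self, id: u64) -> Option<Allocation> {
        let allocation = self.by_id.remove(&id)?;
        self.by_port.remove(&allocation.port);
        Some(allocation)
    }
}
```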
---------
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
With this patch, the relay exposes a `--json` flag and a `JSON_LOG` env
variable that activate logging in the JSON format expected by Google
Cloud:
https://cloud.google.com/logging/docs/structured-logging
In addition, we make use of spans to record contextual information as
first-class variables that are available in the context of every
message. An example output here is:
```
{"time":"2023-07-06T19:54:42.643694430Z","target":"relay","logging.googleapis.com/sourceLocation":{"file":"relay/src/main.rs","line":"156"},"severity":"INFO","message":"Seeding RNG from '0'"}
{"time":"2023-07-06T19:54:42.644408014Z","target":"relay","logging.googleapis.com/sourceLocation":{"file":"relay/src/main.rs","line":"130"},"severity":"INFO","message":"Listening for incoming traffic on UDP port 3478"}
{"time":"2023-07-06T19:54:42.843247996Z","target":"relay","logging.googleapis.com/sourceLocation":{"file":"relay/src/server.rs","line":"417"},"span":{"lifetime":"600","name":"allocate"},"spans":[{"sender":"127.0.0.1:46406","transaction_id":"0531a911a24d1e5297b94cb2","name":"client"},{"lifetime":"600","name":"allocate"}],"severity":"INFO","ip4RelayAddress":"127.0.0.1:65460","message":"Created new allocation"}
{"time":"2023-07-06T19:54:42.851623041Z","target":"relay","logging.googleapis.com/sourceLocation":{"file":"relay/src/server.rs","line":"569"},"span":{"allocation":"AID-1","peer_address":"127.0.0.1:42314","requested_channel":"16384","name":"channel_bind"},"spans":[{"sender":"127.0.0.1:46406","transaction_id":"e99e07e482789cdc30bd2b50","name":"client"},{"allocation":"AID-1","peer_address":"127.0.0.1:42314","requested_channel":"16384","name":"channel_bind"}],"severity":"INFO","message":"Successfully bound channel"}
{"time":"2023-07-06T19:54:42.852889208Z","target":"relay","logging.googleapis.com/sourceLocation":{"file":"relay/src/server.rs","line":"288"},"span":{"allocation_id":"AID-1","channel":16384,"recipient":"127.0.0.1:46406","sender":"127.0.0.1:42314","name":"peer"},"spans":[{"allocation_id":"AID-1","channel":16384,"recipient":"127.0.0.1:46406","sender":"127.0.0.1:42314","name":"peer"}],"severity":"DEBUG","message":"Relaying 32 bytes"}
{"time":"2023-07-06T19:54:42.854625857Z","target":"relay","logging.googleapis.com/sourceLocation":{"file":"relay/src/server.rs","line":"619"},"span":{"channel":"16384","recipient":"127.0.0.1:42314","name":"channel_data"},"spans":[{"sender":"127.0.0.1:46406","name":"client"},{"channel":"16384","recipient":"127.0.0.1:42314","name":"channel_data"}],"severity":"DEBUG","message":"Relaying 32 bytes"}
```
For some reason, the current `span` is always duplicated but I don't
think that is a big issue. When run using the regular log formatter, it
looks like this:
```
2023-07-06T20:02:33.939273Z INFO relay: Seeding RNG from '0'
2023-07-06T20:02:33.940153Z INFO relay: Listening for incoming traffic on UDP port 3478
2023-07-06T20:02:34.135801Z INFO client{sender=127.0.0.1:33919 transaction_id="7092a2363377709cd18b9d98"}:allocate{lifetime=600}: relay: Created new allocation ip4_relay_address=127.0.0.1:65460
2023-07-06T20:02:34.144833Z INFO client{sender=127.0.0.1:33919 transaction_id="4e1a18e58953242c92a075a3"}:channel_bind{requested_channel=16384 peer_address=127.0.0.1:47859 allocation="AID-1"}: relay: Successfully bound channel
2023-07-06T20:02:34.145501Z DEBUG peer{sender=127.0.0.1:47859 allocation_id=AID-1 recipient=127.0.0.1:33919 channel=16384}: relay: Relaying 32 bytes
2023-07-06T20:02:34.146863Z DEBUG client{sender=127.0.0.1:33919}:channel_data{channel=16384 recipient=127.0.0.1:47859}: relay: Relaying 32 bytes
```
This provides lots of contextual information in a DRY and easily
parse-able way.
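For illustration, the span pattern with `tracing` looks roughly like the
sketch below (the field values are made up for the example): fields recorded
on a span are attached to every message emitted while the span is entered,
which is what keeps the messages themselves short.

```rust
use tracing::{info, info_span};

fn main() {
    tracing_subscriber::fmt().init();

    // Fields recorded on the span show up on every event inside it.
    let span = info_span!("allocate", lifetime = 600);
    let _guard = span.enter();

    info!(ip4_relay_address = "127.0.0.1:65460", "Created new allocation");
}
```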
---------
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
Instead of having both the portal URL and the token optional, we default
the portal URL and decide, based on the presence of the token, whether
we should connect to the portal on startup. This allows the relay to be
used and tested standalone and keeps the number of config options and
error cases small.
We require the user to configure the full path of the websocket and
thus avoid the need to duplicate the connlib function. Given that most
users will never need to override this option, this seems like a good
trade-off.
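As a rough sketch of that startup decision (the helper and variable names
are hypothetical, not the relay's actual code): only the presence of a token
determines whether the relay connects to the portal.

```rust
/// Hypothetical helper illustrating the decision; the real relay does this
/// inside its startup code.
fn should_connect_to_portal(portal_token: Option<&str>) -> bool {
    portal_token.is_some()
}

fn main() {
    let token = std::env::var("PORTAL_TOKEN").ok();
    if should_connect_to_portal(token.as_deref()) {
        println!("connecting to the (defaulted) portal URL");
    } else {
        println!("no token given; running standalone");
    }
}
```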
Resolves https://github.com/firezone/product/issues/614.
This PR fixes a bunch of small things to enable a new flow for testing
clients pinging a resource within docker compose.
Masquerading/forwarding is enabled directly in the container for now;
this might change in the future.
Also added a README on how to run this locally.
---------
Signed-off-by: Gabi <gabrielalejandro7@gmail.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
With this PR, the full control-plane message flow is working. That
means if you run:
```
docker compose up -d
docker compose exec -it client "ping 172.20.0.2" # will fix this IP later
```
Messages start flowing to the gateway. The gateway still doesn't
correctly forward the messages to the resource since masquerading isn't
working yet, although I suspect there might be an additional problem.
I'll fix this in my next PR along with a README on how to test this
whole flow.
This PR also fixes how we send the stamp secret to the gateway from the
relay, but I still see some warnings from webrtc that I'm sure are due
to a mismatch between how webrtc-rs and the relay handle messages (the
most important being `bind() failed: unexpected response type`). I will
take a look at that and at a way to test that the flow works when:
1. hole-punching is available
2. traffic goes through the relay when it's not
Right now the flow works without hole-punching or a relay because the
gateway is on the same network in the docker compose setup.
Resolves firezone/product#607
Setting the env var `CONNLIB_MOCK` when building through either
`build-rust.sh` or `gradle` will activate the `mock` feature.
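As a hedged illustration of what a compile-time `mock` feature enables (the
function below is made up, not connlib's actual API), conditional compilation
swaps implementations without any runtime cost:

```rust
/// Illustrative only: with `--features mock`, the mock variant is compiled in.
#[cfg(feature = "mock")]
fn tunnel_description() -> &'static str {
    "mock tunnel: no real network traffic"
}

#[cfg(not(feature = "mock"))]
fn tunnel_description() -> &'static str {
    "real tunnel"
}

fn main() {
    println!("{}", tunnel_description());
}
```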
- Instead of having two very similar jobs, we run our fmt, clippy and
test steps across all crates and operating systems.
- We remove the dependency of the android and apple builds on the tests
and thus get faster feedback.
- We force clippy to fail on any warning. This one is super important
IMO. Warnings in Rust are very useful and ignoring them can lead to bugs
(think "unused Result" etc.).
Resolves #1714.
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Francesca Lovebloom <franlovebloom@gmail.com>
Runs `cargo fmt` on the entire `rust/` directory. Somehow this doesn't
seem to be enforced; I think that is because we changed the previous CI
to only run for the `relay` crate.
I'd like to merge this first to avoid the diff, and in a 2nd PR we can
work on unifying CI again.
Due to a silly bash mistake (I hate bash), the error from the gateway
binary wasn't actually propagated to the script. Thus, we did not notice
that it had been broken for a while.
Attempting to fix it turned up that we were double-hexing the relay
secret and using invalid passwords for the clients.
This PR fixes `docker compose up`. It doesn't get the test client ->
resource flow working yet, but it prevents anything from erroring at
startup.
This fixes:
* tokens (use the correct token for the client user agent we are using)
* randomize `name_suffix` at startup for connlib (we will eventually
allow setting it manually)
* remove port ranges for the relay (see firezone/product#613)
This makes it possible to build the Apple/Android FFI bridges and
integrate them with their respective client apps.
---------
Signed-off-by: Francesca Lovebloom <franlovebloom@gmail.com>
Co-authored-by: Roopesh Chander <roop@roopc.net>
There are problems building the docker images on macOS using musl due
to issues with ring, so we started using slim-debian with glibc for
development.
When using `docker compose build`, or any other way of building docker
images in parallel, the way the cache was used in the rust Dockerfile
made the caches of different images overlap and corrupt each other. We
add a `locked` setting, which prevents multiple writers to the same
cache, to fix this behaviour.
This brings connlib from its own separate repo into firezone's
monorepo.
On top of bringing connlib in, we also add a unified Dockerfile for all
rust binaries and a docker-compose file that can run a headless client,
a relay and a gateway, which will eventually test the whole flow between
a client and a resource. For this to work we also incorporated some
elixir scripts to generate portal tokens for those components.
With this PR, the relay can be configured with a WebSocket URL on startup. If given, it will attempt to connect to it and join the `relay` room with its `stamp_secret`. Once the `init` message is received, regular relay operation will begin.
Targets specified in the `rust-toolchain.toml` file are automatically installed by `rustup`. This avoids setup steps for other devs and also simplifies the CI setup.
To be able to compile native code for musl, we need `musl-gcc`, which comes with the `musl-tools` package on Ubuntu.
Previously, the relay treated the `stamp_secret` internally as bytes and shared it with the outside world as a hex string. The portal, however, treats it as an opaque string and uses its UTF-8 bytes to create the username and password.
This patch aligns the relay's behaviour with the portal and stores the `stamp_secret` internally as a string.
This saves us several lines of code and allows the relay to be
configured via commandline arguments in addition to env variables. Note
that because of `#[arg(env)]`, all of these can still be configured via
environment variables too.
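A minimal sketch of that pattern with clap (the option names here are
hypothetical, and clap's `env` cargo feature must be enabled):

```rust
use clap::Parser;

#[derive(Parser, Debug)]
struct Args {
    /// Can be set as `--listen-port 3478` or via `LISTEN_PORT=3478`.
    #[arg(long, env = "LISTEN_PORT", default_value_t = 3478)]
    listen_port: u16,

    /// Optional token; if absent the relay runs standalone.
    #[arg(long, env = "PORTAL_TOKEN")]
    portal_token: Option<String>,
}

fn main() {
    let args = Args::parse();
    println!("{args:?}");
}
```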
To complete the authentication scheme for the relay, we need to prompt
the client with a nonce when they send an unauthenticated request. The
semantic meaning of a nonce is opaque to the client. As a starting
point, we implement a count-based scheme. Each nonce is valid for 10
requests. After that, a request will be rejected with a 401 and the
client has to authenticate with a new nonce.
This scheme provides a basic form of replay-protection.
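A rough sketch of such a count-based store, using illustrative names rather
than the relay's actual types:

```rust
use std::collections::HashMap;

/// Each nonce handed out with a 401 is valid for a fixed number of requests.
const MAX_USES: u32 = 10;

#[derive(Default)]
struct Nonces {
    remaining: HashMap<String, u32>,
}

impl Nonces {
    /// Remember a freshly issued nonce, e.g. one included in a 401 response.
    fn issue(&mut self, nonce: String) {
        self.remaining.insert(nonce, MAX_USES);
    }

    /// Returns `true` if the nonce is still valid and consumes one use.
    fn consume(&mut self, nonce: &str) -> bool {
        match self.remaining.get_mut(nonce) {
            Some(uses) if *uses > 0 => {
                *uses -= 1;
                true
            }
            _ => false,
        }
    }
}
```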
We introduce dedicated types for each message that the `Server` can
handle. This allows us to make the functions public because the type
system now guarantees that they are either parsed from bytes or
constructed with the correct data.
The latter will be useful for writing tests against a richer API.
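The shape of such a type could look roughly like this (the names and the
placeholder parsing are illustrative, not the relay's actual code):

```rust
use std::time::Duration;

/// A request type that only exists if it was parsed from bytes or built
/// from valid data, so handlers taking it can safely be public.
pub struct Allocate {
    lifetime: Duration,
}

impl Allocate {
    /// Parse from the wire; `None` means the bytes were not a valid request.
    pub fn parse(bytes: &[u8]) -> Option<Self> {
        // Placeholder for the real STUN parsing; only the shape matters here.
        if bytes.is_empty() {
            return None;
        }
        Some(Self {
            lifetime: Duration::from_secs(600),
        })
    }

    /// Construct directly with known-good data, which is handy in tests.
    pub fn new(lifetime: Duration) -> Self {
        Self { lifetime }
    }

    pub fn lifetime(&self) -> Duration {
        self.lifetime
    }
}
```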
With this patch, the relay can parse and respond to allocation
requests. I ran some basic tests against https://icetest.info/ and
implemented a regression test from the logged data.
In writing this, I also had to slightly change the design of `Server`
(as expected). Event handlers for incoming data no longer return a
message directly. Instead, the caller is responsible for draining
`Command`s from the server.
When creating an allocation, we need to start listening on a new port.
This needs to happen outside the `Server` as I am going for a sans-IO
style. We emit a `Command` that instructs the main event loop to listen
on a new port. Any incoming data on that port will be forwarded to the
`Server`.
At the moment, this incoming data is just dropped. This is actually
standards-compliant because we cannot handle binding requests yet which
would allow this data to be forwarded to the client.
In some areas, the code is still a bit rough but I expect to iron those
things out as we go along.
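For illustration, the sans-IO shape described above could look roughly like
this sketch (the names and the trivial handler body are made up, not the
relay's actual implementation):

```rust
use std::net::SocketAddr;

/// Instructions for the event loop; the server itself never touches sockets.
pub enum Command {
    SendMessage { payload: Vec<u8>, recipient: SocketAddr },
    AllocatePort { port: u16 },
}

#[derive(Default)]
pub struct Server {
    pending: Vec<Command>,
}

impl Server {
    /// Handle an incoming datagram; instead of replying directly, queue
    /// commands for the caller to execute.
    pub fn handle_client_input(&mut self, _bytes: &[u8], sender: SocketAddr) {
        self.pending.push(Command::SendMessage {
            payload: b"response".to_vec(),
            recipient: sender,
        });
    }

    /// The main event loop drains commands and performs the actual IO.
    pub fn next_command(&mut self) -> Option<Command> {
        self.pending.pop()
    }
}
```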
This is an alternative to https://github.com/firezone/firezone/pull/1602
that implements the server using a library I've found called
`stun_codec`.
It already has support for parsing a variety of attributes.
The following is a nice website to test some of the functionality:
https://icetest.info/
The server is still listening on:
`ec2-3-89-112-240.compute-1.amazonaws.com:3478`.