With this patch, the relay exposes a `--json` flag and a `JSON_LOG` env
variable that switch logging to the JSON format expected by Google Cloud:
https://cloud.google.com/logging/docs/structured-logging
In addition, we make use of spans to record contextual information as
first-class fields that are attached to every message emitted within
their context. Example output:
```
{"time":"2023-07-06T19:54:42.643694430Z","target":"relay","logging.googleapis.com/sourceLocation":{"file":"relay/src/main.rs","line":"156"},"severity":"INFO","message":"Seeding RNG from '0'"}
{"time":"2023-07-06T19:54:42.644408014Z","target":"relay","logging.googleapis.com/sourceLocation":{"file":"relay/src/main.rs","line":"130"},"severity":"INFO","message":"Listening for incoming traffic on UDP port 3478"}
{"time":"2023-07-06T19:54:42.843247996Z","target":"relay","logging.googleapis.com/sourceLocation":{"file":"relay/src/server.rs","line":"417"},"span":{"lifetime":"600","name":"allocate"},"spans":[{"sender":"127.0.0.1:46406","transaction_id":"0531a911a24d1e5297b94cb2","name":"client"},{"lifetime":"600","name":"allocate"}],"severity":"INFO","ip4RelayAddress":"127.0.0.1:65460","message":"Created new allocation"}
{"time":"2023-07-06T19:54:42.851623041Z","target":"relay","logging.googleapis.com/sourceLocation":{"file":"relay/src/server.rs","line":"569"},"span":{"allocation":"AID-1","peer_address":"127.0.0.1:42314","requested_channel":"16384","name":"channel_bind"},"spans":[{"sender":"127.0.0.1:46406","transaction_id":"e99e07e482789cdc30bd2b50","name":"client"},{"allocation":"AID-1","peer_address":"127.0.0.1:42314","requested_channel":"16384","name":"channel_bind"}],"severity":"INFO","message":"Successfully bound channel"}
{"time":"2023-07-06T19:54:42.852889208Z","target":"relay","logging.googleapis.com/sourceLocation":{"file":"relay/src/server.rs","line":"288"},"span":{"allocation_id":"AID-1","channel":16384,"recipient":"127.0.0.1:46406","sender":"127.0.0.1:42314","name":"peer"},"spans":[{"allocation_id":"AID-1","channel":16384,"recipient":"127.0.0.1:46406","sender":"127.0.0.1:42314","name":"peer"}],"severity":"DEBUG","message":"Relaying 32 bytes"}
{"time":"2023-07-06T19:54:42.854625857Z","target":"relay","logging.googleapis.com/sourceLocation":{"file":"relay/src/server.rs","line":"619"},"span":{"channel":"16384","recipient":"127.0.0.1:42314","name":"channel_data"},"spans":[{"sender":"127.0.0.1:46406","name":"client"},{"channel":"16384","recipient":"127.0.0.1:42314","name":"channel_data"}],"severity":"DEBUG","message":"Relaying 32 bytes"}
```
For some reason, the current `span` is always duplicated, but I don't
think that is a big issue. When run with the regular log formatter, it
looks like this:
```
2023-07-06T20:02:33.939273Z INFO relay: Seeding RNG from '0'
2023-07-06T20:02:33.940153Z INFO relay: Listening for incoming traffic on UDP port 3478
2023-07-06T20:02:34.135801Z INFO client{sender=127.0.0.1:33919 transaction_id="7092a2363377709cd18b9d98"}:allocate{lifetime=600}: relay: Created new allocation ip4_relay_address=127.0.0.1:65460
2023-07-06T20:02:34.144833Z INFO client{sender=127.0.0.1:33919 transaction_id="4e1a18e58953242c92a075a3"}:channel_bind{requested_channel=16384 peer_address=127.0.0.1:47859 allocation="AID-1"}: relay: Successfully bound channel
2023-07-06T20:02:34.145501Z DEBUG peer{sender=127.0.0.1:47859 allocation_id=AID-1 recipient=127.0.0.1:33919 channel=16384}: relay: Relaying 32 bytes
2023-07-06T20:02:34.146863Z DEBUG client{sender=127.0.0.1:33919}:channel_data{channel=16384 recipient=127.0.0.1:47859}: relay: Relaying 32 bytes
```
This provides lots of contextual information in a DRY and easily
parseable way.
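To make the span mechanics concrete, here is a minimal sketch (not the
relay's actual code) of how a handler could be instrumented with
`tracing`; the function names, fields, and the use of the built-in JSON
formatter as a stand-in for the Google-Cloud-compatible layer are all
illustrative:
```
use std::net::SocketAddr;
use tracing::{debug, info_span, Level};

// Stand-in for the `--json` / `JSON_LOG` switch. The built-in JSON
// formatter is used here only as a placeholder for whatever
// Google-Cloud-compatible layer the relay actually installs.
fn init_logging(json: bool) {
    let builder = tracing_subscriber::fmt().with_max_level(Level::DEBUG);
    if json {
        builder.json().init();
    } else {
        builder.init();
    }
}

// Hypothetical handler: fields recorded on the span become first-class
// variables on every message logged while the span is entered.
fn relay_channel_data(channel: u16, recipient: SocketAddr, payload: &[u8]) {
    let span = info_span!("channel_data", %channel, %recipient);
    let _guard = span.enter();

    debug!("Relaying {} bytes", payload.len());
}

fn main() {
    init_logging(std::env::var("JSON_LOG").is_ok());
    relay_channel_data(16384, "127.0.0.1:42314".parse().unwrap(), &[0u8; 32]);
}
```
Fields recorded on the span, like `channel` and `recipient` above, are
what show up in the `span`/`spans` objects of the JSON output and in the
`channel_data{...}` prefix of the regular formatter.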
---------
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
Instead of making both the portal URL and the token optional, we default
the portal URL and decide based on the presence of the token whether we
should connect to the portal on startup. This allows the relay to be
used/tested standalone and keeps the number of config options and error
cases small.
We require the user to configure the full path of the websocket URL and
thus avoid the need to duplicate the connlib function. Given that most
users will never need to override this option, this seems like a good
trade-off.
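As a rough sketch of the resulting configuration surface (assuming
`clap` with its `derive` and `env` features; the flag names, env vars
and default URL below are illustrative, not the relay's actual options):
```
use clap::Parser;

// Illustrative relay arguments: the portal URL has a default, and the
// presence of the token alone decides whether we talk to the portal.
#[derive(Parser)]
struct Args {
    /// Full websocket URL of the portal; rarely needs to be overridden.
    #[arg(
        long,
        env = "PORTAL_WS_URL",
        default_value = "wss://portal.example/relay/websocket"
    )]
    portal_ws_url: String,

    /// If absent, the relay runs standalone and never contacts the portal.
    #[arg(long, env = "PORTAL_TOKEN")]
    portal_token: Option<String>,
}

fn main() {
    let args = Args::parse();

    match args.portal_token {
        // The real connect-to-portal call would go here.
        Some(token) => println!(
            "would connect to {} with a token of length {}",
            args.portal_ws_url,
            token.len()
        ),
        None => println!("no portal token set; running standalone"),
    }
}
```
Presence of the token is the only switch; there is no separate
"connect to the portal" flag that could be misconfigured.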
Resolves https://github.com/firezone/product/issues/614.
This PR fixes a bunch of small things to allow a new flow for testing
clients pinging a resource within docker compose.
Masquerading/forwarding is enabled directly in the container for now;
this might change in the future.
Also adds a README describing how to run this locally.
---------
Signed-off-by: Gabi <gabrielalejandro7@gmail.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
With this PR, the full control-plane message flow is working, meaning
that if you do:
```
docker compose up -d
docker compose exec -it client "ping 172.20.0.2" # will fix this IP later
```
Messages start flowing to the gateway. The gateway still does not
correctly forward the messages to the resource since masquerading is
not working yet, although I suspect there might be an additional
problem. I will fix this in my next PR along with a README on how to
test this whole flow.
This PR also fixes how we send the stamp secret to the gateway from the
relay, but I still see some warnings from webrtc that I'm sure are due
to a mismatch between how webrtc-rs and the relay handle messages (the
most important being `bind() failed: unexpected response type`). I will
take a look at that and at a way to test that the flow works when:
1. hole-punching is available
2. it has to go through the relay when it's not
Right now the flow works without hole-punching or a relay because the
gateway is on the same network in the docker compose setup.
Resolves firezone/product#607
Setting the env var `CONNLIB_MOCK` when building through either
`build-rust.sh` or `gradle` will activate the `mock` feature.
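For illustration only, this is the kind of cfg-gated swap such a Cargo
feature (declared as `mock = []` in Cargo.toml) typically enables; the
function and its signature are made up, not connlib's real API:
```
/// With `--features mock`, callers get a canned implementation instead
/// of one that talks to the real portal.
#[cfg(feature = "mock")]
pub fn connect(portal_url: &str, token: &str) -> Result<(), String> {
    let _ = (portal_url, token);
    // Pretend the connection always succeeds so the Apple/Android client
    // apps can be exercised without any backend.
    Ok(())
}

#[cfg(not(feature = "mock"))]
pub fn connect(portal_url: &str, token: &str) -> Result<(), String> {
    // Real implementation elided in this sketch.
    unimplemented!("connect to {portal_url} with token {token}")
}
```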
- Instead of having two very similar jobs, we run our fmt, clippy and
test steps across all crates and operating systems.
- We remove the dependency of the Android and Apple builds on the tests
and thus get faster feedback.
- We force clippy to fail on any warning. This one is super important
IMO. Warnings in Rust are very useful and ignoring them can lead to bugs
(think "unused Result" etc); see the sketch below for an example.
Resolves #1714.
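As a small, made-up example of the kind of warning that now fails the
build:
```
use std::fs::File;

fn main() {
    // `File::create` returns an `io::Result` that is `#[must_use]`.
    // Dropping it only warns locally, but with warnings denied in CI this
    // line fails the build instead of silently hiding a possible error.
    File::create("/tmp/example.txt");

    // What CI now forces us to write: handle (or explicitly ignore) it.
    if let Err(e) = File::create("/tmp/example.txt") {
        eprintln!("failed to create file: {e}");
    }
}
```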
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Francesca Lovebloom <franlovebloom@gmail.com>
Stubs out the client app dirs and basic CI workflow for the client apps
in preparation for moving them into this repository.
After this is merged, @roop @pratikvelani, you should be able to add the
client repos here.
Looks like for some reason the `id/1` callback doesn't subscribe the channel process any more (only the socket itself), so we are doing that explicitly now.
Unfortunately, this doesn't seem to be stable. I don't really understand
why. Judging from the logs, the problem is not in the relay but somehow
the final UDP packet doesn't arrive at the `gateway` binary.
To not unnecessarily block other PRs, I am removing the check for now.
Runs `cargo fmt` on the entire `rust/` directory. This somehow doesn't
seem to be enforced; I think that is because we changed the previous CI
to only run for the `relay` crate.
I'd like to merge this first to avoid the diff, and in a second PR we
can work on unifying CI again.
Due to a silly bash mistake (I hate bash), the error from the gateway
binary wasn't actually propagated to the script. Thus, we did not notice
that it had been broken for a while.
Attempting to fix it turned up that we were double-hexing the relay
secret and using invalid passwords for the clients.
This PR fixes `docker compose up`; it doesn't get the test client ->
resource flow working yet, but it prevents anything from erroring at
startup.
This fixes:
* tokens (use the correct token for the client user agent we are using)
* randomize `name_suffix` at startup for connlib (we will eventually
allow options to set it manually)
* remove port ranges for relay (see firezone/product#613)
This makes it possible to build the Apple/Android FFI bridges and
integrate them with their respective client apps.
---------
Signed-off-by: Francesca Lovebloom <franlovebloom@gmail.com>
Co-authored-by: Roopesh Chander <roop@roopc.net>
**Update CONTRIBUTING.md**
Why:
* The CONTRIBUTING.md doc seems to have fallen slightly out of date with
how Firezone now works. This commit updates the doc to provide a
quick start guide for getting all of the various Firezone components
up and running as quickly as possible. The doc then links to the more
specific `Elixir` and `Rust` README.md files in the respective
directories to help developers who would like to contribute.
**Update docker-compose vault health check**
Why:
* The current Vault health check listed in the docker-compose file does
not seem to be working when using `localhost` in the `wget` command.
Updating the URL to use `127.0.0.1` seems to have fixed it.
---------
Signed-off-by: bmanifold <bmanifold@users.noreply.github.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
Did some research on status page providers to manage incidents.
statuspage.io seems to be easy to use and cost-effective, is fairly
popular, and provides a good amount of flexibility to customize emails,
notifications, etc.
It's super easy to set up and use, but I'm not married to it if anyone
feels strongly about using another incident management service.
https://firezone.statuspage.io
## Demo:
<img width="235" alt="Screenshot 2023-06-27 at 8 07 29 AM"
src="https://github.com/firezone/firezone/assets/167144/8ad12b9b-7345-4a5d-bf43-c8af798d85f9">
@AndrewDryga ~~Was still hitting some redirect issues so I'll wait for
those to be resolved before continuing on building more views.~~ Edit:
After some sleep and coffee, I figured it out. Nice work on the sign in
form!
I went ahead and scoped existing dashboard links with `@account` and
fixed a dark mode issue -- you may want to cherry-pick those commits.
I'll add these to authenticated routes and integrate into what you have
so far.
As I was going through last night exploring your route approach I
thought of some edge cases; can discuss next week. I think the main one
that came to mind was that we probably want to differentiate between
login flows initiated directly in the browser (this is an admin logging
into the dashboard) vs login flows initiated from a client app (these
will terminate with a final redirect to the respective whitelisted
`dest` URL). Maybe it makes sense to segregate these flows?
If a regular user tries to log in directly from the browser, maybe we
want to show them something like "Please log in from your Firezone
application instead", as they should only be able to initiate logins
from a client application. Or maybe there's simply no way to end up at
the final Android App Link or `firezone://` URI with a login initiated
directly from the browser?
There are problems building the Docker images on macOS using musl due
to issues with `ring`, so we started using slim Debian with glibc for
development.
When using `docker compose build` or any other way of building Docker
images in parallel, the way the cache was set up in the Rust Dockerfile
made the caches between images overlap and corrupt each other. We add
`locked` cache sharing, which prevents multiple writers to the same
cache, to fix this behaviour.
This brings connlib from its own separate repo into Firezone's monorepo.
On top of bringing in connlib, we also add and unify the Dockerfile for
all Rust binaries and add a docker-compose setup that can run a headless
client, a relay and a gateway, which will eventually test the whole flow
between a client and a resource. For this to work we also incorporated
some Elixir scripts to generate portal tokens for those components.
Did some research when picking a package manager for the website and
settled on `pnpm` for the following reasons:
- CLI-compatible with `npm`
- Typically faster than even `yarn`, especially on Apple silicon
- Security: pnpm uses a different dependency resolution algorithm and a
different `node_modules` folder structure that prevents illegal access
to packages by other packages.
I think I caught all the places, but I may be missing something, so if
this isn't a good idea we can revert.
This PR also cleans up the Actions workflows to remove dead code.
* Remove `www/`
* Stub empty `website/` to silence Vercel. This shouldn't cause
conflicts when we merge `cloud` to `master`. Perhaps we want to start
working off `master` soon, and move the current tip of master to
`legacy`?