mirror of https://github.com/outbackdingo/firezone.git synced 2026-01-27 10:18:54 +00:00

Thomas Eizinger 8e0f00a3a6 fix(relay): buffer packets in case IO is busy (#7536)

At present, the relay's event-loop simply drops a UDP packet in case the
socket is not ready for writing. This is terrible for throughput because
it means the encapsulated packet within the WG payload needs to be
retransmitted by the source after a timeout. To avoid this, we instead
buffer the packet and suspend the event loop until it has been correctly
flushed out. This may still cause packet loss because the receive buffer
may overflow in the meantime. However, there is nothing we can do about
that because UDP itself doesn't have any backpressure.

The relay listens on many sockets at once via a separate worker thread
and an `mio` event-loop. In addition to the existing subscription to
readable events, we now also subscribe to writable events.

At the very top of the relay's event-loop, we insert a `flush` function
that ensures all buffered packets have been written out and - in case
writing a packet fails - suspends the event-loop with a waker. If we
receive a new event for write-readiness, we wake the waker which will
trigger a new call to `Eventloop::poll` where we again try to flush the
pending packet. We don't bother tracking exactly which socket signalled
write-readiness or which sockets still have pending packets.
Instead, we suspend the entire event-loop until all pending packets have
been flushed.
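
The buffer-and-suspend idea described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the relay's actual code: the `Socket` trait, `MockSocket`, `CountingWaker` and the fields of this `Eventloop` are hypothetical stand-ins, and the real implementation drives `mio` sockets instead.

```rust
use std::collections::VecDeque;
use std::io;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for a non-blocking UDP socket.
trait Socket {
    fn try_send(&mut self, packet: &[u8]) -> io::Result<()>;
}

struct Eventloop<S> {
    socket: S,
    /// Packets that could not be written because the socket was busy.
    pending: VecDeque<Vec<u8>>,
    /// Waker of the suspended event-loop, woken on write-readiness.
    flush_waker: Option<Waker>,
}

impl<S: Socket> Eventloop<S> {
    fn new(socket: S) -> Self {
        Self { socket, pending: VecDeque::new(), flush_waker: None }
    }

    /// Sends a packet, buffering it instead of dropping it if IO is busy.
    fn send(&mut self, packet: Vec<u8>) -> io::Result<()> {
        if !self.pending.is_empty() {
            // Preserve ordering behind packets that are already buffered.
            self.pending.push_back(packet);
            return Ok(());
        }
        match self.socket.try_send(&packet) {
            Err(e) if e.kind() == io::ErrorKind::WouldBlock => {
                self.pending.push_back(packet);
                Ok(())
            }
            other => other,
        }
    }

    /// Flushes at the very top; nothing else runs while packets are pending.
    fn poll(&mut self, cx: &mut Context<'_>) -> Poll<io::Result<()>> {
        while let Some(packet) = self.pending.front() {
            match self.socket.try_send(packet) {
                Ok(()) => {
                    self.pending.pop_front();
                }
                Err(e) if e.kind() == io::ErrorKind::WouldBlock => {
                    self.flush_waker = Some(cx.waker().clone());
                    return Poll::Pending;
                }
                Err(e) => return Poll::Ready(Err(e)),
            }
        }
        // A real loop would continue here with reading from its sockets.
        Poll::Ready(Ok(()))
    }

    /// Called on a write-readiness event from any socket.
    fn on_writable(&mut self) {
        if let Some(waker) = self.flush_waker.take() {
            waker.wake();
        }
    }
}

/// Test double: refuses writes until `writable` is set.
struct MockSocket {
    writable: bool,
    sent: Vec<Vec<u8>>,
}

impl Socket for MockSocket {
    fn try_send(&mut self, packet: &[u8]) -> io::Result<()> {
        if !self.writable {
            return Err(io::ErrorKind::WouldBlock.into());
        }
        self.sent.push(packet.to_vec());
        Ok(())
    }
}

/// Test double: counts how often the event-loop was woken.
#[derive(Default)]
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}
```

Keeping a single waker for the whole loop mirrors the choice above not to track which socket became writable: any write-readiness event simply triggers another flush attempt.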

Resolves: #7519.

2024-12-18 17:01:24 +00:00

proptest-regressions/server

feat(relay): add smoke test script (#1834)

2023-07-31 20:13:27 +00:00

src

fix(relay): buffer packets in case IO is busy (#7536)

2024-12-18 17:01:24 +00:00

tests

refactor(relay): improve error messages on failed requests (#7405)

2024-11-28 22:12:27 +00:00

Cargo.toml

chore(rust): share edition key via workspace table (#7451)

2024-12-03 00:28:06 +00:00

README.md

feat(relay): set OTEL metadata for metrics and traces (#6249)

2024-08-10 16:32:01 +00:00

README.md

relay

This crate houses a minimalistic STUN & TURN server.

Features

We aim to support the following feature set:

STUN binding requests
TURN allocate requests
TURN refresh requests
TURN channel bind requests
TURN channel data requests

Relaying of data through other means such as DATA frames is not supported.
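
To make the feature set above more concrete, here is a rough sketch of how an incoming datagram could be sorted into the supported message kinds. It is not taken from the crate: `Packet` and `classify` are hypothetical names, and the constants are the STUN/TURN wire values (message types 0x0001, 0x0003, 0x0004, 0x0009, the magic cookie 0x2112A442, and the channel number range 0x4000 to 0x4FFF from RFC 8656).

```rust
/// Hypothetical classification of a datagram received on the listening port.
#[derive(Debug, PartialEq)]
enum Packet {
    Binding,
    Allocate,
    Refresh,
    ChannelBind,
    ChannelData { channel: u16, len: u16 },
    /// Anything else, e.g. TURN Send/Data indications.
    Unsupported,
}

/// Fixed value in bytes 4..8 of every STUN message.
const MAGIC_COOKIE: [u8; 4] = [0x21, 0x12, 0xA4, 0x42];

fn classify(datagram: &[u8]) -> Packet {
    if datagram.len() < 4 {
        return Packet::Unsupported;
    }
    let first = u16::from_be_bytes([datagram[0], datagram[1]]);
    let second = u16::from_be_bytes([datagram[2], datagram[3]]);

    // Channel data starts with the channel number, followed by the length.
    if (0x4000..=0x4FFF).contains(&first) {
        return Packet::ChannelData { channel: first, len: second };
    }

    // STUN messages have a 20-byte header containing the magic cookie.
    if datagram.len() < 20 || datagram[4..8] != MAGIC_COOKIE {
        return Packet::Unsupported;
    }

    // For requests, the message type equals the method number.
    match first {
        0x0001 => Packet::Binding,
        0x0003 => Packet::Allocate,
        0x0004 => Packet::Refresh,
        0x0009 => Packet::ChannelBind,
        _ => Packet::Unsupported,
    }
}
```

The sketch only inspects headers; a real server would also validate lengths, attributes and message integrity before acting on a request.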

Building

You can build the relay using: cargo build --release --bin firezone-relay

You should then find a binary in target/release/firezone-relay.

Running

The Firezone Relay supports Linux only. To run the Relay binary on your Linux host:

Generate a new Relay token from the "Relays" section of the admin portal and save it in your secrets manager.
Ensure the FIREZONE_TOKEN=<relay_token> environment variable is set securely in your Relay's shell environment. The Relay expects this variable at startup.
Now, you can start the Firezone Relay with:

firezone-relay

To view more advanced configuration options pass the --help flag:

firezone-relay --help

Ports

By default, the relay listens on port udp/3478. This is the standard port for STUN/TURN. Additionally, the relay needs to have access to the port range 49152 - 65535 for the allocations.
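
When writing firewall rules or sanity checks around these ports, a small helper like the one below may be handy. The constant and function names are illustrative assumptions, not part of the relay's code; only the port numbers come from the text above.

```rust
/// Default listening port for STUN/TURN.
const STUN_TURN_PORT: u16 = 3478;
/// Port range the relay needs for allocations (inclusive on both ends).
const FIRST_ALLOCATION_PORT: u16 = 49152;
const LAST_ALLOCATION_PORT: u16 = 65535;

/// Returns whether `port` falls inside the allocation range.
fn is_allocation_port(port: u16) -> bool {
    (FIRST_ALLOCATION_PORT..=LAST_ALLOCATION_PORT).contains(&port)
}

/// Number of ports in the allocation range.
fn allocation_port_count() -> u32 {
    u32::from(LAST_ALLOCATION_PORT - FIRST_ALLOCATION_PORT) + 1
}
```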

Portal Connection

When given a token, the relay will connect to the Firezone portal and wait for an init message before commencing relay operations.

Metrics

The relay parses the OTLP_GRPC_ENDPOINT env variable. Traces and metrics will be sent to an OTLP collector listening on that endpoint.

It is recommended to set additional environment variables to scope your metrics:

OTEL_SERVICE_NAME: Translates to the service.name.
OTEL_RESOURCE_ATTRIBUTES: Additional, comma-separated key=value attributes.
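
As an illustration of the comma-separated key=value format, the following sketch parses such a string into a map. The function name is hypothetical and the relay itself presumably relies on its OpenTelemetry libraries for this; the example values are made up.

```rust
use std::collections::BTreeMap;

/// Parses a string such as `deployment.environment=staging,region=eu`.
/// Entries without `=` or with an empty key are skipped.
fn parse_resource_attributes(raw: &str) -> BTreeMap<String, String> {
    raw.split(',')
        .filter_map(|pair| pair.split_once('='))
        .map(|(key, value)| (key.trim().to_owned(), value.trim().to_owned()))
        .filter(|(key, _)| !key.is_empty())
        .collect()
}
```

For example, `parse_resource_attributes("deployment.environment=staging,region=eu")` yields two entries, keyed by `deployment.environment` and `region`.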

By default, we set the following OTEL attributes:

service.name=relay
service.namespace=firezone

The docker-init-relay.sh script integrates with GCE. When OTEL_METADATA_DISCOVERY_METHOD=gce_metadata, the service.instance.id attribute is set to the instance ID of the VM.

Design

The relay is designed in a sans-IO fashion, meaning the core components do not cause side effects but operate as pure, synchronous state machines. They take in data and emit commands: wake me at this point in time, send these bytes to this peer, etc.

This allows us to very easily unit-test all kinds of scenarios because all inputs are simple values.
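
A toy example of the sans-IO style, assuming a much simpler state machine than the real server: `Server`, `Command`, `handle_packet` and `handle_timeout` are hypothetical names chosen for this sketch. The state machine never touches a socket or a clock; the caller passes in the current time and executes the returned commands.

```rust
use std::net::SocketAddr;
use std::time::{Duration, Instant};

/// Commands emitted by the state machine for the IO layer to execute.
#[derive(Debug, PartialEq)]
enum Command {
    SendMessage { to: SocketAddr, payload: Vec<u8> },
    Wake { deadline: Instant },
}

/// Toy state machine: remembers each sender until its lifetime expires.
struct Server {
    lifetime: Duration,
    peers: Vec<(SocketAddr, Instant)>,
}

impl Server {
    fn new(lifetime: Duration) -> Self {
        Self { lifetime, peers: Vec::new() }
    }

    /// Pure function of (state, input, time): no sockets, no clock access.
    fn handle_packet(&mut self, from: SocketAddr, payload: &[u8], now: Instant) -> Vec<Command> {
        let expiry = now + self.lifetime;
        self.peers.retain(|(addr, _)| *addr != from);
        self.peers.push((from, expiry));
        vec![
            // Echo the payload back to the sender.
            Command::SendMessage { to: from, payload: payload.to_vec() },
            // Ask to be woken when this peer's entry expires.
            Command::Wake { deadline: expiry },
        ]
    }

    /// Drops every peer whose expiry is at or before `now`.
    fn handle_timeout(&mut self, now: Instant) {
        self.peers.retain(|(_, expiry)| *expiry > now);
    }

    fn active(&self) -> usize {
        self.peers.len()
    }
}
```

Because time is just another input value, a test can jump ten minutes ahead by passing a later `Instant` instead of sleeping.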

The main server runs in a single task and spawns one additional task for each allocation. Incoming data that needs to be relayed is forwarded to the main task where it gets authenticated and relayed on success.