In a previous design of firezone, relays used to be scoped to a certain
connection. For a while now, this constraint has been lifted and all
connections can use all relays. A related, outdated concern is the idea
of STUN-only servers. Those also used to be assigned on a per-connection
basis.
By removing any use of per-connection relays and STUN-only servers, the
entire `StunBinding` concept is unused code and can thus be deleted.
To push this over the finish line, the `snownet-tests` which test the
hole-punching functionality needed to be slightly adapted to make use of
the more recently introduced API `Node::update_relays`.
Resolves: #4749.
Currently, we are sending each ICE candidate individually from the
client to the gateway and vice versa. This causes a slight delay as to
when each ICE candidate gets added on the remote ICE agent. As a result,
they all start being tested with a slight offset which causes "endpoint
hopping" whenever a connection expires as they expire just after each
other.
In addition, sending multiple messages to the portal causes unnecessary
load when establishing connections.
Finally, with #5283 we started **not** adding the server-reflexive
candidate to the local ICE agent. Because we talk to multiple relays, we
detect the same server-reflexive candidate multiple times if we are
behind a non-symmetric NAT. Not adding the server-reflexive candidate to
the ICE agent defeated our de-duplication strategy here, which means we
currently send the same candidate multiple times to a peer, causing
additional, unnecessary load.
All of this can be mitigated by batching all our ICE candidates together
into one message.
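A minimal sketch of the batching idea (types and names are illustrative, not
connlib's actual API): buffer candidates as the ICE agents emit them,
de-duplicate, and flush them as a single signalling message per eventloop pass.
```rust
/// Illustrative only: collect candidates and emit them as one batch.
#[derive(Default)]
struct CandidateBatch {
    candidates: Vec<String>, // ICE candidates in their SDP string form
}

impl CandidateBatch {
    fn push(&mut self, candidate: String) {
        if !self.candidates.contains(&candidate) {
            self.candidates.push(candidate); // de-duplicate before signalling
        }
    }

    /// Called once per eventloop pass; `None` means there is nothing to signal.
    fn flush(&mut self) -> Option<Vec<String>> {
        if self.candidates.is_empty() {
            return None;
        }
        Some(std::mem::take(&mut self.candidates))
    }
}
```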
Resolves: #3978.
Currently, the gateway only handles an `init` message on startup. For
clients, we already handle `init` messages during operation as well, so it
only makes sense to do the same thing for gateways.
This allows us to remove some old code from `phoenix_channel`. In
particular, the `init` function which used to wait for the `init`
message before continuing. In
https://github.com/firezone/firezone/pull/4594, we refactored
`phoenix-channel` to reconnect internally on errors. As a result, the
`connect` function became synchronous and no longer needed an `async`
context.
At the time, the gateway wasn't updated to make use of this. We can now
simplify the gateway code and resolve the outstanding TODO of handling
`init` messages during operation.
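A rough sketch of what this means, with made-up message and handler names (the
gateway's actual types differ): `init` becomes just another arm in the message
handler instead of something we await once before entering the eventloop.
```rust
// Placeholder types for illustration only.
struct Interface; // interface/DNS configuration sent by the portal
struct AllowAccess;

enum IngressMessages {
    Init(Interface),
    AllowAccess(AllowAccess),
}

fn handle_portal_message(msg: IngressMessages) {
    match msg {
        // May now arrive at any point during operation, e.g. after a reconnect.
        IngressMessages::Init(interface) => apply_interface_config(interface),
        IngressMessages::AllowAccess(_) => { /* authorise the flow ... */ }
    }
}

fn apply_interface_config(_interface: Interface) { /* update TUN device, DNS, ... */ }
```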
Closes #5481
With this, I can connect to the staging portal without a `build.rs` or any
extra env var setup.
<img width="387" alt="image"
src="https://github.com/firezone/firezone/assets/13400041/9c080b36-3a76-49c7-b706-20723697edc7">
```[tasklist]
### Next steps
- [x] Split out a refactor PR for `ConnectArgs` (#5488)
- [x] Try doing this for other Clients
- [x] Check Gateway
- [x] Check Tauri Client
- [x] Change to `app_version`
- [x] Open for review
- [ ] Use `option_env` so that `FIREZONE_PACKAGE_VERSION` can still override the Cargo.toml version for local testing (see the sketch below this list)
- [ ] Check Android Client
- [ ] Check Apple Client
```
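As a sketch of what the unchecked `option_env` item above could look like (the
function name and exact wiring are assumptions, not the final implementation):
```rust
/// Prefer a compile-time FIREZONE_PACKAGE_VERSION override (e.g. for local
/// testing), falling back to the version declared in Cargo.toml.
pub fn app_version() -> &'static str {
    option_env!("FIREZONE_PACKAGE_VERSION").unwrap_or(env!("CARGO_PKG_VERSION"))
}
```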
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
As part of #4994, the IP translation and mangling of packets to and from
DNS resources is moved to the gateway. This PR represents the
"gateway-half" of the required changes.
Eventually, the client will send a list of proxy IPs that it assigned
for a certain DNS resource. The gateway assigns each proxy IP to a real
IP and mangles outgoing and incoming traffic accordingly. There are a
number of things that we need to take care of as part of that:
- We need to implement NAT to correctly route traffic. Our NAT table
maps from source port* and destination IP to an assigned port* and real
IP. We say port* because that is only true for UDP and TCP. For ICMP, we
use the identifier instead (see the sketch after this list).
- We need to translate between IPv4 and IPv6 in case a DNS resource e.g.
only resolves to IPv6 addresses but the client gave out an IPv4 proxy
address to the application. This translation was added in #5364 and is
now being used here.
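A minimal sketch of such a NAT table (field and type names are illustrative,
not the gateway's actual implementation); for UDP/TCP the 16-bit key is the
source port, for ICMP it is the echo identifier:
```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// What the client-facing side of a flow looks like.
#[derive(Hash, PartialEq, Eq, Clone, Copy)]
struct ClientSide {
    source_id: u16,       // source port (UDP/TCP) or ICMP identifier
    dst_proxy_ip: IpAddr, // proxy IP the client used for the DNS resource
}

/// What we rewrite the packet to before sending it to the real resource.
#[derive(Clone, Copy)]
struct ResourceSide {
    assigned_id: u16, // port / identifier we picked on the gateway
    real_ip: IpAddr,  // address the domain actually resolved to
}

#[derive(Default)]
struct NatTable {
    mappings: HashMap<ClientSide, ResourceSide>,
}

impl NatTable {
    /// Look up or create the translation for an outgoing packet.
    fn translate_outgoing(&mut self, key: ClientSide, assigned_id: u16, real_ip: IpAddr) -> ResourceSide {
        *self
            .mappings
            .entry(key)
            .or_insert(ResourceSide { assigned_id, real_ip })
    }
}
```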
This PR is backwards-compatible because currently, clients don't send
any IPs to the gateway. No proxy IPs means we cannot do any translation
and thus packets are simply routed through as-is, which is what the
current clients expect.
---------
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
When we attempt to establish a connection to a gateway for a DNS
resource, the gateway must resolve the requested domain name before it
can accept the connection. Currently, this timeout is set to 60s which
is much longer than the client's connection timeout.
DNS resolution is typically a very fast protocol so reducing this
timeout to 5s should be safe. In addition, we add a compile-time
assertion that this timeout must be less than the client's connection
timeout.
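For illustration, a compile-time assertion of this kind could look as follows
(the constant names and values here are placeholders, not the actual ones in
the codebase):
```rust
use std::time::Duration;

// Hypothetical constants for illustration; the real names may differ.
const DNS_RESOLUTION_TIMEOUT: Duration = Duration::from_secs(5);
const CLIENT_CONNECTION_TIMEOUT: Duration = Duration::from_secs(60);

// Compile-time assertion: the build fails if the DNS resolution timeout is
// not strictly smaller than the client's connection timeout.
const _: () = assert!(DNS_RESOLUTION_TIMEOUT.as_secs() < CLIENT_CONNECTION_TIMEOUT.as_secs());
```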
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
Refs #3636 (This pays down some of the technical debt from Linux DNS)
Refs #4473 (This partially fulfills it)
Refs #5068 (This is needed to make `FIREZONE_DNS_CONTROL` mandatory)
As of dd6421:
- On both Linux and Windows, DNS control and IP setting (i.e.
`on_set_interface_config`) both move to the Client
- On Windows, route setting stays in `tun_windows.rs`. Route setting in
Windows requires us to know the interface index, which we don't know in
the Client code. If we could pass opaque platform-specific data between
the tunnel and the Client it would be easy.
- On Linux, route setting moves to the Client and Gateway, which
completely removes the `worker` task in `tun_linux.rs`
- Notifying systemd that we're ready moves up to the headless Client /
IPC service
```[tasklist]
### Before merging / notes
- [x] Does DNS roaming work on Linux on `main`? I don't see where it hooks up. I think I only set up DNS in `Tun::new` (Yes, the `Tun` gets recreated every time we reconfigure the device)
- [x] Fix Windows Clients
- [x] Fix Gateway
- [x] Make sure connlib doesn't get the DNS control method from the env var (will be fixed in #5068)
- [x] De-dupe consts
- [ ] ~~Add DNS control test~~ (failed)
- [ ] Smoke test Linux
- [ ] Smoke test Windows
```
To encode in the type system that clients always have both an IPv4 and an
IPv6 address, and that these are the only allowed source IPs for any given
client, we split them into dedicated fields in the `ClientOnGateway` struct
and update the tests accordingly.
Furthermore, these fields will be used by the DNS refactor for IPv6-in-IPv4
and IPv4-in-IPv6 to set the source IP of outgoing packets without having to
do additional routing or mappings. There will be more notes on this in the
corresponding PR #5049.
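A sketch of the shape this takes (the real `ClientOnGateway` has more fields;
this only shows the idea of encoding the invariant in the type):
```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Exactly one IPv4 and one IPv6 address per client, guaranteed by the type
/// instead of a list that could be empty or contain arbitrary entries.
struct ClientOnGateway {
    ipv4: Ipv4Addr,
    ipv6: Ipv6Addr,
    // ... other per-client state
}

impl ClientOnGateway {
    fn is_allowed_source(&self, src: IpAddr) -> bool {
        src == IpAddr::from(self.ipv4) || src == IpAddr::from(self.ipv6)
    }
}
```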
---------
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
There was an error in how resource filters were deserialized in the
gateway:
* we always assumed that ports would be included, but the portal sends no
port down when the "all" range is allowed
* we also didn't support the `resource_updated` message; this fixes that,
so the resource allow-list can now be changed in-flight
This implements traffic filtering on the gateway. Filters are set on the
portal, per resource, in an allow-list manner.
If no filters exist for a given resource, all packets are allowed;
otherwise only packets that match the port/protocol of one of the filters
are allowed, and everything else is dropped.
Filters can be either TCP, UDP or ICMP. For the first two, multiple ports
can be given. Furthermore, multiple filters can exist for the same
resource.
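A simplified model of this matching logic (not the gateway's actual types;
ports are shown as ranges for brevity):
```rust
use std::ops::RangeInclusive;

enum Filter {
    Tcp { ports: Vec<RangeInclusive<u16>> },
    Udp { ports: Vec<RangeInclusive<u16>> },
    Icmp,
}

#[derive(Clone, Copy)]
enum Protocol {
    Tcp,
    Udp,
    Icmp,
}

/// An empty filter list means "allow everything"; otherwise at least one
/// filter must match the packet's protocol and destination port.
fn is_allowed(filters: &[Filter], proto: Protocol, dst_port: Option<u16>) -> bool {
    if filters.is_empty() {
        return true;
    }

    filters.iter().any(|f| match (f, proto, dst_port) {
        (Filter::Tcp { ports }, Protocol::Tcp, Some(p))
        | (Filter::Udp { ports }, Protocol::Udp, Some(p)) => ports.iter().any(|r| r.contains(&p)),
        (Filter::Icmp, Protocol::Icmp, _) => true,
        _ => false,
    })
}
```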
To be able to add and remove filters with the same IP/CIDR, we keep
around the whole list of filters for any given peer using an ID map and
recalculate the filters for each IP each time something is added or
removed.
This allows us to remove filters and simply recalculate the allowlist
for each IP.
Furthermore, for any IP, all rules apply: if multiple resources apply to
the same IP, all port/protocol combinations for that IP will apply.
This works well right now for DNS resources: access is requested by DNS
name, the resource for that DNS name arrives at the gateway, and the port
filtering applies for that resource (and any other resource with the same
IP).
However, since the client has no idea of the filters, it can't request
resource access based on the port/protocol combination, and we are still
using the most specific ("longest match") IP. This means that for
overlapping CIDR resources, only the rules for the most specific one will
be used, even though the gateway supports applying them all, since it will
not have the other resources. This will be solved in #4789.
It can also lead to some weirdness. Let's say your user has 10.0.0.0/24
-> TCP/80 and 10.0.0.0/16 -> TCP/443.
The user tries to access 10.0.0.1 and will then only be allowed port 80.
At some point the user might access 10.0.1.1 and will be allowed port 443.
But from that point on, the user will be allowed to access both 80 and 443
on 10.0.0.1 because the rules work correctly on the gateway; the problem
is on the client side. Again, #4789 will fix this.
Left for next PRs (in tentative order!):
- #4792
- #4789
Depends on: #4773.
Resolves #2030.
Resolves #4791.
---------
Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
Whenever we receive a `relays_presence` message from the portal, we
invalidate the candidates of all now disconnected relays and make
allocations on the new ones. This triggers signalling of new candidates
to the remote party and migrates the connection to the newly nominated
socket.
This still relies on #4613 until we have #4634.
Resolves: #4548.
---------
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
This is another step towards #4548. The portal now includes a list of
relays as part of the "init" message. Any time we receive an "init", we
will now upsert those relays based on their ID. This requires us to
change our internal bookkeeping of relays from indexing them by address
to indexing by ID.
To ensure that this works correctly, the unit tests are rewritten to use
the new `upsert_relays` API.
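A minimal sketch of this bookkeeping change (types are placeholders, not the
real ones): relays are keyed by the portal-assigned ID, so an `init` that
mentions a relay we already know simply replaces the old entry.
```rust
use std::collections::HashMap;
use std::net::SocketAddr;

type RelayId = u64; // placeholder for whatever ID type the portal sends

struct Relay {
    address: SocketAddr,
    // credentials, allocation state, ...
}

/// Insert-or-replace every relay from the latest `init`, keyed by ID instead
/// of by socket address.
fn upsert_relays(known: &mut HashMap<RelayId, Relay>, new: Vec<(RelayId, Relay)>) {
    for (id, relay) in new {
        known.insert(id, relay);
    }
}
```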
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
Upon receiving a SIGTERM, we immediately disconnect from the websocket
connection to the portal and set a flag that we are shutting down.
Once we are disconnected from the portal and no longer have any active
allocations, we exit with 0. A repeated SIGTERM signal will interrupt
this process and force the relay to shut down.
Disconnecting from the portal will (eventually) trigger a message to
clients and gateways that this relay should no longer be used. Thus,
depending on the timeout our supervisor has configured after sending
SIGTERM, the relay will continue all TURN operations until the number of
allocations drops to 0.
Currently, we also allow clients to make new allocations and refresh
existing ones. In the future, it may make sense to implement a dedicated
status code and refuse `ALLOCATE` and `REFRESH` messages whilst we are
shutting down.
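A sketch of this shutdown sequence under the above assumptions (the `Portal`
and `Allocations` types below are stubs, not the relay's real ones):
```rust
use tokio::signal::unix::{signal, SignalKind};

// Stub types purely for illustration.
struct Portal;
impl Portal {
    fn disconnect(&mut self) { /* close the websocket to the portal */ }
}
struct Allocations;
impl Allocations {
    async fn drained(&self) { /* resolves once the allocation count reaches 0 */ }
}

/// First SIGTERM: disconnect from the portal and keep relaying until all TURN
/// allocations are gone. A second SIGTERM forces an immediate shutdown.
async fn run_shutdown(portal: &mut Portal, allocations: &Allocations) -> std::io::Result<()> {
    let mut sigterm = signal(SignalKind::terminate())?;

    sigterm.recv().await;
    portal.disconnect(); // portal will tell clients/gateways to stop using us

    tokio::select! {
        _ = allocations.drained() => {} // graceful: allocation count hit 0
        _ = sigterm.recv() => {}        // forced: repeated SIGTERM
    }

    Ok(())
}
```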
Related: #4548.
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
During the latest relay outage, we failed to send heartbeats to the
portal because we were busy-looping and never got to handle messages or
timers for the portal.
To mitigate this or similar bugs, we update an `Instant` every time we
send a heartbeat to the portal. In case we are actually
network-partitioned, this will cause the health-check to fail after 15
minutes. This value is the same as the partition timeout for the portal
connection itself[^1]. Very likely, we will never see a relay being shut
down because of a failing health check in this case, as it would already
have shut itself down.
Exceptions to this are bugs in the eventloop where we fail to interact
with the portal at all.
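Illustratively, the mitigation amounts to something like the following (names
and the wiring into the HTTP health-check are simplified):
```rust
use std::time::{Duration, Instant};

/// Mirrors the portal connection's partition timeout mentioned above.
const MAX_PARTITION_TIME: Duration = Duration::from_secs(15 * 60);

struct Health {
    last_heartbeat_sent: Instant,
}

impl Health {
    /// Called every time we actually manage to send a heartbeat to the portal.
    fn heartbeat_sent(&mut self) {
        self.last_heartbeat_sent = Instant::now();
    }

    /// Consulted by the HTTP health-check: unhealthy once we haven't sent a
    /// heartbeat for longer than the partition timeout.
    fn is_healthy(&self) -> bool {
        self.last_heartbeat_sent.elapsed() < MAX_PARTITION_TIME
    }
}
```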
Resolves: #4510.
[^1]: Previously, this was unlimited.
Reducing the number of crates as outlined in #4470 would help with
detecting this sort of unused code because we could make more things
`pub(crate)` which allows the compiler to check whether code is actually
used.
Public API items are never subject to the dead-code analysis of the
compiler because they could be used by other crates.
Within the gateway's eventloop, we MUST only return `Poll::Pending` if
`Waker`s are registered for anything that needs to happen. To ensure
that, we MUST `loop` around the calls to `poll()` to ensure we drain
everything that is `Poll::Ready`.
Only once all sub-state machines return `Poll::Pending` can we return
`Poll::Pending` ourselves.
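A generic sketch of this pattern (the gateway's real eventloop polls concrete
sub-state machines rather than closures):
```rust
use std::task::{Context, Poll};

/// Keep looping while anything is `Ready`; only return `Pending` once every
/// source returned `Pending` in the same pass and has therefore registered a
/// `Waker` for future progress.
fn poll_eventloop<F>(cx: &mut Context<'_>, sources: &mut [F]) -> Poll<()>
where
    F: FnMut(&mut Context<'_>) -> Poll<()>,
{
    loop {
        let mut made_progress = false;

        for source in sources.iter_mut() {
            if source(cx).is_ready() {
                made_progress = true;
            }
        }

        if !made_progress {
            return Poll::Pending;
        }
    }
}
```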
Our sockets need to be initialized within a tokio runtime context. To
achieve this, we don't actually initialize anything on `Sockets::new`.
Instead, we call `rebind` within the constructor of `Tunnel` which
already runs in a tokio context.
Fixes: #4282
---------
Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
This updates connlib to follow the new guidelines described in #4262. I
only made the bare-minimum changes to the clients. With these changes
`reconnect` should only be called when the network interface actually
changed, meaning clients have to be updated to reflect that.
Currently, a failure during DNS resolution results in the client hanging
during the connection setup. Instead, we fall back to an empty list
which results in an empty DNS query result for the client.
That in turn will make most application consider the DNS request failed.
As far as I know, we don't currently retry these DNS requests, meaning a
user would have to sign-in and out again to fix this state.
Whilst not ideal, I think this is a better behaviour than what we
currently have, where the initial connection just hangs.
Currently, an error returned by `Tunnel::poll_next_event` is only
logged. In other words, it is never fatal. This creates a
tricky-to-understand relationship regarding which kinds of errors should
be returned from callbacks. Because connlib is used on multiple
operating systems, it has no idea how fatal a particular error is.
This PR removes all of these `Result` return values with the following
consequences:
- For Android, we now panic when a callback fails. This is a slight
change in behaviour. I believe that previously, any exception thrown by
a callback into Android was caught and returned as an error. Now, we
panic because in the FFI layer, we don't have any information on how
fatal the error is. For non-fatal errors, the Android app should simply
not throw an exception. The panics will cause the connlib task to be
shut down which triggers an `on_disconnect`.
- For Swift, there is no behaviour change. The FFI layer already did not
support `Result`s for those callbacks. I don't know how exceptions from
Swift are translated across the FFI layer but there is no change to what
we had before.
- For the Tauri client:
- I chose to log errors on ERROR level and continue gracefully for the
DNS resolvers.
- We panic in case the controller channel is full / closed. That should
really never happen in practice though unless we are currently shutting
down the app.
Resolves: #4064.
This adds the same kind of HTTP health-check that is already present in
the relay to the gateway. The health-check returns 200 OK for as long as
the gateway is active. The gateway automatically shuts down on fatal
errors (like authentication failures with the portal).
To enable this, I've extracted a crate `http-health-check` that shares
this code between the relay and the gateway.
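As a rough, self-contained illustration of such a health-check (the shared
`http-health-check` crate is more complete than this):
```rust
use tokio::io::AsyncWriteExt;
use tokio::net::TcpListener;

/// Answer every request with `200 OK` for as long as this task is running;
/// once the gateway shuts down on a fatal error, the listener goes away and
/// the health-check fails.
async fn serve_health_check(addr: std::net::SocketAddr) -> std::io::Result<()> {
    let listener = TcpListener::bind(addr).await?;

    loop {
        let (mut stream, _) = listener.accept().await?;
        stream
            .write_all(b"HTTP/1.1 200 OK\r\ncontent-length: 0\r\n\r\n")
            .await?;
    }
}
```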
Resolves: #2465.
---------
Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Currently, we are passing a lot of data into `Session::connect`. Half of
this data is only needed to construct the URL we will use to connect to
the portal. We can simplify this by extracting a dedicated `LoginUrl`
component that captures and validates this data early.
Not only does this reduce the number of parameters we pass to
`Session::connect`, it also reduces the number of failure cases we have
to deal with in `Session::connect`. Any time the session fails, we have
to call `onDisconnected` to inform the client. Thus, we should perform
as much validation as we can early on. In other words, once
`Session::connect` returns, the client should be able to expect that the
tunnel is starting.
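A sketch of the `LoginUrl` idea using the `url` crate (constructor parameters
and query names here are illustrative, not the actual ones): all parsing and
validation happens up front, so `Session::connect` has fewer ways to fail.
```rust
use url::Url;

pub struct LoginUrl(Url);

impl LoginUrl {
    /// Validate the portal URL and attach the connection parameters early,
    /// before the session is ever started.
    pub fn client(api_url: &str, token: &str, device_id: &str) -> Result<Self, url::ParseError> {
        let mut url = Url::parse(api_url)?;
        url.query_pairs_mut()
            .append_pair("token", token)
            .append_pair("external_id", device_id);

        Ok(LoginUrl(url))
    }
}
```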
As part of doing https://github.com/firezone/firezone/pull/3682, we
noticed that the errors surfaced to the clients need to differentiate
between fatal errors that require clearing the token and those that
don't.
Upon closer inspection of `phoenix_channel::Error`, it becomes obvious
that the current design is not good here. In particular, we handle
certain errors with retries internally but still expose those same
errors.
To make this more obvious, we reduce the public `Error` to the variants
that are actually fatal. There can really only be three (sketched after
this list):
- HTTP client errors (those are by definition non-retryable)
- Token expired
- We have reached our max number of retries
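A sketch of what such a reduced error type could look like (variant names are
illustrative, not the actual `phoenix_channel` API):
```rust
/// Only conditions that `phoenix_channel` cannot recover from by retrying.
#[derive(Debug)]
pub enum Error {
    /// A 4xx response from the portal; by definition not retryable.
    Client(u16),
    /// The portal told us our token is no longer valid.
    TokenExpired,
    /// We retried the connection up to the configured limit and gave up.
    MaxRetriesReached,
}
```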
This separation doesn't really hold anymore as we already have an `impl
Tunnel` and `impl GatewayState` within `gateway.rs`. It is easier to
maintain if more gateway-specific things are in `gateway.rs`. Plus, once
we integrate the portal connection into the tunnel, we can collapse a
lot of these APIs.
This PR doesn't yet provide support for updating upstream DNS, but it
does provide support for all the other resource update messages.
It should comply with the description of issue #2022, but it doesn't
respond to upstream DNS updates, which the issue title implies it should.
---------
Signed-off-by: Gabi <gabrielalejandro7@gmail.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
Now that we have `&mut` access everywhere in the tunnel, the remaining
shared-memory and locks are in how we store peers. To resolve this, we
introduce a new `PeerStore` that allows us to look up peers by IP and by
ID.
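A simplified sketch of the idea (the real `PeerStore` indexes by IP network
rather than by single addresses, and its type parameters differ):
```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::net::IpAddr;

/// Peers live in one map; a second map resolves an IP to the owning peer's
/// ID. With `&mut` access throughout the tunnel, no locks or shared memory
/// are needed.
struct PeerStore<TId, P> {
    peers: HashMap<TId, P>,
    id_by_ip: HashMap<IpAddr, TId>,
}

impl<TId: Hash + Eq + Copy, P> PeerStore<TId, P> {
    fn peer_by_ip_mut(&mut self, ip: IpAddr) -> Option<&mut P> {
        let id = self.id_by_ip.get(&ip)?;
        self.peers.get_mut(id)
    }
}
```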
Extracted out of #3391.
We don't actually need this for #3391 though because we've added a
compatibility layer during deserialization. But, it will be good to
remove that compat layer at some point which means we have to return the
addresses as plain socket addresses. Because that is a breaking change,
I decided to extract this into a different PR.
Co-authored-by: conectado <gabrielalejandro7@gmail.com>
---------
Co-authored-by: conectado <gabrielalejandro7@gmail.com>
Reverts #3622. I don't know why, but that change seemed to cause the
`/etc/resolv.conf` test to fail in CI, and I was thinking of the "roll
back first" principle:
https://cloud.google.com/blog/products/gcp/reliable-releases-and-rollbacks-cre-life-lessons
~~I also changed one `ping` in CI to `until ping`. This was an earlier
attempt before I did the revert, and it seems safe to leave it in.~~
With #3391, constructing a new tunnel will no longer be `async` which
makes DNS resolution the only `async` component of
`set_peer_connection_request`. In general, adding resources as part of
setting up a connection duplicates the logic within `allow_access`.
We solve both of these problems at once by moving the DNS resolution out
of `connlib` into the `gateway` binary and performing it as part of the
eventloop during a connection setup.
# Gateways
- [x] When Gateway Group is deleted all gateways should be disconnected
- [x] When Gateway Group is updated (eg. routing) broadcast to all
affected gateways to disconnect all the clients
- [x] When Gateway is deleted it should be disconnected
- [x] When Gateway Token is revoked all gateways that use it should be
disconnected
# Relays
- [x] When Relay Group is deleted all relays should be disconnected
- [x] When Relay is deleted it should be disconnected
- [x] When Relay Token is revoked all relays that use it should be
disconnected
# Clients
- [x] Remove Delete Client button, show clients using the token on the
Actors page (#2669)
- [x] When client is deleted disconnect it
- [ ] ~When Gateway is offline broadcast its status to the Clients
connected to it~
- [x] Persist `last_used_token_id` in Clients and show it in tokens UI
# Resources
- [x] When Resource is deleted it should be removed from all gateways
and clients
- [x] When Resource connection is removed it should be deleted from
removed gateway groups
- [x] When Resource is updated (eg. traffic filters) all its
authorizations should be removed
# Authentication
- [x] When Token is deleted related sessions are terminated
- [x] When an Actor is deleted or disabled it should be disconnected
from browser and client
- [x] When Identity is deleted its sessions should be disconnected from
browser and client
- [x] ^ Ensure the same happens for identities during IdP sync
- [x] When IdP is disabled act like all actors for it are disabled?
- [x] When IdP is deleted act like all actors for it are deleted?
# Authorization
- [x] When Policy is created clients that gain access to a resource
should get an update
- [x] When Policy is deleted we need to remove all authorizations it's made
- [x] When Policy is disabled we need to remove all authorizations it's made
- [x] When Actor Group adds or removes a user, related policies should
be re-evaluated
- [x] ^ Ensure the same happens for identities during IdP sync
# Settings
- [x] Re-send init message to Client when DNS settings change
# Code
- [x] Clear way to see all available topics and messages, do not use
binary topics any more
---------
Co-authored-by: conectado <gabrielalejandro7@gmail.com>
Previously, we would print the following whenever the gateway exited:
```
2024-01-25T17:37:53.258145Z INFO init{user_agent="Alpine Linux/3.19.0 (x86_64;6.6.11;) connlib/1.0.0" login_topic="gateway"}: phoenix_channel: Connected to portal, waiting for `init` message
2024-01-25T17:37:53.260751Z WARN init{user_agent="Alpine Linux/3.19.0 (x86_64;6.6.11;) connlib/1.0.0" login_topic="gateway"}: phoenix_channel: Fatal client error (401 Unauthorized) in portal connection: Invalid token
Error: websocket failed
Caused by:
HTTP error: 401 Unauthorized
Stack backtrace:
0: anyhow::error::<impl core::convert::From<E> for anyhow::Error>::from
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/anyhow-1.0.79/src/error.rs:565:25
1: <core::result::Result<T,F> as core::ops::try_trait::FromResidual<core::result::Result<core::convert::Infallible,E>>>::from_residual
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/result.rs:1963:27
2: firezone_gateway::run::{{closure}}
at /build/gateway/src/main.rs:85:26
3: <core::pin::Pin<P> as core::future::future::Future>::poll
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/future/future.rs:125:9
4: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/core.rs:328:17
5: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/loom/std/unsafe_cell.rs:16:9
6: tokio::runtime::task::core::Core<T,S>::poll
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/core.rs:317:13
7: tokio::runtime::task::harness::poll_future::{{closure}}
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:485:19
8: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panic/unwind_safe.rs:272:9
9: std::panicking::try::do_call
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:552:40
10: __rust_try
11: std::panicking::try
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
12: std::panic::catch_unwind
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
13: tokio::runtime::task::harness::poll_future
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:473:18
14: tokio::runtime::task::harness::Harness<T,S>::poll_inner
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:208:27
15: tokio::runtime::task::harness::Harness<T,S>::poll
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:153:15
16: tokio::runtime::task::raw::poll
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/raw.rs:271:5
17: tokio::runtime::task::raw::RawTask::poll
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/raw.rs:201:18
18: tokio::runtime::task::LocalNotified<S>::run
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/mod.rs:416:9
19: tokio::runtime::scheduler::multi_thread::worker::Context::run_task::{{closure}}
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/worker.rs:576:13
20: tokio::runtime::coop::with_budget
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/coop.rs:107:5
21: tokio::runtime::coop::budget
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/coop.rs:73:5
22: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/worker.rs:575:9
23: tokio::runtime::scheduler::multi_thread::worker::Context::run
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/worker.rs:526:24
24: tokio::runtime::scheduler::multi_thread::worker::run::{{closure}}::{{closure}}
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/worker.rs:491:21
25: tokio::runtime::context::scoped::Scoped<T>::set
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context/scoped.rs:40:9
26: tokio::runtime::context::set_scheduler::{{closure}}
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context.rs:176:26
27: std::thread::local::LocalKey<T>::try_with
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/thread/local.rs:270:16
28: std::thread::local::LocalKey<T>::with
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/thread/local.rs:246:9
29: tokio::runtime::context::set_scheduler
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context.rs:176:9
30: tokio::runtime::scheduler::multi_thread::worker::run::{{closure}}
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/worker.rs:486:9
31: tokio::runtime::context::runtime::enter_runtime
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context/runtime.rs:65:16
32: tokio::runtime::scheduler::multi_thread::worker::run
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/worker.rs:478:5
33: tokio::runtime::scheduler::multi_thread::worker::Launch::launch::{{closure}}
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/worker.rs:447:45
34: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/blocking/task.rs:42:21
35: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/core.rs:328:17
36: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/loom/std/unsafe_cell.rs:16:9
37: tokio::runtime::task::core::Core<T,S>::poll
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/core.rs:317:13
38: tokio::runtime::task::harness::poll_future::{{closure}}
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:485:19
39: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panic/unwind_safe.rs:272:9
40: std::panicking::try::do_call
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:552:40
41: __rust_try
42: std::panicking::try
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
43: std::panic::catch_unwind
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
44: tokio::runtime::task::harness::poll_future
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:473:18
45: tokio::runtime::task::harness::Harness<T,S>::poll_inner
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:208:27
46: tokio::runtime::task::harness::Harness<T,S>::poll
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:153:15
47: tokio::runtime::task::raw::poll
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/raw.rs:271:5
48: tokio::runtime::task::raw::RawTask::poll
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/raw.rs:201:18
49: tokio::runtime::task::UnownedTask<S>::run
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/mod.rs:453:9
50: tokio::runtime::blocking::pool::Task::run
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/blocking/pool.rs:159:9
51: tokio::runtime::blocking::pool::Inner::run
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/blocking/pool.rs:513:17
52: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/blocking/pool.rs:471:13
53: std::sys_common::backtrace::__rust_begin_short_backtrace
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:154:18
54: std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}}
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/thread/mod.rs:529:17
55: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panic/unwind_safe.rs:272:9
56: std::panicking::try::do_call
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:552:40
57: __rust_try
58: std::panicking::try
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
59: std::panic::catch_unwind
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
60: std::thread::Builder::spawn_unchecked_::{{closure}}
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/thread/mod.rs:528:30
61: core::ops::function::FnOnce::call_once{{vtable.shim}}
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:250:5
62: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/boxed.rs:2007:9
63: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/boxed.rs:2007:9
64: std::sys::unix::thread::Thread::new::thread_start
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys/unix/thread.rs:108:17
```
Now, we are just printing this:
```
2024-01-25T17:32:51.613258Z INFO init{user_agent="Alpine Linux/3.19.0 (x86_64;6.6.11;) connlib/1.0.0" login_topic="gateway"}: phoenix_channel: Connected to portal, waiting for `init` message
2024-01-25T17:32:51.617971Z WARN init{user_agent="Alpine Linux/3.19.0 (x86_64;6.6.11;) connlib/1.0.0" login_topic="gateway"}: phoenix_channel: Fatal client error (401 Unauthorized) in portal connection: Invalid token
2024-01-25T17:32:51.619680Z ERROR firezone_gateway: websocket failed: HTTP error: 401 Unauthorized
```
Resolves: #3401.
Currently, only the gateway has reconnect logic for (transient) errors
when connecting to the portal. Instead of duplicating this for the
relay, I moved the reconnect state machine into `phoenix-channel`. This
means the relay now automatically gets it too, and in the future the
clients will also benefit from it.
As a nice benefit, this also greatly simplifies the gateway's
`Eventloop` and removes a bunch of cruft with channels.
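For illustration, the kind of retry policy that now lives inside
`phoenix-channel` (the actual backoff parameters and implementation differ):
```rust
use std::time::Duration;

/// Illustrative only: reconnect after an exponentially growing delay, capped
/// at a maximum, whenever a transient connection error occurs.
fn next_backoff(attempt: u32) -> Duration {
    let base = Duration::from_millis(500);
    let max = Duration::from_secs(30);

    base.saturating_mul(2u32.saturating_pow(attempt)).min(max)
}
```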
Resolves: #2915.
This required making `allow_access` `async`, which is ugly, but we can
fix it later like we did with `set_peer_connection_request`. We are
doing this ASAP because otherwise it would block the `peers_by_ip`
struct and also block the executor a bunch of times, slowing everything
down.
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>