Currently, an error returned by `Tunnel::poll_next_event` is only
logged. In other words, such errors are never fatal. This makes it hard
to understand which kinds of errors callbacks should return. Because
connlib is used on multiple operating systems, it has no idea how fatal
a particular error is.
This PR removes all of these `Result` return values with the following
consequences:
- For Android, we now panic when a callback fails. This is a slight
change in behaviour. I believe that previously, any exception thrown by
a callback into Android was caught and returned as an error. Now, we
panic because in the FFI layer, we don't have any information on how
fatal the error is. For non-fatal errors, the Android app should simply
not throw an exception. The panics will cause the connlib task to be
shut down which triggers an `on_disconnect`.
- For Swift, there is no behaviour change. The FFI layer already did not
support `Result`s for those callbacks. I don't know how exceptions from
Swift are translated across the FFI layer but there is no change to what
we had before.
- For the Tauri client:
  - I chose to log errors on ERROR level and continue gracefully for the
    DNS resolvers.
  - We panic in case the controller channel is full / closed. That should
    really never happen in practice though unless we are currently shutting
    down the app.
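
As a rough illustration of the shape described above, here is a minimal sketch of a callback trait whose methods return nothing, so each platform decides locally whether a failure is fatal. The trait, method names, and `apply_resolvers` helper are hypothetical and simplified; they are not connlib's actual API.

```rust
use std::net::IpAddr;

/// Hypothetical, simplified callback trait: no method returns a `Result`,
/// so connlib never has to judge how fatal a platform error is.
pub trait Callbacks {
    fn on_update_dns(&self, resolvers: Vec<IpAddr>);
    fn on_disconnect(&self, reason: &str);
}

/// Hypothetical platform hook that may fail.
fn apply_resolvers(resolvers: &[IpAddr]) -> Result<(), String> {
    if resolvers.is_empty() {
        return Err("no resolvers provided".to_owned());
    }
    Ok(())
}

/// A GUI-style implementation that treats DNS failures as non-fatal.
pub struct GuiCallbacks;

impl Callbacks for GuiCallbacks {
    fn on_update_dns(&self, resolvers: Vec<IpAddr>) {
        // Non-fatal on this platform: log at error level and carry on.
        if let Err(e) = apply_resolvers(&resolvers) {
            eprintln!("ERROR failed to apply DNS resolvers: {e}");
        }
    }

    fn on_disconnect(&self, reason: &str) {
        eprintln!("disconnected: {reason}");
    }
}
```

A platform that considers the same failure fatal would panic inside its own implementation instead of logging.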
Resolves: #4064.
This isn't really user-facing, so I marked it down from `feat` to
`chore`. Closes #3817
- If we exit gracefully, `/etc/resolv.conf` is reverted
- We always keep the `.before-firezone` backup in case we lose power and
the revert transaction is corrupted or rolled back
- We use a magic header to detect whether the last run was a crash or
not. If Firezone crashes and the user wants to modify their default DNS,
they need to delete that header so that Firezone won't accidentally
revert its backup and trash their change.
- All error variants for this module replaced with `anyhow::Error` since
they were never matched by callers.
I ran `cargo mutants` locally and it helped me validate the unit tests
and it picked up a `match` branch that I forgot to delete.
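
The crash detection described above could look roughly like the following sketch. The header text, the function names, and the rule for a missing backup are assumptions made for illustration; the description does not spell them out.

```rust
/// Hypothetical marker text; the real header is not given in this description.
const MAGIC_HEADER: &str = "# BEGIN Firezone DNS configuration";

/// Returns true if the first line of the file is the magic header,
/// i.e. the file was last written by Firezone and never reverted.
fn written_by_firezone(contents: &str) -> bool {
    contents.lines().next().map(str::trim_end) == Some(MAGIC_HEADER)
}

#[derive(Debug, PartialEq)]
enum StartupAction {
    /// Clean exit last time, or the user removed the header: leave the file alone.
    Nothing,
    /// The header is still present: the last run crashed, so restore the backup.
    RevertFromBackup,
}

/// Decides what to do with `/etc/resolv.conf` at startup.
/// Assumption: without a backup there is nothing to revert to.
fn startup_action(resolv_conf: &str, backup_exists: bool) -> StartupAction {
    if written_by_firezone(resolv_conf) && backup_exists {
        StartupAction::RevertFromBackup
    } else {
        StartupAction::Nothing
    }
}
```

This also shows why a user who edits their DNS after a crash must delete the header: as long as it is the first line, the next startup reverts to the backup.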
```[tasklist]
- [x] (Failed: Integration tests didn't like it) ~~Add the system default resolvers below Firezone's sentinels~~
- [x] `tracing::info` "Last run crashed" if we have to revert the file at startup
```
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
I ended up calling it `reconnect` because that is really what we are
doing:
- We reconnect to the portal.
- We "reconnect" to all relays, i.e. refresh the allocations.
I decided **not** to use an ICE restart. An ICE restart clears the local
as well as the remote credentials, meaning we would need to run another
instance of the signalling protocol. The current control plane does not
support this and it is also unnecessary in our situation. In the case of
an actual network change (e.g. Wi-Fi to cellular), refreshing the
allocations will turn up new candidates as that is how we discovered our
original ones in the first place. Because we constantly operate in ICE
trickle mode, those will be sent to the remote via the control plane and
we start testing them.
As those new paths become available, str0m will automatically nominate
them in case the current one runs into an ICE timeout. Here is a
screen-recording of the Linux CLI client where `Session::refresh` is
triggered via the SIGHUP signal:
[Screencast from 2024-03-14
11-16-47.webm](https://github.com/firezone/firezone/assets/5486389/7171d199-f2a2-4b22-92c8-243494d5d6d8)
Provides the infrastructure for: #4028.
Currently, each use of `Session` creates its own `Runtime`. That is
unnecessary because some platforms already have a tokio runtime running.
Instead of creating another one, we simply ask the caller to provide us
with a `Handle` to an existing tokio runtime. For Android and iOS we
spawn a new single-threaded runtime to satisfy this new requirement.
This refactors `Session` to allow for commands to be sent to the
`Eventloop`. Currently, we only send a `Stop` command. With #3429, we
will add more commands like refreshing and updating the DNS servers.
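
A minimal sketch of this command pattern follows. It uses a standard-library channel and thread to stay self-contained, whereas the real code runs on tokio; the `SetDns` variant and all method names are hypothetical.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread::{self, JoinHandle};

/// Commands the session handle can send to the event loop.
#[derive(Debug, PartialEq)]
pub enum Command {
    Stop,
    /// Hypothetical example of a command that could be added later.
    SetDns(Vec<String>),
}

/// Handle owned by the caller; it only talks to the event loop via commands.
pub struct Session {
    tx: Sender<Command>,
    worker: JoinHandle<Vec<Command>>,
}

impl Session {
    pub fn connect() -> Self {
        let (tx, rx) = mpsc::channel();
        let worker = thread::spawn(move || eventloop(rx));
        Self { tx, worker }
    }

    pub fn set_dns(&self, servers: Vec<String>) {
        // Ignore the error: it only occurs if the event loop already exited.
        let _ = self.tx.send(Command::SetDns(servers));
    }

    /// Sends `Stop`, waits for the loop to finish, and returns what it handled.
    pub fn disconnect(self) -> Vec<Command> {
        let Session { tx, worker } = self;
        let _ = tx.send(Command::Stop);
        worker.join().expect("event loop panicked")
    }
}

/// Processes commands in order until `Stop` arrives or all senders are dropped.
fn eventloop(rx: Receiver<Command>) -> Vec<Command> {
    let mut handled = Vec::new();
    for cmd in rx {
        let stop = cmd == Command::Stop;
        handled.push(cmd);
        if stop {
            break;
        }
    }
    handled
}
```

Adding a new capability then means adding an enum variant and a match arm in the loop, rather than sharing more state between the handle and the loop.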
Currently, we are passing a lot of data into `Session::connect`. Half of
this data is only needed to construct the URL we will use to connect to
the portal. We can simplify this by extracting a dedicated `LoginUrl`
component that captures and validates this data early.
Not only does this reduce the number of parameters we pass to
`Session::connect`, it also reduces the number of failure cases we have
to deal with in `Session::connect`. Any time the session fails, we have
to call `onDisconnected` to inform the client. Thus, we should perform
as much validation as we can early on. In other words, once
`Session::connect` returns, the client should be able to expect that the
tunnel is starting.
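
To illustrate the idea, here is a simplified sketch of a type that validates its inputs on construction. The field set, error variants, URL path, and query parameter names are assumptions, and the sketch skips percent-encoding for brevity.

```rust
#[derive(Debug, PartialEq)]
pub enum LoginUrlError {
    UnsupportedScheme,
    EmptyToken,
    EmptyDeviceId,
}

/// A login URL that is known to be well-formed once constructed.
#[derive(Debug, PartialEq)]
pub struct LoginUrl(String);

impl LoginUrl {
    /// Validates all inputs up front, before any session is started.
    pub fn new(base: &str, token: &str, device_id: &str) -> Result<Self, LoginUrlError> {
        if !(base.starts_with("ws://") || base.starts_with("wss://")) {
            return Err(LoginUrlError::UnsupportedScheme);
        }
        if token.is_empty() {
            return Err(LoginUrlError::EmptyToken);
        }
        if device_id.is_empty() {
            return Err(LoginUrlError::EmptyDeviceId);
        }

        // Hypothetical path and parameter names, for illustration only.
        Ok(Self(format!(
            "{}/client/websocket?token={token}&external_id={device_id}",
            base.trim_end_matches('/')
        )))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Taking a `LoginUrl` means this function has no URL-related failure cases left.
pub fn connect(url: &LoginUrl) -> String {
    format!("connecting to {}", url.as_str())
}
```

Because the only way to obtain a `LoginUrl` is through `new`, a function that accepts one can assume the validation has already happened.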
Closes #3815
Breaking changes (these aren't in production yet, so it should be
okay):
- Windows, renaming `device_id.json` to `firezone-id.json` to match the
rest of the code
- Linux GUI, storing the firezone-id under `/var/lib` instead of under
`$HOME`
- Linux GUI, bails out if not run with `sudo --preserve-env` by
detecting `$HOME == root` or `$USER != root`
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Previously, we called `onDisconnect` in two kinds of situations:
- With an error when we wanted the clients to clear the token
- Without an error when the token was still valid (i.e. after a call to
`disconnect` from the clients)
This is redundant. Firezone is designed to **not** have a state of
"signed in but disconnected". Thus, every time connlib calls
`onDisconnect`, we should clear the token and sign the user out.
At present, we only do this for errors with the control plane. Errors in
the actual tunnel are only logged and we continue trying to use the
tunnel. There are errors in the tunnel where we should also give up
(i.e. TUN device gone, fatal IO error, etc). At present, those are not
yet bubbled up but we will at some point. Once we have
https://github.com/firezone/firezone/pull/3682, it will be much easier
to create a type-safe contract that ensures we only disconnect on fatal
errors.
---------
Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
Co-authored-by: ReactorScram <ReactorScram@users.noreply.github.com>
Bumps [clap](https://github.com/clap-rs/clap) from 4.5.1 to 4.5.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/clap-rs/clap/releases">clap's
releases</a>.</em></p>
<blockquote>
<h2>v4.5.2</h2>
<h2>[4.5.2] - 2024-03-06</h2>
<h3>Fixes</h3>
<ul>
<li><em>(macros)</em> Silence a warning</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/clap-rs/clap/blob/master/CHANGELOG.md">clap's
changelog</a>.</em></p>
<blockquote>
<h2>[4.5.2] - 2024-03-06</h2>
<h3>Fixes</h3>
<ul>
<li><em>(macros)</em> Silence a warning</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="f65d421607"><code>f65d421</code></a>
chore: Release</li>
<li><a
href="886b2729e4"><code>886b272</code></a>
docs: Update changelog</li>
<li><a
href="3ba429752f"><code>3ba4297</code></a>
Merge pull request <a
href="https://redirect.github.com/clap-rs/clap/issues/5386">#5386</a>
from amaanq/static-var-name</li>
<li><a
href="2aea9504c4"><code>2aea950</code></a>
fix: Use SCREAMING_SNAKE_CASE for static variable
<code>authors</code></li>
<li><a
href="690f5557d7"><code>690f555</code></a>
Merge pull request <a
href="https://redirect.github.com/clap-rs/clap/issues/5382">#5382</a>
from clap-rs/renovate/pre-commit-action-3.x</li>
<li><a
href="a2aa644368"><code>a2aa644</code></a>
chore(deps): update compatible (dev) (<a
href="https://redirect.github.com/clap-rs/clap/issues/5381">#5381</a>)</li>
<li><a
href="c233de53c0"><code>c233de5</code></a>
chore(deps): update pre-commit/action action to v3.0.1</li>
<li><a
href="d0028d74b5"><code>d0028d7</code></a>
Merge pull request <a
href="https://redirect.github.com/clap-rs/clap/issues/5371">#5371</a>
from BenWiederhake/dev-fix-link-command-trailing_var...</li>
<li><a
href="0076cac7cb"><code>0076cac</code></a>
fix(builder): Don't doc-link to undocumented item</li>
<li>See full diff in <a
href="https://github.com/clap-rs/clap/compare/clap_complete-v4.5.1...v4.5.2">compare
view</a></li>
</ul>
</details>
<br />
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
If `FIREZONE_DNS_CONTROL` is set to `systemd-resolved`, then shell out
to `resolvectl` to request all system DNS queries to go to Firezone's
sentinel DNS server(s).
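
A sketch of what shelling out might look like, assuming the `resolvectl dns` and `resolvectl domain` subcommands are the ones used and that the interface is named `tun-firezone`; both are assumptions, not taken from this description. The functions only build the commands so that they can be inspected before running.

```rust
use std::net::IpAddr;
use std::process::Command;

/// Builds `resolvectl dns <iface> <server>...`, which sets the DNS servers
/// for one link.
fn resolvectl_dns(iface: &str, sentinels: &[IpAddr]) -> Command {
    let mut cmd = Command::new("resolvectl");
    cmd.arg("dns").arg(iface);
    cmd.args(sentinels.iter().map(|ip| ip.to_string()));
    cmd
}

/// Builds `resolvectl domain <iface> ~.`, which makes the link the
/// preferred route for all domains.
fn resolvectl_route_all(iface: &str) -> Command {
    let mut cmd = Command::new("resolvectl");
    cmd.arg("domain").arg(iface).arg("~.");
    cmd
}

/// Renders a command's arguments as strings, e.g. for logging.
fn args_of(cmd: &Command) -> Vec<String> {
    cmd.get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect()
}
```

Calling `.status()` on the returned `Command` would execute it; that requires systemd-resolved and sufficient privileges.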
```[tasklist]
- [ ] Figure out how to stop the runner from using the Docker bridge iface
```
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
Trying to make sure I don't overlook anything. There are 100+ possible
combinations of setups, but these 6 will at least exercise everything
once, and they're probably going to be the most common, right?
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Only user-facing if users are using the Docker image for the Linux
client.
I split off a module for `/etc/resolv.conf` since the code and unit
tests are about 300 lines and aren't related to the rest of the
`tun_linux.rs` code.
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
This splits off the easy parts from #3605.
- Add quotes around `PHOENIX_SECURE_COOKIES` because my local
`docker-compose` considers unquoted 'false' to be a schema error - Env
vars are strings or numbers, not bools, it says
- Create `test.httpbin.docker.local` container in a new subnet so it can
be used as a DNS resource without the existing CIDR resource picking it
up
- Add resources and policies to `seeds.exs` per #3342
- Fix warning about `CONNLIB_LOG_UPLOAD_INTERVAL_SECS` not being set
- Add `resolv-conf` dep and unit tests to `firezone-tunnel` and
`firezone-linux-client`
- Impl `on_disconnect` in the Linux client with `tracing::error!`
- Add comments
```[tasklist]
- [x] (failed) Confirm that the client container actually does stop faster this way
- [x] Wait for tests to pass
- [x] Mark as ready for review
```
Test basic connectivity with the headless client after the portal API
restarts.
Based on top of #3364 to test that portal restarts don't cause a
cascading failure.
Fixes #2948
So it seems that it's easiest just to use an old-fashioned semver
string. This means we'll need to keep a version matrix in the docs of
which components are supported and for how long, but it's better than
having different version schemes for different Firezone components
altogether.
## Changelog
- Updates connlib parameter API_URL (formerly known under different
names as `CONTROL_PLANE_URL`, `PORTAL_URL`, `PORTAL_WS_URL`, and
friends) to be configured as an "advanced" or "hidden" feature at
runtime so that we can test production builds on both staging and
production.
- Makes `AUTH_BASE_URL` configurable at runtime too
- Moves `CONNLIB_LOG_FILTER_STRING` to be configured like this as well
and simplifies its naming
- Fixes a timing attack bug on Android when comparing the `csrf` token
- Adds proper account ID validation to Android to prevent invalid URL
parameter strings from being saved and used
- Cleans up a number of UI / view issues on Android regarding typos,
consistency, etc
- Hides vars from the `relay` CLI that we may not want to expose just
yet
- `get_device_id()` is flawed for connlib components -- SMBios is rarely
available. Data plane components now require a `FIREZONE_ID` instead,
which is used for upserting.
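
For readers unfamiliar with the timing attack mentioned in the changelog above: an ordinary comparison returns at the first mismatching byte, so its duration can leak how much of a guessed token is correct. The sketch below shows the general constant-time technique in Rust; it is not the Android fix itself, which lives in the Android app's own code.

```rust
/// Compares two byte strings without returning early on the first mismatch.
/// The length check does leak the length, which is usually acceptable for
/// fixed-size tokens.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }

    // Accumulate all differences so every byte is always inspected.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y;
    }
    diff == 0
}
```

An optimizing compiler is free to transform such a loop, so production code should prefer a vetted library routine over a hand-rolled one.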
Fixes #2482
Fixes #2471
---------
Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Gabi <gabrielalejandro7@gmail.com>
Fixes #2363
* Rename `relay` package to `firezone-relay` so that the output
binaries match the `firezone-*` CLI naming scheme
* Rename `firezone-headless-client` package to `firezone-linux-client`
for consistency
* Add READMEs for user-facing CLI components (there will also be docs
later)