Moves almost half the lines of code into `ipc_service` so that it's
separate from the headless Client code.
Moves the standalone / headless Client code into `standalone`.
Currently, connlib re-initialises the TUN device on Linux every time its
configuration gets updated, such as when roaming from one network to
another. This is unnecessary. Instead, we can adopt the same approach as
already used on macOS, iOS and Windows and only initialise it if it
doesn't exist yet.
Doing so surfaces an interesting bug. Currently, attempting to
re-initialise the TUN device fails with a warning:
> connlib_client_shared::eventloop: Failed to set interface on tunnel:
Resource busy (os error 16)
See
https://github.com/firezone/firezone/actions/runs/9656570163/job/26634409346#step:7:103
for an example. As a consequence, we never actually trigger the
`on_set_interface_config` callback and thus never actually set the new
IPs on the TUN device.
Now that we _are_ calling this callback, we execute
`TunDeviceManager::set_ips` which first clears all IPs from the device
and then attaches the new ones. A consequence of this is that the Linux
kernel will clear all routes associated with the device. This clashes
with an optimisation we have in `TunDeviceManager` where we remember the
previously set routes and don't set new ones if they are the same.
This `HashSet` needs to be cleared upon setting new IPs in order to
actually set the new routes correctly afterwards. Without that, we stop
receiving traffic on the TUN device.
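
A minimal sketch of the interaction described above, assuming a simplified stand-in for `TunDeviceManager`. Routes are plain strings, nothing talks to the kernel, and `route_updates` and the method signatures are hypothetical:

```rust
use std::collections::HashSet;
use std::net::IpAddr;

/// Simplified, hypothetical stand-in for the real `TunDeviceManager`.
#[derive(Default)]
pub struct TunDeviceManager {
    ips: Vec<IpAddr>,
    /// Routes we believe are currently installed on the device.
    routes: HashSet<String>,
    /// How often routes were (re)applied; for illustration only.
    pub route_updates: usize,
}

impl TunDeviceManager {
    /// Replaces the device's IPs. The kernel drops the device's routes when
    /// its addresses are cleared, so the cached routes are stale afterwards
    /// and must be forgotten.
    pub fn set_ips(&mut self, ips: Vec<IpAddr>) {
        self.ips = ips;
        self.routes.clear();
    }

    /// Applies `routes` unless they match the cached set.
    /// Returns `true` if the routes were (re)applied.
    pub fn set_routes(&mut self, routes: HashSet<String>) -> bool {
        if routes == self.routes {
            return false; // The optimisation: nothing changed, nothing to do.
        }
        self.routes = routes;
        self.route_updates += 1;
        true
    }

    /// The IPs most recently set on the device.
    pub fn ips(&self) -> &[IpAddr] {
        &self.ips
    }
}
```

Without the `clear()` in `set_ips`, a later `set_routes` call with the same routes would be skipped even though the kernel has already removed them.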
Closes #5450
Now the entire `Handler::run` function is allowed to fail, similar to a
web request handler failing in a web server.
Previously we only allowed the Handler to fail if it was idle, waiting
on incoming IPC requests. Now it can fail even if it's working with
connlib and about to send over IPC.
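
To illustrate the idea (this is not the actual `Handler` code), here is a small synchronous sketch. The `steps` field and the `serve` function are hypothetical and stand in for the work done during one IPC session:

```rust
/// Hypothetical handler: `steps` stands in for the outcome of each step of a
/// session (waiting for a request, talking to connlib, sending over IPC, ...).
pub struct Handler {
    pub steps: Vec<Result<(), String>>,
}

impl Handler {
    /// Runs one session. Any step may fail, not only the idle wait.
    pub fn run(self) -> Result<(), String> {
        for step in self.steps {
            step?;
        }
        Ok(())
    }
}

/// Serves sessions one after another. A failed session is logged and the
/// service moves on to the next one, like a web server surviving a failed
/// request handler. Returns the number of failed sessions.
pub fn serve(sessions: Vec<Handler>) -> usize {
    let mut failures = 0;
    for handler in sessions {
        if let Err(error) = handler.run() {
            eprintln!("Handler failed: {error}");
            failures += 1;
        }
    }
    failures
}
```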
I replicated this on my Windows 11 VM in Parallels and the fix works
fine there. Should be the same bug and same fix in Linux.
Closes #5481
With this, I can connect to the staging portal without a `build.rs` or any
extra env var setup.
<img width="387" alt="image"
src="https://github.com/firezone/firezone/assets/13400041/9c080b36-3a76-49c7-b706-20723697edc7">
```[tasklist]
### Next steps
- [x] Split out a refactor PR for `ConnectArgs` (#5488)
- [x] Try doing this for other Clients
- [x] Check Gateway
- [x] Check Tauri Client
- [x] Change to `app_version`
- [x] Open for review
- [ ] Use `option_env` so that `FIREZONE_PACKAGE_VERSION` can still override the Cargo.toml version for local testing
- [ ] Check Android Client
- [ ] Check Apple Client
```
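
One possible shape for the unchecked `option_env` item above, as a sketch only. The function name `app_version` and the final fallback string are assumptions. `CARGO_PKG_VERSION` is set by Cargo at compile time, and it is read with `option_env!` here only so the snippet also compiles outside Cargo:

```rust
/// Returns the version string to report.
///
/// `FIREZONE_PACKAGE_VERSION`, if set at compile time, overrides the
/// version from Cargo.toml. The last fallback is a hypothetical placeholder.
pub fn app_version() -> &'static str {
    option_env!("FIREZONE_PACKAGE_VERSION")
        .or(option_env!("CARGO_PKG_VERSION"))
        .unwrap_or("0.0.0-unknown")
}
```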
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
This is extracted from #5487 since I needed to add an 8th parameter and
Clippy said 8 is too many.
Refs #2986
Stepping stone towards using the Builder pattern. There are only a few
Clients, so this has 80% of the advantage for 20% of the effort.
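
For context, Clippy's `too_many_arguments` lint fires above 7 parameters by default, and bundling them into a struct avoids that. A rough sketch of the pattern follows. The field names are hypothetical and not the real `ConnectArgs` fields:

```rust
/// Hypothetical fields; the real `ConnectArgs` may look different.
pub struct ConnectArgs {
    pub api_url: String,
    pub token: String,
    pub device_id: String,
    pub device_name: Option<String>,
    pub max_partition_time_secs: Option<u64>,
}

/// Takes one struct instead of a long positional parameter list, so adding
/// another argument later does not change the function signature.
pub fn connect(args: ConnectArgs) -> String {
    let name = args.device_name.unwrap_or_else(|| "unnamed".to_string());
    format!("{} as {} ({})", args.api_url, name, args.device_id)
}
```

Call sites name every field, which is most of what a Builder would give us.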
Extracted from https://github.com/firezone/firezone/pull/5426
- Replace `new` and `new_for_test` for IPC servers with `enum ServiceId`
- Rename `debug_command_setup` to `setup_stdout_logging`
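
Roughly, the `ServiceId` change has this shape. This is a sketch: the variant names, `pipe_name` and the name strings are made up for illustration:

```rust
/// Which IPC service a server or client should talk to.
/// The variants are hypothetical.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ServiceId {
    /// The real service.
    Prod,
    /// A test instance, distinguished by an ID so tests don't collide.
    Test(&'static str),
}

impl ServiceId {
    /// Name of the pipe / socket for this service (placeholder strings).
    pub fn pipe_name(self) -> String {
        match self {
            ServiceId::Prod => "example.ipc".to_string(),
            ServiceId::Test(id) => format!("example.ipc.test.{id}"),
        }
    }
}

pub struct Server {
    pub name: String,
}

impl Server {
    /// A single constructor replaces the `new` / `new_for_test` pair.
    pub fn new(id: ServiceId) -> Self {
        Server { name: id.pipe_name() }
    }
}
```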
It turned out there is no clever way to hide other platforms from
`cargo-mutants`. I thought I had such a way, but I didn't.
This is a funny one. `cargo test -p firezone-headless-client -p
firezone-gui-client` actually passes, because the GUI client uses the
pipes feature, and Cargo apparently just does one build for both
packages. But if you build the headless Client by itself, it fails to
build.
I think this caused `cargo-mutants` to consider all its headless Client
mutants to be unviable, and so it didn't show coverage for that package.
Part of a yak shave to profile startup time so that we can reduce it on
Windows (#5026).
Median of 3 runs:
- Windows 11 aarch64 Parallels VM - 4.8 s
- Windows 11 x86_64 laptop - 3.1 s (I thought it used to be slower)
- Windows Server 2022 VM - 22.2 s
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
Closes #5042
Smoke test plan:
- Install on a before-Firezone VM
- Confirm logs default to `str0m=warn,info`
- Set log filter to `debug` in GUI
- Restart IPC service
- Confirm logs are `debug`
- Clear settings back to default
- Restart IPC service
- Confirm logs are `str0m=warn,info`
Directions to apply new log level:
1. Put the new log filter in
2. Click "Apply"
3. Quit Firezone Client
4. Right-click on the Start Menu and click "Terminal (Admin)" to open a
PowerShell prompt
5. Run `Restart-Service -Name FirezoneClientIpcService` (on Linux, `sudo
systemctl restart firezone-client-ipc.service`)
6. Re-open Firezone Client
```[tasklist]
- [x] Log the log filter maybe
- [x] Use `atomicwrites` to write the file
- [x] (cancelled) ~~Make the GUI write the file on boot if it's not there (saves a step when upgrading from older versions)~~
- [x] Windows smoke test
- [x] Fix permissions on `/var/lib/dev.firezone.client/config`
- [x] Fix Linux IPC service not loading the log filter file
- [x] Linux smoke test
- [ ] Make sure it's okay that users in `firezone-client` can change the device ID
- [ ] Update user guides to include restarting the computer or IPC service after updating the log level?
```
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
- Removes version numbers from infra components (elixir/relay)
- Removes version bumping from Rust workspace members that don't get
published
- Splits release publishing into `gateway-`, `headless-client-`, and
`gui-client-`
- Removes auto-deploying new infrastructure when a release is published.
Use the Deploy Production workflow instead.
Fixes #4397
Closes #3567 (again)
Closes #5214
Ready for review
```[tasklist]
### Before merging
- [x] The IPC service should report system uptime when it starts. This will tell us whether the computer was rebooted or just the IPC service itself was upgraded / rebooted.
- [x] The IPC service should report the PID of itself and the GUI if possible
- [x] The GUI should report the PID of the IPC service if possible
- [x] Extra logging between `GIT_VERSION = ` and the token loading log line, especially right before and right after the critical Tauri launching step
- [x] If a 2nd GUI or IPC service runs and exits due to single-instance, it must log that
- [x] Remove redundant DNS deactivation when IPC service starts (I think conectado noticed this in another PR)
- [x] Manually test that the GUI logs something on clean shutdown
- [x] Logarithmic heartbeat?
- [x] If possible, log monotonic time somewhere so NTP syncs don't make the logs unreadable (uptime in the heartbeat should be monotonic, mostly)
- [x] Apply the same logging fix to the IPC service
- [x] Ensure log zips include GUI crash dumps
- [x] ~~Fix #5042~~ (that's a separate issue, I don't want to drag this PR out)
- [x] Test IPC service restart (logs as a stop event)
- [x] Test IPC service stop
- [x] Test IPC service logs during system suspend (Not logged, maybe because we aren't subscribed to power events)
- [x] Test IPC service logs during system reboot (Logged as shutdown, we exit gracefully)
- [x] Test IPC service logs during system shut down (Logged as a suspend)
- [x] Test IPC service upgrade (Logged as a stop)
- [x] Log unhandled events from the Windows service controller (Power events like suspend and resume are logged and not handled)
```
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Closes #5143
The initial half-second backoff should typically be enough, and if the
user is manually re-opening the GUI after a GUI crash, I don't think
they'll notice. If they do, they can open the GUI again and it should
all work.
Most of these were in `known_dirs.rs` because it's platform-specific and
`cargo-mutants` wasn't ignoring other platforms correctly.
Using `cargo mutants -p firezone-gui-client -p firezone-headless-client`
176 / 236 mutants missed before
155 / 206 mutants missed after
Refs #3636 (This pays down some of the technical debt from Linux DNS)
Refs #4473 (This partially fulfills it)
Refs #5068 (This is needed to make `FIREZONE_DNS_CONTROL` mandatory)
As of dd6421:
- On both Linux and Windows, DNS control and IP setting (i.e.
`on_set_interface_config`) both move to the Client
- On Windows, route setting stays in `tun_windows.rs`. Route setting in
Windows requires us to know the interface index, which we don't know in
the Client code. If we could pass opaque platform-specific data between
the tunnel and the Client it would be easy.
- On Linux, route setting moves to the Client and Gateway, which
completely removes the `worker` task in `tun_linux.rs`
- Notifying systemd that we're ready moves up to the headless Client /
IPC service
```[tasklist]
### Before merging / notes
- [x] Does DNS roaming work on Linux on `main`? I don't see where it hooks up. I think I only set up DNS in `Tun::new` (Yes, the `Tun` gets recreated every time we reconfigure the device)
- [x] Fix Windows Clients
- [x] Fix Gateway
- [x] Make sure connlib doesn't get the DNS control method from the env var (will be fixed in #5068)
- [x] De-dupe consts
- [ ] ~~Add DNS control test~~ (failed)
- [ ] Smoke test Linux
- [ ] Smoke test Windows
```
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.116 to
1.0.117.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/serde-rs/json/releases">serde_json's
releases</a>.</em></p>
<blockquote>
<h2>v1.0.117</h2>
<ul>
<li>Resolve unexpected_cfgs warning (<a
href="https://redirect.github.com/serde-rs/json/issues/1130">#1130</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="0ae247ca63"><code>0ae247c</code></a>
Release 1.0.117</li>
<li><a
href="4517c7a2d9"><code>4517c7a</code></a>
PartialEq is not implemented between Value and 128-bit ints</li>
<li><a
href="fdf99c7c38"><code>fdf99c7</code></a>
Combine number PartialEq tests</li>
<li><a
href="b4fc2451d7"><code>b4fc245</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/json/issues/1130">#1130</a>
from serde-rs/checkcfg</li>
<li><a
href="98f1a247de"><code>98f1a24</code></a>
Resolve unexpected_cfgs warning</li>
<li>See full diff in <a
href="https://github.com/serde-rs/json/compare/v1.0.116...v1.0.117">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Refs #5022
The debug IPC service has been useful on Windows, and since there is
more refactoring to do, I want it on Linux too.
With this you can just do `sudo -E target/debug/firezone-client-ipc
debug-ipc-service` and it will launch an IPC service without messing
with systemd or installing anything. (Assuming the directory for the
socket is created)
```[tasklist]
### Before merging
- [ ] Check for regressions in Windows
- [ ] Check for regressions in Linux
```
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.197 to
1.0.203.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/serde-rs/serde/releases">serde's
releases</a>.</em></p>
<blockquote>
<h2>v1.0.203</h2>
<ul>
<li>Documentation improvements (<a
href="https://redirect.github.com/serde-rs/serde/issues/2747">#2747</a>)</li>
</ul>
<h2>v1.0.202</h2>
<ul>
<li>Provide public access to RenameAllRules in serde_derive_internals
(<a
href="https://redirect.github.com/serde-rs/serde/issues/2743">#2743</a>)</li>
</ul>
<h2>v1.0.201</h2>
<ul>
<li>Resolve unexpected_cfgs warning (<a
href="https://redirect.github.com/serde-rs/serde/issues/2737">#2737</a>)</li>
</ul>
<h2>v1.0.200</h2>
<ul>
<li>Fix formatting of "invalid type" and "invalid
value" deserialization error messages containing NaN or infinite
floats (<a
href="https://redirect.github.com/serde-rs/serde/issues/2733">#2733</a>,
thanks <a
href="https://github.com/jamessan"><code>@jamessan</code></a>)</li>
</ul>
<h2>v1.0.199</h2>
<ul>
<li>Fix ambiguous associated item when
<code>forward_to_deserialize_any!</code> is used on an enum with
<code>Error</code> variant (<a
href="https://redirect.github.com/serde-rs/serde/issues/2732">#2732</a>,
thanks <a
href="https://github.com/aatifsyed"><code>@aatifsyed</code></a>)</li>
</ul>
<h2>v1.0.198</h2>
<ul>
<li>Support serializing and deserializing
<code>Saturating&lt;T&gt;</code> (<a
href="https://redirect.github.com/serde-rs/serde/issues/2709">#2709</a>,
thanks <a
href="https://github.com/jbethune"><code>@jbethune</code></a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="d5bc546ca5"><code>d5bc546</code></a>
Release 1.0.203</li>
<li><a
href="45ae217728"><code>45ae217</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2747">#2747</a>
from dtolnay/variadic</li>
<li><a
href="b7b97dda73"><code>b7b97dd</code></a>
Unindent implementation inside tuple_impl_body macro</li>
<li><a
href="5d3c563d46"><code>5d3c563</code></a>
Document tuple impls as fake variadic</li>
<li><a
href="376185458b"><code>3761854</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2745">#2745</a>
from dtolnay/docsrs</li>
<li><a
href="a8f14840ab"><code>a8f1484</code></a>
Rely on docs.rs to define --cfg=docsrs by default</li>
<li><a
href="9e32a40b1c"><code>9e32a40</code></a>
Release 1.0.202</li>
<li><a
href="87f635e54d"><code>87f635e</code></a>
Release serde_derive_internals 0.29.1</li>
<li><a
href="d4b2dfbde2"><code>d4b2dfb</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2743">#2743</a>
from dtolnay/renameallrules</li>
<li><a
href="f6ab0bc56f"><code>f6ab0bc</code></a>
Provide public access to RenameAllRules in serde_derive_internals</li>
<li>Additional commits viewable in <a
href="https://github.com/serde-rs/serde/compare/v1.0.197...v1.0.203">compare
view</a></li>
</ul>
</details>
<br />
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Closes #4995
Closes #4925
Closes #4997
Closes #5047
Supersedes #4965 and #5004.
NOT changing:
- Page description for other Clients. That is still "Firezone
Documentation"
Need these Clients:
- Windows GUI
- Linux headless
- Linux GUI
to have these things documented: (with exact terms)
- Prerequisites
- Installation
- Usage
- Signing in
- Accessing a Resource
- Signing out
- Quitting
- Upgrading
- Diagnostic logs
- Uninstalling
- Troubleshooting
- DNS not reverted after exit
- DNS Resource not accessible
- Known issues
```[tasklist]
### Before merging
- [x] Test Windows GUI instructions
- [x] Add troubleshooting for #5027
- [x] Fill in troubleshooting sections
- [x] Test Linux GUI instructions
- [x] Linux headless - Make sure SIGTERM or Ctrl+C or whatever reverts resolv.conf
- [x] Test Linux Headless instructions
- [x] Page descriptions should be "How to install and use the Firezone $OS $UI client."
- [x] ~~Linux headless - Confirm behaviors and default values of all env vars~~ (skipping - The ones that are used are exercised)
- [x] Grep for TODOs
- [x] Change "un-install" to "uninstall"
- [x] Capitalize "Client" where needed
- [x] Change "IPC service" to "Tunnel service" or something
- [x] Change "SplitDNS" to "Split DNS"
- [ ] Wait for next Client release to be cut
```
---------
Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
Ready for review.
Closes #3712.
Supersedes #4940.
Refs #4963.
I haven't figured out if it needs any new automated tests (unit,
integration, etc.) but the code itself is ready for review. There is
more refactoring that could be done, or could be left for later.
```[tasklist]
- [x] Move wintun setup from GUI to IPC service / headless client
- [x] Make sure the device ID is in a sensible place
- [x] Export IPC service logs in the zips
- [x] Test GUI + SC IPC service on Windows (f4db808919a passed)
- [x] Make sure IPC service does not busy-loop
- [x] Test un-install checklist for Windows
- [x] Test upgrade checklist for Windows
- [x] Test GUI + systemd IPC service on Linux (c4ab7e7 passed)
- [x] Test upgrade checklist for Linux
- [x] Test un-install checklist for Linux
- [x] Make sure the IPC service logs out and deactivates DNS control if the GUI crashes
- [x] Test network changing
- [x] (it's intended behavior) ~~Look into spurious `on_update_resources` (fad86babd7)~~
- [x] ~~Test max partition time on offline laptop~~ (I ended up just setting a 30-day default in the code)
- [x] Make sure headless Client does not busy-loop
- [x] Test standalone headless on Linux
- [ ] Add unit / integration tests
- [ ] Think about security a bit #3971
```
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
This PR introduces a per-site `Status`. It's used to report to the client
whether a site is unknown, online or offline, mostly as a hint to
users as to what's wrong with a connection.
These are the criteria for an online or offline resource:
* If all sites related to a resource are offline the resource is
considered offline, since there's no gateway that can respond to that
resource's connection
* If any site is online the resource is online, since that same peer can
be used to reach that resource
* Any other case is unknown
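
The criteria above can be sketched as a small function. The `Status` variants follow this description, while the function name and the treatment of a resource with no sites as unknown are assumptions:

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Status {
    Unknown,
    Online,
    Offline,
}

/// Derives a resource's status from the statuses of its sites.
pub fn resource_status(sites: &[Status]) -> Status {
    // Any online site means a gateway can serve the resource.
    if sites.iter().any(|s| *s == Status::Online) {
        return Status::Online;
    }
    // Offline only if every site is offline.
    // Assumption: a resource with no sites at all counts as unknown.
    if !sites.is_empty() && sites.iter().all(|s| *s == Status::Offline) {
        return Status::Offline;
    }
    Status::Unknown
}
```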
Right now resources are single site so it doesn't matter too much but
tracking online/offline per-site instead of per-gateway or resource
seems like the better long-term solution.
The way to "find out" the site's status is:
* If the response to a connection details request is offline, all sites
related to that resource must be offline; otherwise there would have been
a gateway in the response
* At the point we connect to a gateway, the site that corresponds to
that gateway must be online
* When a connection to a peer stops, the site is considered unknown again
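
As a sketch of these transitions, assuming hypothetical `Event` and `SiteStatuses` types and a numeric `SiteId` (none of these names come from the actual code):

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Status {
    Unknown,
    Online,
    Offline,
}

/// Hypothetical site identifier.
pub type SiteId = u32;

/// Hypothetical events, one per rule in the list above.
pub enum Event {
    /// A connection details request came back as offline for a resource
    /// that belongs to `sites`.
    ResourceOffline { sites: Vec<SiteId> },
    /// We connected to a gateway that belongs to `site`.
    GatewayConnected { site: SiteId },
    /// The connection to a peer in `site` stopped.
    PeerDisconnected { site: SiteId },
}

#[derive(Default)]
pub struct SiteStatuses {
    statuses: HashMap<SiteId, Status>,
}

impl SiteStatuses {
    /// Updates the tracked statuses according to `event`.
    pub fn apply(&mut self, event: Event) {
        match event {
            Event::ResourceOffline { sites } => {
                for site in sites {
                    self.statuses.insert(site, Status::Offline);
                }
            }
            Event::GatewayConnected { site } => {
                self.statuses.insert(site, Status::Online);
            }
            Event::PeerDisconnected { site } => {
                self.statuses.insert(site, Status::Unknown);
            }
        }
    }

    /// Sites we have never heard about are unknown.
    pub fn get(&self, site: SiteId) -> Status {
        self.statuses.get(&site).copied().unwrap_or(Status::Unknown)
    }
}
```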
Fixes #4738
Closes #4907
They're still accepted, but the binary entirely determines the behavior.
This makes the code for CLI parsing and token handling simpler with
fewer branches, so it's easier to be sure it's correct.
Replaces #4942 which isn't doing what I intended anymore.
Closes #4899
This has a known gap where theoretically the GUI could sign in while the
service is hung in startup, and then the service would wipe out the
GUI's DNS rules.
The workaround for that would be to restart the GUI, but in practice I
think the gap will not be hit, and it will go away once #3712 is done
anyway.
I tested it manually once using the reproduction steps from #4899 and it
worked.
```[tasklist]
### Before merging
- [x] Make sure the service auto-starts
- [x] Make the process idle and report its status to Windows properly using https://github.com/mullvad/windows-service-rs
- [x] DRY log dir code
- [x] Figure out where service logs will go and how the GUI will zip them
- [x] Make sure the service gets a shut down signal from Windows (this is hard to catch in the Tauri GUI)
- [x] Make sure the service restarts when Firezone is updated
- [x] Make sure the service is stopped and un-installed when Firezone is un-installed
- [x] Add test to install the MSI and check that the service runs
- [x] (will move to another PR) ~~Clean up function names~~
- [x] Make sure the Linux GUI was not broken by refactoring
```
Is this worth it?
```[tasklist]
### Before merging
- [x] Double-check docs and ask Jamil to review
- [x] Would need Brian to review the terraform thing
- [x] Make sure Docker compat isn't broken for existing users (shouldn't be, the image is still just `client`)
- [x] Decide whether compatibility tests need to pass (if something breaks after merge we can revert this)
```
```[tasklist]
### Before merging
- [x] Add CI test to check that the Unix domain socket is owned by `root:firezone` (#4832 will do this)
```
This allows the GUI (running as a normal user who belongs to the
`firezone` group) to read back the connlib logs and export them in the
zip file.
<img width="716" alt="image"
src="https://github.com/firezone/firezone/assets/13400041/59cb7cc5-fd6a-4b27-a311-1b9c56b7b23e">