Closes#5789
The SIGTERM catching would have helped debug #5790
```[tasklist]
### Tasks
- [x] catch SIGTERM and log when systemd shuts us down gracefully
- [x] Log architecture at startup
```
I had to change the smoke test because it had a couple issues:
- The IPC socket had the wrong permissions because I didn't realize you
can tell `su` / `sudo` / `runuser` to set a group in addition to setting
a user
- It had a hard-coded timer of 12 seconds, and one time the test failed
because the IPC service exited before the GUI finished loading. So I
changed it so the IPC service in smoke test mode will wait forever for
exactly one client, then quit
```[tasklist]
### Tasks
- [x] Run `chown` in the Ubuntu smoke test
```
PR #5700 had a typo in it. I didn't notice that these match arms use
`|`, so I accidentally flush the DNS for an event that doesn't need it.
Only `OnUpdateResources` should flush DNS.
```[tasklist]
### Tasks
- [x] Check the GUI saves its settings file
- [x] Check the IPC service writes the device ID to disk
- [x] Check the GUI writes a log file (skipped - we already check if the exported zip has any files in it)
- [x] Run the crash file through `minidump-stackwalk`
- [x] Reach feature parity with the original smoke tests
- [x] Ready for review
- [x] Finish #5452
- [ ] Start on #5453
```
Closes#5052
On my dev VMs:
- systemd-resolved = 15 ms to flush
- Windows = 600 ms to flush
I tested with the headless Clients on Linux and Windows and it fixes the
issue. On Windows I didn't replicate the issue with the GUI Client, on
Linux this patch also fixes it for the GUI Client.
We added this to diagnose a hang in the IPC service, #5441. That hang,
to the best of our knowledge, was caused by a deadlock which we fixed in
#5571. So the heartbeat task just adds a lot of noise to the stdout
which is annoying for debugging and won't be used in production logs.
The system uptime measuring is still useful, so we now log that just
once when logging starts, next to the git version and log directives.
If we see this pattern in either process' logs, we know something is
suspicious:
- Log file ends without a clean shutdown message
- Next log file starts with a high system uptime
Updates should always result in a clean shutdown message, and a sudden
power loss (mains power outage, or laptop battery dying) would result in
the system uptime being low for the 2nd log file.
This does the same thing as #5621 without removing the library, since it
will now compile against whatever version of `windows` we need
We could do the same with `hostname`, either vendor or ask upstream to
bump deps, and then `windows` 0.52.0 should be gone.
```[tasklist]
### Tasks
- [x] Remove macOS code and shrink everything
```
Move almost half the lines of code into `ipc_service` so that it's
separate from the headless Client code
Moves the standalone / headless Client code into `standalone`
Currently, connlib re-initialises the TUN device on Linux every time its
configuration gets updated such as when roaming from one network to
another. This is unnecessary. Instead, we can adopt the same approach as
already used on MacOS, iOS and Windows and only initialise it if it
doesn't exist yet.
Doing so surfaces an interesting bug. Currently, attempting to
re-initialise the TUN device fails with a warning:
> connlib_client_shared::eventloop: Failed to set interface on tunnel:
Resource busy (os error 16)
See
https://github.com/firezone/firezone/actions/runs/9656570163/job/26634409346#step:7:103
for an example. As a consequence, we never actually trigger the
`on_set_interface_config` callback and thus never actually set the new
IPs on the TUN device.
Now that we _are_ calling this callback, we execute
`TunDeviceManager::set_ips` which first clears all IPs from the device
and then attaches the new ones. A consequence of this is that the Linux
kernel will clear all routes associated with the device. This clashes
with an optimisation we have in `TunDeviceManager` where we remember the
previously set routes and don't set new ones if they are the same.
This `HashSet` needs to be cleared upon setting new IPs in order to
actually set the new routes correctly afterwards. Without that, we stop
receiving traffic on the TUN device.
Closes#5450
Now the entire `Handler::run` function is allowed to fail, similar to a
web request handler failing in a web server.
Previously we only allowed the Handler to fail if it was idle, waiting
on incoming IPC requests. Now it can fail even if it's working with
connlib and about to send over IPC.
I replicated this on my Windows 11 VM in Parallels and the fix works
fine there. Should be the same bug and same fix in Linux.
Closes#5481
With this, I can connect to the staging portal without a build.rs or any
extra env var setup
<img width="387" alt="image"
src="https://github.com/firezone/firezone/assets/13400041/9c080b36-3a76-49c7-b706-20723697edc7">
```[tasklist]
### Next steps
- [x] Split out a refactor PR for `ConnectArgs` (#5488)
- [x] Try doing this for other Clients
- [x] Check Gateway
- [x] Check Tauri Client
- [x] Change to `app_version`
- [x] Open for review
- [ ] Use `option_env` so that `FIREZONE_PACKAGE_VERSION` can still override the Cargo.toml version for local testing
- [ ] Check Android Client
- [ ] Check Apple Client
```
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
This is extracted from #5487 since I needed to add an 8th parameter and
Clippy said 8 is too many.
Refs #2986
Stepping stone towards using the Builder pattern. There's only a few
Clients so this has 80% of the advantage for 20% of the effort
Extracted from https://github.com/firezone/firezone/pull/5426
- Replace `new` and `new_for_test` for IPC servers with `enum ServiceId`
- Rename `debug_command_setup` to `setup_stdout_logging`
It turned out there is no clever way to hide other platforms from
`cargo-mutants`, I thought I had such a way
This is a funny one. `cargo test -p firezone-headless-client -p
firezone-gui-client` actually passes, because the GUI client uses the
pipes feature, and Cargo apparently just does one build for both
packages. But if you build the headless Client by itself, it fails to
build.
I think this caused `cargo-mutants` to consider all its headless Client
mutants to be unviable, and so it didn't show coverage for that package.
Part of a yak shave to profile startup time for reducing it on Windows
#5026
Median of 3 runs:
- Windows 11 aarch64 Parallels VM - 4.8 s
- Windows 11 x86_64 laptop - 3.1 s (I thought it used to be slower)
- Windows Server 2022 VM - 22.2 s
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
Closes#5042
Smoke test plan:
- Install on a before-Firezone VM
- Confirm logs default to `str0m=warn,info`
- Set log filter to `debug` in GUI
- Restart IPC service
- Confirm logs are `debug`
- Clear settings back to default
- Restart IPC service
- Confirm logs are `str0m=warn,info`
Directions to apply new log level:
1. Put the new log filter in
2. Click "Apply"
3. Quit Firezone Client
4. Right-click on the Start Menu and click "Terminal (Admin)" to open a
Powershell prompt
5. Run `Restart-Service -Name FirezoneClientIpcService` (on Linux, `sudo
systemctl restart firezone-client-ipc.service`)
6. Re-open Firezone Client
```[tasklist]
- [x] Log the log filter maybe
- [x] Use `atomicwrites` to write the file
- [x] (cancelled) ~~Make the GUI write the file on boot if it's not there (saves a step when upgrading from older versions)~~
- [x] Windows smoke test
- [x] Fix permissions on `/var/lib/dev.firezone.client/config`
- [x] Fix Linux IPC service not loading the log filter file
- [x] Linux smoke test
- [ ] Make sure it's okay that users in `firezone-client` can change the device ID
- [ ] Update user guides to include restarting the computer or IPC service after updating the log level?
```
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
- Removes version numbers from infra components (elixir/relay)
- Removes version bumping from Rust workspace members that don't get
published
- Splits release publishing into `gateway-`, `headless-client-`, and
`gui-client-`
- Removes auto-deploying new infrastructure when a release is published.
Use the Deploy Production workflow instead.
Fixes#4397
Closes#3567 (again)
Closes#5214
Ready for review
```[tasklist]
### Before merging
- [x] The IPC service should report system uptime when it starts. This will tell us whether the computer was rebooted or just the IPC service itself was upgraded / rebooted.
- [x] The IPC service should report the PID of itself and the GUI if possible
- [x] The GUI should report the PID of the IPC service if possible
- [x] Extra logging between `GIT_VERSION = ` and the token loading log line, especially right before and right after the critical Tauri launching step
- [x] If a 2nd GUI or IPC service runs and exits due to single-instance, it must log that
- [x] Remove redundant DNS deactivation when IPC service starts (I think conectado noticed this in another PR)
- [x] Manually test that the GUI logs something on clean shutdown
- [x] Logarithmic heartbeat?
- [x] If possible, log monotonic time somewhere so NTP syncs don't make the logs unreadable (uptime in the heartbeat should be monotonic, mostly)
- [x] Apply the same logging fix to the IPC service
- [x] Ensure log zips include GUI crash dumps
- [x] ~~Fix #5042~~ (that's a separate issue, I don't want to drag this PR out)
- [x] Test IPC service restart (logs as a stop event)
- [x] Test IPC service stop
- [x] Test IPC service logs during system suspend (Not logged, maybe because we aren't subscribed to power events)
- [x] Test IPC service logs during system reboot (Logged as shutdown, we exit gracefully)
- [x] Test IPC service logs during system shut down (Logged as a suspend)
- [x] Test IPC service upgrade (Logged as a stop)
- [x] Log unhandled events from the Windows service controller (Power events like suspend and resume are logged and not handled)
```
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Closes#5143
The initial half-second backoff should typically be enough, and if the
user is manually re-opening the GUI after a GUI crash, I don't think
they'll notice. If they do, they can open the GUI again and it should
all work.
Most of these were in `known_dirs.rs` because it's platform-specific and
`cargo-mutants` wasn't ignoring other platforms correctly.
Using `cargo mutants -p firezone-gui-client -p firezone-headless-client`
176 / 236 mutants missed before
155 / 206 mutants missed after
Refs #3636 (This pays down some of the technical debt from Linux DNS)
Refs #4473 (This partially fulfills it)
Refs #5068 (This is needed to make `FIREZONE_DNS_CONTROL` mandatory)
As of dd6421:
- On both Linux and Windows, DNS control and IP setting (i.e.
`on_set_interface_config`) both move to the Client
- On Windows, route setting stays in `tun_windows.rs`. Route setting in
Windows requires us to know the interface index, which we don't know in
the Client code. If we could pass opaque platform-specific data between
the tunnel and the Client it would be easy.
- On Linux, route setting moves to the Client and Gateway, which
completely removes the `worker` task in `tun_linux.rs`
- Notifying systemd that we're ready moves up to the headless Client /
IPC service
```[tasklist]
### Before merging / notes
- [x] Does DNS roaming work on Linux on `main`? I don't see where it hooks up. I think I only set up DNS in `Tun::new` (Yes, the `Tun` gets recreated every time we reconfigure the device)
- [x] Fix Windows Clients
- [x] Fix Gateway
- [x] Make sure connlib doesn't get the DNS control method from the env var (will be fixed in #5068)
- [x] De-dupe consts
- [ ] ~~Add DNS control test~~ (failed)
- [ ] Smoke test Linux
- [ ] Smoke test Windows
```
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.116 to
1.0.117.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/serde-rs/json/releases">serde_json's
releases</a>.</em></p>
<blockquote>
<h2>v1.0.117</h2>
<ul>
<li>Resolve unexpected_cfgs warning (<a
href="https://redirect.github.com/serde-rs/json/issues/1130">#1130</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="0ae247ca63"><code>0ae247c</code></a>
Release 1.0.117</li>
<li><a
href="4517c7a2d9"><code>4517c7a</code></a>
PartialEq is not implemented between Value and 128-bit ints</li>
<li><a
href="fdf99c7c38"><code>fdf99c7</code></a>
Combine number PartialEq tests</li>
<li><a
href="b4fc2451d7"><code>b4fc245</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/json/issues/1130">#1130</a>
from serde-rs/checkcfg</li>
<li><a
href="98f1a247de"><code>98f1a24</code></a>
Resolve unexpected_cfgs warning</li>
<li>See full diff in <a
href="https://github.com/serde-rs/json/compare/v1.0.116...v1.0.117">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Refs #5022
The debug IPC service has been useful on Windows, and since there is
more refactoring to do, I want it on Linux too.
With this you can just do `sudo -E target/debug/firezone-client-ipc
debug-ipc-service` and it will launch an IPC service without messing
with systemd or installing anything. (Assuming the directory for the
socket is created)
```[tasklist]
### Before merging
- [ ] Check for regressions in Windows
- [ ] Check for regressions in Linux
```
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.197 to
1.0.203.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/serde-rs/serde/releases">serde's
releases</a>.</em></p>
<blockquote>
<h2>v1.0.203</h2>
<ul>
<li>Documentation improvements (<a
href="https://redirect.github.com/serde-rs/serde/issues/2747">#2747</a>)</li>
</ul>
<h2>v1.0.202</h2>
<ul>
<li>Provide public access to RenameAllRules in serde_derive_internals
(<a
href="https://redirect.github.com/serde-rs/serde/issues/2743">#2743</a>)</li>
</ul>
<h2>v1.0.201</h2>
<ul>
<li>Resolve unexpected_cfgs warning (<a
href="https://redirect.github.com/serde-rs/serde/issues/2737">#2737</a>)</li>
</ul>
<h2>v1.0.200</h2>
<ul>
<li>Fix formatting of "invalid type" and "invalid
value" deserialization error messages containing NaN or infinite
floats (<a
href="https://redirect.github.com/serde-rs/serde/issues/2733">#2733</a>,
thanks <a
href="https://github.com/jamessan"><code>@jamessan</code></a>)</li>
</ul>
<h2>v1.0.199</h2>
<ul>
<li>Fix ambiguous associated item when
<code>forward_to_deserialize_any!</code> is used on an enum with
<code>Error</code> variant (<a
href="https://redirect.github.com/serde-rs/serde/issues/2732">#2732</a>,
thanks <a
href="https://github.com/aatifsyed"><code>@aatifsyed</code></a>)</li>
</ul>
<h2>v1.0.198</h2>
<ul>
<li>Support serializing and deserializing
<code>Saturating<T></code> (<a
href="https://redirect.github.com/serde-rs/serde/issues/2709">#2709</a>,
thanks <a
href="https://github.com/jbethune"><code>@jbethune</code></a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="d5bc546ca5"><code>d5bc546</code></a>
Release 1.0.203</li>
<li><a
href="45ae217728"><code>45ae217</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2747">#2747</a>
from dtolnay/variadic</li>
<li><a
href="b7b97dda73"><code>b7b97dd</code></a>
Unindent implementation inside tuple_impl_body macro</li>
<li><a
href="5d3c563d46"><code>5d3c563</code></a>
Document tuple impls as fake variadic</li>
<li><a
href="376185458b"><code>3761854</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2745">#2745</a>
from dtolnay/docsrs</li>
<li><a
href="a8f14840ab"><code>a8f1484</code></a>
Rely on docs.rs to define --cfg=docsrs by default</li>
<li><a
href="9e32a40b1c"><code>9e32a40</code></a>
Release 1.0.202</li>
<li><a
href="87f635e54d"><code>87f635e</code></a>
Release serde_derive_internals 0.29.1</li>
<li><a
href="d4b2dfbde2"><code>d4b2dfb</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2743">#2743</a>
from dtolnay/renameallrules</li>
<li><a
href="f6ab0bc56f"><code>f6ab0bc</code></a>
Provide public access to RenameAllRules in serde_derive_internals</li>
<li>Additional commits viewable in <a
href="https://github.com/serde-rs/serde/compare/v1.0.197...v1.0.203">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Closes#4995Closes#4925Closes#4997Closes#5047
Supersedes #4965 and #5004.
NOT changing:
- Page description for other Clients. That is still "Firezone
Documentation"
Need these Clients:
- Windows GUI
- Linux headless
- Linux GUI
to have these things documented: (with exact terms)
- Prerequisites
- Installation
- Usage
- Signing in
- Accessing a Resource
- Signing out
- Quitting
- Upgrading
- Diagnostic logs
- Uninstalling
- Troubleshooting
- DNS not reverted after exit
- DNS Resource not accessible
- Known issues
```[tasklist]
### Before merging
- [x] Test Windows GUI instructions
- [x] Add troubleshooting for #5027
- [x] Fill in troubleshooting sections
- [x] Test Linux GUI instructions
- [x] Linux headless - Make sure SIGTERM or Ctrl+C or whatever reverts resolv.conf
- [x] Test Linux Headless instructions
- [x] Page descriptions should be "How to install and use the Firezone $OS $UI client."
- [x] ~~Linux headless - Confirm behaviors and default values of all env vars~~ (skipping - The ones that are used are exercised)
- [x] Grep for TODOs
- [x] Change "un-install" to "uninstall"
- [x] Capitalize "Client" where needed
- [x] Change "IPC service" to "Tunnel service" or something
- [x] Change "SplitDNS" to "Split DNS"
- [ ] Wait for next Client release to be cut
```
---------
Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
Ready for review.
Closes#3712.
Supersedes #4940.
Refs #4963.
I haven't figured out if it needs any new automated tests (unit,
integration, etc.) but the code itself is ready for review. There is
more refactoring that could be done, or could be left for later.
```[tasklist]
- [x] Move wintun setup from GUI to IPC service / headless client
- [x] Make sure the device ID is in a sensible place
- [x] Export IPC service logs in the zips
- [x] Test GUI + SC IPC service on Windows (f4db808919a passed)
- [x] Make sure IPC service does not busy-loop
- [x] Test un-install checklist for Windows
- [x] Test upgrade checklist for Windows
- [x] Test GUI + systemd IPC service on Linux (c4ab7e7 passed)
- [x] Test upgrade checklist for Linux
- [x] Test un-install checklist for Linux
- [x] Make sure the IPC service logs out and deactivates DNS control if the GUI crashes
- [x] Test network changing
- [x] (it's intended behavior) ~~Look into spurious `on_update_resources` (fad86babd7)~~
- [x] ~~Test max partition time on offline laptop~~ (I ended up just setting a 30-day default in the code)
- [x] Make sure headless Client does not busy-loop
- [x] Test standalone headless on Linux
- [ ] Add unit / integration tests
- [ ] Think about security a bit #3971
```
---------
Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
This PR introduces site's `Status`. That's used to report to the client
the status, either, unknown, online or offline, mostly as a hint to
users as what's wrong with a connection.
This are the criteria for an online or offline resource
* If all sites related to a resource are offline the resource is
considered offline, since there's no gateway that can respond to that
resource's connection
* If any site is online the resource is online, since that same peer can
be used to reach that resource
* Any other case is unknown
Right now resources are single site so it doesn't matter too much but
tracking online/offline per-site instead of per-gateway or resource
seems like the better long-term solution.
The way to "find out" the site's status is:
* If a response to a connection details is offline, all sites related to
that resource must be offline otherwise there would've been a gateway in
the response
* At the point we connect to a gateway, the site that corresponds to
that gateway must be online
* When a connection to a peer stops it's considered unknown again
Fixes#4738
Closes#4907
They're still accepted, but the binary entirely determines the behavior.
This makes the code for CLI parsing and token handling simpler with
fewer branches, so it's easier to be sure it's correct.
Replaces #4942 which isn't doing what I intended anymore.