Commit Graph

1183 Commits

Author SHA1 Message Date
Reactor Scram
d0f68fc133 test(gui-client): multi-process smoke test for GUI + IPC service (#5672)
```[tasklist]
### Tasks
- [x] Check the GUI saves its settings file
- [x] Check the IPC service writes the device ID to disk
- [x] Check the GUI writes a log file (skipped - we already check if the exported zip has any files in it)
- [x] Run the crash file through `minidump-stackwalk`
- [x] Reach feature parity with the original smoke tests
- [x] Ready for review
- [x] Finish #5452
- [ ] Start on #5453 
```
2024-07-04 21:10:31 +00:00
Jamil
60d2a2befd fix(infra): relay listens on UDP only (#5718)
I don't believe we use/need TCP for the Relays. Better to keep the ports
closed if so.

Also, the docker-compose.yml is updated to allow the `relay-1` service
to respond to all its ports, since we don't need those mapped typically.
2024-07-04 16:53:08 +00:00
Jamil
086c730aaf chore: Bump clients to 1.1.2 for DNS record type forward (#5703)
Apps are already in review with App Stores
2024-07-04 01:31:26 +00:00
Reactor Scram
f6e99752ec fix(client): flush the OS' DNS cache whenever resources change (#5700)
Closes #5052

On my dev VMs:
- systemd-resolved = 15 ms to flush
- Windows = 600 ms to flush

I tested with the headless Clients on Linux and Windows and it fixes the
issue. On Windows I couldn't replicate the issue with the GUI Client; on
Linux this patch also fixes it for the GUI Client.
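
As a rough illustration of what flushing the OS DNS cache can look like, here is a minimal sketch that shells out to `resolvectl flush-caches` (systemd-resolved) or `ipconfig /flushdns` (Windows). The function names and the idea of calling it whenever the resource list changes are assumptions made for this example, not the Client's actual code.

```rust
use std::io;
use std::process::{Command, ExitStatus};

/// Returns the program and argument that flush the OS DNS cache for `os`
/// (a value such as `std::env::consts::OS`), or `None` if this sketch
/// does not know how to flush on that platform.
pub fn flush_command(os: &str) -> Option<(&'static str, &'static str)> {
    match os {
        // Assumes systemd-resolved is the local resolver.
        "linux" => Some(("resolvectl", "flush-caches")),
        "windows" => Some(("ipconfig", "/flushdns")),
        _ => None,
    }
}

/// Hypothetical hook: call this whenever the set of resources changes.
/// Returns `Ok(None)` on platforms without a known flush command.
pub fn flush_dns_cache() -> io::Result<Option<ExitStatus>> {
    match flush_command(std::env::consts::OS) {
        Some((program, arg)) => Command::new(program).arg(arg).status().map(Some),
        None => Ok(None),
    }
}
```

Spawning a process is slow compared to an in-process call, which fits the measured flush times above being in the tens to hundreds of milliseconds.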
2024-07-03 21:14:43 +00:00
Reactor Scram
ecb38dedf9 fix(gui-client/windows): retry 10 times while creating the deep link server (#5570)
Temporary fix for #5566 

A better fix would be to merge the deep link and IPC service code, but I
tried that a couple of times and failed; their interfaces are different.

```[tasklist]
### Tasks
- [x] Expand comment explaining the root cause
- [x] Re-request review
```
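
A bounded retry like the one described can be sketched as a small generic helper. This is only an illustration: `retry`, its parameters, and the fixed delay between attempts are assumptions for this example and not the code from the PR.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Calls `op` up to `attempts` times, sleeping `delay` after each failure.
/// Returns the first success, or the last error if every attempt fails.
pub fn retry<T, E>(
    attempts: u32,
    delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    assert!(attempts > 0, "need at least one attempt");
    let mut tries = 0;
    loop {
        tries += 1;
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(error) if tries >= attempts => return Err(error),
            // Otherwise wait a bit and try again.
            Err(_) => sleep(delay),
        }
    }
}
```

A caller would wrap the fallible step, for example `retry(10, Duration::from_millis(100), || create_server())`, where `create_server` is a placeholder for whatever creates the deep link server.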
2024-07-03 20:55:30 +00:00
Gabi
5fd321c4bb chore(connlib): forward non-address record queries (#5674)
Since we only handle `A`, `AAAA` and `PTR` records of names we handle,
this can lead to unexpected behavior with other record types, where
using Firezone breaks `TXT`, `MX` or other record types for the
resources we handle.

So this is a bit of a refactor: we now look up a resource and explicitly
return `Some` when there is a record we should be returning (even if
it's empty due to IP exhaustion) or `None` when we should just forward
the query.

This has the added benefit of no longer breaking bonjour or other
non-standard `PTR` queries.
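
The `Some` / `None` contract described above can be sketched as follows. All type and function names here (`RecordType`, `Resources`, `lookup`) are hypothetical, and `PTR` handling is left out to keep the example short; it is not connlib's implementation.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RecordType {
    A,
    Aaaa,
    Txt,
    Mx,
}

/// Hypothetical resource table: DNS name -> proxy IPs assigned to it.
pub struct Resources {
    pub addresses: HashMap<String, Vec<IpAddr>>,
}

impl Resources {
    /// `Some(records)`: we answer the query ourselves (possibly with no records).
    /// `None`: the query is not ours to answer and should be forwarded.
    pub fn lookup(&self, name: &str, rtype: RecordType) -> Option<Vec<IpAddr>> {
        // Names we don't manage are always forwarded.
        let ips = self.addresses.get(name)?;
        match rtype {
            RecordType::A => Some(ips.iter().copied().filter(IpAddr::is_ipv4).collect()),
            RecordType::Aaaa => Some(ips.iter().copied().filter(IpAddr::is_ipv6).collect()),
            // TXT, MX and friends are forwarded even for names we manage.
            RecordType::Txt | RecordType::Mx => None,
        }
    }
}
```

Distinguishing an empty `Some(vec![])` from `None` is the point of the design: the former means "ours, but nothing to return", the latter means "forward it".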

Fixes: #5673.

---------

Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-07-03 05:15:23 +00:00
Reactor Scram
4b6b706d46 refactor(gui-client): remove the heartbeat module (#5682)
We added this to diagnose a hang in the IPC service, #5441. That hang,
to the best of our knowledge, was caused by a deadlock which we fixed in
#5571. So the heartbeat task just adds a lot of noise to the stdout
which is annoying for debugging and won't be used in production logs.

The system uptime measuring is still useful, so we now log that just
once when logging starts, next to the git version and log directives.

If we see this pattern in either process' logs, we know something is
suspicious:
- Log file ends without a clean shutdown message
- Next log file starts with a high system uptime

Updates should always result in a clean shutdown message, and a sudden
power loss (mains power outage, or laptop battery dying) would result in
the system uptime being low for the 2nd log file.
2024-07-02 18:33:47 +00:00
Gabi
79fd8f6063 chore(connlib): add message type to the no records found logs (#5641)
Added for clarity when debugging, it used to look like:

```
2024-06-30T00:16:05.718337Z DEBUG firezone_tunnel::dns: No records for github.com, returning NXDOMAIN
```

And now looks like:

```
2024-06-30T00:16:05.718337Z DEBUG firezone_tunnel::dns: No MX records for github.com, returning NXDOMAIN
```
2024-07-01 23:15:44 +00:00
Reactor Scram
4075b779b5 refactor(gui-client): reload log filter immediately (#5671)
This will simplify #5590 somewhat. The API URL and auth URL still take
effect on the next sign-in, but we no longer have to explain that the
log filter takes effect after restarting the entire Client process; it
now takes effect almost immediately.

For some reason I see some lag, maybe the tracing layers don't check for
a new filter on every span, maybe they have some delay to save CPU time.
2024-07-01 21:36:00 +00:00
Reactor Scram
976cdfa731 refactor(headless-client): vendor uptime_lib (#5625)
This does the same thing as #5621 without removing the library, since it
will now compile against whatever version of `windows` we need.

We could do the same with `hostname`, either vendor or ask upstream to
bump deps, and then `windows` 0.52.0 should be gone.

```[tasklist]
### Tasks
- [x] Remove macOS code and shrink everything
```
2024-07-01 16:44:46 +00:00
Thomas Eizinger
02f5c67974 chore(windows): reduce nesting in wintun recv-thread (#5573)
Related: #5571.
2024-07-01 16:33:59 +00:00
dependabot[bot]
8c5092cf6c build(deps-dev): Bump typescript from 5.4.5 to 5.5.2 in /rust/gui-client (#5664)
Bumps [typescript](https://github.com/Microsoft/TypeScript) from 5.4.5
to 5.5.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/Microsoft/TypeScript/releases">typescript's
releases</a>.</em></p>
<blockquote>
<h2>TypeScript 5.5</h2>
<p>For release notes, check out the <a
href="https://devblogs.microsoft.com/typescript/announcing-typescript-5-5/">release
announcement</a>.</p>
<p>For the complete list of fixed issues, check out the</p>
<ul>
<li><a
href="https://github.com/Microsoft/TypeScript/issues?utf8=%E2%9C%93&amp;q=is%3Aissue+milestone%3A%22TypeScript+5.5.2%22+is%3Aclosed+">fixed
issues query for TypeScript v5.5.2 (Stable)</a>.</li>
<li><a
href="https://github.com/Microsoft/TypeScript/issues?utf8=%E2%9C%93&amp;q=is%3Aissue+milestone%3A%22TypeScript+5.5.1%22+is%3Aclosed+">fixed
issues query for TypeScript v5.5.1 (RC)</a>.</li>
<li><a
href="https://github.com/Microsoft/TypeScript/issues?utf8=%E2%9C%93&amp;q=is%3Aissue+milestone%3A%22TypeScript+5.5.0%22+is%3Aclosed+">fixed
issues query for TypeScript v5.5.0 (Beta)</a>.</li>
</ul>
<p>Downloads are available on:</p>
<ul>
<li><a href="https://www.npmjs.com/package/typescript">npm</a></li>
<li><a
href="https://www.nuget.org/packages/Microsoft.TypeScript.MSBuild">NuGet
package</a></li>
</ul>
<h2>TypeScript 5.5 RC</h2>
<p>For release notes, check out the <a
href="https://devblogs.microsoft.com/typescript/announcing-typescript-5-5-rc/">release
announcement</a>.</p>
<p>For the complete list of fixed issues, check out the</p>
<ul>
<li><a
href="https://github.com/Microsoft/TypeScript/issues?utf8=%E2%9C%93&amp;q=milestone%3A%22TypeScript+5.5.0%22+is%3Aclosed+">fixed
issues query for Typescript 5.5.0 (Beta)</a>.</li>
<li><a
href="https://github.com/Microsoft/TypeScript/issues?utf8=%E2%9C%93&amp;q=milestone%3A%22TypeScript+5.5.1%22+is%3Aclosed+">fixed
issues query for Typescript 5.5.1 (RC)</a>.</li>
</ul>
<p>Downloads are available on:</p>
<ul>
<li><a
href="https://www.nuget.org/packages/Microsoft.TypeScript.MSBuild">NuGet
package</a></li>
</ul>
<h2>TypeScript 5.5 Beta</h2>
<p>For release notes, check out the <a
href="https://devblogs.microsoft.com/typescript/announcing-typescript-5-5-beta/">release
announcement</a>.</p>
<p>For the complete list of fixed issues, check out the</p>
<ul>
<li><a
href="https://github.com/Microsoft/TypeScript/issues?utf8=%E2%9C%93&amp;q=milestone%3A%22TypeScript+5.5.0%22+is%3Aclosed+">fixed
issues query for Typescript 5.5.0 (Beta)</a>.</li>
</ul>
<p>Downloads are available on:</p>
<ul>
<li><a href="https://www.npmjs.com/package/typescript">npm</a></li>
<li><a
href="https://www.nuget.org/packages/Microsoft.TypeScript.MSBuild">NuGet
package</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="ce2e60e4ea"><code>ce2e60e</code></a>
Update LKG</li>
<li><a
href="f3b21a2033"><code>f3b21a2</code></a>
🤖 Pick PR <a
href="https://redirect.github.com/Microsoft/TypeScript/issues/58931">#58931</a>
(Defer creation of barebonesLibSourc...) into release-5.5 (#...</li>
<li><a
href="7b1620bea2"><code>7b1620b</code></a>
🤖 Pick PR <a
href="https://redirect.github.com/Microsoft/TypeScript/issues/58811">#58811</a>
(fix(58801): &quot;Move to file&quot; on globa...) into release-5.5
(#...</li>
<li><a
href="5367ae10f5"><code>5367ae1</code></a>
Bump version to 5.5.2 and LKG</li>
<li><a
href="02132e5b81"><code>02132e5</code></a>
🤖 Pick PR <a
href="https://redirect.github.com/Microsoft/TypeScript/issues/58895">#58895</a>
(Fix global when typescript.js loade...) into release-5.5 (#...</li>
<li><a
href="45b1e3c254"><code>45b1e3c</code></a>
🤖 Pick PR <a
href="https://redirect.github.com/Microsoft/TypeScript/issues/58872">#58872</a>
(Fix declaration emit crash) into release-5.5 (<a
href="https://redirect.github.com/Microsoft/TypeScript/issues/58874">#58874</a>)</li>
<li><a
href="17933ee33a"><code>17933ee</code></a>
🤖 Pick PR <a
href="https://redirect.github.com/Microsoft/TypeScript/issues/58810">#58810</a>
(Fixed declaration emit issue relate...) into release-5.5 (#...</li>
<li><a
href="552b07e795"><code>552b07e</code></a>
🤖 Pick PR <a
href="https://redirect.github.com/Microsoft/TypeScript/issues/58786">#58786</a>
(Fixed declaration emit crash relate...) into release-5.5 (#...</li>
<li><a
href="39c9eebf17"><code>39c9eeb</code></a>
Pick <a
href="https://redirect.github.com/Microsoft/TypeScript/issues/58857">#58857</a>
to release-5.5 (<a
href="https://redirect.github.com/Microsoft/TypeScript/issues/58858">#58858</a>)</li>
<li><a
href="2b0009c679"><code>2b0009c</code></a>
🤖 Pick PR <a
href="https://redirect.github.com/Microsoft/TypeScript/issues/58846">#58846</a>
(Ensure the updates with crashes rev...) into release-5.5 (#...</li>
<li>Additional commits viewable in <a
href="https://github.com/Microsoft/TypeScript/compare/v5.4.5...v5.5.2">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=typescript&package-manager=npm_and_yarn&previous-version=5.4.5&new-version=5.5.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-01 14:20:19 +00:00
dependabot[bot]
278d26b083 build(deps-dev): Bump @types/node from 20.14.2 to 20.14.9 in /rust/gui-client (#5663)
Bumps
[@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node)
from 20.14.2 to 20.14.9.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=@types/node&package-manager=npm_and_yarn&previous-version=20.14.2&new-version=20.14.9)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-01 14:11:56 +00:00
dependabot[bot]
9994690f63 build(deps): Bump flowbite from 2.3.0 to 2.4.1 in /rust/gui-client (#5665)
Bumps [flowbite](https://github.com/themesberg/flowbite) from 2.3.0 to
2.4.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/themesberg/flowbite/releases">flowbite's
releases</a>.</em></p>
<blockquote>
<h2>v2.4.1</h2>
<ul>
<li>fix datepicker module declaration naming for TypeScript</li>
</ul>
<h2>v2.4.0</h2>
<ul>
<li>the datepicker is now a core component of Flowbite and has API
methods, events, and options</li>
<li>updated the documentation for the datepicker component and related
integration guides</li>
<li>minor visual bug fixes and improvements</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="8c8d65e489"><code>8c8d65e</code></a>
fix(typescript): datepicker naming and version bump to v2.4.1</li>
<li><a
href="2a8c18eed9"><code>2a8c18e</code></a>
Merge branch 'datepicker-instance'</li>
<li><a
href="6b160cc82d"><code>6b160cc</code></a>
chore(version): bump to v2.4.0</li>
<li><a
href="e9b8ae3715"><code>e9b8ae3</code></a>
Merge pull request <a
href="https://redirect.github.com/themesberg/flowbite/issues/907">#907</a>
from themesberg/datepicker-instance</li>
<li><a
href="1d76b8ffc1"><code>1d76b8f</code></a>
docs(changelog): add changelog</li>
<li><a
href="213577a394"><code>213577a</code></a>
docs(datepicker): update Phoenix and Rails docs for new datepicker
update</li>
<li><a
href="6a16510f28"><code>6a16510</code></a>
docs(datepicker): fix TypeScript example from docs</li>
<li><a
href="1e0d112435"><code>1e0d112</code></a>
fix(typescript): fix fucking typescript config for cross npm
declarations</li>
<li><a
href="6d1fbf3285"><code>6d1fbf3</code></a>
docs(nuxt): update Nuxt docs for Flowbite via composables</li>
<li><a
href="36eeab7fb9"><code>36eeab7</code></a>
docs(datepicker): update import statements for parent plugin</li>
<li>Additional commits viewable in <a
href="https://github.com/themesberg/flowbite/compare/v2.3.0...v2.4.1">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=flowbite&package-manager=npm_and_yarn&previous-version=2.3.0&new-version=2.4.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-01 13:11:49 +00:00
Jamil
25b6528942 chore: Bump versions and update changelog (#5636)
Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
2024-06-29 09:06:10 -07:00
Thomas Eizinger
b5fd980fb2 fix(relay): don't log all request failures on the same level (#5622)
Currently, the relay logs all failed requests on WARN. This is a bit
excessive because during normal operation, clients are expected to hit
several 401s due to stale or missing nonces.

In order to not flood the logs with these, we introduce a new type,
`ResponseErrorLevel` that represents the subset of `tracing::Level` that
`make_error_response` can log:

- `Warn`
- `Debug`

Both variants map to the variants of the same name in `tracing::Level`,
and the function logs accordingly.

So now the caller can pick what level of error is meant to be used and
reduce the noise on the logs when it's meant to be part of normal
operation.
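
A minimal sketch of such a restricted level type is shown below. To stay free of external crates it uses a local `Level` enum as a stand-in for `tracing::Level`; `level_for_status` is a hypothetical helper showing how a caller might choose the level, and none of this is the relay's actual code.

```rust
/// Stand-in for `tracing::Level`, so the sketch compiles without dependencies.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

/// The subset of levels an error response may be logged at.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResponseErrorLevel {
    Warn,
    Debug,
}

impl From<ResponseErrorLevel> for Level {
    fn from(level: ResponseErrorLevel) -> Self {
        match level {
            ResponseErrorLevel::Warn => Level::Warn,
            ResponseErrorLevel::Debug => Level::Debug,
        }
    }
}

/// Hypothetical caller-side choice: 401s caused by stale or missing nonces
/// are part of normal operation, everything else is worth a warning.
pub fn level_for_status(code: u16) -> ResponseErrorLevel {
    if code == 401 {
        ResponseErrorLevel::Debug
    } else {
        ResponseErrorLevel::Warn
    }
}
```

Using a dedicated two-variant type instead of the full level enum makes it impossible for a caller to log an error response at, say, `Trace` by accident.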

Fixes: #5490.

---------

Co-authored-by: conectado <gabrielalejandro7@gmail.com>
2024-06-29 02:38:55 +00:00
Thomas Eizinger
96536a23cf refactor(connlib): ignore relays per connection (#5631)
In a previous design of firezone, relays used to be scoped to a certain
connection. For a while now, this constraint has been lifted and all
connections can use all relays. A related, outdated concern is the idea
of STUN-only servers. Those also used to be assigned on a per-connection
basis.

By removing any use of per-connection relays and STUN-only servers, the
entire `StunBinding` concept is unused code and can thus be deleted.

To push this over the finish line, the `snownet-tests` which test the
hole-punching functionality needed to be slightly adapted to make use of
the more recently introduced API `Node::update_relays`.

Resolves: #4749.
2024-06-29 02:36:17 +00:00
Thomas Eizinger
f2b6c205c2 refactor(snownet): change reconnect to reset (#5630)
Currently, `snownet` still supports this notion of "reconnecting", which
is a mix of resetting some state and keeping the rest. In particular,
we currently retain the `StunBinding` and `Allocation` state. This used
to be important because allocations are bound to the 3-tuple of the
client and thus needed to be kept around in case we weren't actually
roaming.

We always rebind the local UDP sockets upon reconnecting and thus
the 3-tuple always changes anyway. In addition, we always reconnect to
the portal, meaning we receive another `init` message and thus can
actually completely clear the `Node`'s state.

This PR does that and, in the process, rebrands `reconnect` as `reset`,
which now makes more sense.

Related: #5619.
2024-06-29 02:07:10 +00:00
Thomas Eizinger
38275ecad0 refactor(gateway): extract fn for update-device task (#5581)
Follow-up feedback from #5512.
2024-06-29 01:27:23 +00:00
Thomas Eizinger
8973cc5785 refactor(android): use fmt::Layer with custom writer (#5558)
Currently, the logs that go to logcat on Android are pretty badly
formatted because we use `tracing-android` and it formats the span
fields and message fields itself. There is actually no reason for doing
the formatting ourselves. Instead, we can use the `MakeWriter`
abstraction from `tracing_subscriber` to plug in a custom writer that
writes to Android's logcat.

This results in logs like this:

```
[nix-shell:~/src/github.com/firezone/firezone/rust]$ adb logcat -s connlib
--------- beginning of main
06-28 19:41:20.057 19955 20213 D connlib : phoenix_channel: Connecting to portal host=api.firez.one user_agent=Android/14 5.15.137-android14-11-gbf4f9bc41c3b-ab11664771 connlib/1.1.1
06-28 19:41:20.058 19955 20213 I connlib : firezone_tunnel::client: Network change detected
06-28 19:41:20.061 19955 20213 D connlib : snownet::node: Closed all connections as part of reconnecting num_connections=0
06-28 19:41:20.365 19955 20213 I connlib : phoenix_channel: Connected to portal host=api.firez.one
06-28 19:41:20.601 19955 20213 I connlib : firezone_tunnel::io: Setting new DNS resolvers
06-28 19:41:21.031 19955 20213 D connlib : firezone_tunnel::client: TUN device initialized ip4=100.66.86.233 ip6=fd00:2021:1111::f:d9c1 name=tun1
06-28 19:41:21.031 19955 20213 I connlib : connlib_client_shared::eventloop: Firezone Started!
06-28 19:41:21.031 19955 20213 I connlib : firezone_tunnel::dns: Activating DNS resource address=*.slackb.com
06-28 19:41:21.031 19955 20213 I connlib : firezone_tunnel::dns: Activating DNS resource address=*.test-ipv6.com
06-28 19:41:21.032 19955 20213 I connlib : firezone_tunnel::client: Activating CIDR resource address=5.4.6.7/32 name=5.4.6.7
06-28 19:41:21.032 19955 20213 I connlib : firezone_tunnel::client: Activating CIDR resource address=10.0.32.101/32 name=IPerf3
06-28 19:41:21.032 19955 20213 I connlib : firezone_tunnel::dns: Activating DNS resource address=ifconfig.net
06-28 19:41:21.032 19955 20213 I connlib : firezone_tunnel::dns: Activating DNS resource address=*.slack-imgs.com
06-28 19:41:21.032 19955 20213 I connlib : firezone_tunnel::dns: Activating DNS resource address=*.google.com
06-28 19:41:21.032 19955 20213 I connlib : firezone_tunnel::client: Activating CIDR resource address=10.0.0.5/32 name=10.0.0.5
06-28 19:41:21.032 19955 20213 I connlib : firezone_tunnel::dns: Activating DNS resource address=*.githubassets.com
06-28 19:41:21.032 19955 20213 I connlib : firezone_tunnel::dns: Activating DNS resource address=dnsleaktest.com
06-28 19:41:21.033 19955 20213 I connlib : firezone_tunnel::dns: Activating DNS resource address=*.slack-edge.com
06-28 19:41:21.033 19955 20213 I connlib : firezone_tunnel::dns: Activating DNS resource address=*.github.com
06-28 19:41:21.033 19955 20213 I connlib : firezone_tunnel::dns: Activating DNS resource address=speed.cloudflare.com
06-28 19:41:21.033 19955 20213 I connlib : firezone_tunnel::dns: Activating DNS resource address=*.githubusercontent.com
06-28 19:41:21.033 19955 20213 I connlib : firezone_tunnel::client: Activating CIDR resource address=10.0.14.11/32 name=Staging resource performance
06-28 19:41:21.033 19955 20213 I connlib : firezone_tunnel::dns: Activating DNS resource address=*.whatismyip.com
06-28 19:41:21.033 19955 20213 I connlib : firezone_tunnel::client: Activating CIDR resource address=10.0.0.8/32 name=10.0.0.8
06-28 19:41:21.033 19955 20213 I connlib : firezone_tunnel::client: Activating CIDR resource address=9.9.9.9/32 name=Quad9 DNS
06-28 19:41:21.034 19955 20213 I connlib : firezone_tunnel::client: Activating CIDR resource address=10.0.32.10/32 name=CoreDNS
06-28 19:41:21.216 19955 20213 I connlib : snownet::node: Added new TURN server id=bd6e9d1a-4696-4f8b-8337-aab5d5cea810 address=Dual { v4: 35.197.171.113:3478, v6: [2600:1900:40b0:1504:0:27::]:3478 }
```
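
The core of the custom-writer approach is an `std::io::Write` implementation that forwards already-formatted lines to the platform log. The sketch below shows only that part, with a closure standing in for the logcat call; `LineSink` is a hypothetical name, and wiring it into `tracing_subscriber` through `MakeWriter` is left out so the example has no dependencies.

```rust
use std::io::{self, Write};

/// Buffers bytes and forwards each complete line to `sink`.
/// On Android, `sink` would hand the line to logcat.
pub struct LineSink<F: FnMut(&str)> {
    buffer: Vec<u8>,
    sink: F,
}

impl<F: FnMut(&str)> LineSink<F> {
    pub fn new(sink: F) -> Self {
        Self { buffer: Vec::new(), sink }
    }
}

impl<F: FnMut(&str)> Write for LineSink<F> {
    fn write(&mut self, bytes: &[u8]) -> io::Result<usize> {
        self.buffer.extend_from_slice(bytes);
        // Emit every complete line; keep any trailing partial line buffered.
        while let Some(end) = self.buffer.iter().position(|b| *b == b'\n') {
            let line: Vec<u8> = self.buffer.drain(..=end).collect();
            (self.sink)(String::from_utf8_lossy(&line).trim_end());
        }
        Ok(bytes.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        // Emit whatever is left, even without a trailing newline.
        if !self.buffer.is_empty() {
            let rest = std::mem::take(&mut self.buffer);
            (self.sink)(String::from_utf8_lossy(&rest).trim_end());
        }
        Ok(())
    }
}
```

Because the formatter produces the whole line before it reaches the writer, the platform log only ever sees a single string per event.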

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
2024-06-28 22:15:10 +00:00
Jamil
8655b711db fix(connlib): Don't use operatingSystemVersionString on Apple OSes (#5628)
The [HTTP 1.1 RFC](https://datatracker.ietf.org/doc/html/rfc2616) states
that HTTP headers should be US-ASCII. This is not the case when the
macOS Client is run from a host that has a non-English language selected
as its system default due to the way we build the user agent.

This PR fixes that by normalizing how we build the user agent by more
granularly selecting which fields compose it, and not just relying on
OS-provided version strings that may contain non-ASCII characters.
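
To illustrate the idea of composing the user agent from individual fields, here is a small sketch. The `OS/version connlib/version` layout, the function names, and the ASCII filter are assumptions for this example rather than the code from this PR.

```rust
/// Keeps only printable ASCII so the value is safe to use in an HTTP header.
fn ascii_only(field: &str) -> String {
    field
        .chars()
        .filter(|c| c.is_ascii_graphic() || *c == ' ')
        .collect()
}

/// Builds a user agent from numeric version components instead of a
/// localized, free-form OS version string.
pub fn user_agent(
    os_name: &str,
    (major, minor, patch): (u32, u32, u32),
    connlib_version: &str,
) -> String {
    format!(
        "{}/{}.{}.{} connlib/{}",
        ascii_only(os_name),
        major,
        minor,
        patch,
        ascii_only(connlib_version)
    )
}
```

Numeric components cannot contain localized text, so the result stays ASCII regardless of the system language.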

fixes https://github.com/firezone/firezone/issues/5467

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
2024-06-28 21:59:02 +00:00
Thomas Eizinger
e5cba1caf4 refactor(apple): use fmt::Layer with custom writer (#5623)
Currently, we use the `tracing-oslog` crate to ingest logs on macOS and
iOS. This crate has a "feature" where it creates so-called "Activities"
for spans. Whilst that may initially sound useful, Apple's UI for
viewing these activities is absolutely useless.

Instead of tinkering around with that, we remove the `tracing-oslog`
crate and let `tracing-subscriber` format our logs first and then only
send a single string to the oslog backend.

Related: #5619.
2024-06-28 21:22:54 +00:00
Reactor Scram
37d3ebbb7c chore(gui-client/windows): bump tauri-winrt-notification (#5627)
This eliminates `windows` 0.54.0 so it should speed up Windows builds a
little. It's 6% faster on my MacBook according to `cargo build
--timing`, in debug mode.
2024-06-28 21:19:51 +00:00
Reactor Scram
a315c49b3c chore(firezone-tunnel/windows): reduce ring buffer from 64 MiB to 1 MiB (#5609)
Oops. It runs the same either way so we definitely don't need all that
RAM to be tied up. The Linux and macOS Clients probably have similar
buffer sizes already.

I tested before and after with Cloudflare's speed test and got roughly
140/12 with latency 50 ms both times. The error bars on speed tests are
pretty wide, but we definitely aren't falling 60 MiB behind on
processing and then catching up.

```[tasklist]
### Tasks
- [x] (failed, can't do it right now) ~~Log if we knowingly drop a lot of packets~~
- [x] Extract constant
- [x] Add comment about not knowing if we drop packets
- [x] Merge
- [ ] (skipped) Test while the CPU is loaded
```
2024-06-28 21:03:18 +00:00
Reactor Scram
649db863ca chore(gui-client): explain why the update check has redirects disabled (#5608)
Closes #5383
2024-06-28 14:28:09 +00:00
Thomas Eizinger
ed34ca096b chore(gateway): remove dead IP detection (#5618)
This does not work as well as intended and spams the logs. We may need
#5542 before we can implement this properly.

Fixes: #5593.
2024-06-28 04:47:00 +00:00
Jamil
d529ace29c chore: Bump Windows to 1.1.1, update changelog with dl links (#5610)
Fixes #5597
2024-06-27 20:53:00 -07:00
Thomas Eizinger
66cb565915 fix(snownet): use unused channels before reused expired ones (#5613)
Within each allocation, a client has 4095 channels that it can bind to
different peers. Each channel binding is valid for 10 minutes unless
rebound. Additionally, there is a 5min cool-down period after a channel
binding expires before it can be rebound to a different peer.

This patch fixes a bug in snownet where we would have first attempted to
rebind the last bound channel instead of just picking the next unused
one. In the case of a clock drift between client and relay, this caused
unnecessary errors when attempting to rebind channels.
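
The selection order described above (unused channels first, expired ones only as a fallback) can be sketched like this. The `Channels` type and its fields are hypothetical, the channel range is taken from RFC 8656 as an assumption of this sketch, and the real snownet code tracks more state than shown here.

```rust
use std::collections::BTreeMap;
use std::time::{Duration, Instant};

// Channel number range per RFC 8656 (an assumption of this sketch).
const FIRST_CHANNEL: u16 = 0x4000;
const LAST_CHANNEL: u16 = 0x4FFF;
const BINDING_LIFETIME: Duration = Duration::from_secs(10 * 60);
const COOL_DOWN: Duration = Duration::from_secs(5 * 60);

/// Maps each channel number to the time it was last bound.
#[derive(Default)]
pub struct Channels {
    bound_at: BTreeMap<u16, Instant>,
}

impl Channels {
    /// Picks a channel for a new peer: prefer one that was never bound,
    /// and only then fall back to one whose binding expired and cooled down.
    pub fn pick(&mut self, now: Instant) -> Option<u16> {
        let unused = (FIRST_CHANNEL..=LAST_CHANNEL).find(|c| !self.bound_at.contains_key(c));
        let reusable = || {
            self.bound_at
                .iter()
                .find(|(_, at)| now.duration_since(**at) >= BINDING_LIFETIME + COOL_DOWN)
                .map(|(channel, _)| *channel)
        };
        let channel = unused.or_else(reusable)?;
        self.bound_at.insert(channel, now);
        Some(channel)
    }
}
```

Preferring never-used channels means the client's view of "expired" only matters once the whole range is exhausted, which sidesteps clock drift between client and relay in the common case.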

Fixes: #5603.

---------

Co-authored-by: conectado <gabrielalejandro7@gmail.com>
2024-06-28 03:14:16 +00:00
Thomas Eizinger
aadb045b27 chore(connlib): batch together sending of ICE candidates (#5616)
Currently, we are sending each ICE candidate individually from the
client to the gateway and vice versa. This causes a slight delay as to
when each ICE candidate gets added on the remote ICE agent. As a result,
they all start being tested with a slight offset which causes "endpoint
hopping" whenever a connection expires as they expire just after each
other.

In addition, sending multiple messages to the portal causes unnecessary
load when establishing connections.

Finally, with #5283 we started **not** adding the server-reflexive
candidate to the local ICE agent. Because we talk to multiple relays, we
detect the same server-reflexive candidate multiple times if we are
behind a non-symmetric NAT. Not adding the server-reflexive candidate to
the ICE agent defeated our de-duplication strategy here, which means we
currently send the same candidate multiple times to a peer, causing
additional, unnecessary load.

All of this can be mitigated by batching all our ICE candidates
together into one message.
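
A minimal sketch of batching with de-duplication, assuming candidates are represented as strings: `CandidateBatch` and its methods are hypothetical names, and when to flush (for example on a short timer) is left to the caller.

```rust
use std::collections::BTreeSet;

/// Collects ICE candidates until they are sent to the peer in one message.
#[derive(Default)]
pub struct CandidateBatch {
    pending: BTreeSet<String>,
}

impl CandidateBatch {
    /// Queues a candidate. Returns `false` if an identical candidate is
    /// already queued, e.g. the same server-reflexive address learned
    /// from several relays.
    pub fn add(&mut self, candidate: &str) -> bool {
        self.pending.insert(candidate.to_owned())
    }

    /// Drains everything queued so far into a single message payload,
    /// or returns `None` if there is nothing to send.
    pub fn flush(&mut self) -> Option<Vec<String>> {
        if self.pending.is_empty() {
            return None;
        }
        Some(std::mem::take(&mut self.pending).into_iter().collect())
    }
}
```

Since all candidates in a batch arrive together, the remote ICE agent starts testing them at the same moment instead of with a slight offset.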

Resolves: #3978.
2024-06-28 02:04:31 +00:00
Thomas Eizinger
79ff3f830b chore(gateway): downgrade warn logs (#5612)
Whilst it has been helpful to find issues such as #5611, having these
logs on `warn` spams the end user too much and creates a false sense
that things might not be working as there can be a variety of reasons
why packets might not be able to be routed.
2024-06-28 01:13:29 +00:00
Thomas Eizinger
1aa95ed17e fix(connlib): be explicit about unsupported ICMP types (#5611)
Our NAT table uses TCP & UDP ports for its entries. To correctly handle
ICMP requests and responses, we use the ICMP identifier in those
packets. All other ICMP messages are currently unsupported.

The error paths for accessing these fields, i.e. ports for UDP/TCP and
identifier for ICMP, currently conflate two different errors:

- Unsupported IP payload: it is neither TCP, UDP nor ICMP
- Unsupported ICMP type: it is not an ICMP request or response

This makes certain logs look worse than they are because we say
"Unsupported IP protocol: Icmpv6". To avoid this, we create a dedicated
error variant that calls out the unsupported ICMP type.
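
The split into two error variants can be sketched as follows for IPv4. The names `PacketError` and `nat_key` are hypothetical and the parsing is deliberately minimal; it only reads the source port of TCP/UDP or the identifier of an ICMP echo message.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum PacketError {
    /// The IP payload is neither TCP, UDP nor ICMP.
    UnsupportedProtocol(u8),
    /// The payload is ICMP, but not an echo request or echo reply.
    UnsupportedIcmpType(u8),
    /// The payload is too short to contain the field we need.
    Truncated,
}

fn read_u16(payload: &[u8], offset: usize) -> Result<u16, PacketError> {
    payload
        .get(offset..offset + 2)
        .map(|bytes| u16::from_be_bytes([bytes[0], bytes[1]]))
        .ok_or(PacketError::Truncated)
}

/// Returns the value used as the NAT table key: the source port for
/// TCP (6) and UDP (17), or the identifier for ICMPv4 (1) echo messages.
pub fn nat_key(protocol: u8, payload: &[u8]) -> Result<u16, PacketError> {
    match protocol {
        6 | 17 => read_u16(payload, 0),
        1 => match payload.first().copied() {
            // Type 8 = echo request, type 0 = echo reply.
            Some(0) | Some(8) => read_u16(payload, 4),
            Some(other) => Err(PacketError::UnsupportedIcmpType(other)),
            None => Err(PacketError::Truncated),
        },
        other => Err(PacketError::UnsupportedProtocol(other)),
    }
}
```

With separate variants, a log line for an ICMP "destination unreachable" message names the ICMP type instead of claiming the whole protocol is unsupported.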

Fixes: #5594.
2024-06-28 01:13:25 +00:00
Gabi
375a1b5586 fix(connlib): allow 1s ACK for packet before refreshing DNS (#5560)
Currently, we refresh DNS mappings when:
* We translate a packet for the first time
* There are no more incoming packets for 120 seconds
* There is at least 1 outgoing packet in the last 10 seconds

The idea was to coordinate with conntrack somehow, to expire DNS
translation at the point where the NAT session of the OS stops being
valid. That way, if the triggered DNS refresh changes the resolved IPs
it would never kill the underlying connection.

However, TCP sessions by default can last for up to 5 days! And I have
no idea how long for ICMP. To prevent killing these connections, we
assume that TCP and ICMP packets will elicit a response within 1s.
The DNS refresh for a translation mapping that hasn't seen any responses
is thus delayed by 1s after the last packet has been sent out.

To get an idea of how this works, you can imagine it like this:

|last incoming packet|---- 120 seconds + x seconds ----|outgoing packet|---- 1 second ----|dns refresh|

However, there is another case where a DNS refresh is triggered. In this
case, the outgoing packet was sent before the 120 seconds elapsed but
still falls within the last 10 seconds once they have:

|last incoming packet|---- 111 seconds ----|outgoing packet|---- 9 seconds ----|dns refresh|

The unit tests should also make clear when we want to trigger a DNS
refresh and when we don't.
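
The two timelines can be read as a single predicate. The sketch below is one interpretation of the rules above, not the code from this PR. `Mapping`, `should_refresh` and the constant names are made up for illustration, and the "first translation" trigger is left out:

```rust
use std::time::{Duration, Instant};

/// No incoming packets for this long makes a mapping a refresh candidate.
const INCOMING_SILENCE: Duration = Duration::from_secs(120);
/// The mapping must have been used for an outgoing packet within this window.
const OUTGOING_WINDOW: Duration = Duration::from_secs(10);
/// Grace period in which the last outgoing packet may still get a response.
const RESPONSE_GRACE: Duration = Duration::from_secs(1);

/// Hypothetical per-translation bookkeeping.
pub struct Mapping {
    pub last_incoming: Instant,
    pub last_outgoing: Instant,
}

/// Decides whether a DNS refresh should be triggered at `now`.
pub fn should_refresh(mapping: &Mapping, now: Instant) -> bool {
    let since_incoming = now.saturating_duration_since(mapping.last_incoming);
    let since_outgoing = now.saturating_duration_since(mapping.last_outgoing);

    since_incoming >= INCOMING_SILENCE
        && since_outgoing <= OUTGOING_WINDOW
        && since_outgoing >= RESPONSE_GRACE
}
```

In the first timeline the predicate becomes true 1 second after the outgoing packet. In the second it becomes true at the 120-second mark, when the outgoing packet is 9 seconds old.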

---------

Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-06-28 00:25:26 +00:00
Reactor Scram
76e55e6138 fix(client/windows): fix upload speed by letting Wintun queue packets again (#5598)
Closes #5589. Refs #5571

Improves upload speeds on my Windows 11 VM from 2 Mbps to 10.5 Mbps.

On the resource-constrained VM it improved from 3 to 7 Mbps.

```[tasklist]
### Tasks
- [x] Open for review
- [x] Manual test on resource-constrained VM
- [x] Run 5x replication steps from #5571 and make sure it doesn't deadlock again
- [x] Merge
- [ ] https://github.com/firezone/firezone/issues/5601
```

Sorted by decreasing speed, M = macOS host, W = Windows guest in
Parallels, RC = Resource-constrained Windows guest in VirtualBox:

- M, Internet - 16 Mbps
- W, Internet - 13 Mbps
- M, Firezone - 12 Mbps
- RC, Internet - 12 Mbps
- W, Firezone, after this PR - 10.5 Mbps
- RC, Firezone, after this PR - 8.5 Mbps
- RC, Firezone, before this PR - 4 Mbps
- W, Firezone, before this PR - 2 Mbps

So it's not perfect but the worst part is fixed.

The slow upload speeds were probably a regression from #5571. The MPSC
channel only has a few spots in it, so if connlib doesn't pick up every
packet immediately (which would be impossible under load), we drop
packets. I measured 25% packet drops in an earlier commit.

I first tried increasing the channel size from 5 to 64, and that worked.

But this solution is simpler. I switch back to `blocking_send` so if
connlib isn't clearing the MPSC channel, Wintun will just queue up
packets in its internal ring buffers, and we aren't responsible for
buffering.

Getting rid of `blocking_send` was a defense-in-depth thing to fix the
deadlock yesterday, but we still close the MPSC channel inside
`Tun::drop`, and I confirmed in a manual test that this will kick the
worker thread out of `blocking_send`, so the deadlock won't come back.
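
A small sketch of why closing the channel is enough to avoid the deadlock. It is an assumption-laden stand-in: the standard library's bounded `sync_channel` replaces the PR's MPSC channel, `send` replaces `blocking_send`, and the function names are hypothetical:

```rust
use std::sync::mpsc::{sync_channel, SyncSender};
use std::thread::{self, JoinHandle};

/// Worker: keeps queueing packets until the receiving side goes away.
fn spawn_worker(tx: SyncSender<Vec<u8>>) -> JoinHandle<usize> {
    thread::spawn(move || {
        let mut queued = 0;
        loop {
            // Stand-in for a packet read from the TUN device.
            let packet = vec![0u8; 20];
            // Blocks while the channel is full and returns `Err` once the
            // receiver has been dropped, which lets the thread exit.
            if tx.send(packet).is_err() {
                return queued;
            }
            queued += 1;
        }
    })
}

/// Shuts down in the safe order: close the channel first, then join.
pub fn shutdown_without_deadlock() -> usize {
    let (tx, rx) = sync_channel::<Vec<u8>>(1);
    let worker = spawn_worker(tx);

    // Nobody consumes packets, so the worker soon blocks in `send`.
    // Dropping the receiver wakes it up with an error.
    drop(rx);

    worker.join().expect("worker thread panicked")
}
```

Joining the worker before dropping the receiver would hang, which is the deadlock shape described in #5571.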
2024-06-27 17:59:22 +00:00
Jamil
b5de55ac26 chore: Bump clients to 1.1.0, Gateway to 1.1.1 (#5591) 2024-06-27 02:43:48 -07:00
Thomas Eizinger
b6420eaa3e feat(snownet): close idle connections after 5min (#5576)
We define a connection as idle if we haven't sent or received any
packets in the last 5 minutes. From `snownet`'s perspective, keep-alives
sent by upper layers (like TCP keep-alives) must be honored and thus
outgoing as well as incoming packets are accounted for.

If the underlying connection breaks, we will hit an ICE timeout which is
an implementation detail of `snownet`. The packets tracked here are IP
packets that the user wants to send / receive via the tunnel. Similarly,
WireGuard's keep-alives do not update these timestamps and thus don't
mark a connection as non-idle.
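
A minimal sketch of this bookkeeping, assuming a single "last activity" timestamp per connection. The `Connection` type and its method names are hypothetical and not `snownet`'s API:

```rust
use std::time::{Duration, Instant};

const IDLE_TIMEOUT: Duration = Duration::from_secs(5 * 60);

/// Hypothetical connection state that tracks user traffic only.
pub struct Connection {
    last_activity: Instant,
}

impl Connection {
    pub fn new(now: Instant) -> Self {
        Self { last_activity: now }
    }

    /// An IP packet from the user was sent through the tunnel.
    pub fn on_ip_packet_sent(&mut self, now: Instant) {
        self.last_activity = now;
    }

    /// An IP packet for the user was received through the tunnel.
    pub fn on_ip_packet_received(&mut self, now: Instant) {
        self.last_activity = now;
    }

    /// WireGuard keep-alives deliberately leave the timestamp untouched.
    pub fn on_wireguard_keepalive(&mut self, _now: Instant) {}

    /// Idle means no user traffic in either direction for 5 minutes.
    pub fn is_idle(&self, now: Instant) -> bool {
        now.saturating_duration_since(self.last_activity) >= IDLE_TIMEOUT
    }
}
```

A TCP keep-alive travels as a regular IP packet and therefore resets the timer, while a WireGuard keep-alive does not.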

---------

Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
2024-06-27 08:28:38 +00:00
Thomas Eizinger
58fad7cb2d refactor(connlib): batch resource change updates (#5575)
Currently, upon reconnecting, `snownet` returns a list of connection IDs
that have been closed. This was done to avoid emitting many identical
`ResourcesChanged` events. In all other events, `snownet` always only
references a single connection. To align this whilst not duplicating
`ResourcesChanged` events, we use a dedicated `bool` to check whether
any of the events emitted by `snownet` require updating the clients
about our active resources.
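
Roughly, the pattern looks like the sketch below. The `Event` enum and the callback are hypothetical, and it assumes that closed connections are the events that affect the resource list:

```rust
/// Hypothetical events, each referencing at most one connection.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Event {
    ConnectionClosed(u32),
    Other,
}

/// Processes a batch of events and notifies about resource changes at most once.
pub fn process(events: &[Event], mut on_resources_changed: impl FnMut()) {
    let mut resources_changed = false;

    for event in events {
        match event {
            // Remember that an update is needed instead of emitting it here.
            Event::ConnectionClosed(_id) => resources_changed = true,
            Event::Other => {}
        }
    }

    if resources_changed {
        on_resources_changed();
    }
}
```

Three closed connections in one batch thus lead to a single notification instead of three identical ones.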
2024-06-27 07:48:41 +00:00
Thomas Eizinger
18b9c35316 chore(connlib): explicitly handle invalid_version error (#5577)
Ensures we correctly deserialize `invalid_version` and don't fall back
to `Other`.

Related: #5525.
2024-06-27 07:41:41 +00:00
Gabi
ad8c92ca35 fix(connlib): dont panic in invalid PTR records (#5588) 2024-06-27 07:24:06 +00:00
Thomas Eizinger
9ddee774b4 chore(connlib): allow filtering of wire log target (#5578)
Currently, enabling the `wire` log is an all-or-nothing approach,
logging incoming and outgoing messages from the TUN device, network and
the portal.

Often, only one or two of these are desired but enabling all of `wire`
spams the logs to the point where one cannot see the information they'd
like. With this PR, we move some of the fields of the `wire` log
statements to the log target instead. This allows controlling the logs
via the `RUST_LOG` env variable.

For example, to only see messages sent and received to the API, one can
set `RUST_LOG=wire::api=trace` which will output something like:

```
2024-06-27T02:12:41.821374Z TRACE wire::api::send: {"topic":"client","event":"phx_join","payload":null,"ref":0}
2024-06-27T02:12:42.030573Z TRACE wire::api::recv: {"event":"phx_reply","ref":0,"topic":"client","payload":{"status":"ok","response":{}}}
```

Similarly, enabling `wire::net=trace` will give you logs for packets
sent over the network:

```
2024-06-27T02:12:50.487503Z TRACE wire::net::send: src=None dst=34.80.2.250:3478 num_bytes=20
2024-06-27T02:12:50.487589Z TRACE wire::net::send: src=None dst=[2600:1900:4030:b0d9:0:5::]:3478 num_bytes=20
2024-06-27T02:12:50.487622Z TRACE wire::net::send: src=None dst=34.87.210.10:3478 num_bytes=20
2024-06-27T02:12:50.487652Z TRACE wire::net::send: src=None dst=[2600:1900:40b0:1504:0:17::]:3478 num_bytes=20
2024-06-27T02:12:50.510049Z TRACE wire::net::recv: src=34.87.210.10:3478 dst=192.168.188.71:39207 num_bytes=32
2024-06-27T02:12:50.510382Z TRACE wire::net::send: src=None dst=34.87.210.10:3478 num_bytes=112
2024-06-27T02:12:50.526947Z TRACE wire::net::recv: src=34.87.210.10:3478 dst=192.168.188.71:39207 num_bytes=92
2024-06-27T02:12:50.527295Z TRACE wire::net::send: src=None dst=34.87.210.10:3478 num_bytes=152
```

These targets have been designed to take up equal amounts of space. All
three types (`dev`, `net`, `api`) have 3 letters and `send` and `recv`
have 4. That way, these logs are always aligned which makes them easier
to scan.
2024-06-27 06:36:49 +00:00
Gabi
e0e9e078a0 fix(connlib): statically resolve API domain (#5563)
In order to handle DNS resources, connlib intercepts all DNS requests on
the system once it has started up. The DNS queries are then forwarded to
the original DNS resolver in case the query isn't for one of the
configured DNS resources _except_ if the configured DNS resolver is also
a CIDR resource.

In that case, the DNS query will be tunneled to a gateway and forwarded
to the DNS resolver from there.

Exactly this configuration results in a dead-lock when roaming networks.
To make roaming more reliable, we now drop all connections when
detecting a network change (see #5308). As a result, DNS queries cannot
be tunneled right away. This isn't usually a problem: We just send a
connection intent to the portal to connect to the gateway. Upon a
network change, we also reconnect the websocket to the portal, which
requires resolving the portal's domain name. Connlib's DNS resolver is
still active at that point and thus, we end up deadlocking ourselves because
the DNS query to resolve the portal's domain is waiting for a connection
to a gateway that can only be established once we are connected to the
portal.

To prevent this, we extend connlib with a "known hosts" feature. These
are DNS records that are defined statically for the lifetime of a
connlib session and can thus always be resolved, regardless of the
connection state with the portal or the gateways. We populate these
records with the portal's API, allowing the reconnect to work without
having connected gateways.
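
A simplified sketch of the lookup order, assuming known hosts are checked before any other decision is made. `StubResolver` and `Resolution` are hypothetical names, and DNS resources are left out for brevity:

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Where a DNS query ends up in this simplified model.
#[derive(Debug, PartialEq)]
pub enum Resolution {
    /// Answered from the static records, no network access needed.
    KnownHost(Vec<IpAddr>),
    /// Upstream resolver is a CIDR resource, so the query needs a gateway.
    ViaTunnel,
    /// Forwarded directly to the upstream resolver.
    ViaUpstream,
}

pub struct StubResolver {
    /// Static records that live as long as the session.
    known_hosts: HashMap<String, Vec<IpAddr>>,
    upstream_is_cidr_resource: bool,
}

impl StubResolver {
    pub fn new(known_hosts: HashMap<String, Vec<IpAddr>>, upstream_is_cidr_resource: bool) -> Self {
        Self { known_hosts, upstream_is_cidr_resource }
    }

    pub fn resolve(&self, domain: &str) -> Resolution {
        // Known hosts win, regardless of portal or gateway connectivity.
        if let Some(ips) = self.known_hosts.get(domain) {
            return Resolution::KnownHost(ips.clone());
        }

        if self.upstream_is_cidr_resource {
            Resolution::ViaTunnel
        } else {
            Resolution::ViaUpstream
        }
    }
}
```

Because the portal's domain is answered from the static records, reconnecting the websocket no longer depends on a gateway connection being available.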

---------

Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-06-27 06:00:56 +00:00
Thomas Eizinger
c2b5379fba chore(connlib): demote log for unknown incoming packets to debug (#5584)
There are several reasons why we would legitimately receive a packet
that we can't handle, e.g. when a connection got cleared locally but the
gateway is still trying to send us packets for that socket. Not handling
these packets can be a bug but more often than not, it is not an issue.

Additionally, all our unit-tests actually `.unwrap` the
`Node::encapsulate` function so any unhandled packets in the tests will
be caught.
2024-06-27 05:58:04 +00:00
Reactor Scram
990f98e60f fix(windows): prevent deadlock when closing wintun (#5571)
Refs #5441, but without a reliable way to replicate that issue, I'm not
sure if this will completely fix it.

Before this PR, a deadlock could happen between 2 threads, call them
"main thread" and "worker thread". The deadlock is more likely if more
traffic is flowing through the tunnel.

# Test results

I ran a build from this PR inside the resource-constrained VM and it's
likely the deadlock could have triggered there, since the packet channel
had 0 capacity (it was full) when we reached `Tun::drop`:

```jsonl
{"time":"2024-06-26T22:43:33.2398441Z","target":"firezone_headless_client::ipc_service","logging.googleapis.com/sourceLocation":{"file":"headless-client\\src\\ipc_service.rs","line":"304"},"severity":"INFO","gitVersion":"e591bb9","logFilter":"\"str0m=warn,info\""}
..
{"time":"2024-06-26T22:45:42.9035226Z","target":"firezone_tunnel::device_channel::tun_windows","logging.googleapis.com/sourceLocation":{"file":"connlib\\tunnel\\src\\device_channel\\tun_windows.rs","line":"45"},"severity":"INFO","channelCapacity":0,"message":"Shutting down packet channel..."}
{"time":"2024-06-26T22:45:42.9035467Z","target":"firezone_tunnel::device_channel::tun_windows","logging.googleapis.com/sourceLocation":{"file":"connlib\\tunnel\\src\\device_channel\\tun_windows.rs","line":"274"},"severity":"INFO","message":"recv_task exiting gracefully"}
{"time":"2024-06-26T22:45:43.4978015Z","target":"connlib_client_shared","logging.googleapis.com/sourceLocation":{"file":"connlib\\clients\\shared\\src\\lib.rs","line":"150"},"severity":"INFO","message":"connlib exited gracefully"}
```

I followed these steps:
- Run Firezone and sign in
- Start a speed test using Cloudflare
- During the download phase, quit the GUI

I did the same test with 0fac698 (`main`) and got the "All pipe
instances are busy" error dialog 3 out of 5 times.

# Details

The deadlock will happen in this scenario:

- The main thread enters `Tun::drop` here
0fac698dfc/rust/connlib/tunnel/src/device_channel/tun_windows.rs (L44)
- The worker thread is waiting for space in the packet channel
(`packet_tx` and `packet_rx`) here
0fac698dfc/rust/connlib/tunnel/src/device_channel/tun_windows.rs (L249)
- The main thread tells wintun to shut down. If the worker was on line
247 waiting on wintun, this would unblock it, but the worker is not on
line 247.
0fac698dfc/rust/connlib/tunnel/src/device_channel/tun_windows.rs (L45)
- The main thread waits to join the worker thread
0fac698dfc/rust/connlib/tunnel/src/device_channel/tun_windows.rs (L52)

The threads are now deadlocked. The main thread is waiting for the
worker thread to exit, and the worker thread is waiting for the main
thread to either call `poll_recv`, which would cause `blocking_send` to
return, or for the main thread to complete `Tun::drop`, which would
cause Rust to drop `packet_rx`, which would cause `blocking_send` to
return an error.

This PR makes 2 changes to prevent this deadlock. Each change alone
should work, but for defense-in-depth we make both changes:

1. When the main thread starts `Tun::drop`, we `close` the packet
channel, which would unblock any thread waiting on
`Sender::blocking_send`
2. We use `Sender::try_send` instead of `Sender::blocking_send`. If the
main thread can't consume packets fast enough, we're going to drop them
anyway, because the ring buffer in wintun will eventually fill up. So
dropping them here isn't much different from dropping them anywhere
else, and this keeps the worker thread from locking up.
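
The effect of both changes can be sketched with the standard library's bounded channel standing in for the MPSC channel used here. That swap is an assumption of this sketch: `SyncSender::try_send` plays the role of `Sender::try_send`, and dropping the receiver plays the role of `close`. The `Stats` type and `forward_burst` are hypothetical:

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Outcome of pushing a burst of packets into a bounded channel.
#[derive(Debug, Default, PartialEq)]
pub struct Stats {
    pub queued: usize,
    pub dropped: usize,
    pub closed_detected: bool,
}

/// Sends `burst` packets without ever blocking, then closes the channel.
pub fn forward_burst(capacity: usize, burst: usize) -> Stats {
    let (tx, rx) = sync_channel::<Vec<u8>>(capacity);
    let mut stats = Stats::default();

    for _ in 0..burst {
        // Stand-in for a packet read from wintun.
        match tx.try_send(vec![0u8; 20]) {
            Ok(()) => stats.queued += 1,
            // Channel is full: drop the packet instead of waiting.
            Err(TrySendError::Full(_)) => stats.dropped += 1,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }

    // Equivalent of closing the channel at the start of `Tun::drop`.
    drop(rx);
    // The worker notices immediately and could exit its loop.
    stats.closed_detected = matches!(
        tx.try_send(vec![0u8; 20]),
        Err(TrySendError::Disconnected(_))
    );

    stats
}
```

The worker never waits on the channel in this model, so it cannot be stuck there while the main thread tries to join it.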
2024-06-26 23:52:20 +00:00
Gabi
0fac698dfc chore(connlib): set connection expiration to 120seconds to respect the conntrack udp timeout (#5559) 2024-06-26 21:25:00 +00:00
Gabi
2d312ddc71 chore(connlib): reduce log level for unallowed packets in client (#5569)
Workaround for too many `unallowed packets` logs.

Long-term fix in #5568 and #5560.
2024-06-26 21:24:13 +00:00
Jamil
89bb7c2c5d fix(android): Fix crash in setDns on 32-bit Android by using jlong consistently for the SessionWrapper pointer (#5564)
`connlibSessionPtr` is a `Long`, which is 64 bits wide. On 32-bit Android
architectures, this overwrites part of the `dns_list` for the `setDns`
native function call because Rust uses a 32-bit pointer for
`SessionWrapper` in the function definition.

This causes a JNI crash, detailed below. To fix this, we make sure
`jlong` is received in Rust, and do the pointer conversion in the body
of the functions that need to use it.
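
The pattern can be sketched without the JNI machinery. Everything below is illustrative: `jlong` is a local alias mirroring JNI's 64-bit integer, the `SessionWrapper` here is a toy struct, and the function names are hypothetical:

```rust
/// Local alias mirroring JNI's `jlong`, which is always 64 bits wide.
#[allow(non_camel_case_types)]
pub type jlong = i64;

/// Toy stand-in for the real session type.
pub struct SessionWrapper {
    pub dns: Vec<String>,
}

/// Leaks the session and hands it out as a fixed-width 64-bit handle.
pub fn into_handle(session: SessionWrapper) -> jlong {
    Box::into_raw(Box::new(session)) as usize as jlong
}

/// Takes the handle as `jlong` and converts it to a pointer in the body.
///
/// # Safety
/// `handle` must come from `into_handle` and must not have been freed.
pub unsafe fn set_dns(handle: jlong, dns_list: Vec<String>) {
    // The conversion happens here, not in the function signature.
    let session = unsafe { &mut *(handle as usize as *mut SessionWrapper) };
    session.dns = dns_list;
}

/// Reclaims ownership of the session behind the handle.
///
/// # Safety
/// `handle` must come from `into_handle` and must not be used afterwards.
pub unsafe fn free_handle(handle: jlong) -> SessionWrapper {
    unsafe { *Box::from_raw(handle as usize as *mut SessionWrapper) }
}
```

Since the parameter is always 64 bits wide, the argument layout seen by Rust matches what the Kotlin side passes on both 32-bit and 64-bit targets.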

Adding @ReactorScram to review for visibility.


```
runtime.cc:655] Runtime aborting...
runtime.cc:655] Dumping all threads without mutator lock held
runtime.cc:655] All threads:
runtime.cc:655] DALVIK THREADS (35):
runtime.cc:655] "ConnectivityThread" prio=5 tid=35 Runnable
runtime.cc:655]   | group="" sCount=0 dsCount=0 flags=0 obj=0x131809a8 self=0xa42dea10
runtime.cc:655]   | sysTid=8854 nice=0 cgrp=default sched=0/0 handle=0x7fbb71c0
runtime.cc:655]   | state=R schedstat=( 0 0 0 ) utm=8 stm=0 core=2 HZ=100
runtime.cc:655]   | stack=0x7fab4000-0x7fab6000 stackSize=1040KB
runtime.cc:655]   | held mutexes= "abort lock" "mutator lock"(shared held)
runtime.cc:655]   native: #00 pc 0037b1dd  /apex/com.android.art/lib/libart.so (art::DumpNativeStack(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, int, BacktraceMap*, char const*, art::ArtMethod*, void*, bool)+76)
runtime.cc:655]   native: #01 pc 0044cd01  /apex/com.android.art/lib/libart.so (art::Thread::DumpStack(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, bool, BacktraceMap*, bool) const+388)
runtime.cc:655]   native: #02 pc 00448447  /apex/com.android.art/lib/libart.so (art::Thread::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, bool, BacktraceMap*, bool) const+34)
runtime.cc:655]   native: #03 pc 00465995  /apex/com.android.art/lib/libart.so (art::DumpCheckpoint::Run(art::Thread*)+688)
runtime.cc:655]   native: #04 pc 00460e57  /apex/com.android.art/lib/libart.so (art::ThreadList::RunCheckpoint(art::Closure*, art::Closure*)+354)
runtime.cc:655]   native: #05 pc 0046034f  /apex/com.android.art/lib/libart.so (art::ThreadList::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, bool)+1514)
runtime.cc:655]   native: #06 pc 0040a3af  /apex/com.android.art/lib/libart.so (art::Runtime::Abort(char const*)+1510)
runtime.cc:655]   native: #07 pc 0000d989  /system/lib/libbase.so (android::base::SetAborter(std::__1::function<void (char const*)>&&)::$_3::__invoke(char const*)+48)
runtime.cc:655]   native: #08 pc 0000d295  /system/lib/libbase.so (android::base::LogMessage::~LogMessage()+224)
runtime.cc:655]   native: #09 pc 002965db  /apex/com.android.art/lib/libart.so (art::JavaVMExt::JniAbort(char const*, char const*)+1962)
runtime.cc:655]   native: #10 pc 002966a5  /apex/com.android.art/lib/libart.so (art::JavaVMExt::JniAbortF(char const*, char const*, ...)+64)
runtime.cc:655]   native: #11 pc 004521c1  /apex/com.android.art/lib/libart.so (art::Thread::DecodeJObject(_jobject*) const+544)
runtime.cc:655]   native: #12 pc 0028a6e7  /apex/com.android.art/lib/libart.so (art::(anonymous namespace)::ScopedCheck::CheckInstance(art::ScopedObjectAccess&, art::(anonymous namespace)::ScopedCheck::InstanceKind, _jobject*, bool)+82)
runtime.cc:655]   native: #13 pc 00289779  /apex/com.android.art/lib/libart.so (art::(anonymous namespace)::ScopedCheck::CheckPossibleHeapValue(art::ScopedObjectAccess&, char, art::(anonymous namespace)::JniValueType)+552)
runtime.cc:655]   native: #14 pc 00288f55  /apex/com.android.art/lib/libart.so (art::(anonymous namespace)::ScopedCheck::Check(art::ScopedObjectAccess&, bool, char const*, art::(anonymous namespace)::JniValueType*)+592)
runtime.cc:655]   native: #15 pc 0027cbe7  /apex/com.android.art/lib/libart.so (art::(anonymous namespace)::CheckJNI::GetObjectClass(_JNIEnv*, _jobject*)+586)
runtime.cc:655]   native: #16 pc 003412db  /data/app/~~X6p_4xQWTraApNXlo4SIHA==/dev.firezone.android-zJrN9FN3yhs12tvUNeoOmw==/base.apk!libconnlib.so (offset ec000) (???)
runtime.cc:655]   at dev.firezone.android.tunnel.ConnlibSession.setDns(Native method)
runtime.cc:655]   at NetworkMonitor.onLinkPropertiesChanged(NetworkMonitor.kt:28)
runtime.cc:655]   at android.net.ConnectivityManager$NetworkCallback.onAvailable(ConnectivityManager.java:3328)
runtime.cc:655]   at android.net.ConnectivityManager$CallbackHandler.handleMessage(ConnectivityManager.java:3607)
runtime.cc:655]   at android.os.Handler.dispatchMessage(Handler.java:106)
runtime.cc:655]   at android.os.Looper.loop(Looper.java:223)
runtime.cc:655]   at android.os.HandlerThread.run(HandlerThread.java:67)
```

---------

Co-authored-by: conectado <gabrielalejandro7@gmail.com>
2024-06-26 19:24:44 +00:00
Reactor Scram
f80e550c77 refactor(headless-client): extract ipc_service module (#5546)
Moves almost half the lines of code into `ipc_service` so that it's
separate from the headless Client code.
Moves the standalone / headless Client code into `standalone`.
2024-06-26 14:14:04 +00:00
Gabi
3fa8f04831 chore(connlib): fix test compilation without proptest flag (#5561)
Fixes plain `cargo test`
2024-06-26 11:29:55 +00:00
Gabi
98aa902374 chore(connlib): only refresh DNS for connections that are in use (#5555)
With the current behavior, once a connection stops being used, it
triggers a DNS refresh every 30 seconds, forever.

This can be bad for a gateway that could be handling thousands of
domain names.

Previously, this was prevented by only setting `slated_for_refresh` when
we saw the first packet. That was dropped in favor of checking times in
`handle_timeout`.

So the solution now is to check that the connection is currently in use
before triggering any DNS refresh.
2024-06-26 01:10:58 +00:00
Thomas Eizinger
1b5076fa57 fix(gateway): handle init messages during operation (#5512)
Currently, the gateway only handles an `init` message on startup. For
clients, we handle `init` messages also during operation so it only
makes sense to do the same thing for gateways.

This allows us to remove some old code from `phoenix_channel`. In
particular, the `init` function which used to wait for the `init`
message before continuing. In
https://github.com/firezone/firezone/pull/4594, we refactored
`phoenix-channel` to reconnect internally on errors. As a result, the
`connect` function became synchronous and no longer needed an `async`
context.

At the time, the gateway wasn't updated to make use of this. We can now
simplify the gateway code and resolve the outstanding TODO of handling
`init` messages during operation.
2024-06-26 00:11:07 +00:00
Thomas Eizinger
6c842de83c refactor(connlib): don't re-initialise Tun on config updates (#5392)
Currently, connlib re-initialises the TUN device on Linux every time its
configuration gets updated such as when roaming from one network to
another. This is unnecessary. Instead, we can adopt the same approach as
already used on macOS, iOS and Windows and only initialise it if it
doesn't exist yet.

Doing so surfaces an interesting bug. Currently, attempting to
re-initialise the TUN device fails with a warning:

> connlib_client_shared::eventloop: Failed to set interface on tunnel:
Resource busy (os error 16)

See
https://github.com/firezone/firezone/actions/runs/9656570163/job/26634409346#step:7:103
for an example. As a consequence, we never actually trigger the
`on_set_interface_config` callback and thus never actually set the new
IPs on the TUN device.

Now that we _are_ calling this callback, we execute
`TunDeviceManager::set_ips` which first clears all IPs from the device
and then attaches the new ones. A consequence of this is that the Linux
kernel will clear all routes associated with the device. This clashes
with an optimisation we have in `TunDeviceManager` where we remember the
previously set routes and don't set new ones if they are the same.

This `HashSet` needs to be cleared upon setting new IPs in order to
actually set the new routes correctly afterwards. Without that, we stop
receiving traffic on the TUN device.
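
The interaction between the route cache and `set_ips` can be modelled as below. This is a toy model under stated assumptions: `TunManagerSketch` is hypothetical, routes and IPs are plain strings, and the kernel's routing table is simulated with a second `HashSet`:

```rust
use std::collections::HashSet;

/// Toy model of a device manager with a "skip if unchanged" route cache.
#[derive(Default)]
pub struct TunManagerSketch {
    pub ips: Vec<String>,
    /// Routes we believe are currently set (the optimisation).
    cached_routes: HashSet<String>,
    /// Simulated kernel routing table for the device.
    kernel_routes: HashSet<String>,
}

impl TunManagerSketch {
    pub fn set_ips(&mut self, ips: Vec<String>) {
        self.ips = ips;
        // The kernel drops the device's routes when its IPs are removed.
        self.kernel_routes.clear();
        // So the cache must be invalidated as well.
        self.cached_routes.clear();
    }

    /// Returns `true` if the routes were actually written.
    pub fn set_routes(&mut self, routes: HashSet<String>) -> bool {
        if routes == self.cached_routes {
            return false; // Nothing changed, skip the work.
        }
        self.kernel_routes = routes.clone();
        self.cached_routes = routes;
        true
    }

    pub fn kernel_has_route(&self, route: &str) -> bool {
        self.kernel_routes.contains(route)
    }
}
```

Without the `cached_routes.clear()` call, the second `set_routes` after `set_ips` would be skipped and the simulated kernel table would stay empty.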
2024-06-25 22:30:31 +00:00