Commit Graph

3003 Commits

Author SHA1 Message Date
Andrew Dryga
4fb101ed9f UX cleanup pt 3 (#2789)
Closes https://github.com/firezone/firezone/issues/2601
Also addresses a lot of TODOs from
https://github.com/firezone/firezone/issues/2788
<img width="1728" alt="Screenshot 2023-12-01 at 18 25 11"
src="https://github.com/firezone/firezone/assets/1877644/95137fca-15ab-4b8c-9598-16d92a7951c7">
<img width="1728" alt="Screenshot 2023-12-01 at 18 25 16"
src="https://github.com/firezone/firezone/assets/1877644/9315b754-c3de-4336-8b59-c1d87ac83f69">
<img width="1728" alt="Screenshot 2023-12-01 at 18 25 33"
src="https://github.com/firezone/firezone/assets/1877644/65245194-c922-401e-bbc4-ff4a378520d2">
<img width="1728" alt="Screenshot 2023-12-01 at 18 25 39"
src="https://github.com/firezone/firezone/assets/1877644/3ac8c2c8-c0a8-4074-9cb1-123bc2c21e71">
<img width="1728" alt="Screenshot 2023-12-01 at 18 25 59"
src="https://github.com/firezone/firezone/assets/1877644/7a96cf74-3a9a-4215-9b22-871dee335b30">
2023-12-04 13:56:31 -05:00
Thomas Eizinger
9a5f4e0ce2 fix(relay): ensure channel numbers are unique to a client (#2744)
Previously, there was a misinterpretation of the spec that didn't allow
_different_ clients to use the same channel number. This is wrong
though. Because channel numbers are managed by clients, they must be
unique _per client_. This patch addresses this shortcoming.

I didn't include any dedicated tests for this. The fact that the
existing ones still work means the feature is overall working and the
data structure shows that the channels are now indeed unique per client.
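
The per-client uniqueness described above can be pictured as a lookup table keyed by the pair (client, channel number) rather than by channel number alone. The sketch below is illustrative Python, not the relay's actual code; `ChannelTable`, `bind`, and `peer_for` are hypothetical names.

```python
from typing import Dict, Optional, Tuple

ClientAddr = Tuple[str, int]  # (ip, port) of the client that owns the channel
PeerAddr = Tuple[str, int]    # (ip, port) of the peer the channel points at


class ChannelTable:
    """Channel bindings keyed by (client, channel number)."""

    def __init__(self) -> None:
        self._bindings: Dict[Tuple[ClientAddr, int], PeerAddr] = {}

    def bind(self, client: ClientAddr, channel: int, peer: PeerAddr) -> bool:
        """Bind `channel` to `peer` for this client only.

        Returns False if the same client already bound this channel
        number to a different peer. Other clients are unaffected.
        """
        key = (client, channel)
        existing = self._bindings.get(key)
        if existing is not None and existing != peer:
            return False
        self._bindings[key] = peer
        return True

    def peer_for(self, client: ClientAddr, channel: int) -> Optional[PeerAddr]:
        """Look up the peer bound to `channel` by `client`, if any."""
        return self._bindings.get((client, channel))
```

With this keying, two different clients can both bind channel `0x4000` without conflict, while a single client cannot rebind it to a second peer.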
2023-12-04 17:01:55 +00:00
Jamil
e3e2baf87d Use typed links to detect broken links (#2750)
- Fixes firezone/gtm#220
- Add @jefferenced and @ReactorScram to team page and simplify it

@conectado Interesting use of strong typing to enforce no broken links
in NextJS
2023-11-30 23:54:41 +00:00
Reactor Scram
04d4371b93 fix(CI): backticks for PowerShell line continuation (#2754) 2023-11-30 21:52:15 +00:00
Jamil
55ba09c8bf Remove PortalMock from Android and Apple (#2752)
It's fallen out of date and is no longer used.
2023-11-30 21:42:48 +00:00
Reactor Scram
189a35f692 feat(windows): Tauri boilerplate and CI changes (#2742)
Trying to get CI/CD to produce firezone-windows-client.exe. Can't
remember if I need both a PR and a draft release or just the draft
release for that.

---------

Signed-off-by: Reactor Scram <ReactorScram@users.noreply.github.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2023-11-30 19:50:43 +00:00
Andrew Dryga
55e8d3407f Render deleted entities on fetch (#2692)
Since we have flows, we should either delete a flow when its related
entity is deleted (which would make flows not very useful) or allow
viewing deleted entities, marking them as deleted and removing all
action buttons in the UI:

<img width="1728" alt="Screenshot 2023-11-22 at 13 41 51"
src="https://github.com/firezone/firezone/assets/1877644/ae7f14b9-9607-4de0-a90f-049faf7e4374">
<img width="1728" alt="Screenshot 2023-11-22 at 13 41 54"
src="https://github.com/firezone/firezone/assets/1877644/491f8e1f-6aad-459b-b038-6100c25b3bf4">
<img width="1728" alt="Screenshot 2023-11-22 at 13 41 48"
src="https://github.com/firezone/firezone/assets/1877644/9200e521-0d92-41b5-9197-355353f09a50">

<img width="1728" alt="Screenshot 2023-11-22 at 13 07 47"
src="https://github.com/firezone/firezone/assets/1877644/dca59bbd-9771-4b06-b32b-f17cf0047520">

This change only affects fetching a relation by ID (e.g. `actors/:id`);
the remaining pages (index, edit) will not show deleted entities unless
they are a critical relation (e.g. for a Policy to work, both the actor
group and the resource are needed):

<img width="1728" alt="Screenshot 2023-11-22 at 13 42 23"
src="https://github.com/firezone/firezone/assets/1877644/d8b15011-838a-477d-97c8-5c7109299cb9">

Closes #2681

Signed-off-by: Andrew Dryga <andrew@dryga.com>
2023-11-30 13:55:07 -06:00
Andrew Dryga
af5cc38f9e Pick latest-versioned gateways (#2739)
Closes #2733
2023-11-30 11:52:24 -06:00
Jamil
ce79b4922d Update bot comments in-place to reduce notification spam (#2748)
Updating a comment doesn't produce a notification, so this will fix
comment spam on PRs.
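
The update-in-place idea can be sketched as an upsert keyed on a hidden marker in the comment body. This is an illustrative Python model over an in-memory list, not the workflow's actual code; `MARKER`, `upsert_comment`, and the comment shape are assumptions.

```python
from typing import Dict, List

# Hypothetical marker the bot embeds so it can find its own comment again.
MARKER = "<!-- bot:report -->"


def upsert_comment(comments: List[Dict[str, str]], body: str) -> bool:
    """Update the bot's existing comment in place, or append a new one.

    Returns True if a new comment was created and False if an
    existing comment was edited instead.
    """
    tagged = f"{MARKER}\n{body}"
    for comment in comments:
        if comment["body"].startswith(MARKER):
            comment["body"] = tagged  # edit in place: no new comment
            return False
    comments.append({"body": tagged})
    return True
```

Only the first run adds a comment to the thread; every later run rewrites that same comment.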

History can be viewed in the comment itself like so:

<img width="658" alt="Screenshot 2023-11-30 at 7 23 29 AM"
src="https://github.com/firezone/firezone/assets/167144/30846b58-336b-41de-88c3-c72947f4ec97">
2023-11-30 16:45:39 +00:00
Gabi
e3546cfa12 connlib: limit the number of host candidates used (#2746)
In some cases we were observing that connections between clients and
gateways couldn't be established.

This happened even when candidates were being found on both ends.

It was usually observed when IPv6 isn't working on the relays but
is still used as one of the viable candidates.

To reproduce this more easily, I created an interface with 50 IPs using
this script:

```bash
#!/bin/bash

# Generate 50 IPv6 addresses (10 outer x 5 inner iterations)
for i in {1..10}
do
  for j in {1..5}
  do
    # Generate a random IPv6 address
    ipv6_address=$(openssl rand -hex 5 | sed 's/\(..\)/\1:/g; s/.$//' | awk '{print "fd00::"$1}')

    # Add IPv6 address to lo0
    sudo ifconfig lo0 inet6 alias "$ipv6_address"

    echo "Added IPv6 address $ipv6_address to lo0"
  done
done
```

This behavior was observed almost, but not fully, consistently, since it
depended on the order in which candidates were used.

I tried modifying the timeouts and the limits on channel binding
requests that are internal to webrtc, but the connections were still not
consistent. The only thing that worked was limiting the number of host
candidates.

This is okay since, even if we can't establish the local connection (no
hairpin NAT), the relayed connection will still happen.

But this is not a good long-term solution. In the future we should be
smarter about how we sort and ping candidates, prioritizing srflx to
srflx or srflx to relay and leaving host candidates for last. This will
be easier to improve on after refactoring webrtc out.
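
As a rough illustration of the workaround described above, the sketch below caps how many host candidates are kept while leaving server-reflexive and relay candidates untouched. It is Python for brevity and is not connlib's code; `Candidate`, `limit_host_candidates`, and the `max_host` parameter are hypothetical, and the actual limit used in the PR is not stated here.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    kind: str     # "host", "srflx", or "relay"
    address: str


def limit_host_candidates(candidates: List[Candidate], max_host: int) -> List[Candidate]:
    """Keep at most `max_host` host candidates, preserving original order.

    Non-host candidates (srflx, relay) are always kept.
    """
    kept: List[Candidate] = []
    host_seen = 0
    for candidate in candidates:
        if candidate.kind == "host":
            if host_seen >= max_host:
                continue  # drop surplus host candidates
            host_seen += 1
        kept.append(candidate)
    return kept
```

Because relay candidates are never dropped, a relayed connection remains possible even when every surviving host candidate fails.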
2023-11-30 14:33:07 +00:00
Jamil
79aa4cfb8e 1.x docs first iteration (#2688)
Doing a first pass over documentation and minor UI cleanup. This PR
isn't meant to represent the final state of launch docs, but instead
something that will unblock #2685 and #2675

Fixes #2729
2023-11-30 04:04:54 +00:00
bmanifold
67c14c02ed Add Relay admin feature flag (#2736)
Why:

* Self-hosted Relays are not going to be a part of the beta release, so
hiding the functionality in the portal keeps users from getting
confused about a feature they aren't able to use.

Closes #2178
2023-11-29 22:02:50 +00:00
Thomas Eizinger
81598dbaff feat(relay): reduce packet drops (#2737)
There is another channel which we hadn't yet increased in size: the one
between the allocation and the main task loop. Increasing it to 1000
means each allocation can potentially buffer 65MB of data. With the
biggest port range (16383 allocations), that would be a theoretical
memory consumption of ~1TB. But this would imply that we have 16383
connected clients that all send data at max speed, saturating our
downlink, while our uplink is somehow ridiculously small. As long as up-
and downlink are roughly within the same ballpark, it should be
impossible to actually fill up these buffers.
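
The worst-case figures above can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch in Python; the per-item size is an assumption (a maximum-size UDP datagram), since the message gives only the resulting totals.

```python
# Assumption: each buffered item can be as large as a maximum-size UDP
# datagram (65,535 bytes); the commit message only states the 65MB total.
MAX_DATAGRAM_BYTES = 65_535
CHANNEL_CAPACITY = 1_000   # channel size from the commit message
MAX_ALLOCATIONS = 16_383   # biggest port range from the commit message

per_allocation_bytes = CHANNEL_CAPACITY * MAX_DATAGRAM_BYTES
total_bytes = per_allocation_bytes * MAX_ALLOCATIONS

print(f"per allocation: {per_allocation_bytes / 1e6:.1f} MB")
print(f"worst case:     {total_bytes / 1e12:.2f} TB")
```

Under that assumption the totals come out at roughly 65.5 MB per allocation and 1.07 TB overall, in line with the figures quoted above.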

I suspect that the current packet drops of the iperf test are happening
because on localhost, sending 10 UDP packets is so quick that tokio is
unable to wake up the task in time to empty the queue.

In addition to the increased channel size, I've also added a check for
the other channels to avoid writing to them in case they are not ready
for some reason.

---------

Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2023-11-29 19:01:17 +00:00
Reactor Scram
ce0e396c49 feat(windows): Windows boilerplate and CI (#2715) (#2730)
Testing if CI will build the Windows exe, or at least check the code.

---------

Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
2023-11-29 14:59:32 +00:00
Gabi
a309f11011 Fix gateway cleanup (#2704)
Yesterday, during some portion of the day connections between clients
and resources were impossible.

While I couldn't pinpoint the exact cause I found some issues with
cleanup. This PR fixes those.

Furthermore, I increased the default log level for tunnels in the
clients so that if this happens again we have better logs to triage.

~~Furthermore, I found out about #2705 so, I removed the limit of relays
from connlib since the portal already limits it to 2 (4 if you count
per-ip), that way we make sure that we always use both ipv4 and ipv6.
The connection start up time seems to slow down due to this but I think
this is better. We might want to go to only 2 urls again later on to
speed this up; if the portal can ensure it's a working,
load-balanced relay, there might not be a point in using more than a
single server~~. cc @AndrewDryga

Edit: we always get an ipv4 and ipv6 address for the same relay as the
first two relays in the relay list, save the case where only one of the
ip types is supported. We should be safe limiting it to 2.

---------

Signed-off-by: Gabi <gabrielalejandro7@gmail.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2023-11-29 04:49:30 +00:00
Jamil
8ad82b515e "Magic Link" -> "Email" (#2731)
Updates user-facing terminology to `One-Time Password` to more
accurately reflect this sign in method and match docs more consistently

Refs #2688 
Refs #2021
2023-11-28 23:58:50 +00:00
Jamil
2371df946a Fix tooltips (#2734)
Fixes a JS bug
2023-11-28 23:06:55 +00:00
Andrew Dryga
efc71914f8 Configure ip6tables rules for docker to reflect v4 rules 2023-11-28 16:50:58 -06:00
Jamil
480a0f336b Add pricing page (#2685)
Fixes firezone/gtm#194
2023-11-28 21:33:45 +00:00
Jamil
db601312c2 1.x landing page first iteration (#2675)
Fixes firezone/gtm#165
Fixes firezone/gtm#219
2023-11-28 21:07:12 +00:00
Jamil
5dd5f5a650 use random UUID fallback if deviceId is blank on Apple (#2699)
Fixes #2697
2023-11-28 19:33:45 +00:00
Jamil
916a34a677 Remove redundant Android README (#2719)
The maintained one is in `kotlin/android`.
2023-11-28 07:24:35 -08:00
Jamil
09d1f8cd68 Fix Android logo and fonts (#2689)
- Use capital `F` for logo
- Use Source Code Pro for font family on Android
2023-11-27 22:24:21 +00:00
Jamil
6d08f2f4cb Fix button colors to match product (#2702)
Removes gradient to match product 

Refs #2682
2023-11-27 21:05:15 +00:00
bmanifold
29709fd239 Update portal button colors, button sizing, and sign-in page spacing (#2693)
Closes #2682 #2640 #2639 

This screenshot should demonstrate all 3 issues

<img width="670" alt="Screenshot 2023-11-22 at 3 02 13 PM"
src="https://github.com/firezone/firezone/assets/2646332/d564c6ac-2482-40b1-92c8-0ee961b0ec78">

---------

Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
2023-11-27 21:04:45 +00:00
Roopesh Chander
747d8d92ac Do retry-then-signout when startTunnel() fails as well (#2723)
Fixes #2718

It appears that #2687 missed the case where the tunnel failed to start
for some reason, so the tunnel status never goes to "Connecting" and
then back to "Disconnected" -- we were reacting to the tunnel status
becoming "Disconnected".

In this PR, we do the retry when startTunnel() throws an error as well.
2023-11-27 13:09:11 +00:00
Jamil
7fab285525 Update appIcons to match brand colors (#2720)
Tweaks the flame color to match elsewhere
2023-11-27 13:04:40 +00:00
Roopesh Chander
f9c2031e91 Shutdown tunnel on quit in macos (#2710)
Fixes #2565

With this PR, the 'Quit' menu item (in the macOS status item menu)
disconnects the tunnel (if tunnel is active) before quitting. While the
tunnel is connected, the menu item's title is changed to 'Disconnect and
Quit'.

Depends on #2687 (and this PR currently includes those commits), so
opened as draft.

**Edit:** Now made ready for review.
2023-11-25 21:13:56 +00:00
Roopesh Chander
c283610402 Signout when connlib calls onDisconnect (#2687)
Fixes #2304.

- If the tunnel disconnects because of connlib, or because the tunnel is
incorrectly configured (we don't expect that to happen), the user is
signed out.
- If the tunnel disconnects because the user disconnected it in
Settings, the user is not signed out, and no alert is shown
- If the tunnel disconnects because the OS brought it down for some
other reason (not sure what it could be), the user is not signed out,
and an alert is shown. Alert will be shown only if the app is running at
that time.
2023-11-25 11:08:51 +00:00
Andrew Dryga
b9cd94ec82 Show online clients first on the page (#2698) 2023-11-24 12:02:43 -06:00
Andrew Dryga
c6b64403db Fix unit file (#2684)
Keep in mind it will not work until we release a binary on GitHub.
2023-11-24 15:01:57 +00:00
Andrew Dryga
484b5a49ce Fix OIDC form and redirect urls (#2695)
Closes #2674
2023-11-24 15:01:10 +00:00
bmanifold
ef480e1acd Add routing option for sites (#2610)
Why:

* As sites are created, the default behavior right now is to route
traffic through whichever path is easiest/fastest. This commit adds the
ability to allow the admin to choose a routing policy for a given site.
2023-11-22 19:59:54 +00:00
Andrew Dryga
48722d609f Fix production gateways deployment 2023-11-20 18:43:32 -06:00
Jamil
ba7be34f77 Add Request Demo button; update sales lead form (#2673)
Fixes firezone/gtm#217
2023-11-20 02:10:01 +00:00
Jamil
a13d1ab0c7 Add YC logo (#2671)
Fixes firezone/gtm#212

<img width="472" alt="Screenshot 2023-11-17 at 3 02 24 PM"
src="https://github.com/firezone/firezone/assets/167144/d8e4f40c-7ff8-44e4-81d0-62cf968a21fd">
<img width="1291" alt="Screenshot 2023-11-17 at 3 02 15 PM"
src="https://github.com/firezone/firezone/assets/167144/20174b47-a7d4-45df-a530-35a2800911dc">
2023-11-18 18:28:27 +00:00
Jamil
a5b6929fbf Capitalize logo (#2666)
Forgot to make this consistent. Alternatively we could use a text logo
with the text in-place.
2023-11-18 16:50:29 +00:00
Jeff S
aa29200dea Blog post: secure remote access makes remote work a win-win (#2642)
Review & approve

---------

Signed-off-by: Jeff S. <148512665+jefferenced@users.noreply.github.com>
Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
2023-11-18 16:13:48 +00:00
Andrew Dryga
b054c88a62 Finish implementing performance testing (#2665) 2023-11-17 20:22:11 -06:00
Andrew Dryga
c9f062c7c7 Remove flow logs from gateway page and some of TODOs (#2662) 2023-11-17 12:10:54 -06:00
Andrew Dryga
c0c8d879d0 Upload performance test results 2023-11-17 01:22:11 -06:00
Gabi
aec5b97012 Add performance tests for client-gateway communication (#2655) 2023-11-17 00:32:34 -06:00
Gabi
7528a765fb connlib: fix incorrect assumption for buffer size that was causing panics (#2663)
There was an incorrect assumption with buffer size that was causing a
panic (detected on macos client)
2023-11-17 04:13:45 +00:00
Andrew Dryga
54f83f43a3 Fix typo 2023-11-16 22:13:31 -06:00
Andrew Dryga
091a8ddbc8 Copy major and major-minor containers to prod 2023-11-16 21:17:34 -06:00
Andrew Dryga
61221f3899 Add host.firezone.local to the demo server 2023-11-16 12:16:43 -06:00
Andrew Dryga
1ab3fdd3b5 Ephemeral gateways (#2656)
- [x] Fixed docker run command to mount a volume at `/etc/firezone`
- [x] Fixed systemd unit file to properly setcap, create a writeable
`/etc/firezone` directory, use a non-root user, etc
- [x] Removed `FIREZONE_ID` from our terraform scripts

Now on Sites index we only show online gateways:
<img width="1728" alt="Screenshot 2023-11-15 at 18 04 12"
src="https://github.com/firezone/firezone/assets/1877644/b532f200-0420-4427-acff-a3b8623560c5">

On the Site view we also show only online ones with a link to see all:
<img width="1728" alt="Screenshot 2023-11-15 at 18 02 33"
src="https://github.com/firezone/firezone/assets/1877644/9774dfac-4340-41d4-8404-586e081505f5">

All can be seen on a separate page:
<img width="1728" alt="Screenshot 2023-11-15 at 18 02 27"
src="https://github.com/firezone/firezone/assets/1877644/5d135f60-c7af-4e48-9ebb-626ff7461316">

Some of the functions I've added are pretty dirty hacks; we really need
to implement filters from #2029 to properly implement those and remove
duplicated code.
2023-11-16 11:17:22 -06:00
Gabi
683723ee17 connlib: fix logging string for macos (#2658)
The filter for macOS wasn't being applied correctly; this fixes that.
2023-11-16 06:11:42 +00:00
Gabi
bc8f438a56 feat(connlib): directly send wireguard traffic instead of tunneling it through WebRTC datachannels (#2643)
This PR started as a response to a performance degradation in the
gateways.

The way to test performance in a realistic environment is to use a GCP
VM as a client and an AWS VM as a gateway, with a single iperf server
behind the gateway.

Then the `iperf` results with current main:

```
Connecting to host 172.31.92.238, port 5201
Reverse mode, remote host 172.31.92.238 is sending
[  5] local 100.83.194.77 port 58426 connected to 172.31.92.238 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  1.01 MBytes  8.50 Mbits/sec                  
[  5]   1.00-2.00   sec  1.14 MBytes  9.59 Mbits/sec                  
[  5]   2.00-3.00   sec   699 KBytes  5.73 Mbits/sec                  
[  5]   3.00-4.00   sec  1.11 MBytes  9.31 Mbits/sec                  
[  5]   4.00-5.00   sec   664 KBytes  5.44 Mbits/sec                  
[  5]   5.00-6.00   sec   591 KBytes  4.84 Mbits/sec                  
[  5]   6.00-7.00   sec   722 KBytes  5.91 Mbits/sec                  
[  5]   7.00-8.00   sec   833 KBytes  6.83 Mbits/sec                  
[  5]   8.00-9.00   sec   738 KBytes  6.04 Mbits/sec                  
[  5]   9.00-10.00  sec   836 KBytes  6.85 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.06  sec  8.78 MBytes  7.32 Mbits/sec    3             sender
[  5]   0.00-10.00  sec  8.23 MBytes  6.90 Mbits/sec                  receiver

iperf Done.
```

Most of the performance problems were due to using SCTP and DTLS.

So I created a
[fork](https://github.com/firezone/webrtc/tree/expose-new-endpoint) of
webrtc that lets us circumvent those; we don't need them because we
depend on wireguard for encryption.

With those changes much better throughput is achieved:

```
gabriel@cloudshell:~ (firezone-personal-instances)$ iperf3 -R -c 172.31.92.238
Connecting to host 172.31.92.238, port 5201
Reverse mode, remote host 172.31.92.238 is sending
[  5] local 100.83.194.77 port 51206 connected to 172.31.92.238 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  5.60 MBytes  47.0 Mbits/sec                  
[  5]   1.00-2.00   sec  17.2 MBytes   144 Mbits/sec                  
[  5]   2.00-3.00   sec  15.8 MBytes   132 Mbits/sec                  
[  5]   3.00-4.00   sec  14.8 MBytes   125 Mbits/sec                  
[  5]   4.00-5.00   sec  15.9 MBytes   133 Mbits/sec                  
[  5]   5.00-6.00   sec  15.8 MBytes   133 Mbits/sec                  
[  5]   6.00-7.00   sec  15.3 MBytes   128 Mbits/sec                  
[  5]   7.00-8.00   sec  15.6 MBytes   131 Mbits/sec                  
[  5]   8.00-9.00   sec  15.6 MBytes   131 Mbits/sec                  
[  5]   9.00-10.00  sec  16.0 MBytes   134 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.05  sec   151 MBytes   126 Mbits/sec   74             sender
[  5]   0.00-10.00  sec   148 MBytes   124 Mbits/sec                  receiver

iperf Done
```

However, this is still worse than what was achieved with a previous
commit (`21afdf0a9a113c996d60a63b2e8c8f32d3aeb87`):
```
gabriel@cloudshell:~ (firezone-personal-instances)$ iperf3 -R -c 172.31.92.238
Connecting to host 172.31.92.238, port 5201
Reverse mode, remote host 172.31.92.238 is sending
[  5] local 100.100.68.41 port 49762 connected to 172.31.92.238 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  6.14 MBytes  51.5 Mbits/sec                  
[  5]   1.00-2.00   sec  17.1 MBytes   144 Mbits/sec                  
[  5]   2.00-3.00   sec  22.8 MBytes   191 Mbits/sec                  
[  5]   3.00-4.00   sec  23.5 MBytes   197 Mbits/sec                  
[  5]   4.00-5.00   sec  23.0 MBytes   193 Mbits/sec                  
[  5]   5.00-6.00   sec  22.1 MBytes   185 Mbits/sec                  
[  5]   6.00-7.00   sec  23.0 MBytes   193 Mbits/sec                  
[  5]   7.00-8.00   sec  22.7 MBytes   190 Mbits/sec                  
[  5]   8.00-9.00   sec  21.0 MBytes   176 Mbits/sec                  
[  5]   9.00-10.00  sec  19.9 MBytes   167 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.05  sec   204 MBytes   170 Mbits/sec  127             sender
[  5]   0.00-10.00  sec   201 MBytes   169 Mbits/sec                  receiver
```

My profiling suggested that this is due to reading/writing packets
happening in its own dedicated tasks. So much so that maybe in the
future we should even consider spawning their own dedicated runtime so
that those loops have a dedicated OS thread.

Also, using a multi-queue interface will probably give us huge gains for
handling multiple concurrent clients if we have a dedicated task for
each queue (currently the interface is started as multi-queue but a
single file descriptor is used).

However, the changes proposed in this PR are good enough for now as long
as performance doesn't degrade.

Along those lines, I will create a CI job that reports the throughput
using the local `docker-compose.yml` file, which we should always check
before merging. That is not the be-all and end-all of the performance
story, but for smaller PRs the correlation to real-world throughput
should be enough.

For bigger PRs we should manually test before merging for now, until we
have a way in CI to spin up some realistic tests (note that the VMs
should be in separate cloud environments; same-cloud links are so
reliable that we miss actual performance degradation due to dropped
packets). On this note, I'll write a small manual on how to conduct
those tests, with full current results, that we should always use before
merging new PRs that affect the hot path. cc @thomaseizinger

Finally, when testing these changes I found some flakiness in the
reconnection path. So I changed things so that we clean up connections
only on wireguard's error (connection expiration). This is quite slow
for now (~120 seconds), but in the future we can issue an ICE restart
each time the wireguard keepalive expires (rekey timeout) so that we
restart the connection every ~30 seconds, and we can reduce the
keepalive timeout from the portal to accelerate it even more. Later on
we can get smarter about it.

---------

Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2023-11-16 02:59:48 +00:00
Andrew Dryga
ce7c5198fa Deploy Metabase to production
Closes https://github.com/firezone/firezone/issues/2527
2023-11-15 17:04:23 -06:00