Commit Graph

122 Commits

Author SHA1 Message Date
Jamil
fee808bc62 chore(portal): Log error for unknown channel messages (#8299)
Instead of crashing, it would make sense to log these and let the
connected entity maintain its WebSocket connection.

This should never happen in practice if we maintain our version
compatibility matrix properly, but it will help reduce the blast radius
of a channel message bug that happens to slip out into the wild.
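
The log-instead-of-crash behavior described above can be sketched as follows. This is a minimal, language-neutral illustration in Python, not the portal's actual channel code; the `HANDLERS` table, the `ping` event, and `handle_message` are hypothetical names introduced only for this example.

```python
import logging

logger = logging.getLogger("channel")

# Hypothetical handler table; the event names are illustrative only.
HANDLERS = {
    "ping": lambda payload: "pong",
}


def handle_message(event, payload):
    """Dispatch a channel message, logging unknown events instead of raising."""
    handler = HANDLERS.get(event)
    if handler is None:
        # Unknown message: record it and return so the connection stays up.
        logger.error("Unknown channel message: %s", event)
        return None
    return handler(payload)
```

Because the unknown branch returns rather than raising, a message from a newer or older peer only produces a log entry and the caller's receive loop keeps running.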

Fixes #4679
2025-03-03 21:21:39 +00:00
Jamil
e5ae00ab99 fix(portal): norely -> noreply in gateway/channel.ex (#8329)
Fixes a typo that snuck in in #8267
2025-03-03 08:15:46 +00:00
Jamil
cb0bf44815 chore: Remove ability to create GCP log sinks (#8298)
This has long since been removed in the Clients.
2025-02-28 20:57:21 +00:00
Jamil
e03047d549 feat(portal): Send gateway ipv4 and ipv6 to client (#8291)
In order to properly handle SRV and TXT records on the clients, we need
to be able to pick a Gateway using the initial query itself. After that,
we need to know the Gateway Tunnel IPs we're connecting to so we can
have the query perform the lookup.

Fixes #8281
2025-02-28 03:52:27 +00:00
Brian Manifold
bc150156ce fix(portal): Update gateway channel to process resource_update (#8280)
Why:

* After merging #8267 it was discovered that there was a race condition
that allowed a `resource_create` message to end up at the Gateway
Channel process. Previously, this message would not have ever arrived,
because we were replacing Resource IDs when a breaking change was made,
but since that is no longer the case, it is possible that a connection
could be established between the time the `delete_resource` and
`create_resource` messages are sent and the `create_resource` would end
up at the Gateway Channel process. This commit adds a no-op handler to
make sure the message gets processed without throwing an error.
2025-02-27 01:46:13 +00:00
Brian Manifold
d0f0de0f8d refactor(portal): Allow breaking changes in Resources/Policies (#8267)
Why:

* Rather than using a persistent_id field in Resources/Policies, it was
decided that we should allow "breaking changes" to these entities. This
means that Resources/Policies will now be able to update all fields on
the schema without changing the primary key ID of the entity.
* This change will greatly help the API and Terraform provider
development.

@jamilbk, would you like me to put a migration in this PR to actually
get rid of all of the existing soft deleted entities?

@thomaseizinger, I tagged you on this, because I wanted to make sure
that these changes weren't going to break any expectations in the client
and/or gateways.

---------

Signed-off-by: Brian Manifold <bmanifold@users.noreply.github.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2025-02-26 17:05:34 +00:00
Jamil
dec2b0ee81 fix(portal): Only configure Sentry.LoggerHandler once (#8025)
The applications within our umbrella are all joined into a single Erlang
cluster, and logger configuration is applied already to the entire
umbrella.

As such, registering the Sentry log handler in each application's
startup routine triggers duplicate handlers to be registered for the
cluster, resulting in warnings like this in GCP:

```
Event dropped due to being a duplicate of a previously-captured event.
```

As such, we can move the log handler configuration to the top-level
`:logger` key, under the `:logger` subkey for configuring a single
handler. We then load this handler config in the `domain` app only and
it applies to the entire cluster.
2025-02-05 13:41:19 +00:00
Jamil
6be7cf6b45 feat(portal): Add Sentry reporting (#8013)
This adds https://github.com/getsentry/sentry-elixir to the portal for
automatic process crash and exception trace reporting.

It also configures Logger reporting for the `warning` level and higher,
and sets the data scrubbing rules to allow all Logger metadata keys
(`logger_metadata.*` in the Sentry project settings).

Lastly, it configures automatic HTTP error reporting by tying into the
`api` and `web` endpoint modules with a custom `plug` middleware so we
get automatic reporting of unsuccessful Phoenix responses.

It is expected this will be noisy when we first deploy and we'll need to
tune it down a bit. This is the same approach used with other Sentry
platforms.
2025-02-04 18:35:52 +00:00
Jamil
3f3a908bd2 chore(portal): Bump opentelemetry versions (#7794)
Dependabot is having issues figuring out the opentelemetry bumps due to
a [package pull](https://github.com/firezone/firezone/pull/7788), so
this PR aims to alleviate that as a one-off fix.

This bumps a few deps' major versions. Nothing jumped out at first
glance when I reviewed the changelogs, but I figured we'll have a better
idea when this goes out to staging since OTLP is basically disabled in
dev/test.
2025-01-17 01:34:12 +00:00
Brian Manifold
eea7079776 fix(portal): Catch seat limit error in API fallback controller (#7783)
Why:

* The fallback controller in the API was not catching `{:error,
:seat_limit_reached}` being returned and was then generating a 500
response when this happened. This commit adds the condition in the
fallback controller and adds a new template for a more specific error
message in the returned JSON.
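
A rough sketch of the idea, in Python rather than the portal's own language: a fallback step maps known error results to specific responses and only falls through to a 500 for results it does not recognize. The status codes, the JSON shape, and the `fallback` function are assumptions made for illustration, not the values used in the actual fix.

```python
def fallback(result):
    """Translate a domain result into an (HTTP status, JSON body) pair."""
    if result == ("error", "seat_limit_reached"):
        # Assumed status and message; the real template may differ.
        return 422, {"error": {"reason": "Seat limit reached"}}
    if result == ("error", "not_found"):
        return 404, {"error": {"reason": "Not found"}}
    # Anything unrecognized still ends up as a generic server error.
    return 500, {"error": {"reason": "Internal server error"}}
```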
2025-01-17 00:13:45 +00:00
Jamil
603a64435e chore(portal): use appropriate sha in dev (#7782)
Not a huge deal, but this doesn't actually need to be a valid SHA and
this is more clear / has no risk of collision with an actual git sha.
2025-01-16 22:58:12 +00:00
Jamil
53032fcbe1 fix(ci): Populate elixir vsn from env at build time (#7773)
Dependabot's workflow appears to be set up in such a way that it can't
find our `sha.exs` file.

This is a cleaner approach that doesn't rely on using external files for
the application version.

Interesting note: `mix compile` will happily use the cached `version`
even though it's computed from an env var, because `mix compile` uses
file hash and mtime to know when to recompile.

See https://github.com/firezone/firezone/network/updates/942719116
2025-01-16 22:26:22 +00:00
Brian Manifold
1f457d2127 fix(portal): Fixing a few edge cases for identity email (#7532)
2024-12-16 23:11:25 +00:00
Brian Manifold
f114bc95cd refactor(portal): Add email as separate column on auth_identities table (#7472)
Why:

* Currently, when using the API, a user has no way of easily identifying
what identities they are pulling back as the response only includes the
`provider_identifier` which for most of our AuthProviders is an ID for
the IdP and not an email address. Along with that, when adding users to
an OIDC provider within Firezone, there is no check for whether or not
an identity has already been added with a given email address. By
creating a separate email column on the `auth_identities` table, it will
be very straightforward to know whether an email address exists for a
given identity, return it in an API response and allow the admin of a
Firezone account to track users (Identities) by email rather than IdP
identifier.

Fixes #7392
2024-12-13 17:26:47 +00:00
Brian Manifold
9711cf56c1 fix(portal): Fix update API endpoint for resources (#7493)
Why:

* The API endpoint for updating Resources was using
`Resources.fetch_resource_by_id_or_persistent_id`, however that function
was fetching all Resources, which included deleted Resources. In order
to prevent an API user from attempting to update a Resource that is
deleted, a new function was added to fetch active Resources only.

Fixes: #7492
2024-12-12 22:51:28 +00:00
Brian Manifold
06791d2d05 refactor(portal): API persistent IDs (#7182)
In order for the firezone terraform provider to work properly, the
Resources and Policies need to be able to be referenced by their
`persistent_id`, specifically in the portal API.
2024-11-07 20:45:56 +00:00
Thomas Eizinger
ce1e59c9fe feat(connlib): implement idempotent control protocol for gateway (#6941)
This PR implements the new idempotent control protocol for the gateway.
We retain backwards-compatibility with old clients to allow admins to
perform a disruption-free update to the latest version.

With this new control protocol, we are moving the responsibility of
exchanging the proxy IPs we assigned to DNS resources to a p2p protocol
between client and gateway. As a result, wildcard DNS resources only get
authorized on the first access. Accessing a new domain within the same
resource will thus no longer require a roundtrip to the portal.

Overall, users will see a greatly decreased connection setup latency. On
top of that, the new protocol will allow us to more easily implement
packet buffering which will be another UX boost for Firezone.
2024-10-18 15:59:47 +00:00
Andrew Dryga
b3c2e54460 feat(portal): New version of the WS control protocol (#6761)
TODOs:
- [x] Switch to sending messages instead of replies
- [ ] Do not hide pre-filtered resources and render them with an error
instead (in case we will want to expose that on a client later)
- [x] Figure out how to generate PSK so that it stays across WS
connections
2024-10-16 10:57:54 -06:00
Andrew Dryga
1abfa10fb7 fix(portal): UX improvements (#7013)
This PR accumulates lots of small UX fixes from #6645.

---------

Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
2024-10-14 11:32:44 -06:00
Andrew Dryga
3652839b1a feat(portal): Allow updating policies and resources (#6690)
Now you can "edit" any field on a policy. When one of the fields that
govern access is changed (resource, actor group, or conditions), a new
policy is created and the old one is deleted. This is broadcast to the
clients right away to minimize downtime. The new policy has its own
flows to prevent confusion while auditing. To make the experience better
for external systems, we added a `persistent_id` that stays the same
across all versions of a given policy.

Resources work in a similar fashion but when they are replaced we will
also replace all corresponding policies.
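
The replace-on-change behavior for policies might look roughly like the sketch below. It is an illustrative Python model only: the field names in `ACCESS_FIELDS`, the dictionary layout, and `update_policy` are assumptions, not the portal's schema or API.

```python
import uuid

# Assumed names for the fields that govern access.
ACCESS_FIELDS = {"resource_id", "actor_group_id", "conditions"}


def update_policy(policies, policy_id, changes):
    """Edit a policy in place, or replace it when an access field changes."""
    old = policies[policy_id]
    if not ACCESS_FIELDS.intersection(changes):
        # Cosmetic change: keep the same policy and ID.
        old.update(changes)
        return old
    # Access-affecting change: soft-delete the old version and create a new
    # one that carries over persistent_id so external systems can track it.
    old["deleted"] = True
    new = {**old, **changes, "id": str(uuid.uuid4()), "deleted": False}
    policies[new["id"]] = new
    return new
```

Keeping the old row around as a soft-deleted version is what yields the audit trail mentioned in this commit message.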

An additional nice effect of this approach is that we also got a
configuration audit log for resources and policies.

Fixes #2504
2024-09-18 13:06:05 -06:00
Brian Manifold
716623a993 feat(portal): Add IDP sync error email notifications (#6483)
This adds a feature that will email all admins in a Firezone Account
when sync errors occur with their Identity Provider.

In order to avoid spamming admins with sync error emails, the error
emails are only sent once every 24 hours. The one exception is that a
successful sync resets the `sync_error_emailed_at` field, which means
that, in theory, if an identity provider were flip-flopping between
successful and unsuccessful syncs, the admins would be emailed more than
once in a 24-hour period.
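
The throttling rule described above can be modeled with a short sketch. This is Python for illustration only, not the portal's implementation; `record_sync` and the dictionary-based provider are hypothetical, while `sync_error_emailed_at` is the field named in this commit message.

```python
from datetime import datetime, timedelta

EMAIL_INTERVAL = timedelta(hours=24)


def record_sync(provider, succeeded, now):
    """Update provider state after a sync; return True if admins should be emailed."""
    if succeeded:
        # A successful sync resets the throttle.
        provider["sync_error_emailed_at"] = None
        return False
    last = provider.get("sync_error_emailed_at")
    if last is None or now - last >= EMAIL_INTERVAL:
        provider["sync_error_emailed_at"] = now
        return True
    # A failure inside the 24-hour window stays silent.
    return False
```

With this model, a failure, a success, and another failure within a few hours produce two emails, which is the flip-flopping case noted above.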

### Sample Email Message
<img width="589" alt="idp-sync-error-message"
src="https://github.com/user-attachments/assets/d7128c7c-c10d-4d02-8283-059e2f1f5db5">
2024-09-18 15:29:50 +00:00
Andrew Dryga
a6a1da7796 chore(portal): Bump Elixir deps (#6672)
We are most interested in tzdata, which had issues due to underlying
breaking change in the timezone database.
2024-09-12 11:15:06 -06:00
Andrew Dryga
f4f2b45d2b fix(portal): Reload client on updates (#6614)
2024-09-05 18:45:39 -07:00
Andrew Dryga
e72bb05436 feat(portal): Reinit client when itself or a known group were updated (#6609)
This allows us to push a whole set of resources at once when a client is
verified/unverified/updated/blocked.

Closes #6560
2024-09-05 16:51:47 -07:00
Andrew Dryga
1dae0a3ed5 fix(portal): Do not send resources not connected to any sites down to clients (#6512)
This is only possible for internet resources, any other resource will
always have at least one site connected at all times.

Closes #6510
2024-08-30 14:11:48 -06:00
Andrew Dryga
2a808292d0 feat(portal): Add blocked_tx_bytes to flow activity metrics (#6487)
Closes #4787
2024-08-29 14:21:51 -06:00
Andrew
7c6eac6af5 Hotfix: crash while rendering internet resources for gateways
2024-08-28 10:44:13 -06:00
Andrew Dryga
835fc4c8eb chore(portal): Bump all deps related to portal (#6445)
2024-08-28 10:40:02 -06:00
Thomas Eizinger
35017537c7 feat(gateway): allow out-of-order allow_access requests (#6403)
Currently, the gateway requires a strict ordering of first receiving a
`request_connection` message, followed by multiple `allow_access`
messages. Additionally, access can be granted as part of the initial
`request_connection` message too.

This isn't an ideal design. Setting up a new connection is infallible,
all we need to do is send our ICE credentials back to the client.
However, untangling that will require a bit more effort.

Starting with #6335, following this strict order on the client is more
difficult. Whilst we can send them in order, it is harder to maintain
those ordering guarantees across all our systems.

To avoid this, we change the gateway to perform an upsert for its local
ACLs for a client. If an `allow_access` call somehow reaches the gateway
first, we can simply create the `Peer` right away and only set up the
actual connection later.
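
The upsert idea can be illustrated with a small model. It is a Python sketch under stated assumptions, not the gateway's real code: the `Gateway` class, its `peers` dictionary, and the method bodies are hypothetical, and only the message names `request_connection` and `allow_access` come from this commit message.

```python
class Gateway:
    def __init__(self):
        # client_id -> {"resources": set of allowed IDs, "connected": bool}
        self.peers = {}

    def _upsert_peer(self, client_id):
        """Return the peer entry for this client, creating it if missing."""
        return self.peers.setdefault(
            client_id, {"resources": set(), "connected": False}
        )

    def allow_access(self, client_id, resource_id):
        # Works whether or not request_connection has been seen yet.
        self._upsert_peer(client_id)["resources"].add(resource_id)

    def request_connection(self, client_id):
        # Sets up the connection on an existing or freshly created peer.
        self._upsert_peer(client_id)["connected"] = True
```

Since both handlers go through the same upsert, the final state does not depend on which message arrives first.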

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2024-08-28 13:10:06 +00:00
Andrew Dryga
2d083379c6 feat(portal): Internet resources (#6299)
They will be sent in the API for connlib 1.3 and above.

I think in future we can make a whole menu section called "Internet
Security" which will be a specialized UI for the new resource type (and
now show it in Resources list) to improve the user experience around it.

Closes #5852

---------

Signed-off-by: Andrew Dryga <andrew@dryga.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2024-08-27 23:11:17 +00:00
Andrew Dryga
8e4a4a7b05 feat(portal): Pre-check constraint conformation on client connect (#6431)
Closes #6216
2024-08-26 15:30:46 -06:00
Andrew Dryga
25a22b4780 chore(portal): Test that we only render resources once in WS API (#6394)
2024-08-21 17:16:19 -06:00
Andrew Dryga
c922ea29e9 fix(portal): Fix DNS wildcard support for Gateways (#6270)
2024-08-12 12:54:20 -06:00
Andrew Dryga
00b93f6b82 feat(portal): Wildcard dns with backwards compatibility (#6214)
If a new resource is created that uses a format not supported by
previous client versions, we temporarily show a warning:
<img width="683" alt="Screenshot 2024-08-07 at 2 28 57 PM"
src="https://github.com/user-attachments/assets/bbfdfc96-0c4b-4226-93c5-bc2b5fdb9d30">

It will also be excluded from the `resources` list for older clients (below
1.2).
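
Filtering the list by client version could be sketched like this. The example is illustrative Python, not the portal's code; `min_client_version`, `parse_version`, and `visible_resources` are hypothetical names, and only the 1.2 cutoff comes from this commit message.

```python
def parse_version(version):
    """Return (major, minor) from a dotted version string such as "1.1.3"."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def visible_resources(resources, client_version):
    """Keep only the resources this client version is able to handle."""
    client = parse_version(client_version)
    return [
        r for r in resources
        # Resources without a minimum version are visible to every client.
        if parse_version(r.get("min_client_version", "0.0")) <= client
    ]
```

Comparing `(major, minor)` tuples avoids the string-comparison trap where "1.10" would sort below "1.2".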

---------

Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
2024-08-10 18:25:24 +00:00
Brian Manifold
0df2d34126 fix(portal): Update Resource definition in OpenAPI spec (#6234)
Update Resource definition in OpenAPI spec to include "connections" i.e.
which gateway groups/sites a new Resource would be connected to.

<img width="775" alt="Screenshot 2024-08-09 at 2 57 04 AM"
src="https://github.com/user-attachments/assets/502979b1-e928-4e36-91c0-ed7b62f7c4a8">
2024-08-09 22:45:20 +00:00
Jamil
83033d91ed fix(ux): Mention (Sites) on Gateway Groups section of REST API docs (#6161)
I'm thinking that if we just add `(Sites)` next to the Gateway Groups title,
that will be enough for users to make the connection.
2024-08-02 19:50:30 +00:00
Andrew Dryga
bf06534caf fix(portal): Prevent races during relay selection by only using the ones connected more than 5 seconds ago (#6111)
Closes #6099
This should make #6109 unnecessary in the short term.
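
A minimal sketch of the selection rule, in Python for illustration: relays that connected only moments ago are skipped so that a relay still settling in is not handed out. The `connected_at` field and the `eligible_relays` function are assumptions; only the 5-second threshold comes from the commit title.

```python
from datetime import datetime, timedelta

MIN_CONNECTED = timedelta(seconds=5)


def eligible_relays(relays, now):
    """Return the relays that have been connected for more than 5 seconds."""
    return [r for r in relays if now - r["connected_at"] > MIN_CONNECTED]
```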
2024-08-02 11:10:40 -06:00
Andrew Dryga
8e1eb2429d fix(portal): Decrease WS timeouts for relays and gateways (#6112)
Related to #6095
2024-07-31 16:34:52 -06:00
Brian Manifold
97df661626 fix(api): add missing path parameter (#6039) (#6041)
Looks like I forgot one:

https://validator.swagger.io/validator/debug?url=https%3A%2F%2Fapi.firez.one%2Fopenapi

Co-authored-by: Antoine <antoinelabarussias@gmail.com>
2024-07-25 15:23:20 +00:00
Brian Manifold
bdc4d85afa fix(api): fix generated openapi spec (#6008)
(External contribution)

Hi, first thanks to @bmanifold for his awesome work! I've not yet tested
the API but here is a first PR fixing various small mistakes in the
generated openapi spec:

- Schema names cannot contain spaces
- Add missing path parameters in the spec
- Remove duplicated endpoint for creating an identity (not sure about
  that, I'll let you check)

If you want to validate the generated spec you can paste it here:
https://editor.swagger.io/ (or at the bottom of your swagger ui)

Please review commit by commit

Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Antoine Labarussias <antoinelabarussias@gmail.com>
2024-07-24 15:59:15 +00:00
Brian Manifold
79c815fbbc feat(portal): Add REST API (#5579)
Why:

* In order to manage a large number of Firezone Sites, Resources,
Policies, etc., a REST API is needed, as clicking through the UI is too
time-consuming as well as prone to error. By providing a REST API,
Firezone customers will be able to manage things within their Firezone
accounts with code.
2024-07-20 04:20:43 +00:00
Andrew Dryga
c9a9c1864a fix(portal): Update client identity on every connection (#5697)
This identity must track the last sign-in method used by the client.

Closes #5633
2024-07-03 13:17:06 -06:00
Andrew Dryga
cfe777f389 fix(portal): Do not crash WebSocket when client version is invalid (#5525)
2024-06-26 18:50:43 -06:00
Andrew Dryga
66e0ee17e6 fix(portal): Prevent double-subscribing to various presence events (#5554)
Closes #5531
2024-06-25 19:36:27 -06:00
Andrew Dryga
40d7889dd1 fix(portal): Adopt LiveView 1.0 breaking changes (#5549)
Closes https://github.com/firezone/firezone/issues/5545
2024-06-25 13:30:46 -06:00
Andrew Dryga
eb7b3f62ab feat(portal): Select only compatible gateways during candidate selection (#5463)
2024-06-20 20:35:20 -06:00
Jamil
17ea02d1a9 fix(portal): Don't send null address_description (#5365)
In #5273, I assumed that connlib optionally expected
`address_description`, but this is not the case. That feature assumes
the admin will optionally enter `address_description` to **override**
the address shown in Clients. The Clients already expect an optional
type for `address_description` and implement the correct behavior.

This PR is a workaround to prevent breaking existing Clients until we
can be relatively sure most clients have upgraded, in ~2 months.
2024-06-14 01:56:16 +00:00
Jamil
7e533c42f8 refactor: Split releases for Clients and Gateways (#5287)
- Removes version numbers from infra components (elixir/relay)
- Removes version bumping from Rust workspace members that don't get
published
- Splits release publishing into `gateway-`, `headless-client-`, and
`gui-client-`
- Removes auto-deploying new infrastructure when a release is published.
Use the Deploy Production workflow instead.

Fixes #4397
2024-06-10 16:47:49 +00:00
Andrew Dryga
650d7d7998 feat(portal): Add Policy conditions (#5144)
Now policies can have additional conditions based on Client location
(country or IP range), the IdP used for sign-in, or the current time
of day in a given timezone. This covers use cases where employees
can access the production system only from certain countries (states can
be added later) or where contractors can only access internal tools
during working hours.
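
As a rough illustration of how such conditions combine, the Python sketch below checks a country allowlist and a time-of-day window. The condition keys (`countries`, `time_window`), the client fields, and `conditions_met` are hypothetical; the portal's actual condition format is not shown in this commit message.

```python
from datetime import time


def conditions_met(conditions, client):
    """Return True only if every configured condition holds for this client."""
    countries = conditions.get("countries")
    if countries is not None and client["country"] not in countries:
        return False
    window = conditions.get("time_window")
    if window is not None:
        start, end = window
        # local_time is assumed to be already converted to the policy's timezone.
        if not (start <= client["local_time"] <= end):
            return False
    # A policy with no conditions places no extra restriction.
    return True
```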

Closes https://github.com/firezone/firezone/issues/4743
Closes #4742
Closes #4741
Closes #4740


<img width="1728" alt="Screenshot 2024-05-31 at 13 50 53"
src="https://github.com/firezone/firezone/assets/1877644/55f509f2-0f49-4edb-8c03-7a5a6d884ccc">
<img width="1728" alt="Screenshot 2024-05-31 at 13 50 56"
src="https://github.com/firezone/firezone/assets/1877644/756bb03f-4024-4978-ac85-6daa918ae037">
<img width="1728" alt="Screenshot 2024-05-31 at 13 51 01"
src="https://github.com/firezone/firezone/assets/1877644/cf159a86-077f-4ada-9952-9e8d399d0dc1">
<img width="1728" alt="Screenshot 2024-05-31 at 13 51 03"
src="https://github.com/firezone/firezone/assets/1877644/c070719e-2d4b-41bd-ad03-430baf2dbe9b">
<img width="676" alt="Screenshot 2024-05-31 at 14 56 06"
src="https://github.com/firezone/firezone/assets/1877644/435a4951-479d-4371-99c4-29a055348175">
2024-06-09 12:46:35 -06:00
Thomas Eizinger
d27a7a3083 feat(relay): support custom turn port (#5208)
Original PR: #5130.

Co-authored-by: Antoine <antoinelabarussias@gmail.com>
2024-06-05 04:04:17 +00:00