442 Commits

Author SHA1 Message Date
Jamil
1a806f3399 fix(portal): prefix privileged cmds with sudo (#10978)
The copy-paste functionality for these is broken if you are not already
on a root shell. If you are, then prefixing with `sudo` is essentially a
no-op and doesn't hurt.

To reduce friction here with the vast majority of end-user VMs we prefix
all privileged commands with `sudo` for them.

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-27 04:14:51 +00:00
Thomas Eizinger
bce2aa30b5 feat(portal): extend DNS settings to allow for DoH providers (#10882)
In order to allow customers to make use of connlib's DoH functionality,
we need a configuration UI for it. We take inspiration from the "New
Resource" page and implement a 3-choice UI component for configuring how
Clients should resolve DNS queries:

- System
- Secure DNS
- Custom

The secure and custom DNS options show an additional form when selected
for either picking a DoH provider or the addresses of the custom DNS
servers.

Right now, the "Secure DNS" part is disabled if the
`DISABLE_DOH_PROVIDER` env variable is set. We render a "Coming soon"
tooltip on hover:

<img width="1534" height="1100" alt="image"
src="https://github.com/user-attachments/assets/a12a6ba4-806f-4d19-8aea-5c1cd981d609"
/>

This allows us to test this in staging and still ship to production if
needed prior to enabling it.

Resolves: #10792
Resolves: #10786

---------

Co-authored-by: Jamil Bou Kheir <jamilbk@users.noreply.github.com>
2025-11-22 06:48:07 +00:00
Jamil
8f6f6666a1 fix(portal): phx-ignore checkbox changes (#10879)
On Resources and Policies forms, we were triggering the form's
validation helpers when checking and unchecking a checkbox.
Unfortunately this causes the checkbox to be reset since it was not
saved across the to_form(changeset) rebuilding.

To prevent this, we simply ignore checkbox changes from triggering form
validations. We can also remove the `field=` property on these because
we are setting the `checked` property ourselves.

This will be refactored to be made simpler with the new modals approach,
so a minimal fix is implemented for now.

Related:
https://firezonehq.slack.com/archives/C098RV5BL1K/p1763024374383399
Fixes: #9143
2025-11-13 16:16:24 +00:00
Thomas Eizinger
189c358975 feat(portal): add Debian/Ubuntu deployment tab (#10741)
Now that we have an APT repository for Debian / Ubuntu packages, we
should also tell our users about it. We introduce a new "Debian /
Ubuntu" tab on the deployments screen in the portal. The tab is selected
by default as it should provide the best user experience for manually
deployed Gateways:

- Updates are as easy as `sudo apt upgrade`
- The systemd file and token are fully managed in the background

Here is what the new tab looks like:

<img width="679" height="786" alt="image"
src="https://github.com/user-attachments/assets/da69fc55-6a6a-476d-bed4-634dd05df8bc"
/>


Resolves: #10701

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2025-11-11 02:18:33 +00:00
Jamil
f2f8665c6a fix(portal): renew session on sign in (#10616)
When signing in, it's a good idea to clear any previous session cookie
and regenerate it, preventing the chance that any unchecked data in a
possibly fixated session cookie is used.
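
A minimal, framework-agnostic sketch of the idea in Python (the portal itself is Elixir/Phoenix; `SessionStore` and its methods are hypothetical and are not the portal's API). On sign-in the pre-login session is discarded and a fresh session ID is issued:

```python
import secrets


class SessionStore:
    """Hypothetical in-memory session store, for illustration only."""

    def __init__(self):
        self._sessions = {}

    def create(self, data=None):
        # Issue an unguessable, never-before-seen session ID.
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = dict(data or {})
        return session_id

    def get(self, session_id):
        return self._sessions.get(session_id)

    def sign_in(self, old_session_id, user_id):
        # Drop the pre-login session entirely so neither its ID nor any data
        # an attacker may have planted in it survives authentication.
        self._sessions.pop(old_session_id, None)
        return self.create({"user_id": user_id})
```

After `sign_in`, the old ID no longer resolves to a session, so a fixated cookie is useless to whoever planted it.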
2025-10-21 15:21:07 -07:00
Jamil
2729a7731b chore(portal): remove dead Web.ControllerDocumentation (#10619) 2025-10-21 03:07:36 +00:00
Brian Manifold
27565ea5c8 refactor(portal): remove soft delete elements from portal code (#10607)
Why:

* In previous commits, the portal code had been updated to use hard
deletion rather than soft deletion of data. The fields used in the soft
deletion were still kept in the DB and the code to allow for zero
downtime rollout and an easy rollback if necessary. To continue with
that work the portal code has now been updated to remove any reference
to the soft deleted fields (e.g. deleted_at, persistent_id, etc...).
While the code has been updated the actual data in the DB will need to
remain for now, to once again allow for a zero downtime rollout. Once
this commit has been deployed to production another PR can follow to
remove the columns from the necessary tables in the DB.


Related: #8187
2025-10-18 17:02:26 +00:00
Jamil
cfc410626c chore(portal): remove unused nimble_csv dep (#10548)
I believe this was added to export certain live tables as CSV, and it
won't be used soon.
2025-10-13 22:50:21 +00:00
Jamil
d329880ec8 fix(portal): don't use Web functions from Domain (#10546)
Fixes an issue introduced in #10510 where Web functions (like
VerifiedRoutes) cannot be called from Domain because they are not
available in the release.

This happens to work in dev mode because everything is available under
the same dev context.
2025-10-13 20:24:46 +00:00
Jamil
b61fd20de8 chore(portal): remove Jason in favor of JSON (#10550)
Since Elixir 1.18, json encoding and decoding support is included in the
standard library. This is built on OTP's native json support which is
often faster than other implementations.

It mostly has the same API as the popular Jason library, differing
mainly in the format of the error responses returned when decoding
fails.

To minimize dependence on external libraries, we remove the Jason lib in
favor of the standard library's JSON module.

Fixes #8011
2025-10-13 17:39:53 +00:00
Jamil
bb089846d7 chore(portal): bump phoenix to 1.8 (#10510)
Bumps Phoenix to 1.8 and Phoenix LiveView to 1.1. As part of the bump a
number of issues had to be addressed. Comments inline provide more
context.

Supersedes #10475 
Supersedes #10448
2025-10-10 15:08:50 +00:00
Jamil
f07d2932dc feat(portal): show outdated clients (#10456)
Clients more than one minor version away from a particular gateway won't
be able to connect, so it would be useful to help admins recognize when
this might be the case and encourage them to upgrade.

We accomplish that with two small UX improvements in this PR:

- In the outdated gateways email, we show them a count of clients that
will no longer be able to connect to the new gateway version, linking
them to a sorted view of the clients table
- In the clients table, we add a new sortable `version` column to allow
admins to see which clients are outdated

Fixes #7727 
Fixes #10385

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
2025-09-30 13:23:30 +00:00
Jamil
81ddf22aa0 fix(portal): use href for non-live routes (#10407)
When redirecting to paths that don't have LiveViews attached to them,
LiveView complains and emits a warning. To reduce alarm noise this PR
attempts to fix the issue.
2025-09-22 15:35:13 +00:00
Brian Manifold
e2e370fd76 fix(portal): fix client show page sign-in method (#10327) 2025-09-11 04:33:56 +00:00
Brian Manifold
826a304071 feat(portal): enable outdated gateway email (#10281)
Enables 'outdated gateway' notifications for all accounts.

Closes #8361
2025-09-04 03:56:01 +00:00
Brian Manifold
6bd19ee9b0 refactor(portal): hard delete data (#9694) 2025-08-29 22:13:44 +00:00
Jamil
cafe6554ff refactor(portal): reduce cache memory usage (#10058)
Napkin math shows that we can save substantial memory (~3x or more) on
the API nodes as connected clients/gateways grow if we just store the
fields we need in order to keep the client and gateway state maintained
in the channel pids.

To facilitate this, we create new `Cacheable` structs that represent
their `Domain` cousins, which use byte arrays for `id`s and strip out
unused fields.
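
As a rough illustration of the `id` part of that saving, here is a Python sketch that assumes the ids are UUIDs (an assumption; the example value is made up). The canonical text form takes 36 characters, while the raw form takes 16 bytes:

```python
import uuid

# Made-up example id; assumes ids are UUIDs.
text_id = "9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d"
binary_id = uuid.UUID(text_id).bytes

print(len(text_id))    # 36
print(len(binary_id))  # 16

# The conversion is lossless, so the text form can be rebuilt when needed.
assert str(uuid.UUID(bytes=binary_id)) == text_id
```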

Additionally, all business logic involved with maintaining these caches
is now contained within two modules: `Domain.Cache.Client` and
`Domain.Cache.Gateway`, and type specs have been added to aid in static
analysis and code documentation.

Comprehensive testing is now added not only for the cache modules, but
for their associated channel modules as well to ensure we handle
different kinds of edge cases gracefully.

The `Events` nomenclature was renamed to `Changes` to better name what
we are doing: Change-Data-Capture.

Lastly, the following related changes are included in this PR since they
were "in the way" so to speak of getting this done:

- We save the last received LSN in each channel and drop the `change`
with a warning if we receive it twice in a row, or we receive it out of
order
- The client/gateway version compatibility calculations have been moved
to `Domain.Resources` and `Domain.Gateways` and have been simplified to
make them easier to understand and maintain going forward.
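
The LSN check in the first bullet can be sketched as follows. This is a Python illustration only; `ChannelState` and `handle_change` are hypothetical names, not the portal's Elixir modules, and LSNs are modelled as plain integers:

```python
class ChannelState:
    """Hypothetical stand-in for the state a channel process keeps."""

    def __init__(self):
        self.last_lsn = 0
        self.applied = []

    def handle_change(self, lsn, change):
        # A repeated or older LSN means the change is a duplicate or arrived
        # out of order, so it is dropped (the PR logs a warning at this point).
        if lsn <= self.last_lsn:
            return False
        self.last_lsn = lsn
        self.applied.append(change)
        return True
```

Because the comparison is `<=`, the same guard covers both cases named above: the same change arriving twice in a row and a change arriving behind a newer one.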


Related: #10174 
Fixes: #9392 
Fixes: #9965
Fixes: #9501 
Fixes: #10227

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-22 21:52:29 +00:00
Jamil
f379e85e9b refactor(portal): cache access state in channel pids (#9773)
When changes occur in the Firezone DB that trigger side effects, we need
some mechanism to broadcast and handle these.

Before, the system we used was:

- Each process subscribes to a myriad of topics related to data it wants
to receive. In some cases it would subscribe to new topics based on
received events from existing topics (i.e. flows in the gateway
channel), and sometimes in a loop. It would then need to be sure to
_unsubscribe_ from these topics
- Handle the side effect in the `after_commit` hook of the Ecto function
call after it completes
- Broadcast only a simple (thin) event message with a DB id
- In the receiver, use the id(s) to re-evaluate, or lookup one or many
records associated with the change
- After the lookup completes, `push` the relevant message(s) to the
LiveView, `client` pid, or `gateway` pid in their respective channel
processes

This system had a number of drawbacks ranging from scalability issues to
undesirable access bugs:

1. The `after_commit` callback, on each App node, is not globally
ordered. Since we broadcast a thin event schema and read from the DB to
hydrate each event, this meant we had a `read after write` problem in
our event architecture, leading to the potential for lost updates. Case
in point: if a policy is updated from `resource_id-1` to
`resource_id-2`, and then back to `resource_id-1`, it's possible that,
given the right amount of delay, the gateway channel will receive two
`reject_access` events for `resource_id-1`, as opposed to one for
`resource_id-1` and one for `resource_id-2`, leading to the potential
for unauthorized access.
1. It was very difficult to ensure that the correct topics were being
subscribed to and unsubscribed from, and the correct number of times,
leading to maintenance issues for other engineers.
1. We had a nasty N+1 query problem whenever memberships were added or
removed, which caused essentially all access related to that
membership (so all Policies touching its actor group) to be
re-evaluated and broadcasted. This meant that any bulk addition or
deletion of memberships would generate so many queries that they'd
timeout or consume the entire connection pool.
1. We had no durability for side-effect processing. In some places, we
were iterating over many returned records to send broadcasts.
Broadcasting is not a zero-time operation; each call takes a small
amount of CPU time to copy the message into the receiver's mailbox. If
we deployed while this was happening, the state update would be lost
forever. If this was a `reject_access` for a Gateway, the Gateway would
never remove access for that particular flow.
1. On each flow authorization, we needed to hit `us-east1` not only to
"authorize" the flow, but to log it as well. This incurs latency
especially for users in other parts of the world, which happens on
_each_ connection setup to a new resource.
1. Since we read and re-authorize access due to the thin events
broadcasted from side effects, we risk hitting thundering herd problems
(see the N+1 query problem above) where a single DB change could result
in all receivers hitting the DB at once to "hydrate" their
processing.
1. If an administrator modifies the DB directly, or, if we need to run a
DB migration that involves side effects, they'll be lost, because the
side effect triggers happened in `after_commit` hooks that are only
available when querying the DB through Ecto. Manually deleting (or
resurrecting) a policy, for example, would not have updated any
connected clients or gateways with the new state.


To fix all of the above, we move to the system introduced in this PR:

- All changes are now serialized (for free) by Postgres and broadcasted
as a single event stream
- The number of topics has been reduced to just one, the `account_id` of
an account. All receivers subscribe to this one topic for the lifetime
of their pid and then only filter the events they want to act upon,
ignoring all other messages
- The events themselves have been turned into "fat" structs based on the
schemas they present. By making them properly typed, we can apply things
like the existing Policy authorizer functions to them as if we had just
fetched them from the DB.
- All flow creation now happens in memory and doesn't need to incur
a DB hit in `us-east1` to proceed.
- Since clients and gateways now track state in a push-based manner from
the DB, this means very few actual DB queries are needed to maintain
state in the channel procs, and it also means we can be smarter about
when to send `resource_deleted` and `resource_created_or_updated`
appropriately, since we can always diff between what the client _had_
access to, and what they _now_ have access to.
- All DB operations, whether they happen from the application code, a
`psql` prompt, or even via Google SQL Studio in the GCP console, will
trigger the _same_ side effects.
- We now use a replication consumer based off Postgres logical decoding
of the write-ahead log using a _durable slot_. This means that Postgres
will retain _all events_ until they are acknowledged, giving us the
ability to ensure at-least-once processing semantics for our system.
Today, the ACK is simply, "did we broadcast this event successfully".
But in the future, we can assert that replies are received before we
acknowledge the event as processed back to Postgres.
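
The diffing mentioned above, between what a client _had_ access to and what it _now_ has access to, can be sketched in Python. The function name and the dict shape (resource id mapped to attributes) are hypothetical and chosen only for illustration:

```python
def diff_access(had, now):
    """Return (deleted_ids, created_or_updated_ids) between two snapshots.

    `had` and `now` map resource id -> resource attributes (hypothetical shape).
    """
    # Resources the client could reach before but not any more.
    deleted = sorted(set(had) - set(now))
    # Resources that are new, or whose attributes changed.
    created_or_updated = sorted(
        rid for rid, attrs in now.items() if had.get(rid) != attrs
    )
    return deleted, created_or_updated
```

The first list would drive `resource_deleted` messages and the second `resource_created_or_updated` messages; unchanged resources produce nothing.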



The tests in this PR have been updated to pass given the refactor.
However, since we are tracking more state now in the channel procs, it
would be a good idea to add more tests for those edge cases. That is
saved as a later PR because (1) this one is already huge, and (2) we
need to get this out to staging to smoke test everything anyhow.

Fixes: #9908 
Fixes: #9909 
Fixes: #9910
Fixes: #9900 
Related: #9501
2025-07-18 22:47:18 +00:00
Thomas Eizinger
8e5ce66810 feat(gateway): don't apply traffic filters to ICMP errors (#9834)
Firezone uses ICMP errors to signal to client applications that e.g. a
certain IP is not reachable. This happens for example if a DNS resource
only resolves to IPv4 addresses yet the client application attempted to
use an IPv6 proxy address to connect to it.

In the presence of traffic filters for such a resource that does _not_
allow ICMP, we currently filter out these ICMP errors because - well -
ICMP traffic is not allowed! However, even in the presence of ICMP
traffic being allowed, we would fail to evaluate this filter because the
ICMP error packet is not an ICMP echo reply and therefore doesn't have
an ICMP identifier. We require this in the DNS resource NAT to identify
"connections" and NAT them correctly. The same L4 component is used to
evaluate the traffic filters.

ICMP errors are critical to many usage scenarios and algorithms like
happy-eyeballs. Dropping them usually results in weird behaviour as
client applications can then only react to timeouts.
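
A minimal Python sketch of the resulting rule (the gateway itself is not written in Python, and which ICMP types it treats as errors is an assumption here; the sets below follow the error/query split in RFC 1122 and the error/informational split in RFC 4443):

```python
# ICMPv4 error messages per RFC 1122: destination unreachable (3),
# source quench (4), redirect (5), time exceeded (11), parameter problem (12).
ICMPV4_ERROR_TYPES = {3, 4, 5, 11, 12}


def is_icmp_error(icmp_type, ipv6=False):
    # In ICMPv6 (RFC 4443), types 0-127 are error messages and 128-255 are
    # informational (echo request is 128, echo reply is 129).
    if ipv6:
        return 0 <= icmp_type <= 127
    return icmp_type in ICMPV4_ERROR_TYPES


def passes_filter(icmp_type, icmp_allowed, ipv6=False):
    # ICMP errors skip the traffic filter; all other ICMP traffic obeys it.
    return is_icmp_error(icmp_type, ipv6) or icmp_allowed
```

With this rule a "destination unreachable" reaches the client even when the resource's filter does not allow ICMP, while an echo request is still subject to the filter.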
2025-07-11 13:20:37 +00:00
Brian Manifold
83e71f45b8 fix(portal): catch all errors when sending welcome email (#9776)
Why:

* We were previously only catching the `:rate_limited` error when
sending welcome emails. This update adds a catch-all case to gracefully
handle the error and alert us.

---------

Signed-off-by: Brian Manifold <bmanifold@users.noreply.github.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
2025-07-03 21:41:12 +00:00
Jamil
dddd1b57fc refactor(portal): remove flow_activities (#9693)
This has been dead code for a long time. The feature this was meant to
support, #8353, will require a different domain model, views, and user
flows.

Related: #8353
2025-06-27 20:40:25 +00:00
Jamil
0b09d9f2f5 refactor(portal): don't rely on flows.expires_at (#9692)
The `expires_at` column on the `flows` table was never used outside of
the context in which the flow was created in the Client Channel. This
ephemeral state, which is created in the `Domain.Flows.authorize_flow/4`
function, is never read from the DB in any meaningful capacity, so it
can be safely removed.

The `expire_flows_for` family of functions now simply reads the needed
fields from the flows table in order to broadcast `{:expire_flow,
flow_id, client_id, resource_id}` directly to the subscribed entities.

This PR is step 1 in removing the reliance on `Flows` to manage
ephemeral access state. In a subsequent PR we will actually change the
structure of what state is kept in the channel PIDs such that reliance
on this Flows table will no longer be necessary.

Additionally, in a few places, we were referencing a Flows.Show view
that was never available in production, so this dead code has been
removed.

Lastly, the `flows` table subscription and associated hook processing
has been completely removed as it is no longer needed. We've implemented
in #9667 logic to remove publications from removed table subscriptions,
so we can expect to get a couple ingest warnings when we deploy this as
the `Hooks.Flows` processor no longer exists, and the WAL data may have
lingering flows records in the queue. These can be safely ignored.
2025-06-27 18:29:12 +00:00
Jamil
ff5a632d2a fix(portal): only show never synced correctly (#9652)
It's confusing that we clear this field upon sync failure. Instead, we
let it track the time of the last sync.

Will be cleaned up in #6294 so just applying a minimal fix now.

Fixes #7715
2025-06-24 22:54:30 +00:00
Brian Manifold
e5914af50f fix(portal): Add more logging around OIDC setup (#9555)
Why:

* Adding some simple logging around OIDC calls to help with better
debugging.
* Removing the `opentelemetry_liveview` package as it has been pulled
into the `opentelemetry_phoenix` package that we are already using.
2025-06-17 16:52:33 +00:00
Brian Manifold
25434c6898 fix(portal): update non-root layout to use main.css (#9533)
After updating the CSS config to use `main.css` in the portal the root
layout was updated, but there were a small number of one-off templates
that do not use the root layout and those pages were not updated with
the new `main.css` file. This commit updates those non-root templates.

Fixes #9532
2025-06-15 15:31:45 +00:00
Jamil
c6545fe853 refactor(portal): consolidate pubsub functions (#9529)
We issue broadcasts and subscribes in many places throughout the portal.
To help keep the cognitive overhead low, this PR consolidates all PubSub
functionality to the `Domain.PubSub` module.

This allows for:

- better maintainability
- see all of the topics we use at a glance
- consolidate repeated functionality (saved for a future PR)
- use the module hierarchy to define function names, which feels more
intuitive when reading and sets a convention

We also introduce a `Domain.Events.Hooks` behavior to ensure all hooks
comply with this simple contract, and we also introduce a convention to
standardize on topic names using the module hierarchy defined herein.

Lastly, we add convenience functions to the Presence modules to save a
bit of duplication and chance for errors.

This will make it much easier to maintain PubSub going forward.


Related: #9501
2025-06-15 04:30:57 +00:00
Jamil
cbe33cd108 refactor(portal): move policy events to WAL (#9521)
Moves all of the policy lifecycle events to be broadcasted from the WAL
consumer.

#### Test

- [x] Enable policy
- [x] Disable policy
- [x] Delete policy
- [x] Non-breaking change
- [x] Breaking change


Related: #6294

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
2025-06-14 01:10:09 +00:00
Jamil
817eeff19f refactor(portal): simplify managed groups (#9513)
In many places throughout the portal codebase, we called a function
"update_dynamic_group_memberships/1" which recomputed all of the
dynamic/managed memberships for a particular account, and reapplied them
to each affected group.

Since the `has_many :memberships` relationship used `on_replace:
:delete`, this caused Ecto to delete _all_ the `Everyone` group
memberships, and reinsert them on each sync.

Since each membership change triggers a policy re-evaluation for all
policies to the affected actor
(`Policies.broadcast_access_events_for/3`), this in effect was causing a
massive amount of queries to be triggered upon each sync job as each
membership deletion and insertion triggered a lookup for all resources
available to that particular actor.

To fix this, we introduce the following changes:

- Remove `dynamic` group type. This will never be used as it will create
an immense amount of complexity for any organization trying to manage
groups this way
- Refactor `update_dynamic_group_memberships/1` to use a smarter query
that first gathers all the _needed_ changes and applies them within a
transaction using Ecto.Multi. Previously all memberships would be rolled
over unconditionally due to the `on_replace: :delete` option on the
relationship. Note that the option is still there, but we generally
don't set memberships on groups any longer unless editing the affected
group directly, where the everyone group doesn't apply.
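
The "gather only the needed changes" step can be sketched with set differences. This is a Python illustration rather than the actual Ecto query, and `plan_membership_changes` is a hypothetical name:

```python
def plan_membership_changes(current, desired):
    """Compute the minimal inserts/deletes to move `current` to `desired`.

    Both arguments are sets of (group_id, actor_id) pairs.
    """
    to_insert = desired - current
    to_delete = current - desired
    return to_insert, to_delete
```

When nothing has changed both sets are empty, so a sync that leaves memberships untouched triggers no deletions, no insertions, and therefore no policy re-evaluation.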

Resolves: #8407 
Resolves: #8408
Related: #6294

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-06-13 18:55:37 +00:00
Jamil
c31f51d138 refactor(portal): move resource events to WAL (#9406)
We move the resource events to the WAL system. Notably, we no longer
need `fetch_and_update_breakable` for resource updates, so a bit of
refactoring is included to update the call sites for those.

Additionally, we need to add a `Flow.expire_flows_for_resource_id/1`
function to expire flows from the WAL system. This is now being called
in the WAL event handler. To prevent this from blocking the WAL
consumer/broadcaster, we wrap it with a Task.async. These will be
cleaned up when the lookup table for access is implemented next.

Another thing to note is that we lose the `subject` when moving from
`Flows.expire_flows_for(%Resource{}, subject)` to
`Flows.expire_flows_for_resource_id(resource_id)` when a resource is
deleted or updated by an actor since we respond to this event in the WAL
where that data isn't available. However, we don't actually _use_ the
subject when expiring flows (other than authorize the initial resource
update), so this isn't an issue.

Related: #9501

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Brian Manifold <bmanifold@users.noreply.github.com>
2025-06-11 00:12:45 +00:00
Brian Manifold
d4c7b48754 refactor(portal): update asset config in portal (#9504)
Why:

* This commit brings our web app in line with how new Phoenix
applications manage and configure js/css/font assets. Along with that,
this commit updates our Tailwind and esbuild tools.
2025-06-10 23:00:44 +00:00
Jamil
6fc7d2e4e0 feat(portal): configurable ip stack for DNS resources (#9303)
Some poorly-behaved applications (e.g. mongo) will fail to connect if
they see both IPv4 and IPv6 addresses for a DNS resource, because they
will try to connect to both of them and fail the whole connection setup
if either one is not routable.

To fix this, we need to introduce a knob to allow admins to restrict DNS
resources to only A or AAAA records.
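
A rough Python sketch of what such a knob does to a set of resolved addresses. The function name and the `ip_stack` values (`"dual"`, `"ipv4_only"`, `"ipv6_only"`) are hypothetical and may not match the names used in the portal:

```python
import ipaddress


def filter_by_ip_stack(addresses, ip_stack):
    """Keep only the addresses permitted by `ip_stack`.

    The `ip_stack` values used here are hypothetical names.
    """
    allowed_versions = {
        "dual": {4, 6},
        "ipv4_only": {4},  # A records only
        "ipv6_only": {6},  # AAAA records only
    }[ip_stack]
    return [
        addr
        for addr in addresses
        if ipaddress.ip_address(addr).version in allowed_versions
    ]
```

For example, `filter_by_ip_stack(["192.0.2.10", "2001:db8::10"], "ipv4_only")` keeps only the IPv4 address, so an application that insists on every returned address being routable never sees the IPv6 one.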


<img width="750" alt="Screenshot 2025-06-02 at 10 48 39 AM"
src="https://github.com/user-attachments/assets/4dbcb6ae-685f-43ee-b9e8-1502b365a294"
/>

<img width="1174" alt="Screenshot 2025-06-02 at 11 05 53 AM"
src="https://github.com/user-attachments/assets/02d0a4b3-e6e8-4b6d-89fa-d3d999b5811e"
/>

---

Related:
https://firezonehq.slack.com/archives/C08KPQKJZKM/p1746720923535349
Related: #9300
Fixes: #9042
2025-06-03 02:24:41 +00:00
Jamil
73c3e2d87b refactor(portal): move gateway events to WAL (#9299)
This PR moves Gateway events to be triggered by the WAL broadcaster.
Some things of note that are cleaned up:

- The gateway `:update` event was never received anywhere (but in a
test) and so has been removed
- The account topic has been removed as it was also never acted upon
anywhere. Presence yes, but topic no
- The group topic has also been removed as it was only used to receive
broadcasted disconnects when a group is deleted, but this was already
handled by the token deletion and so is redundant.
2025-06-01 16:40:28 +00:00
Jamil
23bae8f878 fix(portal): Use account param for autoredirect (#9304)
When the client is connecting for the first time without any cookies
loaded the `conn.assigns.account` is non-existent, causing a `KeyError`.

Instead, we should be loading this param from the URL and fetching the
account from it.
2025-05-30 21:23:25 +00:00
Brian Manifold
a51b35a6b4 refactor(portal): remove created_by_<identity/actor> columns (#9306)
Why:

* Now that we have started using the `created_by_subject` field on
various tables, we no longer need to keep the
`created_by_<identity/actor>` fields. This will help remove a foreign
key reference and will be one step closer to allowing us to hard delete
data rather than soft deleting all data in order to keep foreign key
references like these.
2025-05-30 21:06:35 +00:00
Jamil
6cea0cd6ec refactor(portal): Move client updates to WAL broadcaster (#9288)
Client updates are next on the path to moving more side effects to the
WAL broadcaster. This one has the following notable changes:

- ~~The `actor_clients` pubsub topic were only used to broadcast removal
of clients belonging to an actor; these are no longer needed since we
handle this in the individual removal event~~ EDIT: only the presence is
kept
- The `account_clients:{account_id}` pubsub and presence topic
definition has been moved to `Events.Hooks.Accounts` because these are
broadcasted using the account_id field based on account changes, and
have nothing to do with the client lifecycle


Related: #6294 
Related: #8187
2025-05-29 16:56:08 +00:00
Brian Manifold
1358da189d refactor(portal): start using created_by_subject (#9284)
Now that we've added the `created_by_subject` column on all relevant
tables, we can start using that data in the portal.
2025-05-28 14:57:36 +00:00
Jamil
3659b07259 fix(portal): Fix capitalization for All Identity Providers (#9241) 2025-05-26 17:30:01 +00:00
Jamil
ca59492003 fix(portal): bump width of default auth provider selection (#9174)
This is just a bit short at the moment:

<img width="467" alt="Screenshot 2025-05-16 at 3 55 55 PM"
src="https://github.com/user-attachments/assets/6d4b6d6d-d3a2-453e-a860-cb638127f684"
/>

---------

Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-16 16:20:47 -07:00
Jamil
65c58ee254 feat(portal): Zero-click client authentication (#9144)
Adds a new field to `settings/identity_providers` that allows an Admin
to designate any non-email/otp provider as the `default` for client
authentication. Clients will then navigate directly to the provider's
`/redirect` endpoint when authenticating, which in many cases will
automatically sign them in.

No existing providers are updated in this PR.



https://github.com/user-attachments/assets/7b962a25-76fd-491f-a194-60ed993821fc
2025-05-16 19:26:08 +00:00
Brian Manifold
dd5a53f686 fix(portal): Fix sign_up to properly populate email (#9105)
Why:

* During the account sign up flow, the email of the first admin was not
being populated in the `email` column on the auth_identities table. This
was due to atoms being passed in the attrs instead of strings to the
`create_identity` function. A migration was also created to backfill the
missing emails in the `auth_identities` table.
2025-05-13 19:49:25 +00:00
Brian Manifold
3f3f007920 fix(portal): Update copy to clipboard button (#8907)
Why:

* The copy to clipboard button was not working at all on the API new
token page because the FlowbiteJS library expects the elements to be
present in the DOM on first render, which was not true of the API Token
code block. In addition, the copy to clipboard buttons on existing code
blocks gave no visual indication that the copy had completed, and they
were somewhat difficult to see. This commit makes the buttons more
visible, adds a phx-hook to make sure the FlowbiteJS init functions are
run on every code block even if it's inserted after the initial load of
the page, and adds callback functions that toggle the button text and
icon to show that the text has been copied.
2025-04-26 00:43:43 +00:00
Jamil
0a2a393d4c fix(portal): Prevent additional email identities per actor (#8888)
This is a UI-only change for now to serve as a stop-gap while we work to
overhaul the identity domain model.

Related: #6294
2025-04-22 21:13:37 +00:00
Jamil
8293e6c440 fix(portal): Don't peek groups for api_client actors (#8890)
API clients don't belong to any actor_groups and attempting to deep link
into the `groups` section when viewing an actor raises a 500 error.

This PR fixes that by removing the deep link into `actor_groups` from
the actors index view.
2025-04-22 13:59:06 +00:00
Brian Manifold
4c9848453d refactor(portal): Add more logging around sign in errors (#8789)
Why:

* To allow for more accurate and efficient troubleshooting in
production.
2025-04-15 14:25:06 +00:00
Jamil
649c03e290 chore(portal): Bump LoggerJSON to 7.0.0, fixing config (#8759)
There was a slight API change in the way LoggerJSON's configuration is
generated, so I took the time to do a little fixing and cleanup here.

Specifically, we should be using the `new/1` callback to create the
Logger config which fixes the below exception due to missing config
keys:

```
FORMATTER CRASH: {report,[{formatter_crashed,'Elixir.LoggerJSON.Formatters.GoogleCloud'},{config,[{metadata,{all_except,[socket,conn]}},{redactors,[{'Elixir.LoggerJSON.Redactors.RedactKeys',[<<"password">>,<<"secret">>,<<"nonce">>,<<"fragment">>,<<"state">>,<<"token">>,<<"public_key">>,<<"private_key">>,<<"preshared_key">>,<<"session">>,<<"sessions">>]}]}]},{log_event,#{meta => #{line => 15,pid => <0.308.0>,time => 1744145139650804,file => "lib/logger.ex",gl => <0.281.0>,domain => [elixir],application => libcluster,mfa => {'Elixir.Cluster.Logger',info,2}},msg => {string,<<"[libcluster:default] connected to :\"web@web.cluster.local\"">>},level => info}},{reason,{error,{badmatch,[{metadata,{all_except,[socket,conn]}},{redactors,[{'Elixir.LoggerJSON.Redactors.RedactKeys',[<<"password">>,<<"secret">>,<<"nonce">>,<<"fragment">>,<<"state">>,<<"token">>,<<"public_key">>,<<"private_key">>,<<"preshared_key">>,<<"session">>,<<"sessions">>]}]}]},[{'Elixir.LoggerJSON.Formatters.GoogleCloud',format,2,[{file,"lib/logger_json/formatters/google_cloud.ex"},{line,148}]}]}}]}
```

Supersedes #8714

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-11 19:00:06 -07:00
Jamil
d2fd57a3b6 fix(portal): Attach Sentry in each umbrella app (#8749)
- Attaches the Sentry Logging hook in each of [api, web, domain]
- Removes errant Sentry logging configuration in config/config.exs
- Fixes the exception logger to default to logging exceptions, use
`skip_sentry: true` to skip

Tested successfully in dev. Hopefully the cluster behaves the same way.

Fixes #8639
2025-04-11 04:17:12 +00:00
Jamil
05dafabbad fix(portal): Fix human display of geo location (#8665)
These seem to be swapped. Generally accepted is `city, country`.
2025-04-09 01:28:35 +00:00
Jamil
95d3f765f4 feat(portal): Show Internet Resource in resources/index (#8495)
After removing some of the functionality for viewing the Internet
Resource, a customer was confused about where to find it again.

This places an `Internet` section in the Resources index page (similar
to Sites page) with a short help text and an action button to view the
Internet Resource.

This also adds a convenient helper that allows us to route to
`/#{account}/resources/internet` for a nicer-looking URL that users can
bookmark if needed.

<img width="1423" alt="Screenshot 2025-03-19 at 11 52 31 PM"
src="https://github.com/user-attachments/assets/f2da1c31-92b2-429e-832f-73ddd0524155"
/>


Fixes #8479
2025-03-26 21:30:11 +00:00
Brian Manifold
3313e7377e feat(portal): Add account delete button (#8487)
Why:

* This commit will allow account admins to send a request through the
Firezone portal to schedule a deletion of their account, rather than
having the account admins email their request manually. Doing this
through the portal allows us to verify that the request actually came
from an admin of the account.
2025-03-19 18:23:32 +00:00
Jamil
366215b1d6 fix(gateway): Prefer setting FIREZONE_ID over /var/lib/firezone (#8475)
When deploying a Gateway from the admin portal UI, we show various
environment variables required for setup. Until now, we've relied on the
`/var/lib/firezone` persistence method for identifying the Gateway.

However, this can cause issues on some systems that don't have writeable
access to /var/lib/firezone, or old versions of systemd that don't
support sandboxed access to this directory.

This PR updates each deployment method to use `FIREZONE_ID` instead
everywhere. Additionally, since the Docker upgrade script needs to
reinvoke the new container using the same arguments (more or less) as
the install, we need to extract the old `/var/lib/firezone/gateway_id`
file out of the existing container if it exists, and try to insert it
into the upgraded container.

Tested both scripts, including upgrades for the Docker script.

Fixes: #8471
2025-03-18 04:08:21 +00:00