Why:
* There were intermittent issues with accounts updates from Stripe
events. Specifically, when an account would update it's subscription
from Starter to Team. The reason was due to the fact that Stripe does
not guarantee order of delivery for it's webhook events. At times we
were seeing and responding to an event that was a few seconds old after
processing a newer event. This would have the effect of quickly
transitioning an account from Team back to Starter. This commit
refactors our event handler and adds a `processed_stripe_events` DB
table to make sure we don't process duplicate events as well as prevent
processing an event that was created prior to the last event we've
processed for a given account.
* Along with refactoring the billing event handling, the Stripe mock
module has also been refactored to better reflect real Stripe objects.
Related: #8668
When starting a local client with a local portal, this URL is hit and
times out, causing noise in the local gateway log.
In order to develop against this API in local dev, it might be better to
use the local website URL as well.
Oban includes its own configuration validation, which seems to prevent
`runtime.exs` from overriding any compile-time options. This prevents us
from using ENV vars to configure it, such as restricting job execution
to `domain` nodes by setting `queues: []`. To fix that, we make sure to
set Oban configuration in env-specific files `config/dev.exs` and
`config/test.exs`, and at runtime for prod with `config/runtime.exs`.
Fixes#10016
Some migrations take a long time to run because they require locks or
modify large amounts of data. To prevent this from causing issues during
deploy, we leverage Ecto's native support for loading migrations from
multiple directories to introduce a `conditional_migrations/` directory
that houses any conditional migrations we want to run.
To run these migrations, you'll need to do one of the following:
- `dev, test`: The `mix ecto.migrate` will run them by default because
we have aliased this to load conditional_migrations for dev
- `prod`: Set the `RUN_CONDITIONAL_MIGRATIONS` env var to `true` before
starting a prod server using the `bin/migrate` script.
- `dev, test, prod`: Run `Domain.Release.migrate(conditional: true)`
from an IEx shell.
If conditional migrations were found that weren't executed during
`Domain.Release.migrate`, a warning is logged to remind us to run them.
---------
Signed-off-by: Jamil <jamilbk@users.noreply.github.com>
The adapter itself isn't enabled in the UI on prod, but the background
job to sync mock data was. This prevents the job from being started and
emitting log noise into production logs.
By specifying the `before_send` hook, we can easily drop events based on
their data, such as `original_exception` which contains the original
exception instance raised.
Leveraging this, we can add a `report_to_sentry` parameter to
`Web.LiveErrors.NotFound` to optionally ignore certain not found errors
from going to Sentry.
This adds a feature that will email all admins in a Firezone Account
when sync errors occur with their Identity Provider.
In order to avoid spamming admins with sync error emails, the error
emails are only sent once every 24 hours. One exception to that is when
there is a successful sync the `sync_error_emailed_at` field is reset,
which means in theory if an identity provider was flip flopping between
successful and unsuccessful syncs the admins would be emailed more than
once in a 24 hours period.
### Sample Email Message
<img width="589" alt="idp-sync-error-message"
src="https://github.com/user-attachments/assets/d7128c7c-c10d-4d02-8283-059e2f1f5db5">
Why:
* The Swagger UI is currently served from the API application. This
means that the Web application does not have access to the external URL
in the API configuration during/after compilation. Without the API
external URL, we cannot generate a proper link in the portal to the
Swagger UI. This commit refactors how the API external URL is set from
the environment variables and allows the Web app to have access to the
value of the API URL.
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
Why:
* In order to manage a large number of Firezone Sites, Resources,
Policies, etc... a REST API is needed as clicking through the UI is too
time consuming, as well as prone to error. By providing a REST API
Firezone customers will be able to manage things within their Firezone
accounts with code.
Why:
* JumpCloud directory sync was requested from customers. JumpCloud only
offers the ability to use it's API with an admin level access token that
is tied to a specific user within a given JumpCloud account. This would
require Firezone customers to give an access token with much more
permissions that needed for our directory sync. To avoid this, we've
decide to use WorkOS to provide SCIM support between JumpCloud and
WorkOS, which will allow Firezone to then easily and safely retrieve
JumpCloud directory info from WorkOS.
---------
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
Getting some weird behavior with AppLinks. They don't seem to work upon
first use and require a few tries to function correctly.
Edit: Found the issue: Android Studio doesn't like when the Manifest
contains variables for AppLinks. I added a note in the Manifest.
@conectado To test Applinks are working correctly, you can use the App
Link Assistant:
<img width="930" alt="Screenshot 2023-09-28 at 11 15 11 PM"
src="https://github.com/firezone/firezone/assets/167144/e4bd4674-d562-44ec-bdb8-3a5f97250b84">
Then from there you can click "Test App Links":
<img width="683" alt="Screenshot 2023-09-28 at 11 15 30 PM"
src="https://github.com/firezone/firezone/assets/167144/f3dc8e0d-f58a-4a4b-9855-62472096dc9e">
Did some research on status page providers to manage incidents.
statuspage.io seems to be easy to use and cost-effective, fairly popular
and provides a good amount of flexibility to customize emails,
notifications, etc.
Super easy to set up and use but am not married to it if anyone feels
strongly about using another incident management service.
https://firezone.statuspage.io
## Demo:
<img width="235" alt="Screenshot 2023-06-27 at 8 07 29 AM"
src="https://github.com/firezone/firezone/assets/167144/8ad12b9b-7345-4a5d-bf43-c8af798d85f9">
TODO:
- [x] Cluster formation for all API and web nodes
- [x] Injest Docker logs to Stackdriver
- [x] Fix assets building for prod
To finish later:
- [ ] Structured logging:
https://issuetracker.google.com/issues/285950891
- [ ] Better networking policy (eg. use public postmark ranges and deny
all unwanted egress)
- [ ] OpenTelemetry collector for Google Stackdriver
- [ ] LoggerJSON.Plug integration
---------
Signed-off-by: Andrew Dryga <andrew@dryga.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>