This is mostly to stay up to date with current Elixir and benefit from
the new included [JSON parser](https://hexdocs.pm/elixir/JSON.html).
Removing `Jason` in favor of the embedded `JSON` parser is saved for a
[future PR](https://github.com/firezone/firezone/issues/8011).
It found a couple type violations which were simple to fix, and some
formatting changes.
This adds https://github.com/getsentry/sentry-elixir to the portal for
automatic process crash and exception trace reporting.
It also configures Logger reporting for the `warning` level and higher,
and sets the data scrubbing rules to allow all Logger metadata keys
(`logger_metadata.*` in the Sentry project settings).
Lastly, it configures automatic HTTP error reporting by tying into the
`api` and `web` endpoint modules with a custom `plug` middleware so we
get automatic reporting of unsuccessful Phoenix responses.
It is expected this will be noisy when we first deploy and we'll need to
tune it down a bit. This is the same approach used with other Sentry
platforms.
Dependabot's workflow is set up in such a way it seems that it can't
find our `sha.exs` file.
This is a cleaner approach that doesn't rely on using external files for
the application version.
Interesting note: `mix compile` will happily use the cached `version`
even though it's computed from an env var, because `mix compile` uses
file hash and mtime to know when to recompile.
See https://github.com/firezone/firezone/network/updates/942719116
This PR adds support for ECS metadata API
(https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-metadata-endpoint-v4.html)
in order to discover hostname.
It also adds jq in the runtime image
Unlike EC2 or GCP VM, ECS tasks do not have a DNS record, we can only
use their IP as RELEASE_HOSTNAME. So I use their IPv4, IPv6 only
networks are therefore not supported.
- Removes version numbers from infra components (elixir/relay)
- Removes version bumping from Rust workspace members that don't get
published
- Splits release publishing into `gateway-`, `headless-client-`, and
`gui-client-`
- Removes auto-deploying new infrastructure when a release is published.
Use the Deploy Production workflow instead.
Fixes#4397
Feel free to correct me if I'm wrong but it seems the telemetry id is
not longer used in Firezone 1.x
Removing this uuid generation would allow me to put the folder
`/var/firezone` as readonly instead of mounting a
[volume](367a46a5c8/firezone/values.yaml (L157))
to allow firezone to write inside. The folder `/var/firezone` seems to
be used only for this purpose
Maybe I should also remove
[this](49a965a686/elixir/Dockerfile (L293))
?
PS: I cannot find the contrib branch, but don't hesite to create it and
change the target branch of this PR
Co-authored-by: Antoine <antoinelabarussias@gmail.com>
On the domain side this PR extends `Domain.Repo` with filtering,
pagination, and ordering, along with some convention changes are
removing the code that is not needed since we have the filtering now.
This required to touch pretty much all contexts and code, but I went
through all public functions and added missing tests to make sure
nothing will be broken.
On the web side I've introduced a `<.live_table />` which is as close as
possible to being a drop-in replacement for the regular `<.table />`
(but requires to structure the LiveView module differently due to
assigns anyways). I've updated all the listing tables to use it.
Upsides:
1. We don't need to maintain a separate repo and Dockerfile just for
Elixir image (permissions, runner labels, etc)
2. No need to push intermediate images to the container registry
3. No need to copy-paste alpine/erlang/elixir version and hashes from
`firezone/containers` to `elixir/dockerfile` every time they change
4. No need to cross-compile for local dev environments, better
experience building with slow internet connection
5. One command to test if our code works on our containers but a
different alpine/erlang/elixir version
Downsides:
1. Locally devs will need to compile Erlang at least once per version,
but the whole build takes ~6 minutes on my M1 Max. It also takes only 8
minutes on the free GitHub Actions runner without any cache.
2. Worse experience on slow machines
FYI: there is no performance penalty once we have cache layers, still
takes 30 seconds on CI.
~~This is an attempt to fix the CI bug
[here](https://github.com/firezone/firezone/actions/runs/5491388141/jobs/10007864417#step:4:1638)
possibly introduced in
[d9eb2d18](https://github.com/firezone/firezone/commit/d9eb2d18#diff-88bd94db0d5cfd5f0617b7c4ed48c0212597378ed7e28714c5d86c95999b4c7dR29)
and uncovered / exacerbated in Elixir 1.15~~
Edit: looks like this ended up being a couple cache issues with GitHub
actions:
1. The `elixir_api-container-build` cache would always overwrite the
`elixir_web-container-build` on subsequent builds of the same
`github.ref_name` (cache is scoped to branch name by default), leading
to the consistent error `Elixir.Web.Mailer.NoopAdapter does not exist`
whenever a branch was pushed to more than once.
2. The same thing happens with the `integration_test-basic-flow` job
because the `api` service gets built after the `web` service in
docker-compose.yml, overwriting its cache
For some reason it seems the `APPLICATION_NAME` ARG is not busting the
Docker cache properly on GitHub actions for elixir container builds, so
the fix here was to [use
`scope=`](https://docs.docker.com/build/cache/backends/gha/#scope) to
segregate the cache layers between builds of the same branch.
Did some research when picking a package manager for the website and
settled on `pnpm` for the following reasons:
- CLI-compatible with `npm`
- Typically faster than even `yarn` especially on Apple silicon
- Security: Pnpm uses a different dependency resolution algorithm and
different folder structure of node_modules that prevents illegal access
to packages by other packages.
I think I caught all the places, but I may be missing something, so if
this isn't a good idea we can revert back.
This PR also cleans up the actions workflows to remove dead code.
TODO:
- [x] Cluster formation for all API and web nodes
- [x] Injest Docker logs to Stackdriver
- [x] Fix assets building for prod
To finish later:
- [ ] Structured logging:
https://issuetracker.google.com/issues/285950891
- [ ] Better networking policy (eg. use public postmark ranges and deny
all unwanted egress)
- [ ] OpenTelemetry collector for Google Stackdriver
- [ ] LoggerJSON.Plug integration
---------
Signed-off-by: Andrew Dryga <andrew@dryga.com>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>