Commit Graph

16 Commits

Author SHA1 Message Date
Jamil
fa40d6e852 fix(infra): Adjust rule to total_latencies from backend_latencies (#7323)
This is the check that Oneleet is expecting.
2024-11-12 21:30:28 +00:00
Jamil
f40528f8f0 chore(infra): Relax load balancer to app latency alert to 3s (#7317)
1000ms is a little too aggressive here. The latency is measured from the load
balancers, which are global, to our app servers, which are in us-east1.
2024-11-12 05:44:05 +00:00
Jamil
2825522844 fix(infra): Filter out WebSocket upgrade from latency alerting (#7242) 2024-11-02 15:43:49 -07:00
Jamil
e9db936c0f feat(infra): Add Google load balancer latency alert (#7231)
Oneleet has a new monitor failing that suggests adding this.

https://app.oneleet.com/tenants/148d888b-6cbe-4198-b4be-359e816927f4/monitors/9ad764bf-147b-4b87-bee8-f825ea9e0adc
2024-11-01 15:57:32 +00:00
Andrew Dryga
ba71d651d9 chore(infra): Silence alerts from OTEL Finch integration (#6188) 2024-08-07 10:26:51 -06:00
Jamil
e82a9506ab fix(infra): use sensitive attribute for all secrets (#5562)
Is there a reason not to mark these `sensitive`?

https://developer.hashicorp.com/terraform/tutorials/configuration-language/sensitive-variables
2024-06-27 08:13:35 +00:00
Andrew Dryga
f5b4736f12 fix(portal): Fix edge cases with OIDC discovered in logs (#4777)
Can be reviewed commit by commit.
2024-05-11 09:37:28 -06:00
Andrew Dryga
90e203312d chore(portal): Stop triggering alerts on fluentbit logs (#4955) 2024-05-11 09:35:27 -06:00
Andrew Dryga
c7f300a5ca Do not trigger alerts on errors logged by OSConfigAgent 2024-04-24 15:28:16 -06:00
Andrew Dryga
1555b80a72 Do not trigger alerts on errors logged by GCEGuestAgent 2024-04-24 15:13:11 -06:00
Andrew Dryga
450b647553 Increase CPU utilization alert alignment window 2024-04-22 13:32:51 -06:00
Andrew Dryga
7fe043aee0 Increase CPU utilization alert window to reduce alerts noise when portal is rolled out 2024-04-19 13:36:42 -06:00
Andrew Dryga
b653d66414 Trigger monitoring alerts on crash reports 2024-04-11 23:43:53 -06:00
Andrew Dryga
8f1785f7c7 Do not raise alerts on errors from auditlog 2024-04-11 23:34:58 -06:00
Andrew Dryga
ea351465a3 chore(portal): Send alert notifications to mobile channels (#4463) 2024-04-02 11:56:46 -06:00
Andrew Dryga
114696c0ba chore(infra): Split terraform files into folders and add domain to production app (#4172) 2024-03-16 11:54:06 -06:00