firezone

mirror of https://github.com/outbackdingo/firezone.git synced 2026-01-27 18:18:55 +00:00

Files

Jamil 5bac3f5ec2 fix(infra): Don't send more/faster metrics than Google accepts (#8028)

We are getting quite a few of these warnings on prod:

```
{400, "{\n  \"error\": {\n    \"code\": 400,\n    \"message\": \"One or more TimeSeries could not be written: timeSeries[0-39]: write for resource=gce_instance{zone:us-east1-d,instance_id:2678918148122610092} failed with: One or more points were written more frequently than the maximum sampling period configured for the metric.\",\n    \"status\": \"INVALID_ARGUMENT\",\n    \"details\": [\n      {\n        \"@type\": \"type.googleapis.com/google.monitoring.v3.CreateTimeSeriesSummary\",\n        \"totalPointCount\": 40,\n        \"successPointCount\": 31,\n        \"errors\": [\n          {\n            \"status\": {\n              \"code\": 9\n            },\n            \"pointCount\": 9\n          }\n        ]\n      }\n    ]\n  }\n}\n"}
```

Since the point count is _much_ less than our flush buffer size of 1000,
we can only surmise the limit we're hitting is the flush interval.

The telemetry metrics reporter is run on each node, so we run the risk
of violating Google's API limit regardless of what a single node's
`@flush_interval` is set to.
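To illustrate the point above, here is a minimal Python sketch (not the project's code) of why per-node intervals do not bound the combined write rate. The function name, the start offsets, and the 60-second interval are all hypothetical values chosen for illustration; the actual sampling period Google enforces is not stated here.

```python
def min_gap_between_flushes(start_offsets, flush_interval, horizon):
    """Smallest gap (seconds) between any two flushes across all nodes.

    start_offsets: hypothetical boot times of each node, in seconds.
    flush_interval: per-node timer interval, in seconds.
    horizon: how far into the future to simulate, in seconds.
    """
    times = sorted(
        offset + k * flush_interval
        for offset in start_offsets
        for k in range(int((horizon - offset) // flush_interval) + 1)
    )
    return min(later - earlier for earlier, later in zip(times, times[1:]))


# One node respects its own interval exactly.
single = min_gap_between_flushes([0], 60, 300)

# Three nodes with staggered starts write much closer together,
# even though each one individually waits 60 seconds between flushes.
cluster = min_gap_between_flushes([0, 20, 45], 60, 300)
```

With these made-up offsets, the single node never flushes more often than every 60 seconds, while the three-node cluster produces flushes only 15 seconds apart. That gap depends on when nodes happen to start, which no per-node setting controls.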

To solve this, we use a new table `telemetry_reporter_logs` that stores
the last time a `flush` occurred for each reporter module. This gives
all nodes shared, global state about when the last flush happened; if it
was too recent, the timer-based flush call is `no-op`ed until the next
timer tick.

**Note**: The buffer-based `flush` is left unchanged; it will always
be called when `buffer_size > max_buffer_size`.
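The gating described above can be sketched roughly as follows. This is an illustrative Python model, not the project's implementation: `FlushLog` is an in-memory stand-in for the `telemetry_reporter_logs` table, and the class names, method names, and `send` callback are hypothetical. It also assumes the buffer-based flush does not update the log, which the commit message does not specify.

```python
class FlushLog:
    """In-memory stand-in for the shared telemetry_reporter_logs table."""

    def __init__(self):
        self._last_flushed_at = {}

    def try_claim_flush(self, reporter_module, now, min_interval):
        """Record a flush and return True unless one happened too recently.

        A real shared table would need this check-and-update to be atomic
        (for example, inside a transaction); that is omitted here.
        """
        last = self._last_flushed_at.get(reporter_module)
        if last is not None and now - last < min_interval:
            return False
        self._last_flushed_at[reporter_module] = now
        return True


class Reporter:
    """One reporter instance, as it might run on a single node."""

    def __init__(self, module, log, flush_interval, max_buffer_size, send):
        self.module = module
        self.log = log
        self.flush_interval = flush_interval
        self.max_buffer_size = max_buffer_size
        self.send = send
        self.buffer = []

    def record(self, point):
        self.buffer.append(point)
        # Buffer-based flush: unchanged, not gated by the shared log.
        if len(self.buffer) > self.max_buffer_size:
            self._flush()

    def on_timer(self, now):
        # Timer-based flush: skipped if any node flushed too recently.
        if self.log.try_claim_flush(self.module, now, self.flush_interval):
            self._flush()

    def _flush(self):
        if self.buffer:
            self.send(list(self.buffer))
            self.buffer.clear()
```

In this sketch, two `Reporter` objects sharing one `FlushLog` and the same `module` key behave like two nodes: if the first flushes at `now=0` with a 60-second interval, the second's `on_timer(10)` does nothing and keeps its points buffered until a later tick succeeds.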

2025-02-10 18:21:40 +00:00

environments

chore(infra): Reduce gateway size to e2-micro for prod (#8027)

2025-02-05 13:39:46 +00:00

examples

docs: fix references to AWS and Azure example modules (#5829)

2024-07-11 16:10:12 +00:00

modules

fix(infra): Don't send more/faster metrics than Google accepts (#8028)

2025-02-10 18:21:40 +00:00

.gitignore

chore(infra): Use Regional Instance Group in the GCP NAT example (#4183)

2024-03-19 08:44:14 -06:00