Commit Graph

8 Commits

Author SHA1 Message Date
Jamil
391150f0e1 chore(ci): Fix new issues in cd.yml (#4085)
Fixes some issues encountered after the merge of #4049 

- Fix performance tests to only run using base_ref and head_ref to avoid
dependence on `main`
- Fixes some typos
- Prevents a catch-22 condition where breaking compatibility meant we
wouldn't be able to deploy production
2024-03-12 02:06:19 +00:00
Jamil
3bd7dc504e fix(ci): Fix flaky iperf3 "Bad file descriptor" (#3731)
- Lower UDP bandwidth to 50M -- this fixes intermittent file descriptor
issues because we overload iperf3 for more than 5 seconds
- Simplify iperf3 to the minimum set that makes tests reliable
2024-02-22 19:57:22 +00:00
Jamil
5bd717b877 fix(ci): Use workflow id to fetch perf results (#3710) 2024-02-20 19:40:16 -08:00
Jamil
63cdd09a01 refactor(ci): Merge perf results into one comment (#3707)
One comment vs eight, need I say more?
2024-02-20 18:17:48 -08:00
Jamil
2d208b1991 fix(ci): Fix js typo (#3704)
More fixes from the perf test refactor
2024-02-20 16:38:05 -08:00
Jamil
0598ca55c3 fix(ci): Fix result overwrite (#3700)
Buttoning up fixes from #3695
2024-02-20 15:46:59 -08:00
Jamil
7ff40b82ed fix(ci): Run each perf test in its own matrix job (#3695)
The iperf3 server sometimes hangs, or takes a while to startup.

Rather than trying to reset the iperf3 state between performance tests,
this PR refactors them so they each run in their matrix job. This
ensures each performance test will run on a separate VM, unaffected by
previous test runs to eliminate the effect any residual network buffer
state can have on a particular test.

It also makes sure the server is listening with a `healthcheck`.
2024-02-20 22:44:20 +00:00
Jamil
eebd7fc7f1 fix(ci): Ensure integration-tests allow for at least 30 seconds to establish a connection (#3676)
So the cause of the flaky tests is that they aren't waiting long enough
for a connection to be established. Both the test in #3666 and the
`iperf` tests have a timeout of 10 seconds.

Connections _should_ be established **very quickly** in CI. However, I
have a few guesses as to why they might not be, essentially causing us
to have to wait for a timeout to re-initiate a connection request:

- Packets arrive out of order or too quickly for the WireGuard state
machine to establish a handshake.
- Too many ICE candidates gathered (the gateway has 3 interfaces)


This PR:

- Refactors the iperf tests to be a little easier to maintain
- Ensures `integration-tests` run for at least 30 seconds before timing
out


In any case, we can debug / optimize this further after snownet is
merged, which might just solve the problem completely.
2024-02-19 20:50:58 +00:00