* Ignore errors from rollback manager invocations
During reload and mount move operations, we want to ensure that errors
returned by the final Rollback are not fatal. A fatal error here risks
failing replication in Enterprise when the core/mounts table gets
invalidated. This mirrors the behavior of the periodic rollback manager,
which only logs the error.
This updates the noop backend to allow failing just rollback operations,
which we can use in tests to verify this behavior and ensure the core
operations (plugin reload, plugin move, and seal/unseal) are not broken
by this. Note that most of these operations were asynchronous from the
client's point of view and thus did not fail anyway prior to this change.
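As a rough illustration of the log-and-continue behavior described above, here is a minimal sketch. It is not Vault's actual code: `rollbackFunc`, `failingRollback`, `finalRollback`, and `moveMount` are hypothetical names chosen for this example, and the standard `log` package stands in for whatever logger the real code uses.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// rollbackFunc is a hypothetical stand-in for invoking a backend's rollback.
type rollbackFunc func() error

// failingRollback simulates a backend whose rollback always fails, similar in
// spirit to a test backend configured to fail only rollback operations.
func failingRollback() error {
	return errors.New("simulated rollback failure")
}

// finalRollback runs rb and logs any error instead of returning it.
// It reports whether the rollback succeeded so callers can observe the outcome.
func finalRollback(path string, rb rollbackFunc) bool {
	if err := rb(); err != nil {
		log.Printf("rollback failed, continuing: path=%s error=%v", path, err)
		return false
	}
	return true
}

// moveMount is a hypothetical operation that ends with a final rollback.
// A rollback failure never turns into a failure of the move itself.
func moveMount(src, dst string, rb rollbackFunc) error {
	// ... the actual move work would happen here ...
	finalRollback(src, rb)
	return nil
}

func main() {
	err := moveMount("secret/", "kv/", failingRollback)
	fmt.Println("move error:", err)
}
```

The point of the sketch is only that the rollback error is visible in the log while the surrounding operation still returns nil.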
* Add changelog entry
* Update vault/external_tests/router/router_ext_test.go
---------
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Co-authored-by: Alexander Scheel <alex.scheel@hashicorp.com>
Co-authored-by: Nick Cabatoff <ncabatoff@hashicorp.com>
* Use library/consul as the mirror path instead of hashicorp/consul
- It looks like the older 1.4.4 image was not published within the
hashicorp/consul space; only newer images are.
- Switch to library/consul, which seems to have both versions.
Co-authored-by: Steven Clark <steven.clark@hashicorp.com>
* use verify changes for docs to skip tests
* add verify-changes to the needed jobs
* skip go tests for doc/ui only changes
* fix a job ref
* change names, remove script
* remove ui conditions
* separate flags
* feedback
We further optimize the CI workflow for lower cost and better speed.
We tested the Go CI workflows across several instance classes
and updated our compute choices. We achieve an average execution
speed improvement of 2-2.5 minutes per test workflow while
reducing the infrastructure cost by about 20%. We also save
another ~2 minutes by installing `gotestsum` from the GitHub release
instead of downloading the Go modules and compiling it every time.
In addition to the speed improvements, we also further reduced our cache
usage by updating the `security-scan` workflow to not cache Go modules.
We also use the `cache/save` and `cache/restore` actions for timing
caches. This results in saving half as many cache entries for timing
data.
*UI test results*
Results for two runs each:
* c6a.2xlarge (12m54s, 11m55s)
* c6a.4xlarge (10m47s, 11m6s)
* c6a.8xlarge (11m32s, 10m51s)
* m5.2xlarge (15m23s, 14m16s)
* m5.4xlarge (14m48s, 12m54s)
* m5.8xlarge (12m27s, 12m24s)
* m6a.2xlarge (11m55s, 12m20s)
* m6a.4xlarge (10m54s, 10m43s)
* m6a.8xlarge (10m33s, 10m51s)
Current runner:
m5.2xlarge (15m23s, 14m16s, avg 14m50s) @ 0.448/hr = $0.11
Faster candidates:
* c6a.2xlarge (12m54s, 11m55s, avg 12m24s) @ 0.3816/hr = $0.078
* m6a.2xlarge (11m55s, 12m20s, avg 12m8s) @ 0.4032/hr = $0.081
* c6a.4xlarge (10m47s, 11m6s, avg 10m56s) @ 0.7632/hr = $0.139
* m6a.4xlarge (10m54s, 10m43s, avg 10m48s) @ 0.8064/hr = $0.140
Best bang for the buck for test-ui:
m6a.2xlarge, > 25% cost savings from current and we save ~2.5 minutes.
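The per-run dollar figures above appear to be the average run time multiplied by the hourly instance price. The following sketch reproduces that arithmetic for the current runner and the recommended one. The helper names `avgSeconds` and `costPerRun` are made up for this example; the run times and prices are the ones listed above.

```go
package main

import "fmt"

// avgSeconds returns the mean of the given run times, in seconds.
func avgSeconds(runs ...float64) float64 {
	if len(runs) == 0 {
		return 0
	}
	var total float64
	for _, r := range runs {
		total += r
	}
	return total / float64(len(runs))
}

// costPerRun converts a run time in seconds and an hourly price in USD into
// the cost of a single run.
func costPerRun(seconds, hourlyUSD float64) float64 {
	return seconds / 3600 * hourlyUSD
}

func main() {
	// Run times from the list above, converted to seconds.
	current := avgSeconds(15*60+23, 14*60+16)   // m5.2xlarge
	candidate := avgSeconds(11*60+55, 12*60+20) // m6a.2xlarge

	curCost := costPerRun(current, 0.448)
	candCost := costPerRun(candidate, 0.4032)

	fmt.Printf("m5.2xlarge:  $%.3f per run\n", curCost)
	fmt.Printf("m6a.2xlarge: $%.3f per run\n", candCost)
	fmt.Printf("savings:     %.0f%%\n", (1-candCost/curCost)*100)
}
```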
*Go test results*
During testing, the external replication tests, when not broken up, always
take the longest. Our original analysis focuses on this job.
Most other test groups finish ~3m faster, so we subtract
that time when estimating the cost for the whole job.
external replication job results:
* c6a.2xlarge (20m49s, 19m20s, avg 20m5s)
* c6a.4xlarge (19m1s, 19m38s, avg 19m20s)
* c6a.8xlarge (19m51s, 18m54s, avg 19m23s)
* m5.2xlarge (22m12s, 20m29s, avg 21m20s)
* m5.4xlarge (20m7s, 19m3s, avg 19m35s)
* m5.8xlarge (20m24s, 19m42s, avg 20m3s)
* m6a.2xlarge (21m10s, 19m37s, avg 20m23s)
* m6a.4xlarge (18m58s, 19m51s, avg 19m24s)
* m6a.8xlarge (19m27s, 18m47s, avg 19m7s)
There is little separation in time when we increase class size. In the
best case a class size increase yields about a 5% performance increase
and doubles the cost. For test-go our best bang for the buck is
certainly going to be in the 2xlarge class.
Current runner:
m5.2xlarge (22m12s, 20m29s, avg 21m20s) @ 0.448/hr (16@avg-3m + 1@avg) = $2.35
Candidates in the same class:
* c6a.2xlarge (20m49s, 19m20s, avg 20m5s) @ 0.3816/hr (16@avg-3m + 1@avg) = $1.86
* m6a.2xlarge (21m10s, 19m37s, avg 20m23s) @ 0.4032/hr (16@avg-3m + 1@avg) = $2.00
Best bang for the buck for test-go:
c6a.2xlarge: 20% cost savings and we save ~2.25 minutes.
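The `(16@avg-3m + 1@avg)` notation above can be read as: 16 groups that each finish about 3 minutes before the longest group, plus the longest group itself, all billed at the hourly rate. The sketch below works through that reading. `workflowCost` and its parameters are hypothetical names used only for this illustration.

```go
package main

import "fmt"

// workflowCost estimates the cost in USD of one test-go run.
// longestSec is the average run time of the slowest group in seconds;
// the other groups are assumed to finish fasterBySec sooner.
func workflowCost(longestSec, fasterBySec float64, otherGroups int, hourlyUSD float64) float64 {
	totalSec := float64(otherGroups)*(longestSec-fasterBySec) + longestSec
	return totalSec / 3600 * hourlyUSD
}

func main() {
	const (
		otherGroups = 16  // groups that finish early, per "16@avg-3m"
		fasterBySec = 180 // the ~3m by which those groups finish sooner
	)

	current := workflowCost(21*60+20, fasterBySec, otherGroups, 0.448)   // m5.2xlarge
	candidate := workflowCost(20*60+5, fasterBySec, otherGroups, 0.3816) // c6a.2xlarge

	fmt.Printf("m5.2xlarge:  $%.2f per workflow run\n", current)
	fmt.Printf("c6a.2xlarge: $%.2f per workflow run\n", candidate)
	fmt.Printf("savings:     %.0f%%\n", (1-candidate/current)*100)
}
```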
We ran the Go tests with data race detection on similar instances and saw
execution times similar to test-go. Therefore we can use the same
recommended instance sizes.
After breaking up test-go's external replication tests, the longest group
was shorter on average. I chose to look at group 3 as it was usually the
longest grouping:
* c6a.2xlarge: (14m51s, 14m48s)
* c6a.4xlarge: (14m14s, 14m15s)
* c6a.8xlarge: (14m0s, 13m54s)
* m5.2xlarge: (15m36s, 15m35s)
* m5.4xlarge: (14m46s, 14m49s)
* m5.8xlarge: (14m25s, 14m25s)
* m6a.2xlarge: (14m51s, 14m53s)
* m6a.4xlarge: (14m16s, 14m16s)
* m6a.8xlarge: (14m2s, 13m57s)
Again, we see ~5% performance gains between the 2x and 8x instance classes
at quadruple the cost. The c6a and m6a families are almost identical, with
the c6a class being cheaper.
*Notes*
* UI and Go Test timing results: https://github.com/hashicorp/vault-enterprise/actions/runs/5556957460/jobs/10150759959
* Go Test with data race detection timing results: https://github.com/hashicorp/vault-enterprise/actions/runs/5558013192
* Go Test with replication broken up: https://github.com/hashicorp/vault-enterprise/actions/runs/5558490899
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
* Sync missing scenarios and modules
* Clean up variables and examples vars
* Add a `lint` make target for enos
* Update enos `fmt` workflow to run the `lint` target.
* Always use ipv4 addresses in target security groups.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
* backport of commit dc104898f7 (#21853)
* fix multiline
* shellcheck, and success message for builds
* add full path
* cat the summary
* fix and faster
* fix if condition
* base64 in a separate step
* echo
* check against empty string
* add echo
* only use matrix ids
* only id
* echo matrix
* remove wrapping array
* tojson
* try echo again
* use jq to get packages
* don't quote
* only run binary tests once
* test what's wrong with the binary
* separate file
* use matrix file
* failed test
* update comment on success
* correct variable name
* base64 fix
* output to file
* use multiline
* fix
* fix formatting
* fix newline
* fix whitespace
* correct body, remove comma
* small fixes
* shellcheck
* another shellcheck fix
* fix deprecation checker
* only run comments for prs
* Update .github/workflows/test-go.yml
Co-authored-by: Mike Palmiotto <mike.palmiotto@hashicorp.com>
* Update .github/workflows/test-go.yml
Co-authored-by: Mike Palmiotto <mike.palmiotto@hashicorp.com>
* fixes
---------
Co-authored-by: Mike Palmiotto <mike.palmiotto@hashicorp.com>
* backport of commit 3b00dde1ba (#21936)
* limit test comments
* remove unnecessary tee
* fix go test condition
* fix
* fail test
* remove always entirely
* fix columns
* make a bunch of tests fail
* separate line
* include Failures:
* remove test fails
* fix whitespace
* backport of commit 245430215c (#21973)
* only add binary tests if they exist
* shellcheck
---------
Co-authored-by: miagilepner <mia.epner@hashicorp.com>
Co-authored-by: Mike Palmiotto <mike.palmiotto@hashicorp.com>