This fixes the enterprise failure of the test
```
=== FAIL: builtin/logical/pki TestCRLIssuerRemoval (0.00s)
crl_test.go:1456:
Error Trace: /home/runner/actions-runner/_work/vault-enterprise/vault-enterprise/builtin/logical/pki/crl_test.go:1456
Error: Received unexpected error:
Global, cross-cluster revocation queue cannot be enabled when auto rebuilding is disabled as the local cluster may not have the certificate entry!
Test: TestCRLIssuerRemoval
Messages: failed enabling unified CRLs on enterprise
```
* Clean up unused CRL entries when issuer is removed
When an issuer was removed, the space utilized by its CRL was not freed,
either from the CRL config mapping issuer IDs to CRL IDs or from the
CRL storage entry. We thus implement a two-step cleanup, wherein
orphaned CRL IDs are removed from the config and any remaining full
CRL entries are removed from disk.
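A minimal sketch of the two-step cleanup described above. The names `crlConfig` and `cleanupOrphanedCRLs` and the in-memory map standing in for storage are hypothetical, chosen for illustration; the PKI backend's actual types and storage layout are not shown here and will differ.

```go
package main

import (
	"fmt"
	"sort"
)

// crlConfig is a hypothetical stand-in for the config entry that maps
// issuer IDs to CRL IDs.
type crlConfig struct {
	IssuerIDCRLMap map[string]string
}

// cleanupOrphanedCRLs drops config mappings for issuers that no longer
// exist, then deletes stored CRL entries that no remaining mapping
// references. It returns the removed CRL IDs in sorted order.
func cleanupOrphanedCRLs(cfg *crlConfig, issuers map[string]bool, storage map[string][]byte) []string {
	// Step 1: remove orphaned issuer -> CRL mappings from the config.
	for issuerID := range cfg.IssuerIDCRLMap {
		if !issuers[issuerID] {
			delete(cfg.IssuerIDCRLMap, issuerID)
		}
	}

	// Record which CRL IDs are still referenced by a live issuer.
	inUse := make(map[string]bool)
	for _, crlID := range cfg.IssuerIDCRLMap {
		inUse[crlID] = true
	}

	// Step 2: remove any stored CRL entry that is no longer referenced.
	var removed []string
	for crlID := range storage {
		if !inUse[crlID] {
			delete(storage, crlID)
			removed = append(removed, crlID)
		}
	}
	sort.Strings(removed)
	return removed
}

func main() {
	cfg := &crlConfig{IssuerIDCRLMap: map[string]string{
		"issuer-a": "crl-1",
		"issuer-b": "crl-2",
	}}
	issuers := map[string]bool{"issuer-a": true} // issuer-b was deleted
	storage := map[string][]byte{"crl-1": nil, "crl-2": nil}

	removed := cleanupOrphanedCRLs(cfg, issuers, storage)
	fmt.Println(removed, len(cfg.IssuerIDCRLMap), len(storage))
}
```

Because the second step keys off the surviving mappings rather than the deleted issuer, running it again later also cleans up entries left behind by earlier deletions.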
This relates to a Consul<->Vault interop issue (#22980), wherein Consul
creates a new issuer on every leadership election, causing this config
to grow. Deleting issuers manually does not entirely solve this problem
as the config does not fully reclaim space used in this entry.
Notably, we observed that when deleting issuers, the CRL was rebuilt
on secondary clusters (because the invalidation does not care about the
type of operation); for consistency and to clean up the unified CRLs, we
also need to run the rebuild on the active primary cluster that deleted
the issuer.
This approach does allow cleanup on existing impacted clusters by simply
rebuilding the CRL.
Co-authored-by: Steven Clark <steven.clark@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add test case on CRL removal
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
---------
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Co-authored-by: Steven Clark <steven.clark@hashicorp.com>
* allow users to specify files for child process stdout/stderr
* added changelog
* check if exec config is nil
* fix test
* first attempt at a test
* revise test
* passing test
* added failing test
* Apply suggestions from code review
Co-authored-by: Anton Averchenkov <84287187+averche@users.noreply.github.com>
* code review suggestions
* always close log files
* refactor to use real files
* hopefully fixed tests
* add back bool gates so we don't close global stdout/stderr
* compare to os.Stdout/os.Stderr
* remove unused
---------
Co-authored-by: Anton Averchenkov <84287187+averche@users.noreply.github.com>
* fix group name typos
* add flaky note and cleanup generate function
* rename variable
* remove other test for other key types
* move key types to relevant test
This adds edition handling to the test-run-enos-scenario-matrix
workflow. Previously we'd pass the version and edition from the caller,
but that isn't an option in the release testing workflow, which only
passes the metadata version without the edition.
Signed-off-by: Ryan Cragun <me@ryan.ec>
The CRT orchestrator triggers the release testing workflows for all
release versions using the same main ref. Therefore, if we have
concurrency controls in place, runs could cancel each other when more
than one release branch is executing workflows.
Signed-off-by: Ryan Cragun <me@ryan.ec>
- This protects against a test in ENT and a use-case in which
we would force a migration for stored configs that had been
written with a nil configuration
There seems to be a bug, but I'm not sure if it is because this documentation is almost worse than guessing
Co-authored-by: Yoko Hyakuna <yoko@hashicorp.com>
If the agent fails to start, for example when a port conflict occurs,
we want the test to fail fast, rather than continuing until the test
times out.
If this 5-second timeout occurs waiting for the agent to start up,
then it does not make logical sense to continue the test. So,
we use `t.Fatalf` to trigger the failure.
Replace our prior implementation of Enos test groups with the new Enos
sampling feature. With this feature we're able to describe which
scenarios and variant combinations are valid for a given artifact and
allow enos to create a valid sample field (a matrix of all compatible
scenarios) and take an observation (select some to run) for us. This
ensures that every valid scenario and variant combination will
now be a candidate for testing in the pipeline. See QT-504[0] for further
details on the Enos sampling capabilities.
Our prior implementation only tested the amd64 and arm64 zip artifacts,
as well as the Docker container. We now include the following new artifacts
in the test matrix:
* CE Amd64 Debian package
* CE Amd64 RPM package
* CE Arm64 Debian package
* CE Arm64 RPM package
Each artifact includes a sample definition for both pre-merge/post-merge
(build) and release testing.
Changes:
* Remove the hand-crafted `enos-run-matrices` CI matrix targets and replace
them with per-artifact samples.
* Use enos sampling to generate different sample groups on all pull
requests.
* Update the enos scenario matrices to handle HSM and FIPS packages.
* Simplify enos scenarios by using shared globals instead of
cargo-culted locals.
Note: This will require coordination with vault-enterprise to ensure a
smooth migration to the new system. Integrating new scenarios or
modifying existing scenarios/variants should be much smoother after this
initial migration.
[0] https://github.com/hashicorp/enos/pull/102
Signed-off-by: Ryan Cragun <me@ryan.ec>
We grab the state lock and check that the core is not shutting down.
This panic mostly seems to happen if Vault is shutting down, usually
in a test.
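A simplified sketch of this guard, using a hypothetical `core` type rather than Vault's actual `Core`: the read lock is held while checking a shutdown flag, so state that is torn down during shutdown is never dereferenced.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errShuttingDown = errors.New("core is shutting down")

// registry stands in for state that is torn down at shutdown.
type registry struct {
	mounts map[string]string
}

// core is a hypothetical, heavily reduced model of a server core.
type core struct {
	stateLock    sync.RWMutex
	shuttingDown bool
	registry     *registry // set to nil during shutdown
}

// shutdown marks the core as shutting down and releases its state.
func (c *core) shutdown() {
	c.stateLock.Lock()
	defer c.stateLock.Unlock()
	c.shuttingDown = true
	c.registry = nil
}

// lookup grabs the state lock and bails out early if the core is shutting
// down. Without the check, c.registry would be nil here after shutdown and
// the field access would panic.
func (c *core) lookup(path string) (string, error) {
	c.stateLock.RLock()
	defer c.stateLock.RUnlock()
	if c.shuttingDown {
		return "", errShuttingDown
	}
	return c.registry.mounts[path], nil
}

func main() {
	c := &core{registry: &registry{mounts: map[string]string{"secret/": "kv"}}}
	fmt.Println(c.lookup("secret/"))

	c.shutdown()
	fmt.Println(c.lookup("secret/"))
}
```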
Also, we try to clean up the go-bexpr test by sending duplicates, and
deduplicating in the receive loop.
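Deduplicating on the receiving side can be sketched as below. The function name `collectUnique` and the string payload are illustrative assumptions, not the test's actual code.

```go
package main

import "fmt"

// collectUnique drains ch until it is closed and returns each distinct
// value once, in the order it was first received.
func collectUnique(ch <-chan string) []string {
	seen := make(map[string]struct{})
	var out []string
	for v := range ch {
		if _, dup := seen[v]; dup {
			continue // already received; ignore the duplicate
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

func main() {
	ch := make(chan string)
	go func() {
		defer close(ch)
		// The sender deliberately emits duplicates.
		for _, v := range []string{"a", "b", "a", "c", "b"} {
			ch <- v
		}
	}()
	fmt.Println(collectUnique(ch))
}
```

With this shape the sender does not need to track what it has already sent, which keeps the test's producer side simple.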