* Cache trusted cert values, invalidating when anything changes
* rename to something more indicative
* defer
* changelog
* Use an LRU cache rather than a static map so we can't use too much memory. Add docs, unit tests
* Don't add to cache if disabled. But this races if just a bool, so make the disabled an atomic
* VAULT-528 add test reproducing the failure that should pass after the fix
* VAULT-528 Upgrade consul-template to version with the fix
* VAULT-528 changelog
v2.7.x overflows on 386 targets:
../../../.go/pkg/mod/github.com/couchbase/gocb/v2@v2.7.0/search/internal.go:31:62: cannot use math.MaxUint32 (untyped int constant 4294967295) as int value in argument to fmt.Errorf (overflows)
This rolls us back to 2.6.5 until it's fixed upstream.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Initial version of an internal plugin interface for event subscription plugins,
and an AWS SQS plugin as an example.
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
* add new plugin wif fields to AWS Secrets Engine
* add changelog
* go get awsutil v0.3.0
* fix up changelog
* fix test and field parsing helper
* godoc on new test
* require role arn when audience set
* make fmt
---------
Co-authored-by: Austin Gebauer <agebauer@hashicorp.com>
Co-authored-by: Austin Gebauer <34121980+austingebauer@users.noreply.github.com>
This commit introduces two new adaptive concurrency limiters in Vault,
which should handle overloading of the server during periods of
untenable request rate. The limiter adjusts the number of allowable
in-flight requests based on latency measurements performed across the
request duration. This approach allows us to reject entire requests
prior to doing any work and prevents clients from exceeding server
capacity.
The limiters intentionally target two separate vectors that have been
proven to lead to server over-utilization.
- Back pressure from the storage backend, resulting in bufferbloat in
the WAL system. (enterprise)
- Back pressure from CPU over-utilization via PKI issue requests
(specifically for RSA keys), resulting in failed heartbeats.
Storage constraints can be accounted for by limiting logical requests
according to their http.Method. We only limit requests with write-based
methods, since these will result in storage Puts and exhibit the
aforementioned bufferbloat.
CPU constraints are accounted for using the same underlying library and
technique; however, they require special treatment. The maximum number
of concurrent pki/issue requests found in testing (again, specifically
for RSA keys) is far lower than the minimum tolerable write request
rate. Without separate limiting, we would artificially impose limits on
tolerable request rates for non-PKI requests. To specifically target PKI
issue requests, we add a new PathsSpecial field, called limited,
allowing backends to specify a list of paths which should get
special-case request limiting.
For the sake of code cleanliness and future extensibility, we introduce
the concept of a LimiterRegistry. The registry proposed in this PR has
two entries, corresponding with the two vectors above. Each Limiter
entry has its own corresponding maximum and minimum concurrency,
allowing them to react to latency deviation independently and handle
high volumes of requests to targeted bottlenecks (CPU and storage).
In both cases, utilization will be effectively throttled before Vault
reaches any degraded state. The resulting 503 - Service Unavailable is a
retryable HTTP response code, which can be handled to gracefully retry
and eventually succeed. Clients should handle this by retrying with
jitter and exponential backoff. This is done within Vault's API, using
the go-retryablehttp library.
Limiter testing was performed via benchmarks of mixed workloads and
across a deployment of agent pods with great success.
* Implement raft-wal
* go mod tidy
* add metrics, fix a panic
* fix the panic for real this time
* PR feedback
* refactor tests to use a helper and reduce duplication
* add a test to verify we don't use raft-wal if raft.db exists
* add config to enable the verifier
* add tests for parsing verification intervals
* run the verifier in the background
* wire up the verifier
* go mod tidy
* refactor config parsing
* remove unused function
* trying to get the verifier working
* wire up some more verifier bits
* sorted out an error, added a new test, lots of debug logging that needs to come out
* fix a bug and remove all the debugging statements
* make sure we close raft-wal stablestore too
* run verifier tests for both boltdb and raft-wal
* PR feedback
* Vault 20270 docker test raft wal (#24463)
* adding a migration test from boltdb to raftwal and back
adding a migration test using snapshot restore
* feedback
* Update physical/raft/raft.go
Co-authored-by: Paul Banks <pbanks@hashicorp.com>
* PR feedback
* change verifier function
* make this shorter
* add changelog
* Fix Close behavior
* make supporting empty logs more explicit
* add some godocs
---------
Co-authored-by: hamid ghaf <hamid@hashicorp.com>
Co-authored-by: Hamid Ghaf <83242695+hghaf099@users.noreply.github.com>
Co-authored-by: Paul Banks <pbanks@hashicorp.com>
* bump github.com/hashicorp/go-kms-wrapping/wrappers/azurekeyvault/v2 version to include support for azure workload identities
* add changelog
* bump azurekeyvault/v2 to 2.0.10
* bump azcore to 1.9.1
We're on a quest to reduce our pipeline execution time to both enhance
our developer productivity but also to reduce the overall cost of the CI
pipeline. The strategy we use here reduces workflow execution time and
network I/O cost by reducing our module cache size and using binary
external tools when possible. We no longer download modules and build
many of the external tools thousands of times a day.
Our previous process of installing internal and external developer tools
was scattered and inconsistent. Some tools were installed via `go
generate -tags tools ./tools/...`,
others via various `make` targets, and some only in Github Actions
workflows. This process led to some undesirable side effects:
* The modules of some dev and test tools were included with those
of the Vault project. This leads to us having to manage our own
Go modules with those of external tools. Prior to Go 1.16 this
was the recommended way to handle external tools, but now
`go install tool@version` is the recommended way to handle
external tools that need to be build from source as it supports
specific versions but does not modify the go.mod.
* Due to Github cache constraints we combine our build and test Go
module caches together, but having our developer tools as deps in
our module results in a larger cache which is downloaded on every
build and test workflow runner. Removing the external tools that were
included in our go.mod reduced the expanded module cache by size
by ~300MB, thus saving time and network I/O costs when downloading
the module cache.
* Not all of our developer tools were included in our modules. Some were
being installed with `go install` or `go run`, so they didn't take
advantage of a single module cache. This resulted in us downloading
Go modules on every CI and Build runner in order to build our
external tools.
* Building our developer tools from source in CI is slow. Where possible
we can prefer to use pre-built binaries in CI workflows. No more
module download or tool compiles if we can avoid them.
I've refactored how we define internal and external build tools
in our Makefile and added several new targets to handle both building
the developer tools locally for development and verifying that they are
available. This allows for an easy developer bootstrap while also
supporting installation of many of the external developer tools from
pre-build binaries in CI. This reduces our network IO and run time
across nearly all of our actions runners.
While working on this I caught and resolved a few unrelated issue:
* Both our Go and Proto format checks we're being run incorrectly. In
CI they we're writing changes but not failing if changes were
detected. The Go was less of a problem as we have git hooks that
are intended to enforce formatting, however we drifted over time.
* Our Git hooks couldn't handle removing a Go file without failing. I
moved the diff check into the new Go helper and updated it to handle
removing files.
* I combined a few separate scripts and into helpers and added a few
new capabilities.
* I refactored how we install Go modules to make it easier to download
and tidy all of the projects go.mod's.
* Refactor our internal and external tool installation and verification
into a tools.sh helper.
* Combined more complex Go verification into `scripts/go-helper.sh` and
utilize it in the `Makefile` and git commit hooks.
* Add `Makefile` targets for executing our various tools.sh helpers.
* Update our existing `make` targets to use new tool targets.
* Normalize our various scripts and targets output to have a consistent
output format.
* In CI, install many of our external dependencies as binaries wherever
possible. When not possible we'll build them from scratch but not mess
with the shared module cache.
* [QT-641] Remove our external build tools from our project Go modules.
* [QT-641] Remove extraneous `go list`'s from our `set-up-to` composite
action.
* Fix formatting and regen our protos
Signed-off-by: Ryan Cragun <me@ryan.ec>