* auth/aws: fix panic in IAM-based login when client config doesn't exist
* add changelog
* adds known issue for 1.15.0
* fixes up known issue with workaround
* fix link
* maintain behavior of client config not needing to exist for IAM login
* update changelog
* Remove old initial versions from the upgrade scenario as they're
unreliable.
* Ensure that shellcheck is available on runners for linting job.
Signed-off-by: Ryan Cragun <me@ryan.ec>
* Fix formatting issue within pki health-check cli
- Missing a ``` within the CRL validity period which caused a bunch of sections to be collected within the box
- One shell session was shifted over too much in the Too many certificates section
* Add missing '$' in front of the command
* Reorder pki entry in nav bar and add more missing $ in vault commands
---------
Co-authored-by: Yoko Hyakuna <yoko@hashicorp.com>
Update our `proxy` and `agent` scenarios to support new variants and
perform both baseline verification and their scenario-specific verification.
We integrate these updated scenarios into the pipeline by adding them
to artifact samples.
We've also improved the reliability of the `autopilot` and `replication`
scenarios by refactoring our IP address gathering. Previously, we'd ask
vault for the primary IP address and use some Terraform logic to determine
followers. The leader IP address gathering script was also implicitly
responsible for ensuring that a found leader was within a given group of
hosts, and thus waiting for a given cluster to have a leader, and also for
doing some arithmetic and outputting `replication` specific output data.
We've broken these responsibilities into individual modules, improved their
error messages, and fixed various races and bugs, including:
* Fix a race between creating the file audit device and installing and starting
vault in the `replication` scenario.
* Fix how we determine our leader and follower IP addresses. We now query
vault, whereas the prior implementation inferred the followers and sometimes
did not allow all nodes to be an expected leader.
* Fix a bug where we'd always fail on the first wrong condition
in the `vault_verify_performance_replication` module.
We also performed some maintenance tasks on Enos scenarios by updating our
references from `oss` to `ce` to handle the naming and license changes. We
also enabled `shellcheck` linting for enos module scripts.
* Rename `oss` to `ce` for license and naming changes.
* Convert template enos scripts to scripts that take environment
variables.
* Add `shellcheck` linting for enos module scripts.
* Add additional `backend` and `seal` support to `proxy` and `agent`
scenarios.
* Update scenarios to include all baseline verification.
* Add `proxy` and `agent` scenarios to artifact samples.
* Remove IP address verification from the `vault_get_cluster_ips`
modules and implement a new `vault_wait_for_leader` module.
* Determine follower IP addresses by querying vault in the
`vault_get_cluster_ips` module.
* Move replication-specific behavior out of the `vault_get_cluster_ips`
module and into its own `replication_data` module.
* Extend initial version support for the `upgrade` and `autopilot`
scenarios.
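
As an illustration of the `vault_wait_for_leader` responsibility described above (wait until the cluster has a leader, and confirm that leader is within the given group of hosts), here is a minimal Go sketch. It is not the actual Enos module, which is script-based; `waitForLeader`, `getLeader`, `errNoLeader`, and the sample addresses are hypothetical names chosen for this example.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNoLeader is a hypothetical error used to simulate a cluster that has
// not elected a leader yet.
var errNoLeader = errors.New("no leader elected yet")

// waitForLeader polls getLeader until it reports a leader address that is a
// member of hosts, or until attempts polls have been made. getLeader stands
// in for whatever query the real module sends to vault (an assumption).
func waitForLeader(getLeader func() (string, error), hosts []string, attempts int, interval time.Duration) (string, error) {
	allowed := make(map[string]bool, len(hosts))
	for _, h := range hosts {
		allowed[h] = true
	}

	lastErr := errors.New("no attempts were made")
	for i := 0; i < attempts; i++ {
		addr, err := getLeader()
		switch {
		case err != nil:
			lastErr = err
		case !allowed[addr]:
			lastErr = fmt.Errorf("leader %q is not in the expected host group", addr)
		default:
			return addr, nil
		}
		// Do not sleep after the final attempt.
		if i < attempts-1 {
			time.Sleep(interval)
		}
	}
	return "", fmt.Errorf("no leader found after %d attempts: %w", attempts, lastErr)
}

func main() {
	calls := 0
	// Simulated query: the first two polls find no leader.
	getLeader := func() (string, error) {
		calls++
		if calls < 3 {
			return "", errNoLeader
		}
		return "10.0.0.2", nil
	}

	leader, err := waitForLeader(getLeader, []string{"10.0.0.1", "10.0.0.2"}, 5, 100*time.Millisecond)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("leader:", leader)
}
```

Keeping the wait-and-verify step separate from IP gathering mirrors the split into individual modules: a failure here reports either "no leader" or "leader outside the host group" rather than a generic lookup error.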
We also discovered an issue with undo_logs that has been described in
VAULT-20259. As such, we've disabled the undo_logs check until
it has been fixed.
Signed-off-by: Ryan Cragun <me@ryan.ec>
- Do not load existing ACME challenges persisted within storage on non-active nodes. This was the main culprit of the issues: secondary nodes would load existing persisted challenges and try to resolve them, but writes would fail, leading to the excessive logging.
- We now handle this by not starting the ACME background thread on non-active nodes, while also checking within the scheduling loop and breaking out. That will force a re-reading of the Closing channel that should have been called by the PKI plugin's Cleanup method.
- If a node is stepped down from being the active node while it is actively processing a verification, we could get into an infinite loop due to an ErrReadOnly error when attempting to clean up a challenge entry.
- Add a maximum number of retries for errors when attempting to decode or fetch challenge/authorization entries from disk. For these types of errors we use double the number of "normal" max attempts used for normal ACME retry attempts, to avoid collision issues. Note that these additional retry attempts are not persisted to disk and will restart on every node start.
- Add a 1 second backoff to any disk-related error to not immediately spin on disk I/O errors for challenges.
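
The bounded retry and backoff behavior described above can be sketched roughly as follows. This is a simplified illustration, not the PKI plugin's actual code: `maxRetryAttempts` (and its value of 5), `maxStorageErrorAttempts`, `storageErrorBackoff`, `errStorage`, and `processWithRetries` are all hypothetical names and values. The attempt counter lives only in memory, matching the note that these retries are not persisted and restart on every node start.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// maxRetryAttempts is a hypothetical stand-in for the "normal" ACME retry
// limit; the real value is not stated in the commit message.
const maxRetryAttempts = 5

// maxStorageErrorAttempts doubles the normal limit for decode/fetch errors.
const maxStorageErrorAttempts = 2 * maxRetryAttempts

// storageErrorBackoff is the 1 second pause applied after a disk error.
const storageErrorBackoff = time.Second

// errStorage simulates a disk-related failure.
var errStorage = errors.New("storage error")

// processWithRetries calls fetch until it succeeds or the storage error
// limit is reached, pausing for backoff between failed attempts. It returns
// the number of attempts made.
func processWithRetries(fetch func() error, backoff time.Duration) (int, error) {
	var lastErr error
	for attempt := 1; attempt <= maxStorageErrorAttempts; attempt++ {
		lastErr = fetch()
		if lastErr == nil {
			return attempt, nil
		}
		// Back off before retrying so we do not spin on disk errors.
		if attempt < maxStorageErrorAttempts {
			time.Sleep(backoff)
		}
	}
	return maxStorageErrorAttempts, fmt.Errorf("giving up after %d attempts: %w", maxStorageErrorAttempts, lastErr)
}

func main() {
	calls := 0
	// Simulated fetch: fails twice, then succeeds.
	fetch := func() error {
		calls++
		if calls < 3 {
			return errStorage
		}
		return nil
	}

	attempts, err := processWithRetries(fetch, storageErrorBackoff)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("succeeded after attempts:", attempts)
}
```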
* change currentPage to page to be consistent
* replace pagination in listview and always show pagination
* wip
* fix query param issue
* access identity aliases index
* leases done and dusted
* policies and secrets backend
* remove list Pagination
* changelog
* add confirm modal for downloading masked data
* close modal if user clicks download
* add changelog
* pass onSuccess function instead
* only render modal on DOM if download is allowed
* wip
* Initial draft of Seal HA docs
* nav data
* Fix env var name
* title
* Note partially wrapped values and disabled seal participation
* Update website/data/docs-nav-data.json
Co-authored-by: Steven Clark <steven.clark@hashicorp.com>
* correct initial upgrade limitation
* Add note about shamir seals and migration
* fix nav json
* snapshot note
* availability note
* seal-backend-status
* Add a couple more clarifying statements
* header typo
* correct initial upgrade wording
* Update website/content/docs/configuration/seal/seal-ha.mdx
Co-authored-by: Steven Clark <steven.clark@hashicorp.com>
* Update website/content/docs/concepts/seal.mdx
Co-authored-by: Steven Clark <steven.clark@hashicorp.com>
---------
Co-authored-by: Steven Clark <steven.clark@hashicorp.com>
* Add test to demonstrate a split-brain active node when using Consul
* Add Consul session check to prevent split-brain updates
* It's not right
Co-authored-by: Josh Black <raskchanky@gmail.com>
---------
Co-authored-by: Josh Black <raskchanky@gmail.com>