mirror of
https://github.com/optim-enterprises-bv/vault.git
synced 2025-11-02 03:27:54 +00:00
Documentation for Adaptive Overload Protection (#26690)
* Document enabling config
* Fix nav data JSON after disabling over-zealous prettifier
* Address review feedback
* Add warning about reloading config during overload
* Bad metrics links
* Another bad link
* Add upgrade note about deprecation

---------

Co-authored-by: Mike Palmiotto <mike.palmiotto@hashicorp.com>
@@ -0,0 +1,94 @@
---
layout: docs
page_title: 'Adaptive overload protection'
description: >-
  Vault Enterprise provides adaptive overload protection to automatically
  prevent workloads from overloading different resources of the Vault servers.
---

# Adaptive overload protection

@include 'alerts/enterprise-only.mdx'

@include 'alerts/beta.mdx'

Adaptive overload protection refers to a set of features in Vault Enterprise
that prevent client requests from overwhelming server resources and degrading
availability.

## Preventing overload

Vault currently supports one type of adaptive overload protection, which prevents
Vault servers from being overwhelmed by write requests.

These protection measures are "adaptive" in the sense that they automatically
and continuously adjust to maintain optimal performance for the current workload
and available hardware resources without any user tuning.

Load testing and tuning appropriate limits are time-consuming for users during
initial setup. Even when clusters are carefully tuned during installation,
real-world workloads and hardware performance both change over time. A static
tuning will soon be sub-optimal or even completely ineffective at preventing
overloads.

For example, an increase in disk latency caused by failing hardware might reduce
the server's available throughput. A static limit configured while disks were
performing at their peak would not protect the degraded system from overload. By
adaptively responding to current load and performance characteristics, Vault
Enterprise is able to provide long-term protection against overloads.

## Types of overload

There are many potential resources that could become a performance bottleneck in
a Vault Enterprise cluster. Different forms of adaptive overload protection
target specific components and workloads. This allows each one to be carefully
specialized and tuned to the needs of that subsystem. The sections below
describe specific mechanisms that prevent overload of particular subsystems and
protect against particular types of overloads.

## Write overload protection

In Vault Enterprise, all writes go through the `WALBackend` to allow for
replication to other clusters. This is true even if replication is not being
used. Vault performs batching or "group commit" for these writes to increase
throughput. Optimal throughput for a given storage backend is obtained when
there are enough write requests in the queue to fill the next batch. However, if
there are more requests queued than will fit in a batch, latencies start to grow
quickly as all writes have to wait behind multiple other batches.
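
As a rough illustration of why latency grows once the queue holds more than one batch, the sketch below estimates how long a queued write waits before it is committed. The fixed batch size, the per-batch commit time, and the function name are hypothetical values chosen for the example; they are not Vault internals.

```python
import math


def queue_wait_ms(position: int, batch_size: int, commit_ms: float) -> float:
    """Estimate when the write at 1-based queue `position` is committed.

    Hypothetical model: writes are committed in fixed-size batches and
    each batch takes `commit_ms` to persist. Not Vault's actual algorithm.
    """
    if position < 1 or batch_size < 1:
        raise ValueError("position and batch_size must be >= 1")
    # A write completes when the batch that contains it has been committed.
    batches_needed = math.ceil(position / batch_size)
    return batches_needed * commit_ms


# Assumed numbers: 100 writes per batch, 10 ms per batch commit.
for depth in (100, 1_000, 10_000):
    print(depth, queue_wait_ms(depth, batch_size=100, commit_ms=10.0))
```

Under these assumed numbers, a queue one batch deep adds roughly one commit interval of latency, while a queue 100 batches deep adds 100 intervals, which is how a sustained burst can push every request past its client timeout.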

In some cases, a sudden influx of write requests that exceeds Vault's hardware
capacity can result in the writes queueing for so long that every request times
out before the write can make it through the queue. This makes Vault effectively
unavailable to clients even though it is still processing requests and storing
data as fast as it can. This is illustrated in the test results shown below for
a workload of 100% logins.

![Grafana dashboard showing metrics from a Vault cluster exposed to a large number of login requests. Latencies are initially around 2 seconds for all requests. After about a minute the Mean and P99 latencies both reach about 20 seconds and the percentage of successful requests drops to 0.](/img/adaptive-overload-protection-writes.png)

Adaptive write overload protection prevents this scenario. It constantly
monitors the current state of the write queue and uses a carefully tuned
algorithm to allow just enough queueing to maximize throughput on the available
hardware while keeping latencies under control and unnecessary rejections to a
minimum.

Write overload protection was added in Vault Enterprise 1.17 as a beta feature
that is disabled by default.

To enable the feature, use the [`adaptive_overload_protection` configuration
stanza](/vault/docs/configuration/adaptive-overload-protection).

### Metrics

Operators may wish to monitor metrics related to the write overload protection
controller. The most useful of these is `reject_fraction`, which represents
the controller's current estimate of the fraction of write requests that need
to be rejected to maintain optimal throughput and stability.
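
One way to build intuition for the value: a reject fraction of 0.25 means roughly one write in four would be turned away. The sketch below models that as a random choice per request. It is an illustrative model of load shedding only; the source does not describe how Vault applies the fraction internally, and `should_reject` is a hypothetical helper.

```python
import random


def should_reject(reject_fraction: float, rng: random.Random) -> bool:
    """Decide whether to shed one write, given a fraction in [0, 1]."""
    if not 0.0 <= reject_fraction <= 1.0:
        raise ValueError("reject_fraction must be between 0 and 1")
    # rng.random() is uniform on [0, 1), so the comparison is true
    # with probability reject_fraction.
    return rng.random() < reject_fraction


rng = random.Random(42)  # seeded only to make the example repeatable
rejected = sum(should_reject(0.25, rng) for _ in range(10_000))
print(f"rejected {rejected} of 10000 writes")
```

A value that stays at 0 indicates no writes need to be shed, while a value that climbs toward 1 indicates the write path is saturated.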

See the [wal.write_controller.reject_fraction metrics reference](/vault/docs/internals/telemetry/metrics/availability#vault-wal-write_controller-reject_fraction).

## Client handling of overloads

When Vault has reached capacity, new requests will be immediately rejected with
a retryable `503 - Service Unavailable`. See [Vault Server Temporarily
Overloaded](/vault/docs/concepts/adaptive-overload-protection/vault-server-temporarily-overloaded)
for additional considerations around handling this error correctly.
@@ -0,0 +1,51 @@
---
layout: docs
page_title: Vault server temporarily overloaded
description: |-
  How to handle Vault servers rejecting requests due to overload.
---

Vault Enterprise includes features for [Adaptive Overload
Protection](/vault/docs/concepts/adaptive-overload-protection). When some server
resource is at capacity, Vault Enterprise may reject some HTTP client requests
to preserve the Vault server's ability to remain stable and available. This
document describes considerations for handling these requests in client code.

# Vault server temporarily overloaded

Vault returns a `503 - Service Unavailable` response to indicate that a request
was rejected because there was not enough capacity to service it in a timely way:

```
Error making API request.

URL: PUT https://127.0.0.1:61555/v1/auth/userpass/login/foo
Code: 503. Errors:

* 1 error occurred:
	* Vault server temporarily overloaded
```

`503 - Service Unavailable` is a retryable HTTP error.

Vault clients should retry their request with a suitable backoff strategy.
When retrying, you should:
* Wait for an increasing amount of time between retries.
* Randomize the wait time between retries to avoid many clients becoming
  synchronized and all retrying at the same moment. This is often called
  adding "jitter".
* Limit the total number of retries so that request volume doesn't continue to
  grow for the duration of an outage as more and more clients add on retries.
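
The three recommendations above can be combined into a small retry helper. The following is a minimal sketch, not a Vault client API: the `Overloaded` exception, the function names, and the default base delay, cap, and retry count are all assumptions made for illustration. It uses the common "full jitter" variant, where each wait is drawn uniformly between zero and an exponentially growing ceiling.

```python
import random
import time
from typing import Callable, List


class Overloaded(Exception):
    """Hypothetical error raised when the server answers 503."""


def backoff_delays(max_retries: int, base: float, cap: float,
                   rng: random.Random) -> List[float]:
    """Return one randomized wait (in seconds) per retry attempt."""
    delays = []
    for attempt in range(max_retries):
        # The ceiling doubles with each attempt, bounded by `cap`.
        ceiling = min(cap, base * (2 ** attempt))
        # "Full jitter": pick uniformly between 0 and the ceiling.
        delays.append(rng.uniform(0.0, ceiling))
    return delays


def call_with_retries(request: Callable[[], str], max_retries: int = 5,
                      base: float = 0.1, cap: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Call `request`, retrying on Overloaded at most `max_retries` times."""
    for delay in backoff_delays(max_retries, base, cap, random.Random()):
        try:
            return request()
        except Overloaded:
            sleep(delay)
    # Final attempt; an Overloaded error now propagates to the caller.
    return request()
```

Capping the number of retries means a client gives up and reports the failure after a bounded number of attempts instead of adding to the load indefinitely.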

~> **NOTE**: `429 - Too Many Requests` is typically used to indicate that a
specific client is issuing too many requests. A `503 - Service Unavailable`
instead indicates that the server is under excess load, which is likely to
be unrelated to the behavior of the specific client being rejected.

For more information on request rejection, refer to the [Adaptive Overload
Protection Overview](/vault/docs/concepts/adaptive-overload-protection).

## API Package

For clients written in Go that use Vault's API package, retries are handled by
default with no further work needed.
@@ -10,7 +10,14 @@ description: >-

@include 'alerts/enterprise-only.mdx'

@include 'alerts/beta.mdx'
<Warning title="Beta (Deprecated)">

The request limiter was released in Vault 1.16 as a Beta
feature. During Beta evaluation, we found that an alternative approach better met
the needs of our users. This feature will be removed from Vault in a future
release. It is replaced with [adaptive overload protection](/vault/docs/concepts/adaptive-overload-protection).

</Warning>

This document contains conceptual information about the **Request Limiter** and
its user-facing effects.
@@ -71,4 +78,4 @@ needing to retry.

When Vault has reached capacity, new requests will be immediately rejected with a
retryable `503 - Service Unavailable`
[error](/vault/docs/concepts/request-limiter/vault-server-temporarily-overloaded).
[error](/vault/docs/concepts/adaptive-overload-protection/vault-server-temporarily-overloaded).
@@ -1,33 +0,0 @@
---
layout: docs
page_title: Vault server temporarily overloaded
description: |-
  Vault Enterprise error when the request limiter is at capacity.
---

# Vault server temporarily overloaded

Vault returns a `503 - Service Unavailable` response to indicate that a request
was rejected after Vault has reached its in-flight request capacity:

```
Error making API request.

URL: PUT https://127.0.0.1:61555/v1/auth/userpass/login/foo
Code: 503. Errors:

* 1 error occurred:
	* Vault server temporarily overloaded
```

`503 - Service Unavailable` is a retryable HTTP error, which is handled by the
Vault API `Client` implementation.

~> **NOTE**: `429 - Too Many Requests` is typically used to indicate that a
specific client is issuing too many requests. The choice of `503 - Service
Unavailable` for request rejection emphasizes that the server is
temporarily under excess load, which may not be related to the behavior of a
specific client.

For more information on request rejection, refer to the [Request
Limiter](/vault/docs/concepts/request-limiter) documentation.
@@ -0,0 +1,46 @@
---
layout: docs
page_title: Adaptive overload protection - Configuration
description: |-
  Use adaptive overload protection with Vault Enterprise to automatically
  prevent workloads from overloading different resources of your Vault servers.
---

# `adaptive_overload_protection`

@include 'alerts/enterprise-only.mdx'

@include 'alerts/beta.mdx'

Configure the `adaptive_overload_protection` stanza to control overload
protection features for your Vault server.

@include 'config-reload-supported.mdx'

<Warning title="Do not disable during overload">

Do not disable the adaptive overload protection features during an overload.
This feature is designed to protect your Vault server from overload conditions.
Disabling it can lead to poor availability.

</Warning>

For more information, read [Adaptive Overload
Protection](/vault/docs/concepts/adaptive-overload-protection).


```hcl
adaptive_overload_protection {
  disable_write_controller = false
}
```

## `adaptive_overload_protection` parameters

These parameters apply to the `adaptive_overload_protection` stanza in the Vault
configuration file:

- `disable_write_controller` `(bool: <optional>)`: Disables the adaptive write
  overload controller. Defaults to `true` (controller disabled). Set
  `disable_write_controller` to `false` to enable the write controller and opt
  in to the beta functionality.
@@ -10,7 +10,14 @@ description: |-

@include 'alerts/enterprise-only.mdx'

@include 'alerts/beta.mdx'
<Warning title="Deprecated beta feature">

Vault 1.16 included the request limiter as a Beta feature. During the beta, we
found an alternative approach that better meets user needs. The request limiter
has been deprecated in favor of [adaptive overload
protection](/vault/docs/concepts/adaptive-overload-protection).

</Warning>

The `request_limiter` stanza allows operators to turn on the adaptive
concurrency limiter, which is off by default. This is a reloadable config.
@@ -768,6 +768,14 @@ alphabetic order by name.

@include 'telemetry-metrics/vault/wal/persistwals.mdx'

@include 'telemetry-metrics/vault/wal/write_controller/d.mdx'

@include 'telemetry-metrics/vault/wal/write_controller/i.mdx'

@include 'telemetry-metrics/vault/wal/write_controller/p.mdx'

@include 'telemetry-metrics/vault/wal/write_controller/reject_fraction.mdx'

@include 'telemetry-metrics/vault/zookeeper/delete.mdx'

@include 'telemetry-metrics/vault/zookeeper/get.mdx'
@@ -49,6 +49,14 @@ your Vault instance. Enterprise installations also include

@include 'telemetry-metrics/vault/wal/persistwals.mdx'

@include 'telemetry-metrics/vault/wal/write_controller/d.mdx'

@include 'telemetry-metrics/vault/wal/write_controller/i.mdx'

@include 'telemetry-metrics/vault/wal/write_controller/p.mdx'

@include 'telemetry-metrics/vault/wal/write_controller/reject_fraction.mdx'

## Log shipping metrics

@include 'telemetry-metrics/vault/logshipper/buffer/length.mdx'
@@ -50,6 +50,18 @@ to control the truncation behavior. Setting the issuer `leaf_not_after_behavior`
field to `permit` and `enforce_leaf_not_after_behavior` to true restores the
legacy behavior.

### Request limiter deprecation

Vault 1.16.0 included an experimental request limiter. The limiter was disabled
by default. Further testing indicated that an alternative approach improves
performance and reduces risk for many workloads. Vault 1.17.0 includes a
new [adaptive overload
protection](/vault/docs/concepts/adaptive-overload-protection) feature that
prevents outages when Vault is overwhelmed by write requests. Adaptive overload
protection is a beta feature in 1.17.0 and is disabled by default.

The beta request limiter will be removed from Vault entirely in a later release.

## Known issues and workarounds

@include 'known-issues/ocsp-redirect.mdx'
5
website/content/partials/config-reload-supported.mdx
Normal file
@@ -0,0 +1,5 @@
<Note title="Configuration reload supported">

Restart or reload your Vault server for configuration updates to take effect.

</Note>
@@ -1,2 +1,3 @@
Request Limiter metrics relate to request success signals observed by the
request limiter and its current state.
Request Limiter metrics relate to request success signals observed by the
request limiter and its current state. Note that the [request limiter is deprecated](/vault/docs/upgrading/upgrade-to-1.17.x#request-limiter-deprecation)
and will be removed in a future Vault version.
@@ -0,0 +1,9 @@
### vault.wal.write_controller.d ((#vault-wal-write_controller-d))

Metric type | Value   | Description
----------- | ------- | -----------
gauge       | number  | Current derivative value computed by the write controller.

The `vault.wal.write_controller.d` metric has limited production use, but Vault
developers may find `vault.wal.write_controller.d` useful for tuning or
debugging controller behavior.
@@ -0,0 +1,10 @@
### vault.wal.write_controller.i ((#vault-wal-write_controller-i))

Metric type | Value   | Description
----------- | ------- | -----------
gauge       | number  | Current integral value computed by the write controller.


The `vault.wal.write_controller.i` metric has limited production use, but Vault
developers may find `vault.wal.write_controller.i` useful for tuning or
debugging controller behavior.
@@ -0,0 +1,9 @@
### vault.wal.write_controller.p ((#vault-wal-write_controller-p))

Metric type | Value   | Description
----------- | ------- | -----------
gauge       | number  | Current proportional error value detected by the write controller.

The `vault.wal.write_controller.p` metric has limited production use, but Vault
developers may find `vault.wal.write_controller.p` useful for tuning or
debugging controller behavior.
@@ -0,0 +1,8 @@
### vault.wal.write_controller.reject_fraction ((#vault-wal-write_controller-reject_fraction))

Metric type | Value   | Description
----------- | ------- | -----------
gauge       | number  | The estimated fraction of write requests that must be rejected to maintain cluster stability.

The [write controller](/vault/docs/concepts/adaptive-overload-protection) reject
fraction is an estimate between 0 and 1.
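
The `p`, `i`, and `d` metric names suggest a proportional-integral-derivative (PID) style controller, although the source does not describe Vault's algorithm. As a generic textbook sketch only, with a hypothetical class name, gains, and setpoint, a PID loop whose output is clamped to a fraction between 0 and 1 might look like this:

```python
class PIDController:
    """Textbook PID loop; gains and clamping are illustrative assumptions."""

    def __init__(self, kp: float, ki: float, kd: float, setpoint: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, measured: float, dt: float) -> float:
        """Return a reject fraction in [0, 1] for the latest measurement."""
        error = measured - self.setpoint             # proportional input
        self.integral += error * dt                  # accumulated error
        derivative = (error - self.prev_error) / dt  # rate of change
        self.prev_error = error
        output = (self.kp * error + self.ki * self.integral
                  + self.kd * derivative)
        return max(0.0, min(1.0, output))            # clamp to a fraction


# Hypothetical use: try to hold queue depth near 100 entries.
pid = PIDController(kp=0.002, ki=0.0005, kd=0.0, setpoint=100.0)
for depth in (80, 150, 400, 120):
    print(depth, round(pid.update(depth, dt=1.0), 3))
```

In this sketch the output stays at 0 while the measured queue depth is at or below the setpoint and rises as the depth overshoots it.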
@@ -308,7 +308,7 @@
{
  "title": "Request Limiter",
  "badge": {
    "text": "ENTERPRISE",
    "text": "ENTERPRISE | DEPRECATED",
    "type": "outlined",
    "color": "neutral"
  },
@@ -321,10 +321,29 @@
          "type": "outlined",
          "color": "highlight"
        }
      }
    ]
  },
  {
    "title": "Adaptive overload protection",
    "badge": {
      "text": "ENTERPRISE | BETA",
      "type": "outlined",
      "color": "neutral"
    },
    "routes": [
      {
        "title": "Overview",
        "path": "concepts/adaptive-overload-protection",
        "badge": {
          "text": "BETA",
          "type": "outlined",
          "color": "highlight"
        }
      },
      {
        "title": "Vault server temporarily overloaded",
        "path": "concepts/request-limiter/vault-server-temporarily-overloaded"
        "path": "concepts/adaptive-overload-protection/vault-server-temporarily-overloaded"
      }
    ]
  }
@@ -544,6 +563,10 @@
  "title": "<code>Request Limiter</code>",
  "path": "configuration/request-limiter"
},
{
  "title": "Adaptive overload protection",
  "path": "configuration/adaptive-overload-protection"
},
{
  "title": "<code>ui</code>",
  "path": "configuration/ui"
BIN
website/public/img/adaptive-overload-protection-writes.png
Normal file
Binary file not shown.
Size: 147 KiB