Documentation for Adaptive Overload Protection (#26690)

* Document enabling config

* Fix nav data JSON after disabling over-zealous prettifier

* Address review feedback

* Add warning about reloading config during overload

* Bad metrics links

* Another bad link

* Add upgrade note about deprecation

---------

Co-authored-by: Mike Palmiotto <mike.palmiotto@hashicorp.com>
Paul Banks
2024-05-10 17:55:57 +01:00
committed by GitHub
parent fc4042bd2e
commit 0a06215d1a
17 changed files with 305 additions and 40 deletions


@@ -0,0 +1,94 @@
---
layout: docs
page_title: 'Adaptive overload protection'
description: >-
Vault Enterprise provides adaptive overload protection to automatically
prevent workloads from overloading different resources of the Vault servers.
---
# Adaptive overload protection
@include 'alerts/enterprise-only.mdx'
@include 'alerts/beta.mdx'
Adaptive overload protection refers to a set of features in Vault Enterprise
that prevent client requests from overwhelming server resources and degrading
availability.
## Preventing overload
Vault currently supports one type of adaptive overload protection that prevents
Vault servers from being overwhelmed by write requests.
These protection measures are "Adaptive" in the sense that they automatically
and continuously adjust to maintain optimal performance for the current workload
and hardware resources available without any user tuning.
Load testing and tuning of appropriate limits is time-consuming for users during
initial setup. Even when clusters are carefully tuned during installation,
real-world workloads and hardware performance both change over time. A static
tuning will soon be sub-optimal or even completely ineffective at preventing
overloads.
For example, an increase in disk latency caused by failing hardware might reduce
the server's available throughput. A static limit configured while disks were
performing at their peak would not protect the degraded system from overload. By
adaptively responding to current load and performance characteristics, Vault
Enterprise is able to provide long-term protection against overloads.
## Types of overload
There are many potential resources that could become a performance bottleneck in
a Vault Enterprise cluster. Different forms of adaptive overload protection
target specific components and workloads. This allows each one to be carefully
specialized and tuned to the needs of that sub-system. The sections below
describe specific mechanisms that prevent overload of particular subsystems and
protect against particular types of overloads.
## Write overload protection
In Vault Enterprise, all writes go through the `WALBackend` to allow for
replication to other clusters. This is true even if replication is not being
used. Vault performs batching or "group commit" for these writes to increase
throughput. Optimal throughput for a given storage backend is obtained when
there are enough write requests in the queue to fill the next batch. However, if
there are more requests queued than will fit in a batch, latencies start to grow
quickly as all writes have to wait behind multiple other batches.
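The effect of queue depth on latency can be sketched with a deliberately simplified model. The function name `queueWait`, the batch size, and the commit time below are illustrative assumptions, not Vault internals or measured values; the sketch only shows why waiting time grows with the number of batches ahead of a request.

```go
package main

import (
	"fmt"
	"time"
)

// queueWait estimates how long the last queued write waits before its batch
// commits, under a simplified model: batches hold at most batchSize writes,
// commit one at a time, and every commit takes commitTime.
// The model and all names are hypothetical, for illustration only.
func queueWait(queued, batchSize int, commitTime time.Duration) time.Duration {
	if queued <= 0 || batchSize <= 0 {
		return 0
	}
	// Batches that must commit, including the one that carries the last
	// queued write (ceiling division).
	batches := (queued + batchSize - 1) / batchSize
	return time.Duration(batches) * commitTime
}

func main() {
	const batchSize = 64                   // assumed batch capacity
	const commitTime = 5 * time.Millisecond // assumed time per batch commit

	for _, queued := range []int{32, 64, 640, 6400} {
		fmt.Printf("queued=%d wait=%v\n", queued, queueWait(queued, batchSize, commitTime))
	}
}
```

In this model, a queue that fits in one batch waits for a single commit, while a queue one hundred times larger waits for one hundred commits, which can exceed a typical client timeout.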
In some cases, a sudden influx of write requests that exceeds Vault's hardware
capacity can result in the writes queueing for so long that every request times
out before the write can make it through the queue. This makes Vault effectively
unavailable to clients even though it is still processing requests and storing
data as fast as it can. This is illustrated in the test results shown below for
a workload of 100% logins.
![Login workload telemetry graphs showing difference with and without adaptive overload protection for writes](/img/adaptive-overload-protection-writes.png)
Adaptive Write Overload Protection prevents this scenario. It constantly
monitors the current state of the write queue and uses a carefully tuned
algorithm to allow just enough queueing to maximize throughput on the available
hardware while keeping latencies under control and unnecessary rejections to a
minimum.
Write overload protection was added in Vault Enterprise 1.17 as a beta feature
and is disabled by default.
To enable the feature, use the [`adaptive_overload_protection` configuration
stanza](/vault/docs/configuration/adaptive-overload-protection).
### Metrics
Operators may wish to monitor metrics related to the write overload protection
controller. The most useful of these is the `reject_fraction` which represents
the controller's current estimate for the fraction of write requests that need
to be rejected to maintain optimal throughput and stability.
See the [wal.write_controller.reject_fraction metrics reference](/vault/docs/internals/telemetry/metrics/availability#vault-wal-write_controller-reject_fraction).
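As a rough reading aid, the arithmetic behind a reject fraction can be sketched as follows. The function `expectedAdmitted` and the offered load of 1000 writes per second are hypothetical; the gauge is the controller's estimate rather than a measured count of rejections.

```go
package main

import "fmt"

// expectedAdmitted returns the expected number of write requests per second
// that are still admitted when rejectFraction of the offered load is
// rejected. Values outside [0, 1] are clamped. Illustrative only.
func expectedAdmitted(offeredPerSec, rejectFraction float64) float64 {
	if rejectFraction < 0 {
		rejectFraction = 0
	}
	if rejectFraction > 1 {
		rejectFraction = 1
	}
	return offeredPerSec * (1 - rejectFraction)
}

func main() {
	// Hypothetical offered load of 1000 writes per second.
	for _, f := range []float64{0, 0.25, 0.5} {
		fmt.Printf("reject_fraction=%.2f admitted=%.0f/s\n", f, expectedAdmitted(1000, f))
	}
}
```

A value that stays at 0 suggests the controller sees no need to shed writes, while a sustained non-zero value suggests the offered write load exceeds what the storage can absorb.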
## Client handling of overloads
When Vault has reached capacity, new requests will be immediately rejected with
a retryable `503 - Service Unavailable`. See [Vault Server Temporarily
Overloaded](/vault/docs/concepts/adaptive-overload-protection/vault-server-temporarily-overloaded)
for additional considerations around handling this error correctly.


@@ -0,0 +1,51 @@
---
layout: docs
page_title: Vault server temporarily overloaded
description: |-
How to handle Vault servers rejecting requests due to overload.
---
Vault Enterprise includes features for [Adaptive Overload
Protection](/vault/docs/concepts/adaptive-overload-protection). When some server
resource is at capacity, Vault Enterprise may reject some HTTP client requests
to preserve the Vault server's ability to remain stable and available. This
document describes considerations for handling these requests in client code.
# Vault server temporarily overloaded
Vault returns a `503 - Service Unavailable` response to indicate that a request
was rejected because there was not enough capacity to service it in a timely way:
```
Error making API request.
URL: PUT https://127.0.0.1:61555/v1/auth/userpass/login/foo
Code: 503. Errors:
* 1 error occurred:
* Vault server temporarily overloaded
```
`503 - Service Unavailable` is a retryable HTTP error.
Vault clients should retry their request with a suitable backoff strategy.
When retrying you should:
* Wait for an increasing amount of time between retries.
* Randomize the wait time between retries to avoid many clients becoming
synchronized and all retrying at the same moment. This is often called
adding "jitter".
* Limit the total number of retries so that request volume doesn't continue to
grow for the duration of an outage as more and more clients add on retries.
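For clients that implement their own retry loop, the three points above can be sketched as below. This is a minimal illustration under stated assumptions, not Vault's client implementation: `errOverloaded` stands in for a `503` response, and `backoff`, `withRetries`, the 100 ms base, and the 5 s cap are hypothetical names and values.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// errOverloaded is a hypothetical stand-in for a 503 response.
var errOverloaded = errors.New("503: Vault server temporarily overloaded")

// backoff returns the wait before retry number attempt (0-based): a random
// duration in [0, min(maxWait, base*2^attempt)]. The upper bound doubles on
// each retry and the random draw adds jitter.
func backoff(attempt int, base, maxWait time.Duration) time.Duration {
	ceiling := maxWait
	// Guard the shift so large attempt counts cannot overflow.
	if attempt >= 0 && attempt < 30 {
		if d := base << uint(attempt); d > 0 && d < maxWait {
			ceiling = d
		}
	}
	if ceiling <= 0 {
		return 0
	}
	return time.Duration(rand.Int63n(int64(ceiling) + 1))
}

// withRetries calls op until it succeeds, fails with a non-retryable error,
// or maxRetries retries have been used. It returns the number of attempts
// made. sleep is a parameter so callers can substitute it in tests.
func withRetries(op func() error, maxRetries int, sleep func(time.Duration)) (int, error) {
	for attempt := 0; ; attempt++ {
		err := op()
		if err == nil || !errors.Is(err, errOverloaded) {
			return attempt + 1, err
		}
		if attempt >= maxRetries {
			return attempt + 1, fmt.Errorf("giving up after %d retries: %w", maxRetries, err)
		}
		sleep(backoff(attempt, 100*time.Millisecond, 5*time.Second))
	}
}

func main() {
	calls := 0
	op := func() error {
		calls++
		if calls < 3 {
			return errOverloaded // the first two attempts are rejected
		}
		return nil
	}
	attempts, err := withRetries(op, 4, time.Sleep)
	fmt.Println("attempts:", attempts, "err:", err)
}
```

Capping both the wait and the number of retries keeps a single request from waiting indefinitely and keeps total retry traffic bounded during a long outage.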
~> **NOTE**: `429 - Too Many Requests` is typically used to indicate that a
specific client is issuing too many requests. A `503 - Service Unavailable`
instead indicates that the server is under excess load, which is likely to
be unrelated to the behavior of the specific client being rejected.
For more information on request rejection, refer to the [Adaptive Overload
Protection Overview](/vault/docs/concepts/adaptive-overload-protection).
## API Package
For clients written in Go that use Vault's API package, retries are handled by
default with no further work needed.


@@ -10,7 +10,14 @@ description: >-
@include 'alerts/enterprise-only.mdx'
@include 'alerts/beta.mdx'
<Warning title="Beta (Deprecated)">
The request limiter was released in Vault 1.16 as a Beta
feature. During Beta evaluation we found an alternative approach better met
the needs of our users. This feature will be removed from Vault in a future
release. It is replaced with [adaptive overload protection](/vault/docs/concepts/adaptive-overload-protection).
</Warning>
This document contains conceptual information about the **Request Limiter** and
its user-facing effects.
@@ -71,4 +78,4 @@ needing to retry.
When Vault has reached capacity, new requests will be immediately rejected with a
retryable `503 - Service Unavailable`
[error](/vault/docs/concepts/request-limiter/vault-server-temporarily-overloaded).
[error](/vault/docs/concepts/adaptive-overload-protection/vault-server-temporarily-overloaded).


@@ -1,33 +0,0 @@
---
layout: docs
page_title: Vault server temporarily overloaded
description: |-
Vault Enterprise error when the request limiter is at capacity.
---
# Vault server temporarily overloaded
Vault returns a `503 - Service Unavailable` response to indicate that a request
was rejected after Vault has reached its in-flight request capacity:
```
Error making API request.
URL: PUT https://127.0.0.1:61555/v1/auth/userpass/login/foo
Code: 503. Errors:
* 1 error occurred:
* Vault server temporarily overloaded
```
`503 - Service Unavailable` is a retryable HTTP error, which is handled by the
Vault API `Client` implementation.
~> **NOTE**: `429 - Too Many Requests` is typically used to indicate that a
specific client is issuing too many requests. The choice of `503 - Service
Unavailable` for request rejection emphasizes that the server is
temporarily under excess load, which may not be related to the behavior of a
specific client.
For more information on request rejection, refer to the [Request
Limiter](/vault/docs/concepts/request-limiter) documentation.


@@ -0,0 +1,46 @@
---
layout: docs
page_title: Adaptive overload protection - Configuration
description: |-
Use adaptive overload protection with Vault Enterprise to automatically
prevent workloads from overloading different resources of your Vault servers.
---
# `adaptive_overload_protection`
@include 'alerts/enterprise-only.mdx'
@include 'alerts/beta.mdx'
Configure the `adaptive_overload_protection` stanza to control overload
protection features for your Vault server.
@include 'config-reload-supported.mdx'
<Warning title="Do not disable during overload">
Do not disable the adaptive overload protection features during an overload.
This feature is designed to protect your Vault server from overload conditions.
Disabling it can lead to poor availability.
</Warning>
For more information read [Adaptive Overload
Protection](/vault/docs/concepts/adaptive-overload-protection).
```hcl
adaptive_overload_protection {
disable_write_controller = false
}
```
## `adaptive_overload_protection` parameters
These parameters apply to the `adaptive_overload_protection` stanza in the Vault
configuration file:
- `disable_write_controller` `(bool: <optional>)`: Disables the adaptive write
overload controller. Defaults to `true` (controller disabled). Set
`disable_write_controller` to `false` to enable the write controller and opt
in to the beta functionality.


@@ -10,7 +10,14 @@ description: |-
@include 'alerts/enterprise-only.mdx'
@include 'alerts/beta.mdx'
<Warning title="Deprecated beta feature">
Vault 1.16 included the request limiter as a Beta feature. During the beta, we
found an alternative approach that better meets user needs. The request limiter
has been deprecated in favor of [adaptive overload
protection](/vault/docs/concepts/adaptive-overload-protection).
</Warning>
The `request_limiter` stanza allows operators to turn on the adaptive
concurrency limiter, which is off by default. This is a reloadable config.


@@ -768,6 +768,14 @@ alphabetic order by name.
@include 'telemetry-metrics/vault/wal/persistwals.mdx'
@include 'telemetry-metrics/vault/wal/write_controller/d.mdx'
@include 'telemetry-metrics/vault/wal/write_controller/i.mdx'
@include 'telemetry-metrics/vault/wal/write_controller/p.mdx'
@include 'telemetry-metrics/vault/wal/write_controller/reject_fraction.mdx'
@include 'telemetry-metrics/vault/zookeeper/delete.mdx'
@include 'telemetry-metrics/vault/zookeeper/get.mdx'


@@ -49,6 +49,14 @@ your Vault instance. Enterprise installations also include
@include 'telemetry-metrics/vault/wal/persistwals.mdx'
@include 'telemetry-metrics/vault/wal/write_controller/d.mdx'
@include 'telemetry-metrics/vault/wal/write_controller/i.mdx'
@include 'telemetry-metrics/vault/wal/write_controller/p.mdx'
@include 'telemetry-metrics/vault/wal/write_controller/reject_fraction.mdx'
## Log shipping metrics
@include 'telemetry-metrics/vault/logshipper/buffer/length.mdx'


@@ -50,6 +50,18 @@ to control truncation the behavior. Setting the issuer `leaf_not_after_behavior`
field to `permit` and `enforce_leaf_not_after_behavior` to true restores the
legacy behavior.
### Request limiter deprecation
Vault 1.16.0 included an experimental request limiter. The limiter was disabled
by default. Further testing indicated that an alternative approach improves
performance and reduces risk for many workloads. Vault 1.17.0 includes a
new [adaptive overload
protection](/vault/docs/concepts/adaptive-overload-protection) feature that
prevents outages when Vault is overwhelmed by write requests. Adaptive overload
protection is a beta feature in 1.17.0 and is disabled by default.
The beta request limiter will be removed from Vault entirely in a later release.
## Known issues and workarounds
@include 'known-issues/ocsp-redirect.mdx'


@@ -0,0 +1,5 @@
<Note title="Configuration reload supported">
Restart or reload your Vault server for configuration updates to take effect.
</Note>


@@ -1,2 +1,3 @@
Request Limiter metrics relate to request success signals observed by the
request limiter and its current state.
Request Limiter metrics relate to request success signals observed by the
request limiter and its current state. Note the [request limiter is deprecated](/vault/docs/upgrading/upgrade-to-1.17.x#request-limiter-deprecation)
and will be removed in future Vault versions.


@@ -0,0 +1,9 @@
### vault.wal.write_controller.d ((#vault-wal-write_controller-d))
Metric type | Value | Description
----------- | ------- | -----------
gauge | number | Current derivative value computed by the write controller.
The `vault.wal.write_controller.d` metric has limited production use, but Vault
developers may find `vault.wal.write_controller.d` useful for tuning or
debugging controller behavior.


@@ -0,0 +1,10 @@
### vault.wal.write_controller.i ((#vault-wal-write_controller-i))
Metric type | Value | Description
----------- | ------- | -----------
gauge | number | Current integral value computed by the write controller.
The `vault.wal.write_controller.i` metric has limited production use, but Vault
developers may find `vault.wal.write_controller.i` useful for tuning or
debugging controller behavior.


@@ -0,0 +1,9 @@
### vault.wal.write_controller.p ((#vault-wal-write_controller-p))
Metric type | Value | Description
----------- | ------- | -----------
gauge | number | Current proportional error value detected by the write controller.
The `vault.wal.write_controller.p` metric has limited production use, but Vault
developers may find `vault.wal.write_controller.p` useful for tuning or
debugging controller behavior.


@@ -0,0 +1,8 @@
### vault.wal.write_controller.reject_fraction ((#vault-wal-write_controller-reject_fraction))
Metric type | Value | Description
----------- | ------- | -----------
gauge | number | The estimated fraction of write requests that must be rejected to maintain cluster stability.
The [write controller](/vault/docs/concepts/adaptive-overload-protection) reject
fraction is an estimate between 0 and 1.


@@ -308,7 +308,7 @@
{
"title": "Request Limiter",
"badge": {
"text": "ENTERPRISE",
"text": "ENTERPRISE | DEPRECATED",
"type": "outlined",
"color": "neutral"
},
@@ -321,10 +321,29 @@
"type": "outlined",
"color": "highlight"
}
}
]
},
{
"title": "Adaptive overload protection",
"badge": {
"text": "ENTERPRISE | BETA",
"type": "outlined",
"color": "neutral"
},
"routes": [
{
"title": "Overview",
"path": "concepts/adaptive-overload-protection",
"badge": {
"text": "BETA",
"type": "outlined",
"color": "highlight"
}
},
{
"title": "Vault server temporarily overloaded",
"path": "concepts/request-limiter/vault-server-temporarily-overloaded"
"path": "concepts/adaptive-overload-protection/vault-server-temporarily-overloaded"
}
]
}
@@ -544,6 +563,10 @@
"title": "<code>Request Limiter</code>",
"path": "configuration/request-limiter"
},
{
"title": "Adaptive overload protection",
"path": "configuration/adaptive-overload-protection"
},
{
"title": "<code>ui</code>",
"path": "configuration/ui"

Binary file not shown.
