151 Commits

Author SHA1 Message Date
Alexander Kukushkin
0a1f389686 Release 2.0.0 (#1680)
* update release notes
* bump version
* change the default alignment in patronictl table output to `left`
* add missing tests
* add missing pieces to the documentation
2020-09-02 15:35:04 +02:00
Floris van Nee
98f50423ca Add support for configuration directories (#1669) (#1671)
It is now also possible to point the configuration path to a directory instead of a file.
Patroni will find all yml files in the directory and apply them in sorted order
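
A minimal sketch of the "apply in sorted order" idea, not Patroni's actual loader. It assumes later files override earlier ones through a recursive merge; `merge`, `load_config_dir`, and the file names are hypothetical, and `json.loads` stands in for a YAML parser such as `yaml.safe_load` so the example stays within the standard library.

```python
import json
import os
import tempfile


def merge(base, override):
    """Recursively merge `override` into `base`; later values win (assumption)."""
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            merge(base[key], value)
        else:
            base[key] = value
    return base


def load_config_dir(path, parse):
    """Apply every *.yml file found in `path` in sorted filename order."""
    config = {}
    for name in sorted(os.listdir(path)):
        if name.endswith('.yml'):
            with open(os.path.join(path, name)) as f:
                merge(config, parse(f.read()) or {})
    return config


# Hypothetical files; JSON content is used so that json.loads can parse it.
files = {
    '20-override.yml': '{"postgresql": {"listen": "0.0.0.0:5432"}}',
    '10-base.yml': '{"scope": "demo", "postgresql": {"listen": "127.0.0.1:5432"}}',
}
with tempfile.TemporaryDirectory() as directory:
    for name, text in files.items():
        with open(os.path.join(directory, name), 'w') as f:
            f.write(text)
    config = load_config_dir(directory, json.loads)
```

Numeric prefixes such as `10-` and `20-` make the resulting order obvious to the reader.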

Close https://github.com/zalando/patroni/issues/1669
2020-09-02 13:57:22 +02:00
Sergey Dudoladov
950eff27ad Optional fencing script (pre_promote) (#1099)
Call a fencing script after acquiring the leader lock. If the script doesn't finish successfully, don't promote but remove the leader key.
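
The control flow described above can be sketched as follows. This is an illustration only: `try_promote` and its callback parameters are hypothetical names, and a zero exit status is assumed to mean that the script finished successfully.

```python
import subprocess
import sys


def try_promote(pre_promote_cmd, promote, release_leader_key):
    """Run the fencing script; promote only when it exits with status 0."""
    if subprocess.call(pre_promote_cmd) == 0:
        promote()
        return True
    # The script failed: give up leadership instead of promoting.
    release_leader_key()
    return False


events = []
# A stand-in "fencing script" that always fails.
failing_script = [sys.executable, '-c', 'import sys; sys.exit(1)']
promoted = try_promote(
    failing_script,
    promote=lambda: events.append('promote'),
    release_leader_key=lambda: events.append('release leader key'),
)
```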

Close https://github.com/zalando/patroni/issues/1567
2020-09-01 07:50:39 +02:00
Kostiantyn Nemchenko
918a57fe0c Add no_params option for custom bootstrap method (#1664)
Close #1475
2020-08-28 08:23:00 +02:00
Kostiantyn Nemchenko
48aa0ba61b Add SSL support for ZooKeeper (#1662)
Close #1658
2020-08-28 08:22:15 +02:00
Yogesh Sharma
62463db5e2 Add support for user defined HTTP header to Patroni REST API response (#1645)
Close #1644
2020-08-26 17:37:02 +02:00
Victor Sudakov
35c3fd37a1 Security of Patroni (#1655)
Close #1636
2020-08-26 16:42:33 +02:00
Alexander Kukushkin
7bf60b64b0 Compatibility with PostgreSQL 13 (#1654)
So far Patroni was enforcing the same value of `wal_keep_segments` on all nodes in the cluster. If the parameter was missing from the global configuration it was using the default value `8`.
In pg13 beta3, `wal_keep_segments` was renamed to `wal_keep_size`, which broke Patroni.

If `wal_keep_segments` happens to be present in the configuration for pg13, Patroni will recalculate the value to `wal_keep_size` assuming that the `wal_segment_size` is 16MB. Sure, it is possible to get the real value of `wal_segment_size` from pg_control, but since we are dealing with a case of misconfiguration it is not worth spending time on it.
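
The recalculation amounts to a single multiplication. A small sketch, assuming the result is expressed in megabytes; the function name is hypothetical:

```python
def wal_keep_size_from_segments(wal_keep_segments, wal_segment_size_mb=16):
    """Convert a segment count into a size string, assuming 16MB segments."""
    return '{0}MB'.format(int(wal_keep_segments) * wal_segment_size_mb)


# The old default of 8 segments corresponds to 8 * 16MB.
default_size = wal_keep_size_from_segments(8)
```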
2020-08-17 10:45:02 +02:00
Alexander Kukushkin
23dcfaab49 Make it possible to bypass kubernetes service (#1614)
When running on K8s, Patroni communicates with the API via the `kubernetes` service, whose address is exposed via the `KUBERNETES_SERVICE_HOST` environment variable. Like any other service, the `kubernetes` service is handled by `kube-proxy`, which, depending on configuration, relies on either a userspace program or `iptables` for traffic routing.

During a K8s upgrade, when master nodes are replaced, it is possible that `kube-proxy` doesn't update the service configuration in time, and as a result Patroni fails to update the leader lock and demotes postgres.

In order to improve the user experience and get more control over the problem, we make it possible to bypass the `kubernetes` service and connect directly to API nodes.
The strategy is very simple:
1. Resolve the list of API node IPs from the kubernetes endpoint on every iteration of the HA loop.
2. Stick to one of these IPs for API requests.
3. Switch to a different IP if the currently connected IP is not in the list.
4. If the request fails, switch to another IP and retry.
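
The four steps can be sketched roughly like this. It is not the code from the PR: `ApiServerSelector`, `fake_request`, and the IP addresses are hypothetical, and connection failures are assumed to surface as `OSError`.

```python
import random


class ApiServerSelector:
    """Hypothetical helper tracking which API node IP is in use."""

    def __init__(self):
        self.ips = []
        self.current = None

    def refresh(self, resolved_ips):
        """Steps 1-3: keep the current IP only while it is still listed."""
        self.ips = list(resolved_ips)
        if self.current not in self.ips:
            self.current = random.choice(self.ips) if self.ips else None

    def request(self, do_request):
        """Step 4: on failure, switch to another IP and retry."""
        if self.current is None:
            raise RuntimeError('no API node addresses known')
        candidates = [self.current] + [ip for ip in self.ips if ip != self.current]
        last_error = None
        for ip in candidates:
            try:
                result = do_request(ip)
            except OSError as error:
                last_error = error
                continue
            self.current = ip  # stick to the IP that answered
            return result
        raise last_error


def fake_request(ip):
    # Simulates one unreachable API node.
    if ip == '10.0.0.1':
        raise ConnectionError('unreachable')
    return 'ok from ' + ip


selector = ApiServerSelector()
selector.refresh(['10.0.0.1'])
selector.refresh(['10.0.0.1', '10.0.0.2'])  # still listed, so it is kept
stuck_on = selector.current
answer = selector.request(fake_request)     # fails over to the other IP
after_failover = selector.current
selector.refresh(['10.0.0.3'])              # old IP vanished from the list
```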

Such a strategy is already used for Etcd and has proven to work quite well.

To enable the feature, set either `kubernetes.bypass_api_service` in the Patroni configuration file or the `PATRONI_KUBERNETES_BYPASS_API_SERVICE` environment variable to `true`.

If for some reason `GET /default/endpoints/kubernetes` isn't allowed, Patroni will disable the feature.
2020-08-14 12:39:47 +02:00
ksarabu1
1ab709c5f0 Multi Sync Standby Support (#1594)
The new parameter `synchronous_node_count` is used by Patroni to manage the number of synchronous standby databases. It is set to 1 by default and has no effect when `synchronous_mode` is set to off. When enabled, Patroni manages the precise number of synchronous standby databases based on `synchronous_node_count` and adjusts the state in DCS and `synchronous_standby_names` as members join and leave.

This functionality can be further extended to support Priority (FIRST n) based synchronous replication and Quorum (ANY n) based synchronous replication in the future.
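
One way to picture the selection is the sketch below. It is an illustration under assumptions, not the algorithm from the PR: the function name is hypothetical, and keeping already-synchronous members first (to avoid needless changes) is an assumed policy.

```python
def pick_sync_standbys(candidates, current_sync, synchronous_node_count=1):
    """Return at most `synchronous_node_count` standby names."""
    # Assumed policy: prefer members that are already synchronous.
    keep = [name for name in current_sync if name in candidates]
    rest = [name for name in candidates if name not in keep]
    return (keep + rest)[:synchronous_node_count]


# 'b' is already synchronous; one more standby is added to reach two.
chosen = pick_sync_standbys(['a', 'b', 'c'], current_sync=['b'],
                            synchronous_node_count=2)
```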
2020-08-14 11:51:07 +02:00
Victor Sudakov
d4c6987f78 First variant of notes on PostgreSQL major upgrades. (#1634)
[skip ci]
2020-07-31 15:43:02 +02:00
Alexander Kukushkin
3341c898ff Add Etcd v3 protocol support via api gRPC-gateway (#1162)
The only python-etcd3 client working directly via gRPC still supports only a single endpoint, which is not very nice for high-availability.

Since Patroni is already using a heavily hacked version of python-etcd with smart retries and auto-discovery out-of-the-box, I decided to enhance the existing code with limited support of v3 protocol via gRPC-gateway.

Unfortunately, watches via gRPC-gateway require us to open and keep a second connection to etcd.

Known limitations:
* The very minimal supported version is 3.0.4. On earlier versions transactions don't work due to bugs in grpc-gateway. Without transactions we can't do atomic operations, i.e. leader locks.
* Watches work only starting from 3.1.0
* Authentication works only starting from 3.3.0
* gRPC-gateway does not support authentication using TLS Common Name. This is because gRPC-proxy terminates TLS from its client so all the clients share a cert of the proxy: https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/authentication.md#using-tls-common-name
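
The version thresholds listed above can be expressed as a small lookup. A sketch only; `etcd3_features` is a hypothetical helper and the version string is assumed to be in plain `major.minor.patch` form.

```python
def etcd3_features(version):
    """Map an etcd version string to the features usable via gRPC-gateway."""
    parsed = tuple(int(part) for part in version.split('.')[:3])
    if parsed < (3, 0, 4):
        # Transactions are broken on earlier versions, so no leader locks.
        raise ValueError('etcd {0} is older than 3.0.4'.format(version))
    return {
        'transactions': True,
        'watches': parsed >= (3, 1, 0),
        'authentication': parsed >= (3, 3, 0),
    }


features = etcd3_features('3.2.9')
```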
2020-07-31 14:33:40 +02:00
Alexander Kukushkin
bfbc4860d5 PoC: Patroni on pure RAFT (#375)
* a new node can join the cluster dynamically and become a part of consensus
 * it is also possible to join only the Patroni cluster (without adding the node to the raft), just comment out or remove `raft.self_addr` for that
 * when the node joins the cluster it uses values from `raft.partner_addrs` only for initial discovery.
* It is possible to run Patroni and Postgres on two nodes plus one node with `patroni_raft_controller` (without Patroni and Postgres). In such a setup one can temporarily lose one node without affecting the primary.
2020-07-29 15:34:44 +02:00
Robert Edström
c42d507b82 Add consul service_tags configuration field (#1625)
This is useful for dynamic service discovery, for example by load balancers.
2020-07-28 12:07:24 +02:00
Victor Sudakov
20bc5ed684 Update README.rst (#1622)
[ci skip]
2020-07-28 08:36:02 +02:00
Robert Edström
5cc35ec855 Documented required Consul policy (#1626)
[ci skip]
Close #1615
2020-07-28 08:34:37 +02:00
ksarabu1
8a62999eaa replica & async rest API health check enhancement (#1599)
- ``GET /replica?lag=<max-lag>``: replica check endpoint.
- ``GET /asynchronous?lag=<max-lag>`` or ``GET /async?lag=<max-lag>``: asynchronous standby check endpoint.

Checks replication latency and returns status code **200** only when the latency is below a specified value. For performance reasons, the key leader_optime from DCS is used as the leader WAL position when computing latency on the replica. Please note that the value in leader_optime might be a couple of seconds old (based on loop_wait).
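
A sketch of the check, with WAL positions given in bytes. The function and parameter names are hypothetical, and both the 503 status for a lagging replica and the treatment of a lag exactly equal to the limit are assumptions.

```python
def replica_check_status(leader_optime, replica_wal_position, max_lag):
    """Return 200 when the replica is within `max_lag` bytes of the leader."""
    # leader_optime may be slightly stale, so never report a negative lag.
    lag = max(leader_optime - replica_wal_position, 0)
    return 200 if lag <= max_lag else 503  # 503 is an assumed failure code


healthy = replica_check_status(1000000, 999000, max_lag=16 * 1024)
lagging = replica_check_status(1000000, 100000, max_lag=16 * 1024)
```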

Co-authored-by: Alexander Kukushkin <cyberdemn@gmail.com>
2020-07-15 10:36:48 +02:00
Alexander Kukushkin
db8c634db3 Create readiness and liveness endpoints (#1590)
They could be useful to eliminate "unhealthy" pods from subsets addresses when a K8s service with label selectors is used.
Real-life example: the node where the primary was running has failed and is being shut down, and Patroni can't update (remove) the role label.
Therefore on OpenShift the leader service will have two pods assigned, one of which is a failed primary.
With the readiness probe defined, the failed primary pod will be excluded from the list.
2020-07-10 14:08:39 +02:00
Alexander Kukushkin
cbff544b9c Implement patronictl flush switchover (#1554)
It includes implementing the `DELETE /switchover` REST API endpoint.

Close https://github.com/zalando/patroni/issues/1376
2020-06-25 16:27:57 +02:00
Alexander Kukushkin
1b2491cedf Check basic-auth independently from client certificate (#1556)
This is an absolutely valid use case.
2020-06-05 09:25:33 +02:00
Tomáš Pospíšek
6406b39b77 add config section keys, improve verify_client documentation (#1549) 2020-06-03 09:55:21 +02:00
Alexander Kukushkin
fe23d1f2d0 Release 1.6.5 (#1503)
* bump version
* update release notes
* implement missing unit-tests and format code.
2020-04-23 16:02:01 +02:00
ksarabu1
5fa912f8fa Make max history timelines in DCS configurable (#1491)
Close https://github.com/zalando/patroni/issues/1487
2020-04-17 16:27:38 +02:00
ksarabu1
e3335bea1a Master stop timeout (#1445)
## Feature: Postgres stop timeout

A switchover/failover operation hangs on the signal_stop (or checkpoint) call when the postmaster doesn't respond or hangs for some reason (issue described in [1371](https://github.com/zalando/patroni/issues/1371)). This leads to service loss for an extended period of time until the hung postmaster starts responding or is killed by some other actor.

### master_stop_timeout

The number of seconds Patroni is allowed to wait when stopping Postgres. It is effective only when synchronous_mode is enabled. When set to > 0 and synchronous_mode is enabled, Patroni sends SIGKILL to the postmaster if the stop operation runs for longer than the value set by master_stop_timeout. Set the value according to your durability/availability tradeoff. If the parameter is not set or set <= 0, master_stop_timeout does not apply.
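
The rule can be sketched with a plain child process standing in for the postmaster. This is an illustration, not Patroni's stop logic: both helper names are hypothetical, `terminate()` stands in for the graceful stop request, and `kill()` sends SIGKILL on POSIX systems.

```python
import subprocess
import sys


def effective_stop_timeout(master_stop_timeout, synchronous_mode):
    """Return the timeout in seconds, or None when it does not apply."""
    if synchronous_mode and master_stop_timeout and master_stop_timeout > 0:
        return master_stop_timeout
    return None


def stop_process(proc, master_stop_timeout, synchronous_mode):
    """Request a stop, then kill the process if the timeout expires."""
    proc.terminate()  # stand-in for the graceful stop request
    try:
        # timeout=None means "wait for as long as it takes".
        proc.wait(timeout=effective_stop_timeout(master_stop_timeout,
                                                 synchronous_mode))
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()
    return proc.returncode


# A sleeping child process plays the role of the postmaster.
child = subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(60)'])
stop_process(child, master_stop_timeout=1, synchronous_mode=True)
```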
2020-04-15 12:18:49 +02:00
Casey Allen Shobe
0e4d7f01f2 Correct documentation for consul.host (#1438)
Close #1434
2020-04-01 15:50:50 +02:00
Michail Nikolaev
795efc4548 Note about possible data loss while canceling postgres backends. (#1414)
Note about possible data loss while canceling postgres backends.

Related to zalando#1412
2020-03-10 12:08:01 +01:00
Steven De Coeyer
0fa70e8d88 Updates README (#1394)
We need to make sure etcd v2 is enabled, cf. https://github.com/zalando/patroni/issues/1270 and https://github.com/zalando/patroni/issues/1163.
2020-02-20 10:15:58 +01:00
damien clochard
e759a3f2ef [doc] add PATRONICTL_CONFIG_FILE env var (#1397) 2020-02-20 10:14:36 +01:00
Alexander Kukushkin
80ce61876e Don't create permanent physical slot with name of the primary (#1392)
It is a regular issue that the primary recycles WALs when one of the replicas is down for a long time. So far there were only two solutions for this problem, and neither of them is perfect:
1. Increase `wal_keep_segments`, but it is hard to guess a good value.
2. Use continuous archiving and PITR, but it is not always possible.

This PR introduces a way to solve the problem for static clusters, with a fixed number of nodes and names that never change. You just need to list the names of all nodes in `slots` so the primary will not remove the slot when the node is down (not registered in DCS).
Of course, the primary will not create a permanent slot matching its own name.

Usage example: let's assume you have a cluster with nodes named *abc1*, *abc2*, and *abc3*.
You have to run `patronictl edit-config` and put the following snippet into the configuration:
```yaml
slots:
  abc1:
    type: physical
  abc2:
    type: physical
  abc3:
    type: physical
```

If the node *abc2* is the primary, it will always create slots for *abc1* and *abc3* even if they are not running, but will not create slot *abc2*.
Other nodes will behave the same.
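
The rule from the example can be written as a short filter. A sketch only; the function name is hypothetical and the dictionary mirrors the `slots` snippet above.

```python
def physical_slots_to_keep(my_name, permanent_slots):
    """Permanent physical slots a primary keeps, excluding its own name."""
    return sorted(
        name for name, value in permanent_slots.items()
        if value.get('type') == 'physical' and name != my_name
    )


slots = {
    'abc1': {'type': 'physical'},
    'abc2': {'type': 'physical'},
    'abc3': {'type': 'physical'},
}
kept_on_abc2 = physical_slots_to_keep('abc2', slots)
```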

Close #280
2020-02-20 10:07:43 +01:00
Alexander Kukushkin
dc1966e3bc Release 1.6.4 (#1380)
* Bump version
* Update release notes
2020-01-27 14:15:21 +01:00
Kostiantyn Nemchenko
a2a5cc2f71 Disable serfHealth Consul check (#1364)
Fixes #1362 and #1363.
2020-01-15 12:37:35 +01:00
Alexander Kukushkin
08d6e5e50e BUGFIX: don't leak password when running pg_rewind (#1321)
In addition to that:
* enforce security settings from `postgresql.authentication`
* update release notes
* bump version
* close https://github.com/zalando/patroni/issues/1320
2019-12-05 18:19:38 +01:00
Alexander Kukushkin
b542e4b5f0 Release 1.6.2 (#1319)
* update release notes
* bump version
2019-12-05 11:36:17 +01:00
Igor Yanchenko
49d3968c23 Make it possible to configure log level for exception tracebacks (#1311)
If you set `log.traceback_level=DEBUG`, the tracebacks will be visible only when `log.level=DEBUG`. The default behavior remains the same.
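
The effect can be reproduced with the standard `logging` module. A sketch, not Patroni's logger: `log_exception` is a hypothetical helper that always logs the message at ERROR and attaches the traceback only when the configured traceback level is enabled.

```python
import io
import logging


def log_exception(logger, message, traceback_level=logging.ERROR):
    """Log at ERROR; include the traceback only if its level is enabled."""
    logger.error(message, exc_info=logger.isEnabledFor(traceback_level))


stream = io.StringIO()
logger = logging.getLogger('traceback_level_demo')
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False

logger.setLevel(logging.INFO)  # DEBUG is disabled: message only
try:
    1 / 0
except ZeroDivisionError:
    log_exception(logger, 'first failure', logging.DEBUG)
info_output = stream.getvalue()

logger.setLevel(logging.DEBUG)  # DEBUG is enabled: traceback included
try:
    1 / 0
except ZeroDivisionError:
    log_exception(logger, 'second failure', logging.DEBUG)
debug_output = stream.getvalue()
```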
2019-12-03 15:13:42 +01:00
Alexander Kukushkin
35a2ccf8a8 A couple of small fixes in docs (#1285)
* fix formatting in release notes
* fix patronictl reinit command name
2019-11-21 10:39:28 +01:00
Alexander Kukushkin
2f9a48fae4 Release 1.6.1 (#1281)
* Bump version to 1.6.1
* Update release notes
2019-11-15 12:48:00 +01:00
Alexander Kukushkin
c1adbafbc5 Improve documentation (#1244)
* document tags
* move dynamic configuration out of `bootstrap.dcs`
* document REST API endpoints
2019-11-13 16:10:28 +01:00
Feike Steenbergen
d2d49907ad Correctly document PATRONI_KUBERNETES_PORTS (#1266)
The previous documentation was wrong and will throw the following error
when used:

        Exception when parsing list {[{"name": "postgresql", "port": 5432}]}

When removing the surrounding braces, the error goes away and the
endpoint is updated with the correct Port name.
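
The difference between the two forms is easy to reproduce. In this sketch `json.loads` is only a stand-in for whatever parser Patroni applies to the variable, and `parse_ports` is a hypothetical helper.

```python
import json


def parse_ports(value):
    """Parse a port list; reject anything that is not a list."""
    ports = json.loads(value)
    if not isinstance(ports, list):
        raise ValueError('expected a list of port definitions')
    return ports


good = parse_ports('[{"name": "postgresql", "port": 5432}]')

try:
    parse_ports('{[{"name": "postgresql", "port": 5432}]}')
    braces_accepted = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    braces_accepted = False
```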
2019-11-05 10:09:24 +01:00
Alexander Kukushkin
b666f5e4ed Refactor Patroni REST API communication (#1197)
* make it possible to use client certificates with REST API
* define a separate PatroniRequest class which handles all communication
* refactor patronictl to use the new class
* make Ha use the new class instead of calling requests.get. The old call wasn't taking into account certificates and basic-auth

Close #898
2019-10-11 10:16:33 +02:00
wilfriedroset
ee678f61d7 Fix typos in documentation (#1202) 2019-10-07 10:34:43 +02:00
Jecho
a8c32a4032 Fix minor typo in documentation #1212
Close #1211
2019-10-07 10:14:15 +02:00
geokala
178e565fe4 Update cacert documentation for use with REST API (#1190)
Fixes #1188
2019-09-24 13:04:07 +02:00
Jonathan S. Katz
a88704e792 Allow for certificate-based authentication from Patroni PostgreSQL accounts (#1134)
The two principal features this introduces:

1. Allow the Patroni PostgreSQL management accounts (superuser, replication, rewind) to authenticate using certificate-based authentication
2. Allow the user to specify the `sslmode` they wish to connect with.
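
As an illustration, account settings could be translated into libpq connection parameters roughly as follows. The helper, the configuration key names, and the file paths are assumptions; `user`, `password`, `sslmode`, `sslcert`, `sslkey`, and `sslrootcert` are standard libpq parameters.

```python
def libpq_kwargs(authentication):
    """Translate account settings into libpq connection parameters."""
    # Assumed config key -> libpq parameter mapping.
    mapping = {
        'username': 'user',
        'password': 'password',
        'sslmode': 'sslmode',
        'sslcert': 'sslcert',
        'sslkey': 'sslkey',
        'sslrootcert': 'sslrootcert',
    }
    return {mapping[key]: value
            for key, value in authentication.items() if key in mapping}


# Hypothetical account definition using a client certificate.
replication = {
    'username': 'replicator',
    'sslmode': 'verify-full',
    'sslcert': '/etc/ssl/replicator.crt',
    'sslkey': '/etc/ssl/replicator.key',
}
kwargs = libpq_kwargs(replication)
```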

### References
- [PostgreSQL Certificate Based Authentication](https://www.postgresql.org/docs/current/auth-cert.html)
- [libpq connection parameters](https://www.postgresql.org/docs/current/libpq-connect.html) which are used by psycopg2
- [SSL Modes](https://www.postgresql.org/docs/current/libpq-ssl.html)
2019-09-17 12:14:49 +02:00
Alexander Kukushkin
278bf9852b Release 1.6.0 (#1131)
* Implement missing tests and do a few minor fixes
* Bump version to 1.6.0
* Update release notes
2019-08-05 15:08:04 +02:00
Don Seiler
5cb7d1bdc1 Grammar fixes for SETTINGS.rst (#1106) 2019-07-26 09:34:42 +02:00
Jan Tomsa
7d1a5cad03 Allow to specify consul consistency mode (#1094)
Allow users to specify consul consistency mode.
This option will be passed to the Consul client as kwargs https://github.com/zalando/patroni/blob/master/patroni/dcs/consul.py#L213.
The library will then enforce the selected consistency level https://python-consul.readthedocs.io/en/latest/#consul

More about consistency mode here https://www.consul.io/api/features/consistency.html
2019-07-01 11:02:26 +02:00
Alexander Kukushkin
37f03790cc Implement two-step logging (#1080)
A few times we observed that the Patroni HA loop was blocked for a few minutes due to not being able to write logs to stderr. This is a very rare condition which we have so far hit only on k8s. This commit makes Patroni resilient to this kind of problem. All log messages are first written into an in-memory queue and later asynchronously flushed to stderr or a file from a separate thread.

The maximum queue size is configurable and the default value is 1000. This should be enough to keep more than one hour of log messages with default settings and when the Patroni cluster operates normally (without big issues).

If we hit the maximum size of the queue, further logs will be discarded until the queue size is reduced. The number of discarded messages will be reported in the log later.

In addition to that, the number of non-flushed and discarded messages (if there are any), will be reported via Patroni REST API as:
```json
"logger_queue_size": X,
"logger_records_lost": Y
```
```
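
The two-step idea can be sketched with the standard library. This is not Patroni's logger: `QueueLogHandler` is a hypothetical class with a bounded queue, a counter for discarded records, and a worker loop that flushes to a target handler. (Python also ships `logging.handlers.QueueHandler` and `QueueListener` for the same pattern.)

```python
import io
import logging
import queue
import threading


class QueueLogHandler(logging.Handler):
    """Step 1: enqueue records; step 2: a worker flushes them to `target`."""

    def __init__(self, target, max_queue_size=1000):
        super().__init__()
        self.target = target
        self.queue = queue.Queue(maxsize=max_queue_size)
        self.records_lost = 0

    def emit(self, record):
        try:
            self.queue.put_nowait(record)  # never block the caller
        except queue.Full:
            self.records_lost += 1         # discard and count

    def flush_forever(self):
        while True:
            record = self.queue.get()
            if record is None:             # sentinel: stop the worker
                break
            self.target.handle(record)


stream = io.StringIO()
handler = QueueLogHandler(logging.StreamHandler(stream), max_queue_size=2)
logger = logging.getLogger('two_step_demo')
logger.propagate = False
logger.setLevel(logging.INFO)
logger.addHandler(handler)

for i in range(3):                         # the third record does not fit
    logger.info('message %d', i)
lost, queued = handler.records_lost, handler.queue.qsize()

worker = threading.Thread(target=handler.flush_forever)
worker.start()
handler.queue.put(None)
worker.join()
```

Here `queued` and `lost` play the roles of `logger_queue_size` and `logger_records_lost`.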
2019-06-13 14:18:49 +02:00
Kostiantyn Nemchenko
dcd605ebc8 Update existing_data.rst (#1071) 2019-06-11 15:15:48 +02:00
Alexander Kukushkin
bba9066315 Make it possible to run pg_rewind without superuser on pg11+ (#1035)
* expose the current patroni version in DCS
* expose `checkpoint_after_promote` flag in DCS as an indicator that pg_rewind could be safely executed
* other nodes will wait until this flag is set instead of connecting as superuser and issuing the CHECKPOINT
* define `postgresql.authentication.rewind` with credentials for pg_rewind in patroni configuration files.
* create user for pg_rewind if postgres is 11+
* grant execute on functions required for pg_rewind to rewind user
2019-05-02 14:07:26 +02:00
Alexander Kukushkin
f0b784fe7f Manage pg_ident.conf with Patroni (#1037)
This functionality works similarly to the `pg_hba`:
If the `postgresql.pg_ident` is defined in the config file or DCS, Patroni will write its value to pg_ident.conf, however, if `postgresql.parameters.ident_file` is defined, Patroni will assume that pg_ident is managed from outside and not update the file.
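
The decision described above can be sketched as a small function. The name is hypothetical, the dictionary stands for the `postgresql` section of the configuration, and the mapping line is a made-up example.

```python
def pg_ident_lines_to_write(postgresql_config):
    """Return the lines to write to pg_ident.conf, or None to leave it alone."""
    if 'ident_file' in postgresql_config.get('parameters', {}):
        return None  # the file is managed from outside
    return postgresql_config.get('pg_ident')


managed = pg_ident_lines_to_write({'pg_ident': ['mymap os_user db_user']})
external = pg_ident_lines_to_write({
    'pg_ident': ['mymap os_user db_user'],
    'parameters': {'ident_file': '/etc/postgresql/pg_ident.conf'},
})
```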
2019-04-23 16:16:53 +02:00