2020 Commits

Author SHA1 Message Date
Casey Allen Shobe
0e4d7f01f2 Correct documentation for consul.host (#1438)
Close #1434
2020-04-01 15:50:50 +02:00
Danyal Prout
810c179592 Kazoo 2.7.0 Compatibility
Close https://github.com/zalando/patroni/issues/1448
2020-03-18 11:04:09 +01:00
Alexander Kukushkin
d2080a3116 Retry if the retry-after http header is set (#1431)
If the K8s API is overwhelmed with requests it might ask to retry.
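
A minimal sketch of honoring `Retry-After`, not the code from this commit. The `send` callable and its `(status, headers, body)` return shape are hypothetical stand-ins for the real HTTP client:

```python
import time


def request_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() and retry while the response carries a Retry-After header.

    `send` is a hypothetical callable returning (status, headers, body).
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        # Header names are case-insensitive, so normalize before the lookup.
        lowered = {k.lower(): v for k, v in headers.items()}
        retry_after = lowered.get('retry-after')
        if retry_after is None or attempt == max_attempts - 1:
            return status, headers, body
        try:
            delay = float(retry_after)
        except ValueError:
            # Retry-After may also be an HTTP date; this sketch does not parse it.
            return status, headers, body
        sleep(max(delay, 0))
```

The `sleep` parameter is injectable only so the behavior can be exercised without actually waiting.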
2020-03-13 16:36:39 +01:00
Alexander Kukushkin
d82301688f Fix pyinstaller compatibility (#1441)
Close https://github.com/zalando/patroni/issues/1440
2020-03-13 16:36:12 +01:00
Feike Steenbergen
d74a4b23a6 Scrub KUBERNETES_ environment from the postmaster (#1407)
The KUBERNETES_ environment variables are not required for PostgreSQL, yet having them exposed to the postmaster will also expose them to backends and to regular database users (using pl/perl for example).
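
A minimal sketch of the idea, assuming the environment is filtered before the child process is spawned. The helper name `scrub_environment` is hypothetical:

```python
import os


def scrub_environment(env, prefix='KUBERNETES_'):
    """Return a copy of `env` without the variables starting with `prefix`."""
    return {k: v for k, v in env.items() if not k.startswith(prefix)}


# The result could be passed as `env=` to subprocess.Popen when starting a child.
clean_env = scrub_environment(os.environ)
```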
2020-03-10 12:08:29 +01:00
Michail Nikolaev
795efc4548 Note about possible data loss while canceling postgres backends. (#1414)
Note about possible data loss while canceling postgres backends.

Related to zalando#1412
2020-03-10 12:08:01 +01:00
Alexander Kukushkin
b020874486 Small improvement in tests (#1423)
which actually revealed a small issue in the validator
2020-03-10 12:07:40 +01:00
Alexander Kukushkin
ab38ab2e97 Apply 1 second backoff if LIST failed (#1424)
This is mostly necessary to avoid flooding the logs, but it also helps prevent starvation of the main thread.
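
A minimal sketch of such a backoff, not the actual implementation. `do_list` is a hypothetical callable standing in for the LIST request, and `max_attempts` exists only to keep the example bounded:

```python
import time


def list_with_backoff(do_list, backoff=1.0, sleep=time.sleep, max_attempts=None):
    """Retry do_list() until it succeeds, pausing `backoff` seconds after each failure."""
    attempts = 0
    while True:
        try:
            return do_list()
        except Exception:
            attempts += 1
            if max_attempts is not None and attempts >= max_attempts:
                raise
            # Without this pause a persistent failure would spin in a tight loop.
            sleep(backoff)
```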
2020-03-10 12:07:26 +01:00
Alexander Kukushkin
613634c26b Reset rewind state if postgres started after successful pg_rewind (#1408)
Close https://github.com/zalando/patroni/issues/1406
2020-02-27 12:24:17 +01:00
Alexander Kukushkin
4a29caa9d3 On role change callback didn't fire on failed primary (#1420)
Bug was introduced in https://github.com/zalando/patroni/pull/703
Close https://github.com/zalando/patroni/issues/1418
2020-02-27 12:22:44 +01:00
Alexander Kukushkin
bcd75bbeeb Avoid opening replication connection on every cycle of HA loop (#1422)
Bug was introduced in https://github.com/zalando/patroni/pull/1332
2020-02-27 12:22:11 +01:00
Steven De Coeyer
0fa70e8d88 Updates README (#1394)
We need to make sure etcd v2 is enabled, cf. https://github.com/zalando/patroni/issues/1270 and https://github.com/zalando/patroni/issues/1163.
2020-02-20 10:15:58 +01:00
damien clochard
e759a3f2ef [doc] add PATRONICTL_CONFIG_FILE env var (#1397)
2020-02-20 10:14:36 +01:00
Julien Riou
7b0e012f62 Disable SSL verification for Consul when it is required (#1399)
The Consul client uses urllib3 with verify=True by default. When
SSL verification is disabled with verify=False, we can still see
CERTIFICATE_VERIFY_FAILED exceptions. With urllib3 1.19.1-1 on
Debian Stretch, the "cert_reqs" argument must be explicitly set
to ssl.CERT_NONE to effectively disable SSL verification.
2020-02-20 10:13:41 +01:00
Alexander Kukushkin
80ce61876e Don't create permanent physical slot with name of the primary (#1392)
It is a common problem that the primary recycles WALs when one of the replicas is down for a long time. So far there were only two solutions to this problem, and neither of them is perfect:
1. Increase `wal_keep_segments`, but it is hard to guess the good value.
2. Use continuous archiving and PITR, but it is not always possible.

This PR introduces a way to solve the problem for static clusters, with a fixed number of nodes and names that never change. You just need to list the names of all nodes in `slots` so the primary will not remove the slot when the node is down (not registered in DCS).
Of course, the primary will not create the permanent slot that matches its own name.

Usage example: let's assume you have a cluster with nodes named *abc1*, *abc2*, and *abc3*.
You have to run `patronictl edit-config` and put the following snippet into the configuration:
```yaml
slots:
  abc1:
    type: physical
  abc2:
    type: physical
  abc3:
    type: physical
```

If the node *abc2* is the primary, it will always create slots for *abc1* and *abc3* even if they are not running, but will not create slot *abc2*.
Other nodes will behave the same.
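
The rule can be sketched as follows. This is an illustration under simplifying assumptions, not Patroni's slot-management code: `slots_to_keep` is a hypothetical helper, and slot names are taken to be identical to member names:

```python
def slots_to_keep(my_name, permanent_slots, registered_members):
    """Return the physical slot names a primary named `my_name` should maintain."""
    # Slots for members currently registered in DCS.
    wanted = set(registered_members)
    # Permanent physical slots from the `slots` configuration survive member downtime.
    wanted.update(name for name, cfg in permanent_slots.items()
                  if cfg.get('type') == 'physical')
    # A node never keeps a slot matching its own name.
    wanted.discard(my_name)
    return sorted(wanted)
```

With the configuration above, *abc2* as the primary, and *abc1* down, the call `slots_to_keep('abc2', slots, ['abc2', 'abc3'])` yields `['abc1', 'abc3']`.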

Close #280
2020-02-20 10:07:43 +01:00
Igor Yanchenko
ffde403a0a Config validator implemented (#1314)
2020-02-20 09:40:44 +01:00
Paul Voss
7e17092809 Fix permissions for openshift block devices (#1361)
OpenShift enforces securityContext.fsGroups for block devices and sets group stickybits for volumeMounts.

This leads to patroni pods failing to start after the first restart:
> 2020-01-13 14:46:13.695 UTC [143] FATAL:  data directory "/home/postgres/pgdata/pgroot/data" has invalid permissions
> 2020-01-13 14:46:13.695 UTC [143] DETAIL:  Permissions should be u=rwx (0700) or u=rwx,g=rx (0750).

An initContainer which fixes the OpenShift tampering solves the issue. I stole the solution from the stable postgres helm chart:
https://github.com/helm/charts/pull/14540/files

Tested on OpenShift v3.11

Note: This error does not occur when using shared filesystems (like NFS)
2020-02-13 15:07:56 +01:00
Kostiantyn Nemchenko
bc948ce551 Show tags for member via patronictl (#1383)
Add one more field to patronictl output to check tags defined for a member.
Closes #1375
2020-02-13 15:07:05 +01:00
Michael Banck
f419d73465 Set postgresql.pgpass to ./pgpass (#1386)
This avoids test failures if $HOME is not available (fixes: #1385).
2020-02-13 15:06:36 +01:00
Alexander Kukushkin
4737af48bf Get rid of dependency on tzlocal module (#1390)
It was used only to add the local timezone to the datetime specified in the patronictl for scheduled switchover or restart.
The `dateutil.tz.tzlocal()` does the same job equally well.
2020-02-13 14:49:57 +01:00
Alexander Kukushkin
dc1966e3bc Release 1.6.4 (#1380)
* Bump version
* Update release notes
v1.6.4
2020-01-27 14:15:21 +01:00
Alexander Kukushkin
6aa3f809d4 Configure keepalive for connections to K8s API (#1366)
If nothing is received from the socket within TTL seconds, the connection should be considered dead.
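
A minimal sketch of TCP keepalive tuned to a TTL, using only the standard `socket` module. How the TTL is split between idle time and probes is an assumption made for this example, and `enable_keepalive` is a hypothetical helper:

```python
import socket


def enable_keepalive(sock, ttl, probes=3):
    """Ask the kernel to declare an idle connection dead within roughly `ttl` seconds."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Assumed split: wait half of the TTL before probing, spend the rest on probes.
    idle = max(1, ttl // 2)
    interval = max(1, (ttl - idle) // probes)
    # The fine-grained options are platform specific, so set only those that exist.
    for name, value in (('TCP_KEEPIDLE', idle),
                        ('TCP_KEEPINTVL', interval),
                        ('TCP_KEEPCNT', probes)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return idle, interval, probes
```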
2020-01-27 09:25:08 +01:00
Alexander Kukushkin
902411239f More compatibility with windows (#1367)
* unix-domain sockets are not yet supported
* signal.SIGQUIT doesn't exist
2020-01-24 12:52:55 +01:00
Alexander Kukushkin
0eb1e0568b Compatibility with python 3.7.6 (#1368)
The urlparse function has changed.

Old versions:
```python
>>> urlparse('localhost:8500')
ParseResult(scheme='', netloc='', path='localhost:8500', params='', query='', fragment='')
```

3.7.6:
```python
>>> urlparse('localhost:8500')
ParseResult(scheme='localhost', netloc='', path='8500', params='', query='', fragment='')
```
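
One way to stay independent of this difference is to make sure the address always carries a scheme before parsing. This is a sketch of that approach, not necessarily the fix applied in this commit. `parse_host_port` and the default scheme are assumptions:

```python
from urllib.parse import urlparse


def parse_host_port(address, default_scheme='http'):
    """Parse 'host:port' or a full URL into (scheme, hostname, port)."""
    if '://' not in address:
        # With an explicit scheme, urlparse treats 'host:port' as netloc on all versions.
        address = '{0}://{1}'.format(default_scheme, address)
    parsed = urlparse(address)
    return parsed.scheme, parsed.hostname, parsed.port
```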
2020-01-24 12:52:34 +01:00
Alexander Kukushkin
27ff9dfda3 Avoid logging passwords on user creation (#1370)
Fixes https://github.com/zalando/patroni/issues/1365
2020-01-24 10:52:23 +01:00
Igor Yanchenko
16fe180ed6 implemented stop signal using pg_ctl for non posix systems (#1342)
Use pg_ctl to send the stop signal on non-POSIX operating systems.
2020-01-16 14:35:47 +01:00
Alexander Kukushkin
1c4d395d5a Handle exception from Ha.shutdown (#1351)
During the shutdown Patroni is trying to update its status in the DCS.
If the DCS is inaccessible an exception might be raised. Lack of exception handling prevents logger thread from stopping.
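
The pattern can be sketched like this. It is an illustration, not the actual `Ha.shutdown` code. `update_dcs` and `stop_logger` are hypothetical callables:

```python
import logging

logger = logging.getLogger(__name__)


def shutdown(update_dcs, stop_logger):
    """Try to update the DCS, but always stop the logger thread afterwards."""
    try:
        update_dcs()
    except Exception:
        # An unreachable DCS must not prevent the rest of the shutdown sequence.
        logger.exception('Failed to update status in DCS during shutdown')
    finally:
        stop_logger()
```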

Fixes https://github.com/zalando/patroni/issues/1344
2020-01-16 14:34:58 +01:00
Alexander Kukushkin
c5fc6fd936 Special handling of parameters with period (#1349)
They might be defined by an extension, and therefore the unit is not necessarily a string.
It could also be that changing the value requires a restart (for example, pg_stat_statements.max).
2020-01-16 14:34:31 +01:00
Alexander Kukushkin
1461d7d4b8 Allow certain recovery parameters be defined in the custom_conf (#1335)
Fixes https://github.com/zalando/patroni/issues/1333
2020-01-15 12:41:07 +01:00
Igor Yanchenko
ea76a40845 Make sure postgresql.pgpass is a file or it does not exist (#1337)
Also make sure that it is located in the writable directory.
2020-01-15 12:40:41 +01:00
Alexander Kukushkin
102a12ea5a Catch all exceptions when calling socket.getaddrinfo (#1355)
It serves two purposes:
1. We don't want to accidentally break the thread
2. During shutdown, socket.gaierror becomes unresolvable and nasty exceptions are raised
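
A minimal sketch of the defensive wrapper, assuming an empty result is an acceptable fallback. `safe_getaddrinfo` and the injectable `resolver` argument are hypothetical:

```python
import socket


def safe_getaddrinfo(host, port, resolver=socket.getaddrinfo):
    """Resolve host and port, returning an empty list instead of raising."""
    try:
        return resolver(host, port, 0, socket.SOCK_STREAM, socket.IPPROTO_TCP)
    except Exception:
        # Deliberately broad: naming socket.gaierror here could itself fail
        # while the interpreter is shutting down.
        return []
```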

Close: https://github.com/zalando/patroni/issues/1353
2020-01-15 12:38:34 +01:00
Alexander Kukushkin
9e8ae95bce Finally fix a problem with case sensitivity of sync standby names (#1359)
Close https://github.com/zalando/patroni/issues/1358
2020-01-15 12:38:05 +01:00
Kostiantyn Nemchenko
a2a5cc2f71 Disable serfHealth Consul check (#1364)
Fixes #1362 and #1363.
2020-01-15 12:37:35 +01:00
Alexander Kukushkin
16d1ffdde7 Update timeline on standby cluster (#1332)
Fixes https://github.com/zalando/patroni/issues/1031
2019-12-20 12:56:00 +01:00
Alexander Kukushkin
a675fa18dc Handle case when replication password is set to null (#1330)
Fixes https://github.com/zalando/patroni/issues/1231
2019-12-20 12:06:18 +01:00
Igor Yanchenko
26b6e00575 wait option for patronictl reinit implemented (#1339)
Wait to finish `reinit` if `--wait` option is used.
Every 2 seconds it pulls the status from Patroni REST API and reports to console.
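
The polling loop can be sketched as below. `fetch_status` and `is_done` are hypothetical callables standing in for the REST API request and the completion check. The status values in the real API are not modeled here:

```python
import time


def wait_until_done(fetch_status, is_done, interval=2.0, sleep=time.sleep, report=print):
    """Poll fetch_status() every `interval` seconds until is_done(status) is true."""
    while True:
        status = fetch_status()
        report(status)
        if is_done(status):
            return status
        sleep(interval)
```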
2019-12-20 12:05:39 +01:00
Alexander Kukushkin
d941f6bc5e Use restore_command from standby_cluster config on cascading replicas (#1341)
The standby_leader has been doing this since the feature was introduced. Not doing the same on replicas might prevent them from catching up with the standby leader because WALs get recycled.

In addition to that apply the same strategy to archive_cleanup_command.
2019-12-20 12:03:25 +01:00
Igor Yanchenko
7ff27d9e10 Make sure unix_socket_directories and stats_temp_directory exist (#1293)
When Patroni and Postgres start, make sure that unix_socket_directories and stats_temp_directory exist, or try to create them. Patroni will exit if it fails to create them.
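
A minimal sketch of the check for a single directory. The helper name, the default mode, and the error message are assumptions:

```python
import os
import sys


def ensure_directory(path, mode=0o700):
    """Create `path` if it is missing; exit the process if that is not possible."""
    if os.path.isdir(path):
        return
    try:
        os.makedirs(path, mode)
    except OSError as e:
        sys.exit('Could not create directory {0}: {1}'.format(path, e))
```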

Close https://github.com/zalando/patroni/issues/863
2019-12-11 12:26:17 +01:00
Igor Yanchenko
2174d66f97 Rewritten shell scripts in python to make them compatible with windows (#1326)
2019-12-11 12:07:05 +01:00
Pavlo Golub
919e9c54d2 Make dest argument default value of backup() cross platform (#1324)
Fixes #1325
2019-12-11 11:25:41 +01:00
Alexander Kukushkin
08d6e5e50e BUGFIX: don't leak password when running pg_rewind (#1321)
In addition to that:
* enforce security settings from `postgresql.authentication`
* update release notes
* bump version
* close https://github.com/zalando/patroni/issues/1320
v1.6.3
2019-12-05 18:19:38 +01:00
Alexander Kukushkin
b542e4b5f0 Release 1.6.2 (#1319)
* update release notes
* bump version
v1.6.2
2019-12-05 11:36:17 +01:00
Alexander Kukushkin
0693fe7dd0 Housekeeping (#1315)
* Reduce memory usage by patroni init process
* More cleanup in setup.py
* Implement missing tests
2019-12-04 11:28:46 +01:00
Igor Yanchenko
49d3968c23 Make it possible to configure log level for exception tracebacks (#1311)
If you set `log.traceback_level=DEBUG`, the tracebacks will be visible only when `log.level=DEBUG`. The default behavior remains the same.
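
One way to get this behavior with the standard `logging` module is sketched below. `log_exception` is a hypothetical helper, not the function added by this commit:

```python
import logging


def log_exception(logger, message, traceback_level=logging.ERROR):
    """Log `message` at ERROR; attach the traceback only if `traceback_level` is enabled."""
    show_traceback = logger.isEnabledFor(traceback_level)
    logger.error(message, exc_info=show_traceback)
```

It is meant to be called from inside an `except` block, where the current exception is available to `exc_info`.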
2019-12-03 15:13:42 +01:00
Alexander Kukushkin
f1819443ef Avoid spawning semaphore tracker process (#1299)
We are not using semaphores, therefore we don't need to track them.
2019-12-02 12:16:18 +01:00
Igor Yanchenko
cf0c0f8e7c Make the error more helpful if restapi cannot bind (#1300)
Giving the user a hint if we couldn't start the restapi service.
2019-12-02 12:15:45 +01:00
Alexander Kukushkin
e1d569ad75 Inherit CaseInsensitiveDict from urllib3 HTTPHeaderDict (#1302)
It might look like a hack, but the API is stable enough and didn't change in the past 3+ years.
2019-12-02 12:14:59 +01:00
Igor Yanchenko
726ee46111 Implemented patroni --version (#1291)
That required a refactoring of the `Config` and `Patroni` classes. Now one has to explicitly create the instance of `Config` before creating `Patroni`.

The Config file can optionally call the validate function.
2019-12-02 12:14:19 +01:00
Alexander Kukushkin
cc0df4900b Set User-Agent for all http requests (#1312)
Example: `Patroni/1.6.1 Python/3.6.8 Linux`
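
A string of this shape could be assembled as follows. This is a sketch: using the `platform` module for the Python version and OS name is an assumption, and `user_agent` is a hypothetical helper:

```python
import platform


def user_agent(version):
    """Build a User-Agent value in the form 'Patroni/<version> Python/<version> <OS>'."""
    return 'Patroni/{0} Python/{1} {2}'.format(
        version, platform.python_version(), platform.system())
```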
2019-12-02 10:46:20 +01:00
Igor Yanchenko
638aa63023 Don't make user to choose from an empty list (#1305)
If a user provides a wrong cluster name, we will raise an exception rather than ask to choose a member from an empty list.
2019-12-02 10:38:35 +01:00