Commit Graph

108 Commits

Author SHA1 Message Date
wilfriedroset
2384d9e735 Add API route /health (#1079)
close #119
2019-06-11 15:22:52 +02:00
Alexander Kukushkin
f1f2389146 A couple of small improvements in acceptance tests (#1057)
* Keep basebackup and wal_archive next to PGDATA in the data directory
* Test bootstrap of standby cluster nodes with custom scripts
2019-05-13 16:33:19 +02:00
Alexander Kukushkin
e38fe78b56 Fix callbacks behavior (mostly for standby cluster) (#998)
First of all, this patch changes the behavior of the `on_start`/`on_restart` callbacks: they will be called only when postgres is started or restarted without a role change. If the member is promoted or demoted, only the `on_role_change` callback will be executed.
Before that, `on_role_change` was never called for the standby leader, only `on_start`/`on_restart`, and with a wrong role argument.

In addition to that, the REST API will return standby_leader role for the leader of the standby cluster.
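
The callback rule described above can be sketched as a small decision function. This is an illustration only, not Patroni's actual code: the function name `callbacks_to_run` and its arguments are hypothetical, and it assumes the role before and after the action is known.

```python
def callbacks_to_run(action, old_role, new_role):
    """Return the callback names to execute (hypothetical helper).

    action is 'start', 'restart' or None when postgres was not (re)started.
    """
    if old_role != new_role:
        # Promotion or demotion: only on_role_change fires,
        # even if postgres was also restarted along the way.
        return ['on_role_change']
    if action in ('start', 'restart'):
        # Plain start/restart without a role change.
        return ['on_' + action]
    return []
```

For example, a replica that becomes a standby leader would trigger only `on_role_change`, while a restart that keeps the role triggers only `on_restart`.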

Closes https://github.com/zalando/patroni/issues/988
2019-03-29 10:28:07 +01:00
Michael Banck
073074f83e Run coverage as python -m coverage (#968)
Depending on the platform the coverage binary might not always be available under the standard name.
2019-02-13 16:02:12 +01:00
Michael Banck
345e6d3131 Copy away output directories of failed acceptance tests. (#967)
And dump logs on travis from only failed features
2019-02-13 16:00:15 +01:00
Michael Banck
d01a9bdcd5 Change base port for acceptance tests from 5440 to 5360 (#966) 2019-02-13 15:59:13 +01:00
Alexander Kukushkin
381a5b80d2 Release 1.5.4 (#931)
* Bump version
* Update release notes
* Make it possible to configure registration of Service in Consul via env variables
2019-01-15 12:14:19 +01:00
Alexander Kukushkin
1a0876e5ca Refactor acceptance tests to improve stability (#884)
Hope it will crash less often when executed on travis against k8s
2018-11-30 12:40:56 +01:00
Alexander Kukushkin
f8f928420d Release 1.5.2 (#875)
* Update release notes
* Bump version
2018-11-26 10:31:14 +01:00
Alexander Kukushkin
fb01aaebc5 Compatibility with kazoo-2.6.0 (#872)
Recently kazoo 2.6.0 was released, which changes how the create_connection method is called. Before, it was called with two arguments; in the new version all argument names are specified explicitly.
2018-11-19 14:26:20 +01:00
Alexander Kukushkin
2efd97baab Permanent replication slots (#819)
Permanent replication slots are preserved on failover/switchover, that is, Patroni on the new primary will create the configured replication slots right after promotion.

Slots can be configured with the help of `patronictl edit-config`.
The initial configuration can also be done in the `bootstrap.dcs` section.

```yaml
slots:
  permanent_physical_1:
    type: physical
  permanent_logical_1:
    type: logical
    database: foo
    plugin: pgoutput
```

It is the responsibility of the operator to make sure that there are no clashes in names between replication slots automatically created by Patroni for members and permanent replication slots.
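
An operator could check for such name clashes before applying the configuration. The sketch below is not part of Patroni. It assumes that the slot created for a member is derived from the member name by lower-casing it and replacing every character outside `[a-z0-9_]` with `_`; both helper names are hypothetical.

```python
import re


def slot_name_for_member(member_name):
    # Assumption: slot names are derived from member names like this.
    return re.sub(r'[^a-z0-9_]', '_', member_name.lower())


def find_slot_clashes(member_names, permanent_slots):
    """Return permanent slot names that collide with member slots."""
    member_slots = {slot_name_for_member(name) for name in member_names}
    return sorted(member_slots & set(permanent_slots))


slots = {
    'permanent_physical_1': {'type': 'physical'},
    'node_1': {'type': 'physical'},  # clashes with member "node-1"
}
print(find_slot_clashes(['node-1', 'node2'], slots))
```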

Closes https://github.com/zalando/patroni/issues/656
2018-10-31 11:37:42 +01:00
Dmitry Dolgov
dd7c3c349f [WIP] Standby cluster implementation (#679)
Implementation of the "standby cluster" described in #657. A standby cluster consists
of a "standby leader", which replicates from a "remote master" (which is not
part of the current patroni cluster and can be anywhere), and cascade replicas,
which replicate from the corresponding standby leader. The "standby leader" behaves
pretty much like a regular leader, which means that it holds a leader lock in
the DCS; if it disappears, there will be an election of a new "standby
leader".
One can define such a cluster using the "standby_cluster" section in the patroni
config file. This section provides parameters for the standby cluster that will be
applied only once during bootstrap and can be changed only through the DCS.
2018-09-07 10:10:56 +02:00
Alexander Kukushkin
4ca8a6e506 Make retries of calls to DCS consistent across implementations (#805)
in addition to that do a small refactoring of zookeeper and consul and try to improve the stability of AT
2018-09-06 08:37:26 +02:00
Alexander Kukushkin
87e9aab04c Improve tests (#778)
* Implement missing unit-tests
* Add acceptance tests for ISSUE #776
* Update list of classifiers, keywords and authors
2018-08-29 11:29:37 +02:00
Alexander Kukushkin
a513a7bb68 Improve stability of acceptance tests (#780)
last time tests were failing due to postgres/patroni slowness in picking sync standby
2018-08-29 11:13:18 +02:00
Alexander Kukushkin
e939304001 Take and apply some parameters from controldata when starting as replica (#703)
* Take and apply some parameters from controldata when starting as replica

https://www.postgresql.org/docs/10/static/hot-standby.html#HOT-STANDBY-ADMIN
There is a set of parameters whose values on the replica must not be smaller than on the primary, otherwise the replica will refuse to start:
* max_connections
* max_prepared_transactions
* max_locks_per_transaction
* max_worker_processes

It might happen that the values of these parameters in the global configuration are not set high enough, which makes it impossible to start a replica without human intervention. Usually this happens when we bootstrap a new cluster from a basebackup.

As a solution to this problem we will take the values of the above parameters from the pg_controldata output and, if the values in the global configuration are not high enough, apply the values taken from pg_controldata and set the `pending_restart` flag.
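
The merge step can be sketched as follows. This is a simplified illustration, not the code from the patch: the function name is hypothetical, both inputs are assumed to be already parsed into dictionaries, and a parameter missing from the global configuration is treated as "too low".

```python
GUARDED_PARAMETERS = (
    'max_connections',
    'max_prepared_transactions',
    'max_locks_per_transaction',
    'max_worker_processes',
)


def effective_parameters(configured, controldata):
    """Return (parameters to start with, pending_restart flag)."""
    result = dict(configured)
    pending_restart = False
    for name in GUARDED_PARAMETERS:
        required = controldata.get(name)
        if required is None:
            continue  # pg_controldata did not report this parameter
        # Simplification: an unset parameter counts as 0.
        if int(result.get(name, 0)) < int(required):
            result[name] = int(required)
            pending_restart = True
    return result, pending_restart
```

With `max_connections: 100` in the global configuration and 200 reported by pg_controldata, the replica would start with 200 and be flagged as pending restart.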
2018-06-12 14:04:32 +02:00
Alexander Kukushkin
5668367181 Implement '/sync' and /async endpoints (#578)
They will respond with http status code 200 only when the node is running as a synchronous or asynchronous replica.
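
A minimal sketch of the rule, under stated assumptions: the commit message only says when 200 is returned, so the 503 used for every other case is an assumption, and the function and argument names are hypothetical.

```python
def replica_endpoint_status(path, role, is_sync_standby):
    """Return the HTTP status code for GET /sync or GET /async."""
    if path not in ('/sync', '/async'):
        return 404
    if role != 'replica':
        return 503  # assumed non-200 code; not stated in the commit
    wants_sync = path == '/sync'
    return 200 if wants_sync == is_sync_standby else 503
```

A load balancer health check could point at `/sync` to route traffic only to the synchronous replica.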

Fixes https://github.com/zalando/patroni/issues/189
Fixes https://github.com/zalando/patroni/issues/415
2018-01-05 15:28:40 +01:00
Alexander Kukushkin
18786464a1 Rename failover to switchover and make new failover work without leader (#588)
In addition to that implement /switchover endpoint as an alias to /failover endpoint and implement more checks like:
* candidate must be provided for a failover
* switchover can't be scheduled in a pause state
* and so on
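
The two checks named above could look roughly like this. It is a sketch with hypothetical names and error strings, covering only the listed checks and none of the "and so on" part.

```python
def validate_request(action, candidate=None, scheduled_at=None, paused=False):
    """Return an error message, or None when the request is acceptable."""
    if action == 'failover' and not candidate:
        return 'failover requires a candidate'
    if action == 'switchover' and scheduled_at is not None and paused:
        return "switchover can't be scheduled in a paused state"
    return None
```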

Fixes https://github.com/zalando/patroni/issues/585
Fixes https://github.com/zalando/patroni/issues/520
2018-01-05 15:17:56 +01:00
Alexander Kukushkin
4328c15010 Make Patroni Kubernetes native (#500)
* Use ConfigMaps or Endpoints for leader elections and to keep cluster state
* Label pods with a postgres role
* Change the behavior of pip install. From now on it will not install all dependencies; you have to explicitly specify the DCS you want to use Patroni with: `pip install patroni[etcd,zookeeper,kubernetes]`
2017-12-08 16:55:00 +01:00
Alexander Kukushkin
8e3511ca6b Different minor fixes (#551)
* Use unix line endings
* Make flake8 happy
2017-11-02 16:24:17 +01:00
Ants Aasma
32b0768631 Fix watchdog on Python 3 (#531)
A misunderstanding of the ioctl() call interface. If mutable=False then fcntl.ioctl() actually returns the arg buffer back.
This accidentally worked on Python2 because int and str comparison did not return an error.
Error reporting is actually done by raising IOError on Python2 and OSError on Python3.
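
The `fcntl.ioctl()` behavior can be reproduced without a watchdog device. The sketch below uses `FIONREAD` on a pipe as a stand-in request (an assumption made so the example runs on an ordinary Unix system) and shows the two return conventions plus the exception-based error reporting on Python 3.

```python
import fcntl
import os
import struct
import termios


def ioctl_result_types():
    """Show what fcntl.ioctl() returns for immutable and mutable buffers."""
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, b'hello')
        # Immutable bytes argument: the filled buffer comes back as bytes.
        raw = fcntl.ioctl(read_fd, termios.FIONREAD, struct.pack('i', 0))
        # Mutable buffer with mutate_flag=True: the buffer is modified
        # in place and the integer result of the ioctl is returned.
        buf = bytearray(struct.pack('i', 0))
        status = fcntl.ioctl(read_fd, termios.FIONREAD, buf, True)
        return raw, status, struct.unpack('i', buf)[0]
    finally:
        os.close(read_fd)
        os.close(write_fd)


def ioctl_error_is_exception():
    """Errors are raised (OSError on Python 3), never returned."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)
    os.close(write_fd)
    try:
        fcntl.ioctl(read_fd, termios.FIONREAD, struct.pack('i', 0))
    except OSError:
        return True
    return False
```

Comparing the `bytes` result of the first call with an integer error code is exactly the kind of check that passes silently on Python 2 and raises on Python 3.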

* Properly handle errors in set_timeout(), have them result in only a warning if watchdog support is not required.

* Improve watchdog device driver name display on Python3

* Eliminate race condition in watchdog feature tests.
  The pinged/closed states were not getting reset properly if the checks ran too quickly.
  Add explicit reset points in feature test so the check is unambiguous.
2017-09-29 10:27:10 +02:00
Alexander Kukushkin
77aea03df9 Different bugfixes around pause state, mostly related to watchdog (#507)
* Do not send keepalives if watchdog is not active
* Avoid activating watchdog in a pause mode
* Set correct postgres state in pause mode
* Don't try to run queries from API if postgres is stopped
2017-08-24 07:53:32 +02:00
Alexander Kukushkin
25aa49b240 Run one manual failover test via rest API instead of patronictl
and bump Patroni version
2017-07-31 11:18:01 +02:00
Alexander Kukushkin
322aa45e09 BUGFIX: patronictl edit-config didn't work with zookeeper (#492)
When updating the config key we should use `ClusterConfig.index` instead of
`ClusterConfig.modify_index`. The second one should be used by Patroni
internally to check that the key was really changed, because when a key is
deleted and recreated its version always starts from the same value: 0

In addition to that use patronictl instead of http PATCH in some of
acceptance tests to change cluster config.

Fixes https://github.com/zalando/patroni/issues/491
2017-07-31 11:07:00 +02:00
Ants Aasma
70d718a058 Simplify watchdog code (#452)
* Only activate watchdog while master and not paused

We don't really need the protections while we are not master. This way
we only need to tickle the watchdog when we are updating leader key or
while demotion is happening.

As implemented we might fail to notice to shut down the watchdog if
someone demotes postgres and removes leader key behind Patroni's back.
There are probably other similar cases. Basically if the administrator
is being actively stupid they might get unexpected restarts. That seems
fine.

* Add configuration change support. Change MODE_REQUIRED to disable leader eligibility instead of closing Patroni.

Changes watchdog timeout during the next keepalive when ttl is changed. Watchdog driver and requirement can also be switched online.

When watchdog mode is `required` and watchdog setup does not work then the effect is similar to nofailover. Add watchdog_failed to status API to signify this. This is True only when watchdog does not work **AND** it is required.

* Reset implementation when config changed while active.

* Add watchdog safety margin configuration

Defaults to 5 seconds. Basically this is the maximum amount of time
that can pass between the calls to `dcs.update_leader()` and
`watchdog.keepalive()`, which are called right after each other. Should
be safe for pretty much any sane scenario and allows the default
settings to not trigger watchdog when DCS is not responding.

* Cancel bootstrap if watchdog activation fails

The system would have demoted itself anyway the next HA loop. Doing it
in bootstrap gives at least some other node chance to try bootstrapping
in the hope that it is configured correctly.

If all nodes are unable to activate they will continue to try until the
disk is filled with moved datadirs. Perhaps not ideal behavior, but as
the situation is unlikely to resolve itself without administrator
intervention it doesn't seem too bad.
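
One way to read the safety margin described above is that the watchdog timeout is derived from the leader key ttl minus the margin. The sketch below encodes that reading. The formula and the function name are assumptions drawn from the description, not taken from the patch itself.

```python
def watchdog_timeout(ttl, safety_margin=5):
    """Seconds after the last keepalive before the watchdog resets the
    system (assumed relation: ttl minus the safety margin)."""
    timeout = ttl - safety_margin
    if timeout <= 0:
        raise ValueError('safety margin must be smaller than ttl')
    return timeout
```

Under this reading, a ttl of 30 with the default margin gives a 25 second watchdog timeout, so the system is reset before the leader key can expire.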
2017-07-27 12:16:11 +02:00
Alexander Kukushkin
d5b3d94377 Custom bootstrap (#454)
The task of restoring a cluster from backup or cloning an existing cluster into a new one has been floating around for some time. It was kind of possible to achieve by doing a lot of manual actions, but that was very error prone. So I came up with the idea of making the way we bootstrap a new cluster configurable.

In short - we want to run a custom script instead of running initdb.
2017-07-18 15:12:58 +02:00
Alexander Kukushkin
acc6d7c2c2 Watchdog unit-tests, bugfixes and questions (#449)
Implement missing unit-tests and drop unused code
2017-07-11 10:00:30 +02:00
Alexander Kukushkin
681b6b507b Support unix sockets when connecting to a local postgres cluster (#457)
For backward compatibility this feature is not enabled by default. To enable it you have to set `postgresql.use_unix_socket: true`.
If the feature is enabled and `unix_socket_directories` is defined and non-empty, Patroni will use the first suitable value from it to connect to the local postgres cluster.
If `unix_socket_directories` is not defined, Patroni will assume that the default value should be used; it will not pass `host` in the command line arguments and will omit it from the connection url.
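
The selection rule can be sketched like this. It is an illustration with hypothetical names. "Suitable" is interpreted here as "an absolute path", and falling back to the TCP host when the setting is defined but empty is also an assumption.

```python
def local_connection_host(use_unix_socket, unix_socket_directories, tcp_host):
    """Return the host to connect to, or None to omit `host` entirely."""
    if not use_unix_socket:
        return tcp_host
    if unix_socket_directories is None:
        # Not defined: rely on the default socket directory.
        return None
    for directory in unix_socket_directories.split(','):
        directory = directory.strip()
        if directory.startswith('/'):  # assumed meaning of "suitable"
            return directory
    return tcp_host  # defined but nothing usable: assumed fallback
```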

Solves: https://github.com/zalando/patroni/issues/61

In addition to the above, this commit fixes a couple of bugs:
* manual failover with pg_rewind in a pause state was broken
* psycopg2 (or libpq, I am not really sure what exactly) doesn't mark the cursor's connection as closed when we use a unix socket and an `OperationalError` occurs. We will close such a connection on our own.
2017-06-22 11:47:57 +02:00
Ants Aasma
a70b46ef13 Add watchdog support on Linux (#343)
Ensures that system gets rebooted before TTL runs out.

Initial version. Open questions:

    Do we want to disable watchdog while we are not master?
2017-06-01 16:53:46 +02:00
Alexander Kukushkin
37c1552c0a Smart pg_rewind (#417)
Previously we were running pg_rewind only in a limited number of cases:
 * when we knew postgres was a master (no recovery.conf in data dir)
 * when we were doing a manual switchover to a specific node (no
   guarantee that this node is the most up-to-date)
 * when a given node has nofailover tag (it could be ahead of new master)

This approach was kind of working in most of the cases, but sometimes we
were executing pg_rewind when it was not necessary and in some other
cases we were not executing it although it was needed.

The main idea of this PR is to first figure out whether we really need
to run pg_rewind by analyzing the timeline ID, LSN and history file on the master
and replica, and to run it only if it's needed.
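
A rough sketch of such a check follows. It is not the code from the PR: the function name is hypothetical, LSNs are plain integers, and the history is assumed to be parsed into `(timeline, switch_lsn)` pairs, one per line of the master's timeline history file.

```python
def rewind_needed(local_timeline, local_lsn, master_timeline, master_history):
    """Decide whether the local node has diverged from the master."""
    if local_timeline > master_timeline:
        return True  # local node is on a timeline the master never saw
    if local_timeline == master_timeline:
        return False  # same timeline: assumed to be simply behind
    for timeline, switch_lsn in master_history:
        if timeline == local_timeline:
            # Diverged only if we wrote WAL past the switch point.
            return local_lsn > switch_lsn
    return True  # our timeline is not an ancestor of the master's
```

For instance, a former master on timeline 1 at LSN `0x3000060` must be rewound if the new master switched to timeline 2 at `0x3000000`.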
2017-05-19 16:32:06 +02:00
Alexander Kukushkin
39f5f7982c Scheduled failovers in 1 second don't work reliably with loop_wait=2 2017-01-13 11:25:07 +01:00
Alexander Kukushkin
1f829a4b34 Switch to trusty and run acceptance tests with postgres 9.6 2017-01-13 09:32:38 +01:00
Alexander Kukushkin
d138a8db17 AT for master_start_timeout + minor fixes (#361) 2016-12-09 12:02:41 +01:00
Alexander Kukushkin
37b020e7a3 Various bugfixes and improvements: (#346)
* Replace pytz.UTC with dateutil.tz.tzutc, it helps to reduce memory by more than 4Mb...

* fix check of python version: 0x0300000 => 0x3000000

* Update leader key before restart and demote
2016-11-04 18:42:56 +02:00
Ants Aasma
7e53a604d4 Add synchronous replication support. (#314)
Adds a new configuration variable synchronous_mode. When enabled Patroni will manage synchronous_standby_names to enable synchronous replication whenever there are healthy standbys available. With synchronous mode enabled Patroni will automatically fail over only to a standby that was synchronously replicating at the time of master failure. This effectively means zero lost user visible transactions.

To enforce the synchronous failover guarantee Patroni stores current synchronous replication state in the DCS, using strict ordering, first enable synchronous replication, then publish the information. Standby can use this to verify that it was indeed a synchronous standby before master failed and is allowed to fail over.

We can't enable multiple standbys as synchronous, allowing PostgreSQL to pick one, because we can't know which one was actually set to be synchronous on the master when it failed. This means that on standby failure commits will be blocked on the master until the next run_cycle iteration. TODO: figure out a way to poke Patroni to run sooner or allow for PostgreSQL to pick one without the possibility of lost transactions.

On graceful shutdown standbys will disable themselves by setting a nosync tag for themselves and waiting for the master to notice and pick another standby. This adds a new mechanism for Ha to publish dynamic tags to the DCS.

When the synchronous standby goes away or disconnects a new one is picked and Patroni switches master over to the new one. If no synchronous standby exists Patroni disables synchronous replication (synchronous_standby_names=''), but not synchronous_mode. In this case, only the node that was previously master is allowed to acquire the leader lock.
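
The eligibility rule can be sketched as below. This is an illustration only. The key names `leader` and `sync_standby` for the state stored in the DCS are hypothetical, and always allowing the previous master to retake the lock is an assumption that goes slightly beyond the text above.

```python
def may_take_leader_lock(name, synchronous_mode, sync_state):
    """Return True if the node `name` may acquire the leader lock."""
    if not synchronous_mode or not sync_state:
        return True  # no synchronous guarantee to enforce
    allowed = {sync_state.get('leader'), sync_state.get('sync_standby')}
    allowed.discard(None)  # no sync standby: only the old master remains
    return name in allowed
```

Because the master enables synchronous replication first and publishes the state afterwards, a standby that finds its own name in the published state knows it was synchronous when the master failed.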

Added acceptance tests and documentation.

Implementation by @ants with extensive review by @CyberDem0n.
2016-10-19 16:12:51 +02:00
Alexander Kukushkin
1e573aec8f Do session/renew call to Consul when update_leader is called (#336) 2016-10-10 10:05:55 +02:00
Alexander Kukushkin
4594bc98da Increase timeouts when running AT on travis (#324)
* Increase timeouts two times when running AT on travis
* Make up to 3 attempts to download DCS
* Get rid of hard-coded names
2016-09-28 15:13:09 +02:00
Alexander Kukushkin
10c7fa41f3 Exclude unhealthy nodes when choosing where to clone from (#313)
Node MUST have tag clonefrom: true, be in the 'running' state and also
we should not try to clone from itself.
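
The filter described above can be sketched as follows. The layout of the member dictionaries is hypothetical and serves only this example.

```python
def clone_candidates(members, my_name):
    """Return names of members this node may clone from."""
    return [
        m['name'] for m in members
        if m['name'] != my_name                        # never clone from itself
        and m.get('state') == 'running'                # only healthy nodes
        and m.get('tags', {}).get('clonefrom') is True
    ]


members = [
    {'name': 'node1', 'state': 'running', 'tags': {'clonefrom': True}},
    {'name': 'node2', 'state': 'stopped', 'tags': {'clonefrom': True}},
    {'name': 'node3', 'state': 'running', 'tags': {}},
]
print(clone_candidates(members, 'node3'))
```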
2016-09-21 09:42:48 +02:00
Alexander Kukushkin
0b1bfeca5b Make sure that we are running and testing latest versions of everything (#303) 2016-09-19 13:32:53 +02:00
Alexander Kukushkin
33ff372ef6 Always try to rewind on manual failover 2016-09-01 11:08:26 +02:00
Alexander Kukushkin
1dcdd6eaa0 Acceptance tests for pause mode 2016-08-30 16:50:07 +02:00
Alexander Kukushkin
366ed9cc52 fix pep8 formatting and implement missing tests 2016-08-29 15:39:24 +02:00
Murat Kabilov
a47a2bceff Manage scheduled restarts using patronictl (#248)
2016-08-09 12:54:48 +02:00
Oleksii Kliukin
ffd27b5705 Rename with_pending_restart to restart_pending. 2016-07-13 11:07:37 +02:00
Oleksii Kliukin
bf95b75489 Use the parameter that really sets the pending_restart flag. 2016-07-11 18:20:15 +02:00
Oleksii Kliukin
c91eda8d78 Merge branch 'master' into feature/scheduled_restarts 2016-07-11 12:56:24 +02:00
Alexander Kukushkin
ae88e7c96e Document that every single zookeeper host:port MUST be quoted
otherwise yaml library can not parse the list.
And make visible yaml exception when trying to parse this list.
2016-06-29 14:25:50 +02:00
Oleksii Kliukin
7a1e2e0c72 Fix the assert message. 2016-06-28 17:11:13 +02:00
Oleksii Kliukin
d2832ee43b Address the code review.
Fix return value in the should_run_scheduled_action and the comments.
Correct the json composition in the scheduled_restart test.
Fix the delete in case there is no scheduled restart.
Fix the usage of format in the logger output.
Fix the indentation in the evaluate_scheduled_restart.
Fix the condition related to the body_is_optional in the do_POST_restart.
Fix a few typos in the error messages.
Fix the _read_json_content
Make the scheduled restart unit-tests a bit less ugly
2016-06-28 16:54:20 +02:00
Oleksii Kliukin
29845dd383 Restart the node according to the schedule.
The scheduled restart data structures are now independent of those
used by the normal restarts. This would be fixed in subsequent
commits.
Add the behave tests, that cover the POST /restart (but not DELETE).
2016-06-23 10:43:54 +02:00