10 Commits

Author SHA1 Message Date
Alexander Kukushkin
d3e3b4e16f Minor tuning of tests (#2201)
- Reduce verbosity for unit tests
- Refactor GH actions config and try again macos behave tests
2022-02-10 15:38:16 +01:00
Alexander Kukushkin
c7173aadd7 Failover logical slots (#1820)
Effectively, this PR consists of a few changes:

1. The easy part:
  In case of permanent logical slots are defined in the global configuration, Patroni on the primary will not only create them, but also periodically update DCS with the current values of `confirmed_flush_lsn` for all these slots.
  In order to reduce the number of interactions with DCS the new `/status` key was introduced. It will contain the json object with `optime` and `slots` keys. For backward compatibility the `/optime/leader` will be updated if there are members with old Patroni in the cluster.

2. The tricky part:
  On replicas that are eligible for a failover, Patroni creates the logical replication slot by copying the slot file from the primary and restarting the replica. In order to copy the slot file Patroni opens a connection to the primary with `rewind` or `superuser` credentials and calls `pg_read_binary_file()`  function.
  When the logical slot already exists on the replica Patroni periodically calls `pg_replication_slot_advance()` function, which allows moving the slot forward.

3. Additional requirements:
  In order to ensure that primary doesn't cleanup tuples from pg_catalog that are required for logical decoding, Patroni enables `hot_standby_feedback` on replicas with logical slots and on cascading replicas if they are used for streaming by replicas with logical slots.

4. When logical slots are copied from to the replica there is a timeframe when it could be not safe to use them after promotion. Right now there is no protection from promoting such a replica. But, Patroni will show the warning with names of the slots that might be not safe to use.

Compatibility.
The `pg_replication_slot_advance()` function is only available starting from PostgreSQL 11. For older Postgres versions Patroni will refuse to create the logical slot on the primary.

The old "permanent slots" feature, which creates logical slots right after promotion and before allowing connections, was removed.

Close: https://github.com/zalando/patroni/issues/1749
2021-03-25 16:18:23 +01:00
Pavlo Golub
4cc6034165 Fix features/steps/standby_cluster.py under Windows (#1535)
Resolves #1534
2020-05-15 16:22:15 +02:00
Igor Yanchenko
2174d66f97 Rewriten shell scripts in python to make them compatible with windows (#1326) 2019-12-11 12:07:05 +01:00
Alexander Kukushkin
183adb7848 Housekeeping (#1284)
* Implement proper tests for `multiprocessing.set_start_method()`
* Exclude some watchdog code from coverage (it is used only for behave tests)
* properly use os.path.join for windows compatibility
* import DCS modules in `features/environment.py` on demand. It allows to run behave tests against chosen DCS without installing all dependencies.
* remove some unused behave code
* fix some minor issues in the dcs.kubernetes module
2019-11-21 13:27:55 +01:00
Alexander Kukushkin
f1f2389146 A couple of small improvements in acceptance tests (#1057)
* Keep basebackup and wal_archive next to PGDATA in the data directory
* Test bootstrap of standby cluster nodes with custom scripts
2019-05-13 16:33:19 +02:00
Alexander Kukushkin
e38fe78b56 Fix callbacks behavior (mostly for standby cluster) (#998)
First of all, this patch changes the behavior of `on_start`/`on_restart` callbacks, they will be called only when postgres is started or restarted without role changes. In case if the member is promoted or demoted only the `on_role_change` callback will be executed. `on_role_change` was never called for standby leader, only `on_start`/`on_restart` and with a wrong role argument.
Before that `on_role_change` was never called for standby leader, only `on_start`/`on_restart` and with a wrong role argument.

In addition to that, the REST API will return standby_leader role for the leader of the standby cluster.

Closes https://github.com/zalando/patroni/issues/988
2019-03-29 10:28:07 +01:00
Alexander Kukushkin
1a0876e5ca Refactor acceptance tests to improve stability (#884)
Hope it will crash less often when executed on travis against k8s
2018-11-30 12:40:56 +01:00
Alexander Kukushkin
2efd97baab Permanent replication slots (#819)
Permanent replication slots are preserved on failover/switchover, that is Patroni on the new primary will create configured replication slots right after doing promote.

Slots could be configured with the help of `patronictl edit-config`.
The initial configuration could be also done in the `bootstrap.dcs`

```yaml
slots:
  permanent_physical_1:
    type: physical
  permanent_logical_1:
    type: logical
    database: foo
    plugin: pgoutput
```

It is the responsibility of the operator to make sure that there are no clashes in names between replication slots automatically created by Patroni for members and permanent replication slots.

Closes https://github.com/zalando/patroni/issues/656
2018-10-31 11:37:42 +01:00
Dmitry Dolgov
dd7c3c349f [WIP] Standby cluster implementation (#679)
Implementation of "standby cluster" described in #657. Standby cluster consists
of a "standby leader", that replicates from a "remote master" (which is not a
part of current patroni cluster and can be anywhere), and cascade replicas,
that replicate from the corresponding standby leader. "Standby leader" behaves
pretty much like a regular leader, which means that it holds a leader lock in
DSC, in case if disappears there will be an election of a new "standby
leader".
One can define such a cluster using the section "standby_cluster" in patroni
config file. This section provides parameters for standby cluster, that will be
applied only once during bootstrap and can be changed only through DSC.
2018-09-07 10:10:56 +02:00