2124 Commits

Author SHA1 Message Date
Alexander Kukushkin
39875f448c Release v3.0.2 (#2617)
- bump version
- update release notes
- update links to Postgres Slack
- simplify /sync health-check endpoint code
- update unit-tests to cover missing lines
v3.0.2
2023-03-24 08:54:54 +01:00
Israel
a1095e385c Handle patronictl edit-config diff pager in a more user friendly way (#2605)
`patronictl edit-config` requires a pager to show the diff output back to the user. It used to be hard-coded to use either `less` or `more`.

When these tools were not available on the host, `patronictl` would hit an exception in the `ydiff` module and print the stack trace to the console.

This PR changes the `patronictl edit-config` command to behave like this:

- If the `PAGER` environment variable is set, attempt to find the corresponding executable.
- If `PAGER` is not set or points to an invalid executable, attempt to use either `less` or `more`, as before.
- If no executable is found at all, raise a `PatroniCtlException` with a user-friendly message.
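
A minimal sketch of the new lookup order in action (the config path and cluster name below are placeholders, not taken from this PR):

```sh
# Use `cat` as the pager so the config diff is printed straight to the terminal.
# If PAGER pointed to a non-existing executable, patronictl would fall back to
# less/more, or raise a PatroniCtlException with a readable message.
PAGER=cat patronictl -c /etc/patroni/patroni.yml edit-config batman
```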

Unit tests in `tests/test_ctl.py` were modified accordingly.

References: PAT-21
Close #2604
2023-03-23 13:43:48 +01:00
Israel
84353b88c9 Add docstrings and type annotations to patroni/daemon.py (#2610)
References: PAT-38
2023-03-23 13:41:16 +01:00
T.v.Dein
60723f5fa4 Add metric to report about sync standby replica status (#2615)
Close #2613

Co-authored-by: Alexander Kukushkin <cyberdemn@gmail.com>
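
A quick way to eyeball the new metric, assuming the REST API listens on the default localhost:8008 (the exact metric name is not spelled out in this message):

```sh
# The sync-standby status is exposed through the standard Prometheus endpoint.
curl -s http://localhost:8008/metrics | grep -i sync
```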
2023-03-23 09:32:29 +01:00
Alexander Kukushkin
a8b90f0cd6 Make sure Cluster.sync is never empty (#2614)
It was possible for it to be empty if all cluster keys were missing in DCS. In this case the `Cluster` object was created manually with all values set to `None` or `[]` (including `sync`).
This already resulted in #2217, which in fact wasn't a correct fix.

In order to solve it and reduce code duplication, we introduce the `Cluster.empty()` and `SyncState.empty()` methods, which create the corresponding empty objects, and start using `Cluster.empty()` in all places where an empty `Cluster` object used to be created manually.
2023-03-22 16:41:41 +01:00
Israel
918674e7bb Document code in patroni/version.py (#2611)
References: PAT-39

Signed-off-by: Israel Barth Rubio <israel.barth@enterprisedb.com>
2023-03-22 11:46:15 +01:00
Alexander Kukushkin
ddac8683e6 Use config file as a fallback when all current etcd nodes failed (#2599)
If communication with the etcd nodes fails, it is logical to start from scratch, from the nodes that are listed in the config. But it could happen that the config is in fact outdated and all nodes in the real cluster have been replaced.

Previously we used to track whether the config file had changed, which turned out not to work in all possible cases.
The new strategy is a bit different: if communication with all nodes fails, we keep the last known topology and at the same time try to figure out the new one by merging two lists together, the cached list and the list from the config file.
2023-03-14 15:54:17 +01:00
Víctor Oriol i Aguilar
36c17e944b high availability across multiple datacenter #2587 (#2598)
Documentation about how to deploy high availability across multiple datacenters.

Close #2587
2023-03-14 15:39:50 +01:00
Alexander Kukushkin
c1bfb0e6d6 Remove python 2.7 support (#2571)
- get rid of 2.7-specific modules: `six`, `ipaddress`
- use the Python 3 unpacking operator
- use `shutil.which()` instead of `find_executable()`
2023-03-13 17:00:04 +01:00
Polina Bungina
373affe707 Use IMDSv2 in aws callback example script (#2590) 2023-03-13 13:31:57 +01:00
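
For reference, the IMDSv2 pattern used by such a callback script looks roughly like this (a generic sketch of the AWS token flow, not a copy of the script):

```sh
# Request a session token first, then pass it with every metadata request.
TOKEN=$(curl -sX PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -sH "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/instance-id
```
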
Alexander Kukushkin
95ba8b9e59 Fix bug with metadata after coordinator failover (#2597)
We made an incorrect assumption that `citus_set_coordinator_host()` would trigger a `pg_dist_node` sync. Instead we should also use `citus_update_node()` and call `citus_set_coordinator_host()` only during bootstrap.

Adjust behave tests to verify that coordinator failover is visible on workers.
2023-03-13 13:30:39 +01:00
Benoit
60a7e5a514 Fix typo in set_state: initializing new cluster (#2586) 2023-03-10 09:41:17 +01:00
Alexander Kukushkin
eefa15b390 Make K8s retriable HTTP status code configurable (#2585)
The configuration parameter is `kubernetes.retriable_http_codes`, or the `PATRONI_KUBERNETES_RETRIABLE_HTTP_CODES` environment variable.

These status codes are added to the default list of 500, 503, and 504.
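
For example, adding one more retriable code via the environment could look like this (520 is just an illustrative value, and the accepted list format is an assumption):

```sh
# 500, 503 and 504 remain retriable by default; 520 is added on top of them.
export PATRONI_KUBERNETES_RETRIABLE_HTTP_CODES=520
```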

Close https://github.com/zalando/patroni/issues/2536
2023-03-10 09:38:12 +01:00
Alexander Kukushkin
8622fcea3d Switch to GH forms for issues (#2594)
and make link to #patroni channel on PostgreSQL Slack more visible
2023-03-10 09:37:41 +01:00
Alexander Kukushkin
2afcaa9d83 Don't write to PGDATA if major version is not known (#2583)
It could happen that Patroni is started before PGDATA is mounted. In this case Patroni can't determine the major Postgres version from the PG_VERSION file. Later, when PGDATA is mounted, Patroni would try to create recovery.conf even if the actual Postgres major version is newer than 12.

To mitigate the problem, we double-check that `Postgresql._major_version` is set before writing the recovery configuration or starting Postgres.

Close https://github.com/zalando/patroni/issues/2434
2023-03-06 16:33:32 +01:00
Alexander Kukushkin
09d0d78b74 Don't allow on_reload callback kill other callbacks (#2578)
For a long time Patroni has enforced that only one callback script runs at a time. If a new callback is executed while the old one is still running, the old one is killed (including all child processes).

Such behavior is fine for all callbacks except on_reload, because the latter may accidentally cancel important ones that, for example, update DNS or assign/remove a virtual IP.

To mitigate the problem we introduce a dedicated executor for on_reload callbacks, so that an on_reload may only cancel another on_reload.

Ref: https://github.com/zalando/patroni/issues/2445
2023-03-06 16:33:03 +01:00
Burak Ergen
89595babdf add "GET /metrics" rest_api.rst (#2576) 2023-03-02 09:40:54 +01:00
Alexander Kukushkin
dff5537954 Compatibility with flake8>=5.0 (#2579)
The main() function now returns an exit code instead of exiting on its own.
2023-03-02 09:16:17 +01:00
Alexander Kukushkin
c985974ece Set hot_standby=off only if recovery_target_action=promote (#2570)
During a custom bootstrap, `hot_standby` is set to off to protect Postgres from panicking and shutting down when some parameters like `max_connections` are increased on the primary.

According to the [documentation](https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-RECOVERY-TARGET-ACTION), `hot_standby` set to `off` affects the behavior of `recovery_target_action`, and `pause` starts acting as `shutdown`:
> If [hot_standby](https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-HOT-STANDBY) is not enabled, a setting of pause will act the same as shutdown

This is not what users expect/need, because normally they resolve the pause state on their own.

To solve the problem, we set `hot_standby` to `off` during a custom bootstrap only if `recovery_target_action` is set to `promote`.

Close https://github.com/zalando/patroni/issues/2569
2023-02-28 10:08:42 +01:00
Lukáš Lalinský
388bb40b71 Fix patronictl switchover on Citus cluster running on Kubernetes (#2562)
The patronictl code tries to initialize the DCS twice, first for the current Citus group and then a second time for the selected group. However, kubernetes.py was overwriting the namespace config. As a result, after the second initialization patronictl was trying to work with the `default` namespace instead of the configured one.
2023-02-28 10:07:27 +01:00
Polina Bungina
422047f105 Release 3.0.1 (#2561)
* Bump version
* Update release notes
* Return 3.6 to supported versions in setup.py
v3.0.1
2023-02-16 08:51:47 +01:00
Polina Bungina
b85f155dbe Pass 'master' role to a callback script instead of 'promoted' (#2554)
Co-authored-by: Alexander Kukushkin <cyberdemn@gmail.com>
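
For context, Patroni passes the action, the role, and the cluster name to callback scripts, so after this change a promotion is reported roughly like this (script path and cluster name are placeholders):

```sh
# Roughly what Patroni executes when the node is promoted.
/usr/local/bin/on_role_change.sh on_role_change master batman
```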
2023-02-08 14:09:51 +01:00
Alexander Kukushkin
1669a49b2d Switch to Citus 11.2 (#2548)
- Update Dockerfile.citus files
- Enable behave tests with Citus
2023-02-03 15:29:25 +01:00
Alexander Kukushkin
8ac8ed6584 Update Citus link to the github.com repo (#2546)
Per suggestion from @clairegiordano
2023-02-02 11:50:19 +01:00
Alexander Kukushkin
7869f5e211 Release 3.0.0 (#2545)
* bump version
* update release notes
* removed 2.7, 3.4, 3.5, and 3.6 from supported versions in setup.py
* switched GH actions back to ubuntu-latest, removed tests with 2.7 and 3.6, and added 3.11
* some little fixes in Citus documentation and behave tests
v3.0.0
2023-01-30 10:29:08 +01:00
Alexander Kukushkin
45e5ac2baf Remove patronictl scaffold (#2544)
The only reason for having it was a hacky way of running standby clusters.
2023-01-27 08:52:59 +01:00
Alexander Kukushkin
4c3af2d1a0 Change master->primary/leader/member (#2541)
Keep as much backward compatibility as possible.

The following changes were made:
1. All internal checks are performed as `role in ('master', 'primary')`
2. All internal variables/functions/methods are renamed
3. `GET /metrics` endpoint returns `patroni_primary` in addition to `patroni_master`.
4. Logs are changed to use leader/primary/member/remote depending on the context
5. Unit tests use only role = 'primary' instead of 'master' to verify that item 1 works.
6. patronictl still supports the old syntax, but also accepts `--leader` and `--primary` (see the example below).
7. `master_(start|stop)_timeout` is automatically translated to `primary_(start|stop)_timeout` if the latter is not set.
8. updated the documentation and some examples

Future plan: in the next major release switch the role name from `master` to `primary` and maybe drop `master` altogether.
The Kubernetes implementation will require more work and will keep two labels in parallel. Label values should probably be configurable as described in https://github.com/zalando/patroni/issues/2495.
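
A sketch of the new patronictl spelling mentioned in item 6 (cluster and member names are placeholders):

```sh
# --leader is now accepted alongside the historical --master option.
patronictl -c postgres.yml switchover batman --leader postgresql0 --candidate postgresql1 --force
```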
2023-01-27 07:40:24 +01:00
Alexander Kukushkin
0273eac15e Compatibility with pyinstaller (#2537)
It doesn't like relative imports and doesn't recognise `http.server` imported via `six`.
The latter is explicitly added to the list of `hiddenimports` and will break compatibility with Python 2.7, whose support will be dropped in the next Patroni release anyway.

Close https://github.com/zalando/patroni/issues/2535
2023-01-26 16:35:30 +01:00
Alexander Kukushkin
79458688d1 Check unexpected exceptions in Patroni logs after behave (#2538)
and make behave fail if anything unexpected is found.

In addition to that, fix the globbing rule used when uploading artifacts with logs.
2023-01-25 11:02:52 +01:00
Alexander Kukushkin
4872ac51e0 Citus integration (#2504)
A Citus cluster (coordinator and workers) will be stored in DCS as a fleet of Patroni clusters logically grouped together:
```
/service/batman/
/service/batman/0/
/service/batman/0/initialize
/service/batman/0/leader
/service/batman/0/members/
/service/batman/0/members/m1
/service/batman/0/members/m2
/service/batman/
/service/batman/1/
/service/batman/1/initialize
/service/batman/1/leader
/service/batman/1/members/
/service/batman/1/members/m1
/service/batman/1/members/m2
...
```

Here 0 is the Citus group of the coordinator, and 1, 2, etc. are worker groups.

Such a hierarchy allows reading the entire Citus cluster with a single call to the DCS (except for ZooKeeper).

The get_cluster() method will read the entire Citus cluster on the coordinator because it needs to discover workers. For a worker cluster it will read only the subtree of its own group.

Besides that, we introduce a new method, get_citus_coordinator(). It will be used only by worker clusters.

Since there are no hierarchical structures on K8s, we will use the Citus group suffix on all objects that Patroni creates.
E.g.
```
batman-0-leader  # the leader config map for the coordinator
batman-0-config  # the config map holding initialize, config, and history "keys"
...
batman-1-leader  # the leader config map for worker group 1
batman-1-config
...
```

Citus integration is enabled from patroni.yaml:
```yaml
citus:
  database: citus
  group: 0  # 0 is for coordinator, 1, 2, etc are for workers
```

If enabled, Patroni will create the database and the citus extension in it, and will INSERT INTO `pg_dist_authinfo` the information required for Citus nodes to communicate with each other, i.e. 'password', 'sslcert', and 'sslkey' for the superuser, if they are defined in the Patroni configuration file.

When the new Citus coordinator/worker is bootstrapped, Patroni adds `synchronous_mode: on` to the `bootstrap.dcs` section.

Besides that, Patroni takes over management of some Postgres GUCs:
- `shared_preload_libraries` - Patroni ensures that "citus" is added in the first position
- `max_prepared_transactions` - if not set or set to 0, Patroni changes the value to `max_connections*2`
- `wal_level` - automatically set to `logical`. It is used by Citus to move/split shards. Under the hood Citus creates/removes replication slots, and they are automatically added by Patroni to the `ignore_slots` configuration to avoid accidental removal.

The coordinator primary actively discovers worker primary nodes and registers/updates them in the `pg_dist_node` table using
citus_add_node() and citus_update_node() functions.

Patroni running on the coordinator provides a new REST API endpoint: `POST /citus`. It is used by workers to facilitate controlled switchovers and restarts of worker primaries.
When a worker primary needs to shut down Postgres because of a restart or switchover, it calls the `POST /citus` endpoint on the coordinator, and Patroni on the coordinator starts a transaction and calls `citus_update_node(nodeid, 'host-demoted', port)` in order to pause client connections that work with the given worker.
Once the new leader is elected or Postgres is started back up, another call to the `POST /citus` endpoint is performed, which does another `citus_update_node()` call with the actual hostname and port and commits the transaction. After the transaction is committed, the coordinator reestablishes connections to the worker node and client connections are unblocked.
If clients don't run long transactions, the operation finishes without client-visible errors, just a short latency spike.

All operations on `pg_dist_node` are serialized by Patroni on the coordinator. This allows more control, and a transaction in progress can be rolled back if its lifetime exceeds a certain threshold and other worker nodes need to be updated.
2023-01-24 16:14:58 +01:00
Alexander Kukushkin
3161f31088 Enhanced sync connections check (#2524)
When the `synchronous_standby_names` GUC is changed, PostgreSQL nearly immediately starts reporting the corresponding walsenders as synchronous, while in fact they may not have reached that state yet. To mitigate this problem we memorize the current flush LSN on the primary right after the change of `synchronous_standby_names` becomes visible and use it as an additional check for walsenders.
A walsender will be counted as truly "sync" only when its write/flush/replay LSN has reached the memorized LSN and its `application_name` is known to be a part of `synchronous_standby_names`.

The size of the PR is mostly due to refactoring and moving the code responsible for working with `synchronous_standby_names` and `pg_stat_replication` into a dedicated file.
The `parse_sync_standby_names()` function was mostly copied from #672.
2023-01-24 15:05:54 +01:00
Alexander Kukushkin
40d16443f9 Fixes and improvements in failsafe (#2532)
1. Fix a problem with logical slots not advancing when only the primary lost access to DCS
2. Don't let Patroni join as a Raft voting member when running failsafe behave tests. This allows testing exactly the same conditions as for the other DCS implementations
3. Speed up the dcs_failsafe_mode behave tests by getting rid of long sleeps, slightly reshuffling the places where we start/stop the outage, and by killing Patroni/Postgres to avoid a long shutdown due to leader key removal attempts.
2023-01-24 14:07:31 +01:00
Alexander Kukushkin
1e208736f8 Refactor drop_replication_slot() and _drop_incorrect_slots() (#2534)
Use a CTE to avoid running the second query if pg_drop_replication_slot() failed.
2023-01-23 16:46:07 +01:00
William Albertus Dembo
f06d432dab Keep only latest failed data directory (#2471)
Use a constant postfix when moving the data directory after a failure, so that only the data from the latest failure is kept.
2023-01-19 21:47:41 +01:00
Polina Bungina
838653325a Clean pg_replslot/ after pg_rewind (#2531)
As pg_rewind cleans this directory on the target only since PostgreSQL 11

Co-authored-by: Alexander Kukushkin <cyberdemn@gmail.com>
2023-01-19 15:50:30 +01:00
Michael Banck
06bbe2eadc Suppress recurring errors when dropping unknown but active replication slots (#2502)
When a replication slot is not registered with Patroni but is active, Patroni would log an error during each HA cycle under certain conditions (after a restart or role change). To avoid this, first check if the replication slot we are about to drop is still active and, if so, only log a warning. Otherwise, log the slot we are dropping for informational purposes.

Close: #2499
2023-01-19 09:53:17 +01:00
Alexander Kukushkin
b75cd5a7d9 Submit coverage to codacy only if secret is available (#2528)
If a PR is opened from an external GH repo, secrets are not set for security reasons. This makes the codacy coverage report fail.

Co-authored-by: Polina Bungina <bungina@gmail.com>
2023-01-17 15:28:39 +01:00
Polina Bungina
acecbe0d8f Fix a couple of linter problems, delete TODO.md (#2526)
Fix a couple of linter problems, remove trailing whitespaces

Co-authored-by: Alexander Kukushkin <cyberdemn@gmail.com>
2023-01-17 10:52:03 +01:00
Alexander Kukushkin
2ea0357854 DCS failsafe mode (#2379)
If enabled, it will allow Patroni to cope with DCS outages.
In case of a DCS outage the leader tries to call all remaining members in the cluster via the API, and if all of them respond with success, the leader will not be demoted.

The failsafe_mode could be enabled by running
```sh
patronictl edit-config -s failsafe_mode=true
```

or by calling the `/config` REST API endpoint.
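
For instance, the same change via the REST API could look like this (assuming the API listens on the default localhost:8008):

```sh
# PATCH the dynamic configuration directly.
curl -s -XPATCH -d '{"failsafe_mode": true}' http://localhost:8008/config
```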

Co-authored-by: Polina Bungina <bungina@gmail.com>
2023-01-13 13:35:05 +01:00
Polina Bungina
b13354b6a3 Make launch.sh pass shellcheck (#2522) 2023-01-12 09:14:47 +01:00
Alexander Kukushkin
5bbb5dceeb Improve /(a)sync checks in behave tests (#2521)
They frequently fail because sometimes replicas are a bit slow to realize that they are synchronous. Instead of introducing more sleeps, we will poll for the required HTTP status code with some timeout.
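
The idea, roughly, in shell form (endpoint and port are the Patroni defaults; this is not the actual behave implementation):

```sh
# Poll /sync until it returns a success code (200), giving up after 30 seconds.
timeout 30 bash -c 'until curl -fso /dev/null http://localhost:8008/sync; do sleep 1; done'
```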
2023-01-12 08:23:59 +01:00
Polina Bungina
650344fca8 Update Slack link in README.rst and CONTRIBUTING.rst (#2520)
* Update Slack link in README.rst and CONTRIBUTING.rst
2023-01-11 16:06:25 +01:00
Polina Bungina
9de22e667b Report coverage to Codacy for behave tests (#2518) 2023-01-11 11:47:08 +01:00
Alexander Kukushkin
c12fe4146d Run only one query per HA loop (#2516)
If the cluster is stable (no nodes are joining/leaving/lagging) we want to run at most one monitoring query per HA loop. So far this worked perfectly, except when synchronous_mode is enabled, where we run two additional queries:
1. SHOW synchronous_mode
2. SELECT ... FROM pg_stat_replication

In order to solve it, we will include these "queries" in the common monitoring query if synchronous_mode is enabled.

In addition to that, make sure that `synchronous_standby_names` is reset on replicas that used to be a primary, and avoid using replicas which are not in the 'running' state.

P.S.: in the monitoring query we also extract the current value of `synchronous_standby_names`, because it will be useful for the quorum commit feature.

Close https://github.com/zalando/patroni/issues/2469
2023-01-10 10:44:17 +01:00
Alexander Kukushkin
baaf187c81 Fix behave tests on GH actions MacOS (#2515)
- the new macOS doesn't play well with old Go binaries (bump etcd)
- use brew to install Postgres and expect (unbuffer, to make the behave output colorful) and use the latest versions
- upload failed logs instead of grepping them to stdout
2023-01-05 12:32:39 +01:00
Alexander Kukushkin
442bd3f434 Compatibility with some old modules (#2514)
- old click handles argument names differently
- old pytest doesn't like `from mock import call`

Bump version and update release notes.

Close: https://github.com/zalando/patroni/issues/2508
Close: https://github.com/zalando/patroni/issues/2512
v2.1.7
2023-01-04 07:24:52 +01:00
Michael Banck
e3e4ad0ada Start etcd with V2 API enabled for V2 etcd acceptance tests (#2509)
Otherwise, the etcd (not etcd3) behave tests fail to connect:
```
Jan 02 09:56:18 HOOK-ERROR in before_all: AssertionError: etcd instance is not available for queries after 5 seconds
```
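
Recent etcd releases ship with the v2 API disabled, so the tests need it switched back on, e.g. (standard etcd flags; the data directory is a placeholder):

```sh
# Re-enable the legacy v2 API that the `etcd` (non-etcd3) implementation talks to.
etcd --enable-v2=true --data-dir /tmp/patroni-etcd-test
```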
2023-01-03 15:39:30 +01:00
Polina Bungina
bad158046e Release v2.1.6 (#2507)
* bump version
* update release notes

Co-authored-by: Alexander Kukushkin <cyberdemn@gmail.com>
v2.1.6
2022-12-30 13:32:34 +01:00
Alexander Kukushkin
55e1549341 Do not rely on 'role' value when checking other nodes via REST API (#2503)
When doing the leader race we need to check that the former primary isn't alive anymore. For that we relied on non-inclusive terms. In order to simplify future work on getting rid of all non-inclusive words, we change the check to rely on a difference in the format of the wal/xlog field: there is only "location" for the primary, and "replayed_location" + "received_location" for standbys.

In addition to that, we start supporting the "wal" field as well as the deprecated "xlog" one.
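
To see the difference being checked, the member status can be inspected by hand (default API port assumed):

```sh
# A primary reports only the xlog/wal "location"; standbys report
# "received_location" and "replayed_location" instead.
curl -s http://localhost:8008/patroni | python3 -m json.tool
```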

Co-authored-by: Polina Bungina <bungina@gmail.com>
2022-12-29 09:13:09 +01:00
Alexander Kukushkin
2d79757309 The Consul TTL is off by twice from reality (#2501)
We use `ttl/2.0` when setting the value on the HTTPClient, but forgot to multiply the current value by 2.
2022-12-27 12:06:29 +01:00