Mirror of https://github.com/outbackdingo/patroni.git (synced 2026-01-27 10:20:10 +00:00)
This commit is a breaking change:

1. `role` in DCS is written as "primary" instead of "master".
2. `role` in REST API responses is also written as "primary".
3. The REST API no longer accepts role=master in requests (for example, the switchover/failover/restart endpoints).
4. The `/metrics` REST API endpoint no longer reports `patroni_master`.
5. `patronictl` no longer accepts the `--master` argument.
6. The `no_master` option in the declarative configuration of custom replica creation methods is no longer treated as a special option; please use `no_leader` instead.
7. `patroni_wale_restore` no longer accepts `--no_master`.
8. `patroni_barman` no longer accepts `--role=master`.
9. Callback scripts are executed with role=primary instead of role=master.
10. On Kubernetes, Patroni by default sets the role label to primary. If you want to keep the old behavior and avoid downtime or a lengthy, complex migration, you can set `kubernetes.leader_label_value` and `kubernetes.standby_leader_label_value` to `master`.

However, a few exceptions regarding "master" remain in place:

1. The `GET /master` REST API endpoint continues to work.
2. `master_start_timeout` and `master_stop_timeout` in the global configuration are still accepted.
3. The `master` tag is still preserved in Consul services in addition to `primary`.

Rationale for these exceptions: the DBA doesn't always fully control the infrastructure and can't always adjust the surrounding configuration.
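For point 10 above, keeping the pre-4.0 label value means adding the two options named in the commit message to the `kubernetes` section of the Patroni configuration. A minimal sketch (the option names come from the text above; the surrounding YAML layout follows Patroni's usual configuration file structure):

```yaml
# Keep the old "master" role label on Kubernetes objects so that
# existing label selectors (Services, NetworkPolicies, etc.) that
# match role=master continue to find the leader pod after upgrading.
kubernetes:
  leader_label_value: master
  standby_leader_label_value: master
```

Without these options, Patroni 4.0 labels the leader pod with role=primary, and any selector still matching role=master would silently stop routing traffic to the leader.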
73 lines
4.0 KiB
Gherkin
Feature: standby cluster

  Scenario: prepare the cluster with logical slots
    Given I start postgres1
    Then postgres1 is a leader after 10 seconds
    And there is a non empty initialize key in DCS after 15 seconds
    When I issue a PATCH request to http://127.0.0.1:8009/config with {"slots": {"pm_1": {"type": "physical"}}, "postgresql": {"parameters": {"wal_level": "logical"}}}
    Then I receive a response code 200
    And Response on GET http://127.0.0.1:8009/config contains slots after 10 seconds
    And I sleep for 3 seconds
    When I issue a PATCH request to http://127.0.0.1:8009/config with {"slots": {"test_logical": {"type": "logical", "database": "postgres", "plugin": "test_decoding"}}}
    Then I receive a response code 200
    And I do a backup of postgres1
    When I start postgres0
    Then "members/postgres0" key in DCS has state=running after 10 seconds
    And replication works from postgres1 to postgres0 after 15 seconds
    When I issue a GET request to http://127.0.0.1:8008/patroni
    Then I receive a response code 200
    And I receive a response replication_state streaming
    And "members/postgres0" key in DCS has replication_state=streaming after 10 seconds

  @slot-advance
  Scenario: check permanent logical slots are synced to the replica
    Given I run patronictl.py restart batman postgres1 --force
    Then Logical slot test_logical is in sync between postgres0 and postgres1 after 10 seconds

  Scenario: Detach exiting node from the cluster
    When I shut down postgres1
    Then postgres0 is a leader after 10 seconds
    And "members/postgres0" key in DCS has role=primary after 5 seconds
    When I issue a GET request to http://127.0.0.1:8008/
    Then I receive a response code 200

  Scenario: check replication of a single table in a standby cluster
    Given I start postgres1 in a standby cluster batman1 as a clone of postgres0
    Then postgres1 is a leader of batman1 after 10 seconds
    When I add the table foo to postgres0
    Then table foo is present on postgres1 after 20 seconds
    When I issue a GET request to http://127.0.0.1:8009/patroni
    Then I receive a response code 200
    And I receive a response replication_state streaming
    And I sleep for 3 seconds
    When I issue a GET request to http://127.0.0.1:8009/primary
    Then I receive a response code 503
    When I issue a GET request to http://127.0.0.1:8009/standby_leader
    Then I receive a response code 200
    And I receive a response role standby_leader
    And there is a postgres1_cb.log with "on_role_change standby_leader batman1" in postgres1 data directory
    When I start postgres2 in a cluster batman1
    Then postgres2 role is the replica after 24 seconds
    And postgres2 is replicating from postgres1 after 10 seconds
    And table foo is present on postgres2 after 20 seconds
    When I issue a GET request to http://127.0.0.1:8010/patroni
    Then I receive a response code 200
    And I receive a response replication_state streaming
    And postgres1 does not have a replication slot named test_logical

  Scenario: check switchover
    Given I run patronictl.py switchover batman1 --force
    Then Status code on GET http://127.0.0.1:8010/standby_leader is 200 after 10 seconds
    And postgres1 is replicating from postgres2 after 32 seconds
    And there is a postgres2_cb.log with "on_start replica batman1\non_role_change standby_leader batman1" in postgres2 data directory

  Scenario: check failover
    When I kill postgres2
    And I kill postmaster on postgres2
    Then postgres1 is replicating from postgres0 after 32 seconds
    And Status code on GET http://127.0.0.1:8009/standby_leader is 200 after 10 seconds
    When I issue a GET request to http://127.0.0.1:8009/primary
    Then I receive a response code 503
    And I receive a response role standby_leader
    And replication works from postgres0 to postgres1 after 15 seconds
    And there is a postgres1_cb.log with "on_role_change replica batman1\non_role_change standby_leader batman1" in postgres1 data directory