# mirror of https://github.com/outbackdingo/patroni.git, synced 2026-01-27 18:20:05 +00:00 (74 lines, 4.3 KiB, Gherkin)
Feature: citus
  Check that the coordinator discovers and registers workers, and that clients see no errors when a worker cluster switches over

  Scenario: check that worker cluster is registered in the coordinator
    Given I start postgres0 in citus group 0
    And I start postgres2 in citus group 1
    Then postgres0 is a leader in a group 0 after 10 seconds
    And postgres2 is a leader in a group 1 after 10 seconds
    When I start postgres1 in citus group 0
    And I start postgres3 in citus group 1
    Then replication works from postgres0 to postgres1 after 15 seconds
    Then replication works from postgres2 to postgres3 after 15 seconds
    And postgres0 is registered in the postgres0 as the primary in group 0 after 5 seconds
    And postgres2 is registered in the postgres0 as the primary in group 1 after 5 seconds

  Scenario: coordinator failover updates pg_dist_node
    Given I run patronictl.py failover batman --group 0 --candidate postgres1 --force
    Then postgres1 role is the primary after 10 seconds
    And "members/postgres0" key in a group 0 in DCS has state=running after 15 seconds
    And replication works from postgres1 to postgres0 after 15 seconds
    And postgres1 is registered in the postgres2 as the primary in group 0 after 5 seconds
    And "sync" key in a group 0 in DCS has sync_standby=postgres0 after 15 seconds
    When I run patronictl.py switchover batman --group 0 --candidate postgres0 --force
    Then postgres0 role is the primary after 10 seconds
    And replication works from postgres0 to postgres1 after 15 seconds
    And postgres0 is registered in the postgres2 as the primary in group 0 after 5 seconds
    And "sync" key in a group 0 in DCS has sync_standby=postgres1 after 15 seconds

  Scenario: worker switchover doesn't break client queries on the coordinator
    Given I create a distributed table on postgres0
    And I start a thread inserting data on postgres0
    When I run patronictl.py switchover batman --group 1 --force
    Then I receive a response returncode 0
    And postgres3 role is the primary after 10 seconds
    And "members/postgres2" key in a group 1 in DCS has state=running after 15 seconds
    And replication works from postgres3 to postgres2 after 15 seconds
    And postgres3 is registered in the postgres0 as the primary in group 1 after 5 seconds
    And "sync" key in a group 1 in DCS has sync_standby=postgres2 after 15 seconds
    And a thread is still alive
    When I run patronictl.py switchover batman --group 1 --force
    Then I receive a response returncode 0
    And postgres2 role is the primary after 10 seconds
    And replication works from postgres2 to postgres3 after 15 seconds
    And postgres2 is registered in the postgres0 as the primary in group 1 after 5 seconds
    And "sync" key in a group 1 in DCS has sync_standby=postgres3 after 15 seconds
    And a thread is still alive
    When I stop a thread
    Then a distributed table on postgres0 has expected rows

  Scenario: worker primary restart doesn't break client queries on the coordinator
    Given I cleanup a distributed table on postgres0
    And I start a thread inserting data on postgres0
    When I run patronictl.py restart batman postgres2 --group 1 --force
    Then I receive a response returncode 0
    And postgres2 role is the primary after 10 seconds
    And replication works from postgres2 to postgres3 after 15 seconds
    And postgres2 is registered in the postgres0 as the primary in group 1 after 5 seconds
    And a thread is still alive
    When I stop a thread
    Then a distributed table on postgres0 has expected rows

  Scenario: check that in-flight transaction is rolled back after timeout when other workers need to change pg_dist_node
    Given I start postgres4 in citus group 2
    Then postgres4 is a leader in a group 2 after 10 seconds
    And "members/postgres4" key in a group 2 in DCS has role=master after 3 seconds
    When I run patronictl.py edit-config batman --group 2 -s ttl=20 --force
    Then I receive a response returncode 0
    And I receive a response output "+ttl: 20"
    Then postgres4 is registered in the postgres2 as the primary in group 2 after 5 seconds
    When I shut down postgres4
    Then there is a transaction in progress on postgres0 changing pg_dist_node after 5 seconds
    When I run patronictl.py restart batman postgres2 --group 1 --force
    Then a transaction finishes in 20 seconds