13 Commits

Author SHA1 Message Date
OutBackDingo
3a9befc1a2 update citus images and hosts, to generate helm charts 2025-04-25 10:32:39 +07:00
Alexander Kukushkin
8cdb0c25d9 Follow up on #2755 (#3137)
- don't register secondaries with `noloadbalance` tag.
- mention in the documentation that secondaries are also registered in `pg_dist_node`.
- update docker/kubernetes README files to include examples with secondaries being registered in `pg_dist_node`.
2024-08-27 09:34:12 +02:00
Polina Bungina
c1ee99d81d Update PG version in a couple of places (#2986)
* All dockerfiles to use PG16 by default
* PGVERSION env in the test pipelines to 16.1-1 by default
* 11->14 in the dcs-pg mapping for test pipelines
* Code comments fixes
2023-12-18 10:44:05 +01:00
Konstantin Demin
36e3dfbe41 update Dockerfiles (#2937)
- better cleanup for vim
- introduce dumb-init for patroni containers
2023-11-27 09:38:03 +01:00
Ali Mehraji
ac6f6ae1c2 Add ETCDCTL_API=3 env to Dockerfiles and update docker/README.md (#2946) 2023-11-22 08:55:51 +01:00
Polina Bungina
ffd1ad97d2 Fix Dockerfile_s (#2770)
* Install dumb-init using apt
* Remove python 2.7 packages purge
2023-07-21 15:10:13 +02:00
Alexander Kukushkin
6f91f4f4e2 Release v3.0.3 (#2719)
* Bump version
* Bump pyright version and fix newly reported issues
* Update release notes
* Fix typos, extend release process desc
* Add readthedocs configuration file v2
* Fix Dockerfile.citus files
2023-06-22 10:46:02 +02:00
mikecaat
f3c80d5706 Fix a minor error building a docker image for citus (#2705)
This handles the following syntax error.

$ docker build -t patroni-citus -f Dockerfile.citus .
(snip)
  => ERROR [builder 2/3] RUN set -ex     && export DEBIAN_FRONTEND=noninteractive     && echo  0.5s -
(snip)
  #5 0.456 /bin/sh: 1: Syntax error: end of file unexpected (expecting "fi")

Co-authored-by: Masahiro Ikeda <masahiro.ikeda.us@hco.ntt.co.jp>
2023-05-31 21:22:30 +02:00
Polina Bungina
6c8a3b0d25 Remove bootstrap.pg_hba (#2684)
* Remove bootstrap.pg_hba
* Extend docs for postgresql.pg_hba/pg_ident
* Add postgresql.pg_hba/pg_ident to dynamic config docs

---------

Co-authored-by: Alexander Kukushkin <cyberdemn@gmail.com>
2023-05-24 09:01:56 +02:00
Polina Bungina
db71ba3955 Fix dev Dockerfile.citus for arm (#2683)
- Fix dev Dockerfile.citus for arm
Don't purge lib required for citus run

* Change citus repo url, update citus version
2023-05-22 16:15:40 +02:00
Polina Bungina
44e58a1ba1 Dev docker images improvements (#2677)
- configurable image
- ETCD_UNSUPPORTED_ARCH env in docker-compose
- Build confd and citus for arm64 images
2023-05-15 11:40:35 +02:00
Alexander Kukushkin
1669a49b2d Switch to Citus 11.2 (#2548)
- Update Dockerfile.citus files
- Enable behave tests with Citus
2023-02-03 15:29:25 +01:00
Alexander Kukushkin
4872ac51e0 Citus integration (#2504)
A Citus cluster (coordinator and workers) will be stored in DCS as a fleet of Patroni clusters logically grouped together:
```
/service/batman/
/service/batman/0/
/service/batman/0/initialize
/service/batman/0/leader
/service/batman/0/members/
/service/batman/0/members/m1
/service/batman/0/members/m2
/service/batman/
/service/batman/1/
/service/batman/1/initialize
/service/batman/1/leader
/service/batman/1/members/
/service/batman/1/members/m1
/service/batman/1/members/m2
...
```

Here 0 is the Citus group for the coordinator and 1, 2, etc. are worker groups.

Such a hierarchy allows reading the entire Citus cluster with a single call to DCS (except for Zookeeper).
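
As an illustration of the layout above, here is a minimal sketch of how a flat list of keys returned by one recursive read could be split per Citus group. The function name, its parameters, and the sample key list are hypothetical and are not taken from the Patroni code base.

```python
from collections import defaultdict


def group_keys_by_citus_group(keys, namespace="/service", scope="batman"):
    """Split flat DCS keys into {group_id: [key relative to the group, ...]}.

    Hypothetical helper for illustration only; not part of Patroni.
    """
    prefix = f"{namespace}/{scope}/"
    groups = defaultdict(list)
    for key in keys:
        if not key.startswith(prefix):
            continue  # key belongs to another scope
        group, _, relative = key[len(prefix):].partition("/")
        if not group.isdigit() or not relative:
            continue  # skip directory entries such as /service/batman/0/
        groups[int(group)].append(relative)
    return dict(groups)


keys = [
    "/service/batman/",
    "/service/batman/0/",
    "/service/batman/0/initialize",
    "/service/batman/0/leader",
    "/service/batman/0/members/m1",
    "/service/batman/1/leader",
    "/service/batman/1/members/m1",
]
clusters = group_keys_by_citus_group(keys)
```

In this sketch the coordinator would look at every entry of `clusters`, while a worker in group 1 would only need `clusters[1]`.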

On the coordinator, the get_cluster() method reads the entire Citus cluster because it needs to discover workers. On a worker cluster, it reads only the subtree of its own group.

Besides that, we introduce a new method, get_citus_coordinator(). It will be used only by worker clusters.

Since there are no hierarchical structures on K8s, we will use the Citus group suffix on all objects that Patroni creates.
E.g.:
```
batman-0-leader  # the leader config map for the coordinator
batman-0-config  # the config map holding initialize, config, and history "keys"
...
batman-1-leader  # the leader config map for worker group 1
batman-1-config
...
```
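
The naming scheme in the example above can be sketched as a tiny helper. The function name and its arguments are hypothetical; only the resulting `scope-group-suffix` pattern comes from the listing.

```python
def k8s_object_name(scope, group, suffix):
    """Build an object name such as 'batman-0-leader'.

    Hypothetical helper; it only mirrors the pattern shown in the example.
    """
    return f"{scope}-{group}-{suffix}"


coordinator_leader = k8s_object_name("batman", 0, "leader")
worker_config = k8s_object_name("batman", 1, "config")
```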

Citus integration is enabled from patroni.yaml:
```yaml
citus:
  database: citus
  group: 0  # 0 is for coordinator, 1, 2, etc are for workers
```

If enabled, Patroni will create the database, create the citus extension in it, and INSERT INTO `pg_dist_authinfo` the information required for Citus nodes to communicate with each other, i.e. 'password', 'sslcert', and 'sslkey' for the superuser, if they are defined in the Patroni configuration file.

When a new Citus coordinator/worker is bootstrapped, Patroni adds `synchronous_mode: on` to the `bootstrap.dcs` section.

Besides that, Patroni takes over management of some Postgres GUCs:
- `shared_preload_libraries` - Patroni ensures that "citus" is listed first
- `max_prepared_transactions` - if not set or set to 0, Patroni changes the value to `max_connections*2`
- `wal_level` - automatically set to `logical`. Citus uses it to move/split shards. Under the hood, Citus creates and removes replication slots, and Patroni automatically adds them to the `ignore_slots` configuration to avoid accidental removal.
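
The three rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not Patroni's implementation: the function name is hypothetical, and the fallback of 100 for `max_connections` is an assumption (the PostgreSQL default).

```python
def adjust_citus_gucs(params):
    """Return a copy of `params` with the Citus-related GUC rules applied.

    Hypothetical sketch; not the code Patroni actually runs.
    """
    adjusted = dict(params)

    # "citus" must come first in shared_preload_libraries.
    raw = adjusted.get("shared_preload_libraries", "")
    libs = [lib.strip() for lib in raw.split(",") if lib.strip()]
    libs = ["citus"] + [lib for lib in libs if lib != "citus"]
    adjusted["shared_preload_libraries"] = ",".join(libs)

    # max_prepared_transactions: unset or 0 becomes max_connections * 2.
    # Assumption: fall back to 100 when max_connections is not given.
    max_connections = int(adjusted.get("max_connections", 100))
    if int(adjusted.get("max_prepared_transactions") or 0) == 0:
        adjusted["max_prepared_transactions"] = max_connections * 2

    # wal_level is always forced to logical.
    adjusted["wal_level"] = "logical"
    return adjusted


result = adjust_citus_gucs({
    "shared_preload_libraries": "pg_stat_statements, citus",
    "max_connections": 50,
    "wal_level": "replica",
})
```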

The coordinator primary actively discovers worker primary nodes and registers/updates them in the `pg_dist_node` table using the `citus_add_node()` and `citus_update_node()` functions.

Patroni running on the coordinator provides a new REST API endpoint: `POST /citus`. Workers use it to facilitate controlled switchovers and restarts of worker primaries.
When a worker primary needs to shut down Postgres because of a restart or switchover, it calls the `POST /citus` endpoint on the coordinator. Patroni on the coordinator then starts a transaction and calls `citus_update_node(nodeid, 'host-demoted', port)` in order to pause client connections that work with the given worker.
Once the new leader is elected or Postgres is started back up, the worker makes another call to the `POST /citus` endpoint, which performs another `citus_update_node()` call with the actual hostname and port and commits the transaction. After the transaction is committed, the coordinator reestablishes connections to the worker node and client connections are unblocked.
If clients don't run long transactions, the operation finishes without client-visible errors, with only a short latency spike.
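
The pause/resume sequence described above could look roughly like the sketch below. It only records the SQL statements the coordinator would issue instead of talking to a database. The class and method names are hypothetical, and reading `'host-demoted'` as the hostname with a `-demoted` suffix is an assumption.

```python
class CoordinatorSketch:
    """Records the statements a coordinator might run; illustration only."""

    def __init__(self):
        self.statements = []
        self.in_transaction = False

    def pause_worker(self, nodeid, host, port):
        # First POST /citus call: open a transaction and point the node
        # at a placeholder name so that client connections are paused.
        self.statements.append("BEGIN")
        self.in_transaction = True
        self.statements.append(
            f"SELECT citus_update_node({nodeid}, '{host}-demoted', {port})"
        )

    def resume_worker(self, nodeid, host, port):
        # Second POST /citus call: register the actual host and port,
        # then commit so that the coordinator reconnects to the worker.
        if not self.in_transaction:
            self.statements.append("BEGIN")
        self.statements.append(
            f"SELECT citus_update_node({nodeid}, '{host}', {port})"
        )
        self.statements.append("COMMIT")
        self.in_transaction = False


coordinator = CoordinatorSketch()
coordinator.pause_worker(2, "worker1-a", 5432)   # old primary is shutting down
coordinator.resume_worker(2, "worker1-b", 5432)  # new primary was elected
```

Because both `citus_update_node()` calls happen inside one transaction, clients wait for the commit rather than seeing the intermediate placeholder name.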

All operations on `pg_dist_node` are serialized by Patroni on the coordinator. This gives more control and makes it possible to ROLLBACK a transaction in progress if its lifetime exceeds a certain threshold and there are other worker nodes that should be updated.
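
The rollback condition can be expressed as a small predicate. This is only a sketch: the function name and parameters are hypothetical, and the threshold value is left to the caller because the message above does not state it.

```python
def should_rollback(started_at, now, threshold_seconds, other_workers_waiting):
    """Decide whether an open pg_dist_node transaction should be rolled back.

    Hypothetical helper; times are plain numbers of seconds.
    """
    too_old = (now - started_at) > threshold_seconds
    return bool(too_old and other_workers_waiting)


decision = should_rollback(
    started_at=0.0, now=15.0, threshold_seconds=10.0, other_workers_waiting=True
)
```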
2023-01-24 16:14:58 +01:00