This commit is a breaking change:
1. `role` in DCS is written as "primary" instead of "master".
2. `role` in REST API responses is also written as "primary".
3. REST API no longer accepts role=master in requests (for example switchover/failover/restart endpoints).
4. `/metrics` REST API endpoint will no longer report `patroni_master`.
5. `patronictl` no longer accepts `--master` argument.
6. `no_master` option in declarative configuration of custom replica creation methods is no longer treated as a special option; use `no_leader` instead.
7. `patroni_wale_restore` doesn't accept `--no_master` anymore.
8. `patroni_barman` doesn't accept `--role=master` anymore.
9. Callback scripts will be executed with role=primary instead of role=master.
10. On Kubernetes, Patroni will by default set the role label to `primary`. If you want to keep the old behavior and avoid downtime or a lengthy, complex migration, you can set `kubernetes.leader_label_value` and `kubernetes.standby_leader_label_value` to `master`.
However, a few exceptions regarding master are still in place:
1. `GET /master` REST API endpoint will continue to work.
2. `master_start_timeout` and `master_stop_timeout` in global configuration are still accepted.
3. `master` tag is still preserved in Consul services in addition to `primary`.
Rationale for these exceptions: a DBA doesn't always fully control the infrastructure and can't always adjust the configuration.
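For callback scripts that have to run on both sides of this change, a minimal sketch of normalizing the role could look like the following. It assumes the usual callback invocation with three arguments (action, role, cluster name); `LEGACY_ROLES`, `normalize_role` and `main` are hypothetical names used only for illustration, not part of Patroni.

```python
import sys

# Legacy spelling mapped to the new one; any other role passes through unchanged.
LEGACY_ROLES = {"master": "primary"}


def normalize_role(role):
    """Return "primary" for the legacy "master", otherwise the role as given."""
    return LEGACY_ROLES.get(role, role)


def main(argv):
    """Handle one callback invocation: argv = [script, action, role, cluster]."""
    if len(argv) != 4:
        print(f"usage: {argv[0]} ACTION ROLE CLUSTER", file=sys.stderr)
        return 2
    action, role, cluster = argv[1], normalize_role(argv[2]), argv[3]
    if role == "primary":
        print(f"{cluster}: {action} on the primary")
    else:
        print(f"{cluster}: {action} on a {role}")
    return 0
```

A real script would end with `sys.exit(main(sys.argv))`; it is left out here so the functions can be imported and tested.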
When running on K8s, Patroni communicates with the API via the `kubernetes` service, whose address is exposed via the
`KUBERNETES_SERVICE_HOST` environment variable. Like any other service, the `kubernetes` service is handled by `kube-proxy`, which, depending on configuration, relies on either a userspace program or `iptables` for traffic routing.
During a K8s upgrade, when master nodes are replaced, it is possible that `kube-proxy` doesn't update the service configuration in time, and as a result Patroni fails to update the leader lock and demotes postgres.
In order to improve the user experience and get more control over the problem, we make it possible to bypass the `kubernetes` service and connect directly to API nodes.
The strategy is very simple:
1. Resolve the list of API node IPs from the `kubernetes` endpoint on every iteration of the HA loop.
2. Stick to one of these IPs for API requests.
3. Switch to a different IP if the IP currently in use is no longer in the list.
4. If a request fails, switch to another IP and retry.
Such a strategy is already used for Etcd and has proven to work quite well.
In order to enable the feature, set either `kubernetes.bypass_api_service` in the Patroni configuration file or the `PATRONI_KUBERNETES_BYPASS_API_SERVICE` environment variable to `true`.
If for some reason `GET /default/endpoints/kubernetes` isn't allowed, Patroni will disable the feature.
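The four steps above can be sketched roughly as follows. This is an illustration under assumptions, not Patroni's actual implementation: `ApiAddressPicker`, `refresh`, `switch` and `request_with_retry` are hypothetical names, and `send` stands in for whatever performs the API request.

```python
import random


class ApiAddressPicker:
    """Hypothetical helper that keeps track of the API node IP in use."""

    def __init__(self):
        self.addresses = []
        self.current = None

    def refresh(self, resolved):
        """Steps 1-3: store the resolved IPs and keep the current one if still listed."""
        self.addresses = list(resolved)
        if self.current not in self.addresses:
            self.current = random.choice(self.addresses) if self.addresses else None

    def switch(self, exclude=()):
        """Step 4: move to an IP that is neither current nor excluded; False if none."""
        candidates = [ip for ip in self.addresses
                      if ip != self.current and ip not in exclude]
        if not candidates:
            return False
        self.current = random.choice(candidates)
        return True


def request_with_retry(picker, send):
    """Call send(ip), switching to an untried IP after each ConnectionError."""
    tried = set()
    while True:
        ip = picker.current
        if ip is None:
            raise ConnectionError("no API node addresses known")
        tried.add(ip)
        try:
            return send(ip)
        except ConnectionError:
            # Re-raise the last error once every known IP has been tried.
            if not picker.switch(exclude=tried):
                raise
```

In this sketch `refresh` would be called once per HA loop iteration with the freshly resolved list, and every API call would go through `request_with_retry`.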
Readiness probes could be useful to eliminate "unhealthy" pods from the subsets addresses when K8s services with label selectors are used.
Real-life example: the node where the primary was running has failed and is being shut down, and Patroni can't update (remove) the role label.
Therefore on OpenShift the leader service will have two pods assigned, one of which is the failed primary.
With the readiness probe defined, the failed primary pod will be excluded from the list.
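The effect can be illustrated with a deliberately simplified model of how a service with a label selector picks its pods. This is only a sketch: the `Pod` class, the `service_addresses` function, the pod names and the `role` label key are assumptions made for the example, and real endpoint selection involves more conditions.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    labels: dict = field(default_factory=dict)
    ready: bool = True  # result of the readiness probe


def service_addresses(pods, selector):
    """Return names of pods that match every selector label and are ready."""
    return [
        p.name for p in pods
        if p.ready and all(p.labels.get(k) == v for k, v in selector.items())
    ]


pods = [
    Pod("demo-0", {"role": "primary"}, ready=False),  # failed node, stale label
    Pod("demo-1", {"role": "primary"}, ready=True),   # newly promoted primary
    Pod("demo-2", {"role": "replica"}, ready=True),
]
leader_pods = service_addresses(pods, {"role": "primary"})
```

Without the `ready` check both `demo-0` and `demo-1` would match the selector, which is the situation described above.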
OpenShift enforces securityContext.fsGroups for block devices and sets group stickybits for volumeMounts.
This leads to Patroni pods failing to start after the first restart:
> 2020-01-13 14:46:13.695 UTC [143] FATAL: data directory "/home/postgres/pgdata/pgroot/data" has invalid permissions
> 2020-01-13 14:46:13.695 UTC [143] DETAIL: Permissions should be u=rwx (0700) or u=rwx,g=rx (0750).
An initContainer which fixes the OpenShift tampering solves the issue. I stole the solution from the stable postgres helm chart:
https://github.com/helm/charts/pull/14540/files
Tested on OpenShift v3.11
Note: This error does not occur when using shared filesystems (like NFS)
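As an illustration of the kind of check and reset involved, here is a minimal sketch based only on the modes named in the log message above. The function names are hypothetical, and the actual initContainer from the linked PR may do this differently (for example with a shell `chmod`).

```python
import os
import stat

# Modes accepted for the data directory according to the error message above.
VALID_MODES = (0o700, 0o750)


def has_valid_permissions(path):
    """Return True if the directory mode is exactly one of the accepted modes."""
    return stat.S_IMODE(os.stat(path).st_mode) in VALID_MODES


def reset_data_dir_mode(path, mode=0o700):
    """Set the directory mode exactly, dropping group write and setgid/sticky bits."""
    os.chmod(path, mode)
    return stat.S_IMODE(os.stat(path).st_mode)
```

The sketch assumes the process owns the directory; otherwise `os.chmod` raises `PermissionError`.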
- Update postgres docker image to the latest 11 version.
- Remove empty lines inside the `RUN` command to make the Dockerfile compatible with future docker versions.
- Set the `PATRONI_KUBERNETES_POD_IP` environment variable, which is required when _use_endpoints_ is enabled. Otherwise, a `KeyError` is raised [here](https://github.com/zalando/patroni/blob/master/patroni/dcs/kubernetes.py#L95).
- Set `EDITOR` environment variable to make configuration changes via `patronictl edit-config`.
- It modifies the Dockerfile and entrypoint slightly to allow for OpenShift SCCs to operate correctly
- It adds 2 template examples that can be easily modified by changing parameters
Fixes #572