280 Commits

Author SHA1 Message Date
Alexander Kukushkin
48fbf64ea9 Release v3.3.0 (#3043)
* Make sure tests are not making external calls
and pass url with scheme to urllib3 to avoid warnings

* Make sure unit tests not rely on filesystem state

* Bump pyright and "solve" reported "issues"

Most of them are related to partially unknown types of values from empty
dict or list. To solve it for the empty dict we use `EMPTY_DICT` object of
newly introduced `_FrozenDict` class.

* Improve unit-tests code coverage

* Add release notes for 3.3.0

* Bump version

* Fix pyinstaller spec file

* python 3.6 compatibility
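
The `EMPTY_DICT` idea from the pyright item above can be sketched roughly as follows. This is a minimal illustration that assumes `_FrozenDict` is a read-only mapping; the class body and the `get_tags` helper are hypothetical and not copied from Patroni.

```python
from collections.abc import Mapping
from typing import Any, Dict, Iterator


class _FrozenDict(Mapping):
    """Read-only mapping (illustrative only, not Patroni's actual code)."""

    def __init__(self, *args: Any, **kwargs: Any) -> None:
        self._values: Dict[str, Any] = dict(*args, **kwargs)

    def __getitem__(self, key: str) -> Any:
        return self._values[key]

    def __iter__(self) -> Iterator[str]:
        return iter(self._values)

    def __len__(self) -> int:
        return len(self._values)


# One shared immutable default instead of a fresh, untyped `{}` literal.
EMPTY_DICT = _FrozenDict()


def get_tags(config: Mapping) -> Mapping:
    # Hypothetical helper: the fallback is safe to share because
    # callers cannot mutate it.
    return config.get("tags", EMPTY_DICT)
```

Because `Mapping` defines no `__setitem__`, an accidental `EMPTY_DICT["key"] = value` raises `TypeError` instead of silently changing the shared default.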

---------

Co-authored-by: Polina Bungina <27892524+hughcapet@users.noreply.github.com>
2024-04-04 17:51:26 +02:00
Alexander Kukushkin
d7454f7bcd Use target_session_attrs only when multiple hosts in standby_cluster (#3040)
Actually, the comment in the code already said so, but in practice it didn't happen.

It should help #3039
2024-04-02 11:59:57 +02:00
Grigory Smolkin
b09af642e6 Disable WAL streaming on standby node via new boolean tag "nostream" (#2842)
Add support for ``nostream`` tag. If set to ``true`` the node will not use replication protocol to stream WAL. It will rely instead on archive recovery (if ``restore_command`` is configured) and ``pg_wal``/``pg_xlog`` polling. It also disables copying and synchronization of permanent logical replication slots on the node itself and all its cascading replicas. Setting this tag on primary node has no effect.
2024-03-20 10:10:53 +01:00
Israel
014777b20a Refactor Barman scripts and add a sub-command to switch Barman config (#3016)
We currently have a script named `patroni_barman_recover` in Patroni, which is intended to be used as a custom bootstrap method, or as a custom replica creation method.

Now there is a need for one more Barman-related script in Patroni, to handle switching of config models in Barman upon `on_role_change` events.

However, instead of creating another Patroni script, let's say `patroni_barman_config_switch`, and duplicating a lot of logic in the code, we decided to refactor the code so:

* Instead of two separate scripts (`patroni_barman_recover` and `patroni_barman_config_switch`), we have a single script (`patroni_barman`) with 2 sub-commands (`recover` and `config-switch`)

This is the overview of changes that have been performed:

* File `patroni.scripts.barman_recover` has been removed, and its logic has been split into a few files:
  * `patroni.scripts.barman.cli`: handles the entrypoint of the new `patroni_barman` command, exposing the argument parser and calling the appropriate functions depending on the sub-command
  * `patroni.scripts.barman.utils`: implements utility enums, functions and classes which can be used by `cli` and by the sub-command implementations:
    * retry mechanism
    * logging set up
    * communication with pg-backup-api
  * `patroni.scripts.barman.recover`: implements the `recover` sub-command only
* File `patroni.tests.test_barman_recover` has been renamed as `patroni.tests.test_barman`
* File `patroni.scripts.barman.config_switch` was created to implement the `config-switch` sub-command only
* `setup.py` has been changed so it generates a `patroni_barman` application instead of `patroni_barman_recover`
* Docs and unit tests were updated accordingly
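
The single-script layout described above can be sketched with `argparse` sub-parsers. This is only an outline: the handler functions are placeholders, and the real `patroni.scripts.barman.cli` takes arguments that are not shown here.

```python
import argparse
from typing import List, Optional


def run_recover(args: argparse.Namespace) -> int:
    # Placeholder for the `recover` sub-command.
    return 0


def run_config_switch(args: argparse.Namespace) -> int:
    # Placeholder for the `config-switch` sub-command.
    return 0


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="patroni_barman")
    subparsers = parser.add_subparsers(dest="command", required=True)
    # Each sub-command registers its own handler function.
    subparsers.add_parser("recover").set_defaults(func=run_recover)
    subparsers.add_parser("config-switch").set_defaults(func=run_config_switch)
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    args = build_parser().parse_args(argv)
    # Dispatch to whichever handler the chosen sub-command registered.
    return args.func(args)
```

Shared concerns such as retries and logging can then live next to `build_parser` and be reused by both handlers, which is the duplication the refactoring set out to avoid.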

References: PAT-154.
2024-03-20 09:04:55 +01:00
Alexander Kukushkin
688c85389c Release v3.2.2 (#3007)
- update release notes
- bump Patroni version
- bump pyright version and fix reported issues
- improve compatibility with legacy psycopg2

Co-authored-by: Polina Bungina <bungina@gmail.com>
2024-01-17 08:31:08 +01:00
علی سالمی
5c4ee30dae Add JSON log format to logging configuration (#2982)
Now Patroni can be configured as below to log in JSON format.

```yaml
log:
  type: json
  format:
    - asctime: '@timestamp'
    - levelname: level
    - message
    - module
    - name: logger_name
  static_fields:
    app: patroni
```

This config produces the following log:

```json
{
  "@timestamp": "2023-12-14 19:51:24,872",
  "level": "INFO",
  "message": "Lock owner: None; I am postgresql1",
  "module": "ha",
  "app": "patroni",
  "logger_name": "patroni.ha"
}
```
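
The mapping between the `format` list and the resulting keys can be sketched with a small `logging.Formatter` subclass. `JsonFormatter` here is a hypothetical class written for illustration; it is not Patroni's implementation, and it assumes each `format` entry is either a plain attribute name or a one-key rename mapping, as in the YAML above.

```python
import json
import logging
from typing import Any, Dict, List, Union

FieldSpec = Union[str, Dict[str, str]]


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter mirroring the `format` and `static_fields` options."""

    def __init__(self, fields: List[FieldSpec], static_fields: Dict[str, Any]) -> None:
        super().__init__()
        # Maps a LogRecord attribute name to the key used in the JSON output.
        self._renames: Dict[str, str] = {}
        for field in fields:
            if isinstance(field, str):
                self._renames[field] = field
            else:
                self._renames.update(field)
        self._static_fields = dict(static_fields)

    def format(self, record: logging.LogRecord) -> str:
        # `message` and `asctime` are computed lazily by the logging module.
        record.message = record.getMessage()
        record.asctime = self.formatTime(record)
        entry = {out: getattr(record, attr) for attr, out in self._renames.items()}
        entry.update(self._static_fields)
        return json.dumps(entry)


formatter = JsonFormatter(
    [{"asctime": "@timestamp"}, {"levelname": "level"}, "message", "module", {"name": "logger_name"}],
    {"app": "patroni"},
)
record = logging.LogRecord(
    "patroni.ha", logging.INFO, "ha.py", 1,
    "Lock owner: %s; I am %s", (None, "postgresql1"), None,
)
print(formatter.format(record))
```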
2024-01-16 10:42:48 +01:00
Alexander Kukushkin
6976939f09 Release/v3.2.1 (#2968)
- bump version
- bump pyright
- update release notes
2023-11-30 16:50:42 +01:00
Israel
269b04be5d Add a contrib script for remote Barman recovery (#2931)
A contrib script, which can be used as a custom bootstrap method, or as a custom create replica method.

The script communicates with the pg-backup-api on the Barman node so Patroni is able to restore a Barman backup remotely.

The `--help` option of the script, along with the script docstring, should provide some context on how to fill in its parameters.

Patroni docs were updated accordingly to share examples about how to configure the script as a custom bootstrap method, or as a custom create replica method.

References: PAT-216.
2023-11-06 16:25:27 +01:00
Israel
d72f7cb259 Add a FAQ page to the docs (#2933)
This commit introduces a FAQ page to the docs. The idea is to get
most frequently asked questions answered before-hand, so the user
is able to get them answered quickly without going into detail in
the docs or having to go to Slack/GitHub to clarify questions.

---------
Signed-off-by: Israel Barth Rubio <israel.barth@enterprisedb.com>
2023-11-01 14:02:04 +01:00
Aras Mumcuyan
c3dce46830 Add ability to pass auth_data to zk client (#2932) 2023-10-30 11:46:36 +01:00
Alexander Kukushkin
ce10e5fccc Release v3.2.0 (#2930)
- bump version
- bump pyright and apply fixes
- update release notes
2023-10-25 16:13:30 +02:00
Israel
bb90feb393 Add support for additional parameters on custom bootstrap (#2927)
Previous to this commit, if a user would ever like to add parameters to the custom bootstrap script call, they would need to configure Patroni like this:

```
bootstrap:
  method: custom_method_name
  custom_method_name:
    command: /path/to/my/custom_script --arg1=value1 --arg2=value2 ...
```

This commit extends that so we achieve a similar behavior that is seen when using `create_replica_methods`, i.e., we also allow the following syntax:

```
bootstrap:
  method: custom_method_name
  custom_method_name:
    command: /path/to/my/custom_script
    arg1: value1
    arg2: value2
```

All keys in the mapping which are not recognized by Patroni will be treated as additional named arguments to be passed down to the `command` call.
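
A rough sketch of that translation, assuming unrecognized keys become `--key=value` arguments as in the first form shown above. The function name and the `RECOGNIZED_KEYS` set are hypothetical; only `command` is confirmed by this commit message.

```python
import shlex
from typing import Any, Dict, List

# Assumption: keys Patroni consumes itself. Only `command` is confirmed here.
RECOGNIZED_KEYS = {"command"}


def build_bootstrap_command(method_config: Dict[str, Any]) -> List[str]:
    # Start from the configured command, split like a shell would split it.
    argv = shlex.split(method_config["command"])
    for key, value in method_config.items():
        if key not in RECOGNIZED_KEYS:
            # Every unrecognized key is passed down as a named argument.
            argv.append("--{0}={1}".format(key, value))
    return argv
```

With the second YAML form above, this yields `/path/to/my/custom_script --arg1=value1 --arg2=value2`, the same call as the first form.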

References: PAT-218.
2023-10-25 15:01:08 +02:00
Polina Bungina
6c06f5cc96 Add initial docs for patroni --validate/generate config (#2929)
For now it will sit in the section about the Patroni configuration. We can later move it to (or reference from) a new section where all the functionality of the `patroni` executable will be described.
2023-10-25 14:20:17 +02:00
Mark Pekala
f5ee67fa1c Feature: failover priority (#2780)
The priority is configured with the `failover_priority` tag. Possible values range from `0` to infinity, where `0` means that the node will never become the leader, which is the same as setting the `nofailover` tag to `true`. As a result, only one of the `failover_priority` or `nofailover` tags should be set in the configuration file.

The failover priority kicks in only when more than one node has the same receive/replay LSN and these nodes are ahead of the other nodes in the cluster. In this case the node with the higher value of `failover_priority` is preferred. If there is a node with a higher receive/replay LSN, it will become the new leader even if it has a lower value of `failover_priority` (except when the priority is set to 0).
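
The ordering described above can be sketched as a comparison on `(LSN, priority)`. This is a simplification: in Patroni each node decides for itself during the leader race, whereas the hypothetical `pick_leader` below looks at all candidates at once. The default priority of `1` is an assumption.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    name: str
    lsn: int
    failover_priority: int = 1  # assumed default


def pick_leader(candidates: List[Candidate]) -> Optional[Candidate]:
    # Priority 0 behaves like `nofailover: true`: never eligible.
    eligible = [c for c in candidates if c.failover_priority > 0]
    if not eligible:
        return None
    # LSN is compared first; priority only breaks ties between equal LSNs.
    return max(eligible, key=lambda c: (c.lsn, c.failover_priority))
```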

Close https://github.com/zalando/patroni/issues/2759
2023-10-24 12:22:48 +02:00
Israel
65030c56ee Add capability of specifying namespace through --dcs argument (#2926)
This commit changes the `patronictl` application in such a way its
`--dcs` argument is now able to receive a namespace.

Previous to this commit this was the format of that argument's value:
`DCS://HOST:PORT`.

From now on it accepts this format: `DCS://HOST:PORT/NAMESPACE`. As all
previous parts of the argument value, `NAMESPACE` is optional, and if
not given `patronictl` will fall back to the value from the configuration
file, if any, or to `service`.

This change is specifically useful when you are running a cluster in a
custom namespace, and from a machine where you don't have a configuration
file for Patroni or `patronictl`. It saves you from having to create a
configuration file containing only the `namespace` field in that case.
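
As an illustration only, the argument format could be parsed along these lines. `parse_dcs` is a hypothetical helper built on `urllib.parse`, not the parser used by `patronictl`, and it skips the intermediate fallback to the configuration file.

```python
from typing import Any, Dict
from urllib.parse import urlparse


def parse_dcs(value: str, default_namespace: str = "service") -> Dict[str, Any]:
    # Without a scheme, prefix "//" so urlparse still treats HOST:PORT as netloc.
    if "://" not in value:
        value = "//" + value
    parts = urlparse(value)
    return {
        "dcs": parts.scheme or None,
        "host": parts.hostname,
        "port": parts.port,
        # An empty path means no namespace was given on the command line.
        "namespace": parts.path.strip("/") or default_namespace,
    }
```

For example, `parse_dcs("etcd://10.0.0.1:2379/custom")` picks up `custom` as the namespace, while `parse_dcs("etcd://10.0.0.1:2379")` falls back to `service`.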

Issue reported by: Shaun Thomas <shaun@bonesmoses.org>

Signed-off-by: Israel Barth Rubio <israel.barth@enterprisedb.com>
2023-10-24 12:09:44 +02:00
GuanqunYang193
ce187bec38 Remove user creation related docs (#2920)
* Remove user creation related docs
* remove template
2023-10-23 08:29:09 +02:00
Alexander Kukushkin
c5fffb3c97 Further work on permanent physical slots (#2891)
- Fixed issues with the has_permanent_slots() method. It didn't take into account the case of permanent physical slots for members, falsely concluding that there are no permanent slots.
- Write to the status key only LSNs for permanent slots (not just for slots that exist on the primary).
  - Include pg_current_wal_flush_lsn() to slots feedback, so that slots on standby nodes could be advanced
- Improved behave tests:
  - Verify that permanent slots are properly created on standby nodes
  - Verify that permanent slots are properly advanced, including DCS failsafe mode
  - Verify that only permanent slots are written to the `/status`
2023-10-23 08:24:28 +02:00
zhjwpku
cb5f34b721 add some guide to run tests in different scopes (#2921)
Introduce ways to run tests in different scopes which should be helpful for beginners.
2023-10-23 08:17:53 +02:00
Alexander Kukushkin
fc67ba73f0 Allow to specify psycopg* in extras and switch to build (#2907)
* remove check_psycopg() call from the setup.py, when installing from wheel it doesn't work anyway.
* call check_psycopg() before process_arguments(), because the latter tries to import psycopg and fails with a stack trace, while the former shows a nice human-readable error message.
* add psycopg2, psycopg2-binary, and psycopg3 extras, that will install psycopg2>=2.5.4, psycopg2-binary, or psycopg[binary]>=3.0.0 modules respectively.
* move check_psycopg() function to the __main__.py.
* introduce the new extra called `all`, it will allow to install all dependencies at once (except psycopg related).
* use the `build` module in order to create sdist bdist_wheel packages.
* update the documentation regarding psycopg and extras (dependencies).
2023-10-17 14:46:15 +02:00
Alexander Kukushkin
d93db20baa Set citus.local_hostname (#2903)
There are cases when Citus wants to have a connection to the local postgres. By default it uses `localhost` for that, which is not always available. To solve it we will set the `citus.local_hostname` GUC to a custom value, which is the same as the one Patroni uses to connect to Postgres.
2023-10-16 10:21:50 +02:00
Chris Bandy
588df5da05 Refine the documentation about custom_conf (#2901)
Some backticks in this section needed to be balanced.
2023-10-11 08:41:11 +02:00
Alexander Kukushkin
9283ebda64 Enforce loop_wait/retry_timeout/ttl rule (#2869)
* hard-code minimal possible values
* make adjustments if values are lower or if the rule is violated and show warnings
* update documentation
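
The rule in question is the one from the Patroni documentation, `loop_wait + 2 * retry_timeout <= ttl`. A minimal checker could look like the sketch below. The lower bounds are assumed values for illustration, and unlike the commit itself this hypothetical `check_timing` only reports problems and does not adjust the values.

```python
from typing import List

# Assumed lower bounds for illustration; consult the Patroni docs for the real ones.
MIN_LOOP_WAIT, MIN_RETRY_TIMEOUT, MIN_TTL = 1, 3, 20


def check_timing(loop_wait: int, retry_timeout: int, ttl: int) -> List[str]:
    warnings = []
    if loop_wait < MIN_LOOP_WAIT:
        warnings.append("loop_wait is below the minimum of %d" % MIN_LOOP_WAIT)
    if retry_timeout < MIN_RETRY_TIMEOUT:
        warnings.append("retry_timeout is below the minimum of %d" % MIN_RETRY_TIMEOUT)
    if ttl < MIN_TTL:
        warnings.append("ttl is below the minimum of %d" % MIN_TTL)
    # The leader must be able to finish a loop plus two retries before its key expires.
    if loop_wait + 2 * retry_timeout > ttl:
        warnings.append("loop_wait + 2 * retry_timeout exceeds ttl")
    return warnings
```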
2023-10-04 11:44:57 +02:00
Israel
a329a9d320 Add a documentation page for patronictl (#2874)
This PR introduces a documentation page for `patronictl` application.

We adopted a top-down approach when writing this document. We start by describing the outer most parts, and then keep writing new sections that specialize the knowledge.

We basically added a section called `patronictl` to the left menu. Inside that section we created a page with this structure:

- `patronictl`: describes what it is
    - `Configuration`: how to configure `patronictl`
    - `Usage`: how to use the CLI. Inside this section, there are subsections for each of the subcommands exposed by `patronictl`, and each of them is described using the following subsubsections:
        - `Synopsis`: syntax of the command and its positional and optional arguments
        - `Description`: a description of what the command does
        - `Parameters`: a detailed description of the arguments and how to use them
        - `Examples`: one or more examples of execution of the command

References: PAT-200.
2023-10-04 11:43:38 +02:00
Polina Bungina
27915984b4 Add contrib requirement for tests, small docs refactoring (#2887) 2023-09-27 12:19:58 +02:00
Alexander Kukushkin
a3b3e1bc1c Release v3.1.2 (#2885)
- bump version
- update release notes
2023-09-26 12:30:27 +02:00
Alexander Kukushkin
bc15813de0 Permanent physical slots on standby nodes (#2852)
Create permanent physical replication slots on standby nodes and use `pg_replication_slot_advance()` function to move them forward.

The `restart_lsn` is advanced based on values stored in the `/status` key by the primary node.

When a slot is created on a replica it could be ahead of the same slot on the primary, and therefore there is some period of time when it doesn't protect WAL files from being recycled.
2023-09-20 16:50:37 +02:00
Alexander Kukushkin
18d9cb1124 Stick with sphinx_rtd_theme (#2873)
by default they are using something else
2023-09-20 14:59:15 +02:00
Alexander Kukushkin
66bdb1ae12 Release v3.1.1 (#2872)
* Bump version
* Update release notes
* Update contributing guidelines and tox.ini (include v16)
* Enable tests for `REL*` branches
2023-09-20 12:00:18 +02:00
Alexander Kukushkin
75dbe4ff96 Update supported Postgres versions (#2857) 2023-09-14 19:36:26 +02:00
Polina Bungina
b31a4d55c9 Ensure strict failover/switchover definition difference (#2784)
- Don't set leader in failover key from patronictl failover
- Show warning and execute switchover if leader option is provided for patronictl failover command
- Be more precise in the log messages
- Allow to failover to an async candidate in sync mode
- Check if candidate is the same as the leader specified in api
- Fix and extend some tests
- Add documentation
2023-09-12 08:51:17 +02:00
Israel
3c24c33e59 Document how to change Postgres settings that touch shared memory (#2843)
Some special handling is required when changing either of these settings in a Postgres cluster that has standby nodes:

* `max_connections`
* `max_prepared_transactions`
* `max_locks_per_transaction`
* `max_wal_senders`
* `max_worker_processes`

If one attempts to decrease `max_connections` dynamic setting and restart all nodes at the same time (primary and standbys), Patroni will refuse to apply the new value on the standbys and require the user to restart it again later, once replication catches up.

That behavior is correct, but it is not documented.

This commit adds information to documentation about that behavior and why it's required.

References: PAT-166.
2023-09-11 19:25:01 +02:00
Matt Baker
83a060fc15 Extend documentation with package installation and upgrade process (#2854) 2023-09-11 19:02:13 +02:00
Israel
a2ceff1517 Generate documentation of private members through sphinx docs (#2831)
* Generate documentation of private members through sphinx docs

With this commit we make sphinx build API docs for the following
things, which were missing up to this point:

* `__init__` method of classes;
* "private" members (properties, functions, methods, attributes, etc.,
  which name starts with an underscore);
* members that are missing a docstring, so we can still reference
them with links in the documentation.

The third point can be removed later, if we wish, when we reach a
point where everything has proper docstrings in the Patroni code base.

* Fix documentation problems found after enabling private methods in sphinx

* `:cvar:` is not a valid domain role. Replaced with `:attr:`.
* documentation for `consul.base.Consul.__init__` has a single backtick
quoted string which is interpreted as a reference which cannot be found.
Therefore, the docstring has been copied as a block quote.
* various list spacing problems and indentation problems.
* code blocks added where indentation is interpreted incorrectly
* literal string quoting issues.

---------

Signed-off-by: Israel Barth Rubio <israel.barth@enterprisedb.com>
Co-authored-by: Matt Baker <matt.baker@enterprisedb.com>
2023-09-11 15:41:34 +02:00
SK
80a03a4892 Enrich some endpoints with the scope and name (#2846)
- monitoring endpoints - added `name` to the `patroni`, next to the `scope` and `version`
- metrics endpoint - added name to labels
2023-09-05 07:24:17 +02:00
Alexander Kukushkin
6b7f914da7 Fix bug with kubernetes.standby_leader_label_value (#2832)
When running with the leader lock Patroni was just setting the `role` label to `master` and effectively `kubernetes.standby_leader_label_value` feature never worked.

Now it is fixed, but in order not to introduce breaking changes we just set the default value of `standby_leader_label_value` to `master`.
2023-09-04 10:03:37 +02:00
Polina Bungina
7319d12026 Remove accidentally added .DS_Store (#2826)
And extend .gitignore
2023-08-21 07:50:45 +02:00
Polina Bungina
2ec9834c60 Update api examples (#2824)
* Add failsafe_mode_is_active to /patroni and /metrics
* Add patroni_primary to /metrics
* Add examples showing that failsafe_mode_is_active and cluster_unlocked
  are only shown for /patroni when the value is "true"
* Update /patroni and /config examples
2023-08-18 16:13:13 +02:00
Alexander Kukushkin
93be10a655 Remove Python 2 install instructions from docs/README (#2822)
docs/README.rst mainly duplicates README.rst and also should be changed. Besides that remove test/coverage badges.

followup on #2821
2023-08-17 16:17:34 +02:00
Israel
4138d0b830 Add docstrings to patroni.config (#2708)
Besides adding docstrings to `patroni.config`, a few side changes
have been applied:

* Reference `config_file` property instead of internal attribute
`_config_file` in method `_load_config_file`;
* Have `_AUTH_ALLOWED_PARAMETERS[:2]` as default value of `params`
argument in method `_get_auth` instead of using
`params or _AUTH_ALLOWED_PARAMETERS[:2]` in the body;
* Use `len(PATRONI_ENV_PREFIX)` instead of a hard-coded `8` when
removing the prefix from environment variable names;
* Fix documentation of `wal_log_hints` setting. The previous docs
mentioned it was a dynamic setting that could be changed. However
it is managed by Patroni, which forces `on` value.
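
The prefix-length change from the list above, as a small standalone sketch. `PATRONI_ENV_PREFIX` is the constant named in this commit; the `patroni_env` helper is hypothetical.

```python
import os
from typing import Dict, Mapping

PATRONI_ENV_PREFIX = "PATRONI_"


def patroni_env(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    # Slice by the prefix length instead of a hard-coded 8, so the code
    # stays correct if the prefix ever changes.
    return {
        name[len(PATRONI_ENV_PREFIX):]: value
        for name, value in environ.items()
        if name.startswith(PATRONI_ENV_PREFIX)
    }
```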

References: PAT-123.
2023-08-17 11:19:49 +02:00
Matt Baker
b7ea511511 Generate API docs from code with sphinx autodoc (#2699)
Expanding on the addition of docstrings in code, this adds python module API docs to sphinx documentation.

A developer can preview what this might look like by running this locally:

```
tox -m docs
```

The option `-W` is added to the tox env so that warning messages are considered errors.

Adds doc generation using the above method to the test GitHub workflow to catch documentation problems on PRs.

Some docstrings have been reformatted and fixed to satisfy errors generated with the above setup.
2023-08-17 10:27:33 +02:00
Matt Baker
82d2ef4878 Make docs more clear on changes to the bootstrap.dcs section of YAML config (#2811)
It seems that a common pitfall for new users of Patroni is that the `bootstrap.dcs` section is only used to initialize the configuration in DCS. This moves the comment about this to an info block so it is more visible to the reader.
2023-08-11 10:31:31 +02:00
Alexander Kukushkin
84aac437c1 Release v3.1.0 (#2801)
- bump pyright and resolve reported issues
- bump Patroni version
- update release notes
2023-08-03 13:02:29 +02:00
Israel
48e3d31e1d Refactor docs about migration to Patroni (#2796)
This PR is an attempt at refactoring the docs about migration to Patroni.

These are a few enhancements that we propose through this PR:

* Docs used to mention the procedure can only be performed in a single-node cluster. We changed that so the procedure considers a cluster composed of primary and standbys;
* Teach how to deal with pre-existing replication slots;
* Explain how to create the user for `pg_rewind`, if user intends to enable `use_pg_rewind`.

References: PAT-143.
2023-08-03 09:01:16 +02:00
Israel
018a2f4dd9 Enhance docs of slots dynamic configuration (#2797)
The docs of `slots` configuration used to have this mention:

```
my_slot_name: the name of replication slot. If the permanent slot name
matches with the name of the current primary it will not be created.
Everything else is the responsibility of the operator to make sure that
there are no clashes in names between replication slots automatically
created by Patroni for members and permanent replication slots.
```

However that is not true in the sense that Patroni does not check for
clashes between `my_slot_name` and the name of replication slots created
for replicating changes among members. If you specify a slot name that
clashes with the name of a replication slot used by a member, it turns
out Patroni will make the slot permanent in the primary even if the member
key expires from the DCS.

Through this commit we also enhance the docs in terms of explaining that
physical permanent slots are maintained only in the primary, while logical
replication slots are copied from primary to standbys.

Signed-off-by: Israel Barth Rubio <israel.barth@enterprisedb.com>
2023-08-01 15:40:07 +02:00
Waynerv
0e19e3e98e Make pod role label configurable (#2659)
Close #2495
2023-07-25 10:29:04 +02:00
Alexander Kukushkin
06db296612 Fixes in patroni.request (#2768)
1. Take client certificates only from the `ctl` section. Motivation: sometimes there are server-only certificates that can't be used as client certificates. As a result neither Patroni nor patronictl works correctly even if the `--insecure` option is used.
2. Document that if `restapi.verify_client` is set to `required` then client certificates **must** be provided in the `ctl` section.
3. Add support for `ctl.authentication` and prefer to use it over `restapi.authentication`.
4. Silence the annoying InsecureRequestWarning when `patronictl -k` is used, so that the behavior becomes similar to `curl -k`.
2023-07-25 08:48:18 +02:00
Alexander Kukushkin
a4d29eb99e Release v3.0.4 (#2754)
- update release notes
- bump version
- bump pyright version
2023-07-13 11:51:38 +02:00
Alexander Kukushkin
d46ca88e6b Make it visible replication state on standbys (#2733)
To do that we use `pg_stat_get_wal_receiver()` function, which is available since 9.6. For older versions the `patronictl list` output and REST API responses remain as before.

If there is no WAL receiver process, we check whether `restore_command` is set and show the state as `in archive recovery`.
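
The decision described above can be sketched as follows. `replication_state` is a hypothetical helper; it assumes the WAL receiver status has already been fetched (for example from `pg_stat_get_wal_receiver()`), with `None` meaning there is no WAL receiver process, and it assumes only the `streaming` status is reported as is.

```python
from typing import Optional


def replication_state(wal_receiver_status: Optional[str],
                      restore_command: Optional[str]) -> Optional[str]:
    if wal_receiver_status == "streaming":
        return "streaming"
    # No WAL receiver process: the standby can only be replaying from the archive.
    if wal_receiver_status is None and restore_command:
        return "in archive recovery"
    # Anything else is left unreported in this sketch.
    return None
```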

Example of `patronictl list` output:
```bash
$ patronictl list
+ Cluster: batman -------------+---------+---------------------+----+-----------+
| Member      | Host           | Role    | State               | TL | Lag in MB |
+-------------+----------------+---------+---------------------+----+-----------+
| postgresql0 | 127.0.0.1:5432 | Leader  | running             | 12 |           |
| postgresql1 | 127.0.0.1:5433 | Replica | in archive recovery | 12 |         0 |
+-------------+----------------+---------+---------------------+----+-----------+

$ patronictl list
+ Cluster: batman -------------+---------+-----------+----+-----------+
| Member      | Host           | Role    | State     | TL | Lag in MB |
+-------------+----------------+---------+-----------+----+-----------+
| postgresql0 | 127.0.0.1:5432 | Leader  | running   | 12 |           |
| postgresql1 | 127.0.0.1:5433 | Replica | streaming | 12 |         0 |
+-------------+----------------+---------+-----------+----+-----------+
```

Example of REST API response:
```bash
$ curl -s localhost:8009 | jq .
{
  "state": "running",
  "postmaster_start_time": "2023-07-06 13:12:00.595118+02:00",
  "role": "replica",
  "server_version": 150003,
  "xlog": {
    "received_location": 335544480,
    "replayed_location": 335544480,
    "replayed_timestamp": null,
    "paused": false
  },
  "timeline": 12,
  "replication_state": "in archive recovery",
  "dcs_last_seen": 1688642069,
  "database_system_identifier": "7252327498286490579",
  "patroni": {
    "version": "3.0.3",
    "scope": "batman"
  }
}

$ curl -s localhost:8009 | jq .
{
  "state": "running",
  "postmaster_start_time": "2023-07-06 13:12:00.595118+02:00",
  "role": "replica",
  "server_version": 150003,
  "xlog": {
    "received_location": 335544816,
    "replayed_location": 335544816,
    "replayed_timestamp": null,
    "paused": false
  },
  "timeline": 12,
  "replication_state": "streaming",
  "dcs_last_seen": 1688642089,
  "database_system_identifier": "7252327498286490579",
  "patroni": {
    "version": "3.0.3",
    "scope": "batman"
  }
}
```
2023-07-13 09:24:20 +02:00
Martín Marqués
e72d3ba79e Use full names for contributors in the release notes (#2725)
Until the last release, contributors' names were fully written on the
first occurrence during that release. This meant that if Alexander had
four contributions in the release, we would use Alexander Kukushkin on
the first item in the release, and on all the others just Alexander.

This could, in some cases, create some confusion, for example when
several contributors share the same first name and each of them has
more than one contribution.

For this reason, in release 3.0.3, we used the full names of contributors
on all the items from the release.

This patch is to amend the old release notes and have each entry with the
full name of the contributor.

Also fix typo with 2 spaces between first name and last name in one bug fix

Signed-off-by: Martín Marqués <martin.marques@enterprisedb.com>
2023-07-04 18:53:53 +03:00
Andrey
74d78dbba2 Update request_queue_size feature authors (#2723)
Add Aleksei Sukhov to the authors
2023-06-26 08:11:09 +02:00