Thursday, 3 October 2024

Failover Slots in PostgreSQL-17: Ensuring High Availability with Logical Replication

With the introduction of failover slots in PostgreSQL-17, logical replication can now be effectively utilized in high-availability environments. This feature ensures that logical replication continues seamlessly even if the publisher node fails and its physical standby is promoted as the new publisher.

To maintain continuous replication after a failover, you need to configure the following parameters and options for both logical and physical replication setups:

1. Enable Failover Property on Slots:

•  Set the failover property for the slots corresponding to the subscriptions that need to continue receiving changes after a failover. This can be done using the failover option in CREATE SUBSCRIPTION or the failover parameter in the pg_create_logical_replication_slot() SQL API. These slots will be referred to as failover slots.

2. Synchronize Replication Slots:

•  Set the sync_replication_slots parameter to on on the standby. This ensures that the failover slots are periodically synchronized by the slotsync worker.

3. Configure Physical Replication Slot:

•  Ensure a physical replication slot exists between the primary and the standby. Set primary_slot_name on the standby and make sure a slot with that name exists on the primary. Also, enable hot_standby_feedback on the standby so that the primary does not remove rows that are still needed by the standby.

4. Prevent Subscriber Overrun:

•  On the primary, set the synchronized_standby_slots parameter to include the physical replication slot. This ensures the subscriber never gets ahead of the physical standby: logical changes are sent to the subscriber only after the standby has confirmed receiving the corresponding WAL.

5. Update Connection Info:

•  Modify primary_conninfo on the standby to include dbname=postgres. This is necessary for the slot sync worker to connect to the primary.

Example Setup

Here's an example demonstrating logical replication continuity after failover using three nodes: Node-A (primary server/publisher), Node-B (physical standby for Node-A), and Node-C (subscriber for Node-A).

Node-A (Publisher)
Initialize the publisher database cluster:
initdb -D <data_pub_path>

Start the server:
pg_ctl start -D <data_pub_path>

Set wal_level to logical.
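
For example, this can be done with ALTER SYSTEM followed by a restart (wal_level only takes effect after a server restart):
ALTER SYSTEM SET wal_level = logical;
pg_ctl restart -D <data_pub_path>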

Create a table and publication:
CREATE TABLE mytbl(c1 INT PRIMARY KEY);
CREATE PUBLICATION mypub FOR TABLE mytbl;

Node-C (Subscriber)
Initialize the subscriber database cluster:
initdb -D <data_sub_path>

Start the server:
pg_ctl start -o "-p 5444" -D <data_sub_path>

Create the table and a failover-enabled subscription:
CREATE TABLE mytbl(c1 INT PRIMARY KEY);
CREATE SUBSCRIPTION mysub CONNECTION 'dbname=postgres' PUBLICATION mypub WITH (failover = true);

Verify Replication

Node-A (Publisher):
SELECT slot_name, failover FROM pg_replication_slots;
 slot_name | failover
-----------+----------
 mysub     | t

INSERT INTO mytbl VALUES (generate_series(1, 2));

On Node-C (Subscriber):
SELECT * FROM mytbl;
 c1
----
  1
  2

Setting Up Physical Standby for Failover
Node-B (Physical Standby)
Create a base backup:
pg_basebackup -D <data_standby_path> -R -X stream -P

Start the standby server:
pg_ctl start -o "-p 5555" -D <data_standby_path>

Node-A (Publisher/Primary)
Create a physical replication slot:
SELECT pg_create_physical_replication_slot('sb1_slot');
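
Per step 4 of the checklist above, you can also list this physical slot in synchronized_standby_slots so that logical walsenders wait for the standby before sending changes to the subscriber. A minimal sketch (this parameter should be applicable with a configuration reload):
ALTER SYSTEM SET synchronized_standby_slots = 'sb1_slot';
SELECT pg_reload_conf();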

Node-B (Physical Standby)
Set the following parameters:
ALTER SYSTEM SET sync_replication_slots = on;
ALTER SYSTEM SET primary_slot_name = 'sb1_slot';
ALTER SYSTEM SET hot_standby_feedback = on;
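
Per step 5 of the checklist above, primary_conninfo on the standby must also include a dbname so the slot sync worker can connect to the primary. A hedged example, where the host, port, and user are placeholders for whatever your existing primary_conninfo already contains:
ALTER SYSTEM SET primary_conninfo = 'host=<primary_host> port=5432 user=<repl_user> dbname=postgres';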
 
Restart the standby server and ensure the failover slot is synced:
SELECT slot_name, failover, synced FROM pg_replication_slots;
 slot_name | failover | synced
-----------+----------+--------
 mysub     | t        | t 

Testing Failover
Stop the primary server (Node-A):
pg_ctl stop -D <data_pub_path>

Promote the standby server (Node-B):
pg_ctl promote -D <data_standby_path>

Update Subscription on Node-C (Subscriber)
Disable the subscription:
ALTER SUBSCRIPTION mysub DISABLE;

Change the connection to the new primary:
ALTER SUBSCRIPTION mysub CONNECTION 'dbname=postgres port=5555';
 
Enable the subscription:
ALTER SUBSCRIPTION mysub ENABLE;

Verify Continuity

On Node-B (New Primary):
INSERT INTO mytbl VALUES (generate_series(3, 4));

On Node-C (Subscriber):
SELECT * FROM mytbl;
 c1
----
  1
  2
  3
  4

Under the Hood
Slot synchronization on a physical standby works by fetching logical failover slot information from the primary server, creating the slots on the standby, and synchronizing them periodically. This can be done automatically by enabling the slot sync worker or manually using the pg_sync_replication_slots() function.
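
For example, to trigger a one-off synchronization manually on the standby instead of relying on the slotsync worker:
SELECT pg_sync_replication_slots();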

If the WAL corresponding to the remote slot's restart_lsn is no longer available on the standby, or the remote slot's catalog_xmin precedes the standby's oldest xid, the slot cannot be fully created locally and is kept as RS_TEMPORARY. Once the slot on the primary catches up (i.e., its positions advance to a point for which the standby still has the required WAL and catalog rows), the synced slot is marked RS_PERSISTENT.
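
You can observe this on the standby through pg_replication_slots; for example:
SELECT slot_name, temporary, synced FROM pg_replication_slots;
A synced slot that could not yet be persisted shows temporary = t and becomes persistent once the condition above is met.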

The slot sync worker waits between synchronization cycles, and the wait duration varies depending on whether any slots were updated during the previous cycle. Synchronized slots on the standby are dropped once they no longer need to be synchronized.

Wednesday, 25 September 2024

Online Upgrading Logical and Physical Replication Nodes

In my last blog post, Evolution of Logical Replication, I mentioned the future development of a feature to allow "upgrades of logical replication nodes." The upcoming release of PostgreSQL 17 includes this feature. Previously, after major version upgrades (via pg_upgrade), users couldn't immediately connect and write data to logical replication nodes. This was because the slots were lost during upgrades, preventing replication from continuing. As a result, new writes wouldn't get replicated, causing data on both nodes to become out of sync. As explained in this blog post, users had to block applications from writing until the replication setup was re-enabled after the upgrade.

With PostgreSQL 17, logical replication nodes can be upgraded without blocking writes or requiring users to manually detach/attach subscriptions or create slots. The migration of logical slots is supported only when the old cluster is version 17.0 or later, so users will benefit from this feature when upgrading from 17.0 to 18.0 (or later versions). Refer to the PostgreSQL documentation for the prerequisites for upgrading publisher and subscriber nodes.

This blog will delve into the internals of this feature and provide an example of upgrading a two-node logical replication setup. The latter part of the blog covers online upgrades of physical replication setups.

Let's start with an example of upgrading a logical replica setup where the old and new versions of databases are PostgreSQL 17.

Publisher:
1. CREATE TABLE foo(c1 int);
2. CREATE PUBLICATION pub FOR TABLE foo;

Subscriber:
1. CREATE TABLE foo(c1 int);
2. CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres' PUBLICATION pub;

Publisher:
3. INSERT INTO foo VALUES(generate_series(1,5));

Subscriber:
3. SELECT * FROM foo;
c1
----
1
2
3
4
5
(5 rows)

This shows that logical replication is happening between the publisher and the subscriber. Now, we can upgrade either the publisher or the subscriber first and let the other node continue operations. In this example, I choose to upgrade the subscriber first; note that the publisher needs to retain the WAL for all changes until the subscriber is upgraded.

Subscriber:
4. Stop the subscriber server.
5. Upgrade the subscriber:
pg_upgrade --old-datadir "/opt/PostgreSQL/postgres/17/data1_sub" --new-datadir "/opt/PostgreSQL/postgres/17/data1_sub_upgraded" --old-bindir "/opt/PostgreSQL/postgres/17/bin" --new-bindir "/opt/PostgreSQL/postgres/17/bin"

While the subscriber is being upgraded, the publisher can continue to receive writes.

Publisher:
4. INSERT INTO foo VALUES(generate_series(10,15));

Now start the upgraded subscriber and verify that the new writes are replicated.

Subscriber:
6. Start the server.
7. SELECT * FROM foo;
c1
----
1
2
3
4
5
10
11
12
13
14
15
(11 rows)

By this point, the subscriber has been upgraded and has fetched all the writes that happened on the publisher during the upgrade. The next step is to upgrade the publisher and verify that replication continues. Before upgrading the publisher, we should disable the subscription on the subscriber to prevent it from polling the publisher and fetching changes during the upgrade process.

Subscriber:
8. ALTER SUBSCRIPTION sub DISABLE;

Publisher:
5. Stop the publisher server.
6. Upgrade the publisher:
pg_upgrade --old-datadir "/opt/PostgreSQL/postgres/17/data1_pub" --new-datadir "/opt/PostgreSQL/postgres/17/data1_pub_upgraded" --old-bindir "/opt/PostgreSQL/postgres/17/bin" --new-bindir "/opt/PostgreSQL/postgres/17/bin"
7. Start the publisher server.

Now enable the subscriptions on the subscriber and write some data on the publisher.

Subscriber:
9. ALTER SUBSCRIPTION sub ENABLE;

Publisher:
8. INSERT INTO foo VALUES(99);

Subscriber:
10. SELECT * FROM foo WHERE c1 = 99;
c1
----
99
(1 row)

Note that the subscriber has replicated the new value from the publisher after its upgrade.

Under the Hood:
To migrate logical slots during an upgrade, a list of logical slots is fetched while reading information from the old cluster. Later in the upgrade process, pg_upgrade revisits the list and restores slots by executing pg_create_logical_replication_slot() on the new cluster. If the old node has invalid slots or slots with unconsumed WAL records, pg_upgrade fails. These checks are necessary to prevent data loss. We also ensure that logical decoding output plugins required by slots in the old cluster are present in the new cluster.
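
For example, before shutting down the old publisher you can inspect the logical slots that pg_upgrade will examine; a hedged query using standard pg_replication_slots columns (pg_upgrade performs its own, stricter checks):
SELECT slot_name, confirmed_flush_lsn, wal_status FROM pg_replication_slots WHERE slot_type = 'logical';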

This work also allows upgrades to preserve the full subscription state, enabling us to replicate changes on subscriber nodes after the upgrade. Previously, only the subscription's metadata information was preserved. Without the list of relations and their state, it wasn't possible to re-enable subscriptions without missing some records. Similarly, we weren't preserving the replication origin's information after the upgrade, which is needed to ensure that we don't replicate anything twice. Additionally, pg_upgrade will check that all subscription relations are in 'i' (init) or 'r' (ready) state and will error out if that's not the case, logging the reason for the failure. This helps avoid the risk of any dangling slot or origin after the upgrade.
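
For example, on the subscriber you can look at the per-relation state that pg_upgrade verifies:
SELECT srsubid, srrelid::regclass AS relation, srsubstate FROM pg_subscription_rel;
All rows should show 'i' (init) or 'r' (ready) before the upgrade.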

Upgrading Streaming (Physical) Replication Setup:
Before PostgreSQL 17, upgrading a streaming replication setup was possible only by upgrading the primary with pg_upgrade in link mode and then using rsync to synchronize the standby's data directory from the primary. Moreover, no operations could be performed on either node (primary or standby) until the upgrade on the primary and the rsync to the standby had finished.

With the advent of pg_createsubscriber and the ability to retain the subscriber's full state in PostgreSQL 17, one can upgrade a streaming replica online (without stopping operations), and the upgrade doesn't even require rsync.

Say there is a physical replication setup between node-A and node-B. Now, we want to upgrade both nodes in the physical replication setup.

1. Stop the standby server (node-B).
2. Run pg_createsubscriber on node-B.
3. Upgrade node-B.
4. Start node-B.
5. Create a physical replica from node-B, say node-C. So both node-B and node-C are on newer server versions.
6. Transition all writes from node-A to node-B.
7. Decommission node-A.

By the end, we have a physical replica setup (node-B → node-C) of the newer version without stopping operations.

Note: As pg_createsubscriber internally uses logical replication, any new DDLs or sequence changes won't be replicated. Refer to the PostgreSQL documentation for the usage of pg_createsubscriber.
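
For illustration only, a minimal pg_createsubscriber invocation on node-B might look like the following, where the data directory, host, and database are placeholders:
pg_createsubscriber -D <node_B_data_dir> --publisher-server "host=<node_A_host> port=5432 dbname=postgres" --database postgres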

Thanks to Vignesh C. and Hayato Kuroda for verifying the above steps for upgrading the streaming replication setup.