Wednesday, 24 September 2025

Parallel Apply of Large Transactions

Logical replication in PostgreSQL has steadily evolved since its introduction in version 10. In a previous blog post, I discussed how PostgreSQL 14 introduced streaming of large transactions. PostgreSQL 16 took this further by enabling parallel apply of large transactions via a non-default subscription option. Now, with PostgreSQL 18, parallel apply is the default behavior—marking a significant milestone in replication performance and scalability.

In this post, we’ll explore how parallel apply works for large transactions and touch upon future plans for small transactions.

Enabling Parallel Apply

From PostgreSQL 18 onwards, parallel apply is enabled by default. To explicitly configure it, users can set the streaming parameter to parallel when creating a subscription:

postgres=# CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres' PUBLICATION pub WITH (streaming = parallel);
NOTICE:  created replication slot "sub" on publisher
CREATE SUBSCRIPTION

postgres=# SELECT subname, substream FROM pg_subscription;
 subname | substream
---------+-----------
 sub     | p
(1 row)

You can control the degree of parallelism using the max_parallel_apply_workers_per_subscription parameter. By default, this is set to 2, allowing two large transactions to be applied in parallel per subscription.
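For example, to allow up to four large transactions to be applied in parallel (the value 4 is only illustrative), the parameter can be changed and reloaded. Note that parallel apply workers are drawn from the pool capped by max_logical_replication_workers, so that limit may need raising as well:

postgres=# ALTER SYSTEM SET max_parallel_apply_workers_per_subscription = 4;
ALTER SYSTEM
postgres=# SELECT pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)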

How Parallel Apply Works

Parallel apply significantly improves the handling of large transactions by applying changes as they arrive rather than waiting for the COMMIT message from the publisher. Previously (with streaming = on), the subscriber wrote all streamed data to temporary files and applied it only after receiving the commit record.

Architecture Overview

When the first stream of a large transaction arrives, the leader apply worker (LA) assigns a parallel apply worker (PA)—if available. The LA sends changes to the PA via a dedicated shared memory queue (shm_mq). The PA applies the changes and waits for the next stream. This worker remains assigned until the transaction is committed.
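While a large transaction is being streamed, the PA shows up alongside the LA in pg_stat_subscription. A hypothetical session on PostgreSQL 17 or later, where the view includes the worker_type column (the pids below are made up):

postgres=# SELECT subname, worker_type, pid, leader_pid FROM pg_stat_subscription;
 subname |  worker_type   |  pid  | leader_pid
---------+----------------+-------+------------
 sub     | apply          | 52100 |
 sub     | parallel apply | 52107 |      52100
(2 rows)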

At commit time, the LA waits for the PA to finish to ensure commit order is preserved, which is critical to avoid:

  • Transaction dependencies: e.g., inserting a row in one transaction and updating it in another (see the sketch after this list).
  • Deadlocks: e.g., transactions updating the same rows/tables in reverse order.
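
To make the first point concrete, here is a hypothetical pair of publisher transactions against a table t (names and schema are illustrative):

-- Table replicated from publisher to subscriber (illustrative schema).
CREATE TABLE t (id int PRIMARY KEY, val text);

BEGIN;  -- TX-1 inserts a row
INSERT INTO t VALUES (1, 'initial');
COMMIT;

BEGIN;  -- TX-2 updates the row TX-1 created
UPDATE t SET val = 'updated' WHERE id = 1;
COMMIT;

-- If the subscriber committed TX-2 before TX-1, the UPDATE would find
-- no row to modify and its change would be lost; committing in
-- publisher order avoids this.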

Locking Considerations

Parallel apply introduces potential deadlocks, especially when transactions are independent on the publisher but become interdependent on the subscriber due to schema differences (e.g., constraints, indexes).

To detect such deadlocks between LA and PA workers, PostgreSQL uses heavy-weight locks during:

  • PA waiting for the next stream
  • LA waiting for PA to finish

Example 1: Deadlock Due to Conflicting Changes

Consider a scenario where a table lacks a unique key on the publisher but has one on the subscriber. If:

  • PA is applying TX-1
  • LA is applying TX-2

...and both transactions conflict due to the unique constraint, a deadlock can occur:

  • LA waits for a lock on the unique index
  • PA waits for the next stream from LA
  • Deadlock cycle: LA → PA → LA

PostgreSQL detects and reports such deadlocks. Users can skip the problematic transaction using the ALTER SUBSCRIPTION ... SKIP command, as shown below.
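A minimal sketch of a schema that can produce this situation (names are hypothetical):

-- On the publisher: no unique key, so concurrent transactions can
-- insert the same id without conflicting there.
CREATE TABLE tbl (id int, val text);
CREATE PUBLICATION pub FOR TABLE tbl;

-- On the subscriber: the same table, but with a unique index the
-- publisher lacks, making the two transactions conflict on apply.
CREATE TABLE tbl (id int, val text);
CREATE UNIQUE INDEX tbl_id_uniq ON tbl (id);

Once the error is reported, the failed transaction's finish LSN appears in the subscriber's server log and can be skipped (the LSN here is a placeholder):

postgres=# ALTER SUBSCRIPTION sub SKIP (lsn = '0/14C0378');
ALTER SUBSCRIPTION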

Example 2: Deadlock Due to shm_mq Buffer Full

In a variant of the previous example:

  • PA-1 and PA-2 apply TX-1 and TX-2 respectively.
  • PA-2 waits for PA-1 to finish.
  • PA-1 waits for input from LA.
  • If the shm_mq buffer between LA and PA-2 is full, LA cannot send messages.

To avoid this:

  • LA uses non-blocking writes with a timeout.
  • If the timeout is exceeded, LA serializes pending messages to a file.
  • PA-2 reads from the file, freeing LA to continue processing PA-1.

This mechanism prevents deadlocks caused by full shm_mq buffers.

Performance

Setup: A synchronous logical replication system, with publisher and subscriber running on the same machine.

Non-default GUC settings:
shared_buffers = 100GB
checkpoint_timeout = 30min
max_wal_size = 20GB
min_wal_size = 10GB
autovacuum = off
synchronous_commit = remote_apply
logical_decoding_work_mem = 30MB

Workload: 1 million to 10 million row inserts into a single table
Time measured: time from the start of the insert until the commit returns
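
The inserts were of this general shape (a sketch; table name and schema are illustrative). Because synchronous_commit = remote_apply, the commit on the publisher returns only after the subscriber has applied the transaction, so \timing captures the end-to-end replication time:

postgres=# CREATE TABLE large_tbl (id int, val text);
CREATE TABLE
postgres=# \timing on
Timing is on.
postgres=# INSERT INTO large_tbl SELECT i, md5(i::text) FROM generate_series(1, 10000000) i;
INSERT 0 10000000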


Results:
Elapsed time improved by ~2X with parallel apply.
As the number of tuples increases, parallel apply provides a greater benefit, since changes are applied as they stream in rather than after the whole transaction has arrived, significantly reducing replication lag between publisher and subscriber.

What’s Next?

With parallel apply now the default for large transactions, the next step is to extend this capability to small transactions, which would further improve logical replication performance and scalability in PostgreSQL. The community has already begun discussing this work.
