Saturday, 8 August 2015

Improved Writes in PostgreSQL For 9.6 (Part - 1)


Lately, PostgreSQL has gained attention because of the numerous performance
improvements being made in various areas. For 9.5, the major areas, as covered
in my PGCon presentation, are read operations, sorting, plpgsql, a new index
type for data access, and compression of full_page_writes. However, there is
still more to be done to make it better than commercial RDBMSs, and one of
the important areas for improvement is write operations, as shown in one of
my previous posts (Write Scalability in PostgreSQL). During my investigation
of write operations, I found locking bottlenecks that limit performance. The
most contended lock is ProcArrayLock, which is used when committing a
transaction and when taking snapshots.

Removing the contention around ProcArrayLock gives a very good boost in
performance, especially at higher client counts, and this work has been done
for PostgreSQL 9.6. Let us first discuss the improvement in terms of TPS
(transactions per second) after this work. I ran a pgbench read-write
(TPC-B-like) workload to compare performance with and without this commit
on an Intel machine with 8 sockets, 64 cores (128 hardware threads) and
500GB RAM. Here is the performance data (the same tests on an IBM POWER-8
machine show a similar gain):

Non-default settings used in all the tests are:
max_connections = 300
shared_buffers = 8GB
wal_buffers = 256MB
min_wal_size = 10GB
max_wal_size = 15GB
checkpoint_timeout = 35min
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9

The data is taken with all of the data fitting in shared_buffers, as this work
mainly helps such cases. The performance increase is visible at somewhat higher
client counts: at 64 clients we see a 30% improvement, and at 256 clients the
improvement is 133%. At lower client counts (8 or 16 clients) there is not much
difference (I see a 1-2% difference, which is due to fluctuation; I don't think
this work helps in such cases).
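
To put those percentages in perspective, a gain of N% over the baseline means
the patched server delivers (1 + N/100) times the baseline throughput. The
small sketch below just does that arithmetic for the two figures quoted above;
the helper name is my own, and no TPS numbers are assumed.

```python
def improvement_to_ratio(percent_gain):
    """Convert a percentage improvement over a baseline into a throughput ratio."""
    return 1.0 + percent_gain / 100.0


# The two gains quoted in the text: 30% at 64 clients, 133% at 256 clients.
for clients, gain in [(64, 30), (256, 133)]:
    print(f"{clients} clients: {improvement_to_ratio(gain):.2f}x baseline TPS")
```

So the 133% figure at 256 clients means a bit more than double the baseline
throughput.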

Now coming to the work done to improve performance. Presently, to meet the
correctness requirement for taking snapshots, PostgreSQL enforces strict
serialization of commits and rollbacks with snapshot-taking: it doesn't allow any
transaction to exit the set of running transactions while a snapshot is being taken.
To achieve this, a backend taking a snapshot acquires ProcArrayLock in SHARED
mode, and each exiting transaction acquires it in EXCLUSIVE mode. So in this
protocol there are two different types of contention: one between a backend
trying to acquire a snapshot and a backend trying to commit a transaction, and
the other among backends that are trying to commit at the same time. The idea
used in this work is to allow only one backend at a time (we can call it the
group leader) to take ProcArrayLock and complete the work for all other
transactions that are trying to commit at the same time. This minimises
ProcArrayLock acquisitions in EXCLUSIVE mode, which in turn greatly reduces
the contention around it.
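
The group-leader idea can be illustrated with a small toy model. This is only
a sketch under my own assumptions, not the PostgreSQL code: the class and
method names are hypothetical, a plain mutex stands in for ProcArrayLock in
EXCLUSIVE mode, a second small mutex protects the pending list, and snapshot
takers (SHARED mode) are not modelled at all. The first committer to join an
empty pending list becomes the leader; everyone who joins after it simply
waits until the leader has removed their transaction from the running set.

```python
import threading
import time


class GroupCommit:
    """Toy model of leader-based group commit (hypothetical, for illustration)."""

    def __init__(self):
        self.proc_array_lock = threading.Lock()  # stands in for ProcArrayLock (EXCLUSIVE)
        self.pending_lock = threading.Lock()     # protects only the pending list
        self.pending = []                        # (xid, done_event) waiting to be cleared
        self.running = set()                     # xids of in-progress transactions
        self.exclusive_acquisitions = 0          # how often a committer took the big lock

    def begin(self, xid):
        with self.proc_array_lock:
            self.running.add(xid)

    def commit(self, xid):
        """Remove xid from the running set; return True if this caller led the group."""
        done = threading.Event()
        with self.pending_lock:
            is_leader = not self.pending         # first to join an empty list leads
            self.pending.append((xid, done))
        if not is_leader:
            done.wait()                          # follower: the leader does our work
            return False
        with self.proc_array_lock:               # one acquisition for the whole group
            self.exclusive_acquisitions += 1
            with self.pending_lock:
                batch, self.pending = self.pending, []
            for member_xid, _ in batch:
                self.running.discard(member_xid)
        for _, event in batch:
            event.set()                          # wake the followers
        return True


def demo(n_clients=8):
    """Commit n_clients transactions while the big lock is briefly busy."""
    gc = GroupCommit()
    for xid in range(1, n_clients + 1):
        gc.begin(xid)
    led = []
    gc.proc_array_lock.acquire()                 # simulate the lock being held elsewhere
    threads = [threading.Thread(target=lambda x=xid: led.append(gc.commit(x)))
               for xid in range(1, n_clients + 1)]
    for t in threads:
        t.start()
    while True:                                  # wait until every committer has queued up
        with gc.pending_lock:
            if len(gc.pending) == n_clients:
                break
        time.sleep(0.001)
    gc.proc_array_lock.release()
    for t in threads:
        t.join()
    return gc.exclusive_acquisitions, sum(led), len(gc.running)
```

Calling demo(8) returns (1, 1, 0): eight transactions committed, but the big
lock was taken only once, by a single leader, and no transaction is left in
the running set. Without grouping, each of the eight committers would have
taken the lock itself.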

Apart from the benefit this patch brings, it also opens up the opportunity
for more optimisations that reduce contention on other locks in PostgreSQL,
such as CLogControlLock and WALWriteLock, which I see as a huge benefit for
write operations. I hope to see more improvements for write operations and
to cover them in future blogs.

Last but not least, I would like to thank all who were involved in this work. Firstly,
I would like to thank my employer EnterpriseDB and Robert Haas, who not only
encouraged me to work in this area but also helped at various stages of this
patch's development. When I was in the middle of this work and wanted feedback
and suggestions, a lot of people (during PGCon) shared their thoughts with me.
Among them, those who really helped me move this work to a level where it
could be presented to the PostgreSQL community are Robert Haas, Andres Freund
and Simon Riggs. Finally, I would also like to thank Pavan Deolasee, who
reviewed this patch.