Wednesday, 13 May 2015

Extend Tar Format in pg_basebackup


As of now, one can't reliably use tar format to take backup on Windows
because it can't restore tablespaces data which is stored in form of symbolic
links in <data_directory>/pg_tblspc/.  The reason for the same is that  native
windows utilites are not able to create symbolic links while extracting files
from tar.  It might be possible to create symbolic links if cygwin is installed
on your system, however we need this feature to work for native windows as
well.

In PostgreSQL 9.5, a new feature (commit id - 72d422a5) to extend existing
tar format made it possible to reliably take the backup (in tar mode). 
From user perspective, there is nothing much that is changed to take the backup
except that  tar format mode (--format=tar) in pg_basebackup (of the PostgreSQL
9.5 version) will only work with server version 9.5 or later.  This feature is mainly
required for windows, but for the sake consistency it has been changed for all
platforms and also it should enable long (length greater than 99) target symbolic
link for tar format (I think the changes for same are still not done, but we can do
the same now as this feature is committed).

The basic idea behind the feature is that it forms the tablespace map of
all the tablespace symbolic links that are present inside
<data_directory>/pg_tblspc/ and store the same in data_directory for
Exclusive backups (aka backups taken via pg_start_backup() and
pg_stop_backup() functions) and store in backup archive for Non-Exclusive
backups (aka backups taken by pg_basebackup).

The format of tablespace_map file is:
16384 E:\WorkSpace\PostgreSQL\master\tbs
16388 E:\WorkSpace\PostgreSQL\master\tbs              2               3

The tablespace symbolic links are restored during archive recovery and the
tablespace_map file will be renamed to tablespace_map.old at the end of
recovery similar to backup_label file.