pg_rman 1.3

  1. Name
  2. Synopsis
  3. Description
  4. Examples
  5. Options
  6. Way to pass options
  7. Restrictions
  8. Details
  9. External Scripts
  10. Download
  11. Installation
  12. Requirements
  13. See Also

Name

pg_rman -- manages backup and recovery of PostgreSQL.

Synopsis

pg_rman [ OPTIONS ] { init |
                      backup |
                      restore |
                      show [ DATE | detail ] |
                      validate [ DATE ] |
                      delete DATE |
                      purge }

pg_rman has the features below:

DATE is the start time of the target backup in ISO format (YYYY-MM-DD HH:MI:SS). DATE is compared with backup start times by prefix match, so a partial value such as a year and month selects every backup taken in that period.

$ pg_rman show 2009-12 # show backups taken in December 2009
$ pg_rman validate     # validate all unvalidated backups
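
The prefix match on DATE can be pictured with a small shell sketch. The function name date_matches is hypothetical and not part of pg_rman; it only illustrates, under that assumption, how a partial DATE selects backups by their start time.

```shell
# Hypothetical helper (not part of pg_rman): succeed when a backup's
# start time begins with the given DATE prefix.
date_matches() {
    prefix=$1
    start_time=$2
    case "$start_time" in
        "$prefix"*) return 0 ;;   # start time begins with the prefix
        *)          return 1 ;;
    esac
}

date_matches "2009-12" "2009-12-24 03:00:00" && echo "selected"
date_matches "2009-12" "2010-01-02 03:00:00" || echo "not selected"
```

A longer prefix narrows the selection: "2009-12-24" would match only backups started on that day.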

pg_rman supports the following commands. See also Options for details of OPTIONS.

Description

pg_rman is a utility program to back up and restore a PostgreSQL database. It takes a physical online backup of the whole database cluster, archived WAL files, and server logs.

pg_rman supports taking backups from a standby site with PostgreSQL 9.0 or later, and also supports storage snapshot backups.

Initialize a backup catalog

First, create a backup catalog to store backup files and their metadata.

$ pg_rman init -B <a backup catalog path>

It is recommended to set log_directory, archive_mode, and archive_command in postgresql.conf before initializing the backup catalog. If these parameters are set, pg_rman reflects their values in its own configuration file. In this case, you have to specify the database cluster path, either in the PGDATA environment variable or with the -D/--pgdata option.

Backup

The backup mode is one of full, incremental, or archive, and is selected with -b/--backup-mode.

pg_rman can also back up PostgreSQL server log files.

Validate backup data

Backups taken by pg_rman must be validated. pg_rman validates them by checking file sizes and CRCs.

It is recommended to validate backup files as soon as possible after the backup. An unvalidated backup cannot be used for restore or as the base of an incremental backup.
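
Because a backup that has not been checked cannot be restored, it can help to run validation in the same job that takes the backup. The following is a minimal sketch, assuming BACKUP_PATH and PGDATA are already set in the environment. The function name backup_and_validate and the PG_RMAN override variable are hypothetical names introduced for this example.

```shell
# Sketch: take a full backup and validate it right away.
# PG_RMAN is an assumed override hook for this example only;
# when it is unset, the pg_rman found in PATH is used.
backup_and_validate() {
    # Stop with the backup's exit code if the backup itself fails.
    "${PG_RMAN:-pg_rman}" backup --backup-mode=full || return $?
    # Validate all unvalidated backups, including the one just taken.
    "${PG_RMAN:-pg_rman}" validate
}
```

Calling backup_and_validate from a scheduled job returns a non-zero status if either step fails, so the job can report the failure.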

View backup information

The show command outputs backup lists.

$ pg_rman show
=====================================================================
 StartTime           EndTime              Mode    Size   TLI  Status
=====================================================================
2023-11-28 12:13:24  2023-11-28 12:13:26  FULL   375MB     1  OK
2023-11-28 12:13:15  2023-11-28 12:13:17  INCR    33MB     1  OK
2023-11-28 12:12:48  2023-11-28 12:12:50  INCR    33MB     1  OK
2023-11-28 12:12:36  2023-11-28 12:12:38  INCR    33MB     1  OK
2023-11-28 12:11:51  2023-11-28 12:12:00  FULL  3366MB     1  OK

The show detail command shows more detailed information.

$ pg_rman show detail
======================================================================================================================
 StartTime           EndTime              Mode    Data  ArcLog  SrvLog   Total  Compressed  CurTLI  ParentTLI  Status
======================================================================================================================
2023-11-28 12:13:24  2023-11-28 12:13:26  FULL   369MB    67MB    66kB   375MB       false       1          0  OK
2023-11-28 12:13:15  2023-11-28 12:13:17  INCR   297kB    33MB    63kB    33MB       false       1          0  OK
2023-11-28 12:12:48  2023-11-28 12:12:50  INCR   297kB    33MB    60kB    33MB       false       1          0  OK
2023-11-28 12:12:36  2023-11-28 12:12:38  INCR   297kB    33MB    57kB    33MB       false       1          0  OK
2023-11-28 12:11:51  2023-11-28 12:12:00  FULL   369MB  3053MB  3909kB  3366MB       false       1          0  OK

The fields are:

In addition, when you specify the StartTime of a backup, you can see the detailed information of that backup.

$ pg_rman show '2023-11-28 12:14:03'
# configuration
BACKUP_MODE=FULL
FULL_BACKUP_ON_ERROR=false
WITH_SERVERLOG=true
COMPRESS_DATA=false
# result
TIMELINEID=1
START_LSN=0/c2000028
STOP_LSN=0/c2000ee0
START_TIME='2023-11-28 12:14:03'
END_TIME='2023-11-28 12:14:05'
RECOVERY_XID=22719
RECOVERY_TIME='2023-11-28 12:14:05'
TOTAL_DATA_BYTES=369864268
READ_DATA_BYTES=369864034
READ_ARCLOG_BYTES=33554780
READ_SRVLOG_BYTES=68692
WRITE_BYTES=342403000
BLOCK_SIZE=8192
XLOG_BLOCK_SIZE=8192
STATUS=OK

Restore

pg_rman restores the backed-up data into the target database cluster path.

The PostgreSQL server must be stopped before restoring. In addition, do not erase the original database cluster, because pg_rman has to check the timeline ID and the data checksum status from it. The restore command saves unarchived transaction logs and then deletes all database files. You can retry recovery until a new backup is taken. After restoring files, pg_rman creates recovery.conf in $PGDATA. This file contains the parameters for recovery, and you can modify it if needed.

pg_rman configures the recovery-related GUC parameters when restoring. Which configuration file it writes depends on the versions of PostgreSQL and pg_rman, as shown below. If you need to, modify the file manually, then start the server to perform PITR.

# PostgreSQL's version is lower than 12
$ cat $PGDATA/recovery.conf
# recovery.conf generated by pg_rman 1.2.11
restore_command = 'cp /home/postgres/arclog/%f %p'
recovery_target_timeline = '1'

# PostgreSQL's version is 12 or higher, and pg_rman's version is 1.3.12 or less
$ tail -n 3 $PGDATA/postgresql.conf
# postgresql.conf generated by pg_rman 1.3.12
restore_command = 'cp /home/postgres/arclog/%f %p'
recovery_target_timeline = '1'

# PostgreSQL's version is 12 or higher, and pg_rman's version is higher than 1.3.12
$ tail -n 1 $PGDATA/postgresql.conf
include = 'pg_rman_recovery.conf' # added by pg_rman 1.3.16
$ cat $PGDATA/pg_rman_recovery.conf
# added by pg_rman 1.3.16
restore_command = 'cp /home/postgres/arclog/%f %p'
recovery_target_timeline = '1'

It is recommended to take a full backup as soon as possible after recovery succeeds, and to manually remove the recovery-related parameters configured by pg_rman. The reason is that, since recovery.conf was integrated into postgresql.conf in PostgreSQL 12, PostgreSQL may fail to work with HA cluster software even after recovery is done. Pacemaker, an HA cluster software, first starts the PostgreSQL server as a standby and then decides whether to promote it. The server then fails to start properly because the recovery-related parameters configured by pg_rman unexpectedly remain in effect. For example, with PostgreSQL 12 or later and a pg_rman version later than 1.3.12, you need to remove the include directive from $PGDATA/postgresql.conf and delete $PGDATA/pg_rman_recovery.conf.
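
The cleanup for PostgreSQL 12 or later with a pg_rman version later than 1.3.12 could be scripted roughly as follows. This is a sketch under assumptions, not a procedure provided by pg_rman: the function name remove_pg_rman_recovery_settings is hypothetical, and the include line is assumed to begin exactly as in the sample output above.

```shell
# Sketch: drop the recovery settings that pg_rman added.
# $1 is the database cluster path (normally $PGDATA).
remove_pg_rman_recovery_settings() {
    pgdata=$1
    # Keep a copy of the original file as postgresql.conf.bak.
    cp "$pgdata/postgresql.conf" "$pgdata/postgresql.conf.bak" || return 1
    # Rewrite postgresql.conf without the include line pg_rman appended.
    grep -v "^include = 'pg_rman_recovery.conf'" \
        "$pgdata/postgresql.conf.bak" > "$pgdata/postgresql.conf"
    # Remove the file that held the recovery parameters.
    rm -f "$pgdata/pg_rman_recovery.conf"
}
```

Run it only after recovery has finished, for example as remove_pg_rman_recovery_settings "$PGDATA", and review the resulting postgresql.conf before the next server start.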

If --recovery-target-timeline is not specified, the last checkpoint’s TimeLineID in control file ($PGDATA/global/pg_control) will be a restore target. If pg_control is not present, TimeLineID in the full backup used by the restore will be a restore target.

When specifying --recovery-target-time, make sure to specify a timestamp greater than (or equal to) the EndTime of the full backup that you want to use as the base.
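
The constraint on --recovery-target-time can be checked before running a restore. The sketch below assumes both timestamps use the 'YYYY-MM-DD HH:MI:SS' layout shown by the show command and refer to the same time zone, so that plain string comparison matches chronological order. The function name target_time_is_usable is hypothetical.

```shell
# Hypothetical pre-check (not part of pg_rman): succeed when the
# recovery target time ($1) is not earlier than the EndTime ($2)
# of the full backup used as the base.
target_time_is_usable() {
    # Appending "" forces awk to compare the values as strings.
    awk -v target="$1" -v end_time="$2" \
        'BEGIN { exit !((target "") >= (end_time "")) }'
}

if target_time_is_usable "2023-11-28 12:13:00" "2023-11-28 12:12:00"; then
    echo "target time is usable with this full backup"
fi
```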

If the archived WAL files were not compressed at backup time, archived WAL files that do not exist in the archive storage area are restored as symbolic links. When pg_rman is used together with peripheral tools (e.g., PG-REX) that are not designed for this behavior, specify the --hard-copy option to copy the files physically.

Delete backups

The delete command deletes all backup files taken before the specified date that are not required by other incremental backups. Incremental backups depend on an earlier validated full backup.

The following example deletes the backup files that are not needed to recover to 12:13:30 on November 28, 2023.

$ pg_rman show
=====================================================================
 StartTime           EndTime              Mode    Size   TLI  Status
=====================================================================
2023-11-28 12:14:03  2023-11-28 12:14:05  FULL   342MB     1  OK
2023-11-28 12:13:56  2023-11-28 12:13:57  ARCH    16MB     1  OK
2023-11-28 12:13:52  2023-11-28 12:13:53  ARCH    16MB     1  OK
2023-11-28 12:13:24  2023-11-28 12:13:26  FULL   375MB     1  OK
2023-11-28 12:13:15  2023-11-28 12:13:17  INCR    33MB     1  OK
2023-11-28 12:12:48  2023-11-28 12:12:50  INCR    33MB     1  OK
2023-11-28 12:12:36  2023-11-28 12:12:38  INCR    33MB     1  OK
2023-11-28 12:11:51  2023-11-28 12:12:00  FULL  3366MB     1  OK

$ pg_rman delete 2023-11-28 12:13:30
WARNING: cannot delete backup with start time "2023-11-28 12:13:24"
DETAIL: This is the latest full backup necessary for successful recovery.
INFO: delete the backup with start time: "2023-11-28 12:13:15"
INFO: delete the backup with start time: "2023-11-28 12:12:48"
INFO: delete the backup with start time: "2023-11-28 12:12:36"
INFO: delete the backup with start time: "2023-11-28 12:11:51"

$ pg_rman show
=====================================================================
 StartTime           EndTime              Mode    Size   TLI  Status
=====================================================================
2023-11-28 12:14:03  2023-11-28 12:14:05  FULL   342MB     1  OK
2023-11-28 12:13:56  2023-11-28 12:13:57  ARCH    16MB     1  OK
2023-11-28 12:13:52  2023-11-28 12:13:53  ARCH    16MB     1  OK
2023-11-28 12:13:24  2023-11-28 12:13:26  FULL   375MB     1  OK

Remove deleted backups

Although the delete command removes the actual data from the file system, some catalog information about the deleted backups remains. To remove it, execute the purge command.

$ pg_rman show -a
=====================================================================
 StartTime           EndTime              Mode    Size   TLI  Status
=====================================================================
2023-11-28 12:14:03  2023-11-28 12:14:05  FULL   342MB     1  OK
2023-11-28 12:13:56  2023-11-28 12:13:57  ARCH    16MB     1  OK
2023-11-28 12:13:52  2023-11-28 12:13:53  ARCH    16MB     1  OK
2023-11-28 12:13:24  2023-11-28 12:13:26  FULL   375MB     1  OK
2023-11-28 12:13:15  2023-11-28 12:13:17  INCR    33MB     1  DELETED
2023-11-28 12:12:48  2023-11-28 12:12:50  INCR    33MB     1  DELETED
2023-11-28 12:12:36  2023-11-28 12:12:38  INCR    33MB     1  DELETED
2023-11-28 12:11:51  2023-11-28 12:12:00  FULL  3366MB     1  DELETED

$ pg_rman purge
INFO: DELETED backup "2023-11-28 12:13:15" is purged
INFO: DELETED backup "2023-11-28 12:12:48" is purged
INFO: DELETED backup "2023-11-28 12:12:36" is purged
INFO: DELETED backup "2023-11-28 12:11:51" is purged

$ pg_rman show -a
=====================================================================
 StartTime           EndTime              Mode    Size   TLI  Status
=====================================================================
2023-11-28 12:14:03  2023-11-28 12:14:05  FULL   342MB     1  OK
2023-11-28 12:13:56  2023-11-28 12:13:57  ARCH    16MB     1  OK
2023-11-28 12:13:52  2023-11-28 12:13:53  ARCH    16MB     1  OK
2023-11-28 12:13:24  2023-11-28 12:13:26  FULL   375MB     1  OK

Standby-site Backup

If you use the replication feature of PostgreSQL 9.0 or later, you can take a backup from the standby site. The basic usage is the same as with a single master server, so only the points that need attention are described here.

Archived WAL files must also be backed up when you take a backup at the standby site. You therefore need either to make the master's archive area accessible from the standby (for example, through a shared disk) or to set archive_mode to 'always' on the standby.

In the latter case, copy the primary's archived WAL files (including history files) when the standby is created, to make sure that you can back up all the files required for restoring. You can delete old archived WAL files at backup time using --keep-arclog-files / --keep-arclog-days. However, only the archive area on the server where the backup is taken is subject to deletion, so the master's archived WAL files are not deleted when you take a backup at the standby site.

Taking a backup from the standby site requires options different from the usual ones. Specify the database cluster of the standby site with the -D/--pgdata option, the master site with the connection options (-d/--dbname, -h/--host, -p/--port), and the standby site with the standby connection options (--standby-host, --standby-port).

$ pg_rman init -B <a backup catalog path> -D <the database cluster path (on standby-site)>

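The following example assumes that the master is reachable as host "master" and that the standby runs on the local host on port 5432 with its database cluster in /home/postgres/pgdata_sby. The backup from the standby site can then be taken with the command below: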

$ pg_rman backup --pgdata=/home/postgres/pgdata_sby --backup-mode=full --host=master --standby-host=localhost --standby-port=5432

Examples

In this example, consider a PostgreSQL server with the following configuration.

postgres=# SHOW log_directory ;
 log_directory
---------------
 pg_log
(1 row)

postgres=# SHOW archive_command ;
              archive_command
--------------------------------------------
 cp %p /home/postgres/arc_log/%f
(1 row)

The PGDATA and BACKUP_PATH environment variables are set as follows.

$ echo $PGDATA
/home/postgres/pgdata
$ echo $BACKUP_PATH
/home/postgres/backup

Initialize a backup catalog.

$ pg_rman init
INFO: ARCLOG_PATH is set to '/home/postgres/arclog'
INFO: SRVLOG_PATH is set to '/home/postgres/pgdata/pg_log'

This creates the configuration file for pg_rman, named pg_rman.ini. All pg_rman commands load their default settings from this file.

For this example, we use the following configuration.

$ cat $BACKUP_PATH/pg_rman.ini
ARCLOG_PATH = /home/postgres/arclog
SRVLOG_PATH = /home/postgres/pgdata/pg_log

BACKUP_MODE = F
COMPRESS_DATA = YES
KEEP_ARCLOG_FILES = 10
KEEP_DATA_GENERATIONS = 3
KEEP_SRVLOG_FILES = 10

Then take a backup. The first backup must be a full backup. Here, we also back up the server log files.

$ pg_rman backup --backup-mode=full --with-serverlog --progress
INFO: copying database files
Processed 2049 of 2049 files, skipped 0
INFO: copying archived WAL files
Processed 21 of 21 files, skipped 0
INFO: copying server log files
Processed 10 of 10 files, skipped 0
INFO: backup complete
INFO: Please execute 'pg_rman validate' to verify the files are correctly copied.
INFO: start deleting old archived WAL files from ARCLOG_PATH (keep files = 10)
INFO: delete "0000000300000000000000E2"
INFO: delete "0000000300000000000000E2.00000028.backup"
INFO: delete "0000000300000000000000E1"
INFO: start deleting old server files from SRVLOG_PATH (keep files = 10)
INFO: start deleting old backup (keep generations = 3)
INFO: does not include the backup just taken

Check the result with the show command.

$ pg_rman show
=====================================================================
 StartTime           EndTime              Mode    Size   TLI  Status
=====================================================================
2023-11-29 16:04:17  2023-11-29 16:04:26  FULL    50MB     5  DONE

The status of the backup we have just taken is DONE because it has not been validated yet. Run the validate command next.

$ pg_rman validate
INFO: validate: "2023-11-29 16:04:17" backup, archive log files and server log files by CRC
INFO: backup "2023-11-29 16:04:17" is valid

$ pg_rman show
=====================================================================
 StartTime           EndTime              Mode    Size   TLI  Status
=====================================================================
2023-11-29 16:04:17  2023-11-29 16:04:26  FULL    50MB     5  OK

Now the status has been changed to OK.

Let's try to restore the backup data. The PostgreSQL server must be stopped before restoring.

$ pg_ctl stop -m immediate
$ pg_rman restore

pg_rman has configured the recovery-related parameters. Modify them if necessary. In this example, we use them unmodified and perform PITR to the latest database state.

$ cat $PGDATA/pg_rman_recovery.conf
# added by pg_rman 1.3.16
restore_command = 'cp /dbfp/pgarch/arc1/%f %p'
recovery_target_timeline = '4'
$ pg_ctl start

Options

pg_rman accepts the following command-line parameters. Some of them can also be specified as environment variables.

Common options

As a general rule, paths for data location need to be specified as absolute paths; relative paths are not allowed.

The following parameter determines the behavior of restore.

Catalog options

Connection options

Parameters for connecting to the PostgreSQL server.

Standby connection options

Parameters for connecting to the standby server. They are used only when you take a backup from the standby site.

Generic options

Way to pass options

Some parameters can be specified as command-line arguments, environment variables, or configuration file entries, as follows:

Short  Long                         Environment variable       Conf file  Description
-----  ---------------------------  -------------------------  ---------  -----------
-h     --host                       PGHOST                                database server host or socket directory
-p     --port                       PGPORT                                database server port
-d     --dbname                     PGDATABASE                            database to connect
-U     --username                   PGUSER                                user name to connect as
                                    PGPASSWORD                            password used to connect
-w     --no-password                                                      never prompt for password
-W     --password                                                         force password prompt
-D     --pgdata                     PGDATA                     Yes        location of the database storage area
-B     --backup-path                BACKUP_PATH                Yes        location of the backup storage area
-A     --arclog-path                ARCLOG_PATH                Yes        location of archive WAL storage area
-S     --srvlog-path                SRVLOG_PATH                Yes        location of server log storage area
-b     --backup-mode                BACKUP_MODE                Yes        backup mode (full, incremental, or archive)
-s     --with-serverlog             WITH_SERVERLOG             Yes        also backup server log files [1]
-Z     --compress-data              COMPRESS_DATA              Yes        compress data backup with zlib [1]
-C     --smooth-checkpoint          SMOOTH_CHECKPOINT          Yes        do smooth checkpoint before backup [1]
       --standby-host               STANDBY_HOST               Yes        standby server host or socket directory
       --standby-port               STANDBY_PORT               Yes        standby server port
       --keep-data-generations      KEEP_DATA_GENERATIONS      Yes        keep GENERATION of full data backup
       --keep-data-days             KEEP_DATA_DAYS             Yes        keep enough data backup to recover to DAY days ago
       --keep-srvlog-files          KEEP_SRVLOG_FILES          Yes        keep NUM of serverlogs
       --keep-srvlog-days           KEEP_SRVLOG_DAYS           Yes        keep serverlog modified in DAY days
       --keep-arclog-files          KEEP_ARCLOG_FILES          Yes        keep NUM of archived WAL
       --keep-arclog-days           KEEP_ARCLOG_DAYS           Yes        keep archived WAL modified in DAY days
       --recovery-target-timeline   RECOVERY_TARGET_TIMELINE   Yes        recovering into a particular timeline
       --recovery-target-xid        RECOVERY_TARGET_XID        Yes        transaction ID up to which recovery will proceed
       --recovery-target-time       RECOVERY_TARGET_TIME       Yes        time stamp up to which recovery will proceed
       --recovery-target-inclusive  RECOVERY_TARGET_INCLUSIVE  Yes        whether we stop just after the recovery target
       --recovery-target-action     RECOVERY_TARGET_ACTION     Yes        action the server should take once the recovery target is reached [2]
       --hard-copy                  HARD_COPY                  Yes        how to restore archive WAL [1]

Remarks:
[1] Specify a boolean value when setting this through the environment variable or the configuration file.
[2] This option is provided in versions higher than 1.3.12.

This utility, like most other PostgreSQL utilities, also uses the environment variables supported by libpq (see Environment Variables).

Restrictions

pg_rman has the following restrictions.

When taking a backup from the standby site, pg_rman also has the following restrictions.

When using storage snapshots, pg_rman also has the following restrictions.

Details

Recovery to Point-in-Time

pg_rman can recover to a point in time if a timeline, transaction ID, or timestamp is specified at recovery. pg_waldump (called pg_xlogdump in PostgreSQL 9.3 through 9.6; xlogdump for 9.2 or earlier) is a useful tool for checking the contents of WAL files and determining the point to recover to. See Continuous Archiving and Point-in-Time Recovery (PITR) for details.

Configuration file

Parameters are specified in the form “name=value” in the configuration file. Quotes are required if the value contains whitespace. Comments start with “#”. Spaces and tabs are ignored except inside values.
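
To make the format concrete, the following sketch reads one parameter from such a file. It is not pg_rman's own parser: the function name conf_value is hypothetical, and details the text does not specify (such as comments that follow a value on the same line) are not handled.

```shell
# Hypothetical reader (not pg_rman's own parser): print the value of
# one parameter from a "name=value" file.
# Usage: conf_value FILE NAME
conf_value() {
    awk -v name="$2" -v sq="'" -v dq='"' '
        /^[ \t]*#/ { next }                    # skip comment lines
        {
            pos = index($0, "=")
            if (pos == 0) next                 # not a name=value line
            key = substr($0, 1, pos - 1)
            val = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim around the name
            gsub(/^[ \t]+|[ \t]+$/, "", val)   # trim around the value
            if (key != name) next
            first = substr(val, 1, 1)
            last = substr(val, length(val), 1)
            if (length(val) >= 2 && first == last && (first == sq || first == dq))
                val = substr(val, 2, length(val) - 2)   # strip enclosing quotes
            print val
            exit                               # first match wins
        }
    ' "$1"
}
```

With a file containing the line RECOVERY_TARGET_TIME = '2023-11-28 12:14:05', calling conf_value on that file with RECOVERY_TARGET_TIME prints the timestamp without the quotes.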

Exit codes

pg_rman returns exit codes for each error status.

Code  Name                   Description
----  ---------------------  -----------
0     SUCCESS                Succeeded.
1     HELP                   Print a help, then exit.
2     ERROR                  Generic error.
3     FATAL                  Exit because of repeated errors.
4     PANIC                  Unknown critical condition.
10    ERROR_SYSTEM           I/O or system error.
11    ERROR_NOMEM            Out of memory.
12    ERROR_ARGS             Invalid input parameters.
13    ERROR_INTERRUPTED      Interrupted by user. (Ctrl+C etc.)
14    ERROR_PG_COMMAND       SQL error.
15    ERROR_PG_CONNECT       Cannot connect to PostgreSQL server.
20    ERROR_ARCHIVE_FAILED   Cannot archive WAL files.
21    ERROR_NO_BACKUP        Backup file not found.
22    ERROR_CORRUPTED        Backup file is broken.
23    ERROR_ALREADY_RUNNING  Cannot start because another pg_rman is running.
24    ERROR_PG_INCOMPATIBLE  Version conflicted with PostgreSQL server.
25    ERROR_PG_RUNNING       Cannot restore because PostgreSQL server is running.
26    ERROR_PID_BROKEN       postmaster.pid file is broken.
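
Scripts that wrap pg_rman can turn these codes into readable names for logs. The helper below is a sketch: the function name exit_code_name and the UNKNOWN fallback label are introduced for this example and are not defined by pg_rman.

```shell
# Sketch: translate a pg_rman exit code into the name listed above.
exit_code_name() {
    case "$1" in
        0)  echo SUCCESS ;;
        1)  echo HELP ;;
        2)  echo ERROR ;;
        3)  echo FATAL ;;
        4)  echo PANIC ;;
        10) echo ERROR_SYSTEM ;;
        11) echo ERROR_NOMEM ;;
        12) echo ERROR_ARGS ;;
        13) echo ERROR_INTERRUPTED ;;
        14) echo ERROR_PG_COMMAND ;;
        15) echo ERROR_PG_CONNECT ;;
        20) echo ERROR_ARCHIVE_FAILED ;;
        21) echo ERROR_NO_BACKUP ;;
        22) echo ERROR_CORRUPTED ;;
        23) echo ERROR_ALREADY_RUNNING ;;
        24) echo ERROR_PG_INCOMPATIBLE ;;
        25) echo ERROR_PG_RUNNING ;;
        26) echo ERROR_PID_BROKEN ;;
        *)  echo UNKNOWN ;;   # label made up for codes not in the table
    esac
}
```

After a pg_rman command, a wrapper could log the result with a line such as echo "pg_rman finished with $(exit_code_name $?)".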

External Scripts

This is the script that takes snapshots and mounts file systems. If you want to add your own external script, write it to conform to the external script interface, consulting the manuals of your storage. See Interface Specification for what the script has to implement.

A single execution of the external script performs the operations needed to take several snapshots.

To use an external script, place it in the backup catalog directory and name it “snapshot_script”.

A sample external script is provided for LVM (Logical Volume Manager).

Commands Specification

$ ${BACKUP_PATH}/snapshot_script { split | resync | mount | umount | freeze | unfreeze } [cleanup]

Interface Specification

Explanation of the sample script for LVM (Logical Volume Manager)

Download

pg_rman RPM packages and source code can be downloaded from the project's download page.

Installation

pg_rman can be installed in the same way as standard contrib modules.

It does not need to be registered in a database.

Build from source

The module can be built with pgxs.

$ cd pg_rman
$ make
$ make install

Install from rpm package

Download the rpm whose name contains the PostgreSQL version and OS version of your environment.

# rpm -ivh pg_rman-x.x.xx-x.pgxx.rhelx.x86_64.rpm

Requirements

PostgreSQL
PostgreSQL 12, 13, 14, 15, 16
OS
RHEL 7, 8, 9

See Also

Backup and Restore