Tuesday, November 4, 2008

PostgreSQL Replication and Load-Balancing

The task was to set up a number of PostgreSQL systems that should always have identical data so one can do load-balancing.

I found the tool pgpool-II (http://pgfoundry.org/projects/pgpool/) quite useful for this.
Unluckily Debian GNU/Linux only has Version 1.3 in its repositories (lenny).
I took the sources of Version 2.1 and created a new package.

Simple load-balancing and replication is quite easy and straight forward. But...
what to do if one nodes gets broken or needs the database to be reinitialized?



Even here pgpool-II offers lots of help by making use of the PITR (point in time recovery) functionality of PostgreSQL.

I came up with the following solution:

1. Install first node with postgres and other necessary packages:

apt-get install postgresql-8.3 postgresql-client-8.3
postgresql-client-common postgresql-common postgresql-contrib-8.3

postgresql-server-dev-8.3 make

2. create a path for the PITR archive logs (we need them later for recovery)

mkdir /var/lib/postgresql-archive
chown postgres. /var/lib/postgresql-archive

3. ssh-keys for postgresql user

create ssh-keys without passphrase for postgresql user and distribute the key as authorizedkeys file to all nodes.
Hint: we need to copy some data over the filesystem without interaction or login.

4. make changes to the postgresql configuration file

/etc/postgresql/8.3/main/postgresql.conf

listenaddresses = '*'

archivemode = on

archivecommand = 'test ! -f /var/lib/postgresql-archive/%f && cp %p /var/lib/postgresql-archive/%f'

archive_timeout = 60

5. make changes to the pgpool configuration file

listen_addresses = '*'

replication_mode = true

load_balanced_mode = true

pgpool2_hostname = 'localhost'

recovery_timeout = 90

replication_timeout = 5000

backend_hostname0 = 'postgres-1'

backend_port0 = 5432

backend_weight = 1

backend_data_directory0 = '/var/lib/postgresql/8.3/main/'

backend_hostname1 = 'postgres-2'

backend_port1 = 5432

backend_weight = 1

backend_data_directory1 = '/var/lib/postgresql/8.3/main/

[...]

recovery_user = 'postgres'

recovery_password =

recovery_1st_stage_command = 'copy-base-backup'

recovery_2nd_stage_command = 'pgpool-recovery-pitr'

6. make changes to the configuration file for the pgpool control processor

/etc/pcp.conf

use the comand pg_md5 to generate md5 hashes of passwords.

pg_md5 <password>

add entries to the configuration file:

<username>:<md5 hash of password>

7. create 1st stage backup script in /var/lib/postgresql/8.3/main/copy-base-backup

#!/bin/sh

datadir=$1 DEST=$2 DESTDIR=$3

# switch master to prepare for backup

psql -c "select pg_start_backup('pgpool-recovery')" postgres

# prepare local command for archive fetching from master warning! scp hostname should be different for every system!

echo "restore_command = 'scp postgres-1:/var/lib/postgresql-archive/%f %p'" > /var/lib/postgresql/8.3/main/recovery.conf

# create complete tarball on master

tar
-C /var/lib/postgresql/8.3 -czf main.tar.gz main/global main/base
main/pg_multixact main/pg_subtrans main/pg_clog main/pg_xlog
main/pg_twophase main/pg_tblspc main/recovery.conf
main/backup_label.old

# switch master back to normal operation

psql -c 'select pg_stop_backup()' postgres

# copy tarball to destination

scp main.tar.gz $DEST:/var/lib/postgresql/8.3/

# last line

8. create 2nd stage recovery script /var/lib/postgresql/8.3/main/pgpool_recovery_pitr

#! /bin/sh

psql -c 'select pg_switch_xlog()' postgres

# last line

9. create post restore initialization script /var/lib/postgresql/8.3/main/pgpool_remote_start

#! /bin/sh

if [ $# -ne 2 ]

then

  • echo "pgpool_remote_start remote_host remote_datadir" exit 1

fi

DEST=$1

DESTDIR=$2

PGCTL=/usr/lib/postgresql/8.3/bin/pg_ctl

ssh -T $DEST $PGCTL -w -D /var/lib/postgresql/8.3/main/ stop 2>/dev/null 1>/dev/null < /dev/null

# delete old content

ssh
-T $DEST 'cd /var/lib/postgresql/8.3/; rm -r main/global main/base
main/pg_multixact main/pg_subtrans main/pg_clog main/pg_xlog
main/pg_twophase main/pg_tblspc main/recovery.conf
main/backup_label.old'

# expand the archive on the remote system

ssh -T $DEST 'cd /var/lib/postgresql/8.3/; tar zxf main.tar.gz' 2>/dev/null 1>/dev/null < /dev/null

# restart postgresql on the remote system

ssh -T $DEST $PGCTL -w -D /etc/postgresql/8.3/main/ start 2>/dev/null 1>/dev/null < /dev/null &

# last line

10. prepare databases

log in to first database node.

stop postgresql

cd /var/lib/postgresql/8.3/; tar zxf main.tar.gz main

stop postgresql on all other nodes

copy main.tar.gz to all other nodes

extract main.tar.gz on all other nodes

start postgresql on all nodes

11. install pgpool2 Admin interface (web based)

fetch pgpoolAdmin sources from http://pgfoundry.org/projects/pgpool/

apt-get install apache2 libapache2-mod-php5 php5-pgsql

copy content to /var/www/pgpooladmin

open a browser: http://<nodename>/pgpooladmin/install/

follow the setup description (name proper paths to files, set permissions)

12. prepare system for pgpool in combination with PITR on line recovery

The postgresql on-line recovery makes use of the PITR (point in time recovery).

Postgresql writes all transactions to a log (/var/lib/postgresql-archive) and remembers last made transaction.

To make use of pgpool and on-line recovery the template1 database needs to have the C language function installed.

cd /root/pgpool2-2.1/sql/pgpool-recovery

make install

su - postgres

cd /root/pgpool2-2.1/sql/pgpool-recovery

psql -f pgpool-recovery.sql template1


13. recovery

do never run the recovery on the host that you like to recover!!!!

log in to a functional node and run the following commands:

pcp_detach_node 10 localhost 9898 <username> <password> <number of node to recover>

pcp_recovery_node 10 localhost 9898 <username> <password> <number of node to recover>

afterwards log in to the web-admin interfaces on all nodes and enable the recovered system

or log in to all other nodes and run

pcp_attach_node 10 localhost <username> <password> <number of node to attach>