Backup and Recovery
60 minRegular backups are essential for data protection, enabling recovery from data loss, corruption, or disasters. PostgreSQL provides multiple backup methods, each suited to different scenarios. Understanding backup types, recovery procedures, and testing restore processes is critical for maintaining data availability. A backup strategy is only as good as your ability to restore from it, making regular restore testing essential.
Logical backups (pg_dump) create SQL scripts or custom-format files containing database schema and data. Logical backups are portable across PostgreSQL versions and platforms, making them ideal for migrations and cross-platform backups. pg_dump can backup entire databases, specific schemas, or individual tables. Logical backups are human-readable (SQL format) or compressed (custom format), and can be restored with psql or pg_restore. Logical backups are suitable for small to medium databases.
Physical backups copy the actual database files from the filesystem. Physical backups are faster than logical backups for large databases and enable point-in-time recovery (PITR). pg_basebackup creates physical backups by copying data files while the database is running. Physical backups require the database to be in the same PostgreSQL version and are typically used with WAL archiving for continuous backup and PITR capabilities.
Point-in-time recovery (PITR) enables restoring a database to any specific moment in time, not just when a backup was taken. PITR requires a base backup (physical backup) and archived WAL files. When you restore, PostgreSQL replays WAL files up to the specified recovery target time. This enables recovery from accidental data deletion or corruption that occurred after the last backup. PITR is essential for databases requiring minimal data loss.
WAL archiving is required for PITR. PostgreSQL writes WAL (Write-Ahead Log) files that record all database changes. By archiving these files (copying them to a safe location), you can replay them during recovery. The archive_command in postgresql.conf specifies how to archive WAL files. Proper WAL archiving ensures you can recover to any point in time, limited only by how long you retain archived WAL files.
Backup best practices include automating backups, storing backups off-site, testing restore procedures regularly, monitoring backup success, and maintaining multiple backup copies. Backup retention policies should balance storage costs with recovery requirements. Understanding backup and recovery procedures enables you to protect your data and meet recovery time objectives (RTO) and recovery point objectives (RPO) for your applications.
Key Concepts
- Regular backups are essential for data protection and disaster recovery.
- Logical backups (pg_dump) create portable SQL or custom-format files.
- Physical backups copy database files and enable point-in-time recovery.
- Point-in-time recovery requires base backups and archived WAL files.
- WAL archiving enables recovery to any point in time.
Learning Objectives
Master
- Performing logical backups with pg_dump
- Creating physical backups with pg_basebackup
- Configuring WAL archiving for point-in-time recovery
- Restoring databases from backups
Develop
- Understanding backup and recovery strategies
- Designing disaster recovery procedures
- Implementing data protection best practices
Tips
- Use pg_dump for logical backups: pg_dump -Fc for custom format (compressed).
- Enable WAL archiving: archive_mode = on in postgresql.conf.
- Test restore procedures regularly to ensure backups are valid.
- Use pg_basebackup for physical backups: faster for large databases.
Common Pitfalls
- Not testing restore procedures, discovering backups are invalid when needed.
- Not archiving WAL files, unable to perform point-in-time recovery.
- Storing backups on same server, losing backups if server fails.
- Not automating backups, missing backups due to human error.
Summary
- Regular backups are essential for data protection.
- Logical and physical backups serve different use cases.
- Point-in-time recovery requires WAL archiving.
- Testing restore procedures ensures backup validity.
Exercise
Perform logical and physical backups and test recovery procedures.
-- Logical backup (pg_dump)
pg_dump -h localhost -U postgres -d my_database > backup.sql
-- Logical backup with custom format
pg_dump -h localhost -U postgres -d my_database -Fc > backup.dump
-- Physical backup (requires superuser)
-- First, enable WAL archiving in postgresql.conf
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/archive/%f'
-- Create base backup
pg_basebackup -h localhost -U postgres -D /var/lib/postgresql/backup -Ft -z -P
-- Restore logical backup
psql -h localhost -U postgres -d my_database < backup.sql
-- Restore custom format backup
pg_restore -h localhost -U postgres -d my_database backup.dump
-- Point-in-time recovery
-- Create recovery.conf file
restore_command = 'cp /var/lib/postgresql/archive/%f %p'
recovery_target_time = '2024-01-15 14:30:00'
recovery_target_action = 'promote'
Exercise Tips
- Automate backups: use cron or pg_cron for scheduled backups.
- Test PITR: practice restoring to different points in time.
- Monitor backup sizes and durations to detect issues early.
- Use pg_verifybackup to verify physical backup integrity.