WalFetch.com helps database engineers configure wal fetch, WAL-G, and PostgreSQL continuous archiving.

Wal Fetch S3: Optimizing WAL Recovery Speed from Amazon S3

While wal fetch from Amazon S3 works reliably out of the box, many users encounter slow WAL restoration speeds during large database recoveries. Understanding WAL-G's prefetch mechanism and concurrency settings is key to achieving maximum S3 download throughput during PostgreSQL recovery.

The most common cause of slow wal-fetch from S3 is downloading WAL segments one at a time, which rarely saturates the available bandwidth. WALG_DOWNLOAD_CONCURRENCY controls how many parallel downloads WAL-G uses during backup-fetch and wal-fetch (by default, up to 10). Tune it to match your network bandwidth and the request rate your S3 bucket can sustain.
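As a minimal sketch of how the concurrency setting might be applied, the Python helpers below build an environment for WAL-G and the argument list for a single wal-fetch call. The function names (walg_env, wal_fetch_command, fetch_segment) are hypothetical, and the sketch assumes the wal-g binary is on PATH when fetch_segment is actually run.

```python
import os
import subprocess


def walg_env(s3_prefix, concurrency, base_env=None):
    """Return a copy of the environment with WAL-G download settings applied."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    env = dict(os.environ if base_env is None else base_env)
    env["WALG_S3_PREFIX"] = s3_prefix
    env["WALG_DOWNLOAD_CONCURRENCY"] = str(concurrency)
    return env


def wal_fetch_command(wal_name, destination):
    """Build the argument list for `wal-g wal-fetch <wal_name> <destination>`."""
    return ["wal-g", "wal-fetch", wal_name, destination]


def fetch_segment(wal_name, destination, env):
    """Run wal-fetch and return its exit code (assumes wal-g is on PATH)."""
    return subprocess.run(wal_fetch_command(wal_name, destination), env=env).returncode
```

A wrapper like this could be called from restore_command, where PostgreSQL substitutes %f with the segment name and %p with the destination path. A non-zero exit code tells PostgreSQL the segment could not be restored.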

WAL-G's prefetch feature proactively downloads upcoming WAL segments into the .wal-g/prefetch directory (by default inside PostgreSQL's WAL directory) while the current WAL file is being replayed. This pipelining keeps PostgreSQL's recovery process supplied with files, so replay does not stall waiting for each individual download.
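One way to check that prefetching is active is to list the files in the prefetch directory while recovery runs. The sketch below is illustrative only: prefetched_segments is a hypothetical helper, and the directory is passed in explicitly because its exact location depends on your data directory layout and WAL-G configuration.

```python
from pathlib import Path


def prefetched_segments(prefetch_dir):
    """Return sorted names of regular files directly under the prefetch directory.

    Subdirectories are ignored; a missing directory yields an empty list.
    """
    root = Path(prefetch_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_file())
```

Calling it repeatedly during recovery should show segment names changing as files are prefetched and consumed. A list that stays empty suggests prefetch is not running or is writing elsewhere.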

To configure how many times a failed S3 download is retried, set WALG_DOWNLOAD_FILE_RETRIES (default: 15). Retrying helps keep transient S3 errors from interrupting a long recovery operation.

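For S3-compatible storage backends (MinIO, Ceph RADOS Gateway, Wasabi), set the AWS_ENDPOINT environment variable to point to the custom endpoint (WALE_S3_ENDPOINT is the equivalent setting in the older WAL-E tool). Backends that do not support subdomain-style bucket URLs typically also need AWS_S3_FORCE_PATH_STYLE=true. This lets you use private or alternative S3-compatible storage.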
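The following sketch shows one way to assemble the settings for an S3-compatible backend. It assumes WAL-G reads the endpoint from AWS_ENDPOINT and path-style addressing from AWS_S3_FORCE_PATH_STYLE. The helper name s3_compatible_env and the example endpoint URL are hypothetical.

```python
def s3_compatible_env(endpoint_url, s3_prefix, force_path_style=True):
    """Return WAL-G environment settings for a custom S3-compatible endpoint."""
    if not endpoint_url.startswith(("http://", "https://")):
        raise ValueError("endpoint_url must include an http:// or https:// scheme")
    env = {
        "WALG_S3_PREFIX": s3_prefix,
        "AWS_ENDPOINT": endpoint_url,  # assumed WAL-G setting name for the endpoint
    }
    if force_path_style:
        # Path-style URLs (endpoint/bucket/key) are commonly required by MinIO-like services.
        env["AWS_S3_FORCE_PATH_STYLE"] = "true"
    return env
```

For example, s3_compatible_env("http://minio.internal:9000", "s3://your-bucket/postgres-backups/") yields a dictionary that can be merged into the environment of the process running wal-g.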

Using AWS EC2 Instance Profiles eliminates the need to manage access keys. When running WAL-G on EC2, credentials are automatically rotated by AWS, improving both security and operational simplicity for wal-fetch operations.

“Tuning WALG_DOWNLOAD_CONCURRENCY for wal fetch from S3 can cut PostgreSQL recovery times by 60% or more on high-bandwidth connections.”

Step-by-Step: Configuring Wal Fetch from S3

Follow these steps to implement wal fetch in your PostgreSQL environment effectively.

Step 1

Set S3 Prefix

Configure WALG_S3_PREFIX=s3://your-bucket/postgres-backups/ and ensure AWS credentials are available via environment or instance profile.

Step 2

Enable Prefetching

WAL-G prefetching is enabled by default. During recovery, verify that files appear in the .wal-g/prefetch directory (by default inside PostgreSQL's WAL directory) to confirm it is active.

Step 3

Tune Concurrency

WAL-G uses up to 10 parallel downloads by default. If recovery does not saturate your bandwidth, raise WALG_DOWNLOAD_CONCURRENCY and measure again. Lower it if S3 starts returning throttling errors.

Step 4

Monitor Recovery Speed

Check PostgreSQL logs during recovery to measure WAL fetch throughput. Aim for consistent download speeds without S3 throttling errors.
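As a rough sketch of how throughput could be estimated from the logs, the helper below scans for PostgreSQL's "restored log file ... from archive" messages and divides the restored volume by the elapsed time. It assumes log lines begin with a "YYYY-MM-DD HH:MM:SS" timestamp (as produced by a log_line_prefix starting with %m or %t) and the default 16 MB WAL segment size. The function name restore_throughput is hypothetical.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2024-05-01 10:00:00.000 UTC [123] LOG:  restored log file "000000010000000000000001" from archive
RESTORED = re.compile(
    r'^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(?:\.\d+)? '
    r'.*restored log file "([0-9A-F]{24})" from archive'
)
SEGMENT_MB = 16  # assumption: default wal_segment_size


def restore_throughput(log_lines):
    """Return approximate MB/s between the first and last restored segment, or None."""
    stamps = []
    for line in log_lines:
        match = RESTORED.match(line)
        if match:
            # Fractional seconds are dropped; fine for a coarse estimate.
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    if len(stamps) < 2:
        return None
    elapsed = (stamps[-1] - stamps[0]).total_seconds()
    if elapsed <= 0:
        return None
    # Segments restored after the first timestamp, times segment size, over elapsed time.
    return (len(stamps) - 1) * SEGMENT_MB / elapsed
```

Because the figure includes replay time as well as download time, it is best used to compare runs before and after a configuration change rather than as an absolute measure of S3 bandwidth.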