Sunday, August 31, 2025

PostgreSQL Cascading Replication: Full 4-Node Setup Guide

PostgreSQL Cascading Replication: Full 4-Node Setup Guide

Cascading replication allows PostgreSQL replicas to stream data not only from the primary node but also from other replicas. This setup is ideal for scaling read operations and distributing replication load across multiple servers.

🗂️ Node Architecture and Port Assignment

Node 1 (Primary) — Port 5432 — Accepts read/write operations and streams to Node 2
Node 2 (Standby 1) — Port 5433 — Receives from Node 1 and streams to Node 3.
Node 3 (Standby 2) — Port 5434 — Receives from Node 2 and streams to Node 4.
Node 4 (Standby 3) — Port 5435 — Receives from Node 3.

IP assignments:

Node 1 → 172.19.0.21
Node 2 → 172.19.0.22
Node 3 → 172.19.0.23
Node 4 → 172.19.0.24

⚙️ Prerequisites

PostgreSQL installed on all four nodes (same version recommended).
Network connectivity between nodes.
A replication user (repuser) with REPLICATION privilege created on the primary.
PostgreSQL data directories initialized for each node.

Step 1: Configure the Primary Server (Node 1, Port 5432)

🔹 Modify postgresql.conf

listen_addresses = '*'
port = 5432
wal_level = replica
max_wal_senders = 10
wal_keep_size = 512MB
hot_standby = on

🔹 Update pg_hba.conf to allow replication from Node 2

Generic:
host replication repuser <Node2_IP>/32 trust

Actual:
host replication repuser 172.19.0.22/32 trust

🔹 Reload PostgreSQL Configuration

pg_ctl reload
Or from SQL:
sql
SELECT pg_reload_conf();

🔹 Start Node 1

pg_ctl -D /path/to/node1_data -o '-p 5432' -l logfile start

Actual:
pg_ctl -D /u01/postgres/pgdata_15/data -o '-p 5432' -l logfile start

🔹 Create Replication User

sql
CREATE USER repuser WITH REPLICATION ENCRYPTED PASSWORD 'your_password';

🔹 Verify Node 1 is Not in Recovery

sql
SELECT pg_is_in_recovery();

Step 2: Set Up Standby 1 (Node 2, Port 5433)

🔹 Take a Base Backup from Node 1

Generic:
pg_basebackup -h <Node1_IP> -p 5432 -D /path/to/node2_data -U repuser -Fp -Xs -P -R

Actual:
pg_basebackup -h 172.19.0.21 -p 5432 -D /u01/postgres/pgdata_15/data -U repuser -Fp -Xs -P -R

If you get an error:
pg_basebackup: error: directory "/u01/postgres/pgdata_15/data" exists but is not empty

Clear the directory:
rm -rf /u01/postgres/pgdata_15/data/*

🔹 Modify postgresql.conf on Node 2

listen_addresses = '*'
port = 5433
hot_standby = on
primary_conninfo = 'host=172.19.0.21 port=5432 user=repuser password=your_password application_name=node2'

🔹 Start Node 2

Generic:
pg_ctl -D /path/to/node2_data -o '-p 5433' -l logfile start

Actual:
pg_ctl -D /u01/postgres/pgdata_15/data -o '-p 5433' -l logfile start

🔹 Verify Replication on Node 1

sql
SELECT * FROM pg_stat_replication WHERE application_name = 'node2';

Step 3: Set Up Standby 2 (Node 3, Port 5434)

🔹 Allow Replication from Node 3 in Node 2’s pg_hba.conf

Generic:
host replication repuser <Node3_IP>/32 trust

Actual:
host replication repuser 172.19.0.23/32 trust

🔹 Reload Node 2

pg_ctl -D /u01/postgres/pgdata_15/data reload

🔹 Take a Base Backup from Node 2

Generic:
pg_basebackup -h <Node2_IP> -p 5433 -D /path/to/node3_data -U repuser -Fp -Xs -P -R

Actual:
pg_basebackup -h 172.19.0.22 -p 5433 -D /u01/postgres/pgdata_15/data -U repuser -Fp -Xs -P -R

🔹 Modify postgresql.conf on Node 3

listen_addresses = '*'
port = 5434
hot_standby = on
primary_conninfo = 'host=172.19.0.22 port=5433 user=repuser password=your_password application_name=node3'

🔹 Start Node 3

Generic:
pg_ctl -D /path/to/node3_data -o '-p 5434' -l logfile start

Actual:
pg_ctl -D /u01/postgres/pgdata_15/data -o '-p 5434' -l logfile start

🔹 Verify Replication on Node 2

sql
SELECT * FROM pg_stat_replication;

Step 4: Set Up Standby 3 (Node 4, Port 5435)

🔹 Allow Replication from Node 4 in Node 3’s pg_hba.conf

Generic:
host replication repuser <Node4_IP>/32 trust

Actual:
host replication repuser 172.19.0.24/32 trust

🔹 Reload Node 3

pg_ctl -D /u01/postgres/pgdata_15/data reload

🔹 Take a Base Backup from Node 3

Generic:
pg_basebackup -h <Node3_IP> -p 5434 -D /path/to/node4_data -U repuser -Fp -Xs -P -R

Actual:
pg_basebackup -h 172.19.0.23 -p 5434 -D /u01/postgres/pgdata_15/data -U repuser -Fp -Xs -P -R

🔹 Modify postgresql.conf on Node 4

listen_addresses = '*'
port = 5435
hot_standby = on
primary_conninfo = 'host=172.19.0.23 port=5434 user=repuser password=your_password application_name=node4'

🔹 Start Node 4

Generic:
pg_ctl -D /path/to/node4_data -o '-p 5435' -l logfile start

Actual:
pg_ctl -D /u01/postgres/pgdata_15/data -o '-p 5435' -l logfile start

🔹 Verify Replication on Node 3

sql
SELECT * FROM pg_stat_replication;

✅ Final Verification & Monitoring

🔹 Check Replication Status

On Node 1:
sql
SELECT * FROM pg_stat_replication;

On Nodes 2 and 3:
sql

SELECT * FROM pg_stat_wal_receiver;
SELECT * FROM pg_stat_replication;

On Node 4:

sql
SELECT pg_is_in_recovery();
SELECT pg_wal_lsn_diff(pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn()) AS byte_lag;

🔹 Confirm Listening Ports

bash
sudo netstat -plnt | grep postgres
You should see ports 5432, 5433, 5434, and 5435 active.

🔹 Monitor WAL Activity

sql
SELECT * FROM pg_stat_wal;

🔹 Measure Replication Lag

On Primary:
sql

SELECT application_name, client_addr,pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS byte_lag FROM pg_stat_replication;

No comments:

Post a Comment