10. Failover Scenarios
This section describes potential failover scenarios and how this deployment is expected to react to each of them.
Scenario 1: PostgreSQL master node fails
When node0 (the current PostgreSQL master node) fails, the following actions will occur automatically:
• Pgpool-II runs /etc/pgpool-II/failover.sh.
• PostgreSQL node1 is chosen as the new master.
• The Pgpool-II master logs in remotely to node1 and promotes it to master.
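The decision behind these automatic steps can be sketched as follows. This is a hypothetical illustration, not the contents of this deployment's failover.sh, which are not shown in this document. It assumes pgpool.conf passes the failed node id (%d), the old primary node id (%P), and the new main node host name (%H) to the script, and it only prints the action it would take.

```shell
#!/bin/sh
# Hypothetical sketch only; not the deployment's actual failover.sh.
# Assumes pgpool.conf maps the arguments as:
#   failover_command = '/etc/pgpool-II/failover.sh %d %P %H'
# %d = failed node id, %P = old primary node id, %H = new main node host name.

# Print the action a failover script would take for a given failure.
decide_failover_action() {
    failed_node_id="$1"
    old_primary_id="$2"
    new_main_host="$3"

    if [ "$failed_node_id" = "$old_primary_id" ]; then
        # The failed node was the master: a standby must be promoted.
        echo "promote ${new_main_host}"
    else
        # A standby failed: the master keeps running, nothing to promote.
        echo "keep-master"
    fi
}

# Scenario 1: node0 (the master) fails and node1 is the promotion candidate.
decide_failover_action 0 0 node1
# A standby (node2) fails while node0 remains master.
decide_failover_action 2 0 node0
```

In a real script the "promote" branch would log in to the new master over SSH and run the promotion there; that part is left out because the remote user and paths are not given in this document.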
Expected manual intervention
Actions to be performed by the system administrator:
• Review logs on node0 to determine server health.
• Repair server as necessary.
• Once it is deemed healthy, then:
◦ On node0, run the start_replication script to have node0 follow node2 (the IP address 10.91.9.41 is the IP of node2 in this example):
/db/bin/start_replication.sh 10.91.9.41
◦ Start the PostgreSQL service on node0
sudo systemctl start postgresql-10.x
◦ Use the PCP command to add node0 back to normal status.
pcp_attach_node -h /var/run/pgpoolpcp -n 0
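The rejoin steps above follow the same pattern in every scenario, so they can be collected into a small dry-run helper. The sketch below only prints the commands from this section in order; the function name print_rejoin_commands is hypothetical, and nothing is executed against the cluster.

```shell
#!/bin/sh
# Dry-run sketch: prints the rejoin commands from this section instead of
# running them. print_rejoin_commands is a hypothetical helper name.
print_rejoin_commands() {
    node_id="$1"       # Pgpool-II backend node id being re-attached (0, 1 or 2)
    upstream_ip="$2"   # IP address of the node to replicate from

    # Step 1: point the repaired node at its upstream.
    echo "/db/bin/start_replication.sh ${upstream_ip}"
    # Step 2: start PostgreSQL on the repaired node.
    echo "sudo systemctl start postgresql-10.x"
    # Step 3: tell Pgpool-II the node is available again.
    echo "pcp_attach_node -h /var/run/pgpoolpcp -n ${node_id}"
}

# Scenario 1: node0 rejoins and follows node2 (10.91.9.41).
print_rejoin_commands 0 10.91.9.41
```

Printing rather than executing lets the administrator review the exact commands, and the order matters: replication is configured before the service starts, and the node is attached only once PostgreSQL is running.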
Scenario 2: PostgreSQL first standby node fails
When node1 (the current PostgreSQL primary standby node) fails, the following actions will occur automatically:
• Pgpool-II runs /etc/pgpool-II/failover.sh.
• PostgreSQL node2 is chosen as the primary standby node for the master node (node0 in this example).
• Pgpool-II logs in remotely to node2 and retargets it to node0.
• node2 is now the standby node for node0.
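On PostgreSQL 10, retargeting a standby usually comes down to rewriting primary_conninfo in recovery.conf and restarting the standby. The sketch below shows what such a file could look like. It is an assumption about the mechanism, not the contents of this deployment's scripts; the replication user "replicator", port 5432, and the helper name render_recovery_conf are all hypothetical.

```shell
#!/bin/sh
# Sketch of retargeting a PostgreSQL 10 standby by regenerating recovery.conf.
# Assumptions: replication user "replicator" and port 5432 are placeholders.
render_recovery_conf() {
    upstream_host="$1"   # host or IP address the standby should follow
    cat <<EOF
standby_mode = 'on'
primary_conninfo = 'host=${upstream_host} port=5432 user=replicator'
recovery_target_timeline = 'latest'
EOF
}

# 192.0.2.10 is a placeholder address; this section does not give node0's IP.
render_recovery_conf 192.0.2.10
```

The output would be written to recovery.conf in the standby's data directory, and the standby restarted so that it picks up the new upstream. Setting recovery_target_timeline to 'latest' lets the standby follow a timeline change after a promotion.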
Expected manual intervention
Actions to be performed by the system administrator:
• Review logs on node1 to determine server health.
• Repair server as necessary.
• Once it is deemed healthy, then:
◦ On node1, run the start_replication script to have node1 follow node2 (the IP address 10.91.9.41 is the IP of node2 in this example):
/db/bin/start_replication.sh 10.91.9.41
◦ Start the PostgreSQL service on node1
sudo systemctl start postgresql-10.x
◦ Use the PCP command to add node1 back to normal status.
pcp_attach_node -h /var/run/pgpoolpcp -n 1
Scenario 3: PostgreSQL second standby node fails
When node2 (the current PostgreSQL secondary standby node) fails, the following actions will occur automatically:
• Pgpool-II runs /etc/pgpool-II/failover.sh.
• No further action is taken, because the loss of node2 does not affect the current master and standby operation.
Expected manual intervention
Actions to be performed by the system administrator:
• Review logs on node2 to determine server health.
• Repair server as necessary.
• Once it is deemed healthy, then:
◦ On node2, run the start_replication script to have node2 resynchronize with node1 (the IP address 10.91.9.24 is the IP of node1 in this example):
/db/bin/start_replication.sh 10.91.9.24
◦ Start the PostgreSQL service on node2
sudo systemctl start postgresql-10.x
◦ Use the PCP command to add node2 back to normal status.
pcp_attach_node -h /var/run/pgpoolpcp -n 2
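Across the three scenarios in this section, the repaired node always rejoins behind a specific upstream: node0 and node1 follow node2, and node2 follows node1. A small lookup makes that mapping explicit. The helper name rejoin_upstream is hypothetical, and the addresses are the example IPs used in this section, so they would need to be adjusted for any other environment.

```shell
#!/bin/sh
# Lookup of the rejoin upstream used in this section's examples.
# rejoin_upstream is a hypothetical helper; the addresses are example IPs.
rejoin_upstream() {
    case "$1" in
        0|1) echo "10.91.9.41" ;;  # node0 and node1 follow node2
        2)   echo "10.91.9.24" ;;  # node2 follows node1
        *)   echo "unknown node id: $1" >&2; return 1 ;;
    esac
}

# Scenario 3: node2 resynchronizes with node1.
rejoin_upstream 2
```

The result can be passed straight to /db/bin/start_replication.sh on the repaired node. An unknown node id returns a non-zero status so that a calling script can stop before touching replication.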