Current File : //usr/share/doc/postgresql-9.2.24/html/warm-standby.html |
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML
><HEAD
><TITLE
>Log-Shipping Standby Servers</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK
REV="MADE"
HREF="mailto:pgsql-docs@postgresql.org"><LINK
REL="HOME"
TITLE="PostgreSQL 9.2.24 Documentation"
HREF="index.html"><LINK
REL="UP"
TITLE="High Availability, Load Balancing, and Replication"
HREF="high-availability.html"><LINK
REL="PREVIOUS"
TITLE="Comparison of Different Solutions"
HREF="different-replication-solutions.html"><LINK
REL="NEXT"
TITLE="Failover"
HREF="warm-standby-failover.html"><LINK
REL="STYLESHEET"
TYPE="text/css"
HREF="stylesheet.css"><META
HTTP-EQUIV="Content-Type"
CONTENT="text/html; charset=ISO-8859-1"><META
NAME="creation"
CONTENT="2017-11-06T22:43:11"></HEAD
><BODY
CLASS="SECT1"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="5"
ALIGN="center"
VALIGN="bottom"
><A
HREF="index.html"
>PostgreSQL 9.2.24 Documentation</A
></TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="top"
><A
TITLE="Comparison of Different Solutions"
HREF="different-replication-solutions.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="top"
><A
HREF="high-availability.html"
ACCESSKEY="U"
>Up</A
></TD
><TD
WIDTH="60%"
ALIGN="center"
VALIGN="bottom"
>Chapter 25. High Availability, Load Balancing, and Replication</TD
><TD
WIDTH="20%"
ALIGN="right"
VALIGN="top"
><A
TITLE="Failover"
HREF="warm-standby-failover.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="SECT1"
><H1
CLASS="SECT1"
><A
NAME="WARM-STANDBY"
>25.2. Log-Shipping Standby Servers</A
></H1
><P
> Continuous archiving can be used to create a <I
CLASS="FIRSTTERM"
>high
availability</I
> (HA) cluster configuration with one or more
<I
CLASS="FIRSTTERM"
>standby servers</I
> ready to take over operations if the
primary server fails. This capability is widely referred to as
<I
CLASS="FIRSTTERM"
>warm standby</I
> or <I
CLASS="FIRSTTERM"
>log shipping</I
>.
</P
><P
> The primary and standby server work together to provide this capability,
though the servers are only loosely coupled. The primary server operates
in continuous archiving mode, while each standby server operates in
continuous recovery mode, reading the WAL files from the primary. No
changes to the database tables are required to enable this capability,
so it offers low administration overhead compared to some other
replication solutions. This configuration also has relatively low
performance impact on the primary server.
</P
><P
> Directly moving WAL records from one database server to another
is typically described as log shipping. <SPAN
CLASS="PRODUCTNAME"
>PostgreSQL</SPAN
>
implements file-based log shipping by transferring WAL records
one file (WAL segment) at a time. WAL files (16MB) can be
shipped easily and cheaply over any distance, whether it be to an
adjacent system, another system at the same site, or another system on
the far side of the globe. The bandwidth required for this technique
varies according to the transaction rate of the primary server.
Record-based log shipping is more granular and streams WAL changes
incrementally over a network connection (see <A
HREF="warm-standby.html#STREAMING-REPLICATION"
>Section 25.2.5</A
>).
</P
><P
> It should be noted that log shipping is asynchronous, i.e., the WAL
records are shipped after transaction commit. As a result, there is a
window for data loss should the primary server suffer a catastrophic
failure; transactions not yet shipped will be lost. The size of the
data loss window in file-based log shipping can be limited by use of the
<TT
CLASS="VARNAME"
>archive_timeout</TT
> parameter, which can be set as low
as a few seconds. However such a low setting will
substantially increase the bandwidth required for file shipping.
Streaming replication (see <A
HREF="warm-standby.html#STREAMING-REPLICATION"
>Section 25.2.5</A
>)
allows a much smaller window of data loss.
</P
><P
> Recovery performance is sufficiently good that the standby will
typically be only moments away from full
availability once it has been activated. As a result, this is called
a warm standby configuration which offers high
availability. Restoring a server from an archived base backup and
rollforward will take considerably longer, so that technique only
offers a solution for disaster recovery, not high availability.
A standby server can also be used for read-only queries, in which case
it is called a Hot Standby server. See <A
HREF="hot-standby.html"
>Section 25.5</A
> for
more information.
</P
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="STANDBY-PLANNING"
>25.2.1. Planning</A
></H2
><P
> It is usually wise to create the primary and standby servers
so that they are as similar as possible, at least from the
perspective of the database server. In particular, the path names
associated with tablespaces will be passed across unmodified, so both
primary and standby servers must have the same mount paths for
tablespaces if that feature is used. Keep in mind that if
<A
HREF="sql-createtablespace.html"
>CREATE TABLESPACE</A
>
is executed on the primary, any new mount point needed for it must
be created on the primary and all standby servers before the command
is executed. Hardware need not be exactly the same, but experience shows
that maintaining two identical systems is easier than maintaining two
dissimilar ones over the lifetime of the application and system.
In any case the hardware architecture must be the same — shipping
from, say, a 32-bit to a 64-bit system will not work.
</P
><P
> In general, log shipping between servers running different major
<SPAN
CLASS="PRODUCTNAME"
>PostgreSQL</SPAN
> release
levels is not possible. It is the policy of the PostgreSQL Global
Development Group not to make changes to disk formats during minor release
upgrades, so it is likely that running different minor release levels
on primary and standby servers will work successfully. However, no
formal support for that is offered and you are advised to keep primary
and standby servers at the same release level as much as possible.
When updating to a new minor release, the safest policy is to update
the standby servers first — a new minor release is more likely
to be able to read WAL files from a previous minor release than vice
versa.
</P
></DIV
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="STANDBY-SERVER-OPERATION"
>25.2.2. Standby Server Operation</A
></H2
><P
> In standby mode, the server continuously applies WAL received from the
master server. The standby server can read WAL from a WAL archive
(see <A
HREF="archive-recovery-settings.html#RESTORE-COMMAND"
>restore_command</A
>) or directly from the master
over a TCP connection (streaming replication). The standby server will
also attempt to restore any WAL found in the standby cluster's
<TT
CLASS="FILENAME"
>pg_xlog</TT
> directory. That typically happens after a server
restart, when the standby replays again WAL that was streamed from the
master before the restart, but you can also manually copy files to
<TT
CLASS="FILENAME"
>pg_xlog</TT
> at any time to have them replayed.
</P
><P
> At startup, the standby begins by restoring all WAL available in the
archive location, calling <TT
CLASS="VARNAME"
>restore_command</TT
>. Once it
reaches the end of WAL available there and <TT
CLASS="VARNAME"
>restore_command</TT
>
fails, it tries to restore any WAL available in the <TT
CLASS="FILENAME"
>pg_xlog</TT
> directory.
If that fails, and streaming replication has been configured, the
standby tries to connect to the primary server and start streaming WAL
from the last valid record found in archive or <TT
CLASS="FILENAME"
>pg_xlog</TT
>. If that fails
or streaming replication is not configured, or if the connection is
later disconnected, the standby goes back to step 1 and tries to
restore the file from the archive again. This loop of retries from the
archive, <TT
CLASS="FILENAME"
>pg_xlog</TT
>, and via streaming replication goes on until the server
is stopped or failover is triggered by a trigger file.
</P
><P
> Standby mode is exited and the server switches to normal operation
when <TT
CLASS="COMMAND"
>pg_ctl promote</TT
> is run or a trigger file is found
(<TT
CLASS="VARNAME"
>trigger_file</TT
>). Before failover,
any WAL immediately available in the archive or in <TT
CLASS="FILENAME"
>pg_xlog</TT
> will be
restored, but no attempt is made to connect to the master.
</P
></DIV
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="PREPARING-MASTER-FOR-STANDBY"
>25.2.3. Preparing the Master for Standby Servers</A
></H2
><P
> Set up continuous archiving on the primary to an archive directory
accessible from the standby, as described
in <A
HREF="continuous-archiving.html"
>Section 24.3</A
>. The archive location should be
accessible from the standby even when the master is down, i.e. it should
reside on the standby server itself or another trusted server, not on
the master server.
</P
><P
> If you want to use streaming replication, set up authentication on the
primary server to allow replication connections from the standby
server(s); that is, create a role and provide a suitable entry or
entries in <TT
CLASS="FILENAME"
>pg_hba.conf</TT
> with the database field set to
<TT
CLASS="LITERAL"
>replication</TT
>. Also ensure <TT
CLASS="VARNAME"
>max_wal_senders</TT
> is set
to a sufficiently large value in the configuration file of the primary
server.
</P
><P
> Take a base backup as described in <A
HREF="continuous-archiving.html#BACKUP-BASE-BACKUP"
>Section 24.3.2</A
>
to bootstrap the standby server.
</P
></DIV
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="STANDBY-SERVER-SETUP"
>25.2.4. Setting Up a Standby Server</A
></H2
><P
> To set up the standby server, restore the base backup taken from primary
server (see <A
HREF="continuous-archiving.html#BACKUP-PITR-RECOVERY"
>Section 24.3.4</A
>). Create a recovery
command file <TT
CLASS="FILENAME"
>recovery.conf</TT
> in the standby's cluster data
directory, and turn on <TT
CLASS="VARNAME"
>standby_mode</TT
>. Set
<TT
CLASS="VARNAME"
>restore_command</TT
> to a simple command to copy files from
the WAL archive. If you plan to have multiple standby servers for high
availability purposes, set <TT
CLASS="VARNAME"
>recovery_target_timeline</TT
> to
<TT
CLASS="LITERAL"
>latest</TT
>, to make the standby server follow the timeline change
that occurs at failover to another standby.
</P
><DIV
CLASS="NOTE"
><BLOCKQUOTE
CLASS="NOTE"
><P
><B
>Note: </B
> Do not use pg_standby or similar tools with the built-in standby mode
described here. <TT
CLASS="VARNAME"
>restore_command</TT
> should return immediately
if the file does not exist; the server will retry the command again if
necessary. See <A
HREF="log-shipping-alternative.html"
>Section 25.4</A
>
for using tools like pg_standby.
</P
></BLOCKQUOTE
></DIV
><P
> If you want to use streaming replication, fill in
<TT
CLASS="VARNAME"
>primary_conninfo</TT
> with a libpq connection string, including
the host name (or IP address) and any additional details needed to
connect to the primary server. If the primary needs a password for
authentication, the password needs to be specified in
<TT
CLASS="VARNAME"
>primary_conninfo</TT
> as well.
</P
><P
> If you're setting up the standby server for high availability purposes,
set up WAL archiving, connections and authentication like the primary
server, because the standby server will work as a primary server after
failover.
</P
><P
> If you're using a WAL archive, its size can be minimized using the <A
HREF="archive-recovery-settings.html#ARCHIVE-CLEANUP-COMMAND"
>archive_cleanup_command</A
> parameter to remove files that are no
longer required by the standby server.
The <SPAN
CLASS="APPLICATION"
>pg_archivecleanup</SPAN
> utility is designed specifically to
be used with <TT
CLASS="VARNAME"
>archive_cleanup_command</TT
> in typical single-standby
configurations, see <A
HREF="pgarchivecleanup.html"
><SPAN
CLASS="APPLICATION"
>pg_archivecleanup</SPAN
></A
>.
Note however, that if you're using the archive for backup purposes, you
need to retain files needed to recover from at least the latest base
backup, even if they're no longer needed by the standby.
</P
><P
> A simple example of a <TT
CLASS="FILENAME"
>recovery.conf</TT
> is:
</P><PRE
CLASS="PROGRAMLISTING"
>standby_mode = 'on'
primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'
restore_command = 'cp /path/to/archive/%f %p'
archive_cleanup_command = 'pg_archivecleanup /path/to/archive %r'</PRE
><P>
</P
><P
> You can have any number of standby servers, but if you use streaming
replication, make sure you set <TT
CLASS="VARNAME"
>max_wal_senders</TT
> high enough in
the primary to allow them to be connected simultaneously.
</P
></DIV
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="STREAMING-REPLICATION"
>25.2.5. Streaming Replication</A
></H2
><P
> Streaming replication allows a standby server to stay more up-to-date
than is possible with file-based log shipping. The standby connects
to the primary, which streams WAL records to the standby as they're
generated, without waiting for the WAL file to be filled.
</P
><P
> Streaming replication is asynchronous by default
(see <A
HREF="warm-standby.html#SYNCHRONOUS-REPLICATION"
>Section 25.2.7</A
>), in which case there is
a small delay between committing a transaction in the primary and the
changes becoming visible in the standby. This delay is however much
smaller than with file-based log shipping, typically under one second
assuming the standby is powerful enough to keep up with the load. With
streaming replication, <TT
CLASS="VARNAME"
>archive_timeout</TT
> is not required to
reduce the data loss window.
</P
><P
> If you use streaming replication without file-based continuous
archiving, you have to set <TT
CLASS="VARNAME"
>wal_keep_segments</TT
> in the master
to a value high enough to ensure that old WAL segments are not recycled
too early, while the standby might still need them to catch up. If the
standby falls behind too much, it needs to be reinitialized from a new
base backup. If you set up a WAL archive that's accessible from the
standby, <TT
CLASS="VARNAME"
>wal_keep_segments</TT
> is not required as the standby can always
use the archive to catch up.
</P
><P
> To use streaming replication, set up a file-based log-shipping standby
server as described in <A
HREF="warm-standby.html"
>Section 25.2</A
>. The step that
turns a file-based log-shipping standby into streaming replication
standby is setting <TT
CLASS="VARNAME"
>primary_conninfo</TT
> setting in the
<TT
CLASS="FILENAME"
>recovery.conf</TT
> file to point to the primary server. Set
<A
HREF="runtime-config-connection.html#GUC-LISTEN-ADDRESSES"
>listen_addresses</A
> and authentication options
(see <TT
CLASS="FILENAME"
>pg_hba.conf</TT
>) on the primary so that the standby server
can connect to the <TT
CLASS="LITERAL"
>replication</TT
> pseudo-database on the primary
server (see <A
HREF="warm-standby.html#STREAMING-REPLICATION-AUTHENTICATION"
>Section 25.2.5.1</A
>).
</P
><P
> On systems that support the keepalive socket option, setting
<A
HREF="runtime-config-connection.html#GUC-TCP-KEEPALIVES-IDLE"
>tcp_keepalives_idle</A
>,
<A
HREF="runtime-config-connection.html#GUC-TCP-KEEPALIVES-INTERVAL"
>tcp_keepalives_interval</A
> and
<A
HREF="runtime-config-connection.html#GUC-TCP-KEEPALIVES-COUNT"
>tcp_keepalives_count</A
> helps the primary promptly
notice a broken connection.
</P
><P
> Set the maximum number of concurrent connections from the standby servers
(see <A
HREF="runtime-config-replication.html#GUC-MAX-WAL-SENDERS"
>max_wal_senders</A
> for details).
</P
><P
> When the standby is started and <TT
CLASS="VARNAME"
>primary_conninfo</TT
> is set
correctly, the standby will connect to the primary after replaying all
WAL files available in the archive. If the connection is established
successfully, you will see a walreceiver process in the standby, and
a corresponding walsender process in the primary.
</P
><DIV
CLASS="SECT3"
><H3
CLASS="SECT3"
><A
NAME="STREAMING-REPLICATION-AUTHENTICATION"
>25.2.5.1. Authentication</A
></H3
><P
> It is very important that the access privileges for replication be set up
so that only trusted users can read the WAL stream, because it is
easy to extract privileged information from it. Standby servers must
authenticate to the primary as a superuser or an account that has the
<TT
CLASS="LITERAL"
>REPLICATION</TT
> privilege. It is recommended to create a
dedicated user account with <TT
CLASS="LITERAL"
>REPLICATION</TT
> and <TT
CLASS="LITERAL"
>LOGIN</TT
>
privileges for replication. While <TT
CLASS="LITERAL"
>REPLICATION</TT
> privilege gives
very high permissions, it does not allow the user to modify any data on
the primary system, which the <TT
CLASS="LITERAL"
>SUPERUSER</TT
> privilege does.
</P
><P
> Client authentication for replication is controlled by a
<TT
CLASS="FILENAME"
>pg_hba.conf</TT
> record specifying <TT
CLASS="LITERAL"
>replication</TT
> in the
<TT
CLASS="REPLACEABLE"
><I
>database</I
></TT
> field. For example, if the standby is running on
host IP <TT
CLASS="LITERAL"
>192.168.1.100</TT
> and the account name for replication
is <TT
CLASS="LITERAL"
>foo</TT
>, the administrator can add the following line to the
<TT
CLASS="FILENAME"
>pg_hba.conf</TT
> file on the primary:
</P><PRE
CLASS="PROGRAMLISTING"
># Allow the user "foo" from host 192.168.1.100 to connect to the primary
# as a replication standby if the user's password is correctly supplied.
#
# TYPE DATABASE USER ADDRESS METHOD
host replication foo 192.168.1.100/32 md5</PRE
><P>
</P
><P
> The host name and port number of the primary, connection user name,
and password are specified in the <TT
CLASS="FILENAME"
>recovery.conf</TT
> file.
The password can also be set in the <TT
CLASS="FILENAME"
>~/.pgpass</TT
> file on the
standby (specify <TT
CLASS="LITERAL"
>replication</TT
> in the <TT
CLASS="REPLACEABLE"
><I
>database</I
></TT
>
field).
For example, if the primary is running on host IP <TT
CLASS="LITERAL"
>192.168.1.50</TT
>,
port <TT
CLASS="LITERAL"
>5432</TT
>, the account name for replication is
<TT
CLASS="LITERAL"
>foo</TT
>, and the password is <TT
CLASS="LITERAL"
>foopass</TT
>, the administrator
can add the following line to the <TT
CLASS="FILENAME"
>recovery.conf</TT
> file on the
standby:
</P><PRE
CLASS="PROGRAMLISTING"
># The standby connects to the primary that is running on host 192.168.1.50
# and port 5432 as the user "foo" whose password is "foopass".
primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'</PRE
><P>
</P
></DIV
><DIV
CLASS="SECT3"
><H3
CLASS="SECT3"
><A
NAME="STREAMING-REPLICATION-MONITORING"
>25.2.5.2. Monitoring</A
></H3
><P
> An important health indicator of streaming replication is the amount
of WAL records generated in the primary, but not yet applied in the
standby. You can calculate this lag by comparing the current WAL write
location on the primary with the last WAL location received by the
standby. They can be retrieved using
<CODE
CLASS="FUNCTION"
>pg_current_xlog_location</CODE
> on the primary and the
<CODE
CLASS="FUNCTION"
>pg_last_xlog_receive_location</CODE
> on the standby,
respectively (see <A
HREF="functions-admin.html#FUNCTIONS-ADMIN-BACKUP-TABLE"
>Table 9-60</A
> and
<A
HREF="functions-admin.html#FUNCTIONS-RECOVERY-INFO-TABLE"
>Table 9-61</A
> for details).
The last WAL receive location in the standby is also displayed in the
process status of the WAL receiver process, displayed using the
<TT
CLASS="COMMAND"
>ps</TT
> command (see <A
HREF="monitoring-ps.html"
>Section 27.1</A
> for details).
</P
><P
> You can retrieve a list of WAL sender processes via the
<A
HREF="monitoring-stats.html#MONITORING-STATS-VIEWS-TABLE"
> <TT
CLASS="LITERAL"
>pg_stat_replication</TT
></A
> view. Large differences between
<CODE
CLASS="FUNCTION"
>pg_current_xlog_location</CODE
> and <TT
CLASS="LITERAL"
>sent_location</TT
> field
might indicate that the master server is under heavy load, while
differences between <TT
CLASS="LITERAL"
>sent_location</TT
> and
<CODE
CLASS="FUNCTION"
>pg_last_xlog_receive_location</CODE
> on the standby might indicate
network delay, or that the standby is under heavy load.
</P
></DIV
></DIV
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="CASCADING-REPLICATION"
>25.2.6. Cascading Replication</A
></H2
><P
> The cascading replication feature allows a standby server to accept replication
connections and stream WAL records to other standbys, acting as a relay.
This can be used to reduce the number of direct connections to the master
and also to minimize inter-site bandwidth overheads.
</P
><P
> A standby acting as both a receiver and a sender is known as a cascading
standby. Standbys that are more directly connected to the master are known
as upstream servers, while those standby servers further away are downstream
servers. Cascading replication does not place limits on the number or
arrangement of downstream servers, though each standby connects to only
one upstream server which eventually links to a single master/primary
server.
</P
><P
> A cascading standby sends not only WAL records received from the
master but also those restored from the archive. So even if the replication
connection in some upstream connection is terminated, streaming replication
continues downstream for as long as new WAL records are available.
</P
><P
> Cascading replication is currently asynchronous. Synchronous replication
(see <A
HREF="warm-standby.html#SYNCHRONOUS-REPLICATION"
>Section 25.2.7</A
>) settings have no effect on
cascading replication at present.
</P
><P
> Hot Standby feedback propagates upstream, whatever the cascaded arrangement.
</P
><P
> Promoting a cascading standby terminates the immediate downstream replication
connections which it serves. This is because the timeline becomes different
between standbys, and they can no longer continue replication. The
affected standby(s) may reconnect to reestablish streaming replication.
</P
><P
> To use cascading replication, set up the cascading standby so that it can
accept replication connections (that is, set
<A
HREF="runtime-config-replication.html#GUC-MAX-WAL-SENDERS"
>max_wal_senders</A
> and <A
HREF="runtime-config-replication.html#GUC-HOT-STANDBY"
>hot_standby</A
>,
and configure
<A
HREF="auth-pg-hba-conf.html"
>host-based authentication</A
>).
You will also need to set <TT
CLASS="VARNAME"
>primary_conninfo</TT
> in the downstream
standby to point to the cascading standby.
</P
></DIV
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="SYNCHRONOUS-REPLICATION"
>25.2.7. Synchronous Replication</A
></H2
><P
> <SPAN
CLASS="PRODUCTNAME"
>PostgreSQL</SPAN
> streaming replication is asynchronous by
default. If the primary server
crashes then some transactions that were committed may not have been
replicated to the standby server, causing data loss. The amount
of data loss is proportional to the replication delay at the time of
failover.
</P
><P
> Synchronous replication offers the ability to confirm that all changes
made by a transaction have been transferred to one synchronous standby
server. This extends the standard level of durability
offered by a transaction commit. This level of protection is referred
to as 2-safe replication in computer science theory.
</P
><P
> When requesting synchronous replication, each commit of a
write transaction will wait until confirmation is
received that the commit has been written to the transaction log on disk
of both the primary and standby server. The only possibility that data
can be lost is if both the primary and the standby suffer crashes at the
same time. This can provide a much higher level of durability, though only
if the sysadmin is cautious about the placement and management of the two
servers. Waiting for confirmation increases the user's confidence that the
changes will not be lost in the event of server crashes but it also
necessarily increases the response time for the requesting transaction.
The minimum wait time is the roundtrip time between primary to standby.
</P
><P
> Read only transactions and transaction rollbacks need not wait for
replies from standby servers. Subtransaction commits do not wait for
responses from standby servers, only top-level commits. Long
running actions such as data loading or index building do not wait
until the very final commit message. All two-phase commit actions
require commit waits, including both prepare and commit.
</P
><DIV
CLASS="SECT3"
><H3
CLASS="SECT3"
><A
NAME="SYNCHRONOUS-REPLICATION-CONFIG"
>25.2.7.1. Basic Configuration</A
></H3
><P
> Once streaming replication has been configured, configuring synchronous
replication requires only one additional configuration step:
<A
HREF="runtime-config-replication.html#GUC-SYNCHRONOUS-STANDBY-NAMES"
>synchronous_standby_names</A
> must be set to
a non-empty value. <TT
CLASS="VARNAME"
>synchronous_commit</TT
> must also be set to
<TT
CLASS="LITERAL"
>on</TT
>, but since this is the default value, typically no change is
required. (See <A
HREF="runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS"
>Section 18.5.1</A
> and
<A
HREF="runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-MASTER"
>Section 18.6.2</A
>.)
This configuration will cause each commit to wait for
confirmation that the standby has written the commit record to durable
storage.
<TT
CLASS="VARNAME"
>synchronous_commit</TT
> can be set by individual
users, so it can be configured in the configuration file, for particular
users or databases, or dynamically by applications, in order to control
the durability guarantee on a per-transaction basis.
</P
><P
> After a commit record has been written to disk on the primary, the
WAL record is then sent to the standby. The standby sends reply
messages each time a new batch of WAL data is written to disk, unless
<TT
CLASS="VARNAME"
>wal_receiver_status_interval</TT
> is set to zero on the standby.
If the standby is the first matching standby, as specified in
<TT
CLASS="VARNAME"
>synchronous_standby_names</TT
> on the primary, the reply
messages from that standby will be used to wake users waiting for
confirmation that the commit record has been received. These parameters
allow the administrator to specify which standby servers should be
synchronous standbys. Note that the configuration of synchronous
replication is mainly on the master. Named standbys must be directly
connected to the master; the master knows nothing about downstream
standby servers using cascaded replication.
</P
><P
> Setting <TT
CLASS="VARNAME"
>synchronous_commit</TT
> to <TT
CLASS="LITERAL"
>remote_write</TT
> will
cause each commit to wait for confirmation that the standby has received
the commit record and written it out to its own operating system, but not
for the data to be flushed to disk on the standby. This
setting provides a weaker guarantee of durability than <TT
CLASS="LITERAL"
>on</TT
>
does: the standby could lose the data in the event of an operating system
crash, though not a <SPAN
CLASS="PRODUCTNAME"
>PostgreSQL</SPAN
> crash.
However, it's a useful setting in practice
because it can decrease the response time for the transaction.
Data loss could only occur if both the primary and the standby crash and
the database of the primary gets corrupted at the same time.
</P
><P
> Users will stop waiting if a fast shutdown is requested. However, as
when using asynchronous replication, the server will not fully
shutdown until all outstanding WAL records are transferred to the currently
connected standby servers.
</P
></DIV
><DIV
CLASS="SECT3"
><H3
CLASS="SECT3"
><A
NAME="SYNCHRONOUS-REPLICATION-PERFORMANCE"
>25.2.7.2. Planning for Performance</A
></H3
><P
> Synchronous replication usually requires carefully planned and placed
standby servers to ensure applications perform acceptably. Waiting
doesn't utilize system resources, but transaction locks continue to be
held until the transfer is confirmed. As a result, incautious use of
synchronous replication will reduce performance for database
applications because of increased response times and higher contention.
</P
><P
> <SPAN
CLASS="PRODUCTNAME"
>PostgreSQL</SPAN
> allows the application developer
to specify the durability level required via replication. This can be
specified for the system overall, though it can also be specified for
specific users or connections, or even individual transactions.
</P
><P
> For example, an application workload might consist of:
10% of changes are important customer details, while
90% of changes are less important data that the business can more
easily survive if it is lost, such as chat messages between users.
</P
><P
> With synchronous replication options specified at the application level
(on the primary) we can offer synchronous replication for the most
important changes, without slowing down the bulk of the total workload.
Application level options are an important and practical tool for allowing
the benefits of synchronous replication for high performance applications.
</P
><P
> You should consider that the network bandwidth must be higher than
the rate of generation of WAL data.
</P
></DIV
><DIV
CLASS="SECT3"
><H3
CLASS="SECT3"
><A
NAME="SYNCHRONOUS-REPLICATION-HA"
>25.2.7.3. Planning for High Availability</A
></H3
><P
> Commits made when <TT
CLASS="VARNAME"
>synchronous_commit</TT
> is set to <TT
CLASS="LITERAL"
>on</TT
>
or <TT
CLASS="LITERAL"
>remote_write</TT
> will wait until the synchronous standby responds. The response
may never occur if the last, or only, standby should crash.
</P
><P
> The best solution for avoiding data loss is to ensure you don't lose
your last remaining synchronous standby. This can be achieved by naming multiple
potential synchronous standbys using <TT
CLASS="VARNAME"
>synchronous_standby_names</TT
>.
The first named standby will be used as the synchronous standby. Standbys
listed after this will take over the role of synchronous standby if the
first one should fail.
</P
><P
> When a standby first attaches to the primary, it will not yet be properly
synchronized. This is described as <TT
CLASS="LITERAL"
>catchup</TT
> mode. Once
the lag between standby and primary reaches zero for the first time
we move to real-time <TT
CLASS="LITERAL"
>streaming</TT
> state.
The catch-up duration may be long immediately after the standby has
been created. If the standby is shut down, then the catch-up period
will increase according to the length of time the standby has been down.
The standby is only able to become a synchronous standby
once it has reached <TT
CLASS="LITERAL"
>streaming</TT
> state.
</P
><P
> If primary restarts while commits are waiting for acknowledgement, those
waiting transactions will be marked fully committed once the primary
database recovers.
There is no way to be certain that all standbys have received all
outstanding WAL data at time of the crash of the primary. Some
transactions may not show as committed on the standby, even though
they show as committed on the primary. The guarantee we offer is that
the application will not receive explicit acknowledgement of the
successful commit of a transaction until the WAL data is known to be
safely received by the standby.
</P
><P
> If you really do lose your last standby server then you should disable
<TT
CLASS="VARNAME"
>synchronous_standby_names</TT
> and reload the configuration file
on the primary server.
</P
><P
> If the primary is isolated from remaining standby servers you should
fail over to the best candidate of those other remaining standby servers.
</P
><P
> If you need to re-create a standby server while transactions are
waiting, make sure that the commands pg_start_backup() and
pg_stop_backup() are run in a session with
<TT
CLASS="VARNAME"
>synchronous_commit</TT
> = <TT
CLASS="LITERAL"
>off</TT
>, otherwise those
requests will wait forever for the standby to appear.
</P
></DIV
></DIV
></DIV
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
SUMMARY="Footer navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="different-replication-solutions.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="index.html"
ACCESSKEY="H"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="warm-standby-failover.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
>Comparison of Different Solutions</TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="high-availability.html"
ACCESSKEY="U"
>Up</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>Failover</TD
></TR
></TABLE
></DIV
></BODY
></HTML
>