8 Defining Replication Schemes

This chapter describes how to define replication schemes that are not active standby pairs. For information about defining active standby pair replication schemes, see Chapter 3, "Defining an Active Standby Pair Replication Scheme". If you want to replicate a data store that has cache groups, see Chapter 5, "Administering an Active Standby Pair with Cache Groups".

To reduce the amount of bandwidth required for replication, see "Compressing replicated traffic".

To replicate tables with columns in a different order or with a different number of partitions, see "Replicating tables with different definitions".

This chapter includes these topics:

Designing a highly available system
Defining a replication scheme
Defining replication elements
Checking for replication conflicts on table elements
Setting transmit durability on data store elements
Using a return service
Setting STORE attributes
Configuring network operations
Replication scheme syntax examples
Creating replication schemes with scripts

Designing a highly available system

These are the primary objectives of any replication scheme:

Provide one or more backup data stores to ensure that the data is always available to applications
Provide a means to recover failed data stores from their backup stores
Distribute workloads efficiently to provide applications with the quickest possible access to the data
Enable software upgrades and maintenance without disrupting service to users

In a highly available system, a subscriber data store must be able to survive failures that may affect the master. At a minimum, the master and subscriber need to be on separate machines. For some applications, you may want to place the subscriber in an environment that has a separate power supply. In certain cases, you may need to place a subscriber at an entirely separate site.

In this chapter, we consider the replication schemes described in "Types of replication schemes":

Unidirectional
Bidirectional split workload
Bidirectional distributed workload
Propagation

In addition, consider whether you want to replicate a whole data store or selected elements of the data store. Also consider the number of subscribers in the replication scheme. Unidirectional and propagation replication schemes allow you to choose the number of subscribers.

The rest of this section includes these topics:

Considering failover and recovery scenarios
Making decisions about performance and recovery tradeoffs
Distributing workloads

For more information about using replication to facilitate online upgrades, see "Performing an online upgrade with replication" and "Performing an online upgrade with active standby pair replication" in Oracle TimesTen In-Memory Database Installation Guide.

Considering failover and recovery scenarios

As you plan a replication scheme, consider every failover and recovery scenario. For example, subscriber failures generally have no impact on the applications connected to the master data stores and can be recovered from without disrupting user service. If a failure occurs on a master data store, you should have a means to redirect the application load to a subscriber and continue service with no or minimal interruption. This process is typically handled by a cluster manager or custom software designed to detect failures, redirect users or applications from the failed data store to one of its subscribers, and manage recovery of the failed data store. See Chapter 10, "Managing Data Store Failover and Recovery".

When planning failover strategies, consider which subscribers will take on the role of the master and for which users or applications. Also consider recovery factors. For example, a failed master must be able to recover its data store from its most up-to-date subscriber, and any subscriber must be able to recover from its master. A bidirectional scheme that replicates the entire data store can take advantage of automatic restoration of a failed master. See "Automatic catch-up of a failed master data store".

Consider the failure scenario for the unidirectionally replicated data store shown in Figure 8-1. In the case of a master failure, the application cannot access the data store until it is recovered from the subscriber. You cannot switch the application connection or user load to the subscriber unless you use an ALTER REPLICATION statement to redefine the subscriber data store as the master. See "Replacing a master data store".

Figure 8-1 Recovering a master in a unidirectional scheme

Description of "Figure 8-1 Recovering a master in a unidirectional scheme"

Figure 8-2 shows a bidirectional distributed workload scheme in which the entire data store is replicated. Failover in this type of replication scheme involves shifting the users of the application on the failed data store to the application on the surviving data store. Upon recovery, the workload can be redistributed to the application on the recovered data store.

Figure 8-2 Recovering a master in a distributed workload scheme

Description of "Figure 8-2 Recovering a master in a distributed workload scheme"

Similarly, the users in a split workload scheme must be shifted from the failed data store to the surviving data store. Because replication in a split workload scheme is not at the data store level, you must use an ALTER REPLICATION statement to set a new master data store. See "Replacing a master data store". Upon recovery, the users can be moved back to the recovered master data store.

Propagation replication schemes also require the use of the ALTER REPLICATION statement to set a new master or a new propagator if the master or propagator fails. Higher availability is achieved if two propagators are defined in the replication scheme. See Figure 1-11 for an example of a propagation replication scheme with two propagators.

Making decisions about performance and recovery tradeoffs

When you design a replication scheme, weigh operational efficiencies against the complexities of failover and recovery. Factors that may complicate failover and recovery include the network topology that connects a master with its subscribers and the complexity of the replication scheme. For example, it is easier to recover a master that has been fully replicated to a single subscriber than recover a master that has selected elements replicated to different subscribers.

You can configure replication to work asynchronously (the default), "semi-synchronously" with the return receipt service, or fully synchronously with the return twosafe service. Selecting a return service provides greater confidence that your data is consistent on the master and subscriber data stores. Your decision to use default asynchronous replication or to configure return receipt or return twosafe mode depends on the degree of confidence you require and the performance tradeoff you are willing to make in exchange.

Table 8-1 summarizes the performance and recover tradeoffs of asynchronous replication, return receipt service and return twosafe service.

Table 8-1 Performance and recovery tradeoffs

Type of behavior	Asynchronous replication (default)	Return receipt	Return twosafe
Commit sequence	Each transaction is committed first on the master data store.	Each transaction is committed first on the master data store	Each transaction is committed first on the subscriber data store.
Performance on master	Shortest response time and best throughput because there is no log wait between transactions or before the commit on the master.	Longer response time and less throughput than asynchronous. The application is blocked for the duration of the network round-trip after commit. Replicated transactions are more serialized than with asynchronous replication, which results in less throughput.	Longest response time and least throughput. The application is blocked for the duration of the network round-trip and remote commit on the subscriber before the commit on the master. Transactions are fully serialized, which results in the least throughput.
Effect of a runtime error	Because the transaction is first committed on the master data store, errors that occur when committing on a subscriber require the subscriber to be either manually corrected or destroyed and then recovered from the master data store.	Because the transaction is first committed on the master data store, errors that occur when committing on a subscriber require the subscriber to be either manually corrected or destroyed and then recovered from the master data store.	Because the transaction is first committed on the subscriber data store, errors that occur when committing on the master require the master to be either manually corrected or destroyed and then recovered from the subscriber data store.
Failover after failure of master	If the master fails and the subscriber takes over, the subscriber may be behind the master and must reprocess data feeds and be able to remove duplicates.	If the master fails and the subscriber takes over, the subscriber may be behind the master and must reprocess data feeds and be able to remove duplicates.	If the master fails and the subscriber takes over, the subscriber is at least up to date with the master. It is also possible for the subscriber to be ahead of the master if the master fails before committing a transaction it had replicated to the subscriber.

In addition to the performance and recovery tradeoffs between the two return services, you should also consider the following:

Return receipt can be used in more configurations, whereas return twosafe can only be used in a bidirectional configuration or an active standby pair.
Return twosafe allows you to specify a "local action" to be taken on the master data store in the event of a timeout or other error encountered when replicating a transaction to the subscriber data store.

A transaction is classified as return receipt or return twosafe when the application updates a table that is configured for either return receipt or return twosafe. Once a transaction is classified as either return receipt or return twosafe, it remains so, even if the replication scheme is altered before the transaction completes.

For more information about return services, see "Using a return service".

Distributing workloads

Consider configuring the data stores to distribute application workloads and make the best use of a limited number of server machines. For example, it may be efficient and economical to configure the data stores in a bidirectional distributed workload replication scheme so that each serves as both master and subscriber, rather than as separate master and subscriber data stores. However, a distributed workload scheme works best with applications that primarily read from the data stores. Implementing a distributed workload scheme for applications that frequently write to the same elements in a data store may diminish performance and require that you implement a solution to prevent or manage update conflicts, as described in Chapter 13, "Resolving Replication Conflicts".

Defining a replication scheme

After you have designed your replication scheme, use the CREATE REPLICATION SQL statement to apply the scheme to your data stores. You must have the ADMIN privilege to use the CREATE REPLICATION statement.

Table 8-2 shows the components of a replication scheme and identifies the clauses associated with the topics in this chapter. The complete syntax for the CREATE REPLICATION statement is provided in Oracle TimesTen In-Memory Database SQL Reference.

Table 8-2 Components of a replication scheme

Component	See...
`CREATE REPLICATION` `Owner.SchemeName`	"Owner of the replication scheme and tables"
`ELEMENT` `ElementName` `ElementType`	"Defining replication elements"
`[CheckConflicts]`	"Checking for replication conflicts on table elements"
`{MASTER\|PROPAGATOR}` `DataStoreName` `ON` `"HostName"`	"Data store names"
`[TRANSMIT {NONDURABLE\|DURABLE}]`	"Setting transmit durability on data store elements"
`SUBSCRIBER` `DataStoreName` `ON` `"HostName"`	"Data store names"
`[ReturnServiceAttribute]`	"Using a return service"
`INCLUDE\|EXCLUDE`	"Defining data store elements"
`STORE` `DataStoreName` `DataStoreAttributes`	"Setting STORE attributes"
`[NetworkOperation]`	"Configuring network operations"

Note:

Naming errors in your CREATE REPLICATION statement are often hard to troubleshoot, so take the time to check and double-check your element, data store, and host names for mistakes.

The replication scheme used by a data store is represented in its TTREP tables and persists across system reboots. You cannot directly modify the contents of the TTREP tables. Modifications can be made only by means of the CREATE REPLICATION or ALTER REPLICATION statements. See "System and Replication Tables" in Oracle TimesTen In-Memory Database SQL Reference for descriptions of the TTREP tables.

Owner of the replication scheme and tables

The owner and name of the replication scheme and the replicated tables must be identical on both the master and subscriber data stores. To ensure that there is a common owner across all data stores, you can explicitly specify an owner name with your replication scheme name in the CREATE REPLICATION statement.

For example, to assign an owner named repl to the replication scheme named repscheme, the first line of your CREATE REPLICATION statement would look like:

CREATE REPLICATION rep1.repscheme

If you omit the owner from the name of your replication scheme and the replicated tables, the default owner name, as specified by the login name of the requester or the name set by the UID attribute in the DSN, is used in its place. Your replication scheme does not work if owner names are different across its data stores.

Data store names

These are the roles of the data stores in a replication scheme:

Master: Applications update the master data store. The master sends the updates to the propagator or to the subscribers directly.
Propagator: The propagator data store receives updates from the master data store and sends them to subscriber data stores.
Subscriber: Subscribers receive updates from the propagator or the master.

Before you define the replication scheme, you need to define the data source names (DSNs) for the data stores in the replication scheme. On Linux or UNIX, create an odbc.ini file. On Windows, use the ODBC Administrator to name the data stores and set data store attributes. See "Step 1: Create the DSNs for the master and the subscriber" for an example.

Each data store "name" specified in a replication scheme must match the prefix of the data store file name without the path specified for the DataStore attribute in the DSN definition. A replication scheme that uses the names specified in the Data Source Name attributes does not work. To avoid confusion, use the same name for both your DataStore and Data Source Name attributes in each DSN definition. For example, if the data store path is directory/subdirectory/foo.ds0, then foo is the data store name that you should use.

Defining replication elements

A replication scheme consists of one or more ELEMENT descriptions that contain the name of the element, its type (DATASTORE, TABLE or SEQUENCE), the master data store on which it is updated, and the subscriber stores to which the updates are replicated.

If you want to replicate a data store with cache groups, see Chapter 5, "Administering an Active Standby Pair with Cache Groups".

These are restrictions on elements:

Do not include a specific object (table, sequence or data store) in more than one element description.
Do not define the same element in the role of both master and propagator.
An element must include the data store on the current host as either the master, subscriber or propagator.
Element names must be unique within a replication scheme.

The correct way to define elements in a multiple subscriber scheme is described in "Multiple subscriber schemes with return services and a failure threshold". The correct way to propagate elements is described in "Propagation scheme".

The name of each element in a scheme can be used to identify the element if you decide later to drop or modify the element by using the ALTER REPLICATION statement.

You can add tables, sequences and data stores to an existing replication scheme. See "Altering a replication scheme". You can drop a table or sequence from a data store that is part of an existing replication scheme. See "Dropping a table or sequence from a replication scheme".

The rest of this section includes the following topics:

Defining data store elements
Defining table elements
Replicating tables with foreign key relationships
Replicating materialized views
Replicating sequences

Defining data store elements

To replicate the entire contents of the master data store (masterds) to the subscriber data store (subscriberds), the ELEMENT description (named ds1) might look like the following:

ELEMENT ds1 DATASTORE
  MASTER masterds ON "system1"
  SUBSCRIBER subscriberds ON "system2"

Identify a data store host using the host name returned by the hostname operating system command. Host names containing special characters must be surrounded by double quotes (").

You can choose to exclude certain tables and sequences from the data store element by using the EXCLUDE TABLE and EXCLUDE SEQUENCE clauses of the CREATE REPLICATION statement. When you use the EXCLUDE clauses, the entire data store is replicated to all subscribers in the element except for the objects that are specified in the EXCLUDE clauses. Use only one EXCLUDE TABLE and one EXCLUDE SEQUENCE clause in an element description. For example, this element description excludes two tables and one sequence:

ELEMENT ds1 DATASTORE
  MASTER masterds ON "system1"
  SUBSCRIBER subscriberds ON "system2"
  EXCLUDE TABLE tab1, tab2
  EXCLUDE SEQUENCE seq1

You can choose to include only certain tables and sequences in the data store by using the INCLUDE TABLE and INCLUDE SEQUENCE clauses of the CREATE REPLICATION statement. When you use the INCLUDE clauses, only the objects that are specified in the INCLUDE clauses are replicated to each subscriber in the element. Use only one INCLUDE TABLE and one INCLUDE SEQUENCE clause in an element description. For example, this element description includes one table and two sequences:

ELEMENT ds1 DATASTORE
  MASTER masterds ON "system1"
  SUBSCRIBER subscriberds ON "system2"
  INCLUDE TABLE tab3
  INCLUDE SEQUENCE seq2, seq3

Defining table elements

To replicate the tab1 and tab2 tables from a master data store (named masterds and located on a host named system1) to a subscriber data store (named subscriberds on a host named system2), the ELEMENT descriptions (named a and b) might look like the following:

ELEMENT a TABLE tab1
  MASTER masterds ON "system1"
 SUBSCRIBER subscriberds ON "system2"
ELEMENT b TABLE tab2
  MASTER masterds ON "system1"
  SUBSCRIBER subscriberds ON "system2"

Replicating tables with foreign key relationships

You may choose to replicate all or a subset of tables that have foreign key relationships with one another. However, if the foreign key relationships have been configured with ON DELETE CASCADE, then you must configure replication to replicate all of the tables, either by configuring the replication scheme with a DATASTORE element that does not exclude any of the tables, or by configuring the scheme with a TABLE element for every table that is involved in the relationship.

It is not possible to add a table with a foreign key relationship configured with ON DELETE CASCADE to a pre-existing replication scheme using ALTER REPLICATION. Instead, you must drop the replication scheme, create the new table with the foreign key relationship, and then create a new replication scheme replicating all of the related tables.

Replicating materialized views

A materialized view is a summary of data selected from one or more TimesTen tables, called detail tables. Although you cannot replicate materialized views directly, you can replicate their underlying detail tables in the same manner as you would replicate regular TimesTen tables.

The detail tables on the master and subscriber data stores can be referenced by materialized views. However, TimesTen replication verifies only that the replicated detail tables have the same structure on both the master and subscriber. It does not enforce that the materialized views are the same on each data store.

If you replicate an entire data store containing a materialized or nonmaterialized view as a DATASTORE element, only the detail tables associated with the view are replicated. The view itself is not replicated. A matching view can be defined on the subscriber data store, but is not required. If detail tables are replicated, TimesTen automatically updates the corresponding view.

Materialized views defined on replicated tables may result in replication failures or inconsistencies if the materialized view is specified so that overflow or underflow conditions occur when the materialized view is updated.

Replicating sequences

You can use replication to ensure that the current value of a sequence on a subscriber data store is always in advance of the current value on the master data store, thereby preventing conflicts if the sequence is later used to make updates directly on the subscriber data store. For example, you may have an application that uses a sequence to determine primary key values in a replicated table, and a bidirectional configuration that includes a data store that must assume the master role when the master data store fails. By replicating the sequence, you can guarantee that the same sequence value is not used twice, regardless of which data store you update directly.

Sequence replication works by transmitting a new current value from the master data store to the subscriber every 20 references to the sequence's NEXTVAL, starting with the first reference. For example, consider a sequence my.seq with a MINVALUE of 1 and an INCREMENT of 2. The very first time that you use my.seq.NEXTVAL in a transaction, the current value of the sequence on the master data store is changed to three, and a new current value of 41 is replicated to the subscriber. The next 19 references to my.seq.NEXTVAL on the master data store result in no new current value being replicated, since the current value of 41 on the subscriber data store is still ahead of the current value on the master. Only on the twenty-first reference to my.seq.NEXTVAL is a new current value, 61, transmitted to the subscriber data store, as the subscriber's previous current value of 41 would now be behind the value of 43 on the master.

Sequence replication has these limitations:

Sequences with the CYCLE attribute cannot be replicated.
The definition of the replicated sequence on each peer data store must be identical.
No conflict checking is performed on sequences. If you make updates to sequences in both data stores in a bidirectional replication configuration without using the RETURN TWOSAFE service, it is possible for both sequences to return the identical NEXTVAL.

If you need to use sequences in a bidirectional replication scheme where updates may occur on either peer, you may instead use a nonreplicated sequence with different MINVALUE and MAXVALUE attributes on each data store. For example, you may create sequence my.seq on datastore DS1 with a MINVALUE of 1 and a MAXVALUE of 100, and the same sequence on DS2 with a MINVALUE of 101 and a MAXVALUE of 200. Then, if you configure DS1 and DS2 with a bidirectional replication scheme, you may make updates to either data store using the sequence my.seq with the guarantee that the sequence values never conflict. Be aware that if you are planning on using ttRepAdmin -duplicate to recover from a failure in this configuration, you must drop and then re-create the sequence with a new MINVALUE and MAXVALUE after you have performed the duplicate.

Replicated sequences are intended to be used with replicated tables. Therefore, sequence updates are only replicated when they are followed by or used in updates to replicated tables. Operations on sequences such as SELECT my.seq.NEXTVAL FROM sys.dual, while incrementing the sequence value, are not replicated until they are followed by updates to tables that are replicated. A side effect of this behavior is that these sequence updates are not purged from the log until followed by updates to tables that are replicated. This causes ttRepSubscriberWait and ttRepAdmin -wait to fail when only these sequence updates are present at the end of the log.

To replicate updates to the current value of the seq sequence from a master data store (named masterds and located on a host named system1) to a subscriber data store (named subscriberds on a host named system2), the ELEMENT description (named a) might look like the following:

ELEMENT a SEQUENCE seq
  MASTER masterds ON "system1"
  SUBSCRIBER subscriberds ON "system2"

Checking for replication conflicts on table elements

When data stores are configured for bidirectional replication, there is a potential for replication conflicts to occur if the same table row in two or more data stores is independently updated at the same time.

Such conflicts can be detected and resolved on a table-by-table basis by including timestamps in your replicated tables and configuring the replication scheme with the optional CHECK CONFLICTS clause in each table's ELEMENT description.

See Chapter 13, "Resolving Replication Conflicts" for a complete discussion on replication conflicts and how to configure the CHECK CONFLICTS clause in the CREATE REPLICATION statement.

Setting transmit durability on data store elements

A master data store configured for asynchronous or return receipt replication is durable by default. This means that log records are committed to disk when transactions are committed. The master data store can be set to nondurable by including the TRANSMIT NONDURABLE clause in the ELEMENT description.

Transaction records in the master data store log buffer are, by default, flushed to disk before they are forwarded to subscribers. If the entire master data store is replicated (ELEMENT is of type DATASTORE), you can improve replication performance by eliminating the master's flush-log-to-disk operation from the replication cycle. This is done by including a TRANSMIT NONDURABLE clause in the ELEMENT description. The TRANSMIT setting has no effect on the subscriber. The transaction records on the subscriber data store are always flushed to disk.

Master data stores configured for return twosafe replication are nondurable by default and cannot be made durable. Setting TRANSMIT DURABLE on a data store that is configured for return twosafe replication has no effect on return twosafe transactions.

Example 8-1 Replicating the entire master data store with TRANSMIT NONDURABLE

To replicate the entire contents of the master data store (masterds) to the subscriber data store (subscriberds) and to eliminate the flush-log-to-disk operation, your ELEMENT description (named a) might look like:

ELEMENT a DATASTORE
  MASTER masterds ON "system1"
  TRANSMIT NONDURABLE
  SUBSCRIBER subscriberds ON "system2"

In general, if a master data store fails, you have to initiate the ttRepAdmin -duplicate operation described in "Recovering a failed data store" to recover the failed master from the subscriber data store. This is always true for a master data store configured with TRANSMIT DURABLE.

A data store configured as TRANSMIT NONDURABLE is recovered automatically by the subscriber replication agent if it is configured in the specific type of bidirectional scheme described in "Automatic catch-up of a failed master data store". Otherwise, you must follow the procedures described in "Recovering nondurable data stores" to recover a failed nondurable data store.

Using a return service

You can configure your replication scheme with a return service to ensure a higher level of confidence that replicated data is consistent on both the master and subscriber data stores. This section describes how to configure and manage the return receipt and return twosafe services.

You can specify a return service for table elements and data store elements for any subscriber defined in a CREATE REPLICATION or ALTER REPLICATION statement.

Example 8-2 shows separate SUBSCRIBER clauses that can define different return service attributes for SubDataStore1 and SubDataStore2.

Example 8-2 Different return services for each subscriber

CREATE REPLICATION Owner.SchemeName
  ELEMENT ElementNameElementType
    MASTER DataStoreName ON "HostName"
    SUBSCRIBER SubDataStore1 ON "HostName" ReturnServiceAttribute1
    SUBSCRIBER SubDataStore2 ON "HostName" ReturnServiceAttribute2;

Alternatively, you can specify the same return service attribute for all of the subscribers defined in an element. Example 8-3 shows the use of a single SUBSCRIBER clause that defines the same return service attributes for both SubDataStore1 and SubDataStore2.

Example 8-3 Same return service for all subscribers

CREATE REPLICATION Owner.SchemeName
  ELEMENT ElementNameElementType
    MASTER DataStoreName ON "HostName"
    SUBSCRIBER SubDataStore1 ON "HostName",
               SubDataStore2 ON "HostName"
               ReturnServiceAttribute;

These sections describe the return service attributes:

RETURN RECEIPT
RETURN RECEIPT BY REQUEST
RETURN TWOSAFE
RETURN TWOSAFE BY REQUEST
NO RETURN

RETURN RECEIPT

TimesTen provides an optional return receipt service to loosely couple or synchronize your application with the replication mechanism.

Specify the RETURN RECEIPT attribute to enable the return receipt service for the subscribers listed in the SUBSCRIBER clause of an ELEMENT description. With return receipt enabled, when the application commits a transaction for an element on the master data store, the application remains blocked until the subscriber acknowledges receipt of the transaction update. If the master is replicating the element to multiple subscribers, the application remains blocked until all of the subscribers have acknowledged receipt of the transaction update.

For example replication schemes that use return receipt services, see Example 8-24 and Example 8-25.

Example 8-4 RETURN RECEIPT

To confirm that all transactions committed on the tab table in the master store (masterds) are received by the subscriber (subscriberds), the ELEMENT description (e) might look like the following:

ELEMENT e TABLE tab
    MASTER masterds ON "system1"
    SUBSCRIBER subscriberds ON "system2"
      RETURN RECEIPT

If any of the subscribers are unable to acknowledge receipt of the transaction within a configurable timeout period, the application receives a tt_ErrRepReturnFailed (8170) warning on its commit request. You can use the ttRepXactStatus procedure to check on the status of a return receipt transaction. See "Checking the status of return service transactions" for more information on the return service timeout period.

You can also configure the replication agent to disable the return receipt service after a specific number of timeouts. See "Managing return service timeout errors and replication state changes" for details.

The return receipt service is disabled by default if replication is stopped. See "RETURN SERVICES { ON | OFF } WHEN REPLICATION STOPPED" for details.

RETURN RECEIPT BY REQUEST

RETURN RECEIPT enables notification of receipt for all transactions. You can use RETURN RECEIPT with the BY REQUEST option to enable receipt notification only for specific transactions identified by your application.

If you specify RETURN RECEIPT BY REQUEST for a subscriber, you must use the ttRepSyncSet procedure to enable the return receipt service for a transaction. The call to enable the return receipt service must be part of the transaction (autocommit must be off).

Example 8-5 RETURN RECEIPT BY REQUEST

To enable confirmation that specific transactions committed on the tab table in the master store (masterds) are received by the subscriber (subscriberds), your ELEMENT description (e) might look like:

ELEMENT e TABLE tab
    MASTER masterds ON "system1"
    SUBSCRIBER subscriberds ON "system2"
      RETURN RECEIPT BY REQUEST

Prior to committing a transaction that requires receipt notification, we call ttRepSyncSet within a SQLExecDirect function to request the return services and to set the timeout period to 45 seconds:

rc = SQLExecDirect( hstmt, (SQLCHAR *)
    "CALL ttRepSyncSet(0x01, 45, NULL)", SQL_NTS )

If any of the subscribers are unable to acknowledge receipt of the transaction update within a configurable timeout period, the application receives a tt_ErrRepReturnFailed (8170) warning on its commit request. See "Setting the return service timeout period".

You can use ttRepSyncGet to check if a return service is enabled and obtain the timeout value. For example:

Command> CALL ttRepSyncGet();
< 01, 45, 1>
1 row found.

RETURN TWOSAFE BY REQUEST

RETURN TWOSAFE enables notification of commit on the subscriber for all transactions. You can use RETURN TWOSAFE with the BY REQUEST option to enable notification of subscriber commit only for specific transactions identified by the application.

If you specify RETURN TWOSAFE BY REQUEST for a subscriber, you must use the ttRepSyncSet procedure to enable the return twosafe service for a transaction. The call to enable the return twosafe service must be part of the transaction (autocommit must be off).

The ALTER TABLE statement cannot be used to alter a replicated table that is part of a TWOSAFE BY REQUEST transaction. If DDLCommitBehavior=1, this operation results in error 8051. If DDLCommitBehavior=0, the operation succeeds because a commit is performed before the ALTER TABLE operation, resulting in the ALTER TABLE operation being in a new transaction which is not part of the TWOSAFE BY REQUEST transaction.

Example 8-6 RETURN TWOSAFE BY REQUEST

To enable confirmation that specific transactions committed on the master store (datastoreA) are also committed by the subscriber (datastoreB), the ELEMENT description (a) might look like:

ELEMENT a DATASTORE
    MASTER datastoreA ON "system1"
    SUBSCRIBER datastoreB ON "system2"
      RETURN TWOSAFE BY REQUEST;

Before calling commit for a transaction that requires confirmation of commit on the subscriber, we call ttRepSyncSet within a SQLExecDirect function to request the return service, set the timeout period to 45 seconds, and specify no action (1) in the event of a timeout error:

rc = SQLExecDirect( hstmt, (SQLCHAR *)
    "CALL ttRepSyncSet(0x01, 45, 1)", SQL_NTS )

In this example, if the subscriber is unable to acknowledge commit of the transaction within the timeout period, the application receives a tt_ErrRepReturnFailed (8170) warning on its commit request. The application can then chose how to handle the timeout. See "Setting the return service timeout period".

You can use ttRepSyncGet to check if a return service is enabled and obtain the timeout value. For example:

Command> CALL ttRepSyncGet();
< 01, 45, 1>
1 row found.

RETURN TWOSAFE

The return twosafe service ensures that each replicated transaction is committed on the subscriber data store before it is committed on the master data store. If replication is unable to verify the transaction has been committed on the subscriber, it returns notification of the error. Upon receiving an error, the application can either take a unique action or fall back on preconfigured actions, depending on the type of failure.

The return twosafe service is intended to be used in replication schemes where two data stores must stay synchronized. One data store has an active role, while the other data store has a standby role but must be ready to assume an active role at any moment. Use return twosafe with a bidirectional replication scheme with exactly two data stores.

To enable the return twosafe service for the subscriber, specify the RETURN TWOSAFE attribute in the SUBSCRIBER clause in the CREATE REPLICATION or ALTER REPLICATION statement.

Example 8-7 RETURN TWOSAFE

To confirm all transactions committed on the master store (datastoreA) are also committed by the subscriber (datastoreB), your ELEMENT description (a) might look like the following:

ELEMENT a DATASTORE
    MASTER datastoreA ON "system1"
    SUBSCRIBER datastoreB ON "system2"
      RETURN TWOSAFE

The entire CREATE REPLICATION statement that specifies both datastoreA and datastoreB in a bidirectional configuration with RETURN TWOSAFE might look like the following:

CREATE REPLICATION bidirect
ELEMENT a DATASTORE
    MASTER datastoreA ON "system1"
    SUBSCRIBER datastoreB ON "system2"
      RETURN TWOSAFE
ELEMENT b DATASTORE
    MASTER datastoreB ON "system2"
    SUBSCRIBER datastoreA ON "system1"
      RETURN TWOSAFE;

When replication is configured with RETURN TWOSAFE, you must disable the AutoCommit connection attribute.

When the application commits a transaction on the master data store, the application remains blocked until the subscriber acknowledges it has successfully committed the transaction. Initiating identical updates or deletes on both data stores can lead to deadlocks in commits that can be resolved only by stopping the processes.

If the subscriber is unable to acknowledge commit of the transaction update within a configurable timeout period, your application receives a tt_ErrRepReturnFailed (8170) warning on its commit request. See "Setting the return service timeout period".

NO RETURN

Use the NO RETURN attribute to explicitly disable the return receipt or return twosafe service. NO RETURN is the default condition. This attribute is typically set in ALTER REPLICATION statements. See Example 12-13.

Setting STORE attributes

Table 8-3 lists the optional STORE parameters for the CREATE REPLICATION and ALTER REPLICATION statements.

Table 8-3 STORE attribute descriptions

STORE attribute	Description
`DISABLE RETURN` `{SUBSCRIBER\|ALL}` `NumFailures`	Set the return service policy so that return service blocking is disabled after the number of timeouts specified by `NumFailures`. See "Establishing return service failure/recovery policies".
`RETURN SERVICES {ON\|OFF} WHEN REPLICATION STOPPED`	Set return services on or off when replication is disabled. See "Establishing return service failure/recovery policies".
`RESUME RETURN` `Milliseconds`	If `DISABLE RETURN` has disabled return service blocking, this attribute sets the policy for re-enabling the return service. See "Establishing return service failure/recovery policies".
`RETURN WAIT TIME` `Seconds`	Specifies the number of seconds to wait for return service acknowledgement. A value of 0 means that there is no waiting. The default value is 10 seconds. The application can override this timeout setting by using the `returnWait` parameter in the `ttRepSyncSet` built-in procedure. See "Setting the return service timeout period".
`DURABLE COMMIT` `{ON\|OFF}`	Overrides the `DurableCommits` attribute setting. Enables durable commit when return service blocking has been disabled by `DISABLE RETURN`. See "DURABLE COMMIT".
`LOCAL COMMIT ACTION` `{NO ACTION\|ACTON}`	Specify the default action to be taken for a return service transaction in the event of a timeout. The options are: `NO ACTION` - On timeout, the commit function returns to the application, leaving the transaction in the same state it was in when it entered the commit call, with the exception that the application is not able to update any replicated tables. The application can reissue the commit. This is the default. `COMMIT`- On timeout, the commit function attempts to perform a COMMIT to end the transaction locally. This default setting can be overridden for specific transactions by using the `localAction` parameter in the `ttRepSyncSet` procedure. See "LOCAL COMMIT ACTION".
`COMPRESS TRAFFIC` `{ON\|OFF}`	Compress replicated traffic to reduce the amount of network bandwidth used. See "Compressing replicated traffic".
`PORT` `PortNumber`	Set the port number used by subscriber data stores to listen for updates from a master. If no PORT attribute is specified, the TimesTen daemon dynamically selects the port. While static port assignment is allowed by TimesTen, dynamic port allocation is recommended. See "Port assignments".
`TIMEOUT` `Seconds`	Set the maximum number of seconds a data store waits for a response from another data store before resending the message. See "Setting the return service timeout period"
`FAILTHRESHOLD`	Set the log failure threshold. See "Setting the log failure threshold".
`CONFLICT REPORTING {SUSPEND\|RESUME} AT` `Value`	Specify the number of replication conflicts per second at which conflict reporting is suspended, and the number of conflicts per second at which conflict reporting resumes. See Chapter 13, "Resolving Replication Conflicts".
`TABLE DEFINITION CHECKING {EXACT\|RELAXED}`	Specify the type of table definition checking: `EXACT` - The tables must be identical on master and subscriber. This is the default. `RELAXED` - The tables must have the same key definition, number of columns and column data types. See "Replicating tables with different definitions".

The FAILTHRESHOLD and TIMEOUT attributes can be unique to a specific replication scheme definition. This means these attribute settings can vary if you have applied different replication scheme definitions to your replicated data stores. This is not true for any of the other attributes, which must be the same across all replication scheme definitions. For example, setting the PORT attribute for one scheme sets it for all schemes.

For an example replication scheme that uses a STORE clause to set the FAILTHRESHOLD attribute, see Example 8-24.

Setting the return service timeout period

If your replication scheme is configured with one of the return services described in "Using a return service", a timeout occurs if any of the subscribers are unable to send an acknowledgement back to the master within the time period specified by TIMEOUT.

The default return service timeout period is 10 seconds. You can specify a different return service timeout period by:

Configuring RETURN WAIT TIME in the CREATE REPLICATION or ALTER REPLICATION statement. A RETURN WAIT TIME of '0' indicates no waiting.
Calling the ttRepSyncSet procedure with a new returnWait parameter

Once set, the timeout period applies to all subsequent return service transactions until you either reset the timeout period or terminate the application session. The timeout setting applies to all return services for all subscribers.

A return service may time out because of a replication failure or because replication is so far behind that the return service transaction times out before it is replicated. However, unless there is a simultaneous replication failure, failure to obtain a return service confirmation from the subscriber does not mean the transaction has not been or will not be replicated.

You can set other STORE attributes to establish policies that automatically disable return service blocking in the event of excessive timeouts and re-enable return service blocking when conditions improve. See "Managing return service timeout errors and replication state changes".

Example 8-8 Setting the timeout period for both data stores in bidirectional replication scheme

To set the timeout period to 30 seconds for both bidirectionally replicated data stores, datastoreA and datastoreB, in the bidirect replication scheme, the CREATE REPLICATION statement might look like the following:

CREATE REPLICATION bidirect
ELEMENT a DATASTORE
    MASTER datastoreA ON "system1"
    SUBSCRIBER datastoreB ON "system2"
      RETURN TWOSAFE
ELEMENT b DATASTORE
    MASTER datastoreB ON "system2"
    SUBSCRIBER datastoreA ON "system1"
      RETURN TWOSAFE
STORE datastoreA RETURN WAIT TIME 30
STORE datastoreB RETURN WAIT TIME 30;

Example 8-9 Resetting the timeout period

To use the ttRepSyncSet procedure to reset the timeout period to 45 seconds, call ttRepSyncSet within a SQLExecDirect ODBC function. To avoid resetting the requestReturn and localAction values, specify NULL:

rc = SQLExecDirect( hstmt, (SQLCHAR *)
    "CALL ttRepSyncSet(NULL, 45, NULL)", SQL_NTS )

Managing return service timeout errors and replication state changes

The replication state can be reset to Stop by a user or by the master replication agent in the event of a subscriber failure. A subscriber may be unable to acknowledge a transaction that makes use of a return service and may time out with respect to the master. If any of the subscribers are unable to acknowledge the transaction update within the timeout period, the application receives an errRepReturnFailed warning on its commit request.

The default return service timeout period is 10 seconds. You can specify a different return service timeout period by:

Configuring the RETURN WAIT TIME attribute in the STORE clause of the CREATE REPLICATION or ALTER REPLICATION statement
Calling ttRepSyncSet procedure with a new returnWait parameter

A return service may time out or fail because of a replication failure or because replication is so far behind that the return service transaction times out before it is replicated. However, unless there is a simultaneous replication failure, failure to obtain a return service confirmation from the subscriber does not necessarily mean the transaction has not been or will not be replicated.

This section describes how to detect and respond to timeouts on return service transactions. The main topics are:

When to manually disable return service blocking
Establishing return service failure/recovery policies

When to manually disable return service blocking

You may want respond in some manner if replication is stopped or return service timeout failures begin to adversely impact the performance of the replicated system. Your "tolerance threshold" for return service timeouts may depend on the historical frequency of timeouts and the performance/availability equation for your particular application, both of which should be factored into your response to the problem.

When using the return receipt service, you can manually respond by:

Using ALTER REPLICATION to make changes to the replication scheme to disable return receipt blocking for a particular subscriber. If you decide to disable return receipt blocking, your decision to re-enable it depends on your confidence level that the return receipt transaction is no longer likely to time out.
Calling the ttDurableCommit procedure to durably commit transactions on the master that you can no longer verify as being received by the subscriber

An alternative to manually responding to return service timeout failures is to establish return service failure and recovery policies in your replication scheme. These policies direct the replication agents to detect changes to the replication state and to keep track of return service timeouts and then automatically respond in some predefined manner.

Establishing return service failure/recovery policies

The following attributes in the CREATE REPLICATION or ALTER REPLICATION statement set the failure/recovery policies when using a RETURN RECEIPT or RETURN TWOSAFE service:

RETURN SERVICES { ON | OFF } WHEN REPLICATION STOPPED
DISABLE RETURN
RESUME RETURN
DURABLE COMMIT
LOCAL COMMIT ACTION

The policies set by these attributes are applicable for the life of the data store or until changed. However, the replication agent must be running to enforce these policies.

RETURN SERVICES { ON | OFF } WHEN REPLICATION STOPPED

The RETURN SERVICES { ON | OFF } WHEN REPLICATION STOPPED attribute determines whether a return receipt or return twosafe service continues to be enabled or is disabled when replication is stopped. "Stopped" in this context means that either the master replication agent is stopped (for example, by ttAdmin -repStop master) or the replication state of the subscriber data store is set to Stop or Pause with respect to the master data store (for example, by ttRepAdmin -state stop subscriber). A failed subscriber that has exceeded the specified FAILTHRESHOLD value is set to the Failed state, but is eventually set to the Stop state by the master replication agent.

Note:

A subscriber may become unavailable for a period of time that exceeds the timeout period specified by RETURN WAIT TIME but still be considered by the master replication agent to be in the Start state. Failure policies related to timeouts are set by the DISABLE RETURN attribute.

RETURN SERVICES OFF WHEN REPLICATION STOPPED disables the return service when replication is stopped and is the default when using the RETURN RECEIPT service. RETURN SERVICES ON WHEN REPLICATION STOPPED allows the return service to continue to be enabled when replication is stopped and is the default when using the RETURN TWOSAFE service.

Example 8-10 RETURN SERVICES ON WHEN REPLICATION STOPPED

Configure the CREATE REPLICATION statement to replicate updates from the masterds data store to the subscriber1 data store. The CREATE REPLICATION statement specifies the use of RETURN RECEIPT and RETURN SERVICES ON WHEN REPLICATION STOPPED.

CREATE REPLICATION myscheme
ELEMENT e TABLE tab
  MASTER masterds ON "server1"
  SUBSCRIBER subscriber1 ON "server2"
  RETURN RECEIPT
  STORE masterds ON "server1"
    RETURN SERVICES ON WHEN REPLICATION STOPPED;

While the application is committing updates to the master, ttRepAdmin is used to set subscriber1 to the Stop state:

ttRepAdmin -dsn masterds -receiver -name subscriber1 -state stop

The application continues to wait for return receipt acknowledgements from subscriber1 until the replication state is reset to Start and it receives the acknowledgment:

ttRepAdmin -dsn masterds -receiver -name subscriber1 -state start

DISABLE RETURN

When a DISABLE RETURN value is set, the data store keeps track of the number of return receipt or return twosafe transactions that have exceeded the timeout period set by RETURN WAIT TIME. If the number of timeouts exceeds the maximum value set by DISABLE RETURN, the applications revert to a default replication cycle in which they no longer wait for subscribers to acknowledge the replicated updates.

You can set DISABLE RETURN SUBSCRIBER to establish a failure policy to disable return service blocking for only those subscribers that have timed out, or DISABLE RETURN ALL to establish a policy to disable return service blocking for all subscribers. You can use the ttRepSyncSubscriberStatus built-in procedure or the ttRepReturnTransitionTrap SNMP trap to determine whether a particular subscriber has been disabled by the DISABLE RETURN failure policy.

The DISABLE RETURN failure policy is enabled only when the replication agent is running. If DISABLE RETURN is specified but RESUME RETURN is not specified, the return services remain off until the replication agent for the data store has been restarted. You can cancel this failure policy by stopping the replication agent and specifying either DISABLE RETURN SUBSCRIBER or DISABLE RETURN ALL with a zero value for NumFailures. The count of timeouts to trigger the failure policy is reset either when you restart the replication agent, when you set the DISABLE RETURN value to 0, or when return service blocking is re-enabled by RESUME RETURN.

DISABLE RETURN maintains a cumulative timeout count for each subscriber. If there are multiple subscribers and you set DISABLE RETURN SUBSCRIBER, the replication agent disables return service blocking for the first subscriber that reaches the timeout threshold. If one of the other subscribers later reaches the timeout threshold, the replication agent disables return service blocking for that subscriber also.

Example 8-11 DISABLE RETURN SUBSCRIBER

Configure the CREATE REPLICATION statement to replicate updates from the masterds data store to the data stores, subscriber1 and subscriber2. The CREATE REPLICATION statement specifies the use of RETURN RECEIPT and DISABLE RETURN SUBSCRIBER with a NumFailures value of 5. The RETURN WAIT TIME is set to 30 seconds.

CREATE REPLICATION myscheme
ELEMENT e TABLE tab
  MASTER masterds ON "server1"
  SUBSCRIBER subscriber1 ON "server2",
             subscriber2 ON "server3"
RETURN RECEIPT
STORE masterds ON "server1"
  DISABLE RETURN SUBSCRIBER 5
  RETURN WAIT TIME 30;

While the application is committing updates to the master, subscriber1 experiences problems and fails to acknowledge a replicated transaction update. The application is blocked 30 seconds after which it commits its next update to the master. Over the course of the application session, this commit/timeout cycle repeats 4 more times until DISABLE RETURN disables return receipt blocking for subscriber1. The application continues to wait for return-receipt acknowledgements from subscriber2 but not from subscriber1.

RETURN SERVICES OFF WHEN REPLICATION STOPPED is the default setting for the return receipt service. Therefore, return receipt is disabled under either one of the following conditions:

The subscriber is unable to acknowledge an update within the specified RETURN WAIT TIME, as described above.
Replication is stopped, as described in "RETURN SERVICES { ON | OFF } WHEN REPLICATION STOPPED".

For another example that set the DISABLE RETURN attribute, see Example 8-12.

RESUME RETURN

When we say return service blocking is "disabled," we mean that the applications on the master data store no longer block execution while waiting to receive acknowledgements from the subscribers that they received or committed the replicated updates. Note, however, that the master still listens for an acknowledgement of each batch of replicated updates from the subscribers.

You can establish a return service recovery policy by setting the RESUME RETURN attribute and specifying a resume latency value. When this attribute is set and return service blocking has been disabled for a subscriber, the return receipt or return twosafe service is re-enabled when the commit-to-acknowledge time for a transaction falls below the value set by RESUME RETURN. The commit-to-acknowledge time is the latency between when the application issues a commit and when the master receives acknowledgement of the update from the subscriber.

Example 8-12 RESUME RETURN

If return receipt blocking has been disabled for subscriber1 and if RESUME RETURN is set to 8 milliseconds, then return receipt blocking is re-enabled for subscriber1 the instant it acknowledges an update in less than 8 milliseconds from when it was committed by the application on the master.

CREATE REPLICATION myscheme
ELEMENT e TABLE tab
  MASTER masterds ON "server1"
  SUBSCRIBER subscriber1 ON "server2",
             subscriber2 ON "server3"
RETURN RECEIPT
STORE masterds ON "server1"
  DISABLE RETURN SUBSCRIBER 5
  RESUME RETURN 8;

The RESUME RETURN policy is enabled only when the replication agent is running. You can cancel a return receipt resume policy by stopping the replication agent and then using ALTER REPLICATION to set RESUME RETURN to zero.

DURABLE COMMIT

Set the DURABLE COMMIT attribute to specify the durable commit policy for applications that have return service blocking disabled by DISABLE RETURN. When DURABLE COMMIT is set to ON, it overrides the DurableCommits attribute on the master data store and forces durable commits for those transactions that have had return service blocking disabled.

DURABLE COMMIT is useful if you have only one subscriber. However, if you are replicating the same data to two subscribers and you disable return service blocking to one subscriber, then you achieve better performance if you rely on the other subscriber than you would by enabling durable commits.

Note:

If the replication scheme is configured with RETURN SERVICES ON WHEN REPLICATION STOPPED, the replication agent must be running to enforce the DURABLE COMMIT policy.

Example 8-13 DURABLE COMMIT

Set DURABLE COMMIT ON when establishing a DISABLE RETURN ALL policy to disable return-receipt blocking for all subscribers. If return-receipt blocking is disabled, commits are durably committed to disk to provide redundancy.

CREATE REPLICATION myscheme
ELEMENT e TABLE tab
  MASTER masterds ON "server1"
  SUBSCRIBER subscriber ON "server2",
             subscriber2 ON "server3"
RETURN RECEIPT
STORE masterds ON "server1"
  DISABLE RETURN ALL 5
  DURABLE COMMIT ON
  RESUME RETURN 8;

LOCAL COMMIT ACTION

When using the return twosafe service, you can specify how the master replication agent responds to timeout errors by:

Setting the LOCAL COMMIT ACTION attribute in the STORE clause of the CREATE REPLICATION statement
Calling the ttRepSyncSet procedure with the localAction parameter

The possible actions upon receiving a timeout during replication of a twosafe transaction are:

COMMIT - Upon timeout, the application commits the transaction and no more operations are allowed in the transaction.
NO ACTION - Upon timeout, the application does not commit the transaction. The process recovery commits the transaction. This is equivalent to a forced commit.

If the call returns with an error, you can use the ttRepXactStatus procedure described in "Checking the status of return service transactions" to check the status of the transaction. Depending on the error, your application can choose to:

Reissue the commit call - This repeats the entire return twosafe replication cycle, so that the commit call returns when the success or failure of the replicated commit on the subscriber is known or if the timeout period expires.
Roll back the transaction - If the call returns with an error related to applying the transaction on the subscriber, such as primary key lookup failure, you can roll back the transaction on the master.

Compressing replicated traffic

If you are replicating over a low-bandwidth network, or if you are replicating massive amounts of data, you can set the COMPRESS TRAFFIC attribute to reduce the amount of bandwidth required for replication. The COMPRESS TRAFFIC attribute compresses the replicated data from the data store specified by the STORE parameter in your CREATE REPLICATION or ALTER REPLICATION statement. TimesTen does not compress traffic from other data stores.

Although the compression algorithm is optimized for speed, enabling the COMPRESS TRAFFIC attribute has some impact on replication throughput and latency.

Example 8-14 Compressing traffic from one data store

To compress replicated traffic from data store dsn1 and leave the replicated traffic from dsn2 uncompressed, the CREATE REPLICATION statement looks like:

CREATE REPLICATION repscheme
ELEMENT d1 DATASTORE
    MASTER dsn1 ON machine1
    SUBSCRIBER dsn2 ON machine2
ELEMENT d2 DATASTORE
    MASTER dsn2 ON machine2
    SUBSCRIBER dsn1 ON machine1
STORE dsn1 ON machine1 COMPRESS TRAFFIC ON;

Example 8-15 Compressing traffic between both data stores

To compress the replicated traffic between both the dsn1 and dsn2 data stores, use:

CREATE REPLICATION scheme
ELEMENT d1 DATASTORE
    MASTER dsn1 ON machine1
    SUBSCRIBER dsn2 ON machine2
ELEMENT d2 DATASTORE
    MASTER dsn2 ON machine2
    SUBSCRIBER dsn1 ON machine1
STORE dsn1 ON machine1 COMPRESS TRAFFIC ON
STORE dsn2 ON machine2 COMPRESS TRAFFIC ON;

Port assignments

If you do not assign a PORT attribute, the TimesTen daemon dynamically selects the port. When ports are assigned dynamically for the replication agents, then the ports of the TimesTen daemons have to match as well. Setting the PORT attribute for one scheme sets it for all schemes.

You must assign static ports if you want to do online upgrades.

If you use CREATE REPLICATION to establish different schemes on the same data store with different PORT attributes, TimesTen ignores the setting from the last CREATE REPLICATION statement. In this case, you must use ALTER REPLICATION to change the PORT setting.

When statically assigning ports, it is important to specify the full host name, DSN and PORT in the STORE attribute of the CREATE REPLICATION statement.

Example 8-16 Assigning static ports

CREATE REPLICATION repscheme
ELEMENT el1 TABLE tab
    MASTER dsn1 ON machine1
    SUBSCRIBER dsn2 ON machine2
ELEMENT el2 TABLE tab
    MASTER dsn2 ON machine2
    SUBSCRIBER dsn1 ON machine1
STORE dsn1 ON machine1 PORT 16080
STORE dsn2 ON machine2 PORT 16083;

Setting the log failure threshold

You can establish a threshold value that, when exceeded, sets an unavailable subscriber to the Failed state before the available log space is exhausted. Use the FAILTHRESHOLD attribute to set the log failure threshold. See Example 8-24.

The default threshold value is 0, which means "no limit." See "Setting attributes for logging" for details about log failure threshold values.

If a master sets a subscriber data store to the Failed state, it drops all of the data for the failed subscriber from its log and transmits a message to the failed subscriber data store. If the master replication agent can communicate with the subscriber replication agent, then the message is transmitted immediately. Otherwise, the message is transmitted when the connection is reestablished. After receiving the message from the master, if the subscriber is configured for bidirectional replication or to propagate updates to other subscribers, it does not transmit any further updates, because its replication state has been compromised.

Any application that connects to the failed subscriber receives a tt_ErrReplicationInvalid (8025) warning indicating that the data store has been marked Failed by a replication peer. Once the subscriber data store has been informed of its failed status, its state on the master data store is changed from Failed to Stop.

Applications can use the ODBC SQLGetInfo function to check if the data store it is connected to has been set to the Failed state, as described in"Subscriber failures".

Replicating tables with different definitions

You can use the TABLE DEFINITION CHECKING attribute to enable replication of tables that are not identical. For example, if tables have columns in a different order or have a different number of partitions, you can replicate them using this clause. A table is partitioned if columns have been added after its initial creation. See Example 8-18.

Setting the TABLE DEFINITION CHECKING attribute to RELAXED requires that replicated tables have the same key definition, number of columns and column data types. Table definition checking occurs on the subscriber side. Setting this attribute to RELAXED for both master and subscriber has the same effect as setting it for only the subscriber.

The RELAXED setting usually results in slightly slower performance. The change in performance depends on the workload and the number of columns in the tables.

You can set table definition checking to RELAXED temporarily while consolidating tables with multiple partitions and then reset it to EXACT. There is no performance loss for tables with identical structures.

Example 8-17 Replicating tables with columns in different positions

Create table t1 in dsn1 data store:

CREATE TABLE t1 (a INT PRIMARY KEY, b INT, c INT);

Create table t1 in dsn2 data store with the columns in a different order than the columns in t1 in dsn1 data store. Note that the column names and data types are the same in both tables and a is the primary key in both tables.

CREATE TABLE t1 (c INT, a INT PRIMARY KEY, b INT);

Create replication scheme rep1. Set TABLE DEFINITION CHECKING to RELAXED for the subscriber, dsn2.

CREATE REPLICATION rep1
       ELEMENT e1 TABLE t1
       MASTER dsn1
       SUBSCRIBER dsn2
       STORE dsn2 TABLE DEFINITION CHECKING relaxed;

Start the replication agent for both data stores. Insert a row into t1 on dsn1.

Command> INSERT INTO t1 VALUES (4,5,6);
1 row inserted.

Verify the results on t1 on dsn2.

Command> SELECT * FROM t1;
< 5, 6, 4 >
1 row found.

Example 8-18 Replicating tables with a different number of partitions

When you add columns to a table, it increases the number of partitions in the table, even if you subsequently drop the new columns. You can use the RELAXED setting for TABLE DEFINITION CHECKING to replicate tables that have different number of partitions.

Create table t3 on dsn1 with two columns.

CREATE TABLE t3 (a INT PRIMARY KEY, b INT);

Create table t3 on dsn2 with one column that is the primary key.

CREATE TABLE t3 (a INT PRIMARY KEY);

Add a column to the table on dsn2. This increases the number of partitions to two, while the table on dsn1 has one partition.

ALTER TABLE t3 ADD COLUMN b INT;

Create the replication scheme on both data stores.

CREATE REPLICATION reppart
       ELEMENT e2 TABLE t3
       MASTER dsn1
       SUBSCRIBER dsn2
       STORE dsn2 TABLE DEFINITION CHECKING RELAXED;

Start the replication agent for both data stores. Insert a row into t3 on dsn1.

Command> INSERT INTO t3 VALUES (1,2);
1 row inserted.

Verify the results in t3 on dsn2.

Command> SELECT * FROM t3;
< 1, 2 >
1 row found.

Configuring network operations

If your replication host has more than one network interface, you may wish to configure replication to use an interface other than the default interface. Although you must specify the host name returned by the operating system's hostname command when you define a replication element, you may configure replication to send or receive traffic over a different interface using the ROUTE clause.

The syntax of the ROUTE clause is:

ROUTE MASTER FullStoreName SUBSCRIBER FullStoreName
  {{MASTERIP MasterHost | SUBSCRIBERIP SubscriberHost}
    PRIORITY Priority} [...]

Example 8-19 Configuring multiple network interfaces

If host machine1 is configured with a second interface accessible by the host name machine1fast, and machine2 is configured with a second interface at IP address 192.168.1.100, you may specify that the secondary interfaces are used with the replication scheme.

CREATE REPLICATION repscheme
ELEMENT e1 TABLE tab
    MASTER dsn1 ON machine1
    SUBSCRIBER dsn2 ON machine2
ELEMENT e2 TABLE tab
    MASTER dsn2 ON machine2
    SUBSCRIBER dsn1 ON machine1
ROUTE MASTER dsn1 ON machine1 SUBSCRIBER dsn2 ON machine2
    MASTERIP machine1fast PRIORITY 1
    SUBSCRIBERIP "192.168.1.100" PRIORITY 1
ROUTE MASTER dsn2 ON machine2 SUBSCRIBER dsn1 ON machine1
    MASTERIP "192.168.1.100" PRIORITY 1
    SUBSCRIBERIP machine1fast PRIORITY 1;

Alternately, on a replication host with more than one interface, you may wish to configure replication to use one or more interfaces as backups, in case the primary interface fails or the connection from it to the receiving host is broken. You may use the ROUTE clause to specify two or more interfaces for each master or subscriber that are used by replication in order of priority.

Example 8-20 Configuring network priority

If host machine1 is configured with two network interfaces at IP addresses 192.168.1.100 and 192.168.1.101, and host machine2 is configured with two interfaces at IP addresses 192.168.1.200 and 192.168.1.201, you may specify that replication use IP addresses 192.168.1.100 and 192.168.200 to transmit and receive traffic first, and to try IP addresses 192.168.1.101 or 192.168.1.201 if the first connection fails.

CREATE REPLICATION repscheme
ELEMENT e TABLE tab
  MASTER dsn1 ON machine1
  SUBSCRIBER dsn2 ON machine2
ROUTE MASTER dsn1 ON machine1 SUBSCRIBER dsn2 ON machine2
  MASTERIP "192.168.1.100" PRIORITY 1
  MASTERIP "192.168.1.101" PRIORITY 2
  SUBSCRIBERIP "192.168.1.200" PRIORITY 1
  SUBSCRIBERIP "192.168.1.201" PRIORITY 2;

If replication on the master host is unable to bind to the MASTERIP with the highest priority, it will try to connect using subsequent MASTERIP addresses in order of priority immediately. However, if the connection to the subscriber fails for any other reason, replication will try to connect using each of the SUBSCRIBERIP addresses in order of priority before it tries the MASTERIP address with the next highest priority.

Replication scheme syntax examples

The examples in this section illustrate how to configure a variety of replication schemes. The replication schemes include:

Single subscriber schemes
Multiple subscriber schemes with return services and a failure threshold
Replicating tables to different subscribers
Propagation scheme
Bidirectional split workload schemes
Bidirectional distributed workload scheme

Single subscriber schemes

The scheme shown in Example 8-21 is a single master and subscriber unidirectional replication scheme. The two data stores are located on separate hosts, system1 and system2. We use the RETURN RECEIPT service to confirm that all transactions committed on the tab table in the master store are received by the subscriber.

Example 8-21 Replicating one table

CREATE REPLICATION repscheme
ELEMENT e TABLE tab
    MASTER masterds ON "system1"
    SUBSCRIBER subscriberds ON "system2"
      RETURN RECEIPT;

The scheme shown in Example 8-22 is a single master and subscriber unidirectional replication scheme. The two data stores are located on separate hosts, server1 and server2. The master data store, named masterds, replicates its entire contents to the subscriber data store, named subscriberds.

Example 8-22 Replicating entire data store

CREATE REPLICATION repscheme
ELEMENT e DATASTORE
    MASTER masterds ON "server1"
    SUBSCRIBER subscriberds ON "server2";

Multiple subscriber schemes with return services and a failure threshold

You can create a replication scheme that includes up to 128 subscriber data stores.

Example 8-23 Replicating to two subscribers

This example establishes a master data store, named masterds, that replicates the tab table to two subscriber data stores, subscriber1ds and subscriber2ds, located on server2 and server3, respectively. The name of the replication scheme is twosubscribers. The name of the replication element is e.

CREATE REPLICATION twosubscribers
ELEMENT e TABLE tab
    MASTER masterds ON "server1"
    SUBSCRIBER subscriber1ds ON "server2",
               subscriber2ds ON "server3";

Example 8-24 Replicating to two subscribers with RETURN RECEIPT

This example uses the basic example in Example 8-23 and adds a RETURN RECEIPT attribute and a STORE parameter. RETURN RECEIPT enables the return receipt service for both data stores. The STORE parameter sets a FAILTHRESHOLD value of 10 to establish the maximum number of transaction log files that can accumulate on masterds for a subscriber before it assumes the subscriber has failed.

CREATE REPLICATION twosubscribers
ELEMENT e TABLE rel.tab
  MASTER masterds ON "server1"
  SUBSCRIBER subscriber1ds ON "server2",
             subscriber2ds ON "server3"
  RETURN RECEIPT
STORE masterds FAILTHRESHOLD 10;

Example 8-25 Enabling RETURN RECEIPT for only one subscriber

This example shows how to enable RETURN RECEIPT for only subscriber2ds. Note that there is no comma after the subscriber1ds definition.

CREATE REPLICATION twosubscribers
ELEMENT e TABLE tab
    MASTER masterds ON "server1"
    SUBSCRIBER subscriber1ds ON "server2"
    SUBSCRIBER subscriber2ds ON "server3" RETURN RECEIPT
STORE masterds FAILTHRESHOLD 10;

Example 8-26 Enabling different return services for subscribers

This example shows how to apply RETURN RECEIPT BY REQUEST to subscriber1ds and RETURN RECEIPT to subscriber2ds. In this scheme, applications accessing subscriber1ds must use the ttRepSyncSet procedure to enable the return services for a transaction, while subscriber2ds unconditionally provides return services for all transactions.

CREATE REPLICATION twosubscribers
ELEMENT e TABLE tab
    MASTER masterds ON "server1"
    SUBSCRIBER subscriberds1 ON "server2" RETURN RECEIPT BY REQUEST
    SUBSCRIBER subscriber2ds ON "server3" RETURN RECEIPT
STORE masterds FAILTHRESHOLD 10;

Replicating tables to different subscribers

The replication scheme shown in Example 8-27 establishes a master data store, named centralds, that replicates four tables. tab1 and tab2 are replicated to the subscriber backup1ds. tab3 and tab4 are replicated to backup2ds. The master data store is located on the finance server. Both subscribers are located on the backupsystem server.

Example 8-27 Replicating tables to different subscribers

CREATE REPLICATION twobackups
ELEMENT a TABLE tab1
  MASTER centralds ON "finance"
  SUBSCRIBER backup1ds ON "backupsystem"
ELEMENT b TABLE tab2
  MASTER centralds ON "finance"
  SUBSCRIBER backup1ds ON "backupsystem"
ELEMENT d TABLE tab3
  MASTER centralds ON "finance"
  SUBSCRIBER backup2ds ON "backupsystem"
ELEMENT d TABLE tab4
  MASTER centralds ON "finance"
  SUBSCRIBER backup2ds ON "backupsystem";

Propagation scheme

In Example 8-28, t he master data store sends updates on a table to a propagator that forwards the changes to two subscribers. The master data store is centralds on the finance host. The propagator data store is propds on the nethandler host. The subscribers are backup1ds on backupsystem1 and backup2ds on backupsystem2.

The replication scheme has two elements. For ELEMENT a, the changes to the tab table on centralds are replicated to the propds propagator data store. For ELEMENT b, the changes to the tab table received by propds are replicated to the two subscribers, backup1ds and backup2ds.

Example 8-28 Propagation

CREATE REPLICATION propagator
ELEMENT a TABLE tab
  MASTER centralds ON "finance"
  SUBSCRIBER propds ON "nethandler"
ELEMENT b TABLE tab
  PROPAGATOR propds ON "nethandler"
  SUBSCRIBER backup1ds ON "backupsystem1",
             backup2ds ON "backupsystem2";

Bidirectional split workload schemes

In Example 8-29, there are two data stores, westds on the westcoast host and eastds on the eastcoast host. Customers are represented in two tables: waccounts contains data for customers in the Western region and eaccounts has data for customers from the Eastern region. The westds data store updates the waccounts table and replicates it to the eastds data store. The eaccounts table is owned by the eastds data store and is replicated to the westds data store. The RETURN RECEIPT attribute enables the return receipt service to guarantee that transactions on either master table are received by their subscriber.

Example 8-29 Bidirectional split workload

CREATE REPLICATION r1
ELEMENT elem_waccounts TABLE waccounts
  MASTER westds ON "westcoast"
  SUBSCRIBER eastds ON "eastcoast" RETURN RECEIPT
ELEMENT elem_eaccounts TABLE eaccounts
  MASTER eastds ON "eastcoast"
  SUBSCRIBER westds ON "westcoast" RETURN RECEIPT;

Bidirectional distributed workload scheme

Example 8-30 shows a bidirectional general workload replication scheme in which the accounts table can be updated on either the eastds or westds data store. Each data store is both a master and a subscriber for the accounts table.

Note:

A bidirectional distributed workload replication scheme should not be used with the return twosafe return service.

Example 8-30 Bidirectional distributed workload scheme

CREATE REPLICATION r1
ELEMENT elem_accounts_1 TABLE accounts
  MASTER westds ON "westcoast"
  SUBSCRIBER eastds ON "eastcoast"
ELEMENT elem_accounts_2 TABLE accounts
  MASTER eastds ON "eastcoast"
  SUBSCRIBER westds ON "westcoast";

When elements are replicated in this manner, the applications should write to each data store in a coordinated manner to avoid simultaneous updates on the same data. To manage update conflicts, include a timestamp column of type BINARY(8) in the replicated table and enable timestamp comparison by including the CHECK CONFLICTS clause in the CREATE REPLICATION statement. See Chapter 13, "Resolving Replication Conflicts" for a complete discussion on how to manage update conflicts.

Example 8-31 shows that the tstamp timestamp column is included in the accounts table. The CREATE REPLICATION statement has been modified to include the CHECK CONFLICTS clause.

Example 8-31 Managing update conflicts

CREATE TABLE accounts (custname VARCHAR2(30) NOT NULL,
                       address VARCHAR2(80),
                       curbalance DEC(15,2),
                       tstamp BINARY(8),
                       PRIMARY KEY (custname));

CREATE REPLICATION r1
ELEMENT elem_accounts_1 TABLE accounts
  CHECK CONFLICTS BY ROW TIMESTAMP
    COLUMN tstamp
    UPDATE BY SYSTEM
    ON EXCEPTION ROLLBACK WORK
  MASTER westds ON "westcoast"
  SUBSCRIBER eastds ON "eastcoast"
ELEMENT elem_accounts_2 TABLE accounts
  CHECK CONFLICTS BY ROW TIMESTAMP
    COLUMN tstamp
    UPDATE BY SYSTEM
    ON EXCEPTION ROLLBACK WORK
  MASTER eastds ON "eastcoast"
  SUBSCRIBER westds ON "westcoast";

Creating replication schemes with scripts

Creating your replication schemes with scripts can save you time and help you avoid mistakes. This section provides some suggestions for automating the creation of replication schemes using Perl.

Consider the general workload bidirectional scheme shown in Example 8-32. Entering the ELEMENT description for the five tables, accounts, sales, orders, inventory, and customer, would be tedious and error-prone if done manually.

Example 8-32 General workload bidirectional replication scheme

CREATE REPLICATION bigscheme
ELEMENT elem_accounts_1 TABLE accounts
  MASTER westds ON "westcoast"
  SUBSCRIBER eastds ON "eastcoast"
ELEMENT elem_accounts_2 TABLE accounts
  MASTER eastds ON "eastcoast"
  SUBSCRIBER westds ON "westcoast"
ELEMENT elem_sales_1 TABLE sales
  MASTER westds ON "westcoast"
  SUBSCRIBER eastds ON "eastcoast"
ELEMENT elem_sales_2 TABLE sales
  MASTER eastds ON "eastcoast"
  SUBSCRIBER westds ON "westcoast"
ELEMENT elem_orders_1 TABLE orders
  MASTER westds ON "westcoast"
  SUBSCRIBER eastds ON "eastcoast"
ELEMENT elem_orders_2 TABLE orders
  MASTER eastds ON "eastcoast"
  SUBSCRIBER westds ON "westcoast"
ELEMENT elem_inventory_1 TABLE inventory
  MASTER westds ON "westcoast"
  SUBSCRIBER eastds ON "eastcoast"
ELEMENT elem_inventory_2 TABLE inventory
  MASTER eastds ON "eastcoast"
  SUBSCRIBER westds ON "westcoast"
ELEMENT elem_customers_1 TABLE customers
  MASTER westds ON "westcoast"
  SUBSCRIBER eastds ON "eastcoast"
ELEMENT elem_customers_2 TABLE customers
  MASTER eastds ON "eastcoast"
  SUBSCRIBER westds ON "westcoast";

It is often more convenient to automate the process of writing a replication scheme with scripting. For example, the Perl script shown in Example 8-33 can be used to build the scheme shown in Example 8-32.

Example 8-33 Using a Perl script to create a replication scheme

@tables = qw(
  accounts
  sales
  orders
  inventory
  customers
);

print "CREATE REPLICATION bigscheme";

foreach $table (@tables) {
  $element = $table;
  $element =~ s/repl\./elem\_/;

  print "\n";
  print " ELEMENT $element\_1 TABLE $table\n";
  print " MASTER westds ON \"westcoast\"\n";
  print " SUBSCRIBER eastds ON \"eastcoast\"\n";
  print " ELEMENT $element\_2 TABLE $table\n";
  print " MASTER eastds ON \"eastcoast\"\n";
  print " SUBSCRIBER westds ON \"westcoast\"";
 }
print ";\n";

The @tables array shown in Example 8-33 can be obtained from some other source, such as a data store. For example, you can use ttIsql and f in a Perl statement to generate a @tables array for all of the tables in the WestDSN data store with the owner name repl:

@tables = 'ttIsql -e "tables; quit" WestDSN
           | grep " REPL\."';

Example 8-34 shows a modified version of the script in Example 8-33 that creates a replication scheme for all of the repl tables in the WestDSN data store. (Note that some substitution may be necessary to remove extra spaces and line feeds from the grep output.)

Example 8-34 Perl script to create a replication scheme for all tables in WestDSN

@tables = 'ttIsql -e "tables; quit" WestDSN
           | grep " REPL\."';

print "CREATE REPLICATION bigscheme";

foreach $table (@tables) {
  $table =~ s/^\s*//; # Remove extra spaces
  $table =~ s/\n//; # Remove line feeds
  $element = $table;
  $element =~ s/repl\./elem\_/;

  print "\n";
  print " ELEMENT $element\_1 TABLE $table\n";
  print " MASTER westds ON \"westcoast\"\n";
  print " SUBSCRIBER eastds ON \"eastcoast\"\n";
  print " ELEMENT $element\_2 TABLE $table\n";
  print " MASTER eastds ON \"eastcoast\"\n";
  print " SUBSCRIBER westds ON \"westcoast\"";
 }
print ";\n";