Understanding System Replication in the latest release of SAP HANA

In early April 2018, SAP released HANA 2.0 SPS3. A new SPS is big news with HANA but, in my opinion, it is oddly named. SPS stands for Support Package Stack.

 

One usually thinks of a support package as a collection of bug fixes, but with HANA an SPS is when new features are introduced. A new SPS now arrives annually and goes into production support immediately.

 

For the full list of new HANA features, follow this link.

 

Significant changes to system replication

 

As someone who designs and runs infrastructure for HANA, what caught my eye was a significant change to System Replication. HANA System Replication is HANA’s implementation of database log shipping.

 

Essentially, it is a method for keeping a replica of a live database on a different system. HANA architects use System Replication to implement both high availability (HA) and disaster recovery (DR).

 

Both SUSE and Red Hat have cluster resource agents for HANA that exploit System Replication, allowing for easy and reliable deployment of two-node highly available HANA clusters. System Replication is also the technology behind the HANA Tenant Copy/Move functionality.

 

What were its biggest flaws?

 

The biggest flaw in HANA System Replication, before HANA 2.0 SPS3, was that a live database could only replicate to a single target.

 

The target system could then act as a source system for a third node, creating a chain of systems.

 

This chain system was implemented for many customers that required both a local HA system and replication for disaster recovery.

 

 

Although this system looks elegant, it is operationally tricky. Take the following examples:

Node 01 Fails

 

Everything fails over elegantly, but when Node01 comes back online, System Replication needs to be entirely reconfigured so that Node02 is the source for Node01 and Node01 is the source for Node03.

 

During the reconfiguration and resyncing, DR protection is not available. Depending on the size of the database, this could take anywhere from 30 minutes to many hours.
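
To make the reconfiguration cost concrete, here is a minimal sketch that models each topology as a mapping from a replication target to its source and lists the links that have to change when Node01 returns. The dictionary model and the function name links_to_reconfigure are illustrative assumptions, not anything HANA provides.

```python
# Each topology maps a replication target to the source it is registered against.
# Node names follow the article; the dict model itself is only an illustration.

def links_to_reconfigure(before, after):
    """Return the targets whose source differs between two topologies."""
    return sorted(t for t, src in after.items() if before.get(t) != src)

# Chain while Node02 runs as primary after the takeover (Node01 is still down).
after_takeover = {"Node03": "Node02"}

# Chain once Node01 is back in the middle: Node02 -> Node01 -> Node03.
after_return = {"Node01": "Node02", "Node03": "Node01"}

changed = links_to_reconfigure(after_takeover, after_return)
print(changed)  # ['Node01', 'Node03']
```

Node03 appears in the result even though it never failed, which is why the DR copy has to be re-registered and resynced in the chained design.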

 

 

Node 02 Fails

 

If Node02 fails, Node03 stops receiving updates until Node02 is back online. If Node02 stays down for a significant period, then Node03 must be reconfigured as the System Replication target for Node01 in order to meet the RPO and RTO.

 

When Node02 comes back online, the whole system will need to be reverted to its original state. Again, there will be significant periods of time during which DR is not available.
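
The dependency of Node03 on Node02 can be sketched with a small reachability check. This is an illustrative model only: receives_updates is a hypothetical helper that walks from a node back to the primary and reports whether every hop on the way is up.

```python
def receives_updates(node, primary, sources, down):
    """True if every hop from node back to the primary is up."""
    seen = set()
    while node != primary:
        if node in down or node in seen or node not in sources:
            return False
        seen.add(node)
        node = sources[node]
    return primary not in down

# Chained topology from the article: Node01 -> Node02 -> Node03.
chain = {"Node02": "Node01", "Node03": "Node02"}
print(receives_updates("Node03", "Node01", chain, down={"Node02"}))  # False

# Re-pointing Node03 directly at Node01 restores the DR copy.
repointed = {**chain, "Node03": "Node01"}
print(receives_updates("Node03", "Node01", repointed, down={"Node02"}))  # True
```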

 

 

How SPS3 changes this

 

On production systems, most HANA users implement System Replication for HA, DR or both. Before HANA 2.0 SPS3 this meant that the MDC feature Tenant Copy/Move was unavailable, because the production system's single replication target was already in use.

 

Tenant Copy/Move is a great feature that allows for a HANA tenant database to be copied or moved to another HANA system. It is especially useful when refreshing test systems with production data.

 

It’s quicker and easier than a backup / restore operation, and the copy is more up to date. Sadly, it was unavailable to most customers until recently.

 

Since the release of HANA 2.0 SPS3, I’ve spent a bit of time both testing and thinking about how multi-target System Replication changes the way we can design and operate HANA systems. Firstly, using System Replication for HA and DR becomes much simpler.

 

 

Node02 and Node03 are connected directly to Node01; there is no longer a chain. A failure of Node01 will still require some reconfiguration. Node03 will need to be configured to use Node02 as the System Replication source.

 

But once this is done, no complicated reversal of the process is necessary; only one change is needed following a failover event.
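
The same failover can be sketched for the multi-target layout. As before, this is only a model under assumed names: take_over is a hypothetical helper that re-points every target of the failed primary at the new one.

```python
def take_over(sources, failed, new_primary):
    """Return the topology after new_primary takes over from failed."""
    result = {}
    for target, source in sources.items():
        if target == new_primary:
            continue  # the new primary no longer replicates from anyone
        result[target] = new_primary if source == failed else source
    return result

# Multi-target topology: Node02 and Node03 both replicate from Node01.
multi_target = {"Node02": "Node01", "Node03": "Node01"}
after_takeover = take_over(multi_target, failed="Node01", new_primary="Node02")
print(after_takeover)  # {'Node03': 'Node02'}

# When Node01 returns it joins as a target of Node02; Node03 is left alone.
after_return = {**after_takeover, "Node01": "Node02"}
print(after_return["Node03"])  # Node02
```

In this model Node03's source changes once, at takeover, and stays the same when Node01 rejoins.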

 

It is also possible, with a little software engineering effort, for Node03 to sense if a failover has occurred and automatically re-sync to the active node!
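
A rough sketch of what that engineering effort might look like follows. It only builds the re-registration command and does not run it. How Node03 learns which node is currently active is left out and would need its own mechanism. The function name, the site name "DR", the instance number "00" and the chosen replication and operation modes are all assumptions for illustration, and the hdbnsutil options should be checked against the documentation for the installed revision, including whether the secondary has to be stopped first.

```python
def reregister_command(current_source, active_primary, instance="00",
                       site_name="DR", replication_mode="async",
                       operation_mode="logreplay"):
    """Return an hdbnsutil argument list, or None if no change is needed."""
    if current_source == active_primary:
        return None  # already replicating from the active node
    # All option values here are illustrative assumptions.
    return [
        "hdbnsutil", "-sr_register",
        f"--remoteHost={active_primary}",
        f"--remoteInstance={instance}",
        f"--replicationMode={replication_mode}",
        f"--operationMode={operation_mode}",
        f"--name={site_name}",
    ]

# Node03 was replicating from Node01, but Node02 has taken over.
cmd = reregister_command(current_source="Node01", active_primary="Node02")
print(" ".join(cmd))
```

A wrapper script could call this periodically and hand the result to an operator, or to subprocess.run once the procedure has been validated on a test system.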

 

 

What does multi-target replication enable?

 

Multi-target replication means that it is possible to copy the production database to a test system without breaking the System Replication relationships for HA or DR. This streamlines the data refresh cycle for customers (but sadly does nothing to improve BDLS performance).

 

 

Making DR easier to deploy and maintain

 

In conclusion, multi-target System Replication will make systems that simultaneously use System Replication for HA and DR easier to deploy and maintain. HANA users should also take advantage of Tenant Copy to speed up and simplify the refreshing of test data, rather than using the traditional backup and restore methods.

 

If you would like to know more about System Replication and the latest version of SAP HANA then please get in touch with Centiq.

 
