Version: 3.5 (unsupported)

How to Back Up and Restore Databases Used Through ScalarDB

Because ScalarDB provides transaction capabilities non-invasively on top of non-transactional or transactional databases, you need to take special care to back up and restore those databases in a transactionally consistent way.

This guide describes how to back up and restore the databases that ScalarDB supports.

Create a backup

How you create a backup depends on which database you're using and whether or not you're using multiple databases. The following decision tree shows which approach you should take.

Back up without explicit pausing

If you're using ScalarDB with a single database with support for transactions, you can create a backup of the database even while ScalarDB continues to accept transactions.

warning

Before creating a backup, you should consider the safest way to create a transactionally consistent backup of your databases and understand any risks that are associated with the backup process.

One requirement for creating a backup in ScalarDB is that backups for all the ScalarDB-managed tables (including the Coordinator table) need to be transactionally consistent or automatically recoverable to a transactionally consistent state. That means that you need to create a consistent backup by dumping all tables in a single transaction.
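As a minimal sketch of what "dumping all tables in a single transaction" looks like, the following example reads every table inside one explicit transaction. SQLite is used here only as a self-contained stand-in, and the table and column names (`coordinator_state`, `orders`, `tx_id`, `tx_state`) are hypothetical. For a real deployment you would more likely rely on your database's own dump tool and its single-transaction mode.

```python
import sqlite3
from typing import Dict, List

# Hypothetical table names: one Coordinator table and one application table.
TABLES = ["coordinator_state", "orders"]


def dump_all_tables(conn: sqlite3.Connection, tables: List[str]) -> Dict[str, list]:
    """Read every table inside one transaction so all rows come from one snapshot."""
    conn.execute("BEGIN")
    try:
        return {t: conn.execute(f'SELECT * FROM "{t}"').fetchall() for t in tables}
    finally:
        # The transaction is read-only, so COMMIT simply ends it.
        conn.execute("COMMIT")


# isolation_level=None puts the connection in autocommit mode,
# so the BEGIN and COMMIT above are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE coordinator_state (tx_id TEXT PRIMARY KEY, tx_state INTEGER)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, tx_id TEXT)")
conn.execute("INSERT INTO coordinator_state VALUES ('tx1', 3)")
conn.execute("INSERT INTO orders VALUES (1, 'tx1')")

backup = dump_all_tables(conn, TABLES)
print(backup)
```

The point of the sketch is that the Coordinator table and the application tables are read under the same transaction, so the dump cannot contain a record whose transaction state is missing from the Coordinator table.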

How you create a transactionally consistent backup depends on the type of database that you're using. Select a database to see how to create a transactionally consistent backup for ScalarDB.

note

The backup methods by database listed below are just examples of some of the databases that ScalarDB supports.

For databases that offer an automated backup feature, you can restore to any point within the backup retention period by using that feature.

Back up with explicit pausing

Another way to create a transactionally consistent backup is to create a backup while a cluster of ScalarDB instances does not have any outstanding transactions. Creating the backup depends on the following:

  • If the underlying database has a point-in-time snapshot or backup feature, you can create a backup during the period when no outstanding transactions exist.
  • If the underlying database has a point-in-time restore or recovery (PITR) feature, you can set the restore point to a time (preferably the mid-time) within the pause duration, when no outstanding transactions exist.

note

When using a PITR feature, you should minimize clock drift between clients and servers by using clock synchronization, such as NTP. Otherwise, the time you record as the pause duration might differ too much from the time when the pause actually occurred, and the restore point could fall at a time when transactions were still ongoing.

In addition, because clock synchronization cannot perfectly synchronize clocks between nodes, you should pause for a sufficient amount of time (for example, five seconds) and use the mid-time of the pause duration as the restore point.
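The mid-time calculation can be sketched as follows. The function name `restore_point` and the `MIN_PAUSE` threshold are assumptions made for illustration; the five-second value simply mirrors the example above and is not a required setting.

```python
from datetime import datetime, timedelta, timezone

# Assumed minimum pause length; the guide mentions five seconds only as an example.
MIN_PAUSE = timedelta(seconds=5)


def restore_point(pause_start: datetime, pause_end: datetime) -> datetime:
    """Return the mid-time of the pause window for use as a PITR restore point."""
    if pause_end <= pause_start:
        raise ValueError("pause_end must be later than pause_start")
    duration = pause_end - pause_start
    if duration < MIN_PAUSE:
        raise ValueError(f"pause of {duration} is shorter than {MIN_PAUSE}")
    # The midpoint leaves equal margin on both sides for residual clock drift.
    return pause_start + duration / 2


start = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
end = start + timedelta(seconds=6)
print(restore_point(start, end).isoformat())
```

Using timezone-aware timestamps avoids mixing local time and UTC when the value is later passed to the database's restore tooling.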

To make ScalarDB drain outstanding requests and stop accepting new requests so that a pause duration can be created, you should either properly implement the Scalar Admin interface in your application that uses ScalarDB, or use ScalarDB Server, which already implements the Scalar Admin interface.

By using the Scalar Admin client tool, you can pause nodes, servers, or applications that implement the Scalar Admin interface without losing ongoing transactions.
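The overall pause, back up, and unpause sequence can be sketched as a small wrapper that records the pause window and always resumes processing, even if the backup step fails. The `pause` and `unpause` callables are hypothetical placeholders for however you invoke the Scalar Admin client tool in your environment; this sketch does not call any real Scalar Admin API.

```python
from contextlib import contextmanager
from datetime import datetime, timezone
from typing import Callable, Dict, Iterator


@contextmanager
def pause_window(
    pause: Callable[[], None], unpause: Callable[[], None]
) -> Iterator[Dict[str, datetime]]:
    """Pause, yield a dict that records the window, and always unpause afterwards.

    `pause` and `unpause` are hypothetical callables supplied by the caller.
    """
    window: Dict[str, datetime] = {}
    pause()
    # Record the start only after pausing has completed.
    window["start"] = datetime.now(timezone.utc)
    try:
        yield window
    finally:
        # Record the end before resuming, so the window lies inside the real pause.
        window["end"] = datetime.now(timezone.utc)
        unpause()


events = []
with pause_window(
    lambda: events.append("pause"), lambda: events.append("unpause")
) as window:
    events.append("backup")  # placeholder for taking the snapshot or waiting

print(events)
print(window["start"] <= window["end"])
```

Recording the start after the pause call returns and the end before the unpause call begins keeps the recorded window conservative, which suits the mid-time approach described in the note.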

How you create a transactionally consistent backup depends on the type of database that you're using. Select a database to see how to create a transactionally consistent backup for ScalarDB.

note

The backup methods by database listed below are just examples of some of the databases that ScalarDB supports.

Cassandra has a built-in replication feature, so you do not always have to create a transactionally consistent backup. For example, if the replication factor is set to 3 and only the data of one of the nodes in a Cassandra cluster is lost, you won't need a transactionally consistent backup (snapshot) because the node can be recovered by using a normal, transactionally inconsistent backup (snapshot) and the repair feature.

However, if a quorum of the cluster nodes lose their data, you will need a transactionally consistent backup (snapshot) to restore the cluster to a transactionally consistent point.
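The replication reasoning above can be expressed as a small check. This is a simplified sketch that assumes quorum means a majority of the replicas for a given piece of data (`replication_factor // 2 + 1`); the function name and parameters are hypothetical, and a real cluster assessment would also need to account for token ranges and topology.

```python
def needs_consistent_backup(replication_factor: int, replicas_lost: int) -> bool:
    """Return True if a quorum of replicas is lost, under the majority-quorum assumption."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if not 0 <= replicas_lost <= replication_factor:
        raise ValueError("replicas_lost must be between 0 and replication_factor")
    quorum = replication_factor // 2 + 1  # assumed definition: simple majority
    return replicas_lost >= quorum


# Replication factor 3 with one node lost: repair from the surviving replicas suffices.
print(needs_consistent_backup(3, 1))
# Replication factor 3 with two nodes lost: a consistent snapshot is needed.
print(needs_consistent_backup(3, 2))
```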

To create a transactionally consistent cluster-wide backup (snapshot), do one of the following: pause the application that is using ScalarDB (or pause ScalarDB Server) and create backups (snapshots) of the nodes as described in Back up with explicit pausing; or stop the Cassandra cluster, take copies of all the data in the nodes, and then start the cluster again.

Restore a backup

How you restore a transactionally consistent backup depends on the type of database that you're using. Select a database to see how to restore a transactionally consistent backup for ScalarDB.

note

The restore methods by database listed below are just examples of some of the databases that ScalarDB supports.

You can restore to any point within the backup retention period by using the automated backup feature.