This tutorial describes how to run benchmarking tools for ScalarDB. Database benchmarking is helpful for evaluating how databases perform against a set of standards.

Benchmark workloads

  • TPC-C
  • YCSB (Workloads A, C, and F)
  • Multi-storage YCSB (Workloads C and F)
    • This YCSB variant is for a multi-storage environment that uses ScalarDB.
    • Workers in a multi-storage YCSB execute the same number of read and write operations in two namespaces: ycsb_primary and ycsb_secondary.

Prerequisites

  • A Java Development Kit (JDK); see the note below about the supported version
  • Gradle
  • Kelpie
    • Kelpie is a framework for performing end-to-end testing, such as system benchmarking and verification. Get the latest version from Kelpie Releases, and unzip the archive file.
  • A client to run the benchmarking tools
  • A target database

Note

Currently, only JDK 8 can be used when running the benchmarking tools.
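
To check which JDK version the client is currently using, you can run the following command:

$ java -version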

Set up the benchmarking tools

The following sections describe how to set up the benchmarking tools.

Clone the ScalarDB benchmarks repository

Open Terminal, then clone the ScalarDB benchmarks repository by running the following command:

$ git clone https://github.com/scalar-labs/scalardb-benchmarks

Then, go to the directory that contains the benchmarking files by running the following command:

$ cd scalardb-benchmarks

Build the tools

To build the benchmarking tools, run the following command:

$ ./gradlew shadowJar
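
The build creates the fat JAR ./build/libs/scalardb-benchmarks-all.jar, which is the module path referenced by the benchmark configuration file later in this tutorial. To confirm that the build succeeded, you can check that the file exists:

$ ls ./build/libs/scalardb-benchmarks-all.jar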

Load the schema

Before loading the initial data, the tables must be defined by using the ScalarDB Schema Loader. To apply the schema, go to the ScalarDB Releases page, download the version of the ScalarDB Schema Loader that matches the version of ScalarDB you are using, and place it in the scalardb-benchmarks root folder.

In addition, you need a properties file that contains database configurations for ScalarDB. For details about configuring the ScalarDB properties file, see ScalarDB Configurations.
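
The following is a minimal sketch of such a properties file, assuming a Cassandra backend running on localhost with the default credentials (matching the commented-out defaults in the benchmark configuration shown later in this tutorial); adjust the values for your own database:

# scalardb.properties (example only; adjust for your environment)
scalar.db.storage=cassandra
scalar.db.contact_points=localhost
scalar.db.contact_port=9042
scalar.db.username=cassandra
scalar.db.password=cassandra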

After downloading the Schema Loader and configuring the properties file, select a benchmark and follow the instructions below to create its tables.

To create tables for TPC-C benchmarking (tpcc-schema.json), run the following command, replacing <VERSION> with the version of the Schema Loader that you downloaded and <PATH_TO_SCALARDB_PROPERTIES_FILE> with the path to your ScalarDB properties file:

$ java -jar scalardb-schema-loader-<VERSION>.jar --config <PATH_TO_SCALARDB_PROPERTIES_FILE> -f tpcc-schema.json --coordinator

To create tables for YCSB benchmarking (ycsb-schema.json), run the following command, replacing the angle-bracket placeholders in the same way:

$ java -jar scalardb-schema-loader-<VERSION>.jar --config <PATH_TO_SCALARDB_PROPERTIES_FILE> -f ycsb-schema.json --coordinator

To create tables for multi-storage YCSB benchmarking (ycsb-multi-storage-schema.json), run the following command, replacing the angle-bracket placeholders in the same way:

$ java -jar scalardb-schema-loader-<VERSION>.jar --config <PATH_TO_SCALARDB_PROPERTIES_FILE> -f ycsb-multi-storage-schema.json --coordinator

Prepare a benchmarking configuration file

To run a benchmark, you must first prepare a benchmarking configuration file. The configuration file requires at least the locations of the workload modules to run and the database configuration.

The following is an example configuration for running the TPC-C benchmark. The ScalarDB properties file specified for config_file should be the properties file for the benchmarking environment that you previously set up.

Note

Alternatively, instead of using the ScalarDB properties file, you can specify each database configuration item in the .toml file. If config_file is specified, all other configurations under [database_config] will be ignored even if they are uncommented.

[modules]
[modules.preprocessor]
name = "com.scalar.db.benchmarks.tpcc.TpccLoader"
path = "./build/libs/scalardb-benchmarks-all.jar"
[modules.processor]
name = "com.scalar.db.benchmarks.tpcc.TpccBench"
path = "./build/libs/scalardb-benchmarks-all.jar"
[modules.postprocessor]
name = "com.scalar.db.benchmarks.tpcc.TpccReporter"
path = "./build/libs/scalardb-benchmarks-all.jar"

[database_config]
config_file = "<PATH_TO_SCALARDB_PROPERTIES_FILE>"
#contact_points = "localhost"
#contact_port = 9042
#username = "cassandra"
#password = "cassandra"
#storage = "cassandra"

You can also define parameters to pass to the modules in the configuration file. For details, see the sample configuration files used in the next section (tpcc-benchmark-config.toml, ycsb-benchmark-config.toml, and ycsb-multi-storage-benchmark-config.toml) and the parameters listed under Common parameters and Workload-specific parameters below.
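
For example, the common parameters described later in this tutorial can be set in the configuration file. The following is a minimal sketch that assumes Kelpie's standard [common] section; check the sample configuration files for the exact section and parameter names used by each workload:

[common]
concurrency = 4     # number of benchmarking threads
run_for_sec = 300   # benchmark duration in seconds
ramp_for_sec = 60   # ramp-up time before measurement in seconds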

Run a benchmark

Select a benchmark, and follow the instructions to run the benchmark.

To run the TPC-C benchmark, run the following command, replacing <PATH_TO_KELPIE> with the path to the Kelpie directory:

$ /<PATH_TO_KELPIE>/bin/kelpie --config tpcc-benchmark-config.toml

To run the YCSB benchmark, run the following command, replacing <PATH_TO_KELPIE> with the path to the Kelpie directory:

$ /<PATH_TO_KELPIE>/bin/kelpie --config ycsb-benchmark-config.toml

To run the multi-storage YCSB benchmark, run the following command, replacing <PATH_TO_KELPIE> with the path to the Kelpie directory:

$ /<PATH_TO_KELPIE>/bin/kelpie --config ycsb-multi-storage-benchmark-config.toml

In addition, the following options are available (see the example after this list):

  • --only-pre. Only loads the data.
  • --only-process. Only runs the benchmark.
  • --except-pre. Runs a job without loading the data.
  • --except-process. Runs a job without running the benchmark.
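
For example, to load the TPC-C initial data without running the benchmark itself, you could combine the command shown above with one of these options:

$ /<PATH_TO_KELPIE>/bin/kelpie --config tpcc-benchmark-config.toml --only-pre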

Common parameters

| Name | Description | Default |
| ---- | ----------- | ------- |
| concurrency | Number of threads for benchmarking. | 1 |
| run_for_sec | Duration of benchmark (in seconds). | 60 |
| ramp_for_sec | Duration of ramp-up time before benchmark (in seconds). | 0 |

Workload-specific parameters

The available workload parameters for each benchmark are listed below.

TPC-C

| Name | Description | Default |
| ---- | ----------- | ------- |
| num_warehouses | Number of warehouses (scale factor) for benchmarking. | 1 |
| load_concurrency | Number of threads for loading. | 1 |
| load_start_warehouse | Start ID of the loading warehouse. This option can be useful with skip_item_load when loading large-scale data with multiple clients or when adding warehouses. | 1 |
| load_end_warehouse | End ID of the loading warehouse. You can use either num_warehouses or load_end_warehouse to specify the number of loading warehouses. | 1 |
| skip_item_load | Whether or not to skip loading the item table. | false |
| use_table_index | Whether or not to use a generic table-based secondary index instead of ScalarDB's secondary index. | false |
| np_only | Run the benchmark with only new-order and payment transactions (50% each). | false |
| rate_new_order | Percentage of new-order transactions. If you specify any rate parameter, you must specify all of them, and they must total 100 percent. | N/A |
| rate_payment | Percentage of payment transactions. If you specify any rate parameter, you must specify all of them, and they must total 100 percent. | N/A |
| rate_order_status | Percentage of order-status transactions. If you specify any rate parameter, you must specify all of them, and they must total 100 percent. | N/A |
| rate_delivery | Percentage of delivery transactions. If you specify any rate parameter, you must specify all of them, and they must total 100 percent. | N/A |
| rate_stock_level | Percentage of stock-level transactions. If you specify any rate parameter, you must specify all of them, and they must total 100 percent. | N/A |
| backoff | Sleep time in milliseconds inserted after a transaction is aborted due to a conflict. | 0 |
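
As a hypothetical sketch of how these parameters might appear in the benchmark configuration file (the section name used here, [tpcc_config], is an assumption; confirm it against the sample tpcc-benchmark-config.toml in the repository):

[tpcc_config]           # section name is an assumption; check the sample config file
num_warehouses = 10     # scale factor
load_concurrency = 4    # threads used for the initial data load
np_only = true          # run only new-order and payment transactions (50% each)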

YCSB and multi-storage YCSB

| Name | Description | Default |
| ---- | ----------- | ------- |
| load_concurrency | Number of threads for loading. | 1 |
| load_batch_size | Number of put records in a single loading transaction. | 1 |
| load_overwrite | Whether or not to overwrite when loading records. | false |
| ops_per_tx | Number of operations in a single transaction. | 2 (Workloads A and C), 1 (Workload F) |
| record_count | Number of records in the target table. | 1000 |
| use_read_modify_write | Whether or not to use read-modify-writes instead of blind writes in Workload A. | false* |

* The default value of use_read_modify_write is false because Workload A doesn't assume that the transaction reads the original record first. However, if you're using Consensus Commit as the transaction manager, you must set use_read_modify_write to true because ScalarDB doesn't allow a blind write for an existing record.
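
Similarly, a hypothetical sketch for the YCSB workloads (the section name used here, [ycsb_config], is an assumption; confirm it against the sample ycsb-benchmark-config.toml in the repository):

[ycsb_config]                  # section name is an assumption; check the sample config file
record_count = 100000          # number of records in the target table
ops_per_tx = 2                 # operations per transaction
use_read_modify_write = true   # required when using Consensus Commit as the transaction manager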