ScalarDB Schema Loader
ScalarDB has its own data model and schema that maps to the implementation-specific data model and schema. In addition, ScalarDB stores internal metadata, such as transaction IDs, record versions, and transaction statuses, to manage transaction logs and statuses when you use the Consensus Commit transaction manager.
Since managing the schema mapping and metadata for transactions can be difficult, you can use ScalarDB Schema Loader, which is a tool to create schemas that doesn't require you to need in-depth knowledge about schema mapping or metadata.
You have two options to specify general CLI options in Schema Loader:
- Pass the ScalarDB properties file and database-specific or storage-specific options.
- Pass database-specific or storage-specific options without the ScalarDB properties file. (Deprecated)
This tool supports only basic options to create, delete, repair, or alter a table. If you want to use the advanced features of a database, you must alter your tables with a database-specific tool after creating the tables with this tool.
Set up Schema Loader
Select your preferred method to set up Schema Loader, and follow the instructions.
- Fat JAR
- Docker container
You can download the release versions of Schema Loader from the ScalarDB Releases page.
You can pull the Docker image from the Scalar container registry by running the following command, replacing the contents in the angle brackets as described:
docker run --rm -v <PATH_TO_YOUR_LOCAL_SCHEMA_FILE>:<PATH_TO_SCHEMA_FILE_DOCKER> [-v <PATH_TO_LOCAL_SCALARDB_PROPERTIES_FILE>:<PATH_TO_SCALARDB_PROPERTIES_FILE_IN_DOCKER>] ghcr.io/scalar-labs/scalardb-schema-loader:<VERSION> <COMMAND_ARGUMENTS>
You can specify the same command arguments even if you use the fat JAR or the container. In the Available commands section, the JAR is used, but you can run the commands by using the container in the same way by replacing java -jar scalardb-schema-loader-<VERSION>.jar
with docker run --rm -v <PATH_TO_YOUR_LOCAL_SCHEMA_FILE>:<PATH_TO_SCHEMA_FILE_DOCKER> [-v <PATH_TO_LOCAL_SCALARDB_PROPERTIES_FILE>:<PATH_TO_SCALARDB_PROPERTIES_FILE_IN_DOCKER>] ghcr.io/scalar-labs/scalardb-schema-loader:<VERSION>
.
Run Schema Loader
This section explains how to run Schema Loader.
Available commands
Select how you would like to configure Schema Loader for your database. The preferred method is to use the properties file since other, database-specific methods are deprecated.
The following commands are available when using the properties file:
Usage: java -jar scalardb-schema-loader-<VERSION>.jar [-D] [--coordinator]
[--no-backup] [--no-scaling] -c=<configPath>
[--compaction-strategy=<compactionStrategy>] [-f=<schemaFile>]
[--replication-factor=<replicaFactor>]
[--replication-strategy=<replicationStrategy>] [--ru=<ru>]
Create/Delete schemas in the storage defined in the config file
-A, --alter Alter tables : it will add new columns and create/delete
secondary index for existing tables. It compares the
provided table schema to the existing schema to decide
which columns need to be added and which indexes need
to be created or deleted
-c, --config=<configPath>
Path to the config file of ScalarDB
--compaction-strategy=<compactionStrategy>
The compaction strategy, must be LCS, STCS or TWCS
(supported in Cassandra)
--coordinator Create/delete/repair Coordinator tables
-D, --delete-all Delete tables
-f, --schema-file=<schemaFile>
Path to the schema json file
--no-backup Disable continuous backup (supported in DynamoDB)
--no-scaling Disable auto-scaling (supported in DynamoDB, Cosmos DB)
--repair-all Repair tables : it repairs the table metadata of
existing tables. When using Cosmos DB, it
additionally repairs stored procedure attached
to each table
--replication-factor=<replicaFactor>
The replication factor (supported in Cassandra)
--replication-strategy=<replicationStrategy>
The replication strategy, must be SimpleStrategy or
NetworkTopologyStrategy (supported in Cassandra)
--ru=<ru> Base resource unit (supported in DynamoDB, Cosmos DB)
For a sample properties file, see database.properties
.
The following database-specific methods have been deprecated. Please use the commands for configuring the properties file instead.
- JDBC databases
- DynamoDB
- Cosmos DB for NoSQL
- Cassandra
Usage: java -jar scalardb-schema-loader-<VERSION>.jar --jdbc [-D]
-f=<schemaFile> -j=<url> -p=<password> -u=<user>
Create/Delete JDBC schemas
-A, --alter Alter tables : it will add new columns and create/delete
secondary index for existing tables. It compares the
provided table schema to the existing schema to decide
which columns need to be added and which indexes need
to be created or deleted
-D, --delete-all Delete tables
-f, --schema-file=<schemaFile>
Path to the schema json file
-j, --jdbc-url=<url> JDBC URL
-p, --password=<password>
JDBC password
--repair-all Repair tables : it repairs the table metadata of
existing tables
-u, --user=<user> JDBC user
Usage: java -jar scalardb-schema-loader-<VERSION>.jar --dynamo [-D]
[--no-backup] [--no-scaling] [--endpoint-override=<endpointOverride>]
-f=<schemaFile> -p=<awsSecKey> [-r=<ru>] --region=<awsRegion>
-u=<awsKeyId>
Create/Delete DynamoDB schemas
-A, --alter Alter tables : it will add new columns and create/delete
secondary index for existing tables. It compares the
provided table schema to the existing schema to decide
which columns need to be added and which indexes need
to be created or deleted
-D, --delete-all Delete tables
--endpoint-override=<endpointOverride>
Endpoint with which the DynamoDB SDK should
communicate
-f, --schema-file=<schemaFile>
Path to the schema json file
--no-backup Disable continuous backup for DynamoDB
--no-scaling Disable auto-scaling for DynamoDB
-p, --password=<awsSecKey> AWS access secret key
-r, --ru=<ru> Base resource unit
--region=<awsRegion> AWS region
--repair-all Repair tables : it repairs the table metadata of
existing tables
-u, --user=<awsKeyId> AWS access key ID
Usage: java -jar scalardb-schema-loader-<VERSION>.jar --cosmos [-D]
[--no-scaling] -f=<schemaFile> -h=<uri> -p=<key> [-r=<ru>]
Create/Delete Cosmos DB schemas
-A, --alter Alter tables : it will add new columns and create/delete
secondary index for existing tables. It compares the
provided table schema to the existing schema to decide
which columns need to be added and which indexes need
to be created or deleted
-D, --delete-all Delete tables
-f, --schema-file=<schemaFile>
Path to the schema json file
-h, --host=<uri> Cosmos DB account URI
--no-scaling Disable auto-scaling for Cosmos DB
-p, --password=<key> Cosmos DB key
-r, --ru=<ru> Base resource unit
--repair-all Repair tables : it repairs the table metadata of
existing tables and repairs stored procedure
attached to each table
Usage: java -jar scalardb-schema-loader-<VERSION>.jar --cassandra [-D]
[-c=<compactionStrategy>] -f=<schemaFile> -h=<hostIp>
[-n=<replicationStrategy>] [-p=<password>] [-P=<port>]
[-R=<replicationFactor>] [-u=<user>]
Create/Delete Cassandra schemas
-A, --alter Alter tables : it will add new columns and create/delete
secondary index for existing tables. It compares the
provided table schema to the existing schema to decide
which columns need to be added and which indexes need
to be created or deleted
-c, --compaction-strategy=<compactionStrategy>
Cassandra compaction strategy, must be LCS, STCS or TWCS
-D, --delete-all Delete tables
-f, --schema-file=<schemaFile>
Path to the schema json file
-h, --host=<hostIp> Cassandra host IP
-n, --network-strategy=<replicationStrategy>
Cassandra network strategy, must be SimpleStrategy or
NetworkTopologyStrategy
-p, --password=<password>
Cassandra password
-P, --port=<port> Cassandra Port
-R, --replication-factor=<replicationFactor>
Cassandra replication factor
--repair-all Repair tables : it repairs the table metadata of
existing tables
-u, --user=<user> Cassandra user
Create namespaces and tables
To create namespaces and tables by using a properties file, run the following command, replacing the contents in the angle brackets as described:
java -jar scalardb-schema-loader-<VERSION>.jar --config <PATH_TO_SCALARDB_PROPERTIES_FILE> -f <PATH_TO_SCHEMA_FILE> [--coordinator]
If --coordinator
is specified, a Coordinator table will be created.
The following database-specific CLI arguments have been deprecated. Please use the CLI arguments for configuring the properties file instead.
- JDBC databases
- DynamoDB
- Cosmos DB for NoSQL
- Cassandra
java -jar scalardb-schema-loader-<VERSION>.jar --jdbc -j <JDBC_URL> -u <USER> -p <PASSWORD> -f <PATH_TO_SCHEMA_FILE>
java -jar scalardb-schema-loader-<VERSION>.jar --dynamo -u <AWS_ACCESS_KEY_ID> -p <AWS_ACCESS_SECRET_KEY> --region <REGION> -f <PATH_TO_SCHEMA_FILE> [-r BASE_RESOURCE_UNIT]
<REGION>
should be a string to specify an AWS region likeap-northeast-1
.-r
option is almost the same as Cosmos DB for NoSQL option. However, the unit means DynamoDB capacity unit. The read and write capacity units are set the same value.
java -jar scalardb-schema-loader-<VERSION>.jar --cosmos -h <COSMOS_DB_FOR_NOSQL_ACCOUNT_URI> -p <COSMOS_DB_FOR_NOSQL_KEY> -f <PATH_TO_SCHEMA_FILE> [-r BASE_RESOURCE_UNIT]
<COSMOS_DB_FOR_NOSQL_KEY>
you can use a primary key or a secondary key.-r BASE_RESOURCE_UNIT
is an option. You can specify the RU of each database. The maximum RU in tables in the database will be set. If you don't specify RU of tables, the database RU will be set with this option. By default, it's 400.
java -jar scalardb-schema-loader-<VERSION>.jar --cassandra -h <CASSANDRA_IP> [-P <CASSANDRA_PORT>] [-u <CASSANDRA_USER>] [-p <CASSANDRA_PASSWORD>] -f <PATH_TO_SCHEMA_FILE> [-n <NETWORK_STRATEGY>] [-R <REPLICATION_FACTOR>]
- If
-P <CASSANDRA_PORT>
is not supplied, it defaults to9042
. - If
-u <CASSANDRA_USER>
is not supplied, it defaults tocassandra
. - If
-p <CASSANDRA_PASSWORD>
is not supplied, it defaults tocassandra
. <NETWORK_STRATEGY>
should beSimpleStrategy
orNetworkTopologyStrategy
.
Alter tables
You can use a command to add new columns to and create or delete a secondary index for existing tables. This command compares the provided table schema to the existing schema to decide which columns need to be added and which indexes need to be created or deleted.
To add new columns to and create or delete a secondary index for existing tables, run the following command, replacing the contents in the angle brackets as described:
java -jar scalardb-schema-loader-<VERSION>.jar --config <PATH_TO_SCALARDB_PROPERTIES_FILE> -f <PATH_TO_SCHEMA_FILE> --alter
The following database-specific CLI arguments have been deprecated. Please use the CLI arguments for configuring the properties file instead.
- JDBC databases
- DynamoDB
- Cosmos DB for NoSQL
- Cassandra
java -jar scalardb-schema-loader-<VERSION>.jar --jdbc -j <JDBC_URL> -u <USER> -p <PASSWORD> -f <PATH_TO_SCHEMA_FILE> --alter
java -jar scalardb-schema-loader-<VERSION>.jar --dynamo -u <AWS_ACCESS_KEY_ID> -p <AWS_ACCESS_SECRET_KEY> --region <REGION> -f <PATH_TO_SCHEMA_FILE> --alter
java -jar scalardb-schema-loader-<VERSION>.jar --cosmos -h <COSMOS_DB_FOR_NOSQL_ACCOUNT_URI> -p <COSMOS_DB_FOR_NOSQL_KEY> -f <PATH_TO_SCHEMA_FILE> --alter
java -jar scalardb-schema-loader-<VERSION>.jar --cassandra -h <CASSANDRA_IP> [-P <CASSANDRA_PORT>] [-u <CASSANDRA_USER>] [-p <CASSANDRA_PASSWORD>] -f <PATH_TO_SCHEMA_FILE> --alter
Delete tables
You can delete tables by using the properties file. To delete tables, run the following command, replacing the contents in the angle brackets as described:
java -jar scalardb-schema-loader-<VERSION>.jar --config <PATH_TO_SCALARDB_PROPERTIES_FILE> -f <PATH_TO_SCHEMA_FILE> [--coordinator] -D
If --coordinator
is specified, the Coordinator table will be deleted as well.
The following database-specific CLI arguments have been deprecated. Please use the CLI arguments for configuring the properties file instead.
- JDBC databases
- DynamoDB
- Cosmos DB for NoSQL
- Cassandra
java -jar scalardb-schema-loader-<VERSION>.jar --jdbc -j <JDBC_URL> -u <USER> -p <PASSWORD> -f <PATH_TO_SCHEMA_FILE> -D
java -jar scalardb-schema-loader-<VERSION>.jar --dynamo -u <AWS_ACCESS_KEY_ID> -p <AWS_ACCESS_SECRET_KEY> --region <REGION> -f <PATH_TO_SCHEMA_FILE> -D
java -jar scalardb-schema-loader-<VERSION>.jar --cosmos -h <COSMOS_DB_FOR_NOSQL_ACCOUNT_URI> -p <COSMOS_DB_FOR_NOSQL_KEY> -f <PATH_TO_SCHEMA_FILE> -D
java -jar scalardb-schema-loader-<VERSION>.jar --cassandra -h <CASSANDRA_IP> [-P <CASSANDRA_PORT>] [-u <CASSANDRA_USER>] [-p <CASSANDRA_PASSWORD>] -f <PATH_TO_SCHEMA_FILE> -D
Repair tables
You can repair the table metadata of existing tables by using the properties file. To repair table metadata of existing tables, run the following command, replacing the contents in the angle brackets as described:
java -jar scalardb-schema-loader-<VERSION>.jar --config <PATH_TO_SCALARDB_PROPERTIES_FILE> -f <PATH_TO_SCHEMA_FILE> [--coordinator] --repair-all
Before executing this command, you should confirm the schema configuration is the same as the one that was last applied.
If --coordinator
is specified, the Coordinator table will be repaired as well. In addition, if you're using Cosmos DB for NoSQL, running this command will also repair stored procedures attached to each table.
The following database-specific CLI arguments have been deprecated. Please use the CLI arguments for configuring the properties file instead.
- JDBC databases
- DynamoDB
- Cosmos DB for NoSQL
- Cassandra
java -jar scalardb-schema-loader-<VERSION>.jar --jdbc -j <JDBC_URL> -u <USER> -p <PASSWORD> -f <PATH_TO_SCHEMA_FILE> --repair-all
java -jar scalardb-schema-loader-<VERSION>.jar --dynamo -u <AWS_ACCESS_KEY_ID> -p <AWS_ACCESS_SECRET_KEY> --region <REGION> [--no-backup] -f <PATH_TO_SCHEMA_FILE> --repair-all
java -jar scalardb-schema-loader-<VERSION>.jar --cosmos -h <COSMOS_DB_FOR_NOSQL_ACCOUNT_URI> -p <COSMOS_DB_FOR_NOSQL_KEY> -f <PATH_TO_SCHEMA_FILE> --repair-all
java -jar scalardb-schema-loader-<VERSION>.jar --cassandra -h <CASSANDRA_IP> [-P <CASSANDRA_PORT>] [-u <CASSANDRA_USER>] [-p <CASSANDRA_PASSWORD>] -f <PATH_TO_SCHEMA_FILE> --repair-all
Sample schema file
The following is a sample schema. For a sample schema file, see schema_sample.json
.
{
"sample_db.sample_table": {
"transaction": false,
"partition-key": [
"c1"
],
"clustering-key": [
"c4 ASC",
"c6 DESC"
],
"columns": {
"c1": "INT",
"c2": "TEXT",
"c3": "BLOB",
"c4": "INT",
"c5": "BOOLEAN",
"c6": "INT"
},
"secondary-index": [
"c2",
"c4"
]
},
"sample_db.sample_table1": {
"transaction": true,
"partition-key": [
"c1"
],
"clustering-key": [
"c4"
],
"columns": {
"c1": "INT",
"c2": "TEXT",
"c3": "INT",
"c4": "INT",
"c5": "BOOLEAN"
}
},
"sample_db.sample_table2": {
"transaction": false,
"partition-key": [
"c1"
],
"clustering-key": [
"c4",
"c3"
],
"columns": {
"c1": "INT",
"c2": "TEXT",
"c3": "INT",
"c4": "INT",
"c5": "BOOLEAN"
}
}
}
The schema has table definitions that include columns
, partition-key
, clustering-key
, secondary-index
, and transaction
fields.
- The
columns
field defines columns of the table and their data types. - The
partition-key
field defines which columns the partition key is composed of. - The
clustering-key
field defines which columns the clustering key is composed of. - The
secondary-index
field defines which columns are indexed. - The
transaction
field indicates whether the table is for transactions or not.- If you set the
transaction
field totrue
or don't specify thetransaction
field, this tool creates a table with transaction metadata if needed. - If you set the
transaction
field tofalse
, this tool creates a table without any transaction metadata (that is, for a table with Storage API).
- If you set the
You can also specify database or storage-specific options in the table definition as follows:
{
"sample_db.sample_table3": {
"partition-key": [
"c1"
],
"columns": {
"c1": "INT",
"c2": "TEXT",
"c3": "BLOB"
},
"compaction-strategy": "LCS",
"ru": 5000
}
}
The database or storage-specific options you can specify are as follows:
- JDBC databases
- DynamoDB
- Cosmos DB for NoSQL
- Cassandra
No options are available for JDBC databases.
The ru
option stands for Request Units. For details, see RUs.
The ru
option stands for Request Units. For details, see RUs.
The compaction-strategy
option is the compaction strategy used. This option should be STCS
(SizeTieredCompaction), LCS
(LeveledCompactionStrategy), or TWCS
(TimeWindowCompactionStrategy).
Scale for performance when using Cosmos DB for NoSQL or DynamoDB
When using Cosmos DB for NoSQL or DynamoDB, you can scale by using Request Units (RUs) or auto-scaling.