Version: 3.8 (unsupported)

Getting started with Export

This document explains how you can get started with the ScalarDB Data Loader Export function.

Features

The ScalarDB Data Loader allows you to export data in the following formats:

  • JSON
  • JSONLines
  • CSV

Each export runs a ScalarDB scan operation that is built from the CLI arguments passed to the data loader.

Usage

The data loader export function can be started with the following minimal configuration:

./scalardb-data-loader export --config scalardb.properties --namespace namespace --table tableName
  • --config: the path to the ScalarDB connection properties file
  • --namespace: the namespace of the table that contains the data
  • --table: the name of the table that contains the data

If the --output-file argument is omitted, the data loader creates the output file in the current working directory.
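To control where the export is written, the minimal command can be combined with the --output-file flag described under Command-line flags. The following is a minimal sketch, not an official script: the ./out directory, the LOADER and OUT_DIR variables, and the dry-run fallback are assumptions made for illustration. It creates the output directory first, because the data loader does not create it for you.

```shell
#!/bin/sh
# Sketch: export a table to an explicit output file.
# Assumptions (not from the ScalarDB docs): the launcher lives at
# ./scalardb-data-loader, and ./out is the directory we want to write to.
LOADER="${LOADER:-./scalardb-data-loader}"
OUT_DIR="${OUT_DIR:-./out}"

# The data loader does not create the output directory, so create it up front.
mkdir -p "$OUT_DIR"

# Build the argument list once so it can be either executed or printed.
set -- export \
  --config scalardb.properties \
  --namespace namespace \
  --table tableName \
  --output-file "$OUT_DIR/output.json"

if [ -x "$LOADER" ]; then
  "$LOADER" "$@"
else
  # Dry run: the launcher is not available here, so only show the command.
  echo "dry run: $LOADER $*"
fi
```

Setting LOADER or OUT_DIR in the environment before running the script overrides the assumed defaults.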

Command-line flags

The following flags (options) can be used with the ScalarDB Data Loader.

| Flag | Description | Usage |
| --- | --- | --- |
| --config | The path to the scalardb.properties file. If omitted, the tool looks for a file named scalardb.properties in the current folder. | scalardb-data-loader --config scalardb.properties |
| --namespace | The namespace to export table data from. Required. | scalardb-data-loader --namespace namespace |
| --table | The name of the table to export data from. Required. | scalardb-data-loader --table tableName |
| --key | Export only the data for a specific partition key. By default, all data in the specified table is exported. | scalardb-data-loader --key columnName=value |
| --sort | Specify a column to sort on. The column needs to be a clustering key. The argument can be repeated to provide multiple sortings. This flag is only applicable when --key is specified. | scalardb-data-loader --sort columnName=desc |
| --projection | Limit the columns that are exported by providing a projection. The argument can be repeated to provide multiple projections. | scalardb-data-loader --projection columnName |
| --start | The clustering key that marks the scan start. This flag is only applicable when --key is specified. | scalardb-data-loader --start columnName=value |
| --start-exclusive | Whether the scan start is exclusive. If omitted, the default value is false. This flag is only applicable when --key is specified. | scalardb-data-loader --start-exclusive |
| --end | The clustering key that marks the scan end. This flag is only applicable when --key is specified. | scalardb-data-loader --end columnName=value |
| --end-exclusive | Whether the scan end is exclusive. If omitted, the default value is false. This flag is only applicable when --key is specified. | scalardb-data-loader --end-exclusive |
| --limit | Limit the number of results returned by the scan. If omitted, the default value is 0, which means there is no limit. | scalardb-data-loader --limit 1000 |
| --output-file | The name and path of the output file. If omitted, the tool saves the file in the current folder with the name format export_namespace.tableName_timestamp.json or export_namespace.tableName_timestamp.csv. The output folder needs to exist; the data loader does not create it for you. | scalardb-data-loader --output-file ./out/output.json |
| --format | The output format. By default, json is selected. | scalardb-data-loader --format json |
| --metadata | When set to true, the transaction metadata is included in the export. By default, this is set to false. | scalardb-data-loader --metadata |
| --delimiter | The delimiter used in CSV files. The default value is ; (quote it on the command line so that the shell does not treat it as a command separator). | scalardb-data-loader --delimiter ";" |
| --no-headers | Exclude the header row in the CSV file. The default is false. | scalardb-data-loader --no-headers |
| --threads | The thread count for concurrent processing. | scalardb-data-loader --threads 500 |
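
The flags listed above are easiest to read in combination. The sketch below assembles two export commands: one that scans a single partition, bounded and sorted by a clustering key, and one that writes the whole table as CSV without a header row. The column names (customer_id, order_ts), their values, and the helper functions are hypothetical. Only the flag names come from the list above, and the csv value for --format is an assumption based on the json default shown there. When the launcher is not found, the script prints each command instead of running it.

```shell
#!/bin/sh
# Sketch: combine flags from the list above into complete export commands.
# Hypothetical names: customer_id (partition key), order_ts (clustering key).
# Assumption: the launcher is ./scalardb-data-loader unless LOADER is set.
LOADER="${LOADER:-./scalardb-data-loader}"

# Run the loader if it is executable; otherwise just print the command.
run_or_print() {
  if [ -x "$LOADER" ]; then
    "$LOADER" "$@"
  else
    echo "dry run: $LOADER $*"
  fi
}

# Scan one partition, bounded and sorted by a clustering key.
# --start, --end, --end-exclusive, and --sort only apply because --key is set.
export_partition() {
  run_or_print export \
    --config scalardb.properties \
    --namespace namespace \
    --table tableName \
    --key customer_id=42 \
    --start order_ts=1000 \
    --end order_ts=2000 \
    --end-exclusive \
    --sort order_ts=desc \
    --projection customer_id \
    --projection order_ts \
    --limit 1000
}

# Export the whole table as CSV without a header row.
# Assumption: "csv" is accepted by --format in the same lowercase form as "json".
export_csv() {
  run_or_print export \
    --config scalardb.properties \
    --namespace namespace \
    --table tableName \
    --format csv \
    --delimiter ',' \
    --no-headers
}

export_partition
export_csv
```

The delimiter is quoted in the script so that the shell passes it to the loader as a literal argument; the same quoting is needed for the default ; delimiter.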