Version: 3.7 (unsupported)

Getting started with Export

This document explains how you can get started with the ScalarDB Data Loader Export function.

Features

The ScalarDB Data Loader allows you to export data in the following formats:

  • JSON
  • JSONLines
  • CSV

Each export runs a ScalarDB scan operation that is built from the CLI arguments passed to the Data Loader.

Usage

The data loader export function can be started with the following minimal configuration:

./scalardb-data-loader export --config scalardb.properties --namespace namespace --table tableName
  • --config: the path to the ScalarDB connection properties file
  • --namespace: the namespace of the table that contains the data
  • --table: the name of the table that contains the data

If the --output-file argument is omitted, the data loader creates the output file in the working directory.
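As a minimal sketch of choosing the output location explicitly, the script below assembles an export command that writes to ./out/output.json. It uses only flags listed under Command-line flags. The namespace and table values are placeholders, and the dry-run switch RUN is a hypothetical convention of this sketch, not a Data Loader feature. Because the tool does not create the output folder, the sketch creates it first.

```shell
#!/bin/sh
# Placeholder names; replace with your own namespace, table, and path.
NAMESPACE="namespace"
TABLE="tableName"
OUT_FILE="./out/output.json"

# The Data Loader does not create the output folder, so create it first.
mkdir -p "$(dirname "$OUT_FILE")"

# Build the argument list with `set --` so each option stays one word.
set -- ./scalardb-data-loader export \
  --config scalardb.properties \
  --namespace "$NAMESPACE" \
  --table "$TABLE" \
  --format json \
  --output-file "$OUT_FILE"

# Keep a printable copy and show it instead of running it by default.
EXPORT_CMD="$*"
echo "$EXPORT_CMD"

# RUN is a switch local to this sketch: set RUN=1 to execute the export.
if [ "${RUN:-0}" = "1" ]; then
  "$@"
fi
```

Running the script without RUN=1 only prints the command, which makes it easy to review the arguments before touching the database.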

Command-line flags

Here is a list of flags (options) that can be used with the ScalarDB Data Loader.

| Flag | Description | Usage |
| --- | --- | --- |
| --config | The path to the scalardb.properties file. If omitted, the tool looks for a file named scalardb.properties in the current folder. | scalardb-data-loader --config scalardb.properties |
| --namespace | Namespace to export table data from. Required. | scalardb-data-loader --namespace namespace |
| --table | Name of the table to export data from. Required. | scalardb-data-loader --table tableName |
| --key | Export the data of a specific partition key. By default, all data in the specified table is exported. | scalardb-data-loader --key columnName=value |
| --sort | The column to sort on. The column must be a clustering key. Repeat the argument to sort on multiple columns. Only applicable with --key. | scalardb-data-loader --sort columnName=desc |
| --projection | Limit the exported columns by providing a projection. Repeat the argument to provide multiple projections. | scalardb-data-loader --projection columnName |
| --start | Clustering key that marks the scan start. Only applicable with --key. | scalardb-data-loader --start columnName=value |
| --start-exclusive | Whether the scan start is exclusive. If omitted, the default value is false. Only applicable with --key. | scalardb-data-loader --start-exclusive |
| --end | Clustering key that marks the scan end. Only applicable with --key. | scalardb-data-loader --end columnName=value |
| --end-exclusive | Whether the scan end is exclusive. If omitted, the default value is false. Only applicable with --key. | scalardb-data-loader --end-exclusive |
| --limit | Limit the number of results of the scan. If omitted, the default value is 0, which means there is no limit. | scalardb-data-loader --limit 1000 |
| --output-file | The name and path of the output file. If omitted, the tool saves the file in the current folder with the name format export_namespace.tableName_timestamp.json or export_namespace.tableName_timestamp.csv. The output folder must already exist; the Data Loader does not create it for you. | scalardb-data-loader --output-file ./out/output.json |
| --format | The output format. By default, json is selected. | scalardb-data-loader --format json |
| --metadata | When set to true, the transaction metadata is included in the export. By default, this is set to false. | scalardb-data-loader --metadata |
| --delimiter | The delimiter used in CSV files. The default value is ; (quote it on the command line so that the shell does not treat it as a command separator). | scalardb-data-loader --delimiter ";" |
| --no-headers | Exclude the header row in the CSV file. The default is false. | scalardb-data-loader --no-headers |
| --threads | Thread count for concurrent processing. | scalardb-data-loader --threads 500 |
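The flags above can be combined into a single partition scan. The following is a sketch only, under stated assumptions: the table orders in namespace sample_ns, its partition key customer_id, its clustering key order_id, and the column total are all hypothetical, and the lowercase value csv for --format is assumed by analogy with the documented json value. The RUN switch is a convention of this sketch, not a Data Loader feature. The ./out folder must already exist before the export is actually run.

```shell
#!/bin/sh
# Hypothetical schema: table "orders" in namespace "sample_ns", with
# partition key "customer_id" and clustering key "order_id".
# Scan one partition, from order_id 100 (inclusive) to 200 (exclusive),
# newest first, exporting two columns as comma-separated values.
set -- ./scalardb-data-loader export \
  --config scalardb.properties \
  --namespace sample_ns \
  --table orders \
  --key customer_id=1 \
  --start order_id=100 \
  --end order_id=200 \
  --end-exclusive \
  --sort order_id=desc \
  --projection order_id \
  --projection total \
  --limit 1000 \
  --format csv \
  --delimiter ',' \
  --no-headers \
  --output-file ./out/orders.csv

# Keep a printable copy and show it instead of running it by default.
EXPORT_CMD="$*"
echo "$EXPORT_CMD"

# RUN is a switch local to this sketch: set RUN=1 to execute the export.
if [ "${RUN:-0}" = "1" ]; then
  "$@"
fi
```

Note that --sort, --start, --end, and --end-exclusive appear together with --key here, because the table above marks them as applicable only to a partition-key scan. The delimiter is quoted so that the same pattern stays safe when the value is a character the shell treats specially, such as the default ;.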