Run Analytical Queries on Sample Data by Using ScalarDB Analytics with PostgreSQL
This tutorial describes how to run analytical queries on sample data by using ScalarDB Analytics with PostgreSQL.
Overview
This sample tutorial shows how you can run two types of queries: a single-table query and a multi-table query.
What you can do in this sample tutorial
This sample tutorial shows how you can run the following types of queries:
- Read data and calculate summaries.
- Join tables that span multiple storages.
You can run any arbitrary query that PostgreSQL supports on the imported tables in this sample tutorial. Since ScalarDB Analytics with PostgreSQL supports all queries that PostgreSQL supports, you can use not only join, aggregation, filtering, and ordering as shown in the example, but also the window function, lateral join, or various analytical operations.
To see which types of queries PostgreSQL supports, see the PostgreSQL documentation.
Prerequisites
- Docker 20.10 or later with Docker Compose V2 or later
- psql
Set up ScalarDB Analytics with PostgreSQL
First, you must set up the database to run analytical queries with ScalarDB Analytics with PostgreSQL. If you haven't set up the database yet, please follow the instructions in Getting Started.
Schema details in ScalarDB
In this sample tutorial, you have tables with the following schema in the ScalarDB database:
For reference, this diagram shows the following:
dynamons
,postgresns
, andcassandrans
. Namespaces that are mapped to the back-end storages of DynamoDB, PostgreSQL, and Cassandra, respectively.dynamons.customer
. A table that represents information about customers. This table includes attributes like customer key, name, address, phone number, and account balance.postgresns.orders
. A table that contains information about orders that customers have placed. This table includes attributes like order key, customer key, order status, order date, and order priority.cassandrans.lineitem
. A table that represents line items associated with orders. This table includes attributes such as order key, part key, supplier key, quantity, price, and shipping date.
Schema details in PostgreSQL
By running the Schema Importer when setting up ScalarDB, you can import the table schema in the ScalarDB database into the PostgreSQL database. More precisely, for each namespace_name.table_name
table in the ScalarDB database, you will have a foreign table for namespace_name._table_name
and a view for namespace_name.table_name
in the PostgreSQL database.
The created foreign table contains columns that are identical to the ScalarDB table and the transaction metadata columns that ScalarDB manages internally. Since the created view is defined to exclude the transaction metadata columns from the foreign table, the created view contains only the same columns as the ScalarDB table.
You can find the schema of the ScalarDB tables in schema.json
. For example, the dynamons.customer
table is defined as follows:
"dynamons.customer": {
"transaction": true,
"partition-key": [
"c_custkey"
],
"columns": {
"c_custkey": "INT",
"c_name": "TEXT",
"c_address": "TEXT",
"c_nationkey": "INT",
"c_phone": "TEXT",
"c_acctbal": "DOUBLE",
"c_mktsegment": "TEXT",
"c_comment": "TEXT"
}
},