ScalarDB Analytics
ScalarDB Analytics is the analytical component of ScalarDB. Like ScalarDB, it unifies diverse data sources, ranging from RDBMSs like PostgreSQL and MySQL to NoSQL databases such as Cassandra and DynamoDB, into a single logical database. While ScalarDB focuses on operational workloads with strong transactional consistency across multiple databases, ScalarDB Analytics is optimized for analytical workloads. It supports a wide range of queries, including complex joins, aggregations, and window functions. ScalarDB Analytics works with both ScalarDB-managed and non-ScalarDB-managed data sources, so you can run analytical queries that span datasets from either kind of source.
The current version of ScalarDB Analytics uses Apache Spark as its execution engine. It provides a unified view of ScalarDB-managed and non-ScalarDB-managed data sources through a Spark custom catalog. With ScalarDB Analytics, you can treat tables from these data sources as native Spark tables and run arbitrary Spark SQL queries against them. For example, you can join a table stored in Cassandra with a table in PostgreSQL to perform a cross-database analysis in a single query.
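As a rough illustration of the cross-database join described above, the following Python sketch builds a Spark SQL query and runs it on a SparkSession. It is a minimal sketch, not the documented setup: it assumes a SparkSession whose ScalarDB Analytics catalog has already been configured, and every catalog, data source, namespace, table, and column name in it is a hypothetical placeholder. The exact identifier format depends on how your catalog is configured.

```python
# All identifiers below (catalog, data source, namespace, table, and column
# names) are hypothetical placeholders, not names defined by ScalarDB Analytics.
CUSTOMERS = "my_catalog.postgres_source.public.customers"  # assumed to live in PostgreSQL
ORDERS = "my_catalog.cassandra_source.sales.orders"  # assumed to live in Cassandra


def build_join_query(customers: str = CUSTOMERS, orders: str = ORDERS) -> str:
    """Return a Spark SQL query that joins the two tables and totals orders per customer."""
    return (
        "SELECT c.customer_id, c.name, SUM(o.amount) AS total_amount "
        f"FROM {customers} AS c "
        f"JOIN {orders} AS o ON c.customer_id = o.customer_id "
        "GROUP BY c.customer_id, c.name "
        "ORDER BY total_amount DESC"
    )


def run_join(spark):
    """Run the query on an existing SparkSession (catalog assumed to be configured already)."""
    return spark.sql(build_join_query())
```

Because both tables are addressed through the same catalog, the query itself does not need to know which underlying database stores each table. Calling `run_join(spark).show()` would print the result, provided the placeholder tables exist in your environment.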
You need to have a license key (trial license or commercial license) to use ScalarDB Analytics with Spark. If you don't have a license key, please contact us.
Further reading
- For tutorials that walk through ScalarDB Analytics with a sample dataset and application, see Getting Started with ScalarDB Analytics.
- For supported Spark and Scala versions, see Version Compatibility of ScalarDB Analytics with Spark.