ScalarDB Roadmap
This roadmap provides a look into the proposed future of ScalarDB. The purpose of this roadmap is to provide visibility into what changes may be coming so that you can more closely follow progress, learn about key milestones, and give feedback during development. This roadmap will be updated as new versions of ScalarDB are released.
During the course of development, this roadmap is subject to change based on user needs and feedback. Do not schedule your release plans according to the contents of this roadmap.
If you have a feature request or want to prioritize feature development, please create an issue in GitHub.
CY2024 Q4
New capabilities
- Data virtualization for analytics
- Users will be able to run read-only OLAP SQL queries on diverse data sources through ScalarDB Analytics. ScalarDB Analytics currently supports only ScalarDB-managed data stores, so this enhancement will virtually unify various data stores, like relational databases and NoSQL databases, and files in cloud object stores, like Amazon S3, without regard to whether the data sources are managed by ScalarDB transactions.
- Vector store abstraction
- Users will be able to store and search embeddings (vectors) in and from vector stores through a new vector store interface in ScalarDB. With this feature, users can simplify the process of realizing retrieval-augmented generation (RAG) with large language models (LLMs) by reading data from databases through the existing ScalarDB interface, creating embeddings from the data, and storing and searching the embeddings to and from a vector store through the new interface.
Security
- Fine-grained access control
- Users will be able to authorize accesses to the underlying databases in a finer-grained way. In addition to the current simple authorization where ScalarDB checks if users are authorized to issue particular operations, ScalarDB will check if users can access particular records.
Usability
- Addition of time-related data types
- Users will be able to use time-related data types, which will make their existing applications easier to migrate.
- Removal of extra-write strategy
- Users will no longer be able to use the extra-write strategy to make transactions serializable. Although ScalarDB currently provides two strategies, extra-read and extra-write strategies, to make transactions serializable, the extra-write strategy has several limitations. For example, users can't issue write and scan operations in the same transaction. Therefore, the strategy will be removed so that users don't need to worry about such limitations when creating applications.
Performance
- One-phase commit optimization
- Users will experience faster execution for simple transactions that write to a single partition. ScalarDB will omit the prepare-record and commit-state phases without sacrificing correctness if a transaction updates only one partition by exploiting the single-partition linearizable operations of the underlying databases.
- Reduction of storage space needed for managing ScalarDB metadata
- Users will likely use less storage space to run ScalarDB. ScalarDB will remove the before image of committed transactions after they are committed. However, whether or not those committed transactions will impact actual storage space depends on the underlying databases.
- Removal of coordinator writes for read-only transactions
- Users will experience faster execution for read-only transactions by removing coordinator writes for those transactions.
Cloud support
- Container offering in Azure Marketplace
- Users will be able to deploy ScalarDB Cluster by using the Azure container offering, which enables users to use a pay-as-you-go subscription model.
- Google Cloud Platform (GCP) support
- Users will be able to deploy ScalarDB Cluster in Google Kubernetes Engine (GKE) in GCP.
CY2025 Q1
New capabilities
- Native secondary index
- Users will be able to define flexible secondary indexes. The existing secondary index is limited because it is implemented based on the common capabilities of the supported databases' secondary indexes. Therefore, for example, you cannot define a multi-column index. The new secondary index will be created at the ScalarDB layer so that you can create more flexible indexes, like a multi-column index.
Usability
-
Addition of SQL operations for aggregation
- Users will be able to issue aggregation operations in ScalarDB SQL.
-
Elimination of out-of-memory errors due to large scans
- Users will be able to issue large scans without experiencing out-of-memory errors.
-
Enabling of read operations during a paused duration
- Users will be able to issue read operations even during a paused duration so that users can still read data while taking backups.
-
Addition of more data types
- Users will be able to use more data types so that their existing applications will be easier to migrate.
CY2025 Q2 -
Usability
- Views
- Users will be able to define Views so that they can manage multiple different databases in an easier and simplified way.
Scalability and availability
- Semi-synchronous replication
- Users will be able to provide ScalarDB-based applications in a disaster-recoverable manner. For example, assume you provide a primary service in Tokyo and a standby service in Osaka. In case of catastrophic failure in Tokyo, you can switch the primary service to Osaka so that you can continue to provide the service without data loss and extended downtime.
Cloud support
- Red Hat OpenShift support
- Users will be able to use Red Hat–certified Helm Charts for ScalarDB Cluster in OpenShift environments.
- Container offering in Google Cloud Marketplace
- Users will be able to deploy ScalarDB Cluster by using the Google Cloud container offering, which enables users to use a pay-as-you-go subscription model.