Deploy a ScalarDB Analytics server
This document explains how to deploy a ScalarDB Analytics server in your local or production environment.
Step 1. Decide on the billing method for ScalarDB Analytics
You can use the ScalarDB Analytics server under either of the following billing methods:
- Pay as you go
- Fixed price (bring your own license)
Pay as you go
You can use ScalarDB Analytics on a pay-as-you-go plan. In this case, you pay the license fee based on your query usage. This plan is available as a container offer in AWS Marketplace.
To deploy the ScalarDB Analytics server from AWS Marketplace on a pay-as-you-go plan:
- Go to the AWS Marketplace page ScalarDB Analytics server.
- Subscribe to the ScalarDB Analytics server.
  - Select View purchase options.
  - Select Subscribe.
After subscribing, you'll have permission to pull the container image of the ScalarDB Analytics server from the following container registry. You will specify this container registry and pull the container image in a later step, so make a note of it.
709825985650.dkr.ecr.us-east-1.amazonaws.com/scalar/scalardb-analytics-server-aws-payg
Fixed price (bring your own license)
You can use ScalarDB Analytics under a fixed-price plan. In this case, you pay a fixed license fee based on your contract, with an upper limit on the queries you can run. This plan is available as a container offer on any supported Kubernetes platform. You can see the supported Kubernetes platforms in Requirements.
You need to have a license key (trial license or commercial license) to use the ScalarDB Analytics server. If you don't have a license key, please contact us.
You can deploy the ScalarDB Analytics server by using a container image with a license key that is provided at a fixed price, also known as bring your own license (BYOL). You can pull the container image of the ScalarDB Analytics server from the following container registry. You will specify this container registry in a later step, so make a note of it.
ghcr.io/scalar-labs/scalardb-analytics-server-byol
Step 2. Deploy a Kubernetes cluster
Deploy a cluster on your preferred Kubernetes platform based on the following requirements and checkpoints:
- Decide which Kubernetes platform to use based on the billing method and purpose.
  - If you chose Pay as you go (container offer - AWS Marketplace) in Step 1. Decide on the billing method for ScalarDB Analytics, you need to deploy Amazon Elastic Kubernetes Service (EKS) in one of the supported regions. The supported regions are described later in this step.
  - If you chose Fixed price w/bring your own license (container offer - any supported Kubernetes platform) in Step 1. Decide on the billing method for ScalarDB Analytics, you can use any of the supported Kubernetes platforms.
  Note: You should use minikube for testing or development purposes only. minikube is not recommended for production use.
- Check the general recommendations and requirements of the Kubernetes cluster for the ScalarDB Analytics server.
  - Recommendations
    - You should use a worker node that has at least 2 CPUs and 4 GB of memory.
    - Currently, the ScalarDB Analytics server does not have a clustering feature, so one worker node is enough.
    - If you want to make the Kubernetes cluster itself highly available, you can deploy it with multiple worker nodes.
  - Requirements
    - You must allow your Spark application to connect over the network to the ScalarDB Analytics server deployed on the Kubernetes cluster. To see which port the ScalarDB Analytics server uses, see Requirements.
    - You must allow the ScalarDB Analytics server to read from and write to the backend database to store the catalog information. These procedures are described in detail in Step 3. Deploy a backend database.
    - You must allow the ScalarDB Analytics server to read from and write to the object storage to store metering information. These procedures are described in detail in Step 4. Deploy an object storage.
- Deploy a Kubernetes cluster for the ScalarDB Analytics server.
  - Testing/development environments: For testing or development purposes, you can use minikube as a local Kubernetes cluster. For details on how to install and start minikube, see the official minikube documentation.
  - Production/staging environments: For production environments, deploy the Kubernetes cluster based on the above requirements of the ScalarDB Analytics server and your system's requirements, such as security, availability, backup/restore, cost, and scalability.
- EKS
- minikube
- If you chose Fixed price w/ bring your own license (container offer - any supported Kubernetes platform), you can use Amazon Elastic Kubernetes Service (EKS).
- If you chose Pay as you go (container offer - AWS Marketplace) in Step 1. Decide on the billing method for ScalarDB Analytics, you need to do the following:
- Deploy EKS in supported regions that are described in the AWS documentation MeterUsage Region support for Amazon ECS and Amazon EKS.
- Run the following two commands after you deploy EKS:
- eksctl utils associate-iam-oidc-provider
eksctl utils associate-iam-oidc-provider --region <REGION> --cluster <EKS_CLUSTER_NAME> --approve
- eksctl create iamserviceaccount
eksctl create iamserviceaccount \
  --name <SERVICE_ACCOUNT_NAME> \
  --namespace <NAMESPACE> \
  --region <REGION> \
  --cluster <EKS_CLUSTER_NAME> \
  --attach-policy-arn arn:aws:iam::aws:policy/AWSMarketplaceMeteringFullAccess \
  --approve \
  --override-existing-serviceaccounts
You can set an arbitrary name for SERVICE_ACCOUNT_NAME as long as it follows the Kubernetes resource naming rule.
Note: Keep note of the value that you set for SERVICE_ACCOUNT_NAME because you will specify this service account name in a later step.
Important: For production environments, you must use a supported Kubernetes platform. You can see the supported Kubernetes platforms in Requirements.
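The naming rule mentioned above for SERVICE_ACCOUNT_NAME can be checked before you run eksctl. The following is a minimal sketch, not part of the official procedure. The function name `is_valid_sa_name` is hypothetical, and the check deliberately uses the stricter DNS label form (1 to 63 characters, lowercase letters, digits, and hyphens, starting and ending with a letter or digit), so a name that passes should also satisfy the Kubernetes rule.

```shell
# Hypothetical helper: returns 0 if the name is a valid RFC 1123 DNS label.
# Assumption: this is stricter than what Kubernetes requires for service
# account names, so passing names are expected to be accepted by the cluster.
is_valid_sa_name() {
  name="$1"
  # Length must be between 1 and 63 characters.
  [ "${#name}" -ge 1 ] && [ "${#name}" -le 63 ] || return 1
  # Lowercase alphanumerics and '-', starting and ending with an alphanumeric.
  printf '%s\n' "$name" | grep -Eq '^[a-z0-9]([a-z0-9-]*[a-z0-9])?$'
}

# Usage:
#   is_valid_sa_name "scalardb-analytics-payg-sa" && echo "name looks valid"
```

Running a check like this first avoids creating the IAM role and then having the service account rejected because of its name.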
Step 3. Deploy a backend database
Deploy your preferred backend database based on the following requirements and checkpoints:
- Decide which backend database to use.
  - You can see the supported backend databases for the ScalarDB Analytics server in Requirements.
  - Unless you have a special reason not to, you should use a database that you are familiar with.
- Check the backend database requirements for the ScalarDB Analytics server.
  - You can see the requirements of each backend database on the Requirements page.
- Deploy the backend database in your environment.
Testing/development environments
For testing or development purposes, you can deploy a backend database in the Kubernetes cluster as a Pod. For example, if you use PostgreSQL, you can deploy it as follows:
- Add the Bitnami Helm repository by running the following command:
helm repo add bitnami https://charts.bitnami.com/bitnami
- Deploy PostgreSQL by running the following command:
helm install postgresql-scalardb-cluster bitnami/postgresql \
  --set auth.postgresPassword=postgres \
  --set primary.persistence.enabled=false
- Check if the PostgreSQL container is running by running the following command:
kubectl get pod
You should see the following output:
NAME                            READY   STATUS    RESTARTS   AGE
postgresql-scalardb-cluster-0   1/1     Running   0          17s
Production/staging environments
For production environments, deploy the backend database based on the above requirements of the ScalarDB Analytics server and your system's requirements, such as security, availability, backup/restore, cost, and scalability.
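If you deployed the test PostgreSQL instance shown above, the ScalarDB Analytics server needs a JDBC URL that points at it from inside the cluster. The following sketch builds such a URL. It rests on several assumptions that are not stated in this document: that the Kubernetes Service has the same name as the Helm release, that the cluster uses the default `cluster.local` DNS domain, that PostgreSQL listens on port 5432, and that the default database is `postgres`. The function name `pg_jdbc_url` is hypothetical. Confirm the actual Service name with `kubectl get svc` before using the result.

```shell
# Hypothetical helper: prints an in-cluster JDBC URL for a PostgreSQL release.
#   $1: Helm release name (assumed to match the Service name)
#   $2: Kubernetes namespace (defaults to "default")
#   $3: database name (defaults to "postgres", an assumption)
pg_jdbc_url() {
  release="$1"
  namespace="${2:-default}"
  database="${3:-postgres}"
  # Assumes the default cluster DNS domain and the default PostgreSQL port.
  printf 'jdbc:postgresql://%s.%s.svc.cluster.local:5432/%s\n' \
    "$release" "$namespace" "$database"
}

# Usage:
#   pg_jdbc_url postgresql-scalardb-cluster
```

The printed value is the kind of URL you would set in scalar.db.analytics.server.db.url in Step 5.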
Step 4. Deploy an object storage
Deploy an object storage based on the following requirements and checkpoints:
- Decide which object storage to use.
  - You can use Amazon S3, Azure Blob Storage, or Google Cloud Storage as a data store for metering information for the ScalarDB Analytics server.
  - You should use the object storage that is provided by the same cloud service provider as the Kubernetes cluster that you chose in Step 2. Deploy a Kubernetes cluster. For example, if you chose EKS, you should use Amazon S3.
- Check the object storage requirements for the ScalarDB Analytics server.
  - You must allow the ScalarDB Analytics server to read from and write to the object storage.
- Deploy the object storage in your environment.
Testing/development environments
For testing or development purposes, you can store metering information on the filesystem in the ScalarDB Analytics server container. In other words, you don't need to use the object storage. In this case, you need to set scalar.db.analytics.server.metering.storage.provider=filesystem in the properties file. For more details, see Step 5. Create a custom values file.
Production/staging environments
For production environments, deploy the object storage based on the above requirements of the ScalarDB Analytics server and your system's requirements, such as security, availability, backup/restore, cost, and scalability.
Step 5. Create a custom values file
Create your custom values file scalardb-analytics-server.yaml based on your environment and your decisions in the previous steps.
Set the required configurations
- Set the container image and the license configurations
Based on the billing method you chose in Step 1. Decide on the billing method for ScalarDB Analytics, set the container image configuration in scalarDbAnalyticsServer.image.repository. The following are examples of this configuration for each billing method.
Pay as you go (container offer - AWS Marketplace)
scalarDbAnalyticsServer:
  image:
    repository: 709825985650.dkr.ecr.us-east-1.amazonaws.com/scalar/scalardb-analytics-server-aws-payg
Fixed price w/bring your own license (container offer - any supported Kubernetes platform)
Warning: You need to have a license key (trial license or commercial license) to use the ScalarDB Analytics server. If you don't have a license key, please contact us.
scalarDbAnalyticsServer:
  image:
    repository: ghcr.io/scalar-labs/scalardb-analytics-server-byol
  properties: |
    scalar.db.analytics.server.licensing.license_key=<YOUR_LICENSE_KEY>
    scalar.db.analytics.server.licensing.license_check_cert_pem=-----BEGIN CERTIFICATE-----\nMIID...certificate content...\n-----END CERTIFICATE-----
- Set the service account configurations
Based on the billing method you chose in Step 1. Decide on the billing method for ScalarDB Analytics, set the service account configurations in scalarDbAnalyticsServer.serviceAccount. The following are examples of this configuration for each billing method.
Pay as you go (container offer - AWS Marketplace)
scalarDbAnalyticsServer:
  serviceAccount:
    serviceAccountName: <SERVICE_ACCOUNT_NAME>
    automountServiceAccountToken: true
Note: Change <SERVICE_ACCOUNT_NAME> to the name of the service account that you created by using the eksctl create iamserviceaccount command in Step 2. Deploy a Kubernetes cluster.
Fixed price w/bring your own license (container offer - any supported Kubernetes platform)
You don't need to set a service account configuration.
- Set the database configurations
Based on the backend database you chose in Step 3. Deploy a backend database, set the database configurations in scalarDbAnalyticsServer.properties. The following are examples of these configurations for each database.
PostgreSQL
scalarDbAnalyticsServer:
  properties: |
    scalar.db.analytics.server.db.url=jdbc:postgresql://<POSTGRESQL_SERVER_HOSTNAME>:<POSTGRESQL_SERVER_PORT>/<POSTGRESQL_DATABASE_NAME>
    scalar.db.analytics.server.db.username=<POSTGRESQL_USERNAME>
    scalar.db.analytics.server.db.password=<POSTGRESQL_PASSWORD>
MySQL
scalarDbAnalyticsServer:
  properties: |
    scalar.db.analytics.server.db.url=jdbc:mysql://<MYSQL_SERVER_HOSTNAME>:<MYSQL_SERVER_PORT>/<MYSQL_DATABASE_NAME>
    scalar.db.analytics.server.db.username=<MYSQL_USERNAME>
    scalar.db.analytics.server.db.password=<MYSQL_PASSWORD>
SQL Server
scalarDbAnalyticsServer:
  properties: |
    scalar.db.analytics.server.db.url=jdbc:sqlserver://<SQL_SERVER_HOSTNAME>:<SQL_SERVER_PORT>;databaseName=<SQL_SERVER_DATABASE_NAME>;encrypt=true;trustServerCertificate=true
    scalar.db.analytics.server.db.username=<SQL_SERVER_USERNAME>
    scalar.db.analytics.server.db.password=<SQL_SERVER_PASSWORD>
Oracle
scalarDbAnalyticsServer:
  properties: |
    scalar.db.analytics.server.db.url=jdbc:oracle:thin:@//<ORACLE_SERVER_HOSTNAME>:<ORACLE_SERVER_PORT>/<PDB_NAME>
    scalar.db.analytics.server.db.username=<ORACLE_USERNAME>
    scalar.db.analytics.server.db.password=<ORACLE_PASSWORD>
- Set the object storage configurations
Based on the object storage you chose in Step 4. Deploy an object storage, set the object storage configurations in scalarDbAnalyticsServer.properties. The following are examples of these configurations for each object storage.
Amazon S3
scalarDbAnalyticsServer:
  properties: |
    scalar.db.analytics.server.metering.storage.provider=aws-s3
    scalar.db.analytics.server.metering.storage.accessKeyId=<YOUR_ACCESS_KEY>
    scalar.db.analytics.server.metering.storage.secretAccessKey=<YOUR_SECRET_ACCESS_KEY>
Azure Blob Storage
scalarDbAnalyticsServer:
  properties: |
    scalar.db.analytics.server.metering.storage.provider=azureblob
    scalar.db.analytics.server.metering.storage.accessKeyId=<YOUR_ACCESS_KEY>
    scalar.db.analytics.server.metering.storage.secretAccessKey=<YOUR_SECRET_ACCESS_KEY>
Google Cloud Storage
scalarDbAnalyticsServer:
  properties: |
    scalar.db.analytics.server.metering.storage.provider=google-cloud-storage
    scalar.db.analytics.server.metering.storage.accessKeyId=<YOUR_ACCESS_KEY>
    scalar.db.analytics.server.metering.storage.secretAccessKey=<YOUR_SECRET_ACCESS_KEY>
Filesystem
Note: You can use filesystem for testing or development purposes only. Filesystem is not recommended for production use.
scalarDbAnalyticsServer:
  properties: |
    scalar.db.analytics.server.metering.storage.provider=filesystem
    scalar.db.analytics.server.metering.storage.path=/tmp/scalardb-analytics-metering
- Set the service configurations
Based on how your Spark application connects to the ScalarDB Analytics server, you need to set scalarDbAnalyticsServer.service.type. The following are examples of this configuration for each type of connection.
Access from outside of the Kubernetes cluster
If your Spark application accesses the ScalarDB Analytics server from outside of the Kubernetes cluster, set scalarDbAnalyticsServer.service.type to LoadBalancer.
scalarDbAnalyticsServer:
  service:
    type: "LoadBalancer"
Access from inside of the Kubernetes cluster
If your Spark application accesses the ScalarDB Analytics server from inside of the Kubernetes cluster, set scalarDbAnalyticsServer.service.type to ClusterIP.
scalarDbAnalyticsServer:
  service:
    type: "ClusterIP"
- Check the required configurations
After completing the above steps, you should have configurations similar to the following examples, depending on your environment.
Note: These configurations are just examples. The actual configurations may be different from these examples. Make sure to set configurations based on your environment.
BYOL / PostgreSQL / Azure Blob Storage / LoadBalancer
scalarDbAnalyticsServer:
  image:
    repository: ghcr.io/scalar-labs/scalardb-analytics-server-byol
  properties: |
    # License configurations
    scalar.db.analytics.server.licensing.license_key=<YOUR_LICENSE_KEY>
    scalar.db.analytics.server.licensing.license_check_cert_pem=-----BEGIN CERTIFICATE-----\nMIID...certificate content...\n-----END CERTIFICATE-----
    # Database configurations
    scalar.db.analytics.server.db.url=jdbc:postgresql://<POSTGRESQL_SERVER_HOSTNAME>:<POSTGRESQL_SERVER_PORT>/<POSTGRESQL_DATABASE_NAME>
    scalar.db.analytics.server.db.username=<POSTGRESQL_USERNAME>
    scalar.db.analytics.server.db.password=<POSTGRESQL_PASSWORD>
    # Object storage configurations
    scalar.db.analytics.server.metering.storage.provider=azureblob
    scalar.db.analytics.server.metering.storage.accessKeyId=<YOUR_ACCESS_KEY>
    scalar.db.analytics.server.metering.storage.secretAccessKey=<YOUR_SECRET_ACCESS_KEY>
  service:
    type: "LoadBalancer"
AWS Marketplace / MySQL / Amazon S3 / ClusterIP
scalarDbAnalyticsServer:
  image:
    repository: 709825985650.dkr.ecr.us-east-1.amazonaws.com/scalar/scalardb-analytics-server-aws-payg
  properties: |
    # Database configurations
    scalar.db.analytics.server.db.url=jdbc:mysql://<MYSQL_SERVER_HOSTNAME>:<MYSQL_SERVER_PORT>/<MYSQL_DATABASE_NAME>
    scalar.db.analytics.server.db.username=<MYSQL_USERNAME>
    scalar.db.analytics.server.db.password=<MYSQL_PASSWORD>
    # Object storage configurations
    scalar.db.analytics.server.metering.storage.provider=aws-s3
    scalar.db.analytics.server.metering.storage.accessKeyId=<YOUR_ACCESS_KEY>
    scalar.db.analytics.server.metering.storage.secretAccessKey=<YOUR_SECRET_ACCESS_KEY>
  service:
    type: "ClusterIP"
  serviceAccount:
    serviceAccountName: "scalardb-analytics-payg-sa"
    automountServiceAccountToken: true
BYOL / SQL Server / Filesystem / ClusterIP
Note: You can use filesystem for testing or development purposes only. Filesystem is not recommended for production use.
scalarDbAnalyticsServer:
  image:
    repository: ghcr.io/scalar-labs/scalardb-analytics-server-byol
  properties: |
    # License configurations
    scalar.db.analytics.server.licensing.license_key=<YOUR_LICENSE_KEY>
    scalar.db.analytics.server.licensing.license_check_cert_pem=-----BEGIN CERTIFICATE-----\nMIID...certificate content...\n-----END CERTIFICATE-----
    # Database configurations
    scalar.db.analytics.server.db.url=jdbc:sqlserver://<SQL_SERVER_HOSTNAME>:<SQL_SERVER_PORT>;databaseName=<SQL_SERVER_DATABASE_NAME>;encrypt=true;trustServerCertificate=true
    scalar.db.analytics.server.db.username=<SQL_SERVER_USERNAME>
    scalar.db.analytics.server.db.password=<SQL_SERVER_PASSWORD>
    # Filesystem configurations
    scalar.db.analytics.server.metering.storage.provider=filesystem
    scalar.db.analytics.server.metering.storage.path=/tmp/scalardb-analytics-metering
  service:
    type: "ClusterIP"
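The values file can also be generated from environment variables so that secrets are not typed into the file by hand. The following is a minimal sketch for a test environment (BYOL image, PostgreSQL, filesystem metering, ClusterIP). The function name `write_test_values` and the environment variable names are hypothetical; the property keys are the ones shown in the examples above. The certificate value is expected on a single line with literal \n separators, as in those examples.

```shell
# Hypothetical helper: writes a test-environment values file to the path in $1.
# Reads LICENSE_KEY, LICENSE_CERT_PEM, DB_URL, DB_USER, and DB_PASSWORD from the
# environment (variable names are assumptions made for this sketch).
write_test_values() {
  out="$1"
  # Fail early if any required variable is missing or empty.
  : "${LICENSE_KEY:?LICENSE_KEY is not set}"
  : "${LICENSE_CERT_PEM:?LICENSE_CERT_PEM is not set}"
  : "${DB_URL:?DB_URL is not set}"
  : "${DB_USER:?DB_USER is not set}"
  : "${DB_PASSWORD:?DB_PASSWORD is not set}"
  cat > "$out" <<EOF
scalarDbAnalyticsServer:
  image:
    repository: ghcr.io/scalar-labs/scalardb-analytics-server-byol
  properties: |
    scalar.db.analytics.server.licensing.license_key=${LICENSE_KEY}
    scalar.db.analytics.server.licensing.license_check_cert_pem=${LICENSE_CERT_PEM}
    scalar.db.analytics.server.db.url=${DB_URL}
    scalar.db.analytics.server.db.username=${DB_USER}
    scalar.db.analytics.server.db.password=${DB_PASSWORD}
    scalar.db.analytics.server.metering.storage.provider=filesystem
    scalar.db.analytics.server.metering.storage.path=/tmp/scalardb-analytics-metering
  service:
    type: "ClusterIP"
EOF
}

# Usage:
#   write_test_values scalardb-analytics-server.yaml
```

Because the file contains credentials in plain text, keep it out of version control.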
Set the optional configurations
You can see the optional configurations in Optional configurations. Set the optional configurations based on your environment if necessary.
Step 6. Deploy a ScalarDB Analytics server by using Helm Chart
Deploy, upgrade, or uninstall the ScalarDB Analytics server deployment by using the helm command with your custom values file scalardb-analytics-server.yaml that you created in Step 5. Create a custom values file.
Prerequisites
- Add the Scalar Helm Chart repository and update it to the latest version by using the helm repo add command and helm repo update command as follows:
helm repo add scalar-labs https://scalar-labs.github.io/helm-charts
helm repo update
- Decide on the version of the ScalarDB product (strictly, the corresponding chart version) that you will deploy or upgrade. You can check the version by running the following command:
helm search repo scalar-labs/<CHART_NAME> -l
Tip: In this document (when you deploy the ScalarDB Analytics server), run the following command:
helm search repo scalar-labs/scalardb-analytics-server -l
For example, you should see output similar to the following:
NAME                       CHART VERSION   APP VERSION
scalar-labs/<CHART_NAME>   1.9.0           3.16.1
scalar-labs/<CHART_NAME>   1.8.1           3.16.1
scalar-labs/<CHART_NAME>   1.8.0           3.16.0
scalar-labs/<CHART_NAME>   1.7.6           3.15.5
scalar-labs/<CHART_NAME>   1.7.5           3.15.5
scalar-labs/<CHART_NAME>   1.7.4           3.15.5
scalar-labs/<CHART_NAME>   1.7.3           3.15.4
scalar-labs/<CHART_NAME>   1.7.2           3.15.3
scalar-labs/<CHART_NAME>   1.7.1           3.15.2
scalar-labs/<CHART_NAME>   1.7.0           3.15.1
scalar-labs/<CHART_NAME>   1.6.4           3.14.4
scalar-labs/<CHART_NAME>   1.6.3           3.14.3
scalar-labs/<CHART_NAME>   1.6.2           3.14.2
scalar-labs/<CHART_NAME>   1.6.1           3.14.1
scalar-labs/<CHART_NAME>   1.6.0           3.14.0
Note: APP VERSION means the version of the ScalarDB product itself. First, check this version to decide which version of the ScalarDB product you will deploy or upgrade.
- After checking the version under APP VERSION and deciding which version of the ScalarDB product you will deploy or upgrade, note the corresponding version under CHART VERSION.
- If the same version appears several times under APP VERSION, note the latest version under CHART VERSION.
- For example:
  - If you want to deploy the ScalarDB product 3.16.1, note 1.9.0 as CHART VERSION.
  - If you want to deploy the ScalarDB product 3.15.5, note 1.7.6 as CHART VERSION.
  - If you want to deploy the ScalarDB product 3.14.4, note 1.6.4 as CHART VERSION.
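The rule above (pick the latest CHART VERSION for the chosen APP VERSION) can be applied mechanically to the helm search output. The following is a small sketch rather than an official tool. The function name `latest_chart_for_app` is hypothetical, it assumes that the second and third whitespace-separated columns are the chart and app versions as in the sample output, and it relies on `sort -V`, which is available in GNU coreutils.

```shell
# Hypothetical helper: reads `helm search repo ... -l` output on stdin and
# prints the highest CHART VERSION whose APP VERSION equals $1.
latest_chart_for_app() {
  app_version="$1"
  # Column 2 is CHART VERSION and column 3 is APP VERSION; appending "" forces
  # a string comparison. The header row never matches a real version.
  awk -v app="$app_version" '($3 "") == (app "") { print $2 }' \
    | sort -V \
    | tail -n 1
}

# Usage:
#   helm search repo scalar-labs/scalardb-analytics-server -l \
#     | latest_chart_for_app 3.16.1
```

If the requested app version is not listed, the function prints nothing, so check for an empty result before passing it to the --version option.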
Deploy, upgrade, or uninstall
- Deploy
- Upgrade
- Uninstall
Deploy the ScalarDB Analytics server by using the helm install command as follows:
helm install <RELEASE_NAME> scalar-labs/scalardb-analytics-server -f scalardb-analytics-server.yaml --namespace <KUBERNETES_NAMESPACE> --version <CHART_VERSION>
- Change <RELEASE_NAME> to an arbitrary (unique) name for your deployment.
- For the --namespace option, change <KUBERNETES_NAMESPACE> to the name of the Kubernetes namespace that you want to deploy the ScalarDB Analytics server to.
- For the --version option, change <CHART_VERSION> to the version that you noted in the previous step.
Upgrade the existing ScalarDB Analytics server deployment by using the helm upgrade command as follows:
helm upgrade <RELEASE_NAME> scalar-labs/scalardb-analytics-server -f scalardb-analytics-server.yaml --namespace <KUBERNETES_NAMESPACE> --version <CHART_VERSION>
- Change <RELEASE_NAME> to the name of the deployment that you want to upgrade.
- For the --namespace option, change <KUBERNETES_NAMESPACE> to the name of the Kubernetes namespace that contains the ScalarDB Analytics server deployment that you want to upgrade.
- For the --version option, change <CHART_VERSION> to the version that you noted in the previous step.
Downgrading the version of the ScalarDB Analytics server is not supported. When specifying a version number, you can do only the following:
- Specify the same version as the existing deployment. For example, you might do this when updating configurations.
- Specify a version that is greater than the existing deployment. For example, you might do this when upgrading the version of the ScalarDB Analytics server.
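Because downgrades are not supported, it can help to compare versions before running helm upgrade. The following sketch shows one way to do that check. The function name `is_upgrade_allowed` is hypothetical, the comparison relies on `sort -V` from GNU coreutils, and how you obtain the currently deployed version is left to you.

```shell
# Hypothetical helper: succeeds if the target version ($2) is the same as or
# newer than the current version ($1), and fails if it would be a downgrade.
is_upgrade_allowed() {
  current="$1"
  target="$2"
  # After a version sort, the current version must be the lowest of the two
  # (or equal to the target) for the change to be allowed.
  lowest=$(printf '%s\n%s\n' "$current" "$target" | sort -V | head -n 1)
  [ "$lowest" = "$current" ]
}

# Usage:
#   is_upgrade_allowed 3.15.5 3.16.1 && echo "OK to upgrade"
```

A version sort is used instead of a plain string comparison so that, for example, 1.10.0 is treated as newer than 1.9.0.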
Uninstall the existing ScalarDB Analytics server deployment by using the helm uninstall command as follows:
helm uninstall <RELEASE_NAME> --namespace <KUBERNETES_NAMESPACE>
- Change <RELEASE_NAME> to the name of the deployment that you want to uninstall.
- For the --namespace option, change <KUBERNETES_NAMESPACE> to the name of the Kubernetes namespace that contains the ScalarDB Analytics server deployment that you want to uninstall.
Step 7. Check your deployment
After deploying the ScalarDB Analytics server or upgrading it, you should check the following points:
- Check if the pod status is Running by running the following command:
kubectl get pod --namespace <KUBERNETES_NAMESPACE>
Note: For the --namespace option, change <KUBERNETES_NAMESPACE> to the name of the Kubernetes namespace that you deployed the ScalarDB Analytics server to.
For example, you can see Running in the STATUS column and 1/1 in the READY column as follows:
$ kubectl get pod
NAME                                         READY   STATUS    RESTARTS   AGE
scalardb-analytics-server-86767fff4c-p6nkq   1/1     Running   0          22m
- Check if the service is exposed by running the following command:
kubectl get svc --namespace <KUBERNETES_NAMESPACE>
Note: For the --namespace option, change <KUBERNETES_NAMESPACE> to the name of the Kubernetes namespace that you deployed the ScalarDB Analytics server to.
Access from outside of the Kubernetes cluster
If you set scalarDbAnalyticsServer.service.type to LoadBalancer in Step 5. Create a custom values file, you'll see the IP address or FQDN (depending on the Kubernetes cluster) in the EXTERNAL-IP column as follows:
$ kubectl get svc
NAME                        TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                           AGE
kubernetes                  ClusterIP      10.96.0.1       <none>        443/TCP                           4h54m
scalardb-analytics-server   LoadBalancer   10.98.116.121   127.0.0.1     11051:32619/TCP,11052:32598/TCP   2m43s
Note: If you're using minikube for testing or development purposes, you'll need to run the minikube tunnel command to expose the LoadBalancer service.
Access from inside of the Kubernetes cluster
If you set scalarDbAnalyticsServer.service.type to ClusterIP in Step 5. Create a custom values file, you'll see the IP address in the CLUSTER-IP column as follows:
$ kubectl get svc
NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)               AGE
kubernetes                  ClusterIP   10.96.0.1        <none>        443/TCP               4h56m
scalardb-analytics-server   ClusterIP   10.102.141.240   <none>        11051/TCP,11052/TCP   3s
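To pick the address that your Spark application should connect to, you can read it from the kubectl get svc output instead of copying it by hand. The following is a minimal sketch under the assumption that the output has the column order shown above (NAME, TYPE, CLUSTER-IP, EXTERNAL-IP). The function name `service_address` is hypothetical.

```shell
# Hypothetical helper: reads `kubectl get svc` output on stdin and prints the
# address of the service named $1: the EXTERNAL-IP value when one is assigned,
# otherwise the CLUSTER-IP value.
service_address() {
  svc="$1"
  awk -v svc="$svc" '$1 == svc {
    # Column 3 is CLUSTER-IP and column 4 is EXTERNAL-IP.
    if ($4 != "<none>" && $4 != "<pending>") print $4; else print $3
  }'
}

# Usage:
#   kubectl get svc --namespace <KUBERNETES_NAMESPACE> \
#     | service_address scalardb-analytics-server
```

A LoadBalancer service whose external address has not been assigned yet shows `<pending>`, in which case the sketch falls back to the cluster IP, which is reachable only from inside the cluster.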