Version: 3.16

Deploy a ScalarDB Analytics server

This document explains how to deploy a ScalarDB Analytics server in your local or production environment.

Step 1. Decide on the billing method for ScalarDB Analytics

You can get the ScalarDB Analytics server in several ways:

You can use ScalarDB Analytics in a pay-as-you-go plan in AWS Marketplace. In this case, you pay the license fee based on your query usage.

To deploy the ScalarDB Analytics server from AWS Marketplace with a pay-as-you-go plan:

  1. Go to the AWS Marketplace page for the ScalarDB Analytics server.
  2. Subscribe to the ScalarDB Analytics server.
    1. Select View purchase options.
    2. Select Subscribe.
tip

After subscribing, you'll have permission to pull the container image of the ScalarDB Analytics server from the following container registry. You will specify this container registry and pull the container image in a later step, so make a note of it.

709825985650.dkr.ecr.us-east-1.amazonaws.com/scalar/scalardb-analytics-server-aws-payg

Step 2. Deploy a Kubernetes cluster

Deploy a cluster on your preferred Kubernetes platform based on the following requirements and checkpoints:

  1. Decide which Kubernetes platform to use based on the billing method and purpose.

    • If you chose Pay as you go (container offer - AWS Marketplace) in Step 1. Decide on the billing method for ScalarDB Analytics, you need to deploy Amazon Elastic Kubernetes Service (EKS) in one of the supported regions. The supported regions will be referred to in a later step.

    • If you chose Fixed price w/bring your own license (container offer - any supported Kubernetes platform) in Step 1. Decide on the billing method for ScalarDB Analytics, you can use any of the supported Kubernetes platforms.

      note

      You should use minikube for testing or development purposes only. minikube is not recommended for production use.

  2. Check the general recommendations and requirements of the Kubernetes cluster for the ScalarDB Analytics server.

    • Recommendations
      • You should use a worker node that has at least 2 CPUs and 4 GB of memory.
        • Currently, the ScalarDB Analytics server does not have a clustering feature. Therefore, one worker node is enough.
        • If you want to make the Kubernetes cluster itself highly available, you can deploy it with multiple worker nodes.
    • Requirements
      • Your network must allow your Spark application to connect to the ScalarDB Analytics server deployed on the Kubernetes cluster. To see which port the ScalarDB Analytics server uses, see Requirements.
      • You must allow the ScalarDB Analytics server to read from and write to the backend database to store the catalog information. These procedures will be described in detail in Step 3. Deploy a backend database.
      • You must allow the ScalarDB Analytics server to read from and write to the object storage to store metering information. These procedures will be described in detail in Step 4. Deploy an object storage.
  3. Deploy a Kubernetes cluster for the ScalarDB Analytics server.

    For testing or development purposes, you can use minikube as a local Kubernetes cluster. For details on how to install and start minikube, see the official minikube documentation.

Step 3. Deploy a backend database

Deploy your preferred backend database based on the following requirements and checkpoints:

  1. Decide which backend database to use.

    • You can see the supported backend databases for the ScalarDB Analytics server in Requirements.
    • Unless you have a special reason not to, you should use a database that you are familiar with.
  2. Check the backend database requirements for the ScalarDB Analytics server.

    • You can see the requirements of each backend database on the Requirements page.
  3. Deploy the backend database in your environment.

    For testing or development purposes, you can deploy a backend database in the Kubernetes cluster as a Pod. For example, if you use PostgreSQL, you can deploy it as follows:

    1. Add the Bitnami Helm repository by running the following command:

      helm repo add bitnami https://charts.bitnami.com/bitnami
    2. Deploy PostgreSQL by running the following command:

      helm install postgresql-scalardb-cluster bitnami/postgresql \
      --set auth.postgresPassword=postgres \
      --set primary.persistence.enabled=false
    3. Check if the PostgreSQL container is running by running the following command:

      kubectl get pod

      You should see the following output:

      NAME                            READY   STATUS    RESTARTS   AGE
      postgresql-scalardb-cluster-0   1/1     Running   0          17s
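
For the test deployment above, the ScalarDB Analytics server can reach PostgreSQL through the in-cluster DNS name of its Service. The following is a minimal sketch that builds a JDBC URL in the form used later for scalar.db.analytics.server.db.url. It rests on several assumptions that are not stated in this document: the Bitnami chart names the Service after the Helm release (postgresql-scalardb-cluster), the release lives in the default namespace, the cluster domain is cluster.local, and the chart defaults of port 5432 and database postgres are unchanged. Confirm the Service name with kubectl get svc before relying on it. The function name postgres_jdbc_url is hypothetical.

```shell
# Build a JDBC URL from the in-cluster DNS name of a Kubernetes Service.
# Arguments: service name, namespace, port, database name.
# The defaults for namespace, port, and database are assumptions (see the text above).
postgres_jdbc_url() {
  printf 'jdbc:postgresql://%s.%s.svc.cluster.local:%s/%s\n' \
    "$1" "${2:-default}" "${3:-5432}" "${4:-postgres}"
}

# Example: postgres_jdbc_url postgresql-scalardb-cluster
```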

Step 4. Deploy an object storage

Deploy an object storage based on the following requirements and checkpoints:

  1. Decide which object storage to use.

  2. Check the object storage requirements for the ScalarDB Analytics server.

    • You must allow the ScalarDB Analytics server to read from and write to the object storage.
  3. Deploy the object storage in your environment.

    For testing or development purposes, you can store metering information on the filesystem in the ScalarDB Analytics server container. In other words, you don't need to use the object storage. In this case, you need to set scalar.db.analytics.server.metering.storage.provider=filesystem in the properties file. For more details, see Step 5. Create a custom values file.
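
As a rough illustration of the filesystem option, the sketch below writes a values fragment that contains only the metering provider setting named above. The function name write_filesystem_metering_values is hypothetical, and the fragment is not a complete values file: the image, database, and service settings from Step 5 still need to be added. Note that the function overwrites the target file.

```shell
# Write a values fragment that keeps metering data on the container filesystem.
# The output path defaults to the values file name used in this document.
# Caution: an existing file at that path is overwritten.
write_filesystem_metering_values() {
  out="${1:-scalardb-analytics-server.yaml}"
  cat > "$out" <<'EOF'
scalarDbAnalyticsServer:
  properties: |
    scalar.db.analytics.server.metering.storage.provider=filesystem
EOF
}

# Example: write_filesystem_metering_values ./scalardb-analytics-server.yaml
```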

Step 5. Create a custom values file

Create your custom values file scalardb-analytics-server.yaml based on your environment and your decisions in the previous steps.

Set the required configurations

  1. Set the container image and the license configurations

    Based on the billing method you chose in Step 1. Decide on the billing method for ScalarDB Analytics, set the container image configuration in scalarDbAnalyticsServer.image.repository. The following example is for the pay-as-you-go plan in AWS Marketplace.

    scalarDbAnalyticsServer:
      image:
        repository: 709825985650.dkr.ecr.us-east-1.amazonaws.com/scalar/scalardb-analytics-server-aws-payg
  2. Set the service account configurations

    Based on the billing method you chose in Step 1. Decide on the billing method for ScalarDB Analytics, set the service account configurations in scalarDbAnalyticsServer.serviceAccount. The following example is for the pay-as-you-go plan in AWS Marketplace.

    scalarDbAnalyticsServer:
      serviceAccount:
        serviceAccountName: <SERVICE_ACCOUNT_NAME>
        automountServiceAccountToken: true
    note

    Change <SERVICE_ACCOUNT_NAME> to the name of the service account that you created by using the eksctl create iamserviceaccount command in Step 2. Deploy a Kubernetes cluster.

  3. Set the database configurations

    Based on the backend database you chose in Step 3. Deploy a backend database, set the database configurations in scalarDbAnalyticsServer.properties. The following example is for PostgreSQL.

    scalarDbAnalyticsServer:
      properties: |
        scalar.db.analytics.server.db.url=jdbc:postgresql://<POSTGRESQL_SERVER_HOSTNAME>:<POSTGRESQL_SERVER_PORT>/<POSTGRESQL_DATABASE_NAME>
        scalar.db.analytics.server.db.username=<POSTGRESQL_USERNAME>
        scalar.db.analytics.server.db.password=<POSTGRESQL_PASSWORD>
  4. Set the object storage configurations

    Based on the object storage you chose in Step 4. Deploy an object storage, set the object storage configurations in scalarDbAnalyticsServer.properties. The following example is for Amazon S3.

    scalarDbAnalyticsServer:
      properties: |
        scalar.db.analytics.server.metering.storage.provider=aws-s3
        scalar.db.analytics.server.metering.storage.accessKeyId=<YOUR_ACCESS_KEY>
        scalar.db.analytics.server.metering.storage.secretAccessKey=<YOUR_SECRET_ACCESS_KEY>
  5. Set the service configurations

    Based on how your Spark application connects to the ScalarDB Analytics server, you need to set scalarDbAnalyticsServer.service.type.

    If your Spark application accesses the ScalarDB Analytics server from outside of the Kubernetes cluster, set scalarDbAnalyticsServer.service.type to LoadBalancer.

    scalarDbAnalyticsServer:
      service:
        type: "LoadBalancer"
  6. Check the required configurations

    After completing the above steps, you should have configurations similar to the following, depending on your environment:

    note

    These configurations are just examples, and your actual configurations may differ. Make sure to set the configurations based on your environment.

    scalarDbAnalyticsServer:
      image:
        repository: ghcr.io/scalar-labs/scalardb-analytics-server-byol
      properties: |
        # License configurations
        scalar.db.analytics.server.licensing.license_key=<YOUR_LICENSE_KEY>
        scalar.db.analytics.server.licensing.license_check_cert_pem=-----BEGIN CERTIFICATE-----\nMIID...certificate content...\n-----END CERTIFICATE-----
        # Database configurations
        scalar.db.analytics.server.db.url=jdbc:postgresql://<POSTGRESQL_SERVER_HOSTNAME>:<POSTGRESQL_SERVER_PORT>/<POSTGRESQL_DATABASE_NAME>
        scalar.db.analytics.server.db.username=<POSTGRESQL_USERNAME>
        scalar.db.analytics.server.db.password=<POSTGRESQL_PASSWORD>
        # Object storage configurations
        scalar.db.analytics.server.metering.storage.provider=azureblob
        scalar.db.analytics.server.metering.storage.accessKeyId=<YOUR_ACCESS_KEY>
        scalar.db.analytics.server.metering.storage.secretAccessKey=<YOUR_SECRET_ACCESS_KEY>
      service:
        type: "LoadBalancer"
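
Because the examples in this step use placeholders such as <YOUR_LICENSE_KEY>, it can help to confirm that none are left in your values file before deploying. The following is a minimal sketch of such a check. The function name check_placeholders is hypothetical, and the check assumes that placeholders always appear as upper-case names in angle brackets, as they do in this document.

```shell
# Fail if the values file still contains an unreplaced <PLACEHOLDER>.
# Matching lines are printed with their line numbers so they are easy to find.
check_placeholders() {
  file="${1:-scalardb-analytics-server.yaml}"
  if grep -n '<[A-Z][A-Z0-9_]*>' "$file"; then
    echo "Replace the placeholders listed above before deploying." >&2
    return 1
  fi
}

# Example: check_placeholders scalardb-analytics-server.yaml && echo "No placeholders left"
```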

Set the optional configurations

You can see the optional configurations in Optional configurations. Set the optional configurations based on your environment if necessary.

Step 6. Deploy a ScalarDB Analytics server by using Helm Chart

Deploy, upgrade, or uninstall the ScalarDB Analytics server deployment by using the helm command with your custom values file scalardb-analytics-server.yaml that you created in Step 5. Create a custom values file.

Prerequisites

  1. Add the Scalar Helm Chart repository and update it to the latest version by using the helm repo add command and helm repo update command as follows:

    helm repo add scalar-labs https://scalar-labs.github.io/helm-charts
    helm repo update
  2. Decide on the version of the ScalarDB product (strictly, the corresponding chart version) that you will deploy or upgrade. You can check the version by running the following command:

    helm search repo scalar-labs/<CHART_NAME> -l
    tip

    In this document (when you deploy the ScalarDB Analytics server), run the following command:

    helm search repo scalar-labs/scalardb-analytics-server -l

    For example, you should see output similar to the following:

    NAME                        CHART VERSION   APP VERSION
    scalar-labs/<CHART_NAME>    1.9.0           3.16.1
    scalar-labs/<CHART_NAME>    1.8.1           3.16.1
    scalar-labs/<CHART_NAME>    1.8.0           3.16.0
    scalar-labs/<CHART_NAME>    1.7.6           3.15.5
    scalar-labs/<CHART_NAME>    1.7.5           3.15.5
    scalar-labs/<CHART_NAME>    1.7.4           3.15.5
    scalar-labs/<CHART_NAME>    1.7.3           3.15.4
    scalar-labs/<CHART_NAME>    1.7.2           3.15.3
    scalar-labs/<CHART_NAME>    1.7.1           3.15.2
    scalar-labs/<CHART_NAME>    1.7.0           3.15.1
    scalar-labs/<CHART_NAME>    1.6.4           3.14.4
    scalar-labs/<CHART_NAME>    1.6.3           3.14.3
    scalar-labs/<CHART_NAME>    1.6.2           3.14.2
    scalar-labs/<CHART_NAME>    1.6.1           3.14.1
    scalar-labs/<CHART_NAME>    1.6.0           3.14.0
    note
    • APP VERSION means the version of the ScalarDB product itself. First, check this version to decide which version of the ScalarDB product you will deploy or upgrade.
    • After checking the version under APP VERSION and deciding which version of the ScalarDB product you will deploy or upgrade, note the corresponding version under CHART VERSION.
    • If there are several of the same versions under APP VERSION, note the latest version under CHART VERSION.
    • For example:
      • If you want to deploy the ScalarDB product 3.16.1, note 1.9.0 as CHART VERSION.
      • If you want to deploy the ScalarDB product 3.15.5, note 1.7.6 as CHART VERSION.
      • If you want to deploy the ScalarDB product 3.14.4, note 1.6.4 as CHART VERSION.
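
The lookup described in the note above can be sketched as a small filter over the helm search output. This is only an illustration: the function name latest_chart_version is hypothetical, and it assumes that the output has the NAME, CHART VERSION, and APP VERSION columns in that order and that rows are listed newest first, as in the example output.

```shell
# Print the newest chart version whose APP VERSION matches the given product version.
# Reads `helm search repo ... -l` output on stdin and skips the header row.
# Assumes rows are listed newest first, so the first match is the latest chart.
latest_chart_version() {
  awk -v app="$1" 'NR > 1 && $3 == app { print $2; exit }'
}

# Example:
#   helm search repo scalar-labs/scalardb-analytics-server -l | latest_chart_version 3.16.1
```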

Deploy, upgrade, or uninstall

Deploy the ScalarDB Analytics server by using the helm install command as follows:

helm install <RELEASE_NAME> scalar-labs/scalardb-analytics-server -f scalardb-analytics-server.yaml --namespace <KUBERNETES_NAMESPACE> --version <CHART_VERSION>
note
  • Change <RELEASE_NAME> to an arbitrary, unique name for your deployment.
  • For the --namespace option, change <KUBERNETES_NAMESPACE> to the name of the Kubernetes namespace that you want to deploy the ScalarDB Analytics server to.
  • For the --version option, change <CHART_VERSION> to the version that you noted in the previous step.
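
Upgrading and uninstalling follow the same pattern with the standard helm upgrade and helm uninstall commands. The sketch below wraps all three operations in small functions. The function names and the HELM, VALUES_FILE, and CHART variables are hypothetical conveniences, not part of the chart. Setting HELM=echo prints the arguments instead of running Helm, which gives a quick way to preview a command.

```shell
# HELM can be overridden (for example, HELM=echo) to preview commands without running them.
HELM="${HELM:-helm}"
VALUES_FILE="scalardb-analytics-server.yaml"
CHART="scalar-labs/scalardb-analytics-server"

# Arguments: release name, Kubernetes namespace, chart version.
install_server() {
  "$HELM" install "$1" "$CHART" -f "$VALUES_FILE" --namespace "$2" --version "$3"
}

# Arguments: release name, Kubernetes namespace, chart version.
upgrade_server() {
  "$HELM" upgrade "$1" "$CHART" -f "$VALUES_FILE" --namespace "$2" --version "$3"
}

# Arguments: release name, Kubernetes namespace.
uninstall_server() {
  "$HELM" uninstall "$1" --namespace "$2"
}

# Example: upgrade_server <RELEASE_NAME> <KUBERNETES_NAMESPACE> <CHART_VERSION>
```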

Step 7. Check your deployment

After deploying the ScalarDB Analytics server or upgrading it, you should check the following points:

  1. Check if the pod status is Running by running the following command:

    kubectl get pod --namespace <KUBERNETES_NAMESPACE>
    note

    For the --namespace option, change <KUBERNETES_NAMESPACE> to the name of the Kubernetes namespace that you deployed the ScalarDB Analytics server to.

    For example, you can see Running in the STATUS column and 1/1 in the READY column as follows:

    $ kubectl get pod
    NAME                                         READY   STATUS    RESTARTS   AGE
    scalardb-analytics-server-86767fff4c-p6nkq   1/1     Running   0          22m
  2. Check if the service is exposed by running the following command:

    kubectl get svc --namespace <KUBERNETES_NAMESPACE>
    note

    For the --namespace option, change <KUBERNETES_NAMESPACE> to the name of the Kubernetes namespace that you deployed the ScalarDB Analytics server to.

    If you set scalarDbAnalyticsServer.service.type to LoadBalancer in Step 5. Create a custom values file, you'll see the IP address or FQDN (depending on your Kubernetes cluster) in the EXTERNAL-IP column as follows:

    $ kubectl get svc
    NAME                        TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                           AGE
    kubernetes                  ClusterIP      10.96.0.1       <none>        443/TCP                           4h54m
    scalardb-analytics-server   LoadBalancer   10.98.116.121   127.0.0.1     11051:32619/TCP,11052:32598/TCP   2m43s
    note

    If you're using minikube for testing or development purposes, you'll need to run the minikube tunnel command to expose the LoadBalancer service.
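
The two checks in this step can also be scripted. The following is a minimal sketch that reads the default table output of kubectl get pod and kubectl get svc. The function names all_pods_ready and external_address are hypothetical, and the sketch assumes the default column layout shown above. The Service name depends on your release, so confirm it in the NAME column first.

```shell
# Succeed only if at least one pod row is present and every pod row
# is in the Running status and fully ready (for example, 1/1).
all_pods_ready() {
  awk 'NR > 1 {
         seen = 1
         split($2, ready, "/")
         if ($3 != "Running" || ready[1] != ready[2]) bad = 1
       }
       END { if (seen && !bad) exit 0; exit 1 }'
}

# Print the EXTERNAL-IP column for the Service with the given name.
external_address() {
  awk -v svc="$1" 'NR > 1 && $1 == svc { print $4 }'
}

# Example:
#   kubectl get pod --namespace <KUBERNETES_NAMESPACE> | all_pods_ready && echo "Pods are ready"
#   kubectl get svc --namespace <KUBERNETES_NAMESPACE> | external_address scalardb-analytics-server
```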