Version: 3.10

Ledger Alerts

LedgerClusterDown

This is the most critical alert. It indicates that a Ledger cluster cannot process requests and should be handled with the highest priority.

Example Alert

Firing

[FIRING:1] LedgerClusterDown - critical
Alert: Ledger cluster is down - critical
Description: Ledger cluster is down, no resquest can be process.
Details:
• alertname: LedgerClusterDown
• deployment: prod-scalardl-ledger

Resolved

[RESOLVED] LedgerClusterDown - critical
Alert: Ledger cluster is down - critical
Description: Ledger cluster is down, no resquest can be process.
Details:
• alertname: LedgerClusterDown
• deployment: prod-scalardl-ledger

Action Needed

  • Check the number of replicas with kubectl get deployments prod-scalardl-ledger
  • Check the replica status and recent events of the deployment with kubectl describe deployments prod-scalardl-ledger
  • Check node statuses with kubectl get node -o wide
  • To pinpoint the root cause of the failure, check the Kubernetes logs on the monitor server at /log/kubernetes/<year>/<month>-<day>/kube.log
  • Check whether your cloud provider has reported any known issues. For example, if you are on Azure, check the Azure status page.
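
The kubectl checks above can be run in one pass. The following is a minimal sketch, not an official tool: the function name ledger_cluster_down_checks and the LEDGER_DEPLOYMENT variable are hypothetical, and it assumes kubectl is already configured for the affected cluster and that the deployment is named as in the example alert.

```shell
# Assumption: the deployment is named prod-scalardl-ledger, as in the
# example alert. Override LEDGER_DEPLOYMENT if yours differs.
LEDGER_DEPLOYMENT="${LEDGER_DEPLOYMENT:-prod-scalardl-ledger}"

# Hypothetical helper: run the kubectl checks from the list in order.
ledger_cluster_down_checks() {
  echo "== Deployment replica counts =="
  kubectl get deployments "$LEDGER_DEPLOYMENT"

  echo "== Deployment details and recent events =="
  kubectl describe deployments "$LEDGER_DEPLOYMENT"

  echo "== Node statuses =="
  kubectl get node -o wide
}
```

After sourcing the snippet, call ledger_cluster_down_checks from a shell that can reach the cluster. The log server and cloud provider checks remain manual steps.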

LedgerClusterDegraded

This alert lets you know if a Kubernetes cluster cannot start Ledger pods, which means that the cluster does not have enough resources or has lost one or more of the Kubernetes nodes needed to run the deployment.

Example Alert

Firing

[FIRING:1] LedgerClusterDegraded - warning
Alert: Ledger cluster is running in a degraded mode - warning
Description: Ledger cluster is running in a degraded mode, some of the Ledger pods are not healthy.
Details:
• alertname: LedgerClusterDegraded
• deployment: prod-scalardl-ledger

Resolved

[RESOLVED] LedgerClusterDegraded - warning
Alert: Ledger cluster is running in a degraded mode - warning
Description: Ledger cluster is running in a degraded mode, some of the Ledger pods are not healthy.
Details:
• alertname: LedgerClusterDegraded
• deployment: prod-scalardl-ledger

Action Needed

  • To pinpoint the root cause of the failure, check the Kubernetes logs on the monitor server at /log/kubernetes/<year>/<month>-<day>/kube.log
  • Check the Kubernetes deployment with kubectl describe deployments prod-scalardl-ledger
  • Check the replica sets with kubectl get replicasets.apps
  • Check node statuses with kubectl get node -o wide
  • Check whether your cloud provider has reported any known issues. For example, if you are on Azure, check the Azure status page.
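
To see quickly how far the deployment is from its desired size, you can compare the desired and ready replica counts. The sketch below is illustrative only: the helper name ledger_replica_gap is hypothetical, and the default deployment name is taken from the example alert. It reads the standard .spec.replicas and .status.readyReplicas fields of the deployment through kubectl's jsonpath output.

```shell
# Hypothetical helper: print desired vs. ready replicas for a deployment.
# Assumption: the deployment name defaults to the one in the example alert.
ledger_replica_gap() {
  deployment="${1:-prod-scalardl-ledger}"
  # .status.readyReplicas may be omitted when no pod is ready,
  # so the second field can come back empty.
  counts="$(kubectl get deployment "$deployment" \
    -o jsonpath='{.spec.replicas} {.status.readyReplicas}')"
  desired="${counts%% *}"   # text before the first space
  ready="${counts#* }"      # text after the first space
  desired="${desired:-0}"
  ready="${ready:-0}"
  echo "desired=$desired ready=$ready missing=$((desired - ready))"
}
```

A non-zero missing value corresponds to the degraded state that the alert describes. The describe and node checks above then help explain why those pods are not ready.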

LedgerPodsPending

This alert lets you know if a Kubernetes cluster cannot start Ledger pods, which means that the cluster does not have enough resources.

Example Alert

Firing

[FIRING:1] LedgerPodsPending - warning
Alert: Pod prod-scalardl-ledger-xxxx-yyyy in namespace default in pending status - warning
Description: Pod prod-scalardl-ledger-xxxx-yyyy in namespace default has been in pending status for more than 1 minute.
Details:
• alertname: LedgerPodsPending
• deployment: prod-scalardl-ledger

Resolved

[RESOLVED:1] LedgerPodsPending - warning
Alert: Pod prod-scalardl-ledger-xxxx-yyyy in namespace default in pending status - warning
Description: Pod prod-scalardl-ledger-xxxx-yyyy in namespace default has been in pending status for more than 1 minute.
Details:
• alertname: LedgerPodsPending
• deployment: prod-scalardl-ledger

Action Needed

  • To pinpoint the root cause of the failure, check the Kubernetes logs on the monitor server at /log/kubernetes/<year>/<month>-<day>/kube.log
  • Check the pending pod with kubectl describe pod prod-scalardl-ledger-xxxx-yyyy
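
If several pods are pending, describing them one by one is tedious. The following sketch, with a hypothetical helper named describe_pending_pods, lists every pod in the Pending phase and describes each one. It assumes the pods run in the default namespace, as in the example alert, and it does not filter by deployment, so other pending pods in that namespace are included too.

```shell
# Hypothetical helper: describe every pod that is currently Pending.
# Assumption: the namespace defaults to "default", as in the example alert.
describe_pending_pods() {
  namespace="${1:-default}"
  kubectl get pods --namespace "$namespace" \
    --field-selector=status.phase=Pending -o name |
  while read -r pod; do
    # Each line has the form pod/<name>. The Events section at the end of
    # the describe output usually states why the pod could not be scheduled.
    kubectl describe --namespace "$namespace" "$pod"
  done
}
```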

LedgerPodsError

This alert lets you know if a Kubernetes cluster cannot start Ledger pods for one of the following reasons:

  • CrashLoopBackOff
  • CreateContainerConfigError
  • CreateContainerError
  • ErrImagePull
  • ImagePullBackOff
  • InvalidImageName
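
To find out which of these reasons applies to a given pod, you can read the waiting reason of its containers directly. This is a minimal sketch under stated assumptions: the helper names pod_waiting_reasons and is_ledger_error_reason are hypothetical, and the namespace defaults to default as in the example alert below. The jsonpath expression reads the standard state.waiting.reason field of each container status.

```shell
# Hypothetical helper: print the waiting reason of each container in a pod.
# $1 = pod name, $2 = namespace (assumed "default" when omitted)
pod_waiting_reasons() {
  kubectl get pod "$1" --namespace "${2:-default}" \
    -o jsonpath='{.status.containerStatuses[*].state.waiting.reason}'
}

# Hypothetical helper: succeed if the reason is one listed for this alert.
is_ledger_error_reason() {
  case "$1" in
    CrashLoopBackOff|CreateContainerConfigError|CreateContainerError|\
ErrImagePull|ImagePullBackOff|InvalidImageName) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, pod_waiting_reasons prod-scalardl-ledger-xxxx-yyyy prints the reason for each waiting container, and each value can be passed to is_ledger_error_reason to confirm that it matches the list.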

Example Alert

Firing

[FIRING:1] LedgerPodsError - warning
Alert: Pod prod-scalardl-ledger-xxxx-yyyy in namespace default has an error status - warning
Description: Pod prod-scalardl-ledger-xxxx-yyyy in namespace default has been in pending status for more than 1 minutes.
Details:
• alertname: LedgerPodsError
• deployment: prod-scalardl-ledger

Resolved

[RESOLVED:1] LedgerPodsError - warning
Alert: Pod prod-scalardl-ledger-xxxx-yyyy in namespace default has an error status - warning
Description: Pod prod-scalardl-ledger-xxxx-yyyy in namespace default has been in pending status for more than 1 minutes.
Details:
• alertname: LedgerPodsError
• deployment: prod-scalardl-ledger

Action Needed

  • Check the failing pod with kubectl describe pod prod-scalardl-ledger-xxxx-yyyy
  • To pinpoint the root cause of the failure, check the Kubernetes logs on the monitor server at /log/kubernetes/<year>/<month>-<day>/kube.log
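
When working on the monitor server, a small helper can build the log path and narrow the log to a single pod. The sketch below is only an illustration: kube_log_path and pod_log_lines are hypothetical names, the caller must pass the year, month, and day exactly as they appear in the directory names, and it assumes that relevant log lines contain the pod name.

```shell
# Hypothetical helper: build the log path from the documented pattern
# /log/kubernetes/<year>/<month>-<day>/kube.log.
# Assumption: $1, $2, $3 are year, month, day as written on disk.
kube_log_path() {
  printf '/log/kubernetes/%s/%s-%s/kube.log\n' "$1" "$2" "$3"
}

# Hypothetical helper: print the lines of a log file that mention a pod.
# $1 = log file, $2 = pod name (matched as a fixed string, not a regex)
pod_log_lines() {
  grep -F -- "$2" "$1"
}
```

A possible call is pod_log_lines "$(kube_log_path <year> <month> <day>)" prod-scalardl-ledger-xxxx-yyyy, with the placeholders replaced by the date of the alert.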