
Kubernetes Deployment (EKS)

Deploy Data Stream on Amazon EKS with autoscaling using KEDA.

Architecture

[Diagram: Kubernetes architecture on EKS]

AWS EKS Setup

Prerequisites

  • AWS CLI configured with appropriate permissions
  • kubectl installed
  • eksctl (optional, for cluster creation)
  • Helm 3.x

Create EKS Cluster (if needed)

# Using eksctl
eksctl create cluster \
  --name datastream-prod \
  --region us-east-1 \
  --nodegroup-name workers \
  --node-type t3.medium \
  --nodes 3 \
  --nodes-min 2 \
  --nodes-max 5 \
  --managed

# Configure kubectl
aws eks update-kubeconfig --name datastream-prod --region us-east-1

Verify Connection

kubectl cluster-info
kubectl get nodes

Prerequisites

  • Kubernetes 1.25+ (EKS 1.28+ recommended)
  • KEDA 2.x installed
  • AWS Load Balancer Controller (for ALB Ingress)
  • kubectl configured

Install KEDA

# Using Helm
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace

# Verify installation
kubectl get pods -n keda

Deployment

1. Create Namespace

# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: datastream

kubectl apply -f namespace.yaml

2. Install AWS Load Balancer Controller

The AWS Load Balancer Controller provisions Application Load Balancers for your Ingress resources. Beyond the Helm values shown here, the controller needs IAM permissions to manage ALBs, typically granted via an IAM role for its service account (IRSA):

# Add EKS Helm repo
helm repo add eks https://aws.github.io/eks-charts
helm repo update

# Install AWS Load Balancer Controller
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  -n kube-system \
  --set clusterName=datastream-prod \
  --set serviceAccount.create=true \
  --set serviceAccount.name=aws-load-balancer-controller

3. Deploy Services

# Apply all manifests from EKS base
kubectl apply -k provider/k8s/eks/base/

# Check deployments
kubectl get deploy -n datastream

# Expected output:
# NAME       READY   UP-TO-DATE   AVAILABLE   AGE
# backend    3/3     3            3           5m
# consumer   2/2     2            2           5m
# redis      1/1     1            1           5m
# nats       1/1     1            1           5m

4. Verify Ingress

# Check ALB creation
kubectl get ingress -n datastream

# Get ALB DNS
kubectl get ingress datastream -n datastream -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'

EKS-Specific Configuration

Ingress with ALB

# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: datastream
  namespace: datastream
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:xxx:certificate/xxx
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
    alb.ingress.kubernetes.io/ssl-redirect: '443'
spec:
  rules:
    - host: datastream.hypetech.games
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: backend
                port:
                  number: 3000

External Secrets (AWS Secrets Manager)

# external-secret.yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: datastream-secrets
  namespace: datastream
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: datastream-secrets
  data:
    - secretKey: REDIS_URL
      remoteRef:
        key: datastream/prod
        property: redis_url
    - secretKey: DATABASE_URL
      remoteRef:
        key: datastream/prod
        property: database_url
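The ExternalSecret references a ClusterSecretStore named aws-secrets-manager that is not shown in this guide. A minimal sketch of what it might look like (the region, service-account name, and namespace are assumptions; it requires the External Secrets Operator with IRSA configured):

```yaml
# cluster-secret-store.yaml (sketch; names and region are assumptions)
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secrets-manager
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa       # assumed IRSA-annotated service account
            namespace: external-secrets
```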

Resource Requests for EKS

Properly sized resources for EKS nodes:

# backend deployment resources
resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"

# consumer deployment resources
resources:
  requests:
    cpu: "50m"
    memory: "64Mi"
  limits:
    cpu: "200m"
    memory: "256Mi"
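As a rough capacity check, a t3.medium node offers 2 vCPU (2000m) and 4 GiB of memory; dividing by the backend requests above gives an upper bound on pods per node (actual allocatable capacity is lower once system daemons and kube-reserved are subtracted):

```shell
# Upper bound on backend pods per t3.medium node, by requests alone
node_cpu=2000   # millicores on t3.medium (2 vCPU)
node_mem=4096   # MiB on t3.medium (4 GiB)
req_cpu=100     # backend CPU request (100m)
req_mem=128     # backend memory request (128Mi)
by_cpu=$(( node_cpu / req_cpu ))
by_mem=$(( node_mem / req_mem ))
echo "cpu-bound: $by_cpu pods, memory-bound: $by_mem pods"   # 20 and 32
```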

Autoscaling with KEDA

Data Stream uses KEDA for intelligent autoscaling based on workload metrics.

Consumer Scaling (NATS JetStream Lag)

The consumer scales based on pending messages in the NATS JetStream consumer:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: consumer-scaledobject
  namespace: datastream
spec:
  scaleTargetRef:
    name: consumer
  pollingInterval: 15
  cooldownPeriod: 60
  minReplicaCount: 1
  maxReplicaCount: 5
  triggers:
    - type: nats-jetstream
      metadata:
        natsServerMonitoringEndpoint: "nats.datastream.svc.cluster.local:8222"
        account: "$G"
        stream: "DebeziumStream"
        consumer: "cdc-public-rounds"
        lagThreshold: "100"            # target average lag per replica
        activationLagThreshold: "10"   # scaler becomes active above this lag

How it works:

  • Lag ≤ 100: 1 replica (desired ≈ ceil(lag / lagThreshold))
  • Lag > 100: replicas are added proportionally (e.g. lag 350 → 4 replicas), up to the maximum of 5
  • activationLagThreshold = 10: the scaler is considered active once lag exceeds 10 (mainly relevant when scaling to zero)
  • Scale-down: cooldownPeriod (60 s) governs scaling back to zero; scale-down between 1 and 5 replicas follows the HPA's stabilization window
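The lag trigger behaves like an HPA AverageValue metric: the desired replica count is roughly the total lag divided by lagThreshold, rounded up and clamped to the min/max bounds. A rough sketch of that arithmetic (an assumption based on standard HPA behavior, not taken from the KEDA source):

```shell
# desired = clamp(ceil(total_lag / lagThreshold), minReplicaCount, maxReplicaCount)
lag=250 threshold=100 min_r=1 max_r=5
desired=$(awk -v l="$lag" -v t="$threshold" -v lo="$min_r" -v hi="$max_r" 'BEGIN {
  d = l / t
  if (d > int(d)) d = int(d) + 1   # round up
  if (d < lo) d = lo               # clamp to minReplicaCount
  if (d > hi) d = hi               # clamp to maxReplicaCount
  print d
}')
echo "$desired"   # 3 replicas for a lag of 250
```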

Backend Scaling (CPU/Memory)

The backend scales based on resource utilization:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: backend-scaledobject
  namespace: datastream
spec:
  scaleTargetRef:
    name: backend
  pollingInterval: 15
  cooldownPeriod: 60
  minReplicaCount: 1
  maxReplicaCount: 5
  triggers:
    - type: cpu
      metricType: Utilization
      metadata:
        value: "70"   # Scale up at 70% CPU
    - type: memory
      metricType: Utilization
      metadata:
        value: "80"   # Scale up at 80% memory

How it works:

  • CPU < 70%: Maintain current replicas
  • CPU > 70%: Add replicas
  • Memory > 80%: Add replicas (WebSocket connections consume memory)
  • Both triggers: the HPA scales to whichever trigger computes the higher replica count
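The cpu and memory triggers hand off to the standard Kubernetes HPA algorithm: desiredReplicas = ceil(currentReplicas × currentUtilization / target). A quick sketch with made-up numbers:

```shell
# HPA formula behind the CPU trigger: desired = ceil(current * usage / target)
replicas=2 cpu_pct=85 target_pct=70
desired=$(awk -v r="$replicas" -v c="$cpu_pct" -v t="$target_pct" 'BEGIN {
  d = r * c / t
  if (d > int(d)) d = int(d) + 1   # round up
  print d
}')
echo "$desired"   # 2 replicas at 85% CPU -> 3 replicas
```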

Monitoring Autoscaling

Check ScaledObjects

# View ScaledObjects status
kubectl get scaledobjects -n datastream

# Example output:
# NAME                    SCALETARGETKIND      SCALETARGETNAME   MIN   MAX   READY   ACTIVE
# consumer-scaledobject   apps/v1.Deployment   consumer          1     5     True    False
# backend-scaledobject    apps/v1.Deployment   backend           1     5     True    True

Check HPA (created by KEDA)

# View HPA metrics
kubectl get hpa -n datastream

# Example output:
# NAME                             REFERENCE             TARGETS                         MINPODS   MAXPODS   REPLICAS
# keda-hpa-backend-scaledobject    Deployment/backend    cpu: 42%/70%, memory: 28%/80%   1         5         1
# keda-hpa-consumer-scaledobject   Deployment/consumer   0/100 (avg)                     1         5         1

Check NATS Consumer Lag

# Get NATS JetStream info
kubectl exec -n datastream deploy/consumer -- \
  wget -qO- "http://nats:8222/jsz?consumers=true" | jq '.account_details[0].stream_detail[0].consumer_detail'

# Key metrics:
# - num_pending: Messages waiting to be processed
# - num_ack_pending: Messages being processed

Watch Scaling Events

# Watch HPA events
kubectl get events -n datastream --field-selector reason=SuccessfulRescale -w

# Watch pods scaling
kubectl get pods -n datastream -w

Scaling Scenarios

High CDC Volume

When database changes spike:

  1. Debezium captures more events
  2. NATS JetStream queue grows
  3. Consumer lag increases (num_pending > 100)
  4. KEDA triggers consumer scale-up
  5. More consumers process events in parallel
  6. Lag decreases
  7. After cooldown, KEDA scales down

High API Traffic

When client requests spike:

  1. More WebSocket/HTTP connections
  2. Backend CPU/Memory increases
  3. KEDA triggers backend scale-up
  4. Load distributed across replicas
  5. After traffic decreases, KEDA scales down

Configuration Reference

Consumer ScaledObject

Parameter                Value   Description
minReplicaCount          1       Minimum pods
maxReplicaCount          5       Maximum pods
pollingInterval          15s     Metric check interval
cooldownPeriod           60s     Wait before scale-down
lagThreshold             100     Scale-up threshold
activationLagThreshold   10      Activation threshold

Backend ScaledObject

Parameter         Value   Description
minReplicaCount   1       Minimum pods
maxReplicaCount   5       Maximum pods
pollingInterval   15s     Metric check interval
cooldownPeriod    60s     Wait before scale-down
cpu.value         70%     CPU threshold
memory.value      80%     Memory threshold

Troubleshooting

ScaledObject Shows "Unknown"

# Check KEDA operator logs
kubectl logs -n keda deploy/keda-operator --tail=50

# Check ScaledObject status
kubectl describe scaledobject consumer-scaledobject -n datastream

Metrics Not Available

# Verify metrics-server is running
kubectl get pods -n kube-system | grep metrics-server

# Check NATS monitoring endpoint
kubectl exec -n datastream deploy/consumer -- wget -qO- http://nats:8222/healthz

Consumer Not Scaling

# Check NATS stream info
kubectl exec -n datastream deploy/consumer -- \
  wget -qO- "http://nats:8222/jsz" | jq '.streams, .consumers'

# Verify consumer name matches
# The consumer name in KEDA must match the NATS consumer name exactly

Best Practices

  1. Set appropriate thresholds: Start conservative, tune based on actual load
  2. Monitor cooldown periods: Avoid thrashing (rapid scale up/down)
  3. Use resource requests/limits: KEDA CPU/Memory scaling requires these
  4. Test scaling behavior: Simulate load before production
  5. Set up alerts: Monitor when max replicas is reached
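For the last point, if kube-prometheus-stack (or another Prometheus scraping kube-state-metrics) is available, a rule like the following could alert when the consumer sits at its replica ceiling; the rule name, labels, and duration are assumptions:

```yaml
# prometheus-rule.yaml (sketch; assumes kube-state-metrics HPA metrics are scraped)
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: datastream-scaling-alerts
  namespace: datastream
spec:
  groups:
    - name: autoscaling
      rules:
        - alert: ConsumerAtMaxReplicas
          expr: kube_horizontalpodautoscaler_status_current_replicas{horizontalpodautoscaler="keda-hpa-consumer-scaledobject"} >= 5
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Consumer has run at max replicas (5) for 10 minutes"
```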