# Kubernetes Deployment (EKS)

Deploy Data Stream on Amazon EKS with autoscaling using KEDA.
## Architecture

## AWS EKS Setup

### Prerequisites

- AWS CLI configured with appropriate permissions
- kubectl installed
- eksctl (optional, for cluster creation)
- Helm 3.x
### Create EKS Cluster (if needed)

```bash
# Using eksctl
eksctl create cluster \
  --name datastream-prod \
  --region us-east-1 \
  --nodegroup-name workers \
  --node-type t3.medium \
  --nodes 3 \
  --nodes-min 2 \
  --nodes-max 5 \
  --managed

# Configure kubectl
aws eks update-kubeconfig --name datastream-prod --region us-east-1
```
### Verify Connection

```bash
kubectl cluster-info
kubectl get nodes
```
## Prerequisites

- Kubernetes 1.25+ (EKS 1.28+ recommended)
- KEDA 2.x installed
- AWS Load Balancer Controller (for ALB Ingress)
- kubectl configured
## Install KEDA

```bash
# Using Helm
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace

# Verify installation
kubectl get pods -n keda
```
## Deployment

### 1. Create Namespace

```yaml
# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: datastream
```

```bash
kubectl apply -f namespace.yaml
```
### 2. Install AWS Load Balancer Controller

The AWS Load Balancer Controller provisions Application Load Balancers for your Ingress resources:

```bash
# Add EKS Helm repo
helm repo add eks https://aws.github.io/eks-charts
helm repo update

# Install AWS Load Balancer Controller
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  -n kube-system \
  --set clusterName=datastream-prod \
  --set serviceAccount.create=true \
  --set serviceAccount.name=aws-load-balancer-controller
```
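Note that `serviceAccount.create=true` by itself does not grant the controller AWS API permissions; on EKS this is normally done via IRSA. A minimal sketch of the annotated ServiceAccount the controller would use instead of a Helm-created one (the role name and account ID placeholder are assumptions; the IAM role and policy must already exist):

```yaml
# ServiceAccount with IRSA annotation (sketch; role ARN is an assumption)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: aws-load-balancer-controller
  namespace: kube-system
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<account-id>:role/AWSLoadBalancerControllerRole
```

If you manage the ServiceAccount this way, install the chart with `--set serviceAccount.create=false` so Helm reuses it.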
### 3. Deploy Services

```bash
# Apply all manifests from EKS base
kubectl apply -k provider/k8s/eks/base/

# Check deployments
kubectl get deploy -n datastream

# Expected output:
# NAME       READY   UP-TO-DATE   AVAILABLE   AGE
# backend    3/3     3            3           5m
# consumer   2/2     2            2           5m
# redis      1/1     1            1           5m
# nats       1/1     1            1           5m
```
### 4. Verify Ingress

```bash
# Check ALB creation
kubectl get ingress -n datastream

# Get ALB DNS
kubectl get ingress datastream -n datastream -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
```
## EKS-Specific Configuration

### Ingress with ALB

```yaml
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: datastream
  namespace: datastream
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:xxx:certificate/xxx
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
    alb.ingress.kubernetes.io/ssl-redirect: '443'
spec:
  rules:
    - host: datastream.hypetech.games
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: backend
                port:
                  number: 3000
```
### External Secrets (AWS Secrets Manager)

```yaml
# external-secret.yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: datastream-secrets
  namespace: datastream
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: datastream-secrets
  data:
    - secretKey: REDIS_URL
      remoteRef:
        key: datastream/prod
        property: redis_url
    - secretKey: DATABASE_URL
      remoteRef:
        key: datastream/prod
        property: database_url
```
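The ExternalSecret references a ClusterSecretStore named `aws-secrets-manager`, which must be created first. A minimal sketch, assuming the External Secrets Operator is installed with an IRSA-enabled service account (the `external-secrets-sa` name and `external-secrets` namespace are assumptions):

```yaml
# cluster-secret-store.yaml (sketch)
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secrets-manager
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa
            namespace: external-secrets
```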
### Resource Requests for EKS

Properly sized resources for EKS nodes:

```yaml
# backend deployment resources
resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
```

```yaml
# consumer deployment resources
resources:
  requests:
    cpu: "50m"
    memory: "64Mi"
  limits:
    cpu: "200m"
    memory: "256Mi"
```
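As a sanity check on node sizing, a quick sketch of how many of these pods fit on a t3.medium node (2 vCPU, 4 GiB). Real allocatable capacity is lower because of kubelet/system reserves and DaemonSets, so treat these as upper bounds:

```python
# Rough bin-packing check against t3.medium capacity (2 vCPU, 4 GiB).
# Allocatable capacity on a real node is lower; this is an upper bound.
NODE_CPU_M = 2000    # millicores
NODE_MEM_MI = 4096   # MiB

backend = {"cpu_m": 100, "mem_mi": 128}    # backend requests
consumer = {"cpu_m": 50, "mem_mi": 64}     # consumer requests

def max_pods(node_cpu_m, node_mem_mi, req):
    """Pods per node, limited by whichever resource runs out first."""
    return min(node_cpu_m // req["cpu_m"], node_mem_mi // req["mem_mi"])

print(max_pods(NODE_CPU_M, NODE_MEM_MI, backend))   # 20 (CPU-bound)
print(max_pods(NODE_CPU_M, NODE_MEM_MI, consumer))  # 40 (CPU-bound)
```

Both workloads are CPU-bound at these requests, so scaling headroom on a 3-node group is generous relative to the `maxReplicaCount` of 5 used later.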
## Autoscaling with KEDA

Data Stream uses KEDA for intelligent autoscaling based on workload metrics.

### Consumer Scaling (NATS JetStream Lag)

The consumer scales based on pending messages in the NATS JetStream consumer:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: consumer-scaledobject
  namespace: datastream
spec:
  scaleTargetRef:
    name: consumer
  pollingInterval: 15
  cooldownPeriod: 60
  minReplicaCount: 1
  maxReplicaCount: 5
  triggers:
    - type: nats-jetstream
      metadata:
        natsServerMonitoringEndpoint: "nats.datastream.svc.cluster.local:8222"
        account: "$G"
        stream: "DebeziumStream"
        consumer: "cdc-public-rounds"
        lagThreshold: "100"            # Scale up when lag > 100
        activationLagThreshold: "10"   # Start scaling at lag > 10
```
**How it works:**

- Lag ≤ 10: stays at the minimum (1 replica)
- Lag > 10: the trigger activates and KEDA scales gradually, targeting an average lag of 100 per replica
- Lag well above 100 (roughly > 500): scales to the maximum (5 replicas)
- Cooldown: 60 seconds before scaling down
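The replica counts above follow the standard HPA calculation that KEDA feeds: for an average-value metric, desired replicas ≈ ceil(total lag / `lagThreshold`), clamped to the min/max. A minimal sketch using the values from the ScaledObject:

```python
import math

def desired_replicas(total_lag, lag_threshold=100, min_r=1, max_r=5):
    """HPA-style calculation for an average-value metric:
    desired = ceil(total_lag / lag_threshold), clamped to [min_r, max_r]."""
    desired = math.ceil(total_lag / lag_threshold)
    return max(min_r, min(max_r, desired))

for lag in (0, 50, 250, 1200):
    print(lag, desired_replicas(lag))
# 0 -> 1, 50 -> 1, 250 -> 3, 1200 -> 5
```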
### Backend Scaling (CPU/Memory)

The backend scales based on resource utilization:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: backend-scaledobject
  namespace: datastream
spec:
  scaleTargetRef:
    name: backend
  pollingInterval: 15
  cooldownPeriod: 60
  minReplicaCount: 1
  maxReplicaCount: 5
  triggers:
    - type: cpu
      metricType: Utilization
      metadata:
        value: "70"   # Scale up at 70% CPU
    - type: memory
      metricType: Utilization
      metadata:
        value: "80"   # Scale up at 80% Memory
```
**How it works:**

- CPU < 70%: maintain current replicas
- CPU > 70%: add replicas
- Memory > 80%: add replicas (WebSocket connections consume memory)
- Both triggers: KEDA creates a single HPA with both metrics; the higher resulting replica count wins
## Monitoring Autoscaling

### Check ScaledObjects

```bash
# View ScaledObjects status
kubectl get scaledobjects -n datastream

# Example output:
# NAME                    SCALETARGETKIND      SCALETARGETNAME   MIN   MAX   READY   ACTIVE
# consumer-scaledobject   apps/v1.Deployment   consumer          1     5     True    False
# backend-scaledobject    apps/v1.Deployment   backend           1     5     True    True
```
### Check HPA (created by KEDA)

```bash
# View HPA metrics
kubectl get hpa -n datastream

# Example output:
# NAME                            REFERENCE             TARGETS                          MINPODS   MAXPODS   REPLICAS
# keda-hpa-backend-scaledobject   Deployment/backend    cpu: 42%/70%, memory: 28%/80%    1         5         1
# keda-hpa-consumer-scaledobject  Deployment/consumer   0/100 (avg)                      1         5         1
```
### Check NATS Consumer Lag

```bash
# Get NATS JetStream info
kubectl exec -n datastream deploy/consumer -- \
  wget -qO- "http://nats:8222/jsz?consumers=true" | jq '.account_details[0].stream_detail[0].consumer_detail'

# Key metrics:
# - num_pending: Messages waiting to be processed
# - num_ack_pending: Messages being processed
```
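If `jq` is not available, the same extraction can be done client-side. A sketch that walks the `/jsz` response shape shown above (field names match the NATS monitoring output; the sample values are invented):

```python
import json

# Trimmed sample of a /jsz?consumers=true payload (values invented).
sample = json.loads("""
{
  "account_details": [{
    "stream_detail": [{
      "name": "DebeziumStream",
      "consumer_detail": [{
        "name": "cdc-public-rounds",
        "num_pending": 42,
        "num_ack_pending": 3
      }]
    }]
  }]
}
""")

def consumer_lag(jsz, stream, consumer):
    """Walk the /jsz structure and return num_pending for one consumer."""
    for acct in jsz.get("account_details", []):
        for s in acct.get("stream_detail", []):
            if s.get("name") != stream:
                continue
            for c in s.get("consumer_detail", []):
                if c.get("name") == consumer:
                    return c["num_pending"]
    return None

print(consumer_lag(sample, "DebeziumStream", "cdc-public-rounds"))  # 42
```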
### Watch Scaling Events

```bash
# Watch HPA events
kubectl get events -n datastream --field-selector reason=SuccessfulRescale -w

# Watch pods scaling
kubectl get pods -n datastream -w
```
## Scaling Scenarios

### High CDC Volume

When database changes spike:

- Debezium captures more events
- The NATS JetStream queue grows
- Consumer lag increases (`num_pending` > 100)
- KEDA triggers a consumer scale-up
- More consumers process events in parallel
- Lag decreases
- After the cooldown, KEDA scales back down
### High API Traffic

When client requests spike:

- More WebSocket/HTTP connections arrive
- Backend CPU/memory usage increases
- KEDA triggers a backend scale-up
- Load is distributed across replicas
- After traffic decreases, KEDA scales back down
## Configuration Reference

### Consumer ScaledObject

| Parameter | Value | Description |
|---|---|---|
| `minReplicaCount` | 1 | Minimum pods |
| `maxReplicaCount` | 5 | Maximum pods |
| `pollingInterval` | 15s | Metric check interval |
| `cooldownPeriod` | 60s | Wait before scale-down |
| `lagThreshold` | 100 | Scale-up threshold |
| `activationLagThreshold` | 10 | Activation threshold |
### Backend ScaledObject

| Parameter | Value | Description |
|---|---|---|
| `minReplicaCount` | 1 | Minimum pods |
| `maxReplicaCount` | 5 | Maximum pods |
| `pollingInterval` | 15s | Metric check interval |
| `cooldownPeriod` | 60s | Wait before scale-down |
| `cpu.value` | 70% | CPU threshold |
| `memory.value` | 80% | Memory threshold |
## Troubleshooting

### ScaledObject Shows "Unknown"

```bash
# Check KEDA operator logs
kubectl logs -n keda deploy/keda-operator --tail=50

# Check ScaledObject status
kubectl describe scaledobject consumer-scaledobject -n datastream
```
### Metrics Not Available

```bash
# Verify metrics-server is running
kubectl get pods -n kube-system | grep metrics-server

# Check NATS monitoring endpoint
kubectl exec -n datastream deploy/consumer -- wget -qO- http://nats:8222/healthz
```
### Consumer Not Scaling

```bash
# Check NATS stream info
kubectl exec -n datastream deploy/consumer -- \
  wget -qO- "http://nats:8222/jsz" | jq '.streams, .consumers'

# Verify consumer name matches:
# the consumer name in the KEDA trigger must match the NATS consumer name exactly
```
## Best Practices

- **Set appropriate thresholds**: start conservative, then tune based on actual load
- **Monitor cooldown periods**: avoid thrashing (rapid scale up/down cycles)
- **Use resource requests/limits**: KEDA's CPU/memory scaling requires them
- **Test scaling behavior**: simulate load before going to production
- **Set up alerts**: get notified when the maximum replica count is reached