Deploying a New Django Service on EKS: GitOps Setup with ArgoCD, External Secrets, and Gateway API
This guide documents the end-to-end GitOps setup for onboarding a new Django service to EKS: ECR, RDS, Secrets Manager, Kustomize overlays, cert-manager TLS, Gateway API routing, and CI/CD pipeline wiring.
Architecture Overview
The stack:
- AWS EKS for Kubernetes
- ArgoCD for GitOps continuous delivery
- Kustomize for manifest management (base/overlay pattern)
- External Secrets Operator for pulling secrets from AWS Secrets Manager
- cert-manager for automatic TLS certificates
- Traefik with Gateway API for ingress routing
- Shared Redis and RabbitMQ clusters for caching and messaging
┌─────────────────────────────────────────────────────────────────┐
│                       Deployment Workflow                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   App Repo                   K8s Manifests Repo                 │
│   ┌─────────┐                ┌─────────────────┐                │
│   │ Django  │     push       │    Kustomize    │                │
│   │  Code   │───────────────▶│    Overlays     │                │
│   └────┬────┘                └────────┬────────┘                │
│        │                              │                         │
│        │ GitHub Actions               │ ArgoCD watches          │
│        ▼                              ▼                         │
│   ┌─────────┐                ┌─────────────────┐                │
│   │   ECR   │                │   EKS Cluster   │                │
│   │  Image  │◀───────────────│   Deployment    │                │
│   └─────────┘                └─────────────────┘                │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
Step 1: Repository Structure
Separate repositories for application code and Kubernetes manifests. The K8s repository structure:
service-name-k8s/
├── apps/
│   └── service-name/
│       ├── argocd/
│       │   ├── application-production.yaml
│       │   └── application-sandbox.yaml
│       ├── base/
│       │   ├── configmap-app.yaml
│       │   ├── deployment-web.yaml
│       │   ├── kustomization.yaml
│       │   └── service-web.yaml
│       └── overlays/
│           ├── production/
│           │   ├── certificate.yaml
│           │   ├── external-secret.yaml
│           │   ├── httproute.yaml
│           │   ├── kustomization.yaml
│           │   ├── namespace.yaml
│           │   ├── patch-env-web-prod.yaml
│           │   ├── patch-replicas-web.yaml
│           │   └── reference-grant.yaml
│           └── sandbox/
│               └── (same structure)
└── docs/
    └── SERVICE-SETUP.md
The split is deliberate:
- Base contains shared resources — configuration that does not change between environments
- Overlays contain environment-specific patches — different replicas, URLs, debug settings
- ArgoCD Applications are separate — easy to apply or delete independently
- Clear separation of concerns — platform team manages base, application team manages overlays
Step 2: AWS Resources
Create ECR Repository
aws ecr create-repository \
--repository-name service-name \
--region us-east-2 \
--image-scanning-configuration scanOnPush=true
Create Database
For each environment, create a database and user:
# Connect to RDS from a throwaway pod inside the cluster
kubectl run psql-client --rm -it --restart=Never \
  --image=postgres:15-alpine \
  --env="PGPASSWORD=$MASTER_PASSWORD" \
  -- psql -h "$RDS_HOST" -U postgres -d postgres
-- Inside the psql session: create the database and user
CREATE DATABASE service_sandbox;
CREATE USER service_sandbox_user WITH PASSWORD 'secure-password-here';
GRANT ALL PRIVILEGES ON DATABASE service_sandbox TO service_sandbox_user;
ALTER DATABASE service_sandbox OWNER TO service_sandbox_user;
Create Secrets in AWS Secrets Manager
aws secretsmanager create-secret \
--name project-shared/service-sandbox \
--secret-string '{
"SECRET_KEY": "django-secret-key-here",
"DB_NAME": "service_sandbox",
"DB_USER": "service_sandbox_user",
"DB_PASSWORD": "secure-password-here",
"DB_HOST": "rds-host.region.rds.amazonaws.com",
"DB_PORT": "5432"
}' \
--region us-east-2
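The values in the command above are placeholders. As a minimal sketch, the payload could be generated rather than typed by hand, using only the Python standard library. The function name `build_secret_payload` and the chosen token lengths are illustrative assumptions, not part of the original setup.

```python
import json
import secrets


def build_secret_payload(db_name: str, db_user: str, db_host: str, db_port: int = 5432) -> str:
    """Return a JSON string with the same six keys used in the create-secret call."""
    payload = {
        # token_urlsafe(n) draws n random bytes; the lengths here are arbitrary choices.
        "SECRET_KEY": secrets.token_urlsafe(50),
        "DB_NAME": db_name,
        "DB_USER": db_user,
        "DB_PASSWORD": secrets.token_urlsafe(32),
        "DB_HOST": db_host,
        # Stored as a string to match the payload shown above.
        "DB_PORT": str(db_port),
    }
    return json.dumps(payload)


if __name__ == "__main__":
    print(build_secret_payload(
        "service_sandbox",
        "service_sandbox_user",
        "rds-host.region.rds.amazonaws.com",
    ))
```

The printed JSON can be passed to `--secret-string`. If the password is generated this way, the same value has to be used in the `CREATE USER` statement, so in practice the payload would be built before the database step.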
Step 3: Base Kubernetes Manifests
kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: service-name
resources:
  - deployment-web.yaml
  - service-web.yaml
  - configmap-app.yaml
labels:
  - pairs:
      app.kubernetes.io/name: service-name
      app.kubernetes.io/part-of: service-name
      app.kubernetes.io/managed-by: kustomize
    includeSelectors: true
    includeTemplates: true
commonAnnotations:
  argocd.argoproj.io/sync-wave: "0"
deployment-web.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-web
  labels:
    app.kubernetes.io/component: web
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: web
  template:
    metadata:
      labels:
        app.kubernetes.io/component: web
    spec:
      containers:
        - name: web
          image: service-name:placeholder # Kustomize overrides this
          imagePullPolicy: Always
          ports:
            - containerPort: 8000
              name: http
          envFrom:
            - configMapRef:
                name: service-config
            - secretRef:
                name: service-secrets
          resources:
            requests:
              cpu: "100m"
              memory: "256Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          readinessProbe:
            httpGet:
              path: /health/
              port: 8000
            initialDelaySeconds: 10
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health/
              port: 8000
            initialDelaySeconds: 30
            periodSeconds: 30
configmap-app.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: service-config
  labels:
    app.kubernetes.io/component: config
data:
  # Non-sensitive configuration
  REDIS_HOST: "redis-master.cache.svc.cluster.local"
  REDIS_PORT: "6379"
  RABBITMQ_HOST: "rabbitmq.queue.svc.cluster.local"
  RABBITMQ_PORT: "5672"
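Because the Deployment uses `envFrom`, every key in `service-config` and `service-secrets` reaches the container as an environment variable. The sketch below shows one way a Django settings module might consume them. The helper names `database_settings` and `redis_url` are hypothetical, and it assumes the `DB_*` and `REDIS_*` variables are all present in the pod environment.

```python
import os
from typing import Mapping
from urllib.parse import quote


def database_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Build a Django-style DATABASES dict from injected environment variables."""
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            # A KeyError on any of these means the variable was not injected.
            "NAME": env["DB_NAME"],
            "USER": env["DB_USER"],
            "PASSWORD": env["DB_PASSWORD"],
            "HOST": env["DB_HOST"],
            "PORT": env.get("DB_PORT", "5432"),
        }
    }


def redis_url(env: Mapping[str, str] = os.environ) -> str:
    """Compose a redis:// URL from REDIS_HOST, REDIS_PORT and an optional REDIS_PASSWORD."""
    host = env["REDIS_HOST"]
    port = env.get("REDIS_PORT", "6379")
    password = env.get("REDIS_PASSWORD", "")
    # Percent-encode the password so characters like "@" do not break the URL.
    auth = f":{quote(password, safe='')}@" if password else ""
    return f"redis://{auth}{host}:{port}/0"
```

In a settings file this would be used as `DATABASES = database_settings()`. Failing fast with a `KeyError` at startup is a deliberate choice here: a missing variable surfaces as a crashing pod rather than as a runtime connection error.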
Step 4: Environment Overlay
overlays/sandbox/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: service-sandbox
resources:
  - ../../base
  - namespace.yaml
  - external-secret.yaml
  - certificate.yaml
  - reference-grant.yaml
  - httproute.yaml
labels:
  - pairs:
      environment: sandbox
    includeSelectors: true
    includeTemplates: true
images:
  - name: service-name:placeholder
    newName: ACCOUNT_ID.dkr.ecr.us-east-2.amazonaws.com/service-name
    newTag: placeholder # CI/CD updates this line
patches:
  - target:
      kind: Deployment
      name: service-web
    path: patch-replicas-web.yaml
  - target:
      kind: Deployment
      name: service-web
    path: patch-env-web-sandbox.yaml
external-secret.yaml
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: service-secrets
  annotations:
    argocd.argoproj.io/sync-wave: "-5"
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: service-secrets
    creationPolicy: Owner
  data:
    - secretKey: SECRET_KEY
      remoteRef:
        key: project-shared/service-sandbox
        property: SECRET_KEY
    - secretKey: DB_PASSWORD
      remoteRef:
        key: project-shared/service-sandbox
        property: DB_PASSWORD
    # Pull from shared secrets
    - secretKey: REDIS_PASSWORD
      remoteRef:
        key: project-shared/redis
        property: REDIS_PASSWORD
certificate.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: service-sandbox-tls
  annotations:
    argocd.argoproj.io/sync-wave: "-3"
spec:
  secretName: service-sandbox-tls-secret
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - service.domain.com
reference-grant.yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-traefik-tls-access
  annotations:
    argocd.argoproj.io/sync-wave: "-2"
spec:
  from:
    - group: gateway.networking.k8s.io
      kind: Gateway
      namespace: traefik
  to:
    - group: ""
      kind: Secret
      name: service-sandbox-tls-secret
httproute.yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: service-route
  annotations:
    argocd.argoproj.io/sync-wave: "10" # After service exists
spec:
  parentRefs:
    - name: traefik-gateway
      namespace: traefik
      sectionName: websecure-service-sandbox
  hostnames:
    - service.domain.com
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: service-web
          port: 8000
Step 5: Gateway Listener
Add a listener to the Traefik gateway for the new hostname:
# In platform/cert-manager/traefik-gateway-tls.yaml
- name: websecure-service-sandbox
  port: 8443
  protocol: HTTPS
  hostname: service.domain.com
  tls:
    mode: Terminate
    certificateRefs:
      - name: service-sandbox-tls-secret
        namespace: service-sandbox
        kind: Secret
  allowedRoutes:
    namespaces:
      from: All
Step 6: ArgoCD Application
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: service-sandbox
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: dev
  source:
    repoURL: https://github.com/org/service-k8s.git
    targetRevision: main
    path: apps/service-name/overlays/sandbox
  destination:
    server: https://kubernetes.default.svc
    namespace: service-sandbox
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
  ignoreDifferences:
    - group: ""
      kind: Secret
      jsonPointers:
        - /data
Step 7: Placeholder Deployment
Before the real application is ready, deploy a placeholder that validates the infrastructure:
#!/usr/bin/env python3
"""Placeholder app that displays environment configuration."""
import json
import os
from http.server import HTTPServer, BaseHTTPRequestHandler


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Health endpoint used by the readiness and liveness probes
        if '/health' in self.path:
            self.send_response(200)
            self.send_header('Content-type', 'application/json')
            self.end_headers()
            self.wfile.write(json.dumps({'status': 'ok'}).encode())
            return

        # Show environment variables (mask secrets)
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.end_headers()
        env_vars = ['ENVIRONMENT', 'DB_HOST', 'REDIS_HOST']
        secrets = ['SECRET_KEY', 'DB_PASSWORD']
        html = "<h1>Service Placeholder</h1><h2>Environment</h2><ul>"
        for var in env_vars:
            value = os.environ.get(var, 'NOT SET')
            html += f"<li>{var}: {value}</li>"
        html += "</ul><h2>Secrets (Masked)</h2><ul>"
        for var in secrets:
            value = os.environ.get(var, '')
            masked = f"{value[:4]}...{value[-4:]}" if len(value) > 8 else '****'
            html += f"<li>{var}: {masked}</li>"
        html += "</ul>"
        self.wfile.write(html.encode())


if __name__ == '__main__':
    HTTPServer(('0.0.0.0', 8000), HealthHandler).serve_forever()
Build and push:
docker build -t service-placeholder .
docker tag service-placeholder:latest $ECR_URI:placeholder
docker push $ECR_URI:placeholder
Step 8: GitHub Actions Pipeline
name: Build and Deploy
on:
  push:
    branches: [main]
env:
  AWS_REGION: us-east-2
  ECR_REPOSITORY: service-name
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}
      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2
      - name: Build and push image
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG .
          docker push $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG
      - name: Update K8s manifests
        env:
          IMAGE_TAG: ${{ github.sha }}
          GH_TOKEN: ${{ secrets.K8S_REPO_TOKEN }}
        run: |
          git clone https://x-access-token:${GH_TOKEN}@github.com/org/service-k8s.git k8s
          cd k8s
          sed -i "s/newTag: .*/newTag: $IMAGE_TAG/" \
            apps/service-name/overlays/sandbox/kustomization.yaml
          git config user.name "GitHub Actions"
          git config user.email "actions@github.com"
          git add -A
          git commit -m "deploy: sandbox $IMAGE_TAG"
          git push
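The `sed` expression in the last step replaces everything after `newTag:`, including the trailing comment, and `git commit` exits non-zero when the tag is unchanged (for example, on a re-run for the same commit). A hedged alternative for the tag update, written as a small Python helper, is sketched below. The function name `set_new_tag` is hypothetical, and the sketch assumes the overlay contains a single `newTag:` line.

```python
import re

# Matches an indented "newTag:" key and captures its value; any trailing
# comment after the value is left untouched.
NEW_TAG_RE = re.compile(r"^([ \t]*newTag:[ \t]*)(\S+)", re.MULTILINE)


def set_new_tag(text: str, tag: str) -> tuple[str, bool]:
    """Return (updated_text, changed) with the first newTag value replaced by tag."""
    match = NEW_TAG_RE.search(text)
    if match is None:
        raise ValueError("no newTag line found")
    # Splice the new tag over the old value only.
    updated = text[:match.start(2)] + tag + text[match.end(2):]
    return updated, updated != text
```

A pipeline step could read the overlay's `kustomization.yaml`, call `set_new_tag`, write the result back, and skip the commit and push when `changed` is `False`.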
Verification
After deploying, verify each component:
- ArgoCD Application is Synced:
  kubectl get applications -n argocd | grep service
- Pods are Running:
  kubectl get pods -n service-sandbox
- External Secret is Synced:
  kubectl get externalsecret -n service-sandbox
- TLS Certificate is Ready:
  kubectl get certificate -n service-sandbox
- HTTPRoute is Accepted:
  kubectl get httproute -n service-sandbox
- Endpoint is Accessible:
  curl -sk https://service.domain.com/health/
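Right after a sync, the endpoint may take a short while to come up, so a single request can fail even when the setup is correct. The sketch below polls the health URL until it answers. It assumes the endpoint returns `{"status": "ok"}` as the placeholder app in Step 7 does; the function names and retry defaults are illustrative.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_json(url: str, timeout: float = 5.0) -> dict:
    """GET url and decode the JSON body; raises on network or HTTP errors."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode())


def wait_for_health(
    url: str,
    attempts: int = 10,
    delay: float = 3.0,
    fetch: Callable[[str], dict] = fetch_json,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll url until it returns {"status": "ok"} or the attempts run out."""
    for attempt in range(attempts):
        try:
            body = fetch(url)
            if isinstance(body, dict) and body.get("status") == "ok":
                return True
        except (OSError, ValueError):
            # URLError and HTTPError subclass OSError; JSON decode errors subclass ValueError.
            pass
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_for_health("https://service.domain.com/health/")` returns `True` once the route, certificate, and pod are all serving. Unlike `curl -k`, `urllib` verifies the certificate chain by default, so a success here also confirms that the issued certificate is trusted.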
Failure Modes
HTTPRoute Not Accepted
- Check gateway listener exists for hostname
- Verify ReferenceGrant allows cross-namespace TLS access
- Ensure TLS certificate is ready
External Secret Not Syncing
- Verify AWS secret exists with correct key
- Check ClusterSecretStore is configured
- Examine ExternalSecret status for errors:
kubectl describe externalsecret service-secrets -n service-sandbox
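A common cause is a `property` in the ExternalSecret that does not exist as a key in the stored JSON. As a rough sketch, the check can be done offline against the secret string, which can be fetched with `aws secretsmanager get-secret-value --secret-id project-shared/service-sandbox --query SecretString --output text`. The helper name `missing_properties` is hypothetical.

```python
import json


def missing_properties(secret_string: str, required: list[str]) -> list[str]:
    """Return the required property names that are absent from a Secrets Manager JSON string."""
    stored = json.loads(secret_string)
    return [name for name in required if name not in stored]
```

For the sandbox ExternalSecret, `required` would be `["SECRET_KEY", "DB_PASSWORD"]` for `project-shared/service-sandbox` and `["REDIS_PASSWORD"]` for `project-shared/redis`. An empty result means every referenced property is present.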
Image Not Pulling
- Verify ECR repository exists
- Check node IAM role has ECR pull permissions
- Ensure image tag exists in registry
ArgoCD Sync Stuck
- Check sync wave ordering
- Verify all referenced resources exist
- Look for validation errors in ArgoCD UI
Production Rule
Sync wave ordering is the most predictable failure point when first applying this stack. Secrets and certificates must exist before the Deployment is scheduled. The wave order that works: ExternalSecrets at -5, Certificates at -3, ReferenceGrants at -2, Deployments at 0, HTTPRoutes at 10. Set these annotations before applying the ArgoCD Application — not after diagnosing the first sync failure.
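The wave order can be written down and checked before anything is applied. The sketch below encodes the waves listed above and flags any dependency that would be applied out of order. The `MUST_PRECEDE` pairs are drawn from the reasoning in this section and are an assumption about which dependencies matter; ArgoCD also orders by phase, kind, and name within a wave, which this simplification ignores.

```python
# Sync waves as listed above; ArgoCD applies lower waves first.
WAVES = {
    "ExternalSecret": -5,
    "Certificate": -3,
    "ReferenceGrant": -2,
    "Deployment": 0,
    "HTTPRoute": 10,
}

# (earlier, later) pairs: the first kind must be applied before the second.
MUST_PRECEDE = [
    ("ExternalSecret", "Deployment"),
    ("Certificate", "Deployment"),
    ("Deployment", "HTTPRoute"),
]


def apply_order(waves: dict[str, int]) -> list[str]:
    """Return resource kinds sorted by ascending sync wave."""
    return sorted(waves, key=waves.get)


def check_dependencies(waves: dict[str, int], must_precede: list[tuple[str, str]]) -> list[str]:
    """Return a readable message for every pair whose wave order is violated."""
    return [
        f"{first} (wave {waves[first]}) must come before {later} (wave {waves[later]})"
        for first, later in must_precede
        if waves[first] >= waves[later]
    ]
```

`check_dependencies(WAVES, MUST_PRECEDE)` returns an empty list for the values above. Moving the ExternalSecret to a wave at or above the Deployment's would produce one message naming the violated pair.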
The placeholder container serves a second purpose beyond testing connectivity: it confirms that TLS issuance, secret sync, and routing all function independently of the application code. When the real image lands, the only change required is the image tag.