Debug #kubernetes#eks#aws#ebs#persistent-volumes#storage#gitops#odoo

Diagnosing 15 Hours of ContainerCreating: `replicas: 2` Against a ReadWriteOnce EBS Volume on EKS

Traced a 15-hour silent ContainerCreating stall to a Deployment running two replicas against a single ReadWriteOnce EBS PVC, where AWS rejected the second EC2 volume attachment with no events and no logs.

Gideon Warui

A pod running a stateful application had been stuck in ContainerCreating for over 15 hours. The pod showed no error logs, no events, and the kubectl describe output seemed normal — a PVC was bound, the node was schedulable, and the container image was present. Yet the pod would not start.

The root cause was a fundamental incompatibility between the deployment’s replica count and its storage access mode: a replicas: 2 deployment sharing a single ReadWriteOnce (RWO) EBS volume across two nodes on AWS EKS. This combination can only work when both pods happen to land on the same node, which nothing in the configuration guaranteed.


Environment

| Component | Detail |
|---|---|
| Kubernetes | v1.34 (EKS) |
| Application | Self-hosted ERP (Odoo 18) |
| Storage | AWS EBS gp2, ReadWriteOnce access mode |
| Volume size | 10Gi (filestore) |
| Deployment config | replicas: 2 managed via Kustomize/GitOps |

Step 1 — Observing the Symptom

During a cluster health check, one of the two replicas of a stateful deployment was found stuck:

kubectl get pods -n erp-production -o wide
NAME                READY   STATUS              RESTARTS   AGE   IP          NODE
app-7c8d6f-abc1     1/1     Running             0          15h   10.0.0.66   node-a
app-7c8d6f-xyz2     0/1     ContainerCreating   0          15h   <none>      node-b

Two observations stood out immediately:

  1. The stuck pod had been in ContainerCreating for 15 hours, the same age as its healthy sibling, so this was not a transient scheduling issue
  2. The stuck pod had no IP address, meaning it had never reached the network configuration stage
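The two observations above can be turned into a quick filter. The following is a minimal sketch, assuming the input is a dictionary shaped like the output of `kubectl get pods -o json`; the function name and the hand-written sample data are illustrative and not part of the original investigation.

```python
def find_stuck_pods(pod_list: dict) -> list[str]:
    """Return names of pods waiting in ContainerCreating that have no pod IP."""
    stuck = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        # Collect the waiting reason of every container (None if not waiting).
        waiting_reasons = {
            cs.get("state", {}).get("waiting", {}).get("reason")
            for cs in status.get("containerStatuses", [])
        }
        if "ContainerCreating" in waiting_reasons and not status.get("podIP"):
            stuck.append(pod["metadata"]["name"])
    return stuck


# Hand-written sample mirroring the two pods in the listing above (assumed shape).
SAMPLE_POD_LIST = {
    "items": [
        {
            "metadata": {"name": "app-7c8d6f-abc1"},
            "status": {
                "phase": "Running",
                "podIP": "10.0.0.66",
                "containerStatuses": [{"state": {"running": {}}}],
            },
        },
        {
            "metadata": {"name": "app-7c8d6f-xyz2"},
            "status": {
                "phase": "Pending",
                "containerStatuses": [
                    {"state": {"waiting": {"reason": "ContainerCreating"}}}
                ],
            },
        },
    ]
}

print(find_stuck_pods(SAMPLE_POD_LIST))
```

A pod that is waiting but already has an IP is past sandbox setup, so the filter deliberately ignores it.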

Step 2 — Investigating via Pod Description

kubectl describe pod app-7c8d6f-xyz2 -n erp-production

The events section was empty:

Events: <none>

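This is unusual. Under normal circumstances, ContainerCreating pods produce events like Pulling image, Pulled, Created. One caveat: the API server retains events for only one hour by default, so after 15 hours any early events could already have expired. Even so, the complete absence of recent events indicated the pod was blocked before the kubelet even began container setup.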

The volume section of the describe output pointed to the cause:

Volumes:
  filestore:
    Type:       PersistentVolumeClaim
    ClaimName:  app-filestore
    ReadOnly:   false

The pod was waiting to mount a PVC. The PVC was checked:

kubectl get pvc app-filestore -n erp-production
NAME            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS
app-filestore   Bound    pvc-4d830c9b-7014-4ac1-befa-1863efe59a81   10Gi       RWO            gp2

STATUS: Bound — the PVC was bound to a PV. ACCESS MODES: RWO — the critical detail.


Step 3 — Understanding ReadWriteOnce

Kubernetes PersistentVolume access modes describe how a volume can be mounted across the cluster:

| Access Mode | Abbreviation | Meaning |
|---|---|---|
| ReadWriteOnce | RWO | Mounted read-write by a single node |
| ReadOnlyMany | ROX | Mounted read-only by many nodes |
| ReadWriteMany | RWX | Mounted read-write by many nodes |
| ReadWriteOncePod | RWOP | Mounted read-write by a single pod (Kubernetes 1.22+) |

The key word in ReadWriteOnce is node, not pod. Multiple pods on the same node can mount an RWO volume simultaneously. But the volume can only be attached to one node at a time.

AWS EBS volumes enforce this at the infrastructure level. An EBS volume is attached to an EC2 instance as a block device (/dev/nvme1n1, etc.). The operating system mounts the filesystem from that block device. When a second EC2 instance tries to attach the same EBS volume, AWS rejects the attachment — the volume is already in use by another instance.
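The node-versus-pod distinction is easy to misread, so here is a toy model of the access-mode rules from the table. It is a simplified sketch: the `Volume` class is hypothetical, and in a real cluster the checks are split between the scheduler, the attach/detach machinery, the CSI driver, and AWS itself.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    access_mode: str                          # "RWO", "ROX", "RWX", or "RWOP"
    mounts: set = field(default_factory=set)  # set of (node, pod) pairs

    def can_mount(self, node: str, pod: str) -> bool:
        nodes_in_use = {n for n, _ in self.mounts}
        if self.access_mode == "RWOP":
            # Single pod in the whole cluster.
            return not self.mounts
        if self.access_mode == "RWO":
            # Any number of pods, but all on one node.
            return not nodes_in_use or nodes_in_use == {node}
        # ROX and RWX allow many nodes (read-only vs read-write is not modelled).
        return True

    def mount(self, node: str, pod: str) -> bool:
        if self.can_mount(node, pod):
            self.mounts.add((node, pod))
            return True
        return False


filestore = Volume("RWO")
print(filestore.mount("node-a", "pod-1"))  # first pod attaches the volume to node-a
print(filestore.mount("node-a", "pod-3"))  # same node: allowed
print(filestore.mount("node-b", "pod-2"))  # different node: refused
```

Switching the mode to `"RWOP"` makes the second call fail as well, which is the stricter behaviour ReadWriteOncePod is meant to provide.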


Step 4 — Tracing the Scheduling Decision

The running pod (app-7c8d6f-abc1) was on node-a. The stuck pod (app-7c8d6f-xyz2) had been scheduled to node-b. Why did the scheduler pick a different node?

The PV’s node affinity was checked:

kubectl get pv pvc-4d830c9b-7014-4ac1-befa-1863efe59a81 \
  -o jsonpath='{.spec.nodeAffinity}'
{
  "required": {
    "nodeSelectorTerms": [{
      "matchExpressions": [
        {
          "key": "topology.kubernetes.io/zone",
          "operator": "In",
          "values": ["us-east-2a"]
        },
        {
          "key": "topology.kubernetes.io/region",
          "operator": "In",
          "values": ["us-east-2"]
        }
      ]
    }]
  }
}

The PV’s node affinity constrained scheduling to availability zone us-east-2a — not to a specific node. Within us-east-2a, both node-a and node-b were valid scheduling targets:

kubectl get nodes -o custom-columns="NAME:.metadata.name,ZONE:.metadata.labels.topology\.kubernetes\.io/zone"
NAME      ZONE
node-a    us-east-2a
node-b    us-east-2a
node-c    us-east-2a
node-d    us-east-2b
node-e    us-east-2b

The Kubernetes scheduler correctly placed both pods in us-east-2a, satisfying the zone constraint, but nothing required both pods to land on the same node. The scheduler does not track which node an EBS volume is currently attached to. The RWO access mode only stops pods on different nodes from mounting the volume at the same time; it does not steer new pods toward the node that already holds it.
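To see why three nodes qualify, the PV's node affinity can be evaluated by hand. This is a minimal sketch that models only the `In` operator, with terms ORed and expressions within a term ANDed. The zone labels come from the node listing above; the region label on each node is an assumption, since the listing does not show it.

```python
def node_matches(node_labels: dict, node_affinity: dict) -> bool:
    """True if the node satisfies at least one nodeSelectorTerm (only "In" is modelled)."""
    for term in node_affinity["required"]["nodeSelectorTerms"]:
        expressions = term.get("matchExpressions", [])
        # An empty term matches nothing; otherwise every expression must hold.
        if expressions and all(
            e["operator"] == "In" and node_labels.get(e["key"]) in e["values"]
            for e in expressions
        ):
            return True
    return False


PV_AFFINITY = {
    "required": {
        "nodeSelectorTerms": [{
            "matchExpressions": [
                {"key": "topology.kubernetes.io/zone", "operator": "In",
                 "values": ["us-east-2a"]},
                {"key": "topology.kubernetes.io/region", "operator": "In",
                 "values": ["us-east-2"]},
            ]
        }]
    }
}

# Zone labels from the node listing; region labels are assumed.
NODES = {
    name: {"topology.kubernetes.io/zone": zone,
           "topology.kubernetes.io/region": "us-east-2"}
    for name, zone in [("node-a", "us-east-2a"), ("node-b", "us-east-2a"),
                       ("node-c", "us-east-2a"), ("node-d", "us-east-2b"),
                       ("node-e", "us-east-2b")]
}

eligible = [name for name, labels in NODES.items() if node_matches(labels, PV_AFFINITY)]
print(eligible)
```

Only one of the eligible nodes can actually hold the volume at any moment, and the affinity rule carries no information about which one.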

What actually happens:

  1. Pod 1 starts on node-a, EBS volume attaches to node-a
  2. Pod 2 is scheduled to node-b (different valid node in the same zone)
  3. Pod 2 attempts to mount the volume on node-b
  4. AWS rejects the attachment — volume is already attached to node-a
  5. The kubelet on node-b gets stuck waiting to mount the volume — no events, ContainerCreating indefinitely

Step 5 — Why Deleting the Stuck Pod Does Not Help

An intuitive first response is to delete the stuck pod and hope it reschedules to node-a:

kubectl delete pod app-7c8d6f-xyz2 -n erp-production

The deployment controller immediately creates a replacement pod. The scheduler picks a valid node in us-east-2a. If it happens to pick node-a, the pod will start (both pods on the same node can share the EBS volume). But the scheduler has no preference for node-a — it may equally pick node-b or node-c.

In practice, this creates a coin-flip situation that resolves randomly and temporarily. The next node rotation, eviction, or rescheduling event may land a pod back on the wrong node.

This is not a fix. It is a temporary workaround that masks the underlying configuration error.
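A rough calculation shows how poor the odds are. The sketch below assumes the scheduler picks uniformly at random among the three eligible nodes, which is a deliberate simplification: the real scheduler scores nodes, and its default preference for spreading replicas may make co-location even less likely than this suggests.

```python
from fractions import Fraction


def chance_all_colocated(eligible_nodes: int, placements: int) -> Fraction:
    """Probability that every one of `placements` uniform picks hits the node holding the volume."""
    return Fraction(1, eligible_nodes) ** placements


print(chance_all_colocated(3, 1))  # a single reschedule: 1/3
print(chance_all_colocated(3, 3))  # three reschedules in a row: 1/27
```

Every later eviction or node rotation is another draw, so the chance of staying healthy shrinks with each one.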


Step 6 — The Root Cause: A Configuration Error

The deployment was configured with replicas: 2:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  namespace: erp-production
spec:
  replicas: 2          # <-- this is the problem
  selector:
    matchLabels:
      app: erp
  template:
    spec:
      volumes:
      - name: filestore
        persistentVolumeClaim:
          claimName: app-filestore   # RWO EBS volume
      containers:
      - name: app
        image: app:18.0
        volumeMounts:
        - name: filestore
          mountPath: /var/lib/app

Running 2 replicas of an application that mounts a single RWO EBS volume is fundamentally incompatible. The second replica cannot reliably attach the volume.
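This class of error can be caught before it reaches the cluster. The following is a minimal lint sketch, assuming the Deployment manifest has already been parsed into a dictionary and that a hypothetical `pvc_access_modes` mapping (claim name to access modes) is available from the PVC manifests; it is not part of any existing tool.

```python
SINGLE_NODE_MODES = {"ReadWriteOnce", "ReadWriteOncePod"}


def lint_deployment(deployment: dict, pvc_access_modes: dict) -> list[str]:
    """Flag Deployments that run several replicas against a single-node PVC."""
    problems = []
    replicas = deployment["spec"].get("replicas", 1)
    volumes = deployment["spec"]["template"]["spec"].get("volumes", [])
    for volume in volumes:
        claim = volume.get("persistentVolumeClaim", {}).get("claimName")
        modes = pvc_access_modes.get(claim, [])
        # Unknown claims (no modes) are skipped rather than guessed at.
        if claim and replicas > 1 and modes and set(modes) <= SINGLE_NODE_MODES:
            problems.append(
                f"replicas={replicas} with single-node PVC '{claim}' ({', '.join(modes)})"
            )
    return problems


# The Deployment above, reduced to the fields the check reads.
DEPLOYMENT = {
    "spec": {
        "replicas": 2,
        "template": {"spec": {"volumes": [
            {"name": "filestore",
             "persistentVolumeClaim": {"claimName": "app-filestore"}}]}},
    }
}

print(lint_deployment(DEPLOYMENT, {"app-filestore": ["ReadWriteOnce"]}))
```

Run as a pre-merge check in the GitOps repository, a rule like this would have rejected the manifest before the second replica was ever scheduled.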


Step 7 — Resolution Options

Option A — Reduce to replicas: 1 (correct for stateful single-filestore apps)

The appropriate fix for applications with a single shared filestore that cannot safely handle concurrent writes:

spec:
  replicas: 1    # was 2

This is the correct choice for applications like Odoo, where the filestore contains user-uploaded files and session data that are not designed for concurrent multi-process access. Running a second replica with the same filestore would not provide true HA — it would create data consistency risks even if the volume mounting problem were solved.

# Quick operational fix (must also be updated in the GitOps source)
kubectl scale deployment app -n erp-production --replicas=1

Option B — Migrate to ReadWriteMany storage (EFS)

If horizontal scaling is genuinely required, the storage backend must support ReadWriteMany access mode, allowing multiple pods on multiple nodes to mount the volume simultaneously.

On AWS EKS, this means replacing EBS with EFS (Elastic File System) using the EFS CSI driver:

# StorageClass for EFS
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-<efs-id>
  directoryPerms: "700"
---
# PVC using RWX
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-filestore
spec:
  accessModes:
  - ReadWriteMany    # RWX — supported by EFS
  storageClassName: efs-sc
  resources:
    requests:
      storage: 10Gi

This migration requires data to be transferred from the existing EBS volume to the new EFS filesystem, and the application must be designed to handle concurrent filesystem access safely.

Option C — Use ReadWriteOncePod to surface the error faster (Kubernetes 1.22+)

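ReadWriteOncePod restricts the volume to a single pod (not just a single node). If a second pod references the same PVC, the scheduler refuses to place it and the pod stays Pending with a FailedScheduling event, rather than hanging in ContainerCreating. The mode is only available for CSI volumes, and a PVC's access modes cannot be changed after creation, so adopting it means recreating the claim. This does not fix the architectural problem but makes it more visible: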

spec:
  accessModes:
  - ReadWriteOncePod

Step 8 — GitOps Considerations

The deployment was managed via a GitOps workflow (Kustomize overlays stored in a Git repository). The replicas: 2 value was in the source manifest.

Any kubectl scale command applied directly to the cluster is a temporary measure. When the GitOps reconciler next applies the source manifest, it will revert the deployment back to replicas: 2 — recreating the problem.

The permanent fix must be made in the source manifest:

# erp/production/deployment.yaml
spec:
  replicas: 1    # Updated: RWO EBS filestore does not support multiple replicas

Until the source is updated and applied, the cluster state and the GitOps source are out of sync. Tracking this discrepancy is important — if another engineer applies the manifest from source believing they are “restoring the intended state”, they will reintroduce the stuck pod.
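The discrepancy described above is simple to detect mechanically. Here is a minimal sketch, assuming both the source manifest and the live object (for example from `kubectl get deployment -o json`) are available as dictionaries; the function name is illustrative.

```python
from typing import Optional


def replica_drift(source_manifest: dict, live_object: dict) -> Optional[str]:
    """Return a description of replica drift, or None when source and cluster agree."""
    # Kubernetes defaults a Deployment to one replica when the field is absent.
    desired = source_manifest["spec"].get("replicas", 1)
    live = live_object["spec"].get("replicas", 1)
    if desired == live:
        return None
    return f"drift: source has replicas={desired}, cluster has replicas={live}"


# State after the quick `kubectl scale` fix but before the repository is updated.
print(replica_drift({"spec": {"replicas": 2}}, {"spec": {"replicas": 1}}))
```

A non-empty result here is the window in which a well-meaning re-apply from source brings the stuck pod back.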


The StatefulSet Alternative

For genuinely stateful workloads that manage their own high-availability (clustered databases, message queues, etc.), a StatefulSet is more appropriate than a Deployment. StatefulSets provide:

  • Stable pod identity (pod-0, pod-1, etc.)
  • Per-pod PersistentVolumeClaims via volumeClaimTemplates
  • Ordered startup and shutdown

Each StatefulSet replica gets its own PVC, eliminating the shared-volume problem:

apiVersion: apps/v1
kind: StatefulSet
# metadata, serviceName, selector, and template omitted for brevity
spec:
  replicas: 2
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi

With this model, pod-0 gets data-pod-0 and pod-1 gets data-pod-1 — two separate EBS volumes, no sharing conflict.
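The claim names follow a fixed pattern: claim template name, then StatefulSet name, then the pod ordinal. A tiny sketch of that pattern follows, assuming a StatefulSet named `pod` as the pod names in the text imply; the helper itself is hypothetical.

```python
def statefulset_pvc_names(statefulset_name: str, claim_template: str, replicas: int) -> list[str]:
    """List the PVC names a StatefulSet creates, one per replica ordinal."""
    return [f"{claim_template}-{statefulset_name}-{i}" for i in range(replicas)]


print(statefulset_pvc_names("pod", "data", 2))  # ['data-pod-0', 'data-pod-1']
```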

However, a StatefulSet alone does not solve the data consistency problem for applications that require a shared filestore. If the application itself is not designed for multi-master operation, separate volumes per pod simply means separate (inconsistent) data per pod.


Commands Reference

# Check pod scheduling and node placement
kubectl get pods -n <namespace> -o wide

# Inspect PVC access mode
kubectl get pvc <name> -n <namespace>

# Check PV node affinity (zone constraint)
kubectl get pv <pv-name> -o jsonpath='{.spec.nodeAffinity}'

# Check which node labels exist per node
kubectl get nodes -o custom-columns="NAME:.metadata.name,ZONE:.metadata.labels.topology\.kubernetes\.io/zone"

# Describe pod to check volume mount state and events
kubectl describe pod <name> -n <namespace>

# Scale deployment (temporary — update GitOps source for permanence)
kubectl scale deployment <name> -n <namespace> --replicas=1

# Check current replica count
kubectl get deployment <name> -n <namespace> \
  -o jsonpath='replicas={.spec.replicas}'

Production Rules

  1. ReadWriteOnce means one node, not one pod. Multiple pods on the same node can share an RWO volume, but the volume cannot be attached to more than one node simultaneously.

  2. EBS enforces RWO at the infrastructure level. AWS will reject a second EC2 attachment while the volume is already attached. The kubelet gets stuck waiting, producing no events and no logs.

  3. The Kubernetes scheduler does not co-locate pods to satisfy RWO constraints. It only ensures pods are in the correct availability zone. Node co-location must be enforced via pod affinity rules if needed.

  4. replicas: 2 + single RWO PVC = broken configuration. Choose: replicas: 1, RWX storage, or per-pod volumes via StatefulSet.

  5. GitOps source must match cluster state. A direct kubectl scale is temporary. The permanent fix lives in the repository.
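Rule 3 mentions pod affinity without showing it. The sketch below builds the affinity stanza as a dictionary and prints it as JSON, nested under the Deployment fields where it would sit. It assumes the pod template carries the `app: erp` label used by the Deployment's selector. Co-location keeps both replicas on one node, so it removes the attachment conflict but adds no node-level redundancy.

```python
import json


def colocate_with(labels: dict) -> dict:
    """Build a required pod-affinity rule: schedule onto a node already running matching pods."""
    return {
        "podAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {"matchLabels": labels},
                    # Hostname topology means "same node", not merely "same zone".
                    "topologyKey": "kubernetes.io/hostname",
                }
            ]
        }
    }


affinity = colocate_with({"app": "erp"})  # label assumed to be on the pod template
print(json.dumps({"spec": {"template": {"spec": {"affinity": affinity}}}}, indent=2))
```

Even with this rule in place, the concurrent-write concerns from Option A still apply, since both pods would write to the same filestore.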
