Platform · #aks #kubernetes #pvc #azure-file #persistent-storage

Wiring Azure File Persistent Storage for Notification and Batch Services on AKS Staging

Added PVCs and wired volume mounts for the notification and batch services across two namespaces on AKS staging, replacing stale AWS StorageClass references and correcting two naming and access mode mistakes made in the process.

Gideon Warui

Both the notification and batch services on AKS staging had been running without persistent storage. The volume and volumeMount sections in all four deployments (KE and UG for each service) were commented out. The PVC files in the repo existed on disk but were stale — referencing an AWS gp2 StorageClass and a <client> namespace that no longer exists in this cluster.

The task was to create proper PVCs, wire them into the deployments, and apply everything to the staging cluster without disrupting running pods.


What the repo state looked like before

ke/notification/notification-ke-pvc.yaml had storageClassName: gp2 and storage: 5Gi. ug/notification/notification-ug-pvc.yaml had name: <client>-pvc, namespace: <client>, storageClassName: manual. Both batch service PVC files had storageClassName: manual. None of these were applied to the cluster — kubectl get pvc -n <namespace-ke> and kubectl get pvc -n <namespace-ug> returned nothing for these services.

The deployments had the volume blocks commented out like this:

# volumes:
#   - name: notification-volume
#     persistentVolumeClaim:
#       claimName: notification-pvc
containers:
  - name: notifications-ke
    # volumeMounts:
    #   - mountPath: "/notification/files"
    #     name: notification-volume

Note the mountPath in the comment: /notification/files. The FILES_DIR env var in the deployment was /notifications/files. That one-character mismatch (notification vs. notifications) would have caused a silent misconfiguration, with the app writing its files outside the mounted volume, had I just uncommented as-is.
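The corrected wiring looks like this (a deployment excerpt, assuming the notifier-ke-pvc name settled on later in this post; the volumes and containers keys actually sit side by side under spec.template.spec):

```yaml
volumes:
  - name: notification-volume
    persistentVolumeClaim:
      claimName: notifier-ke-pvc        # final PVC name, not the stale notification-pvc
containers:
  - name: notifications-ke
    volumeMounts:
      - mountPath: "/notifications/files"  # now matches FILES_DIR exactly
        name: notification-volume
```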


Checking the active StorageClass

Before writing anything, I checked what StorageClass was actually in use on this cluster:

kubectl get storageclass

The active one for shared storage is azurefile-csi-wildfly:

NAME                    PROVISIONER          RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
azurefile-csi-wildfly   file.csi.azure.com   Retain          WaitForFirstConsumer   true                   344d

WaitForFirstConsumer is the binding mode — the PVC stays Pending until a pod mounts it. This is expected. It ensures the Azure File share is provisioned in the same availability zone as the scheduled pod.


Verifying the cluster context

The default context was pointing at an AWS EKS cluster (<client>-shared-eks). Switched it before touching anything:

kubectl config get-contexts
kubectl config use-context <staging-cluster-context>

Writing the PVC manifests

I settled on the naming convention <service>-<country>-pvc for clear separation across namespaces. The four PVCs:

  • notifier-ke-pvc in <namespace-ke>
  • notifier-ug-pvc in <namespace-ug>
  • <svc>-ke-pvc in <namespace-ke>
  • <svc>-ug-pvc in <namespace-ug>

Each PVC followed the same structure:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: notifier-ke-pvc
  namespace: <namespace-ke>
  labels:
    app: notifications-ke
    environment: uat
    team: systechltd
    project: <namespace-ke>
    tier: web
    type: service
    country: KE
spec:
  storageClassName: azurefile-csi-wildfly
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Gi

The batch service has two mount paths on the same volume:

volumeMounts:
  - mountPath: "/opt/jboss/<core-system>-files"
    name: <svc>-volume
  - mountPath: "/opt/jboss/batch-uploads"
    name: <svc>-volume

Both mount points reference the same <svc>-volume, which maps to a single PVC. One volume, two paths.
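The matching volumes block is a single entry (a sketch; the claim name follows the <service>-<country>-pvc convention above, shown here for the KE variant):

```yaml
volumes:
  - name: <svc>-volume
    persistentVolumeClaim:
      claimName: <svc>-ke-pvc   # UG deployment points at <svc>-ug-pvc instead
```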


Mistake 1: Started with ReadWriteOnce

The initial PVC template I was working from had ReadWriteOnce. I applied the notification KE PVC and deployment with RWO, and it worked: the pod came up and the PVC bound. But azurefile-csi-wildfly supports ReadWriteMany, and these services may run more than one replica or need shared file access. I caught it after applying the batch service KE deployment, when the bound PVC showed RWO in kubectl get pvc.

Swapping a PVC's access mode requires deleting it and recreating it: spec.accessModes is immutable once the PVC exists, so you can't patch it in place. That meant scaling down each deployment, deleting the old PVC, creating the new one, and reapplying the deployment. Since all four PVCs had just been created and no files had been written to them yet, there was no data to migrate.

The swap sequence for each service:

kubectl scale deployment notifications-ke -n <namespace-ke> --replicas=0
kubectl delete pvc notifier-ke-pvc -n <namespace-ke>
kubectl apply -f ke/notification/notification-ke-pvc.yaml
kubectl apply -f ke/notification/notification-ke-deployment.yaml

Mistake 2: Generic naming on the first pass

I initially named the notification PVCs notifier-pvc in both namespaces. Since each lives in a separate namespace (<namespace-ke>, <namespace-ug>), there’s no actual collision — but it breaks the convention used everywhere else in this cluster and makes cross-namespace audit harder. Caught it before shipping to prod.

Renamed to notifier-ke-pvc and notifier-ug-pvc. Same delete-and-recreate process as above.


Apply sequence

Order matters. If the deployment is applied before the PVC exists, the pod goes Pending with:

0/1 nodes are available: persistentvolumeclaim "notifier-ke-pvc" not found

The sequence for each service:

  1. kubectl apply -f <pvc>.yaml — creates the PVC (status: Pending, WaitForFirstConsumer)
  2. kubectl apply -f <deployment>.yaml — pod gets scheduled, PVC binds, volume mounts
  3. kubectl rollout status deployment/<name> -n <namespace> — confirm rollout
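The three-step order can be captured in a small helper, a sketch assuming the file layout used in this repo. KUBECTL is overridable so the sequence can be dry-run by substituting echo:

```shell
#!/bin/sh
# Helper encoding the apply order: PVC first, then deployment, then rollout check.
# KUBECTL defaults to the real binary; override with KUBECTL=echo to dry-run.
KUBECTL="${KUBECTL:-kubectl}"

apply_service() {
  pvc="$1"; deployment="$2"; name="$3"; ns="$4"
  $KUBECTL apply -f "$pvc"                              # 1. PVC created, Pending (WaitForFirstConsumer)
  $KUBECTL apply -f "$deployment"                       # 2. pod schedules, PVC binds, volume mounts
  $KUBECTL rollout status "deployment/$name" -n "$ns"   # 3. confirm rollout before moving on
}

# Dry-run of the notification KE sequence -- prints the three kubectl
# invocations in order instead of running them:
KUBECTL=echo apply_service \
  ke/notification/notification-ke-pvc.yaml \
  ke/notification/notification-ke-deployment.yaml \
  notifications-ke "<namespace-ke>"
```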

Full command log

# context
kubectl config use-context <staging-cluster-context>

# notification KE
kubectl apply -f ke/notification/notification-ke-pvc.yaml
kubectl get pvc notifier-ke-pvc -n <namespace-ke>
kubectl describe pvc notifier-ke-pvc -n <namespace-ke>
kubectl apply -f ke/notification/notification-ke-deployment.yaml
kubectl rollout status deployment/notifications-ke -n <namespace-ke>
kubectl get pvc notifier-ke-pvc -n <namespace-ke>
kubectl get pods -n <namespace-ke> -l app=notifications-ke

# notification UG
kubectl apply -f ug/notification/notification-ug-pvc.yaml
kubectl apply -f ug/notification/notification-ug-deployment.yaml
kubectl rollout status deployment/notification-ug -n <namespace-ug>

# RWO -> RWX swap (notification KE)
kubectl scale deployment notifications-ke -n <namespace-ke> --replicas=0
kubectl delete pvc notifier-ke-pvc -n <namespace-ke>
kubectl apply -f ke/notification/notification-ke-pvc.yaml
kubectl apply -f ke/notification/notification-ke-deployment.yaml
kubectl rollout status deployment/notifications-ke -n <namespace-ke>

# RWO -> RWX swap (notification UG)
kubectl scale deployment notification-ug -n <namespace-ug> --replicas=0
kubectl delete pvc notifier-ug-pvc -n <namespace-ug>
kubectl apply -f ug/notification/notification-ug-pvc.yaml
kubectl apply -f ug/notification/notification-ug-deployment.yaml
kubectl rollout status deployment/notification-ug -n <namespace-ug>

# rename swap (notifier-pvc -> notifier-ke-pvc / notifier-ug-pvc)
kubectl scale deployment notifications-ke -n <namespace-ke> --replicas=0
kubectl delete pvc notifier-ke-pvc -n <namespace-ke>
kubectl apply -f ke/notification/notification-ke-pvc.yaml
kubectl apply -f ke/notification/notification-ke-deployment.yaml
kubectl rollout status deployment/notifications-ke -n <namespace-ke>

kubectl scale deployment notification-ug -n <namespace-ug> --replicas=0
kubectl delete pvc notifier-ug-pvc -n <namespace-ug>
kubectl apply -f ug/notification/notification-ug-pvc.yaml
kubectl apply -f ug/notification/notification-ug-deployment.yaml
kubectl rollout status deployment/notification-ug -n <namespace-ug>

# batch service KE
kubectl apply -f ke/<svc>/<svc>-pvc.yaml
kubectl apply -f ke/<svc>/<svc>-ke-deployment.yaml
kubectl rollout status deployment/<svc>-ke -n <namespace-ke>

# RWO -> RWX swap (batch service KE)
kubectl scale deployment <svc>-ke -n <namespace-ke> --replicas=0
kubectl delete pvc <svc>-ke-pvc -n <namespace-ke>
kubectl apply -f ke/<svc>/<svc>-pvc.yaml
kubectl apply -f ke/<svc>/<svc>-ke-deployment.yaml
kubectl rollout status deployment/<svc>-ke -n <namespace-ke>

# batch service UG (applied with RWX from the start)
kubectl apply -f ug/<svc>/<svc>-ug-pvc.yaml
kubectl apply -f ug/<svc>/<svc>-ug-deployment.yaml
kubectl rollout status deployment/<svc>-ug -n <namespace-ug>

Final state

kubectl get pvc -n <namespace-ke> | grep -E "notifier|<svc>"
# notifier-ke-pvc   Bound   pvc-7137fbe8-...   2Gi   RWX   azurefile-csi-wildfly   76s
# <svc>-ke-pvc      Bound   pvc-40de71dd-...   2Gi   RWX   azurefile-csi-wildfly   21s

kubectl get pvc -n <namespace-ug> | grep -E "notifier|<svc>"
# notifier-ug-pvc   Bound   pvc-8ea0beda-...   2Gi   RWX   azurefile-csi-wildfly   64s
# <svc>-ug-pvc      Bound   pvc-85c0b23d-...   2Gi   RWX   azurefile-csi-wildfly   2m29s

All four pods running, images unchanged from before the volume wiring.

Committed as 168f542 on uat, pushed to origin/uat.


What’s next

Same setup needs to go to prod once staging is confirmed stable. Contexts: <prod-ke-cluster-context> and <prod-ug-cluster-context>.

Same PVC names, same StorageClass, same mount paths, same apply order.
