Retiring the Latest Tag: Environment-Specific Image Tagging for Kubernetes

For the first several months of the deployment, both UAT and production ran images tagged latest. Rollbacks meant finding the previous image digest in ACR. Knowing what was in production meant inspecting the running pods rather than reading the deployment manifest. After the March incident, this had to change.

The problem with latest

latest is a mutable pointer. Pushing a new image with the latest tag silently moves what that tag references. Two namespaces both claiming to run <core-system>-backend:latest might be running different underlying images depending on when they last pulled.

In this setup, it created specific problems:

After the March 13 manual build, latest pointed to the patched image. But the manifest files still said latest, so there was no way to tell from the repo what was actually running.
A kubectl rollout undo would restore a pod template that still said latest, so the rollback target was whatever the tag resolved to at pull time, not a specific known-good version.
The March incident log needed to record exactly what was compromised. “latest” is not an answer.
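
One way to see what a mutable tag actually resolved to is to read the image digest from pod status instead of the tag from the pod spec. The following is a minimal sketch, assuming kubectl is already authenticated against the cluster; the function name running_digests is illustrative and not part of the original setup.

```shell
# Print each pod in a namespace with the image digest(s) its containers
# actually resolved to. Unlike .spec.containers[].image, the imageID field
# is reported in pod status after the image is pulled, so it can tell apart
# two pods that both claim to run ":latest".
# $1 = namespace
running_digests() {
  kubectl get pods -n "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].imageID}{"\n"}{end}'
}
```

Calling running_digests with each namespace in turn and comparing the digests would show whether two environments are really on the same build.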

The new convention

Tags follow YYYYMMDD-HHMMSS-{env}:

<acr-registry>.azurecr.io/<core-system>-backend:20260317-143958-uat
<acr-registry>.azurecr.io/<core-system>-backend:20260317-143958-prod
<acr-registry>.azurecr.io/<core-system>-frontend:20260316-071706

The timestamp component is sortable and human-readable. The environment suffix makes it unambiguous which namespace a given image belongs in. The frontend shares the same build across environments (it has no environment-specific code), so its tag has no suffix.
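
The convention is simple enough to check mechanically. Below is a small sketch of two helpers, one that assembles a tag and one that rejects anything that does not match the pattern; both function names are hypothetical, and the accepted suffixes (uat, prod) are taken from the examples above.

```shell
# Join a timestamp and an optional environment into a tag.
#   make_tag 20260317-143958 uat  -> 20260317-143958-uat
#   make_tag 20260316-071706      -> 20260316-071706  (unsuffixed, as for the frontend)
make_tag() {
  if [ -n "${2:-}" ]; then
    printf '%s-%s\n' "$1" "$2"
  else
    printf '%s\n' "$1"
  fi
}

# Exit status 0 only for tags of the form YYYYMMDD-HHMMSS with an optional
# -uat or -prod suffix. Mutable names such as "latest" are rejected.
is_valid_tag() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{8}-[0-9]{6}(-(uat|prod))?$'
}
```

A guard like is_valid_tag could run before a manifest change is accepted, so that a mutable tag cannot slip back into a deployment file.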

Mutable per-environment aliases (uat, prod) are pushed alongside the timestamped tag:

<acr-registry>.azurecr.io/<core-system>-backend:uat    → current UAT build
<acr-registry>.azurecr.io/<core-system>-backend:prod   → current prod build

These aliases are how a human quickly finds “what’s the current build” without scrolling through ACR history. The timestamped tags are what the Kubernetes deployment manifests reference.
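
Because the timestamp is fixed-width and ordered from year down to second, a plain lexical sort also finds the current build without relying on the alias. A minimal sketch, with a hypothetical function name, that reads tag names on stdin:

```shell
# Read tag names on stdin (one per line) and print the newest timestamped
# tag for the given environment. Aliases and other tags are ignored.
# $1 = environment suffix (uat or prod)
newest_tag_for_env() {
  # "|| true" keeps the pipeline from failing when nothing matches.
  { grep -E "^[0-9]{8}-[0-9]{6}-$1\$" || true; } | sort | tail -n 1
}
```

The tag list could come from something like az acr repository show-tags --name <acr-registry> --repository <core-system>-backend --output tsv, piped into newest_tag_for_env prod.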

Pipeline implementation

The Azure DevOps pipeline determines the environment suffix from the source branch:

variables:
  ${{ if eq(variables['Build.SourceBranchName'], '<core-system>-backend-prod') }}:
    envSuffix: 'prod'
    deployNamespace: '<namespace>'
    envAlias: 'prod'
  ${{ else }}:
    envSuffix: 'uat'
    deployNamespace: '<namespace-dev>'
    envAlias: 'uat'

  buildDate: $[format('{0:yyyyMMdd}-{1:HHmmss}', pipeline.startTime, pipeline.startTime)]
  imageTag: $(buildDate)-$(envSuffix)
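
To make the fall-through behaviour of that conditional explicit, here is the same mapping as a shell sketch. PROD_BRANCH and env_for_branch are illustrative names, and the default branch name is a stand-in for the real one.

```shell
# Stand-in for the real prod branch name; override via the environment.
PROD_BRANCH="${PROD_BRANCH:-app-backend-prod}"

# Mirror of the pipeline conditional: only the prod branch maps to "prod".
# Every other branch name, including feature branches, falls through to "uat".
# $1 = short branch name
env_for_branch() {
  if [ "$1" = "$PROD_BRANCH" ]; then
    echo prod
  else
    echo uat
  fi
}
```

One thing worth keeping in mind: Build.SourceBranchName is only the last path segment of the ref, so a branch whose final segment happens to equal the prod branch name would match as well.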

Build steps tag and push:

- task: Docker@2
  displayName: Build and Push
  inputs:
    command: buildAndPush
    tags: |
      $(imageTag)
      $(envAlias)

Deploy steps reference the computed tag:

- script: |
    kubectl set image deployment/<core-system>-backend \
      <core-system>-backend=$(ACR)/<core-system>-backend:$(imageTag) \
      -n $(deployNamespace)
  displayName: Deploy to $(deployNamespace)
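
kubectl set image returns as soon as the Deployment object is updated, not when the new pods are ready. A possible extension, sketched here as an assumption rather than as what the pipeline actually does, is to wait on the rollout so a failed deploy fails the step. The function name and the 120s timeout are illustrative.

```shell
# Point one container of a Deployment at a new image, then block until the
# rollout completes or times out. A non-zero exit status fails the step.
# $1 = deployment  $2 = container  $3 = full image reference  $4 = namespace
deploy_image() {
  kubectl set image "deployment/$1" "$2=$3" -n "$4" || return 1
  kubectl rollout status "deployment/$1" -n "$4" --timeout=120s
}
```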

Creating the prod branch

Before this could work, a production branch had to exist. The backend had only <core-system>-backend-dev. Cutting the prod branch from the current dev state:

# Push the remote dev branch as prod
git push origin refs/remotes/origin/<core-system>-backend-dev:refs/heads/<core-system>-backend-prod

For the frontend:

git checkout <core-system>-frontend-dev
git checkout -b <core-system>-frontend-prod
git push origin <core-system>-frontend-prod

Now there are four branches:

<core-system>-frontend-dev builds unsuffixed timestamped images and deploys to <namespace-dev>
<core-system>-frontend-prod builds unsuffixed timestamped images and deploys to <namespace>
<core-system>-backend-dev builds *-uat images and deploys to <namespace-dev>
<core-system>-backend-prod builds *-prod images and deploys to <namespace>

Promotion from UAT to prod is a git merge (<core-system>-backend-dev into <core-system>-backend-prod), which triggers the prod pipeline automatically.

Updating the manifests

After cutting the first properly tagged builds, the deployment manifests were updated:

# k8s/prod/deployment-backend.yaml
image: <acr-registry>.azurecr.io/<core-system>-backend:20260317-143958-prod

# k8s/uat/deployment-backend.yaml
image: <acr-registry>.azurecr.io/<core-system>-backend:20260317-143958-uat

The manifest now reflects what actually ran, not a mutable pointer. The CI/CD pipeline updates the live Deployment on every build via kubectl set image, and the manifest in git is updated to match as part of the deploy step.
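
How the manifest is rewritten is not shown here. One plausible approach, sketched under that caveat, is a small sed-based helper that swaps the tag on the image line for a given repository. The function name is hypothetical.

```shell
# Rewrite the tag on the "image:" line for one repository in a manifest.
# $1 = manifest path  $2 = image repository (registry/name)  $3 = new tag
# Note: $2 ends up inside a regular expression, so "." matches any
# character. That is loose but harmless for a sketch.
set_manifest_tag() {
  tmp="$(mktemp)" || return 1
  # '|' is the sed delimiter because image references contain '/'.
  sed "s|\(image:[[:space:]]*$2\):[^[:space:]]*|\1:$3|" "$1" > "$tmp" \
    && cat "$tmp" > "$1"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

The deploy step would then commit the changed file, so the repository and the cluster agree on the running tag.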

Rollback is now explicit

With timestamped tags:

# Rollback to the previous known-good prod build
kubectl set image deployment/<core-system>-backend \
  <core-system>-backend=<acr-registry>.azurecr.io/<core-system>-backend:20260313-082326-prod \
  -n <namespace>

The target image exists in ACR, is never overwritten by the pipeline because every build gets a new timestamp, and is identifiable from the deploy history. kubectl rollout undo still works, but it is no longer the only option, and it no longer means “whatever latest was pointing to before.”
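
Choosing the rollback target can also be mechanical: it is the newest tag for the same environment that sorts before the one currently running. A minimal sketch with a hypothetical function name, reading the available tags on stdin:

```shell
# Read tag names on stdin and print the newest tag for the same environment
# that is strictly older than the given one. Prints nothing if none exists.
# $1 = currently deployed tag, e.g. 20260317-143958-prod
previous_tag() {
  env_suffix="${1##*-}"   # "20260317-143958-prod" -> "prod"
  { grep -E "^[0-9]{8}-[0-9]{6}-${env_suffix}\$" || true; } \
    | sort \
    | awk -v cur="$1" '($0 "") < (cur "") { prev = $0 } END { if (prev != "") print prev }'
}
```

The result is a candidate only. Whether that build is actually known-good still has to come from the deploy history.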

What changed immediately

After switching tags, two things became easier: knowing what’s running, and knowing when something changed.

kubectl get deployment <core-system>-backend -n <namespace> \
  -o jsonpath='{.spec.template.spec.containers[0].image}'
# <acr-registry>.azurecr.io/<core-system>-backend:20260317-143958-prod

That timestamp is a deployment record. It’s also how the post-incident analysis confirmed exactly when the vulnerable image was running and when it was replaced.
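
For incident timelines it can help to turn the tag back into a readable build time. A small sketch, with an illustrative function name:

```shell
# Turn "20260317-143958-prod" into "2026-03-17 14:39:58".
# The tag carries no time zone, so none is printed.
# Prints nothing for tags that do not start with a timestamp.
tag_timestamp() {
  printf '%s\n' "$1" | sed -n \
    's/^\([0-9]\{4\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)-\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\).*$/\1-\2-\3 \4:\5:\6/p'
}
```

Note that the timestamp records when the pipeline run started, so the moment the image went live in the cluster is somewhat later.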