Retiring the Latest Tag: Environment-Specific Image Tagging for Kubernetes
Retired the latest tag in favor of YYYYMMDD-HHMMSS-{env} image tags and conditional pipeline logic enforcing build separation between UAT and production.
For the first several months, images were deployed with the latest tag. Rollbacks meant finding the previous image digest in ACR. Knowing what was in production required checking the running pod’s image field, not the deployment manifest. After the March incident, this had to change.
The problem with latest
latest is a mutable pointer. Pushing a new image with the latest tag silently moves what that tag references. Two namespaces both claiming to run <core-system>-backend:latest might be running different underlying images depending on when they last pulled.
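One way to observe this drift is to compare the digests the pods actually resolved, rather than the tag in the spec. The following is a minimal sketch, assuming kubectl access to the cluster; the function names and the label selector passed in are hypothetical and not part of the original setup.

```shell
#!/usr/bin/env bash
# Print the unique image IDs (repo@sha256:...) resolved by pods in a namespace.
# The label selector is a hypothetical example; use whatever labels the pods carry.
running_digests() {
  local namespace="$1" selector="$2"
  kubectl get pods -n "$namespace" -l "$selector" \
    -o jsonpath='{.items[*].status.containerStatuses[*].imageID}' \
    | tr ' ' '\n' | sort -u
}

# Succeed only if both namespaces run exactly the same set of image digests.
same_image() {
  local selector="$1" ns_a="$2" ns_b="$3"
  [ "$(running_digests "$ns_a" "$selector")" = "$(running_digests "$ns_b" "$selector")" ]
}

# Example (placeholders as used in this post):
#   same_image 'app=<core-system>-backend' '<namespace-dev>' '<namespace>' \
#     || echo "both say latest, but the digests differ"
```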
In the March incident, this caused three concrete problems:
- After the March 13 manual build, latest pointed to the patched image. But the manifest files still said latest, so there was no way to tell from the repo what was actually running.
- A kubectl rollout undo would only restore the previous pod template, which still said latest, so it could not target a specific known-good version.
- The March incident log needed to record exactly what was compromised. “latest” is not an answer.
The new convention
Tags follow YYYYMMDD-HHMMSS-{env}:
<acr-registry>.azurecr.io/<core-system>-backend:20260317-143958-uat
<acr-registry>.azurecr.io/<core-system>-backend:20260317-143958-prod
<acr-registry>.azurecr.io/<core-system>-frontend:20260316-071706
The timestamp component is sortable and human-readable. The environment suffix makes it unambiguous which namespace a given image belongs in. The frontend shares the same build across environments (it has no environment-specific code), so its tag has no suffix.
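The convention is simple enough to generate and check in a few lines of shell. This is a sketch rather than the pipeline's own code: the function names are hypothetical, and using UTC for the timestamp is an assumption.

```shell
#!/usr/bin/env bash
# Build a tag of the form YYYYMMDD-HHMMSS-{env}.
# UTC is an assumption; the real pipeline derives its timestamp from pipeline.startTime.
make_tag() {
  local env="$1"
  printf '%s-%s\n' "$(date -u +%Y%m%d-%H%M%S)" "$env"
}

# Succeed only if the tag matches the backend convention (uat or prod suffix).
is_valid_backend_tag() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{8}-[0-9]{6}-(uat|prod)$'
}

# Example:
#   is_valid_backend_tag "20260317-143958-uat" && echo ok
#   is_valid_backend_tag "latest" || echo "rejected"
```

A check like this could run before a deploy to refuse anything that is not a timestamped tag, including the floating aliases.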
Floating per-environment aliases (uat, prod) are pushed alongside the timestamped tag:
<acr-registry>.azurecr.io/<core-system>-backend:uat → current UAT build
<acr-registry>.azurecr.io/<core-system>-backend:prod → current prod build
These aliases are how a human quickly finds “what’s the current build” without scrolling through ACR history. The timestamped tags are what the Kubernetes deployment manifests reference.
Pipeline implementation
The Azure DevOps pipeline determines the environment suffix from the source branch:
variables:
  ${{ if eq(variables['Build.SourceBranchName'], '<core-system>-backend-prod') }}:
    envSuffix: 'prod'
    deployNamespace: '<namespace>'
    envAlias: 'prod'
  ${{ else }}:
    envSuffix: 'uat'
    deployNamespace: '<namespace-dev>'
    envAlias: 'uat'
  buildDate: $[format('{0:yyyyMMdd}-{1:HHmmss}', pipeline.startTime, pipeline.startTime)]
  imageTag: $(buildDate)-$(envSuffix)
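The branch-to-suffix decision in the variables block above can be reproduced locally, which helps when checking what tag a given branch would produce. This is a sketch of the same if/else in shell; the function names are hypothetical and the prod branch name is passed in rather than hard-coded.

```shell
#!/usr/bin/env bash
# Mirror the pipeline's if/else: the prod branch yields "prod",
# every other branch falls through to "uat".
env_suffix_for_branch() {
  local branch="$1" prod_branch="$2"
  if [ "$branch" = "$prod_branch" ]; then
    echo prod
  else
    echo uat
  fi
}

# Combine a build timestamp (YYYYMMDD-HHMMSS) with the suffix for the branch.
image_tag_for_branch() {
  local branch="$1" prod_branch="$2" build_date="$3"
  printf '%s-%s\n' "$build_date" "$(env_suffix_for_branch "$branch" "$prod_branch")"
}

# Example:
#   image_tag_for_branch "$(git rev-parse --abbrev-ref HEAD)" \
#     '<core-system>-backend-prod' "$(date -u +%Y%m%d-%H%M%S)"
```

One consequence of the else branch is worth noting: any branch other than the prod branch is treated as UAT, so the pipeline trigger should be limited to the branches that are meant to build.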
Build steps tag and push:
- task: Docker@2
  displayName: Build and Push
  inputs:
    command: buildAndPush
    tags: |
      $(imageTag)
      $(envAlias)
Deploy steps reference the computed tag:
- script: |
    kubectl set image deployment/<core-system>-backend \
      <core-system>-backend=$(ACR)/<core-system>-backend:$(imageTag) \
      -n $(deployNamespace)
  displayName: Deploy to $(deployNamespace)
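The deploy step above returns as soon as the image field is changed. A possible extension, not part of the original pipeline, is to wait for the rollout so that a failing image fails the build. The sketch below assumes kubectl is configured; the function name and the 120-second timeout are arbitrary choices.

```shell
#!/usr/bin/env bash
# Set the container image on a deployment, then block until the rollout
# finishes or the timeout expires. Returns non-zero if either step fails.
deploy_image() {
  local deployment="$1" container="$2" image="$3" namespace="$4"
  kubectl set image "deployment/$deployment" "$container=$image" -n "$namespace" &&
    kubectl rollout status "deployment/$deployment" -n "$namespace" --timeout=120s
}

# Example (placeholders as used in this post):
#   deploy_image '<core-system>-backend' '<core-system>-backend' \
#     '<acr-registry>.azurecr.io/<core-system>-backend:20260317-143958-uat' '<namespace-dev>'
```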
Creating the prod branch
Before this could work, a production branch had to exist. The backend had only <core-system>-backend-dev, so the prod branch was cut from the current dev state:
# Push the remote dev branch as prod
git push origin refs/remotes/origin/<core-system>-backend-dev:refs/heads/<core-system>-backend-prod
For the frontend:
git checkout <core-system>-frontend-dev
git checkout -b <core-system>-frontend-prod
git push origin <core-system>-frontend-prod
Now there are four branches:
- <core-system>-frontend-dev builds *-uat images and deploys to <namespace-dev>
- <core-system>-frontend-prod builds unversioned images and deploys to <namespace>
- <core-system>-backend-dev builds *-uat images and deploys to <namespace-dev>
- <core-system>-backend-prod builds *-prod images and deploys to <namespace>
Promotion from UAT to prod is a git merge (<core-system>-backend-dev into <core-system>-backend-prod), which triggers the prod pipeline automatically.
Updating the manifests
After cutting the first properly-tagged builds, the deployment manifests were updated:
# k8s/prod/deployment-backend.yaml
image: <acr-registry>.azurecr.io/<core-system>-backend:20260317-143958-prod
# k8s/uat/deployment-backend.yaml
image: <acr-registry>.azurecr.io/<core-system>-backend:20260317-143958-uat
The manifest now records what actually ran, not a mutable pointer. On every build, the CI/CD pipeline updates the live deployment via kubectl set image, and the deploy step updates the manifest in git to match.
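The post does not show how the manifest is rewritten, so the following is only one plausible way to do it: a sed substitution on the image line, followed by a commit. The function name is hypothetical, and the commit commands in the trailing comment are an assumption about how the change reaches git.

```shell
#!/usr/bin/env bash
# Rewrite the tag on the "image:" line for one repository in a manifest file.
# Writes through a temp file so it works with both GNU and BSD sed.
set_manifest_image() {
  local file="$1" repo="$2" tag="$3" tmp status
  tmp="$(mktemp)" || return 1
  sed -E "s|(image:[[:space:]]*${repo}):[^[:space:]]+|\1:${tag}|" "$file" > "$tmp" &&
    cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Example (placeholders as used in this post):
#   set_manifest_image k8s/prod/deployment-backend.yaml \
#     '<acr-registry>.azurecr.io/<core-system>-backend' '20260317-143958-prod'
#   git commit -am "Deploy backend 20260317-143958-prod"
```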
Rollback is now explicit
With timestamped tags:
# Rollback to the previous known-good prod build
kubectl set image deployment/<core-system>-backend \
  <core-system>-backend=<acr-registry>.azurecr.io/<core-system>-backend:20260313-082326-prod \
  -n <namespace>
The target image exists in ACR, is never overwritten because each build gets a new timestamped tag, and is identifiable from the deploy history. kubectl rollout undo still works, but it’s no longer the only option, and it no longer means “whatever latest was pointing to before.”
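Because the timestamp is fixed-width and zero-padded, sorting tags as plain strings also sorts them chronologically, so the rollback candidate can be picked mechanically. This sketch uses a hypothetical helper that reads a tag list on stdin; whether the previous build is actually known-good is still a human judgment.

```shell
#!/usr/bin/env bash
# Read tags on stdin and print the newest tag for the given environment
# that sorts strictly before the current tag. Prints nothing if none exists.
# Floating aliases (uat, prod) and other environments are filtered out.
previous_tag() {
  local current="$1" env="$2"
  grep -E "^[0-9]{8}-[0-9]{6}-${env}\$" \
    | LC_ALL=C sort \
    | awk -v cur="$current" '$0 < cur { prev = $0 } END { if (prev != "") print prev }'
}

# Example: feed it the repository's tag list, for instance from
#   az acr repository show-tags --name <acr-registry> \
#     --repository <core-system>-backend --output tsv \
#     | previous_tag 20260317-143958-prod prod
```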
What changed immediately
After switching tags, two things became easier: knowing what’s running, and knowing when something changed.
kubectl get deployment <core-system>-backend -n <namespace> \
  -o jsonpath='{.spec.template.spec.containers[0].image}'
# <acr-registry>.azurecr.io/<core-system>-backend:20260317-143958-prod
That timestamp is a deployment record. It’s also how the post-incident analysis confirmed exactly when the vulnerable image was running and when it was replaced.
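For incident notes it can help to turn the tag back into a readable timestamp. A small sketch with a hypothetical helper name; it only reformats the digits, and the time zone is whatever the pipeline used when it stamped the build, which this post does not state.

```shell
#!/usr/bin/env bash
# Reformat a YYYYMMDD-HHMMSS-{env} tag as "YYYY-MM-DD HH:MM:SS env".
# Input that does not match the convention is printed unchanged.
describe_tag() {
  printf '%s\n' "$1" | sed -E \
    's/^([0-9]{4})([0-9]{2})([0-9]{2})-([0-9]{2})([0-9]{2})([0-9]{2})-(.+)$/\1-\2-\3 \4:\5:\6 \7/'
}

# Example:
#   describe_tag 20260317-143958-prod
```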