CVEs Are Now Being Exploited Faster Than You Can Respond

Jeremy Hogan | 28 January 2026

Anthropic just announced that Claude Sonnet 4.5 was able to replicate the Equifax data breach, without needing to look it up, just by recognizing the CVE.

Agents can weaponize a CVE in minutes … just by leveraging public models and off-the-shelf tools, with a little help from some ill-intended prompting.

Which means your window for remediation just slammed shut.

Agentic code is already a firehose feeding a fistful of SDLC straws. Agents reach out for unsanctioned dependencies, add npm post-install cruft, hallucinate packages; their velocity outpaces review. Agents spit the bit, jailbreak, hop in the saddle.

A lot of orgs won’t even be able to surface the existence of a vulnerable package, let alone act on it, before the CVE lands in CISA’s Known Exploited Vulnerabilities (KEV) catalog. Meaning what should be alert noise that gets logged for “next release,” or auto-ticketed for triage, turns into “breakglass” pager alerts.

The answer isn’t more humans. You probably already have too many humans in the mix. And, as the television true-life docuseries Black Mirror teaches us, don’t bring fleshware to a firmware fight. You need high-fidelity guardrails. More tower defense. Foundational integrity. Zero trust.

The answer is deterministic infrastructure for probabilistic workloads: input-addressed hashing, living SBOMs, and policy automation that can prevent at admission, detect in milliseconds, and remediate before the attacker finishes their first scan.

Move fast. Break nothing. Here. We. Go.

Trust Is Math

Agentic code rips the bandages off. The prevailing rituals are imperative, probabilistic, mutable, scan-and-pray: none of it ever really worked. Machine-speed output just makes it impossible to ignore.

Mathematical verification replaces human review as the basis of trust. No scanning. No sampling. No guesswork. The Flox SBOM is a mathematical proof, not an inference.

If you're running containers in production, you may already be using content-addressed hashes. Docker digests. OCI manifests. Definitions like image: nginx@sha256:abc123….

That's good. Keep doing that.

If you’re even doing that. Statistically you aren’t. And, even if you are, your layers are drifting. Probabilistically.

On that note, let’s talk terms. “Ontological reproducibility” means that the same recipe + same inputs = bit-for-bit identical artifact. Anywhere. Anytime. Forever. Input-addressed hashing is how you prove it—the hash and the build are the same thing, not a description of each other. Reproducibility as a policy or compliance/audit goal misses the point entirely; what you're actually after is mathematical certainty.

Imperative systems aspire to reproducibility: run the same steps and you should get the same results. Probably will, even. Deterministic systems guarantee it: same hash, same artifact, same results, always. Best practices can't beat physics.

Even so, content-addressed hashing answers only one question: are these the same bytes? It proves integrity. It doesn't prove provenance. Input-addressing gives you provenance and integrity. By definition. Two different build pipelines can produce the same tarball. Two different Dockerfiles, run at different times, pulling "latest" at different moments: same bytes, different lineage. The hash only tells you what arrived.

It doesn't tell you how it was built, or from what. Nor does it guarantee true reproducibility without drift. When a CVE drops, you know the image is affected. You don't know why: which input, which version, which path through the dependency graph.
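To make the distinction concrete, here is a toy sketch in Python. It is not how Flox (or any real build system) computes its hashes; the recipe fields, package versions, and function names are all made up for illustration. The only point is that a content hash sees the output bytes, while an input hash sees the recipe that produced them.

```python
import hashlib
import json


def content_hash(artifact: bytes) -> str:
    """Content address: a digest of the output bytes only."""
    return hashlib.sha256(artifact).hexdigest()


def input_hash(recipe: dict) -> str:
    """Input address: a digest of everything that goes INTO the build.

    Canonical JSON (sorted keys, fixed separators) keeps the digest
    stable regardless of dict ordering.
    """
    canonical = json.dumps(recipe, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two hypothetical recipes that happen to emit identical bytes.
recipe_a = {"builder": "make", "src": "app-1.0.tar.gz", "deps": ["libfoo-1.0"]}
recipe_b = {"builder": "make", "src": "app-1.0.tar.gz", "deps": ["libfoo-1.1"]}
artifact_a = b"identical output bytes"
artifact_b = b"identical output bytes"

same_bytes = content_hash(artifact_a) == content_hash(artifact_b)
same_lineage = input_hash(recipe_a) == input_hash(recipe_b)
print(same_bytes, same_lineage)  # True False
```

Same bytes, different lineage: the content hash cannot tell the two builds apart, and the input hash can.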

I’m big on data. Data + AI is the epitome of garbage in, garbage out. Bad—or stale, or incomplete—data in your AI/ML jobs means inaccuracy and false or inflated confidence.

The Solution: Hash-based trust and deterministic environments.

For Flox, the environment (not the package, not the container, not the image) is the atomic unit of deployment, of security, of trust. Flox supports, but doesn't require, images. When you need a container, you can materialize the runtime directly into a thin container at compose time. (More on that here.)

The input-addressed hash lives alongside your existing identifiers. An annotation, a label, a lookup key, a join on SBOM and pod lookup. Oh, and, you know, still also a hash.

metadata:
  annotations:
    floxorg/env: "acme/trading-engine"
    flox.dev/hash: "abr8bb314wjx7z23zy9p53hywda6fzi3"

That hash becomes the primary key for everything downstream. When a CVE drops, you know with bit-level certainty whether it matters. You don't grep logs; you (or your triage agents) query the hash:

kubectl get pods -A -o json | jq -r '
["NAMESPACE", "NAME", "STATUS", "NODE"],
(.items[]
| select(.metadata.annotations["flox.dev/hash"]=="abr8bb314wjx7z23zy9p53hywda6fzi3")
| [.metadata.namespace, .metadata.name, .status.phase, (.spec.nodeName // "-")])
| @tsv' | column -t
 
NAMESPACE       NAME                                STATUS    NODE
prod-trading    trading-engine-7f8d9c4b5-2xklm     Running   node-pool-a-3
prod-trading    trading-engine-7f8d9c4b5-9vjrn     Running   node-pool-b-1
prod-trading    trading-engine-7f8d9c4b5-qw4ht     Running   node-pool-a-7
staging         trading-engine-6c5d8a3f2-abo12     Running   staging-node-2
dr-west         trading-engine-7f8d9c4b5-dr001     Pending   -

Automatic blast radius in milliseconds. And when it’s time to remediate?

You, or your agents, just:

# Cordon every node running affected pods (pods not yet scheduled to a node are skipped)
kubectl get pods -A -o json | jq -r '
.items[]
| select(.metadata.annotations["flox.dev/hash"]=="abr8bb314wjx7z23zy9p53hywda6fzi3")
| .spec.nodeName // empty' | sort -u | xargs -I{} kubectl cordon {}
 
# Or block the hash at admission and let policy handle it
# Kyverno, Gatekeeper, OPA—same pattern: "reject if flox.dev/hash in blocklist"

Automatable remediation. This filters all that new noise back down to sniper-shot alerts, where a breakglass really means “everyone, out of bed, now!” And everything else is logged, ticketed, or heck, maybe even automatically rolled back, upgraded, or patched.

The hash is the join key. You’re joining by matching pods.hash = artifacts.hash. So the hash becomes the common column that everything—your runtime inventory, your vulnerability intel, and your policy layer—can match on. Query it. Block it. Upgrade it. And, yes, still use it to prove provenance and integrity. The tooling you already have becomes the enforcement layer for guarantees you never had.
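If "join key" sounds abstract, here is roughly what that join looks like as a few lines of Python. Treat it as a sketch under assumptions: the record shapes, the second pod, and the placeholder CVE ID are invented for illustration, and a real pipeline would pull pods from the Kubernetes API and artifacts from an SBOM store.

```python
def join_by_hash(pods, artifacts):
    """Inner join on pods.hash = artifacts.hash."""
    artifacts_by_hash = {a["hash"]: a for a in artifacts}
    joined = []
    for pod in pods:
        artifact = artifacts_by_hash.get(pod["hash"])
        if artifact is not None:
            joined.append({**pod, "env": artifact["env"], "cves": artifact["cves"]})
    return joined


# Hypothetical runtime inventory: what is running, and where.
pods = [
    {"namespace": "prod-trading", "name": "trading-engine-7f8d9c4b5-2xklm",
     "hash": "abr8bb314wjx7z23zy9p53hywda6fzi3"},
    {"namespace": "staging", "name": "reporting-5d4c7b9f6-x1y2z",
     "hash": "hypothetical-unaffected-hash"},
]

# Hypothetical vulnerability intel, already resolved to environment hashes.
# "CVE-0000-00000" is a placeholder, not a real identifier.
artifacts = [
    {"hash": "abr8bb314wjx7z23zy9p53hywda6fzi3", "env": "acme/trading-engine",
     "cves": ["CVE-0000-00000"]},
]

for row in join_by_hash(pods, artifacts):
    print(row["namespace"], row["name"], row["env"], row["cves"])
```

Because the join is an exact match on one column, there is nothing to score, rank, or guess: a pod either carries an affected hash or it doesn't.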

The result: faster detection, automated remediation, zero unauthorized execution.

See It in Action

I built a demo where FloxHub sits at the center of the CVE correlation pipeline: it ingests feeds from NIST NVD, OSV, CISA KEV, and FIRST EPSS on one side, and feeds enriched SBOM data to your existing security toolchain on the other.

A real-life CVE drops. I correlate it against every SBOM, joining by hash. Matches flow to Grafana dashboards, ELK, Splunk alerts, your existing SCA tools (BYOK or take-your-own-SBOM integrations with Snyk, FOSSA, Wiz, Black Duck), whatever you're already running.

My demo also applies risk scoring. Tier assignment. VEX inference. All computed from the SBOM, all keyed to the hash, all in real time. These surface in Grafana based on severity +/- KEV, reachability, and so on.

Build Time Is Runtime

There is no daylight between them. The closure doesn't describe the runtime—it is the runtime. One artifact. One SBOM. Not what you think is running—what you know. In Flox the closure is a circle. Complete and closed by definition.

Most security tools live in the gap between what they saw in a manifest and what they found when they scanned.

Build-level accuracy tells you what the tools think is there. Deploy-level accuracy requires scanning artifacts after they ship, pattern-matching file hashes to known packages, hoping nothing got statically linked or obfuscated. Runtime accuracy normally requires agents observing what actually executes, and you only see what runs while you're watching. Each layer is rarer than the last.

The gap between these layers is where supply chain attacks live.

XZ Utils wasn't in a manifest. It was hiding in plain sight in the build closure. Traditional tools missed it because they were scanning the wrong layer.

Input-addressed environments close that gap. Flox doesn't tell you what did execute. It tells you what can execute—and nothing else. The closure is complete and hermetic: if it's in the graph, it can run; if it's not, it cannot.

One SBOM per hash. Not per release. Not per "version." Every unique environment gets exactly one SBOM, and the hash is the lookup key.

Pi doesn't change, no matter how the circle is measured.

Correlation, not just inventory. When that CVE drops, the question isn't "do we use this package?" It's "which hashes contain this package, and where are they running?"
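One way to answer "which hashes contain this package?" is an inverted index built from the SBOMs. The sketch below assumes each SBOM can be reduced to a list of (name, version) pairs keyed by environment hash; the second hash and the package versions are illustrative, and exact-version matching is a simplification, since real advisories usually describe affected version ranges.

```python
from collections import defaultdict


def build_package_index(sboms):
    """Map (package, version) -> set of environment hashes that contain it."""
    index = defaultdict(set)
    for env_hash, packages in sboms.items():
        for name, version in packages:
            index[(name, version)].add(env_hash)
    return index


def hashes_containing(index, name, version):
    """Return the sorted environment hashes whose SBOM lists this package."""
    return sorted(index.get((name, version), set()))


# One SBOM per hash (hypothetical contents).
sboms = {
    "abr8bb314wjx7z23zy9p53hywda6fzi3": [("openssl", "3.1.4"), ("zlib", "1.3")],
    "hypothetical-hash-2": [("openssl", "3.1.5"), ("zlib", "1.3")],
}

index = build_package_index(sboms)
print(hashes_containing(index, "openssl", "3.1.4"))
print(hashes_containing(index, "zlib", "1.3"))
```

The hashes that come back are the keys you would then look up in your runtime inventory to get the "where are they running?" half of the question.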

Known, Not Partial

DoD C-SCRM requirements ask vendors to declare confidence in dependency relationships: Known, Partial, None, or Unknown. Most tools can only assert Partial, and even then they infer the relationship from scanning. Statically linked libraries disappear into binaries. When a scanner says “we think we found openssl 3.1.4,” it’s talking about byte patterns that match.

Flox asserts Known for every relationship. The deterministic graph is a mathematical property of how input-addressed builds work.

And what's better than a base image with 500 known CVEs or even a hardened (for now) base image? No base image at all.

Traditional container security means inheriting hundreds of CVEs from upstream layers you didn't choose. You scan, find vulnerabilities in packages you've never heard of, and file tickets that nobody owns, on code that never sees the light of runtime.

Flox environments require no base image. The closure contains exactly what you declared—nothing inherited, nothing hidden. The environment is the artifact.

Same reproducibility. Zero mystery meat.

Defined is better than inferred. Runtime SBOMs prevent build-time false positives. Cryptographic identity. Deterministic reachability.

Zero Trust Beats Zero CVE

You will never have zero CVEs. It's a comfort metric. It's a shared suspension of disbelief as we all enter the security theater. It's a lie we all agree to believe. It's whack-a-mole with infinite tokens.

Everyone gets that, now.

But you can have zero unauthorized execution. Nothing runs without a valid hash, a verified signature, a clean policy check.

Stop chasing "no vulnerabilities." Start enforcing "no execution without verification." One of those is a fantasy you will never reach. The other is something you can actually build. Compliance baked in, shifted as far left on the SDLC as you can get.

As Yoda once said, "There is no trust, only verify."

The Catalog Is Your Firewall

The Flox Catalog acts as a software firewall—a formal boundary between the creation and consumption of software. Only validated, tested builds pass through to production. Developers publish to the catalog. Operators deploy from the catalog. Production credentials never touch the inner loop.

This is where prevent, detect, remediate becomes operational:

Prevent. Admission controllers reject unknown hashes before pods ever schedule. Require both: a known label hash and a valid Flox hash with a clean SBOM. If it's not in the catalog, it doesn't run.

Detect. CVE correlation runs continuously against every SBOM in the registry. The moment a vulnerability publishes, the fleet registry returns every running instance of an affected hash—in milliseconds, not hours.

Remediate. CVE remediation becomes a version bump. Change the manifest, lock, push. Every flox install, or flox build/flox publish, produces new package hashes; every flox push produces a new environment hash, which can be atomically updated. One package gets updated, the rest pull from the cache.

The chain is traceable end-to-end. The old hash becomes automatically unauthorized. Blocked at push, blocked at pull. From pkill to purged.
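The "prevent" decision itself is tiny. Here is a sketch of the logic in plain Python, assuming a pod's annotations arrive as a dict and that the allowlist and blocklist are simple sets of hashes. In practice this rule would live in a Kyverno, Gatekeeper, or OPA policy rather than in application code; the function name and return shape are hypothetical.

```python
def admit(annotations, allowlist, blocklist):
    """Return (allowed, reason) for a pod's annotations."""
    env_hash = annotations.get("flox.dev/hash")
    if env_hash is None:
        return False, "missing flox.dev/hash annotation"
    if env_hash in blocklist:
        # Blocklist wins, so a revoked hash stays out even if still allowlisted.
        return False, f"hash {env_hash} is blocked"
    if env_hash not in allowlist:
        return False, f"hash {env_hash} is not in the allowlist"
    return True, "ok"


# Hypothetical policy data.
ALLOWLIST = {"abr8bb314wjx7z23zy9p53hywda6fzi3"}
BLOCKLIST = set()

pod_annotations = {"flox.dev/hash": "abr8bb314wjx7z23zy9p53hywda6fzi3"}
print(admit(pod_annotations, ALLOWLIST, BLOCKLIST))

# When a CVE lands on that hash, blocking it is a one-line data change.
BLOCKLIST.add("abr8bb314wjx7z23zy9p53hywda6fzi3")
print(admit(pod_annotations, ALLOWLIST, BLOCKLIST))
```

Note that remediation here changes data, not code: the policy stays the same and only the set membership moves.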

Policy Is Physics

SBOMs don't have to be just a regulatory checkbox. Audits don't have to take weeks. Compliance can become a physical property of the environment itself. In the manifest-driven world of declarativity, violating policy is functionally impossible. Either it complies, or it does not run. The hash identifies the artifact. The signature proves who built it and when. Together, they make policy enforceable at machine speed—no human in the loop required for the decision, only for the exception. This is the only thing that makes automation trustworthy.

Green Means Go

If an agent can exploit a CVE in minutes from memory, you need AI-assisted defense. But not all CVEs are equal, and not all responses should be automated. The key is building a trust pyramid where confidence determines who acts:

Green (Automated): 0 known CVEs or all have VEX "not_affected" status. Safe for autonomous deployment.
Yellow (Human Review): Medium/Low CVEs. Requires justification or assessment to proceed.
Red (Hard Block): Critical CVE or unsigned build. Environment activation disabled by policy.

Every decision is a lookup, not a judgment call. The hash is real, the SBOM is accurate, and the fleet registry is current. Automation and agents are executing on math, rather than guessing.
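As a sketch, the tiers reduce to a short function. Two assumptions are mine, not the article's: the tiers above don't say where High-severity CVEs land, so this version sends them to Yellow unless you add "high" to the blocking set, and a Critical CVE with a VEX "not_affected" status is treated as resolved, per the Green rule. The CVE record shape is also hypothetical.

```python
# ASSUMPTION: only "critical" hard-blocks. Add "high" here if your policy
# treats High-severity CVEs as Red rather than Yellow.
BLOCKING_SEVERITIES = {"critical"}


def assign_tier(cves, signed):
    """Return "green", "yellow", or "red" for one environment hash.

    cves: list of dicts with a "severity" key and an optional "vex" status.
    signed: whether the build carries a verified signature.
    """
    if not signed:
        return "red"  # unsigned build: hard block
    # CVEs with a VEX "not_affected" status count as resolved.
    open_cves = [c for c in cves if c.get("vex") != "not_affected"]
    if not open_cves:
        return "green"  # safe for autonomous deployment
    if any(c["severity"].lower() in BLOCKING_SEVERITIES for c in open_cves):
        return "red"  # hard block
    return "yellow"  # human review


print(assign_tier([], signed=True))
print(assign_tier([{"severity": "medium"}], signed=True))
print(assign_tier([{"severity": "critical"}], signed=True))
```

The function never inspects anything but data already keyed to the hash, which is what makes it a lookup rather than a judgment call.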

Brakes make the car go fast. High-fidelity guardrails and strict environmental boundaries aren't restrictions—they're the safety systems that give us the confidence to floor the accelerator.

What's Next: Agentic Remediation

Or at least highly agent-assisted. If agents can correlate CVEs to running infrastructure in seconds, they can certainly also triage tickets. The next question is obvious: can they fix it too?

The pieces are already there:

flox upgrade openssl produces a new locked environment with a new hash
CI signs that hash automatically, YAML is forged
Admission controllers allow/deny on hash
Fleet registry tracks the rollout
Atomic rollback is a pointer swap—seconds, not build > test > tag > push > pull > deploy times N iterations on X-gig images
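Why is rollback a pointer swap? Here is a minimal model, assuming generations are immutable and only a "current" index ever changes. This is an illustration of the idea, not Flox's implementation; the class, its methods, and the hash values are hypothetical.

```python
class EnvironmentPointer:
    """Immutable generations plus one mutable pointer to the active one."""

    def __init__(self, first_hash):
        self.generations = [first_hash]  # append-only history of hashes
        self.current = 0                 # index of the active generation

    @property
    def active(self):
        return self.generations[self.current]

    def publish(self, new_hash):
        """Record a new generation and make it active."""
        self.generations.append(new_hash)
        self.current = len(self.generations) - 1

    def rollback(self):
        """Step the pointer back one generation; nothing is rebuilt."""
        if self.current == 0:
            raise RuntimeError("no earlier generation to roll back to")
        self.current -= 1
        return self.active


env = EnvironmentPointer("hash-with-old-openssl")
env.publish("hash-with-patched-openssl")
print(env.active)      # hash-with-patched-openssl
print(env.rollback())  # hash-with-old-openssl
```

Nothing gets rebuilt, retagged, or re-pulled on rollback. The earlier generation still exists, so the only thing that moves is the index.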

And after that? Agentic hot-patching, duh. An agent with a full audit trail, with automatic rollback if health checks fail.

All that. With one eye on the bots, per: Asimov, Cameron et al.

We're Not There Yet

We’re closer to letting AI drive us to work in our two-ton death machines than we are to leaving an agent alone in a room with private data, and that's only because of brakes and seatbelts and airbags and sensors and cameras and GPS and kill switches and freaking lasers, probably.

But we're closer than most people think. The infrastructure for agentic remediation is:

Input-addressed hashes as identity primitives—not just integrity, but provenance
Living SBOMs derived from the hash—build time is runtime
Fleet visibility across every running instance—single source of truth
Policy automation that prevents, detects, and remediates on math, not meetings

Where This Goes Next

When attackers can go from CVE → exploit in minutes, you need defenses that go from CVE → blast radius → block/upgrade just as fast. Flox makes it simple to do this.

Here’s the tl;dr version of what this looks like:

Create and maintain runtime environments. These are declarative definitions of packages, automation logic, and services. Flox environments pull hash-pinned dependencies from an immutable store.
Publish to FloxHub. This pins an environment to an atomic generation (i.e., version). Each environment gets a deterministic hash and an SBOM that are derived from its complete dependency graph.
Gate on the hash. Require an allowlisted flox.dev/hash annotation and reject blocked hashes.
Correlate CVEs by hash. This gives you instant “Where is this running?” insight with zero scanning.

AUTHOR’S NOTE: I use AI for lots of things. Including writing. I use it to organize my thoughts. I use it to check for correctness of content and grammar. It often gives me unsolicited feedback and offers to give me a “punchier” and “more engaging” draft. Sometimes, without a hint of irony, it will ask if it can take pen to draft and make me sound more human. This article got flagged as AI. For things I wrote. In a style I've used for decades. It even offered—again without irony—to make me sound…less like me! It's sad that it's come to this. But I now need to add the disclaimer: The ideas and the language in this article are sui generis my own.