Inside the exploit pipeline: how Guardix proves vulnerabilities are real

Finding a vulnerability is one thing. Proving it is exploitable on a live fork with real protocol state is another. Here is how the exploit pipeline works end to end.

Guardix Team · Mar 31, 2026 · 9 min

Most security tools stop at detection. They flag a potential reentrancy, an unchecked return value, a suspicious access pattern — then hand the finding to a human to assess whether it is actually exploitable. That gap between "flagged" and "proven" is where false positives live, and where real vulnerabilities get dismissed as noise.

The Guardix exploit pipeline closes that gap. After the static analysis audit completes, the exploit pipeline takes validated findings and attempts to exploit them on a forked copy of the real chain — with real protocol state, real token balances, and real contract interactions. If an agent succeeds, the result is an on-chain proof: a transaction that extracted value, replayed on a clean state snapshot.

Exploit pipeline overview: audit → deploy → exploit → verify
The four phases of the exploit pipeline

Phase 1: Deploy — forking the real world

The pipeline starts by creating a faithful replica of the target chain. A Cloud Run job downloads the audited repository, spins up a local Anvil node forked from the live RPC endpoint at a specific block height, and runs a deployment agent that sets up the protocol exactly as it exists on-chain.

The deployment agent is itself an LLM-driven subagent. It reads the repository, identifies the deployment scripts or factory patterns, and executes them against the forked chain. This is not a simulated environment — the fork carries the full state of mainnet at the target block, including token balances, oracle prices, governance configurations, and liquidity pool depths.
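A fork like this can be launched with stock Anvil flags (`--fork-url` and `--fork-block-number` are real Anvil options). The helper below is an illustrative sketch, not the pipeline's actual launcher — the function names and default port are assumptions:

```python
import subprocess

def anvil_fork_cmd(rpc_url: str, block: int, port: int = 8545) -> list[str]:
    """Build the command line for an Anvil node forked at a fixed block."""
    return [
        "anvil",
        "--fork-url", rpc_url,              # live RPC endpoint to fork from
        "--fork-block-number", str(block),  # pin the fork to one block height
        "--port", str(port),
    ]

def start_fork(rpc_url: str, block: int) -> subprocess.Popen:
    # The resulting node carries full mainnet state at `block`: balances,
    # oracle prices, governance config, liquidity pool depths.
    return subprocess.Popen(anvil_fork_cmd(rpc_url, block))
```

Pinning the block height is what makes the fork reproducible: every run against the same block sees the same world.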

Deploy phase: download repo, fork mainnet, deploy contracts, validate, snapshot, upload
The deploy phase creates a reusable chain snapshot with the protocol fully configured

Once the deployment agent finishes, the pipeline runs a readiness check. This validates the deployment manifest — confirming that contract addresses are non-zero, the attacker account has sufficient balance, the deployer and attacker use separate private keys, and the protocol state looks healthy. If validation passes, the entire Anvil state is dumped to a hex snapshot and uploaded to cloud storage alongside the manifest.
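In pseudocode, the readiness check reduces to a handful of assertions over the manifest. The field names (`contracts`, `attacker_balance_wei`, `deployer_key`, `attacker_key`) and the minimum balance are assumptions for illustration:

```python
ZERO_ADDR = "0x" + "00" * 20

def validate_manifest(manifest: dict, min_attacker_wei: int = 10**18) -> list[str]:
    """Return a list of validation errors; an empty list means ready."""
    errors = []
    # Every deployed contract must resolve to a real address.
    for name, addr in manifest.get("contracts", {}).items():
        if addr.lower() == ZERO_ADDR:
            errors.append(f"contract {name} has zero address")
    # The attacker needs gas money to run the exploit.
    if manifest.get("attacker_balance_wei", 0) < min_attacker_wei:
        errors.append("attacker account underfunded")
    # Deployer and attacker must be distinct actors.
    if manifest.get("deployer_key") == manifest.get("attacker_key"):
        errors.append("deployer and attacker must use separate keys")
    return errors
```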

The state snapshot is the critical handoff artifact. Every exploit agent downstream loads this exact snapshot, so they all start from identical chain state — no drift, no race conditions, no stale data.
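The dump and restore both map onto standard Anvil JSON-RPC cheatcodes (`anvil_dumpState` and `anvil_loadState` are real methods). This tiny helper sketches the handoff; the hex blob is a hypothetical placeholder:

```python
import json

def rpc_payload(method: str, params: list) -> str:
    """Build a JSON-RPC 2.0 request body for the Anvil node."""
    return json.dumps({"jsonrpc": "2.0", "id": 1,
                       "method": method, "params": params})

# Deploy phase: dump the entire chain state as one hex blob...
dump_req = rpc_payload("anvil_dumpState", [])
# ...which each exploit agent later restores verbatim:
load_req = rpc_payload("anvil_loadState", ["0xf869..."])  # hypothetical state hex
```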

Phase 2: Exploit — parallel agents, independent strategies

This is where things get interesting. The pipeline spawns multiple exploit agents in parallel, each running in its own isolated Cloud Run container. Each container loads the same state snapshot from the deploy phase, but the agents operate completely independently — different models, different strategies, different attack paths.

Multiple parallel agents with different models exploring independently
Agents are assigned models via round-robin and run with independent time budgets

Models are assigned to agents via round-robin across the configured model pool. A typical configuration uses claude-opus and codex-max, but the pool is configurable. Each agent gets a time budget — currently 5 hours by default — and iterates autonomously within that window.
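Round-robin assignment is simple enough to sketch in a few lines. This helper mirrors the example pool above but is an illustration, not Guardix's actual scheduler:

```python
from itertools import cycle

def assign_models(num_agents: int, model_pool: list[str]) -> list[str]:
    """Assign a model to each agent, cycling through the configured pool."""
    pool = cycle(model_pool)
    return [next(pool) for _ in range(num_agents)]

assign_models(5, ["claude-opus", "codex-max"])
# → ["claude-opus", "codex-max", "claude-opus", "codex-max", "claude-opus"]
```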

The rationale for parallelism is not just speed. Different models have genuinely different reasoning patterns. Claude might identify a flash loan vector that Codex misses, while Codex might find a reentrancy path through a callback that Claude overlooks. Running them in parallel captures a wider surface of potential exploits than any single model would alone.

The exploit loop

Each agent runs in a tight loop: analyze the codebase on the forked chain, build a hypothesis about a potential vulnerability, write an exploit in Solidity (as a FlawVerifier contract), and submit it for verification. If verification fails, the agent receives diagnostic feedback from the verifier and iterates.

Exploit loop: analyze, hypothesize, write exploit, verify, iterate
The inner loop continues until the time budget is exhausted or an exploit is verified

The agent maintains a session hypothesis file — a running backlog of attack ideas it has explored, what worked, and what did not. This prevents the agent from re-trying the same approach and helps it build on partial progress across iterations.
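One plausible shape for that backlog is a JSON Lines file with one record per attempt — the field names and outcome labels here are assumptions:

```python
import json

def record_attempt(path: str, hypothesis: str, outcome: str, notes: str = "") -> None:
    """Append one attack attempt to the session hypothesis file."""
    with open(path, "a") as f:
        f.write(json.dumps({"hypothesis": hypothesis,
                            "outcome": outcome,  # e.g. "verified" | "failed" | "partial"
                            "notes": notes}) + "\n")

def already_tried(path: str, hypothesis: str) -> bool:
    """Check the backlog before spending an iteration on a repeat idea."""
    try:
        with open(path) as f:
            return any(json.loads(line)["hypothesis"] == hypothesis for line in f)
    except FileNotFoundError:
        return False
```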

Each iteration has a capped inner timeout (75 minutes) to prevent any single attempt from consuming the full budget. The loop continues until either a verified exploit is found or the remaining budget drops below 2 minutes.
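Putting the budgets together, the outer loop might look like this sketch, where `attempt` stands in for one full analyze → hypothesize → write → verify pass (the function and its signature are assumptions):

```python
import time

TOTAL_BUDGET_S = 5 * 3600   # 5-hour agent budget
INNER_TIMEOUT_S = 75 * 60   # 75-minute cap per attempt
MIN_REMAINING_S = 2 * 60    # stop when under 2 minutes remain

def exploit_loop(attempt, now=time.monotonic):
    """Run attempts until one verifies or the budget is exhausted.

    `attempt(timeout_s=...)` returns a result on success, else None.
    """
    deadline = now() + TOTAL_BUDGET_S
    while True:
        remaining = deadline - now()
        if remaining < MIN_REMAINING_S:
            return None  # budget exhausted, no verified exploit
        # No single attempt may outlive either cap.
        result = attempt(timeout_s=min(INNER_TIMEOUT_S, remaining))
        if result is not None:
            return result  # verified exploit
```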

Actor separation: deployer vs attacker

A subtle but important design choice: the pipeline enforces strict separation between the deployer and attacker roles. The deployer account sets up the protocol — deploying contracts, seeding liquidity, configuring parameters. The attacker account is the one that attempts the exploit. They use separate private keys and the pipeline validates that the deployer never appears as a sender in exploit transactions.

Deployer and attacker as separate actors with isolated keys and roles
Strict role separation prevents the exploit from relying on deployer privileges

This matters because a real attacker does not have admin keys. If the exploit relies on the deployer's privileges — setting prices, minting tokens, pausing contracts — it is not a genuine vulnerability. The separation ensures that every verified exploit represents an attack that an external party could actually execute.

The attacker agent also operates with restricted RPC access. Harness-only admin calls like anvil_setBalance, anvil_setCode, and anvil_loadState are blocked via a cast wrapper. The agent can interact with the chain as a normal user, but cannot cheat the environment.
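The filtering itself reduces to a small deny-list check before a call ever reaches the node. This sketch names only the methods listed above and is not the actual `cast` wrapper:

```python
# Harness-only Anvil cheatcodes the attacker may never invoke.
BLOCKED_METHODS = {"anvil_setBalance", "anvil_setCode", "anvil_loadState"}

def guard_rpc(method: str) -> str:
    """Reject harness-only admin RPCs; pass normal user calls through."""
    if method in BLOCKED_METHODS:
        raise PermissionError(f"blocked admin RPC: {method}")
    return method
```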

Two-chain verification

The verification model is the part of the pipeline we are most careful about. A false positive exploit — one that appears to succeed but relies on corrupted state or accumulated side effects — is worse than no exploit at all. It would misrepresent the risk.

To prevent this, each agent maintains two separate Anvil instances. The scratch chain is where the agent does its work — reading state, sending test transactions, iterating on approaches. Mutations on the scratch chain persist across attempts, which is useful for exploration but means the state drifts from the original snapshot.

Scratch chain for exploration and verifier chain for clean-state replay
The verifier chain loads a pristine snapshot for every verification attempt

The verifier chain is separate. For every verification attempt, it loads the original state snapshot from scratch — the same pristine state that all agents started from. The agent's FlawVerifier contract is deployed fresh, the attacker account is funded, and a single transaction calls executeOnOpportunity(). If the FlawVerifier contract's native balance increases after that transaction, the exploit is confirmed.

verification.py python
# Simplified verification logic
initial_balance = verifier_chain.get_balance(flaw_verifier_address)

# Single atomic transaction from the attacker account
verifier_chain.send_tx(
    sender=attacker,  # `from` is a reserved word in Python
    to=flaw_verifier,
    data=encode("executeOnOpportunity()"),
)

final_balance = verifier_chain.get_balance(flaw_verifier_address)
profit = final_balance - initial_balance

if profit > MIN_PROFIT_THRESHOLD:
    return ExploitVerified(profit_wei=profit)

The minimum profit threshold is 0.1 native units (e.g. 0.1 ETH). This filters out dust-level value extraction that might technically succeed but does not represent a meaningful vulnerability.

The FlawVerifier contract

Every exploit is expressed as a Solidity contract that implements a single entry point: executeOnOpportunity(). The agent writes the exploit logic inside this function — flash loan calls, reentrancy callbacks, price manipulation sequences, whatever the attack vector requires. The constraint is that the entire exploit must execute atomically in one transaction.

FlawVerifier.sol solidity
contract FlawVerifier {
    function executeOnOpportunity() external {
        // Agent writes the exploit logic here:
        // 1. Take flash loan from Aave/Uniswap
        // 2. Manipulate price oracle
        // 3. Drain vulnerable vault
        // 4. Repay flash loan
        // 5. Profit remains in this contract
    }

    receive() external payable {}
}

This atomic constraint is important. It proves that the exploit can succeed in a single block, without relying on multi-block manipulation or off-chain coordination that might not be feasible in practice.

Infrastructure: Cloud Run jobs and state management

The pipeline runs on GCP Cloud Run jobs. The deploy phase is a single job. The exploit phase spawns N parallel tasks within a job, where each task is an independent container with its own Anvil instance. Tasks are assigned models via round-robin, and each task reports results independently via Pub/Sub events.

  • Deploy job: single container, runs the deployment agent, produces state snapshot and manifest
  • Exploit job: N parallel containers (default 6), each loads the same snapshot, runs independently
  • State handoff: GCS-backed — state_dump.hex and manifest.json uploaded by deploy, downloaded by exploit
  • Event reporting: Pub/Sub events for deploy completion, task completion, and live output streaming
  • Time budget: 5 hours per agent by default, configurable per run

The deploy-once-snapshot pattern is deliberate. Running deployment once and sharing the snapshot across all exploit agents eliminates a class of non-determinism — every agent starts from byte-identical chain state. It also saves significant time compared to re-deploying for each agent.

What happens when an exploit succeeds

When the verifier confirms a profitable transaction, the agent uploads a result bundle: the FlawVerifier source code, the verification transaction details, balance deltas for all relevant accounts and contracts, and the profit amount. The backend stores this alongside the original audit findings.
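Concretely, the bundle might look roughly like this — every field name, address label, and value below is a made-up illustration of the description above, not the actual schema:

```python
result_bundle = {
    "flaw_verifier_source": "contract FlawVerifier { /* exploit */ }",
    "verification_tx": {"hash": "0xabc123...", "block": 19_000_000},  # hypothetical
    "balance_deltas": {          # wei, per relevant account/contract
        "vulnerable_vault": -5 * 10**18,
        "flaw_verifier": 5 * 10**18,
    },
    "profit_wei": 5 * 10**18,
}
```

Note the invariant a sane bundle should satisfy: the profit equals the FlawVerifier's balance delta, and value extracted from the protocol accounts for where it went.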

In the dashboard, exploited findings are surfaced with their proof — the exact Solidity code that extracted value, the profit amount, and the chain state at the time of execution. This transforms a finding from "this pattern looks risky" to "this pattern was exploited for X ETH on a fork of mainnet at block Y."

The exploit pipeline is not a replacement for static analysis. It is a second pass that takes the highest-signal findings and attempts to prove they are real. The combination of detection + exploitation gives teams a much higher-confidence view of their actual risk surface.

Design constraints and trade-offs

Several design choices are worth calling out explicitly, because they represent trade-offs we made deliberately:

  • Atomic exploits only — requiring everything in one transaction filters out multi-block attacks. We accept this trade-off because single-tx atomicity is a strong proof standard and covers the majority of DeFi exploit patterns (flash loans, reentrancy, oracle manipulation).
  • No prior-stage bias — the exploit agent does not receive findings or hypotheses from the static analysis stage. It gets the codebase and the understanding artifacts (system map, invariants) but forms its own attack hypotheses. This prevents anchoring on the static analyzer's perspective.
  • Blocked admin RPCs — the agent cannot call anvil_setBalance or anvil_setCode. This is stricter than necessary for many cases, but eliminates an entire class of false positive exploits where the agent "cheats" by manipulating the environment.
  • Clean-state verification — replaying on a pristine snapshot for every verification attempt is expensive (each load takes seconds). But it is the only way to guarantee that accumulated side effects on the scratch chain do not contaminate the proof.

What comes next

The exploit pipeline is currently in controlled access behind a feature flag. We are iterating on the agent prompting strategy, expanding the model pool, and building better diagnostics for the feedback loop between failed verification attempts and the next iteration.

The end goal is straightforward: every validated finding in a Guardix audit should come with a clear answer to "is this actually exploitable?" — backed by a transaction, not a guess.