Research
Methodology
2026-05-16

SPECTRE × Differential Fuzzing — Integration Design

Author: spectre-solana-max engineering
Status: Design doc. PoC implementation deferred to a follow-up.
Companion: spectre-vs-competitors-gap-analysis-2026-05-16.md (tier-3 item 8).

Premise

Differential fuzzing is a technique audit firms use internally to catch behavior divergence: run two implementations against the same random inputs and flag any input where their outputs disagree. On Solana the canonical use cases are:

  1. Pre-upgrade vs post-upgrade handler — does the upgrade change behavior on any input it wasn't supposed to?
  2. Native vs Anchor port — does the Anchor rewrite preserve the native handler's semantics?
  3. Reference math vs production math — does the optimized fixed-point math agree with the reference rational?
  4. Cross-protocol — does Drift's perp math agree with Mango's on the same input?
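As an illustration of use case 3, the following is a minimal std-only sketch of the differential loop. Everything in it is hypothetical: `SCALE`, `mul_reference`, `mul_production`, and the xorshift generator are stand-ins chosen so the example compiles without external crates. A real target would take its bytes from libFuzzer through cargo-fuzz rather than from a hand-rolled PRNG, and the "production" function here carries a deliberately planted precision bug.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates.
/// Assumption: the seed is non-zero (xorshift stays at zero otherwise).
pub struct XorShift64(pub u64);

impl XorShift64 {
    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical fixed-point scale (6 decimals).
pub const SCALE: u64 = 1_000_000;

/// Reference: widen to u128, multiply, then divide once.
pub fn mul_reference(a: u64, b: u64) -> Option<u64> {
    let wide = (a as u128) * (b as u128) / (SCALE as u128);
    u64::try_from(wide).ok()
}

/// "Optimized" variant with a planted flaw: dividing before multiplying
/// avoids the u128 but throws away the fractional part of `a`.
pub fn mul_production(a: u64, b: u64) -> Option<u64> {
    (a / SCALE).checked_mul(b)
}

/// Runs both implementations on `iters` pseudo-random inputs and returns
/// the first input pair on which they disagree, if any.
pub fn first_divergence(seed: u64, iters: usize) -> Option<(u64, u64)> {
    let mut rng = XorShift64(seed);
    for _ in 0..iters {
        // Keep inputs in a bounded range, as a generated harness would.
        let a = rng.next_u64() % (1_000 * SCALE);
        let b = rng.next_u64() % (1_000 * SCALE);
        if mul_reference(a, b) != mul_production(a, b) {
            return Some((a, b));
        }
    }
    None
}
```

In a cargo-fuzz target the body of the loop would move into `fuzz_target!` and the comparison would become an `assert_eq!`, so that libFuzzer records the diverging input as a crash.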

OtterSec uses this internally; to our knowledge, no public Solana tool ships it. On Ethereum, Trail of Bits' Echidna can be used for the same purpose.

The friction is harness authorship: every fuzzed handler needs a cargo-fuzz target that generates the accounts struct via Arbitrary, calls the handler, and diffs against a reference. Writing those targets is mechanical, and SPECTRE already knows the accounts shape.

Integration shape

Component 1: pinpoint spectre fuzz-target --handler <name> (2-3 days)

A new CLI subcommand that consumes SPECTRE's symbol-table output and emits a starter cargo-fuzz target for a named handler. Pre-fills:

  • Accounts struct → Arbitrary derive. Each Anchor Account<'info, T> field becomes an Arbitrary instance with bounded ranges (u64 capped at MAX_REASONABLE, Pubkey from a small pool, etc.).
  • Handler signature → fuzz entry point. fuzz_target!(|input: HandlerInput| { ... }).
  • Pre/post-state diff. Capture pre-state, run the handler, capture post-state, and panic if an assertion fails. The dev plugs in the assertions (which fields should be equal? which monotonic?).
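The bounded-input idea from the first bullet can be sketched without external crates as follows. This is an assumption-laden stand-in, not the generated code: `DepositInput`, `KEY_POOL`, the `Bytes` cursor, and the value of `MAX_REASONABLE` are all hypothetical, and `Pubkey` is modeled as a plain 32-byte array. A generated target would more likely implement the `arbitrary` crate's `Arbitrary` trait and draw values from its `Unstructured` type.

```rust
/// Stand-in for a Solana Pubkey (hypothetical, std-only).
pub type Pubkey = [u8; 32];

/// Hypothetical upper bound for amounts; the real value is a design choice.
pub const MAX_REASONABLE: u64 = 1_000_000_000_000;

/// Small fixed pool so generated accounts collide often.
pub const KEY_POOL: [Pubkey; 3] = [[1u8; 32], [2u8; 32], [3u8; 32]];

/// Cursor over the raw fuzzer bytes.
pub struct Bytes<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> Bytes<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Self { data, pos: 0 }
    }

    /// Next byte, or 0 once the input is exhausted.
    fn byte(&mut self) -> u8 {
        let b = self.data.get(self.pos).copied().unwrap_or(0);
        self.pos += 1;
        b
    }

    /// Reads 8 little-endian bytes and folds them into `0..=max`.
    pub fn bounded_u64(&mut self, max: u64) -> u64 {
        let mut raw = [0u8; 8];
        for slot in raw.iter_mut() {
            *slot = self.byte();
        }
        let value = u64::from_le_bytes(raw);
        if max == u64::MAX { value } else { value % (max + 1) }
    }

    /// Picks one key from the pool using a single input byte.
    pub fn pooled_key(&mut self) -> Pubkey {
        KEY_POOL[self.byte() as usize % KEY_POOL.len()]
    }
}

/// Hypothetical fuzz input for a `deposit` handler.
#[derive(Debug, PartialEq)]
pub struct DepositInput {
    pub owner: Pubkey,
    pub vault_authority: Pubkey,
    pub amount: u64,
}

impl DepositInput {
    pub fn from_bytes(data: &[u8]) -> Self {
        let mut b = Bytes::new(data);
        Self {
            owner: b.pooled_key(),
            vault_authority: b.pooled_key(),
            amount: b.bounded_u64(MAX_REASONABLE),
        }
    }
}
```

Drawing keys from a small pool is what makes aliasing cases (owner equal to authority, for example) reachable in a reasonable number of iterations; fully random 32-byte keys would almost never collide.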

Output: fuzz_targets/<handler>.rs + Cargo.toml patch. Run with cargo fuzz run <handler>.
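The pre/post-state template inside the generated file might look roughly like the sketch below. `VaultState`, `deposit`, and `check_deposit` are hypothetical names, and the three assertions are examples of what a dev could plug in (one field unchanged, one monotonic, one exact). In a real target, `check_deposit` would be called from inside `fuzz_target!`.

```rust
/// Hypothetical account state touched by the handler.
#[derive(Clone, Debug, PartialEq)]
pub struct VaultState {
    pub owner: [u8; 32],
    pub balance: u64,
    pub deposit_count: u64,
}

/// Hypothetical handler under test.
pub fn deposit(state: &mut VaultState, amount: u64) -> Result<(), &'static str> {
    // Check for overflow before mutating anything.
    let new_balance = state.balance.checked_add(amount).ok_or("overflow")?;
    state.balance = new_balance;
    state.deposit_count += 1;
    Ok(())
}

/// Captures pre-state, runs the handler, then checks the dev-supplied
/// assertions against the post-state. Panics on any violation.
pub fn check_deposit(pre: VaultState, amount: u64) {
    let mut post = pre.clone();
    match deposit(&mut post, amount) {
        Ok(()) => {
            // Field that must stay equal.
            assert_eq!(post.owner, pre.owner, "owner must not change");
            // Field that must be monotonic.
            assert!(post.balance >= pre.balance, "balance must not decrease");
            // Field with an exact expected delta.
            assert_eq!(post.deposit_count, pre.deposit_count + 1);
        }
        // Modeling assumption: a failed call leaves the state untouched.
        Err(_) => assert_eq!(post, pre, "failed call must not mutate state"),
    }
}
```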

Component 2: Reference-implementation diff mode (~1 week)

Two flavors:

  • A/B mode: SPECTRE generates a harness that runs the same input through TWO commits of the same handler (current HEAD vs --baseline <sha>). Panic on output divergence. Useful for "is my refactor semantics-preserving?"
  • Native ↔ Anchor mode: SPECTRE generates a harness that runs the same input through a native handler AND its Anchor port. Panic on divergence. Useful for "did the rewrite preserve semantics?"
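Both flavors reduce to the same comparison, sketched below under simplifying assumptions. The two `withdraw_*` functions are hypothetical stand-ins for the two builds (baseline vs HEAD, or native vs Anchor), and the `head` version contains a planted behavior change. How the two versions get linked into one fuzz binary is not settled by this design and is left out of the sketch.

```rust
/// Result of one handler run: new balance, or an error code.
pub type Outcome = Result<u64, u32>;

/// Signature shared by both versions of the handler.
pub type Handler = fn(u64, u64) -> Outcome;

/// Baseline: rejects withdrawals larger than the balance (error code 1).
pub fn withdraw_baseline(balance: u64, amount: u64) -> Outcome {
    balance.checked_sub(amount).ok_or(1)
}

/// Refactor with a planted change: it clamps to zero instead of failing.
pub fn withdraw_head(balance: u64, amount: u64) -> Outcome {
    Ok(balance.saturating_sub(amount))
}

/// One diverging input together with what each side returned.
#[derive(Debug, PartialEq)]
pub struct Divergence {
    pub balance: u64,
    pub amount: u64,
    pub baseline: Outcome,
    pub head: Outcome,
}

/// Runs the same input through both versions and reports any disagreement,
/// including the case where one side errors and the other succeeds.
pub fn diff(baseline: Handler, head: Handler, balance: u64, amount: u64) -> Option<Divergence> {
    let (b, h) = (baseline(balance, amount), head(balance, amount));
    if b == h {
        None
    } else {
        Some(Divergence { balance, amount, baseline: b, head: h })
    }
}
```

Comparing whole `Result` values rather than only the success path matters here: the planted change is invisible on inputs where both sides succeed and shows up only when the baseline rejects the input.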

Both lean on SPECTRE's existing native + Anchor extractors — we already know which native dispatcher matches which Anchor handler by name + ctx-type heuristics.

Component 3: SPECTRE-finding-driven fuzz harness (~2 weeks)

When a SPECTRE rule fires that specifically benefits from fuzzing (RACE-001 stale-account-after-CPI, ARI-050 unchecked arithmetic, ITER-001 sentinel iteration), the harness skeleton includes the relevant assertion. Example: when ITER-001 fires on Obligation::total_collateral, the auto-generated harness generates a Position array with random default-Pubkey holes and asserts that total_collateral agrees with a reference sum over all non-default positions. The harness exercises precisely the bug class the rule flagged.
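A std-only sketch of the ITER-001 example follows. It is illustrative rather than the actual Obligation code: `Position`, `positions_from_bytes`, and the free function `total_collateral` are hypothetical, `Pubkey` is modeled as a 32-byte array, and the sentinel bug (stopping at the first default key) is planted on purpose so the assertion has something to catch.

```rust
/// Stand-in for a Solana Pubkey (hypothetical, std-only).
pub type Pubkey = [u8; 32];
pub const DEFAULT_KEY: Pubkey = [0u8; 32];

#[derive(Clone, Copy, Debug)]
pub struct Position {
    pub mint: Pubkey,
    pub collateral: u64,
}

/// Planted sentinel-iteration bug: stops at the first default key, so any
/// live position stored after a hole is silently skipped.
pub fn total_collateral(positions: &[Position]) -> u64 {
    positions
        .iter()
        .take_while(|p| p.mint != DEFAULT_KEY)
        .map(|p| p.collateral)
        .sum()
}

/// Reference: sum over every non-default position, wherever it sits.
pub fn reference_total(positions: &[Position]) -> u64 {
    positions
        .iter()
        .filter(|p| p.mint != DEFAULT_KEY)
        .map(|p| p.collateral)
        .sum()
}

/// Builds up to 8 positions from raw bytes, two bytes per slot. Roughly one
/// slot in four becomes a hole (default key); the rest get a non-zero key.
pub fn positions_from_bytes(data: &[u8]) -> Vec<Position> {
    data.chunks_exact(2)
        .take(8)
        .map(|c| {
            let mint = if c[0] % 4 == 0 { DEFAULT_KEY } else { [c[0]; 32] };
            Position { mint, collateral: c[1] as u64 }
        })
        .collect()
}

/// The check a generated harness would assert on every input.
pub fn agrees(data: &[u8]) -> bool {
    let positions = positions_from_bytes(data);
    total_collateral(&positions) == reference_total(&positions)
}
```

An input with a hole between two live positions is enough to separate the two sums, which is why the generator deliberately makes holes common instead of leaving them to chance.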

Why this is high-leverage

  • Most Solana programs have zero fuzz targets. cargo-fuzz exists but the boilerplate barrier is real.
  • SPECTRE knows the accounts shape, so auto-generation cuts the boilerplate cost from days to minutes.
  • Finding-driven harnesses turn SPECTRE's pattern detection into a concrete, runnable proof-of-bug or proof-of-correctness target.

Concrete next steps

  1. Phase 1 (1 day): pinpoint spectre fuzz-target --handler <name> CLI stub.
  2. Phase 2 (2 days): Arbitrary derivation from accounts-struct + handler-arg JSON.
  3. Phase 3 (3 days): Pre/post-state capture + diff assertion template.
  4. Phase 4 (1 week): A/B and native↔Anchor modes.
  5. Phase 5 (2 weeks): Per-rule finding-driven assertions for RACE-001, ARI-050, ITER-001, INV-001/004.

Scope boundary

Phases 1-3 ship standalone value: a dev can generate a useful fuzz target with one command, even without finding-driven mode. That's the minimum viable integration.

The corpus-fuzz infrastructure (running fuzz targets continuously across the corpus, surfacing crashes back to PR annotations) is out of scope here — that's a separate ops conversation.

References