
SPECTRE vs Competitor Gap Analysis (2026-05-17)

Author: spectre-solana-max engineering
Supersedes: spectre-vs-competitors-gap-analysis-2026-05-16.md
Scope: Static analysis, formal verification, audit-firm tooling, and real-time monitoring on Solana, benchmarked against SPECTRE at branch feat/spectre-solana-max tip ebad85d2 (76 commits ahead of main).
Method: Web research on competitor feature surfaces refreshed 2026-05-17, cross-referenced with SPECTRE's post-replay-closure state (24/24 exact-rule + 24/24 class-level on the historical-incident replay benchmark; 77 total rules, 55 Solana + 22 generic; 8 cross-program rules including CROSS-007).

What changed since 2026-05-16

The May 16 doc framed SPECTRE as having "the strongest rule pack" but sitting at 20/28 (71%) exact-rule on the replay benchmark. One day of focused rule-shipping closed every remaining ≈ row:

Stage	Exact-rule	Class-level	≈ remaining
2026-05-16 baseline	21/24 (88%)	24/24 (100%)	3
AUTH-100 body-level ext	22/24 (92%)	24/24 (100%)	2
ACC-021 init-write exception	23/24 (96%)	24/24 (100%)	1
ACC-015 ship (Cashio class)	24/24 (100%)	24/24 (100%)	0

Plus the 2026-05-16 day-of work: CROSS-007 (UXD delegate-risk), ACC-014 (Wormhole sysvar), ITER-001 (Jet $25M whitehat), Bucket-A duplicate-id replay-script merge.

Net rule-pack additions over the 48-hour push: 4 new detectors (ITER-001, CROSS-007, ACC-014, ACC-015), 2 rule-precision extensions (AUTH-100 body-level + ACC-021 init-write exception), 3 corpus snapshots (Raydium AMM v4 architectural reference, plus ITER-001 / CROSS-007 ± fixtures).
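The exact-rule percentages in the stage table follow directly from the hit counts over the 24 mapped incidents. A minimal sketch of that arithmetic, printing one decimal place so the unrounded values are visible (the hit counts come from the table; the function and variable names are illustrative):

```python
# Exact-rule hits per stage, out of 24 mapped incidents (counts from the stage table).
TOTAL_INCIDENTS = 24
STAGES = [
    ("2026-05-16 baseline", 21),
    ("AUTH-100 body-level ext", 22),
    ("ACC-021 init-write exception", 23),
    ("ACC-015 ship (Cashio class)", 24),
]


def rate(hits: int, total: int = TOTAL_INCIDENTS) -> float:
    """Detection rate as a percentage, rounded to one decimal place."""
    return round(100 * hits / total, 1)


for name, hits in STAGES:
    remaining = TOTAL_INCIDENTS - hits
    print(f"{name}: {hits}/{TOTAL_INCIDENTS} ({rate(hits):.1f}%), {remaining} remaining")
```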

Executive summary

SPECTRE now ships the strongest historical-incident replay detection rate in the published Solana static-analysis market — 24 of 24 mapped Solana exploits (Wormhole, Cashio ×2, Mango v3 ×3, Solend ×2, Cypher ×3, Drift v2, Jet v1 ×2, Metaplex CMv1/v2, Metaplex Auction House, Raydium ×2, UXD, SPL Token Lending, …) caught at both exact-rule and architectural-class level. No competitor publishes a comparable benchmark.

The cross-program analysis (8 rules: CROSS-001/002/003/004/005/007/010/020 + CROSS-CDF data-flow tracker) remains genuinely unique substrate; no public competitor reasons across multi-program workspaces. CROSS-007 (delegation of economic backing to a weakly-gated venue) shipped 2026-05-16 and is the only static detector for the UXD / Tulip / yield-aggregator class.

SPECTRE also ships a multi-stage agentic remediation chain (see "Behind" item 2 below for details) that takes static-analysis findings through investigation → testgen → sandboxed pre-fix verification → LLM remediation → sandboxed post-fix verification → report. In the AI-augmented static-analysis dimension, this is comparable to or stronger than Sec3 Premium's auto-auditor, L3X, and Octane.

Where SPECTRE is behind (unchanged from May 16 with one major revision — see item 2):

Distribution — no public install path. Single highest-leverage gap separating "research preview" from "Solana devs use it." Sec3 X-Ray, L3X, Solana Fender all have cargo install + GitHub Action.
AI / LLM augmentation: actually a STRENGTH, not a gap (corrected from prior drafts). SPECTRE ships a multi-stage agentic remediation chain that consumes static findings end-to-end: spectre.investigate → spectre.testgen → spectre.verify_pre → spectre.remediate → spectre.verify_post → spectre.report. The chain is implemented in code/agents/ as a Rust workspace with five Redis-Streams-driven agent binaries, each running on Qwen3.6-35b-a3b (currently); tests and remediation patches are executed in docker-sandboxed workspaces and the verify agents confirm the bug both pre- and post-fix before any patch is accepted. This is meaningfully more than what Sec3 Premium's "auto-auditor" (single-pass LLM triage on findings), L3X (multi-LLM cross-validation of patterns), or Octane (per-PR AI reasoning) ship — SPECTRE actually executes the proposed fix and verifies via sandboxed reproduce-then-fix. See code/agents/README.md for the chain architecture.
Formal verification — Certora Solana Prover remains the only production FV option; SPECTRE is rule-pack + agent-chain only, no formal proofs.
Triage workflow — no SARIF output, no suppression markers, no baseline file.
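To make the SARIF gap concrete, here is a minimal sketch of wrapping findings in a SARIF 2.1.0 log, the format GitHub code scanning consumes for PR annotations. The shape of the input finding (rule_id, path, line, message, level) is an assumption for illustration and is not SPECTRE's actual finding schema; the example finding is made up.

```python
import json


def to_sarif(findings: list[dict], tool_name: str = "spectre") -> dict:
    """Wrap findings in a minimal SARIF 2.1.0 log with one run."""
    results = []
    for f in findings:
        results.append({
            "ruleId": f["rule_id"],
            "level": f.get("level", "warning"),  # SARIF levels: none, note, warning, error
            "message": {"text": f["message"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": f["path"]},
                    "region": {"startLine": f["line"]},
                }
            }],
        })
    rule_ids = sorted({f["rule_id"] for f in findings})
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool_name,
                                "rules": [{"id": r} for r in rule_ids]}},
            "results": results,
        }],
    }


# Hypothetical finding; the field names are assumptions, not SPECTRE's schema.
example_findings = [{
    "rule_id": "ACC-015",
    "path": "programs/example/src/lib.rs",
    "line": 42,
    "message": "Typed account is not tied to the expected parent account.",
    "level": "error",
}]

sarif_log = to_sarif(example_findings)
print(json.dumps(sarif_log, indent=2))
```

A real exporter would also populate rule descriptions and help text under tool.driver.rules; they are omitted here to keep the sketch short.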

Note on Sec3 OwLLM. Sec3 separately ships OwLLM — an open-source LLM trained on millions of historical on-chain transactions (94% MEV detection accuracy). OwLLM is a run-time / TX-monitoring product (alongside Hexagate / Forta), NOT the AI inside X-Ray and NOT a SPECTRE competitor.

The single highest-leverage gap is still distribution. The rule pack and methodology are now demonstrably best-in-market by the only published measurable benchmark; the wrapper around them isn't yet dev-grade.

Competitor inventory (refreshed 2026-05-17)

Static analysis (direct competitors)

Tool	OSS	Layer	Approach	Distribution	Rule count	AI	Published bench
Sec3 X-Ray (github)	yes	LLVM-IR	data-flow + symbolic	cargo install + GH Action + hosted `pro.sec3.dev`	50+	no	no
Sec3 X-Ray Premium (blog)	no	LLVM-IR + AI auto-auditor	data-flow + AI triage / explanation	hosted	50+ + auto-auditor	yes	no
L3X (VulnPlanet/l3x)	yes	Rust AST + LLM	pattern + AI cross-validation	CLI	~20 + LLM-augmented	yes	no
Solana Fender (honey-guard)	yes	Rust AST	pattern	CLI	small / Anchor-only	no	no
Sol-azy (fuzzinglabs)	partial	sBPF disassembly	static + RE	CLI	RE-oriented	no	no
Eloizer (Inversive-Labs)	yes	Rust AST	pattern	CLI	research	no	no
Octane	no	Rust AST + AI	per-PR semantic	hosted	unknown	yes	no
CodeQL / Semgrep	partial	generic	pattern	GH Action + cloud	minimal Solana	no	no
SPECTRE (this)	not yet public	Rust AST + symbol table + cross-program linker + tree-sitter file-walks + 5-stage LLM agent chain	pattern + cross-program trust-posture comparator + TS↔Anchor cross-language + sandboxed reproduce-then-fix	none yet	77 (55 Solana + 22 generic)	yes (Qwen agent chain: investigate / testgen / verify_pre / remediate / verify_post)	yes (100% / 100% on 24-incident replay)

Formal verification

Certora Solana Prover (SCP) (CertoraProver) — decompiles SBF to Certora IR, full formal proofs. Open-sourced. Secures $75B+ in DeFi. Per-property harness authorship required. Different layer from SPECTRE: per-function logic against written spec vs codebase-wide architectural patterns.

Audit firms

OtterSec — 120+ audits, $36B TVL. Formal verification + differential fuzzing + incident response. Internal tooling.
Zellic, Halborn, Trail of Bits, Neodyme — manual + proprietary.

Real-time monitoring (adjacent, not direct)

Hexagate (Chainalysis) — tx simulation + ML, custom Gatelang. Run-time, 75+ chains.
Forta — decentralized monitoring network.

Feature matrix (refreshed)

Dimension	SPECTRE (today)	Sec3 X-Ray open	Sec3 Premium	L3X	Solana Fender	Certora SCP	OtterSec
Solana rules	55	50+	50+	~20	small	n/a	n/a
Native Solana	yes	yes	yes	yes	no	yes	yes
Anchor	yes	yes	yes	yes	yes	yes	yes
Cross-program analysis	8 dedicated rules	no	partial	no	no	per-protocol harness	manual
TS-client ↔ Anchor handler	yes	no	no	no	no	no	manual
AI / LLM (static layer)	5-stage agent chain (investigate / testgen / verify_pre / remediate / verify_post)	no	auto-auditor (single-pass)	multi-LLM cross-validation	no	no	partial
Sandboxed reproduce-then-fix	yes (docker sandbox per stage)	no	partial (auto-auditor doesn't execute)	no	no	n/a	manual
AI / LLM (tx-monitoring layer)	no (out of scope)	no	OwLLM (separate Sec3 product)	no	no	no	no
FV	no	no	no	no	no	yes	yes
Differential fuzzing	no	no	no	no	no	no	yes
Real-time monitoring	no	no	no	no	no	no	no
Historical-incident replay	24/24 exact-rule + class-level (100% / 100%)	none published	none published	none published	none published	none published	none published
Per-rule F1 published	yes	no	no	no	no	no	no
Open source	not yet	yes	no	yes	yes	yes	no
Cargo install	no	yes	n/a	yes	yes	yes	n/a
GitHub Action	no	yes	yes	yes	yes	yes	n/a
SARIF	no	unclear	unclear	unclear	no	no	n/a
Baseline / suppress	no	yes	yes	yes	no	yes	n/a
Hosted UI	no	no	yes	no	no	no	n/a

SPECTRE's genuine differentiators (sharpened)

Five pieces of substrate no public competitor matches:

Cross-program analysis with trust-posture comparator. 8 dedicated rules now: CROSS-001 (trust downgrade) + CROSS-002 (missing program-id verification) + CROSS-003 (Token-2022 extension propagation) + CROSS-004 (account-binding drift across CPI) + CROSS-005 (signer-privilege forwarding) + CROSS-007 (delegation to weakly-gated venue, shipped 2026-05-16) + CROSS-010 (multi-hop chain reasoning) + CROSS-020 (2-hop reentrancy cycle) + CROSS-CDF (write-then-cross-program-read data-flow). Every other public tool reasons program-by-program.
Historical-incident architectural-fingerprint replay at 100% / 100%. Each of 24 distinct mapped Solana incidents (after duplicate-id merge) ships with an architectural_fingerprint of SPECTRE rule IDs that should fire on pre-hack source. The replay benchmark scans the mapped corpus snapshots and reports exact-rule + class-level detection. Every mapped incident now fires at both exact-rule and class-level. No comparable methodology published by any competitor.
TypeScript-to-Anchor cross-language analysis (META-001 + META-002). Traces from a TS client's program.methods.xxx() call into the Anchor handler it invokes; cross-language rules check whether the client's call-site assumptions match the handler's constraints. Unique to SPECTRE.
Framework-agnostic file-walking rule layer. ITER-001 (Jet sentinel-iteration), ACC-014 (Wormhole sysvar), ACC-015 (Cashio untied typed account) all walk .rs files with their own tree-sitter pass and don't depend on the Anchor / native extractor's symbol-table coverage. Catches bugs in pre-Anchor-0.30 programs (Solitaire / anchor_comp), nested-struct shapes the graph-based rules miss, and forks of all of the above. No competitor has a comparable file-walking second-tier.
Multi-stage agentic remediation chain. code/agents/ is a Rust workspace implementing a five-binary chain consuming Redis Streams: pinpoint-investigation-agent (LLM analyzes finding + graph neighborhood + blast radius), pinpoint-testgen-agent (LLM emits a failing test for the finding), pinpoint-verify-agent (sandboxed test execution pre-fix: confirms the bug reproduces), pinpoint-remediation-agent (LLM generates patch), pinpoint-verify-agent (sandboxed test execution post-fix: confirms the bug is gone). The chain currently uses Qwen3.6-35b-a3b; the pinpoint-agent-runtime crate abstracts the LLM client so model swapping is a single config change. Each stage runs in its own docker sandbox, with an idempotency store on Postgres, S3 artifact uploads per stage, and Prometheus metrics on ports 9091-9094. No public competitor tool reproduces this end to end: Sec3 Premium's auto-auditor is single-pass LLM triage on existing findings (no test generation, no sandboxed verification, no patch execution); L3X uses LLMs to cross-validate pattern matches (no remediation); Octane does AI reasoning per PR (no automated verify-pre / verify-post loop). Only OtterSec's internal tooling reportedly does anything comparable, and it isn't public.
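The acceptance rule of the remediation chain (reproduce the bug pre-fix, then confirm it is gone post-fix) can be sketched in a few lines. This is an in-process Python illustration only: the real chain is a set of Rust binaries communicating over Redis Streams, and every function name and signature below is hypothetical. The stubs stand in for the LLM agents and the docker sandbox.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ChainResult:
    accepted: bool
    reason: str


def run_chain(
    finding: dict,
    investigate: Callable[[dict], str],
    testgen: Callable[[dict, str], str],
    remediate: Callable[[dict, str, str], str],
    run_test: Callable[[str, Optional[str]], bool],
) -> ChainResult:
    """Accept a patch only if the test fails before it and passes after it."""
    context = investigate(finding)
    test = testgen(finding, context)

    # verify_pre: the generated test must FAIL on unpatched code,
    # otherwise the bug was never reproduced and there is nothing to fix.
    if run_test(test, None):
        return ChainResult(False, "verify_pre: bug not reproduced")

    patch = remediate(finding, context, test)

    # verify_post: the same test must PASS once the patch is applied.
    if not run_test(test, patch):
        return ChainResult(False, "verify_post: bug still present")

    return ChainResult(True, "reproduced pre-fix, gone post-fix")


def stub_run_test(test: str, patch: Optional[str]) -> bool:
    """Stub sandbox: the test passes only when the good patch is applied."""
    return patch == "good-patch"


result = run_chain(
    {"rule_id": "ACC-015"},
    investigate=lambda f: "context",
    testgen=lambda f, ctx: "failing-test",
    remediate=lambda f, ctx, t: "good-patch",
    run_test=stub_run_test,
)
print(result)
```

The point of the two-sided gate is that a patch is never accepted on the model's say-so: a test that passes on unpatched code proves nothing, and a test that still fails after the patch rejects the patch.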

SPECTRE's actual gaps (ranked)

Tier 1: blocks adoption today (unchanged)

Distribution / install path. Same as 2026-05-16. ~1 week for cargo install pinpoint-spectre + prebuilt binary release + GitHub Action.
SARIF output. ~1 day; PR annotations.
Suppression + baseline. ~1 week; inline // spectre-allow: markers + .spectre-baseline.json for PR-diff scanning.
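One way the suppression and baseline pass could work is sketched below. Only the `// spectre-allow:` prefix and the `.spectre-baseline.json` filename come from the plan above; everything else is an assumption: that the marker is followed by a comma-separated list of rule IDs, that it applies to its own line or the line directly below it, and that a baseline entry is keyed on rule ID, path, and the stripped source line (so that line-number shifts do not invalidate it). The baseline is an in-memory set here rather than a loaded JSON file.

```python
import re

# Assumed marker syntax: "// spectre-allow: RULE-ID[, RULE-ID...]"
ALLOW_RE = re.compile(r"//\s*spectre-allow:\s*([A-Z0-9\-, ]+)")


def allowed_rules(line: str) -> set[str]:
    """Rule IDs suppressed by a marker on this line, if any."""
    m = ALLOW_RE.search(line)
    if not m:
        return set()
    return {r.strip() for r in m.group(1).split(",") if r.strip()}


def filter_findings(findings, sources, baseline):
    """Drop findings that are inline-suppressed or already in the baseline."""
    kept = []
    for f in findings:
        lines = sources[f["path"]]
        idx = f["line"] - 1  # findings use 1-based line numbers
        suppressed = allowed_rules(lines[idx])
        if idx > 0:
            suppressed |= allowed_rules(lines[idx - 1])
        if f["rule_id"] in suppressed:
            continue
        key = (f["rule_id"], f["path"], lines[idx].strip())
        if key in baseline:
            continue
        kept.append(f)
    return kept


# Illustrative inputs; none of this is real scan output.
sources = {"lib.rs": [
    "// spectre-allow: ACC-015",
    "pub collateral: Account<'info, TokenAccount>,",
    "pub vault: Account<'info, TokenAccount>,",
    "pub oracle: AccountInfo<'info>,",
]}
findings = [
    {"rule_id": "ACC-015", "path": "lib.rs", "line": 2},  # marker on line above
    {"rule_id": "ACC-015", "path": "lib.rs", "line": 3},  # present in baseline
    {"rule_id": "ACC-014", "path": "lib.rs", "line": 4},  # new finding
]
baseline = {("ACC-015", "lib.rs", "pub vault: Account<'info, TokenAccount>,")}

new_findings = filter_findings(findings, sources, baseline)
print(new_findings)
```

For PR-diff scanning, only the findings that survive both filters would be reported, which is what keeps a first scan of a legacy codebase from flooding the PR.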

Tier 2: feature parity for serious bake-offs

AI / LLM marketing surface. Not an engineering gap — the agent chain already exists and is materially deeper than what Sec3 Premium / L3X / Octane ship. The gap is visibility: no public docs front the agent chain, the model swap is undecided, no public demo of reproduce-then-fix. ~1-2 weeks of doc + demo work, no new engineering.
Hosted scan UI. Sec3's pro.sec3.dev. ~3-4 weeks.
Bug-bounty marketing channel. Gated on T1.1.

Tier 3: orthogonal capabilities

Formal verification → Certora's layer (open-sourced; "buy not build"). FV integration sketch shipped in spectre-certora-fv-integration-2026-05-16.md.
Differential fuzzing → OtterSec's internal layer. Integration sketch shipped in spectre-differential-fuzzing-integration-2026-05-16.md.
Real-time monitoring → Hexagate's layer. Integration sketch shipped in spectre-hexagate-gatelang-exporter-2026-05-16.md.

Tier 4: substrate followups

Nested #[derive(Accounts)] struct extraction. The Anchor extractor doesn't recurse into nested structs (Cashio's arrow is 3 levels deep). ACC-015 works around this with file-walking, but ACC-013 / ACC-020 / ACC-021 / ACC-030 still have the same blind spot. ~1 week of extractor work would unblock all four.
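The extractor change amounts to recursing into any field whose type is itself an accounts struct. A minimal sketch over a simplified model, where each struct is a list of (field, type) pairs; the struct and field names are hypothetical and do not reproduce Cashio's actual layout, and the real extractor works on Rust syntax rather than on dictionaries:

```python
def flatten_accounts(structs, root, prefix="", _seen=()):
    """Return (dotted_path, type) for every leaf account reachable from root.

    structs maps a struct name to its list of (field_name, type_name) pairs.
    A field whose type is another key of structs is treated as a nested
    accounts struct and expanded recursively.
    """
    if root in _seen:
        raise ValueError(f"cyclic nesting at {root}")
    leaves = []
    for field, type_name in structs[root]:
        path = f"{prefix}{field}"
        if type_name in structs:
            leaves.extend(
                flatten_accounts(structs, type_name, path + ".", _seen + (root,))
            )
        else:
            leaves.append((path, type_name))
    return leaves


# Hypothetical three-level nesting.
structs = {
    "Outer": [("payer", "Signer"), ("middle", "Middle")],
    "Middle": [("inner", "Inner"), ("mint", "Account<Mint>")],
    "Inner": [("collateral", "Account<TokenAccount>")],
}

for path, type_name in flatten_accounts(structs, "Outer"):
    print(path, type_name)
```

With the leaves flattened to dotted paths, graph-based rules could treat a deeply nested account the same way as a top-level one.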

What's actually defensible today

A protocol team's path to value with SPECTRE today:

For continuous architectural-pattern detection in CI. The 100% / 100% replay coverage on the only published Solana benchmark is the strongest reproducible number in the market. No other tool has published a detection rate on the historical Solana exploit corpus.
For multi-program workspaces (Kamino, Drift v2, Jet v1, Cypher, Cardinal, MarginFi v2, Squads, Metaplex). 8 cross-program rules catch composition bugs structurally invisible to single-program tools.
For Anchor + TypeScript client codebases. META-001 / META-002 cross-language analysis is unique.
For pre-Anchor-0.30 / Solitaire / anchor_comp programs. File-walking rules (ITER-001, ACC-014, ACC-015) reach programs that the Anchor symbol-table extractor doesn't fully model.
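The first step of the TS-client-to-Anchor tracing behind META-001 / META-002, linking a program.methods.xxx() call site to the handler it invokes, can be illustrated as follows. This sketch uses regular expressions instead of a tree-sitter pass and only checks that a matching handler exists; it does not compare call-site assumptions against handler constraints. It relies on Anchor's TypeScript client exposing snake_case handlers under camelCase method names. All function names here are hypothetical.

```python
import re

# TS call sites of the form `<anything>.methods.<name>(`
TS_CALL_RE = re.compile(r"\.methods\s*\.\s*([A-Za-z_]\w*)\s*\(")
# Rust handlers of the form `pub fn <name>(<ident>: Context<`
RUST_HANDLER_RE = re.compile(
    r"pub\s+fn\s+([a-z_]\w*)\s*\(\s*\w+\s*:\s*Context\s*<"
)


def camel_to_snake(name: str) -> str:
    """initializeVault -> initialize_vault"""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def unmatched_calls(ts_source: str, rust_source: str) -> list[str]:
    """TS method names with no same-named Anchor handler in the Rust source."""
    handlers = set(RUST_HANDLER_RE.findall(rust_source))
    missing = []
    for method in TS_CALL_RE.findall(ts_source):
        if camel_to_snake(method) not in handlers and method not in missing:
            missing.append(method)
    return missing


ts_source = """
await program.methods.initializeVault(bump).accounts({ vault }).rpc();
await program.methods.withdrawAll().accounts({ vault }).rpc();
"""
rust_source = """
pub fn initialize_vault(ctx: Context<InitializeVault>, bump: u8) -> Result<()> {
    Ok(())
}
"""

print(unmatched_calls(ts_source, rust_source))
```

A production linker would resolve the IDL rather than match names textually, but the mapping step is the same: every client call either lands on a handler or is flagged.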

Honest read (revised)

The 2026-05-16 read said "rule pack is at or above market; distribution is the only thing separating research preview from Solana devs use it." With the 24-hour push that closed the replay benchmark to 100% / 100%, the rule pack is now demonstrably best-in-market by the only published measurable benchmark. Distribution is still the only gating item.

Major revision (this session): SPECTRE already has AI augmentation, and it's deeper than the visible competitors. The 5-stage agent chain (code/agents/) takes static findings through investigation, testgen, sandboxed pre-fix verification, LLM remediation, and sandboxed post-fix verification. This is more than Sec3 Premium's auto-auditor (single-pass LLM triage), L3X's multi-LLM cross-validation (no remediation), or Octane's per-PR reasoning (no automated verify loop). The agent chain runs Qwen3.6-35b-a3b today; the runtime abstraction makes model swapping a config change. Sec3's OwLLM is a separate transaction-pattern LLM (MEV detection, TX monitoring) that lives one layer over in the run-time space and isn't a SPECTRE competitor.

What's actually missing on the AI front:

The agent chain isn't part of the published competitive narrative. Internal product, no marketing surface yet. The competitor docs all front their AI features; SPECTRE doesn't.
Public model choices. Qwen3.6-35b-a3b is a defensible workhorse but the marketing-friendly choice (Claude 4.x, GPT-5, Gemini 2.5) might shift perception. Runtime abstraction makes this a config-change, not engineering work.

The historical-incident replay methodology asset is now load-bearing for SPECTRE's positioning. Publishing the methodology (under documents/audits/methodology/, with a reproducible runner/replay_incidents.py and corpus manifest schema), and ideally challenging competitors to score against the same corpus, would convert the metric into a market standard. Today no other tool can claim any detection rate on the SPECTRE corpus, because no other tool has been measured against it.
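For a published methodology, the scoring step would need an unambiguous definition. The sketch below shows one possible definition and is not the contents of runner/replay_incidents.py. It assumes that an exact-rule hit means every rule ID in the incident's architectural_fingerprint fired, and that a class-level hit means every fingerprint rule has at least one fired rule from the same family (the prefix before the first hyphen). The manifest layout and the fired-rule sets are made up for illustration and are not benchmark results.

```python
def rule_class(rule_id: str) -> str:
    """Rule family, e.g. 'ACC' for 'ACC-015' (assumed naming convention)."""
    return rule_id.split("-", 1)[0]


def score_incident(fingerprint, fired):
    """Return (exact_hit, class_hit) for one incident."""
    fired = set(fired)
    exact = set(fingerprint) <= fired
    fired_classes = {rule_class(r) for r in fired}
    class_level = all(rule_class(r) in fired_classes for r in fingerprint)
    return exact, class_level


def score_corpus(manifest, fired_by_incident):
    """Aggregate exact-rule and class-level hits over a corpus manifest."""
    totals = {"total": 0, "exact": 0, "class": 0}
    for incident in manifest:
        fired = fired_by_incident.get(incident["id"], set())
        exact, class_level = score_incident(
            incident["architectural_fingerprint"], fired
        )
        totals["total"] += 1
        totals["exact"] += int(exact)
        totals["class"] += int(class_level)
    return totals


# Illustrative manifest and scan output; not real benchmark data.
manifest = [
    {"id": "cashio", "architectural_fingerprint": ["ACC-015"]},
    {"id": "wormhole", "architectural_fingerprint": ["ACC-014"]},
    {"id": "jet", "architectural_fingerprint": ["ITER-001"]},
]
fired_by_incident = {
    "cashio": {"ACC-015"},
    "wormhole": {"ACC-021"},  # same family only: class-level hit, exact miss
    "jet": {"ITER-001", "AUTH-100"},
}

print(score_corpus(manifest, fired_by_incident))
```

Fixing these definitions in the published schema is what would let a competitor's scanner be scored on the same corpus with comparable numbers.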

References (refreshed 2026-05-17)

Sec3 X-Ray (GitHub)
Sec3 X-Ray Premium (auto-auditor)
Sec3 OwLLM v1.0 announcement (separate TX-pattern LLM product)
L3X AI Static Analyzer
Solana Fender
Sol-azy
Certora Solana Prover
OtterSec
Hexagate (Chainalysis)
Solana Security Toolbox 2026 (DEV community)
Awesome Solana Security (az0mb13)
Internal: spectre-historical-incident-replay-2026-05-15.md (current numbers)
Internal: spectre-exact-rule-gap-analysis-2026-05-16.md (bucket-by-bucket trajectory)
Internal: spectre-distribution-roadmap-2026-05-16.md (tier-1+2 ticket roadmap)
Internal: code/agents/README.md (the 5-stage agent chain implementation)