ROUND R19 · MODULUM × MTP + INFERENCE STACK
2026-05-11 · 6 streams · 4/5 R1 + 3/5 R2 substantive · Codex max-budget · ~$25-40 spend
R18 closed the architecture (Modulum as control plane). R19 resolved the central technical question — how to compound 3× MTP and 3× Modulum on the same decode pass. Codex's R2 Substrate Delta Masks synthesis (M_shared + ΔM_k) reconciles the shared-mask vs per-head debate cleanly. Modulum API live; Forge healthy; Google JV becomes the R20 push.
Codex R2 collapsed the shared-mask vs per-head debate into one architecture. This is the R19 technical breakthrough.
R1 surfaced three candidate paths: shared-Modulum-mask (Claude) · per-head Modulum draft (Stream 1 path 2) · substrate-aware draft training with JV partner (Stream 1 path 3). Grok's R1 critique: "MTP changes the geometry of the attention mask Modulum must optimize, forcing architectural re-derivation rather than simple multiplication of speedups." Codex's R2 delta-mask synthesis reconciles both: shared mask for proximate positions; deltas for distal / uncertain positions.
Compound math under delta masks: MTP's ~3× draft speedup and Modulum's ~3× sparse-attention speedup multiply only if draft acceptance survives mask sparsification. M_shared + ΔM_k preserves acceptance on shared (proximate) positions, which is what makes the ~9× compound decode target structurally achievable.
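A back-of-envelope sketch of how the speedups compound, assuming 3× from MTP and 3× from Modulum (the round's headline numbers); the acceptance-penalty form here is an illustrative model, not a measured result:

```python
# Illustrative compound-speedup model. The 3x figures come from this round's
# summary; the linear acceptance penalty is an assumption for sketching only.

def compound_speedup(s_mtp: float, s_modulum: float, acceptance: float) -> float:
    """Effective decode speedup when MTP drafting and Modulum sparse
    attention share one decode pass.

    s_mtp:      ideal multi-token-prediction speedup at 100% acceptance
    s_modulum:  attention-sparsification speedup
    acceptance: fraction of drafted tokens the verifier accepts
    """
    # Rejected draft tokens cost verification work without advancing decode,
    # so the MTP factor degrades toward 1x as acceptance falls.
    effective_mtp = 1.0 + (s_mtp - 1.0) * acceptance
    return effective_mtp * s_modulum

# Full acceptance keeps the compound at 3 x 3 = 9x; an acceptance collapse
# under an over-sparsified mask (Grok's R1 concern) erodes the MTP factor.
print(compound_speedup(3.0, 3.0, 1.0))
print(compound_speedup(3.0, 3.0, 0.5))
```

This is why the delta masks matter: they protect acceptance on the positions the draft heads actually rely on, instead of letting sparsification eat the MTP factor.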
R20 prototype proves M_shared + ΔM_k on Gemma 4 Q4_K_M with ≥8× decode at 128k, ≤2% BABILong qa1 regression. Production rollout follows. JV with Google for Gemma 5 ships substrate-policy training defaults.
Cross-pollinated R2 verdicts. Codex's reframes resolved Stream 1 architecturally; Grok's pick won Stream 6 with Claude conceding.
| Stream | R2 Verdict | Key resolution |
|---|---|---|
| 1 — MTP × Modulum compound | sound · substrate delta masks | Codex M_shared + ΔM_k synthesizes shared + per-head. Compound 3N× achievable. |
| 2 — Hypernym Router | sound · inference-improving | Stop saying cheaper; say better. Retention SLA marketplace as Y2 evolution. |
| 3 — Reasoning-state arch | sound · obligation integrity | Codex tightening: obligation tracking is canonical Day-1 use case. Cursor as partner. |
| 4 — Forge deployment toolkit | sound · Y1 benchmark+integrate; Y1.5 deploy | Phased scope (Codex partial → sound with phases). |
| 5 — Cross-model · M5 Compiler | sound · substrate policy + reverse-compiler moat | Codex substrate-policy reframe + Grok M5-in-reverse for training-data moat. |
| 6 — World model | pivot-resolved · materials Y1 · econ+gov Y2 | Materials wins Y1 on revenue + Modulum fit; econ + governance are Y2 fast-validation. |
M_shared + ΔM_k as the MTP × Modulum compound architecture. Codex R2 origin; Claude R2 adopted; Grok R1 critique resolved. R19 produced the architecture spec. R20 productizes — JV pitch, prototype, materials science pilot, Router launch, Cursor partnership.
Hypernym brings: substrate policies + BABILong validation + retrosynthesis benchmark planning. Google brings: training pipeline + Gemma 5 weights + revenue share on Hypernym-verified calls. Year-1 LOI.
Prove M_shared + ΔM_k architecture achieves ≥8× decode at 128k on Gemma 4 Q4_K_M with ≤2% BABILong qa1 regression. Acceptance-aware shared-mask MTP.
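The R20 acceptance criteria above reduce to a simple gate; a sketch (the function name and the reading of "≤2%" as absolute percentage points on qa1 accuracy are assumptions):

```python
# Hypothetical R20 acceptance gate: pass only if decode speedup at 128k
# context is >= 8x AND BABILong qa1 accuracy drops by <= 2 points
# (interpreting the "<=2% regression" criterion as absolute points).

def r20_gate(decode_speedup_128k: float,
             babilong_qa1_baseline: float,
             babilong_qa1_candidate: float) -> bool:
    regression = babilong_qa1_baseline - babilong_qa1_candidate
    return decode_speedup_128k >= 8.0 and regression <= 2.0
```

Both conditions are hard requirements: a 9× decode win that costs 3 qa1 points fails the gate just as surely as a 7× run with zero regression.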
Pfizer-AI / Eli Lilly / Recursion / Insilico for retrosynthesis-planning using Modulum-augmented Gemma. 7-figure annual + per-hit fee. Retrosynthesis-Bench-Hypernym v1 publication.
Inference-improving router + Retention SLA marketplace. Free tier + Pro $20/mo + Enterprise wholesale. Distribution: GitHub + pip + base_url swap.
Reasoning State SDK into Cursor's Composer/Agent mode. $0.05-0.20 per traced action; Cursor markets as Premium tier. ~100k paying developers in Cursor base.
Google 50-65% · xAI 25-35% · Meta 20-30% · DeepSeek 10-20% · Anthropic 5-15%. Pursue Google first; substrate-policy training run starts ~Q3 2026.
The architectural answer to Stream 1. M_shared + ΔM_k · most positions use shared mask; deeper/uncertain heads receive horizon-specific delta additions. R20 must prove ≥8× decode with ≤2% regression.
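A minimal sketch of the mask assembly, assuming boolean attention masks and treating each per-head delta as an additive override on the shared mask (shapes and names are illustrative, not the spec):

```python
import numpy as np

def head_mask(m_shared: np.ndarray, delta_k: np.ndarray) -> np.ndarray:
    """Attention mask for draft head k: M_k = M_shared OR ΔM_k.

    m_shared: (T, T) boolean mask reused by every head (proximate positions)
    delta_k:  (T, T) boolean additions for this head's horizon
              (distal / uncertain positions the shared mask pruned)
    """
    return m_shared | delta_k

# Toy example: 4-position causal mask; the shared mask prunes one distal key,
# and the deeper head's delta re-admits it for its horizon.
T = 4
m_shared = np.tril(np.ones((T, T), dtype=bool))
m_shared[3, 0] = False                 # shared mask drops a distal position
delta_3 = np.zeros((T, T), dtype=bool)
delta_3[3, 0] = True                   # head 3's delta restores it
mask_3 = head_mask(m_shared, delta_3)
```

The design point: most heads pay only for M_shared (one mask to optimize and cache), while deltas stay sparse, so per-head cost grows with what each horizon actually needs rather than with T².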
Expose per-head acceptance rates (head 1: 99% · head 4: 28%). Internal architecture tool + enterprise diagnostic. "Your workload loses long-context coherence at future-token horizon 3 because entity anchors are under-retained."
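The per-head acceptance readout could be computed from verifier logs roughly like this (a sketch; the `(head, accepted)` event format is an assumption about the log shape):

```python
from collections import defaultdict

def per_head_acceptance(events):
    """events: iterable of (head_index, accepted: bool) pairs emitted by the
    verifier. Returns {head_index: acceptance_rate}."""
    totals = defaultdict(int)
    accepts = defaultdict(int)
    for head, accepted in events:
        totals[head] += 1
        accepts[head] += int(accepted)
    return {h: accepts[h] / totals[h] for h in totals}

# Head 1 accepting 99/100 drafts and head 4 accepting 28/100 reproduces the
# "head 1: 99% · head 4: 28%" readout quoted above.
```

Binning the same events by rejection *position* instead of head index gives the diagnostic sentence in the pitch ("loses coherence at future-token horizon 3").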
When HR falls back to Claude / GPT, preprocess context around obligations (constraints, decisions, test duties, unresolved assumptions) rather than semantic summaries. Hypernym extracts value even when routing to frontier APIs.
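One minimal way to preprocess context around obligations rather than semantic summaries; the keyword heuristic below is an illustrative stand-in for whatever extractor Hypernym actually ships:

```python
import re

# Hypothetical obligation markers covering the four categories named above
# (constraints, decisions, test duties, unresolved assumptions). A real
# extractor would be learned, not keyword-based.
OBLIGATION_MARKERS = re.compile(
    r"\b(must|shall|constraint|decided|TODO|assume[sd]?|unresolved|test)\b",
    re.IGNORECASE,
)

def obligation_filter(context: str) -> str:
    """Keep only sentences carrying an obligation signal, preserving order."""
    sentences = re.split(r"(?<=[.!?])\s+", context)
    return " ".join(s for s in sentences if OBLIGATION_MARKERS.search(s))
```

The point of the design: the filtered context is still verbatim source text, so the downstream frontier model sees the actual constraints rather than a lossy paraphrase of them.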
Honorable mentions: Substrate Policy Marketplace (Claude R2 — per-domain licensed substrate policies) · Retrosynthesis-Bench-Hypernym (Claude R2 — first public multi-step retrosynthesis benchmark) · M5 Compiler in REVERSE (Grok R1 — generates synthetic long-context training data Y2 moat) · MTP Acceptance Insurance v2 (Codex+Claude — adaptive decode depth degrades gracefully when uncertainty rises) · Modulum as differentiable attention curriculum for continued pre-training (Grok R1).
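The MTP Acceptance Insurance v2 idea above (adaptive decode depth that degrades gracefully) can be sketched as a controller that shrinks the draft horizon when rolling acceptance drops; the thresholds and bounds here are made up for illustration:

```python
def adapt_draft_n(current_n: int, rolling_acceptance: float,
                  min_n: int = 1, max_n: int = 4,
                  low: float = 0.5, high: float = 0.9) -> int:
    """Draft fewer tokens when the verifier rejects more; re-expand when
    acceptance recovers. Hysteresis band (low, high) avoids oscillation."""
    if rolling_acceptance < low:
        return max(min_n, current_n - 1)
    if rolling_acceptance > high:
        return min(max_n, current_n + 1)
    return current_n
```

At min_n=1 the system degrades to plain single-token decode, so the worst case under rising uncertainty is losing the MTP speedup, never losing correctness.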
M_shared + ΔM_k (mathematical validation) · mlx_vlm CLI fix · 200 OK / 1.2s smoke test. Model: gemma-4-31B-it-Q4_K_M.gguf. Active draft_n=2 speculative drafting with 100% acceptance on short prompts. The production architecture is the live test bed for Substrate Delta Mask validation.
All services up. Zero drift. Zero orphans. Memory router 10/12 healthy. Ready to deploy Modulum benchmarks + customer integrations through forge benchmark / integrate.
Per memory rule: Codex catches blockers + surfaces unifying frames others miss.
M_shared + ΔM_k + Rejection-Position Profiler + Obligation-Preserving Context Compiler. Synthesis adopts every Codex anchor each round. The compound-9× target Chris flagged is structurally achievable because Codex resolved the architectural debate cleanly.