Multi-Agent Technical Review Framework

39 experts.
One audit.
No mercy.

Drop in your codebase, spec, architecture decision, landing page, or launch plan. Get a structured adversarial review from 39 domain experts across engineering, product, design, data, AI, safety, and go-to-market. Each expert audits independently and takes a position; the orchestrator then synthesizes their findings into actionable recommendations. Before any audit begins, an intent-classification phase picks the 5–10 members relevant to the work.

39 domain experts · 7 protocol phases · 4 priority levels · 0 silent passes allowed

Protocol

Seven phases.
No shortcuts.

00 Phase 0

Intent Classification

Before any audit, the orchestrator classifies the target — artifact type, work dimensions, risk surfaces, audience and locale. Ambiguity triggers a clarifying question, not a guess. Wrong classification here ripples through everything that follows.
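
As a rough illustration of the "clarifying question, not a guess" rule, the sketch below models the Phase 0 output as a small record and asks about any field left unresolved. The `Classification` class, its field names, and `clarifying_question` are hypothetical names chosen for this example; they are not taken from the Luminary prompt files.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Classification:
    """Hypothetical record of what Phase 0 decides about the target."""
    artifact_type: Optional[str] = None  # e.g. "codebase", "spec", "landing page"
    work_dimensions: list[str] = field(default_factory=list)
    risk_surfaces: list[str] = field(default_factory=list)
    audience: Optional[str] = None
    locale: Optional[str] = None


def clarifying_question(c: Classification) -> Optional[str]:
    """Return a question for the user if any field is unresolved, else None."""
    # An empty string, empty list, or None all count as "not yet classified".
    missing = [name for name, value in vars(c).items() if not value]
    if not missing:
        return None
    return "Before the audit starts, please clarify: " + ", ".join(missing)
```

In this sketch an unresolved field blocks the audit until the user answers, rather than letting the orchestrator fill in a default.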

01 Phase 1

Frame & Scope

The orchestrator restates the request, confirms the Phase 0 classification with the user (one correction window), and locks the roster of 5–10 relevant members.

02 Phase 2

Independent Audit

Each member audits from their domain lens alone. No cross-member coordination. No groupthink. No "I agree with the previous comment."

03 Phase 3

Red Flag Declaration

One blocking red flag per member, maximum. Forced prioritization. Every claim must cite a specific artifact — code, spec, or design doc.

04 Phase 4

Adversarial Clash

Members with conflicting positions debate directly. The steelman rule is enforced: you argue the opponent's position charitably before any rebuttal. Excluded members can be pulled in mid-clash if the conflict touches their domain.

05 Phase 5

Synthesis

The orchestrator resolves conflicts into a structured recommendation matrix. Every recommendation gets a clear owner and a verification path.
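
One possible shape for the recommendation matrix is sketched below, assuming each row carries a priority, a finding, an owner, and a verification path. The `Recommendation` class and `build_matrix` function are illustrative names, not part of the framework itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    """One row of a hypothetical recommendation matrix."""
    priority: str       # "P0" through "P3"
    finding: str
    owner: str
    verification: str   # how the fix will be checked

    def __post_init__(self) -> None:
        # The synthesis requires both fields; reject rows that leave either blank.
        if not self.owner.strip():
            raise ValueError(f"no owner for: {self.finding}")
        if not self.verification.strip():
            raise ValueError(f"no verification path for: {self.finding}")


def build_matrix(recs: list[Recommendation]) -> list[tuple[str, str, str, str]]:
    """Return rows ordered by priority so P0 items come first."""
    ordered = sorted(recs, key=lambda r: r.priority)
    return [(r.priority, r.finding, r.owner, r.verification) for r in ordered]
```

Rejecting a row at construction time means a vague item such as "Consider improving X" with no owner never reaches the matrix.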

06 Phase 6

Iteration

Drill in on any finding. Request re-audits from specific members. Ask the team to respond to new information or implementation decisions.

Priority Framework

Every finding gets
a clear level.

Luminary doesn't produce a flat list of concerns. Every finding is classified by severity so you know exactly what to fix now vs. what to track for later.

P0 · BLOCKER

Stop Everything

Unsafe, incorrect, or irreversible. Work does not proceed until this is resolved. No exceptions, no deferral.

P1 · CRITICAL

Significant Risk

Deferral requires orchestrator approval and a documented rationale. The risk stays on the table — not buried.

P2 · IMPORTANT

Quality Gain

Meaningful robustness or quality improvement. Defer to the next phase with a tracked ticket and an owner.

P3 · IMPROVEMENT

Future Work

Non-blocking enhancement. Goes in the backlog. Doesn't delay the current work, but doesn't disappear either.
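
The four levels can be read as a gating rule on whether work may continue. The sketch below encodes that reading in Python; the `Finding` fields and the `can_proceed` function are assumptions made for illustration, and the real orchestrator expresses these rules in prompt text rather than code.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Priority(IntEnum):
    P0 = 0  # blocker
    P1 = 1  # critical
    P2 = 2  # important
    P3 = 3  # improvement


@dataclass
class Finding:
    title: str
    priority: Priority
    resolved: bool = False
    orchestrator_approved: bool = False  # relevant to P1 deferral
    rationale: Optional[str] = None      # relevant to P1 deferral
    ticket: Optional[str] = None         # relevant to P2 deferral
    owner: Optional[str] = None          # relevant to P2 deferral


def can_proceed(f: Finding) -> bool:
    """True if work may continue with this finding in its current state."""
    if f.resolved:
        return True
    if f.priority is Priority.P0:
        return False  # no exceptions, no deferral
    if f.priority is Priority.P1:
        return f.orchestrator_approved and bool(f.rationale)
    if f.priority is Priority.P2:
        return bool(f.ticket) and bool(f.owner)
    return True  # P3 goes to the backlog without blocking
```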

The Team

39 experts.
Every domain covered.

Each agent brings a defined domain, a distinct personality, explicit conflict vectors, and a signature challenge question. You don't need all 39 — Phase 0 picks the 5–10 that cover your highest-risk areas.

Linus Torvalds
Arch
Architecture & Maintainability
Modularity, justified complexity. Despises premature generalization and framework-driven development.
Premature generalization — abstraction layer with no current second user
John Carmack
Perf
Performance & Optimization
Hot paths, memory layout, Big-O analysis. Demands benchmarks, not intuitions. Respects simplicity that actually performs.
O(n²) in hot paths or "fast enough" without supporting data
Grace Jansen
DX
Developer Experience & Tooling
Onboarding ergonomics, readable code, docs that don't require tribal knowledge. Useful error messages.
"Ask Sarah" documentation — knowledge not in the repo
Arnaud Lauret
API
API Design & Governance
Interface consistency, naming coherence, RFC correctness, leaky abstraction detection. Consumer-first thinking.
Inconsistent naming or implicit contracts between endpoints
Don Norman
UX
UX & Interaction Design
User mental models, affordance, feedback loops. Exposes when "features" are actually usability traps.
Error states with no recovery path or actions with no undo
Julie Zhuo
UI
UI & Visual Design Systems
Visual hierarchy, component consistency, design token discipline. Flags pixel-level and systemic inconsistencies.
Similar-looking components that behave differently
Joe Celko
SQL
SQL & Data Modeling
Schema correctness, normalization, NULL semantics. Treats the data model as the foundation everything else inherits from.
Tables without primary keys or NULLs with business meaning
Martin Kleppmann
Dist
Distributed Systems & Data
Event sourcing, consistency guarantees, replication lag, idempotency. Distributed complexity is irreducible — simplifying hides it.
"Exactly-once" semantics claimed without idempotency proof
Eric Evans
DDD
Domain Modeling & DDD
Bounded contexts, ubiquitous language, aggregate design. Code vocabulary must match domain expert language exactly.
Anemic domain models or primitive obsession masking domain concepts
Steve Jobs
Prod
Product Quality & Customer Experience
Whether it's genuinely great — not feature-complete, but inevitable. Contemptuous of engineering convenience over customer experience.
Abstraction the customer can see; compromise disguised as pragmatism
James Bach
QA
Testing & QA Strategy
Whether tests find real bugs — not coverage metrics or CI green. Distinguishes checking (automated) from testing (skilled investigation).
High coverage but catches nothing; testing the mock not the behavior
Bruce Schneier
Sec
Security & Threat Modeling
Crypto correctness, auth/authz, secrets management. Quiet, methodical, devastating. Reduces security theater to actual threat models.
Auth logic not reviewed first; secrets in logs or URLs
Andrej Karpathy
AI/ML
AI/ML & LLM Integration
LLM integration correctness, prompt injection, model evaluation rigor, hallucination failure modes. Demands benchmarks over intuitions.
LLM features with no evals; user input in prompts without injection analysis
Charity Majors
Infra
Infrastructure & Observability
Deployment pipelines, structured telemetry, SLOs, incident debuggability. Will not accept "we'll add logging later."
No structured instrumentation; "we'll know if it breaks because users tell us"
Marcy Sutton
A11y
Accessibility & Inclusive Engineering
WCAG compliance, keyboard navigation, screen reader semantics. Treats accessibility regressions as bugs. Will open a screen reader and audit live.
Interactive components with no keyboard spec or color as sole state differentiator
Ann Cavoukian
Privacy
Privacy & Data Governance
Privacy by Design — not bolted on after. Data minimization, purpose limitation, PII handling. Audits against her own seven Privacy by Design principles.
Collection without documented retention/deletion policy; PII in logs
Edward Tufte
Viz
Data Visualization & Information Design
Data-ink ratio, chartjunk elimination, information density. Withering on aesthetic flourish that reduces information density.
Pie charts for more than 2 categories; dual-axis implying false correlation
Hadley Wickham
Data Sci
Data Science & Analytics Pipelines
Tidy data principles, grammar of graphics, pipeline reproducibility and legibility. Testable and extensible pipelines.
Transformations that can't be unit-tested; data that changes shape mid-pipeline
Andrew Gelman
Stats
Statistical Rigor & Inference
Metrics measuring what they claim, A/B test power, false positive rates. Quietly devastating about overconfident inference.
"Significant" without power analysis; metrics with no confidence interval
David Ogilvy
Copy
Advertising & Brand Copywriting
Headline craft, specific promises, facts over adjectives. "If it doesn't sell, it isn't creative." Research before writing. Brand as long-term asset.
Headlines that could belong to any brand; cleverness that obscures the offer
Seth Godin
Mkt
Marketing Strategy & Permission
Remarkable products, smallest viable audience, permission over interruption. "Marketing is a tax paid by unremarkable products."
Shouting louder instead of making something worth talking about; targeting "everyone"
April Dunford
Position
Positioning & Go-to-Market Strategy
Best-at-something-specific for a defined best-fit customer, against named competitive alternatives. Allergic to vague, category-generic copy.
Positioning that could belong to any competitor in the category
Ann Handley
Content
Content Marketing & Business Writing
Reader-first clarity, voice consistency, useful-before-promotional. Treats jargon and buried ledes as disrespect for the reader.
Jargon, buried ledes, voice that differs across surfaces
Rory Sutherland
Behavior
Behavioral Marketing & Persuasion
Framing, signaling, defaults, perception engineering. The irrational-but-real drivers of human choice that spreadsheets miss.
Decisions that assume a rational consumer; removing friction that was carrying meaning
Torrey Podmajersky
Microcopy
UX Writing & Microcopy
Interface voice, error messages, empty states, destructive-action clarity. Treats every string as a UX decision, not a style preference.
Errors that describe what broke but not what to do next; generic "OK/Confirm" buttons on destructive actions
Ralph Kimball
Warehouse
Dimensional Modeling & Data Warehousing
Facts and dimensions, grain discipline, slowly-changing dimensions, conformed dimensions across marts. Analytics modeling on its own terms.
Fact tables where the team can't state the grain in one sentence; unconformed dimensions across marts
Matthew Butterick
Type
Typography
Type scale, measure, leading, hierarchy, web font loading. Bad typography is a tax readers pay on every sentence — and it's invisible until it isn't.
Body text under ~16px; measures over ~90ch; line-heights at 1.0–1.2 on body copy
Peter Morville
IA
Information Architecture
Taxonomy, labeling, navigation, findability, URL structure. IA is the architecture of shared understanding between the system and its users.
Navigation labels that need tooltips to disambiguate; "Other" or "Misc" categories doing heavy lifting
Teresa Torres
Discovery
Product Discovery & Continuous Research
Opportunity-solution trees, outcomes over outputs, weekly customer touchpoints, assumption testing. The counterweight to intuition-led product taste.
Roadmap items with no stated customer opportunity; outcomes framed as outputs
John Allspaw
Resilience
Resilience & Safety Engineering
Adaptive capacity, near-miss analysis, graceful degradation, incident review quality. Reliability is the presence of recovery, not the absence of failure.
Incident reviews that conclude with "human error" as the root cause; runbooks never executed in a real incident
Timnit Gebru
AI Ethics
Responsible AI & Algorithmic Harm
Disparate impact, training data provenance, model cards, subgroup performance, labor behind datasets, deployment context vs. training context.
Model shipped without a model card, training data provenance, or subgroup performance numbers
Alex Russell
Web Perf
Web Performance & Frontend Platform
JS payload, main-thread time, hydration cost, Core Web Vitals on mid-tier Android over real mobile networks. Frontend perf as an ethical obligation.
No JS/LCP budget enforced in CI; performance claims based only on desktop devtools
Daniele Procida
Docs
Technical Writing & Docs Architecture
Diátaxis — tutorials, how-to, reference, explanation — kept distinct. Documentation as a structural problem, not a writing problem.
Docs site with no tutorial, or how-to guides mixed with explanations such that readers can't execute
Kat Holmes
Inclusive
Inclusive Design
Mismatch between human ability and product assumption — permanent, temporary, and situational. "Solve for one, extend to many." Distinct from WCAG compliance.
Personas that share core abilities, languages, and contexts; design research only with users who look like the team
Val Head
Motion
Interface Motion Design
Easing, choreography, loading states, `prefers-reduced-motion`, functional vs. decorative animation. Motion that explains, not motion that impresses.
prefers-reduced-motion unhandled; state transitions where the user can't see what changed
Heather Meeker
OSS/IP
Open-Source Licensing & IP
License compatibility, AGPL/GPL exposure, SBOMs, trademark clearance, model/dataset license terms. Obligations that attach silently and surface late.
AGPL code in a SaaS product without compliance analysis; no SBOM or SBOM not refreshed per release
Paula Scher
Brand
Brand Identity Design
Logotype, mark, color system, identity coherence across surfaces. Brand is not a logo; brand is a coherent visual argument extending to a hundred surfaces.
Identity that lives only on the marketing site while product UI uses different type, color, and voice
Shawn Wang (Swyx)
DevRel
Developer Relations & Community
First-run experience, public signal, learn-in-public loops, starter templates, community surfaces. DevRel as product feedback, not content calendar.
No measurable "signup to first success" time; community questions older than a week unanswered
John Yunker
Global
Localization & Global Design
i18n architecture, l10n quality, global gateway, bidirectional text, CJK layout, cultural assumptions baked into the "default" user.
Strings hard-coded in the UI; no plural rule handling; text containers sized for English only

How to Use It

Four ways to
run the team.

Luminary works in any LLM chat or Claude project with a large enough context window. Pick the invocation that matches how much you already know about what you need reviewed.

A

Default

Paste luminaryPrompt.md and your target. The orchestrator runs Phase 0 intent classification from scratch and picks the relevant 5–10 members.

# Let the orchestrator decide
System: [luminaryPrompt.md]
User: [target + ask]
→ Phase 0 picks the team

B

Invocation Mode

Prefix your first message with a mode to start with a preset roster. Phase 0 still runs and can add members — modes never silently drop anyone. See the full list below.

# Example: design review
System: [luminaryPrompt.md]
User: /luminaryReview:design
[target + ask]
→ Preset starts the roster

C

Single Agent

Load one agent*.md file for a focused single-domain review. When you know the problem domain and want one rigorous lens on it.

# Example: security-only
System: [agentBruceSchneier.md]
User: [auth code]
→ Deep single-domain audit

D

Custom Roster

Hand-pick 3–7 agents whose domains cover your highest-risk areas. Paste the orchestrator plus the selected agent files and name the team in your first message.

# Example: API launch
System: [luminaryPrompt + agents]
User: "Use only Lauret, Schneier, Celko..."
→ Focused multi-domain

Invocation Modes

Shortcut presets
for common reviews.

Start your first message with a mode to pin a starting roster. Phase 0 still runs — it can add members based on risk surfaces and tag matches, but it cannot silently drop pinned members. Modes are text conventions, so they work in any LLM chat.

/luminaryReview
Default — no preset, full Phase 0 selection from scratch.
Orchestrator picks 5–10 members
/luminaryReview:architecture
System design, modularity, and maintainability trade-offs.
Torvalds · Evans · Kleppmann · Carmack · Lauret · Majors · Allspaw
/luminaryReview:backend
Backend services with data and distributed concerns.
Torvalds · Celko · Kleppmann · Evans · Carmack · Majors
/luminaryReview:api
External API contracts, governance, and security.
Lauret · Schneier · Carmack · Kleppmann · Celko
/luminaryReview:frontend
Web UI craft — perf, UI systems, a11y, motion.
Russell · Zhuo · Sutton · Norman · Head · Jansen
/luminaryReview:design
Visual, interaction, IA, type, identity, inclusive design.
Norman · Zhuo · Butterick · Scher · Head · Morville · Holmes
/luminaryReview:ux
End-to-end experience, flows, IA, microcopy, motion.
Norman · Morville · Podmajersky · Head · Zhuo · Holmes
/luminaryReview:microcopy
Interface voice, error messages, destructive-action clarity.
Podmajersky · Handley · Ogilvy · Norman
/luminaryReview:a11y
Accessibility compliance plus inclusive design practice.
Sutton · Holmes · Head · Norman
/luminaryReview:global
Localization, i18n, cultural assumptions, global gateway.
Yunker · Holmes · Podmajersky · Sutton · Zhuo · Dunford
/luminaryReview:data
OLTP schema, distributed data, domain model alignment.
Celko · Kleppmann · Evans · Kimball · Wickham
/luminaryReview:warehouse
Analytics modeling — facts, dimensions, grain, SCDs.
Kimball · Celko · Wickham · Tufte · Gelman
/luminaryReview:analytics
Pipelines, tidy data, metrics validity, visualizations.
Wickham · Gelman · Tufte · Kimball · Celko
/luminaryReview:ai
LLM/ML features — injection, evals, harm, statistical rigor.
Karpathy · Gebru · Schneier · Bach · Gelman
/luminaryReview:security
Threat modeling, auth/authz, secrets, supply chain, OSS.
Schneier · Cavoukian · Meeker · Bach · Allspaw
/luminaryReview:privacy
PII handling, retention, minimization, and IP obligations.
Cavoukian · Schneier · Kleppmann · Meeker
/luminaryReview:resilience
How systems fail and how operators cope under pressure.
Allspaw · Majors · Bach · Schneier · Torvalds
/luminaryReview:discovery
Are we building the right thing? Opportunity → solution.
Torres · Jobs · Dunford · Norman
/luminaryReview:marketing
Positioning, strategy, copy, content, behavior.
Ogilvy · Godin · Dunford · Handley · Sutherland
/luminaryReview:positioning
Best-at-something-for-somebody, against real alternatives.
Dunford · Godin · Jobs · Torres
/luminaryReview:copy
Headlines, body, voice, in-product vs. selling register.
Ogilvy · Handley · Podmajersky · Dunford
/luminaryReview:brand
Identity, voice, remarkable-ness, perception engineering.
Scher · Ogilvy · Godin · Sutherland · Zhuo
/luminaryReview:launch
End-to-end ship review: product + GTM + ops readiness.
Jobs · Dunford · Godin · Ogilvy · Handley · Sutton · Majors · Allspaw
/luminaryReview:devrel
First-run experience, community, docs, developer marketing.
Swyx · Jansen · Procida · Dunford
/luminaryReview:docs
Diátaxis structure, DX, microcopy, content voice.
Procida · Jansen · Podmajersky · Handley
/luminaryReview:oss
OSS licensing, SBOM, trademark, supply-chain legal review.
Meeker · Schneier · Torvalds · Jansen
/luminaryReview:full
All 39 members — heavy, but comprehensive.
The entire roster
Combine modes with `+`: `/luminaryReview:architecture+data` merges both starting rosters (deduped). Equivalent syntaxes, where `<mode>` is the mode name: `/luminaryReview:<mode>`, `/luminaryReview <mode>`, or `mode: <mode>` as the first line. Unknown modes fall back to default.
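
Because modes are plain text conventions, the roster merge can be illustrated with a few lines of Python. This is a minimal sketch, not the orchestrator's actual logic: `PRESETS` holds only three of the presets listed above, `starting_roster` is a hypothetical helper, and treating a combination that contains any unknown mode as a full fallback to default is an assumption.

```python
import re

# Subset of the preset rosters listed above, copied for illustration.
PRESETS = {
    "architecture": ["Torvalds", "Evans", "Kleppmann", "Carmack", "Lauret", "Majors", "Allspaw"],
    "data": ["Celko", "Kleppmann", "Evans", "Kimball", "Wickham"],
    "microcopy": ["Podmajersky", "Handley", "Ogilvy", "Norman"],
}

# Accepts "/luminaryReview:mode", "/luminaryReview mode", or "mode: mode".
_MODE_LINE = re.compile(r"^(?:/luminaryReview(?::|\s+)|mode:\s*)([A-Za-z0-9+]+)\s*$")


def starting_roster(message: str) -> list[str]:
    """Return the pinned starting roster; an empty list means default selection."""
    lines = message.strip().splitlines()
    first_line = lines[0].strip() if lines else ""
    match = _MODE_LINE.match(first_line)
    if not match:
        return []  # bare /luminaryReview or no mode prefix at all
    modes = match.group(1).split("+")
    if any(mode not in PRESETS for mode in modes):
        return []  # assumption: one unknown mode sends the whole request to default
    roster: list[str] = []
    for mode in modes:
        for member in PRESETS[mode]:
            if member not in roster:  # dedupe while keeping first-seen order
                roster.append(member)
    return roster
```

For example, `starting_roster("/luminaryReview:architecture+data")` yields the seven architecture members followed by Celko, Kimball, and Wickham; Kleppmann and Evans appear once. Phase 0 would then be free to add members to that list but not to remove any.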

Protocol Rules

The rules that make
it actually work.

Luminary's value comes from enforced constraints. These aren't suggestions — they're the structural guarantees that prevent a 39-expert system from collapsing into noise.

Cite or Retract

Any claim without a specific artifact reference — code line, spec section, design doc — is inadmissible. Impressions don't count.

Steelman Enforced

In adversarial clash, you must argue the opponent's position charitably and completely before your rebuttal. Skipping it disqualifies the rebuttal entirely.

One Red Flag Maximum

Each member gets one blocking red flag per audit cycle. Forced prioritization. If everything is critical, nothing is.

No Silent Pass

"Nothing to report" is not acceptable. Clean domains still probe edge cases. The absence of findings must be earned.

Orchestrator Stays Neutral

The orchestrator mediates process and resolves deadlock, but never takes domain positions. Domain authority belongs to the members.

Synthesis Is Actionable

Every recommendation in the synthesis gets a clear owner and a verification path. "Consider improving X" is not a recommendation.
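
Two of these rules, Cite or Retract and One Red Flag Maximum, are mechanical enough to check in code. The sketch below shows one way a wrapper around the chat output might enforce them; the `Claim` record and both functions are hypothetical, and the framework itself states these rules only in its prompts.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Claim:
    member: str
    text: str
    artifact_ref: str = ""           # e.g. a code line, spec section, or design doc
    blocking_red_flag: bool = False


def admissible(claims: list[Claim]) -> list[Claim]:
    """Cite or Retract: keep only claims that reference a specific artifact."""
    return [c for c in claims if c.artifact_ref.strip()]


def red_flag_violations(claims: list[Claim]) -> list[str]:
    """One Red Flag Maximum: members who raised more than one blocking flag."""
    counts = Counter(c.member for c in claims if c.blocking_red_flag)
    return sorted(member for member, n in counts.items() if n > 1)
```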

Known Conflicts

Built-in tensions.
Real trade-offs.

Some members reliably clash. These tensions are features, not bugs — they surface trade-offs your team might otherwise paper over.

Carmack vs Evans

Performance optimization vs. domain model purity. Carmack wants the hot path flat and cache-friendly. Evans wants the aggregate boundary to mean something. Both are right.

Jobs vs Bach

Ship great things fast vs. prove nothing is broken first. Jobs sees testing overhead as the enemy of inevitability. Bach sees "ship fast" as the enemy of actually knowing what you shipped.

Kleppmann vs Torvalds

Embrace distributed complexity vs. keep it simple and modular. Kleppmann says the complexity is inherent — hiding it is dishonest. Torvalds says you introduced it and you can remove it.

Schneier vs Karpathy

Deterministic threat models vs. probabilistic LLM failure modes. Schneier wants a threat model with defined adversaries. Karpathy is working with systems where failure modes are empirical, not categorical.

Cavoukian vs Majors

Data minimization vs. instrument everything. Cavoukian says collect only what you need and delete it on schedule. Majors says you can't debug what you didn't log. The overlap is small.

Norman vs Torvalds

User experience as first-class vs. user experience as cosmetic. Norman wants the mental model baked into the architecture. Torvalds thinks the interface should reflect the real complexity, not hide it.

Ogilvy vs Godin

Classical persuasion at scale vs. remarkable products that earn permission. Ogilvy wants a working headline and a testable promise. Godin says the tax is owed only because the product isn't remarkable enough yet.

Dunford vs Ogilvy

Positioning first vs. copy first. Dunford wants the best-fit customer, the competitive alternatives, and the value ranking nailed down before a single headline is written. Ogilvy wants the headline doing the heaviest lifting in the room.

Sutherland vs Gelman

Psycho-logic vs. statistical rigor. Sutherland trusts the invisible perception variables that drive real choice. Gelman trusts what is measured cleanly, powered correctly, and defensible under scrutiny. Both are right — about different things.

Handley vs Ogilvy

Useful-before-promotional vs. persuasion-first. Handley serves the reader and earns the sale as a consequence. Ogilvy asks for the sale on the page and treats "useful" as the means, not the goal.

Godin vs Jobs

Build a tribe deliberately vs. assume a great product markets itself. Godin wants the permission asset — list, subscribers, story carriers — built on purpose. Jobs expects inevitability to do that work for free.

Torres vs Jobs

Evidence-led discovery vs. visionary product taste. Torres wants opportunities mapped, assumptions tested, outcomes named. Jobs wants the team to see what isn't there yet and build it anyway. Neither is wrong; both can fail on their own.

Gebru vs Karpathy

Deployment harm on named populations vs. benchmark-led capability. Karpathy evaluates on what the model can do. Gebru evaluates on who it will fail, and whether anyone measured that before it shipped. Both are right; only one comes up in the planning meeting.

Gebru vs Cavoukian

Collect the demographic data to audit fairness vs. minimize data collection on principle. A genuine tension, not a rhetorical one — both positions are ethically grounded and structurally incompatible without a specific design choice.

Russell vs Carmack

Web payload and real-device performance vs. backend/CPU hot-path focus. Carmack owns the algorithmic floor. Russell owns the 4MB of JS that gets shipped to a $200 Android. Both call themselves "performance." They aren't the same conversation.

Allspaw vs Majors

Adaptive capacity and human recovery vs. telemetry-first reliability. Majors wants every signal instrumented. Allspaw wants the operator prepared for the signal the system doesn't emit. Dashboards don't recover from incidents — people do.

Kimball vs Celko

Dimensional denormalization for analytics vs. 3NF discipline. OLTP and OLAP have different laws. Celko wants every normal form honored. Kimball wants the analyst's query to return in seconds. Both are right about different schemas.

Head vs Sutton

Expressive motion vs. vestibular safety. Head designs motion as communication. Sutton enforces reduced-motion as an accessibility floor. The product that serves both is a product that takes the prefers-reduced-motion contract seriously.

Podmajersky vs Ogilvy

In-product helpful voice vs. persuasive advertising voice. Ogilvy sells to someone who hasn't bought. Podmajersky writes for someone who already bought and now has to complete a task. Same company, different reader, different rules.

Yunker vs Dunford

Per-market positioning vs. US-default GTM translated outward. Yunker says each market needs its own best-fit narrative. Dunford says positioning is the foundation. Both are right — but a translated homepage is not a positioning strategy.

Holmes vs Sutton

Inclusive design as practice vs. WCAG compliance as a floor. Sutton enforces the contract. Holmes asks who is still excluded once every checkbox is green. Compliance is the minimum viable accessibility; inclusive design is the minimum viable practice.