Admin preview · Product blueprint

What AgentProof is meant to become

The full product framework, the complete customer journey, and a readiness matrix. Every status label is honest. No fake operational claims. No buried defects. Pick a journey, build it until operational, prove it, only then move on.

Open /system-health → · Open /admin/pilot-readiness → · Open /admin/intelligence-ops →

Active defect audit · review_answer_preservation

Why reviewed agents still show "Not captured in this review" on the live URL

Fix applied — awaiting live proof

Symptom: After a buyer scores an agent, signs out, and signs back in, the previously-captured answers render as the "Legacy answer unavailable. Not captured in this review." badge instead of the saved values. Affects both NEW (1.1.0) and OLD (1.0.0) reviews, but for different reasons.

Audited modules

lib/agentproof/review/answer_recovery_v1.ts
lib/agentproof/review/legacy_answer_recovery_v1.ts (S34AH legacy cascade)
lib/agentproof/review_persistence.ts (CapturedAnswerSnapshot)
components/form/JsonPasteScoreCard.tsx (LOAD effect + render-time recovery + legacy fallback wiring)

Findings (7)

F1_recovery_module_correct. The recovery module's six strategies (exact_id → canonical_id → raw_id_reverse → deduped_id → text_hash → legacy_alias) are individually correct and deterministic. The 26 S34AD unit tests pass.
Evidence: lib/agentproof/review/answer_recovery_v1.ts L164-L228 · tests/unit/phase_1g_s34ad_answer_recovery.test.ts
F2_hydration_effect_correct. The reviewed-summary hydrate effect surfaces the snapshot's rich_answers verbatim. The per-agent LOAD effect correctly merges in-progress storage UNDER the snapshot (snapshot wins). State management is sound.
Evidence: components/form/JsonPasteScoreCard.tsx L1579-L1666 (hydrate effect) · components/form/JsonPasteScoreCard.tsx L1690-L1735 (LOAD merge effect)
F3_text_hash_index_dead. The text-hash recovery index was built with a label resolver scoped to ACTIVE question ids only: `(k) => s34adActiveQuestionLabels.get(k)`. So it could only resolve a label for a snapshot key that was ALSO a currently-active question id — which made the text-hash strategy redundant with exact_id and structurally unable to recover answers whose persistence-time id no longer matched any current id.
Evidence: components/form/JsonPasteScoreCard.tsx L5786-L5789 (pre-fix)
F4_snapshot_carried_no_labels. `CapturedAnswerSnapshot` (snapshot_version 1.0.0) carried `rich_answers` (id → answer) but no `question_labels` (id → label). Without the labels persisted at write-time, no read-time strategy could resolve a historical id to its label, breaking the text-hash fallback entirely.
Evidence: lib/agentproof/review_persistence.ts L131-L136 (pre-fix CapturedAnswerSnapshot)
F5_diagnostic_panel_only_for_reviewed. The S34AD diagnostic panel renders only when `s34adIsReviewedReadOnly` is true. If `loadLatestReviewSummaryForAgent` returns `reviewed: false` (e.g., env/agent id mismatch between save-time and read-time, or signing in on a different device where localStorage is empty), the panel is hidden and the founder sees no clue about what failed.
Evidence: components/form/JsonPasteScoreCard.tsx L5819-L5822 (s34adIsReviewedReadOnly gate) · components/form/JsonPasteScoreCard.tsx L6266-L6319 (panel only renders when gate is true)
F6_old_1_0_0_reviews_have_no_labels. The S34AG fix protects NEW reviews (snapshot_version 1.1.0 with question_labels) but does nothing for OLD reviews already on disk at 1.0.0. Those records carry rich_answers + legacy_answers but no labels, so the text-hash strategy cannot fire even after S34AG. Without a separate path that does NOT depend on question_labels, every pre-0.176.0 review keeps showing the "Not captured" badge.
Evidence: lib/agentproof/review_persistence.ts L131-L156 (snapshot shape — labels optional) · components/form/JsonPasteScoreCard.tsx L5875-L5879 (S34AG-only path)
F7_multiple_legacy_sources_already_persist_answers. Even when captured_answers is empty / mismatched, the buyer's answers usually live in OTHER stores: agentproof.rich_answers.v1::<env>::<agent> (per-agent rich answers — written live during the wizard), agentproof.answers.v1::<env>::<agent> (per-agent legacy tri-state), the report_markdown_body field on the review record (renders `- **<label>:** <value>` for every answer), and every historical entry in agentproof.report_history.v1::<env>::<agent>. The S34AD cascade ignored all of these.
Evidence: lib/agentproof/per_agent_answers.ts (per-agent stores) · lib/reporting/agentproof_readiness_markdown_report.ts L346-L359 (markdown answer lines) · lib/agentproof/review_persistence.ts L308-L312 + L634+ (report history)
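Findings F3 and F4 can be illustrated with a minimal sketch. Everything in it is hypothetical: `hashLabel`, `buildIndex`, and the sample ids and labels are invented for illustration and are not the code in answer_recovery_v1.ts. The real hash function is not shown in this audit, so a normalised label string stands in for it.

```typescript
// Hypothetical sketch of F3/F4. Names and data are illustrative only.
type LabelResolver = (id: string) => string | undefined;

// Stand-in for the real text hash: a normalised label string.
function hashLabel(label: string): string {
  return label.trim().toLowerCase().replace(/\s+/g, " ");
}

// Index: label hash -> snapshot key that holds the saved answer.
function buildIndex(snapshotKeys: string[], resolve: LabelResolver): Map<string, string> {
  const index = new Map<string, string>();
  for (const key of snapshotKeys) {
    const label = resolve(key);
    if (label !== undefined) index.set(hashLabel(label), key);
  }
  return index;
}

// A review saved under an old id; the question has since been re-keyed.
const snapshotKeys: string[] = ["q_old_17"];
const activeLabels = new Map<string, string>([
  ["q_new_17", "Does the agent log every action?"],
]);
const snapshotLabels = new Map<string, string>([
  ["q_old_17", "Does the agent log every action?"],
]);

// Pre-fix: resolver scoped to ACTIVE ids, so the old key never gets a label.
const preFix = buildIndex(snapshotKeys, (k) => activeLabels.get(k));

// Post-fix: the snapshot's own labels first, active labels as a fallback.
const postFix = buildIndex(snapshotKeys, (k) => snapshotLabels.get(k) ?? activeLabels.get(k));

// Look up the saved answer by the hash of the current question label.
const wanted = hashLabel(activeLabels.get("q_new_17") ?? "");
console.log(preFix.get(wanted));  // undefined
console.log(postFix.get(wanted)); // q_old_17
```

With the active-only resolver the index stays empty for any re-keyed question, which matches the "redundant with exact_id" observation in F3.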

Root cause

Two compounding failures.
PRIMARY: text-hash recovery was structurally broken because snapshots persisted no question_labels and the read-time label resolver was scoped to active ids only. S34AG fixed this for NEW (1.1.0) reviews.
LEGACY: the recovery cascade only consulted the captured_answers snapshot. Every other persisted source (per-agent rich answers, per-agent legacy answers, latest report markdown body, report history with its own captured_answers / markdown bodies) was ignored, so OLD (1.0.0) reviews where the snapshot's ids no longer match active ids still cascade to `unrecoverable`. S34AH adds the legacy fallback cascade so OLD reviews recover from whichever source has the answer.
Cross-device sync via Supabase remains outstanding.
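The snapshot-shape side of the primary failure can be sketched as follows. Only `snapshot_version`, `rich_answers`, and `question_labels` come from the audit; `parseSnapshot`, `isRecord`, `hasOwnLabels`, and the sample data are hypothetical, and the real type in review_persistence.ts may carry more fields.

```typescript
// Hypothetical sketch of the snapshot shape; not the shipped type.
type SnapshotVersion = "1.0.0" | "1.1.0";

interface CapturedAnswerSnapshot {
  snapshot_version: SnapshotVersion;
  rich_answers: Record<string, unknown>;    // question id -> answer
  question_labels?: Record<string, string>; // question id -> label (1.1.0 only)
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Accept both versions so legacy 1.0.0 records stay readable.
function parseSnapshot(raw: unknown): CapturedAnswerSnapshot | null {
  if (!isRecord(raw)) return null;
  const rawVersion = raw["snapshot_version"];
  const version: SnapshotVersion | null =
    rawVersion === "1.0.0" ? "1.0.0" : rawVersion === "1.1.0" ? "1.1.0" : null;
  if (version === null) return null;
  const rich = raw["rich_answers"];
  if (!isRecord(rich)) return null;
  const snapshot: CapturedAnswerSnapshot = { snapshot_version: version, rich_answers: rich };
  const labels = raw["question_labels"];
  if (isRecord(labels)) {
    const clean: Record<string, string> = {};
    for (const [id, label] of Object.entries(labels)) {
      if (typeof label === "string") clean[id] = label;
    }
    snapshot.question_labels = clean;
  }
  return snapshot;
}

// Only a snapshot that carries its own labels can map a historical id to a label.
function hasOwnLabels(s: CapturedAnswerSnapshot): boolean {
  return s.question_labels !== undefined && Object.keys(s.question_labels).length > 0;
}

const oldReview = parseSnapshot({ snapshot_version: "1.0.0", rich_answers: { q_old_17: "yes" } });
const newReview = parseSnapshot({
  snapshot_version: "1.1.0",
  rich_answers: { q_17: "yes" },
  question_labels: { q_17: "Does the agent log every action?" },
});
```

Here `oldReview` parses but has no labels of its own, which is the case the legacy cascade exists for; `newReview` carries labels and is text-hash recoverable.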

Fix applied (6 steps)

1. Extend CapturedAnswerSnapshot (S34AG)
lib/agentproof/review_persistence.ts
Added optional `question_labels?: Record<string, string>` and bumped allowed snapshot_version to also accept "1.1.0". Legacy "1.0.0" records remain readable.
2. Add buildTextHashIndexFromSnapshotLabels() (S34AG)
lib/agentproof/review/answer_recovery_v1.ts
New helper that builds the text-hash index using the snapshot's OWN question_labels map as the primary source, with active-question labels as a fallback for legacy 1.0.0 snapshots.
3. Stamp labels at both save points (S34AG)
components/form/JsonPasteScoreCard.tsx
Both the modern persist path and the legacy-fallback persist path now write snapshot_version: "1.1.0" AND a `question_labels` map built from the current deduped question list. New snapshots are text-hash recoverable.
4. Use snapshot labels at render-time (S34AG)
components/form/JsonPasteScoreCard.tsx
The text-hash index is now built via buildTextHashIndexFromSnapshotLabels(), threading the snapshot's question_labels (if present) and falling back to active labels otherwise. This makes the text-hash strategy actually capable of recovering cross-id matches.
5. Add legacy_answer_recovery_v1 module (S34AH)
lib/agentproof/review/legacy_answer_recovery_v1.ts
New module exporting recoverRichAnswerWithLegacyFallbacks() + recoverLegacyAnswerWithLegacyFallbacks() that extend the standard cascade with three new strategies — per_agent_storage, report_markdown, history_walk. Includes extractAnswersFromReportMarkdown() that parses `- **<label>:** <value>` lines out of the saved report Markdown body, and coerceMarkdownAnswerToLegacy() for the yes/no/not_sure path. None of these strategies require question_labels — they work on legacy 1.0.0 records.
6. Wire S34AH legacy cascade into the render-time recovery
components/form/JsonPasteScoreCard.tsx
Imports the new S34AH helpers. Assembles `s34ahLegacyInputs` from loadPerAgentRichAnswers + loadPerAgentAnswers + microsoftReviewedAgentSummary.report_markdown_body + loadReportHistory. Both the pre-compute loop and the per-question renderer now call recoverRichAnswerWithLegacyFallbacks / recoverLegacyAnswerWithLegacyFallbacks instead of the bare recoverRichAnswer / recoverLegacyAnswer. The "Not captured in this review" badge shows ONLY when EVERY primary AND legacy strategy returns unrecoverable.
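The three legacy strategies from step 5 can be sketched roughly as below. The function names `extractAnswersFromReportMarkdown` and `coerceMarkdownAnswerToLegacy` match the ones listed above, but the bodies are guesses written for illustration. `LegacyInputs`, `recoverWithLegacyFallbacks`, the strategy order, and the sample report text are assumptions, and the real cascade also runs the primary strategies first.

```typescript
// Hypothetical re-implementation for illustration; not the shipped module.
type LegacyAnswer = "yes" | "no" | "not_sure";
type Strategy = "per_agent_storage" | "report_markdown" | "history_walk" | "unrecoverable";

// Parse lines shaped like "- **<label>:** <value>" into a label -> value map.
function extractAnswersFromReportMarkdown(body: string): Map<string, string> {
  const answers = new Map<string, string>();
  const pattern = /^-\s+\*\*(.+?):\*\*\s*(.*)$/;
  for (const raw of body.split(/\r?\n/)) {
    const m = pattern.exec(raw.trim());
    if (m === null) continue;
    const [, label = "", value = ""] = m;
    answers.set(label.trim(), value.trim());
  }
  return answers;
}

// Map free-text markdown values onto the legacy tri-state, if possible.
function coerceMarkdownAnswerToLegacy(value: string): LegacyAnswer | undefined {
  const v = value.trim().toLowerCase().replace(/[\s-]+/g, "_");
  if (v === "yes") return "yes";
  if (v === "no") return "no";
  if (v === "not_sure") return "not_sure";
  return undefined;
}

interface LegacyInputs {
  perAgentStorage: Map<string, string>; // question id -> saved value
  reportMarkdownBody: string;           // latest report body
  historyMarkdownBodies: string[];      // older report bodies (assumed newest first)
}

// Try each legacy source in turn and report which strategy produced the value.
function recoverWithLegacyFallbacks(
  id: string,
  label: string,
  inputs: LegacyInputs,
): { strategy: Strategy; value?: string } {
  const stored = inputs.perAgentStorage.get(id);
  if (stored !== undefined) return { strategy: "per_agent_storage", value: stored };
  const fromReport = extractAnswersFromReportMarkdown(inputs.reportMarkdownBody).get(label);
  if (fromReport !== undefined) return { strategy: "report_markdown", value: fromReport };
  for (const body of inputs.historyMarkdownBodies) {
    const fromHistory = extractAnswersFromReportMarkdown(body).get(label);
    if (fromHistory !== undefined) return { strategy: "history_walk", value: fromHistory };
  }
  return { strategy: "unrecoverable" };
}

const sampleInputs: LegacyInputs = {
  perAgentStorage: new Map<string, string>(),
  reportMarkdownBody: "# Report\n- **Is there a named owner?:** Not sure\n",
  historyMarkdownBodies: [],
};
const recovered = recoverWithLegacyFallbacks("q_old_3", "Is there a named owner?", sampleInputs);
const missing = recoverWithLegacyFallbacks("q_old_4", "Unknown question", sampleInputs);
```

None of these lookups touch question_labels: the label passed in is the current question's label, and the markdown body supplies the historical value. Only a result like `missing` should lead to the "Legacy answer unavailable" badge.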

Outstanding before we call this complete

Live walk-through on the Railway URL with a real signed-in user, opening an OLD (pre-0.176.0) reviewed agent and seeing the prior answers restored — confirms the S34AH legacy cascade works against real data on the live deploy.
Cross-device persistence: this fix solves the SAME-BROWSER case (localStorage retained across sign-out / sign-in). Cross-device requires Supabase as the source of truth for captured_answers + per-agent answers + report markdown body. Currently localStorage-only. Tracked as the next pending defect.
Truly unrecoverable answers — when an old review has no per-agent storage, no report markdown body, no history, and no captured_answers — must show the explicit "Legacy answer unavailable" badge instead of a blank control. Verified by the S34AH unrecoverable-path test.

Next action: Founder opens an OLD reviewed agent on the live URL. Expected: the answers reappear via the S34AH legacy cascade (most likely from the report_markdown_body or per_agent_storage strategy). Claude captures a screenshot of restored answers + the diagnostic panel showing 0 unrecoverable. Status flips to fix_proven_live. If any answer remains unrecoverable, the founder sees the explicit "Legacy answer unavailable" badge — which is the honest state for that one question only.

Product map — the 11 areas

Every area of the intended product, with its sub-modules, an honest operational status, the route it lives at, and the next action required. No fake operational claims.

Public site

The public-facing surface a buyer or evaluator visits before signing in. Explains what AgentProof is, what readiness means, and how the assessment works. No buyer login required.

Partially operational

8/9 sub-modules operational

Home / landing · /
Operational · Owner: Claude
Hero, four Learn cards, public-vs-workspace panel, trust strip, footer. Mounted via AppHeader route-aware shell.
Next action: None — this surface is shipped and stable.
Readiness explanation · /agentic-ai-readiness
Operational · Owner: Founder
Explains the AgentProof readiness model to a public visitor.
Next action: Founder to confirm copy is aligned with current methodology.
Learn centre · /learn
Operational · Owner: Claude
Polished gradient hero, seven journey stages, six training tracks, sticky section ribbon. Wired into 7 sub-routes.
Next action: None — Learn centre is shipped.
Capability zones (Informational / Assisted / Action-taking) · /learn/capability-zones
Operational · Owner: Claude
Three-zone framing with risk profile, examples, what-good-includes.
Next action: None.
Good agent design · /learn/good-agent-design
Operational · Owner: Claude
Ten modules, each with what-this-is, why-it-matters, signal-of-good-design.
Next action: None.
Controls and oversight · /learn/controls-and-oversight
Operational · Owner: Claude
Six control families, maturity ladder, before-go-live checklist.
Next action: None.
AI Radar overview (public) · /learn/ai-landscape-radar
Partially operational · Owner: Founder
R13-A replaced the static preview with the RadarOperationalStatusPanel that shows three honest states (engine off / configured no run / active). Until the founder runs the first source check, the live state will be 'configured — no successful run yet'.
Next action: Founder runs the first radar check from /admin/intelligence-ops to flip the state to 'active'.
Demo entry · /demo
Operational · Owner: Founder
R13 made demo default-on. R14 added an explicit honest diagnostic to /demo so if the founder DOES see the OFF state, the page names the exact Railway env var that's set to false and tells them what to remove.
Next action: If /demo shows OFF, remove AGENTPROOF_DEMO_MODE_ENABLED=false from Railway Variables.
Trust / help / setup status · /beta/trust
Operational · Owner: Claude
Trust page, help page, and the /setup-status diagnostic page are all mounted.
Next action: None.
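The demo-entry behaviour described above (default-on, switched off only by an explicit `AGENTPROOF_DEMO_MODE_ENABLED=false`) could be resolved along these lines. This is a sketch under that assumption; `resolveDemoMode` and the `DemoState` shape are hypothetical, and the real flag parsing may differ.

```typescript
// Hypothetical sketch of a default-on flag with an honest diagnostic.
interface DemoState {
  enabled: boolean;
  reason: string; // text a diagnostic page could show to the founder
}

function resolveDemoMode(env: Record<string, string | undefined>): DemoState {
  const raw = env["AGENTPROOF_DEMO_MODE_ENABLED"];
  // Only an explicit "false" turns the demo off; unset means on.
  if (raw !== undefined && raw.trim().toLowerCase() === "false") {
    return {
      enabled: false,
      reason: "AGENTPROOF_DEMO_MODE_ENABLED=false is set. Remove it to re-enable the demo.",
    };
  }
  return { enabled: true, reason: "Demo is on by default." };
}

const demoUnset = resolveDemoMode({});
const demoOff = resolveDemoMode({ AGENTPROOF_DEMO_MODE_ENABLED: "false" });
```

Returning the reason alongside the boolean lets the page name the exact variable instead of silently hiding the demo.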

Demo journey

A buyer-runnable, no-login, sample-data flow that shows AgentProof end-to-end before they invest in a real assessment. Demo data must be clearly isolated from real customer data.

Founder preview

0/5 sub-modules operational

Sample agents catalogue
Partially operational · Owner: Founder
Sample agent data exists in lib/agentproof/demo/. Surfaces require AGENTPROOF_DEMO_MODE_ENABLED=true.
Next action: Founder enables the demo flag, OR Claude makes demo safe-by-default.
Guided sample assessment
Founder preview · Owner: Claude
Trial assessment page exists at /trial/assessment with a sample three-question slice.
Next action: Promote to operational once full guided flow is wired to actions.
Sample report · /trial/report
Founder preview · Owner: Founder
Canonical sample report with ReadinessScoreRing, findings, evidence panel, radar panel.
Next action: Founder to confirm the sample reflects buyer-grade polish standard.
Sample improvement guidance · /trial/improvement
Founder preview · Owner: Claude
ImprovementGuidancePanel with three sample cards.
Next action: Promote to operational when improvement cycles are persisted.
Demo data isolation
Partially operational · Owner: Claude
Demo data lives under separate types and lib/agentproof/demo/. No demo write hits the live workspace tables. Isolation verified at code level; not yet proven by a live walk-through.
Next action: Run a live demo walk-through and document the isolation proof.

Login / account

Buyer-facing authentication. Magic link only, no password. Supabase EU auth with a workspace-scoped account record.

Blocked by configuration

0/4 sub-modules operational

Login (magic link) · /login
Blocked by configuration · Owner: Founder
Magic link flow is implemented. Auth is gated by AGENTPROOF_SUPABASE_AUTH_ENABLED + the Supabase env vars. Currently AMBER on /system-health.
Next action: Founder sets AGENTPROOF_SUPABASE_AUTH_ENABLED=true on Railway and runs a real signed-in walk-through.
Auth callback · /auth/callback
Partially operational · Owner: Founder
S34AC-R9 patched to use the proxy-forwarded origin so Railway callbacks work. Awaiting a live signed-in cycle to confirm.
Next action: Founder runs the signed-in walk-through.
Signed-in state + sign out
Blocked by configuration · Owner: Founder
Workspace pages render the signed-in workspace header when AGENTPROOF_SUPABASE_AUTH_ENABLED is true. Untested end-to-end.
Next action: Founder enables auth flag and signs in once.
Account → workspace entry · /workspace
Blocked by configuration · Owner: Founder
Routes through workspace home after successful auth.
Next action: Founder enables auth flag.
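The "proxy-forwarded origin" idea behind the auth-callback patch can be sketched as follows. It assumes the hosting proxy sets the conventional `x-forwarded-host` and `x-forwarded-proto` headers; which headers the deploy actually sends is not confirmed here. `resolvePublicOrigin` is a hypothetical helper, not the S34AC-R9 code.

```typescript
// Hypothetical helper: prefer the origin the browser actually used,
// as reported by the reverse proxy, over the internal request URL.
function resolvePublicOrigin(requestUrl: string, headers: Map<string, string>): string {
  // A proxy chain may send comma-separated values; take the first entry.
  const first = (value: string): string => (value.split(",")[0] ?? value).trim();

  const forwardedHost = headers.get("x-forwarded-host");
  if (forwardedHost === undefined || forwardedHost.trim() === "") {
    // No proxy information: fall back to the request's own origin.
    return new URL(requestUrl).origin;
  }
  const forwardedProto = headers.get("x-forwarded-proto") ?? "https";
  return `${first(forwardedProto)}://${first(forwardedHost)}`;
}

// Behind a proxy the app may see an internal URL such as localhost:8080.
const behindProxy = resolvePublicOrigin(
  "http://localhost:8080/auth/callback?code=abc",
  new Map<string, string>([
    ["x-forwarded-host", "app.example.com"],
    ["x-forwarded-proto", "https"],
  ]),
);
const direct = resolvePublicOrigin("http://localhost:8080/auth/callback?code=abc", new Map<string, string>());
```

Redirecting to `behindProxy` rather than `direct` keeps the post-login redirect on the public hostname instead of the container's internal address.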

Workspace

The signed-in surface where a buyer manages their agents, environments, and connectors. Microsoft is the first real connector path; others are provider-agnostic placeholders.

Blocked by configuration

0/12 sub-modules operational

Workspace home · /workspace
Blocked by configuration · Owner: Founder
EstateHomeDashboard renders inside workspace. Requires auth + Supabase persistence.
Next action: Founder enables auth.
Agents list · /dashboard/estate
Partially operational · Owner: Founder
Estate view renders with the unified ReadinessScoreRing. Behaviour against real Supabase data still unverified.
Next action: Live walk-through with auth enabled.
Environments (per connector)
Planned · Owner: Claude
Microsoft Power Platform environments discovery exists in lib/connectors/microsoft. UI surface not yet user-facing.
Next action: Build /workspace/environments surface in next cycle.
Add manual non-Microsoft agent · /workspace/manual-agent/new
Partially operational · Owner: Founder
R7 wired the manual-agent form to a server action. Persistence path exists. Not yet proven by a live signed-in walk-through.
Next action: Live walk-through that creates an agent, signs out, signs back in, sees it.
Microsoft connector (first real provider path) · /workspace/microsoft-readiness
Blocked by configuration · Owner: Founder
Six connector libs (microsoft_auth_config, power_platform_client, dataverse_client, copilot_studio_discovery, to_canonical_footprint, errors). Five API routes. Tokens never leave the server.
Next action: Founder configures the six MICROSOFT_* env vars on Railway and tests the read-only discovery flow.
OpenAI connector
Planned · Owner: Deferred
Placeholder in the connector-agnostic registry. Not built.
Next action: Deferred until Microsoft path is proven end-to-end with a real customer.
Anthropic connector
Planned · Owner: Deferred
Placeholder. Not built.
Next action: Deferred.
Google connector
Planned · Owner: Deferred
Placeholder. Not built.
Next action: Deferred.
AWS Bedrock connector
Planned · Owner: Deferred
Placeholder. Not built.
Next action: Deferred.
Salesforce connector
Planned · Owner: Deferred
Placeholder. Not built.
Next action: Deferred.
ServiceNow connector
Planned · Owner: Deferred
Placeholder. Not built.
Next action: Deferred.
Custom in-house agent
Planned · Owner: Deferred
Handled today via the manual non-Microsoft agent path. Dedicated connector surface not built.
Next action: Defer dedicated connector until manual path is proven.

Assessment / review journey

The core scoring flow. Classify the agent's capability zone, answer the questions, attach evidence, generate a deterministic score, and preserve every previously-given answer.

Partially operational

3/7 sub-modules operational

Agent classification (capability zone)
Operational · Owner: Claude
Four-question classifier maps the agent to Informational / Assisted Work / Action-taking. Used by the scoring engine.
Next action: None.
Questions + evidence capture
Operational · Owner: Claude
Question bank, evidence fields, confidence pickers all wired.
Next action: None.
Review summary
Partially operational · Owner: Founder
Review summary surface renders. Display of previously-scored answers is the active defect being fixed by the R13 + R13-A recovery layer.
Next action: Founder confirms previously-scored answers now appear after sign-out → sign-in. (Currently STILL FAILING per founder report.)
Review history (per agent)
Partially operational · Owner: Claude
ReviewSnapshot model supports multiple committed snapshots with re-score clones. UI history list not yet a dedicated route.
Next action: Build /workspace/reports/[agentId]/history.
Re-score journey
Partially operational · Owner: Claude
cloneSnapshotForRescore() exists. UI affordance to start a re-score from a prior report is not yet a button on the report page.
Next action: Add 'Re-score this agent' CTA to /workspace/reports/[reportId].
Previous answer preservation
Partially operational · Owner: Claude
R13 S34AD multi-strategy recovery + S34AG snapshot-label fix (NEW 1.1.0 reviews) + S34AH legacy fallback cascade (OLD 1.0.0 reviews). The S34AH layer pulls from per-agent rich answers, per-agent legacy answers, the saved report markdown body, and the full report history. "Legacy answer unavailable" appears ONLY when every primary AND legacy strategy returns unrecoverable. See the Active Defect Audit panel on /founder/product-blueprint for the full audit.
Next action: Founder opens an OLD reviewed agent on the live URL; answers should appear via the S34AH legacy cascade. Claude captures screenshot proof. Until then this stays partially_operational.
Review snapshot status (committed / immutable)
Operational · Owner: Claude
ReviewSnapshot model carries committed_at and snapshot_status. saveReviewSnapshot refuses writes that would break the immutability of a committed snapshot.
Next action: None.
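The committed-snapshot immutability and the re-score clone described in this area might look roughly like this. The names `saveReviewSnapshot`, `cloneSnapshotForRescore`, `committed_at`, and `snapshot_status` come from the text above; the signatures, the "draft" / "committed" status values, the other fields, and the in-memory `store` are assumptions made for the sketch.

```typescript
// Hypothetical sketch; not the shipped ReviewSnapshot model.
type SnapshotStatus = "draft" | "committed";

interface ReviewSnapshot {
  review_id: string;
  agent_id: string;
  snapshot_status: SnapshotStatus;
  committed_at: string | null; // ISO timestamp once committed
  answers: Record<string, string>;
}

// In-memory stand-in for the real persistence layer.
const store = new Map<string, ReviewSnapshot>();

// Refuse to overwrite a snapshot that is already committed.
function saveReviewSnapshot(next: ReviewSnapshot): void {
  const existing = store.get(next.review_id);
  if (existing !== undefined && existing.snapshot_status === "committed") {
    throw new Error(`Snapshot ${next.review_id} is committed and immutable`);
  }
  store.set(next.review_id, { ...next, answers: { ...next.answers } });
}

// A re-score starts from a copy under a NEW review_id; the prior stays untouched.
function cloneSnapshotForRescore(prior: ReviewSnapshot, newReviewId: string): ReviewSnapshot {
  return {
    ...prior,
    review_id: newReviewId,
    snapshot_status: "draft",
    committed_at: null,
    answers: { ...prior.answers },
  };
}

const committed: ReviewSnapshot = {
  review_id: "r1",
  agent_id: "a1",
  snapshot_status: "committed",
  committed_at: "2024-01-01T00:00:00Z",
  answers: { q1: "yes" },
};
saveReviewSnapshot(committed);

const rescore = cloneSnapshotForRescore(committed, "r2");
rescore.answers["q1"] = "no";
saveReviewSnapshot(rescore); // allowed: a new draft under a new id
```

Copying `answers` on both save and clone keeps edits to the draft from leaking into the stored committed record.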

Reports

The buyer-grade artefact AgentProof produces. Executive summary, score, findings linked to evidence, recommendations, improvement plan, methodology version. Print + share ready.

Partially operational

6/10 sub-modules operational

Report dashboard (per workspace) · /workspace/reports
Blocked by configuration · Owner: Founder
Lists reports per workspace. Requires auth + persistence.
Next action: Founder enables auth.
Report detail page
Partially operational · Owner: Founder
Detail page renders for the sample/demo path. Live-data path unverified.
Next action: Live walk-through that opens a real report.
Executive summary
Operational · Owner: Claude
Summary block ships with score ring + band + headline statement.
Next action: None.
Score + band (unified ring)
Operational · Owner: Claude
Canonical ReadinessScoreRing used on every surface.
Next action: None.
Findings linked to Learn + control family + improvement action
Operational · Owner: Claude
ReportFindingsWithGuidancePanel renders all findings with related_learn + control_family + improvement_action.
Next action: None.
Evidence expectations
Operational · Owner: Claude
EvidenceExpectationsPanel shows good-example / weak-example / how-AgentProof-uses-it.
Next action: None.
Recommendations
Operational · Owner: Claude
Renders in report alongside improvement guidance panel.
Next action: None.
Improvement plan (in report)
Partially operational · Owner: Claude
Sample improvement cards render. Persisted improvement cycles per agent are partially wired (see Improvement cycles area).
Next action: Wire live improvement cycle persistence.
Methodology / version block
Operational · Owner: Claude
Each report carries methodology_version + scoring_version + intelligence_pack_version + report_version.
Next action: None.
Export / print readiness
Partially operational · Owner: Founder
S33L/S33M shipped @page CSS + cover page + section dividers + editorial blocks. Browser print confirmed locally; live confirmation pending.
Next action: Founder runs print from /workspace/reports/[id] and confirms layout.

Improvement cycles

Turn report findings into trackable improvement actions with evidence-to-collect, owner, status, and reassessment guidance.

Partially operational

4/5 sub-modules operational

Recommended actions per finding
Operational · Owner: Claude
ImprovementGuidancePanel surfaces the why-it-matters / what-good-looks-like / evidence-to-collect / reassess-after / related-Learn structure.
Next action: None.
Owner / status / date fields
Founder preview · Owner: Claude
Sample card structure exists. Per-customer persistence of owner + status + date is not yet a live surface.
Next action: Wire improvement_cycle table writes from the workspace UI.
Evidence to collect
Operational · Owner: Claude
Documented per improvement card.
Next action: None.
Reassessment guidance
Operational · Owner: Claude
Each card states reassess_after + ties to radar status if relevant.
Next action: None.
Link back to report findings
Operational · Owner: Claude
Each improvement card references the originating finding via related_learn + related_control_family.
Next action: None.

AI Radar

Controlled source-watchlist monitoring with change detection, human review queue, and versioned intelligence packs. Not broad web scraping. Not real-time live monitoring.

Partially operational

5/7 sub-modules operational

Current honest status (3-state banner) · /learn/ai-landscape-radar
Operational · Owner: Claude
RadarOperationalStatusPanel renders one of three states: engine off / configured no run / active monitoring.
Next action: None.
Source watchlist (16 sources, 7 categories)
Operational · Owner: Claude
RADAR_SOURCE_REGISTRY: EU AI Act, NIST AI RMF, ISO 42001, OpenAI/Anthropic/Google/AWS Bedrock blogs, OWASP LLM, MITRE ATLAS, MS Copilot Studio docs, LangChain/LlamaIndex releases, AUPs.
Next action: None.
Change detection (SHA-256 + allowlist)
Operational · Owner: Claude
source_check_engine_v1 hashes content, detects no-change / first-check / change-detected. Allowlist-only.
Next action: None.
Human review queue · /admin/intelligence-ops
Blocked by configuration · Owner: Founder
RadarReviewQueue with approve / reject / pick-up. Requires AGENTPROOF_ADMIN_TOOLS_ENABLED=true.
Next action: Founder enables the admin tools flag.
Intelligence pack versioning (semver)
Operational · Owner: Claude
intelligence_pack_version_v1: append-only, bump rules, initial 1.0.0.
Next action: None.
Report impact warnings
Partially operational · Owner: Claude
computeReportImpact() emits warnings for zone + control-family overlap. UI surface that injects these into reports is not yet wired.
Next action: Wire report_impact_v1 output into /workspace/reports/[id].
Clear preview/real separation
Operational · Owner: Claude
Three-state panel always tells the buyer exactly which mode is live. No fake live claim possible.
Next action: None.
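The SHA-256 change detection with an allowlist, as described for this area, can be sketched as below. The three result states come from the text above; `checkSource`, `sha256Hex`, the in-memory hash store, and the single `example.org` allowlist entry are hypothetical and do not reproduce source_check_engine_v1 or RADAR_SOURCE_REGISTRY.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch of allowlist-only change detection.
type CheckResult = "first-check" | "no-change" | "change-detected";

// Illustrative allowlist; the real registry entries are not reproduced here.
const ALLOWED_HOSTS = new Set<string>(["example.org"]);

// Last seen content hash per source URL (in-memory stand-in for persistence).
const lastHashByUrl = new Map<string, string>();

function sha256Hex(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

function checkSource(url: string, content: string): CheckResult {
  const host = new URL(url).hostname;
  if (!ALLOWED_HOSTS.has(host)) {
    throw new Error(`Source not on allowlist: ${host}`);
  }
  const next = sha256Hex(content);
  const prev = lastHashByUrl.get(url);
  lastHashByUrl.set(url, next);
  if (prev === undefined) return "first-check";
  return prev === next ? "no-change" : "change-detected";
}

const sourceUrl = "https://example.org/guidance";
const checkOne = checkSource(sourceUrl, "version A");   // first-check
const checkTwo = checkSource(sourceUrl, "version A");   // no-change
const checkThree = checkSource(sourceUrl, "version B"); // change-detected
```

Hashing the whole body flags any byte-level change, including cosmetic ones, so a "change-detected" result still needs a human decision before it affects a report.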

Methodology / version governance

Every report records the methodology pack version, scoring model version, intelligence pack version, and report version. Old reports stay immutable. Compatibility warnings surface when a newer pack would change the answer.

Partially operational

4/6 sub-modules operational

Methodology pack version
Operational · Owner: Claude
Tracked on every committed review_snapshot.
Next action: None.
Scoring model version
Operational · Owner: Claude
engine_version pinned across the deterministic scoring path.
Next action: None.
Intelligence pack version
Operational · Owner: Claude
INITIAL_PACK_VERSION_LABEL = 1.0.0, bumpPackVersionLabel handles minor/patch.
Next action: None.
Report version
Operational · Owner: Claude
Each report carries an explicit report_version field.
Next action: None.
Change history
Founder preview · Owner: Claude
Methodology changelog exists in content/. UI surface that renders the history for a buyer is not yet a dedicated page.
Next action: Build /learn/methodology-history.
Compatibility warnings (newer pack → reassess)
Partially operational · Owner: Claude
report_impact_v1 computes reassessment_recommended_overall. Surfacing it into the report page is the next step.
Next action: Wire into report detail page.
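The append-only, semver-style pack versioning mentioned above might be sketched like this. `INITIAL_PACK_VERSION_LABEL` and `bumpPackVersionLabel` are named in the text; the signature, the `publishPackVersion` helper, and the in-memory history are assumptions, and the real bump rules may differ.

```typescript
// Hypothetical sketch of minor/patch bumps over an append-only history.
const INITIAL_PACK_VERSION_LABEL = "1.0.0";

type Bump = "minor" | "patch";

function bumpPackVersionLabel(current: string, bump: Bump): string {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(current);
  if (m === null) throw new Error(`Not a semver label: ${current}`);
  const [, majorText = "0", minorText = "0", patchText = "0"] = m;
  const major = Number(majorText);
  const minor = Number(minorText);
  const patch = Number(patchText);
  // A minor bump resets patch to zero; a patch bump leaves minor alone.
  return bump === "minor" ? `${major}.${minor + 1}.0` : `${major}.${minor}.${patch + 1}`;
}

// Append-only: versions are only ever pushed, never edited or removed.
const packHistory: string[] = [INITIAL_PACK_VERSION_LABEL];

function publishPackVersion(bump: Bump): string {
  const latest = packHistory[packHistory.length - 1] ?? INITIAL_PACK_VERSION_LABEL;
  const next = bumpPackVersionLabel(latest, bump);
  packHistory.push(next);
  return next;
}

const afterPatch = publishPackVersion("patch"); // 1.0.1
const afterMinor = publishPackVersion("minor"); // 1.1.0
```

Because a committed report records the pack version it was scored against, keeping every label in the history is what lets a later compatibility warning say which pack the report predates.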

Admin / founder operations

Founder-only operational surfaces: system health, setup status, intelligence ops, release readiness, evidence of live checks, and the product completion matrix.

Partially operational

3/6 sub-modules operational

System health (active probes) · /system-health
Operational · Owner: Claude
R12 shipped active probes against Supabase + the deploy. Verdict groups: working / configured-untested / missing / needs-founder-action / needs-claude-action / sample-only.
Next action: None.
Setup status diagnostic · /setup-status
Operational · Owner: Claude
Flag-free diagnostic surface — always loads. Tells founder exactly what is configured.
Next action: None.
Intelligence ops (admin review) · /admin/intelligence-ops
Blocked by configuration · Owner: Founder
Mounts RadarReviewQueue + admin tooling. Gated by AGENTPROOF_ADMIN_TOOLS_ENABLED.
Next action: Founder enables the admin tools flag.
Release readiness
Partially operational · Owner: Claude
S34J/S34K shipped a release-gate enforcement layer. The production_launch_decision module gates the public_beta_ready signal. Operational, but not yet surfaced as a single founder-readable page.
Next action: Build /admin/release-readiness summary page.
Evidence of live checks
Founder preview · Owner: Claude
Active probes ship; their evidence is shown on /system-health. A consolidated 'evidence vault' surface exists in lib but is not founder-routable.
Next action: Expose evidence vault contents at /admin/evidence-vault.
Product Completion Matrix · /founder/product-blueprint
Operational · Owner: Claude
Visible matrix lives on the Founder Preview / Product Blueprint page. Source of truth for what is real, partial, planned, or deferred.
Next action: None — shipped in R13-FRAMEWORK.

Commercial / private pilot readiness

What is ready for a real private pilot with a real buyer. What still needs founder action. What needs Claude action. What is deferred. No payments, no public registration in this cycle.

Deferred

0/5 sub-modules operational

Ready for pilot (proven)
Partially operational · Owner: Claude
Public Learn surface, AI Radar 3-state panel, system-health, and the deterministic scoring engine are pilot-ready.
Next action: None for these surfaces.
Not yet ready for pilot
Partially operational · Owner: Claude
Auth, persistence end-to-end proof, manual non-MS persistence proof, demo default-on, review answer preservation visible after sign-out/in. These are the active gating items.
Next action: See sub-modules in Assessment, Workspace, and Login areas.
Needs founder action
Partially operational · Owner: Founder
Flip the auth flag. Configure the Microsoft env vars. Run the live signed-in walk-through. Confirm visible answer preservation. Decide whether demo is default-on.
Next action: Founder works the punch list on /system-health.
Needs Claude action
Partially operational · Owner: Claude
Prove the answer-preservation fix on the live URL, finalise improvement-cycle persistence UI, wire report-impact warnings into the report page.
Next action: Claude works the build punch list once founder unblocks auth.
Deferred (intentional)
Deferred · Owner: Deferred
Payments. Pricing pages. Public registration. Non-Microsoft connectors (OpenAI/Anthropic/Google/AWS/Salesforce/ServiceNow). Custom-agent dedicated connector. Public beta launch.
Next action: Revisit only after first paid pilot is committed.

Target buyer journey (15 stages)

Home → Learn → Demo or Trial → Login → Workspace → Add agent → Classify → Assess → Generate report → Read findings → Improvement plan → Return later → Re-score → Compare history → Radar / methodology prompts reassessment.

Stage 1
1. Arrive at Home
Surface: /
Operational
Buyer does
Lands on the public site, sees what AgentProof is, and decides whether to Learn more, run the Demo, or Start Trial.
AgentProof does
Renders the public landing with hero, four Learn cards, trust strip, and provider-agnostic positioning chip.
Next action: None.
Stage 2
2. Read the Learn centre
Surface: /learn
Operational
Buyer does
Reads the capability-zone framing, good-agent-design modules, controls + oversight, AI Radar overview, and the reference library.
AgentProof does
Surfaces all 7 Learn pages with consistent gradient hero, sticky section ribbon, prev/next, and cross-links.
Next action: None.
Stage 3
3. Choose Demo or Start Trial
Surface: /demo or /trial
Partially operational
Buyer does
Picks Demo (sample-data preview, no login) or Start Trial (real assessment, login required).
AgentProof does
Demo path requires AGENTPROOF_DEMO_MODE_ENABLED. Trial path leads to login.
Next action: Founder decides whether demo is default-on, or whether trial is the primary path.
Stage 4
4. Sign in (magic link)
Surface: /login → /auth/callback
Blocked by configuration
Buyer does
Enters email, opens magic link, lands signed-in.
AgentProof does
Sends Supabase magic link, exchanges code for session at /auth/callback using the proxy-forwarded origin (S34AC-R9 fix).
Next action: Founder enables AGENTPROOF_SUPABASE_AUTH_ENABLED and runs the live signed-in cycle.
Stage 5
5. Land in the workspace
Surface: /workspace
Blocked by configuration
Buyer does
Sees their workspace home with the estate dashboard.
AgentProof does
Loads workspace-scoped agents, environments, and reports. Renders unified ReadinessScoreRing.
Next action: Founder enables auth.
Stage 6
6. Add an agent
Surface: /workspace/manual-agent/new (or Microsoft connector)
Partially operational
Buyer does
Either adds a manual non-Microsoft agent or connects via the Microsoft Copilot Studio connector.
AgentProof does
Manual: posts to the server action and persists to Supabase. Microsoft: discovery flow asks Microsoft for environments + agents.
Next action: Founder proves manual non-MS persistence end-to-end with a live signed-in walk-through.
Stage 7
7. Classify the agent
Surface: /workspace/[agentId]/classify
Operational
Buyer does
Answers the four classification questions to map the agent to a capability zone.
AgentProof does
Maps responses to Informational / Assisted Work / Action-taking via the deterministic classifier.
Next action: None.
Stage 8
8. Assess (answer the question bank)
Surface: /workspace/[agentId]/review
Partially operational
Buyer does
Goes through the question bank, attaches evidence, sets confidence per answer.
AgentProof does
Stores every answer in the workspace-scoped review snapshot. Saves progress between sessions.
Next action: Founder confirms previously-scored answers actually reappear on sign-out → sign-in (currently failing per founder report).
Stage 9
9. Generate report
Surface: /workspace/reports/[reportId]
Partially operational
Buyer does
Clicks generate, reads the report.
AgentProof does
Runs the deterministic scoring engine, mints a methodology+pack-versioned report, locks the snapshot.
Next action: Confirm live report path end-to-end with auth on.
Stage 10
10. Read findings + evidence
Surface: /workspace/reports/[reportId]
Operational
Buyer does
Reads the executive summary, score band, findings with related Learn + control family + evidence + improvement action.
AgentProof does
Renders ReportFindingsWithGuidancePanel + EvidenceExpectationsPanel + the methodology version block.
Next action: None (sample path proven; live path follows when auth is on).
Stage 11
11. Work the improvement plan
Surface: /workspace/improvement/[reportId]
Partially operational
Buyer does
Picks improvement actions, records owner / status / date, collects evidence.
AgentProof does
Surfaces improvement cards with reassess-after + evidence-to-collect. Persists per-customer state when wired.
Next action: Wire live improvement-cycle persistence UI.
Stage 12
12. Return later
Surface: /login → /workspace
Partially operational
Buyer does
Signs in again, expects everything to be where it was.
AgentProof does
Restores workspace-scoped state, including every previously-scored answer.
Next action: PROVE the answer-preservation path on the live URL with a signed-in cycle.
Stage 13
13. Re-score
Surface: /workspace/reports/[reportId] → Re-score CTA
Partially operational
Buyer does
Decides to re-score (new methodology, agent changed, periodic refresh).
AgentProof does
cloneSnapshotForRescore() produces a NEW draft review_id while the prior snapshot stays immutable.
Next action: Add the Re-score CTA + history list UI.
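The immutability contract described for cloneSnapshotForRescore() can be sketched as follows. The snapshot shape and the function name cloneSnapshotForRescoreSketch are assumptions for illustration; only review_id and the "new draft, prior stays immutable" behaviour come from the stage description.

```typescript
// Hypothetical snapshot shape; fields other than review_id are assumptions.
interface ReviewSnapshot {
  readonly review_id: string;
  readonly status: "draft" | "committed";
  readonly answers: Readonly<Record<string, string>>;
}

// Sketch only: returns a new draft with a fresh review_id and a copy of the
// answers. The prior snapshot object is never written to.
function cloneSnapshotForRescoreSketch(
  prior: ReviewSnapshot,
  newReviewId: string,
): ReviewSnapshot {
  return {
    review_id: newReviewId,
    status: "draft",
    answers: { ...prior.answers },
  };
}
```

Copying the answers into a new object means edits made during the re-score cannot leak back into the committed snapshot.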
Stage 14
14. Compare report history
Surface: /workspace/reports/[agentId]/history
Planned
Buyer does
Compares old vs new scores, sees what changed and why.
AgentProof does
Lists committed snapshots per agent. Shows methodology / pack version per snapshot.
Next action: Build the history list page.
Stage 15
15. Radar / methodology prompts reassessment
Surface: /workspace/reports/[reportId] (warning banner)
Partially operational
Buyer does
Sees a warning that a new approved radar item or a new methodology pack version may affect this report.
AgentProof does
report_impact_v1 computes overlap; emits a banner on the report page when reassessment is recommended.
Next action: Wire the report-impact banner into the report detail page.
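The overlap check behind the banner can be sketched as a set intersection. This is an illustration under stated assumptions, not the report_impact_v1 module: it assumes overlap is measured on control families, and both function names are hypothetical.

```typescript
// Hypothetical inputs: the control families a report covers, and the families
// touched by a newly approved radar item or methodology pack version.
function computeImpactOverlap(
  reportFamilies: string[],
  changedFamilies: string[],
): string[] {
  return reportFamilies.filter(
    (family, index) =>
      changedFamilies.indexOf(family) !== -1 &&
      reportFamilies.indexOf(family) === index, // drop duplicates
  );
}

// Assumption: any non-empty overlap is enough to recommend reassessment.
function reassessmentRecommended(
  reportFamilies: string[],
  changedFamilies: string[],
): boolean {
  return computeImpactOverlap(reportFamilies, changedFamilies).length > 0;
}
```

The banner only reads the result; the original report is not modified by the check.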

Product completion matrix (14 rows)

The master scorecard. Each product area, its target end-state, current status, operational proof required, persistence requirement, visual acceptance, next action, and owner. Known active defects (review answer preservation, demo, manual non-MS persistence, AI Radar first-run, reports polish) appear here and are NOT buried.

Public navigation

Operational · Owner: Claude

Target end-state: Sticky top nav + four-column footer reach every public page. No orphan routes.
Operational proof required: PublicSiteHeader + Footer mount via AppHeader on every public route.
Data / persistence requirement: None.
Visual acceptance required: Founder confirms nav links work from every public page.

Next action: None — shipped at S34AA.

Login (magic link)

Blocked by configuration · Owner: Founder

Target end-state: Buyer enters email, opens magic link, lands signed-in in the workspace.
Operational proof required: Live walk-through: enter email → click link → land in /workspace as the signed-in user.
Data / persistence requirement: Supabase auth tables + workspace tables wired.
Visual acceptance required: Workspace header shows signed-in identity, sign-out works.

Next action: Founder sets AGENTPROOF_SUPABASE_AUTH_ENABLED=true on Railway.

Supabase persistence

Partially operational · Owner: Founder

Target end-state: Every workspace write (agents, reviews, reports, improvement cycles, radar) lands in Supabase EU and survives a sign-out / sign-in.
Operational proof required: R14 (S34AI) shipped 14 declarative probes + data-source chip + migration planner. R15 (S34AJ) shipped the live persistence probe (/api/system-health/persistence) + signed-in resolver + migration executor. R16 (S34AK) shipped the per-surface compliance registry: every live read/write surface is honestly declared as Supabase-first, supabase-via-server-action, local-only-intentional (demo/trial), local-only-defect (awaiting R17 rewire), or scheduled_for_rewire. /system-health renders the compliance panel with the violation count. R16 wired the manual-agent server action + radar API + persistence probe as the operational compliant surfaces; JsonPasteScoreCard hydrate + dashboard/estate + workspace home + per-agent report history are honestly declared as remaining defects, each with a named next action.
Data / persistence requirement: All 7 migrations applied (0001-0007). RLS verified. AGENTPROOF_SUPABASE_AUTH_ENABLED=true on Railway.
Visual acceptance required: Founder sees the persistence proof panel on /system-health flip from 'configured — untested' to 'operational' once the live walk-through runs. Data-source chips on workspace + manual agent + review surfaces show 'Supabase-backed'.

Next action: Founder flips auth flag, runs the live walk-through (create manual agent → refresh → sign out → sign back in → review → report). Claude captures the screenshot proof.
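A per-surface compliance registry with a violation count can be sketched as below. The five statuses mirror the ones named in the R16 proof text, normalised to one spelling. The type names, the example rows, and the rule that only local-only-defect counts as a violation are assumptions for illustration.

```typescript
type ComplianceStatus =
  | "supabase_first"
  | "supabase_via_server_action"
  | "local_only_intentional"
  | "local_only_defect"
  | "scheduled_for_rewire";

interface SurfaceDeclaration {
  surface: string;
  status: ComplianceStatus;
  nextAction?: string;
}

// Assumption: a surface is a violation only when it is a declared defect.
function countViolations(registry: SurfaceDeclaration[]): number {
  return registry.filter((s) => s.status === "local_only_defect").length;
}

// Illustrative rows only; the statuses assigned here are not confirmed.
const exampleRegistry: SurfaceDeclaration[] = [
  { surface: "manual-agent server action", status: "supabase_via_server_action" },
  { surface: "radar API", status: "supabase_first" },
  { surface: "JsonPasteScoreCard hydrate", status: "local_only_defect", nextAction: "R17 rewire" },
  { surface: "demo", status: "local_only_intentional" },
];
```

Declaring every surface explicitly means an undeclared surface is a type error rather than a silent gap.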

Manual non-Microsoft agent

Partially operational · Owner: Founder

Target end-state: Buyer adds a manual agent, signs out, signs in, sees it. Persisted in Supabase, scoped to workspace.
Operational proof required: R7 wired form → server action → repository. Live signed-in walk-through still needed.
Data / persistence requirement: manual_agents table with workspace_id scoping.
Visual acceptance required: Agent appears in /dashboard/estate after re-login.

Next action: Founder runs the live walk-through once auth is on; Claude documents the screenshot proof.

Demo

Operational · Owner: Founder

Target end-state: Buyer can run a sample-data flow end-to-end with no login. Demo data clearly isolated.
Operational proof required: R13 made demo default-on. R14 added an explicit honest diagnostic to the /demo page that names the Railway env var and tells the founder exactly what to do if disabled.
Data / persistence requirement: None — demo writes do NOT hit the live workspace tables.
Visual acceptance required: Founder confirms /demo loads sample flow without login. If founder sees the disabled state, the diagnostic on the page explains the exact Railway action.

Next action: If /demo shows the OFF state, remove AGENTPROOF_DEMO_MODE_ENABLED=false from Railway Variables.

Trial

Founder preview · Owner: Founder

Target end-state: Buyer can walk a full trial workspace, sample assessment, sample report, sample improvement.
Operational proof required: Trial workspace pages render. Sample data only — no real persistence.
Data / persistence requirement: None — sample data only.
Visual acceptance required: Founder confirms trial pages reflect the buyer-grade polish standard.

Next action: Founder reviews /trial, /trial/workspace, /trial/report, /trial/improvement.

Review answer preservation

Partially operational · Owner: Claude

Target end-state: Every previously-scored answer is visible after sign-out / sign-in. Never blank, never lost.
Operational proof required: R13-FRAMEWORK / S34AG fixed NEW reviews via snapshot_version 1.1.0 + question_labels. S34AH adds the legacy fallback cascade for OLD (1.0.0) reviews — per_agent_storage, report_markdown, and history_walk strategies pull from per-agent localStorage, the saved report markdown body, and the report history. The "Not captured in this review" badge now appears ONLY when EVERY primary AND legacy strategy genuinely returns unrecoverable. See the Active Defect Audit panel on this page for the full audit record.
Data / persistence requirement: review_snapshots + per-agent rich-answer rows in Supabase. Cross-device sync is a separate pending defect (currently localStorage-only).
Visual acceptance required: Screenshot of restored answers on the live Railway URL after sign-out → sign-in, with the diagnostic panel showing 0 unrecoverable.

Next action: Founder enables auth flag on Railway and signs in. Claude captures the live screenshot proof. Until then this row stays partially_operational — the fix is real but unproven on the live URL.
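The legacy fallback cascade can be sketched as an ordered list of strategies that stops at the first hit. The three strategy names come from the S34AH description above; the types, the function name recoverLegacyAnswer, and the evaluation order are assumptions, not the contents of legacy_answer_recovery_v1.ts.

```typescript
type LegacyStrategyName = "per_agent_storage" | "report_markdown" | "history_walk";

// A strategy returns the recovered answer, or null when it finds nothing.
type LegacyStrategy = (questionId: string) => string | null;

interface RecoveryResult {
  value: string | null; // null means genuinely unrecoverable
  recoveredBy: LegacyStrategyName | null;
}

// Tries each strategy in order and returns on the first non-null answer.
function recoverLegacyAnswer(
  questionId: string,
  strategies: Array<[LegacyStrategyName, LegacyStrategy]>,
): RecoveryResult {
  for (const [name, attempt] of strategies) {
    const value = attempt(questionId);
    if (value !== null) return { value, recoveredBy: name };
  }
  return { value: null, recoveredBy: null };
}
```

Only the final fall-through result corresponds to the "Not captured in this review" badge; any earlier hit renders the saved value.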

Reports

Partially operational · Owner: Founder

Target end-state: Buyer-grade report: exec summary, score ring, findings linked to evidence + Learn + improvement action, methodology block, print-ready.
Operational proof required: R14 added the ReportVersionMetadataBanner component for explicit methodology / scoring / intelligence-pack / report-version display + reassessment-recommended banner. Sample report path proven. Live report path waits on auth + persistence.
Data / persistence requirement: reports + review_snapshots tables.
Visual acceptance required: Founder approves the polish on both sample and live reports. Old reports show their version metadata banner; reassessment-recommended banner appears without mutating the original report.

Next action: Founder reviews /trial/report polish; confirms standard for live reports.

AI Radar

Partially operational · Owner: Founder

Target end-state: Controlled source-watchlist monitoring with change detection, human review, and versioned packs. No fake live claim.
Operational proof required: R13-A shipped the engine + registry + state machine. First real check run still pending.
Data / persistence requirement: radar tables from migration 0007.
Visual acceptance required: Banner on /learn/ai-landscape-radar flips from 'configured — no run yet' to 'active' once founder runs first check.

Next action: Founder enables admin tools flag and runs first radar check from /admin/intelligence-ops.

Microsoft connector

Blocked by configuration · Owner: Founder

Target end-state: Buyer connects via Microsoft, AgentProof discovers environments + Copilot Studio agents, reads only metadata.
Operational proof required: Six MICROSOFT_* env vars + a real test tenant. Real OAuth round-trip.
Data / persistence requirement: microsoft_connections table with workspace scoping.
Visual acceptance required: Founder runs the Microsoft connect flow against a real tenant.

Next action: Founder configures the six MICROSOFT_* env vars or defers to post-pilot.

Provider-agnostic connector model

Operational · Owner: Claude

Target end-state: Connector-agnostic architecture lets us add OpenAI / Anthropic / Google / AWS / Salesforce / ServiceNow without rewriting the assessment.
Operational proof required: lib/agentproof/provider/provider_agnostic_positioning_v1.ts declares SUPPORTED_PROVIDERS. Each provider has a registry entry.
Data / persistence requirement: None at the architecture level.
Visual acceptance required: Landing page chip says 'Microsoft is one connector path'.

Next action: None — shipped at S34AC-R1 Defect 5.

System health

Operational · Owner: Claude

Target end-state: Founder can open one page and see honestly what works, what is configured, what is missing.
Operational proof required: R12 active probes + R13 product-completion 12-item dashboard + R14 persistence proof panel with 14 founder-named checks and explicit 'Requires founder live login test' badges.
Data / persistence requirement: Read-only agentproof_healthcheck row (migration 0006).
Visual acceptance required: Founder opens /system-health → sees R12 health groups + R13 completion dashboard + R14 persistence proof panel.

Next action: None.

Report export / print

Partially operational · Owner: Founder

Target end-state: Buyer can print or export a report to a clean, branded PDF with cover, dividers, no browser chrome.
Operational proof required: S33L/S33M shipped the print CSS, cover page, dividers, editorial blocks. Founder still needs to confirm on live data.
Data / persistence requirement: Nothing extra — uses the live report.
Visual acceptance required: Founder prints a real report from /workspace/reports/[id] and confirms layout.

Next action: Founder confirms print preview once auth is on.

Commercial readiness

Partially operational · Owner: Founder

Target end-state: AgentProof is ready for a real private pilot with a real buyer. No payments, no public registration in this cycle.
Operational proof required: Auth + persistence + review preservation + manual agent persistence all proven on the live URL.
Data / persistence requirement: All migrations applied + auth on + first pilot data created.
Visual acceptance required: Founder walks the journey end-to-end as a buyer and signs off.

Next action: Work the punch list on /system-health top to bottom; then run a single full pilot walk-through.

Private pilot readiness matrix (R18)

Where the product stands, honestly

Fourteen areas the founder asked about. Each row carries one of six honest statuses, the proof surface, and the single named next action. Nothing is hidden behind "future work."

6 operational · 5 partial · 1 founder preview · 2 blocked by config · 0 planned · 0 deferred · 14 rows

Public site
Operational
What buyer sees: Home, Learn, Demo, AI Radar, Founder blueprint, Sign-in.
Proof surface: / + /learn + /demo + /learn/ai-landscape-radar + /login
Next action: None — shipped S34AA.
Wired by S34AA
Navigation
Operational
What buyer sees: Sticky AppHeader + workspace shell. No duplicate chrome. No orphan pages.
Proof surface: Header mounted at app/layout.tsx
Next action: None — shipped S34AC-R6 / R11.
Wired by S34AC-R6 + R11
Login (Supabase magic link)
Blocked by configuration
What buyer sees: Magic-link form sends a real Supabase auth email when enabled.
Proof surface: /login + Supabase auth callback
Next action: Founder sets AGENTPROOF_SUPABASE_AUTH_ENABLED=true on Railway.
Wired by S34C + R8 + R9
Supabase persistence
Partially operational
What buyer sees: Manual agents, review answers, report history all read from Supabase first when signed in.
Proof surface: /system-health → SourceOfTruthCompliancePanel + LiveSupabaseProbePanel
Next action: Founder enables auth on Railway, then walks the 13-step journey.
Wired by R14 + R15 + R16 + R17
Manual non-Microsoft agent
Partially operational
What buyer sees: Create agent → save → refresh → sign-out / sign-in → still there.
Proof surface: /workspace/manual-agent/new
Next action: Founder runs the live walk-through after sign-in.
Wired by R7 + S34I + R17
Demo
Operational
What buyer sees: Sample agents + capability zone + sample assessment + sample report. Reachable from Home, Learn, Trial.
Proof surface: /demo (default-on)
Next action: If demo shows OFF state, remove AGENTPROOF_DEMO_MODE_ENABLED=false from Railway.
Wired by R13 + R14 diagnostic + R18 sample flow
Trial (signed-in guided)
Founder preview
What buyer sees: Public preview clearly labelled; signed-in guided trial creates a sample workspace + agent + assessment + report.
Proof surface: /trial + /trial/workspace + /trial/assessment + /trial/report
Next action: Founder walks /trial → /trial/workspace → /trial/report; reports any path that breaks.
Wired by S34O + S34P + R18
Review answer preservation
Partially operational
What buyer sees: Every captured answer is visible after sign-out / sign-in. Never blank.
Proof surface: Review wizard + report → snapshot recovery + legacy fallback cascade
Next action: Founder enables auth, signs in, opens a reviewed agent.
Wired by R13 + R13-A + R13-LEGACY (S34AH) + R17
Reports (buyer-grade)
Operational
What buyer sees: Seven-section professional layout, methodology + version banner, honest disclaimer, print/export action.
Proof surface: ReportProfessionalLayout mounted on trial report + workspace reports; ReportExportPrintAction visible.
Next action: Founder reviews the new layout on /trial/report after R18 ships.
Wired by S33L + S33M + R14 banner + R17-EXPANDED + R18
AI Radar MVP
Partially operational
What buyer sees: Controlled source watchlist with check engine, items with status, admin review queue. Public page reports honest operational state.
Proof surface: /learn/ai-landscape-radar + /admin/intelligence-ops + radar API routes
Next action: Founder enables admin tools flag and runs first radar check.
Wired by R13-A + R18 system-health surfacing
Microsoft connector
Blocked by configuration
What buyer sees: Connect Microsoft, discover environments, discover Copilot Studio agents, read metadata only.
Proof surface: /api/connectors/microsoft/* + /connect/microsoft (real tenant required)
Next action: Founder configures the six MICROSOFT_* env vars or defers to post-pilot.
Wired by 1G initial + S34B
Provider-agnostic connector model
Operational
What buyer sees: Architecture supports adding more connectors without rewriting the assessment.
Proof surface: lib/agentproof/provider/provider_agnostic_positioning_v1.ts
Next action: None — shipped S34AC-R1 Defect 5.
Wired by S34AC-R1
System health (acceptance dashboard)
Operational
What buyer sees: Active probes + product completion + R14/R15/R16/R17 panels + R17-EXPANDED 13-step journey + R18 Live Acceptance Runner.
Proof surface: /system-health
Next action: None.
Wired by R12 + R14 + R15 + R16 + R17 + R18
Commercial / private pilot readiness
Partially operational
What buyer sees: Buyer-grade product: real persistence, real reports, honest demo, real AI Radar foundations, founder-runnable acceptance.
Proof surface: Founder walks the 13-step journey on the live URL; smoke evidence pack reports public-route green.
Next action: Founder runs the live walk-through; uses the R18 evidence pack to capture proof.
Wired by R12..R18 cascade

Product completion matrix v2 (R20)

End-product framework — what each area looks like complete

Each area carries a target end-state, current honest status, operational proof surface, remaining gap, named next action, and owner.

7 operational · 5 partial · 1 founder preview · 3 blocked by config · 0 planned · 0 deferred · 16 rows

Public site
Operational
Target end-state: Home, Learn, Demo, AI Radar, Founder blueprint, Sign-in — all reachable from sticky AppHeader and four-column footer. No orphan pages.
Operational proof: Smoke script asserts every public route renders 2xx + contains expected substrings (no dead-page text, no localhost, no SERVICE_ROLE).
Remaining gap: None.
Next action: None — shipped S34AA + S34AC-R6.
Owner: claude · Wired by S34AA + S34AC-R6 + R11
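The per-route assertion made by the smoke script can be sketched as a pure check over a response status and body. This is an illustration, not the actual script: checkPublicRoute and SmokeResult are hypothetical names, and the dead-page marker string is an assumed example, while "localhost" and "SERVICE_ROLE" come from the proof text above.

```typescript
// Substrings that must never appear on a public page.
// The last entry is an assumed example of dead-page text.
const FORBIDDEN_SUBSTRINGS = ["localhost", "SERVICE_ROLE", "This page could not be found"];

interface SmokeResult {
  ok: boolean;
  failures: string[];
}

// Checks one fetched route: 2xx status, expected text present, forbidden text absent.
function checkPublicRoute(status: number, body: string, expected: string[]): SmokeResult {
  const failures: string[] = [];
  if (status < 200 || status > 299) failures.push(`status ${status} is not 2xx`);
  for (const s of expected) {
    if (body.indexOf(s) === -1) failures.push(`missing "${s}"`);
  }
  for (const s of FORBIDDEN_SUBSTRINGS) {
    if (body.indexOf(s) !== -1) failures.push(`forbidden "${s}"`);
  }
  return { ok: failures.length === 0, failures };
}
```

Keeping the check separate from the HTTP fetch makes it easy to run against recorded responses as well as the live URL.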
Navigation
Operational
Target end-state: AppHeader route-aware shell. PublicSiteHeader for public pages, WorkspaceSiteHeader for workspace, AdminSiteHeader for admin. No duplicate chrome.
Operational proof: S34AC-R6 unified shell tests pass; R11 removed duplicate ProductShell chrome.
Remaining gap: None.
Next action: None.
Owner: claude · Wired by S34AC-R6 + R11
Login (Supabase magic link)
Blocked by configuration
Target end-state: Buyer enters email, opens magic link, lands signed-in on /workspace. R8 cookie-options patch ensures the session cookie is set under the production domain.
Operational proof: Magic-link form renders. SSR Supabase client wired. R8 cookie patch applied.
Remaining gap: AGENTPROOF_SUPABASE_AUTH_ENABLED=true must be set on Railway. Until then login is preview-only.
Next action: Founder sets AGENTPROOF_SUPABASE_AUTH_ENABLED=true on Railway.
Owner: founder · Wired by S34C + R8 + R9
Supabase persistence
Partially operational
Target end-state: Every workspace write (agents, reviews, reports, improvement, radar) lands in Supabase EU and survives a sign-out / sign-in.
Operational proof: R14 declarative probes + R15 live anon SELECT probe per table + R16 per-surface compliance registry (0 signed-in localStorage-first violations) + R17 Supabase-first hydration into the 4 R16 defect surfaces. R20 adds explicit write/read proof manifest on /system-health.
Remaining gap: End-to-end persistence proof on the live URL still requires the founder signed-in walk-through to flip the operational verdict from 'configured — untested' to 'operational'.
Next action: Founder enables auth flag, signs in, walks the R18 Live Acceptance Runner on /system-health.
Owner: founder · Wired by R14 + R15 + R16 + R17 + R20
Manual non-Microsoft agent
Partially operational
Target end-state: Buyer adds a manual agent, signs out, signs in, sees it. Persisted in Supabase, scoped to workspace.
Operational proof: R7 wired form → server action → repository. R17 added Supabase-first hydration probe.
Remaining gap: Live signed-in walk-through still needed.
Next action: Founder runs the walk-through once auth is on; Claude captures the screenshot proof.
Owner: founder · Wired by R7 + S34I + R17
Demo
Operational
Target end-state: Buyer runs the sample agent flow end-to-end with no login. Demo data clearly isolated. Reachable from Home, nav, footer, Learn, Trial.
Operational proof: Demo default-on (R13). R14 added the explicit OFF-state diagnostic with the named env action. /system-health reports demo status.
Remaining gap: If the founder sees the OFF state, the diagnostic explains the exact Railway action.
Next action: If /demo shows the OFF state, remove AGENTPROOF_DEMO_MODE_ENABLED=false from Railway Variables.
Owner: founder · Wired by R13 + R14 + R20 nav sweep
Trial (signed-in guided)
Founder preview
Target end-state: Buyer walks /trial → /trial/workspace → /trial/agent/new → /trial/assessment → /trial/improvement → /trial/report. Signed-in trial persists a sample workspace + agent.
Operational proof: 9 trial routes exist with services. Sample data path renders. R18 added /trial/report/professional preview with the buyer-grade ReportProfessionalLayout.
Remaining gap: Trial persistence is still sample-only — signed-in trials do not yet persist a per-user trial workspace.
Next action: Founder reviews the trial path on the live URL; Claude promotes sample → persisted in a follow-on release.
Owner: both · Wired by S34O + S34P + R18 + R20
Review answer preservation
Partially operational
Target end-state: Every captured answer visible after sign-out / sign-in. Never blank.
Operational proof: R13 snapshot recovery + R13-A snapshot v1.1.0 + R13-LEGACY fallback cascade. The diagnostic panel is expected to show 0 unrecoverable on the live URL; that live proof is still outstanding.
Remaining gap: Cross-device sync is a separate pending defect (currently localStorage-only when offline).
Next action: Founder enables auth, signs in, opens a reviewed agent — Claude captures the screenshot proof.
Owner: founder · Wired by R13 + R13-A + R13-LEGACY (S34AH) + R17
Reports (buyer-grade)
Operational
Target end-state: Seven-section professional layout, methodology + version banner, honest disclaimer, print/export action.
Operational proof: ReportProfessionalLayout component live + ReportExportPrintAction visible + methodology/version block. Mounted on /trial/report and /trial/report/professional. Print CSS shipped S33L/S33M.
Remaining gap: Workspace reports still use the existing widget set with the unified score ring (visually consistent). The full ReportProfessionalLayout wrapper is mounted only on the sample trial surface; workspace surfaces can be migrated after the first live walk-through.
Next action: Founder reviews /trial/report/professional layout; approves migration of workspace surfaces in a follow-on.
Owner: founder · Wired by S33L + S33M + R14 banner + R17-EXPANDED + R18
Improvement cycle
Operational
Target end-state: Per-finding improvement guidance card with why-it-matters, what-good-looks-like, evidence-to-collect, reassess-after, related-Learn.
Operational proof: ImprovementGuidancePanel renders sample improvement cards. /trial/improvement route exists. Improvement service wired.
Remaining gap: Persisted improvement-cycle state for real workspace agents needs the live walk-through.
Next action: Same as the persistence walk-through above.
Owner: founder · Wired by S34F + S34O + R18
AI Radar
Partially operational
Target end-state: Controlled source-watchlist monitoring with change detection, human review, versioned packs. No fake live claim.
Operational proof: R13-A shipped engine + registry + state machine + admin review queue + operational status panel. R18 added /system-health surfacing.
Remaining gap: First real check run still pending.
Next action: Founder enables AGENTPROOF_ADMIN_TOOLS_ENABLED=true and runs first radar check from /admin/intelligence-ops.
Owner: founder · Wired by R13-A + R18
Microsoft connector
Blocked by configuration
Target end-state: Buyer connects via Microsoft, AgentProof discovers environments + Copilot Studio agents, reads only metadata.
Operational proof: Six MICROSOFT_* env vars + OAuth callback + Power Platform / Dataverse client wired.
Remaining gap: Founder must configure the six env vars OR defer to post-pilot.
Next action: Founder configures MICROSOFT_TENANT_ID, MICROSOFT_CLIENT_ID, MICROSOFT_CLIENT_SECRET, MICROSOFT_REDIRECT_URI, MICROSOFT_POWER_PLATFORM_SCOPE, MICROSOFT_SESSION_SECRET — or defers to post-pilot.
Owner: founder · Wired by 1G initial + S34B
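A configuration check for the six variables can be sketched as below. The variable names are taken from the next action above; the function name missingMicrosoftEnvVars and the rule that a blank value counts as missing are assumptions.

```typescript
const MICROSOFT_ENV_VARS = [
  "MICROSOFT_TENANT_ID",
  "MICROSOFT_CLIENT_ID",
  "MICROSOFT_CLIENT_SECRET",
  "MICROSOFT_REDIRECT_URI",
  "MICROSOFT_POWER_PLATFORM_SCOPE",
  "MICROSOFT_SESSION_SECRET",
];

// Returns the names that are unset or blank; an empty array means fully configured.
function missingMicrosoftEnvVars(env: Record<string, string | undefined>): string[] {
  return MICROSOFT_ENV_VARS.filter((name) => (env[name] || "").trim() === "");
}
```

In a Node runtime the function would typically be called with process.env, and a non-empty result would keep the connector at "Blocked by configuration".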
Provider-agnostic connector model
Operational
Target end-state: Architecture supports adding OpenAI / Anthropic / Google / AWS / Salesforce / ServiceNow connectors without rewriting the assessment.
Operational proof: lib/agentproof/provider/provider_agnostic_positioning_v1.ts declares SUPPORTED_PROVIDERS. Landing-page chip says 'Microsoft is one connector path'.
Remaining gap: None at the architecture level.
Next action: None — shipped S34AC-R1 Defect 5.
Owner: claude · Wired by S34AC-R1
System health (acceptance dashboard)
Operational
Target end-state: Founder opens one page and sees the live operational state of every product area.
Operational proof: R12 active probes + R13 completion + R14 persistence proof + R15 live probe + R16 compliance + R17-EXPANDED journey + R18 live runner + R18 private-pilot matrix + R19 Basic Auth status + R20 Supabase persistence operational proof + R20 core-journey panel.
Remaining gap: None.
Next action: None.
Owner: claude · Wired by R12 + R14..R20
Admin / founder operations
Blocked by configuration
Target end-state: Founder can open the intelligence-ops queue, run the radar check, manage invites, and read the live admin readiness without exposing buyer surfaces.
Operational proof: /admin/intelligence-ops + admin pages exist. AdminSiteHeader gates the chrome. R19 Basic Auth optional on /admin/* when explicitly enabled.
Remaining gap: AGENTPROOF_ADMIN_TOOLS_ENABLED must be set on Railway for the founder to access the admin tools.
Next action: Founder enables AGENTPROOF_ADMIN_TOOLS_ENABLED=true on Railway.
Owner: founder · Wired by S34G + R13-A + R19
Commercial / private pilot readiness
Partially operational
Target end-state: AgentProof is ready for a controlled private pilot with a real buyer — real persistence, real reports, honest demo, real AI Radar foundations, founder-runnable acceptance, no payments / public registration in this cycle.
Operational proof: R18 Private Pilot Readiness Matrix + R20 Pilot Readiness Pack at /founder/pilot-readiness with go / stop criteria + recommended pilot script.
Remaining gap: Auth flag + live walk-through + first AI Radar check + report acceptance run.
Next action: Work the punch list on /founder/pilot-readiness top to bottom; then run a single full pilot walk-through.
Owner: founder · Wired by R12..R20

Private pilot readiness pack

The full readiness scorecard, security/trust posture, and the recommended pilot script live at /admin/pilot-readiness.

Open /admin/pilot-readiness →

Build policy

Choose one journey or module from the matrix above.
Build it until operational.
Prove it with an active probe in /system-health.
Prove it with a live walk-through on the live URL.
Only then mark it complete in this matrix.
Do not abandon half-built modules unless the admin explicitly defers them.
