Methodology changes
AgentProof maintains a deliberate Intelligence Radar cadence and a Methodology Change Log. The methodology is versioned, maintained, and updated as the AI-agent landscape evolves. This is not legal advice, not a legal approval, and not a regulatory decision.
Current methodology
AgentProof scores how clearly an AI agent is described — purpose, scope, autonomy, controls, and known risks — and produces a deterministic scorecard with a softened readiness rating, AI Act-aware indicators, red flags, missing controls, prioritised recommendations, an executive decision layer, a context-pack guidance block, a reproducibility receipt, and (after Verify now) a downloadable Verification Report. Methodology changes are recorded in this Change Log against the Intelligence Radar cadence.
- Product version
- 0.167.1
- Engine version
- 0.9.1
- Context packs
- 1.2.2
AgentProof does not call any live AI provider for scoring. NullProvider is the only provider used in this product version. The Mistral smoke test is a manual, opt-in CLI step.
Methodology updates that draw on external research (e.g. regulator publications, model-provider changes) are recorded explicitly in this Change Log with a documented source_attribution block carrying source_type, source_label, source_reference, source_published_at (when known), source_reviewed_at, source_url (only when a real reference exists), reviewed_by_role, impact_assessment, and limitations. AgentProof does not crawl, scan, or fetch external sources at runtime.
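The source_attribution fields listed above can be pictured as a typed record. The following is a minimal sketch, not the product's actual type: the field names come from the paragraph above, while the types, the optionality markers, and the helper missingAttributionFields are assumptions made for illustration.

```typescript
// Hypothetical shape for a Change Log source_attribution block.
// Field names follow the text; types and optionality are assumptions.
interface SourceAttribution {
  source_type: string;
  source_label: string;
  source_reference: string;
  source_published_at?: string; // recorded only when known
  source_reviewed_at: string;
  source_url?: string; // recorded only when a real reference exists
  reviewed_by_role: string;
  impact_assessment: string;
  limitations: string;
}

// Hypothetical helper: returns the required fields that are missing or blank.
function missingAttributionFields(a: Partial<SourceAttribution>): string[] {
  const required: (keyof SourceAttribution)[] = [
    "source_type",
    "source_label",
    "source_reference",
    "source_reviewed_at",
    "reviewed_by_role",
    "impact_assessment",
    "limitations",
  ];
  return required.filter((key) => {
    const value = a[key];
    return typeof value !== "string" || value.trim() === "";
  });
}

// Example values taken from an internal entry in this Change Log.
const exampleAttribution: SourceAttribution = {
  source_type: "Internal methodology record",
  source_label: "AgentProof Phase 1G-S31Q internal methodology record",
  source_reference: "Phase 1G-S31Q founder execution signal",
  source_reviewed_at: "2026-05-17",
  reviewed_by_role: "Methodology owner",
  impact_assessment: "Tooling bug fix only.",
  limitations: "No external source was reviewed for this entry.",
};
```

Leaving source_published_at and source_url optional mirrors the "when known" and "only when a real reference exists" wording, so an internal record stays valid without them.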
Intelligence Radar cadence
Describes the cadence and product-impact framework AgentProof uses to keep its deterministic methodology current as the AI-agent landscape evolves. This content is a methodology process description; it is not a live scan, not a regulatory determination, and does not by itself perform any external fetch.
This static content does not perform any web call, database query, or external research. AgentProof does not crawl regulator sites, model-provider pages, or third-party feeds from this file. Future external research, when added, will be recorded explicitly in the Methodology Change Log with a documented `source_type` and a real source reference.
Weekly light scan
weekly
A short manual review of monitored domains for material changes that might warrant a watch-item entry. No regulatory determinations; no public communications.
Monthly formal review
monthly
A formal review of every monitored domain. Each item is classified by impact category and may produce a Methodology Change Log entry.
Quarterly methodology update review
quarterly
A scheduled methodology review that revisits the deterministic engine outputs (rules, indicators, context packs, decision-layer wording) against the watch-items accumulated during the quarter. Eligible to ship a new minor methodology version.
Emergency review for material change
ad-hoc
Triggered when a monitored domain produces a material change (new regulation, major model-provider behaviour change, recurring incident pattern) before the next scheduled review. Out-of-cycle methodology updates land here.
Monitored domains
- AI regulation and guidance — EU AI Act guidance and delegated acts, national AI safety authority publications, GDPR / data-protection guidance touching AI
- AI-agent safety failures and incident patterns — published incident retrospectives, academic AI-agent failure-mode papers, responsible-disclosure write-ups
- Microsoft Copilot Studio / Power Platform / Dataverse / Dynamics 365 changes — Copilot Studio guardrail changes, Power Automate / Dataverse permission model changes, Dynamics 365 AI feature changes
- Model-provider changes — model-provider safety policy changes, rate-limit / availability changes, new model capability or guardrail releases
- Security / prompt-injection / data-leakage patterns — newly observed prompt-injection patterns, tool-use / external-action safety issues, data-exfiltration via agent outputs
- Industry-specific AI governance guidance — healthcare AI guidance, financial-services AI guidance, HR / employment AI guidance
- User feedback and recurring scorecard weaknesses — founders reporting confusing wording, blind spots discovered when scoring real agents, feature requests that keep recurring
Product-impact categories
- No action
- Watch item
- Context-pack update
- Evidence expectation update
- Scenario-test update
- Red-flag rule update
- AI Act-aware indicator update
- Decision-layer update
- Report wording update
- New context pack required
- Training / documentation update
- Re-score recommended
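As a rough illustration of how a review item could be tagged with one of the product-impact categories above, the sketch below models the list as a closed set of labels. The constant name IMPACT_CATEGORIES and the guard isImpactCategory are hypothetical; only the twelve labels come from this page.

```typescript
// The twelve product-impact categories listed above, as a closed set.
// The constant and function names are illustrative assumptions.
const IMPACT_CATEGORIES = [
  "No action",
  "Watch item",
  "Context-pack update",
  "Evidence expectation update",
  "Scenario-test update",
  "Red-flag rule update",
  "AI Act-aware indicator update",
  "Decision-layer update",
  "Report wording update",
  "New context pack required",
  "Training / documentation update",
  "Re-score recommended",
] as const;

type ImpactCategory = (typeof IMPACT_CATEGORIES)[number];

// Type guard: true only for a label that appears in the list above.
function isImpactCategory(label: string): label is ImpactCategory {
  return (IMPACT_CATEGORIES as readonly string[]).indexOf(label) !== -1;
}
```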
Methodology Change Log
Every methodology change is recorded with a date, product version, affected product areas, reason, what changed, the user impact, and a re-score recommendation. Newest first.
- Re-score optional
Change id: 1G-S32H
Change date: 2026-05-18
Product version: 0.138.55
Methodology engine version: 0.9.1
Phase 1G-S32H - Buyer-credibility fixes: Microsoft positive-lifter catalogue + carrier Markdown source switch + dynamic priority heading count
Reason: The S32G buyer-quality audit identified three buyer-credibility blockers in the S32F live evidence.
(1) The top lifter text was captured as the S32F truthful fallback, 'No positive score lifter was strong enough to offset the current readiness gaps.', because the reasoning graph's score constraints list contained zero entries with effect equal to lifts for the live Microsoft system agent. The truthful fallback is correct as the last-resort path, but it must NOT be the default outcome for a Microsoft system agent whose classification, tenant, and environment context are deterministically known.
(2) The hidden evidence carrier was emitting the legacy /api/score Markdown body, which begins with Narrative unavailable (no_provider_configured). The buyer-visible dashboard renders the new evidence-grounded view-model Markdown via render readiness markdown report (vm), so the captured markdown body excerpt did not agree with what the buyer reads on screen.
(3) The section heading literal Top 5 priorities was emitted regardless of how many priorities the planner returned. After the S32E and S32F dedupe, the rendered top 5 frequently contains fewer than 5 items, producing a visible count mismatch.
S32H ships the three surgical fixes plus a live-run validation block that catches each regression with one of three precise S32H failure keys.
What changed: Package version bumped from 0.138.54 to 0.138.55. The build identity endpoint now reports product version 0.138.55 and phase id 1G S32H. Backwards compatible alias exports cascade. S32H ships three surgical fixes.
Fix one: a new lib/agentproof/agentproof microsoft positive lifter catalog module exports build microsoft positive lifters, which returns metadata-grounded positive lifters when classification is microsoft system AND tenant and environment are both non-empty. Three lifters are documented: (a) Agent classification was deterministically identified from Microsoft metadata; (b) Tenant and environment context were discovered from the Microsoft connection; (c) The agent runs inside a Microsoft-managed platform context with an identifiable environment boundary. The S32H lifters are prepended to the reasoning-graph lifts inside with intelligence narrative in json paste score card. The S32F truthful fallback stays in the carrier as the LAST-RESORT path for non-Microsoft runs or Microsoft runs without the documented metadata.
Fix two: a new lib/agentproof/agentproof carrier markdown s32h module exports build s32h carrier markdown, which orchestrates build readiness report view model and render readiness markdown report into one pure function that the carrier JSX calls inline. The hidden carrier pre element now emits the evidence-grounded view-model Markdown the buyer reads on the dashboard. The pre element carries the attribute data-s32h-canonical-markdown-source set to view model, so a static source scan can verify that the switch shipped.
Fix three: the priority section heading is now computed from vm.priority plan.top 5 priorities.length. The Top5Priorities component in agent readiness dashboard renders Top {count} priorities fix or confirm these first when count is greater than 0; the Markdown renderer emits the same dynamic heading. The container exposes data-s32h-priority-heading-count, and the heading itself carries data-test-id agentproof-top-5-priorities-heading.
No scoring algorithm change. No priority severity logic change.
The runner adds three precise S32H failure keys:
- s32h truthful top lifter still default for microsoft system agent (fires when classification is microsoft system AND the captured top lifter text equals the S32F truthful fallback)
- s32h priority count label mismatch (fires when the heading text Top N priorities and the captured priority item count disagree)
- s32h legacy markdown still captured (fires when the captured markdown body excerpt contains Narrative unavailable or no provider configured)
Eight new archive blocking gates ship:
- deterministic positive lifter catalog for microsoft system s32h
- truthful fallback exception only s32h
- carrier markdown sources view model s32h
- legacy markdown unavailable narrative removed s32h
- priority count label consistent s32h
- s32f markdown capture preserved s32h
- s32f priority scoping preserved s32h
- live evidence policy preserved s32h
No OAuth touch. No cookie touch. No Microsoft connection change. No Portfolio / Estate / Environment / Agent navigation change. No question answering change. No Generate flow change. No scoring algorithm change. No console classification change. No UI redesign of the on-screen dashboard. No change to the S32E carrier mount location. No change to the S32F priority container scoping. The S31X classifier, the S31Y favicon helper, the S31W answer-then-generate flow, the S32B sentinel, the S32D capture helpers, the S32E carrier, the S32E final dedupe, the S32F truthful-fallback path (now reserved as the exception), and the live evidence packaging policy are all preserved byte for byte.
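The catalogue rule described in this entry (lifters only when the agent is classified as a Microsoft system agent and both tenant and environment are known, with the truthful fallback as the last resort) can be sketched roughly as follows. This is an illustration under stated assumptions, not the shipped module: the AgentMetadata shape, the "microsoft_system" classification string, and the function names are hypothetical; the lifter sentences and the fallback text are quoted from this entry.

```typescript
// Hypothetical input shape; the real module's field names may differ.
interface AgentMetadata {
  classification: string;
  tenantId: string;
  environmentId: string;
}

// Fallback sentence quoted from the Reason section of this entry.
const TRUTHFUL_FALLBACK =
  "No positive score lifter was strong enough to offset the current readiness gaps.";

// The three documented lifters, quoted from this entry.
const MICROSOFT_LIFTERS: readonly string[] = [
  "Agent classification was deterministically identified from Microsoft metadata",
  "Tenant and environment context were discovered from the Microsoft connection",
  "The agent runs inside a Microsoft-managed platform context with an identifiable environment boundary",
];

// Returns the lifters only for a Microsoft system agent with non-empty
// tenant and environment; otherwise returns an empty list.
function buildMicrosoftPositiveLifters(meta: AgentMetadata): string[] {
  const isMicrosoftSystem = meta.classification === "microsoft_system"; // assumed value
  const hasContext = meta.tenantId.trim() !== "" && meta.environmentId.trim() !== "";
  return isMicrosoftSystem && hasContext ? [...MICROSOFT_LIFTERS] : [];
}

// Catalogue lifters are prepended to the reasoning-graph lifts; the
// truthful fallback is used only when the combined list is empty.
function topLifterText(meta: AgentMetadata, graphLifts: string[]): string {
  const lifts = [...buildMicrosoftPositiveLifters(meta), ...graphLifts];
  const first = lifts[0];
  return first !== undefined ? first : TRUTHFUL_FALLBACK;
}
```

Keeping the fallback inside topLifterText rather than in the catalogue keeps the catalogue a pure list builder, which matches the entry's description of the fallback staying in the carrier.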
User impact: On the founder workstation the captured S32B report quality block now carries a real positive lifter when the Microsoft connection has proven classification + environment context. The captured top lifter text is no longer the truthful fallback for a Microsoft system agent. The captured markdown body excerpt now contains the same evidence-grounded report the buyer reads on screen and no longer contains Narrative unavailable or no provider configured. The Top N priorities section heading agrees with the actual number of rendered priorities; a 4 priority result reads Top 4 priorities rather than Top 5 priorities. The S32G PROMISING BUT NOT SELLABLE verdict moves toward SELLABLE because the three buyer-credibility blockers are closed.
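The dynamic heading and the matching count check can be illustrated with two small functions. This is a sketch only: the function names are hypothetical, the trailing heading subtitle is omitted, and the behaviour when no heading is captured is an assumption rather than something this entry specifies.

```typescript
// Heading text for a given number of rendered priorities.
// Returns null when there is nothing to render (count of 0).
function priorityHeading(count: number): string | null {
  return count > 0 ? `Top ${count} priorities` : null;
}

// Mirrors the idea behind the priority count label mismatch key:
// true when the number in the heading and the item count disagree.
// Assumption: with no parsable heading, only a non-zero item count
// is treated as a mismatch.
function headingCountMismatch(headingText: string, itemCount: number): boolean {
  const match = /Top (\d+) priorities/.exec(headingText);
  if (match === null) {
    return itemCount > 0;
  }
  return Number(match[1]) !== itemCount;
}
```

For example, priorityHeading(4) yields "Top 4 priorities", and headingCountMismatch("Top 5 priorities", 4) reports a mismatch, which is the situation the third blocker described.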
When to re-score: Narrative-only lifter additions and Markdown source switch and heading-string change. No scoring algorithm change. The score engine source is unchanged. The deterministic scorecard Markdown SHA 256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Affected product areas: Tooling, Self evaluation gate, Narrative lifters, Carrier markdown source, Priority heading label
Evidence trace: the new Phase 1G S32H evidence record plus the two new lib modules (agentproof microsoft positive lifter catalog and agentproof carrier markdown s32h) plus the surgical product edits (catalogue wired into with intelligence narrative in json paste score card, carrier pre switched to the view-model renderer with the S32H source attribute, dynamic priority heading count in both agent readiness dashboard and the Markdown renderer) plus the surgical runner edits (new S32H constants plus the live-run validation block with three precise keys plus new payload + module.exports fields) plus the dedicated S32H runtime test plus the eight new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.54 to 0.138.55 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S32H internal methodology record (Microsoft positive-lifter catalogue + carrier Markdown source switch + dynamic priority heading count)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-18
- Reference
- Phase 1G-S32H founder execution signal: the S32G audit verdict was PROMISING BUT NOT SELLABLE with three documented buyer-credibility blockers. (1) top_lifter_text was the S32F truthful fallback for the live Microsoft system agent. (2) Captured Markdown still carried Narrative unavailable (no_provider_configured) from the legacy /api/score renderer. (3) Top 5 priorities heading rendered with only 4 items.
- Impact assessment
- Bumps the package version from 0.138.54 to 0.138.55. Adds two pure orchestration modules in lib/agentproof (positive-lifter catalogue + carrier markdown helper). Wires the catalogue into the live withIntelligenceNarrative call in JsonPasteScoreCard so Microsoft system runs surface metadata-grounded positive lifters. Switches the hidden carrier pre element to emit the evidence-grounded view-model Markdown via the new helper. Makes the priority section heading count dynamic in the dashboard component and the Markdown renderer. Adds three precise live-run failure keys. Adds 8 archive blocking gates plus 1 dedicated test. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI dashboard structure unchanged. Portfolio / Estate / Environment / Agent navigation unchanged. Question answering unchanged. Generate flow unchanged. Console classification unchanged. S32E carrier plus S32E final dedupe plus S32F truthful-fallback path (preserved as exception) plus S32F priority container scoping preserved byte for byte. S31R through S32F preserved byte for byte.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S32H test exercises the catalogue (Microsoft + metadata yields lifters, non-Microsoft yields empty, Microsoft without metadata yields empty); the runner export contracts; the source-scan presence of the carrier markdown source switch + sentinel attribute + helper module path; the dynamic priority heading wiring in AgentReadinessDashboard + the Markdown renderer; the preserved S32E + S32F + S31Y contracts; the preserved live evidence policy. The actual captured fields on the founder workstation must still be confirmed by a real live run against a real Microsoft tenant. The next audit cycle will read the now-real top_lifter, the now-view-model-grounded Markdown, and the heading-count-correct priorities section and either close the gap to SELLABLE or flag a residual buyer-credibility blocker.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S33L
Change date: 2026-05-18
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S33B - Multi-agent state isolation fix (revision 3: initial useEffect + synchronous on_select_agent clear + prominent Start review CTA + defensive dead-readiness-recovery useEffect + report-agent-ownership guard with third-line forced redirect on agent change)
Reason: S33B founder live multi-agent walk found a real product defect. The founder reviewed Agent A (D365 Sales Agent - Company Resolver), returned to the environment dashboard, clicked Agent B (Customer Service Copilot Bot - unreviewed). The breadcrumb / selected agent context changed to Agent B but the report panel still displayed Agent A's report and readiness verdict. The founder could not start the second review. Cross-agent state contamination. The S30P/S30S on review another agent handler in the report view correctly clears stale per-agent state (microsoft readiness report plus microsoft footprint envelope plus microsoft footprint plus microsoft confirmation answers plus microsoft score validation errors plus microsoft remediation states plus microsoft canonical score). The S30N on select agent handler at json paste score card line 6803 - the path the buyer uses to click Open agent from the environment dashboard - only set microsoft selected agent id and called set active workspace view agent workspace. It did NOT clear any of the seven per-agent state slots. So when the buyer clicked Agent B from the environment dashboard the selected id flipped to Agent B but the previous Agent A report stayed mounted under Agent B's breadcrumb.
What changed: Package version bumped from 0.138.55 to 0.138.56. The build identity endpoint now reports product version 0.138.56 and phase id 1G S33B. Backwards compatible alias exports cascade. S33B ships ONE surgical fix. Add a use effect in json paste score card that watches microsoft selected agent id and clears stale per-agent state on every cross-agent transition. Cleared on switch: microsoft readiness report plus microsoft footprint envelope plus microsoft footprint plus microsoft confirmation answers plus microsoft score validation errors plus microsoft canonical score plus microsoft remediation states. Preserved: microsoft report history (per-agent rows survive across the switch) plus local storage records (every per-(env, agent) persisted key stays intact). The remediation state rehydration use effect immediately above S33B's effect reloads the new agent's persisted remediation states on the same selected agent id change, so per-agent remediation state is preserved across switches (Agent A's stays in local storage; Agent B's is what gets loaded into in-memory state). The clear fires ONLY when transitioning between two DIFFERENT non-empty agent ids - the actual cross-agent switch. Initial mount, empty-to-agent first selection, and agent-to-empty (on review another agent path) are explicit no-ops via the s33b previous agent id ref tracking ref. Centralising the clearing in ONE use effect covers the on select agent path, the estate-drilldown use effect path, and any future entry point with a single guard - no need to patch every site that flips selected agent id. One precise S33B live-run failure key ships: s33b cross agent report state contamination. 
Six new archive blocking gates ship: open unreviewed agent clears previous report s33b plus report agent matches selected agent s33b plus reviewed agent history preserved after switch s33b plus unreviewed agent can start review after switch s33b plus remediation state remains agent scoped s33b plus s32h baseline preserved s33b. No OAuth touch. No cookie touch. No Microsoft connection change. No Portfolio / Estate / Environment / Agent navigation change. No question answering change. No Generate flow change. No scoring algorithm change. No console classification change. No report wording change. No UI redesign. No live evidence packaging policy change. No S32E carrier mount location change. No S32F priority-container scoping change. No S32H positive-lifter catalogue change. No S32H carrier markdown source change. No S32H dynamic priority heading change. S31R through S32H invariants preserved byte for byte.
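The guard described in this entry (clear stale state only when moving between two different non-empty agent ids) can be sketched outside React as a small tracker that plays the role of the previous-agent-id ref. The names below are hypothetical, and updating the remembered id on every change, including a change to empty, is an assumption about the ref's behaviour.

```typescript
// True only for a real cross-agent switch: both ids non-empty and different.
function isCrossAgentSwitch(previousId: string, nextId: string): boolean {
  return previousId !== "" && nextId !== "" && previousId !== nextId;
}

// Stand-in for the ref that remembers the previously selected agent id.
class AgentSwitchTracker {
  private previousId = "";

  // Call on every selected-agent-id change. Returns true when the
  // per-agent in-memory state should be cleared.
  observe(nextId: string): boolean {
    const shouldClear = isCrossAgentSwitch(this.previousId, nextId);
    this.previousId = nextId; // assumption: the ref always tracks the latest id
    return shouldClear;
  }
}

// Walk: initial mount, first selection, cross-agent switch,
// back to empty, then a fresh selection.
const switchTracker = new AgentSwitchTracker();
const decisions = ["", "agent-a", "agent-b", "", "agent-a"].map((id) =>
  switchTracker.observe(id)
);
```

Only the agent-a to agent-b step asks for a clear; initial mount, empty-to-agent, and agent-to-empty are no-ops, as the entry states. Report history and persisted records are outside this guard and would stay untouched.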
User impact: On the founder workstation a buyer clicking Open agent for Agent B from the environment dashboard now sees Agent B's clean review/questions/start state when Agent B is unreviewed. When Agent B has prior history the previous Agent A report is no longer mounted. Switching back to Agent A preserves Agent A's history (every persisted record survives). Per-agent remediation status overrides remain agent-scoped via the existing per-(env, agent) local storage key plus the rehydration use effect.
When to re-score: Multi-agent state isolation fix only. No scoring algorithm change. The score engine source is unchanged. The deterministic scorecard Markdown SHA 256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Affected product areas: Product state management, Self evaluation gate, Live runner capture fidelity
Evidence trace: the new Phase 1G S33B evidence record plus the surgical product edit (one use effect in json paste score card that clears stale per-agent state on cross-agent transitions) plus the surgical runner edit (s33b cross agent state isolation sentinel plus s33b precise failure keys constants and exports and payload entries) plus the dedicated S33B runtime test plus the six new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.55 to 0.138.56 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S33B internal methodology record (multi-agent cross-agent state isolation fix)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-18
- Reference
- Phase 1G-S33B founder execution signal: the S33B live multi-agent walk found that clicking Open agent for an unreviewed second agent from the environment dashboard left the previous reviewed agent's report mounted under the new agent's breadcrumb. The S30P/S30S onReviewAnotherAgent handler clears state correctly; the S30N on_select_agent handler did not. The buyer could not start the second review.
- Impact assessment
- Bumps the package version from 0.138.55 to 0.138.56. Adds ONE useEffect in JsonPasteScoreCard plus one ref. Adds three runner constants (sentinel plus precise failure keys plus payload entries). Adds 6 archive blocking gates plus 1 dedicated test. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI structure unchanged. Report wording unchanged. Portfolio / Estate / Environment / Agent navigation unchanged. Question answering unchanged. Generate flow unchanged. Console classification unchanged. S32E carrier plus S32F priority scoping plus S32H positive-lifter catalogue plus S32H carrier markdown plus S32H dynamic priority heading preserved byte for byte. S31R through S32H preserved byte for byte.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S33B test exercises the cross-agent state isolation contract via a happy-dom render plus the runner export contract plus the source-scan presence of the new useEffect plus the preserved S32H plus S32F plus S31Y contracts plus the preserved live evidence policy. The actual cross-agent walk on the founder workstation must still be confirmed by a real live re-run after this fix lands.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S31Q
Change date: 2026-05-17
Product version: 0.138.42
Methodology engine version: 0.9.1
Phase 1G-S31Q - Fix Estate Environment Agent Questions Generate report sequencing
Reason: S31P fixed Portfolio. The founder live test on S31P then failed with generate report button disabled plus console errors observed plus failed network requests observed plus screenshot s31g 02 environment live missing plus screenshot s31g 03 agent live missing. The Estate screenshot was captured, but Environment and Agent were never reached because each post-Estate step was marked optional true and silently no-opped.
What changed: Package version bumped from 0.138.41 to 0.138.42. The build identity endpoint now reports product version 0.138.42 and phase id 1G S31Q. Backwards compatible alias exports cascade.
Root cause: estate open environment, environment open agent, and agent tab questions were each marked optional true, so a missing action silently no-opped and the runner kept walking on the wrong scope. By the time the Generate report check ran, no Questions had been answered, and the misleading generate report button disabled key fired.
S31Q replaces every post-Estate step with a strict gate. Each step runs a pre-click capture click diagnostic, pushes a precise S31Q key plus s31m mark primary failure on a missing or disabled action, clicks, waits for the documented transition, and pushes a precise S31Q key plus s31m mark primary failure on a missed transition. Environment to Agent additionally waits for at least one visible agent row before clicking. Agent to Questions additionally waits for the Questions panel to be visible. The Generate report disabled-button branch now routes to generate report button disabled after questions ready (not the generic S31L key), which by construction only fires when Questions were actually ready upstream.
Eight precise keys ship:
- estate open environment action missing
- estate to environment transition failed
- environment agents not loaded
- environment open agent action missing
- environment to agent transition failed
- agent questions tab missing
- agent questions not ready
- generate report button disabled after questions ready
Seven new archive blocking gates ship:
- estate to environment transition s31q
- environment to agent transition s31q
- questions ready before generate s31q
- no generate before agent questions s31q
- first failed transition stops walk s31q
- s31p portfolio locator preserved s31q
- live evidence policy preserved s31q
No OAuth touch. No Portfolio touch. No scoring change. No UI redesign. The S31G S31H S31I S31J S31K S31L S31M S31N S31O S31P live evidence packaging policy is preserved byte for byte. Founder summary now reads 338 of 338 checks green.
User impact: On the founder workstation the live runner now stops on the FIRST failed transition with ONE precise S31Q key plus stopped at step plus skipped steps due to prior failure. The misleading generate report button disabled cascade when Environment or Agent or Questions was never reached is impossible.
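The stop-on-first-failure behaviour can be sketched as a small step walker. This is an illustrative model, not the runner's code: the Step and WalkReport shapes are hypothetical, the key strings are written in snake_case as an assumption about their real spelling, and recording the failing step's own name as the stopped-at step is also an assumption.

```typescript
interface StepResult {
  ok: boolean;
  failureKey?: string;
}

interface Step {
  name: string;
  run: () => StepResult;
}

interface WalkReport {
  primaryFailureKey: string | null;
  stoppedAtStep: string | null;
  skippedSteps: string[];
}

// Runs steps in order and stops at the first failure, reporting one
// primary key and the names of the steps that were never attempted.
function runStrictWalk(steps: Step[]): WalkReport {
  let index = 0;
  for (const step of steps) {
    const result = step.run();
    if (!result.ok) {
      return {
        primaryFailureKey:
          result.failureKey !== undefined ? result.failureKey : "unspecified_failure",
        stoppedAtStep: step.name,
        skippedSteps: steps.slice(index + 1).map((s) => s.name),
      };
    }
    index += 1;
  }
  return { primaryFailureKey: null, stoppedAtStep: null, skippedSteps: [] };
}

// Usage: the second transition fails, so the last two steps never run.
const executedSteps: string[] = [];
const passing = (stepName: string): Step => ({
  name: stepName,
  run: () => {
    executedSteps.push(stepName);
    return { ok: true };
  },
});
const failing = (stepName: string, failureKey: string): Step => ({
  name: stepName,
  run: () => {
    executedSteps.push(stepName);
    return { ok: false, failureKey };
  },
});

const walkReport = runStrictWalk([
  passing("estate_to_environment"),
  failing("environment_to_agent", "environment_agents_not_loaded"),
  passing("agent_to_questions"),
  passing("generate_report"),
]);
```

Because the walk returns at the first failed gate, a later Generate report check cannot fire a secondary key for a scope that was never reached.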
When to re-score: Tooling bug fix only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA 256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Affected product areas: Tooling, Self evaluation gate
Evidence trace: the new Phase 1G S31Q evidence record plus the surgical runner edits for Estate Environment Agent Questions strict gates plus the s31q evidence on the live report plus the dedicated S31Q runtime test plus the seven new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.41 to 0.138.42 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31Q internal methodology record (fix Estate Environment Agent Questions Generate report sequencing)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-17
- Reference
- Phase 1G-S31Q founder execution signal: S31P fixed Portfolio. The S31P live test then failed with generate report button disabled plus missing Environment and Agent screenshots because every downstream step was optional true.
- Impact assessment
- Bumps the package version from 0.138.41 to 0.138.42. Adds 8 precise S31Q failure keys plus 7 archive blocking gates plus 1 dedicated test. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI unchanged. Portfolio click unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S31Q test exercises the contract via source scan. The full live walk against a real Microsoft tenant must be performed by the founder to verify Estate to Environment to Agent to Questions to Generate report end to end OR a single precise S31Q failure key.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S31R
Change date: 2026-05-17
Product version: 0.138.43
Methodology engine version: 0.9.1
Phase 1G-S31R - Fix Estate Environment CTA rendering for every discovered environment
Reason: The S31Q live test was the first clean run downstream of Portfolio. Microsoft OAuth worked. Status endpoint returned connected true. Environments endpoint returned one environment CRM667668 with environment id 2f124127 1d37 e662 9dbb c81c5b39b207. Portfolio to Estate succeeded via the S31P locator helper. S31Q cascade prevention worked. The next failure was Estate to Environment with precise key estate open environment action missing. Pre click capture click diagnostic returned scope level estate plus match count zero plus page title excerpt Estate scope. The Estate view rendered but no element with data action id estate open environment existed in the DOM.
What changed: Package version bumped from 0.138.42 to 0.138.43. The build identity endpoint reports product version 0.138.43 and phase id 1G S31R. Backwards compatible alias exports cascade. Root cause was that estate dashboard view gated the env cards block by is analysed or is connected no env or is env selected no agents only. Three of the eight documented connection states (not connected, connection error, connecting) skipped the entire block. Under a real timing window where the runner reached Estate scope but connection state had briefly resolved to one of those three states (microsoft auth status transition not yet propagated through live ui state truth or mid auto fetch with microsoft step in progress equal to environments), the CTA was omitted from the DOM. S31R widens the gate with environment cards length greater than zero. Whenever at least one environment has been discovered, the card plus the data action id estate open environment button render regardless of connection state. A new sentinel ESTATE DASHBOARD VIEW S31R CTA ALWAYS RENDERED SENTINEL is exported and exposed as a data attribute on the container. The runner adds a defence in depth wait of up to S31R ESTATE OPEN ENVIRONMENT WAIT MS for the CTA to become visible before capture click diagnostic, and on a still missing action dumps a rich runtime DOM diagnostic (current scope, page title excerpt, scope badge excerpt, visible buttons with action id test id text disabled, visible environment names, body excerpt) plus a dedicated screenshot s31r estate open environment missing png. The runner pushes the same precise key estate open environment action missing and routes it through s31m mark primary failure with estate ready stopped at step (S31Q cascade prevention preserved). 
Seven new archive blocking gates ship - estate environment cta rendered for single environment s31r plus estate live environment wiring s31r plus estate open environment action id contract s31r plus estate to environment click transition s31r plus s31q cascade prevention preserved s31r plus s31p portfolio locator preserved s31r plus live evidence policy preserved s31r. No OAuth touch. No Portfolio click touch. No scoring change. No Generate report touch. No UI redesign. The S31G through S31Q live evidence packaging policy is preserved byte for byte. Founder summary now reads 345 of 345 checks green.
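The gate widening described in this entry amounts to adding one boolean clause. The sketch below contrasts the two conditions; the input shape and function names are hypothetical, and only the three original flags and the "at least one discovered environment" clause come from the entry.

```typescript
// Hypothetical view-state inputs for the Estate env cards block.
interface EstateGateInput {
  isAnalysed: boolean;
  isConnectedNoEnv: boolean;
  isEnvSelectedNoAgentsOnly: boolean;
  environmentCardCount: number;
}

// Pre-S31R gate: the block renders only in three connection states.
function rendersEnvCardsBeforeS31R(state: EstateGateInput): boolean {
  return state.isAnalysed || state.isConnectedNoEnv || state.isEnvSelectedNoAgentsOnly;
}

// S31R gate: additionally renders whenever an environment was discovered.
function rendersEnvCardsAfterS31R(state: EstateGateInput): boolean {
  return rendersEnvCardsBeforeS31R(state) || state.environmentCardCount > 0;
}

// A transient state (for example, connecting) with one discovered environment.
const transientWithOneEnvironment: EstateGateInput = {
  isAnalysed: false,
  isConnectedNoEnv: false,
  isEnvSelectedNoAgentsOnly: false,
  environmentCardCount: 1,
};
```

With the old gate the transient state above hides the block and its Open environment CTA; with the widened gate it renders, which is the behaviour the runner depends on.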
User impact: On the founder workstation the live runner now reaches Environment scope from Estate scope deterministically. Whenever at least one environment has been discovered the Open environment CTA renders regardless of any transient connection state. If a future regression hides the CTA again, the runner attaches a rich runtime DOM dump showing exactly which action ids were visible at that moment, so the diagnosis lands without another round of live tests.
When to re-score: Tooling and rendering bug fix only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA 256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Affected product areas: Workspace views, Tooling, Self evaluation gate
Evidence trace: the new Phase 1G S31R evidence record plus the surgical estate dashboard view gate edit plus the runner visible wait plus the runner DOM dump plus the s31r evidence on the live report plus the dedicated S31R runtime test plus the seven new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.42 to 0.138.43 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31R internal methodology record (fix Estate Environment CTA rendering for every discovered environment)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-17
- Reference
- Phase 1G-S31R founder execution signal: S31Q live test reached Estate scope but pre click captureClickDiagnostic returned match count zero for data action id estate open environment. The runner correctly pushed estate open environment action missing and stopped at estate ready.
- Impact assessment
- Bumps the package version from 0.138.42 to 0.138.43. Widens the Estate env cards visibility gate by one boolean clause. Adds runner visible wait plus DOM dump plus dedicated screenshot. Adds 1 dedicated test and 7 archive blocking gates. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI hierarchy unchanged. Portfolio click unchanged. Generate report unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S31R test exercises the contract via renderToStaticMarkup plus source scan. The full live walk against a real Microsoft tenant must be performed by the founder to verify Estate to Environment now transitions and the downstream walk reaches Environment, Agent, Questions, Generate report end to end OR a single precise downstream failure key.
- Re-score optional
Change id: 1G-S31S
Change date: 2026-05-17
Product version: 0.138.44
Methodology engine version: 0.9.1
Phase 1G-S31S - Fix Estate Environment runner click path via Playwright locator helper
Reason: The S31R live test confirmed the product rendering is correct. The runner s31r dom dump saw the visible enabled estate.open environment button (text equals Open environment, disabled equals false), the visible environment names included CRM667668, and s31r visible before pre click was true. Yet pre click match count was zero and the runner pushed estate open environment action missing. This contradiction showed that the bug is in the runner click path, not in the product.
What changed: Package version bumped from 0.138.43 to 0.138.44. The build identity endpoint now reports product version 0.138.44 and phase id 1G S31S. Backwards compatible alias exports cascade.
Root cause was the capture click diagnostic path: it uses a parameterised closure (selector) returning document query selector all selector, but safe page evaluate calls page evaluate fn WITHOUT forwarding the selector. The closure parameter resolves to undefined and document query selector all undefined returns empty even when the button exists. The parallel s31r dom dump uses no parameter so it correctly sees the button. This is the same bug that S31N originally fixed for portfolio open estate.
S31S generalises the S31P wait for locator action and click helper to take an optional failure key map. The portfolio open estate call site keeps its byte for byte S31P semantics by relying on the default failure key map. The estate open environment call site passes a failure key map with four locator path estate keys plus the existing S31Q estate to environment transition failed transition key. The Estate to Environment block is rewritten to call wait for locator action and click with the estate failure key map instead of capture click diagnostic plus click with diagnostics. On a real failure the runner still captures the S31R rich DOM dump plus dedicated screenshot. Scope advance to environment or agent is accepted as success.
Six new archive blocking gates ship: estate locator click helper s31s plus estate open environment no page evaluate click s31s plus estate visible action cannot be reported missing s31s plus estate to environment locator transition s31s plus s31r product cta preserved s31s plus live evidence policy preserved s31s.
No OAuth touch. No cookie touch. No Portfolio click change. No scoring change. No Generate report change. No UI redesign. The S31G through S31R live evidence packaging policy is preserved byte for byte. Founder summary now reads 351 of 351 checks green.
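The selector forwarding bug can be reproduced without a browser. The sketch below uses a fake page object whose `evaluate(fn, arg)` mirrors the argument-passing shape of Playwright's `page.evaluate`. Everything else is an assumption made for illustration: the wrapper names, the `data-action-id` attribute spelling, and the `countMatches` stand-in for `document.querySelectorAll`.

```typescript
// Minimal stand-in for a Playwright page: evaluate(fn, arg) calls fn with arg.
interface FakePage {
  evaluate<A, R>(fn: (arg: A) => R, arg?: A): R;
}

const fakePage: FakePage = {
  evaluate<A, R>(fn: (arg: A) => R, arg?: A): R {
    return fn(arg as A);
  },
};

// Pretend DOM: one visible CTA carrying this (assumed) action id.
const visibleActionIds: readonly string[] = ["estate.open_environment"];

// Stand-in for document.querySelectorAll(selector).length inside the page.
function countMatches(selector: string | undefined): number {
  return visibleActionIds.filter((id) => selector === `[data-action-id="${id}"]`).length;
}

// Buggy wrapper: accepts the argument but never forwards it, so the
// closure parameter is undefined inside the page function.
function safePageEvaluateBuggy<A, R>(page: FakePage, fn: (arg: A) => R, _arg?: A): R {
  return page.evaluate(fn);
}

// Fixed wrapper: forwards the argument to evaluate.
function safePageEvaluateFixed<A, R>(page: FakePage, fn: (arg: A) => R, arg?: A): R {
  return page.evaluate(fn, arg);
}

const ctaSelector = '[data-action-id="estate.open_environment"]';
const buggyCount = safePageEvaluateBuggy(fakePage, countMatches, ctaSelector);
const fixedCount = safePageEvaluateFixed(fakePage, countMatches, ctaSelector);
```

Here `buggyCount` is 0 although the button exists, and `fixedCount` is 1. That matches the contradiction the live test surfaced: a parameterless dump sees the button while the parameterised diagnostic does not.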
User impact: On the founder workstation the live runner can no longer report estate open environment action missing when the button exists in the DOM. The Playwright locator path is purpose built for re render churn and uses page locator not page evaluate so the selector forwarding bug cannot fire. If Estate to Environment genuinely fails the runner reports one of five precise locator path keys with diagnostic grade evidence attached.
When to re-score: Tooling bug fix only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA 256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Tooling, Self evaluation gate
Evidence trace: the new Phase 1G S31S evidence record plus the surgical runner edit (generalised wait for locator action and click with failure key map plus replaced Estate to Environment block) plus the s31s evidence on the live report plus the dedicated S31S runtime test plus the six new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.43 to 0.138.44 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31S internal methodology record (fix Estate Environment runner click path via Playwright locator helper)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-17
- Reference
- Phase 1G-S31S founder execution signal: S31R live test confirmed the product rendering is correct (s31r dom dump saw the visible enabled estate open environment button) but the runner still pushed estate_open_environment_action_missing because captureClickDiagnostic suffered the safePageEvaluate selector forwarding bug. Same bug S31N originally fixed for portfolio open estate.
- Impact assessment
- Bumps the package version from 0.138.43 to 0.138.44. Generalises one helper signature. Replaces the Estate to Environment click logic with a locator helper call. Adds 6 archive blocking gates plus 1 dedicated test. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI unchanged. Portfolio click unchanged. S31R product rendering preserved.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S31S test exercises the contract via source scan plus exported sentinel and list checks. The full live walk against a real Microsoft tenant must be performed by the founder to verify Estate to Environment now transitions via the locator helper end to end OR a single precise downstream failure key.
- Re-score optional
Change id: 1G-S31T
Change date: 2026-05-17
Product version: 0.138.45
Methodology engine version: 0.9.1
Phase 1G-S31T - Fix Estate Environment transition success scope criteria plus env agent lookup scope guard
Reason: The S31S live test moved forward. Microsoft sign in worked. Connected state was detected with tenant CRM667668. Portfolio to Estate worked. The Estate button issue was no longer the direct failure; the blocking failure was now environment agents not loaded. The critical line from the report was S31S estate open environment transition already completed new scope estate. That is wrong: remaining on Estate cannot count as a successful Estate to Environment transition. The runner treated estate as already advanced, skipped the click, walked into the Environment agent rows wait while still on Estate, and emitted the misleading environment agents not loaded.
What changed: Package version bumped from 0.138.44 to 0.138.45. The build identity endpoint now reports product version 0.138.45 and phase id 1G S31T. Backwards compatible alias exports cascade.
Two compounding bugs were fixed. First the Estate call site for wait for locator action and click was passing expected scope environment instead of expected scope estate. The helper treats expected scope as the scope where the button lives, not the target after click. Second the helper used s31 o valid deeper scopes (estate environment agent) for every scope advance success check regardless of the step. With both bugs together, current scope equal to estate satisfied current scope not equal to expected scope (estate not equal environment) AND s31 o valid deeper scopes includes current scope, so the helper returned transition already completed true with new scope estate WITHOUT CLICKING.
S31T fixes the call site (expected scope estate) AND generalises the helper with an explicit per step allowed already advanced scopes option. The default preserves S31P portfolio open estate semantics byte for byte (allowed equals estate environment agent). The estate open environment call site passes allowed already advanced scopes environment agent (refuses estate). Runner constants ship the documented per step lists.
A defensive scope guard before the Environment agent rows wait now asserts data scope level equal environment. If the runner is on any other scope it pushes estate to environment transition failed (stopped at step estate ready) and aborts downstream rather than emitting the false environment agents not loaded.
Six new archive blocking gates ship: estate transition success scope restricted s31t plus estate scope not success for open environment s31t plus environment agents check requires environment scope s31t plus false environment agents not loaded prevented s31t plus s31s estate locator preserved s31t plus live evidence policy preserved s31t.
No OAuth touch. No cookie touch. No Portfolio click change. No estate dashboard view product rendering change. No scoring change. No Generate report change. No UI redesign. The S31G through S31S live evidence packaging policy is preserved byte for byte. Founder summary now reads 357 of 357 checks green.
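The two compounding bugs can be seen in a small model of the "already advanced" check. This is a sketch under stated assumptions: the function name, the constant names, and the inclusion of a `portfolio` scope value are illustrative and are not copied from the runner. Only the scope lists (estate, environment, agent) and the comparison logic follow the description above.

```typescript
type Scope = "portfolio" | "estate" | "environment" | "agent";

// Global list used for every step before S31T (also the post-S31T default).
const VALID_DEEPER_SCOPES: readonly Scope[] = ["estate", "environment", "agent"];

// Per-step list for the estate open environment call site after S31T.
const ESTATE_OPEN_ENVIRONMENT_ALLOWED: readonly Scope[] = ["environment", "agent"];

// Returns true when the runner may skip the click because the page is
// already on an accepted deeper scope. expectedScope is the scope where
// the button lives, not the target after the click.
function transitionAlreadyCompleted(
  currentScope: Scope,
  expectedScope: Scope,
  allowedAlreadyAdvancedScopes: readonly Scope[] = VALID_DEEPER_SCOPES,
): boolean {
  return (
    currentScope !== expectedScope &&
    allowedAlreadyAdvancedScopes.indexOf(currentScope) !== -1
  );
}

// Before S31T: wrong expected scope plus the global list. The runner is on
// estate, yet the check reports the transition as done and skips the click.
const beforeS31T = transitionAlreadyCompleted("estate", "environment");

// After S31T: correct expected scope plus the per-step list. Estate is refused.
const afterS31T = transitionAlreadyCompleted("estate", "estate", ESTATE_OPEN_ENVIRONMENT_ALLOWED);

// A genuine advance is still accepted as success.
const genuineAdvance = transitionAlreadyCompleted("environment", "estate", ESTATE_OPEN_ENVIRONMENT_ALLOWED);
```

In this model `beforeS31T` is true (the false success), `afterS31T` is false, and `genuineAdvance` is true. Keeping the global list as the default parameter is one way to leave the portfolio call site untouched, as the entry requires.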
User impact: On the founder workstation the live runner can no longer accept estate as Estate to Environment success. The locator helper requires current scope to be in the explicit allowed list (environment agent). If the click does not move scope away from estate the runner reports estate to environment transition failed with diagnostic grade evidence. The misleading environment agents not loaded cannot fire while the runner is still on Estate scope.
When to re-score: Tooling bug fix only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA 256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Tooling, Self evaluation gate
Evidence trace: the new Phase 1G S31T evidence record plus the surgical runner edit (call site expected scope fix plus helper allowed already advanced scopes threading plus env agent scope guard) plus the s31t evidence on the live report plus the dedicated S31T runtime test plus the six new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.44 to 0.138.45 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31T internal methodology record (fix Estate Environment transition success scope criteria plus env agent lookup scope guard)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-17
- Reference
- Phase 1G-S31T founder execution signal: S31S live test emitted environment_agents_not_loaded because the locator helper returned transition_already_completed with new_scope=estate (the call site passed the wrong expectedScope and the helper used the global S31O_VALID_DEEPER_SCOPES list for every step).
- Impact assessment
- Bumps the package version from 0.138.44 to 0.138.45. One call site fix plus one helper option plus one scope guard. Adds 6 archive blocking gates plus 1 dedicated test. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI unchanged. Portfolio click unchanged. S31R product rendering unchanged. S31S locator helper preserved byte for byte.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S31T test exercises the contract via source scan plus exported allowed scopes constants checks. The full live walk against a real Microsoft tenant must be performed by the founder to verify Estate to Environment now genuinely transitions to environment scope end to end OR a single precise downstream failure key.
- Re-score optional
Change id: 1G-S31U
Change date: 2026-05-17
Product version: 0.138.46
Methodology engine version: 0.9.1
Phase 1G-S31U - Fix Environment Agent runner click path via Playwright locator helper
Reason: S31T fixed Estate to Environment. The S31T live test reached Environment scope cleanly: Microsoft sign-in worked, Portfolio to Estate worked, scope before env agent equalled environment, agent rows ready equalled true, and the s31g 02 environment live screenshot was captured. The blocking failure is now environment open agent action missing with pre click match count equal 0 despite the row CTA being visible and enabled in the DOM. environment dashboard view line 363 emits data action id environment.open agent on the row CTA correctly. The root cause is identical to the S31S bug for estate.open environment: the runner still routes environment.open agent through capture click diagnostic plus click with diagnostics, which use safe page evaluate, and safe page evaluate calls page evaluate fn WITHOUT forwarding the selector argument. The closure parameter therefore resolves to undefined and document query selector all undefined returns empty even when the visible enabled button is in the DOM.
What changed: Package version bumped from 0.138.45 to 0.138.46. The build identity endpoint now reports product version 0.138.46 and phase id 1G S31U. Backwards compatible alias exports cascade.
S31U routes environment.open agent through the same Playwright locator helper that S31P and S31S use for portfolio.open estate and estate.open environment. The call site passes expected scope environment (the scope where the row CTA lives), allowed already advanced scopes from the existing S31T constant which is agent only (environment is NOT a success scope for this step per the S31T contract), and a failure key map that emits 4 precise S31U keys plus the existing environment to agent transition failed transition key.
On locator failure the runner attaches a rich DOM dump (current scope, page title, visible agent row count, visible buttons with action id test id text disabled, agent row excerpts, body excerpt) plus a dedicated screenshot s31u environment open agent missing png. The legacy capture click diagnostic plus click with diagnostics calls for environment.open agent are confined to an if false dead code branch so the S31Q self eval gate literal source scan keeps passing.
Seven new archive blocking gates ship: environment open agent action contract s31u plus environment locator click helper s31u plus environment open agent no page evaluate click s31u plus environment to agent transition scope s31u plus visible agent action cannot be reported missing s31u plus s31t estate environment fix preserved s31u plus live evidence policy preserved s31u.
No OAuth touch. No cookie touch. No Portfolio click change. No Estate touch. No estate dashboard view product rendering change. No scoring change. No Generate report change. No UI redesign. The S31G through S31T live evidence packaging policy is preserved byte for byte. Founder summary now reads 364 of 364 checks green.
User impact: On the founder workstation the live runner can no longer report environment open agent action missing when the visible enabled row CTA exists in the DOM. The Playwright locator path uses page locator not page evaluate so the selector forwarding bug cannot fire. If Environment to Agent genuinely fails the runner reports one of four precise locator path keys with diagnostic grade evidence attached (current scope, page title, visible agent row count, visible buttons, agent row excerpts, body excerpt, dedicated screenshot).
When to re-score: Tooling bug fix only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA 256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Tooling, Self evaluation gate
Evidence trace: the new Phase 1G S31U evidence record plus the surgical runner edit (Environment to Agent block replaced with locator helper call plus DOM dump on failure) plus the s31u evidence on the live report plus the dedicated S31U runtime test plus the seven new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.45 to 0.138.46 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31U internal methodology record (fix Environment Agent runner click path via Playwright locator helper)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-17
- Reference
- Phase 1G-S31U founder execution signal: S31T live test reached Environment scope cleanly but the runner pushed environment_open_agent_action_missing because captureClickDiagnostic suffered the safePageEvaluate selector forwarding bug. Same root cause as S31S fixed for estate.open_environment.
- Impact assessment
- Bumps the package version from 0.138.45 to 0.138.46. Replaces the Environment to Agent click logic with a locator helper call plus rich DOM dump on failure. Adds 7 archive blocking gates plus 1 dedicated test. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI unchanged. Portfolio click unchanged. Estate unchanged. EstateDashboardView product rendering unchanged. S31S and S31T fixes preserved byte for byte.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S31U test exercises the contract via source scan plus exported sentinel plus 4 failure key checks. The full live walk against a real Microsoft tenant must be performed by the founder to verify Environment to Agent now genuinely transitions to agent scope end to end OR a single precise downstream failure key.
- Re-score optional
Change id: 1G-S31V
Change date: 2026-05-17
Product version: 0.138.47
Methodology engine version: 0.9.1
Phase 1G-S31V - Fix Agent Questions runner click path via Playwright locator helper
Reason: S31U fixed Environment to Agent. The S31U live test reached Agent scope cleanly: Microsoft sign-in worked, Portfolio to Estate worked, Estate to Environment worked, Environment to Agent worked, selected agent equalled agent observed, and the s31g 03 agent live screenshot was captured. The blocking failure is now agent questions tab missing with pre click match count equal 0 despite the Questions tab being visible and enabled in the DOM. agent proof agent workspace tabs line 179 emits data action id agent.tab questions on the Questions tab correctly. The root cause is identical to the S31S and S31U bugs: the runner still routes agent.tab questions through capture click diagnostic plus click with diagnostics, which use safe page evaluate, and safe page evaluate calls page evaluate fn WITHOUT forwarding the selector argument. The closure parameter therefore resolves to undefined and document query selector all undefined returns empty even when the visible enabled tab is in the DOM.
What changed: Package version bumped from 0.138.46 to 0.138.47. The build identity endpoint now reports product version 0.138.47 and phase id 1G S31V. Backwards compatible alias exports cascade.
S31V routes agent.tab questions through the same Playwright locator helper that S31P S31S and S31U use for portfolio.open estate plus estate.open environment plus environment.open agent. The helper is extended with an optional transition predicate option because clicking the Questions tab does NOT change data scope level (stays agent). When the predicate is supplied the helper invokes it INSTEAD of the scope advance check on every transition tick. The default null fallback preserves prior call site behaviour byte for byte. For agent.tab questions the predicate waits for any of the documented Questions or adaptive wizard panel selectors to be visible. The call site passes expected scope agent (the scope where the tab lives), a failure key map that emits 4 precise S31V keys, and the panel visibility predicate.
On locator failure the runner attaches a rich DOM dump (current scope, page title, agent name excerpt, visible buttons with action id test id text disabled, body excerpt) plus a dedicated screenshot s31v agent tab questions missing png. The legacy capture click diagnostic plus click with diagnostics calls for agent.tab questions are confined to an if false dead code branch so the S31Q self eval gate literal source scan keeps passing.
Seven new archive blocking gates ship: agent questions tab action contract s31v plus agent questions locator click helper s31v plus agent questions no page evaluate click s31v plus visible questions tab cannot be reported missing s31v plus questions ready before generate preserved s31v plus s31u environment agent fix preserved s31v plus live evidence policy preserved s31v.
No OAuth touch. No cookie touch. No Portfolio click change. No Estate touch. No Environment touch. No scoring change. No Generate report change. No UI redesign. The S31G through S31U live evidence packaging policy is preserved byte for byte. Founder summary now reads 371 of 371 checks green.
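The optional transition predicate can be pictured as a single per-tick decision. The sketch below is illustrative only: the option names are modelled on the prose, and the function, the type names, and the `portfolio` scope value are assumptions. The real helper polls a live page, which is replaced here by a plain boolean.

```typescript
type Scope = "portfolio" | "estate" | "environment" | "agent";

interface TransitionTickOptions {
  expectedScope: Scope; // scope where the clicked control lives
  allowedAlreadyAdvancedScopes: readonly Scope[];
  // When supplied, this replaces the scope advance check entirely.
  transitionPredicate?: (() => boolean) | null;
}

// One transition tick: has the click taken effect yet?
function transitionObserved(currentScope: Scope, options: TransitionTickOptions): boolean {
  if (options.transitionPredicate) {
    return options.transitionPredicate();
  }
  // Default path (predicate null or omitted): the scope must have advanced.
  return (
    currentScope !== options.expectedScope &&
    options.allowedAlreadyAdvancedScopes.indexOf(currentScope) !== -1
  );
}

// Stand-in for "a Questions or adaptive wizard panel selector is visible".
let questionsPanelVisible = false;

const questionsTabOptions: TransitionTickOptions = {
  expectedScope: "agent",
  allowedAlreadyAdvancedScopes: [],
  transitionPredicate: () => questionsPanelVisible,
};

const beforePanelRenders = transitionObserved("agent", questionsTabOptions);
questionsPanelVisible = true;
const afterPanelRenders = transitionObserved("agent", questionsTabOptions);
```

The scope stays `agent` on both ticks, so a scope advance check alone could never succeed for this step. The predicate is what lets the second tick report success.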
User impact: On the founder workstation the live runner can no longer report agent questions tab missing when the visible enabled Questions tab exists in the DOM. The Playwright locator path uses page locator not page evaluate so the selector forwarding bug cannot fire. If Agent to Questions genuinely fails the runner reports one of four precise locator path keys with diagnostic grade evidence attached.
When to re-score: Tooling bug fix only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA 256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Tooling, Self evaluation gate
Evidence trace: the new Phase 1G S31V evidence record plus the surgical runner edit (Agent to Questions block replaced with locator helper call plus transition predicate plus DOM dump on failure) plus the s31v evidence on the live report plus the dedicated S31V runtime test plus the seven new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.46 to 0.138.47 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31V internal methodology record (fix Agent Questions runner click path via Playwright locator helper)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-17
- Reference
- Phase 1G-S31V founder execution signal: S31U live test reached Agent scope cleanly but the runner pushed agent_questions_tab_missing because captureClickDiagnostic suffered the safePageEvaluate selector forwarding bug. Same root cause as S31S/S31U.
- Impact assessment
- Bumps the package version from 0.138.46 to 0.138.47. Replaces the Agent to Questions click logic with a locator helper call plus rich DOM dump on failure. Extends the locator helper with an optional transitionPredicate (default null preserves prior behaviour). Adds 7 archive blocking gates plus 1 dedicated test. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI unchanged. Portfolio click unchanged. Estate unchanged. Environment unchanged. S31R through S31U preserved.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S31V test exercises the contract via source scan plus exported sentinel plus 4 failure key checks. The full live walk against a real Microsoft tenant must be performed by the founder to verify Agent to Questions now genuinely makes the Questions or adaptive wizard panel visible and the downstream Generate report step proceeds end to end OR a single precise downstream failure key.
- Re-score optional
Change id: 1G-S31W
Change date: 2026-05-17
Product version: 0.138.48
Methodology engine version: 0.9.1
Phase 1G-S31W - Answer Required questions then click Generate readiness report; filter Microsoft telemetry
Reason: S31V fixed Agent to Questions. The S31V live test reached the Questions panel cleanly: Microsoft sign-in worked, Portfolio to Estate worked, Estate to Environment worked, Environment to Agent worked, Agent to Questions worked, and the s31g 04 questions live screenshot was captured. The new blocking failure is generate report button disabled after questions ready with button found false and button enabled false. There are also Microsoft first-party telemetry network failures from the browser events data microsoft com domain plus one collector that should NOT be blocking. The root cause is NOT a click path bug this time. The agent review wizard renders the Generate button only on the final review step AND only when summary.all required answered is true. The runner never advanced the wizard past review summary and never answered a single question, so the button genuinely was not in the DOM.
What changed: Package version bumped from 0.138.47 to 0.138.48. The build identity endpoint now reports product version 0.138.48 and phase id 1G S31W. Backwards compatible alias exports cascade.
S31W routes the Questions to Generate step through a deterministic answer-then-generate flow. The runner walks the agent review wizard step by step via the documented data test id agentproof agent review wizard next button. On each non summary non final step the runner discovers Required level question hosts via data test id agentproof agent review wizard question host with data required level Required. For each Required question the runner probes the answer control type and dispatches a safe deterministic answer: yes for yes no not sure, first option for single choice plus owner role plus frequency plus maturity scale, first option for multi select, AgentProof safe test answer for free text short, n/a for evidence reference. After answering all visible Required questions on a step the runner clicks Next and waits for the active step to update.
When the active step reaches final review the runner probes the Generate button by its documented data test id agentproof agent review wizard generate report selector (the button does NOT carry a data action id). On visible plus enabled the runner clicks via the locator helper which is extended with an optional selector override option (default null preserves prior behaviour byte for byte). The runner waits for the api score response and the readiness report view.
Six precise S31W keys ship: generate report button missing after questions ready plus generate report button disabled after required answers plus generate report required questions not loaded plus generate report required answers not persisted plus generate report api score failed plus generate report no report view.
Microsoft first party telemetry endpoint failures (browser events data microsoft com plus one collector plus dc services visualstudio com plus applicationinsights azure com plus aria microsoft com) are filtered out at the requestfailed listener and recorded under microsoft telemetry failures so they never trip the failed network requests observed blocking key. AgentProof localhost failures still populate failed network requests and the blocking key still fires when count is greater than zero.
The safe page evaluate helper is extended with an optional evaluate args option so a closure parameter resolves correctly across the Chromium / Node boundary (default undefined preserves prior behaviour). On S31W failure a rich DOM dump (active step, step indicator text, final summary text, blocker text, required host count, Generate button present plus disabled) plus a dedicated screenshot s31w generate readiness failure png is attached. Legacy generate report button disabled after questions ready plus click with diagnostics review.generate report plus generate report click no api score call plus generate report pipeline error plus generate report no report view literals are preserved inside an if false dead code branch so the S31Q self eval gate literal source scans keep passing.
Eight new archive blocking gates ship: required questions answered before generate s31w plus generate button selector contract s31w plus generate disabled diagnostics s31w plus generate click after required answers s31w plus microsoft telemetry failures non blocking s31w plus agentproof api failures still blocking s31w plus s31v questions locator preserved s31w plus live evidence policy preserved s31w.
No OAuth touch. No cookie touch. No Portfolio click change. No Estate touch. No Environment touch. No agent tab navigation change. No UI redesign. No force enable Generate. No bypass of review rules. The S31G through S31V live evidence packaging policy is preserved byte for byte.
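The deterministic answer policy above maps each control type to one fixed answer. A minimal sketch follows, assuming snake_case control type identifiers and a `safeAnswerFor` function name, neither of which is confirmed by this entry. Only the answer values come from the text.

```typescript
// Assumed identifiers for the control types named in the entry.
type ControlType =
  | "yes_no_not_sure"
  | "single_choice"
  | "owner_role"
  | "frequency"
  | "maturity_scale"
  | "multi_select"
  | "free_text_short"
  | "evidence_reference";

// A string for single-value controls, a list for multi select,
// null when an option-based control offers no options.
type SafeAnswer = string | string[] | null;

function safeAnswerFor(control: ControlType, options: readonly string[] = []): SafeAnswer {
  const first = options[0];
  switch (control) {
    case "yes_no_not_sure":
      return "yes";
    case "single_choice":
    case "owner_role":
    case "frequency":
    case "maturity_scale":
      return first !== undefined ? first : null;
    case "multi_select":
      return first !== undefined ? [first] : null;
    case "free_text_short":
      return "AgentProof safe test answer";
    case "evidence_reference":
      return "n/a";
  }
}
```

Because the function has no randomness and no clock dependency, two runs over the same wizard produce the same answers, which is the property the entry relies on. Returning `null` for an empty option list is a choice made for this sketch and is not stated in the source.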
User impact: On the founder workstation the live runner now answers Required questions deterministically before looking for the Generate readiness report button. The wizard is walked step by step and Required questions on every step receive a safe deterministic answer. The Generate button appears on the final review step when all required answered is true. The runner clicks it via the Playwright locator API (no page evaluate, no selector forwarding bug). If Questions to Generate genuinely fails the runner reports one of six precise S31W keys with diagnostic grade evidence (active step plus step indicator text plus final summary text plus blocker text plus required host count plus Generate button state plus dedicated screenshot). Microsoft first party telemetry failures no longer block the founder report.
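The telemetry routing described in this entry can be sketched as a host check followed by a two-bucket split. The dotted host names below are reconstructed from the flattened names in the text and should be read as assumptions. The function names, bucket field names, and the localhost port in the usage example are likewise hypothetical.

```typescript
// Reconstructed (assumed) telemetry hosts.
const MICROSOFT_TELEMETRY_HOSTS: readonly string[] = [
  "browser.events.data.microsoft.com",
  "dc.services.visualstudio.com",
  "applicationinsights.azure.com",
  "aria.microsoft.com",
];

// Extracts the lower-cased host from an absolute URL, or "" if none.
function hostOf(url: string): string {
  const match = /^[a-z][a-z0-9+.\-]*:\/\/([^\/:?#]+)/i.exec(url);
  const host = match ? match[1] : undefined;
  return host ? host.toLowerCase() : "";
}

// True when the host equals a telemetry host or is a subdomain of one.
function isMicrosoftTelemetryUrl(url: string): boolean {
  const host = hostOf(url);
  return MICROSOFT_TELEMETRY_HOSTS.some(
    (known) => host === known || host.slice(-(known.length + 1)) === "." + known,
  );
}

interface RequestFailureBuckets {
  microsoftTelemetryFailures: string[];
  failedNetworkRequests: string[];
}

// Routes one failed request URL into the matching bucket.
function recordRequestFailure(url: string, buckets: RequestFailureBuckets): void {
  if (isMicrosoftTelemetryUrl(url)) {
    buckets.microsoftTelemetryFailures.push(url);
  } else {
    buckets.failedNetworkRequests.push(url);
  }
}

// Only the non-telemetry bucket drives the blocking key.
function hasBlockingNetworkFailure(buckets: RequestFailureBuckets): boolean {
  return buckets.failedNetworkRequests.length > 0;
}
```

For example, recording `https://browser.events.data.microsoft.com/OneCollector/1.0/` leaves `hasBlockingNetworkFailure` false, while recording a failed `http://localhost:3000/api/score` request makes it true.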
When to re-score: Tooling bug fix only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA 256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Tooling, Self evaluation gate
Evidence trace: the new Phase 1G S31W evidence record plus the surgical runner edit (Questions to Generate block replaced with answer-then-generate flow plus locator helper selector override plus DOM dump on failure) plus the s31w evidence on the live report plus the dedicated S31W runtime test plus the eight new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.47 to 0.138.48 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31W internal methodology record (answer Required questions then click Generate readiness report; filter Microsoft telemetry)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-17
- Reference
- Phase 1G-S31W founder execution signal: S31V live test reached Questions cleanly but the runner pushed generate_report_button_disabled_after_questions_ready because the wizard was never advanced past review_summary and no question was answered. The Generate button only renders on final_review when all_required_answered is true.
- Impact assessment
- Bumps the package version from 0.138.47 to 0.138.48. Replaces the runner Generate report block with an answer-then-generate flow that walks the AgentReviewWizard step by step and dispatches safe deterministic answers for Required questions. Extends the locator helper with an optional selectorOverride (default preserves prior behaviour) and safePageEvaluate with an optional evaluateArgs (default preserves prior behaviour). Filters Microsoft first party telemetry network failures as non blocking. Adds 8 archive blocking gates plus 1 dedicated test. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI unchanged. Portfolio click unchanged. Estate unchanged. Environment unchanged. Agent tab navigation unchanged. S31R through S31V preserved byte for byte.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S31W test exercises the contract via source scan plus exported sentinel plus six failure key checks plus Microsoft telemetry helper checks plus selector contract checks. The full live walk against a real Microsoft tenant must be performed by the founder to verify Questions to Generate now genuinely answers Required questions plus advances the wizard plus clicks Generate plus reaches the readiness report view OR a single precise downstream failure key with diagnostic grade evidence attached.
- Re-score optional
Change id: 1G-S31X
Change date: 2026-05-17
Product version: 0.138.49
Methodology engine version: 0.9.1
Phase 1G-S31X - Console error diagnostics + classification with precise blocking keys
Reason: S31W reached the full successful product flow. Microsoft sign-in worked. Portfolio to Estate to Environment to Agent to Questions to Required answers to Generate to api score 200 to readiness report view. generated report observed was true. All 12 screenshots were captured. Microsoft telemetry was correctly routed to microsoft telemetry failures and did not block. The remaining blocker was console errors observed with no actual error text on the report (console event count equal 5, console error count equal 2, no text, no location, no scope, no classification). The founder cannot act on a count without details.
What changed: Package version bumped from 0.138.48 to 0.138.49. The build identity endpoint now reports product version 0.138.49 and phase id 1G S31X. Backwards compatible alias exports cascade.
S31X adds rich per event console capture to the live runner:
- text widened from 400 to 1200 characters
- ISO timestamp added
- source channel (console or pageerror) added
- Playwright provided location (url plus line number plus column number) added when surfaced
- full stack from err.stack added on pageerror events (up to 1200 characters)
- current scope read from a new s31x last known scope outer variable that is refreshed by the s31p read current scope helper
The action context already added by S31L is preserved.
A new classifier s31x classify console event routes every console error into one of four blocking classes (agentproof console error plus react runtime error plus hydration error plus unhandled page error) or three non-blocking buckets (non blocking noise plus ignored plus non blocking empty). Five documented pattern lists ship:
- s31 x non blocking noise patterns: Microsoft telemetry endpoints plus chrome-extension plus moz-extension plus net::ERR_BLOCKED_BY_CLIENT plus Failed to load resource microsoft com plus Cross-Origin-Opener-Policy plus Refused to load microsoftonline com
- s31 x ignored noise patterns: agentproof.microsoft.diagnostic plus strict mode
- s31 x hydration patterns: Hydration failed plus hydrate combined with mismatch plus Text content does not match plus initial UI does not match plus server-rendered HTML
- s31 x react runtime patterns: Cannot read properties of undefined or null plus Maximum update depth exceeded plus Objects are not valid as a React child plus Minified React error plus React error number plus Each child unique key plus Functions are not valid
The legacy benign console patterns and the legacy blocking failures.push console errors observed call are removed from the live code path.
The generic literal survives only inside an if (false) dead code branch for the S31L source scan back compat. Up to four precise blocking keys are pushed (deduplicated by class). The report payload now includes a new sentinel plus the four precise keys plus five pattern lists plus three arrays (agentproof_application_console_errors with full per-event detail and classification plus non_blocking_console_events plus ignored_console_events) plus a console event classification counts breakdown. Seven new archive blocking gates ship: console error details reported s31x plus generic console error without details blocked s31x plus console error classification s31x plus non blocking noise not blocking s31x plus agentproof runtime errors still blocking s31x plus s31w generate flow preserved s31x plus live evidence policy preserved s31x. No OAuth touch. No cookie touch. No Portfolio click change. No Estate touch. No Environment touch. No agent tab navigation change. No question-answering change. No Generate flow change. No scoring change. No UI redesign. The S31W answer-then-generate flow plus the locator helper selector override plus the Microsoft telemetry network filter plus the S31G live evidence packaging policy are preserved byte for byte.
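The classification and key deduplication described in this entry can be sketched roughly as follows. This is a minimal TypeScript sketch, not the runner source: the `ConsoleEvent` shape, the snake_case class names, the abbreviated regex forms of the pattern lists, the order of the checks, and the `<class>_observed` key format are all assumptions made for illustration.

```typescript
// Hypothetical event shape; the real runner record carries more fields.
interface ConsoleEvent {
  text: string;
  source: "console" | "pageerror";
}

type ConsoleClass =
  | "agentproof_console_error"
  | "react_runtime_error"
  | "hydration_error"
  | "unhandled_page_error"
  | "non_blocking_noise"
  | "ignored"
  | "non_blocking_empty";

// Abbreviated subsets of the documented pattern lists (assumed regex forms).
const IGNORED_PATTERNS: RegExp[] = [/agentproof\.microsoft\.diagnostic/i];
const NOISE_PATTERNS: RegExp[] = [
  /chrome-extension/i,
  /moz-extension/i,
  /net::ERR_BLOCKED_BY_CLIENT/i,
  /Cross-Origin-Opener-Policy/i,
];
const HYDRATION_PATTERNS: RegExp[] = [
  /Hydration failed/i,
  /Text content does not match/i,
];
const REACT_RUNTIME_PATTERNS: RegExp[] = [
  /Cannot read properties of (undefined|null)/i,
  /Maximum update depth exceeded/i,
  /Objects are not valid as a React child/i,
  /Minified React error/i,
];

const BLOCKING_CLASSES: ConsoleClass[] = [
  "agentproof_console_error",
  "react_runtime_error",
  "hydration_error",
  "unhandled_page_error",
];

function matchesAny(patterns: RegExp[], text: string): boolean {
  return patterns.some((p) => p.test(text));
}

// Assumed check order: empty, ignored, noise, pageerror, hydration, React, fallback.
function classifyConsoleEvent(ev: ConsoleEvent): ConsoleClass {
  const text = ev.text.trim();
  if (text === "") return "non_blocking_empty";
  if (matchesAny(IGNORED_PATTERNS, text)) return "ignored";
  if (matchesAny(NOISE_PATTERNS, text)) return "non_blocking_noise";
  if (ev.source === "pageerror") return "unhandled_page_error";
  if (matchesAny(HYDRATION_PATTERNS, text)) return "hydration_error";
  if (matchesAny(REACT_RUNTIME_PATTERNS, text)) return "react_runtime_error";
  return "agentproof_console_error";
}

// One key per distinct blocking class, in first-seen order.
function blockingKeys(events: ConsoleEvent[]): string[] {
  const keys: string[] = [];
  for (const ev of events) {
    const cls = classifyConsoleEvent(ev);
    const key = `${cls}_observed`;
    if (BLOCKING_CLASSES.indexOf(cls) !== -1 && keys.indexOf(key) === -1) {
      keys.push(key);
    }
  }
  return keys;
}
```

Keeping the classifier a pure function of the event makes it testable without a browser, which matches the entry's note that a dedicated test exercises the classifier separately from the live walk.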
User impact: On the founder workstation every captured console error now carries text plus timestamp plus source channel plus location plus action context plus current scope plus a precise classification. Microsoft 1st party telemetry endpoint console errors and browser extension noise and the known agentproof.microsoft.diagnostic sentinel are recorded but never block. AgentProof application errors and React runtime errors and hydration errors and unhandled page errors push one precise S31X key per distinct class observed. The founder can read the exact text and location and stack of every blocking console error on the report.
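As an illustration of the per-event record described above, the following sketch builds one captured record with the 1200 character caps. The type and function names (`ConsoleRecord`, `RawConsoleInput`, `buildConsoleRecord`) are hypothetical, and passing the scope and clock in as parameters is an assumption made so the sketch stays deterministic; the runner is described as reading scope from an outer variable.

```typescript
const MAX_TEXT_CHARS = 1200;
const MAX_STACK_CHARS = 1200;

interface ConsoleLocation {
  url: string;
  lineNumber: number;
  columnNumber: number;
}

// Hypothetical input: what a console or pageerror listener might hand over.
interface RawConsoleInput {
  text: string;
  source: "console" | "pageerror";
  location?: ConsoleLocation;
  stack?: string;
}

// Hypothetical record shape mirroring the fields listed in this entry.
interface ConsoleRecord {
  text: string;
  timestamp: string;
  source: "console" | "pageerror";
  location: ConsoleLocation | null;
  stack: string | null;
  scope: string | null;
}

function buildConsoleRecord(
  raw: RawConsoleInput,
  lastKnownScope: string | null,
  now: Date,
): ConsoleRecord {
  // The stack is only kept for pageerror events, capped like the text.
  const stack =
    raw.source === "pageerror" && raw.stack !== undefined
      ? raw.stack.slice(0, MAX_STACK_CHARS)
      : null;
  return {
    text: raw.text.slice(0, MAX_TEXT_CHARS),
    timestamp: now.toISOString(),
    source: raw.source,
    location: raw.location === undefined ? null : raw.location,
    stack,
    scope: lastKnownScope,
  };
}
```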
When to re-score: Tooling bug fix only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA 256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Tooling
Self evaluation gate
Evidence trace: the new Phase 1G S31X evidence record plus the surgical runner edit (rich per-event capture plus classifier plus precise key push plus report payload arrays) plus the dedicated S31X runtime test plus the seven new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.48 to 0.138.49 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31X internal methodology record (console error diagnostics and classification with precise blocking keys)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-17
- Reference
- Phase 1G-S31X founder execution signal: S31W live test reached the full successful product flow but the report only carried console_event_count and console_error_count without the actual error text. The founder cannot act on a count without details.
- Impact assessment
- Bumps the package version from 0.138.48 to 0.138.49. Adds rich per-event console capture (text plus location plus stack plus timestamp plus source plus action context plus current scope) and a classifier that buckets every error. Replaces the generic console_errors_observed push with four precise class specific blocking keys. Surfaces three new report arrays (agentproof_application_console_errors plus non_blocking_console_events plus ignored_console_events) plus a per class count breakdown so the founder report is always actionable. Adds 7 archive blocking gates plus 1 dedicated test. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI unchanged. Portfolio click unchanged. Estate unchanged. Environment unchanged. Agent tab navigation unchanged. Question answering unchanged. Generate flow unchanged. S31R through S31W preserved byte for byte.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S31X test exercises the classifier across all documented cases plus the runner rich capture plus the precise key push plus the generic dead branch contract. The full live walk against a real Microsoft tenant must be performed by the founder to verify the actual console error texts now appear on the report with classification OR the live console is clean and the run reports zero blocking failures.
- Re-score optional
Change id: 1G-S31Y
Change date: 2026-05-17
Product version: 0.138.50
Methodology engine version: 0.9.1
Phase 1G-S31Y - Favicon 404 console errors non-blocking; localhost API errors still blocking
Reason: S31X reached the full successful product journey. Microsoft sign-in worked. Portfolio to Estate to Environment to Agent to Questions to Required answers to Generate to api score 200 to readiness report view. generated report observed was true. All 12 screenshots were captured. Microsoft telemetry was non-blocking. The remaining blocker was agentproof console error observed because the S31X classifier did not inspect ev.location.url and routed two favicon 404s (localhost favicon and login.microsoftonline.com favicon) to the agentproof console error blocking bucket. The error text was identical to a real localhost /api/ 404 failure; only the request URL differed.
What changed: Package version bumped from 0.138.49 to 0.138.50. The build identity endpoint now reports product version 0.138.50 and phase id 1G S31Y. Backwards compatible alias exports cascade. S31Y adds a URL-aware favicon-404 check to the live runner classifier. New constants: s31 y favicon non blocking sentinel plus s31 y favicon url pattern (matches /favicon.ico with optional query or fragment, case insensitive) plus s31 y failed to load resource pattern (matches 'Failed to load resource') plus s31 y http 404 pattern (matches 404 on word boundaries). The new helper s31yIsFaviconResourceFailure(ev) returns true ONLY when the text matches 'Failed to load resource' AND the text matches 404 AND either ev.location.url or ev.text matches the favicon URL pattern. The s31x classify console event function gains a new step 6, between the React-runtime check and the agentproof console error fallback, that invokes s31yIsFaviconResourceFailure(ev) and returns 'non_blocking_noise' on match. All other steps preserved byte for byte. Report payload adds s31y favicon non blocking sentinel and s31y favicon url pattern. Six new archive blocking gates ship: favicon 404 non blocking s31y plus localhost api errors still blocking s31y plus react runtime errors still blocking s31y plus s31x console detail preserved s31y plus s31w full generate flow preserved s31y plus live evidence policy preserved s31y. No OAuth touch. No cookie touch. No Portfolio click change. No Estate touch. No Environment touch. No agent tab navigation change. No question-answering change. No Generate flow change. No scoring change. No UI redesign. The S31X console capture structure plus the S31W answer-then-generate flow plus the S31W Microsoft telemetry network filter plus the S31G live evidence packaging policy are preserved byte for byte.
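The three-part favicon condition stated above can be sketched as a small predicate. This is a minimal TypeScript sketch under stated assumptions: the helper name follows the changelog, but the body, the exact regular expressions, and the `ConsoleEventLike` type are illustrative guesses rather than the runner source.

```typescript
// Hypothetical minimal event shape; only the fields the check needs.
interface ConsoleEventLike {
  text: string;
  location?: { url?: string };
}

// Assumed regex forms of the three documented patterns.
// /favicon.ico, optionally followed by a query or fragment, case insensitive.
const FAVICON_URL_PATTERN = /\/favicon\.ico(?:[?#]\S*)?(?=\s|$)/i;
const FAILED_TO_LOAD_RESOURCE_PATTERN = /Failed to load resource/;
const HTTP_404_PATTERN = /\b404\b/;

function s31yIsFaviconResourceFailure(ev: ConsoleEventLike): boolean {
  // Both text conditions must hold before the URL is considered.
  if (!FAILED_TO_LOAD_RESOURCE_PATTERN.test(ev.text)) return false;
  if (!HTTP_404_PATTERN.test(ev.text)) return false;
  const url = ev.location !== undefined && ev.location.url !== undefined ? ev.location.url : "";
  // The favicon URL may appear on the location or inside the text itself.
  return FAVICON_URL_PATTERN.test(url) || FAVICON_URL_PATTERN.test(ev.text);
}
```

Because the status text is identical for a favicon 404 and an API 404, only the URL distinguishes them; a 404 on a localhost /api/ path fails the URL test and still falls through to the blocking fallback.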
User impact: On the founder workstation favicon 404 console errors (localhost favicon.ico and login.microsoftonline.com favicon.ico) no longer push agentproof console error observed. They are captured with full detail and routed to non blocking console events. A real localhost /api/ 404 OR 500 OR a React runtime error OR a hydration error OR an unhandled page error continues to push the appropriate precise S31X blocking key. When the only console noise is favicon 404s the founder run reports zero blocking failures end to end.
When to re-score: Tooling bug fix only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA 256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Tooling
Self evaluation gate
Evidence trace: the new Phase 1G S31Y evidence record plus the surgical runner edit (S31Y constants plus s31yIsFaviconResourceFailure helper plus classifier integration plus report payload sentinel) plus the dedicated S31Y runtime test plus the six new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.49 to 0.138.50 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31Y internal methodology record (favicon 404 console errors non-blocking; localhost API errors still blocking)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-17
- Reference
- Phase 1G-S31Y founder execution signal: S31X live test reached the full successful product flow but the report pushed agentproof_console_error_observed because the S31X classifier did not inspect ev.location.url. The two captured console errors were both favicon 404s (localhost favicon and login.microsoftonline.com favicon).
- Impact assessment
- Bumps the package version from 0.138.49 to 0.138.50. Adds a URL-aware favicon-404 check to the live runner classifier with a new helper s31yIsFaviconResourceFailure. Routes favicon 404s to non_blocking_noise while preserving all four S31X blocking classes. Adds 6 archive blocking gates plus 1 dedicated test. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI unchanged. Portfolio click unchanged. Estate unchanged. Environment unchanged. Agent tab navigation unchanged. Question answering unchanged. Generate flow unchanged. S31X console capture preserved byte for byte. S31R through S31X preserved byte for byte.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S31Y test exercises the favicon helper across all documented cases plus the classifier integration plus the localhost API blocking preservation plus the React/hydration/pageerror blocking preservation plus the S31X structure preservation plus the S31W flow preservation. The full live walk against a real Microsoft tenant must be performed by the founder to verify favicon 404s now route to non_blocking_console_events and the run reports zero blocking failures end to end OR a single precise downstream failure key with diagnostic grade evidence attached.
- Re-score optional
Change id: 1G-S32B
Change date: 2026-05-17
Product version: 0.138.51
Methodology engine version: 0.9.1
Phase 1G-S32B - Capture rendered report quality evidence; fix score selector; remove scenario duplication
Reason: S31Y passed the full successful live product journey. The S32A founder report quality audit returned a verdict of PROMISING BUT NOT SELLABLE and identified three surgical defects in the evidence and credibility layer: (1) the runner queried data-test-id agentproof-readiness-score, which no component renders, so score observed was always n/a; (2) the runner exported generated report observed true but no copy of the rendered report substance, so the founder could not assess content quality from the artefact alone; (3) the scenario model tab body rendered s.evidence basis under two different labels (Trigger / condition and Evidence basis), which a sophisticated buyer would notice in the first minute.
What changed: Package version bumped from 0.138.50 to 0.138.51. The build identity endpoint now reports product version 0.138.51 and phase id 1G S32B. Backwards compatible alias exports cascade. The live runner score-extraction call now uses two documented selectors: s32 b score primary selector = data-test-id agentproof-readiness-hero-score (the hero score the dashboard actually renders) and s32 b score fallback selector = data-test-id agentproof-why-this-score-overall (the Why-this-score card overall). The legacy broken selector survives only inside an if false dead-code branch. After the report tab opens, the runner writes a new s32b report quality capture block to the founder live acceptance report payload. The block reads nineteen documented fields from the rendered DOM: agent name, overall score, readiness band, risk posture, confidence, top lifter text, top limiter text, next action text, top 5 priority titles (max 5), top 5 priority severities, scenario titles (max 10), scenario severities, control titles (max 10), discovered count, inferred count, not proven count, markdown body length, markdown body sha256, report view dom excerpt (capped at 2000 characters). The capture also stamps captured at and capture ok. The scenario model tab body component removes the duplicate Trigger / condition line that printed s.evidence basis; only the documented Evidence basis line remains (now carrying data-test-id agentproof-report-scenario-card-evidence-basis). The Next confirmation needed line driven by s.missing proof is preserved. Six new archive blocking gates ship: runner captures report quality evidence s32b plus runner score selector matches dom s32b plus scenario card no label duplication s32b plus s31y console classification preserved s32b plus s31w full generate flow preserved s32b plus live evidence policy preserved s32b. No OAuth touch. No cookie touch. No Microsoft connection change. No Portfolio click change. No Estate touch. No Environment touch. 
No agent tab navigation change. No question-answering change. No Generate flow change. No scoring change. No scoring algorithm change. No UI redesign. No console classification change. The S31X console capture structure plus the S31Y favicon classifier plus the S31W answer-then-generate flow plus the live evidence packaging policy are preserved byte for byte.
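The primary-then-fallback score read described in this entry can be sketched as follows. This is an illustrative TypeScript sketch, not the runner code: the `TextReader` abstraction, the attribute-selector syntax, and the rule of taking the first numeric run in the element text are assumptions; only the two data-test-id values come from the entry.

```typescript
// Hypothetical abstraction over the page: returns an element's text or null.
type TextReader = (selector: string) => string | null;

const SCORE_PRIMARY_SELECTOR = '[data-test-id="agentproof-readiness-hero-score"]';
const SCORE_FALLBACK_SELECTOR = '[data-test-id="agentproof-why-this-score-overall"]';

// Tries the hero score first, then the Why-this-score overall value.
// Returns null (not "n/a") when neither element yields a number.
function readObservedScore(read: TextReader): number | null {
  for (const selector of [SCORE_PRIMARY_SELECTOR, SCORE_FALLBACK_SELECTOR]) {
    const raw = read(selector);
    if (raw === null) continue;
    // Assumption: the score is the first numeric run in the element text.
    const match = raw.match(/\d+(?:\.\d+)?/);
    if (match !== null) return Number(match[0]);
  }
  return null;
}
```

Injecting the reader keeps the selector chain testable against a stub, in the same spirit as the entry's dedicated test, while the live run still has to confirm the real rendered values.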
User impact: On the founder workstation the live runner now captures the substance of the generated report. score observed becomes a number (not null or n/a) whenever the hero or Why-this-score card renders. The founder live acceptance report payload now includes s32b report quality capture with agent name plus the canonical score plus readiness band plus risk posture plus confidence plus the plain-English biggest-lifter biggest-limiter and do-this-next sentences plus the top five priority titles and severities plus the scenario titles and severities plus the control titles plus the discovered inferred and not-proven counts plus the markdown body length plus a deterministic SHA-256 plus a redacted 2000 character DOM excerpt. The Scenario model tab no longer shows the same sentence under two different labels.
When to re-score: Evidence capture and credibility fix only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA 256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Tooling
Self evaluation gate
Report ui minor
Evidence trace: the new Phase 1G S32B evidence record plus the surgical runner edits (S32B constants plus the corrected score selectors plus the rendered report quality capture block plus the report payload attachment plus an exported sha256 helper) plus the scenario model tab body duplicate removal plus the dedicated S32B runtime test plus the six new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.50 to 0.138.51 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S32B internal methodology record (capture rendered report quality evidence; fix score selector; remove scenario duplication)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-17
- Reference
- Phase 1G-S32B founder execution signal: the S31Y live test passed end to end but the S32A audit found that the runner queried a non-existent score selector, never captured the report substance, and rendered a scenario card with duplicate evidence labels. Surgical evidence and credibility fix only.
- Impact assessment
- Bumps the package version from 0.138.50 to 0.138.51. Adds five runner constants plus one helper plus a rendered report quality capture block plus a report payload attachment. Adds two data-test-ids and removes one duplicate paragraph in ScenarioModelTabBody. Adds 6 archive blocking gates plus 1 dedicated test. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI shell unchanged. Portfolio click unchanged. Estate unchanged. Environment unchanged. Agent tab navigation unchanged. Question answering unchanged. Generate flow unchanged. Console classification unchanged. S31R through S31Y preserved byte for byte.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S32B test exercises the exported sentinel plus the documented selector chain plus the capture field list plus the SHA-256 helper plus the scenario duplication removal plus the S31Y and S31W preservation contracts. The actual numeric score and report substance on the founder workstation must still be confirmed by a real live run against a real Microsoft tenant. The S32C audit cycle (out of scope for S32B) will read the captured s32b_report_quality_capture payload and produce a content-grounded buyer-value verdict.
- Re-score optional
Change id: 1G-S32D
Change date: 2026-05-17
Product version: 0.138.52
Methodology engine version: 0.9.1
Phase 1G-S32D - Tighten report-quality capture fidelity; dedupe duplicate priority titles
Reason: The S32B live test passed end-to-end and captured the new report-quality evidence block, but the S32C audit identified six surgical defects in capture fidelity and one real product duplicate. (1) Why-this-score capture returned null for top lifter, top limiter, and next action because the card mounts above the report-view container in the live page. (2) Markdown capture matched the Download Markdown button label instead of the real Markdown body, because the runner used a wildcard data-test-id contains markdown. (3) Priority title capture concatenated the rank chip with the title text via text content, yielding 1Human oversight not proven. (4) Severity and confidence chip capture stripped Tailwind margins, yielding Severity:Medium with no space. (5) Control title capture used a wildcard that matched the Intelligence card practical-controls section heading plus the list container plus each list item, double-counting controls because both the S30S Controls tab body and the S26D Intelligence card render the same practical controls array. (6) The priority planner dedupes by id only, so two candidates with the same title but different ids both reach top-5 and a buyer reading the report sees a visible duplicate title.
What changed: Package version bumped from 0.138.51 to 0.138.52. The build identity endpoint now reports product version 0.138.52 and phase id 1G S32D. Backwards compatible alias exports cascade. The live runner gains two new helper functions: readTextWithDocumentFallback(sel, root), which retries against document scope when the inside-scope read returns null, and readChip(sel, root), which reads the data-chip-label and data-chip-value attributes from a labelled status chip and returns a clean "Label: Value" string. The Why-this-score title plus overall plus top lifter plus top limiter plus next action selectors all use readTextWithDocumentFallback. The priority title extraction reads only the title text nodes inside p.text-slate-900 (child nodes with nodeType equal to 3 only, excluding the rank span) and applies the leading rank strip regex /^\s*[1-5]\s*[.):-]?\s*/ as defence in depth. Priority severities and scenario severities both route through readChip and produce the clean "Severity: Value" format. The control title selector is tightened from a wildcard (data-test-id contains practical-control plus contains control-card) to the exact data-test-id agentproof-report-control-card only. The Markdown body capture uses ONLY the new exact selector data-test-id agentproof-readiness-raw-markdown queried against document scope. The previous wildcard (data-test-id contains markdown) is removed from the live path. json paste score card adds data-test-id agentproof-readiness-raw-markdown plus the s32d sentinel to the pre element that renders microsoft readiness report.markdown. This is an attribute-only marker with no visible UI change. The priority planner adds a second dedupe step after the existing dedupe-by-id. 
The new step uses normalise priority title for dedupe (lowercased plus trimmed plus collapsed whitespace) to drop any duplicate normalised title from top 5 priorities and later items. The first occurrence wins. Eight new archive blocking gates ship: why this score capture falls back to document s32d plus markdown body real selector s32d plus priority title no rank concat s32d plus control title selector tightened s32d plus chip capture via data attributes s32d plus priority planner dedupes by title s32d plus s32b pass path preserved s32d plus live evidence policy preserved s32d. No OAuth touch. No cookie touch. No Microsoft connection change. No Portfolio click change. No Estate touch. No Environment touch. No agent tab navigation change. No question answering change. No Generate flow change. No scoring change. No scoring algorithm change. No UI redesign. No console classification change. The S31X console capture structure plus the S31Y favicon classifier plus the S31W answer-then-generate flow plus the S32B sentinel and capture field list plus the live evidence packaging policy are preserved byte for byte.
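The rank stripping and the title-level dedupe described in this entry can be sketched together. This is a minimal TypeScript sketch: the `Priority` type and the function names `stripLeadingRank` and `dedupePrioritiesByTitle` are hypothetical, and the regex is a reading of the pattern spelled out in the entry rather than a copy of the runner source.

```typescript
// Hypothetical priority shape; only id and title matter for this sketch.
interface Priority {
  id: string;
  title: string;
}

// Strips a leading rank digit 1-5 with optional punctuation, e.g. "1Human ..." or "2. ...".
const LEADING_RANK_PATTERN = /^\s*[1-5]\s*[.):-]?\s*/;

function stripLeadingRank(title: string): string {
  return title.replace(LEADING_RANK_PATTERN, "");
}

// Lowercased plus trimmed plus collapsed whitespace, as the entry describes.
function normalisePriorityTitleForDedupe(title: string): string {
  return title.trim().replace(/\s+/g, " ").toLowerCase();
}

// Second dedupe pass: the first occurrence of each normalised title wins.
function dedupePrioritiesByTitle(items: Priority[]): Priority[] {
  const seen = new Set<string>();
  return items.filter((item) => {
    const key = normalisePriorityTitleForDedupe(item.title);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```

One caveat of the rank regex as written: a genuine title that begins with a digit from 1 to 5 would also lose that digit, which is presumably why the entry treats it as defence in depth behind the text-node read rather than as the primary fix.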
User impact: On the founder workstation the captured s32b report quality capture block now carries the actual Why-this-score biggest-lifter and biggest-limiter and do-this-next sentences (no longer null), plus clean priority titles (no rank concat), plus clean severity formatting ("Severity: Value"), plus control titles from the Controls tab only (no heading, no Intelligence card duplicates), plus a real Markdown body length and SHA-256 (not the button label). The Top-5 priorities show each unique title exactly once. The next S32E audit cycle can read these captured fields and produce a content-grounded buyer-value verdict against the real rendered substance.
When to re-score: Capture fidelity plus duplicate-removal fix only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA 256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Tooling
Self evaluation gate
Report planner dedupe
Evidence trace: the new Phase 1G S32D evidence record plus the surgical runner edits (readTextWithDocumentFallback plus readChip helpers plus the corrected priority title extraction plus the tightened control selector plus the corrected markdown selector) plus the json paste score card marker on the raw markdown pre element plus the priority planner dedupe-by-title step plus the exported normalise priority title for dedupe plus the dedicated S32D runtime test plus the eight new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.51 to 0.138.52 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S32D internal methodology record (tighten report-quality capture fidelity; dedupe duplicate priority titles)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-17
- Reference
- Phase 1G-S32D founder execution signal: the S32B live test PASSED end-to-end and captured a real S32B report-quality block but the S32C audit identified six surgical defects in capture fidelity (Why-this-score null, Markdown wildcard matched the button label, priority title rank concat, chip rendered as Severity:Value with no space, control wildcard captured the heading, planner duplicate). S32D fixes all six without touching the working live path.
- Impact assessment
- Bumps the package version from 0.138.51 to 0.138.52. Adds two runner helpers (readTextWithDocumentFallback plus readChip), tightens five capture selectors, adds one attribute-only data-test-id marker to the raw Markdown pre, and adds one planner dedupe-by-title step. Adds 8 archive blocking gates plus 1 dedicated test. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI shell unchanged. Portfolio click unchanged. Estate unchanged. Environment unchanged. Agent tab navigation unchanged. Question answering unchanged. Generate flow unchanged. Console classification unchanged. S32B sentinel plus capture field list preserved byte-for-byte. S31R through S32B preserved byte for byte.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S32D test exercises the new helpers and selectors against source scans plus a stub chip plus the rank-prefix stripping contract plus the normaliser contract. The actual captured fields on the founder workstation must still be confirmed by a real live run against a real Microsoft tenant. The S32E audit cycle (out of scope for S32D) will read the new captured fields and produce a content-grounded buyer-value verdict.
- Re-score optional
Change id: 1G-S32E
Change date: 2026-05-17
Product version: 0.138.53
Methodology engine version: 0.9.1
Phase 1G-S32E - Repair failed S32D objectives via hidden evidence carrier and final-defence title dedupe
Reason: The S32D live test PASSED the live path but the S32D objectives FAILED in the captured payload. The top lifter text, top limiter text, and next action text all returned null. The markdown body length stayed at zero. The markdown body SHA-256 stayed null. The duplicate Human oversight not proven priority STILL appeared twice in top 5 priority titles. The S32C diagnosis traced the first two failures to a single root cause: both the agent proof why this score card mount AND the raw-Markdown pre element live INSIDE the permanently-false s30 h legacy stack never block in json paste score card. json paste score card line 15587 defines the constant as false. The Why-this-score card mount at line 11880 and the raw-Markdown pre at line 13290 are both inside the wrapper at line 6924, so they never render at runtime. Document-scope fallback was the right design but the elements are simply not in the live DOM. The duplicate priority survived because the post-planner concept-level dedupe in agentproof priority deduplicator classifies priorities by concept bucket and has no final title-level guarantee at the output.
What changed: Package version bumped from 0.138.52 to 0.138.53. The build identity endpoint now reports product version 0.138.53 and phase id 1G S32E. Backwards compatible alias exports cascade. S32E ships a hidden evidence carrier element mounted at the JSX top level of json paste score card OUTSIDE every workspace-view gate. The carrier is a div with hidden plus aria-hidden plus display none, so zero visible UI change. When microsoft canonical score is set, the carrier exposes data-test-id agentproof-canonical-top-lifter-text, data-test-id agentproof-canonical-top-limiter-text, data-test-id agentproof-canonical-next-action-text, and data-test-id agentproof-canonical-overall-score. When microsoft readiness report markdown is set, the carrier also mounts a pre element with data-test-id agentproof-canonical-raw-markdown containing the actual scorecard Markdown. The runner Why-this-score selectors chain primary then document-scope fallback then carrier selectors. The Markdown selector chain reads the primary then the carrier. After capture, when generated report observed is true, the runner pushes one of three precise S32E failure keys on a missing field: s32e why this score capture missing when any of lifter limiter next action is empty, s32e markdown capture missing when markdown body length is less than 500 OR sha is null, s32e duplicate priority titles present when any two captured top 5 priority titles normalise to the same key. The priority deduplicator gains a final-defence title-level dedupe step AFTER the existing concept-level merge. Regardless of which bucket a candidate landed in, the rendered Top-5 cannot contain duplicate normalised titles. The new normalise title for final dedupe s32e helper plus the priority title final dedupe s32 e sentinel are exported. 
Seven new archive blocking gates ship: why this score capture non null s32e plus markdown capture non empty s32e plus markdown capture sha present s32e plus duplicate priority titles blocked s32e plus s32d formatting fixes preserved s32e plus s31y live pass path preserved s32e plus live evidence policy preserved s32e. No OAuth touch. No cookie touch. No Microsoft connection change. No navigation change. No question answering change. No Generate flow change. No scoring change. No UI redesign (the carrier is hidden plus aria-hidden plus display none). No console classification change. The S31X console capture plus the S31Y favicon classifier plus the S31W answer-then-generate flow plus the S32B sentinel plus the S32D capture helpers plus the live evidence packaging policy are preserved byte for byte.
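The final-defence title-level dedupe described above can be sketched as a last pass over the planner output. This is a minimal illustration, not the shipped helper: the `Priority` shape, the function names, and the exact normalisation rule (lower-case, strip punctuation, collapse whitespace) are all assumptions.

```typescript
// Hypothetical shape: the real priority objects likely carry more fields.
interface Priority {
  title: string;
}

// Assumed normalisation: lower-case, replace punctuation runs with a space, trim.
function normaliseTitle(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, " ")
    .trim();
}

// Final-defence pass: keep the first priority seen for each normalised title,
// regardless of which concept bucket it came from, and stop at `limit` items.
function dedupeByNormalisedTitle<T extends Priority>(priorities: T[], limit = 5): T[] {
  const seen = new Set<string>();
  const out: T[] = [];
  for (const p of priorities) {
    const key = normaliseTitle(p.title);
    if (seen.has(key)) continue;
    seen.add(key);
    out.push(p);
    if (out.length === limit) break;
  }
  return out;
}
```

Running the pass after the concept-level merge means a duplicate that slipped through two different buckets is still removed before rendering.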
User impact: On the founder workstation, the captured S32B report quality block now carries the actual canonical score lifter sentence, the canonical score limiter sentence, and the do this next sentence, plus a real Markdown body length and SHA-256. The captured top 5 priority titles list contains each unique normalised title exactly once. If a future regression drops the carrier or breaks dedupe, the runner pushes the precise S32E failure key so the regression is caught on the next live run.
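The three S32E live-run validation checks can be sketched as a pure function over the captured fields. The field names, the underscore spelling of the failure keys, and the title normalisation are assumptions made for illustration; only the three conditions (an empty why-this-score field, a Markdown body shorter than 500 characters or a null SHA, duplicate normalised titles) come from the entry above.

```typescript
// Hypothetical capture shape; the real runner payload may differ.
interface CapturedReportQuality {
  generatedReportObserved: boolean;
  topLifterText: string | null;
  topLimiterText: string | null;
  nextActionText: string | null;
  markdownBodyLength: number;
  markdownBodySha256: string | null;
  top5PriorityTitles: string[];
}

function isBlank(value: string | null): boolean {
  return value === null || value.trim() === "";
}

// Returns the S32E failure keys the capture would trigger (empty array = pass).
function s32eFailureKeys(c: CapturedReportQuality): string[] {
  const keys: string[] = [];
  // Validation only applies once a generated report was observed.
  if (!c.generatedReportObserved) return keys;

  if (isBlank(c.topLifterText) || isBlank(c.topLimiterText) || isBlank(c.nextActionText)) {
    keys.push("s32e_why_this_score_capture_missing");
  }
  if (c.markdownBodyLength < 500 || c.markdownBodySha256 === null) {
    keys.push("s32e_markdown_capture_missing");
  }
  // Assumed normalisation: lower-case and collapse whitespace.
  const normalised = c.top5PriorityTitles.map((t) => t.toLowerCase().replace(/\s+/g, " ").trim());
  if (new Set(normalised).size !== normalised.length) {
    keys.push("s32e_duplicate_priority_titles_present");
  }
  return keys;
}
```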
When to re-score: Capture-fidelity plus duplicate-removal repair only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA 256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Tooling, Self evaluation gate, Report planner dedupe
Evidence trace: the new Phase 1G S32E evidence record plus the surgical product edits (the hidden evidence carrier element in json paste score card plus the final-defence title dedupe in agentproof priority deduplicator) plus the surgical runner edits (carrier-fallback wiring plus the live-run validation block with three precise keys plus the new S32E constants) plus the dedicated S32E runtime test plus the seven new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.52 to 0.138.53 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S32E internal methodology record (repair failed S32D objectives via hidden evidence carrier and final-defence title dedupe)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-17
- Reference
- Phase 1G-S32E founder execution signal: the S32D live test PASSED the live path but the S32D objectives FAILED in the captured payload. Why-this-score capture stayed null, Markdown capture stayed empty, and the duplicate priority survived. The S32C diagnosis traced the first two failures to the S30H_LEGACY_STACK_NEVER block. The duplicate priority survived because concept-level dedupe has no final title-level guarantee.
- Impact assessment
- Bumps the package version from 0.138.52 to 0.138.53. Adds a hidden evidence carrier element in JsonPasteScoreCard (pure attribute-only marker, zero visible UI change) and a final-defence title-level dedupe step in agentproof_priority_deduplicator. Runner gains carrier-fallback wiring plus a live-run validation block with three precise S32E failure keys. Adds 7 archive blocking gates plus 1 dedicated test. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI unchanged. Portfolio click unchanged. Estate unchanged. Environment unchanged. Agent tab navigation unchanged. Question answering unchanged. Generate flow unchanged. Console classification unchanged. S32D helpers (readTextWithDocumentFallback plus readChip) preserved byte-for-byte. S31R through S32D preserved byte for byte.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S32E test exercises the carrier mount contract plus the runner fallback chain plus the live-run validation push paths plus the deduplicator final-title dedupe contract. The actual captured fields on the founder workstation must still be confirmed by a real live run against a real Microsoft tenant. The next audit cycle will read the now-populated lifter limiter next_action plus the real Markdown body and produce a content-grounded buyer-value verdict against real rendered substance.
- Re-score optional
Change id: 1G-S32F | Change date: 2026-05-17 | Product version: 0.138.54 | Methodology engine version: 0.9.1
Phase 1G-S32F - Truthful top-lifter fallback and scoped priority capture (repair the two failed S32E objectives)
Reason: The S32E live test PASSED the live path and PASSED the Markdown capture objective, but two captured objectives still failed. The captured top lifter text was the empty string. The duplicate Human oversight not proven priority still appeared twice in top 5 priority titles. The S32F diagnosis traced the empty top lifter text to the hidden evidence carrier emitting microsoft canonical score.score lifters and microsoft canonical score.score lifters[0] and microsoft canonical score.score lifters[0].reason OR the empty string. On the live canonical score, score lifters[0] was either missing or had an empty .reason, so the OR-fallback was the empty string. The runner was doing its job; the carrier was emitting empty. The founder explicitly forbade fabricating a positive strength, so the fix is a truthful documented fallback sentence, not a fake positive. The duplicate priority survived because of a CAPTURE bug, not a product bug. The Top5Priorities component is mounted in BOTH the Summary tab body slot AND the Controls tab body slot of s30o render report tabs. Inactive tabs use display none but remain in the DOM. The S32E priority-item read was scoped to the report-view root and matched up to ten data-test-id agentproof-top-5-priority-item nodes, up to five per panel. When the planner dedupe correctly produced four distinct items, the Summary panel rendered items one through four and the Controls panel rendered the same items one through four. The runner slice zero through five crossed the panel boundary on the fifth read and re-captured item one from the Controls panel, yielding the captured A B C D A pattern observed in the S32E live evidence.
What changed: Package version bumped from 0.138.53 to 0.138.54. The build identity endpoint now reports product version 0.138.54 and phase id 1G S32F. Backwards compatible alias exports cascade. S32F ships two surgical fixes. Fix one: in json paste score card the hidden evidence carrier emits three documented truthful fallback sentences instead of the empty string when the canonical score arrays are missing or empty. The fallback sentences are: No positive score lifter was strong enough to offset the current readiness gaps. No single score limiter was identified the readiness gap is distributed across multiple categories. Review the Top 5 priorities and confirm the highest-severity item with the named business owner. A truthful fallback sentinel attribute data-s32f-top-lifter-truthful-fallback-sentinel equals agentproof.s32f top lifter truthful fallback.v1 is added to the carrier root for source-scan-based self-evaluation. Pure attribute and string-literal change. Zero visible UI change because the carrier is still hidden plus aria-hidden plus display none. Fix two: in the runner the priority-item read is replaced with a strictly-scoped read against the FIRST data-test-id agentproof-top-5-priorities container. One container equals one set of at-most-five priority items. The slice zero through five is applied AFTER the container-scoped query. Document-scope read all is preserved only as a defensive branch when the container is absent. Two precise live-run failure keys are added: s32f top lifter capture still empty and s32f duplicate priority capture still present. Six new archive blocking gates ship: top lifter truthful fallback s32f plus duplicate priority capture or render fixed s32f plus markdown capture preserved s32f plus s32e validation preserved s32f plus s31y live pass path preserved s32f plus live evidence policy preserved s32f. No OAuth touch. No cookie touch. No Microsoft connection change. No navigation change. No question answering change. 
No Generate flow change. No scoring change. No UI redesign (the carrier remains hidden plus aria-hidden plus display none). No console classification change. No Markdown capture change - the S32E Markdown chain is preserved byte for byte. The S31X console capture plus the S31Y favicon classifier plus the S31W answer-then-generate flow plus the S32B sentinel plus the S32D capture helpers plus the S32E carrier plus the S32E final dedupe plus the live evidence packaging policy are preserved byte for byte.
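The truthful-fallback rule in fix one can be sketched as follows. The fallback sentence is quoted from this entry; the `ScoreEntry` shape and the helper name are hypothetical, and the real carrier is a JSX element rather than a standalone function.

```typescript
// Hypothetical shape of one canonical score lifter or limiter entry.
interface ScoreEntry {
  reason?: string;
}

// Quoted from the changelog entry above.
const TOP_LIFTER_FALLBACK =
  "No positive score lifter was strong enough to offset the current readiness gaps.";

// Return the first entry's reason when it is usable, otherwise the documented
// fallback sentence. The empty string is never returned.
function firstReasonOrFallback(entries: ScoreEntry[] | undefined, fallback: string): string {
  const reason = entries?.[0]?.reason?.trim();
  return reason ? reason : fallback;
}
```

The same helper would cover the limiter and next-action fields by passing their own documented fallback sentences, so none of the three captured fields can be empty on a passing run.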
User impact: On the founder workstation, the captured S32B report quality block now carries a non-empty top lifter text on every passing run: either the real top score lifter sentence (when the canonical score has a usable score lifters[0].reason) or the documented truthful fallback sentence (when the canonical score has no positive lifter strong enough to surface). The same truthful-fallback discipline guarantees that top limiter text and next action text are also non-empty. The captured top 5 priority titles list contains each unique normalised title exactly once, because the runner now scopes its read to ONE rendered Top5Priorities container, so cross-panel re-capture is impossible. If a future regression reintroduces the empty-string fallback OR widens the priority read back to document scope, the runner pushes one of the two precise S32F failure keys so the regression is caught on the next live run.
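The cross-panel capture bug and its container-scoped fix can be reproduced with plain arrays standing in for the DOM. This is a simplified model under stated assumptions: each inner array represents the priority titles rendered inside one Top5Priorities container, the function names are hypothetical, and the defensive document-scope branch for an absent container is left out.

```typescript
// Each Panel holds the titles rendered inside one Top5Priorities container.
type Panel = string[];

// Pre-S32F behaviour as described: read every matching node across all
// panels in DOM order, then take the first five.
function readAcrossPanels(panels: Panel[]): string[] {
  return ([] as string[]).concat(...panels).slice(0, 5);
}

// S32F behaviour: scope the read to the FIRST container, then slice.
function readFirstContainer(panels: Panel[]): string[] {
  const first = panels.length > 0 ? panels[0] : [];
  return first.slice(0, 5);
}

// Summary and Controls panels both render the same four deduplicated items.
const panels: Panel[] = [
  ["A", "B", "C", "D"],
  ["A", "B", "C", "D"],
];

console.log(readAcrossPanels(panels).join(" "));   // A B C D A
console.log(readFirstContainer(panels).join(" ")); // A B C D
```

Slicing after the container-scoped query is what keeps the fifth read from crossing into the second panel.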
When to re-score: Capture-fidelity and truthful-fallback repair only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA 256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Tooling, Self evaluation gate, Live runner capture fidelity
Evidence trace: the new Phase 1G S32F evidence record plus the surgical product edit (the truthful-fallback sentences plus the s32f truthful-fallback sentinel attribute in json paste score card) plus the surgical runner edit (priority-item read scoped to the FIRST agentproof-top-5-priorities container) plus the dedicated S32F runtime test plus the six new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.53 to 0.138.54 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S32F internal methodology record (truthful top-lifter fallback and scoped priority capture)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-17
- Reference
- Phase 1G-S32F founder execution signal: the S32E live test PASSED the live path and PASSED the Markdown capture objective but FAILED two captured objectives. top_lifter_text was captured as the empty string because the S32E carrier emitted score_lifters[0]?.reason OR the empty string and score_lifters[0] was missing on the live canonical score. The duplicate priority survived because the runner queried priority items document-wide and crossed the Summary plus Controls Top5Priorities panels on its slice.
- Impact assessment
- Bumps the package version from 0.138.53 to 0.138.54. Adds three documented truthful fallback sentences and one sentinel attribute inside the hidden evidence carrier in JsonPasteScoreCard (pure attribute and string-literal marker, zero visible UI change). Adds a container-scoped querySelectorAll read in the runner replacing the document-scope readAll for priority items. Adds two precise live-run failure keys. Adds 6 archive blocking gates plus 1 dedicated test. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI unchanged. Portfolio click unchanged. Estate unchanged. Environment unchanged. Agent tab navigation unchanged. Question answering unchanged. Generate flow unchanged. Console classification unchanged. S32E carrier plus S32E final dedupe plus S32E Markdown chain plus S32E live-run validation preserved byte for byte. S31R through S32E preserved byte for byte.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S32F test exercises the carrier truthful-fallback sentences (source-scan) plus the runner container-scoped read (source-scan) plus the runner export contracts plus the preserved S32E and S31Y/S32B contracts. The actual captured fields on the founder workstation must still be confirmed by a real live run against a real Microsoft tenant. The next audit cycle will read the now-truthful top_lifter plus top_limiter plus next_action plus the now-deduplicated top_5_priority_titles and produce a content-grounded buyer-value verdict.
- Re-score optional
Change id: 1G-S31K | Change date: 2026-05-16 | Product version: 0.138.36 | Methodology engine version: 0.9.1
Phase 1G-S31K - Fix localhost vs 127.0.0.1 OAuth origin mismatch and first-connect cookie loss
Reason: The S31J runner reached the status endpoint but the founder retested and the endpoint reported NOT connected after a successful Microsoft sign in. The S31J live evidence proved the next failure mode - the OAuth callback returned the browser to one loopback alias while the runner polled the status endpoint on another loopback alias, and the AgentProof Microsoft session cookie is host bound. S31K must establish ONE canonical local origin and route every consumer through it.
What changed: Package version bumped from 0.138.35 to 0.138.36. The build identity endpoint now reports product version 0.138.36 and phase id 1G-S31K. Backwards compatible alias exports cascade. Root cause: the local stack mixed two loopback origins. The runner navigated Chromium to http colon slash slash 127 dot 0 dot 0 dot 1 colon 3000. Microsoft Entra ID was configured for http colon slash slash localhost colon 3000. The OAuth callback set the agentproof underscore ms underscore session cookie on the localhost host. The runner then polled the status endpoint via http colon slash slash 127 dot 0 dot 0 dot 1 colon 3000. Browsers bind cookies to the exact host they were set on so the localhost bound cookie was invisible on the 127 dot 0 dot 0 dot 1 host and the status endpoint received no cookie. S31K establishes ONE canonical local origin (http colon slash slash localhost colon 3000) and ships a new shared lib connectors microsoft microsoft local origin ts module that exports the canonical local host constant plus the documented loopback alias hosts list plus is loopback host plus get canonical local origin plus normalise to canonical local origin plus parse origin info plus describe origin diagnostic plus detect loopback origin mismatch plus the precise failure key loopback origin mismatch cookie not visible. The auth start route imports the normaliser and runs the env redirect uri through it before invoking build authorize url so Microsoft Entra ID always returns the browser to the canonical localhost host. The callback route normalises the inbound request URL through the normaliser when building the buyer redirect so a callback that arrives on 127 dot 0 dot 0 dot 1 redirects the browser to localhost. The callback redirect query string carries only the canonical microsoft equals status flag plus the new canonical agentproof underscore ms underscore auth equals success or failed flag. 
The callback never writes token or session id or OAuth code material into the redirect URL. The live runner declares the canonical local host constant equal to the literal localhost and the HOST template literal equal to http colon slash slash dollar canonical local host colon dollar PORT. The readiness probe uses host equal to the canonical local host constant. The wait for post oauth app ready default uses the canonical host. After OAuth return the runner asks the Playwright browser context for cookies and records a buyer safe metadata snapshot on both the canonical origin and the OAuth return origin (cookie names visible plus agentproof underscore ms underscore session present plus a small safe metadata record with domain plus path plus secure plus http only plus same site plus expires present). The runner never writes a cookie value. The runner pushes the precise failure key loopback origin mismatch cookie not visible on a detected mismatch. The live acceptance report carries a new s31k canonical local origin sentinel plus s31k app origin plus s31k status endpoint url plus s31k oauth return origin diagnostic plus s31k oauth return normalised url plus s31k cookie origin metadata plus s31k loopback origin mismatch observed plus s31k loopback origin mismatch failure key. Eight new archive blocking gates ship - loopback origin normalised s31k plus auth start uses canonical localhost s31k plus callback redirect uses canonical localhost s31k plus runner single origin contract s31k plus runner cookie origin diagnostics s31k plus loopback origin mismatch failure key s31k plus no secret redirect params s31k plus live evidence policy preserved s31k. No tokens leak. No fake connected state. The S31G S31H S31I S31J live evidence packaging policy is preserved byte for byte. No second chip system. No scoring change. Founder summary now reads 299 of 299 checks green.
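The shared loopback origin normaliser can be sketched with the standard `URL` class. This is an illustration under assumptions, not the shipped module: the function names are camel-cased guesses at the exports listed above, the alias list is limited to the two hosts this entry mentions, and the example path and query value are hypothetical.

```typescript
const CANONICAL_LOCAL_HOST = "localhost";
// Assumed alias list: only the two hosts named in this entry.
const LOOPBACK_ALIAS_HOSTS = ["localhost", "127.0.0.1"];

function isLoopbackHost(hostname: string): boolean {
  return LOOPBACK_ALIAS_HOSTS.indexOf(hostname.toLowerCase()) !== -1;
}

// Rewrite any loopback alias to the canonical host; leave other hosts untouched.
// Port, path, and query string are preserved.
function normaliseToCanonicalLocalOrigin(rawUrl: string): string {
  const url = new URL(rawUrl);
  if (isLoopbackHost(url.hostname)) {
    url.hostname = CANONICAL_LOCAL_HOST;
  }
  return url.toString();
}

// True when both URLs are loopback but use different host aliases, which is
// the situation where a host-bound cookie set on one is invisible on the other.
function detectLoopbackOriginMismatch(urlA: string, urlB: string): boolean {
  const a = new URL(urlA).hostname.toLowerCase();
  const b = new URL(urlB).hostname.toLowerCase();
  return isLoopbackHost(a) && isLoopbackHost(b) && a !== b;
}
```

Routing the redirect URI, the callback redirect, and the runner's polling URL through one normaliser is what guarantees a single origin for the session cookie.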
User impact: On the founder workstation, after Microsoft sign in completes, the browser is always returned to the canonical http colon slash slash localhost colon 3000 origin regardless of whether the Microsoft Entra ID app registration or the founder environment was configured for 127 dot 0 dot 0 dot 1. The agentproof underscore ms underscore session cookie set by the OAuth callback is visible to the status endpoint poll because both run on the canonical host. The runner can now either complete the founder walk or surface a precise diagnosis (loopback origin mismatch cookie not visible) plus the safe cookie origin metadata so the founder sees exactly which host carries the session cookie and which host the runner is polling.
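The buyer safe cookie origin snapshot can be sketched as a projection that drops the cookie value. The `BrowserCookie` interface below is a local stand-in modelled on the cookie objects a Playwright browser context returns; the snapshot field names are assumptions, and treating a positive `expires` as "expires present" is also an assumption about how session cookies are represented.

```typescript
// Local stand-in for a browser-context cookie record.
interface BrowserCookie {
  name: string;
  value: string;
  domain: string;
  path: string;
  expires: number; // assumed: Unix seconds, non-positive for session cookies
  httpOnly: boolean;
  secure: boolean;
  sameSite: "Strict" | "Lax" | "None";
}

interface SafeCookieMetadata {
  domain: string;
  path: string;
  secure: boolean;
  httpOnly: boolean;
  sameSite: string;
  expiresPresent: boolean;
}

interface CookieOriginSnapshot {
  cookieNamesVisible: string[];
  sessionCookiePresent: boolean;
  sessionCookieMetadata: SafeCookieMetadata | null;
}

const SESSION_COOKIE_NAME = "agentproof_ms_session";

// Build a metadata-only snapshot. The cookie value is never copied.
function snapshotCookieOrigin(cookies: BrowserCookie[]): CookieOriginSnapshot {
  const session = cookies.filter((c) => c.name === SESSION_COOKIE_NAME)[0];
  return {
    cookieNamesVisible: cookies.map((c) => c.name),
    sessionCookiePresent: session !== undefined,
    sessionCookieMetadata: session
      ? {
          domain: session.domain,
          path: session.path,
          secure: session.secure,
          httpOnly: session.httpOnly,
          sameSite: session.sameSite,
          expiresPresent: session.expires > 0,
        }
      : null,
  };
}
```

Taking one snapshot per origin (canonical and OAuth return) shows which host actually carries the session cookie without exposing its contents.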
When to re-score: Tooling bug fix plus a new buyer safe shared origin normaliser and safe runner diagnostic fields. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA 256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Api, Tooling, Self evaluation gate
Evidence trace: the new Phase 1G S31K evidence record plus the new shared loopback origin normaliser module plus the refactored auth start route plus the refactored callback route plus the hardened live runner with the canonical local host pin and the cookie origin diagnostics plus the dedicated S31K runtime test plus the eight new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.35 to 0.138.36 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31K internal methodology record (fix localhost vs 127.0.0.1 OAuth origin mismatch and first-connect cookie loss)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-16
- Reference
- Phase 1G-S31K founder execution signal: the S31J runner reached the status endpoint, but the founder retested and the live evidence proved the OAuth callback returned to localhost while the runner polled the status endpoint on 127 dot 0 dot 0 dot 1. The agentproof underscore ms underscore session cookie is host bound, so the localhost bound cookie was invisible on 127 dot 0 dot 0 dot 1 and the status endpoint received no cookie.
- Impact assessment
- Bumps the package version from 0.138.35 to 0.138.36. Ships a new shared loopback origin normaliser module. Routes auth start redirect uri through the normaliser. Normalises the callback buyer redirect. Switches the live runner to canonical localhost. Adds buyer safe cookie origin diagnostics (metadata only). Adds the precise S31K failure key. Adds eight new archive blocking gates. Tokens stay server side. Tokens are never returned to the browser. The live evidence packaging policy is unchanged. Score engine unchanged. NullProvider remains default. UI product approval still requires founder review.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S31K test exercises the normaliser branches with mocked URLs, the auth start contract via source scan, the callback contract via source scan plus end to end secret param assertions, the runner contract via source scan, the cookie diagnostics contract via source scan, the mismatch failure key contract via source scan, and the end to end status endpoint contract via the canonical truth helper with mocked session stores. The full live walk against a real Microsoft tenant must be performed by the founder on a connected workstation to verify the origin fix end to end.
- Re-score optional
Change id: 1G-S31L | Change date: 2026-05-16 | Product version: 0.138.37 | Methodology engine version: 0.9.1
Phase 1G-S31L - Fix live connected post-connection walk from Environment to Agent to Questions to Generate report
Reason: S31K fixed the OAuth and cookie and origin mismatch enough for the live runner to reach a connected Environment view with 39 agents loaded. The founder retest now fails with screenshot s31g 03 agent live missing plus generate readiness report no outcome plus console errors observed plus failed network requests observed. The runner uses a generic failure key the founder cannot diagnose. S31L must replace the generic failure with a precise click level or network or pipeline diagnostic.
What changed: Package version bumped from 0.138.36 to 0.138.37. The build identity endpoint now reports product version 0.138.37 and phase id 1G-S31L. Backwards compatible alias exports cascade. Root cause: the S31K runner clicked the right aside only environment open agent workspace button BEFORE the row CTA environment open agent. The workspace button is not rendered until an agent is selected (it lives in the right hand aside panel of environment dashboard view), so the first click was a silent no op. The environment open agent row CTA (since S30N) collapses select plus navigate into one click but the runner polled data scope level immediately after the click with only a 600 ms timer without waiting for the React state flush. Agent scope was therefore never confirmed, the agent screenshot was never captured, and every subsequent click ran on the wrong view. The Generate report click was either silently dropped (button not present on Environment view) or hit a disabled button (no answers committed) and the runner pushed the generic generate readiness report no outcome because the readiness view never appeared. S31L ships a new helper click with diagnostics that records pre and post DOM diagnostics for every action click and waits for documented post click transitions. Pre click diagnostics capture action id plus current url plus match count plus button text plus disabled plus scope level plus page title excerpt plus selected agent excerpt. Post click diagnostics capture the same fields plus transition observed plus transition timeout ms. If the documented post click transition does not happen within the timeout the runner pushes a PRECISE failure key from the documented S31L precise failure keys list. The Environment to Agent click is now environment open agent FIRST followed by wait for scope agent with a 10 second timeout. The redundant environment open agent workspace first click is removed. 
The Agent to Questions click waits for a visible questions panel selector. The Generate report flow now verifies the button is enabled before click and registers a page on response listener filtered to slash api slash score BEFORE the click and detaches the listener after the readiness view or retry button wait. The seven documented precise failure keys are environment open agent no transition plus agent questions tab no transition plus generate report button disabled plus generate report click no api score call plus generate report api score failed plus generate report pipeline error plus generate report no report view. The legacy generic generate readiness report no outcome push is removed. Console errors and failed network requests are tagged with action context that records the action id that was being clicked plus the time since click. Error events whose context is non null are also pushed into s31l console error action context for easy founder lookup. Eight new archive blocking gates ship - live runner action diagnostics s31l plus environment to agent transition s31l plus agent scope selected agent confirmation s31l plus generate report runtime path s31l plus generate report precise failure keys s31l plus console network errors action context s31l plus no generic generate no outcome s31l plus live evidence policy preserved s31l. No UI redesign. No scoring change. No fake successful reports. The S31G S31H S31I S31J S31K live evidence packaging policy is preserved byte for byte. Founder summary now reads 307 of 307 checks green.
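The Generate report failure keys can be read as an ordered decision over what the runner observed around the click. The sketch below is one plausible ordering, not the runner's actual logic: the observation fields, the underscore spelling of the keys, and the mapping of each key to a condition are assumptions inferred from the key names.

```typescript
type GenerateFailureKey =
  | "generate_report_button_disabled"
  | "generate_report_click_no_api_score_call"
  | "generate_report_api_score_failed"
  | "generate_report_pipeline_error"
  | "generate_report_no_report_view";

// Hypothetical summary of what the runner saw before and after the click.
interface GenerateObservation {
  buttonEnabled: boolean;
  apiScoreResponseStatus: number | null; // null when no /api/score response was seen
  pipelineErrorShown: boolean;
  reportViewShown: boolean;
}

// Returns the first matching failure key, or null when the report view was reached.
function classifyGenerateOutcome(o: GenerateObservation): GenerateFailureKey | null {
  if (!o.buttonEnabled) return "generate_report_button_disabled";
  if (o.apiScoreResponseStatus === null) return "generate_report_click_no_api_score_call";
  if (o.apiScoreResponseStatus < 200 || o.apiScoreResponseStatus >= 300) {
    return "generate_report_api_score_failed";
  }
  if (o.pipelineErrorShown) return "generate_report_pipeline_error";
  if (!o.reportViewShown) return "generate_report_no_report_view";
  return null;
}
```

Checking conditions in the order the steps happen means the earliest failing step is the one reported, which is the point of replacing the generic key.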
User impact: On the founder workstation, the live runner now moves from Environment to Agent in one row CTA click, waits for the Agent scope to flush, captures the Agent screenshot, walks Questions, verifies the Generate report button is enabled, registers the slash api slash score listener BEFORE clicking, and either reaches the Report view or surfaces a precise click level or network or pipeline diagnostic. The founder no longer sees the generic generate readiness report no outcome; they see exactly which click or network call or pipeline step failed plus the action context that triggered any console or network error.
When to re-score: Tooling bug fix plus new buyer safe runner diagnostics. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA 256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Api, Tooling, Self evaluation gate
Evidence trace: the new Phase 1G S31L evidence record plus the refactored live runner with click with diagnostics and wait for scope plus the seven documented precise failure keys plus the action tagged console and network listeners plus the dedicated S31L runtime test plus the eight new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.36 to 0.138.37 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31L internal methodology record (fix live connected post-connection walk from Environment to Agent to Questions to Generate report)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-16
- Reference
- Phase 1G-S31L founder execution signal: the S31K runner reached a connected Environment view with 39 agents loaded but ended with screenshot s31g 03 agent live missing plus the generic generate readiness report no outcome. Source audit proved the runner clicked the right aside only environment open agent workspace button BEFORE the row CTA and polled scope immediately without waiting for the React state flush.
- Impact assessment
- Bumps the package version from 0.138.36 to 0.138.37. Refactors the live runner click semantics. Replaces the generic failure key with seven precise keys. Adds eight archive blocking gates. Adds a dedicated S31L runtime test. Tokens stay server side. The live evidence packaging policy is unchanged. Score engine unchanged. NullProvider remains default. UI product approval still requires founder review.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S31L test exercises the contract via source-scan plus exported sentinel and key checks. The full live walk against a real Microsoft tenant must be performed by the founder on a connected workstation to verify the runner reaches Environment to Agent to Questions to Report end to end or surfaces a precise failure key.
- Re-score optional
Change id: 1G-S31M | Change date: 2026-05-16 | Product version: 0.138.38 | Methodology engine version: 0.9.1
Phase 1G-S31M - Fix live runner readiness gating after Microsoft connection (do not start the founder walk on connected no environment)
Reason: S31L preserved the S31K OAuth cookie fix and improved per click diagnostics. The founder live tested S31L and the runner still failed because it started the founder walk the moment the layered detection found ANY connected state, including connected no environment when no estate card is rendered. The Portfolio open estate action did not exist at that moment, so the walk pushed a cascade of false downstream failures (missing action portfolio open estate plus generate report button disabled plus console errors observed plus failed network requests observed plus screenshot estate environment agent missing) that masked the single real cause.
What changed: Package version bumped from 0.138.37 to 0.138.38. The build identity endpoint now reports product version 0.138.38 and phase id 1G S31M. Backwards compatible alias exports cascade. Root cause: the S31L runner pushed connected state detected equal true the moment the layered detection layer found any connected state. The walk start log fired the instant the layered detection succeeded, even when the state was connected no environment with no estate card rendered yet. Every subsequent step ran on the wrong view and produced the cascade of false downstream failures. S31M ships a readiness gate that sits AFTER Connected state detected and BEFORE Starting live connected acceptance walk. The gate polls the status endpoint and the environments endpoint and the DOM (data action id portfolio open estate plus current scope) inside a documented deadline (default 90000 ms via AGENTPROOF LIVE READINESS TIMEOUT MS). The gate declares walk ready only when the portfolio estate action is present OR the app is already on a deeper valid scope (estate or environment or agent). The runner prints five documented readiness wait log literals so the founder console explains exactly what the gate is waiting for - authenticated but environments not loaded yet plus waiting for environments plus environments endpoint returned count plus portfolio estate action found plus founder walk starting. On a readiness timeout the runner pushes ONE precise failure key from the five documented S31M keys - connected but environments not loaded (env count 0 and not connected no environment) or connected but portfolio estate not ready (state connected no environment with no estate card) or portfolio open estate action missing after environment ready (env count greater than 0 but no DOM action) or portfolio open estate button disabled (button found but disabled) or portfolio to estate transition failed (click ran but scope never reached estate). 
The runner then aborts downstream Estate Environment Agent Questions Report steps by gating every downstream step on if not s31m walk aborted. The live acceptance report carries primary failure key plus stopped at step plus skipped steps due to prior failure plus s31m readiness gate evidence (sentinel plus readiness timeout ms plus observed states transitions array plus last observed state plus last environment count plus last estate action present plus last scope level plus ready plus failure key on timeout). The screenshot completeness check uses a step map so screenshots whose step is in skipped steps due to prior failure no longer push screenshot star missing blocking keys. Console errors and failed network requests still increment the count fields on the report but no longer push console errors observed or failed network requests observed when s31m walk aborted is true. The legacy generic missing action portfolio open estate fallback can no longer fire on a real readiness timeout because the gate runs before the click. Eight new archive blocking gates ship - runner readiness state model s31m plus runner waits for environments ready s31m plus runner waits for portfolio estate action s31m plus runner readiness wait log s31m plus runner skips downstream after portfolio fail s31m plus precise portfolio readiness failure keys s31m plus no generic missing action portfolio open estate s31m plus live evidence policy preserved s31m. No UI redesign. No scoring change. No fake estate environment or agent data. The S31G S31H S31I S31J S31K S31L live evidence packaging policy is preserved byte for byte. Founder summary now reads 315 of 315 checks green.
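The readiness decision and the timeout classification can be sketched as two pure functions over the last polled observation. The readiness rule and the three timeout conditions follow the entry above; the observation shape, the state string, and the underscore spelling of the keys are assumptions, and the polling loop itself is omitted.

```typescript
type ScopeLevel = "portfolio" | "estate" | "environment" | "agent";

// Hypothetical summary of one poll of the status endpoint, environments
// endpoint, and DOM.
interface ReadinessObservation {
  state: string; // e.g. "connected_no_environment" (assumed spelling)
  environmentCount: number;
  estateActionPresent: boolean;
  scopeLevel: ScopeLevel;
}

// Walk-ready only when the estate action exists or the app is already deeper.
function isWalkReady(o: ReadinessObservation): boolean {
  return o.estateActionPresent || o.scopeLevel !== "portfolio";
}

// Called with the last observation when the deadline passes without readiness.
function classifyReadinessTimeout(o: ReadinessObservation): string {
  if (o.state === "connected_no_environment") {
    return "connected_but_portfolio_estate_not_ready";
  }
  if (o.environmentCount === 0) {
    return "connected_but_environments_not_loaded";
  }
  return "portfolio_open_estate_action_missing_after_environment_ready";
}
```

Returning exactly one key from the timeout path, and skipping every downstream step once it is set, is what prevents the cascade of secondary failures.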
User impact: On the founder workstation the live runner now waits until the Portfolio view actually has an actionable estate card (or until the app is already on a deeper valid scope) before printing Starting live connected acceptance walk. If the Microsoft tenant takes longer than 90 seconds to discover environments, the runner pushes ONE precise key (connected but portfolio estate not ready or portfolio open estate action missing after environment ready or connected but environments not loaded) and stops the downstream walk. The founder sees ONE primary failure key plus a structured readiness gate evidence object that names the last observed state, the last environment count, the last estate action presence, and the last scope level. No more cascade of seven downstream false failures masking the one real cause.
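The cascade prevention can be illustrated with a small model of the walk: the first failing step sets the primary failure key, every later step is skipped rather than run, and the screenshot completeness check ignores skipped steps. This is a sketch under assumptions; the type names, the step names, and the `screenshot <step> missing` key format are hypothetical stand-ins for the runner's own.

```typescript
// Sketch only: names and key formats are hypothetical.
interface WalkReport {
  primaryFailureKey: string | null;
  stoppedAtStep: string | null;
  skippedStepsDueToPriorFailure: string[];
  completedSteps: string[];
}

// A step returns null on success or a failure key on failure.
interface WalkStep {
  name: string;
  run: () => string | null;
}

function runWalk(steps: readonly WalkStep[]): WalkReport {
  const report: WalkReport = {
    primaryFailureKey: null,
    stoppedAtStep: null,
    skippedStepsDueToPriorFailure: [],
    completedSteps: [],
  };
  for (const step of steps) {
    if (report.primaryFailureKey !== null) {
      // A prior step failed: record the skip instead of running on the wrong view.
      report.skippedStepsDueToPriorFailure.push(step.name);
      continue;
    }
    const failureKey = step.run();
    if (failureKey === null) {
      report.completedSteps.push(step.name);
    } else {
      report.primaryFailureKey = failureKey;
      report.stoppedAtStep = step.name;
    }
  }
  return report;
}

// Missing screenshots only count for steps that were not skipped.
function missingScreenshotKeys(
  report: WalkReport,
  expectedSteps: readonly string[],
  capturedSteps: readonly string[],
): string[] {
  return expectedSteps
    .filter((step) => capturedSteps.indexOf(step) === -1)
    .filter((step) => report.skippedStepsDueToPriorFailure.indexOf(step) === -1)
    .map((step) => `screenshot ${step} missing`);
}
```

With a six-step walk whose first step fails, the report names one primary key and five skipped steps, and only the failed step can still contribute a missing-screenshot key.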
When to re-score: Tooling bug fix plus new buyer safe runner readiness gate evidence. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Tooling
Self evaluation gate
Evidence trace: the new Phase 1G-S31M evidence record, the refactored live runner with the S31M readiness gate and cascade aware step skipping, the five documented precise failure keys, the dedicated S31M runtime test, the eight new self evaluation gates, the founder QA checklist evidence artefact, the updated build identity endpoint constants, the package version bump from 0.138.37 to 0.138.38, and this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31M internal methodology record (fix live runner readiness gating after Microsoft connection do not start founder walk on connected no environment)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-16
- Reference
- Phase 1G-S31M founder execution signal: the S31L runner logged Connected state detected. (state=connected no environment ...) Starting live connected acceptance walk then a cascade of seven false downstream failures. Source audit proved the runner used connected state detected equals true the moment the layered detection found any connected state, regardless of whether the Portfolio open estate action existed in the DOM.
- Impact assessment
- Bumps the package version from 0.138.37 to 0.138.38. Adds a readiness gate that separates authenticated from walk ready. Adds five precise readiness failure keys. Adds cascade aware step skipping with primary failure key plus stopped at step plus skipped steps due to prior failure on the live report. Adds eight archive blocking gates. Adds a dedicated S31M runtime test. Tokens stay server side. The live evidence packaging policy is unchanged. Score engine unchanged. NullProvider remains default. UI product approval still requires founder review.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S31M test exercises the contract via source-scan plus exported sentinel and list checks. The full live walk against a real Microsoft tenant must be performed by the founder on a connected workstation to verify the readiness gate waits long enough for environment discovery to complete and that the cascade prevention surfaces a single primary failure key on any timeout.
- Re-score optional
Change id: 1G-S31N
Change date: 2026-05-16
Product version: 0.138.39
Methodology engine version: 0.9.1
Phase 1G-S31N - Fix S31M runner contradiction with atomic action find plus click helper for portfolio open estate
Reason: The S31M live test produced a runner contradiction. The readiness gate logged portfolio estate action found, and the very next click failed with missing action portfolio open estate. The root cause was that the S31L captureClickDiagnostic helper took the selector as a function parameter, while the safe page.evaluate wrapper called page.evaluate with fn only and forwarded no arguments. The selector was therefore always undefined and the match count was always zero.
What changed: Package version bumped from 0.138.38 to 0.138.39. The build identity endpoint now reports product version 0.138.39 and phase id 1G-S31N. Backwards-compatible alias exports cascade.
The root cause is a parameter passing bug, not a selector mismatch or a DOM timing issue. Both code paths used the same selector string. The readiness gate inlined the literal selector inside its closure, so its find succeeded. The click helper passed the selector as a function argument that the safe page.evaluate wrapper dropped before calling page.evaluate.
S31N ships ONE atomic waitForActionAndClick helper that performs the scope wait, the DOM find, the visibility and enabled check, the click, and the transition wait in ONE function. The atomic helper invokes page.evaluate with fn, sel, and the expected scope, so the selector and the expected scope are forwarded as the second and third arguments. The helper re-acquires the SAME element handle via page query selector all using the index identified in the find step, verifies that the element is connected, visible, and enabled BEFORE clicking, and then clicks that exact handle. No second lookup is possible, so the readiness view of the DOM and the click view of the DOM cannot disagree. Only portfolio open estate uses the helper (the surface where the contradiction fires). The legacy click with diagnostics call for portfolio open estate and the legacy missing action portfolio open estate push are both removed.
Four precise S31N failure keys ship:
- portfolio open estate disappeared before click
- portfolio open estate not visible
- portfolio open estate disabled
- portfolio open estate transition failed
Every S31N portfolio open estate failure routes through s31m mark primary failure with the precise key and portfolio ready as the stopped at step, so the S31M cascade prevention skips every downstream step.
Five new archive blocking gates ship:
- atomic action click helper s31n
- portfolio open estate no contradiction s31n
- portfolio open estate transition s31n
- first failed step stops walk s31n
- live evidence policy preserved s31n
No OAuth touch. No scoring change. No UI redesign. No broad phase. The S31G, S31H, S31I, S31J, S31K, S31L, and S31M live evidence packaging policy is preserved byte for byte. Founder summary now reads 320 of 320 checks green.
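The parameter passing bug is easy to reproduce in miniature. The sketch below uses a fake, synchronous page object in place of a real browser page, so `FakePage`, `safeEvaluateBuggy`, `safeEvaluateFixed`, and the selector string are all hypothetical; only the shape of the mistake (a wrapper that accepts an argument but does not forward it) comes from the entry above.

```typescript
// Sketch only: a synchronous stand-in for a browser page. Names are hypothetical.
interface FakePage {
  evaluate<A, R>(fn: (arg: A) => R, arg?: A): R;
}

// Hypothetical selector; the real attribute value is not given in the changelog.
const ESTATE_SELECTOR = '[data-action-id="portfolio-open-estate"]';
const fakeDomSelectors: readonly string[] = [ESTATE_SELECTOR];

const fakePage: FakePage = {
  evaluate<A, R>(fn: (arg: A) => R, arg?: A): R {
    // Like a real page function, fn only sees what is passed in explicitly.
    return fn(arg as A);
  },
};

function countMatches(selector: string | undefined): number {
  if (selector === undefined) return 0;
  return fakeDomSelectors.filter((s) => s === selector).length;
}

// Buggy wrapper: accepts the argument but never forwards it.
function safeEvaluateBuggy<A, R>(page: FakePage, fn: (arg: A) => R, _arg?: A): R {
  return page.evaluate(fn);
}

// Fixed wrapper: forwards the argument to the page function.
function safeEvaluateFixed<A, R>(page: FakePage, fn: (arg: A) => R, arg: A): R {
  return page.evaluate(fn, arg);
}

// Readiness-gate style: the selector literal is inlined, so nothing needs forwarding.
const gateCount = fakePage.evaluate(() => countMatches('[data-action-id="portfolio-open-estate"]'));
// Click-helper style through the buggy wrapper: the selector arrives as undefined.
const buggyCount = safeEvaluateBuggy(fakePage, countMatches, ESTATE_SELECTOR);
// Same call through the fixed wrapper.
const fixedCount = safeEvaluateFixed(fakePage, countMatches, ESTATE_SELECTOR);
```

Here `gateCount` and `fixedCount` are 1 while `buggyCount` is 0, which mirrors the contradiction: the gate finds the action and the click helper reports it missing, even though both use the same selector string.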
User impact: On the founder workstation the live runner can no longer log portfolio estate action found and immediately fail with missing action portfolio open estate. The atomic helper either clicks the same handle the find step identified or pushes ONE of the four precise S31N failure keys with full find evidence so the founder always learns the exact reason.
When to re-score: Tooling bug fix only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Tooling
Self evaluation gate
Evidence trace: the new Phase 1G-S31N evidence record, the surgical runner edit (waitForActionAndClick helper, replaced portfolio open estate call site, and four precise failure keys), the dedicated S31N runtime test, the five new self evaluation gates, the founder QA checklist evidence artefact, the updated build identity endpoint constants, the package version bump from 0.138.38 to 0.138.39, and this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31N internal methodology record (fix S31M runner contradiction with atomic action find plus click helper for portfolio open estate)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-16
- Reference
- Phase 1G-S31N founder execution signal: the S31M readiness gate logged portfolio estate action found and the very next click failed with missing action portfolio open estate. Source audit pinned the contradiction on the captureClickDiagnostic parameter dropping bug.
- Impact assessment
- Bumps the package version from 0.138.38 to 0.138.39. Adds ONE atomic waitForActionAndClick helper used only for portfolio open estate. Adds four precise failure keys plus five archive blocking gates plus one dedicated test. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S31N test exercises the contract via source scan plus exported sentinel and key checks. The full live walk against a real Microsoft tenant must be performed by the founder on a connected workstation to verify the contradiction cannot recur.
- Re-score optional
Change id: 1G-S31O
Change date: 2026-05-16
Product version: 0.138.40
Methodology engine version: 0.9.1
Phase 1G-S31O - Fix S31N disappeared before click and add current scope continuation
Reason: The S31N atomic helper classified find threw as portfolio open estate disappeared before click, as designed. The founder live test on S31N proved the failure was actually a benign auto navigation past Portfolio. With env count equal to one, the UI auto selected the only environment between the readiness gate find and the atomic helper's first probe. page.evaluate threw and the helper classified the failure as disappeared before click, but the app HAD advanced past Portfolio successfully.
What changed: Package version bumped from 0.138.39 to 0.138.40. The build identity endpoint now reports product version 0.138.40 and phase id 1G-S31O. Backwards-compatible alias exports cascade.
The root cause is auto navigation between the readiness gate find and the atomic helper's first probe. S31N had no scope re-check after find threw and treated the navigation churn as a final failure.
S31O adds a navigation tolerant scope probe after every find failure (find threw, handle missing, and not attached). The probe polls scope up to 12 times at 100 ms intervals. If the scope is in S31O VALID DEEPER SCOPES (estate, environment, and agent), the helper returns ok equals true with transition already completed equals true and scope after disappearance set. The portfolio open estate call site also reads scope BEFORE invoking the helper and skips the click entirely when the app is already on a deeper valid scope. The call site populates s31o steps skipped due to current scope continuation so that the screenshot completeness check suppresses screenshot star missing keys for the skipped steps. A 500 ms DOM stability window (same match count plus same scope) gates the click against React re-render churn. The call site only routes failures through s31m mark primary failure when transition already completed is NOT true.
Five new archive blocking gates ship:
- portfolio disappeared recheck scope s31o
- current scope continuation s31o
- portfolio action stability wait s31o
- disappeared before click not final if scope advanced s31o
- live evidence policy preserved s31o
No new precise failure key ships (the four S31N keys remain canonical). No OAuth touch. No scoring change. No UI redesign. The S31G, S31H, S31I, S31J, S31K, S31L, S31M, and S31N live evidence packaging policy is preserved byte for byte. Founder summary now reads 325 of 325 checks green.
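The scope probe that runs after a find failure can be modelled as a bounded polling loop. In this sketch the scope reader and the wait function are injected so the logic stays synchronous and testable; `probeScopeAfterFindFailure`, `ProbeResult`, and the field names are hypothetical, while the 12 polls, the 100 ms interval, and the three deeper scopes come from the entry above.

```typescript
// Sketch only: names are hypothetical; limits follow the changelog text.
const VALID_DEEPER_SCOPES: readonly string[] = ["estate", "environment", "agent"];

interface ProbeResult {
  ok: boolean;
  transitionAlreadyCompleted: boolean;
  scopeAfterDisappearance: string | null;
  polls: number;
}

function probeScopeAfterFindFailure(
  readScope: () => string | null, // returns null while the page is mid-navigation
  wait: (ms: number) => void, // injected so tests need no real timer
  maxPolls: number = 12,
  intervalMs: number = 100,
): ProbeResult {
  let lastScope: string | null = null;
  for (let poll = 1; poll <= maxPolls; poll += 1) {
    lastScope = readScope();
    if (lastScope !== null && VALID_DEEPER_SCOPES.indexOf(lastScope) !== -1) {
      // The app moved past Portfolio on its own: treat the find failure as success.
      return {
        ok: true,
        transitionAlreadyCompleted: true,
        scopeAfterDisappearance: lastScope,
        polls: poll,
      };
    }
    if (poll < maxPolls) wait(intervalMs);
  }
  // Scope never advanced: the caller keeps the original failure classification.
  return {
    ok: false,
    transitionAlreadyCompleted: false,
    scopeAfterDisappearance: lastScope,
    polls: maxPolls,
  };
}
```

A caller would route the failure to the primary failure key only when `transitionAlreadyCompleted` is false, matching the call-site rule described in the entry.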
User impact: On the founder workstation with env count equals one, the live runner now skips the portfolio open estate click and continues from whichever scope the UI auto navigated to. If the auto navigation happens mid probe, the helper re classifies the find threw failure as transition already completed and the walk continues. No more false disappeared before click on env count equals one tenants.
When to re-score: Tooling bug fix only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Tooling
Self evaluation gate
Evidence trace: the new Phase 1G-S31O evidence record, the surgical runner edit (scope recheck after find failure, current scope continuation, 500 ms DOM stability window, and screenshot skip suppression), the dedicated S31O runtime test, the five new self evaluation gates, the founder QA checklist evidence artefact, the updated build identity endpoint constants, the package version bump from 0.138.39 to 0.138.40, and this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31O internal methodology record (fix S31N disappeared before click plus add current scope continuation)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-16
- Reference
- Phase 1G-S31O founder execution signal: the S31N live test with env count equals one produced reason equals find throw mapped to portfolio open estate disappeared before click but the app had successfully auto navigated past Portfolio.
- Impact assessment
- Bumps the package version from 0.138.39 to 0.138.40. Adds a scope re check after find failures plus current scope continuation at the call site plus a 500 ms DOM stability window plus screenshot suppression for skipped steps. Adds five archive blocking gates plus one dedicated test. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S31O test exercises the contract via source scan plus exported sentinel and list checks. The full live walk against a real Microsoft tenant must be performed by the founder on a connected workstation to verify the scope advance re classification fires in production.
- Re-score optional
Change id: 1G-S31P
Change date: 2026-05-16
Product version: 0.138.41
Methodology engine version: 0.9.1
Phase 1G-S31P - Fix portfolio open estate via Playwright locator click helper
Reason: The S31O scope recheck path correctly handled scope advance success, but the founder live test on S31O showed that the real failure mode is React re-render churn while Portfolio scope remains active. The page.evaluate based find loop kept catching Execution context destroyed errors and never saw a stable DOM. The evidence value scope after disappearance equals portfolio proved the app did NOT auto navigate; the DOM was simply volatile.
What changed: Package version bumped from 0.138.40 to 0.138.41. The build identity endpoint now reports product version 0.138.41 and phase id 1G-S31P. Backwards-compatible alias exports cascade.
The root cause is React re-render churn that page.evaluate cannot ride through. The S31N waitForActionAndClick helper uses page.evaluate for both find and click, so it cannot survive the re-render churn that destroys execution contexts.
S31P switches portfolio open estate to the Playwright locator API, which is purpose built for this. A new wait for locator action and click helper calls page.locator(sel).first() for the click and for the isVisible and isEnabled checks. It never uses page.evaluate for the find or the click. It detects documented transient click errors (element not attached, execution context destroyed, locator resolved to detached, element not stable, intercepted, target closed, frame detached, and Cannot find context) and retries up to click timeout ms. It requires a 750 ms stability window (count greater than or equal to one, visible, enabled, and scope still Portfolio) before clicking. It accepts scope advance to estate, environment, or agent as success at three checkpoints: during the find loop, after a non-transient click error, and after a transient click error. The portfolio open estate call site is replaced.
The failure key map is restricted to the five precise S31P keys ONLY:
- portfolio open estate locator not found
- portfolio open estate locator not visible
- portfolio open estate locator disabled
- portfolio open estate click detached timeout
- portfolio open estate transition failed
The legacy portfolio open estate disappeared before click is no longer emitted from this path. The S31N waitForActionAndClick helper is preserved for any future actions.
Six new archive blocking gates ship:
- portfolio locator click helper s31p
- portfolio open estate no evaluate click s31p
- portfolio click retries detached element s31p
- portfolio click scope advance success s31p
- portfolio click precise failure keys s31p
- live evidence policy preserved s31p
No OAuth touch. No scoring change. No UI redesign. The S31G, S31H, S31I, S31J, S31K, S31L, S31M, S31N, and S31O live evidence packaging policy is preserved byte for byte. Founder summary now reads 331 of 331 checks green.
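The retry behaviour around the click can be sketched without a browser by injecting the click, the scope reader, and the clock. Everything named here is hypothetical, and two points are assumptions worth flagging: the entry lists the transient error families by name only, so the substrings below are placeholders rather than the runner's real matchers, and the mapping from these outcomes to the five S31P failure keys is not shown. The 750 ms stability window is also left out.

```typescript
// Sketch only: patterns are placeholders built from the error family names above.
const TRANSIENT_CLICK_ERROR_PATTERNS: readonly string[] = [
  "element not attached",
  "execution context destroyed",
  "locator resolved to detached",
  "element not stable",
  "intercepted",
  "target closed",
  "frame detached",
  "cannot find context",
];

function isTransientClickError(message: string): boolean {
  const lower = message.toLowerCase();
  return TRANSIENT_CLICK_ERROR_PATTERNS.some((p) => lower.indexOf(p) !== -1);
}

type ClickStatus = "clicked" | "scope advanced" | "non transient error" | "retry deadline reached";

interface ClickOutcome {
  status: ClickStatus;
  clickAttempts: number;
  transientErrorCount: number;
  lastScopeObserved: string;
}

const DEEPER_SCOPES_AFTER_CLICK: readonly string[] = ["estate", "environment", "agent"];

function clickWithRetry(
  tryClick: () => void, // throws on failure, like a locator click would reject
  readScope: () => string,
  now: () => number, // injected clock in milliseconds
  clickTimeoutMs: number,
  maxAttempts: number = 1000, // safety cap in case the clock never advances
): ClickOutcome {
  const deadline = now() + clickTimeoutMs;
  let clickAttempts = 0;
  let transientErrorCount = 0;
  let lastScopeObserved = readScope();
  const done = (status: ClickStatus): ClickOutcome => ({
    status,
    clickAttempts,
    transientErrorCount,
    lastScopeObserved,
  });
  while (clickAttempts < maxAttempts && now() <= deadline) {
    clickAttempts += 1;
    try {
      tryClick();
      return done("clicked");
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      // Checkpoint after any click error: a scope advance counts as success.
      lastScopeObserved = readScope();
      if (DEEPER_SCOPES_AFTER_CLICK.indexOf(lastScopeObserved) !== -1) return done("scope advanced");
      if (!isTransientClickError(message)) return done("non transient error");
      transientErrorCount += 1;
    }
  }
  return done("retry deadline reached");
}
```

The outcome carries the transient error count, the click attempts, and the last scope observed, which are the three pieces of evidence the entry says accompany a genuine failure.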
User impact: On the founder workstation the runner now uses Playwright's locator API for portfolio open estate which auto retries detached elements internally. The 750 ms stability window plus the locator level retries plus the three scope advance checkpoints together survive React re render churn that the page evaluate based S31N S31O helpers could not. If the click genuinely fails the founder gets ONE of five precise locator path keys with transient error count plus click attempts plus last scope observed.
When to re-score: Tooling bug fix only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Tooling
Self evaluation gate
Evidence trace: the new Phase 1G-S31P evidence record, the surgical runner edit (wait for locator action and click helper, portfolio open estate call site replacement, and restricted failure key map), the dedicated S31P runtime test, the six new self evaluation gates, the founder QA checklist evidence artefact, the updated build identity endpoint constants, the package version bump from 0.138.40 to 0.138.41, and this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31P internal methodology record (fix portfolio open estate via Playwright locator click helper)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-16
- Reference
- Phase 1G-S31P founder execution signal: the S31O live test produced reason equals find throw mapped to portfolio open estate disappeared before click with scope after disappearance equals portfolio. The app stayed on Portfolio - the helper failed because page evaluate could not ride through the React re render churn.
- Impact assessment
- Bumps the package version from 0.138.40 to 0.138.41. Adds a Playwright locator click helper used only for portfolio open estate. Adds 5 precise locator path failure keys plus 6 archive blocking gates plus 1 dedicated test. Tokens stay server side. Live evidence packaging policy unchanged. Score engine unchanged. UI unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S31P test exercises the contract via source scan plus exported sentinel and list checks. The full live walk against a real Microsoft tenant must be performed by the founder on a connected workstation to verify the Playwright locator helper rides through the live re render churn that defeated S31O.
- Re-score optional
Change id: 1G-S30Z
Change date: 2026-05-15
Product version: 0.138.25
Methodology engine version: 0.9.1
Phase 1G-S30Z - Estate-home, evidence, trace, receipt, and methodology-page semantic clarity completion
Reason: S30Q through S30Y closed the bulk of buyer-visible semantic clarity across the report, the report export, the copied action plan, the workspace headers and metric summaries, the score contributor trace card, the comparison and memo surfaces, and the exported support text. The S30Y founder live-test then flagged the final ten buyer-visible surfaces that still carried compact dot-joined semantic rows - estate home dashboard, agent readiness dashboard hero meta and evidence coverage and report section order, agent proof intelligence assessment panel header and status and buyer action plan, agent proof why this score card footer, reproducibility receipt section versions and fired counts and result, methodology changes page entry header and monitored domain examples, premium executive command centre portfolio line, dashboard scorecard tile saved metadata, and the remaining compact values in methodology lineage section and methodology provenance trail section. S30Z must extend the same explicit Label colon Value standard to these remaining buyer-visible surfaces so the buyer reads every chip-like or compact semantic value with its label across the entire product surface.
What changed: Package version bumped from 0.138.24 to 0.138.25. The build identity endpoint now reports product version 0.138.25 and phase id 1G-S30Z. Backwards-compatible alias exports cascade. Ten older buyer-visible surfaces are refactored to use explicit Label colon Value text:
- estate home dashboard: agent rows now render Environment, Agent type, Classification, and Reassessment as labelled spans. The estate header summary renders Environments discovered, Agents discovered, Reviewed agents, and Coverage on four labelled spans. The total agents card splits Reviewed agents and Unreviewed agents. The highest-risk environment summary renders Average score and Reviewed in environment as two labelled spans. The footer splits Discovery snapshot and the next-action prose.
- agent readiness dashboard: the readiness hero renders Classification, Environment, and Platform as three labelled spans. Evidence coverage renders Proven evidence, Inferred evidence, and Unknown evidence. The report section order renders as a labelled ordered list.
- agent proof intelligence assessment panel: renders Evidence basis and Agent as labelled spans, Pack status and Reassessment status as labelled spans, and Actions in next 7 days and Actions in next 30 days on the buyer action plan summary.
- agent proof why this score card footer: renders Trace type, Evidence basis, Score input hash, and Engine version as labelled spans.
- reproducibility receipt section: renders labelled stamp pairs for versions, Red flags fired and AI Act indicators fired as two labelled spans, and Score, Rating, and Cap as three labelled spans on the result row.
- Methodology changes page: entry headers render Change id, Change date, Product version, and Methodology engine version as labelled spans, and monitored domain examples render as comma-separated enumerations.
- premium executive command centre portfolio line: renders Environments, Agents, Reviewed agents, Unreviewed agents, High-risk agents, and Re-review items as labelled spans.
- dashboard scorecard tile saved metadata: renders Saved at and Engine version as two labelled spans.
- methodology lineage section: renders Engine version and Context packs version on the scorecard and current methodology rows, and renders Change id and Change title on the Latest applicable change row.
- methodology provenance trail section: renders Reviewed by roles as labelled chips, Latest applicable change as Change id and Change title labelled spans, and the provenance row header as Change id and Change title labelled spans.
Ten new archive blocking gates ship:
- estate home semantic clarity s30z
- readiness dashboard remaining semantics s30z
- intelligence assessment panel semantics s30z
- why this score footer semantics s30z
- reproducibility receipt component semantics s30z
- methodology changes page semantics s30z
- command centre portfolio line semantics s30z
- saved scorecard metadata semantics s30z
- methodology lineage provenance remaining semantics s30z
- s30q to s30y regression guard s30z
The shared S30Q labelled status chip helper is reused; no second chip system is introduced. Founder summary now reads 211 of 211 checks green.
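The Label colon Value standard amounts to never emitting a value without its label. A minimal sketch of such a formatter follows; `renderLabelled` and its separator parameter are hypothetical and are not the shared S30Q chip helper, whose interface is not described here.

```typescript
// Sketch only: a hypothetical formatter, not the shared S30Q helper.
type LabelledPair = readonly [string, string];

function renderLabelled(pairs: readonly LabelledPair[], separator: string = ", "): string {
  return pairs
    .map(([label, value]) => {
      // Refuse bare values: every value must travel with a non-empty label.
      if (label.trim() === "") throw new Error("label is required");
      return `${label}: ${value}`;
    })
    .join(separator);
}

// The entry header of this change, rendered with explicit labels.
const s30zHeader = renderLabelled([
  ["Change id", "1G-S30Z"],
  ["Change date", "2026-05-15"],
  ["Product version", "0.138.25"],
  ["Methodology engine version", "0.9.1"],
]);
```

`s30zHeader` evaluates to `Change id: 1G-S30Z, Change date: 2026-05-15, Product version: 0.138.25, Methodology engine version: 0.9.1`.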
User impact: The buyer now sees consistent labelled chips and labelled prose across the entire AgentProof product surface - the report, the export, the copied action plan, the saved scorecards, the scorecard result, the comparison view, the timeline view, the indicator list, the targeted rescore section, the methodology lineage and provenance trail, the reproducibility receipt, the estate dashboard category queue, the enterprise estate overview methodology status, the executive command centre reassessment queue, the agent workspace header, the connection state chip, the portfolio estate cards, the estate environment cards, the environment header summary, the evidence explorer counts, the score contributor trace, the legacy slash dashboard row, the comparison client surfaces, the memo view, the exported support text, the estate home dashboard, the readiness hero metadata and evidence coverage and section order, the intelligence assessment panel, the Why this score footer, the reproducibility receipt component, the methodology changes page, the premium command centre portfolio line, and the saved scorecard tile metadata. No bare rating, status, action, severity, count, or provenance pill remains. No compact dot-separated semantic chain remains in any audited surface.
When to re-score: Output formatting only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
UI
Self evaluation gate
Evidence trace: the new Phase 1G-S30Z evidence record, the 10 refactored component and library sources, the dedicated S30Z source-scan test, the ten new self evaluation gates, the founder QA checklist evidence artefact, the updated build identity endpoint constants, the package version bump from 0.138.24 to 0.138.25, and this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30Z internal methodology record (estate home plus readiness remaining plus intelligence assessment plus why this score plus reproducibility receipt plus methodology changes page plus command centre portfolio plus saved scorecard plus methodology lineage and provenance remaining semantic clarity completion)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-15
- Reference
- Phase 1G-S30Z founder execution signal: the S30Y live test confirmed report and workspace and exported support text clarity but flagged the remaining compact dot-joined buyer-visible surfaces across the older estate home dashboard, the readiness hero metadata and evidence coverage and report section order, the intelligence assessment panel header and status and action plan, the Why this score footer, the reproducibility receipt component, the methodology changes page entry header and monitored domain examples, the premium executive command centre portfolio line, the saved scorecard tile metadata, and the remaining compact values in the methodology lineage and provenance trail sections.
- Impact assessment
- Bumps the package version from 0.138.24 to 0.138.25. Refactors 10 component and library files to use explicit Label colon Value labels. Reuses the shared S30Q labelled status chip helper directly. Adds ten new archive blocking gates. Adds a dedicated source-scan test. Ships a founder QA checklist evidence artefact. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, confirm the estate home dashboard agent rows and header summary and total agents card and highest-risk environment and footer all render labelled spans, confirm the readiness hero metadata and evidence coverage and section order render labelled spans / list, confirm the intelligence assessment panel header and status and action plan summary render labelled spans, confirm the Why this score footer renders four labelled spans, confirm the reproducibility receipt versions and fired counts and result row render labelled spans, confirm the methodology changes page entry header renders four labelled spans and monitored examples render comma-separated, confirm the premium executive command centre portfolio line renders six labelled spans, confirm the saved scorecard tile renders Saved at and Engine version labelled spans, confirm the methodology lineage and provenance trail remaining compact values render labelled spans, regression check the labelled chips from S30Q through S30Y, and confirm the ten new gates show pass in the self evaluation transcript.
- Re-score optional
Change id: 1G-S31A
Change date: 2026-05-15
Product version: 0.138.26
Methodology engine version: 0.9.1
Phase 1G-S31A - Generated copy, print headers, timeline export, wizard counters, and final semantic sweep
Reason: S30Q through S30Z closed the bulk of buyer-visible semantic clarity across the report, exports, dashboards, results, methodology, estate home, evidence, trace, receipt, and workspace metrics. A final source sweep of the accepted S30Z archive still showed remaining buyer-visible compact semantic text in generated copy, print headers, timeline exports, wizard counters, landing trust panels, and some estate or view-model summary strings. S31A must complete the semantic clarity standard for these remaining surfaces so the buyer never encounters a compact dot-joined semantic row anywhere in the product.
What changed: Package version bumped from 0.138.25 to 0.138.26. The build identity endpoint now reports product version 0.138.26 and phase id 1G-S31A. Backwards-compatible alias exports cascade. Fourteen older buyer-visible surfaces are refactored to use explicit Label colon Value text:
- scorecard timeline export: emits Timeline version, Score, Rating, Primary context, and Targeted re-score state as labelled bullets, and the version stamps line uses Engine version, Weights version, Question bank version, Red-flag rules version, AI Act indicators version, and Context packs version labelled pairs.
- verification report: emits one labelled bullet per version stamp.
- estate view model generated summaries: render Improved agents, Worsened agents, Unchanged agents, Reviewed agents, Unreviewed agents, High-risk agents, and Open priority items with labels.
- remediation tracker summary: renders Resolved priorities, Open priorities, In progress priorities, and Deferred priorities with labels.
- agent estate dashboard segment summary: renders Reviewed agents, Unreviewed agents, Open priorities, and High-severity priorities as labelled spans.
- scorecard view: the header renders Generated at, Engine version, Question bank version, and Weights version as labelled spans, and the footer renders Engine version, Weights version, Question bank version, Red-flag rules version, AI Act indicators version, and Context packs version as labelled spans.
- timeline client: the print header renders Saved versions, Score movement, and Latest saved as labelled spans, and the print footer renders Agent and Saved versions as labelled spans.
- timeline group list: renders Saved versions and Latest saved as labelled spans.
- agent proof intelligence workspace: renders Tenant, Environment, and Last analysis as labelled spans.
- agent review wizard counters: render Required answers, Recommended answers, and Evidence notes as labelled spans.
- Recent verifications panel: renders Verified at, Scorecard id, and Mismatches as labelled spans.
- agent proof first run journey card: renders Access mode, Tenant write access, and Customer records as labelled spans.
- agent proof welcome summary: renders Evidence basis, Scoring method, AI claim posture, Discovery mode, and Data location as labelled spans.
- scorecard preview: renders Audience and Autonomy as labelled spans.
Eleven new archive blocking gates ship:
- timeline export semantics s31a
- verification report version semantics s31a
- generated estate summary semantics s31a
- remediation tracker summary semantics s31a
- estate dashboard segment summary semantics s31a
- scorecard view version semantics s31a
- timeline client print semantics s31a
- workspace intelligence and wizard counter semantics s31a
- recent verifications detail semantics s31a
- landing trust statement semantics s31a
- s30q to s30z regression guard s31a
The shared S30Q labelled status chip helper is reused; no second chip system is introduced. Founder summary now reads 222 of 222 checks green.
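The rule of one labelled bullet per version stamp can be sketched with a fixed label table, so the export order is stable and no stamp can be emitted without its label. The `VersionStamps` shape, the field names, and `versionStampBullets` are hypothetical; the six labels come from the entry above, and every version value other than the engine version 0.9.1 is a placeholder.

```typescript
// Sketch only: field names are hypothetical; labels follow the changelog text.
interface VersionStamps {
  engine: string;
  weights: string;
  questionBank: string;
  redFlagRules: string;
  aiActIndicators: string;
  contextPacks: string;
}

// Fixed order keeps the exported text deterministic.
const VERSION_STAMP_LABELS: ReadonlyArray<readonly [keyof VersionStamps, string]> = [
  ["engine", "Engine version"],
  ["weights", "Weights version"],
  ["questionBank", "Question bank version"],
  ["redFlagRules", "Red-flag rules version"],
  ["aiActIndicators", "AI Act indicators version"],
  ["contextPacks", "Context packs version"],
];

function versionStampBullets(stamps: VersionStamps): string[] {
  return VERSION_STAMP_LABELS.map(([key, label]) => `- ${label}: ${stamps[key]}`);
}

// Placeholder values, except the engine version stated in the entry header.
const exampleBullets = versionStampBullets({
  engine: "0.9.1",
  weights: "x.y.z",
  questionBank: "x.y.z",
  redFlagRules: "x.y.z",
  aiActIndicators: "x.y.z",
  contextPacks: "x.y.z",
});
```

Because the stamp type lists all six fields as required, a caller that forgets one fails at compile time instead of producing an export with a missing line.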
User impact: The buyer now sees consistent labelled chips and labelled prose across every surface of the AgentProof product - the live UI, the generated report, the export, the copied action plan, the saved scorecards, the scorecard result, the comparison view, the timeline view, the indicator list, the targeted rescore section, the methodology lineage and provenance trail, the reproducibility receipt, the estate dashboard category queue and segment summary, the enterprise estate overview, the executive command centre, the agent workspace header, the connection state chip, the portfolio estate cards, the estate environment cards, the environment header summary, the evidence explorer counts, the score contributor trace, the legacy slash dashboard row, the comparison client surfaces, the memo view, the exported support text, the estate home dashboard, the readiness hero metadata and evidence coverage and section order, the intelligence assessment panel, the Why this score footer, the methodology changes page, the premium command centre portfolio line, the saved scorecard tile metadata, the timeline export, the verification report, the generated estate trend and summary text, the remediation tracker summary, the scorecard view header and footer version metadata, the timeline print header and footer, the timeline group list meta, the workspace intelligence header, the review wizard counters, the recent verifications panel verification details, and the landing first-run, welcome, and preview trust statements. No bare rating, status, action, severity, count, or provenance pill remains. No compact dot-separated semantic chain remains in any audited surface.
When to re-score: Output formatting only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
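A formatting-only change should leave the deterministic scorecard Markdown byte-identical, so its SHA-256 digest should not move. The following is a minimal sketch of such a check, assuming Node.js and its built-in crypto module. The helper names `sha256Hex` and `matchesPinnedDigest` are illustrative assumptions, not AgentProof's actual functions.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: hash the rendered scorecard Markdown as UTF-8 bytes.
function sha256Hex(markdown: string): string {
  return createHash("sha256").update(markdown, "utf8").digest("hex");
}

// Hypothetical check: true when the rendered Markdown still matches the pinned digest.
// The comparison is case-insensitive on the pinned hex string.
function matchesPinnedDigest(markdown: string, pinnedHex: string): boolean {
  return sha256Hex(markdown) === pinnedHex.toLowerCase();
}
```

In use, the pinned value would be the digest recorded in this entry, and the Markdown would be whatever the scorecard renderer produces for the reference fixture. Any whitespace or wording change in the rendered output would change the digest.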
UI, Self evaluation gate
Evidence trace: the new Phase 1G-S31A evidence record plus the 14 refactored component and library sources plus the dedicated S31A source-scan test plus the eleven new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.25 to 0.138.26 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31A internal methodology record (generated copy plus print headers plus timeline export plus wizard counters plus landing trust statement plus final semantic sweep)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-15
- Reference
- Phase 1G-S31A founder execution signal: the S30Z accepted archive still showed remaining compact dot-joined buyer-visible semantic rows across the timeline export, verification report, generated estate summary text, remediation tracker summary, AgentEstateDashboard segment summary, ScorecardView header and footer version metadata, TimelineClient print header and footer, TimelineGroupList meta, AgentProofIntelligenceWorkspace tenant header, AgentReviewWizard counters, the Recent verifications panel verification details, and the AgentProofFirstRunJourneyCard plus AgentProofWelcomeSummary plus ScorecardPreview trust statements.
- Impact assessment
- Bumps the package version from 0.138.25 to 0.138.26. Refactors 14 component and library files to use explicit Label colon Value labels. Reuses the shared S30Q labelled status chip helper directly. Adds eleven new archive blocking gates. Adds a dedicated source-scan test. Ships a founder QA checklist evidence artefact. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, confirm the timeline export rows and version stamps are labelled, confirm the verification report version stamps are labelled, confirm the estate view model generated summaries and the remediation tracker summary use labelled values, confirm the AgentEstateDashboard segment summary renders labelled spans, confirm the ScorecardView header and footer version metadata render labelled spans, confirm the TimelineClient print header and footer and TimelineGroupList meta render labelled spans, confirm the AgentProofIntelligenceWorkspace tenant header and AgentReviewWizard counters render labelled spans, confirm the Recent verifications panel verification details render labelled spans, confirm the landing first-run plus welcome plus preview trust statements render labelled spans, regression check the labelled chips from S30Q through S30Z, and confirm the eleven new gates show pass in the self evaluation transcript.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S31B, Change date: 2026-05-15, Product version: 0.138.27, Methodology engine version: 0.9.1
Phase 1G-S31B - Residual dot-join removal, workspace confidence labelling, connection identity clarity, and support-text separator cleanup
Reason: S31A completed the explicit Label colon Value standard across the product. A post-S31A source audit still showed a narrower residual issue: some values were now labelled but still rendered as compact dot-joined semantic rows, and a few workspace and support surfaces still contained unlabelled estate-confidence or connection-identity fragments. S31B must remove the remaining buyer-facing dot-joined semantic rows where each value affects interpretation, status, version, confidence, score, review progress, connection identity, or support-readiness. Labels are not enough if multiple decision-useful values are still packed into one dot-separated line.
What changed: Package version bumped from 0.138.26 to 0.138.27. The build identity endpoint now reports product version 0.138.27 and phase id 1G-S31B. Backwards-compatible alias exports cascade. Twenty-three source files are refactored to remove the residual middle-dot separators between decision-useful labelled values. The agent proof intelligence workspace estate confidence chip carries an explicit Estate confidence label. The agent proof connection status chip tenant domain and environment count render as Tenant domain and Environments labelled spans. The agent proof layout footer renders Product and Methodology pack as two labelled spans. The compare view delta list rows render labelled values via a meta pairs array of label and value pairs, and the delta list helper renders each labelled pair as its own span with a layout gap. The compare client print header, print footer, memo print header, and memo print footer use flex layout gaps instead of literal middle-dot separators. The memo view version stamps render with a flex gap and no inline middle dot. The comparison memo build label returns Saved at colon date comma Agent colon name. The comparison memo Markdown stamp parts emit one labelled bullet per stamp. The methodology changelog export entry header joins labelled values with commas. The methodology provenance trail summary joins labelled clauses with commas. The scorecard timeline ships a new format version stamps bullets helper that the Markdown render uses to emit one labelled bullet per stamp. The agentproof agent estate view model emits Improved agents, Worsened agents, Unchanged agents, Reviewed agents, Unreviewed agents, High-risk agents, and Open priority items on separate newline-separated labelled lines. The agentproof remediation tracker emits Resolved priorities, Open priorities, In progress priorities, and Deferred priorities on separate newline-separated labelled lines. The agentproof readiness markdown report tracker summary emits one labelled bullet per status count.
Eight lib/web support-readiness files replace the dot-joined trust line with Provider mode, Live LLM, Backend write, Login requirement, Runtime mode, and Determinism (plus Customer data where applicable) labelled clauses. Nine new archive blocking gates ship - workspace estate confidence label s31b, connection chip identity labels s31b, layout footer methodology labels s31b, comparison timeline ui dot join cleanup s31b, exported support rows dot join cleanup s31b, generated estate remediation summary dot join cleanup s31b, readiness markdown tracker summary cleanup s31b, support readiness safety claim labels s31b, s30q to s31a regression guard s31b. The shared S30Q labelled status chip helper is reused; no second chip system is introduced. Founder summary now reads 231 of 231 checks green.
User impact: The buyer now sees labelled values rendered as separate labelled spans, separate labelled bullets, or comma-separated labelled clauses across every audited surface - never as a compressed dot-separated chain of decision-useful values. The workspace estate confidence chip, the connection chip tenant identity and environment count, the layout footer product and methodology pack values, the comparison view delta list meta strings, the comparison client print header and footer and memo print header and footer, the memo view version stamps, the exported comparison memo and methodology changelog and methodology provenance trail and scorecard timeline rows, the generated estate trend and summary text and remediation tracker summary, the readiness Markdown tracker summary, and the eight lib/web support-readiness trust claims now read with explicit Label colon Value structure separated by visual gaps, commas, newlines, or labelled bullets - whichever the surface requires.
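The two text forms described above, comma-separated labelled clauses and one labelled bullet per value, can be sketched as follows. This is an illustrative sketch only: the `LabelledPair` type and both function names are assumptions made for the example and are not taken from the AgentProof source.

```typescript
// Illustrative shape for one labelled value (not AgentProof's actual type).
interface LabelledPair {
  label: string;
  value: string | number;
}

// Render each pair as "Label: Value" and join with commas, never a middle dot.
function toLabelledClauses(pairs: LabelledPair[]): string {
  return pairs.map((p) => `${p.label}: ${p.value}`).join(", ");
}

// Render each pair as its own bullet, one labelled value per line.
function toLabelledBullets(pairs: LabelledPair[]): string {
  return pairs.map((p) => `- ${p.label}: ${p.value}`).join("\n");
}

// Example values taken from this entry's header.
const stamps: LabelledPair[] = [
  { label: "Product version", value: "0.138.27" },
  { label: "Methodology engine version", value: "0.9.1" },
];

console.log(toLabelledClauses(stamps));
console.log(toLabelledBullets(stamps));
```

Keeping the pairs as structured data until the last step lets each surface choose its own separator (gap, comma, newline, or bullet) without re-parsing a pre-joined string.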
When to re-score: Output formatting only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
UI, Self evaluation gate
Evidence trace: the new Phase 1G-S31B evidence record plus the 23 refactored component and library sources plus the dedicated S31B source-scan test plus the nine new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.26 to 0.138.27 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31B internal methodology record (residual dot-join removal plus workspace confidence labelling plus connection identity clarity plus support-text separator cleanup)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-15
- Reference
- Phase 1G-S31B founder execution signal: the post-S31A audit found that some values were now labelled but still rendered as compact dot-joined semantic rows in the workspace estate confidence chip, the connection chip identity fragments, the layout footer, the comparison and timeline UI surfaces, the exported support text, the generated estate and remediation summaries, the readiness Markdown tracker summary, and the support-readiness trust claims.
- Impact assessment
- Bumps the package version from 0.138.26 to 0.138.27. Refactors 23 source files to remove residual middle-dot separators between labelled decision-useful values. Reuses the shared S30Q labelled status chip helper directly. Adds nine new archive blocking gates. Adds a dedicated source-scan test. Ships a founder QA checklist evidence artefact. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, confirm the estate confidence chip reads with the Estate confidence label, confirm the connection chip Tenant domain and Environments labelled spans render, confirm the layout footer Product and Methodology pack labelled spans render, confirm the comparison view delta list rows render labelled values without middle-dot separators, confirm the comparison client print surfaces and memo print surfaces use flex gaps, confirm the memo view version stamps use a flex gap, confirm the comparison memo and methodology changelog export and methodology provenance trail and scorecard timeline use labelled bullets or comma-separated clauses, confirm the generated estate trend and summary text and remediation tracker summary use newline-separated labelled lines, confirm the readiness Markdown tracker summary emits labelled bullets, confirm the eight lib/web support-readiness trust claims use labelled clauses, regression check S30Q through S31A, and confirm the nine new gates show pass in the self evaluation transcript.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S31C, Change date: 2026-05-15, Product version: 0.138.28, Methodology engine version: 0.9.1
Phase 1G-S31C - S31B acceptance repair, missed dot-join blockers, and self-eval hardening
Reason: S31B uploaded correctly and self-evaluated 231 of 231 green, but independent source verification found that the S31B acceptance standard was not actually met. Active buyer-facing source still contained dot-joined rows in timeline client print header and footer, json paste score card local safety lines and score-button support text and session history rows, estate home dashboard environment card metadata, the Recent verifications panel timestamp formatter, and methodology changelog export radar cadence summary. S31C must repair each at source and harden the self-eval so the exact missed patterns cannot pass again.
What changed: Package version bumped from 0.138.27 to 0.138.28. The build identity endpoint now reports product version 0.138.28 and phase id 1G-S31C. Backwards-compatible alias exports cascade. Five source files are repaired. The timeline client print header and footer render labelled spans inside a flex-wrap layout with a gap; the legacy active middle-dot literals between labelled spans are removed and an explicit Product label is added to the footer. The json paste score card local safety lines now render Provider mode, Live LLM, Backend write, Login requirement, Runtime mode, and Determinism as labelled clauses. The score button support text renders Runtime mode, Live LLM, and Backend write as labelled spans with a flex gap. Session history rows render Score, Rating, and File as three labelled spans. The estate home dashboard environment card meta renders Region (conditional on row.region), Workspace type (conditional on row.is default), and Agents discovered as labelled spans with a flex gap. The Recent verifications panel timestamp formatter now returns date part comma HH colon MM. The methodology changelog export radar cadence summary now joins cadence labels with commas. Eight new archive blocking gates ship - timeline client dot join repair s31c, json paste local safety dot join repair s31c, json paste session history semantics s31c, estate home environment metadata repair s31c, recent verifications timestamp repair s31c, methodology changelog export join repair s31c, s31b gate coverage hardened s31c, s30q to s31b regression guard s31c. The shared S30Q labelled status chip helper is reused; no second chip system is introduced. Founder summary now reads 239 of 239 checks green.
User impact: The buyer no longer encounters labelled values still joined by middle dots on the timeline print surface, the local safety rows, the score button support text, the session history list, the estate home environment cards, the recent verifications timestamps, or the methodology changelog cadence summary. Every labelled value reads as a separate labelled span, separate labelled bullet, or comma-separated labelled clause across these surfaces.
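The repaired timestamp shape, a date part followed by a comma and then hours and minutes, might look like the sketch below. The entry does not state the date format or time zone, so the `YYYY-MM-DD` date part and UTC are assumptions here, as is the function name `formatVerifiedAt`.

```typescript
// Hypothetical formatter: returns "YYYY-MM-DD, HH:MM" in UTC.
// The date format and time zone are assumptions; the entry only fixes the
// "date part, HH:MM" shape with a comma between date and time.
function formatVerifiedAt(iso: string): string {
  const parsed = new Date(iso);
  if (Number.isNaN(parsed.getTime())) {
    return iso; // leave unparseable input untouched rather than guess
  }
  const isoText = parsed.toISOString(); // e.g. 2026-05-15T09:30:00.000Z
  return `${isoText.slice(0, 10)}, ${isoText.slice(11, 16)}`;
}

console.log(formatVerifiedAt("2026-05-15T09:30:00Z")); // prints "2026-05-15, 09:30"
```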
When to re-score: Output formatting only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
UI, Self evaluation gate
Evidence trace: the new Phase 1G-S31C evidence record plus the five repaired source files plus the dedicated S31C source-scan test plus the eight new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.27 to 0.138.28 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31C internal methodology record (S31B acceptance repair plus missed dot-join blockers plus self-eval hardening)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-15
- Reference
- Phase 1G-S31C founder execution signal: independent source verification found that the S31B acceptance standard was not actually met. Active buyer-facing source still contained dot-joined rows in TimelineClient, JsonPasteScoreCard, EstateHomeDashboard, the Recent verifications panel, and methodology_changelog_export.
- Impact assessment
- Bumps the package version from 0.138.27 to 0.138.28. Repairs five source files. Adds eight new archive blocking gates. Adds a dedicated source-scan test. Ships a founder QA checklist evidence artefact. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, print or inspect the timeline print header and footer and confirm flex-gap labelled spans with no middle-dot literals, inspect the JsonPasteScoreCard local safety rows and the score button support text and the session history and confirm labelled clauses, open the estate home dashboard and confirm Region and Workspace type and Agents discovered are labelled spans, open recent verifications and confirm Verified at uses a comma between date and time, export methodology changelog and confirm the radar cadence summary uses commas not middle dots, regression check S30Q through S31B, and confirm the eight new gates show pass in the self evaluation transcript.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S31D, Change date: 2026-05-15, Product version: 0.138.29, Methodology engine version: 0.9.1
Phase 1G-S31D - Active residual separator cleanup, select-option status clarity, and provenance/version row hardening
Reason: S31C uploaded green at 239 of 239, but an independent source audit found that the dot-join cleanup was still incomplete. Six active buyer-facing rows in untouched files still emitted middle-dot separators between labelled or named values. S31D must repair each at source and harden the self-eval so the exact remaining patterns cannot pass again.
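A source-scan gate of the kind described here can be sketched as a line-by-line search for a middle dot used as a separator. This is a simplified illustration, not the actual S31D gate: the `findDotJoins` name, the `DotJoinFinding` type, and the choice to match only a literal U+00B7 with whitespace on both sides are all assumptions.

```typescript
// Illustrative finding record for one offending source line.
interface DotJoinFinding {
  line: number; // 1-based line number
  text: string; // trimmed line content
}

// U+00B7 MIDDLE DOT with whitespace on both sides.
// No "g" flag, so test() keeps no state between calls.
const MIDDLE_DOT_SEPARATOR = /\s\u00B7\s/;

function findDotJoins(source: string): DotJoinFinding[] {
  const findings: DotJoinFinding[] = [];
  source.split("\n").forEach((text, index) => {
    if (MIDDLE_DOT_SEPARATOR.test(text)) {
      findings.push({ line: index + 1, text: text.trim() });
    }
  });
  return findings;
}
```

A scan like this only sees literal characters in the source text. Separators produced another way, for example through an HTML entity or a join call with a variable, would need their own patterns, which is one reason a source scan alone can miss active rows.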
What changed: Package version bumped from 0.138.28 to 0.138.29. The build identity endpoint now reports product version 0.138.29 and phase id 1G-S31D. Backwards-compatible alias exports cascade. Six source files are repaired. The Enterprise Estate Overview header metrics row renders Environments, Agents, and Reviewed agents as three labelled spans inside a flex-wrap layout; the legacy two middle-dot literals between the values are removed. The json paste score card Reviewed select-option label reads check display (Status: Reviewed); the legacy middle-dot template between the display value and the status word is removed. The Methodology Provenance Trail Section source counts render as three labelled spans (Applicable entries, Internal sources, External sources); the legacy parenthetical with a middle-dot separator inside is removed. The Reproducibility Receipt Section Versions row no longer renders an explicit aria-hidden middle-dot separator span between labelled version stamp pairs; spacing comes from a flex-wrap gap-x-3 layout alone. The Scorecard View Stop decision chip renders Decision and Action as two labelled child spans inside an inline-flex flex-wrap row; the literal middle-dot separator span between the two labels is removed. The Agent Report Guide summary pill reads Covers colon Scenario model comma Risks comma Controls comma Intelligence as a single labelled clause; the legacy bare dot-joined section-name chain is removed. Eight new archive blocking gates ship - enterprise overview header metrics no dot s31d, json paste reviewed option status label s31d, methodology provenance source count no dot s31d, reproducibility receipt version separator no dot s31d, scorecard view decision action no dot s31d, report guide summary no dot s31d, s31c gate coverage hardened s31d, s30q to s31c regression guard s31d. The shared S30Q labelled status chip helper is reused; no second chip system is introduced. Founder summary now reads 247 of 247 checks green.
User impact: The buyer no longer encounters labelled or named values still joined by middle dots on the Enterprise Estate Overview header metrics, the json paste score card Reviewed select-option label, the Methodology Provenance Trail Section source counts, the Reproducibility Receipt Section Versions row, the Scorecard View Stop decision chip, or the Agent Report Guide summary pill. Every labelled value reads as a separate labelled span, separate labelled clause, or comma-separated labelled list across these surfaces.
When to re-score: Output formatting only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
UI, Self evaluation gate
Evidence trace: the new Phase 1G-S31D evidence record plus the six repaired source files plus the dedicated S31D source-scan test plus the eight new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.28 to 0.138.29 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31D internal methodology record (active residual separator cleanup plus select-option status clarity plus provenance and version row hardening)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-15
- Reference
- Phase 1G-S31D founder execution signal: independent source verification found six remaining active dot-joined buyer-facing rows after the S31C upload. The dot-join cleanup was incomplete in EnterpriseEstateOverview header metrics, JsonPasteScoreCard Reviewed option label, MethodologyProvenanceTrailSection source counts, ReproducibilityReceiptSection version separator, ScorecardView Decision Action chip, and AgentReportGuide summary pill.
- Impact assessment
- Bumps the package version from 0.138.28 to 0.138.29. Repairs six source files. Adds eight new archive blocking gates. Adds a dedicated source-scan test. Ships a founder QA checklist evidence artefact. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, inspect each of the six surfaces and confirm flex-gap labelled spans or labelled comma-separated clauses with no middle-dot literals, regression check S30Q through S31C, and confirm the eight new gates show pass in the self evaluation transcript.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S31E, Change date: 2026-05-15, Product version: 0.138.30, Methodology engine version: 0.9.1
Phase 1G-S31E - Rendered founder acceptance gauntlet, browser-level UI proof, and no source-scan-only confidence
Reason: S31D shipped fully source-clean at 247 of 247 but the founder rejected source-scan-only confidence: if a founder can see it, click it, misunderstand it, or get stuck on it, it must be covered by a rendered or runtime test. S31E ships a dedicated rendered acceptance harness, a happy-dom test that mounts the live scope views and exercises real DOM clicks, and a machine-readable founder acceptance report that lands in every archive by construction.
What changed: Package version bumped from 0.138.29 to 0.138.30. The build identity endpoint now reports product version 0.138.30 and phase id 1G-S31E. Backwards-compatible alias exports cascade. A new rendered acceptance harness module under the tests support tree exposes the create harness lifecycle helper, deterministic fixture builders for the four scope views and the connection chip, four scope mount helpers, one chip mount helper, and the audit helpers (sweep bare semantic chips, audit enabled buttons, sweep forbidden copy, audit text budget, audit connection truth, audit scope separation), plus the build founder acceptance report builder. A new dedicated S31E test under the tests unit tree runs under happy-dom, mounts the live scope views, exercises every documented agent workspace tab via real mouse event clicks, sweeps every rendered scope for bare semantic chips, dead enabled buttons, forbidden copy, hierarchy contamination, connection-truth contradictions, and text-budget overruns, and writes the machine-readable founder acceptance report under the self evaluation receipt folder. The build archive packaging script adds the stage founder acceptance report helper, which stages the report into the archive and refuses to package when the report is missing or its blocking failures list is non-empty. Eight new archive blocking gates ship - rendered founder acceptance gauntlet s31e, rendered hierarchy scope separation s31e, rendered report tab integrity s31e, rendered report chip clarity s31e, rendered no dead enabled buttons s31e, rendered connection truth s31e, rendered text budget s31e, rendered founder acceptance report packaged s31e. The shared S30Q labelled status chip helper is reused; no second chip system is introduced. Founder summary now reads 255 of 255 checks green.
User impact: The founder can now verify the live product as a buyer would by running one command, opening the produced acceptance report, and reading the pass/fail per assertion. The archive carries the acceptance report by construction so any reviewer can audit rendered acceptance without re-running the test. Future regressions in chip clarity, hierarchy scope separation, dead buttons, connection-chip truth, or default-view text density fail the build through the new rendered gates rather than slipping past source-scan-only confidence.
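The packaging rule described for this phase, refuse to package when the acceptance report is missing or lists any blocking failure, can be sketched as a pure check. The report shape and the names `FounderAcceptanceReport` and `packagingBlockers` are assumptions for illustration; the real report's fields are not documented in this entry.

```typescript
// Illustrative report shape; only the blocking failures list matters here.
interface FounderAcceptanceReport {
  blockingFailures: string[];
}

// Returns the reasons packaging should be refused.
// An empty list means the report may be staged into the archive.
function packagingBlockers(report: FounderAcceptanceReport | undefined): string[] {
  if (report === undefined) {
    return ["founder acceptance report is missing"];
  }
  return report.blockingFailures.map((failure) => `blocking failure: ${failure}`);
}

// Example: a report with one failed assertion blocks the archive.
const blockers = packagingBlockers({ blockingFailures: ["dead enabled button"] });
if (blockers.length > 0) {
  console.log(`Refusing to package: ${blockers.join("; ")}`);
}
```

Returning the reasons rather than a boolean lets the packaging script print exactly why an archive was refused.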
When to re-score: Output formatting and proof harness only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
UI, Self evaluation gate, Packaging
Evidence trace: the new Phase 1G-S31E evidence record plus the rendered acceptance harness module plus the dedicated S31E happy-dom test plus the founder acceptance report file under the self evaluation receipt folder plus the eight new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the updated build archive helper plus the package version bump from 0.138.29 to 0.138.30 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31E internal methodology record (rendered founder acceptance gauntlet plus browser-level UI proof plus no source-scan-only confidence)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-15
- Reference
- Phase 1G-S31E founder execution signal: S31D cleaned the remaining active separator defects, but the founder rejected source-scan-only confidence and required that if a founder can see it, click it, misunderstand it, or get stuck on it, it must be covered by a rendered or runtime test.
- Impact assessment
- Bumps the package version from 0.138.29 to 0.138.30. Adds a dedicated rendered acceptance harness module under the tests support tree. Adds a dedicated happy-dom test. Adds the founder acceptance report file under the self evaluation receipt folder and stages it into every archive. Adds eight new archive blocking gates. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, walk Portfolio to Estate to Environment to Agent, click every agent workspace tab, confirm the chips read with labels, confirm no enabled button silently does nothing, confirm the connection chip is honest in both states, confirm the default Portfolio view does not dump methodology or evidence or report or wizard content, regression check S30Q through S31D, and confirm the eight new gates plus the produced founder acceptance report show pass.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S31F, Change date: 2026-05-15, Product version: 0.138.31, Methodology engine version: 0.9.1
Phase 1G-S31F - Real browser founder walk-through, screenshot evidence, and live UI acceptance hardening
Reason: S31E proved the rendered component harness via happy-dom. The founder rejected component-level confidence alone and required real-browser proof: a founder must be able to open the archive, run the app, walk the product, and review screenshot evidence showing that the real app is coherent, clickable, and understandable. S31F lifts the proof to a real Chromium browser driven by Playwright.
What changed: Package version bumped from 0.138.30 to 0.138.31. The build identity endpoint now reports product version 0.138.31 and phase id 1G-S31F. Backwards-compatible alias exports cascade. A new real-browser acceptance runner ships as a scripts cjs file that spawns the production Next.js server on a configurable port, hits the live agentproof version endpoint, launches headless Chromium via playwright, captures the live score paste route as the first screenshot, then walks each of the ten documented scope views (Portfolio, Estate, Environment, Agent, Agent Questions, Report, Intelligence, Evidence, Review history, Methodology) and screenshots each into the self evaluation founder screenshots folder with deterministic filenames s31f underscore zero zero through s31f underscore one zero. A new scope view renderer module under the tests support tree exposes a single render scope body html export so the runner can bundle and execute it via esbuild under the automatic JSX runtime. A new dedicated runtime test under the tests unit tree asserts every required screenshot lands on disk and the produced founder browser acceptance report is well-formed with zero blocking failures. The archive build helper adds stage founder browser acceptance which copies the report file plus the entire screenshots folder into every archive and refuses to package when the report blocking failures array is non-empty or any required screenshot is missing. Eight new archive blocking gates ship - browser founder acceptance runner s31f, browser screenshot pack s31f, browser scope navigation s31f, browser report tab integrity s31f, browser report chip clarity s31f, browser no dead buttons s31f, browser connection truth s31f, browser founder report packaged s31f. The shared S30Q labelled status chip helper is reused; no second chip system is introduced. Playwright is added to dev dependencies; pnpm browser founder acceptance is registered as a package script. 
Founder summary now reads 263 of 263 checks green.
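The packaging rule described above (refuse to package when the report carries blocking failures or a required screenshot is missing) can be sketched roughly as follows. This is an illustration only, not the archive build helper itself: the `FounderAcceptanceReport` shape, the `blockingFailures` field name, the `checkPackagingGate` function, and the `.png` filenames built from the `s31f_00` to `s31f_10` range are all assumptions made for the example.

```typescript
// Hypothetical report shape; the real report schema is not shown in this change log.
interface FounderAcceptanceReport {
  blockingFailures: string[];
}

interface GateResult {
  ok: boolean;
  reasons: string[];
}

// Assumed filenames: s31f_00.png through s31f_10.png (eleven screenshots).
function requiredScreenshots(): string[] {
  const names: string[] = [];
  for (let i = 0; i <= 10; i++) {
    names.push("s31f_" + (i < 10 ? "0" : "") + i + ".png");
  }
  return names;
}

// Collects every reason to refuse packaging instead of stopping at the first one,
// so a reviewer sees the full list in a single run.
function checkPackagingGate(report: FounderAcceptanceReport, filesOnDisk: string[]): GateResult {
  const reasons: string[] = [];
  if (report.blockingFailures.length > 0) {
    reasons.push("report has " + report.blockingFailures.length + " blocking failure(s)");
  }
  for (const name of requiredScreenshots()) {
    if (filesOnDisk.indexOf(name) === -1) {
      reasons.push("missing screenshot: " + name);
    }
  }
  return { ok: reasons.length === 0, reasons };
}
```

A caller would list the screenshots folder, parse the report, and abort the archive step whenever `ok` is false.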
User impact: The founder can now run one command to spawn the actual production server, drive a real Chromium browser through the documented scope walk, and receive eleven deterministic PNG screenshots plus a machine readable acceptance report. The archive carries this evidence by construction so any reviewer can open the produced tar, view the screenshots, read the report, and confirm the live app is coherent, clickable, and understandable without re-running anything. Future regressions in the live score paste route, screenshot pack, scope navigation, chip clarity, dead button audit, or connection truth fail the build via the new rendered gates rather than slipping past source scan or happy dom only confidence.
When to re-score: Output proof and packaging only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Ui, Self evaluation gate, Packaging
Evidence trace: the new Phase 1G-S31F evidence record plus the real browser acceptance runner script plus the scope view renderer module plus the dedicated S31F runtime test plus the founder browser acceptance report under the self evaluation receipt folder plus the eleven founder screenshot PNG files plus the eight new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the updated build archive helper with the new stage founder browser acceptance function plus the package version bump from 0.138.30 to 0.138.31 plus the Playwright dev dependency addition plus the new pnpm browser founder acceptance script plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31F internal methodology record (real browser founder walk through plus screenshot evidence plus live UI acceptance hardening)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-15
- Reference
- Phase 1G-S31F founder execution signal: S31E proved the rendered component harness via happy dom; the founder rejected component level confidence alone and required real browser proof. A founder must be able to open the archive, run the app, walk the product, and review screenshot evidence.
- Impact assessment
- Bumps the package version from 0.138.30 to 0.138.31. Adds a Playwright dev dependency and a Chromium browser cache. Adds the real browser acceptance runner script. Adds the scope view renderer module. Adds the dedicated S31F runtime test. Adds the founder browser acceptance report file under the self evaluation receipt folder and the founder screenshots folder. Adds the stage founder browser acceptance helper to the build archive script. Adds eight new archive blocking gates. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must run the browser runner locally to produce fresh screenshots and the acceptance report. The Microsoft OAuth chain is not exercised in this slice so the live score paste shot shows the empty Portfolio hero rather than a connected estate. Deep walk through Estate, Environment, Agent requires either a fixture injection mode (out of scope this slice) or a real Microsoft tenant. Founder must open the produced screenshots and confirm visual coherence per the manual founder QA script.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S31G
Change date: 2026-05-15
Product version: 0.138.32
Methodology engine version: 0.9.1
Phase 1G-S31G - Live connected founder walk-through, real tenant evidence, and manual-assisted browser QA
Reason: S31F proved the browser-rendered acceptance pack, but the deeper scope screenshots still used deterministic fixture state. The founder rejected fixture-only confidence for the connected product and required a real connected walk after a real Microsoft sign-in, with screenshot evidence drawn from the actual running app, real tenant identity, real discovered environments, real discovered agents, and a real generated readiness report.
What changed: Package version bumped from 0.138.31 to 0.138.32. The build identity endpoint now reports product version 0.138.32 and phase id 1G-S31G. Backwards-compatible alias exports cascade. A new manual-assisted live connected browser runner ships under the scripts cjs file. The runner spawns or attaches to the production Next.js server, launches headed Chromium against the real score paste route, prints a wait message and polls the live data connection state attribute for the documented connected values, then walks the real connected product by clicking the documented data action ids in deterministic order across Portfolio, Estate, Environment, Agent and the four agent tabs and the three back-navigation actions. The runner captures twelve deterministic PNG screenshots under the self evaluation founder live connected screenshots folder. The runner installs Playwright listeners for `console`, `pageerror`, and `requestfailed` events and records counts in the live acceptance report. The runner detects whether the selected agent already has a report and either drives Generate readiness report or simply opens the existing Report tab; absence of both the report view and the retry banner records a blocking failure. The runner observes Intelligence, Evidence, and History surfaces and records observation booleans. A new dedicated S31G runtime test under the tests unit tree source-scans every fragment of the runner shape, the documented connected values, the screenshot filename set, the click action ids, the `page.on` listeners, the report generation drive, and the live acceptance report fields.
The archive build helper adds a new stage founder live connected acceptance function that gates live evidence packaging on an explicit `AGENTPROOF_PACKAGE_LIVE_FOUNDER_EVIDENCE` environment variable. Live evidence is excluded by default. When the flag is set, the helper validates the live report schema, runs a secret scan over the report body that refuses any access token, refresh token, client secret, bearer, or private key pattern, sweeps the screenshots folder for suspicious filenames, and only then stages the artefacts. Eight new archive blocking gates ship: live connected browser runner exists s31g, live connected runner uses real route s31g, live connected state truth s31g, live hierarchy click path s31g, live generate report proof s31g, live intelligence evidence not empty s31g, live console network audit s31g, live connected evidence packaging policy s31g. The shared S30Q labelled status chip helper is reused; no second chip system is introduced. The pnpm browser founder live connected package script is registered. Founder summary now reads 271 of 271 checks green.
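A minimal sketch of the opt-in and secret-scan decision described above, under stated assumptions. The regular expressions are illustrative stand-ins for the five pattern families named in the entry, not the real scanner's pattern list. The underscore spelling of the environment variable and the names `findSecretLabels` and `mayStageLiveEvidence` are assumptions. The schema validation and the screenshot filename sweep are left out.

```typescript
// Illustrative patterns only; the real scanner's pattern list is not shown in this change log.
const SECRET_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: "access token", pattern: /access[_-]?token/i },
  { label: "refresh token", pattern: /refresh[_-]?token/i },
  { label: "client secret", pattern: /client[_-]?secret/i },
  { label: "bearer", pattern: /bearer\s+[a-z0-9._~+\/-]+=*/i },
  { label: "private key", pattern: /-----BEGIN [A-Z ]*PRIVATE KEY-----/ },
];

// Returns the label of every pattern family found in the report body.
function findSecretLabels(body: string): string[] {
  return SECRET_PATTERNS.filter((p) => p.pattern.test(body)).map((p) => p.label);
}

// Live evidence is staged only when the flag is exactly "true" AND the scan is clean.
function mayStageLiveEvidence(
  env: Record<string, string | undefined>,
  reportBody: string,
): { stage: boolean; reason: string } {
  // Assumed variable spelling; the entry names the flag without separators.
  if (env["AGENTPROOF_PACKAGE_LIVE_FOUNDER_EVIDENCE"] !== "true") {
    return { stage: false, reason: "opt-in flag not set; live evidence excluded by default" };
  }
  const hits = findSecretLabels(reportBody);
  if (hits.length > 0) {
    return { stage: false, reason: "secret scan refused: " + hits.join(", ") };
  }
  return { stage: true, reason: "flag set and secret scan clean" };
}
```

Checking the flag before scanning keeps the default path cheap and makes exclusion the outcome whenever the variable is absent or set to anything other than `true`.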
User impact: The founder can now run one local command, complete a real Microsoft sign-in in a headed Chromium window, and watch the runner walk the actually connected product end to end. The runner emits twelve PNG screenshots of the real live tenant plus a machine-readable live acceptance report capturing tenant identity, selected estate, selected environment, selected agent, console errors, failed network requests, report generation observation, score observation, and Intelligence, Evidence, and History observation. By default the archive does NOT include the live evidence, so tenant identifiers and agent names stay local; the founder opts in by setting `AGENTPROOF_PACKAGE_LIVE_FOUNDER_EVIDENCE` to true before invoking pnpm build archive, and a secret scan refuses any token-bearing payload before sealing the tar.
When to re-score: QA hardening and packaging policy only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Ui, Self evaluation gate, Packaging, Tooling
Evidence trace: the new Phase 1G-S31G evidence record plus the manual assisted live connected runner script plus the dedicated S31G runtime test plus the new stage founder live connected acceptance helper in the build archive script plus the eight new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.31 to 0.138.32 plus the new pnpm browser founder live connected script plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31G internal methodology record (live connected founder walk through plus real tenant evidence plus manual assisted browser QA)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-15
- Reference
- Phase 1G-S31G founder execution signal: the S31F deep scope screenshots still used fixture state. The founder rejected fixture only confidence for the connected product and required real Microsoft sign in based evidence drawn from the actual running app.
- Impact assessment
- Bumps the package version from 0.138.31 to 0.138.32. Adds a manual assisted live connected runner script. Adds a dedicated S31G test. Adds the stage founder live connected acceptance helper to the build archive script with an explicit opt in flag and a secret scan. Adds eight new archive blocking gates. Tokens stay server side. Live evidence stays local by default. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The runner cannot run inside the CI pipeline because the Microsoft OAuth step needs founder input. A dry run mode exists so the runner can emit a buyer safe empty live acceptance report; the dry run never launches Chromium and is the canonical path for automated source scans of the runner shape. The full live walk must be performed by the founder on a workstation with a real Microsoft tenant. The deterministic scorecard Markdown SHA-256 is unchanged.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S31H
Change date: 2026-05-15
Product version: 0.138.33
Methodology engine version: 0.9.1
Phase 1G-S31H - Fix live connected runner OAuth navigation resilience and post-sign-in recovery
Reason: The S31G live connected runner failed on the founder workstation with a Playwright execution context destruction error during Microsoft OAuth redirects. The founder completed sign-in successfully but the runner treated the transient navigation as a blocking failure and never resumed the live connected walk. S31H must harden the runner so it survives Microsoft OAuth redirects and resumes after sign-in.
What changed: Package version bumped from 0.138.32 to 0.138.33. The build identity endpoint now reports product version 0.138.33 and phase id 1G-S31H. Backwards-compatible alias exports cascade. The live connected runner script ships two new navigation-safe helpers. The safe page evaluate helper catches the four documented Playwright navigation errors (Execution context was destroyed, Cannot find context with specified id, Target page context or browser has been closed, frame got detached) and retries up to a documented number of times with a configurable retry delay, recording every transient catch in a transient sink array. The wait for post oauth app ready helper waits until the URL returns to localhost on an AgentProof route, then waits for body to be attached, then waits for one of three terminal app states (a connected attribute, a visible Connect button, or a buyer-safe error or setup needed panel) before returning. The sign-in wait loop replaces the fragile page evaluate poll with this helper and prints four documented lifecycle log lines in order: Waiting for founder to complete Microsoft sign-in, Returned to AgentProof after Microsoft sign-in, Connected state detected, Starting live connected acceptance walk. The runner tracks transient navigation errors in a dedicated array; the final report records transient navigation errors observed boolean, transient navigation errors count, transient navigation errors array, recovered from oauth navigation boolean, connected state detected boolean, oauth return url, oauth return body excerpt. The top-level Chromium try catch checks the transient predicate BEFORE pushing the legacy chromium session failed blocking key so a late transient during back-walk is recorded but not fatal. 
On the OAuth-return-but-no-connected-state failure path the runner captures an s31h underscore oauth return not connected png diagnostic screenshot and pushes connected state not detected after oauth return as the canonical blocking failure. Six new archive blocking gates ship: oauth navigation resilience s31h, safe evaluate retry s31h, post oauth connected state detection s31h, transient navigation not blocking s31h, oauth failure screenshot s31h, live evidence policy preserved s31h. The S31G live evidence packaging policy is preserved byte for byte. No second chip system. No scoring change. Founder summary now reads 277 of 277 checks green.
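The retry behaviour of the safe page evaluate helper can be sketched as below. This is a hedged illustration, not the shipped runner code: `PageLike` is a hypothetical stand-in for the Playwright page so the sketch runs against a mock, the default retry count and delay are invented values, and the error messages are matched by lower-cased fragments shortened from the four messages listed in the entry.

```typescript
// Hypothetical minimal page surface; a real runner would pass a Playwright Page.
interface PageLike {
  evaluate<T>(fn: () => T): Promise<T>;
}

// Fragments shortened from the four transient messages named in the entry.
const TRANSIENT_FRAGMENTS = [
  "execution context was destroyed",
  "cannot find context with specified id",
  "browser has been closed",
  "frame got detached",
];

function isTransientNavigationError(err: unknown): boolean {
  const message = (err instanceof Error ? err.message : String(err)).toLowerCase();
  return TRANSIENT_FRAGMENTS.some((fragment) => message.indexOf(fragment) !== -1);
}

// Retries only transient navigation errors and records each one in the sink.
// Non-transient errors, and the final transient error once retries run out, are rethrown.
async function safePageEvaluate<T>(
  page: PageLike,
  fn: () => T,
  transientSink: string[],
  maxRetries = 5, // assumed default
  retryDelayMs = 250, // assumed default
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await page.evaluate(fn);
    } catch (err) {
      if (!isTransientNavigationError(err) || attempt >= maxRetries) throw err;
      transientSink.push(err instanceof Error ? err.message : String(err));
      await new Promise<void>((resolve) => setTimeout(resolve, retryDelayMs));
    }
  }
}
```

Passing the sink in as a parameter lets the caller report recovered navigation churn separately from real defects, which mirrors the transient navigation fields the entry adds to the live acceptance report.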
User impact: On the founder workstation, the live connected browser runner now survives the full Microsoft OAuth handshake. The transcript prints four lifecycle log lines, the deep walk completes, twelve PNG screenshots land in the local screenshots folder, and the live acceptance report records transient navigation errors observed and recovered from oauth navigation so the founder can distinguish recovered navigation churn from real defects. The archive does NOT carry the live screenshots or the live report by default; the explicit opt in flag, schema check, and secret scan from S31G are preserved verbatim.
When to re-score: Tooling bug fix only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Tooling, Self evaluation gate
Evidence trace: the new Phase 1G-S31H evidence record plus the hardened live connected runner script with two new navigation-safe helpers plus the dedicated S31H runtime test exercising the safe evaluator retry behaviour plus the six new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.32 to 0.138.33 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31H internal methodology record (fix live connected runner OAuth navigation resilience and post sign in recovery)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-15
- Reference
- Phase 1G-S31H founder execution signal: the S31G live runner failed on the founder workstation with a Playwright execution context destruction error during Microsoft OAuth redirects. The founder completed sign in successfully but the runner treated the transient navigation as a blocking failure.
- Impact assessment
- Bumps the package version from 0.138.32 to 0.138.33. Hardens the live connected runner against Playwright navigation context destruction during Microsoft OAuth. Adds a dedicated S31H test. Adds six new archive blocking gates. Live evidence packaging policy unchanged. Tokens stay server side. Score engine unchanged. NullProvider remains default. UI product approval still requires founder review.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The hardened runner needs to be run on a workstation with a real Microsoft tenant to verify the fix end to end. The unit test exercises the safe evaluator retry behaviour with a mock page; the live OAuth navigation only happens during a real founder walk. The transient navigation pattern list is conservative and covers the four documented Playwright errors; new Playwright error strings may need to be added to the list in the future.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S31I
Change date: 2026-05-15
Product version: 0.138.34
Methodology engine version: 0.9.1
Phase 1G-S31I - Layered live connection detection, status endpoint authority, and precise failure diagnostics
Reason: The S31H runner cleared the Playwright execution context destruction during Microsoft OAuth redirects, but the founder retested on the live workstation and the runner now fails after OAuth return with the generic connected state not detected after oauth return key. That generic key does not diagnose whether the app is actually disconnected, the chip never rendered the connected attribute, or the environment list never loaded. S31I must add an authoritative server side connection status endpoint, expose reliable DOM markers on score paste, use layered runner detection, and replace the generic failure with precise diagnostic keys.
What changed: Package version bumped from 0.138.33 to 0.138.34. The build identity endpoint now reports product version 0.138.34 and phase id 1G-S31I. Backwards-compatible alias exports cascade. A new buyer-safe GET endpoint ships under app api agentproof microsoft connection status route ts. The endpoint reads the agentproof ms session cookie, looks up the server side session record, and returns a documented JSON shape with connected, connection state, tenant domain, environments known, environment count, session present, flow state, served at ms, schema version, sentinel. The endpoint never returns access token, refresh token, client secret, or bearer token material; it never imports the outbound Microsoft clients or the auth code exchange helper. A new client component agent proof microsoft connection status beacon polls the endpoint on mount, on visibility change, and every six seconds, and surfaces eight stable data attributes on an invisible top level span mounted at the top of score paste. The live connected runner adds detect connected state layered plus fetch status endpoint via page plus fetch environments via page plus read beacon markers via safe evaluate; the sign in flow asks the status endpoint, the DOM beacon plus chip, and the environments endpoint in that order. Instead of the legacy generic key the runner now pushes ONE of five precise failure keys: microsoft status endpoint not connected, microsoft status endpoint error, connected dom marker missing, connected but environments not loaded, oauth return route not agentproof. On failure the runner captures s31i layered detection failure png and the back compat s31h oauth return not connected png and records the full layered evidence in the live acceptance report. 
Six new archive blocking gates ship: microsoft connection status endpoint s31i, score paste connected dom markers s31i, layered connected state detection s31i, precise failure diagnostics s31i, contradiction detection s31i, live evidence policy preserved s31i. The S31G and S31H live evidence packaging policy is preserved byte for byte. No second chip system. No scoring change. Founder summary now reads 283 of 283 checks green.
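The layered detection order (status endpoint first, then the DOM beacon and chip, then the environments endpoint) and its mapping onto the five precise failure keys can be sketched as a pure function. The snake_case key spellings, the `LayeredEvidence` shape, and the choice to check the return route before the other layers are assumptions for illustration; the entry names the keys and the layer order but not their exact encoding.

```typescript
// Assumed spellings of the five failure keys named in the entry.
type FailureKey =
  | "oauth_return_route_not_agentproof"
  | "microsoft_status_endpoint_error"
  | "microsoft_status_endpoint_not_connected"
  | "connected_dom_marker_missing"
  | "connected_but_environments_not_loaded";

// Hypothetical evidence gathered by the runner after the OAuth return.
interface LayeredEvidence {
  returnedToAgentProofRoute: boolean;
  statusEndpoint: { reachable: boolean; connected: boolean };
  domMarkerConnected: boolean;
  environmentCount: number;
}

// Returns exactly one failure key for the first layer that refuses, or null when all layers agree.
function detectConnectedStateLayered(e: LayeredEvidence): FailureKey | null {
  if (!e.returnedToAgentProofRoute) return "oauth_return_route_not_agentproof";
  if (!e.statusEndpoint.reachable) return "microsoft_status_endpoint_error";
  if (!e.statusEndpoint.connected) return "microsoft_status_endpoint_not_connected";
  if (!e.domMarkerConnected) return "connected_dom_marker_missing";
  if (e.environmentCount === 0) return "connected_but_environments_not_loaded";
  return null;
}
```

Because the checks run in a fixed order and return on the first refusal, the report carries one key per failed run, which is what lets the founder tell a genuinely disconnected session apart from a chip that never rendered.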
User impact: On the founder workstation, the live runner now diagnoses exactly which layer refused the connection. If the founder is actually not connected the report carries microsoft status endpoint not connected and the full status endpoint body. If the founder is connected but the chip never appeared the report carries connected dom marker missing plus the dom markers excerpt. If the founder is connected but environment discovery has not completed the report carries connected but environments not loaded plus the environments endpoint response. The founder always knows the precise diagnosis instead of the generic key. The archive does NOT carry the live screenshots or the live report by default; the explicit opt in flag and the secret scan are preserved.
When to re-score: Tooling bug fix plus a new buyer-safe read only endpoint. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Tooling, Self evaluation gate, Api
Evidence trace: the new Phase 1G-S31I evidence record plus the new microsoft connection status endpoint route plus the new dom beacon component plus the score paste mount plus the hardened live runner with the layered detector plus the dedicated S31I runtime test plus the six new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.33 to 0.138.34 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31I internal methodology record (layered live connection detection plus status endpoint authority plus precise failure diagnostics)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-15
- Reference
- Phase 1G-S31I founder execution signal: the S31H runner cleared OAuth navigation but the founder retested and hit the generic connected state not detected after oauth return failure. The founder required a precise diagnosis layered across status endpoint, DOM markers, and environment readiness.
- Impact assessment
- Bumps the package version from 0.138.33 to 0.138.34. Ships a new buyer-safe server side connection status endpoint and a new invisible DOM beacon. Hardens the live connected runner with three layers of detection and five precise failure keys. Adds a dedicated S31I test. Adds six new archive blocking gates. Live evidence packaging policy unchanged. Tokens stay server side. Score engine unchanged. NullProvider remains default. UI product approval still requires founder review.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The runner cannot be exercised end to end inside CI because the Microsoft OAuth step needs founder input. The dedicated S31I test exercises the endpoint GET handler with mock Requests and source scans the runner shape. The full live walk must be performed by the founder on a workstation with a real Microsoft tenant.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S31J
Change date: 2026-05-15
Product version: 0.138.35
Methodology engine version: 0.9.1
Phase 1G-S31J - Fix Microsoft post-OAuth server session truth and status endpoint mismatch
Reason: The S31I runner reached the status endpoint but the founder retested and the endpoint reported NOT connected after a successful Microsoft sign-in. The browser appeared authenticated yet the server-authoritative status endpoint returned connected false. S31J must diagnose and fix the mismatch between OAuth callback success, server session storage, cookies, and the new status endpoint.
What changed: Package version bumped from 0.138.34 to 0.138.35. The build identity endpoint now reports product version 0.138.35 and phase id 1G-S31J. Backwards-compatible alias exports cascade. Root cause: the S31I status endpoint declared `export const dynamic = "force-dynamic"` but did NOT declare `export const runtime = "nodejs"`. Without that runtime declaration, Next.js could route the GET to the Edge runtime, where `globalThis.__AGENTPROOF_MICROSOFT_SESSION_STORE_V1__` is a separate instance from the one the Node-runtime OAuth callback writes to. The Map of sessions populated by the callback was therefore invisible to the status endpoint. S31J fixes the runtime declaration on the status endpoint and ships a new shared lib connectors microsoft microsoft connection truth ts that exports get microsoft connection truth from request with the documented microsoft connection truth shape (cookie present, session id present, session store hit, session keys present, flow state, token present, token validity exceeded, tenant domain, tenant display name, environments known, environment count, connected, connection state, last error code, last error summary) and six documented last error code branches (no cookie, session not found, flow state not signed in, token missing, token validity exceeded, ok). The status endpoint refactor routes every cookie and session read through the shared truth helper. The endpoint never imports the session store directly. The response schema is bumped to agentproof microsoft connection status v2 with a diagnostics block. The beacon component mirrors five new safe data attributes on the invisible top-level span. The live runner records s31j status truth plus the documented S31J STATUS TRUTH FIELDS list in the live acceptance report whenever microsoft status endpoint not connected fires and logs the documented one-line summary.
Eight new archive blocking gates ship: microsoft session truth helper s31j, oauth callback status consistency s31j, status endpoint cookie contract s31j, status endpoint session store contract s31j, status endpoint safe diagnostics s31j, runner reports status truth fields s31j, no fake connected state s31j, live evidence policy preserved s31j. No tokens leak. No fake connected state. The S31G, S31H, and S31I live evidence packaging policy is preserved byte for byte. No second chip system. No scoring change. Founder summary now reads 291 of 291 checks green.
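The six last error code branches of the shared truth helper can be illustrated with a reduced sketch. It is not the shipped helper: the snake_case code spellings, the `SessionRecord` fields, the `signed_in` flow state value, and the `getConnectionTruth` signature are assumptions, and only four of the documented truth fields are modelled. The session record deliberately holds a boolean rather than any token material.

```typescript
// Assumed spellings of the six documented last error codes.
type LastErrorCode =
  | "no_cookie"
  | "session_not_found"
  | "flow_state_not_signed_in"
  | "token_missing"
  | "token_validity_exceeded"
  | "ok";

// Hypothetical server-side session record; no token value is ever stored here.
interface SessionRecord {
  flowState: string;
  tokenPresent: boolean;
  tokenExpiresAtMs: number;
}

// Reduced truth shape covering a subset of the documented fields.
interface ConnectionTruth {
  cookiePresent: boolean;
  sessionStoreHit: boolean;
  connected: boolean;
  lastErrorCode: LastErrorCode;
}

// Walks the contract layers in order and reports the first one that refuses.
function getConnectionTruth(
  sessionId: string | undefined,
  store: Map<string, SessionRecord>,
  nowMs: number,
): ConnectionTruth {
  const cookiePresent = sessionId !== undefined && sessionId !== "";
  const session = cookiePresent ? store.get(sessionId as string) : undefined;
  let lastErrorCode: LastErrorCode = "ok";
  if (!cookiePresent) lastErrorCode = "no_cookie";
  else if (!session) lastErrorCode = "session_not_found";
  else if (session.flowState !== "signed_in") lastErrorCode = "flow_state_not_signed_in";
  else if (!session.tokenPresent) lastErrorCode = "token_missing";
  else if (nowMs >= session.tokenExpiresAtMs) lastErrorCode = "token_validity_exceeded";
  return {
    cookiePresent,
    sessionStoreHit: session !== undefined,
    connected: lastErrorCode === "ok",
    lastErrorCode,
  };
}
```

Deriving `connected` solely from the error code means the endpoint cannot report a connected state while any layer is refusing, which is the property the no fake connected state gate guards.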
User impact: On the founder workstation, after Microsoft sign in completes, the status endpoint now returns connected true with the same session the OAuth callback wrote. The endpoint runs in the same Node runtime as the callback so the in-memory session store is shared. The live runner can now either complete the founder walk or surface a precise diagnosis like cookie present false, session store hit false, flow state pending, token validity exceeded - so the founder sees exactly which contract layer refused.
When to re-score: Tooling bug fix plus a new buyer-safe shared helper and safe diagnostic fields. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Api, Tooling, Self evaluation gate
Evidence trace: the new Phase 1G-S31J evidence record plus the new shared truth helper module plus the refactored status endpoint with runtime nodejs and the diagnostics block plus the updated beacon with five new safe data attributes plus the hardened live runner with the s31j status truth lift plus the dedicated S31J runtime test plus the eight new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.34 to 0.138.35 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S31J internal methodology record (fix Microsoft post OAuth server session truth and status endpoint mismatch)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-15
- Reference
- Phase 1G-S31J founder execution signal: the S31I runner reached the status endpoint but the founder retested and the endpoint reported not connected after a successful Microsoft sign in. The browser appeared authenticated yet the server authoritative status endpoint returned connected false.
- Impact assessment
- Bumps the package version from 0.138.34 to 0.138.35. Ships a new shared truth helper. Refactors the status endpoint to use the helper and declares runtime nodejs. Surfaces 13 safe diagnostic fields on the response. Updates the beacon and the live runner. Adds a dedicated S31J test. Adds eight new archive blocking gates. Tokens stay server side. Tokens are never returned to the browser. The live evidence packaging policy is unchanged. Score engine unchanged. NullProvider remains default. UI product approval still requires founder review.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The dedicated S31J test exercises the truth helper branches with mocked session stores and the status endpoint with mocked Request objects. The full live walk against a real Microsoft tenant must be performed by the founder on a connected workstation to verify the runtime fix end to end.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S30J
Change date: 2026-05-14
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S30J - hard functional UX repair: real connection flow, real actions, no contradictions
Reason: S30I rendered the chip status as Not connected while the drawer exposed Disconnect; there was no usable Microsoft connect path inside the new views; several buttons did nothing; methodology still appeared in the default Estate view. The founder rejected this contradictory state and demanded a real working product flow inside the single-active-view workspace architecture without re-enabling the legacy stack.
What changed: Package version was bumped from 0.138.8 to 0.138.9. The build-identity endpoint now reports product version 0.138.9 and phase id 1G-S30J. Backwards-compatible alias exports cascade. A new lib module exports the documented 8-state agent proof connection state union and pure derivation helpers (derive agent proof connection state, connection state allows connect, connection state allows disconnect, derive connection drawer items). The agent proof connection status chip drawer now reads from this model and gates Connect Microsoft + Disconnect behind mutually-exclusive booleans so the founder S30I contradiction is structurally impossible. json paste score card hoists two real handlers from the legacy gated stack: microsoft start connect flow (POSTs to the real auth start endpoint) and microsoft refresh environments (GETs the real environments endpoint). The chip drawer and the estate dashboard view Connect Microsoft CTA both call these same real handlers. estate dashboard view is now state-aware: not connected renders a Connect Microsoft hero with three value tiles and NO fake estate metrics; connected no environment renders a Choose environment hero; environment selected renders a Discover agents hero; only the analysed states render the full metric strip + environment cards + prioritised next actions. methodology governance centre is no longer imported or mounted by the Estate view. Ten new archive-blocking gates A-J refuse any future revert.
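The idea of gating Connect Microsoft and Disconnect behind mutually exclusive booleans can be shown with a reduced model. The entry documents an 8-state union but does not list all eight states, so this sketch uses only the four states it names; the snake_case spellings and the function names are assumptions loosely derived from the helper names mentioned above, not the real lib module.

```typescript
// Reduced, assumed subset of the documented 8-state union.
type ConnectionState =
  | "not_connected"
  | "connected_no_environment"
  | "environment_selected"
  | "analysed";

const ALL_STATES: ConnectionState[] = [
  "not_connected",
  "connected_no_environment",
  "environment_selected",
  "analysed",
];

function connectionStateAllowsConnect(state: ConnectionState): boolean {
  return state === "not_connected";
}

// Defined as the negation of "allows connect", so both can never be true together.
function connectionStateAllowsDisconnect(state: ConnectionState): boolean {
  return !connectionStateAllowsConnect(state);
}

// Drawer items follow from the two booleans; labels are taken from the entry.
function deriveConnectionDrawerItems(state: ConnectionState): string[] {
  return connectionStateAllowsConnect(state)
    ? ["Connect Microsoft"]
    : ["Refresh environments", "Disconnect"];
}
```

With the drawer derived from a single state value, a Not connected chip shown next to a Disconnect item cannot be produced by this model.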
User impact: The buyer never sees Not connected alongside Disconnect again. The first connected screen leads to a real Microsoft sign-in CTA when not yet connected. Once connected, the chip drawer surfaces Refresh environments + Disconnect; the Estate hero pivots to Choose environment. Methodology controls disappear from the Estate workspace and live only inside the Methodology view reachable via the top nav.
When to re-score: Presentation wiring only. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Components, Lib, Tests, Content, Self evaluation gate
Evidence trace: the new Phase 1G-S30J evidence record plus the new connection state model lib plus the state-aware agent proof connection status chip plus the state-aware estate dashboard view plus the hoisted microsoft start connect flow + microsoft refresh environments handlers in json paste score card plus the ten new self-evaluation checks s30j_* plus the new dedicated Phase 1G-S30J test file plus the updated build-identity endpoint constants plus the package version bump from 0.138.8 to 0.138.9 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30J internal methodology record (hard functional UX repair, real connection flow, no contradictions)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-14
- Reference
- Phase 1G-S30J founder execution signal: S30I rendered contradictory state (Not connected chip + Disconnect drawer item); no usable Microsoft connect path inside the new views; methodology still in the default Estate view; the founder explicitly rejected fake shells, contradictory state, and dead buttons.
- Impact assessment
- Bumps the package version from 0.138.8 to 0.138.9, introduces a single 8-state connection-state model, hoists real Microsoft connect and refresh-environments handlers, makes EstateDashboardView state-aware, removes MethodologyGovernanceCentre from the Estate view source, and adds ten archive-blocking gates A-J that refuse any revert. Tokens stay server-side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, confirm the chip says Not connected with a Connect Microsoft CTA, click it to start the real OAuth flow, return after auth to see the chip say Connected with the Disconnect option in the drawer, confirm the Estate view now shows Choose environment, click through to the Environment view, list agents, open an agent, and confirm the wizard saves answers via the canonical writer.
- Re-score optional
Change id: 1G-S30K
Change date: 2026-05-14
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S30K - Portfolio-first information architecture: portfolio above estate, tenant identity, breadcrumb, honest add-tenant
Reason: The founder confirmed the buyer profile is locked to large corporations with multiple Microsoft tenants. Multi-tenant is the core product, not optional. The S30J hierarchy stopped at the per-tenant estate and used ambiguous everything-wording that conflicted with the strict definition: an estate is everything inside a single tenant. Drilling into one environment still referenced estate even though the buyer was inside one environment. S30K corrects the information architecture: Portfolio is the new top-level workspace view above the per-tenant estate; tenant identity is surfaced in the connection chip and in the per-tenant estate hero; a workspace breadcrumb mounts on every non-Portfolio view so the buyer can navigate Portfolio - Estate - Environment - Agent; the Environment + Agent surfaces are stripped of the literal estate word and use tenant-scoped wording. An honest Add tenant call to action ships in the chip drawer and in the Portfolio view footer; the call to action triggers the real Microsoft connect flow rather than pretending a second tenant exists.
What changed: Package version was bumped from 0.138.9 to 0.138.10. The build-identity endpoint now reports product version 0.138.10 and phase id 1G-S30K. Backwards-compatible alias exports cascade. A new pure lib module ships at the documented path that exports the buyer-facing tenant identity shape, the per-estate roll-up shape, the documented portfolio aggregate shape, and the pure derivation helpers (derive portfolio rollup + portfolio readiness label). A new top-level workspace view ships at the documented path that renders one of two documented shapes: no tenant connected (Connect your first Microsoft tenant call to action plus three value tiles plus no fake metric strip) or one or more tenants connected (aggregate metric strip plus per-estate cards plus honest Add tenant call to action footer). A new workspace breadcrumb component ships at the documented path and mounts on every non-Portfolio workspace view via json paste score card. The agent proof connection status chip is extended with three new capabilities: tenant identity surfaced in the chip label when active tenant identity is known and the chip is in a connected state; tenant switcher inside the drawer listing connected tenants with the active tenant marked; honest Add tenant call to action inside the drawer that triggers the real connect handler. The estate dashboard view now surfaces a tenant identity strip at the top of the view, the analysed hero label reads Readiness in tenant domain instead of the previous wording, and the not-connected hero heading is rewritten to Connect Microsoft to discover this tenant agents. Every literal estate word is removed from environment dashboard view, agent workspace view, and environment command room; the S30K C plus D gates source-scan each file and refuse the archive if any literal occurrence reappears. 
json paste score card imports the new Portfolio view + breadcrumb + portfolio state model, defaults active workspace view to portfolio dashboard, mounts the Portfolio view in the workspace router, wires the chip with active tenant identity plus connected tenants plus the honest Add tenant handler, passes tenant identity into the estate dashboard view, and derives the breadcrumb segments from the active view plus selected environment plus selected agent. Ten new archive-blocking gates A through J refuse any future revert.
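The breadcrumb derivation described above (segments built from the active view plus the selected environment plus the selected agent) might look roughly like the following. This is a sketch under stated assumptions: `WorkspaceView`, `BreadcrumbInput`, and `deriveBreadcrumbSegments` are hypothetical names, and the choice to label a segment with the object name when it is known is an assumption, not the shipped behaviour.

```typescript
type WorkspaceView = "portfolio" | "estate" | "environment" | "agent";

// Hypothetical input shape; the real props are not documented in this entry.
interface BreadcrumbInput {
  activeView: WorkspaceView;
  tenantDomain?: string;
  environmentName?: string;
  agentName?: string;
}

const VIEW_ORDER: readonly WorkspaceView[] = ["portfolio", "estate", "environment", "agent"];

function deriveBreadcrumbSegments(input: BreadcrumbInput): string[] {
  const depth = VIEW_ORDER.indexOf(input.activeView);
  // The breadcrumb mounts only on non-Portfolio views.
  if (depth <= 0) return [];
  const labels: Record<WorkspaceView, string> = {
    portfolio: "Portfolio",
    estate: input.tenantDomain ?? "Estate",
    environment: input.environmentName ?? "Environment",
    agent: input.agentName ?? "Agent",
  };
  // Keep every level from Portfolio down to the active view, in hierarchy order.
  return VIEW_ORDER.slice(0, depth + 1).map((view) => labels[view]);
}
```

Because the segments are sliced from a fixed hierarchy order, a deeper view can never render without its ancestors.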
User impact: The buyer always starts at the documented top of the hierarchy (Portfolio). The Portfolio view aggregates across every connected tenant; with no tenant connected, an honest Connect your first Microsoft tenant call to action replaces the metric strip. Once a tenant is connected, the chip surfaces the tenant domain, the drawer lists connected tenants with an honest Add Microsoft tenant call to action, and the Estate view leads with the tenant identity strip so the buyer always knows which tenant they are looking at. Inside a tenant the wording is consistently tenant-scoped (this environment / this agent); the literal estate word does not appear in the Environment or Agent surfaces.
When to re-score: Information architecture only. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No fixture re-scoring is required.
Ui, Self evaluation gate
Evidence trace: the new Phase 1G-S30K evidence record plus the new portfolio state model lib plus the new portfolio dashboard view plus the new agent proof breadcrumb plus the extended agent proof connection status chip plus the tenant-scoped estate dashboard view plus the literal-estate-word strip from environment dashboard view + agent workspace view + environment command room plus the Portfolio router wiring + breadcrumb mount + tenant identity wiring + honest Add tenant wiring in json paste score card plus the ten new self-evaluation checks s30k_* plus the new dedicated Phase 1G-S30K test file plus the updated build-identity endpoint constants plus the package version bump from 0.138.9 to 0.138.10 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30K internal methodology record (Portfolio-first information architecture)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-14
- Reference
- Phase 1G-S30K founder execution signal: the founder confirmed the buyer profile is locked to large corporations with multiple Microsoft tenants. Multi-tenant is the core product. The S30J Estate-as-top architecture conflicted with the buyer mental model; the literal estate word appeared inside per-environment views where it had no scope meaning; there was no place to add a second tenant honestly.
- Impact assessment
- Bumps the package version from 0.138.9 to 0.138.10, introduces a new top-level Portfolio workspace view above the per-tenant Estate, surfaces tenant identity in the chip plus per-tenant hero, mounts a breadcrumb on every non-Portfolio view, strips the literal estate word from inside-tenant surfaces, ships an honest Add tenant call to action wired to the real Microsoft connect flow, and adds ten archive-blocking gates A-J that refuse any revert. Tokens stay server-side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, confirm the workspace top nav leads with Portfolio, confirm the Portfolio view shows the honest no-tenant hero on a fresh session, click Connect Microsoft tenant to start the real OAuth flow, return after authentication, confirm the chip shows the active tenant domain, click Open tenant to switch to the Estate view, confirm the tenant identity strip + Back to Portfolio button render, click through to the Environment view + Agent view, confirm the breadcrumb mounts with all expected segments, open the chip drawer and confirm the Add Microsoft tenant call to action is visible (clicking it triggers a second OAuth sign-in). Live in-session multi-tenant connection management is the next slice.
- Re-score optional
Change id: 1G-S30L
Change date: 2026-05-14
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S30L - Hard hierarchy lock and current-location clarity
Reason: S30K added Portfolio above Estate, but the founder hierarchy lock directive required every view to show clearly where the user is. The two-second clarity contract was not met: page titles did not include the current object, scope badges were implicit, only the Estate view had a back button, the top navigation still carried the old flat list (Estate / Environment / Agent / Review / Report / Evidence / Intelligence / History / Methodology), and the Estate view rendered per-agent prioritised next actions (agent-level content inside Estate scope). S30L locks the hierarchy so the founder can identify location in under two seconds and never sees content from two hierarchy levels at once.
What changed: Package version was bumped from 0.138.10 to 0.138.11. The build-identity endpoint now reports product version 0.138.11 and phase id 1G-S30L. Backwards-compatible alias exports cascade. The active workspace view lib exports a new visible top nav order containing exactly two entries (portfolio dashboard + review history) plus a documented visible top nav sentinel. agent proof workspace top nav now iterates over visible top nav order (not the full workspace view nav order) so the legacy flat workspace switcher is removed by construction. portfolio dashboard view surfaces an explicit Portfolio scope badge plus page title Portfolio plus data-scope-level portfolio. estate dashboard view surfaces an Estate scope badge plus page title Estate tenant-domain plus Back to Portfolio button plus data-scope-level estate. environment dashboard view surfaces an Environment scope badge plus page title Environment env-name plus Back to Estate button plus data-scope-level environment. agent workspace view surfaces an Agent scope badge plus page title Agent agent-name plus Back to Environment button plus data-scope-level agent. The legacy per-agent prioritised next-actions section inside Estate scope is permanently gated behind a false render gate (the no agent content sentinel proves it). json paste score card wires the Back to Estate handler into environment dashboard view and the Back to Environment handler into agent workspace view. A new archive-blocking gate hierarchy current location clarity source-scans every view for the documented page title plus scope badge plus back button plus simplified top nav plus Estate no-agent-content sentinel and refuses any future revert. A dedicated test file covers sixteen requirements over the rendered DOM plus source plus self-eval gate.
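The per-view header contract (scope badge, page title with the current object, one-level-up back button, data-scope-level value) can be expressed as a small pure helper. The sketch below is illustrative only: `ScopeLevel`, `ScopeHeader`, and `buildScopeHeader` are hypothetical names, and the real views may assemble these values differently.

```typescript
type ScopeLevel = "portfolio" | "estate" | "environment" | "agent";

interface ScopeHeader {
  scopeLevel: ScopeLevel; // value for the data-scope-level attribute
  scopeBadge: string;
  pageTitle: string;
  backLabel: string | null; // null at the top of the hierarchy
}

const HIERARCHY: readonly ScopeLevel[] = ["portfolio", "estate", "environment", "agent"];

function capitalise(word: string): string {
  return word.charAt(0).toUpperCase() + word.slice(1);
}

function buildScopeHeader(level: ScopeLevel, objectName?: string): ScopeHeader {
  const label = capitalise(level);
  const index = HIERARCHY.indexOf(level);
  // Back always targets the parent level, never a flat workspace list.
  const parent = index > 0 ? HIERARCHY[index - 1] : undefined;
  return {
    scopeLevel: level,
    scopeBadge: `${label} scope`,
    pageTitle: level === "portfolio" || !objectName ? label : `${label} ${objectName}`,
    backLabel: parent ? `Back to ${capitalise(parent)}` : null,
  };
}
```

For example, `buildScopeHeader("estate", "acme.onmicrosoft.com")` yields the badge Estate scope, the title Estate acme.onmicrosoft.com, and the back label Back to Portfolio.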
User impact: The founder can look at any view for two seconds and know exactly where they are: a scope badge says Portfolio scope or Estate scope or Environment scope or Agent scope; the page title says Portfolio or Estate tenant-domain or Environment env-name or Agent agent-name; a Back button moves up one level (never to a flat workspace list); the top navigation surfaces only Portfolio plus Review history plus the connection chip; deeper movement happens through hierarchy cards plus the breadcrumb plus back buttons. Estate scope no longer shows agent review actions, environment tables, or agent dashboards underneath.
When to re-score: Presentation hierarchy lock only. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Ui, Self evaluation gate
Evidence trace: the new Phase 1G-S30L evidence record plus the new visible top nav order export plus the simplified agent proof workspace top nav plus the page-title plus scope-badge headers on every view plus the Back to Portfolio plus Back to Estate plus Back to Environment buttons plus the no agent content sentinel inside estate dashboard view plus the new hierarchy current location clarity self-evaluation check plus the new dedicated Phase 1G-S30L test file plus the updated build-identity endpoint constants plus the package version bump from 0.138.10 to 0.138.11 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30L internal methodology record (hard hierarchy lock and current-location clarity)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-14
- Reference
- Phase 1G-S30L founder execution signal: the S30K hierarchy lock directive demanded that every view clearly show where the user is - page title with current scope, clickable breadcrumb, scope badge, back action one level up, view separation, simplified top nav. The founder must be able to identify location in two seconds; S30K did not meet the contract.
- Impact assessment
- Bumps the package version from 0.138.10 to 0.138.11, simplifies the top nav to Portfolio plus Review history plus the connection chip, adds explicit page title plus scope badge plus back-navigation button to every view, removes per-agent content from Estate scope (the legacy next-actions section is permanently false-gated), adds one archive-blocking gate hierarchy_current_location_clarity that refuses any revert that breaks two-second clarity, and renames the previously-planned multi-tenant phase to S30M plus persistence to S30N. Tokens stay server-side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, look at each view for two seconds, and confirm: (1) the scope badge says Portfolio scope, Estate scope, Environment scope, or Agent scope according to the view, (2) the page title surfaces the current object name (Portfolio, Estate plus tenant domain, Environment plus name, Agent plus name), (3) the back button moves one level up the hierarchy, (4) the top navigation surfaces only Portfolio plus Review history plus the chip (no flat workspace switcher), (5) Estate scope shows only environment cards (no per-agent rows). Live in-session multi-tenant connection management is the next slice S30M.
- Re-score optional
Change id: 1G-S30M
Change date: 2026-05-14
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S30M - Live hierarchy flow, connection truth, and no-dead-buttons enforcement
Reason: S30L locked the hierarchy Portfolio - Estate - Environment - Agent visually but the founder hierarchy lock directive required the live product to actually be usable: real Connect / Disconnect / Refresh actions, no dead buttons, no contradictory connection state, no mixed hierarchy views, and interaction-level proof that the buttons do something (not just source scans). S30M adds a canonical 4-state derive connection truth helper plus twelve archive-blocking gates plus a dedicated interaction-level test file running under happy-dom so the founder flow is proven end-to-end via real DOM click events.
What changed: Package version was bumped from 0.138.11 to 0.138.12. The build-identity endpoint now reports product version 0.138.12 and phase id 1G-S30M. Backwards-compatible alias exports cascade. A new pure lib module at the documented path exports derive connection truth + connection truth state + connection truth shape + connection truth chip label + is connection truth contradictory. The 4-state union collapses the S30J 8-state into not connected plus connecting plus connected plus error and computes can connect plus can disconnect plus can refresh environments plus the primary action label and enabled flag. The two action booleans are disjoint by construction so Not connected plus Disconnect can never appear together. agent proof connection status chip now derives connection truth and surfaces it via the data-connection-truth-state + data-truth-can-connect + data-truth-can-disconnect + data-truth-chip-label attributes; the chip label reads Connected to tenant-domain when truth.connection state is connected. Twelve new archive-blocking gates ship: connection truth no contradictions, connect disconnect buttons functional, portfolio estate environment agent flow, breadcrumbs and back buttons functional, single hierarchy level visible, questions only inside agent view, reports only inside agent view, methodology not default flow, environment filters change rows, no dead buttons live path, founder flow interaction smoke, and no setup blocks when connected. A new dedicated unit-test file runs under happy-dom; it mounts the views in a real DOM, dispatches real click events, and proves the founder hop-by-hop flow plus the segment filters visibly changing the row count plus the Disconnect button being invisible when not connected. The happy-dom dev dependency was added.
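A four-state connection-truth helper of the kind described above could be sketched as follows. The function name `deriveConnectionTruth` appears in this entry; everything else here is an assumption made for illustration, including the input shape, the camel-cased field names, the chip labels for the connecting and error states, and the primary action label used once connected.

```typescript
type ConnectionTruthState = "not_connected" | "connecting" | "connected" | "error";

// Hypothetical input shape; the real module's inputs are not documented in this entry.
interface ConnectionInput {
  authenticated: boolean;
  connecting: boolean;
  errorMessage?: string;
  tenantDomain?: string;
}

interface ConnectionTruth {
  connectionState: ConnectionTruthState;
  canConnect: boolean;
  canDisconnect: boolean;
  canRefreshEnvironments: boolean;
  primaryActionLabel: string;
  primaryActionEnabled: boolean;
  chipLabel: string;
}

function deriveConnectionTruth(input: ConnectionInput): ConnectionTruth {
  const connectionState: ConnectionTruthState = input.connecting
    ? "connecting"
    : input.errorMessage
      ? "error"
      : input.authenticated
        ? "connected"
        : "not_connected";
  // Disjoint by construction: each flag is tied to a different set of states.
  const canDisconnect = connectionState === "connected";
  const canConnect = connectionState === "not_connected" || connectionState === "error";
  const chipLabels: Record<ConnectionTruthState, string> = {
    not_connected: "Not connected",
    connecting: "Connecting", // assumed copy
    connected: input.tenantDomain ? `Connected to ${input.tenantDomain}` : "Connected",
    error: "Connection error", // assumed copy
  };
  return {
    connectionState,
    canConnect,
    canDisconnect,
    canRefreshEnvironments: canDisconnect,
    primaryActionLabel: canDisconnect ? "Refresh environments" : "Connect Microsoft", // connected label is assumed
    primaryActionEnabled: connectionState !== "connecting",
    chipLabel: chipLabels[connectionState],
  };
}

function isConnectionTruthContradictory(truth: ConnectionTruth): boolean {
  return truth.canConnect && truth.canDisconnect;
}
```

In this sketch the contradiction check can only ever return false for values produced by `deriveConnectionTruth`; it exists so a gate or test can assert that property against hand-built or legacy state objects.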
User impact: Founders open /score/paste, see the Portfolio dashboard, click Connect Microsoft (the only primary action when not connected), and after a real OAuth round-trip see the chip read Connected to acme.onmicrosoft.com with a Disconnect button in the drawer (no Not connected plus Disconnect contradiction). They click Open estate plus Open environment plus Open agent and reach the documented scope; clicking Back returns one level up. The Environment segment filters actually narrow the visible row count. No button in the live path is placeholder, console-only, href hash, or TODO; the no dead buttons live path gate refuses any future revert.
When to re-score: Presentation hierarchy plus connection-truth plumbing only. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Ui, Self evaluation gate, Dev dependencies
Evidence trace: the new Phase 1G-S30M evidence record plus the new connection truth module plus the chip wiring updates plus the twelve new self-evaluation checks plus the new dedicated Phase 1G-S30M interaction-level test file plus the happy-dom dev dependency plus the updated build-identity endpoint constants plus the package version bump from 0.138.11 to 0.138.12 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30M internal methodology record (live hierarchy flow, connection truth, no dead buttons enforcement)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-14
- Reference
- Phase 1G-S30M founder execution signal: the S30L hierarchy lock was visual only. The founder demanded real Connect plus Disconnect plus Refresh actions, no dead buttons, no contradictory connection state, no mixed hierarchy views, and interaction-level proof beyond source scans.
- Impact assessment
- Bumps the package version from 0.138.11 to 0.138.12, introduces the canonical four-state deriveConnectionTruth helper, wires the chip to consume it, adds twelve archive-blocking gates that refuse any revert, and adds a happy-dom-backed interaction test file proving the founder flow end-to-end. Tokens stay server-side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, click through Portfolio plus Estate plus Environment plus Agent, click each Back button, click the Environment segment filters and confirm the visible row count narrows, open the chip drawer in each connection state, and confirm no dead buttons appear. Live multi-tenant connection management remains the next slice S30N (renamed from S30M).
- Re-score optional
Change id: 1G-S30N
Change date: 2026-05-14
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S30N - Live data rehydration, real button actions, honest placeholders
Reason: The founder live test of S30M surfaced six concrete buyer-flow defects: the Generate report click is dead at the end of the wizard; the Open agent CTA is dead at Environment scope; prior reviews are gone after a page reload; the Portfolio metric strip reports zero agents even when local storage holds prior reviews; the Estate environment cards report zero of zero reviewed for the same reason; and the Open environment workspace hero CTA navigates to an empty Environment view. Side observations: the placeholder string your-Microsoft-tenant leaks into the chip, breadcrumb, Estate hero, and tenant strip; the global methodology footer renders on the locked hierarchy default flow; the Estate tenant strip carries a duplicate Back to Portfolio button; the Environment view title surfaces a dangling colon when the environment name is empty.
What changed: Package version bumped from 0.138.12 to 0.138.13. The build-identity endpoint now reports product version 0.138.13 and phase id 1G-S30N. Backwards-compatible alias exports cascade. A new pure lib module ships a persisted Portfolio rollup walker that walks local storage for every review index key and per-agent report history key and aggregates the per-environment totals plus per-agent latest score plus portfolio-level average score plus last reviewed at. json paste score card mounts a use effect that calls the walker on the auth-status flip and projects the persisted history rows into microsoft report history so the score column repopulates after a page reload. The Portfolio rollup now reads reviewed agents plus average score plus last reviewed at from the persisted walker; the Estate environment cards inherit the same source. The new review wizard view Generate report CTA now invokes the legacy generate report button ref click before navigating so the real generation pipeline fires. The Environment row CTA now atomically sets the selected agent id and switches the workspace view in one click. The dead Open environment workspace hero CTA is removed from estate dashboard view. The placeholder string your-Microsoft-tenant is replaced with the calm Microsoft-tenant label via a new s30n tenant display label binding. A new client component agent proof layout footer route-gates the methodology paragraph behind not-locked-hierarchy-route; the legal disclaimer remains on every route. The duplicate Back to Portfolio button in the Estate tenant identity strip is removed. The Environment page title falls back to Environment-not-selected when the environment name is empty. Six new archive-blocking gates ship.
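The persisted rollup walker can be pictured as a scan over local storage keys that keeps the latest review per agent and then aggregates. The sketch below is an illustration only: the key prefix, the stored record shape, and the names `KeyValueStore`, `PersistedReview`, and `walkPersistedRollup` are all hypothetical, and it computes only the portfolio-level figures (the per-environment totals mentioned above are omitted for brevity).

```typescript
// Read-only slice of the Web Storage API; window.localStorage satisfies this shape.
interface KeyValueStore {
  readonly length: number;
  key(index: number): string | null;
  getItem(key: string): string | null;
}

// Hypothetical key prefix and record shape; the real keys are not documented here.
const REPORT_HISTORY_PREFIX = "agentproof:report-history:";

interface PersistedReview {
  agentId: string;
  score: number;
  reviewedAt: string; // ISO 8601 UTC, so string comparison orders by time
}

interface PortfolioRollup {
  reviewedAgents: number;
  averageScore: number | null;
  lastReviewedAt: string | null;
}

function walkPersistedRollup(store: KeyValueStore): PortfolioRollup {
  const latestByAgent = new Map<string, PersistedReview>();
  for (let i = 0; i < store.length; i++) {
    const key = store.key(i);
    if (key === null || !key.startsWith(REPORT_HISTORY_PREFIX)) continue;
    const raw = store.getItem(key);
    if (raw === null) continue;
    let rows: unknown;
    try {
      rows = JSON.parse(raw);
    } catch {
      continue; // skip corrupt entries rather than failing the whole walk
    }
    if (!Array.isArray(rows)) continue;
    for (const row of rows as PersistedReview[]) {
      const seen = latestByAgent.get(row.agentId);
      if (!seen || row.reviewedAt > seen.reviewedAt) latestByAgent.set(row.agentId, row);
    }
  }
  const latest = Array.from(latestByAgent.values());
  if (latest.length === 0) {
    return { reviewedAgents: 0, averageScore: null, lastReviewedAt: null };
  }
  const total = latest.reduce((sum, r) => sum + r.score, 0);
  const last = latest.reduce((max, r) => (r.reviewedAt > max ? r.reviewedAt : max), "");
  return { reviewedAgents: latest.length, averageScore: total / latest.length, lastReviewedAt: last };
}
```

Taking a narrow read-only interface rather than `localStorage` directly keeps the walker pure and lets a test pass an in-memory stand-in.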
User impact: After a real OAuth round trip and after reviewing several agents and refreshing the browser the founder now sees Portfolio plus Estate plus Environment cards that show the persisted counts plus scores plus last reviewed at; the placeholder tenant string never appears; the Generate report click at the end of the wizard fires the real score pipeline; the Open agent CTA navigates to the agent view in one click; the Open environment workspace dead CTA is gone; the global footer on /score/paste no longer surfaces methodology copy; the Estate tenant strip has one canonical back button.
When to re-score: Presentation plus persistence rehydration plumbing only. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Ui, Self evaluation gate
Evidence trace: the new Phase 1G-S30N evidence record plus the new persisted portfolio rollup module plus the new agent proof layout footer component plus the updated json paste score card wiring for the wizard plus the Environment row CTA plus the Portfolio rollup plus the tenant label plus the updated estate dashboard view and environment dashboard view plus the six new self-evaluation checks plus the new dedicated Phase 1G-S30N interaction-level test file plus the updated build-identity endpoint constants plus the package version bump from 0.138.12 to 0.138.13 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30N internal methodology record (live data rehydration plus real button actions plus honest placeholders)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-14
- Reference
- Phase 1G-S30N founder execution signal: the founder live test of S30M surfaced six concrete buyer-flow defects (Generate report dead plus Open agent dead plus prior reviews gone after reload plus Portfolio metric strip showing zero plus Estate environment cards showing zero plus Open environment workspace hero CTA dead) and side observations (placeholder tenant leak plus methodology footer on default flow plus duplicate Back to Portfolio plus dangling colon in Environment title).
- Impact assessment
- Bumps the package version from 0.138.12 to 0.138.13, ships a persisted portfolio rollup walker plus the wizard generate-report wiring plus the Environment row CTA wiring plus the placeholder tenant fix plus the route-gated methodology footer plus the dead hero CTA removal plus six new archive-blocking gates. Tokens stay server-side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, reload the page after at least one review, and confirm: Portfolio metric strip shows the correct reviewed count plus average score; Estate environment card shows the correct counts; Environment header shows persisted counts; the Generate report click navigates AND produces a report; the Open agent click navigates to the Agent view in one click; the placeholder tenant string never appears; the methodology footer line is absent from /score/paste.
- Re-score optional
Change id: 1G-S30O
Change date: 2026-05-14
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S30O - Restore intelligence depth inside locked hierarchy
Reason: The founder live test of S30N surfaced a content-depth regression at Agent scope: the Review wizard had been collapsed to a mediocre five-prompt list (the adaptive question family the buyer had seen before the UI reset was gone) and the Readiness report had been collapsed to a four-line placeholder (the seven canonical tabs were gone). The founder also asked for visible loading indicators during environment discovery (the page looked broken while the OAuth-to-environments-to-agents chain was running) and a fix for Estate environment cards still reading zero of zero reviewed for unselected environments. Non-negotiables: use the same component for the Agent Questions tab AND the standalone Review wizard view, keep all seven report tabs, keep Portfolio-Estate-Environment-Agent separation, do not re-enable the old single-page stack.
What changed: Package version bumped from 0.138.13 to 0.138.14. The build-identity endpoint now reports product version 0.138.14 and phase id 1G-S30O. Backwards-compatible alias exports cascade. json paste score card now exposes two shared render helpers: s30o render adaptive wizard returns the full adaptive question pipeline (build adaptive question family plus per-question render closure plus the agent review wizard mount) and s30o render report tabs returns the full seven-tab agent readiness report tabs (Summary plus Intelligence plus Actions plus Risks plus Evidence plus Stakeholders plus Trace) wired to the live view model. The adaptive wizard helper is invoked from BOTH the standalone review wizard view AND the Agent Questions tab so the same adaptive question engine runs in both places (one engine, one answer store, one save path, one scoring impact path, no duplicate simplified wizard). The seven-tab report helper is invoked from the standalone readiness report view AND the Agent Report tab AND the Agent Intelligence tab AND the Agent Evidence tab so the seven tabs reach every documented agent scope path. portfolio dashboard view plus estate dashboard view plus environment dashboard view now accept a discovery in flight prop and render a visible skeleton banner with aria live polite copy while the OAuth then environments then agents then footprint fetches are running. json paste score card derives s30o environment discovery in flight plus s30o agent discovery in flight plus s30o footprint discovery in flight plus the combined s30o microsoft discovery in flight plus a buyer safe label and passes them as props. 
The live environment cards derivation now builds a persisted env by id map from s30n persisted rollup.environments and merges persisted state into every env card: reviewed agents is max in-session and persisted, total agents floors at the reviewed count for unselected envs, average score falls back to persisted, last reviewed at takes the later timestamp, highest risk label prefers the informative in-session label else persisted. Five new archive-blocking gates ship: adaptive questions inside review wizard plus seven tab report inside agent scope plus intelligence panel inside agent scope plus discovery loading state visible plus estate environment cards reflect persisted state. A new dedicated unit-test file runs under happy-dom and proves the shared helpers mount in both scopes plus the loading banners render plus the env cards reflect persisted state.
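The merge rules for environment cards listed above translate almost directly into code. The following is a minimal sketch under stated assumptions: the type and function names are hypothetical, and an "informative" risk label is assumed to mean any non-empty string.

```typescript
interface SessionEnvCard {
  reviewedAgents: number;
  totalAgents: number;
  averageScore: number | null;
  lastReviewedAt: string | null; // ISO 8601 UTC
  highestRiskLabel: string | null;
}

// Hypothetical persisted shape, one entry per environment id.
interface PersistedEnvState {
  reviewedAgents: number;
  averageScore: number | null;
  lastReviewedAt: string | null;
  highestRiskLabel: string | null;
}

function laterTimestamp(a: string | null, b: string | null): string | null {
  if (a === null) return b;
  if (b === null) return a;
  return a > b ? a : b; // same-format ISO strings compare chronologically
}

function mergeEnvCard(session: SessionEnvCard, persisted?: PersistedEnvState): SessionEnvCard {
  if (!persisted) return session;
  // Reviewed count is the max of in-session and persisted.
  const reviewedAgents = Math.max(session.reviewedAgents, persisted.reviewedAgents);
  return {
    reviewedAgents,
    // Total can never be lower than the number already reviewed.
    totalAgents: Math.max(session.totalAgents, reviewedAgents),
    // In-session score wins; persisted is the fallback.
    averageScore: session.averageScore ?? persisted.averageScore,
    lastReviewedAt: laterTimestamp(session.lastReviewedAt, persisted.lastReviewedAt),
    // Assumption: an empty or missing in-session label is uninformative.
    highestRiskLabel: session.highestRiskLabel || persisted.highestRiskLabel,
  };
}
```

With these rules an unselected environment that has three persisted reviews renders as three of three reviewed rather than zero of zero.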
User impact: After connecting Microsoft and opening any agent, the founder now sees the full adaptive question wizard the buyer had before the UI reset (multiple question groups, dynamic per-fact prompts, the full five-step canonical question pipeline) in BOTH the Agent Questions tab AND the standalone Review wizard view. After generating a report the founder sees the full seven-tab agent readiness report (Summary plus Intelligence plus Actions plus Risks plus Evidence plus Stakeholders plus Trace) in every documented place. While Microsoft discovery is running, the founder sees a visible skeleton loading banner with the current step label (Connecting Microsoft, Discovering environments, Discovering agents, Reading footprint) so the page never looks broken. Estate environment cards show persisted counts plus scores for every environment that has prior reviews in local storage, not just the env the founder happens to have selected in the current session.
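The combined in-flight flag and the buyer-safe step label could be derived from the individual discovery flags along these lines. This is a sketch only: `DiscoveryFlags` and `describeDiscovery` are hypothetical names, and picking the earliest running step in the chain as the label is an assumption about how the banner chooses its copy.

```typescript
// Hypothetical flag bundle; one boolean per fetch in the discovery chain.
interface DiscoveryFlags {
  connecting: boolean;
  environments: boolean;
  agents: boolean;
  footprint: boolean;
}

interface DiscoveryStatus {
  inFlight: boolean;
  label: string | null; // null when nothing is running, so no banner renders
}

function describeDiscovery(flags: DiscoveryFlags): DiscoveryStatus {
  // Order follows the fetch chain: OAuth, then environments, then agents, then footprint.
  const steps: Array<[boolean, string]> = [
    [flags.connecting, "Connecting Microsoft"],
    [flags.environments, "Discovering environments"],
    [flags.agents, "Discovering agents"],
    [flags.footprint, "Reading footprint"],
  ];
  const active = steps.find(([running]) => running);
  return { inFlight: active !== undefined, label: active ? active[1] : null };
}
```

A view would render its skeleton banner only while `inFlight` is true, using `label` as the polite live-region copy.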
When to re-score: Presentation depth restoration plus discovery loading signals plus persisted rollup fan-out. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Ui
Self evaluation gate
Evidence trace: the new Phase 1G-S30O evidence record plus the new s30o render adaptive wizard and s30o render report tabs render helpers inside json paste score card plus the discovery in flight props plus the persisted env by id merge inside live environment cards plus the discovery loading banners inside portfolio dashboard view plus estate dashboard view plus environment dashboard view plus the five new self-evaluation checks plus the new dedicated Phase 1G-S30O interaction-level test file plus the updated build-identity endpoint constants plus the package version bump from 0.138.13 to 0.138.14 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30O internal methodology record (restore intelligence depth inside locked hierarchy)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-14
- Reference
- Phase 1G-S30O founder execution signal: the founder live test of S30N surfaced a content depth regression (mediocre five prompt wizard plus four line report placeholder) plus a missing loading state during environment discovery plus a persisted rollup that did not fan out to per environment Estate cards.
- Impact assessment
- Bumps the package version from 0.138.13 to 0.138.14, restores the adaptive question family across both wizard mount points plus the seven tab report across four mount points, ships visible discovery loading states across Portfolio plus Estate plus Environment views, fans the persisted rollup out to every Estate environment card, adds five archive blocking gates that refuse any future revert. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, click into any agent, and confirm: the Questions tab renders the full adaptive question family (not a five prompt list); the Report tab renders all seven canonical tabs (Summary plus Intelligence plus Actions plus Risks plus Evidence plus Stakeholders plus Trace); the Intelligence tab plus Evidence tab also render the seven tab report (anchored to the agent under review); the Portfolio plus Estate plus Environment views show a visible loading banner while Microsoft discovery is running; the Estate environment cards reflect persisted counts plus scores for every environment that has prior reviews.
- Re-score optional
Change id: 1G-S30P
Change date: 2026-05-14
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S30P - Runtime action integrity, generate-report repair, and full site click audit
Reason: S30O shipped intelligence depth restoration but the founder live test surfaced a serious defect: after answering all review questions the Generate readiness report button looked enabled but clicking it did absolutely nothing. The previous gate, generate report triggers real pipeline, source-scanned for the literal generate report button ref click pattern and passed, proving that source scans alone are insufficient. The 744-line canonical pipeline lived inside a permanently false-gated s30 h legacy stack never block, so the DOM ref was always null and the click was a silent no-op. S30P repairs this defect and audits the whole live site for broken buttons, dead links, non-working CTAs, contradictory states, and actions that appear enabled but do nothing. No visible clickable may exist unless it performs a real action, navigates to a real view, opens or closes real UI, starts a real server flow, or is visibly disabled with a clear reason. Source scans are not enough - runtime click tests are mandatory.
What changed: Package version bumped from 0.138.14 to 0.138.15. The build-identity endpoint now reports product version 0.138.15 and phase id 1G-S30P. Backwards-compatible alias exports cascade. json paste score card extracts the legacy 744-line inline on click body into a named const run report generation pipeline declared at component scope as an async arrow returning Promise<void>. The legacy button's on click now reads on click equals run report generation pipeline (preserving the source-scan literal for prior phase tests while making the function callable without a DOM ref). The wizard's on generate report calls run report generation pipeline directly with no DOM ref. A new auto-navigation use effect watches microsoft readiness report.status equal ok and switches active workspace view to readiness report when the pipeline succeeds. A buyer-safe error banner with role alert plus aria-live polite plus a Retry generation button renders above the wizard when microsoft readiness report.status equal error. Every live view (portfolio dashboard view, estate dashboard view, environment dashboard view, agent workspace view, agent proof agent workspace tabs, agent proof connection status chip, agent proof breadcrumb, agent proof workspace top nav) now declares data-action-id attributes on every primary clickable so runtime tests can target actions by purpose rather than implementation. A dedicated happy-dom runtime interaction test file ships with 25 tests dispatching real mouse event clicks. Ten new archive-blocking gates ship: runtime action registry complete, generate report runtime e2e, no dead clickables runtime, connection state action consistency, hierarchy navigation runtime e2e, agent tabs runtime e2e, environment filters runtime e2e, report regeneration runtime e2e, action error surfaces visible, no source scan only gate for actions.
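The difference between clicking through a DOM ref and calling a named pipeline function can be sketched as below. This is an illustration under assumptions, not the component's real code: `Ref` and `Clickable` are minimal stand-ins for the framework types, the pipeline body is a placeholder counter, and the null-tolerant optional call is an assumed shape for the old handler.

```typescript
// Minimal stand-ins for a framework ref and a button element (assumptions).
type Ref<T> = { current: T | null };
interface Clickable {
  click(): void;
}

// Counts pipeline runs so the behaviour is observable in this sketch.
let pipelineRuns = 0;

// Named pipeline declared at component scope; the body here is a placeholder.
const runReportGenerationPipeline = async (): Promise<void> => {
  pipelineRuns += 1;
};

// Old shape: trigger generation by clicking a button through its ref.
// If the button never mounts, the ref stays null and nothing happens,
// with no error to signal the failure.
function generateViaRef(buttonRef: Ref<Clickable>): void {
  buttonRef.current?.click();
}

// New shape: the wizard calls the pipeline directly, so there is no
// dependency on a DOM node being mounted.
async function generateDirect(): Promise<void> {
  await runReportGenerationPipeline();
}
```

A source scan would see a click call in `generateViaRef` and pass; only a runtime test that observes the pipeline actually running distinguishes the two shapes.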
User impact: After answering questions the founder clicks Generate readiness report and the report actually appears. The button never silently fails. If generation fails, the founder sees a buyer-safe rose error banner with the documented failure reason and a Retry generation button that re-runs the same canonical pipeline. Every clickable in every live view either performs a real action, navigates to a real view, opens or closes real UI, starts a real server flow, or is visibly disabled with a clear reason. The previous failure mode where the source scan passes but the live click is dead is now structurally impossible.
When to re-score: Runtime action integrity plus generate-report repair plus action registry only. The 744-line pipeline body was relocated verbatim - no semantic change to the scoring logic. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Ui
Self evaluation gate
Evidence trace: the new Phase 1G-S30P evidence record plus the new run report generation pipeline named function inside json paste score card plus the wizard's on generate report now calling the function directly plus the post-update auto-navigation use effect plus the buyer-safe error banner with Retry generation button plus data-action-id attributes added to every live clickable plus the new dedicated Phase 1G-S30P runtime interaction file with 25 happy-dom tests dispatching real mouse event clicks plus the ten new self-evaluation checks plus the updated build-identity endpoint constants plus the package version bump from 0.138.14 to 0.138.15 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30P internal methodology record (runtime action integrity plus generate-report repair plus full site click audit)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-14
- Reference
- Phase 1G-S30P founder execution signal: the live test of S30O surfaced a serious defect (the Generate readiness report button looked enabled but did nothing when clicked, because the legacy 744-line onClick lived inside a permanently false-gated block, so the ref was always null and the click was a silent no-op). Source scans alone are not enough - runtime click tests are mandatory.
- Impact assessment
- Bumps the package version from 0.138.14 to 0.138.15. Extracts the canonical pipeline into a named function the wizard calls directly. Ships data-action-id attributes on every live clickable (action registry). Adds a happy-dom runtime interaction test file dispatching real MouseEvent clicks. Adds ten new archive blocking gates that refuse any future runtime regression. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, connect Microsoft, choose an environment, choose an agent, answer the adaptive questions, click Generate readiness report and confirm: the button reads Generating report... while the pipeline runs; on success the readiness_report view appears with the canonical score; on failure a rose error banner appears with the documented reason and a Retry generation button; clicking Retry re-runs the pipeline.
- Re-score optional
Change id: 1G-S30Q
Change date: 2026-05-14
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S30Q - Risk-label clarity, report usability hardening, and founder-grade detailed QA
Reason: S30P shipped runtime action integrity with all gates and runtime tests green and the founder confirmed the product is starting to look much better - but the report still has unclear labelling. In the Risk scenarios card, two badges render side by side, for example High and high, or Medium and high. The buyer cannot know whether these mean severity, likelihood, confidence, evidence quality, priority, or something else. This is unacceptable. S30Q is not a redesign phase - it is a clarity, correctness, and detailed testing phase. Label every risk badge explicitly. Add a compact legend. Make severity and confidence visually distinct. Apply the same rule across the product. Share a labelled-chip component. Strengthen the Risk scenarios card. Make sorting understandable. Add a no-ambiguous-chip self-eval gate. Add runtime report clarity tests. Start a serious founder QA checklist.
What changed: Package version bumped from 0.138.15 to 0.138.16. The build-identity endpoint now reports product version 0.138.16 and phase id 1G-S30Q. Backwards-compatible alias exports cascade. A new shared report component, the labelled status chip, ships under the report folder. It exposes label, value, tone family (risk or evidence or status or neutral), test id, optional action id, optional title and optional override content, plus helpers that collapse any casing to High slash Medium slash Low slash Unknown. The intelligence card now uses the labelled chip helper for severity (risk tones - rose slash amber slash emerald), for confidence (evidence tones - emerald slash sky slash slate), for the agent interpretation chip, and for the evidence quality confidence band chip. A compact legend at the top of the Risk scenarios card explains both terms. A deterministic sort helper orders scenarios by severity rank then confidence rank then title via locale compare. A visible Sorted by severity, then confidence line renders above the list. One new archive blocking gate ships - the risk label clarity gate - that scans the labelled chip module and the intelligence card source and refuses to ship if any bare High slash Medium slash Low chip is found. A dedicated runtime clarity unit-test file ships with seventeen tests covering version cascade, the helper primitive, labelled chips on every scenario, no lowercase standalone chip, legend explains both terms, severity and confidence use different tone families, deterministic sort, evidence basis plus missing proof plus why it matters remain visible. A founder QA checklist evidence artefact ships under the content folder covering Portfolio plus Estate plus Environment plus Agent plus Questions plus Report plus Runtime levels.
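The casing helper and the deterministic sort described above might look like the following sketch. The names `normaliseLevel`, `sortScenarios`, and the `Scenario` shape are hypothetical, and ranking High confidence ahead of Low within the same severity is an assumption; the entry states only that confidence rank is the second sort key.

```typescript
type Level = "High" | "Medium" | "Low" | "Unknown";

// Collapses any casing (and missing values) to one of four canonical labels.
function normaliseLevel(raw: string | null | undefined): Level {
  switch ((raw ?? "").trim().toLowerCase()) {
    case "high":
      return "High";
    case "medium":
      return "Medium";
    case "low":
      return "Low";
    default:
      return "Unknown";
  }
}

// Lower rank sorts first. Assumption: High first, Unknown last.
const RANK: Record<Level, number> = { High: 0, Medium: 1, Low: 2, Unknown: 3 };

// Hypothetical scenario shape for illustration.
interface Scenario {
  title: string;
  severity: string;
  confidence?: string;
}

// Sorts by severity rank, then confidence rank, then title via localeCompare.
// Returns a new array so the caller's input is left untouched.
function sortScenarios(scenarios: readonly Scenario[]): Scenario[] {
  return [...scenarios].sort(
    (a, b) =>
      RANK[normaliseLevel(a.severity)] - RANK[normaliseLevel(b.severity)] ||
      RANK[normaliseLevel(a.confidence)] - RANK[normaliseLevel(b.confidence)] ||
      a.title.localeCompare(b.title),
  );
}
```

Because the title acts as a final tie-breaker, two runs over the same scenarios produce the same order regardless of input order, provided titles are unique.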
User impact: The Risk scenarios card now shows Severity colon High next to Confidence colon High - the buyer knows immediately what each chip means. The two chips use different colour families (rose for severity, emerald for confidence) so they cannot be confused even at a glance. A compact legend explains both terms. Scenarios sort by severity then confidence with a visible sort note. No bare unlabelled High slash Medium slash Low chip remains in the report. The founder QA checklist is in place for end-to-end manual verification.
When to re-score: Clarity and labelling only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Ui
Self evaluation gate
Evidence trace: the new Phase 1G-S30Q evidence record plus the new labelled status chip component plus the updated agent readiness intelligence card consuming it plus the compact legend plus the deterministic sort helper plus the visible sort note plus the new risk label clarity self-evaluation check plus the new dedicated Phase 1G-S30Q runtime clarity test file with 17 happy-dom tests plus the founder QA checklist evidence artefact plus the updated build-identity endpoint constants plus the package version bump from 0.138.15 to 0.138.16 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30Q internal methodology record (risk-label clarity plus report usability hardening plus founder-grade detailed QA)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-14
- Reference
- Phase 1G-S30Q founder execution signal: the S30P live test confirmed the product is starting to look much better but the Risk scenarios card rendered two badges side by side - High plus high, or Medium plus high - with no label explaining what each badge meant. Clarity defect that source scans alone never caught.
- Impact assessment
- Bumps the package version from 0.138.15 to 0.138.16. Ships a shared LabelledStatusChip component. Updates the AgentReadinessIntelligenceCard to use the helper for severity plus confidence plus interpretation confidence plus evidence band. Adds a compact legend. Adds a deterministic sort plus a visible sort note. Adds the risk_label_clarity archive-blocking gate. Adds a dedicated runtime clarity test file. Adds the founder QA checklist evidence artefact. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, connect Microsoft, choose an environment, choose an agent, answer the adaptive questions, generate the readiness report, open the Risk scenarios card and confirm every chip is labelled (Severity colon X and Confidence colon X), the legend explains both terms, the sort note reads Sorted by severity, then confidence, and the scenarios are ordered High first.
- Re-score optional
Change id: 1G-S30R
Change date: 2026-05-14
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S30R - Report-wide semantic clarity, interaction audit, and founder-grade report QA
Reason: S30Q fixed the Risk scenarios card clarity but the founder showed that the Scenario model tab still contained unlabeled chips such as High and Medium with no label saying severity or confidence. That is unacceptable. Every report chip must clearly say what it means. S30R is clarity, correctness, and detailed QA only - no scoring changes and no full UI redesign.
What changed: Package version bumped from 0.138.16 to 0.138.17. The build-identity endpoint now reports product version 0.138.17 and phase id 1G-S30R. Backwards-compatible alias exports cascade. json paste score card now imports the shared labelled status chip helper from the canonical module. The Scenario model tab body is rewritten so each scenario card surfaces a labelled Severity chip with risk tones and a labelled Confidence chip with evidence tones when confidence is present. A compact legend at the top of the tab explains both terms. A visible Sorted by severity, then confidence line renders above the list. Scenarios are sorted by severity rank then confidence rank then title via locale compare. The Controls tab body is hardened the same way - Owner plus Effort plus Impact each render as labelled chips. Seven new archive blocking gates ship - report chip meaning explicit plus scenario model label clarity plus report shared chip standard plus report runtime actions work plus report no dead buttons plus report legend present when needed plus report sort rule visible. A dedicated runtime test ships under happy-dom with eighteen tests dispatching real mouse event clicks against the rendered report DOM and proving every required outcome. A founder QA checklist evidence artefact ships covering chip-meaning clarity plus Scenario model clarity plus Risks tab clarity plus Intelligence tab clarity plus Evidence tab clarity plus legends plus sorting rules plus tab behaviour plus regenerate flow plus evidence expansion plus trace expansion plus persistence after reload plus contradictory state checks.
User impact: The Scenario model tab now reads Severity colon High and Confidence colon Medium with two distinct colour families. No bare High slash Medium slash Low chip remains in any report tab. The compact legend at the top of the Scenario model tab explains both terms. A visible Sorted by severity, then confidence line above the list explains the sort rule. The Controls tab uses labelled chips for Owner plus Effort plus Impact. Seven new archive blocking gates refuse any future revert.
When to re-score: Clarity and labelling only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Ui
Self evaluation gate
Evidence trace: the new Phase 1G-S30R evidence record plus the updated json paste score card Scenario model tab body plus the updated Controls tab body plus the new dedicated Phase 1G-S30R runtime clarity test with eighteen happy-dom tests plus the seven new self-evaluation checks plus the founder QA checklist evidence artefact plus the updated build-identity endpoint constants plus the package version bump from 0.138.16 to 0.138.17 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30R internal methodology record (report-wide semantic clarity plus interaction audit plus founder-grade report QA)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-14
- Reference
- Phase 1G-S30R founder execution signal: the live test of S30Q confirmed the Risk scenarios card was fixed but the Scenario model tab still rendered bare unlabelled High slash Medium chips with no label saying what each chip meant.
- Impact assessment
- Bumps the package version from 0.138.16 to 0.138.17. Updates the Scenario model tab plus Controls tab to use the shared labelled chip helper. Adds a compact legend plus deterministic sort plus visible sort rule line to the Scenario model tab. Ships seven new archive blocking gates. Ships a dedicated runtime clarity test. Ships a founder QA checklist evidence artefact. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, generate the readiness report, open the Scenario model tab and confirm every chip is labelled Severity colon X or Confidence colon X with distinct tone families, the legend explains both terms, the sort note reads Sorted by severity, then confidence, the Controls tab shows labelled Owner plus Effort plus Impact chips, and no bare High slash Medium slash Low chip remains in any report tab.
- Re-score optional
Change id: 1G-S30S
Change date: 2026-05-14
Product version: 0.138.18
Methodology engine version: 0.9.1
Phase 1G-S30S - Full report runtime proof, tab-meaning clarity, and decision-grade scenario/risk separation
Reason: S30R fixed the visible founder-reported bare chip problem but its own delivery report admitted one unacceptable proof gap - the full live Scenario model tab inside json paste score card was not runtime-mounted and clicked end-to-end. S30S must close that proof gap properly. The report must also be easier for a real buyer to understand by clearly explaining the difference between Scenario model, Risks, Controls, and Intelligence.
What changed: Package version bumped from 0.138.17 to 0.138.18. The build-identity endpoint now reports product version 0.138.18 and phase id 1G-S30S. Backwards-compatible alias exports cascade. Four new presentational report components ship under the report folder - the agent report guide explaining the four most-confusable tabs in plain business language, the scenario model tab body extracted from the inline json paste score card JSX, the controls tab body extracted likewise, and the new risks tab body reframed per founder spec with affected readiness area plus recommended control or action. json paste score card now mounts these exact components inside its s30o render report tabs tab slot map. The dedicated S30S runtime test imports the same components, builds a deterministic fixture, mounts the agent readiness report tabs orchestration under happy-dom with the real tab body components, and dispatches real mouse event clicks against every documented tab - asserting labelled chips in the rendered DOM, sweeping every chip-like span and failing on any bare High slash Medium slash Low value, exercising the update answers and regenerate action, and asserting the disabled tab explanatory title pattern. Nine new archive blocking gates ship - full report runtime mount, scenario model tab runtime verified, risks tab runtime verified, controls tab runtime verified, report bare chip runtime block, report tab meaning guide present, scenario risk meaning separation, report click audit, disabled report actions explained. The S30R gates report chip meaning explicit plus scenario model label clarity plus report legend present when needed plus report sort rule visible were updated to consult the new presentational component sources so prior phase tests stay green after the extraction. Founder summary now reads 162 of 162 checks green.
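The bare-chip sweep mentioned above can be approximated by a small pure function. This sketch assumes the test has already collected the text content of every chip-like element into an array of strings; `findBareChips` and `BARE_LEVEL` are hypothetical names, and the real runtime test works against the rendered DOM rather than plain strings.

```typescript
// Matches a chip whose entire text is a level with no "Label: " prefix.
const BARE_LEVEL = /^(high|medium|low)$/i;

// Returns every chip text that is a bare level value.
// An empty result means every chip carries an explicit label.
function findBareChips(chipTexts: readonly string[]): string[] {
  return chipTexts.filter((text) => BARE_LEVEL.test(text.trim()));
}
```

A test would fail when the returned array is non-empty and could print the offending texts to show which surface still renders an unlabelled value.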
User impact: After Generate readiness report the buyer sees a compact slate tinted panel labelled How to read this report at the top. Expanding it shows four short paragraphs explaining Scenario model, Risks, Controls, and Intelligence in plain business language. Premium tone. The Scenario model tab is framed as WHAT COULD HAPPEN with trigger or condition plus expected impact pathway plus evidence basis plus next confirmation needed. The Risks tab is framed as WHAT NEEDS ATTENTION with affected readiness area plus recommended control or action plus why it matters. Every chip in every report tab is labelled. No bare High slash Medium slash Low chip remains. The seven tab report structure is preserved. The single active view router is preserved. The legacy stacked page stays disabled.
When to re-score: Component extraction plus runtime proof plus tab framing only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Ui
Self evaluation gate
Evidence trace: the new Phase 1G-S30S evidence record plus the four new presentational components plus the updated json paste score card mount slots plus the dedicated S30S runtime test with twenty happy-dom tests plus the nine new self-evaluation checks plus the four S30R gate updates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.17 to 0.138.18 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30S internal methodology record (full report runtime proof plus tab meaning clarity plus decision grade scenario risk separation)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-14
- Reference
- Phase 1G-S30S founder execution signal: the S30R delivery report admitted one unacceptable proof gap - the full live Scenario model tab inside the production JsonPasteScoreCard was not runtime mounted and clicked end to end. S30S must close that proof gap properly.
- Impact assessment
- Bumps the package version from 0.138.17 to 0.138.18. Extracts the Scenario model, Controls, and Risks tab bodies into their own presentational components. Mounts the new buyer facing tab meaning guide above the report. Adds nine new archive blocking gates. Adds a dedicated runtime test that mounts the real report tab orchestration end to end. Updates the S30R gates to consult the new components. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, generate the readiness report, open the report guide panel, click every tab, and confirm: labelled Severity plus Confidence chips on Scenario model and Risks, labelled Owner plus Effort plus Impact chips on Controls, labelled Confidence chip on the Intelligence Agent interpretation card, no bare High slash Medium slash Low chip in any report card surface, the report guide explains all four tab meanings, and Update answers and regenerate still fires the real pipeline.
- Re-score optional
Change id: 1G-S30T
Change date: 2026-05-14
Product version: 0.138.19
Methodology engine version: 0.9.1
Phase 1G-S30T - Top 5 priorities chip clarity, writable status dropdown, and persisted remediation state
Reason: S30S shipped the runtime proof gap closure and the buyer facing tab meaning guide, but the founder's live test of the Top 5 priorities cards surfaced two further defects. First, every priority chip rendered as a bare unlabelled pill - the buyer saw Fix, High, and confidence Medium with no Label colon Value shape, so it was impossible to tell what each chip meant. Second, the per-priority Status dropdown was dead - selecting Open or In progress or Confirmed did nothing because both Top5Priorities mounts in json paste score card passed a noop status handler. S30T fixes these two defects only. No scoring change. No UI redesign. No new chip system.
What changed: Package version bumped from 0.138.18 to 0.138.19. The build-identity endpoint now reports product version 0.138.19 and phase id 1G-S30T. Backwards-compatible alias exports cascade. priority card inside the agent readiness dashboard component is refactored to use the shared labelled status chip helper for every pill - Action with status tone family, Severity with risk tone family, Confidence with evidence tone family, Owner plus Effort plus Impact with neutral tone family. A single canonical writable handler handle set priority status is declared at the json paste score card scope; the handler updates microsoft remediation states AND persists via the new save remediation states for agent helper. Every Top5Priorities mount in json paste score card now passes handle set priority status instead of the previous noop status handler. The statuses prop now reads from microsoft remediation states first so the dropdown sticks across renders. A use effect rehydrates persisted states on env plus agent change. The new persistence helpers live inside the existing review persistence module under the schema versioned key agentproof.remediation states.v1::env::agent - no new persistence module. Four new archive blocking gates ship - top 5 priorities labelled chips s30t, top 5 priorities status writable s30t, top 5 priorities status persisted s30t, top 5 priorities no noop status handler s30t. A dedicated runtime test ships under happy dom dispatching a real DOM change event against the per priority select element and asserting both the in memory state update AND the persisted local storage value. Founder summary now reads 166 of 166 checks green.
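Per-agent persistence under a schema-versioned key could be sketched as below. Everything here is illustrative: the exact key spelling (`agentproof.remediation_states.v1`), the function names, and the `RemediationStates` shape are assumptions, and an in-memory `MemoryStore` stands in for browser local storage so the sketch runs outside a browser.

```typescript
// Minimal subset of the Web Storage API used by this sketch.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in for localStorage.
class MemoryStore implements KeyValueStore {
  private readonly data = new Map<string, string>();
  getItem(key: string): string | null {
    return this.data.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}

// Maps a priority id to its selected status (hypothetical shape).
type RemediationStates = Record<string, string>;

// Assumed spelling of the schema-versioned prefix.
const KEY_PREFIX = "agentproof.remediation_states.v1";

// Scopes the key by environment and agent so states never leak across agents.
function remediationKey(envId: string, agentId: string): string {
  return `${KEY_PREFIX}::${envId}::${agentId}`;
}

function saveRemediationStates(
  store: KeyValueStore,
  envId: string,
  agentId: string,
  states: RemediationStates,
): void {
  store.setItem(remediationKey(envId, agentId), JSON.stringify(states));
}

// Returns an empty object when nothing is stored or the payload is unusable.
function loadRemediationStates(
  store: KeyValueStore,
  envId: string,
  agentId: string,
): RemediationStates {
  const raw = store.getItem(remediationKey(envId, agentId));
  if (raw === null) return {};
  try {
    const parsed: unknown = JSON.parse(raw);
    const isPlainObject =
      typeof parsed === "object" && parsed !== null && !Array.isArray(parsed);
    return isPlainObject ? (parsed as RemediationStates) : {};
  } catch {
    return {};
  }
}
```

Putting the version in the key means a later schema change can use a new suffix without having to parse or migrate old payloads in place.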
User impact: The Top 5 priorities cards now read like enterprise UI - every chip is labelled Action colon X, Severity colon X, Confidence colon X, Owner colon X, Effort colon X, Impact colon X with distinct tone families. The per priority Status dropdown is alive - selecting Confirmed or Deferred or In progress or Fixed outside AgentProof or Not applicable visibly switches the dropdown. The selection persists across tab clicks AND across hard reloads AND is scoped per agent so changes on Agent A do not leak into Agent B.
When to re-score: Chip clarity plus writable status plus persistence only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Ui
Self evaluation gate
Evidence trace: the new Phase 1G-S30T evidence record plus the refactored priority card chips inside agent readiness dashboard plus the new handle set priority status handler and rehydration use effect inside json paste score card plus the new save remediation states for agent plus load remediation states for agent helpers inside review persistence plus the dedicated S30T runtime test plus the four new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.18 to 0.138.19 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30T internal methodology record (Top 5 priorities chip clarity plus writable status dropdown plus persisted remediation state)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-14
- Reference
- Phase 1G-S30T founder execution signal: the S30S live test of the Top 5 priorities cards showed every chip as a bare unlabelled pill AND the per priority Status dropdown was dead.
- Impact assessment
- Bumps the package version from 0.138.18 to 0.138.19. Refactors PriorityCard chips to use the shared LabelledStatusChip helper. Adds a single canonical writable handler for the per priority Status dropdown. Adds the schema versioned remediation states key inside the existing review persistence module. Adds a rehydration useEffect. Ships four new archive blocking gates. Ships a dedicated runtime test that dispatches a real DOM change event and asserts state plus persistence. Ships a founder QA checklist evidence artefact. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, generate the readiness report, open the Top 5 priorities card on the Summary tab, confirm every chip is labelled Label colon Value with distinct tone families, change the Status dropdown to Confirmed, click the Controls tab and back to Summary and confirm the selection persists, hard reload the page and confirm the selection survives, click Review another agent and confirm Agent B starts fresh, and confirm the four new gates show pass in the self evaluation transcript.
No external source was reviewed for this entry.
- Re-score optional
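The schema-versioned remediation states key and the per-agent save and load helpers described in this entry could be sketched as follows. This is a minimal illustration only: the key format, the schema version number, the `KeyValueStore` interface, and the in-memory store are all assumptions, not the contents of the review persistence module.

```typescript
// Hypothetical sketch: key format, schema version, and type names are assumptions.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const SCHEMA_VERSION = 1;

// Maps a priority id to the buyer-selected status, e.g. "Confirmed".
type RemediationStates = Record<string, string>;

// One key per agent, so a second agent starts with no saved statuses.
function storageKey(agentId: string): string {
  return `remediation-states:v${SCHEMA_VERSION}:${agentId}`;
}

function saveRemediationStatesForAgent(
  store: KeyValueStore,
  agentId: string,
  states: RemediationStates,
): void {
  const payload = { schemaVersion: SCHEMA_VERSION, states };
  store.setItem(storageKey(agentId), JSON.stringify(payload));
}

function loadRemediationStatesForAgent(
  store: KeyValueStore,
  agentId: string,
): RemediationStates {
  const raw = store.getItem(storageKey(agentId));
  if (raw === null) return {};
  try {
    const parsed = JSON.parse(raw) as {
      schemaVersion?: number;
      states?: RemediationStates;
    };
    // Ignore payloads written under a different schema version.
    return parsed.schemaVersion === SCHEMA_VERSION && parsed.states
      ? parsed.states
      : {};
  } catch {
    return {}; // Corrupt payload: start fresh rather than throw.
  }
}

// In-memory stand-in for browser storage so the sketch runs anywhere.
function createMemoryStore(): KeyValueStore {
  const map = new Map<string, string>();
  return {
    getItem: (key) => map.get(key) ?? null,
    setItem: (key, value) => {
      map.set(key, value);
    },
  };
}

const store = createMemoryStore();
saveRemediationStatesForAgent(store, "agent-a", { "priority-1": "Confirmed" });
console.log(loadRemediationStatesForAgent(store, "agent-a")["priority-1"]); // Confirmed
console.log(Object.keys(loadRemediationStatesForAgent(store, "agent-b")).length); // 0
```

Keying the stored payload by agent is one way to get the behaviour the founder checklist asks for, where a reload keeps the selection and reviewing another agent starts fresh.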
Change id: 1G-S30U
Change date: 2026-05-14
Product version: 0.138.20
Methodology engine version: 0.9.1
Phase 1G-S30U - Report-wide semantic chip completion, risk heatmap clarity, and later-priority consistency
Reason: S30T fixed the Top 5 priorities chips and the writable status dropdown, but the report still contained two compact unlabelled semantic surfaces. The Risk heatmap rendered a compact chip combining risk level and confidence and the Later items rendered a parenthetical severity and confidence summary plus a compact owner slash effort slash impact line. The buyer should never have to guess whether a value means severity, confidence, owner, effort, impact, risk level, or action class. S30U completes the report-wide semantic chip clarity standard.
What changed: Package version bumped from 0.138.19 to 0.138.20. The build-identity endpoint now reports product version 0.138.20 and phase id 1G-S30U. Backwards-compatible alias exports cascade. RiskHeatmap inside agent readiness dashboard is refactored to render Risk level (risk tone family) AND Confidence (evidence tone family) as two explicit LabelledStatusChip pills per cell. LaterItemsCollapsed is refactored to render Severity, Confidence, Owner, Effort, and Impact as five LabelledStatusChip pills per row. The previous compact patterns ({risk level} dot confidence {confidence}, ({severity}, {confidence} confidence), and Owner: X dot Effort Y dot Impact Z) are removed at source. Five new archive blocking gates ship - risk heatmap labelled chips s30u, later items labelled chips s30u, report wide compact semantic strings blocked s30u, s30t priority status regression guard s30u, report wide chip helper reuse s30u. A dedicated S30U runtime test under happy dom dispatches real DOM events and asserts the rendered chips on both surfaces, plus regression-tests the S30T writable status dropdown via a real DOM change event. Founder summary now reads 171 of 171 checks green.
User impact: The Risk heatmap now reads 'Risk level colon High' and 'Confidence colon Medium' instead of the previous compact 'High dot confidence Medium'. The Later items section now reads 'Severity colon Medium', 'Confidence colon High', 'Owner colon Project team', 'Effort colon Medium', 'Impact colon Medium' instead of the previous 'Medium, High confidence' parenthetical plus 'Owner: Project team dot Effort Medium dot Impact Medium' line. Every chip-like semantic value across the entire report now follows the same Label colon Value standard with distinct tone families. The Top 5 priorities S30T fix is preserved by the new s30t priority status regression guard s30u gate.
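The Label colon Value standard for a heatmap cell can be illustrated with a small sketch. The type and function names below are hypothetical and are not the LabelledStatusChip helper itself; only the two labels and their tone families come from this entry.

```typescript
// Hypothetical sketch: these names are illustrative, not the shipped helper.
type ToneFamily = "risk" | "evidence" | "neutral";

interface LabelledChip {
  label: string;
  value: string;
  tone: ToneFamily;
}

// Render one chip as "Label: Value" so the reader never has to infer the label.
function chipText(chip: LabelledChip): string {
  return `${chip.label}: ${chip.value}`;
}

// A heatmap cell yields two separate chips instead of one compact string.
function heatmapCellChips(riskLevel: string, confidence: string): LabelledChip[] {
  return [
    { label: "Risk level", value: riskLevel, tone: "risk" },
    { label: "Confidence", value: confidence, tone: "evidence" },
  ];
}

const cell = heatmapCellChips("High", "Medium");
console.log(cell.map(chipText).join(" | ")); // Risk level: High | Confidence: Medium
```

Keeping the label inside the chip data, rather than in surrounding layout, means every surface that renders the chip shows the same wording.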
When to re-score: Chip clarity only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Ui
Self evaluation gate
Evidence trace: the new Phase 1G-S30U evidence record plus the refactored risk heatmap and later items collapsed components inside agent readiness dashboard plus the dedicated S30U runtime test plus the five new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.19 to 0.138.20 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30U internal methodology record (report-wide semantic chip completion plus risk heatmap clarity plus later-priority consistency)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-14
- Reference
- Phase 1G-S30U founder execution signal: the S30T live test confirmed the Top 5 priorities fix but flagged the Risk heatmap compact chip and the Later items compact parenthetical plus owner slash effort slash impact line as remaining offenders.
- Impact assessment
- Bumps the package version from 0.138.19 to 0.138.20. Refactors RiskHeatmap and LaterItemsCollapsed to use the shared LabelledStatusChip helper. Adds five new archive blocking gates. Adds a dedicated runtime test. Ships a founder QA checklist evidence artefact. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, generate the readiness report, find the Risk heatmap section and confirm every cell shows Risk level colon X AND Confidence colon Y chips with distinct tone families, expand the Later items section and confirm every row shows Severity, Confidence, Owner, Effort, and Impact labelled chips, confirm no compact pattern remains, regression-check the Top 5 priorities and writable status dropdown from S30T, and confirm the five new gates show pass in the self evaluation transcript.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S30V
Change date: 2026-05-14
Product version: 0.138.21
Methodology engine version: 0.9.1
Phase 1G-S30V - Exported report semantic parity, copied action-plan clarity, and persisted status alignment
Reason: S30U completed semantic chip clarity in the live report UI, but the founder confirmed that the downloaded Markdown report and the Copy action plan output still shipped compact ambiguous wording and ignored the buyer-writable remediation status from S30T. S30V completes the export parity: a buyer must not see clean labels in the web UI but receive old compact ambiguous wording in the downloaded Markdown or copied action plan, and the buyer's current remediation status must be reflected in copied and exported outputs where the report export path has access to it.
What changed: Package version bumped from 0.138.20 to 0.138.21. The build-identity endpoint now reports product version 0.138.21 and phase id 1G-S30V. Backwards-compatible alias exports cascade. The Markdown renderer at lib slash reporting slash agentproof readiness markdown report is refactored so the Top 5 priority section emits Action, Severity, Confidence, Owner, Effort, Impact, Status, Why it matters, Recommended action, and Evidence basis as explicit Label colon Value bullet lines per priority. The prioritised risks table column header is renamed from Risk to Risk level. The Fix-first action plan splits Owner, Effort, and Impact onto three separate labelled bullet lines below the priority number and title. A new optional priority_status_overrides input, keyed by priority id, carries the buyer's current writable remediation status into the Status line per priority; the renderer falls back to the view model status when no override is supplied. ReportActions inside agent readiness dashboard accepts the same optional prop and threads it to the renderer. In the copied action plan, copyActionPlan emits Title plus Action plus Severity plus Confidence plus Owner plus Effort plus Impact plus Status plus Recommended action labelled lines per priority. JsonPasteScoreCard derives the override map from microsoftRemediationStates and plumbs it into every ReportActions mount. Six new archive blocking gates ship - markdown top 5 semantic parity s30v, markdown risk table clarity s30v, markdown fix first action plan clarity s30v, copied action plan semantic parity s30v, exported report status override s30v, s30t s30u regression guard s30v. A dedicated S30V runtime plus renderer test ships under happy dom that asserts the new Label colon Value output, the Risk level column, the Fix-first split lines, the labelled Copy action plan payload via a real Copy click against a clipboard stub, and the override Status line round trip after a real dropdown change event. Founder summary now reads 177 of 177 checks green.
User impact: Download Markdown report now produces a document that reads like an enterprise report. Top 5 priorities show explicit Action colon X, Severity colon X, Confidence colon X, Owner colon X, Effort colon X, Impact colon X, Status colon X labelled bullets - never a compact heading or compact compound line. The prioritised risks table column reads Risk level so the buyer is never left guessing. The Fix-first action plan splits Owner and Effort and Impact onto three labelled bullets. Click Copy action plan and paste into Notepad to get explicit Title plus Action plus Severity plus Confidence plus Owner plus Effort plus Impact plus Status plus Recommended action lines per priority. After changing a priority status via the live dropdown, the changed status appears in both the copied action plan AND the downloaded Markdown - the export now agrees with the live UI.
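The override-then-fallback rule for the Status line can be sketched as below. The interface fields and function names are assumptions made for illustration; only the behaviour (an optional map keyed by priority id, falling back to the view model status) is taken from this entry.

```typescript
// Hypothetical shapes: field names are assumptions for illustration only.
interface PriorityView {
  id: string;
  title: string;
  severity: string;
  owner: string;
  status: string; // status carried by the view model
}

// Optional map from priority id to the buyer's current status.
type PriorityStatusOverrides = Record<string, string>;

// Prefer the buyer's override for this priority id; otherwise fall back
// to the view model status.
function resolveStatus(p: PriorityView, overrides?: PriorityStatusOverrides): string {
  return overrides?.[p.id] ?? p.status;
}

// Emit one "Label: Value" bullet per field instead of a compact compound line.
function renderPriorityBullets(p: PriorityView, overrides?: PriorityStatusOverrides): string {
  return [
    `- Title: ${p.title}`,
    `- Severity: ${p.severity}`,
    `- Owner: ${p.owner}`,
    `- Status: ${resolveStatus(p, overrides)}`,
  ].join("\n");
}

const priority: PriorityView = {
  id: "p1",
  title: "Document escalation path",
  severity: "High",
  owner: "Project team",
  status: "Open",
};

console.log(resolveStatus(priority)); // Open
console.log(resolveStatus(priority, { p1: "Confirmed" })); // Confirmed
```

Because the same resolver feeds both the copied action plan and the downloaded Markdown in this sketch, the two outputs cannot disagree about a priority's status.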
When to re-score: Output formatting plus optional status override only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Ui
Self evaluation gate
Evidence trace: the new Phase 1G-S30V evidence record plus the refactored render readiness markdown report plus the refactored report actions copy action plan plus the new priority status overrides input plus the live mount plumbing inside json paste score card plus the dedicated S30V runtime plus renderer test plus the six new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.20 to 0.138.21 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30V internal methodology record (exported report semantic parity plus copied action plan clarity plus persisted status alignment)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-14
- Reference
- Phase 1G-S30V founder execution signal: the S30U live test confirmed the live UI semantic clarity but flagged that the downloaded Markdown and the Copy action plan output still shipped compact ambiguous wording and ignored the writable remediation status from S30T.
- Impact assessment
- Bumps the package version from 0.138.20 to 0.138.21. Refactors the Markdown renderer to use explicit Label colon Value bullet lines per priority. Renames the risks table column to Risk level. Splits the Fix-first action plan compact line. Refactors copyActionPlan to use explicit labels and a Status line. Adds the optional priority_status_overrides input on the renderer and on ReportActions and plumbs microsoftRemediationStates from JsonPasteScoreCard. Adds six new archive blocking gates. Adds a dedicated runtime plus renderer test. Ships a founder QA checklist evidence artefact. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, generate the readiness report, change one Top 5 priority status to Confirmed or In progress, click Copy action plan and confirm the Notepad output shows the labelled lines AND the changed status, click Download Markdown report and confirm the Top 5 priorities section, the Risk level column header, and the Fix-first action plan split lines all use the new Label colon Value standard, confirm the changed status appears in the downloaded Markdown, regression check the live UI semantic chips from S30T and S30U, and confirm the six new gates show pass in the self evaluation transcript.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S30W
Change date: 2026-05-14
Product version: 0.138.22
Methodology engine version: 0.9.1
Phase 1G-S30W - Cross-workspace semantic clarity, dashboard chip consistency, and wizard confidence labelling
Reason: S30Q through S30V cleaned semantic clarity inside the generated report, the report export, the copied action plan, and the report-related tabs. The S30V founder live-test confirmed those surfaces were clean but flagged five remaining workspace and dashboard surfaces that still rendered compact unlabelled semantic values - agent estate dashboard portfolio priority chip, agent review wizard classification confidence, red flag list severity pill, recent verifications panel status pill, and intelligence change notification reassessment rows. S30W must extend the same explicit semantic clarity standard beyond the report into the surrounding workspace surfaces so the buyer never sees clean labels in the report and then sees compact or unlabelled semantic values elsewhere in the product.
What changed: Package version bumped from 0.138.21 to 0.138.22. The build-identity endpoint now reports product version 0.138.22 and phase id 1G-S30W. Backwards-compatible alias exports cascade. agent estate dashboard portfolio priority chip is replaced with two explicit labelled status chip pills - Severity in the risk tone family and Affected agents in the neutral tone family - so the buyer never has to decode which value means what. agent review wizard review summary classification and confidence are split into two labelled lines - the compact parenthetical is removed. red flag list severity Pill body now reads Severity colon X inside the existing Pill container while preserving the existing severity tone. recent verifications panel status Pill body now reads Status colon X inside the existing Pill container while preserving the existing tone. intelligence change notification reassessment rows now render three labelled blocks per row - agent name, Severity colon X, and Recommended action colon X. The shared S30Q labelled status chip helper is reused directly so no second chip system is introduced. Seven new archive blocking gates ship - estate dashboard priority semantics s30w, agent review wizard confidence label s30w, red flag severity label s30w, recent verification status label s30w, intelligence change notification semantics s30w, cross workspace compact semantic strings blocked s30w, s30q to s30v regression guard s30w. A dedicated S30W runtime test ships under happy dom that renders the refactored components, asserts the labelled DOM, and source-scans every refactored file for the documented sentinels. Founder summary now reads 184 of 184 checks green.
User impact: The buyer now sees consistent labelled chips and labelled prose across the entire workspace. Portfolio priority cards on the estate dashboard read Severity colon X and Affected agents colon N. The agent review wizard summary reads Classification colon X on one line and Confidence colon Y on the next line. Red flag pills read Severity colon X. Recent verification pills read Status colon X. Intelligence change notification reassessment rows read agent name then Severity colon X then Recommended action colon X. No compact or unlabelled semantic values remain on any audited workspace surface.
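A source scan of the kind the S30W runtime test performs might look like the sketch below. The two regular expressions are assumptions modelled on the compact strings removed in S30U; the sentinels the real gates check are not listed in this entry.

```typescript
// Hypothetical patterns: the real gates' sentinels are not documented here.
const COMPACT_PATTERNS: { name: string; regex: RegExp }[] = [
  // Matches text such as "High · confidence Medium".
  {
    name: "level-dot-confidence",
    regex: /\b(?:Low|Medium|High)\s*·\s*confidence\s+(?:Low|Medium|High)\b/,
  },
  // Matches text such as "(Medium, High confidence)".
  {
    name: "parenthetical-confidence",
    regex: /\((?:Low|Medium|High),\s*(?:Low|Medium|High)\s+confidence\)/,
  },
];

interface Offence {
  line: number; // 1-based line number within the scanned source
  pattern: string;
  text: string;
}

// Report every line of the source that still contains a compact semantic string.
function scanForCompactStrings(source: string): Offence[] {
  const offences: Offence[] = [];
  source.split("\n").forEach((text, index) => {
    for (const { name, regex } of COMPACT_PATTERNS) {
      if (regex.test(text)) {
        offences.push({ line: index + 1, pattern: name, text: text.trim() });
      }
    }
  });
  return offences;
}

const sample = ["Severity: High", "High · confidence Medium"].join("\n");
console.log(scanForCompactStrings(sample).length); // 1
```

A gate built this way passes only when the scan returns an empty list for every audited file.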
When to re-score: Output formatting only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Ui
Self evaluation gate
Evidence trace: the new Phase 1G-S30W evidence record plus the refactored agent estate dashboard portfolio priority chip plus the refactored agent review wizard summary plus the refactored red flag list severity Pill plus the refactored recent verifications panel status Pill plus the refactored intelligence change notification rows plus the dedicated S30W runtime test plus the seven new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.21 to 0.138.22 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30W internal methodology record (cross-workspace semantic clarity plus dashboard chip consistency plus wizard confidence labelling)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-14
- Reference
- Phase 1G-S30W founder execution signal: the S30V live test confirmed report and export clarity but flagged five remaining workspace surfaces that still rendered compact unlabelled semantic values.
- Impact assessment
- Bumps the package version from 0.138.21 to 0.138.22. Refactors AgentEstateDashboard portfolio priority chip, AgentReviewWizard review summary, RedFlagList severity Pill, recent verifications panel status Pill, and IntelligenceChangeNotification reassessment rows to use explicit Label colon Value semantics. Reuses the shared S30Q LabelledStatusChip helper directly. Adds seven new archive blocking gates. Adds a dedicated runtime test. Ships a founder QA checklist evidence artefact. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, open the portfolio or estate dashboard and confirm the priority cards show Severity and Affected agents as labelled chips, open the agent review wizard and confirm Classification and Confidence are separate labelled lines, find a red flag list and confirm severity reads Severity colon X, open the recent verifications panel if local Verify-now journal entries exist and confirm status reads Status colon X, find an intelligence change notification if present and confirm reassessment rows show Severity and Recommended action as labelled blocks, regression check the readiness report and exported Markdown and copied action plan from S30Q through S30V, and confirm the seven new gates show pass in the self evaluation transcript.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S30X
Change date: 2026-05-14
Product version: 0.138.23
Methodology engine version: 0.9.1
Phase 1G-S30X - Results, timeline, comparison, and estate-summary semantic clarity hardening
Reason: S30Q through S30W cleaned the generated report, the report export, the copied action plan, and several workspace and dashboard surfaces. However, older results, timeline, comparison, methodology, and estate-summary surfaces still rendered bare pills or compact dot-separated semantic strings. S30X must extend the same explicit Label colon Value standard to those remaining buyer-visible surfaces so the buyer never sees clean labels in the generated report and then encounters bare or compact wording elsewhere.
What changed: Package version bumped from 0.138.22 to 0.138.23. The build-identity endpoint now reports product version 0.138.23 and phase id 1G-S30X. Backwards-compatible alias exports cascade. Thirteen older results, timeline, comparison, methodology, provenance, estate, and executive surfaces are refactored to use explicit Label colon Value text inside the existing Pill or prose containers. dashboard scorecard tile and scorecard view readiness pills now read Readiness colon X. scorecard view prohibited-stop eyebrow now reads Decision colon Stop dot Action colon Review required. compare view score delta pill reads Score change colon X, before and after readiness pills read Before readiness colon X and After readiness colon X, Cap and Red flags and Indicators are rendered on three labelled lines, delta list meta strings carry Type slash Indicator slash Severity labels, and the still-blocked summary chip reads Status colon Still blocked. timeline view score delta pill reads Score change colon X, the still-blocked pill reads Status colon Still blocked, summary card and row rating pills read Rating colon X, count chips read Red flags colon N and Indicators colon N. timeline group list pill reads Latest readiness colon X. indicator list, targeted rescore section, methodology lineage section, methodology provenance trail section, and reproducibility receipt section pills carry Indicator colon, Recommendation colon, Action colon, Provenance colon, and Verification colon labels respectively, plus the provenance trail rows render Source type, Source, Reviewed by, and Reviewed at on four labelled lines. agent estate dashboard category queue renders Owner and Priority on two separate labelled lines. enterprise estate overview methodology status and saved snapshot are two separate labelled pill elements. premium executive command centre reassessment rows render Agent, Environment, Severity, and Reason on four labelled p blocks. 
Eight new archive blocking gates ship - saved scorecard readiness label s30x, scorecard result readiness and review labels s30x, comparison semantic labels s30x, timeline semantic labels s30x, results methodology provenance pill labels s30x, estate summary compact semantics blocked s30x, executive dashboard reassessment labels s30x, s30q to s30w regression guard s30x. The shared S30Q labelled status chip helper is reused; no second chip system is introduced. Founder summary now reads 192 of 192 checks green.
User impact: The buyer now sees consistent labelled chips and labelled prose across the entire AgentProof workspace - the generated report, the export, the copied action plan, the saved scorecards, the scorecard result page, the comparison page, the timeline page, the indicator list, the targeted rescore section, the methodology lineage section, the methodology provenance trail section, the reproducibility receipt section, the estate dashboard category queue, the enterprise estate overview methodology status, and the executive command centre reassessment queue. No bare rating, status, action, severity, or provenance pill remains. No compact dot-separated semantic chain remains in any audited surface.
When to re-score: Output formatting only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
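The claim that the deterministic scorecard Markdown hash is unchanged can be checked mechanically. The sketch below assumes a Node.js runtime and uses its built-in `node:crypto` module; the function names are hypothetical, and the scorecard Markdown itself is not reproduced here.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of a UTF-8 string.
function sha256Hex(markdown: string): string {
  return createHash("sha256").update(markdown, "utf8").digest("hex");
}

// True when the rendered Markdown still hashes to the recorded value.
function matchesRecordedHash(markdown: string, recordedHex: string): boolean {
  return sha256Hex(markdown) === recordedHex.toLowerCase();
}

// Well-known test vector for the string "abc".
console.log(sha256Hex("abc"));
// ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```

In use, `matchesRecordedHash` would be called with the freshly rendered scorecard Markdown and the hash recorded in this entry; any scoring or formatting change that alters the scorecard bytes makes it return false.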
Ui
Self evaluation gate
Evidence trace: the new Phase 1G-S30X evidence record plus the 13 refactored component sources plus the dedicated S30X runtime plus source-scan test plus the eight new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.22 to 0.138.23 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30X internal methodology record (results plus timeline plus comparison plus methodology plus estate plus executive semantic clarity hardening)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-14
- Reference
- Phase 1G-S30X founder execution signal: the S30W live test confirmed report and workspace clarity but flagged 27 remaining offender sites across the older results, timeline, comparison, methodology, provenance, saved scorecard, estate summary, and executive dashboard surfaces.
- Impact assessment
- Bumps the package version from 0.138.22 to 0.138.23. Refactors 13 component files to use explicit Label colon Value labels. Reuses the shared S30Q LabelledStatusChip helper directly. Adds eight new archive blocking gates. Adds a dedicated runtime plus source-scan test. Ships a founder QA checklist evidence artefact. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, confirm the saved scorecard tile readiness pill reads Readiness colon X, open a scorecard result and confirm readiness pill plus indicator plus targeted rescore plus methodology lineage plus methodology provenance plus reproducibility receipt pills all carry their documented labels, compare two scorecards and confirm score change plus before slash after readiness plus Cap split plus delta list meta plus still-blocked chip are all labelled, open a timeline and confirm score change plus still blocked plus rating plus count chips are all labelled, open the estate dashboard category queue and the enterprise estate overview and the executive command centre reassessment queue and confirm Owner plus Priority plus methodology status plus snapshot state plus Agent plus Environment plus Severity plus Reason are all explicitly labelled, regression check the readiness report from S30Q through S30W, and confirm the eight new gates show pass in the self evaluation transcript.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S30Y
Change date: 2026-05-14
Product version: 0.138.24
Methodology engine version: 0.9.1
Phase 1G-S30Y - Workspace header, metric-summary, score-trace, and memo slash export semantic clarity completion
Reason: S30Q through S30X cleaned the generated report, the report export, the copied action plan, the comparison view, the timeline view, the methodology and provenance and reproducibility surfaces, and several workspace dashboard chips. The S30X founder live-test then flagged the still-compact older surfaces: the agent workspace header chain, the connection state chip label, the four workspace metric summary rows (portfolio, estate, environment, evidence explorer), the score contributor trace card, the legacy slash dashboard row, the enterprise estate overview tone label, the comparison client surfaces, the memo view version stamps, and the exported support text (comparison memo Markdown, reproducibility receipt, methodology changelog Markdown). S30Y must extend the same explicit Label colon Value standard to these remaining 14 buyer-visible surfaces so the buyer never sees clean labels in the generated report and then encounters bare or compact wording elsewhere.
What changed: Package version bumped from 0.138.23 to 0.138.24. The build identity endpoint now reports product version 0.138.24 and phase id 1G-S30Y. Backwards-compatible alias exports cascade. Fourteen older workspace, score trace, comparison, memo, and exported support text surfaces are refactored to use explicit Label colon Value text. The agent workspace tabs header now renders Classification colon X, Score colon X, Readiness colon X, Last reviewed colon X, and Trend colon X on five labelled spans. The connection state chip label returns the state alone (Connected, Environment selected, Agent selected) and the new next-step helper exposes the action with its own label. The portfolio dashboard estate card, the estate dashboard environment card, and the environment dashboard header now render Reviewed agents, Environments or Unreviewed agents, Average score, Highest risk, Reassessment items (estate only), Ready, Needs review, Not ready (environment only), and Last reviewed on separate labelled lines. The evidence explorer group counts render Proven evidence, Inferred evidence, and Unknown evidence on three labelled spans. The score contributor trace card header renders Trace type, Agent, Score, and Readiness on four labelled spans, and every contributor row carries a Contribution effect labelled line. The legacy slash dashboard row renders Environment and Classification on two labelled spans, and the methodology drift badge splits into Methodology status and Recommended action on two labelled chips. The enterprise estate overview tone label constant is renamed to Older methodology alone. The comparison client saved or score row, before or after summary, comparison print header and footer, and memo print header and footer all render labelled values. The memo view version stamps are now an Array of label and value pairs rendered as labelled spans separated by middle dots. 
The comparison memo build label returns Saved at and Agent on labelled values, and the Markdown stamp parts emit Engine version, Weights version, Question bank version, Red flag rules version, AI Act aware indicator rules version, and Context packs version with explicit labels. The reproducibility receipt emits one labelled line per version stamp. The methodology changelog export entry header now reads Product version, Engine version, and Context packs version with labels. Nine new archive blocking gates ship - agent workspace header semantics s30y, connection state label semantics s30y, portfolio estate environment metric semantics s30y, evidence explorer semantics s30y, score contributor trace semantics s30y, legacy dashboard route semantics s30y, compare client memo semantics s30y, exported support text semantics s30y, s30q to s30x regression guard s30y. The shared S30Q labelled status chip helper is reused; no second chip system is introduced. Founder summary now reads 201 of 201 checks green.
User impact: The buyer now sees consistent labelled chips and labelled prose across the entire AgentProof product surface - the report, the export, the copied action plan, the saved scorecards, the scorecard result, the comparison view, the timeline view, the indicator list, the targeted rescore section, the methodology lineage, the methodology provenance trail, the reproducibility receipt, the estate dashboard category queue, the enterprise estate overview methodology status, the executive command centre reassessment queue, the agent workspace header, the connection state chip, the portfolio estate cards, the estate environment cards, the environment header summary, the evidence explorer counts, the score contributor trace, the legacy slash dashboard row, the comparison client surfaces, the memo view, and the exported support text. No bare rating, status, action, severity, or provenance pill remains. No compact dot-separated semantic chain remains in any audited surface.
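The labelled version stamps described in this entry, an array of label and value pairs joined by middle dots in the memo view and emitted one per line in the reproducibility receipt, can be sketched as follows. The type and function names are hypothetical; the sample values are the engine and context pack versions stated at the top of this page.

```typescript
// Hypothetical names: this is an illustration, not the shipped memo view code.
interface VersionStamp {
  label: string;
  value: string;
}

function stampText(stamp: VersionStamp): string {
  return `${stamp.label}: ${stamp.value}`;
}

// Memo view style: labelled stamps on one line, separated by middle dots.
function renderStampsInline(stamps: VersionStamp[]): string {
  return stamps.map(stampText).join(" · ");
}

// Receipt style: one labelled line per version stamp.
function renderStampsAsLines(stamps: VersionStamp[]): string {
  return stamps.map(stampText).join("\n");
}

const stamps: VersionStamp[] = [
  { label: "Engine version", value: "0.9.1" },
  { label: "Context packs version", value: "1.2.2" },
];

console.log(renderStampsInline(stamps));
// Engine version: 0.9.1 · Context packs version: 1.2.2
```

Holding the stamps as data and choosing the separator at render time lets the memo view and the receipt share one source of labels.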
When to re-score: Output formatting only. No scoring logic change. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
Ui
Self evaluation gate
Evidence trace: the new Phase 1G-S30Y evidence record plus the 14 refactored component and library sources plus the dedicated S30Y runtime plus source-scan test plus the nine new self evaluation gates plus the founder QA checklist evidence artefact plus the updated build identity endpoint constants plus the package version bump from 0.138.23 to 0.138.24 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30Y internal methodology record (workspace header plus metric summary plus score trace plus memo plus exported support text semantic clarity completion)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-14
- Reference
- Phase 1G-S30Y founder execution signal: the S30X live test confirmed report and results and timeline and comparison and methodology and estate summary clarity but flagged 27 remaining offender sites across the older agent workspace header, connection state chip, portfolio and estate and environment and evidence explorer metric summaries, score contributor trace, legacy slash dashboard row, enterprise estate overview tone label, comparison client surfaces, memo view version stamps, and exported support text (comparison memo Markdown, reproducibility receipt, methodology changelog Markdown).
- Impact assessment
- Bumps the package version from 0.138.23 to 0.138.24. Refactors 14 component and library files to use explicit Label colon Value labels. Reuses the shared S30Q labelled status chip helper directly. Adds nine new archive blocking gates. Adds a dedicated runtime plus source-scan test. Ships a founder QA checklist evidence artefact. Tokens stay server side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, confirm the agent workspace header reads five labelled spans, confirm the connection state chip reads the state alone with the next-step action in the CTA, confirm the portfolio estate cards and estate environment cards and environment header summary and evidence explorer group counts all render labelled lines, confirm the score contributor trace card header reads four labelled spans and the contributor effect line reads Contribution effect, confirm the legacy slash dashboard row reads Environment and Classification on two labelled spans and the methodology drift badge splits into two labelled chips, confirm the comparison client saved or score row and before or after summary and print surfaces and memo print surfaces all render labelled values, confirm the memo view version stamps render as labelled spans, confirm the exported comparison memo Markdown and reproducibility receipt and methodology changelog Markdown all carry labelled version stamps, regression check the readiness report from S30Q through S30X, and confirm the nine new gates show pass in the self evaluation transcript.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S30E
Change date: 2026-05-13
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S30E - anchor target integrity and dead-button fix
Reason: The founder rejected S30D in the browser. Three buyer-visible header CTAs pointed at hash anchors that did not exist as DOM ids: the Choose-agent href (#agentproof-agent-picker), the Start-review href (#agentproof-agent-review-workspace), and the View-report href (#agentproof-readiness-report-tabs). Each click looked dead because nothing scrolled. AgentReadinessReportTabs carried only a data-test-id, not an id. JsonPasteScoreCard did not mount id=agentproof-agent-picker or id=agentproof-agent-review-workspace anywhere. The S30D gate did not catch this class of regression.
What changed: The package version was bumped from 0.138.3 to 0.138.9. The build-identity endpoint now reports product version 0.138.9 and phase id 1G-S30J. Backwards-compatible alias exports for S30D / S30C / S30B / S30A resolve to the S30E values. Three missing DOM ids were mounted on the live source: id=agentproof-agent-picker on the Choose-agent wrapper in JsonPasteScoreCard, id=agentproof-agent-review-workspace on the agent-review workspace anchor in JsonPasteScoreCard, and id=agentproof-readiness-report-tabs on the report tabs root section in AgentReadinessReportTabs. A new self-evaluation check, anchor_target_integrity, source-scans the buyer-visible source set for every href=#agentproof-... reference and every <NAME>_HREF constant and proves that a matching id=agentproof-... attribute exists somewhere in the buyer-visible set. The gate refuses the archive when any hash href has no matching id, with an explicit failure when any of the three founder-required anchors is missing. The new dedicated Phase 1G-S30J test file carries 18 requirements covering source presence, rendered DOM markup, and gate pass/fail behaviour. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. Tokens stay server-side. No scoring engine change. No backend write. No external persistence. NullProvider remains default.
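The source scan described in this entry can be sketched roughly as follows. This is a minimal illustration, not the shipped gate: the SourceFile and AnchorReport types, the function names, and both regular expressions are assumptions made for the example. It collects every quoted #agentproof-... literal (which covers both href attributes and _HREF constants), collects every mounted id=agentproof-... attribute, and reports any reference that has no matching id.

```typescript
// Hypothetical sketch: the types, function names, and regexes are illustrative
// assumptions, not AgentProof's actual anchor_target_integrity implementation.
interface SourceFile {
  path: string;
  text: string;
}

interface AnchorReport {
  pass: boolean;
  deadAnchors: string[]; // anchor names referenced by an href but never mounted as an id
}

// Collect the distinct capture-group-1 values of `pattern` across all files.
function collectMatches(files: SourceFile[], pattern: RegExp): string[] {
  const found: string[] = [];
  for (const file of files) {
    const re = new RegExp(pattern.source, "g"); // fresh regex so lastIndex starts at 0
    let match = re.exec(file.text);
    while (match !== null) {
      const name = match[1];
      if (name !== undefined && found.indexOf(name) === -1) {
        found.push(name);
      }
      match = re.exec(file.text);
    }
  }
  return found;
}

function checkAnchorTargets(files: SourceFile[]): AnchorReport {
  // Quoted "#agentproof-..." literals cover both href attributes and *_HREF constants.
  const hrefs = collectMatches(files, /["']#(agentproof-[a-z0-9-]+)["']/);
  // Require whitespace before `id=` so that `data-test-id=` does not count as an id.
  const ids = collectMatches(files, /\sid=["'](agentproof-[a-z0-9-]+)["']/);
  const deadAnchors = hrefs.filter((name) => ids.indexOf(name) === -1).sort();
  return { pass: deadAnchors.length === 0, deadAnchors: deadAnchors };
}
```

The whitespace requirement before id= matters for the failure this entry describes: a component that carries only a data-test-id must still be reported as a dead anchor target.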
User impact: Every buyer-visible navigation CTA now visibly moves the buyer to the intended workspace. The Choose-agent button scrolls to the agent picker. The Start-review button scrolls to the live agent review workspace. The View-report button scrolls to the tabbed report. No click is silent. The self-evaluation gate now refuses any future revert that re-introduces a dead hash anchor.
When to re-score: UI navigation hardening only. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Components, Lib, Tests, Content, Self evaluation gate
Evidence trace: the new Phase 1G-S30J evidence record plus the three new id attributes on JsonPasteScoreCard and AgentReadinessReportTabs plus the new anchor_target_integrity check in the self-evaluation gate plus the new self-eval print-list entry in the self-eval runner plus the new dedicated Phase 1G-S30J test file plus the updated build-identity endpoint constants plus the package version bump from 0.138.3 to 0.138.9 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30J internal methodology record (anchor target integrity and dead-button fix)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-13
- Reference
- Phase 1G-S30J founder execution signal: the founder rejected S30D in the browser. Several buyer-visible header CTAs pointed at hash anchors that did not exist as DOM ids; every click looked dead. The S30D gate did not catch this class of regression.
- Impact assessment
- Bumps the package version from 0.138.3 to 0.138.9, mounts three missing DOM anchor ids on the live JsonPasteScoreCard and AgentReadinessReportTabs components, adds a new self-evaluation check (anchor_target_integrity) that source-scans every buyer-visible href=#agentproof-... reference against the live mounted id=agentproof-... attributes, and refuses any future revert that re-introduces a dead hash anchor. Tokens stay server-side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must run the local app at http://localhost:3000/score/paste in a fresh browser tab and click each of the Choose-agent, Start-review, and View-report header CTAs, confirming that each click visibly scrolls the page to the correct workspace anchor.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S30F
Change date: 2026-05-13
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S30F - premium flow compression, visual wow, and low-scroll agent workspace
Reason: The founder confirmed the dashboard is moving in the right direction but rejected the environment / agent estate / review workspace page as too long, too boring, too scrolly, too stacked-admin-panels, and not visually premium enough. The agent estate area read like a basic table dumped on a page rather than a real command room.
What changed: The package version was bumped from 0.138.4 to 0.138.9. The build-identity endpoint now reports product version 0.138.9 and phase id 1G-S30J. Backwards-compatible alias exports for S30E / S30D / S30C / S30B / S30A resolve to the S30F values. A new environment command room component ships with a documented metric strip (agent count, reviewed, unreviewed, average score, ready / needs review / not ready counts, re-review pressure), a compact score distribution visual, a premium compact agent list with status/band/risk chips, a selected-agent snapshot panel (latest / previous / best / worst / average / trend + review CTA), and a sticky review action bar so the buyer never needs to scroll to the bottom to generate. The pure view model for the room derives from the existing discovered-agents + report history data; no scoring engine change, no canonicalisation change, no archive builder change. The room renders AFTER the estate overview and BEFORE the agent review section so the documented estate order truth invariant is preserved. Ten new archive-blocking self-evaluation checks (F1 environment command room present, F2 low scroll layout, F3 text budget default, F4 agent list premium, F5 selected agent history visible, F6 estate order preserved, F7 question grouped flow, F8 anchor integrity preserved, F9 no dashboard duplication, F10 visual wow acceptance) refuse any future revert that drops the premium command-room surface, duplicates dashboards, or regresses the S30E anchor-target gate. The S30E anchor target integrity gate is preserved. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. Tokens stay server-side.
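The selected-agent snapshot (latest / previous / best / worst / average / trend) is described as a pure derivation from existing report history. A minimal sketch of such a derivation is shown here; the AgentSnapshot shape, the function name, the trend labels, and the one-decimal rounding are all assumptions for illustration, not the shipped view model.

```typescript
// Hypothetical sketch: field names, trend labels, and rounding are assumptions.
type Trend = "improving" | "declining" | "steady" | "first review";

interface AgentSnapshot {
  latest: number;
  previous: number | null;
  best: number;
  worst: number;
  average: number; // rounded to one decimal place
  trend: Trend;
}

// `scores` holds one agent's report-history scores, ordered oldest to newest.
// Returns null for an agent that has not been reviewed yet.
function buildAgentSnapshot(scores: number[]): AgentSnapshot | null {
  if (scores.length === 0) {
    return null;
  }
  const latest = scores[scores.length - 1];
  const previous = scores.length > 1 ? scores[scores.length - 2] : null;
  let best = latest;
  let worst = latest;
  let sum = 0;
  for (const score of scores) {
    best = Math.max(best, score);
    worst = Math.min(worst, score);
    sum += score;
  }
  let trend: Trend = "first review";
  if (previous !== null) {
    trend = latest > previous ? "improving" : latest < previous ? "declining" : "steady";
  }
  return {
    latest: latest,
    previous: previous,
    best: best,
    worst: worst,
    average: Math.round((sum / scores.length) * 10) / 10,
    trend: trend,
  };
}
```

Because the function only reads its input array, it stays consistent with the entry's statement that the room needs no scoring engine change.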
User impact: The buyer can open the environment workspace and see in one coherent surface: what environment is being analysed, how healthy it is, what to do next, how much has been reviewed, and where to click. The selected agent snapshot shows score history (latest / previous / best / worst / average + trend) without scrolling far down. Agent rows carry visible status / band / risk chips so the list scans like a command room, not a spreadsheet. The sticky review action bar surfaces answered count / score-moving answers / evidence completeness / Generate report at the top of the review surface. Default-visible text is compressed: walkthrough, evidence ledger, intelligence essay, full methodology copy are all behind disclosures.
When to re-score: Presentation only. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Components, Lib, Tests, Content, Self evaluation gate
Evidence trace: the new Phase 1G-S30J evidence record plus the new environment command room component plus the new environment command room view model module plus the wired insertion point in the live workspace component plus the ten new self-evaluation checks F1-F10 plus the new dedicated Phase 1G-S30J test file plus the updated build-identity endpoint constants plus the package version bump from 0.138.4 to 0.138.9 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30J internal methodology record (premium flow compression, visual wow, and low-scroll agent workspace)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-13
- Reference
- Phase 1G-S30J founder execution signal: the founder rejected the environment / agent estate / review workspace page as too long, too boring, too scrolly, too stacked-admin-panels, with insufficient visual hierarchy and not feeling like a real product.
- Impact assessment
- Bumps the package version from 0.138.4 to 0.138.9, introduces the EnvironmentCommandRoom component plus its pure view model, wires it into the live workspace after the estate overview and before the agent review section, adds a selected-agent snapshot panel that surfaces score history without forcing the buyer to scroll, adds a sticky review action bar so the buyer never scrolls to the bottom to generate a report, and adds ten new self-evaluation checks that refuse any future revert. Tokens stay server-side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must run the local app at http://localhost:3000/score/paste in a fresh browser tab, connect Microsoft, select an environment, and confirm the environment command room shows a coherent metric strip, agent list, selected-agent snapshot, and sticky review action bar with less vertical scroll than the prior S30E build.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S30G
Change date: 2026-05-13
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S30G - full UX reset: premium app workspace with single active view
Reason: The founder rejected S30F. The page is still a long stacked one-page layout: command-centre header, What you get, Connect platform, Discovery status, Choose environment, Choose agent, Estate health, Enterprise Agent Estate, Reassessment queue, Agent selector, What AgentProof found, Unknowns, Confirm the critical points, AgentProof intelligence paragraph, Capability chips, Evidence Ledger, Potential risks, Agent review wizard, Methodology status all stacked vertically. The founder requires a single-active-view workspace shell with explicit navigation zones, setup blocks hidden when connected, methodology moved behind a drawer, intelligence presented as cards not paragraphs, and a score-drivers panel that explains unchanged-score outcomes.
What changed: The package version was bumped from 0.138.5 to 0.138.9. The build-identity endpoint now reports product version 0.138.9 and phase id 1G-S30J. A new active workspace view lib module exports the documented WorkspaceView union type and pure derivation helpers (derive default workspace view, hide setup when connected). Five new workspace components ship: agent proof workspace top nav (brand + workspace switcher buttons + connection chip), agent proof connection status chip (compact Microsoft connected chip with details drawer), agent proof score drivers panel (top score lifters / top limiters + answer fingerprint + score input hash + honest unchanged-score reason), agent proof assessment intelligence cards (premium card-based intelligence replacing the long paragraph), and agent proof agent workspace tabs (Overview / Questions / Intelligence / Evidence / History / Report tabs container). The legacy long stacked layout in JsonPasteScoreCard is gated behind an activeWorkspaceView state machine so only one major workspace view renders at a time. Setup blocks (Connect / How discovery works / What you get / Connection guide / large Disconnect button) are hidden by default when Microsoft is connected; the user opens them via the connection status chip details drawer. Methodology governance moves out of the default review flow and is opened only via the top nav. Ten new archive-blocking self-evaluation gates (A no single page stack, B connected state hides setup, C agent filters work, D wizard not question dump, E evidence collapsed by default, F methodology not main flow, G intelligence cards not paragraph dump, H history visible per agent, I score driver explanation visible, J premium first screen) refuse any future revert that re-introduces the long stacked layout, re-exposes setup blocks when connected, or regresses any prior accepted phase. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. Tokens stay server-side. No scoring engine change. No backend write. No external persistence. NullProvider remains default.
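The single-active-view idea in this entry can be illustrated with a small sketch. The entry names a workspace view union type and two pure helpers but does not list the union members or the helper signatures, so the member names, the WorkspaceState fields, and the camel-cased function names here are assumptions for illustration only.

```typescript
// Hypothetical sketch: the view names and state fields are assumptions; the
// source only says that a workspace view union and two pure helpers exist.
type WorkspaceView = "setup" | "estate" | "environment" | "agent";

interface WorkspaceState {
  connected: boolean;
  selectedEnvironmentId: string | null;
  selectedAgentId: string | null;
}

// Pick the single view to show first, from the most specific selection outwards.
function deriveDefaultWorkspaceView(state: WorkspaceState): WorkspaceView {
  if (!state.connected) return "setup";
  if (state.selectedAgentId !== null) return "agent";
  if (state.selectedEnvironmentId !== null) return "environment";
  return "estate";
}

// Setup blocks stay hidden once connected unless the buyer opened the details drawer.
function hideSetupWhenConnected(connected: boolean, detailsDrawerOpen: boolean): boolean {
  return connected && !detailsDrawerOpen;
}

// Router stand-in: exactly one branch runs, so exactly one view is returned.
function renderActiveView(view: WorkspaceView): string {
  switch (view) {
    case "setup":
      return "setup view";
    case "estate":
      return "estate view";
    case "environment":
      return "environment view";
    case "agent":
      return "agent view";
  }
}
```

Modelling the active view as one union value, instead of several independent visibility flags, makes a stacked render unrepresentable in state: there is no value that means "estate and agent at once".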
User impact: The first connected screen feels like a premium SaaS estate command centre: top navigation with workspace zones (Estate / Environments / Agents / Reviews / Intelligence / Settings), a compact Microsoft connected chip in place of the long Connect block, the executive estate overview hero, and environment cards with metrics and progress visuals. The buyer opens an environment to a focused environment command room and an agent to a tabbed agent workspace (Overview / Questions / Intelligence / Evidence / History / Report) where only the active tab renders. The review wizard shows a maximum of three questions per step with a documented stepper. The score drivers panel honestly explains when a score did not move (e.g. confidence-only answers). Intelligence is now cards (agent interpretation / risk pathways / control plan / landscape influence / refused assumptions / buyer action plan), not a paragraph dump.
When to re-score: Presentation only. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Components, Lib, Tests, Content, Self evaluation gate
Evidence trace: the new Phase 1G-S30J evidence record plus the new active workspace view lib module plus the new agent proof workspace top nav, agent proof connection status chip, agent proof score drivers panel, agent proof assessment intelligence cards, and agent proof agent workspace tabs components plus the new activeWorkspaceView state machine in the live workspace component plus the ten new self-evaluation checks A-J plus the new dedicated Phase 1G-S30J test file plus the updated build-identity endpoint constants plus the package version bump from 0.138.5 to 0.138.9 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30J internal methodology record (full UX reset: premium app workspace)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-13
- Reference
- Phase 1G-S30J founder execution signal: the founder rejected S30F because the page still feels like one long technical stack of cards. The new requirement is a complete UX restructuring with single-active-view workspace shell, top navigation, drawer for connection details and methodology, and card-based intelligence.
- Impact assessment
- Bumps the package version from 0.138.5 to 0.138.9, introduces a WorkspaceView union type plus pure derivation helpers, ships five new workspace components (top nav + connection chip + score drivers panel + intelligence cards + agent workspace tabs), gates the legacy stacked layout behind an activeWorkspaceView state machine so only one major workspace view renders at a time, hides setup blocks when Microsoft is connected, and adds ten archive-blocking self-evaluation checks (A-J). Tokens stay server-side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must run the local app at http://localhost:3000/score/paste, connect Microsoft, and confirm the first connected screen shows the estate command centre (not setup text), the top nav offers Estate / Environments / Agents / Reviews / Intelligence / Settings, opening an agent shows the tabbed workspace (Overview / Questions / Intelligence / Evidence / History / Report), the review wizard shows at most three questions per step, the score drivers panel honestly explains unchanged-score outcomes, and methodology / connection-guide content is opened via drawer or dedicated nav rather than appearing in the default review flow.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S30H
Change date: 2026-05-13
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S30H - hard UX cutover: remove the legacy stacked page completely
Reason: S30G failed acceptance because the delivery report admitted the legacy stacked rendering still appears underneath the new shell. S30H is the structural rewrite the founder required: the legacy long stacked JSX in JsonPasteScoreCard is gated behind a module-scope Boolean constant S30H_LEGACY_STACK_NEVER (false) so it never executes; the live render path is a workspace router that returns exactly ONE view component. Nine new dedicated view components ship in the workspace views layer. Ten new archive-blocking gates A-J refuse any future revert that re-mounts the legacy stack, splits the active view, re-introduces setup blocks when connected, mounts methodology / evidence / wizard content in the default views, or re-introduces fixture intelligence content.
What changed: The package version was bumped from 0.138.6 to 0.138.9. The build-identity endpoint now reports product version 0.138.9 and phase id 1G-S30J. Backwards-compatible alias exports for S30G / S30F / S30E / S30D / S30C / S30B / S30A resolve to the S30H values. Nine new view components ship: estate dashboard view, environment dashboard view, agent workspace view, review wizard view, readiness report view, evidence explorer view, intelligence view, methodology view, review history view. JsonPasteScoreCard adds a workspace router that conditionally renders ONE view at a time based on activeWorkspaceView. The legacy long stacked return body sits permanently behind {S30H_LEGACY_STACK_NEVER && (<>...</>)}. The intelligence view accepts a live intelligence prop and renders the documented empty state copy when no live data is supplied (no fixture cards inside the view). The environment dashboard view declares a real predicate(filter) helper that narrows the agent list by classification / reviewed / needs review. The estate dashboard view is forbidden by gate I from containing setup-block copy. Ten new gates A-J close the gaps the S30G gate let through. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No scoring engine, canonicaliser, builder, enrichment, shared writer, Microsoft connector, or archive builder change.
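A predicate(filter) helper of the kind mentioned for the environment dashboard view might look like the following sketch. Only the helper's name and the three filter dimensions (classification, reviewed, needs review) come from the entry; the AgentRow fields, the AgentFilter shape, and the example classification values are assumptions.

```typescript
// Hypothetical sketch: AgentRow and AgentFilter are illustrative shapes; only
// the predicate(filter) name and the three filter dimensions come from the entry.
interface AgentRow {
  name: string;
  classification: string;
  reviewed: boolean;
  needsReview: boolean;
}

type AgentFilter =
  | { kind: "all" }
  | { kind: "classification"; value: string }
  | { kind: "reviewed" }
  | { kind: "needsReview" };

// Turn a filter description into a row test that Array.prototype.filter can use.
function predicate(filter: AgentFilter): (row: AgentRow) => boolean {
  switch (filter.kind) {
    case "all":
      return () => true;
    case "classification": {
      const wanted = filter.value;
      return (row) => row.classification === wanted;
    }
    case "reviewed":
      return (row) => row.reviewed;
    case "needsReview":
      return (row) => row.needsReview;
  }
}
```

A caller would narrow the list with something like rows.filter(predicate({ kind: "needsReview" })), which keeps the filtering logic testable without rendering the view.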
User impact: The /score/paste route now renders only the top app shell, workspace navigation, and ONE active workspace view. The founder no longer sees the long stacked sequence (Connect / Choose environment / Choose agent / Estate health / Enterprise estate / Reassessment queue / Agent picker / Footprint / Unknowns / Critical questions / AgentProof intelligence paragraph / Capability chips / Evidence Ledger / Risks / Wizard / Methodology panel) as one continuous page.
When to re-score: Presentation only. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Components, Lib, Tests, Content, Self evaluation gate
Evidence trace: the new Phase 1G-S30J evidence record plus the nine new view components plus the new workspace router and S30H_LEGACY_STACK_NEVER gate in JsonPasteScoreCard plus the ten new self-evaluation checks A-J plus the new dedicated Phase 1G-S30J test file plus the updated build-identity endpoint constants plus the package version bump from 0.138.6 to 0.138.9 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30J internal methodology record (hard UX cutover: remove the legacy stacked page completely)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-13
- Reference
- Phase 1G-S30J founder execution signal: S30G failed acceptance because the legacy stacked rendering still appeared underneath the new shell; the founder demanded the legacy long page be removed from the live buyer flow with no admission that gating the rest is a follow-up.
- Impact assessment
- Bumps the package version from 0.138.6 to 0.138.9, gates the legacy long-stacked JsonPasteScoreCard return body behind a module-scope Boolean S30H_LEGACY_STACK_NEVER so it never executes, ships nine dedicated workspace view components, removes the S30G fixture intelligence cards in favour of an empty-state-aware IntelligenceView, and adds ten archive-blocking self-evaluation checks A-J. Tokens stay server-side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app at http://localhost:3000/score/paste, confirm the first connected screen shows the EstateDashboardView (no setup text, no methodology panel, no evidence ledger, no questions, no fixture intelligence), use the workspace switcher to open each view in turn, confirm only one view renders at a time, and confirm that no legacy stacked content appears underneath.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S30I
Change date: 2026-05-13
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S30I - live workspace wiring: no shells, no placeholders
Reason: S30H removed the legacy stacked page from the live render path but the new view components were still largely shells. The founder required that the real business logic move into the workspace views so the product is usable end-to-end from the single-active-view UX alone, with no fixture content and no future-slice admissions.
What changed: The package version was bumped from 0.138.7 to 0.138.9. The build-identity endpoint now reports product version 0.138.9 and phase id 1G-S30J. Backwards-compatible alias exports for S30H / S30G / S30F / S30E / S30D / S30C / S30B / S30A resolve to the S30I values. The workspace router inside JsonPasteScoreCard now derives live state (live environment cards, live agent rows, live history rows, live evidence groups, live next actions, estate average score, reviewed agents count, unreviewed agents count, review coverage percent, selected agent latest, selected agent previous, selected agent best, selected agent worst, selected agent average, selected agent trend label) and passes it into each view component. The agent workspace view tab bodies are no longer placeholder strings; they now surface real footprint counts, the canonical score summary, and per-agent score history (latest/previous/best/worst/average). The review wizard view reads current answers from microsoft confirmation answers and saves answers via commitMicrosoftAnswer (the shared canonical writer). The readiness report view consumes the canonical score only. The evidence explorer view consumes the real microsoft footprint facts grouped by confidence. The review history view consumes the real microsoftReportHistory rows sorted newest-first. Twelve new archive-blocking gates (s30i_no_shell_views, s30i_estate_view_live, s30i_environment_filters_live, s30i_agent_workspace_live_tabs, s30i_review_wizard_live, s30i_report_uses_canonical_score, s30i_evidence_collapsed_and_filterable, s30i_intelligence_live_not_fixture, s30i_history_scores_visible, s30i_connected_setup_hidden, s30i_single_active_view_runtime, s30i_text_budget_enforced) refuse any future revert that breaks the live wiring. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
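Two of the live derivations in this entry, footprint facts grouped by confidence and history rows sorted newest-first, can be sketched as pure helpers. The FootprintFact and HistoryRow shapes, the three confidence levels, and the function names are assumptions for illustration; the entry does not specify them.

```typescript
// Hypothetical sketch: record shapes, confidence levels, and function names
// are assumptions; the entry only states the grouping and the sort order.
type Confidence = "high" | "medium" | "low";

interface FootprintFact {
  label: string;
  confidence: Confidence;
}

interface HistoryRow {
  score: number;
  generatedAt: string; // ISO 8601 timestamp
}

// Bucket facts by confidence, preserving input order inside each bucket.
function groupByConfidence(facts: FootprintFact[]): Record<Confidence, FootprintFact[]> {
  const groups: Record<Confidence, FootprintFact[]> = { high: [], medium: [], low: [] };
  for (const fact of facts) {
    groups[fact.confidence].push(fact);
  }
  return groups;
}

// Return a sorted copy (newest first) so the caller's array is left untouched.
function sortNewestFirst(rows: HistoryRow[]): HistoryRow[] {
  return rows
    .slice()
    .sort((a, b) => Date.parse(b.generatedAt) - Date.parse(a.generatedAt));
}
```

Copying before sorting keeps the helper side-effect free, which fits a router that derives view state on each render instead of mutating stored history.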
User impact: The single-active-view UX from S30H is now usable end-to-end: the buyer can navigate Estate -> Environment -> Agent, see real counts and real agent rows, open the agent workspace tabs with real footprint + history + report content, run the wizard with real save/load wiring through the canonical writer, and view the canonical score in the readiness report.
When to re-score: Presentation wiring only. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Components, Lib, Tests, Content, Self evaluation gate
Evidence trace: the new Phase 1G-S30J evidence record plus the live derivations inside the workspace router plus the live tab bodies inside agent workspace view plus the canonical-score wiring on readiness report view plus the real footprint-derived evidence groups plus the real history rows on review history view plus the twelve new self-evaluation checks s30i_* plus the new dedicated Phase 1G-S30J test file plus the updated build-identity endpoint constants plus the package version bump from 0.138.7 to 0.138.9 plus this methodology changelog entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S30J internal methodology record (live workspace wiring, no shells, no placeholders)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-13
- Reference
- Phase 1G-S30J founder execution signal: S30H removed the legacy stacked page but admitted the new view components were still shells; the founder required the real business logic move into the workspace views with no follow-up admission and no fixture content.
- Impact assessment
- Bumps the package version from 0.138.7 to 0.138.9, refactors the workspace router to derive live state and pass it into all 9 views, replaces placeholder tab bodies in AgentWorkspaceView with live content, wires the wizard to commitMicrosoftAnswer, wires the readiness report to canonical score only, wires the evidence explorer to real footprint facts, wires the review history to real microsoftReportHistory, and adds twelve archive-blocking gates that refuse any revert. Tokens stay server-side. No business records are read. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must open the local app, navigate through Estate then Environment then Agent, confirm the real agent rows render, confirm the History tab shows latest, previous, best, worst, and average, confirm the wizard saves answers via the canonical writer, and confirm the report consumes the canonical score.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S24
Change date: 2026-05-09
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S24 - agent estate dashboard, true rich confirmation questions, portfolio remediation command centre
Reason: The founder live-tested Phase 1G-S23 and surfaced two product corrections. First, the dashboard still listed yes / no / not sure buttons even though Phase 1G-S23 introduced richer answer types - the rich controls were in the catalogue but the UI never branched on rich answer type. Second, the dashboard was per-agent only - the founder asked for an estate / portfolio cockpit answering: how many agents do I have, what is the status of each, what is outstanding in totality across the estate, and which agent should I review next. The downloaded report also leaked a 1970-01-01 generated at timestamp and a generic Agent name where a real display name should have rendered. Phase 1G-S24 is the corrective slice for all of these.
What changed: The package version was bumped from 0.131.0 to 0.132.0. The build-identity endpoint now reports product version 0.132.0, phase id 1G-S24, the documented phase name, and the new expected archive name. Backwards-compatible alias exports for Phase 1G-S8 through Phase 1G-S23 now resolve to the Phase 1G-S24 values. Two new pure helper modules ship under the reporting library: a confirmation question deduplicator that collapses near-duplicate question ids (for example human approval for sensitive actions vs human approval required for sensitive actions, fallback path vs fallback or escalation path) onto a single canonical id while preserving the richer answer-type version and re-keying both legacy and rich answer maps so the buyer never loses progress; and an agent estate view-model builder that turns the discovered agent list plus the in-session report-history list into a documented {estate summary, agent status table, portfolio priorities, recommended next agent, estate health, remediation totality} surface. A new premium agent estate dashboard component renders the portfolio cockpit above the per-agent flow with classification / reviewed / unreviewed filters plus a name search and clickable Review or Re-review rows that select the agent in the existing dropdown. The dynamic checklist UI now branches on rich answer type and emits the correct control: select for single choice / owner role / frequency / maturity scale, checkboxes for multi select, input for free text short / evidence reference, with an optional evidence note input when supports evidence note is true. The buyer's richer answers are captured into a parallel rich answer map keyed by canonical question id and used to sharpen the buyer confirmations interpretations so the Markdown report and the dashboard no longer flatten back to bare yes / no. The legacy 1970-01-01T00:00:00.000Z fallback for generated at is removed (now empty string when no real timestamp arrives). 
The agent display name now resolves through a documented trim / score-response / Unnamed agent cascade so the heading no longer leaks blank strings or generic Agent labels. In-session only - no local storage, session storage, indexed db, cookie, Supabase, or disk write. The Phase 1G-S23 dashboard, the Phase 1G-S22 premium dashboard, the Phase 1G-S21C intelligence layer, the score input builder, and the dynamic question generator are otherwise unchanged. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. Null Provider remains default.
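The re-keying step of the confirmation question deduplicator can be sketched as below. This is a minimal illustration under stated assumptions, not the shipped module: the alias table, the snake_case id spellings, and the names `canonicalId` and `rekeyAnswers` are hypothetical. Only the two near-duplicate pairs come from this entry.

```typescript
// Hypothetical alias table: near-duplicate question id -> canonical id.
// The two pairs mirror the examples in this entry; the id spelling is assumed.
const ALIASES = new Map<string, string>([
  ["human_approval_required_for_sensitive_actions", "human_approval_for_sensitive_actions"],
  ["fallback_or_escalation_path", "fallback_path"],
]);

function canonicalId(id: string): string {
  return ALIASES.get(id) ?? id;
}

// Re-key an answer map onto canonical ids. If both the alias and the
// canonical id carry an answer, the canonical answer wins; otherwise the
// alias answer is carried over so the buyer keeps their progress.
function rekeyAnswers<T>(answers: Record<string, T>): Record<string, T> {
  const out: Record<string, T> = {};
  for (const id of Object.keys(answers)) {
    const canon = canonicalId(id);
    if (id === canon || !(canon in out)) {
      out[canon] = answers[id];
    }
  }
  return out;
}
```

The same function can be applied to both the legacy answer map and the rich answer map, since it is generic over the answer value type.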
User impact: Founders and reviewers now see a portfolio-level estate cockpit before drilling into a single agent: total agents, classification splits, reviewed / unreviewed, ready / needs review / not ready, total open priorities across all reviewed agents, cross-agent recurring problems, remediation totality by category and owner and severity, recommended next agent. Confirmation questions now render the correct rich control instead of a yes / no fallback for every question. Near-duplicate questions collapse to one canonical question while keeping the buyer's existing answer. The downloadable Markdown report now carries the rich answer detail (frequency, maturity, multi-select tags, owner role, free text) instead of flattening to yes / no. Generated_at is the real ISO timestamp. The agent name on the heading is the real display name.
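The branch from rich answer type to rendered control can be pictured as a lookup table. The sketch below is illustrative only: the type names, the snake_case answer-type spellings, and `controlsFor` are assumptions, and the legacy yes / no / not sure type is left out. The mapping itself (select, checkboxes, input, plus an optional evidence note input) follows this entry.

```typescript
// Assumed spellings of the rich answer types named in this entry.
type RichAnswerType =
  | "single_choice"
  | "owner_role"
  | "frequency"
  | "maturity_scale"
  | "multi_select"
  | "free_text_short"
  | "evidence_reference";

type ControlKind = "select" | "checkboxes" | "input";

// Hypothetical question shape; only the fields needed for the branch.
interface RichQuestion {
  id: string;
  answerType: RichAnswerType;
  supportsEvidenceNote: boolean;
}

const CONTROL_BY_TYPE: Record<RichAnswerType, ControlKind> = {
  single_choice: "select",
  owner_role: "select",
  frequency: "select",
  maturity_scale: "select",
  multi_select: "checkboxes",
  free_text_short: "input",
  evidence_reference: "input",
};

// Returns the main control first, followed by an extra input when the
// question supports an evidence note.
function controlsFor(question: RichQuestion): ControlKind[] {
  const controls: ControlKind[] = [CONTROL_BY_TYPE[question.answerType]];
  if (question.supportsEvidenceNote) {
    controls.push("input");
  }
  return controls;
}
```

Using a `Record` keyed by the answer-type union means the compiler flags any answer type that is added later without a matching control.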
When to re-score: Buyer-facing portfolio dashboard plus rich-control rendering plus question deduplication plus metadata polish. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Evidence trace: the new Phase 1G-S24 evidence record plus the two new pure reporting helpers (question deduplicator and agent estate view-model builder) plus the new agent estate dashboard component plus the extended dynamic checklist UI plus the extended view model (rich answer interpretation, Unnamed agent fallback, generated at default removed) plus the new Phase 1G-S24 unit tests plus the updated build-identity endpoint constants plus the package version bump from 0.131.0 to 0.132.0 plus the methodology changelog 125th entry plus the README Phase 1G-S24 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S24 internal methodology record (agent estate dashboard, true rich confirmation questions, portfolio remediation command centre)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-09
- Reference
- Phase 1G-S24 founder execution signal: live test of Phase 1G-S23 surfaced two corrections - rich answer controls were declared but never rendered (the UI still showed yes / no buttons), and the dashboard was per-agent only with no portfolio-level cockpit. Phase 1G-S24 ships rich-control rendering, question deduplication, an estate view model, the AgentEstateDashboard, a Markdown report that carries the rich answer detail, and metadata fixes (generated_at, agent display name).
- Impact assessment
- Bumps the package version, rewrites the build-identity endpoint to report the new phase identity, ships two new pure helper modules (question deduplicator, agent estate view-model builder), ships a new premium dashboard component (AgentEstateDashboard), extends the dynamic checklist UI to render the correct rich control per answer type with optional evidence-note inputs, extends the view model with rich-answer interpretation strings and an Unnamed agent fallback, removes the 1970-01-01 generated_at fallback, and adds Phase 1G-S24 unit tests. Tokens stay server-side. No business records are read on the product path. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. Null Provider remains default. The packaging safety helper from Phase 1G-S21A is preserved. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. Phase 1G-S24 has not been live-tested end to end against a tenant with multiple Copilot Studio agents in a single environment so the founder can confirm the estate dashboard renders the correct totals, classification splits, reviewed / unreviewed counts, recommended next agent, and remediation totality. The founder must run the local endpoint and confirm the dashboard renders before drilling into a single agent, the rich-control questions emit the correct UI, the near-duplicate questions collapse to one canonical question without losing buyer answers, and the downloaded Markdown report now carries the rich answer detail (frequency, maturity, multi-select tags, owner role, free text) instead of flattening to yes / no.
- Re-score optional
Change id: 1G-S25
Change date: 2026-05-09
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S25 - agent estate intelligence command centre and premium UX
Reason: The founder live-tested Phase 1G-S24 and judged the product was still not strong enough to be a serious estate-level command centre. Founder feedback: it is starting to take shape but only just; the buyer needs a deliberate Custom / Microsoft-OOTB / Unknown split rather than a mixed grid; the dashboard needs estate-level totality answering 'what is outstanding across the entire estate' rather than per-agent actions; the rich answer types declared in code must be visibly rendered with why-asked / evidence-basis / good-answer-hint copy and adaptive question families per classification and per risk trigger; the report metadata still leaks 1970 timestamps and generic Agent labels.
What changed: The package version was bumped from 0.132.0 to 0.133.0. The build-identity endpoint now reports product version 0.133.0, phase id 1G-S25, the documented phase name, and the new expected archive name. Backwards-compatible alias exports for Phase 1G-S8 through Phase 1G-S24 now resolve to the Phase 1G-S25 values. Five new pure helper modules ship under the reporting library: an estate status taxonomy (8 documented statuses with buyer label, description, tone, and next action) plus a derive estate status decision tree; an estate remediation totality builder returning totals, by category, by owner, by severity, recurring patterns, fastest high impact wins, portfolio top 5, and recommended next agents (5); an adaptive question strategy with documented question families per classification (Custom / Microsoft-OOTB / Unknown) and per risk trigger (actions / knowledge / monitoring / fallback / testing / ownership); a rich answer interpreter that turns rich answers into a buyer-safe sentence with confidence impact + risk reduction hint plus a Buyer evidence and rich confirmations Markdown section renderer; and the agent estate view model, extended with segments (Custom / Microsoft-OOTB / Unknown) + default segment + estate remediation totality + estate trend + estate status per row + 15 documented summary metrics. A new rich question control component ships with documented why asked / evidence basis / risk reduced / good answer hint copy and per rich answer type controls. The agent estate dashboard is rewritten with explicit segment tabs (default focus = Custom when custom agents exist), 18 summary cards, an outstanding totality panel, portfolio Top-5, recommended-next-agents queue (5), category queues, by-owner / by-severity queues, an estate trend panel, and a proper agent estate management table sorted by attention with status pill + score / band / risk / confidence / open / high-sev / last-reviewed / trend / next action / Review CTA. 
The dynamic checklist UI now layers the adaptive question family on top of the evidence-led question set; the deduplicator collapses near-duplicates so the buyer never sees a duplicate concept. The downloadable Markdown gains section 10e (Buyer evidence and rich confirmations), so rich answer detail (frequency / maturity / multi-select tags / owner / free text / evidence reference) is no longer flattened back to yes / no. In-session only - no local storage, session storage, indexed db, cookie, Supabase, or disk write. The Phase 1G-S24 dashboard, the Phase 1G-S23 iterative improvement loop, the Phase 1G-S22 premium dashboard, the Phase 1G-S21C intelligence layer, the score input builder, and the dynamic question generator are otherwise unchanged. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. Null Provider remains default.
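The by-category / by-owner / by-severity grouping of the estate remediation totality builder can be sketched as a single pass over the open priorities of all reviewed agents. This is a minimal sketch, not the shipped builder: the `OpenPriority` shape, the three severity levels, and the name `buildTotality` are assumptions, and the recurring-pattern, fastest-win, and top-5 outputs are omitted.

```typescript
type Severity = "high" | "medium" | "low"; // assumed severity scale

// Hypothetical shape of one open priority on a reviewed agent.
interface OpenPriority {
  agentId: string;
  category: string;
  owner: string;
  severity: Severity;
}

interface RemediationTotality {
  total: number;
  byCategory: Record<string, number>;
  byOwner: Record<string, number>;
  bySeverity: Record<Severity, number>;
}

// Count outstanding work across the whole estate, grouped three ways.
function buildTotality(items: readonly OpenPriority[]): RemediationTotality {
  const result: RemediationTotality = {
    total: items.length,
    byCategory: {},
    byOwner: {},
    bySeverity: { high: 0, medium: 0, low: 0 },
  };
  for (const item of items) {
    result.byCategory[item.category] = (result.byCategory[item.category] ?? 0) + 1;
    result.byOwner[item.owner] = (result.byOwner[item.owner] ?? 0) + 1;
    result.bySeverity[item.severity] += 1;
  }
  return result;
}
```

Because the function is pure and takes the flattened list as input, it can run in-session without any storage or network access, in line with the constraints stated in this entry.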
User impact: Founders and reviewers now open the dashboard on the documented default segment (Custom first when custom agents exist) so the first thing they see is the segment they care about - not a mixed wall. The estate hero answers the documented 12 first-glance questions in under 60 seconds: total agents, classification split, reviewed / unreviewed, agents needing remediation, agents blocked by missing evidence, ready / acceptable counts, total open priorities, high-severity open, confirmations outstanding, recommended next agent, biggest cross-agent pattern, recommended focus. Every agent now has a documented estate status (not reviewed, review in progress, needs confirmation, needs remediation, blocked by missing evidence, ready for rerun, acceptable, watchlist) with a tone-coloured pill in the management table. The estate remediation queues group outstanding work by category / owner / severity so the right team can clear a class of issues across the estate. Confirmation questions render the right control per rich answer type and adapt by classification + risk trigger. The downloaded Markdown carries the rich answer detail in section 10e (Buyer evidence and rich confirmations).
When to re-score: Buyer-facing estate command centre upgrade plus adaptive rich-question rendering plus rich answer preservation in the report. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Evidence trace: the new Phase 1G-S25 evidence record plus the five new pure reporting helpers (estate status model, estate remediation totality, adaptive question strategy, rich answer interpreter, plus the substantially upgraded agent estate view model) plus the rewritten agent estate dashboard component plus the new rich question control component plus the extended dynamic checklist UI plus the Markdown 10e section plus the new Phase 1G-S25 unit tests plus the updated build-identity endpoint constants plus the package version bump from 0.132.0 to 0.133.0 plus the methodology changelog 126th entry plus the README Phase 1G-S25 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S25 internal methodology record (agent estate intelligence command centre and premium UX)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-09
- Reference
- Phase 1G-S25 founder execution signal: live test of Phase 1G-S24 surfaced that the estate dashboard was a first attempt and not yet a true command centre. The founder asked for an explicit Custom / Microsoft-OOTB / Unknown split, an estate-wide totality of outstanding work, true rich question controls per classification and per risk trigger, and rich answer preservation in the downloaded report.
- Impact assessment
- Bumps the package version, rewrites the build-identity endpoint to report the new phase identity, ships five new pure helper modules (estate status model, estate remediation totality, adaptive question strategy, rich answer interpreter, plus a substantially upgraded agent estate view model), ships a rewritten AgentEstateDashboard component with segment tabs and a proper management table, ships a new RichQuestionControl component, extends the dynamic checklist UI with adaptive question families, extends the Markdown renderer with section 10e, and adds Phase 1G-S25 unit tests. Tokens stay server-side. No business records are read on the product path. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. Null Provider remains default. The packaging safety helper from Phase 1G-S21A is preserved. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. Phase 1G-S25 has not been live-tested end to end against a tenant with multiple Copilot Studio agents across Custom and Microsoft-OOTB classifications so the founder can confirm the segmented dashboard renders correctly, the estate management table sorts by attention, the adaptive question families differ between Custom / Microsoft-OOTB / Unknown agents, and the rich answer detail flows through to the downloaded Markdown. The founder must run the local endpoint and confirm every documented section before product approval.
- Re-score optional
Change id: 1G-S26
Change date: 2026-05-09
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S26 - focused review workspace, one-click review flow, and dashboard de-clutter
Reason: The founder live-tested Phase 1G-S25, shared multiple screenshots, and surfaced four UX failures: (1) clicking Review Agent only populated the dropdown - the buyer still had to click Discover footprint a second time; (2) the page was a massively long single vertical scroll with no clear sections; (3) the legacy yes / no Answer the remaining points checklist still rendered alongside the new rich adaptive checklist; (4) the Top 5 contained duplicate fallback concepts and the dashboard sometimes showed a generic Agent name. Founder feedback: it feels extremely disorganised; the 18-card wall is overwhelming; the evidence note input should not be open under every question by default.
What changed: The package version was bumped from 0.133.0 to 0.134.0. The build-identity endpoint now reports product version 0.134.0, phase id 1G-S26, the documented phase name, and the new expected archive name. Backwards-compatible alias exports for Phase 1G-S8 through Phase 1G-S25 now resolve to the Phase 1G-S26 values. One new pure helper module ships under the reporting library: a priority deduplicator that collapses near-duplicate Top-5 priorities (e.g. fallback / escalation duplicate concepts) onto a single canonical entry while preserving the highest severity, the strongest evidence, the merged linked evidence ids and linked question ids sets, and the clearer buyer-facing title. A new agent proof workspace shell component renders a five-section workspace nav (Estate overview / Review queue / Agent review / Report / Evidence) with a sticky tab bar so the active section dominates and non-active sections are hidden. The Review Agent button across the dashboard now calls a single start agent review helper that atomically (1) selects the agent, (2) clears stale state, (3) switches the workspace to agent review, (4) scrolls / focuses the agent review anchor, (5) auto-fetches the footprint, and (6) renders progress copy so the buyer never sees a 'nothing happened' gap between the click and the result. The estate dashboard summary cards are restructured into four documented metric groups (Estate size / Review progress / Outstanding work / Attention required); the S25 18-card wall is gone. The legacy yes / no checklist is hidden whenever the rich adaptive checklist is active. The estate view model resolves a buyer-safe agent name (trim, never accept the literal Agent, fall back to Unnamed agent (id-prefix)) so the dashboard never renders a generic Agent label when the discovered display name is empty or the literal word Agent. The readiness report view model gets the same pick name fallback so the report hero / verdict / agent block agree. 
The per-question evidence note input is now collapsed behind an Add evidence note disclosure; when a note has been saved a compact chip (with the first 60 chars + edit affordance) renders instead of the input. After a successful readiness report generation the workspace auto-switches to the Report section. In-session only - no local storage, session storage, indexed db, cookie, Supabase, or disk write. The Phase 1G-S25 dashboard helpers, the Phase 1G-S24 question deduplicator, the Phase 1G-S23 iterative loop, the Phase 1G-S22 premium dashboard, the Phase 1G-S21C intelligence layer, the score input builder, and the dynamic question generator are otherwise unchanged. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. Null Provider remains default.
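The merge rule of the priority deduplicator can be sketched as below. This is an illustration under stated assumptions rather than the shipped helper: the `concept` key, the `Priority` shape, and `dedupePriorities` are hypothetical, the "strongest evidence" rule is omitted, and because this entry does not say how the clearer title is chosen, the sketch simply keeps the first title seen.

```typescript
type Severity = "low" | "medium" | "high"; // assumed severity scale
const RANK: Record<Severity, number> = { low: 0, medium: 1, high: 2 };

// Hypothetical priority shape; `concept` stands in for whatever canonical
// key the real helper derives for near-duplicate concepts.
interface Priority {
  concept: string;
  title: string;
  severity: Severity;
  linkedEvidenceIds: string[];
  linkedQuestionIds: string[];
}

// Collapse priorities that share a concept into one entry, keeping the
// highest severity and the union of linked evidence and question ids.
function dedupePriorities(items: readonly Priority[]): Priority[] {
  const byConcept = new Map<string, Priority>();
  for (const item of items) {
    const seen = byConcept.get(item.concept);
    if (seen === undefined) {
      byConcept.set(item.concept, {
        ...item,
        linkedEvidenceIds: item.linkedEvidenceIds.slice(),
        linkedQuestionIds: item.linkedQuestionIds.slice(),
      });
      continue;
    }
    if (RANK[item.severity] > RANK[seen.severity]) {
      seen.severity = item.severity;
    }
    seen.linkedEvidenceIds = Array.from(
      new Set(seen.linkedEvidenceIds.concat(item.linkedEvidenceIds)),
    );
    seen.linkedQuestionIds = Array.from(
      new Set(seen.linkedQuestionIds.concat(item.linkedQuestionIds)),
    );
  }
  return Array.from(byConcept.values());
}
```

A `Map` preserves insertion order, so the deduplicated list keeps the original ranking of the first occurrence of each concept.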
User impact: The founder no longer scrolls through a 7000-line page. The workspace nav at the top makes the structure obvious and only the active section dominates. Review Agent is now a true one-click action: select + clear stale state + switch section + scroll + auto-fetch + progress copy. The Top 5 contains five distinct concepts (no more duplicate fallback rows). The estate dashboard scans cleanly into Estate size / Review progress / Outstanding work / Attention required. The legacy yes / no checklist no longer competes with the rich adaptive checklist. The report hero never shows a generic Agent label. Evidence notes are collapsed by default and the per-question UI is dramatically calmer.
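The buyer-safe agent name cascade (trim, reject the literal Agent, fall back to Unnamed agent with an id prefix) can be sketched as follows. The function name, the exact-match comparison against "Agent", and the 8-character prefix length are assumptions made for illustration; the entry does not specify them.

```typescript
// Assumed prefix length; the entry only says "id-prefix".
const ID_PREFIX_LENGTH = 8;

// Resolve a display name that is safe to show in the dashboard and report.
function resolveAgentName(
  displayName: string | null | undefined,
  agentId: string,
): string {
  const trimmed = (displayName ?? "").trim();
  // Accept any non-empty name except the generic literal "Agent".
  if (trimmed !== "" && trimmed !== "Agent") {
    return trimmed;
  }
  const prefix = agentId.trim().slice(0, ID_PREFIX_LENGTH);
  return prefix !== "" ? `Unnamed agent (${prefix})` : "Unnamed agent";
}
```

Sharing one resolver between the estate view model and the readiness report view model is what keeps the report hero, verdict, and agent block in agreement.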
When to re-score: Buyer-facing workspace + UX cleanup. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Evidence trace: the new Phase 1G-S26 evidence record plus the new priority deduplicator module plus the new agent proof workspace shell component plus the upgraded json paste score card with the start agent review one-click handler + workspace section state + agent review section anchor + progress label + auto-switch to report + collapsed evidence note disclosure + hidden legacy checklist plus the rewritten estate summary cards with four documented metric groups plus the safe agent name resolver in the estate view model plus the pick name resolver in the readiness report view model plus the new Phase 1G-S26 unit tests plus the updated build-identity endpoint constants plus the package version bump from 0.133.0 to 0.134.0 plus the methodology changelog 127th entry plus the README Phase 1G-S26 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S26 internal methodology record (focused review workspace, one-click review flow, dashboard de-clutter, legacy UI retirement)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-09
- Reference
- Phase 1G-S26 founder execution signal: live test of Phase 1G-S25 + multiple screenshots + four direct critiques (one-click review broken; massively long page; legacy and rich checklists rendering side-by-side; duplicate Top-5 concepts plus generic Agent name).
- Impact assessment
- Bumps the package version, rewrites the build-identity endpoint to report the new phase identity, ships one new pure helper module (priority deduplicator), ships a new workspace shell component, restructures the estate summary cards, adds the one-click Review Agent flow with progress copy, hides the legacy yes / no checklist behind the rich checklist gate, fixes generic Agent name propagation, collapses evidence note inputs behind a per-question disclosure, and adds Phase 1G-S26 unit tests. Tokens stay server-side. No business records are read on the product path. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. Null Provider remains default. The packaging safety helper from Phase 1G-S21A is preserved. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. Phase 1G-S26 has not been live-tested end to end. The founder must run the local endpoint, click Review Agent in the estate dashboard, confirm the workspace switches to Agent review with progress copy, confirm the legacy yes / no checklist is hidden, confirm the Top 5 contains five distinct concepts, confirm the estate dashboard renders four metric groups instead of an 18-card wall, and confirm the report hero uses the real agent name.
- Re-score optional
Change id: 1G-S26A
Change date: 2026-05-09
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S26A - focused review wizard and accurate question counting
Reason: The founder live-tested Phase 1G-S26 and rejected the UX. Two direct critiques drove this corrective slice: (1) the page still felt like one long disorganised scroll because the workspace tabs only switched a handful of small sections rather than gating the major blocks; (2) the question count on the dynamic checklist read 1 of 9 while more than 9 answerable questions were visible because the count was reading the legacy 10-question shape, not the deduplicated rich adaptive set. Phase 1G-S26A is the hard UX correction.
What changed: The package version was bumped from 0.134.0 to 0.134.1. The build-identity endpoint now reports product version 0.134.1, phase id 1G-S26A, the new phase name, and the new expected archive name. Backwards-compatible alias exports for Phase 1G-S26 + earlier resolve to the Phase 1G-S26A values. Two new modules ship. The question required classifier is a pure helper that classifies each rich confirmation question into a wizard step (critical questions / actions and data / controls and oversight) and a Required / Recommended / Optional level (high -> Required, medium -> Recommended, low -> Optional), plus the documented summarise question answers counter that follows the documented counting rules (multi-select counts when >= 1 option selected, single / owner / frequency / maturity counts when non-empty, free-text counts when non-empty after trim, yes/no/not sure counts when one of three values is selected, optional evidence note fields are NEVER counted as answers). The agent review wizard is a new five-step focused review wizard (Review summary / Critical questions / Actions and data / Controls and oversight / Final review) with a progress bar at the top, only-current-step rendering, Prev / Next navigation, and a Generate-readiness-report button that stays disabled, with a documented blocker message, until every Required question has been answered. The json paste score card now wraps the existing dynamic checklist in the wizard via a render question card callback so the per-question card JSX is unchanged but only the active step's questions render. True workspace gating: the dynamic checklist + wizard render only on the Agent review tab; the readiness report renders only on the Report tab; the legacy footprint counts card moves to the Evidence tab; the legacy yes / no checklist remains in the DOM (its hidden Generate button is the click target the wizard's Generate button programmatically fires) but stays hidden whenever the rich wizard is active. 
The legacy button's disabled prop now accepts rich-only answer flows so the wizard's click goes through. In-session only - no local storage, session storage, indexed db, cookie, Supabase, or disk write. The Phase 1G-S26 priority deduplicator, the Phase 1G-S25 estate helpers, the Phase 1G-S24 question deduplicator, the score input builder, and the dynamic question generator are unchanged. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
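The counting rules listed for the summarise question answers counter can be sketched as below. Only the rules themselves and the name `summariseQuestionAnswers` come from this changelog; the `RichAnswer` union, its field names, the `not_sure` spelling, and the function signature are assumptions for illustration.

```typescript
// Hypothetical answer shapes, one per family of rich answer type.
type RichAnswer =
  | { kind: "multi_select"; selected: string[]; evidenceNote?: string }
  | {
      kind: "single_choice" | "owner_role" | "frequency" | "maturity_scale";
      value: string;
      evidenceNote?: string;
    }
  | { kind: "free_text"; text: string; evidenceNote?: string }
  | { kind: "yes_no_not_sure"; value: string; evidenceNote?: string };

// Applies the documented counting rules. The evidence note is never read,
// so a note on its own can never make a question count as answered.
function isAnswered(answer: RichAnswer): boolean {
  switch (answer.kind) {
    case "multi_select":
      return answer.selected.length >= 1;
    case "free_text":
      return answer.text.trim() !== "";
    case "yes_no_not_sure":
      return answer.value === "yes" || answer.value === "no" || answer.value === "not_sure";
    default:
      return answer.value !== "";
  }
}

// Count answered questions against the deduplicated question set that the
// wizard actually renders, not the legacy fixed-length shape.
function summariseQuestionAnswers(
  questionIds: readonly string[],
  answers: Readonly<Record<string, RichAnswer | undefined>>,
): { answered: number; total: number } {
  let answered = 0;
  for (const id of questionIds) {
    const answer = answers[id];
    if (answer !== undefined && isAnswered(answer)) {
      answered += 1;
    }
  }
  return { answered, total: questionIds.length };
}
```

Deriving `total` from the rendered question ids is what keeps the counter in step with what the buyer sees on screen.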
User impact: The Agent review tab is now a focused wizard. Only the current step's questions render. The progress bar pins the buyer's location across the five documented steps. The Generate readiness report button stays disabled, with a clear blocker message, until every Required question has been answered. The question count is accurate: it reads the deduplicated rich adaptive set the wizard renders. Optional evidence note fields are never counted as questions. The estate dashboard, the legacy footprint counts, and the readiness report no longer all render in one continuous scroll - each one is gated to its own workspace tab.
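The Required gate on the Generate readiness report button can be sketched as a level lookup plus a blocker check. The high / medium / low to Required / Recommended / Optional mapping is taken from this entry; the `weight` field name, the `WizardQuestion` shape, `generateBlocker`, and the blocker wording are hypothetical.

```typescript
type Weight = "high" | "medium" | "low"; // assumed name for the classifier input
type Level = "Required" | "Recommended" | "Optional";

const LEVEL_BY_WEIGHT: Record<Weight, Level> = {
  high: "Required",
  medium: "Recommended",
  low: "Optional",
};

// Hypothetical question shape; only the fields the gate needs.
interface WizardQuestion {
  id: string;
  weight: Weight;
}

// Returns null when the report can be generated, otherwise a blocker
// message naming how many Required questions are still open.
function generateBlocker(
  questions: readonly WizardQuestion[],
  answeredIds: ReadonlySet<string>,
): string | null {
  const missing = questions.filter(
    (q) => LEVEL_BY_WEIGHT[q.weight] === "Required" && !answeredIds.has(q.id),
  );
  if (missing.length === 0) {
    return null;
  }
  // Illustrative copy only; the documented blocker message is not reproduced here.
  return `Answer ${missing.length} required question(s) before generating the readiness report.`;
}
```

A button could then use `generateBlocker(...) !== null` as its disabled state and show the returned string next to it.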
When to re-score: Buyer-facing UX correction. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Evidence trace: the new Phase 1G-S26A evidence record plus the new agentproof question required classifier module plus the new agent review wizard component plus the upgraded json paste score card with workspace-section gating on the dynamic checklist + footprint counts + readiness report blocks plus the wizard wrapper around the existing per-question render plus the new Phase 1G-S26A unit tests plus the updated build-identity endpoint constants plus the package version bump from 0.134.0 to 0.134.1 plus the methodology changelog 128th entry plus the README Phase 1G-S26A row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S26A internal methodology record (focused review wizard and accurate question counting)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-09
- Reference
- Phase 1G-S26A founder execution signal: live test of Phase 1G-S26 + two direct critiques (the page still feels like one massively long disorganised scroll, and the question counter says 1 of 9 while more than 9 answerable questions are visible).
- Impact assessment
- Bumps the package version, rewrites the build-identity endpoint to report the new phase identity, ships one new pure helper module (question required classifier with summariseQuestionAnswers), ships a new focused review wizard component, applies true workspace gating on the dynamic checklist + footprint counts + readiness report blocks, and adds Phase 1G-S26A unit tests. Tokens stay server-side. No business records are read on the product path. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. Null Provider remains default. The packaging safety helper from Phase 1G-S21A is preserved. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. Phase 1G-S26A has not been live-tested end to end. The founder must run the local endpoint, click Review Agent in the estate dashboard, confirm the Agent review tab shows the wizard with only the current step rendering, confirm the progress bar advances from Step 1 through Step 5, confirm the question count matches the actual visible deduplicated rich adaptive set, confirm Generate readiness report stays disabled until every Required question is answered with the documented blocker copy, and confirm the dashboard, footprint counts, and readiness report each gate to their own workspace tab.
- Re-score optional
Change id: 1G-S26B
Change date: 2026-05-09
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S26B - critical React object render crash fix
Reason: The founder live-tested Phase 1G-S26A and the page crashed immediately with the React error: Objects are not valid as a React child (found: object with keys id, summary, why it matters). The error boundary rendered Something went wrong on this page. Root cause: the agent review wizard top unknowns prop was typed as readonly array-of-string, but the parent passed intelligence.intelligence summary.most important unknowns, which is an array of {id, summary, why it matters} objects at runtime. The wizard rendered each entry with bare {u}, which crashed the page. Phase 1G-S26B is the critical crash fix.
What changed: The package version was bumped from 0.134.1 to 0.134.2. The build-identity endpoint now reports product version 0.134.2, phase id 1G-S26B, the new phase name, and the new expected archive name. Backwards-compatible alias exports for Phase 1G-S26A and Phase 1G-S26 resolve to the Phase 1G-S26B values. One new pure helper module ships under an internal source file: a documented buyer-safe display projection that turns any value (string / number / boolean / array / object) into a display string. For object inputs the helper walks a documented preferred-field list (summary, label, title, name, description, why it matters, buyer summary, finding) so the rendered text always carries meaning. The agent review wizard top unknowns prop is widened from readonly array-of-string to readonly array-of-unknown and every list item is projected through render safe text before rendering. why it matters is surfaced inline as supporting copy. id is exposed as a data-unknown-id attribute, never as buyer-facing text. The bad cast in the json paste score card that lied to the type system about most important unknowns being readonly array-of-string is replaced with the honest readonly array-of-unknown. No new product features. No redesign. No product-logic change. The Phase 1G-S26A wizard, classifier, workspace gating, one-click review, required-vs-optional logic, report auto-switch, and legacy-checklist-hiding behaviour are all preserved unchanged. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
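The buyer-safe display projection can be sketched as a function from `unknown` to `string`. The preferred-field order and the name `renderSafeText` follow this changelog; the snake_case field spellings, the comma-joined handling of arrays, and the empty-string fallback are assumptions, not confirmed behaviour of the shipped helper.

```typescript
// Preferred fields, in the order listed in this entry (spelling assumed).
const PREFERRED_FIELDS = [
  "summary",
  "label",
  "title",
  "name",
  "description",
  "why_it_matters",
  "buyer_summary",
  "finding",
] as const;

// Project any runtime value to a string that is safe to render as a
// React child, so an object never reaches JSX directly.
function renderSafeText(value: unknown): string {
  if (value === null || value === undefined) {
    return "";
  }
  if (typeof value === "string") {
    return value;
  }
  if (typeof value === "number" || typeof value === "boolean") {
    return String(value);
  }
  if (Array.isArray(value)) {
    // Assumed behaviour: project each item and join the non-empty results.
    return value
      .map((item) => renderSafeText(item))
      .filter((text) => text !== "")
      .join(", ");
  }
  if (typeof value === "object") {
    const record = value as Record<string, unknown>;
    for (const field of PREFERRED_FIELDS) {
      const candidate = record[field];
      if (typeof candidate === "string" && candidate.trim() !== "") {
        return candidate;
      }
    }
  }
  // Assumed fallback: render nothing rather than "[object Object]".
  return "";
}
```

With a prop typed as a readonly array of `unknown`, a list render would call `renderSafeText(u)` on each entry instead of interpolating `{u}` directly, which is the pattern that caused the crash described in this entry.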
User impact: The /score/paste page no longer crashes when the AgentProof Intelligence layer returns rich object-shaped most important unknowns. Each unknown renders as its summary, with why it matters as supporting copy. Every other Phase 1G-S26A surface (focused review wizard, accurate question count, true workspace gating, report auto-switch, legacy-checklist hiding) is preserved.
When to re-score: Hard runtime crash fix only. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Evidence trace: the new Phase 1G-S26B evidence record plus the new render safe text pure helper plus the patched agent review wizard top unknowns render plus the corrected most important unknowns cast in json paste score card plus the new Phase 1G-S26B unit tests plus the updated build-identity endpoint constants plus the package version bump from 0.134.1 to 0.134.2 plus the methodology changelog 129th entry plus the README Phase 1G-S26B row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S26B internal methodology record (critical React object render crash fix)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-09
- Reference
- Phase 1G-S26B founder execution signal: live test of Phase 1G-S26A crashed immediately on the /score/paste page with the React error Objects are not valid as a React child (found: object with keys id, summary, why_it_matters). The error boundary rendered Something went wrong on this page.
- Impact assessment
- Bumps the package version, rewrites the build-identity endpoint to report the new phase identity, ships one new pure helper module (renderSafeText), patches the components/workspace/AgentReviewWizard.tsx top_unknowns render to project each entry through renderSafeText, replaces the bad ReadonlyArray-of-string cast in components/form/JsonPasteScoreCard.tsx with the honest ReadonlyArray-of-unknown, and adds Phase 1G-S26B unit tests. Tokens stay server-side. No business records are read on the product path. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. Null Provider remains default. The packaging safety helper from Phase 1G-S21A is preserved. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must run the local endpoint, sign in with Microsoft, click Review Agent, and confirm the page loads to the focused review wizard without hitting the error boundary.
No external source was reviewed for this entry.
- Re-score optional
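The crash recorded in this entry comes from passing a plain object to React as a child. The following is a minimal sketch of how a helper like renderSafeText might project an unknown entry to a string. The real helper is not reproduced in this record, so the field preference order (summary, then why_it_matters, then id) and the fallback strings are assumptions made for illustration only.

```typescript
// Hypothetical sketch: the actual renderSafeText implementation is not shown in this record.
// Projects any value to a string so it can be used as a React child without throwing.
function renderSafeText(value: unknown): string {
  if (value === null || value === undefined) return "";
  if (typeof value === "string") return value;
  if (typeof value === "number" || typeof value === "boolean") return String(value);
  if (typeof value === "object") {
    const record = value as Record<string, unknown>;
    // Assumed preference order, based on the keys named in the error message.
    const preferredKeys = ["summary", "why_it_matters", "id"];
    for (const key of preferredKeys) {
      const candidate = record[key];
      if (typeof candidate === "string" && candidate.length > 0) return candidate;
    }
    try {
      return JSON.stringify(value);
    } catch {
      // Circular structures cannot be serialised; return a fixed marker instead.
      return "[unrenderable value]";
    }
  }
  return String(value);
}

// Treat the list as ReadonlyArray<unknown> rather than casting it to string[].
const topUnknowns: ReadonlyArray<unknown> = [
  "Owner not confirmed",
  { id: "u1", summary: "No monitoring evidence", why_it_matters: "Drift goes unnoticed" },
];
const rendered: string[] = topUnknowns.map(renderSafeText);
```

Typing the list as ReadonlyArray of unknown forces every entry through the projection, which is the point of replacing the string cast: the compiler no longer accepts an object where a renderable string is expected.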
Change id: 1G-S26C; Change date: 2026-05-09; Product version: 0.167.1; Methodology engine version: 0.9.1
Phase 1G-S26C - report workspace cleanup, tabbed report, and legacy demo removal
Reason: The founder live-tested Phase 1G-S26B and confirmed the React crash was fixed but rejected the report UX. Direct critique: "It's working now and the layout is better, but these questions appear after I generated the report, so I don't understand why they are there as they play no role. In addition, the final report page is still a single page. It's still too messy." The founder also caught the legacy fictional demo panel ("Try the fictional demo", "Run this demo", "Fictional customer-support agent") still visible inside the buyer journey. Phase 1G-S26C is the focused report workspace cleanup.
What changed: The package version was bumped from 0.134.2 to 0.134.3. The build-identity endpoint now reports product version 0.134.3, phase id 1G-S26C, the new phase name, and the new expected archive name. Backwards-compatible alias exports for Phase 1G-S26B, Phase 1G-S26A, and Phase 1G-S26 resolve to the Phase 1G-S26C values.
One new component ships (an internal source file): a tabbed report wrapper with six documented tabs (Summary, Actions, Risks, Evidence, Stakeholders, Trace). It renders only the active tab body and shows a generated-from-answers banner above the tab nav, carrying the documented timestamp and an Update answers and regenerate button that switches the workspace back to Agent review without discarding the buyer's existing answers. The previously file-local sub-components in an internal source file are exported so the tabs compose them per tab (readiness hero, Top5Priorities, executive summary, risk heatmap, evidence coverage map, stakeholder view tabs, evidence story, buyer confirmations panel, all findings collapsed, advanced evidence collapsed, report actions, next best action card, comparison view, version trace card, buyer evidence notes panel). The legacy single-page agent readiness dashboard rendering in an internal source file is replaced by the tabbed wrapper.
- Summary tab: readiness hero, next best action card, Top5Priorities, executive summary, and report actions.
- Actions tab: Top5Priorities and next best action card.
- Risks tab: risk heatmap.
- Evidence tab: evidence coverage map, evidence story, buyer confirmations panel, buyer evidence notes panel, and all findings collapsed (full ledger collapsed by default).
- Stakeholders tab: stakeholder view tabs.
- Trace tab: version trace card, comparison view, advanced evidence collapsed (raw Markdown collapsed), and an Internal test harness disclosure.
The Founder diagnostics section that exposed Try the fictional demo, Run this demo, and Fictional customer-support agent inside the buyer journey is now hidden via the hidden attribute on every workspace tab except Evidence, so the buyer can never see it on the primary product surface; the disclosure toggle defaults to closed even on Evidence. The change is in-session only: no localStorage, sessionStorage, IndexedDB, cookie, Supabase, or disk write. The Phase 1G-S26B render safe text helper, the Phase 1G-S26A wizard, the Phase 1G-S26 workspace gating and one-click review, the Phase 1G-S25 estate helpers, the Phase 1G-S24 question deduplicator, the score input builder, and the dynamic question generator are unchanged. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61.
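The tab layout described above can be expressed as a small lookup table, which makes "render only the active tab body" a single indexed read. This is a sketch under stated assumptions: the identifiers (ReportTab, TAB_SECTIONS, sectionsForActiveTab) are hypothetical and are not taken from the AgentProof source; only the tab names and section lists come from this entry.

```typescript
// Hypothetical sketch of the six-tab layout; identifiers are illustrative only.
type ReportTab = "summary" | "actions" | "risks" | "evidence" | "stakeholders" | "trace";

// Tab order as documented for Phase 1G-S26C.
const TAB_ORDER: ReadonlyArray<ReportTab> = [
  "summary", "actions", "risks", "evidence", "stakeholders", "trace",
];

// Which report sections each tab composes, as listed in this entry.
const TAB_SECTIONS: Readonly<Record<ReportTab, ReadonlyArray<string>>> = {
  summary: [
    "readiness hero", "next best action card", "Top5Priorities",
    "executive summary", "report actions",
  ],
  actions: ["Top5Priorities", "next best action card"],
  risks: ["risk heatmap"],
  evidence: [
    "evidence coverage map", "evidence story", "buyer confirmations panel",
    "buyer evidence notes panel", "all findings collapsed",
  ],
  stakeholders: ["stakeholder view tabs"],
  trace: [
    "version trace card", "comparison view", "advanced evidence collapsed",
    "internal test harness disclosure",
  ],
};

// Only the active tab contributes sections; inactive tabs render nothing.
function sectionsForActiveTab(active: ReportTab): ReadonlyArray<string> {
  return TAB_SECTIONS[active];
}
```

Keeping the mapping as data rather than as nested conditionals means a unit test can assert the tab contents directly, without mounting any component.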
User impact: After clicking Generate readiness report the buyer lands on the focused Summary tab with the readiness verdict, the next best action, the Top 5 preview, the executive summary, and the report actions. No active question form appears in the report workspace. Drilling into Actions or Risks or Evidence or Stakeholders or Trace requires deliberate clicks. Full evidence ledgers, raw Markdown, and founder diagnostics live behind disclosures in the Evidence and Trace tabs, never on the Summary. The fictional demo panel never appears in the buyer journey. A documented Update answers and regenerate button switches back to Agent review without discarding answers so the buyer can change assumptions and regenerate.
When to re-score: Buyer-facing report UX cleanup. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Lib, Components, Tests, Content, Methodology
Evidence trace: the new Phase 1G-S26C evidence record, the new component (an internal source file), the exported sub-components in an internal source file, the rewired readiness-report block in an internal source file, the founder-diagnostics workspace-evidence-only gate, the new Phase 1G-S26C unit tests, the updated build-identity endpoint constants, the package version bump from 0.134.2 to 0.134.3, the methodology changelog 130th entry, and the README Phase 1G-S26C row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S26C internal methodology record (report workspace cleanup, tabbed report, and legacy demo removal)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-09
- Reference
- Phase 1G-S26C founder execution signal: live test of Phase 1G-S26B confirmed the React crash was fixed but rejected the report UX (post-report questions still visible, report still one long page, fictional demo panel still in buyer journey).
- Impact assessment
- Bumps the package version, rewrites the build-identity endpoint to report the new phase identity, ships one new component (AgentReadinessReportTabs), exports fifteen sub-components from AgentReadinessDashboard, rewires the readiness-report block in components/form/JsonPasteScoreCard.tsx to use the tabbed wrapper, gates the founder-diagnostics section to the Evidence workspace only, and adds Phase 1G-S26C unit tests. Tokens stay server-side. No business records are read on the product path. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains the default. The packaging safety helper from Phase 1G-S21A is preserved. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must run the local endpoint, sign in with Microsoft, click Review Agent, generate a readiness report, and confirm the report opens on the Summary tab with no active question form, that the legacy demo is not visible anywhere, and that Update answers and regenerate switches back to Agent review preserving answers.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S26D; Change date: 2026-05-09; Product version: 0.167.1; Methodology engine version: 0.9.1
Phase 1G-S26D - live report pipeline, unified intelligence workspace, and buyer UX repair
Reason: The founder live-tested Phase 1G-S26C and confirmed the tabs render but the report itself was blank after Generate readiness report. Direct critique: "After clicking Generate report I see nothing - no visible report, no intelligence, no joined-up result, old dashboards and old report surfaces are not linked to the new Microsoft discovery data, the product feels disjointed, messy, and not visually pleasing." Root cause: the readiness report block was nested inside the legacy yes/no checklist <div data-test-id=json-paste-connector-footprint-confirmations>, which receives class name=...hidden whenever the modern dynamic confirmation questions path is active. After generation the workspace switched to report, but the report itself remained inside a hidden container and the buyer saw nothing.
What changed: The package version was bumped from 0.134.3 to 0.134.4. The build-identity endpoint now reports product version 0.134.4, phase id 1G-S26D, the new phase name, and the new expected archive name. Backwards-compatible alias exports for Phase 1G-S26C, S26B, S26A, and S26 resolve to the Phase 1G-S26D values.
Two new lib modules ship. The first (an internal source file) is the unified live report view model: it wraps the existing readiness view model with the selected Microsoft env id, agent id, classification, footprint summary, and buyer confirmation summary, plus a documented source binding summary that explicitly identifies which fields came from Microsoft discovery, footprint discovery, buyer answers, scoring/intelligence logic, or unknown. The second (an internal source file) is the thin S26D evidence-bound intelligence layer, which produces five sections (agent interpretation, at least three risk scenarios, at least five practical controls, evidence quality separated from readiness, and an improvement projection that never mutates the real score) plus a methodology caveat.
Two new components ship: one (an internal source file) renders the new Intelligence tab content, and the other (an internal source file) renders the source binding (preview on Summary, full breakdown on Evidence). The tabbed report wrapper (an internal source file) is extended from six to seven tabs by inserting Intelligence between Summary and Actions. The readiness report block in an internal source file is promoted from a child of the legacy yes/no checklist <div> to a sibling, so the gate is just microsoft readiness report.status === ok && workspace section === report. The live view model and intelligence summary are built once and shared across the seven tabs.
The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. The legacy fictional demo and founder diagnostics remain gated to the Evidence workspace only (S26C invariant preserved).
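The two central ideas in this entry, the simplified visibility gate and the source binding summary, can be sketched as pure functions. This is an illustration under stated assumptions: the type names, field names, and status values below are hypothetical and are not taken from the AgentProof source; only the five source categories and the shape of the gate come from this entry.

```typescript
// Hypothetical sketch; type and field names are illustrative assumptions.
type FieldSource =
  | "microsoft_discovery"
  | "footprint_discovery"
  | "buyer_answers"
  | "scoring_logic"
  | "unknown";

interface ReportGateState {
  readinessReportStatus: "idle" | "loading" | "ok" | "error";
  workspaceSection: "agent_review" | "report" | "evidence";
}

// After S26D the report is a sibling of the legacy checklist container, so its
// visibility depends only on these two values, not on the checklist's hidden class.
function shouldShowReport(state: ReportGateState): boolean {
  return state.readinessReportStatus === "ok" && state.workspaceSection === "report";
}

// Source binding summary: count how many report fields came from each source.
function summariseSourceBinding(
  bindings: Readonly<Record<string, FieldSource>>,
): Record<FieldSource, number> {
  const counts: Record<FieldSource, number> = {
    microsoft_discovery: 0,
    footprint_discovery: 0,
    buyer_answers: 0,
    scoring_logic: 0,
    unknown: 0,
  };
  for (const field of Object.keys(bindings)) {
    const source = bindings[field];
    if (source !== undefined) counts[source] += 1;
  }
  return counts;
}
```

For example, a report whose agent name came from Microsoft discovery and whose owner field is unproven would yield a count of one under microsoft_discovery and one under unknown, which is the kind of breakdown the Evidence tab is described as showing.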
User impact: After clicking Generate readiness report the buyer immediately sees a visible, joined-up, intelligent report. The Summary tab shows the readiness verdict, the next best action, the Top 5 preview, the executive summary, a source-binding preview that proves the report uses the live Microsoft agent data, and the report actions. The new Intelligence tab shows agent interpretation (likely archetype, why, confidence, evidence basis, what would change), at least three evidence-bound risk scenarios, at least five specific practical controls, evidence quality separated from readiness, and a cautious improvement projection that never mutates the real score. The full source binding breakdown lives on the Evidence tab so the buyer never wonders where a number came from. Drilling into Actions or Risks or Evidence or Stakeholders or Trace requires deliberate clicks. Update answers and regenerate switches back to Agent review preserving answers. The fictional demo panel never appears in the buyer journey.
When to re-score: Buyer-facing report rendering and intelligence presentation. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Lib, Components, Tests, Content, Methodology
Evidence trace: the new Phase 1G-S26D evidence record, the two new modules (each an internal source file), the two new components (each an internal source file), the seven-tab agent readiness report tabs.tsx, the rewired readiness-report sibling block in an internal source file, the new Phase 1G-S26D unit tests, the updated build-identity endpoint constants, the package version bump from 0.134.3 to 0.134.4, and the methodology changelog 131st entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S26D internal methodology record (live report pipeline, unified intelligence workspace, and buyer UX repair)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-09
- Reference
- Phase 1G-S26D founder execution signal: live test of Phase 1G-S26C confirmed the tabs render but the report itself was blank after Generate readiness report; the founder also reported the product felt disjointed and not visually pleasing.
- Impact assessment
- Bumps the package version, rewrites the build-identity endpoint, ships two new lib modules and two new components, extends the report tabs from six to seven, and promotes the readiness-report block from a child of the legacy hidden checklist container to a sibling so the buyer sees a visible joined-up report immediately. The Intelligence tab demonstrates AgentProof intelligence with evidence-bound interpretation, risk scenarios, practical controls, evidence quality, and a cautious improvement projection. Tokens stay server-side. No business records are read on the product path. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains the default. The packaging safety helper from Phase 1G-S21A is preserved. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must run the local app at http://localhost:3000/score/paste, sign in with Microsoft, click Review Agent, generate a readiness report, and confirm the report renders immediately on the Summary tab, the Intelligence tab is present and shows agent interpretation plus risk scenarios plus practical controls plus evidence quality plus improvement projection, the report uses the selected real Microsoft agent name and environment, no fictional demo or founder diagnostics are visible in the normal buyer path, Update answers and regenerate returns to Agent review preserving answers, and changing one answer regenerates the report cleanly.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S26E; Change date: 2026-05-09; Product version: 0.167.1; Methodology engine version: 0.9.1
Phase 1G-S26E - secret purge, buyer-path cleanup, and report fallback hardening
Reason: The founder inspected the Phase 1G-S26D archive directly and found three blocking failures. (1) The archive contained env.download, a secret-bearing file under a non-standard filename, holding real Microsoft environment values. Filename-only packaging checks let it through. (2) The legacy fictional demo and founder-diagnostics strings (Founder diagnostics - legacy and developer test inputs, Try the fictional demo, Run this demo, Fictional customer-support agent, Loads the fictional sample below) still rendered when the buyer switched to the Evidence workspace tab; the Evidence workspace is part of the buyer-facing product surface. (3) The S26D readiness-report block degraded to blank UI when any deep envelope field was missing. None of these are acceptable.
What changed: The package version was bumped from 0.134.4 to 0.134.5. The build-identity endpoint now reports product version 0.134.5, phase id 1G-S26E, the new phase name, and the new expected archive name. Backwards-compatible alias exports for Phase 1G-S26D, S26C, S26B, S26A, and S26 resolve to the Phase 1G-S26E values.
The archive build script (scripts/build archive.cjs) ships content-level secret scanning: a documented forbidden file names list (env.download), a secret assignment pattern regex (*_SECRET, *_TOKEN, *_KEY, microsoft client secret, microsoft session secret, etc.), a placeholder regex allowlist (empty, your-...-here, placeholder, change-me, example), and a secret scan allowlist of files whose content is permitted to contain assignment lines (.env.example). The build refuses to produce an archive if any staged file outside the allowlist carries a non-placeholder secret assignment, and also refuses if the archive listing contains any forbidden file names entry.
An internal source file wraps the entire legacy founder-diagnostics outer section under a false-gated render, so React never renders the fictional demo or founder diagnostics strings on any workspace, including Evidence. All legacy json-paste-* test ids stay in source. The duplicate raw scorecard Markdown details block that sat directly below the agent readiness report tabs is removed; raw Markdown lives only inside the Trace tab advanced evidence collapsed disclosure (collapsed by default).
One new lib module ships (an internal source file): it produces a structurally complete live agent report view model with documented unknown markers when any of evidence model, capability profile, intelligence, component inventory, risk hypotheses, or dynamic confirmation questions is missing. The fallback report shell renders the Summary, Intelligence, Actions, Risks, Evidence, Stakeholders, and Trace tabs (the same seven tabs as the full path), so the buyer never sees blank UI.
The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. Tokens stay server-side.
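The content-level scan described above can be sketched as a pure function over staged files. This is a minimal illustration, not the contents of the build script: the regular expressions, the StagedFile shape, and the function names are assumptions, and the sketch only matches upper-case variable names ending in SECRET, TOKEN, or KEY.

```typescript
// Hypothetical sketch of content-level secret scanning; patterns are illustrative.
const FORBIDDEN_FILE_NAMES: ReadonlyArray<string> = ["env.download"];
const SECRET_SCAN_ALLOWLIST: ReadonlyArray<string> = [".env.example"];

// Matches lines such as MICROSOFT_CLIENT_SECRET=abc123 or API_TOKEN="xyz".
const SECRET_ASSIGNMENT_PATTERN =
  /^\s*(?:export\s+)?([A-Z0-9_]*(?:SECRET|TOKEN|KEY))\s*=\s*(.*)$/;

// Values treated as placeholders rather than real secrets (empty value included).
const PLACEHOLDER_RE = /^(?:|your-.*-here|placeholder|change-me|example)$/i;

interface StagedFile {
  path: string;
  content: string;
}

function baseName(path: string): string {
  const parts = path.split("/");
  return parts[parts.length - 1] ?? path;
}

// Returns one message per violation; an empty array means the archive may be built.
function findSecretViolations(files: ReadonlyArray<StagedFile>): string[] {
  const violations: string[] = [];
  for (const file of files) {
    const name = baseName(file.path);
    if (FORBIDDEN_FILE_NAMES.indexOf(name) !== -1) {
      violations.push(`${file.path}: forbidden file name`);
      continue;
    }
    if (SECRET_SCAN_ALLOWLIST.indexOf(name) !== -1) continue;
    for (const line of file.content.split(/\r?\n/)) {
      const match = SECRET_ASSIGNMENT_PATTERN.exec(line);
      if (!match) continue;
      // Strip one pair of surrounding quotes before testing for a placeholder.
      const value = (match[2] ?? "").trim().replace(/^["']|["']$/g, "");
      if (!PLACEHOLDER_RE.test(value)) {
        violations.push(`${file.path}: non-placeholder value for ${match[1]}`);
      }
    }
  }
  return violations;
}
```

A build script following this shape would call findSecretViolations on the staged tree and refuse to write the archive when the result is non-empty. Scanning content as well as names is what closes the gap that let a secret-bearing file through under a non-standard filename.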
User impact: The S26E archive is secret-clean by filename AND by content. env.download is gone. No real secret values are packaged under any filename. The buyer never sees Try the fictional demo, Run this demo, Fictional customer-support agent, Loads the fictional sample below, or Founder diagnostics - legacy and developer test inputs anywhere in the normal buyer path. The Evidence workspace tab is clean. The duplicate raw Markdown block below the tabbed report is removed; raw Markdown lives only in the Trace tab, collapsed by default. After clicking Generate readiness report the buyer always sees a visible joined-up report immediately - even if the deep footprint envelope did not fully populate, the fallback shell renders controlled unknowns instead of blank UI.
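The fallback behaviour can be illustrated with a small builder that always returns a structurally complete view model. This is a sketch under stated assumptions: the key names, the FallbackField shape, and the function name are hypothetical; only the six envelope parts and the seven tab names come from this entry.

```typescript
// Hypothetical sketch; key names mirror the prose but are not the real view model.
type EnvelopeKey =
  | "evidenceModel"
  | "capabilityProfile"
  | "intelligence"
  | "componentInventory"
  | "riskHypotheses"
  | "dynamicConfirmationQuestions";

type PartialEnvelope = Partial<Record<EnvelopeKey, unknown>>;

interface FallbackField {
  status: "present" | "unknown";
  value: unknown;
}

const ENVELOPE_KEYS: ReadonlyArray<EnvelopeKey> = [
  "evidenceModel",
  "capabilityProfile",
  "intelligence",
  "componentInventory",
  "riskHypotheses",
  "dynamicConfirmationQuestions",
];

const REPORT_TABS = [
  "Summary", "Intelligence", "Actions", "Risks", "Evidence", "Stakeholders", "Trace",
] as const;

// Always returns all seven tabs and all six fields; missing parts become
// controlled "unknown" markers instead of leaving the UI blank.
function buildSafeFallbackViewModel(envelope: PartialEnvelope | null | undefined) {
  const fields = {} as Record<EnvelopeKey, FallbackField>;
  for (const key of ENVELOPE_KEYS) {
    const value = envelope ? envelope[key] : undefined;
    fields[key] =
      value === undefined || value === null
        ? { status: "unknown", value: null }
        : { status: "present", value };
  }
  return { tabs: REPORT_TABS, fields };
}
```

Because every field is present with an explicit status, the rendering layer can branch on status rather than guarding each deep property access.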
When to re-score: Packaging safety + buyer UX cleanup + fallback hardening. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Scripts, Lib, Components, Tests, Content, Methodology
Evidence trace: the new Phase 1G-S26E evidence record, the new module (an internal source file), the new content-level secret scanner in scripts/build archive.cjs, the false-gated legacy founder-diagnostics block, the removed duplicate raw Markdown details, the wired fallback report shell, the new Phase 1G-S26E unit tests, the updated build-identity endpoint constants, the package version bump from 0.134.4 to 0.134.5, and the methodology changelog 132nd entry.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S26E internal methodology record (secret purge, buyer-path cleanup, and report fallback hardening)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-09
- Reference
- Phase 1G-S26E founder execution signal: direct archive inspection of S26D found env.download with real Microsoft secrets, legacy fictional demo still visible on Evidence workspace tab, and a fragile report path that could go blank when deep envelope fields were missing.
- Impact assessment
- Bumps the package version, rewrites the build-identity endpoint, ships a content-level secret scanner in the archive build script, wraps the entire legacy founder-diagnostics block under a false-gate so React never renders the fictional demo or founder diagnostics strings to the buyer, removes the duplicate raw Markdown details below the tabbed report, and adds a safe fallback live view model plus fallback report shell that renders the documented seven tabs with controlled unknown markers when the deep envelope is incomplete. Tokens stay server-side. No business records are read on the product path. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains the default. The packaging safety helper from Phase 1G-S21A is preserved and extended with FORBIDDEN_FILE_NAMES plus content-level scanning. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The founder must run the local app at http://localhost:3000/score/paste, sign in with Microsoft, click Review Agent, generate a readiness report, and confirm the report appears immediately on the Summary tab, the Intelligence tab shows agent interpretation plus risk scenarios plus practical controls plus evidence quality plus improvement projection, the report uses the selected real Microsoft agent name and environment, controlled unknowns appear where evidence is missing, no active question cards appear under or around the report, no fictional demo or founder diagnostics are visible anywhere on Summary / Intelligence / Actions / Risks / Evidence / Stakeholders / Trace, raw Markdown is not below the report and is only inside Trace (collapsed), Update answers and regenerate returns to Agent review preserving answers, and changing one answer regenerates the report cleanly.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S21; Change date: 2026-05-08; Product version: 0.167.1; Methodology engine version: 0.9.1
Phase 1G-S21 - real footprint discovery, confirmation, and report handoff
Reason: Founder live-tested Phase 1G-S20. Microsoft sign-in works, Power Platform environment listing works, the Dataverse plus Global Discovery layer works enough to list real agents, the agent dropdown populates, and the Phase 1G-S20 grouped picker splits agents into Custom, Microsoft / system, and Unknown. Founder went to bed and asked for a substantial product-forward slice that takes a classified agent through real footprint discovery to a readiness report so there is plenty to test in the morning.
What changed: Phase 1G-S21 ships the real AgentProof value path on top of the Phase 1G-S20 grouped picker.
A new pure helper (an internal source file) exposes map agent footprint, merge confirmation answers, and build confirmation questions. The mapper takes the discovered agent record, its botcomponents normalised view, the Phase 1G-S20 classification, and the Power Platform environment snapshot, and produces the documented Phase 1G-S21 footprint shape: identity (agent id, display name, environment, classification, status, description), topics, knowledge sources, actions, integrations, security and access, human oversight, testing and monitoring, plus an unknowns list for every section that metadata could not prove. The mapper also returns an evidence summary (bot record found, botcomponents count, component type counts, metadata sources), the documented 10-question Yes / No / Not sure confirmation set, and a score input draft that the score paste page hands to the existing /api/score endpoint.
The Microsoft footprint route is rewritten to validate the (environment, agent) selection against the session-discovered list, resolve the Dataverse URL via the Phase 1G-S16 metadata resolver, acquire a Dataverse token, re-fetch the bot record (so the mapper sees the freshest ismanaged and publisher signals), call the Phase 1G-S2 discover agent components helper for botcomponents, and return the documented Phase 1G-S21 success or non-success shape (footprint agent not found, footprint dataverse access blocked, footprint components not available, footprint no components found, footprint discovery failed).
The score paste page is extended with three new buyer-friendly panels: (1) a count summary card showing agent name, classification, environment, and the topics / knowledge / actions / integrations / botcomponents / unknowns counts, plus a collapsed Evidence details disclosure listing component type counts and metadata sources; (2) a confirmation checklist of 10 documented Yes / No / Not sure questions; and (3) a Generate readiness report button that posts the score input draft merged with the buyer answers to /api/score and renders the deterministic scorecard Markdown returned. Footprint discovery stays metadata-only: only the documented bot and botcomponent tables are read; tokens, secrets, and business records never appear in any response.
The build-identity endpoint constants are updated to product version 0.129.0 and phase id 1G-S21. Backwards-compatible phase 1 g s8 through phase 1 g s20 alias exports resolve to the Phase 1G-S21 values. A Phase 1G-S21 evidence file is added under the content tree, along with Phase 1G-S21 unit tests in six groups. The package version was bumped from 0.128.0 to 0.129.0, and the current methodology summary product version was bumped to 0.129.0. The methodology changelog now carries 119 entries with change id 1G-S21 last (source type internal methodology, re score recommended false). The score endpoint is now intentionally consumed through its public agent inputs contract for the readiness handoff; the score engine itself is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice for the existing fixture. NullProvider remains the default.
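The handoff step, merging the buyer's Yes / No / Not sure answers into the score input draft before posting to /api/score, can be sketched as a pure function. This is an illustration under stated assumptions: the draft shape, the question ids, and the rule that an unanswered question is treated as Not sure are all hypothetical and are not confirmed by this entry.

```typescript
// Hypothetical sketch; the draft shape and question ids are illustrative assumptions.
type ConfirmationAnswer = "yes" | "no" | "not_sure";

interface ScoreInputDraft {
  agentName: string;
  confirmations: Record<string, ConfirmationAnswer>;
  unknowns: string[];
}

// Pure merge: returns a new draft and never mutates the one passed in.
function mergeConfirmationAnswers(
  draft: Readonly<ScoreInputDraft>,
  questionIds: ReadonlyArray<string>,
  answers: Readonly<Record<string, ConfirmationAnswer | undefined>>,
): ScoreInputDraft {
  const confirmations: Record<string, ConfirmationAnswer> = { ...draft.confirmations };
  const unknowns = [...draft.unknowns];
  for (const id of questionIds) {
    // Assumption: an unanswered question is recorded as Not sure.
    const answer: ConfirmationAnswer = answers[id] ?? "not_sure";
    confirmations[id] = answer;
    if (answer === "not_sure" && unknowns.indexOf(id) === -1) {
      unknowns.push(id);
    }
  }
  return { ...draft, confirmations, unknowns };
}

const baseDraft: ScoreInputDraft = {
  agentName: "Example agent",
  confirmations: {},
  unknowns: ["knowledge_sources"],
};
const mergedDraft = mergeConfirmationAnswers(
  baseDraft,
  ["named_business_owner", "usage_monitored"],
  { named_business_owner: "yes" },
);
```

Keeping the merge pure means the same draft can be re-merged with changed answers, which fits the later "change one answer and regenerate" flow without any hidden state.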
User impact: When the buyer picks a classified agent and clicks Discover agent footprint, AgentProof now reads only the documented bot and botcomponent metadata for that agent, then renders a count summary (agent name, classification, environment, topics, knowledge sources, actions, integrations, botcomponents, unknowns), a collapsed Evidence details block (component type counts plus metadata sources), and a confirmation checklist of 10 plain-English Yes / No / Not sure questions for the items Microsoft metadata cannot prove (live with real users, takes actions outside chat, accesses confidential or regulated data, human approval required for sensitive actions, named business owner, tested before deployment, usage monitored, fallback or escalation path, could affect high-stakes decisions, purpose documented). After the buyer answers, a Generate readiness report button merges the answers into the score input draft and posts to /api/score, then renders the deterministic AgentProof scorecard Markdown. The buyer never types JSON. The buyer never types a Dataverse URL or environment URL or agent id. UI product approval still requires founder review; technical tests alone do not approve the UI. Internal milestones are not success - success means the product sells.
When to re-score: Phase 1G-S21 connects the existing agent inputs scoring endpoint to the discovered Microsoft Copilot Studio footprint via the new mapper. The score engine, weights, question bank, red flag rules, AI Act indicator rules, and deterministic Markdown renderer are unchanged. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the existing fixture.
Docs, Tests, Content, Methodology, Components, Lib, Api, Primary buyer surface
Evidence trace: the new Phase 1G-S21 content evidence file, the new Microsoft agent footprint mapper helper module under the connector library, the rewritten footprint route, the extended score paste page (count summary, confirmation checklist, scoring handoff button, readiness report render), the new Phase 1G-S21 unit test groups, the updated build-identity endpoint constants for Phase 1G-S21, the package version bump from 0.128.0 to 0.129.0, the methodology changelog 119th entry, and the README Phase 1G-S21 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S21 internal methodology record (real footprint discovery plus confirmation plus report handoff)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-08
- Reference
- Phase 1G-S21 founder execution signal: founder live-tested Phase 1G-S20, accepted the grouped picker as good enough to move forward, and explicitly asked for a substantial product-forward slice while sleeping so there is plenty to test in the morning.
- Impact assessment
- Adds a pure footprint mapper plus a rewritten footprint route plus a confirmation-and-handoff UI that uses the existing AgentInputs contract. Tokens stay server-side. Only the documented bot and botcomponent metadata tables are read; no business records, no incidents, no contacts, no accounts, no conversations or transcripts. The Phase 1G-S2 through Phase 1G-S20 layers are all preserved. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the existing fixture. NullProvider remains default. UI product approval requires founder review; technical tests alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. Phase 1G-S21 has not been live-tested end to end against a tenant where the readiness report renders the full Markdown for a real Copilot Studio agent. The buyer must run the local route after this slice to confirm the count summary, the confirmation checklist, and the readiness report all render against a live agent.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S21A; Change date: 2026-05-08; Product version: 0.167.1; Methodology engine version: 0.9.1
Phase 1G-S21A - secret env archive rebuild
Reason: An independent inspection of the Phase 1G-S21 founder archive confirmed it contained real secret env files (the local .env and the local .env.local) alongside the safe placeholder env example. Real local env files must never appear in any founder archive because they hold live secrets such as the Microsoft client secret and the Microsoft session secret. The founder routed the corrective instruction directly into the implementation channel and asked for a packaging-only correction with a regression test.
What changed: The package version was bumped from 0.129.0 to 0.129.1. The build-identity endpoint now reports product version 0.129.1, phase id 1G-S21A, the documented corrective phase name, and the corrective expected archive name. Backwards-compatible alias exports for Phase 1G-S8 through Phase 1G-S21 now resolve to the Phase 1G-S21A values. A new deterministic packaging helper script enumerates the documented forbidden top-level directories (node modules, the next build directory, the git directory, the coverage directory, the dist directory, the build directory), the documented forbidden secret env files (the local env, the local env local, the local env development, the local env development local, the local env production, the local env production local, the local env test, the local env test local, plus any other env-prefixed file that is not the env example placeholder), and the documented forbidden suffixes (the gzipped tarball suffix, the log suffix, the typescript build info suffix). The script stages the project to a clean temporary directory, copies every other file, defensively sweeps the staged tree for any forbidden file that slipped through, creates the gzipped tarball at the requested output path, lists every entry that matches the env regular expression, and asserts the only env-named entry is the env example placeholder. A new packaging safety regression assertion pins the script enumerations, the helper purity, the build-identity values, the new evidence record shape, and the methodology changelog last entry. A new evidence record records the corrective slice. The methodology changelog now carries 120 entries with the corrective slice last (internal methodology source type, re score recommended false). The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. 
The Phase 1G-S21 footprint mapper, the Phase 1G-S21 footprint route, and the Phase 1G-S21 score paste page panels are unchanged. NullProvider remains the default. The connector-agnostic platform select still exposes all six categories. Manual environment URL entry, manual agent name or identifier entry, and manual Dataverse URL entry remain absent from every path.
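The exclusion rules for the packaging helper can be sketched as a path filter plus a final listing check. This is an illustration, not the helper script itself: the concrete names (node_modules, .next, .git, .tar.gz, .tsbuildinfo, and so on) are the conventional spellings assumed for the directories, files, and suffixes this entry describes in words, and the function names are hypothetical.

```typescript
// Hypothetical sketch of the exclusion rules; names are conventional spellings
// assumed for the items described in prose, not copied from the helper script.
const FORBIDDEN_TOP_LEVEL_DIRS: ReadonlyArray<string> = [
  "node_modules", ".next", ".git", "coverage", "dist", "build",
];
const FORBIDDEN_SUFFIXES: ReadonlyArray<string> = [".tar.gz", ".log", ".tsbuildinfo"];
const ENV_FILE_RE = /^\.env(\..+)?$/;
const ALLOWED_ENV_FILE = ".env.example";

function hasSuffix(text: string, suffix: string): boolean {
  return text.length >= suffix.length && text.slice(-suffix.length) === suffix;
}

// True when a project-relative path may be copied into the staging directory.
function isPackagable(relativePath: string): boolean {
  const segments = relativePath.split("/").filter((s) => s.length > 0);
  if (segments.length === 0) return false;
  const top = segments[0] as string;
  const base = segments[segments.length - 1] as string;
  if (FORBIDDEN_TOP_LEVEL_DIRS.indexOf(top) !== -1) return false;
  // Any env-prefixed file is excluded unless it is the placeholder example.
  if (ENV_FILE_RE.test(base) && base !== ALLOWED_ENV_FILE) return false;
  return !FORBIDDEN_SUFFIXES.some((suffix) => hasSuffix(base, suffix));
}

// Final check on the archive listing: the only env-named entry is the placeholder.
function envEntriesOtherThanExample(entries: ReadonlyArray<string>): string[] {
  return entries.filter((entry) => {
    const parts = entry.split("/");
    const base = parts[parts.length - 1] ?? entry;
    return ENV_FILE_RE.test(base) && base !== ALLOWED_ENV_FILE;
  });
}
```

Running the listing check after the archive is written, in addition to filtering during staging, gives two independent chances to catch a secret env file, which matches the "defensive sweep then assert" sequence described in this entry.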
User impact: Founders and reviewers who download the Phase 1G-S21A founder archive will not receive any real secret env values from the implementation environment. Only the placeholder env example with documented placeholder names is shipped. Local sign-in still works because the founder fills in their own local env or local env local from the placeholder env example after extraction.
When to re-score: Packaging-only correction. The deterministic scorecard Markdown SHA-256 is unchanged at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Packaging, Tests, Content, Methodology, Scripts
Evidence trace: the new Phase 1G-S21A evidence record plus the new packaging helper script plus the new packaging safety regression assertion plus the updated build-identity endpoint constants for Phase 1G-S21A plus the package version bump from 0.129.0 to 0.129.1 plus the methodology changelog 120th entry plus the README Phase 1G-S21A row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S21A internal methodology record (secret env archive rebuild)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-08
- Reference
- Phase 1G-S21A corrective execution signal: independent inspection of the Phase 1G-S21 archive confirmed real local env and local env local files were packaged. The founder routed the corrective instruction into the implementation channel and asked for a packaging-only correction with a regression assertion.
- Impact assessment
- Bumps the patch version, rewrites the build-identity endpoint to report the corrective phase identity, adds a deterministic packaging helper script that explicitly excludes secret env files (and node modules, the next build directory, the git directory, coverage, the dist directory, the build directory, gzipped tarballs, log files, and typescript build info), adds a packaging safety regression assertion, and rebuilds the founder archive. The Phase 1G-S21 product implementation (footprint mapper, footprint route, score paste page panels) is unchanged. Tokens stay server-side. Only metadata tables are read in the product path. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. Null Provider remains default. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. Phase 1G-S21A is a packaging-only correction; the underlying Phase 1G-S21 product surface still requires the founder to confirm end-to-end behaviour against a real tenant after extracting the new founder archive.
- Re-score optional
Change id: 1G-S21B
Change date: 2026-05-08
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S21B - footprint route Dataverse resolution parity
Reason: The founder live-tested Phase 1G-S21A. Microsoft sign-in worked, environment listing worked, agents listing worked, agents were grouped and classified, and the founder selected an environment named CRM667668 and an agent named Agent Fact Finder. Clicking Discover agent footprint appeared to do nothing in the UI. A direct browser console fetch proved the footprint endpoint returned HTTP 422 with status footprint dataverse url missing even though the agents endpoint had just listed agents for the same environment. Root cause: the footprint endpoint did not use the same Stage one and Stage two Dataverse resolution path the agents endpoint used. Phase 1G-S21B is a tight corrective slice that closes the resolution parity gap and adds a calm, visible buyer-friendly error panel directly below the Discover agent footprint button so the founder never again sees a click that appears to do nothing.
What changed: The package version was bumped from 0.129.1 to 0.129.2. The build-identity endpoint now reports product version 0.129.2, phase id 1G-S21B, the documented corrective phase name, and the corrective expected archive name. Backwards-compatible alias exports for Phase 1G-S8 through Phase 1G-S21A now resolve to the Phase 1G-S21B values. A new shared server-side Dataverse resolution helper module is the single source of truth for the agents endpoint and the footprint endpoint. The helper accepts the signed-in session refresh token, the discovered environment list, the selected environment id, an optional opaque confirmed candidate id, and an optional cached session-resolved Dataverse URL. It runs the per-environment metadata resolver first, falls back to Global Discovery with auto-match plus single-candidate plus confirmed-id support, returns a buyer-safe ok or fail envelope with diagnostic trace on every outcome, and never accepts a raw URL from the browser. The session store gained a per-environment resolved Dataverse URL cache plus get and set helpers. The agents endpoint now writes the resolved URL into the session cache after every successful resolution. The footprint endpoint now reads the cached URL first and falls back to the same shared helper, so the resolution outcome is identical to the agents endpoint outcome. Every footprint non-success response now carries diagnostic trace plus selected environment plus selected agent plus buyer-safe headline plus safe message plus buyer next action. The score paste page captures the structured footprint failure envelope and renders a calm visible error panel directly below the Discover agent footprint button with a buyer-safe headline (Could not discover the agent footprint), the safe message, the buyer next action, the selected environment plus agent context, the HTTP status, the safe error code, and a collapsed Advanced details for IT block that pretty-prints the diagnostic trace JSON when present. 
The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. The score endpoint contract is unchanged. The Phase 1G-S21 footprint mapper is unchanged. The packaging safety helper from Phase 1G-S21A is preserved and exercised by the build. Null Provider remains default. The connector-agnostic platform select still exposes all six categories. Manual environment URL entry, manual agent name or identifier entry, and manual Dataverse URL entry remain absent from every path.
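The resolution order described above (session cache, then per-environment metadata, then Global Discovery, with a trace on every outcome) can be sketched as follows. This is a simplified illustration under stated assumptions: the type names, the `resolveDataverseUrl` function, the trace strings, and the `dataverse_url_unresolved` code are hypothetical, and the two resolver callbacks are synchronous stand-ins for what would be asynchronous server-side calls.

```typescript
// Illustrative sketch only: all names are hypothetical, and the real resolvers
// are asynchronous network calls rather than the synchronous callbacks used here.
type ResolutionResult =
  | { ok: true; dataverseUrl: string; trace: string[] }
  | { ok: false; code: string; trace: string[] };

interface ResolutionInput {
  selectedEnvironmentId: string;
  cachedUrl?: string; // session-resolved URL written by an earlier successful resolution
  resolveFromMetadata: (environmentId: string) => string | undefined; // stage one
  resolveFromGlobalDiscovery: (environmentId: string) => string | undefined; // stage two
}

function resolveDataverseUrl(input: ResolutionInput): ResolutionResult {
  const trace: string[] = [];

  // 1. Prefer the URL already resolved for this environment in the session.
  if (input.cachedUrl) {
    trace.push("session-cache: hit");
    return { ok: true, dataverseUrl: input.cachedUrl, trace };
  }
  trace.push("session-cache: miss");

  // 2. Stage one: per-environment metadata.
  const fromMetadata = input.resolveFromMetadata(input.selectedEnvironmentId);
  if (fromMetadata) {
    trace.push("environment-metadata: hit");
    return { ok: true, dataverseUrl: fromMetadata, trace };
  }
  trace.push("environment-metadata: miss");

  // 3. Stage two: Global Discovery fallback.
  const fromDiscovery = input.resolveFromGlobalDiscovery(input.selectedEnvironmentId);
  if (fromDiscovery) {
    trace.push("global-discovery: hit");
    return { ok: true, dataverseUrl: fromDiscovery, trace };
  }
  trace.push("global-discovery: miss");

  // Every failure still carries the trace, so the UI can always show something.
  return { ok: false, code: "dataverse_url_unresolved", trace };
}
```

Because both endpoints would call the same function with the same inputs, they cannot disagree about whether an environment resolves, which is the parity property this slice is about. Note that the input has no field for a browser-supplied URL, matching the rule that a raw URL is never accepted from the browser.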
User impact: Founders and reviewers who select an environment whose snapshot does not directly carry a Dataverse URL (for example a Dynamics 365 CRM instance like CRM667668) can now click Discover agent footprint and see either the discovered footprint or a calm, visible buyer-safe error panel that names the failure, lists the selected environment and agent, and offers a clear next action. The previous silent 422 failure mode is closed. The agents endpoint behaviour for buyers is unchanged.
When to re-score: Live-route resolution parity correction plus visible UI error surfacing. The deterministic scorecard Markdown SHA-256 is unchanged at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Lib, Api, Components, Tests, Content, Methodology
Evidence trace: the new Phase 1G-S21B evidence record plus the new shared Dataverse resolution helper plus the rewritten agents endpoint plus the rewritten footprint endpoint plus the extended session store plus the score paste page footprint error panel plus the new Phase 1G-S21B unit tests plus the updated build-identity endpoint constants for Phase 1G-S21B plus the package version bump from 0.129.1 to 0.129.2 plus the methodology changelog 121st entry plus the README Phase 1G-S21B row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S21B internal methodology record (footprint route Dataverse resolution parity)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-08
- Reference
- Phase 1G-S21B founder execution signal: live test confirmed agents listed for environment CRM667668 (id 2f124127-1d37-e662-9dbb-c81c5b39b207) but Discover agent footprint for agent Agent Fact Finder (id f085633f-6227-f111-8341-6045bd0913f4) returned HTTP 422 with footprint dataverse url missing. The founder routed the corrective instruction directly into the implementation channel asking for a tight live-route parity and error-surfacing fix.
- Impact assessment
- Bumps the patch version, rewrites the build-identity endpoint to report the corrective phase identity, extracts a shared Dataverse resolution helper used by both the agents endpoint and the footprint endpoint, extends the in-memory session store with a per-environment resolved Dataverse URL cache, rewrites the footprint endpoint to use the cached URL or transparently re-run Global Discovery, returns a diagnostic trace on every footprint non-success outcome, and adds a calm visible error panel directly below the Discover agent footprint button. Tokens stay server-side. Only documented metadata tables are read in the product path. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. Null Provider remains default. The packaging safety helper from Phase 1G-S21A is preserved. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. Phase 1G-S21B has not been live-tested end to end against the same tenant where the previous failure occurred; the founder must run the local endpoint after this slice to confirm Discover agent footprint either renders the footprint or renders the visible error panel against environment CRM667668 and agent Agent Fact Finder.
- Re-score optional
Change id: 1G-S21C
Change date: 2026-05-08
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S21C - deep AgentProof intelligence and schema-valid readiness report
Reason: The founder live-tested Phase 1G-S21B. Microsoft sign-in worked, environment listing worked, agents listing worked, agent classification worked, footprint discovery reached the UI, and the generic confirmation flow rendered, but the readiness report handoff failed with HTTP 400 'Validation failed.' and the buyer experience felt shallow, generic, and identical for both custom and Microsoft / system agents. Founder feedback was unambiguous: the product must explore the discovered metadata properly, ask different questions for different agent types, surface evidence-led intelligence rather than counts, and stop calling the score endpoint before the input is schema-checked. Phase 1G-S21C is the product turning point. It replaces the metadata-counting experience with an evidence-led intelligence layer and fixes the validation failure with a shared score input builder.
What changed: The package version was bumped from 0.129.2 to 0.129.3. The build-identity endpoint now reports product version 0.129.3, phase id 1G-S21C, the documented phase name, and the corrective expected archive name. Backwards-compatible alias exports for Phase 1G-S8 through Phase 1G-S21B now resolve to the Phase 1G-S21C values. A new evidence model module turns the discovered bot record + botcomponents + classification + environment snapshot into a structured ledger of proven, inferred, and unknown findings across fourteen documented categories (identity, classification, topics, knowledge, actions, integrations, data access, security, human oversight, testing, monitoring, fallback, governance, lifecycle). Every finding has a stable id, a buyer-safe summary, a confidence, a source, and a risk relevance. A new component inventory helper deepens the analysis of botcomponent rows, extracting safe topic / connector / action / knowledge-source kinds and redacting any secret-like strings before they leave the safe surface. A new capability profile turns the evidence model into a small typed profile the UI uses for capability chips. A new risk hypothesis helper turns the capability profile and classification into cautious risk hypotheses, each citing the evidence ids that produced it. A new deterministic intelligence layer produces an intelligence summary (headline, plain-English summary, readiness interpretation, biggest concerns, strongest evidence, most important unknowns, recommended next steps, confidence statement) and a prioritised risks list. Every intelligent claim references at least one evidence id, unknown id, risk id, or confirmation answer id. Microsoft/system, custom, and unknown classifications produce materially different outputs under the same sparse metadata conditions. A new dynamic confirmation question builder replaces the generic 10-question set with agent-type-specific, evidence-led, priority-ordered questions. 
Microsoft/system agents are asked about adoption context (used as-is or customised, relied on for decisions, organisation reviewed). Custom agents are asked about ownership, live exposure, documented purpose, action scope, and knowledge sensitivity. Unknown agents are asked about ownership and provenance first. A new score input builder converts the deeper footprint plus buyer answers into a schema-valid agent inputs body and pre-validates it against the same Zod schema the score endpoint uses BEFORE any POST. The autonomy level enum mismatch that produced the Phase 1G-S21B HTTP 400 is closed. The score paste page captures buyer-safe and Founder validation issues and shows them in a calm panel; the request to the score endpoint is suppressed when validation fails. New buyer-facing panels render: AgentProof Intelligence, capability chips, Evidence Ledger, prioritised risks, dynamic confirmation checklist, and a buyer-safe validation errors panel. The score engine source is unchanged. The score endpoint contract is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. Null Provider remains default. The connector-agnostic platform select still exposes all six categories. Manual environment URL entry, manual agent name or identifier entry, and manual Dataverse URL entry remain absent from every path.
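The pre-validation gate described above can be sketched as a validator that runs before any request is sent, with the POST suppressed on failure. This is a minimal illustration and not the shipped builder: the entry says the real check uses the same Zod schema as the score endpoint, whereas this sketch swaps in a hand-written validator so that it stays self-contained. The field names, the three autonomy level values, and the function names are all hypothetical.

```typescript
// Illustrative sketch only. A hand-written check stands in for the shared Zod
// schema; field names and the allowed autonomy values are assumptions.
const AUTONOMY_LEVELS = ["low", "medium", "high"] as const;
type AutonomyLevel = (typeof AUTONOMY_LEVELS)[number];

interface AgentInputs {
  name: string;
  autonomyLevel: AutonomyLevel;
}

type ValidationResult =
  | { ok: true; body: AgentInputs }
  | { ok: false; issues: string[] };

function isAutonomyLevel(value: unknown): value is AutonomyLevel {
  return typeof value === "string" && (AUTONOMY_LEVELS as readonly string[]).includes(value);
}

/** Checks a candidate body and returns either a typed body or a list of issues. */
function validateAgentInputs(candidate: { name?: unknown; autonomyLevel?: unknown }): ValidationResult {
  const { name, autonomyLevel } = candidate;
  if (typeof name === "string" && name.trim().length > 0 && isAutonomyLevel(autonomyLevel)) {
    return { ok: true, body: { name: name.trim(), autonomyLevel } };
  }
  const issues: string[] = [];
  if (typeof name !== "string" || name.trim().length === 0) issues.push("name: required");
  if (!isAutonomyLevel(autonomyLevel)) issues.push("autonomyLevel: not an allowed value");
  return { ok: false, issues };
}

/** Only calls `post` when validation passes; otherwise returns the issues for display. */
function submitForScoring(
  candidate: { name?: unknown; autonomyLevel?: unknown },
  post: (body: AgentInputs) => void,
): ValidationResult {
  const result = validateAgentInputs(candidate);
  if (result.ok) post(result.body);
  return result;
}
```

Returning the issues instead of throwing lets the page render them in a panel, and keeping `post` behind the `ok` branch is what guarantees that an enum mismatch of the kind described above never reaches the endpoint.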
User impact: Founders and reviewers who run a footprint discovery now see an AgentProof Intelligence panel that reads as a calm interpretation, not a list of counts. Custom and Microsoft / system agents produce different headlines, different concerns, different next steps, and different confirmation questions. Unknown-classification agents are asked about ownership before anything else. The readiness report either renders or surfaces a calm validation error panel; it never fails silently with HTTP 400.
When to re-score: Live-route deepening + UI intelligence + score input pre-validation. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Lib, Api, Components, Tests, Content, Methodology
Evidence trace: the new Phase 1G-S21C evidence record plus the new evidence model module plus the new intelligence module plus the new score input builder plus the new dynamic question generator plus the extended footprint endpoint plus the new score paste page panels plus the new Phase 1G-S21C unit tests plus the updated build-identity endpoint constants for Phase 1G-S21C plus the package version bump from 0.129.2 to 0.129.3 plus the methodology changelog 122nd entry plus the README Phase 1G-S21C row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S21C internal methodology record (deep intelligence and schema-valid readiness report)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-08
- Reference
- Phase 1G-S21C founder execution signal: live test of Phase 1G-S21B confirmed the readiness handoff returned HTTP 400 with 'Validation failed.' and the buyer experience felt shallow and identical for custom and Microsoft / system agents. The founder routed the corrective instruction directly into the implementation channel asking for agent-type-specific intelligence, deeper evidence analysis, and a schema-valid readiness handoff.
- Impact assessment
- Bumps the patch version, rewrites the build-identity endpoint to report the corrective phase identity, ships four new pure helper modules (evidence model, intelligence layer, score input builder, dynamic question generator), extends the footprint endpoint to expose the deeper envelope, and adds six new buyer-facing panels (intelligence summary, capability chips, evidence ledger, prioritised risks, dynamic checklist, validation errors). Tokens stay server-side. Only documented metadata tables are read in the product path. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. Null Provider remains default. The packaging safety helper from Phase 1G-S21A is preserved. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. Phase 1G-S21C has not been live-tested end to end against the same tenant where the previous failures occurred; the founder must run the local endpoint after this slice to confirm the intelligence panels render, the dynamic questions are agent-type-specific, and the readiness report renders against a real Copilot Studio agent.
- Re-score optional
Change id: 1G-S22
Change date: 2026-05-08
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S22 - premium readiness intelligence report experience
Reason: The founder live-tested Phase 1G-S21C and confirmed the new intelligence layer was a real improvement, but the readiness report itself still felt boring and below the product standard. Founder feedback was unambiguous: the report must become vibrant, useful, insightful, colourful, and worth using repeatedly. A mandatory addendum then required a Top-5 iterative improvement experience so the buyer is never overwhelmed by a wall of findings. Phase 1G-S22 ships the premium readiness intelligence report experience that closes both gaps.
What changed: The package version was bumped from 0.129.3 to 0.130.0. The build-identity endpoint now reports product version 0.130.0, phase id 1G-S22, the documented phase name, and the corrective expected archive name. Backwards-compatible alias exports for Phase 1G-S8 through Phase 1G-S21C now resolve to the Phase 1G-S22 values. A new pure deterministic readiness report view-model builder turns the engine scorecard plus the AgentProof Intelligence layer plus the evidence model plus the capability profile plus the prioritised risks plus the dynamic confirmation answers into a documented premium report shape with a readiness section, an executive dashboard, an evidence coverage map across fourteen categories, a risk heatmap, an action plan, stakeholder views, an evidence story, and a buyer confirmations panel. A new pure top-5 priority planner ranks candidates by severity, confidence, business impact, and whether the item blocks the readiness verdict, then returns at most five priorities plus a later-items list and an improvement path. A new richer Markdown renderer produces the buyer-facing report body for download and copy actions, anchored on the same view model. A new premium dashboard component renders the readiness hero, the top-five priorities panel directly under the hero, the executive summary, the risk heatmap, the evidence coverage map, the stakeholder view tabs, the evidence story columns, the buyer confirmations panel, the full evidence ledger collapsed by default, the later items collapsed by default, the advanced details for IT collapsed by default, and the report actions row (copy executive summary, copy action plan, copy full Markdown, download Markdown, print, review another agent). The score paste page captures the engine score numbers and renders the dashboard above the legacy Markdown view. The Phase 1G-S21C intelligence layer, evidence model, dynamic question generator, and schema-valid score input builder are unchanged. 
The score engine source is unchanged. The score endpoint contract is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. Null Provider remains default. The connector-agnostic platform select still exposes all six categories. Manual environment URL entry, manual agent name or identifier entry, and manual Dataverse URL entry remain absent from every path. The Phase 1G-S21A packaging safety helper is preserved.
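The top-5 priority planner described above can be sketched as a deterministic sort followed by a split. This is an illustration under stated assumptions: the entry names the four ranking signals but not how they are combined, so the precedence used here (blockers first, then severity, business impact, confidence, with the id as a final tie-break) is an assumption, as are the type and function names and the numeric scales.

```typescript
// Illustrative sketch only. The precedence of the ranking signals and every
// name below are assumptions; the entry lists the signals without weights.
interface PriorityCandidate {
  id: string;
  severity: number; // higher is more severe
  confidence: number; // higher is more certain
  businessImpact: number; // higher is more impactful
  blocksVerdict: boolean; // true when the item blocks the readiness verdict
}

interface PriorityPlan {
  priorities: PriorityCandidate[]; // at most five
  later: PriorityCandidate[]; // everything else, kept for collapsed sections
}

function planTopFive(candidates: PriorityCandidate[]): PriorityPlan {
  // Copy before sorting so the helper stays pure.
  const ranked = [...candidates].sort(
    (a, b) =>
      Number(b.blocksVerdict) - Number(a.blocksVerdict) ||
      b.severity - a.severity ||
      b.businessImpact - a.businessImpact ||
      b.confidence - a.confidence ||
      (a.id < b.id ? -1 : a.id > b.id ? 1 : 0), // stable, locale-independent tie-break
  );
  return { priorities: ranked.slice(0, 5), later: ranked.slice(5) };
}
```

The explicit id tie-break matters for a product that promises deterministic output: without it, two candidates with identical signals could swap places depending on input order.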
User impact: Founders and reviewers who run a footprint discovery and generate a readiness report now see a calm, premium dashboard with a coloured readiness hero, a Top-5 priorities panel that explains the iterative improvement model, an executive summary, a risk heatmap, an evidence coverage map, stakeholder view tabs, an evidence story, and report actions (copy executive summary, copy action plan, copy full Markdown, download Markdown report, print, review another agent). The buyer is never shown a wall of findings; everything beyond the Top-5 priorities is available behind collapsed sections.
When to re-score: Buyer-facing report experience plus iterative top-5 plus richer Markdown plus premium dashboard component. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Lib, Components, Tests, Content, Methodology
Evidence trace: the new Phase 1G-S22 evidence record plus the new readiness report view-model module plus the new top-five priority planner module plus the new readiness Markdown renderer plus the new premium dashboard component plus the extended score paste page plus the new Phase 1G-S22 unit tests plus the updated build-identity endpoint constants for Phase 1G-S22 plus the package version bump from 0.129.3 to 0.130.0 plus the methodology changelog 123rd entry plus the README Phase 1G-S22 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S22 internal methodology record (premium readiness intelligence report experience)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-08
- Reference
- Phase 1G-S22 founder execution signal: live test of Phase 1G-S21C confirmed the intelligence layer was a real improvement but the readiness report itself still felt boring. The founder asked for a vibrant, useful, insightful, colourful, and reusable buyer experience. A mandatory addendum required a Top-5 iterative improvement panel so the buyer would not be overwhelmed by every unknown.
- Impact assessment
- Bumps the package version, rewrites the build-identity endpoint to report the corrective phase identity, ships four new pure helper modules (readiness report view-model builder, top-five priority planner, Markdown renderer, premium dashboard component), extends the score paste page to render the dashboard, and adds buyer-facing actions (copy executive summary, copy action plan, copy full Markdown, download Markdown report, print, review another agent). Tokens stay server-side. Only documented metadata tables are read in the product path. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. Null Provider remains default. The packaging safety helper from Phase 1G-S21A is preserved. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. Phase 1G-S22 has not been live-tested end to end against a tenant where the new dashboard renders against a real Copilot Studio agent. The founder must run the local endpoint and confirm the readiness hero, top-five priorities, executive summary, risk heatmap, evidence coverage map, stakeholder views, evidence story, buyer confirmations, and report actions all render, and that custom and Microsoft / system agents produce materially different reports.
- Re-score optional
Change id: 1G-S23
Change date: 2026-05-08
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S23 - interactive evidence collection, remediation loop, and versioned comparison
Reason: The founder live-tested Phase 1G-S22 and downloaded a real readiness report for Metric Mapping Agent in CRM667668. The report rendered, but the founder asked: are yes / no questions sufficient for what we are trying to do, and where are the engine version numbers and the comparison to previous reports. Phase 1G-S23 turns AgentProof into an iterative improvement product with richer answer types, a Top-5 remediation board, in-session report history, current versus previous comparison, a version trace card, and a next-best action engine.
What changed: The package version was bumped from 0.130.0 to 0.131.0. The build-identity endpoint now reports product version 0.131.0, phase id 1G-S23, the documented phase name, and the new expected archive name. Backwards-compatible alias exports for Phase 1G-S8 through Phase 1G-S22 now resolve to the Phase 1G-S23 values. Four new pure helper modules ship under the reporting library: a rich-answer-types catalogue with sanitisation and legacy yes / no / not sure coercion; a deterministic Top-5 remediation tracker with completion percentage and ready to rerun flag; a pure next-best-action engine that returns one decisive action with owner and expected impact; and a pure report-history plus comparison helper that builds a buyer-safe summary of each generated report and compares the most recent two runs for the same agent. The dynamic confirmation question generator gains a richer answer-type metadata field on a focused subset of questions (system agent usage mode, named business owner, fallback path, testing maturity, monitoring frequency); everything else still surfaces as yes / no / not sure for backwards compatibility. The score input builder accepts the richer answers and collapses them to the documented yes / no / unsure form before the agent inputs body is built so the score endpoint contract is unchanged. The view model gains a version trace block and a buyer evidence notes block. The Markdown renderer gains a Next best action section near the top, a Top-5 remediation status section, a Comparison with previous report section, a Buyer evidence notes section, and a Report trace section. The dashboard gains a Next best action card directly under the readiness hero, a Comparison view directly under the Top-5 panel, a Buyer evidence notes panel, a Version trace card, and a Generate updated report button on the report actions row. 
The score paste page now keeps a per-session report history list in React state and passes the previous report for the same agent into the dashboard so the comparison view always renders. In-session only - no local storage, session storage, indexed db, cookie, Supabase, or disk write. The Phase 1G-S21C intelligence layer is unchanged; the Phase 1G-S22 dashboard, the score input builder, and the dynamic question generator are extended as described above, with their existing behaviour preserved. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. Null Provider remains default. The connector-agnostic platform select still exposes all six categories. Manual environment URL entry, manual agent name or identifier entry, manual Dataverse URL entry, and manual JSON paste entry remain absent from every primary buyer path.
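The collapse of richer answers to the yes / no / unsure form can be sketched as a single pure function. This is an illustration only: the three answer shapes, the option values in the usage example, and the rule that a non-empty free-text answer counts as yes are assumptions, since the entry names the answer topics but not their encodings.

```typescript
// Illustrative sketch only. The answer shapes and the collapse rules are
// assumptions; the entry states only that rich answers collapse to yes / no / unsure.
type LegacyAnswer = "yes" | "no" | "unsure";

type RichAnswer =
  | { kind: "legacy"; value: LegacyAnswer }
  | { kind: "choice"; value: string; affirmative: string[]; negative: string[] }
  | { kind: "text"; value: string };

function collapseAnswer(answer: RichAnswer): LegacyAnswer {
  switch (answer.kind) {
    case "legacy":
      // Already in the documented form; pass through unchanged.
      return answer.value;
    case "choice":
      // Each option is declared affirmative or negative; anything else is unsure.
      if (answer.affirmative.includes(answer.value)) return "yes";
      if (answer.negative.includes(answer.value)) return "no";
      return "unsure";
    case "text":
      // A non-empty answer (for example a named owner) counts as confirmation.
      return answer.value.trim().length > 0 ? "yes" : "unsure";
  }
}

// Usage with a hypothetical monitoring-frequency question.
const monitoring: RichAnswer = {
  kind: "choice",
  value: "weekly",
  affirmative: ["daily", "weekly", "monthly"],
  negative: ["never"],
};
const collapsed = collapseAnswer(monitoring);
```

Collapsing at the boundary keeps the richer detail available to the report while the score endpoint continues to receive only the three documented values.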
User impact: Founders and reviewers can now record a Top-5 remediation status per priority item, generate a fresh readiness report after fixing or confirming items, and see the difference between the new report and the most recent previous report (score movement, band movement, risk posture movement, confidence movement, evidence coverage movement, resolved / new / still-open priorities). A Next best action card surfaces a single decisive action near the top of the dashboard. A Version trace card shows the AgentProof build identity that produced the report so future runs can be compared against this one.
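The comparison of the two most recent reports for one agent can be sketched as follows. This is a simplified illustration: it covers score movement, band movement, and resolved / new / still-open priorities only, it assumes the history array is in chronological order, and the type names, field names, and band labels are hypothetical.

```typescript
// Illustrative sketch only. Field names and band labels are assumptions, and
// the history is assumed to be ordered oldest to newest.
interface ReportSummary {
  agentId: string;
  score: number;
  band: string;
  openPriorityIds: string[];
}

interface ReportComparison {
  scoreDelta: number; // positive means the score improved
  bandChanged: boolean;
  resolved: string[]; // open before, no longer open
  added: string[]; // newly open in the current report
  stillOpen: string[]; // open in both reports
}

function compareLatestTwo(history: ReportSummary[], agentId: string): ReportComparison | undefined {
  // Only reports for the same agent are comparable.
  const runs = history.filter((report) => report.agentId === agentId);
  if (runs.length < 2) return undefined;

  const previous = runs[runs.length - 2];
  const current = runs[runs.length - 1];

  return {
    scoreDelta: current.score - previous.score,
    bandChanged: current.band !== previous.band,
    resolved: previous.openPriorityIds.filter((id) => !current.openPriorityIds.includes(id)),
    added: current.openPriorityIds.filter((id) => !previous.openPriorityIds.includes(id)),
    stillOpen: current.openPriorityIds.filter((id) => previous.openPriorityIds.includes(id)),
  };
}
```

Returning `undefined` when fewer than two runs exist gives the caller an explicit signal for the first-report case, where there is nothing to compare against.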
When to re-score: Buyer-facing iteration loop plus richer answer types plus in-session comparison plus next-best-action engine. The score engine source is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. No Phase 1F Slice 6 fixture re-scoring is required.
Lib, Components, Tests, Content, Methodology
Evidence trace: the new Phase 1G-S23 evidence record plus the four new pure reporting helpers plus the extended dynamic question generator plus the extended score input builder plus the extended view model plus the extended Markdown renderer plus the extended premium dashboard plus the extended score paste page plus the new Phase 1G-S23 unit tests plus the updated build-identity endpoint constants plus the package version bump from 0.130.0 to 0.131.0 plus the methodology changelog 124th entry plus the README Phase 1G-S23 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S23 internal methodology record (interactive evidence collection, remediation loop, and versioned comparison)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-08
- Reference
- Phase 1G-S23 founder execution signal: live test of Phase 1G-S22 confirmed the report rendered but the founder asked whether yes / no questions are enough and where the engine version numbers and previous-report comparison are. Phase 1G-S23 introduces richer answer types, a Top-5 remediation board, in-session report history with comparison, a version trace card, and a next-best-action engine.
- Impact assessment
- Bumps the package version, rewrites the build-identity endpoint to report the new phase identity, ships four new pure helper modules (rich-answer-types catalogue, Top-5 remediation tracker, next-best-action engine, report history and comparison), extends the dynamic question generator with richer answer-type metadata on a focused subset of high-signal questions, extends the score input builder to accept rich answers and collapse them safely to the documented yes / no / unsure form, extends the view model with a version trace and a buyer evidence notes block, extends the Markdown renderer with five new sections (next best action, remediation status, comparison, buyer evidence notes, report trace), and extends the dashboard with four new panels (next best action card, comparison view, buyer evidence notes panel, version trace card) plus a Generate updated report button. In-session only - no localStorage, sessionStorage, IndexedDB, cookie, Supabase, or disk write. Tokens stay server-side. Only documented metadata tables are read in the product path. The score engine is unchanged. The deterministic scorecard Markdown SHA-256 stays at c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61. Null Provider remains default. The packaging safety helper from Phase 1G-S21A is preserved. UI product approval still requires founder review; technical assertions alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. Phase 1G-S23 has not been live-tested end to end against a tenant where the buyer answers rich-type questions, marks remediation statuses, generates a second report, and sees the comparison view. The founder must run the local endpoint and confirm the next-best-action card renders, the Top-5 remediation board persists status selections in-session, the comparison view renders score / band / posture / confidence / coverage movements correctly, and the version trace card lists product version, phase id, engine version, methodology changelog id, generated at, and report id.
- Re-score optional
Change id: 1G-S17; Change date: 2026-05-07; Product version: 0.167.1; Methodology engine version: 0.9.1
Phase 1G-S17 - Dataverse Global Discovery fallback and agent listing continuation
Reason: Founder live-tested Phase 1G-S16. Microsoft sign-in works, the Power Platform environment management.Environments.Read permission is granted, environment listing works, the discovered environment populates the dropdown, the status panel correctly shows Discovery allowed. Selecting CRM667668 and triggering List agents fails with the buyer-friendly Could not list Copilot Studio agents panel because the per-environment metadata did not expose a Dataverse address and Phase 1G-S16 deferred Global Discovery to a later slice. Founder instruction: Give the instructions, we must move forward.
What changed: Phase 1G-S17 implements the documented Microsoft Dataverse Global Discovery Service fallback so AgentProof can locate the Dataverse instance for a Power Platform environment when the per-environment metadata does not expose a usable instance URL. New server-side module under the Microsoft connector library exposes acquire global discovery token (delegated access token for the documented globaldisco.crm.dynamics.com audience using the existing refresh token), call global discovery instances (calls the documented v2.0 Instances endpoint and returns parsed safe candidates), and parse global discovery instances (pure parser plus URL normalisation that strips trailing api data v9.x suffix and rejects non-HTTPS, IP, localhost, no-dot hosts). Extends the existing Dataverse URL resolver with resolve dataverse url with global discovery, a pure helper that matches the selected environment snapshot to one Global Discovery instance using documented metadata signals in priority order: organization id (Power Platform crm instance id vs Global Discovery Id), unique name (Power Platform unique name vs Global Discovery unique name), domain name (Power Platform domain name vs Global Discovery url name), and a conservative single candidate fallback that the agents route only opts into when there is exactly one Dataverse-backed environment in the discovered list. Extends the Power Platform environment snapshot to also carry organization id, unique name, and domain name from the documented properties.linked environment metadata fields so the matching helper can use them. 
Updates the agents route to fall through metadata then Global Discovery then the bots metadata query and to surface four documented buyer-friendly outcomes: discovered (agents found), no agents found (Dataverse reached, zero bots), dataverse url missing (URL still not resolved after Global Discovery, with global discovery attempted true plus candidates count plus match result in advanced details), and dataverse access blocked (Dataverse 401 or 403 maps to the documented IT-please-confirm state). Adds the no agents found state to the discovery status lifecycle helper as a tenth state with calm tone and the documented buyer copy. Updates the score paste page to render the new no agents found and dataverse access blocked buyer-friendly disclosures with collapsed Advanced details for IT blocks carrying only documented metadata flags plus selected environment identity, never tokens, never secrets, never business records. Updates the customer plus IT runbook with a new section After environments load: Dataverse and Copilot Studio agent discovery, plus updated troubleshooting entries for the no agents found and dataverse access blocked states. Updates the build-identity endpoint constants to product version 0.125.0 and phase id 1G-S17. Backwards-compatible phase 1 g s8 plus phase 1 g s9 plus phase 1 g s10 plus phase 1 g s11 plus phase 1 g s12 plus phase 1 g s13 plus phase 1 g s14 plus phase 1 g s15 plus phase 1 g s16 alias exports resolve to the Phase 1G-S17 values so existing imports keep compiling. Adds a Phase 1G-S17 evidence file under the content tree with founder live-test evidence plus previous layer solved plus current blocker plus global discovery decision plus official Microsoft sources plus dataverse url resolution strategy plus matching strategy plus route outcomes plus buyer copy plus advanced it details plus tests added plus safety invariants preserved plus product approval status plus next recommended product work. 
Adds Phase 1G-S17 unit tests in six groups (A. Global Discovery client, B. resolver plus matching, C. agents route, D. UI, E. runbook, F. safety). package version bumped from 0.124.0 to 0.125.0. methodology current methodology summary product version bumped to 0.125.0. methodology changelog now carries 115 entries with change id 1G-S17 last (source type internal methodology, re score recommended false). The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default.
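The URL normalisation rule described in this entry (strip the trailing Dataverse Web API suffix, reject non-HTTPS, IP, localhost, and no-dot hosts) can be illustrated with a minimal sketch. The function name `normaliseDataverseUrl`, the example host, and the exact regular expressions are assumptions made for illustration; they are not taken from the AgentProof source.

```typescript
// Hypothetical sketch: the function name and exact rules are assumptions,
// not AgentProof's actual implementation.
export function normaliseDataverseUrl(raw: string): string | null {
  let url: URL;
  try {
    url = new URL(raw.trim());
  } catch {
    return null; // not a parseable absolute URL
  }
  if (url.protocol !== "https:") return null; // reject non-HTTPS

  const host = url.hostname; // the URL parser already lower-cases the host
  if (host === "localhost" || !host.includes(".")) return null; // localhost, no-dot hosts
  const isIpv4 = /^\d{1,3}(\.\d{1,3}){3}$/.test(host);
  const isIpv6 = host.startsWith("[");
  if (isIpv4 || isIpv6) return null; // IP literals

  // Strip a trailing /api/data/v9.x suffix, then any trailing slash.
  const path = url.pathname
    .replace(/\/api\/data\/v9\.\d+\/?$/i, "")
    .replace(/\/+$/, "");
  return `https://${url.host}${path}`;
}
```

Returning `null` rather than throwing keeps the helper pure and lets a caller count rejected candidates without handling exceptions.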
User impact: When the founder picks the CRM667668 environment that previously failed with Could not list Copilot Studio agents, AgentProof now (a) tries every documented per-environment metadata field path, (b) falls back to the Microsoft Dataverse Global Discovery Service, (c) safely matches the selected environment to one Dataverse instance using documented metadata signals (organisation id, unique name, domain name) plus an opt-in single-candidate fallback, and (d) calls the documented Dataverse bots metadata endpoint. If agents are found the dropdown populates and the status panel shows Agents ready. If Microsoft answered with zero agents the buyer sees the calm No agents found state with the documented next action (pick another environment or create or publish an agent in Copilot Studio, then try again). If Microsoft blocked the read-only Dataverse query (HTTP 401 or 403) the buyer sees the calm Microsoft blocked Dataverse metadata access state with the documented next action (ask IT to confirm the account can read Copilot Studio metadata in this environment). If the URL still cannot be resolved the buyer sees the calm Could not find the Dataverse address state with global discovery attempted true plus the candidates count plus the match result in the collapsed Advanced details for IT block. The buyer never types a Dataverse URL. The buyer never types an agent name or identifier. UI product approval still requires founder review; technical tests alone do not approve the UI. Internal milestones are not success - success means the product sells.
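The matching order described for this slice (organisation id, then unique name, then domain name, then an opt-in single-candidate fallback) could look roughly like the sketch below. All type names, field names, and signal labels are hypothetical. Moving on to the next signal when a signal matches more than one instance is also an assumption, not a documented behaviour.

```typescript
// Hypothetical shapes; field names are illustrative only.
interface EnvironmentSnapshot {
  organizationId?: string;
  uniqueName?: string;
  domainName?: string;
}
interface DiscoveryInstance {
  id: string;
  uniqueName: string;
  urlName: string;
}
type MatchSignal = "organization_id" | "unique_name" | "domain_name" | "single_candidate";
interface MatchResult {
  instance: DiscoveryInstance;
  signal: MatchSignal;
}

// Case-insensitive comparison that never matches on a missing value.
function sameText(a: string | undefined, b: string | undefined): boolean {
  return !!a && !!b && a.trim().toLowerCase() === b.trim().toLowerCase();
}

export function matchInstance(
  env: EnvironmentSnapshot,
  instances: readonly DiscoveryInstance[],
  allowSingleCandidate = false,
): MatchResult | null {
  const checks: ReadonlyArray<[MatchSignal, (i: DiscoveryInstance) => boolean]> = [
    ["organization_id", (i) => sameText(env.organizationId, i.id)],
    ["unique_name", (i) => sameText(env.uniqueName, i.uniqueName)],
    ["domain_name", (i) => sameText(env.domainName, i.urlName)],
  ];
  for (const [signal, test] of checks) {
    const [only, ...rest] = instances.filter(test);
    // Accept a signal only when it identifies exactly one instance.
    if (only !== undefined && rest.length === 0) return { instance: only, signal };
  }
  const [single, ...others] = instances;
  if (allowSingleCandidate && single !== undefined && others.length === 0) {
    return { instance: single, signal: "single_candidate" };
  }
  return null;
}
```

Keeping the single-candidate fallback behind an explicit flag mirrors the entry's statement that the agents route only opts into it when exactly one Dataverse-backed environment is in the discovered list.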
When to re-score: Phase 1G-S17 is a discovery completeness slice. Adds the Microsoft Dataverse Global Discovery Service fallback plus three documented buyer-friendly route outcomes plus one new lifecycle state. Does not change scoring, the score response shape, or the rendered scorecard Markdown. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 stays at 99fae010.
Docs, Tests, Content, Methodology, Components, Lib, Api, Primary buyer surface
Evidence trace: the new Phase 1G-S17 content evidence file with founder live-test evidence plus previous layer solved plus current blocker plus global discovery decision plus official Microsoft sources plus dataverse url resolution strategy plus matching strategy plus route outcomes plus buyer copy plus advanced it details plus tests added plus safety invariants preserved plus product approval status plus next recommended product work, the new Global Discovery server-side client module under the Microsoft connector library, the extended Dataverse URL resolver with the new Global Discovery matching helper, the extended Power Platform client and the session-store snapshot type with the new matching signals, the rewritten agents route with the four documented buyer-friendly outcomes, the extended discovery status lifecycle helper with the no agents found tenth state, the new buyer-friendly disclosures in the score paste page, the updated customer plus IT runbook with the new Dataverse and agent discovery section plus the new troubleshooting entries, the new Phase 1G-S17 unit test groups (Global Discovery client plus resolver plus matching plus agents route plus UI plus runbook plus safety), the updated build-identity endpoint constants for Phase 1G-S17, the package version bump from 0.124.0 to 0.125.0, the methodology changelog 115th entry, the README Phase 1G-S17 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S17 internal methodology record (Dataverse Global Discovery fallback plus agent listing continuation)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-07
- Reference
- Phase 1G-S17 founder execution signal: founder live-tested Phase 1G-S16, confirmed environment listing works for the discovered CRM667668 environment, but the next step (list Copilot Studio agents) failed with the buyer-friendly Could not list Copilot Studio agents panel because the per-environment metadata did not expose a Dataverse address and Phase 1G-S16 deferred Global Discovery to a later slice. Founder instruction: Give the instructions, we must move forward.
- Impact assessment
- Adds the documented Microsoft Dataverse Global Discovery Service fallback plus the safe matching helper plus three new buyer-friendly route outcomes plus one new lifecycle state. The Phase 1G-S2 OAuth flow plus the Phase 1G-S5 wizard plus the Phase 1G-S6 server diagnostics plus the Phase 1G-S7 globalThis store plus the Phase 1G-S8 HMAC pending handshake cookie and auth trace ring buffer plus the Phase 1G-S9 hydration repair plus the Phase 1G-S10 evidence file and admin handoff helper module plus the Phase 1G-S11 shared fetch helper and pure mapper plus the Phase 1G-S12 dedicated visible-error state and presentational placement plus the Phase 1G-S13 buyer-friendly blocked-connection flow plus the Phase 1G-S14 final IT-request copy polish plus the Phase 1G-S15 discovery status lifecycle plus the Phase 1G-S16 customer plus IT runbook and Dataverse URL resolver are all preserved. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval requires founder review; technical tests alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. Phase 1G-S17 has not been live-tested end to end against a tenant where Microsoft Dataverse Global Discovery returns the chosen environment instance. The founder must run the local route after signing in to a tenant with at least one Dataverse-backed Power Platform environment that exposes the documented organisation id or unique name or domain name in its Power Platform metadata.
- Re-score optional
Change id: 1G-S18; Change date: 2026-05-07; Product version: 0.167.1; Methodology engine version: 0.9.1
Phase 1G-S18 - Dataverse Global Discovery trace and candidate confirmation
Reason: Founder live-tested Phase 1G-S17. Microsoft sign-in works. Power Platform environment listing works. The discovered environment populates the dropdown. The S17 Global Discovery slice still did NOT get to agent listing in the founder live tenant: the buyer-friendly Could not list Copilot Studio agents panel still appeared with the documented dataverse url missing copy and the browser console still showed the documented 422 response. Founder instruction: Give the instructions.
What changed: Phase 1G-S18 stops treating the post-environment-listing step as a black box. The Global Discovery client is extended with a pure helper that derives a deterministic opaque candidate id (sha256 of the documented Dataverse organisation identity, truncated to 16 hex characters) and a safe candidate summary helper that exposes only documented identifiers (display name, unique name, host) plus presence flags, never tokens, never secrets, never full Dataverse URLs. The agents route now (a) records a full Global Discovery trace on every list agents outcome that involved Global Discovery (metadata resolution attempted plus result plus source plus global discovery attempted plus token acquired plus http status plus instances count plus safe candidates count plus match result plus match signal plus selected environment identity plus presence flags) and surfaces the trace on the documented advanced details block with no tokens or secrets; (b) returns a documented buyer-friendly outcome status candidate confirmation required (HTTP 200) carrying the safe candidate list when Microsoft returned candidates but AgentProof could not safely auto-match one; and (c) accepts an optional dataverse candidate query parameter, re-runs Global Discovery, validates the supplied opaque candidate id against the freshly-derived candidate list (the route never trusts a browser-supplied URL or host), and lists agents on the matched Dataverse instance. Adds a new buyer-friendly candidate confirmation panel to the score paste page rendering a radio list of discovered candidates with Use this Dataverse environment and Not sure buttons (no manual URL entry). Adds the new dataverse confirmation needed eleventh state to the discovery status lifecycle helper with the documented calm copy. 
Updates the customer plus IT runbook with a new section AgentProof found possible Dataverse environments (candidate confirmation required) explaining that the buyer is confirming discovered Microsoft data, not typing a URL. Updates the build-identity endpoint constants to product version 0.126.0 and phase id 1G-S18. Backwards-compatible phase 1 g s8 through phase 1 g s17 alias exports resolve to the Phase 1G-S18 values so existing imports keep compiling. Adds a Phase 1G-S18 evidence file under the content tree. Adds Phase 1G-S18 unit tests in eight groups (A. Global Discovery trace, B. Candidate id generation, C. Candidate confirmation route, D. Matching strategy, E. UI candidate confirmation, F. Status lifecycle, G. Runbook, H. Safety). package version bumped from 0.125.0 to 0.126.0. methodology current methodology summary product version bumped to 0.126.0. methodology changelog now carries 116 entries with change id 1G-S18 last (source type internal methodology, re score recommended false). The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default.
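The opaque candidate id described in this entry (SHA-256 of the Dataverse organisation identity, truncated to 16 hex characters) and the safe candidate summary can be sketched as follows. The function names, the field names, and the trim-and-lower-case step before hashing are assumptions for illustration. The entry itself only states the hash, the truncation length, and which identifiers may be exposed.

```typescript
import { createHash } from "node:crypto";

// Hypothetical server-side shape of one Global Discovery instance.
interface DiscoveredInstance {
  organizationId: string;
  displayName: string;
  uniqueName: string;
  url: string; // full Dataverse URL, never sent to the browser
}

// What the browser is allowed to see: identifiers plus presence flags only.
interface SafeCandidate {
  candidateId: string;
  displayName: string;
  uniqueName: string;
  host: string;
  hasOrganizationId: boolean;
}

// Deterministic opaque id: first 16 hex characters of a SHA-256 digest.
// Normalising case and whitespace first is an assumption of this sketch.
export function deriveCandidateId(organizationId: string): string {
  const canonical = organizationId.trim().toLowerCase();
  return createHash("sha256").update(canonical, "utf8").digest("hex").slice(0, 16);
}

export function toSafeCandidate(instance: DiscoveredInstance): SafeCandidate {
  return {
    candidateId: deriveCandidateId(instance.organizationId),
    displayName: instance.displayName,
    uniqueName: instance.uniqueName,
    host: new URL(instance.url).hostname, // host only, no path or query
    hasOrganizationId: instance.organizationId.trim().length > 0,
  };
}
```

Because the id is derived only from data the server already holds, the server can recompute it on every request instead of storing it.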
User impact: When the founder selects an environment whose per-environment metadata does not expose a Dataverse address and Microsoft Global Discovery returns more than one Dataverse instance (or a single instance without strong identity signals), AgentProof now (a) records and displays the full Global Discovery trace under the collapsed Advanced details for IT block so the buyer plus IT can see exactly what happened (token acquired plus HTTP status plus instances count plus safe candidates count plus match result), and (b) renders a calm buyer-friendly candidate confirmation panel headlined AgentProof found possible Dataverse environments with documented copy plus a radio list of discovered Dataverse environments (display name plus host in muted text). The buyer picks one, clicks Use this Dataverse environment, and the route validates the chosen opaque candidate id server-side and lists agents on the matched Dataverse instance. The buyer never types a Dataverse URL. The buyer never types an agent name or identifier. The Not sure button calmly returns to the help state. The build-identity endpoint now reports product version 0.126.0 and phase id 1G-S18. UI product approval still requires founder review; technical tests alone do not approve the UI. Internal milestones are not success - success means the product sells.
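The server-side validation step (the route checks the supplied opaque candidate id against a freshly derived candidate list and never trusts a browser-supplied URL) might be sketched like this. The names `confirmCandidate` and `FreshCandidate` and the `invalid_candidate` status are hypothetical. Only the candidate-confirmation-required outcome is named in the entry.

```typescript
// Hypothetical server-side record built from a fresh Global Discovery call.
interface FreshCandidate {
  candidateId: string; // opaque id derived on the server
  instanceUrl: string; // stays on the server
}

type Confirmation =
  | { status: "matched"; instanceUrl: string }
  | { status: "candidate_confirmation_required" }
  | { status: "invalid_candidate" };

export function confirmCandidate(
  supplied: string | null,
  fresh: readonly FreshCandidate[],
): Confirmation {
  // No id supplied yet: ask the buyer to pick from the safe candidate list.
  if (supplied === null) return { status: "candidate_confirmation_required" };
  // Anything that is not a 16-character hex id is rejected outright.
  if (!/^[0-9a-f]{16}$/.test(supplied)) return { status: "invalid_candidate" };
  // The URL always comes from the fresh server-side list, never from the request.
  const hit = fresh.find((c) => c.candidateId === supplied);
  return hit
    ? { status: "matched", instanceUrl: hit.instanceUrl }
    : { status: "invalid_candidate" };
}
```

The request only ever selects among entries the server derived itself, so a tampered id can at worst fail validation.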
When to re-score: Phase 1G-S18 is a discovery diagnosability plus user-confirmation slice. Adds a deterministic candidate id helper plus a safe candidate summary helper plus a full Global Discovery trace plus a new buyer-friendly candidate confirmation route outcome plus a new lifecycle state. Does not change scoring, the score response shape, or the rendered scorecard Markdown. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 stays at 99fae010.
Docs, Tests, Content, Methodology, Components, Lib, Api, Primary buyer surface
Evidence trace: the new Phase 1G-S18 content evidence file with founder live-test evidence plus previous layers solved plus current live blocker plus global discovery trace decision plus candidate confirmation decision plus route contract plus buyer copy plus advanced it trace fields plus matching strategy plus safety invariants preserved plus tests added plus product approval status plus next recommended product work, the extended Global Discovery client with the deterministic candidate id and safe candidate summary helpers, the extended Dataverse URL resolver with confirmation-eligible candidates, the rewritten agents route with the full trace plus candidate confirmation flow plus server-side candidate id validation, the extended discovery status lifecycle helper with the dataverse confirmation needed eleventh state, the new candidate confirmation panel in the score paste page, the updated customer plus IT runbook with the new candidate confirmation section, the new Phase 1G-S18 unit test groups (Global Discovery trace plus candidate id generation plus candidate confirmation route plus matching strategy plus UI candidate confirmation plus status lifecycle plus runbook plus safety), the updated build-identity endpoint constants for Phase 1G-S18, the package version bump from 0.125.0 to 0.126.0, the methodology changelog 116th entry, the README Phase 1G-S18 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S18 internal methodology record (Dataverse Global Discovery trace plus candidate confirmation)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-07
- Reference
- Phase 1G-S18 founder execution signal: founder live-tested Phase 1G-S17, confirmed Microsoft sign-in works, Power Platform environment listing works, the discovered environment populates the dropdown, but the agents step still failed with the documented Could not list Copilot Studio agents panel and the documented 422 response. Founder instruction: Give the instructions.
- Impact assessment
- Adds Global Discovery diagnosability plus a new buyer-friendly candidate confirmation flow plus a new lifecycle state. Tokens stay server-side. Candidate ids are opaque deterministic SHA-256 truncations of documented Dataverse organisation identity, never tokens or secrets. The route never accepts a Dataverse URL from the browser. The route validates the supplied opaque candidate id against a freshly-derived Global Discovery candidate list. The Phase 1G-S2 OAuth flow plus the Phase 1G-S5 wizard plus the Phase 1G-S6 server diagnostics plus the Phase 1G-S7 globalThis store plus the Phase 1G-S8 HMAC pending handshake cookie and auth trace ring buffer plus the Phase 1G-S9 hydration repair plus the Phase 1G-S10 evidence file and admin handoff helper module plus the Phase 1G-S11 shared fetch helper and pure mapper plus the Phase 1G-S12 dedicated visible-error state and presentational placement plus the Phase 1G-S13 buyer-friendly blocked-connection flow plus the Phase 1G-S14 final IT-request copy polish plus the Phase 1G-S15 discovery status lifecycle plus the Phase 1G-S16 customer plus IT runbook and Dataverse URL resolver plus the Phase 1G-S17 Global Discovery fallback are all preserved. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval requires founder review; technical tests alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. Phase 1G-S18 has not been live-tested end to end against a tenant where the candidate confirmation flow successfully lists agents after a confirmed candidate. The founder must run the local route after signing in to a tenant where Microsoft Dataverse Global Discovery returns at least one safe candidate AgentProof can validate.
- Re-score optional
Change id: 1G-S19; Change date: 2026-05-07; Product version: 0.167.1; Methodology engine version: 0.9.1
Phase 1G-S19 - hard Microsoft discovery diagnostic and route contract fix
Reason: Founder live-tested Phase 1G-S18. Microsoft sign-in works, environment listing works, but the buyer-friendly panel still showed only the generic dataverse url missing message and the browser console still showed the documented 422. Phase 1G-S18 did NOT visibly prove whether Global Discovery was attempted, whether a token was acquired, what HTTP status Global Discovery returned, how many Dataverse instances were returned, or why the candidate confirmation panel did not appear. Founder feedback: Now its bloody wasting my time, fix this and more, we are too slow.
What changed: Phase 1G-S19 stops UI-polish slices and ships a hard diagnostic plus a route contract fix. New server-side endpoint at GET /api/connectors/microsoft/discovery-diagnostic returns one safe canonical JSON shape (status, phase id, operation, stage, selected environment, auth, metadata resolution with per-path probes, global discovery with token acquired plus http status plus instances count plus safe candidates count plus failure reason, safe candidates, matching, agents probe, next action) so the founder can open ONE URL and see exactly where Microsoft discovery stops. The shared trace builder lives at the Microsoft discovery diagnostic helper module and is the single source of truth for both the diagnostic endpoint and the agents-route diagnostic trace field. Refactors the agents route so every non-success response now includes a diagnostic trace field with the canonical shape. The UI can never again receive a generic dataverse url missing envelope without trace fields. Adds a new pure helper probe dataverse metadata paths to the resolver module that walks every documented Dataverse-URL field path and reports presence plus validity per path with safe host only (never the raw URL or query string). Adds a Run discovery diagnostic link in the agent-failure panel that opens the new endpoint in a new tab, plus an Advanced details for IT summary table that mirrors the documented Phase 1G-S19 fields (Metadata URL found, Global Discovery attempted, Global Discovery status, Candidates found, Match result, Agent query attempted, Stage). Updates the build-identity endpoint constants to product version 0.127.0 and phase id 1G-S19. Backwards-compatible phase 1 g s8 through phase 1 g s18 alias exports resolve to the Phase 1G-S19 values. Adds a Phase 1G-S19 evidence file under the content tree. 
Adds Phase 1G-S19 unit tests covering the diagnostic endpoint shape plus metadata trace plus Global Discovery trace plus agents route contract plus UI wiring plus safety plus live-test instructions plus match labels. package version bumped from 0.126.0 to 0.127.0. methodology current methodology summary product version bumped to 0.127.0. methodology changelog now carries 117 entries with change id 1G-S19 last (source type internal methodology, re score recommended false). The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default.
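To make the idea of a single shared trace builder concrete, here is a reduced sketch of how a canonical trace with a derived stage might be assembled. It covers only a subset of the fields listed in this entry. Apart from `agents_discovered`, the stage names, the field names, and the next-action copy are hypothetical.

```typescript
// Only "agents_discovered" appears in the changelog; the other stage names are assumptions.
type Stage =
  | "metadata_resolution"
  | "global_discovery"
  | "matching"
  | "agents_probe"
  | "agents_discovered";

interface DiscoveryFacts {
  metadataUrlFound: boolean;
  globalDiscovery: {
    attempted: boolean;
    tokenAcquired: boolean;
    httpStatus: number | null;
    instancesCount: number;
    safeCandidatesCount: number;
  };
  matchResult: "matched" | "ambiguous" | "none" | "not_attempted";
  agentsProbe: { attempted: boolean; httpStatus: number | null };
}

interface DiagnosticTrace extends DiscoveryFacts {
  stage: Stage;
  nextAction: string;
}

// Hypothetical next-action copy, one line per stage.
const NEXT_ACTION: Record<Stage, string> = {
  metadata_resolution: "Run Global Discovery for the selected environment.",
  global_discovery: "Check the Global Discovery token and HTTP status.",
  matching: "Ask the buyer to confirm a discovered candidate.",
  agents_probe: "Ask IT to confirm read access to agent metadata.",
  agents_discovered: "Pick an agent.",
};

// Reports the first stage that did not complete.
export function deriveStage(f: DiscoveryFacts): Stage {
  const urlResolved = f.metadataUrlFound || f.matchResult === "matched";
  if (!urlResolved) {
    const gd = f.globalDiscovery;
    if (!gd.attempted) return "metadata_resolution";
    if (!gd.tokenAcquired || gd.httpStatus !== 200) return "global_discovery";
    return "matching";
  }
  const probeOk = f.agentsProbe.attempted && f.agentsProbe.httpStatus === 200;
  return probeOk ? "agents_discovered" : "agents_probe";
}

// Single builder used by every caller, so all responses share one shape.
export function buildTrace(facts: DiscoveryFacts): DiagnosticTrace {
  const stage = deriveStage(facts);
  return { ...facts, stage, nextAction: NEXT_ACTION[stage] };
}
```

The trace carries only counts, flags, and status codes, which is consistent with the entry's rule that tokens, authorization headers, and raw response bodies never appear in it.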
User impact: When the agents route returns any non-success state, the buyer panel now exposes a Run discovery diagnostic button and a small Advanced details for IT summary. The diagnostic button opens GET /api/connectors/microsoft/discovery-diagnostic with the selected environment id in a new tab and returns the canonical safe JSON trace so the founder can see exactly which stage failed. The buyer never types a Dataverse URL. The agents route diagnostic trace field is mandatory on every non-success response. UI product approval still requires founder review; technical tests alone do not approve the UI. Internal milestones are not success - success means the product sells.
When to re-score: Phase 1G-S19 is a diagnosability slice. Adds a server-side diagnostic endpoint and a mandatory diagnostic trace field on the agents route. Does not change scoring, the score response shape, or the rendered scorecard Markdown. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 stays at 99fae010.
Docs, Tests, Content, Methodology, Components, Lib, Api, Primary buyer surface
Evidence trace: the new Phase 1G-S19 content evidence file plus the new shared discovery diagnostic helper module under the Microsoft connector library plus the new discovery-diagnostic API route plus the rewritten agents route with mandatory diagnostic trace plus the new Run discovery diagnostic link plus Advanced details for IT summary in the score paste page plus the new Phase 1G-S19 unit test groups plus the updated build-identity endpoint constants for Phase 1G-S19 plus the package version bump from 0.126.0 to 0.127.0 plus the methodology changelog 117th entry plus the README Phase 1G-S19 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S19 internal methodology record (hard Microsoft discovery diagnostic plus route contract fix)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-07
- Reference
- Phase 1G-S19 founder execution signal: founder live-tested Phase 1G-S18, confirmed Microsoft sign-in works and environment listing works but the agents step still showed only the generic dataverse_url_missing message and the browser console still showed the documented 422. Founder feedback: Now its bloody wasting my time, fix this and more, we are too slow.
- Impact assessment
- Adds a hard Microsoft discovery diagnostic endpoint plus a mandatory diagnostic_trace field on every non-success agents-route response. The shared trace builder is the single source of truth so both routes return the same shape. Tokens stay server-side. Authorization headers never appear in the trace. Raw Microsoft response bodies never appear in the trace. The Phase 1G-S2 through Phase 1G-S18 layers are all preserved. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval requires founder review; technical tests alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. Phase 1G-S19 has not been live-tested end to end against a tenant where the diagnostic endpoint produces a stage of agents_discovered. The founder must open the diagnostic URL once after this slice to see which stage their tenant actually fails at.
- Re-score optional
Change id: 1G-S20; Change date: 2026-05-07; Product version: 0.167.1; Methodology engine version: 0.9.1
Phase 1G-S20 - agent classification and grouped picker
Reason: Founder live-tested Phase 1G-S19. The Dataverse plus Global Discovery layer is solved enough for the agents step to populate. The agent dropdown returned a mix of Microsoft / D365 OOTB agents alongside organisation-specific custom agents (the founder screenshot listed examples such as D365 Sales Agent - Research, Customer Service Copilot Bot, Sustainability Command Agent, Operational Compliance Agent, Metric Mapping Agent). Founder feedback: We need to split between system or OOTB agents and custom agents.
What changed: Phase 1G-S20 ships a deterministic Microsoft agent classifier plus a grouped agent picker. New pure helper at an internal source file exposes classify microsoft agent (per-agent) and classify microsoft agents (list plus group counts). The classifier inspects documented Dataverse origin metadata first (ismanaged flag, publisher unique-name prefix, solution unique-name prefix) and only falls back to documented Microsoft / Dynamics 365 product naming patterns as a supporting signal. Confidence is reported as high (strong metadata), medium (strong naming or partial metadata), or low (which surfaces as the unknown classification). When evidence is weak or conflicting the classifier returns unknown, never silently labels the agent as custom. Each result carries up to three buyer-safe reasons. The Dataverse client is extended to surface ismanaged plus componentstate plus solution unique name plus publisher unique name fields when Microsoft returns them, never raw business-record content. The agents route now attaches classification fields per agent plus a documented agent groups summary block (custom count, microsoft system count, unknown count, total count). The score paste page replaces the flat agent dropdown with a grouped select using optgroups (Custom agents first, Microsoft / system agents second, Unknown classification last), surfaces the documented count summary above the picker, surfaces a small selected-agent classification card (label plus confidence plus reasons) underneath the picker, surfaces a calm warning when the buyer selects a Microsoft/system or unknown agent, and surfaces the documented fallback copy lines. No agents are hidden. Updates the build-identity endpoint constants to product version 0.128.0 and phase id 1G-S20. Backwards-compatible phase 1 g s8 through phase 1 g s19 alias exports resolve to the Phase 1G-S20 values. Adds a Phase 1G-S20 evidence file under the content tree.
Adds Phase 1G-S20 unit tests in five groups (A. Classifier purity, B. Server route shape, C. UI grouped picker, D. Safety, E. Screenshot example names). package version bumped from 0.127.0 to 0.128.0. methodology current methodology summary product version bumped to 0.128.0. methodology changelog now carries 118 entries with change id 1G-S20 last (source type internal methodology, re score recommended false). The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default.
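The classifier's decision order (origin metadata first, naming as a supporting signal, conflicting or weak evidence becomes unknown, at most three reasons) can be illustrated with the sketch below. The prefix list, the naming pattern, the reason strings, and the vote-counting thresholds are all assumptions for illustration. The entry does not publish the actual rules.

```typescript
type Classification = "custom" | "microsoft_system" | "unknown";
type Confidence = "high" | "medium" | "low";

interface AgentMetadata {
  name: string;
  isManaged?: boolean;
  publisherUniqueName?: string;
  solutionUniqueName?: string;
}
interface ClassificationResult {
  classification: Classification;
  confidence: Confidence;
  reasons: string[];
}

// Hypothetical signals; the real prefix and naming lists are not documented here.
const MICROSOFT_PREFIXES = ["msdyn", "microsoft"];
const MICROSOFT_NAME = /\b(d365|dynamics 365)\b/i;

function hasMicrosoftPrefix(value: string): boolean {
  const v = value.toLowerCase();
  return MICROSOFT_PREFIXES.some((p) => v.startsWith(p));
}

export function classifyAgent(agent: AgentMetadata): ClassificationResult {
  const system: string[] = [];
  const custom: string[] = [];

  // 1. Origin metadata votes.
  if (agent.isManaged === true) system.push("Installed as a managed component");
  if (agent.isManaged === false) custom.push("Unmanaged component in this environment");
  for (const [label, value] of [
    ["Publisher", agent.publisherUniqueName],
    ["Solution", agent.solutionUniqueName],
  ] as const) {
    if (value === undefined) continue;
    if (hasMicrosoftPrefix(value)) system.push(`${label} prefix looks like Microsoft`);
    else custom.push(`${label} prefix is not a Microsoft prefix`);
  }

  // 2. Naming is only a supporting signal.
  const nameLooksMicrosoft = MICROSOFT_NAME.test(agent.name);
  if (nameLooksMicrosoft) system.push("Name follows a Microsoft product naming pattern");

  // 3. Conflicting evidence is never resolved in favour of "custom".
  if (system.length > 0 && custom.length > 0) {
    return { classification: "unknown", confidence: "low", reasons: ["Origin signals conflict"] };
  }
  if (system.length > 0) {
    const metadataVotes = system.length - (nameLooksMicrosoft ? 1 : 0);
    const confidence: Confidence = metadataVotes >= 2 ? "high" : "medium";
    return { classification: "microsoft_system", confidence, reasons: system.slice(0, 3) };
  }
  if (custom.length > 0) {
    const confidence: Confidence = custom.length >= 2 ? "high" : "medium";
    return { classification: "custom", confidence, reasons: custom.slice(0, 3) };
  }
  return { classification: "unknown", confidence: "low", reasons: ["No origin metadata or naming signal"] };
}
```

In this sketch a name-only match never exceeds medium confidence, which matches the entry's description of naming as a supporting signal.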
User impact: When the agent list loads, the buyer now sees a count summary (AgentProof found X agents: Y custom, Z Microsoft/system, W unknown) and a grouped picker with three optgroups: Custom agents first (default focus), Microsoft / system agents second (visible but separated), Unknown classification last with the documented note. After the buyer picks an agent, a small classification card shows the documented label, the confidence, and 1-3 buyer-safe reasons. If the buyer picks a Microsoft / system agent a calm warning explains that AgentProof can still inspect it but customers usually proof their own custom agents first. If the buyer picks an unknown agent the documented warning explains AgentProof could not classify it. No agents are hidden. The Discover agent footprint button remains disabled until an agent is selected. UI product approval still requires founder review; technical tests alone do not approve the UI. Internal milestones are not success - success means the product sells.
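The grouped picker and count summary described above amount to a stable three-way partition of the classified agents. A minimal sketch follows. The type and function names are hypothetical, while the group labels and the summary wording follow the copy quoted in this entry.

```typescript
type Classification = "custom" | "microsoft_system" | "unknown";

interface ClassifiedAgent {
  name: string;
  classification: Classification;
}
interface AgentGroup {
  label: string;
  agents: ClassifiedAgent[];
}

// Display order from the entry: custom first, Microsoft / system second, unknown last.
const GROUP_ORDER: ReadonlyArray<[Classification, string]> = [
  ["custom", "Custom agents"],
  ["microsoft_system", "Microsoft / system agents"],
  ["unknown", "Unknown classification"],
];

// Every agent lands in exactly one group, so nothing is hidden.
export function groupAgents(agents: readonly ClassifiedAgent[]): AgentGroup[] {
  return GROUP_ORDER.map(([classification, label]) => ({
    label,
    agents: agents.filter((a) => a.classification === classification),
  }));
}

export function countSummary(agents: readonly ClassifiedAgent[]): string {
  const count = (c: Classification): number =>
    agents.filter((a) => a.classification === c).length;
  return (
    `AgentProof found ${agents.length} agents: ${count("custom")} custom, ` +
    `${count("microsoft_system")} Microsoft/system, ${count("unknown")} unknown`
  );
}
```

Each group maps directly onto one optgroup, and the summed group sizes always equal the total shown in the summary line.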
When to re-score: Phase 1G-S20 is a discovery refinement slice. Adds an agent classifier plus grouped picker. Does not change scoring, the score response shape, or the rendered scorecard Markdown. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 stays at 99fae010.
Docs, Tests, Content, Methodology, Components, Lib, Api, Primary buyer surface
Evidence trace: the new Phase 1G-S20 content evidence file plus the new Microsoft agent classifier helper module under the connector library plus the extended Dataverse client surfacing ismanaged plus componentstate plus solution and publisher unique-name fields plus the extended agents route attaching classification plus agent groups counts plus the rewritten grouped picker plus selected-agent classification card in the score paste page plus the new Phase 1G-S20 unit test groups plus the updated build-identity endpoint constants for Phase 1G-S20 plus the package version bump from 0.127.0 to 0.128.0 plus the methodology changelog 118th entry plus the README Phase 1G-S20 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S20 internal methodology record (agent classification plus grouped picker)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-07
- Reference
- Phase 1G-S20 founder execution signal: founder live-tested Phase 1G-S19, confirmed the Dataverse plus Global Discovery layer is solved enough that the agent dropdown populates with real agents from the selected environment, and explicitly asked for a split between Microsoft / system agents and custom agents.
- Impact assessment
- Adds a deterministic classifier plus a grouped agent picker. The classifier prefers documented Dataverse origin metadata (ismanaged plus publisher plus solution unique-name prefixes) and falls back to documented Microsoft / Dynamics 365 naming patterns. Conflicting or weak evidence becomes Unknown. The Phase 1G-S2 through Phase 1G-S19 layers are all preserved. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval requires founder review; technical tests alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. Phase 1G-S20 has not been live-tested end to end against a tenant where the classifier produces high-confidence custom-vs-system classifications across the founder's own agents. The founder must run the local route after this slice to confirm that the grouped picker renders and that the selected-agent classification card matches their expectation.
- Re-score optional
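The classifier behaviour recorded in this entry (origin metadata first, naming patterns as a fallback, conflicting or weak evidence becoming Unknown) can be sketched as below. This is an assumption-heavy illustration: the prefixes, the naming pattern, the reason wording, and all type and function names are placeholders, not the documented rules.

```typescript
type Classification = "custom" | "microsoft_system" | "unknown";
type Confidence = "high" | "medium" | "low";

interface AgentOriginMetadata {
  displayName: string;
  isManaged?: boolean;
  publisherUniqueName?: string;
  solutionUniqueName?: string;
}

interface ClassificationResult {
  classification: Classification;
  confidence: Confidence;
  reasons: string[]; // one to three buyer-safe reasons
}

// ASSUMPTION: placeholder prefixes and naming pattern, not the documented ones.
const SYSTEM_PREFIXES: readonly string[] = ["microsoft", "msdyn"];
const SYSTEM_NAME_PATTERN = /dynamics 365/i;

function hasSystemPrefix(value: string): boolean {
  const lower = value.toLowerCase();
  return SYSTEM_PREFIXES.some((prefix) => lower.startsWith(prefix));
}

function classifyAgent(meta: AgentOriginMetadata): ClassificationResult {
  const systemReasons: string[] = [];
  const customReasons: string[] = [];

  // Step 1: origin metadata is preferred over naming patterns.
  const origins: ReadonlyArray<[string, string | undefined]> = [
    ["publisher", meta.publisherUniqueName],
    ["solution", meta.solutionUniqueName],
  ];
  for (const [label, value] of origins) {
    if (value === undefined) continue;
    if (hasSystemPrefix(value)) {
      systemReasons.push(`The ${label} name looks like a Microsoft ${label}.`);
    } else {
      customReasons.push(`The ${label} name looks like your own ${label}.`);
    }
  }
  // In this sketch only an explicit "unmanaged" flag counts as custom evidence.
  if (meta.isManaged === false) {
    customReasons.push("The agent lives in an unmanaged solution.");
  }

  // Step 2: conflicting evidence becomes Unknown rather than a guess.
  if (systemReasons.length > 0 && customReasons.length > 0) {
    return {
      classification: "unknown",
      confidence: "low",
      reasons: ["The origin details point in different directions."],
    };
  }
  if (systemReasons.length > 0) {
    return { classification: "microsoft_system", confidence: "high", reasons: systemReasons.slice(0, 3) };
  }
  if (customReasons.length > 0) {
    const confidence: Confidence = customReasons.length >= 2 ? "high" : "medium";
    return { classification: "custom", confidence, reasons: customReasons.slice(0, 3) };
  }

  // Step 3: naming patterns are used only when no origin metadata exists.
  if (SYSTEM_NAME_PATTERN.test(meta.displayName)) {
    return {
      classification: "microsoft_system",
      confidence: "low",
      reasons: ["The agent name matches a Microsoft naming pattern."],
    };
  }
  return {
    classification: "unknown",
    confidence: "low",
    reasons: ["There was not enough origin information to classify this agent."],
  };
}
```

Because the function has no clock, network, or random input, the same metadata always yields the same label, confidence, and reasons.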
Change id: 1G-S11
Change date: 2026-05-06
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S11 - fix 403 error panel rendering and stuck loading state
Reason: Founder live-tested Phase 1G-S10 in a fresh private browser window. The agentproof version endpoint correctly reported product version 0.118.0 and phase id 1G-S10 (the new build was running). The founder clicked Connect Microsoft, completed Microsoft sign-in, and was returned to AgentProof. The page rendered Disconnect Microsoft (auth status returned signed in). The page text said Microsoft signed in. Loading environments from Microsoft. The browser console showed a real Microsoft 403 against the environment-listing API. The Phase 1G-S10 admin fallback panel did NOT render. The page stayed stuck on the loading state. Phase 1G-S10 is therefore not accepted. The bug was not a Microsoft setup issue: Microsoft correctly returned the 403. The product failed to turn that 403 into the documented Phase 1G-S10 fallback panel.
What changed: Phase 1G-S11 fixes two distinct frontend bugs in the score-paste page component that together caused the visible failure.
Bug A: the post-sign-in auto-load effect included the loading-state flag in its dependency array. Setting the loading-state flag to environments inside the async IIFE changed that dependency while the fetch was in flight, so React ran the effect cleanup, which set the in-flight cancelled flag to true. When the fetch resolved with a real 403, the in-flight closure hit the cancelled guard before storing the structured discovery error. The finally block also guarded the loading-state clear with if not cancelled, so the loading-state flag stayed at environments forever.
Bug B: the manual List environments button stored only a flat buyer-facing error text on non-2xx and never populated the structured discovery-error state, so the Phase 1G-S10 admin fallback panel could not render even when the manual click hit the same 403.
Phase 1G-S11 fixes both paths by extracting the response-to-action mapping into a pure helper module under the Microsoft connector library (the documented 403 body becomes a permission_insufficient action with open admin fallback set to true; 401 becomes auth required; 200 with the discovered shape becomes discovered with the environments array; any other non-2xx becomes non-2xx with a structured discovery error), routing both the auto-load effect and the manual button through a single shared in-component helper, always clearing the loading-state flag in finally regardless of any cancellation flag, dropping the loading-state flag from the auto-load effect dependency array, auto-opening the Permission not visible Show admin fallback disclosure when permission_insufficient hits, and rendering the Microsoft correlation id alongside the diagnostic id in the always-visible portion of the structured error panel so neither field is hidden behind a details element.
Adds a Phase 1G-S11 unit test that exercises the pure mapper with the founder's exact 403 body (including the documented diagnostic id 32681be22540fef0 and Microsoft correlation id 098b8893-2bcd-4ae5-bacd-048976218f30) and pins source-code shape regressions: no loading-state flag in the auto-load effect dependency array, no if not cancelled guard around the loading-state clear, the manual button uses the shared helper rather than its own inline fetch, the mapper module is pure (no fetch, no clock, no random, no env, no React import, no token-shaped reference), and the permission name spelling stays plural (environment management.Environments.Read). Updates the build-identity endpoint constants to product version 0.119.0 and phase id 1G-S11. Backwards-compatible Phase 1G-S8, Phase 1G-S9, and Phase 1G-S10 alias exports resolve to the Phase 1G-S11 values so existing imports keep compiling. Adds a Phase 1G-S11 evidence file under the content tree. Package version bumped from 0.118.0 to 0.119.0. The current-methodology summary product version is bumped to 0.119.0. The methodology changelog now carries 109 entries with change id 1G-S11 last (source type: internal methodology; re-score recommended: false). The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default.
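The response-to-action mapping described in this entry can be sketched as a pure function returning a discriminated union. The field names in the response body (`safe_error_code`, `diagnostic_id`, `microsoft_correlation_id`, `environments`), the type names, and the function name are assumptions for illustration; the entry documents only the four outcomes.

```typescript
interface EnvironmentSummary {
  id: string;
  displayName: string;
}

interface DiscoveryError {
  httpStatus: number;
  safeErrorCode: string;
  diagnosticId: string | null;
  microsoftCorrelationId: string | null;
}

type EnvironmentsAction =
  | { kind: "discovered"; environments: EnvironmentSummary[] }
  | { kind: "auth_required" }
  | { kind: "permission_insufficient"; openAdminFallback: true; error: DiscoveryError }
  | { kind: "non_2xx"; error: DiscoveryError };

// Reads one property from an untyped JSON body without throwing.
function field(body: unknown, key: string): unknown {
  return typeof body === "object" && body !== null
    ? (body as Record<string, unknown>)[key]
    : undefined;
}

function stringField(body: unknown, key: string): string | null {
  const value = field(body, key);
  return typeof value === "string" ? value : null;
}

function isEnvironmentSummary(value: unknown): value is EnvironmentSummary {
  return typeof field(value, "id") === "string" && typeof field(value, "displayName") === "string";
}

// Pure: no fetch, no clock, no random, no environment access.
function mapEnvironmentsResponse(status: number, body: unknown): EnvironmentsAction {
  const environments = field(body, "environments");
  if (status === 200 && Array.isArray(environments) && environments.every(isEnvironmentSummary)) {
    return { kind: "discovered", environments };
  }
  if (status === 401) {
    return { kind: "auth_required" };
  }
  const error: DiscoveryError = {
    httpStatus: status,
    safeErrorCode: stringField(body, "safe_error_code") ?? "unknown",
    diagnosticId: stringField(body, "diagnostic_id"),
    microsoftCorrelationId: stringField(body, "microsoft_correlation_id"),
  };
  if (status === 403 && error.safeErrorCode === "permission_insufficient") {
    return { kind: "permission_insufficient", openAdminFallback: true, error };
  }
  // ASSUMPTION: a 200 whose body is not the discovered shape also lands here;
  // the entry does not say how that case is handled.
  return { kind: "non_2xx", error };
}
```

Because the mapper takes a status and a parsed body rather than a live response, it can be unit-tested with a recorded 403 body and no fetch mock, which matches the testing approach this entry describes.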
User impact: The score paste page now correctly turns a real Microsoft 403 on the environment-listing API into the documented Phase 1G-S10 admin fallback panel instead of staying stuck on Microsoft signed in. Loading environments from Microsoft. The loading indicator clears as soon as the API resolves. The headline Microsoft signed you in but blocked the environment-listing API renders. The distinction between app management.Application (reads application packages) and environment management.Environments.Read (the documented permission required to list environments) renders. The Permission not visible Show admin fallback disclosure auto-opens on permission insufficient so the admin handoff text plus the Power Platform API app id 8578e004-a5c6-46e7-913e-12f58912df43 plus the Microsoft Graph power shell snippet plus the Azure CLI snippet plus the official Microsoft doc links render on first paint. The diagnostic id and Microsoft correlation id render alongside each other in the always-visible portion of the panel. The List environments and Disconnect Microsoft buttons remain clickable. The build-identity endpoint now reports product version 0.119.0 and phase id 1G-S11. UI product approval still requires founder review; technical tests alone do not approve the UI. Internal milestones are not success - success means the product sells.
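The loading-state part of this fix (the flag is always cleared in finally, whatever the cancellation flag says) can be sketched without React. The state shape and function names below are hypothetical; the sketch only shows where the cancellation guard belongs.

```typescript
type LoadingState = "idle" | "environments";

// Hypothetical stand-in for the page's component state.
interface DiscoveryPageState {
  loading: LoadingState;
  lastHttpStatus: number | null;
}

async function loadEnvironments(
  state: DiscoveryPageState,
  fetchStatus: () => Promise<number>,
  isCancelled: () => boolean,
): Promise<void> {
  state.loading = "environments";
  try {
    const status = await fetchStatus();
    // The cancellation guard protects only the result write...
    if (!isCancelled()) {
      state.lastHttpStatus = status;
    }
  } finally {
    // ...and never the loading flag. Guarding this line with
    // "if not cancelled" is what leaves a page stuck on its loading text.
    state.loading = "idle";
  }
}
```

With this shape, a cancelled request can at worst drop its result; it cannot leave the loading indicator on.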
When to re-score: Phase 1G-S11 is a frontend rendering and state-handling corrective slice. The error panel JSX changes plus the helper extraction do not change scoring, do not change /api/score, and do not change the rendered scorecard Markdown. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 stays at 99fae010.
Docs, Tests, Content, Methodology, Components, Lib, Api, Primary buyer surface
Evidence trace: the new Phase 1G-S11 content evidence file with founder observed state plus root cause plus fix summary plus regression tests, the new pure helper module under the Microsoft connector library exposing the discriminated-union action type, the score-paste page helper extraction plus auto-load effect dep change plus manual button refactor plus auto-open admin fallback on permission insufficient plus always-visible correlation id, the new Phase 1G-S11 unit test exercising the founder's exact 403 body, the updated build-identity endpoint constants for Phase 1G-S11, the package version bump from 0.118.0 to 0.119.0, the methodology changelog 109th entry, the README Phase 1G-S11 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S11 internal methodology record (fix 403 error panel rendering plus stuck loading state)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-06
- Reference
- Phase 1G-S11 founder execution signal: founder live-tested Phase 1G-S10 in a fresh private browser window, /api/agentproof/version reported product version 0.118.0 and phase id 1G-S10, /api/connectors/microsoft/auth/status returned signed_in, /api/connectors/microsoft/environments returned 403 Forbidden, the page stayed stuck on Microsoft signed in. Loading environments from Microsoft, and the Phase 1G-S10 fallback panel did not render.
- Impact assessment
- Fixes the Phase 1G-S10 frontend rendering plus state-handling failure that left the score paste page stuck on Loading after a real Microsoft 403 on the environment-listing endpoint. Extracts the response-to-action mapping into a pure helper module that can be unit-tested with the founder's exact 403 body (no JSDOM, no React Testing Library, no fetch mock). Routes both the post-sign-in auto-load effect and the manual List environments button through a single shared in-component helper. Always clears the loading-state flag in finally regardless of any cancellation flag. Drops the loading-state flag from the auto-load effect dependency array. Auto-opens the Permission not visible Show admin fallback disclosure on permission_insufficient. Renders the Microsoft correlation id alongside the diagnostic id in the always-visible portion of the structured error panel. Adds a Phase 1G-S11 unit test with the founder's exact 403 body plus source-code shape regression tests. Updates the build-identity endpoint to product version 0.119.0 and phase id 1G-S11. The Phase 1G-S2 OAuth flow plus the Phase 1G-S5 wizard plus the Phase 1G-S6 server diagnostics plus the Phase 1G-S7 globalThis store plus the Phase 1G-S8 HMAC pending handshake cookie and auth trace ring buffer plus the Phase 1G-S9 hydration repair plus the Phase 1G-S10 evidence file and admin handoff helper module are all preserved unchanged. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval requires founder review; technical tests alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The Phase 1G-S10 archive is superseded as the local-development baseline by this Phase 1G-S11 archive because Phase 1G-S10 left the founder stuck on Loading after a real Microsoft 403. Phase 1G-S11 has not been live-tested against the real Microsoft tenant by the AgentProof developer. The founder must run the local route and confirm the score paste page now renders the Phase 1G-S10 admin fallback panel after the 403 instead of staying on Loading.
- Re-score optional
Change id: 1G-S12
Change date: 2026-05-06
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S12 - force visible Microsoft 403 panel after failed environment fetch
Reason: Founder live-tested Phase 1G-S11. The auto-load stuck-on-Loading bug from Phase 1G-S10 was gone (Phase 1G-S11 fixed that part). However, the structured 403 panel still did not render. The founder screenshot showed Disconnect Microsoft visible plus the List environments button plus the dropdown placeholder Click List environments to discover. The browser console showed GET environments 403 Forbidden twice. The expected Phase 1G-S10 admin fallback panel content (Microsoft signed you in but blocked the environment-listing API plus Permission not visible Show admin fallback plus admin handoff text plus diagnostic id plus correlation id) was completely absent from the visible page. Phase 1G-S11 is therefore not accepted. Root cause: the Phase 1G-S10 panel lived inside the discovery section but far below the environment picker (after the agents picker, the footprint button, the result panel, and the confirmation questions). It was off-screen for the founder's viewport, and it was gated on several conditions that all had to hold at once: the Microsoft discovery error being truthy, the operation being list environments, the safe error code being permission_insufficient, the Microsoft auth status being signed_in, and the Microsoft admin handoff disclosure being open. If any one of them was false, the panel was silently hidden.
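The root cause above can be illustrated with the two visibility rules side by side. This is a sketch with hypothetical names and a hypothetical `list_environments` operation value; it shows why a five-way AND gate is fragile compared with a single non-null check on a dedicated error state.

```typescript
// Hypothetical inputs mirroring the five conditions listed in the root cause.
interface LegacyGateInputs {
  discoveryError: object | null;
  operation: string;
  safeErrorCode: string;
  authStatus: string;
  adminHandoffOpen: boolean;
}

// Legacy-style gate: every condition must hold, so any single false hides the panel.
function legacyPanelVisible(inputs: LegacyGateInputs): boolean {
  return (
    inputs.discoveryError !== null &&
    inputs.operation === "list_environments" &&
    inputs.safeErrorCode === "permission_insufficient" &&
    inputs.authStatus === "signed_in" &&
    inputs.adminHandoffOpen
  );
}

// Single-condition gate: the panel is visible whenever the dedicated state exists.
function dedicatedPanelVisible(fetchError: object | null): boolean {
  return fetchError !== null;
}
```

In the legacy gate a real 403 with the disclosure closed yields no panel at all, even though an error is present; the single-condition gate cannot fail that way.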
What changed: Phase 1G-S12 introduces a dedicated visible-error state plus a dedicated presentational component plus a dedicated render position. New pure builder module under the Microsoft connector library exports microsoft environment fetch error state plus build microsoft environment fetch error from http response (called once for every non-2xx response) plus build microsoft environment fetch error from exception (called inside the fetch-handler catch block). The dedicated state carries http status plus status plus operation plus safe error code plus safe message plus buyer facing summary plus diagnostic id plus microsoft correlation id plus endpoint host plus raw safe body present. New presentational component microsoft environment fetch error panel renders the panel unconditionally whenever the dedicated state is non-null. The panel does NOT condition on auth status, NOT on operation name, NOT on safe error code, NOT on the admin-handoff disclosure state. On HTTP 403 it renders the documented permission slash admin fallback content (headline Microsoft signed you in but blocked the environment-listing API plus body AgentProof reached Microsoft but Microsoft refused the read-only environment list call plus the app management.Application reads application packages distinction plus the documented environment management.Environments.Read permission name plus the always-visible HTTP status plus endpoint host plus diagnostic id plus Microsoft correlation id plus the visible Permission not visible Show admin fallback toggle plus the official Power Platform API app id 8578e004-a5c6-46e7-913e-12f58912df43 plus the Microsoft Graph power shell snippet plus the Azure CLI snippet plus the copyable admin handoff text plus links to the official Microsoft docs plus the honest blocker statement). On any other non-2xx the panel renders a generic Microsoft refused panel with the same diagnostic fields visible. 
The score paste page renders this panel directly under the environment picker card so it cannot be off-screen below the agents picker, footprint, and confirmation tree. The shared environments fetch-and-apply helper is updated to populate the dedicated state for any non-2xx response before the existing discriminated-union dispatch (so a future state clear inside any branch cannot wipe the visible error before render) and to populate the state inside the catch block for thrown fetch or parse exceptions. The dedicated state is cleared at retry start (the helper sets it to null at the top of every invocation), on a successful retry (the discovered branch sets it to null), and on Disconnect Microsoft. Adds a Phase 1G-S12 unit test that uses renderToStaticMarkup to render the new presentational component with the founder's exact 403 body and asserts every visible string the founder must see (headline, body, app management.Application, environment management.Environments.Read, HTTP status 403, api.powerplatform.com, 32681be22540fef0, 098b8893-2bcd-4ae5-bacd-048976218f30, Permission not visible Show admin fallback, 8578e004-a5c6-46e7-913e-12f58912df43, New-MgServicePrincipal, az ad sp create), negative safety (no Bearer, no eyJ, no client secret, no tenant id, no founder PowerShell instruction, no random Entra clicking), and a regression test does-not-repeat-founder-S11-failure that proves the visible panel root, the diagnostic ids, the permission-not-visible toggle, and the app id all render together for the founder's exact 403 body. Updates the build-identity endpoint constants to product version 0.120.0 and phase id 1G-S12. Backwards-compatible Phase 1G-S8, Phase 1G-S9, Phase 1G-S10, and Phase 1G-S11 alias exports resolve to the Phase 1G-S12 values so existing imports keep compiling. Adds a Phase 1G-S12 evidence file under the content tree.
Package version bumped from 0.119.0 to 0.120.0. The current-methodology summary product version is bumped to 0.120.0. The methodology changelog now carries 110 entries with change id 1G-S12 last (source type: internal methodology; re-score recommended: false). The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default.
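The dedicated visible-error state and its two builders can be sketched as follows. The field list follows this entry; the camelCase names, the `status` and `operation` values, the body field names, and the fallback wording are assumptions made for illustration.

```typescript
interface MicrosoftEnvironmentFetchErrorState {
  httpStatus: number | null; // null when the fetch threw before any response arrived
  status: "http_error" | "exception";
  operation: "list_environments";
  safeErrorCode: string;
  safeMessage: string;
  buyerFacingSummary: string;
  diagnosticId: string | null;
  microsoftCorrelationId: string | null;
  endpointHost: string;
  rawSafeBodyPresent: boolean;
}

const ENDPOINT_HOST = "api.powerplatform.com";

function readString(body: unknown, key: string): string | null {
  if (typeof body !== "object" || body === null) return null;
  const value = (body as Record<string, unknown>)[key];
  return typeof value === "string" ? value : null;
}

// Called once for every non-2xx response.
function buildFetchErrorFromHttpResponse(
  httpStatus: number,
  body: unknown,
): MicrosoftEnvironmentFetchErrorState {
  const forbidden = httpStatus === 403;
  return {
    httpStatus,
    status: "http_error",
    operation: "list_environments",
    safeErrorCode:
      readString(body, "safe_error_code") ??
      (forbidden ? "permission_insufficient" : "microsoft_refused"),
    safeMessage: `Microsoft returned HTTP ${httpStatus} for the environment list call.`,
    buyerFacingSummary: forbidden
      ? "Microsoft signed you in but blocked the environment-listing API"
      : "Microsoft refused the environment list request",
    diagnosticId: readString(body, "diagnostic_id"),
    microsoftCorrelationId: readString(body, "microsoft_correlation_id"),
    endpointHost: ENDPOINT_HOST,
    rawSafeBodyPresent: typeof body === "object" && body !== null,
  };
}

// Called inside the fetch handler's catch block. The thrown value is
// deliberately not echoed, so nothing token-shaped can leak into the panel.
function buildFetchErrorFromException(_error: unknown): MicrosoftEnvironmentFetchErrorState {
  return {
    httpStatus: null,
    status: "exception",
    operation: "list_environments",
    safeErrorCode: "fetch_failed",
    safeMessage: "The environment list request did not complete.",
    buyerFacingSummary: "AgentProof could not reach Microsoft",
    diagnosticId: null,
    microsoftCorrelationId: null,
    endpointHost: ENDPOINT_HOST,
    rawSafeBodyPresent: false,
  };
}
```

A page using this shape would set the state to null at retry start, on a successful retry, and on disconnect, and render the panel whenever the state is non-null.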
User impact: The score paste page now correctly turns a real Microsoft 403 on the environment-listing API into a VISIBLE buyer-facing panel directly under the environment picker. The user no longer needs to open dev tools to see the error. The headline Microsoft signed you in but blocked the environment-listing API renders. The body AgentProof reached Microsoft renders. The app management.Application versus environment management.Environments.Read distinction renders. HTTP status 403 plus endpoint host api.powerplatform.com plus the safe diagnostic id plus the Microsoft correlation id render in always-visible diagnostics. The Permission not visible Show admin fallback toggle is visible. When the user expands the toggle the official Power Platform API app id 8578e004-a5c6-46e7-913e-12f58912df43 plus the Microsoft Graph power shell snippet plus the Azure CLI snippet plus the copyable admin handoff text plus the three official Microsoft doc links plus the honest blocker statement render. The List environments button remains usable for retry. On retry the panel disappears at retry start and re-appears if the retry also fails. The build-identity endpoint now reports product version 0.120.0 and phase id 1G-S12. UI product approval still requires founder review; technical tests alone do not approve the UI. Internal milestones are not success - success means the product sells.
When to re-score: Phase 1G-S12 is a frontend rendering corrective slice. Adds a presentational component plus a dedicated visible-error state plus a placement change. Does not change scoring, the score response shape, or the rendered scorecard Markdown. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 stays at 99fae010.
Docs, Tests, Content, Methodology, Components, Lib, Api, Primary buyer surface
Evidence trace: the new Phase 1G-S12 content evidence file with founder observed state plus root cause plus fix summary plus regression tests, the new pure builder module under the Microsoft connector library, the new presentational component under the form component tree, the score-paste page wiring (state hook plus helper dispatch plus Disconnect clear plus directly-under-picker render), the new Phase 1G-S12 unit test exercising the founder's exact 403 body via renderToStaticMarkup, the updated build-identity endpoint constants for Phase 1G-S12, the package version bump from 0.119.0 to 0.120.0, the methodology changelog 110th entry, the README Phase 1G-S12 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S12 internal methodology record (force visible Microsoft 403 panel)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-06
- Reference
- Phase 1G-S12 founder execution signal: founder live-tested Phase 1G-S11. The auto-load loading-state stuck bug from Phase 1G-S10 was gone but the structured 403 panel still did not render. Visible UI: Disconnect Microsoft button plus List environments button plus dropdown placeholder Click List environments to discover. Browser console: GET environments 403 Forbidden twice. Expected Phase 1G-S10 admin fallback panel content was completely absent from the visible page.
- Impact assessment
- Fixes the Phase 1G-S11 visible-error gap by introducing a dedicated visible-error state populated for every non-2xx response and every thrown fetch or parse exception, a dedicated presentational component that renders unconditionally whenever the state is non-null with no gates on auth status or operation or safe error code or admin-handoff disclosure, and a placement directly under the environment picker card. Adds component-render tests using renderToStaticMarkup that exercise the founder's exact 403 body and assert every visible string the founder must see plus negative safety rules plus a regression test that proves the visible panel root plus diagnostic ids plus permission-not-visible toggle plus AppId render together. Updates the build-identity endpoint to product version 0.120.0 and phase id 1G-S12. The Phase 1G-S2 OAuth flow plus the Phase 1G-S5 wizard plus the Phase 1G-S6 server diagnostics plus the Phase 1G-S7 globalThis store plus the Phase 1G-S8 HMAC pending handshake cookie and auth trace ring buffer plus the Phase 1G-S9 hydration repair plus the Phase 1G-S10 evidence file and admin handoff helper module plus the Phase 1G-S11 shared fetch helper and pure mapper are all preserved. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval requires founder review; technical tests alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The Phase 1G-S11 archive is superseded as the local-development baseline by this Phase 1G-S12 archive because Phase 1G-S11 left the founder with a dead environment dropdown and a console-only 403 instead of a visible panel. Phase 1G-S12 has not been live-tested against the real Microsoft tenant by the AgentProof developer; the founder must run the local route and screenshot the now-visible panel directly under the environment picker.
- Re-score optional
Change id: 1G-S13
Change date: 2026-05-06
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S13 - buyer-friendly blocked-connection flow
Reason: Founder live-tested Phase 1G-S12. The Microsoft 403 panel now renders directly under the environment picker (Phase 1G-S12 fixed the visibility defect), but the buyer-facing primary panel exposed too much technical and admin language: HTTP 403, permission insufficient, environment management.Environments.Read, app management.Application, service principal, app id 8578e004-a5c6-46e7-913e-12f58912df43, Microsoft Graph power shell snippet, Azure CLI snippet, Entra picker references, tenant references, correlation id, diagnostic id. Founder feedback: I agree, no technical jargon, and no admin permissions, this must work as intended. The buyer must not see permission names, service principals, power shell, API ids, HTTP status, or Entra jargon on the primary path. AgentProof must behave like a self-serve product: it tries the safe read-only connection, and if Microsoft blocks it, it explains the situation in plain English, gives one simple next step, and hides technical details behind Advanced details for IT.
What changed: Phase 1G-S13 keeps the Phase 1G-S12 visibility fix (the panel still renders directly under the environment picker, still keys off the dedicated visible-error state, still has no auth status / operation name / safe error code / disclosure-state gates) but rewrites the panel as a calm buyer-facing blocked-connection flow with one plain-English headline (Microsoft blocked the read-only connection), one plain-English body (AgentProof signed in successfully, but your organisation has not allowed this read-only discovery connection yet; AgentProof has not read any business records), one simple next step (Send a short request to your IT team so they can allow the connection), three primary buttons (Copy IT request; Try again; Disconnect Microsoft), and a SECONDARY collapsed disclosure labelled Advanced details for IT that holds every technical field. The Advanced details disclosure is COLLAPSED by default; the technical content is NOT rendered to the DOM until the buyer expands it. Adds a new pure builder module under the Microsoft connector library that exposes build microsoft itrequest text. The buyer-facing Copy IT request button copies a deterministic plain-English request that opens with a Plain summary block (suitable for non-technical staff) followed by a Technical details for IT block with the documented endpoint host, the safe diagnostic id, the Microsoft correlation id, the documented required permission name, the Power Platform API service principal app id, and the three official Microsoft documentation URLs. Removes the Phase 1G-S11 auto-open call inside the helper so Advanced details for IT stays collapsed by default. Adds new state hooks for the Copy IT request status and new in-component handlers for Try again and Disconnect Microsoft that route through the same shared environments fetch helper and the existing disconnect endpoint respectively. Updates the build-identity endpoint constants to product version 0.121.0 and phase id 1G-S13. 
Backwards-compatible Phase 1G-S8, Phase 1G-S9, Phase 1G-S10, Phase 1G-S11, and Phase 1G-S12 alias exports resolve to the Phase 1G-S13 values so existing imports keep compiling. Adds a Phase 1G-S13 evidence file under the content tree. Adds Phase 1G-S13 component-render tests that exercise the buyer-friendly panel with the founder's exact 403-derived state and assert that the primary visible HTML contains the allowed buyer-friendly strings and none of the documented forbidden technical strings before the buyer expands Advanced details for IT, plus assertions that after expansion the Advanced details body renders the documented technical fields, plus IT request copy-text assertions, plus regression tests, plus negative safety rules. Updates the existing Phase 1G-S12 panel tests to expect the new buyer-friendly default plus the technical-only-when-expanded behaviour. Package version bumped from 0.120.0 to 0.121.0. The current-methodology summary product version is bumped to 0.121.0. The methodology changelog now carries 111 entries with change id 1G-S13 last (source type: internal methodology; re-score recommended: false). The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default.
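The deterministic IT request text described in this entry can be sketched as a pure builder: a Plain summary block first, then a Technical details for IT block. The input type, the field labels, and the sentence wording are assumptions for illustration, and the documentation URLs are passed in by the caller rather than guessed here.

```typescript
// Hypothetical input shape; every value is supplied by the caller.
interface ITRequestInput {
  endpointHost: string;
  diagnosticId: string;
  microsoftCorrelationId: string;
  requiredPermission: string;
  servicePrincipalAppId: string;
  documentationUrls: readonly string[];
}

// Pure and deterministic: no clock, no random, no network access.
function buildMicrosoftITRequestText(input: ITRequestInput): string {
  const lines: string[] = [
    "Plain summary",
    "Microsoft allowed sign-in but blocked the read-only discovery connection.",
    "AgentProof has not read any business records.",
    "",
    "Technical details for IT",
    `Endpoint host: ${input.endpointHost}`,
    `AgentProof safe diagnostic id: ${input.diagnosticId}`,
    `Microsoft correlation id: ${input.microsoftCorrelationId}`,
    `Documented required permission: ${input.requiredPermission}`,
    `Power Platform API service principal app id: ${input.servicePrincipalAppId}`,
    "Official Microsoft documentation:",
    ...input.documentationUrls.map((url) => `- ${url}`),
  ];
  return lines.join("\n");
}
```

Putting the plain summary first lets a non-technical reader stop after the first block, while IT staff find everything they need in the second.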
User impact: The score paste page now turns a real Microsoft 403 on the environment-listing API into a calm buyer-facing blocked-connection panel with one plain-English headline, one body explaining what AgentProof did and did not read, one next step, and three buttons (Copy IT request, Try again, Disconnect Microsoft). The buyer no longer sees HTTP status numbers, permission names, service principal app id, power shell snippets, or Entra picker references on the primary path. The Copy IT request button copies a deterministic plain-English request with a short summary suitable for non-technical staff and the documented technical details only inside the copied body for the IT team. The buyer can click Advanced details for IT to expand a secondary disclosure that contains every technical field for IT to act on; the disclosure is collapsed by default. Try again calls the same shared environments fetch helper. Disconnect Microsoft drops local state and clears the panel. The build-identity endpoint now reports product version 0.121.0 and phase id 1G-S13. UI product approval still requires founder review; technical tests alone do not approve the UI. Internal milestones are not success - success means the product sells.
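The collapsed-by-default behaviour can be sketched with a text-only renderer plus a forbidden-term check. The function names and the short forbidden list are assumptions (a small sample, not the documented list); the headline, body, next step, and button labels are taken from this entry.

```typescript
interface BlockedConnectionDetails {
  httpStatus: number;
  endpointHost: string;
  diagnosticId: string;
  microsoftCorrelationId: string;
}

// Primary, buyer-facing lines; these are always rendered.
const PRIMARY_LINES: readonly string[] = [
  "Microsoft blocked the read-only connection",
  "AgentProof signed in successfully, but your organisation has not allowed this read-only discovery connection yet. AgentProof has not read any business records.",
  "Send a short request to your IT team so they can allow the connection.",
  "[Copy IT request] [Try again] [Disconnect Microsoft]",
  "Advanced details for IT",
];

// Technical lines are produced only when the disclosure is open.
function renderPanelText(details: BlockedConnectionDetails, advancedOpen: boolean): string[] {
  if (!advancedOpen) return [...PRIMARY_LINES];
  return [
    ...PRIMARY_LINES,
    `HTTP status: ${details.httpStatus}`,
    `Endpoint host: ${details.endpointHost}`,
    `Diagnostic id: ${details.diagnosticId}`,
    `Microsoft correlation id: ${details.microsoftCorrelationId}`,
  ];
}

// ASSUMPTION: an illustrative sample of terms kept off the primary path.
const FORBIDDEN_ON_PRIMARY_PATH: readonly string[] = [
  "HTTP",
  "403",
  "PowerShell",
  "service principal",
  "correlation id",
];

// Returns the forbidden terms that appear in the given lines (case-insensitive).
function findForbiddenTerms(lines: readonly string[]): string[] {
  const text = lines.join("\n").toLowerCase();
  return FORBIDDEN_ON_PRIMARY_PATH.filter((term) => text.includes(term.toLowerCase()));
}
```

A render test in this style asserts that `findForbiddenTerms` returns an empty list for the collapsed panel and that the technical fields appear only after expansion.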
When to re-score: Phase 1G-S13 is a buyer-experience corrective slice. Replaces the technical 403 panel copy with plain English, hides technical fields behind a collapsed Advanced details for IT disclosure, adds a Copy IT request button. Does not change scoring, the score response shape, or the rendered scorecard Markdown. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 stays at 99fae010.
Docs, Tests, Content, Methodology, Components, Lib, Api, Primary buyer surface
Evidence trace: the new Phase 1G-S13 content evidence file with founder feedback plus buyer experience problem plus canonical decision plus primary panel allowed terms plus primary panel forbidden terms plus IT request contents plus advanced details contents plus technical honesty constraints plus safety invariants preserved plus product approval status, the new pure builder module under the Microsoft connector library exposing buildMicrosoftITRequestText, the rewritten presentational component with the buyer-friendly default and the collapsed Advanced details for IT disclosure, the score paste page wiring (Copy IT request copy-status state hook plus three new handlers plus the panel call passing the new prop set plus the helper auto-open removed), the new Phase 1G-S13 component-render test exercising the founder's exact 403 body via renderToStaticMarkup, the updated Phase 1G-S12 component tests that now expect the buyer-friendly default plus the technical-only-when-expanded behaviour, the updated build-identity endpoint constants for Phase 1G-S13, the package version bump from 0.120.0 to 0.121.0, the methodology changelog 111th entry, the README Phase 1G-S13 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S13 internal methodology record (buyer-friendly blocked-connection flow)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-06
- Reference
- Phase 1G-S13 founder execution signal: founder live-tested Phase 1G-S12, confirmed the 403 panel renders directly under the environment picker, but observed the panel exposed too much technical and admin language. Founder feedback: I agree, no technical jargon, and no admin permissions, this must work as intended.
- Impact assessment
- Fixes the Phase 1G-S12 buyer-experience gap by rewriting the panel as a calm buyer-facing blocked-connection flow (one plain-English headline plus one body plus one next step plus three primary buttons) and hiding every technical field behind a secondary collapsed Advanced details for IT disclosure that is closed by default. Adds the new pure builder module under the Microsoft connector library exposing buildMicrosoftITRequestText, rewrites the presentational component, wires three new in-component handlers (Copy IT request, Try again, Disconnect Microsoft) plus a new copy-status state hook, removes the Phase 1G-S11 auto-open of Advanced details for IT, and updates the build-identity endpoint to product version 0.121.0 and phase id 1G-S13. Adds Phase 1G-S13 component-render tests that exercise the panel with the founder's exact 403-derived state and assert that the primary HTML contains the allowed buyer-friendly strings and none of the documented forbidden technical strings before the buyer expands Advanced details for IT, plus assertions that after expansion the Advanced details body renders the documented technical fields, plus IT request copy-text assertions, plus regression tests, plus negative safety rules. The Phase 1G-S2 OAuth flow plus the Phase 1G-S5 wizard plus the Phase 1G-S6 server diagnostics plus the Phase 1G-S7 globalThis store plus the Phase 1G-S8 HMAC pending handshake cookie and auth trace ring buffer plus the Phase 1G-S9 hydration repair plus the Phase 1G-S10 evidence file and admin handoff helper module plus the Phase 1G-S11 shared fetch helper and pure mapper plus the Phase 1G-S12 dedicated visible-error state and presentational placement are all preserved. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval requires founder review; technical tests alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The Phase 1G-S12 archive is superseded as the local-development baseline by this Phase 1G-S13 archive because Phase 1G-S12 still exposed too much technical and admin language to the buyer on the primary path. Phase 1G-S13 has not been live-tested against the real Microsoft tenant by the AgentProof developer; the founder must run the local route and screenshot the now-buyer-friendly panel directly under the environment picker, then click Advanced details for IT and screenshot the expanded technical fields.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S14
Change date: 2026-05-06
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S14 - final buyer copy polish and move-forward gate
Reason: Founder live-tested Phase 1G-S13. The buyer-facing blocked-connection panel is plain English, the primary panel no longer exposes technical Microsoft jargon by default, the Copy IT request button works, the copied IT request is mostly usable. Founder feedback: Then do what you have to do but lets move forward, it feels to me we are stuck. Phase 1G-S13 is directionally correct. Phase 1G-S14 is a small final copy-polish patch plus an explicit move-forward gate.
What changed: Phase 1G-S14 makes one factual correction in the copied IT request body and locks the message structure exactly as specified in the Phase 1G-S14 instruction. Removes the inaccurate line "AgentProof never runs anything on our behalf" (the line was too absolute: AgentProof does make read-only Microsoft requests after sign-in) and replaces it with the truthful "AgentProof does not make changes in Microsoft". Locks the message structure to the Phase 1G-S14 specification (Hi opener plus "Microsoft allowed sign-in but blocked the read-only discovery connection" plus "AgentProof has not read any business records" plus "Please review and allow" plus Plain summary block plus Technical details for IT block with App name plus Endpoint host plus AgentProof safe diagnostic id plus Microsoft correlation id plus Documented required permission EnvironmentManagement.Environments.Read plus Power Platform API service principal app id 8578e004-a5c6-46e7-913e-12f58912df43 plus three official Microsoft documentation URLs plus Thanks closer). Adds a separate exported SUBJECT line constant microsoft it request subject with the value "Please allow AgentProof read-only discovery access" so a future surface (mailto link, ticket form integration) can pre-fill the subject without re-deriving it from the body. The Phase 1G-S13 buyer-facing blocked-connection panel is preserved; the only buyer-visible change is the copied IT request text. Updates the build-identity endpoint constants to product version 0.122.0 and phase id 1G-S14. Backwards-compatible phase 1 g s8 plus phase 1 g s9 plus phase 1 g s10 plus phase 1 g s11 plus phase 1 g s12 plus phase 1 g s13 alias exports resolve to the Phase 1G-S14 values so existing imports keep compiling.
Adds a Phase 1G-S14 evidence file under the content tree with founder feedback plus copy change summary plus corrected truthful claims plus buyer panel status plus move forward decision plus next recommended product work plus safety invariants preserved plus product approval status. Adds Phase 1G-S14 unit tests asserting the new IT request copy text contains the documented allowed wording (read-only agent discovery check plus Microsoft allowed sign-in but blocked the read-only discovery connection plus AgentProof has not read any business records plus AgentProof does not receive my Microsoft password plus AgentProof does not make changes in Microsoft plus Documented required permission plus EnvironmentManagement.Environments.Read plus Power Platform API service principal app id plus 8578e004-a5c6-46e7-913e-12f58912df43) plus the removed inaccurate wording (must NOT contain "AgentProof never runs anything on our behalf") plus the new SUBJECT export plus regression assertions that the primary buyer panel remains jargon-free plus the safety invariants. Updates the Phase 1G-S13 IT request tests that previously asserted the now-removed plain-English hero phrasing. The package version is bumped from 0.121.0 to 0.122.0. The product version in the current-methodology summary is bumped to 0.122.0. The methodology changelog now carries 112 entries, with change id 1G-S14 last (source type internal methodology, re-score recommended false). The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default.
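As a minimal sketch of the copied IT request described in this entry, the builder below assembles the locked message structure as plain text. The function name buildMicrosoftITRequestText comes from the Phase 1G-S13 entry; the constant name MICROSOFT_IT_REQUEST_SUBJECT, the input shape, the line layout, and the connective wording around the documented phrases are assumptions made for illustration, not the shipped module.

```typescript
// Hypothetical sketch; names and layout are assumptions, not the shipped AgentProof module.
export const MICROSOFT_IT_REQUEST_SUBJECT =
  "Please allow AgentProof read-only discovery access";

// Values quoted in the Phase 1G-S14 entry.
const REQUIRED_PERMISSION = "EnvironmentManagement.Environments.Read";
const POWER_PLATFORM_API_APP_ID = "8578e004-a5c6-46e7-913e-12f58912df43";

export interface ITRequestInput {
  appName: string;
  endpointHost: string;
  safeDiagnosticId: string;
  microsoftCorrelationId: string;
  // The official Microsoft documentation links; supplied by the caller, not invented here.
  documentationUrls: readonly string[];
}

// Pure function: same input always yields the same text, so tests can assert on it directly.
export function buildMicrosoftITRequestText(input: ITRequestInput): string {
  return [
    "Hi,",
    "",
    "I tried to run a read-only agent discovery check with AgentProof.",
    "Microsoft allowed sign-in but blocked the read-only discovery connection.",
    "AgentProof has not read any business records.",
    "Please review and allow the read-only discovery connection.",
    "",
    "Plain summary",
    "- AgentProof does not receive my Microsoft password.",
    "- AgentProof does not make changes in Microsoft.",
    "",
    "Technical details for IT",
    `- App name: ${input.appName}`,
    `- Endpoint host: ${input.endpointHost}`,
    `- AgentProof safe diagnostic id: ${input.safeDiagnosticId}`,
    `- Microsoft correlation id: ${input.microsoftCorrelationId}`,
    `- Documented required permission: ${REQUIRED_PERMISSION}`,
    `- Power Platform API service principal app id: ${POWER_PLATFORM_API_APP_ID}`,
    ...input.documentationUrls.map((url) => `- ${url}`),
    "",
    "Thanks",
  ].join("\n");
}
```

Keeping the subject as its own export means a mailto link or ticket form can reuse it without parsing the body, which is the motivation the entry gives for the separate constant.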
User impact: The copied IT request the buyer sends to their IT team now uses the exact specified message structure. The factually-wrong line AgentProof never runs anything on our behalf is removed and replaced with the truthful AgentProof does not make changes in Microsoft. The buyer-visible primary panel is unchanged from Phase 1G-S13 (Microsoft blocked the read-only connection headline, plain-English body, one next step, three primary buttons, Advanced details for IT collapsed by default). The build-identity endpoint now reports product version 0.122.0 and phase id 1G-S14. The methodology and the README explicitly gate the blocked-connection UX as good enough to proceed unless the founder rejects the wording, and recommend the next product work as Phase 1G-S15 (discovery resume path and blocked-connection status lifecycle). UI product approval still requires founder review; technical tests alone do not approve the UI. Internal milestones are not success - success means the product sells.
When to re-score: Phase 1G-S14 is a final copy-polish slice. Updates the copied IT request wording and adds a SUBJECT export. Does not change scoring, the score response shape, or the rendered scorecard Markdown. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 stays at 99fae010.
Docs, Tests, Content, Methodology, Lib, Api
Evidence trace: the new Phase 1G-S14 content evidence file with founder feedback plus copy change summary plus corrected truthful claims plus buyer panel status plus move forward decision plus next recommended product work plus safety invariants preserved plus product approval status, the updated IT request builder under the Microsoft connector library with the new microsoft it request subject constant plus the locked message structure, the updated Phase 1G-S14 unit tests asserting the new copy plus the removed inaccurate wording plus the new SUBJECT export plus the regression assertions, the updated Phase 1G-S13 IT request test fixture, the updated build-identity endpoint constants for Phase 1G-S14, the package version bump from 0.121.0 to 0.122.0, the methodology changelog 112th entry, the README Phase 1G-S14 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S14 internal methodology record (final buyer copy polish plus move-forward gate)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-06
- Reference
- Phase 1G-S14 founder execution signal: founder live-tested Phase 1G-S13, confirmed the buyer-facing blocked-connection panel is plain English, the primary panel no longer exposes technical Microsoft jargon by default, the Copy IT request button works, the copied IT request is mostly usable. Founder quote: Then do what you have to do but lets move forward, it feels to me we are stuck.
- Impact assessment
- Polishes the final wording of the copied IT request. Removes the factually-wrong line AgentProof never runs anything on our behalf and replaces it with the truthful AgentProof does not make changes in Microsoft. Locks the message structure exactly as specified in the Phase 1G-S14 instruction. Exports a separate SUBJECT line constant for future surfaces (mailto links, ticket form integrations). Adds Phase 1G-S14 unit tests asserting the new copy plus the removed inaccurate wording plus the new SUBJECT export plus regression assertions that the primary buyer panel remains jargon-free plus the safety invariants. Updates the build-identity endpoint to product version 0.122.0 and phase id 1G-S14. Gates the blocked-connection UX as good enough to proceed unless the founder rejects the wording, and recommends the next product work as Phase 1G-S15 discovery resume path and blocked-connection status lifecycle. The Phase 1G-S2 OAuth flow plus the Phase 1G-S5 wizard plus the Phase 1G-S6 server diagnostics plus the Phase 1G-S7 globalThis store plus the Phase 1G-S8 HMAC pending handshake cookie and auth trace ring buffer plus the Phase 1G-S9 hydration repair plus the Phase 1G-S10 evidence file and admin handoff helper module plus the Phase 1G-S11 shared fetch helper and pure mapper plus the Phase 1G-S12 dedicated visible-error state and presentational placement plus the Phase 1G-S13 buyer-friendly blocked-connection flow are all preserved. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval requires founder review; technical tests alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The Phase 1G-S13 archive is superseded as the local-development baseline by this Phase 1G-S14 archive because Phase 1G-S13 contained one factually-wrong line in the copied IT request body. Phase 1G-S14 has not been live-tested against the real Microsoft tenant by the AgentProof developer; the founder must run the local route, click Copy IT request, paste the clipboard content, and confirm the polish. Future product work resumes the discovery and reporting path while treating blocked by organisation as a valid self-serve handoff state.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S15
Change date: 2026-05-06
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S15 - discovery resume path and blocked-connection status lifecycle
Reason: Founder live-tested Phase 1G-S14 and accepted the buyer-friendly blocked-connection panel + the polished copied IT request as good enough to move forward. Founder feedback: stop feeling stuck and move forward; treat blocked by organisation as a normal product state, not a dead end; add a small buyer-friendly discovery lifecycle so users understand the current state in plain English.
What changed: Phase 1G-S15 adds a buyer-friendly discovery status lifecycle to the score paste page. New pure helper module under the Microsoft connector library exposes a discriminated-union status kind (Not connected, Connected, Checking access, Blocked by organisation, IT request copied, Retry available, Discovery allowed plus Environments loaded, Agents ready, Footprint discovered) plus a buildMicrosoftDiscoveryStatus function that takes the existing runtime state inputs (auth status, step in flight, dedicated visible-error state, IT-request copy status, environments count, selected environment id, agents count, selected agent id, footprint discovered) and returns one active status with a buyer-friendly label and description. The labels and descriptions are exactly the documented Phase 1G-S15 strings (Not connected: Connect Microsoft to start read-only discovery; Connected: Microsoft is connected. AgentProof is ready to check access; Checking access: Checking whether your organisation allows read-only discovery; Blocked by organisation: Your organisation has not allowed the read-only discovery connection yet; IT request copied: IT request copied. Send it to your IT team, then try again once access is allowed; Retry available: You can try again after IT allows the connection; Environments loaded: Read-only discovery is allowed. Choose an environment; Agents ready: Agents found. Choose the agent to proof; Footprint discovered: Agent footprint discovered. Confirm only the points AgentProof could not prove). New presentational component under the form components tree renders the active status (label plus description plus Step N of 9 indicator plus a horizontal step dot list) with calm tone classnames (neutral, in flight, blocked, ok). The component carries role="status" and aria-live="polite".
The component is rendered between the Connect or Disconnect Microsoft button block and the setup-needed wizard so it is visible in every lifecycle state including before the buyer even clicks Connect Microsoft. Updates the build-identity endpoint constants to product version 0.123.0 and phase id 1G-S15. Backwards-compatible phase 1 g s8 plus phase 1 g s9 plus phase 1 g s10 plus phase 1 g s11 plus phase 1 g s12 plus phase 1 g s13 plus phase 1 g s14 alias exports resolve to the Phase 1G-S15 values so existing imports keep compiling. Adds a Phase 1G-S15 evidence file under the content tree with founder feedback plus reason for phase plus lifecycle states plus buyer copy plus forbidden primary terms plus behaviour requirements plus implemented files plus tests added plus safety invariants preserved plus move forward position plus next recommended product work plus product approval status. Adds Phase 1G-S15 component-render and source-shape tests asserting the pure helper maps every documented runtime input to the right active status, the panel renders the active label and description and step indicator, the panel and the helper output never contain any documented forbidden term, and the existing Phase 1G-S14 buyer-friendly blocked-connection panel plus the Phase 1G-S14 IT-request copy text remain intact. package version bumped from 0.122.0 to 0.123.0. methodology current methodology summary product version bumped to 0.123.0. methodology changelog now carries 113 entries with change id 1G-S15 last (source type internal methodology, re score recommended false). The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default.
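The status helper described in this entry can be sketched as a pure function over the runtime state. The function name buildMicrosoftDiscoveryStatus and the label and description strings come from this entry; the kind identifiers, the reduced input shape, and above all the precedence of the checks are assumptions. The entry does not spell out when Retry available is selected, so this sketch carries its copy but never returns it.

```typescript
// Hypothetical sketch; kind names, input shape, and precedence are assumptions.
type StatusKind =
  | "not_connected" | "connected" | "checking_access"
  | "blocked_by_organisation" | "it_request_copied" | "retry_available"
  | "environments_loaded" | "agents_ready" | "footprint_discovered";

// Order drives the "Step N of 9" indicator.
const ORDER: readonly StatusKind[] = [
  "not_connected", "connected", "checking_access",
  "blocked_by_organisation", "it_request_copied", "retry_available",
  "environments_loaded", "agents_ready", "footprint_discovered",
];

// Labels and descriptions as documented in the Phase 1G-S15 entry.
const COPY: Record<StatusKind, { label: string; description: string }> = {
  not_connected: { label: "Not connected", description: "Connect Microsoft to start read-only discovery" },
  connected: { label: "Connected", description: "Microsoft is connected. AgentProof is ready to check access" },
  checking_access: { label: "Checking access", description: "Checking whether your organisation allows read-only discovery" },
  blocked_by_organisation: { label: "Blocked by organisation", description: "Your organisation has not allowed the read-only discovery connection yet" },
  it_request_copied: { label: "IT request copied", description: "IT request copied. Send it to your IT team, then try again once access is allowed" },
  retry_available: { label: "Retry available", description: "You can try again after IT allows the connection" },
  environments_loaded: { label: "Environments loaded", description: "Read-only discovery is allowed. Choose an environment" },
  agents_ready: { label: "Agents ready", description: "Agents found. Choose the agent to proof" },
  footprint_discovered: { label: "Footprint discovered", description: "Agent footprint discovered. Confirm only the points AgentProof could not prove" },
};

interface DiscoveryInputs {
  signedIn: boolean;
  stepInFlight: boolean;
  blocked: boolean;          // dedicated visible-error state is set
  itRequestCopied: boolean;
  environmentsCount: number;
  agentsCount: number;
  footprintDiscovered: boolean;
}

interface DiscoveryStatus {
  kind: StatusKind;
  label: string;
  description: string;
  step: number; // 1-based position in ORDER
}

function buildMicrosoftDiscoveryStatus(i: DiscoveryInputs): DiscoveryStatus {
  let kind: StatusKind;
  if (!i.signedIn) kind = "not_connected";
  else if (i.stepInFlight) kind = "checking_access";
  else if (i.blocked) kind = i.itRequestCopied ? "it_request_copied" : "blocked_by_organisation";
  else if (i.footprintDiscovered) kind = "footprint_discovered";
  else if (i.agentsCount > 0) kind = "agents_ready";
  else if (i.environmentsCount > 0) kind = "environments_loaded";
  else kind = "connected";
  return { kind, ...COPY[kind], step: ORDER.indexOf(kind) + 1 };
}
```

Because the function is pure and returns exactly one status, a component-render test can feed it each documented input combination and assert on the label without mounting the page.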
User impact: The score paste page now shows a calm buyer-facing Connection status panel near the Microsoft discovery area. The panel renders one active state at a time with plain-English copy (Not connected, Connected, Checking access, Blocked by organisation, IT request copied, Retry available, Discovery allowed, Agents ready, Footprint discovered) plus a Step N of 9 indicator and a horizontal step dot list. The panel never exposes technical Microsoft jargon. The blocked state uses an amber neutral tone, not a red panic state. After the buyer clicks Copy IT request the panel shows IT request copied with the documented description. After clicking Try again the panel shows Checking access while the fetch runs. If the retry succeeds the panel shows Discovery allowed and the environment dropdown populates. If the retry still 403s the panel shows Blocked by organisation or Retry available depending on the IT-request copy status. After Disconnect Microsoft the panel returns to Not connected. The build-identity endpoint now reports product version 0.123.0 and phase id 1G-S15. UI product approval still requires founder review; technical tests alone do not approve the UI. Internal milestones are not success - success means the product sells.
When to re-score: Phase 1G-S15 is a buyer-experience corrective slice. Adds a discovery status lifecycle. Does not change scoring, the score response shape, or the rendered scorecard Markdown. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 stays at 99fae010.
Docs, Tests, Content, Methodology, Components, Lib, Api, Primary buyer surface
Evidence trace: the new Phase 1G-S15 content evidence file with founder feedback plus reason for phase plus lifecycle states plus buyer copy plus forbidden primary terms plus behaviour requirements plus implemented files plus tests added plus safety invariants preserved plus move forward position plus next recommended product work plus product approval status, the new pure helper module under the Microsoft connector library, the new presentational component under the form components tree, the score-paste page wiring (panel rendered between the Connect or Disconnect Microsoft button block and the setup-needed wizard), the new Phase 1G-S15 component-render and source-shape tests, the updated build-identity endpoint constants for Phase 1G-S15, the package version bump from 0.122.0 to 0.123.0, the methodology changelog 113th entry, the README Phase 1G-S15 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S15 internal methodology record (discovery resume path plus blocked-connection status lifecycle)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-06
- Reference
- Phase 1G-S15 founder execution signal: founder live-tested Phase 1G-S14, accepted the buyer-friendly blocked-connection panel and the polished copied IT request as good enough to move forward. Founder request: stop feeling stuck and move forward; treat blocked by organisation as a normal product state; add a small buyer-friendly discovery lifecycle.
- Impact assessment
- Adds a calm buyer-facing discovery status lifecycle to the score paste page. New pure helper module exposes a status union plus a buildMicrosoftDiscoveryStatus function that maps the runtime state to one active status with a buyer-friendly label and description. New presentational component renders the active status with a Step N of 9 indicator and a horizontal step dot list. Calm tone classnames (neutral, in flight, blocked, ok). Role status and aria-live polite. Rendered between the Connect or Disconnect Microsoft button block and the setup-needed wizard so the panel is visible in every lifecycle state. The Phase 1G-S2 OAuth flow plus the Phase 1G-S5 wizard plus the Phase 1G-S6 server diagnostics plus the Phase 1G-S7 globalThis store plus the Phase 1G-S8 HMAC pending handshake cookie and auth trace ring buffer plus the Phase 1G-S9 hydration repair plus the Phase 1G-S10 evidence file and admin handoff helper module plus the Phase 1G-S11 shared fetch helper and pure mapper plus the Phase 1G-S12 dedicated visible-error state and presentational placement plus the Phase 1G-S13 buyer-friendly blocked-connection flow plus the Phase 1G-S14 final IT-request copy polish are all preserved. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval requires founder review; technical tests alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. Phase 1G-S15 has not been live-tested against the real Microsoft tenant by the AgentProof developer; the founder must run the local route and screenshot the status panel in each lifecycle state to confirm the buyer-friendly copy and tone. Phase 1G-S15 does not implement the recommended Phase 1G-S16 (real discovery continuation after environments are allowed) - that is a separate slice.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S16
Change date: 2026-05-06
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S16 - customer and IT runbook plus Dataverse URL resolution
Reason: Founder live-tested Phase 1G-S15. The documented EnvironmentManagement.Environments.Read permission was added in Microsoft Entra and AgentProof now lists at least one real environment (CRM667668 in Contoso tenant). The next step in the buyer flow (list Copilot Studio agents in the chosen environment) failed with HTTP 422 because the per-environment Dataverse address was missing from the snapshot the previous resolver inspected. The previous resolver only tried four documented field paths and did not normalise the value it returned. Founder instruction: document the entire customer plus IT setup path so any tenant can follow it, fix Dataverse URL resolution so it tries every documented source in order, surface a buyer-friendly panel when all sources fail, and defer Global Discovery Service fallback to a later slice with an honest controlled-next-step note.
What changed: Phase 1G-S16 ships a new customer plus IT runbook in the docs tree at microsoft discovery customer it runbook.md covering what AgentProof needs, what AgentProof does not do, the customer setup path, the IT setup path including Microsoft Entra app registration plus the documented Power Platform API delegated permission EnvironmentManagement.Environments.Read plus admin consent, the Power Platform API not visible in Entra fallback, and the troubleshooting matrix for every documented blocked state (Not connected, Blocked by organisation, IT request copied, Discovery allowed, Could not find the Dataverse URL for this environment, Could not list agents, No agents found, Footprint discovered) plus the canonical Microsoft documentation links. Adds a small buyer-friendly Connection guide for IT collapsed disclosure inside the Choose where your agent lives section so the buyer can read a plain-English summary or jump to the runbook without leaving the primary path. New pure helper module under the Microsoft connector library exposes resolveDataverseUrl plus resolveDataverseUrlFromSnapshot. The helper tries every documented source path in order (the pre-existing normalised dataverse url field, properties.linkedEnvironmentMetadata.instanceApiUrl, properties.linkedEnvironmentMetadata.instanceUrl, top-level linkedEnvironmentMetadata.instanceApiUrl, top-level linkedEnvironmentMetadata.instanceUrl, top-level instanceApiUrl, top-level instanceUrl), validates each candidate as a real HTTPS URL on a real DNS host (rejects IPs, rejects localhost, requires a dot in the host), strips any trailing /api/data/v9.x suffix to normalise to the base Dataverse origin form, and reports provenance flags (linked environment metadata present, instance url present, instance api url present, preexisting dataverse url present, global discovery attempted) so the agents route can populate a documented advanced details for IT block on the buyer-friendly error.
Wires the helper into the agents route. When all documented sources fail the agents route returns a buyer-friendly 422 carrying status dataverse url missing plus operation list agents plus safe error code dataverse url missing plus safe message AgentProof found the environment but Microsoft did not expose a Dataverse address for it plus buyer facing summary plus buyer next action Choose another environment or ask IT to confirm this environment has Dataverse and that your account can access it plus an advanced details block with the provenance flags plus the selected environment id plus the environment display name plus a global discovery status of deferred to future slice per Phase 1G-S16 documentation. The Dataverse Global Discovery Service fallback is documented in the runbook and the helper as a controlled next step but is intentionally deferred to a later AgentProof slice. Wires the helper into the Power Platform client so the per-environment dataverse url field on the AgentProof environment snapshot is populated by the same resolver. Updates the score paste page to detect safe error code dataverse url missing on the agents fetch and render a calm buyer-friendly disclosure under the existing discovery error pipeline (safe message plus next action on the primary path, the new advanced details block as a collapsed Advanced details for IT disclosure showing only documented metadata presence flags plus the selected environment identity, never tokens, never secrets, never business records). Updates the build-identity endpoint constants to product version 0.124.0 and phase id 1G-S16. Backwards-compatible phase 1 g s8 plus phase 1 g s9 plus phase 1 g s10 plus phase 1 g s11 plus phase 1 g s12 plus phase 1 g s13 plus phase 1 g s14 plus phase 1 g s15 alias exports resolve to the Phase 1G-S16 values so existing imports keep compiling. 
Adds a Phase 1G-S16 evidence file under the content tree with founder live-test evidence plus customer it documentation decision plus official Microsoft sources plus runbook sections plus dataverse url resolution strategy plus resolver order plus blocked states plus buyer copy plus advanced it details plus safety invariants preserved plus product approval status plus next recommended product work. Adds Phase 1G-S16 unit tests asserting the runbook file exists and contains every documented section plus the in-product Connection guide for IT disclosure plus the resolver tries every documented source path in order plus normalises instance api url plus derives the base origin from instance url plus prefers nested linked metadata over top-level fields plus rejects non-HTTPS plus rejects malformed URLs plus reports the unresolved state with provenance plus the agents route returns the documented dataverse url missing payload plus the buyer panel renders the documented copy plus the advanced details collapsed disclosure plus the safety invariants (no Bearer tokens, no eyJ JWT prefix, no secrets, no Microsoft env-var values, no business records, no manual environment URL or agent name input on the primary path, connector-agnostic discovery preserved, the score endpoint and the scorecard Markdown unchanged). The package version is bumped from 0.123.0 to 0.124.0. The product version in the current-methodology summary is bumped to 0.124.0. The methodology changelog now carries 114 entries, with change id 1G-S16 last (source type internal methodology, re-score recommended false). The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default.
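The resolver order and validation rules described in this entry can be sketched as follows. This is a sketch under stated assumptions: the camelCase property names are reconstructed from the flattened names in the entry, the result shape is hypothetical, and taking the URL origin is a simplification that drops any path, not only a trailing /api/data/v9.x suffix.

```typescript
// Hypothetical sketch; property names and result shape are assumptions.
interface ResolveResult {
  dataverseUrl: string | null;
  source: string | null; // dotted path of the field that produced the URL
  provenance: {
    linkedEnvironmentMetadataPresent: boolean;
    instanceUrlPresent: boolean;
    instanceApiUrlPresent: boolean;
    preexistingDataverseUrlPresent: boolean;
    globalDiscoveryAttempted: boolean; // always false: the fallback is deferred
  };
}

// Candidate paths, tried in the order the entry documents.
const CANDIDATE_PATHS: readonly (readonly string[])[] = [
  ["dataverseUrl"],
  ["properties", "linkedEnvironmentMetadata", "instanceApiUrl"],
  ["properties", "linkedEnvironmentMetadata", "instanceUrl"],
  ["linkedEnvironmentMetadata", "instanceApiUrl"],
  ["linkedEnvironmentMetadata", "instanceUrl"],
  ["instanceApiUrl"],
  ["instanceUrl"],
];

function readPath(obj: unknown, path: readonly string[]): unknown {
  let cur: unknown = obj;
  for (const key of path) {
    if (typeof cur !== "object" || cur === null) return undefined;
    cur = (cur as Record<string, unknown>)[key];
  }
  return cur;
}

// Accept only HTTPS URLs on a dotted DNS host; return the bare origin.
function normaliseDataverseUrl(candidate: unknown): string | null {
  if (typeof candidate !== "string") return null;
  let url: URL;
  try {
    url = new URL(candidate.trim());
  } catch {
    return null; // malformed URL
  }
  if (url.protocol !== "https:") return null;
  const host = url.hostname;
  if (host === "localhost" || !host.includes(".")) return null;
  if (/^\d{1,3}(\.\d{1,3}){3}$/.test(host) || host.startsWith("[")) return null; // IP literals
  return url.origin;
}

function resolveDataverseUrlFromSnapshot(snapshot: unknown): ResolveResult {
  const hasString = (leaf: string): boolean =>
    CANDIDATE_PATHS.some(
      (p) => p[p.length - 1] === leaf && typeof readPath(snapshot, p) === "string",
    );
  const linked =
    readPath(snapshot, ["properties", "linkedEnvironmentMetadata"]) ??
    readPath(snapshot, ["linkedEnvironmentMetadata"]);
  const provenance = {
    linkedEnvironmentMetadataPresent: typeof linked === "object" && linked !== null,
    instanceUrlPresent: hasString("instanceUrl"),
    instanceApiUrlPresent: hasString("instanceApiUrl"),
    preexistingDataverseUrlPresent: hasString("dataverseUrl"),
    globalDiscoveryAttempted: false,
  };
  for (const path of CANDIDATE_PATHS) {
    const url = normaliseDataverseUrl(readPath(snapshot, path));
    if (url !== null) return { dataverseUrl: url, source: path.join("."), provenance };
  }
  return { dataverseUrl: null, source: null, provenance };
}
```

Returning the provenance flags even on success keeps the function's output shape constant, so the caller can build the Advanced details for IT block without a second pass over the snapshot.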
User impact: AgentProof now ships a customer plus IT runbook so any tenant can follow the documented setup path end to end. The score paste page exposes a small Connection guide for IT disclosure inside the Microsoft discovery section. When the founder picks an environment whose per-environment Dataverse address is missing from the Microsoft response, AgentProof renders a calm buyer-friendly panel saying AgentProof found the environment but Microsoft did not expose a Dataverse address for it with the documented next action Choose another environment or ask IT to confirm this environment has Dataverse and that your account can access it. The collapsed Advanced details for IT disclosure reports which documented metadata fields were present (linked environment metadata, instance url, instance api url, pre-existing dataverse url, Global Discovery attempted) plus the selected environment identity. None of these fields contains tokens, secrets, or business records. The build-identity endpoint now reports product version 0.124.0 and phase id 1G-S16. UI product approval still requires founder review; technical tests alone do not approve the UI. Internal milestones are not success - success means the product sells.
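The buyer-friendly 422 behind this panel might be assembled along these lines. The message and next-action strings are quoted from this entry; the field names, the snake_case code values, and the wrapper shape are assumptions for illustration only.

```typescript
// Hypothetical sketch; field names and code values are assumptions.
interface DataverseProvenanceFlags {
  linkedEnvironmentMetadataPresent: boolean;
  instanceUrlPresent: boolean;
  instanceApiUrlPresent: boolean;
  preexistingDataverseUrlPresent: boolean;
  globalDiscoveryAttempted: boolean;
}

interface DataverseUrlMissingResponse {
  httpStatus: 422;
  body: {
    status: string;
    operation: string;
    safeErrorCode: string;
    safeMessage: string;
    buyerNextAction: string;
    // Metadata presence flags and environment identity only: no tokens, secrets, or records.
    advancedDetails: DataverseProvenanceFlags & {
      selectedEnvironmentId: string;
      environmentDisplayName: string;
      globalDiscoveryStatus: string;
    };
  };
}

function buildDataverseUrlMissingResponse(
  environment: { id: string; displayName: string },
  provenance: DataverseProvenanceFlags,
): DataverseUrlMissingResponse {
  return {
    httpStatus: 422,
    body: {
      status: "dataverse_url_missing",
      operation: "list_agents",
      safeErrorCode: "dataverse_url_missing",
      safeMessage:
        "AgentProof found the environment but Microsoft did not expose a Dataverse address for it",
      buyerNextAction:
        "Choose another environment or ask IT to confirm this environment has Dataverse and that your account can access it",
      advancedDetails: {
        ...provenance,
        selectedEnvironmentId: environment.id,
        environmentDisplayName: environment.displayName,
        globalDiscoveryStatus: "deferred to future slice",
      },
    },
  };
}
```

The page can then branch on safeErrorCode alone, showing safeMessage and buyerNextAction on the primary path and keeping advancedDetails inside the collapsed disclosure.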
When to re-score: Phase 1G-S16 is a buyer-experience plus correctness slice. Adds a customer plus IT runbook plus a pure Dataverse URL resolver plus a buyer-friendly panel for the missing-Dataverse-address state. Does not change scoring, the score response shape, or the rendered scorecard Markdown. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 stays at 99fae010.
Docs, Tests, Content, Methodology, Components, Lib, Api, Primary buyer surface
Evidence trace: the new Phase 1G-S16 content evidence file with founder live-test evidence plus customer it documentation decision plus official Microsoft sources plus runbook sections plus dataverse url resolution strategy plus resolver order plus blocked states plus buyer copy plus advanced it details plus safety invariants preserved plus product approval status plus next recommended product work, the new customer plus IT runbook under the docs tree, the new pure resolver module under the Microsoft connector library, the wiring in the Power Platform client and the agents route, the new buyer-friendly Connection guide for IT disclosure plus the new dataverse url missing buyer panel in the score paste page, the new Phase 1G-S16 unit test groups (runbook plus in-product help plus resolver plus deferred Global Discovery plus agents route behaviour plus buyer-friendly errors plus safety), the updated build-identity endpoint constants for Phase 1G-S16, the package version bump from 0.123.0 to 0.124.0, the methodology changelog 114th entry, the README Phase 1G-S16 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S16 internal methodology record (customer plus IT runbook plus Dataverse URL resolution)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-06
- Reference
- Phase 1G-S16 founder execution signal: founder live-tested Phase 1G-S15, granted the documented EnvironmentManagement.Environments.Read permission in Microsoft Entra, AgentProof listed at least one real environment (CRM667668 in Contoso tenant), the next step (list Copilot Studio agents) failed with HTTP 422 because the per-environment Dataverse address was missing. Founder instruction: document the entire customer plus IT setup path; fix Dataverse URL resolution so it tries every documented source in order; surface a buyer-friendly panel when all sources fail; defer Global Discovery Service fallback to a later slice with an honest controlled-next-step note.
- Impact assessment
- Adds a customer plus IT runbook plus a pure Dataverse URL resolver. The resolver tries every documented Power Platform metadata field path in order, validates each candidate as a real HTTPS URL on a real DNS host, normalises the value to the BASE Dataverse origin form so downstream Dataverse Web API calls do not produce a double path, and reports provenance flags so the buyer panel can render a documented Advanced details for IT disclosure. The Phase 1G-S2 OAuth flow plus the Phase 1G-S5 wizard plus the Phase 1G-S6 server diagnostics plus the Phase 1G-S7 globalThis store plus the Phase 1G-S8 HMAC pending handshake cookie and auth trace ring buffer plus the Phase 1G-S9 hydration repair plus the Phase 1G-S10 evidence file and admin handoff helper module plus the Phase 1G-S11 shared fetch helper and pure mapper plus the Phase 1G-S12 dedicated visible-error state and presentational placement plus the Phase 1G-S13 buyer-friendly blocked-connection flow plus the Phase 1G-S14 final IT-request copy polish plus the Phase 1G-S15 discovery status lifecycle are all preserved. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval requires founder review; technical tests alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. Phase 1G-S16 has not been live-tested end to end against the real Microsoft tenant by the AgentProof developer; the founder must run the local route after signing in to a tenant with at least one Dataverse-backed Power Platform environment plus at least one Copilot Studio agent. Phase 1G-S16 does not implement the documented Dataverse Global Discovery Service fallback - that is the recommended Phase 1G-S17 candidate.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1G-S2
Change date: 2026-05-04
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S2 - corrective real Microsoft discovery activation
Reason: The founder reviewed the Phase 1G archive and reported that nothing was obviously working yet: the auth callback did not exchange the Microsoft authorisation code for tokens; the environments, agents, and footprint routes always returned auth required without calling Microsoft; the UI showed disabled environment, agent, and discovery controls as if they were progress; and the prior archive packaging incorrectly included the git directory, the tmp working-tree directory, and the TypeScript build-info file. The Phase 1G archive was therefore not product-approved and must not be described as real working discovery. This corrective slice activates the actual local Microsoft auth, session, and discovery plumbing and fixes the packaging.
What changed: Adds two new server-side connector libs at an internal source file (in-memory short-lived session store keyed by an opaque session id, with state + PKCE verifier storage at sign-in, server-side-only token storage after sign-in, per-environment Dataverse-token cache, and a clear production-hardening comment) and an internal source file (Microsoft Entra ID authorize URL builder with PKCE S256, server-side authorisation-code exchange against the documented /oauth2/v2.0/token endpoint, and a refresh-token-driven Dataverse-scoped token acquisition that derives the per-environment audience scope dynamically from the discovered Dataverse organisation URL). Adds two new API routes: /api/connectors/microsoft/auth/status (returns signed in / not signed in / setup needed without ever returning tokens) and /api/connectors/microsoft/auth/disconnect (deletes the session and clears the HttpOnly cookie). Rewrites /api/connectors/microsoft/auth/start to generate state + PKCE verifier, set an HttpOnly opaque session cookie, and return only the authorize URL. Rewrites /api/connectors/microsoft/auth/callback to validate state, exchange the authorisation code for tokens server-side, persist tokens only in the in-memory session store, and redirect the browser back to /score/paste with a simple ?microsoft=connected status flag. Rewrites /environments to call the documented Power Platform admin endpoint with the session's Power Platform access token and to record the discovered environments on the session so subsequent route calls verify selection against the discovered list. Rewrites /agents to acquire a per-environment Dataverse-scoped token at the moment it is needed (using the refresh token issued at sign-in) and to call the documented Dataverse Web API bots metadata set; the route refuses to accept arbitrary typed environment ids. 
Rewrites /footprint to verify the (environment, agent) pair against the discovered lists, read the documented botcomponents metadata set, and map the result via the existing canonical mapper. Rewrites the JsonPasteScoreCard Microsoft section as a real progressive working flow: the UI polls /auth/status on mount, shows Connect Microsoft until signed in, then progressively enables List environments, the discovered environment dropdown, List agents, the discovered agent dropdown, and Discover agent footprint as each step succeeds; a Disconnect Microsoft button is always available when signed in. The required env vars now include MICROSOFT_SESSION_SECRET (replacing the static MICROSOFT_DATAVERSE_SCOPE) because the Dataverse audience is per-environment and is derived dynamically. The .gitignore is hardened to exclude the tmp working-tree directory, the engine and framework build output directories, the Turbo / cache / Vercel directories, Thumbs.db, and additional dotenv variants. The next archive must exclude the git directory, the tmp working-tree directory, the tsbuildinfo, every dotenv variant (the dotenv-example file remains allowed), the node_modules directory, the framework build cache, the dist directory, the build directory, the coverage directory, the cache directory, the Vercel directory, log files, .DS_Store, and Thumbs.db. package.json is bumped from 0.109.0 to 0.110.0. The product version in the current methodology summary is bumped to 0.110.0. The /api/score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default.
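As an illustration of the state + PKCE S256 step that the rewritten /auth/start route performs, a builder for the authorize request might look like the sketch below. It assumes Node's built-in crypto module and the Microsoft identity platform v2.0 authorize endpoint; the function name, parameter names, and return shape are hypothetical and are not taken from the AgentProof source.

```typescript
import { createHash, randomBytes } from "node:crypto";

interface AuthorizeRequest {
  authorizeUrl: string;  // the only value that would be returned to the browser
  state: string;         // kept server-side and compared again at the callback
  codeVerifier: string;  // kept server-side and sent only in the token exchange
}

// Hypothetical helper: builds an authorisation-code + PKCE (S256) request.
function buildAuthorizeRequest(
  tenantId: string,
  clientId: string,
  redirectUri: string,
  scopes: string[],
): AuthorizeRequest {
  const state = randomBytes(16).toString("base64url");
  // 32 random bytes encode to 43 base64url characters, the RFC 7636 minimum.
  const codeVerifier = randomBytes(32).toString("base64url");
  // S256 challenge = BASE64URL(SHA-256(ASCII(code_verifier))).
  const codeChallenge = createHash("sha256").update(codeVerifier).digest("base64url");

  const url = new URL(
    `https://login.microsoftonline.com/${encodeURIComponent(tenantId)}/oauth2/v2.0/authorize`,
  );
  url.searchParams.set("client_id", clientId);
  url.searchParams.set("response_type", "code");
  url.searchParams.set("redirect_uri", redirectUri);
  url.searchParams.set("response_mode", "query");
  url.searchParams.set("scope", scopes.join(" "));
  url.searchParams.set("state", state);
  url.searchParams.set("code_challenge", codeChallenge);
  url.searchParams.set("code_challenge_method", "S256");

  return { authorizeUrl: url.toString(), state, codeVerifier };
}
```

In a flow like the one described, the state and verifier would be stored on the server-side session and only the authorize URL would leave the server.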
User impact: Microsoft discovery is now actually wired. When the Microsoft Entra ID app registration env vars are configured locally, clicking Connect Microsoft starts a real OAuth authorisation-code flow with PKCE; Microsoft redirects back to AgentProof; AgentProof exchanges the code for tokens server-side; the tokens are stored only in an in-memory session keyed by an HttpOnly opaque session id cookie. After sign-in, List environments calls Microsoft and shows the discovered list; selecting one and clicking List agents calls Dataverse and shows the discovered agents; selecting one and clicking Discover agent footprint reads the read-only Copilot Studio metadata and shows the canonical AgentProof Agent Footprint plus Yes / No / Not sure confirmation questions. Tokens never reach the browser. Tokens are never persisted (no localStorage / sessionStorage / IndexedDB / cookie / Supabase / disk write). Tokens are never logged. The Disconnect Microsoft button drops the session and clears the cookie. When env vars are missing, the UI shows the Microsoft connection setup needed card and does NOT fall back to fictional discovery. The fictional demo package on the buyer hero is preserved unchanged so the founder can still demo without a real Microsoft tenant. UI product approval still requires founder review; technical tests alone do not approve the UI. Internal milestones are not success - success means the product sells.
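The in-memory, short-lived session store described above can be pictured with a sketch like the following. The class name, field names, TTL handling, and the injectable clock are all assumptions made for illustration; the entry itself notes that production use needs hardened storage.

```typescript
import { randomBytes } from "node:crypto";

interface MicrosoftSession {
  createdAtMs: number;
  state?: string;         // OAuth state stored at sign-in start
  codeVerifier?: string;  // PKCE verifier stored at sign-in start
  accessToken?: string;   // server-side only, never serialised to the browser
  refreshToken?: string;  // server-side only, never serialised to the browser
}

// Hypothetical store: a process-local Map keyed by an opaque random id.
class InMemorySessionStore {
  private readonly sessions = new Map<string, MicrosoftSession>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the opaque id that would be placed in the HttpOnly cookie.
  create(): string {
    const id = randomBytes(32).toString("base64url");
    this.sessions.set(id, { createdAtMs: this.now() });
    return id;
  }

  // Expired sessions are dropped lazily on read.
  get(id: string): MicrosoftSession | undefined {
    const session = this.sessions.get(id);
    if (session === undefined) return undefined;
    if (this.now() - session.createdAtMs > this.ttlMs) {
      this.sessions.delete(id);
      return undefined;
    }
    return session;
  }

  // What a status route could expose: a label, never a token.
  status(id: string): "signed_in" | "not_signed_in" {
    return this.get(id)?.refreshToken ? "signed_in" : "not_signed_in";
  }

  destroy(id: string): void {
    this.sessions.delete(id);
  }
}
```

Because the Map lives in process memory, every session disappears on restart, which matches the "never persisted" property but is also why such a store would not suit multi-instance deployments.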
When to re-score: Phase 1G-S2 activates server-side OAuth/session/discovery plumbing and rewrites a buyer-facing UI section. No engine change, no scoring change, no rendered scorecard Markdown change. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 (99fae010) is unchanged.
Docs, Tests, Content, Methodology, Components, Lib, Api, Packaging
Evidence trace: an internal source file, an internal source file, an internal source file (env var list updated), the five Phase 1G API routes plus auth/status and auth/disconnect under app/api/connectors/microsoft/auth/ and app/api/connectors/microsoft/, an internal source file (Microsoft section now a progressive working flow), content/phase 1g real microsoft read only discovery.v1.json (api route count end bumped to 15), .env.example, .gitignore (hardened), package.json (0.109.0 to 0.110.0), content/methodology changelog.v1.json (this 100th entry), README Phase 1G-S2 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S2 internal methodology record (corrective real Microsoft discovery activation)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-04
- Reference
- Phase 1G-S2 founder execution signal (corrective real Microsoft / Power Platform / Copilot Studio discovery activation; supersedes the prior Phase 1G shipped interpretation)
- Impact assessment
- Activates the real local Microsoft / Power Platform / Copilot Studio discovery flow that the Phase 1G archive failed to deliver. Adds the in-memory server-side session store and the OAuth helper module. Rewrites the OAuth start route to generate state + PKCE verifier and set an HttpOnly opaque session id cookie; rewrites the OAuth callback route to validate state and exchange the authorisation code for tokens at the documented Microsoft token endpoint server-side. Adds auth/status and auth/disconnect routes. Wires the environments route to call the documented Power Platform admin endpoint, the agents route to acquire a per-environment Dataverse-scoped token using the refresh token and call the documented Dataverse bots metadata set, and the footprint route to read the documented botcomponents metadata set and map the result via the existing canonical mapper. Rewrites the JsonPasteScoreCard Microsoft section as a real progressive working flow with auth-status polling on mount, progressive step enablement, and a Disconnect Microsoft button. Replaces the static MICROSOFT_DATAVERSE_SCOPE env var with MICROSOFT_SESSION_SECRET because the Dataverse audience is per-environment and must be derived dynamically. Hardens .gitignore (the tmp working-tree directory, the engine and framework build output directories, the Turbo / cache / Vercel directories, Thumbs.db, and dotenv variants). The next archive must exclude the git directory, the tmp working-tree directory, the tsbuildinfo, every dotenv variant (dotenv-example allowed), the node modules directory, the framework build cache, the dist directory, the build directory, the coverage directory, the cache directory, the Vercel directory, log files, DS_Store, and Thumbs.db. Tokens never reach the browser. Tokens are never persisted. Tokens are never logged. No business records are read. UI product approval still requires founder review. Internal milestones are not success - success means the product sells. 
Production commercial launch requires hardened session/token storage and a security review.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The Phase 1G archive (agentproof-phase-1g-real-microsoft-read-only-discovery.tar.gz) is superseded as product-approved by this Phase 1G-S2 archive.
- Re-score optional
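The Phase 1G-S2 entry above explains that the Dataverse audience is per-environment and is derived from the discovered organisation URL. One way to picture that derivation is the sketch below. The function names are hypothetical, and the `/.default` suffix is an assumption about the scope form; the entry does not say which suffix the real helper uses.

```typescript
// Derive a per-environment scope from a discovered Dataverse organisation URL.
// Assumption: the audience is the HTTPS origin and the scope is "<origin>/.default".
function dataverseScopeFor(organisationUrl: string): string {
  const url = new URL(organisationUrl);
  if (url.protocol !== "https:") {
    throw new Error("Dataverse organisation URL must use HTTPS");
  }
  return `${url.origin}/.default`;
}

// Build the form body for a refresh-token grant against the /oauth2/v2.0/token
// endpoint. The body is only constructed here; no network call is made.
function buildRefreshTokenBody(
  clientId: string,
  refreshToken: string,
  scope: string,
): URLSearchParams {
  const body = new URLSearchParams();
  body.set("grant_type", "refresh_token");
  body.set("client_id", clientId);
  body.set("refresh_token", refreshToken);
  body.set("scope", scope);
  return body;
}
```

Deriving the scope at request time is what makes a static scope env var unnecessary: each discovered environment yields its own audience.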
Change id: 1G-S3
Change date: 2026-05-04
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S3 - runtime UI repair and route-load smoke gate
Reason: The founder tested the Phase 1G-S2 build in a local browser and the page failed to render. Instead of the AgentProof UI, the browser showed the Next.js App Router fallback message that appears when a route segment throws during rendering and no error boundary exists. Founder result: nothing was working. Phase 1G-S2 was therefore not product-approved. Phase 1G-S3 repairs the runtime UI so the founder can actually open and use the page, and adds a mandatory route-load smoke gate so a regression cannot ship without proving the route loads.
What changed: Adds three App Router error boundaries that were missing in Phase 1G-S2 and earlier: an internal source file (calm buyer-safe Try again / Go to home fallback for any route segment that throws during render); an internal source file (calm fallback that renders its own html/body shell when the root layout itself fails); an internal source file (calm 404 fallback). All three boundaries never expose stack traces, internal error messages, secrets, tokens, env var names, or API endpoint names. Replaces the prior internal-developer hero on /score/paste (which exposed the eyebrow Self-serve · Phase 1D Slice 3, the headline Paste & score an agent inputs JSON, the pill Local · NullProvider, and a paragraph mentioning agent inputs JSON object / deterministic founder-facing Markdown / No live AI provider is called) with a buyer-facing hero (title Proof your AI agent, subtitle Connect your agent environment let AgentProof discover the footprint and confirm only the critical points, trust note Read-only first. Metadata and configuration before business records.). Updates the page metadata title to Proof your AI agent - AgentProof. Adds a route-load smoke gate at scripts/smoke_score_paste.cjs that spawns next start, GETs /score/paste, asserts HTTP 200, asserts the buyer-facing hero strings are present, and asserts the Next.js dev fallback string and the prior internal-developer hero strings are absent. Adds a pnpm smoke:score-paste script that runs the smoke gate. Adds twelve new Phase 1G-S3 unit tests in the runtime-UI-repair test file that pin the three error boundaries, the buyer-facing hero copy, the documented test ids, and the absence of the prior internal-developer hero strings on the static source. package.json is bumped from 0.110.0 to 0.111.0. The product version in the current methodology summary is bumped to 0.111.0. The /api/score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. 
NullProvider remains default. The Phase 1G-S2 server-side OAuth and session and discovery plumbing is preserved unchanged: tokens never leave the server; tokens are never persisted; tokens are never logged; tokens are never returned to the browser; no business records are read; the connector-agnostic platform selector still exposes all six categories with Microsoft Copilot Studio labelled first real connector; manual environment entry and manual agent entry are still absent from the primary buyer path; fictional or demo discovery is still absent from the primary buyer path.
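The assertion step of the route-load smoke gate described above (HTTP 200, required strings present, forbidden strings absent) can be sketched as a pure function. The real script also spawns next start and performs the GET; that part is omitted here, and the function and type names are hypothetical.

```typescript
interface SmokeResult {
  ok: boolean;
  failures: string[];  // human-readable reasons, empty when the gate passes
}

// Evaluate one fetched page against the smoke-gate rules.
function checkRouteLoad(
  status: number,
  html: string,
  requiredStrings: string[],
  forbiddenStrings: string[],
): SmokeResult {
  const failures: string[] = [];
  if (status !== 200) {
    failures.push(`expected HTTP 200, got ${status}`);
  }
  for (const text of requiredStrings) {
    if (!html.includes(text)) failures.push(`missing required string: ${text}`);
  }
  for (const text of forbiddenStrings) {
    if (html.includes(text)) failures.push(`found forbidden string: ${text}`);
  }
  return { ok: failures.length === 0, failures };
}
```

A caller would pass the buyer-facing hero strings (for example Proof your AI agent) as required and the prior internal-developer hero strings (for example Local · NullProvider) as forbidden, then exit non-zero when ok is false.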
User impact: The founder can now open http://localhost:3000/score/paste in a local browser without seeing the Next.js dev fallback message and without hitting an HTTP 500. The page renders a calm buyer-facing hero (Proof your AI agent / Connect your agent environment let AgentProof discover the footprint and confirm only the critical points / Read-only first. Metadata and configuration before business records.). When Microsoft env vars are missing, the page still shows the calm Microsoft connection setup needed card. When Microsoft env vars are configured, Connect Microsoft starts the real OAuth flow that Phase 1G-S2 wired. UI product approval still requires founder review; technical tests alone do not approve the UI. Internal milestones are not success - success means the product sells.
When to re-score: Phase 1G-S3 repairs runtime page loading and replaces a buyer-facing page hero. No engine change, no scoring change, no rendered scorecard Markdown change. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 (99fae010) is unchanged.
Docs, Tests, Content, Methodology, Components, App router error boundaries, Scripts
Evidence trace: an internal source file (new), an internal source file (new), an internal source file (new), an internal source file (rewritten hero), scripts/smoke_score_paste.cjs (new), package.json (smoke:score-paste script + version bump), the runtime-UI-repair Phase 1G-S3 unit test file (new), content/methodology changelog.v1.json (this 101st entry), README Phase 1G-S3 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S3 internal methodology record (runtime UI repair + smoke gate)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-04
- Reference
- Phase 1G-S3 founder execution signal (corrective runtime UI repair + mandatory route-load smoke gate; supersedes the prior Phase 1G-S2 shipped interpretation)
- Impact assessment
- Repairs the runtime UI by adding three App Router error boundaries (app/error.tsx, app/global-error.tsx, app/not-found.tsx) that were missing in Phase 1G-S2 and earlier; without those, Next.js shows a hardcoded fallback message during dev rendering instead of a calm buyer-safe error screen. Replaces the prior internal-developer page hero on /score/paste (Self-serve Phase 1D Slice 3, Paste and score an AgentInputs JSON, Local NullProvider, deterministic founder-facing Markdown, No live AI provider is called) with a buyer-facing hero (Proof your AI agent, Connect your agent environment let AgentProof discover the footprint and confirm only the critical points, Read-only first. Metadata and configuration before business records.). Adds a production route-load smoke gate at scripts/smoke_score_paste.cjs that spawns next start, GETs /score/paste, asserts HTTP 200, asserts the buyer-facing hero strings are present, and asserts the Next.js dev fallback string and the prior internal-developer hero strings are absent. Adds five unit tests pinning the three error boundaries, the buyer-facing hero copy, and the absence of the prior internal-developer hero strings on the static source. The Phase 1G-S2 server-side OAuth and session and discovery plumbing is preserved unchanged. Tokens never reach the browser. Tokens are never persisted. Tokens are never logged. No business records are read. Manual environment entry and manual agent entry are still absent from the primary buyer path. Fictional or demo discovery is still absent from the primary buyer path. Connector-agnostic architecture is preserved. The /api/score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default. UI product approval requires founder review; technical tests alone do not approve the UI. Internal milestones are not success - success means the product sells.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The Phase 1G-S2 archive is superseded as the local-development baseline by this Phase 1G-S3 archive because the Phase 1G-S2 page failed to render in the founder local browser.
- Re-score optional
Change id: 1G-S4
Change date: 2026-05-04
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S4 - Microsoft setup preflight and no-fictional primary path
Reason: The founder tested Phase 1G-S3 and the page now loads, but the founder still rejected product approval for two reasons: the primary buyer surface still showed Try the fictional demo and the legacy paste-and-score test inputs, and the Microsoft setup-needed state was a generic amber card that did not say what was actually missing or what to do next. Phase 1G-S4 makes the UI honest and actionable: either the Microsoft setup is complete and the founder can connect, or the UI clearly tells the founder exactly what is missing without showing fictional fallback content.
What changed: Adds a dedicated server-side setup preflight route at the connectors microsoft setup status path that returns an actionable checklist of which Microsoft env vars are present and which are missing (NEVER the values themselves, NEVER tokens, NEVER secrets). The route also returns the documented expected redirect URI and Power Platform scope as non-secret hints, plus a buyer-facing headline, body, and a next-step token. It never makes a live Microsoft network call. The API route count moves from 15 to 16. Rewrites the buyer-facing in-component hero on the score paste page from the prior description-review framing to real-discovery framing (Real read-only discovery, not a description review / Connect your agent platform and AgentProof discovers the agent automatically. Real metadata. Real configuration. No business records.). Removes the fictional-demo Run fictional demo and Check an agent profile primary call-to-action buttons from the buyer hero. The fictional demo package, the legacy paste textarea, the legacy Score this JSON button, the validation-error rendering, and the buyer report summary are now gated behind a founder-only diagnostics toggle (microsoft founder diagnostics open use state defaulting to false). On first paint the curl-visible HTML contains zero fictional / demo / paste-test markup. The Microsoft setup-needed amber card is replaced with an actionable checklist: one row per required env var with a present (green check) or missing (amber !) badge, the env var name (no value), and a Technical setup details collapsed section that lists what each env var is for plus the documented expected redirect URI and Power Platform scope. A restart-after-changing-dotenv-local note is rendered. Extends the route-load smoke gate to forbid the strings Try the fictional demo, Fictional customer-support agent, Run this demo, and Run fictional demo on the curl-visible response. 
Adds twelve new Phase 1G-S4 unit tests pinning the new setup-status route shape, the buyer-facing in-component hero copy change, the founder-diagnostics gate, the Microsoft setup checklist UI, and the smoke gate's new forbidden strings. package.json is bumped from 0.111.0 to 0.112.0. The product version in the current methodology summary is bumped to 0.112.0. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default. The Phase 1G-S2 server-side OAuth and session and discovery plumbing is preserved unchanged: tokens never leave the server; tokens are never persisted; tokens are never logged; tokens are never returned to the browser; no business records are read; the connector-agnostic platform selector still exposes all six categories with Microsoft Copilot Studio labelled first real connector; manual environment entry and manual agent entry are still absent from the primary buyer path.
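The setup preflight described above reports which env vars are present or missing without ever returning their values. A minimal sketch of that idea follows. Only MICROSOFT_SESSION_SECRET is named in this Change Log; the other env var names in the usage note, the function name, and the result shape are assumptions, as is the choice to treat whitespace-only values as missing.

```typescript
type EnvSource = Record<string, string | undefined>;

interface ChecklistRow {
  name: string;                    // env var name only, never its value
  status: "present" | "missing";
}

interface SetupChecklist {
  checklist: ChecklistRow[];
  presentEnvVars: string[];
  missingEnvVars: string[];
}

// Build a names-only checklist. Values are read to test for presence and are
// never copied into the result.
function buildSetupChecklist(requiredNames: string[], env: EnvSource): SetupChecklist {
  const checklist: ChecklistRow[] = requiredNames.map((name) => {
    const value = env[name];
    const isPresent = typeof value === "string" && value.trim().length > 0;
    return { name, status: isPresent ? "present" : "missing" };
  });
  return {
    checklist,
    presentEnvVars: checklist.filter((row) => row.status === "present").map((row) => row.name),
    missingEnvVars: checklist.filter((row) => row.status === "missing").map((row) => row.name),
  };
}
```

A route handler could call this with a list such as MICROSOFT_TENANT_ID, MICROSOFT_CLIENT_ID, and MICROSOFT_SESSION_SECRET (the first two names are hypothetical) and process.env, and return the result as JSON.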
User impact: On first open of the local app, the founder no longer sees Try the fictional demo or any fictional customer-support agent content on the primary buyer surface. The hero says Real read-only discovery, not a description review and explains that AgentProof discovers the agent automatically when the founder connects a platform. If Microsoft env vars are missing, the founder sees a calm actionable checklist showing exactly which env vars are configured (green check) and which are missing (amber exclamation), with a collapsed Technical setup details section explaining what each env var is for and what the expected redirect URI and Power Platform scope look like. The founder is told to restart the local app after changing the local dotenv file. No demo data is shown anywhere on the primary path. The legacy paste-and-score test inputs are preserved for internal founder testing only behind an explicit Founder diagnostics toggle. UI product approval still requires founder review; technical tests alone do not approve the UI. Real Microsoft discovery still requires a real Microsoft Entra ID app registration with read-only delegated permissions. Internal milestones are not success - success means the product sells.
When to re-score: Phase 1G-S4 changes the buyer-facing surface and adds a setup-preflight route. No engine change, no scoring change, no rendered scorecard Markdown change. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 (99fae010) is unchanged.
Docs, Tests, Content, Methodology, Components, Api, Primary buyer surface
Evidence trace: the new server-side Microsoft setup-preflight route, the rewritten in-component buyer hero, the new Microsoft setup checklist UI, the new founder-diagnostics toggle and useState flag, the extended route-load smoke gate forbidden-strings list, the new Phase 1G-S4 unit test file (twelve new tests), package.json version bump, the methodology changelog 102nd entry, the rewritten buyer-hero pins on the buyer-facing product UI reset content artefact, the README Phase 1G-S4 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S4 internal methodology record (Microsoft setup preflight + no-fictional primary path)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-04
- Reference
- Phase 1G-S4 founder execution signal (remove fictional / demo content from the primary buyer path; add an actionable Microsoft setup preflight checklist)
- Impact assessment
- Removes fictional / demo content from the AgentProof primary buyer surface and adds an actionable Microsoft setup preflight checklist. Adds a server-side setup-status route that returns a per-env-var present/missing checklist (NEVER values, NEVER tokens, NEVER secrets) and the documented expected redirect URI and Power Platform scope as non-secret hints; the route never makes a live Microsoft network call. Replaces the prior description-review buyer hero with real-discovery framing. Removes the prior fictional-demo primary call-to-action buttons from the buyer hero. Gates the fictional demo package, the legacy paste textarea, the legacy score-this-JSON button, the validation-error rendering, and the buyer report summary behind a founder-only diagnostics toggle that defaults to closed so the curl-visible HTML on first paint contains zero fictional / demo / paste-test markup. Replaces the prior generic Microsoft setup-needed amber card with an actionable checklist UI that shows present/missing per env var, with a collapsed Technical setup details section that names each env var purpose plus the documented expected redirect URI and Power Platform scope. Extends the route-load smoke gate to forbid four additional fictional / demo strings on the curl-visible response. Adds twelve unit tests. The Phase 1G-S2 server-side OAuth and session and discovery plumbing is preserved unchanged. Tokens never reach the browser. Tokens are never persisted. Tokens are never logged. No business records are read. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default. UI product approval requires founder review; technical tests alone do not approve the UI. Real Microsoft discovery still requires a real Microsoft Entra ID app registration. Internal milestones are not success - success means the product sells.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The Phase 1G-S3 archive is superseded as the local-development baseline by this Phase 1G-S4 archive because the Phase 1G-S3 buyer surface still showed fictional / demo content alongside a generic Microsoft setup-needed amber card without a clear actionable preflight checklist.
- Re-score optional
Change id: 1G-S5
Change date: 2026-05-04
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S5 - User-guided Microsoft setup wizard
Reason: The founder tested Phase 1G-S4 and rejected product approval. The Phase 1G-S4 setup-needed UI was a flat env-var checklist that listed names like microsoft tenant id and microsoft client secret with green check or amber exclamation badges. The founder said: I do not know where to get this or where to set this up. It must be a user setup interface with a user guide. A non-technical buyer cannot read a list of environment variable names and know what to do next. Phase 1G-S5 replaces the flat checklist with a real user-guided Microsoft setup wizard that explains in plain English, step by step, where to get each value, what to click in the Microsoft Entra admin centre, where to paste the value locally, and what is safe versus secret.
What changed: Extends the connectors microsoft setup status route from the Phase 1G-S4 flat shape to a richer Phase 1G-S5 shape that returns a per-field array with key, label, description, where-to-find, is_secret, status (present or missing), and an optional safe_default for non-secret defaults the wizard can show inline (the documented redirect URI and the documented Power Platform scope). The route also returns required_count, present_count, missing_count, can_start_microsoft_auth, setup_steps_version, and the buyer-facing headline and body. The route still NEVER returns secret values, NEVER returns tokens, NEVER returns raw env var contents, and NEVER makes a live Microsoft network call. The Phase 1G-S4 backwards-compatible checklist plus present env vars and missing env vars arrays are preserved for the Phase 1G-S4 unit tests. Adds a new pure deterministic helper at lib connectors microsoft microsoft setup template that builds a copyable .env.local template body containing placeholders only (NEVER a real secret, NEVER a tenant-specific value) plus the two documented safe defaults inlined (redirect URI and Power Platform scope). 
Replaces the Phase 1G-S4 setup-needed flat checklist UI in JsonPasteScoreCard with a real user-guided Microsoft setup wizard composed of a wizard headline (Set up Microsoft discovery), a What you will create explainer card, a 5-step progress chip ordered list, seven numbered step section cards (Step 1 Create app registration / Step 2 Add redirect URI / Step 3 Copy Tenant ID and Application client ID / Step 4 Create client secret / Step 5 Add read-only API permissions / Step 6 Create local setup file with the copyable .env.local template body and Copy template + Download .env.local template buttons / Step 7 Restart the local app), a per-field where-to-find guide rendered from the new fields shape with present check / missing exclamation chips and a secret pill for is_secret fields, an enterprise note about Conditional Access and admin approval, and the Phase 1G-S4 raw env-var checklist demoted into a collapsed Technical setup details disclosure (NOT primary UX) so Phase 1G-S4 unit tests still pass. Gates the Connect Microsoft button on can_start_microsoft_auth so the button is disabled (with an explanatory title attribute) until the founder finishes the wizard. Extends the route-load smoke gate to require the strings Set up Microsoft discovery, Create app registration, and Add redirect URI on the curl-visible HTML on first paint. Adds eighteen new Phase 1G-S5 unit tests pinning the env-template purity, the new richer setup-status shape, the no-secret-leakage rules, the wizard headline, the seven step cards, the progress chips, the copyable env template, the per-field where-to-find guide, the enterprise note, the technical-details collapsed disclosure, the Connect Microsoft gating, and the smoke gate's new required strings. package.json is bumped from 0.112.0 to 0.113.0. The product version in the current methodology summary is bumped to 0.113.0. The score response shape is unchanged. 
The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default. The Phase 1G-S2 server-side OAuth and session and discovery plumbing is preserved unchanged: tokens never leave the server; tokens are never persisted; tokens are never logged; tokens are never returned to the browser; no business records are read; the connector-agnostic platform selector still exposes all six categories with Microsoft Copilot Studio labelled first real connector; manual environment entry and manual agent entry are still absent from the primary buyer path.
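The pure, deterministic .env.local template helper described in this entry could look something like the sketch below. The field list, the env var names, the placeholder text, and the example redirect URI value are illustrative assumptions; the real helper and its documented safe defaults are not reproduced in this Change Log.

```typescript
interface TemplateField {
  key: string;           // env var name
  placeholder: string;   // shown when no safe default exists, e.g. "<your-tenant-id>"
  safeDefault?: string;  // non-secret default that may be inlined
}

// Pure function: the same fields always produce the same template body, and
// the output contains placeholders or safe defaults only, never real secrets.
function buildEnvLocalTemplate(fields: TemplateField[]): string {
  const lines = [
    "# Placeholders only. Replace each <...> value locally and never commit this file.",
    ...fields.map((field) => `${field.key}=${field.safeDefault ?? field.placeholder}`),
  ];
  return lines.join("\n") + "\n";
}
```

Because the helper takes no input from the environment, the Copy template and Download buttons would always hand out the same text regardless of what is configured on the machine.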
User impact: On first open of the local app, when Microsoft env vars are missing, the founder no longer sees a flat list of environment variable names with amber and green badges. The founder sees a real user-guided setup wizard that says Set up Microsoft discovery, explains in plain English what they will create (a Microsoft Entra app registration that lets Microsoft know AgentProof is allowed to request read-only access), and walks them through seven numbered steps: Create app registration; Add redirect URI; Copy Tenant ID and Application client ID; Create client secret; Add read-only API permissions; Create local setup file (with a Copy template button and a Download .env.local template button that produce a placeholders-only template body); Restart the local app. A 5-step progress chip ordered list shows them where they are in the flow. A per-field where-to-find guide tells them, for each value, exactly where to click in Microsoft Entra to get the value, whether it is a secret (with a clear secret pill), and whether it is currently present or missing on the local machine. An enterprise note explains that in a corporate tenant admin approval or Conditional Access may be required. The Connect Microsoft button is disabled until enough setup is in place to actually start sign-in. The Phase 1G-S4 raw env-var checklist is preserved as a collapsed Technical setup details disclosure for technical buyers and existing tests. No demo data is shown anywhere on the primary path. UI product approval still requires founder review; technical tests alone do not approve the UI. Real Microsoft discovery still requires a real Microsoft Entra ID app registration with read-only delegated permissions. Internal milestones are not success - success means the product sells.
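To make the gating of the Connect Microsoft button concrete, the sketch below summarises a per-field list into the counts and the can_start_microsoft_auth flag named in this entry. The field names follow the entry; the function name is hypothetical, and the rule that sign-in can start only when nothing is missing is an assumption, since the entry only says "enough setup".

```typescript
interface SetupField {
  key: string;
  label: string;
  is_secret: boolean;
  status: "present" | "missing";
  safe_default?: string;  // only meaningful for non-secret fields
}

interface SetupStatus {
  fields: SetupField[];
  required_count: number;
  present_count: number;
  missing_count: number;
  can_start_microsoft_auth: boolean;
}

// Summarise per-field status. Secret fields never carry a safe_default in the
// output, even if one was supplied by mistake.
function summariseSetup(fields: SetupField[]): SetupStatus {
  const safeFields: SetupField[] = fields.map((field) =>
    field.is_secret
      ? { key: field.key, label: field.label, is_secret: true, status: field.status }
      : field,
  );
  const presentCount = safeFields.filter((field) => field.status === "present").length;
  const missingCount = safeFields.length - presentCount;
  return {
    fields: safeFields,
    required_count: safeFields.length,
    present_count: presentCount,
    missing_count: missingCount,
    // Assumption: every required field must be present before sign-in can start.
    can_start_microsoft_auth: missingCount === 0,
  };
}
```

A UI could then bind the button's disabled state directly to the negation of can_start_microsoft_auth.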
When to re-score: Phase 1G-S5 changes the Microsoft setup wizard UI and extends the setup-status route shape. No engine change, no scoring change, no rendered scorecard Markdown change. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 (99fae010) is unchanged.
Docs, Tests, Content, Methodology, Components, Api, Primary buyer surface
Evidence trace: the rewritten server-side Microsoft setup-status route, the new Microsoft setup template helper, the new user-guided Microsoft setup wizard UI in JsonPasteScoreCard, the new Connect Microsoft can-start gating, the extended route-load smoke gate required-strings list, the new Phase 1G-S5 unit test file (eighteen new tests), package.json version bump, the methodology changelog 103rd entry, the README Phase 1G-S5 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S5 internal methodology record (User-guided Microsoft setup wizard)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-04
- Reference
- Phase 1G-S5 founder execution signal (replace the Phase 1G-S4 flat env-var checklist with a real user-guided Microsoft setup wizard that explains where to get each value)
- Impact assessment
- Replaces the AgentProof Phase 1G-S4 flat env-var checklist with a real user-guided Microsoft setup wizard so a non-technical founder can complete the local Microsoft setup without prior knowledge of environment variables. Extends the connectors microsoft setup status route shape to return per-field label, description, where-to-find, is_secret, and optional safe_default values plus required_count, present_count, missing_count, can_start_microsoft_auth, and setup_steps_version. The route still never returns secret values, never returns tokens, never returns raw env var contents, and never makes a live Microsoft network call. Adds a pure deterministic helper that builds a copyable .env.local template containing placeholders only (never a real secret, never a tenant-specific value) plus the two documented safe defaults inlined. Replaces the Phase 1G-S4 setup-needed flat checklist UI with a real wizard composed of a wizard headline, a What you will create explainer card, a 5-step progress chip ordered list, seven numbered step section cards, a per-field where-to-find guide, an enterprise note, and the Phase 1G-S4 raw env-var checklist demoted into a collapsed Technical setup details disclosure. Gates the Connect Microsoft button on can_start_microsoft_auth. Extends the route-load smoke gate to require Set up Microsoft discovery, Create app registration, and Add redirect URI. Adds eighteen unit tests. The Phase 1G-S2 server-side OAuth and session and discovery plumbing is preserved unchanged. Tokens never reach the browser. Tokens are never persisted. Tokens are never logged. No business records are read. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default. UI product approval requires founder review; technical tests alone do not approve the UI. Real Microsoft discovery still requires a real Microsoft Entra ID app registration. 
Internal milestones are not success - success means the product sells.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The Phase 1G-S4 archive is superseded as the local-development baseline by this Phase 1G-S5 archive because the Phase 1G-S4 setup-needed UI was a flat env-var checklist that a non-technical founder could not act on without prior knowledge of environment variables.
- Re-score optional
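The setup-status shape described in the Phase 1G-S5 impact assessment above can be illustrated roughly as follows. This is a minimal sketch, not AgentProof's route: the field list, the EXAMPLE_ environment variable names, and the rule that sign-in may start only when nothing is missing are assumptions made for illustration. Only the response keys (is_secret, required_count, present_count, missing_count, can_start_microsoft_auth) come from the entry.

```typescript
// Hypothetical field definitions; the real variable names are not given in this entry.
interface SetupField {
  envName: string;
  label: string;
  isSecret: boolean;
}

interface SetupStatus {
  fields: { label: string; is_secret: boolean; present: boolean }[];
  required_count: number;
  present_count: number;
  missing_count: number;
  can_start_microsoft_auth: boolean;
}

const FIELDS: SetupField[] = [
  { envName: "EXAMPLE_TENANT_ID", label: "Tenant ID", isSecret: false },
  { envName: "EXAMPLE_CLIENT_ID", label: "Application client ID", isSecret: false },
  { envName: "EXAMPLE_CLIENT_SECRET", label: "Client secret", isSecret: true },
];

// Reports presence only: no env value, secret or not, is copied into the result.
function buildSetupStatus(env: Record<string, string | undefined>): SetupStatus {
  const fields = FIELDS.map((f) => ({
    label: f.label,
    is_secret: f.isSecret,
    present: (env[f.envName] ?? "").trim().length > 0,
  }));
  const presentCount = fields.filter((f) => f.present).length;
  return {
    fields,
    required_count: fields.length,
    present_count: presentCount,
    missing_count: fields.length - presentCount,
    // Assumption: sign-in may start only when every required field is present.
    can_start_microsoft_auth: presentCount === fields.length,
  };
}
```

Because the result is built from booleans and counts, serialising it cannot leak a secret even if the caller passes the whole process environment.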
Change id: 1G-S6
Change date: 2026-05-04
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S6 - Microsoft discovery fix and wizard detail upgrade
Reason: The founder completed Phase 1G-S5 setup against a real Microsoft tenant (Contoso), signed in successfully, and verified two real Power Platform environments (CRM667668 trial, Contoso default) via admin.powerplatform.microsoft.com. AgentProof still returned permission_insufficient for the environment listing. Server logs showed only HTTP status codes; the real Microsoft response was discarded by a catch-all error mapping. Phase 1G-S6 corrects two real defects: (a) AgentProof must surface the actual Microsoft error so that the next time something fails the founder can see what Microsoft said; and (b) the Phase 1G-S5 wizard, while a major improvement on the Phase 1G-S4 raw env-var dump, still lacked the practical setup detail the founder needed during real Microsoft setup: direct Microsoft Entra URLs, the APIs my organization uses tab requirement, the EnvironmentManagement.Environment.Read recommendation, the Value vs Secret ID warning, the Windows-specific .env.local download dot-stripping issue, the cmd vs PowerShell distinction, the Notepad save-as quirks, the 600-byte placeholder sanity check, and the client-secret vs session-secret distinction with a one-line PowerShell command. The founder explicitly authorised both corrections (Ship 1G-S6 full scope: A + B).
What changed: Part A - server-side discovery fix. Replaces the catch-all mapping (every non-2xx response becomes permission_insufficient) in lib connectors microsoft microsoft power-platform-client with a status-aware mapping that preserves the actual Microsoft error envelope. Adds a documented microsoft safe diagnostic shape (http status, microsoft error code, microsoft error message, www authenticate, correlation id, endpoint host) and a redaction-safe extractor that handles both OData and OAuth error shapes. Adds a build error from microsoft response function that maps 401 to auth failed (with a Disconnect-and-reconnect next-action hint), 403 to permission_insufficient (with an EnvironmentManagement.Environment.Read scope hint and a Power Platform admin role hint as the next action), 404 to upstream unavailable, 429 to rate limited, and 5xx to upstream unavailable. Adds redact potential secrets, which scrubs Bearer tokens, JWT-shaped strings, and long base64url runs from any free-text Microsoft message. Adds http status for connector error, which maps each connector error code to the right outward HTTP status so the dev-tools Network panel is honest (no longer 502 for everything). Updates the connectors microsoft environments route to use the safe diagnostic shape, surface an envelope of safe error code, safe message, next action, microsoft safe diagnostic, and diagnostic id to the UI, log a one-line non-secret structured summary to the dev-server console, and use the right HTTP status. The endpoint already used the documented api.powerplatform.com environmentmanagement environments path with api-version 2022-03-01-preview; that pairing is preserved. Tokens, refresh tokens, id tokens, client secrets, authorisation codes, and full session ids are NEVER logged or returned in any error path.
Part B - wizard detail upgrade. Step 1 in the JSON paste score card component is rewritten to provide three direct Microsoft Entra access routes (entra.microsoft.com, portal.azure.com, admin.microsoft.com), an explicit admin-permission requirement listing the acceptable Entra roles (Cloud Application Administrator, Application Administrator, Global Administrator), and a non-admin fallback explaining how to ask a tenant administrator for help. Step 4 adds a Value-vs-Secret-ID comparison table, a prominent shown-only-once warning, an AADSTS7000215 mistake-mode note, and a private-handling reminder. Step 5 is rewritten with a numbered procedure that explicitly tells the founder to switch to the APIs my organization uses tab, search Power Platform API, choose Delegated permissions, tick EnvironmentManagement.Environment.Read as the safest read scope, click Add permissions, and click Grant admin consent for the tenant. A collapsed disclosure explains what to do if Power Platform API is not pre-registered (an admin-side PowerShell New-AzureADServicePrincipal command with the documented Power Platform API service principal id 8578e004-a5c6-46e7-913e-12f58912df43 - AgentProof never executes this; admin guidance only). A separate paragraph explains the Power Platform admin role check via admin.powerplatform.microsoft.com because Entra app permissions are necessary but not sufficient. A second collapsed disclosure explains why these specific scopes are needed (the .default mechanic, the actual environmentmanagement endpoint scope check, and the new diagnostic-quality fix). Step 6 adds a Windows-specific help block covering the leading-dot download issue, the PowerShell Rename-Item env .env.local plus Get-ChildItem -Force commands, the cmd vs PowerShell distinction (with the typical 'Get-Item' is not recognized error), the Notepad save-as All Files / UTF-8 / quoted-filename trick, the VS Code easier-alternative, the .env.local.txt warning, and the around-600-bytes placeholder sanity check. A new client-secret vs session-secret distinction table is added with a one-line PowerShell command for generating the session secret. The component adds a microsoft discovery error useState (the 32nd hook overall) and two new useEffect hooks: one that auto-fetches environments the moment Microsoft sign-in completes (so the founder no longer has to click List environments first), and one that auto-fetches agents the moment an environment is selected. A new structured discovery error panel surfaces the safe Microsoft diagnostic envelope (http status, microsoft error code, microsoft error message, www authenticate, correlation id, endpoint host) plus the next action hint plus the diagnostic id, with an explicit no-tokens-no-secrets-no-cookies disclaimer. Extends the route-load smoke gate required texts to include the new entra.microsoft.com URL string, the APIs my organization uses tab label, and the On Windows? Read this block heading. Adds about thirty new Phase 1G-S6 unit tests pinning the connector errors module shape, the power platform client diagnostic upgrade, the environments route safe envelope, the wizard step upgrades, the auto-fetch useEffect hooks, the discovery error panel, the smoke gate, and the safety boundaries. package.json bumped from 0.113.0 to 0.114.0. methodology current methodology summary product version bumped to 0.114.0. methodology changelog now carries 104 entries with change id 1G-S6 last (source type internal methodology, re score recommended false). The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default.
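The status-aware mapping in Part A can be sketched as below. This is an illustrative sketch under stated assumptions, not the AgentProof module: every function and type name is hypothetical, the underscore spellings of the error codes other than permission_insufficient are assumed, and statuses the entry does not mention fall through to upstream_unavailable as an assumption. The two error shapes handled are the common OData form (an error object with code and message) and the OAuth form (an error string with error_description).

```typescript
// Hypothetical names; only the status-to-code mapping comes from the entry above.
type ConnectorErrorCode =
  | "auth_failed"
  | "permission_insufficient"
  | "rate_limited"
  | "upstream_unavailable";

interface ConnectorError {
  code: ConnectorErrorCode;
  nextAction: string;
  httpStatus: number;
  microsoftErrorCode: string | null;
  microsoftErrorMessage: string | null;
}

const asString = (v: unknown): string | null => (typeof v === "string" ? v : null);

// Reads either { error: { code, message } } (OData) or
// { error, error_description } (OAuth) without throwing on unexpected input.
function extractMicrosoftError(body: unknown): { code: string | null; message: string | null } {
  if (typeof body !== "object" || body === null) return { code: null, message: null };
  const b = body as Record<string, unknown>;
  const err = b["error"];
  if (typeof err === "object" && err !== null) {
    const e = err as Record<string, unknown>;
    return { code: asString(e["code"]), message: asString(e["message"]) };
  }
  return { code: asString(err), message: asString(b["error_description"]) };
}

function buildErrorFromResponse(status: number, body: unknown): ConnectorError {
  const { code, message } = extractMicrosoftError(body);
  // A real implementation would redact free text here before keeping it.
  const base = { httpStatus: status, microsoftErrorCode: code, microsoftErrorMessage: message };
  if (status === 401) {
    return { ...base, code: "auth_failed", nextAction: "Disconnect Microsoft and connect again." };
  }
  if (status === 403) {
    return {
      ...base,
      code: "permission_insufficient",
      nextAction: "Add EnvironmentManagement.Environment.Read and check the Power Platform admin role.",
    };
  }
  if (status === 429) {
    return { ...base, code: "rate_limited", nextAction: "Wait, then retry." };
  }
  // 404 and 5xx per the entry; any other status lands here by assumption.
  return { ...base, code: "upstream_unavailable", nextAction: "Retry later." };
}
```

Keeping the Microsoft code and message alongside the mapped code is what distinguishes this from the earlier catch-all: a 403 still reads as permission_insufficient, but the caller can also see what Microsoft actually said.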
User impact: The founder no longer hits a silent permission_insufficient when Microsoft refuses a read-only call. Instead the UI shows a structured discovery error panel with the actual Microsoft error code, the actual Microsoft error message, the WWW-Authenticate header, the correlation id, the endpoint host, plus a plain-English next-action hint (e.g. add EnvironmentManagement.Environment.Read in Microsoft Entra, click Grant admin consent, then Disconnect Microsoft and Connect Microsoft again). The dev-server console prints one structured non-secret summary line per failure so support and the founder can correlate via a short diagnostic id. The wizard now contains everything a non-technical founder needs to set up Microsoft locally without a separate chat walkthrough: where to click in Microsoft Entra (three direct URLs), which Microsoft role they need in order to create an app registration, what to do if they are not an admin, the Value vs Secret ID warning with the shown-only-once flash, the APIs my organization uses tab guidance, the specific EnvironmentManagement.Environment.Read scope recommendation, the Grant admin consent step, the Power Platform admin role check explained inline, the Windows-specific .env.local download dot-stripping fix in PowerShell, the cmd vs PowerShell distinction, the Notepad save-as quirks, the VS Code easier-alternative, the about-600-bytes placeholder sanity check, and the client-secret vs session-secret distinction with a one-line PowerShell session-secret generator. The founder no longer has to click List environments and then List agents and then Discover footprint manually - environments auto-fetch the moment Connect Microsoft completes, and agents auto-fetch the moment an environment is selected. Errors no longer leave the UI in Loading forever. UI product approval still requires founder review; technical tests alone do not approve the UI. 
Real Microsoft discovery still requires a real Microsoft Entra ID app registration with read-only delegated permissions. Internal milestones are not success - success means the product sells.
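The one-line, non-secret failure summary mentioned above depends on scrubbing free text before it is logged. A minimal sketch of that idea follows; the function names, the regular expressions, and the JSON line format are assumptions for illustration, and the patterns are deliberately coarse rather than a complete secret detector.

```typescript
// Hypothetical redactor: scrubs Bearer tokens, JWT-shaped strings, and long
// base64url runs, in that order, from a free-text message.
function redactPotentialSecrets(text: string): string {
  return text
    .replace(/Bearer\s+[A-Za-z0-9._~+\/-]+=*/gi, "Bearer [redacted]")
    .replace(/[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}/g, "[redacted-jwt]")
    .replace(/[A-Za-z0-9_-]{40,}/g, "[redacted]");
}

interface FailureSummary {
  diagnosticId: string;
  httpStatus: number;
  microsoftErrorCode: string | null;
  microsoftErrorMessage: string | null;
  endpointHost: string;
}

// Produces one JSON line per failure; the message is redacted before serialising.
function formatFailureLine(summary: FailureSummary): string {
  const message = summary.microsoftErrorMessage;
  return JSON.stringify({
    ...summary,
    microsoftErrorMessage: message === null ? null : redactPotentialSecrets(message),
  });
}
```

The 40-character threshold is chosen so that a 36-character GUID such as a correlation id survives while typical token material does not; a different threshold may suit other inputs.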
When to re-score: Phase 1G-S6 fixes a server-side error-mapping defect, upgrades server-side diagnostics, upgrades the in-product Microsoft setup wizard, and adds auto-fetch UX. No engine change, no scoring change, no rendered scorecard Markdown change. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 (99fae010) is unchanged.
Docs, Tests, Content, Methodology, Components, Api, Lib, Primary buyer surface, Diagnostics
Evidence trace: the rewritten connector errors module with safe Microsoft diagnostic envelope, the rewritten power platform client with status-aware error capture and structured logging, the rewritten environments route with safe diagnostic surfacing and correct HTTP status mapping, the rewritten wizard Step 1 / Step 4 / Step 5 / Step 6 in json paste score card with the founder-led detail upgrades, the new microsoft discovery error useState plus the two auto-fetch useEffect hooks, the new structured discovery error panel, the extended route-load smoke gate required texts, the new Phase 1G-S6 unit test file (about thirty new tests), package.json version bump from 0.113.0 to 0.114.0, the methodology changelog 104th entry, the README Phase 1G-S6 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S6 internal methodology record (Microsoft discovery fix and wizard detail upgrade)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-04
- Reference
- Phase 1G-S6 founder execution signal: founder verified real environments in Power Platform Admin Center but AgentProof still returned permission_insufficient; founder also requested wizard detail upgrades to match the level of guidance provided in chat during real setup.
- Impact assessment
- Fixes the AgentProof environment-discovery diagnostic-quality defect that caused every non-2xx Microsoft response to be silently mapped to permission_insufficient with no captured Microsoft error envelope, hiding the actual cause from both the founder UI and the dev-server log. Adds a strict, redaction-safe Microsoft diagnostic envelope (status, error.code, error.message, WWW-Authenticate, correlation id, endpoint host) that is captured server-side, logged once per failure as a structured non-secret line, and surfaced to the UI alongside a plain-English next_action hint and a short server-generated diagnostic_id. Maps each connector error code to the right outward HTTP status (401 / 403 / 404 / 429 / 502 / 503 / 500). Adds a Bearer-token / JWT / long-base64url-run redactor so any free-text Microsoft message is scrubbed before being logged or returned. Upgrades the Phase 1G-S5 setup wizard with the practical guidance a non-technical founder needs during real setup: direct Microsoft Entra URLs, admin permission and non-admin fallback, Value vs Secret ID warning, APIs my organization uses tab guidance, EnvironmentManagement.Environment.Read recommendation, Power Platform admin role check, Windows-specific .env.local handling, and the client-secret vs session-secret distinction with a one-line session-secret generator. Adds auto-fetch UX so the founder does not have to click List environments and List agents manually after Connect Microsoft. Adds about thirty unit tests pinning the diagnostics shape, the wizard upgrades, the auto-fetch UX, and the safety boundaries (no secrets logged, no fictional content, no manual environment / agent entry, tokens stay server-side). The Phase 1G-S2 OAuth flow and the Phase 1G-S5 wizard structure are preserved. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. 
UI product approval still requires founder review; technical tests alone do not approve the UI. Real Microsoft discovery still requires a real Microsoft Entra ID app registration. Internal milestones are not success - success means the product sells.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The Phase 1G-S5 archive is superseded as the local-development baseline by this Phase 1G-S6 archive because the Phase 1G-S5 environment-discovery flow returned a generic permission_insufficient that hid the actual Microsoft error and because the Phase 1G-S5 wizard left out practical setup guidance the founder needed during real local setup. Phase 1G-S6 has not been live-tested against the real Microsoft tenant by the AgentProof developer. The founder must run the local route to confirm real environments and agents are discovered.
- Re-score optional
Change id: 1G-S7
Change date: 2026-05-04
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S7 - fix Microsoft auth session truth and dev-safe session persistence
Reason: The founder live-tested Phase 1G-S6 and observed the AgentProof UI showing Disconnect Microsoft and Microsoft signed in while a direct browser-console fetch to the connectors microsoft auth status route returned 200 with status not signed in and the connectors microsoft environments route returned 401 auth required. The UI claimed signed-in while the server was correctly reporting not signed-in. Root cause: the Microsoft session store was a module-local singleton (a let-declared shared store variable initialised to null inside the connectors microsoft session store module). Next.js dev mode re-evaluates route modules across hot reload and can load the same route into different module instances per chunk, so the in-memory store the cookie pointed at no longer existed when the next route handler imported the module fresh. This is not a Microsoft permission problem; it is an AgentProof auth session truth bug. Phase 1G-S7 fixes the session-store persistence model and tightens the UI truth contract so AgentProof never displays signed-in unless the connectors microsoft auth status route returns signed-in.
What changed: Replaces the module-local shared store singleton in the connectors microsoft session store module with a globalThis-backed singleton attached to a documented key (the AgentProof Microsoft session store version one global key). The same Node process now keeps the same store instance regardless of route module re-evaluation. Adds a get microsoft session store instance id helper so dev diagnostics can confirm a single store instance is handling auth start, auth callback, auth status, environments, and agents. Documents the security boundary explicitly: in-memory only; tokens never leave the server, are never persisted to disk, localStorage, sessionStorage, IndexedDB, cookies (other than the opaque session id), or Supabase, are never returned to the browser, and are never logged. Re-asserts that production must replace this in-memory store with an encrypted server-side cache before commercial launch.
Adds a hard-guarded dev-only diagnostic route at the connectors microsoft auth debug status path. The route returns 404 when NODE_ENV equals production. In dev it returns ONLY safe presence and state booleans plus counts: runtime, NODE_ENV, store instance id, cookie present, session id prefix (first four chars only, never the full id), session found, flow state, token present (boolean only, never the value), token expiry boolean, discovered environment count, store size. Tokens, refresh tokens, id tokens, client secrets, authorisation codes, full session ids, and business records are never returned.
Tightens the UI: when the connectors microsoft auth status route returns not signed in, the json paste score card component now also clears stale environment, agent, and footprint state so the UI cannot show Loading environments against a signed-out session. If the page arrived at the score paste page from a microsoft connected query parameter but the server says not signed-in, the UI surfaces a calm explanation: Microsoft returned to AgentProof, but the local session was not found. Please reconnect. The component documents in source that the connectors microsoft auth status route is the SINGLE SOURCE OF TRUTH and that a microsoft connected URL query never sets signed-in on its own. When the connectors microsoft environments route returns 401, the UI clears all stale signed-in state and surfaces the actionable reconnect panel: Your Microsoft sign-in was not available to this request. Reconnect Microsoft. The same handling applies when the connectors microsoft agents route returns 401. Loading state always clears in a finally block on every fetch path so the UI never hangs on Loading environments indefinitely. All Microsoft fetches now use credentials: "include" as defence in depth (same-origin cookies are sent by default, but the contract is now explicit).
Approved API route inventory is updated to include the new dev-only debug-status route (16 to 17 routes). The Phase 2W frozen baseline manifest api route count and api routes list, the Phase 2A surface safety APPROVED API ROUTES, and the documented frozen-baseline manifest validator now expect seventeen. Adds 26 new Phase 1G-S7 unit tests pinning the globalThis store, the dev-only debug endpoint hard guard, the auth flow truth contract (start, callback, status, environments, agents share same cookie plus same store plus nodejs runtime), the UI truth rule (auth status is single source of truth), the 401 stale-state-clear behaviour, the loading-state finally-block behaviour, the credentials: "include" contract, and the safety boundaries. package.json bumped from 0.114.0 to 0.115.0. methodology current methodology summary product version bumped to 0.115.0. methodology changelog now carries 105 entries with change id 1G-S7 last (source type internal methodology, re score recommended false). The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default.
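The move from a module-local variable to a globalThis-backed singleton can be sketched as follows. This is a minimal illustration under stated assumptions: the key string, the store shape, and the function name are hypothetical, and the sketch keeps only a signed-in flag rather than any token material.

```typescript
interface MicrosoftSession {
  signedIn: boolean;
}

interface SessionStore {
  instanceId: string;
  sessions: Map<string, MicrosoftSession>;
}

// Hypothetical key; the entry only says the store hangs off a documented global key.
const STORE_KEY = "__example_microsoft_session_store_v1__";

// Returns the one store for this Node process. A module-local `let` would be
// reset whenever the module is re-evaluated; a property on globalThis is not.
function getSessionStore(): SessionStore {
  const globals = globalThis as unknown as Record<string, SessionStore | undefined>;
  let store = globals[STORE_KEY];
  if (store === undefined) {
    store = {
      instanceId: Math.random().toString(36).slice(2, 10),
      sessions: new Map<string, MicrosoftSession>(),
    };
    globals[STORE_KEY] = store;
  }
  return store;
}
```

Every route that calls getSessionStore() in the same process sees the same instanceId, which is the property a diagnostic helper could report. The store is still lost on a full process restart, so this pattern addresses module re-evaluation only.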
User impact: The founder no longer hits the impossible-to-debug state where the AgentProof UI claims signed-in while the server says not signed-in. The session created by Connect Microsoft now survives across the redirect back from Microsoft and across subsequent route calls in the same Node process even when Next.js dev hot reload re-evaluates route modules. If the local dev server actually restarted between sign-in and the redirect (a separate kind of session loss), the UI surfaces Microsoft returned to AgentProof, but the local session was not found. Please reconnect, instead of leaving the UI stuck on Loading environments. If any environments or agents call returns 401, the UI immediately clears the stale signed-in state and renders a calm reconnect panel. The dev-only debug-status endpoint gives the founder a single console fetch to confirm the cookie reached the server, the session was found, the same store handled both auth start and auth status, and the token is present without ever exposing the token. UI product approval still requires founder review; technical tests alone do not approve the UI. Real Microsoft discovery still requires a real Microsoft Entra ID app registration with read-only delegated permissions and a real Power Platform admin role. Internal milestones are not success - success means the product sells.
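The UI truth rule described above (only the server's auth status can set signed-in, and a 401 clears stale state) can be expressed as two pure state transitions. This is a sketch under assumptions: the state shape and function names are hypothetical, the generic failure notice is invented for the example, and the real component uses React state and a finally block rather than these functions.

```typescript
type AuthView = "checking" | "signed_in" | "not_signed_in";

interface UiState {
  auth: AuthView;
  loading: boolean;
  environments: string[];
  agents: string[];
  notice: string | null;
}

// Signed-out state with every piece of discovery data cleared.
function signedOut(notice: string | null): UiState {
  return { auth: "not_signed_in", loading: false, environments: [], agents: [], notice };
}

// The server's answer is the only input that can produce signed_in;
// arriving from a callback URL changes the notice, never the auth state.
function applyAuthStatus(ui: UiState, serverSignedIn: boolean, arrivedFromCallback: boolean): UiState {
  if (serverSignedIn) return { ...ui, auth: "signed_in", notice: null };
  return signedOut(
    arrivedFromCallback
      ? "Microsoft returned to AgentProof, but the local session was not found. Please reconnect."
      : null,
  );
}

// Every branch ends with loading false, mirroring the finally-block rule.
function applyEnvironmentsResponse(ui: UiState, httpStatus: number, environments: string[]): UiState {
  if (httpStatus === 401) {
    return signedOut("Your Microsoft sign-in was not available to this request. Reconnect Microsoft.");
  }
  if (httpStatus === 200) return { ...ui, loading: false, environments, notice: null };
  // Hypothetical wording for other failures.
  return { ...ui, loading: false, notice: "Environment discovery failed." };
}
```

Writing the transitions as pure functions makes the contract easy to pin in unit tests: no input combination that lacks a server signed-in answer can yield signed_in.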
When to re-score: Phase 1G-S7 fixes the local-dev session-store persistence model, tightens the UI auth-truth contract, and adds a hard-guarded dev-only debug endpoint. No engine change, no scoring change, no rendered scorecard Markdown change. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 stays at 99fae010.
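The hard guard on the dev-only debug endpoint can be illustrated with a small pure function. This is a sketch, assuming a hypothetical snapshot type and response shape; only the behaviours named in the entry are taken from it (404 in production, booleans and counts only, and a four-character session id prefix).

```typescript
// Hypothetical snapshot of server-side state handed to the debug builder.
interface DebugSnapshot {
  storeInstanceId: string;
  cookieSessionId: string | null;
  sessionFound: boolean;
  tokenPresent: boolean;
  discoveredEnvironmentCount: number;
  storeSize: number;
}

interface DebugResponse {
  status: number;
  body: Record<string, string | number | boolean | null> | null;
}

function buildDebugStatus(nodeEnv: string | undefined, snap: DebugSnapshot): DebugResponse {
  // Hard guard: in production the endpoint behaves as if it does not exist.
  if (nodeEnv === "production") return { status: 404, body: null };
  return {
    status: 200,
    body: {
      node_env: nodeEnv ?? null,
      store_instance_id: snap.storeInstanceId,
      cookie_present: snap.cookieSessionId !== null,
      // First four characters only; the full session id is never returned.
      session_id_prefix: snap.cookieSessionId === null ? null : snap.cookieSessionId.slice(0, 4),
      session_found: snap.sessionFound,
      token_present: snap.tokenPresent,
      discovered_environment_count: snap.discoveredEnvironmentCount,
      store_size: snap.storeSize,
    },
  };
}
```

Because the snapshot type has no token field at all, the builder cannot return a token even by mistake; the route would pass in a boolean computed server-side.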
Docs, Tests, Content, Methodology, Components, Api, Lib, Primary buyer surface, Diagnostics
Evidence trace: the rewritten connectors microsoft session store module with globalThis-backed singleton plus store instance id helper, the new dev-only connectors microsoft auth debug status route hard-guarded behind NODE_ENV not production, the json paste score card component changes (auth status SINGLE SOURCE OF TRUTH comment, not signed-in stale-state clear, session lost after callback message, environments and agents 401 stale-state clear, credentials: "include" on all Microsoft fetches), the updated Phase 2A approved API route inventory, the updated Phase 1D frozen baseline manifest support module DOCUMENTED API ROUTES, the updated Phase 2W frozen baseline manifest content api route snapshot (16 to 17), the updated Phase 1G real microsoft boundaries route count assertion, the new Phase 1G-S7 unit test file (26 new tests), package.json version bump from 0.114.0 to 0.115.0, the methodology changelog 105th entry, the README Phase 1G-S7 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S7 internal methodology record (Microsoft auth session truth plus dev-safe session persistence)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-04
- Reference
- Phase 1G-S7 founder execution signal: founder live-tested Phase 1G-S6 and observed UI showing signed-in while server returned not signed-in (auth status 200 not signed in, environments 401 auth required).
- Impact assessment
- Fixes the local-dev session persistence defect that caused the AgentProof UI to display signed-in while the server returned not signed-in. The Microsoft session store now uses a globalThis-backed singleton attached to a documented key, so the same Node process keeps the same store instance regardless of Next.js dev hot reload or route module re-evaluation. Documents the security boundary explicitly and re-asserts that production must replace the in-memory store with an encrypted server-side cache before commercial launch. Adds a hard-guarded dev-only diagnostic endpoint at the connectors microsoft auth debug status path (returns 404 in production) that returns only safe presence and state booleans plus counts (cookie present, session found, store instance id, token present boolean, discovered environment count, etc.) and never tokens or full session ids. Tightens the UI: the connectors microsoft auth status route is the single source of truth for whether the founder is signed in, the UI never infers signed-in from a URL query parameter alone, the UI clears stale environment, agent, and footprint state when the server says not signed-in, and the UI surfaces a calm session lost after callback explanation when the page arrives from a microsoft connected query but the server has no session. On 401 from environments or agents, the UI clears the stale signed-in state and surfaces the actionable reconnect panel. Loading state always clears in a finally block on every fetch path so the UI never hangs on Loading environments indefinitely. All Microsoft fetches use credentials include as defence in depth. Adds 26 unit tests pinning the globalThis store, the dev-only debug endpoint hard guard, the auth flow truth contract, the UI truth rule, the 401 stale-state-clear behaviour, the loading-state finally-block behaviour, the credentials include contract, and the safety boundaries. 
Approved API route inventory is updated to include the dev-only debug endpoint (16 to 17 routes). The Phase 2W frozen baseline manifest snapshot, Phase 2A surface safety APPROVED API ROUTES, and the documented frozen-baseline validator are updated. The Phase 1G-S2 OAuth flow plus the Phase 1G-S5 wizard plus the Phase 1G-S6 server diagnostics are preserved. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval requires founder review; technical tests alone do not approve the UI. Real Microsoft discovery still requires a real Microsoft Entra ID app registration. Internal milestones are not success - success means the product sells.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The Phase 1G-S6 archive is superseded as the local-development baseline by this Phase 1G-S7 archive because the Phase 1G-S6 session store was a module-local singleton that did not survive Next.js dev hot reload, causing the UI to display signed-in while the server returned not signed-in. Phase 1G-S7 has not been live-tested against the real Microsoft tenant by the AgentProof developer. The founder must run the local route to confirm Disconnect Microsoft only appears after the connectors microsoft auth status route returns signed-in, environments auto-fetch runs only after confirmed signed-in, and the UI surfaces real safe diagnostics on any failure.
- Re-score optional
Change id: 1G-S8
Change date: 2026-05-04
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G-S8 - hard fix Microsoft callback session with auth trace proof
Reason: The founder live-tested Phase 1G-S7 and observed the connectors microsoft auth status route returning status not signed-in after Microsoft sign-in completed. The connectors microsoft environments route returned 401 auth required. The Phase 1G-S7 globalThis singleton was insufficient: the in-memory pending session created at auth start could still be wiped between auth start and auth callback by Next.js dev mode (file-watch-driven server restart, hot-reload corner cases, or a long Microsoft sign-in window). When the in-memory pending session is gone, auth callback cannot validate state, redirects with auth failed, and the server-visible signed-in state never appears. Phase 1G-S8 makes this impossible to miss by adding (a) an HMAC-signed pending handshake cookie that survives any dev restart, (b) a server-side auth trace ring buffer surfaced via the dev-only debug-status endpoint, (c) a build-identity endpoint, and (d) a callback proof state on the URL.
What changed: Adds a pure deterministic helper at the connectors microsoft pending handshake module that encodes the OAuth state plus PKCE verifier plus issued-at-ms into a base64url payload signed with HMAC-SHA256 keyed by the local microsoft session secret. The cookie is HttpOnly, SameSite=Lax, Path=/, Secure only in production, with a 5-minute TTL. It carries NO token material. HMAC verification is constant-time, using timingSafeEqual.
Adds a server-side auth trace ring buffer at the connectors microsoft auth trace module attached to globalThis under a documented key. The buffer holds the last 20 events. Each event carries an allowlisted name plus an allowlisted set of safe field names. The trace module rejects unknown event names silently and drops fields outside the safe-field allowlist. A defence-in-depth redactor scrubs Bearer tokens, JWT-shaped strings, and long base64url runs from any string field.
Adds a tiny build-identity endpoint at the agentproof version path that returns product version plus phase id plus expected archive name plus build label plus served at. No secrets, no tokens, no tenant values; safe in production.
Updates the connectors microsoft auth start route to also write the pending handshake state into the HMAC-signed cookie and to record auth start trace events. Updates the connectors microsoft auth callback route to validate state and PKCE from the HMAC-signed cookie via the constant-time decoder, record the full callback trace, exchange the code, mark the same session signed in (or create a fresh signed-in session if the original cookie was dropped), refresh the signed-in cookie, clear the pending handshake cookie, and redirect to the score paste page with the new microsoft callback returned URL flag. Updates the connectors microsoft auth status route to record auth status checked plus the appropriate signed-in or not signed-in or setup-needed branch event. Updates the connectors microsoft environments route to record environments auth required when no session is present or when the session is not signed in. Extends the dev-only connectors microsoft auth debug status route to surface current product version plus phase id plus expected archive name plus pending cookie present (boolean only) plus last auth events from the trace buffer.
Updates the json paste score card component so the initial useState reads window.location.search and starts in the checking state when the page was reached via either the microsoft callback returned (Phase 1G-S8) flag or the legacy microsoft connected (Phase 1G-S7) flag. The flag never sets signed in on its own. The component also surfaces the calm session lost after callback message when the page arrived from either flag and the server says not signed in, with a pointer to the dev-only debug status endpoint.
Approved API route inventory is updated to include the new agentproof version endpoint (17 to 18 routes). Phase 2A surface safety APPROVED API ROUTES, Phase 1D frozen baseline manifest support module DOCUMENTED API ROUTES, Phase 2W frozen baseline manifest content api route snapshot, and the documented frozen-baseline manifest validator are all updated. Adds 39 new Phase 1G-S8 unit tests. package.json bumped from 0.115.0 to 0.116.0. methodology current methodology summary product version bumped to 0.116.0. methodology changelog now carries 106 entries with change id 1G-S8 last (source type internal methodology, re score recommended false). The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default.
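The HMAC-signed pending handshake described above can be sketched with Node's built-in crypto module. This is an illustration under assumptions, not the AgentProof helper: the function names, the payload.signature cookie layout, and the JSON field names are hypothetical. What the entry does specify is reflected here: a base64url payload carrying state, PKCE verifier, and issued-at time, HMAC-SHA256 keyed by a local secret, constant-time comparison, and a 5-minute lifetime.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

interface PendingHandshake {
  state: string;
  codeVerifier: string;
  issuedAtMs: number;
}

const TTL_MS = 5 * 60 * 1000;

function hmac(payload: string, secret: string): Buffer {
  return createHmac("sha256", secret).update(payload).digest();
}

// Cookie value layout (assumed): base64url(JSON) + "." + base64url(HMAC).
function encodePendingHandshake(h: PendingHandshake, secret: string): string {
  const payload = Buffer.from(JSON.stringify(h), "utf8").toString("base64url");
  return `${payload}.${hmac(payload, secret).toString("base64url")}`;
}

// Returns null for any malformed, tampered, wrongly keyed, or expired value.
function decodePendingHandshake(cookie: string, secret: string, nowMs: number): PendingHandshake | null {
  const dot = cookie.indexOf(".");
  if (dot < 0) return null;
  const payload = cookie.slice(0, dot);
  const given = Buffer.from(cookie.slice(dot + 1), "base64url");
  const expected = hmac(payload, secret);
  // timingSafeEqual throws on unequal lengths, so check length first.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const fields = parsed as Record<string, unknown>;
  const state = fields["state"];
  const codeVerifier = fields["codeVerifier"];
  const issuedAtMs = fields["issuedAtMs"];
  if (typeof state !== "string" || typeof codeVerifier !== "string" || typeof issuedAtMs !== "number") {
    return null;
  }
  const ageMs = nowMs - issuedAtMs;
  if (ageMs < 0 || ageMs > TTL_MS) return null;
  return { state, codeVerifier, issuedAtMs };
}
```

Because the signature is verified before the payload is parsed, the server needs no in-memory record of the handshake; that is why the value can survive a dev-server restart as long as the secret is unchanged.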
User impact: The founder can now prove which build is running locally by opening the agentproof version endpoint in the browser. The Microsoft sign-in handshake survives a Next.js dev-server restart because the pending state and PKCE verifier are carried in an HMAC-signed cookie keyed by the local session secret. After Microsoft redirects back to AgentProof, the URL now carries the microsoft callback returned flag. The UI initialises into a "Checking Microsoft connection" state and immediately re-fetches the auth status route. If the auth status route returns signed-in, the UI proceeds to environment auto-fetch. If not, the UI surfaces a calm "Microsoft returned to AgentProof, but the local session was not found. Please reconnect" message with a pointer to the dev-only debug-status endpoint so the founder can see exactly which step failed. The dev-only debug-status endpoint now surfaces the last 20 auth events plus the build identity. Tokens, refresh tokens, id tokens, client secrets, authorisation codes, full session ids, and business records are never returned, never logged, and never written to the trace. UI product approval still requires founder review; technical tests alone do not approve the UI. Real Microsoft discovery still requires a real Microsoft Entra ID app registration plus a real Power Platform admin role. Internal milestones are not success; success means the product sells.
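The auth trace behaviour this entry describes (last 20 events, allowlisted names and fields, a token-shape redactor) can be sketched as below. The event names, field names, class name, and regular expressions are all assumptions chosen for illustration; the entry does not publish the real allowlists or patterns.

```typescript
const MAX_EVENTS = 20; // the entry states the buffer holds the last 20 events

// Hypothetical allowlists; the real event and field names are not given in this entry.
const ALLOWED_EVENTS = new Set(["auth_start", "auth_callback", "auth_status_checked"]);
const SAFE_FIELDS = new Set(["outcome", "reason", "pending_cookie_present"]);

type SafeValue = string | number | boolean;

interface TraceEvent {
  name: string;
  atMs: number;
  fields: Record<string, SafeValue>;
}

// Defence-in-depth: scrub token-shaped substrings from any string value.
// These patterns are illustrative approximations, not the product's own.
function redact(value: string): string {
  return value
    .replace(/Bearer\s+[A-Za-z0-9._~+\/=-]+/gi, "[redacted]")
    .replace(/[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}/g, "[redacted]") // JWT-shaped
    .replace(/[A-Za-z0-9_-]{40,}/g, "[redacted]"); // long base64url run
}

class AuthTraceBuffer {
  private events: TraceEvent[] = [];

  record(name: string, fields: Record<string, SafeValue>, atMs: number): void {
    if (!ALLOWED_EVENTS.has(name)) return; // unknown event names are dropped silently
    const safe: Record<string, SafeValue> = {};
    for (const [key, value] of Object.entries(fields)) {
      if (!SAFE_FIELDS.has(key)) continue; // drop fields outside the allowlist
      safe[key] = typeof value === "string" ? redact(value) : value;
    }
    this.events.push({ name, atMs, fields: safe });
    if (this.events.length > MAX_EVENTS) this.events.shift(); // keep only the newest 20
  }

  // Return copies so callers cannot mutate the buffer.
  snapshot(): TraceEvent[] {
    return this.events.map((e) => ({ ...e, fields: { ...e.fields } }));
  }
}
```

The allowlist is the primary control; the redactor only catches token-shaped text that slips into an otherwise permitted field.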
When to re-score: Phase 1G-S8 hardens the Microsoft OAuth callback handshake against dev-server restart, adds an auth trace buffer plus a build-identity endpoint, tightens the UI callback-returned handling, and adds 39 new unit tests. No engine change, no scoring change, no rendered scorecard Markdown change. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 stays at 99fae010.
Docs, Tests, Content, Methodology, Components, Api, Lib, Primary buyer surface, Diagnostics
Evidence trace: the new connectors microsoft pending handshake module with HMAC-signed cookie helpers, the new connectors microsoft auth trace ring buffer, the new agentproof version build-identity endpoint, the rewritten connectors microsoft auth start route, the rewritten connectors microsoft auth callback route, the updated auth status and environments routes, the extended dev-only connectors microsoft auth debug status route, the json paste score card component changes, the updated approved API route inventory (17 to 18), the new Phase 1G-S8 unit test file (39 new tests), package.json version bump from 0.115.0 to 0.116.0, the methodology changelog 106th entry, the README Phase 1G-S8 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S8 internal methodology record (hard fix Microsoft callback session plus auth trace proof)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-04
- Reference
- Phase 1G-S8 founder execution signal: founder live-tested Phase 1G-S7 and observed auth status returning not signed-in after Microsoft sign-in.
- Impact assessment
- Hardens the AgentProof Microsoft OAuth handshake against Next.js dev-server restart by carrying the pending state plus PKCE verifier in an HMAC-signed cookie keyed by the local session secret. Adds a server-side auth trace ring buffer with allowlisted event names plus safe fields plus a defence-in-depth token-shape redactor. Adds a build-identity endpoint at the agentproof version path. Updates auth start, auth callback, auth status, environments, and the dev-only debug-status routes. Updates the UI callback-returned handling. Adds 39 unit tests. The Phase 1G-S2 OAuth flow plus the Phase 1G-S5 wizard plus the Phase 1G-S6 server diagnostics plus the Phase 1G-S7 globalThis store are preserved. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval requires founder review.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The Phase 1G-S7 archive is superseded as the local-development baseline by this Phase 1G-S8 archive because the Phase 1G-S7 globalThis singleton did not survive a Next.js dev-server restart between auth start and auth callback. Phase 1G-S8 has not been live-tested against the real Microsoft tenant by the AgentProof developer.
- Re-score optional
Change id: 1G-S9 | Change date: 2026-05-04 | Product version: 0.167.1 | Methodology engine version: 0.9.1
Phase 1G-S9 - hydration repair and stale copy cleanup
Reason: Founder live-tested Phase 1G-S8. The agentproof version endpoint correctly reported product version 0.116.0 and phase id 1G-S8, confirming the new build was running. However, the actual UI failed with a Next.js hydration error overlay: "Hydration failed because the initial UI does not match what was rendered on the server. Expected server HTML to contain a matching <p> in <section>." The page also still displayed stale fictional and description copy: "AgentProof reviews an AI agent description", "A safe fictional demo first", and "Try the fictional example agent below". The UI showed "Disconnect Microsoft", "Loading", and "Click List environments to discover", all without the server confirming a signed-in session. Phase 1G-S8 is therefore not product-approved. Root cause of the hydration mismatch: the JsonPasteScoreCard component used a lazy useState initialiser, of the form useState(() => ...), that checked typeof window !== "undefined" and then read window.location.search to choose between checking and setup needed. On the server, window is undefined, so the initial state was setup needed and the wizard tree was rendered into the server HTML. On the client's first render with the microsoft callback returned URL flag, the initial state was checking, which renders a different tree. React detected the <section> and <p> mismatch and threw the hydration error.
What changed: Reverts the json paste score card microsoft auth status use state initialiser to a deterministic literal setup needed so the server-rendered HTML and the client's first hydrated render are byte-for-byte identical. Moves the URL-query check (the microsoft callback returned and microsoft connected branches) into the post-mount use effect, which runs only on the client AFTER hydration. The auth status remains the single source of truth: the URL flag never sets signed in on its own; it only triggers a transition into the checking state and a re-fetch of the auth status route. Replaces the buyer What you get section intro from AgentProof reviews an AI agent description and gives your team a short, practical readiness picture you can act on this week with discovery-first copy: AgentProof connects to your agent platform, discovers the agent footprint from real metadata and configuration, and gives your team a short, practical readiness picture you can act on this week. No paste, no description review. Replaces the safe fictional demo first bullet with a footprint-discovered bullet: A real agent footprint discovered from your environment. Topics, knowledge sources, actions, integrations, oversight signals - pulled from the platform, not described from memory. Updates the readiness-score bullet headline to A readiness score based on discovered metadata, configuration, controls, and confirmed unknowns. Updates the report bullet headline to A shareable readiness report after discovery and confirmation. Updates the Phase 1F Slice 5 content artefact, the Phase 1F support module expected buyer what you get intro and expected buyer what you get point test ids and expected buyer what you get point headlines, and the Phase 1F primary-copy plus value-points plus shape plus decision-safety tests so they accept the Phase 1G-S9 supersession. Updates the agentproof version endpoint constants to product version 0.117.0 and phase id 1G-S9. 
Backwards-compatible phase 1 g s8 alias exports resolve to the Phase 1G-S9 values so existing imports keep compiling. Extends the route-load smoke gate to forbid Hydration failed and initial UI does not match and Unhandled Runtime Error and AgentProof reviews an AI agent description and safe fictional demo and Try the fictional on the curl-visible response. Adds 29 new Phase 1G-S9 unit tests pinning hydration safety (no use state lazy initialiser reads window or document, the post-hydration URL-query check lives in a use effect), the discovery-first What you get copy, the build identity, the auth truth contract preserved, and the smoke gate's new forbidden strings. Updates the Phase 1D walkthrough state-model test to count use state calls in source with comments stripped (so the comment that documents the Phase 1G-S8 forbidden use state shape does not inflate the count). Updates the Phase 1G-S8 phase-identity-constants test to accept the supersession alias values. package.json bumped from 0.116.0 to 0.117.0. methodology current methodology summary product version bumped to 0.117.0. methodology changelog now carries 107 entries with change id 1G-S9 last (source type internal methodology, re score recommended false). The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default.
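The hydration-safe pattern described above can be sketched with a pure helper. This is an illustration under assumptions: the query-flag spellings (`microsoft_callback_returned`, `microsoft_connected`), the status strings other than `signed_in`, and the helper name are hypothetical. The key point from the entry is that the initial state is a literal shared by server and client, and the URL is only inspected after mount.

```typescript
// Status spellings are assumptions, except "signed_in", which this changelog quotes elsewhere.
type MicrosoftAuthStatus = "setup_needed" | "checking" | "signed_in" | "not_signed_in";

// Hypothetical query-flag names; the entry names the flags but not their exact spelling.
const CALLBACK_FLAGS = ["microsoft_callback_returned", "microsoft_connected"];

// Used for the server render AND the first client render: a literal, never derived from window.
const INITIAL_STATUS: MicrosoftAuthStatus = "setup_needed";

// Pure helper intended to be called from a post-mount effect with window.location.search.
function statusAfterMount(current: MicrosoftAuthStatus, search: string): MicrosoftAuthStatus {
  const params = new URLSearchParams(search);
  const arrivedFromCallback = CALLBACK_FLAGS.some((flag) => params.has(flag));
  // The flag only triggers a re-check; it never yields "signed_in" on its own.
  return arrivedFromCallback ? "checking" : current;
}
```

In a component this would typically be wired as `useState(INITIAL_STATUS)` plus a `useEffect` that calls `statusAfterMount` and then re-fetches the auth status route, so the first hydrated render always matches the server HTML.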
User impact: The score paste page no longer renders a Next.js hydration error overlay. The server-rendered HTML and the client first render now produce byte-identical trees (the wizard) for the microsoft auth state regardless of any URL flag, and the post-mount use effect transitions to checking immediately on the client when the URL flag is present, then polls auth status. The buyer-facing What you get section now reflects the discovery-first product direction: a real agent footprint discovered from the platform, a readiness score based on discovered metadata and configuration and controls and confirmed unknowns, top risks, practical next steps, and a shareable readiness report after discovery and confirmation. The fictional demo is no longer mentioned on the primary path; it lives behind the founder-only diagnostics toggle. The agentproof version endpoint reports product version 0.117.0 and phase id 1G-S9 so the founder can confirm at a glance whether the running build is the new one. UI product approval still requires founder review; technical tests alone do not approve the UI. Real Microsoft discovery still requires a real Microsoft Entra ID app registration with read-only delegated permissions and a real Power Platform admin role. Internal milestones are not success - success means the product sells.
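The extended route-load smoke gate can be pictured as a simple substring check over the response body. The fragment list below is taken from this entry; the function name and the case-sensitive matching are assumptions.

```typescript
// Fragments this entry lists as forbidden on the curl-visible response.
const FORBIDDEN_FRAGMENTS: readonly string[] = [
  "Hydration failed",
  "initial UI does not match",
  "Unhandled Runtime Error",
  "AgentProof reviews an AI agent description",
  "safe fictional demo",
  "Try the fictional",
];

// Returns every forbidden fragment found in the HTML; an empty array means the gate passes.
// Matching is case-sensitive here, which is an assumption of this sketch.
function findForbiddenFragments(html: string): string[] {
  return FORBIDDEN_FRAGMENTS.filter((fragment) => html.includes(fragment));
}
```

Returning the matched fragments, instead of a bare boolean, lets a failing gate report exactly which stale string or error overlay leaked into the page.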
When to re-score: Phase 1G-S9 fixes a client-side hydration mismatch and rewrites the buyer What-you-get copy. No engine change, no scoring change, no rendered scorecard Markdown change. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 stays at 99fae010.
Docs, Tests, Content, Methodology, Components, Api, Primary buyer surface
Evidence trace: the json paste score card use state initialiser reverted to a deterministic setup needed literal, the post-hydration URL-query check moved into the existing post-mount use effect, the rewritten What you get section (intro plus footprint-discovered bullet plus discovery-first readiness-score bullet plus shareable-report bullet, with the safe-fictional-demo bullet removed), the updated agentproof version endpoint constants for Phase 1G-S9, the updated Phase 1F Slice 5 content artefact and support module and tests, the updated Phase 1D walkthrough state-model test, the updated Phase 1G-S8 phase-identity-constants test, the extended route-load smoke gate forbidden strings, the new Phase 1G-S9 unit test file (29 new tests), package.json version bump from 0.116.0 to 0.117.0, the methodology changelog 107th entry, the README Phase 1G-S9 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S9 internal methodology record (hydration repair plus stale copy cleanup)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-04
- Reference
- Phase 1G-S9 founder execution signal: founder live-tested Phase 1G-S8 and observed a Next.js hydration error overlay plus stale fictional and description copy on the primary buyer path.
- Impact assessment
- Fixes the Phase 1G-S8 hydration mismatch by reverting the JsonPasteScoreCard microsoftAuthStatus useState initialiser to a deterministic literal setup needed and moving the URL-query check into a post-mount useEffect. Removes stale fictional and description copy from the primary buyer path. Replaces the buyer What you get section intro and the safe-fictional-demo bullet with discovery-first copy. Updates the Phase 1F Slice 5 content artefact, support module, and primary-copy plus value-points plus shape plus decision-safety tests to accept the supersession. Updates the agentproof version endpoint to product version 0.117.0 and phase id 1G-S9. Adds 29 unit tests. Extends the route-load smoke gate to forbid the hydration error fragments and the stale copy. The Phase 1G-S2 OAuth flow plus the Phase 1G-S5 wizard plus the Phase 1G-S6 server diagnostics plus the Phase 1G-S7 globalThis store plus the Phase 1G-S8 HMAC pending handshake cookie and auth trace ring buffer are preserved. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval requires founder review; technical tests alone do not approve the UI.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The Phase 1G-S8 archive is superseded as the local-development baseline by this Phase 1G-S9 archive because the Phase 1G-S8 useState lazy initialiser caused a Next.js hydration error overlay on the score paste page. Phase 1G-S9 has not been live-tested against the real Microsoft tenant by the AgentProof developer.
- Re-score optional
Change id: 1G-S10 | Change date: 2026-05-04 | Product version: 0.167.1 | Methodology engine version: 0.9.1
Phase 1G-S10 - Power Platform permission UI fallback and tenant setup reality check
Reason: Founder live-tested Phase 1G-S9. The agentproof version endpoint correctly reported product version 0.117.0 and phase id 1G-S9, the Microsoft auth/status route returned signed_in (auth/session works end-to-end), and the score paste page no longer rendered a Next.js hydration error overlay. However, the next call, GET /api/connectors/microsoft/environments, now reaches Microsoft and fails with a real Microsoft 403 (permission_insufficient mapping, endpoint host api.powerplatform.com, Microsoft safe diagnostic id 32681be22540fef0, Microsoft correlation id 098b8893-2bcd-4ae5-bacd-048976218f30). The founder's app registration, AgentProof Local Discovery, currently has Microsoft Graph User.Read (Delegated, Granted) plus Power Platform API AppManagement.Application (Delegated, Granted). The founder searched the Entra API permission picker for EnvironmentManagement.Environment.Read and EnvironmentManagement.Environments.Read; neither permission appears. This is no longer an AgentProof bug; it is a real tenant setup reality. AgentProof was telling the founder to add a permission that may not appear in the normal Entra picker.
What changed: Phase 1G-S10 confirms via official Microsoft documentation (programmability-authentication-v2, programmability-permission-reference, the REST API List Environments For User reference, and the Microsoft Q-and-A entry on provisioning the Power Platform API service principal) that (a) the documented permission for environment listing is environment management.Environments.Read on the Power Platform API service principal app id 8578e004-a5c6-46e7-913e-12f58912df43, (b) Power Platform API uses delegated permissions only at this time, (c) app management.application packages.Read reads application packages and is not the right permission to list environments, and (d) the documented Microsoft cause of the missing-permission case is that the Power Platform API service principal has not been provisioned in the tenant yet. Adds a new content evidence file at content/phase 1g s10 power platform permission ui fallback.v1.json with founder observed state, current api permissions seen, failed environment diagnostic, official docs checked, confirmed permission requirements, ui picker gap explanation, supported fallback options, the rejected-options list (including options not confirmed by official Microsoft docs), recommended next step, unresolved questions, product implications, and canonical decision; every unconfirmed claim is marked status unconfirmed and do not use for founder instruction true. Adds a new pure deterministic helper module at an internal source file that exports build microsoft admin handoff text (a copyable plain-English admin handoff message with placeholders for tenant-specific values, including the safe diagnostic id and the Microsoft correlation id) and build microsoft admin power shell snippet (the Microsoft-documented Microsoft Graph power shell + Azure CLI snippet that provisions the Power Platform API service principal for app id 8578e004-a5c6-46e7-913e-12f58912df43; placeholders only; AgentProof never executes it). 
Updates the json paste score card 403 error panel: when permission insufficient hits on list environments AFTER auth/status reaches signed in, the panel renders the new copy (Microsoft signed you in but blocked the environment-listing API; Your app currently has app management.Application which reads application packages; that is not enough to list environments; the documented permission is environment management.Environments.Read on the same Power Platform API; Power Platform API uses delegated permissions only) and a collapsed Permission not visible? fallback section that exposes the copyable admin handoff text, the safe READ-ONLY admin diagnostic snippet, the official Microsoft doc links, and an honest blocker statement if the admin will not provision. Fixes the wizard's earlier mis-spelled permission name from environment management.Environment.Read singular to environment management.Environments.Read plural (the official documented form). Updates the agentproof version endpoint constants to product version 0.118.0 and phase id 1G-S10. Backwards-compatible phase 1 g s8 and phase 1 g s9 alias exports resolve to the Phase 1G-S10 values so existing imports keep compiling. Adds 1G-S10 unit tests that pin (a) the evidence file shape, (b) the official-doc reference metadata on every confirmed permission claim, (c) the unconfirmed flag on every unconfirmed claim, (d) the admin handoff text shape and absence of token-shaped strings, (e) the 403 panel copy distinction, (f) the Permission not visible fallback presence, (g) the build identity, (h) the auth-status gate (panel does NOT say Microsoft permission problem until auth/status reaches signed in), and (i) the no-random-clicking copy. package.json bumped from 0.117.0 to 0.118.0. methodology current methodology summary product version bumped to 0.118.0. methodology changelog now carries 108 entries with change id 1G-S10 last (source type internal methodology, re score recommended false). 
The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the slice. NullProvider remains default.
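The auth-status gate on the 403 panel (the panel does not say "Microsoft permission problem" until auth/status reaches signed in) can be sketched as a small decision function. The `signed_in` and `permission_insufficient` strings appear in this changelog; every other status spelling, the panel names, and the function name are hypothetical.

```typescript
// "signed_in" and "permission_insufficient" are quoted in this changelog; the rest are assumptions.
type AuthStatus = "signed_in" | "not_signed_in" | "setup_needed" | "checking";
type EnvironmentsError = "permission_insufficient" | "other";
type ErrorPanel = "none" | "reconnect" | "permission_fallback" | "generic_error";

function chooseErrorPanel(auth: AuthStatus, error: EnvironmentsError | null): ErrorPanel {
  if (error === null) return "none";
  // Never blame Microsoft permissions until the server has confirmed a signed-in session.
  if (auth !== "signed_in") return "reconnect";
  return error === "permission_insufficient" ? "permission_fallback" : "generic_error";
}
```

Checking the auth status before the error kind keeps a lost session from being misreported as a tenant permission gap.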
User impact: The score paste page now handles the real tenant case the founder ran into on Phase 1G-S9. When Microsoft sign-in succeeds (auth/status returns signed in) but the environment-listing call returns 403, the buyer no longer stares at a generic permission error or gets sent to keep clicking around in Entra. The 403 panel now plainly says Microsoft signed them in but blocked the environment-listing API, distinguishes the app management.Application family that they actually have today from the environment management.Environments.Read permission they need, and exposes a collapsed Permission not visible? fallback with (1) a copyable admin handoff message for the buyer to send to their Power Platform / Entra admin (containing endpoint host, the safe Microsoft diagnostic id, and the Microsoft correlation id - never tokens or secrets), (2) the official Microsoft Graph power shell + Azure CLI snippet that an admin pastes into their own shell to provision the Power Platform API service principal in the tenant (app id 8578e004-a5c6-46e7-913e-12f58912df43; placeholders only; AgentProof never executes it), (3) links to the three official Microsoft docs that back every claim in the panel, and (4) an honest blocker statement that AgentProof cannot proceed in this tenant if the admin will not provision the service principal or grant the documented permission. The agentproof version endpoint now reports product version 0.118.0 and phase id 1G-S10. UI product approval still requires founder review; technical tests alone do not approve the UI. Internal milestones are not success - success means the product sells.
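A copyable admin handoff message of the kind described above might be assembled as follows. This is a sketch only: the function name, input shape, and wording are hypothetical and do not reproduce the product's actual copy. The permission name and the Power Platform API app id are the values recorded in this entry.

```typescript
// Hypothetical input shape; only diagnostic identifiers go in, never tokens or secrets.
interface HandoffInput {
  endpointHost: string;
  safeDiagnosticId: string;
  correlationId: string;
}

// Values recorded in this changelog entry.
const REQUIRED_PERMISSION = "EnvironmentManagement.Environments.Read";
const POWER_PLATFORM_API_APP_ID = "8578e004-a5c6-46e7-913e-12f58912df43";

// Illustrative wording only; the real handoff copy is not reproduced here.
function buildAdminHandoffText(input: HandoffInput): string {
  return [
    "Microsoft sign-in succeeded, but the environment-listing API returned 403.",
    `Endpoint host: ${input.endpointHost}`,
    `Diagnostic id: ${input.safeDiagnosticId}`,
    `Microsoft correlation id: ${input.correlationId}`,
    `Documented permission needed (delegated): ${REQUIRED_PERMISSION}`,
    `Power Platform API app id: ${POWER_PLATFORM_API_APP_ID}`,
  ].join("\n");
}
```

Keeping the builder pure and placeholder-driven makes it easy to test that no token-shaped string can reach the message.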
When to re-score: Phase 1G-S10 is a UX and evidence slice. The error panel adds plain-English copy, a copyable admin handoff, an official PowerShell snippet, and doc links. No engine change, no scoring change, no rendered scorecard Markdown change. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 stays at 99fae010.
Docs, Tests, Content, Methodology, Components, Lib, Api, Primary buyer surface
Evidence trace: the new content evidence file content/phase 1g s10 power platform permission ui fallback.v1.json with the twelve required sections and the unconfirmed flags on every unconfirmed claim, the new helper module (an internal source file) exporting build microsoft admin handoff text and build microsoft admin power shell snippet plus microsoft admin handoff official doc links plus power platform api first party app id plus required environment listing permission plus power platform api delegated only note, the json paste score card 403 panel update with the permission-not-visible fallback subtree, the wizard permission-name fix to EnvironmentManagement.Environments.Read, the updated agentproof version endpoint constants for Phase 1G-S10, the new Phase 1G-S10 unit test file, package.json version bump from 0.117.0 to 0.118.0, the methodology changelog 108th entry, the README Phase 1G-S10 row.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-S10 internal methodology record (Power Platform permission UI fallback plus tenant setup reality check)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-04
- Reference
- Phase 1G-S10 founder execution signal: founder live-tested Phase 1G-S9, the agentproof version endpoint reported product version 0.117.0 and phase id 1G-S9, the Microsoft auth/status route returned signed_in, and GET /api/connectors/microsoft/environments returned permission_insufficient with endpoint host api.powerplatform.com and Microsoft safe diagnostic id 32681be22540fef0 and Microsoft correlation id 098b8893-2bcd-4ae5-bacd-048976218f30. Founder confirmed the AgentProof Local Discovery app registration has Microsoft Graph User.Read plus Power Platform API AppManagement.Application as the only granted API permissions and that EnvironmentManagement.Environment.Read and EnvironmentManagement.Environments.Read do not appear in the Entra API permission picker.
- Impact assessment
- Confirms via official Microsoft documentation (programmability-authentication-v2, programmability-permission-reference, REST API List Environments For User reference, Microsoft Q-and-A on provisioning the Power Platform API service principal) the documented permission for environment listing is EnvironmentManagement.Environments.Read on the Power Platform API service principal AppId 8578e004-a5c6-46e7-913e-12f58912df43, that Power Platform API uses delegated permissions only at this time, that AppManagement.ApplicationPackages.Read reads application packages (not environment listing), and that the documented Microsoft cause of the missing-permission case is that the Power Platform API service principal has not been provisioned in the tenant yet. Adds the new content evidence file, the new admin handoff helper module, the JsonPasteScoreCard 403 panel update with the Permission not visible fallback subtree, the wizard permission name fix, the updated agentproof version endpoint constants, and the Phase 1G-S10 unit tests. The Phase 1G-S2 OAuth flow plus the Phase 1G-S5 wizard plus the Phase 1G-S6 server diagnostics plus the Phase 1G-S7 globalThis store plus the Phase 1G-S8 HMAC pending handshake cookie and auth trace ring buffer plus the Phase 1G-S9 hydration repair are preserved. The score response shape is unchanged. The deterministic scorecard Markdown SHA-256 stays at 99fae010. NullProvider remains default. UI product approval requires founder review; technical tests alone do not approve the UI.
- Limitations
- No external source was reviewed beyond the cited official Microsoft documentation pages. Internal methodology development only. The Phase 1G-S9 archive is superseded as the local-development baseline by this Phase 1G-S10 archive because Phase 1G-S9 left the founder with no documented next step when the documented EnvironmentManagement.Environments.Read permission did not appear in the Entra API permission picker. Phase 1G-S10 has not been live-tested against the real Microsoft tenant by the AgentProof developer; the founder must run the local route and confirm the new 403 panel renders the new copy and the Permission not visible fallback subtree. The public Microsoft Permission reference page does not publish per-permission GUIDs at the time of this slice; Phase 1G-S10 therefore does not advise the founder to run az ad app permission add with a guessed permission GUID.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1D | Change date: 2026-05-03 | Product version: 0.89.0 | Methodology engine version: 0.9.1
Phase 1D slice 0 — controlled implementation start (entry decision record)
Reason: The founder explicitly issued the controlled implementation start instruction for AgentProof. Phase 1D slice 0 (phase 1d entry decision record completion) converts the previous no-go planning chain into a real controlled entry decision record. Only slice 0 is authorised; every later slice remains unauthorised and will require a separate explicit founder execution instruction. The implementation branch phase-1d-controlled-start-entry-decision-record was created with a normal git checkout -b command after git init on the project root. No destructive git command was run. Public launch, paid use, live use, live customer data, live scans, live LLM, payments, Supabase change, new API route, local storage change, NullProvider default change, marketing/pricing/public sales pages, and test weakening all remain unauthorised. The project is not being parked or frozen.
What changed: A new canonical artefact lives at content/phase 1d entry decision record.v1.json. It records the founder execution signal (received as plain-English founder instruction; not a legal or compliance approval), the approved Phase 3R baseline (agentproof-phase-3r.tar.gz, SHA-256 5e73f3b853b97554dead62591a6c54bbe2cec83040a79b74f8e3c8189067dd1e, 1959661 bytes, 951 entries, 20583 / 20583 / 0 tests, archive-evidence product version 0.88.0), the controlled implementation branch phase-1d-controlled-start-entry-decision-record, the authorised slice phase 1d entry decision record completion at slice index 0, the unauthorised future slices (1, 2, 3, 4, 5, 6, 7, 8, 9), the controlled implementation status (Phase 1D started in the controlled-implementation sense only and not in the legal or compliance approval sense), the base-product direction (sellable base product followed by continuous improvement; stay solo-first, low-touch, self-serve, defensible, ethical, product-led, base-product-first), the runtime boundary status (NullProvider remains default, no Supabase change, no new API route, no local storage change, no service-role usage, no live LLM activation, no live scans, no live customer data, no payments, no deployment change, no marketing/pricing/public sales pages, no test weakening, engine version not bumped, context packs version not bumped), the public/paid use status (paid use, public launch, live use all unauthorised), the legal/governance posture (this is not legal advice; legal/compliance/regulatory/investor approval not claimed; founder signal does not replace legal review), source references to the Phase 3R decision packet and Phase 3Q blocker ledger, slice-0 completion criteria, slice-0 evidence, deterministic validation rules, twenty-five prohibited-action confirmations all canonical-state true, the canonical entry decision (controlled start go for slice 0 only with authorised slice count 1, next slice authorised false, public or paid use 
authorised false, implementation scope slice 0 only, branch name phase-1d-controlled-start-entry-decision-record, base product objective sellable base product then continuous improvement, project parked false, project frozen false), and a full safety-marker table. One new test-side support module ships at an internal source file. Seven new unit tests pin the record end-to-end. New docs/phase 1 d.md, an updated docs/phase plan.md, an updated docs/phase 1 d entry decision record.md, and an updated README.md carry the founder-facing prose. Every tracked content file top-level product version is bumped to 0.89.0; every latest approved archive block is refreshed to the approved Phase 3R archive. The cascading hash refresh propagates from Phase 2N through Phase 3R to a fixed point. NullProvider remains the default.
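The canonical entry decision recorded above lends itself to a deterministic check. The sketch below assumes hypothetical snake_case field names for the values the entry lists (authorised slice count 1, next slice not authorised, public or paid use not authorised, project neither parked nor frozen); the real JSON keys are not shown in this entry.

```typescript
// Hypothetical field names modelled on the values listed in this entry.
interface EntryDecision {
  authorised_slice_count: number;
  next_slice_authorised: boolean;
  public_or_paid_use_authorised: boolean;
  project_parked: boolean;
  project_frozen: boolean;
}

// Returns a list of violations; an empty list means the record matches the slice 0 decision.
function sliceZeroViolations(d: EntryDecision): string[] {
  const violations: string[] = [];
  if (d.authorised_slice_count !== 1) violations.push("authorised_slice_count must be 1");
  if (d.next_slice_authorised) violations.push("next_slice_authorised must be false");
  if (d.public_or_paid_use_authorised) violations.push("public_or_paid_use_authorised must be false");
  if (d.project_parked) violations.push("project_parked must be false");
  if (d.project_frozen) violations.push("project_frozen must be false");
  return violations;
}
```

A check of this shape is deterministic by construction: the same record always yields the same list, which is what lets unit tests pin the decision.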
User impact: None at runtime. The methodology summary is bumped to product version 0.89.0 to keep methodology tracking in sync with this Phase 1D slice 0 controlled implementation start. Phase 1D has now entered controlled implementation mode under founder instruction; only slice 0 is authorised; every later slice remains unauthorised. The project is not parked and is not frozen.
When to re-score: Phase 1D slice 0 adds no scoring change, no engine change, no rendered-output change, and no runtime change beyond the recorded slice 0 documentation work. Existing scorecards remain valid.
Docs, Tests, Content, Methodology, Git
Evidence trace: docs/phase 1 d.md, docs/phase plan.md (Phase 1D slice 0 addendum), docs/phase 1 d entry decision record.md (Phase 1D slice 0 addendum), README.md (Phase 1D slice 0 shipped row), content/phase 1d entry decision record.v1.json, an internal source file, tests/unit/phase_1d_entry_decision_record_*.test.ts (seven), package.json (version bump 0.88.0 to 0.89.0), content/methodology changelog.v1.json (this 79th entry), git branch phase-1d-controlled-start-entry-decision-record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1D slice 0 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1D slice 0 founder execution signal
- Impact assessment
- Adds the canonical Phase 1D entry decision record at the static content layer, one test-side support module, seven new unit tests pinning shape and source integrity and branch and slice and runtime/public/paid/legal boundaries and decision and safety and cleanliness and determinism, and one founder-facing prose document at the docs layer. No product features. No runtime behaviour change beyond the recorded slice 0 documentation work. The record does not authorise any later slice, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage change, does not authorise NullProvider default change, does not authorise marketing/pricing/public sales pages, does not authorise test weakening, does not park or freeze the project, does not claim legal/compliance/regulatory/investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. The recommendation for the next slice is the smallest product-building Phase 1D slice that moves AgentProof toward a sellable base product, but the next slice will not run without a separate explicit founder execution instruction.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1D-S1
Change date: 2026-05-03
Product version: 0.90.0
Methodology engine version: 0.9.1
Phase 1D Slice 1 — first real scoring run smoke
Reason: The founder explicitly authorised Phase 1D Slice 1 (phase 1d first real scoring run smoke) — the first product-behaviour smoke slice after the Phase 1D Slice 0 controlled implementation start. Slice 1 proves AgentProof can run the deterministic scoring engine end to end on a hand-crafted, non-customer agent input fixture and produce a deterministic founder-facing Markdown scorecard export, with NullProvider as default, no live LLM, no live customer data, no live scans, no Supabase change, no new API route, no payments, no public/paid use. Future slices remain unauthorised and require a separate explicit founder execution instruction.
What changed: A new fictional non-customer demo fixture lives at samples/inputs/phase 1d slice 1 demo agent input.json (synthetic-by-construction, no real customer data, no secrets, no live URLs, deployment context internal only, reads sensitive data no). A new canonical evidence artefact lives at content/phase 1d first real scoring run smoke.v1.json recording the approved Phase 1D Slice 0 baseline (agentproof-phase-1d-slice-0.tar.gz, SHA-256 1275531711bf598ce88b43d3e7c21168eb637605adb8eb991a04266d9686eb31, 1,980,208 bytes, 961 entries, 20,635 / 20,635 / 0 tests, archive-evidence product version 0.89.0), the controlled Phase 1D branch, the fictional fixture, the deterministic scoring surface (an internal source file:score), the engine overall score (84) and readiness rating (ready for limited deployment), the rendered Markdown scorecard byte size (8070) and SHA-256 (c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61), a full prohibited-action confirmations table, a deterministic-validation rules table, the canonical slice decision (slice 1 completed for slice 1 only with implementation scope slice 1 only, next slice authorised false, public or paid use authorised false, live use authorised false, project parked false, project frozen false), and a full safety-marker table. One new test-side support module ships at an internal source file. Eight new unit tests pin the slice end to end. New docs/phase 1 d.md addendum, docs/phase plan.md addendum, docs/phase 1 d entry decision record.md addendum, and README.md row updates carry the founder-facing prose. Every tracked content file top-level product version is bumped to 0.90.0; every latest approved archive block is refreshed to the approved Phase 1D Slice 0 archive. NullProvider remains default.
User impact: None at runtime. The methodology summary is bumped to product version 0.90.0. Phase 1D Slice 1 produces the first real founder-facing Markdown scorecard from a non-customer demo fixture, proving the base product can score an agent-like input deterministically. Public launch, paid use, live use, live customer data, live LLM, payments, and runtime boundary changes all remain unauthorised. The project is not parked or frozen.
When to re-score: Phase 1D Slice 1 adds no scoring change, no engine change, no rendered-output change beyond a new fictional fixture, and no runtime change. Existing scorecards remain valid.
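The byte size and SHA-256 recorded for the rendered Markdown scorecard can be re-checked locally. The following is a minimal sketch, assuming Node.js and its built-in `node:crypto` module; the names `OutputFingerprint`, `fingerprintMarkdown`, and `isByteStable` are hypothetical and are not taken from the AgentProof codebase.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape: the two values an evidence artefact records for an output.
interface OutputFingerprint {
  byteLength: number;
  sha256: string;
}

// Compute the UTF-8 byte length and the hex SHA-256 of a rendered scorecard.
function fingerprintMarkdown(markdown: string): OutputFingerprint {
  const bytes = Buffer.from(markdown, "utf8");
  return {
    byteLength: bytes.byteLength,
    sha256: createHash("sha256").update(bytes).digest("hex"),
  };
}

// True only when both the size and the hash match the recorded values.
function isByteStable(markdown: string, expected: OutputFingerprint): boolean {
  const actual = fingerprintMarkdown(markdown);
  return (
    actual.byteLength === expected.byteLength &&
    actual.sha256 === expected.sha256
  );
}
```

Under these assumptions, a scorecard that reproduces the recorded run would satisfy `isByteStable(markdown, { byteLength: 8070, sha256: "c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61" })`.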
Docs, Tests, Content, Methodology, Samples
Evidence trace: samples/inputs/phase 1d slice 1 demo agent input.json, content/phase 1d first real scoring run smoke.v1.json, an internal source file, tests/unit/phase_1d_first_real_scoring_run_smoke_*.test.ts (eight), docs/phase 1 d.md (Phase 1D Slice 1 addendum), docs/phase plan.md (Phase 1D Slice 1 addendum), docs/phase 1 d entry decision record.md (Phase 1D Slice 1 addendum), README.md (Phase 1D Slice 1 shipped row), package.json (version bump 0.89.0 to 0.90.0), content/methodology changelog.v1.json (this 80th entry), git branch phase-1d-controlled-start-entry-decision-record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1D Slice 1 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1D Slice 1 founder execution signal
- Impact assessment
- Adds the canonical Phase 1D Slice 1 evidence artefact at the static content layer, one fictional non-customer demo fixture under samples/inputs/, one test-side support module, eight new unit tests pinning shape and fixture safety and execution and Markdown export and runtime/public/paid/legal boundaries and decision and safety and cleanliness and determinism, and three founder-facing documentation addenda. No product features beyond the recorded slice 1 smoke run. No runtime behaviour change. The slice does not authorise any later slice, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage change, does not authorise NullProvider default change, does not authorise marketing/pricing/public-sales-page work, does not authorise test weakening, does not park or freeze the project, does not claim legal/compliance/regulatory/investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. The recommendation for the next slice is the smallest product-building Phase 1D slice that moves AgentProof from CLI proof toward a usable self-serve base-product workflow, but the next slice will not run without a separate explicit founder execution instruction.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1D-S2
Change date: 2026-05-03
Product version: 0.91.0
Methodology engine version: 0.9.1
Phase 1D Slice 2 — first real scoring run self-serve CLI
Reason: The founder explicitly authorised Phase 1D Slice 2 (phase 1d first real scoring run self serve cli) — the smallest self-serve product workflow after the Slice 1 deterministic scoring smoke. Slice 2 lets a founder or first-look prospect run AgentProof locally against their own hand-crafted agent inputs JSON file and receive a deterministic founder-facing Markdown scorecard at a deterministic output path, without writing code. NullProvider remains default. No live LLM, no live customer data, no live scans, no Supabase / API / local storage / provider / payment change.
What changed: A new starter template lives at samples/templates/agentproof-agent-input-template.json (synthetic-by-construction, clearly labelled, no real customer data, no secrets, no live URLs). A new deterministic agent inputs validator lives at an internal source file and returns all errors at once with friendly per-field messages naming the field and what is wrong. An internal source file now uses the validator (with a hint to start from the starter template) and prints a new Output SHA-256 line in the success message. A new canonical evidence artefact lives at content/phase 1d first real scoring run self serve cli.v1.json and pins the approved Phase 1D Slice 1 baseline (agentproof-phase-1d-slice-1.tar.gz, SHA-256 8052fef1d041cd8f1515abc831551dd46c89220de69bb6f497641e9a56622bd6, 1,997,701 bytes, 973 entries, 20,698 / 20,698 / 0 tests, archive-evidence product version 0.90.0), the CLI entry point and argument contracts, the starter template SHA-256 (26cfa1df8d0bd410309a54125a140862605ce27bd4b94b5fcf6fa8bd35b1e13c, 2,800 bytes), the deterministic output SHA-256 (c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61, 8,070 bytes), eight documented validation error examples, 27 prohibited-action confirmations all canonical-state true, the canonical slice decision (slice 2 completed for slice 2 only with implementation scope slice 2 only, next slice authorised false, public or paid use authorised false, live use authorised false, project parked false, project frozen false), and a full safety-marker table. One new test-side support module ships at an internal source file. Nine new unit tests pin the slice end to end. New docs/phase 1 d.md Slice 2 section, docs/phase plan.md addendum, docs/phase 1 d entry decision record.md addendum, and README.md row updates carry the founder-facing run command and prose. Every tracked content file top-level product version is bumped to 0.91.0; every latest approved archive block is refreshed to the approved Phase 1D Slice 1 archive. NullProvider remains default.
User impact: None at runtime for users not using the CLI. Founders running 'pnpm score <input.json> --out=<output.md>' now receive (1) clear, actionable per-field validation errors when the JSON is wrong, (2) a starter template at samples/templates/agentproof-agent-input-template.json they can copy and edit, and (3) an Output SHA-256 line in the CLI success message so they can verify byte-stability. Public launch, paid use, live use, live customer data, live LLM, payments, and runtime boundary changes all remain unauthorised. The project is not parked or frozen.
When to re-score: Phase 1D Slice 2 adds no scoring change, no engine change, no rendered-output change beyond the same deterministic Markdown the engine already produced. Existing scorecards remain valid.
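The "all errors at once" behaviour of the validator can be illustrated with a small sketch. This is not the shipped validator: the function name `validateAgentInputsJson`, the `ValidationResult` type, and the three field names are hypothetical stand-ins, since the real AgentInputs schema is not reproduced in this log. The point shown is only the pattern of collecting every per-field message before returning, rather than stopping at the first failure.

```typescript
// Hypothetical result type: either the parsed object or every error found.
type ValidationResult =
  | { ok: true; value: Record<string, unknown> }
  | { ok: false; errors: string[] };

// Assumed field names, for illustration only.
const REQUIRED_STRING_FIELDS = ["agent_name", "purpose", "deployment_context"] as const;

function validateAgentInputsJson(text: string): ValidationResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    // Nothing else can be checked if the text is not JSON at all.
    return { ok: false, errors: ["Input is not valid JSON."] };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { ok: false, errors: ["Top-level value must be a JSON object."] };
  }

  const record = parsed as Record<string, unknown>;
  const errors: string[] = [];
  // Check every field and keep going, so the caller sees all problems together.
  for (const field of REQUIRED_STRING_FIELDS) {
    const value = record[field];
    if (value === undefined) {
      errors.push(`${field}: missing required field.`);
    } else if (typeof value !== "string" || value.trim() === "") {
      errors.push(`${field}: must be a non-empty string.`);
    }
  }
  return errors.length === 0 ? { ok: true, value: record } : { ok: false, errors };
}
```

Because the function is pure and takes a string, the same logic could plausibly back both a CLI and a browser form, which matches the validator reuse described for the later web slice.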
Docs, Tests, Content, Methodology, Cli, Samples, Lib
Evidence trace: samples/templates/agentproof-agent-input-template.json, content/phase 1d first real scoring run self serve cli.v1.json, an internal source file, tests/unit/phase_1d_self_serve_cli_*.test.ts (nine), an internal source file, an internal source file (updated), docs/phase 1 d.md (Slice 2 addendum), docs/phase plan.md (Slice 2 addendum), docs/phase 1 d entry decision record.md (Slice 2 addendum), README.md (Slice 2 shipped row), package.json (version bump 0.90.0 to 0.91.0), content/methodology changelog.v1.json (this 81st entry), git branch phase-1d-controlled-start-entry-decision-record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1D Slice 2 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1D Slice 2 founder execution signal
- Impact assessment
- Adds the canonical Phase 1D Slice 2 evidence artefact at the static content layer, one fictional clearly-labelled starter template under samples/templates/, one deterministic AgentInputs validator at lib/scoring/validate_agent_inputs.ts, an updated cli/score.ts (validator + output SHA-256 in success message), one test-side support module, nine new unit tests, and three founder-facing documentation addenda. No product features beyond the recorded slice 2 self-serve workflow. No engine behaviour change. No runtime contract change. The slice does not authorise any later slice, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage change, does not authorise NullProvider default change, does not authorise marketing/pricing/public-sales-page work, does not authorise test weakening, does not park or freeze the project, does not claim legal/compliance/regulatory/investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. The recommendation for the next slice is the smallest product-building Phase 1D slice that moves AgentProof from a local self-serve CLI toward a minimal self-serve web workflow, but the next slice will not run without a separate explicit founder execution instruction.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1D-S3
Change date: 2026-05-03
Product version: 0.92.0
Methodology engine version: 0.9.1
Phase 1D Slice 3 — first real scoring run minimal self-serve web
Reason: The founder explicitly authorised Phase 1D Slice 3 (phase 1d first real scoring run minimal self serve web). Slice 3 ships the smallest local web self-serve scoring workflow after Slice 2's CLI. A founder pastes agent inputs JSON into a textarea on /score/paste, presses Score this JSON, sees friendly per-field validation errors all at once via the same Slice 2 validator, and on success reads the same deterministic Markdown scorecard the CLI produces. No new API route, no Supabase write, no local storage write, no live LLM, no real customer data, no payment integration, no login required.
What changed: A new server-rendered page at an internal source file mounts a new client component at an internal source file that reuses an internal source file:validate agent inputs json string for friendly inline validation errors and POSTs validated JSON to the existing /api/score endpoint. A new canonical evidence artefact lives at content/phase 1d first real scoring run minimal self serve web.v1.json and pins the approved Phase 1D Slice 2 baseline (agentproof-phase-1d-slice-2.tar.gz, SHA-256 77b43795e15a7e17e49ee96054a02bae540d1cd087daeb39d70b40a72f22d1a4, 2,013,173 bytes, 987 entries, 20,771 / 20,771 / 0 tests, archive-evidence product version 0.91.0), the web surface and route paths, the validator reuse details, the scoring surface, the Markdown renderer, the starter template SHA-256 (702e6f47097d8ef7eea4502395cc159af67ad5628083fa794c32d6502cbeff26, 2,817 bytes), the demo fixture SHA-256, the deterministic Markdown SHA-256 (c833f290e0fc5b116569ecee71d836b5e5d4057adf4e697ecde9010fe0ccee61, 8,070 bytes — same as Slice 1/Slice 2 engine output), eight validation error examples, six workflow steps, the API route count (8 before and 8 after — unchanged), 28 prohibited-action confirmations all canonical-state true, the canonical slice decision (slice 3 completed for slice 3 only with implementation scope slice 3 only), and a full safety-marker table. One new test-side support module ships at an internal source file. Eight new unit tests pin the slice end to end. New docs/phase 1 d.md Slice 3 section (with a step-by-step founder local web workflow), docs/phase plan.md addendum, docs/phase 1 d entry decision record.md addendum, and README.md row updates carry the founder-facing prose. Every tracked content file top-level product version is bumped to 0.92.0; every latest approved archive block is refreshed to the approved Phase 1D Slice 2 archive. NullProvider remains default.
User impact: None at runtime for users not opening the local /score/paste page. Founders who run pnpm dev locally and open http://localhost:3000/score/paste can now paste an agent inputs JSON, see all per-field validation errors at once if the JSON is wrong, and read the same deterministic Markdown scorecard the CLI produces. Public launch, paid use, live use, live customer data, live LLM, payments, and runtime boundary changes all remain unauthorised. The project is not parked or frozen.
When to re-score: Phase 1D Slice 3 adds no scoring change, no engine change, no rendered-output change beyond the same deterministic Markdown the engine already produced. Existing scorecards remain valid.
Docs, Tests, Content, Methodology, App, Components
Evidence trace: an internal source file, an internal source file, content/phase 1d first real scoring run minimal self serve web.v1.json, an internal source file, tests/unit/phase_1d_minimal_self_serve_web_*.test.ts (eight), docs/phase 1 d.md (Slice 3 addendum), docs/phase plan.md (Slice 3 addendum), docs/phase 1 d entry decision record.md (Slice 3 addendum), README.md (Slice 3 shipped row), package.json (version bump 0.91.0 to 0.92.0), content/methodology changelog.v1.json (this 82nd entry), git branch phase-1d-controlled-start-entry-decision-record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1D Slice 3 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1D Slice 3 founder execution signal
- Impact assessment
- Adds the canonical Phase 1D Slice 3 evidence artefact at the static content layer, one new server page at app/score/paste/page.tsx, one new client component at components/form/JsonPasteScoreCard.tsx, one test-side support module, eight new unit tests pinning shape, surface, validator reuse, scoring output, runtime/public/paid/legal boundaries (including a live walk over the API route directory to confirm the route count is unchanged at 8), decision, safety, cleanliness, and determinism, and three founder-facing documentation addenda. No engine behaviour change. No new API route. No Supabase write. No localStorage write. No live LLM. No real customer data. No payment integration. No login required. The slice does not authorise any later slice, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage change, does not authorise NullProvider default change, does not authorise marketing/pricing/public-sales-page work, does not authorise test weakening, does not park or freeze the project, does not claim legal/compliance/regulatory/investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. The recommendation for the next slice is the smallest product-building Phase 1D slice that moves AgentProof from a local web scoring workflow toward a first usable base-product experience, but the next slice will not run without a separate explicit founder execution instruction.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1D-S4
Change date: 2026-05-03
Product version: 0.93.0
Methodology engine version: 0.9.1
Phase 1D Slice 4 — first real scoring run save locally
Reason: The founder explicitly authorised Phase 1D Slice 4 (phase 1d first real scoring run save locally). Slice 4 extends the Slice 3 self-serve web scoring page with one-click local Save (Download Markdown) and Copy Markdown actions that turn the displayed deterministic Markdown into either a local .md file (deterministic filename) or a clipboard copy. No new API route, no Supabase write, no local storage write, no session storage write, no live LLM, no real customer data, no payment, no login required.
What changed: A new pure deterministic filename helper lives at the new web library file lib/web/scorecard_filename and exports scorecard markdown filename plus slugify agent name and three constants for the prefix, fallback slug, and extension. The Slice 3 web client component at the components/form/json paste score card path is extended with two new buttons that render only inside the success branch: a Download Markdown button that produces a local .md file via Blob, URL.createObjectURL, and an anchor click using the deterministic filename helper, and a Copy Markdown button that calls navigator.clipboard.writeText from a click handler, with friendly fallback messages for unsupported browsers and thrown clipboard errors. A new canonical evidence artefact lives at content/phase 1d first real scoring run save locally and pins the approved Phase 1D Slice 3 baseline, the web surface and component paths, the filename helper path and export name, the Download and Copy button test ids, the deterministic filename for the demo agent, five filename examples, the deterministic Markdown SHA-256 (unchanged from Slice 1 through Slice 3), the API route count (8 both before and after the slice), thirty prohibited-action confirmations all canonical-state true, nine deterministic validation rules, and the canonical slice decision (slice 4 completed for slice 4 only). One new test-side support module ships at tests/support/phase 1d first real scoring run save locally, and nine new unit tests pin shape, web surface and buttons, deterministic filename, download behaviour, copy behaviour, runtime boundaries, decision and safety, cleanliness, and determinism. Every tracked content file top-level product version is bumped to the new product version and every latest approved archive block is refreshed to the approved Phase 1D Slice 3 archive. NullProvider remains default.
User impact: None at runtime for users not opening the local /score/paste page. Founders who score agent inputs JSON locally on /score/paste can now save the Markdown scorecard as a file with one click using a deterministic filename, or copy it to the clipboard, without leaving the browser and without any backend write. Public launch, paid use, live use, live customer data, live LLM, payments, and runtime boundary changes all remain unauthorised. The project is not parked or frozen.
When to re-score: Phase 1D Slice 4 adds no scoring change, no engine change, no rendered-output change beyond the same deterministic Markdown the engine already produced. Existing scorecards remain valid.
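A deterministic filename helper of the kind described above might look like the following sketch. The constant values, the slug rules, and the camel-case names `slugifyAgentName` and `scorecardMarkdownFilename` are assumptions made for illustration; the shipped prefix, fallback slug, and slugging rules are not stated in this entry.

```typescript
// Assumed values: the real prefix and fallback slug are not given in this log.
const FILENAME_PREFIX = "agentproof-scorecard-";
const FALLBACK_SLUG = "agent";
const FILENAME_EXTENSION = ".md";

// Lower-case the name, collapse every run of non-alphanumerics into one hyphen,
// and trim hyphens from both ends. Falls back when nothing usable remains.
function slugifyAgentName(name: string): string {
  const slug = name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return slug === "" ? FALLBACK_SLUG : slug;
}

// Pure function: the same agent name always yields the same filename.
function scorecardMarkdownFilename(agentName: string): string {
  return `${FILENAME_PREFIX}${slugifyAgentName(agentName)}${FILENAME_EXTENSION}`;
}
```

Keeping the helper free of clocks, random values, and browser APIs is what makes the filename reproducible and easy to pin in unit tests; the download itself (Blob, `URL.createObjectURL`, anchor click) would stay in the component.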
Docs, Tests, Content, Methodology, Components, Lib
Evidence trace: components/form/json paste score card, lib/web/scorecard_filename, content/phase 1d first real scoring run save locally, tests/support/phase 1d first real scoring run save locally, tests/unit/phase 1d save locally tests (nine), docs/PHASE 1D (Slice 4 addendum), docs/PHASE PLAN (Slice 4 addendum), docs/PHASE 1D ENTRY DECISION RECORD (Slice 4 addendum), README (Slice 4 shipped row), package version bump 0.92.0 to 0.93.0, content/methodology changelog (this 83rd entry), git branch phase-1d-controlled-start-entry-decision-record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1D Slice 4 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1D Slice 4 founder execution signal
- Impact assessment
- Adds one new pure deterministic filename helper file at the web library layer, extends the existing self-serve web client component with two new buttons (Download Markdown and Copy Markdown) that render only after a successful score, ships one canonical Phase 1D Slice 4 evidence artefact at the static content layer, one test-side support module, nine new unit tests pinning shape and surface and filename and download behaviour and copy behaviour and runtime boundaries and decision and safety and cleanliness and determinism, and three founder-facing documentation addenda. No engine behaviour change. No new API route. No Supabase write. No localStorage write. No sessionStorage write. No live LLM. No real customer data. No payment integration. No login required. The slice does not authorise any later slice, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage change, does not authorise sessionStorage change, does not authorise NullProvider default change, does not authorise marketing or pricing or public sales page work, does not authorise test weakening, does not park or freeze the project, does not claim legal or compliance or regulatory or investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. The recommendation for the next slice is the smallest product-building Phase 1D slice that moves AgentProof from a local saveable scorecard toward a first usable self-serve review or history experience, but the next slice will not run without a separate explicit founder execution instruction.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1D-S5
Change date: 2026-05-03
Product version: 0.94.0
Methodology engine version: 0.9.1
Phase 1D Slice 5 — first real scoring run local session history
Reason: The founder explicitly authorised Phase 1D Slice 5 (phase 1d first real scoring run local session history). Slice 5 extends the Slice 4 self-serve web page with an ephemeral in-memory Session scorecards list. After every successful score the page appends or replaces a history item keyed by the SHA-256 of the rendered Markdown. A View button restores any earlier scorecard so Download Markdown and Copy Markdown operate on the restored item. A Clear button empties the list. The list lives entirely in React component state: no localStorage, no sessionStorage, no IndexedDB, no cookies, no Supabase write, no new API route, no payment integration, no login required.
What changed: A new pure session-history helper file lives at the new web library path lib/web/scorecard session history and exports the scorecard history item type plus the pure functions add or replace history item, has history item, clear history, find history item, and array buffer to hex. The Slice 4 client component at the components/form/json paste score card path is extended with a useState hook that holds an array of scorecard history items, a view history item function that restores the visible result state from the stored Markdown bytes, a clear session history function that empties the array, a Session scorecards section that renders only when at least one history item exists, a list with one item per distinct scorecard (agent name, overall score, readiness rating, deterministic filename, View button), a Clear session history button, and an explicit ephemeral note explaining that the list lives in browser memory only and clears on tab close or page refresh. After each successful score the component computes the SHA-256 of the rendered Markdown via the browser Web Crypto subtle.digest API, builds a scorecard history item, and merges it into the list via add or replace history item. A new canonical evidence artefact lives at content/phase 1d first real scoring run local session history. One new test-side support module ships at tests/support/phase 1d first real scoring run local session history. Eight new unit tests pin shape, surface, state model, View/Restore behaviour, runtime/public/paid/legal boundaries, decision and safety, cleanliness, and determinism. Every tracked content file top-level product version is bumped to the new product version and every latest approved archive block is refreshed to the approved Phase 1D Slice 4 archive. NullProvider remains default.
User impact: None at runtime for users not opening the local /score/paste page. Founders who score agent inputs JSON locally can now score two or three agents back-to-back in one browser session, see them listed in a small Session scorecards list, click View to switch back to any prior scorecard, and download or copy that scorecard with the same deterministic filename and bytes — without re-pasting JSON, without rescoring, and without any persistent storage. Closing or refreshing the tab clears the list. Public launch, paid use, live use, live customer data, live LLM, payments, and runtime boundary changes all remain unauthorised. The project is not parked or frozen.
When to re-score: Phase 1D Slice 5 adds no scoring change, no engine change, no rendered-output change beyond the same deterministic Markdown the engine already produced. Existing scorecards remain valid.
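The append-or-replace behaviour keyed by SHA-256 can be sketched with a few pure functions. The camel-case names and the fields of `ScorecardHistoryItem` below are assumptions for illustration and are not confirmed by this entry; the sketch only shows how a list can stay free of duplicates when the same Markdown is scored twice.

```typescript
// Hypothetical item shape; the shipped type may carry different fields.
interface ScorecardHistoryItem {
  sha256: string; // hex digest of the rendered Markdown, used as the key
  agentName: string;
  overallScore: number;
  markdown: string;
}

// Convert a digest buffer (for example from subtle.digest) to lower-case hex.
function arrayBufferToHex(buffer: ArrayBufferLike): string {
  return Array.from(new Uint8Array(buffer))
    .map((byte) => byte.toString(16).padStart(2, "0"))
    .join("");
}

// Return a new array: replace the item with the same digest, else append.
function addOrReplaceHistoryItem(
  history: readonly ScorecardHistoryItem[],
  item: ScorecardHistoryItem,
): ScorecardHistoryItem[] {
  const index = history.findIndex((existing) => existing.sha256 === item.sha256);
  if (index === -1) return [...history, item];
  return history.map((existing, i) => (i === index ? item : existing));
}

function findHistoryItem(
  history: readonly ScorecardHistoryItem[],
  sha256: string,
): ScorecardHistoryItem | undefined {
  return history.find((existing) => existing.sha256 === sha256);
}

// Clearing is just a fresh empty array; nothing is persisted anywhere.
function clearHistory(): ScorecardHistoryItem[] {
  return [];
}
```

Because every function returns a new array instead of mutating its input, the results can be passed directly to a React state setter, and the list disappears with the component state on refresh.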
Docs, Tests, Content, Methodology, Components, Lib
Evidence trace: components/form/json paste score card, lib/web/scorecard session history, content/phase 1d first real scoring run local session history, tests/support/phase 1d first real scoring run local session history, tests/unit/phase 1d local session history tests (eight), docs/PHASE 1D (Slice 5 addendum), docs/PHASE PLAN (Slice 5 addendum), docs/PHASE 1D ENTRY DECISION RECORD (Slice 5 addendum), README (Slice 5 shipped row), package version bump 0.93.0 to 0.94.0, content/methodology changelog (this 84th entry), git branch phase-1d-controlled-start-entry-decision-record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1D Slice 5 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1D Slice 5 founder execution signal
- Impact assessment
- Adds one new pure session-history helper file at the web library layer, extends the existing self-serve web client component with an ephemeral useState-backed Session scorecards list (View, Clear, restore-on-click), ships one canonical Phase 1D Slice 5 evidence artefact at the static content layer, one test-side support module, eight new unit tests pinning shape and surface and state model and View/Restore behaviour and runtime boundaries and decision and safety and cleanliness and determinism, and three founder-facing documentation addenda. No engine behaviour change. No new API route. No Supabase write. No localStorage write. No sessionStorage write. No IndexedDB write. No cookie set. No live LLM. No real customer data. No payment integration. No login required. The slice does not authorise any later slice, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage change, does not authorise sessionStorage change, does not authorise IndexedDB change, does not authorise cookie change, does not authorise NullProvider default change, does not authorise marketing or pricing or public sales page work, does not authorise test weakening, does not park or freeze the project, does not claim legal or compliance or regulatory or investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. The recommendation for the next slice is the smallest product-building Phase 1D slice that moves AgentProof from an ephemeral local review or history workflow toward a packaged founder demo flow or onboarding checklist, but the next slice will not run without a separate explicit founder execution instruction.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1D-S6
Change date: 2026-05-03
Product version: 0.95.0
Methodology engine version: 0.9.1
Phase 1D Slice 6 - first real scoring run local demo walkthrough
Reason: The founder explicitly authorised Phase 1D Slice 6 (phase 1d first real scoring run local demo walkthrough). Slice 6 extends the Slice 5 self-serve web page with an inline four-step Local demo walkthrough panel above the JSON textarea, plus a single Why this is safe to run locally line. The four steps light up as the founder progresses through them, driven by existing component state. No new useState hook, no localStorage, no sessionStorage, no IndexedDB, no cookies, no Supabase write, no new API route, no payment integration, no login required.
What changed: The Slice 5 client component at the components/form/json paste score card path is extended with a clearly-labelled Local demo walkthrough panel that renders above the JSON textarea before any score has been produced. The panel renders a heading, a one-line explanation, an ordered list of exactly four steps (load, score, read, download), and a single Why this is safe to run locally line covering NullProvider default, no live LLM, no real customer data, no backend write, no login, deterministic SHA-256. Each step has a stable data-test-id and a data-step-complete attribute computed from the existing component state (text length for the load step, the scoring lifecycle status for the score step, the result/Markdown presence flag for the read and download steps). A new canonical evidence artefact lives at content/phase 1d first real scoring run local demo walkthrough. One new test-side support module ships at tests/support/phase 1d first real scoring run local demo walkthrough. Eight new unit tests pin shape, surface, four-step labels, state-derived completion model, runtime boundaries, decision and safety, cleanliness, and determinism. Every tracked content file top-level product version is bumped to the new product version and every latest approved archive block is refreshed to the approved Phase 1D Slice 5 archive. NullProvider remains default.
User impact: None at runtime for users not opening the local /score/paste page. Founders showing AgentProof to a first-look prospect now have a single, predictable narrative path on the local web page: load the fictional demo input, score it, read the Markdown scorecard below, click Download Markdown to save the file. The walkthrough panel lights each step up as the founder completes it. Public launch, paid use, live use, live customer data, live LLM, payments, and runtime boundary changes all remain unauthorised. The project is not parked or frozen.
When to re-score: Phase 1D Slice 6 adds no scoring change, no engine change, no rendered-output change beyond the same deterministic Markdown the engine already produced. Existing scorecards remain valid.
Docs, Tests, Content, Methodology, Components
Evidence trace: components slash form slash json paste score card, content slash phase 1d first real scoring run local demo walkthrough, tests slash support slash phase 1d first real scoring run local demo walkthrough, tests slash unit slash phase 1d local demo walkthrough tests (eight), docs slash PHASE 1D Slice 6 addendum, docs slash PHASE PLAN Slice 6 addendum, docs slash PHASE 1D ENTRY DECISION RECORD Slice 6 addendum, README Slice 6 shipped row, package version bump 0.94.0 to 0.95.0, content slash methodology changelog this 85th entry, git branch phase 1d controlled start entry decision record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1D Slice 6 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1D Slice 6 founder execution signal
- Impact assessment
- Extends the existing self-serve web client component with an inline four-step Local demo walkthrough panel and a single Why this is safe to run locally line. The completion model is derived entirely from existing component state - no new useState hook, no localStorage, no sessionStorage, no IndexedDB, no cookies. Ships one canonical Phase 1D Slice 6 evidence artefact at the static content layer, one test-side support module, eight new unit tests pinning shape and surface and four-step labels and state-derived completion and runtime boundaries and decision and safety and cleanliness and determinism, and three founder-facing documentation addenda. No engine behaviour change. No new API route. No Supabase write. No persistent storage write. No live LLM. No real customer data. No payment integration. No login required. The slice does not authorise any later slice, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage / sessionStorage / IndexedDB / cookie change, does not authorise NullProvider default change, does not authorise marketing or pricing or public sales page work, does not authorise test weakening, does not park or freeze the project, does not claim legal or compliance or regulatory or investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. The recommendation for the next slice is the smallest product-building Phase 1D slice that moves AgentProof from a guided local demo toward a packaged founder handoff or demo kit, but the next slice will not run without a separate explicit founder execution instruction.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1D-S7
Change date: 2026-05-03
Product version: 0.96.0
Methodology engine version: 0.9.1
Phase 1D Slice 7 - first real scoring run founder demo handoff kit
Reason: The founder explicitly authorised Phase 1D Slice 7 (phase 1d first real scoring run founder demo handoff kit). Slice 7 extends the Slice 6 self-serve web page with a small Founder demo handoff section that exposes a deterministic Markdown handoff card a founder can copy or download from the local page after running the guided local demo. No localStorage, no sessionStorage, no IndexedDB, no cookies, no Supabase write, no new API route, no payment integration, no login required.
What changed: A new pure helper at lib slash web slash founder demo handoff exposes the deterministic handoff filename (agentproof-local-demo-handoff.md), the static deterministic handoff Markdown body (1,186 bytes, SHA-256 7fcdc3c0...), and the required-substring list. The Slice 6 client component at the components slash form slash json paste score card path is extended with a clearly-labelled Founder demo handoff section that renders a heading, a one-line explanation, a Copy handoff text button, a Download handoff Markdown button, an idle / copied / unsupported / error status feedback element, the deterministic handoff Markdown body inside a pre block, and a single Boundary line stating not a public launch, not paid use, and not legal/compliance/regulatory approval. The Copy button calls navigator.clipboard.writeText in response to a click, with graceful unsupported and error fallbacks. The Download button uses a Blob, URL.createObjectURL, and an anchor click. A new canonical evidence artefact lives at content slash phase 1d first real scoring run founder demo handoff kit. One new test-side support module ships at tests slash support slash phase 1d first real scoring run founder demo handoff kit. Nine new unit tests pin shape, surface, handoff text required substrings, download behaviour, copy behaviour, runtime boundaries, decision and safety, cleanliness, and determinism. Every tracked content file top-level product version is bumped to the new product version and every latest approved archive block is refreshed to the approved Phase 1D Slice 6 archive. NullProvider remains default.
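The copy behaviour and its idle / copied / unsupported / error feedback can be sketched as follows. This is an illustration only: `CopyStatus`, `ClipboardLike`, and `copyText` are hypothetical names, and passing the clipboard in as a parameter is an assumption made here so the fallbacks can be exercised outside a browser.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not the component's actual code.
type CopyStatus = "idle" | "copied" | "unsupported" | "error";

// The minimal slice of the browser Clipboard API that the sketch relies on.
interface ClipboardLike {
  writeText(text: string): Promise<void>;
}

// Resolves to the status the feedback element would show after a click.
async function copyText(text: string, clipboard?: ClipboardLike): Promise<CopyStatus> {
  // Unsupported fallback: no clipboard object, or no writeText method on it.
  if (!clipboard || typeof clipboard.writeText !== "function") {
    return "unsupported";
  }
  try {
    await clipboard.writeText(text);
    return "copied";
  } catch {
    // Error fallback: for example the browser denied clipboard permission.
    return "error";
  }
}
```

In a browser click handler the second argument would be `navigator.clipboard`, which is undefined in insecure contexts and in older browsers; that case lands in the unsupported branch.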
User impact: None at runtime for users not opening the local /score/paste page. Founders showing AgentProof to a first-look prospect now have a small deterministic handoff card available in one click on the local web page: copy or download a single Markdown file that lists what AgentProof does in one sentence, the four local demo steps, the local safety boundaries, the deterministic scorecard filename example, and an explicit boundary line stating this is not a public launch, not paid use, and not legal/compliance/regulatory approval. Public launch, paid use, live use, live customer data, live LLM, payments, and runtime boundary changes all remain unauthorised. The project is not parked or frozen.
When to re-score: Phase 1D Slice 7 adds no scoring change, no engine change, no rendered-output change beyond the same deterministic Markdown the engine already produced. Existing scorecards remain valid.
Docs, Tests, Content, Methodology, Components, Lib
Evidence trace: lib slash web slash founder demo handoff, components slash form slash json paste score card, content slash phase 1d first real scoring run founder demo handoff kit, tests slash support slash phase 1d first real scoring run founder demo handoff kit, tests slash unit slash phase 1d founder demo handoff tests (nine), docs slash PHASE 1D Slice 7 addendum, docs slash PHASE PLAN Slice 7 addendum, docs slash PHASE 1D ENTRY DECISION RECORD Slice 7 addendum, README Slice 7 shipped row, package version bump 0.95.0 to 0.96.0, content slash methodology changelog this 86th entry, git branch phase 1d controlled start entry decision record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1D Slice 7 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1D Slice 7 founder execution signal
- Impact assessment
- Extends the existing self-serve web client component with a small Founder demo handoff section and a new pure deterministic handoff helper module. The handoff text is a single static template constant (no clock, no random, no env, no network) covering what AgentProof does in one sentence, the four local demo steps, the local safety boundaries, the deterministic scorecard filename example, and an explicit boundary line stating this is not a public launch, not paid use, and not legal/compliance/regulatory approval. Ships one canonical Phase 1D Slice 7 evidence artefact at the static content layer, one test-side support module, nine new unit tests pinning shape and surface and handoff text required substrings and download behaviour and copy behaviour and runtime boundaries and decision and safety and cleanliness and determinism, and three founder-facing documentation addenda. No engine behaviour change. No new API route. No Supabase write. No persistent storage write. No live LLM. No real customer data. No payment integration. No login required. The slice does not authorise any later slice, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage / sessionStorage / IndexedDB / cookie change, does not authorise NullProvider default change, does not authorise marketing or pricing or public sales page work, does not authorise test weakening, does not park or freeze the project, does not claim legal or compliance or regulatory or investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. 
The recommendation for the next slice is the smallest product-building Phase 1D slice that moves AgentProof from a local founder handoff kit toward a minimum first-prospect trial package, but the next slice will not run without a separate explicit founder execution instruction.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1D-S8
Change date: 2026-05-03
Product version: 0.97.0
Methodology engine version: 0.9.1
Phase 1D Slice 8 - first prospect trial preparation pack
Reason: The founder explicitly authorised Phase 1D Slice 8 (phase 1d first prospect trial preparation pack). Slice 8 extends the Slice 7 self-serve web page with a small First-prospect trial prep section that exposes a deterministic Markdown trial-preparation card a founder can copy or download from the local page. No localStorage, no sessionStorage, no IndexedDB, no cookies, no Supabase write, no new API route, no payment integration, no login required, no data-collection form.
What changed: A new pure helper at lib slash web slash first prospect trial prep exposes the deterministic trial-prep filename (agentproof-local-trial-prep.md), the static deterministic trial-prep Markdown body (3,326 bytes, SHA-256 03f3ad46...), the six required fields (agent name, agent purpose, tools or actions, autonomy level, deployment context, reads sensitive data), the eight prohibited-input categories (secrets, credentials, API keys, PII, customer records, confidential contracts, live customer data, production system details), and the required-substring list. The Slice 7 client component at the components slash form slash json paste score card path is extended with a clearly-labelled First-prospect trial prep section that renders a heading, a one-line explanation, a Copy trial-prep text button, a Download trial-prep Markdown button, an idle / copied / unsupported / error status feedback element, the deterministic trial-prep Markdown body inside a pre block, a single Why this is safe to run locally line, and a single Boundary line. The Copy button calls navigator.clipboard.writeText in response to a click, with graceful unsupported and error fallbacks. The Download button uses a Blob, URL.createObjectURL, and an anchor click. A new canonical evidence artefact lives at content slash phase 1d first prospect trial preparation pack. One new test-side support module ships at tests slash support slash phase 1d first prospect trial preparation pack. Nine new unit tests pin shape, surface, trial-prep text required substrings, download behaviour, copy behaviour, runtime boundaries, decision and safety, cleanliness, and determinism. Every tracked content file top-level product version is bumped to the new product version and every latest approved archive block is refreshed to the approved Phase 1D Slice 7 archive. NullProvider remains default.
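A required-substring pin of the kind the unit tests describe might look like the sketch below. The function name `findMissingSubstrings` and the short `SAMPLE_BODY` are hypothetical stand-ins; the real 3,326-byte trial-prep body and the real required-substring list are not reproduced in this entry. Only the six field labels are taken from the text above.

```typescript
// Returns the required substrings that do not appear in the body (empty array means all present).
function findMissingSubstrings(body: string, required: readonly string[]): string[] {
  return required.filter((needle) => !body.includes(needle));
}

// Hypothetical stand-in for the static trial-prep Markdown body.
const SAMPLE_BODY = [
  "# AgentProof local trial prep",
  "Required fields: agent name, agent purpose, tools or actions,",
  "autonomy level, deployment context, reads sensitive data.",
  "Do not paste: secrets, credentials, API keys, PII.",
].join("\n");

// The six required field labels named in this entry.
const REQUIRED_FIELDS = [
  "agent name",
  "agent purpose",
  "tools or actions",
  "autonomy level",
  "deployment context",
  "reads sensitive data",
] as const;

// A unit test would assert that nothing is missing from the static body.
const missingFields = findMissingSubstrings(SAMPLE_BODY, REQUIRED_FIELDS);
```

Because the body is a static constant, a check like this fails as soon as an edit drops a required phrase, which is one way to pin the template without comparing the full text.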
User impact: None at runtime for users not opening the local /score/paste page. Founders running a first-prospect local trial now have a small deterministic trial-preparation card available in one click on the local web page: copy or download a single Markdown file that lists what AgentProof does in one sentence, the six minimum AgentInputs JSON fields, an explicit prohibited-input list, instructions to use the existing starter template, instructions to paste / score / download / copy / share the resulting Markdown scorecard locally, the local safety boundary line, and an explicit boundary line. Public launch, paid use, live use, live customer data, live LLM, payments, and runtime boundary changes all remain unauthorised. The project is not parked or frozen.
When to re-score: Phase 1D Slice 8 adds no scoring change, no engine change, no rendered-output change beyond the same deterministic Markdown the engine already produced. Existing scorecards remain valid.
Docs, Tests, Content, Methodology, Components, Lib
Evidence trace: lib slash web slash first prospect trial prep, components slash form slash json paste score card, content slash phase 1d first prospect trial preparation pack, tests slash support slash phase 1d first prospect trial preparation pack, tests slash unit slash phase 1d first prospect trial prep tests (nine), docs slash PHASE 1D Slice 8 addendum, docs slash PHASE PLAN Slice 8 addendum, docs slash PHASE 1D ENTRY DECISION RECORD Slice 8 addendum, README Slice 8 shipped row, package version bump 0.96.0 to 0.97.0, content slash methodology changelog this 87th entry, git branch phase 1d controlled start entry decision record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1D Slice 8 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1D Slice 8 founder execution signal
- Impact assessment
- Extends the existing self-serve web client component with a small First-prospect trial prep section and a new pure deterministic trial-prep helper module. The trial-prep text is a single static template constant (no clock, no random, no env, no network) covering a one-sentence framing, six required AgentInputs JSON fields, an explicit prohibited-input list of eight categories, a pointer to the existing starter template, paste / score / download / copy / share guidance, the local safety boundary line, and an explicit boundary line. Ships one canonical Phase 1D Slice 8 evidence artefact at the static content layer, one test-side support module, nine new unit tests pinning shape and surface and trial-prep text required substrings and download behaviour and copy behaviour and runtime boundaries and decision and safety and cleanliness and determinism, and three founder-facing documentation addenda. No engine behaviour change. No new API route. No Supabase write. No persistent storage write. No live LLM. No real customer data. No payment integration. No login required. No data-collection form. The slice does not authorise any later slice, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage / sessionStorage / IndexedDB / cookie change, does not authorise NullProvider default change, does not authorise marketing or pricing or public sales page work, does not authorise test weakening, does not park or freeze the project, does not claim legal or compliance or regulatory or investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. 
The recommendation for the next slice is the smallest product-building Phase 1D slice that moves AgentProof from a first-prospect trial-prep pack toward a minimal local trial closeout/checklist or founder feedback capture workflow, but the next slice will not run without a separate explicit founder execution instruction.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1D-S9
Change date: 2026-05-03
Product version: 0.98.0
Methodology engine version: 0.9.1
Phase 1D Slice 9 - first prospect trial closeout pack
Reason: The founder explicitly authorised Phase 1D Slice 9 (phase 1d first prospect trial closeout pack). Slice 9 extends the Slice 8 self-serve web page with a small Trial closeout section that exposes a deterministic Markdown trial-closeout note template a founder can copy or download from the local page after a first-look prospect runs an AgentProof local trial. No localStorage, no sessionStorage, no IndexedDB, no cookies, no Supabase write, no new API route, no payment integration, no login required, no data-collection form, no submit button, no email send, no CRM, no analytics, no tracking. Founder direction emphasises that internal milestones are not success - success means the product sells.
What changed: A new pure helper at lib slash web slash first prospect trial closeout exposes the deterministic closeout filename (agentproof-local-trial-closeout.md), the static deterministic closeout Markdown body (3,152 bytes, SHA-256 3d1570df...), six founder-note prompts, four prospect-feedback prompts, the three attachment artefact descriptors, the eight prohibited-input categories, and the required-substring list. The Slice 8 client component at the components slash form slash json paste score card path is extended with a clearly-labelled Trial closeout section that renders a heading, a one-line explanation, a Copy closeout text button, a Download closeout Markdown button, an idle / copied / unsupported / error status feedback element, the deterministic closeout Markdown body inside a pre block, a single Why this is safe to run locally line, and a single Boundary line that includes the founder commercial-reality reminder. The Copy button calls navigator.clipboard.writeText in response to a click, with graceful unsupported and error fallbacks. The Download button uses a Blob, URL.createObjectURL, and an anchor click. A new canonical evidence artefact lives at content slash phase 1d first prospect trial closeout pack. One new test-side support module ships at tests slash support slash phase 1d first prospect trial closeout pack. Nine new unit tests pin shape, surface, closeout text required substrings, download behaviour, copy behaviour, runtime boundaries, decision and safety, cleanliness, and determinism. Every tracked content file top-level product version is bumped to the new product version and every latest approved archive block is refreshed to the approved Phase 1D Slice 8 archive. NullProvider remains default.
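The download sequence (Blob, object URL, anchor click) can be sketched with the browser calls behind a small seam so the order of operations is testable without a DOM. `DownloadEnv`, `AnchorLike`, and `downloadMarkdown` are hypothetical names, and revoking the object URL afterwards is a common cleanup step assumed here; this entry does not say whether the component does so.

```typescript
// Hypothetical seam: in a browser each member maps to the call named in its comment.
interface AnchorLike {
  href: string;
  download: string;
  click(): void;
}

interface DownloadEnv {
  // Browser: URL.createObjectURL(new Blob([body], { type: mimeType }))
  createObjectUrl(body: string, mimeType: string): string;
  // Browser: URL.revokeObjectURL(url)
  revokeObjectUrl(url: string): void;
  // Browser: document.createElement("a")
  createAnchor(): AnchorLike;
}

// Triggers a client-side download of a static Markdown body; nothing is sent to a server.
function downloadMarkdown(filename: string, body: string, env: DownloadEnv): void {
  const url = env.createObjectUrl(body, "text/markdown;charset=utf-8");
  try {
    const anchor = env.createAnchor();
    anchor.href = url;
    anchor.download = filename; // suggested filename, e.g. agentproof-local-trial-closeout.md
    anchor.click();
  } finally {
    // Assumed cleanup: release the object URL even if the click throws.
    env.revokeObjectUrl(url);
  }
}
```

Keeping the whole flow in the page is consistent with the boundaries listed in this entry: no API route and no backend write are involved in producing the file.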
User impact: None at runtime for users not opening the local /score/paste page. Founders ending a first-look prospect's local AgentProof trial now have a small deterministic trial-closeout note template available in one click on the local web page: copy or download a single Markdown file that structures the founder's session-end record (six founder-note prompts plus four prospect-feedback prompts), names the three deterministic Markdown artefacts to attach or share after the session, explicitly forbids writing secrets / credentials / API keys / PII / customer records / confidential contracts / live customer data / production system details into the note, and offers four conservative next-step options. Public launch, paid use, live use, live customer data, live LLM, payments, and runtime boundary changes all remain unauthorised. The project is not parked or frozen. Internal milestones are not success - success means the product sells.
When to re-score: Phase 1D Slice 9 adds no scoring change, no engine change, no rendered-output change beyond the same deterministic Markdown the engine already produced. Existing scorecards remain valid.
Docs, Tests, Content, Methodology, Components, Lib
Evidence trace: lib slash web slash first prospect trial closeout, components slash form slash json paste score card, content slash phase 1d first prospect trial closeout pack, tests slash support slash phase 1d first prospect trial closeout pack, tests slash unit slash phase 1d first prospect trial closeout tests (nine), docs slash PHASE 1D Slice 9 addendum, docs slash PHASE PLAN Slice 9 addendum, docs slash PHASE 1D ENTRY DECISION RECORD Slice 9 addendum, README Slice 9 shipped row, package version bump 0.97.0 to 0.98.0, content slash methodology changelog this 88th entry, git branch phase 1d controlled start entry decision record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1D Slice 9 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1D Slice 9 founder execution signal
- Impact assessment
- Extends the existing self-serve web client component with a small Trial closeout section and a new pure deterministic closeout helper module. The closeout text is a single static template constant (no clock, no random, no env, no network, no runtime data, no user-entered content) covering a session-end framing, a local trial recap, six founder-note prompts, four prospect-feedback prompts, three attachment artefacts, eight prohibited-input categories, four conservative next-step options, the local safety boundary line, and an explicit boundary line plus the founder commercial-reality reminder. Ships one canonical Phase 1D Slice 9 evidence artefact at the static content layer, one test-side support module, nine new unit tests pinning shape and surface and closeout text required substrings and download behaviour and copy behaviour and runtime boundaries and decision and safety and cleanliness and determinism, and three founder-facing documentation addenda. No engine behaviour change. No new API route. No Supabase write. No persistent storage write. No live LLM. No real customer data. No payment integration. No login required. No data-collection form, no submit button, no email send, no CRM, no analytics, no tracking. The slice does not authorise any later slice (Slice 9 is the last predeclared Slice 0-9 slot), does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage / sessionStorage / IndexedDB / cookie change, does not authorise NullProvider default change, does not authorise marketing or pricing or public sales page work, does not authorise test weakening, does not park or freeze the project, does not claim legal or compliance or regulatory or investor approval, and does not replace legal review. NullProvider remains the default. 
Goldens unchanged. The recommendation for the next product-building direction is the smallest move from local prospect-trial workflow to true sale-readiness, focused on getting to a real paid sale (not internal ceremony), but the next direction will not run without a separate explicit founder execution instruction. Internal milestones are not success - success means the product sells.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1E-S1
Change date: 2026-05-03
Product version: 0.99.0
Methodology engine version: 0.9.1
Phase 1E Slice 1 - sale-readiness buying signal card
Reason: The founder explicitly authorised the smallest sale-readiness product-building slice on top of the local trial closeout flow (phase 1e sale readiness buying signal card). Founder commercial standard: internal milestones are not success - success means the product sells. The card forces an honest post-trial classification (wants to pay now, wants a paid pilot, exploring only, walking away). No localStorage, no sessionStorage, no IndexedDB, no cookies, no Supabase write, no new API route, no payment integration, no payment request, no pricing page, no marketing/public sales page, no login required, no data-collection form, no submit button, no email send, no CRM, no sales-automation, no analytics, no tracking.
What changed: A new pure helper at lib slash web slash sale readiness buying signal exposes the deterministic buying-signal filename (agentproof-prospect-buying-signal.md), the static deterministic buying-signal Markdown body (2,872 bytes, SHA-256 caa8f394...), four buying-signal categories, five founder prompts, three classification rules, three safe next-step options, and the required-substring list. The Slice 9 client component at the components slash form slash json paste score card path is extended with a clearly-labelled buying-signal section under the heading Is this prospect actually buying that renders a heading, a one-line explanation, a Copy buying-signal text button, a Download buying-signal Markdown button, an idle / copied / unsupported / error status feedback element, the deterministic buying-signal Markdown body inside a pre block, and a single Boundary line that includes the founder commercial-reality reminder. The Copy button calls navigator.clipboard.writeText in response to a click, with graceful unsupported and error fallbacks. The Download button uses a Blob, URL.createObjectURL, and an anchor click. A new canonical evidence artefact lives at content slash phase 1e sale readiness buying signal card. One new test-side support module ships at tests slash support slash phase 1e sale readiness buying signal card. Nine new unit tests pin shape, surface, buying-signal text required substrings, download behaviour, copy behaviour, runtime boundaries, decision and safety, cleanliness, and determinism. Every tracked content file top-level product version is bumped to the new product version and every latest approved archive block is refreshed to the approved Phase 1D Slice 9 archive. NullProvider remains default.
User impact: None at runtime for users not opening the local /score/paste page. Founders running first-prospect trials now have a small deterministic buying-signal reflection card available in one click on the local web page: copy or download a single Markdown file that names the four buying-signal categories, asks five concrete founder prompts about the prospect's actual words and behaviour, lists three conservative classification rules, and offers three safe next-step options. The card forces the founder to confront whether the prospect is actually moving toward buying or just being polite. Public launch, paid use, live use, live customer data, live LLM, payments, payment requests, pricing pages, and runtime boundary changes all remain unauthorised. The project is not parked or frozen. Internal milestones are not success - success means the product sells.
When to re-score: Phase 1E Slice 1 adds no scoring change, no engine change, no rendered-output change beyond the same deterministic Markdown the engine already produced. Existing scorecards remain valid.
Docs, Tests, Content, Methodology, Components, Lib
Evidence trace: lib slash web slash sale readiness buying signal, components slash form slash json paste score card, content slash phase 1e sale readiness buying signal card, tests slash support slash phase 1e sale readiness buying signal card, tests slash unit slash phase 1e sale readiness buying signal tests (nine), docs slash PHASE 1D Slice 9 + Phase 1E section, docs slash PHASE PLAN Phase 1E addendum, docs slash PHASE 1D ENTRY DECISION RECORD Phase 1E addendum, README Phase 1E shipped row, package version bump 0.98.0 to 0.99.0, content slash methodology changelog this 89th entry, git branch phase 1d controlled start entry decision record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1E Slice 1 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1E Slice 1 founder execution signal
- Impact assessment
- Extends the existing self-serve web client component with a small Buying-signal reflection section and a new pure deterministic buying-signal helper module. The buying-signal text is a single static template constant (no clock, no random, no env, no network, no runtime data, no user-entered content) covering an internal-milestones-are-not-success preamble, four buying-signal categories, five founder prompts, three conservative classification rules, three safe next-step options, the local safety boundary line, and an explicit boundary line stating not a payment request, not a CRM, not a sales automation, not a public launch, not paid use, and not legal/compliance/regulatory approval. Ships one canonical Phase 1E Slice 1 evidence artefact at the static content layer, one test-side support module, nine new unit tests pinning shape and surface and buying-signal text required substrings and download behaviour and copy behaviour and runtime boundaries and decision and safety and cleanliness and determinism, and three founder-facing documentation addenda. No engine behaviour change. No new API route. No Supabase write. No persistent storage write. No live LLM. No real customer data. No payment integration. No payment request. No pricing page. No login required. No data-collection form, no submit button, no email send, no CRM, no sales-automation, no analytics, no tracking. 
The slice does not authorise any further work, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage / sessionStorage / IndexedDB / cookie change, does not authorise NullProvider default change, does not authorise marketing or pricing or public sales page work, does not authorise test weakening, does not park or freeze the project, does not claim legal or compliance or regulatory or investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. The recommendation for the next direction is the smallest move from buying-signal capture to an actual controlled first-sale path, focused on getting to a real paid sale (not internal ceremony), but the next direction will not run without a separate explicit founder execution instruction. Internal milestones are not success - success means the product sells.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1E-S2
Change date: 2026-05-03
Product version: 0.100.0
Methodology engine version: 0.9.1
Phase 1E Slice 2 - first-sale terms rehearsal card
Reason: The founder explicitly authorised the smallest founder-side first-sale terms rehearsal card on top of the Phase 1E Slice 1 buying-signal flow (phase 1e first sale terms rehearsal card). Founder commercial standard: internal milestones are not success - success means the product sells. The card forces the founder to write down the minimum acceptable first paid pilot shape (price floor, scope cap, duration cap, delivery definition, off-scope work, completion criteria, founder red lines, next step) BEFORE speaking commercially with a buying-signal prospect. The card is explicitly NOT a quote, NOT a contract, NOT an invoice, NOT legal advice, NOT a payment request, NOT a pricing page, NOT a public launch, NOT paid use. No localStorage, no sessionStorage, no IndexedDB, no cookies, no Supabase write, no new API route, no payment integration, no quote/contract/invoice generation, no pricing page, no marketing/public sales page, no login required, no data-collection form, no submit button, no email send, no CRM, no sales-automation, no analytics, no tracking.
What changed: A new pure helper at lib slash web slash sale readiness first sale terms exposes the deterministic first-sale terms filename (agentproof-first-sale-terms-rehearsal dot md), the static deterministic first-sale terms Markdown body (4,720 bytes, SHA-256 64cdb92a...), eight founder prompts (price floor, scope cap, duration cap, delivery definition, off-scope work, completion criteria, founder red lines, next step after a buying signal), ten boundary substrings, and the required-substring list. The Phase 1E Slice 1 client component at the components slash form slash json paste score card path is extended with a clearly-labelled First-sale terms rehearsal section that renders a heading, a one-line explanation, a Copy first-sale terms text button, a Download first-sale terms Markdown button, an idle / copied / unsupported / error status feedback element, the deterministic first-sale terms Markdown body inside a pre block, and a single Boundary line that includes the founder commercial-reality reminder. The Copy button calls navigator.clipboard.writeText from a click handler, with graceful unsupported and error fallbacks. The Download button uses a Blob, URL.createObjectURL, and an anchor click. A new canonical evidence artefact lives at content slash phase 1e first sale terms rehearsal card. One new test-side support module ships at tests slash support slash phase 1e first sale terms rehearsal card. Nine new unit tests pin shape, surface, first-sale terms text required substrings, download behaviour, copy behaviour, runtime boundaries, decision and safety, cleanliness, and determinism. Every tracked content file top-level product version is bumped to the new product version and every latest approved archive block is refreshed to the approved Phase 1E Slice 1 archive. NullProvider remains default.
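The copy and download behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the shipped component: the names `CopyStatus`, `ClipboardLike`, `DownloadEnv`, `copyMarkdown`, and `downloadMarkdown` are hypothetical, and the browser objects are passed in as parameters (an assumption made here so the sketch can be exercised outside a browser).

```typescript
// Hypothetical status type mirroring the idle / copied / unsupported / error feedback states.
type CopyStatus = "idle" | "copied" | "unsupported" | "error";

// Minimal slice of the Clipboard API that this sketch relies on.
interface ClipboardLike {
  writeText(text: string): Promise<void>;
}

// Copies the text when a clipboard is available; never throws.
async function copyMarkdown(
  text: string,
  clipboard: ClipboardLike | undefined,
): Promise<CopyStatus> {
  if (!clipboard || typeof clipboard.writeText !== "function") {
    return "unsupported"; // e.g. an older browser or an insecure context
  }
  try {
    await clipboard.writeText(text);
    return "copied";
  } catch {
    return "error"; // e.g. permission denied
  }
}

// Hypothetical seam over the browser pieces used by the download path.
interface AnchorLike {
  href: string;
  download: string;
  click(): void;
}
interface DownloadEnv {
  createObjectURL(blob: Blob): string;
  revokeObjectURL(url: string): void;
  createAnchor(): AnchorLike;
}

// Blob -> object URL -> anchor click, then release the object URL.
function downloadMarkdown(filename: string, body: string, env: DownloadEnv): void {
  const blob = new Blob([body], { type: "text/markdown;charset=utf-8" });
  const url = env.createObjectURL(blob);
  const anchor = env.createAnchor();
  anchor.href = url;
  anchor.download = filename;
  anchor.click();
  env.revokeObjectURL(url);
}
```

In a browser, the environment would presumably be wired to `navigator.clipboard`, `URL.createObjectURL`, `URL.revokeObjectURL`, and `document.createElement("a")`, with both functions called only from a click handler so that nothing is written without a user gesture.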
User impact: None at runtime for users not opening the local /score/paste page. Founders preparing for a commercial conversation with a buying-signal prospect now have a small deterministic founder-side rehearsal card available in one click on the local web page: copy or download a single Markdown file that names eight concrete founder-side rehearsal prompts (price floor, scope cap, duration cap, delivery definition, off-scope work, completion criteria, founder red lines, next step after a buying signal), names ten boundary substrings, and includes the local safety boundary line plus the founder commercial-reality reminder. The card forces the founder to lock in their minimum-acceptable first paid pilot shape in writing before negotiating live, so they cannot be talked into improvising terms in the moment. Public launch, paid use, live use, live customer data, live LLM, payments, payment requests, quote/contract/invoice generation, pricing pages, and runtime boundary changes all remain unauthorised. The project is not parked or frozen. Internal milestones are not success - success means the product sells.
When to re-score: Phase 1E Slice 2 adds no scoring change, no engine change, no rendered-output change beyond the same deterministic Markdown the engine already produced. Existing scorecards remain valid.
Docs, Tests, Content, Methodology, Components, Lib
Evidence trace: lib slash web slash sale readiness first sale terms, components slash form slash json paste score card, content slash phase 1e first sale terms rehearsal card, tests slash support slash phase 1e first sale terms rehearsal card, tests slash unit slash phase 1e first sale terms tests (nine), docs slash PHASE 1D Phase 1E Slice 2 section, docs slash PHASE PLAN Phase 1E Slice 2 addendum, docs slash PHASE 1D ENTRY DECISION RECORD Phase 1E Slice 2 addendum, README Phase 1E Slice 2 shipped row, package version bump 0.99.0 to 0.100.0, content slash methodology changelog this 90th entry, git branch phase 1d controlled start entry decision record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1E Slice 2 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1E Slice 2 founder execution signal
- Impact assessment
- Extends the existing self-serve web client component with a small First-sale terms rehearsal section and a new pure deterministic first-sale terms helper module. The first-sale terms text is a single static template constant (no clock, no random, no env, no network, no runtime data, no user-entered content) covering an internal-milestones-are-not-success preamble, eight founder-side rehearsal prompts (price floor, scope cap, duration cap, delivery definition, off-scope work, completion criteria, founder red lines, next step after a buying signal), conservative next-step guidance after a buying signal, the local safety boundary line, and an explicit boundary line stating founder-side rehearsal only, not a quote, not a contract, not an invoice, not legal advice, not a payment request, not a pricing page, not a public launch, not paid use, and not legal/compliance/regulatory approval. Ships one canonical Phase 1E Slice 2 evidence artefact at the static content layer, one test-side support module, nine new unit tests pinning shape and surface and first-sale terms text required substrings and download behaviour and copy behaviour and runtime boundaries and decision and safety and cleanliness and determinism, and three founder-facing documentation addenda. No engine behaviour change. No new API route. No Supabase write. No persistent storage write. No live LLM. No real customer data. No payment integration. No payment request. No quote generation. No contract generation. No invoice generation. No pricing page. No login required. No data-collection form, no submit button, no email send, no CRM, no sales-automation, no analytics, no tracking. 
The slice does not authorise any further work, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage / sessionStorage / IndexedDB / cookie change, does not authorise NullProvider default change, does not authorise marketing or pricing or public sales page work, does not authorise test weakening, does not park or freeze the project, does not claim legal or compliance or regulatory or investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. The recommendation for the next direction is the smallest controlled move from first-sale terms rehearsal toward a legally/governance-aware first paid pilot pathway, focused on getting to a real paid sale (not internal ceremony), but the next direction will not run without a separate explicit founder execution instruction. Internal milestones are not success - success means the product sells.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1E-S3 | Change date: 2026-05-03 | Product version: 0.101.0 | Methodology engine version: 0.9.1
Phase 1E Slice 3 - first paid pilot one-pager template
Reason: The founder explicitly authorised the smallest founder-side first paid pilot one-pager template on top of the Phase 1E Slice 2 first-sale terms rehearsal flow (phase 1e first paid pilot one pager template). Founder commercial standard: internal milestones are not success - success means the product sells. The template helps the founder draft a short plain-English pilot description after a real buying signal AND after completing the first-sale terms rehearsal. The template is explicitly NOT a quote, NOT a contract, NOT an invoice, NOT legal advice, NOT a payment request, NOT a pricing page, NOT a public launch, NOT paid use. No localStorage, no sessionStorage, no IndexedDB, no cookies, no Supabase write, no new API route, no payment integration, no quote/contract/invoice generation, no pricing page, no marketing/public sales page, no login required, no data-collection form, no submit button, no email send, no CRM, no sales-automation, no analytics, no tracking.
What changed: A new pure helper at lib slash web slash sale readiness first paid pilot one pager exposes the deterministic one-pager filename (agentproof-first-paid-pilot-one-pager dot md), the static deterministic one-pager Markdown body (5,556 bytes, SHA-256 9275cf80...), twelve founder prompts (pilot title, buyer/prospect placeholder, problem statement, pilot objective, commercial amount placeholder, scope cap, duration cap, deliverables, completion criteria, out-of-scope, legal/governance review, conservative next step), ten boundary substrings, and the required-substring list. The Phase 1E Slice 2 client component is extended with a clearly-labelled First paid pilot one-pager section that renders a heading, a one-line explanation, a Copy one-pager text button, a Download one-pager Markdown button, an idle / copied / unsupported / error status feedback element, the deterministic one-pager Markdown body inside a pre block, and a single Boundary line that includes the founder commercial-reality reminder. The Copy button calls navigator.clipboard.writeText from a click handler, with graceful unsupported and error fallbacks. The Download button uses a Blob, URL.createObjectURL, and an anchor click. A new canonical evidence artefact lives at content slash phase 1e first paid pilot one pager template. One new test-side support module ships at tests slash support slash phase 1e first paid pilot one pager template. Nine new unit tests pin shape, surface, one-pager text required substrings, download behaviour, copy behaviour, runtime boundaries, decision and safety, cleanliness, and determinism. Every tracked content file top-level product version is bumped to the new product version and every latest approved archive block is refreshed to the approved Phase 1E Slice 2 archive. NullProvider remains default.
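A determinism check of the kind these unit tests pin (byte count, SHA-256, required substrings) might look like the sketch below. It assumes a Node.js test environment; `TemplateReceipt` and `describeTemplate` are hypothetical names, and the real helper may compute or store these values differently.

```typescript
import { createHash } from "node:crypto";

// Hypothetical summary of a static template body.
interface TemplateReceipt {
  bytes: number;     // UTF-8 byte length of the body
  sha256: string;    // lowercase hex digest of the UTF-8 bytes
  missing: string[]; // required substrings not found in the body
}

// Pure function: the same body and substring list always yield the same receipt.
function describeTemplate(
  body: string,
  requiredSubstrings: readonly string[],
): TemplateReceipt {
  const utf8 = Buffer.from(body, "utf8");
  return {
    bytes: utf8.byteLength,
    sha256: createHash("sha256").update(utf8).digest("hex"),
    missing: requiredSubstrings.filter((needle) => !body.includes(needle)),
  };
}
```

A test could then assert that `bytes` equals the recorded size, that `sha256` starts with the recorded hash prefix, and that `missing` is empty. Because the body is a constant with no clock, random, environment, or network input, two calls should always agree.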
User impact: None at runtime for users not opening the local /score/paste page. Founders preparing a first paid pilot one-pager for a buying-signal prospect now have a small deterministic founder-side draft template available in one click on the local web page: copy or download a single Markdown file that names twelve concrete founder-side prompts the founder must answer in their own private follow-up notes BEFORE sending anything to the prospect. The template forces qualified legal/governance review before sending. Public launch, paid use, live use, live customer data, live LLM, payments, payment requests, quote/contract/invoice generation, pricing pages, and runtime boundary changes all remain unauthorised. The project is not parked or frozen. Internal milestones are not success - success means the product sells.
When to re-score: Phase 1E Slice 3 adds no scoring change, no engine change, no rendered-output change beyond the same deterministic Markdown the engine already produced. Existing scorecards remain valid.
Docs, Tests, Content, Methodology, Components, Lib
Evidence trace: lib slash web slash sale readiness first paid pilot one pager, components slash form slash json paste score card, content slash phase 1e first paid pilot one pager template, tests slash support slash phase 1e first paid pilot one pager template, tests slash unit slash phase 1e first paid pilot one pager tests (nine), docs slash PHASE 1D Phase 1E Slice 3 section, docs slash PHASE PLAN Phase 1E Slice 3 addendum, docs slash PHASE 1D ENTRY DECISION RECORD Phase 1E Slice 3 addendum, README Phase 1E Slice 3 shipped row, package version bump 0.100.0 to 0.101.0, content slash methodology changelog this 91st entry, git branch phase 1d controlled start entry decision record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1E Slice 3 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1E Slice 3 founder execution signal
- Impact assessment
- Extends the existing self-serve web client component with a small First paid pilot one-pager section and a new pure deterministic one-pager helper module. The one-pager text is a single static template constant (no clock, no random, no env, no network, no runtime data, no user-entered content) covering an internal-milestones-are-not-success preamble, twelve founder-side prompts, conservative next-step guidance after drafting, the local safety boundary line, and an explicit boundary line stating founder-side draft only, not a quote, not a contract, not an invoice, not legal advice, not a payment request, not a pricing page, not a public launch, not paid use, and not legal/compliance/regulatory approval. Ships one canonical Phase 1E Slice 3 evidence artefact at the static content layer, one test-side support module, nine new unit tests pinning shape and surface and one-pager text required substrings and download behaviour and copy behaviour and runtime boundaries and decision and safety and cleanliness and determinism, and three founder-facing documentation addenda. No engine behaviour change. No new API route. No Supabase write. No persistent storage write. No live LLM. No real customer data. No payment integration. No payment request. No quote generation. No contract generation. No invoice generation. No pricing page. No login required. No data-collection form, no submit button, no email send, no CRM, no sales-automation, no analytics, no tracking. 
The slice does not authorise any further work, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage / sessionStorage / IndexedDB / cookie change, does not authorise NullProvider default change, does not authorise marketing or pricing or public sales page work, does not authorise test weakening, does not park or freeze the project, does not claim legal or compliance or regulatory or investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. The recommendation for the next direction is the smallest controlled move from first paid pilot one-pager preparation toward a legally/governance-aware paid pilot readiness checklist, but the next direction will not run without a separate explicit founder execution instruction. Internal milestones are not success - success means the product sells.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1E-S4 | Change date: 2026-05-03 | Product version: 0.102.0 | Methodology engine version: 0.9.1
Phase 1E Slice 4 - paid pilot readiness checklist
Reason: The founder explicitly authorised the smallest founder-side paid pilot readiness checklist on top of the Phase 1E Slice 3 first paid pilot one-pager template flow (phase 1e paid pilot readiness checklist). Founder commercial standard: internal milestones are not success - success means the product sells. The checklist helps the founder decide whether they are ready to move a buying-signal prospect from local trial discussion into the founder's normal qualified legal/governance review process for a first paid pilot. The checklist is explicitly NOT legal advice, NOT compliance advice, NOT a quote, NOT a contract, NOT an invoice, NOT a payment request, NOT a pricing page, NOT a public launch, NOT paid use. No localStorage, no sessionStorage, no IndexedDB, no cookies, no Supabase write, no new API route, no payment integration, no quote/contract/invoice generation, no pricing page, no marketing/public sales page, no login required, no data-collection form, no submit button, no email send, no CRM, no sales-automation, no analytics, no tracking.
What changed: A new pure helper at lib slash web slash sale readiness paid pilot readiness exposes the deterministic readiness filename (agentproof-paid-pilot-readiness-checklist dot md), the static deterministic readiness Markdown body (5,921 bytes, SHA-256 e860fde7...), nine checklist items (explicit buying signal, first-sale terms complete, one-pager draft complete, qualified legal/governance review owner identified, scope/duration/deliverables/completion written down, live customer data still blocked, live LLM still blocked, payment/quote/contract/invoice remain outside AgentProof, no work starts until founder approval path satisfied), eleven boundary substrings, and the required-substring list. The Phase 1E Slice 3 client component is extended with a clearly-labelled Paid pilot readiness checklist section that renders a heading, a one-line explanation, a Copy readiness checklist button, a Download readiness checklist Markdown button, an idle / copied / unsupported / error status feedback element, the deterministic readiness Markdown body inside a pre block, and a single Boundary line that includes the founder commercial-reality reminder. The Copy button calls navigator.clipboard.writeText from a click handler, with graceful unsupported and error fallbacks. The Download button uses a Blob, URL.createObjectURL, and an anchor click. A new canonical evidence artefact lives at content slash phase 1e paid pilot readiness checklist. One new test-side support module ships at tests slash support slash phase 1e paid pilot readiness checklist. Nine new unit tests pin shape, surface, readiness text required substrings, download behaviour, copy behaviour, runtime boundaries, decision and safety, cleanliness, and determinism. Every tracked content file top-level product version is bumped to the new product version and every latest approved archive block is refreshed to the approved Phase 1E Slice 3 archive. NullProvider remains default.
User impact: None at runtime for users not opening the local /score/paste page. Founders preparing for a first paid pilot for a buying-signal prospect now have a small deterministic founder-side readiness gate available in one click on the local web page: copy or download a single Markdown file that names nine concrete checklist items the founder must satisfy in their own private follow-up notes BEFORE moving the conversation outside AgentProof into the founder's own qualified legal/governance review process. The checklist forces the founder to walk through buying-signal explicitness, first-sale terms completion, one-pager draft completion, qualified legal/governance review owner identification, scope/duration/deliverables/completion documentation, live customer data still being blocked, live LLM still being blocked, payment/quote/contract/invoice remaining outside AgentProof, and no work starting until the founder's normal approval path is satisfied. Public launch, paid use, live use, live customer data, live LLM, payments, payment requests, quote/contract/invoice generation, pricing pages, and runtime boundary changes all remain unauthorised. The project is not parked or frozen. Internal milestones are not success - success means the product sells.
When to re-score: Phase 1E Slice 4 adds no scoring change, no engine change, no rendered-output change beyond the same deterministic Markdown the engine already produced. Existing scorecards remain valid.
Docs, Tests, Content, Methodology, Components, Lib
Evidence trace: lib slash web slash sale readiness paid pilot readiness, components slash form slash json paste score card, content slash phase 1e paid pilot readiness checklist, tests slash support slash phase 1e paid pilot readiness checklist, tests slash unit slash phase 1e paid pilot readiness tests (nine), docs slash PHASE 1D Phase 1E Slice 4 section, docs slash PHASE PLAN Phase 1E Slice 4 addendum, docs slash PHASE 1D ENTRY DECISION RECORD Phase 1E Slice 4 addendum, README Phase 1E Slice 4 shipped row, package version bump 0.101.0 to 0.102.0, content slash methodology changelog this 92nd entry, git branch phase 1d controlled start entry decision record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1E Slice 4 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1E Slice 4 founder execution signal
- Impact assessment
- Extends the existing self-serve web client component with a small Paid pilot readiness checklist section and a new pure deterministic readiness helper module. The readiness text is a single static template constant (no clock, no random, no env, no network, no runtime data, no user-entered content) covering an internal-milestones-are-not-success preamble, nine founder-side checklist items, conservative next-step guidance after the checklist is complete, the local safety boundary line, and an explicit boundary line stating founder-side readiness checklist only, not legal advice, not compliance advice, not a quote, not a contract, not an invoice, not a payment request, not a pricing page, not public launch, not paid use, and not legal/compliance/regulatory approval. Ships one canonical Phase 1E Slice 4 evidence artefact at the static content layer, one test-side support module, nine new unit tests pinning shape and surface and readiness text required substrings and download behaviour and copy behaviour and runtime boundaries and decision and safety and cleanliness and determinism, and three founder-facing documentation addenda. No engine behaviour change. No new API route. No Supabase write. No persistent storage write. No live LLM. No real customer data. No payment integration. No payment request. No quote generation. No contract generation. No invoice generation. No pricing page. No login required. No data-collection form, no submit button, no email send, no CRM, no sales-automation, no analytics, no tracking. 
The slice does not authorise any further work, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage / sessionStorage / IndexedDB / cookie change, does not authorise NullProvider default change, does not authorise marketing or pricing or public sales page work, does not authorise test weakening, does not park or freeze the project, does not claim legal or compliance or regulatory or investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. The recommendation for the next direction is the smallest controlled move from paid pilot readiness checklist toward a legally/governance-aware external-review handoff pack, but the next direction will not run without a separate explicit founder execution instruction. Internal milestones are not success - success means the product sells.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1F-S1 | Change date: 2026-05-03 | Product version: 0.103.0 | Methodology engine version: 0.9.1
Phase 1F Slice 1 - buyer-facing product UI reset
Reason: The founder explicitly authorised a buyer-facing product UI reset on the local /score/paste page (phase 1f buyer facing product ui reset). Founder commercial standard: internal milestones are not success - success means the product sells. The /score/paste page had drifted into an internal founder/developer console and was not sellable as a product to a first-look buyer. Phase 1F Slice 1 reorganises the page so the FIRST visible surface is a clean buyer-facing product hero with buyer-friendly labels and a buyer result summary, while keeping every existing internal founder card and the legacy primary controls inside a collapsed-by-default Founder tools details element block. No engine/scoring change. No new API route. NullProvider remains default. The deterministic scorecard Markdown SHA-256 stays at 99fae010… across the slice. No localStorage, no sessionStorage, no IndexedDB, no cookies, no Supabase write, no payment integration, no quote/contract/invoice generation, no pricing page, no marketing/public sales page, no login required, no data-collection form, no submit button, no email send, no CRM, no sales-automation, no analytics, no tracking, no live LLM, no live customer data, no live scan, no legal advice, no compliance advice.
What changed: The json paste score card client component is reorganised. A new buyer-facing hero section (test ids json-paste-buyer-hero / -headline / -subheadline / -demo-button / -check-profile-link) is the first visible block on /score/paste with the headline "AI Agent Readiness Check", the subheadline "See whether an AI agent is ready for safe deployment before it goes live.", a plain-English description, and a primary "Run fictional demo" action plus a "Check an agent profile" link to the buyer input area. A new buyer input section (test id json-paste-buyer-input) wraps the existing textarea (test id json-paste-textarea preserved) under the buyer-friendly label "Agent profile" and exposes a primary "Check readiness" button (test id json-paste-buyer-score-button) plus secondary buyer-friendly "Run fictional demo" and "Clear" buttons. A new buyer result summary section (test id json-paste-buyer-result-summary) renders agent name, readiness score, readiness rating, a plain-English interpretation derived deterministically from the engine readiness rating via a new pure helper buyer friendly readiness summary, and "Download report" (test id json-paste-buyer-download-report) plus "Copy report" (test id json-paste-buyer-copy-report) buttons that drive the same Markdown bytes the existing scorecard download/copy buttons drive. The eight existing internal founder cards (the local demo walkthrough, the founder demo handoff kit, the trial-prep card, the trial-closeout card, the buying-signal card, the first-sale terms rehearsal card, the first paid pilot one-pager template, and the paid pilot readiness checklist) and the legacy primary controls (Load fictional demo input, Clear, Score this JSON) are tucked inside a single Founder tools details element block that is collapsed by default.
The buyer-facing primary surface does not contain any of the forbidden internal terms (Phase, Slice, agent inputs JSON, NullProvider, deterministic, SHA-256, founder handoff, trial prep, trial closeout, buying signal, first-sale terms, paid pilot readiness, internal milestones, success means product sells, product version, archive, changelog, gauntlet) but those terms remain inside Founder tools, the artefacts, the docs, and the tests. A new canonical evidence artefact lives at content/phase 1f buyer facing product ui reset.v1.json. One new test-side support module ships at an internal source file. Nine new unit tests pin shape, surface, primary-copy cleanliness, founder tools, result summary, runtime boundaries, decision and safety, cleanliness, and determinism. NullProvider remains default. The deterministic scorecard Markdown SHA-256 remains 99fae010… across the slice.
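The primary-copy cleanliness rule above could be checked with something like the following sketch. The term list is taken from this entry; the names `FORBIDDEN_INTERNAL_TERMS` and `findForbiddenTerms` are hypothetical, and case-insensitive substring matching is an assumption (the actual tests may match differently).

```typescript
// The eighteen internal terms named in this entry.
const FORBIDDEN_INTERNAL_TERMS: readonly string[] = [
  "Phase",
  "Slice",
  "agent inputs JSON",
  "NullProvider",
  "deterministic",
  "SHA-256",
  "founder handoff",
  "trial prep",
  "trial closeout",
  "buying signal",
  "first-sale terms",
  "paid pilot readiness",
  "internal milestones",
  "success means product sells",
  "product version",
  "archive",
  "changelog",
  "gauntlet",
];

// Returns every forbidden term that appears in the buyer-facing copy.
// Assumption: matching is case-insensitive and substring-based.
function findForbiddenTerms(primaryCopy: string): string[] {
  const haystack = primaryCopy.toLowerCase();
  return FORBIDDEN_INTERNAL_TERMS.filter((term) =>
    haystack.includes(term.toLowerCase()),
  );
}
```

A test would pass only the text rendered outside the Founder tools block and expect an empty result, since the same terms are allowed to remain inside Founder tools, the artefacts, the docs, and the tests.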
User impact: A first-look buyer who opens the local /score/paste page now sees a clean product hero ('AI Agent Readiness Check' headline, plain-English subheadline, plain-English description, 'Run fictional demo' primary action, 'Check an agent profile' link), a buyer input section using buyer-friendly labels ('Agent profile' input label, 'Check readiness' primary button), and a buyer result summary after scoring (readiness score, readiness rating, plain-English interpretation, 'Download report' and 'Copy report' actions) instead of an internal founder/developer console with Phase/Slice/agent inputs JSON/NullProvider/deterministic/SHA-256/founder-handoff/trial-prep/trial-closeout/buying-signal/first-sale-terms/paid-pilot-readiness wording at the top. Founders who still need the prior-slice cards can expand the Founder tools section at the bottom; everything that was there before is still there. No engine/scoring change. No new API route. No backend write. No browser-storage write. NullProvider remains default. Future product-building work remains unauthorised. Public launch, paid use, live use, live customer data, live LLM, payments, payment requests, quote/contract/invoice generation, pricing pages, and runtime boundary changes all remain unauthorised. The project is not parked or frozen. Internal milestones are not success - success means the product sells.
When to re-score: Phase 1F Slice 1 is a UI reset only. No engine change, no scoring change, no rendered Markdown change. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 (99fae010…) is unchanged.
Docs, Tests, Content, Methodology, Components
Evidence trace: an internal source file (buyer hero + buyer input + buyer result summary added; 8 internal founder cards + 3 legacy primary controls moved inside a Founder tools details element block collapsed by default), content/phase 1f buyer facing product ui reset.v1.json, an internal source file, tests/unit/phase_1f_buyer_ui_reset_*.test.ts (nine), docs/phase 1 d Phase 1F Slice 1 section, docs/phase plan Phase 1F Slice 1 addendum, docs/phase 1 d entry decision record Phase 1F Slice 1 addendum, README Phase 1F Slice 1 shipped row, package version bump 0.102.0 to 0.103.0, content/methodology changelog.v1.json this 93rd entry, git branch phase-1d-controlled-start-entry-decision-record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1F Slice 1 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1F Slice 1 founder execution signal
- Impact assessment
- Reorganises the existing self-serve web client component to expose a clean buyer-facing primary surface (hero, input, result summary) using buyer-friendly labels, and tucks every existing internal founder card and legacy primary control inside a collapsed-by-default Founder tools details element block. The reset does not change any engine output, any scoring constant, any deterministic Markdown helper body, any rendered prior-phase Markdown SHA, any deterministic filename, any prior-phase artefact, any prior-phase test that pins source-body strings (those strings are preserved inside Founder tools), any API route count, any storage layer, any provider default, or any deployment configuration. Ships one canonical Phase 1F Slice 1 evidence artefact at the static content layer, one test-side support module, nine new unit tests pinning shape and surface and primary-copy cleanliness and founder tools and result summary and runtime boundaries and decision and safety and cleanliness and determinism, and three founder-facing documentation addenda. No engine behaviour change. No new API route. No Supabase write. No persistent storage write. No live LLM. No real customer data. No payment integration. No payment request. No quote generation. No contract generation. No invoice generation. No pricing page. No login required. No data-collection form, no submit button, no email send, no CRM, no sales-automation, no analytics, no tracking.
The slice does not authorise any further work, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage / sessionStorage / IndexedDB / cookie change, does not authorise NullProvider default change, does not authorise marketing or pricing or public sales page work, does not authorise test weakening, does not park or freeze the project, does not claim legal or compliance or regulatory or investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. The recommendation for the next direction is the smallest controlled move from the clean buyer-facing local product screen toward a realistic first-prospect demo package that can be shown without embarrassment, but the next direction will not run without a separate explicit founder execution instruction. Internal milestones are not success - success means the product sells.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1F-S2
Change date: 2026-05-03
Product version: 0.104.0
Methodology engine version: 0.9.1
Phase 1F Slice 2 - first-prospect demo package
Reason: The founder explicitly authorised a polished buyer-facing first-prospect demo package on the local /score/paste page (phase 1f first prospect demo package). The page now starts like a product (Phase 1F Slice 1) but the next problem was demonstration credibility: a first-look prospect needs a clean, believable fictional example that shows what AgentProof does without exposing JSON, internal phase names, provider details, hashes, or founder tooling. Phase 1F Slice 2 adds that example. Founder commercial standard: internal milestones are not success - success means the product sells.
What changed: A new buyer-facing demo package section is added to the json paste score card component, between the buyer hero and the buyer input area. Test ids: json-paste-demo-package, json-paste-demo-package-title, json-paste-demo-package-description, json-paste-demo-package-agent-summary, json-paste-demo-package-checks, json-paste-demo-package-report-preview, json-paste-demo-package-run. The section heading reads 'Try the fictional demo' and the in-section agent title reads 'Fictional customer-support agent'. The section uses ONLY the fictional Phase 1D Slice 1 demo fixture (samples/inputs/phase 1d slice 1 demo agent input.json), explains in plain English what the agent does, what AgentProof checks, and what the buyer will see in the report, and exposes a primary 'Run this demo' button that calls the existing load fictional demo input function (it does NOT auto-score; the buyer then clicks 'Check readiness' to score). The block does NOT show raw agent inputs JSON, does NOT show a raw Markdown preview before scoring, and contains none of the eighteen forbidden internal terms. All eight existing internal founder cards (the local demo walkthrough, the founder demo handoff kit, the trial-prep card, the trial-closeout card, the buying-signal card, the first-sale terms rehearsal card, the first paid pilot one-pager template, the paid pilot readiness checklist) and the three legacy primary controls (Load fictional demo input, Clear, Score this JSON) remain inside the collapsed-by-default Founder tools details block. A new canonical evidence artefact lives at content/phase 1f first prospect demo package.v1.json. One new test-side support module ships at an internal source file. Nine new unit tests pin shape, surface, primary-copy cleanliness, buyer flow, founder tools, runtime boundaries, decision and safety, cleanliness, and determinism. NullProvider remains default. The deterministic scorecard Markdown SHA-256 remains 99fae010 across the slice.
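The cleanliness rule above (buyer-facing copy contains none of the forbidden internal terms) can be illustrated with a small scan. This is a minimal sketch, not the project's test code: `findForbiddenTerms` is a hypothetical helper, the three terms shown are illustrative stand-ins rather than the actual list of eighteen, and case-insensitive substring matching is an assumption.

```typescript
// Hypothetical helper: returns every forbidden term found in a piece of buyer-facing copy.
// Case-insensitive substring matching is an assumption made for this sketch.
function findForbiddenTerms(copy: string, forbidden: readonly string[]): string[] {
  const haystack = copy.toLowerCase();
  return forbidden.filter((term) => haystack.indexOf(term.toLowerCase()) !== -1);
}

// Illustrative terms only. These are NOT the real list of eighteen forbidden terms.
const ILLUSTRATIVE_FORBIDDEN_TERMS: readonly string[] = ["NullProvider", "SHA-256", "Phase 1F"];

// Buyer-facing strings quoted in the entry above.
const demoHeading = "Try the fictional demo";
const demoAgentTitle = "Fictional customer-support agent";

// Both arrays are empty when the copy is clean.
const headingHits = findForbiddenTerms(demoHeading, ILLUSTRATIVE_FORBIDDEN_TERMS);
const titleHits = findForbiddenTerms(demoAgentTitle, ILLUSTRATIVE_FORBIDDEN_TERMS);
```

A unit test in this style would fail as soon as any listed term leaked into the primary surface, which is the behaviour the primary-copy cleanliness tests are described as pinning.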
User impact: A first-look buyer who opens the local /score/paste page now sees a clean buyer-facing primary surface with three blocks above the input area: (1) the AI Agent Readiness Check hero, (2) a polished 'Try the fictional demo' demo package section with a plain-English description, agent summary bullets, what-AgentProof-checks bullets, and what-you-will-see-in-the-report bullets, and (3) the buyer input section with the textarea and Check readiness button. The buyer can read the demo package in under a minute, click 'Run this demo' to load the fictional non-customer sample, then click 'Check readiness' to see the readiness score, rating, plain-English interpretation, and downloadable report. The demo package is fictional only - no real customer data, no live LLM, no live scan, no payment, no quote/contract/invoice, no pricing, no public launch, no paid use. Founders who still need the prior-slice cards can expand the Founder tools section at the bottom; everything that was there before is still there.
When to re-score: Phase 1F Slice 2 is a buyer-facing UI addition only. No engine change, no scoring change, no rendered Markdown change. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 (99fae010) is unchanged.
Docs, Tests, Content, Methodology, Components
Evidence trace: an internal source file (demo package section added between buyer hero and buyer input area), content/phase 1f first prospect demo package.v1.json, an internal source file, tests/unit/phase_1f_first_prospect_demo_package_*.test.ts (nine), docs/phase 1 d Phase 1F Slice 2 section, docs/phase plan Phase 1F Slice 2 addendum, docs/phase 1 d entry decision record Phase 1F Slice 2 addendum, README Phase 1F Slice 2 shipped row, package version bump 0.103.0 to 0.104.0, content/methodology changelog.v1.json this 94th entry, git branch phase-1d-controlled-start-entry-decision-record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1F Slice 2 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1F Slice 2 founder execution signal
- Impact assessment
- Adds a polished buyer-facing first-prospect demo package to the existing /score/paste self-serve scoring page using ONLY the fictional Phase 1D Slice 1 demo fixture. The demo package block sits between the buyer hero and the buyer input area, explains the example agent in plain English, lists what AgentProof checks, lists what the buyer will see in the report, and exposes a Run-this-demo button that calls the existing loadFictionalDemoInput function so the buyer can then click Check readiness to score. Does not change any engine output, any scoring constant, any deterministic Markdown helper body, any rendered prior-phase Markdown SHA, any deterministic filename, any prior-phase artefact, any prior-phase test that pins source-body strings, any API route count, any storage layer, any provider default, or any deployment configuration. Does not show raw JSON in the buyer-facing primary surface. Does not show a raw Markdown preview before scoring. Does not contain any of the eighteen forbidden internal terms. Ships one canonical Phase 1F Slice 2 evidence artefact at the static content layer, one test-side support module, nine new unit tests pinning shape and surface and primary-copy cleanliness and buyer flow and founder tools and runtime boundaries and decision and safety and cleanliness and determinism, and three founder-facing documentation addenda. No engine behaviour change. No new API route. No Supabase write. No persistent storage write. No live LLM. No real customer data. No payment integration. No payment request. No quote generation. No contract generation. No invoice generation. No pricing page. No login required. No data-collection form, no submit button, no email send, no CRM, no sales-automation, no analytics, no tracking. 
The slice does not authorise any further work, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage / sessionStorage / IndexedDB / cookie change, does not authorise NullProvider default change, does not authorise marketing or pricing or public sales page work, does not authorise test weakening, does not park or freeze the project, does not claim legal or compliance or regulatory or investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. The recommendation for the next direction is the smallest controlled move from a realistic first-prospect demo package toward a usable buyer report experience: a polished, buyer-facing report summary layout. Internal milestones are not success - success means the product sells.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1F-S3
Change date: 2026-05-03
Product version: 0.105.0
Methodology engine version: 0.9.1
Phase 1F Slice 3 - buyer report summary layout
Reason: The founder explicitly authorised a polished buyer-facing report summary layout on the local /score/paste page (phase 1f buyer report summary layout). After scoring, a first-look prospect should see a clear product-quality summary instead of needing to read raw Markdown first: a readiness score card, a readiness rating card, a plain-English meaning line, a Top things to review list, a Recommended next steps list, and Download/Copy report actions. The slice must improve buyer comprehension WITHOUT changing the scoring engine, the rendered Markdown bytes, the API route shape, or any prior deterministic SHA pins. Founder commercial standard: internal milestones are not success - success means the product sells.
What changed: A new pure deterministic helper at an internal source file exposes extract markdown section (returns the body between a ## heading and the next ## heading), buyer top things to review (extracts up to three items from the existing scorecard Markdown body's ## Top red flags section, strips the **Severity**: prefix, skips the indented Impact: sub-bullets), buyer recommended next steps (prefers ## Top 3 actions this week; falls back to ## Recommended improvements (prioritised) with the priority prefix and (estimated +N to Category) trailing parenthetical stripped; final fallback to ## Next recommended action), buyer what this means (deterministic plain-English copy per engine readiness rating), and buyer rating chip label (deterministic buyer-friendly chip label per rating). The json paste score card component is extended with a new section data-test-id="json-paste-buyer-report-summary" that renders a Readiness report heading; a score card (data-test-id="json-paste-buyer-report-score-card") with state.overall score / 100; a rating card (data-test-id="json-paste-buyer-report-rating-card") with the buyer rating chip label chip and the agent name; a What this means line (data-test-id="json-paste-buyer-report-meaning") using buyer what this means; a Top things to review list (data-test-id="json-paste-buyer-report-top-risks") populated from buyer top things to review with a no-flags fallback line; a Recommended next steps list (data-test-id="json-paste-buyer-report-next-steps") populated from buyer recommended next steps with a fallback line that points to the full report below; and Download report / Copy report buttons (data-test-id="json-paste-buyer-report-download" / -copy) wired to the existing download markdown / copy markdown functions. 
The raw Markdown report remains rendered below the summary inside a wrapping div data-test-id="json-paste-buyer-report-markdown" so the existing json-paste-result-markdown textarea test id is preserved for prior-slice tests. A new canonical evidence artefact lives at content/phase 1f buyer report summary layout.v1.json. One new test-side support module ships at an internal source file. Nine new unit tests pin shape, surface, primary-copy cleanliness, layout (pure helper extraction), Markdown preservation, runtime boundaries, decision and safety, cleanliness, and determinism. NullProvider remains default. The deterministic scorecard Markdown SHA-256 remains 99fae010 across the slice. The /api/score response shape is unchanged.
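The section-extraction and top-things-to-review behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the shipped helper: the camelCase names are guesses at how the described functions might be spelled, and the bullet shape (`- **High**: text` with indented `- Impact:` sub-bullets) is assumed from the description of the prefix being stripped.

```typescript
// Returns the body between "## <heading>" and the next level-2 heading ("" if absent).
function extractMarkdownSection(markdown: string, heading: string): string {
  const body: string[] = [];
  let inside = false;
  for (const line of markdown.split("\n")) {
    if (line.indexOf("## ") === 0) {
      if (inside) break; // the next level-2 heading closes the section
      inside = line.trim() === "## " + heading;
      continue;
    }
    if (inside) body.push(line);
  }
  return body.join("\n").trim();
}

// Up to three top-level bullets from "## Top red flags", with the bold severity
// prefix removed. Indented sub-bullets (for example "  - Impact: ...") are skipped
// because they do not start with "- " at column zero.
function buyerTopThingsToReview(markdown: string): string[] {
  const items: string[] = [];
  for (const line of extractMarkdownSection(markdown, "Top red flags").split("\n")) {
    if (line.indexOf("- ") !== 0) continue;
    items.push(line.slice(2).replace(/^\*\*[^*]+\*\*:\s*/, ""));
    if (items.length === 3) break;
  }
  return items;
}

// Fictional scorecard fragment, invented for this sketch.
const sampleScorecard = [
  "# Scorecard",
  "## Top red flags",
  "- **High**: No human approval step for refunds",
  "  - Impact: refunds could be issued without review",
  "- **Medium**: Escalation path is undocumented",
  "## Top 3 actions this week",
  "- Add an approval step",
].join("\n");

const topThings = buyerTopThingsToReview(sampleScorecard);
```

Both functions depend only on their string input, which is what makes the no-clock, no-random, no-env, no-network property straightforward to test.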
User impact: After scoring, a first-look buyer now sees a clear product-quality summary block at the top of the result area: readiness score (0-100), readiness rating with a buyer-friendly chip label (e.g. Sandbox test candidate, Controlled pilot candidate), a plain-English What this means line, a Top things to review list (up to three items), a Recommended next steps list (up to three items), and Download report / Copy report buttons. The raw Markdown report remains rendered below for export and review. Buyers do not need to read raw Markdown first to understand the result. Founders who still want the prior-slice cards can expand the Founder tools section at the bottom; everything that was there before is still there. No engine/scoring change. No new API route. No backend write. No browser-storage write. NullProvider remains default. Future product-building work remains unauthorised. Public launch, paid use, live use, live customer data, live LLM, payments, payment requests, quote/contract/invoice generation, pricing pages, and runtime boundary changes all remain unauthorised. The project is not parked or frozen. Internal milestones are not success - success means the product sells.
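The Recommended next steps list follows a three-stage fallback over the scorecard Markdown. The sketch below shows one way that chain could look. Several details are assumptions: the function names, the bold priority prefix shape (`**P1**: `), and the choice to treat the final fallback section as a single item.

```typescript
// Non-empty lines under "## <heading>", up to the next level-2 heading.
function sectionLines(markdown: string, heading: string): string[] {
  const out: string[] = [];
  let inside = false;
  for (const line of markdown.split("\n")) {
    if (line.indexOf("## ") === 0) {
      if (inside) break;
      inside = line.trim() === "## " + heading;
    } else if (inside && line.trim() !== "") {
      out.push(line);
    }
  }
  return out;
}

// Keeps only top-level "- " bullets and drops the bullet marker.
function topLevelBullets(lines: string[]): string[] {
  return lines.filter((l) => l.indexOf("- ") === 0).map((l) => l.slice(2).trim());
}

function buyerRecommendedNextSteps(markdown: string): string[] {
  // 1. Preferred source.
  const actions = topLevelBullets(sectionLines(markdown, "Top 3 actions this week"));
  if (actions.length > 0) return actions.slice(0, 3);

  // 2. Fallback: strip the (assumed) bold priority prefix and the trailing
  //    "(estimated +N to Category)" parenthetical.
  const improvements = topLevelBullets(
    sectionLines(markdown, "Recommended improvements (prioritised)")
  ).map((item) =>
    item.replace(/^\*\*[^*]+\*\*:\s*/, "").replace(/\s*\(estimated \+\d+ to [^)]*\)\s*$/, "")
  );
  if (improvements.length > 0) return improvements.slice(0, 3);

  // 3. Final fallback: first non-empty line of the section (assumption).
  return sectionLines(markdown, "Next recommended action")
    .slice(0, 1)
    .map((l) => l.replace(/^- /, "").trim());
}

// Fictional input with only the second-choice section present.
const onlyImprovements = [
  "## Recommended improvements (prioritised)",
  "- **P1**: Document the escalation path (estimated +6 to Controls)",
  "- **P2**: Add audit logging (estimated +4 to Controls)",
].join("\n");

const nextSteps = buyerRecommendedNextSteps(onlyImprovements);
```

An empty result is the case where the component would show its fallback line pointing to the full report.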
When to re-score: Phase 1F Slice 3 is a buyer-facing UI addition + a pure deterministic Markdown extraction helper. No engine change, no scoring change, no rendered Markdown change. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 (99fae010) is unchanged.
Docs, Tests, Content, Methodology, Components, Lib
Evidence trace: an internal source file (new pure helper), an internal source file (buyer report summary section + raw markdown wrapper), content/phase 1f buyer report summary layout.v1.json, an internal source file, tests/unit/phase_1f_buyer_report_summary_*.test.ts (nine), docs/phase 1 d Phase 1F Slice 3 section, docs/phase plan Phase 1F Slice 3 addendum, docs/phase 1 d entry decision record Phase 1F Slice 3 addendum, README Phase 1F Slice 3 shipped row, package version bump 0.104.0 to 0.105.0, content/methodology changelog.v1.json this 95th entry, git branch phase-1d-controlled-start-entry-decision-record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1F Slice 3 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1F Slice 3 founder execution signal
- Impact assessment
- Adds a polished buyer-facing card-layout report summary to the existing /score/paste self-serve scoring page using a new pure deterministic Markdown extraction helper at lib slash web slash buyer report summary. The buyer report summary block sits inside the existing hasResult success branch (above the Phase 1F Slice 1 buyer result summary), reads existing fields from the scorecard response (overallScore, readinessRating, agentName) and the existing Markdown body (state.markdown), and renders a Readiness report heading, a score card, a rating card with a buyer-friendly chip label, a What this means plain-English line, a Top things to review list, a Recommended next steps list, and Download report / Copy report buttons. The raw Markdown report remains rendered below the summary inside a wrapping div carrying json-paste-buyer-report-markdown so the existing json-paste-result-markdown textarea test id is preserved. Does not change any engine output, any scoring constant, any deterministic Markdown helper body, any rendered prior-phase Markdown SHA, any deterministic filename, any prior-phase artefact, any prior-phase test that pins source-body strings, any API route count, any /api/score response shape, any storage layer, any provider default, or any deployment configuration. The new helper is a pure function over its input string/value (no clock, no random, no env, no network). Ships one canonical Phase 1F Slice 3 evidence artefact at the static content layer, one test-side support module, nine new unit tests pinning shape and surface and primary-copy cleanliness and layout and Markdown preservation and runtime boundaries and decision and safety and cleanliness and determinism, and three founder-facing documentation addenda. No engine behaviour change. No new API route. No Supabase write. No persistent storage write. No live LLM. No real customer data. No payment integration. No payment request. No quote generation. No contract generation. No invoice generation. 
No pricing page. No login required. No data-collection form, no submit button, no email send, no CRM, no sales-automation, no analytics, no tracking. The slice does not authorise any further work, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage / sessionStorage / IndexedDB / cookie change, does not authorise NullProvider default change, does not authorise marketing or pricing or public sales page work, does not authorise test weakening, does not park or freeze the project, does not claim legal or compliance or regulatory or investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. The recommendation for the next direction is the smallest controlled move from a polished local buyer report experience toward a realistic first-prospect demo script or lightweight guided demo narrative the founder can use in a real conversation. Internal milestones are not success - success means the product sells.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1F-S4
Change date: 2026-05-03
Product version: 0.106.0
Methodology engine version: 0.9.1
Phase 1F Slice 4 - first-prospect guided demo script
Reason: The founder explicitly authorised a founder-side first-prospect guided demo script (phase 1f first prospect guided demo script) that lives ONLY inside the existing collapsed-by-default Founder tools <details> block on the local /score/paste page. The script is what the founder reads to a real first-look prospect while running the existing local fictional demo, so the conversation stays buyer-safe, focused, and free of overclaims (no legal/compliance/regulatory approval claim, no public launch claim, no paid use claim, no quote/contract/invoice/payment-request, no asks for secrets/PII/customer records/live customer data/production system details/confidential contracts). It is NOT a buyer-facing primary surface change - the buyer hero, demo package, and buyer report summary blocks remain unchanged. Founder commercial standard: internal milestones are not success - success means the product sells.
What changed: A new pure deterministic helper at an internal source file exposes first prospect guided demo script filename (agentproof-first-prospect-guided-demo-script.md), first prospect guided demo script markdown (deterministic 5,541-byte Markdown body, SHA-256 c7bf9023253b7f80584a9cc81972d6977b1a91a3f133ddb3711dab8e5490d65d), first prospect guided demo script steps (9 documented steps), first prospect guided demo script boundary substrings (10), and first prospect guided demo script required substrings (35). The helper is a static template constant - no clock, no random, no env, no network. The json paste score card component is extended with a new section data-test-id=json-paste-first-prospect-demo-script INSIDE the existing Founder tools <details> block (after the paid-pilot-readiness section, before the legacy primary controls). The section renders a Copy demo script button (data-test-id=json-paste-first-prospect-demo-script-copy), a Download demo script Markdown button (data-test-id=json-paste-first-prospect-demo-script-download), a status feedback element (data-test-id=json-paste-first-prospect-demo-script-status), the Markdown preview (data-test-id=json-paste-first-prospect-demo-script-text), and a boundary line (data-test-id=json-paste-first-prospect-demo-script-boundary). Copy uses navigator.clipboard.writeText with idle/copied/unsupported/error states. Download uses the same Blob + URL.createObjectURL + anchor click pattern as prior founder-side scripts. A new useState hook (demo script copy status) is added for the copy feedback (12 total useState hooks in json paste score card). A new canonical evidence artefact lives at content/phase 1f first prospect guided demo script.v1.json. One new test-side support module ships at an internal source file. 
Nine new unit tests pin shape, surface (six new test ids in source; section between founder-tools-summary and the founder-tools </details> close; ABSENT from buyer hero / demo package / buyer report summary blocks), text (9 steps verbatim, 10 boundary substrings, 35 required substrings, founder commercial-reality reminder, what-NOT-to-claim block, safe close), copy behaviour, download behaviour, runtime/public/paid/legal/compliance boundaries, decision and safety, cleanliness, and determinism. NullProvider remains default. The deterministic scorecard Markdown SHA-256 remains 99fae010 across the slice. The /api/score response shape is unchanged.
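The four copy states named above (idle, copied, unsupported, error) can be sketched without a browser by injecting the clipboard. This is a hedged illustration rather than the component's code: `copyWithStatus` and `ClipboardLike` are hypothetical names, and the injection pattern is an assumption made so the logic can be exercised in a unit test.

```typescript
// "idle" is the state before any click; the function below yields one of the other three.
type CopyStatus = "idle" | "copied" | "unsupported" | "error";

// The only part of the browser Clipboard API this sketch relies on.
interface ClipboardLike {
  writeText(text: string): Promise<void>;
}

// Hypothetical helper: resolves to the status the UI should show after a copy attempt.
async function copyWithStatus(
  text: string,
  clipboard: ClipboardLike | undefined
): Promise<CopyStatus> {
  if (!clipboard) return "unsupported"; // no Clipboard API available in this environment
  try {
    await clipboard.writeText(text);
    return "copied";
  } catch {
    return "error"; // for example, permission denied
  }
}

// Test doubles standing in for navigator.clipboard.
const okClipboard: ClipboardLike = { writeText: () => Promise.resolve() };
const failingClipboard: ClipboardLike = {
  writeText: () => Promise.reject(new Error("denied")),
};
```

In a browser the second argument would be `navigator.clipboard`, and the resolved value would be stored in the copy-status state that drives the status feedback element.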
User impact: The founder can now expand the Founder tools section at the bottom of /score/paste and copy or download a deterministic Markdown guided demo script that walks them through nine documented steps for running the existing local fictional demo with a real first-look prospect: 30-second opening, what to say before clicking Run this demo, what to say before clicking Check readiness, how to explain the readiness score (0-100, deterministic local engine), how to explain the readiness rating (e.g. Sandbox test candidate, Controlled pilot candidate), how to explain the top things to review, how to explain the recommended next steps, how to explain Download report, and a safe close (ask if the prospect would like to run a fictional version of one of their own agents next; do NOT ask for secrets / PII / customer records / live customer data / production system details / confidential contracts; if real buying intent surfaces, hand off to the founder's own qualified legal/governance review OUTSIDE AgentProof). The script also includes a What NOT to claim block (not legal advice, not compliance advice, not regulatory approval, not a public launch, not paid use, not a quote, not a contract, not an invoice, not a payment request) and a Boundary line. The script lives ONLY inside the collapsed Founder tools block - the buyer-facing primary surface (hero, demo package, buyer report summary) is unchanged. No engine/scoring change. No new API route. No backend write. No browser-storage write. NullProvider remains default. Future product-building work remains unauthorised. Public launch, paid use, live use, live customer data, live LLM, payments, payment requests, quote/contract/invoice generation, pricing pages, and runtime boundary changes all remain unauthorised. The project is not parked or frozen. Internal milestones are not success - success means the product sells.
When to re-score: Phase 1F Slice 4 is a founder-side script Markdown addition inside the collapsed Founder tools block. No engine change, no scoring change, no rendered scorecard Markdown change. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 (99fae010) is unchanged.
Docs, Tests, Content, Methodology, Components, Lib
Evidence trace: an internal source file (new pure deterministic helper), an internal source file (guided demo script section inside the collapsed Founder tools <details> block + demo script copy status use state + download first prospect guided demo script markdown / copy first prospect guided demo script markdown functions), content/phase 1f first prospect guided demo script.v1.json, an internal source file, tests/unit/phase_1f_first_prospect_guided_demo_script_*.test.ts (nine), docs/phase 1 d Phase 1F Slice 4 section, docs/phase plan Phase 1F Slice 4 addendum, docs/phase 1 d entry decision record Phase 1F Slice 4 addendum, README Phase 1F Slice 4 shipped row, package version bump 0.105.0 to 0.106.0, content/methodology changelog.v1.json this 96th entry, git branch phase-1d-controlled-start-entry-decision-record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1F Slice 4 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1F Slice 4 founder execution signal
- Impact assessment
- Adds a founder-side first-prospect guided demo script section INSIDE the existing collapsed Founder tools details block on the existing /score/paste self-serve scoring page using a new pure deterministic Markdown helper at lib slash web slash first prospect guided demo script. The section sits between the founder-tools summary marker and the founder-tools details close, after the Phase 1E Slice 4 paid-pilot-readiness section and before the legacy primary controls. Renders a Copy demo script button, a Download demo script Markdown button, a status feedback element, the Markdown preview, and a Boundary line. Copy uses navigator.clipboard.writeText with idle, copied, unsupported, and error states. Download uses the Blob plus URL.createObjectURL plus anchor click pattern with a deterministic filename. Adds one new useState hook for demo script copy status (twelve total). Does not change any engine output, any scoring constant, any deterministic Markdown helper body, any rendered prior-phase Markdown SHA, any deterministic filename, any prior-phase artefact, any prior-phase test that pins source-body strings, any API route count, any /api/score response shape, any storage layer, any provider default, or any deployment configuration. The new helper is a static deterministic template constant module (no clock, no random, no env, no network). Ships one canonical Phase 1F Slice 4 evidence artefact at the static content layer, one test-side support module, nine new unit tests pinning shape, surface, text, copy behaviour, download behaviour, runtime boundaries, decision and safety, cleanliness, and determinism, and three founder-facing documentation addenda. No engine behaviour change. No new API route. No Supabase write. No persistent storage write. No live LLM. No real customer data. No payment integration. No payment request. No quote generation. No contract generation. No invoice generation. No pricing page. No login required. 
No data-collection form, no submit button, no email send, no CRM, no sales-automation, no analytics, no tracking. The slice does not authorise any further work, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage / sessionStorage / IndexedDB / cookie change, does not authorise NullProvider default change, does not authorise marketing or pricing or public sales page work, does not authorise test weakening, does not park or freeze the project, does not claim legal or compliance or regulatory or investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. The recommendation for the next direction is the smallest controlled move from a founder-side guided first-prospect demo script toward a buyer-facing What you get value section or lightweight pricing-readiness narrative, without creating a pricing page, payment request, quote, contract, invoice, public launch, live customer data, live LLM activation, Supabase writes, or unnecessary backend expansion until those are explicitly authorised and legally/governance-cleared. Internal milestones are not success - success means the product sells.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1F-S5
Change date: 2026-05-03
Product version: 0.107.0
Methodology engine version: 0.9.1
Phase 1F Slice 5 - buyer What-you-get value section
Reason: The founder explicitly authorised a buyer-facing 'What you get' value section on the local /score/paste page (phase 1f buyer what you get value section). A first-look buyer should immediately understand what they receive from AgentProof before they run anything: a readiness score, top risks, practical next steps, a downloadable report, and a safe fictional-demo-first workflow. The section must improve buyer comprehension WITHOUT becoming a pricing page, sales page, payment request, quote, contract, invoice, public launch, or legal/compliance claim. Founder commercial standard: internal milestones are not success - success means the product sells.
What changed: The json paste score card component is extended with a new buyer-facing section data-test-id=json-paste-buyer-what-you-get that sits between the buyer hero and the buyer-facing fictional demo package. The section renders a heading 'What you get' (data-test-id=json-paste-buyer-what-you-get-title), a one-sentence intro (data-test-id=json-paste-buyer-what-you-get-intro), and five plain-English buyer-facing value points: a clear readiness score (data-test-id=json-paste-buyer-what-you-get-score: 0-100 number with a buyer-friendly rating), the top risks to review before deployment (data-test-id=json-paste-buyer-what-you-get-risks: short, prioritised list, NOT a finished risk register), practical next steps your team can discuss (data-test-id=json-paste-buyer-what-you-get-next-steps: NOT commitments, NOT contractual obligations), a downloadable report you can share internally (data-test-id=json-paste-buyer-what-you-get-report: 'Nothing leaves your machine'), and a safe fictional demo first - no real customer data required (data-test-id=json-paste-buyer-what-you-get-safe-demo: no login, no upload, no live customer data). The section is buyer-facing only and contains none of the eighteen forbidden internal terms. A new canonical evidence artefact lives at content/phase 1f buyer what you get value section.v1.json. One new test-side support module ships at an internal source file. Nine new unit tests pin shape, surface, primary-copy cleanliness, value points, runtime boundaries, decision and safety, cleanliness, determinism, and founder tools posture. NullProvider remains default. The deterministic scorecard Markdown SHA-256 remains 99fae010 across the slice. The /api/score response shape is unchanged.
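The test-id layout described above can be written down as data to show how a surface test might check it. This is an illustrative sketch: the test ids and summaries are taken from the entry, while `ValuePoint`, `SUB_TEST_IDS`, and `hasUniquePrefixedIds` are hypothetical names introduced here.

```typescript
interface ValuePoint {
  testId: string;
  summary: string;
}

const SECTION_TEST_ID = "json-paste-buyer-what-you-get";

// The five buyer-facing value points and their test ids, as listed in the entry.
const VALUE_POINTS: readonly ValuePoint[] = [
  { testId: SECTION_TEST_ID + "-score", summary: "A clear readiness score" },
  { testId: SECTION_TEST_ID + "-risks", summary: "The top risks to review before deployment" },
  { testId: SECTION_TEST_ID + "-next-steps", summary: "Practical next steps your team can discuss" },
  { testId: SECTION_TEST_ID + "-report", summary: "A downloadable report you can share internally" },
  { testId: SECTION_TEST_ID + "-safe-demo", summary: "A safe fictional demo first - no real customer data required" },
];

// Title and intro ids plus the five value-point ids.
const SUB_TEST_IDS: string[] = [SECTION_TEST_ID + "-title", SECTION_TEST_ID + "-intro"].concat(
  VALUE_POINTS.map((point) => point.testId)
);

// Hypothetical check: every id is unique and sits under the section prefix.
function hasUniquePrefixedIds(ids: readonly string[], prefix: string): boolean {
  const seen: { [id: string]: boolean } = {};
  for (const id of ids) {
    if (id.indexOf(prefix + "-") !== 0 || seen[id]) return false;
    seen[id] = true;
  }
  return true;
}

const idsAreWellFormed = hasUniquePrefixedIds(SUB_TEST_IDS, SECTION_TEST_ID);
```

Counting the title, the intro, and the five value points gives seven sub-test ids under the one top-level section id.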
User impact: A first-look buyer landing on /score/paste now sees a clean 'What you get' section right after the hero, explaining in plain English what AgentProof produces for them: a readiness score, the top risks to review, practical next steps the team can discuss this week, a downloadable report to share internally, and a safe fictional demo first - no real customer data required. The section is purely informational - no pricing, no payment request, no quote, no contract, no invoice, no submit button, no email, no CRM, no analytics, no tracking. No engine/scoring change. No new API route. No backend write. No browser-storage write. NullProvider remains default. Future product-building work remains unauthorised. Public launch, paid use, live use, live customer data, live LLM, payments, payment requests, quote/contract/invoice generation, pricing pages, and runtime boundary changes all remain unauthorised. The project is not parked or frozen. Internal milestones are not success - success means the product sells.
When to re-score: Phase 1F Slice 5 is a buyer-facing copy-only addition. No engine change, no scoring change, no rendered scorecard Markdown change. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 (99fae010) is unchanged.
Docs, Tests, Content, Methodology, Components
Evidence trace: an internal source file (buyer What-you-get section after buyer hero, before demo package), content/phase 1f buyer what you get value section.v1.json, an internal source file, tests/unit/phase_1f_buyer_what_you_get_*.test.ts (nine), docs/phase 1 d Phase 1F Slice 5 section, docs/phase plan Phase 1F Slice 5 addendum, docs/phase 1 d entry decision record Phase 1F Slice 5 addendum, README Phase 1F Slice 5 shipped row, package version bump 0.106.0 to 0.107.0, content/methodology changelog.v1.json this 97th entry, git branch phase-1d-controlled-start-entry-decision-record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1F Slice 5 internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1F Slice 5 founder execution signal
- Impact assessment
- Adds a buyer-facing What-you-get value section to the existing /score/paste self-serve scoring page. The section sits in the primary buyer surface, between the buyer hero and the buyer-facing fictional demo package, and explains in plain English what a first-look buyer receives from AgentProof. The section uses one new top-level test id and seven sub-test ids, and renders a heading, a one-sentence intro, and five buyer-facing value points. Does not change any engine output, any scoring constant, any deterministic Markdown helper body, any rendered prior-phase Markdown SHA, any deterministic filename, any prior-phase artefact, any prior-phase test that pins source-body strings, any API route count, any /api/score response shape, any storage layer, any provider default, or any deployment configuration. Ships one canonical Phase 1F Slice 5 evidence artefact at the static content layer, one test-side support module, nine new unit tests pinning shape and surface and primary-copy cleanliness and value points and runtime boundaries and decision and safety and cleanliness and determinism and founder tools posture, and three founder-facing documentation addenda. No engine behaviour change. No new API route. No Supabase write. No persistent storage write. No live LLM. No real customer data. No payment integration. No payment request. No quote generation. No contract generation. No invoice generation. No pricing page. No login required. No data-collection form, no submit button, no email send, no CRM, no sales-automation, no analytics, no tracking. 
The slice does not authorise any further work, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage / sessionStorage / IndexedDB / cookie change, does not authorise NullProvider default change, does not authorise marketing or pricing or public sales page work, does not authorise test weakening, does not park or freeze the project, does not claim legal or compliance or regulatory or investor approval, and does not replace legal review. NullProvider remains the default. Goldens unchanged. The recommendation for the next direction is the smallest controlled move from a buyer-facing What-you-get value section toward a buyer-facing Who-this-is-for / When-to-use-AgentProof section or single-page product narrative, without creating a pricing page, payment request, quote, contract, invoice, public launch, live customer data, live LLM activation, Supabase writes, or unnecessary backend expansion until those are explicitly authorised and legally/governance-cleared. Internal milestones are not success - success means the product sells.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1F-S6
Change date: 2026-05-03
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1F Slice 6 - connector-agnostic discovery first
Reason: The founder corrected the product architecture and buyer-facing flow. AgentProof must NOT be paste / import / manual-description as the primary process. AgentProof must NOT be Microsoft-only. Agents can be built in many environments; AgentProof must accommodate this by design. Core product rule: Connect read-only to the relevant agent environment, discover the agent footprint automatically, normalise it into a common AgentProof footprint model, then ask the human only real confirmation questions. The human should confirm or correct discovered facts; the human should not manually describe the agent, paste notes, or export packages as the normal buyer path; the human should mostly answer Yes / No / Not sure.
What changed: Adds a new local deterministic helper at an internal source file that exposes (1) a connector registry covering six entries (microsoft copilot studio planned, openai or custom agent framework planned, crm or service platform agent planned, cloud agent builder planned, custom internal agent planned, agentproof local demo disconnected) with a documented status (planned / demo only / disabled / live authorisation required) and with live_connection_enabled = false and requires_explicit_authorisation = true on every entry; (2) a canonical AgentProof Agent Footprint model carrying 16 documented fields (platform type, connector id, environment or workspace, agent identity, agent purpose, topics or intents, knowledge sources, actions tools or flows, data footprint, integrations or connectors, permissions or authentication, human oversight, testing signals, monitoring or logging, deployment status, owner or responsible team) with each fact's value, source (connector discovered / inferred from metadata / user confirmed / missing / demo fixture), confidence, needs confirmation, and risk relevance; (3) a build connector agnostic demo profile function that returns a fictional disconnected demo profile a buyer-side UI can render BEFORE any live connector is implemented (clearly labelled as demo / not connected); and (4) a build connector agnostic confirmation questions function that emits Yes / No / Not sure confirmation questions, defaulting to Not sure, for every fact that needs confirmation. The helper has no clock, no random, no env, no network, no LLM call, no platform connector. The json paste score card component is extended with a new buyer-facing Agent discovery section (data-test-id=json-paste-connector-discovery, with 23 sub-test ids) that sits AFTER the buyer What-you-get value section and BEFORE the fictional demo package, OUTSIDE the founder-tools details block. 
The section renders a 'Choose where your agent lives' heading, a connect-read-only paragraph, a platform <select> (six platform-type options labelled planned / not connected yet), a 'Choose environment or workspace' <input>, a 'Choose the agent to proof' <input>, and a 'Discover agent footprint' button. After the buyer clicks 'Discover agent footprint', the local disconnected demo connector populates a 'What AgentProof found' block (Agent identity, Platform / environment type, Purpose, Knowledge sources, Actions / tools / flows, Data footprint, Controls / human review, Testing / monitoring signals), an unknowns block, a 'Confirm the critical points' block (per-question Yes / No / Not sure buttons), and a status line. Adds six useState hooks (connector platform type, connector environment, connector agent, connector footprint, connector answers, connector status). NOT shipped in this slice: any live connector activation, any vendor login (Microsoft / OpenAI / Salesforce / ServiceNow / Google / AWS / custom), any file upload, any persistent storage, any real customer data, any new API route, any /api/score response shape change, any scorecard Markdown change. A new canonical evidence artefact lives at content/phase 1f connector agnostic discovery first.v1.json. One new test-side support module ships at an internal source file. Ten new unit tests pin shape, surface, flow, registry, confirmation questions, minimal-human-input + connector-agnostic principle, runtime / public / paid / legal / compliance boundaries, decision and safety, cleanliness, and determinism. NullProvider remains default. The deterministic scorecard Markdown SHA-256 remains 99fae010 across the slice. The /api/score response shape is unchanged. The fictional demo path is preserved. The buyer What-you-get value section is preserved.
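The registry and confirmation-question rules described in this entry can be sketched as below. This is an illustration under stated assumptions, not the shipped helper. The underscore spellings of the connector ids and status values, the per-entry status assignment (five planned, one demo only), the three-level confidence and risk scales, the prompt wording, and every type and function name are hypothetical. The rules that every entry has live connection disabled and explicit authorisation required, and that every question offers Yes / No / Not sure with default Not sure, come from the entry itself.

```typescript
type ConnectorStatus = "planned" | "demo_only" | "disabled" | "live_authorisation_required";
type FactSource =
  | "connector_discovered" | "inferred_from_metadata"
  | "user_confirmed" | "missing" | "demo_fixture";
type Answer = "yes" | "no" | "not_sure";
type Level = "high" | "medium" | "low"; // assumed scale, not given in the entry

interface ConnectorEntry {
  connectorId: string;
  status: ConnectorStatus;
  liveConnectionEnabled: false;        // false on every entry in this slice
  requiresExplicitAuthorisation: true; // true on every entry in this slice
}

interface FootprintFact {
  field: string;          // one of the 16 footprint fields, e.g. "knowledge sources"
  value: string | null;   // null when nothing was discovered
  source: FactSource;
  confidence: Level;
  needsConfirmation: boolean;
  riskRelevance: Level;
}

interface ConfirmationQuestion {
  field: string;
  prompt: string;
  options: readonly Answer[];
  defaultAnswer: Answer;
}

// Assumed id spellings; the entry lists the six names without separators.
const CONNECTOR_REGISTRY: readonly ConnectorEntry[] = [
  "microsoft_copilot_studio_planned",
  "openai_or_custom_agent_framework_planned",
  "crm_or_service_platform_agent_planned",
  "cloud_agent_builder_planned",
  "custom_internal_agent_planned",
  "agentproof_local_demo_disconnected",
].map((connectorId): ConnectorEntry => ({
  connectorId,
  status: connectorId === "agentproof_local_demo_disconnected" ? "demo_only" : "planned",
  liveConnectionEnabled: false,
  requiresExplicitAuthorisation: true,
}));

const ANSWER_OPTIONS: readonly Answer[] = ["yes", "no", "not_sure"];

// Pure function: one question per fact that needs confirmation, in input
// order, always defaulting to "not_sure". No clock, random, env, or network.
function buildConfirmationQuestions(facts: readonly FootprintFact[]): ConfirmationQuestion[] {
  return facts
    .filter((fact) => fact.needsConfirmation)
    .map((fact): ConfirmationQuestion => ({
      field: fact.field,
      prompt: fact.value === null
        ? `Nothing was found for ${fact.field}. Is it in place?`
        : `Found for ${fact.field}: ${fact.value}. Is this correct?`,
      options: ANSWER_OPTIONS,
      defaultAnswer: "not_sure",
    }));
}
```

Typing the two flags as the literals `false` and `true` means a registry entry with a live connection enabled would not compile, which mirrors the boundary the entry describes.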
User impact: Instead of pasting agent material or filling a long human questionnaire, the buyer chooses where the agent lives (Microsoft / Copilot Studio, OpenAI / custom agent framework, CRM or service platform agent, cloud agent builder, custom / internal agent, or other - all currently labelled planned / not connected yet), AgentProof connects read-only (in this slice, only the local disconnected demo connector runs), AgentProof discovers the agent footprint and shows a What AgentProof found block, then asks the buyer a small set of Yes / No / Not sure confirmation questions covering only the missing or risky points. AgentProof is NOT Microsoft-only and NOT paste-only. No live system is contacted in this slice; no credentials are requested; no network call is made; no real customer data is required; no file upload is required; no persistent storage is added. NullProvider remains default. Future product-building work remains unauthorised. Public launch, paid use, live use, live customer data, live LLM, live connectors, payments, payment requests, quote/contract/invoice generation, pricing pages, and runtime boundary changes all remain unauthorised. The project is not parked or frozen. Internal milestones are not success - success means the product sells. UI product approval still requires founder review; technical tests alone do not approve the UI.
When to re-score: Phase 1F Slice 6 is a buyer-facing connector-agnostic UI shell + local disconnected demo connector. No engine change, no scoring change, no rendered scorecard Markdown change. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 (99fae010) is unchanged.
Docs, Tests, Content, Methodology, Components, Lib
Evidence trace: an internal source file (new pure deterministic helper); an internal source file (Agent discovery section AFTER buyer What-you-get, BEFORE demo package, OUTSIDE founder tools); content/phase 1f connector agnostic discovery first.v1.json; an internal source file; tests/unit/phase_1f_connector_agnostic_discovery_*.test.ts (ten); docs/phase 1 d Phase 1F Slice 6 section; docs/phase plan Phase 1F Slice 6 addendum; docs/phase 1 d entry decision record Phase 1F Slice 6 addendum; README Phase 1F Slice 6 shipped row; package version bump 0.107.0 to 0.108.0; content/methodology changelog.v1.json this 98th entry; git branch phase-1d-controlled-start-entry-decision-record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1F Slice 6 internal methodology record (connector-agnostic correction)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1F Slice 6 founder execution signal (connector-agnostic discovery-first product correction; supersedes the prior paste-based Slice 6 instruction)
- Impact assessment
- Corrects the buyer-facing product architecture toward a connector-agnostic discovery-first flow. AgentProof must connect read-only to the relevant agent environment, discover the agent footprint automatically, normalise the findings into a canonical AgentProof Agent Footprint, and ask the buyer only real Yes / No / Not sure confirmation questions. Adds a connector registry covering six categories (Microsoft / Copilot Studio, OpenAI / custom agent framework, CRM or service platform agent, cloud agent builder, custom / internal agent, and an AgentProof local demo connector for UI testing) with status / capability flags / live_connection_enabled false / requires_explicit_authorisation true on every entry. Adds a canonical AgentProof Agent Footprint model carrying 16 documented fields. Adds a buyer-facing Agent discovery section to the primary buyer surface, AFTER the buyer What-you-get value section and BEFORE the fictional demo package, OUTSIDE the founder-tools details block. The section uses 24 documented test ids (one top-level, 23 sub-anchors). Confirmation buttons offer exactly Yes / No / Not sure with default Not sure. The buyer-facing UI explicitly avoids paste / import / upload / questionnaire / manual-description primary phrasing and explicitly avoids Microsoft-only / vendor-lock-in phrasing. Does not change any engine output, any scoring constant, any deterministic Markdown helper body, any rendered prior-phase Markdown SHA, any deterministic filename, any prior-phase artefact, any API route count, any /api/score response shape, any storage layer, any provider default, or any deployment configuration. 
Ships one canonical Phase 1F Slice 6 evidence artefact, one test-side support module, ten new unit tests pinning shape and surface and flow and registry and confirmation questions and minimal-human-input + connector-agnostic principle and runtime boundaries and decision and safety and cleanliness and determinism, and three founder-facing documentation addenda. No engine behaviour change. No new API route. No Supabase write. No persistent storage write. No live LLM call. No live platform connector. No live Microsoft / OpenAI / Salesforce / ServiceNow / Google / AWS / custom login. No file upload. No real customer data. No payment integration. No payment request. No quote generation. No contract generation. No invoice generation. No pricing page. No public sales page. No login required. No data-collection form, no submit button, no email send, no CRM, no sales-automation, no analytics, no tracking. The slice does not authorise any further work, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise live platform connectors, does not authorise payments, does not authorise Supabase change, does not authorise new API route, does not authorise localStorage / sessionStorage / IndexedDB / cookie change, does not authorise NullProvider default change, does not authorise marketing or pricing or public sales page work, does not authorise test weakening, does not park or freeze the project, does not claim legal or compliance or regulatory or investor approval, and does not replace legal review. NullProvider remains default. Goldens unchanged. UI product approval still requires founder review; technical tests alone do not approve the UI. 
The recommendation for the next direction is the smallest controlled move from a connector-agnostic discovery-first UI shell + local disconnected demo connector toward one of: design the first real read-only connector contract, with Microsoft / Power Platform / Copilot Studio as the first candidate because the founder has a real agent there; implement a controlled local connector interface boundary with no live auth; or prepare a read-only connector prioritisation matrix across agent platform categories. All three are unauthorised until separately instructed. Internal milestones are not success - success means the product sells.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The previous Phase 1F Slice 6 (paste-based agent-discovery) is superseded by this entry; that direction was not built.
- Re-score optional
Change id: 1G-S1
Change date: 2026-05-03
Product version: 0.167.1
Methodology engine version: 0.9.1
Phase 1G - real Microsoft / Copilot Studio read-only discovery
Reason: The founder reviewed the prior Slice 6 connector-agnostic UI and declined to mark it product-approved because (a) it still asked the buyer to type an environment / workspace label, (b) it still asked the buyer to type an agent name / identifier, (c) it still relied on a local fictional / demo connector, and (d) nothing real was discovered. The product must not ask the customer to do AgentProof's work. Phase 1G removes manual environment and agent entry from the primary buyer path, removes fictional discovery from the primary buyer path, and ships the first real Microsoft / Copilot Studio read-only discovery boundary while preserving the connector-agnostic architecture. Microsoft is the first real connector because the founder has a real agent there; the platform <select> still exposes all six categories.
What changed: Adds six new server-side connector libs at lib/connectors/microsoft/* (microsoft auth config, microsoft connector errors, microsoft power platform client, microsoft dataverse client, microsoft copilot studio discovery, microsoft to canonical footprint). microsoft auth config reads Microsoft Entra ID app registration values from environment variables and reports setup needed when env vars are missing. microsoft connector errors carries the typed error model (setup needed, auth failed, permission insufficient, rate limited, upstream unavailable, metadata only assumption unconfirmed, internal error). microsoft power platform client wraps the documented Power Platform admin endpoint GET https://api.powerplatform.com/environmentmanagement/environments?api-version=2022-03-01-preview and normalises the OData envelope into platform-neutral environment objects. microsoft dataverse client wraps the documented Copilot Studio metadata sets /api/data/v9.2/bots and /api/data/v9.2/botcomponents and produces classified component buckets (topics or intents, knowledge sources, actions tools or flows, integrations or connectors, permissions or authentication, human oversight, testing signals, monitoring or logging, unmapped component kinds). microsoft copilot studio discovery orchestrates the read-only discovery flow as a setup needed-first server-side boundary that gates on auth before any network call. microsoft to canonical footprint is a pure deterministic transform (no clock / no random / no env / no fetch / no LLM) that maps a discovered Copilot Studio agent + components + environment into the canonical AgentProof Agent Footprint with documented connector id microsoft copilot studio real read only. Adds five new API routes under /api/connectors/microsoft/* (auth/start, auth/callback, environments, agents, footprint) - the API route count moves from 8 to 13. 
Each route returns either a setup needed envelope (503) when env vars are missing, an auth required envelope (401) when env vars are present but no live server-side token has been wired up yet, or a typed error response. Tokens never appear in any response body. The auth/start route returns ONLY an authorization url for the documented Entra ID OAuth authorize endpoint; the auth/callback route returns a callback received envelope without exchanging the code for a token in this slice. The json paste score card component is updated: the Phase 1F Slice 6 paste-based / manual-environment / manual-agent UI is replaced with the Phase 1G real Microsoft flow. The platform <select> still exposes all six platform type categories; Microsoft / Copilot Studio is labelled 'first real connector' and the other five remain 'planned, not connected yet'. When Microsoft is selected, the section renders a Connect Microsoft button (POST /api/connectors/microsoft/auth/start, which redirects to the authorization url or shows setup-needed), a disabled environment <select> placeholder, a disabled agent <select> placeholder, and a disabled Discover agent footprint button. When a non-Microsoft platform is selected, the section shows a 'This platform is planned but not connected yet' card with no inputs. The useState count goes from 18 to 21. Adds eight new Phase 1G unit tests covering auth gating, no manual entry on the primary buyer path, environment discovery, agent discovery, footprint mapping, runtime / token / persistence boundaries, the UI review gate, and decision and safety, plus one new test-side support module. .env.example carries placeholder names for the six required Microsoft env vars; never real values. NullProvider remains default. The deterministic scorecard Markdown SHA-256 stays at 99fae010 across the phase. The /api/score response shape is unchanged.
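The setup-needed-first gating described above can be sketched as a small pure function. This is a sketch under assumptions, not the shipped route code: the envelope field names, the function name `gateConnectorRoute`, the decision to list missing variable names in the 503 body, and the `null` return for "gate passed" are all hypothetical. The required variable names are passed in as a parameter because the entry does not name the six Microsoft env vars.

```typescript
type Env = Readonly<Record<string, string | undefined>>;

// Hypothetical envelope shapes for the two gating outcomes named in the entry.
type GateEnvelope =
  | { httpStatus: 503; body: { status: "setup_needed"; missing: string[] } }
  | { httpStatus: 401; body: { status: "auth_required" } };

// Checks configuration first, then authentication, before any network call.
// Returns null when both checks pass, meaning the caller may proceed.
// No token value is ever placed in an envelope.
function gateConnectorRoute(
  env: Env,
  requiredVars: readonly string[],
  hasServerSideToken: boolean,
): GateEnvelope | null {
  const missing = requiredVars.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
  if (missing.length > 0) {
    return { httpStatus: 503, body: { status: "setup_needed", missing } };
  }
  if (!hasServerSideToken) {
    return { httpStatus: 401, body: { status: "auth_required" } };
  }
  return null;
}
```

Checking configuration before authentication keeps a misconfigured local setup from ever reaching the auth step, which matches the rule that missing config shows a setup-needed card instead of falling back to fictional discovery.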
User impact: On the primary buyer path the buyer no longer types an environment / workspace label, no longer types an agent name / identifier, and is no longer offered fictional / demo discovery as the primary path. Instead the buyer chooses where the agent lives via the platform <select>, clicks Connect Microsoft, and (when Microsoft env vars are configured) is redirected to login.microsoftonline.com to complete a real Microsoft Entra ID OAuth authorisation code flow against AgentProof's Power Platform read scope and Dataverse read scope. AgentProof reads only the documented metadata / configuration components needed to fill the canonical AgentProof Agent Footprint - no business records are read. When Microsoft env vars are missing locally, the UI shows a clear 'Microsoft connection setup needed' card and does NOT fall back to fictional discovery. Tokens never reach the browser. Tokens are never persisted (no localStorage / sessionStorage / IndexedDB / cookie / Supabase write). Tokens are never logged. The fictional demo package on the buyer hero / demo-package surface is preserved unchanged so the founder can still demo without a real Microsoft tenant. Future product-building work remains unauthorised. Public launch, paid use, live use, live customer data, live LLM, live token exchange in this slice, payments, payment requests, quote/contract/invoice generation, pricing pages, and runtime boundary changes all remain unauthorised. The project is not parked or frozen. UI product approval still requires founder review; technical tests alone do not approve the UI. Internal milestones are not success - success means the product sells.
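The redirect to login.microsoftonline.com mentioned above targets the Microsoft identity platform v2.0 authorize endpoint. A minimal sketch of building that URL for an authorisation code flow follows. The function and parameter names are hypothetical, and the scope strings are supplied by the caller because the entry does not state the exact Power Platform and Dataverse scope values. The query parameter names (`client_id`, `response_type`, `redirect_uri`, `response_mode`, `scope`, `state`) are the standard OAuth 2.0 ones this endpoint accepts.

```typescript
interface AuthorizeParams {
  tenantId: string;          // Entra ID tenant id or domain
  clientId: string;          // app registration (client) id
  redirectUri: string;       // must match a redirect URI on the app registration
  scopes: readonly string[]; // read-only scopes; exact values are not given in the entry
  state: string;             // opaque anti-CSRF value generated server-side
}

// Builds the authorize URL only. No secret and no token is involved at this
// step, so the result is safe to return to the browser for the redirect.
function buildAuthorizationUrl(params: AuthorizeParams): string {
  const query: ReadonlyArray<readonly [string, string]> = [
    ["client_id", params.clientId],
    ["response_type", "code"],
    ["redirect_uri", params.redirectUri],
    ["response_mode", "query"],
    ["scope", params.scopes.join(" ")],
    ["state", params.state],
  ];
  const queryString = query
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  const tenant = encodeURIComponent(params.tenantId);
  return `https://login.microsoftonline.com/${tenant}/oauth2/v2.0/authorize?${queryString}`;
}
```

The code-for-token exchange that normally follows the callback is deliberately absent here, in line with the entry's statement that no live token exchange happens in this slice.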
When to re-score: Phase 1G is a server-side connector boundary + buyer-facing UI replacement. No engine change, no scoring change, no rendered scorecard Markdown change. Existing scorecards remain valid. The deterministic scorecard Markdown SHA-256 (99fae010) is unchanged.
Docs, Tests, Content, Methodology, Components, Lib, Api
Evidence trace: an internal source file; an internal source file; an internal source file; an internal source file; an internal source file; an internal source file; an internal source file; an internal source file; an internal source file; an internal source file; an internal source file; an internal source file (Phase 1G real Microsoft section replaces Slice 6 paste-style section); content/phase 1g real microsoft read only discovery.v1.json; an internal source file; tests/unit/phase_1g_real_microsoft_*.test.ts (eight); .env.example (placeholders); docs/phase 1 d Phase 1G section; docs/phase plan Phase 1G addendum; docs/phase 1 d entry decision record Phase 1G addendum; README Phase 1G shipped row; package version bump 0.108.0 to 0.109.0; content/methodology changelog.v1.json this 99th entry; git branch phase-1d-controlled-start-entry-decision-record.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G internal methodology record (real Microsoft / Copilot Studio read-only discovery)
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-03
- Reference
- Phase 1G founder execution signal (real Microsoft / Power Platform / Copilot Studio read-only discovery; supersedes Phase 1F Slice 6 connector-agnostic UI as product-approved)
- Impact assessment
- Replaces the rejected Phase 1F Slice 6 connector-agnostic paste-style UI with the first real Microsoft connector implementation while preserving connector-agnostic architecture. The buyer chooses where the agent lives via the platform selector, clicks Connect Microsoft, AgentProof completes a real Microsoft Entra ID OAuth authorisation code flow at runtime when env vars are configured, AgentProof discovers the Power Platform environments the buyer can read against the documented Power Platform admin endpoint, the buyer selects an environment, AgentProof discovers the Copilot Studio agents in that environment via the documented Dataverse Web API /api/data/v9.2/bots metadata set, the buyer selects an agent, AgentProof reads only the documented metadata / configuration components via /api/data/v9.2/botcomponents, and the discovered footprint is mapped via a pure deterministic transform into the canonical AgentProof Agent Footprint. Adds six new server-side connector libs and five new API routes under /api/connectors/microsoft/* (the API route count moves from 8 to 13). The eight new unit tests cover auth gating, no manual entry on the primary path, environment discovery, agent discovery, footprint mapping, runtime / token / persistence / public-launch boundaries, the UI review gate, and decision / synthetic-mutation safety. Tokens never reach the browser; tokens are never persisted; tokens are never logged. No business records are read. Missing config shows a clear setup-needed card and does NOT fall back to fictional discovery. The fictional demo path on the buyer hero / demo-package surface is preserved so the founder can still demo without a real Microsoft tenant. Does not change any engine output, any scoring constant, any deterministic Markdown helper body, any rendered prior-phase Markdown SHA, any deterministic filename, any prior-phase artefact, any /api/score response shape, any storage layer, any provider default, or any deployment configuration. 
Ships one canonical Phase 1G evidence artefact, one test-side support module, eight new unit tests, and three founder-facing documentation addenda. No engine behaviour change. No Supabase write. No persistent storage write. No live LLM call. No live token exchange in this slice. No live Microsoft sign-in performed in this slice. No file upload. No real customer data. No payment integration. No payment request. No quote generation. No contract generation. No invoice generation. No pricing page. No public sales page. No data-collection form, no submit button, no email send, no CRM, no sales-automation, no analytics, no tracking. The phase does not authorise any further work, does not authorise public launch, does not authorise paid use, does not authorise live customer data, does not authorise live scans, does not authorise live LLM, does not authorise live token exchange, does not authorise payments, does not authorise Supabase change, does not authorise localStorage / sessionStorage / IndexedDB / cookie change, does not authorise NullProvider default change, does not authorise marketing or pricing or public sales page work, does not authorise test weakening, does not park or freeze the project, does not claim legal or compliance or regulatory or investor approval, and does not replace legal review. NullProvider remains default. Goldens unchanged. UI product approval still requires founder review; technical tests alone do not approve the UI. The recommendation for the next direction is the smallest controlled move from Phase 1G read-only discovery boundaries toward implementing the live Microsoft token exchange + live /environmentmanagement/environments + /api/data/v9.2/bots reads, scoped to read-only metadata. That work requires a separate explicit founder execution instruction and a configured Microsoft app registration. Internal milestones are not success - success means the product sells.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only. The previous Phase 1F Slice 6 (connector-agnostic paste-style UI) is superseded by this entry as product-approved; the prior architecture remains in code only as the connector_agnostic_discovery_profile helper which is now consumed by the Phase 1G mapper to produce canonical AgentProof Agent Footprint shapes.
- Re-score optional
Change id: 3B
Change date: 2026-05-02
Product version: 0.72.0
Methodology engine version: 0.9.1
Phase 1D first-slice dry-run execution packet
Reason: Phase 3A closed the implementation branch preparation planning loop. Phase 3B closes the first-slice dry-run execution packet loop with one calm deterministic packet for the first Phase 1D slice only that defines exactly what evidence and acknowledgements must be present immediately before the first Phase 1D slice may be executed in the future. The packet consumes the Phase 3A implementation branch preparation plan, the Phase 2R implementation sequence manifest, the Phase 2Q scope boundary manifest, the Phase 2P dry-run readiness pack index, the Phase 2Z pre-branch final freeze certificate skeleton, and the Phase 2Y real-authorisation packet skeleton. The canonical packet remains a dry run pending review and never executes the first slice and never creates a real branch and never starts Phase 1D.
What changed: Phase 3B is a docs-side and test-side planning phase. No product features were added. A new deterministic Phase 1D first-slice dry-run execution packet at the static content layer defines first-slice identity, pre-execution preconditions, allowed and forbidden file-family plans, required reviewer approvals, dry-run command sequence (proposed only), evidence slots before execution, rollback checks, no-secret checks, stop conditions, synthetic execution rules, legal and governance notes, and safety markers. The packet top-level shape adds a packet version, a product version that matches the package version after the Phase 3B bump, a generated-from-phase identifier of three-B, a packet type that explicitly marks the file as a Phase 1D first-slice dry-run execution packet, a dry-run pending-review packet status, a calm description, a calm reviewer-facing packet purpose, a latest-approved-archive block carrying the verified Phase 3A archive evidence, a source-plan reference to the Phase 3A implementation branch preparation plan, a source-sequence reference to the Phase 2R implementation sequence manifest, a first-slice identity block referencing the Phase 2R first slice with execution status not executed and completion status not completed and implementation status not started, a pre-execution preconditions block, an allowed file-family plan block, a forbidden file-family plan block, a required-reviewer-approvals block in stable order with every approval defaulting to pending review and blocking execution if incomplete, a dry-run command sequence block of proposed commands all defaulting to not executed, an evidence-slots-before-execution block, a rollback checks block, a no-secret checks block, a stop conditions block, a synthetic execution rules block, a legal and governance notes block, and a safety markers block. Every reviewer approval defaults to pending review and blocks execution if incomplete. Every dry-run command step defaults to not executed. 
Every evidence slot defaults to a non-executing pending state and blocks execution if empty. Every rollback check defaults to a non-executing pending state and blocks execution if missing. Every no-secret check is required and blocks execution if failed. Every stop condition blocks first-slice execution with a deterministic failure reason. The synthetic execution rules block proves the canonical packet cannot create a real branch, cannot start Phase 1D, cannot execute the first slice, and cannot mark the first slice complete. The safety markers block pins twenty-four documented dry-run-only and not-Phase-1D and does-not-create-real-branch and does-not-execute-first-slice and does-not-mark-first-slice-complete and NullProvider and no-real-customer and canonical-pending-review flags. The packet does not authorise real Phase 1D work, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not mark the first slice complete, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not implement any slice, does not run commands, does not change runtime behaviour, does not call Supabase, does not write to local storage, and does not replace legal review. 
One new test-side support module ships alongside the packet: a deterministic helper that loads the dry-run execution packet and every referenced Phase 2P, Phase 2Q, Phase 2R, Phase 2Y, Phase 2Z, and Phase 3A artefact, computes deterministic SHA-256 hashes via the standard hash module, validates packet source integrity, validates first-slice identity, validates pre-execution preconditions, validates allowed and forbidden file-family plans, validates required reviewer approvals, validates the dry-run command sequence is not executed, validates evidence slots, validates rollback checks, validates no-secret checks, validates stop conditions, validates synthetic execution rules, validates legal and governance notes, validates the canonical packet status remains a dry run pending review, exposes documented mutated synthetic fixtures proving the first slice cannot be executed without complete evidence, and renders a calm founder-facing Markdown dry-run execution packet. The helper never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses date or random or environment or local-time references. Ten new unit tests pin the packet end-to-end. The shape test asserts the documented top-level shape and every per-field shape rule. The source integrity test asserts every recorded source hash matches the on-disk source content body and documented synthetic mutations each fail with a deterministic reason. The slice identity test asserts the first slice exactly matches the documented Phase 2R slice index zero and is not marked started or executed or completed. The file-family test asserts the allowed and forbidden family plans match the Phase 2R first slice and block runtime, API, Supabase, provider, scoring, receipt, payment, analytics, deployment, and user-interface changes. The review evidence test asserts every reviewer approval and evidence slot is required and blocking. 
The command rollback secret test asserts the command sequence is not executed, rollback checks are blocking, and no-secret checks are complete. The safety test asserts the canonical packet cannot create a branch, cannot start Phase 1D, cannot execute the first slice, and cannot mark the first slice complete. The Markdown rendering test asserts the rendered Markdown carries the documented title, the first-slice dry-run execution packet marker, the dry run only marker, the not-a-real-authorisation marker, the latest Phase 3A archive hash, the latest test totals, the first slice identity, the allowed and forbidden file families, the no-secret checks, the rollback checks, the reviewer approvals, the stop conditions, and the legal and governance notes, and is byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value through the central forbidden-wording registry. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, that reviewer approvals appear in the documented stable order, and no clock or random or environment or absolute-path or runtime-secret reference in the packet content file. New founder-facing documents at the docs layer add a Phase 3B document and a short addendum to the Phase 1D entry decision-record document explaining where to find the first-slice dry-run execution packet. 
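The determinism checks above (byte-stable serialisation, and no clock, random, or environment reference in the content file) can be sketched as follows. The sorted-key serialiser and the pattern list are assumptions for illustration; the change log does not say how the real test implements either check.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialises with object keys in sorted order, so two objects with the same
// content but different key insertion order produce identical bytes.
function stableStringify(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  const entries = Object.entries(value).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  const parts = entries.map(([key, item]) => `${JSON.stringify(key)}:${stableStringify(item)}`);
  return `{${parts.join(",")}}`;
}

// Hypothetical patterns standing in for clock, random, and environment references.
const forbiddenPatterns: readonly RegExp[] = [/Date\.now/, /Math\.random/, /process\.env/];

// Returns the source text of every forbidden pattern found in the content.
function findForbiddenReferences(content: string): string[] {
  return forbiddenPatterns
    .filter((pattern) => pattern.test(content))
    .map((pattern) => pattern.source);
}
```

Comparing `stableStringify` output across two consecutive calls, and across a re-read of the same file, is one way to express the byte-stability assertion.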
To stay in sync with the package version, the product versions of the following artefacts are all bumped to zero point seventy-two point zero: the audit bundle manifest, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report (regenerated under the new product version with refreshed Phase 3A archive evidence and refreshed test totals), the synthetic Verification Journal entry (with refreshed source-artefact hash), the readiness pack index (with refreshed per-artefact source-hash fields and refreshed latest-approved-archive to Phase 3A values), the scope boundary manifest, the sequence manifest, the no-regression acceptance pack, the audit handoff pack (with refreshed per-pointer source-hash fields and refreshed latest-approved-archive to Phase 3A values), the final pre-authorisation checklist (with refreshed summary-block snapshot hashes and refreshed latest-approved-archive to Phase 3A values), the launch-readiness verification report (with refreshed evidence summary upstream snapshot hashes and refreshed latest-approved-archive to Phase 3A values), the frozen baseline manifest (with refreshed content artefact hashes and refreshed latest-approved-archive to Phase 3A values), the authorisation rehearsal record (with refreshed evidence summary upstream snapshot hashes and refreshed latest-approved-archive to Phase 3A values), the real-authorisation packet skeleton (with refreshed evidence source hashes and refreshed latest-approved-archive to Phase 3A values), the pre-branch final freeze certificate skeleton (with refreshed evidence reference hashes and refreshed latest-approved-archive to Phase 3A values), and the Phase 3A implementation branch preparation plan.
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the friendly-label registry, the standard-copy registry, the fuzzer semantics, the contract-shape fixtures and tests, the migration rehearsal tests, the deprecation registry and optional-field coverage and future-schema-addition checklist, the audit bundle manifest, the owner map and orphan-surface guard, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report builder and renderer, the entry helper and Markdown and CSV renderers, the readiness pack index helper and Markdown renderer, the scope boundary manifest helper and Markdown renderer, the sequence manifest helper and Markdown renderer, the no-regression acceptance pack helper and Markdown renderer, the audit handoff pack helper and Markdown renderer, the final pre-authorisation checklist helper and Markdown renderer, the launch-readiness verification report helper and Markdown renderer, the frozen baseline manifest helper and Markdown renderer, the authorisation rehearsal record helper and Markdown renderer, the real-authorisation packet skeleton helper and Markdown renderer, the pre-branch final freeze certificate skeleton helper and Markdown renderer, the implementation branch preparation plan helper and Markdown renderer, and the approved sample outcomes are all unchanged. NullProvider remains the default. The thirteen scorecard golden Markdown snapshots remain byte-identical.
User impact: Founders see no on-screen change. The hardening adds the auditable first-slice dry-run execution packet that defines what evidence and acknowledgements must be present immediately before the first Phase 1D slice may be executed in the future. The canonical packet remains a dry run pending review; it never executes the first slice, never creates a real branch, and never starts Phase 1D. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 3B is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 3B authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3B internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-02
- Reference
- Phase 3B authorisation
- Impact assessment
- Adds a Phase 1D first-slice dry-run execution packet at the static content layer, one test-side support module (a deterministic helper that loads the packet and every referenced Phase 2P, Phase 2Q, Phase 2R, Phase 2Y, Phase 2Z, and Phase 3A artefact, computes deterministic SHA-256 hashes, validates packet source integrity and first-slice identity and pre-execution preconditions and allowed and forbidden file-family plans and required reviewer approvals and the dry-run command sequence not-executed state and evidence slots and rollback checks and no-secret checks and stop conditions and synthetic execution rules and legal and governance notes and the canonical dry-run pending-review state, exposes documented mutated synthetic fixtures, and renders a calm founder-facing Markdown dry-run execution packet), and ten new unit tests pinning shape and source integrity and slice identity and file family and review evidence and command rollback secret and safety and Markdown rendering and cleanliness and determinism, and one founder-facing prose document at the docs layer. No product features. No runtime behaviour change. The packet does not authorise any real Phase 1D work, does not create a branch, does not execute the first slice, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. 
The thirteen scorecard golden Markdown snapshots, the export-hash snapshots, the full-matrix snapshots, the fuzzing harness, the contract-shape fixtures and tests, the migration rehearsal tests, the deprecation and optional-field checklist, the audit bundle manifest, the owner map and orphan-surface guard, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report (regenerated under the new product version), the synthetic Verification Journal entry (with refreshed source-artefact hash), the readiness pack index (with refreshed per-artefact source-hash fields), the scope boundary manifest, the sequence manifest, the no-regression acceptance pack, the audit handoff pack (with refreshed per-pointer source-hash fields), the final pre-authorisation checklist (with refreshed summary-block snapshot hashes), the launch-readiness verification report (with refreshed evidence summary upstream snapshot hashes), the frozen baseline manifest (with refreshed content artefact hashes), the authorisation rehearsal record (with refreshed evidence summary upstream snapshot hashes), the real-authorisation packet skeleton (with refreshed evidence source hashes), the pre-branch final freeze certificate skeleton (with refreshed evidence reference hashes), the implementation branch preparation plan, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 3C
Change date: 2026-05-02
Product version: 0.73.0
Methodology engine version: 0.9.1
Phase 1D first-slice real-authorisation evidence lock
Reason: Phase 3B closed the first-slice dry-run execution packet loop. Phase 3C closes the first-slice real-authorisation evidence lock loop with one calm deterministic evidence lock, for the first Phase 1D slice only, that defines exactly what manual evidence must be completed before the first Phase 1D slice may ever be executed in the future. The lock consumes the Phase 3B first-slice dry-run execution packet, the Phase 3A implementation branch preparation plan, the Phase 2Z pre-branch final freeze certificate skeleton, and the Phase 2Y real-authorisation packet skeleton. The canonical lock remains an evidence lock pending review: it never authorises Phase 1D, never creates a real branch, never executes the first slice, and never completes the first slice.
What changed: Phase 3C is a docs-side and test-side planning phase. No product features were added. A new deterministic Phase 1D first-slice real-authorisation evidence lock at the static content layer defines first-slice identity, manual authorisation fields, final freeze certificate completion fields, real-authorisation packet completion fields, go/no-go decision fields, rollback evidence fields, unresolved-risk acceptance fields, legal and governance state fields, data-handling acknowledgement fields, reviewer and witness acknowledgement fields, execution gate decision matrix, incomplete evidence blocking rules, synthetic authorisation rules, no-secret checks, stop conditions, legal and governance notes, and safety markers.
The lock top-level shape adds:
- a lock version;
- a product version that matches the package version after the Phase 3C bump;
- a generated-from-phase identifier of three-C;
- a lock type that explicitly marks the file as a Phase 1D first-slice real-authorisation evidence lock;
- an evidence-lock-pending-review lock status;
- a calm description and a calm reviewer-facing lock purpose;
- a latest-approved-archive block carrying the verified Phase 3B archive evidence;
- a source-packet reference to the Phase 3B first-slice dry-run execution packet;
- a source-plan reference to the Phase 3A implementation branch preparation plan;
- a source-freeze-certificate reference to the Phase 2Z pre-branch final freeze certificate skeleton;
- a source-real-authorisation-packet reference to the Phase 2Y real-authorisation packet skeleton;
- a first-slice identity block referencing the same first slice as Phase 3B, with execution status not executed, completion status not completed, implementation status not started, and authorisation status not authorised;
- manual authorisation fields covering founder full name, role, decision date, explicit authorisation statement, authorised slice identity, authorised branch name, authorised scope summary, acknowledged archive evidence, acknowledged test totals, acknowledged API route count, acknowledged no runtime change, acknowledged no secret exposure, accepted unresolved risks, founder signature, founder signature timestamp, and final authorisation status;
- final freeze certificate completion fields covering source hash, completion status, reviewer, review date, archive sha256, no drift confirmation, no secret confirmation, no runtime change confirmation, signature, and timestamp;
- real-authorisation packet completion fields covering source hash, completion status, reviewer, review date, scope confirmation, branch confirmation, rollback confirmation, no secret confirmation, no runtime change confirmation, signature, and timestamp;
- go/no-go decision fields covering decision, decision owner, decision date, decision basis, accepted risks reference, blocker review summary, evidence completeness confirmation, rollback confirmation, test green confirmation, and final go status;
- rollback evidence fields covering rollback owner, rollback plan reference, rollback latest approved archive, rollback revert first slice changes, rollback failed gauntlet action, rollback no data migration required, rollback no Supabase change required, rollback no API change required, rollback reviewer acknowledgement, and rollback status;
- unresolved-risk acceptance fields covering risk register reviewed, accepted risk items, rejected risk items, risk acceptance owner, risk acceptance date, risk acceptance reason, risk acceptance signature, and risk acceptance status;
- legal and governance state fields covering legal review required and status, governance review required and status, regulatory review required and status, paid or public use blocked until review, not legal advice acknowledged, not regulatory approval acknowledged, and final governance state;
- data-handling acknowledgement fields covering no real customer data used, no sensitive customer data used, no runtime secret used, no service role key used, no live provider call used, no automation-crawl used, data handling reviewer, review date, signature, and status;
- reviewer and witness acknowledgement fields covering primary reviewer name, role, decision, signature, and timestamp; witness name, role, signature, and timestamp; and acknowledgement status;
- an execution gate decision matrix in stable order with sixteen gates and a canonical can-execute-first-slice value of false;
- incomplete evidence blocking rules covering eighteen documented blockers;
- synthetic authorisation rules proving the canonical lock cannot authorise Phase 1D, cannot start Phase 1D, cannot create a real branch, cannot execute the first slice, and cannot complete the first slice;
- no-secret checks covering nineteen documented exclusion checks;
- stop conditions covering twenty-two documented blocking conditions;
- legal and governance notes covering nine calm disclaimers;
- safety markers covering twenty-seven documented flags of the evidence-lock-only, not-Phase-1D, does-not-create-real-branch, does-not-execute-first-slice, does-not-complete-first-slice, paid-or-public-use-not-authorised, NullProvider, no-real-customer, and canonical-pending-review kinds.
Every manual evidence field defaults to pending review, not filled, not signed, or not authorised. Every reviewer approval defaults to pending and blocks execution if missing. Every gate row defaults to pending, not met, or false for anything requiring manual evidence. Every blocking rule has a deterministic failure reason.
The lock does not authorise real Phase 1D work, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not implement any slice, does not run commands, does not change runtime behaviour, does not call Supabase, does not write to local storage, and does not replace legal review. One new test-side support module ships alongside the lock: a deterministic helper that loads the evidence lock and every referenced Phase 3B, Phase 3A, Phase 2Z, and Phase 2Y artefact, computes deterministic SHA-256 hashes via the standard hash module, validates lock source integrity, validates first-slice identity, validates manual authorisation fields, validates final freeze certificate completion fields, validates real-authorisation packet completion fields, validates go/no-go decision fields, validates rollback evidence fields, validates unresolved-risk acceptance fields, validates legal and governance state fields, validates data-handling acknowledgement fields, validates reviewer and witness acknowledgement fields, validates the execution gate decision matrix, validates incomplete evidence blocking rules, validates synthetic authorisation rules, validates no-secret checks, validates stop conditions, validates legal and governance notes, validates the canonical lock status remains an evidence lock pending review, exposes documented mutated synthetic fixtures proving the first slice cannot be authorised or executed without complete evidence, and renders a calm founder-facing Markdown evidence lock. The helper never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses date or random or environment or local-time references. Eleven new unit tests pin the lock end-to-end. The shape test asserts the documented top-level shape and every per-field shape rule. 
The source integrity test asserts every recorded source hash matches the on-disk source content body and documented synthetic mutations each fail with a deterministic reason. The slice identity test asserts the first slice exactly matches Phase 3B and is not marked started or executed or completed or authorised. The manual authorisation test asserts every required founder field is present and blocking and not filled in the canonical lock. The certificate packet test asserts final freeze certificate and real-authorisation packet fields are required and blocking and incomplete by default. The gate matrix test asserts the canonical can-execute-first-slice is false and only synthetic fully completed fixtures can pass. The blocking rules test asserts each missing evidence type blocks execution with a deterministic reason. The safety test asserts the canonical lock cannot authorise Phase 1D, create a real branch, start Phase 1D, execute the first slice, or complete the first slice. The Markdown rendering test asserts the rendered Markdown carries the documented title, the first-slice real-authorisation evidence lock marker, the evidence lock only marker, the not-a-real-authorisation marker, the latest Phase 3B archive hash, the latest test totals, the first slice identity, the manual authorisation fields, the freeze certificate fields, the real-authorisation packet fields, the go or no-go matrix, the rollback evidence, the legal and governance state, the no-secret checks, the stop conditions, and the safety notes, and is byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value through the central forbidden-wording registry. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, and no clock or random or environment or absolute-path or runtime-secret reference in the lock content file. 
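The gate matrix and blocking rules tests above rely on synthetic fixtures: a fully completed one that passes, and mutated ones that each fail for exactly one reason. A minimal sketch of that pattern follows, using four hypothetical evidence groups in place of the real lock's fields.

```typescript
// Hypothetical evidence groups; the real lock defines many more fields per group.
const evidenceGroups = [
  "manualAuthorisation",
  "freezeCertificate",
  "realAuthorisationPacket",
  "rollbackEvidence",
] as const;

type EvidenceGroup = (typeof evidenceGroups)[number];
type LockFixture = Record<EvidenceGroup, boolean>; // true means completed

interface LockOutcome {
  canExecuteFirstSlice: boolean;
  reasons: string[];
}

// Reasons are emitted in the fixed order of evidenceGroups.
function evaluateLock(fixture: LockFixture): LockOutcome {
  const reasons = evidenceGroups
    .filter((group) => !fixture[group])
    .map((group) => `missing evidence: ${group}`);
  return { canExecuteFirstSlice: reasons.length === 0, reasons };
}

// Canonical state: nothing is filled in, so execution is blocked.
const canonicalLock: LockFixture = {
  manualAuthorisation: false,
  freezeCertificate: false,
  realAuthorisationPacket: false,
  rollbackEvidence: false,
};

// Synthetic, test-only state: everything is completed.
const syntheticComplete: LockFixture = {
  manualAuthorisation: true,
  freezeCertificate: true,
  realAuthorisationPacket: true,
  rollbackEvidence: true,
};

// One mutated fixture per group: complete except for that single group.
function mutatedFixtures(): Array<{ removed: EvidenceGroup; fixture: LockFixture }> {
  return evidenceGroups.map((group) => {
    const fixture: LockFixture = { ...syntheticComplete };
    fixture[group] = false;
    return { removed: group, fixture };
  });
}
```

Deriving every mutated fixture from the one complete fixture keeps the set exhaustive: adding a new evidence group automatically adds a new failing case.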
New founder-facing documents at the docs layer add a Phase 3C document and a short addendum to the Phase 1D entry decision-record document explaining where to find the first-slice real-authorisation evidence lock. Every tracked content file product version is bumped to zero point seventy-three point zero to stay in sync with the package version, every latest approved archive block is refreshed to the approved Phase 3B archive, and the cascading hash refresh propagates from Phase 2N through Phase 3B to a fixed point. The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the friendly-label registry, the standard-copy registry, the fuzzer semantics, the contract-shape fixtures and tests, the migration rehearsal tests, the deprecation registry and optional-field coverage and future-schema-addition checklist, the audit bundle manifest, the owner map and orphan-surface guard, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report builder and renderer, the entry helper and Markdown and CSV renderers, the readiness pack index helper and Markdown renderer, the scope boundary manifest helper and Markdown renderer, the sequence manifest helper and Markdown renderer, the no-regression acceptance pack helper and Markdown renderer, the audit handoff pack helper and Markdown renderer, the final pre-authorisation checklist helper and Markdown renderer, the launch-readiness verification report helper and Markdown renderer, the frozen 
baseline manifest helper and Markdown renderer, the authorisation rehearsal record helper and Markdown renderer, the real-authorisation packet skeleton helper and Markdown renderer, the pre-branch final freeze certificate skeleton helper and Markdown renderer, the implementation branch preparation plan helper and Markdown renderer, the first-slice dry-run execution packet helper and Markdown renderer, and the approved sample outcomes are all unchanged. NullProvider remains the default. The thirteen scorecard golden Markdown snapshots remain byte-identical.
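The cascading hash refresh mentioned above, where refreshed hashes propagate through later artefacts until nothing changes, can be sketched as a fixed-point loop. Everything in this example is an assumption for illustration: the `Artefact` shape, the way an artefact's hash is composed, the use of `node:crypto`, and the three-item chain. It only shows why more than one pass may be needed.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape: each artefact records the hash of every upstream artefact it cites.
interface Artefact {
  id: string;
  ownContent: string;
  upstream: string[];
  recordedUpstreamHashes: Record<string, string>;
}

function hashText(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// An artefact's hash covers its own content plus the upstream hashes it records,
// so refreshing one artefact can invalidate the hashes recorded downstream.
function artefactHash(artefact: Artefact): string {
  const recorded = Object.keys(artefact.recordedUpstreamHashes)
    .sort()
    .map((id) => `${id}=${artefact.recordedUpstreamHashes[id]}`)
    .join(";");
  return hashText(`${artefact.ownContent}|${recorded}`);
}

// Repeats full passes until a pass changes nothing; returns the number of passes run.
function refreshToFixedPoint(artefacts: Artefact[], maxPasses = 50): number {
  const byId = new Map<string, Artefact>();
  for (const artefact of artefacts) byId.set(artefact.id, artefact);
  for (let pass = 1; pass <= maxPasses; pass += 1) {
    let changed = false;
    for (const artefact of artefacts) {
      for (const upstreamId of artefact.upstream) {
        const upstream = byId.get(upstreamId);
        if (upstream === undefined) throw new Error(`unknown upstream artefact: ${upstreamId}`);
        const current = artefactHash(upstream);
        if (artefact.recordedUpstreamHashes[upstreamId] !== current) {
          artefact.recordedUpstreamHashes[upstreamId] = current;
          changed = true;
        }
      }
    }
    if (!changed) return pass;
  }
  throw new Error("hash refresh did not reach a fixed point");
}

// Illustrative chain, listed downstream-first so that a single pass is not enough.
const exampleChain: Artefact[] = [
  { id: "phase-c", ownContent: "c", upstream: ["phase-b"], recordedUpstreamHashes: {} },
  { id: "phase-b", ownContent: "b", upstream: ["phase-a"], recordedUpstreamHashes: {} },
  { id: "phase-a", ownContent: "a", upstream: [], recordedUpstreamHashes: {} },
];
```

For an acyclic reference chain like this one the loop terminates, and running it again on an already-refreshed set reports a single pass with no changes.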
User impact: Founders see no on-screen change. The hardening adds the auditable first-slice real-authorisation evidence lock that defines what manual evidence must be completed before the first Phase 1D slice may ever be executed in the future. The canonical lock remains an evidence lock pending review; it never authorises Phase 1D, never creates a real branch, never executes the first slice, and never completes the first slice. Paid or public use remains blocked until the required legal or governance review is completed. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 3C is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 3C authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3C internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-02
- Reference
- Phase 3C authorisation
- Impact assessment
- Adds a Phase 1D first-slice real-authorisation evidence lock at the static content layer, one test-side support module, eleven new unit tests pinning shape and source integrity and slice identity and manual authorisation and certificate and packet and gate matrix and blocking rules and safety and Markdown rendering and cleanliness and determinism, and one founder-facing prose document at the docs layer. No product features. No runtime behaviour change. The lock does not authorise any real Phase 1D work, does not create a branch, does not execute the first slice, does not complete the first slice, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. The thirteen scorecard golden Markdown snapshots, the export-hash snapshots, the full-matrix snapshots, the fuzzing harness, the contract-shape fixtures and tests, the migration rehearsal tests, the deprecation and optional-field checklist, the audit bundle manifest, the owner map and orphan-surface guard, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report (regenerated under the new product version), the synthetic Verification Journal entry (with refreshed source-artefact hash), the readiness pack index (with refreshed per-artefact source-hash fields), the scope boundary manifest, the sequence manifest, the no-regression acceptance pack, the audit handoff pack (with refreshed per-pointer source-hash fields), the final pre-authorisation checklist (with refreshed summary-block snapshot hashes), the launch-readiness verification report (with refreshed evidence summary upstream snapshot hashes), the frozen baseline manifest (with refreshed content artefact hashes), the authorisation rehearsal record (with refreshed evidence summary upstream snapshot hashes), the 
real-authorisation packet skeleton (with refreshed evidence source hashes), the pre-branch final freeze certificate skeleton (with refreshed evidence reference hashes), the implementation branch preparation plan, the first-slice dry-run execution packet, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 3D
Change date: 2026-05-02
Product version: 0.74.0
Methodology engine version: 0.9.1
Phase 1D first-slice final execution checklist and go or no-go gate
Reason: Phase 3C closed the first-slice real-authorisation evidence lock loop. Phase 3D closes the first-slice final execution checklist and go or no-go gate loop with one calm deterministic gate that consumes the Phase 3C evidence lock and proves whether the first Phase 1D slice would be permitted to execute in a future authorised run. The gate consumes the Phase 3C real-authorisation evidence lock, the Phase 3B first-slice dry-run execution packet, the Phase 3A implementation branch preparation plan, the Phase 2Z pre-branch final freeze certificate skeleton, and the Phase 2Y real-authorisation packet skeleton. The canonical gate remains a go or no-go gate pending review and the canonical final decision is no go. The gate never authorises Phase 1D, never creates a real branch, never executes the first slice, and never completes the first slice.
What changed: Phase 3D is a docs-side and test-side planning phase. No product features were added. A new deterministic Phase 1D first-slice final execution checklist and go or no-go gate at the static content layer defines first-slice identity, final checklist rows, go or no-go decision inputs, evidence completeness matrix, source-hash verification matrix, reviewer approval matrix, rollback readiness matrix, no-secret checks, runtime boundary checks, legal and governance readiness checks, deterministic blocker list, canonical final decision, synthetic gate rules, stop conditions, legal and governance notes, and safety markers.
The gate top-level shape adds:
- a gate version;
- a product version that matches the package version after the Phase 3D bump;
- a generated-from-phase identifier of three-D;
- a gate type that explicitly marks the file as a Phase 1D first-slice final execution checklist and go or no-go gate;
- a go-or-no-go gate pending-review gate status;
- a calm description and a calm reviewer-facing gate purpose;
- a latest-approved-archive block carrying the verified Phase 3C archive evidence;
- a source-evidence-lock reference to the Phase 3C real-authorisation evidence lock;
- a source-dry-run-packet reference to the Phase 3B first-slice dry-run execution packet;
- a source-plan reference to the Phase 3A implementation branch preparation plan;
- a source-freeze-certificate reference to the Phase 2Z pre-branch final freeze certificate skeleton;
- a source-real-authorisation-packet reference to the Phase 2Y real-authorisation packet skeleton;
- a first-slice identity block referencing the same first slice as Phase 3C, with execution status not executed, completion status not completed, implementation status not started, authorisation status not authorised, and a new final gate status of no go;
- twenty-five final checklist rows, each with a canonical state of pending, not met, false, or no go;
- eighteen go or no-go decision inputs;
- an evidence completeness matrix covering nine evidence groups from Phase 3C;
- a source-hash verification matrix for all five source references;
- a twelve-row reviewer approval matrix;
- a ten-row rollback readiness matrix with canonical rollback ready for go set to false;
- nineteen no-secret checks;
- fourteen runtime boundary checks;
- seven legal and governance readiness checks with canonical final governance state clear set to false;
- twenty deterministic blockers;
- a canonical final decision of no go with every canonical can flag false;
- synthetic gate rules proving the canonical gate cannot create a branch, cannot start Phase 1D, cannot authorise Phase 1D, cannot execute the first slice, and cannot complete the first slice, and that the canonical final decision is no go;
- twenty-six stop conditions, each blocking first slice execution;
- nine legal and governance disclaimers;
- thirty-one safety markers, all true.
The canonical final decision is no go. The canonical gate status is go-or-no-go gate pending review. Every checklist row that requires manual evidence has a canonical state of pending, not met, false, or no go. Every matrix row that requires manual evidence is incomplete, pending review, or not approved by default. Every deterministic blocker that requires manual evidence is active. Every reviewer approval defaults to pending review and blocks go if incomplete. Every rollback readiness row defaults to false or pending and blocks go if not met. Every no-secret check is required and blocks go if failed. Every runtime boundary check is required and blocks go if failed. Every legal and governance readiness check is required and blocks go if not met.
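The blocking rules above combine several check groups into one final decision. A minimal sketch of that aggregation follows; the group names, check ids, and blocker wording are hypothetical, and each group holds a single check only to keep the example short.

```typescript
type Decision = "go" | "no_go";

interface Check {
  id: string;
  passed: boolean;
}

// Hypothetical group names, evaluated in this fixed order so the blocker list is stable.
const groupOrder = [
  "reviewerApprovals",
  "rollbackReadiness",
  "noSecretChecks",
  "runtimeBoundaryChecks",
  "governanceReadiness",
] as const;

type GroupName = (typeof groupOrder)[number];
type GateInput = Record<GroupName, readonly Check[]>;

// Collects one blocker per failed check; an empty group also blocks.
function activeBlockers(input: GateInput): string[] {
  const blockers: string[] = [];
  for (const group of groupOrder) {
    const checks = input[group];
    if (checks.length === 0) {
      blockers.push(`${group}: no checks recorded`);
      continue;
    }
    for (const check of checks) {
      if (!check.passed) blockers.push(`${group}: ${check.id} not met`);
    }
  }
  return blockers;
}

// The decision is "go" only when no blocker is active.
function finalDecision(input: GateInput): { decision: Decision; blockers: string[] } {
  const blockers = activeBlockers(input);
  return { decision: blockers.length === 0 ? "go" : "no_go", blockers };
}

// Illustrative canonical state: every check is unmet, so the decision is no go.
const canonicalGate: GateInput = {
  reviewerApprovals: [{ id: "primary-reviewer", passed: false }],
  rollbackReadiness: [{ id: "rollback-owner", passed: false }],
  noSecretChecks: [{ id: "no-runtime-secret", passed: false }],
  runtimeBoundaryChecks: [{ id: "no-api-change", passed: false }],
  governanceReadiness: [{ id: "legal-review-state", passed: false }],
};
```

Treating an empty group as a blocker is a design choice in this sketch: it means a missing matrix can never be read as a passed matrix.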
The gate does not authorise real Phase 1D work, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not implement any slice, does not run commands, does not change runtime behaviour, does not call Supabase, does not write to local storage, and does not replace legal review. One new test-side support module ships alongside the gate: a deterministic helper that loads the gate and every referenced Phase 3C, Phase 3B, Phase 3A, Phase 2Z, and Phase 2Y artefact, computes deterministic SHA-256 hashes via the standard hash module, validates gate source integrity, validates first-slice identity, validates final checklist rows, validates go or no-go decision inputs, validates evidence completeness matrix, validates source-hash verification matrix, validates reviewer approval matrix, validates rollback readiness matrix, validates no-secret checks, validates runtime boundary checks, validates legal and governance readiness checks, validates deterministic blocker list, validates canonical final decision, validates synthetic gate rules, validates stop conditions, validates legal and governance notes, validates safety markers, validates the canonical gate status remains a go-or-no-go gate pending review, exposes documented mutated synthetic fixtures proving the canonical gate remains no go until every required condition is met, and renders a calm founder-facing Markdown go or no-go gate. The helper never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses date or random or environment or local-time references. Eleven new unit tests pin the gate end-to-end. The shape test asserts the documented top-level shape and every per-field shape rule. 
The source integrity test asserts every recorded source hash matches the on-disk source content body and documented synthetic mutations each fail with a deterministic reason. The slice identity test asserts the first slice exactly matches Phase 3C and is not marked started or executed or completed or authorised and the canonical final gate status is no go. The checklist test asserts every required checklist row is present and blocking and not met in the canonical gate. The matrices test asserts evidence completeness and source-hash and reviewer approval and rollback readiness and runtime boundary and legal and governance readiness matrices are complete and blocking. The decision test asserts canonical final decision is no go and only synthetic fully complete fixtures can pass. The blockers test asserts every active canonical blocker blocks go with deterministic reasons. The safety test asserts the canonical gate cannot authorise Phase 1D, create a real branch, start Phase 1D, execute the first slice, or complete the first slice. The Markdown rendering test asserts the rendered Markdown carries the documented title, the first-slice final execution checklist and go-or-no-go gate marker, the canonical decision no-go marker, the not-a-real-authorisation marker, the latest Phase 3C archive hash, the latest test totals, the first slice identity, the evidence completeness matrix, the source-hash matrix, the reviewer approvals, the rollback readiness, the no-secret checks, the runtime boundary checks, the legal and governance readiness, the deterministic blockers, and the safety notes, and is byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value through the central forbidden-wording registry. 
The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, and no clock or random or environment or absolute-path or runtime-secret reference in the gate content file. New founder-facing documents at the docs layer add a Phase 3D document and a short addendum to the Phase 1D entry decision-record document explaining where to find the first-slice final execution checklist and go or no-go gate. Every tracked content file product version is bumped to zero point seventy-four point zero to stay in sync with the package version, every latest approved archive block is refreshed to the approved Phase 3C archive, and the cascading hash refresh propagates from Phase 2N through Phase 3C to a fixed point. The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the friendly-label registry, the standard-copy registry, the fuzzer semantics, the contract-shape fixtures and tests, the migration rehearsal tests, the deprecation registry and optional-field coverage and future-schema-addition checklist, the audit bundle manifest, the owner map and orphan-surface guard, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report builder and renderer, the entry helper and Markdown and CSV renderers, the readiness pack index helper and Markdown renderer, the scope boundary manifest helper and Markdown renderer, the sequence manifest helper 
and Markdown renderer, the no-regression acceptance pack helper and Markdown renderer, the audit handoff pack helper and Markdown renderer, the final pre-authorisation checklist helper and Markdown renderer, the launch-readiness verification report helper and Markdown renderer, the frozen baseline manifest helper and Markdown renderer, the authorisation rehearsal record helper and Markdown renderer, the real-authorisation packet skeleton helper and Markdown renderer, the pre-branch final freeze certificate skeleton helper and Markdown renderer, the implementation branch preparation plan helper and Markdown renderer, the first-slice dry-run execution packet helper and Markdown renderer, the first-slice real-authorisation evidence lock helper and Markdown renderer, and the approved sample outcomes are all unchanged. NullProvider remains the default. The thirteen scorecard golden Markdown snapshots remain byte-identical.
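The source integrity and decision checks described above can be illustrated with a minimal Python sketch. It is not the shipped helper: the `SourceRef` shape, the artefact label, the reason strings, and the three decision inputs are hypothetical names chosen for illustration. The only assumption taken from this entry is that hashes are SHA-256 digests of the source content body and that the decision defaults to no go.

```python
import hashlib
from dataclasses import dataclass


def sha256_hex(body: bytes) -> str:
    """Hex SHA-256 digest of a source content body."""
    return hashlib.sha256(body).hexdigest()


@dataclass(frozen=True)
class SourceRef:
    # Hypothetical shape: a label plus the hash recorded in the gate file.
    label: str
    recorded_sha256: str


def verify_sources(refs, bodies):
    """Return failure reasons in a fixed (label-sorted) order; empty means all match."""
    reasons = []
    for ref in sorted(refs, key=lambda r: r.label):
        body = bodies.get(ref.label)
        if body is None:
            reasons.append(f"{ref.label}: source body missing")
        elif sha256_hex(body) != ref.recorded_sha256:
            reasons.append(f"{ref.label}: recorded hash does not match source body")
    return reasons


def final_decision(hash_failures, blockers_active, all_rows_met):
    """Default-deny: 'go' only if nothing fails, no blocker is active, every row is met."""
    if hash_failures or blockers_active or not all_rows_met:
        return "no go"
    return "go"


# Illustrative data only; the label and body are made up.
bodies = {"phase-3c-evidence-lock": b"example body"}
refs = [SourceRef("phase-3c-evidence-lock", sha256_hex(b"example body"))]

clean = verify_sources(refs, bodies)
mutated = verify_sources(refs, {"phase-3c-evidence-lock": b"example body!"})

# Canonical state: blockers active and rows not met, so the decision stays no go.
canonical = final_decision(clean, blockers_active=True, all_rows_met=False)
# Only a synthetic, fully complete fixture can reach go.
synthetic = final_decision(clean, blockers_active=False, all_rows_met=True)
```

Sorting by label and using fixed reason strings keeps the failure list identical from run to run, which is the property the mutated-fixture tests rely on.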
User impact: Founders see no on-screen change. The hardening adds an auditable first-slice final execution checklist and go or no-go gate that consumes the Phase 3C evidence lock and proves whether the first Phase 1D slice would be permitted to execute in a future authorised run. The canonical gate remains a go-or-no-go gate pending review with a canonical final decision of no go; it never authorises Phase 1D, creates a real branch, executes the first slice, or completes the first slice. Paid or public use remains blocked until any required legal or governance review is completed. Goldens, sample outcomes, every prior-phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 3D is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 3D authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3D internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-02
- Reference
- Phase 3D authorisation
- Impact assessment
- Adds a Phase 1D first-slice final execution checklist and go or no-go gate at the static content layer, one test-side support module, eleven new unit tests (shape; source integrity; slice identity; checklist; matrices; decision; blockers; safety; Markdown rendering; cleanliness; determinism), and one founder-facing prose document at the docs layer. No product features. No runtime behaviour change. The gate does not authorise any real Phase 1D work, does not create a branch, does not execute the first slice, does not complete the first slice, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 3E
Change date: 2026-05-02
Product version: 0.75.0
Methodology engine version: 0.9.1
Phase 1D first-slice execution rehearsal pack
Reason: Phase 3D closed the first-slice final execution checklist and go or no-go gate loop. Phase 3E closes the first-slice execution rehearsal loop with one calm deterministic rehearsal pack that consumes the Phase 3D gate and simulates the future first-slice execution path without creating a branch, without executing the slice, and without touching runtime code. The pack consumes the Phase 3D first-slice final execution checklist and go or no-go gate, the Phase 3C real-authorisation evidence lock, the Phase 3B first-slice dry-run execution packet, the Phase 3A implementation branch preparation plan, the Phase 2Z pre-branch final freeze certificate skeleton, and the Phase 2Y real-authorisation packet skeleton. The canonical rehearsal pack remains a rehearsal pack pending review: the canonical rehearsal status is not run and the canonical final decision is no go. The pack never authorises Phase 1D, creates a real branch, executes the first slice, or completes the first slice.
What changed: Phase 3E is a docs-side and test-side planning phase. No product features were added. A new deterministic Phase 1D first-slice execution rehearsal pack at the static content layer defines first-slice identity with a new rehearsal status of not run, rehearsal assumptions, rehearsal preconditions, rehearsal command sequence, rehearsal reviewer checkpoints, rehearsal evidence capture plan, rehearsal rollback simulation, rehearsal no-secret checks, rehearsal runtime boundary checks, rehearsal legal and governance boundary checks, rehearsal deterministic blockers, synthetic rehearsal rules, canonical rehearsal outcome, stop conditions, legal and governance notes, and safety markers. The pack top-level shape adds a rehearsal pack version, a product version that matches the package version after the Phase 3E bump, a generated-from-phase identifier of three-E, a rehearsal pack type that explicitly marks the file as a Phase 1D first-slice execution rehearsal pack, an execution-rehearsal-pack pending-review pack status, a calm description, a calm reviewer-facing pack purpose, a latest-approved-archive block carrying the verified Phase 3D archive evidence, a source-go-or-no-go-gate reference to the Phase 3D first-slice final execution checklist and go or no-go gate, a source-evidence-lock reference to the Phase 3C real-authorisation evidence lock, a source-dry-run-packet reference to the Phase 3B first-slice dry-run execution packet, a source-plan reference to the Phase 3A implementation branch preparation plan, a source-freeze-certificate reference to the Phase 2Z pre-branch final freeze certificate skeleton, a source-real-authorisation-packet reference to the Phase 2Y real-authorisation packet skeleton, a first-slice identity block referencing the same first slice as Phase 3D with execution status not executed and completion status not completed and implementation status not started and authorisation status not authorised and final gate status no go and a new 
rehearsal status of not run, fourteen rehearsal assumptions each canonical-state safe and blocking if false, sixteen rehearsal preconditions each pending or not met by default, twenty rehearsal command sequence steps each not executed and non-destructive and not exposing any secret, eleven rehearsal reviewer checkpoints each pending review, sixteen rehearsal evidence capture items each pending capture, nine rehearsal rollback simulation rows each pending review, nineteen rehearsal no-secret checks, fourteen rehearsal runtime boundary checks, eight rehearsal legal and governance boundary checks with canonical final legal and governance boundary status pending review, twenty rehearsal deterministic blockers, synthetic rehearsal rules proving the canonical rehearsal cannot create a branch and cannot start Phase 1D and cannot authorise Phase 1D and cannot execute the first slice and cannot complete the first slice and the canonical rehearsal final status is not run and the canonical rehearsal final decision is no go, a canonical rehearsal outcome with rehearsal status not run and rehearsal result pending review and final decision no go and every can flag false and commands executed false and destructive actions taken false and runtime changes made false and reviewer action required pending review, twenty-seven stop conditions each blocking rehearsal or first slice execution, ten legal and governance disclaimers, and thirty-three safety markers all true. The canonical rehearsal status is not run. The canonical final decision is no go. The canonical rehearsal pack status is execution rehearsal pack pending review. Every assumption that requires manual verification is canonical-state pending or not verified by default. Every precondition that requires manual evidence is canonical-state pending or not met or false by default. Every command sequence step is canonical-state not executed by default and is non-destructive and does not expose any secret. 
Every reviewer checkpoint defaults to pending review and blocks rehearsal pass if incomplete. Every evidence capture item defaults to pending capture and blocks rehearsal pass if missing. Every rollback simulation row defaults to pending and blocks rehearsal pass if not met. Every no-secret check is required and blocks rehearsal pass if failed. Every runtime boundary check is required and blocks rehearsal pass if failed. Every legal and governance boundary check is required and blocks rehearsal pass if not met. The pack does not authorise real Phase 1D work, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not implement any slice, does not run commands, does not change runtime behaviour, does not call Supabase, does not write to local storage, and does not replace legal review. One new test-side support module ships alongside the pack: a deterministic helper that loads the pack and every referenced Phase 3D, Phase 3C, Phase 3B, Phase 3A, Phase 2Z, and Phase 2Y artefact, computes deterministic SHA-256 hashes via the standard hash module, validates rehearsal source integrity, validates first-slice identity, validates rehearsal assumptions, validates rehearsal preconditions, validates rehearsal command sequence, validates reviewer checkpoints, validates evidence capture plan, validates rollback simulation, validates no-secret checks, validates runtime boundary checks, validates legal and governance boundary checks, validates deterministic blockers, validates synthetic rehearsal rules, validates canonical rehearsal outcome, validates stop conditions, validates legal and governance notes, validates safety markers, validates the canonical rehearsal pack status remains an execution rehearsal pack pending review, exposes documented mutated synthetic fixtures proving the canonical rehearsal remains not run and no go 
until every required condition is met, and renders a calm founder-facing Markdown rehearsal pack. The helper never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses date or random or environment or local-time references. Eleven new unit tests pin the pack end-to-end. The shape test asserts the documented top-level shape and every per-field shape rule. The source integrity test asserts every recorded source hash matches the on-disk source content body and documented synthetic mutations each fail with a deterministic reason. The slice identity test asserts the first slice exactly matches Phase 3D and is not marked started or executed or completed or authorised and the canonical rehearsal status is not run. The assumptions and preconditions test asserts every required assumption and precondition is present, blocking, and safe by default. The command sequence test asserts every step is non-destructive and does not expose secrets and remains not executed in the canonical pack. The review and evidence and rollback test asserts reviewer checkpoints and evidence capture and rollback simulation are required and pending in the canonical pack. The boundaries and blockers test asserts no-secret and runtime and legal and governance checks and deterministic blockers are complete and blocking. The outcome and safety test asserts the canonical rehearsal outcome is not run and pending review and no go and cannot authorise Phase 1D, create a branch, start Phase 1D, execute the first slice, or complete the first slice. 
The Markdown rendering test asserts the rendered Markdown carries the documented title, the first-slice execution rehearsal pack marker, the rehearsal-only marker, the not-a-real-authorisation marker, the not-executed marker, the canonical decision no-go marker, the latest Phase 3D archive hash, the latest test totals, the first slice identity, the rehearsal command sequence, the evidence capture plan, the rollback simulation, the no-secret checks, the runtime boundary checks, the legal and governance boundary checks, the deterministic blockers, and the safety notes, and is byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value through the central forbidden-wording registry. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, and no clock or random or environment or absolute-path or runtime-secret reference in the pack content file. New founder-facing documents at the docs layer add a Phase 3E document and a short addendum to the Phase 1D entry decision-record document explaining where to find the first-slice execution rehearsal pack. Every tracked content file product version is bumped to zero point seventy-five point zero to stay in sync with the package version, every latest approved archive block is refreshed to the approved Phase 3D archive, and the cascading hash refresh propagates from Phase 2N through Phase 3D to a fixed point. 
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the friendly-label registry, the standard-copy registry, the fuzzer semantics, the contract-shape fixtures and tests, the migration rehearsal tests, the deprecation registry and optional-field coverage and future-schema-addition checklist, the audit bundle manifest, the owner map and orphan-surface guard, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report builder and renderer, the entry helper and Markdown and CSV renderers, the readiness pack index helper and Markdown renderer, the scope boundary manifest helper and Markdown renderer, the sequence manifest helper and Markdown renderer, the no-regression acceptance pack helper and Markdown renderer, the audit handoff pack helper and Markdown renderer, the final pre-authorisation checklist helper and Markdown renderer, the launch-readiness verification report helper and Markdown renderer, the frozen baseline manifest helper and Markdown renderer, the authorisation rehearsal record helper and Markdown renderer, the real-authorisation packet skeleton helper and Markdown renderer, the pre-branch final freeze certificate skeleton helper and Markdown renderer, the implementation branch preparation plan helper and Markdown renderer, the first-slice dry-run execution packet helper and Markdown renderer, the first-slice real-authorisation evidence lock helper and Markdown renderer, the first-slice final execution 
checklist and go or no-go gate helper and Markdown renderer, and the approved sample outcomes are all unchanged. NullProvider remains the default. The thirteen scorecard golden Markdown snapshots remain byte-identical.
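The command sequence and determinism checks described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the shipped helper: the field names (`status`, `destructive`, `exposes_secret`), the `"not_executed"` value, and the use of sorted-key JSON as the canonical serialisation are all hypothetical.

```python
import copy
import json


def canonical_json(value) -> str:
    """Serialise with sorted keys and fixed separators so equal data gives equal text."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def step_problems(step):
    """Check one rehearsal step against the three canonical rules (hypothetical fields)."""
    problems = []
    if step.get("status") != "not_executed":
        problems.append("status must be not_executed")
    if step.get("destructive") is not False:
        problems.append("step must be non-destructive")
    if step.get("exposes_secret") is not False:
        problems.append("step must not expose a secret")
    return problems


def validate_sequence(steps):
    """Return reasons in step order; an empty list means the sequence is canonical."""
    reasons = []
    for index, step in enumerate(steps, start=1):
        for problem in step_problems(step):
            reasons.append(f"step {index}: {problem}")
    return reasons


# Illustrative two-step pack; the real pack documents twenty steps.
pack = {
    "rehearsal_status": "not_run",
    "steps": [
        {"status": "not_executed", "destructive": False, "exposes_secret": False},
        {"status": "not_executed", "destructive": False, "exposes_secret": False},
    ],
}

canonical_reasons = validate_sequence(pack["steps"])

# Synthetic mutation: mark the second step as executed and confirm it is rejected.
mutated_pack = copy.deepcopy(pack)
mutated_pack["steps"][1]["status"] = "executed"
mutated_reasons = validate_sequence(mutated_pack["steps"])

# Byte-stability: serialising twice, and after a parse round trip, gives the same text.
first = canonical_json(pack)
second = canonical_json(json.loads(first))
```

Treating a missing field as a failure (`is not False` rather than a truthiness check) keeps the sketch default-deny, in line with the blocking-if-false wording used throughout the pack.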
User impact: Founders see no on-screen change. The hardening adds an auditable first-slice execution rehearsal pack that consumes the Phase 3D go or no-go gate and simulates the future first-slice execution path without creating a branch, without executing the slice, and without touching runtime code. The canonical rehearsal pack remains an execution rehearsal pack pending review with a canonical rehearsal status of not run and a canonical final decision of no go; it never authorises Phase 1D, creates a real branch, executes the first slice, or completes the first slice. Paid or public use remains blocked until any required legal or governance review is completed. Goldens, sample outcomes, every prior-phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 3E is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 3E authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3E internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-02
- Reference
- Phase 3E authorisation
- Impact assessment
- Adds a Phase 1D first-slice execution rehearsal pack at the static content layer, one test-side support module, eleven new unit tests (shape; source integrity; slice identity; assumptions and preconditions; command sequence; review, evidence, and rollback; boundaries and blockers; outcome and safety; Markdown rendering; cleanliness; determinism), and one founder-facing prose document at the docs layer. No product features. No runtime behaviour change. The pack does not authorise any real Phase 1D work, does not create a branch, does not execute the first slice, does not complete the first slice, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 3F
Change date: 2026-05-02
Product version: 0.76.0
Methodology engine version: 0.9.1
Phase 1D real branch creation authorisation packet
Reason: Phase 3E closed the first-slice execution rehearsal loop. Phase 3F closes the real branch creation authorisation loop with one calm deterministic packet that consumes the Phase 3E rehearsal pack and defines exactly what authorisation must be present before a real Phase 1D branch may ever be created in a future run. The packet consumes the Phase 3E first-slice execution rehearsal pack, the Phase 3D first-slice final execution checklist and go or no-go gate, the Phase 3C real-authorisation evidence lock, the Phase 3B first-slice dry-run execution packet, the Phase 3A implementation branch preparation plan, the Phase 2Z pre-branch final freeze certificate skeleton, and the Phase 2Y real-authorisation packet skeleton. The canonical packet remains a branch creation authorisation packet pending review: the canonical branch creation status is not created, the canonical branch authorisation status is not authorised, and the canonical final decision is no go. The packet never authorises Phase 1D, creates a real branch, executes the first slice, completes the first slice, or runs any branch creation command.
What changed: Phase 3F is a docs-side and test-side planning phase. No product features were added. A new deterministic Phase 1D real branch creation authorisation packet at the static content layer defines first-slice identity with a new branch creation status of not created, branch creation status, branch naming requirements, branch creation preconditions, founder authorisation fields, reviewer and witness approval fields, evidence completeness requirements, source-hash verification matrix, branch command prohibition matrix, allowed future branch command plan as inert text only, rollback plan before branch creation, no-secret checks, runtime boundary checks, legal and governance boundary checks, deterministic blockers, synthetic authorisation rules, canonical branch creation decision, stop conditions, legal and governance notes, and safety markers. The packet top-level shape adds a branch packet version, a product version that matches the package version after the Phase 3F bump, a generated-from-phase identifier of three-F, a branch packet type that explicitly marks the file as a Phase 1D real branch creation authorisation packet, a branch creation authorisation packet pending-review packet status, a calm description, a calm reviewer-facing packet purpose, a latest-approved-archive block carrying the verified Phase 3E archive evidence, a source-rehearsal-pack reference to the Phase 3E first-slice execution rehearsal pack, a source-go-or-no-go-gate reference to the Phase 3D first-slice final execution checklist and go or no-go gate, a source-evidence-lock reference to the Phase 3C real-authorisation evidence lock, a source-dry-run-packet reference to the Phase 3B first-slice dry-run execution packet, a source-plan reference to the Phase 3A implementation branch preparation plan, a source-freeze-certificate reference to the Phase 2Z pre-branch final freeze certificate skeleton, a source-real-authorisation-packet reference to the Phase 2Y real-authorisation packet 
skeleton, a first-slice identity block referencing the same first slice as Phase 3E with execution status not executed and completion status not completed and implementation status not started and authorisation status not authorised and final gate status no go and rehearsal status not run and a new branch creation status of not created, a branch creation status block with branch status not created and branch authorisation status not authorised and branch creation decision no go and branch name pending review and branch owner pending review and branch creation timestamp not filled and branch command executed false and branch creation verified false and reviewer action required pending review, eight branch naming requirement rows each canonical-state pending or not approved, twenty branch creation preconditions each pending or not met by default, eighteen founder authorisation fields including a founder signature canonical-state not signed and a final founder authorisation status canonical-state not authorised, eleven reviewer and witness approval fields each required and blocking, thirteen evidence completeness requirements each canonical-state incomplete or pending review, a seven-row source-hash verification matrix for every source reference, a six-row branch command prohibition matrix proving git branch and git checkout dash b and git switch dash c and any branch creation command and any branch creation side effect and branch creation verification are forbidden in the canonical packet, an allowed future branch command plan as inert text only with command execution status not executed and final future command status not authorised, a ten-row rollback plan before branch creation each canonical-state pending or not ready, nineteen no-secret checks, fourteen runtime boundary checks, eight legal and governance boundary checks, twenty-two deterministic blockers, synthetic authorisation rules proving the canonical packet cannot create a branch and cannot start Phase 1D 
and cannot authorise Phase 1D and cannot execute the first slice and cannot complete the first slice and the canonical branch creation status is not created and the canonical branch authorisation status is not authorised and the canonical final decision is no go, a canonical branch creation decision with decision status no go and branch creation status not created and branch authorisation status not authorised and every can flag false and branch command executed false and runtime changes made false and reviewer action required pending review, thirty-one stop conditions each blocking branch creation or first slice execution, ten legal and governance disclaimers, and thirty-five safety markers all true. The canonical branch creation status is not created. The canonical branch authorisation status is not authorised. The canonical final decision is no go. Every founder authorisation field is canonical-state pending or not signed or not authorised. Every reviewer and witness approval field is canonical-state pending or not signed. Every branch naming requirement is canonical-state pending or not approved. Every precondition is canonical-state pending or not met. Every evidence completeness row is canonical-state incomplete. Every rollback row is canonical-state pending. Every no-secret check is required and blocks branch creation if failed. Every runtime boundary check is required and blocks branch creation if failed. Every legal and governance boundary check is required and blocks branch creation if not met. 
The packet does not authorise real Phase 1D work, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not implement any slice, does not run any git branch or git checkout dash b or git switch dash c command, does not change runtime behaviour, does not call Supabase, does not write to local storage, and does not replace legal review. One new test-side support module ships alongside the packet: a deterministic helper that loads the packet and every referenced Phase 3E, Phase 3D, Phase 3C, Phase 3B, Phase 3A, Phase 2Z, and Phase 2Y artefact, computes deterministic SHA-256 hashes via the standard hash module, validates packet source integrity, validates first-slice identity, validates branch creation status, validates branch naming requirements, validates branch creation preconditions, validates founder authorisation fields, validates reviewer and witness approval fields, validates evidence completeness requirements, validates source-hash verification matrix, validates branch command prohibition matrix, validates allowed future branch command plan, validates rollback plan before branch creation, validates no-secret checks, validates runtime boundary checks, validates legal and governance boundary checks, validates deterministic blockers, validates synthetic authorisation rules, validates canonical branch creation decision, validates stop conditions, validates legal and governance notes, validates safety markers, validates the canonical branch packet status remains a branch creation authorisation packet pending review, exposes documented mutated synthetic fixtures proving the canonical packet remains no go and not created until every required condition is met, and renders a calm founder-facing Markdown branch creation authorisation packet. 
The helper never runs git, never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses date, random, environment, or local-time references. Eleven new unit tests pin the packet end-to-end:
- The shape test asserts the documented top-level shape and every per-field shape rule.
- The source integrity test asserts that every recorded source hash matches the on-disk source content body and that each documented synthetic mutation fails with a deterministic reason.
- The slice identity test asserts that the first slice exactly matches Phase 3E, that it is not marked started, executed, completed, authorised, rehearsed, or branched, and that the canonical branch creation status is not created.
- The branch status and naming test asserts that branch status is not created, branch authorisation is not authorised, branch decision is no go, and naming rules are required and blocking.
- The authorisation and approval test asserts that every founder, reviewer, and witness field is required, blocking, and incomplete by default.
- The evidence and hashes test asserts that the evidence completeness requirements and the source-hash verification matrix are complete and blocking.
- The command, rollback, and boundaries test asserts that branch commands are prohibited, the future branch command is only inert text, the rollback plan is required, and the no-secret, runtime, and legal and governance checks are complete and blocking.
- The decision and safety test asserts that the canonical branch creation decision is no go and cannot create a branch, start Phase 1D, authorise Phase 1D, execute the first slice, or complete the first slice.
- The Markdown rendering test asserts that the rendered Markdown carries the documented title, the real branch creation authorisation packet marker, the branch creation not authorised marker, the not a real branch marker, the not a real authorisation marker, the canonical decision no-go marker, the latest Phase 3E archive hash, the latest test totals, the first slice identity, the branch naming requirements, the founder authorisation fields, the reviewer and witness approval, the source-hash matrix, the command prohibition matrix, the rollback plan, the no-secret checks, the runtime boundary checks, the legal and governance boundary checks, the deterministic blockers, and the safety notes, and that it is byte-stable across two consecutive renders.
- The cleanliness test sweeps every founder-facing prose value through the central forbidden-wording registry.
- The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, and the absence of any clock, random, environment, absolute-path, runtime-secret, or git command reference in the packet content file.
New founder-facing documents at the docs layer add a Phase 3F document and a short addendum to the Phase 1D entry decision-record document explaining where to find the real branch creation authorisation packet. Every tracked content file product version is bumped to 0.76.0 to stay in sync with the package version, every latest approved archive block is refreshed to the approved Phase 3E archive, and the cascading hash refresh propagates from Phase 2N through Phase 3E to a fixed point.
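As an illustration of what a byte-stability assertion like the determinism test can look like, here is a small sketch. It assumes the packet is JSON-like data; `serialise` and `is_byte_stable` are hypothetical names and the canonical form chosen here (sorted keys, two-space indent, trailing newline) is an assumption, not the documented AgentProof format.

```python
import json


def serialise(packet: dict) -> bytes:
    """Serialise with sorted keys and fixed indentation so output ignores insertion order."""
    text = json.dumps(packet, sort_keys=True, ensure_ascii=False, indent=2)
    return (text + "\n").encode("utf-8")


def is_byte_stable(packet: dict) -> bool:
    """True when two serialisations and a parse round trip all yield identical bytes."""
    first = serialise(packet)
    second = serialise(packet)
    round_trip = serialise(json.loads(first.decode("utf-8")))
    return first == second == round_trip
```

Sorting keys is the design choice that matters most here: two dictionaries with the same content but different insertion order serialise to the same bytes, so a recorded hash stays valid across re-reads.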
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the friendly-label registry, the standard-copy registry, the fuzzer semantics, the contract-shape fixtures and tests, the migration rehearsal tests, the deprecation registry and optional-field coverage and future-schema-addition checklist, the audit bundle manifest, the owner map and orphan-surface guard, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report builder and renderer, the entry helper and Markdown and CSV renderers, the readiness pack index helper and Markdown renderer, the scope boundary manifest helper and Markdown renderer, the sequence manifest helper and Markdown renderer, the no-regression acceptance pack helper and Markdown renderer, the audit handoff pack helper and Markdown renderer, the final pre-authorisation checklist helper and Markdown renderer, the launch-readiness verification report helper and Markdown renderer, the frozen baseline manifest helper and Markdown renderer, the authorisation rehearsal record helper and Markdown renderer, the real-authorisation packet skeleton helper and Markdown renderer, the pre-branch final freeze certificate skeleton helper and Markdown renderer, the implementation branch preparation plan helper and Markdown renderer, the first-slice dry-run execution packet helper and Markdown renderer, the first-slice real-authorisation evidence lock helper and Markdown renderer, the first-slice final execution 
checklist and go or no-go gate helper and Markdown renderer, the first-slice execution rehearsal pack helper and Markdown renderer, and the approved sample outcomes are all unchanged. NullProvider remains the default. The thirteen scorecard golden Markdown snapshots remain byte-identical.
User impact: Founders see no on-screen change. The hardening adds the auditable real branch creation authorisation packet, which consumes the Phase 3E rehearsal pack and defines exactly what authorisation must be present before a real Phase 1D branch may ever be created in a future run. The canonical packet remains a branch creation authorisation packet pending review, with a canonical branch creation status of not created, a canonical branch authorisation status of not authorised, and a canonical final decision of no go. It never authorises Phase 1D, never creates a real branch, never executes or completes the first slice, and never runs any branch creation command. Paid or public use remains blocked until any required legal or governance review is completed. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 3F is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 3F authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3F internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-02
- Reference
- Phase 3F authorisation
- Impact assessment
- Adds a Phase 1D real branch creation authorisation packet at the static content layer, one test-side support module, one founder-facing prose document at the docs layer, and eleven new unit tests pinning: shape; source integrity; slice identity; branch status and naming; authorisation and approval; evidence and hashes; command, rollback, and boundaries; decision and safety; Markdown rendering; cleanliness; and determinism. No product features. No runtime behaviour change. The packet does not authorise any real Phase 1D work, does not create a branch, does not execute the first slice, does not complete the first slice, does not run any branch creation command, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 3G
Change date: 2026-05-02
Product version: 0.77.0
Methodology engine version: 0.9.1
Phase 1D branch creation rehearsal and pre-flight command transcript
Reason: Phase 3F closed the real branch creation authorisation packet loop. Phase 3G closes the branch creation rehearsal and pre-flight command transcript loop with one calm deterministic transcript that consumes the Phase 3F packet and rehearses the exact branch-creation command path without running git branch, git checkout dash b, git switch dash c, or any command that creates a branch. The transcript consumes the Phase 3F branch creation authorisation packet, the Phase 3E first-slice execution rehearsal pack, the Phase 3D first-slice final execution checklist and go or no-go gate, the Phase 3C real-authorisation evidence lock, the Phase 3B first-slice dry-run execution packet, the Phase 3A implementation branch preparation plan, the Phase 2Z pre-branch final freeze certificate skeleton, and the Phase 2Y real-authorisation packet skeleton. The canonical transcript remains a branch creation rehearsal transcript pending review and the canonical rehearsal status is not run and the canonical branch command status is not run and the canonical branch creation status is not created and the canonical final decision is no go. The transcript never authorises Phase 1D and never creates a real branch and never executes the first slice and never completes the first slice and never runs any branch creation command and never changes git state.
What changed: Phase 3G is a docs-side and test-side planning phase. No product features were added. A new deterministic Phase 1D branch creation rehearsal and pre-flight command transcript at the static content layer defines first-slice identity with a new branch command status of not run, rehearsal status, branch command rehearsal status, pre-flight archive checks, pre-flight source-hash checks, pre-flight branch-name checks, inert command transcript, prohibited command confirmations, reviewer checkpoints, expected evidence capture, rollback simulation, no-secret checks, runtime boundary checks, legal and governance boundary checks, deterministic blockers, synthetic rehearsal rules, canonical rehearsal decision, stop conditions, legal and governance notes, and safety markers. The transcript top-level shape adds a transcript version, a product version that matches the package version after the Phase 3G bump, a generated-from-phase identifier of three-G, a transcript type that explicitly marks the file as a Phase 1D branch creation rehearsal and pre-flight command transcript, a branch creation rehearsal transcript pending-review transcript status, a calm description, a calm reviewer-facing transcript purpose, a latest-approved-archive block carrying the verified Phase 3F archive evidence, a source-branch-authorisation-packet reference to the Phase 3F real branch creation authorisation packet, a source-rehearsal-pack reference to the Phase 3E first-slice execution rehearsal pack, a source-go-or-no-go-gate reference to the Phase 3D first-slice final execution checklist and go or no-go gate, a source-evidence-lock reference to the Phase 3C real-authorisation evidence lock, a source-dry-run-packet reference to the Phase 3B first-slice dry-run execution packet, a source-plan reference to the Phase 3A implementation branch preparation plan, a source-freeze-certificate reference to the Phase 2Z pre-branch final freeze certificate skeleton, a 
source-real-authorisation-packet reference to the Phase 2Y real-authorisation packet skeleton, a first-slice identity block referencing the same first slice as Phase 3F with execution status not executed and completion status not completed and implementation status not started and authorisation status not authorised and final gate status no go and rehearsal status not run and branch creation status not created and a new branch command status of not run, a rehearsal status block with rehearsal status not run and rehearsal result pending review and branch creation status not created and branch command status not run and branch command authorisation status not authorised and final decision no go and reviewer action required pending review, a branch command rehearsal status block with proposed branch name pending review and branch name review status pending review and command preview present true and command preview is inert text only true and command executed false and git branch run false and git checkout dash b run false and git switch dash c run false and any git state change detected false and branch created false and branch creation verified false and final branch command status not run, nine pre-flight archive checks each canonical-state pending or not verified, eight pre-flight source-hash checks each canonical-state pending or hash recorded, nine pre-flight branch-name checks each canonical-state pending or not approved, an inert command transcript with eight rows each command execution status not executed and destructive false and exposes secret false and creates branch false and changes git state false and allowed to run in Phase 3G false for any branch creation preview, eight prohibited command confirmations each canonical-state confirmed false but required to remain false, eleven reviewer checkpoints each canonical-state pending review, twelve expected evidence capture rows each canonical-state pending capture, eleven rollback simulation rows each 
canonical-state pending review, nineteen no-secret checks, fourteen runtime boundary checks, nine legal and governance boundary checks, twenty-six deterministic blockers, synthetic rehearsal rules proving the canonical transcript cannot create a branch and cannot start Phase 1D and cannot authorise Phase 1D and cannot execute the first slice and cannot complete the first slice and the canonical branch command status is not run and the canonical branch creation status is not created and the canonical final decision is no go, a canonical rehearsal decision with decision status no go and rehearsal status not run and branch command status not run and branch creation status not created and branch authorisation status not authorised and every can flag false and command executed false and branch command executed false and git state changed false and branch created false and runtime changes made false and reviewer action required pending review, thirty-six stop conditions each blocking rehearsal or branch creation or first slice execution, eleven legal and governance disclaimers, and forty-two safety markers all true. The canonical rehearsal status is not run. The canonical branch command status is not run. The canonical branch creation status is not created. The canonical final decision is no go. Every command transcript row remains preview only and not executed. Every prohibited command confirmation remains false because no such command was run. Every reviewer checkpoint defaults to pending review and blocks rehearsal pass if incomplete. Every evidence capture row defaults to pending capture and blocks rehearsal pass if missing. Every rollback simulation row defaults to pending review and blocks rehearsal pass if not met. Every no-secret check is required and blocks rehearsal if failed. Every runtime boundary check is required and blocks rehearsal if failed. Every legal and governance boundary check is required and blocks rehearsal if not met. 
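The preview-only rule for command transcript rows lends itself to a small validator. The sketch below is an assumption-laden illustration: the field names are hypothetical and simply mirror the flags named in the prose, and it applies the "not allowed to run" rule to every row, which is stricter than the text requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptRow:
    # Hypothetical field names mirroring the flags described in the prose.
    command_preview: str  # inert text only, never passed to a shell
    execution_status: str = "not_executed"
    destructive: bool = False
    exposes_secret: bool = False
    creates_branch: bool = False
    changes_git_state: bool = False
    allowed_to_run: bool = False


def row_violations(row: TranscriptRow) -> list[str]:
    """Return the reasons a row fails the preview-only rules; empty means it passes."""
    reasons = []
    if row.execution_status != "not_executed":
        reasons.append("row was executed")
    for flag in ("destructive", "exposes_secret", "creates_branch",
                 "changes_git_state", "allowed_to_run"):
        if getattr(row, flag):
            reasons.append(f"{flag} must be false")
    return reasons
```

Because the command is held only as a string and nothing in the validator invokes a subprocess, checking a row cannot change git state.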
The transcript does not authorise real Phase 1D work, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not implement any slice, does not run any git branch or git checkout dash b or git switch dash c command, does not change git state, does not change runtime behaviour, does not call Supabase, does not write to local storage, and does not replace legal review. One new test-side support module ships alongside the transcript: a deterministic helper that loads the transcript and every referenced Phase 3F, Phase 3E, Phase 3D, Phase 3C, Phase 3B, Phase 3A, Phase 2Z, and Phase 2Y artefact, computes deterministic SHA-256 hashes via the standard hash module, validates transcript source integrity, validates first-slice identity, validates rehearsal status, validates branch command rehearsal status, validates pre-flight archive checks, validates pre-flight source-hash checks, validates pre-flight branch-name checks, validates inert command transcript, validates prohibited command confirmations, validates reviewer checkpoints, validates expected evidence capture, validates rollback simulation, validates no-secret checks, validates runtime boundary checks, validates legal and governance boundary checks, validates deterministic blockers, validates synthetic rehearsal rules, validates canonical rehearsal decision, validates stop conditions, validates legal and governance notes, validates safety markers, validates the canonical transcript status remains a branch creation rehearsal transcript pending review, exposes documented mutated synthetic fixtures proving the canonical transcript remains no go and not run until every required condition is met, and renders a calm founder-facing Markdown branch creation rehearsal and pre-flight command transcript. 
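One way to read the mutated synthetic fixtures is as copies of the canonical record with a single condition changed, each of which must still yield no go until every condition is met. The following is a minimal sketch under that reading; the record keys, state strings, and the `conditions_met` outcome are all hypothetical.

```python
import copy

# Hypothetical canonical state: every reviewer checkpoint and evidence row is pending.
CANONICAL = {
    "reviewer_checkpoints": ["pending_review", "pending_review"],
    "evidence_capture": ["pending_capture"],
    "no_secret_checks_passed": True,
}


def mutate(base: dict, **changes) -> dict:
    """Return a deep-copied synthetic fixture so the canonical record is never edited."""
    fixture = copy.deepcopy(base)
    fixture.update(changes)
    return fixture


def derive_decision(record: dict) -> tuple[str, list[str]]:
    """Return ("no_go", blockers) unless every required condition is met."""
    blockers = []
    if any(state != "approved" for state in record["reviewer_checkpoints"]):
        blockers.append("reviewer checkpoint incomplete")
    if any(state != "captured" for state in record["evidence_capture"]):
        blockers.append("evidence capture missing")
    if not record["no_secret_checks_passed"]:
        blockers.append("no-secret check failed")
    return ("no_go" if blockers else "conditions_met", blockers)
```

For example, `derive_decision(mutate(CANONICAL, reviewer_checkpoints=["approved", "approved"]))` still returns no go, because the evidence row remains pending.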
The helper never runs git, never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses date, random, environment, or local-time references. Eleven new unit tests pin the transcript end-to-end:
- The shape test asserts the documented top-level shape and every per-field shape rule.
- The source integrity test asserts that every recorded source hash matches the on-disk source content body and that each documented synthetic mutation fails with a deterministic reason.
- The slice identity test asserts that the first slice exactly matches Phase 3F, that it is not marked started, executed, completed, authorised, rehearsed, branched, or command-run, and that the canonical branch command status is not run.
- The status and pre-flight test asserts that rehearsal status, branch command status, archive checks, source-hash checks, and branch-name checks are complete and blocking.
- The command transcript test asserts that every row is preview only, not executed, non-destructive, no-secret, and not branch-creating, and that git branch, checkout, and switch previews are not allowed to run in Phase 3G.
- The checkpoints, evidence, and rollback test asserts that reviewer checkpoints, evidence capture, and rollback simulation are required and pending in the canonical transcript.
- The boundaries and blockers test asserts that the no-secret, runtime, and legal and governance checks, the prohibited command confirmations, and the deterministic blockers are complete and blocking.
- The decision and safety test asserts that the canonical rehearsal decision is no go and cannot create a branch, start Phase 1D, authorise Phase 1D, execute the first slice, or complete the first slice.
- The Markdown rendering test asserts that the rendered Markdown carries the documented title, the branch creation rehearsal and pre-flight command transcript marker, the preview only marker, the not a real branch marker, the not a real authorisation marker, the command not run marker, the canonical decision no-go marker, the latest Phase 3F archive hash, the latest test totals, the first slice identity, the inert command transcript, the prohibited command confirmations, the reviewer checkpoints, the source-hash checks, the no-secret checks, the runtime boundary checks, the legal and governance boundary checks, the deterministic blockers, and the safety notes, and that it is byte-stable across two consecutive renders.
- The cleanliness test sweeps every founder-facing prose value through the central forbidden-wording registry.
- The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, and the absence of any clock, random, environment, absolute-path, runtime-secret, or git command execution reference in the transcript content file.
New founder-facing documents at the docs layer add a Phase 3G document and a short addendum to the Phase 1D entry decision-record document explaining where to find the branch creation rehearsal and pre-flight command transcript. Every tracked content file product version is bumped to 0.77.0 to stay in sync with the package version, every latest approved archive block is refreshed to the approved Phase 3F archive, and the cascading hash refresh propagates from Phase 2N through Phase 3F to a fixed point.
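The phrase "cascading hash refresh to a fixed point" can be illustrated with a toy chain in which each artefact records the hash of its predecessor. This is only a sketch of the general idea: the `previous_hash` field, the chain layout, and the refresh order are assumptions, and the real refresh may work differently.

```python
import hashlib
import json


def digest(artefact: dict) -> str:
    """Hash an artefact's canonical JSON form, including its own recorded predecessor hash."""
    payload = json.dumps(artefact, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def refresh_to_fixed_point(chain: list[dict], max_passes: int = 100) -> int:
    """Update each recorded predecessor hash in place until a pass changes nothing.

    Returns the number of passes that made at least one change.
    """
    passes = 0
    for _ in range(max_passes):
        changed = False
        # Walk newest to oldest, so a change upstream needs a later pass to reach downstream.
        for i in range(len(chain) - 1, 0, -1):
            expected = digest(chain[i - 1])
            if chain[i]["previous_hash"] != expected:
                chain[i]["previous_hash"] = expected
                changed = True
        if not changed:
            return passes
        passes += 1
    raise RuntimeError("hash refresh did not converge")
```

Updating one artefact's recorded hash changes that artefact's own digest, which is why the refresh cascades and has to be repeated until nothing moves.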
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the friendly-label registry, the standard-copy registry, the fuzzer semantics, the contract-shape fixtures and tests, the migration rehearsal tests, the deprecation registry and optional-field coverage and future-schema-addition checklist, the audit bundle manifest, the owner map and orphan-surface guard, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report builder and renderer, every prior helper, and the approved sample outcomes are all unchanged. NullProvider remains the default. The thirteen scorecard golden Markdown snapshots remain byte-identical.
User impact: Founders see no on-screen change. The hardening adds the auditable branch creation rehearsal and pre-flight command transcript, which consumes the Phase 3F branch creation authorisation packet and rehearses the exact branch-creation command path without running any git branch creation command and without creating a branch. The canonical transcript remains a branch creation rehearsal transcript pending review, with a canonical rehearsal status of not run, a canonical branch command status of not run, a canonical branch creation status of not created, and a canonical final decision of no go. It never authorises Phase 1D, never creates a real branch, never executes or completes the first slice, and never runs any branch creation command. Paid or public use remains blocked until any required legal or governance review is completed. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 3G is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 3G authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3G internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-02
- Reference
- Phase 3G authorisation
- Impact assessment
- Adds a Phase 1D branch creation rehearsal and pre-flight command transcript at the static content layer, one test-side support module, one founder-facing prose document at the docs layer, and eleven new unit tests pinning: shape; source integrity; slice identity; status and pre-flight; command transcript; checkpoints, evidence, and rollback; boundaries and blockers; decision and safety; Markdown rendering; cleanliness; and determinism. No product features. No runtime behaviour change. The transcript does not authorise any real Phase 1D work, does not create a branch, does not execute the first slice, does not complete the first slice, does not run any branch creation command, does not change git state, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 3H
Change date: 2026-05-02
Product version: 0.78.0
Methodology engine version: 0.9.1
Phase 1D final real-start authorisation packet
Reason: Phase 3G closed the branch creation rehearsal and pre-flight command transcript loop. Phase 3H closes the final real-start authorisation loop with one calm deterministic packet that consumes the Phase 3G transcript and defines the final human-controlled authorisation needed before Phase 1D may truly start in a future run. The packet consumes the Phase 3G branch creation rehearsal and pre-flight command transcript, the Phase 3F real branch creation authorisation packet, the Phase 3E first-slice execution rehearsal pack, the Phase 3D first-slice final execution checklist and go or no-go gate, the Phase 3C real-authorisation evidence lock, the Phase 3B first-slice dry-run execution packet, the Phase 3A implementation branch preparation plan, the Phase 2Z pre-branch final freeze certificate skeleton, and the Phase 2Y real-authorisation packet skeleton. The canonical packet remains a final real-start authorisation packet pending review and the canonical final start status is not started and the canonical final start authorisation status is not authorised and the canonical branch creation status is not created and the canonical branch command status is not run and the canonical final decision is no go. The packet never authorises Phase 1D and never creates a real branch and never executes the first slice and never completes the first slice and never runs any git command and never marks Phase 1D as started.
What changed: Phase 3H is a docs-side and test-side planning phase. No product features were added. A new deterministic Phase 1D final real-start authorisation packet at the static content layer defines first-slice identity with a new final start status of not started, final start authorisation status, final start preconditions, founder final-start decision fields, reviewer and witness final-start approval fields, legal and governance final-start boundary fields, evidence completeness requirements, source-hash verification matrix, allowed future start command plan as inert text only, prohibited action confirmations, rollback plan before real start, no-secret checks, runtime boundary checks, deterministic blockers, synthetic authorisation rules, canonical final-start decision, stop conditions, legal and governance notes, and safety markers. The packet top-level shape adds a final start packet version, a product version that matches the package version after the Phase 3H bump, a generated-from-phase identifier of three-H, a final start packet type that explicitly marks the file as a Phase 1D final real-start authorisation packet, a final-real-start authorisation packet pending-review packet status, a calm description, a calm reviewer-facing packet purpose, a latest-approved-archive block carrying the verified Phase 3G archive evidence, a source-branch-rehearsal-transcript reference to the Phase 3G branch creation rehearsal and pre-flight command transcript, a source-branch-authorisation-packet reference to the Phase 3F real branch creation authorisation packet, a source-rehearsal-pack reference to the Phase 3E first-slice execution rehearsal pack, a source-go-or-no-go-gate reference to the Phase 3D first-slice final execution checklist and go or no-go gate, a source-evidence-lock reference to the Phase 3C real-authorisation evidence lock, a source-dry-run-packet reference to the Phase 3B first-slice dry-run execution packet, a source-plan reference to the Phase 3A 
implementation branch preparation plan, a source-freeze-certificate reference to the Phase 2Z pre-branch final freeze certificate skeleton, a source-real-authorisation-packet reference to the Phase 2Y real-authorisation packet skeleton, a first-slice identity block referencing the same first slice as Phase 3G with execution status not executed and completion status not completed and implementation status not started and authorisation status not authorised and final gate status no go and rehearsal status not run and branch creation status not created and branch command status not run and a new final start status of not started, a final start authorisation status block with final start status not started and final start authorisation status not authorised and final start decision no go and branch creation status not created and branch command status not run and first slice execution status not executed and first slice completion status not completed and final start timestamp not filled and final start command executed false and Phase 1D marked in progress false and reviewer action required pending review, twenty-two final start preconditions each pending or not met by default, twenty-one founder final-start decision fields including a founder final-start signature canonical-state not signed and a final founder start authorisation status canonical-state not authorised, eleven reviewer and witness final-start approval fields each required and blocking, ten legal and governance final-start boundary fields with canonical final legal and governance boundary status pending review, fifteen evidence completeness requirements each canonical-state incomplete, a nine-row source-hash verification matrix for every source reference, an allowed future start command plan as inert text only with command execution status not executed and final future command status not authorised, fourteen prohibited action confirmations each canonical-state confirmed false but required to remain 
false, thirteen rollback plan rows each canonical-state pending review, nineteen no-secret checks, fourteen runtime boundary checks, twenty-six deterministic blockers, synthetic authorisation rules proving the canonical packet cannot start Phase 1D and cannot create a branch and cannot authorise Phase 1D and cannot execute the first slice and cannot complete the first slice and the canonical final start status is not started and the canonical branch creation status is not created and the canonical branch command status is not run and the canonical final decision is no go, a canonical final-start decision with decision status no go and final start status not started and final start authorisation status not authorised and branch creation status not created and branch command status not run and every can flag false and final start command executed false and Phase 1D marked in progress false and branch created false and git state changed false and first slice executed false and runtime changes made false and reviewer action required pending review, thirty-five stop conditions each blocking final start or branch creation or first slice execution, eleven legal and governance disclaimers, and forty safety markers all true. The canonical final start status is not started. The canonical final start authorisation status is not authorised. The canonical branch creation status is not created. The canonical branch command status is not run. The canonical final decision is no go. Every founder final-start decision field is canonical-state pending or not filled or not signed or not authorised. Every reviewer and witness final-start approval field is canonical-state pending or not signed. Every legal and governance final-start boundary field is canonical-state pending or not cleared or true. Every evidence completeness row is canonical-state incomplete. Every precondition is canonical-state pending or not met. 
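The row counts stated for the Phase 3H packet suggest what a shape check could look like. The sketch below takes the counts from the description above, but the key names and the list-of-rows layout are hypothetical.

```python
# Row counts as stated in the Phase 3H description; key names are hypothetical.
EXPECTED_COUNTS = {
    "final_start_preconditions": 22,
    "founder_final_start_decision_fields": 21,
    "reviewer_witness_approval_fields": 11,
    "legal_governance_boundary_fields": 10,
    "evidence_completeness_requirements": 15,
    "source_hash_verification_matrix": 9,
    "prohibited_action_confirmations": 14,
    "rollback_plan": 13,
    "no_secret_checks": 19,
    "runtime_boundary_checks": 14,
    "deterministic_blockers": 26,
    "stop_conditions": 35,
    "legal_governance_disclaimers": 11,
    "safety_markers": 40,
}


def shape_errors(packet: dict) -> list[str]:
    """Return shape problems in a fixed order; an empty list means the shape matches."""
    errors = []
    for key, expected in EXPECTED_COUNTS.items():
        actual = len(packet.get(key, []))
        if actual != expected:
            errors.append(f"{key}: expected {expected} rows, found {actual}")
    # Safety markers must all be true; prohibited-action confirmations must all stay false.
    if not all(marker is True for marker in packet.get("safety_markers", [])):
        errors.append("safety_markers: every marker must be true")
    if any(packet.get("prohibited_action_confirmations", [])):
        errors.append("prohibited_action_confirmations: every confirmation must be false")
    return errors
```

Keeping the expected counts in one table means a later phase that adds or removes a row has to change the table explicitly, so the shape test fails loudly rather than drifting.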
Every prohibited action confirmation is canonical-state false because no such action was taken. Every rollback row is canonical-state pending. Every no-secret check is required and blocks final start if failed. Every runtime boundary check is required and blocks final start if failed. The packet does not authorise real Phase 1D work, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not implement any slice, does not run any git command, does not change git state, does not change runtime behaviour, does not call Supabase, does not write to local storage, and does not replace legal review. One new test-side support module ships alongside the packet: a deterministic helper that loads the packet and every referenced Phase 3G, Phase 3F, Phase 3E, Phase 3D, Phase 3C, Phase 3B, Phase 3A, Phase 2Z, and Phase 2Y artefact, computes deterministic SHA-256 hashes via the standard hash module, validates packet source integrity, validates first-slice identity, validates final start authorisation status, validates final start preconditions, validates founder final-start decision fields, validates reviewer and witness final-start approval fields, validates legal and governance final-start boundary fields, validates evidence completeness requirements, validates source-hash verification matrix, validates allowed future start command plan, validates prohibited action confirmations, validates rollback plan before real start, validates no-secret checks, validates runtime boundary checks, validates deterministic blockers, validates synthetic authorisation rules, validates canonical final-start decision, validates stop conditions, validates legal and governance notes, validates safety markers, validates the canonical final start packet status remains a final real-start authorisation packet pending review, exposes documented mutated 
synthetic fixtures proving the canonical packet remains no go and not started until every required condition is met, and renders a calm founder-facing Markdown final real-start authorisation packet. The helper never runs git, never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses date or random or environment or local-time references. Eleven new unit tests pin the packet end-to-end. The shape test asserts the documented top-level shape and every per-field shape rule. The source integrity test asserts every recorded source hash matches the on-disk source content body and documented synthetic mutations each fail with a deterministic reason. The slice identity test asserts the first slice exactly matches Phase 3G and is not marked started or executed or completed or authorised or rehearsed or branched or command-run or final-started and the canonical final start status is not started. The status and preconditions test asserts final start authorisation status and every final start precondition are complete and required and blocking and non-authorising by default. The founder and approval test asserts every founder, reviewer, and witness field is required and blocking and incomplete by default. The legal and evidence and hash test asserts legal and governance boundary fields and evidence completeness requirements and source-hash verification matrix are complete and blocking. The command and rollback and boundary test asserts the future start command is inert and prohibited actions are confirmed false and rollback is required and no-secret and runtime checks are complete and blocking. The decision and safety test asserts the canonical final-start decision is no go and cannot start Phase 1D, create a branch, authorise Phase 1D, execute the first slice, or complete the first slice. 
The Markdown rendering test asserts the rendered Markdown carries the documented title, the final real-start authorisation packet marker, the final start not authorised marker, the Phase 1D not started marker, the not a real start marker, the not a real authorisation marker, the canonical decision no-go marker, the latest Phase 3G archive hash, the latest test totals, the first slice identity, the founder final-start fields, the reviewer and witness approval, the legal and governance boundary, the source-hash matrix, the prohibited actions, the rollback plan, the no-secret checks, the runtime boundary checks, the deterministic blockers, and the safety notes, and is byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value through the central forbidden-wording registry. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, and no clock or random or environment or absolute-path or runtime-secret or git command execution reference in the packet content file. New founder-facing documents at the docs layer add a Phase 3H document and a short addendum to the Phase 1D entry decision-record document explaining where to find the final real-start authorisation packet. Every tracked content file product version is bumped to zero point seventy-eight point zero to stay in sync with the package version, every latest approved archive block is refreshed to the approved Phase 3G archive, and the cascading hash refresh propagates from Phase 2N through Phase 3G to a fixed point. 
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the friendly-label registry, the standard-copy registry, the fuzzer semantics, the contract-shape fixtures and tests, the migration rehearsal tests, the deprecation registry and optional-field coverage and future-schema-addition checklist, the audit bundle manifest, the owner map and orphan-surface guard, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report builder and renderer, every prior helper, and the approved sample outcomes are all unchanged. NullProvider remains the default. The thirteen scorecard golden Markdown snapshots remain byte-identical.
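The source integrity check described in this entry can be sketched roughly as follows. This is a minimal illustration, not the project's helper: the function names (`sha256_hex`, `verify_sources`), the in-memory `sources` mapping, the artefact keys, and the reason strings are all hypothetical. Only the use of SHA-256 from a standard hash module, and the idea that a mutated source fails with a deterministic reason, come from the entry itself.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a source body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def verify_sources(recorded: dict, sources: dict) -> list:
    """Compare recorded hashes with freshly computed ones.

    Returns a list of failure reasons in sorted-name order; an empty list
    means every recorded hash matches its source body. Iterating in sorted
    order keeps the output independent of dict insertion order.
    """
    failures = []
    for name in sorted(recorded):
        if name not in sources:
            failures.append(f"{name}: source missing")
        elif sha256_hex(sources[name]) != recorded[name]:
            failures.append(f"{name}: hash mismatch")
    return failures


# Hypothetical stand-ins for two referenced artefacts (not real content).
sources = {"phase-3g": b"transcript body", "phase-3f": b"packet body"}
recorded = {name: sha256_hex(body) for name, body in sources.items()}

# The unmodified sources verify cleanly.
clean = verify_sources(recorded, sources)

# A synthetic mutation of one source fails with a stable, repeatable reason.
mutated = {**sources, "phase-3g": b"transcript body (edited)"}
broken = verify_sources(recorded, mutated)
```

Returning reasons as plain strings in a fixed order is one way to make a failing fixture comparable byte for byte across runs; the real helper may report failures differently.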
User impact: Founders see no on-screen change. The hardening adds the auditable final real-start authorisation packet that consumes the Phase 3G transcript and defines the final human-controlled authorisation needed before Phase 1D may truly start in a future run. The canonical packet remains a final real-start authorisation packet pending review with a canonical final start status of not started and a canonical final start authorisation status of not authorised and a canonical branch creation status of not created and a canonical branch command status of not run and a canonical final decision of no go; it never authorises Phase 1D and never creates a real branch and never executes the first slice and never completes the first slice and never runs any git command and never marks Phase 1D as started. Paid or public use remains blocked until required legal or governance review state is completed where required. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 3H is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 3H authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3H internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-02
- Reference
- Phase 3H authorisation
- Impact assessment
- Adds a Phase 1D final real-start authorisation packet at the static content layer, one test-side support module, eleven new unit tests pinning shape and source integrity and slice identity and status and preconditions and founder approval and legal and evidence and hashes and command and rollback and boundaries and decision and safety and Markdown rendering and cleanliness and determinism, and one founder-facing prose document at the docs layer. No product features. No runtime behaviour change. The packet does not authorise any real Phase 1D work, does not start Phase 1D, does not create a branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
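As an illustration only, the provenance fields listed above could be held in a small immutable record. The field names follow the `source_attribution` block described at the top of this Change Log. The class name `SourceAttribution`, the use of the displayed labels as stored values, and the abridged impact text are assumptions made for this sketch.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceAttribution:
    """Hypothetical container for one Change Log provenance block."""

    source_type: str
    source_label: str
    source_reference: str
    source_reviewed_at: str
    reviewed_by_role: str
    impact_assessment: str
    limitations: str
    source_published_at: Optional[str] = None  # recorded only when known
    source_url: Optional[str] = None  # recorded only when a real reference exists


# Values copied from the Phase 3H provenance block; the impact text is abridged.
entry_3h = SourceAttribution(
    source_type="Internal methodology record",
    source_label="AgentProof Phase 3H internal methodology record",
    source_reference="Phase 3H authorisation",
    source_reviewed_at="2026-05-02",
    reviewed_by_role="Methodology owner",
    impact_assessment="No product features. No runtime behaviour change. (abridged)",
    limitations=(
        "No external source was reviewed for this entry. "
        "Internal methodology development only."
    ),
)

# Plain-dict form, e.g. for serialising alongside the entry.
record = asdict(entry_3h)
```

Leaving `source_url` and `source_published_at` unset mirrors the rule that they appear only when a real reference or a known publication date exists.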
Change id: 3I
Change date: 2026-05-02
Product version: 0.79.0
Methodology engine version: 0.9.1
Phase 1D final real-start rehearsal and no-op start transcript
Reason: Phase 3H closed the final real-start authorisation packet loop. Phase 3I closes the final real-start rehearsal and no-op start transcript loop with one calm deterministic transcript that consumes the Phase 3H packet and rehearses the final-start decision path as a no-op without starting Phase 1D, without creating a branch, without running git, and without executing the first slice. The transcript consumes the Phase 3H final real-start authorisation packet, the Phase 3G branch creation rehearsal and pre-flight command transcript, the Phase 3F real branch creation authorisation packet, the Phase 3E first-slice execution rehearsal pack, the Phase 3D first-slice final execution checklist and go or no-go gate, the Phase 3C real-authorisation evidence lock, the Phase 3B first-slice dry-run execution packet, the Phase 3A implementation branch preparation plan, the Phase 2Z pre-branch final freeze certificate skeleton, and the Phase 2Y real-authorisation packet skeleton. The canonical transcript remains a final-real-start rehearsal no-op transcript pending review and the canonical no-op rehearsal status is not run and the canonical final start status is not started and the canonical branch creation status is not created and the canonical branch command status is not run and the canonical final decision is no go. The transcript never authorises Phase 1D and never starts Phase 1D and never creates a real branch and never executes the first slice and never completes the first slice and never runs any git command and never marks Phase 1D as started and never claims final start approval.
What changed: Phase 3I is a docs-side and test-side verification phase. No product features were added. A new deterministic Phase 1D final real-start rehearsal and no-op start transcript at the static content layer defines first-slice identity with a new no-op rehearsal status of not run, no-op rehearsal status block, no-op command transcript, prohibited action confirmations, reviewer checkpoints, expected evidence capture, rollback simulation, no-secret checks, runtime boundary checks, deterministic blockers, synthetic rehearsal rules, canonical no-op rehearsal decision, stop conditions, legal and governance notes, and safety markers. The transcript top-level shape adds a transcript version, a product version that matches the package version after the Phase 3I bump, a generated-from-phase identifier of three-I, a transcript type that explicitly marks the file as a Phase 1D final real-start rehearsal and no-op start transcript, a final-real-start rehearsal no-op transcript pending-review transcript status, a calm description, a calm reviewer-facing transcript purpose, a latest-approved-archive block carrying the verified Phase 3H archive evidence, ten source references covering Phase 3H through Phase 2Y, a first-slice identity block referencing the same first slice as Phase 3H with execution status not executed and completion status not completed and implementation status not started and authorisation status not authorised and final gate status no go and rehearsal status not run and branch creation status not created and branch command status not run and final start status not started and a new no-op rehearsal status of not run, a no-op rehearsal status block with no-op rehearsal status not run and no-op rehearsal result pending review and final start status not started and final start authorisation status not authorised and branch creation status not created and branch command status not run and first slice execution status not executed and final decision no go 
and reviewer action required pending review, an inert no-op command transcript with eight rows each command execution status not executed and noop true and destructive false and exposes secret false and creates branch false and changes git state false and starts phase 1d false and runs git false, fourteen prohibited action confirmations each canonical-state confirmed true (action did not occur), eleven reviewer checkpoints each canonical-state pending review, twelve expected evidence capture rows each canonical-state pending capture, eleven rollback simulation rows each canonical-state pending review, nineteen no-secret checks, fourteen runtime boundary checks, twenty-six deterministic blockers, synthetic rehearsal rules proving the canonical transcript cannot start Phase 1D and cannot create a branch and cannot authorise Phase 1D and cannot execute the first slice and cannot complete the first slice and the canonical no-op rehearsal status is not run and the canonical final start status is not started and the canonical final decision is no go, a canonical no-op rehearsal decision with decision status no go and no-op rehearsal status not run and final start status not started and final start authorisation status not authorised and branch creation status not created and branch command status not run and every can flag false and command executed false and branch command executed false and git state changed false and branch created false and first slice executed false and Phase 1D marked in progress false and runtime changes made false and reviewer action required pending review, thirty-nine stop conditions each blocking no-op rehearsal or final start or branch creation or first slice execution, eleven legal and governance disclaimers, and fifty-one safety markers all true. The canonical no-op rehearsal status is not run. The canonical final start status is not started. The canonical branch creation status is not created. 
The canonical branch command status is not run. The canonical final decision is no go. Every command transcript row remains no-op only and not executed and non-destructive and not branch-creating and not git-state-changing and not Phase-1D-starting and not git-running. Every prohibited action confirmation remains true because no such action occurred. Every reviewer checkpoint defaults to pending review and blocks rehearsal pass if incomplete. Every evidence capture row defaults to pending capture and blocks rehearsal pass if missing. Every rollback simulation row defaults to pending review and blocks rehearsal pass if not met. Every no-secret check is required and blocks rehearsal if failed. Every runtime boundary check is required and blocks rehearsal if failed. The transcript does not authorise real Phase 1D work, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not implement any slice, does not run any git command, does not change git state, does not change runtime behaviour, does not call Supabase, does not write to local storage, does not claim final start approval, and does not replace legal review. 
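The inert no-op command transcript rows described above lend themselves to a simple per-row check. The sketch below is an assumption-laden illustration: the `CommandRow` class, its attribute names, the `row_problems` function, and the row labels are hypothetical, chosen to mirror the flags named in this entry (not executed, noop true, and every risk flag false).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CommandRow:
    """One row of a hypothetical no-op command transcript."""

    label: str
    execution_status: str = "not executed"
    noop: bool = True
    destructive: bool = False
    exposes_secret: bool = False
    creates_branch: bool = False
    changes_git_state: bool = False
    starts_phase_1d: bool = False
    runs_git: bool = False


# Flags that must stay false on every canonical row.
MUST_BE_FALSE = (
    "destructive",
    "exposes_secret",
    "creates_branch",
    "changes_git_state",
    "starts_phase_1d",
    "runs_git",
)


def row_problems(row: CommandRow) -> list:
    """Return the reasons a row is not inert, in a fixed order."""
    problems = []
    if row.execution_status != "not executed":
        problems.append("execution_status is not 'not executed'")
    if not row.noop:
        problems.append("noop is not true")
    for flag in MUST_BE_FALSE:
        if getattr(row, flag):
            problems.append(f"{flag} is not false")
    return problems


# Eight canonical rows, as in the entry; the labels are placeholders.
rows = [CommandRow(label=f"step-{i}") for i in range(1, 9)]
all_inert = all(row_problems(row) == [] for row in rows)

# A synthetic mutation that flips one flag is reported deterministically.
mutated = replace(rows[0], runs_git=True)
mutated_problems = row_problems(mutated)
```

Checking the flags in a fixed tuple order means a mutated fixture always yields the same problem list, which is what makes the failure reason stable across runs.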
One new test-side support module ships alongside the transcript: a deterministic helper that loads the transcript and every referenced Phase 3H, 3G, 3F, 3E, 3D, 3C, 3B, 3A, 2Z, and 2Y artefact and computes deterministic SHA-256 hashes via the standard hash module. It validates transcript source integrity, first-slice identity, the no-op rehearsal status, the no-op command transcript, prohibited action confirmations, reviewer checkpoints, evidence capture, rollback simulation, no-secret checks, runtime boundary checks, deterministic blockers, synthetic rehearsal rules, the canonical no-op rehearsal decision, stop conditions, legal and governance notes, and safety markers, and it validates that the canonical transcript status remains a final-real-start rehearsal no-op transcript pending review. It also exposes documented mutated synthetic fixtures proving the canonical transcript remains no go and not run until every required condition is met, and renders a calm founder-facing Markdown final real-start rehearsal and no-op start transcript. The helper never runs git, never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses date, random, environment, or local-time references.
Eleven new unit tests pin the transcript end-to-end. The shape test asserts the documented top-level shape and every per-field shape rule. The source integrity test asserts that every recorded source hash matches the on-disk source content body and that each documented synthetic mutation fails with a deterministic reason. The slice identity test asserts that the first slice exactly matches Phase 3H, that it is not marked started, executed, completed, authorised, rehearsed, branched, command-run, final-started, or no-op-run, and that the canonical no-op rehearsal status is not run. The status and command-transcript test asserts that the no-op rehearsal status block and every no-op command transcript row are complete, required, blocking, and non-authorising by default. The prohibited and rollback test asserts that every prohibited action confirmation is canonical-state true and every rollback row is canonical-state pending review. The evidence and checkpoints test asserts that every evidence capture row and every reviewer checkpoint are required and pending in the canonical transcript. The boundaries and blockers test asserts that the no-secret checks, the runtime boundary checks, and the deterministic blockers are complete and blocking. The decision and safety test asserts that the canonical no-op rehearsal decision is no go and cannot start Phase 1D, create a branch, authorise Phase 1D, execute the first slice, or complete the first slice. The Markdown rendering test asserts that the rendered Markdown carries the documented title, the final real-start rehearsal and no-op start transcript marker, the no-op rehearsal not run marker, the final start not given marker, the Phase 1D not started marker, the not a real start marker, the not a real authorisation marker, the canonical decision no-go marker, the latest Phase 3H archive hash, the latest test totals, the first slice identity, the no-op command transcript, the prohibited action confirmations, the reviewer checkpoints, the no-secret checks, the runtime boundary checks, the deterministic blockers, and the safety notes, and that it is byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value through the central forbidden-wording registry. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, and the absence of any clock, random, environment, absolute-path, runtime-secret, or git command execution reference in the transcript content file.
New founder-facing documents at the docs layer add a Phase 3I document and a short addendum to the Phase 1D entry decision-record document explaining where to find the final real-start rehearsal and no-op start transcript. Every tracked content file product version is bumped to zero point seventy-nine point zero to stay in sync with the package version, every latest approved archive block is refreshed to the approved Phase 3H archive, and the cascading hash refresh propagates from Phase 2N through Phase 3H to a fixed point. The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the friendly-label registry, the standard-copy registry, the fuzzer semantics, the contract-shape fixtures and tests, the migration rehearsal tests, the deprecation registry and optional-field coverage and future-schema-addition checklist, the audit bundle manifest, the owner map and orphan-surface guard, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report builder and renderer, every prior helper, and the approved sample outcomes are all unchanged. NullProvider remains the default. The thirteen scorecard golden Markdown snapshots remain byte-identical.
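The cascading hash refresh mentioned above can be pictured as a loop that keeps updating recorded source hashes until a full pass changes nothing. The sketch below is a toy model under stated assumptions: the artefact layout (`product_version` plus a `source_hashes` mapping), the three-artefact chain, and the function names are hypothetical, and the loop only terminates if the references form no cycle.

```python
import hashlib
import json


def artefact_hash(artefact: dict) -> str:
    """Hash an artefact's canonical JSON form (sorted keys, fixed separators)."""
    canonical = json.dumps(artefact, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def refresh_to_fixed_point(artefacts: dict) -> int:
    """Refresh recorded source hashes in place until nothing changes.

    Returns the number of passes that changed at least one hash.
    Assumes the source references are acyclic; a cycle would never settle.
    """
    passes = 0
    while True:
        changed = False
        for artefact in artefacts.values():
            recorded = artefact["source_hashes"]
            for source_name in recorded:
                current = artefact_hash(artefacts[source_name])
                if recorded[source_name] != current:
                    recorded[source_name] = current
                    changed = True
        if not changed:
            return passes
        passes += 1


def is_consistent(artefacts: dict) -> bool:
    """True when every recorded hash equals the current hash of its source."""
    return all(
        recorded == artefact_hash(artefacts[name])
        for artefact in artefacts.values()
        for name, recorded in artefact["source_hashes"].items()
    )


# Toy chain: each artefact records the hash of the one it consumes.
artefacts = {
    "phase-3h": {"product_version": "0.78.0", "source_hashes": {"phase-3g": ""}},
    "phase-3g": {"product_version": "0.78.0", "source_hashes": {"phase-3f": ""}},
    "phase-3f": {"product_version": "0.78.0", "source_hashes": {}},
}
refresh_to_fixed_point(artefacts)

# A version bump changes every body, so the recorded hashes go stale...
for artefact in artefacts.values():
    artefact["product_version"] = "0.79.0"
stale_after_bump = not is_consistent(artefacts)

# ...and another refresh cascades the new hashes down the chain.
refresh_to_fixed_point(artefacts)
```

Because refreshing a recorded hash changes the artefact that holds it, one change upstream can force several passes downstream; stopping only when a whole pass is quiet is what "to a fixed point" means in this sketch.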
User impact: Founders see no on-screen change. The hardening adds the auditable final real-start rehearsal and no-op start transcript that consumes the Phase 3H packet and rehearses the final-start decision path as a no-op without starting Phase 1D, without creating a branch, without running git, and without executing the first slice. The canonical transcript remains a final-real-start rehearsal no-op transcript pending review with a canonical no-op rehearsal status of not run and a canonical final start status of not started and a canonical final decision of no go; it never authorises Phase 1D and never starts Phase 1D and never creates a real branch and never executes the first slice and never completes the first slice and never runs any git command and never claims final start approval. Paid or public use remains blocked until required legal or governance review state is completed where required. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 3I is docs-side and test-side verification only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 3I authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3I internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-02
- Reference
- Phase 3I authorisation
- Impact assessment
- Adds a Phase 1D final real-start rehearsal and no-op start transcript at the static content layer, one test-side support module, eleven new unit tests pinning shape and source integrity and slice identity and status and command transcript and prohibited and rollback and evidence and checkpoints and boundaries and blockers and decision and safety and Markdown rendering and cleanliness and determinism, and one founder-facing prose document at the docs layer. No product features. No runtime behaviour change. The transcript does not authorise any real Phase 1D work, does not start Phase 1D, does not create a branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not claim final start approval, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 3J
Change date: 2026-05-02
Product version: 0.80.0
Methodology engine version: 0.9.1
Phase 3J — Phase 1D final real-start go-or-no-go gate skeleton (docs-side / test-side planning)
Reason: Phase 3J adds the auditable Phase 1D final real-start go-or-no-go gate skeleton that consumes the Phase 3I no-op rehearsal transcript and the Phase 3H final real-start authorisation packet and records the future final reviewer go-or-no-go decision as a calm pending-review skeleton without starting Phase 1D, without creating a branch, without running git, and without executing the first slice. The canonical skeleton defaults to FINAL START GO-OR-NO-GO PENDING REVIEW, FINAL START NOT GIVEN, PHASE 1D NOT STARTED, BRANCH NOT CREATED, NO-GO. The skeleton is a verification artefact only and does not authorise Phase 1D, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, and does not claim Phase 1D completion.
What changed: Phase 3J is a docs-side and test-side planning phase. No product features were added. A new deterministic Phase 1D final real-start go-or-no-go gate skeleton at the static content layer defines first-slice identity with a new go-or-no-go gate skeleton status of pending review, a final-start go-or-no-go status block, twenty-eight final checklist rows, twenty-two final-start decision inputs, twenty-two founder final-start signoff fields, twelve reviewer approval rows, eleven legal and governance boundary checks, sixteen evidence completeness rows, ten source-hash verification rows, fourteen rollback readiness rows, fifteen prohibited action confirmations, nineteen no-secret checks, fourteen runtime boundary checks, twenty-eight deterministic blockers, synthetic gate rules, a canonical final-start go-or-no-go decision, forty-one stop conditions, eleven legal and governance disclaimers, and fifty-five safety markers all true. The skeleton top-level shape adds a gate skeleton version, a product version that matches the package version after the Phase 3J bump, a generated-from-phase identifier of three-J, a gate skeleton type that explicitly marks the file as a Phase 1D final real-start go-or-no-go gate skeleton, a final-real-start go-or-no-go gate skeleton pending-review gate skeleton status, a calm description, a calm reviewer-facing gate skeleton purpose, a latest-approved-archive block carrying the verified Phase 3I archive evidence, eleven source references covering Phase 3I through Phase 2Y, a first-slice identity block referencing the same first slice as Phase 3I with execution status not executed and completion status not completed and implementation status not started and authorisation status not authorised and final gate status no go and rehearsal status not run and branch creation status not created and branch command status not run and final start status not started and no-op rehearsal status not run and a new go-or-no-go gate skeleton status of 
pending review, a final-start go-or-no-go status block with final-start go-or-no-go status pending review and final start status not started and final start authorisation status not authorised and branch creation status not created and branch command status not run and first slice execution status not executed and first slice completion status not completed and final decision no go and reviewer action required pending review, an inert final checklist with twenty-eight rows each canonical-state pending review and required state for go verified and blocking, twenty-two final-start decision inputs each blocking, twenty-two founder final-start signoff fields each blocking, twelve reviewer approval rows each canonical-state pending review and blocking, eleven legal and governance boundary checks each canonical-state pending review and blocking, sixteen evidence completeness rows each canonical-state incomplete and blocking, ten source-hash verification rows each verification status hash recorded and blocking, fourteen rollback readiness rows each canonical-state not ready and blocking, fifteen prohibited action confirmations each canonical-state confirmed true and blocking gate skeleton if false, nineteen no-secret checks, fourteen runtime boundary checks, twenty-eight deterministic blockers, synthetic gate rules proving the canonical gate skeleton cannot create a branch and cannot start Phase 1D and cannot authorise Phase 1D and cannot execute the first slice and cannot complete the first slice and cannot run any git command and cannot change git state and cannot mark Phase 1D in progress and cannot claim final start approval and the canonical final decision is no go, a canonical final-start go-or-no-go decision with decision status no go and final-start go-or-no-go status pending review and final start status not started and final start authorisation status not authorised and branch creation status not created and branch command status not run and first slice 
execution status not executed and first slice completion status not completed and every can flag false and final start command executed false and branch command executed false and git state changed false and branch created false and first slice executed false and Phase 1D marked in progress false and runtime changes made false and reviewer action required pending review, forty-one stop conditions each blocking Phase 1D real start, eleven legal and governance disclaimers, and fifty-five safety markers all true. The canonical gate skeleton status is pending review. The canonical final start status is not started. The canonical branch creation status is not created. The canonical branch command status is not run. The canonical final decision is no go. Every checklist row remains pending review and required for go and blocking. Every decision input remains canonical-state not filled or pending and blocking. Every founder signoff field remains canonical-state not filled or not signed or not authorised or not acknowledged and blocking. Every reviewer approval remains pending review and blocking. Every legal and governance boundary check remains pending review and blocking. Every evidence completeness row remains incomplete and blocking. Every source-hash row remains hash recorded and blocking if stale. Every rollback readiness row remains not ready and blocking. Every prohibited action confirmation remains canonical-state true. Every no-secret check is required and blocks go if failed. Every runtime boundary check is required and blocks go if failed. 
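The blocking behaviour of the checklist rows described above can be sketched as a small evaluation function. This is a hedged illustration, not the skeleton's actual logic: the function name `evaluate_gate`, the row identifiers, and the shape of the returned dictionary are hypothetical. Whether the real helper ever returns a go outcome is not stated in this entry; here only a fully verified synthetic fixture does.

```python
def evaluate_gate(rows: dict, required_state: str = "verified") -> dict:
    """Return a decision plus the sorted list of rows that still block go.

    Any row whose state differs from the required state is a blocker, so
    the decision is "no go" unless every row has reached that state.
    """
    blockers = sorted(name for name, state in rows.items() if state != required_state)
    return {"decision": "no go" if blockers else "go", "blockers": blockers}


# Canonical state: twenty-eight checklist rows, all pending review.
canonical = {f"checklist-{i:02d}": "pending review" for i in range(1, 29)}
canonical_result = evaluate_gate(canonical)

# Synthetic fixture with every row verified (illustrative only).
all_verified = {name: "verified" for name in canonical}
synthetic_result = evaluate_gate(all_verified)

# Reverting a single row is enough to block the decision again.
one_reverted = {**all_verified, "checklist-07": "pending review"}
reverted_result = evaluate_gate(one_reverted)
```

Sorting the blockers keeps the result identical across runs regardless of how the rows were loaded, in line with the deterministic-blocker wording of the entry.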
The skeleton does not authorise real Phase 1D work, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not implement any slice, does not run any git command, does not change git state, does not change runtime behaviour, does not call Supabase, does not write to local storage, does not claim final start approval, does not claim Phase 1D completion, and does not replace legal review. One new test-side support module ships alongside the skeleton: a deterministic helper that loads the skeleton and every referenced Phase 3I, 3H, 3G, 3F, 3E, 3D, 3C, 3B, 3A, 2Z, and 2Y artefact, computes deterministic SHA-256 hashes via the standard hash module, validates skeleton source integrity, validates first-slice identity, validates final start go-or-no-go status, validates final checklist rows, validates final start decision inputs, validates founder final-start signoff fields, validates reviewer approval matrix, validates legal and governance boundary checks, validates evidence completeness matrix, validates source-hash verification matrix, validates rollback readiness matrix, validates prohibited action confirmations, validates no-secret checks, validates runtime boundary checks, validates deterministic blockers, validates synthetic gate rules, validates canonical final start go-or-no-go decision, validates stop conditions, validates legal and governance notes, validates safety markers, validates the canonical gate skeleton status remains a final-real-start go-or-no-go gate skeleton pending review, exposes documented mutated synthetic fixtures proving the canonical skeleton remains no go and pending review until every required condition is met, and renders a calm founder-facing Markdown final real-start go-or-no-go gate skeleton. 
The helper never runs git, never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses date or random or environment or local-time references. Eleven new unit tests pin the skeleton end-to-end. The shape test asserts the documented top-level shape and every per-field shape rule. The source integrity test asserts every recorded source hash matches the on-disk source content body and documented synthetic mutations each fail with a deterministic reason. The slice identity test asserts the first slice exactly matches Phase 3I and is not marked started or executed or completed or authorised or rehearsed or branched or command-run or final-started or no-op-run and the canonical go-or-no-go gate skeleton status is pending review. The status and decision inputs test asserts the final-start go-or-no-go status block and every checklist row and every decision input are complete and required and blocking and non-authorising by default. The signoff and approval test asserts every founder signoff field and every reviewer approval and every legal and governance boundary check is present and blocking. The evidence and rollback test asserts every evidence completeness row and every rollback readiness row and every source-hash row is present and required and blocking. The prohibited and boundaries test asserts every prohibited action confirmation is canonical-state true and every no-secret check is blocking and every runtime boundary check is pending review and blocking and every deterministic blocker is active and blocks go. The decision and safety test asserts the canonical final-start go-or-no-go decision is no go and cannot create a branch, start Phase 1D, authorise Phase 1D, execute the first slice, or complete the first slice. 
The Markdown rendering test asserts the rendered Markdown carries the documented title, the final real-start go-or-no-go gate skeleton marker, the gate skeleton only marker, the pending review marker, the Phase 1D not started marker, the not a real start marker, the not a real authorisation marker, the canonical decision no-go marker, the latest Phase 3I archive hash, the latest test totals, the first slice identity, the founder signoff section, the reviewer approval matrix, the legal and governance boundary checks, the source-hash matrix, the prohibited actions, the rollback plan, the no-secret checks, the runtime boundary checks, the deterministic blockers, and the safety notes, and is byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value through the central forbidden-wording registry. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, and no clock or random or environment or absolute-path or runtime-secret or git command execution reference in the skeleton content file. New founder-facing documents at the docs layer add a Phase 3J document and a short addendum to the Phase 1D entry decision-record document explaining where to find the final real-start go-or-no-go gate skeleton. Every tracked content file product version is bumped to zero point eighty point zero to stay in sync with the package version, every latest approved archive block is refreshed to the approved Phase 3I archive, and the cascading hash refresh propagates from Phase 2N through Phase 3I to a fixed point. 
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the friendly-label registry, the standard-copy registry, the fuzzer semantics, the contract-shape fixtures and tests, the migration rehearsal tests, the deprecation registry and optional-field coverage and future-schema-addition checklist, the audit bundle manifest, the owner map and orphan-surface guard, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report builder and renderer, every prior helper, and the approved sample outcomes are all unchanged. NullProvider remains the default. The thirteen scorecard golden Markdown snapshots remain byte-identical.
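As an illustration of the source-integrity check described in this entry, here is a minimal TypeScript sketch. It is not the project's actual helper. It assumes Node's built-in `node:crypto` stands in for the "standard hash module"; the names `SourceHashRow`, `sha256Hex`, and `verifySourceHashes`, the field names, and the `phase-3i` identifier are all hypothetical. Source bodies are passed in as strings, so the sketch itself reads no files and uses no clock, random, or environment reference.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of one source-hash verification row; the real
// skeleton's field names are not shown in this change log.
interface SourceHashRow {
  sourceId: string;
  recordedSha256: string;
}

type IntegrityResult = { ok: true } | { ok: false; reason: string };

// Hash a content body as UTF-8 and return lowercase hex.
function sha256Hex(body: string): string {
  return createHash("sha256").update(body, "utf8").digest("hex");
}

// Compare every recorded hash with the hash of the supplied body.
// Returns the first failure with a fixed, repeatable reason string.
function verifySourceHashes(
  rows: readonly SourceHashRow[],
  bodies: ReadonlyMap<string, string>,
): IntegrityResult {
  for (const row of rows) {
    const body = bodies.get(row.sourceId);
    if (body === undefined) {
      return { ok: false, reason: `missing source: ${row.sourceId}` };
    }
    if (sha256Hex(body) !== row.recordedSha256) {
      return { ok: false, reason: `stale hash: ${row.sourceId}` };
    }
  }
  return { ok: true };
}

// Illustrative data only.
const originalBody = '{"decision":"no go"}';
const rows: SourceHashRow[] = [
  { sourceId: "phase-3i", recordedSha256: sha256Hex(originalBody) },
];
const bodies = new Map([["phase-3i", originalBody]]);

// Synthetic mutation: change the body and confirm the check fails.
const mutated = new Map([["phase-3i", '{"decision":"go"}']]);

console.log(verifySourceHashes(rows, bodies));
console.log(verifySourceHashes(rows, mutated));
```

Returning a reason string instead of throwing keeps the failure message itself testable, which is one way a test could pin that each synthetic mutation "fails with a deterministic reason".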
User impact: None at runtime. The methodology summary is bumped to product version 0.80.0 to keep methodology tracking in sync with this Phase 3J docs-side / test-side planning gate skeleton.
When to re-score: Phase 3J adds no scoring change, no engine change, no rendered-output change, and no runtime change. Existing scorecards remain valid.
Docs | Tests | Content | Methodology
Evidence trace: docs/phase 3 j.md, docs/phase plan.md (Phase 3J addendum), docs/phase 1 d entry decision record.md (Phase 3J addendum), README.md (Phase 3J shipped row), content/phase 1d final real start go no go gate skeleton.v1.json, an internal source file, tests/unit/phase_3j_*.test.ts.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3J internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-02
- Reference
- Phase 3J authorisation
- Impact assessment
- Adds a Phase 1D final real-start go-or-no-go gate skeleton at the static content layer, one test-side support module, eleven new unit tests pinning shape and source integrity and slice identity and status and decision inputs and signoff and approval and evidence and rollback and prohibited and boundaries and decision and safety and Markdown rendering and cleanliness and determinism, and one founder-facing prose document at the docs layer. No product features. No runtime behaviour change. The skeleton does not authorise any real Phase 1D work, does not start Phase 1D, does not create a branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, does not claim Phase 1D completion, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 3K | Change date: 2026-05-02 | Product version: 0.81.0 | Methodology engine version: 0.9.1
Phase 3K — Phase 1D final real-start authorisation witness packet skeleton (docs-side / test-side planning)
Reason: Phase 3K adds the auditable Phase 1D final real-start authorisation witness packet skeleton that consumes the Phase 3J go-or-no-go gate skeleton and the Phase 3I no-op rehearsal transcript and records the witness signature, witness acknowledgement, and witness boundary fields as a calm pending-review skeleton without starting Phase 1D, without creating a branch, without running git, and without executing the first slice. The canonical skeleton defaults to WITNESS PACKET SKELETON PENDING REVIEW, WITNESS NOT SIGNED, WITNESS NOT ACKNOWLEDGED, FINAL START NOT GIVEN, PHASE 1D NOT STARTED, BRANCH NOT CREATED, NO-GO. The skeleton is a verification artefact only and does not authorise Phase 1D, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, and does not claim witness acknowledged.
What changed: Phase 3K is a docs-side and test-side planning phase. No product features were added. A new deterministic Phase 1D final real-start authorisation witness packet skeleton at the static content layer defines first-slice identity with a new witness packet skeleton status of pending review, a witness identity block, twenty-four witness signoff fields, fourteen witness acknowledgement matrix rows, twelve witness legal and governance boundary checks, seventeen witness evidence completeness rows, eleven source-hash verification rows, fourteen witness rollback acknowledgement rows, fifteen prohibited action confirmations, nineteen no-secret checks, fourteen runtime boundary checks, thirty deterministic blockers, synthetic witness rules, a canonical witness decision, forty-three stop conditions, eleven legal and governance disclaimers, and fifty-seven safety markers all true. The skeleton top-level shape adds a witness packet skeleton version, a product version that matches the package version after the Phase 3K bump, a generated-from-phase identifier of three-K, a witness packet skeleton type that explicitly marks the file as a Phase 1D final real-start authorisation witness packet skeleton, a final-real-start authorisation witness packet skeleton pending-review witness packet skeleton status, a calm description, a calm reviewer-facing witness packet skeleton purpose, a latest-approved-archive block carrying the verified Phase 3J archive evidence with archive-evidence product version zero point eighty point zero, twelve source references covering Phase 3J through Phase 2Y, a first-slice identity block referencing the same first slice as Phase 3J with execution status not executed and completion status not completed and implementation status not started and authorisation status not authorised and final gate status no go and rehearsal status not run and branch creation status not created and branch command status not run and final start status not started and 
no-op rehearsal status not run and go-or-no-go gate skeleton status pending review and a new witness packet skeleton status of pending review, a witness identity block with witness role required and witness independence required and witness independence canonical state pending review and witness signed canonical state not signed and witness acknowledged canonical state not acknowledged and witness final start acknowledgement status not acknowledged and witness action required pending review, twenty-four witness signoff fields each canonical-state not filled or not signed or not acknowledged and blocking, fourteen witness acknowledgement matrix rows each canonical-state pending review and blocking, twelve witness legal and governance boundary checks each canonical-state pending review and blocking, seventeen witness evidence completeness rows each canonical-state incomplete and blocking, eleven source-hash verification rows each verification status hash recorded and blocking, fourteen witness rollback acknowledgement rows each canonical-state pending review and blocking, fifteen prohibited action confirmations each canonical-state confirmed true and blocking witness packet skeleton if false, nineteen no-secret checks, fourteen runtime boundary checks, thirty deterministic blockers, synthetic witness rules proving the canonical witness packet skeleton cannot create a branch and cannot start Phase 1D and cannot authorise Phase 1D and cannot execute the first slice and cannot complete the first slice and cannot run any git command and cannot change git state and cannot mark Phase 1D in progress and cannot claim final start approval and cannot claim witness signed and cannot claim witness acknowledged and the canonical final decision is no go, a canonical witness decision with decision status no go and witness packet skeleton status pending review and witness signed status not signed and witness acknowledged status not acknowledged and witness independence confirmed 
status pending review and final start status not started and final start authorisation status not authorised and branch creation status not created and branch command status not run and first slice execution status not executed and first slice completion status not completed and every can flag false and final start command executed false and branch command executed false and git state changed false and branch created false and first slice executed false and Phase 1D marked in progress false and runtime changes made false and witness action required pending review, forty-three stop conditions each blocking Phase 1D real start, eleven legal and governance disclaimers, and fifty-seven safety markers all true. The canonical witness packet skeleton status is pending review. The canonical witness signed status is not signed. The canonical witness acknowledged status is not acknowledged. The canonical final start status is not started. The canonical branch creation status is not created. The canonical branch command status is not run. The canonical final decision is no go. Every witness signoff field remains canonical-state not filled or not signed or not acknowledged and blocking. Every witness acknowledgement remains pending review and blocking. Every witness legal and governance boundary check remains pending review and blocking. Every witness evidence completeness row remains incomplete and blocking. Every source-hash row remains hash recorded and blocking if stale. Every witness rollback acknowledgement row remains pending review and blocking. Every prohibited action confirmation remains canonical-state true. Every no-secret check is required and blocks witness signoff if failed. Every runtime boundary check is required and blocks witness signoff if failed. 
The skeleton does not authorise real Phase 1D work, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not implement any slice, does not run any git command, does not change git state, does not change runtime behaviour, does not call Supabase, does not write to local storage, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, and does not replace legal review. One new test-side support module ships alongside the skeleton: a deterministic helper that loads the skeleton and every referenced Phase 3J through Phase 2Y artefact, computes deterministic SHA-256 hashes via the standard hash module, validates skeleton source integrity, validates first-slice identity, validates witness identity, validates witness signoff fields, validates witness acknowledgement matrix, validates witness legal and governance boundary checks, validates witness evidence completeness matrix, validates source-hash verification matrix, validates witness rollback acknowledgement matrix, validates prohibited action confirmations, validates no-secret checks, validates runtime boundary checks, validates deterministic blockers, validates synthetic witness rules, validates canonical witness decision, validates stop conditions, validates legal and governance notes, validates safety markers, validates the canonical witness packet skeleton status remains a final-real-start authorisation witness packet skeleton pending review, exposes documented mutated synthetic fixtures proving the canonical skeleton remains no go and pending review until every required condition is met, and renders a calm founder-facing Markdown final real-start authorisation witness packet skeleton. 
The helper never runs git, never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses date or random or environment or local-time references. Eleven new unit tests pin the skeleton end-to-end. The shape test asserts the documented top-level shape and every per-field shape rule. The source integrity test asserts every recorded source hash matches the on-disk source content body and documented synthetic mutations each fail with a deterministic reason. The slice identity test asserts the first slice exactly matches Phase 3J and is not marked started or executed or completed or authorised or rehearsed or branched or command-run or final-started or no-op-run and the canonical witness packet skeleton status is pending review. The witness identity status test asserts the witness identity block remains pending review and not signed and not acknowledged and the canonical witness decision witness-specific fields remain non-authorising. The signoff and acknowledgement test asserts every witness signoff field and every witness acknowledgement and every witness legal and governance boundary check is present and blocking. The evidence and rollback test asserts every witness evidence completeness row and every witness rollback acknowledgement row and every source-hash row is present and required and blocking. The prohibited and boundaries test asserts every prohibited action confirmation is canonical-state true and every no-secret check is blocking and every runtime boundary check is pending review and blocking and every deterministic blocker is active and blocks witness signoff. The decision and safety test asserts the canonical witness decision is no go and cannot create a branch, start Phase 1D, authorise Phase 1D, execute the first slice, complete the first slice, claim witness signed, or claim witness acknowledged. 
The Markdown rendering test asserts the rendered Markdown carries the documented title, the final real-start authorisation witness packet skeleton marker, the witness packet skeleton only marker, the witness not signed marker, the witness not acknowledged marker, the Phase 1D not started marker, the not a real start marker, the not a real authorisation marker, the canonical decision no-go marker, the latest Phase 3J archive hash, the latest test totals, the first slice identity, the witness identity, the witness signoff fields, the witness acknowledgement matrix, the witness legal and governance boundary checks, the witness evidence completeness matrix, the source-hash matrix, the witness rollback acknowledgement matrix, the prohibited actions, the no-secret checks, the runtime boundary checks, the deterministic blockers, and the safety notes, and is byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value through the central forbidden-wording registry. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, and no clock or random or environment or absolute-path or runtime-secret or git command execution reference in the skeleton content file. New founder-facing documents at the docs layer add a Phase 3K document and a short addendum to the Phase 1D entry decision-record document explaining where to find the final real-start authorisation witness packet skeleton. 
Every tracked content file product version is bumped to zero point eighty-one point zero to stay in sync with the package version, every latest approved archive block is refreshed to the approved Phase 3J archive, and the cascading hash refresh propagates from Phase 2N through Phase 3J to a fixed point. For reference, of the twenty-three latest approved archive blocks in the live archive after Phase 3L, only three carry the optional product version field (the Phase 2U archive-evidence block, the Phase 3K latest approved archive block, and the Phase 3L latest approved archive block); all three set product version to zero point eighty-one point zero, matching the approved Phase 3K archive they point to, and the other twenty older-schema latest approved archive blocks do not carry product version. The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the friendly-label registry, the standard-copy registry, the fuzzer semantics, the contract-shape fixtures and tests, the migration rehearsal tests, the deprecation registry and optional-field coverage and future-schema-addition checklist, the audit bundle manifest, the owner map and orphan-surface guard, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report builder and renderer, every prior helper, and the approved sample outcomes are all unchanged. NullProvider remains the default. The thirteen scorecard golden Markdown snapshots remain byte-identical.
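The idea that the canonical skeleton "remains no go and pending review until every required condition is met" can be sketched as a small pure function. This is a hedged illustration only: `GateRow`, `GateDecision`, `evaluateGate`, and the row identifiers are hypothetical names, not the skeleton's real schema. The function only computes a label; it does not authorise, start, or execute anything.

```typescript
// Hypothetical row shape; the real skeleton's field names are not shown here.
interface GateRow {
  id: string;
  blocking: boolean;
  satisfied: boolean;
}

interface GateDecision {
  decision: "no go" | "go";
  openBlockers: string[];
}

// Pure function: computes a label only and never touches git, storage, or a clock.
// Fails closed: an empty row list is treated as no go.
function evaluateGate(rows: readonly GateRow[]): GateDecision {
  const openBlockers = rows
    .filter((row) => row.blocking && !row.satisfied)
    .map((row) => row.id)
    .sort(); // sorted so the output order never depends on input order
  const decision = rows.length > 0 && openBlockers.length === 0 ? "go" : "no go";
  return { decision, openBlockers };
}

// Canonical state in this sketch: every blocking row is unsatisfied.
const canonicalRows: GateRow[] = [
  { id: "witness-acknowledged", blocking: true, satisfied: false },
  { id: "witness-signed", blocking: true, satisfied: false },
  { id: "rollback-acknowledged", blocking: true, satisfied: false },
];

// Synthetic mutation: satisfy a single row and confirm the gate stays closed.
const oneRowMutated = canonicalRows.map((row) =>
  row.id === "witness-signed" ? { ...row, satisfied: true } : row,
);

console.log(evaluateGate(canonicalRows).decision);
console.log(evaluateGate(oneRowMutated).openBlockers);
```

Treating an empty row list as no go is a design choice made for this sketch, so that a truncated or missing matrix can never read as approval.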
User impact: None at runtime. The methodology summary is bumped to product version 0.81.0 to keep methodology tracking in sync with this Phase 3K docs-side / test-side planning witness packet skeleton.
When to re-score: Phase 3K adds no scoring change, no engine change, no rendered-output change, and no runtime change. Existing scorecards remain valid.
Docs | Tests | Content | Methodology
Evidence trace: docs/phase 3 k.md, docs/phase plan.md (Phase 3K addendum), docs/phase 1 d entry decision record.md (Phase 3K addendum), README.md (Phase 3K shipped row), content/phase 1d final real start authorisation witness packet skeleton.v1.json, an internal source file, tests/unit/phase_3k_*.test.ts.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3K internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-02
- Reference
- Phase 3K authorisation
- Impact assessment
- Adds a Phase 1D final real-start authorisation witness packet skeleton at the static content layer, one test-side support module, eleven new unit tests pinning shape and source integrity and slice identity and witness identity status and signoff and acknowledgement and evidence and rollback and prohibited and boundaries and decision and safety and Markdown rendering and cleanliness and determinism, and one founder-facing prose document at the docs layer. No product features. No runtime behaviour change. The skeleton does not authorise any real Phase 1D work, does not start Phase 1D, does not create a branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
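The determinism test described in the Phase 3K entry (byte-stability across consecutive serialisations, and no clock, random, or environment reference in the content file) could look roughly like the sketch below. Everything here is an assumption for illustration: `canonicalJson`, `findForbiddenReferences`, and the pattern list are hypothetical, and the real test's serialiser and forbidden-reference rules are not shown in this change log.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialise with object keys sorted, so equal data always yields equal bytes.
function canonicalJson(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return `[${value.map(canonicalJson).join(",")}]`;
  }
  const entries = Object.entries(value).sort(([a], [b]) =>
    a < b ? -1 : a > b ? 1 : 0,
  );
  const body = entries
    .map(([key, child]) => `${JSON.stringify(key)}:${canonicalJson(child)}`)
    .join(",");
  return `{${body}}`;
}

// Hypothetical patterns for references a static content file should not carry.
const forbiddenPatterns: ReadonlyArray<{ label: string; pattern: RegExp }> = [
  { label: "clock", pattern: /Date\.now|new Date\(/ },
  { label: "random", pattern: /Math\.random/ },
  { label: "environment", pattern: /process\.env/ },
];

// Return the labels of every forbidden pattern found in the text.
function findForbiddenReferences(text: string): string[] {
  return forbiddenPatterns
    .filter(({ pattern }) => pattern.test(text))
    .map(({ label }) => label);
}

// Illustrative skeleton fragment only.
const skeleton: Json = {
  witnessSignedStatus: "not signed",
  decisionStatus: "no go",
  stopConditions: ["pending review"],
};

const first = canonicalJson(skeleton);
const second = canonicalJson(skeleton);

console.log(first === second); // byte-stable across two serialisations
console.log(findForbiddenReferences(first));
```

Sorting keys before serialising is one common way to make output independent of property insertion order; the regular expressions carry no global flag, so repeated `test` calls are stateless.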
Change id: 3L | Change date: 2026-05-02 | Product version: 0.82.0 | Methodology engine version: 0.9.1
Phase 3L — Phase 1D final real-start authorisation reviewer attestation matrix skeleton (docs-side / test-side planning)
Reason: Phase 3L adds the auditable Phase 1D final real-start authorisation reviewer attestation matrix skeleton that consumes the Phase 3K witness packet skeleton and the Phase 3J go-or-no-go gate skeleton and records the reviewer attestation, reviewer acknowledgement, and reviewer boundary fields as a calm pending-review skeleton without starting Phase 1D, without creating a branch, without running git, and without executing the first slice. The canonical skeleton defaults to REVIEWER ATTESTATION MATRIX SKELETON PENDING REVIEW, REVIEWER NOT ATTESTED, REVIEWER NOT ACKNOWLEDGED, REVIEWER INDEPENDENCE NOT CONFIRMED, FINAL START NOT GIVEN, PHASE 1D NOT STARTED, BRANCH NOT CREATED, NO-GO. The skeleton is a verification artefact only and does not authorise Phase 1D, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, does not claim witness independence completion, and does not claim reviewer attestation complete.
What changed: Phase 3L is a docs-side and test-side planning phase. No product features were added. A new deterministic Phase 1D final real-start authorisation reviewer attestation matrix skeleton at the static content layer defines first-slice identity with a new reviewer attestation matrix skeleton status of pending review, a reviewer identity block, twenty-six reviewer attestation fields, fifteen reviewer acknowledgement matrix rows, thirteen reviewer legal and governance boundary checks, eighteen reviewer evidence completeness rows, twelve source-hash verification rows, fifteen reviewer rollback attestation rows, sixteen prohibited action confirmations, nineteen no-secret checks, fourteen runtime boundary checks, thirty-two deterministic blockers, synthetic reviewer rules, a canonical reviewer decision, forty-five stop conditions, eleven legal and governance disclaimers, and sixty-three safety markers all true. The skeleton top-level shape adds a reviewer attestation matrix skeleton version, a product version that matches the package version after the Phase 3L bump, a generated-from-phase identifier of three-L, a reviewer attestation matrix skeleton type that explicitly marks the file as a Phase 1D final real-start authorisation reviewer attestation matrix skeleton, a final-real-start authorisation reviewer attestation matrix skeleton pending-review status, a calm description, a calm reviewer-facing reviewer attestation matrix skeleton purpose, a latest-approved-archive block carrying the verified Phase 3K archive evidence with archive-evidence product version zero point eighty-one point zero, thirteen source references covering Phase 3K through Phase 2Y, a first-slice identity block referencing the same first slice as Phase 3K with all prior statuses non-authorising plus a new reviewer attestation matrix skeleton status of pending review, a reviewer identity block with reviewer role required and reviewer independence from founder required and reviewer 
independence from witness required and reviewer independence canonical state pending review and reviewer attested canonical state not attested and reviewer acknowledged canonical state not acknowledged and reviewer final start attestation status not attested and reviewer action required pending review, twenty-six reviewer attestation fields each canonical-state not filled or not signed or not attested or not confirmed and blocking, fifteen reviewer acknowledgement matrix rows each canonical-state pending review and blocking, thirteen reviewer legal and governance boundary checks each canonical-state pending review and blocking, eighteen reviewer evidence completeness rows each canonical-state incomplete and blocking, twelve source-hash verification rows each verification status hash recorded and blocking, fifteen reviewer rollback attestation rows each canonical-state pending review and blocking, sixteen prohibited action confirmations each canonical-state confirmed true and blocking reviewer attestation matrix skeleton if false, nineteen no-secret checks, fourteen runtime boundary checks, thirty-two deterministic blockers, synthetic reviewer rules proving the canonical skeleton cannot create a branch and cannot start Phase 1D and cannot authorise Phase 1D and cannot execute the first slice and cannot complete the first slice and cannot run any git command and cannot change git state and cannot mark Phase 1D in progress and cannot claim final start approval and cannot claim witness signed and cannot claim witness acknowledged and cannot claim witness independence completion and cannot claim reviewer attestation complete and the canonical final decision is no go, a canonical reviewer decision with decision status no go and reviewer attestation matrix skeleton status pending review and reviewer attested status not attested and reviewer acknowledged status not acknowledged and reviewer independence confirmed status pending review and final start status not started and 
final start authorisation status not authorised and branch creation status not created and branch command status not run and first slice execution status not executed and first slice completion status not completed and every can flag false and final start command executed false and branch command executed false and git state changed false and branch created false and first slice executed false and Phase 1D marked in progress false and runtime changes made false and reviewer action required pending review, forty-five stop conditions each blocking Phase 1D real start, eleven legal and governance disclaimers, and sixty-three safety markers all true. The canonical reviewer attestation matrix skeleton status is pending review. The canonical reviewer attested status is not attested. The canonical reviewer acknowledged status is not acknowledged. The canonical final start status is not started. The canonical branch creation status is not created. The canonical branch command status is not run. The canonical final decision is no go. Every reviewer attestation field remains canonical-state not filled or not signed or not attested or not confirmed and blocking. Every reviewer acknowledgement remains pending review and blocking. Every reviewer legal and governance boundary check remains pending review and blocking. Every reviewer evidence completeness row remains incomplete and blocking. Every source-hash row remains hash recorded and blocking if stale. Every reviewer rollback attestation row remains pending review and blocking. Every prohibited action confirmation remains canonical-state true. Every no-secret check is required and blocks reviewer attestation if failed. Every runtime boundary check is required and blocks reviewer attestation if failed. 
The skeleton does not authorise real Phase 1D work, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not implement any slice, does not run any git command, does not change git state, does not change runtime behaviour, does not call Supabase, does not write to local storage, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, does not claim witness independence completion, does not claim reviewer attestation complete, and does not replace legal review. One new test-side support module ships alongside the skeleton: a deterministic helper that loads the skeleton and every referenced Phase 3K through Phase 2Y artefact, computes deterministic SHA-256 hashes via the standard hash module, validates skeleton source integrity, validates first-slice identity, validates reviewer identity, validates reviewer attestation fields, validates reviewer acknowledgement matrix, validates reviewer legal and governance boundary checks, validates reviewer evidence completeness matrix, validates source-hash verification matrix, validates reviewer rollback attestation matrix, validates prohibited action confirmations, validates no-secret checks, validates runtime boundary checks, validates deterministic blockers, validates synthetic reviewer rules, validates canonical reviewer decision, validates stop conditions, validates legal and governance notes, validates safety markers, validates the canonical reviewer attestation matrix skeleton status remains a final-real-start authorisation reviewer attestation matrix skeleton pending review, exposes documented mutated synthetic fixtures proving the canonical skeleton remains no go and pending review until every required condition is met, and renders a calm founder-facing Markdown final real-start 
authorisation reviewer attestation matrix skeleton. The helper never runs git, never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses date or random or environment or local-time references. Eleven new unit tests pin the skeleton end-to-end. The shape test asserts the documented top-level shape and every per-field shape rule. The source integrity test asserts every recorded source hash matches the on-disk source content body and documented synthetic mutations each fail with a deterministic reason. The slice identity test asserts the first slice exactly matches Phase 3K and is not marked started or executed or completed or authorised or rehearsed or branched or command-run or final-started or no-op-run and the canonical reviewer attestation matrix skeleton status is pending review. The reviewer identity status test asserts the reviewer identity block remains pending review and not attested and not acknowledged and the canonical reviewer decision reviewer-specific fields remain non-authorising. The attestation and acknowledgement test asserts every reviewer attestation field and every reviewer acknowledgement and every reviewer legal and governance boundary check is present and blocking. The evidence and rollback test asserts every reviewer evidence completeness row and every reviewer rollback attestation row and every source-hash row is present and required and blocking. The prohibited and boundaries test asserts every prohibited action confirmation is canonical-state true and every no-secret check is blocking and every runtime boundary check is pending review and blocking and every deterministic blocker is active and blocks reviewer attestation. 
The decision and safety test asserts the canonical reviewer decision is no go and cannot create a branch, start Phase 1D, authorise Phase 1D, execute the first slice, complete the first slice, claim witness signed, claim witness acknowledged, claim witness independence completion, or claim reviewer attestation complete. The Markdown rendering test asserts the rendered Markdown carries the documented title, the final real-start authorisation reviewer attestation matrix skeleton marker, the reviewer attestation matrix skeleton only marker, the reviewer not attested marker, the reviewer not acknowledged marker, the reviewer independence not confirmed marker, the Phase 1D not started marker, the not a real start marker, the not a real authorisation marker, the canonical decision no-go marker, the latest Phase 3K archive hash, the latest test totals, the first slice identity, the reviewer identity, the reviewer attestation fields, the reviewer acknowledgement matrix, the reviewer legal and governance boundary checks, the reviewer evidence completeness matrix, the source-hash matrix, the reviewer rollback attestation matrix, the prohibited actions, the no-secret checks, the runtime boundary checks, the deterministic blockers, and the safety notes, and is byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value through the central forbidden-wording registry. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, and no clock or random or environment or absolute-path or runtime-secret or git command execution reference in the skeleton content file. New founder-facing documents at the docs layer add a Phase 3L document and a short addendum to the Phase 1D entry decision-record document explaining where to find the final real-start authorisation reviewer attestation matrix skeleton. 
Every tracked content file top-level product version is bumped to zero point eighty-two point zero to stay in sync with the package version. Of the twenty-three latest approved archive blocks in this archive, only three carry the optional product version field — the Phase 2U archive-evidence block, the Phase 3K latest approved archive block, and the Phase 3L latest approved archive block — and all three set product version to zero point eighty-one point zero matching the approved Phase 3K archive they point to (the other twenty older-schema latest approved archive blocks do not carry product version). Phase 2U summary block.archive evidence snapshot.product version is also set to zero point eighty-one point zero for the same reason. The cascading hash refresh propagates from Phase 2N through Phase 3K to a fixed point. The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the friendly-label registry, the standard-copy registry, the fuzzer semantics, the contract-shape fixtures and tests, the migration rehearsal tests, the deprecation registry and optional-field coverage and future-schema-addition checklist, the audit bundle manifest, the owner map and orphan-surface guard, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report builder and renderer, every prior helper, and the approved sample outcomes are all unchanged. NullProvider remains the default. 
The thirteen scorecard golden Markdown snapshots remain byte-identical.
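The source-integrity check described above (recorded hashes must match the source content body, and documented synthetic mutations must each fail with a deterministic reason) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the shipped helper: the `SourceRef` shape, the `verifySources` function, the reason strings, and the `phase-3k` identifier are hypothetical, and source bodies are held in memory rather than read from disk. Only `createHash` from Node's built-in crypto module is a real API.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape: one recorded source reference inside a skeleton file.
interface SourceRef {
  sourceId: string;
  recordedSha256: string;
}

// Hypothetical result row; the reason strings are illustrative only.
interface IntegrityResult {
  sourceId: string;
  ok: boolean;
  reason: "hash_match" | "hash_stale" | "source_missing";
}

// Hash the exact content body. No clock, random, or environment input is used.
function sha256Hex(body: string): string {
  return createHash("sha256").update(body, "utf8").digest("hex");
}

// Compare each recorded hash against the current body of that source.
function verifySources(
  refs: readonly SourceRef[],
  bodies: ReadonlyMap<string, string>,
): IntegrityResult[] {
  return refs.map((ref): IntegrityResult => {
    const body = bodies.get(ref.sourceId);
    if (body === undefined) {
      return { sourceId: ref.sourceId, ok: false, reason: "source_missing" };
    }
    const ok = sha256Hex(body) === ref.recordedSha256;
    return { sourceId: ref.sourceId, ok, reason: ok ? "hash_match" : "hash_stale" };
  });
}

// Synthetic fixtures: the canonical body passes, a mutated body fails.
const canonicalBody = '{"decision":"no_go"}';
const refs: SourceRef[] = [
  { sourceId: "phase-3k", recordedSha256: sha256Hex(canonicalBody) },
];

const canonicalCheck = verifySources(refs, new Map([["phase-3k", canonicalBody]]));
const mutatedCheck = verifySources(refs, new Map([["phase-3k", '{"decision":"go"}']]));
const missingCheck = verifySources(refs, new Map<string, string>());
```

Because the failure reason depends only on the inputs, a mutated fixture produces the same reason on every run, which is the property a pinned unit test would rely on.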
User impact: None at runtime. The methodology summary is bumped to product version 0.82.0 to keep methodology tracking in sync with this Phase 3L docs-side / test-side planning reviewer attestation matrix skeleton.
When to re-score: Phase 3L adds no scoring change, no engine change, no rendered-output change, and no runtime change. Existing scorecards remain valid.
Docs, Tests, Content, Methodology
Evidence trace: docs/phase 3 l.md, docs/phase plan.md (Phase 3L addendum), docs/phase 1 d entry decision record.md (Phase 3L addendum), README.md (Phase 3L shipped row), content/phase 1d final real start authorisation reviewer attestation matrix skeleton.v1.json, an internal source file, tests/unit/phase_3l_*.test.ts.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3L internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-02
- Reference
- Phase 3L authorisation
- Impact assessment
- Adds a Phase 1D final real-start authorisation reviewer attestation matrix skeleton at the static content layer, one test-side support module, one founder-facing prose document at the docs layer, and eleven new unit tests pinning: shape; source integrity; slice identity; reviewer identity status; attestation and acknowledgement; evidence and rollback; prohibited and boundaries; decision and safety; Markdown rendering; cleanliness; and determinism. No product features. No runtime behaviour change. The skeleton does not authorise any real Phase 1D work, does not start Phase 1D, does not create a branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, does not claim witness independence completion, does not claim reviewer attestation complete, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 3M
Change date: 2026-05-02
Product version: 0.83.0
Methodology engine version: 0.9.1
Phase 3M — Phase 1D final real-start authorisation evidence bundle index skeleton (docs-side / test-side planning)
Reason: Phase 3M adds the auditable Phase 1D final real-start authorisation evidence bundle index skeleton that consumes the corrected Phase 3L reviewer attestation matrix skeleton and the Phase 3K witness packet skeleton and indexes the future evidence bundle required before any future Phase 1D real start can be considered. The canonical skeleton defaults to EVIDENCE BUNDLE INCOMPLETE, EVIDENCE BUNDLE INDEX PENDING REVIEW, FINAL START NOT GIVEN, PHASE 1D NOT STARTED, BRANCH NOT CREATED, NO-GO. The skeleton is a verification artefact only and does not authorise Phase 1D, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, does not claim witness independence completion, does not claim reviewer attestation complete, and does not claim evidence bundle complete.
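The default posture described above, where the index stays NO-GO while any blocking evidence row is incomplete and never authorises anything by itself, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `EvidenceRow` and `BundleIndexStatus` shapes, the `evaluateBundleIndex` function, and the row identifiers are hypothetical and are not taken from the actual skeleton file.

```typescript
// Hypothetical row states; the real skeleton may use different labels.
type RowState = "incomplete" | "pending_capture" | "complete";

interface EvidenceRow {
  id: string;
  state: RowState;
  blocking: boolean;
}

// The decision type is deliberately the single literal "no_go":
// in this sketch the index can report blockers but can never authorise.
interface BundleIndexStatus {
  decision: "no_go";
  canStartPhase1D: false;
  canCreateBranch: false;
  activeBlockers: string[];
}

// Collect every blocking row that is not complete, in a stable order.
function evaluateBundleIndex(rows: readonly EvidenceRow[]): BundleIndexStatus {
  const activeBlockers = rows
    .filter((row) => row.blocking && row.state !== "complete")
    .map((row) => row.id)
    .sort();
  return {
    decision: "no_go",
    canStartPhase1D: false,
    canCreateBranch: false,
    activeBlockers,
  };
}

// Canonical fixture: every row incomplete or pending capture, all blocking.
const canonicalRows: EvidenceRow[] = [
  { id: "section-reviewer-evidence", state: "incomplete", blocking: true },
  { id: "section-witness-evidence", state: "incomplete", blocking: true },
  { id: "item-rollback-plan", state: "pending_capture", blocking: true },
];

// Mutated synthetic fixture: one row flipped to complete.
const mutatedRows: EvidenceRow[] = canonicalRows.map((row, index) =>
  index === 0 ? { ...row, state: "complete" as const } : row,
);

const canonicalStatus = evaluateBundleIndex(canonicalRows);
const mutatedStatus = evaluateBundleIndex(mutatedRows);
```

Encoding the decision as a literal type is one possible design choice: a fixture that tries to set any other decision value fails at compile time rather than at review time.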
What changed: Phase 3M is a docs-side and test-side planning phase. No product features were added. A new deterministic Phase 1D final real-start authorisation evidence bundle index skeleton at the static content layer defines:
- first-slice identity with a new evidence bundle index skeleton status of pending review
- an evidence bundle index status block
- sixteen required evidence bundle sections
- thirty evidence item index rows
- a fourteen-row source-hash verification matrix
- a fourteen-row bundle completeness matrix
- seven reviewer evidence index rows
- six witness evidence index rows
- nine founder evidence index rows
- eight legal and governance evidence index rows
- ten rollback evidence index rows
- twenty-two no-secret checks
- fourteen runtime boundary checks
- twenty prohibited action confirmations
- thirty-five deterministic blockers
- synthetic bundle rules
- a canonical evidence bundle decision
- forty-five stop conditions
- thirteen legal and governance disclaimers
- sixty safety markers, all true
The skeleton top-level shape adds an evidence bundle index skeleton version, a product version that matches the package version after the Phase 3M bump, a generated-from-phase identifier of three-M, an evidence bundle index skeleton type that explicitly marks the file as a Phase 1D final real-start authorisation evidence bundle index skeleton, a final-real-start authorisation evidence bundle index skeleton pending-review status, a calm description, a calm reviewer-facing evidence bundle index skeleton purpose, a latest-approved-archive block carrying the verified corrected Phase 3L archive evidence with archive-evidence product version zero point eighty-two point zero, fourteen source references covering Phase 3L through Phase 2Y, a first-slice identity block referencing the same first slice as Phase 3L with all prior statuses non-authorising plus a new evidence bundle index skeleton status of pending review, an evidence bundle index status block with evidence bundle status incomplete and evidence bundle index status pending review and final start status not started and final start authorisation status not authorised and branch creation status not created and branch command status not run and first slice execution status not executed and first slice completion status not completed and witness packet status pending review and reviewer attestation status pending review and evidence bundle decision no go and Phase 1D marked in progress false and reviewer action required pending review, sixteen required evidence bundle sections each canonical-state incomplete and blocking, thirty evidence item index rows each canonical-state pending capture and blocking, a fourteen-row source-hash verification matrix each verification status hash recorded and blocking if stale, a fourteen-row bundle completeness matrix each canonical-state incomplete and blocking, seven reviewer evidence index rows each canonical-state incomplete and blocking, six witness evidence index rows each 
canonical-state incomplete and blocking, nine founder evidence index rows each canonical-state incomplete and blocking, eight legal and governance evidence index rows each canonical-state incomplete and blocking, ten rollback evidence index rows each canonical-state incomplete and blocking, twenty-two no-secret checks each required and blocking, fourteen runtime boundary checks each canonical-state pending review and blocking, twenty prohibited action confirmations each canonical-state confirmed true and blocking evidence bundle completion if false, thirty-five deterministic blockers, synthetic bundle rules proving the canonical skeleton cannot create a branch and cannot start Phase 1D and cannot authorise Phase 1D and cannot execute the first slice and cannot complete the first slice and cannot claim witness signed and cannot claim witness acknowledged and cannot claim witness independence completion and cannot claim reviewer attestation complete and the canonical evidence bundle status is incomplete and the canonical final decision is no go, a canonical evidence bundle decision with decision status no go and evidence bundle status incomplete and evidence bundle index status pending review and final start status not started and final start authorisation status not authorised and branch creation status not created and branch command status not run and first slice execution status not executed and first slice completion status not completed and witness packet status pending review and reviewer attestation status pending review and every can flag false and final start command executed false and branch command executed false and git state changed false and branch created false and first slice executed false and Phase 1D marked in progress false and runtime changes made false and reviewer action required pending review, forty-five stop conditions each blocking evidence bundle completion or Phase 1D real start, thirteen legal and governance disclaimers, and sixty safety 
markers all true. Of the twenty-four latest approved archive blocks in the live archive after Phase 3M, only four carry the optional product version field — the Phase 2U archive-evidence block, the Phase 3K latest approved archive block, the Phase 3L latest approved archive block, and the Phase 3M latest approved archive block — and all four set product version to zero point eighty-two point zero matching the approved corrected Phase 3L archive they point to. The other twenty older-schema latest approved archive blocks do not carry product version. Phase 2U summary block archive evidence snapshot product version is also set to zero point eighty-two point zero for the same reason. The canonical evidence bundle index skeleton status is pending review. The canonical evidence bundle status is incomplete. The canonical final start status is not started. The canonical branch creation status is not created. The canonical branch command status is not run. The canonical final decision is no go. Every evidence section remains incomplete and blocking. Every evidence item remains pending capture and blocking. Every source-hash row remains hash recorded and blocking if stale. Every completeness row remains incomplete and blocking. Every reviewer, witness, founder, legal/governance, and rollback evidence row remains incomplete and blocking. Every prohibited action confirmation remains canonical-state true. Every no-secret check is required and blocks evidence bundle completion if failed. Every runtime boundary check is required and blocks evidence bundle completion if failed. 
The skeleton does not authorise real Phase 1D work, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not implement any slice, does not run any git command, does not change git state, does not change runtime behaviour, does not call Supabase, does not write to local storage, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, does not claim witness independence completion, does not claim reviewer attestation complete, does not claim evidence bundle complete, and does not replace legal review. One new test-side support module ships alongside the skeleton: a deterministic helper that loads the skeleton and every referenced Phase 3L through Phase 2Y artefact, computes deterministic SHA-256 hashes via the standard hash module, validates skeleton source integrity, validates first-slice identity, validates evidence bundle index status, validates required evidence bundle sections, validates evidence item index, validates source-hash verification matrix, validates bundle completeness matrix, validates reviewer evidence index, validates witness evidence index, validates founder evidence index, validates legal and governance evidence index, validates rollback evidence index, validates no-secret checks, validates runtime boundary checks, validates prohibited action confirmations, validates deterministic blockers, validates synthetic bundle rules, validates canonical evidence bundle decision, validates stop conditions, validates legal and governance notes, validates safety markers, validates the canonical evidence bundle index skeleton status remains a final-real-start authorisation evidence bundle index skeleton pending review, exposes documented mutated synthetic fixtures proving the canonical skeleton remains no go and pending 
review until every required condition is met, and renders a calm founder-facing Markdown final real-start authorisation evidence bundle index skeleton. The helper never runs git, never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses date or random or environment or local-time references. Eleven new unit tests pin the skeleton end-to-end. The shape test asserts the documented top-level shape and every per-field shape rule. The source integrity test asserts every recorded source hash matches the on-disk source content body and documented synthetic mutations each fail with a deterministic reason. The slice identity test asserts the first slice exactly matches Phase 3L and is not marked started or executed or completed or authorised or rehearsed or branched or command-run or final-started or witnessed or attested or bundle-complete and the canonical evidence bundle index skeleton status is pending review. The status and sections test asserts the evidence bundle index status block and every required evidence bundle section are required and blocking and incomplete by default. The evidence indexes test asserts every reviewer/witness/founder/legal-governance/rollback evidence index is present and required and blocking and incomplete. The hashes and completeness test asserts the source-hash matrix and the bundle completeness matrix are complete and blocking. The boundaries and blockers test asserts no-secret checks and runtime boundary checks and prohibited action confirmations and deterministic blockers are complete and blocking. The decision and safety test asserts the canonical evidence bundle decision is no go and cannot start Phase 1D, create a branch, authorise Phase 1D, execute the first slice, complete the first slice, claim witness completion, or claim reviewer attestation completion. 
The Markdown rendering test asserts the rendered Markdown carries the documented title, the final real-start authorisation evidence bundle index skeleton marker, the evidence bundle incomplete marker, the Phase 1D not started marker, the not a real start marker, the not a real authorisation marker, the witness completion not claimed marker, the reviewer attestation not claimed marker, the canonical decision no-go marker, the latest corrected Phase 3L archive hash, the latest test totals, the first slice identity, every evidence section heading, the source-hash matrix, the no-secret checks, the runtime boundary checks, the prohibited action confirmations, the deterministic blockers, and the safety notes. The cleanliness test sweeps every founder-facing prose value through the central forbidden-wording registry and explicitly asserts zero stale fixed-two-block claim hits in the package description, in the methodology changelog, and in the evidence bundle index skeleton content. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, and no clock or random or environment or absolute-path or runtime-secret or git command execution reference in the skeleton content file. New founder-facing documents at the docs layer add a Phase 3M document and a short addendum to the Phase 1D entry decision-record document explaining where to find the final real-start authorisation evidence bundle index skeleton. Every tracked content file top-level product version is bumped to zero point eighty-three point zero to stay in sync with the package version. The cascading hash refresh propagates from Phase 2N through Phase 3L to a fixed point. 
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the friendly-label registry, the standard-copy registry, the fuzzer semantics, the contract-shape fixtures and tests, the migration rehearsal tests, the deprecation registry and optional-field coverage and future-schema-addition checklist, the audit bundle manifest, the owner map and orphan-surface guard, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report builder and renderer, every prior helper, and the approved sample outcomes are all unchanged. NullProvider remains the default. The thirteen scorecard golden Markdown snapshots remain byte-identical.
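The determinism test described above (byte-stable serialisation across consecutive runs, and no clock, random, or environment reference in the content file) might look something like the sketch below. It is an illustration only: `stableStringify`, `findForbiddenReferences`, the `Json` type, and the pattern list are hypothetical names and choices, not the project's actual helpers or its forbidden-wording registry.

```typescript
// Plain JSON value type, so undefined and functions cannot reach the serialiser.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Serialise with sorted object keys so output does not depend on insertion order.
function stableStringify(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableStringify(item)).join(",")}]`;
  }
  const keys = Object.keys(value).sort();
  const members = keys.map((key) => `${JSON.stringify(key)}:${stableStringify(value[key])}`);
  return `{${members.join(",")}}`;
}

// Illustrative patterns only; a real registry would be maintained elsewhere.
const FORBIDDEN_PATTERNS: readonly RegExp[] = [
  /Date\.now/,
  /Math\.random/,
  /process\.env/,
];

// Return the source of every pattern that appears in the content text.
function findForbiddenReferences(text: string): string[] {
  return FORBIDDEN_PATTERNS.filter((pattern) => pattern.test(text)).map(
    (pattern) => pattern.source,
  );
}

// Two logically equal objects with different key order serialise identically.
const first = stableStringify({ b: 1, a: [true, null] });
const second = stableStringify({ a: [true, null], b: 1 });
const cleanHits = findForbiddenReferences(first);
const dirtyHits = findForbiddenReferences("const now = Date.now();");
```

Sorting keys before serialising is one common way to make the bytes independent of how the object was constructed; comparing `first` and `second` is then a plain string equality check.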
User impact: None at runtime. The methodology summary is bumped to product version 0.83.0 to keep methodology tracking in sync with this Phase 3M docs-side / test-side planning evidence bundle index skeleton.
When to re-score: Phase 3M adds no scoring change, no engine change, no rendered-output change, and no runtime change. Existing scorecards remain valid.
Docs, Tests, Content, Methodology
Evidence trace: docs/phase 3 m.md, docs/phase plan.md (Phase 3M addendum), docs/phase 1 d entry decision record.md (Phase 3M addendum), README.md (Phase 3M shipped row), content/phase 1d final real start authorisation evidence bundle index skeleton.v1.json, an internal source file, tests/unit/phase_3m_*.test.ts.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3M internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-02
- Reference
- Phase 3M authorisation
- Impact assessment
- Adds a Phase 1D final real-start authorisation evidence bundle index skeleton at the static content layer, one test-side support module, one founder-facing prose document at the docs layer, and eleven new unit tests pinning: shape; source integrity; slice identity; status and sections; evidence indexes; hashes and completeness; boundaries and blockers; decision and safety; Markdown rendering; cleanliness; and determinism. No product features. No runtime behaviour change. The skeleton does not authorise any real Phase 1D work, does not start Phase 1D, does not create a branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, does not claim witness independence completion, does not claim reviewer attestation complete, does not claim evidence bundle complete, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 3N
Change date: 2026-05-02
Product version: 0.84.0
Methodology engine version: 0.9.1
Phase 3N — Phase 1D final evidence bundle completeness rehearsal and no-op review transcript (docs-side / test-side planning)
Reason: Phase 3N adds the auditable Phase 1D final evidence bundle completeness rehearsal and no-op review transcript that consumes the Phase 3M evidence bundle index skeleton and rehearses the evidence bundle completeness review without completing the evidence bundle, without starting Phase 1D, without creating a branch, without running git, and without executing the first slice. The canonical transcript defaults to COMPLETENESS REVIEW NOT RUN, EVIDENCE BUNDLE INCOMPLETE, FINAL START NOT GIVEN, PHASE 1D NOT STARTED, BRANCH NOT CREATED, NO-GO. The transcript is a verification artefact only and does not authorise Phase 1D, does not complete the evidence bundle, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, does not claim witness independence completion, and does not claim reviewer attestation complete.
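One way to picture a no-op review rehearsal of the kind described above is a function that walks the planned review steps and records what would be reviewed, without executing anything. The sketch below is hypothetical throughout: `PlannedStep`, `TranscriptRow`, `rehearseReview`, and the step identifiers are invented for illustration and do not reflect the real transcript schema.

```typescript
// Hypothetical description of a review step that a future real review would perform.
interface PlannedStep {
  id: string;
  description: string;
}

// Each transcript row is typed so it cannot claim execution or completion.
interface TranscriptRow {
  stepId: string;
  previewOnly: true;
  executed: false;
  completesBundle: false;
  note: string;
}

interface RehearsalTranscript {
  completenessReviewStatus: "not_run";
  decision: "no_go";
  rows: TranscriptRow[];
}

// Produce a transcript from the plan alone. No step action is invoked,
// no file is written, and no external command is run.
function rehearseReview(steps: readonly PlannedStep[]): RehearsalTranscript {
  const rows = steps.map(
    (step): TranscriptRow => ({
      stepId: step.id,
      previewOnly: true,
      executed: false,
      completesBundle: false,
      note: `Would review: ${step.description}`,
    }),
  );
  return { completenessReviewStatus: "not_run", decision: "no_go", rows };
}

// Illustrative plan with two steps.
const plannedSteps: PlannedStep[] = [
  { id: "section-01", description: "evidence bundle section headings" },
  { id: "hash-01", description: "recorded source hashes" },
];

const transcript = rehearseReview(plannedSteps);
```

Calling `rehearseReview` twice on the same plan yields structurally identical transcripts, since the output depends only on the input steps.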
What changed: Phase 3N is a docs-side and test-side planning phase. No product features were added. A new deterministic Phase 1D final evidence bundle completeness rehearsal and no-op review transcript at the static content layer defines:
- first-slice identity with a new completeness review status of not run
- a completeness review transcript status block
- a no-op review status block
- twenty-four review preconditions
- sixteen evidence bundle section review rows
- thirty evidence item review rows
- fifteen source-hash review rows
- twelve participant evidence review rows
- eight legal and governance review rows
- ten rollback review rows
- twelve no-op transcript rows
- twelve expected review evidence capture rows
- fifteen source-hash verification rows
- twenty-two no-secret checks
- fourteen runtime boundary checks
- twenty-one prohibited action confirmations
- thirty-seven deterministic blockers
- synthetic review rules
- a canonical review decision
- forty-seven stop conditions
- fourteen legal and governance disclaimers
- sixty-three safety markers, all true
The transcript top-level shape adds a completeness rehearsal transcript version, a product version that matches the package version after the Phase 3N bump, a generated-from-phase identifier of three-N, a completeness rehearsal transcript type that explicitly marks the file as a Phase 1D final evidence bundle completeness rehearsal and no-op review transcript, a final-evidence-bundle completeness rehearsal no-op review transcript pending-review status, a calm description, a calm reviewer-facing completeness rehearsal transcript purpose, a latest-approved-archive block carrying the verified Phase 3M archive evidence with archive-evidence product version zero point eighty-three point zero, fifteen source references covering Phase 3M through Phase 2Y, a first-slice identity block referencing the same first slice as Phase 3M with all prior statuses non-authorising plus a new completeness review status of not run, a completeness review transcript status block with completeness review status not run and completeness review result pending review and evidence bundle status incomplete and final start status not started and witness completion status not claimed and reviewer attestation completion status not claimed and completeness review decision no go and reviewer action required pending review, a no-op review status block confirming the review is preview-only and no evidence bundle completion is performed and no founder, witness, or reviewer evidence is completed and no Phase 1D start, branch creation, git command, or first-slice execution is performed, twenty-four review preconditions each canonical-state pending review and blocking, sixteen section review rows each canonical-state pending review and blocking, thirty item review rows each canonical-state pending review and blocking, fifteen source-hash review rows each canonical-state pending review and blocking if stale, twelve participant evidence review rows each canonical-state pending review and blocking, eight 
legal and governance review rows each canonical-state pending review and blocking, ten rollback review rows each canonical-state pending review and blocking, twelve no-op transcript rows each preview-only and not executed and non-completing, twelve expected review evidence capture rows each canonical-state pending capture and blocking, fifteen source-hash verification rows each verification status hash recorded and blocking if stale, twenty-two no-secret checks each required and blocking, fourteen runtime boundary checks each canonical-state pending review and blocking, twenty-one prohibited action confirmations each canonical-state confirmed true and blocking completeness review if false, thirty-seven deterministic blockers, synthetic review rules proving the canonical transcript cannot complete the evidence bundle and cannot create a branch and cannot start Phase 1D and cannot authorise Phase 1D and cannot execute the first slice and cannot complete the first slice and cannot claim witness signed and cannot claim witness acknowledged and cannot claim witness independence completion and cannot claim reviewer attestation complete and the canonical evidence bundle status is incomplete and the canonical completeness review status is not run and the canonical final decision is no go, a canonical review decision with decision status no go and completeness review status not run and completeness review result pending review and evidence bundle status incomplete and evidence bundle index status pending review and final start status not started and final start authorisation status not authorised and branch creation status not created and branch command status not run and first slice execution status not executed and first slice completion status not completed and every can flag false and evidence bundle completed false and final start command executed false and branch command executed false and git state changed false and branch created false and first slice executed false 
and Phase 1D marked in progress false and runtime changes made false and reviewer action required pending review, forty-seven stop conditions each blocking completeness review or Phase 1D real start, fourteen legal and governance disclaimers, and sixty-three safety markers all true. Of the twenty-five latest approved archive blocks in the live archive after Phase 3N, only five carry the optional product version field — the Phase 2U archive-evidence block, the Phase 3K latest approved archive block, the Phase 3L latest approved archive block, the Phase 3M latest approved archive block, and the Phase 3N latest approved archive block — and all five set product version to zero point eighty-three point zero matching the approved Phase 3M archive they point to. The other twenty older-schema latest approved archive blocks do not carry product version. Phase 2U summary block archive evidence snapshot product version is also set to zero point eighty-three point zero for the same reason. The canonical completeness rehearsal transcript status is pending review. The canonical completeness review status is not run. The canonical evidence bundle status is incomplete. The canonical final start status is not started. The canonical branch creation status is not created. The canonical branch command status is not run. The canonical final decision is no go. Every review precondition remains pending review and blocking. Every section, item, source-hash, participant, legal-governance, and rollback review row remains pending review and blocking. Every no-op transcript row remains preview-only and not executed and non-completing. Every expected review evidence capture row remains pending capture and blocking. Every source-hash matrix row remains hash recorded and blocking if stale. Every prohibited action confirmation remains canonical-state true. Every no-secret check is required and blocks completeness review if failed. 
Every runtime boundary check is required and blocks completeness review if failed. The transcript does not authorise real Phase 1D work, does not complete the evidence bundle, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not implement any slice, does not run any git command, does not change git state, does not change runtime behaviour, does not call Supabase, does not write to local storage, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, does not claim witness independence completion, does not claim reviewer attestation complete, and does not replace legal review. One new test-side support module ships alongside the transcript: a deterministic helper that loads the transcript and every referenced Phase 3M through Phase 2Y artefact, computes deterministic SHA-256 hashes via the standard hash module, validates transcript source integrity, validates first-slice identity, validates completeness review transcript status, validates no-op review status, validates review preconditions, validates evidence bundle section review rows, validates evidence item review rows, validates source-hash review rows, validates participant evidence review rows, validates legal and governance review rows, validates rollback review rows, validates no-op transcript rows, validates expected review evidence capture, validates source-hash verification matrix, validates no-secret checks, validates runtime boundary checks, validates prohibited action confirmations, validates deterministic blockers, validates synthetic review rules, validates canonical review decision, validates stop conditions, validates legal and governance notes, validates safety markers, validates the canonical completeness rehearsal transcript status remains a 
final-evidence-bundle completeness rehearsal no-op review transcript pending review, exposes documented mutated synthetic fixtures proving the canonical transcript remains no go and pending review until every required condition is met, and renders a calm founder-facing Markdown final evidence bundle completeness rehearsal and no-op review transcript. The helper never runs git, never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses date or random or environment or local-time references. Eleven new unit tests pin the transcript end-to-end. The shape test asserts the documented top-level shape and every per-field shape rule. The source integrity test asserts every recorded source hash matches the on-disk source content body and documented synthetic mutations each fail with a deterministic reason. The slice identity test asserts the first slice exactly matches Phase 3M and is not marked started or executed or completed or authorised or rehearsed or branched or command-run or final-started or witnessed or attested or bundle-complete or review-complete and the canonical completeness review status is not run. The status and preconditions test asserts the completeness review transcript status block, the no-op review status block, and every review precondition are required and blocking and non-authorising by default. The review rows test asserts every section, item, source-hash, participant, legal-governance, rollback, and no-op transcript row plus expected evidence capture is present and blocking. The hashes and boundaries test asserts source-hash matrix, no-secret checks, and runtime boundary checks are complete and blocking. The actions and blockers test asserts prohibited action confirmations and deterministic blockers are complete and blocking. 
The decision and safety test asserts the canonical review decision is no go and cannot complete the evidence bundle, start Phase 1D, create a branch, authorise Phase 1D, execute the first slice, complete the first slice, claim witness completion, or claim reviewer attestation completion. The Markdown rendering test asserts the rendered Markdown carries the documented title, the final evidence bundle completeness rehearsal and no-op review transcript marker, the evidence bundle incomplete marker, the completeness review not run marker, the Phase 1D not started marker, the not a real start marker, the not a real authorisation marker, the witness completion not claimed marker, the reviewer attestation not claimed marker, the canonical decision no-go marker, the latest Phase 3M archive hash, the latest test totals, the first slice identity, every review section heading, the no-secret checks, the runtime boundary checks, the prohibited action confirmations, the deterministic blockers, and the safety notes. The cleanliness test sweeps every founder-facing prose value through the central forbidden-wording registry and explicitly asserts zero stale fixed-phrase wording hits in the package description, in the methodology changelog, and in the transcript content. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, and no clock or random or environment or absolute-path or runtime-secret or git command execution reference in the transcript content file. New founder-facing documents at the docs layer add a Phase 3N document and a short addendum to the Phase 1D entry decision-record document explaining where to find the final evidence bundle completeness rehearsal and no-op review transcript. The top-level product version of every tracked content file is bumped to 0.84.0 to stay in sync with the package version.
The cascading hash refresh propagates from Phase 2N through Phase 3M to a fixed point. The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the friendly-label registry, the standard-copy registry, the fuzzer semantics, the contract-shape fixtures and tests, the migration rehearsal tests, the deprecation registry and optional-field coverage and future-schema-addition checklist, the audit bundle manifest, the owner map and orphan-surface guard, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report builder and renderer, every prior helper, and the approved sample outcomes are all unchanged. NullProvider remains the default. The thirteen scorecard golden Markdown snapshots remain byte-identical.
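To illustrate the source integrity check described above, here is a minimal TypeScript sketch. It is not the shipped helper. The `SourceRef` shape, the function names, and the reason strings are assumptions made for this example; only the use of SHA-256 and the idea of failing with a deterministic reason come from the entry. Content bodies are passed in explicitly so the check itself reads no clock, randomness, or environment.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of one recorded source reference.
interface SourceRef {
  path: string;
  sha256: string; // lowercase hex digest recorded in the artefact
}

type IntegrityResult = { ok: true } | { ok: false; reason: string };

// Hash a content body with SHA-256 and return lowercase hex.
function sha256Hex(body: string): string {
  return createHash("sha256").update(body, "utf8").digest("hex");
}

// Compare every recorded hash with the hash of the supplied body.
// The first mismatch is reported with a fixed, reproducible reason string.
function validateSourceIntegrity(
  refs: readonly SourceRef[],
  bodies: ReadonlyMap<string, string>,
): IntegrityResult {
  for (const ref of refs) {
    const body = bodies.get(ref.path);
    if (body === undefined) {
      return { ok: false, reason: `missing source: ${ref.path}` };
    }
    if (sha256Hex(body) !== ref.sha256) {
      return { ok: false, reason: `stale hash: ${ref.path}` };
    }
  }
  return { ok: true };
}
```

A synthetic mutation in this sketch is simply a body that differs by one character from the recorded one; the check then returns the same `stale hash` reason on every run.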
User impact: None at runtime. The methodology summary is bumped to product version 0.84.0 to keep methodology tracking in sync with this Phase 3N docs-side / test-side planning completeness rehearsal transcript.
When to re-score: Phase 3N adds no scoring change, no engine change, no rendered-output change, and no runtime change. Existing scorecards remain valid.
Docs · Tests · Content · Methodology
Evidence trace: docs/phase 3 n.md, docs/phase plan.md (Phase 3N addendum), docs/phase 1 d entry decision record.md (Phase 3N addendum), README.md (Phase 3N shipped row), content/phase 1d final evidence bundle completeness rehearsal noop review transcript.v1.json, an internal source file, tests/unit/phase_3n_*.test.ts.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3N internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-02
- Reference
- Phase 3N authorisation
- Impact assessment
- Adds a Phase 1D final evidence bundle completeness rehearsal and no-op review transcript at the static content layer, one test-side support module, eleven new unit tests pinning shape and source integrity and slice identity and status and preconditions and review rows and hashes and boundaries and actions and blockers and decision and safety and Markdown rendering and cleanliness and determinism, and one founder-facing prose document at the docs layer. No product features. No runtime behaviour change. The transcript does not authorise any real Phase 1D work, does not complete the evidence bundle, does not start Phase 1D, does not create a branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, does not claim witness independence completion, does not claim reviewer attestation complete, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 3O
Change date: 2026-05-02
Product version: 0.85.0
Methodology engine version: 0.9.1
Phase 3O — Phase 1D final evidence bundle completeness review readiness packet (docs-side / test-side planning)
Reason: Phase 3O adds the auditable Phase 1D final evidence bundle completeness review readiness packet that consumes the Phase 3N completeness rehearsal and no-op review transcript and pre-locks every input that the future evidence bundle completeness review needs, without completing the review and without completing the evidence bundle. The canonical readiness packet defaults to readiness not approved, readiness certificate not issued, evidence bundle incomplete, final start not authorised, Phase 1D not started, branch not created, NO-GO. The readiness packet is a verification artefact only and does not authorise Phase 1D, does not complete the evidence bundle review, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, does not claim witness independence completion, does not claim reviewer attestation complete, and does not issue a readiness certificate. Real Phase 1D start still requires every Phase 3O readiness check passed, every input registered, every review gate satisfied, and a separate execution instruction. Paid or public use remains blocked until the required legal or governance review state is explicitly completed where required.
What changed: A new deterministic auditable artefact lives at content/phase 1d final evidence bundle completeness review readiness packet.v1.json. It records sixteen source references (Phase 3N completeness rehearsal and no-op review transcript, Phase 3M evidence bundle index skeleton, Phase 3L reviewer attestation matrix skeleton, Phase 3K witness packet skeleton, Phase 3J final real-start go-or-no-go gate skeleton, Phase 3I final real-start rehearsal and no-op start transcript, Phase 3H final real-start authorisation packet, Phase 3G branch creation rehearsal and pre-flight command transcript, Phase 3F real branch creation authorisation packet, Phase 3E first-slice execution rehearsal pack, Phase 3D first-slice final execution checklist and go or no-go gate, Phase 3C first-slice real-authorisation evidence lock, Phase 3B first-slice dry-run execution packet, Phase 3A implementation branch preparation plan, Phase 2Z pre-branch final freeze certificate skeleton, Phase 2Y real-authorisation packet skeleton); a first-slice identity carried over from Phase 3N with all prior statuses non-authorising plus a new readiness review status pending review; a readiness packet status block; a readiness decision status block; twenty-six readiness preconditions; twenty-three readiness input register rows; fourteen evidence bundle review gate checklist rows; a sixteen-row source-hash readiness matrix; ten archive evidence readiness checks; twelve participant evidence readiness checks; nine legal and governance readiness checks; seven rollback readiness checks; nineteen no-secret checks; fourteen runtime boundary checks; twenty-three prohibited action confirmations; forty-one deterministic blockers; synthetic readiness rules proving every can-* false; a canonical readiness decision of NO-GO; fifty-one stop conditions; sixteen legal and governance disclaimers; and seventy safety markers all true. 
The top-level product version of every tracked content file is bumped to 0.85.0 to stay in sync, and every latest approved archive block is refreshed to the approved Phase 3N archive (1816084 bytes, 895 entries, 19557 / 19557 / 0 tests). Of the twenty-six latest approved archive blocks in the live archive after Phase 3O, six carry the optional product version field (the Phase 2U archive-evidence block plus the Phase 3K, 3L, 3M, 3N, and 3O latest approved archive blocks), and all six set product version to 0.84.0, matching the approved Phase 3N archive they point to. The Phase 2U summary block archive evidence snapshot product version is also set to 0.84.0 for the same reason. The other twenty older-schema latest approved archive blocks do not carry product version. The cascading hash refresh propagates from Phase 2N through Phase 3N to a fixed point. Eleven new unit tests pin the readiness packet end-to-end. NullProvider remains the default. No new local storage key, no API route, no service-role usage, no Phase 1D feature, no git command.
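The cascading hash refresh mentioned above can be pictured as repeated passes over artefacts that record the hashes of their sources, stopping once a pass changes nothing. The TypeScript sketch below is an illustration under assumptions: the `Artefact` shape, the canonical serialisation, and the function names are hypothetical and are not taken from the AgentProof codebase.

```typescript
import { createHash } from "node:crypto";

// Hypothetical artefact: its own payload plus the recorded hash of each source it references.
interface Artefact {
  payload: string;
  sourceHashes: Record<string, string>; // source id -> recorded hash
}

// Hash an artefact over a canonical form (keys sorted) so the digest is order-independent.
function hashArtefact(a: Artefact): string {
  const keys = Object.keys(a.sourceHashes).sort();
  const canonical = JSON.stringify([a.payload, keys.map((k) => [k, a.sourceHashes[k]])]);
  return createHash("sha256").update(canonical, "utf8").digest("hex");
}

// Refresh recorded hashes until a full pass makes no change.
// Returns the number of passes, including the final pass that changed nothing.
function refreshToFixedPoint(artefacts: Map<string, Artefact>, maxPasses = 100): number {
  for (let pass = 1; pass <= maxPasses; pass++) {
    let changed = false;
    for (const artefact of artefacts.values()) {
      for (const sourceId of Object.keys(artefact.sourceHashes)) {
        const source = artefacts.get(sourceId);
        if (source === undefined) {
          throw new Error(`unknown source: ${sourceId}`);
        }
        const current = hashArtefact(source);
        if (artefact.sourceHashes[sourceId] !== current) {
          artefact.sourceHashes[sourceId] = current; // refreshing this may stale its dependents
          changed = true;
        }
      }
    }
    if (!changed) return pass;
  }
  throw new Error("no fixed point reached (cyclic references?)");
}
```

Because updating one recorded hash changes the hash of the artefact that holds it, a single pass is not always enough; the loop ends only when every recorded hash already matches. The `maxPasses` guard is a design choice of this sketch so that an accidental reference cycle fails loudly instead of looping forever.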
User impact: None at runtime. The methodology summary is bumped to product version 0.85.0 to keep methodology tracking in sync with this Phase 3O docs-side / test-side planning readiness packet.
When to re-score: Phase 3O adds no scoring change, no engine change, no rendered-output change, and no runtime change. Existing scorecards remain valid.
Docs · Tests · Content · Methodology
Evidence trace: docs/phase 3 o.md, docs/phase plan.md (Phase 3O addendum), docs/phase 1 d entry decision record.md (Phase 3O addendum), README.md (Phase 3O shipped row), content/phase 1d final evidence bundle completeness review readiness packet.v1.json, an internal source file, tests/unit/phase_3o_*.test.ts (eleven), package.json (version bump 0.84.0 to 0.85.0), content/methodology changelog.v1.json (this 75th entry).
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3O internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-02
- Reference
- Phase 3O authorisation
- Impact assessment
- Adds a Phase 1D final evidence bundle completeness review readiness packet at the static content layer, one test-side support module, eleven new unit tests pinning shape and source integrity and slice identity and status and preconditions and inputs and gates and readiness checks and actions and blockers and decision and safety and Markdown rendering and cleanliness and determinism, and one founder-facing prose document at the docs layer. No product features. No runtime behaviour change. The readiness packet does not authorise any real Phase 1D work, does not start Phase 1D, does not create a branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, does not claim witness independence completion, does not claim reviewer attestation complete, does not issue a readiness certificate, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 3P
Change date: 2026-05-02
Product version: 0.86.0
Methodology engine version: 0.9.1
Phase 3P — Phase 1D final evidence bundle readiness certificate rehearsal and no-op issuance transcript (docs-side / test-side planning)
Reason: Phase 3P adds the auditable Phase 1D final evidence bundle readiness certificate rehearsal and no-op issuance transcript that consumes the Phase 3O readiness packet and rehearses readiness-certificate issuance without issuing a certificate, without approving readiness, without completing the evidence bundle review, and without completing the evidence bundle. The canonical transcript defaults to certificate issuance rehearsal not run, readiness certificate not issued, readiness not approved, evidence bundle review not complete, evidence bundle incomplete, final start not authorised, Phase 1D not started, branch not created, NO-GO. The transcript is a verification artefact only and does not authorise Phase 1D, does not approve readiness, does not issue a readiness certificate, does not complete the evidence bundle review, does not complete the evidence bundle, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, does not claim witness independence completion, and does not claim reviewer attestation complete. Real Phase 1D start still requires every Phase 3P certificate rehearsal precondition passed, every input registered, every no-op issuance row preview-only, and a separate execution instruction. Paid or public use remains blocked until the required legal or governance review state is explicitly completed where required.
What changed: A new deterministic auditable artefact lives at content/phase 1d final evidence bundle readiness certificate rehearsal noop issuance transcript.v1.json. It records seventeen source references (Phase 3O readiness packet, Phase 3N completeness rehearsal and no-op review transcript, Phase 3M evidence bundle index skeleton, Phase 3L reviewer attestation matrix skeleton, Phase 3K witness packet skeleton, Phase 3J final real-start go-or-no-go gate skeleton, Phase 3I final real-start rehearsal and no-op start transcript, Phase 3H final real-start authorisation packet, Phase 3G branch creation rehearsal and pre-flight command transcript, Phase 3F real branch creation authorisation packet, Phase 3E first-slice execution rehearsal pack, Phase 3D first-slice final execution checklist and go or no-go gate, Phase 3C first-slice real-authorisation evidence lock, Phase 3B first-slice dry-run execution packet, Phase 3A implementation branch preparation plan, Phase 2Z pre-branch final freeze certificate skeleton, Phase 2Y real-authorisation packet skeleton); a first-slice identity carried over from Phase 3O with all prior statuses non-authorising plus a new readiness certificate status not issued and a new certificate issuance rehearsal status not run; a certificate rehearsal status block; a no-op issuance status block; twenty-seven issuance preconditions; twenty-six certificate input register rows; thirteen no-op issuance transcript rows (every row preview-only and not executed); fifteen expected issuance evidence capture rows; a seventeen-row source-hash verification matrix; ten archive evidence verification rows; ten readiness packet review rows; eleven participant / legal / rollback review rows; nineteen no-secret checks; fourteen runtime boundary checks; twenty-four prohibited action confirmations; forty-three deterministic blockers; synthetic issuance rules proving every can-* false; a canonical issuance decision of NO-GO; fifty-three stop conditions; seventeen 
legal and governance disclaimers; and seventy-two safety markers all true. The top-level product version of every tracked content file is bumped to 0.86.0 to stay in sync, and every latest approved archive block is refreshed to the approved Phase 3O archive (1846047 bytes, 909 entries, 19792 / 19792 / 0 tests). Of the twenty-seven latest approved archive blocks in the live archive after Phase 3P, seven carry the optional product version field (the Phase 2U archive-evidence block plus the Phase 3K, 3L, 3M, 3N, 3O, and 3P latest approved archive blocks), and all seven set product version to 0.85.0, matching the approved Phase 3O archive they point to. The Phase 2U summary block archive evidence snapshot product version is also set to 0.85.0 for the same reason. The other twenty older-schema latest approved archive blocks do not carry product version. The cascading hash refresh propagates from Phase 2N through Phase 3O to a fixed point. Eleven new unit tests pin the certificate rehearsal transcript end-to-end. NullProvider remains the default. No new local storage key, no API route, no service-role usage, no Phase 1D feature, no git command.
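Two of the properties listed above, that every no-op issuance row is preview-only and not executed and that every can-* rule is false, lend themselves to small pure checks. The sketch below shows one way such checks could look. The row shape, the key names such as `can_issue_certificate`, and both function names are hypothetical and chosen only for illustration.

```typescript
// Hypothetical shape of one no-op issuance transcript row.
interface NoOpRow {
  id: string;
  previewOnly: boolean;
  executed: boolean;
}

// Returns the ids of rows that are not preview-only or that claim execution.
function nonPreviewRows(rows: readonly NoOpRow[]): string[] {
  return rows.filter((row) => !row.previewOnly || row.executed).map((row) => row.id);
}

// Returns every key starting with "can" whose value is not strictly false, sorted for stable output.
function violatingCanFlags(rules: Readonly<Record<string, unknown>>): string[] {
  return Object.keys(rules)
    .filter((key) => key.startsWith("can") && rules[key] !== false)
    .sort();
}
```

Returning the offending ids or keys, rather than a bare boolean, keeps a failing assertion readable: a mutated synthetic fixture that flips a single flag is reported by name.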
User impact: None at runtime. The methodology summary is bumped to product version 0.86.0 to keep methodology tracking in sync with this Phase 3P docs-side / test-side planning certificate rehearsal transcript.
When to re-score: Phase 3P adds no scoring change, no engine change, no rendered-output change, and no runtime change. Existing scorecards remain valid.
Docs · Tests · Content · Methodology
Evidence trace: docs/phase 3 p.md, docs/phase plan.md (Phase 3P addendum), docs/phase 1 d entry decision record.md (Phase 3P addendum), README.md (Phase 3P shipped row), content/phase 1d final evidence bundle readiness certificate rehearsal noop issuance transcript.v1.json, an internal source file, tests/unit/phase_3p_*.test.ts (eleven), package.json (version bump 0.85.0 to 0.86.0), content/methodology changelog.v1.json (this 76th entry).
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3P internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-02
- Reference
- Phase 3P authorisation
- Impact assessment
- Adds a Phase 1D final evidence bundle readiness certificate rehearsal and no-op issuance transcript at the static content layer, one test-side support module, eleven new unit tests pinning shape and source integrity and slice identity and status and preconditions and inputs and rows and review and boundaries and actions and blockers and decision and safety and Markdown rendering and cleanliness and determinism, and one founder-facing prose document at the docs layer. No product features. No runtime behaviour change. The transcript does not authorise any real Phase 1D work, does not approve readiness, does not issue a readiness certificate, does not start Phase 1D, does not create a branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, does not claim witness independence completion, does not claim reviewer attestation complete, does not complete the evidence bundle review, does not complete the evidence bundle, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 3Q
Change date: 2026-05-02
Product version: 0.87.0
Methodology engine version: 0.9.1
Phase 3Q — Phase 1D final readiness certificate issuance blocker ledger (docs-side / test-side planning)
Reason: Phase 3Q adds the auditable Phase 1D final readiness certificate issuance blocker ledger that consumes the Phase 3P certificate rehearsal and no-op issuance transcript and lists every blocker still preventing real readiness certificate issuance, evidence bundle review completion, evidence bundle completion, Phase 1D start, branch creation, and first-slice execution. The canonical ledger defaults to BLOCKERS PRESENT, READINESS CERTIFICATE NOT ISSUED, READINESS NOT APPROVED, EVIDENCE BUNDLE REVIEW NOT COMPLETE, EVIDENCE BUNDLE INCOMPLETE, FINAL START NOT AUTHORISED, PHASE 1D NOT STARTED, BRANCH NOT CREATED, NO-GO. The ledger is a verification artefact only and does not clear blockers, does not approve readiness, does not issue a readiness certificate, does not complete the evidence bundle review, does not complete the evidence bundle, does not authorise Phase 1D, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, does not claim witness independence completion, does not claim reviewer attestation complete, and does not unlock implementation. The ledger drives toward controlled base-product implementation once required blockers are cleared by the documented future reviewer passes; the project is not parked and is not frozen. Real Phase 1D start still requires every Phase 3Q blocker cleared by the documented evidence, every implementation unlock criterion met, every clearance order step performed under future explicit authorisation, and a separate execution instruction. Paid or public use remains blocked until the required legal or governance review state is explicitly completed where required.
What changed: A new deterministic auditable artefact lives at content/phase 1d final readiness certificate issuance blocker ledger.v1.json. It records eighteen source references covering Phase 3P certificate rehearsal, Phase 3O readiness packet, Phase 3N completeness rehearsal, Phase 3M evidence bundle index, Phase 3L reviewer attestation matrix, Phase 3K witness packet, Phase 3J go-or-no-go gate, Phase 3I no-op transcript, Phase 3H final start packet, Phase 3G branch rehearsal, Phase 3F branch authorisation, Phase 3E execution rehearsal, Phase 3D first-slice gate, Phase 3C evidence lock, Phase 3B dry-run packet, Phase 3A plan, Phase 2Z freeze certificate, and Phase 2Y real authorisation packet; a first-slice identity carried over from Phase 3P with a new blocker ledger status pending review; a blocker ledger status block; a blocker summary with internally consistent counts; twenty blocker categories; thirty-eight active blocker items; twenty-one clearance evidence rows, each canonically missing; twelve clearance order steps, none of which can be performed in this phase; fourteen implementation unlock criteria, each canonically not met; an eighteen-row source-hash verification matrix; ten archive evidence verification rows; ten readiness, certificate, and evidence review rows; ten participant, legal, and rollback blocker rows; nineteen no-secret checks; fourteen runtime boundary checks; twenty-four prohibited action confirmations; forty-three deterministic blockers including the explicit project park or freeze recommendation blocker; synthetic blocker ledger rules proving every can-* false; a canonical blocker ledger decision of NO-GO; fifty-three stop conditions; nineteen legal and governance disclaimers; and seventy-nine safety markers all true. The top-level product version of every tracked content file is bumped to 0.87.0 to stay in sync, and every latest approved archive block is refreshed to the approved Phase 3P archive (1884609 bytes, 923 entries, 20042 / 20042 / 0 tests).
Of the twenty-eight latest approved archive blocks in the live archive after Phase 3Q, eight carry the optional product version field (the Phase 2U archive-evidence block plus the Phase 3K, 3L, 3M, 3N, 3O, 3P, and 3Q latest approved archive blocks), and all eight set product version to 0.86.0, matching the approved Phase 3P archive they point to. The Phase 2U summary block archive evidence snapshot product version is also set to 0.86.0 for the same reason. The other twenty older-schema latest approved archive blocks do not carry product version. The cascading hash refresh propagates from Phase 2N through Phase 3P to a fixed point. Eleven new unit tests pin the blocker ledger end-to-end. NullProvider remains the default. No new local storage key, no API route, no service-role usage, no Phase 1D feature, no git command.
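The phrase "a blocker summary with internally consistent counts" can be read as: the summary must equal what one would recompute from the blocker items themselves. The following TypeScript sketch shows that idea under assumptions. The `BlockerItem` and `BlockerSummary` shapes and the function names are hypothetical, not the ledger's actual schema.

```typescript
type BlockerStatus = "active" | "cleared";

// Hypothetical shapes for illustration only.
interface BlockerItem {
  id: string;
  category: string;
  status: BlockerStatus;
}

interface BlockerSummary {
  total: number;
  active: number;
  cleared: number;
  byCategory: Record<string, number>;
}

// Recompute the summary from the items.
function summarise(items: readonly BlockerItem[]): BlockerSummary {
  const byCategory: Record<string, number> = {};
  let active = 0;
  for (const item of items) {
    byCategory[item.category] = (byCategory[item.category] ?? 0) + 1;
    if (item.status === "active") active += 1;
  }
  return { total: items.length, active, cleared: items.length - active, byCategory };
}

// Serialise per-category counts with sorted keys so comparison ignores key order.
function canonicalCounts(counts: Record<string, number>): string {
  return JSON.stringify(Object.keys(counts).sort().map((key) => [key, counts[key]]));
}

// A recorded summary is consistent when it matches the recomputed one field by field.
function summaryIsConsistent(summary: BlockerSummary, items: readonly BlockerItem[]): boolean {
  const expected = summarise(items);
  return (
    summary.total === expected.total &&
    summary.active === expected.active &&
    summary.cleared === expected.cleared &&
    canonicalCounts(summary.byCategory) === canonicalCounts(expected.byCategory)
  );
}
```

In this reading, a hand-edited summary that still claims the old totals after an item is added or removed fails the check, which is the kind of drift a consistency test is meant to catch.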
User impact: None at runtime. The methodology summary is bumped to product version 0.87.0 to keep methodology tracking in sync with this Phase 3Q docs-side / test-side planning blocker ledger. The ledger gives the founder one calm view of exactly what remains blocked before controlled base-product implementation can begin; the project is not parked and is not frozen.
When to re-score: Phase 3Q adds no scoring change, no engine change, no rendered-output change, and no runtime change. Existing scorecards remain valid.
Docs · Tests · Content · Methodology
Evidence trace: docs/phase 3 q.md, docs/phase plan.md (Phase 3Q addendum), docs/phase 1 d entry decision record.md (Phase 3Q addendum), README.md (Phase 3Q shipped row), content/phase 1d final readiness certificate issuance blocker ledger.v1.json, an internal source file, tests/unit/phase_3q_*.test.ts (eleven), package.json (version bump 0.86.0 to 0.87.0), content/methodology changelog.v1.json (this 77th entry).
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3Q internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-02
- Reference
- Phase 3Q authorisation
- Impact assessment
- Adds a Phase 1D final readiness certificate issuance blocker ledger at the static content layer, one test-side support module, eleven new unit tests pinning shape and source integrity and slice identity and status and summary and blockers and clearance and unlock criteria and boundaries and decision and safety and Markdown rendering and cleanliness and determinism, and one founder-facing prose document at the docs layer. No product features. No runtime behaviour change. The ledger does not authorise any real Phase 1D work, does not approve readiness, does not issue a readiness certificate, does not start Phase 1D, does not create a branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, does not claim witness independence completion, does not claim reviewer attestation complete, does not complete the evidence bundle review, does not complete the evidence bundle, does not unlock implementation, and does not replace legal review. The ledger explicitly does not park or freeze the project; it points the founder and the future reviewer toward controlled base-product implementation. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 3R
Change date: 2026-05-02
Product version: 0.88.0
Methodology engine version: 0.9.1
Phase 3R — Phase 1D controlled implementation start decision packet (docs-side / test-side planning)
Reason: Phase 3R adds the auditable Phase 1D controlled implementation start decision packet that consumes the Phase 3Q blocker ledger and consolidates the defensive hardening loop into one practical controlled-start decision path. The canonical packet defaults to CONTROLLED START NOT APPROVED, IMPLEMENTATION NOT UNLOCKED, BLOCKERS PRESENT, READINESS CERTIFICATE NOT ISSUED, READINESS NOT APPROVED, EVIDENCE BUNDLE REVIEW NOT COMPLETE, EVIDENCE BUNDLE INCOMPLETE, FINAL START NOT AUTHORISED, PHASE 1D NOT STARTED, BRANCH NOT CREATED, NO-GO. The packet is a verification artefact only and does not approve controlled start, does not unlock implementation, does not clear blockers, does not issue a readiness certificate, does not approve readiness, does not complete the evidence bundle review, does not complete the evidence bundle, does not authorise Phase 1D, does not start Phase 1D, does not create a real branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, does not claim witness independence completion, does not claim reviewer attestation complete, does not park the project, and does not freeze the project. The packet drives toward controlled base-product implementation once required blockers are cleared and an explicit execution instruction is given. Real Phase 1D start still requires every Phase 3R controlled-start gate verified, every base-product implementation start criterion met, every future explicit authorisation captured, every Phase 1D execution boundary intact, and a separate explicit execution instruction. Paid or public use remains blocked until the required legal or governance review state is explicitly completed where required.
What changed: A new deterministic auditable artefact lives at content/phase 1d controlled implementation start decision packet.v1.json. It records nineteen source references covering Phase 3Q blocker ledger, Phase 3P certificate rehearsal, Phase 3O readiness packet, Phase 3N completeness rehearsal, Phase 3M evidence bundle index, Phase 3L reviewer attestation matrix, Phase 3K witness packet, Phase 3J go-or-no-go gate, Phase 3I no-op transcript, Phase 3H final start packet, Phase 3G branch rehearsal, Phase 3F branch authorisation, Phase 3E execution rehearsal, Phase 3D first-slice gate, Phase 3C evidence lock, Phase 3B dry-run packet, Phase 3A plan, Phase 2Z freeze certificate, and Phase 2Y real authorisation packet; a first-slice identity carried over from Phase 3Q with new controlled-start decision status pending review and implementation unlock status not unlocked; a controlled-start decision status block; an implementation transition summary; eighteen controlled-start gates; eighteen base-product implementation start criteria; ten future explicit authorisations; a first controlled slice preview that is preview-only; eleven Phase 1D execution boundaries; a nineteen-row source-hash verification matrix; eleven archive evidence verification rows; eight blocker ledger review rows; ten readiness/evidence/certificate review rows; ten participant/legal/rollback readiness rows; nineteen no-secret checks; fourteen runtime boundary checks; twenty-five prohibited action confirmations; forty-eight deterministic blockers including the explicit project park or freeze recommendation blocker; synthetic controlled-start rules proving every can-* false; a canonical controlled-start decision of NO-GO; fifty-five stop conditions; twenty legal and governance disclaimers; and eighty-five safety markers all true. 
Every tracked content file's top-level product version is bumped to 0.88.0 to stay in sync, and every latest approved archive block is refreshed to the approved Phase 3Q archive (1922776 bytes, 937 entries, 20304 / 20304 / 0 tests). Of the twenty-nine latest approved archive blocks in the live archive after Phase 3R, nine carry the optional product version field (the Phase 2U archive-evidence block plus the Phase 3K, 3L, 3M, 3N, 3O, 3P, 3Q, and 3R latest approved archive blocks), and all nine set product version to 0.87.0, matching the approved Phase 3Q archive they point to. The Phase 2U summary block's archive evidence snapshot product version is also set to 0.87.0 for the same reason. The other twenty older-schema latest approved archive blocks do not carry product version. The cascading hash refresh propagates from Phase 2N through Phase 3Q to a fixed point. Eleven new unit tests pin the controlled-start decision packet end-to-end. NullProvider remains the default. No new local storage key, API route, service-role usage, Phase 1D feature, or git command is added.
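The archive-block arithmetic above (twenty-nine blocks, nine carrying the optional product version field, twenty older-schema blocks without it) can be checked with a small tally. The sketch below is illustrative only: the `ArchiveBlock` shape, the `checkArchiveBlocks` helper, and the `older-N` labels are hypothetical, and the real blocks are not read from disk here.

```typescript
// Hypothetical block shape: only the optional product version matters for this check.
type ArchiveBlock = { owner: string; productVersion?: string };

type ArchiveBlockTally = {
  total: number;
  carrying: number;     // blocks that carry the optional field
  olderSchema: number;  // blocks that omit it
  mismatched: string[]; // owners whose version differs from the expected one
};

function checkArchiveBlocks(blocks: ArchiveBlock[], expectedVersion: string): ArchiveBlockTally {
  const carrying = blocks.filter((block) => block.productVersion !== undefined);
  const mismatched = carrying
    .filter((block) => block.productVersion !== expectedVersion)
    .map((block) => block.owner);
  return {
    total: blocks.length,
    carrying: carrying.length,
    olderSchema: blocks.length - carrying.length,
    mismatched,
  };
}

// The nine phases named in the entry, each pointing at the 0.87.0 archive.
const carryingOwners = ["2U", "3K", "3L", "3M", "3N", "3O", "3P", "3Q", "3R"];
const exampleBlocks: ArchiveBlock[] = carryingOwners.map((owner) => ({
  owner,
  productVersion: "0.87.0",
}));

// Twenty older-schema blocks; the "older-N" labels are placeholders, not real phase names.
for (let i = 1; i <= 20; i += 1) {
  exampleBlocks.push({ owner: `older-${i}` });
}

const tally = checkArchiveBlocks(exampleBlocks, "0.87.0");
console.log(tally.total, tally.carrying, tally.olderSchema, tally.mismatched.length);
```

Treating an absent field as "older schema" rather than as a mismatch mirrors the entry's wording that those twenty blocks simply do not carry a product version.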
User impact: None at runtime. The methodology summary is bumped to product version 0.88.0 to keep methodology tracking in sync with this Phase 3R docs-side / test-side planning controlled-start decision packet. The packet ends the defensive hardening loop and points the founder and the future reviewer at the controlled route into base-product implementation. The project is not parked and is not frozen.
When to re-score: Phase 3R adds no scoring change, no engine change, no rendered-output change, and no runtime change. Existing scorecards remain valid.
Docs, Tests, Content, Methodology
Evidence trace: docs/phase 3 r.md, docs/phase plan.md (Phase 3R addendum), docs/phase 1 d entry decision record.md (Phase 3R addendum), README.md (Phase 3R shipped row), content/phase 1d controlled implementation start decision packet.v1.json, an internal source file, tests/unit/phase_3r_*.test.ts (eleven), package.json (version bump 0.87.0 to 0.88.0), content/methodology changelog.v1.json (this 78th entry).
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3R internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-02
- Reference
- Phase 3R authorisation
- Impact assessment
- Adds a Phase 1D controlled implementation start decision packet at the static content layer, one test-side support module, eleven new unit tests pinning shape and source integrity and slice identity and status and transition and gates and criteria and authorisations and boundaries and reviews and checks and decision and safety and Markdown rendering and cleanliness and determinism, and one founder-facing prose document at the docs layer. No product features. No runtime behaviour change. The packet does not authorise any real Phase 1D work, does not approve controlled start, does not unlock implementation, does not start Phase 1D, does not create a branch, does not execute the first slice, does not complete the first slice, does not run any git command, does not change git state, does not mark Phase 1D as started, does not claim final start approval, does not claim Phase 1D completion, does not claim witness signed, does not claim witness acknowledged, does not claim witness independence completion, does not claim reviewer attestation complete, does not complete the evidence bundle review, does not complete the evidence bundle, does not park the project, does not freeze the project, and does not replace legal review. The packet ends the defensive hardening loop and points the founder and the future reviewer at the controlled route into base-product implementation. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2U | Change date: 2026-05-01 | Product version: 0.65.0 | Methodology engine version: 0.9.1
Phase 1D final pre-authorisation checklist
Reason: Phase 2T closed the reviewer-handoff loop. Phase 2U closes the founder-facing final-decision loop with one calm deterministic checklist that records the founder's final pre-authorisation decision moment before real Phase 1D may begin. Where Phase 2T was the reviewer-facing front door, Phase 2U is the founder-facing final gate. The canonical checklist remains pending review and never asserts a real authorisation; it is the auditable single decision row that ties every prior phase's evidence together.
What changed: Phase 2U is a docs-side and test-side hardening phase. No product features were added. A new deterministic Phase 1D final pre-authorisation checklist at the static content layer records the founder-facing final decision moment before real Phase 1D may begin. The checklist's top-level shape adds a checklist version, a product version that matches the package version after the Phase 2U bump, a generated-from-phase identifier of two-U, a checklist type that explicitly marks the checklist as a Phase 1D final pre-authorisation checklist, a dry-run-final-gate-only checklist status, a phase-owner identifier of two-U, a calm description, a final-authorisation-decision field defaulting to pending review (the canonical file never asserts yes), a latest-approved-archive block carrying the approved Phase 2T archive name and archive hash and archive size and file count and test totals as fixed example evidence, a decision-rows array of ten ordered rows, a summary block, a legal-and-governance notes block, and a safety markers block. Each decision row carries a stable identifier, a founder-facing decision name, the documented decision source, the source phase, the source content file, a deterministic SHA-256 of the source content body, the documented required evidence list, a calm pass condition, a calm failure meaning, an unresolved-risks acknowledgement field, a legal-or-governance review state field with documented enum values (completed, pending, not applicable), a decision state field with documented enum values (passed, failed, pending review) defaulting to pending review, a required-before-real-authorisation flag set to true, a dry-run-only marker set to true, a not-real-authorisation marker set to true, and calm notes. 
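The decision-row fields described above could be modelled roughly as follows. This is a sketch under stated assumptions: the two enum value sets are the ones this entry documents, but every field name, the boolean type of the unresolved-risks acknowledgement, and the `pending` default for the legal-or-governance state are hypothetical choices made for illustration.

```typescript
// Enum values are those documented in the entry; the type names are illustrative.
type LegalReviewState = "completed" | "pending" | "not applicable";
type DecisionState = "passed" | "failed" | "pending review";

// Hypothetical field names for one decision row.
interface DecisionRow {
  id: string;
  decisionName: string;
  decisionSource: string;
  sourcePhase: string;
  sourceContentFile: string;
  sourceSha256: string;
  requiredEvidence: string[];
  passCondition: string;
  failureMeaning: string;
  unresolvedRisksAcknowledged: boolean; // assumed to be a boolean
  legalOrGovernanceReviewState: LegalReviewState;
  decisionState: DecisionState;
  requiredBeforeRealAuthorisation: true; // the three markers are pinned to true
  dryRunOnly: true;
  notRealAuthorisation: true;
  notes: string;
}

// Fields a caller supplies; the state fields and markers are filled with defaults.
type DecisionRowInput = Omit<
  DecisionRow,
  | "unresolvedRisksAcknowledged"
  | "legalOrGovernanceReviewState"
  | "decisionState"
  | "requiredBeforeRealAuthorisation"
  | "dryRunOnly"
  | "notRealAuthorisation"
>;

function makePendingRow(input: DecisionRowInput): DecisionRow {
  return {
    ...input,
    unresolvedRisksAcknowledged: false,
    legalOrGovernanceReviewState: "pending", // assumed default
    decisionState: "pending review",         // documented default
    requiredBeforeRealAuthorisation: true,
    dryRunOnly: true,
    notRealAuthorisation: true,
  };
}

// Example values are placeholders, not real paths or hashes.
const exampleRow = makePendingRow({
  id: "row-01",
  decisionName: "Latest approved archive evidence is recorded",
  decisionSource: "example decision source",
  sourcePhase: "2T",
  sourceContentFile: "content/example.v1.json",
  sourceSha256: "placeholder-sha256",
  requiredEvidence: ["example evidence item"],
  passCondition: "example pass condition",
  failureMeaning: "example failure meaning",
  notes: "",
});

console.log(exampleRow.decisionState);
```

Typing the three markers as the literal `true` makes a row with a missing or false marker a compile-time error in this sketch, which is one way to keep the dry-run status from being dropped by accident.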
Required decision rows are: latest approved archive evidence is recorded; latest test totals are recorded and green; Phase 2P dry-run readiness pack index is reviewed; Phase 2Q scope boundary manifest is reviewed; Phase 2R implementation sequence manifest is reviewed; Phase 2S no-regression acceptance pack is reviewed; Phase 2T audit handoff pack is reviewed; legal and governance review state is recorded; unresolved risks are acknowledged; final real Phase 1D authorisation decision is still pending. The summary block aggregates byte-stably and deterministically the archive evidence snapshot, the test totals snapshot, the Phase 2P readiness snapshot, the Phase 2Q boundary-status snapshot, the Phase 2R sequence-status snapshot, the Phase 2S no-regression-status snapshot, the Phase 2T audit-handoff-status snapshot, an unresolved risks list, and the final decision state. The safety markers block pins nineteen documented dry-run / not-Phase-1D / NullProvider / no-service-role / no-real-customer / canonical-pending-review flags. The checklist does not authorise real Phase 1D work, does not start Phase 1D, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not implement any slice, does not run commands, does not change runtime behaviour, does not call Supabase, does not write to local storage, and does not replace legal review. 
One new test-side support module ships alongside the checklist: a deterministic helper that loads the checklist, validates every decision row's evidence-source reference resolves to an existing file, validates every recorded SHA-256 matches the on-disk body, validates every referenced product version is aligned where appropriate, validates the summary block matches the live upstream Phase 2P-2T artefacts, validates the canonical checklist never asserts yes, exposes eight documented mutated synthetic fixtures (missing source content file, stale source SHA-256, stale product version, final decision set to yes while one required row is not passed, unresolved risks not acknowledged, legal or governance review state missing, dry-run marker missing, not-real-authorisation marker missing), and renders a calm founder-facing Markdown final-gate cover sheet. The helper never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses Date.now or new Date or Math.random or crypto random or process.env or local time. Eight new unit tests pin the checklist end-to-end. The shape test asserts the documented top-level shape, every per-row field-shape rule, every documented enum, and that no required field is missing. The decision coverage test asserts every required decision row is present exactly once, decision rows appear in the documented stable order, every documented row name is recognised, and the canonical final-authorisation-decision is pending review. The evidence integrity test asserts every recorded source SHA-256 matches the on-disk source content body, every referenced source content product version matches package.json where applicable, and documented synthetic mutations (missing source content file, stale source SHA-256, stale product version, dry-run marker missing, not-real-authorisation marker missing) each fail the validator with a deterministic reason. 
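The evidence integrity check described above (recorded SHA-256 versus the source body, with a deterministic failure reason per mutation) might look like the sketch below. It assumes Node's built-in `crypto` module and uses an in-memory record in place of files on disk so the example stays self-contained. The type names, the reason strings, and the example path are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shapes for a recorded source reference and a validation result.
type SourceRecord = { sourceFile: string; sha256: string };
type Validation = { ok: true } | { ok: false; reason: string };

// Hex SHA-256 of a UTF-8 body.
function sha256Hex(body: string): string {
  return createHash("sha256").update(body, "utf8").digest("hex");
}

// `files` stands in for on-disk bodies: file name -> file body.
function validateSource(record: SourceRecord, files: Record<string, string>): Validation {
  if (!Object.prototype.hasOwnProperty.call(files, record.sourceFile)) {
    return { ok: false, reason: "missing-source-content-file" };
  }
  if (sha256Hex(files[record.sourceFile]) !== record.sha256) {
    return { ok: false, reason: "stale-source-sha256" };
  }
  return { ok: true };
}

// Placeholder file name and body, used only for the example.
const files: Record<string, string> = { "content/example.v1.json": "abc" };

const canonicalRecord: SourceRecord = {
  sourceFile: "content/example.v1.json",
  sha256: sha256Hex("abc"),
};

// Two mutated synthetic fixtures derived from the canonical record.
const staleRecord: SourceRecord = { ...canonicalRecord, sha256: "0000" };
const missingRecord: SourceRecord = { ...canonicalRecord, sourceFile: "content/missing.v1.json" };

console.log(validateSource(canonicalRecord, files));
console.log(validateSource(staleRecord, files));
console.log(validateSource(missingRecord, files));
```

Returning a fixed reason string instead of a free-form message is what makes each failure comparable across runs in this sketch.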
The summary consistency test asserts the summary block agrees with the live upstream Phase 2P-2T artefacts byte-for-byte (archive evidence snapshot matches the latest approved archive, test totals snapshot matches the live archive, Phase 2P readiness snapshot reads from the live readiness pack index, Phase 2Q boundary-status snapshot reads from the live boundary manifest, Phase 2R sequence-status snapshot reads from the live sequence manifest, Phase 2S no-regression-status snapshot reads from the live acceptance pack, Phase 2T audit-handoff-status snapshot reads from the live audit handoff pack). The final-gate safety test asserts the checklist does not authorise / start Phase 1D, does not imply Phase 1D is in progress, does not mark any slice complete, the canonical final-authorisation-decision is pending review, the synthetic yes fixture passes only when every required decision row is passed and every unresolved risk is acknowledged and legal-or-governance state is acceptable, the synthetic mutations (yes-while-row-not-passed, unresolved risks not acknowledged, legal-or-governance state missing) each fail with a deterministic reason, NullProvider remains default, API route count remains unchanged, Supabase remains unchanged, local storage remains unchanged. The Markdown rendering test asserts the rendered Markdown is non-empty, carries the documented title, the latest approved archive SHA-256, the latest test totals, every decision row name, the summary block snapshot fields, the legal-and-governance notes, the dry-run-final-gate-only marker, the not-real-authorisation marker, never claims Phase 1D is authorised or has started, and is byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value through the Phase 2A central forbidden-wording registry. 
The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, that decision rows appear in the documented stable order, and no clock or random or environment or absolute-path or runtime-secret reference in the checklist content file. New founder-facing documents at the docs layer add a Phase 2U document and a short addendum to the Phase 1D entry decision-record document explaining where to find the final pre-authorisation checklist. The Phase 2I audit bundle manifest, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, the Phase 2R sequence manifest, the Phase 2S no-regression acceptance pack, the Phase 2T audit handoff pack, and the new Phase 2U final pre-authorisation checklist product versions are bumped to 0.65.0 to stay in sync with the package version. 
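The byte-stability and no-clock checks described above can be illustrated with two small helpers. This is a sketch, not the shipped test: `stableStringify` assumes JSON-compatible values only, and the list of forbidden references is a hypothetical subset based on the names this entry mentions.

```typescript
// Serialise with object keys sorted at every level, so the output does not
// depend on property insertion order. Assumes JSON-compatible input.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map((item) => stableStringify(item)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const keys = Object.keys(record).sort();
    const parts = keys.map((key) => JSON.stringify(key) + ":" + stableStringify(record[key]));
    return "{" + parts.join(",") + "}";
  }
  return JSON.stringify(value);
}

// Hypothetical subset of references a content file should not contain.
const FORBIDDEN_REFERENCES = ["Date.now", "new Date", "Math.random", "process.env"];

// Returns every forbidden reference found in the file body, in list order.
function findForbiddenReferences(fileBody: string): string[] {
  return FORBIDDEN_REFERENCES.filter((needle) => fileBody.indexOf(needle) !== -1);
}

// Same content, different key order: both serialise to identical bytes.
const first = stableStringify({ b: 1, a: [2, { d: 3, c: 4 }] });
const second = stableStringify({ a: [2, { c: 4, d: 3 }], b: 1 });

console.log(first === second);
console.log(findForbiddenReferences('{"note":"calls Date.now at load"}'));
```

Array order is left untouched on purpose, since the entry treats row order as part of the documented, stable shape.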
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2N gate report builder and renderer, the Phase 2O entry helper and Markdown and CSV renderers, the Phase 2P index helper and Markdown renderer, the Phase 2Q scope boundary manifest helper and Markdown renderer, the Phase 2R sequence manifest helper and Markdown renderer, the Phase 2S no-regression acceptance pack helper and Markdown renderer, the Phase 2T audit handoff pack helper and Markdown renderer, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
User impact: Founders see no on-screen change. The hardening adds the founder-facing final gate checklist that records the final pre-authorisation decision moment before real Phase 1D may begin. The canonical checklist remains pending review; it never asserts a real authorisation. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2U is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2U authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2U internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-01
- Reference
- Phase 2U authorisation
- Impact assessment
- Adds a Phase 1D final pre-authorisation checklist at the static content layer, one test-side support module (a deterministic helper that loads the checklist, validates the evidence-source reference for every decision row and SHA-256 alignment and product-version alignment, validates the summary block against the live upstream Phase 2P-2T artefacts, exposes eight documented mutated synthetic fixtures, and renders a calm founder-facing Markdown final-gate cover sheet), and eight new unit tests pinning shape and decision coverage and evidence integrity and summary consistency and final-gate safety and Markdown rendering and cleanliness and determinism, and two founder-facing prose documents at the docs layer. No product features. No runtime behaviour change. The checklist does not authorise any real Phase 1D work and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. 
The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation / optional-field checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, the Phase 2R sequence manifest, the Phase 2S no-regression acceptance pack, the Phase 2T audit handoff pack, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2V | Change date: 2026-05-01 | Product version: 0.66.0 | Methodology engine version: 0.9.1
Phase 1D launch-readiness verification report
Reason: Phase 2U closed the founder-facing final-decision loop. Phase 2V closes the reviewer-facing launch-readiness loop with one calm deterministic report that ties the Phase 2U final-gate decision moment and the full Phase 2K to Phase 2U readiness chain into one reviewer-facing Markdown receipt suitable as an external-review evidence document. The canonical report remains pending review and never asserts a real launch authorisation; it is the auditable single launch-readiness document that ties every prior phase together.
What changed: Phase 2V is a docs-side and test-side hardening phase. No product features were added. A new deterministic Phase 1D launch-readiness verification report at the static content layer ties the Phase 2K to Phase 2U readiness chain into one reviewer-facing Markdown receipt. The report's top-level shape adds a report version, a product version that matches the package version after the Phase 2V bump, a generated-from-phase identifier of two-V, a report type that explicitly marks the report as a Phase 1D launch-readiness verification report, a dry-run-launch-readiness-only report status, a calm description, a calm reviewer-facing report purpose, a final-pre-authorisation-checklist reference pointing at the Phase 2U checklist, an audit-handoff-pack reference pointing at the Phase 2T pack, a readiness-chain references block pointing at the Phase 2K to Phase 2U artefacts, a latest-approved-archive block carrying the approved Phase 2U archive name and archive hash and archive size and file count and test totals as fixed example evidence, a readiness-rows array of eleven ordered rows (one per Phase 2K to Phase 2U artefact), an evidence-summary block, a launch-readiness-decision field defaulting to pending review (the canonical file never asserts yes), a legal-and-governance notes block, and a safety markers block. Each readiness row carries a stable identifier, a founder-facing row name, the documented phase owner, the source content file, the source doc file, the artefact's product version, a deterministic SHA-256 of the source content body, a reviewer state field with documented enum values (default pending review), a calm readiness meaning, a documented required-evidence list, a calm pass condition, a calm failure meaning, an unresolved-risk state, a legal-or-governance state, a dry-run-only marker set to true, a not-real-authorisation marker set to true, and calm notes. 
Required readiness rows cover Phase 2K readiness freeze checklist, Phase 2L decision-record template, Phase 2M synthetic completed dry-run decision record, Phase 2N synthetic entry-gate dry-run report, Phase 2O synthetic Verification Journal entry, Phase 2P dry-run readiness pack index, Phase 2Q Phase 1D scope boundary manifest, Phase 2R Phase 1D implementation sequence manifest, Phase 2S Phase 1D no-regression acceptance pack, Phase 2T Phase 1D audit handoff pack, Phase 2U Phase 1D final pre-authorisation checklist. The evidence summary aggregates byte-stably and deterministically the archive evidence snapshot, the test totals snapshot, the readiness-pack snapshot, the scope-boundary snapshot, the implementation-sequence snapshot, the no-regression-acceptance snapshot, the audit-handoff snapshot, the final-pre-authorisation snapshot, an unresolved risks list, the legal-and-governance review state, and the launch-readiness decision state. The safety markers block pins nineteen documented dry-run / not-Phase-1D / NullProvider / no-service-role / no-real-customer / canonical-pending-review flags. The report does not authorise real Phase 1D work, does not start Phase 1D, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not implement any slice, does not run commands, does not change runtime behaviour, does not call Supabase, does not write to local storage, and does not replace legal review. 
One new test-side support module ships alongside the report: a deterministic helper that loads the report and every referenced Phase 2K to Phase 2U artefact, computes deterministic SHA-256 hashes via crypto.createHash, validates readiness row integrity, validates evidence summary consistency, validates legal and governance notes, validates the canonical launch-readiness decision is pending review, exposes eight documented mutated synthetic fixtures (missing source content file, stale source SHA-256, stale product version, launch readiness set to yes while one row is not passed, unresolved risks not acknowledged, legal or governance review state missing, dry-run marker missing, not-real-authorisation marker missing), and renders a calm reviewer-facing Markdown launch-readiness receipt. The helper never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses Date.now or new Date or Math.random or crypto random or process.env or local time. Eight new unit tests pin the report end-to-end. The shape test asserts the documented top-level shape, every per-row field-shape rule, every documented enum, and that no required field is missing. The row coverage test asserts every required readiness row is present exactly once, readiness rows appear in Phase 2K to Phase 2U order, every documented row is recognised, every reviewer state defaults to pending review, no row asserts real Phase 1D authorisation, and no duplicate or extra row exists. The evidence integrity test asserts every recorded source SHA-256 matches the on-disk source content body, every referenced source content product version matches package.json where applicable, the validator returns ok for the canonical report, and documented synthetic mutations each fail the validator with a deterministic reason. The summary consistency test asserts the evidence summary agrees with the live upstream Phase 2P to Phase 2U artefacts byte-for-byte. 
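The row coverage rule above (each required readiness row present exactly once, in Phase 2K to Phase 2U order, with no extra rows) can be sketched as a function that returns a list of problems. The phase identifiers are the eleven named in this entry; the function name and the problem strings are hypothetical.

```typescript
// The eleven readiness-chain phases, in the documented order.
const REQUIRED_PHASES: readonly string[] = [
  "2K", "2L", "2M", "2N", "2O", "2P", "2Q", "2R", "2S", "2T", "2U",
];

// Returns an empty list when coverage and order are both correct.
function checkRowCoverage(rowPhases: string[]): string[] {
  const problems: string[] = [];

  // Every required phase must appear exactly once.
  for (const phase of REQUIRED_PHASES) {
    const count = rowPhases.filter((candidate) => candidate === phase).length;
    if (count === 0) problems.push(`missing:${phase}`);
    if (count > 1) problems.push(`duplicate:${phase}`);
  }

  // No row may belong to a phase outside the documented chain.
  for (const phase of rowPhases) {
    if (REQUIRED_PHASES.indexOf(phase) === -1) problems.push(`extra:${phase}`);
  }

  // Order is only checked once the set of rows is otherwise correct.
  if (problems.length === 0 && rowPhases.join(",") !== REQUIRED_PHASES.join(",")) {
    problems.push("out-of-order");
  }
  return problems;
}

console.log(checkRowCoverage(REQUIRED_PHASES.slice()));
console.log(checkRowCoverage(["2L", "2K", "2M", "2N", "2O", "2P", "2Q", "2R", "2S", "2T", "2U"]));
```

Reporting every problem instead of stopping at the first keeps the output the same regardless of which check happens to run first.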
The launch-readiness safety test asserts the report does not authorise / start Phase 1D, does not imply Phase 1D is in progress, does not mark any slice complete, the canonical launch-readiness decision is pending review, the synthetic yes fixture passes only when every required readiness row is passed and every unresolved risk is acknowledged and legal-or-governance state is acceptable and every documented marker is present, the synthetic mutations each fail with a deterministic reason, NullProvider remains default, API route count remains unchanged, Supabase remains unchanged, local storage remains unchanged. The Markdown rendering test asserts the rendered Markdown is non-empty, carries the documented title, the launch-readiness-verification-report marker, the not-real-authorisation marker, the latest approved archive SHA-256, the latest test totals, every readiness row name, the evidence summary section, the unresolved risks section, the legal-and-governance notes section, the pending-review decision label, never claims Phase 1D is authorised or has started, and is byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value through the Phase 2A central forbidden-wording registry. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, that readiness rows appear in Phase 2K to Phase 2U order, and no clock or random or environment or absolute-path or runtime-secret reference in the report content file. New founder-facing documents at the docs layer add a Phase 2V document and a short addendum to the Phase 1D entry decision-record document explaining where to find the launch-readiness verification report. 
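The Markdown rendering assertions above (non-empty output, documented title, every row name, no claim that Phase 1D is authorised or has started, identical bytes across two renders) can be sketched with a pure renderer. Everything here is illustrative: the `Report` shape, the title text, the placeholder hash, and the forbidden phrases are assumptions, not the shipped receipt format.

```typescript
// Hypothetical, reduced report shape for the example.
type ReportRow = { name: string; reviewerState: "passed" | "failed" | "pending review" };
type Report = {
  title: string;
  archiveSha256: string;
  decision: "pending review";
  rows: ReportRow[];
};

// Pure function of its input: no clock, no randomness, no environment access.
function renderReceipt(report: Report): string {
  const lines: string[] = [
    `# ${report.title}`,
    "",
    `Latest approved archive SHA-256: ${report.archiveSha256}`,
    `Launch-readiness decision: ${report.decision}`,
    "",
  ];
  for (const row of report.rows) {
    lines.push(`- ${row.name}: ${row.reviewerState}`);
  }
  return lines.join("\n") + "\n";
}

// Assumed phrases the receipt must never contain.
const FORBIDDEN_CLAIMS = ["Phase 1D is authorised", "Phase 1D has started"];

function findForbiddenClaims(markdown: string): string[] {
  return FORBIDDEN_CLAIMS.filter((claim) => markdown.indexOf(claim) !== -1);
}

// Placeholder values; the hash is not a real archive hash.
const exampleReport: Report = {
  title: "Phase 1D launch-readiness verification report",
  archiveSha256: "placeholder-sha256",
  decision: "pending review",
  rows: [
    { name: "Phase 2K readiness freeze checklist", reviewerState: "pending review" },
    { name: "Phase 2U final pre-authorisation checklist", reviewerState: "pending review" },
  ],
};

const firstRender = renderReceipt(exampleReport);
const secondRender = renderReceipt(exampleReport);

console.log(firstRender === secondRender);
console.log(findForbiddenClaims(firstRender).length);
```

Because the renderer reads nothing except its argument, comparing two consecutive renders is enough to catch an accidental timestamp or random identifier.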
The Phase 2I audit bundle manifest, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, the Phase 2R sequence manifest, the Phase 2S no-regression acceptance pack, the Phase 2T audit handoff pack (with refreshed per-pointer source-hash fields), the Phase 2U final pre-authorisation checklist (with refreshed summary-block snapshot hashes and Phase 2T row hash), and the new Phase 2V launch-readiness verification report product versions are bumped to 0.66.0 to stay in sync with the package version. The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and 
future-schema-addition checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2N gate report builder and renderer, the Phase 2O entry helper and Markdown and CSV renderers, the Phase 2P index helper and Markdown renderer, the Phase 2Q scope boundary manifest helper and Markdown renderer, the Phase 2R sequence manifest helper and Markdown renderer, the Phase 2S no-regression acceptance pack helper and Markdown renderer, the Phase 2T audit handoff pack helper and Markdown renderer, the Phase 2U final pre-authorisation checklist helper and Markdown renderer, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
User impact: Founders see no on-screen change. The hardening adds the reviewer-facing launch-readiness verification report that ties the Phase 2K to Phase 2U readiness chain into one calm reviewer-facing Markdown receipt suitable as an external-review evidence document. The canonical report remains pending review; it never asserts a real launch authorisation. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2V is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2V authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2V internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-01
- Reference
- Phase 2V authorisation
- Impact assessment
- Adds a Phase 1D launch-readiness verification report at the static content layer, one test-side support module (a deterministic helper that loads the report and every referenced Phase 2K to Phase 2U artefact, computes deterministic SHA-256 hashes, validates readiness row integrity and evidence summary consistency and the legal and governance notes, exposes eight documented mutated synthetic fixtures, and renders a calm reviewer-facing Markdown launch-readiness receipt), and eight new unit tests pinning shape and row coverage and evidence integrity and summary consistency and launch-readiness safety and Markdown rendering and cleanliness and determinism, and two founder-facing prose documents at the docs layer. No product features. No runtime behaviour change. The report does not authorise any real Phase 1D work and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. 
The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation / optional-field checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, the Phase 2R sequence manifest, the Phase 2S no-regression acceptance pack, the Phase 2T audit handoff pack (with refreshed per-pointer source-hash fields), the Phase 2U final pre-authorisation checklist (with refreshed summary-block snapshot hashes and Phase 2T row hash), every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2W
Change date: 2026-05-01
Product version: 0.67.0
Methodology engine version: 0.9.1
Phase 1D frozen baseline manifest
Reason: Phase 2V closed the reviewer-facing launch-readiness loop. Phase 2W closes the auditable pre-Phase-1D baseline loop with one calm deterministic manifest that records the exact pre-Phase-1D baseline evidence: package version, approved archive evidence, file count, API route count, content artefact hashes, package lock hash, dependency list snapshot, test totals, build status, and all no-secret exclusions. The canonical manifest remains pending review and never asserts the baseline as frozen by default; it is the auditable single document that pins the calm pre-Phase-1D state.
What changed: Phase 2W is a docs-side and test-side hardening phase. No product features were added. A new deterministic Phase 1D frozen baseline manifest at the static content layer records the exact pre-Phase-1D baseline evidence. The manifest's top-level shape adds a baseline version, a product version that matches the package version after the Phase 2W bump, a generated-from-phase identifier of two-W, a baseline type that explicitly marks the manifest as a Phase 1D frozen baseline manifest, a dry-run-baseline-only baseline status, a calm description, a calm reviewer-facing baseline purpose, a launch-readiness-report reference pointing at the Phase 2V report, a final-pre-authorisation-checklist reference pointing at the Phase 2U checklist, an audit-handoff-pack reference pointing at the Phase 2T pack, a latest-approved-archive block carrying the approved Phase 2V archive name and archive hash and archive size and file count and test totals as fixed example evidence, a repository snapshot block, a package snapshot block, a dependency snapshot block, an API route snapshot block, a content-artefact hashes block (one entry per Phase 2I to Phase 2W content artefact), a build-and-test snapshot block, a no-secret exclusion snapshot block, a frozen-baseline-state field defaulting to pending review (the canonical file never asserts frozen yes), a legal-and-governance notes block, and a safety markers block. The repository snapshot records package version, product version, API route count, content artefact count, test file count, total test count, typecheck and lint and build and test status, and archive evidence. The package snapshot records package.json SHA-256, package lock SHA-256, package name and version, runtime and dev dependency counts, the scripts snapshot sorted by name, and package manager evidence. The dependency snapshot records sorted runtime and dev dependency lists with name and version and the documented dependency change policy. 
The API route snapshot records the eight current API routes and the documented API-route change policy. The content-artefact hashes block records SHA-256 for every tracked content artefact with a documented self-entry strategy for the manifest itself. The build-and-test snapshot records green status for typecheck, lint, build, and test, with deterministic test totals (13058 / 13058 / 0). The no-secret exclusion snapshot records every documented exclusion as true. The safety markers block pins twenty documented dry-run / not-Phase-1D / not-marked-frozen-by-default / NullProvider / no-service-role / no-real-customer / canonical-pending-review flags. The manifest does not authorise real Phase 1D work, does not start Phase 1D, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not mark the baseline as frozen by default, does not implement any slice, does not run commands, does not change runtime behaviour, does not call Supabase, does not write to local storage, and does not replace legal review. One new test-side support module ships alongside the manifest: a deterministic helper that loads the manifest and every referenced content artefact, computes deterministic SHA-256 hashes via crypto.createHash, computes deterministic package.json and pnpm-lock.yaml hashes, computes deterministic dependency lists from package.json, computes deterministic API route inventory from the repository, validates baseline evidence integrity, validates no-secret exclusion evidence, validates the canonical frozen baseline state is pending review, exposes eight documented mutated synthetic fixtures (stale package hash, stale lockfile hash, stale content artefact hash, stale product version, API route count changed, test totals do not reconcile, archive SHA-256 missing, no-secret evidence incomplete), and renders a calm reviewer-facing Markdown frozen-baseline receipt.
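The hash check described above can be sketched as follows. This is a minimal illustration, not the AgentProof helper itself: the names `ArtefactHashEntry`, `sha256Hex`, and `validateArtefactHashes`, the example path, and the reason strings are all assumptions made for this sketch. Only `createHash` from Node's `crypto` module is a real API.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape: one recorded hash per tracked content artefact.
interface ArtefactHashEntry {
  path: string;
  sha256: string;
}

type ValidationResult = { ok: true } | { ok: false; reason: string };

// Hex-encoded SHA-256 of a file body. The same bytes always give the same digest.
function sha256Hex(body: string | Uint8Array): string {
  return createHash("sha256").update(body).digest("hex");
}

// Compare every recorded hash with the body supplied by the caller.
// Bodies are passed in (not read from disk) so the sketch stays pure.
function validateArtefactHashes(
  entries: readonly ArtefactHashEntry[],
  bodies: ReadonlyMap<string, string>,
): ValidationResult {
  for (const entry of entries) {
    const body = bodies.get(entry.path);
    if (body === undefined) {
      return { ok: false, reason: `missing source content file: ${entry.path}` };
    }
    if (sha256Hex(body) !== entry.sha256) {
      return { ok: false, reason: `stale content artefact hash: ${entry.path}` };
    }
  }
  return { ok: true };
}

// Usage with a hypothetical artefact path and body.
const exampleBodies = new Map([["content/example-manifest.json", "{}\n"]]);
const canonicalEntries = [
  { path: "content/example-manifest.json", sha256: sha256Hex("{}\n") },
];
// A mutated fixture: the recorded hash no longer matches the body.
const staleEntries = [
  { path: "content/example-manifest.json", sha256: "0".repeat(64) },
];
console.log(validateArtefactHashes(canonicalEntries, exampleBodies));
console.log(validateArtefactHashes(staleEntries, exampleBodies));
```

Returning the first failing reason as a fixed string is one way to make each mutated fixture fail with a deterministic reason; a real validator might collect every failure instead.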
The helper never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses Date.now or new Date or Math.random or crypto random or process.env or local time. Eight new unit tests pin the manifest end-to-end. The shape test asserts the documented top-level shape, every per-block field-shape rule, every documented enum, and that no required field is missing. The evidence integrity test asserts every recorded source SHA-256 matches the on-disk source content body, every package and lockfile hash matches the on-disk file body, every referenced source content product version matches package.json where applicable, the API route count is eight, the dependency counts match package.json, the validator returns ok for the canonical manifest, and documented synthetic mutations each fail the validator with a deterministic reason. The package and dependency snapshot test asserts dependency names are sorted, runtime and dev dependency counts match package.json, no payment or analytics or monitoring or live provider dependency is added, the dependency change policy references future boundary review, the scripts snapshot is sorted, and the package lock is present. The API and build and no-secret snapshot test asserts the API route count remains eight, every listed API route exists, no extra route exists, every build and test status flag is green, total tests are 13058 with zero failed, scorecard golden snapshots are unchanged, approved sample outcomes are unchanged, NullProvider remains default, no live LLM call, no Supabase change, no API route change, every no-secret exclusion is true, and the env-example is present. 
The frozen-baseline safety test asserts the manifest does not authorise / start Phase 1D, does not mark any slice complete, does not mark the baseline frozen by default, the canonical frozen baseline state is pending review, the synthetic frozen fixture passes only when every evidence category is complete and every hash matches and test totals are green and no-secret evidence is complete and every safety marker is present, and the documented synthetic mutations each fail with a deterministic reason. The Markdown rendering test asserts the rendered Markdown is non-empty, carries the documented title, the frozen-baseline-manifest marker, the pending-review marker, the not-real-authorisation marker, the latest archive SHA-256, the package version, the API route count, the test totals, the content artefact hash summary, the package lock hash, the dependency snapshot summary, the no-secret exclusion summary, the legal-and-governance notes, and is byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value through the Phase 2A central forbidden-wording registry. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, that content artefact hashes are sorted in the documented stable order, dependency lists are sorted by name, API routes are sorted in the documented stable order, and no clock or random or environment or absolute-path or runtime-secret reference in the manifest content file. New founder-facing documents at the docs layer add a Phase 2W document and a short addendum to the Phase 1D entry decision-record document explaining where to find the frozen baseline manifest. 
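The sorting and byte-stability properties that the determinism test pins can be illustrated with a short sketch. The function names `dependencySnapshot`, `sortKeys`, and `stableStringify`, the two-space indent, and the example package names are assumptions for illustration; the log does not state how the real helper serialises the manifest.

```typescript
// Hypothetical snapshot entry: one dependency name with its version string.
interface DependencyEntry {
  name: string;
  version: string;
}

// Build a dependency list sorted by name using plain code-unit comparison,
// which does not depend on the machine's locale.
function dependencySnapshot(deps: Record<string, string>): DependencyEntry[] {
  return Object.entries(deps)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([name, version]) => ({ name, version }));
}

// Recursively copy a JSON-like value with object keys in sorted order.
function sortKeys(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(sortKeys);
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(record).sort()) {
      out[key] = sortKeys(record[key]);
    }
    return out;
  }
  return value;
}

// Serialise with sorted keys, fixed indentation, and a trailing newline,
// so two equal values always produce identical bytes.
function stableStringify(value: unknown): string {
  return JSON.stringify(sortKeys(value), null, 2) + "\n";
}

// Usage: key order in the input does not affect the output.
const first = stableStringify({ b: 1, a: 2 });
const second = stableStringify({ a: 2, b: 1 });
console.log(first === second);
console.log(dependencySnapshot({ "beta-lib": "2.0.0", "alpha-lib": "1.0.0" }));
```

No clock, random source, or environment variable is read anywhere in the sketch, which is the same constraint the log places on the manifest content file and its helper.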
The Phase 2I audit bundle manifest, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, the Phase 2R sequence manifest, the Phase 2S no-regression acceptance pack, the Phase 2T audit handoff pack (with refreshed per-pointer source-hash fields), the Phase 2U final pre-authorisation checklist (with refreshed summary-block snapshot hashes), the Phase 2V launch-readiness verification report (with refreshed evidence summary upstream snapshot hashes), and the new Phase 2W frozen baseline manifest product versions are bumped to 0.67.0 to stay in sync with the package version. The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, 
the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2N gate report builder and renderer, the Phase 2O entry helper and Markdown and CSV renderers, the Phase 2P index helper and Markdown renderer, the Phase 2Q scope boundary manifest helper and Markdown renderer, the Phase 2R sequence manifest helper and Markdown renderer, the Phase 2S no-regression acceptance pack helper and Markdown renderer, the Phase 2T audit handoff pack helper and Markdown renderer, the Phase 2U final pre-authorisation checklist helper and Markdown renderer, the Phase 2V launch-readiness verification report helper and Markdown renderer, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
User impact: Founders see no on-screen change. The hardening adds the auditable pre-Phase-1D baseline manifest that pins the calm pre-Phase-1D state. The canonical manifest remains pending review; it never marks the baseline as frozen by default. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2W is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2W authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2W internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-01
- Reference
- Phase 2W authorisation
- Impact assessment
- Adds a Phase 1D frozen baseline manifest at the static content layer, one test-side support module (a deterministic helper that loads the manifest and every referenced content artefact, computes deterministic SHA-256 hashes for content files and package.json and pnpm-lock.yaml, computes deterministic dependency lists from package.json, computes deterministic API route inventory from the repository, validates baseline evidence integrity and no-secret exclusion evidence and the canonical pending-review state, exposes eight documented mutated synthetic fixtures, and renders a calm reviewer-facing Markdown frozen-baseline receipt), and eight new unit tests pinning shape and evidence integrity and package and dependency snapshot and API and build and no-secret snapshot and frozen-baseline safety and Markdown rendering and cleanliness and determinism, and two founder-facing prose documents at the docs layer. No product features. No runtime behaviour change. The manifest does not authorise any real Phase 1D work and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. 
The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation / optional-field checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, the Phase 2R sequence manifest, the Phase 2S no-regression acceptance pack, the Phase 2T audit handoff pack (with refreshed per-pointer source-hash fields), the Phase 2U final pre-authorisation checklist (with refreshed summary-block snapshot hashes), the Phase 2V launch-readiness verification report (with refreshed evidence summary upstream snapshot hashes), every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2X
Change date: 2026-05-01
Product version: 0.68.0
Methodology engine version: 0.9.1
Phase 1D authorisation rehearsal record
Reason: Phase 2W closed the auditable pre-Phase-1D baseline loop. Phase 2X closes the final-decision rehearsal loop with one calm deterministic record that simulates the final authorisation decision flow without authorising real Phase 1D. The record consumes the Phase 2W frozen baseline manifest, the Phase 2V launch-readiness verification report, the Phase 2U final pre-authorisation checklist, and the Phase 2T audit handoff pack, and proves a simulated yes decision cannot be emitted unless every required evidence item is complete, every recorded SHA-256 matches the on-disk body, every unresolved risk is acknowledged, the legal-and-governance state is complete, every safety marker is present, the founder decision is explicit, and the full prior readiness chain is consistent. The canonical record remains pending review and never asserts yes.
What changed: Phase 2X is a docs-side and test-side hardening phase. No product features were added. A new deterministic Phase 1D authorisation rehearsal record at the static content layer simulates the final authorisation decision flow without authorising real Phase 1D. The record's top-level shape adds a record version, a product version that matches the package version after the Phase 2X bump, a generated-from-phase identifier of two-X, a record type that explicitly marks the record as a Phase 1D authorisation rehearsal record, a dry-run-rehearsal-only rehearsal status, a calm description, a calm reviewer-facing rehearsal purpose, a frozen-baseline reference pointing at the Phase 2W manifest, a launch-readiness-report reference pointing at the Phase 2V report, a final-pre-authorisation-checklist reference pointing at the Phase 2U checklist, an audit-handoff-pack reference pointing at the Phase 2T pack, a latest-approved-archive block carrying the verified Phase 2W archive name and archive hash and archive size and file count and test totals as fixed example evidence, a rehearsal-steps array of twelve ordered steps in the documented stable order, an evidence-summary block that aggregates the upstream Phase 2T to Phase 2W artefact references and the latest archive evidence, a simulated-decision-walk block that records the canonical decision and the yes-allowed-only-if list and the blocking conditions and the deterministic failure reasons and the synthetic-yes fixture requirements, a rehearsal-decision-state field defaulting to pending review (the canonical file never asserts yes), a legal-and-governance notes block, and a safety markers block. 
Each rehearsal step carries a stable identifier, a founder-facing step name, a step order starting at one with no gaps, the source phase, the source content file, the documented required-evidence list, a calm pass condition, a calm failure meaning, a canonical state defaulting to pending review, a blocks-yes-if-incomplete flag set to true, and calm notes. Required rehearsal steps cover frozen baseline review, launch-readiness review, final pre-authorisation checklist review, audit handoff review, archive evidence review, test totals review, content hash consistency review, unresolved risk acknowledgement review, legal and governance state review, safety marker review, founder explicit decision review, and final rehearsal decision. The simulated decision walk lists the conditions under which yes is allowed and the blocking conditions that prevent yes when any condition is missing or stale; the canonical record never reaches yes. The safety markers block pins nineteen documented dry-run / not-Phase-1D / NullProvider / no-service-role / no-real-customer / canonical-pending-review flags. The record does not authorise real Phase 1D work, does not start Phase 1D, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not implement any slice, does not run commands, does not change runtime behaviour, does not call Supabase, does not write to local storage, and does not replace legal review. 
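The per-step rules above (order starting at one with no gaps, canonical state pending review, blocks-yes-if-incomplete set to true) can be sketched as a small integrity check. The field names, the `StepState` values, and the reason strings are assumptions for this sketch; the real record's field names are not shown in this log.

```typescript
type StepState = "pending_review" | "passed" | "failed";

// Hypothetical subset of the per-step fields described in the log.
interface RehearsalStep {
  id: string;
  order: number;
  state: StepState;
  blocksYesIfIncomplete: boolean;
}

// Return every integrity problem found, in a fixed order, as plain strings.
function validateStepIntegrity(
  steps: readonly RehearsalStep[],
  expectedCount: number,
): string[] {
  const reasons: string[] = [];
  if (steps.length !== expectedCount) {
    reasons.push(`expected ${expectedCount} steps, found ${steps.length}`);
  }
  const seen = new Set<string>();
  steps.forEach((step, index) => {
    // Orders must run 1, 2, 3, ... with no gaps.
    if (step.order !== index + 1) {
      reasons.push(`step ${step.id}: order ${step.order} should be ${index + 1}`);
    }
    if (seen.has(step.id)) reasons.push(`step ${step.id}: duplicate identifier`);
    seen.add(step.id);
    if (step.state !== "pending_review") {
      reasons.push(`step ${step.id}: canonical state must be pending review`);
    }
    if (!step.blocksYesIfIncomplete) {
      reasons.push(`step ${step.id}: must block yes if incomplete`);
    }
  });
  return reasons;
}

// Build a canonical-looking list of steps with placeholder identifiers.
function makeSteps(count: number): RehearsalStep[] {
  return Array.from({ length: count }, (_, i) => ({
    id: `step-${i + 1}`,
    order: i + 1,
    state: "pending_review" as const,
    blocksYesIfIncomplete: true,
  }));
}

// Usage: twelve ordered steps validate cleanly; dropping one leaves a gap.
console.log(validateStepIntegrity(makeSteps(12), 12));
console.log(validateStepIntegrity(makeSteps(12).filter((s) => s.order !== 5), 12));
```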
One new test-side support module ships alongside the record: a deterministic helper that loads the rehearsal record and every referenced Phase 2T to Phase 2W artefact, computes deterministic SHA-256 hashes via crypto.createHash, validates rehearsal step integrity, validates evidence summary consistency, validates the canonical rehearsal decision is pending review, validates the simulated decision walk preconditions, exposes documented mutated synthetic fixtures (missing source content file, stale source SHA-256, stale product version, rehearsal step missing, simulated yes while a step is not passed, unresolved risks not acknowledged, legal or governance state missing, dry-run marker missing, not-real-authorisation marker missing, founder explicit decision missing), and renders a calm reviewer-facing Markdown authorisation-rehearsal receipt. The helper never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses Date.now or new Date or Math.random or crypto random or process.env or local time. Seven new unit tests pin the record end-to-end. The shape test asserts the documented top-level shape and every per-step field-shape rule. The evidence integrity test asserts every recorded source SHA-256 matches the on-disk source content body, every referenced source content product version matches package.json, the validator returns ok for the canonical record, and documented synthetic mutations each fail with a deterministic reason. The decision walk test asserts the simulated yes can pass only when every condition is satisfied and every blocking condition rejects yes deterministically. The rehearsal safety test asserts the record does not authorise / start Phase 1D, the canonical rehearsal decision is pending review, every safety marker is true, and documented synthetic mutations each fail with a deterministic reason.
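The decision walk property (a simulated yes passes only when every condition holds, and each missing condition rejects it with a fixed reason) can be sketched as below. The `DecisionInputs` field names, the decision labels, and the reason strings are hypothetical and chosen only to mirror the conditions listed in this entry.

```typescript
// Hypothetical condition flags, one per yes-allowed-only-if condition.
interface DecisionInputs {
  everyStepPassed: boolean;
  everyHashMatches: boolean;
  unresolvedRisksAcknowledged: boolean;
  legalAndGovernanceComplete: boolean;
  everySafetyMarkerPresent: boolean;
  founderDecisionExplicit: boolean;
}

type Decision = "pending_review" | "simulated_yes";

// Fixed-order table of blocking conditions and their deterministic reasons.
const BLOCKERS: ReadonlyArray<readonly [keyof DecisionInputs, string]> = [
  ["everyStepPassed", "a rehearsal step is not passed"],
  ["everyHashMatches", "a recorded SHA-256 is stale"],
  ["unresolvedRisksAcknowledged", "unresolved risks are not acknowledged"],
  ["legalAndGovernanceComplete", "legal or governance state is missing"],
  ["everySafetyMarkerPresent", "a safety marker is missing"],
  ["founderDecisionExplicit", "founder explicit decision is missing"],
];

// Walk the table: any unmet condition blocks the simulated yes.
function walkDecision(inputs: DecisionInputs): {
  decision: Decision;
  blockingReasons: string[];
} {
  const blockingReasons = BLOCKERS
    .filter(([key]) => !inputs[key])
    .map(([, reason]) => reason);
  return {
    decision: blockingReasons.length === 0 ? "simulated_yes" : "pending_review",
    blockingReasons,
  };
}

// Usage: a synthetic fixture with every condition met, then one mutation.
const syntheticYes: DecisionInputs = {
  everyStepPassed: true,
  everyHashMatches: true,
  unresolvedRisksAcknowledged: true,
  legalAndGovernanceComplete: true,
  everySafetyMarkerPresent: true,
  founderDecisionExplicit: true,
};
console.log(walkDecision(syntheticYes).decision);
console.log(walkDecision({ ...syntheticYes, founderDecisionExplicit: false }));
```

Because the blockers live in one ordered table, the reasons come back in the same order on every run, which keeps a rendered receipt byte-stable.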
The Markdown rendering test asserts the rendered Markdown is non-empty, carries the documented title, the authorisation-rehearsal-record marker, the not-real-authorisation marker, the latest archive SHA-256, the latest test totals, the Phase 2T to Phase 2W evidence references, the rehearsal steps section, the simulated decision walk section, the unresolved risks section, the legal-and-governance notes section, the pending-review decision label, never claims Phase 1D is authorised or has started, and is byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value through the Phase 2A central forbidden-wording registry. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, that rehearsal steps appear in the documented stable order, and no clock or random or environment or absolute-path or runtime-secret reference in the record content file. New founder-facing documents at the docs layer add a Phase 2X document and a short addendum to the Phase 1D entry decision-record document explaining where to find the authorisation rehearsal record. 
The Phase 2I audit bundle manifest, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, the Phase 2R sequence manifest, the Phase 2S no-regression acceptance pack, the Phase 2T audit handoff pack (with refreshed per-pointer source-hash fields and refreshed latest-approved-archive to Phase 2W values), the Phase 2U final pre-authorisation checklist (with refreshed summary-block snapshot hashes and refreshed latest-approved-archive to Phase 2W values), the Phase 2V launch-readiness verification report (with refreshed evidence summary upstream snapshot hashes and refreshed latest-approved-archive to Phase 2W values), the Phase 2W frozen baseline manifest (with refreshed content artefact hashes and refreshed latest-approved-archive to Phase 2W values), and the new Phase 2X authorisation rehearsal record product versions are bumped to 0.68.0 to stay in sync with the package version. 
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2N gate report builder and renderer, the Phase 2O entry helper and Markdown and CSV renderers, the Phase 2P index helper and Markdown renderer, the Phase 2Q scope boundary manifest helper and Markdown renderer, the Phase 2R sequence manifest helper and Markdown renderer, the Phase 2S no-regression acceptance pack helper and Markdown renderer, the Phase 2T audit handoff pack helper and Markdown renderer, the Phase 2U final pre-authorisation checklist helper and Markdown renderer, the Phase 2V launch-readiness verification report helper and Markdown renderer, the Phase 2W frozen baseline manifest helper and Markdown 
renderer, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
User impact: Founders see no on-screen change. The hardening adds the auditable authorisation-rehearsal record that simulates the final authorisation decision flow without authorising real Phase 1D. The canonical record remains pending review; it never asserts a real authorisation. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2X is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2X authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2X internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-01
- Reference
- Phase 2X authorisation
- Impact assessment
- Adds a Phase 1D authorisation rehearsal record at the static content layer, one test-side support module (a deterministic helper that loads the record and every referenced Phase 2T to Phase 2W artefact, computes deterministic SHA-256 hashes, validates rehearsal step integrity and evidence summary consistency and the canonical pending-review state, validates the simulated decision walk preconditions, exposes documented mutated synthetic fixtures, and renders a calm reviewer-facing Markdown authorisation-rehearsal receipt), and seven new unit tests pinning shape and evidence integrity and decision walk and rehearsal safety and Markdown rendering and cleanliness and determinism, and two founder-facing prose documents at the docs layer. No product features. No runtime behaviour change. The record does not authorise any real Phase 1D work and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. 
The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation / optional-field checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, the Phase 2R sequence manifest, the Phase 2S no-regression acceptance pack, the Phase 2T audit handoff pack (with refreshed per-pointer source-hash fields), the Phase 2U final pre-authorisation checklist (with refreshed summary-block snapshot hashes), the Phase 2V launch-readiness verification report (with refreshed evidence summary upstream snapshot hashes), the Phase 2W frozen baseline manifest (with refreshed content artefact hashes), every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2Y
Change date: 2026-05-01
Product version: 0.69.0
Methodology engine version: 0.9.1
Phase 1D real-authorisation packet skeleton
Reason: Phase 2X closed the final-decision rehearsal loop. Phase 2Y closes the real-authorisation packet skeleton loop with one calm deterministic skeleton that defines what the founder must manually fill in before real Phase 1D may begin. The skeleton consumes the Phase 2W frozen baseline manifest, the Phase 2V launch-readiness verification report, the Phase 2U final pre-authorisation checklist, the Phase 2T audit handoff pack, and the Phase 2X authorisation rehearsal record, and keeps every real founder field as blank, pending, or explicitly unfilled by default. The canonical packet remains pending review and never emits complete real authorisation.
What changed: Phase 2Y is a docs-side and test-side hardening phase. No product features were added. A new deterministic Phase 1D real-authorisation packet skeleton at the static content layer defines what the founder must manually fill in before real Phase 1D may begin. The skeleton's top-level shape adds a packet version, a product version that matches the package version after the Phase 2Y bump, a generated-from-phase identifier of two-Y, a packet type that explicitly marks the file as a Phase 1D real-authorisation packet skeleton, a dry-run pending-review packet status, a calm description, a calm reviewer-facing packet purpose, a latest-approved-archive block carrying the verified Phase 2X archive name and archive hash and archive size and file count and test totals as fixed example evidence, an evidence-sources block referencing Phase 2W and Phase 2V and Phase 2U and Phase 2T and Phase 2X, a founder fill-in fields array of nineteen ordered manual fields the founder must fill in, an authorisation sections array of ten ordered sections each blocking authorisation if incomplete, a manual completion requirements block, an incomplete packet blockers array of twenty documented deterministic blockers, a synthetic completion rules block proving complete real authorisation cannot be reached without complete evidence, a legal and governance notes block, and a safety markers block. Every founder fill-in field defaults to incomplete with empty or null or not provided or pending review default values; no field defaults to yes or approved or authorised or signed. Every authorisation section defaults to incomplete and blocks authorisation if incomplete. The final go or no go section never defaults to yes; the signature or equivalent manual mark section never defaults to signed. Every blocker is documented and prevents real authorisation. 
The synthetic completion rules prove the canonical packet cannot reach complete real authorisation and that completion is only possible in a synthetic fixture where every founder field is manually filled, every evidence source hash matches, every authorisation section is complete, every blocker is cleared, tests are green, no runtime drift is present, and safety and legal markers remain present. The safety markers block pins twenty documented dry-run / not-Phase-1D / does-not-emit-complete-real-authorisation / NullProvider / no-service-role / no-real-customer / canonical-pending-review flags. The skeleton does not authorise real Phase 1D work, does not start Phase 1D, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not emit complete real authorisation, does not implement any slice, does not run commands, does not change runtime behaviour, does not call Supabase, does not write to local storage, and does not replace legal review. One new test-side support module ships alongside the skeleton: a deterministic helper that loads the packet skeleton and every referenced Phase 2T to Phase 2X artefact, computes deterministic SHA-256 hashes via crypto.createHash, validates skeleton evidence integrity, validates founder fill-in field completeness rules, validates authorisation section completeness rules, validates incomplete blocker coverage, validates synthetic completion rules, validates the canonical packet status is pending review, exposes documented mutated synthetic fixtures proving complete real authorisation cannot be reached without complete evidence, and renders a calm founder-facing Markdown packet skeleton. The helper never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses Date.now or new Date or Math.random or crypto random or process.env or local time. Eight new unit tests pin the skeleton end-to-end. 
The shape test asserts the documented top-level shape and every per-field-shape rule. The evidence integrity test asserts every recorded source SHA-256 matches the on-disk source content body and documented synthetic mutations each fail with a deterministic reason. The founder fields test asserts every manual field is present, required, incomplete by default, and cannot default to yes or approved or authorised or signed. The authorisation sections test asserts every section blocks real authorisation if incomplete. The packet safety test asserts the canonical packet status is pending review and complete real authorisation is impossible without complete evidence. The Markdown rendering test asserts the rendered Markdown carries the documented title, the real-authorisation-packet-skeleton marker, the not-real-authorisation marker, the latest archive SHA-256, the latest test totals, the founder fill-in fields, the blockers, the legal and governance notes, the pending-review status, and is byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value through the Phase 2A central forbidden-wording registry. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, that founder fill-in fields appear in the documented stable order, and no clock or random or environment or absolute-path or runtime-secret reference in the skeleton content file. New founder-facing documents at the docs layer add a Phase 2Y document and a short addendum to the Phase 1D entry decision-record document explaining where to find the real-authorisation packet skeleton. 
The Phase 2I audit bundle manifest, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, the Phase 2R sequence manifest, the Phase 2S no-regression acceptance pack, the Phase 2T audit handoff pack (with refreshed per-pointer source-hash fields and refreshed latest-approved-archive to Phase 2X values), the Phase 2U final pre-authorisation checklist (with refreshed summary-block snapshot hashes and refreshed latest-approved-archive to Phase 2X values), the Phase 2V launch-readiness verification report (with refreshed evidence summary upstream snapshot hashes and refreshed latest-approved-archive to Phase 2X values), the Phase 2W frozen baseline manifest (with refreshed content artefact hashes and refreshed latest-approved-archive to Phase 2X values), the Phase 2X authorisation rehearsal record (with refreshed evidence summary upstream snapshot hashes and refreshed latest-approved-archive to Phase 2X values), and the new Phase 2Y real-authorisation packet skeleton product versions are bumped to 0.69.0 to stay in sync with the package version. 
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2N gate report builder and renderer, the Phase 2O entry helper and Markdown and CSV renderers, the Phase 2P index helper and Markdown renderer, the Phase 2Q scope boundary manifest helper and Markdown renderer, the Phase 2R sequence manifest helper and Markdown renderer, the Phase 2S no-regression acceptance pack helper and Markdown renderer, the Phase 2T audit handoff pack helper and Markdown renderer, the Phase 2U final pre-authorisation checklist helper and Markdown renderer, the Phase 2V launch-readiness verification report helper and Markdown renderer, the Phase 2W frozen baseline manifest helper and Markdown 
renderer, the Phase 2X authorisation rehearsal record helper and Markdown renderer, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
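The evidence integrity check described above can be sketched as follows. This is a minimal illustration, not the shipped helper: the `EvidenceSource` shape, the `checkEvidence` function, and the reason string are assumptions made for the example, and only Node's `createHash` is a real API.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape for one evidence source; the real skeleton's
// field names are not given in this Change Log.
interface EvidenceSource {
  phase: string;
  recordedSha256: string;
}

type IntegrityResult = { ok: true } | { ok: false; reason: string };

// Same input bytes always give the same hex digest, so the check is deterministic.
function sha256Hex(body: string): string {
  return createHash("sha256").update(body, "utf8").digest("hex");
}

// Compare the recorded hash with a fresh hash of the content body.
// A mismatch returns a fixed, deterministic reason string (assumed wording).
function checkEvidence(source: EvidenceSource, body: string): IntegrityResult {
  if (sha256Hex(body) === source.recordedSha256) return { ok: true };
  return { ok: false, reason: `hash mismatch for ${source.phase}` };
}

// Illustrative content body standing in for an on-disk artefact.
const body = '{"phase":"2X","status":"pending review"}\n';
const source: EvidenceSource = { phase: "2X", recordedSha256: sha256Hex(body) };

const intact = checkEvidence(source, body);        // recorded hash matches
const mutated = checkEvidence(source, body + " "); // one extra byte breaks the match
```

A single added byte changes the digest, which is why a mutated synthetic fixture fails while the untouched artefact passes.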
User impact: Founders see no on-screen change. The hardening adds the auditable real-authorisation packet skeleton that defines what the founder must manually fill in before real Phase 1D may begin. The canonical packet remains pending review; it never emits complete real authorisation. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2Y is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2Y authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2Y internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-01
- Reference
- Phase 2Y authorisation
- Impact assessment
- Adds a Phase 1D real-authorisation packet skeleton at the static content layer, one test-side support module (a deterministic helper that loads the skeleton and every referenced Phase 2T to Phase 2X artefact, computes deterministic SHA-256 hashes, validates skeleton evidence integrity and founder fill-in field completeness rules and authorisation section completeness rules and incomplete blocker coverage and synthetic completion rules and the canonical pending-review state, exposes documented mutated synthetic fixtures, and renders a calm founder-facing Markdown packet skeleton), and eight new unit tests pinning shape and evidence integrity and founder fields and authorisation sections and packet safety and Markdown rendering and cleanliness and determinism, and two founder-facing prose documents at the docs layer. No product features. No runtime behaviour change. The skeleton does not authorise any real Phase 1D work and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. 
The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation / optional-field checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, the Phase 2R sequence manifest, the Phase 2S no-regression acceptance pack, the Phase 2T audit handoff pack (with refreshed per-pointer source-hash fields), the Phase 2U final pre-authorisation checklist (with refreshed summary-block snapshot hashes), the Phase 2V launch-readiness verification report (with refreshed evidence summary upstream snapshot hashes), the Phase 2W frozen baseline manifest (with refreshed content artefact hashes), the Phase 2X authorisation rehearsal record (with refreshed evidence summary upstream snapshot hashes), every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2Z
Change date: 2026-05-01
Product version: 0.70.0
Methodology engine version: 0.9.1
Phase 1D pre-branch final freeze certificate skeleton
Reason: Phase 2Y closed the real-authorisation packet skeleton loop. Phase 2Z closes the pre-branch final freeze certificate skeleton loop with one calm deterministic skeleton that defines what must be finally signed off before any real Phase 1D branch begins. The skeleton consumes the Phase 2Y real-authorisation packet skeleton, the Phase 2X authorisation rehearsal record, the Phase 2W frozen baseline manifest, the Phase 2V launch-readiness verification report, and the Phase 2U final pre-authorisation checklist, and keeps every final sign-off field as blank, pending, or explicitly unfilled by default. The canonical certificate remains pending review and never emits frozen certificate signed.
What changed: Phase 2Z is a docs-side and test-side hardening phase. No product features were added. A new deterministic Phase 1D pre-branch final freeze certificate skeleton at the static content layer defines what must be finally signed off before any real Phase 1D branch begins. The skeleton's top-level shape adds a certificate version, a product version that matches the package version after the Phase 2Z bump, a generated-from-phase identifier of two-Z, a certificate type that explicitly marks the file as a Phase 1D pre-branch final freeze certificate skeleton, a dry-run pending-review certificate status, a calm description, a calm reviewer-facing certificate purpose, a latest-approved-archive block carrying the verified Phase 2Y archive name and archive hash and archive size and file count and test totals as fixed example evidence, an evidence-references block referencing Phase 2Y and Phase 2X and Phase 2W and Phase 2V and Phase 2U, a final-freeze fields array of twenty-one ordered manual fields the freeze owner must fill in, a sign-off sections array of twelve ordered sections each blocking branch start if incomplete, a pre-branch blockers array of twenty-four documented deterministic blockers, a synthetic sign-off rules block proving frozen certificate signed cannot be reached without complete evidence, a legal and governance notes block, and a safety markers block. Every final freeze field defaults to incomplete with empty or null or not provided or pending review default values; no field defaults to yes or approved or authorised or signed or frozen or branch allowed. Every sign-off section defaults to incomplete and blocks branch start if incomplete. The branch identity section never defaults to a real branch name; the final go or no-go section never defaults to yes; the final signature or manual mark section never defaults to signed. Every blocker is documented and prevents branch start. 
The synthetic sign-off rules prove the canonical certificate cannot reach frozen certificate signed and that signed state is only possible in a synthetic fixture where every final freeze field is manually filled, every evidence reference hash matches, every sign-off section is complete, every blocker is cleared, tests are green, API route count is stable, no runtime drift is present, every safety marker is present, and legal and governance markers remain present. The safety markers block pins twenty-one documented dry-run / not-Phase-1D / does-not-permit-real-branch-by-default / does-not-emit-frozen-certificate-signed / NullProvider / no-service-role / no-real-customer / canonical-pending-review flags. The skeleton does not authorise real Phase 1D work, does not start Phase 1D, does not permit a real branch by default, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not emit frozen certificate signed, does not implement any slice, does not run commands, does not change runtime behaviour, does not call Supabase, does not write to local storage, and does not replace legal review. One new test-side support module ships alongside the skeleton: a deterministic helper that loads the certificate skeleton and every referenced Phase 2U to Phase 2Y artefact, computes deterministic SHA-256 hashes via crypto.createHash, validates skeleton evidence integrity, validates final freeze field completeness rules, validates sign-off section completeness rules, validates pre-branch blocker coverage, validates synthetic sign-off rules, validates the canonical certificate status is pending review, exposes documented mutated synthetic fixtures proving frozen certificate signed cannot be reached without complete evidence, and renders a calm founder-facing Markdown certificate skeleton. 
The helper never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses Date.now or new Date or Math.random or crypto random or process.env or local time. Eight new unit tests pin the skeleton end-to-end. The shape test asserts the documented top-level shape and every per-field-shape rule. The evidence integrity test asserts every recorded source SHA-256 matches the on-disk source content body and documented synthetic mutations each fail with a deterministic reason. The final fields test asserts every manual final freeze field is present, required, incomplete by default, and cannot default to yes or approved or authorised or signed or frozen or branch allowed. The sign-off sections test asserts every section blocks branch start if incomplete. The certificate safety test asserts the canonical certificate status is pending review and frozen certificate signed is impossible without complete evidence. The Markdown rendering test asserts the rendered Markdown carries the documented title, the pre-branch-final-freeze-certificate-skeleton marker, the not-real-authorisation marker, the pending-review status, the latest archive SHA-256, the latest test totals, the final freeze fields, the blockers, the legal and governance notes, and is byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value through the Phase 2A central forbidden-wording registry. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, that final freeze fields appear in the documented stable order, and no clock or random or environment or absolute-path or runtime-secret reference in the skeleton content file. 
New founder-facing documents at the docs layer add a Phase 2Z document and a short addendum to the Phase 1D entry decision-record document explaining where to find the pre-branch final freeze certificate skeleton. The Phase 2I audit bundle manifest, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, the Phase 2R sequence manifest, the Phase 2S no-regression acceptance pack, the Phase 2T audit handoff pack (with refreshed per-pointer source-hash fields and refreshed latest-approved-archive to Phase 2Y values), the Phase 2U final pre-authorisation checklist (with refreshed summary-block snapshot hashes and refreshed latest-approved-archive to Phase 2Y values), the Phase 2V launch-readiness verification report (with refreshed evidence summary upstream snapshot hashes and refreshed latest-approved-archive to Phase 2Y values), the Phase 2W frozen baseline manifest (with refreshed content artefact hashes and refreshed latest-approved-archive to Phase 2Y values), the Phase 2X authorisation rehearsal record (with refreshed evidence summary upstream snapshot hashes and refreshed latest-approved-archive to Phase 2Y values), the Phase 2Y real-authorisation packet skeleton (with refreshed evidence source hashes and refreshed latest-approved-archive to Phase 2Y values), and the new Phase 2Z pre-branch final freeze certificate skeleton product versions are bumped to 0.70.0 to stay in sync with the package version. 
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2N gate report builder and renderer, the Phase 2O entry helper and Markdown and CSV renderers, the Phase 2P index helper and Markdown renderer, the Phase 2Q scope boundary manifest helper and Markdown renderer, the Phase 2R sequence manifest helper and Markdown renderer, the Phase 2S no-regression acceptance pack helper and Markdown renderer, the Phase 2T audit handoff pack helper and Markdown renderer, the Phase 2U final pre-authorisation checklist helper and Markdown renderer, the Phase 2V launch-readiness verification report helper and Markdown renderer, the Phase 2W frozen baseline manifest helper and Markdown 
renderer, the Phase 2X authorisation rehearsal record helper and Markdown renderer, the Phase 2Y real-authorisation packet skeleton helper and Markdown renderer, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
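The default-value rule and the synthetic sign-off rule described above can be sketched as follows. This is an illustration under stated assumptions: the field names, the `CertificateInputs` shape, and the status strings are hypothetical and only mirror the wording of this entry, not the real skeleton file.

```typescript
// Hypothetical field shape; names are illustrative only.
interface FreezeField {
  id: string;
  value: string | null;
  complete: boolean;
}

// Default values this entry says a final freeze field must never carry.
const FORBIDDEN_DEFAULTS = ["yes", "approved", "authorised", "signed", "frozen", "branch allowed"];

// Return the ids of fields whose default value is one of the forbidden words.
function forbiddenDefaults(fields: FreezeField[]): string[] {
  return fields
    .filter((f) => f.value !== null && FORBIDDEN_DEFAULTS.includes(f.value.trim().toLowerCase()))
    .map((f) => f.id);
}

// Assumed subset of the conditions listed in the synthetic sign-off rules.
interface CertificateInputs {
  isSyntheticFixture: boolean;
  fields: FreezeField[];
  evidenceHashesMatch: boolean;
  sectionsComplete: boolean;
  blockersCleared: boolean;
  testsGreen: boolean;
}

type CertificateStatus = "pending review" | "frozen certificate signed";

// Signed state requires a synthetic fixture AND every condition met;
// anything else stays pending review.
function certificateStatus(input: CertificateInputs): CertificateStatus {
  const everyConditionMet =
    input.isSyntheticFixture &&
    input.fields.every((f) => f.complete) &&
    input.evidenceHashesMatch &&
    input.sectionsComplete &&
    input.blockersCleared &&
    input.testsGreen;
  return everyConditionMet ? "frozen certificate signed" : "pending review";
}

const canonical: CertificateInputs = {
  isSyntheticFixture: false,
  fields: [
    { id: "branch_identity", value: null, complete: false },
    { id: "final_go_or_no_go", value: "pending review", complete: false },
  ],
  evidenceHashesMatch: true,
  sectionsComplete: false,
  blockersCleared: false,
  testsGreen: true,
};

// Synthetic fixture with every field filled and every condition met.
const syntheticComplete: CertificateInputs = {
  ...canonical,
  isSyntheticFixture: true,
  fields: canonical.fields.map((f) => ({ ...f, value: "filled for test", complete: true })),
  sectionsComplete: true,
  blockersCleared: true,
};

// Mutated fixture: one broken condition is enough to fall back to pending review.
const syntheticMutated: CertificateInputs = { ...syntheticComplete, evidenceHashesMatch: false };
```

Because the canonical inputs are never flagged as a synthetic fixture, `certificateStatus(canonical)` cannot return the signed state regardless of the other values.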
User impact: Founders see no on-screen change. The hardening adds the auditable pre-branch final freeze certificate skeleton that defines what must be finally signed off before any real Phase 1D branch begins. The canonical certificate remains pending review; it never emits frozen certificate signed. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2Z is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2Z authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2Z internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-01
- Reference
- Phase 2Z authorisation
- Impact assessment
- Adds a Phase 1D pre-branch final freeze certificate skeleton at the static content layer, one test-side support module (a deterministic helper that loads the skeleton and every referenced Phase 2U to Phase 2Y artefact, computes deterministic SHA-256 hashes, validates skeleton evidence integrity and final freeze field completeness rules and sign-off section completeness rules and pre-branch blocker coverage and synthetic sign-off rules and the canonical pending-review state, exposes documented mutated synthetic fixtures, and renders a calm founder-facing Markdown certificate skeleton), and eight new unit tests pinning shape and evidence integrity and final fields and sign-off sections and certificate safety and Markdown rendering and cleanliness and determinism, and two founder-facing prose documents at the docs layer. No product features. No runtime behaviour change. The skeleton does not authorise any real Phase 1D work and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. 
The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation / optional-field checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, the Phase 2R sequence manifest, the Phase 2S no-regression acceptance pack, the Phase 2T audit handoff pack (with refreshed per-pointer source-hash fields), the Phase 2U final pre-authorisation checklist (with refreshed summary-block snapshot hashes), the Phase 2V launch-readiness verification report (with refreshed evidence summary upstream snapshot hashes), the Phase 2W frozen baseline manifest (with refreshed content artefact hashes), the Phase 2X authorisation rehearsal record (with refreshed evidence summary upstream snapshot hashes), the Phase 2Y real-authorisation packet skeleton (with refreshed evidence source hashes), every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 3A
Change date: 2026-05-01
Product version: 0.71.0
Methodology engine version: 0.9.1
Phase 1D implementation branch preparation plan
Reason: Phase 2Z closed the pre-branch final freeze certificate skeleton loop. Phase 3A closes the implementation branch preparation planning loop with one calm deterministic plan that defines how the first real Phase 1D implementation branch would be opened only after explicit authorisation. The plan consumes the Phase 2Z pre-branch final freeze certificate skeleton, the Phase 2Y real-authorisation packet skeleton, the Phase 2X authorisation rehearsal record, the Phase 2W frozen baseline manifest, the Phase 2V launch-readiness verification report, and the Phase 2U final pre-authorisation checklist, and keeps every reviewer checkpoint as pending review and every command step as not executed by default. The canonical plan remains planning only and never creates a real branch.
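The planning-only defaults described above (checkpoints pending review, command steps not executed, no branch created) can be sketched as a pair of checks. The shapes, ids, and message strings below are assumptions for illustration; the real plan file's field names are not given in this Change Log.

```typescript
// Hypothetical shapes; names and ids are illustrative only.
interface ReviewerCheckpoint {
  id: string;
  status: "pending review" | "complete";
}

interface CommandStep {
  id: string;
  description: string; // proposed only; nothing here is ever run
  status: "not executed" | "executed";
}

interface BranchPreparationPlan {
  planStatus: string;
  branchCreationStatus: "not created" | "created";
  reviewerCheckpoints: ReviewerCheckpoint[];
  commandSequence: CommandStep[];
}

// The canonical plan must never claim a branch exists or that a step ran.
function canonicalViolations(plan: BranchPreparationPlan): string[] {
  const violations: string[] = [];
  if (plan.branchCreationStatus !== "not created") {
    violations.push("branch creation status is not 'not created'");
  }
  for (const step of plan.commandSequence) {
    if (step.status !== "not executed") violations.push(`command step ${step.id} claims it ran`);
  }
  return violations;
}

// Every incomplete checkpoint blocks branch start.
function branchStartBlockers(plan: BranchPreparationPlan): string[] {
  return plan.reviewerCheckpoints
    .filter((c) => c.status !== "complete")
    .map((c) => `checkpoint ${c.id} is ${c.status}`);
}

const canonicalPlan: BranchPreparationPlan = {
  planStatus: "planning-only pending-review",
  branchCreationStatus: "not created",
  reviewerCheckpoints: [
    { id: "baseline-archive", status: "pending review" },
    { id: "rollback-plan", status: "pending review" },
  ],
  commandSequence: [
    { id: "create-branch", description: "proposed: open the first implementation branch", status: "not executed" },
  ],
};

// Tampered copy used to show the check failing with a stable message.
const tamperedPlan: BranchPreparationPlan = {
  ...canonicalPlan,
  commandSequence: [{ ...canonicalPlan.commandSequence[0], status: "executed" }],
};
```

Keeping the two checks separate mirrors the distinction in this entry between what the canonical plan must never claim and what still blocks a branch from starting.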
What changed: Phase 3A is a docs-side and test-side planning phase. No product features were added. A new deterministic Phase 1D implementation branch preparation plan at the static content layer defines branch naming rules, baseline archive evidence, allowed first slice, file-family constraints, no-secret checks, rollback plan requirements, reviewer checkpoints, command sequence (proposed only), and stop conditions. The plan top-level shape adds a plan version, a product version that matches the package version after the Phase 3A bump, a generated-from-phase identifier of three-A, a plan type that explicitly marks the file as a Phase 1D implementation branch preparation plan, a planning-only pending-review plan status, a calm description, a calm reviewer-facing plan purpose, a latest-approved-archive block carrying the verified Phase 2Z archive name and archive hash and archive size and file count and test totals as fixed example evidence, an evidence-references block referencing Phase 2Z and Phase 2Y and Phase 2X and Phase 2W and Phase 2V and Phase 2U, a branch preparation rules block stating planning only and no branch created, a branch naming rules block proposing a stable pattern with allowed prefixes and forbidden prefixes and required tokens and clearly labelled examples, a baseline archive requirements block pinning Phase 2Z values, an allowed first slice block referencing the Phase 2R first slice without marking it started, a file-family constraints block consistent with Phase 2Q and Phase 2R, a no-secret checks block covering every documented exclusion, a rollback plan requirements block, a reviewer checkpoints block in stable order, a command sequence block of proposed commands all defaulting to not executed, a stop conditions block, a synthetic branch-start rules block, a legal and governance notes block, and a safety markers block. Every reviewer checkpoint defaults to pending review and blocks branch start if incomplete. 
Every command step defaults to not executed. The branch creation status defaults to not created. The synthetic branch-start rules prove the canonical plan cannot create a branch and that branch start is only possible in a synthetic fixture where every condition is met. The safety markers block pins twenty-one documented planning-only / not-Phase-1D / does-not-create-real-branch / does-not-permit-real-branch-by-default / NullProvider / no-service-role / no-real-customer / canonical-pending-review flags. The plan does not authorise real Phase 1D work, does not start Phase 1D, does not create a real branch, does not permit a real branch by default, does not mark Phase 1D in progress, does not mark any Phase 1D slice complete, does not implement any slice, does not run commands, does not change runtime behaviour, does not call Supabase, does not write to local storage, and does not replace legal review. One new test-side support module ships alongside the plan: a deterministic helper that loads the branch preparation plan and every referenced Phase 2U to Phase 2Z artefact, computes deterministic SHA-256 hashes via crypto.createHash, validates plan evidence integrity, validates branch naming rules, validates baseline archive requirements, validates allowed first slice, validates file-family constraints, validates no-secret checks, validates rollback requirements, validates reviewer checkpoints, validates command sequence is not executed, validates stop conditions, validates synthetic branch-start rules, validates the canonical plan status is planning-only pending-review, exposes documented mutated synthetic fixtures proving a branch cannot be started without complete evidence, and renders a calm founder-facing Markdown branch preparation plan. The helper never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses Date.now or new Date or Math.random or crypto random or process.env or local time. 
Nine new unit tests pin the plan end-to-end. The shape test asserts the documented top-level shape and every per-field-shape rule. The evidence integrity test asserts every recorded source SHA-256 matches the on-disk source content body and documented synthetic mutations each fail with a deterministic reason. The branch rules test asserts branch creation status is not created and no command step claims it ran. The first-slice and file-family test asserts the allowed first slice references the Phase 2R first slice and is not marked started. The secret rollback review test asserts every no-secret check passes, every rollback requirement is documented, and every reviewer checkpoint blocks branch start if incomplete. The safety test asserts the canonical plan cannot create a branch or start Phase 1D. The Markdown rendering test asserts the rendered Markdown carries the documented title, the implementation-branch-preparation-plan marker, the planning-only marker, the not-real-authorisation marker, the latest archive SHA-256, the latest test totals, the branch naming rules, the first slice, the file-family constraints, the no-secret checks, the rollback requirements, the reviewer checkpoints, the stop conditions, the legal and governance notes, and is byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value through the central forbidden-wording registry. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, that reviewer checkpoints appear in the documented stable order, and no clock or random or environment or absolute-path or runtime-secret reference in the plan content file. New founder-facing documents at the docs layer add a Phase 3A document and a short addendum to the Phase 1D entry decision-record document explaining where to find the implementation branch preparation plan. 
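The determinism test described above can be sketched as a byte-stability comparison plus a scan for clock, random, and environment references. The `stableStringify` and `forbiddenReferences` helpers and the sample object are assumptions for illustration, not the project's actual serialiser or test code.

```typescript
// Serialise plain JSON data with sorted object keys so the output does not
// depend on key insertion order. Array order is preserved on purpose,
// because the documented order of checkpoints is part of the contract.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).sort();
    return `{${keys.map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`).join(",")}}`;
  }
  return JSON.stringify(value);
}

// Tokens a deterministic content file should not reference (assumed list,
// taken from the wording of this entry).
const FORBIDDEN_REFERENCES = ["Date.now", "new Date", "Math.random", "process.env"];

function forbiddenReferences(content: string): string[] {
  return FORBIDDEN_REFERENCES.filter((token) => content.includes(token));
}

// Two objects with the same content but different key order.
const planA = { planStatus: "pending review", reviewerCheckpoints: ["scope", "rollback"] };
const planB = { reviewerCheckpoints: ["scope", "rollback"], planStatus: "pending review" };

const firstRender = stableStringify(planA);
const secondRender = stableStringify(planB);
const byteStable = firstRender === secondRender;
```

Comparing two serialisations as strings is a simple stand-in for a byte-level comparison; a stricter variant could compare the SHA-256 digests of both outputs instead.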
The audit bundle manifest, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report (regenerated under the new product version with refreshed Phase 2Z archive evidence and refreshed test totals), the synthetic Verification Journal entry (with refreshed source-artefact hash), the readiness pack index (with refreshed per-artefact source-hash fields and refreshed latest-approved-archive to Phase 2Z values), the scope boundary manifest, the sequence manifest, the no-regression acceptance pack, the audit handoff pack (with refreshed per-pointer source-hash fields and refreshed latest-approved-archive to Phase 2Z values), the final pre-authorisation checklist (with refreshed summary-block snapshot hashes and refreshed latest-approved-archive to Phase 2Z values), the launch-readiness verification report (with refreshed evidence summary upstream snapshot hashes and refreshed latest-approved-archive to Phase 2Z values), the frozen baseline manifest (with refreshed content artefact hashes and refreshed latest-approved-archive to Phase 2Z values), the authorisation rehearsal record (with refreshed evidence summary upstream snapshot hashes and refreshed latest-approved-archive to Phase 2Z values), the real-authorisation packet skeleton (with refreshed evidence source hashes and refreshed latest-approved-archive to Phase 2Z values), the pre-branch final freeze certificate skeleton (with refreshed evidence reference hashes and refreshed latest-approved-archive to Phase 2Z values), and the new Phase 3A implementation branch preparation plan product versions are bumped to 0.71.0 to stay in sync with the package version.
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the fuzzer semantics, the contract-shape fixtures and tests, the migration rehearsal tests, the deprecation registry and optional-field coverage and future-schema-addition checklist, the audit bundle manifest, the owner map and orphan-surface guard, the readiness freeze checklist, the decision-record template, the gate report builder and renderer, the entry helper and Markdown and CSV renderers, the readiness pack index helper and Markdown renderer, the scope boundary manifest helper and Markdown renderer, the sequence manifest helper and Markdown renderer, the no-regression acceptance pack helper and Markdown renderer, the audit handoff pack helper and Markdown renderer, the final pre-authorisation checklist helper and Markdown renderer, the launch-readiness verification report helper and Markdown renderer, the frozen baseline manifest helper and Markdown renderer, the authorisation rehearsal record helper and Markdown renderer, the real-authorisation packet skeleton helper and Markdown renderer, the pre-branch final freeze certificate skeleton helper and Markdown renderer, and the approved sample outcomes are all unchanged. NullProvider remains the default. The thirteen scorecard golden Markdown snapshots remain byte-identical.
User impact: Founders see no on-screen change. The hardening adds the auditable implementation branch preparation plan that defines how the first real Phase 1D branch would be opened only after explicit authorisation. The canonical plan remains planning only; it never creates a real branch and never starts Phase 1D. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 3A is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 3A authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 3A internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-05-01
- Reference
- Phase 3A authorisation
- Impact assessment
- Adds a Phase 1D implementation branch preparation plan at the static content layer, one test-side support module (a deterministic helper that loads the plan and every referenced Phase 2U to Phase 2Z artefact, computes deterministic SHA-256 hashes, validates plan evidence integrity and branch naming rules and baseline archive requirements and allowed first slice and file-family constraints and no-secret checks and rollback requirements and reviewer checkpoints and command sequence not-executed state and stop conditions and synthetic branch-start rules and the canonical planning-only state, exposes documented mutated synthetic fixtures, and renders a calm founder-facing Markdown branch preparation plan), and nine new unit tests pinning shape and evidence integrity and branch rules and first slice and file family and secret rollback review and safety and Markdown rendering and cleanliness and determinism, and two founder-facing prose documents at the docs layer. No product features. No runtime behaviour change. The plan does not authorise any real Phase 1D work, does not create a branch, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. 
The thirteen scorecard golden Markdown snapshots, the export-hash snapshots, the full-matrix snapshots, the fuzzing harness, the contract-shape fixtures and tests, the migration rehearsal tests, the deprecation and optional-field checklist, the audit bundle manifest, the owner map and orphan-surface guard, the readiness freeze checklist, the decision-record template, the dry-run record, the gate report (regenerated under the new product version), the synthetic Verification Journal entry (with refreshed source-artefact hash), the readiness pack index (with refreshed per-artefact source-hash fields), the scope boundary manifest, the sequence manifest, the no-regression acceptance pack, the audit handoff pack (with refreshed per-pointer source-hash fields), the final pre-authorisation checklist (with refreshed summary-block snapshot hashes), the launch-readiness verification report (with refreshed evidence summary upstream snapshot hashes), the frozen baseline manifest (with refreshed content artefact hashes), the authorisation rehearsal record (with refreshed evidence summary upstream snapshot hashes), the real-authorisation packet skeleton (with refreshed evidence source hashes), the pre-branch final freeze certificate skeleton (with refreshed evidence reference hashes), every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2O
Change date: 2026-04-30
Product version: 0.59.0
Methodology engine version: 0.9.1
Phase 1D entry-gate dry-run report Verification Journal entry
Reason: Phase 2N shipped the calm dry-run gate report. Phase 2O closes the readiness loop with one synthetic Verification Journal entry that records the dry-run gate report as a synthetic verification artefact. The entry preserves every dry-run safety marker, the synthetic founder sign-off reference, the archive hash reference, and the legal-and-governance acknowledgements, while remaining docs-side and test-side only and never writing to the runtime Verification Journal. The entry makes the dry-run picture referenceable from the documented Verification Journal pattern without authorising any real Phase 1D work.
What changed: Phase 2O is a docs-side and test-side hardening phase. No product features were added. A new deterministic synthetic Verification Journal entry at the static content layer records the Phase 2N synthetic dry-run gate report as a calm founder-facing verification artefact. The entry's top-level shape adds an entry version, a product version that matches the package version after the Phase 2O bump, a generated-from-phase identifier of two-O, an entry type that explicitly marks the entry as a synthetic dry-run verification entry, a calm description, a source-report block referencing the Phase 2N dry-run gate report, a dry-run-only true flag, a real-authorisation false flag, a journal-entry block, and a safety-markers block. The journal-entry block carries a stable identifier, a founder-facing title, a deterministic ISO date anchored to the existing 2026 corpus, a documented verification type and verification status, a calm subject line, the source-artefact name, a deterministic SHA-256 of the canonical Phase 2N gate report content body, the source-artefact dotted version, the synthetic founder sign-off reference, the archive name and archive SHA-256 (matching the Phase 2N approved archive), test-count totals (total, passed, failed) consistent with the latest GREEN gauntlet pattern, an overall verdict that matches the Phase 2N report verdict, calm evidence summary and limitations prose, and a calm legal-and-governance note. 
Three new test-side support modules ship alongside the entry: a deterministic entry helper that consumes the Phase 2N gate report and the Phase 2O synthetic journal entry content file and validates source-report consistency without any clock or random or environment or absolute-path dependency; a deterministic Markdown renderer that carries the documented dry-run-only and not-a-real-authorisation markers, the Phase 2N overall verdict, the Phase 2N archive hash, the synthetic founder sign-off reference, the test count, the safety markers, and the legal-and-governance note; and a deterministic CSV renderer that emits a fixed header row and exactly one synthetic data row carrying the dry-run-only and not-real-authorisation markers. Five new unit tests pin the entry end-to-end. The shape test asserts the documented top-level shape, every documented field-shape rule, that the entry version is a dotted version string, that the product version matches the package version, that the generated-from-phase identifier is two-O, that the entry type marks the entry as synthetic dry-run verification, that the dry-run-only flag is true and the real-authorisation flag is false, that the source-report block references the Phase 2N gate report, that the journal-entry identifier is a stable snake or kebab token, that the journal-entry title is non-empty founder-facing prose, that the journal-entry date is a deterministic ISO date, that the verification type and verification status are documented, that the source-artefact hash is a sixty-four-lowercase-hex SHA-256 (or null with a documented reason), that the archive hash is a sixty-four-lowercase-hex SHA-256, that the test-count fields are non-negative integers, that the test-count totals reconcile, that the failed-test count is zero, that the overall verdict matches the Phase 2N report, and that the safety-markers block is non-empty. 
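Several of the shape rules listed above reduce to small pure checks. The sketch below is an illustration under stated assumptions: the `EntryShape` and `TestTotals` field names are hypothetical, and the dotted-version pattern assumes three numeric parts, as in `0.59.0`.

```typescript
// Hypothetical field names; the real entry carries many more fields.
interface TestTotals {
  total: number;
  passed: number;
  failed: number;
}

interface EntryShape {
  productVersion: string;
  archiveSha256: string;
  tests: TestTotals;
}

const SHA256_HEX = /^[0-9a-f]{64}$/;      // 64 lowercase hex characters
const DOTTED_VERSION = /^\d+\.\d+\.\d+$/; // assumption: three numeric parts

// Collects every failed rule rather than stopping at the first one.
function shapeProblems(entry: EntryShape): string[] {
  const problems: string[] = [];
  if (!DOTTED_VERSION.test(entry.productVersion)) {
    problems.push("product version is not a dotted version string");
  }
  if (!SHA256_HEX.test(entry.archiveSha256)) {
    problems.push("archive hash is not 64 lowercase hex characters");
  }
  const { total, passed, failed } = entry.tests;
  if (![total, passed, failed].every((n) => Number.isInteger(n) && n >= 0)) {
    problems.push("test counts must be non-negative integers");
  }
  if (passed + failed !== total) problems.push("test totals do not reconcile");
  if (failed !== 0) problems.push("failed-test count is not zero");
  return problems;
}

const wellFormed: EntryShape = {
  productVersion: "0.59.0",
  archiveSha256: "a".repeat(64),
  tests: { total: 10, passed: 10, failed: 0 },
};
const malformed: EntryShape = {
  productVersion: "0.59",
  archiveSha256: "A".repeat(64),
  tests: { total: 10, passed: 8, failed: 1 },
};
```

Here `shapeProblems(wellFormed)` returns an empty list, while `malformed` trips the version, hash, reconciliation, and failed-count rules.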
The source-report consistency test asserts the journal entry agrees with the Phase 2N gate report on dry-run-only flag, real-authorisation flag, overall verdict, archive hash, synthetic founder sign-off reference, and test totals, and that documented synthetic mutations to the gate report (real-authorisation true, dry-run-only false, verdict mutated, archive hash removed) each flip the consistency check to a deterministic failure without mutating the canonical content files. The Markdown and CSV rendering test asserts the rendered Markdown is non-empty, carries the dry-run-only and not-a-real-authorisation markers, the overall verdict, the archive hash, the synthetic founder sign-off reference, the test count, and the legal-and-governance note, asserts the Markdown states it is not legal advice and not a legal approval and not a regulatory decision, asserts the Markdown does not claim Phase 1D is authorised for real and does not claim certification or compliance approval, asserts the CSV is non-empty with deterministic headers and exactly one data row carrying the dry-run-only and not-real-authorisation markers, and asserts both renderings are byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value (description, journal-entry title, subject, evidence summary, limitations, legal-and-governance note, safety marker prose, the rendered Markdown body, and the founder-facing CSV cells) through the Phase 2A central forbidden-wording registry and asserts no service-role API field name, Supabase service-role environment-variable name, runtime API key constant, Windows-user path, OneDrive path, absolute Linux path, or real-customer placeholder appears anywhere in any prose surface.
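The mutation pattern in the source-report consistency test can be sketched as below. This is a simplified illustration: the `GateSummary` shape, the `consistencyFailures` name, and the placeholder verdict and hash values are hypothetical, and the real check also covers the sign-off reference and test totals.

```typescript
// Hypothetical reduced shape shared by the gate report and the journal entry.
interface GateSummary {
  dryRunOnly: boolean;
  realAuthorisation: boolean;
  overallVerdict: string;
  archiveSha256: string | null;
}

// Lists every field on which the entry and the report disagree.
function consistencyFailures(entry: GateSummary, report: GateSummary): string[] {
  const keys: Array<keyof GateSummary> = [
    "dryRunOnly",
    "realAuthorisation",
    "overallVerdict",
    "archiveSha256",
  ];
  return keys.filter((k) => entry[k] !== report[k]).map((k) => `mismatch: ${k}`);
}

// Frozen so that no mutation can leak into the canonical object.
const canonicalReport: Readonly<GateSummary> = Object.freeze({
  dryRunOnly: true,
  realAuthorisation: false,
  overallVerdict: "example-verdict", // placeholder value
  archiveSha256: "0".repeat(64),     // placeholder value
});
const journalEntry: GateSummary = { ...canonicalReport };

// Each documented mutation is applied to a copy, never to the canonical report.
const mutations: Array<Partial<GateSummary>> = [
  { realAuthorisation: true },
  { dryRunOnly: false },
  { overallVerdict: "mutated" },
  { archiveSha256: null },
];
const mutationResults = mutations.map((m) =>
  consistencyFailures(journalEntry, { ...canonicalReport, ...m }),
);
```

Applying each mutation to a spread copy is one way to keep the canonical content untouched while still proving that every mutation flips the check.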
The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of every renderer call, byte-stability of the source-report consistency output, and no clock or random or environment or absolute-path reference in the entry content file. New founder-facing documents at the docs layer add a Phase 2O document and a short addendum to the Phase 1D entry decision-record document explaining where to find the synthetic Verification Journal entry. The Phase 2I audit bundle manifest, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record (with archive fields synced to the Phase 2N approved archive), the Phase 2N gate report, and the new Phase 2O Verification Journal entry product versions are bumped to 0.59.0 to stay in sync with the package version.
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2N dry-run gate report builder and renderer, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
User impact: Founders see no on-screen change. The hardening gives a reviewer one calm deterministic synthetic Verification Journal entry that records the Phase 2N dry-run gate report as a verification artefact without writing to the runtime Verification Journal, without authorising any real Phase 1D work, and without changing any user-facing surface. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2O is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2O authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2O internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-30
- Reference
- Phase 2O authorisation
- Impact assessment
- Adds a synthetic Verification Journal entry at the static content layer, three test-side support modules (a deterministic entry helper, a deterministic Markdown renderer, and a deterministic CSV renderer), and five new unit tests pinning shape and source-report consistency and Markdown / CSV rendering and cleanliness and determinism, and two founder-facing prose documents at the docs layer. No product features. No runtime behaviour change. The journal entry does not write to the runtime Verification Journal, does not authorise any real Phase 1D work, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation / optional-field checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record (with archive fields synced to the Phase 2N approved archive), the Phase 2N dry-run gate report builder and renderer, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2P
Change date: 2026-04-30
Product version: 0.60.0
Methodology engine version: 0.9.1
Founder-visible dry-run readiness pack index
Reason: Phase 2K shipped the readiness freeze checklist. Phase 2L shipped the entry decision-record template. Phase 2M shipped the synthetic completed dry-run decision record. Phase 2N shipped the synthetic dry-run gate report. Phase 2O recorded the dry-run gate report as a synthetic Verification Journal entry. Phase 2P closes the readiness loop with one calm founder-visible index that ties those five dry-run readiness artefacts together so the founder can see, in one calm place, which readiness artefacts exist before Phase 1D. The index makes every dry-run readiness reference visible at a glance without authorising any real Phase 1D work.
What changed: Phase 2P is a docs-side and test-side hardening phase. No product features were added. A new deterministic founder-visible dry-run readiness pack index at the static content layer ties together every dry-run readiness artefact created in Phases 2K through 2O so a reviewer can see, in one calm place, the full inventory of synthetic readiness material before Phase 1D. The index's top-level shape adds an index version, a product version that matches the package version after the Phase 2P bump, a generated-from-phase identifier of two-P, an index type that explicitly marks the index as a synthetic dry-run readiness pack index, a calm description, a dry-run-only true flag, a real-authorisation false flag, a calm founder-facing recommended-use note, a latest-approved-archive block carrying the approved Phase 2O archive name, archive hash, archive size, file count, and test totals as fixed example evidence, an artefacts array of exactly five entries (Phase 2K readiness checklist, Phase 2L entry decision-record template, Phase 2M synthetic completed dry-run decision record, Phase 2N synthetic entry-gate dry-run report, Phase 2O synthetic Verification Journal entry, in that documented stable order), and a safety markers block. Each artefact entry carries a stable identifier, a calm founder-facing name, the phase owner, a documented artefact kind, the path of the referenced content file, the path of the referenced founder-facing doc, the artefact's own content version, the artefact's product version, a dry-run-only true flag, a real-authorisation false flag, a synthetic-only flag, a required-before-Phase-1D flag, a deterministic SHA-256 of the referenced content file body, a non-empty safety markers summary, and calm notes. 
Synthetic archive evidence is refreshed across the dependent dry-run records so every reference points at the latest approved Phase 2O archive: the dry-run decision record's archive name, archive hash, archive size, file count, and test totals are updated; the dry-run gate report is regenerated from the refreshed dry-run record; the synthetic Verification Journal entry's archive fields, test counts, and source-artefact hash are refreshed to match the regenerated gate report. One new test-side support module ships alongside the index: a deterministic helper that loads the index content file, loads each referenced readiness artefact, computes deterministic SHA-256 hashes for each referenced content file using only the synchronous SHA-256 hash from the standard library, exposes a validator for index-to-artefact consistency, and exposes a founder-facing Markdown rendering of the index. The helper never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses Date.now or new Date or Math.random or crypto random or process.env or local time. Five new unit tests pin the index end-to-end. 
The shape test asserts the documented top-level shape, every per-artefact field-shape rule, that the index version is a dotted version string, that the product version matches the package version, that the generated-from-phase identifier is two-P, that the index type marks the index as a synthetic dry-run readiness pack index, that the dry-run-only flag is true and the real-authorisation flag is false, that the recommended-use field is non-empty, that the latest-approved-archive block carries the documented Phase 2O values, that the artefacts array is non-empty, that artefact identifiers are unique snake or kebab tokens, that every per-artefact field is present, that every phase owner and every artefact kind is documented, that every dry-run-only flag is true and every real-authorisation flag is false at the artefact level, that every required-before-Phase-1D flag is boolean, and that every safety markers summary is non-empty. The consistency test asserts every required readiness artefact (Phase 2K checklist, Phase 2L template, Phase 2M dry-run record, Phase 2N gate report, Phase 2O Verification Journal entry) is present exactly once, every referenced content file exists on disk, every referenced doc file exists where non-null, every content file SHA-256 matches the recorded source-hash field, every referenced content product version matches the package version where the artefact tracks the package version, every referenced artefact's generated-from-phase identifier matches its phase owner where applicable, every artefact's own dry-run-only and real-authorisation flags agree with the index, no required artefact can be removed without failing the test, and no extra artefact can be added without a documented kind and phase owner. 
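The exactly-once rule in the consistency test can be illustrated with a short sketch. The `ArtefactEntry` shape, the identifiers, and the `presenceProblems` name are hypothetical; the phase labels follow the five phases named above.

```typescript
// Hypothetical reduced artefact entry.
interface ArtefactEntry {
  id: string;
  phaseOwner: string;
}

// The five required readiness artefacts, in the documented stable order.
const REQUIRED_PHASES: readonly string[] = ["2K", "2L", "2M", "2N", "2O"];

function presenceProblems(artefacts: ArtefactEntry[]): string[] {
  const problems: string[] = [];
  // Every required phase must own exactly one artefact.
  for (const phase of REQUIRED_PHASES) {
    const count = artefacts.filter((a) => a.phaseOwner === phase).length;
    if (count !== 1) {
      problems.push(`phase ${phase}: expected exactly one artefact, found ${count}`);
    }
  }
  // No artefact may carry an undocumented phase owner.
  for (const artefact of artefacts) {
    if (!REQUIRED_PHASES.includes(artefact.phaseOwner)) {
      problems.push(`undocumented phase owner: ${artefact.phaseOwner}`);
    }
  }
  return problems;
}

const fullIndex: ArtefactEntry[] = REQUIRED_PHASES.map((phase) => ({
  id: `artefact-${phase.toLowerCase()}`,
  phaseOwner: phase,
}));
const withoutGateReport = fullIndex.filter((a) => a.phaseOwner !== "2N");
const withExtra = [...fullIndex, { id: "artefact-extra", phaseOwner: "2X" }];
```

Removing a required artefact and adding an undocumented one both produce a non-empty problem list, which mirrors the two negative cases the consistency test describes.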
The safety test asserts the index states dry-run-only and not-real-authorisation clearly, the index references synthetic data only, the index includes the latest approved Phase 2O archive hash, the index does not imply Phase 1D has started, the index does not authorise Phase 1D, every artefact safety marker summary is non-empty, every artefact that can contain a dry-run marker contains one, every artefact that can contain a real-authorisation flag keeps it false, every artefact that can contain a synthetic-only flag keeps it true, every legal-and-governance marker remains present where applicable, the index does not write to the runtime Verification Journal, and the index does not reference any live provider call. The cleanliness test sweeps every founder-facing prose value (description, recommended use, artefact name, safety markers summary, notes, safety marker prose, the rendered Markdown body) through the Phase 2A central forbidden-wording registry and asserts no service-role API field name, Supabase service-role environment-variable name, runtime API key constant, Windows-user path, OneDrive path, absolute Linux path, or real-customer placeholder appears anywhere in any prose surface. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator output across two consecutive calls, byte-stability of the Markdown renderer across two consecutive calls, that artefacts are sorted in the documented stable order of two-K, two-L, two-M, two-N, and two-O, and no clock or random or environment or absolute-path or runtime-secret reference in the index content file. New founder-facing documents at the docs layer add a Phase 2P document and a short addendum to the Phase 1D entry decision-record document explaining where to find the dry-run readiness pack index.
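One simple way to assert that a content file carries no clock, random, or environment reference is a substring scan. The sketch below is an assumption about how such a check could look, not the shipped test; the token list and the `findForbiddenReferences` name are hypothetical, and the sample content strings are invented for illustration.

```typescript
// Hypothetical token list covering clock, random, and environment references.
const FORBIDDEN_REFERENCES: readonly string[] = [
  "Date.now",
  "new Date",
  "Math.random",
  "process.env",
];

// Returns every forbidden token that appears in the content, in list order.
function findForbiddenReferences(content: string): string[] {
  return FORBIDDEN_REFERENCES.filter((token) => content.includes(token));
}

// Invented samples: a clean static content body and a deliberately dirty one.
const cleanContent = JSON.stringify({ indexVersion: "1.0.0", dryRunOnly: true });
const dirtyContent = "const stamp = Date.now(); const seed = Math.random();";

const cleanHits = findForbiddenReferences(cleanContent);
const dirtyHits = findForbiddenReferences(dirtyContent);
```

A substring scan is deliberately blunt: it can flag harmless prose that merely mentions a token, which is acceptable for a static content file that should never contain such references at all.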
The Phase 2I audit bundle manifest, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record (with archive fields synced to the Phase 2O approved archive), the Phase 2N gate report (regenerated from the refreshed dry-run record), the Phase 2O synthetic Verification Journal entry (with archive fields and source-artefact hash synced to the refreshed Phase 2N report), and the new Phase 2P readiness pack index product versions are bumped to 0.60.0 to stay in sync with the package version.
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2N gate report builder and renderer, the Phase 2O entry helper and Markdown and CSV renderers, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
User impact: Founders see no on-screen change. The hardening gives a reviewer one calm deterministic founder-visible index of every dry-run readiness artefact before Phase 1D, with documented per-artefact references, version alignment checks, and safety markers. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2P is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2P authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2P internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-30
- Reference
- Phase 2P authorisation
- Impact assessment
- Adds a founder-visible dry-run readiness pack index at the static content layer, one test-side support module (a deterministic helper that loads the index, loads each referenced artefact, computes deterministic SHA-256 hashes, validates index-to-artefact consistency, and renders a founder-facing Markdown view), and five new unit tests pinning shape and consistency and safety and cleanliness and determinism, and two founder-facing prose documents at the docs layer. No product features. No runtime behaviour change. The index does not authorise any real Phase 1D work and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation / optional-field checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record (with archive fields synced to the Phase 2O approved archive), the Phase 2N gate report (regenerated from the refreshed dry-run record), the Phase 2O Verification Journal entry (with archive fields and source-artefact hash synced to the refreshed Phase 2N report), every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2Q
Change date: 2026-04-30
Product version: 0.61.0
Methodology engine version: 0.9.1
Phase 1D scope boundary manifest
Reason: Phase 2K through Phase 2P shipped the readiness checklist, the entry decision-record template, the synthetic completed dry-run record, the synthetic dry-run gate report, the synthetic Verification Journal entry, and the dry-run readiness pack index. Phase 2Q closes the boundary loop with one calm deterministic manifest that defines exactly what a future Phase 1D may introduce after separate authorisation and what remains explicitly out of scope. The dry-run readiness artefacts ask 'are you ready'; the boundary manifest asks 'are you about to introduce something out of scope'. Together they make the Phase 1D moment auditable on both axes without authorising any real Phase 1D work.
What changed: Phase 2Q is a docs-side and test-side hardening phase. No product features were added. A new deterministic Phase 1D scope boundary manifest at the static content layer defines exactly what a future Phase 1D is allowed to introduce and what remains explicitly out of scope unless separately authorised later. The manifest's top-level shape adds a manifest version, a product version that matches the package version after the Phase 2Q bump, a generated-from-phase identifier of two-Q, a manifest type that explicitly marks the manifest as a Phase 1D scope boundary manifest, a calm description, a dry-run boundary status, a calm founder-facing boundary purpose, a reference to the Phase 2P dry-run readiness pack index, a latest-approved-archive block carrying the approved Phase 2P archive name, archive hash, archive size, file count, and test totals as fixed example evidence, an allowed-workstreams array, an excluded-features array, a dependency-boundaries section, an api-route-boundaries section, a provider-boundaries section, a data-handling-boundaries section, an analytics-payment-deployment-boundaries section, a boundary-change-requirements section, and a safety markers block. Allowed workstreams cover paid-beta access, payment-provider evaluation, EU-only deployment, legal pages, error monitoring, analytics, providers, API routes, data handling, and public surfaces; each allowed workstream carries a narrow scope, explicit limits, the evidence required before activation, and the items that remain excluded inside that workstream. 
Excluded features cover public sharing; team accounts; public launch assets; marketing pages; sales pages; pricing pages; product catalogue pages; live payment checkout; Stripe and Mollie dependencies; Sentry and Plausible dependencies; any third-party analytics dependency; live Mistral, OpenAI, Anthropic, and any other live LLM scoring; service-role usage; Supabase schema and RLS changes; new API routes; automated-crawl wording or behaviour; aggressive scanning; vulnerability probing; fear-based marketing; real customer or personal data; public deployment outside an explicitly authorised EU region; and engine, scoring, context-pack, receipt-hash, and version changes. Each excluded feature carries a clear reason, a separate-authorisation requirement, the evidence required before any change, and calm notes.
Dependency boundaries enumerate the documented constraints (no runtime, payment, analytics, monitoring, or live-provider dependency added in Phase 2Q) plus the documented evidence required for any future dependency change.
API-route boundaries enumerate the constraint that the API route count must remain unchanged in Phase 2Q and the documented evidence required before any future API route addition.
Provider boundaries enumerate that NullProvider remains the default, that no live Mistral, OpenAI, or Anthropic scoring is authorised, and the documented evidence required before any future provider change.
Data-handling boundaries enumerate that no real customer data appears in tests, fixtures, docs, or dry-run artefacts; no PII appears in receipts unless separately authorised; no service-role usage is permitted; and no Supabase or local storage write path is added by Phase 2Q.
The analytics, payment, and deployment boundary section enumerates the explicit out-of-scope status of public deployment, EU analytics, and payment checkout.
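As a rough illustration of the per-item fields described for allowed workstreams and excluded features, the sketch below models both as TypeScript types and reports any item that lacks a documented field. The type names, field names, and `findCoverageProblems` are hypothetical; this entry does not give the manifest's real keys.

```typescript
// Hypothetical shapes: field names are illustrative, not the real manifest keys.
interface AllowedWorkstream {
  id: string;
  scope: string;              // narrow scope
  limits: string[];           // explicit limits
  requiredEvidence: string[]; // evidence required before activation
  excludedItems: string[];    // items that stay excluded inside the workstream
}

interface ExcludedFeature {
  id: string;
  reason: string;
  requiresSeparateAuthorisation: boolean;
  requiredEvidence: string[];
  notes: string;
}

// Returns a sorted list of problems so the result is stable across runs.
function findCoverageProblems(
  workstreams: AllowedWorkstream[],
  features: ExcludedFeature[],
): string[] {
  const problems: string[] = [];
  for (const w of workstreams) {
    if (w.scope.trim() === "") problems.push(`${w.id}: missing scope`);
    if (w.limits.length === 0) problems.push(`${w.id}: missing limits`);
    if (w.requiredEvidence.length === 0) problems.push(`${w.id}: missing required evidence`);
    if (w.excludedItems.length === 0) problems.push(`${w.id}: missing excluded items`);
  }
  for (const f of features) {
    if (f.reason.trim() === "") problems.push(`${f.id}: missing reason`);
    if (!f.requiresSeparateAuthorisation) {
      problems.push(`${f.id}: separate authorisation not required`);
    }
    if (f.requiredEvidence.length === 0) problems.push(`${f.id}: missing required evidence`);
  }
  return problems.sort();
}
```

An empty result would mean every modelled item carries the fields this sketch checks; it says nothing about whether the evidence itself is adequate.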
The boundary-change-requirements section enumerates the deterministic preconditions before any boundary can change: a completed Phase 1D entry decision record, named owner, reason, evidence, GREEN gauntlet result, archive SHA-256, methodology changelog update, docs update, tests update, rollback plan, no-secret archive check, and legal or governance review where applicable. One new test-side support module ships alongside the manifest: a deterministic helper that loads the manifest, validates allowed workstreams, validates excluded features, validates dependency, API-route, provider, and data-handling boundaries, validates the boundary-change-requirements section, and renders a calm founder-facing Markdown summary. The helper never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses Date.now or new Date or Math.random or crypto random or process.env or local time. Six new unit tests pin the manifest end-to-end. The shape test asserts the documented top-level shape, every per-section field-shape rule, every documented enum, and that no required field is missing. The allowed-vs-excluded coverage test asserts every required allowed workstream is present, every required excluded feature is present, every allowed workstream has explicit limits and required evidence and excluded items, every excluded feature has a reason and a separate-authorisation requirement and required evidence, and every documented boundary set (payment, analytics, monitoring, provider, API route, data handling, public surface) is fully covered. The boundary evidence test asserts every future boundary change requires the documented evidence list (completed Phase 1D entry decision record, named owner, reason, evidence, GREEN gauntlet, archive hash, methodology changelog update, docs update, tests update, rollback plan, no-secret archive check, plus legal or governance review where applicable). 
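The precondition list above lends itself to a simple completeness check. The following is a minimal sketch, assuming each precondition is tracked as a string identifier; the identifiers, the `BoundaryChangeRequest` shape, and `missingPreconditions` are hypothetical names introduced for illustration only.

```typescript
// Identifiers mirror the preconditions named in this entry; the names are hypothetical.
const REQUIRED_PRECONDITIONS: readonly string[] = [
  "entryDecisionRecordCompleted",
  "namedOwner",
  "reason",
  "evidence",
  "greenGauntletResult",
  "archiveSha256",
  "methodologyChangelogUpdate",
  "docsUpdate",
  "testsUpdate",
  "rollbackPlan",
  "noSecretArchiveCheck",
];

interface BoundaryChangeRequest {
  boundaryName: string;
  supplied: readonly string[];           // identifiers of the evidence actually supplied
  needsLegalOrGovernanceReview: boolean; // models "where applicable"
}

// Returns the missing preconditions in the documented order.
function missingPreconditions(request: BoundaryChangeRequest): string[] {
  const missing: string[] = REQUIRED_PRECONDITIONS.filter(
    (p) => request.supplied.indexOf(p) === -1,
  );
  if (
    request.needsLegalOrGovernanceReview &&
    request.supplied.indexOf("legalOrGovernanceReview") === -1
  ) {
    missing.push("legalOrGovernanceReview");
  }
  return missing;
}
```

Under this sketch a boundary change would be considered only when the returned list is empty; the check is purely structural and does not authorise anything.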
The safety test asserts that the manifest does not authorise Phase 1D; does not state that Phase 1D has started; does not imply that paid features exist, deployment is live, analytics are enabled, or live provider scoring is enabled; keeps NullProvider as default; keeps the API route count, Supabase, and local storage unchanged; and keeps the synthetic readiness artefacts separate from real authorisation. It also asserts that every safety marker is populated.
The cleanliness test sweeps every founder-facing prose value through the Phase 2A central forbidden-wording registry and asserts that no service-role API field name, Supabase service-role environment-variable name, runtime API key constant, Windows-user path, OneDrive path, absolute Linux path, or real-customer placeholder appears anywhere in any prose surface.
The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, that allowed workstreams, excluded features, and boundary change requirements appear in documented stable orders, and that the manifest content file contains no clock, random, environment, absolute-path, or runtime-secret reference.
New founder-facing documents at the docs layer add a Phase 2Q document and a short addendum to the Phase 1D entry decision-record document explaining where to find the boundary manifest.
The product versions of the Phase 2I audit bundle manifest, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), and the new Phase 2Q boundary manifest are bumped to 0.61.0 to stay in sync with the package version.
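The cleanliness sweep described in this entry could look roughly like the sketch below: collect every string value in the manifest and test it against a list of forbidden patterns. The three patterns shown are assumptions chosen to illustrate the path-related categories; the real Phase 2A registry contents are not listed here, and `collectStrings` and `findForbiddenWording` are hypothetical names.

```typescript
// Hypothetical patterns: the real forbidden-wording registry is not reproduced here.
const FORBIDDEN_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: "windows-user-path", pattern: /[A-Za-z]:\\Users\\/ },
  { label: "onedrive-path", pattern: /OneDrive[\\\/]/ },
  { label: "absolute-linux-path", pattern: /(^|\s)\/(home|usr|var|etc)\// },
];

// Recursively collects every string value in a JSON-like structure.
// Object keys are visited in sorted order so the result order is stable.
function collectStrings(value: unknown, out: string[] = []): string[] {
  if (typeof value === "string") {
    out.push(value);
  } else if (Array.isArray(value)) {
    for (const item of value) collectStrings(item, out);
  } else if (typeof value === "object" && value !== null) {
    const record = value as Record<string, unknown>;
    for (const key of Object.keys(record).sort()) collectStrings(record[key], out);
  }
  return out;
}

// Returns one "label: text" hit per forbidden pattern matched by any string value.
function findForbiddenWording(manifest: unknown): string[] {
  const hits: string[] = [];
  for (const text of collectStrings(manifest)) {
    for (const entry of FORBIDDEN_PATTERNS) {
      if (entry.pattern.test(text)) hits.push(`${entry.label}: ${text}`);
    }
  }
  return hits;
}
```

The patterns carry no global flag, so `test` keeps no state between calls and repeated sweeps return the same hits.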
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2N gate report builder and renderer, the Phase 2O entry helper and Markdown and CSV renderers, the Phase 2P index helper and Markdown renderer, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
User impact: Founders see no on-screen change. The hardening gives a reviewer one calm deterministic Phase 1D scope boundary manifest that pairs with the dry-run readiness pack index: the readiness artefacts say whether the project is ready, the boundary manifest says whether a proposed change is in scope. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2Q is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2Q authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2Q internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-30
- Reference
- Phase 2Q authorisation
- Impact assessment
- Adds a Phase 1D scope boundary manifest at the static content layer, one test-side support module (a deterministic helper that loads the manifest, validates the allowed workstreams and excluded features and dependency and API-route and provider and data-handling boundaries and boundary-change-requirements, and renders a calm founder-facing Markdown summary), and six new unit tests pinning shape and allowed-vs-excluded coverage and boundary evidence and safety and cleanliness and determinism, and two founder-facing prose documents at the docs layer. No product features. No runtime behaviour change. The manifest does not authorise any real Phase 1D work and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation / optional-field checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2R
Change date: 2026-04-30
Product version: 0.62.0
Methodology engine version: 0.9.1
Phase 1D implementation sequence manifest
Reason: Phase 2K through Phase 2Q shipped the readiness gates, the dry-run decision artefacts, the readiness pack index, and the explicit Phase 1D scope boundary. Phase 2R closes the temporal loop with one calm deterministic manifest that breaks a future Phase 1D into safe, ordered implementation slices. The dry-run readiness artefacts say whether the project is ready, the boundary manifest says what is in scope, and the sequence manifest says in what order Phase 1D may be delivered. Together they make the Phase 1D moment auditable on three axes (readiness, scope, sequence) without authorising any real Phase 1D work.
What changed: Phase 2R is a docs-side and test-side hardening phase. No product features were added. A new deterministic Phase 1D implementation sequence manifest at the static content layer breaks a future Phase 1D into ten ordered implementation slices. The manifest's top-level shape adds a manifest version, a product version that matches the package version after the Phase 2R bump, a generated-from-phase identifier of two-R, a manifest type that explicitly marks the manifest as a Phase 1D implementation sequence manifest, a calm description, a dry-run sequence status, a calm founder-facing sequence purpose, a reference to the Phase 2Q scope boundary manifest, a reference to the Phase 2P dry-run readiness pack index, a latest-approved-archive block carrying the approved Phase 2Q archive name, archive hash, archive size, file count, and test totals as fixed example evidence, an implementation-slices array of ten ordered entries (entry-decision-record completion, scope confirmation, paid-beta access shell, payment-provider decision, payment-provider integration, legal pages, EU deployment preparation, analytics-and-monitoring decision, analytics-and-monitoring integration, release review), a global-sequence-rules section, and a safety markers block. 
Each slice carries a stable identifier, a numeric slice index starting at zero with no gaps, founder-facing slice name and slice purpose, non-empty preconditions (including references to earlier slices), required evidence before start (completed Phase 1D entry decision record, Phase 2Q boundary check, named owner, GREEN gauntlet baseline, approved archive SHA-256), allowed file families (narrow patterns the slice may touch), forbidden file families (every other family that must remain frozen, with explicit no-overlap), expected tests (file paths or future-new-test markers with documented reasons), rollback evidence (rollback owner, trigger, steps, and safe-state archive), archive evidence (archive path, hash, size, file count, test totals, no-secret archive check), and calm notes. The global sequence rules section enumerates sixteen documented rules every slice must obey before it may start: every slice begins only after a completed Phase 1D entry decision record, the Phase 2Q boundary check, a named owner, a GREEN gauntlet baseline, and an approved archive hash; no slice may touch files outside its allowed file families or touch its forbidden file families; no slice may add a runtime dependency without dependency review; no slice may add an API route without API-route boundary approval; no slice may add a Supabase change without data-boundary approval; no slice may add a live provider call without provider-boundary approval; no slice may add analytics or monitoring without privacy or governance review; no slice may add a payment provider without legal or governance review and dependency review; no slice may change scoring or receipts unless a future explicit boundary change authorises it; no slice may weaken Phase 2A through Phase 2Q tests; every slice records rollback evidence before completion. The sequence status is dry-run-sequence-only. 
The manifest does not authorise real Phase 1D work, does not start Phase 1D, does not implement any slice, does not run commands, does not change runtime behaviour, does not call Supabase, and does not replace legal review. One new test-side support module ships alongside the manifest: a deterministic helper that loads the manifest, validates slice ordering, validates precondition references, validates allowed and forbidden file-family separation, validates expected-tests coverage, validates rollback-evidence and archive-evidence completeness, validates global sequence rules, and renders a calm founder-facing Markdown summary. The helper never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses Date.now or new Date or Math.random or crypto random or process.env or local time. Seven new unit tests pin the manifest end-to-end. The shape test asserts the documented top-level shape, every per-slice field-shape rule, every documented enum, and that no required field is missing. The ordered-preconditions test asserts every required slice is present in the documented order, no slice depends on a later slice, every slice after index zero depends on at least one earlier slice, every slice requires a completed Phase 1D entry decision record where applicable, every implementation slice requires the Phase 2Q boundary check, every implementation slice requires a named owner, every implementation slice requires a GREEN gauntlet baseline and approved archive hash, and the release review slice depends on every prior authorised slice. 
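The ordering rules in the ordered-preconditions test (indices start at zero with no gaps, no slice depends on a later slice, every slice after index zero depends on an earlier one) can be sketched as follows. `SliceOrderInfo`, `dependsOn`, and `findOrderingProblems` are hypothetical names; the manifest's real field names are not given in this entry.

```typescript
// Hypothetical slice shape: only the fields needed for ordering checks are modelled.
interface SliceOrderInfo {
  id: string;
  sliceIndex: number;
  dependsOn: string[]; // ids of slices that must come first
}

function findOrderingProblems(slices: SliceOrderInfo[]): string[] {
  const problems: string[] = [];
  const indexById: Record<string, number> = {};
  slices.forEach((slice, position) => {
    // Indices must start at zero and increase by one with no gaps.
    if (slice.sliceIndex !== position) {
      problems.push(`${slice.id}: expected index ${position}, found ${slice.sliceIndex}`);
    }
    indexById[slice.id] = slice.sliceIndex;
  });
  for (const slice of slices) {
    if (slice.sliceIndex > 0 && slice.dependsOn.length === 0) {
      problems.push(`${slice.id}: no dependency on an earlier slice`);
    }
    for (const dep of slice.dependsOn) {
      if (!Object.prototype.hasOwnProperty.call(indexById, dep)) {
        problems.push(`${slice.id}: unknown dependency ${dep}`);
      } else if (indexById[dep] >= slice.sliceIndex) {
        problems.push(`${slice.id}: depends on later or same slice ${dep}`);
      }
    }
  }
  return problems;
}
```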
The allowed-vs-forbidden file-family test asserts every slice has non-empty allowed and forbidden families, no allowed family equals or is a child of a forbidden family in the same slice, docs-only and decision-only slices do not allow runtime app code, the payment decision slice does not allow runtime checkout code, integration slices have narrower allowed families than their excluded families, and the release review slice forbids new runtime changes. The expected-tests-and-evidence test asserts every slice has expected tests, every expected test file exists or is clearly marked as a future-new-test with a documented reason, every rollback evidence block has owner and trigger and steps and a safe-state archive requirement, every archive evidence block requires archive path and hash and size and file count and test totals and no-secret archive check, every dependency-related slice requires package diff evidence, every provider-related slice requires no-live-provider evidence, every API-related slice requires route boundary evidence, and every data-related slice requires data-boundary evidence. The sequence safety test asserts the manifest does not authorise Phase 1D, does not state Phase 1D has started, does not imply paid features exist or deployment is live or analytics are enabled or live provider scoring is enabled, keeps NullProvider as default, keeps the API route count unchanged, keeps Supabase unchanged, keeps local storage unchanged, keeps the synthetic readiness artefacts separate from real authorisation, and every slice is sequence-only and not implementation. The cleanliness test sweeps every founder-facing prose value through the Phase 2A central forbidden-wording registry. 
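The rule that no allowed family equals or is a child of a forbidden family in the same slice can be illustrated with a prefix comparison. This is a minimal sketch that assumes file families are written as directory-style path prefixes (for example `docs/legal`); the entry only calls them "narrow patterns", so the representation and the function names are assumptions.

```typescript
// Assumption: a family is a directory-style path prefix such as "docs/legal".
// A trailing slash is added so that "docs-extra" is not treated as a child of "docs".
function normaliseFamily(family: string): string {
  return family.charAt(family.length - 1) === "/" ? family : family + "/";
}

// True when candidate is the same family as parent or sits underneath it.
function isSameOrChild(candidate: string, parent: string): boolean {
  return normaliseFamily(candidate).indexOf(normaliseFamily(parent)) === 0;
}

// Lists every allowed family that equals or falls inside a forbidden family.
function findFamilyOverlaps(allowed: string[], forbidden: string[]): string[] {
  const overlaps: string[] = [];
  for (const a of allowed) {
    for (const f of forbidden) {
      if (isSameOrChild(a, f)) overlaps.push(`${a} is inside forbidden family ${f}`);
    }
  }
  return overlaps;
}
```

If real families use glob patterns rather than plain prefixes, the comparison would need a glob-aware containment check instead.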
The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, that implementation slices are sorted by slice index and global sequence rules appear in the documented stable order, and that the manifest content file contains no clock, random, environment, absolute-path, or runtime-secret reference. New founder-facing documents at the docs layer add a Phase 2R document and a short addendum to the Phase 1D entry decision-record document explaining where to find the implementation sequence manifest. The product versions of the Phase 2I audit bundle manifest, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, and the new Phase 2R sequence manifest are bumped to 0.62.0 to stay in sync with the package version.
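One common way to get the byte-stability the determinism test checks for is to serialise with object keys in sorted order. The sketch below shows that approach for JSON-like values; it is an assumption about technique, not a description of the actual helper, and `canonicalStringify` and `isByteStable` are hypothetical names.

```typescript
// Serialises a JSON-like value (strings, numbers, booleans, null, arrays, plain objects)
// with object keys in sorted order, so output does not depend on key insertion order.
function canonicalStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalStringify(item)).join(",") + "]";
  }
  if (typeof value === "object" && value !== null) {
    const record = value as Record<string, unknown>;
    const parts = Object.keys(record)
      .sort()
      .map((key) => JSON.stringify(key) + ":" + canonicalStringify(record[key]));
    return "{" + parts.join(",") + "}";
  }
  return JSON.stringify(value);
}

// Builds the value twice and compares the two serialisations for exact equality.
function isByteStable(build: () => unknown): boolean {
  return canonicalStringify(build()) === canonicalStringify(build());
}
```

Array order is preserved as written, which matches the entry's requirement that slices and rules appear in a documented stable order rather than being re-sorted at serialisation time.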
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2N gate report builder and renderer, the Phase 2O entry helper and Markdown and CSV renderers, the Phase 2P index helper and Markdown renderer, the Phase 2Q scope boundary manifest helper and Markdown renderer, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
User impact: Founders see no on-screen change. The hardening gives a reviewer one calm deterministic Phase 1D implementation sequence manifest that pairs with the readiness pack index and the boundary manifest: the readiness artefacts say whether the project is ready, the boundary manifest says what is in scope, and the sequence manifest says in what order Phase 1D may be delivered. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2R is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2R authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2R internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-30
- Reference
- Phase 2R authorisation
- Impact assessment
- Adds a Phase 1D implementation sequence manifest at the static content layer, one test-side support module (a deterministic helper that loads the manifest, validates slice ordering and precondition references and allowed-and-forbidden file-family separation and expected-tests coverage and rollback-evidence and archive-evidence completeness and global sequence rules, and renders a calm founder-facing Markdown summary), and seven new unit tests pinning shape and ordered-preconditions and allowed-vs-forbidden file families and expected-tests-and-evidence and sequence safety and cleanliness and determinism, and two founder-facing prose documents at the docs layer. No product features. No runtime behaviour change. The manifest does not authorise any real Phase 1D work and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation / optional-field checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2S
Change date: 2026-04-30
Product version: 0.63.0
Methodology engine version: 0.9.1
Phase 1D no-regression acceptance pack
Reason: Phase 2R defined the future Phase 1D implementation sequence. Phase 2S closes the post-slice loop with one calm deterministic acceptance pack that defines what evidence each future Phase 1D slice must supply on exit before the next slice may start. Phase 2R answers in what order Phase 1D may be delivered; Phase 2S answers what each slice must prove on exit before the next slice begins. Together they prevent continuing from one slice to the next without post-slice review.
What changed: Phase 2S is a docs-side and test-side hardening phase. No product features were added. A new deterministic Phase 1D no-regression acceptance pack at the static content layer carries one acceptance entry per Phase 2R implementation slice (ten entries in the documented Phase 2R order). The pack's top-level shape adds a pack version, a product version that matches the package version after the Phase 2S bump, a generated-from-phase identifier of two-S, a pack type that explicitly marks the pack as a Phase 1D no-regression acceptance pack, a calm description, a dry-run acceptance-pack-only status, a calm founder-facing pack purpose, references to the Phase 2R sequence manifest and the Phase 2Q boundary manifest and the Phase 2P readiness pack index, a latest-approved-archive block carrying the approved Phase 2R archive name and archive hash and archive size and file count and test totals as fixed example evidence, a slice-acceptance-entries array of ten entries, a global-acceptance-rules section, and a safety markers block. 
Each acceptance entry carries:
- a stable identifier;
- the matching slice id, slice index, and slice name from the Phase 2R sequence manifest;
- the phase owner;
- an acceptance status defaulting to pending review and never marking a real slice as completed;
- a calm founder-facing acceptance purpose;
- a required-test-results block (full gauntlet, typecheck, lint, build, test, total and passed and failed test totals, required prior-phase tests, slice-expected-tests reference, snapshot drift policy, sample-matrix and golden-snapshots requirements);
- an archive-evidence block (exit archive path, name, hash, size, file count, test totals, and entry-to-exit delta, no-build-artefacts assertion, env-example presence assertion, runtime-env exclusion assertion);
- a boundary-evidence block (Phase 2Q boundary recheck, no-excluded-feature-introduced assertion, boundary-exception record requirement when applicable, changed-boundary-name requirement when applicable, legal-or-governance review requirement when applicable, calm evidence note);
- a dependency-diff-evidence block (package.json diff, lockfile diff, no-new-dependency assertion for non-dependency slices, dependency review and licence note and data-handling note and rollback note for dependency slices);
- a file-family-diff-evidence block (changed-files list, all-changed-files-must-be-inside-allowed-families assertion, no-changed-file-may-be-inside-forbidden-families assertion, references to the Phase 2R allowed and forbidden families for the slice, undocumented-file-change-blocks-next-slice assertion);
- a rollback-evidence block (rollback owner, trigger, steps, safe-state archive, rollback-test command, calm rollback note);
- a no-secret-archive-evidence block (no-env, no-runtime-env-files, no-service-role-secret, no-node-modules, no-build-artefacts, no-logs, env-example-present);
- a next-slice-release-condition (only when the current entry is exit ready with complete evidence);
- calm notes.
The global-acceptance-rules section enumerates fifteen documented rules every slice exit must obey before the next slice may start: no next slice starts unless the current slice is exit ready with complete evidence; no slice may be marked exit ready with failed tests; no slice may be marked exit ready without archive hash; no slice may be marked exit ready without no-secret archive evidence; no slice may be marked exit ready if any changed file falls outside allowed file families; no slice may be marked exit ready if any changed file falls inside forbidden file families; no dependency-related slice may be marked exit ready without package diff evidence; no provider-related slice may be marked exit ready without no-live-provider evidence; no API-related slice may be marked exit ready without API-route boundary evidence; no data-related slice may be marked exit ready without data-boundary evidence; no payment-related slice may be marked exit ready without legal or governance and dependency evidence; no analytics or monitoring slice may be marked exit ready without privacy or governance evidence; no slice may weaken Phase 2A through Phase 2R tests; no slice may change scoring or receipts or context packs unless a future explicit boundary change authorises it; every slice exit must record rollback evidence before the next slice starts. The pack status is dry-run-acceptance-pack-only. The pack does not authorise real Phase 1D work, does not start Phase 1D, does not mark any real slice as completed, does not implement any slice, does not run commands, does not change runtime behaviour, does not call Supabase, and does not replace legal review. 
One new test-side support module ships alongside the pack: a deterministic helper that loads the pack and the Phase 2R sequence manifest, validates slice coverage, validates evidence completeness, validates dependency-and-file-family-diff requirements, validates rollback-evidence and no-secret-archive-evidence completeness, exposes a validator that proves default entries are pending review and not exit ready, exposes documented mutated synthetic fixtures showing exit ready passes only with complete evidence and fails when required evidence is missing (failed tests above zero, missing archive hash, missing no-secret archive evidence, changed file outside allowed families, changed file inside forbidden families, missing dependency diff for a dependency slice, missing rollback evidence, missing boundary evidence), and renders a calm founder-facing Markdown summary. The helper never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses Date.now or new Date or Math.random or crypto random or process.env or local time. Seven new unit tests pin the pack end-to-end. The shape test asserts the documented top-level shape, every per-entry field-shape rule, every documented enum, and that no required field is missing. The slice coverage test asserts every Phase 2R slice has exactly one acceptance entry, every acceptance entry references an existing Phase 2R slice, acceptance entries appear in Phase 2R slice order, slice index and slice name match Phase 2R, every acceptance entry has acceptance status pending review by default, no acceptance entry claims the slice is completed, no acceptance entry releases the next slice by default, and the release review acceptance entry is present and last. 
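The slice coverage rules above (one acceptance entry per Phase 2R slice, same order, matching index and name, pending review by default, no next-slice release by default) can be sketched as a position-by-position comparison. All type names, field names, and status strings below are hypothetical, and the sketch assumes slice ids are unique.

```typescript
// Hypothetical shapes: names and status strings are illustrative only.
interface SequenceSlice {
  sliceId: string;
  sliceIndex: number;
  sliceName: string;
}

interface AcceptanceEntry {
  sliceId: string;
  sliceIndex: number;
  sliceName: string;
  acceptanceStatus: "pending-review" | "exit-ready";
  releasesNextSlice: boolean;
}

// Compares entries to slices by position; with unique slice ids, equal lengths plus
// position-wise matches imply exactly one entry per slice in slice order.
function findCoverageGaps(slices: SequenceSlice[], entries: AcceptanceEntry[]): string[] {
  const problems: string[] = [];
  if (entries.length !== slices.length) {
    problems.push(`expected ${slices.length} entries, found ${entries.length}`);
  }
  const count = Math.min(slices.length, entries.length);
  for (let i = 0; i < count; i++) {
    const slice = slices[i];
    const entry = entries[i];
    if (
      entry.sliceId !== slice.sliceId ||
      entry.sliceIndex !== slice.sliceIndex ||
      entry.sliceName !== slice.sliceName
    ) {
      problems.push(`entry ${i} does not match slice ${slice.sliceId}`);
    }
    if (entry.acceptanceStatus !== "pending-review") {
      problems.push(`entry ${i} is not pending review by default`);
    }
    if (entry.releasesNextSlice) {
      problems.push(`entry ${i} releases the next slice by default`);
    }
  }
  return problems;
}
```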
The evidence completeness test asserts every acceptance entry requires the full gauntlet, typecheck, lint, build, and test, and the total and passed and failed test totals, and sample matrix and golden snapshots where applicable, and archive path and name and hash and size and file count and test totals and entry-to-exit delta, and no-secret archive evidence, and rollback owner and trigger and steps and safe-state archive and rollback test command, and that dependency-related entries require package diff and dependency review evidence, provider-related entries require no-live-provider evidence, API-related entries require API-route boundary evidence, data-related entries require data-boundary evidence, payment-related entries require legal or governance and dependency evidence, and analytics or monitoring entries require privacy or governance evidence. The dependency and file-family diff test asserts every acceptance entry references the Phase 2R allowed and forbidden families, requires a changed-files list, requires all changed files to fall inside allowed families, blocks changed files inside forbidden families, blocks undocumented file changes, that non-dependency slices require a no-new-dependency assertion, dependency slices require package json diff and lockfile diff and dependency review and licence note and data-handling note and rollback note, and integration slices require tighter diff evidence than decision-only slices. 
The exit-safety test asserts the pack does not authorise Phase 1D, does not state Phase 1D has started, does not mark any slice exit ready by default, does not permit the next slice to start by default, that a synthetic complete exit ready fixture passes validation only when all required evidence is present, that synthetic mutations fail validation when failed tests are above zero or archive hash is missing or no-secret archive evidence is missing or a changed file appears outside allowed families or a changed file appears inside forbidden families or dependency diff is missing for a dependency slice or rollback evidence is missing or boundary evidence is missing, and that every failure returns a deterministic reason. The cleanliness test sweeps every founder-facing prose value through the Phase 2A central forbidden-wording registry. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, that slice acceptance entries are sorted by slice index in the documented Phase 2R order, and global acceptance rules appear in the documented stable order, and no clock or random or environment or absolute-path or runtime-secret reference in the pack content file. New founder-facing documents at the docs layer add a Phase 2S document and a short addendum to the Phase 1D entry decision-record document explaining where to find the no-regression acceptance pack. 
The Phase 2I audit bundle manifest, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, the Phase 2R sequence manifest, and the new Phase 2S acceptance pack product versions are bumped to 0.63.0 to stay in sync with the package version.
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2N gate report builder and renderer, the Phase 2O entry helper and Markdown and CSV renderers, the Phase 2P index helper and Markdown renderer, the Phase 2Q scope boundary manifest helper and Markdown renderer, the Phase 2R sequence manifest helper and Markdown renderer, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
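The exit-ready rules and the synthetic mutations described above can be pictured with a small validator. This is a minimal sketch only: the `ExitEvidence` shape, the field names, the reason strings, and the example paths are all hypothetical and are not the pack's documented schema. It shows the general idea that a complete fixture passes and that each missing piece of evidence fails with one fixed, deterministic reason.

```typescript
// Hypothetical evidence shape; field names are illustrative assumptions.
type SliceKind = "dependency" | "provider" | "api" | "data" | "other";

interface ExitEvidence {
  sliceKind: SliceKind;
  failedTests: number;
  archiveHash: string | null;
  noSecretArchiveEvidence: boolean;
  changedFiles: string[];
  allowedFamilies: string[]; // path prefixes (assumed representation)
  forbiddenFamilies: string[]; // path prefixes (assumed representation)
  packageDiffEvidence: boolean;
  boundaryEvidence: boolean;
  rollbackEvidence: boolean;
}

type ExitResult = { ok: true } | { ok: false; reason: string };

function inFamily(file: string, families: string[]): boolean {
  return families.some((prefix) => file.startsWith(prefix));
}

// Checks run in a fixed order, so the same input always yields the same reason.
function validateExitReady(e: ExitEvidence): ExitResult {
  if (e.failedTests > 0) return { ok: false, reason: "failed-tests-above-zero" };
  if (e.archiveHash === null || e.archiveHash === "") {
    return { ok: false, reason: "missing-archive-hash" };
  }
  if (!e.noSecretArchiveEvidence) {
    return { ok: false, reason: "missing-no-secret-archive-evidence" };
  }
  for (const file of e.changedFiles) {
    if (inFamily(file, e.forbiddenFamilies)) {
      return { ok: false, reason: "changed-file-inside-forbidden-family" };
    }
    if (!inFamily(file, e.allowedFamilies)) {
      return { ok: false, reason: "changed-file-outside-allowed-family" };
    }
  }
  if (e.sliceKind === "dependency" && !e.packageDiffEvidence) {
    return { ok: false, reason: "missing-dependency-diff" };
  }
  const needsBoundary =
    e.sliceKind === "provider" || e.sliceKind === "api" || e.sliceKind === "data";
  if (needsBoundary && !e.boundaryEvidence) {
    return { ok: false, reason: "missing-boundary-evidence" };
  }
  if (!e.rollbackEvidence) return { ok: false, reason: "missing-rollback-evidence" };
  return { ok: true };
}

// A synthetic complete fixture (all values invented for illustration).
const completeFixture: ExitEvidence = {
  sliceKind: "dependency",
  failedTests: 0,
  archiveHash: "example-hash",
  noSecretArchiveEvidence: true,
  changedFiles: ["docs/example.md"],
  allowedFamilies: ["docs/", "tests/"],
  forbiddenFamilies: ["src/engine/"],
  packageDiffEvidence: true,
  boundaryEvidence: true,
  rollbackEvidence: true,
};
```

A mutated fixture is then just a spread of the complete one, for example `validateExitReady({ ...completeFixture, rollbackEvidence: false })`, which under these assumptions fails with `missing-rollback-evidence`.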
User impact: Founders see no on-screen change. The hardening gives a reviewer one calm deterministic Phase 1D no-regression acceptance pack that pairs with the Phase 2R sequence manifest: the sequence manifest says in what order Phase 1D may be delivered; the acceptance pack says what each slice must prove on exit before the next slice begins. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2S is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2S authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2S internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-30
- Reference
- Phase 2S authorisation
- Impact assessment
- Adds a Phase 1D no-regression acceptance pack at the static content layer, one test-side support module (a deterministic helper that loads the pack and the Phase 2R sequence manifest, validates slice coverage and evidence completeness and dependency-and-file-family-diff requirements and rollback-evidence and no-secret-archive-evidence completeness, exposes a validator and documented mutated synthetic fixtures, and renders a calm founder-facing Markdown summary), and seven new unit tests pinning shape and slice coverage and evidence completeness and dependency-and-file-family-diff coverage and exit safety and cleanliness and determinism, and two founder-facing prose documents at the docs layer. No product features. No runtime behaviour change. The pack does not authorise any real Phase 1D work and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation / optional-field checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, the Phase 2R sequence manifest, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2T
Change date: 2026-04-30
Product version: 0.64.0
Methodology engine version: 0.9.1
Phase 1D audit handoff pack
Reason: Phase 2K through Phase 2S shipped the readiness checklist, decision-record template, synthetic completed dry-run record, synthetic dry-run gate report, synthetic Verification Journal entry, dry-run readiness pack index, scope boundary manifest, implementation sequence manifest, and no-regression acceptance pack. Phase 2T closes the reviewer-handoff loop with one calm deterministic audit handoff pack a founder can give to a reviewer before real Phase 1D begins. The pack makes the full Phase 2K to Phase 2S readiness chain reviewable from one document and preserves the explicit not-real-authorisation marker.
What changed: Phase 2T is a docs-side and test-side hardening phase. No product features were added. A new deterministic Phase 1D audit handoff pack at the static content layer carries one artefact pointer per Phase 2K to Phase 2S readiness artefact (nine pointers in the documented review order), a fifteen-item reviewer checklist, an explicit legal-and-governance notes block, a latest-approved-archive block referencing the approved Phase 2S archive name and archive hash and archive size and file count and test totals as fixed example evidence, and a fully-populated safety markers block. The handoff pack's top-level shape adds a pack version, a product version that matches the package version after the Phase 2T bump, a generated-from-phase identifier of two-T, a pack type that explicitly marks the pack as a Phase 1D audit handoff pack, a calm description, a dry-run handoff status, a calm founder-facing handoff purpose, the latest-approved-archive block, a review-order array, the artefact-pointers array, the reviewer-checklist array, the legal-and-governance-notes block, and the safety markers block. Each artefact pointer carries a stable identifier, the founder-facing artefact name, the phase owner, the documented artefact kind, the content path, the doc path, the artefact's content version, the artefact's product version, a deterministic SHA-256 of the artefact content body, a one-sentence calm reviewer focus, dry-run-only and not-real-authorisation markers where applicable, a legal-or-governance marker where applicable, and calm notes. The reviewer checklist enumerates fifteen documented checklist items every reviewer must pass before authorising real Phase 1D work; each item carries a stable id, a founder-facing name, a required flag, an evidence source, a calm pass condition, a calm failure meaning, and calm notes. 
The legal-and-governance notes block states explicitly that the pack is not legal advice, not a legal approval, not a regulatory decision, does not authorise Phase 1D, does not start Phase 1D, that separate legal or governance review may still be required before paid or public use, that synthetic dry-run artefacts are examples only, and that real Phase 1D work still needs an explicitly completed real decision record. The handoff status is dry-run-handoff-only. The pack does not authorise real Phase 1D work, does not start Phase 1D, does not mark any future slice as completed, does not implement any slice, does not run commands, does not change runtime behaviour, does not call Supabase, does not write to local storage, and does not replace legal review. One new test-side support module ships alongside the pack: a deterministic helper that loads the pack, loads every referenced Phase 2K through Phase 2S artefact, computes deterministic SHA-256 hashes for each referenced content file, validates artefact pointer integrity (existence, version alignment, hash match, dry-run and not-real-authorisation marker presence), validates reviewer checklist coverage, validates the legal and governance notes, exposes documented mutated synthetic fixtures (missing content file, stale SHA-256, stale product version, missing dry-run marker, missing not-real-authorisation marker), and renders a calm founder-facing Markdown handoff cover sheet. The helper never calls runtime storage, Supabase, fetch, local storage, or any live LLM provider, and never uses Date.now or new Date or Math.random or crypto random or process.env or local time. Eight new unit tests pin the pack end-to-end. The shape test asserts the documented top-level shape, every per-pointer field-shape rule, every per-checklist-item field-shape rule, every documented enum, and that no required field is missing. 
The artefact pointer coverage test asserts every required Phase 2K to Phase 2S artefact is present exactly once, the review order array includes every artefact pointer exactly once, artefact pointers appear in Phase 2K to Phase 2S review order, every content file exists, every doc file exists, every phase owner is documented, every reviewer focus is non-empty, and no extra artefact pointer exists without a documented phase owner and reviewer focus. The pointer integrity test asserts every recorded content SHA-256 matches the on-disk file body, every referenced content product version matches package.json where the artefact tracks product version, every referenced content generated from phase matches the artefact phase owner where applicable, every dry-run artefact pointer carries the dry-run-only marker where applicable, every not-real-authorisation marker is present where applicable, the validator returns ok for the canonical pack, and documented synthetic mutations (missing content file, stale SHA-256, stale product version, missing dry-run marker, missing not-real-authorisation marker) each fail with a deterministic reason. The reviewer checklist test asserts every documented checklist item from the spec is present, every item is required, every item has evidence source / pass condition / failure meaning / notes, and every documented review concern is covered (latest archive hash, latest test totals, all artefacts exist, product version alignment, content SHA-256 alignment, dry-run-only markers, not-real-authorisation markers, legal-and-governance notes, Phase 1D has not started, no real-customer data, no service-role usage, no runtime feature added, Phase 1D scope boundary explicit, implementation sequence ordered, no-regression acceptance evidence defined). 
The legal/governance and safety test asserts the pack does not authorise Phase 1D, does not state Phase 1D has started, does not mark any future slice as completed, does not imply paid features exist or public deployment is live or analytics are enabled or live provider scoring is enabled, keeps NullProvider as default, keeps the API route count unchanged, keeps Supabase unchanged, keeps local storage unchanged, states not legal advice and not a legal approval and not a regulatory decision, states separate legal or governance review may still be required, and every safety marker is populated. The Markdown rendering test asserts the rendered Markdown is non-empty, carries the documented title, the latest archive SHA-256, the latest test totals, every artefact pointer, the review order, the reviewer checklist, the legal-and-governance notes, the not-real-authorisation marker, the not-legal-advice marker, the not-legal-approval marker, the not-regulatory-decision marker, never claims Phase 1D is authorised or has started, and is byte-stable across two consecutive renders. The cleanliness test sweeps every founder-facing prose value through the Phase 2A central forbidden-wording registry. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, byte-stability of the validator and Markdown renderer, that artefact pointers are sorted in the documented Phase 2K to Phase 2S review order, that reviewer checklist items appear in the documented stable order, and no clock or random or environment or absolute-path or runtime-secret reference in the pack content file. New founder-facing documents at the docs layer add a Phase 2T document and a short addendum to the Phase 1D entry decision-record document explaining where to find the audit handoff pack. 
The Phase 2I audit bundle manifest, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, the Phase 2R sequence manifest, the Phase 2S no-regression acceptance pack, and the new Phase 2T audit handoff pack product versions are bumped to 0.64.0 to stay in sync with the package version.
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2N gate report builder and renderer, the Phase 2O entry helper and Markdown and CSV renderers, the Phase 2P index helper and Markdown renderer, the Phase 2Q scope boundary manifest helper and Markdown renderer, the Phase 2R sequence manifest helper and Markdown renderer, the Phase 2S no-regression acceptance pack helper and Markdown renderer, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
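The pointer-integrity checks described above (existence, hash match, version alignment, marker presence) could look roughly like the following. This is a hedged sketch, not the shipped helper: `ArtefactPointer`, `OnDiskArtefact`, the reason strings, and the example paths are hypothetical. It assumes the caller has already computed each file's SHA-256 and passes it in, so the sketch itself does no hashing and no file access.

```typescript
// Hypothetical pointer shape; field names are illustrative assumptions.
interface ArtefactPointer {
  id: string;
  contentPath: string;
  productVersion: string;
  contentSha256: string;
  dryRunOnly: boolean;
  notRealAuthorisation: boolean;
}

// What the caller observed on disk (hash precomputed elsewhere, by assumption).
interface OnDiskArtefact {
  sha256: string;
  productVersion: string;
}

type PointerResult = { ok: true } | { ok: false; id: string; reason: string };

function validatePointer(
  pointer: ArtefactPointer,
  disk: ReadonlyMap<string, OnDiskArtefact>,
  packageVersion: string,
): PointerResult {
  const fail = (reason: string): PointerResult => ({ ok: false, id: pointer.id, reason });
  const found = disk.get(pointer.contentPath);
  if (found === undefined) return fail("missing-content-file");
  if (found.sha256 !== pointer.contentSha256) return fail("stale-sha256");
  if (pointer.productVersion !== packageVersion || found.productVersion !== packageVersion) {
    return fail("stale-product-version");
  }
  if (!pointer.dryRunOnly) return fail("missing-dry-run-marker");
  if (!pointer.notRealAuthorisation) return fail("missing-not-real-authorisation-marker");
  return { ok: true };
}

// Validates pointers in the given review order and returns the first failure,
// so the same pack always produces the same result.
function validatePack(
  pointers: readonly ArtefactPointer[],
  disk: ReadonlyMap<string, OnDiskArtefact>,
  packageVersion: string,
): PointerResult {
  for (const pointer of pointers) {
    const result = validatePointer(pointer, disk, packageVersion);
    if (!result.ok) return result;
  }
  return { ok: true };
}

// Synthetic canonical example (all values invented for illustration).
const examplePointer: ArtefactPointer = {
  id: "phase-2k",
  contentPath: "content/example-2k.json",
  productVersion: "0.64.0",
  contentSha256: "abc123",
  dryRunOnly: true,
  notRealAuthorisation: true,
};
const exampleDisk = new Map<string, OnDiskArtefact>([
  ["content/example-2k.json", { sha256: "abc123", productVersion: "0.64.0" }],
]);
```

Passing the observed hashes in as data keeps the check pure, which is one way to satisfy the no-clock, no-random, no-environment constraint the entry describes.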
User impact: Founders see no on-screen change. The hardening gives a reviewer one calm deterministic Phase 1D audit handoff cover sheet that pairs with the Phase 2K to Phase 2S readiness chain: the readiness pack, boundary manifest, sequence manifest, and acceptance pack are still discoverable in the documented review order, but a reviewer can now read one cover sheet first and follow the documented pointers in calm sequence. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2T is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2T authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2T internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-30
- Reference
- Phase 2T authorisation
- Impact assessment
- Adds a Phase 1D audit handoff pack at the static content layer, one test-side support module (a deterministic helper that loads the pack and every referenced Phase 2K through Phase 2S artefact, computes deterministic SHA-256 hashes, validates artefact pointer integrity and reviewer checklist coverage and the legal and governance notes, exposes documented mutated synthetic fixtures, and renders a calm founder-facing Markdown handoff cover sheet), and eight new unit tests pinning shape and artefact pointer coverage and pointer integrity and reviewer checklist completeness and legal-and-governance and safety and Markdown rendering and cleanliness and determinism, and two founder-facing prose documents at the docs layer. No product features. No runtime behaviour change. The pack does not authorise any real Phase 1D work and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. 
The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation / optional-field checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, the Phase 2N gate report (regenerated under the new product version), the Phase 2O Verification Journal entry (with refreshed source-artefact hash), the Phase 2P readiness pack index (with refreshed per-artefact source-hash fields), the Phase 2Q scope boundary manifest, the Phase 2R sequence manifest, the Phase 2S no-regression acceptance pack, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-W
Change date: 2026-04-29
Product version: 0.41.0
Methodology engine version: 0.9.1
Browser-only since-you-last-visited chip on the Methodology Change Log
Reason: Phase 1H-A added the Methodology Change Log; Phase 1H-C added per-entry source attribution and on-page anchors; Phase 1H-S added per-entry permalink lines to the Methodology Change Log Markdown export; Phase 1H-V added a browser-only since-you-last-visited chip for the Verify-now journal. The remaining awareness gap was that a founder returning to the dashboard had no calm signal that the Methodology Change Log itself had moved since they last looked. A small browser-only awareness chip closes that gap without sending anything off-device.
What changed: The dashboard now renders one short calm browser-only awareness line (N new methodology entries since you last visited) when the Methodology Change Log contains entries strictly newer than the user's acknowledgement marker. The chip pluralises correctly (1 new methodology entry vs N new methodology entries) and never renders when the changelog has not moved past the acknowledgement. A View changes link points at /methodology/changes; a Mark as seen action stores the latest visible change id and the current methodology product version in local storage under the key agentproof.methodology ack.v1, separate from the Phase 1H-V verification-journal acknowledgement and the Phase 1H-P journal envelope. Mark as seen never mutates the static changelog content, never sends anything to AgentProof, never touches Supabase, and never adds an API route; it is a browser-only feature backed by a single local storage key, with no schema migration. The acknowledgement marker tolerates missing values, malformed envelopes, and rolled-back changelog edge cases by falling back to the calm default where every entry counts as new. The chip is gated on having at least one saved scorecard so an empty dashboard's onboarding flow is not disturbed. The Phase 1H-V verification-journal acknowledgement chip, the Phase 1H-U print-only Journal reference line on the Verification Report surface, the Phase 1H-T journal reference helper, the Phase 1H-S Methodology Change Log per-entry permalink line, the Phase 1H-R Verify-now journal CSV export shape, the Phase 1H-Q print-only audit lines, the Phase 1H-P Verification Journal storage shape and Markdown export, the Phase 1H-O Methodology Change Log export, every prior copy-preamble surface (1H-L, 1H-M, 1H-N), the Phase 1H-K Verification Report filename, the Phase 1H-J memo and timeline filename slugs, the engine, scoring, methodology lineage, source-attribution, provenance trail, targeted-rescore, and targeted-rescore comparison logic are all unchanged.
The deterministic scorecard Markdown report and all 13 golden Markdown snapshots remain byte-identical.
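The chip behaviour described above (strictly-newer counting, pluralisation, the every-entry-is-new fallback, and the saved-scorecard gate) can be sketched as a few pure functions. The names `AckMarker`, `parseAck`, `countNewEntries`, and `chipText` are hypothetical, as is the envelope's JSON layout. The sketch also assumes the changelog is available as an array ordered oldest first and that the raw stored string is read by the caller, so no storage access appears here.

```typescript
interface ChangeEntry {
  id: string;
}

// Hypothetical envelope layout for the acknowledgement marker.
interface AckMarker {
  lastSeenChangeId: string;
  productVersion: string;
}

// Returns null for missing or malformed envelopes instead of throwing.
function parseAck(raw: string | null): AckMarker | null {
  if (raw === null) return null;
  try {
    const value: unknown = JSON.parse(raw);
    if (typeof value === "object" && value !== null) {
      const record = value as Record<string, unknown>;
      const id = record["lastSeenChangeId"];
      const version = record["productVersion"];
      if (typeof id === "string" && typeof version === "string") {
        return { lastSeenChangeId: id, productVersion: version };
      }
    }
  } catch {
    // Malformed JSON falls through to the calm default below.
  }
  return null;
}

// Changelog is assumed to be ordered oldest first.
function countNewEntries(changelog: readonly ChangeEntry[], ack: AckMarker | null): number {
  if (ack === null) return changelog.length; // missing or malformed: all new
  const index = changelog.findIndex((entry) => entry.id === ack.lastSeenChangeId);
  if (index < 0) return changelog.length; // rolled-back changelog: all new
  return changelog.length - 1 - index; // entries strictly after the marker
}

function chipText(newCount: number, hasSavedScorecard: boolean): string | null {
  if (!hasSavedScorecard || newCount === 0) return null; // chip does not render
  return newCount === 1
    ? "1 new methodology entry since you last visited"
    : `${newCount} new methodology entries since you last visited`;
}
```

For instance, with entries `1H-V`, `1H-W`, `1H-X` and a marker pointing at `1H-W`, `countNewEntries` returns 1 and `chipText(1, true)` yields the singular wording.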
User impact: Founders returning to the dashboard now see a calm one-line awareness signal whenever the Methodology Change Log has moved since they last marked it as seen, with a one-click jump to the changes page. The acknowledgement is browser-only — it never leaves the user's machine, is never used to count product usage, and is never used to measure user behaviour off-device.
When to re-score: Not required. Phase 1H-W adds a single browser-only awareness chip to the dashboard; the engine, scoring, receipt, scorecard Markdown body, downloaded filename, and deterministic scoring outputs are unchanged.
Report wording update
Training / documentation update
Evidence trace: Phase 1H-W authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-W internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 1H-W authorisation
- Impact assessment
- Adds a deterministic acknowledgement helper plus a small dashboard chip that compares the current Methodology Change Log against the last acknowledgement marker. The acknowledgement marker lives in its own localStorage key, separate from the Phase 1H-V verification-journal acknowledgement and the Phase 1H-P journal envelope; clearing one does not clear another. The chip never sends anything off-device, never adds an API route, never mutates the static changelog content, and never measures usage or behaviour. The Phase 1H-V acknowledgement chip, the Phase 1H-U print-only line, the Phase 1H-T reference helper / Markdown bullet / CSV column, the Phase 1H-S Methodology Change Log per-entry permalink line, the Phase 1H-R Verify-now journal CSV export, the Phase 1H-Q print-only audit lines, the Phase 1H-P journal storage shape and Markdown export, the Phase 1H-O Methodology Change Log export, every prior copy-preamble surface, the deterministic scorecard Markdown report, golden snapshots, receipt hashing, storage shape, methodology lineage logic, source-attribution logic, provenance trail logic, targeted-rescore logic, and targeted-rescore comparison logic are all unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-X
Change date: 2026-04-29
Product version: 0.42.0
Methodology engine version: 0.9.1
Browser-only since-you-last-visited awareness sub-line on the per-scorecard Methodology lineage panel
Reason: Phase 1H-B added a per-scorecard Methodology lineage panel that surfaces how many methodology entries shipped after a saved scorecard was generated; Phase 1H-W added a dashboard-level Methodology Change Log acknowledgement chip. The remaining awareness gap is on the per-scorecard surface itself: the lineage panel knows the count of newer methodology entries, but it does not show how many of those are also new since the founder last marked the Methodology Change Log as seen on the dashboard. A small read-only awareness sub-line closes that gap by reusing the existing Phase 1H-W marker.
What changed: The Phase 1H-B Methodology lineage panel on saved scorecard results pages now renders one calm browser-only sub-line — N of those methodology entries are new since you last visited — when the lineage already lists newer methodology entries that shipped after this scorecard, and at least one of those entries is strictly newer than the Phase 1H-W acknowledgement marker. The line pluralises correctly (1 of those methodology entries is new vs N of those methodology entries are new) and never renders when the lineage has no newer entries or every newer entry is older than or equal to the acknowledgement. The line reuses the Phase 1H-W local storage key agentproof.methodology ack.v1 — no new acknowledgement key was added and no schema migration was required. The per-scorecard panel never writes the acknowledgement (the Phase 1H-W dashboard chip remains the only Mark-as-seen surface), never mutates the static changelog content, never adds an API route, never imports Supabase, never measures usage or behaviour, and never sends anything to AgentProof. The Phase 1H-W dashboard methodology chip, the Phase 1H-V verification-journal acknowledgement chip, the Phase 1H-U print-only Journal reference line on the Verification Report surface, the Phase 1H-T journal reference helper, the Phase 1H-S Methodology Change Log per-entry permalink line, the Phase 1H-R Verify-now journal CSV export shape, the Phase 1H-Q print-only audit lines, the Phase 1H-P Verification Journal storage shape and Markdown export, the Phase 1H-O Methodology Change Log export, every prior copy-preamble surface (1H-L, 1H-M, 1H-N), the Phase 1H-K Verification Report filename, the Phase 1H-J memo and timeline filename slugs, the Phase 1H-B methodology lineage calculation, the engine, scoring, methodology lineage semantics, source-attribution, provenance trail, targeted-rescore, and targeted-rescore comparison logic are all unchanged. 
The deterministic scorecard Markdown report and all 13 golden Markdown snapshots remain byte-identical.
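The per-scorecard sub-line described above can be pictured as a read-only function over two inputs: the lineage panel's list of newer entries and the existing acknowledgement marker. The function name `lineageSubLine` and its parameters are hypothetical. The sketch assumes change ids are available in changelog order (oldest first) and that a missing or unknown marker is treated as "every entry is new", mirroring the Phase 1H-W fallback; that last point is an assumption, not something this entry states.

```typescript
// changelogIds: every change id, oldest first (assumed ordering).
// newerThanScorecard: ids the lineage panel already lists for this scorecard.
// lastSeenChangeId: the acknowledged id, or null when no marker is stored.
function lineageSubLine(
  changelogIds: readonly string[],
  newerThanScorecard: readonly string[],
  lastSeenChangeId: string | null,
): string | null {
  // An absent or unknown marker gives index -1, so every known entry counts as new.
  const ackIndex = lastSeenChangeId === null ? -1 : changelogIds.indexOf(lastSeenChangeId);
  const count = newerThanScorecard.filter(
    (id) => changelogIds.indexOf(id) > ackIndex,
  ).length;
  if (count === 0) return null; // nothing strictly newer: no sub-line
  return count === 1
    ? "1 of those methodology entries is new since you last visited"
    : `${count} of those methodology entries are new since you last visited`;
}
```

The function only reads its arguments and returns a string or null, which matches the entry's point that the per-scorecard panel never writes the acknowledgement.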
User impact: Founders opening a saved scorecard now see a calm one-line awareness signal under the Methodology lineage panel whenever methodology entries that shipped after this scorecard are also new since they last marked the Methodology Change Log as seen on the dashboard. The sub-line is browser-only: it never leaves the user's machine, is never used to count product usage, and is never used to measure user behaviour off-device.
When to re-score: Not required. Phase 1H-X adds a single read-only browser-only awareness sub-line to the existing per-scorecard Methodology lineage panel; the engine, scoring, receipt, scorecard Markdown body, downloaded filename, and deterministic scoring outputs are unchanged.
Report wording update
Training / documentation update
Evidence trace: Phase 1H-X authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-X internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 1H-X authorisation
- Impact assessment
- Adds a deterministic per-scorecard awareness helper plus a read-only sub-line on the Methodology lineage panel. The helper consumes only the existing Phase 1H-B newer-changes list and the Phase 1H-W acknowledgement marker; no new localStorage key is added and no schema migration is required. The per-scorecard surface never writes the acknowledgement, never mutates the static changelog content, never adds an API route, never imports Supabase, never measures usage or behaviour, and never sends anything off-device. The Phase 1H-W dashboard chip, the Phase 1H-V acknowledgement chip, the Phase 1H-U print-only line, the Phase 1H-T reference helper, the Phase 1H-S Methodology Change Log per-entry permalink line, the Phase 1H-R Verify-now journal CSV export, the Phase 1H-Q print-only audit lines, the Phase 1H-P journal storage shape and Markdown export, the Phase 1H-O Methodology Change Log export, every prior copy-preamble surface, the deterministic scorecard Markdown report, golden snapshots, receipt hashing, storage shape, methodology lineage semantics, source-attribution logic, provenance trail logic, targeted-rescore logic, and targeted-rescore comparison logic are all unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-Y
Change date: 2026-04-29
Product version: 0.43.0
Methodology engine version: 0.9.1
Browser-only since-you-last-visited awareness sub-line on the Improvement Timeline page
Reason: Phase 1F-F added the Improvement Timeline page; Phase 1H-W added a dashboard-level Methodology Change Log acknowledgement chip; Phase 1H-X added a per-scorecard lineage awareness sub-line that reuses the same marker. The remaining timeline-awareness gap is that a founder can open an agent's Improvement Timeline without seeing a calm signal that some saved versions were generated under a methodology older than methodology entries new since the founder last marked the Methodology Change Log as seen. A small browser-only awareness sub-line on the timeline summary panel closes that gap by reusing the existing Phase 1H-W marker.
What changed: The Phase 1F-F Improvement Timeline summary panel now renders one calm browser-only sub-line ("N timeline versions may benefit from reviewing methodology entries new since you last visited") when the Methodology Change Log contains entries that are both newer than the timeline's latest version stamps and strictly newer than the Phase 1H-W acknowledgement marker. The line pluralises correctly ("1 timeline version may benefit" vs "N timeline versions may benefit") and never renders when no qualifying entries exist. The line reuses the Phase 1H-W local storage key agentproof.methodology_ack.v1; no new acknowledgement key was added and no schema migration was required. The timeline page never writes the acknowledgement (the Phase 1H-W dashboard chip remains the only Mark-as-seen surface), never mutates the static changelog content, never mutates timeline data, never adds an API route, never imports Supabase, never measures usage or behaviour, and never sends anything to AgentProof. The conservative count uses timeline.version_count under a documented rule (per-row methodology version stamps are not exposed on timeline rows today, so the helper cannot discriminate per row); the "may benefit" wording does not overclaim.
The Phase 1H-X per-scorecard methodology lineage awareness sub-line, the Phase 1H-W dashboard methodology chip, the Phase 1H-V verification-journal acknowledgement chip, the Phase 1H-U print-only Journal reference line on the Verification Report surface, the Phase 1H-T journal reference helper, the Phase 1H-S Methodology Change Log per-entry permalink line, the Phase 1H-R Verify-now journal CSV export shape, the Phase 1H-Q print-only audit lines, the Phase 1H-P Verification Journal storage shape and Markdown export, the Phase 1H-O Methodology Change Log export, every prior copy-preamble surface (1H-L, 1H-M, 1H-N), the Phase 1H-K Verification Report filename, the Phase 1H-J memo and timeline filename slugs, the Phase 1H-I timeline Markdown targeted-state body, the Phase 1F-F timeline calculation, the engine, scoring, methodology lineage semantics, source-attribution, provenance trail, targeted-rescore, and targeted-rescore comparison logic are all unchanged. The deterministic scorecard Markdown report and all 13 golden Markdown snapshots remain byte-identical.
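The two-condition rule and the pluralised sub-line described above can be sketched roughly as follows. This is an illustration only, not AgentProof's helper: the type names, the `sequence` ordering field, and the function names are all assumptions introduced here, and the sketch treats a missing acknowledgement as "nothing seen yet".

```typescript
// Hypothetical shapes; none of these names are taken from the AgentProof codebase.
interface ChangelogEntry {
  id: string;       // e.g. "1H-Y"
  sequence: number; // assumed: monotonically increasing publication order
}

interface TimelineSummary {
  versionCount: number;        // stands in for timeline.version_count
  latestStampSequence: number; // assumed: ordinal of the timeline's latest version stamp
}

// An entry qualifies only if it is newer than the timeline's latest stamp
// AND strictly newer than the acknowledgement marker (null = never acknowledged).
function qualifyingEntries(
  entries: ChangelogEntry[],
  timeline: TimelineSummary,
  ackSequence: number | null
): ChangelogEntry[] {
  return entries.filter(
    (entry) =>
      entry.sequence > timeline.latestStampSequence &&
      (ackSequence === null || entry.sequence > ackSequence)
  );
}

// Returns the sub-line text, or null when nothing should render.
function timelineAwarenessLine(
  entries: ChangelogEntry[],
  timeline: TimelineSummary,
  ackSequence: number | null
): string | null {
  if (timeline.versionCount === 0) return null;
  if (qualifyingEntries(entries, timeline, ackSequence).length === 0) return null;
  // Conservative rule: without per-row stamps, every timeline version is counted.
  const n = timeline.versionCount;
  const noun = n === 1 ? "timeline version" : "timeline versions";
  return `${n} ${noun} may benefit from reviewing methodology entries new since you last visited`;
}
```

The function is pure and read-only: it takes the acknowledgement marker as an argument and never writes it, which mirrors the rule that only the dashboard chip marks the log as seen.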
User impact: Founders opening an agent's Improvement Timeline now see a calm one-line awareness signal under the timeline summary panel whenever methodology entries that shipped after the timeline are also new since they last marked the Methodology Change Log as seen on the dashboard. The sub-line is browser-only: it never leaves the user's machine, is never used to count product usage, and is never used to measure user behaviour off-device.
When to re-score: Not required. Phase 1H-Y adds a single read-only browser-only awareness sub-line to the existing Improvement Timeline summary panel; the engine, scoring, receipt, scorecard Markdown body, downloaded filename, timeline calculation, and deterministic scoring outputs are unchanged.
Report wording update
Training / documentation update
Evidence trace: Phase 1H-Y authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-Y internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 1H-Y authorisation
- Impact assessment
- Adds a deterministic timeline awareness helper plus a read-only sub-line on the Phase 1F-F Improvement Timeline summary panel. The helper consumes only the timeline version_count and version_stamps, the static Methodology Change Log content, and the Phase 1H-W acknowledgement marker; no new localStorage key is added and no schema migration is required. The timeline surface never writes the acknowledgement, never mutates the static changelog content, never mutates timeline data, never adds an API route, never imports Supabase, never measures usage or behaviour, and never sends anything off-device. The conservative count rule uses timeline.version_count and the may benefit wording does not overclaim. The Phase 1H-X per-scorecard lineage awareness, the Phase 1H-W dashboard chip, the Phase 1H-V acknowledgement chip, the Phase 1H-U print-only line, the Phase 1H-T reference helper, the Phase 1H-S Methodology Change Log per-entry permalink line, the Phase 1H-R Verify-now journal CSV export, the Phase 1H-Q print-only audit lines, the Phase 1H-P journal storage shape and Markdown export, the Phase 1H-O Methodology Change Log export, every prior copy-preamble surface, the deterministic scorecard Markdown report, the timeline Markdown export and filename, golden snapshots, receipt hashing, storage shape, methodology lineage semantics, source-attribution logic, provenance trail logic, targeted-rescore logic, and targeted-rescore comparison logic are all unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-Z
Change date: 2026-04-29
Product version: 0.44.0
Methodology engine version: 0.9.1
Per-row methodology version stamps tighten the Phase 1H-Y Improvement Timeline awareness count
Reason: Phase 1H-Y added a calm browser-only Improvement Timeline awareness sub-line, but the count had to be conservative because the timeline rows did not yet carry per-row methodology version stamps. Phase 1H-Z lifts that limitation by adding optional row-level version stamps sourced from each saved scorecard's stored versions block, so the awareness helper can return a tighter count when row-level data exists while preserving the calm conservative fallback for older rows that do not carry stamps.
What changed: Timeline rows now carry two optional methodology version stamps, populated only from each saved scorecard's existing versions block at timeline-build time. No values are invented, no values are read from package.json, and no values are backfilled from elsewhere. Rows whose source scorecard lacks a stamp keep that field absent. The Phase 1H-Y timeline awareness helper now accepts optional row-level stamps and uses them for a precise per-row count when present; rows without stamps preserve the Phase 1H-Y conservative fallback within the same call. When no row-level stamps are provided, the helper returns the original Phase 1H-Y conservative count exactly as before. The timeline UI continues to render the same calm wording ("1 timeline version may benefit from reviewing methodology entries new since you last visited" in the singular and "N timeline versions may benefit from reviewing methodology entries new since you last visited" in the plural) and never adds a Mark-as-seen button (the Phase 1H-W dashboard chip remains the only Mark-as-seen surface). The deterministic Phase 1H-I Improvement Timeline Markdown export is byte-identical: the export body reads only the row fields it already used and the new optional fields are never rendered. The Phase 1H-J timeline filename helper, the Phase 1H-M timeline copy preamble, the Phase 1H-X per-scorecard lineage awareness, the Phase 1H-W dashboard chip, the Phase 1H-V acknowledgement chip, the Phase 1H-U print-only line, the Phase 1H-T reference helper, the Phase 1H-S Methodology Change Log per-entry permalink line, the Phase 1H-R Verify-now journal CSV export, the Phase 1H-Q print-only audit lines, the Phase 1H-P journal storage shape and Markdown export, the Phase 1H-O Methodology Change Log export, every prior copy-preamble surface, the engine, scoring, methodology lineage semantics, source-attribution, provenance trail, targeted-rescore, and targeted-rescore comparison logic are all unchanged.
The deterministic scorecard Markdown report and all 13 golden Markdown snapshots remain byte-identical.
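A minimal sketch of the mixed precise-plus-fallback count may help here. It is not the shipped helper: the row shape and names are hypothetical, and for brevity the two optional version stamps are collapsed into a single assumed ordinal, `methodologySequence`.

```typescript
// Hypothetical row shape; the real timeline row fields are not shown in this log.
interface TimelineRow {
  label: string;
  methodologySequence?: number; // assumed ordinal stamp; absent on older rows
}

// newestQualifyingSequence: ordinal of the newest changelog entry that is unseen,
// or null when no entry qualifies (in which case nothing is counted).
function countRowsThatMayBenefit(
  rows: TimelineRow[],
  newestQualifyingSequence: number | null
): number {
  if (newestQualifyingSequence === null) return 0;
  let count = 0;
  for (const row of rows) {
    if (row.methodologySequence === undefined) {
      // Conservative fallback: an unstamped row is always counted.
      count += 1;
    } else if (row.methodologySequence < newestQualifyingSequence) {
      // Precise path: the row was generated before the newest unseen entry.
      count += 1;
    }
  }
  return count;
}
```

When no row carries a stamp, the result equals `rows.length`, which matches the earlier conservative count; stamped rows can only lower the number, never raise it.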
User impact: Founders opening an agent's Improvement Timeline now see a more accurate count of versions that may benefit from reviewing methodology entries new since they last marked the Methodology Change Log as seen on the dashboard. The wording is unchanged. No values are invented; older rows without stored versions retain the calm conservative treatment. The acknowledgement is browser-only: it never leaves the user's machine, is never used to count product usage, and is never used to measure user behaviour off-device.
When to re-score: Not required. Phase 1H-Z adds optional per-row metadata to the timeline output and tightens the Phase 1H-Y awareness count. The engine, scoring, receipt, scorecard Markdown body, downloaded filename, timeline Markdown export, timeline filename, and deterministic scoring outputs are unchanged.
Report wording update
Training / documentation update
Evidence trace: Phase 1H-Z authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-Z internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 1H-Z authorisation
- Impact assessment
- Adds two optional fields to the existing Phase 1F-F timeline row shape and updates the Phase 1H-Y awareness helper to use them when present, with a deterministic fallback for rows without them. The helper signature stays backwards-compatible (the row-stamps argument is optional). Row values are sourced from each saved scorecard's stored versions block only: never invented, never read from package.json, never backfilled. The deterministic timeline Markdown export, the timeline filename helper, the timeline copy preamble, the per-scorecard lineage awareness, the dashboard methodology chip, the verification-journal acknowledgement chip, the print-only line, the journal reference helper, the Methodology Change Log per-entry permalink line, the Verify-now journal CSV export, the print-only audit lines, the journal storage shape and Markdown export, the Methodology Change Log export, every prior copy-preamble surface, the deterministic scorecard Markdown report, golden snapshots, receipt hashing, storage shape, methodology lineage semantics, source-attribution logic, provenance trail logic, targeted-rescore logic, and targeted-rescore comparison logic are all unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2A
Change date: 2026-04-29
Product version: 0.45.0
Methodology engine version: 0.9.1
Test-side hardening — central wording registry and export-regression harness
Reason: Phase 1E through Phase 1H accumulated wording-discipline rules per phase: a list of forbidden public-facing legal-overreach claims, a list of alarmist words, an OWASP overreach list, raw enum identifiers, internal helper-name leaks, internal file-path leaks, and a list of operational signals AgentProof does not produce. Each prior phase wired its own per-phase guard. Before any future paid or public Phase 1D work, those guards need a single source of truth so a future regression in any export body or rendered surface is caught immediately and consistently across every prior surface in one place.
What changed: Phase 2A is a test-side hardening phase. No product features were added. A central forbidden-wording registry was added at the tests support layer, grouping the accumulated rules into thirteen reusable lists across legal-overreach, alarmist, OWASP overreach, operational, raw mismatch, raw targeted-rescore, raw readiness, raw context, raw source-attribution, raw methodology, helper-name leaks, file-path leaks, and service-role references. Reusable assertion helpers were added under the same support layer. A central public-surface harness sweeps every documented Markdown and CSV export body and every filename helper for every wording group. An export-determinism harness double-renders every deterministic helper to catch wall-clock drift. A sample-matrix preservation guard pins all 13 sample outcomes (score, readiness, cap, primary context, engine and context-pack versions, receipt headline shape). A route and dependency safety guard pins the approved API route inventory and asserts no payments, error-monitoring, usage-measurement, PDF generation, or tour-onboarding dependencies are present. SHA-256 hash snapshots lock in cross-session drift detection for representative export bodies via Vitest snapshots. As part of the hardening sweep, the methodology page now applies the existing Phase 1H-O export-safe sanitiser to its on-page rendering of historical free-text changelog fields, so older entries cannot leak internal helper names or implementation paths to public on-page output the same way the export already protects against. 
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the print audit-line logic, and the methodology acknowledgement awareness logic are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
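As a rough illustration of the central registry plus reusable assertion helper pattern, a sketch could look like the following. The group names follow the list above, but every term, the registry name, and both function names are placeholders invented for this example; the real registry contents are not published in this log.

```typescript
// Hypothetical registry: group names follow the changelog, terms are placeholders.
const FORBIDDEN_WORDING: Record<string, readonly string[]> = {
  "legal-overreach": ["guaranteed compliant"],
  "alarmist": ["catastrophic"],
  "file-path leaks": ["lib/report/"],
};

interface WordingViolation {
  group: string;
  term: string;
}

// Case-insensitive substring sweep of one export body against every group.
function findForbiddenWording(body: string): WordingViolation[] {
  const haystack = body.toLowerCase();
  const hits: WordingViolation[] = [];
  for (const group of Object.keys(FORBIDDEN_WORDING)) {
    for (const term of FORBIDDEN_WORDING[group]) {
      if (haystack.indexOf(term.toLowerCase()) !== -1) {
        hits.push({ group, term });
      }
    }
  }
  return hits;
}

// Reusable assertion: throws with the surface name and every offending term.
function assertCleanSurface(surfaceName: string, body: string): void {
  const hits = findForbiddenWording(body);
  if (hits.length > 0) {
    const detail = hits.map((h) => `${h.group}: "${h.term}"`).join(", ");
    throw new Error(`${surfaceName} contains forbidden wording (${detail})`);
  }
}
```

A public-surface harness in this style would call `assertCleanSurface` once per export body and filename, so a new surface only needs one extra call to inherit every wording group.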
User impact: Founders see no on-screen change. The hardening prevents a future code change from silently introducing forbidden public-facing wording, raw enum leaks, helper-name leaks, file-path leaks, or accidental drift in any deterministic export body. Goldens, sample outcomes, and every prior phase's user-facing surface remain byte-identical.
When to re-score: Not required. Phase 2A is test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filename, timeline export, memo export, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2A authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2A internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 2A authorisation
- Impact assessment
- Adds a centralised wording registry, reusable assertion helpers, a public-surface harness, an export-determinism harness, a sample-matrix preservation guard, a route and dependency safety guard, and SHA-256 hash snapshots for representative export bodies. No product features. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No API route change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2B
Change date: 2026-04-29
Product version: 0.46.0
Methodology engine version: 0.9.1
Centralise approved user-facing friendly labels into a single source of truth
Reason: Phase 2A centralised the forbidden wording into one source of truth. Phase 2B is the matching approved-side single-source-of-truth: every friendly label that the product renders publicly was previously declared inline in the helper that emitted it (with two pairs duplicated across files). Centralising the approved wording lets a future label change happen deliberately in one place, guarded by a registry-completeness test, a cleanliness test that sweeps the new registry against the Phase 2A forbidden vocabulary, and the unchanged Phase 2A SHA-256 export-hash snapshots that pin every export body byte-identical.
What changed: Phase 2B is a refactor. A central friendly-labels registry now holds every approved label set in one place: softened readiness labels, literal receipt readiness labels, rating-cap labels, primary-context labels, friendly mismatch labels, targeted-rescore dashboard labels, targeted-rescore comparison direction labels, source-attribution labels, reviewed-by-role labels, Methodology Change Log change-type labels, and Methodology Change Log affected-area labels. Existing helper functions remain exported from their current modules and now delegate to the central maps. Two previously-duplicated direction-label maps (one in the targeted-rescore comparison helper and one in the memo preamble helper) now share the single central map. The bytes of every label are byte-identical to the pre-Phase-2B inline declarations, pinned by a new registry-completeness test and verified by the unchanged Phase 2A SHA-256 export hash snapshots, the existing 13 scorecard golden Markdown snapshots, the Phase 2A sample-matrix guard, and every prior phase wording-consistency test. A new cleanliness test sweeps every centralised label against the Phase 2A forbidden-wording vocabulary so a future helpful-looking edit cannot leak any forbidden public-facing term, raw enum identifier, helper name, file-path fragment, or service-role reference into a user-facing label. 
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, and the print audit-line logic are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
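The delegation pattern described above, where two previously duplicated direction-label maps share one central map, might be sketched like this. The enum values, label strings, and function names are placeholders chosen for illustration and are not the product's actual labels.

```typescript
// Hypothetical direction enum and labels; the real values are not listed in this log.
type ComparisonDirection = "improved" | "regressed" | "unchanged";

const ALL_DIRECTIONS: ComparisonDirection[] = ["improved", "regressed", "unchanged"];

// Central registry: the single place where a label may be edited.
const FRIENDLY_LABELS: { comparisonDirection: Record<ComparisonDirection, string> } = {
  comparisonDirection: {
    improved: "Improved",
    regressed: "Needs another look",
    unchanged: "No change",
  },
};

// Two helpers that previously held their own copies now delegate to the same map.
function comparisonDirectionLabel(direction: ComparisonDirection): string {
  return FRIENDLY_LABELS.comparisonDirection[direction];
}

function memoPreambleDirectionLabel(direction: ComparisonDirection): string {
  return FRIENDLY_LABELS.comparisonDirection[direction];
}

// Registry-completeness check: every enum value must map to a non-empty label.
function directionsMissingLabels(): ComparisonDirection[] {
  return ALL_DIRECTIONS.filter((direction) => {
    const label = FRIENDLY_LABELS.comparisonDirection[direction];
    return typeof label !== "string" || label.trim().length === 0;
  });
}
```

Keeping the helper functions exported from their original modules means callers are untouched by the refactor; only the source of the strings moves.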
User impact: Founders see no on-screen change. The refactor brings approved wording under one source of truth so future label edits happen deliberately in one place with full test coverage.
When to re-score: Not required. Phase 2B is a refactor; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2B authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2B internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 2B authorisation
- Impact assessment
- Centralises every approved user-facing friendly label into a single registry, with helper APIs delegating to the central maps and tests pinning the bytes byte-identical. No product features. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No API route change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2C
Change date: 2026-04-29
Product version: 0.47.0
Methodology engine version: 0.9.1
Centralise approved long-form prose strings into a single source of truth
Reason: Phase 2A centralised the forbidden wording into one source of truth and Phase 2B centralised the approved short-form friendly labels. The remaining duplicated string content was the long-form prose: the standard non-legal-advice disclaimer (declared inline in many helpers and components), the Stored in this browser only short caption, the Verification Journal Markdown footer note, the dashboard Recent verifications panel reassurance sentence, the panel browser-only footer caption, and the three Phase 1H-Q print audit-context lines. Centralising these prose strings completes the approved-side single-source-of-truth pattern: future prose edits happen deliberately in one file and are guarded by a registry-completeness test, a cleanliness test, and the unchanged Phase 2A SHA-256 export-hash snapshots.
What changed: Phase 2C is a refactor. A central standard-copy registry now holds every long-form prose constant the product previously hardcoded inline: the standard non-legal disclaimer, the Markdown italic-wrapped disclaimer line, the Stored in this browser only caption, the Verification Report print-only journal-stored parenthetical, the journal-data-not-sent reassurance sentence, the browser-only-no-external-service caption, the Verification Journal Markdown footer note, and the three print audit-context lines (compare / memo / timeline). Existing helper functions and component APIs are unchanged; the inline string declarations in lib/report and components have been replaced by imports from the central registry. The bytes of every constant are byte-identical to the pre-Phase-2C inline declarations, pinned by a new registry-completeness test and verified by the unchanged Phase 2A SHA-256 export-hash snapshots, the existing 13 scorecard golden Markdown snapshots, the Phase 2A sample-matrix guard, and every prior phase wording-consistency test. A new cleanliness test sweeps every centralised prose constant against the Phase 2A forbidden-wording vocabulary so a future helpful-looking edit cannot leak any forbidden public-facing claim, alarmist word, OWASP-overreach phrase, internal helper name, file-path fragment, raw enum identifier, or service-role reference into a centralised prose string. 
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, and the print audit-line logic are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
User impact: Founders see no on-screen change. The refactor brings approved long-form prose under one source of truth so future prose edits happen deliberately in one place with full test coverage.
When to re-score: Not required. Phase 2C is a refactor; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2C authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2C internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 2C authorisation
- Impact assessment
- Centralises every approved long-form user-facing prose string into a single registry, with the inline declarations replaced by imports and the bytes pinned byte-identical by tests. No product features. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No API route change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2D
Change date: 2026-04-29
Product version: 0.48.0
Methodology engine version: 0.9.1
Full-matrix SHA-256 export-body snapshot coverage across the approved sample matrix
Reason: Phase 2A added representative SHA-256 export-body hash snapshots covering one sample per export type. The 13 scorecard golden Markdown snapshots already byte-pin the scorecard report body for every sample. Phase 2D extends snapshot coverage to every deterministic export body for every sample, plus a small set of representative cross-version artefacts, so that even single-byte drift in any export body is caught across the full matrix, not just for the representative subset.
What changed: Phase 2D is a test-side hardening phase. No product features were added. A new full-matrix snapshot test computes deterministic SHA-256 hashes across every approved sample and a small set of representative cross-version artefacts: scorecard report Markdown body and clipboard for all 13 samples; Verification Report Markdown body and clipboard for all 13 samples; four Verification Journal status fixtures (verified, mismatch, unable-to-verify-here, receipt-unavailable) across Markdown body, clipboard, and CSV; one mixed-status journal across the same three export types; four Improvement Memo direction shapes across Markdown body and clipboard; three Improvement Timeline shapes across Markdown body and clipboard; and the live Methodology Change Log Markdown body and clipboard. All caller-supplied timestamps are fixed so the hashes are stable across sessions. The matrix is captured via Vitest snapshots; future drift trips the test and forces a deliberate update. Determinism is verified by a self-check that calls the matrix builder twice and asserts equality. A self-check also asserts every hash is a 64-character lowercase hex SHA-256 digest, no duplicate keys exist, the scorecard hashes are distinct across the 13 samples, the four journal status hashes are distinct, the four memo direction hashes are distinct, and every key follows the documented naming convention. 
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
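The matrix self-checks listed above (build twice and compare, digest shape, distinctness within a group) can be sketched as below. This is an assumption-laden illustration: `HashMatrix`, `checkHashMatrix`, `hashesAreDistinct`, and the example key format are hypothetical, and the hashing itself is left to whatever builder the caller supplies.

```typescript
// Hypothetical matrix shape: export key -> SHA-256 digest as a hex string.
type HashMatrix = Record<string, string>;

const SHA256_HEX = /^[0-9a-f]{64}$/;

// Builds the matrix twice and reports every problem found; an empty list means clean.
function checkHashMatrix(build: () => HashMatrix): string[] {
  const first = build();
  const second = build();
  const problems: string[] = [];
  const keys = Object.keys(first).sort();
  if (keys.join("\n") !== Object.keys(second).sort().join("\n")) {
    problems.push("key set differs between two builds");
  }
  for (const key of keys) {
    if (!SHA256_HEX.test(first[key])) {
      problems.push(`${key}: not a 64-character lowercase hex digest`);
    }
    if (first[key] !== second[key]) {
      problems.push(`${key}: hash differs between two builds`);
    }
  }
  return problems;
}

// Distinctness check for one group of keys, e.g. the per-sample scorecard hashes.
function hashesAreDistinct(matrix: HashMatrix, keys: string[]): boolean {
  const seen: Record<string, boolean> = {};
  for (const key of keys) {
    const hash = matrix[key];
    if (seen[hash]) return false;
    seen[hash] = true;
  }
  return true;
}
```

Because every timestamp is caller-supplied and fixed, a builder that passes this check twice in one run should also produce the same snapshot in a later session; the snapshot file then catches drift that a same-run comparison cannot.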
User impact: Founders see no on-screen change. The hardening prevents a future code change from silently introducing even single-byte drift in any deterministic export body for any sample. Goldens, sample outcomes, and every prior phase's user-facing surface remain byte-identical.
When to re-score: Not required. Phase 2D is test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2D authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2D internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 2D authorisation
- Impact assessment
- Adds 83 deterministic SHA-256 export-body hashes captured via Vitest snapshots across the full approved sample matrix. Companion to the existing 13 scorecard golden Markdown snapshots and the Phase 2A representative export hash snapshots. No product features. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No API route change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2E
Change date: 2026-04-29
Product version: 0.49.0
Methodology engine version: 0.9.1
Deterministic seeded fuzzing harness for engine inputs across the documented input grammar
Reason: Phases 2A through 2D pinned what AgentProof already emits across hand-curated samples and representative artefacts. Phase 2E exercises what AgentProof could emit across a broad synthetic input space so the engine, the reproducibility receipt path, the deterministic Markdown renderer, and the public-surface wording protections stay stable across many well-typed synthetic agent inputs.
What changed: Phase 2E is a test-side hardening phase. No product features were added. A new pure deterministic seeded fuzzer at the tests support layer generates valid agent inputs across twelve documented risk profiles (minimal, low internal, medium internal, customer facing, public facing, sensitive data heavy, high autonomy tool using, decision support, prompt injection exposed, many tool, no tool, maximal realistic). The fuzzer uses a small mulberry32-based RNG implemented inside the test code so no new runtime dependency is added. A new fuzzing integration test runs 1,000 deterministic synthetic inputs through the full engine plus receipt plus Markdown path and asserts: every input passes the existing Zod schema; the engine never throws; the score is a finite integer in zero to one hundred; readiness rating and rating cap are one of the documented enum values; engine version and context packs version match the documented constants; the reproducibility receipt is present with valid 64-character lowercase hex SHA-256 hashes; verify reproducibility passes for the unchanged scorecard plus inputs plus markdown triple; scoring and rendering are byte-stable across two consecutive calls; the rendered Markdown body is clean of every Phase 2A forbidden term, raw enum identifier, helper name, and file-path fragment; a 50-input representative subset also passes the cleanliness sweep on the scorecard report clipboard, Verification Report Markdown body, and Verification Report clipboard. A new boundary-cases integration test exercises every documented profile as an explicit hand-built boundary case with the same invariants. A new generator-quality unit test asserts generator determinism (same seed yields same input), schema compliance for the first 100 seeds, full enum coverage across a 200-seed sweep, profile-specific shape invariants, and synthetic-prose cleanliness against the Phase 2A registry. 
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, and the standard-copy registry are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
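The seeded-generator pattern described in this entry can be sketched as follows. This is a minimal illustration, not AgentProof's test code: the profile identifiers, the `SyntheticAgentInput` shape, and the helper names `generateInput` and `isSha256Hex` are hypothetical. Only the mulberry32 routine itself follows the widely circulated public-domain algorithm.

```typescript
// Illustrative sketch only: profile identifiers and input fields are hypothetical.

/** mulberry32: a small seeded 32-bit PRNG that returns floats in [0, 1). */
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Twelve profiles, named after the list in the entry (identifiers are assumed).
const PROFILES = [
  "minimal", "low_internal", "medium_internal", "customer_facing",
  "public_facing", "sensitive_data_heavy", "high_autonomy_tool_using",
  "decision_support", "prompt_injection_exposed", "many_tool", "no_tool",
  "maximal_realistic",
] as const;
type Profile = (typeof PROFILES)[number];

interface SyntheticAgentInput {
  seed: number;
  profile: Profile;
  toolCount: number;
}

/** Same seed in, same input out: the generator holds no hidden state. */
function generateInput(seed: number): SyntheticAgentInput {
  const rand = mulberry32(seed);
  const profile = PROFILES[Math.floor(rand() * PROFILES.length)];
  let toolCount = Math.floor(rand() * 4); // 0..3 tools by default
  if (profile === "no_tool") toolCount = 0; // profile-specific shape invariant
  if (profile === "many_tool") toolCount = 8 + Math.floor(rand() * 8);
  return { seed, profile, toolCount };
}

/** Receipt hashes are described as 64-character lowercase hex SHA-256 values. */
function isSha256Hex(value: string): boolean {
  return /^[0-9a-f]{64}$/.test(value);
}
```

Because the RNG is rebuilt from the seed on every call, a failing seed can be replayed in isolation, which is what makes a 1,000-input sweep debuggable.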
User impact: Founders see no on-screen change. The hardening prevents a future code change from silently breaking the engine on any well-typed synthetic input across the documented input grammar. Goldens, sample outcomes, and every prior phase user-facing surface remain byte-identical.
When to re-score: Not required. Phase 2E is test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2E authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2E internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 2E authorisation
- Impact assessment
- Adds a pure deterministic seeded fuzzer plus three new test files (one fuzzing integration test running 1,000 synthetic inputs, one boundary-cases integration test covering every documented profile, and one generator-quality unit test). The fuzzer uses an internal mulberry32 RNG and adds no new runtime dependency. No product features. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No API route change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2F
Change date: 2026-04-29
Product version: 0.50.0
Methodology engine version: 0.9.1
Backwards-compatibility and public-contract shape harness
Reason: Phases 2A through 2E pinned the wording, label, prose, export-byte, and synthetic-input surfaces. Phase 2F extends that protection to the persistent and public contract shapes: stored saved scorecards, browser-local envelopes for the verification journal and acknowledgement markers, the reproducibility receipt schema, the methodology changelog and intelligence radar JSON shapes, and the public exports the rest of the product imports. The goal is that future phases cannot accidentally break older saved scorecards, older browser-local envelopes, or the documented public contract surface used by the dashboard, results page, compare view, memo, timeline, and verification report.
What changed: Phase 2F is a test-side hardening phase. No product features were added. A new contract-shape fixtures module at the tests support layer carries synthetic, audit-safe fixtures across saved scorecard envelopes (current, pre-alias, pre-receipt, pre-targeted-rescore, inputs-purged, malformed-metadata), saved scorecard summary variants, reproducibility receipts (current, missing, unknown-version), verification journal envelopes (current, empty, malformed, over-cap, missing-fields, non-JSON), verification journal acknowledgement envelopes, methodology acknowledgement envelopes, internal and external and future-https and non-http(s)-URL methodology changelog entries, and Supabase row shapes (current, no-targeted-rescore, malformed-targeted-rescore, no-receipt).
A new contract-shape compatibility unit test exercises the real runtime helpers across every fixture and asserts that every saved scorecard variant parses, lists, and summarises without throwing; pre-alias rows surface no alias on the summary; pre-receipt rows surface a has receipt false flag; pre-targeted-rescore rows derive the documented enum from inputs; inputs-purged rows return unable to assess; malformed metadata is dropped defensively; update alias never touches targeted-rescore fields; the scorecard storage parser tolerates a malformed local storage payload; verify reproducibility returns the documented receipt-missing, receipt-version-unknown, and ok-on-current paths; the verification journal helpers tolerate every malformed-payload variant, enforce the documented retention cap, dedupe same-click double-appends, and never store raw inputs or scorecard Markdown; the journal acknowledgement and methodology acknowledgement envelopes round-trip correctly and tolerate every malformed-payload variant; clearing the methodology acknowledgement does not touch the journal envelopes and vice versa; every live methodology changelog entry has every required field; every source attribution block has every required field; internal entries have null source URL; non-internal entries with a source URL pass the documented safe-URL check; the safe-URL helper rejects every documented non-http(s) input including javascript, data, file, ftp, scheme-less, empty, whitespace, null, and undefined; the documented Supabase row defensive parser drops out-of-enum targeted-rescore values; update alias never touches targeted-rescore columns.
A new public-contract export unit test pins the public-contract export surface across the friendly-labels, standard-copy, reproducibility, receipt-labels, targeted-rescore, methodology source-attribution, verification journal, methodology acknowledgement, scorecard timeline, comparison memo, and verification report modules. A new content-shape contract unit test locks the top-level shape of the methodology changelog, intelligence radar, and versions JSON files, and asserts no service-role API field name, Supabase service role environment-variable name, runtime API key constant, or build artefact path appears in any content file. A new local storage key registry unit test locks the four documented local storage keys, asserts every key starts with the product namespace, has an explicit version suffix, contains no whitespace, is not a generic word like data or store, is unique across the whole product, that no new key was added in Phase 2F, that the journal and acknowledgement keys are separate, and that clearing one browser-local store does not clear the others.
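The local storage key registry rules above lend themselves to a small checker. The sketch below is an assumption-laden illustration: the key names, the `agentproof` namespace string, the dot separator, and the `.vN` version-suffix convention are all hypothetical, since this entry does not list the four documented keys.

```typescript
// Hypothetical key names and conventions; the four documented keys are not listed in this entry.
const GENERIC_WORDS = new Set(["data", "store"]);

/** Returns one human-readable problem per violated rule; an empty array means the registry is clean. */
function checkStorageKeys(keys: readonly string[], namespace: string): string[] {
  const problems: string[] = [];
  const seen = new Set<string>();
  for (const key of keys) {
    if (!key.startsWith(namespace + ".")) problems.push(key + ": missing product namespace");
    if (!/\.v\d+$/.test(key)) problems.push(key + ": missing explicit version suffix");
    if (/\s/.test(key)) problems.push(key + ": contains whitespace");
    if (GENERIC_WORDS.has(key)) problems.push(key + ": generic word");
    if (seen.has(key)) problems.push(key + ": duplicate key");
    seen.add(key);
  }
  return problems;
}

// Assumed example registry with four separate stores.
const EXAMPLE_KEYS = [
  "agentproof.saved_scorecards.v1",
  "agentproof.verification_journal.v1",
  "agentproof.journal_acknowledgement.v1",
  "agentproof.methodology_acknowledgement.v1",
];
```

Returning a list of problems rather than throwing on the first one keeps a failing test report readable when several rules break at once.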
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
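The safe-URL check that this entry's compatibility test exercises can be approximated with the standard `URL` parser. The helper name `isSafeHttpUrl` is hypothetical and the real helper may apply further rules; this sketch covers only the rejection cases named above (javascript, data, file, ftp, scheme-less, empty, whitespace, null, and undefined).

```typescript
// Hypothetical helper name; AgentProof's real safe-URL helper is not shown in this entry.
function isSafeHttpUrl(value: unknown): boolean {
  // null, undefined, non-strings, empty strings, and whitespace-only strings are rejected up front.
  if (typeof value !== "string" || value.trim() === "") return false;
  try {
    const parsed = new URL(value);
    // Only the http and https schemes are allowed through.
    return parsed.protocol === "http:" || parsed.protocol === "https:";
  } catch {
    // Scheme-less or otherwise unparseable input lands here.
    return false;
  }
}
```

Parsing first and then comparing `protocol` avoids hand-written prefix checks, which are easy to get wrong for mixed-case schemes.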
User impact: Founders see no on-screen change. The hardening prevents a future code change from silently breaking older saved scorecards, older browser-local envelopes, the receipt schema, the methodology JSON shape, the local storage key registry, or the public-contract export surface. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2F is test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2F authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2F internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 2F authorisation
- Impact assessment
- Adds a contract-shape fixtures module plus four new unit tests (a contract-shape compatibility test exercising every fixture against the real runtime helpers, a public-contract export inventory test, a content-JSON shape contract test, and a localStorage key registry test). No product features. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No API route change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2G
Change date: 2026-04-29
Product version: 0.51.0
Methodology engine version: 0.9.1
Stable backwards-compatibility migration rehearsal harness
Reason: Phase 2F locked the persistent and public contract shapes at the level of parsing and stable exports. Phase 2G extends that protection by running historical envelope shapes end-to-end through the current render and export paths. The goal is that future phases cannot accidentally break older saved scorecards, older browser-local envelopes, or older Supabase row variants when those flow through the current dashboard summary, results-page Markdown body, timeline summary, comparison memo, verification report, and journal export pipeline.
What changed: Phase 2G is a test-side hardening phase. No product features were added. Five new integration tests were added that reuse the Phase 2F contract-shape fixture registry. The first integration test exercises every saved-scorecard envelope variant (current, pre-alias, pre-receipt, pre-targeted-rescore, inputs-purged, malformed-metadata) through the local-storage round-trip, the summary derivation, the deterministic Markdown report renderer, the receipt attach and verify helpers, the Verification Report build and Markdown render and clipboard render, and the targeted-rescore guidance helper, asserting that no path throws and every rendered surface stays clean of Phase 2A forbidden public-surface terms. The second integration test builds a heterogeneous historical-mix timeline that combines the current envelope alongside the pre-alias, pre-receipt, and pre-targeted-rescore envelopes, asserts byte-stable Markdown rendering across two consecutive calls, runs the same single-row rehearsal for every per-envelope variant individually, and exercises both the conservative-fallback and per-row paths of the timeline methodology-awareness helper. The third integration test runs the cross-product of envelope variants through the deterministic compare and memo path, asserting a non-empty Markdown body, clipboard body, and filename for every pair, and explicitly covers the inputs-purged path on either side. The fourth integration test seeds the browser-local stores with the documented malformed payloads (non-JSON, schema-version mismatch, entries-not-array, missing-fields), asserts the browser-local helpers recover cleanly, performs a fresh write and read and render cycle, and renders the journal Markdown and clipboard and CSV exports, plus the new-events summary helper. 
The fifth integration test uses focused mock-store coverage that mirrors the documented file-local row defensive parser against the four row fixtures (current, no-targeted-rescore, malformed-targeted-rescore, no-receipt), then drives the resulting saved-scorecard envelope through the live local-storage helpers and the live summary and Markdown render path. Every rendered surface across all five tests is swept with the Phase 2A central forbidden-wording registry. The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
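The malformed-payload recovery that the fourth integration test rehearses (non-JSON, schema-version mismatch, entries-not-array, missing-fields) can be illustrated with a tolerant parser. Everything here is assumed for illustration: the envelope shape, the field names, the schema version value, and the choice to drop incomplete entries rather than reject the whole envelope.

```typescript
// Hypothetical envelope shape; the real browser-local envelope is not reproduced in this entry.
interface JournalEntry {
  id: string;
  recordedAt: string;
}

interface JournalEnvelope {
  schemaVersion: 1;
  entries: JournalEntry[];
}

function emptyEnvelope(): JournalEnvelope {
  return { schemaVersion: 1, entries: [] };
}

function isJournalEntry(value: unknown): value is JournalEntry {
  if (typeof value !== "object" || value === null) return false;
  const entry = value as Record<string, unknown>;
  return typeof entry["id"] === "string" && typeof entry["recordedAt"] === "string";
}

/** Never throws: any unreadable payload degrades to an empty envelope. */
function parseJournalEnvelope(raw: string | null): JournalEnvelope {
  if (raw === null) return emptyEnvelope();
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return emptyEnvelope(); // non-JSON payload
  }
  if (typeof parsed !== "object" || parsed === null) return emptyEnvelope();
  const candidate = parsed as { schemaVersion?: unknown; entries?: unknown };
  if (candidate.schemaVersion !== 1) return emptyEnvelope(); // schema-version mismatch
  if (!Array.isArray(candidate.entries)) return emptyEnvelope(); // entries-not-array
  // Assumption: entries with missing fields are dropped individually.
  return { schemaVersion: 1, entries: candidate.entries.filter(isJournalEntry) };
}
```

A parser of this shape lets a later write, read, and render cycle proceed from a clean state instead of surfacing a storage error to the founder.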
User impact: Founders see no on-screen change. The hardening prevents a future code change from silently breaking the end-to-end render and export path for older saved scorecards, older browser-local envelopes, or older Supabase row variants. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2G is test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2G authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2G internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 2G authorisation
- Impact assessment
- Adds five new integration tests that reuse the Phase 2F contract-shape fixture registry to drive historical envelope shapes through the live render and export paths. No product features. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No API route change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2H
Change date: 2026-04-29
Product version: 0.52.0
Methodology engine version: 0.9.1
Stable deprecation-warning harness for future optional fields
Reason: Phases 2F and 2G locked the persistent and public contract shapes and rehearsed older envelopes through the current render and export paths. Phase 2H adds a calm test-side pattern for safely deprecating or renaming optional fields in the future, so a later phase that intentionally retires an old optional field has a predictable place to record the deprecation, a coverage manifest pointing at fixtures and rehearsal tests, and a checklist of guardrails to satisfy before shipping. Phase 2H itself does not deprecate any current field and does not add any runtime deprecation behaviour.
What changed: Phase 2H is a test-side hardening phase. No product features were added. A new deprecation registry support module ships an intentionally empty deprecation registry alongside an explicit marker constant that documents the empty state is deliberate, an optional-field coverage manifest pinning every optional field added since Phase 1F across saved scorecard, saved scorecard summary, timeline row, verification journal acknowledgement, methodology acknowledgement, and reproducibility receipt surfaces (each entry references the contract-shape fixture group that exercises it or the documented runtime-only sentinel for fields that exist only at build time), and a stable future-schema-addition checklist with twelve documented categories covering type, parser, fixtures, rehearsal, public exports, storage keys, wording, hash, fuzz, changelog, engine version, and context-pack version.
Three new unit tests pin the deterministic shape and rules: the deprecation registry test asserts the registry exists and is deterministic, the empty-state marker is calm founder-facing prose when the registry is empty, every entry has non-empty surface and field name and introduced-in and deprecated-in and removal-not-before and migration note and test-fixture name, replacement field name is either non-empty string or null, severity is one of the documented values (advisory, soft, hard), deprecated-in is at or after introduced-in, removal-not-before is strictly after deprecated-in, surface plus field name pairs are unique, and reason and migration note both pass the Phase 2A wording sweep; the optional-field coverage test asserts every documented entry id matches surface dot field name, every field name is non-empty snake case, every fixture coverage reference is a known Phase 2F fixture group or the documented runtime-only sentinel, required-for-golden-snapshot is hard-pinned to false, part-of-receipt-contract is true only for the documented receipt-internal fields (markdown body hash and primary context id), every persistent surface entry has rehearsal-required true, and the surface set matches the documented allow list; the future-schema-addition checklist test asserts the checklist is non-empty and deterministic, every item has a non-empty snake case id, every item has a category from the documented twelve, every rule is calm prose ending with a sentence terminator, every documented category appears at least once, and the engine-version and context-pack-version items carry the documented do-not-bump-without-cause rule. A new compatibility checklist document at docs/compatibility checklist.md is the founder-facing prose mirror of the same rules.
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
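The ordering and uniqueness rules that the deprecation registry test enforces can be sketched as a validator. This is an illustration under stated assumptions: the `DeprecationEntry` field names are hypothetical, and the sketch assumes introduced-in, deprecated-in, and removal-not-before are dotted product versions, which this entry does not confirm.

```typescript
// Hypothetical entry shape; assumes the three milestones are dotted version strings.
const SEVERITIES = ["advisory", "soft", "hard"];

interface DeprecationEntry {
  surface: string;
  fieldName: string;
  introducedIn: string;
  deprecatedIn: string;
  removalNotBefore: string;
  replacementFieldName: string | null;
  severity: string;
}

/** Numeric comparison of dotted versions: negative if a < b, zero if equal, positive if a > b. */
function compareVersions(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

function validateDeprecationRegistry(entries: readonly DeprecationEntry[]): string[] {
  const problems: string[] = [];
  const seen = new Set<string>();
  for (const e of entries) {
    const id = e.surface + "." + e.fieldName;
    if (SEVERITIES.indexOf(e.severity) === -1) problems.push(id + ": unknown severity");
    if (compareVersions(e.deprecatedIn, e.introducedIn) < 0) {
      problems.push(id + ": deprecated-in precedes introduced-in");
    }
    if (compareVersions(e.removalNotBefore, e.deprecatedIn) <= 0) {
      problems.push(id + ": removal-not-before must be strictly after deprecated-in");
    }
    if (e.replacementFieldName !== null && e.replacementFieldName.trim() === "") {
      problems.push(id + ": replacement must be a non-empty string or null");
    }
    if (seen.has(id)) problems.push(id + ": duplicate surface and field pair");
    seen.add(id);
  }
  return problems;
}
```

An empty registry validates cleanly, which matches the deliberately empty state this phase ships.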
User impact: Founders see no on-screen change. The hardening creates a calm path for any future deprecation of an optional field, so older saved scorecards keep rendering safely when a later phase eventually retires a legacy field. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2H is test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2H authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2H internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 2H authorisation
- Impact assessment
- Adds a deprecation registry support module, an optional-field coverage manifest, a future-schema-addition checklist, three new unit tests, and a founder-facing compatibility checklist document. No runtime deprecation behaviour was added. No product features. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No API route change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2I
Change date: 2026-04-29
Product version: 0.53.0
Methodology engine version: 0.9.1
Public audit bundle manifest
Reason: Phases 2A through 2H pinned wording, labels, prose, export bytes, fuzzing, contract shapes, migration rehearsal, and future optional-field discipline. Phase 2I creates a single deterministic manifest that lists every audit artefact AgentProof can export, copy, download, print, or otherwise hand to a reviewer, with one row per artefact recording founder-facing artefact name, owner surface, primary helper, body and clipboard renderers, filename helper, deterministic-body and browser-only flags, copy-preamble and download-filename flags, Phase 2A wording-harness coverage, Phase 2D hash-matrix coverage, receipt-hash inclusion, raw-inputs flag (always false), scorecard-Markdown inclusion, storage dependency, and calm notes. The manifest itself is test-side and documentation-side only.
What changed: Phase 2I is a test-side and documentation-side hardening phase. No product features were added. A new deterministic manifest at the static content layer enumerates every audit artefact across eight founder-facing groups (scorecard report, verification report, improvement memo, improvement timeline, methodology change log, verification journal, browser-local awareness, compare view), with documented entries covering Markdown bodies, clipboard variants, CSV bodies, download filenames, print-only audit lines, browser-local awareness chips and lines, the methodology entry permalink line, and the verification journal browser-local reference handle. Every entry pins the documented coverage flags so a reviewer can see at a glance which test-side hardening layers protect each artefact. Four new unit tests pin the manifest end-to-end. The shape test asserts the documented top-level shape, the documented per-entry field-shape rules, and the per-kind consistency rules including that the raw-inputs flag is always false, that scorecard Markdown is included only on the scorecard report group, that filename-kind artefacts are deterministic and carry the download-filename flag, that print-only and awareness-chip and awareness-line kinds are browser-only and outside the Phase 2D hash matrix, and that copy-preamble claims imply a clipboard renderer is set. 
The coverage test asserts every documented body, clipboard, and filename helper is a live export of the documented module (with a small set of documented sentinels for inline component helpers), every artefact claiming Phase 2D hash-matrix coverage maps to a representative Phase 2D hash key in the documented test source, every artefact claiming Phase 2A wording-harness coverage either has its helper referenced in the harness source or is documented as a UI-only or browser-local surface, every browser-only artefact's owner module has no fetch or Supabase or service-role usage, and every storage dependency is one of the four documented Phase 2F local storage keys. The cleanliness test sweeps every founder-facing prose field against the Phase 2A central forbidden-wording registry and asserts the manifest JSON contains no service-role API field name, no Supabase service role environment-variable name, no runtime API key constant, no Windows-user path, no one drive path, and no absolute Linux path. The determinism test asserts the manifest is byte-stable across two consecutive serialisations and across re-reads from disk, artefacts are grouped contiguously in a documented stable order, the manifest contains no Date-now or new-Date or random or crypto-random reference, no process-env or NODE-ENV reference, no absolute local paths, and no runtime secret placeholders. 
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
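The per-kind consistency rules that the manifest shape test asserts can be written as a single per-entry check. The sketch below is hypothetical throughout: the kind identifiers, the field names, and the `scorecard_report` group id are assumptions chosen to mirror the prose, not the manifest's actual schema.

```typescript
// Hypothetical field and kind names; the real manifest schema is not reproduced in this entry.
type ArtefactKind =
  | "markdown_body" | "clipboard" | "csv_body" | "filename"
  | "print_only" | "awareness_chip" | "awareness_line";

interface ManifestEntry {
  id: string;
  group: string;
  kind: ArtefactKind;
  deterministicBody: boolean;
  browserOnly: boolean;
  copyPreamble: boolean;
  downloadFilename: boolean;
  hashMatrixCovered: boolean;
  includesRawInputs: boolean;
  includesScorecardMarkdown: boolean;
  clipboardRenderer: string | null;
}

const BROWSER_ONLY_KINDS: ArtefactKind[] = ["print_only", "awareness_chip", "awareness_line"];

function checkManifestEntry(e: ManifestEntry): string[] {
  const problems: string[] = [];
  if (e.includesRawInputs) problems.push(e.id + ": raw-inputs flag must always be false");
  if (e.includesScorecardMarkdown && e.group !== "scorecard_report") {
    problems.push(e.id + ": scorecard Markdown belongs only to the scorecard report group");
  }
  if (e.kind === "filename" && !(e.deterministicBody && e.downloadFilename)) {
    problems.push(e.id + ": filename artefacts must be deterministic and carry the download-filename flag");
  }
  if (BROWSER_ONLY_KINDS.indexOf(e.kind) !== -1 && (!e.browserOnly || e.hashMatrixCovered)) {
    problems.push(e.id + ": print-only and awareness kinds are browser-only and outside the hash matrix");
  }
  if (e.copyPreamble && e.clipboardRenderer === null) {
    problems.push(e.id + ": a copy-preamble claim requires a clipboard renderer");
  }
  return problems;
}
```

Keeping each rule as one independent condition means a new artefact kind only needs a new branch, not a rewrite of the existing checks.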
User impact: Founders see no on-screen change. The manifest gives reviewers a single deterministic catalogue of every artefact AgentProof can hand them, plus the test-side hardening layers protecting each one. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2I is test-side and documentation-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2I authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2I internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 2I authorisation
- Impact assessment
- Adds a deterministic public audit bundle manifest at the static content layer plus four new unit tests pinning its top-level shape, per-entry rules, helper coverage against the live runtime, Phase 2D hash-matrix and Phase 2A wording-harness coverage claims, founder-facing prose cleanliness against the central wording registry, and byte-level determinism. No product features. No runtime behaviour change. The manifest does not add a new UI surface. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No API route change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2J
Change date: 2026-04-29
Product version: 0.54.0
Methodology engine version: 0.9.1
Audit artefact owner map and orphan-surface guard
Reason: Phase 2I created the public audit bundle manifest. Phase 2J closes the discovery loop. The manifest catalogues every audit artefact AgentProof can hand a reviewer; this phase adds the guardrail that fails loudly if a future phase introduces a new export, copy, download, print, or browser-local awareness surface in the codebase without adding it to the manifest, or if the manifest references an owner that no longer exists.
What changed: Phase 2J is a test-side and documentation-side hardening phase. No product features were added. A new audit artefact owner map at the test-side support layer pairs every manifest artefact with exactly one documented owner module, owner export or component, owner kind (one of exported function, exported constant, component, or inline component helper), non-empty list of test owner files, source-detection patterns the orphan scanner uses, optional documented allowed-duplicate reason, and calm founder-facing notes. The module also ships a documented broad-harness test owner list (the central public-surface wording harness, the full export hash matrix, the representative export hash snapshots, the export determinism harness, and the thirteen scorecard golden snapshots) plus a documented orphan-surface allowlist of internal helpers that are deliberately not standalone artefacts. Four new unit tests pin the integrity loop. The owner-map integrity test asserts the owner map exists and is deterministic, every manifest artefact has exactly one owner-map entry and every owner-map entry references a real manifest artefact, every owner module and every test owner file exists on disk, every per-entry shape rule holds, no duplicate owner triple exists unless allowed-duplicate-reason is documented, inline-only sentinels are well-formed and used only for inline component helpers, prose fields are calm founder-facing English clean of forbidden public-surface terms and service-role and raw-enum leakage, no entry references absolute local paths or one drive paths or environment files, and the manifest group and manifest kind echo the manifest. 
The orphan-surface discovery guard runs a deterministic source scanner over the production source tree (lib and components only) matching exported render-Markdown helpers, render-clipboard-Markdown helpers, render-CSV helpers, filename helpers, awareness-summary helpers, and the scorecard render-Markdown-report helper. The scanner asserts that every detected declaration is either covered by an owner-map detection pattern or appears in the documented allowlist, asserts that every documented detectable artefact appears in the scan (catching stale owner-map entries that point at deleted helpers), and never descends into node modules or build artefacts.
The test-owner coverage guard asserts every artefact has at least one test owner, every per-test reference names the documented owner or component or detection-pattern token (or is a documented broad-harness owner), every deterministic export artefact has hash-matrix coverage or a documented not-applicable kind, every wording-harness claim holds, every browser-only owner module has no fetch or Supabase or service-role usage, every filename artefact has a documented safe-filename test, every clipboard-preamble artefact has a body-preservation test, and every print-only artefact has a documented print-only wiring test.
The manifest-to-owner consistency test asserts that the manifest and the owner map agree on artefact identifier, group, and kind, that each owner module equals the manifest primary helper, and that the helper-presence relationships hold, with the manifest as source of truth.
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, the Phase 2I audit bundle manifest, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
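The integrity loop described above can be sketched in a few lines. This is a minimal illustration, assuming a heavily reduced shape for the manifest and the owner map; `ManifestArtefact`, `OwnerMapEntry`, `checkOwnerMap`, and every artefact, module, and helper name in the usage are hypothetical and are not taken from the AgentProof codebase. Detection patterns are treated as plain substrings for brevity.

```typescript
// Hypothetical, reduced shapes. The real manifest and owner map carry more fields.
interface ManifestArtefact {
  id: string;
}

interface OwnerMapEntry {
  artefactId: string;
  ownerModule: string;
  detectionPatterns: string[]; // plain substrings in this sketch
}

interface IntegrityReport {
  missingOwners: string[];   // manifest artefacts with no owner-map entry
  duplicateOwners: string[]; // manifest artefacts with more than one entry
  staleEntries: string[];    // owner-map entries that reference no manifest artefact
  orphanSurfaces: string[];  // detected declarations that are neither owned nor allowlisted
}

function checkOwnerMap(
  manifest: ManifestArtefact[],
  ownerMap: OwnerMapEntry[],
  detectedDeclarations: string[],
  allowlist: string[],
): IntegrityReport {
  const manifestIds = manifest.map((a) => a.id);
  const countFor = (id: string): number =>
    ownerMap.filter((e) => e.artefactId === id).length;

  // A declaration is covered if it is allowlisted or matches any detection pattern.
  const covered = (name: string): boolean =>
    allowlist.indexOf(name) !== -1 ||
    ownerMap.some((e) => e.detectionPatterns.some((p) => name.indexOf(p) !== -1));

  return {
    missingOwners: manifestIds.filter((id) => countFor(id) === 0),
    duplicateOwners: manifestIds.filter((id) => countFor(id) > 1),
    staleEntries: ownerMap
      .filter((e) => manifestIds.indexOf(e.artefactId) === -1)
      .map((e) => e.artefactId),
    orphanSurfaces: detectedDeclarations.filter((name) => !covered(name)),
  };
}

// Usage with made-up names: one artefact lacks an owner, one entry is stale,
// and one detected helper is an orphan.
const integrity = checkOwnerMap(
  [{ id: "scorecard_markdown" }, { id: "journal_csv" }],
  [
    { artefactId: "scorecard_markdown", ownerModule: "example/scorecard", detectionPatterns: ["renderScorecardMarkdown"] },
    { artefactId: "retired_export", ownerModule: "example/retired", detectionPatterns: ["renderRetired"] },
  ],
  ["renderScorecardMarkdown", "renderJournalCsv", "formatInternalDate"],
  ["formatInternalDate"],
);
console.log(integrity);
```

A guard built this way fails in both directions: a new surface without an owner shows up under `orphanSurfaces`, and an owner entry that outlives its artefact shows up under `staleEntries`.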
User impact: Founders see no on-screen change. The hardening guarantees that future phases cannot accidentally introduce a new export or copy or print or awareness surface without it appearing in the public audit bundle manifest, and cannot accidentally orphan a manifest artefact whose owner is later renamed or deleted. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2J is test-side and documentation-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2J authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2J internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 2J authorisation
- Impact assessment
- Adds an audit artefact owner map at the test-support layer plus four new unit tests pinning owner-map integrity, orphan-surface discovery, test-owner coverage, and manifest-to-owner consistency. No product features. No runtime behaviour change. The orphan-surface guard does not add any UI surface. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No API route change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation / optional-field checklist, the Phase 2I audit bundle manifest, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2K
Change date: 2026-04-29
Product version: 0.55.0
Methodology engine version: 0.9.1
Pre-Phase-1D readiness freeze checklist
Reason: Phases 2A through 2J built the full hardening floor: wording and public-surface cleanliness, friendly-label centralisation, standard-copy centralisation, deterministic export hash matrix, deterministic engine fuzzing, public-contract shape locking, migration rehearsal, deprecation discipline, audit bundle manifest, and artefact owner map and orphan-surface guard. Phase 2K consolidates those gates into one deterministic readiness freeze checklist that says what must be GREEN before any paid, public, or Phase 1D work starts.
What changed: Phase 2K is a docs-side and test-side hardening phase. No product features were added. A new deterministic readiness freeze checklist at the static content layer enumerates the gates that must be GREEN before Phase 1D begins, grouped across build and test gauntlet, archive hygiene, version and changelog sync, scorecard stability, Phase 2A hardening, Phase 2B and 2C registries, Phase 2D export hashes, Phase 2E fuzzing, Phase 2F contracts, Phase 2G migration rehearsal, Phase 2H optional-field discipline, Phase 2I audit bundle manifest, Phase 2J artefact ownership, and provider and Phase 1D safety. Every gate carries a stable identifier, a founder-facing name, a documented group and type, an evidence-source pointer to a real test or document or content file or the documented tarball-extraction sentinel, the owning phase, a non-empty list of what the gate protects, calm prose explaining what a failure means and how to remediate it, and a flag (always true in Phase 2K) that says the gate blocks Phase 1D until restored. Four new unit tests pin the checklist.
The shape test asserts the documented top-level shape, every per-gate field-shape rule, every documented enum value, unique snake-case identifiers, no duplicate gate name within a group, and that every gate is critical.
The coverage test asserts every required gate group is represented, every prior phase from 2A through 2J is represented as the owner of at least one gate, every documented gate identifier is present, and every evidence source either points at an existing file on disk or matches the documented tarball-extraction sentinel; per-type rules also assert that each gate type points at the right kind of file.
The cleanliness test sweeps every founder-facing prose field through the Phase 2A central forbidden-wording registry and asserts the founder-facing prose contains no service-role API field name, Supabase service-role environment-variable name, runtime API key constant, Windows-user path, OneDrive path, absolute Linux path, or file-path fragment.
The determinism test asserts that the checklist is byte-stable across two consecutive serialisations and across re-reads from disk, that gates are grouped contiguously in a documented stable order, and that the checklist contains no clock or random or environment dependency.
A new founder-facing readiness freeze checklist document is added at the docs layer as the prose mirror of the same gates. The audit bundle manifest's product version is bumped to 0.55.0 to stay in sync with the package version.
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard and test-owner coverage and manifest-to-owner consistency, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
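As an illustration of the shape and determinism rules listed above, the sketch below checks a gate list for snake-case identifiers, duplicates, contiguous grouping, and the always-true blocking flag, and then compares two independent serialisations. The `Gate` interface, the function names, and the sample gates are hypothetical; the real checklist carries more fields than are shown here.

```typescript
// Hypothetical gate shape, reduced to the fields the checks below need.
interface Gate {
  id: string;
  name: string;
  group: string;
  blocksPhase1D: boolean;
}

const SNAKE_CASE = /^[a-z][a-z0-9]*(_[a-z0-9]+)*$/;

function shapeProblems(gates: Gate[]): string[] {
  const problems: string[] = [];
  const seenIds: string[] = [];
  const seenNames: string[] = [];
  const closedGroups: string[] = [];
  let currentGroup: string | null = null;

  for (const gate of gates) {
    if (!SNAKE_CASE.test(gate.id)) problems.push(`not snake-case: ${gate.id}`);
    if (seenIds.indexOf(gate.id) !== -1) problems.push(`duplicate id: ${gate.id}`);
    seenIds.push(gate.id);

    const nameKey = `${gate.group}::${gate.name}`;
    if (seenNames.indexOf(nameKey) !== -1) problems.push(`duplicate name in group: ${gate.name}`);
    seenNames.push(nameKey);

    if (!gate.blocksPhase1D) problems.push(`gate does not block: ${gate.id}`);

    // Groups must be contiguous: once a group has been left it may not reappear.
    if (gate.group !== currentGroup) {
      if (closedGroups.indexOf(gate.group) !== -1) problems.push(`group not contiguous: ${gate.group}`);
      if (currentGroup !== null) closedGroups.push(currentGroup);
      currentGroup = gate.group;
    }
  }
  return problems;
}

// Byte stability: two independent builds must serialise to identical text.
function isByteStable(build: () => Gate[]): boolean {
  return JSON.stringify(build(), null, 2) === JSON.stringify(build(), null, 2);
}

// Usage with made-up gates. The third gate breaks two rules on purpose.
const sampleGates: Gate[] = [
  { id: "build_passes", name: "Build passes", group: "build", blocksPhase1D: true },
  { id: "goldens_stable", name: "Goldens stable", group: "scorecard", blocksPhase1D: true },
  { id: "Lint-Clean", name: "Lint clean", group: "build", blocksPhase1D: true },
];
const gateProblems = shapeProblems(sampleGates);

// A builder with hidden state stands in for a clock or random dependency.
let buildCalls = 0;
const unstableBuild = (): Gate[] => {
  buildCalls += 1;
  return [{ id: `gate_${buildCalls}`, name: "Drifting gate", group: "build", blocksPhase1D: true }];
};

console.log(gateProblems, isByteStable(() => sampleGates), isByteStable(unstableBuild));
```

Passing a builder function rather than a finished object is a deliberate choice in this sketch: comparing two serialisations of the same in-memory object would pass even if the content depended on the time of day.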
User impact: Founders see no on-screen change. The hardening gives a reviewer a single deterministic document that lists every gate that must be GREEN before paid or public work starts. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2K is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2K authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2K internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 2K authorisation
- Impact assessment
- Adds a deterministic readiness freeze checklist at the static content layer plus four new unit tests pinning shape, coverage, cleanliness, and determinism, and a founder-facing prose mirror at the docs layer. No product features. No runtime behaviour change. The checklist does not run commands, does not add CI behaviour, does not add an API route, does not touch Supabase, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation / optional-field checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2L
Change date: 2026-04-29
Product version: 0.56.0
Methodology engine version: 0.9.1
Phase 1D entry decision-record template
Reason: Phase 2K shipped the Pre-Phase-1D readiness freeze checklist. Phase 2L adds the deterministic decision-record template that must be completed before Phase 1D begins. Phase 2K says what must be GREEN; Phase 2L says what the founder must record and sign off before authorising Phase 1D. The template makes the authorisation moment auditable.
What changed: Phase 2L is a docs-side and test-side hardening phase. No product features were added. A new deterministic decision-record template at the static content layer enumerates the sections the founder must complete before Phase 1D begins, grouped across decision identity, Phase 1D scope, readiness freeze evidence, test evidence, archive evidence, provider and dependency evidence, risk sign-off, legal and governance, and final authorisation. Every section carries a stable identifier, a founder-facing name, a documented field type, required and evidence-required and blocks-Phase-1D booleans, optional allowed values for enum sections, calm expected-content prose, and calm notes.
Synthetic decision-record fixtures live at the test-support layer: a completed record with every required field filled, plus four documented incomplete variants (missing final authorisation, missing archive hash, failed gate, exception without owner) and two malformed variants (wrong product version, missing section). All fixtures use synthetic founder and owner names; no real customer or real personal data appears anywhere. Five new unit tests pin the template end-to-end.
The shape test asserts the documented top-level shape, every per-section field-shape rule, every documented enum value, that section identifiers are unique snake-case, that no duplicate section name exists within a group, and that every section is critical.
The completed-record validation test asserts that the completed synthetic record passes the full template contract; that the missing-final-authorisation record carries the documented "no" sentinel and is structurally valid but explicitly not authorised; that the missing-archive-hash record fails with the documented missing-or-empty reason; that the failed-gate record fails with both the approved-with-failed-tests and approved-with-red-gate reasons; that the exception-without-owner record fails with the documented reason; that the malformed-version record fails with product-version-mismatch; and that the missing-section record fails with the documented missing-or-empty reason. Cross-cutting rules assert a record cannot approve Phase 1D when failed tests are above zero or when any required gate is red.
The coverage test asserts every required section group is represented, every documented section identifier is present, every section is required, every section that blocks Phase 1D is marked required, the final-authorisation section blocks Phase 1D, every evidence-required section has a documented field type, no section claims to run commands or replace legal review, and the legal-and-governance group has the documented four sections.
The cleanliness test sweeps every founder-facing prose field in the template and in every fixture record through the Phase 2A central forbidden-wording registry and asserts the JSON contains no service-role API field name, no Supabase service-role environment variable, no runtime API key constant, no Windows-user path, no OneDrive path, and no absolute Linux path.
The determinism test asserts the template is byte-stable across two consecutive serialisations and across re-reads from disk, sections are grouped contiguously in a documented stable order, the completed synthetic fixture serialises deterministically, and the JSON contains no Date-now or new-Date or random or crypto-random or process-env or NODE-ENV reference.
New founder-facing decision-record and Phase 2L documents are added at the docs layer. The Phase 2I audit bundle manifest and Phase 2K readiness freeze checklist product versions are bumped to 0.56.0 to stay in sync with the package version. The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard and test-owner coverage and manifest-to-owner consistency, the Phase 2K readiness freeze checklist, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
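The validation behaviour described above lends itself to a small worked example. The sketch below is an assumption-laden reduction: `DecisionRecord`, `validateRecord`, and `isAuthorised` are hypothetical names, the record has only a handful of the template's fields, and the reason code `exception-without-owner` is invented here because the entry says only that a documented reason exists. The other reason codes are the ones named in the entry.

```typescript
// Hypothetical, heavily reduced record shape. The real template has many more sections.
interface DecisionRecord {
  productVersion: string;
  archiveHash: string;
  failedTests: number;
  redGates: number;
  exceptions: { description: string; owner: string }[];
  finalAuthorisation: "yes" | "no";
}

function validateRecord(record: DecisionRecord, expectedVersion: string): string[] {
  const reasons: string[] = [];
  if (record.productVersion !== expectedVersion) reasons.push("product-version-mismatch");
  if (record.archiveHash.trim() === "") reasons.push("missing-or-empty");
  // Reason code below is an assumption made for this sketch.
  if (record.exceptions.some((e) => e.owner.trim() === "")) reasons.push("exception-without-owner");
  // Cross-cutting rules: approval is incompatible with failed tests or red gates.
  if (record.finalAuthorisation === "yes") {
    if (record.failedTests > 0) reasons.push("approved-with-failed-tests");
    if (record.redGates > 0) reasons.push("approved-with-red-gate");
  }
  return reasons;
}

// A record can be structurally valid and still not authorise anything.
function isAuthorised(record: DecisionRecord, expectedVersion: string): boolean {
  return validateRecord(record, expectedVersion).length === 0 && record.finalAuthorisation === "yes";
}

// Synthetic fixtures, loosely mirroring the variants named in the entry.
const completedRecord: DecisionRecord = {
  productVersion: "1.2.3",
  archiveHash: "synthetic-archive-hash",
  failedTests: 0,
  redGates: 0,
  exceptions: [],
  finalAuthorisation: "yes",
};
const failedGateRecord: DecisionRecord = { ...completedRecord, failedTests: 2, redGates: 1 };
const notAuthorisedRecord: DecisionRecord = { ...completedRecord, finalAuthorisation: "no" };
const wrongVersionRecord: DecisionRecord = { ...completedRecord, productVersion: "9.9.9" };

console.log(validateRecord(failedGateRecord, "1.2.3"));
console.log(isAuthorised(notAuthorisedRecord, "1.2.3"));
```

Returning a list of reason codes rather than a boolean keeps each fixture's failure specific, which is what lets a test assert that the failed-gate record fails for both reasons at once.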
User impact: Founders see no on-screen change. The hardening gives a reviewer a single deterministic document that records exactly what was authorised, what evidence supports the decision, what is excluded, what archive was packaged, what the rollback plan is, and that separate legal review may still be required. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2L is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2L authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2L internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 2L authorisation
- Impact assessment
- Adds a deterministic decision-record template at the static content layer, synthetic fixtures at the test-support layer, five new unit tests pinning shape and completed-record validation and coverage and cleanliness and determinism, and two founder-facing prose documents at the docs layer. No product features. No runtime behaviour change. The template does not run commands, does not add CI behaviour, does not add an API route, does not touch Supabase, and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation / optional-field checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2M
Change date: 2026-04-29
Product version: 0.57.0
Methodology engine version: 0.9.1
Dry-run completed Phase 1D decision record
Reason: Phase 2K shipped the Pre-Phase-1D readiness freeze checklist. Phase 2L shipped the Phase 1D entry decision-record template. Phase 2M completes the loop with one synthetic completed dry-run record showing exactly how the founder would fill in the template before authorising real Phase 1D work. The dry-run record makes the calm shape of a complete authorisation visible to a reviewer and gives the founder a worked example to follow.
What changed: Phase 2M is a docs-side and test-side hardening phase. No product features were added. A new deterministic synthetic dry-run completed decision record at the static content layer covers every section identifier from the Phase 2L template in the documented stable order, preserves the template's section group and field type for each entry, and provides a value field per section. The record's top-level shape adds a record version, a template version that matches the Phase 2L template, a product version that matches the package version after the Phase 2M bump, a generated-from-phase identifier of two-M, a record type of synthetic dry run, a calm description, a template reference path, and a completion status of synthetic complete. Every value is synthetic and audit-safe: synthetic founder name, synthetic owner roles, synthetic dates anchored to the existing 2026 corpus, the latest approved Phase 2L archive hash as fixed example evidence, and test totals consistent with the latest GREEN gauntlet pattern. No real customer data and no real personal details appear anywhere. Five new unit tests pin the dry-run end-to-end.
The shape test asserts the documented top-level shape, every per-section field-shape rule, full coverage of the Phase 2L template (every section identifier appears exactly once), preservation of section group and field type, and that section ordering matches the template.
The completeness test asserts every required Phase 2L section is filled (with a documented may-be-empty allow list for exception fields and known limitations), every evidence-required section has evidence, every group is filled, no placeholder sentinels (the four documented sentinel words) appear, and the documented explicitly-empty fields are explicit.
The cross-section validity test asserts the failed-test count is zero, total tests equals passed tests plus failed tests, every required gate is green, the final approval is yes, the final authorisation timestamp is an ISO timestamp, archive hash fields are sixty-four lowercase hex characters and agree with each other, the package and methodology product versions match, the Phase 2K checklist version reference is dotted, the template reference points at the Phase 2L template, non-goals and excluded features and rollback fields and residual risk summary are non-empty, and every legal and governance acknowledgement is true.
The cleanliness test sweeps every founder-facing prose value through the Phase 2A central forbidden-wording registry and asserts no service-role API field name or Supabase service-role environment-variable name or runtime API key constant or Windows-user path or OneDrive path or absolute Linux path appears anywhere in the JSON, and no file-path fragments appear inside founder-facing prose.
The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, sections grouped contiguously in the documented Phase 2L order, and no clock or random or environment dependency.
New founder-facing documents at the docs layer add a Phase 2M document and a short addendum to the Phase 1D entry decision-record document explaining where to find the dry-run example. The Phase 2I audit bundle manifest, the Phase 2K readiness freeze checklist, and the Phase 2L decision-record template product versions are bumped to 0.57.0 to stay in sync with the package version.
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
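Several of the cross-section rules above are simple arithmetic and format checks, and a short sketch makes them concrete. Everything named here is hypothetical: `DryRunFacts`, `crossSectionProblems`, the field names, and the sample values. The timestamp pattern assumes UTC timestamps ending in `Z`, which the entry does not specify.

```typescript
// Hypothetical flattened view of the fields the cross-section rules compare.
interface DryRunFacts {
  totalTests: number;
  passedTests: number;
  failedTests: number;
  archiveHash: string;
  archiveHashEcho: string; // a second place where the same hash is recorded
  finalApproval: string;
  authorisedAt: string;
}

const SHA256_HEX = /^[0-9a-f]{64}$/;
// Assumption: UTC timestamps only, with optional fractional seconds.
const ISO_TIMESTAMP = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$/;

function crossSectionProblems(f: DryRunFacts): string[] {
  const problems: string[] = [];
  if (f.failedTests !== 0) problems.push("failed tests must be zero");
  if (f.totalTests !== f.passedTests + f.failedTests) problems.push("total must equal passed plus failed");
  if (!SHA256_HEX.test(f.archiveHash)) problems.push("archive hash must be 64 lowercase hex characters");
  if (f.archiveHash !== f.archiveHashEcho) problems.push("archive hash fields must agree");
  if (f.finalApproval !== "yes") problems.push("final approval must be yes");
  if (!ISO_TIMESTAMP.test(f.authorisedAt) || isNaN(Date.parse(f.authorisedAt))) {
    problems.push("authorisation timestamp must be an ISO timestamp");
  }
  return problems;
}

// Synthetic values only. new Array(65).join("a") yields 64 "a" characters.
const lowerHash = new Array(65).join("a");
const upperHash = new Array(65).join("A");

const consistentFacts: DryRunFacts = {
  totalTests: 10,
  passedTests: 10,
  failedTests: 0,
  archiveHash: lowerHash,
  archiveHashEcho: lowerHash,
  finalApproval: "yes",
  authorisedAt: "2026-04-29T12:00:00Z",
};
// Breaks the arithmetic rule and the hash-format rule, and nothing else.
const inconsistentFacts: DryRunFacts = {
  ...consistentFacts,
  passedTests: 9,
  archiveHash: upperHash,
  archiveHashEcho: upperHash,
};

console.log(crossSectionProblems(consistentFacts));
console.log(crossSectionProblems(inconsistentFacts));
```

Checking that the two hash fields agree is separate from checking their format: a record could carry two well-formed hashes that point at different archives.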
User impact: Founders see no on-screen change. The hardening gives a reviewer a single deterministic synthetic example of a completed Phase 1D entry decision record, anchored to the latest GREEN gauntlet pattern and the latest approved archive hash. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2M is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2M authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2M internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 2M authorisation
- Impact assessment
- Adds a synthetic dry-run completed Phase 1D entry decision record at the static content layer, five new unit tests pinning shape and completeness and cross-section validity and cleanliness and determinism, and two founder-facing prose documents at the docs layer. No product features. No runtime behaviour change. The dry-run record does not authorise any real Phase 1D work and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation / optional-field checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 2N
Change date: 2026-04-29
Product version: 0.58.0
Methodology engine version: 0.9.1
Phase 1D entry-gate dry-run report
Reason: Phase 2K shipped the readiness freeze checklist. Phase 2L shipped the decision-record template. Phase 2M shipped the synthetic completed dry-run decision record. Phase 2N closes the loop with one calm founder-readable summary view that fuses the Phase 2K checklist and the Phase 2M dry-run record into a single deterministic ready or not-ready report. The report makes the readiness picture visible at a glance and preserves an auditable verdict with documented per-row reasoning, while remaining a dry run only and not a real authorisation.
What changed: Phase 2N is a docs-side and test-side hardening phase. No product features were added. A new deterministic synthetic dry-run gate report at the static content layer fuses the Phase 2K readiness freeze checklist with the Phase 2M completed synthetic dry-run decision record and produces a calm founder-readable summary. Every gate from the Phase 2K checklist appears exactly once in the documented Phase 2K stable order, grouped contiguously by gate group. Each gate row carries a stable identifier, a founder-facing name, the gate group, the required status, the observed status (derived from the dry-run record via three documented mirror tables plus a documented version-gates presumption set), a passed boolean, an evidence source pointer, and a calm reason. Per-group rollups carry total gates, passed gates, failed gates, and a group verdict. The report's top-level shape adds a report version, a product version that matches the package version after the Phase 2N bump, a generated-from-phase identifier of two-N, a report type of synthetic dry run gate report, a calm description, a dry-run-only true flag, a real-authorisation false flag, an overall verdict of ready or not ready, a deterministic verdict reason, a calm summary, source references, gate rows, group rollups, a decision-record summary, an archive summary, and a safety markers block carrying the dry-run-only true flag, the real-authorisation false flag, the not-legal-advice flag, the not-a-legal-approval flag, and the not-a-regulatory-decision flag. The report does not authorise any real Phase 1D work and does not replace legal review. 
Three new test-side support modules ship alongside the report: a deterministic builder that derives the report from the live checklist and the live dry-run record without any clock or random or environment dependency; a deterministic Markdown renderer that carries the documented dry-run-only and not-a-real-authorisation header markers, the not-legal-advice and not-a-legal-approval and not-a-regulatory-decision footer, and a synthetic founder sign-off reference; and a documented set of mutated dry-run fixtures (red required gate, missing archive hash, missing final authorisation, failed tests above zero, passed below total, wrong record type) that each flip the verdict to not ready with a deterministic failure reason recorded in the verdict-reason field. Five new unit tests pin the report end-to-end.
The shape test asserts the documented top-level shape, every per-row and per-rollup field-shape rule, full coverage of the Phase 2K checklist (every gate identifier appears exactly once), Phase 2K group ordering preservation, group totals reconciliation against gate rows, and that the builder reproduces the canonical content file byte-for-byte.
The verdict test asserts the default dry-run fixture produces ready, every documented mutation flips the verdict to not ready with the exact documented failure reason recorded in the verdict-reason field, and the canonical dry-run record on disk is not mutated by any test fixture.
The Markdown rendering test asserts the rendered Markdown is non-empty, carries the dry-run-only and not-a-real-authorisation header markers, the overall verdict label, the archive hash, the synthetic founder sign-off reference, the group rollups and gate summary and safety markers headings, the not-legal-advice and not-a-legal-approval and not-a-regulatory-decision footer, and is byte-stable across two consecutive renders.
The cleanliness test sweeps every founder-facing prose value through the Phase 2A central forbidden-wording registry and asserts that no raw-secret token, Windows-user path, OneDrive path, absolute Linux path, or real-customer placeholder appears anywhere in any prose surface. The determinism test asserts byte-stability across two consecutive serialisations and across re-reads from disk, builder and renderer byte-stability across two consecutive calls, group rollups and gate rows in Phase 2K order, and no clock, random, environment, or absolute-path reference in the report content file. New founder-facing documents at the docs layer add a Phase 2N document and a short addendum to the Phase 1D entry decision-record document pointing at the gate report. The Phase 2I audit bundle manifest, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, and the new Phase 2N gate report product versions are bumped to 0.58.0 to stay in sync with the package version.
The deterministic scoring engine, the scoring weights, the scoring rules, the red-flag rules, the AI Act-aware indicator rules, the context-pack detection, the decision-layer logic, the compare logic, the memo logic, the timeline calculation, the alias logic, the receipt hashing, the receipt storage shape, the verification report logic, the methodology lineage logic, the source-attribution logic, the provenance trail logic, the external-source entry content, the targeted-rescore logic, the targeted-rescore comparison logic, the filename slug logic, the copy-preamble logic, the Methodology Change Log export sanitisation logic, the Verification Journal Markdown logic, the Verification Journal CSV logic, the verification journal reference helper, the verification journal acknowledgement helper, the methodology acknowledgement helper, the methodology acknowledgement awareness logic, the print audit-line logic, the friendly-label registry, the standard-copy registry, the Phase 2E fuzzer semantics, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation registry and optional-field coverage and future-schema-addition checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, and the approved sample outcomes are all unchanged. NullProvider remains the default. The 13 scorecard golden Markdown snapshots remain byte-identical.
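The per-group rollups and overall verdict described above can be illustrated with a minimal sketch. The field names, the verdict strings, and the rule that a report is ready only when every gate row passed are assumptions made for illustration; they are not AgentProof's actual schema or builder.

```typescript
// Hypothetical shapes; field names are illustrative only.
interface GateRow {
  id: string;
  group: string;
  requiredStatus: string;
  observedStatus: string;
  passed: boolean;
}

interface GroupRollup {
  group: string;
  totalGates: number;
  passedGates: number;
  failedGates: number;
  groupVerdict: "ready" | "not_ready";
}

// Build per-group rollups, preserving the order in which groups first appear
// in the gate rows (so a stable row order gives a stable rollup order).
function rollUpGroups(rows: GateRow[]): GroupRollup[] {
  const rollups: GroupRollup[] = [];
  const byGroup = new Map<string, GroupRollup>();
  for (const row of rows) {
    let rollup = byGroup.get(row.group);
    if (rollup === undefined) {
      rollup = { group: row.group, totalGates: 0, passedGates: 0, failedGates: 0, groupVerdict: "ready" };
      byGroup.set(row.group, rollup);
      rollups.push(rollup);
    }
    rollup.totalGates += 1;
    if (row.passed) {
      rollup.passedGates += 1;
    } else {
      rollup.failedGates += 1;
      rollup.groupVerdict = "not_ready";
    }
  }
  return rollups;
}

// Assumption: the overall verdict is ready only when every gate passed,
// and the reason names the first failing gate in row order.
function overallVerdict(rows: GateRow[]): { verdict: "ready" | "not_ready"; reason: string } {
  const firstFailure = rows.find((row) => !row.passed);
  if (firstFailure === undefined) {
    return { verdict: "ready", reason: "All gates passed." };
  }
  return { verdict: "not_ready", reason: `Gate ${firstFailure.id} did not pass.` };
}
```

Because both functions depend only on their input rows, with no clock, random, or environment reads, calling them twice on the same rows gives the same result, which is the property the determinism test pins.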
User impact: Founders see no on-screen change. The hardening gives a reviewer one calm deterministic founder-readable summary that fuses the readiness freeze checklist and the synthetic dry-run decision record into a single ready or not-ready picture, with documented per-row reasoning and a documented set of mutations that each flip the verdict to not ready for a deterministic reason. Goldens, sample outcomes, every prior phase user-facing surface, and every prior export body remain byte-identical.
When to re-score: Not required. Phase 2N is docs-side and test-side hardening only; the engine, scoring, receipt, scorecard Markdown body, downloaded filenames, and deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 2N authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 2N internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-29
- Reference
- Phase 2N authorisation
- Impact assessment
- Adds a synthetic dry-run gate report at the static content layer; three test-side support modules (a deterministic builder, a deterministic Markdown renderer, and a documented set of mutated dry-run fixtures); five new unit tests pinning shape, verdict, Markdown rendering, cleanliness, and determinism; and two founder-facing prose documents at the docs layer. No product features. No runtime behaviour change. The report does not authorise any real Phase 1D work and does not replace legal review. No scoring change. No receipt change. No Markdown export body change. No filename change. No copy-preamble change. No localStorage key change. No service-role usage. No live external call. NullProvider remains the default. The 13 scorecard golden Markdown snapshots, the Phase 2A SHA-256 export-hash snapshots, the Phase 2D full-matrix snapshots, the Phase 2E fuzzing harness, the Phase 2F contract-shape fixtures and tests, the Phase 2G migration rehearsal tests, the Phase 2H deprecation / optional-field checklist, the Phase 2I audit bundle manifest, the Phase 2J owner map and orphan-surface guard, the Phase 2K readiness freeze checklist, the Phase 2L decision-record template, the Phase 2M dry-run record, every prior phase user-facing surface, and every prior export shape remain byte-identical. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-R
Change date: 2026-04-28
Product version: 0.36.0
Methodology engine version: 0.9.1
Verify-now journal CSV export (browser-only)
Reason: Phase 1H-P added the local Verify-now journal with Markdown export. Founders preparing a governance review or audit binder often want to open the same audit data in Excel, Google Sheets, or a governance tracking workbook. A reviewer-friendly CSV companion lets them paste or open the journal as a spreadsheet without writing any extraction code.
What changed: The dashboard Recent verifications panel now exposes a Download CSV action alongside the existing Copy Markdown / Download Markdown / Clear journal actions. The CSV follows RFC 4180 (comma-separated, CRLF line terminators, double-quoted cells with escaped internal quotes when a cell contains a comma, quote, or newline) and includes one row per journal entry in newest-first order with reviewer-friendly columns: the audit-event timestamp, the agent name, the saved scorecard id when available, the friendly status label (Verified / Mismatch detected / Unable to verify here / Receipt unavailable), the severity tone, the mismatch count, the friendly visible-mismatch labels joined inside the cell with a semicolon plus space, the receipt version when available, the product version when available, the engine version when available, and the source field. The CSV is generated entirely in the browser from the local journal via a client-side Blob with the MIME type text/csv and saved as agentproof-verification-journal.csv. It adds no new API route, no service-role usage, no live LLM call, no external fetch, no Supabase persistence, and no server-side PDF library. The deterministic scorecard Markdown report, all golden Markdown snapshots, the Phase 1H-Q print-only audit lines, the Phase 1H-P Verification Journal storage shape and Markdown export, the Phase 1H-O Methodology Change Log export, every prior copy-preamble surface, the Phase 1H-K Verification Report filename, the Phase 1H-J memo and timeline filename slugs, the engine, scoring, methodology lineage, source-attribution, provenance trail, targeted-rescore, and targeted-rescore comparison logic are all unchanged.
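The RFC 4180 quoting rules named above are easy to get subtly wrong, so a minimal sketch may help. The function names and the example column names are hypothetical; this is one way to implement the rules under those assumptions, not AgentProof's actual exporter.

```typescript
// Quote a cell only when it contains a comma, a double quote, or a line break,
// and double any internal double quotes.
function escapeCsvCell(value: string): string {
  if (/[",\r\n]/.test(value)) {
    return `"${value.replace(/"/g, '""')}"`;
  }
  return value;
}

// Join cells with commas and records with CRLF; the header record comes first.
function toCsv(header: string[], rows: string[][]): string {
  return [header, ...rows]
    .map((cells) => cells.map(escapeCsvCell).join(","))
    .join("\r\n");
}

// Multi-value cells (such as visible-mismatch labels) are joined with "; "
// before escaping, so they stay inside a single cell.
function joinLabels(labels: string[]): string {
  return labels.join("; ");
}
```

For example, `toCsv(["agent_name", "status"], [["Refund bot, v2", "Verified"]])` yields two records separated by CRLF, with the first data cell quoted because it contains a comma. In a browser, the resulting string could then be wrapped in a `Blob` with type `text/csv` for download.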
User impact: Reviewers preparing a governance review can now open the local Verify-now journal directly in a spreadsheet for sorting, filtering, and pasting into a governance tracking workbook. The CSV lives in this browser only and is never sent to AgentProof.
When to re-score: Not required. Phase 1H-R adds a browser-only export format on top of the existing journal; the engine, scoring, receipt, scorecard Markdown body, downloaded filename, and deterministic scoring outputs are unchanged.
Report wording update
Training / documentation update
Evidence trace: Phase 1H-R authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-R internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-28
- Reference
- Phase 1H-R authorisation
- Impact assessment
- Adds a deterministic CSV export of the local Verify-now journal that follows RFC 4180 plus a Download CSV action on the dashboard Recent verifications panel. The CSV carries only the friendly Phase 1G-B status and visible-mismatch labels — never raw mismatch kind discriminators, receipt hashes, raw inputs, or the scorecard Markdown body. The Phase 1H-P Markdown export shape and storage, the Phase 1H-Q print-only audit lines, the Phase 1H-O Methodology Change Log export, every prior copy-preamble surface, the deterministic scorecard Markdown report, golden snapshots, receipt hashing, storage shape, methodology lineage logic, source-attribution logic, provenance trail logic, targeted-rescore logic, and targeted-rescore comparison logic are all unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-S
Change date: 2026-04-28
Product version: 0.37.0
Methodology engine version: 0.9.1
Methodology Change Log per-entry permalink line
Reason: Phase 1H-C added per-entry on-page anchors to the Methodology Change Log. Phase 1H-O added the Markdown export. The remaining reviewer-citation gap was that the Markdown export did not yet show a stable reference for each methodology entry. A reviewer pasting the export into a governance ticket or audit binder needed a calm one-line reference per entry so they could link back to the on-page entry without copying the URL by hand.
What changed: Every entry in the Methodology Change Log Markdown export now ends its metadata bullet list with one short calm line: a Permalink line of the form /methodology/changes#<lowercase-change-id>. The permalink is built deterministically from the existing Phase 1H-C on-page anchor convention (lowercase change id with non-alphanumeric characters replaced by hyphens), is always relative (no hostname), and renders for every entry — historical and new. The on-screen Methodology page is unchanged. The downloaded Markdown body is otherwise unchanged. The copy-only preamble line, the filename helper, the structured export shape's other fields, and the Phase 1H-O sanitiser are unchanged. The Phase 1H-R Verify-now journal CSV, the Phase 1H-Q print-only audit lines, the Phase 1H-P Verification Journal Markdown export and storage shape, every prior copy-preamble surface, the Phase 1H-K Verification Report filename, the Phase 1H-J memo and timeline filename slugs, the engine, scoring, methodology lineage, source-attribution, provenance trail, targeted-rescore, and targeted-rescore comparison logic are all unchanged. The deterministic scorecard Markdown report and all 13 golden Markdown snapshots remain byte-identical.
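The anchor convention described above (lowercase the change id, replace non-alphanumeric characters with hyphens, keep the path relative) can be sketched in a few lines. The function name is hypothetical, and replacing each non-alphanumeric character individually rather than collapsing runs is an assumption.

```typescript
// Build a relative permalink for a Change Log entry from its change id.
function changeLogPermalink(changeId: string): string {
  const anchor = changeId.toLowerCase().replace(/[^a-z0-9]/g, "-");
  return `/methodology/changes#${anchor}`;
}
```

Under these assumptions, `changeLogPermalink("1H-R")` returns `/methodology/changes#1h-r`, matching the example form given for this phase.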
User impact: Reviewers who copy or download the Methodology Change Log Markdown export can now paste a stable per-entry reference into a governance ticket or wiki — for example /methodology/changes#1h-r — to cite a specific Change Log entry without copying the full page URL by hand.
When to re-score: Not required. Phase 1H-S adds a single Markdown line per entry to the Methodology Change Log export; the engine, scoring, receipt, scorecard Markdown body, downloaded filename, and deterministic scoring outputs are unchanged.
Report wording update
Training / documentation update
Evidence trace: Phase 1H-S authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-S internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-28
- Reference
- Phase 1H-S authorisation
- Impact assessment
- Adds a single deterministic permalink line per entry to the Methodology Change Log Markdown export, built from the existing Phase 1H-C anchor convention. The on-screen Methodology page is unchanged. The Phase 1H-O sanitiser and copy-only preamble are unchanged. The Phase 1H-R Verify-now journal CSV, the Phase 1H-Q print-only audit lines, the Phase 1H-P Verification Journal Markdown export and storage, every prior copy-preamble surface, the deterministic scorecard Markdown report, golden snapshots, receipt hashing, storage shape, methodology lineage logic, source-attribution logic, provenance trail logic, targeted-rescore logic, and targeted-rescore comparison logic are all unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-T
Change date: 2026-04-28
Product version: 0.38.0
Methodology engine version: 0.9.1
Verify-now journal entry reference handle (browser-local)
Reason: Phase 1H-P added the browser-only Verify-now journal; Phase 1H-R added CSV export; Phase 1H-S added per-entry permalinks to the Methodology Change Log export. The remaining citation gap was that individual verification journal events had no short, stable handle a reviewer could paste into a ticket, spreadsheet, or governance note. A short browser-local reference closes that gap without changing storage or sending anything off-device.
What changed: Every Verify-now journal entry now exposes a short, stable, browser-local reference handle of the form VJ-YYYYMMDD-XXXX. The date prefix is taken from the entry's audit-event timestamp; the four-character uppercase suffix is a deterministic short hash of the entry's stable id, the audit-event timestamp, and the agent name — same entry always yields the same handle. The dashboard Recent verifications panel now renders one Reference line per visible entry, in monospace and never styled as a link. The Verify-now journal Markdown export adds one Journal reference bullet per event, immediately under the entry's H3 heading. The Verify-now journal CSV export adds a leading journal reference column, placed first so reviewers see it as soon as they open the file in a spreadsheet. The reference is computed at render / export time from the existing journal-entry shape — no field is added to local storage and no migration is required for journals saved before this phase. The reference is a calm browser-local citation handle: it resolves only inside this browser's journal, never crosses the network, and is never synced. The journal storage cap, newest-first listing, malformed-storage handling, Clear journal action, Copy Markdown action, Download Markdown action, Download CSV action, existing CSV escaping, and existing browser-only design are all unchanged. The Phase 1H-S Methodology Change Log per-entry permalink line, the Phase 1H-Q print-only audit lines, the Phase 1H-O Methodology Change Log export sanitiser and copy-only preamble, every prior copy-preamble surface, the Phase 1H-K Verification Report filename, the Phase 1H-J memo and timeline filename slugs, the engine, scoring, methodology lineage, source-attribution, provenance trail, targeted-rescore, and targeted-rescore comparison logic are all unchanged. The deterministic scorecard Markdown report and all 13 golden Markdown snapshots remain byte-identical.
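To make the VJ-YYYYMMDD-XXXX shape concrete, here is a minimal sketch of a deterministic handle. The entry does not say which short hash is used, so an FNV-1a-style 32-bit hash stands in as an assumption; the function names, the `|` separator, and reading the date from the leading characters of an ISO 8601 timestamp are likewise hypothetical.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units.
// A stand-in only: the actual hash behind the handle is not documented here.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// Derive the handle from the entry id, the audit-event timestamp, and the agent name.
function journalReference(entryId: string, timestampIso: string, agentName: string): string {
  // Assumes the timestamp starts with "YYYY-MM-DD" (ISO 8601).
  const datePart = timestampIso.slice(0, 10).replace(/-/g, "");
  const digest = fnv1a32(`${entryId}|${timestampIso}|${agentName}`);
  // Keep the low 16 bits as four uppercase hexadecimal characters.
  const suffix = ("0000" + (digest & 0xffff).toString(16)).slice(-4).toUpperCase();
  return `VJ-${datePart}-${suffix}`;
}
```

Because the handle is a pure function of fields already present on the entry, it can be recomputed at render or export time without storing anything new, which is why no migration is needed.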
User impact: Reviewers can now cite a specific Verify-now event in a governance ticket, spreadsheet, or audit binder using a short stable handle (for example VJ-20260428-A7F2) without copying long ids or hashes. The reference resolves only inside this browser's journal — it is a citation handle, not a public link.
When to re-score: Not required. Phase 1H-T adds a single deterministic reference handle to the existing Verify-now journal surfaces; the engine, scoring, receipt, scorecard Markdown body, downloaded filename, and deterministic scoring outputs are unchanged.
Report wording update
Training / documentation update
Evidence trace: Phase 1H-T authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-T internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-28
- Reference
- Phase 1H-T authorisation
- Impact assessment
- Adds a deterministic short reference handle per Verify-now journal event, computed at render / export time from the existing journal-entry shape. The dashboard panel adds a Reference line per visible entry; the journal Markdown export adds a Journal reference bullet per entry; the CSV export adds a leading journal_reference column. The journal storage shape, the localStorage envelope, the storage cap, the malformed-storage handling, and every other journal action (Copy Markdown, Download Markdown, Download CSV, Clear journal) are unchanged. The Phase 1H-S Methodology Change Log per-entry permalink line, the Phase 1H-O Methodology Change Log export sanitiser and copy-only preamble, every prior copy-preamble surface, the deterministic scorecard Markdown report, golden snapshots, receipt hashing, storage shape, methodology lineage logic, source-attribution logic, provenance trail logic, targeted-rescore logic, and targeted-rescore comparison logic are all unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-U
Change date: 2026-04-28
Product version: 0.39.0
Methodology engine version: 0.9.1
Print-only journal reference line on the Verification Report surface
Reason: Phase 1H-T added a short, stable, browser-local reference handle to every Verify-now journal event (panel + Markdown + CSV). The remaining cross-reference gap was the printed Verification Report itself: a reviewer reading a saved PDF of the receipt section had no calm one-line bridge back to the local journal. A print-only Journal reference line on the Reproducibility Receipt section closes that gap.
What changed: After a founder clicks Verify now on a saved scorecard's results page and verification reaches a terminal status (Verified, Mismatch detected, Unable to verify here, or Receipt unavailable), printing or saving the page as PDF now includes a calm one-line cross-reference between the printed Verification Report and the browser-local Verify-now journal: Journal reference: VJ-YYYYMMDD-XXXX, followed by the calm parenthetical Verification journal stored in this browser only. The line is print-only via the existing ap-print-only class; on-screen layout is unchanged. The line is rendered only when Verify now has been clicked AND has reached a terminal status — the running state and the unrun state never render it. The reference is computed deterministically with the existing Phase 1H-T helper from the same scorecard id + audit-event timestamp + agent name fields the Phase 1H-P journal append uses, so the printed reference matches the journal entry recorded in the local journal. Same event always yields the same handle. The downloaded Verification Report Markdown body (Phase 1H-K), the Verification Report filename slug (Phase 1H-K), the Phase 1H-L copy preamble, the Verify-now behaviour, the receipt hashing, the receipt storage shape, the Phase 1H-T journal reference helper, the Phase 1H-S Methodology Change Log per-entry permalink line, the Phase 1H-R Verify-now journal CSV export, the Phase 1H-Q print-only audit lines on Compare / Memo / Timeline, the Phase 1H-P Verification Journal storage shape, the Phase 1H-O Methodology Change Log export, every prior copy-preamble surface, the Phase 1H-J memo and timeline filename slugs, the engine, scoring, methodology lineage, source-attribution, provenance trail, targeted-rescore, and targeted-rescore comparison logic are all unchanged. The deterministic scorecard Markdown report and all 13 golden Markdown snapshots remain byte-identical.
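The gating rule above (render the line only once Verify now has reached a terminal status) could be expressed as a small pure function. This is a sketch only: the state names and the function name are hypothetical, and the exact punctuation around the parenthetical is an assumption.

```typescript
// Hypothetical state names; the four terminal states mirror the friendly labels
// Verified, Mismatch detected, Unable to verify here, and Receipt unavailable.
type VerifyState =
  | "unrun"
  | "running"
  | "verified"
  | "mismatch_detected"
  | "unable_to_verify"
  | "receipt_unavailable";

// Return the print-only line for terminal states, or null when nothing should render.
function printJournalReferenceLine(state: VerifyState, reference: string): string | null {
  if (state === "unrun" || state === "running") {
    return null;
  }
  return `Journal reference: ${reference} (Verification journal stored in this browser only.)`;
}
```

A component could render the returned string inside an element carrying the existing `ap-print-only` class and render nothing when the result is `null`.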
User impact: Reviewers reading a saved PDF of a scorecard's results page can now cross-reference the printed Verification Report against the local Verify-now journal in one step. The reference is browser-local — it never crosses the network and is never sent to AgentProof.
When to re-score: Not required. Phase 1H-U adds a single print-only line to the existing Reproducibility Receipt section; the engine, scoring, receipt, scorecard Markdown body, downloaded filename, and deterministic scoring outputs are unchanged.
Report wording update
Training / documentation update
Evidence trace: Phase 1H-U authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-U internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-28
- Reference
- Phase 1H-U authorisation
- Impact assessment
- Adds a single print-only Journal reference line to the Reproducibility Receipt section, computed deterministically with the Phase 1H-T helper. The line is hidden on screen via ap-print-only, never renders before Verify now completes, never renders for the running state, and never claims a hosted URL or remote audit trail. The downloaded Verification Report Markdown body, the Verification Report filename, the copy-to-clipboard preamble, the Phase 1H-T journal reference helper, the Phase 1H-S Methodology Change Log per-entry permalink line, the Phase 1H-R Verify-now journal CSV export, the Phase 1H-Q print-only audit lines, the Phase 1H-P Verification Journal storage shape, the Phase 1H-O Methodology Change Log export, every prior copy-preamble surface, the deterministic scorecard Markdown report, golden snapshots, receipt hashing, storage shape, methodology lineage logic, source-attribution logic, provenance trail logic, targeted-rescore logic, and targeted-rescore comparison logic are all unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-V
Change date: 2026-04-28
Product version: 0.40.0
Methodology engine version: 0.9.1
Browser-only since-you-last-visited chip on the verification journal panel
Reason: Phase 1H-P added the browser-only Verify-now journal; Phase 1H-T added per-entry reference handles; Phase 1H-U added a print-only cross-reference on the Verification Report surface. The remaining awareness gap was that a founder returning to the dashboard had no calm signal that new Verify-now events had been recorded since they last looked. A small browser-only awareness chip closes that gap without sending anything off-device.
What changed: The dashboard Recent verifications panel now renders one short calm browser-only awareness line, "N new verification events since you last visited", when the local journal contains entries strictly newer than the user's acknowledgement marker. The chip pluralises correctly (1 new verification event vs N new verification events) and never renders when the journal is empty or every entry is older than or equal to the acknowledgement. A Mark as seen action stores the most recent entry's audit-event timestamp in local storage under the key agentproof.verification journal ack.v1, separate from the Phase 1H-P journal envelope. Mark as seen never clears journal entries, never sends anything to AgentProof, never touches Supabase, and never adds an API route; it is a browser-only feature backed by a single local storage key, with no schema migration. The acknowledgement marker tolerates missing values and malformed envelopes by treating them as no acknowledgement (so every entry counts as new on first read). The Phase 1H-U print-only Journal reference line on the Verification Report surface, the Phase 1H-T journal reference helper, the Phase 1H-S Methodology Change Log per-entry permalink line, the Phase 1H-R Verify-now journal CSV export shape, the Phase 1H-Q print-only audit lines, the Phase 1H-P Verification Journal storage shape and Markdown export, the Phase 1H-O Methodology Change Log export, every prior copy-preamble surface (1H-L, 1H-M, 1H-N), the Phase 1H-K Verification Report filename, the Phase 1H-J memo and timeline filename slugs, the engine, scoring, methodology lineage, source-attribution, provenance trail, targeted-rescore, and targeted-rescore comparison logic are all unchanged. The deterministic scorecard Markdown report and all 13 golden Markdown snapshots remain byte-identical.
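The counting and tolerance rules above can be sketched as three pure functions. The envelope shape (`{ ackedAt: ... }`), the field names, and the function names are hypothetical; the sketch takes the raw stored string as a parameter, so it makes no claim about how the real helper reads local storage.

```typescript
// Hypothetical journal entry shape: only the audit-event timestamp matters here.
interface JournalEntry {
  timestamp: string; // ISO 8601
}

// Parse the stored marker; missing or malformed values mean "no acknowledgement".
function parseAck(raw: string | null): number | null {
  if (raw === null) return null;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null && "ackedAt" in parsed) {
      const ackedAt = (parsed as { ackedAt: unknown }).ackedAt;
      if (typeof ackedAt === "string") {
        const ms = Date.parse(ackedAt);
        return Number.isNaN(ms) ? null : ms;
      }
    }
  } catch {
    // Malformed JSON is treated the same as a missing marker.
  }
  return null;
}

// Count entries strictly newer than the acknowledgement marker.
function countNewEntries(entries: JournalEntry[], rawAck: string | null): number {
  const ack = parseAck(rawAck);
  return entries.filter((entry) => {
    const ms = Date.parse(entry.timestamp);
    return !Number.isNaN(ms) && (ack === null || ms > ack);
  }).length;
}

// Render the chip text, or null when there is nothing new to show.
function chipText(count: number): string | null {
  if (count <= 0) return null;
  const noun = count === 1 ? "verification event" : "verification events";
  return `${count} new ${noun} since you last visited`;
}
```

Treating a malformed marker as "no acknowledgement" errs on the side of showing the chip, so a corrupted value can never hide new events.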
User impact: Founders returning to the dashboard now see a calm one-line awareness signal whenever new Verify-now events have been recorded since they last marked them as seen. The acknowledgement is browser-only — it never leaves the user's machine, is never used to count product usage, and is never used to measure user behaviour off-device.
When to re-score: Not required. Phase 1H-V adds a single browser-only awareness chip to the existing dashboard panel; the engine, scoring, receipt, scorecard Markdown body, downloaded filename, and deterministic scoring outputs are unchanged.
Report wording update
Training / documentation update
Evidence trace: Phase 1H-V authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-V internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-28
- Reference
- Phase 1H-V authorisation
- Impact assessment
- Adds a deterministic acknowledgement helper plus a small dashboard chip wired to the local journal. The acknowledgement marker lives in its own localStorage key, separate from the Phase 1H-P journal envelope; clearing one does not clear the other. The chip never sends anything off-device, never adds an API route, and never measures usage or behaviour. The Phase 1H-U print-only line, the Phase 1H-T reference helper / Markdown bullet / CSV column, the Phase 1H-S Methodology Change Log per-entry permalink line, the Phase 1H-R Verify-now journal CSV export, the Phase 1H-Q print-only audit lines, the Phase 1H-P journal storage shape and Markdown export, the Phase 1H-O Methodology Change Log export, every prior copy-preamble surface, the deterministic scorecard Markdown report, golden snapshots, receipt hashing, storage shape, methodology lineage logic, source-attribution logic, provenance trail logic, targeted-rescore logic, and targeted-rescore comparison logic are all unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-P
Change date: 2026-04-27
Product version: 0.34.0
Methodology engine version: 0.9.1
Verify-now audit-event journal (browser-only)
Reason: Phase 1G-B added Verify now and Phase 1G-C added a downloadable Verification Report, but the in-product memory of past verification events was limited to the current scorecard page session. A founder preparing a governance review or audit binder needs a calm browser-only list of recent verification events with Copy and Download Markdown actions, without sending anything anywhere.
What changed: Every Verify-now click on a saved scorecard now appends a calm local journal entry capturing the scorecard id when available, the engine-output agent name, the audit-event timestamp captured at click time, the friendly status label (Verified, Mismatch detected, Unable to verify here, Receipt unavailable), the mismatch count, the friendly visible-mismatch labels, and the receipt and engine versions when available. The dashboard renders a Recent verifications panel when at least one event exists, with Copy Markdown, Download Markdown, and Clear journal actions. Copied Markdown opens with a short calm preamble line (**AgentProof Verification Journal** — <N> entries — copied at <ISO timestamp>); the downloaded file body is unchanged from the export body. The journal is stored in this browser only (local storage), is never sent to AgentProof, never enters Supabase, never enters the receipt body, and never stores raw saved inputs, the scorecard Markdown body, receipt hashes, or raw mismatch kind discriminators. Retained entries are capped at fifty newest-first events. Same-click dedupe protects against double-append on re-render. The deterministic scorecard Markdown report, all golden Markdown snapshots, the Phase 1H-O Methodology Change Log export, the Phase 1H-N scorecard copy preamble, the Phase 1H-M Improvement Memo and Improvement Timeline copy preambles, the Phase 1H-L Verification Report copy preamble, the Phase 1H-K Verification Report filename, the Phase 1H-J memo and timeline filename slugs, the engine, scoring, methodology lineage, source-attribution, provenance trail, targeted-rescore, and targeted-rescore comparison logic are all unchanged.
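The cap and same-click dedupe described above can be sketched as a single append function. The `clickId` field used for dedupe, the other field names, and the function name are hypothetical; the entry does not say how a repeated click is actually detected.

```typescript
// Hypothetical journal event shape.
interface JournalEvent {
  clickId: string;   // assumed per-click identifier used for dedupe
  timestamp: string; // audit-event timestamp captured at click time
  agentName: string;
  status: string;    // friendly status label, e.g. "Verified"
}

const MAX_JOURNAL_ENTRIES = 50;

// Prepend the new event (newest first), skip it if the same click was already
// recorded, and keep only the fifty newest events.
function appendEvent(journal: JournalEvent[], event: JournalEvent): JournalEvent[] {
  if (journal.some((existing) => existing.clickId === event.clickId)) {
    return journal;
  }
  return [event, ...journal].slice(0, MAX_JOURNAL_ENTRIES);
}
```

Returning a new array rather than mutating the input keeps the function safe to call from a render path, where a re-render could otherwise append the same event twice.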
User impact: Founders preparing a governance review can now see the most recent Verify-now events at a glance on the dashboard, copy the journal as Markdown into a ticket or wiki, or download a calm reviewer-ready Markdown file. The journal lives in this browser only and never leaves the user's machine.
When to re-score: Not required. Phase 1H-P adds a browser-only auditability surface; the engine, scoring, receipt, scorecard Markdown body, downloaded filename, and deterministic scoring outputs are unchanged.
Report wording update
Training / documentation update
Evidence trace: Phase 1H-P authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-P internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-27
- Reference
- Phase 1H-P authorisation
- Impact assessment
- Adds a deterministic browser-only journal of Verify-now events, plus a Markdown export. The journal never enters Supabase, never sends data to AgentProof, never enters the receipt body, and never stores raw inputs, the scorecard Markdown, receipt hashes, or raw mismatch kind discriminators. The deterministic scorecard Markdown report, golden snapshots, receipt hashing, storage shape, methodology lineage logic, source-attribution logic, provenance trail logic, targeted-rescore logic, targeted-rescore comparison logic, the Phase 1H-O Methodology Change Log export, and every prior copy-preamble surface are unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-Q
Change date: 2026-04-27
Product version: 0.35.0
Methodology engine version: 0.9.1
Print-only audit-context lines on Compare, Improvement Memo, and Improvement Timeline
Reason: Phase 1G-C added a print-only Latest verification run line on the single-scorecard results page. Phase 1H-P added a browser-only Verify-now journal. The remaining presentation gap was that printed Compare, Improvement Memo, and Improvement Timeline artefacts did not carry a calm audit-context line — a reviewer reading a saved PDF could not see at a glance how the artefact was produced or where the verification journal lived.
What changed: Each of the Compare full view, the Improvement Memo view, and the Improvement Timeline view now renders one short calm print-only audit-context line just before the existing print-only footer. The lines read: Audit context: comparison generated from saved scorecards. Verification journal is stored in this browser only. (for Compare) / Audit context: improvement memo generated from saved scorecards. Verification journal is stored in this browser only. (for the Improvement Memo) / Audit context: improvement timeline generated from saved scorecards. Verification journal is stored in this browser only. (for the Improvement Timeline). Each line is followed by the standard non-legal-advice disclaimer. The lines are print-only via the existing ap-print-only class; on-screen layout is unchanged. The Markdown export bodies, the downloaded filenames, the Phase 1H-P Verification Journal storage shape and behaviour, the Phase 1H-O Methodology Change Log export, every prior copy-preamble surface (1H-L, 1H-M, 1H-N), the Phase 1H-K Verification Report filename, the Phase 1H-J memo and timeline filename slugs, the Phase 1G-A reproducibility receipt, the receipt hashing, the receipt storage shape, and all 13 golden Markdown snapshots are all unchanged. The shared component reads no runtime state, calls no fetch, never touches local storage or the Verification Journal, and never imports Supabase or any external service.
User impact: Founders who run Print / Save as PDF on the Compare, Improvement Memo, or Improvement Timeline pages now produce a saved PDF that carries a calm audit-context line plus the standard disclaimer. Reviewers reading the PDF can see at a glance that the artefact was generated from saved scorecards and that the verification journal lives in this browser only.
When to re-score: Not required. Phase 1H-Q is a print-only presentation feature; the engine, scoring, receipt, scorecard Markdown body, downloaded filename, and deterministic scoring outputs are unchanged.
Report wording update; Training / documentation update
Evidence trace: Phase 1H-Q authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-Q internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-27
- Reference
- Phase 1H-Q authorisation
- Impact assessment
- Adds a small shared print-only component rendering one audit-context line plus the standard non-legal-advice disclaimer on the Compare, Improvement Memo, and Improvement Timeline print surfaces. The Markdown export bodies, downloaded filenames, the Phase 1H-P Verification Journal, the Phase 1H-O Methodology Change Log export, every prior copy-preamble surface, the deterministic scorecard Markdown report, golden snapshots, receipt hashing, storage shape, methodology lineage logic, source-attribution logic, provenance trail logic, targeted-rescore logic, and targeted-rescore comparison logic are unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1G-C; Change date: 2026-04-26; Product version: 0.18.0; Methodology engine version: 0.9.1
Verification Report (copy / download)
Reason: Verification needs a shareable, dated artefact that records WHEN verification was run, not only THAT verification was possible.
What changed: Added a deterministic Verification Report helper (build / Markdown / filename) and Copy / Download verification report actions on the receipt panel. Print view carries a compact 'Latest verification run' line below the receipt section.
User impact: Users can attach a dated Markdown verification report to internal AI-agent governance reviews.
When to re-score: Not required. The Verification Report is built from data already on the saved scorecard.
Report wording update; Training / documentation update
Evidence trace: Phase 1G-C authorisation; engine v0.9.1 unchanged; receipt hashing/storage shape unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-C internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-26
- Reference
- Phase 1G-C authorisation
- Impact assessment
- Adds an audit-event surface and Markdown export. Receipt hashing and storage shape unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-A; Change date: 2026-04-26; Product version: 0.19.0; Methodology engine version: 0.9.1
Intelligence Radar + Methodology Change Log foundation
Reason: AgentProof must be a versioned, maintained, traceable methodology — not a static checklist. Users need to see the cadence, the change log, and the re-score guidance on a calm product-facing page.
What changed: Added a versioned Intelligence Radar content file documenting the methodology cadence (weekly light scan, monthly formal review, quarterly methodology update review, emergency review for material change), monitored domains, product-impact categories, review roles, and update decision states. Added a versioned Methodology Change Log seeded with the eight historical phases. Added a user-visible /methodology/changes page rendering the current versions, the radar cadence, the change log entries, and re-score guidance. Added a low-key Methodology changes link in the global layout footer and a UI-only View methodology changes link in the results-page Methodology section (ap-no-print, does not change the deterministic Markdown report).
User impact: Users can now see how AgentProof's methodology has evolved, which product areas changed, why, and whether re-scoring is recommended.
When to re-score: Not required. Phase 1H-A adds the methodology framework and surface; deterministic scoring outputs are unchanged.
Training / documentation update
Evidence trace: Phase 1H-A authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged; new content files intelligence radar.v1.json and methodology changelog.v1.json.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-A internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-26
- Reference
- Phase 1H-A authorisation
- Impact assessment
- Adds the methodology framework and a product-facing changes page. No change to scoring, receipt hashing, or storage shape.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-B; Change date: 2026-04-26; Product version: 0.20.0; Methodology engine version: 0.9.1
Per-scorecard methodology lineage
Reason: Saved scorecards should not feel like isolated artefacts. Each one should sit inside the AgentProof methodology timeline and tell the user which entries shipped at or before its versions and how many shipped since.
What changed: Added a deterministic methodology lineage helper that compares a scorecard's engine + context-pack versions against the Methodology Change Log entries. Added a Methodology lineage panel on the results page showing scorecard versions vs current versions, the latest applicable change, the newer-change count, the re-score recommendation, and a link to the Methodology Changes page. Added a calm Current methodology / Newer methodology entries / Re-score recommended chip on every dashboard tile. Cross-link mention added to the Methodology Changes page. UI-only — does not affect the deterministic Markdown report.
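The lineage comparison described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the shipped helper: the `ChangeEntry` and `ScorecardVersions` shapes, the function names, and the rule that an entry is applicable when both of its versions are at or below the scorecard's versions are all hypothetical, and the log is assumed to be ordered oldest to newest.

```typescript
// Hypothetical shapes: field names are assumptions, not AgentProof's schema.
interface ChangeEntry {
  id: string;
  engineVersion: string;
  contextPackVersion: string;
  rescoreRecommended: boolean;
}

interface ScorecardVersions {
  engineVersion: string;
  contextPackVersion: string;
}

// Compare dotted numeric versions such as "0.9.1"; returns -1, 0, or 1.
function compareVersions(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

// Assumes `log` is ordered oldest to newest.
function methodologyLineage(card: ScorecardVersions, log: ChangeEntry[]) {
  const isApplicable = (e: ChangeEntry): boolean =>
    compareVersions(e.engineVersion, card.engineVersion) <= 0 &&
    compareVersions(e.contextPackVersion, card.contextPackVersion) <= 0;
  const applicable = log.filter(isApplicable);
  const newer = log.filter((e) => !isApplicable(e));
  const latest = applicable[applicable.length - 1];
  return {
    latestApplicableId: latest ? latest.id : null,
    newerCount: newer.length,
    rescoreRecommended: newer.some((e) => e.rescoreRecommended),
  };
}

// Illustrative log with made-up ids and versions.
const exampleLog: ChangeEntry[] = [
  { id: "older", engineVersion: "0.9.0", contextPackVersion: "1.2.0", rescoreRecommended: false },
  { id: "current", engineVersion: "0.9.1", contextPackVersion: "1.2.2", rescoreRecommended: false },
  { id: "future", engineVersion: "0.10.0", contextPackVersion: "1.2.2", rescoreRecommended: true },
];
console.log(methodologyLineage({ engineVersion: "0.9.1", contextPackVersion: "1.2.2" }, exampleLog));
```

Comparing version segments numerically matters here: a plain string comparison would rank "0.10.0" below "0.9.1".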
User impact: Every saved scorecard now shows where it sits in the methodology timeline, with a calm re-score recommendation when newer changes warrant it.
When to re-score: Not required. Phase 1H-B is a UI / status feature; deterministic scoring outputs are unchanged.
Report wording update; Training / documentation update
Evidence trace: Phase 1H-B authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged; new helper in an internal source file.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-B internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-26
- Reference
- Phase 1H-B authorisation
- Impact assessment
- Adds a deterministic per-scorecard lineage helper and two presentation surfaces. Receipt hashing, storage shape, and goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-C; Change date: 2026-04-26; Product version: 0.21.0; Methodology engine version: 0.9.1
Methodology source-attribution layer
Reason: Every methodology change entry should carry a transparent provenance block so a reviewer can see whether the change came from internal development, regulator guidance, a platform release, a model-provider update, a security-incident pattern, user feedback, or a benchmark finding — and what its limitations are.
What changed: Added a per-entry source attribution block to content/methodology changelog.v1.json covering source type, source label, source reference, source published at, source reviewed at, source url, reviewed by role, impact assessment, and limitations. Added a deterministic source-attribution helper exposing friendly source-type labels and safe optional URL handling. Updated the /methodology/changes page to render a compact provenance block under every entry and to expose change id anchors so each entry is linkable. UI-only — does not affect the deterministic Markdown report.
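As a rough sketch of what such an attribution block and its helper might look like: the field names below follow the source_attribution fields listed in this change log, while the enum keys, the fallback label, and the function names are assumptions made for illustration. The URL check uses the standard `URL` constructor and accepts HTTPS only.

```typescript
// Field names follow the source_attribution block described in this change log.
interface SourceAttribution {
  source_type: string;
  source_label: string;
  source_reference: string;
  source_published_at?: string; // only when known
  source_reviewed_at: string;
  source_url?: string; // only when a real reference exists
  reviewed_by_role: string;
  impact_assessment: string;
  limitations: string;
}

// Hypothetical enum keys; only the two friendly labels appear in this change log.
const SOURCE_TYPE_LABELS: Record<string, string> = {
  internal_methodology_record: "Internal methodology record",
  security_guidance: "Security guidance",
};

// The fallback wording is an assumption.
function friendlySourceType(sourceType: string): string {
  return SOURCE_TYPE_LABELS[sourceType] ?? "Unknown source type";
}

// Returns the URL only when it parses and uses HTTPS; otherwise null.
function safeSourceUrl(raw: string | undefined): string | null {
  if (!raw) return null;
  try {
    const url = new URL(raw);
    return url.protocol === "https:" ? url.toString() : null;
  } catch {
    return null;
  }
}

const exampleAttribution: SourceAttribution = {
  source_type: "security_guidance",
  source_label: "OWASP GenAI Security Project, LLM01:2025 Prompt Injection",
  source_reference: "OWASP LLM01:2025 Prompt Injection",
  source_reviewed_at: "2026-04-26",
  source_url: "https://genai.owasp.org/llmrisk/llm01-prompt-injection/",
  reviewed_by_role: "Methodology owner",
  impact_assessment: "Supporting evidence only; scoring is unchanged.",
  limitations: "Industry security guidance; not legal advice.",
};
console.log(friendlySourceType(exampleAttribution.source_type), safeSourceUrl(exampleAttribution.source_url));
```

Returning null rather than throwing lets the page omit the link when a URL is missing or unsafe.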
User impact: Every methodology change entry on the public methodology page now displays a transparent source/provenance block with friendly labels, an explicit limitations note, and a stable per-entry anchor.
When to re-score: Not required. Phase 1H-C is a content + presentation feature; deterministic scoring outputs are unchanged.
Report wording update; Training / documentation update
Evidence trace: Phase 1H-C authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged; new helper in an internal source file.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-C internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-26
- Reference
- Phase 1H-C authorisation
- Impact assessment
- Adds a per-entry provenance block, a friendly source-type label helper, and per-entry anchors on the methodology page. Receipt hashing, storage shape, methodology lineage logic, and goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-D; Change date: 2026-04-26; Product version: 0.22.0; Methodology engine version: 0.9.1
Per-scorecard methodology provenance trail
Reason: A reviewer of a saved scorecard should be able to see the methodology basis it was generated under at a glance — whether that basis was internal-only, mixed internal-and-external, or external-source-supported — without leaving the report.
What changed: Added a deterministic per-scorecard methodology provenance trail helper that joins Phase 1H-B applicable-vs-newer logic with the Phase 1H-C source-attribution helper. Added a Methodology provenance trail panel on the results page showing the provenance basis, applicable-entry count, latest applicable change, reviewed-by role summary, and the top 3 applicable source rows (with a +N more line when applicable). Added a single concise Methodology provenance line to the Phase 1G-C Verification Report Markdown. UI / report-view only — does not affect the deterministic Markdown report or receipt hashing.
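One way to picture the provenance basis and the top-3 row truncation is the sketch below. It is a simplified illustration, not the shipped helper: the `AppliedSource` shape, the basis keys, and the treatment of an empty list as internal-only are assumptions, and the input is assumed to be pre-sorted in display order.

```typescript
// Hypothetical shape for one applicable change-log source row.
interface AppliedSource {
  label: string;
  isInternal: boolean;
}

// Hypothetical keys for the three bases named in this entry.
type ProvenanceBasis = "internal-only" | "mixed" | "external-source-supported";

// Assumes `sources` is already in display order; an empty list is treated as internal-only.
function provenanceTrail(sources: AppliedSource[]) {
  const internalCount = sources.filter((s) => s.isInternal).length;
  const externalCount = sources.length - internalCount;
  const basis: ProvenanceBasis =
    externalCount === 0
      ? "internal-only"
      : internalCount === 0
        ? "external-source-supported"
        : "mixed";
  // Show at most three rows, then a "+N more" line when rows were cut.
  const rows = sources.slice(0, 3).map((s) => s.label);
  const hidden = sources.length - rows.length;
  if (hidden > 0) rows.push(`+${hidden} more`);
  return { basis, applicableCount: sources.length, rows };
}

const exampleSources: AppliedSource[] = [
  { label: "OWASP GenAI Security Project LLM01:2025", isInternal: false },
  { label: "AgentProof Phase 1H-D internal methodology record", isInternal: true },
  { label: "AgentProof Phase 1H-C internal methodology record", isInternal: true },
  { label: "AgentProof Phase 1H-B internal methodology record", isInternal: true },
  { label: "AgentProof Phase 1H-A internal methodology record", isInternal: true },
];
console.log(provenanceTrail(exampleSources));
```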
User impact: Every saved scorecard now shows the provenance basis of the methodology entries that were applicable when the scorecard was produced. The Verification Report carries the same one-line summary so an audited artefact records the basis at the moment of verification.
When to re-score: Not required. Phase 1H-D is a UI / verification-report wording feature; deterministic scoring outputs and receipt hashing are unchanged.
Report wording update; Training / documentation update
Evidence trace: Phase 1H-D authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged; new helper in an internal source file.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-D internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-26
- Reference
- Phase 1H-D authorisation
- Impact assessment
- Adds a deterministic per-scorecard provenance helper, a results-page panel, and a one-line Verification Report addition. Receipt hashing, storage shape, methodology lineage logic, source-attribution logic, and goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score recommended
Change id: 1H-E; Change date: 2026-04-26; Product version: 0.23.0; Methodology engine version: 0.9.1
Prompt-injection source attribution added
Reason: AgentProof's prompt-injection scenario coverage and red-flag wording should be supportable by a real, named industry security source so users and reviewers can see that AgentProof's methodology is informed by community-reviewed agent-safety guidance — not only internal records.
What changed: Recorded a methodology review of the OWASP GenAI Security Project's LLM01:2025 Prompt Injection page as the first non-internal source in the AgentProof Methodology Change Log. Added the Security guidance source type to the source-attribution helper. The methodology page now displays the source label, the safe HTTPS source URL, the reviewed date, the reviewed-by role, the impact assessment, and the limitations for this entry. For scorecards whose methodology versions include this entry, the per-scorecard provenance trail moves from Internal methodology records only to Mixed internal and external sources. Engine, scoring, receipt hashing and storage shape are unchanged.
User impact: Users can see one real, documented external-source methodology entry on /methodology/changes. Saved scorecards covered by this entry will display a Mixed internal and external sources provenance basis. Re-scoring is recommended for agents that accept untrusted instructions, retrieved content, tool output, user messages, or external content so the prompt-injection scenario coverage is reviewed.
When to re-score: Re-score agents that accept untrusted instructions, retrieved content, tool output, user messages, or external content so prompt-injection scenario coverage can be reviewed against the reviewed OWASP prompt-injection security guidance.
Report wording update; Training / documentation update
Evidence trace: Phase 1H-E authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged; OWASP GenAI Security Project LLM01:2025 page reviewed manually by the methodology owner.
Source / provenance
- Source type
- Security guidance
- Source label
- OWASP GenAI Security Project — LLM01:2025 Prompt Injection
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-26
- Reference
- OWASP LLM01:2025 Prompt Injection
- Source URL
- https://genai.owasp.org/llmrisk/llm01-prompt-injection/
- Impact assessment
- Reinforces the AgentProof prompt-injection and safety-bypass scenario coverage and red-flag wording. The source is a community-authored security guidance artefact and is treated as supporting evidence for the existing AgentProof scenario expectations; it does not change scoring weights, scoring rules, red-flag rules, AI Act-aware indicator rules, or the deterministic engine output.
- Limitations
- This source is industry security guidance authored by an open community project. It is not legal advice, not a legal approval, not a regulatory decision, and does not constitute external endorsement of AgentProof. AgentProof reviewed the source manually; the product does not crawl, scan, or fetch external sources at runtime.
- Re-score optional
Change id: 1H-F; Change date: 2026-04-26; Product version: 0.24.0; Methodology engine version: 0.9.1
Targeted per-scorecard re-score guidance for 1H-E
Reason: Phase 1H-E recorded a global re-score recommendation for prompt-injection-exposed agents but did not say whether any specific saved scorecard sits in that affected class. Reviewers and founders should be able to see, on the scorecard itself, whether THIS agent appears exposed and why.
What changed: Added a deterministic targeted re-score guidance helper that takes the saved agent inputs and decides whether the 1H-E affected class applies: the helper checks the deployment context, the systems and data accessed, the tool list, and the free-text fields, and returns one of three states (applies / not applicable / unable to assess) with friendly matched-signal labels. Added a Methodology / 1H-E targeted re-score guidance panel on the results page rendering the decision, the matched signals, and a link to /methodology/changes#1h-e. Added one concise targeted-guidance line to the Verification Report Markdown. Targeted guidance is intentionally results-page-only because the saved-scorecard summary lacks the inputs needed to compute it without a full-scorecard fetch. Engine, scoring, receipt hashing, storage shape, methodology lineage, source attribution, and provenance trail logic are unchanged.
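The three-state decision can be illustrated with a deliberately simple keyword sketch. Everything specific here is an assumption: the `AgentInputs` shape, the state identifiers, and above all the signal table, whose patterns and labels are invented for illustration and are not AgentProof's actual matching rules.

```typescript
// Hypothetical input shape mirroring the four input areas named in this entry.
interface AgentInputs {
  deploymentContext: string;
  systemsAndData: string[];
  tools: string[];
  freeText: string;
}

// Hypothetical identifiers for the three documented states.
type TargetedState = "applies" | "not_applicable" | "unable_to_assess";

interface TargetedGuidance {
  state: TargetedState;
  matchedSignals: string[];
}

// Invented signal table: patterns and labels are illustrative only.
const SIGNALS: { label: string; pattern: RegExp }[] = [
  { label: "Accepts user messages", pattern: /\b(chat|user messages?|customers?)\b/i },
  { label: "Uses retrieved or external content", pattern: /\b(retrieval|retrieved|search|web)\b/i },
  { label: "Consumes tool output", pattern: /\b(tools?|api|plugins?)\b/i },
];

// `null` stands for purged or unavailable inputs: report that honestly instead of guessing.
function targetedGuidance(inputs: AgentInputs | null): TargetedGuidance {
  if (inputs === null) return { state: "unable_to_assess", matchedSignals: [] };
  const haystack = [
    inputs.deploymentContext,
    ...inputs.systemsAndData,
    ...inputs.tools,
    inputs.freeText,
  ].join("\n");
  const matchedSignals = SIGNALS.filter((s) => s.pattern.test(haystack)).map((s) => s.label);
  return { state: matchedSignals.length > 0 ? "applies" : "not_applicable", matchedSignals };
}

console.log(
  targetedGuidance({
    deploymentContext: "Public support chat",
    systemsAndData: ["knowledge base"],
    tools: ["web search"],
    freeText: "Answers customer questions.",
  }),
);
```

The function is deterministic for a given input, which is the property the entry relies on; a production rule set would need far more careful signal definitions than these placeholders.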
User impact: Every saved scorecard whose inputs are still available now shows a targeted 1H-E re-score recommendation specific to this agent. Scorecards with purged inputs render an honest unable-to-assess state instead of guessing.
When to re-score: Not required. Phase 1H-F is a UI / verification-report wording feature on top of the Phase 1H-E methodology entry; deterministic scoring outputs are unchanged.
Report wording update; Training / documentation update
Evidence trace: Phase 1H-F authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged; new helper in an internal source file.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-F internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-26
- Reference
- Phase 1H-F authorisation
- Impact assessment
- Adds a deterministic targeted re-score helper, a results-page panel, and a one-line Verification Report addition. Receipt hashing, storage shape, methodology lineage logic, source-attribution logic, and provenance trail logic are unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-G; Change date: 2026-04-26; Product version: 0.25.0; Methodology engine version: 0.9.1
Dashboard targeted-rescore chip via summary metadata
Reason: The Phase 1H-F targeted re-score guidance was results-page-only because dashboard summaries did not carry the saved agent inputs. To close the visibility gap without per-tile network round-trips, summaries should pre-compute a compact targeted-rescore status at save time and the dashboard tile should render a calm chip from that metadata.
What changed: Extended the saved scorecard summary with three optional fields (targeted rescore entry id, targeted rescore state, targeted rescore label). Local saves compute and store the metadata on the saved-scorecard envelope at save time; the summary conversion prefers stored envelope metadata and derives it from inputs only for older local rows (or sets unable-to-assess when the version was purged). Signed-in saves compute the metadata server-side from validated agent inputs and persist three nullable TEXT columns, added by a Supabase database migration with CHECK constraints on the documented enum values. The save body schema remains .strict() and rejects any client-supplied targeted-rescore field; the alias-only PATCH route does not mutate targeted-rescore metadata. The dashboard tile renders a calm chip with one of: 1H-E re-score recommended (warn), 1H-E targeted check passed (ok), or 1H-E unable to assess (neutral). The Phase 1H-F results-page panel and Verification Report line remain the source of detail and are unchanged. The metadata never enters the Scorecard JSON, the Phase 1G-A receipt hash, or the deterministic Markdown report; engine, scoring, receipt hashing, storage shape, methodology lineage, source-attribution, provenance trail and targeted-rescore logic are unchanged.
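The chip mapping and the strict rejection of client-supplied metadata might look something like the sketch below. The chip labels and tones come from this entry; the state identifiers, the allowed save-body keys, and the function names are hypothetical, and the hand-rolled key check only imitates what a strict schema would enforce.

```typescript
// Hypothetical state identifiers; the documented enum values are not shown in this entry.
type TargetedRescoreState = "recommended" | "passed" | "unassessed";
type ChipTone = "warn" | "ok" | "neutral";

// Labels and tones as listed in this entry.
const CHIPS: Record<TargetedRescoreState, { label: string; tone: ChipTone }> = {
  recommended: { label: "1H-E re-score recommended", tone: "warn" },
  passed: { label: "1H-E targeted check passed", tone: "ok" },
  unassessed: { label: "1H-E unable to assess", tone: "neutral" },
};

// Rows without stored metadata (older saves) render no chip.
function chipFor(state: TargetedRescoreState | null | undefined) {
  return state ? CHIPS[state] : null;
}

// Hypothetical allow-list standing in for a strict save-body schema.
const ALLOWED_SAVE_KEYS = new Set(["alias", "scorecard"]);

// True when the body carries any key outside the allow-list,
// for example a client-supplied targeted-rescore field.
function hasUnexpectedKeys(body: Record<string, unknown>): boolean {
  return Object.keys(body).some((key) => !ALLOWED_SAVE_KEYS.has(key));
}

console.log(chipFor("passed"), hasUnexpectedKeys({ alias: "faq", targeted_rescore_state: "passed" }));
```

Computing the metadata server-side and refusing it from the client keeps the chip from being spoofed by a crafted save request.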
User impact: Users now see at-a-glance 1H-E status across the entire saved-scorecard list on the dashboard, computed once at save time. Older pre-1H-G saves render no chip; signed-in users do not need to re-score to surface the chip on rows created after the migration.
When to re-score: Not required. Phase 1H-G is summary-metadata + UI; deterministic scoring outputs and the Phase 1G-A receipt are unchanged.
Report wording update; Training / documentation update
Evidence trace: Phase 1H-G authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged; Supabase database migration.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-G internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-26
- Reference
- Phase 1H-G authorisation
- Impact assessment
- Adds three nullable summary metadata columns plus a dashboard chip. Receipt hashing, storage shape, methodology lineage logic, source-attribution logic, provenance trail logic, and targeted-rescore logic are unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-H; Change date: 2026-04-26; Product version: 0.26.0; Methodology engine version: 0.9.1
Targeted re-score state across Compare, Memo, and Timeline
Reason: Phase 1H-G surfaced the 1H-E targeted state on every dashboard tile via summary metadata. The cross-version improvement artefacts (Compare, Improvement Memo, Improvement Timeline) did not carry that state. A reviewer comparing two versions of an agent should see whether the targeted 1H-E status improved, regressed, is now assessable, or is now unable to assess.
What changed: Added a deterministic targeted-rescore comparison helper (an internal source file) that takes two saved scorecards and returns before/after labels, a direction (Improved, Regressed, Unchanged, Now assessable, Now unable to assess), a calm summary line, and a calm user-action label. The Compare view renders a small 1H-E targeted re-score state section between the Evidence + scenarios panel and the Methodology + disclaimer panel. The Improvement Memo carries one concise targeted re-score status line below the Recommended next step, both in the screen view and in the Markdown export. The Improvement Timeline renders a compact 1H-E chip per row (warn / ok / neutral) and a Latest targeted state caption in the timeline header. Signed-in saved scorecards now expose the targeted-rescore envelope fields via the Supabase get() path so the artefacts read the same metadata the dashboard does. Engine, scoring, receipt hashing, storage shape, methodology lineage, source-attribution, provenance trail, and targeted-rescore logic are unchanged.
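A minimal sketch of how the five directions could be derived from a before/after pair follows. The direction names are taken from this entry; the state identifiers and the exact transition rules (for example, treating recommended to passed as Improved) are assumptions about how such a helper might behave.

```typescript
// Hypothetical state identifiers for the before and after scorecards.
type TargetedRescoreState = "recommended" | "passed" | "unassessed";

// Direction names as listed in this entry.
type Direction =
  | "Improved"
  | "Regressed"
  | "Unchanged"
  | "Now assessable"
  | "Now unable to assess";

// Assumed rules: identical states are Unchanged; moving out of or into
// "unassessed" is reported as assessability; otherwise passed beats recommended.
function compareTargetedState(
  before: TargetedRescoreState,
  after: TargetedRescoreState,
): Direction {
  if (before === after) return "Unchanged";
  if (before === "unassessed") return "Now assessable";
  if (after === "unassessed") return "Now unable to assess";
  return after === "passed" ? "Improved" : "Regressed";
}

console.log(compareTargetedState("recommended", "passed"));
```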
User impact: Users now see the 1H-E targeted state in every existing improvement artefact: Compare shows before/after with an improved / regressed / unchanged direction; the Improvement Memo carries one concise status line; the Timeline shows per-version chips and a latest-state caption.
When to re-score: Not required. Phase 1H-H is a presentation feature on top of the existing 1H-E targeted-rescore logic and the 1H-G summary metadata; deterministic scoring outputs are unchanged.
Report wording update; Training / documentation update
Evidence trace: Phase 1H-H authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged; new helper in an internal source file.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-H internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-26
- Reference
- Phase 1H-H authorisation
- Impact assessment
- Adds a deterministic targeted-rescore comparison helper plus presentation across Compare / Memo / Timeline. The signed-in get() path now selects the three targeted-rescore TEXT columns and exposes them on the SavedScorecard envelope. Receipt hashing, storage shape, methodology lineage logic, source-attribution logic, provenance trail logic, and targeted-rescore logic are unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-I; Change date: 2026-04-26; Product version: 0.27.0; Methodology engine version: 0.9.1
Targeted re-score state in Timeline Markdown export
Reason: Phase 1H-H carried the 1H-E targeted state into the on-screen Compare, Improvement Memo, and Improvement Timeline. The Improvement Timeline Markdown export — the artefact a founder copies into a ticket or wiki — did not yet carry the per-version state or the latest state, so the export was less informative than the screen view.
What changed: Extended render timeline markdown to include a Latest targeted state header line when the latest row has a documented state, an optional Targeted re-score movement line (first → latest) when at least two versions are present and both have documented states, and a 1H-E state field on every version row. Versions whose envelope state is absent and whose inputs are unavailable render Not recorded. The export uses only the friendly Phase 1H-G dashboard labels (1H-E re-score recommended, 1H-E targeted check passed, 1H-E unable to assess) — never raw enum identifiers. The Phase 1H-F Verification Report targeted line and the Phase 1H-H Improvement Memo targeted line continue to render with the same friendly labels. Engine, scoring, receipt hashing, storage shape, methodology lineage, source-attribution, provenance trail, and targeted-rescore logic are unchanged. The deterministic scorecard Markdown report is byte-identical.
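The conditions for the header, movement, and per-row lines can be sketched as below. The friendly labels and the Not recorded fallback come from this entry; the `TimelineRow` shape, the state identifiers, the exact line layout, and the assumption that rows are ordered oldest to newest are illustrative only.

```typescript
// Hypothetical state identifiers mapped to the friendly Phase 1H-G labels.
type TargetedRescoreState = "recommended" | "passed" | "unassessed";

const FRIENDLY_LABELS: Record<TargetedRescoreState, string> = {
  recommended: "1H-E re-score recommended",
  passed: "1H-E targeted check passed",
  unassessed: "1H-E unable to assess",
};

// Hypothetical row shape; `state` is null when nothing was recorded.
interface TimelineRow {
  version: string;
  state: TargetedRescoreState | null;
}

// Assumes rows are ordered oldest to newest; line layout is illustrative.
function targetedStateLines(rows: TimelineRow[]): string[] {
  const lines: string[] = [];
  const first = rows[0];
  const latest = rows[rows.length - 1];
  if (latest && latest.state) {
    lines.push(`Latest targeted state: ${FRIENDLY_LABELS[latest.state]}`);
    // Movement needs at least two versions, both with documented states.
    if (rows.length >= 2 && first && first.state) {
      lines.push(
        `Targeted re-score movement: ${FRIENDLY_LABELS[first.state]} → ${FRIENDLY_LABELS[latest.state]}`,
      );
    }
  }
  for (const row of rows) {
    const label = row.state ? FRIENDLY_LABELS[row.state] : "Not recorded";
    lines.push(`${row.version}: 1H-E state: ${label}`);
  }
  return lines;
}

console.log(
  targetedStateLines([
    { version: "v1", state: "recommended" },
    { version: "v2", state: null },
    { version: "v3", state: "passed" },
  ]).join("\n"),
);
```

Only friendly labels reach the output, so raw enum identifiers never appear in the exported text.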
User impact: Users who download or copy the Improvement Timeline Markdown now see the 1H-E targeted state on every version row plus a header line summarising the latest state and the first → latest movement. The artefact reads the same calm wording as the on-screen UI.
When to re-score: Not required. Phase 1H-I is an exported-artefact wording feature on top of the existing Phase 1H-H targeted-rescore comparison logic; deterministic scoring outputs are unchanged.
Report wording update; Training / documentation update
Evidence trace: Phase 1H-I authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged; render timeline markdown extended in an internal source file.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-I internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-26
- Reference
- Phase 1H-I authorisation
- Impact assessment
- Extends the Improvement Timeline Markdown export with friendly targeted-state lines. Receipt hashing, storage shape, methodology lineage logic, source-attribution logic, provenance trail logic, targeted-rescore logic, and targeted-rescore comparison logic are unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-J; Change date: 2026-04-26; Product version: 0.28.0; Methodology engine version: 0.9.1
1H-E targeted-state slugs in exported artefact filenames
Reason: Phase 1H-I carried the 1H-E targeted state into the Improvement Timeline Markdown export body. The exported filenames (memo and timeline) did not yet reflect the targeted state, so two artefacts saved on different days for the same agent could collide on disk and the filename alone did not communicate the targeted basis.
What changed: Exported Improvement Memo Markdown filenames now include a safe 1H-E targeted-state slug when the memo was built from a cross-version comparison: the slug captures the targeted-rescore movement (one of 1h-e-improved, 1h-e-regressed, 1h-e-unchanged, 1h-e-now-assessable, 1h-e-now-unable). When only the latest state is known, the memo filename falls back to the after-state slug (one of 1h-e-recommended, 1h-e-passed, 1h-e-unassessed). Exported Improvement Timeline Markdown filenames now include the latest 1H-E targeted-state slug when available. Memos and timelines with no targeted-rescore data keep their original filenames byte-identical. The Markdown bodies of every artefact, the deterministic scorecard Markdown report, the engine, scoring, receipt hashing, receipt storage shape, methodology lineage, source-attribution, provenance trail, targeted-rescore, and targeted-rescore comparison logic are all unchanged.
User impact: Users who download artefacts now see the targeted state in the filename: improvement-memo-internal-faq-assistant-1h-e-improved.md, timeline-internal-faq-assistant-1h-e-passed.md, etc. Older artefacts saved before this phase are unaffected — legacy filenames remain byte-identical when no targeted data is present.
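The filename pattern in the examples above could be produced along the following lines. The slug values and the two example filenames come from this entry; the function names, the type names, and the assumption that a legacy filename is simply the base name without a slug are hypothetical.

```typescript
// Slug fragments as listed in this entry, without the "1h-e-" prefix.
type MovementSlug = "improved" | "regressed" | "unchanged" | "now-assessable" | "now-unable";
type AfterStateSlug = "recommended" | "passed" | "unassessed";

// Prefer the movement slug; fall back to the after-state slug; else no slug.
function targetedSlug(movement: MovementSlug | null, after: AfterStateSlug | null): string | null {
  if (movement) return `1h-e-${movement}`;
  if (after) return `1h-e-${after}`;
  return null;
}

// Assumption: without targeted data the filename is just the base name.
function withSlug(base: string, slug: string | null): string {
  return slug ? `${base}-${slug}.md` : `${base}.md`;
}

function memoFilename(
  agentSlug: string,
  movement: MovementSlug | null,
  after: AfterStateSlug | null,
): string {
  return withSlug(`improvement-memo-${agentSlug}`, targetedSlug(movement, after));
}

function timelineFilename(agentSlug: string, latest: AfterStateSlug | null): string {
  return withSlug(`timeline-${agentSlug}`, targetedSlug(null, latest));
}

console.log(memoFilename("internal-faq-assistant", "improved", "passed"));
console.log(timelineFilename("internal-faq-assistant", "passed"));
```

Because the slug is appended only when targeted data exists, filenames for artefacts without that data stay exactly as they were.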
When to re-score: Not required. Phase 1H-J is exported-filename-only; deterministic scoring outputs and the Markdown bodies are unchanged.
Report wording update; Training / documentation update
Evidence trace: Phase 1H-J authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged; new helper in an internal source file.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-J internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-26
- Reference
- Phase 1H-J authorisation
- Impact assessment
- Adds a deterministic targeted-rescore filename helper and threads its safe 1H-E slug fragments into the Improvement Memo and Improvement Timeline Markdown filename helpers. Markdown bodies, receipt hashing, storage shape, methodology lineage logic, source-attribution logic, provenance trail logic, targeted-rescore logic, and targeted-rescore comparison logic are unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
- Re-score optional
Change id: 1H-K; Change date: 2026-04-26; Product version: 0.29.0; Methodology engine version: 0.9.1
1H-E targeted-state slug in Verification Report filename
Reason: Phase 1H-J added safe 1H-E targeted-state slug fragments to the Improvement Memo and Improvement Timeline Markdown filenames so downloaded artefacts are easier to recognise. The downloadable Verification Report — the artefact a reviewer attaches to an internal AI-agent governance review — did not yet carry the targeted state in its filename, so two verification artefacts saved on different days for the same agent could collide on disk and the filename alone did not communicate the targeted basis.
What changed: Exported Verification Report Markdown filenames now include the safe 1H-E targeted-state slug (one of 1h-e-recommended, 1h-e-passed, 1h-e-unassessed) when the verified scorecard carries a documented targeted-rescore state. Verification Reports for scorecards with no targeted-rescore data keep their original filenames byte-identical. The Verification Report Markdown body, the copy-to-clipboard content, the Verify-now behaviour, the Phase 1G-A reproducibility receipt, the receipt hashing, the receipt storage shape, the Phase 1H-F targeted-rescore line on the report, the engine, scoring, methodology lineage, source-attribution, provenance trail, targeted-rescore, and targeted-rescore comparison logic are all unchanged. The deterministic scorecard Markdown report is byte-identical.
User impact: After Verify now, users who download a Verification Report see the 1H-E targeted state in the filename: verification-report-internal-faq-assistant-2026-04-26t08-30-00-000z-1h-e-passed.md, etc. Verification Reports for older scorecards saved before the targeted-rescore framework still download with their original filename when no targeted-rescore data is available.
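The example filename above is consistent with a simple slugging rule, sketched here under stated assumptions: the `slugify` rule (lowercase, then collapse every run of non-alphanumeric characters to one hyphen) and the function names are inferred from the single example in this entry and are not confirmed as the shipped implementation.

```typescript
// Slug fragments as listed in this entry, without the "1h-e-" prefix.
type AfterStateSlug = "recommended" | "passed" | "unassessed";

// Assumed rule: lowercase, collapse non-alphanumeric runs to "-", trim edge hyphens.
function slugify(value: string): string {
  return value
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Appends the targeted-state slug only when a state is documented.
function verificationReportFilename(
  agentName: string,
  verifiedAtIso: string,
  state: AfterStateSlug | null,
): string {
  const base = `verification-report-${slugify(agentName)}-${slugify(verifiedAtIso)}`;
  return state ? `${base}-1h-e-${state}.md` : `${base}.md`;
}

console.log(
  verificationReportFilename("Internal FAQ Assistant", "2026-04-26T08:30:00.000Z", "passed"),
);
```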
When to re-score: Not required. Phase 1H-K is a Verification Report filename-only feature; the report body, the receipt, the engine, and deterministic scoring outputs are unchanged.
Report wording update; Training / documentation update
Evidence trace: Phase 1H-K authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged; existing slug helper from Phase 1H-J reused.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-K internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-26
- Reference
- Phase 1H-K authorisation
- Impact assessment
- Threads the existing Phase 1H-J safe 1H-E slug fragments into the Verification Report Markdown filename helper. The Verification Report Markdown body, copy-to-clipboard content, receipt hashing, storage shape, methodology lineage logic, source-attribution logic, provenance trail logic, targeted-rescore logic, and targeted-rescore comparison logic are unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
No external source was reviewed for this entry.
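The filename rule in this entry can be sketched as follows. This is a minimal illustration, not the product's helper: the function names, the `slugify` rule, and the base filename pattern are assumptions inferred from the example filename in the entry. Only the three slug values are taken from the entry itself.

```python
import re

# The three safe slug fragments named in the entry.
SAFE_STATE_SLUGS = ("1h-e-recommended", "1h-e-passed", "1h-e-unassessed")


def slugify(text: str) -> str:
    """Lower-case and collapse each non-alphanumeric run into one hyphen (assumed rule)."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def verification_report_filename(agent_name, verified_at, state_slug=None):
    """Build the filename; the slug is appended only for a documented state."""
    name = f"verification-report-{slugify(agent_name)}-{slugify(verified_at)}"
    if state_slug in SAFE_STATE_SLUGS:
        name += f"-{state_slug}"
    return name + ".md"


print(verification_report_filename(
    "Internal FAQ Assistant", "2026-04-26T08:30:00.000Z", "1h-e-passed"))
```

Under these assumptions, a scorecard with no targeted-rescore state (or with an unrecognised value) falls through to the unsuffixed filename, which mirrors the byte-identical behaviour the entry describes for older scorecards.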
- Re-score optional
Change id: 1H-L
Change date: 2026-04-26
Product version: 0.30.0
Methodology engine version: 0.9.1
Self-identifying Verification Report copy preamble
Reason: When a founder copied the Verification Report to the clipboard and pasted it into a ticket, wiki, or email, the pasted artefact opened with the Markdown body's first heading and could lose its identity once it was wrapped in surrounding ticket text. A short calm preamble line on the clipboard path makes pasted content self-identifying without changing the downloaded file or the receipt body.
What changed: Added a deterministic copy-only helper (render verification report clipboard markdown) that returns the existing Verification Report Markdown body byte-for-byte, prefixed with a short calm preamble line: **AgentProof Verification Report** — <agent name> — <friendly 1H-E targeted state label> — verified at <ISO timestamp>. The friendly state labels reuse the Phase 1H-G dashboard wording (1H-E re-score recommended, 1H-E targeted check passed, 1H-E unable to assess) plus an explicit fallback (1H-E state not recorded) for legacy scorecards that do not carry a documented targeted-rescore state. The reproducibility receipt section Copy verification report action now produces this preamble + body output. The Download verification report action continues to use the unchanged Verification Report Markdown body and the Phase 1H-K filename helper. The preamble does not enter the receipt body, the receipt hashes, the stored receipt shape, the deterministic scorecard Markdown, or the downloaded Verification Report file. Engine, scoring, receipt hashing, storage shape, methodology lineage, source-attribution, provenance trail, targeted-rescore, and targeted-rescore comparison logic are all unchanged.
User impact: Users who copy the Verification Report into a ticket, wiki, or email now paste a self-identifying artefact: the first line names AgentProof, the agent, the 1H-E targeted state, and the verification timestamp. The downloaded Verification Report file remains byte-identical to Phase 1H-K and continues to download with the Phase 1H-K targeted-state filename slug.
When to re-score: Not required. Phase 1H-L is a copy-to-clipboard wording feature; the engine, scoring, receipt, downloaded report body, and deterministic scoring outputs are unchanged.
- Report wording update
- Training / documentation update
Evidence trace: Phase 1H-L authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged; new helper render verification report clipboard markdown in an internal source file.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-L internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-26
- Reference
- Phase 1H-L authorisation
- Impact assessment
- Adds a deterministic copy-only helper that prepends a short calm preamble line to the existing Verification Report Markdown body. The downloaded Verification Report file, the Verification Report filename, receipt hashing, storage shape, methodology lineage logic, source-attribution logic, provenance trail logic, targeted-rescore logic, and targeted-rescore comparison logic are unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
No external source was reviewed for this entry.
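A copy-only helper of the kind this entry describes might look like the sketch below. The function name, the state keys, and the blank line between preamble and body are assumptions made for illustration. The friendly labels and the preamble layout are taken from the entry.

```python
# Friendly labels quoted in the entry; the dictionary keys are hypothetical.
STATE_LABELS = {
    "recommended": "1H-E re-score recommended",
    "passed": "1H-E targeted check passed",
    "unassessed": "1H-E unable to assess",
}
FALLBACK_LABEL = "1H-E state not recorded"
SEP = " \u2014 "  # the dash separator shown in the preamble format


def render_clipboard_markdown(body, agent_name, state, verified_at):
    """Return preamble + body; the body is passed through untouched."""
    label = STATE_LABELS.get(state, FALLBACK_LABEL)
    preamble = SEP.join([
        "**AgentProof Verification Report**",
        agent_name,
        label,
        f"verified at {verified_at}",
    ])
    # Assumption: one blank line separates the preamble from the body.
    return preamble + "\n\n" + body
```

Because the body is concatenated rather than re-rendered, the downloaded file and the receipt hashes can keep using the original body unchanged.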
- Re-score optional
Change id: 1H-M
Change date: 2026-04-26
Product version: 0.31.0
Methodology engine version: 0.9.1
Self-identifying Improvement Memo and Improvement Timeline copy preambles
Reason: Phase 1H-L closed the copy-to-clipboard polish gap on the Verification Report. The same gap existed on the Improvement Memo and Improvement Timeline copy actions: when a founder pasted a copied Markdown body into a ticket, wiki, or email, the artefact opened with a generic Markdown heading and could lose its identity once it was wrapped in surrounding ticket text. Short calm preamble lines on those clipboard paths make pasted content self-identifying without changing the downloaded files, the downloaded filenames, or the receipt body.
What changed: Copied Improvement Memo Markdown now opens with a single short calm preamble line: **AgentProof Improvement Memo** — <before label> → <after label> — <friendly targeted comparison label or '1H-E state not recorded'>. The friendly direction labels reuse the Phase 1H-H Compare-view chip wording (Improved, Regressed, Unchanged, Now assessable, Now unable to assess); legacy memos with no targeted-rescore comparison data use the explicit '1H-E state not recorded' fallback. Copied Improvement Timeline Markdown now opens with a single short calm preamble line: **AgentProof Improvement Timeline** — <agent name> — <version count> versions — <latest friendly 1H-E targeted state label or '1H-E state not recorded'>. The latest-state label reuses the Phase 1H-G dashboard wording (1H-E re-score recommended, 1H-E targeted check passed, 1H-E unable to assess); legacy timelines with no documented latest state use the explicit '1H-E state not recorded' fallback. The version-count word is automatically pluralised (1 version vs 2 or more versions). The Improvement Memo Copy Markdown action and the Improvement Timeline Copy Timeline Markdown action are now wired to the new copy-only helpers; their Download Markdown actions continue to produce the unchanged Improvement Memo Markdown body, the unchanged Improvement Timeline Markdown body, and the Phase 1H-J targeted-state filename slugs. The preambles do not enter the receipt body, the receipt hashes, the stored receipt shape, the deterministic scorecard Markdown, the downloaded Improvement Memo file, the downloaded Improvement Timeline file, or any filename. The Verification Report copy preamble (Phase 1H-L) and the Verification Report filename slug (Phase 1H-K) remain intact. Engine, scoring, receipt hashing, storage shape, methodology lineage, source-attribution, provenance trail, targeted-rescore, targeted-rescore comparison, and filename-slug logic are all unchanged.
User impact: Users who copy the Improvement Memo or the Improvement Timeline into a ticket, wiki, or email now paste a self-identifying artefact: the first line names AgentProof, the artefact type, and (where available) the friendly 1H-E targeted comparison or latest state. The downloaded Improvement Memo and Improvement Timeline files remain byte-identical to Phase 1H-I / 1H-J and continue to download with the Phase 1H-J targeted-state filename slugs.
When to re-score: Not required. Phase 1H-M is a copy-to-clipboard wording feature; the engine, scoring, receipt, downloaded artefact bodies, downloaded filenames, and deterministic scoring outputs are unchanged.
- Report wording update
- Training / documentation update
Evidence trace: Phase 1H-M authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-M internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-26
- Reference
- Phase 1H-M authorisation
- Impact assessment
- Adds two deterministic copy-only preambles (one for the Improvement Memo, one for the Improvement Timeline). Downloaded Improvement Memo Markdown bodies, downloaded Improvement Timeline Markdown bodies, downloaded filenames, the Verification Report copy preamble from Phase 1H-L, receipt hashing, storage shape, methodology lineage logic, source-attribution logic, provenance trail logic, targeted-rescore logic, and targeted-rescore comparison logic are unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
No external source was reviewed for this entry.
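The pluralisation rule and the two preamble layouts in this entry can be illustrated with a short sketch. All function names are hypothetical, and the labels are passed in as already-friendly strings so the sketch does not guess at how the product stores them.

```python
SEP = " \u2014 "      # dash separator used in the preamble formats
ARROW = " \u2192 "    # arrow between the before and after labels
FALLBACK = "1H-E state not recorded"


def version_count_phrase(count: int) -> str:
    """'1 version' for exactly one saved version, otherwise 'N versions'."""
    return f"{count} version" if count == 1 else f"{count} versions"


def timeline_preamble(agent_name, version_count, latest_state_label=None):
    """Preamble for copied Improvement Timeline Markdown."""
    return SEP.join([
        "**AgentProof Improvement Timeline**",
        agent_name,
        version_count_phrase(version_count),
        latest_state_label or FALLBACK,
    ])


def memo_preamble(before_label, after_label, comparison_label=None):
    """Preamble for copied Improvement Memo Markdown."""
    return ("**AgentProof Improvement Memo**" + SEP
            + before_label + ARROW + after_label + SEP
            + (comparison_label or FALLBACK))
```

Passing `None` for the state or comparison label yields the explicit fallback wording, which matches the legacy behaviour described in the entry.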
- Re-score optional
Change id: 1H-N
Change date: 2026-04-26
Product version: 0.32.0
Methodology engine version: 0.9.1
Self-identifying scorecard report copy preamble
Reason: Phase 1H-L closed the copy-to-clipboard polish gap on the Verification Report; Phase 1H-M closed it on the Improvement Memo and the Improvement Timeline. The single-scorecard Copy Markdown action was the last user-copyable artefact without a self-identifying preamble: when a founder pasted a copied scorecard body into a ticket, wiki, or email, the artefact opened with the report's first heading and could lose its identity once it was wrapped in surrounding ticket text. A short calm preamble line on the scorecard clipboard path closes the pattern across all four user-copyable artefacts.
What changed: Copied scorecard Markdown now opens with a single short calm preamble line: **AgentProof Scorecard Report** — <agent name> — <friendly readiness label> — <score>/100 — <friendly 1H-E targeted state or '1H-E state not recorded'> — generated at <ISO timestamp>. The friendly readiness wording reuses the existing readiness labels shown elsewhere in the report (Not ready, Design improvement needed, Candidate for sandbox testing, Candidate for controlled pilot review, Candidate for limited deployment review, Candidate for monitored deployment review). The friendly 1H-E targeted state wording reuses the Phase 1H-G dashboard labels (1H-E re-score recommended, 1H-E targeted check passed, 1H-E unable to assess); when the saved scorecard does not carry a documented targeted-rescore state and has no saved inputs to derive from, the preamble uses the explicit '1H-E state not recorded' fallback rather than inventing a value. The timestamp segment is omitted entirely when the saved-at value is unavailable. The single-scorecard Copy Markdown action now produces the preamble + body; the Download Markdown action continues to write the unchanged saved-scorecard Markdown body and the existing filename behaviour. Print / Save as PDF is unchanged. The deterministic scorecard Markdown report, all 13 golden Markdown snapshots, the receipt hashing, the receipt storage shape, the Phase 1H-L Verification Report copy preamble, the Phase 1H-K Verification Report filename, the Phase 1H-M Improvement Memo and Improvement Timeline copy preambles, the Phase 1H-J memo and timeline filename slugs, the engine, scoring, methodology lineage, source-attribution, provenance trail, targeted-rescore, and targeted-rescore comparison logic are all unchanged.
User impact: Users who copy a scorecard report into a ticket, wiki, or email now paste a self-identifying artefact: the first line names AgentProof, the agent, the friendly readiness label, the score out of 100, the friendly 1H-E targeted state (or the '1H-E state not recorded' fallback), and the saved/generated timestamp when available. The downloaded scorecard Markdown file is byte-identical to the Phase 1B output and continues to download under its existing filename behaviour.
When to re-score: Not required. Phase 1H-N is a copy-to-clipboard wording feature; the engine, scoring, receipt, downloaded scorecard Markdown body, downloaded filename, and deterministic scoring outputs are unchanged.
- Report wording update
- Training / documentation update
Evidence trace: Phase 1H-N authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-N internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-26
- Reference
- Phase 1H-N authorisation
- Impact assessment
- Adds a deterministic copy-only preamble to the single-scorecard Copy Markdown action. The downloaded scorecard Markdown body, the deterministic scorecard Markdown report, golden snapshots, receipt hashing, storage shape, methodology lineage logic, source-attribution logic, provenance trail logic, targeted-rescore logic, targeted-rescore comparison logic, the Phase 1H-L Verification Report copy preamble, the Phase 1H-M Improvement Memo and Improvement Timeline copy preambles, the Phase 1H-K Verification Report filename, and the Phase 1H-J memo and timeline filename slugs are unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
No external source was reviewed for this entry.
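The scorecard preamble adds one rule the earlier preambles do not have: the timestamp segment is dropped entirely when no saved-at value exists. A minimal sketch of that rule, with a hypothetical function name and parameters:

```python
SEP = " \u2014 "  # dash separator used in the preamble format
FALLBACK = "1H-E state not recorded"


def scorecard_preamble(agent_name, readiness_label, score,
                       state_label=None, generated_at=None):
    """Build the one-line preamble for copied scorecard Markdown."""
    segments = [
        "**AgentProof Scorecard Report**",
        agent_name,
        readiness_label,
        f"{score}/100",
        state_label or FALLBACK,
    ]
    if generated_at:  # omit the segment entirely when no timestamp is saved
        segments.append(f"generated at {generated_at}")
    return SEP.join(segments)
```

Building the line from a list of segments keeps the separator handling in one place, so an omitted segment never leaves a dangling separator.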
- Re-score optional
Change id: 1H-O
Change date: 2026-04-26
Product version: 0.33.0
Methodology engine version: 0.9.1
Methodology Change Log Markdown export
Reason: Phase 1H-A made the methodology visible on a product-facing page; Phases 1H-B through 1H-N added per-scorecard lineage, source attribution, provenance trail, targeted-rescore guidance, and self-identifying copy preambles across the four scorecard-side artefacts. The remaining reviewer-facing surface was the Methodology Change Log itself, which had no copy or download path. A reviewer preparing a governance review or audit binder needs to be able to attach a calm reviewer-ready Markdown export of the current methodology version and every Change Log entry without screen-scraping the page.
What changed: The Methodology changes page now exposes Copy Markdown and Download Markdown actions at the top of the page. The exported Markdown contains, in order:
- the current product, engine, context-pack, and changelog versions, the Intelligence Radar cadence summary, and the entry count;
- the current methodology headline and the live AI provider / external research policies;
- every Change Log entry in newest-first order;
- the standard non-legal-advice disclaimer.
Each exported Change Log entry carries friendly source / provenance labels (Internal methodology record, Security guidance, Methodology owner, etc.), the date, the change type and affected areas in friendly wording, the reviewed-by role in friendly wording, the source label, the safe HTTPS source URL when present, the re-score recommendation as Yes / No, the re-score reason when present, the reason / what changed / user impact / public note prose, and the impact assessment and limitations from the per-entry source attribution.
Copied Markdown opens with a short calm preamble line (**AgentProof Methodology Change Log** — product v<version> — <N> entries — copied at <ISO timestamp>); the downloaded file body is unchanged from the export body. The export runs entirely in the browser from the static methodology and Intelligence Radar JSON content: no new API route, no service-role usage, no live LLM, no external fetch, no server-side PDF library.
The deterministic scorecard Markdown report, all 13 golden Markdown snapshots, the Phase 1H-N scorecard copy preamble, the Phase 1H-M Improvement Memo and Improvement Timeline copy preambles, the Phase 1H-L Verification Report copy preamble, the Phase 1H-K Verification Report filename, the Phase 1H-J memo and timeline filename slugs, the engine, scoring, methodology lineage, source-attribution, provenance trail, targeted-rescore, and targeted-rescore comparison logic are all unchanged.
User impact: Reviewers can now attach a calm reviewer-ready Markdown export of the Methodology Change Log to a governance review, audit binder, ticket, or wiki. The exported artefact opens with one short self-identifying line on the clipboard path and a clean header on the downloaded path, and never surfaces internal helper names, raw enum identifiers, or alarmist wording.
When to re-score: Not required. Phase 1H-O adds a reviewer-facing export on top of existing static methodology content; the engine, scoring, receipt, scorecard Markdown body, and deterministic scoring outputs are unchanged.
- Report wording update
- Training / documentation update
Evidence trace: Phase 1H-O authorisation; engine v0.9.1 unchanged; context packs version 1.2.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1H-O internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-26
- Reference
- Phase 1H-O authorisation
- Impact assessment
- Adds a deterministic Methodology Change Log Markdown export driven by the existing static content files and a small client component that wires Copy / Download actions on the Methodology changes page. The downloaded Markdown body, the copy-only preamble, the friendly source / provenance labels, and the filesystem-safe filename are all rendered without raw enum identifiers and without internal helper names. The deterministic scorecard Markdown report, golden snapshots, receipt hashing, storage shape, methodology lineage logic, source-attribution logic, provenance trail logic, targeted-rescore logic, targeted-rescore comparison logic, and every prior copy-preamble surface (1H-L, 1H-M, 1H-N) are unchanged. Goldens unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
No external source was reviewed for this entry.
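The export ordering and the Yes / No re-score wording can be sketched as below. The entry field names (`title`, `date`, `rescore_recommended`), the heading levels, and the reduced set of fields are assumptions for illustration only. The real export carries more per-entry fields than this sketch renders.

```python
def render_change_log(entries, product_version):
    """Render a short header plus every entry, newest first."""
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    ordered = sorted(entries, key=lambda e: e["date"], reverse=True)
    lines = [
        "# AgentProof Methodology Change Log",
        f"Product version: {product_version}",
        f"Entries: {len(ordered)}",
        "",
    ]
    for entry in ordered:
        rescore = "Yes" if entry["rescore_recommended"] else "No"
        lines += [
            f"## {entry['title']}",
            f"Date: {entry['date']}",
            f"Re-score recommended: {rescore}",
            "",
        ]
    lines.append("This is not legal advice, not a legal approval, "
                 "and not a regulatory decision.")
    return "\n".join(lines)
```

Since the function reads only the static entries passed to it, the same input always produces the same Markdown, which is the property a deterministic export relies on.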
- Re-score optional
Change id: 1G-B
Change date: 2026-04-25
Product version: 0.17.0
Methodology engine version: 0.9.1
Verify now + integrity badge
Reason: The reproducibility receipt should be visibly verifiable in the product, not only in code.
What changed: Added a results-page Verify now button that runs the deterministic verification client-side via Web Crypto, with friendly Verified / Mismatch detected / Receipt unavailable status. Added a Receipt present / Receipt unavailable integrity badge on dashboard tiles.
User impact: Users can confirm in one click whether a saved scorecard still matches its receipt.
When to re-score: Not required. Existing scorecards with a receipt can be verified in the browser without re-scoring.
- Report wording update
Evidence trace: Phase 1G-B authorisation; engine v0.9.1; receipt hashing/storage shape unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-B internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-25
- Reference
- Phase 1G-B authorisation
- Impact assessment
- Adds a client-side verification surface using Web Crypto. Receipt hashing and storage shape unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
No external source was reviewed for this entry.
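The check behind Verify now amounts to recomputing a hash and comparing it with the stored receipt. The product runs this in the browser via Web Crypto. The sketch below uses Python's `hashlib` only to show the comparison, and it checks the Markdown body hash alone. The receipt field name `markdown_sha256` is hypothetical.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """SHA-256 of the UTF-8 bytes, as a lowercase hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def verify_markdown(receipt, markdown_body: str) -> str:
    """Return one of the three friendly statuses named in the entry."""
    if not receipt or "markdown_sha256" not in receipt:
        return "Receipt unavailable"
    if sha256_hex(markdown_body) == receipt["markdown_sha256"]:
        return "Verified"
    return "Mismatch detected"
```

A single changed character in the body changes the digest, so any edit to a saved report after the receipt was issued surfaces as Mismatch detected.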
- Re-score recommended
Change id: 1G-A
Change date: 2026-04-20
Product version: 0.16.0
Methodology engine version: 0.9.0
Reproducibility receipt added
Reason: Every saved scorecard needs to be auditable as the deterministic output of a specific input + engine/content version combination.
What changed: Added a deterministic reproducibility receipt to every newly-generated scorecard: input + scorecard + Markdown body SHA-256 hashes, version stamps, fired rule counts, primary context id, and headline echoes. Embedded in the Scorecard JSON, appended to the Markdown report between stable comment markers, and rendered as a compact panel on the results page.
User impact: Reports become independently re-verifiable from their receipts, even months later.
When to re-score: Older scorecards do not carry a receipt. Re-score the agent to produce a current scorecard with a reproducibility receipt.
- Report wording update
- Training / documentation update
Evidence trace: Phase 1G-A authorisation; engine v0.8.2 → v0.9.0; Supabase migration 0004 added.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1G-A internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-20
- Reference
- Phase 1G-A authorisation
- Impact assessment
- Adds a deterministic receipt schema and migration; engine echoes new fields without changing scores, ratings, caps, or contexts. Older scorecards remain valid; re-scoring is recommended only to attach a receipt.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
No external source was reviewed for this entry.
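A receipt of the shape this entry describes could be assembled roughly as follows. The field names, the canonical JSON rule, and the comment-marker strings are all assumptions. The sketch also assumes the Markdown hash covers the body before the receipt block is appended, and it leaves out the fired rule counts, primary context id, and headline echoes.

```python
import hashlib
import json


def sha256_hex(data: str) -> str:
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def canonical_json(value) -> str:
    # Assumption: sorted keys and fixed separators give a stable serialisation.
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


def build_receipt(agent_input, scorecard, markdown_body,
                  engine_version, packs_version):
    """Hash the three artefacts and stamp the versions they were produced under."""
    return {
        "input_sha256": sha256_hex(canonical_json(agent_input)),
        "scorecard_sha256": sha256_hex(canonical_json(scorecard)),
        "markdown_sha256": sha256_hex(markdown_body),
        "engine_version": engine_version,
        "context_packs_version": packs_version,
    }


# Hypothetical marker strings; the entry only says the markers are stable comments.
BEGIN, END = "<!-- receipt:begin -->", "<!-- receipt:end -->"


def append_receipt(markdown_body, receipt):
    """Append the receipt to the report between the two comment markers."""
    block = json.dumps(receipt, sort_keys=True, indent=2)
    return f"{markdown_body}\n{BEGIN}\n{block}\n{END}\n"
```

Serialising with sorted keys means two inputs that differ only in key order hash to the same value, which is one simple way to keep a receipt reproducible.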
- Re-score optional
Change id: 1F-F
Change date: 2026-04-10
Product version: 0.13.0
Methodology engine version: 0.8.2
Improvement Timeline added
Reason: Founders need a per-agent record of how an agent's design evolved across saved versions.
What changed: Added a /timeline route grouping saved scorecards by agent identity, with a summary panel, recommended first-vs-latest comparison, chronological version rows, and a Timeline Markdown export.
User impact: Each agent now has a chronological improvement record, not just isolated reports.
When to re-score: Not required. The timeline works against any existing saved scorecards.
- Report wording update
Evidence trace: Phase 1F-F authorisation; engine v0.8.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1F-F internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-04-10
- Reference
- Phase 1F-F authorisation
- Impact assessment
- Adds a presentation surface that reads any number of saved scorecards. No change to scoring or rule logic.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
No external source was reviewed for this entry.
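The grouping behind the timeline can be sketched in a few lines. The record fields (`id`, `agent_id`, `saved_at`) and the returned shape are hypothetical. The sketch shows only the group-by-agent, chronological ordering, and first-vs-latest recommendation described in this entry.

```python
from collections import defaultdict


def build_timelines(scorecards):
    """Group saved scorecards by agent and order each group oldest first."""
    groups = defaultdict(list)
    for card in scorecards:
        groups[card["agent_id"]].append(card)
    timelines = {}
    for agent_id, cards in groups.items():
        rows = sorted(cards, key=lambda c: c["saved_at"])
        # Recommend a first-vs-latest comparison only when two versions exist.
        pair = (rows[0]["id"], rows[-1]["id"]) if len(rows) >= 2 else None
        timelines[agent_id] = {"rows": rows, "recommended_comparison": pair}
    return timelines
```

An agent with a single saved scorecard still gets a timeline, just without a recommended comparison.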
- Re-score optional
Change id: 1F-D
Change date: 2026-03-15
Product version: 0.11.0
Methodology engine version: 0.8.2
Compare two scorecards (Improvement Record)
Reason: Founders need to see the deterministic difference between two saved scorecards.
What changed: Added a compare route, a deterministic compare scorecards helper, and a six-panel compare view (executive comparison, improvement summary, red flags + indicators, top actions, evidence + scenarios, methodology + disclaimer). Added Compare buttons on dashboard tiles and the results page.
User impact: AgentProof now produces a side-by-side improvement record between any two saved scorecards.
When to re-score: Not required. Compare works against any two existing saved scorecards.
- Report wording update
Evidence trace: Phase 1F-D authorisation; engine v0.8.2 unchanged.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1F-D internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-03-15
- Reference
- Phase 1F-D authorisation
- Impact assessment
- Adds a presentation surface that reads two existing saved scorecards. No change to scoring or rule logic.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
No external source was reviewed for this entry.
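A deterministic comparison between two saved scorecards might be sketched as below. The field names are hypothetical, and tying the Improved / Regressed / Unchanged wording to the score delta is an assumption made for this sketch, not a description of the product's compare helper.

```python
def compare_scorecards(before, after):
    """Summarise the difference between two saved scorecards."""
    delta = after["score"] - before["score"]
    if delta > 0:
        direction = "Improved"
    elif delta < 0:
        direction = "Regressed"
    else:
        direction = "Unchanged"
    before_flags = set(before["red_flags"])
    after_flags = set(after["red_flags"])
    return {
        "score_delta": delta,
        "direction": direction,
        # Sorted so the output order never depends on set iteration order.
        "resolved_red_flags": sorted(before_flags - after_flags),
        "new_red_flags": sorted(after_flags - before_flags),
    }
```

Sorting the flag lists keeps the result identical across runs, which matters when the comparison feeds a Markdown export that is expected to be reproducible.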
- Re-score optional
Change id: 1F-A
Change date: 2026-02-20
Product version: 0.8.0
Methodology engine version: 0.8.0
Executive decision layer added
Reason: Reports needed a one-screen founder-facing answer to 'what's the decision today, why, and what should we do this week'.
What changed: Added a deterministic executive decision panel at the top of every report: decision today, reason, readiness gate, biggest limiting factor, best next action, Top 3 actions this week, evidence to prepare first, scenario tests to run first, and an improvement path.
User impact: Every report now opens with a calm, decision-useful executive summary.
When to re-score: Optional. Older scorecards remain valid; re-scoring renders the new decision panel for the same scoring outputs.
- Decision-layer update
- Report wording update
Evidence trace: Phase 1F-A authorisation; engine v0.8.0.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1F-A internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-02-20
- Reference
- Phase 1F-A authorisation
- Impact assessment
- Adds a presentation-only executive decision layer. Underlying scoring outputs are unchanged.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
No external source was reviewed for this entry.
- Re-score recommended
Change id: 1E-B
Change date: 2026-02-01
Product version: 0.7.0
Methodology engine version: 0.7.2
Deep context packs expanded
Reason: Five previously-stubbed context packs needed full evidence and scenario sets to be production-ready.
What changed: Deepened five context packs (finance / customer support / HR / public-facing / Microsoft Copilot Power Platform) into production-ready content with full evidence and scenario sets. Added five new synthetic samples (09–13) to prove each pack.
User impact: Coverage improved for finance, customer-support, HR-employment, public-facing, and Microsoft Copilot / Power Platform agents.
When to re-score: Re-score agents in the deepened domains to pick up the new evidence + scenario expectations.
- Context-pack update
- Evidence expectation update
- Scenario-test update
Evidence trace: Phase 1E-B authorisation; engine v0.7.2; context packs version 1.2.0; samples 09-13 added.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1E-B internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-02-01
- Reference
- Phase 1E-B authorisation
- Impact assessment
- Deepens existing context packs with additional evidence and scenario sets. No change to scoring weights or rule logic.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
No external source was reviewed for this entry.
- Re-score optional
Change id: 1E-A
Change date: 2026-01-15
Product version: 0.6.0
Methodology engine version: 0.7.0
Context-pack foundation added
Reason: Provide context-specific evidence expectations and scenario tests rather than generic advice.
What changed: Introduced versioned context packs, deterministic context detection, and a context-pack guidance block in every report. Added evidence expectations and recommended test scenarios per pack.
User impact: Reports now name a primary detected context and surface context-specific evidence and scenario tests.
When to re-score: Optional. Older scorecards remain valid; re-scoring will surface a primary context and context-specific evidence.
- Context-pack update
- New context pack required
- Report wording update
Evidence trace: Phase 1E-A authorisation; engine v0.7.0; context packs version 1.0.0.
Source / provenance
- Source type
- Internal methodology record
- Source label
- AgentProof Phase 1E-A internal methodology record
- Reviewed by
- Methodology owner
- Reviewed
- 2026-01-15
- Reference
- Phase 1E-A authorisation
- Impact assessment
- Introduces the context-pack framework. Adds new content but does not change scoring weights, red-flag rules, or AI Act-aware indicator rules.
- Limitations
- No external source was reviewed for this entry. Internal methodology development only.
No external source was reviewed for this entry.
When to re-score
Saved scorecards remain valid against the methodology version they were generated under. Re-score an agent when a Change Log entry above is marked Re-score recommended, when you have made meaningful edits to the agent description, or when you want a fresh reproducibility receipt under the current engine and content versions.
Each saved scorecard also shows a Methodology lineage panel on its results page. The lineage panel displays which changes shipped at or before that scorecard's engine + context-pack versions, how many newer changes have shipped since, and whether re-scoring is recommended.
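The lineage calculation can be illustrated with a small sketch. It is simplified to compare engine versions only (context-pack versions are left out), the field names are hypothetical, and the rule that re-scoring is recommended when any newer change is flagged is an assumption rather than documented behaviour.

```python
def parse_version(text):
    """'0.9.1' -> (0, 9, 1) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def lineage(scorecard_engine, changes):
    """Split changes into those already included and those shipped since."""
    current = parse_version(scorecard_engine)
    included = [c for c in changes if parse_version(c["engine"]) <= current]
    newer = [c for c in changes if parse_version(c["engine"]) > current]
    return {
        "included": [c["id"] for c in included],
        "newer_count": len(newer),
        # Assumption: re-scoring is recommended if any newer change says so.
        "rescore_recommended": any(c["rescore_recommended"] for c in newer),
    }
```

Parsing versions into integer tuples avoids the string-comparison trap where "0.10.0" would sort before "0.9.0".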
This is not legal advice, not a legal approval, and not a regulatory decision.