The architecture, the measurement, and the twelve-week mechanics, traced end to end through a live worked example at a Lloyd's commercial property syndicate.
Two research lineages have each held half of the expert-cognition problem for forty years. Tacit Labs composes them.
Gary Klein's Sources of Power (1998) and the Critical Decision Method (Klein, Calderwood & Macgregor, 1989) set the standard for how experts actually reason under time pressure. MacroCognition, Cognitive Performance Group, Perigean, and Hoffman & Militello at IHMC have refined the methodology across thirty-five years of field practice. The science is settled. The commercial translation does not exist: every engagement in the tradition ends in a narrative document.
UpToDate (acquired by Wolters Kluwer for $1.1B in 2008; ~7,100 physician contributors serving 2M clinicians across 190 countries), DynaMed, BMJ Best Practice, Westlaw Edge, Practical Law. The delivery problem solved at industrial scale through named-expert authorship, graded evidence, editorial cadence, and point-of-decision integration. What they do not do is capture. Their knowledge is curated from published literature; the tacit reasoning of working practitioners is out of scope.
The composition. We take the rigour of NDM capture, the discipline of editorial-board delivery, and the measurement infrastructure that only became tractable with foundation models. No vendor in either lineage has crossed into the other's territory in thirty-five years. We build the bridge.
What sits adjacent but is not us: knowledge management (digitises what experts already wrote); generic enterprise RAG (hallucinates at 17% on the best-funded vendor, per Stanford HAI, Magesh et al. 2024); Palantir Foundry (transaction data, not cognition); classical expert systems (hit the 1980s knowledge-acquisition wall).
Most knowledge-capture frameworks start in the wrong place: the expert's skull. Distributed-cognition research (Hutchins, Cognition in the Wild, 1995; Lave & Wenger, 1991; Suchman, 1987) established decades ago that expertise is irreducibly situated in the interactions between expert and environment. Our taxonomy is built around that insight. We instrument the surfaces, not the skull.
| # | Tier | Surface | What it captures |
|---|---|---|---|
| A | Core | Human ↔ digital | Interaction with enterprise software: navigation, field dwell, hesitation, overrides. Where most regulated cognition now sits. |
| E | Core | Human ↔ time | Temporal rhythm, cadence, deliberation duration, sequence dependencies. How the expert paces cognition. |
| G | Core | Digital ↔ digital | Automations, scripts, saved views, notification rules the expert has configured. Their cognitive prosthetics. |
| D | Derived | Negative space | What the expert did not do. Fields skipped, options not clicked, questions not asked. Absence is diagnostic. |
| B | Extended | Human ↔ human | Colleagues consulted, trust patterns, political calibration, social signals read. The collective dimension of judgment. |
| F | Extended | Human ↔ self | Internal monitoring, fatigue, confidence calibration, metacognition. The expert's relationship with their own judgment. |
| I | Extended | Physical → digital | Translation of physical observation into digital input. Reading a property photograph, typing a survey note. |
| J | Extended | Digital → physical | Digital decisions that trigger physical action. Scheduling a site visit, commissioning a survey. |
| C | Frontier | Human ↔ physical | Embodied inspection, hands-on assessment, spatial reasoning. Requires Shadow Sessions, wearables, specialist instrumentation. |
| H | Frontier | Physical ↔ physical | Environmental dynamics the expert reads: tidal ranges, structural vibration, market tape, equipment signatures. Sensed via instrumented proxies. |
Why this matters commercially. Capture is a tiered program, not a fixed bundle. Most engagements begin and end in Core (~65% coverage). Extended unlocks the social and metacognitive layers. Frontier is where industrial, clinical, and field-work verticals live. The taxonomy is the contract between vertical and capture cost.
The system is a pipeline with four named phases. Phase Zero seeds the graph with candidate heuristics from the client's historical record before live capture begins. The three moves that follow produce, structure, and deliver the knowledge.
Eight strategies triangulate the ten interaction surfaces. Observer SDK and SIA do the bulk; the other six reach what those two miss. The stack is an operationalisation of Klein's Critical Decision Method at software scale.
A ten-stage pipeline turns raw signals into production heuristics in a Neo4j graph. This is where generic AI vendors fall over, and where the IP lives.
Context-Aware RAG serves situation-matched reasoning chains with named attribution. Hallucination below 3%, not 17%. The alignment layer enforces a regulator-grade audit trail.
A mid-sized Lloyd's commercial property syndicate. Names and numbers are composite from deployment telemetry patterns; the mechanics are the platform as it ships.
~40 underwriters across four teams. Stack: Guidewire PolicyCenter plus an internal ML pricing model built in 2023. GWP ~£180m. PRA-regulated under SS1/23.
Sarah Chen, 24 years, coastal and heritage property specialist, retires Q3 2027. Michael Okonkwo, 19 years, industrial and warehouse. David Lim, 22 years, London commercial, holds the dissenting pricing framework. Sarah is the anchor: shortest retirement clock, highest override rate (14% vs team average 6%).
Automated pricing accuracy drops materially on coastal-heritage cases without Sarah's overrides. The PRA has asked for documented model challenge under SS1/23. Knowledge capture becomes a risk-register item with a deadline, not an HR ambition.
12-week MVP. Target at Week 12: 200+ graph nodes, 15-25 heuristics per expert, live alignment on the pricing model, SS1/23 audit trail, David's dissent preserved as a parallel heuristic set.
Every senior expert has spent twenty years producing an explicit record: case files, memos, referral notes, committee minutes, email threads. KM systems treat this as cold storage. For us it is the starting corpus. Abductive inference over that record produces 50-100 candidate heuristics before the Observer SDK fires.
Scoping precedes deployment. Which of the ten surfaces matter for this expert and this vertical? A commercial underwriter lives primarily in surfaces A, E, G with extended reach into B and F. A field engineer lives in C and H with residual A. The surface map determines which strategies deploy, in what order, at what intensity.
A data-readiness score gates scope. Rich rationale on historical submissions gives us a lot. File notes that only record "declined" give us nothing. We grade the client's legacy data on three dimensions: outcome coverage, rationale density, artefact continuity.
Given an observed outcome, find the simplest rule that would have produced it. Peirce's 1878 formulation; tractable at scale only now that foundation models can generate hundreds of candidate hypotheses and score each against held-out outcome data.
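The scoring loop can be sketched in a few lines. Everything below is illustrative: the feature names, the toy held-out set, and the precision/coverage scoring are stand-ins for the production scorer, which runs over hundreds of model-generated candidates.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class CandidateRule:
    """A hypothesised heuristic: a condition that would explain an observed action."""
    name: str
    condition: Callable[[dict], bool]   # fires on a submission's features
    action: str                         # the action the rule predicts

def score_rule(rule: CandidateRule, held_out: list[dict]) -> dict:
    """Score a candidate against held-out outcomes: when the rule fires, how
    often did the expert actually take the predicted action (precision), and
    what share of all such actions does the rule explain (coverage)?"""
    fired = [c for c in held_out if rule.condition(c)]
    hits = [c for c in fired if c["expert_action"] == rule.action]
    all_actions = [c for c in held_out if c["expert_action"] == rule.action]
    precision = len(hits) / len(fired) if fired else 0.0
    coverage = len(hits) / len(all_actions) if all_actions else 0.0
    return {"precision": precision, "coverage": coverage}

# Invented held-out cases: features plus the expert's recorded action.
cases = [
    {"age": 55, "coastal": True,  "expert_action": "override_up"},
    {"age": 48, "coastal": True,  "expert_action": "override_up"},
    {"age": 30, "coastal": True,  "expert_action": "accept"},
    {"age": 60, "coastal": False, "expert_action": "accept"},
]
rule = CandidateRule(
    name="coastal_heritage_uplift",
    condition=lambda c: c["age"] > 40 and c["coastal"],
    action="override_up",
)
print(score_rule(rule, cases))  # {'precision': 1.0, 'coverage': 1.0}
```

Candidates are then ranked on these scores and queued for live validation.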
Two weeks before the SDK lands, we ingest five years of each senior underwriter's decision record: 3,847 submissions, 412 structured underwriting memos, 1,240 referral threads, claims data linked by submission ID. Data-readiness score: 0.81. Gaps flagged: sparse rationale on 2021 submissions (systems migration), no committee minutes pre-2022. Scope adjusted.
Abductive pass produces 847 candidate rules. Causal validation on matched historical cohorts filters to 63 with explanatory coverage above threshold. Week 0 corpus: 63 ranked candidates attributed to source expert, queued for live validation. The CRO sees the first ten in Week 1 — value delivered before the SDK has fired an event.
No single strategy reaches a full surface. Each is triangulated by two to four strategies that observe it from different angles. The matrix is the commercial and technical heart of the platform. It answers the question every serious buyer asks: how do these strategies actually join forces to deliver the outcome?
Read down a column to see which strategies triangulate each surface. Read across a row to see which surfaces a single strategy reaches. ● primary · ○ partial · — minimal.
| Strategy | A H↔D | E H↔T | G D↔D | D neg | B H↔H | F H↔S | I P→D | J D→P | C H↔P | H P↔P |
|---|---|---|---|---|---|---|---|---|---|---|
| 01 Observer SDK | ● | ● | ● | ● | ○ | — | ○ | ○ | — | — |
| 02 SIA Dialogues | ○ | ○ | — | ● | ● | ● | — | — | ○ | ○ |
| 03 Shadow Sessions | ● | ○ | ○ | ○ | ○ | ○ | ● | ○ | ● | — |
| 04 Perturbation Probing | ● | — | — | ● | — | ○ | — | — | — | ○ |
| 05 Contrastive Elicitation | ○ | — | — | ○ | ● | ● | — | — | — | — |
| 06 Longitudinal Drift | ○ | ● | ○ | ● | — | ○ | — | — | — | ○ |
| 07 Cognitive Archaeology | ○ | ○ | — | ○ | ● | ○ | ○ | ● | — | — |
| 08 Contrastive Expertise | ○ | ○ | ○ | ○ | ○ | ○ | ○ | ○ | ○ | ○ |
Triangulation in practice, Human↔time (E) as the example. Observer SDK tracks the rhythm (median deliberation, hesitation clusters, sequence dependencies). Longitudinal Drift tracks how that rhythm shifts across months and market cycles. Cognitive Archaeology reconstructs rhythm from pre-SDK historical records. SIA Reasoning Excavation probes the why behind the order: "Why always flood zone before building age?" Four strategies produce the temporal model; no single one could. Same pattern across every surface.
Three are worked through in depth because they do seventy percent of the work and represent the three archetypes; the other five are compressed.
A Chrome MV3 extension and a macOS pyobjc daemon, deployed through enterprise device management. The SDK captures a fixed schema of behavioural events in the background without action from the expert. Technical foundation: rrweb (originally built for session replay) repurposed for expert behaviour analysis, layered on OpenTelemetry for transport and Redpanda for event streaming. Every event carries a SHA-256-hashed expert ID, session ID, timestamp, type, surface mapping, payload, and consent flag.
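The fixed event schema can be sketched as a dataclass. Only the listed fields (hashed expert ID, session ID, timestamp, type, surface mapping, payload, consent flag) come from the text; the field names, salting scheme, and payload shape are assumptions for illustration.

```python
import hashlib
import json
import time
from dataclasses import asdict, dataclass, field

def hash_expert_id(raw_id: str, salt: str = "per-tenant-salt") -> str:
    """Pseudonymise the expert identifier before any event leaves the device."""
    return hashlib.sha256((salt + raw_id).encode()).hexdigest()

@dataclass
class ObserverEvent:
    expert_id_hash: str      # SHA-256, never the raw identity
    session_id: str
    ts: float                # epoch timestamp
    event_type: str          # e.g. "field_dwell", "override", "backtrack"
    surface: str             # taxonomy tier, e.g. "A" for human<->digital
    payload: dict = field(default_factory=dict)
    consent: bool = True     # events without consent are dropped at source

    def to_json(self) -> str:
        return json.dumps(asdict(self))

evt = ObserverEvent(
    expert_id_hash=hash_expert_id("s.chen"),
    session_id="sess-4821",
    ts=time.time(),
    event_type="override",
    surface="A",
    payload={"field": "price", "delta_pct": 0.11},
)
```

In the shipped pipeline these events travel over OpenTelemetry into Redpanda; the serialisation here just shows the schema's shape.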
Tuesday of Week 1: SDK deployed to Sarah, Michael, David through Meridian's Google Workspace. First events land at the ingestion endpoint within four hours. By Friday, 52,000 events across three experts. Welford accumulators populate baselines: Sarah's median deliberation time 4m 23s per case, override rate 14%, hesitation clusters on "flood zone" and "building age" at 8-15 seconds.
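Welford's algorithm itself is standard and compact: one pass, constant memory, which is why it suits per-expert baselines over an unbounded stream. The deliberation times below are invented.

```python
class Welford:
    """Streaming mean and variance in one pass with O(1) memory; the right
    shape for per-expert baselines over an unbounded event stream."""

    def __init__(self) -> None:
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance; 0.0 until at least two observations arrive.
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

# Invented per-case deliberation times, in seconds:
acc = Welford()
for t in [263, 241, 310, 255, 270]:
    acc.update(t)
print(acc.mean, acc.variance)  # 267.8, ~672.7
```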
By Week 2, PrefixSpan over event sequences flags the pattern Phase Zero proposed as a candidate: whenever Sarah opens a submission with building age above 40 and a postcode in the coastal set, she hesitates, backtracks, overrides the automated price upward by 8-12%. P0 produced the hypothesis; Path A confirms it with live behavioural evidence. The heuristic moves from candidate to staged.
A Socially Interactive Agent runs scheduled 15-25 minute sessions, two or three times per month per expert. Real-time avatar (Ready Player Me mesh, NVIDIA Audio2Face lip sync) fronts a GPT-4o or Claude backbone with LlamaIndex + pgvector for retrieval. The five-phase protocol is a direct operationalisation of the Critical Decision Method (Klein, Calderwood & Macgregor, 1989): warm-up, process tracing, reasoning excavation, boundary probing, social-relational.
Four perception channels fuse into a Certainty Index that gates the agent's next question. Whisper large-v3 transcribes verbal content. Prosody analysis (fundamental frequency, jitter, shimmer) detects hesitation. MediaPipe face mesh with AU detection reads micro-expressions. The SDK feed supplies same-hour behavioural context.
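A minimal sketch of the fusion and gating, assuming a fixed weighted mean and hand-picked thresholds. The shipped Certainty Index is a learned fusion, so treat every number and name here as a placeholder.

```python
def certainty_index(channels: dict[str, float], weights: dict[str, float]) -> float:
    """Fuse the four perception channels into one score (illustrative weights)."""
    return sum(weights[k] * channels[k] for k in weights)

def next_move(ci: float) -> str:
    # Gating thresholds are assumptions for this sketch.
    if ci >= 0.80:
        return "advance"            # rule is solid; move to the next topic
    if ci >= 0.60:
        return "boundary_probe"     # probe exceptions before accepting
    return "re-elicit"              # ask again from a different angle

# Invented channel scores: clear verbal statement, hesitant prosody.
channels = {"verbal": 0.90, "prosody": 0.55, "face": 0.70, "behavioural": 0.85}
weights = {"verbal": 0.40, "prosody": 0.25, "face": 0.15, "behavioural": 0.20}
ci = certainty_index(channels, weights)
print(round(ci, 2), next_move(ci))  # 0.77 boundary_probe
```

The point of the gate is visible even in the toy version: confident verbal content with hesitant prosody lands in the probe band rather than being accepted.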
Monday 11:00. Warm-up (2 min). Process tracing on case #4821: "Walk me through what you did last Thursday." Reasoning excavation on the flagged override: "What made you add 11%?" Sarah: "Building's 1960s, on the Norfolk coast. Asbestos cladding on old builds that age; subsidence risk from salt exposure the model doesn't price." Boundary probe: "When would that rule not apply?" Sarah: "If there's a structural survey within two years." Certainty Index: 0.83 on the core rule; 0.61 on the exception (prosody shows hesitation). A perturbation probe schedules for Week 6 to sharpen the exception.
A field-deployed engineer sits with the expert for 60-90 minutes while the expert narrates their work on live or recent cases. The FDE timestamps narration against the SDK event stream. The output is a dense sequence of causally linked observation-explanation pairs. Grounded in concurrent verbal protocol analysis (Ericsson & Simon, Protocol Analysis, 1984) aligned to behavioural telemetry.
90 minutes with Sarah across four live cases. 287 narration-timestamp pairs captured, aligned to 2,412 behavioural events. Reveals a cue the SDK alone would have missed: Sarah pulls up the property photograph on three of four cases and specifically looks for salt staining on cladding joints. Surface I (physical→digital) instrumented explicitly. Pattern mining backtests across 47 prior cases; salt-staining presence correlates with her overrides at r = 0.71. New heuristic candidate, only reachable through behavioural-narrative alignment.
Controlled variations of real cases synthesised and presented to the expert. Five sub-types: single-variable, contextual, threshold, counterfactual, adversarial. Surfaces decision boundaries the expert cannot articulate in conversation but reveals through differential action. Grounded in Kahneman & Klein (2009) on boundary conditions, and active learning from ML.
20 synthetic cases vary building age and coastal distance on a grid. Sarah's response surface reveals sharp thresholds at exactly 40 years of age and 200m from the coastline: information Path A could not surface, because real cases do not cluster at decision boundaries.
Two cases that look similar but that the expert handles differently. The gap between what the expert notices and what a novice would notice is the tacit knowledge. Kahneman's principle: knowledge activates in response to specific stimuli, not abstract questions.
Sarah sees Case A and Case B, built near-identical on all model features. "A is straightforward. B I'd decline. Tenant covenant — retail sector, hollowing out post-2024." A junior couldn't separate them. New social-contextual heuristic captured.
Continuous background analysis of behaviour change over months and years. Surfaces market-condition sensitivity, fatigue, learning adaptation. River streaming ML (CluStream + ADWIN) over the Observer stream; change-point detection via Bayesian online change-point methods.
Drift detector flags a shift in Sarah's pricing for commercial retail since October 2025 (Cohen's d = 0.72). Overrides widened from +5% to +11%. Hypothesis: cascading retail tenant bankruptcies after 2024 consumer credit contraction. Old heuristic decays; new candidate enters Path B probing.
Reconstructs past decisions from artefacts — underwriting files, emails, system logs — combined with expert retrospection. Four steps: artefact collection, timeline reconstruction, expert retrospection, counterfactual probing. Essential for rare, high-stakes decisions the SDK will never see enough of. Borrowed from CTA practice (Crandall, Klein & Hoffman, Working Minds, 2006).
Cohen's d effect size between expert cohort and novice cohort on the same markers. Surfaces the dimensions where experts materially differ from novice baseline, prioritises those for deeper capture, gives the CFO the ROI evidence.
When multiple strategies speak to the same marker, their confidences fuse into one. The choice of fusion function is an architectural commitment because it determines whether dissent survives or disappears.
Each heuristic's final confidence is a weighted composite of four evidence dimensions. This is the object regulators read when they ask how confident a heuristic is, and why.
When strategies disagree on a marker, the engine does not average. It flags the conflict, stores both versions with full provenance, and triggers a targeted SIA dialogue. Dissent is a first-class object in the graph.
Sarah's coastal-override heuristic is cross-checked with Michael (agrees) and David (dissents; prices coastal risk through upstream reinsurance adjustment). The engine stores H-001 (Sarah + Michael, conf 0.78) and H-002 (David, conf 0.71) connected by a CONFLICTS_WITH edge with context annotation. A junior querying the graph sees both with attribution. This is the feature that makes the platform defensible under SR 11-7.
Capture produces signals. Codification turns them into production heuristics with confidence, provenance, and a full audit chain. Rule-based scaffolding today, progressively upgraded to statistical ML as data volume accrues.
847 raw events from Sarah's 3-hour work session on file #4821 grouped by session ID and temporal proximity. One Session object with aggregated features.
PrefixSpan over the event sequence discovers: OPEN → HOVER_FLOOD(8-15s) → HOVER_AGE(3-8s) → BACKTRACK → OVERRIDE_UP appears in 23 of 31 sessions with age > 40 and coastal postcode. Support 0.74.
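The support statistic is an in-order (not necessarily contiguous) subsequence count, which a short sketch makes concrete. The toy sessions below are constructed to reproduce the 23-of-31 figure; the real miner is PrefixSpan over the full event alphabet.

```python
def is_subsequence(pattern: list[str], session: list[str]) -> bool:
    """True if `pattern` occurs in order (gaps allowed) in `session`; the
    containment test sequence mining is built on."""
    it = iter(session)
    return all(step in it for step in pattern)  # `in` consumes the iterator

def support(pattern: list[str], sessions: list[list[str]]) -> float:
    """Fraction of sessions containing the pattern."""
    return sum(is_subsequence(pattern, s) for s in sessions) / len(sessions)

pattern = ["OPEN", "HOVER_FLOOD", "HOVER_AGE", "BACKTRACK", "OVERRIDE_UP"]
# Stand-in for the 31 coastal-old sessions: 23 contain the pattern, 8 do not.
matching = [["OPEN", "HOVER_FLOOD", "SCROLL", "HOVER_AGE", "BACKTRACK", "OVERRIDE_UP"]] * 23
other = [["OPEN", "HOVER_AGE", "ACCEPT"]] * 8
print(round(support(pattern, matching + other), 2))  # 0.74
```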
Welford accumulators: override rate on coastal-old cases is 74% against Sarah's overall 14%. Z-score 8.3.
LSTM autoencoder reconstruction loss is low on this pattern. It is systematic, not noise. Flagged for promotion.
CART produces the interpretable rule: IF age > 40 AND coastal AND model_conf < 0.85 THEN override ∈ [+0.08, +0.12]. Tree depth 3, entropy gain 0.67.
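The threshold discovery behind that rule is an exhaustive information-gain search, sketched below with the standard library only. The cases are synthetic and one-dimensional; production runs CART over the full feature set.

```python
import math

def entropy(labels: list) -> float:
    """Shannon entropy of a label list."""
    n = len(labels)
    counts = {l: labels.count(l) for l in set(labels)}
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def best_split(cases: list[dict], feature: str, outcome: str = "override"):
    """CART-style search: the cut on `feature` maximising information gain
    on the override/accept label."""
    base = entropy([c[outcome] for c in cases])
    best = (None, 0.0)
    for thr in sorted({c[feature] for c in cases}):
        left = [c[outcome] for c in cases if c[feature] <= thr]
        right = [c[outcome] for c in cases if c[feature] > thr]
        if not left or not right:
            continue
        n = len(cases)
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if gain > best[1]:
            best = (thr, gain)
    return best

# Synthetic coastal cases: overrides concentrate above 40 years of age.
cases = [{"age": a, "override": a > 40} for a in [25, 30, 38, 40, 41, 52, 60, 75]]
thr, gain = best_split(cases, "age")
print(thr, round(gain, 2))  # 40 1.0 on this toy set
```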
DoWhy runs matched counterfactual on historical cohort. Override-group loss ratio 52% vs matched non-override 71%. Nineteen-point delta survives propensity matching and placebo tests. Causal strength 0.82.
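The shape of that check (match each overridden case to its nearest non-overridden twin, then compare outcomes) can be sketched without the DoWhy machinery, which adds propensity scoring and placebo refutation on top. The cohorts below are invented to echo the 19-point delta.

```python
def matched_delta(treated: list[dict], control: list[dict], keys: list[str]) -> float:
    """Crude 1:1 nearest-neighbour matching on the listed features, then the
    mean outcome delta across matched pairs."""
    def dist(a: dict, b: dict) -> float:
        return sum(abs(a[k] - b[k]) for k in keys)
    deltas = []
    for t in treated:
        m = min(control, key=lambda c: dist(t, c))  # nearest non-override twin
        deltas.append(t["loss_ratio"] - m["loss_ratio"])
    return sum(deltas) / len(deltas)

# Invented cohorts: overridden vs non-overridden, matched on age and coastal distance.
treated = [{"age": 55, "coast_m": 150, "loss_ratio": 0.52},
           {"age": 48, "coast_m": 180, "loss_ratio": 0.50}]
control = [{"age": 54, "coast_m": 160, "loss_ratio": 0.71},
           {"age": 49, "coast_m": 175, "loss_ratio": 0.69}]
delta = matched_delta(treated, control, ["age", "coast_m"])
print(round(delta, 2))  # -0.19: overrides improve loss ratio by ~19 points
```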
spaCy NER with domain ontology tags: LOB commercial property, peril flood+age+subsidence, market UK, regulator PRA. Context nodes linked.
Michael's data confirms. David dissents. Agreement score 0.67. Dissent preserved as CONFLICTS_WITH edge to David's alternative.
Weighted composite: 0.35 × 0.82 + 0.25 × 0.67 + 0.25 × 0.76 + 0.15 × 0.91 = 0.78. Above the 0.70 production threshold.
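The composite is a plain weighted sum; the sketch below reproduces the worked arithmetic. The dimension names are illustrative labels for the four evidence scores, which the text leaves partly unnamed; in production the weights are a governance setting.

```python
def composite_confidence(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted composite over the evidence dimensions; weights must sum to 1."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    return sum(weights[k] * scores[k] for k in weights)

scores = {"causal": 0.82, "agreement": 0.67, "pattern": 0.76, "validity": 0.91}
weights = {"causal": 0.35, "agreement": 0.25, "pattern": 0.25, "validity": 0.15}
conf = composite_confidence(scores, weights)
print(round(conf, 2))  # 0.78, above the 0.70 production threshold
```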
Cypher insertion creates node "The Chen Override". Provenance chain links source expert, sessions, conditions, action, context, capture strategies, three-gate validation status, and the conflict edge to David's alternative. Status: production.
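A hedged sketch of what that insertion might look like as parameterized Cypher, with the driver call commented out so the snippet stays self-contained. Labels, property names, and relationship types are assumptions, not the shipped graph schema.

```python
# Parameterized Cypher for the heuristic node, its provenance edge, and the
# conflict edge to the dissenting alternative. Schema names are illustrative.
HEURISTIC_MERGE = """
MERGE (h:Heuristic {id: $id})
SET h.name = $name, h.status = $status, h.confidence = $conf
MERGE (e:Expert {id_hash: $expert})
MERGE (h)-[:ORIGINATED_BY]->(e)
MERGE (alt:Heuristic {id: $alt})
MERGE (h)-[:CONFLICTS_WITH]->(alt)
"""

params = {
    "id": "H-001",
    "name": "The Chen Override",
    "status": "production",
    "conf": 0.78,
    "expert": "<sha256-of-expert-id>",   # placeholder, never the raw identity
    "alt": "H-002",                      # David Lim's dissenting framework
}

# from neo4j import GraphDatabase
# driver = GraphDatabase.driver(uri, auth=(user, password))
# with driver.session() as s:
#     s.run(HEURISTIC_MERGE, **params)
```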
| Gate | Type | Operation | Lineage |
|---|---|---|---|
| 01 | Face validity | Source expert confirms: "Is this what you do?" | Nunnally, psychometrics, 1978 |
| 02 | Generalisability | Peer experts asked: "Would you do this?" | Cochrane systematic review practice |
| 03 | Predictive validity | Backtest against historical outcomes | GRADE (Guyatt et al., 2008) |
Generic enterprise RAG retrieves documents by keyword similarity and generates on top of them. It hallucinates at 17% even on the best-funded legal vendor (Stanford HAI, 2024). Our retrieval layer retrieves by situation similarity instead, and generation is constrained to the retrieved reasoning chain with post-generation verification.
Standard text embeddings encode words. For tacit knowledge retrieval we need embeddings that encode situations. Two submissions with very different wording may be the same underwriting problem. Two with near-identical wording may be different problems. Standard embeddings miss the distinction.
Maya, second-year underwriter: new submission, commercial property, 52-year-old building in Wells-next-the-Sea, model quotes £18,400. She types into the activation surface: "Is this a straightforward price or does this need a senior eye?"
Stage 1: bi-encoder matches to "heritage coastal commercial, 40+ years" at 0.89 confidence. Stage 2: parallel pulls surface the Chen Override (H-001), the Lim Framework (H-002), seven similar historical cases, three rules, two SIA transcripts. Stage 3: cross-encoder reranks. H-001 leads on freshness, 2-of-3 agreement, and 0.78 confidence. H-002 surfaces as "Alternative view" one click away.
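Stage 1's situation matching reduces to similarity search over situation vectors. The sketch below fakes the embedding with a hand-built three-feature vector and cosine similarity; the real bi-encoder is learned, and stages 2-3 (evidence pulls, cross-encoder rerank) are elided.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def featurise(case: dict) -> list[float]:
    # Toy situation vector; the production bi-encoder is learned, not hand-built.
    return [case["age"] / 100, 1.0 if case["coastal"] else 0.0, case["model_conf"]]

query = {"age": 52, "coastal": True, "model_conf": 0.81}
heuristics = {
    "H-001 Chen Override": {"age": 45, "coastal": True, "model_conf": 0.80},
    "H-002 Lim Framework": {"age": 45, "coastal": True, "model_conf": 0.80},
    "H-214 Warehouse sprinkler rule": {"age": 10, "coastal": False, "model_conf": 0.95},
}
# Stage 1 only: coarse situation similarity over the candidate heuristics.
q = featurise(query)
ranked = sorted(heuristics, key=lambda h: cosine(q, featurise(heuristics[h])), reverse=True)
print(ranked[0])  # H-001 Chen Override
```

Even the toy version shows why wording-based embeddings are the wrong tool: the ranking is driven entirely by situation features, not text.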
Reasoning chain renders: "Based on building age (52) and coastal postcode, this matches what your senior underwriters call heritage coastal commercial. It differs from standard commercial property because of asbestos-cladding and subsidence exposure the automated model does not price. Sarah Chen uses the Chen Override here: adds 8-12% to the automated price. 47 similar cases; 19-point loss-ratio improvement. Exception: if there is a structural survey within two years, the model is more reliable. Alternative: David Lim uses reinsurance-cost adjustment upstream (tap for the Lim Framework). Attribution: Sarah Chen, 24 yrs coastal; Michael Okonkwo concurs."
| User level | Delivery style | Content focus |
|---|---|---|
| Novice | Full reasoning chain, step by step | Schemas, common situations, defaults; explain the why |
| Intermediate | Key distinctions, exceptions | Edge cases, cross-expert views; skip the basics |
| Peer | Raw cases, contradictions | Rare cases, unresolved disagreements; maximum density |
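As configuration, the calibration table might reduce to a profile lookup keyed on user level. Keys and structure here are assumptions for illustration, not the shipped config.

```python
# Delivery profiles mirroring the table above; names are illustrative.
DELIVERY = {
    "novice": {"style": "full_chain", "include": ["schemas", "defaults", "why"]},
    "intermediate": {"style": "distinctions", "include": ["edge_cases", "cross_expert"]},
    "peer": {"style": "raw", "include": ["rare_cases", "disagreements"]},
}

def render_plan(user_level: str, heuristic_id: str) -> dict:
    """Select how a retrieved heuristic is rendered for this user level."""
    profile = DELIVERY[user_level]
    return {"heuristic": heuristic_id, **profile}

plan = render_plan("novice", "H-001")  # full reasoning chain, explain the why
```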
| Component | Function | Backing | Regulatory anchor |
|---|---|---|---|
| Intent Engine | Injects heuristics into model calls as prompt context | LlamaIndex + LiteLLM | SR 11-7 conceptual soundness |
| Guardrails | Translates hard rules into enforced constraints | NeMo Guardrails (NVIDIA) | EU AI Act Art 14 |
| Drift Detection | Monitors AI-expert divergence | Evidently AI | DORA operational resilience |
| Override Monitor | Closed-loop: every override re-enters the pipeline | Langfuse + internal | SR 11-7 effective challenge |
| Audit Trail | Input → heuristics → output → override → reasoning | Langfuse + Grafana | GDPR Art 22 · CFPB 2023-03 |
Capture is where every AI company claims novelty. Codification is where the IP lives. Retrieval is where the customer feels it.
An operational trace of what happens between Week minus-two and Week twelve at Meridian. Field engineer on site for the first two weeks; remote with fortnightly visits thereafter.
Phase Zero. Surface mapping with CDO and Head of Underwriting. Data-readiness 0.81. Five years of decision records ingested. Abductive inference produces 63 candidate heuristics. CRO sees the first ten in Week 1.
Monday W1: kickoff, consent signed, SDK deployed through Google Workspace admin. Thursday W1: 15,000 events through ingestion. W2: Shadow Session 1 with Sarah (90 min). Pattern engine live. P0 heuristics begin validation.
Shadow Session 2 (Sarah), 1 each (Michael, David). Cognitive Archaeology on the 2019 Norfolk flood claim. 287 narration-timestamp pairs from Sarah. Salt-staining cue instrumented on surface I.
SIA Session 1 per senior. Path B triggered four times on Sarah during live work. Perturbation Probing reveals Sarah's 40-year and 200m thresholds. Draft corpus: 47 candidates on top of 63 P0 inheritance.
Face-validity pass: 41 of 47 confirmed, 6 dropped or modified. Generalisability pass: 32 confirmed by peers, 8 held as contested (including David's dissent). Path D: Cohen's d = 1.4 on coastal-heritage. KG v1: 234 nodes, 38 production heuristics including the Chen Override.
Intent specification drafted for the pricing model. 12 hard-boundary rules via NeMo Guardrails. First integration test: Intent Engine injects top-5 heuristics into the pricing-model prompt on 200 shadow cases. Accuracy lifts 76% → 85%.
Alignment monitor live. Langfuse logs every AI call. Grafana dashboard visible to CRO. First SR 11-7 effective-challenge artefact produced automatically: 22 pages, 6 hours of engineer time. Handover training. Stage 2 scoped for months 4-7.
Week 12 scorecard. 234 graph nodes · 42 production heuristics · average confidence 0.79 · David's dissent preserved as parallel set · SS1/23 audit artefact in 6 hours · the CRO's Q2 board paper cites the nine-point accuracy uplift with provenance down to expert level.
Capture only works with willing participation. A platform that feels extractive will be refused by the best people in the firm. The incentive architecture makes participation intrinsically useful, extrinsically recognised, and financially rewarded.
Every production heuristic carries the name of the expert who originated it. The Chen Override. The Lim Framework. The Okonkwo Rule. The convention is borrowed from fields that solved the attribution problem long before software did: UpToDate credits its physician authors by name on every article; case law credits its jurists. Naming is audit trail, royalty anchor, and a senior expert's permanent authorship of a pattern the firm will use for years.
The convention is agreed at capture time. Experts can decline naming, pick a non-eponymous label, or change the name later. Once a named heuristic has been invoked in downstream decisions, attribution is frozen, because reversing it breaks the audit chain.
Every invocation of a named pattern through the activation surface is logged. Quarterly, the expert receives a royalty scaled by invocation volume and penalised where downstream outcomes were worse than baseline. Volume gaming loses to reality. The per-invocation rate is set by the firm's compensation committee — typically between £0.50 and £5 depending on decision stakes — with caps and floors negotiated as a contractual line item.
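The royalty arithmetic is simple enough to sketch. The rate, outcome scaling, upside cap, and floor/cap values below are all illustrative placeholders for the contractual terms the compensation committee sets.

```python
def quarterly_royalty(invocations: int, rate_gbp: float, outcome_score: float,
                      floor: float = 0.0, cap: float = 5000.0) -> float:
    """Invocation-scaled royalty, penalised when downstream outcomes fall
    below baseline (outcome_score < 1.0), with a capped upside and contractual
    floor/cap. All parameters here are illustrative."""
    gross = invocations * rate_gbp * min(outcome_score, 1.25)  # capped upside
    return max(floor, min(cap, gross))

# Invented quarter: 1,350 invocations of a named pattern at £1.20 each,
# with downstream outcomes 8% better than baseline.
print(round(quarterly_royalty(1350, 1.20, 1.08), 2))  # 1749.6
```

The `outcome_score` multiplier is what makes volume gaming lose to reality: flooding the graph with trivial rules drags the score, and the payment, down.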
A senior expert with five named patterns generating 300-600 invocations a month earns roughly £1K to £10K per quarter at common rates. Not transformative individually. Meaningful as recognition. Cumulatively material across a decade post-capture.
| Tier | Mechanism | What the expert gets |
|---|---|---|
| Utility (today) | Personalised Copilot | Cases that need their attention surface automatically; trivial cases defer |
| | "Why?" button auto-formats compliance documentation | Audit-ready rationale as a byproduct, not overhead |
| | Private skill analytics | Their own decision speed, accuracy, unique-strength metrics vs team baseline |
| Recognition (career) | Named patterns in the graph | Institutional authorship that persists across the career and beyond retirement |
| | "Golden Rule" leaderboard | Quality-based ranking tied to override correctness |
| | Expert Review Board appointment | Paid seat adjudicating low-confidence AI disputes firm-wide |
| Financial (direct) | Royalty pool on invocation | Per-quarter payment scaled by pattern usage and outcome quality |
| | SIA hours billed as "Strategic AI Training" | Capture sessions log as value-add hours, not overhead |
| | Promotion pillar | Tacit contribution becomes a formal KPI at Senior and Principal grade |
Anti-gaming. Rewards tie to outcome quality and novelty (reasoning materially different from cohort baseline), not to raw volume. An expert who floods the graph with trivial rules sees their quality score fall and royalty drop. An expert who contributes rare, well-grounded, repeatedly-invoked patterns sees their share rise. Outcome validation keeps the system honest.
Privacy by policy is a promise. Privacy by technology is an engineering guarantee. We ship the latter. The security architecture is designed before the capture architecture, not after. This section is the commitment a prospect's CRO, CISO, and works council can hold us to.
The Observer SDK observes senior practitioners during sensitive professional work. That is an intrusion profile closer to clinical research than enterprise software. Without deliberate architecture, the platform could become a surveillance tool, a GDPR liability, or unacceptable to works councils and trade unions. We refuse each of those outcomes by engineering.
An expert can revoke consent at any time: their captured data is marked WITHDRAWN and excluded from future retrieval; graph structure is preserved for audit continuity. No HR ticket, no internal approval required.

| Standard | Scope | Status |
|---|---|---|
| ISO/IEC 27001 | Information Security Management System | Target Q4 2026; controls in implementation |
| SOC 2 Type II | Security, Availability, Confidentiality, Privacy (AICPA) | Target Q4 2026; observation window opens at first production deployment |
| GDPR Article 25 | Privacy by Design & by Default | Architecturally aligned; DPIA per engagement |
| EU AI Act Article 10 | Data governance for high-risk AI | Architecturally aligned; Art 10 evidence pack per engagement |
| ISO/IEC 42001 | AI management system | Roadmap target 2027 |
| NIST AI RMF | Voluntary US alignment | Architecturally mapped; evidence pack for US deployments |
What this produces in diligence. A CISO asks what we have access to. The honest answer: platform telemetry, DP-protected aggregates, software update channels. Never raw behavioural data, never expert identity, never case content. The answer is identical whether the CISO is a prospect's or a regulator's. Consistency is the point.
The eight capture strategies sit on an intrusivity gradient. Any expert can opt out of any strategy. Opting out of a high-signal strategy means we need more time on others to reach comparable coverage. The trade-off is explicit.
| Strategy | Intrusivity | Captures | Does NOT capture | Opt-out consequence | Market default |
|---|---|---|---|---|---|
| 01 Observer SDK | High | Derived behavioural features (hover, click, override, hesitation, sequence) | Screen pixels, keystrokes, private browsing, off-scope apps | +50% SIA time, more Shadow cycles to compensate | EU/UK/US opt-in; works-council review in DE/NL/FR |
| 02 SIA Dialogues | Low | Scheduled dialogue transcripts + 4-channel derived features | Anything outside the session window | Minimal; lean on Shadow and Archaeology | Opt-in, expert-visible |
| 03 Shadow Sessions | High (in-session) | Screen + narration + behaviour during 60-90 min, with expert present | Anything outside the session; recording light visible | Rely on Cognitive Archaeology for rare decisions | Opt-in, session by session |
| 04 Perturbation Probing | Low | Decisions on synthesised cases | No real case data | Boundary mapping depth reduced | Opt-in |
| 05 Contrastive Elicitation | Low | Paired-case decisions and rationales | No production data | Social-contextual coverage reduced | Opt-in |
| 06 Longitudinal Drift | Medium | Behavioural patterns aggregated over time | Identifiable events post-aggregation | Drift detection manual via more SIA touchpoints | Opt-in, DP-aggregated |
| 07 Cognitive Archaeology | Low (retrospective) | Historical artefacts + retrospection | No live work; scope bounded by DPO | Historical pattern reconstruction lost | Opt-in; DPO scope required |
| 08 Contrastive Expertise | Low (aggregate) | Cohort statistics via differential privacy | No identifiable data in output | Cohort-level evidence loses one channel | Opt-in, DP-enforced |
How this reads commercially. Transparent at diligence, negotiable at sales, compliant at deployment. A firm that forbids Observer SDK can run a Shadow-plus-SIA configuration that takes about 50% longer and 30% more engineer time but reaches comparable Core coverage. A firm sensitive to multimodal perception in SIA can run audio-only with lower certainty weighting. The platform is a dial.
The knowledge graph is not single-purpose. Once the three-move system runs at a client, the same substrate serves a sequence of applications. Clients land on one, expand into three within eighteen months.
| Rank | Application | Buyer | Budget line | Meridian example |
|---|---|---|---|---|
| 01 | AI alignment & decision intelligence | CRO, CDO, Head of Responsible AI | Model risk / AI governance | Pricing-model uplift 76→85%, SS1/23 artefact in hours |
| 02 | Succession capture & continuity | CHRO, workforce planning | Critical-role retention | Sarah's 24 years preserved before Q3-27 retirement |
| 03 | Middle-management training & onboarding | L&D, learning technology | Professional development | Maya's Week-14 query inherits the Chen Override directly |
| 04 | Decision review & QA | Second-line risk, compliance | Control testing | Any UW decision scored against expert heuristics |
| 05 | Regulatory defensibility & model validation | Model risk, compliance | Regulatory, audit | Continuous SS1/23, DORA, GDPR Art 22 artefacts |
Beyond the top five. M&A due diligence (captured expertise as valuation input), talent analytics (objective decision-quality measurement), organisational learning (drift detection surfaces reasoning shifts before quarterly numbers show them), enterprise agentic systems (autonomous workflows grounded in named-expert priors). Platform logic that defends the valuation multiple.
Karan Bhandari (prior CISO, Canadian financial sector) and Jas (deep technical background including pre-LLM AI engineering). We are not selling consulting. We are building the infrastructure and taking the capital to build it faster.