Tacit Labs · Solution Framework · Confidential
Solution Framework · Technical

How we externalise expert cognition.

The architecture, the measurement, and the twelve-week mechanics, traced end to end through a live worked example at a Lloyd's commercial property syndicate.

Version 2.0  ·  April 2026  ·  Tacit Labs, confidential

00 · Thesis

Two traditions understood parts of this. No one composed them until now.

Two research lineages have each held half of the expert-cognition problem for forty years. Tacit Labs composes them.

Capture: Naturalistic Decision Making.

Gary Klein's Sources of Power (1998) and the Critical Decision Method (Klein, Calderwood & Macgregor, 1989) set the standard for how experts actually reason under time pressure. MacroCognition, Cognitive Performance Group, Perigean, and Hoffman & Militello at IHMC have refined the methodology across thirty-five years of field practice. The science is settled. The commercial translation does not exist: every engagement in the tradition ends in a narrative document.

Delivery: editorial-board decision support.

UpToDate (acquired by Wolters Kluwer for $1.1B in 2008; ~7,100 physician contributors serving 2M clinicians across 190 countries), DynaMed, BMJ Best Practice, Westlaw Edge, Practical Law. The delivery problem solved at industrial scale through named-expert authorship, graded evidence, editorial cadence, and point-of-decision integration. What they do not do is capture. Their knowledge is curated from published literature; the tacit reasoning of working practitioners is out of scope.

The composition. We take the rigour of NDM capture, the discipline of editorial-board delivery, and the measurement infrastructure that only became tractable with foundation models. No vendor in either lineage has crossed into the other's territory in thirty-five years. We build the bridge.

What sits adjacent but is not us: knowledge management (digitises what experts already wrote); generic enterprise RAG (hallucinates at 17% on the best-funded vendor, per Stanford HAI, Magesh et al. 2024); Palantir Foundry (transaction data, not cognition); classical expert systems (hit the 1980s knowledge-acquisition wall).

01 · The Governing Taxonomy

Expertise does not live in the head. It lives in ten interaction surfaces.

Most knowledge-capture frameworks start in the wrong place: the expert's skull. Distributed-cognition research (Hutchins, Cognition in the Wild, 1995; Lave & Wenger, 1991; Suchman, 1987) established decades ago that expertise is irreducibly situated in the interactions between expert and environment. Our taxonomy is built around that insight. We instrument the surfaces, not the skull.

— The Ten Interaction Surfaces of Expertise —
[Figure: the ten surfaces arranged around the expert, grouped into four tiers: Core capture (A, E, G) · Extended (B, F, I, J) · Frontier (C, H) · Derived (D).]

The ten surfaces, defined.

Surface · Tier · What it captures
A · Core · Human ↔ digital: Interaction with enterprise software: navigation, field dwell, hesitation, overrides. Where most regulated cognition now sits.
E · Core · Human ↔ time: Temporal rhythm, cadence, deliberation duration, sequence dependencies. How the expert paces cognition.
G · Core · Digital ↔ digital: Automations, scripts, saved views, notification rules the expert has configured. Their cognitive prosthetics.
D · Derived · Negative space: What the expert did not do. Fields skipped, options not clicked, questions not asked. Absence is diagnostic.
B · Extended · Human ↔ human: Colleagues consulted, trust patterns, political calibration, social signals read. The collective dimension of judgment.
F · Extended · Human ↔ self: Internal monitoring, fatigue, confidence calibration, metacognition. The expert's relationship with their own judgment.
I · Extended · Physical → digital: Translation of physical observation into digital input. Reading a property photograph, typing a survey note.
J · Extended · Digital → physical: Digital decisions that trigger physical action. Scheduling a site visit, commissioning a survey.
C · Frontier · Human ↔ physical: Embodied inspection, hands-on assessment, spatial reasoning. Requires Shadow Sessions, wearables, specialist instrumentation.
H · Frontier · Physical ↔ physical: Environmental dynamics the expert reads: tidal ranges, structural vibration, market tape, equipment signatures. Sensed via instrumented proxies.

Why this matters commercially. Capture is a tiered program, not a fixed bundle. Most engagements begin and end in Core (~65% coverage). Extended unlocks the social and metacognitive layers. Frontier is where industrial, clinical, and field-work verticals live. The taxonomy is the contract between vertical and capture cost.

02 · Three-Move Architecture

Capture. Codify. Retrieve. Preceded by Phase Zero.

The system is a pipeline with four named phases. Phase Zero seeds the graph with candidate heuristics from the client's historical record before live capture begins. The three moves that follow produce, structure, and deliver the knowledge.

— System Architecture —
[Figure: system architecture. Senior experts plus the historical record feed Phase 0, Explicit Ingest (W-2 to W0: surface mapping, abductive inference over the client legacy record, data-readiness score, candidate heuristics as priors). Move 01, Capture (W1 to W12 MVP: Observer SDK, SIA, Shadow, Perturbation, Contrastive, Drift, Archaeology, CEA; eight strategies triangulate the ten surfaces) produces raw signals. Move 02, Codify (ten-stage extraction, Cross-Strategy Synthesis, Neo4j knowledge graph, confidence plus decay, three-gate validation, dissent preserved) produces structured heuristics. Move 03, Retrieve (Context-Aware RAG, situation-matching, reasoning-chain synthesis, adaptive delivery; alignment layer on top: Intent, Drift, Audit) serves matched reasoning to AI plus junior decision-makers.]
MOVE 01

Capture

Eight strategies triangulate the ten interaction surfaces. Observer SDK and SIA do the bulk; the other six reach what those two miss. The stack is an operationalisation of Klein's Critical Decision Method at software scale.

MOVE 02

Codify

A ten-stage pipeline turns raw signals into production heuristics in a Neo4j graph. This is where generic AI vendors fall over, and this is the IP.

MOVE 03

Retrieve

Context-Aware RAG serves situation-matched reasoning chains with named attribution. Hallucination below 3%, not 17%. The alignment layer enforces a regulator-grade audit trail.

The Running Example

Meridian Syndicate.

A mid-sized Lloyd's commercial property syndicate. Names and numbers are composite from deployment telemetry patterns; the mechanics are the platform as it ships.

Profile

~40 underwriters across four teams. Stack: Guidewire PolicyCenter plus an internal ML pricing model built in 2023. GWP ~£180m. PRA-regulated under SS1/23.

The three senior experts

Sarah Chen, 24 years, coastal and heritage property specialist, retires Q3 2027. Michael Okonkwo, 19 years, industrial and warehouse. David Lim, 22 years, London commercial, holds the dissenting pricing framework. Sarah is the anchor: shortest retirement clock, highest override rate (14% vs team average 6%).

Commercial trigger

Automated pricing accuracy drops materially on coastal-heritage cases without Sarah's overrides. The PRA has asked for documented model challenge under SS1/23. Knowledge capture becomes a risk-register item with a deadline, not an HR ambition.

The engagement

12-week MVP. Target at Week 12: 200+ graph nodes, 15-25 heuristics per expert, live alignment on the pricing model, SS1/23 audit trail, David's dissent preserved as a parallel heuristic set.

— Part One —
03 · Phase Zero

Before we capture the tacit, we ingest the explicit.

Every senior expert has spent twenty years producing an explicit record: case files, memos, referral notes, committee minutes, email threads. KM systems treat this as cold storage. For us it is the starting corpus. Abductive inference over that record produces 50-100 candidate heuristics before the Observer SDK fires.

Surface mapping.

Scoping precedes deployment. Which of the ten surfaces matter for this expert and this vertical? A commercial underwriter lives primarily in surfaces A, E, G with extended reach into B and F. A field engineer lives in C and H with residual A. The surface map determines which strategies deploy, in what order, at what intensity.

A data-readiness score gates scope. Rich rationale on historical submissions gives us a lot. File notes that only record "declined" give us nothing. We grade the client's legacy data on three dimensions: outcome coverage, rationale density, artefact continuity.

Abductive inference.

Given an observed outcome, find the simplest rule that would have produced it. Peirce's 1878 formulation; tractable at scale only now that foundation models can generate hundreds of candidate hypotheses and score each against held-out outcome data.

The simplest rule that would have produced what Sarah did on 500 past cases is the best first guess at what's in her head. Score candidates on explanatory coverage, penalise complexity.
$$ \hat{H} = \arg\max_{H \in \mathcal{H}} \Bigl[ \log P(D \mid H) - \lambda \cdot \text{complexity}(H) \Bigr] $$
D is the decision record. H is a candidate rule. λ is the parsimony penalty. Top-N rules become Phase Zero candidate heuristics, queued for live validation in Capture.
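The scoring rule above can be exercised in a few lines. A toy sketch, assuming candidate rules are represented as predicate functions with a hand-assigned complexity; the likelihood approximation and all names here are illustrative, not the production scorer:

```python
import math

def score_candidate(rule, cases, lam=0.5):
    """log P(D | H), approximated by per-case agreement between the
    rule's prediction and the expert's recorded decision, minus a
    parsimony penalty lam * complexity(H)."""
    eps = 1e-6
    log_lik = 0.0
    for case in cases:
        p = 1 - eps if rule["predict"](case) == case["decision"] else eps
        log_lik += math.log(p)
    return log_lik - lam * rule["complexity"]

# Toy decision record: the expert overrides on old coastal buildings
cases = [
    {"age": 55, "coastal": True,  "decision": "override"},
    {"age": 12, "coastal": False, "decision": "accept"},
    {"age": 48, "coastal": True,  "decision": "override"},
]
candidates = [
    {"name": "always-accept", "complexity": 1,
     "predict": lambda c: "accept"},
    {"name": "old-coastal-override", "complexity": 3,
     "predict": lambda c: "override" if c["age"] > 40 and c["coastal"] else "accept"},
]
best = max(candidates, key=lambda r: score_candidate(r, cases))
print(best["name"])  # → old-coastal-override
```

The parsimony penalty matters: a complex rule that explains one extra case should not beat a simple rule that explains the record almost as well.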
Meridian · Weeks -2 to 0

Two weeks before the SDK lands, we ingest five years of each senior underwriter's decision record: 3,847 submissions, 412 structured underwriting memos, 1,240 referral threads, claims data linked by submission ID. Data-readiness score: 0.81. Gaps flagged: sparse rationale on 2021 submissions (systems migration), no committee minutes pre-2022. Scope adjusted.

Abductive pass produces 847 candidate rules. Causal validation on matched historical cohorts filters to 63 with explanatory coverage above threshold. Week 0 corpus: 63 ranked candidates attributed to source expert, queued for live validation. The CRO sees the first ten in Week 1 — value delivered before the SDK has fired an event.

— Part Two —
04 · Capture

Eight strategies. Ten surfaces. The matrix that shows how they intertwine.

No single strategy reaches a full surface. Each is triangulated by two to four strategies that observe it from different angles. The matrix is the commercial and technical heart of the platform. It answers the question every serious buyer asks: how do these strategies actually join forces to deliver the outcome?

The Strategy × Surface Matrix.

Read down a column to see which strategies triangulate each surface. Read across a row to see which surfaces a single strategy reaches. primary · partial · minimal.

[Matrix: rows are the eight strategies (01 Observer SDK, 02 SIA Dialogues, 03 Shadow Sessions, 04 Perturbation Probing, 05 Contrastive Elicitation, 06 Longitudinal Drift, 07 Cognitive Archaeology, 08 Contrastive Expertise); columns are the ten surfaces (A Human↔digital, E Human↔time, G Digital↔digital, D negative space, B Human↔human, F Human↔self, I Physical→digital, J Digital→physical, C Human↔physical, H Physical↔physical); each cell graded primary, partial, or minimal.]

Triangulation in practice, Human↔time (E) as the example. Observer SDK tracks the rhythm (median deliberation, hesitation clusters, sequence dependencies). Longitudinal Drift tracks how that rhythm shifts across months and market cycles. Cognitive Archaeology reconstructs rhythm from pre-SDK historical records. SIA Reasoning Excavation probes the why behind the order: "Why always flood zone before building age?" Four strategies produce the temporal model; no single one could. Same pattern across every surface.

The eight strategies in operation.

Three worked in depth because they do seventy percent of the work and represent the three archetypes. Five compressed.

01
PASSIVE · CONTINUOUS · ZERO EXPERT EFFORT
Observer SDK
A · E · G · D

A Chrome MV3 extension and a macOS pyobjc daemon, deployed through enterprise device management. The SDK captures a fixed schema of behavioural events in the background without action from the expert. Technical foundation: rrweb (originally built for session replay) repurposed for expert behaviour analysis, layered on OpenTelemetry for transport and Redpanda for event streaming. Every event carries a SHA-256-hashed expert ID, session ID, timestamp, type, surface mapping, payload, and consent flag.

Install: Day 1 · device management push
Volume: 8K-15K events per expert per week
Lineage: HCI behavioural telemetry · 40+ yrs
Limits: blind to reasoning, social, embodied
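A minimal sketch of the fixed event schema described above. Field names and event types here are illustrative assumptions; the shipped SDK schema is richer, but every event carries the same seven elements:

```python
import hashlib, json, time
from dataclasses import dataclass, field, asdict

def hash_expert_id(raw_id: str, salt: str = "deployment-salt") -> str:
    # SHA-256 pseudonymisation: raw identity never leaves the device
    return hashlib.sha256((salt + raw_id).encode()).hexdigest()

@dataclass
class ObserverEvent:
    expert_id: str            # SHA-256 hash, never the raw identity
    session_id: str
    timestamp: float
    event_type: str           # e.g. FIELD_HOVER, OVERRIDE, BACKTRACK
    surface: str              # taxonomy surface: A, E, G or D
    payload: dict = field(default_factory=dict)
    consent: bool = True      # consent flag travels with every event

event = ObserverEvent(
    expert_id=hash_expert_id("s.chen"),      # hypothetical identifier
    session_id="sess-4821",
    timestamp=time.time(),
    event_type="FIELD_HOVER",
    surface="A",
    payload={"field": "flood_zone", "dwell_ms": 11200},
)
print(json.dumps(asdict(event))[:60], "...")
```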
Meridian · Weeks 1-2

Tuesday of Week 1: SDK deployed to Sarah, Michael, David through Meridian's Google Workspace. First events land at the ingestion endpoint within four hours. By Friday, 52,000 events across three experts. Welford accumulators populate baselines: Sarah's median deliberation time 4m 23s per case, override rate 14%, hesitation clusters on "flood zone" and "building age" at 8-15 seconds.

By Week 2, PrefixSpan over event sequences flags the pattern Phase Zero proposed as a candidate: whenever Sarah opens a submission with building age above 40 and a postcode in the coastal set, she hesitates, backtracks, overrides the automated price upward by 8-12%. P0 produced the hypothesis; Path A confirms it with live behavioural evidence. The heuristic moves from candidate to staged.
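The Welford accumulators mentioned above are a standard one-pass algorithm (Welford, 1962): baselines update per event with no raw-event storage needed on the ingestion path. A minimal version:

```python
class Welford:
    """Online mean/variance: one pass, O(1) memory per marker."""
    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

# Deliberation times (seconds) stream in from the SDK; values illustrative
w = Welford()
for t in [240, 310, 263, 205, 290]:
    w.update(t)
print(round(w.mean, 1))  # → 261.6, the running baseline
```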

02
ACTIVE · SCHEDULED · MODERATE EFFORT
SIA Structured Dialogues
B · F · D

A Socially Interactive Agent runs scheduled 15-25 minute sessions, two or three times per month per expert. Real-time avatar (Ready Player Me mesh, NVIDIA Audio2Face lip sync) fronts a GPT-4o or Claude backbone with LlamaIndex + pgvector for retrieval. The five-phase protocol is a direct operationalisation of the Critical Decision Method (Klein, Calderwood & Macgregor, 1989): warm-up, process tracing, reasoning excavation, boundary probing, social-relational.

Four perception channels fuse into a Certainty Index that gates the agent's next question. Whisper large-v3 transcribes verbal content. Prosody analysis (fundamental frequency, jitter, shimmer) detects hesitation. MediaPipe face mesh with AU detection reads micro-expressions. The SDK feed supplies same-hour behavioural context.

Cadence: 2-3x per month per expert
Duration: 15-25 min per session
Lineage: CDM (Klein 1989) · CTA (Hoffman 2008)
Limits: expert-time gated; articulation-bound
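The Certainty Index fusion can be sketched as a weighted combination of the four channel scores. The weights and the 0.7 gate below are illustrative assumptions, not the shipped calibration:

```python
def certainty_index(verbal, prosody, facial, behavioural,
                    weights=(0.4, 0.25, 0.2, 0.15)):
    """Each channel reports agreement in [0, 1]; prosodic hesitation
    lowers its channel score. Weights are assumed for illustration."""
    channels = (verbal, prosody, facial, behavioural)
    return sum(w * c for w, c in zip(weights, channels))

# Channel scores for a hypothetical answer: confident words, hesitant voice
ci = certainty_index(verbal=0.9, prosody=0.5, facial=0.8, behavioural=0.85)
if ci < 0.7:
    next_move = "boundary probe"       # low certainty: dig into the exception
else:
    next_move = "advance protocol phase"
print(round(ci, 3), next_move)
```

The point of the gate is that the agent's next question is chosen by fused perception, not by verbal content alone.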
Meridian · Week 5 · SIA session with Sarah

Monday 11:00. Warm-up (2 min). Process tracing on case #4821: "Walk me through what you did last Thursday." Reasoning excavation on the flagged override: "What made you add 11%?" Sarah: "Building's 1960s, on the Norfolk coast. Asbestos cladding on old builds that age; subsidence risk from salt exposure the model doesn't price." Boundary probe: "When would that rule not apply?" Sarah: "If there's a structural survey within two years." Certainty Index: 0.83 on the core rule; 0.61 on the exception (prosody shows hesitation). A perturbation probe schedules for Week 6 to sharpen the exception.

03
ACTIVE · LIVE · HIGHEST FIDELITY
Shadow Session Protocol
A · I · C

A field-deployed engineer sits with the expert for 60-90 minutes while the expert narrates their work on live or recent cases. The FDE timestamps narration against the SDK event stream. The output is a dense sequence of causally-linked observation-explanation pairs. Grounded in concurrent verbal protocol (Ericsson & Simon, Protocol Analysis, 1984) aligned to behavioural telemetry.

Meridian · Week 3

90 minutes with Sarah across four live cases. 287 narration-timestamp pairs captured, aligned to 2,412 behavioural events. Reveals a cue the SDK alone would have missed: Sarah pulls up the property photograph on three of four cases and specifically looks for salt staining on cladding joints. Surface I (physical→digital) instrumented explicitly. Pattern mining backtests across 47 prior cases; salt-staining presence correlates with her overrides at r = 0.71. New heuristic candidate, only reachable through behavioural-narrative alignment.

04
ACTIVE · BOUNDARY MAPPING
Perturbation Probing
A · D

Controlled variations of real cases synthesised and presented to the expert. Five sub-types: single-variable, contextual, threshold, counterfactual, adversarial. Surfaces decision boundaries the expert cannot articulate in conversation but reveals through differential action. Grounded in Kahneman & Klein (2009) on boundary conditions, and active learning from ML.

Meridian · Week 6

20 synthetic cases vary building age and coastal distance on a grid. Sarah's response surface reveals sharp thresholds at exactly 40 years of age and 200m from the coastline: information Path A could not surface on its own, because real cases do not cluster at decision boundaries.
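A one-dimensional sketch of how a response surface over synthetic perturbations yields a threshold: find the cut that best separates override behaviour. The grid and rule here are illustrative:

```python
def find_threshold(points):
    """points: (value, overridden?) pairs from synthetic perturbations.
    Returns the cut that maximises the gap in override rate between
    the two sides: a 1-D stand-in for decision-boundary mapping."""
    xs = sorted({v for v, _ in points})
    best_cut, best_gap = None, -1.0
    for cut in xs:
        below = [o for v, o in points if v <= cut]
        above = [o for v, o in points if v > cut]
        if not below or not above:
            continue
        gap = abs(sum(above) / len(above) - sum(below) / len(below))
        if gap > best_gap:
            best_gap, best_cut = gap, cut
    return best_cut

# Synthetic grid over building age: overrides switch on above 40
grid = [(age, int(age > 40)) for age in range(20, 61, 5)]
print(find_threshold(grid))  # → 40
```

The production probe does the same thing over a multi-dimensional grid, which is why it can place thresholds real-case telemetry never reveals.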

05
ACTIVE · PAIRED COMPARISON
Contrastive Elicitation
B · F

Two cases that look similar but that the expert handles differently. The gap between what the expert notices and what a novice would notice is the tacit knowledge. Kahneman's principle: knowledge activates in response to specific stimuli, not abstract questions.

Meridian · Week 7

Sarah sees Case A and Case B, built near-identical on all model features. "A is straightforward. B I'd decline. Tenant covenant — retail sector, hollowing out post-2024." A junior couldn't separate them. New social-contextual heuristic captured.

06
PASSIVE · TEMPORAL
Longitudinal Drift Analysis
E · D

Continuous background analysis of behaviour change over months and years. Surfaces market-condition sensitivity, fatigue, learning adaptation. River streaming ML (CluStream + ADWIN) over the Observer stream; change-point detection via Bayesian online change-point methods.

Meridian · Month 6

Drift detector flags a shift in Sarah's pricing for commercial retail since October 2025 (Cohen's d = 0.72). Overrides widened from +5% to +11%. Hypothesis: cascading retail tenant bankruptcies after 2024 consumer credit contraction. Old heuristic decays; new candidate enters Path B probing.
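An ADWIN-style sketch of the drift check, not the River implementation: compare the mean of the older and newer halves of a sliding window and flag when they diverge past a threshold. Window size, threshold, and the stream values are illustrative:

```python
from collections import deque

def drift_flag(stream, window=30, threshold=4.0):
    """Two-window mean-shift detector over a stream of override
    uplifts (in percentage points)."""
    buf = deque(maxlen=window)
    flags = []
    for i, x in enumerate(stream):
        buf.append(x)
        if len(buf) == window:
            half = window // 2
            items = list(buf)
            old = sum(items[:half]) / half   # older half of the window
            new = sum(items[half:]) / half   # newer half of the window
            flags.append((i, abs(new - old) > threshold))
    return flags

# Uplift drifts from ~+5% to ~+11% mid-stream
stream = [5] * 40 + [11] * 40
detected = [i for i, f in drift_flag(stream) if f]
print(detected[0])  # first event index at which the shift is flagged
```

Real change-point methods (Bayesian online change-point detection, ADWIN's adaptive window) add statistical guarantees; the mechanic of old-versus-new comparison is the same.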

07
RETROSPECTIVE · ARTEFACT-BASED
Cognitive Archaeology
B · J

Reconstructs past decisions from artefacts — underwriting files, emails, system logs — combined with expert retrospection. Four steps: artefact collection, timeline reconstruction, expert retrospection, counterfactual probing. Essential for rare, high-stakes decisions the SDK will never see enough of. Borrowed from CTA practice (Crandall, Klein & Hoffman, Working Minds, 2006).

08
PASSIVE · COHORT · COMPARATIVE
Contrastive Expertise Analysis
All surfaces · cohort level

Cohen's d effect size between expert cohort and novice cohort on the same markers. Surfaces the dimensions where experts materially differ from novice baseline, prioritises those for deeper capture, gives the CFO the ROI evidence.

A gap between two groups means more when each group is tight around its own average and far from the other. Cohen's d is the distance between averages in units of their pooled noisiness.
$$ d = \frac{\mu_\text{expert} - \mu_\text{novice}}{\sigma_\text{pooled}}, \qquad \sigma_\text{pooled} = \sqrt{\tfrac{(n_e - 1)s_e^2 + (n_n - 1)s_n^2}{n_e + n_n - 2}} $$
Cohen (1988): d > 0.8 is "large", d > 1.2 is where judgment is genuinely irreplicable and capture investment pays back fastest. Meridian: d = 1.4 on coastal-heritage, d = 0.3 on standard urban office. Path D points the capture effort.
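The formula above, directly in code. The cohort values below are illustrative, not Meridian telemetry:

```python
import math

def cohens_d(expert, novice):
    """Effect size between two cohorts, pooled-SD denominator."""
    ne, nn = len(expert), len(novice)
    me = sum(expert) / ne
    mn = sum(novice) / nn
    se2 = sum((x - me) ** 2 for x in expert) / (ne - 1)  # sample variance
    sn2 = sum((x - mn) ** 2 for x in novice) / (nn - 1)
    pooled = math.sqrt(((ne - 1) * se2 + (nn - 1) * sn2) / (ne + nn - 2))
    return (me - mn) / pooled

# Hypothetical marker: override correctness per cohort
expert = [0.82, 0.88, 0.85, 0.90, 0.86]
novice = [0.61, 0.70, 0.66, 0.64, 0.68]
d = cohens_d(expert, novice)
print(round(d, 2))  # well past the d > 1.2 capture-priority threshold
```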
04.1 · Fusion

Eight signals. One heuristic. Dissent preserved.

When multiple strategies speak to the same marker, their confidences fuse into one. The choice of fusion function is an architectural commitment because it determines whether dissent survives or disappears.

If two independent methods each say a rule is true, confidence should rise above either alone. Noisy-OR is the correct way to combine independent evidence. Averaging would wash out the signal.
$$ \text{conf}(M) = 1 - \prod_{i \in \text{active strategies}} \bigl( 1 - \text{conf}_i(M) \bigr) $$
Three strategies each returning 0.7 fuse to 1 − 0.3³ = 0.973. Triangulation earns high confidence without claiming any single channel is perfect. Pearl, Probabilistic Reasoning in Intelligent Systems, 1988.
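The fusion rule in code, reproducing the worked number from the text:

```python
def noisy_or(confidences):
    """Fuse independent strategy confidences: 1 - prod(1 - c_i).
    Any single confident channel lifts the fused value; agreement
    across channels compounds it."""
    result = 1.0
    for c in confidences:
        result *= (1.0 - c)
    return 1.0 - result

fused = noisy_or([0.7, 0.7, 0.7])
print(round(fused, 3))  # → 0.973, i.e. 1 - 0.3^3
```

Compare averaging, which would return 0.7 and throw away the triangulation signal entirely.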

Each heuristic's final confidence is a weighted composite of four evidence dimensions. This is the object regulators read when they ask how confident a heuristic is, and why.

A heuristic earns trust for four reasons: plausible causal story, other experts agree, outcomes validated it, capture was thorough. The confidence score is a weighted sum of those four.
$$ \text{conf}(H) = w_1 \cdot \text{causal\_strength} + w_2 \cdot \text{agreement} + w_3 \cdot \text{outcome\_corr} + w_4 \cdot \text{capture\_fidelity} $$
$$ w_1 = 0.35,\ w_2 = 0.25,\ w_3 = 0.25,\ w_4 = 0.15 $$
Causal strength dominates because a heuristic without causal grounding is coincidence. Outcome correlation is capped at 0.25 because outcome data is biased and lagging. Capture fidelity is lightest because it is the easiest to game.
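The composite, instantiated with the Chen Override evidence values from the codification walkthrough later in this document:

```python
WEIGHTS = {"causal_strength": 0.35, "agreement": 0.25,
           "outcome_corr": 0.25, "capture_fidelity": 0.15}

def heuristic_confidence(evidence: dict) -> float:
    """Weighted composite over the four evidence dimensions."""
    return sum(WEIGHTS[k] * evidence[k] for k in WEIGHTS)

# The Chen Override: causal 0.82, agreement 0.67, outcome 0.76, fidelity 0.91
chen = {"causal_strength": 0.82, "agreement": 0.67,
        "outcome_corr": 0.76, "capture_fidelity": 0.91}
conf = heuristic_confidence(chen)
print(round(conf, 2))  # → 0.78, above the 0.70 production threshold
```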

Conflict protocol.

When strategies disagree on a marker, the engine does not average. It flags the conflict, stores both versions with full provenance, and triggers a targeted SIA dialogue. Dissent is a first-class object in the graph.

// The two edge types that define the architecture
(:Heuristic)-[:CONFLICTS_WITH]->(:Heuristic)
(:Heuristic)-[:SUPERSEDES_IN_CONTEXT {context}]->(:Heuristic)
Meridian · Week 6

Sarah's coastal-override heuristic is cross-checked with Michael (agrees) and David (dissents; prices coastal risk through upstream reinsurance adjustment). The engine stores H-001 (Sarah + Michael, conf 0.78) and H-002 (David, conf 0.71) connected by a CONFLICTS_WITH edge with context annotation. A junior querying the graph sees both with attribution. This is the feature that makes the platform defensible under SR 11-7.

— Part Three —
05 · Codify

Ten stages from raw event to graph insertion.

Capture produces signals. Codification turns them into production heuristics with confidence, provenance, and a full audit chain. Rule-based scaffolding today, progressively upgraded to statistical ML as data volume accrues.

The Chen Override, traced through ten stages.

01 · Assembly

847 raw events from Sarah's 3-hour work session on file #4821 grouped by session ID and temporal proximity. One Session object with aggregated features.

02 · Mining

PrefixSpan over the event sequence discovers: OPEN → HOVER_FLOOD(8-15s) → HOVER_AGE(3-8s) → BACKTRACK → OVERRIDE_UP appears in 23 of 31 sessions with age > 40 and coastal postcode. Support 0.74.

03 · Baseline

Welford accumulators: override rate on coastal-old cases is 74% against Sarah's overall 14%. Z-score 8.3.

04 · Anomaly

LSTM autoencoder reconstruction loss is low on this pattern. It is systematic, not noise. Flagged for promotion.

05 · Induction

CART produces the interpretable rule: IF age > 40 AND coastal AND model_conf < 0.85 THEN override ∈ [+0.08, +0.12]. Tree depth 3, entropy gain 0.67.

06 · Causal

DoWhy runs matched counterfactual on historical cohort. Override-group loss ratio 52% vs matched non-override 71%. Nineteen-point delta survives propensity matching and placebo tests. Causal strength 0.82.

07 · Context

spaCy NER with domain ontology tags: LOB commercial property, peril flood+age+subsidence, market UK, regulator PRA. Context nodes linked.

08 · Social

Michael's data confirms. David dissents. Agreement score 0.67. Dissent preserved as CONFLICTS_WITH edge to David's alternative.

09 · Confidence

Weighted composite: 0.35 × 0.82 + 0.25 × 0.67 + 0.25 × 0.76 + 0.15 × 0.91 = 0.78. Above the 0.70 production threshold.

10 · Graph

Cypher insertion creates node "The Chen Override". Provenance chain links source expert, sessions, conditions, action, context, capture strategies, three-gate validation status, and the conflict edge to David's alternative. Status: production.
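A sketch of the stage-10 insertion payload. Node properties and the ORIGINATED_BY relationship are illustrative assumptions beyond the two edge types defined earlier; the production schema is richer. Executing against a live graph would use the official neo4j Python driver, left commented here so the sketch stands alone:

```python
heuristic = {
    "name": "The Chen Override",
    "expert": "Sarah Chen",
    "rule": "IF age > 40 AND coastal AND model_conf < 0.85 "
            "THEN override in [+0.08, +0.12]",
    "confidence": 0.78,
    "status": "production",
}

# Parameterised Cypher: MERGE is idempotent, so re-running the
# pipeline cannot duplicate the node or its provenance edge.
cypher = """
MERGE (h:Heuristic {name: $name})
SET h.rule = $rule, h.confidence = $confidence, h.status = $status
MERGE (e:Expert {name: $expert})
MERGE (h)-[:ORIGINATED_BY]->(e)
"""

# from neo4j import GraphDatabase
# with GraphDatabase.driver(uri, auth=auth).session() as s:
#     s.run(cypher, **heuristic)  # provenance edges added by further MERGEs
print(heuristic["status"])
```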

Three-gate heuristic validation.

Gate 01 · Face validity · Source expert confirms: "Is this what you do?" · Lineage: Nunnally, psychometrics, 1978
Gate 02 · Generalisability · Peer experts asked: "Would you do this?" · Lineage: Cochrane systematic review practice
Gate 03 · Predictive validity · Backtest against historical outcomes · Lineage: GRADE (Guyatt et al., 2008)

Experiential decay.

Knowledge that is not reinforced should lose credibility over time. Knowledge that keeps being validated should hold. A half-life model per heuristic type.
$$ \text{conf}(H, t) = \text{conf}_0(H) \cdot e^{-\lambda_H \cdot t_\text{since\_last\_reinforcement}} $$
λ varies by type. Procedural heuristics decay slowly (half-life ~2 years). Contextual heuristics decay fast (~6 months) because market conditions shift. The system flags heuristics below threshold for re-elicitation.
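The half-life model in code, with λ derived from the per-type half-life as λ = ln 2 / t½. The half-life values follow the text; the type names are as given above:

```python
import math

HALF_LIFE_DAYS = {"procedural": 730, "contextual": 180}  # ~2 yrs, ~6 months

def decayed_confidence(conf0, htype, days_since_reinforcement):
    """Exponential decay of a heuristic's confidence since its
    last reinforcement; lam = ln(2) / half-life."""
    lam = math.log(2) / HALF_LIFE_DAYS[htype]
    return conf0 * math.exp(-lam * days_since_reinforcement)

# A contextual heuristic unreinforced for six months halves in confidence
c = decayed_confidence(0.78, "contextual", 180)
print(round(c, 2))  # → 0.39, flagged for re-elicitation below threshold
```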
— Part Four —
06 · Retrieve

Context-Aware RAG. Situation-matched reasoning, not document snippets.

Generic enterprise RAG retrieves documents by keyword similarity and generates on top of them. It hallucinates at 17% even on the best-funded legal vendor (Stanford HAI, 2024). Our retrieval layer retrieves by situation similarity instead, and generation is constrained to the retrieved reasoning chain with post-generation verification.

— Context-Aware RAG · Three-Stage Retrieval —
[Figure: three-stage retrieval. A junior query (question + case context + user-role metadata) passes through 01 Situation Classification (bi-encoder trained on contrastive situation pairs, matched against the schema library; output: ranked situation types, e.g. "heritage coastal commercial"); 02 Multi-Index Retrieval (parallel pulls from schema templates in the graph, relevant cases in the vector store, decision rules in the rule engine, expert commentary in the doc store); 03 Cross-Source Re-ranking (cross-encoder scores all retrieved items against the full user context; ranks by confidence, recency, validation status, multi-dimensional context fit). Output: reasoning-chain synthesis with five-step reconstruction, named attribution, dissent register, confidence grading, and post-generation verification.]

Situation-matching embeddings.

Standard text embeddings encode words. For tacit knowledge retrieval we need embeddings that encode situations. Two submissions with very different wording may be the same underwriting problem. Two with near-identical wording may be different problems. Standard embeddings miss the distinction.

Train a model on pairs: descriptions the expert thinks of as the same kind of case get pushed close in vector space; descriptions that look similar but get handled differently get pushed apart. InfoNCE is the standard loss.
$$ \mathcal{L}_\text{InfoNCE} = -\log \frac{\exp(\text{sim}(q, k^+)/\tau)}{\exp(\text{sim}(q, k^+)/\tau) + \sum_{k^-} \exp(\text{sim}(q, k^-)/\tau)} $$
q is the query, k⁺ the positive pair from the expert schema library, k⁻ the hard negatives (look similar, functionally different), τ ≈ 0.07. Bi-encoder fine-tuned from a strong general embedding. Lineage: van den Oord et al., Representation Learning with Contrastive Predictive Coding, 2018.
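The loss for a single (query, positive, negatives) triple, computed over precomputed similarity scores. The similarity values below are illustrative:

```python
import math

def info_nce(sim_pos, sim_negs, tau=0.07):
    """InfoNCE over precomputed similarities: softmax cross-entropy
    that rewards sim(q, k+) and punishes high sim(q, k-)."""
    num = math.exp(sim_pos / tau)
    denom = num + sum(math.exp(s / tau) for s in sim_negs)
    return -math.log(num / denom)

# Well-separated: positive close to the query, hard negatives pushed away
good = info_nce(sim_pos=0.9, sim_negs=[0.2, 0.1])
# Collapsed: a hard negative scores as high as the positive
bad = info_nce(sim_pos=0.9, sim_negs=[0.9, 0.1])
print(good < bad)  # → True: training drives toward the first regime
```

This is why hard negatives matter: submissions that look alike but are handled differently contribute most of the gradient.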
Meridian · Week 14 · Maya's first query

Maya, second-year underwriter: new submission, commercial property, 52-year-old building in Wells-next-the-Sea, model quotes £18,400. She types into the activation surface: "Is this a straightforward price or does this need a senior eye?"

Stage 1: bi-encoder matches to "heritage coastal commercial, 40+ years" at 0.89 confidence. Stage 2: parallel pulls surface the Chen Override (H-001), the Lim Framework (H-002), seven similar historical cases, three rules, two SIA transcripts. Stage 3: cross-encoder reranks. H-001 leads on freshness, 2-of-3 agreement, and 0.78 confidence. H-002 surfaces as "Alternative view" one click away.

Reasoning chain renders: "Based on building age (52) and coastal postcode, this matches what your senior underwriters call heritage coastal commercial. It differs from standard commercial property because of asbestos-cladding and subsidence exposure the automated model does not price. Sarah Chen uses the Chen Override here: adds 8-12% to the automated price. 47 similar cases; 19-point loss-ratio improvement. Exception: if there is a structural survey within two years, the model is more reliable. Alternative: David Lim uses reinsurance-cost adjustment upstream (tap for the Lim Framework). Attribution: Sarah Chen, 24 yrs coastal; Michael Okonkwo concurs."

Adaptive delivery.

User level · Delivery style · Content focus
Novice · Full reasoning chain, step by step · Schemas, common situations, defaults; explain the why
Intermediate · Key distinctions, exceptions · Edge cases, cross-expert views; skip the basics
Peer · Raw cases, contradictions · Rare cases, unresolved disagreements; maximum density

Alignment layer.

Component · Function · Backing · Regulatory anchor
Intent Engine · Injects heuristics into model calls as prompt context · LlamaIndex + LiteLLM · SR 11-7 conceptual soundness
Guardrails · Translates hard rules into enforced constraints · NeMo Guardrails (NVIDIA) · EU AI Act Art 14
Drift Detection · Monitors AI-expert divergence · Evidently AI · DORA operational resilience
Override Monitor · Closed loop: every override re-enters the pipeline · Langfuse + internal · SR 11-7 effective challenge
Audit Trail · Input → heuristics → output → override → reasoning · Langfuse + Grafana · GDPR Art 22, CFPB 2023-03

Capture is where every AI company claims novelty. Codification is where the IP lives. Retrieval is where the customer feels it.

— Part Five —
07 · Deployment · Grounded

Phase Zero, then twelve weeks.

An operational trace of what happens between Week minus-two and Week twelve at Meridian. Field engineer on site for the first two weeks; remote with fortnightly visits thereafter.

W -2 to W 0

Phase Zero. Surface mapping with CDO and Head of Underwriting. Data-readiness 0.81. Five years of decision records ingested. Abductive inference produces 63 candidate heuristics. CRO sees the first ten in Week 1.

W 1-2

Monday W1: kickoff, consent signed, SDK deployed through Google Workspace admin. Thursday W1: 15,000 events through ingestion. W2: Shadow Session 1 with Sarah (90 min). Pattern engine live. P0 heuristics begin validation.

W 3-4

Shadow Session 2 (Sarah), 1 each (Michael, David). Cognitive Archaeology on the 2019 Norfolk flood claim. 287 narration-timestamp pairs from Sarah. Salt-staining cue instrumented on surface I.

W 5-6

SIA Session 1 per senior. Path B triggered four times on Sarah during live work. Perturbation Probing reveals Sarah's 40-year and 200m thresholds. Draft corpus: 47 candidates on top of 63 P0 inheritance.

W 7-8

Face-validity pass: 41 of 47 confirmed, 6 dropped or modified. Generalisability pass: 32 confirmed by peers, 8 held as contested (including David's dissent). Path D: Cohen's d = 1.4 on coastal-heritage. KG v1: 234 nodes, 38 production heuristics including the Chen Override.

W 9-10

Intent specification drafted for the pricing model. 12 hard-boundary rules via NeMo Guardrails. First integration test: Intent Engine injects top-5 heuristics into the pricing-model prompt on 200 shadow cases. Accuracy lifts 76% → 85%.

W 11-12

Alignment monitor live. Langfuse logs every AI call. Grafana dashboard visible to CRO. First SR 11-7 effective-challenge artefact produced automatically: 22 pages, 6 hours of engineer time. Handover training. Stage 2 scoped for months 4-7.

Week 12 scorecard. 234 graph nodes · 42 production heuristics · average confidence 0.79 · David's dissent preserved as parallel set · SS1/23 audit artefact in 6 hours · the CRO's Q2 board paper cites the nine-point accuracy uplift with provenance down to expert level.

— Part Six —
08 · Named Patterns & Expert Incentives

The legacy mechanic, and why senior experts participate.

Capture only works with willing participation. A platform that feels extractive will be refused by the best people in the firm. The incentive architecture makes participation intrinsically useful, extrinsically recognised, and financially rewarded.

Named patterns.

Every production heuristic carries the name of the expert who originated it. The Chen Override. The Lim Framework. The Okonkwo Rule. The convention is borrowed from fields that solved the attribution problem long before software did: UpToDate credits its physician authors by name on every article; case law credits its jurists. Naming is audit trail, royalty anchor, and a senior expert's permanent authorship of a pattern the firm will use for years.

The convention is agreed at capture time. Experts can decline naming, pick a non-eponymous label, or change the name later. Once a named heuristic has been invoked in downstream decisions, attribution is frozen, because reversing it breaks the audit chain.

Royalty mechanic.

Every invocation of a named pattern through the activation surface is logged. Quarterly, the expert receives a royalty scaled by invocation volume and penalised where downstream outcomes were worse than baseline. Volume gaming loses to reality. The per-invocation rate is set by the firm's compensation committee — typically between £0.50 and £5 depending on decision stakes — with caps and floors negotiated as a contractual line item.

A senior expert with five named patterns generating 300-600 invocations a month earns roughly £1K to £10K per quarter at common rates. Not transformative individually. Meaningful as recognition. Cumulatively material across a decade of post-capture use.
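The mechanic reduces to a short formula. A minimal sketch, assuming a simple multiplicative outcome penalty — the real scaling, caps, and floors are whatever the compensation committee negotiates:

```python
def quarterly_royalty(invocations: int, rate_gbp: float, outcome_delta: float,
                      floor_gbp: float = 0.0, cap_gbp: float = 10_000.0) -> float:
    """Illustrative royalty: invocation volume x per-use rate, scaled down
    when downstream outcomes trail baseline.

    outcome_delta: mean downstream outcome vs baseline (e.g. -0.5 means
    outcomes ran 50% worse). Negative deltas shrink the payout, so volume
    gaming loses to reality; positive deltas simply pay in full.
    """
    penalty = 1.0 if outcome_delta >= 0 else max(0.0, 1.0 + outcome_delta)
    gross = invocations * rate_gbp * penalty
    return round(min(cap_gbp, max(floor_gbp, gross)), 2)
```

At 450 invocations a month (1,350 a quarter), a £2 rate, and outcomes at or above baseline, this pays £2,700 for the quarter — inside the band quoted above.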

Three-tier incentive framework.

| Tier | Mechanism | What the expert gets |
| --- | --- | --- |
| Utility (today) | Personalised Copilot | Cases that need their attention surface automatically; trivial cases defer |
| Utility (today) | "Why?" button auto-formats compliance documentation | Audit-ready rationale as a byproduct, not overhead |
| Utility (today) | Private skill analytics | Their own decision speed, accuracy, and unique-strength metrics vs team baseline |
| Recognition (career) | Named patterns in the graph | Institutional authorship that persists across the career and beyond retirement |
| Recognition (career) | "Golden Rule" leaderboard | Quality-based ranking tied to override correctness |
| Recognition (career) | Expert Review Board appointment | Paid seat adjudicating low-confidence AI disputes firm-wide |
| Financial (direct) | Royalty pool on invocation | Per-quarter payment scaled by pattern usage and outcome quality |
| Financial (direct) | SIA hours billed as "Strategic AI Training" | Capture sessions log as value-add hours, not overhead |
| Financial (direct) | Promotion pillar | Tacit contribution becomes a formal KPI at Senior and Principal grade |

Anti-gaming. Rewards tie to outcome quality and novelty (reasoning materially different from cohort baseline), not to raw volume. An expert who floods the graph with trivial rules sees their quality score fall and royalty drop. An expert who contributes rare, well-grounded, repeatedly-invoked patterns sees their share rise. Outcome validation keeps the system honest.
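One way to make that incentive shape concrete — a hypothetical weighting, not the production scoring function — is to let quality and novelty multiply while volume enters only logarithmically:

```python
import math

def pattern_weight(outcome_quality: float, novelty: float, invocations: int) -> float:
    """Illustrative royalty-share weight. Quality (outcome validation) and
    novelty (distance from cohort-baseline reasoning) multiply; invocation
    volume contributes only with diminishing returns, so flooding the graph
    with trivial rules cannot raise an expert's share."""
    return outcome_quality * novelty * math.log1p(invocations)

# A rare, well-grounded, repeatedly-invoked pattern outweighs a flood of
# trivial ones (all figures hypothetical):
rare = pattern_weight(outcome_quality=0.9, novelty=0.8, invocations=50)
flood = pattern_weight(outcome_quality=0.4, novelty=0.1, invocations=500)
```

Tenfold the invocations cannot compensate for low quality and low novelty: the flood strategy still scores a fraction of the rare-pattern strategy.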

— Part Seven —
09 · Security & Privacy

By technology, not policy.

Privacy by policy is a promise. Privacy by technology is an engineering guarantee. We ship the latter. The security architecture is designed before the capture architecture, not after. This section is the commitment a prospect's CRO, CISO, and works council can hold us to.

Why this matters more here than in most enterprise AI.

The Observer SDK observes senior practitioners during sensitive professional work. That is an intrusion profile closer to clinical research than enterprise software. Without deliberate architecture, the platform could become a surveillance tool, a GDPR liability, or unacceptable to works councils and trade unions. We refuse each of those outcomes by engineering.

Architectural commitments.

Standards and certifications.

| Standard | Scope | Status |
| --- | --- | --- |
| ISO/IEC 27001 | Information Security Management System | Target Q4 2026; controls in implementation |
| SOC 2 Type II | Security, Availability, Confidentiality, Privacy (AICPA) | Target Q4 2026; observation window opens at first production deployment |
| GDPR Article 25 | Privacy by Design & by Default | Architecturally aligned; DPIA per engagement |
| EU AI Act Article 10 | Data governance for high-risk AI | Architecturally aligned; Art 10 evidence pack per engagement |
| ISO/IEC 42001 | AI management system | Roadmap target 2027 |
| NIST AI RMF | Voluntary US alignment | Architecturally mapped; evidence pack for US deployments |

Data-flow privacy boundaries at Meridian.

— Data-Flow Privacy Boundaries —
BOUNDARY · EXPERT DEVICE
- Raw input · LOCAL ONLY: keystrokes, pixels, clipboard, browsing
- Local feature extractor: hesitation, override, nav sequence
- Local encryptor: AES-256 payload + SHA-256 expert ID
- Consent scope tag applied: derived features only

BOUNDARY · MERIDIAN CLOUD TENANCY
- Ingestion endpoint: purpose-policy enforcement
- Neo4j graph: Meridian tenancy only
- Codification + retrieval stack: pipeline · embeddings · alignment layer
- DP aggregator (Path D): ε-budget enforced
- Audit trail (Langfuse): immutable · consent-linked
- Revocation handler: expert self-serve · graph marks WITHDRAWN

BOUNDARY · TACIT OPERATIONS
- Platform ops: software updates only
- No tenant data access: cryptographically enforced
- No cross-tenant learning: separate model weights
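The first boundary is the one that carries the whole privacy argument: only derived features, under a pseudonymous ID and a consent tag, ever leave the device. A stdlib-only sketch of that packaging step — function names, field names, and the salt scheme are hypothetical, and the AES-256 encryption of the payload is noted but omitted:

```python
import hashlib
import json

def pseudonymise(expert_id: str, tenant_salt: str) -> str:
    """SHA-256 of a salted expert ID: stable pseudonym, raw ID never leaves the device."""
    return hashlib.sha256(f"{tenant_salt}:{expert_id}".encode()).hexdigest()

DERIVED_KEYS = ("hesitation_ms", "override", "nav_sequence")  # derived features only

def package_event(raw_event: dict, expert_id: str, tenant_salt: str) -> dict:
    """Device-side packaging: keystrokes, pixels, and clipboard are dropped
    before anything crosses the boundary. (Production would additionally
    AES-256-encrypt the payload before upload; key handling omitted here.)"""
    features = {k: raw_event[k] for k in DERIVED_KEYS if k in raw_event}
    return {
        "expert": pseudonymise(expert_id, tenant_salt),
        "scope": "derived-only",   # consent scope tag checked at the ingestion endpoint
        "features": features,
    }

event = package_event(
    {"hesitation_ms": 4200, "override": True, "keystrokes": "raw text"},
    expert_id="uw-007", tenant_salt="meridian")
```

Whatever arrives in the tenancy is, by construction, already stripped: the allowlist runs on the device, not in the cloud.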

What this produces in diligence. A CISO asks what we have access to. The honest answer: platform telemetry, DP-protected aggregates, software update channels. Never raw behavioural data, never expert identity, never case content. The answer is identical whether the CISO is a prospect's or a regulator's. Consistency is the point.

— Part Eight —
10 · The Intrusivity Matrix

A dial, not a toggle.

The eight capture strategies sit on an intrusivity gradient. Any expert can opt out of any strategy. Opting out of a high-signal strategy means we need more time on others to reach comparable coverage. The trade-off is explicit.

| Strategy | Intrusivity | Captures | Does NOT capture | Opt-out consequence | Market default |
| --- | --- | --- | --- | --- | --- |
| 01 Observer SDK | High | Derived behavioural features (hover, click, override, hesitation, sequence) | Screen pixels, keystrokes, private browsing, off-scope apps | +50% SIA time, more Shadow cycles to compensate | EU/UK/US opt-in; works-council review in DE/NL/FR |
| 02 SIA Dialogues | Low | Scheduled dialogue transcripts + 4-channel derived features | Anything outside the session window | Minimal; lean on Shadow and Archaeology | Opt-in, expert-visible |
| 03 Shadow Sessions | High (in-session) | Screen + narration + behaviour during 60-90 min, with expert present | Anything outside the session; recording light visible | Rely on Cognitive Archaeology for rare decisions | Opt-in, session by session |
| 04 Perturbation Probing | Low | Decisions on synthesised cases | No real case data | Boundary mapping depth reduced | Opt-in |
| 05 Contrastive Elicitation | Low | Paired-case decisions and rationales | No production data | Social-contextual coverage reduced | Opt-in |
| 06 Longitudinal Drift | Medium | Behavioural patterns aggregated over time | Identifiable events post-aggregation | Drift detection manual via more SIA touchpoints | Opt-in, DP-aggregated |
| 07 Cognitive Archaeology | Low (retrospective) | Historical artefacts + retrospection | No live work; scope bounded by DPO | Historical pattern reconstruction lost | Opt-in; DPO scope required |
| 08 Contrastive Expertise | Low (aggregate) | Cohort statistics via differential privacy | No identifiable data in output | Cohort-level evidence loses one channel | Opt-in, DP-enforced |

How this reads commercially. Transparent at diligence, negotiable at sales, compliant at deployment. A firm that forbids Observer SDK can run a Shadow-plus-SIA configuration that takes about 50% longer and 30% more engineer time but reaches comparable Core coverage. A firm sensitive to multimodal perception in SIA can run audio-only with lower certainty weighting. The platform is a dial.
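The planning arithmetic behind that trade-off is deliberately simple. An illustrative estimator — the base figures and multipliers are the engagement-planning rules of thumb quoted above, not contractual commitments:

```python
def adjusted_plan(observer_sdk: bool, base_weeks: float = 12.0,
                  base_engineer_weeks: float = 10.0) -> tuple:
    """Illustrative dial: a firm that forbids the Observer SDK runs a
    Shadow-plus-SIA configuration instead, ~50% longer in calendar time
    and ~30% more engineer time, for comparable Core coverage.
    Base figures here are hypothetical defaults."""
    if observer_sdk:
        return (base_weeks, base_engineer_weeks)
    return (base_weeks * 1.5, base_engineer_weeks * 1.3)
```

The same pattern extends to the other rows of the matrix: each opt-out maps to a stated multiplier on time or engineer effort, so the sales conversation is a costed configuration choice, not a veto.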

— Part Nine —
11 · Applications

One substrate. Five paying applications.

The knowledge graph is not single-purpose. Once the three-move system runs at a client, the same substrate serves a sequence of applications. Clients land on one, expand into three within eighteen months.

| Rank | Application | Buyer | Budget line | Meridian example |
| --- | --- | --- | --- | --- |
| 01 | AI alignment & decision intelligence | CRO, CDO, Head of Responsible AI | Model risk / AI governance | Pricing-model uplift 76→85%, SS1/23 artefact in hours |
| 02 | Succession capture & continuity | CHRO, workforce planning | Critical-role retention | Sarah's 24 years preserved before Q3-27 retirement |
| 03 | Middle-management training & onboarding | L&D, learning technology | Professional development | Maya's Week-14 query inherits the Chen Override directly |
| 04 | Decision review & QA | Second-line risk, compliance | Control testing | Any UW decision scored against expert heuristics |
| 05 | Regulatory defensibility & model validation | Model risk, compliance | Regulatory, audit | Continuous SS1/23, DORA, GDPR Art 22 artefacts |

Beyond the top five. M&A due diligence (captured expertise as valuation input), talent analytics (objective decision-quality measurement), organisational learning (drift detection surfaces reasoning shifts before quarterly numbers show them), enterprise agentic systems (autonomous workflows grounded in named-expert priors). Platform logic that defends the valuation multiple.

Where we are

Landing page live. Product in active build. First deployment in conversation.

The founders: Karan Bhandari (former CISO, Canadian financial sector) and Jas (deep technical background, including pre-LLM AI engineering). We are not selling consulting. We are building the infrastructure and taking the capital to build it faster.

Version 2.0

April 2026

hello@tacitlabs.ai

tacitlabs.ai