Knowledge Bases and RAG for Operations: Grounded Answers That Survive Contact with Reality
Overview
Retrieval-augmented generation (RAG) lets agents answer from internal documents, but only if retrieval quality and source governance are right.
Quick definition
Production RAG chunks documents with stable IDs, embeds them with versioned models, retrieves with hybrid search, and grounds answers with citations; reranking reduces false positives.
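As a minimal sketch of the hybrid-search step, the function below merges BM25 and vector rankings with reciprocal rank fusion; the Ranked shape and the k constant are illustrative assumptions, not part of the guide's stack.

// Sketch: merge BM25 and vector rankings with reciprocal rank fusion (RRF).
// The Ranked shape and k constant are illustrative assumptions.
interface Ranked { chunkId: string }

export function fuseRankings(
  bm25: Ranked[],
  vector: Ranked[],
  k = 60, // common RRF damping constant
): string[] {
  const scores = new Map<string, number>();
  for (const list of [bm25, vector]) {
    list.forEach((r, rank) => {
      scores.set(r.chunkId, (scores.get(r.chunkId) ?? 0) + 1 / (k + rank + 1));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([chunkId]) => chunkId);
}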
Definition
RAG connects prompts to curated document chunks via similarity search, then generates answers constrained to the retrieved context, with citations where possible.
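A minimal sketch of that flow, assuming hypothetical retrieve and generate helpers standing in for your retriever and model:

// Sketch: constrain generation to retrieved context and attach citations.
// retrieve() and generate() are hypothetical stand-ins, not a real API.
interface Chunk { chunkId: string; text: string }

declare function retrieve(query: string, topK: number): Promise<Chunk[]>;
declare function generate(prompt: string): Promise<string>;

export async function answerWithCitations(
  query: string,
): Promise<{ text: string; citations: string[] }> {
  const chunks = await retrieve(query, 5);
  // Label each chunk so the model can cite by ID.
  const context = chunks.map((c) => `[${c.chunkId}] ${c.text}`).join("\n");
  const prompt = `Answer ONLY from the context below. Cite chunk IDs.\n${context}\n\nQ: ${query}`;
  return { text: await generate(prompt), citations: chunks.map((c) => c.chunkId) };
}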
Why it matters
Ungrounded models hallucinate policies, and bad retrieval surfaces the wrong snippets. Operations teams need accuracy over fluency.
Core framework
The step-by-step model is encoded below as TypeScript interfaces (machine-readable checkpoints).
Source governance
/**
 * Source governance
 * Authoritative docs only; versioned; expiry for time-sensitive policy.
 */
export interface CoreFrameworkStep1SourceGovernance {
  /** Order in the core framework (0-based) */
  readonly stepIndex: 0;
  /** Display title for this step */
  readonly title: "Source governance";
  /** Narrative checkpoints as published in the guide */
  readonly narrative: readonly string[];
}
export const CoreFrameworkStep1SourceGovernance_NARRATIVE: readonly string[] = [
  "Authoritative docs only; versioned; expiry for time-sensitive policy."
] as const;
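To make governance concrete, here is a minimal sketch of a governed source record; the GovernedSource fields and the isIndexable check are illustrative assumptions, not the guide's schema.

// Sketch: a governed source record; field names are illustrative assumptions.
export interface GovernedSource {
  docId: string;
  version: number;          // bump on every approved revision
  authoritative: boolean;   // only authoritative docs enter the index
  expiresAt?: string;       // ISO date for time-sensitive policy
}

export function isIndexable(src: GovernedSource, now: Date = new Date()): boolean {
  if (!src.authoritative) return false;
  // Expired policy documents must drop out of retrieval.
  return src.expiresAt === undefined || new Date(src.expiresAt) > now;
}
Chunking strategy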
/**
 * Chunking strategy
 * Structure-aware splits for SOPs and tables—not naive fixed sizes.
 */
export interface CoreFrameworkStep2ChunkingStrategy {
  /** Order in the core framework (0-based) */
  readonly stepIndex: 1;
  /** Display title for this step */
  readonly title: "Chunking strategy";
  /** Narrative checkpoints as published in the guide */
  readonly narrative: readonly string[];
}
export const CoreFrameworkStep2ChunkingStrategy_NARRATIVE: readonly string[] = [
  "Structure-aware splits for SOPs and tables—not naive fixed sizes."
] as const;
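A minimal sketch of a structure-aware splitter; the numbered-step regex is an illustrative assumption about how SOPs are formatted, not a universal rule.

// Sketch: structure-aware splitting on SOP step headings, not fixed sizes.
// The heading regex is an illustrative assumption about SOP formatting.
export function chunkByHeading(doc: string): string[] {
  // Split before lines that look like numbered SOP steps (e.g. "3. Torque check").
  return doc
    .split(/\n(?=\d+\.\s)/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}
Detailed breakdown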
Logic sections are encoded as Python functions that return structured narrative payloads.
Evaluation
def logic_block_1_evaluation(context: dict) -> dict:
    """Operational logic: Evaluation"""
    # Narrative steps from the guide (logic section)
    paragraphs = ["Test sets from real questions; measure grounded vs ungrounded responses."]
    return {
        "heading": "Evaluation",
        "paragraphs": paragraphs,
        "context_keys": tuple(sorted(context.keys())),
    }
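For symmetry with the framework code, a minimal TypeScript sketch of the grounded-vs-ungrounded comparison; the TestCase shape and the judge helper are illustrative assumptions.

// Sketch: score grounded vs ungrounded answers on one shared test set.
// TestCase shape and judge() are illustrative assumptions.
interface TestCase { question: string; expected: string }

declare function judge(answer: string, expected: string): boolean;

export function groundedLift(
  cases: TestCase[],
  grounded: (q: string) => string,
  ungrounded: (q: string) => string,
): { grounded: number; ungrounded: number } {
  let g = 0, u = 0;
  for (const c of cases) {
    if (judge(grounded(c.question), c.expected)) g++;
    if (judge(ungrounded(c.question), c.expected)) u++;
  }
  // Accuracy per mode; the gap is the grounding lift.
  return { grounded: g / cases.length, ungrounded: u / cases.length };
}
Technical patterns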
Chunk lineage
- `chunk_id → doc_version → storage_uri` for compliance takedowns.
- Re-embed only when the embedding model or chunking policy changes (see the lineage sketch below).
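A minimal sketch of the lineage record, with illustrative field names:

// Sketch: lineage record enabling compliance takedowns by doc version.
// Field names are illustrative assumptions.
export interface ChunkLineage {
  chunkId: string;
  docVersion: string;
  storageUri: string;
  embeddingModel: string;   // re-embed only when this (or chunk policy) changes
}

export function chunksForTakedown(lineage: ChunkLineage[], docVersion: string): string[] {
  // Every chunk derived from the affected version must be removed together.
  return lineage.filter((l) => l.docVersion === docVersion).map((l) => l.chunkId);
}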
Grounding response
- Answer must cite `chunk_id`; refuse if the top retrieval score is below threshold (see the refusal sketch below).
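A minimal sketch of the refusal gate; the 0.55 cutoff is an illustrative assumption to be tuned on your eval set.

// Sketch: refuse to answer when the best retrieval score is below threshold.
// The 0.55 default is an illustrative assumption; tune it on real evals.
export function shouldAnswer(scores: number[], threshold = 0.55): boolean {
  return scores.length > 0 && Math.max(...scores) >= threshold;
}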
Code examples
Citation-enforced answer stub
The caller accepts model output only when every citation maps to an allowed chunk ID.
export function validateCitations(
  answer: { citations: { chunkId: string }[] },
  allowedChunkIds: Set<string>,
): { citations: { chunkId: string }[] } {
  // Reject any citation that points outside the retrieved chunk set.
  for (const c of answer.citations) {
    if (!allowedChunkIds.has(c.chunkId)) throw new Error('invalid_citation');
  }
  return answer;
}
System architecture
[Document ingest]
→ [Chunk + embed pipeline]
→ [Vector index + BM25 index]
→ [Retriever + reranker]
→ [LLM with citation template]
→ [Cache + feedback]
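One way to read the diagram as code; every stage name below is an illustrative assumption about how the pieces could be wired, not a prescribed API.

// Sketch: the pipeline stages as one typed interface.
// All names are illustrative assumptions.
export interface RagPipeline {
  ingest(docUri: string): Promise<void>;
  chunkAndEmbed(docId: string): Promise<void>;
  retrieve(query: string): Promise<string[]>;   // hybrid: vector + BM25, then rerank
  answer(query: string): Promise<{ text: string; citations: string[] }>;
  recordFeedback(answerId: string, helpful: boolean): Promise<void>;
}
Real-world example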
A manufacturer reduced incorrect repair steps by requiring citations to service bulletins before field instructions were displayed to technicians.
Common mistakes
- Dumping PDFs into the index without metadata; retrieval returns junk.
- Capturing no feedback when answers are wrong, so there is no improvement loop.
Related topics
PrimeAxiom builds grounded knowledge systems for ops—book a retrieval architecture session.