Module models

context_harness_core

Core data models used throughout Context Harness.

These types represent the documents, chunks, and search results that flow through the ingestion and retrieval pipeline. The data lifecycle is:

Connector → SourceItem → normalize() → Document → chunk() → Chunk
                                                      ↓
                                                 embed() → Embedding
                                                      ↓
                                                 search() → SearchResult
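
The first half of the lifecycle above can be sketched in a few lines of Rust. This is an illustrative sketch only: the field names (`source`, `source_id`, `body`, `chunk_index`, `text`), the id format, and the paragraph-based splitting rule are assumptions made for the example, not the crate's actual definitions, which carry more fields and may chunk differently.

```rust
// Hypothetical, simplified shapes. The real structs in this module
// carry more fields (hashes, timestamps, metadata).
#[derive(Debug, Clone, PartialEq)]
pub struct SourceItem {
    pub source: String,    // connector name, e.g. "filesystem" (assumed field)
    pub source_id: String, // connector-local identifier (assumed field)
    pub body: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Document {
    pub id: String,
    pub body: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub document_id: String,
    pub chunk_index: usize,
    pub text: String,
}

/// Stand-in for `normalize()`: trims the body and derives an id
/// from the connector name and item id (an assumed id scheme).
pub fn normalize(item: SourceItem) -> Document {
    Document {
        id: format!("{}:{}", item.source, item.source_id),
        body: item.body.trim().to_string(),
    }
}

/// Stand-in for `chunk()`: one chunk per non-empty paragraph,
/// where paragraphs are separated by a blank line.
pub fn chunk(doc: &Document) -> Vec<Chunk> {
    doc.body
        .split("\n\n")
        .map(str::trim)
        .filter(|p| !p.is_empty())
        .enumerate()
        .map(|(i, p)| Chunk {
            document_id: doc.id.clone(),
            chunk_index: i,
            text: p.to_string(),
        })
        .collect()
}
```

Calling `normalize` on a `SourceItem` and passing the resulting `Document` to `chunk` yields the ordered chunks that the embedding and search stages would then consume.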

Type Relationships

A SourceItem is produced by a connector (filesystem, Git, S3) before any normalization or storage.
A Document is the normalized, stored representation with a deduplication hash and Unix timestamps.
A Chunk is a segment of a document’s body, stored alongside a content hash for embedding staleness detection.
A SearchResult is returned by the query engine with a relevance score and snippet.
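
To illustrate how a per-chunk content hash can support embedding staleness detection, here is a minimal sketch. Everything in it is an assumption for illustration: the `StoredEmbedding` type, the `content_hash` and `source_hash` field names, and the use of the standard library's `DefaultHasher` (a production store would more plausibly persist a stable cryptographic digest, since `DefaultHasher` output is not guaranteed to stay the same across Rust releases).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for the stored content hash.
/// Deterministic within one build, but not a stable on-disk format.
pub fn content_hash(text: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    text.hash(&mut hasher);
    hasher.finish()
}

/// Simplified chunk: only the fields this example needs (assumed names).
pub struct Chunk {
    pub text: String,
    pub content_hash: u64,
}

/// Hypothetical record of an embedding plus the hash of the text
/// it was computed from.
pub struct StoredEmbedding {
    pub source_hash: u64,
    pub vector: Vec<f32>,
}

/// An embedding is stale when the chunk text has changed since the
/// embedding was computed, i.e. the two hashes no longer match.
pub fn is_stale(chunk: &Chunk, embedding: &StoredEmbedding) -> bool {
    chunk.content_hash != embedding.source_hash
}
```

Under this scheme, re-ingesting an unchanged document leaves every hash equal, so no chunk needs to be re-embedded; only chunks whose text changed are flagged.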

Structs

Chunk: A chunk of a document’s body text, stored in the chunks table.
Document: Normalized document stored in the documents table.
SearchResult: A search result returned from the query engine.
SourceItem: Raw item produced by a connector before normalization.