Context Farm is being built for the small team where one operator, department head, or founder carries too much of the business in their head. It captures tribal operational knowledge, structures it into grounded context, and serves it back to both humans and agent systems.
Most small teams do not have a documentation shortage. They have a knowledge concentration problem. One person knows the real workflow, the exceptions, the caveats, and which source actually wins when documents conflict.
That breaks teams in predictable ways: repeated interruptions, slow onboarding, inconsistent execution, and AI agents that can read the handbook but still make bad decisions. Context Farm exists to turn that concentrated operational knowledge into reusable context.
Boundary Labs hit the same wall internally. Retrieval was finding relevant documents, but the agent or operator still needed the unstated rule, the exception, or the ranking between sources. The chunks were there. The operational truth was not.
The strongest practical wedge is local-first deployment. Context Farm is being built to run on Boundary Labs infrastructure with local inference and no required external API dependency. For small teams with sensitive process knowledge, internal client rules, or compliance concerns, "no data leaves your building" matters more than another generic AI search surface.
Context Farm is now being developed as a layered system: ingest messy operational material, compile it into readable knowledge artifacts, extract typed operational objects, and serve grounded retrieval and briefings to humans and agents.
CURRENT DIRECTION
─────────────────────────────────────────────────────────────
raw input         │ PDF, URL, text paste, transcripts, operator seed
    ↓             │
ingestion         │ normalize source, preserve provenance, assign authority
    ↓             │
compile layer     │ build readable linked artifacts from messy material
    ↓             │
structured layer  │ extract facts, procedures, constraints, exceptions,
                  │ decisions, and source-linked evidence
    ↓             │
governance        │ review high-impact items, track authority, flag conflicts
    ↓             │
serving layer     │ search, ask, brief, and agent retrieval
─────────────────────────────────────────────────────────────
STORES
wiki / article layer │ human-readable audit trail and fallback retrieval
SQLite               │ structured operational objects
ChromaDB             │ semantic recall over compiled knowledge
The key differentiator is still domain seeding. Before full ingestion, the operator describes the domain in plain English: what matters, what entities exist, what rules apply, what exceptions are common, and which sources outrank others. That seed guides subsequent extraction and review.
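As a rough illustration of what a seed could look like once captured, here is a minimal Python sketch. The `DomainSeed` class, its field names, and the `winning_source` helper are hypothetical, and the example rules and source ranking are invented for the service-dispatch setting rather than taken from Context Farm's actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DomainSeed:
    """Hypothetical container for an operator's plain-English domain seed."""

    name: str
    priorities: list[str] = field(default_factory=list)         # what matters
    entities: list[str] = field(default_factory=list)           # what entities exist
    rules: list[str] = field(default_factory=list)              # what rules apply
    common_exceptions: list[str] = field(default_factory=list)  # frequent exceptions
    # Sources listed from most to least authoritative (assumed ordering).
    source_ranking: list[str] = field(default_factory=list)

    def winning_source(self, candidates: list[str]) -> str | None:
        """Return the highest-ranked source among candidates, or None if none are ranked."""
        ranked = [s for s in self.source_ranking if s in candidates]
        return ranked[0] if ranked else None


# Illustrative values only; a real seed would come from the operator.
seed = DomainSeed(
    name="service-dispatch",
    priorities=["deposits are collected before dispatch"],
    entities=["client", "job", "technician"],
    rules=["After-hours work requires approval."],
    common_exceptions=["Emergency overrides may skip the deposit."],
    source_ranking=["operator interview", "signed client contract", "handbook"],
)

# When two documents conflict, the seed decides which one outranks the other.
winner = seed.winning_source(["handbook", "signed client contract"])
print(winner)
```

Keeping the ranking as an ordered list is only one option; it keeps the "which source wins" question answerable with a single lookup, at the cost of not expressing per-topic overrides.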
The wiki-style compile layer is not dead weight. In the current design it serves four jobs: human-readable audit trail, intermediate normalization before structured extraction, fallback retrieval while extraction is incomplete, and a debugging surface when the structured layer gets something wrong.
| Source Type | Input Format | Current Handling | Status |
|---|---|---|---|
| PDF | Uploaded file or local path | Ingested in the internal pipeline; target source for structured extraction | live |
| URL | HTTP/HTTPS page URL | Ingested in the internal pipeline; target source for structured extraction | live |
| Text paste | Plain text via API or UI | Ingested in the internal pipeline and easiest source for manual or semi-manual review | live |
| Domain seed | Plain-language domain description | Used to define the domain before broader extraction and review | live |
| Transcripts / interviews | Operator interviews, meeting notes, AI session exports | Important next input class for tribal knowledge capture | in progress |
| Manual structured object set | Curated JSON seed for demo domain | Used to prove retrieval and briefing before full automation | beta |
Two things are true at once. First, the underlying ingestion and knowledge-compilation pipeline has been running internally across finance, research, and infrastructure domains. Second, the product-shaped Context Farm work is now being tightened around a small-team operational-memory use case with an explicit manual demo domain before broader extraction automation.
Boundary Labs already uses the underlying pipeline for finance, research, and infrastructure knowledge. That is where the practical lessons came from: provenance matters, source ranking matters, and document retrieval alone is not enough.
The first explicit demo domain is service-dispatch: a small operations-heavy workflow with deposits, after-hours approvals, emergency overrides, and client exceptions. The current prototype already retrieves governing rules and linked exceptions from a structured SQLite store.
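The retrieval described here can be sketched with Python's standard `sqlite3` module. The table names (`objects`, `exception_links`), columns, and sample rows below are assumptions made up for illustration; the prototype's real schema is not shown in this document.

```python
import sqlite3

# In-memory database so the sketch runs without touching disk.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript(
    """
    CREATE TABLE objects (
        id INTEGER PRIMARY KEY,
        kind TEXT NOT NULL CHECK (kind IN ('rule', 'exception')),
        statement TEXT NOT NULL,
        source TEXT NOT NULL
    );
    CREATE TABLE exception_links (
        rule_id INTEGER NOT NULL REFERENCES objects(id),
        exception_id INTEGER NOT NULL REFERENCES objects(id)
    );
    """
)

# Invented sample rows for a service-dispatch domain.
conn.executemany(
    "INSERT INTO objects (id, kind, statement, source) VALUES (?, ?, ?, ?)",
    [
        (1, "rule", "Collect a deposit before dispatching a job.", "handbook"),
        (2, "exception", "Emergency overrides may dispatch without a deposit.", "operator interview"),
        (3, "rule", "After-hours work requires approval.", "handbook"),
    ],
)
conn.execute("INSERT INTO exception_links (rule_id, exception_id) VALUES (1, 2)")


def rules_with_exceptions(conn, keyword):
    """Return rules whose statement contains keyword, each with its linked exceptions."""
    rules = conn.execute(
        "SELECT id, statement, source FROM objects "
        "WHERE kind = 'rule' AND statement LIKE ?",
        (f"%{keyword}%",),
    ).fetchall()
    results = []
    for rule in rules:
        exceptions = conn.execute(
            "SELECT o.statement, o.source FROM objects o "
            "JOIN exception_links l ON l.exception_id = o.id "
            "WHERE l.rule_id = ?",
            (rule["id"],),
        ).fetchall()
        results.append(
            {
                "rule": rule["statement"],
                "source": rule["source"],
                "exceptions": [dict(e) for e in exceptions],
            }
        )
    return results


results = rules_with_exceptions(conn, "deposit")
for item in results:
    print(item["rule"], "| source:", item["source"])
    for exc in item["exceptions"]:
        print("  exception:", exc["statement"], "| source:", exc["source"])
```

Keyword matching with `LIKE` stands in for whatever lookup the prototype actually uses; the point of the sketch is that every rule comes back with its exceptions and sources attached.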
The near-term proof target is simple: can Context Farm answer a realistic operational question with the governing rule, the relevant exception, and the source trail, then generate a compact briefing from the same domain objects?
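One way to picture that proof target is a briefing assembled directly from domain objects. The sketch below is a plain template in Python with no model call; `DomainObject`, `build_briefing`, and the sample statements are hypothetical names and data chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainObject:
    kind: str        # "rule" or "exception" in this sketch
    statement: str
    source: str      # where the statement came from (the source trail)


def build_briefing(question, objects):
    """Format a compact briefing: governing rules, then exceptions, then sources."""
    rules = [o for o in objects if o.kind == "rule"]
    exceptions = [o for o in objects if o.kind == "exception"]
    lines = [f"Question: {question}"]
    lines += [f"Governing rule: {o.statement} [{o.source}]" for o in rules]
    lines += [f"Exception: {o.statement} [{o.source}]" for o in exceptions]
    # De-duplicate and sort sources so the trail is stable across runs.
    lines.append("Sources: " + ", ".join(sorted({o.source for o in objects})))
    return "\n".join(lines)


# Invented objects for a service-dispatch question.
objects = [
    DomainObject("rule", "Collect a deposit before dispatching a job.", "handbook"),
    DomainObject("exception", "Emergency overrides may dispatch without a deposit.", "operator interview"),
]

briefing = build_briefing("Can we dispatch without a deposit?", objects)
print(briefing)
```

Because the answer and the briefing are built from the same objects, any statement in the briefing can be traced back to a stored object and its source.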
Context Farm has real forward motion, but the hardest parts are not being glossed over. These are the actual gaps now driving the build.
The SQLite schema and manual demo path are in place. The hard part is the extraction loop that turns raw material into the right object type with enough provenance and low enough review burden to be trusted. Constraint versus exception versus procedure is not a trivial distinction. This is the current center of gravity.
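To show why the type distinction carries weight, here is a small Python sketch of typed objects with basic validation. The working definitions are assumptions for illustration only: an exception must name the object it overrides, a procedure must list steps, and every object needs provenance. `ObjectType`, `OperationalObject`, and `validation_errors` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class ObjectType(Enum):
    FACT = "fact"
    PROCEDURE = "procedure"
    CONSTRAINT = "constraint"
    EXCEPTION = "exception"
    DECISION = "decision"


@dataclass
class OperationalObject:
    type: ObjectType
    statement: str
    source: str                    # provenance: which document or interview
    evidence: str                  # provenance: the supporting passage
    overrides: str | None = None   # id of the object an exception overrides
    steps: list[str] = field(default_factory=list)  # ordered steps for procedures


def validation_errors(obj: OperationalObject) -> list[str]:
    """Return a list of problems; an empty list means the object passes these checks."""
    errors = []
    if not obj.source or not obj.evidence:
        errors.append("missing provenance")
    if obj.type is ObjectType.EXCEPTION and obj.overrides is None:
        errors.append("exception must name the object it overrides")
    if obj.type is ObjectType.PROCEDURE and not obj.steps:
        errors.append("procedure must list steps")
    return errors


# Invented examples: one well-formed exception, one that should be sent back.
good = OperationalObject(
    type=ObjectType.EXCEPTION,
    statement="Emergency overrides may dispatch without a deposit.",
    source="operator interview",
    evidence="Operator described skipping the deposit for emergencies.",
    overrides="deposit-before-dispatch",
)
bad = OperationalObject(
    type=ObjectType.EXCEPTION,
    statement="Some clients are exempt.",
    source="",
    evidence="",
)

print(validation_errors(good))
print(validation_errors(bad))
```

Checks like these do not solve classification, but they catch extracted objects that claim a type without carrying the structure that type implies.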
Review is necessary, but a small-team operator cannot spend their day approving rows. The system has to batch high-impact review, auto-accept low-risk items where possible, and surface contradictions only where the review load is justified.
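A triage policy along these lines could be sketched as follows. The impact labels, the 0.9 threshold, and the three buckets are assumptions chosen for illustration, as are the `Candidate` and `triage` names; the actual review policy is still being worked out.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    statement: str
    impact: str                          # "high" or "low", assumed to be set upstream
    confidence: float                    # extraction confidence in [0.0, 1.0]
    conflicts_with_existing: bool = False


def triage(candidates, auto_accept_threshold=0.9):
    """Split candidates into auto-accepted, batched-for-review, and deferred."""
    buckets = {"accepted": [], "review": [], "deferred": []}
    for c in candidates:
        if c.impact == "high":
            # High-impact items always reach the operator, batched together.
            buckets["review"].append(c)
        elif c.conflicts_with_existing:
            # Low-impact contradictions are held back rather than interrupting.
            buckets["deferred"].append(c)
        elif c.confidence >= auto_accept_threshold:
            buckets["accepted"].append(c)
        else:
            buckets["deferred"].append(c)
    return buckets


# Invented candidates for illustration.
candidates = [
    Candidate("Collect a deposit before dispatching a job.", "high", 0.95),
    Candidate("Technician notes go in the job record.", "low", 0.97),
    Candidate("Invoices are usually sent weekly.", "low", 0.55),
    Candidate("Notes go in a shared spreadsheet.", "low", 0.95, conflicts_with_existing=True),
]

buckets = triage(candidates)
for name, items in buckets.items():
    print(name, len(items))
```

The deferred bucket is the design choice worth noticing: it keeps uncertain or conflicting low-impact items out of the store without adding them to the operator's queue.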
Domain seeds are useful, but domains change. New entities appear, old exceptions stop applying, and source hierarchies get messier over time. Detecting when the model of the domain is stale is still open work.
The internal foundation already spans finance, research, and infrastructure, but the product path is deliberately narrowing before it widens again. Cross-domain retrieval and staleness tracking both matter, but they come after the single-domain operational-memory loop is credible.
Context Farm is in active development, not a finished product. The internal pipeline is real, the manual demo path is now real, and the next stage is making extraction and review good enough that the same operational clarity can be produced without hand-seeding everything first.
Boundary Labs is looking for visibility, feedback, and aligned partnerships around local-first operational memory, agent grounding, and structured knowledge extraction for small teams. The work is moving from internal necessity toward a public product direction.