Code, schemas, APIs, docs. Multimodal ingestion. Everything your enterprise produces, parsed, chunked, embedded, and linked to your ontology. AI from step one, not bolted on later.
What We Ingest
tree-sitter reads source code and breaks it into structural components (functions, classes, methods) rather than treating it as raw text. AST-based chunking guarantees every chunk is a complete, meaningful unit.
9 Languages · AST-Level Parsing
Plus: database schemas, API contracts (OpenAPI/Swagger, GraphQL), BPMN workflows, documentation (Confluence, Markdown). Every chunk gets a content-addressed SHA-256 ID.
Local AI First
BAAI/bge-small-en-v1.5. 384 dimensions. ONNX runtime. CPU-only. No cloud API calls.
GLiNER. Zero-shot NER for technologies, organizations, and standards. No training per entity type.
Hand-written patterns for known entities. Deterministic matching complementing GLiNER.
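Deterministic pattern matching of this kind can be sketched with plain regular expressions. The labels and patterns below are hypothetical examples; the actual pattern set is not described here, and `match_entities` is an illustrative name, not a real API.

```python
import re
from typing import NamedTuple


class Entity(NamedTuple):
    label: str
    text: str
    start: int
    end: int


# Hypothetical hand-written patterns for known entities (illustrative only)
PATTERNS = {
    "technology": re.compile(r"\b(?:PostgreSQL|Kafka|Kubernetes|GraphQL)\b"),
    "standard": re.compile(r"\b(?:ISO \d{4,5}|RFC \d{3,4}|GDPR)\b"),
}


def match_entities(text: str) -> list[Entity]:
    """Return every pattern hit, ordered by position in the text."""
    found = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            found.append(Entity(label, m.group(0), m.start(), m.end()))
    return sorted(found, key=lambda e: e.start)


if __name__ == "__main__":
    for entity in match_entities("Orders flow through Kafka into PostgreSQL."):
        print(entity)
```

The same input always yields the same matches, which is what lets a rule layer like this backstop a statistical NER model on entities that must never be missed.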
Two Ingestion Modes
Local math only. Steps 1–7. Parse, embed, link, propagate.
Full 8-step pipeline. 4-pass LLM enrichment.
Plug-and-Play Connectors
Source Code
APIs
Databases
Infrastructure
Project Management
Data Pipelines