Type a question about AI implementation, governance, RAG, MLOps, or the EU AI Act. The system embeds your question with a small ONNX sentence-transformer (BGE-small), scores it by cosine similarity against sentence chunks drawn from 12 curated paragraphs, and returns the best-matching sentence with its supporting citations.
Knowledge base
12 curated paragraphs on AI implementation, governance, RAG, and MLOps.
Embed corpus
Twelve original AI-consulting paragraphs are split into sentence chunks and embedded once at boot with BGE-small via fastembed (ONNX).
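As a rough illustration of this boot step, the sketch below splits paragraphs into sentence chunks and embeds each chunk once. It is not the demo's code: `toy_embed` is a hypothetical hashed bag-of-words stand-in so the example runs with the standard library alone, where the demo uses BGE-small via fastembed. The sample corpus, `split_sentences`, and `build_index` are likewise assumptions.

```python
import hashlib
import math
import re

DIM = 64  # toy dimensionality; BGE-small itself produces 384-dimensional vectors


def split_sentences(paragraph: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]


def toy_embed(text: str) -> list[float]:
    """Stand-in embedder (hashed bag of words), NOT BGE-small."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]  # unit length, ready for cosine scoring


def build_index(paragraphs: list[str]) -> list[dict]:
    """Run once at startup: one record per sentence chunk."""
    index = []
    for para_id, paragraph in enumerate(paragraphs):
        for sentence in split_sentences(paragraph):
            index.append(
                {"paragraph": para_id, "text": sentence, "vector": toy_embed(sentence)}
            )
    return index


# Hypothetical two-paragraph corpus; the demo uses twelve.
corpus = [
    "RAG grounds answers in retrieved text. It reduces unsupported claims.",
    "MLOps covers deployment and monitoring. Models drift over time!",
]
INDEX = build_index(corpus)
```

Keeping the paragraph id on every chunk is one way to let a sentence-level hit point back to the paragraph it came from.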
Embed the question
The same model embeds the live question, and the resulting vector is L2-normalised.
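L2 normalisation divides a vector by its Euclidean length so that it has length 1. A minimal sketch follows; the function name `l2_normalise` and the zero-vector handling are assumptions, not the demo's code.

```python
import math


def l2_normalise(vec: list[float]) -> list[float]:
    """Scale vec to unit Euclidean length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


# A (3, 4) vector has length 5, so it scales to roughly (0.6, 0.8).
question_vec = l2_normalise([3.0, 4.0])
```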
Cosine retrieval
With L2-normalised vectors, a dot product against the chunk matrix is the cosine similarity, so the top-k scores identify the closest semantic matches.
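The retrieval step can be sketched in a few lines, assuming every vector is already unit length so the dot product is the cosine similarity. The chunk matrix here is a hypothetical list of two-dimensional rows, and `top_k` is an illustrative name.

```python
import heapq


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def top_k(query_vec, chunk_matrix, k=3):
    """Return the k best (chunk_index, score) pairs, highest score first."""
    scored = [(dot(query_vec, row), i) for i, row in enumerate(chunk_matrix)]
    return [(i, score) for score, i in heapq.nlargest(k, scored)]


# Three unit-length chunk vectors and a unit-length query.
chunks = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
hits = top_k([0.0, 1.0], chunks, k=2)
```

For a corpus of a few dozen sentence chunks, a full scan like this is usually fast enough that no approximate index is needed.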
Extractive answer
The highest-scoring sentence is returned as the answer. Other top hits accompany it as citations.
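One possible shape for this step is sketched below: the top hit becomes the answer and the remaining hits become citations. The field names (`answer`, `citations`, `source`) and the sample data are assumptions for illustration, not the demo's actual response format.

```python
def extractive_answer(hits, chunks):
    """hits: (chunk_index, score) pairs sorted best-first."""
    if not hits:
        return {"answer": None, "score": None, "citations": []}
    best_idx, best_score = hits[0]
    citations = [
        {"text": chunks[i]["text"], "source": chunks[i]["source"], "score": round(s, 3)}
        for i, s in hits[1:]
    ]
    return {
        "answer": chunks[best_idx]["text"],
        "score": round(best_score, 3),
        "citations": citations,
    }


# Hypothetical chunks and retrieval scores.
sample_chunks = [
    {"text": "Hybrid retrieval combines lexical and dense search.", "source": "para-3"},
    {"text": "Rerankers reorder the candidate list.", "source": "para-3"},
    {"text": "Eval gates block regressions.", "source": "para-7"},
]
result = extractive_answer([(0, 0.91), (1, 0.74), (2, 0.52)], sample_chunks)
```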
No LLM in this demo
Pure retrieval. In production we feed the citations to an LLM that synthesises an answer constrained to the retrieved context.
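To make "constrained to the retrieved context" concrete, here is a hedged sketch of how the citations might be packed into a prompt. The wording of the instructions and the function name `build_grounded_prompt` are assumptions; no LLM is called, and the resulting string would be passed to whichever model is used in production.

```python
def build_grounded_prompt(question: str, citations: list[str]) -> str:
    """Number each retrieved passage and instruct the model to stay within them."""
    numbered = "\n".join(f"[{n}] {text}" for n, text in enumerate(citations, start=1))
    return (
        "Answer the question using only the numbered context below. "
        "Cite the passages you use as [n]. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_grounded_prompt(
    "What does hybrid retrieval combine?",
    ["Hybrid retrieval combines lexical and dense search.",
     "Rerankers reorder the candidate list."],
)
```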
Production swap
Replace the corpus, add BM25 hybrid retrieval, and plug in a Cohere reranker and Anthropic Claude. The API shape stays the same, and the answers become synthesised rather than extracted.
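The text does not say how the BM25 and dense results are combined. One common choice, shown here purely as an assumption, is reciprocal rank fusion (RRF), which merges ranked lists using only each item's rank. The chunk ids and the constant `k=60` are illustrative.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked id lists: each list contributes 1 / (k + rank) per id."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical rankings from a lexical and a dense retriever.
bm25_ranking = ["c2", "c1", "c3"]
dense_ranking = ["c1", "c3", "c2"]
fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
```

The fused list would then be the candidate set handed to a reranker before answer generation.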
We build production RAG with your documents, hybrid retrieval (BM25 + dense + reranker), eval gates on groundedness and faithfulness, and citation-grade outputs your auditor will sign off on.
Book a 30-minute consultation. We will walk through the use case, sketch the value case, and tell you honestly whether we can help.