Overview
The RAG demo is a documentation Q&A app: you type a question, it retrieves the relevant docs from SQLite, and it streams an AI answer with cited sources back over the wire — no React, no npm, around 50 lines of Python. Reach for it to see Server-Sent Events, fragments, and free-threaded dual-model streaming working together in one runnable app.
Location: examples/chirpui/rag_demo/
What It Demonstrates
Each row is a feature the demo exercises and the page that owns it:
| Feature | In the demo | Learn more |
|---|---|---|
| Fragments | Fragment("ask.html", "answer", ...) renders one named block per token. | Fragments |
| Server-Sent Events | EventStream yields fragments; htmx swaps them into sse-swap targets. | Server-Sent Events |
| Multi-swap SSE layout | Sources, answer, and share link are separate sse-swap targets in one stream. | SSE patterns |
| Dual streaming | Compare two models side by side; each streams independently across worker threads. | Free-threading and thread safety |
| Typed SQLite | chirp.data.Database returns frozen dataclasses for document storage. | Database |
| Event delegation | AppConfig(delegation=True) wires copy and compare controls on SSE-swapped content. | htmx patterns |
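To make the Typed SQLite row concrete, the sketch below maps sqlite3 rows onto a frozen dataclass and runs a naive keyword lookup, using only the standard library. It does not use chirp.data.Database, whose API is not shown on this page. The Doc dataclass, the docs table schema, and retrieve_docs are hypothetical names chosen for the example, and the demo's real retrieval may rank documents differently.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """Immutable row type (hypothetical; the demo defines its own)."""
    id: int
    title: str
    body: str


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a small, made-up docs table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO docs (id, title, body) VALUES (?, ?, ?)",
        [
            (1, "Fragments", "Render one named template block."),
            (2, "Server-Sent Events", "Stream fragments to the browser."),
        ],
    )
    return conn


def retrieve_docs(conn: sqlite3.Connection, question: str, limit: int = 3) -> list[Doc]:
    """Naive retrieval: keep rows whose text contains any word of the question."""
    words = question.lower().split()
    if not words:
        return []
    # One parameterized LIKE clause per word; values are never interpolated.
    clause = " OR ".join("lower(title || ' ' || body) LIKE ?" for _ in words)
    params = [f"%{w}%" for w in words]
    rows = conn.execute(
        f"SELECT id, title, body FROM docs WHERE {clause} ORDER BY id LIMIT ?",
        (*params, limit),
    ).fetchall()
    return [Doc(*row) for row in rows]
```

Because Doc is frozen, a handler can pass the retrieved rows to templates or across threads without worrying that another piece of code mutates them.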
Run It
Running the demo takes five steps, in order, and has one prerequisite: a local Ollama model must be available before the app can answer anything.
1. Install Chirp with the AI extras

   The demo stores docs in SQLite, which chirp.data serves through the stdlib sqlite3 module, so no database extra is needed.

       pip install chirp[ai,sessions,markdown]

2. Pull the default Ollama model

   The demo uses Ollama by default, so it needs no API key.

       ollama pull llama3.2

3. Start Ollama in another terminal

       ollama serve

4. Run the app

       PYTHONPATH=src python examples/chirpui/rag_demo/app.py

   It starts four worker threads when pounce is installed, and falls back to a single-worker dev server otherwise.

5. Open the browser

   Open http://127.0.0.1:8000 and ask a question about the docs.
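If the app starts but never answers, it can help to confirm that Ollama is reachable before running it. The check below is a minimal sketch and not part of the demo. It assumes Ollama is listening on its default address, http://127.0.0.1:11434, and probes the /api/tags endpoint that lists local models. The function name ollama_is_up is hypothetical.

```python
import urllib.request


def ollama_is_up(base_url: str = "http://127.0.0.1:11434", timeout: float = 2.0) -> bool:
    """Return True if a GET to Ollama's model-list endpoint answers with 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, DNS failures, and HTTP error
        # statuses (urllib's URLError and HTTPError both derive from OSError).
        return False
```

Calling ollama_is_up() with no arguments checks the default address; pass a different base_url if your Ollama server runs elsewhere.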
To use a cloud model instead, set CHIRP_LLM (for example
CHIRP_LLM=anthropic:claude-sonnet-4-20250514) and the matching API key such as
ANTHROPIC_API_KEY.
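The CHIRP_LLM value in that example has a provider:model shape. How Chirp parses it is not shown on this page, so the following is only a sketch of one way such a value could be split. The helpers parse_llm_spec and llm_spec_from_env are hypothetical, and the "ollama:" prefix used in the usage note is an assumption, not a documented default.

```python
import os


def parse_llm_spec(spec: str) -> tuple[str, str]:
    """Split 'provider:model' at the first colon (hypothetical helper)."""
    provider, sep, model = spec.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider:model', got {spec!r}")
    return provider, model


def llm_spec_from_env(default: str) -> tuple[str, str]:
    """Read CHIRP_LLM, falling back to a caller-supplied default spec."""
    return parse_llm_spec(os.environ.get("CHIRP_LLM", default))
```

Splitting at the first colon only keeps model names that contain their own colon intact: parse_llm_spec("ollama:llama3.2:latest") gives ("ollama", "llama3.2:latest").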
Source: examples/chirpui/rag_demo/app.py.
The Streaming Endpoint
The SSE handler retrieves docs, builds a prompt, and streams the answer
token by token. stream_with_sources re-renders the named blocks as the model
emits text and yields one Fragment per chunk:
    from chirp import EventStream, Request, SSEEvent
    from chirp.ai.streaming import stream_with_sources


    @app.route("/ask/stream", referenced=True, template="ask.html")
    async def ask_stream(request: Request) -> EventStream:
        async def generate():
            question = (request.query.get("question") or "").strip()
            sources = await _retrieve_docs(_db_var.get(), question)
            # Yield one Fragment per chunk as the model emits text.
            async for frag in stream_with_sources(
                llm.stream(prompt),
                "ask.html",
                sources_block="sources",
                sources=sources,
                response_block="answer",
            ):
                yield frag
            # Emit a final "done" event once the model has finished.
            yield SSEEvent(event="done", data="complete")

        return EventStream(generate())
This is an excerpt: prompt, _retrieve_docs, and the per-worker _db_var are
defined in the full app. /ask/stream and /share/{slug} are marked
referenced=True so the route contract does not flag them as orphans, because
htmx connects to them rather than a browser navigating directly.
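The re-render-per-chunk pattern can be pictured without Chirp at all. The sketch below is a stdlib-only illustration, not the stream_with_sources implementation: it accumulates text from a stand-in model stream, re-renders the whole answer on every chunk, and frames each render in the text/event-stream format before sending a final done event. The names format_sse, fake_llm, stream_answer, and collect are all hypothetical, and the HTML wrapper is made up for the example.

```python
import asyncio
from collections.abc import AsyncIterator
from html import escape


def format_sse(event: str, data: str) -> str:
    """Serialize one event in text/event-stream framing."""
    lines = [f"event: {event}"]
    # Each line of the payload needs its own "data:" field.
    lines.extend(f"data: {line}" for line in data.split("\n"))
    return "\n".join(lines) + "\n\n"


async def fake_llm(text: str) -> AsyncIterator[str]:
    """Stand-in for a model stream: yields the text word by word."""
    for word in text.split(" "):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield word + " "


async def stream_answer(chunks: AsyncIterator[str]) -> AsyncIterator[str]:
    """Re-render the whole answer so far on every chunk, then signal done."""
    answer = ""
    async for chunk in chunks:
        answer += chunk
        html = f"<div class='answer-body'>{escape(answer.strip())}</div>"
        yield format_sse("answer", html)
    yield format_sse("done", "complete")


async def collect(text: str) -> list[str]:
    """Drain the stream into a list so the frames can be inspected."""
    return [frame async for frame in stream_answer(fake_llm(text))]
```

Running asyncio.run(collect("Hello world")) produces two answer frames followed by one done frame. Sending the full answer each time, rather than only the new token, lets the client replace the target's content instead of appending to it.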
Chirp Macros
Chirp ships a reusable answer macro so you don't hand-write the body, prose, and copy-button structure for the streamed answer.
How the swap targets and buttons are wired
These are template-internal details specific to this demo. You don't need them to
run it; they matter if you're reading ask.html.
Multi-swap structure. Each answer card opens one SSE connection with
sse-connect and hx-disinherit="hx-target hx-swap", then carries three inner
sse-swap targets (sources, answer, share_link), each with
hx-target="this". The streamed .answer-body uses the chirpui-streaming-block
classes for the typing cursor. See
SSE patterns for the multi-swap
layout in general.
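The routing behind the multi-swap layout can be simulated in a few lines. This sketch is not htmx code; it models in Python what the browser side does, assuming a replace-style swap: parse the event stream, then put each event's data into the target whose name matches the event name. The functions parse_sse and apply_swaps are hypothetical.

```python
def parse_sse(stream: str) -> list[tuple[str, str]]:
    """Parse text/event-stream text into (event, data) pairs."""
    events = []
    for block in stream.split("\n\n"):
        name, data_lines = "message", []  # "message" is the SSE default name
        for line in block.split("\n"):
            field, _, value = line.partition(":")
            value = value.removeprefix(" ")  # drop one optional leading space
            if field == "event":
                name = value
            elif field == "data":
                data_lines.append(value)
        if data_lines:  # blocks without data are not dispatched
            events.append((name, "\n".join(data_lines)))
    return events


def apply_swaps(stream: str, targets: dict[str, str]) -> dict[str, str]:
    """Replace each target's content with the latest event of the same name."""
    result = dict(targets)
    for name, data in parse_sse(stream):
        if name in result:  # events with no matching target are ignored
            result[name] = data
    return result
```

Given targets named sources, answer, and share_link, a stream that carries one sources event and several answer events leaves sources set once, answer set to the most recent render, and share_link untouched until its own event arrives.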
Copy and compare controls. hx-on::click is bound at parse time, so it does
not fire on content that htmx swaps in over SSE. The demo sets
AppConfig(delegation=True), which injects one document-level listener that
matches .copy-btn and .chirpui-copy-btn for clipboard copy and
.compare-switch for the role="switch" model toggle. You write the buttons;
Chirp wires the behavior. See htmx patterns for event delegation.
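Event delegation itself is a browser technique, but the idea fits in a small Python model. The sketch below is an illustration only and does not reflect the listener Chirp injects: a single root-level handler inspects the clicked node's ancestry for a known class, so elements added after the handler was registered are still covered. Node, closest, and delegated_click are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Tiny stand-in for a DOM element (hypothetical, for illustration)."""
    classes: set[str] = field(default_factory=set)
    parent: Optional["Node"] = None

    def closest(self, cls: str) -> Optional["Node"]:
        """Return the first node carrying cls, starting at self and walking up."""
        node: Optional[Node] = self
        while node is not None:
            if cls in node.classes:
                return node
            node = node.parent
        return None


def delegated_click(target: Node, handlers: dict[str, str]) -> Optional[str]:
    """One root-level listener: pick an action from the clicked node's ancestry."""
    for cls, action in handlers.items():
        if target.closest(cls) is not None:
            return action
    return None
```

A click on an icon nested inside a copy-btn element resolves to the copy action even if the whole card was swapped in long after handlers was defined, because the lookup happens at click time rather than at parse time.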
Next Steps
- SSE patterns: multi-swap layout and hx-target
- SSE example: the smaller, single-feature version
- Database: typed SQLite queries