# RAG Patterns & Guide
This guide covers practical patterns for using InitRunner's retrieval-augmented generation (RAG) capabilities. For full configuration reference, see Ingestion and Memory.
## RAG vs Memory — When to Use Which
InitRunner has two systems for giving agents access to information beyond their training data:
| Aspect | Ingestion (RAG) | Memory |
|---|---|---|
| Purpose | Search external documents | Remember learned information |
| Data source | Files on disk, URLs | Agent's own observations |
| Who writes | You (via `initrunner ingest`) | Agent (via `remember()` tool) |
| Who reads | Agent (via `search_documents()`) | Agent (via `recall()`) |
| Best for | Knowledge base Q&A, doc search | Personalization, context carry-over |
| Persistence | Rebuilt on each ingest run | Accumulates across sessions |
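The division of labor in the table can be sketched in a few lines. This is an illustration only, with hypothetical class names, not InitRunner's internals; the method names merely echo the `initrunner ingest` command and the `search_documents()` / `remember()` / `recall()` tools:

```python
class DocumentIndex:
    """RAG side: you write at ingest time; the agent only reads."""
    def __init__(self):
        self._chunks = []

    def ingest(self, chunks):
        self._chunks = list(chunks)   # rebuilt wholesale on each ingest run

    def search(self, query):          # what search_documents() draws on
        return [c for c in self._chunks if query.lower() in c.lower()]


class SemanticMemory:
    """Memory side: the agent both writes and reads."""
    def __init__(self):
        self._facts = []

    def remember(self, fact):         # agent-initiated, accumulates
        self._facts.append(fact)

    def recall(self, query):
        return [f for f in self._facts if query.lower() in f.lower()]


docs = DocumentIndex()
docs.ingest(["Authentication uses API keys.", "Rate limit: 100 req/min."])
mem = SemanticMemory()
mem.remember("User prefers concise answers.")

print(docs.search("rate"))    # -> ['Rate limit: 100 req/min.']
print(mem.recall("prefers"))  # -> ['User prefers concise answers.']
```

The asymmetry is the point: the index is replaced wholesale on each ingest run, while memory only accumulates.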
You can use both together — ingestion for your docs, memory for user preferences:

```yaml
spec:
  ingest:
    sources:
      - "./docs/**/*.md"
  memory:
    semantic:
      max_memories: 500
```

## End-to-End Walkthrough
### 1. Create a role with ingestion

Create `role.yaml`:
```yaml
apiVersion: initrunner/v1
kind: Agent
metadata:
  name: docs-agent
  description: Documentation Q&A agent
spec:
  role: |
    You are a documentation assistant. ALWAYS call search_documents
    before answering questions. Cite your sources.
  model:
    provider: openai
    name: gpt-4o-mini
  ingest:
    sources:
      - "./docs/**/*.md"
    chunking:
      strategy: paragraph
      chunk_size: 512
      chunk_overlap: 50
```

### 2. Add some documents
Create a `docs/` directory with markdown files:

```
docs/
├── getting-started.md
├── api-reference.md
└── faq.md
```

### 3. Ingest documents
```console
$ initrunner ingest role.yaml
Ingesting documents for docs-agent...
✓ Stored 47 chunks from 3 files
```

### 4. Run the agent
```console
$ initrunner run role.yaml -p "How do I authenticate?"
```

The agent calls `search_documents("authenticate")` behind the scenes, retrieves matching chunks from your docs, and uses them to answer.
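Conceptually, that retrieval step is an embed-and-rank loop: embed the query, score it against the stored chunk embeddings, and hand the top matches to the model. A minimal sketch, assuming cosine similarity as the ranking metric (the usual choice, though this guide does not specify InitRunner's actual implementation):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, chunk_vecs, chunks, k=3):
    """Return the k chunks whose embeddings best match the query embedding."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]

# Toy 3-dimensional "embeddings" standing in for real model output.
chunks = ["auth with API keys", "rate limits", "changelog"]
vecs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]
query = [0.8, 0.2, 0.1]  # pretend embedding of "How do I authenticate?"

print(top_k(query, vecs, chunks, k=1))  # -> ['auth with API keys']
```

Real embeddings have hundreds or thousands of dimensions (see the model table below), but the ranking logic is the same.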
### 5. Interactive session

```console
$ initrunner run role.yaml -i
docs-agent> How do I get an API key?
I found the answer in your documentation. Per the Getting Started guide
(./docs/getting-started.md), you can generate an API key by navigating to
Settings > API Keys in your dashboard...

docs-agent> What rate limits apply?
According to the API Reference (./docs/api-reference.md), the default rate
limit is 100 requests per minute per API key...
```

## Choosing an Embedding Model
The embedding model determines how well semantic search performs. Different models make different trade-offs among dimension size, cost, speed, and quality.
| Model | Provider | Dimensions | Notes |
|---|---|---|---|
| `text-embedding-3-small` | OpenAI | 1536 | Fast and cheap — good default for most use cases |
| `text-embedding-3-large` | OpenAI | 3072 | Higher quality at higher cost |
| `text-embedding-004` | Google | 768 | Cost-effective; strong multilingual support |
| `nomic-embed-text` | Ollama | 768 | Fully local — no API key or network needed |
### Which model should I use?

- Cost-sensitive: Google `text-embedding-004` or Ollama `nomic-embed-text`
- Precision-critical: OpenAI `text-embedding-3-large`
- Fully local / no API keys: Ollama `nomic-embed-text`
- Google ecosystem: Google `text-embedding-004`
The default (`openai:text-embedding-3-small`) is a sensible starting point for most projects. See Providers for the full embedding configuration reference and how to override the default.
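Whichever model you choose, the `chunk_size` and `chunk_overlap` settings used in the walkthrough above control a sliding window over each document: consecutive chunks share an overlap region so that sentences near a boundary are not split away from their context. A rough character-based sketch of fixed-size chunking (real chunkers typically count tokens and respect document boundaries, so treat this as illustrative):

```python
def fixed_chunks(text, chunk_size, chunk_overlap):
    """Slice text into windows of chunk_size, each overlapping the last."""
    step = chunk_size - chunk_overlap  # how far the window advances each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

text = "a" * 100
chunks = fixed_chunks(text, chunk_size=40, chunk_overlap=10)
print([len(c) for c in chunks])  # -> [40, 40, 40, 10]
```

With `chunk_size: 512` and `chunk_overlap: 50`, each window advances 462 units, so roughly 10% of every chunk repeats its predecessor's tail.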
## Common Patterns
### Basic knowledge base

Single format, paragraph chunking for natural document boundaries:

```yaml
ingest:
  sources:
    - "./knowledge-base/**/*.md"
  chunking:
    strategy: paragraph
    chunk_size: 512
    chunk_overlap: 50
```

### Multi-format knowledge base
Mix HTML, Markdown, and PDF sources. Install `initrunner[ingest]` for PDF support:

```yaml
ingest:
  sources:
    - "./docs/**/*.md"
    - "./docs/**/*.html"
    - "./docs/**/*.pdf"
  chunking:
    strategy: fixed
    chunk_size: 1024
    chunk_overlap: 100
```

### URL-based ingestion
Ingest content from remote URLs alongside local files:

```yaml
ingest:
  sources:
    - "./local-docs/**/*.md"
    - "https://docs.example.com/api/reference"
    - "https://docs.example.com/changelog"
```

URL content is hashed — re-running ingest skips unchanged pages.
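That skip-unchanged behavior is typically a content-hash comparison: hash what was fetched and re-index only when the digest differs from the previous run. A sketch under that assumption (the actual hashing and storage details are not specified here):

```python
import hashlib

def needs_reindex(url, content, seen_hashes):
    """Return True when content is new or changed since the last ingest run."""
    digest = hashlib.sha256(content.encode()).hexdigest()
    if seen_hashes.get(url) == digest:
        return False            # unchanged since last ingest: skip
    seen_hashes[url] = digest   # new or changed: record and re-index
    return True

seen = {}
url = "https://docs.example.com/changelog"
print(needs_reindex(url, "v1.0 released", seen))  # -> True  (first fetch)
print(needs_reindex(url, "v1.0 released", seen))  # -> False (unchanged)
print(needs_reindex(url, "v1.1 released", seen))  # -> True  (page changed)
```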
### Auto re-indexing with a file watch trigger

Use a `file_watch` trigger to re-ingest when source files change:

```yaml
spec:
  ingest:
    sources:
      - "./knowledge-base/**/*.md"
  triggers:
    - type: file_watch
      paths:
        - ./knowledge-base
      extensions:
        - .md
      prompt_template: "Knowledge base updated: {path}. Re-index."
      debounce_seconds: 1.0
```

### Using a source filter to scope searches
When your knowledge base spans multiple topics, use the `source` parameter to narrow results:
```yaml
spec:
  role: |
    You are a support agent. When the user asks about billing, search
    only billing docs: search_documents(query, source="*billing*").
    For technical issues, search: search_documents(query, source="*troubleshooting*").
  ingest:
    sources:
      - "./kb/billing/**/*.md"
      - "./kb/troubleshooting/**/*.md"
      - "./kb/general/**/*.md"
```

### Fully local RAG with Ollama
No external API keys needed — use Ollama for both the LLM and embeddings:

```yaml
spec:
  model:
    provider: ollama
    name: llama3.2
  ingest:
    sources:
      - "./docs/**/*.md"
    embeddings:
      provider: ollama
      model: nomic-embed-text
```

See the Providers page for Ollama setup instructions.
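One last note on the source-filter pattern above: the `source="*billing*"` arguments look like glob patterns matched against each chunk's recorded file path. A sketch of that behavior with hypothetical data structures, not InitRunner's API:

```python
from fnmatch import fnmatch

# Hypothetical chunk records: each remembers where it was ingested from.
chunks = [
    {"source": "./kb/billing/invoices.md", "text": "Invoices are issued monthly."},
    {"source": "./kb/troubleshooting/errors.md", "text": "Error 429: slow down."},
]

def filter_by_source(chunks, pattern):
    """Keep only chunks whose source path matches the glob pattern."""
    return [c for c in chunks if fnmatch(c["source"], pattern)]

print([c["source"] for c in filter_by_source(chunks, "*billing*")])
# -> ['./kb/billing/invoices.md']
```

Scoping the search this way keeps unrelated topics from crowding the top-k results.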
## Next Steps

- Ingestion reference — full configuration options, chunking strategies, embedding models
- Memory reference — session persistence and long-term memory (semantic, episodic, procedural)
- Tools reference — built-in and custom tool types