InitRunner

Tools

Tools give agents the ability to interact with the outside world — reading files, making HTTP requests, connecting to MCP servers, calling APIs, or running custom Python functions. They are configured in the spec.tools list, keyed on the type field.

Tool Types

| Type | Description |
| --- | --- |
| filesystem | Read/write files within a sandboxed root directory |
| http | Make HTTP requests to a base URL |
| mcp | Connect to MCP servers (stdio, SSE, streamable-http) |
| custom | Load Python functions from a module |
| delegate | Invoke other agents as tool calls |
| api | Declarative REST API endpoints defined in YAML |
| web_reader | Fetch web pages and convert to markdown |
| python | Execute Python code in a subprocess |
| datetime | Get current time and parse dates |
| sql | Query SQLite databases (read-only) |
| git | Run git operations in a subprocess |
| shell | Execute shell commands with allowlists |
| web_scraper | Scrape web pages and extract structured data |
| slack | Send messages via Slack webhooks |
| search | Web and news search via DuckDuckGo, SerpAPI, Brave, or Tavily |
| email | Search, read, and send emails via IMAP/SMTP |
| audio | Fetch YouTube transcripts and transcribe local audio files |
| csv_analysis | Inspect, summarize, and query CSV files within a sandboxed root directory |
| think | Internal reasoning scratchpad — agent thinks step-by-step without user-visible output |
| script | Inline shell scripts defined in YAML as named, parameterized tools |
| calculator | Safe AST-based math expression evaluator with trig, log, and utility functions |
| clarify | Agent-initiated human-in-the-loop — asks the user for clarification mid-run (run-scoped) |
| image_gen | Generate and edit images via OpenAI DALL-E 3 or Stability AI |
| pdf_extract | Extract text and metadata from PDF files |
| spawn | Run multiple agent instances in parallel (run-scoped) |
| todo | Task management for autonomous agent workflows (run-scoped) |
| (plugin) | Any other type resolved via the plugin registry |

Quick Example

spec:
  tools:
    - type: filesystem
      root_path: ./src
      read_only: true
      allowed_extensions: [".py", ".md"]
    - type: http
      base_url: https://api.example.com
      allowed_methods: ["GET", "POST"]
      headers:
        Authorization: Bearer ${API_TOKEN}
    - type: mcp
      transport: stdio
      command: npx
      args: ["-y", "@anthropic/mcp-server-filesystem"]
    - type: custom
      module: my_tools
      config:
        db_url: "postgres://..."
    - type: api
      name: weather
      base_url: https://api.weather.com
      endpoints:
        - name: get_weather
          path: "/current/{city}"
          parameters:
            - name: city
              type: string
              required: true

Tool Permissions

Every built-in tool type has an optional permissions block on its configuration. When present, a PermissionToolset wrapper evaluates glob patterns against call arguments before the tool executes. When absent, no filtering is applied — existing behavior is preserved.

Fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| default | "allow" \| "deny" | "allow" | Policy applied when no rule matches |
| allow | list[str] | [] | Patterns that permit a call |
| deny | list[str] | [] | Patterns that block a call |

Pattern Format

Two pattern forms are supported:

  • Named argument — arg_name=glob_pattern matches with fnmatch against a specific named argument (e.g. command=kubectl *).
  • Bare glob — a pattern without = matches against all string arguments (e.g. *.env).

Validation rejects empty argument names and empty globs.

Evaluation Order

  1. Deny rules are checked first. If any deny pattern matches, the call is blocked.
  2. Allow rules are checked next. If any allow pattern matches, the call is permitted.
  3. Default policy is applied when no rule matches.

Deny always wins — a call matching both an allow and a deny pattern is blocked.
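In Python terms, the deny-first evaluation can be sketched with fnmatch (the evaluate helper and its signature are illustrative, not the actual PermissionToolset API):

```python
from fnmatch import fnmatch


def evaluate(args: dict, allow: list[str], deny: list[str], default: str) -> bool:
    """Return True if the call is permitted. Deny rules win over allow rules."""

    def matches(pattern: str) -> bool:
        if "=" in pattern:
            # Named-argument form: arg_name=glob_pattern
            name, glob = pattern.split("=", 1)
            value = args.get(name)
            return isinstance(value, str) and fnmatch(value, glob)
        # Bare glob: match against every string argument
        return any(fnmatch(v, pattern) for v in args.values() if isinstance(v, str))

    if any(matches(p) for p in deny):
        return False  # deny checked first
    if any(matches(p) for p in allow):
        return True
    return default == "allow"  # default policy when no rule matches


# Matching both an allow and a deny pattern: deny wins, call blocked
assert not evaluate(
    {"command": "kubectl get pods"},
    allow=["command=kubectl *"], deny=["command=* pods"], default="deny",
)
```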

Examples

Shell — deny by default, allow only safe commands:

tools:
  - type: shell
    allowed_commands: [kubectl, docker, curl]
    permissions:
      default: deny
      allow:
        - command=kubectl get *
        - command=kubectl describe *
        - command=docker ps *
        - command=curl https://*
      deny:
        - command=rm *

Filesystem — allow by default, block sensitive files:

tools:
  - type: filesystem
    root_path: ./project
    permissions:
      default: allow
      deny:
        - "*.env"
        - "*credentials*"
        - "*.pem"

HTTP — block internal and admin endpoints:

tools:
  - type: http
    base_url: https://api.example.com
    permissions:
      default: allow
      deny:
        - "*internal*"
        - "*admin*"

Denied Response Format

When a call is blocked, the agent receives the message:

Permission denied: {tool_name} — blocked by rule: {pattern}

Raw argument values are never echoed in the denial message to prevent secret leakage.

CSV Analysis

Inspect, summarize, and query CSV files within a sandboxed root directory. Three sub-functions are registered automatically.

tools:
  - type: csv_analysis
    root_path: ./data
    max_rows: 1000
    max_file_size_mb: 10.0
    delimiter: ","

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| root_path | str | "." | Root directory for CSV file access (path traversal is blocked) |
| max_rows | int | 1000 | Maximum rows loaded from the CSV |
| max_file_size_mb | float | 10.0 | Maximum CSV file size in MB |
| delimiter | str | "," | CSV delimiter character |

Registered functions:

  • inspect_csv(path) — Returns column names, types, row count, and a sample of the first few rows.
  • summarize_csv(path, column) — Returns per-column statistics. Numeric columns: min, max, mean, median, stdev. Categorical columns: unique count and top values.
  • query_csv(path, filter_column, filter_value, columns, limit) — Filter rows by exact column=value match and return as a markdown table.

Filesystem

Sandboxed file operations within a root directory. Paths cannot escape the root (path traversal is blocked).

tools:
  - type: filesystem
    root_path: ./src
    read_only: true
    allowed_extensions: [".py", ".md", ".txt"]

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| root_path | str | "." | Root directory for file operations |
| allowed_extensions | list[str] | [] | File extensions to allow (empty = all) |
| read_only | bool | true | Only allow read operations |

Registered functions: read_file(path), list_directory(path), and write_file(path, content) (when read_only: false).
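Paths are confined to root_path. The traversal check can be pictured like this (resolve_in_root is a hypothetical helper, not InitRunner's internal function):

```python
from pathlib import Path


def resolve_in_root(root: str, user_path: str) -> Path:
    """Resolve user_path inside root, refusing any escape attempt."""
    root_dir = Path(root).resolve()
    candidate = (root_dir / user_path).resolve()
    # Both ../ traversal and absolute paths resolve outside root_dir
    if not candidate.is_relative_to(root_dir):
        raise PermissionError(f"path escapes sandbox root: {user_path}")
    return candidate
```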

HTTP

Makes HTTP requests to a configured base URL.

tools:
  - type: http
    base_url: https://api.example.com
    allowed_methods: ["GET"]
    headers:
      Authorization: Bearer ${API_TOKEN}

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| base_url | str | (required) | Base URL for requests |
| allowed_methods | list[str] | ["GET"] | Allowed HTTP methods |
| headers | dict | {} | Headers sent with every request |

Registered function: http_request(method, path, body).

MCP

Connects to MCP (Model Context Protocol) servers, exposing their tools to the agent.

tools:
  # Stdio transport (local process)
  - type: mcp
    transport: stdio
    command: npx
    args: ["-y", "@anthropic/mcp-server-filesystem"]

  # SSE transport (remote server)
  - type: mcp
    transport: sse
    url: http://localhost:3001/sse

  # Streamable HTTP transport
  - type: mcp
    transport: streamable-http
    url: http://localhost:3001/mcp
    tool_filter: [search, get_document]

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| transport | str | "stdio" | "stdio", "sse", or "streamable-http" |
| command | str \| null | null | Command for stdio transport |
| args | list[str] | [] | Arguments for the stdio command |
| url | str \| null | null | URL for SSE or streamable-http transport |
| tool_filter | list[str] | [] | Only expose these tools (empty = all; mutually exclusive with tool_exclude) |
| tool_exclude | list[str] | [] | Exclude these tools (mutually exclusive with tool_filter) |
| headers | dict | {} | HTTP headers for SSE/streamable-http transport |
| env | dict | {} | Environment variables passed to the stdio subprocess |
| cwd | str \| null | null | Working directory for the stdio subprocess |
| tool_prefix | str \| null | null | Prefix added to tool names to avoid collisions |
| max_retries | int | 1 | Maximum connection retry attempts |
| timeout_seconds | int \| null | null | Connection timeout in seconds |
| defer | bool | false | Defer server connection until first tool call; serve cached schemas meanwhile. See Deferred Tool Loading |

Custom

Load Python functions from a module and register them as agent tools.

tools:
  # Auto-discover all public functions
  - type: custom
    module: my_tools

  # Load a single function
  - type: custom
    module: my_tools
    function: search_db

  # With config injection
  - type: custom
    module: my_tools
    config:
      api_key: ${MY_API_KEY}

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| module | str | (required) | Python module path (must be importable) |
| function | str \| null | null | Specific function to load (null = auto-discover all) |
| config | dict | {} | Config injected into functions with a tool_config parameter |

Functions that declare a tool_config parameter receive the config dict automatically — the parameter is hidden from the LLM.

Scaffold a tool module:

initrunner init --template tool --name my_tools

Complete Custom Tool Walkthrough

Here's a full example with the Python module and the role YAML that uses it.

my_tools.py — every public function becomes an agent tool:

"""Custom tools module for InitRunner.

All public functions are auto-discovered as agent tools. Type annotations and
docstrings are used as tool schemas and descriptions. Functions accepting a
``tool_config`` parameter receive the config dict from role.yaml (hidden from
the LLM).
"""

import hashlib
import json
import uuid


def convert_units(value: float, from_unit: str, to_unit: str) -> str:
    """Convert a numeric value between common measurement units.

    Supported conversions: km/mi, kg/lb, c/f, l/gal, m/ft, cm/in.
    """
    conversions: dict[tuple[str, str], float | None] = {
        ("km", "mi"): 0.621371,
        ("mi", "km"): 1.60934,
        ("kg", "lb"): 2.20462,
        ("lb", "kg"): 0.453592,
        ("c", "f"): None,
        ("f", "c"): None,
        ("l", "gal"): 0.264172,
        ("gal", "l"): 3.78541,
        ("m", "ft"): 3.28084,
        ("ft", "m"): 0.3048,
        ("cm", "in"): 0.393701,
        ("in", "cm"): 2.54,
    }

    key = (from_unit.lower(), to_unit.lower())
    if key == ("c", "f"):
        result = value * 9 / 5 + 32
    elif key == ("f", "c"):
        result = (value - 32) * 5 / 9
    elif key in conversions:
        result = value * conversions[key]
    else:
        return f"Unsupported conversion: {from_unit} -> {to_unit}"

    return f"{value} {from_unit} = {result:.4f} {to_unit}"


def generate_uuid() -> str:
    """Generate a random UUID v4 identifier."""
    return str(uuid.uuid4())


def format_json(text: str) -> str:
    """Pretty-print a JSON string with 2-space indentation."""
    try:
        parsed = json.loads(text)
        return json.dumps(parsed, indent=2, ensure_ascii=False)
    except json.JSONDecodeError as e:
        return f"Invalid JSON: {e}"


def word_count(text: str) -> str:
    """Count words, characters, and lines in a text string."""
    words = len(text.split())
    chars = len(text)
    lines = text.count("\n") + 1 if text else 0
    return f"Words: {words}, Characters: {chars}, Lines: {lines}"


def hash_text(text: str, algorithm: str = "sha256") -> str:
    """Hash text using the specified algorithm (md5, sha1, sha256, sha512)."""
    algo = algorithm.lower()
    if algo not in ("md5", "sha1", "sha256", "sha512"):
        return f"Unsupported algorithm: {algorithm}. Use md5, sha1, sha256, or sha512."
    h = hashlib.new(algo)
    h.update(text.encode())
    return f"{algo}:{h.hexdigest()}"


def lookup_with_config(query: str, tool_config: dict) -> str:
    """Look up a query using the configured prefix and source.

    The tool_config parameter is injected by InitRunner from the role YAML
    and is hidden from the LLM.
    """
    prefix = tool_config.get("prefix", "DEFAULT")
    source = tool_config.get("source", "unknown")
    return f"[{prefix}] Result for '{query}' from source '{source}'"

custom-tools-demo.yaml — the role that loads it:

apiVersion: initrunner/v1
kind: Agent
metadata:
  name: custom-tools-demo
  description: Demonstrates custom tool type with auto-discovered Python functions
spec:
  role: |
    You are a utility assistant with access to custom tools defined in a Python
    module. Use these tools to help the user with practical tasks.

    Available custom tools:
    - convert_units: Convert between common measurement units
    - generate_uuid: Generate a random UUID v4 identifier
    - format_json: Pretty-print a JSON string
    - word_count: Count words, characters, and lines in text
    - hash_text: Hash text with md5, sha1, sha256, or sha512
    - lookup_with_config: Look up a query using the configured prefix and source

    Always use the appropriate tool rather than trying to compute results yourself.
  model:
    provider: openai
    name: gpt-4o-mini
    temperature: 0.1
  tools:
    - type: custom
      module: my_tools
      config:
        prefix: "DEMO"
        source: "custom-tools-demo"
    - type: datetime
  guardrails:
    max_tokens_per_run: 20000
    max_tool_calls: 15
    timeout_seconds: 60

Run from the directory containing both files:

cd examples/roles/custom-tools-demo
initrunner run custom-tools-demo.yaml -i

Example prompts:

> Convert 72 degrees Fahrenheit to Celsius
> Generate a UUID for me
> Hash "hello world" with sha256
> Look up "test query"

Key patterns: Docstrings become tool descriptions. Type annotations become parameter schemas. The tool_config parameter is injected from the YAML config block and hidden from the LLM — the agent never sees prefix or source as callable parameters. Omitting function in the YAML auto-discovers all public functions in the module.

API

Declarative REST API endpoints defined entirely in YAML — no Python required.

tools:
  - type: api
    name: github
    description: GitHub REST API
    base_url: https://api.github.com
    headers:
      Accept: application/vnd.github.v3+json
    auth:
      Authorization: "Bearer ${GITHUB_TOKEN}"
    endpoints:
      - name: get_repo
        method: GET
        path: "/repos/{owner}/{repo}"
        description: Get repository information
        parameters:
          - name: owner
            type: string
            required: true
          - name: repo
            type: string
            required: true
        response_extract: "$.full_name"
      - name: create_issue
        method: POST
        path: "/repos/{owner}/{repo}/issues"
        description: Create a new issue
        parameters:
          - name: owner
            type: string
            required: true
          - name: repo
            type: string
            required: true
          - name: title
            type: string
            required: true
          - name: body
            type: string
            required: false
            default: ""
        body_template:
          title: "{title}"
          body: "{body}"
        response_extract: "$.html_url"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | (required) | API group name |
| base_url | str | (required) | Base URL for all endpoints |
| headers | dict | {} | Headers sent with every request (supports ${VAR}) |
| auth | dict | {} | Auth headers merged into headers |
| endpoints | list | (required) | Endpoint definitions |

Each endpoint supports name, method, path, description, parameters, headers, body_template, query_params, response_extract, and timeout_seconds.
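For simple expressions like $.full_name, response_extract can be pictured as a dotted-path walk over the parsed JSON response (extract is an illustrative helper; the real implementation may support richer JSONPath):

```python
def extract(payload: dict, expr: str):
    """Walk a dotted JSONPath-style expression such as "$.full_name".

    Illustrative only -- shows why the agent sees just the extracted value
    rather than the whole response body.
    """
    node = payload
    for part in expr.lstrip("$").strip(".").split("."):
        node = node[part]
    return node


extract({"full_name": "octocat/hello-world"}, "$.full_name")  # "octocat/hello-world"
```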

Scaffold an API tool agent:

initrunner init --template api --name weather-agent

Delegate

Invoke other agents as tool calls. Each agent reference generates a delegate_to_{name} tool.

tools:
  - type: delegate
    agents:
      - name: summarizer
        role_file: ./roles/summarizer.yaml
        description: "Summarizes long text"
      - name: researcher
        role_file: ./roles/researcher.yaml
        description: "Researches topics"
    mode: inline
    max_depth: 3
    timeout_seconds: 120

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| agents | list | (required) | Agent references (name + role_file or url) |
| mode | str | "inline" | "inline" (in-process), "mcp" (HTTP), or "a2a" (A2A protocol) |
| max_depth | int | 3 | Maximum delegation recursion depth |
| timeout_seconds | int | 120 | Timeout per delegation call |
| shared_memory | object \| null | null | Shared memory config with store_path (str) and max_memories (int, default 1000) |
| agents[].headers_env | dict \| null | null | Map of header name to env var name (for mcp and a2a modes) |

The a2a mode sends JSON-RPC requests to a remote A2A server and polls for results. Use it to call agents running on other machines or in other frameworks. Each agent reference needs a url instead of a role_file.

Git

Subprocess-based git operations with read-only default.

tools:
  - type: git
    repo_path: .
    read_only: true
    timeout_seconds: 30

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| repo_path | str | "." | Path to the git repository |
| read_only | bool | true | Only allow read operations |
| timeout_seconds | int | 30 | Timeout for each git command |

Read tools: git_status, git_log, git_diff, git_show, git_blame, git_changed_files, git_list_files. Write tools (when read_only: false): git_checkout, git_commit, git_tag.

Shell

Execute shell commands with an allowlist.

tools:
  - type: shell
    allowed_commands: [kubectl, docker, curl]
    require_confirmation: false
    timeout_seconds: 30
    working_dir: .

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| allowed_commands | list[str] | [] | Allowlist of executable names; empty = all non-blocked commands are permitted |
| blocked_commands | list[str] | (built-in denylist) | Commands always blocked regardless of allowed_commands (e.g. rm, sudo) |
| require_confirmation | bool | true | Prompt user before each execution |
| timeout_seconds | int | 30 | Timeout per command in seconds |
| working_dir | str \| null | null | Working directory (null = role file's directory) |
| max_output_bytes | int | 102400 | Truncate combined stdout+stderr beyond this byte count |

Registered function: run_shell(command). Shell operators (|, &&, ;, redirects) are blocked — use dedicated tools instead. When allowed_commands is empty, all non-blocked commands are permitted; when non-empty, only listed executables are allowed.
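The allowlist check described above can be sketched as follows (validate_command and BLOCKED_OPERATORS are hypothetical names; the shipped tool's internals may differ):

```python
import shlex

BLOCKED_OPERATORS = {"|", "&&", ";", ">", ">>", "<", "&"}


def validate_command(command: str, allowed: list[str], blocked: list[str]) -> list[str]:
    """Tokenize a command and enforce allow/deny lists before execution."""
    tokens = shlex.split(command)
    if not tokens:
        raise ValueError("empty command")
    # Shell operators are rejected outright -- no pipes, chains, or redirects
    if any(tok in BLOCKED_OPERATORS for tok in tokens):
        raise PermissionError("shell operators are not allowed")
    executable = tokens[0]
    if executable in blocked:
        raise PermissionError(f"blocked command: {executable}")
    # Empty allowlist = all non-blocked commands permitted
    if allowed and executable not in allowed:
        raise PermissionError(f"not in allowlist: {executable}")
    # Safe to hand to subprocess.run(tokens, shell=False)
    return tokens
```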

When security.sandbox is enabled, commands run inside the configured sandbox backend (bubblewrap or Docker) instead of on the host.

Web Reader

Fetch a web page and return its content as markdown. Internal (SSRF) addresses are automatically blocked.

tools:
  - type: web_reader
    allowed_domains: []
    timeout_seconds: 15
    max_content_bytes: 512000

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| allowed_domains | list[str] | [] | Only fetch from these domains (empty = allow all) |
| blocked_domains | list[str] | [] | Never fetch from these domains (ignored when allowed_domains is set) |
| max_content_bytes | int | 512000 | Truncate page content beyond this byte count |
| timeout_seconds | int | 15 | HTTP request timeout in seconds |
| user_agent | str | (default) | User-Agent header sent with requests |

Registered function: fetch_page(url).

Python

Execute Python code in a subprocess with optional network isolation.

tools:
  - type: python
    timeout_seconds: 30
    network_disabled: true
    require_confirmation: true

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| timeout_seconds | int | 30 | Timeout per execution in seconds |
| max_output_bytes | int | 102400 | Truncate combined stdout+stderr beyond this byte count |
| working_dir | str \| null | null | Working directory (null = fresh temp directory per run) |
| require_confirmation | bool | true | Prompt user before each execution |
| network_disabled | bool | true | Block outbound network access via audit hook |

Registered function: run_python(code).

When security.sandbox is enabled, code runs inside the configured sandbox backend (bubblewrap or Docker) instead of on the host.

DateTime

Get the current date/time and parse date strings. Requires no API key or external service.

tools:
  - type: datetime
    default_timezone: UTC

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| default_timezone | str | "UTC" | Default timezone when none is specified in the tool call |

Registered functions: current_time(timezone), parse_date(text, format).

SQL

Query a SQLite database. Read-only by default — ATTACH DATABASE is blocked at the engine level to prevent escaping the configured database.

tools:
  - type: sql
    database: ./data.db
    read_only: true
    max_rows: 100

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| database | str | (required) | Path to the SQLite file, or :memory: for an in-memory database |
| read_only | bool | true | Only allow SELECT statements |
| max_rows | int | 100 | Maximum rows returned per query |
| max_result_bytes | int | 102400 | Truncate result output beyond this byte count |
| timeout_seconds | int | 10 | SQLite connection timeout in seconds |

Registered function: query_database(sql).

Web Scraper

Fetch a web page, extract its content, and store it in the document store so it becomes searchable via search_documents. Uses the chunking and embedding settings from spec.ingest.

tools:
  - type: web_scraper
    allowed_domains: []
    timeout_seconds: 15

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| allowed_domains | list[str] | [] | Only scrape these domains (empty = allow all) |
| blocked_domains | list[str] | [] | Never scrape these domains (ignored when allowed_domains is set) |
| max_content_bytes | int | 512000 | Truncate page content beyond this byte count |
| timeout_seconds | int | 15 | HTTP request timeout in seconds |
| user_agent | str | (default) | User-Agent header sent with requests |

Registered function: scrape_page(url). After scraping, the page is chunked and embedded using the settings from spec.ingest, then stored so search_documents can retrieve it.

Search

Web and news search via pluggable providers. The default provider (DuckDuckGo) requires no API key.

tools:
  - type: search
    provider: duckduckgo
    max_results: 10
    safe_search: true
    timeout_seconds: 15

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| provider | str | "duckduckgo" | Search backend to use |
| api_key | str \| null | null | API key (required for paid providers) |
| max_results | int | 10 | Maximum results per query |
| safe_search | bool | true | Enable safe-search filtering |
| timeout_seconds | int | 15 | Timeout for each search request |

Providers

| Provider | API key required | Notes |
| --- | --- | --- |
| duckduckgo | No | Free, no account needed |
| serpapi | Yes | Google results via SerpAPI |
| brave | Yes | Brave Search API |
| tavily | Yes | Tavily search API |

Registered functions: web_search(query, num_results), news_search(query, num_results, days_back).

Install the search extra for the DuckDuckGo provider:

pip install initrunner[search]

Slack

Send messages to Slack channels via incoming webhooks.

tools:
  - type: slack
    webhook_url: ${SLACK_WEBHOOK_URL}
    default_channel: "#general"
    username: "InitRunner Bot"
    icon_emoji: ":robot_face:"
    timeout_seconds: 30
    max_response_bytes: 1024

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| webhook_url | str | (required) | Slack incoming webhook URL |
| default_channel | str \| null | null | Override the webhook's default channel |
| username | str \| null | null | Bot username override |
| icon_emoji | str \| null | null | Bot icon emoji (e.g. :robot_face:) |
| timeout_seconds | int | 30 | HTTP request timeout in seconds |
| max_response_bytes | int | 1024 | Truncate Slack API response beyond this byte count |

Registered function: send_slack_message(text, channel?, blocks?).

Email

Search, read, and send emails via IMAP/SMTP. Read-only by default — sending requires explicit opt-in.

tools:
  - type: email
    imap_host: imap.gmail.com
    smtp_host: smtp.gmail.com
    imap_port: 993
    smtp_port: 587
    username: ${EMAIL_USER}
    password: ${EMAIL_PASSWORD}
    use_ssl: true
    default_folder: INBOX
    read_only: true
    max_results: 20
    max_body_chars: 50000
    timeout_seconds: 30

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| imap_host | str | (required) | IMAP server hostname |
| smtp_host | str \| null | null | SMTP server hostname (required for sending) |
| imap_port | int | 993 | IMAP port |
| smtp_port | int | 587 | SMTP port |
| username | str | (required) | Email account username |
| password | str | (required) | Email account password (supports ${VAR}) |
| use_ssl | bool | true | Use SSL/TLS for connections |
| default_folder | str | "INBOX" | Default mailbox folder |
| read_only | bool | true | Only allow read operations |
| max_results | int | 20 | Maximum emails returned per search |
| max_body_chars | int | 50000 | Truncate email bodies beyond this length |
| timeout_seconds | int | 30 | Timeout for IMAP/SMTP operations |

Registered functions: search_inbox(query, folder, limit), read_email(message_id, folder), list_folders().

When read_only: false, an additional function is registered: send_email(to, subject, body, reply_to, cc).

Security: The email tool defaults to read-only mode. Use environment variables (${EMAIL_USER}, ${EMAIL_PASSWORD}) for credentials — never hard-code them in YAML.

Audio

Fetch YouTube video transcripts and transcribe local audio/video files. Requires the audio extra (pip install initrunner[audio]).

tools:
  - type: audio
    youtube_languages: ["en"]
    include_timestamps: false
    transcription_model: null       # defaults to spec.model
    max_audio_mb: 20.0
    max_transcript_chars: 50000

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| youtube_languages | list[str] | ["en"] | Preferred caption language codes for YouTube transcripts |
| include_timestamps | bool | false | Include timestamps in transcript output |
| transcription_model | str \| null | null | Multimodal model for local transcription (e.g. openai:gpt-4o-audio-preview); defaults to the agent's model |
| max_audio_mb | float | 20.0 | Maximum local file size to send for transcription |
| max_transcript_chars | int | 50000 | Truncate transcript output beyond this length |

Registered functions: get_youtube_transcript(url, language), transcribe_audio(file_path).

Supported audio formats: .mp3, .mp4, .m4a, .wav, .ogg, .webm, .mpeg, .flac.

Model requirement: transcribe_audio passes audio to the agent's model (or transcription_model if set). Use a model that supports audio input such as openai:gpt-4o-audio-preview. See Multimodal for supported models.

Example — meeting notes agent:

spec:
  model:
    provider: openai
    name: gpt-4o-audio-preview
  tools:
    - type: audio
      include_timestamps: true
      max_audio_mb: 25.0

Think Tool

Gives the agent an accumulated reasoning scratchpad. Each call appends a thought and returns the full numbered chain, so earlier reasoning survives context trimming. An optional ring buffer caps token overhead, and periodic self-critique nudges keep reasoning on track.

tools:
  - type: think
    critique: true
    max_thoughts: 30

Options

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| critique | bool | false | Append a self-critique nudge every 5th thought |
| max_thoughts | int | 50 | Ring buffer capacity (1–200). Oldest thoughts are evicted when full |

Registered Functions

  • think(thought: str) -> str — Append a thought and return the full numbered chain. With critique: true, every 5th thought includes a nudge: "You have recorded N thoughts. Before proceeding, critically evaluate your reasoning. What assumptions might be wrong? What have you missed?"

When to Use

  • Always add type: think for agents doing multi-step reasoning.
  • Enable critique: true for complex tasks where self-correction matters.
  • Reduce max_thoughts for agents with tight token budgets.

The think tool works in both single-shot and autonomous mode. In autonomous mode, thoughts persist across iterations through run-scoped state. See Reasoning Primitives for strategies that orchestrate thinking across turns.
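The ring-buffer and critique behavior can be sketched in a few lines (ThinkPad is an illustrative name, not the actual InitRunner implementation):

```python
from collections import deque


class ThinkPad:
    """Minimal sketch: ring-buffer eviction plus an every-5th-thought nudge."""

    def __init__(self, max_thoughts: int = 50, critique: bool = False):
        self.thoughts: deque = deque(maxlen=max_thoughts)  # oldest evicted when full
        self.critique = critique
        self.count = 0

    def think(self, thought: str) -> str:
        self.count += 1
        self.thoughts.append(thought)
        # Return the full numbered chain, not just the latest thought
        chain = "\n".join(f"{i}. {t}" for i, t in enumerate(self.thoughts, 1))
        if self.critique and self.count % 5 == 0:
            chain += (f"\nYou have recorded {self.count} thoughts. Before "
                      "proceeding, critically evaluate your reasoning.")
        return chain
```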

Example

# Careful reasoning agent with self-critique
spec:
  role: >
    You are a careful, methodical assistant. Before answering any question
    or taking any action, always use the think tool to reason step-by-step.
  model:
    provider: openai
    name: gpt-5-mini
  tools:
    - type: think
      critique: true
    - type: datetime

Todo Tool

Priority-aware task management with dependency resolution. The agent creates structured todo lists, works through items by priority, and auto-completes when all items reach terminal status. Operates on run-scoped state — fresh per run, never leaking across sessions.

tools:
  - type: todo
    max_items: 30

Options

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| max_items | int | 30 | Maximum concurrent items (1–100) |
| shared | bool | false | Back state with SQLite for sub-agent access |
| shared_path | str | "" | SQLite file path (required when shared: true) |

Registered Functions

| Tool | Description |
| --- | --- |
| add_todo(description, priority?, depends_on?) | Create an item. Returns its 8-char ID + the full formatted list |
| batch_add_todos(items) | Create multiple items at once. Supports inter-batch dependency refs via index ("0", "1", ...) |
| update_todo(id, status?, notes?, priority?) | Update fields on an existing item. Returns the full formatted list |
| remove_todo(id) | Remove an item and clean up dangling dependency references |
| list_todos(status_filter?) | Show all items, or filter by status |
| get_next_todo() | Return the highest-priority pending item whose dependencies are all in terminal status |
| finish_task(summary, status) | Explicitly signal task completion (completed/blocked/failed) |

Statuses

| Status | Terminal? | Icon | Description |
| --- | --- | --- | --- |
| pending | No | [ ] | Not started |
| in_progress | No | [>] | Currently being worked on |
| completed | Yes | [x] | Successfully finished |
| failed | Yes | [!] | Failed |
| skipped | Yes | [-] | Intentionally skipped |

Priority and Dependencies

Priority ordering: critical > high > medium > low. get_next_todo() returns the highest-priority pending item whose dependencies are all in terminal status.

Items can depend on other items by ID. In batch creation, use 0-based indices as dependency refs. Cycles are detected via Kahn's algorithm and rejected immediately.
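A minimal sketch of that cycle check, assuming a mapping of item IDs to their dependency lists (has_cycle is illustrative, not the shipped implementation):

```python
from collections import defaultdict, deque


def has_cycle(deps: dict) -> bool:
    """Kahn's algorithm: repeatedly remove items with no unresolved
    dependencies; anything left over implies a cycle."""
    indegree = {item: len(d) for item, d in deps.items()}
    dependents = defaultdict(list)
    for item, d in deps.items():
        for dep in d:
            dependents[dep].append(item)
    queue = deque(i for i, n in indegree.items() if n == 0)
    resolved = 0
    while queue:
        node = queue.popleft()
        resolved += 1
        for follower in dependents[node]:
            indegree[follower] -= 1
            if indegree[follower] == 0:
                queue.append(follower)
    return resolved != len(deps)


has_cycle({"a": ["b"], "b": ["a"]})  # True: a and b depend on each other
```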

Auto-Completion

When every item in the list reaches a terminal status (completed, failed, or skipped), the autonomous loop automatically signals completion. The agent does not need to call finish_task explicitly — though it can do so at any time to override.

Shared Mode

When shared: true, the todo list is backed by SQLite with WAL mode for concurrent access. Sub-agents spawned via the spawn tool can read and update the same list.

tools:
  - type: todo
    shared: true
    shared_path: ./.initrunner/shared_todo.db

When to Use

Add the todo tool for agents that need to track multi-step work:

  • Autonomous agents — structured task tracking with automatic completion detection.
  • Todo-driven reasoning — pair with spec.reasoning.pattern: todo_driven for plan-first execution. See Reasoning Primitives.
  • Multi-agent coordination — enable shared: true so spawned sub-agents can update the same list.

Example

# Autonomous agent with structured task tracking
spec:
  role: |
    You are a project planner. Break tasks into structured
    todo lists and work through each item systematically.
  model:
    provider: openai
    name: gpt-5-mini
  tools:
    - type: think
      critique: true
    - type: todo
      max_items: 20
  reasoning:
    pattern: todo_driven
    auto_plan: true
  autonomy:
    max_plan_steps: 20
  guardrails:
    max_iterations: 15
    autonomous_token_budget: 100000

Spawn Tool

Non-blocking parallel agent execution. Spawn sub-agents as background tasks, poll for results, and await completion — all within a single agent run.

tools:
  - type: spawn
    max_concurrent: 3
    timeout_seconds: 120
    agents:
      - name: researcher
        role_file: ./agents/researcher.yaml
        description: Researches a specific topic

Options

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| agents | list | (required) | Agent refs with name, role_file or url, and description |
| max_concurrent | int | 4 | Maximum parallel tasks (1–16) |
| max_depth | int | 3 | Maximum delegation depth |
| timeout_seconds | int | 300 | Per-task wall-clock timeout |
| shared_memory | object | null | Shared LanceDB memory config |

Each agent ref needs either role_file (inline execution) or url (remote execution via MCP).

Registered Functions

| Tool | Description |
| --- | --- |
| spawn_agent(agent_name, prompt) | Submit a background task. Returns immediately with a task_id |
| poll_tasks(task_ids?) | Check status of specific tasks or all. Returns a formatted status table |
| await_tasks(task_ids) | Block until all specified tasks complete. Returns their results |
| await_any(task_ids) | Block until any one task completes. Returns its result |
| cancel_task(task_id) | Cancel a running background task |

Task statuses: running, completed, failed, timeout.

When to Use

  • Parallelizable research — spawn multiple researchers for different topics simultaneously.
  • Fan-out/gather — distribute work across specialist agents and synthesize results.
  • Long-running sub-tasks — offload heavy work to background agents while the coordinator continues.

See Reasoning Primitives for how to compose the spawn tool with todo-driven strategies.

Example

# Coordinator with parallel sub-agents
spec:
  role: |
    You are a research lead. Spawn researchers for different topics
    and synthesize their findings into a report.
  model:
    provider: openai
    name: gpt-5-mini
  tools:
    - type: todo
    - type: spawn
      max_concurrent: 3
      agents:
        - name: web-researcher
          role_file: ./agents/web-researcher.yaml
          description: Searches the web and summarizes findings
        - name: data-analyst
          role_file: ./agents/data-analyst.yaml
          description: Analyzes data and produces charts
  reasoning:
    pattern: todo_driven
    auto_plan: true

Script Tool

Defines inline shell scripts in YAML as named, parameterized agent tools. Each script becomes a separate tool function with typed parameters. Script bodies are piped to an interpreter via stdin — no temporary files, no shell=True.

tools:
  - type: script
    interpreter: /bin/sh           # default interpreter
    timeout_seconds: 30            # default timeout per script
    max_output_bytes: 102400       # default: 100 KB
    working_dir: null              # default: role directory
    scripts:
      - name: disk_usage
        description: Check disk usage for a path
        interpreter: /bin/bash     # override per script
        body: |
          df -h "$TARGET_PATH"
        parameters:
          - name: target_path
            description: Filesystem path to check
            required: true

Top-Level Options

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| scripts | list[ScriptDefinition] | (required) | One or more script definitions. Names must be unique. |
| interpreter | str | "/bin/sh" | Default interpreter for scripts that don't specify their own. |
| timeout_seconds | int | 30 | Default timeout for scripts that don't specify their own. |
| max_output_bytes | int | 102400 | Maximum output size (100 KB). Truncated output includes a [truncated] marker. |
| working_dir | str \| null | null | Working directory for all scripts. null uses the role file's directory. |

Script Definition

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| name | str | (required) | Tool function name. Must be a valid Python identifier. |
| description | str | "" | Tool description shown to the LLM. Falls back to "Run the '<name>' script". |
| body | str | (required) | The script source. Piped to the interpreter via stdin. Must not be empty. |
| interpreter | str \| null | null | Override the top-level interpreter for this script. null inherits from parent. |
| parameters | list[ScriptParameter] | [] | Parameters injected as uppercase environment variables. |
| timeout_seconds | int \| null | null | Override the top-level timeout for this script. null inherits from parent. |
| allowed_commands | list[str] | [] | When non-empty, validates that every command line in the body uses one of these commands. Empty list skips validation. |

Script Parameter

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| name | str | (required) | Parameter name. Must be a valid Python identifier. Injected as NAME (uppercased) in the subprocess environment. |
| description | str | "" | Parameter description for the LLM. |
| required | bool | false | Whether the parameter is required. |
| default | str | "" | Default value for optional parameters. |

Parameter Injection

Parameters are injected as uppercase environment variables. A parameter named target_path becomes $TARGET_PATH in the script body:

parameters:
  - name: target_path
    description: Filesystem path to check
    required: true
# In the script body:
df -h "$TARGET_PATH"

Default values are always applied to the environment, so scripts work correctly even when the LLM omits optional parameters.
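
The injection behavior can be sketched as follows. This is a hypothetical helper, not InitRunner's actual code — it only illustrates the uppercasing and default-merging described above:

```python
import os

# Hypothetical sketch of parameter -> environment injection.
def build_env(parameters, args):
    """parameters: list of dicts with name/default; args: values the LLM supplied."""
    env = dict(os.environ)  # the real tool also scrubs sensitive keys (see Security)
    for p in parameters:
        value = args.get(p["name"], p.get("default", ""))
        env[p["name"].upper()] = str(value)  # target_path -> TARGET_PATH
    return env

params = [{"name": "target_path", "default": "/"}]
env = build_env(params, {"target_path": "/var/log"})
# env["TARGET_PATH"] == "/var/log"; when the LLM omits the arg, the default "/" is used
```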

Security

  • No shell=True — Scripts are piped to the interpreter via stdin, not passed through a shell.
  • Env scrubbing — Sensitive environment variables (OPENAI_API_KEY, AWS_SECRET, etc.) are removed from the subprocess environment.
  • Output bounded — Output exceeding max_output_bytes is truncated with a [truncated] marker.
  • Timeout enforcement — Scripts that exceed their timeout are killed and a SubprocessTimeout error is raised.
  • Working directory isolation — When working_dir is set, all scripts execute in that directory. Falls back to the role file's directory.
  • Runtime sandbox — When security.sandbox.backend is set to bwrap, docker, or auto, scripts run inside the resolved backend. See Runtime Sandbox.
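
The stdin-piping execution model can be sketched like this. The snippet uses Python itself as the interpreter so it runs anywhere; it illustrates the mechanism, not the tool's actual source:

```python
import os
import subprocess
import sys

# Sketch: the script body is piped to the interpreter on stdin --
# no temporary files, no shell=True, so no shell ever parses the body.
body = 'import os; print("path is", os.environ["TARGET_PATH"])'

result = subprocess.run(
    [sys.executable],                           # the configured interpreter, e.g. /bin/sh
    input=body,                                 # body arrives via stdin
    env={**os.environ, "TARGET_PATH": "/tmp"},  # real code starts from a scrubbed env
    capture_output=True,
    text=True,
    timeout=30,                                 # ~ timeout_seconds
)
print(result.stdout.strip())
```

Because the interpreter is invoked as an argv list and the body never passes through a shell, parameter values cannot inject extra commands; they are only visible as environment variables.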

Examples

Single-command scripts with allowed_commands:

tools:
  - type: script
    scripts:
      - name: disk_usage
        description: Check disk usage for a path
        allowed_commands: [df]
        body: |
          df -h "$TARGET_PATH"
        parameters:
          - name: target_path
            required: true
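
The allowed_commands check can be sketched as a per-line first-token validation. This is hypothetical logic shown for illustration — the real parser may treat pipes, quoting, and line continuations differently:

```python
import shlex

# Hypothetical sketch: every non-empty, non-comment line must start
# with a command from the allowlist.
def validate_body(body, allowed):
    if not allowed:               # empty list skips validation entirely
        return True
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue              # skip blanks and comments
        first = shlex.split(line)[0]
        if first not in allowed:
            return False
    return True

assert validate_body('df -h "$TARGET_PATH"\n', ["df"])
assert not validate_body("rm -rf /\n", ["df"])
```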

Multi-command scripts (no allowed_commands — trusts the role author):

tools:
  - type: script
    scripts:
      - name: system_info
        description: Show basic system information
        interpreter: /bin/bash
        body: |
          echo "Hostname: $(hostname)"
          echo "Kernel: $(uname -r)"
          echo "Uptime: $(uptime -p 2>/dev/null || uptime)"
          echo "Memory:"
          free -h 2>/dev/null || echo "free not available"

Python interpreter:

tools:
  - type: script
    scripts:
      - name: calculate
        description: Evaluate a math expression
        interpreter: python3
        body: |
          import os, ast
          print(ast.literal_eval(os.environ["EXPR"]))
        parameters:
          - name: expr
            description: Math expression to evaluate
            required: true

Auto-Registered Tools

Document Search (from ingest)

When spec.ingest is configured, a search_documents tool is auto-registered:

search_documents(query: str, top_k: int = 5, source: str | None = None) -> str
  • query — natural-language search string (embedded and compared against stored chunks).
  • top_k — number of results to return (default 5).
  • source — optional glob pattern to filter results by source file path (e.g. "*billing*").

See Ingestion for full details and the RAG Patterns Guide for usage examples.
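
The source filter behaves like ordinary glob matching on the stored source path — a sketch using fnmatch (illustrative only; the chunk shape here is hypothetical):

```python
from fnmatch import fnmatch

# Illustrative: how a glob like "*billing*" narrows results by source path.
chunks = [
    {"source": "docs/billing/invoices.md", "text": "..."},
    {"source": "docs/auth/login.md", "text": "..."},
]
hits = [c for c in chunks if fnmatch(c["source"], "*billing*")]
# hits contains only the billing document
```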

Memory Tools (from memory)

When spec.memory is configured, up to five tools are auto-registered depending on which memory types are enabled:

  • remember(content, category)
  • recall(query, top_k, memory_types)
  • list_memories(category, limit, memory_type)
  • learn_procedure(content, category)
  • record_episode(content, category)

See Memory.

Plugin Tools

Third-party packages can register new tool types via the initrunner.tools entry point. Once installed (pip install initrunner-<name>), the new type is available in spec.tools like any built-in.

List discovered plugins with initrunner plugins.

Note: Plugin tools do not support the permissions block. The plugin parser strips non-type keys into a generic config dict, so permissions is silently ignored. This is a known limitation.

Async Tool Execution

When running inside Flow or the API layer, agents are built with prefer_async=True. This gives I/O-bound tools async closures that run natively on the asyncio event loop without thread-pool overhead.

| Tool | Async Behavior |
|------|----------------|
| http | Uses httpx.AsyncClient with SSRF-safe transport |
| web_reader | Async fetch and markdown conversion |
| web_scraper | Async fetch + concurrent embeddings via asyncio.gather |
| search | Async HTTP for search APIs |

Inherently blocking tools (filesystem, script, shell, sql, git) ignore prefer_async, since their work is either CPU-bound or relies on blocking system calls and libraries. Custom tools are always sync — PydanticAI auto-wraps them in run_in_executor when running in an async context.
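
That auto-wrapping corresponds to the standard asyncio pattern shown below — a sketch of the mechanism, not PydanticAI's exact code:

```python
import asyncio

# A blocking sync tool (imagine real file I/O or a blocking library call here).
def slow_sync_tool(path: str) -> str:
    return f"read {path}"

async def call_tool(path: str) -> str:
    loop = asyncio.get_running_loop()
    # Run the sync function in the default thread-pool executor so it
    # doesn't stall the event loop while it blocks.
    return await loop.run_in_executor(None, slow_sync_tool, path)

result = asyncio.run(call_tool("notes.md"))
```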

Resource Limits

| Tool | Limit | Behavior |
|------|-------|----------|
| read_file | 1 MB | Truncated with [truncated] note |
| http_request | 100 KB | Truncated with [truncated] note |
| git_* | 100 KB | Truncated with recovery hint |
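
Bounded output across these tools follows the same shape — cut at the byte limit and append a marker. A minimal sketch, assuming the [truncated] marker text shown above:

```python
# Sketch of bounded-output truncation with a [truncated] marker.
def bound_output(data: bytes, limit: int) -> str:
    if len(data) <= limit:
        return data.decode(errors="replace")
    return data[:limit].decode(errors="replace") + "\n[truncated]"

print(bound_output(b"x" * 10, 5))  # first 5 bytes plus the marker
```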
