
Multimodal Input

InitRunner supports sending images, audio, video, and documents alongside text prompts. Multimodal input works across the CLI, interactive REPL, OpenAI-compatible API server, web dashboard, and TUI.

Supported File Types

| Category | Extensions | Notes |
| --- | --- | --- |
| Image | .jpg, .jpeg, .png, .gif, .webp | Most models support these natively |
| Audio | .mp3, .wav, .ogg, .flac, .aac | Requires model support (e.g. gpt-4o-audio-preview) |
| Video | .mp4, .webm, .mov, .mkv | Limited model support |
| Document | .pdf, .docx, .xlsx | Sent as binary content |
| Text | .txt, .md, .csv, .html | Inlined as text in the prompt |

Size limit: 20 MB per file.
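
To pre-check files before attaching them, the table above can be mirrored in a small lookup. This is a minimal sketch, not InitRunner's own code: the names `CATEGORIES` and `classify_attachment` are hypothetical, and case-insensitive extension matching is an assumption.

```python
from pathlib import PurePath
from typing import Optional

# Extension-to-category map transcribed from the table above.
CATEGORIES = {
    "image": {".jpg", ".jpeg", ".png", ".gif", ".webp"},
    "audio": {".mp3", ".wav", ".ogg", ".flac", ".aac"},
    "video": {".mp4", ".webm", ".mov", ".mkv"},
    "document": {".pdf", ".docx", ".xlsx"},
    "text": {".txt", ".md", ".csv", ".html"},
}


def classify_attachment(name: str) -> Optional[str]:
    """Return the category for a file name, or None if it is unsupported."""
    # Assumption: extensions are compared case-insensitively.
    ext = PurePath(name).suffix.lower()
    for category, extensions in CATEGORIES.items():
        if ext in extensions:
            return category
    return None
```

A file with no extension has an empty suffix, so it falls through to `None` just like an unknown extension.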

CLI Usage

Use --attach (or -A) to attach files or URLs to a prompt. The flag is repeatable.

# Single file
initrunner run role.yaml -p "Describe this image" -A photo.png

# Multiple files
initrunner run role.yaml -p "Compare these" -A before.png -A after.png

# URL attachment
initrunner run role.yaml -p "What's in this image?" -A https://example.com/photo.jpg

# Mixed files and URLs
initrunner run role.yaml -p "Summarize" -A report.pdf -A https://example.com/chart.png

--attach requires -p (or piped stdin). Without a prompt, the command exits with an error.
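
When driving the CLI from a script, the repeatable flag maps naturally onto a list of arguments. The sketch below uses only the flags shown above (`run`, `-p`, `--attach`); the helper names `build_run_command` and `run_agent` are hypothetical, and it assumes `initrunner` is on `PATH`.

```python
import subprocess
from typing import List, Sequence


def build_run_command(role: str, prompt: str, attachments: Sequence[str]) -> List[str]:
    """Build the argv for `initrunner run` with one --attach per item."""
    if not prompt:
        # --attach requires a prompt, so fail early instead of letting the CLI exit.
        raise ValueError("a prompt is required when attaching files")
    cmd = ["initrunner", "run", role, "-p", prompt]
    for item in attachments:
        cmd += ["--attach", item]  # the flag is repeatable: one per file or URL
    return cmd


def run_agent(role: str, prompt: str, attachments: Sequence[str]) -> subprocess.CompletedProcess:
    """Run the CLI; check=True raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(
        build_run_command(role, prompt, attachments),
        capture_output=True,
        text=True,
        check=True,
    )
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or quotes.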

Interactive REPL

In interactive mode (-i), three commands manage attachments:

| Command | Description |
| --- | --- |
| /attach <path_or_url> | Queue a file or URL for the next prompt |
| /attachments | List queued attachments |
| /clear-attachments | Clear all queued attachments |

Queued attachments are sent with your next message and then cleared automatically.

> /attach diagram.png
Queued attachment: diagram.png
> /attach notes.pdf
Queued attachment: notes.pdf
> /attachments
  1. diagram.png
  2. notes.pdf
> What do these show?
[assistant response with both attachments]
> /attachments
No attachments queued.

Server API (OpenAI Format)

The initrunner serve endpoint accepts multimodal content in the standard OpenAI format. The content field of a ChatMessage can be a string or a list of content parts.

Content Part Types

| Type | Field | Description |
| --- | --- | --- |
| text | text | Plain text content |
| image_url | image_url | Image via HTTP URL or base64 data: URI |
| input_audio | input_audio | Audio as base64 with format specifier |

Image via URL

curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
      ]
    }]
  }'

Image via Base64

curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,iVBORw0KGgo..."}}
      ]
    }]
  }'
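
The base64 payload in the request above is normally generated from a local file rather than typed by hand. A minimal sketch of building the `image_url` content part in Python follows; the helper names and the `IMAGE_MIME_TYPES` map are illustrative assumptions, not part of InitRunner.

```python
import base64
from pathlib import Path

# MIME types for the image extensions listed under Supported File Types.
IMAGE_MIME_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_part_from_bytes(data: bytes, filename: str) -> dict:
    """Wrap raw image bytes in an OpenAI-style image_url part with a data: URI."""
    ext = Path(filename).suffix.lower()
    if ext not in IMAGE_MIME_TYPES:
        raise ValueError(f"not a supported image extension: {filename}")
    encoded = base64.b64encode(data).decode("ascii")
    url = f"data:{IMAGE_MIME_TYPES[ext]};base64,{encoded}"
    return {"type": "image_url", "image_url": {"url": url}}


def image_part_from_file(path: str) -> dict:
    """Read a local image and build the content part for it."""
    return image_part_from_bytes(Path(path).read_bytes(), path)
```

The returned dict can be placed directly in the `content` list next to a `text` part.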

Audio Input

curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Transcribe this audio."},
        {"type": "input_audio", "input_audio": {"data": "<base64>", "format": "mp3"}}
      ]
    }]
  }'

The format field defaults to "mp3" if omitted.
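
The same pattern applies to audio. In this sketch the hypothetical helper `audio_part` leaves `format` out of the payload when no value is given, so the server-side default described above applies. Which format values the server accepts beyond "mp3" is not stated here, so treat any other value as an assumption to verify.

```python
import base64
from typing import Optional


def audio_part(data: bytes, fmt: Optional[str] = None) -> dict:
    """Build an input_audio content part from raw audio bytes."""
    payload = {"data": base64.b64encode(data).decode("ascii")}
    if fmt is not None:
        # Only send "format" when the caller specifies one;
        # omitting it relies on the documented "mp3" default.
        payload["format"] = fmt
    return {"type": "input_audio", "input_audio": payload}
```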

OpenAI Python SDK

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="my-agent",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)

Web Dashboard

The chat interface supports file uploads via a button or drag-and-drop.

Upload flow:

  1. Files are uploaded to POST /roles/{role_id}/chat/upload and staged in memory
  2. The server returns a list of attachment IDs
  3. Attachment IDs are passed to the SSE stream endpoint with the next prompt
  4. Staged files expire after 5 minutes if unused

Limits: 20 MB per file, same supported file types as the CLI.
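
The staging behavior in the upload flow can be pictured as an in-memory store whose entries carry a deadline. The sketch below is a conceptual model only, not the dashboard's implementation: `StagingStore`, its methods, the injectable clock, and the choice of 20 × 1024 × 1024 bytes for "20 MB" are all assumptions, as is the idea that a file is removed from staging once a prompt uses it.

```python
import time
import uuid
from typing import Callable, Dict, List, Tuple

MAX_BYTES = 20 * 1024 * 1024  # assumption: "20 MB" means binary megabytes
TTL_SECONDS = 5 * 60          # staged files expire after 5 minutes if unused


class StagingStore:
    """Hypothetical in-memory staging area keyed by attachment ID."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._files: Dict[str, Tuple[float, str, bytes]] = {}

    def stage(self, filename: str, data: bytes) -> str:
        """Store an upload and return its attachment ID (steps 1 and 2)."""
        if len(data) > MAX_BYTES:
            raise ValueError(f"File too large: {filename} (max 20 MB)")
        self._purge()
        attachment_id = uuid.uuid4().hex
        self._files[attachment_id] = (self._clock() + TTL_SECONDS, filename, data)
        return attachment_id

    def claim(self, attachment_ids: List[str]) -> List[Tuple[str, bytes]]:
        """Remove and return staged files for the next prompt (step 3)."""
        self._purge()
        missing = [i for i in attachment_ids if i not in self._files]
        if missing:
            raise KeyError(f"unknown or expired attachment ids: {missing}")
        claimed = []
        for attachment_id in attachment_ids:
            _, filename, data = self._files.pop(attachment_id)
            claimed.append((filename, data))
        return claimed

    def _purge(self) -> None:
        """Drop entries whose deadline has passed (step 4)."""
        now = self._clock()
        expired = [k for k, (deadline, _, _) in self._files.items() if deadline <= now]
        for key in expired:
            del self._files[key]
```

Injecting the clock keeps the expiry logic testable without waiting five minutes.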

TUI

In the TUI chat panel, press Ctrl+A to attach a file. The same file type restrictions and 20 MB size limit apply.

Model Support

Not all models support all modalities. If a model doesn't support a given content type, the provider API will return an error.

| Modality | Example models |
| --- | --- |
| Images | gpt-4o, gpt-4o-mini, claude-sonnet-4-5-20250929, gemini-2.0-flash |
| Audio | gpt-4o-audio-preview |
| Video | gemini-2.0-flash |
| Documents (PDF) | gpt-4o, claude-sonnet-4-5-20250929, gemini-2.0-flash |

When in doubt, use gpt-4o or a Claude model: both are listed above for image and PDF input. Audio and video input need one of the specific models shown in the table.

Error Handling

| Condition | Error |
| --- | --- |
| File not found | Attachment file not found: <path> |
| No file extension | Cannot determine file type — file has no extension: <path> |
| Unsupported extension | Unsupported file type '<ext>' for: <path>. Supported: ... |
| File exceeds 20 MB | File too large (<size> MB): <path>. Maximum: 20 MB |
| Dashboard upload too large | File too large: <filename> (max 20 MB), returned with HTTP 400 |

In the interactive REPL, attachment errors are printed and the prompt is not sent. In the CLI, the command exits with a non-zero status.
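
For scripts that want to catch these conditions before invoking InitRunner, the checks can be approximated locally. This is a sketch under stated assumptions, not InitRunner's validation code: `AttachmentError` and `validate_attachment` are hypothetical names, the order of the checks is a guess, and so are the formatting of `<ext>`, `<size>`, and the supported list (which the table elides).

```python
from pathlib import Path

# Extensions from the Supported File Types table.
SUPPORTED = {
    ".jpg", ".jpeg", ".png", ".gif", ".webp",
    ".mp3", ".wav", ".ogg", ".flac", ".aac",
    ".mp4", ".webm", ".mov", ".mkv",
    ".pdf", ".docx", ".xlsx",
    ".txt", ".md", ".csv", ".html",
}
MAX_BYTES = 20 * 1024 * 1024  # assumption: "20 MB" means binary megabytes


class AttachmentError(ValueError):
    """Hypothetical error type for a rejected attachment."""


def validate_attachment(path: str) -> Path:
    """Return the path if it passes local checks, else raise AttachmentError."""
    p = Path(path)
    if not p.is_file():
        raise AttachmentError(f"Attachment file not found: {path}")
    ext = p.suffix.lower()
    if not ext:
        # \u2014 is the dash that appears in the documented message.
        raise AttachmentError(
            f"Cannot determine file type \u2014 file has no extension: {path}"
        )
    if ext not in SUPPORTED:
        # The rendering of the supported list is a guess; the table elides it.
        supported = ", ".join(sorted(SUPPORTED))
        raise AttachmentError(
            f"Unsupported file type '{ext}' for: {path}. Supported: {supported}"
        )
    size = p.stat().st_size
    if size > MAX_BYTES:
        # One-decimal size formatting is an assumption.
        size_mb = size / (1024 * 1024)
        raise AttachmentError(
            f"File too large ({size_mb:.1f} MB): {path}. Maximum: 20 MB"
        )
    return p
```

A caller can mirror the REPL behavior by printing the error and skipping the prompt, or mirror the CLI by exiting with a non-zero status.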
