
API Server

The initrunner run <role> --serve command exposes any agent as an OpenAI-compatible HTTP API. Use InitRunner agents as drop-in replacements for OpenAI in any client that speaks the chat completions format — including the official OpenAI SDKs, curl, and tools like Open WebUI.

Quick Start

# Start the server
initrunner run role.yaml --serve

# With authentication
initrunner run role.yaml --serve --api-key my-secret-key

# Custom host/port
initrunner run role.yaml --serve --host 0.0.0.0 --port 3000

CLI Options

See CLI Reference — Run Options for the full flag list. The key --serve flags:

| Option | Type | Default | Description |
|---|---|---|---|
| --serve | bool | false | Enable API server mode |
| --host | str | 127.0.0.1 | Host to bind to (0.0.0.0 for all interfaces) |
| --port | int | 8000 | Port to listen on |
| --api-key | str | null | API key for Bearer token authentication |
| --cors-origin | str | null | Allowed CORS origin (repeatable) |
| --audit-db | Path | ~/.initrunner/audit.db | Audit database path |
| --no-audit | bool | false | Disable audit logging |

Endpoints

GET /health

Always returns 200 OK. Not protected by authentication.

{"status": "ok"}

GET /v1/models

Lists available models. Returns the agent's metadata.name as the model ID.

{
  "object": "list",
  "data": [
    {
      "id": "my-agent",
      "object": "model",
      "created": 1700000000,
      "owned_by": "initrunner"
    }
  ]
}
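
To query this endpoint from the command line (the Authorization header is only needed when the server was started with --api-key):

curl http://127.0.0.1:8000/v1/models \
  -H "Authorization: Bearer my-secret-key"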

POST /v1/chat/completions

The main chat completions endpoint. Accepts the standard OpenAI request format.

| Field | Type | Default | Description |
|---|---|---|---|
| model | str | "" | Model name (ignored — uses role config) |
| messages | list | [] | Conversation messages (role + content) |
| stream | bool | false | Enable SSE streaming |
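
A non-streaming request returns a single chat.completion object. The sketch below shows the general shape, matching the standard OpenAI format; the field values are illustrative:

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "my-agent",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help?"},
      "finish_reason": "stop"
    }
  ]
}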

ChatMessage Fields

| Field | Type | Description |
|---|---|---|
| role | str | "user", "assistant", or "system" |
| content | str \| list[ContentPart] | Plain text string, or a list of content parts for multimodal input |

Multimodal Input

The content field supports multimodal content parts in the standard OpenAI format. See Multimodal Input for the full reference.

Content Part Types

| Type | Field | Description |
|---|---|---|
| text | text | Plain text content |
| image_url | image_url | Image via HTTP URL or base64 data: URI |
| input_audio | input_audio | Audio as base64 with format specifier |

Image via URL

curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
      ]
    }]
  }'

Image via Base64

curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,iVBORw0KGgo..."}}
      ]
    }]
  }'

Audio Input

curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Transcribe this audio."},
        {"type": "input_audio", "input_audio": {"data": "<base64>", "format": "mp3"}}
      ]
    }]
  }'

OpenAI Python SDK (multimodal)

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="my-agent",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)

Streaming

When stream: true, the server responds with Server-Sent Events (SSE):

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant"}}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"}}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop"}]}

data: [DONE]
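
To stream from the command line, set stream to true in the request body and pass -N to curl so output is not buffered:

curl -N http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'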

Multi-Turn Conversations

Use the X-Conversation-Id header for server-side conversation history (see the Python sketch after this list):

  1. Send a request with X-Conversation-Id: conv-001.
  2. The server stores message history after each request.
  3. Subsequent requests with the same ID use stored history — only the last user message is the new prompt.
  4. Conversations expire after 1 hour of inactivity.
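
The header also works through the OpenAI Python SDK, which accepts a per-request extra_headers argument. A minimal sketch (conv-002 is an arbitrary conversation ID):

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="unused")

# Both requests carry the same conversation ID, so the server replays
# the stored history before each new user message.
for prompt in ["My name is Alice.", "What is my name?"]:
    response = client.chat.completions.create(
        model="my-agent",
        messages=[{"role": "user", "content": prompt}],
        extra_headers={"X-Conversation-Id": "conv-002"},
    )
    print(response.choices[0].message.content)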

Authentication

When --api-key is set, all /v1/* endpoints require:

Authorization: Bearer <api-key>

The /health endpoint is never protected.

Usage Examples

curl

curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

curl (with auth and conversation)

# First message
curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer my-secret-key" \
  -H "X-Conversation-Id: conv-001" \
  -d '{"messages": [{"role": "user", "content": "My name is Alice."}]}'

# Follow-up
curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer my-secret-key" \
  -H "X-Conversation-Id: conv-001" \
  -d '{"messages": [{"role": "user", "content": "What is my name?"}]}'

OpenAI Python SDK

from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8000/v1",
    api_key="my-secret-key",  # or "unused" if no --api-key set
)

response = client.chat.completions.create(
    model="my-agent",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

OpenAI Python SDK (streaming)

from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8000/v1",
    api_key="unused",
)

stream = client.chat.completions.create(
    model="my-agent",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

OpenAI Node.js SDK

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://127.0.0.1:8000/v1",
  apiKey: "my-secret-key",
});

const response = await client.chat.completions.create({
  model: "my-agent",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);

Open WebUI Integration

Open WebUI gives you a ChatGPT-like web interface for any InitRunner agent. Because initrunner run --serve speaks the OpenAI wire format, Open WebUI works out of the box — no plugins or adapters needed.

Setup

This walkthrough uses the support-agent example, which includes a RAG knowledge base.

1. Ingest the knowledge base

initrunner ingest examples/roles/support-agent/support-agent.yaml

2. Start the InitRunner server

initrunner run examples/roles/support-agent/support-agent.yaml --serve --host 0.0.0.0 --port 3000

--host 0.0.0.0 is required so the Docker container can reach the server.

3. Launch Open WebUI

docker run -d \
  --name open-webui \
  --network host \
  -e OPENAI_API_BASE_URL=http://127.0.0.1:3000/v1 \
  -e OPENAI_API_KEY=unused \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main

4. Open your browser

Navigate to http://localhost:8080, create a local account, and select the support-agent model from the model dropdown. Start chatting — responses are served by your InitRunner agent.

Cleanup

docker rm -f open-webui
docker volume rm open-webui

Notes

  • If you start the server with --api-key, set OPENAI_API_KEY to the same value in the docker run command.
  • For production deployments, consider running both services behind a reverse proxy with TLS.
