Autonomous mode lets an agent plan its own work, execute steps, adapt when things go wrong, and signal completion — all without human input. It's enabled by the spec.autonomy section and the -a CLI flag.
Reasoning strategies (react, todo_driven, plan_execute, reflexion) orchestrate agent behavior across autonomous turns. See Reasoning Primitives for the full guide.
An autonomous agent follows a plan-execute-adapt loop:
1. **Plan** — The agent creates a structured todo list using the `todo` tool
2. **Execute** — It works through each step using its tools
3. **Adapt** — If a step fails, the agent modifies its plan (add retries, skip, investigate)
4. **Finish** — The agent calls `finish_task` — or the loop auto-completes when all todo items reach terminal status
The finish_task tool is auto-registered when autonomy is enabled. Task tracking comes from the todo tool — add type: todo to spec.tools for structured planning:
| Tool | Source | Description |
| --- | --- | --- |
| `finish_task(status, summary)` | Auto-registered | Signal task completion with an overall status and summary |
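A minimal configuration enabling both pieces might look like this (field names are taken from the full example later on this page; the surrounding spec is abbreviated):

```yaml
spec:
  tools:
    - type: todo          # structured planning: add_todo, get_next_todo, update_todo, ...
      max_items: 12
  autonomy:               # enabling this block auto-registers finish_task
    max_plan_steps: 12
    max_history_messages: 20
```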
Each autonomous run follows a precise iteration sequence:
Iteration 1 — The agent receives the user prompt plus the system prompt. It creates its initial todo list, then begins executing the first step.
Iterations 2+ — The continuation_prompt is injected with the current ReflectionState (todo progress, completed items, failures). The active reasoning strategy shapes these continuation prompts. The agent continues executing, adapting, or re-planning.
Budget visibility — Each continuation prompt includes a BUDGET block showing remaining resources:
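The exact layout of the BUDGET block is runner-defined; a hypothetical rendering, assuming token, iteration, and time budgets are all configured:

```
BUDGET
- tokens: 11,750 used / 30,000 budget
- iterations: 3 of 6 used
- time: 112s elapsed / 600s limit
```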
This gives the agent awareness of its resource constraints at every turn, enabling it to prioritize critical tasks and call finish_task before hitting a hard limit. Fields are omitted when no budget is configured for that dimension.
History trimming and compaction — When conversation messages exceed max_history_messages, the oldest messages are dropped (keeping the system prompt and the most recent messages). Alternatively, enable history compaction to LLM-summarize old messages before trimming, preserving key context. This prevents context window exhaustion on long runs.
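The trim-then-compact behavior can be sketched as follows. This is a simplified model, not initrunner's implementation; `summarize` is a stand-in for the LLM summarization call, which in practice may be routed through `model_override`:

```python
def compact_history(messages, max_history_messages, threshold, tail_messages, summarize):
    """Trim old messages, optionally LLM-summarizing them first (sketch)."""
    if len(messages) <= max_history_messages:
        return messages
    system, rest = messages[0], messages[1:]   # always keep the system prompt
    tail = rest[-tail_messages:]               # newest messages kept verbatim
    old = rest[:-tail_messages]
    if len(messages) > threshold:              # compaction only activates past the threshold
        try:
            summary = summarize(old)           # LLM call to condense old context
            return [system, {"role": "system", "content": f"[summary] {summary}"}] + tail
        except Exception:
            pass                               # fail-open: fall back to plain trimming
    keep = max_history_messages - 1
    return [system] + rest[-keep:]             # plain trimming: drop the oldest messages
```

Note the fail-open path: if summarization raises, the function silently degrades to plain trimming, matching the behavior described below.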
Budget check — Before each iteration, the runner checks autonomous_token_budget, max_iterations, and autonomous_timeout_seconds. If any limit is reached, the loop terminates.
Terminal conditions — The loop ends when:
- The agent calls `finish_task` (status: `completed`)
- Any guardrail limit is hit (status: `max_iterations`, `budget_exceeded`, or `timeout`)
- The agent reports it is stuck (status: `blocked` or `failed`)
- An unrecoverable error occurs (status: `error`)
Rate limiting — If iteration_delay_seconds is set (> 0), the runner sleeps between iterations to avoid API rate limits.
Result — The final ReflectionState is returned with the terminal status, the plan steps (with their statuses), and the agent's summary.
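Putting the iteration sequence together, the runner's control flow can be approximated like this. It is a sketch of the loop described above, not the actual implementation; `agent.start` and `agent.continue_turn` are hypothetical stand-ins for iteration 1 and iterations 2+:

```python
import time

def run_autonomous(agent, prompt, max_iterations=10,
                   token_budget=None, timeout_seconds=None,
                   iteration_delay_seconds=0):
    """Approximate the plan-execute-adapt loop with guardrail checks (sketch)."""
    start, tokens_used = time.monotonic(), 0
    state = agent.start(prompt)                    # iteration 1: plan + first step
    for iteration in range(2, max_iterations + 1):
        if state.completed:                        # agent called finish_task
            return state
        # Budget check before each iteration
        if token_budget is not None and tokens_used >= token_budget:
            return state.finish("budget_exceeded")
        if timeout_seconds is not None and time.monotonic() - start >= timeout_seconds:
            return state.finish("timeout")
        if iteration_delay_seconds > 0:
            time.sleep(iteration_delay_seconds)    # rate limiting between iterations
        state, spent = agent.continue_turn(state)  # continuation_prompt + ReflectionState
        tokens_used += spent
    return state if state.completed else state.finish("max_iterations")
```

Each terminal status string matches one of the terminal conditions listed above.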
A complete autonomous agent that verifies deployments:
```yaml
apiVersion: initrunner/v1
kind: Agent
metadata:
  name: deployment-checker
  description: Autonomous deployment verification agent
  tags: [devops, autonomous, deployment]
spec:
  role: |
    You are a deployment verification agent. When given one or more URLs
    to check, create a todo list with one item per URL, execute each
    check, and produce a pass/fail report.

    Workflow:
    1. Use batch_add_todos to create a checklist — one item per URL to verify
    2. Use get_next_todo to pick the next item
    3. Run curl -sSL -o /dev/null -w "%{http_code} %{time_total}s" for each URL
    4. Mark each item completed (2xx) or failed (anything else) via update_todo
    5. If a check fails, add a retry item with add_todo
    6. When done, send a Slack summary with pass/fail results per URL
    7. Call finish_task with the overall status
  model:
    provider: openai
    name: gpt-5-mini
    temperature: 0.0
  tools:
    - type: think
    - type: todo
      max_items: 12
    - type: shell
      allowed_commands:
        - curl
      require_confirmation: false
      timeout_seconds: 30
    - type: slack
      webhook_url: "${SLACK_WEBHOOK_URL}"
      default_channel: "#deployments"
      username: Deploy Checker
      icon_emoji: ":white_check_mark:"
  reasoning:
    pattern: todo_driven
    auto_plan: true
  autonomy:
    max_plan_steps: 12
    max_history_messages: 20
    iteration_delay_seconds: 1
    max_scheduled_per_run: 1
  guardrails:
    max_iterations: 6
    autonomous_token_budget: 30000
    max_tokens_per_run: 10000
    max_tool_calls: 15
    session_token_budget: 100000
```
```bash
initrunner run deployment-checker.yaml -a \
  -p "Verify https://api.example.com/health and https://api.example.com/ready"
```
Long-running autonomous agents can lose important context when older messages are dropped by simple history trimming. History compaction solves this by using an LLM call to summarize older messages before they are trimmed, preserving key decisions, tool results, and open tasks.
- **Fail-open** — if the summarization LLM call fails, the original history is kept and trimming proceeds normally. Errors are logged but never crash the loop.
- **Threshold-based** — compaction only activates when message count exceeds `threshold`, avoiding unnecessary LLM calls on short runs.
- **Tail preservation** — the `tail_messages` most recent messages are never summarized, ensuring the agent always has full fidelity on its latest actions.
- **Model flexibility** — use `model_override` to route summarization to a cheaper or faster model (e.g. gpt-4o-mini) to save tokens on the primary model.
See the long-running-analyst example for a complete configuration using compaction.
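Based on the settings described above, a compaction configuration might look like the following. The `spec.autonomy.compaction` path is an assumption; only the `threshold`, `tail_messages`, and `model_override` field names come from this page, so consult the long-running-analyst example for the authoritative layout:

```yaml
spec:
  autonomy:
    max_history_messages: 40
    compaction:                     # path is illustrative; check the example config
      threshold: 30                 # only compact once message count exceeds this
      tail_messages: 10             # newest messages are never summarized
      model_override: gpt-4o-mini   # cheaper model for the summarization call
```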
Autonomous agents need spending limits since they run without human oversight. These fields in spec.guardrails control resource usage:
| Field | Type | Default | Scope | Description |
| --- | --- | --- | --- | --- |
| `max_iterations` | int | 10 | per-run | Maximum plan-execute-adapt cycles |
| `autonomous_token_budget` | int \| null | null | per-run | Token budget for the autonomous run |
| `autonomous_timeout_seconds` | int \| null | null | per-run | Wall-clock timeout for the entire autonomous run |
| `max_tokens_per_run` | int | 50000 | per-iteration | Maximum output tokens consumed per iteration |
| `max_tool_calls` | int | 20 | per-iteration | Maximum tool invocations per iteration |
| `timeout_seconds` | int | 300 | per-iteration | Wall-clock timeout per iteration |
| `max_request_limit` | int \| null | auto | per-iteration | Maximum LLM API round-trips per iteration. Auto-derived as max(max_tool_calls + 10, 30) |
| `session_token_budget` | int \| null | null | session | Cumulative token budget for a REPL session |
| `daemon_token_budget` | int \| null | null | daemon | Lifetime token budget for the daemon process |
| `daemon_daily_token_budget` | int \| null | null | daemon | Daily token budget — resets at UTC midnight |
| `max_scheduled_per_run` | int | 3 | scheduling | Maximum follow-up tasks scheduled per autonomous run |
| `max_scheduled_total` | int | 50 | scheduling | Maximum total scheduled tasks across all runs |
When any limit is hit, the agent stops and reports its progress. See Guardrails for full enforcement behavior, daemon budgets, and all available limits.
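The auto-derived `max_request_limit` default from the table above is simple to express in code:

```python
def default_max_request_limit(max_tool_calls: int) -> int:
    """Auto-derived default: max(max_tool_calls + 10, 30)."""
    return max(max_tool_calls + 10, 30)
```

With the default `max_tool_calls` of 20 this yields 30; the floor of 30 only stops mattering once `max_tool_calls` exceeds 20.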
Autopilot is daemon mode where every trigger runs the full autonomous loop instead of single-shot execution. One flag turns it on:
```bash
initrunner run role.yaml --autopilot
```
A daemon responds. An autopilot thinks, then responds. Someone messages your Telegram bot "find me flights from NYC to London next week." In daemon mode, you get one shot at an answer. In autopilot, the agent searches the web, compares options, checks dates, and sends back something worth reading.
All trigger types support this, including Telegram and Discord.
All existing guardrails apply in autopilot mode: max_iterations, autonomous_token_budget, autonomous_timeout_seconds, max_tool_calls, daemon_token_budget, and daemon_daily_token_budget. The agent stops and reports progress if any limit is hit. See Guardrails for the full list.
Scheduled follow-ups (via schedule_followup / schedule_followup_at) always run in autonomous mode regardless of per-trigger config.
```bash
# Enable autonomous mode
initrunner run role.yaml -a -p "Check all endpoints"

# Autopilot -- all triggers use the autonomous loop
initrunner run role.yaml --autopilot

# Override max iterations
initrunner run role.yaml -a --max-iterations 3 -p "Quick check"
```
At each iteration, the agent's current state is captured as a ReflectionState and injected into the continuation prompt. This gives the agent awareness of what it has accomplished and what remains.
ReflectionState contains:
| Field | Type | Description |
| --- | --- | --- |
| `completed` | bool | Whether the agent has called `finish_task` |
| `summary` | str | Running summary of progress |
| `status` | str | Current status label |
| `todo_list` | TodoList | The current todo list tracking task progress |
Each todo item has description, status, priority, notes, and depends_on fields. See Tools — Todo for the full status and priority reference.
The reflection state — including the formatted todo list — is rendered and appended to the continuation_prompt at the start of each iteration. The active reasoning strategy may customize how this state is presented.
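One way to picture the rendering step (a sketch; the actual strategy-specific formatting will differ, and the dict shapes here are illustrative):

```python
def render_continuation(continuation_prompt, state):
    """Append a formatted ReflectionState to the continuation prompt (sketch)."""
    lines = [continuation_prompt, "",
             f"Status: {state['status']}",
             f"Summary: {state['summary']}",
             "Todo:"]
    for item in state["todo_list"]:
        marker = "x" if item["status"] == "completed" else " "
        lines.append(f"  [{marker}] ({item['priority']}) {item['description']}")
    return "\n".join(lines)
```

The `description`, `status`, and `priority` keys mirror the todo item fields listed above.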
Autonomous mode integrates with the Memory system for persistence and recall:
Session save (--resume) — When memory is configured and the agent is run with --resume, the conversation history (including plan steps and tool outputs) is saved at the end of the run. The next --resume invocation restores context so the agent can pick up where it left off.
finish_task episodic capture — When the agent calls finish_task, the summary is persisted as an episodic memory with category autonomous_run (if episodic memory is enabled). This allows future runs or other agents to recall past outcomes.
recall tool — If memory is enabled, the recall tool is auto-registered. The agent can search all memory types (semantic, episodic, procedural) for past results, patterns, and decisions. Pass memory_types to filter by type. This is useful for agents that run repeatedly (e.g., via cron triggers) and need to avoid repeating past work.
Consolidation on exit — When consolidation.interval is after_autonomous, consolidation runs automatically after the autonomous loop exits, extracting durable semantic facts from episodic records. See Memory: Consolidation.
Cause: The system prompt doesn't instruct the agent to call finish_task, or the agent gets stuck in an adapt loop creating new steps indefinitely.
Fix: Explicitly instruct the agent to call finish_task in spec.role. Set max_iterations and max_plan_steps to enforce hard stops. The max_iterations terminal status is still considered a successful outcome.
Cause: The autonomous token budget is too small for the task, or the agent is producing verbose tool outputs that consume tokens quickly.
Fix: Increase autonomous_token_budget or reduce per-iteration output by lowering model.max_tokens. Check if shell or HTTP tools are returning large outputs — tool output limits (see Guardrails) apply automatically, but the agent may be making too many calls. Reduce max_tool_calls to limit per-iteration tool usage.
Cause: Scheduled follow-ups (via schedule_followup / schedule_followup_at) are stored in-memory only. When the daemon process restarts, all pending scheduled tasks are lost.
Fix: Use cron triggers for recurring tasks instead of schedule_followup. For critical follow-ups, have the agent write the schedule to a file or external system (e.g., a database) and use a cron trigger to check for pending work.
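The file-based workaround can be sketched like this; the file name and helper functions are illustrative, not part of initrunner:

```python
import json
from pathlib import Path

SCHEDULE_FILE = Path("pending_followups.json")   # illustrative location

def persist_followup(task: str, due_at: float) -> None:
    """Append a follow-up to a file so a cron-triggered run can find it."""
    pending = json.loads(SCHEDULE_FILE.read_text()) if SCHEDULE_FILE.exists() else []
    pending.append({"task": task, "due_at": due_at})
    SCHEDULE_FILE.write_text(json.dumps(pending, indent=2))

def due_followups(now: float) -> list[dict]:
    """Called from a cron-triggered agent run: return work that is due."""
    if not SCHEDULE_FILE.exists():
        return []
    return [t for t in json.loads(SCHEDULE_FILE.read_text()) if t["due_at"] <= now]
```

Unlike in-memory `schedule_followup` state, the file survives a daemon restart, and a cron trigger can call `due_followups` on each run to pick up pending work.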
Cause: The model is responding with text-only messages instead of invoking tools. This typically happens when the system prompt is too vague, or when max_tool_calls is set to 0.
Fix: Verify max_tool_calls is greater than 0. Make the system prompt explicit about which tools to use and when. Add example workflows in spec.role that reference tool names directly.
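A role snippet that names tools explicitly might look like this (the tool names match the todo tool's helpers shown elsewhere on this page; adapt to your own toolset):

```yaml
spec:
  role: |
    You are an endpoint checker. Always work through your tools, never
    answer from memory:
    1. Call batch_add_todos with one item per endpoint.
    2. Call get_next_todo, run the check, then update_todo with the result.
    3. Call finish_task with an overall status when every item is terminal.
  guardrails:
    max_tool_calls: 15    # must be > 0 or the agent cannot invoke tools
```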