How COSMO Works

The execution model

COSMO uses what we call "guided mode"—you provide a task structure, and the system executes it. This is intentionally different from open-ended AI chat.

Plan → Execute → Knowledge

When you start a COSMO run, you define:

Topic — What you're researching or building
Context — Background information, constraints, focus areas
Phases — The structure of work (research first, then analysis, then outputs)
Depth — How many cycles to run (more cycles = deeper investigation)

The system parses this into a formal plan with milestones and tasks, then executes it while you watch.
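
As a rough illustration, the four inputs above could be modeled as a small record that is validated before planning begins. The class name, field names, and validation rules below are assumptions made for this sketch, not COSMO's actual schema, and the sample values are invented.

```python
from dataclasses import dataclass


@dataclass
class RunDefinition:
    """Hypothetical container for the four inputs of a run."""
    topic: str          # what you are researching or building
    context: str        # background information, constraints, focus areas
    phases: list[str]   # ordered structure of work
    depth: int          # number of cycles to run

    def validate(self) -> None:
        # Assumed rules: a run needs a topic, at least one phase, and one cycle.
        if not self.topic.strip():
            raise ValueError("topic must not be empty")
        if not self.phases:
            raise ValueError("at least one phase is required")
        if self.depth < 1:
            raise ValueError("depth must be at least 1 cycle")


# Invented example values, for illustration only.
run = RunDefinition(
    topic="Battery recycling methods",
    context="Focus on lithium-ion cells",
    phases=["research", "analysis", "outputs"],
    depth=3,
)
run.validate()
```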

The two agents

COSMO uses an "IDE-first" architecture. Instead of many specialized agents, there are two:

Research Agent

Handles external information gathering: web search, finding sources, and collecting data from the internet.

This is the only agent that reaches outside the system.

IDE Agent

Handles everything else. File operations, analysis, synthesis, code generation, document creation—one capable agent that does most of the work.

"IDE-first" means we favor one versatile agent over many narrow specialists. Less coordination overhead, simpler execution.

This is a deliberate simplification. Earlier versions of COSMO had many specialized agents (analysis, synthesis, document creation, code creation, etc.). The current architecture consolidates most work into the IDE Agent.

Task execution tiers

Tasks don't all run at once. They execute in dependency order:

Tier 0: Data Collection
  └── Research agents gather external information
  └── No dependencies (can start immediately)
        ↓
Tier 1: Processing
  └── IDE agents analyze gathered data
  └── Waits for Tier 0 to complete
        ↓
Tier 2: Creation
  └── IDE agents create outputs (documents, code)
  └── Waits for Tier 1 to complete
        ↓
Tier 3: Validation
  └── Quality checks on outputs
  └── Final synthesis across phases

This tiered approach ensures tasks have the inputs they need before they start. Research happens before analysis. Analysis happens before document creation.
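
One common way to derive tiers like these is from a dependency map: a task with no dependencies sits in tier 0, and every other task sits one tier above its deepest dependency. The sketch below shows that general technique. The function name and the task names are hypothetical, and it is not taken from COSMO's code.

```python
def assign_tiers(deps: dict[str, set[str]]) -> dict[str, int]:
    """Map each task to a tier: 0 if it has no dependencies,
    otherwise 1 + the highest tier among its dependencies."""
    tiers: dict[str, int] = {}
    visiting: set[str] = set()

    def tier_of(task: str) -> int:
        if task in tiers:
            return tiers[task]
        if task in visiting:
            raise ValueError(f"dependency cycle at {task!r}")
        visiting.add(task)
        tiers[task] = max((tier_of(d) + 1 for d in deps[task]), default=0)
        visiting.discard(task)
        return tiers[task]

    for task in deps:
        tier_of(task)
    return tiers


# Hypothetical tasks mirroring the four tiers described above.
tasks = {
    "gather_sources": set(),
    "analyze_sources": {"gather_sources"},
    "write_report": {"analyze_sources"},
    "validate_report": {"write_report"},
}
tiers = assign_tiers(tasks)
```

With this layout, any task in tier N can only start once everything it depends on in lower tiers has finished.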

The Plan Executor

A dedicated component called the Plan Executor manages task lifecycle:

Tracks which tasks are pending, running, completed, or failed
Manages phase advancement (won't start Phase 2 until Phase 1 completes)
Handles retries when tasks fail
Assigns agents to tasks based on what the task requires
Maintains state so execution can recover from interruptions

The Plan Executor is the "single authority" for task execution. It prevents race conditions and ensures tasks complete in the right order.
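
The lifecycle described above can be pictured as a small state machine. The following is a minimal sketch of that idea, not the Plan Executor itself: the class, its methods, and the retry rule (a failed task returns to pending until its retry limit is used up) are all assumptions.

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


class TaskTracker:
    """Hypothetical single owner of task status and retry counts."""

    def __init__(self, max_retries: int = 2) -> None:
        self.max_retries = max_retries
        self.status: dict[str, Status] = {}
        self.attempts: dict[str, int] = {}

    def add(self, task_id: str) -> None:
        self.status[task_id] = Status.PENDING
        self.attempts[task_id] = 0

    def start(self, task_id: str) -> None:
        if self.status[task_id] is not Status.PENDING:
            raise RuntimeError(f"{task_id} is not pending")
        self.status[task_id] = Status.RUNNING
        self.attempts[task_id] += 1

    def finish(self, task_id: str, ok: bool) -> None:
        if self.status[task_id] is not Status.RUNNING:
            raise RuntimeError(f"{task_id} is not running")
        if ok:
            self.status[task_id] = Status.COMPLETED
        elif self.attempts[task_id] <= self.max_retries:
            self.status[task_id] = Status.PENDING  # eligible for a retry
        else:
            self.status[task_id] = Status.FAILED   # retry limit used up

    def phase_done(self, task_ids: list[str]) -> bool:
        # A phase may advance only when all of its tasks have completed.
        return all(self.status[t] is Status.COMPLETED for t in task_ids)


tracker = TaskTracker(max_retries=1)
tracker.add("collect")
tracker.start("collect")
tracker.finish("collect", ok=False)  # first attempt fails, goes back to pending
tracker.start("collect")
tracker.finish("collect", ok=True)   # the retry succeeds
```

Routing every status change through one object is what makes a "single authority" possible: no other component can move a task between states.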

The knowledge brain

As COSMO executes tasks, it builds a knowledge graph—what we call a "brain."

What's in a brain

Nodes — Individual pieces of knowledge (facts, concepts, findings)
Edges — Relationships between nodes (supports, contradicts, relates to)
Outputs — Generated documents, code, reports
Metadata — Sources, timestamps, which agent created what
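
To make the structure concrete, here is a minimal sketch of a graph with typed edges and per-node metadata. The class names, field names, and relation labels are assumptions chosen to mirror the list above. COSMO's real storage format is not described here.

```python
from dataclasses import dataclass, field

# Relation labels assumed from the edge types named above.
RELATIONS = {"supports", "contradicts", "relates_to"}


@dataclass
class Node:
    id: str
    text: str        # the fact, concept, or finding
    source: str      # metadata: where it came from
    created_by: str  # metadata: which agent created it


@dataclass
class Edge:
    src: str
    dst: str
    relation: str


@dataclass
class Brain:
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: list[Edge] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.id] = node

    def link(self, src: str, dst: str, relation: str) -> None:
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must exist before linking")
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self.edges.append(Edge(src, dst, relation))

    def neighbors(self, node_id: str, relation: str) -> list[str]:
        """IDs of nodes that node_id points to via the given relation."""
        return [e.dst for e in self.edges
                if e.src == node_id and e.relation == relation]


# Placeholder content, for illustration only.
brain = Brain()
brain.add_node(Node("n1", "Claim A", "example source", "research"))
brain.add_node(Node("n2", "Finding B", "example source", "ide"))
brain.link("n2", "n1", "supports")
```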

Why this matters

The brain persists after execution completes. You can:

Query it with natural language questions
Visualize the knowledge graph
Continue building on it with additional runs
Export findings and outputs

Unlike a chat conversation that disappears, the brain is a persistent artifact. Knowledge accumulates instead of evaporating.

Watching execution

COSMO streams execution progress in real time over a WebSocket connection. You can watch:

Which tasks are currently running
What the Research Agent is searching for
What files the IDE Agent is creating or modifying
When phases complete and new phases begin
Task success and failure status

This transparency is intentional. You see what the system is doing, not just what it produces.

Execution modes

COSMO supports different execution modes depending on how tightly you want to control the work:

Strict — 100% task-focused. Executes exactly what you defined.
Mixed — ~85% task, ~15% exploration. Mostly follows the plan but can pursue relevant tangents.
Advisory — ~65% task. The plan provides guidance, but the system has more latitude to explore.

Most users want Strict or Mixed mode. Advisory mode is for open-ended investigation where you want the system to have more autonomy.
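
One way to read these percentages is as the probability that a given cycle works on a planned task rather than exploring. That interpretation is an assumption, since the text above gives only the ratios and not how they are applied. The names below are hypothetical, and exploration is treated as whatever share is left over.

```python
import random

# Task shares taken from the list above; exploration is the remainder.
TASK_SHARE = {"strict": 1.00, "mixed": 0.85, "advisory": 0.65}


def next_action(mode: str, rng: random.Random) -> str:
    """Pick 'task' or 'explore' for one cycle, weighted by the mode."""
    return "task" if rng.random() < TASK_SHARE[mode] else "explore"


rng = random.Random(0)
sample = [next_action("mixed", rng) for _ in range(10_000)]
mixed_task_fraction = sample.count("task") / len(sample)
```

Under this reading, Strict never explores, because `rng.random()` is always below 1.0.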

State management

Recent development has focused on reliability. The system now uses:

Cluster State Store — Persists all plan, task, and agent state
Task State Queue — Serializes state changes to prevent race conditions
Retry logic — Failed tasks can be retried with configurable limits
Crash recovery — Execution can resume after interruption

These aren't user-facing features, but they make runs more dependable: failed tasks are retried within their configured limits, and an interrupted run can resume from persisted state.
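
The idea behind serializing state changes can be sketched with a queue and a single writer: many threads submit updates, but only one thread applies them, so no two writes can interleave. This illustrates the general pattern under assumed names. It is not the Task State Queue's actual implementation.

```python
import queue
import threading


class TaskStateQueue:
    """Hypothetical single-writer queue for task state changes."""

    def __init__(self) -> None:
        self._updates: queue.Queue = queue.Queue()
        self.state: dict[str, str] = {}
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def submit(self, task_id: str, status: str) -> None:
        # Safe to call from any thread; the update is applied later, in order.
        self._updates.put((task_id, status))

    def _drain(self) -> None:
        # The only code path that writes to self.state.
        while True:
            item = self._updates.get()
            if item is None:  # sentinel: stop the worker
                break
            task_id, status = item
            self.state[task_id] = status

    def close(self) -> None:
        # Everything queued before the sentinel is applied before join returns.
        self._updates.put(None)
        self._worker.join()


state_queue = TaskStateQueue()
senders = [
    threading.Thread(target=state_queue.submit, args=(f"task-{i}", "completed"))
    for i in range(20)
]
for t in senders:
    t.start()
for t in senders:
    t.join()
state_queue.close()
```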
