Multi-Agent Coordination
Build agent swarms with real-time state sync, resource locking, and task handoffs. Coordinate multiple AI agents working on complex, interdependent tasks.
Plan Requirements
Multi-Agent Coordination features vary by plan: task queues require Pro+, and real-time events require Team+.
Overview
Multi-Agent Coordination enables multiple AI agents to work together on complex tasks. This includes:
- Swarms — Groups of agents working toward a shared goal
- Resource Locking — Prevent conflicts when modifying shared resources
- Shared State — Synchronized key-value store across agents
- Task Queues — Distributed work queues with claim/complete semantics
- Task Monitoring — Stats, pagination, and incremental event updates
- Real-time Events — Broadcast messages to other agents
Swarms
A swarm is a coordinated group of agents working together. Each swarm has a unique ID and shared state, and its agents can coordinate via resource locks and task queues.
Creating a Swarm

    rlm_swarm_create({
      name: "refactor-auth",
      goal: "Refactor authentication system to use JWT",
      maxAgents: 5
    })

    // Response:
    {
      "swarmId": "swarm_abc123",
      "joinCode": "XKCD-1234"
    }

Joining a Swarm
Other agents can join using the swarm ID or join code:

    rlm_swarm_join({
      swarmId: "swarm_abc123",
      role: "worker" // Optional: "coordinator", "worker", or "observer"
    })

Swarm Roles
| Role | Description | Typical Use |
|---|---|---|
| coordinator | Plans and assigns tasks | Senior agent that decomposes work |
| worker | Executes assigned tasks | Agents that write code, review, or test |
| observer | Read-only access | Monitoring, logging |
Listing Swarm Members
Inspect who is currently in the swarm, what role they hold, and which agents are available for more work.

    const members = await rlm_swarm_members({
      swarmId: "swarm_abc123"
    })

Leaving or Removing an Agent
Remove an agent when work is done or when a crashed worker needs cleanup. Leaving a swarm releases held claims and unassigns that agent's queued work.

    await rlm_swarm_leave({
      swarmId: "swarm_abc123",
      agentId: "worker-3"
    })

Resource Locking
Prevent multiple agents from modifying the same resource simultaneously using claims:
Claiming a Resource

    // Agent wants to modify a file
    const claim = await rlm_claim({
      resource: "file:src/auth/login.ts",
      swarmId: "swarm_abc123",
      ttl: 300 // 5 minute lock
    })

    if (claim.acquired) {
      // Safe to modify the file
      // ... do work ...
      await rlm_release({ claimId: claim.claimId })
    } else {
      // Another agent has this resource
      console.log("Held by:", claim.heldBy)
    }

Claim Parameters
| Parameter | Type | Description |
|---|---|---|
| resource | string | Resource identifier (e.g., "file:path", "db:table") |
| swarmId | string | Swarm context for the lock |
| ttl | number | Lock timeout in seconds (max 3600) |
| wait | boolean | Wait for lock vs immediate fail |
| waitTimeout | number | Max wait time in seconds |
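
The claim and release calls above pair naturally with try/finally, so that a lock is released even when the work throws. The following is a minimal sketch, not the library's own helper: `withClaim`, the `ClaimApi` interface, and the response shape are assumptions inferred from the example above, and the `ttl` and `waitTimeout` values are only illustrative.

```typescript
// Response shape inferred from the example above; real responses may carry more fields.
interface Claim {
  acquired: boolean;
  claimId?: string;
  heldBy?: string;
}

// Assumed minimal surface of the two tools used here, passed in so the helper is testable.
interface ClaimApi {
  rlm_claim(args: {
    resource: string;
    swarmId: string;
    ttl?: number;
    wait?: boolean;
    waitTimeout?: number;
  }): Promise<Claim>;
  rlm_release(args: { claimId: string }): Promise<unknown>;
}

// Hypothetical helper: runs `work` only while the claim is held and
// always releases the claim afterwards, even if `work` throws.
async function withClaim<T>(
  api: ClaimApi,
  swarmId: string,
  resource: string,
  work: () => Promise<T>,
): Promise<{ ran: true; value: T } | { ran: false; heldBy?: string }> {
  const claim = await api.rlm_claim({
    resource,
    swarmId,
    ttl: 300,        // illustrative: 5 minute lock
    wait: true,      // wait for the lock instead of failing immediately
    waitTimeout: 30, // illustrative: give up after 30 seconds
  });
  if (!claim.acquired || claim.claimId === undefined) {
    return { ran: false, heldBy: claim.heldBy };
  }
  const claimId = claim.claimId;
  try {
    return { ran: true, value: await work() };
  } finally {
    await api.rlm_release({ claimId });
  }
}
```

A caller would pass an object exposing the two tools, then branch on `ran` to decide whether to retry later or pick up other work.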
Shared State
Agents in a swarm can read and write shared key-value state:

    // Set state
    rlm_state_set({
      swarmId: "swarm_abc123",
      key: "current_phase",
      value: "implementation"
    })

    // Get state
    const state = await rlm_state_get({
      swarmId: "swarm_abc123",
      key: "current_phase"
    })
    // state.value === "implementation"

Atomic Updates
For counters and other values that need atomic updates:

    // Increment a counter atomically
    rlm_state_set({
      swarmId: "swarm_abc123",
      key: "files_processed",
      operation: "increment",
      delta: 1
    })

Polling Multiple State Keys
Poll several shared-state keys at once and receive only the values that changed since your last known versions.

    const changed = await rlm_state_poll({
      swarmId: "swarm_abc123",
      keys: ["current_phase", "files_processed", "blocked_count"],
      lastVersions: { current_phase: 2, files_processed: 18 }
    })

Task Queues
Distribute work across agents using task queues. Tasks are claimed, processed, and completed:
Creating Tasks

    // Coordinator creates tasks
    rlm_task_create({
      swarmId: "swarm_abc123",
      type: "implement",
      title: "Add JWT token generation",
      description: "Implement JWT signing in src/auth/jwt.ts",
      priority: "high",
      dependencies: [] // IDs of tasks that must complete first
    })

Creating Tasks in Bulk
Seed a backlog quickly by creating many tasks in one call. This is useful after planning, decomposition, or importing a queue from another system.

    await rlm_task_bulk_create({
      swarmId: "swarm_abc123",
      tasks: [
        { title: "Add JWT utilities", priority: 2 },
        { title: "Write auth migration tests", priority: 1 },
        { title: "Update rollout checklist", priority: 0 }
      ]
    })

Claiming and Completing Tasks

    // Worker claims next available task
    const task = await rlm_task_claim({
      swarmId: "swarm_abc123",
      types: ["implement", "fix"] // Task types this agent handles
    })

    if (task) {
      // Do the work...

      // Mark complete with result
      await rlm_task_complete({
        taskId: task.id,
        status: "completed",
        result: { filesModified: ["src/auth/jwt.ts"] }
      })
    }

Task States
| State | Description |
|---|---|
| pending | Task created, waiting to be claimed |
| in_progress | Agent is working on the task |
| completed | Task finished successfully |
| failed | Task failed (can be retried) |
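
One way to reason about these four states is as a small transition map. The sketch below is illustrative only: the state names are assumptions drawn from the status values used elsewhere on this page, and modeling a retry as a move from failed back to pending is also an assumption, since the page does not say how retries are triggered.

```typescript
// State names are assumptions based on the status values used on this page.
type TaskState = "pending" | "in_progress" | "completed" | "failed";

// Allowed moves: claim, complete or fail, unclaim, and retry.
const transitions: Record<TaskState, TaskState[]> = {
  pending: ["in_progress"],                        // claimed by a worker
  in_progress: ["completed", "failed", "pending"], // finished, failed, or unclaimed
  completed: [],                                   // terminal
  failed: ["pending"],                             // retried (assumed to re-enter the queue)
};

// Returns true when moving from `from` to `to` is allowed by the map above.
function canTransition(from: TaskState, to: TaskState): boolean {
  return transitions[from].indexOf(to) !== -1;
}
```

A local check like this can help a coordinator reject nonsensical status updates before sending them.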
Unclaiming a Stuck Task
If an agent crashes or abandons a task after claiming it, return the task to pending so another worker can pick it up.

    await rlm_task_unclaim({
      swarmId: "swarm_abc123",
      taskId: "task_dead_worker",
      reason: "worker process terminated before completion"
    })

Recovering Stale Tasks in Batch
Use batch recovery to scan a swarm for claimed or in-progress tasks that have gone stale. Start with dryRun: true to preview what would be recovered.

    const preview = await rlm_task_recover({
      swarmId: "swarm_abc123",
      stuckThresholdMinutes: 30,
      dryRun: true
    })

    await rlm_task_recover({
      swarmId: "swarm_abc123",
      stuckThresholdMinutes: 30,
      dryRun: false
    })

Task Statistics
Get aggregated task counts for dashboards and progress tracking. The response distinguishes between blocked tasks (waiting on dependencies) and pending tasks (ready to claim).

    rlm_task_stats({ swarmId: "swarm_abc123" })

    // Response:
    {
      "done": 15,        // Completed
      "in_progress": 3,  // Currently claimed
      "blocked": 2,      // Waiting on dependencies
      "pending": 5,      // Ready to claim
      "failed": 1,
      "total": 26
    }

Task List with Pagination
List tasks with cursor-based pagination for large task queues. Each task includes its owner (the agent that claimed or completed it) and an updated_at timestamp.

    const page1 = await rlm_task_list({
      swarmId: "swarm_abc123",
      status: "completed", // optional filter
      limit: 20
    })
    // Response: { tasks: [...], has_more: true, next_cursor: "..." }

    // Get next page
    const page2 = await rlm_task_list({
      swarmId: "swarm_abc123",
      cursor: page1.next_cursor
    })

Snapshot Task Listing
Use rlm_tasks when you want a simpler filtered list without cursor-based iteration. It's a good fit for quick inspections and lightweight dashboards.

    const openTasks = await rlm_tasks({
      swarmId: "swarm_abc123",
      status: "pending",
      assignedTo: "worker-2",
      limit: 25
    })

Task Events (Incremental Updates)
Get task status change events since a timestamp. Use for incremental progress reports ("5 tasks completed since last check").

    // Get events from last 15 minutes
    const since = new Date(Date.now() - 15*60*1000).toISOString()
    const events = await rlm_task_events({
      swarmId: "swarm_abc123",
      since: since
    })

    // Response:
    // { events: [{ event_type: "task_completed", task_id: "...", ... }], total: 5 }

| Tool | Use Case |
|---|---|
| rlm_task_bulk_create | Seed a swarm backlog in one call after planning or decomposition |
| rlm_tasks | Quick snapshot of tasks filtered by status or assigned agent |
| rlm_task_stats | Dashboard metrics, completion %, health checks |
| rlm_task_list | Build task tables, export reports, iterate all tasks |
| rlm_task_events | Incremental updates, "X tasks closed since last sync" |
| rlm_task_unclaim | Manually re-open a task after a worker crash or abandoned claim |
| rlm_task_recover | Find and recover stale claimed tasks in bulk with a dry-run option |
| rlm_task_update | Admin-level title, description, priority, or status changes |
| rlm_task_reassign | Move work to another agent when load balancing or recovering from failure |
| rlm_task_delete | Remove obsolete, test, or cancelled tasks from the queue |
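
For the "iterate all tasks" use case of rlm_task_list, a loop that follows next_cursor until has_more is false is probably the simplest approach. The sketch below assumes the response shape shown in the pagination example (tasks, has_more, next_cursor). `listAllTasks`, the `ListTasks` type, and the `maxPages` safety cap are hypothetical, and whether the status filter must be repeated on follow-up pages is not stated in this document, so it is passed on every call as an assumption.

```typescript
// Response shape taken from the pagination example on this page.
interface TaskPage<T> {
  tasks: T[];
  has_more: boolean;
  next_cursor?: string;
}

// Assumed call signature for rlm_task_list, injected so the helper can be tested.
type ListTasks<T> = (args: {
  swarmId: string;
  status?: string;
  limit?: number;
  cursor?: string;
}) => Promise<TaskPage<T>>;

// Hypothetical helper: collects every page into one array.
// `maxPages` guards against an endless loop if the cursor never terminates.
async function listAllTasks<T>(
  listTasks: ListTasks<T>,
  swarmId: string,
  status?: string,
  maxPages = 100,
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | undefined = undefined;
  for (let page = 0; page < maxPages; page++) {
    const res: TaskPage<T> = await listTasks({ swarmId, status, limit: 20, cursor });
    all.push(...res.tasks);
    if (!res.has_more || !res.next_cursor) break;
    cursor = res.next_cursor;
  }
  return all;
}
```

In practice the first argument would be the real rlm_task_list tool; for very large queues, processing each page as it arrives instead of accumulating everything may be preferable.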
Updating a Task
Update queue metadata when the task itself changes. This is typically reserved for admins or coordinators.

    await rlm_task_update({
      swarmId: "swarm_abc123",
      taskId: "task_auth_jwt",
      title: "Add JWT signing + verification",
      priority: 3
    })

Reassigning a Task
Reassign a task when the current owner is overloaded, unavailable, or no longer the best fit. Use force only when you intentionally override an in-progress assignment.

    await rlm_task_reassign({
      swarmId: "swarm_abc123",
      taskId: "task_auth_jwt",
      newAgentId: "worker-security",
      force: false
    })

Deleting a Task
Delete obsolete or erroneous tasks from the queue. Keep this for cleanup and cancelled work rather than normal completion flow.

    await rlm_task_delete({
      swarmId: "swarm_abc123",
      taskId: "task_bad_fixture",
      force: false
    })

Real-time Events (Team+)
Broadcast messages to other agents in the swarm for real-time coordination:

    // Broadcast event to swarm
    rlm_broadcast({
      swarmId: "swarm_abc123",
      event: "phase_complete",
      data: { phase: "planning", tasksCreated: 12 }
    })

Event Types
| Event | Description | Use Case |
|---|---|---|
| task_available | New task added to queue | Wake idle workers |
| phase_complete | Major milestone reached | Coordinate phase transitions |
| resource_released | Lock released | Allow waiting agents to proceed |
| error | Critical error occurred | Alert other agents |
| custom | Application-specific | Any coordination need |
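
On the receiving side, an agent will usually want to route each event type to its own handler. The sketch below shows one possible shape for that. The event names come from the table above, but `EventRouter`, the `SwarmEvent` payload shape, and the idea that events arrive as plain objects (for example from polling rlm_swarm_events) are assumptions, not documented behavior.

```typescript
// Event names from the table above; the payload shape is an assumption.
type SwarmEventName =
  | "task_available"
  | "phase_complete"
  | "resource_released"
  | "error"
  | "custom";

interface SwarmEvent {
  event: SwarmEventName;
  data?: Record<string, unknown>;
}

type Handler = (data: Record<string, unknown>) => void;

// Hypothetical dispatcher: routes each event to its registered handlers, if any.
class EventRouter {
  private handlers = new Map<SwarmEventName, Handler[]>();

  // Registers a handler for one event type.
  on(event: SwarmEventName, handler: Handler): void {
    const list = this.handlers.get(event) ?? [];
    list.push(handler);
    this.handlers.set(event, list);
  }

  // Invokes every handler for the event and returns how many ran.
  dispatch(evt: SwarmEvent): number {
    const list = this.handlers.get(evt.event) ?? [];
    for (const handler of list) handler(evt.data ?? {});
    return list.length;
  }
}
```

An idle worker could, for instance, register a task_available handler that triggers its next claim attempt.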
Querying Swarm Events
Use rlm_swarm_events to inspect prior broadcast activity, filter by sender or event type, and build lightweight event timelines.

    const recentEvents = await rlm_swarm_events({
      swarmId: "swarm_abc123",
      eventType: "phase_complete",
      since: new Date(Date.now() - 60 * 60 * 1000).toISOString(),
      limit: 20
    })

Example: Coordinated Refactoring
Here's a complete example of multiple agents working together to refactor a codebase:
Coordinator Agent

    // Create swarm and plan work
    const swarm = await rlm_swarm_create({
      name: "auth-refactor",
      goal: "Migrate from sessions to JWT"
    })

    // Create tasks for workers
    await rlm_task_create({
      swarmId: swarm.swarmId,
      type: "implement",
      title: "Create JWT utilities",
      description: "Add sign/verify functions"
    })
    // ... more tasks ...

    await rlm_broadcast({
      swarmId: swarm.swarmId,
      event: "tasks_ready",
      data: { count: 5 }
    })

Worker Agent

    // Join swarm and work
    await rlm_swarm_join({
      swarmId: "swarm_abc123",
      role: "worker" // one of "coordinator", "worker", or "observer"
    })

    while (true) {
      const task = await rlm_task_claim({
        swarmId: "swarm_abc123",
        types: ["implement"]
      })
      if (!task) break // No more tasks

      // Claim resource lock before editing
      const lock = await rlm_claim({
        resource: `file:${task.targetFile}`,
        swarmId: "swarm_abc123"
      })
      if (!lock.acquired) {
        // Another agent holds the file; return the task to the queue
        await rlm_task_unclaim({
          swarmId: "swarm_abc123",
          taskId: task.id,
          reason: "target file is locked"
        })
        continue
      }

      try {
        // Do the implementation...
      } finally {
        // Release the lock even if the work throws
        await rlm_release({ claimId: lock.claimId })
      }

      await rlm_task_complete({ taskId: task.id, status: "completed" })
    }

Plan Limits
| Feature | Starter | Pro | Team | Enterprise |
|---|---|---|---|---|
| Swarms | 1 | 5 | 20 | Unlimited |
| Agents/Swarm | 2 | 5 | 15 | 50 |
| Resource Locks | 10 | 100 | 500 | Unlimited |
| Task Queue | — | Yes | Yes | Yes |
| Real-time Events | — | — | Yes | Yes |
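
The limits in this table can be mirrored in a small lookup so an agent can check a request locally before calling the API. This is a sketch only: the numbers are copied from the table above, `Infinity` stands in for "Unlimited", and `PLAN_LIMITS` and `fitsPlan` are hypothetical names that are not part of the product.

```typescript
type Plan = "Starter" | "Pro" | "Team" | "Enterprise";

interface PlanLimits {
  swarms: number;         // Infinity stands in for "Unlimited"
  agentsPerSwarm: number;
  resourceLocks: number;  // Infinity stands in for "Unlimited"
  taskQueue: boolean;
  realtimeEvents: boolean;
}

// Values copied from the Plan Limits table.
const PLAN_LIMITS: Record<Plan, PlanLimits> = {
  Starter:    { swarms: 1,  agentsPerSwarm: 2,  resourceLocks: 10,  taskQueue: false, realtimeEvents: false },
  Pro:        { swarms: 5,  agentsPerSwarm: 5,  resourceLocks: 100, taskQueue: true,  realtimeEvents: false },
  Team:       { swarms: 20, agentsPerSwarm: 15, resourceLocks: 500, taskQueue: true,  realtimeEvents: true },
  Enterprise: { swarms: Infinity, agentsPerSwarm: 50, resourceLocks: Infinity, taskQueue: true, realtimeEvents: true },
};

// Hypothetical pre-flight check before creating a swarm with a given maxAgents.
function fitsPlan(plan: Plan, maxAgents: number): boolean {
  return maxAgents <= PLAN_LIMITS[plan].agentsPerSwarm;
}
```

For example, the swarm created earlier with maxAgents: 5 would pass this check on Pro but not on Starter.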
Best Practices
- Start simple — Use shared state before adding task queues
- Always release locks — Use try/finally to ensure release
- Set appropriate TTLs — Prevent deadlocks from crashed agents
- Use task dependencies — Ensure correct execution order
- Monitor swarm state — Check swarm status via the dashboard or API
- Handle claim failures gracefully — Retry or work on other tasks
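
The last practice, handling claim failures gracefully, could be approached with a bounded retry and exponential backoff. The sketch below is one way to do that, under assumptions: `claimWithRetry` is a hypothetical helper, the `ClaimResult` shape is inferred from the claim example earlier on this page, and the attempt count and delays are arbitrary defaults. The claim call itself is passed in as a function so that the helper does not depend on any particular client.

```typescript
// Shape inferred from the rlm_claim example on this page.
interface ClaimResult {
  acquired: boolean;
  claimId?: string;
  heldBy?: string;
}

// Hypothetical helper: retries `tryClaim` with exponential backoff and returns
// the first acquired claim, or null once the attempts run out.
async function claimWithRetry(
  tryClaim: () => Promise<ClaimResult>,
  attempts = 4,
  baseDelayMs = 250,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<ClaimResult | null> {
  for (let i = 0; i < attempts; i++) {
    const claim = await tryClaim();
    if (claim.acquired) return claim;
    // Back off before the next attempt: base, 2x base, 4x base, ...
    if (i < attempts - 1) await sleep(baseDelayMs * 2 ** i);
  }
  return null; // the caller can move on to a different task
}
```

A worker might wrap its rlm_claim call in `tryClaim` and, on a null result, pick up other work instead of blocking.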