Agent infrastructure that thinks between sessions
Relay cuts token waste. Stitch routes messages. PAG persists your world model.


Products

Three tools. One coherent stack.

Relay

Smart context — fewer tokens, always current.

Relay classifies your agent's context as ephemeral or durable, refreshes what's stale, and delivers a lean briefing instead of a raw history dump.

30–50% fewer input tokens per call
Ephemeral vs durable classification
Auto-refresh expired signals on startup
Learn more about Relay
Relay · example
// Before Relay
messages: fullHistory // 40k tokens
 
// After Relay
const briefing = await relay.getBriefing(sessionId)
messages: briefing.messages // ~18k tokens

How it works

Three products. One coherent loop.

PAG builds the world model. Relay turns it into a token-efficient briefing. Stitch routes messages between agents, and those notes weave into the next briefing.

01
PAG

Build the world model

PAG tracks everything your agent observes — entities, relationships, attention weights. Every note, every reasoning conclusion, every signal gets written to the graph. Between sessions, the world keeps changing. PAG keeps up.

Entity graph · attention weights · self-model · agent-to-agent channel

pag note "DeepInfra replied to pitch"
pag think "what does this signal mean?"
# → writes conclusion back to self-model
pag hot # → shows top entities by weight
02
Relay

Synthesize the briefing

Relay takes the world model and makes it token-efficient. It classifies context as ephemeral (time-sensitive, refresh on startup) or durable (stable, load directly). The output is a clean briefing — not a raw history dump.

Ephemeral classification · durable storage · auto-refresh · 30–50% token reduction

// classify → refresh → deliver
const briefing = await relay.getBriefing(sessionId)
// briefing.messages: ~18k tokens
// vs fullHistory: 40k tokens
03
Stitch

Route inter-agent messages

Stitch lets agents leave notes for each other without being online at the same time. Scout drops intel. The main agent picks it up on startup. No broker. No polling. The notes get woven into the next briefing automatically.

Async delivery · namespace isolation · typed notes · zero dependencies

// Scout drops intel
await stitch.drop({
  to: "agent/clawd",
  type: "intel",
  body: findings,
})
 
// clawd reads on startup
const notes = await stitch.inbox({ unread: true })
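The three steps above can be sketched end to end. This is a minimal in-memory stand-in for the loop — the `pagNote`, `stitchDrop`, and `relayBriefing` functions below are illustrative assumptions, not the real SDK APIs:

```typescript
// In-memory stand-ins for the three products (illustrative only).
const graph: string[] = [];                          // PAG: observations in the world model
const inbox: { from: string; body: string }[] = [];  // Stitch: queued inter-agent notes

// PAG step: write an observation to the graph.
function pagNote(observation: string): void {
  graph.push(observation);
}

// Stitch step: drop a note for another agent, no broker needed.
function stitchDrop(from: string, body: string): void {
  inbox.push({ from, body });
}

// Relay step: weave the world model and unread notes into one lean briefing.
function relayBriefing(): string[] {
  const notes = inbox.splice(0).map((n) => `[${n.from}] ${n.body}`);
  return [...graph, ...notes];
}

// One turn of the loop:
pagNote("DeepInfra replied to pitch");
stitchDrop("scout", "competitor launched v2");
const briefing = relayBriefing();
// briefing now carries both the observation and the woven-in note,
// and the inbox is drained until the next drop.
```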

FAQ

Common questions

Still have questions? Email patrick@sparkco.ai

How does Relay cut input tokens by 30–50%?

Relay classifies your agent's context into two buckets: ephemeral (time-sensitive data like competitor prices, API statuses, live news) and durable (stable data like user preferences, domain knowledge, decision frameworks). On startup, Relay refreshes expired ephemeral data and loads durable context directly — then delivers a synthesized briefing instead of the full history. Result: 30–50% fewer input tokens per call, with more accurate and current context.
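The classify → refresh → load behavior described above can be sketched as follows. The `ContextItem` shape, the TTL field, and the `refresh` hook are illustrative assumptions, not Relay's actual data model:

```typescript
// Illustrative sketch of Relay's two-bucket model (assumed shapes, not the real API).
type ContextItem = {
  key: string;
  value: string;
  kind: "ephemeral" | "durable"; // classification bucket
  fetchedAt: number;             // ms since epoch
  ttlMs?: number;                // only ephemeral items expire
};

// An ephemeral item past its TTL is stale; durable items never are.
function isStale(item: ContextItem, now: number): boolean {
  return (
    item.kind === "ephemeral" &&
    item.ttlMs !== undefined &&
    now - item.fetchedAt > item.ttlMs
  );
}

// Refresh stale ephemeral items, load everything else directly,
// and return the assembled briefing.
function buildBriefing(
  items: ContextItem[],
  refresh: (key: string) => string, // hypothetical refresh hook
  now: number = Date.now(),
): ContextItem[] {
  return items.map((item) =>
    isStale(item, now)
      ? { ...item, value: refresh(item.key), fetchedAt: now }
      : item,
  );
}
```

Only the stale ephemeral entries pay a refresh cost on startup; durable context flows straight through, which is where the token savings come from.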

SparkCo Infra

Built for agents.
Runs everywhere.

Local-first infrastructure for AI agents — memory, messaging, and context in one stack. No cloud dependencies. Works with any LLM provider.

Local-first
No cloud required
Zero deps
No broker, no server
Any LLM
Provider-agnostic