The Architecture Nobody Is Selling

The Best Agent Is Not An Agent At All.

Everyone is building multi-agent orchestration systems. One company quietly realized the answer was already there — in a folder on your computer.


01 — The Problem

The Context Wall

To make AI useful, you need to route it to the right place — with the right instructions, tools, and data. This is the context wall. Everyone is solving it the hard way.

❌ The Wrong Way

Agent Frameworks

Teams spend months writing Python orchestration code — custom harnesses, routing graphs, vector databases — only to watch a model update break half of it.

LangChain · LangGraph · Semantic Kernel · AutoGen · CrewAI

✓ The Right Way

The File Tree

A folder with a markdown file and a script. One coding agent reads it, gets full context, executes tasks, spawns sub-agents, and manages memory — zero orchestration code.

SKILL.md · script.py · data/ · Claude Code


02 — The Core Insight

Every Agent Needs Only Three Things

Strip away every framework. Any agent ever built is just a system for routing a model to the right combination of these three things — every time.

📋

Instructions

What the agent should do, how to behave, what rules it follows.

→ SKILL.md

🔧

Tools

What the agent can actually do — APIs, code execution, database queries.

→ tools.py / MCP servers

🗄️

Data

What the agent knows — company docs, history, config, reference knowledge.

→ data/ folder
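To make the three-part mapping above concrete, here is a minimal sketch that reads one skill folder and separates its instructions, tools, and data. The `Skill` dataclass and `load_skill` function are hypothetical names introduced for illustration, and the layout (a `SKILL.md`, top-level `.py` scripts, a `data/` subfolder) is an assumption based on the file names mentioned in this page rather than a documented specification.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Skill:
    instructions: str   # contents of SKILL.md (empty if the file is absent)
    tools: list[Path]   # Python scripts sitting at the top of the folder
    data: list[Path]    # every file found under data/


def load_skill(folder: Path) -> Skill:
    """Collect instructions, tools, and data from one skill folder (assumed layout)."""
    skill_md = folder / "SKILL.md"
    instructions = skill_md.read_text(encoding="utf-8") if skill_md.is_file() else ""
    tools = sorted(folder.glob("*.py"))
    data_dir = folder / "data"
    data = sorted(p for p in data_dir.rglob("*") if p.is_file()) if data_dir.is_dir() else []
    return Skill(instructions, tools, data)
```

Nothing here orchestrates anything; it only shows that the three ingredients can be recovered from the folder by name and position.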

"You can map every single AI agent out there to a simple file tree. Not approximately. Exactly."

— Core Thesis

03 — Interactive Explorer

Click Through the Architecture

A real company's AI setup — built entirely from folders and files. Click any node to understand what lives there and why.


04 — Side by Side

Old Way vs Right Way

The same outcome — a multi-workflow AI system — built two different ways.

⚠️ Agent Framework Approach

✗ Three separate agent deployments
One per workflow. Each has its own infrastructure, dependencies, and failure modes.

✗ Python orchestration code
Thousands of lines that break on model updates.

✗ Brittle to model updates
Every new release potentially breaks the orchestration layer.

✗ Scaling = more infrastructure
10 workflows means 10 agent deployments.

✅ File Tree Approach

✓ One agent, unlimited workflows
Claude Code reads whatever skill folder it's pointed at. Zero extra infrastructure.

✓ Zero orchestration code
Markdown and Python. Anyone can read, edit, and debug without a framework.

✓ Model updates make it better
New capabilities condense to simpler instructions. Files never break.

✓ Scaling = adding folders
Thousands of parallel instances from one foundation.
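As a rough illustration of "scaling = adding folders", the sketch below lists every workflow available under a root directory by looking for `SKILL.md` files. The function name `discover_skills` and the example folder names in the usage note are hypothetical; the only assumption carried over from the page is that a skill is a folder containing a `SKILL.md`.

```python
from pathlib import Path


def discover_skills(root: Path) -> dict[str, Path]:
    """Map each skill folder (relative to root) to the SKILL.md inside it."""
    return {
        md.parent.relative_to(root).as_posix(): md
        for md in sorted(root.rglob("SKILL.md"))
    }
```

Under this assumption, adding a workflow means creating a new folder such as `sales/quote/` with its own `SKILL.md`; the listing picks it up on the next call with no registration step.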

05 — Abstraction Layers

Build at the Right Layer

Most developers build at the bottom two layers — where the ground shifts fastest.

⚠ Model Layer

Raw intelligence and reasoning. Changes constantly.

AI Companies — don't touch

⚡ Agent Runtime

VS Code / Cursor + Claude Code. Already built.

You — barely touch

📄 Skill / Task

SKILL.md + scripts. Atomic unit of work.

You — build simply

🗂 Workflow

Folder of skill folders. One per department or role.

You — organize freely

🤖 Sub-Agent

Nested folders, spawned dynamically by the agent itself.

Agent — auto-generates

The insight: You're never racing against AI progress because you don't live at the layer that changes.

Model Updates Help You

Instead of breaking your system, every improvement collapses complexity upward.

Scenario 01

Model gains new capability

❌ Old: Orchestration layer needs refactoring.

✓ New: Update two lines in SKILL.md.

Scenario 02

Competitor ships your feature

❌ Old: Your agent framework is commoditized.

✓ New: That workflow becomes a subfolder with one tool call.

Scenario 03

You switch AI providers

❌ Old: Entire SDK surface changes. Months of migration.

✓ New: Point a different agent at the same folders.

The Punchline

The most powerful AI deployment tool is already on your computer

Nobody is selling it. It's called a folder.

Free · Already Built · Infinitely Scalable · Model-Agnostic · Never Obsolete


Academic Research · arXiv 2603.16021 · MIT License

Interpretable Context Methodology (ICM)

Jake Van Clief & David McDermott — Eduba / University of Edinburgh · March 2026


Abstract: Current approaches to AI agent orchestration typically involve building multi-agent frameworks that manage context passing, memory, error handling, and step coordination through code. These frameworks work well for complex, concurrent systems. But for sequential workflows where a human reviews output at each step, they introduce engineering overhead the problem does not require. This paper presents a method that replaces framework-level orchestration with filesystem structure — numbered folders as stages, plain markdown files as context, local scripts for mechanical tasks. Open source under the MIT license.
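The "numbered folders as stages" idea in the abstract can be sketched in a few lines: the stage order is recovered from folder names alone. The naming convention assumed here (`01_research`, `02_script`, a numeric prefix followed by `_` or `-`) and the name `ordered_stages` are illustrative assumptions, not taken from the paper.

```python
import re
from pathlib import Path

# Assumed convention: "<number>_<name>" or "<number>-<name>", e.g. "01_research".
STAGE_PATTERN = re.compile(r"^(\d+)[-_](.+)$")


def ordered_stages(workspace: Path) -> list[tuple[int, str, Path]]:
    """Return (stage number, stage name, folder) sorted by the numeric prefix."""
    stages = []
    for child in workspace.iterdir():
        match = STAGE_PATTERN.match(child.name)
        if child.is_dir() and match:
            stages.append((int(match.group(1)), match.group(2), child))
    return sorted(stages, key=lambda stage: stage[0])
```

Reordering steps then becomes a rename rather than a code change, which is the trade the abstract describes.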

Section 01

The Problem Being Solved

Frameworks like LangChain and AutoGen are genuinely good tools — for concurrent, dynamic systems. But for sequential, human-reviewed workflows, they introduce enormous accidental complexity.

Problem	What It Costs
Changing step order	Requires editing orchestration code and redeploying
Modifying a prompt	Requires finding it buried in agent configuration
Inspecting intermediate state	Requires adding logging, dashboards, or tracing infrastructure
Handing off to a colleague	Requires documenting environment, dependencies, and setup
Non-developer making changes	Often impossible without developer involvement

"For sequential workflows, you're using a hammer designed for a different nail."

— ICM Paper, Section 1

Section 02

The Central Insight

The paper's core observation is elegant and counterintuitive. You don't need a coordination framework — you need a folder structure. The "coordination logic" that LangChain puts in Python objects and message arrays, ICM puts in file names, folder hierarchy, and markdown contracts.

Framework Approach

Coordination in Code

Agents as objects. State as in-memory variables. Coordination as function calls. Opaque by default, requires developers to modify. Breaks when models update.

ICM Approach

Coordination in Folders

The filesystem IS the orchestrator. Coordination logic lives in file names, folder hierarchy, and markdown contracts. Plain text is the universal interface. Inspectable by anyone.

"This is philosophically aligned with Unix's 1970s insight: the power comes not from any individual program, but from how they're connected — and plain text files are the universal connective tissue."

— ICM Paper, Section 2

Section 03

Intellectual Lineage

The paper is unusually well-grounded in CS history. Each theoretical reference maps directly to a concrete ICM design decision — this is not a hack, it's a synthesis.

1978

Unix Pipeline Philosophy

McIlroy

"Do one thing well." "Output of one is input of another." "Plain text as universal interface." The foundational architecture of composable systems.

→ Each ICM stage does one thing. Output folders feed next stages. Everything is markdown.

1979

Make / Build Systems

Feldman

Files are both the artifacts of work AND the coordination mechanism. No separate orchestration layer needed when the filesystem tracks what's been produced.

→ ICM stages have explicit Input tables in their contracts, just like Make dependency declarations.

1986

Multi-Pass Compilers

Aho, Sethi, Ullman

A compiler does multiple passes. Tokenize → Parse → Analyze → Optimize → Generate. Each pass reads the prior output and transforms it into an intermediate representation.

→ ICM does the same with content. Research → Script → Animation. Each stage is a pass. This unlocks incremental recompilation theory.

1972

Information Hiding

Parnas / Dijkstra

Systems should be decomposed so each module hides its internal decisions. "Address one thing at a time."

→ Each ICM stage hides its processing from the next. Stage 2 only sees Stage 1's output file, not how it was produced.

1984

Literate Programming

Knuth

Programs should be written primarily for humans to read. The instruction and the documentation should be the same artifact — not separate documents.

→ ICM's CONTEXT.md files are simultaneously agent instructions AND documentation. Reading them tells you exactly what the pipeline does.

1991

"Worse is Better"

Gabriel

Systems prioritizing simplicity of implementation tend to survive and spread. Easier to port, understand, and improve than feature-complete but complex alternatives.

→ ICM trades framework flexibility for portability. A folder of markdown files can be zipped, emailed, or handed to a non-developer.
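The Make and incremental-recompilation references above suggest a simple check: a stage needs to be rerun only when its output is missing or older than one of its declared inputs. The sketch below uses file modification times the way Make does; `is_stale` is a hypothetical helper, and treating a missing input as an error is a design choice made for this example.

```python
from pathlib import Path


def is_stale(output: Path, inputs: list[Path]) -> bool:
    """True if output is missing or older than any input (Make-style mtime check).

    A missing input raises FileNotFoundError, since the stage cannot run anyway.
    """
    if not output.exists():
        return True
    output_mtime = output.stat().st_mtime_ns
    return any(source.stat().st_mtime_ns > output_mtime for source in inputs)
```

For example, if `research-output.md` is edited after `script-draft.md` was produced, `is_stale` reports that the script stage should be rerun, while later stages whose inputs did not change can be left alone.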

Section 04

The Architecture In Depth

The heart of ICM is a five-layer context hierarchy, numbered Layer 0 through Layer 4 in the order listed below. Every agent at every stage loads context from exactly these layers, structured so the model receives already-organized context, not a mixed dump.

CLAUDE.md

Structural / routing — "Where am I?" Identity of the workspace. Top-level orientation for the agent.

~800 tokens

CONTEXT.md

Structural / routing — "Where do I go?" Workspace structure, stage sequence, and navigation map.

~300 tokens

Stage CONTEXT.md

Structural / routing — "What do I do?" The current stage's specific task, process, and output requirements.

200–500 tokens

Reference Material

Content — "What rules apply?" voice.md, design-system.md, conventions.md. Static across runs. Model internalizes as constraints.

500–2k tokens

Working Artifacts

Content — "What am I working with?" research-output.md, script-draft.md. Changes every run. Model transforms as input.

Varies
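One way to picture the hierarchy is as a function that concatenates the layers in a fixed order, each under its own label, so that routing files always come before reference material and working artifacts. This is a sketch under assumptions: `assemble_context` is a hypothetical name, the 0 to 4 numbering is inferred from the "Layer 3" and "Layer 4" labels used on this page, and in practice the agent reads these files itself rather than receiving one concatenated string.

```python
from pathlib import Path


def assemble_context(workspace: Path, stage: Path,
                     references: list[Path], artifacts: list[Path]) -> str:
    """Join the five layers in order, labelling each file with its layer."""
    layers = [
        ("Layer 0: workspace identity", [workspace / "CLAUDE.md"]),
        ("Layer 1: workspace map", [workspace / "CONTEXT.md"]),
        ("Layer 2: stage contract", [stage / "CONTEXT.md"]),
        ("Layer 3: reference material", references),
        ("Layer 4: working artifacts", artifacts),
    ]
    sections = []
    for label, paths in layers:
        for path in paths:
            if path.is_file():  # silently skip layers that are not present
                body = path.read_text(encoding="utf-8").strip()
                sections.append(f"## {label} ({path.name})\n{body}")
    return "\n\n".join(sections)
```

The point of the fixed order is that rules (Layer 3) and per-run inputs (Layer 4) never arrive interleaved.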

Layer 3 vs Layer 4: The Critical Distinction

When you mix rules with per-run content in an undifferentiated context window, the model has to sort them itself. ICM separates them structurally before the model ever sees them.

Layer 3 — The Factory (Reference)

Changes per run? No

Examples: voice.md, design-system.md

Model should: Internalize as constraints

Analogy: The recipe

Layer 4 — The Product (Working)

Changes per run? Yes

Examples: research-output.md, script-draft.md

Model should: Transform as input

Analogy: The ingredients

ICM keeps each stage at 2,000–8,000 focused tokens. The monolithic alternative reaches 30,000–50,000 tokens, most of it irrelevant to the current stage. This is prevention, not compression — the irrelevant tokens are never loaded in the first place.

— Citing Liu et al., "Lost in the Middle," 2024
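A quick way to sanity-check the 2,000 to 8,000 token range quoted above is to estimate the size of the files a stage would load. The sketch below uses a four-characters-per-token rule of thumb, which is only a rough approximation for English text and not a real tokenizer; `estimate_tokens` and `check_stage_budget` are hypothetical helpers.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # assumption: crude average for English prose, not a tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate token count by ceiling-dividing the character count."""
    return -(-len(text) // CHARS_PER_TOKEN)


def check_stage_budget(files: list[Path], low: int = 2_000,
                       high: int = 8_000) -> tuple[int, str]:
    """Return the estimated total and whether it is under, within, or over budget."""
    total = sum(estimate_tokens(p.read_text(encoding="utf-8")) for p in files)
    if total > high:
        verdict = "over"
    elif total < low:
        verdict = "under"
    else:
        verdict = "within"
    return total, verdict
```

For exact counts, the estimate would need to be replaced with the tokenizer of whichever model is in use.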

Section 05

What Practitioners Actually Do

Empirical observation from 33 practitioners reveals a U-shaped intervention pattern. High editing at the start and end, lower in the middle. This isn't complacency — it's appropriate calibration.

Human Edit Rate by Pipeline Stage

Stage 1 (Research): ~92%
Stage 2 (Script): ~30%
Stage 3 (Middle): ~30%
Final (Production): ~78%

Stage 1: Directional editing — narrowing from broad possibilities. Creative human judgment.

Middle: Well-constrained stages — clear inputs + strong reference = narrow error space.

Final: Alignment editing — checking output against earlier decisions. Closer to debugging.

Section 06

Where ICM Works and Doesn't

The paper is admirably honest. ICM isn't replacing frameworks across the board — it targets a large, common, underserved class of workflows that existing tools over-engineer.

✓ ICM Works For

✓ Sequential workflows: step 2 genuinely follows step 1

✓ Human-reviewed workflows: a person checks each step

✓ Repeatable workflows: same pipeline, different inputs each run (weekly reports, video production, course development)

✓ Non-developer operators: the markdown interface means anyone can modify stage behavior

✗ ICM Does NOT Work For

✗ Real-time multi-agent collaboration: agents needing tight communication loops (AutoGen is the right tool here)

✗ High-concurrency systems: many users hitting the same pipeline simultaneously

✗ Complex automated branching: mid-pipeline automated decisions require scripting that turns ICM into a framework anyway

✗ Dynamic, unpredictable workflows: if you don't know the stages in advance, you can't define the folders in advance

Section 07

Observability as Side Effect

One of the most important points in the paper: observability isn't a feature you add — it's a structural consequence. You cannot make the system opaque. There's nothing to hide.

🔍

Inherently Inspectable

Every intermediate output is a plain file in a predictable folder. Open it in any text editor. No dashboards, no logging infrastructure, no tracing setup required.

⚖️

Regulatory Alignment

The EU AI Act requires human oversight, staged review points, and audit trails for high-risk AI systems. ICM produces all three as a byproduct of architecture — for free.

🧠

Rudin's Principle

Stop building opaque systems and trying to explain them after the fact. Build systems that are inherently interpretable. ICM is a direct implementation of this principle.

Section 08

Future Directions

Section 6 of the paper extends the multi-pass compiler analogy into specific proposed tooling — applying decades of compiler theory to AI workflow debugging.

🔗

Output Provenance Identifiers

Embed markers in stage outputs that link back to the source instruction — like debug symbols in compiled binaries. Trace a wrong phrase in Stage 3 all the way back to the specific line in voice.md that caused it.
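Since this is proposed tooling, no marker format is given here. Purely as an illustration, the sketch below assumes a hypothetical HTML-comment marker of the form `<!-- src: voice.md#L12 -->` appended to an output line, and shows how such markers could be read back to trace a phrase to its source instruction.

```python
import re

# Hypothetical marker format, invented for this example: <!-- src: FILE#LNUMBER -->
MARKER = re.compile(r"<!--\s*src:\s*(?P<file>[^#\s]+)#L(?P<line>\d+)\s*-->")


def trace_sources(output_text: str) -> list[tuple[int, str, int]]:
    """Return (output line number, source file, source line) for every marker."""
    hits = []
    for number, line in enumerate(output_text.splitlines(), start=1):
        for match in MARKER.finditer(line):
            hits.append((number, match.group("file"), int(match.group("line"))))
    return hits
```

HTML comments are used in this sketch because they stay invisible when markdown is rendered, much as debug symbols do not change what a binary does.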

✔️

Cross-Stage Trace Verification

A Verify section in stage contracts that checks current output against earlier stage outputs. Already prototyped as an "audit file" catching timing and alignment errors between Stage 2 and Stage 3.
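The timing check mentioned above might look something like the following sketch, which compares scene durations declared in a Stage 2 script with those in a Stage 3 animation spec. The `Scene N: 12s` line format, the half-second tolerance, and the function names are all assumptions made for this example; the audit file itself is not reproduced on this page.

```python
import re

# Assumed line format in both files: "Scene 3: 12s" or "Scene 3: 12.5s".
SCENE = re.compile(r"^Scene\s+(\d+):\s*(\d+(?:\.\d+)?)s\b", re.MULTILINE)


def scene_durations(text: str) -> dict[int, float]:
    """Extract {scene number: duration in seconds} from a stage output."""
    return {int(number): float(seconds) for number, seconds in SCENE.findall(text)}


def verify_timing(script: str, animation: str, tolerance: float = 0.5) -> list[str]:
    """List mismatches between the script's timings and the animation's."""
    expected = scene_durations(script)
    actual = scene_durations(animation)
    problems = []
    for scene, seconds in sorted(expected.items()):
        if scene not in actual:
            problems.append(f"Scene {scene}: missing from animation spec")
        elif abs(actual[scene] - seconds) > tolerance:
            problems.append(
                f"Scene {scene}: script says {seconds}s, animation says {actual[scene]}s"
            )
    for scene in sorted(set(actual) - set(expected)):
        problems.append(f"Scene {scene}: not in script")
    return problems
```

An empty list means the two stages agree within tolerance; anything else is a concrete item for the human reviewer at the gate.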

⏸️

Breakpoints in Markdown

Pause execution mid-stage to verify the agent interpreted a constraint correctly before continuing. Semantic debugging — inspect the agent's current interpretation, not just its final output.

The Edit-Source Principle

Editing output is patching the binary

Two kinds of edits exist: creative edits (genuine human value — correct to edit output) and diagnostic edits (you tighten the same paragraph every run — this is a bug in the source contract). The proposed direction: track output edits across runs, surface recurring patterns, and suggest source-level changes. Workspaces that improve with use.
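The proposed direction of tracking output edits across runs can be approximated with the standard library. The sketch below diffs each generated output against its human-edited version and counts which lines are removed or rewritten in more than one run; those are candidates for a source-level fix. The function names and the `min_runs` threshold are hypothetical, and exact line matching is a simplification.

```python
import difflib
from collections import Counter


def removed_lines(generated: str, edited: str) -> set[str]:
    """Lines present in the generated output but deleted or rewritten by the editor."""
    diff = difflib.ndiff(generated.splitlines(), edited.splitlines())
    return {line[2:].strip() for line in diff
            if line.startswith("- ") and line[2:].strip()}


def recurring_removals(runs: list[tuple[str, str]],
                       min_runs: int = 2) -> list[tuple[str, int]]:
    """Count removals over (generated, edited) pairs and keep the repeated ones."""
    counts: Counter[str] = Counter()
    for generated, edited in runs:
        counts.update(removed_lines(generated, edited))
    return [(line, n) for line, n in counts.most_common() if n >= min_runs]
```

A line that is cut once is probably a creative edit; a line that is cut every week is the kind of diagnostic edit that points back at the stage contract or reference file.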

Section 09

Key Nuances & Tensions

The paper is unusually self-aware. It names five tensions any serious practitioner should understand before adopting ICM.

Tension 01

Simplicity vs. Capability

ICM explicitly trades capability for simplicity — no concurrent execution, no complex branching, no real-time coordination. The tradeoff is only acceptable if your workflows fall within ICM's scope.

Tension 02

Self-Reported Empirical Data

The U-shaped intervention pattern and the 30-of-33 practitioner claims come from self-reported conversations in an invite-only community. The paper acknowledges this. These are hypotheses worth testing, not established facts.

Tension 03

Model Agnosticism vs. Specific Implementation

ICM claims model-agnosticism, but all testing was done on Claude Opus/Sonnet. The 5-layer hierarchy was likely tuned to how Claude handles context. Whether it generalizes to GPT-4o, Gemini, or Llama is an open empirical question.

Tension 04

Output Editing vs. Source Improvement

The review gate design encourages editing output. But editing output without updating the source means the pipeline doesn't improve over runs. The tooling to close this loop doesn't exist yet.

Tension 05

Growing Context Windows

As models handle 200K+ tokens without degradation, the engineering argument for scoped context loading weakens. But the human-interaction arguments remain: even if the model handles 50K tokens equally well, the practitioner still cannot review a 50K-token context to catch errors. Observability and editability are human concerns, not just model concerns.

Section 10

The Complete Picture

The philosophical claim: the filesystem is a coordination primitive that has been systematically underused in AI system design. What if files are the universal interface?

PROBLEM: Sequential AI workflows → frameworks add unnecessary complexity

INSIGHT: Filesystem IS the orchestrator (Unix, 1970s; still works)

SOLUTION: ICM
├── Numbered folders = stage sequence
├── CONTEXT.md files = stage contracts
├── Layer 3/4 split = rules vs. inputs
├── Review gates = human control points
└── Output folders = handoff points

THEORETICAL GROUNDING:
Unix pipelines → composability
Make → files as coordination
Compilers → multi-pass transforms
Literate programming → self-documenting
Context engineering → focused windows

RESULT:
✓ No framework code
✓ No server infrastructure
✓ Editable by non-developers
✓ Observable by default
✓ Version-controllable
✓ Portable as a zip file
✗ No concurrency
✗ No complex branching
✗ Not for dynamic workflows

FUTURE:
→ Semantic debugging (trace output to source)
→ Edit-source principle (improve pipeline over time)
→ Cross-stage verification
