Unfold your Agents
In 2025, almost every AI automation framework was built on the same model: autonomous agents sending messages to each other. If you wear a tie, you call this “agentic”.
For most of what we’re building, it’s the wrong model. A better one was being developed in the same building at MIT, the same year the actor model was born, and we mostly walked right past it.
545 Technology Square
In 1973, Carl Hewitt published a paper at MIT titled “A Universal Modular ACTOR Formalism for Artificial Intelligence”. The same year, a few floors below1, Jack Dennis and his group were building the world’s first dataflow computers. Two different models for computing things were being developed in the same hallways and went in different directions.
Hewitt’s vision was explicitly one of autonomous agents collaborating:
“The model becomes a cooperating society of ‘little men’ each of whom can address others with whom it is acquainted and politely request that some task be performed.”2
The ‘little men’ model started being called the actor model through the late 1970s. The formalism turned out to describe systems like Erlang3 that powered telephone networks. These systems were built around independent entities responding to messages from an uncertain world. A telephone switch is a population of actors. It’s a good fit because each call’s routing depends on information that only exists at the time the call is made: who’s calling, what number they dialed, which lines are free, etc.
By contrast, the dataflow model embodies the same insight as systems like spreadsheets and database views4. For these, we declare the dependencies implicitly by saying which values are used to calculate other values, and the system figures out how to run those calculations itself. The “what gets computed when” question is mostly5 a property of how the dependencies link up, following general rules set well ahead of time, rather than depending on the actual values being computed.
These different philosophies of computing lead to very different system designs.
The Conversational Assumption
Fifty years later, when LLMs arrived, we reached for the actor model. Autonomous entities, called agents, receive messages, reason, take actions, and send messages. Frameworks were quick to cement this mental model, and it soon became the norm to attack every nail with an actor-pattern hammer.
The assumption is not unreasonable: an LLM feels like a conversational entity. We talk to them, they respond, they even operate using natural language, so it feels like a natural fit. However, a pattern feeling right doesn’t mean it is right, and this simple-sounding pattern has hidden complexities.
Most of what people actually build with these frameworks isn’t very agentic at all. Document processing, invoice routing, data enrichment, support triage. Mostly predefined paths, with LLMs doing classification and extraction. These are not use-cases where LLMs have to navigate unknown territory. Most of the time when we’re building “agentic” systems, we’re actually building dataflow systems, but using actor-model tools designed for problems we don’t have, so we pay all the costs of supporting decision discretion without needing any of it.
Vibe-lawyers on Agent Street
Wiring up a society of little men comes with a set of questions that the actor model leaves open. Who talks next, and who decides? What does each one need to know? How do humans get involved? The frameworks all have opinions, which is partly why they exist. The questions themselves only exist because we chose the actor model.
- Routing
- Which little man talks next, and who decides?6
- Who does he talk to?7
- What is each one responsible for?8
- State
- What does each little man need to know?9
- What does he need to know about what others know and don’t know?10
- How should the original intent be conveyed during hand-offs?11
- Oversight
- How do we include real humans?12
- How do we know what this society of little men will do next?13
Then, after our society is built, we let it run. This is where the fun starts. Because we picked the actor model, fixing it involves figuring out where things went wrong in a telephone game played at computer-speed by our anti-social league of little men14. After a while, it’s not uncommon to start using very strong language in our instruction prompts as the frustration grows.
After strong language fails, the “engineering” response to this mess is guardrails15. Guardrails are checks you add to catch the little men before they go off-script. Writing guardrails is kind of like writing laws: they work for simple things, but they’re subject to interpretation a lot of the time16, particularly laws around planning and intent17. So we write more laws to fix the existing laws, adding guardrails to our guardrails.
If you’re super-professional about it, you’ll even bolt on extensive testing infrastructure at this point. The per-agent unit tests are usually simple but the end-to-end “trace evaluations” can end up taking more effort than the implementation.
As builders of AI systems, this isn’t a great spot for us. In the construction industry, using too much silicone filler is a bad sign: it means something is not lining up right. Doubling down by adding more silicone solves the problem for now, but we walk away knowing we’ll be back in a couple of years, when weather patterns change, to fix it again. Guardrails are the silicone filler of AI systems. A great build will have a few of them, in places where things should be water-tight, but massive layered guardrails mean we’re filling a gap that probably shouldn’t exist.
The real problem is that the ‘little men’ metaphor flatters LLMs. A society of highly-trained vibe-lawyers is closer to the truth, working extremely hard to do exactly, and only, what is in the contracts. Other vibe-lawyers interpret the guardrails, and they generally work against the lawyers doing the actual work. The contracts get bigger and more detailed as the system grows, and when something goes wrong, the little men don’t sort it out between themselves; that’s left up to you as the law-maker.18 Maybe the actor pattern wasn’t the right design metaphor for our AI-powered invoicing software.
Calm on Declaration Drive
Backing out of Agent St. for a bit, let’s talk about imperative vs. declarative systems.
So far on Agent St, we’re dealing with imperative design. Imperative means we design each step of a larger process (sometimes called an execution graph) and stitch it together ourselves. The result is whatever this graph produces when run. It’s a bottom-up approach.
But there’s another class of systems called declarative. These take a top-down approach: we describe the result (not the steps) and the system builds a graph containing the steps necessary to produce that result.
Spreadsheets are declarative. We lay out a sheet, add some formulas that reference other cells, and it does the rest. All the grunt-work of figuring out the inter-cell dependencies and which order they should be calculated is handled by the spreadsheet. This is the kind of magic19 that declarative systems have.20
The nice thing about working in the declarative space is you focus on the outcome rather than the method. You still have to define each calculation precisely, but much of the “graph work”21 is handled for you.
Let’s use the spreadsheet metaphor, because it fits so well.
Say you have a bunch of sheets in a spreadsheet, and this spreadsheet also has a function like CALL_LLM($cells, "prompt").22
One of the tables in this spreadsheet is linked up to an email inbox.
An email arrives and populates a row; calculated values in other cells call LLMs to classify it. Is this an invoice? If yes, a row is added to the extracted_invoices sheet.
Then cells on the extracted_invoices sheet populate; using CALL_LLM in the formula they pull out fields like the vendor, amount, and due date.
Maybe then another sheet called accounts_payable picks these up and performs further calculations.
Here, there’s no “agentic framework” deciding what to do next, nor are agents deciding what to do next; it’s all just spreadsheet logic. This is a dataflow processing system, like the machines Dennis was working on the same year “agentic” was born.
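To make the metaphor concrete, here’s a minimal Python sketch of the same pipeline. The names and the `call_llm` stub are illustrative, not from any real system (a real version would call a model where the stub returns canned strings): each “sheet” is a plain list of rows, and each downstream sheet is a pure function of the sheets upstream of it.

```python
# A toy version of the spreadsheet pipeline: each "sheet" is a list of
# rows, and each downstream sheet is a pure function of upstream sheets.

def call_llm(text: str, prompt: str) -> str:
    # Hypothetical stub standing in for the CALL_LLM formula.
    if "classify" in prompt:
        return "invoice" if "invoice" in text.lower() else "other"
    return "ACME Corp, $120, 2026-03-01"

def classify_emails(inbox):
    # The "is this an invoice?" column: one LLM call per row.
    return [{**email, "kind": call_llm(email["body"], "classify this email")}
            for email in inbox]

def extracted_invoices(classified):
    # Rows flow into this sheet only when classified as invoices.
    return [{"id": e["id"],
             "fields": call_llm(e["body"], "extract vendor, amount, due date")}
            for e in classified if e["kind"] == "invoice"]

inbox = [
    {"id": 1, "body": "Invoice #42 attached, payment due soon."},
    {"id": 2, "body": "Lunch on Friday?"},
]
invoices = extracted_invoices(classify_emails(inbox))
print([row["id"] for row in invoices])  # only the invoice email flows through
```

The wiring between the sheets is fixed before any email arrives; the LLM only ever fills in cell values, never decides what runs next.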
Where to build
You can build an AI system on Agent Street or Declaration Drive. I’d suggest building on Declaration, but there are some edge-cases where Agent St makes sense. Let’s go through the trade-offs.
When Discretion is Key
How do you determine if declarative or imperative is the right fit?
Spreadsheet-like systems have limited discretion. They’re just inputs followed by relatively predictable outputs. The decisions on “what a value means” are taken care of at the spreadsheet layer, not within the cell function calling an LLM, and there’s a natural separation between the two. Because of this distinction, you don’t have spreadsheet cells deciding the formulas of other spreadsheet cells; they only decide their values.
If you want cells to come up with formulas for other cells23 then it usually means you’re walking outside the design-spec of a declarative system.
This happens, though: sometimes you need LLMs to make decisions about the graph itself. Decisions require discretion. A full-blown agent chooses what to look at. It chooses what to do. It decides when it is done. For some problems, this is exactly what you want. For most problems, you want to make these decisions ahead of time yourself.
There are solid use-cases where doling out discretion is the right idea:
- Coding Agents
- Very loosely predefined paths. They don’t know what files matter ahead of time.
- Across sessions, tools might change, and users will have different workflows.
- Each step produces information that changes the plan in unknown ways.
They’re often part of a supervision loop, where a programmer oversees the work.
- Research Agents
Unknown paths: they find threads and follow them.
- Requires access to many different tools that give different types of data back.
- Each step produces information that changes the plan in unknown ways.
- Part of a supervision loop where the research is acted on in a more controlled context.
You can see the pattern here. The actor pattern makes sense for these use-cases. High-discretion scenarios are about the only time the actor pattern makes sense, however.
The Cycle Objection
This gets raised sometimes.
Spreadsheets (and database views) don’t like circular references.
We can’t define a formula like A1 = SUM(B1, A1).
Unlike actor systems, the spreadsheet model is acyclic by design.24
The objection is “business processes contain cycles”. Consider an invoice that gets rejected, then resubmitted, then reclassified. A “cycle-free” design doesn’t seem to support this.
But there’s a simple answer to this: just “unfold” your data model.25
In the invoice case, when it gets rejected, that is information you want to capture, so it becomes a row in the invoice_decisions spreadsheet.
A resubmitted invoice is also different from the original invoice: even if it has the same invoice number, it’s a different event, and one you might want to handle differently, so it gets a new row.
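A minimal sketch of this unfolding in Python (the sheet and helper names are illustrative): every decision is appended as a new row, nothing is mutated, and “current status” is just a stateless view computed over the log. The cycle in the business process becomes a straight line in the data.

```python
# Unfolding the "rejected then resubmitted" cycle into append-only rows.
invoice_decisions = []  # the paper-trail sheet: one row per event

def record(invoice_no, decision):
    # Append-only: a resubmission is a new event, never an edit in place.
    seq = sum(1 for r in invoice_decisions if r["invoice_no"] == invoice_no)
    invoice_decisions.append(
        {"invoice_no": invoice_no, "seq": seq, "decision": decision})

def current_status(invoice_no):
    # A "view": recomputed from the rows each time, latest event wins.
    rows = [r for r in invoice_decisions if r["invoice_no"] == invoice_no]
    return max(rows, key=lambda r: r["seq"])["decision"] if rows else None

record("INV-42", "rejected")     # first pass through the process
record("INV-42", "resubmitted")  # same invoice number, new event, new row
record("INV-42", "approved")
print(current_status("INV-42"))  # prints "approved"
```

No formula ever references its own output; each view reads rows that already exist, so the graph stays acyclic.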
What a Paper-Trail Gives You
If you unfold things this way, you don’t have cycles, because you’re always computing from new rows. Plus, it leaves a paper-trail. These paper-trails are genuinely useful. They help you with auditing and debugging, and they give nice crash-restart behavior because they’re “durable” and preserve “granular state”.26 This is how most databases work under the hood.27
The other good thing about paper-trails with their own sheets is they give you easy reporting out of the box.
Want to make a dashboard that shows status on invoice approvals and rejections? Great, just read it straight off the invoice_decisions sheet.
In an agentic framework, this is a secondary consideration, and often means doing complicated “trace analysis” or having bolt-on “instrumentation” that’s hard to configure after-the-fact.
Approval as Data
With the spreadsheet approach, approval is just another table, and the approved_invoices sheet just merges the extracted_invoices and invoice_approvals sheets. A missing row in the invoice_approvals sheet means no approved_invoices row.
This shifts the approval decisions away from prompt engineering toward the data model (how tables are designed and linked). Data modelling is something that has been well-studied, and there are established patterns with mathematical guarantees for important things like approval workflows. A good data model can also be widened progressively as trust accumulates, so that auto-approval can be gradually introduced.28
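As a sketch, assuming the sheet layouts described above (the field names are illustrative), the approved_invoices sheet is an ordinary inner join. No prompt or LLM is involved in the gate itself: a missing approval row simply means the invoice never appears downstream.

```python
# Approval as data: approved_invoices is the inner join of two sheets.
extracted_invoices = [
    {"invoice_no": "INV-1", "vendor": "ACME", "amount": 120.0},
    {"invoice_no": "INV-2", "vendor": "Globex", "amount": 95.5},
]
invoice_approvals = [
    {"invoice_no": "INV-1", "approved_by": "dana"},
    # INV-2 has no approval row yet, so it will not flow downstream.
]

def approved_invoices(extracted, approvals):
    # Inner join on invoice_no: no approval row, no output row.
    by_no = {a["invoice_no"]: a for a in approvals}
    return [{**inv, **by_no[inv["invoice_no"]]}
            for inv in extracted if inv["invoice_no"] in by_no]

print([r["invoice_no"]
       for r in approved_invoices(extracted_invoices, invoice_approvals)])
# prints ['INV-1']
```

Widening the gate later, say auto-approving invoices under a threshold, means adding rows to invoice_approvals from another view, not rewriting a prompt.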
What to build with
The Big-Dog Players
The existing landscape is arriving at “the spreadsheet approach”, albeit unevenly. It’s mostly the Enterprise-data big-dogs that have this figured out.
Databricks has ai_query() inside materialised view definitions.
Snowflake Cortex has AI_CLASSIFY() and AI_COMPLETE() as SQL functions, applied to columns.
There are other academic efforts29 that take it even further, but as yet there’s nothing really open-source or SMB-friendly that does it well.
How to Unfold your Agents
Before reaching for a framework, ask whether the decisions in your system actually need to be made at runtime. If an LLM needs to decide what to look at, what to do, and when it’s done, give it discretion. If it’s just classifying invoices and extracting fields, then it doesn’t need discretion, it needs to be a node in a graph that you design.
If you’re not on enterprise data infrastructure, the most practical option right now is to use an agentic framework in its most constrained mode. LangGraph with fixed edges and no conditional routing is a deterministic DAG, the LLM discretion is opt-in. Other frameworks are learning the same lessons from production experience: CrewAI and OpenClaw both retrofitted declarative pipeline layers when users kept running into the predictability wall.
Most of the decisions we’ve been handing to runtime agents don’t need to be made at runtime. If we unfold those decisions into our graph in a dataflow way, then we end up with a system that’s easier to reason about, easier to debug, and the LLMs still do exactly the work that they’re good at, just without being asked to run the show.
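The “unfolded” shape can be made concrete without any framework at all. This is a sketch, not any particular framework’s API: a fixed DAG of named nodes evaluated in topological order using Python’s standard-library graphlib, where an LLM call (stubbed here as a plain lambda) is just another node body, never the router.

```python
from graphlib import TopologicalSorter

def run_dag(nodes, deps, seed):
    """Evaluate a fixed dataflow graph.

    nodes: name -> function of the dict of values computed so far
    deps:  name -> set of upstream names (the graph, drawn ahead of time)
    seed:  input values for source nodes
    """
    values = dict(seed)
    # static_order() yields names with all predecessors first.
    for name in TopologicalSorter(deps).static_order():
        if name in nodes:
            values[name] = nodes[name](values)
    return values

# Two "LLM" nodes, stubbed: classification, then conditional extraction.
nodes = {
    "classified": lambda v: "invoice" if "invoice" in v["email"].lower() else "other",
    "extracted":  lambda v: {"vendor": "ACME"} if v["classified"] == "invoice" else None,
}
deps = {"classified": {"email"}, "extracted": {"classified"}}

out = run_dag(nodes, deps, {"email": "Invoice #42 attached"})
print(out["classified"], out["extracted"])
```

Because the edges are data, you can inspect the whole execution plan before anything runs, which is exactly the predictability that layered guardrails try, and fail, to bolt on after the fact.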
Footnotes
-
Both groups occupied 545 Technology Square (MIT building NE43). Hewitt’s AI Lab held floors 7–9; Dennis’s Project MAC held floors 1–6. The ninth-floor machine room housed the building’s IBM mainframes. See MIT News, 17 March 2004: “For the next quarter-century, Tech Square remained their joint home, with LCS having the first six floors and AI the top three.” ↩
-
Hewitt, C., Bishop, P., and Steiger, R. “A Universal Modular ACTOR Formalism for Artificial Intelligence.” IJCAI, 1973, p. 236. ↩
-
Armstrong, J. “Making reliable distributed systems in the presence of software errors.” PhD thesis, KTH, 2003. Armstrong designed Erlang’s process model to handle telephone exchanges — independent concurrent entities, each managing a single call, failing and restarting without affecting others. He later noted the correspondence with the actor model was recognised retrospectively — see Armstrong, J. “A History of Erlang.” HOPL-III, 2007. The little men were already hard at work before the actor model formalism rose in prominence. ↩
-
SQL, make, and Terraform are also declarative and house similar concepts. ↩
-
Conditional constructs such as SQL’s WHERE, CASE, and filtered pipelines do let values influence which branch executes. The qualifier “mostly” is because the bulk of the logic concerns query planning that is independent of the data’s meaning. In actor systems, the two are mixed by design. ↩
-
Frameworks handle this in one of two ways: fixed sequencing, or LLM-based routing where a model picks who runs next. LangGraph lets you choose either via its edge functions. CrewAI’s hierarchical mode runs a “manager” LLM call after each task to assign the next worker. AutoGen’s SelectorGroupChat runs an LLM call after every single message to select the next speaker. In the wild, some teams bypass frameworks entirely and route between OpenClaw agents using Telegram broadcast groups, with the chat log as shared state — the actor model with a messaging app as the runtime. The more “agentic” the setup, the more routing decisions are handed to an LLM, whether or not the workflow needs it. Tool use follows the same pattern: in every major framework, the LLM decides which tool to call and when; the framework just executes the choice. ↩
-
In every major framework, the set of possible agents is fixed at design time. You always list the participants upfront. When an LLM handles routing, it only picks from a list you already wrote. You drew the graph; the framework just made it harder to see. ↩
-
All frameworks define agent responsibilities through system prompts. CrewAI gives each agent a role, goal, and backstory, all concatenated into a system prompt. LangGraph nodes are plain Python functions; identity comes from whatever system prompt you pass into the agent node. AutoGen’s AssistantAgent takes a system_message. There is no structural “responsibility” primitive in any of them; it is prompt engineering all the way down. ↩
-
This varies widely. LangGraph uses a shared typed state object that all nodes read from and write to, with reducer functions to merge updates. AutoGen’s group chats broadcast the full conversation history to all participants by default. CrewAI uses a combination of RAG-recalled memory and explicit context threading between tasks, if you wired it up that way. None of them give you the clean, declared input-output wiring of a dataflow model. ↩
-
In AutoGen, every agent sees everything every other agent said, with no capability isolation by default. In LangGraph, full shared state is the default; restricting an agent’s view requires building a subgraph with a different schema. In CrewAI, agents get whatever the similarity search surfaces from memory. “Need to know” is not a first-class concept in any of these frameworks. ↩
-
LangGraph and AutoGen both preserve the full message history by default, though token limits can force truncation. CrewAI is most fragile here: downstream agents see the original intent only if their task description explicitly threads it through, or if memory retrieval happens to surface it. Intent drift across a long chain is a documented failure mode. ↩
-
LangGraph treats this as first-class: interrupt() pauses execution and saves the full graph state to a checkpointer, so execution can resume later from exactly where it stopped. CrewAI’s human_input=True requires the process to stay alive, with no checkpoint or resume. AutoGen v0.4 uses a UserProxyAgent that blocks execution and hands control to the user, but their own docs warn it puts the team “in an unstable state that cannot be saved or resumed.” The more actor-model the framework, the harder genuine oversight becomes. ↩
-
The frameworks closest to dataflow give you high structural predictability: you can inspect the graph before it runs. The more LLM-driven the routing, the less predictable the execution path. The pressure shows up in how these frameworks evolve: CrewAI added Flows (a declarative event graph) on top of Crews because users needed predictability; OpenClaw shipped Lobster Shell, a YAML-based pipeline engine, for the same reason. LangGraph, the most dataflow-native of the major frameworks, is explicit that its Time Travel feature makes agent workflows reliable, not LLMs predictable. Worth noting: with fixed edges only and no conditional routing, LangGraph operates as a pure deterministic DAG — LLM involvement is opt-in, not baked in. The distinction matters. ↩
-
A LLM, if you will… ↩
-
Borrowed and broken metaphors are a key feature of natural language, and hence also in NLP terminology. ↩
-
Natural language, and law specifically, suffers from leaky abstractions. In law, you end up with things like the Acts Interpretation Act 1901 [Cth.], which is very meta. I haven’t seen a similar thing for LLMs yet, but it should probably exist. Natural language is a little too unnatural for purposeful work. ↩
-
Not all guardrails are written in natural language but most of them are. Behavioral guardrails are usually an LLM call. ↩
-
Not so fun ruling a society of little men now right? Heavy is the head that wears the crown. ↩
-
Clarke, A.C. “Profiles of the Future”, 1973 (revised ed.): “Any sufficiently advanced technology is indistinguishable from magic.” The law first appeared in a 1968 letter to Science; the 1973 revision of the book is where it was formally incorporated. The spreadsheet qualifies. ↩
-
In 1974, Gilles Kahn wrote a gem called “The Semantics of a Simple Language for Parallel Programming”, wherein he proved that the output of a dataflow network is independent of the scheduling order of its nodes. This decoupling helped make things like databases and spreadsheets possible. ↩
-
Sometimes, we need a slightly different result that implies a massive change in the execution graph. If we’re working in the imperative mode, this change is on us. In a declarative system this change falls to the system. If you’re interested in how much small changes in result can cascade to huge changes in optimal graphs, check out any database query planner and what it means for specific predicate value changes given diverse table statistics. ↩
-
Or, if you’re a fellow data nerd, consider a database with tables and materialised views, where call_llm is a UDF. ↩
-
This kind of purposeful mixing of abstraction can sometimes be powerful, it’s sometimes called “meta-programming”, and it’s super-clever but very easy to abuse. Like most sharp tools, they can be debugging nightmares. Put another way: samurai swords can be used to cut up fruit (extremely fun), but commercial kitchens don’t routinely do this. Using a Wakizashi to butter toast is never fun, even if that’s the “agentic” way to do things. ↩
-
The ‘A’ in DAG. ↩
-
Programming nerd note: Haskell’s answer to this is
fibs = 0 : 1 : zipWith (+) fibs (tail fibs). This looks circular,fibsdefined in terms offibs. But it’s not a loop. Each element is computed from elements that already exist, the computation moves strictly forward. ↩ -
The data is a stream. Data Vault understood this before it was fashionable: always add new rows, never mutate. Append-only event streams processed by stateless views give you lineage and auditability for free. ↩
-
Append-only log structures, or WALs, provide a nice basis for ACID semantics because they give us a common representation, with ordering, that can translate up to pretty much every kind of data structure. The same approaches can be selectively used at the schema-level for providing ACID-like properties “on top” of your database. Oftentimes, transaction primitives need to exist quite a way “above” the data-domain because they have their own logic and exist in the business process domain. Threading these layers together is, in part, the art of data modelling. ↩
-
This is one of those small-change-in-formula leads to big-change-in-graph scenarios that we want to avoid coding in the imperative fashion. ↩
-
LOTUS (out of Stanford and Berkeley) formalises the idea as semantic operators extending Codd’s relational algebra, with LLM-powered versions of filter, join, rank, aggregate and project. These traditions arrived at the same insight independently. ↩
@misc{hollows2026unfoldyo,
author = {Hollows, Peter},
title = {{Unfold your Agents}},
year = {2026},
month = feb,
url = {https://dojo7.com/2026/02/28/unfold-your-agents/}
}