Introduction
When working with large language models (LLMs), the term “agent” often causes confusion. The AI community uses this word loosely: it can refer to anything from a model with tool-calling capabilities to a fully autonomous system pursuing complex goals.
LLMProc sidesteps this confusion entirely by adopting a familiar analogy: the Unix process model.
Why Processes?
At a high level, LLM execution is naturally analogous to a computing process:
- Both run sequentially, executing the instructions they are given (for an LLM, the prompt).
- Both maintain internal state (context, history).
- Both can perform input/output operations and invoke external tools or processes.
- Both can spawn child processes, fork to explore multiple execution paths, or even communicate with other running processes.
By explicitly embracing this analogy, LLMProc unlocks powerful, intuitive abstractions that mirror established computing concepts. It provides a framework that makes scaling and managing LLM execution straightforward.
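To ground the analogy before going further, here is a minimal sketch of what a process-style workflow could look like. The setup step (loading a program definition and starting it as a process) uses assumed names purely for illustration and may not match the actual LLMProc API; run() and get_last_message() are the same calls shown later in this post.

```python
import asyncio
from llmproc import LLMProgram  # assumed import path, shown for illustration only

async def main():
    # Hypothetical setup: load a "program" (system prompt, model, tools)
    # and start it as a running process, mirroring exec() of a binary.
    program = LLMProgram.from_toml("assistant.toml")  # assumed helper and config name
    process = await program.start()

    # run() executes until a natural stopping point; get_last_message()
    # retrieves the final response (both appear later in this post).
    await process.run("Summarize the open TODOs in this repository")
    print(process.get_last_message())

asyncio.run(main())
```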
Process Control: Spawn, Fork, and Goto
spawn() is inspired by dispatch_agent() from Claude Code. It allows linking specialized programs, each potentially backed by a different underlying model. LLMProc further introduces other advanced operations, fork() and goto(), to manage complex workflows elegantly:
- fork(): Unlike spawn(), fork() gives each child a complete copy of the parent's conversation history, making it naturally suited to parallel task execution (see the conceptual sketch after this list). It's based on an earlier exploration here: github.com/cccntu/forking-an-agent
- goto(): While fork() is powerful, it can cause race conditions when child processes have write access. goto() elegantly avoids this by keeping a single process but allowing the LLM to reset (“time travel”) to an earlier point in the conversation. The model can self-summarize and rewind, staying in its most productive zone: enough accumulated context to work with, and ample room left in the context window.
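Here is a small, self-contained sketch of the fork() semantics, as a conceptual illustration only (the Process class and run_task are hypothetical names, not LLMProc's implementation): each child starts from a complete copy of the parent's history, and children can then run their subtasks in parallel.

```python
import asyncio
import copy

# Conceptual illustration of fork() semantics; not the LLMProc API.
class Process:
    def __init__(self, history):
        self.history = history  # full conversation history (list of messages)

    def fork(self):
        # A child inherits a complete copy of the parent's history.
        return Process(copy.deepcopy(self.history))

    async def run_task(self, task):
        self.history.append({"role": "user", "content": task})
        # ... a real implementation would call the model here ...
        return f"result of {task!r}"

async def main():
    parent = Process(history=[{"role": "system", "content": "You are a coding agent."}])
    tasks = ["refactor module A", "refactor module B", "update the docs"]
    # Fork once per independent subtask and run the children in parallel.
    children = [parent.fork() for _ in tasks]
    results = await asyncio.gather(*(c.run_task(t) for c, t in zip(children, tasks)))
    print(results)

asyncio.run(main())
```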
Imagine a scenario involving complex multi-file refactoring. After numerous intermediate steps, the context window may become overly cluttered. The goto() operation lets the LLM compress accumulated context into a concise summary, retaining essential knowledge while discarding irrelevant detail. Or, during debugging, Claude can stash changes, self-rewind, and try a different approach.
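As a conceptual sketch of that rewind (again with hypothetical names rather than the actual API), goto() can be thought of as truncating the conversation back to an earlier point and inserting the model's own summary in place of the discarded turns:

```python
# Conceptual illustration of goto() / "time travel"; not the LLMProc API.
def goto(history, position, summary):
    """Rewind the conversation to `position`, replacing everything after it
    with a single model-written summary message."""
    kept = history[:position]
    kept.append({"role": "assistant", "content": f"Summary of prior work: {summary}"})
    return kept

history = [
    {"role": "system", "content": "You are refactoring a large codebase."},
    {"role": "user", "content": "Rename Config to Settings across all modules."},
    # ... many intermediate tool calls and diffs that now clutter the context ...
    {"role": "assistant", "content": "Edited 14 files; tests still failing in two places."},
]

# The model compresses the noisy middle of the conversation into one summary
# and continues with a clean, compact context.
history = goto(history, position=2,
               summary="Renamed Config -> Settings in 14 files; two test failures remain.")
```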
Unix-Inspired File Descriptor System
A good analogy doesn't just clarify; it inspires new capabilities. In Unix, processes access data through file descriptors rather than managing the data directly. Similarly, LLMProc introduces a file descriptor system for handling large tool outputs and managing context:
- Automatic Paging: Tools don’t overload the LLM’s limited context window. Instead, outputs are paged automatically.
- Unified Interface: Tools don't need to implement custom paging. Whether the output comes from reading a file, a bash command, or a web search, automatic paging keeps the context window safe from unexpectedly large tool results, and no data is lost to truncation.
- File Descriptor Inheritance: As in Unix, file descriptors can be passed to child processes, enabling use cases such as having a child process debug a long error log.
This approach simplifies tool design, protects the context window from exhaustion, and facilitates delegation.
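As a rough illustration of the idea (the function and field names here are assumptions, not LLMProc's interface), a file-descriptor layer can store oversized tool output under an fd and hand the model only a compact reference plus the first page:

```python
# Conceptual sketch of FD-style paging for tool results; not the LLMProc API.
PAGE_SIZE = 4_000  # characters per page (illustrative threshold)

FD_TABLE = {}  # fd id -> full tool output

def wrap_tool_output(output: str) -> dict:
    """Return the output directly if small; otherwise store it under an fd
    and return a paged reference the model can read page by page."""
    if len(output) <= PAGE_SIZE:
        return {"content": output}
    fd = f"fd:{len(FD_TABLE) + 1}"
    FD_TABLE[fd] = output
    total_pages = -(-len(output) // PAGE_SIZE)  # ceiling division
    return {"fd": fd, "pages": total_pages, "preview": output[:PAGE_SIZE]}

def read_fd(fd: str, page: int) -> str:
    """Read one page of a stored result; the fd could also be handed to a child process."""
    data = FD_TABLE[fd]
    start = (page - 1) * PAGE_SIZE
    return data[start:start + PAGE_SIZE]

# Example: a huge error log becomes a compact fd reference instead of
# flooding the context window.
ref = wrap_tool_output("ERROR: stack frame ...\n" * 10_000)
print(ref["fd"], ref["pages"])
print(read_fd(ref["fd"], page=2)[:80])
```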
Execution with Debugger-like Semantics
The run() method in LLMProc is inspired by the semantics of debugger commands. Instead of returning a response object, it executes until a natural stopping point, like stepping through a program with a debugger.
The RunResult object returned by run() provides execution metadata such as API calls, tool calls, and cost information, and a callback system provides real-time feedback during execution. For scenarios where a direct response is needed, accessing the final message remains simple:
```python
callbacks = {
    "on_tool_start": lambda tool_name, args: print(f"Using tool: {tool_name}"),
    "on_tool_end": lambda tool_name, result: print(f"Tool {tool_name} completed"),
    "on_response": lambda content: print(content),
}
run_result = await process.run("Summarize the changes", max_iterations=10, callbacks=callbacks)
message = process.get_last_message()
```
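If you need the execution metadata, it lives on the returned RunResult; the attribute names below are guesses for illustration, so check the LLMProc docs for the actual fields:

```python
# Attribute names are assumptions for illustration, not confirmed RunResult fields.
print(f"API calls:  {run_result.api_calls}")
print(f"Tool calls: {run_result.tool_calls}")
print(f"Cost (USD): {run_result.usd_cost}")
```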
Looking Forward
By thinking of LLM execution in terms of familiar computing paradigms, LLMProc offers both clarity and extensibility. Future development will introduce additional Unix-inspired abstractions and kernel-level tools, and apply them to more real-world applications.
This isn't merely a metaphor: it's a robust, practical framework designed around the needs and experience of working with Claude Code.
Check out the LLMProc repository for more details and implementation.