Agent-Environment Middleware – Jonathan Chang’s Blog

Every LLM is an Agent

In RL, an agent is an entity that perceives its environment, selects actions, executes them, and adapts based on feedback.

If we view the user as part of the environment, and the response as an action, then any LLM is already an Agent, even one without tools!

The world is the environment

As LLMs become more capable, we see a trend where the environment becomes broader:

For a chat-LLM, the environment is the conversation with the user
For a tool-using LLM in a container, the environment is the container (like Codex)
For a computer-using LLM, the environment is the whole computer (like Claude Code)

With enough tools, the world can be the environment. You just need to build the middleware.

Plugins: Building the middleware

But what does writing middleware look like in practice? Here is a pattern that I’ve found to be very useful for writing agents: Plugins.

Plugins are reusable modules that bridge between the model and the environment. Let’s walk through each part of a plugin:

Tools and system prompt

Tools are the primary way models interact with the world, and the system prompt is the first input from the environment.

In a typical agent framework, you provide the model with each tool separately, and you often also need to add instructions to the system prompt about how to use these tools together.

Agent(model, tools=[a.tool1, a.tool2, b.tool1], system_prompt="use a.tool1 and a.tool2 together ...")

By bundling related tools and (optionally) system instructions, you can make the code simpler:

Agent(model, plugins=[a, b])
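
As a rough illustration of what bundling could look like, here is a minimal sketch. The `Plugin` dataclass, the `collect` helper, and the placeholder file tools are all hypothetical names made up for this example, not part of any real framework: the idea is only that an agent can flatten each plugin's tools and join the system prompt fragments.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Plugin:
    """Hypothetical bundle of related tools plus the instructions for using them."""
    name: str
    tools: list[Callable] = field(default_factory=list)
    system_prompt: str = ""


def collect(plugins: list[Plugin]) -> tuple[dict[str, Callable], str]:
    """Flatten plugins into one tool table and one system prompt."""
    tools: dict[str, Callable] = {}
    prompts: list[str] = []
    for plugin in plugins:
        for tool in plugin.tools:
            # Namespace tool names by plugin so two plugins can define the same name.
            tools[f"{plugin.name}.{tool.__name__}"] = tool
        if plugin.system_prompt:
            prompts.append(plugin.system_prompt)
    return tools, "\n\n".join(prompts)


def read_file(path: str) -> str:
    """Placeholder tool; a real one would read from disk."""
    return f"<contents of {path}>"


def write_file(path: str, text: str) -> str:
    """Placeholder tool; a real one would write to disk."""
    return f"wrote {len(text)} chars to {path}"


files = Plugin(
    name="files",
    tools=[read_file, write_file],
    system_prompt="Call files.read_file before files.write_file.",
)
clock = Plugin(name="clock")  # a plugin may contribute only hooks, no tools

tools, system_prompt = collect([files, clock])
```

With this shape, the usage instructions live next to the tools they describe, so removing a plugin removes its instructions as well.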

The Agent-Environment Middleware (AEM)

Agent-Environment Middleware refers to hooks that can perceive and potentially modify the interaction between the agent (model) and the environment. These hooks can be categorized into input hooks and output hooks.
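
One way to picture the four hook points is a base class whose hooks all default to a pass-through. This is a sketch under assumptions: the hook names match the ones used in this post, but the base class, the `ShoutPlugin` example, and the convention that `None` from the tool call hook means "leave the call unchanged" are illustrative guesses.

```python
import asyncio
from typing import Any, Optional


class Plugin:
    """Hypothetical base class: every hook defaults to a pass-through."""

    # Input hooks: environment -> agent
    async def hook_modify_user_message(self, message: str) -> str:
        return message

    async def hook_modify_tool_result(self, tool_result: dict[str, Any]) -> dict[str, Any]:
        return tool_result

    # Output hooks: agent -> environment
    async def hook_modify_model_response(self, content: str) -> str:
        return content

    async def hook_modify_tool_call(
        self, tool_name: str, arguments: dict[str, Any]
    ) -> Optional[dict[str, Any]]:
        # Assumption: None means "leave the call unchanged".
        return None


class ShoutPlugin(Plugin):
    """Overrides a single output hook; the other three stay pass-through."""

    async def hook_modify_model_response(self, content: str) -> str:
        return content.upper()


print(asyncio.run(ShoutPlugin().hook_modify_model_response("done")))  # prints DONE
```

A plugin then only overrides the hooks it cares about, which keeps small plugins small.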

Input Hooks

Input hooks consist of user message hooks and tool result hooks. They let you monitor and modify the environment inputs perceived by the agent.

Examples:

Timestamp plugin: Provides the agent with real-time environment information.

class TimestampPlugin:
    ...
    async def hook_modify_user_message(self, message):
        return f"{message}\n current time: {self.timestamp()}"

    async def hook_modify_tool_result(self, tool_result):
        tool_result["output"] = (
            f"{tool_result['output']}\n current time: {self.timestamp()}"
        )
        return tool_result

Message queue plugin: When an agent works in a collaborative environment, you can use these hooks to naturally insert messages into the agent’s context while the agent is running in a tool use loop.

class MessageQueuePlugin:
    ...
    async def hook_modify_tool_result(self, tool_result):
        if self.has_messages():
            new_message = await self.read_message()
            tool_result["output"] += f"\n\n<system message>New messages from user: {new_message}</system message>"
        return tool_result
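
For a version that runs end to end, here is one possible way to fill in the elided parts. It is a sketch, not the actual implementation: the `push` method and the in-memory `deque` are assumptions, and unlike the snippet above it drains every pending message rather than one per tool result.

```python
import asyncio
from collections import deque


class MessageQueuePlugin:
    """Sketch: buffers user messages that arrive while the agent is mid-loop."""

    def __init__(self):
        self._pending = deque()

    def push(self, message: str) -> None:
        """Hypothetical entry point, called by the UI or another agent at any time."""
        self._pending.append(message)

    def has_messages(self) -> bool:
        return bool(self._pending)

    async def read_message(self) -> str:
        return self._pending.popleft()

    async def hook_modify_tool_result(self, tool_result: dict) -> dict:
        # Drain everything that arrived since the last tool call.
        while self.has_messages():
            new_message = await self.read_message()
            tool_result["output"] += (
                f"\n\n<system message>New messages from user: {new_message}</system message>"
            )
        return tool_result


async def demo() -> dict:
    plugin = MessageQueuePlugin()
    plugin.push("stop after this file")
    return await plugin.hook_modify_tool_result({"output": "42 tests passed"})


result = asyncio.run(demo())
```

Because the message rides along with a tool result, the agent sees it at the next natural step of its loop instead of being interrupted mid-generation.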

Output Hooks

Similarly, output hooks handle the model’s response and tool calls, and can modify how they are presented to the environment.

Examples:

UI plugin: A UI plugin can define how to present the model’s output to the user.

class UIPlugin:
    ...
    async def hook_modify_model_response(self, content: str) -> str:
        content = content.replace("You're absolutely right", "")
        await self.display_message(content)
        # Return the content so later plugins in the chain still receive it.
        return content

Approval plugin: Asks the user to approve suspicious tool calls before they are executed.

class ApprovalPlugin:
    ...
    async def hook_modify_tool_call(self, tool_name, arguments):
        if self.yolo_mode:
            return None
        if looks_suspicious(tool_name, arguments):
            await user_approval()
            ...

Hook chaining

The hook system allows you to combine different plugins and chain the hooks.

class WebPlugin:
    ...
    async def hook_modify_model_response(self, content: str) -> str:
        """Replaces link_id:{id} references with actual URLs."""
        return self._expand_link_references(content)

Here the web plugin would format the model’s response to convert link_id to actual links, and then pass it to the UI plugin, which will display the response in the UI.

agent = Agent(model, [web_plugin, ui_plugin, timestamp_plugin])
while True:
    message = await ui_plugin.read_message()
    await agent.run(message)
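
Inside the agent, chaining can be as simple as folding a value through each plugin in list order. The `run_chain` helper below is a hypothetical sketch of that idea, with stand-in plugins; treating a `None` return as "unchanged" is an assumption of this example.

```python
import asyncio


async def run_chain(plugins, hook_name, value):
    """Pass `value` through each plugin's hook, in list order."""
    for plugin in plugins:
        hook = getattr(plugin, hook_name, None)
        if hook is None:
            continue  # this plugin does not implement the hook
        result = await hook(value)
        # Assumption: a hook returning None leaves the value unchanged.
        if result is not None:
            value = result
    return value


class WebPlugin:
    def __init__(self, links):
        self.links = links  # link id -> URL

    async def hook_modify_model_response(self, content):
        for link_id, url in self.links.items():
            content = content.replace(f"link_id:{link_id}", url)
        return content


class UIPlugin:
    def __init__(self):
        self.displayed = []

    async def hook_modify_model_response(self, content):
        self.displayed.append(content)  # stand-in for rendering to a screen
        return content


class TimestampPlugin:
    """Implements only an input hook, so the response chain skips it."""

    async def hook_modify_user_message(self, message):
        return f"{message}\n current time: 12:00"


web, ui = WebPlugin({"1": "https://example.com"}), UIPlugin()
final = asyncio.run(
    run_chain([web, ui, TimestampPlugin()], "hook_modify_model_response", "See link_id:1")
)
```

Order matters here: placing the UI plugin before the web plugin would display the raw `link_id:1` reference instead of the expanded URL.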

Use cases

Now that we’ve walked through the plugin architecture, let’s implement common patterns with it.

Implementing guardrails

OpenAI’s Agents SDK provides guardrails for user inputs and outputs, but it does not protect against potentially harmful tool results, and the implementation is hard-wired into the agent loop.

With the plugin architecture, you can implement them easily without adding any logic to the main agent loop.

You can also implement concurrent guardrails with plugins: Run guardrails in the background when input hooks are triggered, and later check for the status when output hooks are triggered, just before the agent is able to interact with the environment.

class GuardrailPlugin:
    ...
    async def hook_modify_tool_result(self, tool_result):
        self.guardrail.start_async(tool_result)
        return tool_result

    async def hook_modify_user_message(self, message):
        self.guardrail.start_async(message)
        return message

    async def hook_modify_model_response(self, content):
        if await self.guardrail.any_input_failed():
            self.abort()
        if await self.guardrail.run(content):
            self.abort()
        return content

    async def hook_modify_tool_call(self, tool_name, arguments):
        ...
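
To make the concurrency concrete, here is a self-contained sketch using `asyncio` tasks. The `check` function is a toy stand-in for a real guardrail model, and `GuardrailViolation` is a hypothetical exception used in place of `self.abort()`; neither comes from an existing library.

```python
import asyncio


class GuardrailViolation(Exception):
    pass


async def check(text: str) -> bool:
    """Hypothetical guardrail: True means the text failed the check."""
    await asyncio.sleep(0.01)  # stands in for a slow classifier call
    return "ignore previous instructions" in text.lower()


class GuardrailPlugin:
    def __init__(self):
        self._pending = []  # background asyncio.Task objects

    def _start(self, text: str) -> None:
        # Schedule the check without awaiting it, so the model call can proceed.
        self._pending.append(asyncio.create_task(check(text)))

    async def hook_modify_user_message(self, message: str) -> str:
        self._start(message)
        return message

    async def hook_modify_tool_result(self, tool_result: dict) -> dict:
        self._start(str(tool_result["output"]))
        return tool_result

    async def hook_modify_model_response(self, content: str) -> str:
        # Collect every background check before anything leaves the agent.
        pending, self._pending = self._pending, []
        results = await asyncio.gather(*pending)
        if any(results) or await check(content):
            raise GuardrailViolation("blocked before reaching the environment")
        return content


async def demo(tool_output: str) -> str:
    plugin = GuardrailPlugin()
    await plugin.hook_modify_user_message("summarize this page")
    await plugin.hook_modify_tool_result({"output": tool_output})
    try:
        return await plugin.hook_modify_model_response("Here is the summary.")
    except GuardrailViolation:
        return "<aborted>"
```

The input checks overlap with model inference, so the only added latency is whatever remains of the slowest check when the output hook fires.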

Implementing agent handoff

You can add an agent handoff feature with a simple plugin and a few lines of change to the agent loop.

handoff = HandOffPlugin(agents)
agent = Agent(model, other_plugins + [handoff])

while True:
    await agent.run()
    if handoff.triggered:
        agent = handoff.handoff(agent)
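
The plugin itself could be quite small. In this sketch, the `handoff_to` tool, the `target` attribute, and the string stand-ins for agents are assumptions made for illustration; only `triggered` and `handoff` appear in the loop above.

```python
class HandOffPlugin:
    """Sketch: exposes a tool the model can call to request another agent."""

    def __init__(self, agents: dict):
        self.agents = agents  # name -> agent object
        self.target = None    # set when the model asks for a handoff

    @property
    def triggered(self) -> bool:
        return self.target is not None

    def handoff_to(self, name: str) -> str:
        """Hypothetical tool exposed to the model."""
        if name not in self.agents:
            return f"unknown agent: {name}"
        self.target = name
        return f"handing off to {name}"

    def handoff(self, current):
        """Return the requested agent and reset the trigger."""
        next_agent = self.agents.get(self.target, current)
        self.target = None
        return next_agent


agents = {"chat": "chat-agent", "research": "research-agent"}  # stand-ins for Agent objects
handoff = HandOffPlugin(agents)
agent = agents["chat"]

handoff.handoff_to("research")  # in practice the model issues this tool call
if handoff.triggered:
    agent = handoff.handoff(agent)
```

The plugin only records the request; the outer loop decides when to swap agents, which is why a few lines of the loop still need to change.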

Implementing Subagents & Multi-agents

For the main agent, you need a plugin that provides tools to spawn subagents, and potentially an input hook that notifies it of subagent progress if you want the subagents to report asynchronously.

For subagents, you’ll need a plugin with input/output hooks for agent-to-agent communication. Subagents can have other plugins, too.

For example:

A timestamp plugin can help subagents keep track of time, so that one slow subagent does not block the others when several run in parallel.
A message queue with input/output hooks enables async messaging, partial results, and even passing tool results directly to the main agent without repeating the tokens.
Agents with plugins that share the same environment can coordinate their actions.
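
The main-agent side of this can be sketched with background `asyncio` tasks. Everything here is hypothetical: `SubagentPlugin`, its `spawn` tool, and `fake_subagent` are illustrative names, and a real subagent would run a full agent loop rather than sleep.

```python
import asyncio
import itertools


class SubagentPlugin:
    """Sketch: a spawn tool for the main agent plus a hook that reports finished work."""

    def __init__(self, run_subagent):
        self.run_subagent = run_subagent  # async callable: task description -> result
        self.tasks = {}                   # subagent id -> asyncio.Task
        self._ids = itertools.count(1)

    async def spawn(self, task: str) -> str:
        """Tool exposed to the main agent; returns immediately."""
        agent_id = f"sub-{next(self._ids)}"
        self.tasks[agent_id] = asyncio.create_task(self.run_subagent(task))
        return f"started {agent_id}"

    async def hook_modify_tool_result(self, tool_result: dict) -> dict:
        # Attach results from subagents that finished since the last tool call.
        for agent_id, task in list(self.tasks.items()):
            if task.done():
                tool_result["output"] += f"\n[{agent_id} finished: {task.result()}]"
                del self.tasks[agent_id]
        return tool_result


async def fake_subagent(task: str) -> str:
    await asyncio.sleep(0.01)  # stands in for a full agent loop
    return f"done with {task!r}"


async def demo() -> str:
    plugin = SubagentPlugin(fake_subagent)
    await plugin.spawn("search docs")
    await asyncio.sleep(0.05)  # the main agent keeps working meanwhile
    result = await plugin.hook_modify_tool_result({"output": "ls: a.txt"})
    return result["output"]


output = asyncio.run(demo())
```

Note that `task.result()` re-raises if the subagent failed, so a fuller version would catch exceptions and report them to the main agent instead.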

Demo

I’ve created a demo showcasing this pattern. It’s a realtime chat with rich features:

Message queue: send messages without waiting for assistant response
Shared live chat: allow multiple users to talk to the same agent in the same chat
Pause agent: make the agent stop reacting while humans discuss, and resume later
Async subagent: the main agent can spawn subagents and let them work in the background
Manual agent handoff: switch to research mode

These features are built with just a few plugins (UI, Web, Timestamp, and agent communication plugins) composed together using the patterns described above.

Check out agent-chat to see how it all works.