Jonathan Chang’s Blog

Claude Daemon

Agents
Tutorial

Running Claude Code in yolo mode safely using macOS user isolation and ACLs

Dec 13, 2025
Jonathan Chang

vLLM from scratch with FlexAttention

PyTorch
FlexAttention
Tutorial

PyTorch FlexAttention tutorial: Building a minimal vLLM-style inference engine from scratch with paged attention

Aug 7, 2025
Jonathan Chang

MIST Bench: Building the MIST robot from Pantheon

Robotics
Agents
Hardware

Using AI to build a robot inspired by the animated series Pantheon

Jul 15, 2025
Jonathan Chang

Agent-Environment Middleware

Agents
LLM

Composing agents with Agent-Environment Middleware (AEM)

Jul 8, 2025
Jonathan Chang

LLMProc: Thinking in Processes, Not Agents

Agents
LLM

LLMProc introduces a process-based approach to LLM applications, drawing inspiration from Unix systems.

Apr 12, 2025
Jonathan Chang

Maximizing PyTorch Throughput with FastAPI

PyTorch
FastAPI
Tutorial

Techniques to optimize your PyTorch inference server for maximum throughput using FastAPI, asyncio, and CUDA’s asynchronous execution APIs

Oct 28, 2024
Jonathan Chang

Additive Rotary Embedding - A Competitive Alternative to RoPE

A competitive variant of rotary position embedding (RoPE) with interesting properties

Jul 31, 2024
Jonathan Chang

Exploring the Effective Rank of Projection Weights in Attention

Disclaimer: Most of the code was written by GPT/Copilot and is not optimized for presentation.

May 13, 2024
Jonathan Chang