The LLM is the compiler

We’re in a strange in-between time when it comes to LLMs and code. The models now write the code we used to write ourselves, and we’ve quietly crossed a further line: most people barely read the source code anymore. We describe what we want, sometimes skim the diff, run the tests, and move on. The source code is becoming an intermediate representation that humans glance at, if they look at all.

It makes you wonder where it will end. I think we’ll soon see programming languages purpose-built for LLMs rather than humans. But maybe even those will disappear. Maybe at some point the LLM just compiles our intent straight down to machine code that runs on the device, with no language in between.

I wanted to know if that was even possible today. So I set about trying to compile without a compiler.

Compiling without a compiler

The idea is this: You hand the LLM a C program and it produces the assembly by reasoning about the source, instruction by instruction. No clang -S, no gcc, nothing generating the code but the model. The only tools allowed are the assembler and linker, which turn the model’s hand-written assembly into a running binary and link it against the standard C library. The translation step that a compiler normally owns is done entirely by the LLM.
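To make the division of labour concrete, here is a rough sketch of the only mechanical steps left once the model has written its .s files. It is an assumption about how such a pipeline could be wired, not the actual llmc implementation: the helper name build_commands is hypothetical, and it assumes the system `as` assembler plus `cc` used purely as a linker front-end on object files, so no C source is ever compiled.

```python
from pathlib import Path


def build_commands(asm_files, output="a.out"):
    """Return assemble and link commands as argument lists.

    Nothing is executed here; the function only describes the steps.
    Hypothetical helper, assuming `as` and `cc` are on the PATH.
    """
    commands = []
    objects = []
    for asm in asm_files:
        # main.s -> main.o
        obj = str(Path(asm).with_suffix(".o"))
        commands.append(["as", "-o", obj, asm])
        objects.append(obj)
    # Link all object files together; the driver pulls in the C library.
    commands.append(["cc", "-o", output, *objects])
    return commands


cmds = build_commands(["main.s", "math.s"], output="prog")
for cmd in cmds:
    print(" ".join(cmd))
```

Each argument list could then be handed to subprocess.run(cmd, check=True). Everything before these steps, the translation from C to assembly, is the part the model does by reasoning.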

I started small: a program that adds two numbers and prints the result. It worked first try. Targeting ARM64 on an Apple M2, the model wrote correct assembly, linked it, and ran it.

Then bubble sort, which adds loops and array indexing. It got there, but it took a few rounds. The bug was instructive: the algorithm was always right, but the model kept fumbling the stack-frame arithmetic — the byte offsets of local variables, the alignment of the stack, where the array sits relative to the saved registers. One version laid an array directly on top of the saved return address; the program built cleanly and then died with a bus error the instant it tried to return.

That failure told me exactly where the weakness was. The model is good at the semantics of compilation: which instructions implement a loop, what a swap looks like, which values have to survive a function call. What it’s bad at is the bookkeeping — the careful, mechanical offset math. Which is unsurprising, because LLMs are famously bad at arithmetic.

So I gave it a calculator. A small Python script that lays out a stack frame: tell it what the frame holds and it returns the total size, every field’s offset, and whether the alignment is correct. The model does the part it’s good at and calls the script for the part it isn’t. That fixed the crashes, and I moved on to a recursive quicksort — partition logic, two recursive calls, the works. It compiled and ran, output identical to what real clang produces.
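A minimal sketch of what such a calculator might look like follows. It is not the script from the repository: the names Field and layout_frame and the example frame are hypothetical. It assumes offsets are measured upward from sp, that the saved x29/x30 pair sits at the bottom of the frame, and that the total size must be a multiple of 16 bytes, as ARM64 requires of the stack pointer.

```python
from dataclasses import dataclass

STACK_ALIGN = 16  # ARM64 keeps sp 16-byte aligned


@dataclass
class Field:
    name: str
    size: int   # size in bytes
    align: int  # required alignment in bytes


def align_up(n, a):
    """Round n up to the next multiple of a."""
    return (n + a - 1) // a * a


def layout_frame(fields):
    """Pack fields in order and return (total_size, offsets from sp)."""
    offsets = {}
    cursor = 0
    for f in fields:
        cursor = align_up(cursor, f.align)  # pad so the field is aligned
        offsets[f.name] = cursor
        cursor += f.size
    total = align_up(cursor, STACK_ALIGN)   # pad the frame to 16 bytes
    return total, offsets


# Hypothetical frame: saved registers, int arr[5], a pointer, an int.
frame = [
    Field("saved_fp_lr", 16, 8),
    Field("arr", 5 * 4, 4),
    Field("ptr", 8, 8),
    Field("i", 4, 4),
]
total, offsets = layout_frame(frame)
print(total, offsets)
```

Because the fields are placed one after another, no local can land on top of the saved return address, which is the overlap that caused the bus error. The model only has to state what the frame holds and then copy the offsets into its loads and stores.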

The whole thing now lives in a Claude Code skill. You point it at one or more .c files and it hand-compiles them, links them together, runs the result, and shows you the output:

git clone https://github.com/drewmccormack/llmc.git ~/.claude/skills/llmc
/llmc qsort.c
/llmc main.c math.c    # multiple files link together

What this is and isn’t

It’s early, and it’s slow. A real compiler does this in milliseconds and never makes an offset error; the LLM thinks for a minute and needs a tool to keep its arithmetic straight. As a way to actually build software today, it’s a toy.

But the failure modes are telling. The model handled real algorithms, recursion, and linking against libc, which are the genuinely hard parts of code generation. What it fumbled was the mechanical offset arithmetic, and a small script was enough to fix that.

Real DSLs

The LLM C compiler is a toy, but there may already be uses for this approach. We often develop Domain Specific Languages (DSLs) tailored to a purpose. Usually, we adopt an existing programming language like Ruby or Swift, and we mould it as best we can into something that fits our purpose. But this process can result in quite ugly solutions, and we only do it this way because we want to leverage the existing compilers.

But if an LLM is doing the compiling, you’re no longer required to write in a language that an existing compiler understands. You can drop the host language entirely and invent a small domain-specific language that really fits. Imagine a screen defined like this:

screen Profile
  avatar, then name in bold, then a "Follow" button
  tapping Follow toggles it to "Following"

No closures, no view builders, no framework. You write the rules of the language, the model compiles it to SwiftUI or React or straight to assembly, and a real binary comes out the other end. The grammar is yours to invent.

Right now, we are using LLMs to transform or generate static code, but maybe we could end up using them in dynamic roles, as tools to compile our DSLs and pseudocode.