Experiments

Experiments at the edge of AI & engineering

Hands-on tests of how AI tools really behave when you push them. Real builds, real data, full source. Each experiment is reproducible: pick one and look at the receipts.

Live experiments

Click into any experiment for the full interactive report.

2026-05-08 · ● Live · 4 reports

Karpathy vs 8 CLAUDE.MDs

Karpathy's CLAUDE.md vs the internet's most-liked CLAUDE.md / AGENTS.md files — plus a few of my own. The results aren't what you'd expect.

9 CLAUDE.md variants · 126 code builds · 378 LLM judgements · 27 live smoke tests
Open the experiment →

Coming soon · ○ Drafting

More experiments in flight

In the queue: a prompt-specificity curve (3+ specificity levels), tool-use reliability across model versions, and agent-vs-orchestrator latency. New experiments are published as their runs complete.