Erik Perttu
Start Here
This page is a summary index. For technical detail, start with the linked essays below.
Public profiles: linkedin.com/in/erikperttu, github.com/erikperttu, x.com/ErikPerttu.
- From LLM Author to LLM Reviewer: An AI Coding Agent Authors a Production Feature With Zero LLM Code Generation
How the pipeline's structural floor rose high enough to commit a multi-layer feature without any LLM source-code authoring. The current state of the system.
- From LLM Luck to Structurally Guaranteed: One Ticket Across Four Architectural Eras
The architectural overview: four eras, one ticket, cost per run dropping from $0.385 to $0.074.
- Prompt Rules Are Advisory; Validators Are Binding
Why structural validation matters more than instructions alone.
- What the Symbol Registry Stores, and How It Stays Fresh
The registry that keeps agent context precise across a changing codebase.
- The Lego Instructions: An Architectural Principle for AI Coding Agents
Why manifest quality determines outcome more than builder quality: the principle behind the Synthesizer's role.
- Stop Asking the Model What the Code Already Knows
Why every field a Planner emits that the codebase already knows is a dice roll, and how machine extraction eliminates it.
- Correct Code, Wrong File: How the Write Gate Contains Scope Creep
How write scope is constrained so the right code lands in the right place.
- What Calls This Function? Why AI Coding Agents Need a Language Server
Why the pipeline uses language-server context instead of loading whole files.
Profile
I build autonomous engineering pipelines. The design decisions come from a decade of shipping production software and getting burned by the alternative.
I'm a Swedish engineer based in Ho Chi Minh City. Shipping software to a million users teaches you where systems fail. Not in theory, but at 3am when something breaks and people are depending on it.
Hard stops, agent isolation, TDD enforcement: all come from shipping real software and living with the consequences.
Projects
February 2026 – Present
An autonomous engineering pipeline written in Python that takes a ticket from input to tested, reviewed, committed code with no human in the execution loop. The architecture behaves more like a compiler for LLM behavior than a conventional agent framework: LLMs handle the parts that require interpretation; deterministic code handles everything that can be computed.
A grounding chat stage runs before autonomous execution begins. The operator confirms intent, exclusions, and design decisions in a structured exchange grounded against the registry. The output is a structured handoff the downstream pipeline reads as ground truth, not as text to re-interpret.
Specialist agents then run in sequence with no shared memory or reasoning between stages. Each receives a structured artifact from the previous stage and produces one in return. Tests are generated by a deterministic Test Writer before the Coder runs and confirmed failing before implementation starts. Behavioral oracle tests are also derived from the same manifest operations that write the implementation, running below the route-test boundary where behavior is actually observable, and proven non-vacuous with mutation testing.
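A minimal sketch of that stage contract, assuming each stage can be modeled as a function from one artifact to the next. The names Artifact, run_pipeline, and confirm_red are hypothetical and are not taken from the pipeline's code.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Artifact:
    """Hypothetical structured artifact handed from one stage to the next."""
    stage: str
    data: dict


# A stage sees only the artifact it is given: no shared memory or reasoning.
Stage = Callable[[Artifact], Artifact]


def run_pipeline(stages: list[Stage], handoff: Artifact) -> Artifact:
    """Run stages strictly in sequence, passing artifacts along the chain."""
    artifact = handoff
    for stage in stages:
        artifact = stage(artifact)
    return artifact


def confirm_red(test_outcomes: dict[str, bool]) -> None:
    """Refuse to start implementation if any generated test already passes."""
    passing = sorted(name for name, passed in test_outcomes.items() if passed)
    if passing:
        raise RuntimeError(f"tests passed before implementation: {passing}")
```

In this sketch confirm_red would be called between the test-writing stage and the coding stage, so a test that passes before any implementation exists stops the run instead of being trusted.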
Built on a symbol registry and language server integration. A Synthesizer stage populates the manifest with machine-extracted file paths, symbols, signatures, and dependency resolutions before any builder runs, giving agents precise codebase context without reading entire files. Proven on real production codebases, including a ~100k line TypeScript monorepo, running sequential tickets with no reset between runs.
Empirical validation includes a 248-run reliability campaign that exposed a hallucination ceiling and drove architectural changes rather than patching prompts field by field.
Cost trajectory is documented across four architectural eras, reducing per-successful-run cost from $0.385 to $0.074 while raising consistency through structural controls.
Safety is structural, not advisory. Hard stops, a write gate that contains scope to the manifest, and full per-stage trace logging are enforced at the architecture level. Validated across ~1,000 runs on the target codebase.
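One way to picture a write gate is a check that runs before every file write and rejects any path the manifest does not list. This is a sketch under stated assumptions: the manifest is reduced to a set of relative POSIX paths, and check_write and WriteGateError are hypothetical names.

```python
import posixpath


class WriteGateError(Exception):
    """Raised when a write targets a path outside the manifest's scope."""


def check_write(manifest_paths: set[str], requested_path: str) -> str:
    """Return the normalized path if the manifest lists it, else raise."""
    # Normalize both sides so "./src/a.py" and "src/x/../a.py" compare equal.
    normalized = posixpath.normpath(requested_path)
    allowed = {posixpath.normpath(path) for path in manifest_paths}
    if normalized not in allowed:
        raise WriteGateError(f"{requested_path!r} is not in the manifest")
    return normalized
```

Because the check is code rather than a prompt rule, correct code aimed at the wrong file is rejected the same way as incorrect code.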
The Debugger router classifies each failure and dispatches to the stage that owns the fix. Code failures route to the Coder with a corrective brief. Upstream specification failures surface to the operator with a structured correction request. Four of six end-of-ticket gates route to a fail-safe repair pass that re-runs the gate authoritatively rather than terminating. The Debugger router is the only stage that can change where the pipeline goes next.
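The dispatch step can be sketched as a lookup from failure class to owning stage. This assumes classification has already happened and covers only the three routes described above; gate failures that do not route to a repair pass are not modeled. FailureKind, Failure, Route, and route_failure are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum


class FailureKind(Enum):
    CODE = "code"
    UPSTREAM_SPEC = "upstream_spec"
    REPAIRABLE_GATE = "repairable_gate"


@dataclass(frozen=True)
class Failure:
    kind: FailureKind
    detail: str


@dataclass(frozen=True)
class Route:
    target: str   # stage or person that owns the fix
    payload: str  # what that owner receives


def route_failure(failure: Failure) -> Route:
    """Send an already-classified failure to the owner of its fix."""
    if failure.kind is FailureKind.CODE:
        return Route("coder", f"corrective brief: {failure.detail}")
    if failure.kind is FailureKind.UPSTREAM_SPEC:
        return Route("operator", f"correction request: {failure.detail}")
    # Remaining kind: a gate failure that gets a repair pass and a re-run.
    return Route("repair_pass", f"repair and re-run gate: {failure.detail}")
```

Keeping the routing table in one function mirrors the idea that a single stage decides where the pipeline goes next.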
Each stage has its own model configuration. Model, effort level, and LLM vendor are set independently per agent. Designed to work with any LLM provider.
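A per-stage configuration of this kind might look like the sketch below. The stage keys, vendor names, model names, and effort values are placeholders chosen for illustration, and StageModelConfig and config_for are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageModelConfig:
    vendor: str
    model: str
    effort: str


# Placeholder values: each stage is configured independently of the others.
STAGE_CONFIGS = {
    "planner": StageModelConfig(vendor="vendor-a", model="model-large", effort="high"),
    "coder": StageModelConfig(vendor="vendor-b", model="model-small", effort="low"),
    "debugger": StageModelConfig(vendor="vendor-a", model="model-medium", effort="medium"),
}


def config_for(stage: str) -> StageModelConfig:
    """Look up a stage's model settings, failing loudly on unknown stages."""
    try:
        return STAGE_CONFIGS[stage]
    except KeyError:
        raise ValueError(f"no model configuration for stage {stage!r}") from None
```

Swapping the vendor for one stage then touches a single entry and leaves every other stage unchanged.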
The pipeline itself is developed under TDD.
On certain ticket shapes the pipeline now commits code with zero LLM source-code authoring. The Coder stage runs; it produces no code.
Still R&D.
Project evidence
- From LLM Luck to Structurally Guaranteed: One Ticket Across Four Architectural Eras
- From LLM Author to LLM Reviewer: An AI Coding Agent Authors a Production Feature With Zero LLM Code Generation
- What Calls This Function? Why AI Coding Agents Need a Language Server
- What the Symbol Registry Stores, and How It Stays Fresh
- Stop Asking the Model What the Code Already Knows
- The Lego Instructions: An Architectural Principle for AI Coding Agents
- Correct Code, Wrong File: How the Write Gate Contains Scope Creep
- Prompt Rules Are Advisory; Validators Are Binding
Reinforcement Learning, VizDoom
June 2025 – August 2025
Implemented DQN, REINFORCE, and PPO from scratch using PyTorch. The agent was constrained to first-person visual input only; full game state was available but deliberately excluded. Switched to StableBaselines3 for parallel training once parallel rollouts became the bottleneck.
Experience
Head of Engineering
Edu2Review - Ho Chi Minh City, Vietnam
January 2017 – Present
Leading technical strategy for Vietnam's largest education review platform, 1M+ MAU.
Built and led the engineering team from the ground up: sourced, interviewed, hired, and mentored every engineer. Responsible for technical culture, career development, and engineering standards across the department.
- Completed a cloud migration, cutting latency by 90% and infrastructure costs by 50%
- Re-engineered core search logic, increasing user engagement by 50%
- Built a payment gateway covering MoMo, ZaloPay, and credit cards
- Built internal marketing automation tools that tripled lead generation efficiency
- Engineered a real-time testing platform handling thousands of concurrent users for large-scale student competitions
Technical Lead & Project Manager
INS ENCO LTD. - Ho Chi Minh City, Vietnam
January 2016 – January 2017
Technical management for a financial software firm, bridging European stakeholders and a local engineering team in Vietnam.
Led and mentored local developers delivering high-performance, low-latency financial applications in C#/.NET for the banking sector. Worked directly with the CEO on operational reporting and resource allocation. Established development processes including technology selection, time estimation, and quality control. Led technical screening and hiring to scale the team.
Personally architected a backend solution connecting multiple disparate financial systems.
Software Developer
23 Critters - Stockholm, Sweden
November 2012 – April 2015
Backend development for a fast-paced Swedish tech startup. Built core backend systems for web applications in Python, taking full ownership of feature lifecycles from estimation to deployment. Autonomous environment, high standards, early foundation in scalable architecture and clean code.
Technical Skills
- Leadership: Engineering Leadership, Team Building, Hiring & Mentoring, Technical Strategy, Engineering Culture
- AI & Agents: Large Language Models (LLM), AI Agents, Multi-Agent Systems, Autonomous Engineering Pipelines, AI Agent Architecture, Code Generation, LLM Hallucination Mitigation, Deterministic Enforcement, Mutation Testing, PyTorch, Test-Driven Development (TDD)
- Code Intelligence: Language Server Protocol (LSP), Tree-Sitter, Abstract Syntax Tree (AST), Symbol Registry Design, Static Analysis
- Languages & Frameworks: Python, Go (Golang), TypeScript, Node.js, PHP, C
- Cloud & Infrastructure: Amazon Web Services (AWS), Docker, CI/CD Pipelines, Elasticsearch, Git, GitHub
- Data: SQLite, Relational Database Design, MySQL
Languages
- English: Bilingual
- Swedish: Native
- Vietnamese: Beginner