Can AI agents really build a whole app?

Yes, with the right setup. Garry Tan, YC's CEO, says he rebuilt essentially all of Posterous, a product that first took two years and ten engineers, mostly on his own with AI agents. He argues the limit is not the model's intelligence but the process around it: agents that plan, review, and test like a team produce shippable software, while a single unguided agent writes plausible code that quietly breaks.

Do founders still need engineers if AI writes the code?

The role shifts more than it disappears. Someone still has to scope the work, choose between approaches, review what the agents produce, and own the judgment calls. Running AI coding agents looks less like typing code and more like managing a small engineering team, where direction, review, and accountability stay human.

How do you review code written by AI agents?

With the same gates a good team uses, automated. A staff-level review pass hunts for bugs after the code is written, an adversarial step tries to break the plan before the build, and an automated browser runs the QA clicks. Nothing lands on the main branch until it is reviewed and a human approves it.

Is it safe to build software with AI agents in production?

Only with guardrails. Treat every agent change like a pull request from a new hire: reviewed and approved before it merges. Keep a human sign-off on anything irreversible or touching customer data, give agents the least access they need, and stay paranoid about supply-chain attacks, which Garry Tan calls one of the scariest things in AI coding right now.

AI agents for coding, run like a team

Garry Tan has written a lot of code. He was an early engineer at Palantir, co-founded Posterous (acquired by Twitter), and built the first version of Bookface, Y Combinator's internal network. Today he runs YC as its president and CEO. In a recent walkthrough of how he builds now, he says he has written more code in the last two months than in all of 2013, the last year he was heads-down as an engineer. By his own account, he rebuilt essentially all of Posterous, a product that first took two years, ten million dollars, and ten engineers, mostly on his own, with AI agents.

To do it he built gstack, an open-source set of Claude Code skills that turns a single coding agent into an AI engineering team. In its first weeks it passed long-standing frameworks like Ruby on Rails in GitHub stars, and it now has more than 100,000. The useful part for founders is not the tool. It is the idea behind it: building software with AI agents works the way building anything with people works. You get shippable software out of a team with roles, review, and a gate before anything merges, not out of one genius working alone.

This is the shift from one-prompt "vibe coding" to agentic engineering: running AI agents for coding the way you would run a team. Here is the playbook, and how to copy it whether or not you ever touch gstack.

Why one AI coding agent wanders, and a team ships

Point a raw coding agent at your codebase and it guesses. It does not know your data, so it writes plausible-looking code that silently breaks. Tan's framing is that the bottleneck is not the model's intelligence. The models are already smart enough to do extraordinary work. The bottleneck is the setup around them.

His approach is "thin harness, fat skills." Keep the scaffolding trivially thin and put the weight in the skills: a set of specialists, each good at a single job, that hand work to each other. That is the difference between a chatbot that types code and an organization that produces software you can ship.

Give your agents the roles a real team has

gstack is 23 tools that play the parts of a CEO, a designer, an engineering manager, a release manager, a doc engineer, and QA. You do not need his exact setup. You need the stages. Here is the assembly line, and what each stage buys you.

Intake, before any code. gstack opens with "office hours," a distilled version of the questions YC partners ask founders: who is the user, what is the strongest evidence someone actually wants this, what is the business model. It reframes the idea before a line is written. The cheapest bug to fix is the feature you decide not to build, which is the whole point of AI for product managers: deciding what to build is the part the agents do not do for you.
Plan with options, not one answer. Instead of jumping to code, it produces two or three approaches with different effort and risk, and you pick. A plan you reviewed beats a plan you discover halfway through the build.
Adversarial review. A separate pass tries to break the plan on purpose, looking for gaps such as missing failure handling, a missing privacy section, or an unhandled login step. In Tan's demo it caught and fixed 16 issues and moved the design from a 6 out of 10 to an 8. Always have one agent attack what another agent produced.
Design exploration. A "design shotgun" generates several visual directions at once and asks you to rate them, instead of committing to the first layout. Cheap to generate many, expensive to rebuild one.
Code review and QA. After the code is written, a staff-level review hunts for bugs the plan missed, and an automated browser runs the clicks you would have done by hand. Tan wrapped a real browser at the command line so his agents could take screenshots, fill forms, and catch CSS and JavaScript bugs, because manual QA was the least fun and least scalable part of his day.
A ship gate. One last check makes sure the change is actually ready to land on the main branch.

Read that list again as a founder, not an engineer. It is how you would run a small team: scope the work, plan it, pressure-test it, build, review, and only then ship. The agents just let one person run all of it.
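The assembly line above can be sketched as a sequence of stages with a gate at the end. This is a minimal illustration, not gstack's implementation: the `WorkItem` type, the stage functions, and the toy checks inside them are all hypothetical, and each stage is a stub where a real setup would call an agent or ask a human.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WorkItem:
    """One piece of work moving through the pipeline (hypothetical type)."""
    idea: str
    plan: str = ""
    code: str = ""
    open_issues: List[str] = field(default_factory=list)
    human_approved: bool = False
    history: List[str] = field(default_factory=list)


def intake(item: WorkItem) -> WorkItem:
    # Stub: a real stage would ask who the user is and what evidence exists.
    item.history.append("intake")
    return item


def plan(item: WorkItem) -> WorkItem:
    # Stub: propose a few approaches; picking the first stands in for a human choice.
    options = [f"small: {item.idea}", f"medium: {item.idea}", f"large: {item.idea}"]
    item.plan = options[0]
    item.history.append("plan")
    return item


def adversarial_review(item: WorkItem) -> WorkItem:
    # Stub: a separate agent would attack the plan; here we flag one toy gap.
    if "failure handling" not in item.plan:
        item.open_issues.append("plan has no failure handling")
    item.history.append("adversarial_review")
    return item


def build(item: WorkItem) -> WorkItem:
    # Stub: address the flagged gaps, then "write" the code.
    item.plan += " (with failure handling)"
    item.open_issues.clear()
    item.code = f"# code for: {item.plan}"
    item.history.append("build")
    return item


def review_and_qa(item: WorkItem) -> WorkItem:
    # Stub: a code review pass and an automated browser run would go here.
    if not item.code:
        item.open_issues.append("nothing was built")
    item.history.append("review_and_qa")
    return item


def ship_gate(item: WorkItem) -> bool:
    # Nothing ships with open issues or without a human approval.
    return not item.open_issues and item.human_approved


STAGES = [intake, plan, adversarial_review, build, review_and_qa]


def run_pipeline(idea: str) -> WorkItem:
    item = WorkItem(idea=idea)
    for stage in STAGES:
        item = stage(item)
    return item


if __name__ == "__main__":
    item = run_pipeline("photo upload")
    print(ship_gate(item))  # False: no human has approved yet
    item.human_approved = True
    print(ship_gate(item))  # True: no open issues and a human approved
```

The design choice worth copying is that the gate is a separate function from the stages, so no stage can quietly mark its own work as ready to ship.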

Run many AI agents in parallel

Once each piece of work is planned, reviewed, and tested on its own, you stop doing them one at a time. Tan says he runs 10 to 15 AI coding agents in parallel, each on its own branch, and lands anywhere from 10 to 50 pull requests a day. Every new idea, bug report, or complaint he sees becomes a new work item that runs through the same pipeline. He says he no longer keeps a to-do list.

For a founder, this is the real shift. It is not that AI types faster. It is that a structured process lets you keep many things in flight without losing the thread, which is exactly what a small team has never been able to do. Coordinating that many agents at once is its own skill, which I go deeper on in AI agent orchestration.
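The one-branch-per-agent pattern can be sketched with a small worker pool. This is an illustration under stated assumptions, not Tan's setup: `run_agent` is a hypothetical stub standing in for whatever launches a coding agent, the `agent/` branch prefix is invented for the example, and the cap of 10 workers simply echoes the low end of the numbers above.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def branch_name(task: str) -> str:
    # Turn a task description into a branch-safe slug (hypothetical naming scheme).
    slug = re.sub(r"[^a-z0-9]+", "-", task.lower()).strip("-")
    return f"agent/{slug}"


def run_agent(task: str, branch: str) -> Dict[str, str]:
    # Hypothetical stub: a real version would start an agent on this branch
    # and return once it has opened a pull request.
    return {"task": task, "branch": branch, "status": "needs_review"}


def run_in_parallel(tasks: List[str], max_agents: int = 10) -> List[Dict[str, str]]:
    branches = [branch_name(task) for task in tasks]
    # Each agent must own its branch, so refuse tasks that collide.
    if len(set(branches)) != len(branches):
        raise ValueError("two tasks map to the same branch")
    with ThreadPoolExecutor(max_workers=max_agents) as pool:
        # map() keeps results in the same order as the input tasks.
        return list(pool.map(run_agent, tasks, branches))


if __name__ == "__main__":
    for result in run_in_parallel(["Fix login bug!", "Add dark mode"]):
        print(result["branch"], result["status"])
```

Every result comes back marked `needs_review` rather than merged, which keeps the parallelism on the building side and leaves the review gate in place.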

The part most demos skip: do it safely

This is where the agent era gets a security person's attention. Tan is blunt that one of the scariest things in AI coding right now is supply-chain attacks, and he says he is "really, really paranoid" about them. He pulls community fixes into his open-source repos in reviewed waves and leans on his own pipeline to vet what lands.

That instinct is the whole point. When AI coding agents write and merge code at up to 50 pull requests a day, your review gates are the only thing between you and a quiet disaster. A few rules keep the speed without the blast radius:

Treat every agent change like a pull request from a new hire: reviewed by something, approved by someone, before it touches main.
Keep a human sign-off on anything irreversible, anything that spends money, and anything that touches customer data or credentials.
Give agents the least access they need. An agent that can read one repo is safer than one holding a key to everything.
Be paranoid about dependencies. The faster you merge, the more a poisoned package costs you. This is the same supply-chain risk the best AI security startups are built to catch.

Speed and safety are not opposites here. The review and adversarial steps that keep the code correct are the same ones that keep it safe.
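Some of these rules can be written down as a merge gate. The sketch below is one possible shape, assuming a change is described by its branch and the files it touches. The path patterns, the `Change` type, and the split between a routine approval and an explicit sign-off are all hypothetical and would need to match your own repository.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List

# Hypothetical patterns; replace them with paths from your own repository.
SENSITIVE_PATTERNS = [
    "migrations/*",       # schema changes are hard to reverse
    "billing/*",          # anything that spends money
    "customers/*",        # customer data
    "*.env",              # credentials
    "requirements*.txt",  # dependency changes carry supply-chain risk
]


@dataclass
class Change:
    branch: str
    files: List[str]
    agent_reviewed: bool = False    # reviewed by something
    human_approved: bool = False    # approved by someone
    explicit_signoff: bool = False  # extra sign-off for sensitive paths


def sensitive_files(change: Change) -> List[str]:
    return [
        path
        for path in change.files
        if any(fnmatchcase(path, pattern) for pattern in SENSITIVE_PATTERNS)
    ]


def blocking_reasons(change: Change) -> List[str]:
    reasons = []
    if not change.agent_reviewed:
        reasons.append("no review pass")
    if not change.human_approved:
        reasons.append("no human approval")
    flagged = sensitive_files(change)
    if flagged and not change.explicit_signoff:
        reasons.append("sensitive paths need explicit sign-off: " + ", ".join(flagged))
    return reasons


def may_merge(change: Change) -> bool:
    return not blocking_reasons(change)


if __name__ == "__main__":
    change = Change(
        branch="agent/billing-fix",
        files=["billing/charge.py"],
        agent_reviewed=True,
        human_approved=True,
    )
    print(may_merge(change))         # False: billing path lacks explicit sign-off
    print(blocking_reasons(change))
```

Returning the reasons rather than a bare yes or no makes the gate easier to trust, because a blocked change says exactly which rule it tripped.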

What to do this week

Take one real task and run it through stages instead of one prompt: have the agent reframe the problem, propose two or three approaches, then build the one you pick.
Add an adversarial step. Tell a fresh agent its only job is to find what is wrong with the plan before you build it.
Put a review gate on the output. Nothing lands until something reviews it and you approve it.
Write your red lines: what an agent may never do without a human, and which systems it may never touch.

The barrier to building has collapsed. Garry Tan's point is that the winners will be the ones who run agents like a disciplined team, not a magic box, and who stay paranoid while they move fast. That is the operating system we help founders install in AI Operating System for Startups: put AI to work across the company, build your own internal tools, and operate safely, with a human on every irreversible call. The same discipline applied to the whole org is what makes an AI-native company work. To see how other YC startups are doing it, start with the YC companies building AI coding agents, or read how Anthropic's founders ship fast without cutting safety.

Sources

Garry Tan on building in the agent era, the video this article distills.
gstack on GitHub, Garry Tan's open-source Claude Code setup, the role-based tools described above.
Background on Garry Tan: Y Combinator, Wikipedia, and TechCrunch on his appointment as YC's CEO. Posterous was acquired by Twitter in 2012.
Garry Tan on LinkedIn.