AI · Product · June 2026 · 7 min read

Using AI to actually do product

Where AI gives product managers enormous leverage and where it gives none — lessons from building two AI products end-to-end, alone.

Pier Stein

Product · Growth · Investment Products · AI

For years the constraint on a product idea was never taste — it was hands. As a PM I could see the product clearly, sometimes months before a team could ship it. The brief, the flows, the edge cases all sat in my head fully formed, waiting on engineering capacity that was always scarcer than ideas. AI has quietly inverted that. The brief is no longer a document you hand off; it is a working prototype you build the same afternoon. That changes the craft of product management more than any tool I have used in twelve years.

The brief becomes a prototype

I no longer write specs to persuade people an idea is worth building. I build the thing and let it argue for itself. Invest, my Revolut-grade investing simulator, started as a question about whether a live chart could feel alive without a real trading desk behind it. Instead of a deck, I had a working custom chart engine on iOS within days. Call My Agent — a phone number elderly people ring to talk to their own AI agent — went from concept to a real Twilio number you could actually call inside a week. The argument for each product was the product.

This is the part everyone is excited about, and they are right to be. The 0 to 1 has genuinely collapsed. What used to be a quarter of discovery and scaffolding is now an afternoon. But that is also where the honest conversation has to start, because the collapse stops there.

AI collapses the 0 to 1, not the 0 to 100

Getting to a thing that demos is fast. Getting to a thing you would put your name on is exactly as slow as it has always been. The last mile — trust, latency, edge cases, failure states — has not shortened by a single day. With Call My Agent, the happy path took an afternoon. Making it safe for a confused eighty-year-old who repeats themselves, mishears, or goes quiet mid-sentence took weeks. What happens when the model is wrong, when the line drops, when the research call times out — that is the actual product, and AI gives you almost no leverage there.
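To make "failure states" concrete, here is a minimal sketch of one of them: a research call that times out mid-conversation. It is an illustration under stated assumptions, not the code behind Call My Agent. The names `with_timeout`, `slow_research`, and `FALLBACK` are hypothetical, and a real telephony stack would likely handle this asynchronously.

```python
import concurrent.futures
import time

# Hypothetical fallback line the caller hears instead of silence.
FALLBACK = "I'm still looking into that. Let me come back to you in a moment."


def with_timeout(fn, timeout_s, fallback):
    """Run fn() in a worker thread; return fallback if it is too slow or raises."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # The slow call is not cancelled; we simply stop waiting for it.
        return fallback
    except Exception:
        # Any other failure also degrades to the safe spoken fallback.
        return fallback
    finally:
        pool.shutdown(wait=False)


def slow_research():
    time.sleep(0.3)  # stands in for a research call that overruns its budget
    return "Here is what I found."


print(with_timeout(slow_research, 0.05, FALLBACK))
```

The wrapper is a few lines. The slow work is deciding what the fallback should say to someone who is already confused, and that decision is not something the model makes for you.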

Verification is the real job

Once you accept that, your job stops being generation and becomes verification. This is where I spend most of my time now. The headline value in Invest is exact to the cent and comes from the data layer, never from the model — the model is not allowed anywhere near a number a user might trust. When I need structured output, I force it through tool-use rather than parsing prose, so the shape is guaranteed by contract instead of hope. In Call My Agent I run a second model to score the first one's responses, a self-improving evaluation loop that catches the failures a single pass never will. The generation is cheap. The scaffolding that makes the generation trustworthy is the whole game.
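A sketch of two of those checks, under assumptions of my own rather than the actual Invest code: the headline value is computed from holdings with exact decimal arithmetic, and a tool-call payload is rejected unless it matches a fixed shape. The field names, the sentiment values, and the crude "no digits in model text" rule are all hypothetical choices made for illustration.

```python
from decimal import Decimal

# Hypothetical tool contract: the model may return these fields and nothing else.
ALLOWED_FIELDS = {"summary": str, "sentiment": str}
ALLOWED_SENTIMENT = {"positive", "neutral", "negative"}


def validate_tool_args(args: dict) -> dict:
    """Reject any tool-call payload that does not match the contract exactly."""
    if set(args) != set(ALLOWED_FIELDS):
        raise ValueError("payload fields do not match the contract")
    for name, expected in ALLOWED_FIELDS.items():
        if not isinstance(args[name], expected):
            raise ValueError(f"{name} must be {expected.__name__}")
    if args["sentiment"] not in ALLOWED_SENTIMENT:
        raise ValueError("sentiment outside allowed values")
    # Assumed rule: model-written text may not contain figures at all.
    if any(ch.isdigit() for ch in args["summary"]):
        raise ValueError("model text may not contain numbers")
    return args


def portfolio_value(holdings: list[tuple[Decimal, Decimal]]) -> Decimal:
    """Headline value from the data layer: sum of quantity * price, to the cent."""
    total = sum((qty * price for qty, price in holdings), Decimal("0"))
    return total.quantize(Decimal("0.01"))


holdings = [(Decimal("1.5"), Decimal("10.10")), (Decimal("2"), Decimal("0.335"))]
print(portfolio_value(holdings))
print(validate_tool_args({"summary": "Tech led the gains today.", "sentiment": "positive"}))
```

The split is the point: the number on screen never passes through the model, and the model's words never reach the screen without passing the contract.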

Taste becomes the moat

When anyone can generate a screen in seconds, the screen is worth nothing. What is left is knowing which screen, which flow, which three features to cut so the fourth one sings. AI flattens execution, which means the differentiation moves entirely upstream into judgement — what to build, what to refuse, what good actually feels like. That has always been the PM's real job; AI just stripped away the busywork that let people pretend it was about throughput. Taste was a nice-to-have when hands were scarce. Now it is the only scarce thing left.

Treat the model like a junior with infinite energy

The mental model that works for me: the model is a brilliant junior teammate with infinite energy and zero context. It will do anything you ask, instantly, and it will do the wrong thing with total confidence if your contract is vague. So I give it clear contracts — tight scope, forced output shapes, explicit failure modes — the same way I would brief a sharp graduate I do not yet fully trust. Reviewed properly it is the most productive teammate I have ever had. Trusted blindly it is a liability. The skill is knowing which interactions need a human in the loop and which do not.
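One way to write such a contract down, as a rough sketch rather than a description of my actual setup: a brief that names the scope, the output shape, and the failure mode, plus a rule for when a human reviews the result. `Brief`, `Route`, `route`, the `judge_score` input, and the 0.8 threshold are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO = "auto"
    HUMAN_REVIEW = "human_review"


@dataclass(frozen=True)
class Brief:
    task: str                       # tight scope: one job only
    output_fields: tuple[str, ...]  # forced output shape
    on_failure: str                 # explicit failure mode
    high_stakes: bool = False       # e.g. anything touching money or health


def route(brief: Brief, judge_score: float, threshold: float = 0.8) -> Route:
    """Send high-stakes or low-scoring work to a human; let the rest through."""
    if not 0.0 <= judge_score <= 1.0:
        raise ValueError("judge_score must be between 0 and 1")
    if brief.high_stakes or judge_score < threshold:
        return Route.HUMAN_REVIEW
    return Route.AUTO


summary_brief = Brief(
    task="Summarise today's market move in one sentence",
    output_fields=("summary", "sentiment"),
    on_failure="return no summary rather than guess",
)
print(route(summary_brief, judge_score=0.93))
```

The threshold is a judgement call, not a constant. Where to set it is exactly the kind of decision that stays with the person, not the model.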

What this means for product teams

The strategic shift underneath all of this is simple: the cost of being wrong about an idea has collapsed. When testing a direction meant a team and a quarter, you argued in meetings to avoid the spend. Now you build the contentious version over a weekend and let reality settle the debate. That should make teams braver and far more honest — fewer roadmaps defended on sunk cost, more directions killed cheaply because the prototype told the truth. The leverage is real and it is enormous. It just sits in a different place than the hype suggests: not in building the thing, but in knowing what is worth building and proving, ruthlessly, that it actually works.
