
What Three Decades of PM Taught Me About Shipping AI

  • Post category: AI

I’ve been a product manager since 1994. That is three decades of shipping products in fintech (the world’s first end-to-end B2B credit card encryption service), mobile (Qualcomm’s BREW Managed Services and Mobile Advertising Platform), MarTech (Mapp’s Customer Data Platform), VoIP (Voxox’s Unified Communications stack), blockchain (Open Crypto Trust’s Network-as-a-Service), and EV charging (Shell Recharge’s fleet, OEM, and energy management products), plus earlier work in internet, digital media, web technologies, interactive 3D, gaming, and online storage.

What I’ve noticed in the last year of building production AI is that almost nothing about good product management changes for AI. The principles transfer. They just have a few additional sharp edges.

Here are the principles I keep returning to, and what’s specifically different about applying them to AI work.

1. Solve a problem someone has, not a problem you find interesting

The single most reliable predictor of whether a product will ship and matter is whether it solves a problem the customer can name unprompted. This was true at Qualcomm in 2007, and it’s true with LLMs in 2026.

The AI version of this principle is sharp because the technology is genuinely fascinating. Engineers and PMs alike fall in love with what the model can do and lose sight of whether anyone has a use for it. I see this every week: beautifully crafted AI demos that solve no problem, ship to no users, and get quietly shelved six months later.

The check: ask a real prospective user to describe the problem in their own words, with no leading. If they can, you have a product worth building. If you have to coach the words out of them, you don’t.

2. Define success before you start

Every product I’ve shipped that succeeded had a defined success metric before kickoff. Every product I’ve shipped that didn’t succeed had a metric defined retroactively, after the team had already committed to the build.

For AI, “success” needs an explicit translation step. Model accuracy is not success. Latency is not success. Tokens-per-second is not success. The metric has to live in the customer’s world: “percentage of bookings that complete without human intervention,” “time from intake to live website,” “support tickets resolved without escalation.” Then the technical metrics get derived from the customer-facing ones, not the other way around.

This is more important in AI because the technical metrics are seductive and easy. Resist.

3. Schema before code

I’ve watched more product disasters caused by ill-considered data models than by any other category of mistake. The schema is the constitution of the product. Once it’s wrong, every feature you build on top of it is harder than it should be.

For AI, this principle extends in a specific way: the structure of the prompts and the schema of the model’s outputs are part of your data model. They deserve the same rigor. A free-text Claude response that you parse with regex is fragile in exactly the same way a missing foreign key constraint is fragile, and for the same reason — the structure isn’t enforced where it matters.

When I built the LocalValue.co Claude integration, I wrote the prompt to demand JSON output with a defined schema, and the workflow validates each response against that schema before continuing. That’s the AI version of “schema before code.” It’s not optional.
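
To make the idea concrete, here is a minimal sketch of a validate-before-continuing step. It is not the LocalValue.co code: the field names, the `LISTING_SCHEMA` dictionary, and the `parse_model_output` helper are all invented for illustration, and it uses only Python's standard library rather than any particular schema package.

```python
import json

# Hypothetical schema (an assumption): field name -> required Python type.
LISTING_SCHEMA = {"business_name": str, "tagline": str, "services": list}


def parse_model_output(raw: str, schema: dict) -> dict:
    """Parse a model response as JSON and check it against a flat schema.

    Raises ValueError if the text is not a JSON object, a field is missing,
    an unexpected field appears, or a field has the wrong type.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")
    missing = sorted(set(schema) - set(data))
    extra = sorted(set(data) - set(schema))
    if missing or extra:
        raise ValueError(f"missing fields {missing}, unexpected fields {extra}")
    for field, expected in schema.items():
        if not isinstance(data[field], expected):
            raise ValueError(f"field {field!r} should be {expected.__name__}")
    return data


# A well-formed response passes and comes back as a dict.
good = '{"business_name": "Paws & Co", "tagline": "Grooming done right", "services": ["bath"]}'
listing = parse_model_output(good, LISTING_SCHEMA)

# A chatty free-text response is rejected before anything downstream sees it.
try:
    parse_model_output("Sure! Here is your listing: ...", LISTING_SCHEMA)
    rejected = False
except ValueError:
    rejected = True
```

The design choice worth noting is that the workflow fails loudly at the boundary. A rejected response can be retried or routed to a human, but it never reaches the code that assumes the structure is there.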

4. Cost discipline is product discipline

Every product I’ve shipped lived or died on its unit economics. At Verifone, the credit-card encryption service had to clear payment-processor margin requirements per transaction. At Shell Recharge, every kilowatt-hour through the network had a cost-of-goods that the platform had to clear. At Mapp, every email send had a fractional-cent cost that compounded into real money at scale.

For AI, cost discipline is the same problem with different units. Every Claude call costs something. Every Pexels query costs something. Every Twilio SMS costs something. A workflow that calls the LLM three times per user-event costs roughly three times as much per event as one that calls it once with better context, and more if each call resends the same context. At scale, that multiple can decide whether the business is viable.

The PM job is the same one I had at Mapp: design the workflow so the per-event cost is bounded, watch the cost dashboard daily, and reject features that don’t pay for themselves.
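
The per-event arithmetic is simple enough to sketch. The prices and token counts below are placeholders I made up for illustration, not any vendor's actual rates, and `Call`, `cost_per_event`, and `within_budget` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """Token counts for a single LLM call."""
    input_tokens: int
    output_tokens: int


# Hypothetical prices in dollars per million tokens; substitute real ones.
PRICE_IN_PER_M = 3.00
PRICE_OUT_PER_M = 15.00


def cost_per_event(calls: list[Call]) -> float:
    """Dollar cost of one user event, summed over its LLM calls."""
    return sum(
        c.input_tokens * PRICE_IN_PER_M / 1_000_000
        + c.output_tokens * PRICE_OUT_PER_M / 1_000_000
        for c in calls
    )


def within_budget(calls: list[Call], budget: float) -> bool:
    """True if the event's LLM cost stays at or under the per-event budget."""
    return cost_per_event(calls) <= budget


# Three separate calls, each resending about 2,000 tokens of context...
three_calls = [Call(2_000, 500)] * 3
# ...versus one call with slightly more context and a slightly longer answer.
one_call = [Call(2_500, 600)]

chatty = cost_per_event(three_calls)
lean = cost_per_event(one_call)
```

With these made-up numbers the three-call design costs about 4 cents per event and the single-call design about 1.65 cents. A per-event budget check such as `within_budget(one_call, 0.02)` is the kind of gate a feature has to pass before it ships.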

5. Operate in production from day one

The systems I’m proudest of were the ones where I treated the day-one production launch as the start of the project, not the end. Monitoring, alerting, error handling, audit trails, on-call rotations — all of it set up before any real customer traffic hit. The boring infrastructure becomes the difference between a product that earns trust and one that doesn’t.

For AI, this is even more important because AI fails in subtler ways than traditional software. A web service that returns a 500 error is obvious. An LLM that returns a confident but incorrect answer is invisible until a customer notices, by which point the damage is done. The monitoring you need is not “is the service up” — it’s “are the outputs still good,” which requires an evaluation harness that grades a sample of outputs continuously.

Building that evaluation harness is product work. It’s not glamorous. It’s not optional.
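
As a rough sketch of the shape such a harness can take: sample recent outputs, grade each one, and alert when the pass rate drops. Everything here is an assumption for illustration, including the function names, the 95% threshold, and the toy grader; a real grader would be a rubric, a human reviewer, or a second model.

```python
import random
from typing import Callable


def sample_pass_rate(
    outputs: list[str],
    grader: Callable[[str], bool],
    sample_size: int,
    seed: int = 0,
) -> float:
    """Grade a random sample of outputs and return the fraction that pass."""
    if not outputs:
        raise ValueError("no outputs to grade")
    rng = random.Random(seed)
    sample = rng.sample(outputs, min(sample_size, len(outputs)))
    return sum(1 for o in sample if grader(o)) / len(sample)


def should_alert(pass_rate: float, threshold: float = 0.95) -> bool:
    """True when quality has dropped below the agreed threshold."""
    return pass_rate < threshold


# Toy grader (an assumption): a draft passes if it is non-empty and does
# not contain a placeholder the model forgot to fill in.
def toy_grader(text: str) -> bool:
    return bool(text.strip()) and "[TODO]" not in text


recent = ["Your booking is confirmed."] * 9 + ["Dear [TODO], thanks"]
rate = sample_pass_rate(recent, toy_grader, sample_size=10)
alert = should_alert(rate)
```

Here the sample covers all ten outputs, nine pass, and the alert fires. Run on a schedule against production outputs, the same loop answers "are the outputs still good" rather than "is the service up."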

6. Decisions need rationale, and rationale needs a date

Every system I’ve ever come back to six months later required me to remember why I made the choices I made. The systems where I had written down the rationale, with a date, were the ones I could move fast in. The systems where I had to reverse-engineer my past thinking from the code were the ones that calcified.

Every Local.Pet architectural decision lives in a dated decision log with a one-line rationale. When I come back to a question of “why did we do it this way,” I have an answer. When I’m tempted to revisit a settled decision, the log tells me what trade-offs I already weighed.

The AI version of this principle is identical. AI projects involve more architectural choices than typical software projects — model choice, prompt structure, retrieval strategy, evaluation approach, fallback behavior — and those choices interact. Without a decision log, you’ll make incompatible choices six months apart and not notice until something breaks.
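
A decision log needs very little structure to be useful. The sketch below is one possible shape, not the actual Local.Pet log: the `Decision` fields, the `DecisionLog` class, and the sample entries are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Decision:
    decided_on: date
    topic: str        # e.g. "model", "retrieval", "fallback"
    choice: str
    rationale: str    # one line: what was weighed and why this won


class DecisionLog:
    def __init__(self) -> None:
        self._entries: list[Decision] = []

    def record(self, entry: Decision) -> None:
        self._entries.append(entry)

    def history(self, topic: str) -> list[Decision]:
        """All decisions on a topic, oldest first."""
        return sorted(
            (e for e in self._entries if e.topic == topic),
            key=lambda e: e.decided_on,
        )

    def current(self, topic: str) -> Optional[Decision]:
        """The most recent decision on a topic, or None if never decided."""
        past = self.history(topic)
        return past[-1] if past else None


# Hypothetical entries, recorded out of order on purpose.
log = DecisionLog()
log.record(Decision(date(2025, 9, 15), "retrieval", "hybrid search",
                    "Keyword search missed paraphrased questions in evals."))
log.record(Decision(date(2025, 3, 1), "retrieval", "keyword search",
                    "Corpus is small; embeddings add cost with no measured gain."))
latest = log.current("retrieval")
```

Because superseded entries stay in the history, a later reader can see both the current choice and the trade-off that was accepted at the time.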

7. Narrow scope, then earn the right to expand

The product you ship in v1 should be smaller than the product you want to ship. v1 exists to validate that anyone wants the thing. Once that’s validated, you earn the right to expand.

I built Local.Pet’s marketplace stack — booking, escrow, reviews, disputes — in code, then deliberately gated it from launch. v1 ships listings and a contact form. Only after the directory and lead model validate does the marketplace go live.

For AI, the principle holds and the temptation to violate it is stronger, because LLMs make it easy to build adjacent features at low cost. “While we’re at it, let’s also have it do X.” That’s how scope creeps until the team can no longer hold the whole product in their head.

The discipline: ship the smallest thing that proves the metric, watch the metric, expand only on evidence.
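
One way to make "expand only on evidence" mechanical is to tie each gated feature to the metric that unlocks it. This is a sketch under invented assumptions: the `Gate` type, the metric name `monthly_leads`, and the threshold of 200 are placeholders, not Local.Pet's real launch criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    feature: str
    metric: str
    minimum: float


# Features that ship in v1 regardless of metrics.
V1_FEATURES = {"listings", "contact_form"}

# Hypothetical gates: built in code, dark until the metric is met.
GATES = [
    Gate("booking", "monthly_leads", 200),
    Gate("escrow", "monthly_leads", 200),
]


def enabled_features(metrics: dict[str, float]) -> set[str]:
    """v1 features plus any gated feature whose metric meets its minimum."""
    unlocked = {g.feature for g in GATES if metrics.get(g.metric, 0) >= g.minimum}
    return V1_FEATURES | unlocked


early = enabled_features({"monthly_leads": 50})
validated = enabled_features({"monthly_leads": 250})
```

Early on, only the v1 set is live. Once the metric clears its threshold, the gated features switch on without a new build, and the threshold itself is a decision someone has to write down and defend.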

8. People close to the work know things you don’t

Every time I’ve shipped something that worked, it was because I listened to the people closest to the customer — the support reps, the salespeople, the field engineers — instead of the people closest to the strategy.

For AI, the closest-to-the-work voices are the ones who’d be using the AI output in their job. The customer service rep who’d read what the AI drafted before sending it. The accountant who’d review what the AI categorized. The marketer who’d publish what the AI wrote. Their feedback on AI outputs is the only feedback that maps to whether the AI is actually working.

Most AI projects don’t talk to those people until launch. Then the launch fails, and everyone is surprised.

9. Trust compounds, and so does the lack of it

The single most durable asset I’ve built in three decades is a reputation for shipping things that work. It compounds. It opens doors. It makes the next sale, the next hire, the next product easier than the last.

The inverse compounds too. A product that ships broken, an AI that confidently misleads a customer, a launch that overpromises and underdelivers — those leave a residue that takes years to clear.

The pace of AI hype right now is creating a generation of products that will leave the wrong residue. The PMs who hold the line on quality are building reputational equity that will pay off for the rest of the cycle.

What’s different

Three things are genuinely different about AI work, and they sit on top of these principles, not in place of them.

Output quality is probabilistic, not deterministic. A traditional software function returns the same answer every time for the same inputs. An LLM doesn’t. This requires evaluation discipline that traditional software doesn’t, and it changes how you think about edge cases — every output is, in some sense, an edge case.

The cost surface moves daily. Model prices, capability frontiers, latency profiles all shift faster than they do in any other domain I’ve worked in. The architecture that’s right today may be wrong in six months. Plan for re-architecture in a way you wouldn’t for traditional software.

The capability ceiling is a moving target. Things that are impossible today will be table stakes a year from now. Calibrate your roadmap to assume capability will grow, not shrink, and design extensibility into your product accordingly.

Where this all lands

Three decades of product management taught me that the principles are durable and the technology is incidental. AI is the technology of the moment. The principles are what determine whether the AI products you ship matter to customers or get quietly forgotten.

If you’re a senior PM thinking about pivoting into AI, the work you’ve already done is more transferable than you think. Don’t apologize for not being a data scientist. Bring the product judgment. Bring the ship-discipline. Apply both to the new tools.

That’s what I do at Local Value Marketing: I help businesses that want AI to actually move a metric, as an AI Product Consultant who has shipped products both before and during the AI era.


Related: Building Production AI as a Solo PM and AI Doesn’t Need More Engineers — It Needs Product Managers.

Rob Lewis is the founder of Local Value LLC and an AI Product Consultant for SMB and mid-market businesses. He has three decades of senior product leadership experience and ships production AI applications. Reach him at [email protected].