How a five-person startup ships like a thirty-person one
What I learned running AirPLAi as a fleet of agents instead of a bigger team.

There was a moment, around 2am on a Thursday, when I caught myself reading a brief that three agents had built and routed to my phone. I'd typed one sentence into a terminal six hours earlier. The brief was useful.
I sat there for a minute thinking: this isn't how startups are supposed to run.
The bet
We're five at AirPLAi, trying to build video AI for every level of the game, not just pro. The math doesn't work if you scale headcount the normal way: the deals are too small and the volume too high.
So most days I'm not running a company so much as running a fleet that runs one. The fleet handles the connective tissue. The five of us handle the irreducible part.
I keep calling this automation. It isn't, really. It feels closer to organizational design.
Three things that actually happened
The brief I didn't ask for. A research agent pulled vault notes on a competitor launch. A second agent, the judgment layer, added a framing I hadn't considered. A third routed the summary to my phone. The whole loop took about six hours unattended.
The thing that surprised me wasn't the output. It was that I didn't have to manage the handoffs. I've built multi-step pipelines before. You always end up babysitting them. This one just ran.
Two parts of the company talking without me. An agent working on a pitch noticed that a customer-side agent had flagged a risk with the same prospect. It messaged the relevant owner, and the conflict was resolved before I had seen either the pitch or the flag.
That isn't AI helping me do tasks. That's the system noticing things I wasn't watching and routing them.
The cron that woke me up. Around 2am one night, a health-watch job flagged a production deploy that had gone sideways. It tried the standard rollback, the rollback failed, and it escalated to my phone with the specific error and a proposed fix. I approved from bed. Total time I was involved: four minutes. Six months ago this would have been a 45-minute incident.
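For readers who want the shape of that loop, here is a minimal sketch of the escalate-on-failed-rollback pattern. It is an illustration under assumptions, not AirPLAi's actual job: every name in it (watch_deploy, Escalation, and the four callables) is hypothetical, and a real version would likely re-check health after a successful rollback.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Escalation:
    """What lands on the human's phone: the error plus a suggested next step."""
    error: str
    proposed_fix: str


def watch_deploy(
    is_healthy: Callable[[], bool],
    rollback: Callable[[], None],
    propose_fix: Callable[[Exception], str],
    notify: Callable[[Escalation], None],
) -> Optional[Escalation]:
    """Run one health check (hypothetical helper).

    Returns the escalation that was sent, or None if no human was needed.
    """
    if is_healthy():
        return None  # nothing to do, nobody gets woken up
    try:
        rollback()  # first line of defense: the standard rollback
    except Exception as exc:
        # The rollback itself failed: escalate with specifics, not just "it broke".
        escalation = Escalation(error=str(exc), proposed_fix=propose_fix(exc))
        notify(escalation)
        return escalation
    return None  # rollback ran without raising; handled unattended
```

The design choice worth noticing is that the human is the last step, not the first: the job only escalates after its own remedy has failed, and it arrives with an error and a proposal rather than a bare alert.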
What surprised me
I expected the gain to come from doing the thing I would have done, but faster. That part is real. It isn't the interesting part.
The interesting part is what the system does to my attention. When routine monitoring, first-pass research, and cross-team coordination get handled by the fleet, what's left for the humans is a different class of problem. Customer taste. Fundraising narrative. The judgment calls that only matter if you actually know our situation.
What does it even mean when the lowest-leverage work in your day stops happening because something else picked it up? I'm still working that one out.

How it fails
The third surprise is how the system fails. It fails like a human org fails. Two agents claim the same task. A brief that was technically correct misses the real question because the agent lacked relationship context. Two parts of the system solve the same problem from different angles.
These aren't code bugs. They're organizational design problems. With the same fixes: clearer scope, better handoff contracts, more explicit context-sharing.
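To make "clearer scope" and "handoff contracts" concrete, here is one way they could look in code. This is a sketch under assumptions, not a description of our system: TaskBoard, Handoff, and the agent names are all hypothetical. The idea is that a task has exactly one owner at a time, and ownership only moves if the handoff carries context.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Handoff:
    """A hypothetical handoff contract: who gives what to whom, and with what context."""
    task_id: str
    from_agent: str
    to_agent: str
    context: List[str]  # notes the receiver needs, e.g. relationship history


class TaskBoard:
    """Single source of truth for task ownership (illustrative only)."""

    def __init__(self) -> None:
        self._owners: Dict[str, str] = {}

    def claim(self, task_id: str, agent: str) -> bool:
        # setdefault records the first claimant and returns the current owner,
        # so a second agent claiming the same task gets False.
        return self._owners.setdefault(task_id, agent) == agent

    def owner_of(self, task_id: str) -> Optional[str]:
        return self._owners.get(task_id)

    def hand_off(self, handoff: Handoff) -> None:
        if self._owners.get(handoff.task_id) != handoff.from_agent:
            raise PermissionError("only the current owner can hand off a task")
        if not handoff.context:
            raise ValueError("a handoff must carry context for the receiver")
        self._owners[handoff.task_id] = handoff.to_agent
```

The first rule addresses two agents claiming the same task. The second addresses the brief that was technically correct but missed the real question: an empty handoff is rejected rather than silently accepted.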
What I'm sitting with
The next horizon is workstream autonomy. Most agents today are good at bounded tasks. Research this. Write that. Watch this signal. The next level is an agent claiming a full workstream end-to-end, unexpected cases included, without pulling me in.
The eval I'd love to pass: an agent takes a new pilot from first email to live dashboard, no founder touches, thumbs-up from the customer.
We have pieces. We don't have the loop. The honest version of this post in six months might read very differently.
What I'm not sure about
The bet isn't the agents themselves. It's the operations OS we're building underneath them.
Or maybe the bet is smaller than that and I just haven't named it yet. I keep going back and forth on whether what I'm describing is genuinely new infrastructure or an elaborate workaround for a team that should hire a sixth person.
If you're hitting the same walls, I want to talk.