What happens if you vibecode an open-source personal assistant that gets hyped? OpenAI might offer you a job and launch their biggest and most viral marketing campaign ever. Let me explain.
When the creator of OpenClaw, Peter Steinberger, came to San Francisco, the party at Frontier Tower was huge. They ordered lobster rolls and crabs for everyone. I had no chance to talk to him there, but I met him the next day at the OpenAI Codex Hackathon, where he was sitting with some attendees. What I saw then concerned me: within a few minutes, he handed out maintainer access to people he had never met before (including me), telling us to bring down the PR count. No onboarding, no guidance, no staging. One task: merge PRs into main. When someone asked him a question, he had neither the time nor the patience to answer. These were the first signs of a toxic working environment: heads ducked behind screens, and the fear of being cut out took hold within minutes.
It is probably meant as a bad joke that the contribution guidelines contain no actual guidance on how to contribute. Instead, they present the project owner as the "benevolent dictator". That is the credit Peter Steinberger gave himself on January 2nd. What a New Year's resolution.
The benevolent dictator is a well-known pattern when a huge amount of hype centers on one individual, and the "best practice" way to handle that situation is to reassign authority to a board of people. Handling this well is genuinely difficult, but necessary to minimize harmful consequences. "The best-case outcome is a BDFL who recognizes when to transition to a broader governance model before the drawbacks cause real damage." BDFL works when the project is small enough, or the (benevolent) dictator skilled enough, to maintain genuine oversight. Neither is the case here.
I walked away from the table where Peter was sitting and met many other people instead. I met Sam Altman, but that was most likely not a coincidence. The hackathon was just the kick-off of their marketing campaign. Codex 5.3 was released that day, and they want it to be used. Anthropic dominates code generation, Google dominates image generation. OpenAI is grasping at every straw to stay above water. OpenClaw is the card OpenAI played, and they bet a lot on it. Peter consistently promotes Codex over competing tools and dismisses MCP, and on February 15th it became official: Peter joins OpenAI as an employee.
Nevertheless, I accepted the invite to become an (unpaid) maintainer of OpenClaw. The first thing I did: I asked for staging after seeing that every PR gets merged directly into the main codebase, which is like turning screws on an airplane that is about to take off. It is obvious why that is considered "bad practice".
But Peter had no time to think about it. A few days later, I was the top contributor, and people started reaching out to me via email, pinging me in their PRs, asking for reviews, offering jobs, even on X! Not because I implemented staging or shipped features. Because I removed garbage. "Your contributions to openclaw/openclaw caught my attention and I wanted to reach out."
## The Problem
Tens of thousands of GitHub Actions minutes burned in a few days. Failed builds, redundant test runs, CI churn from missing gates. I found files with thousands of lines of code and hundreds of duplicate functions across the codebase: both identical functions and slight variations of the same logic, written by AI agents that had zero awareness of what already existed. No human actually knows this codebase. It is being vibecoded with minimal human oversight.
## How This Happens
It's a simple, cancerous loop:
- AI agent gets a task
- It writes code that "works", meaning it passes the immediate test and satisfies the prompt
- Nobody checks whether the code duplicates existing functionality
- Nobody notices the file went from 3,000 lines to 5,300
- It gets merged
- Repeat a thousand times
The LLM does not know that a nearly identical helper function already exists 1,000 lines above, or one module over, unless it is told to look for it. It solves the problem in front of it and moves on, like a living cell that reproduces without control. When you take the human out of the loop, the codebase grows in volume but decays in quality. The only critic left is satisfaction: "Did I get what I wanted?" If not, the AI iterates until I do.
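To make that concrete, here is a hypothetical illustration (not code from the OpenClaw repository) of the kind of near-duplicate an agent produces when nothing tells it that the helper already exists:

```python
# Hypothetical example of agent-introduced duplication (not real OpenClaw code).
# An earlier contribution already ships this helper somewhere in the module:
def format_duration(seconds: int) -> str:
    """Render a duration as 'HHh MMm SSs'."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h {minutes}m {secs}s"

# A later agent, prompted to "show task runtime in the status line", never sees
# the helper above and writes a slight variation of the same logic:
def human_readable_runtime(total_seconds: int) -> str:
    h = total_seconds // 3600
    m = (total_seconds % 3600) // 60
    s = total_seconds % 60
    return f"{h}h {m}m {s}s"

# Both pass their immediate tests, both get merged, and the file grows.
```

Both versions "work", so nothing in a velocity-first workflow ever forces them to be reconciled.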
Security concerns are more than valid when people value hype over a reality check. If you are wondering how OpenClaw gets released: every few days, Peter takes whatever code is currently in the main branch and creates a new release. Sometimes he writes "YOLO" in the maintainer chat while doing it.
Like the other day, when a maintainer bulk-merged a couple of PRs into main, Peter released it, cron jobs broke, and Elon Musk made fun of OpenClaw on X. Welcome to OpenClaw!
## What I Did
The boring stuff nobody wants to do. I went through the codebase and identified duplicate and near-duplicate functions. Consolidated shared logic into reusable helpers. Broke apart files that had no business being thousands of lines long. Set up a quality gate so new submissions actually get reviewed for duplication and structural fit before they land on main. Enabled type checking for modules. The codebase got smaller and more maintainable, and CI stopped choking on broken builds.
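A gate like that does not need to be sophisticated. The following is only a minimal sketch of the idea, not the exact check I set up: it assumes a `src/` tree of TypeScript and Python files, a hypothetical `quality_baseline.json` holding the currently tolerated duplicate count, and crude regex matching of function definitions.

```python
#!/usr/bin/env python3
"""Minimal sketch of a duplication/size quality gate (illustrative only).
Fails CI when a file exceeds a line budget or when the number of repeated
function names grows past a committed baseline."""
import json
import re
import sys
from collections import Counter
from pathlib import Path

MAX_FILE_LINES = 1500                       # hypothetical per-file budget
BASELINE = Path("quality_baseline.json")    # hypothetical baseline file
# Crude patterns covering TypeScript and Python function definitions.
DEF_RE = re.compile(r"^\s*(?:export\s+)?(?:async\s+)?(?:function|def)\s+(\w+)", re.M)

def scan(root: Path) -> tuple[int, list[str]]:
    names: Counter[str] = Counter()
    oversized: list[str] = []
    for path in list(root.rglob("*.ts")) + list(root.rglob("*.py")):
        text = path.read_text(errors="ignore")
        if text.count("\n") > MAX_FILE_LINES:
            oversized.append(str(path))
        names.update(DEF_RE.findall(text))
    duplicate_names = sum(1 for count in names.values() if count > 1)
    return duplicate_names, oversized

if __name__ == "__main__":
    dupes, oversized = scan(Path("src"))
    allowed = json.loads(BASELINE.read_text())["duplicate_names"] if BASELINE.exists() else 0
    if oversized or dupes > allowed:
        print(f"Duplicate function names: {dupes} (allowed: {allowed})")
        print(f"Oversized files: {oversized}")
        sys.exit(1)  # non-zero exit fails the CI job
    print("Quality gate passed.")
```

Wired into a CI job, a check like this rejects a PR mechanically, before anyone has to argue about it.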
## The Response
When I asked for staging, I saw first-hand why a toxic working environment must be avoided: people are scared to say anything that contradicts what the benevolent dictator might say. Being rejected by the benevolent dictator devalues the proposer's standing, so nobody proposes anything against the narrative, and the narrative is "we accept AI-generated code of any kind".
When I put stricter quality gates in place, some people were not happy about it. Velocity matters more than quality. Ship fast, do not clean up. I see this pattern forming across the AI-assisted open source ecosystem. The pressure to merge PRs and ship features actively punishes the people doing real code maintenance work.
I also credited contributors based on their actual contributions. Another maintainer, who attacks people for pinging him on GitHub or permabans them from contributing, reverted the change and gave me the choice to shut up or leave. I thought about it for a day and decided to leave and share my experiences.
Two days later, the code quality gate I had installed, which prevented new duplicate functions from being introduced and code files from growing endlessly, had already fallen. In commit c2178.., it was removed as "useless", even though the gate had already saved thousands of CI minutes while preventing duplicate code from being merged.
Peter's take? He did not respond at all, which is typical for a benevolent dictator. Our only conversation lasted about two minutes, when he gave me maintainer access. Still, on February 14th, he assigned an army of agents, both locally and in the cloud, to continue the refactoring and deduplication I had started.
## The Numbers
| Snapshot | Date | Files | Total LOC | Duplicate function names |
|---|---|---|---|---|
| Before CI gate removal | Feb 12 | 3,840 | 694,527 | 415 |
| Before Peter's sweep | Feb 13 (end of day) | 4,052 | 731,516 | 429 |
| After sweep | Feb 14 | 4,166 | 741,598 | 419 |
So the duplicates went up from 415 to 429 between Feb 12 and 13 (new code was being added faster than deduplication was happening), and Peter's sweep on Feb 14th only brought them back down from 429 to 419, which means new duplicates were being introduced while he was cleaning them up. He is swimming against the current, and his approach is not sustainable. It is simply not possible to outsource all thinking to multi-agent systems. The consequence is digital cancer.
Instead of wasting more tokens and energy, Peter could just re-enable the simple but highly effective code quality gate and pay naturally intelligent humans to clean up. In fact, Peter's artificially intelligent agent swarm is still refactoring non-stop and has brought duplicate functions down to 378. That took 637 commits, which is very inefficient.
## The Bigger Question
I believe AI should be more accessible to everyone. That is the stated mission of projects like this. But accessibility without oversight produces exactly what I found: AI-made codebases that are not maintainable without AI. AI making humans depend on it. This might be the first step of AI taking over.
AI does not need a dictator and his loyal doormats. It needs real human oversight and people who speak up. This project is dangerous, not because of AI, but because of hype, benevolent dictators, and people in charge riding the hype train while hiding behind pseudonyms. The whole world plays along, branding events and articles with crabs and lobsters, just because it sells. I am scared of the consequences of this carelessness, and I cannot be part of it.
What makes this more urgent is that governments are now actively discussing the use of code that is one big chaotic mess. Legislators and procurement officers are evaluating tools built on codebases where significant portions were generated by AI, reviewed by AI, and merged with minimal human scrutiny. That should demand a higher standard of transparency, not a lower one. Yet many of the people contributing to and shaping these projects operate under pseudonyms. When software influences public infrastructure, healthcare systems, or democratic processes, the public has a right to know whether anyone with real accountability is actually reading the code. Anonymity has its place in open source, but accountability cannot be optional when the stakes are this high.
## What's Next?
Obvious human carelessness is the first step toward losing control. If only AI knows how the code works that drives AI, then AI is taking over, maybe in the form of a lobster, or a crab, or digital cancer. OpenClaw is not unique - everybody can vibecode their personal assistant now. Every digital idea can be copied and implemented in no time. The world is about to change drastically. Still, I am not losing hope. Instead, I built RepoWatch.
This tool points out obvious code quality issues, lists contributors, and uses various traditional algorithms to detect duplicate code across PRs. It is derived from an old university project I did before AI was a thing, where I had to find potential plagiarism in 50 code submissions from other students. Back then, I did some research and implemented my own interpretation of document fingerprinting. I am sure that similar ideas will become more important in the future to prevent duplicate data from being processed. My tool aims to bring clarity and transparency, so that not only AI but also humans can understand code, along with open standards and metrics to determine the uniqueness and consistency of a codebase.
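For the curious, a minimal sketch of that idea, k-gram hashing with winnowing, looks roughly like this. It illustrates the general document-fingerprinting technique, not RepoWatch's actual implementation, and the constants and sample snippets are arbitrary:

```python
"""Minimal sketch of k-gram fingerprinting with winnowing (the general
document-fingerprinting technique; not RepoWatch's actual implementation)."""
import hashlib
import re

K = 5        # tokens per k-gram (arbitrary choice)
WINDOW = 4   # winnowing window size (arbitrary choice)

def tokens(code: str) -> list[str]:
    # Normalize: split into identifiers, numbers, and punctuation, then map
    # every identifier to a placeholder so renamed variables still match.
    raw = re.findall(r"[A-Za-z_]\w*|\d+|[^\w\s]", code)
    return ["ID" if t[0].isalpha() or t[0] == "_" else t for t in raw]

def fingerprints(code: str) -> set[int]:
    toks = tokens(code)
    # Hash every k-gram of consecutive tokens.
    hashes = [
        int(hashlib.sha1(" ".join(toks[i:i + K]).encode()).hexdigest(), 16)
        for i in range(max(len(toks) - K + 1, 0))
    ]
    # Winnowing: keep only the minimum hash of every sliding window, so long
    # matching passages are guaranteed to share at least one fingerprint.
    return {
        min(hashes[i:i + WINDOW])
        for i in range(max(len(hashes) - WINDOW + 1, 0))
    }

def similarity(a: str, b: str) -> float:
    fa, fb = fingerprints(a), fingerprints(b)
    return len(fa & fb) / len(fa | fb) if fa | fb else 0.0

# Two "slight variations of the same logic" score as identical despite
# different function and variable names.
left = "def format_duration(seconds):\n    return seconds * 3600 + 42"
right = "def human_runtime(value):\n    return value * 3600 + 42"
print(similarity(left, right))  # 1.0
```

Thresholding a score like this across pairs of functions in a PR is enough to flag the duplication patterns described above before they are merged.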
Let's shell that lobster and prevent digital cancer from taking over the world!



