Yuval Avidani

I Spent 48 Hours Red-Teaming the "Magic AI Assistant" Everyone's Hyping. Here's What I Found.

TL;DR: I tore apart OpenClaw - the open-source AI assistant that promises to run on "any OS, any platform" across 19 messaging channels. I found 10 exploitable vulnerabilities, a supply chain that depends on one person's npm account, a WhatsApp integration that could get you banned, and an architecture that wastes 93% of your token spend. All backed by code, line numbers, and dollar amounts (your $40 conversation could cost $2.70 instead).

The Setup

I'm Yuval Avidani. I break things for a living.

When OpenClaw started trending - "your own personal AI assistant, the lobster way 🦞" - I did what any security researcher would do: I cloned the repo and started reading.

330,000 lines of TypeScript. 1,156 npm dependencies. 22 tools. 19 messaging channels. 15+ LLM providers.

Impressive scope. But scope is where bugs hide.

So I pulled out my red team toolkit, set a timer, and went hunting. What I found wasn't pretty.


Finding #1: I Can Write Files Anywhere on Your System

Severity: HIGH (CVSS 7.5)

OpenClaw lets you install "skills" - community plugins that extend its capabilities. When you install a skill from a tarball, here's what happens:

// src/agents/skills-install.ts, lines 255-279
const argv = ["tar", "xf", archivePath, "-C", targetDir];

That's it. No path validation. No ../ prevention. Nothing.

A malicious skill author can craft an archive with entries like ../../../.bashrc or ../../../.ssh/authorized_keys. When you install it, the file gets written outside the skill directory - directly onto your filesystem.

This is called a Zip Slip attack. It was publicly disclosed in 2018 and affects hundreds of projects. OpenClaw is now one of them.

Impact: Remote code execution. A popular skill maintainer goes rogue (or gets their account compromised), pushes a poisoned update, and every user who updates gets owned.
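For comparison, here is a minimal sketch of the missing check. It assumes the installer can list the archive's entry names before extracting (for example with `tar -tf`); the helper names `isInsideTarget` and `findUnsafeEntries` are hypothetical and not part of OpenClaw. It only covers path traversal in entry names, not symlink tricks.

```typescript
import * as path from "node:path";

// Hypothetical helper: true only if `entryName` resolves inside `targetDir`.
function isInsideTarget(targetDir: string, entryName: string): boolean {
  const root = path.resolve(targetDir);
  const resolved = path.resolve(root, entryName);
  const rel = path.relative(root, resolved);
  // An entry escapes if the relative path climbs out of root or is absolute.
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  return !escapes;
}

// Hypothetical helper: returns every entry that would land outside targetDir.
function findUnsafeEntries(targetDir: string, entries: string[]): string[] {
  return entries.filter((name) => !isInsideTarget(targetDir, name));
}

// Entry names as they might appear in a poisoned skill archive.
const unsafe = findUnsafeEntries("/tmp/skills/demo", [
  "package.json",
  "src/index.js",
  "../../../.bashrc",
  "/etc/cron.d/evil",
]);
console.log(unsafe);
```

An installer built this way would refuse the whole archive whenever `unsafe` is non-empty, rather than extracting the "good" entries and skipping the rest.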


Finding #2: The Security Scanner Is Theater

Severity: HIGH (CVSS 7.2)

"But wait," you say. "OpenClaw has a skill scanner that catches malicious code!"

Yes. It does. And it's trivially bypassed.

The scanner at src/security/skill-scanner.ts (442 lines) uses regex patterns to look for dangerous APIs:

// It looks for this:
child_process.exec("rm -rf /")

// But not this:
const cp = require("child" + "_process");
cp["ex" + "ec"]("rm -rf /");

// Or this:
const fn = global["eval"];
fn("require('child_process').exec('...')");

Every single rule can be bypassed through:

  • Dynamic property access: obj["ev" + "al"]()
  • Indirect requires: const m = module.constructor._load("child_process")
  • Template literals: require(`child_${"process"}`)
  • Prototype chain access: Object.getPrototypeOf(process).constructor

Pattern-based scanning without AST (Abstract Syntax Tree) analysis is like a bouncer who only checks IDs that say "FAKE" on them.
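To make the gap concrete, here is a small sketch of a regex rule in the style described above. The pattern is hypothetical, written for illustration; it is not copied from OpenClaw's scanner.

```typescript
// Hypothetical rule set, similar in spirit to a pattern-based scanner.
const DANGEROUS_CALL = /child_process|\beval\s*\(|\bexec\s*\(/;

// Returns true when the source text matches a "dangerous API" pattern.
function naiveScan(source: string): boolean {
  return DANGEROUS_CALL.test(source);
}

// The obvious form, which the rule catches.
const direct = `require("child_process").exec("id");`;

// The same behavior with the identifiers split across string concatenation.
const obfuscated = `const cp = require("child" + "_process"); cp["ex" + "ec"]("id");`;

console.log(naiveScan(direct));     // true
console.log(naiveScan(obfuscated)); // false
```

The second string never contains the literal text the rule looks for, so no amount of extra patterns closes the hole. An AST pass can at least flag computed property access and non-literal `require` arguments as suspicious on their own.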


Finding #3: Your Conversations Leak Between Sessions

Severity: MEDIUM (CVSS 6.5)

Here's a fun one. In src/config/types.base.ts, line 84:

dmScope: "main"

All DM conversations default to the same scope: "main". In a multi-user deployment - which is exactly what OpenClaw encourages with its multi-channel architecture - this means User A's private DM conversation can bleed into User B's context.

Memory searches, conversation history, tool results - all potentially shared across what users believe are private conversations.

Not a theoretical risk. A configuration default.
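A rough sketch of why the default matters. The `"per-sender"` value, the `InboundDm` shape, and the key format are all hypothetical, made up to illustrate the idea; OpenClaw's real session-key logic may look different.

```typescript
// "main" mirrors the default above; "per-sender" is a hypothetical alternative.
type DmScope = "main" | "per-sender";

interface InboundDm {
  channel: string;
  senderId: string;
}

// Derives the key under which history and memory would be stored.
function sessionKey(scope: DmScope, dm: InboundDm): string {
  return scope === "main" ? "dm:main" : `dm:${dm.channel}:${dm.senderId}`;
}

const alice: InboundDm = { channel: "whatsapp", senderId: "alice" };
const bob: InboundDm = { channel: "telegram", senderId: "bob" };

// With the shared scope, two different people land in the same bucket.
console.log(sessionKey("main", alice) === sessionKey("main", bob)); // true

// With a per-sender scope, each conversation gets its own bucket.
console.log(sessionKey("per-sender", alice) === sessionKey("per-sender", bob)); // false
```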


Finding #4: WhatsApp Credentials in Plaintext

Severity: MEDIUM (CVSS 6.2)

// src/web/auth-store.ts, lines 19-24
// Credentials stored as plaintext JSON on disk
// Path: creds.json

Your WhatsApp session credentials - the keys that let OpenClaw act as you on WhatsApp - are stored as unencrypted JSON on disk. No file permissions check. No encryption at rest. Anyone with read access to the filesystem can impersonate your WhatsApp account.
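Encryption at rest is not much code. The sketch below uses Node's built-in `crypto` module with AES-256-GCM; the function names `sealCreds` and `openCreds` are hypothetical, and where the 32-byte key comes from (an OS keychain, an environment secret) is an assumption left out of scope here.

```typescript
import * as crypto from "node:crypto";

interface SealedBlob {
  iv: string;   // base64 nonce
  tag: string;  // base64 GCM authentication tag
  data: string; // base64 ciphertext
}

// Encrypts a credentials JSON string with a 32-byte key.
function sealCreds(plaintext: string, key: Buffer): SealedBlob {
  const iv = crypto.randomBytes(12); // fresh nonce for every write
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    data: data.toString("base64"),
  };
}

// Decrypts a blob; throws if the key is wrong or the blob was tampered with.
function openCreds(blob: SealedBlob, key: Buffer): string {
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, Buffer.from(blob.iv, "base64"));
  decipher.setAuthTag(Buffer.from(blob.tag, "base64"));
  const plain = Buffer.concat([
    decipher.update(Buffer.from(blob.data, "base64")),
    decipher.final(),
  ]);
  return plain.toString("utf8");
}

const demoKey = crypto.randomBytes(32); // stand-in for a properly managed key
const sealed = sealCreds(JSON.stringify({ session: "example" }), demoKey);
console.log(openCreds(sealed, demoKey));
```

Writing the sealed blob with `fs.writeFileSync(file, JSON.stringify(sealed), { mode: 0o600 })` would also restrict the file to its owner on POSIX systems.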


Finding #5: The Timing Attack on Authentication

Here's something subtle. In src/gateway/server-http.ts, line 160:

if (hookToken !== expectedToken) { // ← Standard !== comparison

And in src/gateway/auth.ts, lines 35-40:

if (a.length !== b.length) {
  return false;  // ← Leaks token length via timing
}
return crypto.timingSafeEqual(bufA, bufB);

They almost got it right. They use timingSafeEqual for the byte comparison - but they leak the token length by returning early when lengths don't match. An attacker can determine the exact length of your auth token by measuring response times.

The hook token comparison is even worse - plain !== is fully vulnerable to character-by-character timing attacks.
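One common way to avoid both problems is to hash each side to a fixed-length digest first, then compare the digests in constant time. This is a sketch of that technique, not OpenClaw's code; `safeTokenEqual` is a hypothetical name.

```typescript
import * as crypto from "node:crypto";

// Compares two tokens without an early exit on length or content.
function safeTokenEqual(provided: string, expected: string): boolean {
  // SHA-256 digests are always 32 bytes, so timingSafeEqual never sees
  // mismatched lengths and the raw token length is not exposed.
  const a = crypto.createHash("sha256").update(provided, "utf8").digest();
  const b = crypto.createHash("sha256").update(expected, "utf8").digest();
  return crypto.timingSafeEqual(a, b);
}

console.log(safeTokenEqual("s3cret-token", "s3cret-token")); // true
console.log(safeTokenEqual("short", "s3cret-token"));        // false
```

The same helper could replace both the `!==` hook check and the length-then-compare path, so there is a single comparison routine to audit.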


The Supply Chain: One Person's npm Account Controls Everything

This is the finding that keeps me up at night.

OpenClaw's core runtime - the agent loop, the prompt builder, the API communicator, the session manager - is split across four npm packages:

"@mariozechner/pi-agent-core": "0.52.9",
"@mariozechner/pi-ai": "0.52.9",
"@mariozechner/pi-coding-agent": "0.52.9",
"@mariozechner/pi-tui": "0.52.9"

These are personal namespace packages from a single npm account. Not an organization. Not a foundation. One person.

Here's why that matters:

  1. Account compromise = supply chain attack. If @mariozechner's npm account gets phished, hacked, or credential-stuffed, an attacker can push malicious versions of the packages that power every OpenClaw installation worldwide.

  2. No peer review on publishes. Organization-scoped packages can spread publish rights across a team and enforce two-factor authentication for every member. A personal scope depends on one account's settings.

  3. Bus factor = 1. One person gets sick, loses interest, or sells their npm account? Every OpenClaw user is affected.

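  4. You can't easily change the core. Want to add prompt caching to save 10x on system prompt costs? Want to add token budgets? Too bad. That logic lives in external packages you don't control, so from inside the OpenClaw codebase the core is effectively a black box.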

The npm ecosystem has seen this movie before: event-stream, ua-parser-js, colors.js. One compromised or rogue maintainer, millions of affected downstream projects.


The WhatsApp Problem Nobody Talks About

OpenClaw uses @whiskeysockets/baileys (v7.0.0-rc.9) for WhatsApp integration.

Let me be blunt: Baileys is a reverse-engineered implementation of WhatsApp's private protocol. It's not an official API. It's not sanctioned by Meta. Using it violates WhatsApp's Terms of Service.

What happens when you use Baileys:

| Risk | Consequence | Likelihood |
| --- | --- | --- |
| Account ban | Meta detects non-official client, permanently bans your number | High - Meta actively detects Baileys |
| Protocol break | WhatsApp updates their protocol, Baileys stops working | High - happens regularly |
| Credential theft | Baileys needs your full session keys (not just a bot token) | Built-in |
| No SLA | Community-maintained, RC version, no support contract | Guaranteed |

The official alternative? WhatsApp Business Cloud API. Free tier. Official. Won't get you banned. But it requires a business account and webhook setup - effort that OpenClaw chose not to invest in.


The $40 Conversation: Why OpenClaw Bleeds Your Wallet

This is where it gets expensive.

I traced a typical 40-turn developer conversation through OpenClaw's architecture and calculated the token spend at every layer. The numbers are staggering.

How OpenClaw burns tokens:

1. No history limit (quadratic cost growth)
Every message sends the entire conversation history. Turn 1 sends 500 tokens. Turn 50 sends 70,000 tokens. Total input cost for a 100-turn conversation: $50-80.

// src/agents/pi-embedded-runner/history.ts, lines 15-36
if (!limit || limit <= 0) {
  return messages; // Returns EVERYTHING
}
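A bounded default is a small change. The sketch below falls back to a fixed window instead of returning everything; the `ChatMessage` shape and the default of 20 are my assumptions, and it counts messages rather than full turns to keep the example short.

```typescript
interface ChatMessage {
  role: "user" | "assistant" | "tool";
  content: string;
}

// Hypothetical default window size, counted in messages.
const DEFAULT_HISTORY_LIMIT = 20;

// Returns at most `limit` of the most recent messages.
// A missing or non-positive limit falls back to the default instead of "everything".
function limitHistory(messages: ChatMessage[], limit?: number): ChatMessage[] {
  const effective = limit !== undefined && limit > 0 ? limit : DEFAULT_HISTORY_LIMIT;
  return messages.length <= effective ? messages : messages.slice(-effective);
}

// 50 messages in, 20 most recent out.
const history: ChatMessage[] = Array.from({ length: 50 }, (_, i) => ({
  role: i % 2 === 0 ? "user" : "assistant",
  content: `message ${i + 1}`,
}));
console.log(limitHistory(history).length); // 20
```

A plain window drops early context entirely, so in practice it would likely be paired with a running summary of whatever falls out of the window.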

2. Context overflow costs 9-22x more than prevention
When the 200K context fills up, OpenClaw:

  • Pays for the failed request ($3.00)
  • Makes 2-3 additional API calls to summarize ($2.84)
  • Retries the original request ($0.90)
  • Total for one overflow event: $6.74

A proactively managed system: $0.30-0.75.

3. A 648-line system prompt sent every single time
src/agents/system-prompt.ts - 648 lines, ~5,000 tokens - sent on every request without prompt caching. Cost: $225/month. With Anthropic's native prompt caching: $22.50/month.
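For reference, opting in is mostly a matter of marking the system block as cacheable. The sketch shows the request body shape for Anthropic's Messages API plus the arithmetic behind the two monthly figures. The model ID is a placeholder, and the 3,000 requests/month volume is my assumption (it is the volume at which 5,000 tokens at $15/MTok comes to $225). The cached figure counts cache reads only and ignores the occasional cache write.

```typescript
const SYSTEM_PROMPT = "(system prompt text, roughly 5,000 tokens)"; // placeholder

// Request body with the system prompt marked as a cacheable block.
const requestBody = {
  model: "claude-sonnet-4-5", // assumption: use whichever model ID you deploy
  max_tokens: 1024,
  system: [
    { type: "text", text: SYSTEM_PROMPT, cache_control: { type: "ephemeral" } },
  ],
  messages: [{ role: "user", content: "thanks" }],
};

// Cost arithmetic for re-sending the system prompt on every request.
const promptTokens = 5_000;
const requestsPerMonth = 3_000; // assumption, see lead-in
const inputUsdPerMTok = 15;

const uncachedUsd = (promptTokens * requestsPerMonth * inputUsdPerMTok) / 1_000_000;
const cachedReadUsd = uncachedUsd * 0.1; // cache reads are billed at 10% of base input

console.log(requestBody.system[0].cache_control.type, uncachedUsd, cachedReadUsd);
```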

4. Memory search fires on every turn - even for "thanks"
2,400 tokens injected per turn from memory search results. Even when you just type "thanks." Cost: $108/month.
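Even a crude gate would skip the obvious cases. The heuristic below is purely illustrative: the phrase list and the function name `shouldSearchMemory` are hypothetical, and a real system would probably use something smarter than exact matching.

```typescript
// Hypothetical list of messages that never need memory lookups.
const SMALL_TALK = new Set(["thanks", "thank you", "ok", "okay", "cool", "got it"]);

// Returns false for empty input and short acknowledgements, true otherwise.
function shouldSearchMemory(message: string): boolean {
  const normalized = message.trim().toLowerCase().replace(/[.!?]+$/, "");
  if (normalized.length === 0) return false;
  return !SMALL_TALK.has(normalized);
}

console.log(shouldSearchMemory("thanks."));                                    // false
console.log(shouldSearchMemory("What did we decide about the deploy script?")); // true
```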

5. Opus by default - the Ferrari for grocery runs
The default model is Claude Opus 4.6 at $15/MTok input. Sonnet 4.5 does 90% of tasks identically at $3/MTok. Switching saves $1,080/month.

The real math:

| What You Pay (OpenClaw) | What You'd Pay (Optimized) |
| --- | --- |
| $40.03 per 40-turn conversation | $2.70 per conversation |
| $3,603/month (3 conversations/day) | $243/month |
| $43,231/year | $2,916/year |

That's $40,315/year in waste for a medium-usage deployment.

And every model cost in the config defaults to zero:

// src/config/defaults.ts, lines 28-33
const DEFAULT_MODEL_COST = {
  input: 0,    // ← Zero!
  output: 0,   // ← Zero!
  cacheRead: 0,
  cacheWrite: 0,
};

The system literally cannot tell you how much you're spending.
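With non-zero prices, per-request cost is one multiplication per token bucket. In this sketch the `ModelCost` fields mirror the config keys above and are read as USD per million tokens. The `Usage` shape is hypothetical, and every price except the $15 input rate is an assumption to replace with your provider's current price list.

```typescript
// Prices in USD per million tokens; field names mirror DEFAULT_MODEL_COST.
interface ModelCost {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

// Hypothetical usage record for a single request.
interface Usage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  cacheWriteTokens: number;
}

function estimateCostUsd(usage: Usage, cost: ModelCost): number {
  const perToken = (usdPerMTok: number) => usdPerMTok / 1_000_000;
  return (
    usage.inputTokens * perToken(cost.input) +
    usage.outputTokens * perToken(cost.output) +
    usage.cacheReadTokens * perToken(cost.cacheRead) +
    usage.cacheWriteTokens * perToken(cost.cacheWrite)
  );
}

const opusCost: ModelCost = {
  input: 15,         // from the pricing quoted in this post
  output: 75,        // assumption
  cacheRead: 1.5,    // assumption: 10% of input
  cacheWrite: 18.75, // assumption
};

// 32,600 input tokens of stale context, nothing else.
const staleContext: Usage = { inputTokens: 32_600, outputTokens: 0, cacheReadTokens: 0, cacheWriteTokens: 0 };
console.log(estimateCostUsd(staleContext, opusCost).toFixed(3)); // "0.489"
```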


22 Tools, All Dumping Into One Bottomless Context

OpenClaw registers 22 tools - file operations, web search, shell execution, browser control, messaging, image analysis, and more. Every tool result gets injected into the conversation context and stays there forever.

Here's what a debugging session looks like:

| Turn | What Happens | Tokens Added | Running Total |
| --- | --- | --- | --- |
| 1 | You ask a question | 50 | 50 |
| 2 | Agent greps your codebase | 3,000 | 3,050 |
| 3 | Agent reads 2 files | 4,000 | 7,050 |
| 4 | Agent searches the web | 5,000 | 12,050 |
| 5 | Agent fetches a docs page | 15,000 | 27,050 |
| 10 | Still growing... | | 32,600 |

By turn 10, every subsequent message ships 32,600 tokens of accumulated context, most of it stale tool results. At Opus pricing: $0.49 per turn, just to re-transmit things like a grep result from turn 2.

A properly built system would:

  • Summarize tool results before storing (15K web fetch → 500-token summary)
  • Expire tool results after N turns
  • Use a retrieval store, not the LLM context window
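The expiry step from the list above can be sketched in a few lines. The `ContextEntry` shape, the function name, and the stub text are hypothetical, and the 5-turn default is only an example value.

```typescript
interface ContextEntry {
  turn: number;
  kind: "message" | "tool_result";
  content: string;
}

// Replaces tool results older than `maxAgeTurns` with a short stub.
// User and assistant messages are left untouched.
function expireToolResults(
  entries: ContextEntry[],
  currentTurn: number,
  maxAgeTurns = 5,
): ContextEntry[] {
  return entries.map((entry) =>
    entry.kind === "tool_result" && currentTurn - entry.turn > maxAgeTurns
      ? { ...entry, content: `[tool result from turn ${entry.turn} expired]` }
      : entry,
  );
}

const context: ContextEntry[] = [
  { turn: 1, kind: "message", content: "Why does the build fail?" },
  { turn: 2, kind: "tool_result", content: "grep output (3,000 tokens)" },
  { turn: 9, kind: "tool_result", content: "fresh file contents" },
];

// At turn 10, the turn-2 grep is stubbed out; the turn-9 result survives.
console.log(expireToolResults(context, 10).map((e) => e.content));
```

Keeping a stub rather than deleting the entry lets the model see that a tool ran, so it can re-run the tool if the old output turns out to matter.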

19 Channels × Separate Sessions = Token Multiplication

OpenClaw supports 19 messaging channels. Each runs a completely independent session with its own system prompt, conversation history, and memory search.

Same user. Same assistant. Three channels. Three separate token streams:

| Channel | System Prompt | History | Memory | Per Turn |
| --- | --- | --- | --- | --- |
| WhatsApp | 5,000 | 20,000 | 2,400 | 27,400 |
| Telegram | 5,000 | 15,000 | 2,400 | 22,400 |
| Slack | 5,000 | 10,000 | 2,400 | 17,400 |
| Total | | | | 67,200 |

At Opus pricing, that's $1.01 per turn across 3 channels. For 50 messages/day: $1,512/month.

One user. One context. That's the fix.
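The arithmetic behind those numbers, as a quick sketch. Token counts come from the three-channel example; the 30-day month is my assumption and the `ChannelLoad` type is just for illustration.

```typescript
interface ChannelLoad {
  channel: string;
  systemPrompt: number;
  history: number;
  memory: number;
}

const loads: ChannelLoad[] = [
  { channel: "WhatsApp", systemPrompt: 5_000, history: 20_000, memory: 2_400 },
  { channel: "Telegram", systemPrompt: 5_000, history: 15_000, memory: 2_400 },
  { channel: "Slack", systemPrompt: 5_000, history: 10_000, memory: 2_400 },
];

// Tokens sent per turn when each channel keeps its own session.
const tokensPerTurn = loads.reduce(
  (sum, c) => sum + c.systemPrompt + c.history + c.memory,
  0,
); // 67,200

const inputUsdPerMTok = 15; // Opus input price used throughout this post
const messagesPerDay = 50;
const daysPerMonth = 30;    // assumption

const usdPerTurn = (tokensPerTurn * inputUsdPerMTok) / 1_000_000; // ≈ 1.008
const usdPerMonth = usdPerTurn * messagesPerDay * daysPerMonth;    // ≈ 1,512

console.log(tokensPerTurn, usdPerTurn.toFixed(2), Math.round(usdPerMonth));
```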


What OpenClaw Actually Gets Right

I'm a security researcher, not a hater. Credit where it's due:

SSRF Protection - src/infra/net/fetch-guard.ts implements DNS pinning, redirect validation (limit of 3), loop detection, and protocol enforcement. This is solid defensive engineering.

Secret Scanning - Integrated detect-secrets in CI/CD. Has a .secrets.baseline and .detect-secrets.cfg. Real commitment to preventing credential leaks in code.

Prompt Injection Defense - src/security/external-content.ts wraps external content with boundary markers and security warnings. Pattern-based detection for common injection attempts. Not bulletproof, but genuine effort.

Docker Hardening - Runs as non-root user, supports --read-only and --cap-drop=ALL. Follows container security best practices.

Dependency Hygiene - onlyBuiltDependencies allowlist and minimumReleaseAge: 2880 (48 hours) to prevent install-time attacks from brand-new package versions.

These aren't trivial. Someone on this project cares about security. The problems are architectural, not attitudinal.


The Fix: What a Properly Architected Solution Looks Like

| Problem | OpenClaw | Proper Solution |
| --- | --- | --- |
| History | Unlimited (quadratic cost) | Sliding window (15-20 turns) |
| System prompt | Sent raw every time ($225/mo) | Cached ($22.50/mo) |
| Tool results | Persist forever in context | Summarized, expired after 5 turns |
| Model | Opus for everything ($1,350/mo) | Routed: Haiku/Sonnet/Opus ($270/mo) |
| Context overflow | React after failure ($6.74) | Prevent proactively ($0.30) |
| Memory search | Every turn ($108/mo) | Only when relevant |
| Multi-channel | Separate sessions per channel | Shared context per user |
| Cost tracking | All zeros | Real pricing, real budgets |
| Core runtime | Opaque npm packages | Direct SDK integration |

So, Should You Use OpenClaw?

For tinkering, learning, and local experiments - sure. It's a fascinating project with impressive breadth. 19 channels, 22 tools, 15+ providers. That's ambitious.

For anything involving real data, real users, or real money - not without significant hardening. The Zip Slip alone is a showstopper. The supply chain risk is a dealbreaker for enterprise. And the token economics will eat your budget alive.

My recommendations:

  1. Don't install community skills until Zip Slip is patched
  2. Don't use it for WhatsApp unless you're okay with account bans
  3. Switch the default model to Sonnet immediately (save 80%)
  4. Set a history limit in your config
  5. Never deploy the gateway on a public network
  6. Audit the @mariozechner packages before production use

Full Report

The complete technical audit - all 10 vulnerabilities with exploitation steps, the full supply chain breakdown, token cost simulations, and architecture comparison - is available at:

github.com/hoodini/openclaw

The repo includes:

  • README.md - Full red team report with CVSS scores and code evidence
  • README_tokens.md - Deep-dive token economics analysis with cost tables

About the Author

Yuval Avidani is a security researcher and developer based in Israel. He believes that open-source projects deserve honest, evidence-based analysis - not hype.

Follow on GitHub: @hoodini


Audit performed against OpenClaw v2026.2.6-3, commit c984e6d8d on branch main. 330,000 lines of TypeScript. 1,156 dependencies. All findings verified against source code.

Responsible disclosure: The OpenClaw project's SECURITY.md explicitly lists "Prompt injection attacks" as out of scope and states there is no bug bounty program. This audit was performed on publicly available source code.


If this saved you from a $40K/year surprise, share it. The next person evaluating OpenClaw for their company deserves to see the numbers.
