There’s a specific kind of silence that happens when your terminal cursor blinks… and you realize you’re connected to production.
I didn’t actually wipe the server.
But I was one Enter key away.
Here’s what happened — and the small habits I changed that probably saved me (and might save you too).
The Setup: A “Quick Fix” at 1:17 AM
Classic story:
- Minor config bug
- “It’ll take 2 minutes”
- SSH into the server
- Fix a path
- Restart the service
- Go to sleep
Except I had two terminals open:
- One connected to staging
- One connected to production
They looked almost identical.
Same prompt.
Same theme.
Same everything.
I typed:
rm -rf dist
And then I noticed the hostname.
Production.
Why This Was Worse Than It Sounds
That project didn’t just store build artifacts in dist/.
It also had runtime-generated files.
And since I was inside the app directory rather than the filesystem root, rm -rf dist wouldn't have nuked the whole machine. But it would absolutely have broken live traffic.
It would have:
- Deleted compiled output
- Broken the running service the next time it restarted
- Forced an emergency redeploy
All because I didn’t slow down for 3 seconds.
The Real Problem Wasn’t the Command
It was this:
- Production access was too easy
- The environments looked too similar
- Destructive commands had no friction
- There were no guardrails
It wasn’t a technical issue.
It was a design flaw in my workflow.
What I Changed Immediately
1️⃣ Aggressively Different Shell Prompts
Production is now bright red in my terminal.
Using .bashrc:
export PS1="\[\e[1;31m\][PROD] \u@\h:\w$ \[\e[0m\]"
If I SSH into prod, it screams at me.
No more guessing.
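If your dotfiles are shared across machines, one way to get this without maintaining separate files is to key the colour off the hostname. A rough sketch, assuming production hostnames contain "prod":
# In .bashrc: red [PROD] prompt only on hosts whose name contains "prod"
if [[ "$(hostname)" == *prod* ]]; then
  export PS1="\[\e[1;31m\][PROD] \u@\h:\w$ \[\e[0m\]"
else
  export PS1="\[\e[1;32m\]\u@\h:\w$ \[\e[0m\]"
fi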
2️⃣ Alias Protection for Dangerous Commands
I added interactive flags to destructive commands:
alias rm='rm -i'
alias mv='mv -i'
alias cp='cp -i'
Now I get prompted before deleting. (One caveat: an explicit -f still overrides -i, so the alias alone wouldn't have caught my rm -rf; see the guard sketched below.)
Is it slightly annoying?
Yes.
Is it better than downtime? Also yes.
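For the rm -rf case specifically, a small shell function adds friction that -i can't. This is only a sketch, and the "prod" hostname check is an assumption about your naming scheme:
# Demand explicit confirmation before any rm on hosts matching "prod",
# even when -f is passed. Everywhere else it falls through to the real rm.
rm() {
  if [[ "$(hostname)" == *prod* ]]; then
    read -r -p "You are on $(hostname). Type the hostname to run rm: " answer
    [[ "$answer" == "$(hostname)" ]] || { echo "Aborted." >&2; return 1; }
  fi
  command rm "$@"
}
Typing the hostname is deliberately tedious. That's the point: a few seconds of friction, only on the box where it matters.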
3️⃣ Production Is Now Mostly Immutable
Instead of:
- SSH → change files → restart
I moved to:
- Build locally or in CI
- Deploy artifact
- Restart via process manager
No editing files on prod. Ever.
If I need to “fix something quickly,” I fix it in code and redeploy.
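As a rough sketch of what that flow can look like (the host, path, and service name here are placeholders, not my real setup):
#!/usr/bin/env bash
# deploy.sh: build, ship the artifact, restart via the process manager.
# No interactive editing on the server.
set -euo pipefail
HOST="deploy@prod.example.com"   # placeholder
APP_DIR="/srv/myapp"             # placeholder
npm run build                    # or whatever your build/CI step is
rsync -az --delete dist/ "${HOST}:${APP_DIR}/dist/"
ssh "$HOST" "sudo systemctl restart myapp"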
4️⃣ Reduced Direct SSH Access
Now:
- Logs are centralized
- Restarts are script-driven
- Environment variables are managed via config
If I SSH, something is already very wrong.
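Even the restart piece is just a tiny wrapper, so routine bounces never need an interactive prod shell. Again a sketch; the hostnames and service name are made up:
#!/usr/bin/env bash
# restart.sh: script-driven restarts instead of ad-hoc SSH sessions.
set -euo pipefail
case "${1:-}" in
  staging) HOST="deploy@staging.example.com" ;;
  prod)    HOST="deploy@prod.example.com" ;;
  *) echo "Usage: $0 staging|prod" >&2; exit 1 ;;
esac
ssh "$HOST" "sudo systemctl restart myapp && systemctl is-active myapp"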
The Psychological Lesson
Most production disasters don’t come from incompetence.
They come from:
- Fatigue
- Familiarity
- Repetition
- “Just this once”
The most dangerous command isn’t rm -rf.
It’s confidence.
The Quiet Truth About DevOps
We talk about:
- Kubernetes
- Blue-green deploys
- Zero-downtime rollouts
But the simplest protection is often:
Make dangerous actions slightly harder.
Friction is underrated.
Conclusion / Key takeaway
I didn’t delete production.
But that near-miss changed how I treat access, commands, and environments forever.
The scariest production story isn’t the one where everything breaks.
It’s the one where you realize how easily it could have.
What’s the closest you’ve come to breaking production — and what guardrail did you add afterward?