Introduction
Anyone who has worked in an enterprise environment can tell you that the real constraint is operational attention. This guide focuses on patterns that reduce that load: tight scope, safe defaults, automated checks, and predictable maintenance. The examples that follow are for illustration only.
Scope, Ownership, and Enterprise Criteria
Define criteria up front.
- SLOs and blast radius
  - Set a basic SLO (e.g., 99.5% monthly availability) and a max acceptable data loss (RPO) / recovery time (RTO).
  - Identify the blast radius: which teams, which workflows, what happens if it’s down.
- Non-negotiables
  - Authentication + authorization (no shared logins).
  - Audit trail for sensitive actions.
  - Backups + restore test.
  - Observability: logs + metrics + error reporting.
  - CI checks and repeatable deploys.
- Explicit ownership
  - Write down: who approves access, who onboards users, who rotates secrets, who can disable the tool.
  - If it’s you, automate responsibly.
Pitfall: internal tools often become “critical” without being treated as such. Add a banner in the README: support hours, escalation path, and what to do if it breaks.
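To make the SLO bullet concrete, here is a minimal sketch that turns an availability target into a downtime budget. The function names are made up for this post, and the 30-day month is an assumption; plug in whatever window your org reports on.

```typescript
// Convert an availability SLO into an allowed-downtime budget.
// Both function names are illustrative, not from any library.
function downtimeBudgetMinutes(sloPercent: number, periodDays: number): number {
  if (sloPercent <= 0 || sloPercent > 100) {
    throw new RangeError("sloPercent must be in (0, 100]");
  }
  const totalMinutes = periodDays * 24 * 60;
  return totalMinutes * (1 - sloPercent / 100);
}

// Fraction of the budget still unspent after some observed downtime (0..1).
function budgetRemaining(budgetMinutes: number, downtimeMinutes: number): number {
  if (budgetMinutes === 0) {
    return downtimeMinutes === 0 ? 1 : 0;
  }
  return Math.max(0, 1 - downtimeMinutes / budgetMinutes);
}

// 99.5% over an assumed 30-day month
const monthlyBudget = downtimeBudgetMinutes(99.5, 30);
console.log(`${monthlyBudget.toFixed(0)} minutes of downtime allowed per 30 days`);
```

At 99.5% that works out to 216 minutes (3.6 hours) per 30 days, which is a useful number to put next to your RTO: one slow restore can spend the whole month's budget.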
Architecture and Operations That Scale Down (Solo Friendly)
Optimize for simplicity under change.
- Choose boring building blocks
  - One service, one database, one deployment target.
  - Prefer managed services for auth, DB, and secrets if your org provides them.
- Data safety
  - Use migrations, constraints, and idempotent writes.
  - Add “dry run” modes for destructive operations.
  - Prefer append only audit tables for critical workflows.
- Security baseline
  - SSO/OIDC if possible; enforce MFA and short-lived sessions.
  - RBAC: start with minimum roles (viewer/operator/admin).
  - Least privilege for service accounts; separate read vs write credentials.
  - CSRF protection for browser apps; strict CORS; secure cookies.
- Deployability
  - Single command deploy (CI does it).
  - Blue/green or rolling deploy if supported; otherwise maintenance window + fast rollback.
  - Feature flags for risky changes.
- Observability
  - Structured logs with request_id/user_id.
  - Metrics: request rate, latency, error rate, job failures, DB errors.
  - Alert on symptoms (5xx rate, job backlog), not on every exception.
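The “dry run” bullet under data safety is the cheapest of these to adopt, so here is one way it might look. This is a sketch under assumptions: `StoredRow`, `purgeStale`, and the in-memory array are hypothetical stand-ins for your real table and delete query. The one design choice worth copying is that dry run is the default, so a destructive run has to be asked for explicitly.

```typescript
// Hypothetical row shape; a real tool would query its database instead.
interface StoredRow {
  id: string;
  lastSeen: Date;
}

interface PurgeResult {
  dryRun: boolean;
  wouldDelete: string[]; // ids matched by the filter
  deleted: string[];     // ids actually removed (empty in dry-run mode)
}

// Removes rows older than `cutoff`. Defaults to dry run, so the caller
// must pass { dryRun: false } to actually delete anything.
function purgeStale(
  rows: StoredRow[],
  cutoff: Date,
  opts: { dryRun?: boolean } = {}
): PurgeResult {
  const dryRun = opts.dryRun ?? true;
  const isStale = (r: StoredRow) => r.lastSeen.getTime() < cutoff.getTime();
  const wouldDelete = rows.filter(isStale).map((r) => r.id);

  if (dryRun) {
    return { dryRun, wouldDelete, deleted: [] };
  }

  // Mutate the store in place: keep only the rows that are not stale.
  const keep = rows.filter((r) => !isStale(r));
  rows.length = 0;
  rows.push(...keep);
  return { dryRun, wouldDelete, deleted: wouldDelete };
}
```

Run it once without options, read `wouldDelete`, then rerun with `{ dryRun: false }` once the list looks right.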
Example 1: CI Gate for a Solo Maintained Tool
A minimal CI pipeline that prevents the most common solo dev regressions: failing tests, broken migrations, lint drift, and missing env config.
Step 1: Add a GitHub Actions workflow (.github/workflows/ci.yml)
name: ci
on:
  pull_request:
  push:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16
        env:
          # Throwaway CI credentials; they must match DATABASE_URL and the health check below
          POSTGRES_USER: app
          POSTGRES_PASSWORD: app
          POSTGRES_DB: app_test
        ports:
          - 5432:5432
        options: >-
          --health-cmd="pg_isready -U app -d app_test"
          --health-interval=5s
          --health-timeout=5s
          --health-retries=10
    env:
      DATABASE_URL: postgresql://app:app@localhost:5432/app_test
      NODE_ENV: test
    steps:
      - uses: actions/checkout@v4
      - name: Use Node
        uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
      - name: Install
        run: npm ci
      - name: Lint
        run: npm run lint
      - name: Typecheck
        run: npm run typecheck
      - name: Migrate (smoke test)
        run: npm run db:migrate
      - name: Test
        run: npm test -- --runInBand
      - name: Build
        run: npm run build
Step 2: Run the same checks locally
npm ci && npm run lint && npm run typecheck && npm run db:migrate && npm test && npm run build
Expected Output
> lint
✔ no issues found
> typecheck
✔ 0 errors
> db:migrate
Applied 3 migrations
> test
PASS 42 tests
> build
Build completed successfully
Notes:
- The migration smoke test catches “works locally” schema drift early.
- If you can’t run DB in CI, at least validate migrations compile and run against a disposable container in a nightly job.
- Keep the pipeline under ~10 minutes; long CI trains solo devs to bypass it.
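The intro to this example lists missing env config as a common regression, but nothing in the workflow checks for it directly. One possible gap-filler is a tiny startup check like the sketch below. `REQUIRED_VARS`, `missingEnvVars`, and `assertEnv` are names invented for this post; the two variables listed are simply the ones the workflow sets.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: these are the variables the app cannot start without.
// The list mirrors the env block in the workflow; extend it for your app.
const REQUIRED_VARS = ["DATABASE_URL", "NODE_ENV"];

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env: Env, required: string[]): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Fails fast with one message listing everything that is missing,
// instead of crashing later on the first undefined lookup.
function assertEnv(env: Env, required: string[]): void {
  const missing = missingEnvVars(env, required);
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
}
```

In a Node app you would call `assertEnv(process.env, REQUIRED_VARS)` at startup, or from a small script that runs as its own CI step before the migration.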
Example 2: Structured Logging + Request Correlation (Node/Express)
Make debugging cheap: every log line should tell you who did what, where, and why it failed.
Add request_id and structured logs
import { randomUUID } from "node:crypto";
import express from "express";
import pino from "pino";
import pinoHttp from "pino-http";

const app = express();
const logger = pino({ level: process.env.LOG_LEVEL ?? "info" });

app.use(
  pinoHttp({
    logger,
    // Reuse the caller's request id if present, otherwise mint one,
    // and echo it back so support can correlate logs with a report.
    genReqId: (req, res) => {
      const id = (req.headers["x-request-id"] as string) ?? randomUUID();
      res.setHeader("x-request-id", id);
      return id;
    },
    customProps: (req) => ({
      user_id: (req as any).user?.id ?? null, // set after auth middleware
    }),
    // Strip secrets from every log line
    redact: {
      paths: ["req.headers.authorization", "req.body.password", "req.body.token"],
      remove: true,
    },
  })
);

app.get("/healthz", (_req, res) => res.status(200).send("ok"));

app.post("/admin/reindex", async (req, res) => {
  req.log.info({ action: "reindex_start" }, "admin action");
  // ... do work
  req.log.info({ action: "reindex_done" }, "admin action complete");
  res.json({ ok: true });
});

// Express treats a four-argument middleware as the error handler
app.use((err: any, req: any, res: any, _next: any) => {
  req.log.error({ err }, "request failed");
  res.status(500).json({ error: "internal_error", request_id: req.id });
});
Output
{"level":30,"time":...,"req":{"id":"9c...","method":"POST","url":"/admin/reindex"},"user_id":"u_123","action":"reindex_start","msg":"admin action"}
{"level":30,"time":...,"req":{"id":"9c...","method":"POST","url":"/admin/reindex"},"user_id":"u_123","action":"reindex_done","msg":"admin action complete"}
Notes:
- Always return request_id to the caller; it’s your fastest support loop.
- Redact secrets at the logger level; don’t rely on developer discipline.
- Add an audit table for admin actions; logs are not an audit trail.
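The last note is the one people skip, so here is a rough sketch of what an append-only audit record could look like. Everything in it is an assumption for illustration: the `AuditEvent` fields, the `AuditLog` class, and the in-memory array standing in for a real table. In a database you would get the same property with an insert-only table and no update or delete grants for the app's credentials.

```typescript
// Hypothetical audit record; field names are illustrative.
interface AuditEvent {
  readonly id: number;
  readonly at: string; // ISO 8601 timestamp
  readonly actorId: string;
  readonly action: string;
  readonly requestId: string; // ties the record back to the request logs
  readonly details: Readonly<Record<string, unknown>>;
}

// In-memory stand-in for an append-only table: the only write is append(),
// and there is deliberately no update or delete method.
class AuditLog {
  private readonly events: AuditEvent[] = [];

  append(entry: Omit<AuditEvent, "id" | "at">, now: Date = new Date()): AuditEvent {
    const event: AuditEvent = Object.freeze({
      ...entry,
      id: this.events.length + 1,
      at: now.toISOString(),
    });
    this.events.push(event);
    return event;
  }

  // Read side: everything a given actor did, in insertion order.
  forActor(actorId: string): readonly AuditEvent[] {
    return this.events.filter((e) => e.actorId === actorId);
  }
}

const audit = new AuditLog();
audit.append({
  actorId: "u_123",
  action: "reindex_start",
  requestId: "example-request-id",
  details: {},
});
```

In the reindex handler above, the `append` call would sit next to `req.log.info`, using `req.id` as the `requestId` so an audit row and its log lines can be joined later.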
Solution: A Solo Developer Maintenance Loop That Prevents Fires
Treat maintenance as a product feature. The goal is stability with a small weekly budget.
- Weekly (30–60 min)
  - Review error budget signals: 5xx rate, job failures, slow endpoints.
  - Triage dependency updates (security first).
  - Scan audit logs for unexpected admin actions.
- Monthly (1–2 hrs)
  - Restore test from backup into a scratch environment.
  - Rotate secrets (or validate rotation automation).
  - Review access list and remove stale accounts/roles.
- Quarterly
  - Chaos-lite: kill a worker, simulate DB failover (if applicable), validate alerts.
  - Revisit SLOs and “critical path” workflows.
Automate the routine checks so you don’t rely on memory.
# Example: a simple scheduled “ops check” script you can run in CI or cron
./scripts/ops-check.sh
Notes:
- Your best leverage is removing manual steps: onboarding, deploys, migrations, and access changes.
- If a task happens more than twice, script it; if it’s risky once, add a guardrail (dry-run, confirmation, role check).
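The post only names `./scripts/ops-check.sh`, so what goes inside it is up to you. As one possibility, here is a sketch of the decision logic for the weekly signals, written in TypeScript to match the rest of the examples. `OpsSnapshot`, `OpsThresholds`, and `opsCheck` are hypothetical, and so are any threshold values you pass in; collecting the numbers from your metrics and backup systems is left out.

```typescript
// Hypothetical inputs: gather these from your metrics and backup tooling.
interface OpsSnapshot {
  requests: number;
  serverErrors: number; // count of 5xx responses in the same window
  jobBacklog: number;
  hoursSinceLastBackup: number;
}

// Assumed limits; tune them to your own SLO.
interface OpsThresholds {
  maxErrorRate: number; // fraction, e.g. 0.01 for 1%
  maxJobBacklog: number;
  maxBackupAgeHours: number;
}

// Returns one human-readable finding per breached threshold.
// An empty array means the check passed.
function opsCheck(s: OpsSnapshot, t: OpsThresholds): string[] {
  const findings: string[] = [];
  const errorRate = s.requests === 0 ? 0 : s.serverErrors / s.requests;

  if (errorRate > t.maxErrorRate) {
    findings.push(`5xx rate ${(errorRate * 100).toFixed(2)}% is above the limit`);
  }
  if (s.jobBacklog > t.maxJobBacklog) {
    findings.push(`job backlog ${s.jobBacklog} is above the limit`);
  }
  if (s.hoursSinceLastBackup > t.maxBackupAgeHours) {
    findings.push(`last backup is ${s.hoursSinceLastBackup}h old`);
  }
  return findings;
}
```

A scheduled job can print the findings and exit non-zero when the array is not empty, which turns the weekly review into a notification instead of something you have to remember.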
Key Takeaways
- Define criteria upfront: auth, auditability, backups+restore, observability, and repeatable deploys.
- Optimize for operational simplicity: architecture, safe data patterns, and CI gates that catch drift.
- Run a lightweight maintenance loop with automated checks; solo success is about reducing attention load.
Conclusion
Solo built tools can be enterprise grade if you design for reliability and maintenance from day one: constrain scope, enforce security defaults, automate validation, and keep a predictable ops cadence. The payoff is fewer interruptions and a tool that earns trust across the org.
Meta Description
A pragmatic playbook for solo developers building enterprise grade tools: scope, security, CI/CD, observability, and maintenance routines with concrete examples.
TLDR - Highlights for Skimmers
- Ship with non-negotiables: auth, audit trail, backups+restore test, and observability.
- Add CI gates for lint/typecheck/tests and a migration smoke test to prevent schema drift.
- Maintain with a weekly/monthly ops loop and automate anything that repeats or is risky.
What’s the one solo development failure mode you’ve seen most often: auth gaps, data drift, or deploy brittleness?