What Nobody Tells You About Golden Paths at Scale

Your platform team just celebrated hitting 85% golden path adoption. Everyone is excited. Onboarding time for new hires dropped from three weeks to two days. New services spin up in minutes. Leadership loves the improved metrics.

Six months later, you've got 23 capability requests in your backlog. Your platform team is drowning. ML teams need custom GPU scheduling. The data team wants streaming pipeline patterns. API teams are rolling their own rate limiting because yours doesn’t fit their needs.

You nailed Day 1.

You're dying on Day 50.

This is the hidden scaling problem with golden paths. And it’s not solved by building more golden paths.


Golden Path Promise vs. What Actually Happens

The platform engineering playbook says golden paths reduce cognitive load and standardize practices across teams. They give developers a blessed path from code to production through self-service, accelerating feature development.

This works well for onboarding and early development. But creating new projects and features is maybe 1% of an application’s lifetime. The remaining 99% is operations, debugging, scaling, adding features, and handling edge cases.

Golden paths excel at the first 1%. They struggle with the rest.

Netflix learned this the hard way. They built a polished developer portal with documentation, recommended tools, and curated paths. Developers said it “wasn’t compelling enough” to change habits. Why?

Because it helped them start things, not run things.

The real work happens after deployment. That’s where centralized golden paths become bottlenecks.


Why Your Platform Team Hits a Ceiling

Your platform team can’t scale linearly with the organization. It’s just math.

Imagine:

  • 200 engineers across 20 teams
  • Each team with distinct needs:
    • ML teams need GPU scheduling, Kubeflow, model serving
    • Data teams want Kafka, Airflow, stream processing
    • API teams need rate limiting, circuit breakers, tracing
    • Mobile backend teams need push notification infrastructure
  • Platform team size: 6 generalists
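
Run the numbers: six generalists against twenty teams is roughly one platform engineer for every three to four teams. If even half of those teams file one domain-specific request per quarter, that's ten new capabilities per quarter landing on a group that is also maintaining everything it has already shipped.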

What Goes Wrong

Queue problem

Every capability funnels through the platform team. Prioritization becomes about who shouts loudest, not what delivers the most value.

Expertise problem

You build “good enough” solutions. ML teams need 12 GPU configurations. They get 3. It checks the box but doesn’t solve the problem.

Maintenance trap

You ship 30 capabilities over two years. Now you maintain all 30.

  • Kubernetes upgrade? Update 30 configs
  • Security patch? Test 30 capabilities
  • Team that requested capability #17 moved on? You still own it

Rigidity issue

Abstractions cover the 80% use case. The remaining 20% fights the platform or bypasses it entirely. This is abstraction debt.

Your platform team becomes the bottleneck for every capability, edge case, and new tool. That’s not sustainable.


Go With a Marketplace Approach

At KubeCon Atlanta, I discussed a different model.

Why should the platform team be the sole provider?

Why not turn the platform into a marketplace?

At a certain point, platform teams should stop being the builders of everything and become marketplace operators.

  • ML team contributes GPU scheduling
  • Data team contributes streaming pipelines
  • API team contributes rate limiting
  • Security team contributes authorization patterns

The platform provides the infrastructure for contribution, not every capability.


How the IDP Marketplace Model Works

Define clear interfaces

Expose APIs and standards for capability integration. Teams know exactly what to implement.
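
As a concrete illustration, the "interface" can be as small as a manifest every capability must fill in before it can register. This is a minimal sketch assuming a Python-based platform toolchain; the CapabilitySpec type and its field names are illustrative, not a real platform API.

```python
# Hypothetical capability contract: every field here is an assumption made
# for illustration, not an existing platform schema.
from dataclasses import dataclass, field


@dataclass
class CapabilitySpec:
    """What a marketplace capability declares before it can register."""
    name: str                   # unique capability name, e.g. "gpu-scheduling"
    owner_team: str             # team accountable for maintenance and on-call
    version: str                # semantic version of the capability
    health_endpoint: str        # path the platform probes for liveness
    metrics_endpoint: str       # where metrics are exposed for scraping
    docs_url: str               # runbook and usage documentation
    tier: str = "experimental"  # support tier label for the catalog
    dependencies: list[str] = field(default_factory=list)  # other required capabilities
```

A contributing team fills this in once; validation, the catalog, and on-call routing can all read from the same spec.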

Build contribution templates

Provide scaffolding so teams don’t guess how to package their capability.

Automate validation

Every contribution must pass automated checks:

  • Metrics exposure
  • Security scans
  • Documentation
  • Health checks
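
To make that concrete, here is a rough sketch of what the gate could look like in CI, assuming capabilities ship a manifest like the spec sketched earlier. Every function and field name here is hypothetical; a real pipeline would call an actual scanner and probe instead of reading pre-computed flags.

```python
# Illustrative validation gate for capability contributions. The manifest
# fields and the security_scan_passed flag are assumptions for this sketch.
import sys

REQUIRED_FIELDS = ["name", "owner_team", "health_endpoint", "metrics_endpoint", "docs_url"]


def validate_manifest(manifest: dict) -> list[str]:
    """Return human-readable failures; an empty list means the gate passes."""
    failures = []
    for field_name in REQUIRED_FIELDS:
        if not manifest.get(field_name):
            failures.append(f"missing required field: {field_name}")
    if not str(manifest.get("docs_url", "")).startswith("https://"):
        failures.append("docs_url must point at published documentation")
    if manifest.get("security_scan_passed") is not True:
        failures.append("security scan has not passed (CVEs or leaked secrets)")
    return failures


if __name__ == "__main__":
    example = {
        "name": "gpu-scheduling",
        "owner_team": "ml-platform",
        "health_endpoint": "/healthz",
        "metrics_endpoint": "/metrics",
        "docs_url": "https://wiki.example.com/gpu-scheduling",
        "security_scan_passed": True,
    }
    problems = validate_manifest(example)
    print("\n".join(problems) if problems else "capability passed validation")
    sys.exit(1 if problems else 0)
```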

Create recognition systems

Contribution isn’t charity. Track it. Reward it. Make it count in performance reviews.


Advantages of the IDP Marketplace Model

  • Parallel capability development instead of queues
  • Domain expertise embedded where it belongs
  • Platform team focuses on primitives, not products
  • Network effects drive adoption and value

Organizations running mature marketplace models see 3–4x faster capability development compared to centralized teams.


But Here’s the Part Nobody Talks About

After my KubeCon Atlanta talk, many teams shared stories of failed attempts at this approach.

Governance Breakdown

  • No quality standards, leading to capability sprawl
  • Developers don’t trust community contributions
  • Multiple poorly maintained implementations of the same thing

One organization had three different Postgres operators, none properly maintained. Teams gave up and installed Postgres manually.


Quality Problems

Capabilities work for the original team but fail later:

  • Security CVEs
  • Kubernetes upgrades
  • Hidden network assumptions

Nobody owns the fix. Capabilities become orphaned and unusable.


Contribution Friction

Platform APIs are complex. Contributing requires understanding:

  • Service meshes
  • CI/CD pipelines
  • Monitoring
  • Security policies

Only senior engineers contribute. Participation dies out.


Maintenance Nightmare

  • Kubernetes 1.35 drops. Who updates 40 capabilities?
  • Security patch lands. Who validates everything?
  • Production breaks at 3am. Who’s on call?

Prerequisites for Making Marketplaces Work

1. Platform Primitives That Enable Contribution

Capabilities must plug in without platform code changes. If every addition requires core modifications, your platform isn’t ready.
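
One way to picture this: the core only knows about a registry, and contributed capabilities add themselves to it from their own repos. The decorator and registry names below are made up for the sketch; real platforms usually get the same effect through packaging entry points or CRDs rather than in-process imports.

```python
# Minimal registry sketch: contributed capabilities register themselves,
# and the platform core never changes when a new one appears.
from typing import Callable, Dict

CAPABILITY_REGISTRY: Dict[str, Callable[[], None]] = {}


def register_capability(name: str):
    """Decorator used by contributing teams; hypothetical, for illustration."""
    def wrapper(install_fn: Callable[[], None]) -> Callable[[], None]:
        CAPABILITY_REGISTRY[name] = install_fn
        return install_fn
    return wrapper


# --- contributed by the data team, lives in their repository ---
@register_capability("streaming-pipelines")
def install_streaming_pipelines() -> None:
    print("provisioning Kafka topics and Airflow DAG templates...")


# --- platform core: iterates whatever has been registered ---
def install_all() -> None:
    for name, install in CAPABILITY_REGISTRY.items():
        print(f"installing capability: {name}")
        install()


if __name__ == "__main__":
    install_all()
```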


2. Enforced Quality Standards

  • Automated testing
  • Mandatory metrics and health checks
  • Security scanning for CVEs and secrets
  • Documentation requirements:
    • Runbooks
    • Troubleshooting guides
    • Usage examples

No documentation means it doesn't ship.


3. Ownership Beyond Initial Contribution

  • Define maintenance responsibilities upfront
  • Clear security patching ownership
  • Deprecation and migration policies
  • Explicit handoff mechanisms

“You build it, you own it for 12 months” is a valid rule.


4. Cultural Readiness

  • Inner-source culture already exists
  • Contributions count toward goals and reviews
  • Leadership supports contribution time

If leadership sees contribution as “not real work,” the marketplace fails.


Hybrid Approach

Don’t go all-in immediately.

  • Golden capabilities for common needs (70–80% of use cases)
  • Marketplace capabilities for specialized domains

Capability Tiers

  • Platform-blessed: maintained by the platform team, SLAs guaranteed
  • Community-maintained: supported by contributors, use at your own risk
  • Experimental: no stability guarantees

Clear expectations prevent surprises.
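
A lightweight way to keep those expectations visible is to encode the tier next to each catalog entry, so consumers see the support level before they adopt anything. The enum values and support_contact field below are illustrative, not an existing catalog format.

```python
# Sketch of tier metadata attached to catalog entries; names are assumptions.
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    PLATFORM_BLESSED = "platform-blessed"   # platform team on call, SLAs guaranteed
    COMMUNITY = "community-maintained"      # contributors support it, best effort
    EXPERIMENTAL = "experimental"           # no stability guarantees


@dataclass
class CatalogEntry:
    capability: str
    tier: Tier
    support_contact: str   # who to page, or "none" for experimental entries


CATALOG = [
    CatalogEntry("postgres-operator", Tier.PLATFORM_BLESSED, "#platform-oncall"),
    CatalogEntry("gpu-scheduling", Tier.COMMUNITY, "#ml-platform"),
    CatalogEntry("edge-caching", Tier.EXPERIMENTAL, "none"),
]
```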


Next Step for You

If you’re hitting scaling issues:

  • Audit your backlog for domain-specific requests
  • Identify teams with deep expertise
  • Start with a low-risk pilot capability
  • Build templates and validation, not just docs
  • Establish governance before scale

If you’re building your first platform:

  • Start centralized
  • Design extensibility from day one
  • Avoid premature marketplace complexity

Real Insight

Platform maturity isn’t “build golden paths and stop.”

It’s:

  • Build golden paths
  • Recognize when they become bottlenecks
  • Evolve your model intentionally

Centralization gives control and consistency.

Marketplaces give scale and expertise.

Neither is perfect.

The right choice depends on your organization’s stage.

I explored platform marketplaces, governance models, and real-world failure modes at KubeCon Atlanta.

Want to discuss platform scaling or share your experience?

Connect with me on LinkedIn. If you’re struggling with platform engineering, contact our consultants—we help teams build platforms that actually scale.
