I Replaced 200 Threads With 10,000. Java Finished 13.5x Faster.

S M Tahosin on June 04, 2026

I expected the fans to spin. I had just asked Java to start 10,000 tasks, give each task its own virtual thread, and make every one wait for 100 m...
 
Tamim Rao

Since virtual threads are getting a lot of attention lately, experiments like this are a good reminder of why. The surprising part isn't that 10,000 tasks ran; it's how little overhead there was compared to the mental model many of us still have from platform threads.

What I find interesting is that it also highlights a common misconception. Seeing 10,000 threads complete smoothly doesn't mean we should start creating threads everywhere. It means the cost model has changed, so we can focus more on expressing concurrency in a straightforward way and less on building complex pooling strategies for I/O-heavy workloads.

I'd be curious to see the same experiment with blocking network calls, database operations, and some CPU-bound work mixed in. That's usually where the real trade-offs start to show up.

S M Tahosin

That's exactly the takeaway I was hoping readers would get from the experiment. The interesting part isn't the number itself, it's that virtual threads let us go back to a much simpler concurrency model without paying the same cost we used to associate with threads.

I also agree that "10,000 threads worked" can easily turn into the wrong conclusion if people stop there. Virtual threads make waiting cheap, but they don't magically make CPU work cheaper.
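For anyone who wants to see the "cheap waiting" half for themselves, here is a minimal sketch, not the article's actual benchmark. It assumes Java 21 or later, and the class name `CheapWaiting`, the helper `runSleepingTasks`, and the 100 ms sleep are my own placeholders.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class CheapWaiting {

    // Starts one virtual thread per task; each task only sleeps.
    // Returns the number of tasks that completed their sleep.
    static int runSleepingTasks(int tasks, long sleepMillis) {
        AtomicInteger completed = new AtomicInteger();
        // close() at the end of try-with-resources waits for all submitted tasks.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(sleepMillis); // blocks the virtual thread, not an OS thread
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    completed.incrementAndGet();
                });
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        Instant start = Instant.now();
        int done = runSleepingTasks(10_000, 100); // assumed values, adjust freely
        long elapsedMillis = Duration.between(start, Instant.now()).toMillis();
        System.out.println(done + " tasks finished in " + elapsedMillis + " ms");
    }
}
```

Swapping `Thread.sleep` for a CPU-heavy loop is the quickest way to see the other half of the statement: the tasks still complete, but the wall-clock time is then bounded by the available cores.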

The mixed workload scenario you mentioned would be a great follow-up. Network I/O, database calls, and CPU-bound tasks in the same benchmark would probably show a much more nuanced picture of where virtual threads shine and where the underlying hardware limits still dominate. That's actually the direction I'm thinking of exploring next.

mote

The "cheap waiting, not cheap work" framing is the part that took me longest to internalize. I kept trying to use virtual threads as a drop-in replacement for thread pools on CPU-bound tasks, then wondering why memory usage spiked without any speedup. The Semaphore pattern for rate-limiting actual bottlenecks is underrated. Most devs reach for it too late, after they've already blown through a downstream API's rate limit.
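A rough sketch of the Semaphore pattern I mean, assuming Java 21 or later. The names `BoundedCalls`, `maxConcurrentCalls`, and `callDownstream` are hypothetical, and the sleep stands in for a real rate-limited API call.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedCalls {

    // Starts `tasks` virtual threads but allows at most `limit` of them inside
    // the simulated downstream call at once. Returns the peak concurrency seen.
    static int maxConcurrentCalls(int tasks, int limit) {
        Semaphore permits = new Semaphore(limit);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    try {
                        permits.acquire(); // waiting here is cheap on a virtual thread
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    try {
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        callDownstream();
                    } finally {
                        inFlight.decrementAndGet();
                        permits.release();
                    }
                });
            }
        }
        return peak.get();
    }

    // Hypothetical stand-in for a rate-limited downstream call.
    static void callDownstream() {
        try {
            Thread.sleep(5);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("peak concurrent calls: " + maxConcurrentCalls(1_000, 10));
    }
}
```

The thread count stays unbounded on purpose; only the scarce resource is bounded, which is the part a thread pool used to do implicitly.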

One thing I'd push back on slightly: the article mentions ThreadLocal as a gotcha, but the deeper issue is that virtual threads fundamentally change the cost model. In Rust's async model, you'd handle this differently: instead of Semaphore + blocking calls, you'd reach for async channels or futures that yield without blocking a thread. Same problem, different primitives. Neither is wrong; each just requires rethinking what "waiting" means in your specific runtime.

What's your take on structured concurrency here? Virtual threads make it easier to accidentally spawn fire-and-forget tasks that outlive their parent scope.

S M Tahosin

That's a great point. I think the ThreadLocal example is really just one symptom of the broader shift in the cost model. Virtual threads let us write code in a more direct style, but they also force us to revisit assumptions that were built around expensive threads.

I also agree with the Rust comparison. The primitives are different, but the underlying challenge is the same: expressing concurrency without confusing waiting with useful work.

As for structured concurrency, I'm a big fan of it for exactly the reason you mentioned. Once spawning work becomes cheap, lifecycle management becomes more important, not less. It's very easy to create tasks that technically work but are no longer tied to the scope that created them. Structured concurrency feels like the missing guardrail that keeps that power manageable.
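To make the scoping idea concrete without depending on the preview `StructuredTaskScope` API, here is a small sketch that uses a try-with-resources `ExecutorService` as a stand-in scope (Java 21 or later assumed). It is not full structured concurrency, since it has no failure propagation or cancellation policy, and `ScopedTasks` and `sumOfSquares` are illustrative names only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ScopedTasks {

    // Forks one virtual thread per value and sums the squares.
    // No task can outlive the try block, because close() waits for all of them.
    static int sumOfSquares(int n) throws Exception {
        List<Future<Integer>> results = new ArrayList<>();
        try (ExecutorService scope = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 1; i <= n; i++) {
                final int value = i;
                results.add(scope.submit(() -> {
                    Thread.sleep(10); // simulated blocking work
                    return value * value;
                }));
            }
        } // implicit close(): blocks until every submitted task has finished

        int total = 0;
        for (Future<Integer> result : results) {
            total += result.get(); // already complete at this point
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumOfSquares(3)); // 1 + 4 + 9 = 14
    }
}
```

The point is the shape: child tasks are created inside a block and are guaranteed to be finished when the block exits, so nothing is left running as fire-and-forget.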

Ankita Sarkar

Really enjoyed this experiment. A lot of developers still think "threads are expensive" without considering what those threads are actually doing. Your results are a good reminder that modern JVMs and operating systems handle idle threads much better than many of us expect.

What stood out to me is how easy it is to carry old assumptions forward without testing them. It would be interesting to see the same experiment with CPU-heavy work instead of sleeping threads to compare where the real limits start showing up.

Thanks for sharing actual measurements instead of just repeating common wisdom.

S M Tahosin

Exactly. That was one of the main motivations behind the experiment. It's surprisingly easy to inherit assumptions from older threading models and never revisit them.

A CPU-heavy version would be a great comparison because that's where I'd expect the hardware limits to become much more visible. Thanks for the thoughtful observation.

Ismail Hasan

This experiment is fascinating because it really challenges our intuition about what modern hardware can handle. Most people assume starting thousands of threads would instantly bring a laptop to its knees, but seeing it barely notice is eye-opening. It also makes me think about how much the JVM and modern operating systems optimize thread management behind the scenes. I wonder how this would scale on different workloads, especially when threads are doing more than just sleeping. It’s a great reminder that sometimes our assumptions about performance bottlenecks are outdated, and testing can reveal surprising truths about the tools we use every day.

S M Tahosin

I completely agree. One of the biggest lessons for me was realizing how many performance assumptions I was carrying around without ever testing them.

You're also right that the workload matters. Sleeping threads are one thing, but CPU-heavy work or blocking I/O can tell a very different story. That's why benchmarks are so valuable. They often reveal that the bottleneck isn't where we expected it to be.

Thanks for sharing your thoughts.

Mansa Datta

What I liked about this experiment is that it challenges a common assumption many developers have: seeing a huge thread count and immediately expecting the system to fall apart. The interesting takeaway isn't that 10,000 threads worked, but understanding why they worked. Most of them were likely waiting rather than actively competing for CPU time.

It's also a good reminder that concurrency discussions are often more nuanced than "more threads = bad." Thread state, memory usage, and workload type matter just as much as the raw number of threads.

A follow-up comparison with CPU-bound tasks or Java virtual threads would be really interesting. That would show where traditional threads start to hit their limits and how different concurrency approaches compare in practice.

Great experiment and a nice reality check for many of the assumptions we carry about threads.

S M Tahosin

That's a great way to put it. I think many of us still carry the mental model that a large thread count automatically means trouble, because that's often true with platform threads. What surprised me most wasn't the number itself, but how little actual contention there was once you look at what those threads were doing.

I also like your point that concurrency discussions often get reduced to a single metric. The raw thread count is easy to focus on, but thread state and workload characteristics usually tell a much more useful story.

A CPU-bound comparison is definitely on my list. My expectation is that the gap becomes much smaller there, which would reinforce the idea that virtual threads make waiting cheap, not computation cheap. That's where the distinction between concurrency and parallelism becomes really interesting in practice.
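A sketch of what that CPU-bound comparison could look like, assuming Java 21 or later. The prime-counting workload, the task counts, and the names `CpuBoundComparison`, `countPrimes`, and `runAll` are placeholders I picked for illustration; the program prints timings for your own machine and makes no claim about what they will be.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CpuBoundComparison {

    // Hypothetical CPU-bound task: counts primes below `limit` by trial division.
    static int countPrimes(int limit) {
        int count = 0;
        for (int n = 2; n < limit; n++) {
            boolean prime = true;
            for (int d = 2; (long) d * d <= n; d++) {
                if (n % d == 0) {
                    prime = false;
                    break;
                }
            }
            if (prime) {
                count++;
            }
        }
        return count;
    }

    // Runs the same task `tasks` times on the given executor and sums the results.
    // The executor is closed (and therefore fully drained) before results are read.
    static long runAll(ExecutorService executor, int tasks, int limit) throws Exception {
        List<Future<Integer>> futures = new ArrayList<>();
        try (executor) {
            for (int i = 0; i < tasks; i++) {
                futures.add(executor.submit(() -> countPrimes(limit)));
            }
        }
        long total = 0;
        for (Future<Integer> future : futures) {
            total += future.get();
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();

        long start = System.nanoTime();
        runAll(Executors.newVirtualThreadPerTaskExecutor(), 200, 200_000);
        System.out.println("virtual threads: " + (System.nanoTime() - start) / 1_000_000 + " ms");

        start = System.nanoTime();
        runAll(Executors.newFixedThreadPool(cores), 200, 200_000);
        System.out.println("fixed pool (" + cores + " threads): " + (System.nanoTime() - start) / 1_000_000 + " ms");
    }
}
```

Both runs do identical work, so any difference in the printed times reflects scheduling rather than the computation itself.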

Adrian Ng

What stood out to me is how this highlights the gap between theory and reality. We often hear "threads are expensive" and stop there, but seeing 10,000 threads barely make a modern laptop sweat puts that advice into context. The most interesting part wasn't the number itself, it was the reminder to test assumptions instead of repeating them. Nice experiment and a fun read.

S M Tahosin

I couldn't agree more. The phrase "threads are expensive" isn't wrong, but it's often repeated without enough context.

What surprised me most was how different the actual result was from the picture I had in my head before running the test. That's exactly why I love small experiments like this. They have a way of exposing assumptions we didn't even realize we were carrying around.

Daniel Markus

This was a fun reminder that many of us still think about Java threads using rules from a different era. The most interesting part wasn't that 10,000 threads could be created, but how little impact it had when those threads weren't actively doing work.

It also highlights an important distinction between thread count and actual concurrency pressure. Numbers alone can be misleading. Thanks for sharing a simple experiment that challenges assumptions instead of repeating them.

S M Tahosin

Exactly. I think that's the key distinction people often miss. A large thread count sounds scary until you look at what those threads are actually doing.

The experiment was less about proving that 10,000 is a magic number and more about questioning an assumption I've seen repeated for years. Sometimes the mental model becomes outdated long before we realize it.

Hani Lieu

Really cool experiment. It's amazing how something that used to feel impossible is now running comfortably on a regular laptop thanks to virtual threads.

S M Tahosin

That's what surprised me too. A few years ago, "10,000 threads on a laptop" would have sounded like a terrible idea. Virtual threads really change what feels practical for highly concurrent workloads.