DEV Community

# localllm

Posts

Qwen 3.6 35B-A3B for Local AI in 2026: The 24GB VRAM Line That Gets You 120 tok/s

Comments
6 min read
How to Tune llama.cpp --n-gpu-layers: A Practical VRAM Guide (2026)

Comments
3 min read
How to Tune --n-gpu-layers for Your VRAM Budget

Comments
4 min read
Open-LLM-VTuber Review: Offline AI Companion with Live2D

Comments
10 min read
Local LLM Hardware Requirements in 2026: What You Actually Need for Every Model Tier [Guide]

Comments
8 min read
Hermes Agent Desktop Free With Local LLMs: The Claude Code Alternative Nobody's Billing You For [2026]

Comments
8 min read
[Day 11] I turned my cat into anime art — and the AI drew a human girl instead. One photo through IPAdapter pulls it back to a cat

Comments
5 min read
Run Cursor with a Local Model: Privacy-First AI Coding Without a Subscription

Comments
5 min read
Qwen3-Coder-Next review 2026: 80B params, 3B active, and the cheapest credible coding agent API

Comments
5 min read
Qwen3-Coder-Next for Local AI in 2026: Which GPU Can Actually Run Alibaba's #1 Coding Agent?

Comments
6 min read
RTX 5060 for Local AI in 2026: When 448 GB/s Hits an 8GB Wall

Comments
6 min read
Gemma 4 12B vs GPT-4o Mini vs Claude Haiku: Is Google's Local LLM Good Enough to Replace API Calls? [2026]

Comments
7 min read
We pre-registered, ran, and verified the macro ablation: information per joule, measured

Comments
3 min read
We ported how brains manage the cost of thinking to LLM systems

Comments 2
9 min read
Localmaxxing isn't theory. Here's what my 3-GPU rig actually does.

Comments
4 min read