DEV Community

Open Source Toolkit for Building AI Agents in 2026

Anmol Baranwal on May 21, 2026

It used to take a lot of effort to get your first PR merged in open source. Now you can ship something real in a weekend thanks to coding agents li...
 
shogun 444

Finding high-quality, maintained agent tools right now is a minefield of hype, so this list is a breath of fresh air. Love the structure and how you broke down the alternatives too. Phenomenal write-up, thanks for sharing your daily exploration with us!

Anmol Baranwal

great to hear that! tbh most of us won't even need half of these but knowing what exists saves a ton of time. spent months exploring these so hopefully this helps someone out there :)

you should definitely try agent-skills & taste-skill. I use them almost every time I'm building side projects.

shogun 444

I use agent-skills.
will try taste-skill, it seems interesting.

Anmol Baranwal

yeah. using those skills will help you avoid ai slop websites. there are others too, like gpt-taste (for gsap animations) and high-end-visual design.. I have tried all of them.

let me also share something I have been using. using this prompt in chatgpt will give you a lot of cool assets. then you can ask it to export those assets without bg and voila.. your website will look far better :)

https://github.com/Leonxlnx/taste-skill/blob/main/skills/imagegen-frontend-web/SKILL.md

Based on the skill above, generate images for a website for an AI agency teaching how to build agents. The design should include one section, with one image.

The website is for a creative AI company focused on research in ai agents along with creativity and design. Because of that, I want the visuals to feel highly original, playful, and art-directed, with text integrated thoughtfully into the design. Make it feel ultra-creative and intentional, like an Awwwards SOTD-level website in both concept and execution.

Please go beyond standard layouts. Do not rely only on simple text-left, image-right compositions. Explore more experimental and varied layouts. Feel free to go completely wild, but keep it purposeful, not random.

I want different section structures, including horizontal images, fullscreen sections, full background imagery, and more minimal sections with beautiful colors and a strong sense of motion or animation. Please use full background images or strong full-background color compositions, not just plain white sections. Keep it in light mode.

Overall, try to stay somewhat consistent across the site while still making each section feel distinct. I want it to look crazy creative, thoughtful, and visually impressive, with strong UX and a clear sense of purpose.

Generate 8 different images total. Do not combine them into one image. Each image should represent one section of the website.

here are some samples.

websites like open-design.ai have been built using the same method.

shogun 444

whoa!!! this is crazy.
I will be sure to check these out. thank you for sharing this.

Raju Dandigam

This is a useful ecosystem overview because the tooling landscape is getting fragmented very quickly. I especially liked the inclusion of MCP-aware tooling and generative UI runtimes because many discussions still focus only on orchestration frameworks themselves. One thing I keep noticing is that observability and debugging tooling still feels underrepresented compared to orchestration, evals, and memory layers. I've been exploring that local inspection gap in TypeScript with agent-inspect, particularly around execution trees and tool-call traces. Curious which tooling category you think becomes most important over the next year.

Anmol Baranwal

thanks!! most blogs about this only cover the backend which is weird to me, there is a lot more that makes the overall system better.

I think observability is mostly a maturity thing. most devs aren't at production scale yet so they skip it. last I used langsmith was for basic smoke testing and validating agent responses on one of my projects.

I personally believe harness + skills will be very useful by the end of this year. and the models are getting good, really good, at using those skills (I have only seen them miss those strict rules a couple of times)

uliyahoo

Nice article Anmol

Anmol Baranwal

yay! means a lot, took forever to put together :)

Dark Coder

Not a big fan of listicles, but some really solid projects in here. Wish there were MCP servers like Context7 - I use it all the time. Just curious, have you personally used LiveKit Agents / Pipecat?

Anmol Baranwal

thanks. I used LiveKit agents a few months ago and Pipecat very recently.. still need to learn more since I wasn't able to try subagents last time

sanreds

Solid list. One category that's going to matter more in the next six months: tools for inspecting and replaying agent runs locally. Once you have 3+ agents chained, the bottleneck moves from "which framework" to "what actually happened in run #47 last Tuesday." Anything you've found useful for that?

Bap

Great piece!

Eli Berman

Exactly what I've been looking for. Thanks Anmol!!

Anmol Baranwal

tried my best to include all the awesome repos I found in the past few months. my personal favorite among the list is agent-skills by Addy Osmani & sutando. thanks for reading!!

Hemapriya Kanagala

Anmol, this has been sitting on my reading list for a while and I finally got around to reading it 😄

Really enjoyed it. You can tell a lot of time went into putting this together. I ended up opening quite a few tabs while reading. Thanks for keeping everything in one place and sharing it with the community 🙌

Mudassir Khan

the harness point for Deep Agents is the one teams learn the hard way: we spent months swapping models on a document QA system before realizing chunking, reranking, and prompt structure were doing most of the work. swapping the model at the end shifted accuracy maybe 8%. redesigning the retrieval harness shifted it 30%. exact same pattern as the Terminal Bench 52→66% jump you cited.

the DeepEval section deserves a callout: task completion and argument correctness metrics catch failures that hallucination metrics completely miss in agentic workflows.

curious which memory store you'd pick for temporal reasoning agents. graphiti vs mem0 is the one i see teams get wrong most often. which did you end up recommending?

Anmol Baranwal

sir stop using ai to write lol (really don't mean to be rude)

by the way, I definitely recommend reading about agent harness on Addy's blog + langchain. they have covered it very well.

sanreds

Good roundup. One thing I'd add from running these in anger: the framework choice matters way less than your tracing. I shipped nearly the same agent flow on two stacks, and what actually decided maintainability was whether I could see every tool call and token spend per step. Without that you debug blind. Would love a section on observability in a future version, it's the part people regret skipping.