DEV Community

Open Source Toolkit for Building AI Agents in 2026

Anmol Baranwal on May 21, 2026

It used to take a lot of effort to get your first PR merged in open source. Now you can ship something real in a weekend thanks to coding agents li...

shogun 444 • May 21

Finding high-quality, maintained agent tools right now is a minefield of hype, so this list is a breath of fresh air. Love the structure and how you broke down the alternatives too. Phenomenal write-up, thanks for sharing your daily exploration with us!

Anmol Baranwal • May 21

great to hear that! tbh most of us won't even need half of these but knowing what exists saves a ton of time. spent months exploring these so hopefully this helps someone out there :)

you should definitely try agent-skills & taste-skill. I use them almost every time I'm building side projects.

shogun 444 • May 21

I use agent skills.
I'll try taste-skill, it seems interesting.

Anmol Baranwal • May 21

yeah. using those skills, like gpt-taste (for gsap animations) and high-end-visual design, will help you avoid ai slop websites. I have tried all of them.

let me also share something I have been using. using this prompt in chatgpt will give you a lot of cool assets. then you can ask it to export those assets without bg and voila.. your website will look far better :)

https://github.com/Leonxlnx/taste-skill/blob/main/skills/imagegen-frontend-web/SKILL.md

Based on the skill above, generate images for a website for an AI agency teaching how to build agents. The design should include one section, with one image.

The website is for a creative AI company focused on research in ai agents along with creativity and design. Because of that, I want the visuals to feel highly original, playful, and art-directed, with text integrated thoughtfully into the design. Make it feel ultra-creative and intentional, like an Awwwards SOTD-level website in both concept and execution.

Please go beyond standard layouts. Do not rely only on simple text-left, image-right compositions. Explore more experimental and varied layouts. Feel free to go completely wild, but keep it purposeful, not random.

I want different section structures, including horizontal images, fullscreen sections, full background imagery, and more minimal sections with beautiful colors and a strong sense of motion or animation. Please use full background images or strong full-background color compositions, not just plain white sections. Keep it in light mode.

Overall, try to stay somewhat consistent across the site while still making each section feel distinct. I want it to look crazy creative, thoughtful, and visually impressive, with strong UX and a clear sense of purpose.

Generate 8 different images total. Do not combine them into one image. Each image should represent one section of the website.

here are some samples.

websites like open-design.ai have been built using the same method.

shogun 444 • May 21

whoa!!! this is crazy.
I will be sure to check these out, thank you for sharing this knowledge.

Raju Dandigam • May 25

This is a useful ecosystem overview because the tooling landscape is getting fragmented very quickly. I especially liked the inclusion of MCP-aware tooling and generative UI runtimes because many discussions still focus only on orchestration frameworks themselves. One thing I keep noticing is that observability and debugging tooling still feels underrepresented compared to orchestration, evals, and memory layers. I’ve been exploring that local inspection gap in TypeScript with agent-inspect, particularly around execution trees and tool-call traces. Curious which tooling category you think becomes most important over the next year.

Anmol Baranwal • May 26

thanks!! most blogs about this only cover the backend which is weird to me, there is a lot more that makes the overall system better.

I think observability is mostly a maturity thing. most devs aren't at production scale yet so they skip it. last I used langsmith was for basic smoke testing and validating agent responses on one of my projects.

I personally believe harness + skills will be very useful by the end of this year. and the models are getting good, really good at using those skills (I have only seen them miss those strict rules a couple of times)

uliyahoo • May 21

Nice article Anmol

Anmol Baranwal • May 21

yay! means a lot, took forever to put together :)

Dark Coder • May 21

Not a big fan of listicles, but some really solid projects in here. Wish there were MCP servers like Context7 - I use it all the time. Just curious, have you personally used LiveKit Agents / Pipecat?

Anmol Baranwal • May 26

thanks. I used LiveKit agents a few months ago and Pipecat very recently.. still need to learn more since I wasn't able to try subagents last time

sanreds • Jun 3

Solid list. One category that's going to matter more in the next six months: tools for inspecting and replaying agent runs locally. Once you have 3+ agents chained, the bottleneck moves from "which framework" to "what actually happened in run #47 last Tuesday." Anything you've found useful for that?

Bap • Jun 2

Great piece!

Eli Berman • May 21

Exactly what I've been looking for. Thanks Anmol!!

Anmol Baranwal • May 21

tried my best to include all the awesome repos I found in the past few months. my personal favorite among the list is agent-skills by Addy Osmani & sutando. thanks for reading!!

Hemapriya Kanagala • Jun 11

Anmol, this has been sitting on my reading list for a while and I finally got around to reading it 😄

Really enjoyed it. You can tell a lot of time went into putting this together. I ended up opening quite a few tabs while reading. Thanks for keeping everything in one place and sharing it with the community 🙌

Mudassir Khan • May 23

the harness point for Deep Agents is the one teams learn the hard way — we spent months swapping models on a document QA system before realizing chunking, reranking, and prompt structure were doing most of the work. swapping the model at the end shifted accuracy maybe 8%. redesigning the retrieval harness shifted it 30%. exact same pattern as the Terminal Bench 52→66% jump you cited.

the DeepEval section deserves a callout: task completion and argument correctness metrics catch failures that hallucination metrics completely miss in agentic workflows.

curious which memory store you'd pick for temporal reasoning agents — graphiti vs mem0 is the one i see teams get wrong most often. which did you end up recommending?

Anmol Baranwal • May 26

sir stop using ai to write lol (really don't mean to be rude)

by the way, I definitely recommend reading about agent harness on Addy's blog + langchain. they have covered it very well.

sanreds • Jun 9

Good roundup. One thing I'd add from running these in anger: the framework choice matters way less than your tracing. I shipped nearly the same agent flow on two stacks, and what actually decided maintainability was whether I could see every tool call and token spend per step. Without that you debug blind. Would love a section on observability in a future version, it's the part people regret skipping.