DEV Community

Aditya

Hermes Mentor — A Local AI Agent That Gets You Out of Tutorial Hell

This is a submission for the Hermes Agent Challenge: Build With Hermes Agent


What I Built

Every developer knows the feeling. You've watched 50 hours of YouTube tutorials. You "know" React. You "know" Python. Then you sit down to build something real — and you freeze. Not because you're not smart. But because watching is not building.

This is tutorial hell. I've lived it. I've watched juniors at work live it for months.

Hermes Mentor is a fully local, privacy-first AI mentorship agent that pulls you out of it — not with more tutorials, but with a personalised project roadmap built from scanning your actual GitHub repos.

Here's what it does:

  • 🔍 Audits your real GitHub repos — reads every public repo, checks languages, CI/CD configs, test files, README quality
  • 🧠 Identifies your exact skill gaps — local LLM via Ollama reasons across your code to find what's actually missing
  • 🗺️ Generates a 4-week project roadmap — real projects, each one closing a specific gap, no tutorials
  • 📬 Sends daily Telegram challenges — every weekday at 08:30, your nudge, hints if stuck, celebration when you ship
  • 💬 Two-way Telegram agent — reply with your repo link and Hermes reads it, tracks your progress, creates TODO tasks
  • 💾 Persistent memory — your developer profile lives in ~/.hermes/memory/, updated every run
  • 🔒 100% local and private — Ollama runs the LLM on your machine, nothing leaves your box

Everything runs on WSL2. One command to start.


Demo

The audit running — 20 repos scanned in real time

Terminal showing Hermes Mentor audit scanning 20 GitHub repos with colourful output

The roadmap printed — gaps identified, 4 weeks planned

Terminal showing gaps identified and 4-week roadmap generated by local LLM

Telegram Message

Hermes memory — your developer profile saved automatically

Terminal showing cat of USER_TheCoderAdi.md with full developer profile

Cron installed — daily nudges set up for 08:30 weekdays

Terminal showing cron installed message

Daily nudge delivered — Week 1 Day 2 challenge on Telegram

Terminal showing nudge sent command and Telegram showing the morning challenge message

Telegram Message for nudge

🤯 The moment it became a two-way agent

This is the screenshot that made me realise this project was something else entirely.

I sent my GitHub repo link to the bot on Telegram after completing the CI/CD challenge.
Hermes didn't just reply with text. It:

  • Read the GitHub repo link and understood the context
  • Ran a terminal command: echo 'Pipeline check is successful. Opening PR for review.'
  • Created a real Pull Request on GitHub — TheCoderAdi/Basic_Calculator · Pull Request #1
  • Marked the TODO as completed: "Create pull request for CI/CD pipeline changes." → status: completed

This is Hermes Agent's tool use, terminal access, GitHub integration, and task planning all firing together in real time, over Telegram, powered entirely by a local LLM on my machine.

No cloud LLM. No paid model API keys. A fully autonomous agent that reviewed my work, opened a PR, and closed its own task. All from a Telegram message.

Telegram showing two-way conversation with Hermes creating TODO tasks and tracking progress


Landing Page

Code

Repository: github.com/TheCoderAdi/hermes-mentor

Project Structure

hermes-mentor/
├── mentor_agent.py          ← Core agent: GitHub audit + roadmap + Telegram
├── hermes_cron.py           ← Daily nudge scheduler (weekdays 08:30)
├── setup.sh                 ← Automated setup wizard
├── requirements.txt
├── .env.example
├── hermes-mentor.html       ← Project landing page
├── config/
│   └── hermes-config.yaml   ← Hermes Agent config (Ollama + Telegram)
└── skills/
    └── github-audit-mentor.md  ← Reusable Hermes skill file

My Tech Stack

| Layer | Tool |
| --- | --- |
| Agent orchestration | Hermes Agent (NousResearch) |
| Local LLM | Ollama (qwen2.5-coder:7b) |
| GitHub data | PyGithub (GitHub REST API) |
| Messaging | python-telegram-bot (Telegram Bot API) |
| Scheduling | Hermes cron + Linux crontab |
| Memory | Hermes persistent memory (~/.hermes/memory/) |
| Skill system | Hermes skill files (agentskills.io format) |
| Environment | WSL2 on Windows |
| Language | Python 3.12 |

How I Used Hermes Agent

Hermes Mentor doesn't just mention Hermes — every core capability is actively used. Here's exactly how:

Persistent Memory — the agent remembers you forever

After every GitHub audit, Hermes writes USER_TheCoderAdi.md directly into ~/.hermes/memory/. Every future Hermes session loads this file automatically. The agent already knows your skill level, active roadmap week, and past struggles — without you ever re-explaining yourself.
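A minimal sketch of what writing such a profile file could look like, assuming a flat markdown layout. Only the `USER_<username>.md` filename and the `~/.hermes/memory/` directory come from the description above; the function name `write_profile`, its parameters, and the headings inside the file are hypothetical.

```python
from datetime import date
from pathlib import Path


def write_profile(memory_dir: Path, username: str, gaps: list[str], week: int) -> Path:
    """Write a flat markdown developer profile (hypothetical layout)."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    lines = [
        f"# Developer profile: {username}",
        f"Last audit: {date.today().isoformat()}",
        f"Active roadmap week: {week}",
        "",
        "## Skill gaps",
        *[f"- {gap}" for gap in gaps],
    ]
    path = memory_dir / f"USER_{username}.md"
    # Overwrite on every run so the file always reflects the latest audit.
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path
```

Calling `write_profile(Path.home() / ".hermes" / "memory", "TheCoderAdi", ["No CI/CD pipeline"], 1)` would refresh the profile after an audit.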

This is what turns a one-shot script into a real mentor. It builds a relationship with you over time.

Skill Learning (GEPA Loop) — gets smarter every run

The github-audit-mentor.md skill file is not static documentation. After every audit it gets updated with the latest findings — gaps found, roadmap generated, developer profile. This is Hermes' Generate-Evaluate-Patch-Apply loop making the skill more accurate and personalised with each developer it touches.

Cron Scheduling — autonomous daily action

hermes_cron.py registers a weekday 08:30 cron job that auto-advances through your roadmap week and day, firing the right personalised Telegram nudge every morning. The user does nothing after setup. Hermes just shows up, every day, like a real mentor.
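As a rough illustration of the auto-advance logic, here is one way the week and day could be derived by counting weekdays since the roadmap started. The helper `roadmap_position` and the assumption of one challenge per weekday are mine, not taken from `hermes_cron.py`; the crontab expression is the standard form for 08:30, Monday to Friday.

```python
from datetime import date, timedelta

# Standard crontab schedule for 08:30, Monday to Friday.
CRON_SCHEDULE = "30 8 * * 1-5"


def roadmap_position(start: date, today: date, weeks: int = 4):
    """Return (week, day) for today, counting weekdays only, or None.

    Hypothetical helper: assumes one challenge per weekday and a
    roadmap that begins on `start`.
    """
    if today < start or today.weekday() >= 5:
        return None  # before the roadmap starts, or a weekend
    # Count weekdays from start up to and including today.
    elapsed = sum(
        1
        for offset in range((today - start).days + 1)
        if (start + timedelta(days=offset)).weekday() < 5
    )
    week, day = divmod(elapsed - 1, 5)
    if week >= weeks:
        return None  # roadmap finished
    return week + 1, day + 1
```

Because the position is recomputed from dates on each run, a missed cron run does not leave a stale counter behind.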

Two-Way Telegram Gateway — live agent conversations

The most unexpected moment in building this: when I started hermes gateway and sent my own GitHub repo link to the bot, Hermes read the message, recognised the URL, created TODO tasks with in_progress and pending status, and replied with next steps. That's Hermes' tool use, task planning, and messaging gateway all firing together in real time.
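In Hermes this behaviour comes from the LLM's tool use, not from hand-written parsing. Purely to illustrate the shape of the data, here is a deterministic stand-in that spots a GitHub repo URL in a message and builds TODO entries with the `in_progress` and `pending` statuses mentioned above. The function name, the regex, and the task wording are all hypothetical.

```python
import re

# Matches https://github.com/<owner>/<repo> with an optional .git suffix.
REPO_URL = re.compile(
    r"https?://github\.com/([\w.-]+)/([\w.-]+?)(?:\.git)?(?=[/\s?#]|$)"
)


def todos_from_message(text: str) -> list[dict]:
    """Return TODO entries for the first repo URL found, or an empty list."""
    match = REPO_URL.search(text)
    if match is None:
        return []
    repo = f"{match.group(1)}/{match.group(2)}"
    steps = [f"Review {repo}", f"Check CI status for {repo}", "Suggest next steps"]
    # The first step starts immediately; the rest wait their turn.
    return [
        {"task": step, "status": "in_progress" if i == 0 else "pending"}
        for i, step in enumerate(steps)
    ]
```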

Multi-Step Agentic Reasoning — the full loop

Fetch repos → Extract language/CI/test signals →
Reason about gaps → Generate targeted project per gap →
Deliver via Telegram → Save to memory → Update skill file →
Listen for replies → Track progress → Plan next steps

Each step informs the next. This is what separates Hermes Mentor from a chatbot.
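To make the "extract signals, then reason about gaps" steps concrete, here is a small rule-based sketch that works on a list of file paths. In the real project the gap reasoning is done by the local LLM; `extract_signals`, `find_gaps`, and the path conventions they check are illustrative assumptions only.

```python
def extract_signals(paths: list[str]) -> dict[str, bool]:
    """Derive coarse audit signals from a repo's file paths."""
    lowered = [p.lower() for p in paths]
    return {
        "has_ci": any(
            p.startswith(".github/workflows/") or p in (".gitlab-ci.yml", ".travis.yml")
            for p in lowered
        ),
        "has_tests": any(
            p.startswith(("tests/", "test/")) or p.rsplit("/", 1)[-1].startswith("test_")
            for p in lowered
        ),
        "has_readme": any(p.startswith("readme") for p in lowered),
    }


def find_gaps(signals: dict[str, bool]) -> list[str]:
    """Turn missing signals into human-readable gap labels."""
    labels = {
        "has_ci": "No CI/CD pipeline",
        "has_tests": "No automated tests",
        "has_readme": "No README",
    }
    return [label for key, label in labels.items() if not signals[key]]
```

Each gap label could then seed one roadmap project, which is the "targeted project per gap" step in the loop.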

Local LLM via Ollama — private by design

The entire reasoning layer runs on your machine through Ollama. No OpenAI. No Anthropic. No billing. Your GitHub activity, your learning gaps, your daily habits — stay on your box. This felt philosophically aligned with Hermes itself — an open source agent you run on your own infrastructure.
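For readers who have not used Ollama, this is roughly what a local call looks like over its REST API, using only the standard library. It assumes Ollama is running on its default port and that the `qwen2.5-coder:7b` model has been pulled; the helper names are hypothetical and this is not necessarily how `mentor_agent.py` talks to the model.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(prompt: str, model: str = "qwen2.5-coder:7b") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def ask_local_llm(prompt: str) -> str:
    """Send the prompt to the local server and return the reply text."""
    with urllib.request.urlopen(build_request(prompt), timeout=300) as resp:
        return json.loads(resp.read())["response"]
```

The request only ever targets `localhost`, which is what keeps the reasoning step on your own machine.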


Why I Built This

I'm an SDE at KFintech and I've been building things for years. But I remember tutorial hell clearly — watching videos for months, feeling productive, then sitting down to build something real and freezing completely.

The problem isn't knowledge. It's the gap between watching and doing.

Roadmap.sh gives you a static path. GitHub Copilot helps you write code. But nobody looks at what you've actually built and tells you specifically what to build next to close your specific gaps.

That's Hermes Mentor. And watching it scan my own GitHub, find my real gaps, generate a roadmap, send it to my Telegram, and then reply when I shared my repo — that moment reminded me why I love building things.


Built by Aditya Swayam Siddha · @TheCoderAdi

Top comments (62)

Mykola Kondratiuk

this is one of those problems that's 10% information and 90% accountability loop. most people in tutorial hell already know what to build - they just need the forcing function.

Aditya

Exactly this. And I think that's why most learning tools fail: they keep solving the 10% information problem with more information.

The Telegram nudge at 08:30 every morning isn't smart. It's not even that technical. But it's there, every day, with your name on it. That's the whole point.

The best mentor I ever had didn't teach me much I couldn't have Googled. He just kept asking "did you build it yet?" until I did.

Mykola Kondratiuk

yeah, the failure mode is never "not smart enough" — it is "smart but inconsistent." most habit tools personalize the content when they should be personalizing the pressure. your name at 08:30 > a personalized curriculum you check whenever motivation shows up.

Aditya

"Personalising the pressure, not the content." That's a better one-line description of Hermes Mentor than anything I wrote in the whole post. 😂

You've just described exactly why every personalised learning app with beautiful dashboards and AI-curated paths still loses to a friend who texts you "bro did you push today."

Might have to steal that line for the v2 landing page if that's okay with you.

Mykola Kondratiuk

haha steal it. the friend-who-texts case is the one worth solving for - social pressure doesn't compress into a feature, it's relational.

Aditya

"Social pressure doesn't compress into a feature, it's relational." Okay, now you're just writing my thesis for me. 😂

And that's the honest limitation of Hermes Mentor right now. The 08:30 message shows up but it doesn't know you didn't sleep, had a bad day, or just got a new job. A real friend adjusts. The bot doesn't.

Maybe that's the actual v2 problem: not smarter curriculum, not better gap detection, but making the pressure feel less like a cron job and more like someone who notices when you go quiet.

Mykola Kondratiuk

yeah that's exactly the ceiling. to fix it you'd need the user to pipe in life state constantly - but then it stops feeling like a friend and starts feeling like a form. can't route around that.

Aditya

Yeah that's the trap. The moment you ask "how are you feeling today?" every morning it becomes a mood tracker with extra steps.

Maybe the answer isn't asking at all, just observing. GitHub commit frequency drops for 5 days, tone of messages changes, nudge adapts. No form, no friction. The signal is already there, you just have to stop ignoring it.

Still relational. Just quieter about it.

Mykola Kondratiuk

passive observation is better UX but it creates a consent problem - watching commit cadence to infer emotional state is surveillance dressed as care. the friend framing breaks the moment you reveal what signals the system is actually reading.

Aditya

And there it is - the real ceiling. Not technical, ethical.

You're right. "We noticed you haven't pushed in 5 days" hits completely differently depending on whether you consented to being watched or just signed up for a learning tool. Same signal, same intent, totally different feeling.

Maybe the honest version is just transparency upfront, "here's exactly what I observe and why", but then you're back to the form problem. Consent UI is still a form.

Some problems don't have a clean solution. The friend works because there's no terms of service between friends. The moment you productise care, you change what it is.

This has genuinely been the best comment thread I've had on anything I've ever posted.

Mykola Kondratiuk

disagree slightly — separating ‘technical’ from ‘ethical’ here lets the implementation off the hook. the consent architecture IS the ethics: opt-in vs opt-out, retention window, what triggers the alert vs what just sits in logs — all of that is engineering. calling it a values problem after the system is built is what creates the surveillance feel. ‘we meant well’ is not a consent model.

Aditya

Fair pushback, and you're right to call that out.

I was using 'ethical' as a shorthand for 'hard' when really I was just deferring the engineering. Consent architecture being the ethics means you have to design it before the first line of data collection code, not bolt it on after users complain.

Opt-in observation with explicit retention windows and user-visible logs of exactly what triggered a nudge: that's buildable. It's just uncomfortable to design upfront because it forces you to confront what the system is actually doing before you can hide behind good intentions.

'We meant well' is not a consent model. Genuinely adding that to a doc somewhere. That's the clearest I've heard that problem stated.

This thread has rewritten how I'm thinking about v2 more than anything else. Thank you for not letting me off the hook.

Mykola Kondratiuk

yeah exactly - the schema usually betrays it. if the data model gets built before the consent layer, it's optimized for collection, not user control. the retention window problem's the same - teams define it as a DB policy, users never see it or change it. the 'user-visible logs' part almost never ships.

Aditya

"The schema usually betrays it" is such a precise way to put it. You can read the ethics of a system from the data model before a single line of product code is written.

And the user-visible logs point is just true. It's always on the roadmap, always deprioritised, ships in 0% of v1s. Not because teams are malicious, but because it adds friction to the happy path and nobody's screaming for it until something goes wrong.

I think the honest constraint for Hermes Mentor is: keep it local. The whole point is nothing leaves your machine. No central DB, no retention policy to define, no logs to hide. The consent model is just: it's your filesystem, you can delete it anytime.

Not a solution to the general problem you're describing. But maybe local-first is the only architecture that sidesteps it cleanly.

Mykola Kondratiuk

if you can read the ethics from the schema, you can write them in before the first migration. consent fields as non-nullable — opt_in, retention_until, audit_log_visible — in the spec before the model exists. the deprioritization problem is a sequencing problem: consent always gets queued for after we have users, which is exactly when retrofitting it is most expensive.

Aditya

Non-nullable consent fields in the spec before the model exists. That's the whole answer in one sentence.

And the sequencing point is brutal because it's so obvious in hindsight, "after we have users" is precisely when you can't change the schema without a migration, can't change the UX without retraining expectations, and can't change the data policy without a terms update nobody reads. You've locked yourself in by waiting.

Consent as a DB constraint rather than a product feature means it literally cannot be skipped or deprioritised. The migration fails. The build breaks. That's the only forcing function that actually works.
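A minimal sketch of that forcing function using SQLite from Python's standard library. The field names `opt_in`, `retention_until`, and `audit_log_visible` come from the comment above; the `signals` table and the `register_signal` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE signals (
        name              TEXT PRIMARY KEY,
        opt_in            INTEGER NOT NULL CHECK (opt_in IN (0, 1)),
        retention_until   TEXT    NOT NULL,  -- ISO date
        audit_log_visible INTEGER NOT NULL CHECK (audit_log_visible IN (0, 1))
    )
    """
)


def register_signal(name, opt_in=None, retention_until=None, audit_log_visible=None):
    """Insert a signal; raises sqlite3.IntegrityError if consent data is missing."""
    conn.execute(
        "INSERT INTO signals VALUES (?, ?, ?, ?)",
        (name, opt_in, retention_until, audit_log_visible),
    )
```

Registering a signal without its consent fields fails at the database layer, so there is no code path where it can be "filled in later".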

Honestly this thread has given me a cleaner framework for thinking about v2 than months of reading could have. You should write this up properly. I'd read that post.

Mykola Kondratiuk

the non-nullable approach is right but only works when you know your data model in advance — the harder case is the consent field you discover you needed after you have been collecting event data for 6 months. most teams end up backfilling a nullable column and calling it done. the migration is survivable. the audit conversation is not.

Aditya

The nullable backfill is the technical debt equivalent of "we'll get consent later" - survivable in the DB, catastrophic in the boardroom.

And the audit conversation is uniquely brutal because you can't un-collect 6 months of data. You can patch the schema. You can't patch the timeline. Every row with a null consent field is a timestamp that says "we were collecting this before we asked."

Maybe the realistic answer isn't perfect upfront design, it's a consent changelog. Same discipline as a migration file but for data policy. Every new signal you start collecting gets a dated entry: what, why, retention window, user visibility. Not pretty. But at least when the audit conversation happens you have a paper trail that shows intent evolved, not that it was always absent.

Still not airtight. But "documented evolution" beats "undated nullable column" in most rooms.

Mykola Kondratiuk

right, and the schema patch is one sprint. the 'what did we actually collect and why' conversation is four teams, a legal review, and three retrospectives. by the time you're in that room the fix doesn't matter - what matters is who signed off on the original null.

Aditya

"Who signed off on the original null" - that's not a technical question anymore. That's a career question.

And that's what makes it so hard to fix culturally. The engineer who added the nullable column wasn't being malicious, they were being pragmatic under a deadline. But pragmatic decisions don't come with signatures. Nobody writes "I deprioritised consent because we needed to ship" in a commit message.

So the audit room reconstructs intent from timestamps and Slack threads and whoever is still at the company. Which is why the consent changelog idea matters less as a technical artifact and more as a paper trail of accountability. Not "here's our perfect data policy" but "here's the name, the date, and the reason" for every decision.

Makes the nullable survivable. Barely.

Mykola Kondratiuk

fair - "career question" is the better frame. deadline explains the vector, not the decision. the cultural failure is treating consent as a feature rather than a constraint - once it moves to the backlog, it's already lost.

Aditya

"Once it moves to the backlog it's already lost" - that's the whole thing.

Features get prioritised. Constraints get enforced. The moment consent is a ticket it has a priority, an owner, a sprint, and a reason to slip. The moment it's a constraint it has none of those; it just blocks.

I think that's the reframe that actually changes behaviour at the team level. Not "we should care more about consent" - teams already agree with that. But "consent is not a feature, stop putting it in Jira."

Started this thread talking about a cron job that texts developers. Ending it with a framework for ethical system design I'll be thinking about for a while.

Seriously, write this up. The progression from schema to backlog to audit room to career question is a complete argument. It deserves its own post.

Mykola Kondratiuk

exactly right - and the signal that it crossed over is usually subtle. no one announces "we are treating consent as a feature now." what you see is the first sprint where it gets a story point, which feels like progress. that is the conversion moment.

Aditya

The story point is the tell. It feels like progress because the team finally "took it seriously" but the act of sizing it is the admission that it was negotiable all along.

And nobody flags it in the moment because it looks like process working correctly. Ticket created, pointed, assigned. The system is healthy. Except the thing being ticketed shouldn't be in the system at all.

I think those are the hardest antipatterns to catch: the ones that look exactly like good engineering from the outside.

Okay, I have to ask: do you work in privacy engineering, or is this just how you think about systems?

Mykola Kondratiuk

the sizing is the tell — not the ticket itself, but the fact that someone estimated it. the moment it shifts from what do we owe users to how long will this take, the frame has already moved and the process looks completely normal while it happens.

Aditya

"What do we owe users" to "how long will this take": that's the exact moment, and you've just named it more precisely than I've ever heard it named.

The frame shift is invisible because both questions sound responsible. One is ethics. One is planning. Planning wins every time because it has a meeting format.

I'm going to stop trying to match this and just say you should genuinely write this up. The full arc from sizing as the conversion moment to the audit room to the career question is one of the clearest arguments I've read on why consent fails in practice. Not theory, not regulation, just the exact sequence of normal-looking decisions that gets you there.

I'd reshare it. And I think a lot of people in this thread would too.

Mykola Kondratiuk

there's probably a piece in here. your 'planning wins because it has a meeting format' is the cleaner frame - ethics questions don't have a retro slot, a JIRA status, a review cadence. they fall out of the rhythm before anyone decides to drop them.

Aditya

"Ethics questions don't have a retro slot": that's the piece title right there.

And that's the real structural problem. It's not that teams don't care. It's that the entire engineering process is built around things that recur on a schedule: standups, sprints, retros, reviews. Ethics doesn't recur. It arrives once, quietly, at the wrong moment, with no owner and no agenda item.

If you write it I'll link it from the Hermes Mentor repo. Feels right that a post about a learning agent ends up pointing to something about why the systems we build forget to ask permission.

Seriously though, write it.

Mykola Kondratiuk

yeah the "doesn't recur" framing is the real root cause. anything without a calendar slot gets treated as optional by default - not because the team doesn't care but because the system has no hook for it. hijacking an existing ceremony is probably the path of least resistance there.

Aditya

Hijacking an existing ceremony is underrated as an implementation strategy. Don't create an ethics meeting nobody will attend, just add a few questions to the retro that already exists. "Did we collect anything new this sprint? Who can see it? Can they delete it?"

Three questions. Existing slot. No new process to maintain.

The system doesn't need a new hook. It needs someone to staple ethics to the hook that's already there before the first sprint starts.

Mykola Kondratiuk

three questions in an existing slot is exactly right - new processes die because they have no host. retro already has one

Aditya

And that's the whole thing in one line.

New process, no host: dies in week 3. Three questions, existing host: survives because the meeting was already happening anyway.

The best system design is the one that works with human laziness, not against it.

Alright I need to go build v2 now. You've given me more to think about than I expected from a comment on a Telegram bot post. Genuinely. 😄

Mykola Kondratiuk

nah, works until retro gets cancelled - then you've got no host AND no habit. but go build it, curious what v2 changes

Aditya

Haha, fair. A parasitic process inherits the host's mortality too.

v2 changes: consent fields non-nullable, commit frequency as signal not surveillance, and probably a retro question that survives even when the retro doesn't.

Will post it. Watch this space. 😄

Mykola Kondratiuk

non-nullable consent is the right call - optional consent fields always end up as "fill in later" and never do. most curious about the retro question that survives the retro - are you thinking async signal collection, or still requires someone to trigger it?

Aditya

Async signal collection, it has to be. Anything that requires someone to trigger it inherits the same problem as the retro: it exists when someone remembers it exists.

The version I'm thinking for v2: every time a new signal type gets added to the GitHub audit, it writes a dated entry to consent_changelog.json automatically. Not a manual step, not a ceremony, just a side effect of the code that collects the signal. The collection and the consent record are the same commit.

The retro question that survives the retro is basically that: "did we add any new signals this sprint" gets answered by the changelog, not by a person remembering to ask it. The question moves from ceremony to artifact.

Still has a gap though: it only works for signals you know you're collecting. The ones that slip through are the derived signals. Inferring emotional state from commit cadence isn't in any schema field, it's in the interpretation layer. Not sure how you log consent for a conclusion rather than a data point.
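A minimal sketch of the "collection and consent record are the same commit" idea: the wrapper that runs a collector also writes the dated changelog entry the first time a signal name is seen. Only the `consent_changelog.json` filename comes from the comment above; `collect_signal` and the entry fields are hypothetical.

```python
import json
from datetime import date
from pathlib import Path


def collect_signal(changelog: Path, name: str, why: str, retention_days: int, collector):
    """Run `collector`, logging a dated consent entry the first time `name` is seen."""
    entries = json.loads(changelog.read_text()) if changelog.exists() else []
    if not any(entry["signal"] == name for entry in entries):
        entries.append({
            "signal": name,
            "why": why,
            "retention_days": retention_days,
            "user_visible": True,
            "added": date.today().isoformat(),
        })
        changelog.write_text(json.dumps(entries, indent=2))
    # The signal is only collected after its consent entry exists.
    return collector()
```

As noted above, this covers raw signals only; a derived conclusion would need its own `name` and entry to show up in the log.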

Mykola Kondratiuk

dated write-on-add is the right shape. one edge: when a signal type gets renamed or split, old consent_change entries lose their anchor. versioning the signal schema in the log buys you the diff later.

Aditya

Signal schema versioning in the log: that's the missing piece I didn't see coming.

Without it you get a changelog that's accurate at write time and meaningless at audit time. "commit_cadence" from six months ago might be half of what "commit_frequency_7d" and "commit_frequency_30d" are today, and there's no way to reconstruct that split from the log alone.

The shape I'm thinking: each consent entry gets a signal_version field alongside the timestamp. When a signal gets renamed or split, a migration writes a transition record: signal_old, signal_new, date, reason. The changelog becomes a diff history, not just an append log.

Same discipline as database migrations. You don't delete the old schema, you document the change.

Starting to think consent_changelog.json was the wrong format for this. It wants to be a table.

Mykola Kondratiuk

migration layer is the actual fix — log the canonical type name at write time, keep a schema_history map of old names to new splits. audit joins on schema version. without it every rename invalidates consent evidence retroactively.

Aditya

schema_history as the join key is exactly right. canonical name at write time means the audit can always reconstruct what "commit_cadence" meant at the point of collection even after three renames and two splits.

Without that join the consent log is just timestamps. With it it's evidence.

This has fully redesigned the consent layer in real time. v2 SQLite schema now has five tables: users, audits, roadmaps, and two consent tables: consent_log with signal_version, and schema_history with canonical_name, old_name, new_names, migration_date, reason.

The audit joins on schema_version. The evidence survives the rename.
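A rough sketch of how that audit join might look in SQLite. The column names are the ones listed above; the join key (matching a logged signal name against `old_name`), the comma-separated `new_names` encoding, and the sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE consent_log (
        signal         TEXT    NOT NULL,
        signal_version INTEGER NOT NULL,
        recorded_at    TEXT    NOT NULL
    );
    CREATE TABLE schema_history (
        canonical_name TEXT NOT NULL,
        old_name       TEXT NOT NULL,
        new_names      TEXT NOT NULL,  -- comma-separated (assumed encoding)
        migration_date TEXT NOT NULL,
        reason         TEXT NOT NULL
    );
    -- Made-up sample rows: one consent entry, then a later split.
    INSERT INTO consent_log VALUES ('commit_cadence', 1, '2025-01-10');
    INSERT INTO schema_history VALUES (
        'commit_frequency', 'commit_cadence',
        'commit_frequency_7d,commit_frequency_30d',
        '2025-07-01', 'split into 7-day and 30-day windows'
    );
    """
)

# Resolve each logged signal to whatever it was later renamed or split into.
rows = conn.execute(
    """
    SELECT c.signal, c.recorded_at, h.new_names, h.reason
    FROM consent_log AS c
    LEFT JOIN schema_history AS h ON h.old_name = c.signal
    """
).fetchall()
```

The `LEFT JOIN` keeps consent entries for signals that were never renamed, so nothing drops out of the audit view.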

I'm going to go build this now before the architecture gets any better and I never ship it. 😂

Mykola Kondratiuk

the "timestamps vs evidence" framing is honestly the cleanest way to sell this to non-technical stakeholders - same underlying gap but "can't reconstruct intent" gets budget where "missing join key" doesn't

Aditya

That translation layer is half the job in any room with mixed technical and non-technical people.

"Missing join key" is correct. "Can't reconstruct intent" is what gets the budget, the priority, and the legal team's attention. Same gap, completely different urgency depending on who's hearing it.

Honestly, that's the real skill nobody teaches: not the architecture, but knowing which frame unlocks the decision in which room. Engineers speak in joins. Stakeholders speak in liability.

I'm adding that to the v2 README under the consent section. Not just the schema, but the sentence that explains why it matters to someone who will never read the schema.

Mykola Kondratiuk

exactly - the accountability angle matters too. if you cannot tell what the tool was trying to do, you cannot assign a fix owner. missing join key gets fixed in the schema. missing intent becomes a post-mortem.

Aditya

"Missing join key gets fixed in the schema. Missing intent becomes a post-mortem."

That's the whole argument in two sentences. The technical debt you can pay back in a sprint. The accountability debt you pay back in a room nobody wants to be in.

And the fix owner point is underrated: schema_history doesn't just preserve consent evidence, it preserves authorship. Who added the signal, when, why. The post-mortem has a name attached before it starts.

Which means schema_history is actually three things at once: a diff log, a consent record, and an accountability trail. One table doing the work of a process most teams never build.

Okay I really need to go write this code now. You've turned a Telegram bot into a compliance system and I'm genuinely here for it. 😂

Mykola Kondratiuk

who and when schema_history gets automatically. the why needs a contemporaneous annotation — something written at add-time, not reconstructed from commit timestamps after the incident. that gap is why intent debt survives schema versioning.

Andy Stewart

Hits the nail on the head! From scanning real repos to seamless Telegram integration, this is a true AI-native mentor. The local-first architecture with persistent local memory protects privacy while compounding development skills. A brilliant way to escape tutorial hell and revive stagnant code!

Aditya

Thanks Andy!!

xulingfeng

Great question! Entity disambiguation across sessions is actually where the trust scoring system shines. Instead of relying purely on LLM extraction (which can hallucinate entities), we layer it: (1) rule-based extraction for known patterns like names and project references, (2) jieba tokenizer for Chinese entity boundaries, then (3) a confidence filter that rejects entities with < 0.3 trust score. The unexpected win was the alias system — mapping "xulingfeng" ↔ "许凌峰" ↔ "许工" let us span English and Chinese mentions without duplicating entries. Curious — how does your approach handle multi-language entity references?

Aditya

It would have been better to post this in the thread where we were discussing it, but never mind 😆

That layered extraction approach is really elegant: rule-based first, then tokenizer, then confidence filter. The trust score threshold as a rejection layer is something I hadn't considered but makes total sense. LLM hallucinating entities into memory is exactly the kind of silent corruption that would be a nightmare to debug weeks later.

The alias system is genuinely impressive. Spanning English and Chinese mentions without duplicating entries solves a problem most Western-built tools don't even think about.

Honest answer on my side: right now I don't handle multi-language entity references at all. The USER.md approach is flat markdown, so it's as smart as the LLM writing it. Works fine for English GitHub profiles but would fall apart fast with mixed-language repos or non-Latin usernames.

This is making me think the right v2 move is to not reinvent this and instead look at integrating something like what you've built as the memory layer rather than raw markdown files. A trust-scored entity store would make the audit results significantly more reliable over time.

Are you planning to open source MemBridge? Would genuinely love to dig into the implementation. 😄

Stephen Sebastian

Tutorial hell is real — and an agent that learns your progress instead of resetting every session is exactly the fix. Love the mentor angle.

Aditya

Thanks Stephen!!

Klaudia Grzondziel

This looks super cool and useful! 👏🏻 One question, though: isn't the local run consuming too many resources? I remember running Gemma on Ollama got my laptop completely frozen at some point 🥶

Aditya

Hey Klaudia! Your laptop won't turn into a space heater, I promise 😅

The LLM only fires once, during the audit (~60-90 sec), then goes back to sleep. Daily nudges are just cached JSON, so your CPU can relax.

Close Chrome's 47 tabs before running though. That's on you 😂
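
To make the "cached JSON" point concrete, here is a rough sketch of how a daily nudge can be served without waking the model. The cache layout (a `nudges` object keyed by ISO date) and the `todays_nudge` helper are assumptions for illustration, not the actual Hermes Mentor file format.

```python
import json
from datetime import date


def todays_nudge(cache_text, today):
    """Return the cached challenge for `today`, or None if there isn't one.

    Assumes a hypothetical cache shaped like:
    {"nudges": {"2025-01-06": "Add a GitHub Actions workflow"}}
    """
    if today.weekday() >= 5:  # Saturday or Sunday: nudges are weekday-only
        return None
    cache = json.loads(cache_text)
    return cache.get("nudges", {}).get(today.isoformat())


if __name__ == "__main__":
    cached = json.dumps({"nudges": {"2025-01-06": "Add a GitHub Actions workflow"}})
    print(todays_nudge(cached, date(2025, 1, 6)))
```

The lookup is a dictionary read, so the only expensive step remains the one-off audit.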

Klaudia Grzondziel

"Close Chrome's 47 tabs before running though. That's on you 😂"

Ahahaha, you got me with this! 😂

Aditya

Haha glad it landed! 😄
Let me know if you try it out 🚀

xulingfeng

Love the Hermes Mentor concept — especially the "break the tutorial loop" angle. I've been running Hermes locally for a while and the execution boundary pattern you're using is solid.

One thing I'd be curious about: how are you handling the knowledge retention across sessions? Tutorial hell is partly a memory problem — you learn something, don't use it for 2 weeks, and it's gone. Does Mentor remember what you've covered?

Aditya

Thanks! And yes, memory across sessions was actually one of the first things I designed around, because you're spot on that forgetting is half the problem.

Every audit writes a USER_<username>.md into ~/.hermes/memory/ which Hermes loads automatically at the start of every future session. It tracks gaps identified, current roadmap week, what's been closed, and audit history over time.

So if you completed Week 1 and come back 2 weeks later, Hermes already knows that; it doesn't start from scratch.
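
Roughly, the write step looks something like the sketch below. Treat it as an illustration under assumptions: the section headings, the `render_profile` and `write_profile` names, and their parameters are hypothetical; only the `USER_<username>.md` filename and the `~/.hermes/memory/` location come from the description above.

```python
from pathlib import Path

# Location described above; passed as a parameter so it can be overridden.
DEFAULT_MEMORY_DIR = Path.home() / ".hermes" / "memory"


def render_profile(username, gaps, closed, week, audit_dates):
    """Build a flat markdown profile (hypothetical layout)."""
    lines = [f"# Developer profile: {username}", "", f"Current roadmap week: {week}", ""]
    lines += ["## Open gaps"] + [f"- {g}" for g in gaps if g not in closed] + [""]
    lines += ["## Closed gaps"] + [f"- {g}" for g in closed] + [""]
    lines += ["## Audit history"] + [f"- {d}" for d in audit_dates]
    return "\n".join(lines) + "\n"


def write_profile(username, body, memory_dir=DEFAULT_MEMORY_DIR):
    """Write USER_<username>.md, overwriting the previous audit's copy."""
    memory_dir = Path(memory_dir)
    memory_dir.mkdir(parents=True, exist_ok=True)
    path = memory_dir / f"USER_{username}.md"
    path.write_text(body, encoding="utf-8")
    return path
```

Because the file is plain markdown, it can be read or hand-edited between runs, which is also why it is only as structured as whatever wrote it.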

The honest limitation right now: it tracks what projects you've pushed via GitHub re-audits, but it doesn't yet track conceptual retention (like "did you actually understand Jest or just copy-paste it"). That would need either a quiz layer or richer commit analysis. Definitely something I want to explore next.

Curious: how are you handling long-term memory in your Hermes setup?

xulingfeng

Great breakdown! The USER_<username>.md approach is clever. We went a different route with MemBridge: SQLite with entity linking and trust scoring. The conceptual retention problem is real. Curious whether you have thought about integrating access frequency into the audit flow?

Aditya

MemBridge sounds really interesting. SQLite with trust scoring is a much more structured approach than flat markdown files. Would love to see how you handle entity disambiguation across sessions, especially when the same concept appears in different contexts.

On access frequency: honestly, I hadn't thought about it that way, but now I can't stop thinking about it. The idea of weighting audit signals by how recently and how often a language or pattern appears in commits is really compelling. A repo you touched once 2 years ago shouldn't carry the same weight as something you pushed to last week.

Could even go further: commit density + file churn rate as a proxy for "am I actually using this or just copy-pasting." That's richer than just language bytes.
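
A quick sketch of what that weighting could look like, purely as a thought experiment: each commit contributes a weight that halves every `HALF_LIFE_DAYS`, so frequency and recency both count. The 180-day half-life, the function names, and the `(language, commit_time)` input shape are all assumptions, not anything in the current audit.

```python
from datetime import datetime, timedelta, timezone

HALF_LIFE_DAYS = 180  # assumed: a commit's weight halves every 180 days


def recency_weight(commit_time, now, half_life_days=HALF_LIFE_DAYS):
    """Exponential decay: 1.0 for a commit made right now, 0.5 one half-life ago."""
    age_days = max((now - commit_time).total_seconds() / 86400, 0.0)
    return 0.5 ** (age_days / half_life_days)


def language_scores(commits, now):
    """Sum decayed weights per language; `commits` is (language, commit_time) pairs."""
    scores = {}
    for language, when in commits:
        scores[language] = scores.get(language, 0.0) + recency_weight(when, now)
    return scores


if __name__ == "__main__":
    now = datetime(2025, 1, 1, tzinfo=timezone.utc)
    commits = [
        ("Python", now - timedelta(days=3)),
        ("Python", now - timedelta(days=10)),
        ("Go", now - timedelta(days=730)),  # touched once, two years ago
    ]
    print(language_scores(commits, now))
```

Summing per commit means a language pushed to often and recently outranks one that only appears in an old repo, which language bytes alone cannot show.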

Adding this to the v2 list for real. Thanks for the nudge; this comment thread is turning into a roadmap 😄
