Hermes Agent – The Complete Guide

What's this about?

Hermes Agent is an open-source AI agent from Nous Research that improves itself through use: it authors its own skills from experience, curates memory across sessions, and runs on virtually any infrastructure, from a cheap VPS to cloud clusters. Through a gateway, it connects 15+ messaging platforms (Telegram, Discord, Slack, WhatsApp, Signal, iMessage, Matrix, …) into a single personal, persistent assistant.

Repo: NousResearch/hermes-agent · License: MIT

1. What is Hermes?​

Hermes is neither a coding copilot nor just another chatbot shell. It is an autonomous agent with:

  • Closed-loop learning – curated memory + periodic nudges
  • Skill self-authoring – the agent writes its own skills after complex tasks
  • Cross-session recall – SQLite full-text search + Gemini Flash summarization
  • User modeling via Honcho integration
  • 6 terminal backends – local, Docker, SSH, Daytona, Singularity, Modal
  • Subagent spawning for parallel workstreams
  • Cron scheduler for recurring tasks (60-second tick)

Hermes vs. Claude Code vs. OpenClaw​

| Aspect | Claude Code | OpenClaw | Hermes |
|---|---|---|---|
| Primary purpose | Coding in terminal/IDE | Messaging gateway | General-purpose agent with learning loop |
| Model provider | Anthropic | flexible | OpenRouter, Anthropic, OpenAI, Ollama, Copilot, Nous Portal, DeepSeek, Gemini, … |
| Self-improvement | – | – | Yes (skill_manage, memory curation) |
| Terminal backends | 1 (local) | 1 + sandbox | 6 |
| Channels | – | 10+ | 18+ |
| Voice mode | – | – | Yes (TTS + voice-capable channels) |

2. Prerequisites​

  • Git (the installer handles everything else)
  • Linux, macOS, WSL2, or Android/Termux
  • API key for an LLM provider (or a local Ollama model)

Windows

There is no native Windows support. On Windows, use WSL2.

The installer automatically fetches:

  • uv (Python package manager)
  • Python 3.11
  • Node.js v22
  • ripgrep (fast file search)
  • ffmpeg (audio conversion for voice mode)

3. Installation​

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

The installer auto-detects Linux, macOS, WSL2, and Termux.

From source (for contributors)​

git clone https://github.com/NousResearch/hermes-agent.git
cd hermes-agent
# Setup per the contributing guide

First steps after installation​

source ~/.bashrc          # or ~/.zshrc
hermes # start an interactive session

Setup wizard​

hermes setup              # complete wizard
# or individual sections:
hermes model # configure LLM provider
hermes tools # enable/disable tools
hermes gateway setup # set up messaging platforms
hermes doctor # diagnostics

4. Configuring the LLM Provider​

Configuration precedence (highest first)​

  1. CLI arguments (per invocation)
  2. config.yaml (primary settings, no secrets)
  3. .env (API keys & secrets)
  4. Built-in defaults
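
The precedence order above can be pictured as a layered lookup in which the first layer that defines a key wins. The following is a minimal sketch of that idea, not Hermes' actual loader: the setting names `timeout` and `api_key`, the default values, and the `claude-sonnet` override are hypothetical.

```python
from collections import ChainMap

# Hypothetical layer contents; real keys and defaults are not specified in this guide.
builtin_defaults = {"provider": "openrouter", "model": "default-model", "timeout": 60}
env_file = {"api_key": "sk-or-example"}                               # stands in for ~/.hermes/.env
config_yaml = {"provider": "anthropic", "model": "claude-opus-4-7"}   # stands in for config.yaml
cli_args = {"model": "claude-sonnet"}                                 # per-invocation override


def resolve(cli, config, env, builtin):
    """Return the effective settings; earlier mappings shadow later ones."""
    return dict(ChainMap(cli, config, env, builtin))


settings = resolve(cli_args, config_yaml, env_file, builtin_defaults)
print(settings["model"])     # taken from the CLI layer
print(settings["provider"])  # taken from config.yaml
print(settings["timeout"])   # falls through to the built-in default
```

`ChainMap` searches its mappings left to right, so the argument order mirrors the list above.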

Supported providers​

  • OpenRouter – easiest entry point, many models
  • Anthropic – Claude Opus/Sonnet/Haiku
  • OpenAI – GPT-4o, o-series
  • Copilot – via ChatGPT OAuth
  • Custom endpoints – Ollama, vLLM, LM Studio, your own OpenAI-compatible servers
  • Nous Portal – Hermes in-house models
  • DeepSeek, Gemini, and more via the provider registry

Universal schema​

# ~/.hermes/config.yaml
llm:
  provider: anthropic
  model: claude-opus-4-7
  # base_url: https://meine-ollama:11434/v1   # optional, overrides provider

# ~/.hermes/.env
ANTHROPIC_API_KEY=sk-ant-...
OPENROUTER_API_KEY=sk-or-...

ENV substitution

${VAR_NAME} works inside config.yaml, which lets you inject secrets cleanly from .env without hardcoding them.
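
To make the substitution concrete, here is a small sketch of how `${VAR_NAME}` placeholders could be expanded in an already-parsed config. It is an illustration under assumptions, not Hermes' implementation: the `api_key` field is hypothetical, and raising on an undefined variable is a choice made for this example (the guide does not say how Hermes handles that case).

```python
import re

_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env):
    """Recursively replace ${VAR_NAME} in strings; raise KeyError on unknown names."""
    if isinstance(value, str):
        def replace(match):
            name = match.group(1)
            if name not in env:
                raise KeyError(f"undefined variable: {name}")
            return env[name]
        return _VAR.sub(replace, value)
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    return value  # numbers, booleans, None pass through unchanged


# `api_key` is a hypothetical field used only to show the substitution.
config = {"llm": {"provider": "anthropic", "api_key": "${ANTHROPIC_API_KEY}"}}
env = {"ANTHROPIC_API_KEY": "sk-ant-example"}
expanded = expand_env(config, env)
print(expanded["llm"]["api_key"])
```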

base_url disables the provider

Once base_url is set, Hermes ignores the provider field and calls the endpoint directly. Handy for self-hosting, but you have to handle authentication yourself.


5. Terminal Backends​

Hermes can execute tool calls in 6 different environments.

| Backend | Used for | Isolation | Cost |
|---|---|---|---|
| local | Fastest iteration, dev | none | – |
| docker | Secure local sandboxing | container | – |
| ssh | Remote box, homelab | host OS | – |
| daytona | Dev sandboxes in the cloud | container | low |
| singularity | HPC / research clusters | container | depends on cluster |
| modal | Serverless – "costs nearly nothing when idle" | container | per second |

Container security defaults​

  • Read-only root filesystem (Docker)
  • All Linux capabilities dropped
  • No privilege escalation
  • PID limit (256 processes)
  • Full namespace isolation

Local backend has no sandbox

local executes directly on your host. For risky workflows (web browsing, third-party scripts), always use docker or modal.
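
For orientation, the container defaults listed above map onto standard `docker run` flags. The sketch below only builds such an argument list; it is not the command Hermes actually issues, and the image name and function name are illustrative assumptions.

```python
import shlex


def hardened_docker_args(image, command, pids_limit=256):
    """Build a `docker run` argument list that mirrors the listed defaults."""
    return [
        "docker", "run", "--rm",
        "--read-only",                          # read-only root filesystem
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # block privilege escalation
        "--pids-limit", str(pids_limit),        # cap the number of processes
        image, *command,
    ]


# The image name is an example, not something Hermes prescribes.
args = hardened_docker_args("python:3.11-slim", ["python", "-c", "print('hi')"])
print(shlex.join(args))
```

Namespace isolation needs no extra flag here because Docker gives each container its own namespaces unless options such as `--pid=host` or `--network=host` are passed.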


6. Messaging Gateway​

hermes gateway setup

The wizard walks you through the configuration. Services that are already set up are marked.

Platform overview​

| Platform | Voice | Images/Files | Threads | Reactions |
|---|---|---|---|---|
| Telegram | ✅ | ✅ | – | – |
| Discord | ✅ | ✅ | ✅ | ✅ |
| Slack | ✅ | ✅ | ✅ | ✅ |
| WhatsApp | – | ✅ | – | – |
| Signal | – | ✅ | – | – |
| Matrix | ✅ | ✅ | ✅ | ✅ |
| Mattermost | ✅ | ✅ | – | – |
| iMessage (BlueBubbles) | – | ✅ | – | – |
| Email | – | ✅ | ✅ | – |
| SMS | – | – | – | – |
| Feishu/Lark, WeCom, Weixin, QQ, Yuanbao, DingTalk | partial | ✅ | partial | partial |
| Home Assistant, Webhooks | – | – | – | – |

Architecture​

A background process connects to all configured platforms, manages per-chat sessions, runs the cron scheduler (60s tick), and delivers voice messages.

  • Linux → systemd user or system service
  • macOS → launchd agent (including PATH/ENV configuration)

7. Memory System​

      • MEMORY.md – ~2,200 characters, environment notes & lessons learned
      • USER.md – ~1,375 characters, user preferences & style
    • config.yaml
    • .env

How memory works​

  • Snapshot at session start – MEMORY.md and USER.md are frozen into the system prompt and are not updated during the session. Reason: an unchanged prompt prefix keeps the LLM's prefix cache warm, which makes responses faster and cheaper.
  • Updates are written continuously by the agent but only become visible in the next session.
  • Cross-session recall runs through the session_search tool:
    • Full-text search in the SQLite sessions DB
    • Hits are summarized via Gemini Flash
    • This lets the agent surface content from conversations that are weeks old without keeping it actively in context

MEMORY.md vs. skills

  • MEMORY.md = declarative (what do I know? facts, preferences).
  • skills/ = procedural (how do I do X? repeatable workflows).
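
The snapshot behavior described under "How memory works" can be illustrated with a toy model: the prompt is rendered once at session start, while writes go straight to the persistent store. This is a sketch under assumptions, not Hermes' code; the class name, the dict standing in for the files, and the prompt layout are all hypothetical.

```python
class MemorySession:
    """Freeze memory at session start; later writes only reach the next session."""

    def __init__(self, store):
        self.store = store                   # shared dict standing in for the files on disk
        self.system_prompt = self._render()  # frozen snapshot for this session

    def _render(self):
        return f"MEMORY:\n{self.store['MEMORY.md']}\nUSER:\n{self.store['USER.md']}"

    def remember(self, note):
        # Persisted immediately, but the frozen prompt is deliberately left untouched.
        self.store["MEMORY.md"] += note + "\n"


store = {"MEMORY.md": "Staging uses Helm.\n", "USER.md": "Prefers short answers.\n"}

first = MemorySession(store)
first.remember("CI runs on GitHub Actions.")
print("GitHub Actions" in first.system_prompt)   # the running session does not see it

second = MemorySession(store)
print("GitHub Actions" in second.system_prompt)  # the next session does
```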

8. Skills System​

Skills are the agent's procedural memory: versionable Markdown documents with YAML frontmatter.

SKILL.md – format​

---
name: deploy-staging
description: Deploys the current repo to staging via Helm
version: 1.2.0
platforms: [linux, darwin]
---

## When to Use
When the user asks about "deploy", "staging", or "release candidate"
and a `helm/` folder exists in the repo.

## Procedure
1. `git status` → must be clean
2. Build the image: `docker build -t app:$(git rev-parse --short HEAD) .`
3. `helm upgrade --install app helm/ -f helm/values.staging.yaml`
4. Health check: `curl https://staging.example.com/health`

## Pitfalls
- Dirty working tree → abort
- Failed health check → `helm rollback`

## Verification
- HTTP 200 from the health endpoint
- Pods in status `Running`
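
When reviewing skills by hand, a small script can split a SKILL.md file into frontmatter and body and flag missing sections. The sketch below uses only the standard library and handles flat `key: value` frontmatter, not full YAML. Treating the four headings from the example as required is an assumption made here, not a documented Hermes rule.

```python
# Assumption: these are the headings from the example above, treated as required.
REQUIRED_SECTIONS = ("When to Use", "Procedure", "Pitfalls", "Verification")


def parse_skill(text):
    """Split a SKILL.md string into (metadata dict, body). Flat `key: value` only."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing frontmatter")
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            break
    else:
        raise ValueError("frontmatter is not closed")
    meta = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:])


def missing_sections(body):
    """Return the required `## ` headings that the body lacks."""
    headings = {line[3:].strip() for line in body.splitlines() if line.startswith("## ")}
    return [name for name in REQUIRED_SECTIONS if name not in headings]


SAMPLE = """---
name: deploy-staging
description: Deploys the current repo to staging via Helm
version: 1.2.0
---

## When to Use
When the user asks to deploy.

## Procedure
1. Check that the working tree is clean.
"""

meta, body = parse_skill(SAMPLE)
print(meta["name"], missing_sections(body))
```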

Progressive disclosure​

So that skills don't blow up the token budget, they are loaded in three stages:

  1. List metadata – just name + description (~3,000 tokens for the entire library)
  2. Full content – on the agent's request
  3. Reference files – deeper sub-files if referenced
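
The three stages above can be sketched as a loader that exposes only metadata by default and hands out larger content on request. This is an illustrative model with an in-memory dict; the class and method names are hypothetical and do not reflect Hermes' internal API.

```python
class SkillLibrary:
    """Three-stage loader: metadata list, full content, then reference files."""

    def __init__(self, skills):
        # skills maps name -> {"description", "content", "references": {filename: text}}
        self._skills = skills

    def list_metadata(self):
        """Stage 1: name and description only, cheap enough to keep in context."""
        return [
            {"name": name, "description": skill["description"]}
            for name, skill in sorted(self._skills.items())
        ]

    def load(self, name):
        """Stage 2: the full skill body, fetched only on request."""
        return self._skills[name]["content"]

    def load_reference(self, name, filename):
        """Stage 3: a deeper sub-file that the skill refers to."""
        return self._skills[name]["references"][filename]


library = SkillLibrary({
    "deploy-staging": {
        "description": "Deploys the current repo to staging via Helm",
        "content": "## Procedure\n1. git status\n2. helm upgrade --install app helm/",
        "references": {"rollback.md": "Run helm rollback when the health check fails."},
    },
})

index = library.list_metadata()
full = library.load("deploy-staging")
print(index)
print(len(full))
```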

Where do skills live?​

  • ~/.hermes/skills/ – source of truth, mirrored from the repo on first install
  • external_dirs in config.yaml – include external directories
  • agentskills.io – open standard, compatible with other agent platforms

Self-improvement: skill_manage​

The agent creates skills itself after:

  • complex tasks (≥ 5 tool calls)
  • successfully resolved errors
  • discovered non-trivial workflows

Actions: create, patch, edit, delete.

Review your skills

Skills are written by the agent, so review them regularly (hermes skills list, manual diff against a Git backup) before bad habits become entrenched.


9. Built-in Tools & Toolsets​

Tool categories​

| Category | Tools |
|---|---|
| Web | web_search, web_extract |
| Terminal & files | terminal, process, read_file, patch |
| Browser | browser_navigate, browser_snapshot, browser_vision |
| Media | vision_analyze, image_generate, text_to_speech |
| Agent orchestration | todo, clarify, execute_code, delegate_task |
| Memory & recall | memory, session_search |
| Automation & delivery | cronjob, send_message |
| Integrations | Home Assistant, MCP servers, RL training |

Toolsets​

Toolsets bundle tools logically and can be toggled on or off per agent / per channel:

web · terminal · file · browser · vision · image_gen · moa · skills
tts · todo · memory · session_search · cronjob · code_execution
delegation · clarify · homeassistant

hermes tools           # toggle toolsets

10. MCP – Model Context Protocol​

Hermes is an MCP client and can integrate any MCP server, e.g. filesystem, GitHub, Postgres, Brave Search, or your own servers.

# ~/.hermes/config.yaml
mcp:
  servers:
    github:
      command: npx
      args: ["-y", "@modelcontextprotocol/server-github"]
      env:
        GITHUB_TOKEN: ${GITHUB_TOKEN}

This extends the tool pool without touching Hermes' code.


11. Voice Mode​

On voice-capable channels (Telegram, Discord, Slack, Mattermost, Matrix, Feishu/Lark, WeCom, Weixin, QQ, Yuanbao):

  • Incoming voice messages → transcription
  • Outgoing replies → text_to_speech → audio reply

ffmpeg is bundled by the installer.


12. Subagents & Cron​

Delegation to subagents​

"Research in parallel: competitors A, B, and C – pricing,
tech stack, and the last 3 press releases for each."

The agent spawns parallel subagents via delegate_task, each with its own session, and consolidates their results.
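
The fan-out and consolidation pattern can be sketched with a thread pool. This only illustrates the shape of the workflow: `research` is a placeholder for a real subagent, and nothing here reflects how `delegate_task` is actually implemented.

```python
from concurrent.futures import ThreadPoolExecutor


def research(competitor):
    """Placeholder subagent; a real one would run its own session and tools."""
    return {"competitor": competitor, "summary": f"findings for {competitor}"}


def delegate_all(tasks, worker, max_workers=3):
    """Run one worker per task in parallel and return results in task order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, tasks))


results = delegate_all(["A", "B", "C"], research)

# Consolidation step: merge the per-subagent results into one report.
report = "\n".join(f"{r['competitor']}: {r['summary']}" for r in results)
print(report)
```

`pool.map` returns results in input order even when workers finish out of order, which keeps the consolidated report stable.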

Cron scheduler​

"Every weekday at 8:30, check the GitHub issues labeled `urgent`
and send me a summary on Telegram."

The agent registers a cronjob via the cronjob tool. The gateway scheduler ticks every 60s.
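
A tick-based scheduler of this kind can be sketched as a function that is called roughly once a minute and fires whatever is due. The job format, the function names, and the once-per-minute guard are assumptions for illustration; Hermes' real schedule syntax is not shown in this guide.

```python
from datetime import datetime


def is_due(now, hour, minute, weekdays=range(0, 5)):
    """True when `now` falls in the scheduled minute on an allowed weekday (Mon=0)."""
    return now.weekday() in weekdays and (now.hour, now.minute) == (hour, minute)


def tick(now, jobs, last_run):
    """Called about every 60 seconds; fires each due job at most once per minute."""
    stamp = now.replace(second=0, microsecond=0)
    fired = []
    for name, (hour, minute) in jobs.items():
        if is_due(now, hour, minute) and last_run.get(name) != stamp:
            last_run[name] = stamp
            fired.append(name)
    return fired


jobs = {"urgent-issues": (8, 30)}  # hypothetical job: weekdays at 8:30
last_run = {}

monday = tick(datetime(2024, 1, 1, 8, 30, 5), jobs, last_run)       # 2024-01-01 is a Monday
same_minute = tick(datetime(2024, 1, 1, 8, 30, 50), jobs, last_run)
saturday = tick(datetime(2024, 1, 6, 8, 30, 0), jobs, last_run)
print(monday, same_minute, saturday)
```

The `last_run` guard matters because a 60-second tick can drift and land twice inside the same wall-clock minute.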


13. Security – Required Reading​

Prompt injection

Incoming messages can manipulate the agent. Layers of defense:

  • Container backend (docker, modal) instead of local
  • Toolset restrictions per channel: no terminal/browser/code_execution tools for unknown senders
  • Sender allowlists in the gateway
  • MCP servers are also a source of code → treat them with the same caution
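
The toolset restriction and sender allowlist from the list above can be combined into one check. The following is a sketch of the policy only; the function name and the sender ID format are hypothetical, and Hermes' actual gateway configuration may express this differently.

```python
# Toolsets the guide says unknown senders should not get.
RISKY_TOOLSETS = {"terminal", "browser", "code_execution"}


def allowed_toolsets(sender, enabled, allowlist):
    """Known senders keep the channel's toolsets; unknown senders lose the risky ones."""
    if sender in allowlist:
        return set(enabled)
    return set(enabled) - RISKY_TOOLSETS


enabled = {"web", "terminal", "memory", "code_execution"}
allowlist = {"telegram:12345"}  # hypothetical sender ID format

trusted = allowed_toolsets("telegram:12345", enabled, allowlist)
unknown = allowed_toolsets("telegram:99999", enabled, allowlist)
print(sorted(trusted))
print(sorted(unknown))
```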

Audit & diagnostics​

hermes doctor                 # health checks
hermes gateway status # gateway + channels
hermes logs # central logs

14. Cheatsheet​

# Lifecycle
hermes # interactive session
hermes setup # full wizard
hermes doctor # diagnostics

# LLM
hermes model # switch provider/model

# Tools
hermes tools # toggle toolsets

# Gateway
hermes gateway setup
hermes gateway status
hermes gateway start|stop|restart

# Skills
hermes skills list
hermes skills edit <name>

15. Common Issues​

| Problem | Solution |
|---|---|
| Installer fails | Check git --version; on Windows, use WSL2 instead of a native shell |
| hermes: command not found | Reload your shell (source ~/.bashrc), check PATH |
| Model doesn't respond | Check the API key in .env, check credits, run hermes doctor |
| Memory updates don't take effect | Expected behavior – they only become active in the next session |
| Voice doesn't work | Is ffmpeg installed? Is the channel voice-capable? |
| Skills blow up the token budget | Use toolset filters, clean out external_dirs |

16. Further Reading​

Quote

"The self-improving AI agent – it creates skills from experience, improves them during use."