Hermes Agent: The Complete Guide
Hermes Agent is an open-source AI agent from Nous Research that improves itself through use: it authors its own skills from experience, curates memory across sessions, and runs on virtually any infrastructure, from a cheap VPS to cloud clusters. Through a gateway, it connects 15+ messaging platforms (Telegram, Discord, Slack, WhatsApp, Signal, iMessage, Matrix, and others) into a single personal, persistent assistant.
Repo: NousResearch/hermes-agent · License: MIT
1. What is Hermes?
Hermes is not a coding copilot and not just another chatbot shell. It is an autonomous agent with:
- Closed-loop learning: curated memory plus periodic nudges
- Skill self-authoring: the agent writes its own skills after complex tasks
- Cross-session recall: SQLite full-text search plus Gemini Flash summarization
- User modeling via Honcho integration
- 6 terminal backends: local, Docker, SSH, Daytona, Singularity, Modal
- Subagent spawning for parallel workstreams
- Cron scheduler for recurring tasks (60-second tick)
Hermes vs. Claude Code vs. OpenClaw
| Aspect | Claude Code | OpenClaw | Hermes |
|---|---|---|---|
| Primary purpose | Coding in terminal/IDE | Messaging gateway | General-purpose agent with learning loop |
| Model provider | Anthropic | flexible | OpenRouter, Anthropic, OpenAI, Ollama, Copilot, Nous Portal, DeepSeek, Gemini, and others |
| Self-improvement | - | - | Yes (skill_manage, memory curation) |
| Terminal backends | 1 (local) | 1 + sandbox | 6 |
| Channels | - | 10+ | 18+ |
| Voice mode | - | - | Yes (TTS + voice-capable channels) |
2. Prerequisites
- Git (the installer handles everything else)
- Linux, macOS, WSL2, or Android/Termux
- API key for an LLM provider (or a local Ollama model)
There is no native Windows support. On Windows, use WSL2.
The installer automatically fetches:
- uv (Python package manager)
- Python 3.11
- Node.js v22
- ripgrep (fast file search)
- ffmpeg (audio conversion for voice mode)
3. Installation
One-line installer (recommended)
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
The installer auto-detects Linux, macOS, WSL2, and Termux.
From source (for contributors)
git clone https://github.com/NousResearch/hermes-agent.git
cd hermes-agent
# Setup per the contributing guide
First steps after installation
source ~/.bashrc # or ~/.zshrc
hermes # start an interactive session
Setup wizard
hermes setup # complete wizard
# or individual sections:
hermes model # configure LLM provider
hermes tools # enable/disable tools
hermes gateway setup # set up messaging platforms
hermes doctor # diagnostics
4. Configuring the LLM Provider
Configuration precedence (highest first)
- CLI arguments (per invocation)
- config.yaml (primary settings, no secrets)
- .env (API keys & secrets)
- Built-in defaults
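The precedence order above can be sketched as a layered merge. This is only an illustration, not Hermes' actual loader: the function name `resolve_settings`, the `DEFAULTS` values, and the flat key layout are all assumptions.

```python
from typing import Any, Mapping

# Hypothetical built-in defaults; the real defaults are not listed in this guide.
DEFAULTS: dict[str, Any] = {"provider": "openrouter", "model": None}


def resolve_settings(
    cli_args: Mapping[str, Any],
    config_yaml: Mapping[str, Any],
    dotenv: Mapping[str, Any],
    defaults: Mapping[str, Any] = DEFAULTS,
) -> dict[str, Any]:
    """Merge the four layers so that higher-precedence layers win."""
    merged: dict[str, Any] = {}
    # Apply the lowest precedence first; later layers overwrite earlier ones.
    for layer in (defaults, dotenv, config_yaml, cli_args):
        # Skip unset (None) values so they do not mask a lower layer.
        merged.update({k: v for k, v in layer.items() if v is not None})
    return merged


settings = resolve_settings(
    cli_args={"model": "claude-opus-4-7"},
    config_yaml={"provider": "anthropic", "model": "some-other-model"},
    dotenv={},
)
print(settings)
```

Here the CLI value for `model` overrides the one from config.yaml, while `provider` falls through to config.yaml because no CLI argument sets it.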
Supported providers
- OpenRouter: easiest entry point, many models
- Anthropic: Claude Opus/Sonnet/Haiku
- OpenAI: GPT-4o, o-series
- Copilot: via ChatGPT OAuth
- Custom endpoints: Ollama, vLLM, LM Studio, your own OpenAI-compatible servers
- Nous Portal: Hermes in-house models
- DeepSeek, Gemini, and more via the provider registry
Universal schema
# ~/.hermes/config.yaml
llm:
  provider: anthropic
  model: claude-opus-4-7
  # base_url: https://meine-ollama:11434/v1 # optional, overrides provider
# ~/.hermes/.env
ANTHROPIC_API_KEY=sk-ant-...
OPENROUTER_API_KEY=sk-or-...
${VAR_NAME} works inside config.yaml, which lets you inject secrets cleanly from .env without hardcoding them.
Note: base_url disables the provider. Once base_url is set, Hermes ignores the provider field and calls the endpoint directly. This is handy for self-hosting, but you have to handle authentication yourself.
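The ${VAR_NAME} substitution described above could work roughly like the following sketch. The function `expand_vars` and the choice to fail on undefined variables are assumptions for illustration; the guide does not say how Hermes handles a missing variable.

```python
import re
from typing import Mapping

# Matches ${NAME} where NAME is a conventional environment variable name.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_vars(text: str, env: Mapping[str, str]) -> str:
    """Replace each ${NAME} in text with env[NAME]; raise if NAME is missing."""

    def _sub(match: re.Match) -> str:
        name = match.group(1)
        if name not in env:
            raise KeyError(f"undefined variable in config: {name}")
        return env[name]

    return _VAR.sub(_sub, text)


env = {"GITHUB_TOKEN": "ghp_example"}  # stand-in for values loaded from .env
print(expand_vars("GITHUB_TOKEN: ${GITHUB_TOKEN}", env))
```

Failing loudly on an undefined name is one possible design choice; it surfaces a missing secret at load time instead of sending an empty credential to a provider.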
5. Terminal Backends
Hermes can execute tool calls in 6 different environments.
| Backend | Used for | Isolation | Cost |
|---|---|---|---|
| local | Fastest iteration, dev | none | - |
| docker | Secure local sandboxing | container | - |
| ssh | Remote box, homelab | host OS | - |
| daytona | Dev sandboxes in the cloud | container | low |
| singularity | HPC / research clusters | container | depends on cluster |
| modal | Serverless: "costs nearly nothing when idle" | container | per second |
Container security defaults
- Read-only root filesystem (Docker)
- All Linux capabilities dropped
- No privilege escalation
- PID limit (256 processes)
- Full namespace isolation
local executes directly on your host. For risky workflows (web browsing, third-party scripts), always use docker or modal.
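For orientation, the container defaults listed above map onto standard `docker run` flags. The sketch below only builds the argument list; how Hermes actually invokes Docker is not shown in this guide, so the helper `docker_run_args` is hypothetical.

```python
import shlex


def docker_run_args(image: str, command: list[str], pids_limit: int = 256) -> list[str]:
    """Build a `docker run` argv that applies the hardening defaults listed above."""
    return [
        "docker", "run", "--rm",
        "--read-only",                       # read-only root filesystem
        "--cap-drop=ALL",                    # drop all Linux capabilities
        "--security-opt=no-new-privileges",  # block privilege escalation
        f"--pids-limit={pids_limit}",        # cap the number of processes
        image,
        *command,
    ]


args = docker_run_args("python:3.11-slim", ["python", "-c", "print('hi')"])
print(shlex.join(args))
```

Namespace isolation needs no extra flag here, because Docker gives each container its own namespaces unless host namespaces are requested explicitly.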
6. Messaging Gateway
hermes gateway setup
The wizard walks you through the configuration. Services that are already set up are marked.
Platform overview
| Platform | Voice | Images/Files | Threads | Reactions |
|---|---|---|---|---|
| Telegram | β | β | β | β |
| Discord | β | β | β | β |
| Slack | β | β | β | β |
| β | β | β | β | |
| Signal | β | β | β | β |
| Matrix | β | β | β | β |
| Mattermost | β | β | β | β |
| iMessage (BlueBubbles) | β | β | β | β |
| β | β | β | β | |
| SMS | β | β | β | β |
| Feishu/Lark, WeCom, Weixin, QQ, Yuanbao, DingTalk | partial | β | partial | partial |
| Home Assistant, Webhooks | β | β | β | β |
Architecture
A background process connects to all configured platforms, manages per-chat sessions, runs the cron scheduler (60s tick), and delivers voice messages.
- Linux: systemd user or system service
- macOS: launchd agent (including PATH/ENV configuration)
7. Memory System
- MEMORY.md: ~2,200 characters, environment notes & lessons learned
- USER.md: ~1,375 characters, user preferences & style
- config.yaml
- .env
How memory works
- Snapshot at session start: MEMORY.md and USER.md are frozen into the system prompt and are not updated during the session. Reason: the LLM's prefix cache stays warm, which is faster and cheaper.
- Updates are written continuously by the agent but only become visible in the next session.
- Cross-session recall runs through the session_search tool:
  - Full-text search in the SQLite sessions DB
  - Hits are summarized via Gemini Flash
  - This lets the agent surface content from conversations weeks old without keeping it actively in context
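The full-text search step can be illustrated with SQLite's FTS5 extension, which ships with most Python builds. This is a sketch under assumptions: the table name `messages`, its columns, and the function signature are invented here, and the summarization step is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; the real sessions DB layout is not described in this guide.
conn.execute("CREATE VIRTUAL TABLE messages USING fts5(session_id, content)")
conn.executemany(
    "INSERT INTO messages (session_id, content) VALUES (?, ?)",
    [
        ("s1", "Deployed the staging cluster with helm upgrade"),
        ("s2", "Discussed pricing for competitor A"),
        ("s3", "Helm rollback after a failed health check"),
    ],
)


def session_search(query: str, limit: int = 5) -> list[tuple[str, str]]:
    """Return (session_id, content) rows matching the query, best match first."""
    return conn.execute(
        "SELECT session_id, content FROM messages WHERE messages MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()


hits = session_search("helm")
for session_id, content in hits:
    print(session_id, content)
```

In the flow described above, the matching rows would then be passed to a summarization model, so only a short digest enters the active context.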
MEMORY.md = declarative (what do I know? facts, preferences). skills/ = procedural (how do I do X? repeatable workflows).
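The snapshot behavior from the list above (memory frozen at session start, updates visible only in the next session) can be modeled in a few lines. `MemoryStore` and `Session` are hypothetical stand-ins, not Hermes classes.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Stand-in for MEMORY.md on disk."""
    text: str = ""

    def append(self, note: str) -> None:
        self.text += note + "\n"


@dataclass
class Session:
    store: MemoryStore
    system_prompt: str = field(init=False)

    def __post_init__(self) -> None:
        # Copy taken once at session start; later writes to the store do not change it.
        self.system_prompt = "## Memory\n" + self.store.text


store = MemoryStore("User prefers concise answers.\n")
first = Session(store)
store.append("Staging deploys need a clean working tree.")  # written mid-session
second = Session(store)  # the next session picks up the update

print("clean working tree" in first.system_prompt)
print("clean working tree" in second.system_prompt)
```

Because the prompt of the first session never changes, a provider-side prefix cache keyed on that prompt stays valid for the whole session.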
8. Skills System
Skills are the agent's procedural memory: versionable Markdown documents with YAML frontmatter.
SKILL.md format
---
name: deploy-staging
description: Deploys the current repo to staging via Helm
version: 1.2.0
platforms: [linux, darwin]
---
## When to Use
When the user asks about "deploy", "staging", or "release candidate"
and a `helm/` folder exists in the repo.
## Procedure
1. `git status`: must be clean
2. Build the image: `docker build -t app:$(git rev-parse --short HEAD) .`
3. `helm upgrade --install app helm/ -f helm/values.staging.yaml`
4. Health check: `curl https://staging.example.com/health`
## Pitfalls
- Dirty working tree → abort
- Failed health check → `helm rollback`
## Verification
- HTTP 200 from the health endpoint
- Pods in `Running` status
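To make the file layout concrete, here is a minimal sketch that splits a SKILL.md document into frontmatter fields and body. It assumes flat `key: value` frontmatter and uses only the standard library; the function `parse_skill` is hypothetical, and a real YAML parser would handle lists and nesting properly.

```python
def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split a SKILL.md document into (frontmatter fields, Markdown body).

    Handles only flat `key: value` lines; values are kept as raw strings.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---'")
    try:
        end = lines.index("---", 1)  # position of the closing delimiter
    except ValueError:
        raise ValueError("missing closing '---'") from None
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


sample = """---
name: deploy-staging
description: Deploys the current repo to staging via Helm
version: 1.2.0
---
## When to Use
Body text.
"""
meta, body = parse_skill(sample)
print(meta["name"], meta["version"])
```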
Progressive disclosure
To keep skills from blowing up the token budget, they are loaded in three stages:
- List metadata: just name + description (~3,000 tokens for the entire library)
- Full content: on the agent's request
- Reference files: deeper sub-files if referenced
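The first two stages above can be sketched as follows. The `Skill` class, the in-memory `LIBRARY`, and the `triage-issues` entry are hypothetical; real skills live as files on disk, and the third stage (reference files) is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    description: str
    body: str  # full Markdown content, only handed out on request


# Hypothetical in-memory library used for illustration.
LIBRARY = {
    s.name: s
    for s in [
        Skill("deploy-staging", "Deploys the current repo to staging via Helm",
              "## Procedure\n1. Check `git status`"),
        Skill("triage-issues", "Summarizes GitHub issues labeled urgent",
              "## Procedure\n1. List open issues"),
    ]
}


def list_metadata(library: dict[str, Skill]) -> str:
    """Stage 1: one line per skill, name and description only."""
    return "\n".join(f"- {s.name}: {s.description}" for s in library.values())


def load_skill(library: dict[str, Skill], name: str) -> str:
    """Stage 2: return the full content of a single skill on request."""
    return library[name].body


print(list_metadata(LIBRARY))
```

Only the short listing is placed in context up front, so its cost grows with the number of skills rather than with their length.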
Where do skills live?
- ~/.hermes/skills/: source of truth, mirrored from the repo on first install
- external_dirs in config.yaml: include external directories
- agentskills.io: open standard, compatible with other agent platforms
Self-improvement: skill_manage
The agent creates skills itself after:
- complex tasks (≥ 5 tool calls)
- successfully resolved errors
- discovered non-trivial workflows
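As a rough illustration, the three triggers above could be combined as shown below. Treating them as alternatives (any one is enough) is an assumption, as are the names `TaskOutcome` and `should_author_skill`; the guide does not specify the actual decision logic.

```python
from dataclasses import dataclass


@dataclass
class TaskOutcome:
    tool_calls: int
    resolved_error: bool = False
    novel_workflow: bool = False


def should_author_skill(outcome: TaskOutcome, min_tool_calls: int = 5) -> bool:
    """Return True if any of the listed triggers applies (assumed OR semantics)."""
    return (
        outcome.tool_calls >= min_tool_calls
        or outcome.resolved_error
        or outcome.novel_workflow
    )


print(should_author_skill(TaskOutcome(tool_calls=7)))
print(should_author_skill(TaskOutcome(tool_calls=2)))
```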
Actions: create, patch, edit, delete.
Skills are written by the agent β review them regularly (hermes skills list, manual diff against a Git backup) before they internalize bad habits.
9. Built-in Tools & Toolsets
Tool categories
| Category | Tools |
|---|---|
| Web | web_search, web_extract |
| Terminal & files | terminal, process, read_file, patch |
| Browser | browser_navigate, browser_snapshot, browser_vision |
| Media | vision_analyze, image_generate, text_to_speech |
| Agent orchestration | todo, clarify, execute_code, delegate_task |
| Memory & recall | memory, session_search |
| Automation & delivery | cronjob, send_message |
| Integrations | Home Assistant, MCP servers, RL training |
Toolsets
Toolsets bundle tools logically and can be toggled on or off per agent / per channel:
web · terminal · file · browser · vision · image_gen · moa · skills
tts · todo · memory · session_search · cronjob · code_execution
delegation · clarify · homeassistant
hermes tools # toggle toolsets
10. MCP: Model Context Protocol
Hermes is an MCP client and can integrate any MCP server, e.g. filesystem, GitHub, Postgres, Brave Search, or your own servers.
# ~/.hermes/config.yaml
mcp:
  servers:
    github:
      command: npx
      args: ["-y", "@modelcontextprotocol/server-github"]
      env:
        GITHUB_TOKEN: ${GITHUB_TOKEN}
This extends the tool pool without touching Hermes' code.
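To show what such a server entry amounts to at runtime, the sketch below turns one entry into a command line and an environment for a child process. The helper `launch_spec` and the `secrets` mapping are assumptions; the guide does not describe how Hermes spawns MCP servers.

```python
import os
from string import Template


def launch_spec(server: dict, secrets: dict[str, str]) -> tuple[list[str], dict[str, str]]:
    """Turn one `mcp.servers` entry into (argv, environment) for a subprocess."""
    argv = [server["command"], *server.get("args", [])]
    env = dict(os.environ)  # the child inherits the parent environment
    for key, value in server.get("env", {}).items():
        # Template.substitute raises KeyError when a referenced secret is missing.
        env[key] = Template(value).substitute(secrets)
    return argv, env


github = {
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-github"],
    "env": {"GITHUB_TOKEN": "${GITHUB_TOKEN}"},
}
argv, env = launch_spec(github, {"GITHUB_TOKEN": "ghp_example"})
print(argv)
```

The resulting pair could be handed to `subprocess.Popen(argv, env=env)`, after which the client would talk to the server over its standard input and output.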
11. Voice Mode
On voice-capable channels (Telegram, Discord, Slack, Mattermost, Matrix, Feishu/Lark, WeCom, Weixin, QQ, Yuanbao):
- Incoming voice messages → transcription
- Outgoing replies → text_to_speech → audio reply
ffmpeg is bundled by the installer.
12. Subagents & Cron
Delegation to subagents
"Research in parallel: competitors A, B, and C. For each, cover pricing,
tech stack, and the last 3 press releases."
The agent spawns parallel subagents via delegate_task, each with its own session, and consolidates their results.
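The fan-out and consolidation pattern described above resembles the following sketch. `run_subagent` is a hypothetical stand-in for one delegated session; the real delegate_task tool and its interface are not documented in this guide.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subagent(task: str) -> str:
    """Hypothetical stand-in for one delegated subagent session."""
    return f"findings for {task}"


def delegate_parallel(tasks: list[str], max_workers: int = 3) -> dict[str, str]:
    """Run the tasks concurrently and collect the results keyed by task."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with tasks.
        results = list(pool.map(run_subagent, tasks))
    return dict(zip(tasks, results))


report = delegate_parallel(["competitor A", "competitor B", "competitor C"])
for task, findings in report.items():
    print(task, "->", findings)
```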
Cron scheduler
"Every weekday at 8:30, check the GitHub issues labeled `urgent`
and send me a summary on Telegram."
The agent registers a cronjob via the cronjob tool. The gateway scheduler ticks every 60s.
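A request like the one above corresponds to the cron expression `30 8 * * 1-5`. The sketch below shows how a scheduler that wakes once per minute could decide whether a job is due. It is a simplified matcher written for illustration, not Hermes' scheduler: it supports only `*`, single numbers, ranges, and comma lists, and it requires all five fields to match.

```python
from datetime import datetime


def field_matches(spec: str, value: int) -> bool:
    """Match one cron field supporting '*', 'a', 'a-b', and comma lists."""
    if spec == "*":
        return True
    for part in spec.split(","):
        low, _, high = part.partition("-")
        if int(low) <= value <= int(high or low):
            return True
    return False


def is_due(expr: str, now: datetime) -> bool:
    """True if the 5-field cron expression matches the minute of `now`."""
    minute, hour, dom, month, dow = expr.split()
    # cron counts Sunday as 0, while datetime.weekday() counts Monday as 0.
    cron_dow = (now.weekday() + 1) % 7
    return (
        field_matches(minute, now.minute)
        and field_matches(hour, now.hour)
        and field_matches(dom, now.day)
        and field_matches(month, now.month)
        and field_matches(dow, cron_dow)
    )


WEEKDAYS_0830 = "30 8 * * 1-5"
print(is_due(WEEKDAYS_0830, datetime(2024, 1, 1, 8, 30)))  # a Monday
print(is_due(WEEKDAYS_0830, datetime(2024, 1, 6, 8, 30)))  # a Saturday
```

With a 60-second tick, calling `is_due` once per tick evaluates each minute once, provided the ticks do not drift across a minute boundary.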
13. Security: Required Reading
Incoming messages can manipulate the agent. Layers of defense:
- Container backend (docker, modal) instead of local
- Toolset restrictions per channel: no terminal/browser/code_execution tools for unknown senders
- Sender allowlists in the gateway
- MCP servers are also a source of code; treat them with the same caution
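The combination of sender allowlists and per-channel toolset restrictions can be pictured as a simple filter. The names `RISKY_TOOLSETS` and `toolsets_for` are hypothetical, and the actual gateway configuration keys are not shown in this guide.

```python
# Toolsets that the list above withholds from unknown senders.
RISKY_TOOLSETS = {"terminal", "browser", "code_execution"}


def toolsets_for(sender: str, allowlist: set[str], enabled: set[str]) -> set[str]:
    """Return the toolsets a sender may use; unknown senders lose the risky ones."""
    if sender in allowlist:
        return set(enabled)
    return enabled - RISKY_TOOLSETS


enabled = {"web", "terminal", "memory", "code_execution"}
print(sorted(toolsets_for("alice", {"alice"}, enabled)))
print(sorted(toolsets_for("stranger", {"alice"}, enabled)))
```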
Audit & diagnostics
hermes doctor # health checks
hermes gateway status # gateway + channels
hermes logs # central logs
14. Cheatsheet
# Lifecycle
hermes # interactive session
hermes setup # complete wizard
hermes doctor # diagnostics
# LLM
hermes model # switch provider/model
# Tools
hermes tools # toggle toolsets
# Gateway
hermes gateway setup
hermes gateway status
hermes gateway start|stop|restart
# Skills
hermes skills list
hermes skills edit <name>
15. Common Issues
| Problem | Solution |
|---|---|
| Installer fails | Check git --version, use WSL2 instead of native Windows if needed |
| hermes: command not found | Reload your shell (source ~/.bashrc), check PATH |
| Model doesn't respond | API key in .env, check credits, hermes doctor |
| Memory updates don't take effect | Expected behavior: they only become active in the next session |
| Voice doesn't work | Is ffmpeg installed? Is the channel voice-capable? |
| Skills blow up the token budget | Toolset filters, clean out external_dirs |
16. Further Reading
- Docs index: hermes-agent.nousresearch.com/docs
- Installation: /docs/getting-started/installation
- Quickstart: /docs/getting-started/quickstart
- Configuration: /docs/user-guide/configuration
- Memory: /docs/user-guide/features/memory
- Skills: /docs/user-guide/features/skills
- Tools: /docs/user-guide/features/tools
- MCP: /docs/user-guide/features/mcp
- Architecture: /docs/developer-guide/architecture
- Skills hub: agentskills.io
- GitHub: NousResearch/hermes-agent
"The self-improving AI agent: it creates skills from experience, improves them during use."