Carnice-9b
This model would not have been possible without the contributions of Teknium, (Nous Research), Zachary Mueller, (Lambda).
Carnice-9b is a standalone merged model tuned specifically for the Hermes Agent harness.
It is built on top of Qwen/Qwen3.5-9B, but the training target here was not generic chat quality or leaderboard chasing. The goal was to improve behavior inside Hermes Agent itself: tool calling, terminal use, browser use, multi-step execution, and the exact message patterns the Hermes harness expects.
This repo is the direct-load merged checkpoint form of kai-os/qwen35-hermes-stage2-adapter-v1. It loads as its own model without a separate PEFT adapter step.
Important detail: this is a merged standalone checkpoint, not a separate full-parameter training run from scratch.
Training Approach
Carnice-9b was trained in two stages.
- Stage A was a reasoning repair pass on carefully selected high-signal reasoning data.
- Stage B was a Hermes-specific refresh pass built around harness-native traces and Hermes-style action structure.
The second stage is the important part for this release. Instead of teaching a generic external tool schema, it was trained on data shaped for the Hermes Agent environment itself.
Hermes-Agent Focus
Carnice-9b is intended for Hermes Agent first.
It was tuned around workflows such as:
- terminal-heavy task execution
- file editing and structured tool use
- browser and web-assisted agent behavior
- multi-turn tool calling inside the Hermes runtime
- Hermes-native conversation and tool-call formatting
A major design constraint during training was to avoid teaching the model foreign agent habits that would make it awkward inside the Hermes harness.
Data
The Hermes-specialized stage draws primarily from:
- kai-os/carnice-glm5-hermes-traces
open-thoughts/OpenThoughts-Agent-v1-SFT
The earlier repair stage uses a smaller reasoning mix centered on:
bespokelabs/Bespoke-Stratos-17kAI-MO/NuminaMath-CoT
The release intentionally centers harness-native behavior over broad generic benchmark optimization.
Evaluation
This model is being evaluated primarily inside Hermes Agent rather than through generic standalone chat benchmarks.
The main evaluation focus is official Hermes-compatible benchmark paths and harness-native runs. Partial one-shot numbers exist, but this card intentionally does not center them. For this release, the important point is what the model was optimized for: Hermes Agent execution quality, not shallow benchmark cosmetics.
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_id = "kai-os/carnice-v1-9b-hermes-agent-stage2-merged"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.bfloat16,
device_map="auto",
)
Notes
- This release is specifically intended for Hermes Agent style use.
- The model card keeps benchmark discussion intentionally lightweight until the stronger harness-native eval pass is finished.
- Supplementary diagnostics from the training progression are still available in the repo files, but they are not the main story of the release.
- Downloads last month
- 198
Model tree for kai-os/Carnice-9b
Base model
Qwen/Qwen3.5-9B-Base