AGILLM-4
AGILLM-4 is the next training target after AGILLM-3. The current code is a production-oriented starting point: it is copied from the proven single-file trainer and extended with:
- a ~1.5B-parameter main preset (`agillm4_main`) with a 100-tokens-per-parameter target ratio
- longer block-size work on 24 GB, B200, and B300-class GPUs
- AR+SAT on every step, with sequential backward to reduce peak VRAM (see the sketch after this list)
- SDPA and experimental sublinear local+landmark attention backends
- exact M-fold expansion attention harvested from n1.py, with a local verifier
- fused QKV projection harvested from n1.py, with legacy checkpoint loading (sketched after this list)
- profiling tools for memory, throughput, AR cost, SAT cost, and optimizer cost
- synthetic long-context curriculum generation for recall and multi-hop tests (an illustrative recall generator follows this list)
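One common way to reduce peak VRAM with two objectives is to run each objective's forward and backward separately, so at most one autograd graph is alive at a time; gradients simply accumulate in `.grad` across the two backward calls. A minimal sketch of that pattern, not the trainer's actual code (`ar_loss_fn` and `sat_loss_fn` are hypothetical callables):

```python
def train_step(model, optimizer, batch, ar_loss_fn, sat_loss_fn):
    """Sequential backward over two objectives (hypothetical helper names)."""
    optimizer.zero_grad(set_to_none=True)

    # Objective 1: forward + backward; its autograd graph is freed right after.
    loss_ar = ar_loss_fn(model, batch)
    loss_ar.backward()

    # Objective 2: a fresh forward + backward; gradients add into .grad,
    # so peak memory only ever holds one graph's activations.
    loss_sat = sat_loss_fn(model, batch)
    loss_sat.backward()

    optimizer.step()
    return loss_ar.detach(), loss_sat.detach()
```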
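The fused QKV idea, independent of the n1.py specifics: one `nn.Linear` producing `3 * d_model` outputs replaces three separate projections, and legacy checkpoints with split Q/K/V weights can be packed by concatenating along the output dimension. A generic sketch (the actual n1.py layout and checkpoint keys may differ):

```python
import torch
import torch.nn as nn

class FusedQKV(nn.Module):
    """Generic fused QKV projection sketch (not the n1.py implementation)."""

    def __init__(self, d_model: int, n_heads: int, bias: bool = False):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.head_dim = d_model // n_heads
        # One matmul produces Q, K and V instead of three separate projections.
        self.qkv = nn.Linear(d_model, 3 * d_model, bias=bias)

    def forward(self, x: torch.Tensor):
        B, T, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape each to (B, n_heads, T, head_dim), the layout SDPA expects.
        return tuple(
            t.view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
            for t in (q, k, v)
        )

    @torch.no_grad()
    def load_legacy(self, q_weight, k_weight, v_weight):
        # Hypothetical helper: pack separate legacy Q/K/V weights into the
        # fused layer by concatenating along the output dimension.
        self.qkv.weight.copy_(torch.cat([q_weight, k_weight, v_weight], dim=0))
```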
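For the synthetic long-context curriculum, a minimal recall (needle-in-a-haystack) generator along these lines illustrates the idea; the repo's real format, tokenization, and multi-hop variants are not shown here:

```python
import random
import string

def make_recall_example(context_words=4096, rng=None):
    """Bury a random key/value fact in filler text and ask for it back."""
    rng = rng or random.Random()
    key = "".join(rng.choices(string.ascii_lowercase, k=8))
    value = "".join(rng.choices(string.digits, k=6))
    needle = f"The passcode for {key} is {value}."
    filler_vocab = ["alpha", "beta", "gamma", "delta", "epsilon", "zeta"]
    words = rng.choices(filler_vocab, k=context_words)
    # Insert the needle at a random position so recall depth varies per sample.
    words.insert(rng.randrange(len(words) + 1), needle)
    prompt = " ".join(words) + f"\nQuestion: what is the passcode for {key}?\nAnswer:"
    return prompt, f" {value}"
```

Multi-hop examples would chain several such facts (key A points to key B, key B holds the value); only the single-hop recall case is shown.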
Start with AGILLM-4.md for the training plan and command recipes. The current sublinear backend is intentionally experimental: profile it against SDPA before using it for a real run.
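As a baseline for that comparison, SDPA can be timed directly at the target block size with `torch.nn.functional.scaled_dot_product_attention`; the flags that select the sublinear backend are repo-specific and not shown here:

```python
import time
import torch
import torch.nn.functional as F

def profile_sdpa(batch=1, heads=16, seq_len=8192, head_dim=64, iters=10,
                 dtype=torch.bfloat16):
    """Time causal SDPA and report peak CUDA memory at a given block size."""
    q, k, v = (torch.randn(batch, heads, seq_len, head_dim,
                           device="cuda", dtype=dtype) for _ in range(3))
    torch.cuda.reset_peak_memory_stats()
    F.scaled_dot_product_attention(q, k, v, is_causal=True)  # warmup
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(iters):
        F.scaled_dot_product_attention(q, k, v, is_causal=True)
    torch.cuda.synchronize()
    ms = (time.perf_counter() - t0) / iters * 1e3
    peak_gib = torch.cuda.max_memory_allocated() / 2**30
    print(f"seq_len={seq_len}: {ms:.2f} ms/iter, peak {peak_gib:.2f} GiB")

profile_sdpa(seq_len=8192)
profile_sdpa(seq_len=16384)
```

Running the same loop with the experimental backend enabled and comparing ms/iter and peak memory gives a quick go/no-go signal before committing to a real run.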
Current harvest status from n1.py is tracked in N1_HARVEST.md.