Only a year into open source, MiniMax is already making a great impact, not only through solid models and products but also through how well the team uses community platforms like Hugging Face: HF Teams, blogs, Daily Papers, Spaces as project pages, and constant experiments with new ways to engage. Super impressive!
Following up on LLaDA 2.0: the paper is now out on Daily Papers 🔥 It has sparked a lot of discussion in the community by showing how discrete diffusion LLMs can scale to 100B and run faster than traditional AR models. LLaDA2.0: Scaling Up Diffusion Language Models to 100B (2512.15745)
✨ Built from real enterprise data (Enron + financial institutions), not synthetic tasks
✨ Tests end-to-end finance workflows
✨ Multimodal & cross-file reasoning
✨ Expert annotated (700+ hours) and genuinely hard
✨ Any-to-Any & World-Model: one step forward to the real world
- BAAI Emu 3.5
- Ant Group Ming-flash-omni
- HunyuanWorld-Mirror: 3D
Aligning with the “world model” globally
✨ Audio & Speech + Video & Visual: released from entertainment labs to delivery platforms
- SoulX-Podcast TTS
- LongCat-Audio-Codec & LongCat-Video by Meituan delivery platform
- xiabs DreamOmni 2
✨ 48B total / 3B active - MIT license
✨ Up to 1M context
✨ 84.3 on RULER (128k) with 3.98× speedup
✨ Hybrid KDA + MLA architecture for peak throughput & quality
✨ Compresses long sequences visually to bypass token limits
✨ Reduces computational and memory costs
✨ Preserves meaning through multimodal encoding
✨ Built on GLM-4.1V-9B-Base
✨ Any prior in → 3D world out
✨ Mix camera, intrinsics, depth as priors
✨ Predict point clouds, normals, Gaussians & more in one pass
✨ Unified architecture for all 3D tasks
✨ Trained on Honey-Data-15M, a 15M-sample SFT corpus with dual-level CoT reasoning
✨ Backed by HoneyPipe, a transparent & reproducible open data curation suite
✨ 1T total / 50B active params per token
✨ 20T+ reasoning-dense tokens (Evo-CoT)
✨ 128K context via YaRN
✨ FP8 training: 15%+ faster, same precision as BF16
✨ Hybrid Syntax-Function-Aesthetics reward for front-end & visual generation
- At Alibaba Cloud Summit, Qwen dropped at least 7 new series of models (some are not open sourced).
- Since June, Tencent has doubled down on open source, especially after Hunyuan gained traction.
✨ Some of the community’s hottest models come from startups.
- Kimi K2-0905
- GLM v4.6
- OpenBMB MiniCPM 4.1
✨ New players are pushing hard!
- Baidu ERNIE & Qianfan: enterprise-ready focus
- Ant Group: MoE + low-activation; from small to trillion, from core to reasoning fast track
- Xiaomi MiMo: stands out with Any-to-Any audio models
✨ Robotics is joining the open-source wave
- Unitree released its first open-source model
- BAAI launched RoboBrain-X0, an open-source robotics model + dataset
👀 Each month brings cooler models. After the 8-day National Holiday, expect another wave before the end of the year.