On Vacation 🏝️

Adina Yakefu

AdinaY

Recent Activity

upvoted an article about 1 hour ago

M2.1: Multilingual and Multi-Task Coding with Strong Generalization

posted an update 29 minutes ago

MiniMax M2.1 blog is out🔥
https://huggingface.co/blog/MiniMaxAI/multilingual-and-multi-task-coding-with-strong-gen

Only a year into open source, MiniMax is already making a great impact. Not only through solid models/products, but also by how well the team uses community platforms like Hugging Face.

HF Teams, blogs, Daily Papers, Spaces as project pages, and always experimenting with new ways to engage. Super impressive!

posted an update about 16 hours ago

2025.1 - DeepSeek entered the scene, backed by High-Flyer Quant
2026.1 - IQuest enters the game, backed by Ubiquant 📈 and launching IQuest-Coder on Hugging Face
https://huggingface.co/collections/IQuestLab/iquest-coder

✨ 40B models: Instruct / Thinking / Loop
✨ Loop = MoE-level performance with only ~5% extra training cost
✨ Native 128K context

posted an update 17 days ago

Following up on LLaDA 2.0, the paper is now out on Daily Papers 🔥
It has sparked a lot of discussion in the community for showing how discrete diffusion LLMs can scale to 100B and run faster than traditional AR models.
LLaDA2.0: Scaling Up Diffusion Language Models to 100B (2512.15745)

posted an update 20 days ago

Finch 💰 an enterprise-grade benchmark that measures whether AI agents can truly handle real-world finance & accounting work.

FinWorkBench/Finch

✨ Built from real enterprise data (Enron + financial institutions), not synthetic tasks
✨ Tests end-to-end finance workflows
✨ Multimodal & cross-file reasoning
✨ Expert-annotated (700+ hours) and genuinely hard

posted an update about 2 months ago

Kimi K2 Thinking is now live on the hub 🔥

moonshotai/Kimi-K2-Thinking

✨ 1T MoE for deep reasoning & tool use
✨ Native INT4 quantization = 2× faster inference
✨ 256K context window
✨ Modified MIT license

posted an update 2 months ago

Chinese open-source AI in October wasn’t about bigger models; it was about real-world impact 🔥

https://huggingface.co/collections/zh-ai-community/october-2025-china-open-source-highlights

✨ Vision-Language & OCR wave 🌊
- DeepSeek-OCR: 3B
- PaddleOCR-VL: 0.9B
- Qwen3-VL: 2B / 4B / 8B / 32B / 30B-A3B
- Open-Bee: Bee-8B-RL
- Z.ai Glyph: 10B

OCR is industrializing; the real game now is understanding the (long-context) document, not just reading it.

✨ Text generation: scale or innovation?
- MiniMax-M2: 229B
- Ant Group Ling-1T & Ring-1T
- Moonshot Kimi-Linear: linear-attention challenger
- Kwaipilot KAT-Dev

Efficiency is the key.

✨ Any-to-Any & World-Model: one step forward to the real world
- BAAI Emu 3.5
- Ant Group Ming-flash-omni
- HunyuanWorld-Mirror: 3D

Aligning with the “world model” globally

✨ Audio & Speech + Video & Visual: released from entertainment labs to delivery platforms
- SoulX-Podcast TTS
- LongCat-Audio-Codec & LongCat-Video by Meituan, the delivery platform
- xiabs DreamOmni 2

Looking forward to what's next 🚀

posted an update 2 months ago

Kimi Linear 🚀 Hybrid linear attention model from Moonshot AI

https://huggingface.co/collections/moonshotai/kimi-linear-a3b

✨ 48B total / 3B active - MIT license
✨ Up to 1M context
✨ 84.3 on RULER (128k) with 3.98× speedup
✨ Hybrid KDA + MLA architecture for peak throughput & quality

posted an update 2 months ago

Ming-flash-omni Preview 🚀 Multimodal foundation model from Ant Group

inclusionAI/Ming-flash-omni-Preview

✨ Built on Ling-Flash-2.0: 100B total / 6B active
✨ Generative segmentation-as-editing
✨ SOTA contextual & dialect ASR
✨ High-fidelity image generation

posted an update 2 months ago

Glyph 🔥 a framework that scales context length by compressing text into images and processing them with vision–language models, released by Z.ai.

Paper: https://huggingface.co/papers/2510.17800
Model: https://huggingface.co/zai-org/Glyph

✨ Compresses long sequences visually to bypass token limits
✨ Reduces computational and memory costs
✨ Preserves meaning through multimodal encoding
✨ Built on GLM-4.1V-9B-Base

posted an update 2 months ago

HunyuanWorld Mirror 🔥 a versatile feed-forward model for universal 3D world reconstruction by Tencent

tencent/HunyuanWorld-Mirror

✨ Any prior in → 3D world out
✨ Mix camera, intrinsics, depth as priors
✨ Predict point clouds, normals, Gaussians & more in one pass
✨ Unified architecture for all 3D tasks

posted an update 3 months ago

PaddleOCR VL🔥 0.9B Multilingual VLM by Baidu

PaddlePaddle/PaddleOCR-VL

✨ Ultra-efficient NaViT + ERNIE-4.5 architecture
✨ Supports 109 languages 🤯
✨ Accurately recognizes text, tables, formulas & charts
✨ Fast inference and lightweight for deployment

posted an update 3 months ago

Bee-8B 🐝 open 8B multimodal LLM built on high-quality data, released by TencentHunyuan

Paper: Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs (2510.13795)
Model: https://huggingface.co/collections/Open-Bee/bee-8b-68ecbf10417810d90fbd9995

✨ Trained on Honey-Data-15M, a 15M-sample SFT corpus with dual-level CoT reasoning
✨ Backed by HoneyPipe, a transparent & reproducible open data curation suite

posted an update 3 months ago

Reflection ≠ self-correction

Interesting paper on long-chain reasoning 📑 from Miromind AI.
First Try Matters: Revisiting the Role of Reflection in Reasoning Models (2510.08308)

It dives into how LLMs think 🧠 Most reflections confirm rather than fix, and true improvement comes from stronger initial reasoning.

posted an update 3 months ago

Ring-1T 🔥 the trillion-parameter thinking model released by Ant Group, the company behind Alipay

inclusionAI/Ring-1T

✨ 1T params (50B active) - MIT license
✨ 128K context (YaRN)
✨ RLVR, Icepop, and ASystem make trillion-scale RL stable

posted an update 3 months ago

KAT-Dev-72B-Exp 🔥 a new open model for software engineering from Kuaishou (the company behind Kling AI)

Kwaipilot/KAT-Dev-72B-Exp

✨ 72B - Apache 2.0
✨ Redesigned attention kernel & training engine for efficient context-aware RL
✨ 74.6% accuracy on SWE-Bench Verified

posted an update 3 months ago

At the close of the National Holiday 🇨🇳, Ant Group drops a new SoTA model.

Ling-1T 🔥 the trillion-parameter flagship of the Ling 2.0 series.

inclusionAI/Ling-1T

✨ 1T total / 50B active params per token
✨ 20T+ reasoning-dense tokens (Evo-CoT)
✨ 128K context via YaRN
✨ FP8 training: 15%+ faster, same precision as BF16
✨ Hybrid Syntax-Function-Aesthetics reward for front-end & visual generation

posted an update 3 months ago

Qwen never stops🤯
Meet Qwen3-VL-30B-A3B-Thinking & Instruct 🔥

Qwen/Qwen3-VL-30B-A3B-Instruct
Qwen/Qwen3-VL-30B-A3B-Thinking

✨ 256K–1M context for long docs & video
✨ 32-language OCR & stronger visual recognition
✨ “Visual Agent” that operates GUIs

posted an update 3 months ago

New release from Ant Group 🔥

inclusionAI/ming-v2-68ddea4954413c128d706630

✨ MingTok (Vision & Audio): continuous unified tokenizer, no quantization, preserves semantic & perceptual fidelity, enables faster convergence.

✨ Ming-UniVision: MLLM unifying image understanding + generation, supports multi-round editing & visualized CoT.

✨ Ming-UniAudio: unified speech LLM for ASR, TTS & free-form editing, integrates semantic + acoustic features for high-fidelity audio.

posted an update 3 months ago

🔥 September highlights from the Chinese open-source community

zh-ai-community/september-2025-china-open-source-highlights-68b55c9e757c439ad9dd6aba

✨ Massive releases from the two tech giants

- At the Alibaba Cloud Summit, Qwen dropped at least 7 new series of models (some are not open-sourced).
- Since June, Tencent has doubled down on open source, especially after Hunyuan gained traction

✨ Some of the community’s hottest models come from startups.

- Kimi K2-0905
- GLM v4.6
- OpenBMB MiniCPM 4.1

✨ New players are pushing hard!

- Baidu ERNIE & Qianfan: enterprise-ready focus
- Ant Group: MoE + low-activation; from small to trillion, from core to reasoning fast track
- Xiaomi MiMo: stands out with Any-to-Any audio models

✨ Robotics is joining the open-source wave

- Unitree released its first open-source model
- BAAI launched RoboBrain-X0, an open-source robotics model + dataset

👀 Each month brings cooler models. After the 8-day National Holiday, expect another wave before the end of the year.

Stay tuned!

posted an update 3 months ago

GLM-4.6 is here🚀

zai-org/GLM-4.6

✨ 200K context window
✨ Superior coding & polished UI generation
✨ Stronger reasoning & tool use
✨ More capable agents & agent frameworks
