L56-D1920-qwen_mamba2_qwen2-e2-i1920-s256-hd64-gn6-A0-S4096-step1-rand2b

This is a model uploaded from /mnt/nanjingcephfs/project_wx-rec-alg-bdc-exp/bwzheng/yulan/hyw/pretrain-linear-moe-dev/RADLADS-paper/out/L56-D1920-qwen_mamba2_qwen2-e2-i1920-s256-hd64-gn6-A0-S4096--step1-rand2b.
