Block Diffusion for Flash Speculative Decoding
AI & ML interests
Efficient AI
Papers
DFlash: Block Diffusion for Flash Speculative Decoding
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
Paper • 2511.10645 • Published • 8
z-lab/gemma-4-31B-it-PARO
Image-Text-to-Text • 6B • Updated • 13.5k • 8
z-lab/Qwen3.5-9B-PARO
Image-Text-to-Text • 3B • Updated • 27.6k • 44
z-lab/Qwen3.5-4B-PARO
Image-Text-to-Text • 1B • Updated • 15.3k • 15
Models 33
z-lab/Qwen3.6-35B-A3B-DFlash
Text Generation • 0.5B • Updated • 4 • 6
z-lab/Kimi-K2.5-DFlash
Text Generation • 3B • Updated • 311 • 19
z-lab/gpt-oss-20b-DFlash
Text Generation • 0.8B • Updated • 1.16k • 15
z-lab/LLaMA3.1-8B-Instruct-DFlash-UltraChat
Text Generation • 1B • Updated • 941 • 2
z-lab/Qwen3-Coder-30B-A3B-DFlash
Text Generation • 0.5B • Updated • 1.46k • 28
z-lab/Qwen3-Coder-Next-DFlash
Text Generation • 0.5B • Updated • 1.34k • 8
z-lab/Qwen3.5-9B-DFlash
Text Generation • 1B • Updated • 6.79k • 22
z-lab/Qwen3.5-4B-DFlash
Text Generation • 0.5B • Updated • 6.86k • 14
z-lab/Qwen3.5-35B-A3B-DFlash
Text Generation • 0.5B • Updated • 4.48k • 32
z-lab/Qwen3.5-27B-DFlash
Text Generation • 2B • Updated • 13.8k • 82
Datasets 6
z-lab/humaneval-long
Viewer • Updated • 1k • 18
z-lab/gsm8k-filtered
Viewer • Updated • 1.31k • 23
z-lab/mt-bench-filtered
Viewer • Updated • 79 • 27
z-lab/mbpp-sanitized-filtered
Viewer • Updated • 256 • 29
z-lab/humaneval-filtered
Viewer • Updated • 137 • 27
z-lab/qwen3-4b-instruct-100k
Viewer • Updated • 100k • 48