Article Yes, Transformers are Effective for Time Series Forecasting (+ Autoformer) +1 Jun 16, 2023 • 45
Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits Paper • 2512.20578 • Published Dec 23, 2025 • 85
Tower+: Bridging Generality and Translation Specialization in Multilingual LLMs Paper • 2506.17080 • Published Jun 20, 2025 • 7
Masking Teacher and Reinforcing Student for Distilling Vision-Language Models Paper • 2512.22238 • Published Dec 23, 2025 • 30
deepseek-ai/DeepSeek-Coder-V2-Instruct Text Generation • 236B • Updated Aug 21, 2024 • 5.55k • 681
Qwen/Qwen3-Coder-480B-A35B-Instruct Text Generation • 480B • Updated Aug 21, 2025 • 71.2k • 1.31k