Lost in Backpropagation: The LM Head is a Gradient Bottleneck Paper • 2603.10145 • Published Mar 10 • 13
compar:IA: The French Government's LLM arena to collect French-language human prompts and preference data Paper • 2602.06669 • Published Feb 6 • 7
Biomed-Enriched: A Biomedical Dataset Enriched with LLMs for Pretraining and Extracting Rare and Hidden Content Paper • 2506.20331 • Published Jun 25, 2025 • 5
Gaperon: A Peppered English-French Generative Language Model Suite Paper • 2510.25771 • Published Oct 29, 2025 • 16
Zero-Shot Styled Text Image Generation, but Make It Autoregressive Paper • 2503.17074 • Published Mar 21, 2025 • 2
Binarizing Documents by Leveraging both Space and Frequency Paper • 2404.17243 • Published Apr 26, 2024 • 2
Alfie: Democratising RGBA Image Generation With No $$$ Paper • 2408.14826 • Published Aug 27, 2024 • 1
Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas Paper • 2408.15660 • Published Aug 28, 2024 • 1
μgat: Improving Single-Page Document Parsing by Providing Multi-Page Context Paper • 2408.15646 • Published Aug 28, 2024 • 1
Volumetric Fast Fourier Convolution for Detecting Ink on the Carbonized Herculaneum Papyri Paper • 2308.05070 • Published Aug 9, 2023 • 2
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text Paper • 2506.05209 • Published Jun 5, 2025 • 61
Post 7184 SmolVLM is now available on PocketPal: you can run it offline on your smartphone to interpret the world around you. 🌍📱 And check out this real-time camera demo by @ngxson, powered by llama.cpp: https://github.com/ngxson/smolvlm-realtime-webcam https://x.com/pocketpal_ai
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7, 2025 • 207
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning Paper • 2503.15558 • Published Mar 18, 2025 • 50
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4, 2025 • 258
Cosmos World Foundation Model Platform for Physical AI Paper • 2501.03575 • Published Jan 7, 2025 • 82
Post 3307 Made an HF Dataset editor à la Google Sheets here: lhoestq/dataset-spreadsheets. With Dataset Spreadsheets: ✏️ Edit datasets in the UI 🔗 Share link with collaborators 🐍 Use locally in DuckDB or Python. Available for the 100,000+ parquet datasets on HF :)