Nemotron Reinforcement Learning Collection RL training datasets with verifiable and execution-based rewards across math, code, agentic tasks, instruction following, knowledge, and safety. • 29 items • Updated 7 days ago • 1
Nemotron Reward Modeling Collection Human preference data, reward model training sets, and generative reward modeling data for training Nemotron reward models. • 6 items • Updated 7 days ago • 3
Nemotron Math & Reasoning Collection Datasets for building models that excel at math reasoning, proofs, and quantitative problem-solving. Covers SFT, RL, and pretraining data. • 21 items • Updated 7 days ago • 4
Nemotron Code & SWE Collection Datasets for building models that write, debug, and reason about code. Covers competitive programming, software engineering, and code pretraining. • 12 items • Updated 7 days ago • 1
Nemotron Chat & Instruction Following Collection Datasets for building helpful, instruction-following conversational models across single-turn and multi-turn settings. • 11 items • Updated 7 days ago • 2
Nemotron Safety & Content Moderation Collection Datasets for building safe models with refusals, content moderation, PII detection, agentic safety, and audio safety capabilities. • 9 items • Updated 7 days ago • 2
Nemotron Agentic & Tool-Use Collection Datasets for building models capable of function calling, multi-step agentic tasks, terminal use, and SWE workflows. • 9 items • Updated 7 days ago • 4
Nemotron Vision-Language Collection Image-text paired datasets for building vision-language models (VLMs). • 3 items • Updated 3 days ago • 4
Nemotron Supervised Fine-Tuning Collection SFT datasets covering math, code, chat, safety, agentic, VLM, multilingual, and specialized domains. • 38 items • Updated 7 days ago • 5
Vidi Collection Vidi model collection for multimodal video understanding and creation. • 2 items • Updated Jan 22 • 4
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving Paper • 2504.02605 • Published Apr 3, 2025 • 49
CodeContests+: High-Quality Test Case Generation for Competitive Programming Paper • 2506.05817 • Published Jun 6, 2025 • 10
CryoFM: A Flow-based Foundation Model for Cryo-EM Densities Paper • 2410.08631 • Published Oct 11, 2024 • 3
SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration Paper • 2501.01320 • Published Jan 2, 2025 • 13
SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training Paper • 2506.05301 • Published Jun 5, 2025 • 60
VINCIE: Unlocking In-context Image Editing from Video Paper • 2506.10941 • Published Jun 12, 2025 • 5
Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations Paper • 2506.18898 • Published Jun 23, 2025 • 35