Stanford AI

university

https://www.ai.stanford.edu


AI & ML interests

None defined yet.

Recent Activity

awwkl submitted a paper 13 days ago

Zero-shot World Models Are Developmentally Efficient Learners

qizhengz authored a paper 13 days ago

Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live

qizhengz authored a paper 13 days ago

FrontierCS: Evolving Challenges for Evolving Intelligence


Papers

Sparse Reward Subsystem in Large Language Models

Intelligence per Watt: Measuring Intelligence Efficiency of Local AI


awwkl

submitted a paper to Daily Papers 13 days ago

Zero-shot World Models Are Developmentally Efficient Learners

Paper • 2604.10333 • Published 16 days ago • 7

qizhengz

authored 5 papers 13 days ago

Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live

Paper • 2511.02230 • Published Nov 4, 2025 • 2

submitted a paper to Daily Papers 3 months ago

Sparse Reward Subsystem in Large Language Models

Paper • 2602.00986 • Published Feb 1 • 13

adamm-hf

posted an update 6 months ago

Post

1312

The #1 trending AI/ML dataset today 🏆

Massive scale, diversity, and end-to-end potential from NVIDIA!
nvidia/PhysicalAI-Autonomous-Vehicles

adamm-hf

posted an update 6 months ago

Post

803

The new King 👑 has arrived!

Moonshot AI now has the top model on Hugging Face 🔥
moonshotai/Kimi-K2-Thinking

adamm-hf

posted an update 6 months ago

Post

2866

💸🤑 You don’t need 100 GPUs to train something amazing!

Our Smol Training Playbook teaches you a better path to world-class LLMs, for free!

Check out the #1 trending space on 🤗 :
HuggingFaceTB/smol-training-playbook

nicholswang

authored 3 papers 6 months ago

Closing the Modality Gap for Mixed Modality Search

Paper • 2507.19054 • Published Jul 25, 2025

SciVideoBench: Benchmarking Scientific Video Reasoning in Large Multimodal Models

Paper • 2510.08559 • Published Oct 9, 2025 • 9

FineVision: Open Data Is All You Need

Paper • 2510.17269 • Published Oct 20, 2025 • 80

qizhengz

authored a paper 7 months ago

Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

Paper • 2510.04618 • Published Oct 6, 2025 • 131

adamm-hf

posted an update 7 months ago

Post

2348

Cool stuff these past weeks on Hugging Face! 🤗 🚀
• 📈 Trackio, a local-first W&B alternative
https://github.com/gradio-app/trackio/issues
• 🌍 EmbeddingGemma, 300M-param multilingual embeddings, on-device
https://huggingface.co/blog/embeddinggemma
• 💻 Open LLMs in VS Code (Inference Providers)
https://x.com/reach_vb/status/1966185427582497171
• 🤖 Smol2Operator GUI agents
https://huggingface.co/blog/smol2operator
• 🖼️ Gradio visible watermarking
https://huggingface.co/blog/watermarking-with-gradio

AnSungJae3489

posted an update 7 months ago

Post

2632

ShareGPT? How about ShareGPT-X?

We are releasing **92K** human-LLM conversations as a refreshed update to the original ShareGPT dataset.

DSULT-Core/ShareGPT-X

qizhengz

authored 3 papers 7 months ago

CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion

Paper • 2405.16444 • Published May 26, 2024 • 1

Cost-Efficient Serving of LLM Agents via Test-Time Plan Caching

Paper • 2506.14852 • Published Jun 17, 2025 • 1

FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18, 2025 • 118

Abhaykoul

posted an update 8 months ago

Post

3348

🚀 Ever dreamed of training your own Large Language Model from scratch? What if I told you it doesn't require a supercomputer or a PhD in ML? 🤯

Introducing LLM Trainer - the educational framework that makes LLM training accessible to EVERYONE! Whether you're on a CPU-only laptop or scaling to distributed GPUs, we've got you covered. 💻➡️🖥️

Why LLM Trainer? Because existing tools are either too simplistic (hiding the magic) or too complex (requiring expert knowledge). We bridge the gap with:

🎓 Educational transparency - every component built from scratch with clear code
💻 CPU-first approach - start training immediately, no GPU needed
🔧 Full customization - modify anything you want
📈 Seamless scaling - from laptop to cluster without code changes
🤝 HuggingFace integration - works with existing models & tokenizers

Key highlights:
✅ Built-in tokenizers (BPE, WordPiece, HF wrappers)
✅ Complete Transformer implementation from scratch
✅ Optimized for CPU training
✅ Advanced features: mixed precision, gradient checkpointing, multiple generation strategies
✅ Comprehensive monitoring & metrics

Perfect for:
- Students learning transformers
- Researchers prototyping new ideas
- Developers building domain-specific models

Ready to train your first LLM? It's easier than you think!

🔗 Check it out: https://github.com/HelpingAI/llm-trainer
📚 Docs: Getting Started Guide
💬 Join the community: GitHub Discussions

#AI #MachineLearning #LLM #DeepLearning #OpenSource #Python #HuggingFace #NLP

Special thanks to HuggingFace and PyTorch teams for the amazing ecosystem! 🙏

1 reply

Team members 443
