All HF Hub posts

danielhanchen 
posted an update 2 days ago
Introducing Unsloth Studio ✨
A new open-source web UI to train and run LLMs.

• Run models locally on Mac, Windows, Linux
• Train 500+ models 2x faster with 70% less VRAM
• Supports GGUF, vision, audio, embedding models
• Auto-create datasets from PDF, CSV, DOCX
• Self-healing tool calling and code execution
• Compare models side by side + export to GGUF

GitHub: https://github.com/unslothai/unsloth
Blog and Guide: https://unsloth.ai/docs/new/studio

Available now on Hugging Face, NVIDIA, Docker and Colab.
Keeby-smilyai 
posted an update 2 days ago
Hello everyone!
Shrijanagain 
posted an update about 15 hours ago
Surya-1.1T: Scaling Beyond Human-Level Reasoning via 146 Trillion Token Pre-training
Author: Shrijan Kumar Tiwari
Affiliation: SKT AI Labs / Project Surya
Model Architecture: Optimized Dense Transformer
Parameters: 1.1 Trillion
Training Tokens: 146 Trillion

Want to collaborate, friends? Let's start the journey: we have collected 146 trillion tokens and completed pre-training, but we still need to make the model more powerful.
fffiloni 
posted an update 1 day ago
I brought DALL·E mini back to life 🤖🎨

You can try it here:
fffiloni/dalle-mini-reboot

And I also built a batch version using Hugging Face Jobs (up to 50 images per prompt):
fffiloni/dalle-mini-via-jobs

The goal was to stay close to the original JAX/Flax pipeline, while integrating it with modern tooling (Gradio + Jobs).

It ended up being a fun way to revisit this model — still weird, still fun 😄
ajibawa-2023 
posted an update 2 days ago
C-Code-Large
Dataset: ajibawa-2023/C-Code-Large

C-Code-Large is a large-scale corpus of C programming language source code comprising more than 4 million code samples stored in .jsonl format. The dataset is designed to support research and development in large language model (LLM) pretraining, static analysis, and software engineering automation for the C ecosystem.

By offering a high-volume, language-focused dataset, C-Code-Large enables targeted experimentation in low-level programming, memory-constrained environments, and performance-critical systems, where C continues to be a dominant language.

C-Code-Large addresses the lack of large, curated, C-specific datasets, making it possible to conduct focused research on procedural programming paradigms, manual memory management, and system-level abstractions.
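Since the corpus ships as .jsonl, each line is one self-contained JSON record. A minimal sketch of parsing one such record with the Python standard library (the `"text"` field name here is an assumption for illustration; check the dataset card for the actual schema):

```python
import json

# A hypothetical .jsonl record in the style of C-Code-Large.
# The "text" field name is an assumption, not taken from the dataset card.
line = '{"text": "#include <stdio.h>\\nint main(void) { printf(\\"hi\\\\n\\"); return 0; }"}'

record = json.loads(line)   # one record per line in a .jsonl file
source = record["text"]     # the raw C source sample
print(source.splitlines()[0])
```

In practice you would stream the file line by line (or use a dataset-loading library) rather than hold 4M+ samples in memory.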

prithivMLmods 
posted an update 2 days ago
Introducing QIE-Bbox-Studio! 🔥🤗

The QIE-Bbox-Studio demo is now live, more precise and packed with more options. Users can remove objects, add design elements, and even move objects from one place to another, all with fast 4-step inference.

🤗 Demo: prithivMLmods/QIE-Bbox-Studio
🔗 GitHub: https://github.com/PRITHIVSAKTHIUR/QIE-Bbox-Studio

🚀 Models [LoRA] :

● QIE-2511-Object-Mover-Bbox: prithivMLmods/QIE-2511-Object-Mover-Bbox
● QIE-2511-Object-Remover-Bbox-v3: prithivMLmods/QIE-2511-Object-Remover-Bbox-v3
● QIE-2511-Outfit-Design-Layout: prithivMLmods/QIE-2511-Outfit-Design-Layout
● QIE-2509-Object-Remover-Bbox-v3: prithivMLmods/QIE-2509-Object-Remover-Bbox-v3
● QIE-2509-Object-Mover-Bbox: prithivMLmods/QIE-2509-Object-Mover-Bbox

🚀 Collection:

● Qwen Image Edit [Layout Bbox]: https://huggingface.co/collections/prithivMLmods/qwen-image-edit-layout-bbox

To learn more, visit the app page or the respective model pages.
DedeProGames 
posted an update 1 day ago
Can small models program?

It is commonly thought that small models, even reasoning-tuned ones, cannot produce extensive, high-quality code.

We present OrionLLM/NanoCoder-0.6b, a model with just 600 million parameters, based on qwen3-0.6b and trained on the nvidia/OpenCodeReasoning dataset.

While it is not strong at complex code, we observed a significant improvement in code generation (especially in Python), demonstrating that, when trained correctly, small models can in fact program.
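One way to back a claim like this mechanically is to execute the model's generated code and assert on the result. A minimal sketch of such a harness, where the generated snippet below is a hypothetical stand-in and not actual NanoCoder output:

```python
# Hypothetical harness: take a model's generated snippet as text,
# compile it, run it in an isolated namespace, and test the result.
# The snippet below is a stand-in, not actual NanoCoder-0.6b output.
generated = """
def add(a, b):
    return a + b
"""

namespace = {}
exec(compile(generated, "<generated>", "exec"), namespace)
assert namespace["add"](2, 3) == 5
print("generated code passed")
```

Pass@k-style evaluations of code models work on the same principle, scaled up over many problems and samples.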
ZennyKenny 
posted an update 1 day ago
🤔 So we're supposed to post our repo storage graphs now right?
DedeProGames 
posted an update 2 days ago
Introducing the GRM Family, a family of small models fine-tuned from the Qwen2.5 family for long CoT, general reasoning, and agentic tasks.

GRM is available in 7b and 1.5b parameter sizes, making these models well suited to complex tasks and local inference agents.
OrionLLM/GRM-7b
OrionLLM/GRM-1.5b
kanaria007 
posted an update 1 day ago
✅ Article highlight: *Incentives in Structured Intelligence* (art-60-045, v0.1)

TL;DR:
Most serious systems already run on incentives — budgets, tariffs, subsidies, penalties, and scarce-resource allocation. The problem is that these usually live outside the runtime as opaque spreadsheets, billing rules, or political defaults.

This article sketches how to make incentives *first-class inside SI-Core*: attach *BudgetSurface* and *CostSurface* to GoalSurface, run *ETH-aware tariff experiments* under PoLB, and treat pricing / allocation as auditable structured decisions rather than hidden knobs.

Read:
kanaria007/agi-structural-intelligence-protocols

Why it matters:
• makes economic trade-offs explicit instead of burying them in billing logic or policy spreadsheets
• prevents incentives from quietly fighting safety, fairness, or affordability goals
• lets tariff changes and budget-heavy actions be evaluated, simulated, and gated before rollout
• keeps pricing and allocation auditable with portable artifacts and normalized verdicts

What’s inside:
• *BudgetSurface / CostSurface* as typed attachments to GoalSurface
• *IncentiveLedger* for budgets, tariffs, exceptions, and compliance traces
• *PoLB modes for tariffs*: sandbox, shadow, and online rollout
• *ETH-aware A/B* for affordability and burden-by-income-band checks
• *Goal markets* for scarce resource allocation without reducing everything to tokens
• *Price discovery* as an E-Jump problem under welfare, fairness, and stability constraints

Key idea:
A serious intelligence runtime should not treat incentives as external afterthoughts. Budgets, tariffs, and price signals should be *observable, governable, and replayable* inside the same structure as safety and fairness.