Hugging Face
Demian L. P.
very-cooluser
3 followers · 13 following
AI & ML interests
Anything that can run on ~3GB of memory is an instant thumbs up for me
Recent Activity
reacted to Shrijanagain's post with 🔥 · 10 days ago
Surya-1.1T: Scaling Beyond Human-Level Reasoning via 146 Trillion Token Pre-training
Author: SKT AI LABS
Affiliation: SKT AI Labs / Project Surya
Model Architecture: Optimized Dense Transformer
Parameters: 1.1 Trillion
Training Tokens: 146 Trillion
Want to collaborate with us? Friends, let's start this journey: we have collected 146 trillion tokens and finished pre-training, but we need to make the model more powerful.
Whitepaper - https://github.com/SHRIJANAGAIN/PROFF
reacted to Keeby-smilyai's post with 🤗 · 11 days ago
Hello everyone!
reacted to robtacconelli's post with 🤯 · 13 days ago
🧬 Midicoth: diffusion-based lossless compression — no neural net, no GPU, no training data

What if reverse diffusion could compress text — without a neural network? Midicoth brings score-based denoising into classical compression. It treats prior smoothing as forward noise and reverses it with Tweedie's formula on a binary tree — 3 denoising steps, James-Stein shrinkage, applied after all model blending. ~2,000 lines of C, single CPU core.

Beats every dictionary compressor we tested:
- enwik8 (100 MB) → 1.753 bpb (−11.9% vs xz, −15% vs Brotli, −24.5% vs bzip2)
- alice29.txt → 2.119 bpb (−16.9% vs xz)
- Outperforms xz, zstd, Brotli, bzip2, gzip on all inputs

PAQ/CMIX still win with hundreds of models + LSTMs. LLM compressors win with pre-trained knowledge. Midicoth closes the gap with pure statistics — no mixer, no gradient descent, just counting.

The Tweedie denoising layer adds 2.3–2.7% on every file tested — the most consistent component in the ablation. Adding SSE or logistic mixers made things worse. In the online setting, count-based beats gradient-based.

No external dependencies. Fully deterministic. Bit-exact encode/decode. ~60 KB/s throughput.

💻 Code: https://github.com/robtacconelli/midicoth
📄 Paper: https://huggingface.co/papers/2603.08771
⭐ Space: https://huggingface.co/spaces/robtacconelli/midicoth

If you ever wondered whether diffusion ideas belong in data compression — here's proof they do. ⭐ appreciated!
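The post leans on two classical estimators: Tweedie's formula (posterior mean via the score of the marginal density) and positive-part James-Stein shrinkage. A minimal Python sketch of both, under simplifying assumptions — Gaussian noise, a flat list of estimates, and a Gaussian prior in closed form. Function names and this setting are illustrative; Midicoth itself applies these ideas to per-node statistics on a binary tree, in C:

```python
def james_stein_shrink(estimates, noise_var):
    """Positive-part James-Stein: shrink noisy estimates toward their
    grand mean. Dominates the raw estimates in squared error when there
    are 4+ coordinates (illustrative flat-list version)."""
    p = len(estimates)
    mean = sum(estimates) / p
    ss = sum((x - mean) ** 2 for x in estimates)
    if ss == 0 or p < 4:
        return list(estimates)
    # Positive-part factor: never shrink past the mean.
    factor = max(0.0, 1.0 - (p - 3) * noise_var / ss)
    return [mean + factor * (x - mean) for x in estimates]


def tweedie_gaussian(x, prior_mean, prior_var, noise_var):
    """Tweedie's formula: E[theta | x] = x + noise_var * d/dx log p(x),
    where p is the marginal density of x. With theta ~ N(prior_mean,
    prior_var) and noise N(0, noise_var), the marginal is
    N(prior_mean, prior_var + noise_var), so the score is
    -(x - prior_mean) / (prior_var + noise_var)."""
    score = -(x - prior_mean) / (prior_var + noise_var)
    return x + noise_var * score
```

With a Gaussian prior the Tweedie correction reduces to the familiar precision-weighted average of observation and prior mean; the interesting case in compression is when the marginal score must be estimated from counts rather than known in closed form.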
Organizations
None yet
very-cooluser's activity
liked a model about 2 months ago
z-lab/Qwen3-Coder-30B-A3B-DFlash · Text Generation · 0.5B · Updated 14 days ago · 549 · 28
liked a model 2 months ago
Qwen/Qwen3-TTS-Tokenizer-12Hz · Audio-to-Audio · Updated Jan 29 · 65.8k · 55