In a Training Loop 🔄

Urro PRO

urroxyz

https://urro.xyz/

AI & ML interests

computational linguistics major 🤖🔎🔠 i am autistic. if i come off rude, i probably didn't mean to. please feel free to ask me for clarification.

Recent Activity

updated a collection about 10 hours ago

WTF GENIUS PAPERS

upvoted a paper about 10 hours ago

IMU-1: Sample-Efficient Pre-training of Small Language Models

upvoted a paper 1 day ago

COSMOS: Predictable and Cost-Effective Adaptation of LLMs

urroxyz's collections (6)

✨ free demo spaces

HF Spaces for demoing chat completion models. Spaces that rely on ZeroGPU, WebGPU, or BYOK are not included. Thank you so much to these devs!

Step-3.5-Flash Chatbot 🚀

Running • Featured • 45 • Run interactive Streamlit apps directly in your browser
MiniMax M2.5 Chat 👀

Running • 1 • Chat with MiniMax M2.5, a 230B MoE model (10B active)
Ling Space 🦉

Running • 5 • Chat, code, and write with AI-powered multilingual assistant
GPT-OSS-120B on AMD MI300X 💻

Running on CPU Upgrade • Featured • 334 • gpt-oss-120b on AMD MI300X GPUs

TINY MODELS WITH BIG INTELLIGENCE

Tiny (<30B) models that tend to outperform other models of the same parameter count.

prism-ml/Bonsai-8B-gguf

Text Generation • 8B • Updated 5 days ago • 38.6k • 399
Qwen/Qwen3.5-27B

Image-Text-to-Text • 28B • Updated Feb 25 • 3M • 855
Qwen/Qwen3.5-9B

Image-Text-to-Text • 10B • Updated Mar 2 • 4.87M • 1.18k
cerebras/GLM-4.7-Flash-REAP-23B-A3B

Text Generation • 23B • Updated Jan 23 • 119k • 68

HUMAN-WRITTEN & LEGALLY-SOURCED*

Datasets written by humans and/or reverse-engineered from text with deterministic algorithms. No illegal scraping or unethical synthesis. *...mostly.

BramVanroy/CommonCrawl-CreativeCommons

Viewer • Updated Aug 28, 2025 • 739M • 658 • 34
PleIAs/common_corpus

Viewer • Updated Feb 19 • 69.9k • 140k • 388
common-pile/comma_v0.1_training_dataset

Viewer • Updated Jun 6, 2025 • 784M • 4.37k • 39
crumb/openstax-text

Viewer • Updated Jul 14, 2023 • 3.35M • 1.22k • 4

WTF GENIUS PAPERS

Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are about improving tiny models.

Diffusion Language Models Know the Answer Before Decoding

Paper • 2508.19982 • Published Aug 27, 2025 • 27
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding

Paper • 2512.13586 • Published Dec 15, 2025 • 93
LSRIF: Logic-Structured Reinforcement Learning for Instruction Following

Paper • 2601.06431 • Published Jan 10 • 12
Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning

Paper • 2601.09088 • Published Jan 14 • 63

ETHICALLY-DECENT & LEGALLY-ADJACENT

Depending on your definitions, these models may not be strictly "ethical" or "legal", yet they are 100% more ethical and legal than GPT or Claude.

ibm-granite/granite-4.0-h-small

Text Generation • Updated Nov 3, 2025 • 175k • 307
ibm-granite/granite-3.3-8b-instruct

Text Generation • 8B • Updated May 12, 2025 • 357k • 154
ibm-granite/granite-3.0-8b-instruct

Text Generation • Updated Dec 19, 2024 • 25.5k • 206
alea-institute/kl3m-003-1.7b

Text Generation • 2B • Updated Apr 10, 2025 • 472 • 4

ATTENTIVE ASR MODELS FOR ONNX

ONNX conversions of ASR models that expose their attentions as outputs. Especially useful for word-level timestamp extraction.

urroxyz/whisper-medium_timestamped

Automatic Speech Recognition • Updated Aug 15, 2025 • 6
urroxyz/whisper-medium.en_timestamped

Automatic Speech Recognition • Updated Apr 25, 2025 • 2
urroxyz/Voxtral-Mini-3B-2507_timestamped

Audio-Text-to-Text • Updated Jul 27, 2025 • 3
