Active filters: grpo
weygu/qwen2.5-3b-graph-extraction • 3B • Updated • 9 • 2
ericrisco/gemma-3-4b-reasoning • Any-to-Any • 4B • Updated • 29 • 4
Text Generation • 27B • Updated • 36 • 3
Text Generation • Updated • 19 • 3
Wildstash/business-strategy-grpo-v2 • Updated • 17 • 1
Wildstash/strategic-consultant-for-corporate-strategy • Text Generation • Updated • 1
0xgr3y/Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-tall_tame_panther • Text Generation • 0.5B • Updated • 2.99k • 2
Text Generation • 4B • Updated • 199 • 9
srallabandi0225/inframind-0.5b-grpo • Text Generation • 0.5B • Updated • 50 • 7
kangdawei/MMR-Sigmoid-GRPO-8B • Text Generation • 8B • Updated • 62 • 1
mradermacher/MMR-Sigmoid-GRPO-8B-GGUF • 8B • Updated • 403 • 1
Text Generation • 3B • Updated • 20 • 1
Text Generation • 8B • Updated • 19 • 1
mradermacher/MolOptAgent-3B-GGUF • 3B • Updated • 943 • 1
mradermacher/MolOptAgent-7B-GGUF • 8B • Updated • 357 • 1
Text Generation • Updated • 194 • 1
gyung/lfm2-1.2b-koen-mt-v8-rl-10k-adapter • Text Generation • Updated • 65 • 1
aquiffoo/neo-3-3B-A400M-Thinking • Text Generation • Updated • 1
aquiffoo/neo-3-1B-A90M-Instruct • Text Generation • Updated • 1
gyung/lfm2-1.2b-koen-mt-v8-rl-10k-merged • Text Generation • 1B • Updated • 15 • 1
Chun121/Qwen3-4B-RPG-Roleplay-V2 • Text Generation • 4B • Updated • 8.96k • 30
Text Generation • 0.1B • Updated • 12
8B • Updated • 6
sergiopaniego/Qwen2-0.5B-GRPO-test • Updated
Novaciano/ESP-NSFW-GRPO-1B-Sin_Censura-GGUF • 1B • Updated • 63 • 3
nbd22/Llama-3.1-8B-Instruct-GRPO-gsm8k-ft-lora • Updated
sergiopaniego/Qwen2-0.5B-GRPO • Updated
philschmid/qwen-2.5-3b-r1-countdown • Text Generation • 3B • Updated • 29 • 8
spinech/qwen-2.5-3b-r1-countdown • Text Generation • 3B • Updated • 9
Dongwei/Qwen2.5-1.5B-Open-R1-GRPO • Text Generation • 2B • Updated • 5 • 1