trl-internal-testing/tiny-Qwen3ForCausalLM-Instruct-2507
Text Generation • 2.45M • Updated • 1.01k
Internal testing artifact management for the TRL library
from trl.experimental.ssd import SSDConfig, SSDTrainer
trainer = SSDTrainer(
    model="Qwen/Qwen3-4B-Instruct",
    args=SSDConfig(temperature=0.6, top_k=20, top_p=0.95),
    train_dataset=dataset,
)
trainer.train()
use_transformers_paged, and key fixes for VLM response parsing.
pip install --upgrade trl