arxiv:2511.11672

OSGym: Scalable Distributed Data Engine for Generalizable Computer Agents

Published on Nov 11, 2025

Authors:

Abstract

OSGym is a scalable distributed data engine that enables efficient training of computer use agents across diverse operating system tasks with high parallelization and low computational costs.

AI-generated summary

We introduce OSGym, a scalable distributed Data Engine for training agents across diverse computer use tasks. OSGym efficiently scales to more than a thousand operating system (OS) replicas under academia-affordable cost budget, to serve as agent runtime environments. OSGym has three advantages: 1) Scalability: Despite intensive resource consumption for running OS replicas, OSGym can parallelize a thousand OS replicas while maintaining the operation efficiency under limited resources. Its scalable parallelization enables generating a vast amount of data (1420 multi-turn trajectories per minute). 2) Generality and Customizability: OSGym supports a wide variety of tasks as long as they run on operating systems, including functional tool-use, browser interactions, software engineering, office applications, etc. It also enables easy and flexible customization of model training algorithms. 3) Economic Viability for Academia Use: Only costs 0.2 to 0.3 USD per day per OS replica on easily accessible on-demand compute providers. Our experiments demonstrate the effectiveness of OSGym for implementing comprehensive pipelines for data collection, supervised fine-tuning, and reinforcement learning for computer use agents. We believe OSGym will push the scalability and universality in future agent research.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2511.11672 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2511.11672 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2511.11672 in a Space README.md to link it from this page.