ROSALIA-7B-v1

ROSALIA is a vision-language model (VLM) designed for precise lesion segmentation in chest X-rays (CXRs). It is a LISA model fine-tuned on the MIMIC-ILS dataset, a large-scale instruction-answer dataset for CXR lesion segmentation.

ROSALIA performs Instruction-Guided Lesion Segmentation (ILS), a medical-domain adaptation of referring image segmentation (RIS): given a simple, user-friendly instruction, it segments diverse lesions and provides a textual explanation.

This model is the core checkpoint of the paper:
Instruction-Guided Lesion Segmentation for Chest X-rays with Automatically Generated Large-Scale Dataset, accepted to CVPR 2026.

Code: https://github.com/checkoneee/ROSALIA
Dataset: https://physionet.org/content/mimic-cxr-ext-ils/1.0.0/
Paper: arXiv:2511.15186
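
To work with the checkpoint locally, the weights first need to be fetched from the Hub. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and that the model is hosted under the repository id `checkone/ROSALIA-7B-v1`. The helper name `fetch_checkpoint` and the default target directory are hypothetical and chosen only for illustration. The sketch downloads the files and nothing more; running segmentation presumably requires the inference code from the repository listed above.

```python
# Repository id of this model on the Hugging Face Hub (assumed).
REPO_ID = "checkone/ROSALIA-7B-v1"


def fetch_checkpoint(local_dir: str = "./ROSALIA-7B-v1") -> str:
    """Download all files of the checkpoint and return the local path.

    Hypothetical helper for illustration; it is not part of the ROSALIA code.
    """
    # Imported here so the module can be loaded without huggingface_hub
    # installed; the package is only needed when a download is requested.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


# Example usage (downloads several gigabytes of weights):
#   path = fetch_checkpoint()
#   print(path)
```

The returned path can then be passed to whichever loading script the code repository provides.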

📖 Citation

If you find this model or the related research useful, please cite our work:

@article{choi2025instruction,
  title={Instruction-Guided Lesion Segmentation for Chest X-rays with Automatically Generated Large-Scale Dataset},
  author={Choi, Geon and Yoon, Hangyul and Shin, Hyunju and Park, Hyunki and Seo, Sang Hoon and Yang, Eunho and Choi, Edward},
  journal={arXiv preprint arXiv:2511.15186},
  year={2025}
}


Base model: xinlai/LISA-7B-v1
