# HMKD-ICMR: Heterogeneous Model Knowledge Distillation via Dual Alignment for Semantic Segmentation
Mingzhu Xu¹, Jing Wang¹, Mingcai Wang¹, Yiping Li¹, Yupeng Hu¹\*, Xuemeng Song¹, Weili Guan¹

¹Affiliation (to be updated) · \*Corresponding author
Official implementation of HMKD, a Heterogeneous Model Knowledge Distillation framework with Dual Alignment for Semantic Segmentation.
- **Conference:** ICMR 2025
- **Task:** Semantic Segmentation
- **Framework:** PyTorch
## Model Information
### 1. Model Name
HMKD (Heterogeneous Model Knowledge Distillation)
### 2. Task Type & Applicable Tasks
- **Task Type:** Semantic Segmentation / Model Compression
- **Core Task:** Knowledge distillation for segmentation
- **Applicable Scenarios:**
  - Lightweight model deployment
  - Cross-architecture distillation
  - Efficient semantic understanding
### 3. Project Introduction
Semantic segmentation models often rely on heavy architectures, limiting their deployment in resource-constrained environments. Knowledge distillation (KD) provides a promising solution by transferring knowledge from a large teacher model to a compact student model.
HMKD introduces a Dual Alignment Distillation Framework, which:
- Aligns heterogeneous architectures between teacher and student models
- Performs feature-level and prediction-level alignment
- Bridges the representation gap across different model families
- Improves segmentation accuracy while maintaining efficiency
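To make the dual-alignment idea concrete, here is a minimal PyTorch sketch of a loss combining feature-level alignment (a 1×1 projection plus MSE) with prediction-level alignment (KL divergence on temperature-softened logits). This is an illustration of the general technique, not the repository's actual implementation; the class name, arguments, and loss weights are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualAlignmentLoss(nn.Module):
    """Illustrative dual-alignment distillation loss (not the official HMKD code).

    Feature level: project student features to the teacher's channel width
    with a 1x1 conv, then match them with an MSE loss.
    Prediction level: match temperature-softened class probabilities with KL.
    """

    def __init__(self, student_ch, teacher_ch, temperature=4.0, alpha=1.0, beta=1.0):
        super().__init__()
        self.proj = nn.Conv2d(student_ch, teacher_ch, kernel_size=1)
        self.t = temperature
        self.alpha = alpha  # weight of the feature-level term
        self.beta = beta    # weight of the prediction-level term

    def forward(self, s_feat, t_feat, s_logits, t_logits):
        # Feature-level alignment: project, resize if needed, then MSE.
        s_proj = self.proj(s_feat)
        if s_proj.shape[-2:] != t_feat.shape[-2:]:
            s_proj = F.interpolate(s_proj, size=t_feat.shape[-2:],
                                   mode="bilinear", align_corners=False)
        feat_loss = F.mse_loss(s_proj, t_feat)

        # Prediction-level alignment: KL on softened logits, scaled by T^2.
        s_log_p = F.log_softmax(s_logits / self.t, dim=1)
        t_p = F.softmax(t_logits / self.t, dim=1)
        pred_loss = F.kl_div(s_log_p, t_p, reduction="batchmean") * (self.t ** 2)

        return self.alpha * feat_loss + self.beta * pred_loss
```

In practice this term would be added to the student's standard cross-entropy loss, with the teacher run under `torch.no_grad()`.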
### 4. Training Data Source
Supported datasets:
- Cityscapes
- CamVid
| Dataset | Train | Val | Test | Classes |
|---|---|---|---|---|
| Cityscapes | 2975 | 500 | 1525 | 19 |
| CamVid | 367 | 101 | 233 | 11 |
## Environment Setup
- Ubuntu 20.04.4 LTS
- Python 3.8.10 (Anaconda recommended)
- CUDA 11.3
- PyTorch 1.11.0
- NCCL 2.10.3
Install dependencies:

```shell
pip install timm==0.3.2
pip install mmcv-full==1.2.7
pip install opencv-python==4.5.1.48
```
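The versions listed above can be reproduced in one environment along these lines; the environment name `hmkd` and the use of conda are assumptions, not a requirement of the repository:

```shell
# Assumed conda-based setup; adjust to your own tooling as needed.
conda create -n hmkd python=3.8 -y
conda activate hmkd

# PyTorch 1.11.0 built against CUDA 11.3
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 \
    --extra-index-url https://download.pytorch.org/whl/cu113

pip install timm==0.3.2 mmcv-full==1.2.7 opencv-python==4.5.1.48
```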
## Pre-trained Weights

### Initialization Weights
- ResNet-18
- ResNet-101
- SegFormer-B0
- SegFormer-B4
(Download from official PyTorch and Google Drive links)
### Trained Weights
Download trained HMKD models:
## Training
1. Download the datasets and pre-trained weights.
2. Generate the dataset path lists (`.txt` files).
3. Update the dataset paths in the code.
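Step 2 above (generating the `.txt` path lists) can be sketched as follows. The expected line format is assumed to be `image_path label_path` per line; check the repository's data-loading code for the actual format, and note that the function name and directory layout here are illustrative:

```python
import os

def make_list(image_dir, label_dir, out_txt,
              image_suffix=".png", label_suffix=".png"):
    """Write one 'image_path label_path' pair per line to out_txt."""
    with open(out_txt, "w") as f:
        for name in sorted(os.listdir(image_dir)):
            if not name.endswith(image_suffix):
                continue
            stem = name[: -len(image_suffix)]
            f.write(f"{os.path.join(image_dir, name)} "
                    f"{os.path.join(label_dir, stem + label_suffix)}\n")
```

For Cityscapes, the label suffix would typically be the `_gtFine_labelTrainIds.png` variant rather than a plain `.png`.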
Run training (in the background with logging, or in the foreground):

```shell
# Background, with output logged to train.log
CUDA_VISIBLE_DEVICES=0,1 nohup python -m torch.distributed.launch --nproc_per_node=2 train_NEW_AEU_kd.py > train.log 2>&1 &

# Foreground
CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 train_NEW_AEU_kd.py
```
## Notes
- Designed for research purposes
- Performance depends on teacher-student architecture pairing
- Multi-GPU training is recommended
## Citation
```bibtex
@inproceedings{HMKD,
  author    = {Xu, Mingzhu and Wang, Jing and Wang, Mingcai and Li, Yiping and Hu, Yupeng and Song, Xuemeng and Guan, Weili},
  booktitle = {Proceedings of the ACM International Conference on Multimedia Retrieval (ICMR)},
  title     = {Heterogeneous Model Knowledge Distillation via Dual Alignment for Semantic Segmentation},
  year      = {2025}
}
```
## Contact
For questions or collaboration, please contact the corresponding author.