Gaming for Boundary: Elastic Localization for Frame-Supervised Video Moment Retrieval

Hao Liu1  Yupeng Hu1✉  Kun Wang1  Yinwei Wei1  Liqiang Nie2

1School of Software, Shandong University, Jinan, China
2School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, China

This is the official PyTorch implementation of GOAL, a frame-supervised Video Moment Retrieval (VMR) framework for elastic boundary localization via a game-based paradigm and Dynamic Updating Technique (DUT).

🔗 Paper: SIGIR 2025 🔗 GitHub Repository: iLearn-Lab/SIGIR25-GOAL


Model Information

1. Model Name

GOAL (Gaming fOr elAstic Localization).

2. Task Type & Applicable Tasks

  • Task Type: Frame-Supervised Video Moment Retrieval (VMR) / Temporal Localization / Vision-Language Learning
  • Applicable Tasks: Retrieving the temporal moment in a video that matches a natural language query using a single annotated frame, with a focus on ambiguous temporal boundary localization.

3. Project Introduction

Frame-supervised Video Moment Retrieval (VMR) aims to retrieve the temporal moment in a video that matches a natural language query using only a single annotated frame. While this setting reduces annotation cost, it brings severe ambiguity in temporal boundary prediction.

GOAL addresses this challenge through a game-based paradigm with three players, namely KFP, AFP, and BP, together with a Dynamic Updating Technique (DUT) that progressively refines boundary decisions through unilateral and bilateral updates for more elastic localization.
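To give an intuition for what "elastic" boundary refinement means here, the following is a minimal toy sketch (not the official implementation; the function name, scores, and threshold are illustrative assumptions): starting from the single annotated key frame, each boundary is updated unilaterally, expanding one side at a time while neighbouring frames remain sufficiently relevant to the query.

```python
def elastic_boundaries(scores, key_idx, threshold=0.5):
    """Toy sketch: expand moment boundaries outward from one annotated frame.

    `scores` is a per-frame query-relevance sequence in [0, 1]; `key_idx` is
    the single annotated frame. Each side grows independently (a unilateral
    update) while the next frame's score stays at or above `threshold`.
    """
    left = right = key_idx
    # Unilateral update of the left boundary.
    while left > 0 and scores[left - 1] >= threshold:
        left -= 1
    # Unilateral update of the right boundary.
    while right < len(scores) - 1 and scores[right + 1] >= threshold:
        right += 1
    return left, right


scores = [0.1, 0.2, 0.7, 0.9, 1.0, 0.8, 0.6, 0.3]
print(elastic_boundaries(scores, key_idx=4))  # -> (2, 6)
```

In GOAL itself the updates are driven by the three-player game and DUT rather than a fixed threshold, but the sketch shows why a single annotated frame can still anchor a full moment.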

4. Training Data Source

The model is trained and evaluated on standard frame-supervised VMR benchmarks:

  • ActivityNet Captions
  • Charades-STA
  • TACoS

Usage & Basic Inference

This codebase provides training and evaluation scripts for frame-supervised VMR, as well as checkpoints for quick reproduction.

Step 1: Prepare the Environment

Clone the GitHub repository and install dependencies:

git clone https://github.com/iLearn-Lab/SIGIR25-GOAL.git
cd SIGIR25-GOAL
python -m venv .venv
source .venv/bin/activate   # Linux / macOS
# .venv\Scripts\activate    # Windows
pip install torch numpy scipy pyyaml tqdm   # choose the torch build matching your CUDA setup (see pytorch.org)

Step 2: Download Model Weights & Data

Prepare features and raw annotations following ViGA's dataset preparation protocol.

Before running the code, please check and replace local dataset and feature paths in:

  • src/config.yaml
  • src/utils/utils.py
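The exact keys in `src/config.yaml` depend on the repository; as a hypothetical illustration only, the path entries to replace typically look something like:

```yaml
# src/config.yaml (illustrative keys, not the repository's actual schema)
dataset:
  name: charades_sta
  annotation_dir: /path/to/annotations      # replace with your local path
  feature_dir: /path/to/video_features      # replace with your local path
```

Apply the same substitution to any hard-coded paths in `src/utils/utils.py`.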

Step 3: Run Inference

To evaluate a trained experiment folder, run:

python -m src.experiment.eval --exp path/to/your/experiment_folder

Limitations & Notes

Disclaimer: This repository is intended for academic research purposes only.

  • The model requires access to the original benchmark datasets and extracted video features for evaluation.
  • Some configuration files currently contain local path settings and should be updated before use.

Citation

If you find our work useful in your research, please consider citing our paper:

@inproceedings{liu2025gaming,
  title={Gaming for Boundary: Elastic Localization for Frame-Supervised Video Moment Retrieval},
  author={Liu, Hao and Hu, Yupeng and Wang, Kun and Wei, Yinwei and Nie, Liqiang},
  booktitle={Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  year={2025},
  doi={10.1145/3726302.3729984}
}

Contact

If you have any questions, feel free to contact me at liuh90210@gmail.com.
