Depth Estimation

WildCross: A Cross-Modal Large Scale Benchmark for Place Recognition and Metric Depth Estimation in Natural Environments

Joshua Knights1,2 · Joseph Reid1 · Kaushik Roy1
David Hall1 · Mark Cox1 · Peyman Moghadam1,2

1CSIRO Robotics, CSIRO   2Queensland University of Technology

Paper PDF Project Page

This repository contains the pre-trained checkpoints for a variety of tasks on the WildCross benchmark.


WildCross Overview

We introduce WildCross, a large-scale benchmark for cross-modal place recognition and metric depth estimation in natural environments. The dataset comprises over 476K sequential RGB frames with semi-dense depth and surface normal annotations, each aligned with accurate 6DoF poses and synchronized dense lidar submaps.

We conduct comprehensive experiments on visual, lidar, and cross-modal place recognition, as well as metric depth estimation, demonstrating the value of WildCross as a challenging benchmark for multi-modal robotic perception tasks.

This Hugging Face repository contains the model weights needed to replicate all experiments in the original paper.

Data Download Instructions

Our dataset can be downloaded through the CSIRO Data Access Portal. Detailed instructions for downloading the dataset can be found in the README file provided on the data access portal page.

Training and Benchmarking

Here we provide pre-trained checkpoints for a variety of tasks on WildCross. Instructions for using these checkpoints for training and evaluation can be found in the WildCross GitHub repository.

Visual Place Recognition

WildCross supports visual relocalization with sequential RGB imagery across challenging revisits, including reverse-direction traversals and long-term appearance changes. The benchmark includes cross-fold train/test splits for robust evaluation of generalization and in-domain adaptation. For each model below we provide the weights for the original pre-trained model as well as models fine-tuned on our different data splits.

Checkpoints

| Model | Checkpoint Folder |
| --- | --- |
| NetVLAD | Link |
| MixVPR | Link |
| SALAD | Link |
| BoQ | Link |
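Place-recognition models such as those above are typically evaluated by descriptor retrieval: each query descriptor is matched against a database, and a retrieval counts as correct if a top-ranked match lies within a distance threshold of the query's ground-truth pose. The following is a minimal illustrative sketch of that metric; the function name, array shapes, and the 25 m radius are assumptions for illustration, not the benchmark's actual evaluation code.

```python
import numpy as np

def recall_at_n(query_desc, db_desc, query_pos, db_pos, n=1, radius=25.0):
    """Fraction of queries whose top-n database matches (ranked by L2
    descriptor distance) lie within `radius` metres of the query pose."""
    hits = 0
    for q_desc, q_pos in zip(query_desc, query_pos):
        dists = np.linalg.norm(db_desc - q_desc, axis=1)  # descriptor distances
        top_n = np.argsort(dists)[:n]                     # best-n candidates
        geo = np.linalg.norm(db_pos[top_n] - q_pos, axis=1)
        hits += np.any(geo <= radius)                     # any candidate close enough?
    return hits / len(query_desc)
```

The same retrieval-based protocol applies to the lidar and cross-modal settings below, with descriptors produced by the respective models.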

LiDAR Place Recognition

WildCross is an extension of the original Wild-Places dataset for LiDAR place recognition, and extends its evaluation setup with new splits of the original data. For LiDAR place recognition (LPR), code for training and evaluation can be found on a WildCross branch of the original Wild-Places repository.

For each model below we provide model weights which have been fine-tuned on our new data splits.

Checkpoints

| Model | Checkpoint Folder |
| --- | --- |
| LoGG3D-Net | Link |
| MinkLoc3Dv2 | Link |
| HOTFormerLoc | Link |

Cross-Modal Place Recognition

Cross-modal place recognition (CMPR) in WildCross evaluates retrieval across sensing modalities, such as image-to-lidar localization. The synchronized RGB frames, accurate poses, and dense lidar submaps provide a strong testbed for cross-modal representation learning.

The checkpoints below provide Lip-Loc CMPR model weights with different backbones, fine-tuned on our data splits.

Checkpoints

| Model | Checkpoint Folder |
| --- | --- |
| Lip-Loc (ResNet50) | Link |
| Lip-Loc (DINOv2) | Link |
| Lip-Loc (DINOv3) | Link |
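In the CMPR setting, an image encoder and a lidar encoder map their inputs into a shared embedding space, and localization reduces to nearest-neighbor search across modalities. The sketch below shows the retrieval step only, using cosine similarity; the function name and embedding shapes are illustrative assumptions, not the Lip-Loc API.

```python
import numpy as np

def image_to_lidar_match(img_emb, lidar_embs):
    """Rank lidar-submap embeddings by cosine similarity to an image
    embedding; both live in a shared cross-modal embedding space."""
    img = img_emb / np.linalg.norm(img_emb)                         # unit-normalize query
    lid = lidar_embs / np.linalg.norm(lidar_embs, axis=1, keepdims=True)
    sims = lid @ img                                                # cosine similarity per submap
    return np.argsort(-sims)                                        # best match first
```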

Metric Depth Estimation

WildCross provides semi-dense metric depth and surface normal annotations for every frame, generated from accumulated global point clouds, accurate camera poses, and visibility filtering to remove occluded points. This supports training and benchmarking depth models in natural environments where current methods face substantial domain-shift challenges.

The checkpoints below provide weights for DepthAnythingV2 models of different sizes, fine-tuned on WildCross data.

Checkpoints

| Model | Checkpoint Folder |
| --- | --- |
| DepthAnythingV2-vits | Link |
| DepthAnythingV2-vitb | Link |
| DepthAnythingV2-vitl | Link |
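Because the ground-truth depth is semi-dense, evaluation must be restricted to pixels with a valid lidar-derived depth value. The sketch below computes two standard metric-depth errors, absolute relative error (AbsRel) and RMSE, under that validity mask; the function name and the convention that invalid pixels are encoded as 0 are assumptions for illustration.

```python
import numpy as np

def depth_metrics(pred, gt):
    """AbsRel and RMSE computed only over pixels where the semi-dense
    ground-truth depth is valid (encoded here as gt > 0)."""
    mask = gt > 0                                  # valid-depth mask
    p, g = pred[mask], gt[mask]
    abs_rel = np.mean(np.abs(p - g) / g)           # mean |pred - gt| / gt
    rmse = np.sqrt(np.mean((p - g) ** 2))          # root-mean-square error
    return abs_rel, rmse
```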

BibTeX

If you find this repository useful or use the WildCross dataset in your work, please cite us using the following:

@inproceedings{wildcross2026,
  title={{WildCross: A Cross-Modal Large Scale Benchmark for Place Recognition and Metric Depth Estimation in Natural Environments}},
  author={Joshua Knights and Joseph Reid and Kaushik Roy and David Hall and Mark Cox and Peyman Moghadam},
  booktitle={Proceedings-IEEE International Conference on Robotics and Automation},
  pages={},
  year={2026}
}