Depth Estimation

WildCross: A Cross-Modal Large Scale Benchmark for Place Recognition and Metric Depth Estimation in Natural Environments

Joshua Knights1,2 · Joseph Reid1 · Kaushik Roy1
David Hall1 · Mark Cox1 · Peyman Moghadam1,2

1CSIRO Robotics, CSIRO   2Queensland University of Technology

Paper PDF Project Page

This repository contains the pre-trained checkpoints for a variety of tasks on the WildCross benchmark.


WildCross Overview

We introduce WildCross, a large-scale benchmark for cross-modal place recognition and metric depth estimation in natural environments. The dataset comprises over 476K sequential RGB frames with semi-dense depth and surface normal annotations, each aligned with accurate 6DoF poses and synchronized dense lidar submaps.

We conduct comprehensive experiments on visual, lidar, and cross-modal place recognition, as well as metric depth estimation, demonstrating the value of WildCross as a challenging benchmark for multi-modal robotic perception tasks.

This Hugging Face repository contains the model weights needed to replicate all experiments in the original paper.

Data Download Instructions

Our dataset can be downloaded through the CSIRO Data Access Portal. Detailed instructions for downloading the dataset can be found in the README file provided on the data access portal page.

Training and Benchmarking

Here we provide pre-trained checkpoints for a variety of tasks on WildCross. Instructions for using these checkpoints for training and evaluation can be found in the WildCross GitHub repository.

Visual Place Recognition

WildCross supports visual relocalization with sequential RGB imagery across challenging revisits, including reverse-direction traversals and long-term appearance changes. The benchmark includes cross-fold train/test splits for robust evaluation of generalization and in-domain adaptation. For each model below we provide the weights for the original pre-trained model as well as models fine-tuned on our different data splits.

Checkpoints

| Model | Checkpoint Folder |
| --- | --- |
| NetVLAD | Link |
| MixVPR | Link |
| SALAD | Link |
| BoQ | Link |
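Place-recognition models such as those above are typically evaluated by descriptor retrieval: each query descriptor is matched against a database, and a retrieval counts as correct if a top-ranked match lies within a distance threshold of the query's ground-truth pose. The following is a minimal illustrative sketch of that metric; the function name, array shapes, and the 25 m radius are assumptions for illustration, not the benchmark's actual evaluation code.

```python
import numpy as np

def recall_at_n(query_desc, db_desc, query_pos, db_pos, n=1, radius=25.0):
    """Fraction of queries whose top-n database matches (ranked by L2
    descriptor distance) lie within `radius` metres of the query pose."""
    hits = 0
    for q_desc, q_pos in zip(query_desc, query_pos):
        dists = np.linalg.norm(db_desc - q_desc, axis=1)  # descriptor distances
        top_n = np.argsort(dists)[:n]                     # best-n candidates
        geo = np.linalg.norm(db_pos[top_n] - q_pos, axis=1)
        hits += np.any(geo <= radius)                     # any candidate close enough?
    return hits / len(query_desc)
```

The same retrieval-based protocol applies to the lidar and cross-modal settings below, with descriptors produced by the respective models.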

LiDAR Place Recognition

WildCross is an extension of the original Wild-Places dataset for LiDAR place recognition, and extends its evaluation setup with new splits of the original data. For LiDAR place recognition (LPR), code for training and evaluation can be found on a WildCross branch of the original Wild-Places repository.

For each model below we provide model weights which have been fine-tuned on our new data splits.

Checkpoints

| Model | Checkpoint Folder |
| --- | --- |
| LoGG3D-Net | Link |
| MinkLoc3Dv2 | Link |
| HOTFormerLoc | Link |

Cross-Modal Place Recognition

Cross-modal place recognition (CMPR) in WildCross evaluates retrieval across sensing modalities, such as image-to-lidar localization. The synchronized RGB frames, accurate poses, and dense lidar submaps provide a strong testbed for cross-modal representation learning.

The checkpoints below provide Lip-Loc CMPR model weights with different backbones, fine-tuned on our data splits.

Checkpoints

| Model | Checkpoint Folder |
| --- | --- |
| Lip-Loc (ResNet50) | Link |
| Lip-Loc (DINOv2) | Link |
| Lip-Loc (DINOv3) | Link |
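In the CMPR setting, an image encoder and a lidar encoder map their inputs into a shared embedding space, and localization reduces to nearest-neighbor search across modalities. The sketch below shows the retrieval step only, using cosine similarity; the function name and embedding shapes are illustrative assumptions, not the Lip-Loc API.

```python
import numpy as np

def image_to_lidar_match(img_emb, lidar_embs):
    """Rank lidar-submap embeddings by cosine similarity to an image
    embedding; both live in a shared cross-modal embedding space."""
    img = img_emb / np.linalg.norm(img_emb)                         # unit-normalize query
    lid = lidar_embs / np.linalg.norm(lidar_embs, axis=1, keepdims=True)
    sims = lid @ img                                                # cosine similarity per submap
    return np.argsort(-sims)                                        # best match first
```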

Metric Depth Estimation

WildCross provides semi-dense metric depth and surface normal annotations for every frame, generated from accumulated global point clouds, accurate camera poses, and visibility filtering to remove occluded points. This supports training and benchmarking depth models in natural environments where current methods face substantial domain-shift challenges.

The checkpoints below provide weights for DepthAnythingV2 models of different sizes, fine-tuned on WildCross data.

Checkpoints

| Model | Checkpoint Folder |
| --- | --- |
| DepthAnythingV2-vits | Link |
| DepthAnythingV2-vitb | Link |
| DepthAnythingV2-vitl | Link |
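Because the ground-truth depth is semi-dense, evaluation must be restricted to pixels with a valid lidar-derived depth value. The sketch below computes two standard metric-depth errors, absolute relative error (AbsRel) and RMSE, under that validity mask; the function name and the convention that invalid pixels are encoded as 0 are assumptions for illustration.

```python
import numpy as np

def depth_metrics(pred, gt):
    """AbsRel and RMSE computed only over pixels where the semi-dense
    ground-truth depth is valid (encoded here as gt > 0)."""
    mask = gt > 0                                  # valid-depth mask
    p, g = pred[mask], gt[mask]
    abs_rel = np.mean(np.abs(p - g) / g)           # mean |pred - gt| / gt
    rmse = np.sqrt(np.mean((p - g) ** 2))          # root-mean-square error
    return abs_rel, rmse
```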

BibTeX

If you find this repository useful or use the WildCross dataset in your work, please cite us using the following:

@inproceedings{wildcross2026,
  title={{WildCross: A Cross-Modal Large Scale Benchmark for Place Recognition and Metric Depth Estimation in Natural Environments}},
  author={Joshua Knights and Joseph Reid and Kaushik Roy and David Hall and Mark Cox and Peyman Moghadam},
  booktitle={Proceedings-IEEE International Conference on Robotics and Automation},
  pages={},
  year={2026}
}