Spectral Tempering for Embedding Compression in Dense Passage Retrieval
Abstract
Dimensionality reduction is critical for deploying dense retrieval systems at scale, yet mainstream post-hoc methods face a fundamental trade-off: principal component analysis (PCA) preserves dominant variance but underutilizes representational capacity, while whitening enforces isotropy at the cost of amplifying noise in the heavy-tailed eigenspectrum of retrieval embeddings. Intermediate spectral scaling methods unify these extremes by reweighting dimensions with a power coefficient γ, but treat γ as a fixed hyperparameter that requires task-specific tuning. We show that the optimal scaling strength γ is not a global constant: it varies systematically with target dimensionality k and is governed by the signal-to-noise ratio (SNR) of the retained subspace. Based on this insight, we propose Spectral Tempering (SpecTemp), a learning-free method that derives an adaptive γ(k) directly from the corpus eigenspectrum using local SNR analysis and knee-point normalization, requiring no labeled data or validation-based search. Extensive experiments demonstrate that Spectral Tempering consistently achieves near-oracle performance relative to grid-searched γ^*(k) while remaining fully learning-free and model-agnostic. Our code is publicly available at https://anonymous.4open.science/r/SpecTemp-0D37.
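The abstract frames PCA and whitening as the two endpoints of a single family of spectral reweightings controlled by a power coefficient γ. As a rough illustration of that family (not the paper's SpecTemp method, whose adaptive γ(k) is derived from SNR and knee-point analysis), the sketch below projects embeddings onto the top-k eigenvectors of the corpus covariance and rescales each coordinate by λ_i^(−γ); under this convention γ = 0 recovers plain PCA and γ = 0.5 recovers whitening. The function name and the exact scaling convention are illustrative assumptions:

```python
import numpy as np

def spectral_scale(X, k, gamma):
    """Project rows of X onto the top-k principal directions, then
    rescale coordinate i by eigenvalue_i ** (-gamma).

    gamma = 0.0 -> plain PCA projection (dominant variance preserved).
    gamma = 0.5 -> whitening (isotropic output covariance).
    Intermediate gamma interpolates between the two extremes.
    NOTE: illustrative convention only; SpecTemp's gamma(k) is
    derived adaptively from the eigenspectrum, not fixed like here.
    """
    Xc = X - X.mean(axis=0)                 # center the embeddings
    cov = Xc.T @ Xc / len(X)                # corpus covariance
    vals, vecs = np.linalg.eigh(cov)        # ascending eigenvalues
    order = np.argsort(vals)[::-1][:k]      # keep the top-k directions
    vals, vecs = vals[order], vecs[:, order]
    Z = Xc @ vecs                           # PCA coordinates
    return Z * vals ** (-gamma)             # spectral reweighting
```

With γ = 0.5 the output covariance of the reduced embeddings is (up to numerical error) the identity, which is exactly the isotropy that whitening enforces and that, per the abstract, amplifies noise in the heavy-tailed eigenspectrum of retrieval embeddings.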