arxiv:2603.19456

In-the-Wild Camouflage Attack on Vehicle Detectors through Controllable Image Editing

Published on Mar 19 · Submitted by Xiao Fang on Mar 24

Abstract

AI-generated summary

A novel framework formulates vehicle camouflage attacks as a conditional image-editing problem using ControlNet to generate stealthy adversarial examples with preserved structure and enhanced transferability.

Deep neural networks (DNNs) have achieved remarkable success in computer vision but remain highly vulnerable to adversarial attacks. Among them, camouflage attacks manipulate an object's visible appearance to deceive detectors while remaining stealthy to humans. In this paper, we propose a new framework that formulates vehicle camouflage attacks as a conditional image-editing problem. Specifically, we explore both image-level and scene-level camouflage generation strategies, and fine-tune a ControlNet to synthesize camouflaged vehicles directly on real images. We design a unified objective that jointly enforces vehicle structural fidelity, style consistency, and adversarial effectiveness. Extensive experiments on the COCO and LINZ datasets show that our method achieves significantly stronger attack effectiveness, leading to more than 38% AP50 decrease, while better preserving vehicle structure and improving human-perceived stealthiness compared to existing approaches. Furthermore, our framework generalizes effectively to unseen black-box detectors and exhibits promising transferability to the physical world. Project page is available at https://humansensinglab.github.io/CtrlCamo
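For readers unfamiliar with the metric quoted above: AP50 is average precision computed with a detection counted as correct when its intersection-over-union (IoU) with a ground-truth box is at least 0.5. The snippet below is a minimal sketch of that matching criterion only, not the paper's evaluation code. The box format `(x1, y1, x2, y2)` and the function names `iou` and `is_true_positive` are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2).

    The box format is an assumption for this sketch.
    """
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(detection, ground_truth, threshold=0.5):
    """A detection matches a ground-truth box at the AP50 threshold."""
    return iou(detection, ground_truth) >= threshold
```

A successful camouflage attack lowers AP50 by making the detector miss the vehicle or localize it so poorly that its boxes fall below this threshold.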

Community

Motivation:
How can we generate realistic adversarial camouflage for real-world, in-the-wild vehicles, something that simulation environments cannot achieve?

Contributions:

  1. We formulate camouflage attacks against detectors as a conditional digital image-editing problem.

  2. We propose two camouflage strategies inspired by nature. Given an image containing a vehicle, the image-level strategy blends the vehicle with its surroundings, and the scene-level strategy adapts the vehicle to a common concept present in the scene.

  3. We propose a novel pipeline based on ControlNet fine-tuning. Our method jointly enforces structural fidelity to maintain vehicle geometry, style consistency to produce stealthy camouflage, and an adversarial objective to reduce detectability by object detectors.

  4. We evaluate our approach on the COCO (ground-view) and LINZ (aerial-view) datasets. Results demonstrate:
    (1) Stronger adversarial effectiveness, better preservation of vehicle structure, and improved stealthiness compared to state-of-the-art methods;
    (2) Transferability to black-box detectors, diverse environmental conditions, and the physical world;
    (3) Robustness under preprocessing defense strategies such as denoising and smoothing.
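The joint objective described in contribution 3 can be pictured as a weighted sum of three terms. The sketch below is only an illustration under stated assumptions. This page does not give the actual loss definitions, so each term here is a hypothetical stand-in: a mean squared error between edge maps for structural fidelity, a mean squared error between style statistics for style consistency, and the highest detector confidence for the adversarial term. The weights and all names are likewise assumed.

```python
def mse(xs, ys):
    """Mean squared error between two equal-length sequences of floats."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def joint_objective(edges_orig, edges_edit, style_vehicle, style_ref,
                    det_scores, w_struct=1.0, w_style=1.0, w_adv=1.0):
    """Weighted sum of three hypothetical loss terms (lower is better).

    All three terms are placeholders chosen for this sketch, not the
    losses used in the paper.
    """
    # Structural fidelity: the edited vehicle should keep the original edges.
    l_struct = mse(edges_orig, edges_edit)
    # Style consistency: vehicle style statistics should match the reference
    # (surroundings or scene concept, depending on the strategy).
    l_style = mse(style_vehicle, style_ref)
    # Adversarial term: push down the most confident vehicle detection.
    l_adv = max(det_scores, default=0.0)
    return w_struct * l_struct + w_style * l_style + w_adv * l_adv
```

Under this sketch, an edit that keeps the edges and matches the reference style but is still detected with confidence 0.9 scores 0.9, so only the adversarial term remains to be minimized.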
