Adapting with an Open Mind: Leveraging Open-Vocabulary Detectors for Closed Set Source-Free Domain Adaptive Object Detection

Kaustubh R Borgavi, Sarvesh Shashikumar, Chetan Arora

CVPR 2026

Abstract

Source-Free Domain Adaptive Object Detection (SFDAOD) aims to adapt a detector trained on a labeled source domain to an unlabeled target domain without accessing source data. We identify a key complementarity: Open-Vocabulary Object Detectors (OVODs) offer strong domain generalization but weaker source-domain accuracy, whereas Closed-Set Object Detectors (CSODs) exhibit the opposite behavior.

Existing SFDAOD methods typically employ a Mean-Teacher (MT) framework, where pseudo-labels generated by a source-pretrained teacher model guide a student model, and the teacher is in turn updated via an Exponential Moving Average (EMA) of the student's weights. We observe that these pseudo-labels frequently carry a source-domain bias, which impedes effective target-domain adaptation. To address this, we propose a Dual-Branch Distillation framework that augments the student with two post-decoder projection heads: a main branch distilling from the source-pretrained teacher and an auxiliary branch distilling from an OVOD "anchor" model. The proposed design enables joint learning of domain-invariant features from the OVOD and domain-specific representations from the CSOD.
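The teacher update in a Mean-Teacher setup can be illustrated with a minimal sketch. It assumes parameters are stored as plain lists of floats keyed by name; the function name `ema_update` and the decay value are hypothetical choices for illustration, not taken from the paper.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def ema_update(teacher: Params, student: Params, decay: float = 0.999) -> None:
    """Update the teacher in place: teacher <- decay * teacher + (1 - decay) * student.

    `decay` is an assumed hyperparameter; the paper page does not state its value.
    """
    for name, t_values in teacher.items():
        s_values = student[name]
        for i, s in enumerate(s_values):
            t_values[i] = decay * t_values[i] + (1.0 - decay) * s


# Toy example: a single parameter tensor flattened to two values.
teacher = {"w": [1.0, 2.0]}
student = {"w": [0.0, 4.0]}
ema_update(teacher, student, decay=0.9)
```

With a decay close to 1, the teacher changes slowly, so its pseudo-labels stay more stable than the student's raw predictions.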

Extensive experiments demonstrate consistent gains over prior state-of-the-art (SOTA), achieving +14.4% on Sim10k → Cityscapes, +11.5% on Kitti → Cityscapes, +4.7% on Cityscapes → Foggy-Cityscapes and +7.9% on Cityscapes → BDD100k.

Our model architecture consists of a Mean-Teacher (MT) framework in which the student model contains a dual-branch decoder, providing two sets of predictions for a given image. The CSOD teacher model is initialized with the same weights as the student, and its weights are updated indirectly via Exponential Moving Average (EMA). We use GDINO [41] as our OVOD anchor model, which distills domain-invariant representations to the student through the auxiliary branch, whereas the CSOD teacher preserves the transferable, domain-specific features in the student through the main branch.
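The two-branch idea can be sketched roughly as follows: shared decoder features feed two projection heads, the main head is matched against the teacher's output, and the auxiliary head is matched against the anchor's output. Everything specific here is an assumption made for illustration, including the bias-free linear heads, the L1 stand-in loss, the `aux_weight` term, and all function names; the page does not specify the actual losses or weighting.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = Sequence[Sequence[float]]


def linear(features: Sequence[float], weights: Matrix) -> Vector:
    """Apply a bias-free linear projection head to one decoder feature vector."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def l1(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error, used only as a stand-in distillation loss."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def dual_branch_loss(
    features: Sequence[float],
    main_head: Matrix,
    aux_head: Matrix,
    teacher_target: Sequence[float],
    anchor_target: Sequence[float],
    aux_weight: float = 1.0,  # hypothetical weighting, not from the paper
) -> float:
    """Combine the main-branch (teacher) and auxiliary-branch (anchor) losses."""
    main_pred = linear(features, main_head)  # distills from the CSOD teacher
    aux_pred = linear(features, aux_head)    # distills from the OVOD anchor
    return l1(main_pred, teacher_target) + aux_weight * l1(aux_pred, anchor_target)


# Toy example with 2-dimensional features and outputs.
features = [1.0, 2.0]
main_head = [[1.0, 0.0], [0.0, 1.0]]
aux_head = [[0.5, 0.5], [1.0, -1.0]]
loss = dual_branch_loss(
    features, main_head, aux_head,
    teacher_target=[1.0, 1.0],
    anchor_target=[1.5, 0.0],
    aux_weight=0.5,
)
```

Because both heads read the same features, gradients from both targets would shape the shared representation, while each head keeps its own output.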

Citation

@InProceedings{Borgavi_2026_CVPR,
    author    = {Borgavi, Kaustubh R and Shashikumar, Sarvesh and Arora, Chetan},
    title     = {Adapting with an Open Mind: Leveraging Open-Vocabulary Detectors for Closed Set Source-Free Domain Adaptive Object Detection},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings},
    month     = {June},
    year      = {2026},
    pages     = {6570--6581}
}