This repository offers the official implementation of DiaNA in PyTorch.
In the meantime, check out our related papers if you are interested:
- 【AAAI 2024】 An Empirical Study of CLIP for Text-based Person Search [paper | code]
- 【ACM MM 2023】 Text-based Person Search without Parallel Image-Text Data [paper]
- 【IJCAI 2023】 RaSa: Relation and Sensitivity Aware Representation Learning for Text-based Person Search [paper | code]
- 【ICASSP 2022】 Learning Semantic-Aligned Feature Representation for Text-based Person Search [paper | code]
DiaNA is a novel dialogue-refined cross-modal framework for chat-based person retrieval. It uses two adaptive attribute refiner modules to bottleneck conversational and visual information, enabling fine-grained cross-modal alignment.
- Release code
- Release checkpoints
- Release dataset (ChatPedes)
If you find this paper useful, please consider starring 🌟 this repo and citing 📑 our paper:
@InProceedings{bai2025chat,
author = {Bai, Yang and Ji, Yucheng and Cao, Min and Wang, Jinqiao and Ye, Mang},
title = {Chat-based Person Retrieval via Dialogue-Refined Cross-Modal Alignment},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
month = {June},
year = {2025}
}
This code is distributed under the MIT License.