Deep fused two-step cross-modal hashing with multiple semantic supervision
Kang P, Lin Z, Yang Z, Bronstein AM, Li Q, Liu W. 2022. Deep fused two-step cross-modal hashing with multiple semantic supervision. Multimedia Tools and Applications. 81(11), 15653–15670.
Journal Article | Published | English
Scopus indexed
Author
Kang, Peipei;
Lin, Zehang;
Yang, Zhenguo;
Bronstein, Alex M. (ISTA);
Li, Qing;
Liu, Wenyin
Abstract
Existing cross-modal hashing methods ignore the informative multimodal joint information and cannot fully exploit the semantic labels. In this paper, we propose a deep fused two-step cross-modal hashing (DFTH) framework with multiple semantic supervision. In the first step, DFTH learns unified hash codes for instances by a fusion network. Semantic label and similarity reconstruction have been introduced to acquire binary codes that are informative, discriminative and semantic similarity preserving. In the second step, two modality-specific hash networks are learned under the supervision of common hash codes reconstruction, label reconstruction, and intra-modal and inter-modal semantic similarity reconstruction. The modality-specific hash networks can generate semantic preserving binary codes for out-of-sample queries. To deal with the vanishing gradients of binarization, continuous differentiable tanh is introduced to approximate the discrete sign function, making the networks able to back-propagate by automatic gradient computation. Extensive experiments on MIRFlickr25K and NUS-WIDE show the superiority of DFTH over state-of-the-art methods.
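The tanh relaxation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the scaling parameter `beta`, and the convention of mapping zero to +1 are assumptions made here for illustration only. The sketch shows why the substitution helps: the sign function has zero gradient almost everywhere, whereas tanh has a nonzero derivative that can be back-propagated.

```python
import math

def relaxed_sign(x, beta=1.0):
    """Smooth surrogate for sign(x): tanh(beta * x).

    beta is a hypothetical sharpness parameter (not taken from the paper);
    larger values push the output closer to -1 or +1.
    """
    return math.tanh(beta * x)

def relaxed_sign_grad(x, beta=1.0):
    """Derivative of tanh(beta * x) with respect to x: beta * (1 - tanh^2)."""
    t = math.tanh(beta * x)
    return beta * (1.0 - t * t)

def binarize(x):
    """Hard sign used to emit final codes; zero maps to +1 (an assumption)."""
    return 1 if x >= 0 else -1

# Hypothetical real-valued network outputs for a 4-bit code.
activations = [-2.0, -0.3, 0.4, 1.5]

relaxed = [relaxed_sign(a, beta=5.0) for a in activations]  # used in training
grads = [relaxed_sign_grad(a, beta=5.0) for a in activations]
codes = [binarize(a) for a in activations]                  # used at retrieval

print(relaxed)
print(grads)
print(codes)
```

In this sketch the relaxed values would feed the training loss, while the hard codes would be emitted only when indexing or querying; whether DFTH follows exactly this split is not stated in the abstract.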
Publishing Year
2022
Date Published
2022-05-01
Journal Title
Multimedia Tools and Applications
Publisher
Springer Nature
Volume
81
Issue
11
Page
15653-15670
Cite this
Kang P, Lin Z, Yang Z, Bronstein AM, Li Q, Liu W. Deep fused two-step cross-modal hashing with multiple semantic supervision. Multimedia Tools and Applications. 2022;81(11):15653-15670. doi:10.1007/s11042-022-12187-6
Kang, P., Lin, Z., Yang, Z., Bronstein, A. M., Li, Q., & Liu, W. (2022). Deep fused two-step cross-modal hashing with multiple semantic supervision. Multimedia Tools and Applications. Springer Nature. https://doi.org/10.1007/s11042-022-12187-6
Kang, Peipei, Zehang Lin, Zhenguo Yang, Alex M. Bronstein, Qing Li, and Wenyin Liu. “Deep Fused Two-Step Cross-Modal Hashing with Multiple Semantic Supervision.” Multimedia Tools and Applications. Springer Nature, 2022. https://doi.org/10.1007/s11042-022-12187-6.
P. Kang, Z. Lin, Z. Yang, A. M. Bronstein, Q. Li, and W. Liu, “Deep fused two-step cross-modal hashing with multiple semantic supervision,” Multimedia Tools and Applications, vol. 81, no. 11. Springer Nature, pp. 15653–15670, 2022.
Kang, Peipei, et al. “Deep Fused Two-Step Cross-Modal Hashing with Multiple Semantic Supervision.” Multimedia Tools and Applications, vol. 81, no. 11, Springer Nature, 2022, pp. 15653–70, doi:10.1007/s11042-022-12187-6.