Adaptive slot attention: Object discovery with dynamic slot number
Fan K, Bai Z, Xiao T, He T, Horn M, Fu Y, Locatello F, Zhang Z. 2024. Adaptive slot attention: Object discovery with dynamic slot number. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. CVPR: Conference on Computer Vision and Pattern Recognition.
Conference Paper | Published | English
Author
Fan, Ke;
Bai, Zechen;
Xiao, Tianjun;
He, Tong;
Horn, Max;
Fu, Yanwei;
Locatello, Francesco (ISTA);
Zhang, Zheng

Abstract
Object-centric learning (OCL) extracts the representation of objects with slots, offering an exceptional blend of flexibility and interpretability for abstracting low-level perceptual features. A widely adopted method within OCL is slot attention, which utilizes attention mechanisms to iteratively refine slot representations. However, a major drawback of most object-centric models, including slot attention, is their reliance on predefining the number of slots. This not only necessitates prior knowledge of the dataset but also overlooks the inherent variability in the number of objects present in each instance. To overcome this fundamental limitation, we present a novel complexity-aware object auto-encoder framework. Within this framework, we introduce an adaptive slot attention (AdaSlot) mechanism that dynamically determines the optimal number of slots based on the content of the data. This is achieved by proposing a discrete slot sampling module that is responsible for selecting an appropriate number of slots from a candidate list. Furthermore, we introduce a masked slot decoder that suppresses unselected slots during the decoding process. Our framework, tested extensively on object discovery tasks with various datasets, shows performance matching or exceeding top fixed-slot models. Moreover, our analysis substantiates that our method exhibits the capability to dynamically adapt the slot number according to each instance's complexity, offering the potential for further exploration in slot attention research. Project will be available at https://kfan21.github.io/AdaSlot/
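The two components named in the abstract, selecting slots from a candidate list and suppressing unselected slots during decoding, can be illustrated with a small sketch. This is not the authors' implementation. The function names, the per-slot "keep" logits, the fixed threshold, and the per-pixel softmax mixing are all assumptions made for illustration, and the abstract does not specify how the discrete sampling is performed or trained.

```python
import math


def select_slots(keep_logits, threshold=0.5):
    """Hypothetical selection step: return a 0/1 mask over candidate slots.

    Assumption: each candidate slot has a scalar "keep" logit. A slot is kept
    when its sigmoid probability exceeds `threshold`. This deterministic rule
    stands in for the paper's discrete slot sampling module.
    """
    probs = [1.0 / (1.0 + math.exp(-z)) for z in keep_logits]
    mask = [1 if p > threshold else 0 for p in probs]
    if not any(mask):
        # Always keep at least one slot so decoding stays well defined.
        mask[probs.index(max(probs))] = 1
    return mask


def masked_mixture_weights(alpha_logits, mask):
    """Hypothetical masked decoding step.

    alpha_logits[k][p] is the unnormalised mixing logit of slot k at pixel p.
    The softmax over slots is taken only across kept slots, so dropped slots
    contribute exactly zero weight to every pixel.
    """
    n_slots = len(alpha_logits)
    n_pixels = len(alpha_logits[0])
    kept = [k for k in range(n_slots) if mask[k]]
    weights = [[0.0] * n_pixels for _ in range(n_slots)]
    for p in range(n_pixels):
        top = max(alpha_logits[k][p] for k in kept)  # for numerical stability
        exps = {k: math.exp(alpha_logits[k][p] - top) for k in kept}
        total = sum(exps.values())
        for k in kept:
            weights[k][p] = exps[k] / total
    return weights


# Four candidate slots, two pixels (toy values).
mask = select_slots([2.0, -3.0, 0.5, -1.0])
weights = masked_mixture_weights(
    [[0.0, 1.0], [5.0, 5.0], [0.0, -1.0], [2.0, 2.0]], mask
)
print(mask)
print(weights)
```

In this toy run, the second and fourth candidates are dropped, and their large mixing logits have no effect on the reconstruction weights. The remaining weights still sum to one at every pixel.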
Publishing Year
2024
Date Published
2024-06-15
Proceedings Title
2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Publisher
IEEE
Acknowledgement
Yanwei Fu is the corresponding author. Yanwei Fu is with the School of Data Science, Fudan University, the Shanghai Key Lab of Intelligent Information Processing, Fudan University, and the Fudan ISTBI-ZJNU Algorithm Centre for Brain-inspired Intelligence, Zhejiang Normal University, Jinhua, China.
Conference
CVPR: Conference on Computer Vision and Pattern Recognition
Conference Location
Seattle, WA, United States
Conference Date
2024-06-16 – 2024-06-22
IST-REx-ID
Cite this
Fan K, Bai Z, Xiao T, et al. Adaptive slot attention: Object discovery with dynamic slot number. In: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE; 2024. doi:10.1109/cvpr52733.2024.02176
Fan, K., Bai, Z., Xiao, T., He, T., Horn, M., Fu, Y., … Zhang, Z. (2024). Adaptive slot attention: Object discovery with dynamic slot number. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, WA, United States: IEEE. https://doi.org/10.1109/cvpr52733.2024.02176
Fan, Ke, Zechen Bai, Tianjun Xiao, Tong He, Max Horn, Yanwei Fu, Francesco Locatello, and Zheng Zhang. “Adaptive Slot Attention: Object Discovery with Dynamic Slot Number.” In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2024. https://doi.org/10.1109/cvpr52733.2024.02176.
K. Fan et al., “Adaptive slot attention: Object discovery with dynamic slot number,” in 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, United States, 2024.
Fan K, Bai Z, Xiao T, He T, Horn M, Fu Y, Locatello F, Zhang Z. 2024. Adaptive slot attention: Object discovery with dynamic slot number. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. CVPR: Conference on Computer Vision and Pattern Recognition.
Fan, Ke, et al. “Adaptive Slot Attention: Object Discovery with Dynamic Slot Number.” 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, 2024, doi:10.1109/cvpr52733.2024.02176.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Sources
arXiv 2406.09196