{"main_file_link":[{"url":"https://doi.org/10.48550/arXiv.2309.00233","open_access":"1"}],"doi":"10.48550/arXiv.2309.00233","user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","year":"2023","status":"public","month":"09","date_created":"2024-02-08T15:34:43Z","oa":1,"author":[{"first_name":"Zixu","full_name":"Zhao, Zixu","last_name":"Zhao"},{"first_name":"Jiaze","last_name":"Wang","full_name":"Wang, Jiaze"},{"first_name":"Max","last_name":"Horn","full_name":"Horn, Max"},{"last_name":"Ding","full_name":"Ding, Yizhuo","first_name":"Yizhuo"},{"first_name":"Tong","last_name":"He","full_name":"He, Tong"},{"first_name":"Zechen","full_name":"Bai, Zechen","last_name":"Bai"},{"first_name":"Dominik","last_name":"Zietlow","full_name":"Zietlow, Dominik"},{"first_name":"Carl-Johann","full_name":"Simon-Gabriel, Carl-Johann","last_name":"Simon-Gabriel"},{"first_name":"Bing","last_name":"Shuai","full_name":"Shuai, Bing"},{"last_name":"Tu","full_name":"Tu, Zhuowen","first_name":"Zhuowen"},{"first_name":"Thomas","full_name":"Brox, Thomas","last_name":"Brox"},{"last_name":"Schiele","full_name":"Schiele, Bernt","first_name":"Bernt"},{"full_name":"Fu, Yanwei","last_name":"Fu","first_name":"Yanwei"},{"id":"26cfd52f-2483-11ee-8040-88983bcc06d4","orcid":"0000-0002-4850-0683","first_name":"Francesco","full_name":"Locatello, Francesco","last_name":"Locatello"},{"first_name":"Zheng","full_name":"Zhang, Zheng","last_name":"Zhang"},{"full_name":"Xiao, Tianjun","last_name":"Xiao","first_name":"Tianjun"}],"citation":{"ista":"Zhao Z, Wang J, Horn M, Ding Y, He T, Bai Z, Zietlow D, Simon-Gabriel C-J, Shuai B, Tu Z, Brox T, Schiele B, Fu Y, Locatello F, Zhang Z, Xiao T. Object-centric multiple object tracking. arXiv, 2309.00233.","apa":"Zhao, Z., Wang, J., Horn, M., Ding, Y., He, T., Bai, Z., … Xiao, T. (n.d.). Object-centric multiple object tracking. arXiv. https://doi.org/10.48550/arXiv.2309.00233","short":"Z. Zhao, J. Wang, M. 
Horn, Y. Ding, T. He, Z. Bai, D. Zietlow, C.-J. Simon-Gabriel, B. Shuai, Z. Tu, T. Brox, B. Schiele, Y. Fu, F. Locatello, Z. Zhang, T. Xiao, ArXiv (n.d.).","mla":"Zhao, Zixu, et al. “Object-Centric Multiple Object Tracking.” ArXiv, 2309.00233, doi:10.48550/arXiv.2309.00233.","chicago":"Zhao, Zixu, Jiaze Wang, Max Horn, Yizhuo Ding, Tong He, Zechen Bai, Dominik Zietlow, et al. “Object-Centric Multiple Object Tracking.” ArXiv, n.d. https://doi.org/10.48550/arXiv.2309.00233.","ieee":"Z. Zhao et al., “Object-centric multiple object tracking,” arXiv.","ama":"Zhao Z, Wang J, Horn M, et al. Object-centric multiple object tracking. arXiv. doi:10.48550/arXiv.2309.00233"},"article_processing_charge":"No","publication_status":"submitted","_id":"14963","publication":"arXiv","oa_version":"Preprint","external_id":{"arxiv":["2309.00233"]},"date_published":"2023-09-01T00:00:00Z","title":"Object-centric multiple object tracking","abstract":[{"lang":"eng","text":"Unsupervised object-centric learning methods allow the partitioning of scenes into entities without additional localization information and are excellent candidates for reducing the annotation burden of multiple-object tracking (MOT) pipelines. Unfortunately, they lack two key properties: objects are often split into parts and are not consistently tracked over time. In fact, state-of-the-art models achieve pixel-level accuracy and temporal consistency by relying on supervised object detection with additional ID labels for the association through time. This paper proposes a video object-centric model for MOT. It consists of an index-merge module that adapts the object-centric slots into detection outputs and an object memory module that builds complete object prototypes to handle occlusions. Benefited from object-centric learning, we only require sparse detection labels (0%-6.25%) for object localization and feature binding. 
Relying on our self-supervised Expectation-Maximization-inspired loss for object association, our approach requires no ID labels. Our experiments significantly narrow the gap between the existing object-centric model and the fully supervised state-of-the-art and outperform several unsupervised trackers."}],"type":"preprint","article_number":"2309.00233","date_updated":"2024-02-12T10:16:21Z","extern":"1","language":[{"iso":"eng"}],"department":[{"_id":"FrLo"}],"day":"01"}