Distillation-based training for multi-exit architectures

Phuong M, Lampert C. 2019. Distillation-based training for multi-exit architectures. IEEE International Conference on Computer Vision. ICCV: International Conference on Computer Vision vol. 2019–October, 1355–1364.


Conference Paper | Published | English

Scopus indexed
Abstract
Multi-exit architectures, in which a stack of processing layers is interleaved with early output layers, allow the processing of a test example to stop early and thus save computation time and/or energy. In this work, we propose a new training procedure for multi-exit architectures based on the principle of knowledge distillation. The method encourages early exits to mimic later, more accurate exits by matching their output probabilities. Experiments on CIFAR100 and ImageNet show that distillation-based training significantly improves the accuracy of early exits while maintaining state-of-the-art accuracy for late ones. The method is particularly beneficial when training data is limited, and it allows a straightforward extension to semi-supervised learning, i.e., making use of unlabeled data at training time. Moreover, it takes only a few lines to implement and incurs almost no computational overhead at training time, and none at all at test time.
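
As a concrete illustration of the abstract's claim that the method "takes only a few lines to implement", below is a minimal PyTorch-style sketch of the core idea: each early exit is trained to match the softened output probabilities of the final, most accurate exit, alongside the usual cross-entropy loss at every exit. This is a hypothetical reconstruction, not the authors' code; the function name, the temperature, and the mixing weight alpha are illustrative assumptions.

import torch.nn.functional as F

def multi_exit_distillation_loss(exit_logits, labels, temperature=3.0, alpha=0.5):
    # exit_logits: list of [batch, num_classes] tensors, ordered from
    # earliest to latest exit; labels: [batch] integer class indices.

    # Supervised cross-entropy loss, summed over all exits.
    ce = sum(F.cross_entropy(z, labels) for z in exit_logits)

    # Teacher distribution: softened probabilities of the final exit,
    # detached so the distillation term does not backpropagate into it.
    teacher = F.softmax(exit_logits[-1].detach() / temperature, dim=1)

    # Distillation term: KL divergence from each earlier exit to the
    # teacher, scaled by T^2 as is conventional in distillation.
    kd = sum(
        F.kl_div(F.log_softmax(z / temperature, dim=1), teacher,
                 reduction="batchmean") * temperature ** 2
        for z in exit_logits[:-1]
    )

    return (1 - alpha) * ce + alpha * kd
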
Publishing Year
2019
Date Published
2019-10-01
Proceedings Title
IEEE International Conference on Computer Vision
Publisher
IEEE
Volume
2019-October
Page
1355-1364
Conference
ICCV: International Conference on Computer Vision
Conference Location
Seoul, Korea
Conference Date
2019-10-27 – 2019-11-02

Cite this

Phuong M, Lampert C. Distillation-based training for multi-exit architectures. In: IEEE International Conference on Computer Vision. Vol 2019-October. IEEE; 2019:1355-1364. doi:10.1109/ICCV.2019.00144
Phuong, M., & Lampert, C. (2019). Distillation-based training for multi-exit architectures. In IEEE International Conference on Computer Vision (Vol. 2019–October, pp. 1355–1364). Seoul, Korea: IEEE. https://doi.org/10.1109/ICCV.2019.00144
Phuong, Mary, and Christoph Lampert. “Distillation-Based Training for Multi-Exit Architectures.” In IEEE International Conference on Computer Vision, 2019–October:1355–64. IEEE, 2019. https://doi.org/10.1109/ICCV.2019.00144.
M. Phuong and C. Lampert, “Distillation-based training for multi-exit architectures,” in IEEE International Conference on Computer Vision, Seoul, Korea, 2019, vol. 2019–October, pp. 1355–1364.
Phuong M, Lampert C. 2019. Distillation-based training for multi-exit architectures. IEEE International Conference on Computer Vision. ICCV: International Conference on Computer Vision vol. 2019–October, 1355–1364.
Phuong, Mary, and Christoph Lampert. “Distillation-Based Training for Multi-Exit Architectures.” IEEE International Conference on Computer Vision, vol. 2019–October, IEEE, 2019, pp. 1355–64, doi:10.1109/ICCV.2019.00144.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Main File(s)
File Name
main.pdf 735.77 KB
Access Level
OA Open Access
Date Uploaded
2020-02-11
MD5 Checksum
7b77fb5c2d27c4c37a7612ba46a66117


Material in ISTA:
Dissertation containing ISTA record
