Distillation-based training for multi-exit architectures

Bui Thi Mai, Phuong; Lampert, Christoph

Phuong M, Lampert C. 2019. Distillation-based training for multi-exit architectures. IEEE International Conference on Computer Vision. ICCV: International Conference on Computer Vision vol. 2019–October, 1355–1364.

Download

main.pdf 735.77 KB [Submitted Version]

DOI

10.1109/ICCV.2019.00144

Conference Paper | Published | English

Scopus indexed

Author

Phuong, Mary (ISTA); Lampert, Christoph (ISTA)

Department

Lampert Group

Grant

Lifelong Learning of Visual Scene Understanding

Abstract

Multi-exit architectures, in which a stack of processing layers is interleaved with early output layers, allow the processing of a test example to stop early and thus save computation time and/or energy. In this work, we propose a new training procedure for multi-exit architectures based on the principle of knowledge distillation. The method encourages early exits to mimic later, more accurate exits by matching their output probabilities. Experiments on CIFAR100 and ImageNet show that distillation-based training significantly improves the accuracy of early exits while maintaining state-of-the-art accuracy for late ones. The method is particularly beneficial when training data is limited, and it allows a straightforward extension to semi-supervised learning, i.e., making use of unlabeled data at training time. Moreover, it takes only a few lines to implement and incurs almost no computational overhead at training time, and none at all at test time.
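
The idea of early exits matching the output probabilities of later exits can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of the last exit as the only teacher, the temperature of 4.0, and the mixing weight `alpha` are all assumptions made for illustration, and the loss is written for a single example in plain Python rather than in a deep-learning framework.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]


def cross_entropy(target_probs, logits, temperature=1.0):
    """Cross-entropy between a target distribution and softmax(logits / T)."""
    log_probs = log_softmax(logits, temperature)
    return -sum(t * lp for t, lp in zip(target_probs, log_probs))


def multi_exit_distillation_loss(exit_logits, label, temperature=4.0, alpha=0.5):
    """Hypothetical per-example loss for a multi-exit network.

    exit_logits: list of per-exit logit lists, ordered from earliest to latest.
    label: index of the ground-truth class.
    temperature, alpha: assumed hyperparameters, not taken from the paper.
    """
    num_classes = len(exit_logits[-1])
    one_hot = [1.0 if k == label else 0.0 for k in range(num_classes)]

    # Supervised term: every exit is trained on the ground-truth label.
    supervised = sum(cross_entropy(one_hot, logits) for logits in exit_logits)

    # Teacher: softened probabilities of the last exit. In a real framework
    # these would be treated as constants (no gradient through the teacher).
    teacher = [math.exp(lp) for lp in log_softmax(exit_logits[-1], temperature)]

    # Distillation term: each earlier exit matches the teacher's probabilities.
    distill = sum(
        cross_entropy(teacher, logits, temperature) for logits in exit_logits[:-1]
    )
    return (1.0 - alpha) * supervised + alpha * distill
```

Under these assumptions, an early exit whose logits agree with the last exit receives a lower loss than one whose logits contradict it. Because the distillation term needs no labels, it could in principle also be evaluated on unlabeled examples by dropping the supervised term.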

Publishing Year

2019

Date Published

2019-10-01

Proceedings Title

IEEE International Conference on Computer Vision

Publisher

IEEE

Volume

2019-October

Page

1355-1364

Conference

ICCV: International Conference on Computer Vision

Conference Location

Seoul, Korea

Conference Date

2019-10-27 – 2019-11-02

ISBN

9781728148038

ISSN

1550-5499

IST-REx-ID

7479

Cite this

Phuong M, Lampert C. Distillation-based training for multi-exit architectures. In: IEEE International Conference on Computer Vision. Vol 2019-October. IEEE; 2019:1355-1364. doi:10.1109/ICCV.2019.00144

Phuong, M., & Lampert, C. (2019). Distillation-based training for multi-exit architectures. In IEEE International Conference on Computer Vision (Vol. 2019–October, pp. 1355–1364). Seoul, Korea: IEEE. https://doi.org/10.1109/ICCV.2019.00144

Phuong, Mary, and Christoph Lampert. “Distillation-Based Training for Multi-Exit Architectures.” In IEEE International Conference on Computer Vision, 2019–October:1355–64. IEEE, 2019. https://doi.org/10.1109/ICCV.2019.00144.

M. Phuong and C. Lampert, “Distillation-based training for multi-exit architectures,” in IEEE International Conference on Computer Vision, Seoul, Korea, 2019, vol. 2019–October, pp. 1355–1364.

Phuong, Mary, and Christoph Lampert. “Distillation-Based Training for Multi-Exit Architectures.” IEEE International Conference on Computer Vision, vol. 2019–October, IEEE, 2019, pp. 1355–64, doi:10.1109/ICCV.2019.00144.
