The unreasonable effectiveness of fully-connected layers for low-data regimes

Download
OA NeurIPS-2022-the-unreasonable-effectiveness-of-fully-connected-layers-for-low-data-regimes-Paper-Conference.pdf 444.82 KB [Published Version]
Conference Paper | Published | English

Scopus indexed
Author
Kocsis, Peter; Súkeník, Peter (ISTA); Brasó, Guillem; Niessner, Matthias; Leal-Taixé, Laura; Elezi, Ismail
Series Title
Advances in Neural Information Processing Systems
Abstract
Convolutional neural networks were the standard for solving many computer vision tasks until recently, when Transformers or MLP-based architectures started to show competitive performance. These architectures typically have a vast number of weights and need to be trained on massive datasets; hence, they are not suitable for low-data regimes. In this work, we propose a simple yet effective framework to improve generalization from small amounts of data. We augment modern CNNs with fully-connected (FC) layers and show the massive impact this architectural change has in low-data regimes. We further present an online joint knowledge-distillation method to utilize the extra FC layers at train time but avoid them at test time. This allows us to improve the generalization of a CNN-based model without any increase in the number of weights at test time. We perform classification experiments for a large range of network backbones and several standard datasets on supervised learning and active learning. Our models significantly outperform the networks without fully-connected layers, reaching a relative improvement of up to 16% in validation accuracy in the supervised setting without adding any extra parameters during inference.
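The abstract does not spell out the distillation objective, so the following is only a minimal sketch of one common form of online joint distillation, not necessarily the authors' formulation. It assumes two classifier heads on a shared backbone: a head with extra FC layers (used only during training) and a plain head (kept at test time). Both heads are trained with cross-entropy, and the plain head is additionally pulled toward the FC head's softened predictions. The function names, the temperature and the weight alpha are hypothetical choices made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of a single example against an integer class label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def joint_distillation_loss(fc_logits, plain_logits, label,
                            temperature=2.0, alpha=0.5):
    """Hypothetical joint loss for one training example.

    fc_logits:    output of the head with extra FC layers (train time only)
    plain_logits: output of the plain head (the one kept at test time)
    """
    # Both heads learn from the ground-truth label.
    ce_fc = cross_entropy(fc_logits, label)
    ce_plain = cross_entropy(plain_logits, label)
    # The plain head also matches the FC head's softened predictions.
    teacher = softmax(fc_logits, temperature)
    student = softmax(plain_logits, temperature)
    distill = kl_divergence(teacher, student)
    return ce_fc + ce_plain + alpha * distill


if __name__ == "__main__":
    # Heads that disagree incur an extra distillation penalty.
    print(joint_distillation_loss([2.0, 0.0, 0.0], [0.0, 2.0, 0.0], label=0))
```

In this sketch, discarding the FC head after training leaves only the backbone and the plain head, which is consistent with the abstract's claim that no parameters are added at inference time.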
Publishing Year
2022
Date Published
2022-12-01
Proceedings Title
36th Conference on Neural Information Processing Systems
Publisher
Neural Information Processing Systems Foundation
Acknowledgement
This work was supported by a Sofja Kovalevskaja Award, a postdoc fellowship from the Humboldt Foundation, the ERC Starting Grant Scan2CAD (804724), and the German Research Foundation (DFG) Research Unit "Learning and Simulation in Visual Computing".
Volume
35
Page
1896-1908
Conference
NeurIPS: Neural Information Processing Systems
Conference Location
New Orleans, LA, United States
Conference Date
2022-11-28 – 2022-12-09
ISSN
IST-REx-ID
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Main File(s)
Access Level
OA Open Access
Date Uploaded
2025-01-24
MD5 Checksum
2a14e59ef8b34d9a1a27a7adbc6f83ff

Sources

arXiv 2210.05657