Underspecification in deep learning
Phuong M. 2021. Underspecification in deep learning. Institute of Science and Technology Austria.
Thesis | PhD | Published | English
Author
Mary Phuong
Supervisor
Corresponding author has ISTA affiliation
Department
Series Title
ISTA Thesis
Abstract
Deep learning is best known for its empirical success across a wide range of applications
spanning computer vision, natural language processing and speech. Of equal significance,
though perhaps less known, are its ramifications for learning theory: deep networks have
been observed to perform surprisingly well in the high-capacity regime, also known as the overfitting
or underspecified regime. Classically, this regime on the far right of the bias-variance curve
is associated with poor generalisation; however, recent experiments with deep networks
challenge this view.
This thesis is devoted to investigating various aspects of underspecification in deep learning.
First, we argue that deep learning models are underspecified on two levels: a) any given
training dataset can be fit by many different functions, and b) any given function can be
expressed by many different parameter configurations. We refer to the second kind of
underspecification as parameterisation redundancy and we precisely characterise its extent.
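As a minimal illustration of parameterisation redundancy (a NumPy sketch with made-up weights, not code from the thesis): in a one-hidden-layer ReLU network, scaling a hidden unit's incoming weights by any c > 0 and its outgoing weights by 1/c gives a different parameter configuration that computes exactly the same function.

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)

# A one-hidden-layer ReLU network: f(x) = W2 @ relu(W1 @ x)
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))

def f(x, W1, W2):
    return W2 @ relu(W1 @ x)

# Rescale hidden unit 0: multiply its incoming weights by c > 0 and its
# outgoing weights by 1/c. ReLU's positive homogeneity,
# relu(c * z) = c * relu(z) for c > 0, keeps the function unchanged.
c = 3.7
W1s, W2s = W1.copy(), W2.copy()
W1s[0, :] *= c
W2s[:, 0] /= c

x = rng.normal(size=3)
print(np.allclose(f(x, W1, W2), f(x, W1s, W2s)))  # True
```

Since c ranges over all positive reals (per hidden unit), a single function already corresponds to a continuum of parameter configurations.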
Second, we characterise the implicit criteria (the inductive bias) that guide learning in the
underspecified regime. Specifically, we consider a nonlinear but tractable classification
setting, and show that, given the choice, neural networks learn classifiers with a large margin.
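The linear analogue of this implicit bias can be seen in a deterministic toy sketch (illustrative data, not the thesis's setting): gradient descent on the logistic loss over separable data selects, among the many separators that fit, one whose direction drifts toward the maximum-margin separator.

```python
import numpy as np

# Margin of a linear classifier w on data (X, y), y_i in {-1, +1}:
# the smallest signed distance of a training point to the boundary.
def margin(w, X, y):
    return np.min(y * (X @ w)) / np.linalg.norm(w)

# Tiny separable toy problem (hypothetical data, for illustration only).
X = np.array([[1.0, 1.0], [2.0, 0.5], [-1.0, -1.0], [-2.0, -0.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# Many separators fit this data; e.g. w = (1, 0) has margin 1.0.
print(margin(np.array([1.0, 0.0]), X, y))  # 1.0

# Gradient descent on the logistic loss picks among them implicitly.
w = np.zeros(2)
for _ in range(10_000):
    s = 1.0 / (1.0 + np.exp(y * (X @ w)))   # sigmoid(-y_i * w.x_i)
    w += 0.1 * (X.T @ (s * y)) / len(X)

print(margin(w, X, y))  # noticeably larger than 1.0
```

Here the learned direction approaches the maximum-margin separator for this dataset (proportional to (1, 1), margin √2), even though nothing in the loss mentions margins explicitly.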
Third, we consider learning scenarios where the inductive bias is not by itself sufficient to
deal with underspecification. We then study different ways of ‘tightening the specification’: i)
In the setting of representation learning with variational autoencoders, we propose a hand-crafted regulariser based on mutual information. ii) In the setting of binary classification, we
consider soft-label (real-valued) supervision. We derive a generalisation bound for linear
networks supervised in this way and verify that soft labels facilitate fast learning. Finally, we
explore an application of soft-label supervision to the training of multi-exit models.
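The intuition behind soft-label supervision can be sketched in a toy logistic-regression setup (hypothetical data and teacher, not the thesis's linear-network analysis): a real-valued label carries the class probability itself, not just its sign, so the student can recover the teacher's direction from the same number of examples.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Hypothetical setup: a "teacher" linear model produces soft labels
# (class probabilities); hard labels keep only the thresholded sign.
X = rng.normal(size=(200, 5))
w_teacher = rng.normal(size=5)
p_soft = sigmoid(X @ w_teacher)           # real-valued supervision in (0, 1)
y_hard = (p_soft > 0.5).astype(float)     # conventional 0/1 supervision

def fit_logistic(targets, steps=500, lr=0.5):
    """Gradient descent on the cross-entropy loss against `targets`."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * X.T @ (sigmoid(X @ w) - targets) / len(X)
    return w

w_soft = fit_logistic(p_soft)
w_hard = fit_logistic(y_hard)

# Alignment with the teacher's direction (cosine similarity).
cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(cos(w_soft, w_teacher), cos(w_hard, w_teacher))
```

With soft targets the cross-entropy loss is minimised exactly at the teacher's weights, whereas hard labels determine the solution only up to whatever the inductive bias selects among the many consistent separators.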
Publishing Year
2021
Date Published
2021-05-30
Publisher
Institute of Science and Technology Austria
Acknowledged SSUs
Page
125
ISSN
IST-REx-ID
Cite this
Phuong M. Underspecification in deep learning. 2021. doi:10.15479/AT:ISTA:9418
Phuong, M. (2021). Underspecification in deep learning. Institute of Science and Technology Austria. https://doi.org/10.15479/AT:ISTA:9418
Phuong, Mary. “Underspecification in Deep Learning.” Institute of Science and Technology Austria, 2021. https://doi.org/10.15479/AT:ISTA:9418.
M. Phuong, “Underspecification in deep learning,” Institute of Science and Technology Austria, 2021.
Phuong M. 2021. Underspecification in deep learning. Institute of Science and Technology Austria.
Phuong, Mary. Underspecification in Deep Learning. Institute of Science and Technology Austria, 2021, doi:10.15479/AT:ISTA:9418.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Main File(s)
File Name
mph-thesis-v519-pdfimages.pdf
2.67 MB
Access Level
Open Access
Date Uploaded
2021-05-24
MD5 Checksum
4f0abe64114cfed264f9d36e8d1197e3
Source File
File Name
thesis.zip
93.00 MB
Access Level
Closed Access
Date Uploaded
2021-05-24
MD5 Checksum
f5699e876bc770a9b0df8345a77720a2
Material in ISTA:
Part of this Dissertation