Neural attentive circuits
Rahaman N, Weiss M, Locatello F, Pal C, Bengio Y, Schölkopf B, Li LE, Ballas N. 2022. Neural attentive circuits. 36th Conference on Neural Information Processing Systems. NeurIPS: Neural Information Processing Systems, Advances in Neural Information Processing Systems, vol. 35.
https://doi.org/10.48550/arXiv.2210.08031
Conference Paper (Preprint) | Published | English
Author
Rahaman, Nasim;
Weiss, Martin;
Locatello, Francesco (ISTA);
Pal, Chris;
Bengio, Yoshua;
Schölkopf, Bernhard;
Li, Li Erran;
Ballas, Nicolas
Series Title
Advances in Neural Information Processing Systems
Abstract
Recent work has seen the development of general-purpose neural architectures
that can be trained to perform tasks across diverse data modalities.
General-purpose models typically make few assumptions about the underlying
data structure and are known to perform well in the large-data regime. At the
same time, there has been growing interest in modular neural architectures that
represent the data using sparsely interacting modules. These models can be more
robust out-of-distribution, computationally efficient, and capable of
sample-efficient adaptation to new data. However, they tend to make
domain-specific assumptions about the data and present challenges in how
module behavior (i.e., parameterization) and connectivity (i.e., layout)
can be jointly learned. In this work, we introduce a general-purpose yet
modular neural architecture called Neural Attentive Circuits (NACs) that
jointly learns the parameterization and a sparse connectivity of neural modules
without using domain knowledge. NACs are best understood as the combination of
two systems that are jointly trained end-to-end: one that determines the module
configuration and another that executes it on an input. We demonstrate
qualitatively that NACs learn diverse and meaningful module configurations on
the NLVR2 dataset without additional supervision. Quantitatively, we show that
by incorporating modularity in this way, NACs improve upon a strong non-modular
baseline in terms of low-shot adaptation on the CIFAR and CUB datasets by about
10%, and OOD robustness on Tiny ImageNet-R by about 2.5%. Further, we find that
NACs can achieve an 8x speedup at inference time while losing less than 3%
performance. Finally, we find that NACs yield competitive results on diverse
data modalities spanning point-cloud classification, symbolic processing, and
text classification from ASCII bytes, thereby confirming their general-purpose
nature.
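The two-system view described in the abstract can be sketched in a highly simplified toy form: a "configurator" proposes per-module signature vectors and keeps only the strongest pairwise interactions as a sparse connectivity, and an "executor" then runs the modules on an input, routing state along the retained edges. All names, the dot-product scoring, and the thresholding scheme below are illustrative assumptions, not the paper's actual parameterization.

```python
import math
import random

random.seed(0)

DIM = 4        # width of each module's state (assumed)
N_MODULES = 3  # number of neural modules (assumed)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rand_vec(n):
    return [random.uniform(-1.0, 1.0) for _ in range(n)]

# --- Configurator: propose signatures and a sparse module graph ----------
signatures = [rand_vec(DIM) for _ in range(N_MODULES)]

def connectivity(sigs, keep=0.5):
    """Attention-like scores between module signatures, thresholded so
    only the strongest fraction of pairwise edges survives (the 'layout')."""
    scores = {}
    for i, si in enumerate(sigs):
        for j, sj in enumerate(sigs):
            if i != j:
                scores[(i, j)] = dot(si, sj) / math.sqrt(DIM)
    cutoff = sorted(scores.values(), reverse=True)[int(len(scores) * keep) - 1]
    return {e for e, s in scores.items() if s >= cutoff}

# --- Executor: run modules, routing states along the kept edges ----------
weights = [rand_vec(DIM) for _ in range(N_MODULES)]  # one toy gate per module

def execute(x, edges, steps=2):
    states = [list(x) for _ in range(N_MODULES)]
    for _ in range(steps):
        new_states = []
        for i in range(N_MODULES):
            # aggregate messages only from modules wired into module i
            msg = [0.0] * DIM
            for (src, dst) in edges:
                if dst == i:
                    for k in range(DIM):
                        msg[k] += states[src][k]
            gate = math.tanh(dot(weights[i], states[i]))
            new_states.append([states[i][k] + gate * msg[k] for k in range(DIM)])
        states = new_states
    return states

edges = connectivity(signatures)
out = execute(rand_vec(DIM), edges)
```

In the actual architecture both systems are trained end-to-end with gradient descent; this sketch only illustrates the division of labor between choosing a sparse layout and executing it.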
Publishing Year
2022
Date Published
2022-10-14
Proceedings Title
36th Conference on Neural Information Processing Systems
Volume
35
Conference
NeurIPS: Neural Information Processing Systems
Conference Location
New Orleans, United States
Conference Date
2022-11-29 – 2022-12-01
Cite this
Rahaman N, Weiss M, Locatello F, et al. Neural attentive circuits. In: 36th Conference on Neural Information Processing Systems. Vol 35. ; 2022.
Rahaman, N., Weiss, M., Locatello, F., Pal, C., Bengio, Y., Schölkopf, B., … Ballas, N. (2022). Neural attentive circuits. In 36th Conference on Neural Information Processing Systems (Vol. 35). New Orleans, United States.
Rahaman, Nasim, Martin Weiss, Francesco Locatello, Chris Pal, Yoshua Bengio, Bernhard Schölkopf, Li Erran Li, and Nicolas Ballas. “Neural Attentive Circuits.” In 36th Conference on Neural Information Processing Systems, Vol. 35, 2022.
N. Rahaman et al., “Neural attentive circuits,” in 36th Conference on Neural Information Processing Systems, New Orleans, United States, 2022, vol. 35.
Rahaman, Nasim, et al. “Neural Attentive Circuits.” 36th Conference on Neural Information Processing Systems, vol. 35, 2022.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Link(s) to Main File(s)
Access Level
Open Access
Sources
arXiv 2210.08031