Layer-wise quantization for quantized optimistic dual averaging

Nguyen AD, Markov I, Wu FZ, Ramezani-Kebrya A, Antonakopoulos K, Alistarh D-A, Cevher V. 2025. Layer-wise quantization for quantized optimistic dual averaging. 42nd International Conference on Machine Learning. ICML: International Conference on Machine Learning, PMLR, vol. 267, 46026–46072.

Download
OA 2025_ICML_Nguyen.pdf 756.21 KB [Published Version]
Conference Paper | Published | English

Scopus indexed
Author
Nguyen, Anh Duc; Markov, Ilia (ISTA); Wu, Frank Zhengqing; Ramezani-Kebrya, Ali; Antonakopoulos, Kimon; Alistarh, Dan-Adrian (ISTA); Cevher, Volkan
Series Title
PMLR
Abstract
Modern deep neural networks exhibit heterogeneity across their many layers of various types, such as residual and multi-head attention layers, owing to varying structures (dimensions, activation functions, etc.) and distinct representation characteristics, which impact predictions. We develop a general layer-wise quantization framework with tight variance and code-length bounds that adapts to these heterogeneities over the course of training. We then apply the new layer-wise quantization technique to distributed variational inequalities (VIs), proposing a novel Quantized Optimistic Dual Averaging (QODA) algorithm with adaptive learning rates, which achieves competitive convergence rates for monotone VIs. We empirically show that QODA achieves up to a 150% speedup over the baselines in end-to-end training time when training Wasserstein GANs on 12+ GPUs.
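
For intuition, the Python sketch below shows layer-wise unbiased stochastic quantization in the QSGD style: each layer's gradient is normalized by its own L2 norm and stochastically rounded to s uniform levels, so the quantization scale adapts per layer rather than globally. This is an illustrative assumption, not the paper's exact scheme (whose level choices follow from its variance and code-length analysis); the function names, the level count s, and the layer dictionary are hypothetical.

import numpy as np

def quantize_layer(g, s=16, rng=None):
    # Illustrative only: scale by this layer's own L2 norm, then
    # stochastically round magnitudes to s uniform levels (unbiased:
    # E[quantized] = g).
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return np.zeros_like(g)
    r = np.abs(g) / norm * s                 # positions in [0, s]
    lower = np.floor(r)
    level = lower + (rng.random(g.shape) < r - lower)  # round up w.p. r - lower
    return np.sign(g) * level * norm / s

def quantize_model(grads, s=16):
    # One scale per layer, so each layer adapts to its own magnitude.
    return {name: quantize_layer(g, s) for name, g in grads.items()}

# Toy usage: layers with very different scales keep their own resolution.
grads = {"attn.qkv": np.random.randn(256), "mlp.fc1": 0.01 * np.random.randn(256)}
q = quantize_model(grads)

Normalizing per layer keeps a small-magnitude layer ("mlp.fc1" above) from being drowned out by a large one, which is the kind of cross-layer heterogeneity the abstract refers to.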
Publishing Year
2025
Date Published
2025-05-01
Proceedings Title
42nd International Conference on Machine Learning
Publisher
ML Research Press
Acknowledgement
This work was supported by the Hasler Foundation Program: Hasler Responsible AI (project number 21043). The research was also sponsored by the Army Research Office and was accomplished under Grant Number W911NF-24-1-0048. This work was further funded by the Swiss National Science Foundation (SNSF) under grant number 200021_205011. We also acknowledge project A11 of the Swiss National Supercomputing Centre (CSCS) for providing computing resources. Dan Alistarh and Ilia Markov were supported in part through the ERC Proof-of-Concept grant FastML (Grant Agreement 101158077). Ali Ramezani-Kebrya was supported by the Research Council of Norway through its FRIPRO scheme (project number 356103), its Centres of Excellence scheme (Integreat, the Norwegian Centre for Knowledge-Driven Machine Learning, project number 332645), and its Centre for Research-based Innovation funding scheme (Visual Intelligence, grant no. 309439).
Volume
267
Page
46026-46072
Conference
ICML: International Conference on Machine Learning
Conference Location
Vancouver, Canada
Conference Date
2025-07-13 – 2025-07-19

Cite this

Nguyen AD, Markov I, Wu FZ, et al. Layer-wise quantization for quantized optimistic dual averaging. In: 42nd International Conference on Machine Learning. Vol 267. ML Research Press; 2025:46026-46072.
Nguyen, A. D., Markov, I., Wu, F. Z., Ramezani-Kebrya, A., Antonakopoulos, K., Alistarh, D.-A., & Cevher, V. (2025). Layer-wise quantization for quantized optimistic dual averaging. In 42nd International Conference on Machine Learning (Vol. 267, pp. 46026–46072). Vancouver, Canada: ML Research Press.
Nguyen, Anh Duc, Ilia Markov, Frank Zhengqing Wu, Ali Ramezani-Kebrya, Kimon Antonakopoulos, Dan-Adrian Alistarh, and Volkan Cevher. “Layer-Wise Quantization for Quantized Optimistic Dual Averaging.” In 42nd International Conference on Machine Learning, 267:46026–72. ML Research Press, 2025.
A. D. Nguyen et al., “Layer-wise quantization for quantized optimistic dual averaging,” in 42nd International Conference on Machine Learning, Vancouver, Canada, 2025, vol. 267, pp. 46026–46072.
Nguyen AD, Markov I, Wu FZ, Ramezani-Kebrya A, Antonakopoulos K, Alistarh D-A, Cevher V. 2025. Layer-wise quantization for quantized optimistic dual averaging. 42nd International Conference on Machine Learning. ICML: International Conference on Machine Learning, PMLR, vol. 267, 46026–46072.
Nguyen, Anh Duc, et al. “Layer-Wise Quantization for Quantized Optimistic Dual Averaging.” 42nd International Conference on Machine Learning, vol. 267, ML Research Press, 2025, pp. 46026–72.
All files available under the following license(s):
Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
Main File(s)
File Name
2025_ICML_Nguyen.pdf
Access Level
OA Open Access
Date Uploaded
2025-12-16
MD5 Checksum
a7edf0e4304171a3e035842b3aab1704


Sources

arXiv 2505.14371
