{"year":"2021","article_processing_charge":"No","month":"05","user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","department":[{"_id":"GradSch"},{"_id":"ChLa"}],"day":"01","main_file_link":[{"open_access":"1","url":"https://openreview.net/pdf?id=krz7T0xU9Z_"}],"file":[{"file_size":502356,"relation":"main_file","file_id":"9417","file_name":"iclr2021_conference.pdf","creator":"bphuong","content_type":"application/pdf","date_updated":"2021-05-24T11:15:57Z","date_created":"2021-05-24T11:15:57Z","access_level":"open_access","checksum":"f34ff17017527db5ba6927f817bdd125"}],"related_material":{"record":[{"id":"9418","relation":"dissertation_contains","status":"public"}]},"quality_controlled":"1","title":"The inductive bias of ReLU networks on orthogonally separable data","citation":{"chicago":"Phuong, Mary, and Christoph Lampert. “The Inductive Bias of ReLU Networks on Orthogonally Separable Data.” In 9th International Conference on Learning Representations, 2021.","ista":"Phuong M, Lampert C. 2021. The inductive bias of ReLU networks on orthogonally separable data. 9th International Conference on Learning Representations. ICLR: International Conference on Learning Representations.","apa":"Phuong, M., & Lampert, C. (2021). The inductive bias of ReLU networks on orthogonally separable data. In 9th International Conference on Learning Representations. Virtual.","short":"M. Phuong, C. Lampert, in: 9th International Conference on Learning Representations, 2021.","mla":"Phuong, Mary, and Christoph Lampert. “The Inductive Bias of ReLU Networks on Orthogonally Separable Data.” 9th International Conference on Learning Representations, 2021.","ieee":"M. Phuong and C. Lampert, “The inductive bias of ReLU networks on orthogonally separable data,” in 9th International Conference on Learning Representations, Virtual, 2021.","ama":"Phuong M, Lampert C. The inductive bias of ReLU networks on orthogonally separable data. In: 9th International Conference on Learning Representations; 2021."},"date_published":"2021-05-01T00:00:00Z","type":"conference","has_accepted_license":"1","scopus_import":"1","_id":"9416","oa":1,"ddc":["000"],"publication":"9th International Conference on Learning Representations","language":[{"iso":"eng"}],"publication_status":"published","author":[{"id":"3EC6EE64-F248-11E8-B48F-1D18A9856A87","full_name":"Bui Thi Mai, Phuong","last_name":"Bui Thi Mai","first_name":"Phuong"},{"id":"40C20FD2-F248-11E8-B48F-1D18A9856A87","full_name":"Lampert, Christoph","last_name":"Lampert","orcid":"0000-0001-8622-7887","first_name":"Christoph"}],"conference":{"start_date":"2021-05-03","location":"Virtual","end_date":"2021-05-07","name":"ICLR: International Conference on Learning Representations"},"date_created":"2021-05-24T11:16:46Z","status":"public","date_updated":"2023-09-07T13:29:50Z","file_date_updated":"2021-05-24T11:15:57Z","abstract":[{"text":"We study the inductive bias of two-layer ReLU networks trained by gradient flow. We identify a class of easy-to-learn (`orthogonally separable') datasets, and characterise the solution that ReLU networks trained on such datasets converge to. Irrespective of network width, the solution turns out to be a combination of two max-margin classifiers: one corresponding to the positive data subset and one corresponding to the negative data subset. The proof is based on the recently introduced concept of extremal sectors, for which we prove a number of properties in the context of orthogonal separability. In particular, we prove stationarity of activation patterns from some time onwards, which enables a reduction of the ReLU network to an ensemble of linear subnetworks.","lang":"eng"}],"oa_version":"Published Version"}