---
OA_place: publisher
_id: '21198'
abstract:
- lang: eng
  text: "In recent years there has been a massive increase in the amount of data generated
    in a decentralized manner. Ever more powerful edge devices, such as smartphones,
    have become ubiquitous in most societies on earth. Through text typed, photos
    taken and apps used, these devices, which we refer to as clients, generate
    enormous amounts of high-quality and complex data. Moreover, the nature of
    these devices means the data they generate is often sensitive and privacy concerns
    prevent it being gathered and stored in a central location. This presents a
    challenge to the modern machine learning paradigm that requires central access to
    large amounts of data. Federated learning (FL) has emerged as one of the answers
    to this problem. Rather than bringing the data to the model, FL sends the model
    to the data. Model training takes place on device, with periodically synchronized
    updates, allowing data to remain locally stored. While this approach offers
    significant privacy advantages, it comes with its own set of unique challenges.
    These include: data heterogeneity, the notion that different devices generate
    data in distinct ways which can negatively impact training dynamics; systems heterogeneity,
    meaning that different devices may have differing hardware specifications; high communication
    costs, which are induced by the repeated transferring of models over the network;
    and low device computational power, which limits the use of larger models on device. In
    this thesis we present a range of methods for federated learning. We focus primarily
    on the challenge of data heterogeneity, though the methods presented are designed
    to be well adapted to the other challenges of a federated setting, such as
    the constraints of limited compute and communication overhead. We first present
    a method for explicitly modeling client data heterogeneity. The approach formulates
    clients as samples from a certain probability distribution and infers the parameters
    of this distribution from the available training clients. This learned distribution
    then represents the heterogeneity present among the clients and can be sampled
    from in order to create new simulated clients that are similar to the real clients
    we have observed so far. Following this we present two methods for directly
    dealing with data heterogeneity through personalization. Highly heterogeneous
    client data distributions can mean that learning a single global model becomes
    suboptimal, and some form of personalization of models to each individual client
    is required. Our approaches are based around hypernetworks, which we use to
    generate personalized model parameters without the need for additional training
    or finetuning. In the first approach we focus on generating full parameterizations
    of client models using learned embeddings of client data and labels, with a
    hypernetwork located on the central server. In the second approach we address
    the more challenging scenario where we want to generate a personalized model
    for a client without any label information. The hypernetwork is trained to
    generate a low-dimensional representation of a client’s personalized model
    parameters, allowing it to be transferred to and run on the client devices. In
    our final presented method, we change our focus and rather than aim to directly
    address the challenge of data heterogeneity, we instead ensure we are unaffected
    by it. This is done in the context of k-means clustering and we present a method
    for federated clustering with a focus on added privacy guarantees."
acknowledged_ssus:
- _id: ScienComp
acknowledgement: "This research was funded in part by the Austrian Science Fund (FWF)
  [10.55776/COE12]. Furthermore, the candidate acknowledges the support from the Scientific
  Service Units (SSU) of ISTA through resources provided by Scientific Computing (SciComp)."
alternative_title:
- ISTA Thesis
article_processing_charge: No
author:
- first_name: Jonathan A
  full_name: Scott, Jonathan A
  id: e499926b-f6e0-11ea-865d-9c63db0031e8
  last_name: Scott
citation:
  ama: Scott JA. Data heterogeneity and personalization in federated learning. 2026.
    doi:<a href="https://doi.org/10.15479/AT-ISTA-21198">10.15479/AT-ISTA-21198</a>
  apa: Scott, J. A. (2026). <i>Data heterogeneity and personalization in federated
    learning</i>. Institute of Science and Technology Austria. <a href="https://doi.org/10.15479/AT-ISTA-21198">https://doi.org/10.15479/AT-ISTA-21198</a>
  chicago: Scott, Jonathan A. “Data Heterogeneity and Personalization in Federated
    Learning.” Institute of Science and Technology Austria, 2026. <a href="https://doi.org/10.15479/AT-ISTA-21198">https://doi.org/10.15479/AT-ISTA-21198</a>.
  ieee: J. A. Scott, “Data heterogeneity and personalization in federated learning,”
    Institute of Science and Technology Austria, 2026.
  ista: Scott JA. 2026. Data heterogeneity and personalization in federated learning.
    Institute of Science and Technology Austria.
  mla: Scott, Jonathan A. <i>Data Heterogeneity and Personalization in Federated Learning</i>.
    Institute of Science and Technology Austria, 2026, doi:<a href="https://doi.org/10.15479/AT-ISTA-21198">10.15479/AT-ISTA-21198</a>.
  short: J.A. Scott, Data Heterogeneity and Personalization in Federated Learning,
    Institute of Science and Technology Austria, 2026.
corr_author: '1'
date_created: 2026-02-09T14:59:53Z
date_published: 2026-02-09T00:00:00Z
date_updated: 2026-04-07T11:46:11Z
day: '09'
ddc:
- '005'
degree_awarded: PhD
department:
- _id: GradSch
- _id: ChLa
doi: 10.15479/AT-ISTA-21198
file:
- access_level: closed
  checksum: 121c1d968bd86f3630aa7e81d5bbbcb0
  content_type: application/zip
  creator: jscott
  date_created: 2026-02-17T11:46:22Z
  date_updated: 2026-02-17T11:46:22Z
  file_id: '21298'
  file_name: 2026_Scott_Jonathan_Thesis_Source.zip
  file_size: 272379252
  relation: source_file
- access_level: open_access
  checksum: 6e3e08ba474bbee8511cc8a839ab2077
  content_type: application/pdf
  creator: jscott
  date_created: 2026-02-27T10:25:41Z
  date_updated: 2026-02-27T10:25:41Z
  file_id: '21366'
  file_name: 2026_Jonathan_Scott_Thesis.pdf
  file_size: 15220298
  relation: main_file
  success: 1
file_date_updated: 2026-02-27T10:25:41Z
has_accepted_license: '1'
language:
- iso: eng
month: '02'
oa: 1
oa_version: Published Version
page: '158'
publication_identifier:
  issn:
  - 2663-337X
publication_status: published
publisher: Institute of Science and Technology Austria
related_material:
  record:
  - id: '20819'
    relation: part_of_dissertation
    status: public
  - id: '17411'
    relation: part_of_dissertation
    status: public
  - id: '18120'
    relation: part_of_dissertation
    status: public
  - id: '21207'
    relation: part_of_dissertation
    status: public
status: public
supervisor:
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
title: Data heterogeneity and personalization in federated learning
type: dissertation
user_id: ba8df636-2132-11f1-aed0-ed93e2281fdd
year: '2026'
...
---
_id: '21916'
abstract:
- lang: eng
  text: 'Social network graphs are central to graph learning research, serving as
    standard benchmarks for algorithm evaluation. However, existing datasets focus
    mainly on mainstream social media platforms whose structures are shaped notably
    by algorithmic recommendations. This raises an important question: would alternative,
    decentralized social networks exhibit different properties? We address this by
    studying the Fediverse, a collection of decentralized social networks (such as
    Mastodon and Lemmy). These platforms differ fundamentally from for-profit social
    media, notably in decentralization and absence of recommendation algorithms, which
    may yield distinct graph structures. We introduce Fedivertex, a dataset of over
    400 graphs from seven decentralized networks, collected weekly over six months.
    The dataset, released with a companion Python package to facilitate its use, supports
    research on temporal and structural aspects of decentralized social networks.
    In particular, we benchmark applications to decentralized machine learning and
    community detection.'
article_processing_charge: No
author:
- first_name: Marc
  full_name: Damie, Marc
  last_name: Damie
- first_name: Edwige Audrey Lucienne
  full_name: Cyffers, Edwige Audrey Lucienne
  id: 20d4c299-977a-11ef-ae55-98b15ac64a57
  last_name: Cyffers
citation:
  ama: 'Damie M, Cyffers EAL. Fedivertex: A graph dataset based on decentralized Social
    Media. In: <i>2026 Proceedings of the ACM Web Conference 2026</i>. ACM; :8393-8396.
    doi:<a href="https://doi.org/10.1145/3774904.3792868">10.1145/3774904.3792868</a>'
  apa: 'Damie, M., &#38; Cyffers, E. A. L. (n.d.). Fedivertex: A graph dataset based
    on decentralized Social Media. In <i>2026 Proceedings of the ACM Web Conference
    2026</i> (pp. 8393–8396). Dubai: ACM. <a href="https://doi.org/10.1145/3774904.3792868">https://doi.org/10.1145/3774904.3792868</a>'
  chicago: 'Damie, Marc, and Edwige Audrey Lucienne Cyffers. “Fedivertex: A Graph
    Dataset Based on Decentralized Social Media.” In <i>2026 Proceedings of the ACM
    Web Conference 2026</i>, 8393–96. ACM, n.d. <a href="https://doi.org/10.1145/3774904.3792868">https://doi.org/10.1145/3774904.3792868</a>.'
  ieee: 'M. Damie and E. A. L. Cyffers, “Fedivertex: A graph dataset based on decentralized
    Social Media,” in <i>2026 Proceedings of the ACM Web Conference 2026</i>, Dubai,
    pp. 8393–8396.'
  ista: 'Damie M, Cyffers EAL. Fedivertex: A graph dataset based on decentralized
    Social Media. 2026 Proceedings of the ACM Web Conference 2026. WWW: Web Conference,
    8393–8396.'
  mla: 'Damie, Marc, and Edwige Audrey Lucienne Cyffers. “Fedivertex: A Graph Dataset
    Based on Decentralized Social Media.” <i>2026 Proceedings of the ACM Web Conference
    2026</i>, ACM, pp. 8393–96, doi:<a href="https://doi.org/10.1145/3774904.3792868">10.1145/3774904.3792868</a>.'
  short: M. Damie, E.A.L. Cyffers, in:, 2026 Proceedings of the ACM Web Conference
    2026, ACM, n.d., pp. 8393–8396.
conference:
  end_date: 2026-07-03
  location: Dubai
  name: 'WWW: Web Conference'
  start_date: 2026-06-29
date_created: 2026-05-24T22:01:32Z
date_published: 2026-04-12T00:00:00Z
date_updated: 2026-06-03T05:40:18Z
day: '12'
department:
- _id: ChLa
doi: 10.1145/3774904.3792868
language:
- iso: eng
month: '04'
oa_version: None
page: 8393-8396
publication: 2026 Proceedings of the ACM Web Conference 2026
publication_identifier:
  isbn:
  - '9798400723070'
publication_status: accepted
publisher: ACM
scopus_import: '1'
status: public
title: 'Fedivertex: A graph dataset based on decentralized Social Media'
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2026'
...
---
OA_place: publisher
OA_type: hybrid
PlanS_conform: '1'
_id: '12662'
abstract:
- lang: eng
  text: 'Modern machine learning tasks often require considering not just one but
    multiple objectives. For example, besides the prediction quality, this could be
    the efficiency, robustness or fairness of the learned models, or any of their
    combinations. Multi-objective learning offers a natural framework for handling
    such problems without having to commit to early trade-offs. Surprisingly, statistical
    learning theory so far offers almost no insight into the generalization properties
    of multi-objective learning. In this work, we make first steps to fill this gap:
    We establish foundational generalization bounds for the multi-objective setting
    as well as generalization and excess bounds for learning with scalarizations.
    We also provide the first theoretical analysis of the relation between the Pareto-optimal
    sets of the true objectives and the Pareto-optimal sets of their empirical approximations
    from training data. In particular, we show a surprising asymmetry: All Pareto-optimal
    solutions can be approximated by empirically Pareto-optimal ones, but not vice
    versa.'
acknowledgement: Open access funding provided by Institute of Science and Technology
  (IST Austria).
article_processing_charge: Yes (via OA deal)
article_type: original
arxiv: 1
author:
- first_name: Peter
  full_name: Súkeník, Peter
  id: d64d6a8d-eb8e-11eb-b029-96fd216dec3c
  last_name: Súkeník
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: Súkeník P, Lampert C. Generalization in multi-objective machine learning. <i>Neural
    Computing and Applications</i>. 2025;37:24669–24683. doi:<a href="https://doi.org/10.1007/s00521-024-10616-1">10.1007/s00521-024-10616-1</a>
  apa: Súkeník, P., &#38; Lampert, C. (2025). Generalization in multi-objective machine
    learning. <i>Neural Computing and Applications</i>. Springer Nature. <a href="https://doi.org/10.1007/s00521-024-10616-1">https://doi.org/10.1007/s00521-024-10616-1</a>
  chicago: Súkeník, Peter, and Christoph Lampert. “Generalization in Multi-Objective
    Machine Learning.” <i>Neural Computing and Applications</i>. Springer Nature,
    2025. <a href="https://doi.org/10.1007/s00521-024-10616-1">https://doi.org/10.1007/s00521-024-10616-1</a>.
  ieee: P. Súkeník and C. Lampert, “Generalization in multi-objective machine learning,”
    <i>Neural Computing and Applications</i>, vol. 37. Springer Nature, pp. 24669–24683,
    2025.
  ista: Súkeník P, Lampert C. 2025. Generalization in multi-objective machine learning.
    Neural Computing and Applications. 37, 24669–24683.
  mla: Súkeník, Peter, and Christoph Lampert. “Generalization in Multi-Objective Machine
    Learning.” <i>Neural Computing and Applications</i>, vol. 37, Springer Nature,
    2025, pp. 24669–24683, doi:<a href="https://doi.org/10.1007/s00521-024-10616-1">10.1007/s00521-024-10616-1</a>.
  short: P. Súkeník, C. Lampert, Neural Computing and Applications 37 (2025) 24669–24683.
corr_author: '1'
date_created: 2023-02-20T08:23:06Z
date_published: 2025-10-01T00:00:00Z
date_updated: 2025-12-30T06:39:56Z
day: '01'
ddc:
- '004'
department:
- _id: ChLa
doi: 10.1007/s00521-024-10616-1
external_id:
  arxiv:
  - '2208.13499'
file:
- access_level: open_access
  checksum: 61ad4591aee16b1e02daf6c164321a42
  content_type: application/pdf
  creator: dernst
  date_created: 2025-12-30T06:39:11Z
  date_updated: 2025-12-30T06:39:11Z
  file_id: '20877'
  file_name: 2025_NeuralCompApplic_Sukenik.pdf
  file_size: 500213
  relation: main_file
  success: 1
file_date_updated: 2025-12-30T06:39:11Z
has_accepted_license: '1'
intvolume: '        37'
language:
- iso: eng
month: '10'
oa: 1
oa_version: Published Version
page: 24669–24683
publication: Neural Computing and Applications
publication_identifier:
  eissn:
  - 1433-3058
  issn:
  - 0941-0643
publication_status: published
publisher: Springer Nature
quality_controlled: '1'
scopus_import: '1'
status: public
title: Generalization in multi-objective machine learning
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: journal_article
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 37
year: '2025'
...
---
OA_place: publisher
OA_type: gold
_id: '20256'
abstract:
- lang: eng
  text: We study the problem of predictive runtime monitoring of black-box dynamical
    systems with quantitative safety properties. The black-box setting stipulates
    that the exact semantics of the dynamical system and the controller are unknown,
    and that we are only able to observe the state of the controlled (a.k.a. closed-loop)
    system at finitely many time points. We present a novel framework for predicting
    future states of the system based on the states observed in the past. The numbers
    of past states and of predicted future states are parameters provided by the user.
    Our method is based on a combination of Taylor’s expansion and the backward difference
    operator for numerical differentiation. We also derive an upper bound on the prediction
    error under the assumption that the system dynamics and the controller are smooth.
    The predicted states are then used to predict safety violations ahead in time.
    Our experiments demonstrate practical applicability of our method for complex
    black-box systems, showing that it is computationally lightweight and yet significantly
    more accurate than the state-of-the-art predictive safety monitoring techniques.
acknowledgement: This work was supported in part by the ERC project ERC-2020-AdG
  101020093.
alternative_title:
- PMLR
article_processing_charge: No
arxiv: 1
author:
- first_name: Thomas A
  full_name: Henzinger, Thomas A
  id: 40876CD8-F248-11E8-B48F-1D18A9856A87
  last_name: Henzinger
  orcid: 0000-0002-2985-7724
- first_name: Fabian
  full_name: Kresse, Fabian
  id: faff3c84-23f6-11ef-9085-e5187b51c604
  last_name: Kresse
- first_name: Kaushik
  full_name: Mallik, Kaushik
  id: 0834ff3c-6d72-11ec-94e0-b5b0a4fb8598
  last_name: Mallik
  orcid: 0000-0001-9864-7475
- first_name: Zhengqi
  full_name: Yu, Zhengqi
  id: 20aa2ae8-f2f1-11ed-bbfa-8205053f1342
  last_name: Yu
- first_name: Dorde
  full_name: Zikelic, Dorde
  id: 294AA7A6-F248-11E8-B48F-1D18A9856A87
  last_name: Zikelic
  orcid: 0000-0002-4681-1699
citation:
  ama: 'Henzinger TA, Kresse F, Mallik K, Yu E, Zikelic D. Predictive monitoring of
    black-box dynamical systems. In: <i>7th Annual Learning for Dynamics &#38; Control
    Conference</i>. Vol 283. ML Research Press; 2025:804-816.'
  apa: 'Henzinger, T. A., Kresse, F., Mallik, K., Yu, E., &#38; Zikelic, D. (2025).
    Predictive monitoring of black-box dynamical systems. In <i>7th Annual Learning
    for Dynamics &#38; Control Conference</i> (Vol. 283, pp. 804–816). Ann Arbor,
    MI, United States: ML Research Press.'
  chicago: Henzinger, Thomas A, Fabian Kresse, Kaushik Mallik, Emily Yu, and Dorde
    Zikelic. “Predictive Monitoring of Black-Box Dynamical Systems.” In <i>7th Annual
    Learning for Dynamics &#38; Control Conference</i>, 283:804–16. ML Research Press,
    2025.
  ieee: T. A. Henzinger, F. Kresse, K. Mallik, E. Yu, and D. Zikelic, “Predictive
    monitoring of black-box dynamical systems,” in <i>7th Annual Learning for Dynamics
    &#38; Control Conference</i>, Ann Arbor, MI, United States, 2025, vol. 283, pp.
    804–816.
  ista: 'Henzinger TA, Kresse F, Mallik K, Yu E, Zikelic D. 2025. Predictive monitoring
    of black-box dynamical systems. 7th Annual Learning for Dynamics &#38; Control
    Conference. L4DC: Learning for Dynamics &#38; Control, PMLR, vol. 283, 804–816.'
  mla: Henzinger, Thomas A., et al. “Predictive Monitoring of Black-Box Dynamical
    Systems.” <i>7th Annual Learning for Dynamics &#38; Control Conference</i>, vol.
    283, ML Research Press, 2025, pp. 804–16.
  short: T.A. Henzinger, F. Kresse, K. Mallik, E. Yu, D. Zikelic, in:, 7th Annual
    Learning for Dynamics &#38; Control Conference, ML Research Press, 2025, pp. 804–816.
conference:
  end_date: 2025-06-06
  location: Ann Arbor, MI, United States
  name: 'L4DC: Learning for Dynamics & Control'
  start_date: 2025-06-04
corr_author: '1'
date_created: 2025-08-31T22:01:32Z
date_published: 2025-06-01T00:00:00Z
date_updated: 2025-09-03T10:37:59Z
day: '01'
ddc:
- '000'
department:
- _id: ToHe
- _id: ChLa
ec_funded: 1
external_id:
  arxiv:
  - '2412.16564'
file:
- access_level: open_access
  checksum: d5236e561560635f5ae1d17de4903033
  content_type: application/pdf
  creator: dernst
  date_created: 2025-09-03T10:32:12Z
  date_updated: 2025-09-03T10:32:12Z
  file_id: '20283'
  file_name: 2025_L4DC_HenzingerT.pdf
  file_size: 489639
  relation: main_file
  success: 1
file_date_updated: 2025-09-03T10:32:12Z
has_accepted_license: '1'
intvolume: '       283'
language:
- iso: eng
month: '06'
oa: 1
oa_version: Published Version
page: 804-816
project:
- _id: 62781420-2b32-11ec-9570-8d9b63373d4d
  call_identifier: H2020
  grant_number: '101020093'
  name: Vigilant Algorithmic Monitoring of Software
publication: 7th Annual Learning for Dynamics & Control Conference
publication_identifier:
  eissn:
  - 2640-3498
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
scopus_import: '1'
status: public
title: Predictive monitoring of black-box dynamical systems
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 283
year: '2025'
...
---
OA_place: publisher
OA_type: diamond
_id: '20296'
abstract:
- lang: eng
  text: Learning-based systems are increasingly deployed across various domains, yet
    the complexity of traditional neural networks poses significant challenges for
    formal verification. Unlike conventional neural networks, learned Logic Gate Networks
    (LGNs) replace multiplications with Boolean logic gates, yielding a sparse, netlist-like
    architecture that is inherently more amenable to symbolic verification, while
    still delivering promising performance. In this paper, we introduce a SAT encoding
    for verifying global robustness and fairness in LGNs. We evaluate our method on
    five benchmark datasets, including a newly constructed 5-class variant, and find
    that LGNs are both verification-friendly and maintain strong predictive performance.
acknowledged_ssus:
- _id: ScienComp
acknowledgement: "This work is supported in part by the ERC under Grant No.
  ERC-2020-AdG 101020093 and the Austrian Science Fund (FWF) [10.55776/COE12].
  This research was supported by the Scientific Service Units (SSU) of ISTA through
  resources provided by Scientific Computing (SciComp)."
alternative_title:
- PMLR
article_number: '26'
article_processing_charge: No
arxiv: 1
author:
- first_name: Fabian
  full_name: Kresse, Fabian
  id: faff3c84-23f6-11ef-9085-e5187b51c604
  last_name: Kresse
- first_name: Zhengqi
  full_name: Yu, Zhengqi
  id: 20aa2ae8-f2f1-11ed-bbfa-8205053f1342
  last_name: Yu
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
- first_name: Thomas A
  full_name: Henzinger, Thomas A
  id: 40876CD8-F248-11E8-B48F-1D18A9856A87
  last_name: Henzinger
  orcid: 0000-0002-2985-7724
citation:
  ama: 'Kresse F, Yu E, Lampert C, Henzinger TA. Logic gate neural networks are good
    for verification. In: <i>2nd International Conference on Neuro-Symbolic Systems</i>.
    Vol 288. ML Research Press; 2025.'
  apa: 'Kresse, F., Yu, E., Lampert, C., &#38; Henzinger, T. A. (2025). Logic gate
    neural networks are good for verification. In <i>2nd International Conference on
    Neuro-Symbolic Systems</i> (Vol. 288). Philadelphia, PA, United States: ML Research
    Press.'
  chicago: Kresse, Fabian, Emily Yu, Christoph Lampert, and Thomas A Henzinger. “Logic
    Gate Neural Networks Are Good for Verification.” In <i>2nd International Conference on
    Neuro-Symbolic Systems</i>, Vol. 288. ML Research Press, 2025.
  ieee: F. Kresse, E. Yu, C. Lampert, and T. A. Henzinger, “Logic gate neural networks
    are good for verification,” in <i>2nd International Conference on Neuro-Symbolic
    Systems</i>, Philadelphia, PA, United States, 2025, vol. 288.
  ista: 'Kresse F, Yu E, Lampert C, Henzinger TA. 2025. Logic gate neural networks
    are good for verification. 2nd International Conference on Neuro-Symbolic Systems.
    NeuS: International Conference on Neuro-Symbolic Systems, PMLR, vol. 288, 26.'
  mla: Kresse, Fabian, et al. “Logic Gate Neural Networks Are Good for Verification.”
    <i>2nd International Conference on Neuro-Symbolic Systems</i>, vol. 288, 26, ML
    Research Press, 2025.
  short: F. Kresse, E. Yu, C. Lampert, T.A. Henzinger, in:, 2nd International Conference on
    Neuro-Symbolic Systems, ML Research Press, 2025.
conference:
  end_date: 2025-05-30
  location: Philadelphia, PA, United States
  name: 'NeuS: International Conference on Neuro-Symbolic Systems'
  start_date: 2025-05-28
corr_author: '1'
date_created: 2025-09-07T22:01:34Z
date_published: 2025-06-01T00:00:00Z
date_updated: 2025-09-09T08:12:44Z
day: '01'
ddc:
- '000'
department:
- _id: ChLa
- _id: ToHe
ec_funded: 1
external_id:
  arxiv:
  - '2505.19932'
file:
- access_level: open_access
  checksum: 90a32defed34787e771a5c1623b6b0d2
  content_type: application/pdf
  creator: dernst
  date_created: 2025-09-09T08:10:13Z
  date_updated: 2025-09-09T08:10:13Z
  file_id: '20314'
  file_name: 2025_NeuS_Kresse.pdf
  file_size: 295466
  relation: main_file
  success: 1
file_date_updated: 2025-09-09T08:10:13Z
has_accepted_license: '1'
intvolume: '       288'
language:
- iso: eng
month: '06'
oa: 1
oa_version: Published Version
project:
- _id: 62781420-2b32-11ec-9570-8d9b63373d4d
  call_identifier: H2020
  grant_number: '101020093'
  name: Vigilant Algorithmic Monitoring of Software
publication: 2nd International Conference on Neuro-Symbolic Systems
publication_identifier:
  eissn:
  - 2640-3498
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
scopus_import: '1'
status: public
title: Logic gate neural networks are good for verification
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 288
year: '2025'
...
---
OA_place: publisher
OA_type: diamond
_id: '20298'
abstract:
- lang: eng
  text: "In this paper, we study the problem of estimating the unknown mean θ of a
    unit variance Gaussian distribution in a locally differentially private (LDP)
    way. In the high-privacy regime (ϵ≤1), we identify an optimal privacy mechanism
    that minimizes the variance of the estimator asymptotically. Our main technical
    contribution is the maximization of the Fisher-Information of the sanitized data
    with respect to the local privacy mechanism Q. We find that the exact solution
    Qθ,ϵ of this maximization is the sign mechanism that applies randomized response
    to the sign of Xi−θ, where X1,…,Xn are the confidential iid original samples.
    However, since this optimal local mechanism depends on the unknown mean θ, we
    employ a two-stage LDP parameter estimation procedure which requires splitting
    agents into two groups. The first n1 observations are used to consistently but
    not necessarily efficiently estimate the parameter θ by θ~n1. Then this estimate
    is updated by applying the sign mechanism with θ~n1 instead of θ to the remaining
    n−n1 observations, to obtain an LDP and efficient estimator of the unknown mean."
acknowledgement: "We would like to express our gratitude to Christoph Lampert for
  his valuable insights and fruitful discussions that significantly contributed to
  the development of this paper.\r\nWe also thank Salil Vadhan for his constructive
  feedback on an earlier version of this draft.\r\nThe second author gratefully acknowledges
  support by the Austrian Science Fund (FWF): I 5484-N, as part of the Research Unit
  5381 of the German Research Foundation."
alternative_title:
- PMLR
article_processing_charge: No
arxiv: 1
author:
- first_name: Nikita
  full_name: Kalinin, Nikita
  id: 4b14526e-14d2-11ed-ba64-c14c9553d137
  last_name: Kalinin
- first_name: Lukas
  full_name: Steinberger, Lukas
  last_name: Steinberger
citation:
  ama: 'Kalinin N, Steinberger L. Efficient estimation of a Gaussian mean with local
    differential privacy. In: <i>Proceedings of the 28th International Conference
    on Artificial Intelligence and Statistics</i>. Vol 258. ML Research Press; 2025:118-126.'
  apa: 'Kalinin, N., &#38; Steinberger, L. (2025). Efficient estimation of a Gaussian
    mean with local differential privacy. In <i>Proceedings of the 28th International
    Conference on Artificial Intelligence and Statistics</i> (Vol. 258, pp. 118–126).
    Mai Khao, Thailand: ML Research Press.'
  chicago: Kalinin, Nikita, and Lukas Steinberger. “Efficient Estimation of a Gaussian
    Mean with Local Differential Privacy.” In <i>Proceedings of the 28th International
    Conference on Artificial Intelligence and Statistics</i>, 258:118–26. ML Research
    Press, 2025.
  ieee: N. Kalinin and L. Steinberger, “Efficient estimation of a Gaussian mean with
    local differential privacy,” in <i>Proceedings of the 28th International Conference
    on Artificial Intelligence and Statistics</i>, Mai Khao, Thailand, 2025, vol.
    258, pp. 118–126.
  ista: 'Kalinin N, Steinberger L. 2025. Efficient estimation of a Gaussian mean with
    local differential privacy. Proceedings of the 28th International Conference on
    Artificial Intelligence and Statistics. AISTATS: Conference on Artificial Intelligence
    and Statistics, PMLR, vol. 258, 118–126.'
  mla: Kalinin, Nikita, and Lukas Steinberger. “Efficient Estimation of a Gaussian
    Mean with Local Differential Privacy.” <i>Proceedings of the 28th International
    Conference on Artificial Intelligence and Statistics</i>, vol. 258, ML Research
    Press, 2025, pp. 118–26.
  short: N. Kalinin, L. Steinberger, in:, Proceedings of the 28th International Conference
    on Artificial Intelligence and Statistics, ML Research Press, 2025, pp. 118–126.
conference:
  end_date: 2025-05-05
  location: Mai Khao, Thailand
  name: 'AISTATS: Conference on Artificial Intelligence and Statistics'
  start_date: 2025-05-03
corr_author: '1'
date_created: 2025-09-07T22:01:34Z
date_published: 2025-05-01T00:00:00Z
date_updated: 2025-09-09T08:28:41Z
day: '01'
ddc:
- '000'
department:
- _id: ChLa
external_id:
  arxiv:
  - '2402.04840'
file:
- access_level: open_access
  checksum: 3dcd59988ca974b98662ba09a516e616
  content_type: application/pdf
  creator: dernst
  date_created: 2025-09-09T08:26:44Z
  date_updated: 2025-09-09T08:26:44Z
  file_id: '20316'
  file_name: 2025_AISTATS_Kalinin.pdf
  file_size: 395864
  relation: main_file
  success: 1
file_date_updated: 2025-09-09T08:26:44Z
has_accepted_license: '1'
intvolume: '       258'
language:
- iso: eng
month: '05'
oa: 1
oa_version: Published Version
page: 118-126
publication: Proceedings of the 28th International Conference on Artificial Intelligence
  and Statistics
publication_identifier:
  eissn:
  - 2640-3498
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
scopus_import: '1'
status: public
title: Efficient estimation of a Gaussian mean with local differential privacy
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 258
year: '2025'
...
---
OA_place: repository
OA_type: green
_id: '20455'
abstract:
- lang: eng
  text: Despite extensive research since the community learned about adversarial examples
    10 years ago, we still do not know how to train high-accuracy classifiers that
    are guaranteed to be robust to small perturbations of their inputs. Previous works
    often argued that this might be because no classifier exists that is robust and
    accurate at the same time. However, in computer vision this assumption does not
    match reality where humans are usually accurate and robust on most tasks of interest.
    We offer an alternative explanation and show that in certain settings robust generalization
    is only possible with unrealistically large amounts of data. Specifically, we
    find a setting where a robust classifier exists, it is easy to learn an accurate
    classifier, yet it requires an exponential amount of data to learn a robust classifier.
    Based on this theoretical result, we evaluate the influence of the amount of training
    data on datasets such as CIFAR10. Our findings indicate that the amount of
    training data is the main factor determining the robust performance. Furthermore,
    we show that there are low-magnitude directions in the data which are useful
    for non-robust generalization but are not available for robust classifiers. This
    implies that robust classification is a strictly harder task than normal classification,
    thereby providing an explanation why robust classification requires more data.
article_processing_charge: No
arxiv: 1
author:
- first_name: Bernd
  full_name: Prach, Bernd
  id: 2D561D42-C427-11E9-89B4-9C1AE6697425
  last_name: Prach
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Prach B, Lampert C. Intriguing properties of robust classification. In: <i>2025
    IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops</i>.
    IEEE; 2025:660-669. doi:<a href="https://doi.org/10.1109/CVPRW67362.2025.00071">10.1109/CVPRW67362.2025.00071</a>'
  apa: 'Prach, B., &#38; Lampert, C. (2025). Intriguing properties of robust classification.
    In <i>2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops</i>
    (pp. 660–669). Nashville, TN, United States: IEEE. <a href="https://doi.org/10.1109/CVPRW67362.2025.00071">https://doi.org/10.1109/CVPRW67362.2025.00071</a>'
  chicago: Prach, Bernd, and Christoph Lampert. “Intriguing Properties of Robust Classification.”
    In <i>2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops</i>,
    660–69. IEEE, 2025. <a href="https://doi.org/10.1109/CVPRW67362.2025.00071">https://doi.org/10.1109/CVPRW67362.2025.00071</a>.
  ieee: B. Prach and C. Lampert, “Intriguing properties of robust classification,”
    in <i>2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops</i>,
    Nashville, TN, United States, 2025, pp. 660–669.
  ista: 'Prach B, Lampert C. 2025. Intriguing properties of robust classification.
    2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops.
    CVPR: Conference on Computer Vision and Pattern Recognition, 660–669.'
  mla: Prach, Bernd, and Christoph Lampert. “Intriguing Properties of Robust Classification.”
    <i>2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops</i>,
    IEEE, 2025, pp. 660–69, doi:<a href="https://doi.org/10.1109/CVPRW67362.2025.00071">10.1109/CVPRW67362.2025.00071</a>.
  short: B. Prach, C. Lampert, in:, 2025 IEEE/CVF Conference on Computer Vision and
    Pattern Recognition Workshops, IEEE, 2025, pp. 660–669.
conference:
  end_date: 2025-06-12
  location: Nashville, TN, United States
  name: 'CVPR: Conference on Computer Vision and Pattern Recognition'
  start_date: 2025-06-11
corr_author: '1'
date_created: 2025-10-12T22:01:26Z
date_published: 2025-06-15T00:00:00Z
date_updated: 2025-10-13T07:18:26Z
day: '15'
department:
- _id: ChLa
doi: 10.1109/CVPRW67362.2025.00071
external_id:
  arxiv:
  - '2412.04245'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2412.04245
month: '06'
oa: 1
oa_version: Preprint
page: 660-669
publication: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops
publication_identifier:
  eissn:
  - 2160-7516
  isbn:
  - '9798331599942'
  issn:
  - 2160-7508
publication_status: published
publisher: IEEE
quality_controlled: '1'
related_material:
  record:
  - id: '18874'
    relation: earlier_version
    status: public
scopus_import: '1'
status: public
title: Intriguing properties of robust classification
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2025'
...
---
OA_place: publisher
OA_type: gold
_id: '20819'
abstract:
- lang: eng
  text: "Clustering is a cornerstone of data analysis that is particularly suited
    to identifying coherent subgroups or substructures in unlabeled data, as are generated
    continuously in large amounts these days. However, in many cases traditional clustering
    methods are not applicable, because data are increasingly being produced and stored
    in a distributed way, e.g. on edge devices, and privacy concerns prevent it from
    being transferred to a central server. To address this challenge, we present FedDP-KMeans,
    a new algorithm for k-means clustering that is fully-federated as well as differentially
    private. Our approach leverages (potentially small and out-of-distribution) server-side
    data to overcome the primary challenge of differentially private clustering methods:
    the need for a good initialization. Combining our initialization with a simple
    federated DP-Lloyds algorithm we obtain an algorithm that achieves excellent results
    on synthetic and real-world benchmark tasks. We also provide a theoretical analysis
    of our method that provides bounds on the convergence speed and cluster identification
    success."
acknowledged_ssus:
- _id: ScienComp
acknowledgement: "This research was funded in part by the Austrian Science Fund (FWF)
  [10.55776/COE12] and supported by the Scientific Service Units (SSU) of ISTA through
  resources provided by Scientific Computing (SciComp).\r\n"
alternative_title:
- PMLR
article_processing_charge: No
arxiv: 1
author:
- first_name: Jonathan A
  full_name: Scott, Jonathan A
  id: e499926b-f6e0-11ea-865d-9c63db0031e8
  last_name: Scott
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
- first_name: David
  full_name: Saulpic, David
  id: f8e48cf0-b0ff-11ed-b0e9-b4c35598f964
  last_name: Saulpic
citation:
  ama: 'Scott JA, Lampert C, Saulpic D. Differentially private federated k-means clustering
    with server-side data. In: <i>42nd International Conference on Machine Learning</i>.
    Vol 267. ML Research Press; 2025:53757-53790.'
  apa: 'Scott, J. A., Lampert, C., &#38; Saulpic, D. (2025). Differentially private
    federated k-means clustering with server-side data. In <i>42nd International Conference
    on Machine Learning</i> (Vol. 267, pp. 53757–53790). Vancouver, Canada: ML Research
    Press.'
  chicago: Scott, Jonathan A, Christoph Lampert, and David Saulpic. “Differentially
    Private Federated K-Means Clustering with Server-Side Data.” In <i>42nd International
    Conference on Machine Learning</i>, 267:53757–90. ML Research Press, 2025.
  ieee: J. A. Scott, C. Lampert, and D. Saulpic, “Differentially private federated
    k-means clustering with server-side data,” in <i>42nd International Conference
    on Machine Learning</i>, Vancouver, Canada, 2025, vol. 267, pp. 53757–53790.
  ista: 'Scott JA, Lampert C, Saulpic D. 2025. Differentially private federated k-means
    clustering with server-side data. 42nd International Conference on Machine Learning.
    ICML: International Conference on Machine Learning, PMLR, vol. 267, 53757–53790.'
  mla: Scott, Jonathan A., et al. “Differentially Private Federated K-Means Clustering
    with Server-Side Data.” <i>42nd International Conference on Machine Learning</i>,
    vol. 267, ML Research Press, 2025, pp. 53757–90.
  short: J.A. Scott, C. Lampert, D. Saulpic, in:, 42nd International Conference on
    Machine Learning, ML Research Press, 2025, pp. 53757–53790.
conference:
  end_date: 2025-07-19
  location: Vancouver, Canada
  name: 'ICML: International Conference on Machine Learning'
  start_date: 2025-07-13
corr_author: '1'
date_created: 2025-12-14T23:02:05Z
date_published: 2025-05-01T00:00:00Z
date_updated: 2026-04-07T11:46:11Z
day: '01'
ddc:
- '000'
department:
- _id: ChLa
- _id: MoHe
external_id:
  arxiv:
  - '2506.05408'
file:
- access_level: open_access
  checksum: 815b32b463023ca21e569c2158745c15
  content_type: application/pdf
  creator: dernst
  date_created: 2025-12-16T12:38:29Z
  date_updated: 2025-12-16T12:38:29Z
  file_id: '20829'
  file_name: 2025_ICML_Scott.pdf
  file_size: 746612
  relation: main_file
  success: 1
file_date_updated: 2025-12-16T12:38:29Z
has_accepted_license: '1'
intvolume: '       267'
language:
- iso: eng
month: '05'
oa: 1
oa_version: Published Version
page: 53757-53790
publication: 42nd International Conference on Machine Learning
publication_identifier:
  eissn:
  - 2640-3498
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
related_material:
  record:
  - id: '21198'
    relation: dissertation_contains
    status: public
scopus_import: '1'
status: public
title: Differentially private federated k-means clustering with server-side data
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 267
year: '2025'
...
---
OA_place: repository
_id: '21207'
abstract:
- lang: eng
  text: Personalized federated learning has emerged as a popular approach to training
    on devices, known as clients, that hold statistically heterogeneous data. However,
    most existing approaches require a client to have labeled data for training or
    finetuning in order to obtain their own personalized model. In this paper we address
    this by proposing FLowDUP, a novel method that is able to generate a personalized
    model using only a forward pass with unlabeled data. The generated model parameters
    reside in a low-dimensional subspace, enabling efficient communication and computation.
    FLowDUP's learning objective is theoretically motivated by our new transductive
    multi-task PAC-Bayesian generalization bound, which provides performance guarantees
    for unlabeled clients. The objective is structured in such a way that it allows
    both clients with labeled data and clients with only unlabeled data to contribute
    to the training process. To supplement our theoretical results we carry out a
    thorough experimental evaluation of FLowDUP, demonstrating strong empirical performance
    on a range of datasets with differing sorts of statistically heterogeneous clients.
    Through numerous ablation studies, we test the efficacy of the individual components
    of the method.
article_processing_charge: No
author:
- first_name: Hossein
  full_name: Zakerinia, Hossein
  id: 653bd8b6-f394-11eb-9cf6-c0bbf6cd78d4
  last_name: Zakerinia
  orcid: 0009-0007-3977-6462
- first_name: Jonathan A
  full_name: Scott, Jonathan A
  id: e499926b-f6e0-11ea-865d-9c63db0031e8
  last_name: Scott
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Zakerinia H, Scott JA, Lampert C. Federated learning with unlabeled clients:
    Personalization can happen in low dimensions. <i>arXiv</i>. doi:<a href="https://doi.org/10.48550/ARXIV.2505.15579">10.48550/ARXIV.2505.15579</a>'
  apa: 'Zakerinia, H., Scott, J. A., &#38; Lampert, C. (n.d.). Federated learning
    with unlabeled clients: Personalization can happen in low dimensions. <i>arXiv</i>.
    <a href="https://doi.org/10.48550/ARXIV.2505.15579">https://doi.org/10.48550/ARXIV.2505.15579</a>'
  chicago: 'Zakerinia, Hossein, Jonathan A Scott, and Christoph Lampert. “Federated
    Learning with Unlabeled Clients: Personalization Can Happen in Low Dimensions.”
    <i>ArXiv</i>, n.d. <a href="https://doi.org/10.48550/ARXIV.2505.15579">https://doi.org/10.48550/ARXIV.2505.15579</a>.'
  ieee: 'H. Zakerinia, J. A. Scott, and C. Lampert, “Federated learning with unlabeled
    clients: Personalization can happen in low dimensions,” <i>arXiv</i>. .'
  ista: 'Zakerinia H, Scott JA, Lampert C. Federated learning with unlabeled clients:
    Personalization can happen in low dimensions. arXiv, <a href="https://doi.org/10.48550/ARXIV.2505.15579">10.48550/ARXIV.2505.15579</a>.'
  mla: 'Zakerinia, Hossein, et al. “Federated Learning with Unlabeled Clients: Personalization
    Can Happen in Low Dimensions.” <i>ArXiv</i>, doi:<a href="https://doi.org/10.48550/ARXIV.2505.15579">10.48550/ARXIV.2505.15579</a>.'
  short: H. Zakerinia, J.A. Scott, C. Lampert, ArXiv (n.d.).
corr_author: '1'
date_created: 2026-02-10T08:20:59Z
date_published: 2025-05-21T00:00:00Z
date_updated: 2026-04-07T11:46:11Z
day: '21'
department:
- _id: ChLa
doi: 10.48550/ARXIV.2505.15579
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2505.15579
month: '05'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: draft
related_material:
  record:
  - id: '21198'
    relation: dissertation_contains
    status: public
status: public
title: 'Federated learning with unlabeled clients: Personalization can happen in low
  dimensions'
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: preprint
user_id: 8b945eb4-e2f2-11eb-945a-df72226e66a9
year: '2025'
...
---
OA_place: publisher
_id: '19759'
abstract:
- lang: eng
  text: "Despite generating remarkable results in various computer vision tasks, deep
    learning comes\r\nwith some surprising shortcomings. For example, tiny perturbations,
    often imperceptible to\r\nthe human eye, can completely change the predictions
    of image classifiers. Despite a decade\r\nof research, the field has made limited
    progress in developing image classifiers that are both\r\naccurate and robust.
    This thesis aims to address this gap.\r\nAs our first contribution, we aim to
    simplify the process of training certifiably robust image\r\nclassifiers. We do
    this by designing a convolutional layer that does not require executing an\r\niterative
    procedure in every forward pass, but relies on an explicit bound instead. We also\r\npropose
    a loss function that allows optimizing for a particular margin more precisely.\r\nNext,
    we provide an overview and comparison of various methods that create robust image\r\nclassifiers
    by constraining the Lipschitz constant. This is important since generally longer\r\ntraining
    times and more parameters improve the performance of robust classifiers, making
    it\r\nchallenging to determine the most practical and effective methods from existing
    literature.\r\nIn 1-Lipschitz classification, the performance of current methods
    is still much worse than what\r\nwe expect on the simple tasks we consider. Therefore,
    we next investigate potential causes of\r\nthis shortcoming. We first consider
    the role of the activation function. We prove a theoretical\r\nshortcoming of
    the commonly used activation function, and provide an alternative without it.\r\nHowever,
    this theoretical improvement barely translates to the empirical performance
    of\r\nrobust classifiers, suggesting a different bottleneck.\r\nTherefore, in
    the final chapter, we study how the performance depends on the amount of\r\ntraining
    data. We prove that in the worst case, we might require far more data to train
    a\r\nrobust classifier compared to a normal one. We furthermore find that the
    amount of training\r\ndata is a key determinant of the performance current methods
    achieve on popular datasets.\r\nAdditionally, we show that linear subspaces exist
    with tiny data variance, and yet we can\r\nstill train very accurate classifiers
    after projecting into those subspaces. This shows that on\r\nthe datasets considered,
    enforcing robustness in classification makes the task strictly more\r\nchallenging.\r\n\r\n-----------------“In
    reference to IEEE copyrighted material which is used with permission in this thesis,
    the IEEE does not endorse any of [name of university or educational entity]’s
    products or services. Internal or personal use of this material is permitted.
    If interested in reprinting/republishing IEEE copyrighted material for advertising
    or promotional purposes or for creating new collective works for resale or redistribution,
    please go to http://www.ieee.org/publications_standards/publications/rights/rights_link.html
    to learn how to obtain a License from RightsLink. If applicable, University Microfilms
    and/or ProQuest Library, or the Archives of Canada may supply single copies of
    the dissertation.”\r\n"
alternative_title:
- ISTA Thesis
article_processing_charge: No
author:
- first_name: Bernd
  full_name: Prach, Bernd
  id: 2D561D42-C427-11E9-89B4-9C1AE6697425
  last_name: Prach
citation:
  ama: Prach B. Robust image classification with 1-Lipschitz networks. 2025. doi:<a
    href="https://doi.org/10.15479/10.15479/at-ista-19759">10.15479/10.15479/at-ista-19759</a>
  apa: Prach, B. (2025). <i>Robust image classification with 1-Lipschitz networks</i>.
    Institute of Science and Technology Austria. <a href="https://doi.org/10.15479/10.15479/at-ista-19759">https://doi.org/10.15479/10.15479/at-ista-19759</a>
  chicago: Prach, Bernd. “Robust Image Classification with 1-Lipschitz Networks.”
    Institute of Science and Technology Austria, 2025. <a href="https://doi.org/10.15479/10.15479/at-ista-19759">https://doi.org/10.15479/10.15479/at-ista-19759</a>.
  ieee: B. Prach, “Robust image classification with 1-Lipschitz networks,” Institute
    of Science and Technology Austria, 2025.
  ista: Prach B. 2025. Robust image classification with 1-Lipschitz networks. Institute
    of Science and Technology Austria.
  mla: Prach, Bernd. <i>Robust Image Classification with 1-Lipschitz Networks</i>.
    Institute of Science and Technology Austria, 2025, doi:<a href="https://doi.org/10.15479/10.15479/at-ista-19759">10.15479/10.15479/at-ista-19759</a>.
  short: B. Prach, Robust Image Classification with 1-Lipschitz Networks, Institute
    of Science and Technology Austria, 2025.
corr_author: '1'
date_created: 2025-05-28T16:20:48Z
date_published: 2025-05-30T00:00:00Z
date_updated: 2026-04-07T11:49:52Z
day: '30'
ddc:
- '000'
degree_awarded: PhD
department:
- _id: GradSch
- _id: ChLa
doi: 10.15479/10.15479/at-ista-19759
file:
- access_level: open_access
  checksum: e5108e759014e2a9020c973c778fafc9
  content_type: application/pdf
  creator: bprach
  date_created: 2025-06-10T18:11:05Z
  date_updated: 2025-06-10T18:11:05Z
  file_id: '19829'
  file_name: ThesisFinal.pdf
  file_size: 3578077
  relation: main_file
- access_level: closed
  checksum: 51bf6c11fb6d8a9f8010b458c600a83f
  content_type: application/x-zip-compressed
  creator: bprach
  date_created: 2025-06-10T18:14:03Z
  date_updated: 2025-06-10T18:14:03Z
  file_id: '19830'
  file_name: ThesisFinal.zip
  file_size: 74894357
  relation: source_file
file_date_updated: 2025-06-10T18:14:03Z
has_accepted_license: '1'
language:
- iso: eng
month: '05'
oa: 1
oa_version: Published Version
page: '84'
publication_identifier:
  issn:
  - 2663-337X
publication_status: published
publisher: Institute of Science and Technology Austria
related_material:
  record:
  - id: '15039'
    relation: part_of_dissertation
    status: public
  - id: '18874'
    relation: part_of_dissertation
    status: public
  - id: '17426'
    relation: part_of_dissertation
    status: public
  - id: '11839'
    relation: part_of_dissertation
    status: public
status: public
supervisor:
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
title: Robust image classification with 1-Lipschitz networks
type: dissertation
user_id: ba8df636-2132-11f1-aed0-ed93e2281fdd
year: '2025'
...
---
DOAJ_listed: '1'
OA_place: publisher
OA_type: gold
_id: '18856'
abstract:
- lang: eng
  text: This research aims to solve the tweet/user geolocation prediction task
    and provide a flexible methodology for the geo-tagging of textual big data. The
    suggested approach implements neural networks for natural language processing
    (NLP) to estimate the location as coordinate pairs (longitude, latitude) and two-dimensional
    Gaussian Mixture Models (GMMs). The proposed models have been finetuned
    on a Twitter dataset using pretrained Bidirectional Encoder Representations from
    Transformers (BERT) as base models. Performance metrics show a median error of
    less than 30 km on the worldwide-level and less than 15 km on the US-level datasets
    for the models trained and evaluated on text features of tweets' content and metadata
    context. Our source code and data are available at https://github.com/K4TEL/geo-twitter.git.
acknowledgement: The authors acknowledge the Institute of Science and Technology (ISTA)
  for their material support and for granting access to the Twitter database archive,
  which was essential for the research.
article_processing_charge: Yes
article_type: original
author:
- first_name: Kateryna
  full_name: Lutsai, Kateryna
  last_name: Lutsai
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: Lutsai K, Lampert C. Predicting the geolocation of tweets using transformer
    models on customized data. <i>Journal of Spatial Information Science</i>. 2024;(29):69-99.
    doi:<a href="https://doi.org/10.5311/JOSIS.2024.29.295">10.5311/JOSIS.2024.29.295</a>
  apa: Lutsai, K., &#38; Lampert, C. (2024). Predicting the geolocation of tweets
    using transformer models on customized data. <i>Journal of Spatial Information
    Science</i>. University of Maine. <a href="https://doi.org/10.5311/JOSIS.2024.29.295">https://doi.org/10.5311/JOSIS.2024.29.295</a>
  chicago: Lutsai, Kateryna, and Christoph Lampert. “Predicting the Geolocation of
    Tweets Using Transformer Models on Customized Data.” <i>Journal of Spatial Information
    Science</i>. University of Maine, 2024. <a href="https://doi.org/10.5311/JOSIS.2024.29.295">https://doi.org/10.5311/JOSIS.2024.29.295</a>.
  ieee: K. Lutsai and C. Lampert, “Predicting the geolocation of tweets using transformer
    models on customized data,” <i>Journal of Spatial Information Science</i>, no.
    29. University of Maine, pp. 69–99, 2024.
  ista: Lutsai K, Lampert C. 2024. Predicting the geolocation of tweets using transformer
    models on customized data. Journal of Spatial Information Science. (29), 69–99.
  mla: Lutsai, Kateryna, and Christoph Lampert. “Predicting the Geolocation of Tweets
    Using Transformer Models on Customized Data.” <i>Journal of Spatial Information
    Science</i>, no. 29, University of Maine, 2024, pp. 69–99, doi:<a href="https://doi.org/10.5311/JOSIS.2024.29.295">10.5311/JOSIS.2024.29.295</a>.
  short: K. Lutsai, C. Lampert, Journal of Spatial Information Science (2024) 69–99.
corr_author: '1'
date_created: 2025-01-19T23:01:53Z
date_published: 2024-12-26T00:00:00Z
date_updated: 2025-06-05T13:47:12Z
day: '26'
ddc:
- '500'
department:
- _id: ChLa
doi: 10.5311/JOSIS.2024.29.295
file:
- access_level: open_access
  checksum: b82413f00398ffb5168e8e747571a98d
  content_type: application/pdf
  creator: dernst
  date_created: 2025-01-20T08:41:10Z
  date_updated: 2025-01-20T08:41:10Z
  file_id: '18857'
  file_name: 2024_JourSpatialInfoScience_Lutsai.pdf
  file_size: 7250655
  relation: main_file
  success: 1
file_date_updated: 2025-01-20T08:41:10Z
has_accepted_license: '1'
issue: '29'
language:
- iso: eng
month: '12'
oa: 1
oa_version: Published Version
page: 69-99
publication: Journal of Spatial Information Science
publication_identifier:
  eissn:
  - 1948-660X
publication_status: published
publisher: University of Maine
quality_controlled: '1'
related_material:
  link:
  - relation: software
    url: https://github.com/K4TEL/geo-twitter.git
scopus_import: '1'
status: public
title: Predicting the geolocation of tweets using transformer models on customized
  data
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/3.0/legalcode
  name: Creative Commons Attribution 3.0 Unported (CC BY 3.0)
  short: CC BY (3.0)
type: journal_article
user_id: 68b8ca59-c5b3-11ee-8790-cd641c68093d
year: '2024'
...
---
OA_place: publisher
OA_type: gold
_id: '18875'
abstract:
- lang: eng
  text: Current state-of-the-art methods for differentially private model training
    are based on matrix factorization techniques. However, these methods suffer from
    high computational overhead because they require numerically solving a demanding
    optimization problem to determine an approximately optimal factorization prior
    to the actual model training. In this work, we present a new matrix factorization
    approach, BSR, which overcomes this computational bottleneck. By exploiting properties
    of the standard matrix square root, BSR can also efficiently handle large-scale
    problems. For the key scenario of stochastic gradient descent with momentum and
    weight decay, we even derive analytical expressions for BSR that render the computational
    overhead negligible. We prove bounds on the approximation quality that hold both
    in the centralized and in the federated learning setting. Our numerical experiments
    demonstrate that models trained using BSR perform on par with the best existing
    methods, while completely avoiding their computational overhead.
alternative_title:
- Advances in Neural Information Processing Systems
article_processing_charge: No
arxiv: 1
author:
- first_name: Nikita
  full_name: Kalinin, Nikita
  id: 4b14526e-14d2-11ed-ba64-c14c9553d137
  last_name: Kalinin
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Kalinin N, Lampert C. Banded square root matrix factorization for differentially
    private model training. In: <i>38th Annual Conference on Neural Information Processing
    Systems</i>. Vol 37. Neural Information Processing Systems Foundation; 2024.'
  apa: 'Kalinin, N., &#38; Lampert, C. (2024). Banded square root matrix factorization
    for differentially private model training. In <i>38th Annual Conference on Neural
    Information Processing Systems</i> (Vol. 37). Vancouver, Canada: Neural Information
    Processing Systems Foundation.'
  chicago: Kalinin, Nikita, and Christoph Lampert. “Banded Square Root Matrix Factorization
    for Differentially Private Model Training.” In <i>38th Annual Conference on Neural
    Information Processing Systems</i>, Vol. 37. Neural Information Processing Systems
    Foundation, 2024.
  ieee: N. Kalinin and C. Lampert, “Banded square root matrix factorization for differentially
    private model training,” in <i>38th Annual Conference on Neural Information Processing
    Systems</i>, Vancouver, Canada, 2024, vol. 37.
  ista: 'Kalinin N, Lampert C. 2024. Banded square root matrix factorization for differentially
    private model training. 38th Annual Conference on Neural Information Processing
    Systems. NeurIPS: Neural Information Processing Systems, Advances in Neural Information
    Processing Systems, vol. 37.'
  mla: Kalinin, Nikita, and Christoph Lampert. “Banded Square Root Matrix Factorization
    for Differentially Private Model Training.” <i>38th Annual Conference on Neural
    Information Processing Systems</i>, vol. 37, Neural Information Processing Systems
    Foundation, 2024.
  short: N. Kalinin, C. Lampert, in:, 38th Annual Conference on Neural Information
    Processing Systems, Neural Information Processing Systems Foundation, 2024.
conference:
  end_date: 2024-12-16
  location: Vancouver, Canada
  name: 'NeurIPS: Neural Information Processing Systems'
  start_date: 2024-12-16
corr_author: '1'
date_created: 2025-01-24T17:58:16Z
date_published: 2024-12-01T00:00:00Z
date_updated: 2025-05-14T11:34:20Z
day: '01'
ddc:
- '000'
department:
- _id: GradSch
- _id: ChLa
external_id:
  arxiv:
  - '2405.13763'
file:
- access_level: open_access
  checksum: a216cab8eddc1fe7840aede0e2c0d41e
  content_type: application/pdf
  creator: dernst
  date_created: 2025-01-27T09:52:15Z
  date_updated: 2025-01-27T09:52:15Z
  file_id: '18888'
  file_name: 2024_NeurIPS_Nikita.pdf
  file_size: 1144656
  relation: main_file
  success: 1
file_date_updated: 2025-01-27T09:52:15Z
has_accepted_license: '1'
intvolume: '        37'
language:
- iso: eng
month: '12'
oa: 1
oa_version: Published Version
publication: 38th Annual Conference on Neural Information Processing Systems
publication_identifier:
  eissn:
  - 1049-5258
publication_status: published
publisher: Neural Information Processing Systems Foundation
quality_controlled: '1'
scopus_import: '1'
status: public
title: Banded square root matrix factorization for differentially private model training
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 37
year: '2024'
...
---
OA_place: publisher
OA_type: gold
_id: '18891'
abstract:
- lang: eng
  text: "Deep neural networks (DNNs) exhibit a surprising structure in their final
    layer\r\nknown as neural collapse (NC), and a growing body of work has recently
    investigated the propagation of neural collapse to earlier layers of DNNs – a
    phenomenon\r\ncalled deep neural collapse (DNC). However, existing theoretical
    results are restricted to special cases: linear models, only two layers or binary
    classification.\r\nIn contrast, we focus on non-linear models of arbitrary depth
    in multi-class classification and reveal a surprising qualitative shift. As soon
    as we go beyond two\r\nlayers or two classes, DNC stops being optimal for the
    deep unconstrained features\r\nmodel (DUFM) – the standard theoretical framework
    for the analysis of collapse.\r\nThe main culprit is a low-rank bias of multi-layer
    regularization schemes: this bias\r\nleads to optimal solutions of even lower
    rank than the neural collapse. We support\r\nour theoretical findings with experiments
    on both DUFM and real data, which show\r\nthe emergence of the low-rank structure
    in the solution found by gradient descent."
acknowledged_ssus:
- _id: ScienComp
acknowledgement: Marco Mondelli is partially supported by the 2019 Lopez-Loreta prize.
  This research was supported by the Scientific Service Units (SSU) of ISTA through
  resources provided by Scientific Computing (SciComp).
alternative_title:
- Advances in Neural Information Processing Systems
article_processing_charge: No
arxiv: 1
author:
- first_name: Peter
  full_name: Súkeník, Peter
  id: d64d6a8d-eb8e-11eb-b029-96fd216dec3c
  last_name: Súkeník
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
citation:
  ama: 'Súkeník P, Lampert C, Mondelli M. Neural collapse versus low-rank bias: Is
    deep neural collapse really optimal? In: <i>38th Annual Conference on Neural Information
    Processing Systems</i>. Vol 37. Neural Information Processing Systems Foundation;
    2024.'
  apa: 'Súkeník, P., Lampert, C., &#38; Mondelli, M. (2024). Neural collapse versus
    low-rank bias: Is deep neural collapse really optimal? In <i>38th Annual Conference
    on Neural Information Processing Systems</i> (Vol. 37). Vancouver, Canada: Neural
    Information Processing Systems Foundation.'
  chicago: 'Súkeník, Peter, Christoph Lampert, and Marco Mondelli. “Neural Collapse
    versus Low-Rank Bias: Is Deep Neural Collapse Really Optimal?” In <i>38th Annual
    Conference on Neural Information Processing Systems</i>, Vol. 37. Neural Information
    Processing Systems Foundation, 2024.'
  ieee: 'P. Súkeník, C. Lampert, and M. Mondelli, “Neural collapse versus low-rank
    bias: Is deep neural collapse really optimal?,” in <i>38th Annual Conference on
    Neural Information Processing Systems</i>, Vancouver, Canada, 2024, vol. 37.'
  ista: 'Súkeník P, Lampert C, Mondelli M. 2024. Neural collapse versus low-rank bias:
    Is deep neural collapse really optimal? 38th Annual Conference on Neural Information
    Processing Systems. NeurIPS: Neural Information Processing Systems, Advances in
    Neural Information Processing Systems, vol. 37.'
  mla: 'Súkeník, Peter, et al. “Neural Collapse versus Low-Rank Bias: Is Deep Neural
    Collapse Really Optimal?” <i>38th Annual Conference on Neural Information Processing
    Systems</i>, vol. 37, Neural Information Processing Systems Foundation, 2024.'
  short: P. Súkeník, C. Lampert, M. Mondelli, in:, 38th Annual Conference on Neural
    Information Processing Systems, Neural Information Processing Systems Foundation,
    2024.
conference:
  end_date: 2024-12-16
  location: Vancouver, Canada
  name: 'NeurIPS: Neural Information Processing Systems'
  start_date: 2024-12-16
corr_author: '1'
date_created: 2025-01-27T11:15:18Z
date_published: 2024-12-01T00:00:00Z
date_updated: 2025-06-04T07:19:21Z
day: '01'
ddc:
- '000'
department:
- _id: GradSch
- _id: MaMo
- _id: ChLa
external_id:
  arxiv:
  - '2405.14468'
file:
- access_level: open_access
  checksum: b7b79f1ea3ac1e9e11b3d91faaeb0780
  content_type: application/pdf
  creator: dernst
  date_created: 2025-02-04T08:11:25Z
  date_updated: 2025-02-04T08:11:25Z
  file_id: '18989'
  file_name: 2024_NeurIPS_Sukenik.pdf
  file_size: 1784118
  relation: main_file
  success: 1
file_date_updated: 2025-02-04T08:11:25Z
has_accepted_license: '1'
intvolume: '        37'
language:
- iso: eng
month: '12'
oa: 1
oa_version: Published Version
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loreta 2019 - Marco Mondelli
publication: 38th Annual Conference on Neural Information Processing Systems
publication_status: published
publisher: Neural Information Processing Systems Foundation
quality_controlled: '1'
status: public
title: 'Neural collapse versus low-rank bias: Is deep neural collapse really optimal?'
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 37
year: '2024'
...
---
OA_place: repository
OA_type: green
_id: '19063'
abstract:
- lang: eng
  text: "Instruction-tuned Large Language Models (LLMs) show impressive results in
    numerous practical applications, but they lack essential safety features that
    are common in other areas of computer science, particularly an explicit separation
    of instructions and data. This makes them vulnerable to manipulations such as
    indirect prompt injections and generally unsuitable for safety-critical tasks.
    Surprisingly, there is currently no established definition or benchmark to quantify
    this phenomenon. In this work, we close this gap by introducing a formal measure
    for instruction-data separation and an empirical variant that is calculable from
    a model's outputs. We also present a new dataset, SEP, that allows estimating
    the measure for real-world models. Our results on various LLMs show that the problem
    of instruction-data separation is real: all models fail to achieve high separation,
    and canonical mitigation techniques, such as prompt engineering and fine-tuning,
    either fail to substantially improve separation or reduce model utility. The source
    code and SEP dataset are openly accessible at https://github.com/egozverev/Shold-It-Be-Executed-Or-Processed."
acknowledged_ssus:
- _id: ScienComp
acknowledgement: The authors would like to sincerely thank Juan Rocamonde for valuable
  feedback to our manuscript. We acknowledge the support from the Scientific Service
  Units (SSU) of ISTA through resources provided by Scientific Computing (SciComp).
  We thank Dan Alistarh for providing us with computational resources. This work was
  partially funded by the German Federal Ministry of Education and Research (BMBF)
  under the grant AIgenCY (16KIS2012) and ELSA – European Lighthouse on Secure and
  Safe AI funded by the European Union under grant agreement No. 101070617. Views
  and opinions expressed are however those of the authors only and do not necessarily
  reflect those of the European Union or European Commission. Neither the European
  Union nor the European Commission can be held responsible for them.
article_number: '2403.06833'
article_processing_charge: No
arxiv: 1
author:
- first_name: Egor
  full_name: Zverev, Egor
  id: 05162b19-1340-11ed-8f02-fa94e0e8c3bc
  last_name: Zverev
- first_name: Sahar
  full_name: Abdelnabi, Sahar
  last_name: Abdelnabi
- first_name: Soroush
  full_name: Tabesh, Soroush
  id: 06000900-6068-11ef-8d61-c2472ef2e752
  last_name: Tabesh
  orcid: 0009-0003-4119-6281
- first_name: Mario
  full_name: Fritz, Mario
  last_name: Fritz
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: Zverev E, Abdelnabi S, Tabesh S, Fritz M, Lampert C. Can LLMs separate instructions
    from data? And what do we even mean by that? <i>arXiv</i>. 2024. doi:<a href="https://doi.org/10.48550/arXiv.2403.06833">10.48550/arXiv.2403.06833</a>
  apa: Zverev, E., Abdelnabi, S., Tabesh, S., Fritz, M., &#38; Lampert, C. (2024).
    Can LLMs separate instructions from data? And what do we even mean by that? <i>arXiv</i>.
    <a href="https://doi.org/10.48550/arXiv.2403.06833">https://doi.org/10.48550/arXiv.2403.06833</a>
  chicago: Zverev, Egor, Sahar Abdelnabi, Soroush Tabesh, Mario Fritz, and Christoph
    Lampert. “Can LLMs Separate Instructions from Data? And What Do We Even Mean by
    That?” <i>ArXiv</i>, 2024. <a href="https://doi.org/10.48550/arXiv.2403.06833">https://doi.org/10.48550/arXiv.2403.06833</a>.
  ieee: E. Zverev, S. Abdelnabi, S. Tabesh, M. Fritz, and C. Lampert, “Can LLMs separate
    instructions from data? And what do we even mean by that?,” <i>arXiv</i>. 2024.
  ista: Zverev E, Abdelnabi S, Tabesh S, Fritz M, Lampert C. 2024. Can LLMs separate
    instructions from data? And what do we even mean by that? arXiv, 2403.06833.
  mla: Zverev, Egor, et al. “Can LLMs Separate Instructions from Data? And What Do
    We Even Mean by That?” <i>ArXiv</i>, 2403.06833, 2024, doi:<a href="https://doi.org/10.48550/arXiv.2403.06833">10.48550/arXiv.2403.06833</a>.
  short: E. Zverev, S. Abdelnabi, S. Tabesh, M. Fritz, C. Lampert, ArXiv (2024).
corr_author: '1'
date_created: 2025-02-20T10:13:42Z
date_published: 2024-03-01T00:00:00Z
date_updated: 2025-02-24T12:52:23Z
day: '01'
ddc:
- '000'
department:
- _id: GradSch
- _id: ChLa
doi: 10.48550/arXiv.2403.06833
external_id:
  arxiv:
  - '2403.06833'
file:
- access_level: open_access
  checksum: 35eb43968684b87be59144603ef10af0
  content_type: application/pdf
  creator: ezverev
  date_created: 2025-02-20T10:11:45Z
  date_updated: 2025-02-20T10:11:45Z
  file_id: '19064'
  file_name: 2403.06833v3.pdf
  file_size: 530972
  relation: main_file
  success: 1
file_date_updated: 2025-02-20T10:11:45Z
has_accepted_license: '1'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2403.06833
month: '03'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: published
related_material:
  link:
  - relation: software
    url: https://github.com/egozverev/Shold-It-Be-Executed-Or-Processed
status: public
title: Can LLMs separate instructions from data? And what do we even mean by that?
tmp:
  image: /images/cc_by_sa.png
  legal_code_url: https://creativecommons.org/licenses/by-sa/4.0/legalcode
  name: Creative Commons Attribution-ShareAlike 4.0 International Public License (CC
    BY-SA 4.0)
  short: CC BY-SA (4.0)
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2024'
...
---
OA_place: publisher
OA_type: diamond
_id: '19408'
abstract:
- lang: eng
  text: 'Continual learning is a subfield of machine learning, which aims to allow
    machine learning models to continuously learn on new data, by accumulating knowledge
    without forgetting what was learned in the past. In this work, we take a step
    back, and ask: "Why should one care about continual learning in the first place?".
    We set the stage by examining recent continual learning papers published at four
    major machine learning conferences, and show that memory-constrained settings
    dominate the field. Then, we discuss five open problems in machine learning, and
    even though they might seem unrelated to continual learning at first sight, we
    show that continual learning will inevitably be part of their solution. These
    problems are model editing, personalization and specialization, on-device learning,
    faster (re-)training and reinforcement learning. Finally, by comparing the desiderata
    from these unsolved problems and the current assumptions in continual learning,
    we highlight and discuss four future directions for continual learning research.
    We hope that this work offers an interesting perspective on the future of continual
    learning, while displaying its potential value and the paths we have to pursue
    in order to make it successful. This work is the result of the many discussions
    the authors had at the Dagstuhl seminar on Deep Continual Learning, in March 2023.'
alternative_title:
- TMLR
article_processing_charge: No
article_type: original
arxiv: 1
author:
- first_name: Eli
  full_name: Verwimp, Eli
  last_name: Verwimp
- first_name: Rahaf
  full_name: Aljundi, Rahaf
  last_name: Aljundi
- first_name: Shai
  full_name: Ben-David, Shai
  last_name: Ben-David
- first_name: Matthias
  full_name: Bethge, Matthias
  last_name: Bethge
- first_name: Andrea
  full_name: Cossu, Andrea
  last_name: Cossu
- first_name: Alexander
  full_name: Gepperth, Alexander
  last_name: Gepperth
- first_name: Tyler L.
  full_name: Hayes, Tyler L.
  last_name: Hayes
- first_name: Eyke
  full_name: Hüllermeier, Eyke
  last_name: Hüllermeier
- first_name: Christopher
  full_name: Kanan, Christopher
  last_name: Kanan
- first_name: Dhireesha
  full_name: Kudithipudi, Dhireesha
  last_name: Kudithipudi
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
- first_name: Martin
  full_name: Mundt, Martin
  last_name: Mundt
- first_name: Razvan
  full_name: Pascanu, Razvan
  last_name: Pascanu
- first_name: Adrian
  full_name: Popescu, Adrian
  last_name: Popescu
- first_name: Andreas S.
  full_name: Tolias, Andreas S.
  last_name: Tolias
- first_name: Joost
  full_name: Van De Weijer, Joost
  last_name: Van De Weijer
- first_name: Bing
  full_name: Liu, Bing
  last_name: Liu
- first_name: Vincenzo
  full_name: Lomonaco, Vincenzo
  last_name: Lomonaco
- first_name: Tinne
  full_name: Tuytelaars, Tinne
  last_name: Tuytelaars
- first_name: Gido M.
  full_name: Van De Ven, Gido M.
  last_name: Van De Ven
citation:
  ama: 'Verwimp E, Aljundi R, Ben-David S, et al. Continual learning: Applications
    and the road forward. <i>Transactions on Machine Learning Research</i>. 2024;2024.'
  apa: 'Verwimp, E., Aljundi, R., Ben-David, S., Bethge, M., Cossu, A., Gepperth,
    A., … Van De Ven, G. M. (2024). Continual learning: Applications and the road
    forward. <i>Transactions on Machine Learning Research</i>. Transactions on Machine
    Learning Research.'
  chicago: 'Verwimp, Eli, Rahaf Aljundi, Shai Ben-David, Matthias Bethge, Andrea Cossu,
    Alexander Gepperth, Tyler L. Hayes, et al. “Continual Learning: Applications and
    the Road Forward.” <i>Transactions on Machine Learning Research</i>. Transactions
    on Machine Learning Research, 2024.'
  ieee: 'E. Verwimp <i>et al.</i>, “Continual learning: Applications and the road
    forward,” <i>Transactions on Machine Learning Research</i>, vol. 2024. Transactions
    on Machine Learning Research, 2024.'
  ista: 'Verwimp E, Aljundi R, Ben-David S, Bethge M, Cossu A, Gepperth A, Hayes TL,
    Hüllermeier E, Kanan C, Kudithipudi D, Lampert C, Mundt M, Pascanu R, Popescu
    A, Tolias AS, Van De Weijer J, Liu B, Lomonaco V, Tuytelaars T, Van De Ven GM.
    2024. Continual learning: Applications and the road forward. Transactions on Machine
    Learning Research. 2024.'
  mla: 'Verwimp, Eli, et al. “Continual Learning: Applications and the Road Forward.”
    <i>Transactions on Machine Learning Research</i>, vol. 2024, Transactions on Machine
    Learning Research, 2024.'
  short: E. Verwimp, R. Aljundi, S. Ben-David, M. Bethge, A. Cossu, A. Gepperth, T.L.
    Hayes, E. Hüllermeier, C. Kanan, D. Kudithipudi, C. Lampert, M. Mundt, R. Pascanu,
    A. Popescu, A.S. Tolias, J. Van De Weijer, B. Liu, V. Lomonaco, T. Tuytelaars,
    G.M. Van De Ven, Transactions on Machine Learning Research 2024 (2024).
date_created: 2025-03-16T23:01:25Z
date_published: 2024-04-12T00:00:00Z
date_updated: 2025-03-20T09:21:02Z
day: '12'
ddc:
- '000'
department:
- _id: ChLa
external_id:
  arxiv:
  - '2311.11908'
file:
- access_level: open_access
  checksum: 0714e12f7423cd098976ed9974561155
  content_type: application/pdf
  creator: dernst
  date_created: 2025-03-20T09:02:18Z
  date_updated: 2025-03-20T09:02:18Z
  file_id: '19426'
  file_name: 2024_TMLR_Verwimp.pdf
  file_size: 1367966
  relation: main_file
  success: 1
file_date_updated: 2025-03-20T09:02:18Z
has_accepted_license: '1'
intvolume: '      2024'
language:
- iso: eng
month: '04'
oa: 1
oa_version: Published Version
publication: Transactions on Machine Learning Research
publication_identifier:
  eissn:
  - 2835-8856
publication_status: published
publisher: Transactions on Machine Learning Research
quality_controlled: '1'
scopus_import: '1'
status: public
title: 'Continual learning: Applications and the road forward'
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: journal_article
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 2024
year: '2024'
...
---
_id: '17093'
abstract:
- lang: eng
  text: 'Federated Learning (FL) enables large-scale distributed training of machine
    learning models, while still allowing individual nodes to maintain data locally.
    However, executing FL at scale comes with inherent practical challenges: 1) heterogeneity
    of the local node data distributions, 2) heterogeneity of node computational speeds
    (asynchrony), but also 3) constraints in the amount of communication between the
    clients and the server. In this work, we present the first variant of the classic
    federated averaging (FedAvg) algorithm which, at the same time, supports data
    heterogeneity, partial client asynchrony, and communication compression. Our algorithm
    comes with a novel, rigorous analysis showing that, in spite of these system relaxations,
    it can provide similar convergence to FedAvg in interesting parameter regimes.
    Experimental results in the rigorous LEAF benchmark on setups of up to 300 nodes
    show that our algorithm ensures fast convergence for standard federated tasks,
    improving upon prior quantized and asynchronous approaches.'
alternative_title:
- PMLR
article_processing_charge: No
arxiv: 1
author:
- first_name: Hossein
  full_name: Zakerinia, Hossein
  id: 653bd8b6-f394-11eb-9cf6-c0bbf6cd78d4
  last_name: Zakerinia
- first_name: Shayan
  full_name: Talaei, Shayan
  last_name: Talaei
- first_name: Giorgi
  full_name: Nadiradze, Giorgi
  id: 3279A00C-F248-11E8-B48F-1D18A9856A87
  last_name: Nadiradze
  orcid: 0000-0001-5634-0731
- first_name: Dan-Adrian
  full_name: Alistarh, Dan-Adrian
  id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
  last_name: Alistarh
  orcid: 0000-0003-3650-940X
citation:
  ama: 'Zakerinia H, Talaei S, Nadiradze G, Alistarh D-A. Communication-efficient
    federated learning with data and client heterogeneity. In: <i>Proceedings of the
    27th International Conference on Artificial Intelligence and Statistics</i>. Vol
    238. ML Research Press; 2024:3448-3456.'
  apa: 'Zakerinia, H., Talaei, S., Nadiradze, G., &#38; Alistarh, D.-A. (2024). Communication-efficient
    federated learning with data and client heterogeneity. In <i>Proceedings of the
    27th International Conference on Artificial Intelligence and Statistics</i> (Vol.
    238, pp. 3448–3456). Valencia, Spain: ML Research Press.'
  chicago: Zakerinia, Hossein, Shayan Talaei, Giorgi Nadiradze, and Dan-Adrian Alistarh.
    “Communication-Efficient Federated Learning with Data and Client Heterogeneity.”
    In <i>Proceedings of the 27th International Conference on Artificial Intelligence
    and Statistics</i>, 238:3448–56. ML Research Press, 2024.
  ieee: H. Zakerinia, S. Talaei, G. Nadiradze, and D.-A. Alistarh, “Communication-efficient
    federated learning with data and client heterogeneity,” in <i>Proceedings of the
    27th International Conference on Artificial Intelligence and Statistics</i>, Valencia,
    Spain, 2024, vol. 238, pp. 3448–3456.
  ista: 'Zakerinia H, Talaei S, Nadiradze G, Alistarh D-A. 2024. Communication-efficient
    federated learning with data and client heterogeneity. Proceedings of the 27th
    International Conference on Artificial Intelligence and Statistics. AISTATS: Conference
    on Artificial Intelligence and Statistics, PMLR, vol. 238, 3448–3456.'
  mla: Zakerinia, Hossein, et al. “Communication-Efficient Federated Learning with
    Data and Client Heterogeneity.” <i>Proceedings of the 27th International Conference
    on Artificial Intelligence and Statistics</i>, vol. 238, ML Research Press, 2024,
    pp. 3448–56.
  short: H. Zakerinia, S. Talaei, G. Nadiradze, D.-A. Alistarh, in:, Proceedings of
    the 27th International Conference on Artificial Intelligence and Statistics, ML
    Research Press, 2024, pp. 3448–3456.
conference:
  end_date: 2024-05-04
  location: Valencia, Spain
  name: 'AISTATS: Conference on Artificial Intelligence and Statistics'
  start_date: 2024-05-02
corr_author: '1'
date_created: 2024-06-02T22:00:57Z
date_published: 2024-05-01T00:00:00Z
date_updated: 2024-10-09T21:08:57Z
day: '01'
department:
- _id: DaAl
- _id: ChLa
external_id:
  arxiv:
  - '2206.10032'
intvolume: '       238'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2206.10032
month: '05'
oa: 1
oa_version: Preprint
page: 3448-3456
publication: Proceedings of the 27th International Conference on Artificial Intelligence
  and Statistics
publication_identifier:
  eissn:
  - 2640-3498
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
scopus_import: '1'
status: public
title: Communication-efficient federated learning with data and client heterogeneity
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 238
year: '2024'
...
---
_id: '17411'
abstract:
- lang: eng
  text: "We present PeFLL, a new personalized federated learning algorithm that improves
    over the state-of-the-art in three aspects: 1) it produces more accurate models,
    especially in the low-data regime, and not only for clients present during its
    training phase, but also for any that may emerge in the future; 2) it reduces the
    amount of on-client computation and client-server communication by providing
    future clients with ready-to-use personalized models that require no additional
    finetuning or optimization; 3) it comes with theoretical guarantees that establish
    generalization from the observed clients to future ones. At the core of PeFLL lies
    a learning-to-learn approach that jointly trains an embedding network and a
    hypernetwork. The embedding network is used to represent clients in a latent
    descriptor space in a way that reflects their similarity to each other. The
    hypernetwork takes as input such descriptors and outputs the parameters of fully
    personalized client models. In combination, both networks constitute a learning
    algorithm that achieves state-of-the-art performance in several personalized
    federated learning benchmarks."
acknowledged_ssus:
- _id: ScienComp
acknowledgement: "This research was supported by the Scientific Service Units (SSU)
  of ISTA through resources provided by Scientific Computing (SciComp)."
article_processing_charge: No
arxiv: 1
author:
- first_name: Jonathan A
  full_name: Scott, Jonathan A
  id: e499926b-f6e0-11ea-865d-9c63db0031e8
  last_name: Scott
- first_name: Hossein
  full_name: Zakerinia, Hossein
  id: 653bd8b6-f394-11eb-9cf6-c0bbf6cd78d4
  last_name: Zakerinia
  orcid: 0009-0007-3977-6462
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Scott JA, Zakerinia H, Lampert C. PEFLL: Personalized federated learning by
    learning to learn. In: <i>12th International Conference on Learning Representations</i>.
    OpenReview; 2024.'
  apa: 'Scott, J. A., Zakerinia, H., &#38; Lampert, C. (2024). PEFLL: Personalized
    federated learning by learning to learn. In <i>12th International Conference on
    Learning Representations</i>. Vienna, Austria: OpenReview.'
  chicago: 'Scott, Jonathan A, Hossein Zakerinia, and Christoph Lampert. “PEFLL: Personalized
    Federated Learning by Learning to Learn.” In <i>12th International Conference
    on Learning Representations</i>. OpenReview, 2024.'
  ieee: 'J. A. Scott, H. Zakerinia, and C. Lampert, “PEFLL: Personalized federated
    learning by learning to learn,” in <i>12th International Conference on Learning
    Representations</i>, Vienna, Austria, 2024.'
  ista: 'Scott JA, Zakerinia H, Lampert C. 2024. PEFLL: Personalized federated learning
    by learning to learn. 12th International Conference on Learning Representations.
    ICLR: International Conference on Learning Representations.'
  mla: 'Scott, Jonathan A., et al. “PEFLL: Personalized Federated Learning by Learning
    to Learn.” <i>12th International Conference on Learning Representations</i>, OpenReview,
    2024.'
  short: J.A. Scott, H. Zakerinia, C. Lampert, in:, 12th International Conference
    on Learning Representations, OpenReview, 2024.
conference:
  end_date: 2024-03-07
  location: Vienna, Austria
  name: 'ICLR: International Conference on Learning Representations'
  start_date: 2024-03-07
corr_author: '1'
date_created: 2024-08-11T22:01:12Z
date_published: 2024-03-07T00:00:00Z
date_updated: 2026-04-07T11:46:11Z
day: '07'
ddc:
- '000'
department:
- _id: ChLa
external_id:
  arxiv:
  - '2306.05515'
file:
- access_level: open_access
  checksum: 81b7ea2e667adaf9c7a7b6b376b1f251
  content_type: application/pdf
  creator: dernst
  date_created: 2024-08-12T07:38:06Z
  date_updated: 2024-08-12T07:38:06Z
  file_id: '17415'
  file_name: 2024_ICLR_Scott.pdf
  file_size: 1029219
  relation: main_file
  success: 1
file_date_updated: 2024-08-12T07:38:06Z
has_accepted_license: '1'
language:
- iso: eng
month: '03'
oa: 1
oa_version: Published Version
publication: 12th International Conference on Learning Representations
publication_status: published
publisher: OpenReview
quality_controlled: '1'
related_material:
  record:
  - id: '21198'
    relation: dissertation_contains
    status: public
scopus_import: '1'
status: public
title: 'PEFLL: Personalized federated learning by learning to learn'
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2024'
...
---
_id: '18120'
abstract:
- lang: eng
  text: In practice, training using federated learning can be orders of magnitude
    slower than standard centralized training. This severely limits the amount of
    experimentation and tuning that can be done, making it challenging to obtain good
    performance on a given task. Server-side proxy data can be used to run training
    simulations, for instance for hyperparameter tuning. This can greatly speed up
    the training pipeline by reducing the number of tuning runs to be performed overall
    on the true clients. However, it is challenging to ensure that these simulations
    accurately reflect the dynamics of the real federated training. In particular,
    the proxy data used for simulations often comes as a single centralized dataset
    without a partition into distinct clients, and partitioning this data in a naive
    way can lead to simulations that poorly reflect real federated training. In this
    paper we address the challenge of how to partition centralized data in a way that
    reflects the statistical heterogeneity of the true federated clients. We propose
    a fully federated, theoretically justified, algorithm that efficiently learns
    the distribution of the true clients and observe improved server-side simulations
    when using the inferred distribution to create simulated clients from the centralized
    data.
acknowledgement: 'We would like to thank: Mona Chitnis and everyone in the Private
  Federated Learning team at Apple for their help and support throughout the entire
  project; Audra McMillan, Martin Pelikan, Anosh Raj and Barry Theobold for feedback
  on the initial versions of the paper; and Christoph Lampert for valuable feedback
  on the paper structure and suggestions for additional experiments.'
alternative_title:
- PMLR
article_processing_charge: No
arxiv: 1
author:
- first_name: Jonathan A
  full_name: Scott, Jonathan A
  id: e499926b-f6e0-11ea-865d-9c63db0031e8
  last_name: Scott
- first_name: Áine
  full_name: Cahill, Áine
  last_name: Cahill
citation:
  ama: 'Scott JA, Cahill Á. Improved modelling of federated datasets using mixtures-of-Dirichlet-multinomials.
    In: <i>Proceedings of the 41st International Conference on Machine Learning</i>.
    Vol 235. ML Research Press; 2024:44012-44037.'
  apa: 'Scott, J. A., &#38; Cahill, Á. (2024). Improved modelling of federated datasets
    using mixtures-of-Dirichlet-multinomials. In <i>Proceedings of the 41st International
    Conference on Machine Learning</i> (Vol. 235, pp. 44012–44037). Vienna, Austria:
    ML Research Press.'
  chicago: Scott, Jonathan A, and Áine Cahill. “Improved Modelling of Federated Datasets
    Using Mixtures-of-Dirichlet-Multinomials.” In <i>Proceedings of the 41st International
    Conference on Machine Learning</i>, 235:44012–37. ML Research Press, 2024.
  ieee: J. A. Scott and Á. Cahill, “Improved modelling of federated datasets using
    mixtures-of-Dirichlet-multinomials,” in <i>Proceedings of the 41st International
    Conference on Machine Learning</i>, Vienna, Austria, 2024, vol. 235, pp. 44012–44037.
  ista: 'Scott JA, Cahill Á. 2024. Improved modelling of federated datasets using
    mixtures-of-Dirichlet-multinomials. Proceedings of the 41st International Conference
    on Machine Learning. ICML: International Conference on Machine Learning, PMLR,
    vol. 235, 44012–44037.'
  mla: Scott, Jonathan A., and Áine Cahill. “Improved Modelling of Federated Datasets
    Using Mixtures-of-Dirichlet-Multinomials.” <i>Proceedings of the 41st International
    Conference on Machine Learning</i>, vol. 235, ML Research Press, 2024, pp. 44012–37.
  short: J.A. Scott, Á. Cahill, in:, Proceedings of the 41st International Conference
    on Machine Learning, ML Research Press, 2024, pp. 44012–44037.
conference:
  end_date: 2024-07-27
  location: Vienna, Austria
  name: 'ICML: International Conference on Machine Learning'
  start_date: 2024-07-21
corr_author: '1'
date_created: 2024-09-22T22:01:45Z
date_published: 2024-09-01T00:00:00Z
date_updated: 2026-04-07T11:46:11Z
day: '01'
department:
- _id: ChLa
external_id:
  arxiv:
  - '2406.02416'
intvolume: '       235'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2406.02416
month: '09'
oa: 1
oa_version: Preprint
page: 44012-44037
publication: Proceedings of the 41st International Conference on Machine Learning
publication_identifier:
  eissn:
  - 2640-3498
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
related_material:
  record:
  - id: '21198'
    relation: dissertation_contains
    status: public
scopus_import: '1'
status: public
title: Improved modelling of federated datasets using mixtures-of-Dirichlet-multinomials
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 235
year: '2024'
...
---
OA_place: repository
OA_type: green
_id: '17426'
abstract:
- lang: eng
  text: "The robustness of neural networks against input perturbations with bounded
    magnitude represents a serious concern in the deployment of deep learning models
    in safety-critical systems. Recently, the scientific community has focused on
    enhancing certifiable robustness guarantees by crafting 1-Lipschitz neural networks
    that leverage Lipschitz bounded dense and convolutional layers. Although different
    methods have been proposed in the literature to achieve this goal, understanding
    the performance of such methods is not straightforward, since different metrics
    can be relevant (e.g., training time, memory usage, accuracy, certifiable robustness)
    for different applications. For this reason, this work provides a thorough theoretical
    and empirical comparison between methods by evaluating them in terms of memory
    usage, speed, and certifiable robust accuracy. The paper also provides some guidelines
    and recommendations to support the user in selecting the methods that work best
    depending on the available resources. We provide code at https://github.com/berndprach/1LipschitzLayersCompared."
acknowledgement: "This work was partially supported by project SERICS (PE00000014)
  under the MUR National Recovery and Resilience Plan funded by the European Union
  - NextGenerationEU."
article_processing_charge: No
arxiv: 1
author:
- first_name: Bernd
  full_name: Prach, Bernd
  id: 2D561D42-C427-11E9-89B4-9C1AE6697425
  last_name: Prach
- first_name: Fabio
  full_name: Brau, Fabio
  last_name: Brau
- first_name: Giorgio
  full_name: Buttazzo, Giorgio
  last_name: Buttazzo
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Prach B, Brau F, Buttazzo G, Lampert C. 1-Lipschitz layers compared: Memory,
    speed, and certifiable robustness. In: <i>Proceedings of the IEEE/CVF Conference
    on Computer Vision and Pattern Recognition</i>. Computer Vision Foundation; 2024:24574-24583.
    doi:<a href="https://doi.org/10.1109/CVPR52733.2024.02320">10.1109/CVPR52733.2024.02320</a>'
  apa: 'Prach, B., Brau, F., Buttazzo, G., &#38; Lampert, C. (2024). 1-Lipschitz layers
    compared: Memory, speed, and certifiable robustness. In <i>Proceedings of the
    IEEE/CVF Conference on Computer Vision and Pattern Recognition</i> (pp. 24574–24583).
    Seattle, WA, United States: Computer Vision Foundation. <a href="https://doi.org/10.1109/CVPR52733.2024.02320">https://doi.org/10.1109/CVPR52733.2024.02320</a>'
  chicago: 'Prach, Bernd, Fabio Brau, Giorgio Buttazzo, and Christoph Lampert. “1-Lipschitz
    Layers Compared: Memory, Speed, and Certifiable Robustness.” In <i>Proceedings
    of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</i>, 24574–83.
    Computer Vision Foundation, 2024. <a href="https://doi.org/10.1109/CVPR52733.2024.02320">https://doi.org/10.1109/CVPR52733.2024.02320</a>.'
  ieee: 'B. Prach, F. Brau, G. Buttazzo, and C. Lampert, “1-Lipschitz layers compared:
    Memory, speed, and certifiable robustness,” in <i>Proceedings of the IEEE/CVF
    Conference on Computer Vision and Pattern Recognition</i>, Seattle, WA, United
    States, 2024, pp. 24574–24583.'
  ista: 'Prach B, Brau F, Buttazzo G, Lampert C. 2024. 1-Lipschitz layers compared:
    Memory, speed, and certifiable robustness. Proceedings of the IEEE/CVF Conference
    on Computer Vision and Pattern Recognition. CVPR: Conference on Computer Vision
    and Pattern Recognition, 24574–24583.'
  mla: 'Prach, Bernd, et al. “1-Lipschitz Layers Compared: Memory, Speed, and Certifiable
    Robustness.” <i>Proceedings of the IEEE/CVF Conference on Computer Vision and
    Pattern Recognition</i>, Computer Vision Foundation, 2024, pp. 24574–83, doi:<a
    href="https://doi.org/10.1109/CVPR52733.2024.02320">10.1109/CVPR52733.2024.02320</a>.'
  short: B. Prach, F. Brau, G. Buttazzo, C. Lampert, in:, Proceedings of the IEEE/CVF
    Conference on Computer Vision and Pattern Recognition, Computer Vision Foundation,
    2024, pp. 24574–24583.
conference:
  end_date: 2024-06-22
  location: Seattle, WA, United States
  name: 'CVPR: Conference on Computer Vision and Pattern Recognition'
  start_date: 2024-06-16
corr_author: '1'
date_created: 2024-08-14T08:42:32Z
date_published: 2024-06-01T00:00:00Z
date_updated: 2026-04-07T11:49:51Z
day: '01'
department:
- _id: GradSch
- _id: ChLa
doi: 10.1109/CVPR52733.2024.02320
external_id:
  arxiv:
  - '2311.16833'
  isi:
  - '001344387500055'
has_accepted_license: '1'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2311.16833
month: '06'
oa: 1
oa_version: Preprint
page: 24574-24583
publication: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern
  Recognition
publication_status: published
publisher: Computer Vision Foundation
quality_controlled: '1'
related_material:
  link:
  - relation: software
    url: https://github.com/berndprach/1LipschitzLayersCompared
  record:
  - id: '19759'
    relation: dissertation_contains
    status: public
status: public
title: '1-Lipschitz layers compared: Memory, speed, and certifiable robustness'
type: conference
user_id: 317138e5-6ab7-11ef-aa6d-ffef3953e345
year: '2024'
...
---
OA_place: repository
_id: '18874'
abstract:
- lang: eng
  text: "Despite extensive research since the community learned about adversarial\r\nexamples
    10 years ago, we still do not know how to train high-accuracy\r\nclassifiers that
    are guaranteed to be robust to small perturbations of their\r\ninputs. Previous
    works often argued that this might be because no classifier\r\nexists that is
    robust and accurate at the same time. However, in computer\r\nvision this assumption
    does not match reality where humans are usually accurate\r\nand robust on most
    tasks of interest. We offer an alternative explanation and\r\nshow that in certain
    settings robust generalization is only possible with\r\nunrealistically large
    amounts of data. More precisely we find a setting where a\r\nrobust classifier
    exists, it is easy to learn an accurate classifier, yet it\r\nrequires an exponential
    amount of data to learn a robust classifier. Based on\r\nthis theoretical result,
    we explore how well robust classifiers generalize on\r\ndatasets such as CIFAR-10.
    We come to the conclusion that on these datasets, the\r\nlimitation of current
    robust models also lies in the generalization, and that\r\nthey require a lot
    of data to do well on the test set. We also show that the\r\nproblem is not in
    the expressiveness or generalization capabilities of current\r\narchitectures,
    and that there are low magnitude features in the data which are\r\nuseful for
    non-robust generalization but are not available for robust\r\nclassifiers."
article_number: '2412.04245'
article_processing_charge: No
arxiv: 1
author:
- first_name: Bernd
  full_name: Prach, Bernd
  id: 2D561D42-C427-11E9-89B4-9C1AE6697425
  last_name: Prach
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: Prach B, Lampert C. Intriguing properties of robust classification. <i>arXiv</i>.
    doi:<a href="https://doi.org/10.48550/arXiv.2412.04245">10.48550/arXiv.2412.04245</a>
  apa: Prach, B., &#38; Lampert, C. (n.d.). Intriguing properties of robust classification.
    <i>arXiv</i>. <a href="https://doi.org/10.48550/arXiv.2412.04245">https://doi.org/10.48550/arXiv.2412.04245</a>
  chicago: Prach, Bernd, and Christoph Lampert. “Intriguing Properties of Robust Classification.”
    <i>ArXiv</i>, n.d. <a href="https://doi.org/10.48550/arXiv.2412.04245">https://doi.org/10.48550/arXiv.2412.04245</a>.
  ieee: B. Prach and C. Lampert, “Intriguing properties of robust classification,”
    <i>arXiv</i>.
  ista: Prach B, Lampert C. Intriguing properties of robust classification. arXiv,
    2412.04245.
  mla: Prach, Bernd, and Christoph Lampert. “Intriguing Properties of Robust Classification.”
    <i>ArXiv</i>, 2412.04245, doi:<a href="https://doi.org/10.48550/arXiv.2412.04245">10.48550/arXiv.2412.04245</a>.
  short: B. Prach, C. Lampert, ArXiv (n.d.).
corr_author: '1'
date_created: 2025-01-24T16:57:29Z
date_published: 2024-12-05T00:00:00Z
date_updated: 2026-04-07T11:49:51Z
day: '05'
department:
- _id: GradSch
- _id: ChLa
doi: 10.48550/arXiv.2412.04245
external_id:
  arxiv:
  - '2412.04245'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2412.04245
month: '12'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: draft
related_material:
  record:
  - id: '20455'
    relation: later_version
    status: public
  - id: '19759'
    relation: dissertation_contains
    status: public
status: public
title: Intriguing properties of robust classification
type: preprint
user_id: 8b945eb4-e2f2-11eb-945a-df72226e66a9
year: '2024'
...
