---
OA_place: repository
OA_type: green
_id: '21859'
abstract:
- lang: eng
  text: As artificial neural networks, and specifically large language models, have
    improved rapidly in capabilities and quality, they have increasingly been deployed
    in real-world applications, from customer service to Google search, despite the
    fact that they frequently make factually incorrect or undesirable statements.
    This trend has inspired practical and academic interest in model editing, that
    is, in adjusting the weights of the model to modify its likely outputs for queries
    relating to a specific fact or set of facts. This may be done either to amend
    a fact or set of facts, for instance, to fix a frequent error in the training
    data, or to suppress a fact or set of facts entirely, for instance, in case of
    dangerous knowledge. Multiple methods have been proposed to perform such edits. However,
    it has also been shown that such model editing can be brittle and
    incomplete. Moreover, the effectiveness of any model editing method necessarily
    depends on the data on which the model is trained, and, therefore, a good understanding
    of the interaction between the training data distribution and the way it is stored
    in the network is necessary to reliably perform model editing. However,
    working with large language models trained on real-world data does not allow us
    to understand this relationship or fully measure the effects of model editing.
    We therefore propose Behemoth, a fully synthetic data generation framework. To
    demonstrate the practical insights from the framework, we explore model editing
    in the context of simple tabular data and report surprising findings that,
    in some cases, echo real-world results, for instance, that restricting
    the update rank sometimes results in a more effective update.
acknowledged_ssus:
- _id: ScienComp
acknowledgement: "EI thanks Weiwei Yang, Janardhan Kulkani, and Kate Lytvynets for
  their advice and support in developing an earlier version of the Behemoth library.
  This research was supported by the Scientific Service Units (SSU) of IST Austria
  through resources provided by Scientific Computing (SciComp). EI was supported
  in part by the FWF DK VGSCO, grant agreement number W1260-N35."
article_processing_charge: No
arxiv: 1
author:
- first_name: Eugenia B
  full_name: Iofinova, Eugenia B
  id: f9a17499-f6e0-11ea-865d-fdf9a3f77117
  last_name: Iofinova
  orcid: 0000-0002-7778-3221
- first_name: Dan-Adrian
  full_name: Alistarh, Dan-Adrian
  id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
  last_name: Alistarh
  orcid: 0000-0003-3650-940X
citation:
  ama: 'Iofinova EB, Alistarh D-A. Behemoth: Benchmarking unlearning in LLMs using
    fully synthetic data. <i>arXiv</i>. doi:<a href="https://doi.org/10.48550/arXiv.2601.23153">10.48550/arXiv.2601.23153</a>'
  apa: 'Iofinova, E. B., &#38; Alistarh, D.-A. (n.d.). Behemoth: Benchmarking unlearning
    in LLMs using fully synthetic data. <i>arXiv</i>. <a href="https://doi.org/10.48550/arXiv.2601.23153">https://doi.org/10.48550/arXiv.2601.23153</a>'
  chicago: 'Iofinova, Eugenia B, and Dan-Adrian Alistarh. “Behemoth: Benchmarking
    Unlearning in LLMs Using Fully Synthetic Data.” <i>ArXiv</i>, n.d. <a href="https://doi.org/10.48550/arXiv.2601.23153">https://doi.org/10.48550/arXiv.2601.23153</a>.'
  ieee: 'E. B. Iofinova and D.-A. Alistarh, “Behemoth: Benchmarking unlearning in
    LLMs using fully synthetic data,” <i>arXiv</i>.'
  ista: 'Iofinova EB, Alistarh D-A. Behemoth: Benchmarking unlearning in LLMs using
    fully synthetic data. arXiv, <a href="https://doi.org/10.48550/arXiv.2601.23153">10.48550/arXiv.2601.23153</a>.'
  mla: 'Iofinova, Eugenia B., and Dan-Adrian Alistarh. “Behemoth: Benchmarking Unlearning
    in LLMs Using Fully Synthetic Data.” <i>ArXiv</i>, doi:<a href="https://doi.org/10.48550/arXiv.2601.23153">10.48550/arXiv.2601.23153</a>.'
  short: E.B. Iofinova, D.-A. Alistarh, ArXiv (n.d.).
corr_author: '1'
date_created: 2026-05-11T08:58:07Z
date_published: 2026-01-30T00:00:00Z
date_updated: 2026-05-19T11:20:27Z
day: '30'
department:
- _id: GradSch
- _id: DaAl
doi: 10.48550/arXiv.2601.23153
external_id:
  arxiv:
  - '2601.23153'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2601.23153
month: '01'
oa: 1
oa_version: Preprint
project:
- _id: 9B9290DE-BA93-11EA-9121-9846C619BF3A
  grant_number: W1260-N35
  name: Vienna Graduate School on Computational Optimization
publication: arXiv
publication_status: draft
related_material:
  record:
  - id: '21854'
    relation: dissertation_contains
    status: public
status: public
title: 'Behemoth: Benchmarking unlearning in LLMs using fully synthetic data'
type: preprint
user_id: 8b945eb4-e2f2-11eb-945a-df72226e66a9
year: '2026'
...