{"file_date_updated":"2025-07-08T05:52:26Z","date_published":"2025-06-24T00:00:00Z","oa":1,"license":"https://creativecommons.org/licenses/by-nc-nd/4.0/","type":"journal_article","ddc":["000"],"pmid":1,"article_processing_charge":"Yes (in subscription journal)","issue":"25","intvolume":"122","publication_identifier":{"eissn":["1091-6490"],"issn":["0027-8424"]},"author":[{"full_name":"McAvoy, Alex","first_name":"Alex","last_name":"McAvoy"},{"first_name":"Udari Madhushani","last_name":"Sehwag","full_name":"Sehwag, Udari Madhushani"},{"full_name":"Hilbe, Christian","id":"2FDF8F3C-F248-11E8-B48F-1D18A9856A87","last_name":"Hilbe","first_name":"Christian","orcid":"0000-0001-5116-955X"},{"id":"2E5DCA20-F248-11E8-B48F-1D18A9856A87","first_name":"Krishnendu","last_name":"Chatterjee","orcid":"0000-0002-4561-241X","full_name":"Chatterjee, Krishnendu"},{"full_name":"Barfuss, Wolfram","first_name":"Wolfram","last_name":"Barfuss"},{"full_name":"Su, Qi","last_name":"Su","first_name":"Qi"},{"full_name":"Leonard, Naomi Ehrich","last_name":"Leonard","first_name":"Naomi Ehrich"},{"last_name":"Plotkin","first_name":"Joshua B.","full_name":"Plotkin, Joshua B."}],"year":"2025","publication":"Proceedings of the National Academy of Sciences","acknowledgement":"We gratefully acknowledge the support from the European Research Council (Starting Grant 850529: E-DIRECT) and the Max Planck Society (C.H.), the European Research Council (Consolidator Grant 863818: ForM-SMArt) (K.C.), the Shanghai Pujiang Program (No. 23PJ1405500) (Q.S.), the Army Research Office (Grant No. W911NF-18-1-0325) (N.E.L.), and the John Templeton Foundation (Grant No. 62281) (J.B.P.).","article_number":"e2319927121","abstract":[{"text":"Multiagent learning is challenging when agents face mixed-motivation interactions, where conflicts of interest arise as agents independently try to optimize their respective outcomes. Recent advancements in evolutionary game theory have identified a class of “zero-determinant” strategies, which confer an agent with significant unilateral control over outcomes in repeated games. Building on these insights, we present a comprehensive generalization of zero-determinant strategies to stochastic games, encompassing dynamic environments. We propose an algorithm that allows an agent to discover strategies enforcing predetermined linear (or approximately linear) payoff relationships. Of particular interest is the relationship in which both payoffs are equal, which serves as a proxy for fairness in symmetric games. We demonstrate that an agent can discover strategies enforcing such relationships through experience alone, without coordinating with an opponent. In finding and using such a strategy, an agent (“enforcer”) can incentivize optimal and equitable outcomes, circumventing potential exploitation. In particular, from the opponent’s viewpoint, the enforcer transforms a mixed-motivation problem into a cooperative problem, paving the way for more collaboration and fairness in multiagent systems.","lang":"eng"}],"file":[{"checksum":"3b35befd959a3e37aa9080a64a6afaf3","file_id":"19972","file_name":"2025_PNAS_McAvoy.pdf","relation":"main_file","access_level":"open_access","date_updated":"2025-07-08T05:52:26Z","file_size":29525932,"success":1,"content_type":"application/pdf","creator":"dernst","date_created":"2025-07-08T05:52:26Z"}],"citation":{"ama":"McAvoy A, Sehwag UM, Hilbe C, et al. Unilateral incentive alignment in two-agent stochastic games. Proceedings of the National Academy of Sciences. 2025;122(25). doi:10.1073/pnas.2319927121","ieee":"A. McAvoy et al., “Unilateral incentive alignment in two-agent stochastic games,” Proceedings of the National Academy of Sciences, vol. 122, no. 25. National Academy of Sciences, 2025.","mla":"McAvoy, Alex, et al. “Unilateral Incentive Alignment in Two-Agent Stochastic Games.” Proceedings of the National Academy of Sciences, vol. 122, no. 25, e2319927121, National Academy of Sciences, 2025, doi:10.1073/pnas.2319927121.","chicago":"McAvoy, Alex, Udari Madhushani Sehwag, Christian Hilbe, Krishnendu Chatterjee, Wolfram Barfuss, Qi Su, Naomi Ehrich Leonard, and Joshua B. Plotkin. “Unilateral Incentive Alignment in Two-Agent Stochastic Games.” Proceedings of the National Academy of Sciences. National Academy of Sciences, 2025. https://doi.org/10.1073/pnas.2319927121.","short":"A. McAvoy, U.M. Sehwag, C. Hilbe, K. Chatterjee, W. Barfuss, Q. Su, N.E. Leonard, J.B. Plotkin, Proceedings of the National Academy of Sciences 122 (2025).","apa":"McAvoy, A., Sehwag, U. M., Hilbe, C., Chatterjee, K., Barfuss, W., Su, Q., … Plotkin, J. B. (2025). Unilateral incentive alignment in two-agent stochastic games. Proceedings of the National Academy of Sciences. National Academy of Sciences. https://doi.org/10.1073/pnas.2319927121","ista":"McAvoy A, Sehwag UM, Hilbe C, Chatterjee K, Barfuss W, Su Q, Leonard NE, Plotkin JB. 2025. Unilateral incentive alignment in two-agent stochastic games. Proceedings of the National Academy of Sciences. 122(25), e2319927121."},"has_accepted_license":"1","department":[{"_id":"KrCh"}],"date_created":"2025-07-06T22:01:23Z","OA_place":"publisher","title":"Unilateral incentive alignment in two-agent stochastic games","oa_version":"Published Version","project":[{"name":"Formal Methods for Stochastic Models: Algorithms and Applications","call_identifier":"H2020","grant_number":"863818","_id":"0599E47C-7A3F-11EA-A408-12923DDC885E"}],"date_updated":"2025-07-08T06:04:07Z","_id":"19965","publisher":"National Academy of Sciences","language":[{"iso":"eng"}],"status":"public","ec_funded":1,"publication_status":"published","quality_controlled":"1","month":"06","user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","volume":122,"doi":"10.1073/pnas.2319927121","article_type":"original","tmp":{"image":"/images/cc_by_nc_nd.png","short":"CC BY-NC-ND (4.0)","legal_code_url":"https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode","name":"Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)"},"external_id":{"pmid":["40523172"]},"scopus_import":"1","OA_type":"hybrid","day":"24"}