TY - JOUR AB - We consider a natural problem dealing with weighted packet selection across a rechargeable link, which e.g., finds applications in cryptocurrency networks. The capacity of a link (u, v) is determined by how much nodes u and v allocate for this link. Specifically, the input is a finite ordered sequence of packets that arrive in both directions along a link. Given (u, v) and a packet of weight x going from u to v, node u can either accept or reject the packet. If u accepts the packet, the capacity on link (u, v) decreases by x. Correspondingly, v's capacity on (u, v) increases by x. If a node rejects the packet, this will entail a cost affinely linear in the weight of the packet. A link is “rechargeable” in the sense that the total capacity of the link has to remain constant, but the allocation of capacity at the ends of the link can depend arbitrarily on the nodes' decisions. The goal is to minimise the sum of the capacity injected into the link and the cost of rejecting packets. We show that the problem is NP-hard, but can be approximated efficiently with a ratio of (1+ε)⋅(1+√3) for some arbitrary ε>0. AU - Schmid, Stefan AU - Svoboda, Jakub AU - Yeo, Michelle X ID - 14820 JF - Theoretical Computer Science KW - General Computer Science KW - Theoretical Computer Science SN - 0304-3975 TI - Weighted packet selection for rechargeable links in cryptocurrency networks: Complexity and approximation VL - 989 ER - TY - CONF AB - Graphical games are a useful framework for modeling the interactions of (selfish) agents who are connected via an underlying topology and whose behaviors influence each other. They have wide applications ranging from computer science to economics and biology. Yet, even though an agent’s payoff only depends on the actions of their direct neighbors in graphical games, computing the Nash equilibria and making statements about the convergence time of "natural" local dynamics in particular can be highly challenging. In this work, we present a novel approach for classifying the complexity of Nash equilibria in graphical games by establishing a connection to local graph algorithms, a subfield of distributed computing. In particular, we make the observation that the equilibria of graphical games are equivalent to locally verifiable labelings (LVL) in graphs; vertex labelings which are verifiable with constant-round local algorithms. This connection allows us to derive novel lower bounds on the convergence time to equilibrium of best-response dynamics in graphical games. Since we establish that distributed convergence can sometimes be provably slow, we also introduce and give bounds on an intuitive notion of "time-constrained" inefficiency of best responses. We exemplify how our results can be used in the implementation of mechanisms that ensure convergence of best responses to a Nash equilibrium. Our results thus also give insight into the convergence of strategy-proof algorithms for graphical games, which is still not well understood. AU - Hirvonen, Juho AU - Schmid, Laura AU - Chatterjee, Krishnendu AU - Schmid, Stefan ID - 15006 SN - 18688969 T2 - 27th International Conference on Principles of Distributed Systems TI - On the convergence time in graphical games: A locality-sensitive approach VL - 286 ER - TY - JOUR AB - Direct reciprocity is a powerful mechanism for cooperation in social dilemmas. The very logic of reciprocity, however, seems to require that individuals are symmetric, and that everyone has the same means to influence each other’s payoffs.
Yet in many applications, individuals are asymmetric. Herein, we study the effect of asymmetry in linear public good games. Individuals may differ in their endowments (their ability to contribute to a public good) and in their productivities (how effective their contributions are). Given the individuals’ productivities, we ask which allocation of endowments is optimal for cooperation. To this end, we consider two notions of optimality. The first notion focuses on the resilience of cooperation. The respective endowment distribution ensures that full cooperation is feasible even under the most adverse conditions. The second notion focuses on efficiency. The corresponding endowment distribution maximizes group welfare. Using analytical methods, we fully characterize these two endowment distributions. This analysis reveals that both optimality notions favor some endowment inequality: More productive players ought to get higher endowments. Yet the two notions disagree on how unequal endowments are supposed to be. A focus on resilience results in less inequality. With additional simulations, we show that the optimal endowment allocation needs to account for both the resilience and the efficiency of cooperation. AU - Hübner, Valentin AU - Staab, Manuel AU - Hilbe, Christian AU - Chatterjee, Krishnendu AU - Kleshnina, Maria ID - 15083 IS - 10 JF - Proceedings of the National Academy of Sciences SN - 0027-8424 TI - Efficiency and resilience of cooperation in asymmetric social dilemmas VL - 121 ER - TY - GEN AB - Computer code used in the research article "Efficiency and resilience of cooperation in asymmetric social dilemmas" (by Valentin Hübner, Manuel Staab, Christian Hilbe, Krishnendu Chatterjee, and Maria Kleshnina). We used different implementations for the case of two and three players, both described below. AU - Hübner, Valentin AU - Kleshnina, Maria ID - 15108 TI - Computer code for "Efficiency and resilience of cooperation in asymmetric social dilemmas" ER - TY - CONF AB - Turn-based stochastic games (aka simple stochastic games) are two-player zero-sum games played on directed graphs with probabilistic transitions. The goal of player-max is to maximize the probability to reach a target state against the adversarial player-min. These games lie in NP ∩ coNP and are among the rare combinatorial problems that belong to this complexity class for which the existence of a polynomial-time algorithm is a major open question. While a randomized sub-exponential time algorithm exists, all known deterministic algorithms require exponential time in the worst case. An important open question has been whether faster algorithms can be obtained parametrized by the treewidth of the game graph. Even a deterministic sub-exponential time algorithm for constant-treewidth turn-based stochastic games has remained elusive. In this work, our main result is a deterministic algorithm to solve turn-based stochastic games that, given a game with n states, treewidth at most t, and the bit-complexity of the probabilistic transition function log D, has running time O((tn² log D)^(t log n)). In particular, our algorithm is quasi-polynomial time for games with constant or poly-logarithmic treewidth.
AU - Chatterjee, Krishnendu AU - Meggendorfer, Tobias AU - Saona Urmeneta, Raimundo J AU - Svoboda, Jakub ID - 12676 SN - 9781611977554 T2 - Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms TI - Faster algorithm for turn-based stochastic games with bounded treewidth ER - TY - CONF AB - Reinforcement learning has received much attention for learning controllers of deterministic systems. We consider a learner-verifier framework for stochastic control systems and survey recent methods that formally guarantee a conjunction of reachability and safety properties. Given a property and a lower bound on the probability of the property being satisfied, our framework jointly learns a control policy and a formal certificate to ensure the satisfaction of the property with a desired probability threshold. Both the control policy and the formal certificate are continuous functions from states to reals, which are learned as parameterized neural networks. While in the deterministic case, the certificates are invariant and barrier functions for safety, or Lyapunov and ranking functions for liveness, in the stochastic case the certificates are supermartingales. For certificate verification, we use interval arithmetic abstract interpretation to bound the expected values of neural network functions. AU - Chatterjee, Krishnendu AU - Henzinger, Thomas A AU - Lechner, Mathias AU - Zikelic, Dorde ID - 13142 SN - 0302-9743 T2 - Tools and Algorithms for the Construction and Analysis of Systems TI - A learner-verifier framework for neural network controllers and certificates of stochastic systems VL - 13993 ER - TY - JOUR AB - Populations evolve in spatially heterogeneous environments. While a certain trait might bring a fitness advantage in some patch of the environment, a different trait might be advantageous in another patch. Here, we study the Moran birth–death process with two types of individuals in a population stretched across two patches of size N, each patch favouring one of the two types. We show that the long-term fate of such populations crucially depends on the migration rate μ between the patches. To classify the possible fates, we use the distinction between polynomial (short) and exponential (long) timescales. We show that when μ is high then one of the two types fixates on the whole population after a number of steps that is only polynomial in N. By contrast, when μ is low then each type holds majority in the patch where it is favoured for a number of steps that is at least exponential in N. Moreover, we precisely identify the threshold migration rate μ⋆ that separates those two scenarios, thereby exactly delineating the situations that support long-term coexistence of the two types. We also discuss the case of various cycle graphs and we present computer simulations that perfectly match our analytical results. AU - Svoboda, Jakub AU - Tkadlec, Josef AU - Kaveh, Kamran AU - Chatterjee, Krishnendu ID - 12787 IS - 2271 JF - Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences SN - 1364-5021 TI - Coexistence times in the Moran process with environmental heterogeneity VL - 479 ER - TY - JOUR AB - The field of indirect reciprocity investigates how social norms can foster cooperation when individuals continuously monitor and assess each other’s social interactions. By adhering to certain social norms, cooperating individuals can improve their reputation and, in turn, receive benefits from others. 
Eight social norms, known as the “leading eight," have been shown to effectively promote the evolution of cooperation as long as information is public and reliable. These norms categorize group members as either ’good’ or ’bad’. In this study, we examine a scenario where individuals instead assign nuanced reputation scores to each other, and only cooperate with those whose reputation exceeds a certain threshold. We find both analytically and through simulations that such quantitative assessments are error-correcting, thus facilitating cooperation in situations where information is private and unreliable. Moreover, our results identify four specific norms that are robust to such conditions, and may be relevant for helping to sustain cooperation in natural populations. AU - Schmid, Laura AU - Ekbatani, Farbod AU - Hilbe, Christian AU - Chatterjee, Krishnendu ID - 12861 JF - Nature Communications TI - Quantitative assessment can stabilize indirect reciprocity under imperfect information VL - 14 ER - TY - CONF AB - We study the problem of training and certifying adversarially robust quantized neural networks (QNNs). Quantization is a technique for making neural networks more efficient by running them using low-bit integer arithmetic and is therefore commonly adopted in industry. Recent work has shown that floating-point neural networks that have been verified to be robust can become vulnerable to adversarial attacks after quantization, and certification of the quantized representation is necessary to guarantee robustness. In this work, we present quantization-aware interval bound propagation (QA-IBP), a novel method for training robust QNNs. Inspired by advances in robust learning of non-quantized networks, our training algorithm computes the gradient of an abstract representation of the actual network. Unlike existing approaches, our method can handle the discrete semantics of QNNs. Based on QA-IBP, we also develop a complete verification procedure for verifying the adversarial robustness of QNNs, which is guaranteed to terminate and produce a correct answer. Compared to existing approaches, the key advantage of our verification procedure is that it runs entirely on GPU or other accelerator devices. We demonstrate experimentally that our approach significantly outperforms existing methods and establish the new state-of-the-art for training and certifying the robustness of QNNs. AU - Lechner, Mathias AU - Zikelic, Dorde AU - Chatterjee, Krishnendu AU - Henzinger, Thomas A AU - Rus, Daniela ID - 14242 IS - 12 SN - 9781577358800 T2 - Proceedings of the 37th AAAI Conference on Artificial Intelligence TI - Quantization-aware interval bound propagation for training certifiably robust quantized neural networks VL - 37 ER - TY - CONF AB - Two-player zero-sum "graph games" are central in logic, verification, and multi-agent systems. The game proceeds by placing a token on a vertex of a graph, and allowing the players to move it to produce an infinite path, which determines the winner or payoff of the game. Traditionally, the players alternate turns in moving the token. In "bidding games", however, the players have budgets and in each turn, an auction (bidding) determines which player moves the token. So far, bidding games have only been studied as full-information games. In this work we initiate the study of partial-information bidding games: we study bidding games in which a player's initial budget is drawn from a known probability distribution. 
We show that while for some bidding mechanisms and objectives, it is straightforward to adapt the results from the full-information setting to the partial-information setting, for others, the analysis is significantly more challenging, requires new techniques, and gives rise to interesting results. Specifically, we study games with "mean-payoff" objectives in combination with "poorman" bidding. We construct optimal strategies for a partially-informed player who plays against a fully-informed adversary. We show that, somewhat surprisingly, the "value" under pure strategies does not necessarily exist in such games. AU - Avni, Guy AU - Jecker, Ismael R AU - Zikelic, Dorde ID - 14243 IS - 5 SN - 9781577358800 T2 - Proceedings of the 37th AAAI Conference on Artificial Intelligence TI - Bidding graph games with partially-observable budgets VL - 37 ER - TY - CONF AB - We provide a learning-based technique for guessing a winning strategy in a parity game originating from an LTL synthesis problem. A cheaply obtained guess can be useful in several applications. Not only can the guessed strategy be applied as best-effort in cases where the game’s huge size prohibits rigorous approaches, but it can also increase the scalability of rigorous LTL synthesis in several ways. Firstly, checking whether a guessed strategy is winning is easier than constructing one. Secondly, even if the guess is wrong in some places, it can be fixed by strategy iteration faster than constructing one from scratch. Thirdly, the guess can be used in on-the-fly approaches to prioritize exploration in the most fruitful directions. In contrast to previous works, we (i) reflect the highly structured logical information in game’s states, the so-called semantic labelling, coming from the recent LTL-to-automata translations, and (ii) learn to reflect it properly by learning from previously solved games, bringing the solving process closer to human-like reasoning. AU - Kretinsky, Jan AU - Meggendorfer, Tobias AU - Prokop, Maximilian AU - Rieder, Sabine ID - 14259 SN - 0302-9743 T2 - 35th International Conference on Computer Aided Verification TI - Guessing winning policies in LTL synthesis by semantic learning VL - 13964 ER - TY - CONF AB - Probabilistic recurrence relations (PRRs) are a standard formalism for describing the runtime of a randomized algorithm. Given a PRR and a time limit κ, we consider the tail probability Pr[T≥κ], i.e., the probability that the randomized runtime T of the PRR exceeds κ. Our focus is the formal analysis of tail bounds that aims at finding a tight asymptotic upper bound u≥Pr[T≥κ]. To address this problem, the classical and most well-known approach is the cookbook method by Karp (JACM 1994), while other approaches are mostly limited to deriving tail bounds of specific PRRs via involved custom analysis. In this work, we propose a novel approach for deriving the common exponentially-decreasing tail bounds for PRRs whose preprocessing time and random passed sizes observe discrete or (piecewise) uniform distribution and whose recursive call is either a single procedure call or a divide-and-conquer. We first establish a theoretical approach via Markov’s inequality, and then instantiate the theoretical approach with a template-based algorithmic approach via a refined treatment of exponentiation. 
Experimental evaluation shows that our algorithmic approach is capable of deriving tail bounds that (i) are asymptotically tighter than Karp’s method, (ii) match the best-known manually-derived asymptotic tail bound for QuickSelect, and (iii) are only slightly worse (by a log log n factor) than the manually-proven optimal asymptotic tail bound for QuickSort. Moreover, our algorithmic approach handles all examples (including realistic PRRs such as QuickSort, QuickSelect, DiameterComputation, etc.) in less than 0.1 s, showing that our approach is efficient in practice. AU - Sun, Yican AU - Fu, Hongfei AU - Chatterjee, Krishnendu AU - Goharshady, Amir Kafshdar ID - 14318 SN - 0302-9743 T2 - Computer Aided Verification TI - Automated tail bound analysis for probabilistic recurrence relations VL - 13966 ER - TY - CONF AB - Markov decision processes can be viewed as transformers of probability distributions. While this view is useful from a practical standpoint to reason about trajectories of distributions, basic reachability and safety problems are known to be computationally intractable (i.e., Skolem-hard) to solve in such models. Further, we show that even for simple examples of MDPs, strategies for safety objectives over distributions can require infinite memory and randomization. In light of this, we present a novel overapproximation approach to synthesize strategies in an MDP, such that a safety objective over the distributions is met. More precisely, we develop a new framework for template-based synthesis of certificates as affine distributional and inductive invariants for safety objectives in MDPs. We provide two algorithms within this framework. One can only synthesize memoryless strategies, but has relative completeness guarantees, while the other can synthesize general strategies. The runtime complexity of both algorithms is in PSPACE. We implement these algorithms and show that they can solve several non-trivial examples. AU - Akshay, S. AU - Chatterjee, Krishnendu AU - Meggendorfer, Tobias AU - Zikelic, Dorde ID - 14317 SN - 0302-9743 T2 - International Conference on Computer Aided Verification TI - MDPs as distribution transformers: Affine invariant synthesis for safety objectives VL - 13966 ER - TY - JOUR AB - We study turn-based stochastic zero-sum games with lexicographic preferences over objectives. Stochastic games are standard models in control, verification, and synthesis of stochastic reactive systems that exhibit both randomness as well as controllable and adversarial non-determinism. Lexicographic order allows one to consider multiple objectives with a strict preference order. To the best of our knowledge, stochastic games with lexicographic objectives have not been studied before. For a mixture of reachability and safety objectives, we show that deterministic lexicographically optimal strategies exist and memory is only required to remember the already satisfied and violated objectives. For a constant number of objectives, we show that the relevant decision problem is in NP∩coNP, matching the current known bound for single objectives; and in general the decision problem is PSPACE-hard and can be solved in NEXPTIME∩coNEXPTIME. We present an algorithm that computes the lexicographically optimal strategies via a reduction to the computation of optimal strategies in a sequence of single-objective games. For omega-regular objectives, we restrict our analysis to one-player games, also known as Markov decision processes.
We show that lexicographically optimal strategies exist and need either randomization or finite memory. We present an algorithm that solves the relevant decision problem in polynomial time. We have implemented our algorithms and report experimental results on various case studies. AU - Chatterjee, Krishnendu AU - Katoen, Joost P AU - Mohr, Stefanie AU - Weininger, Maximilian AU - Winkler, Tobias ID - 12738 JF - Formal Methods in System Design TI - Stochastic games with lexicographic objectives ER - TY - JOUR AB - Mathematical models often aim to describe a complicated mechanism in a cohesive and simple manner. However, reaching a perfect balance between being simple enough and not overly simplistic is a challenging task. Frequently, game-theoretic models have an underlying assumption that players, whenever they choose to execute a specific action, do so perfectly. In fact, it is rare that action execution perfectly coincides with the intentions of individuals, giving rise to behavioural mistakes. The concept of incompetence of players was suggested to address this issue in game-theoretic settings. Under the assumption of incompetence, players have non-zero probabilities of executing a different strategy from the one they chose, leading to stochastic outcomes of the interactions. In this article, we survey results related to the concept of incompetence in classic as well as evolutionary game theory and provide several new results. We also suggest future extensions of the model and argue why it is important to take into account behavioural mistakes when analysing interactions among players in both economic and biological settings. AU - Graham, Thomas AU - Kleshnina, Maria AU - Filar, Jerzy A. ID - 10770 JF - Dynamic Games and Applications SN - 2153-0785 TI - Where do mistakes lead? A survey of games with incompetent players VL - 13 ER - TY - CONF AB - Entropic risk (ERisk) is an established risk measure in finance, quantifying risk by an exponential re-weighting of rewards. We study ERisk for the first time in the context of turn-based stochastic games with the total reward objective. This gives rise to an objective function that demands the control of systems in a risk-averse manner. We show that the resulting games are determined and, in particular, admit optimal memoryless deterministic strategies. This contrasts with risk measures that previously have been considered in the special case of Markov decision processes and that require randomization and/or memory. We provide several results on the decidability and the computational complexity of the threshold problem, i.e., whether the optimal value of ERisk exceeds a given threshold. In the most general case, the problem is decidable subject to Schanuel’s conjecture. If all inputs are rational, the resulting threshold problem can be solved using algebraic numbers, leading to decidability via a polynomial-time reduction to the existential theory of the reals. Further restrictions on the encoding of the input allow the solution of the threshold problem in NP∩coNP. Finally, an approximation algorithm for the optimal value of ERisk is provided.
AU - Baier, Christel AU - Chatterjee, Krishnendu AU - Meggendorfer, Tobias AU - Piribauer, Jakob ID - 14417 SN - 9783959772921 T2 - 48th International Symposium on Mathematical Foundations of Computer Science TI - Entropic risk for turn-based stochastic games VL - 272 ER - TY - JOUR AB - Allometric settings of population dynamics models are appealing due to their parsimonious nature and broad utility when studying system level effects. Here, we parameterise the size-scaled Rosenzweig-MacArthur differential equations to eliminate prey-mass dependency, facilitating an in depth analytic study of the equations which incorporates scaling parameters’ contributions to coexistence. We define the functional response term to match empirical findings, and examine situations where metabolic theory derivations and observation diverge. The dynamical properties of the Rosenzweig-MacArthur system, encompassing the distribution of size-abundance equilibria, the scaling of period and amplitude of population cycling, and relationships between predator and prey abundances, are consistent with empirical observation. Our parameterisation is an accurate minimal model across 15+ orders of mass magnitude. AU - Mckerral, Jody C. AU - Kleshnina, Maria AU - Ejov, Vladimir AU - Bartle, Louise AU - Mitchell, James G. AU - Filar, Jerzy A. ID - 12706 IS - 2 JF - PLoS One TI - Empirical parameterisation and dynamical analysis of the allometric Rosenzweig-MacArthur equations VL - 18 ER - TY - CONF AB - We consider bidding games, a class of two-player zero-sum graph games. The game proceeds as follows. Both players have bounded budgets. A token is placed on a vertex of a graph, in each turn the players simultaneously submit bids, and the higher bidder moves the token, where we break bidding ties in favor of Player 1. Player 1 wins the game iff the token visits a designated target vertex. We consider, for the first time, poorman discrete-bidding in which the granularity of the bids is restricted and the higher bid is paid to the bank. Previous work either did not impose granularity restrictions or considered Richman bidding (bids are paid to the opponent). While the latter mechanisms are technically more accessible, the former is more appealing from a practical standpoint. Our study focuses on threshold budgets, which is the necessary and sufficient initial budget required for Player 1 to ensure winning against a given Player 2 budget. We first show existence of thresholds. In DAGs, we show that threshold budgets can be approximated with error bounds by thresholds under continuous-bidding and that they exhibit a periodic behavior. We identify closed-form solutions in special cases. We implement and experiment with an algorithm to find threshold budgets. AU - Avni, Guy AU - Meggendorfer, Tobias AU - Sadhukhan, Suman AU - Tkadlec, Josef AU - Zikelic, Dorde ID - 14518 SN - 0922-6389 T2 - Frontiers in Artificial Intelligence and Applications TI - Reachability poorman discrete-bidding games VL - 372 ER - TY - CONF AB - We consider the problem of learning control policies in discrete-time stochastic systems which guarantee that the system stabilizes within some specified stabilization region with probability 1. Our approach is based on the novel notion of stabilizing ranking supermartingales (sRSMs) that we introduce in this work. Our sRSMs overcome the limitation of methods proposed in previous works whose applicability is restricted to systems in which the stabilizing region cannot be left once entered under any control policy. 
We present a learning procedure that learns a control policy together with an sRSM that formally certifies probability 1 stability, both learned as neural networks. We show that this procedure can also be adapted to formally verifying that, under a given Lipschitz continuous control policy, the stochastic system stabilizes within some stabilizing region with probability 1. Our experimental evaluation shows that our learning procedure can successfully learn provably stabilizing policies in practice. AU - Ansaripour, Matin AU - Chatterjee, Krishnendu AU - Henzinger, Thomas A AU - Lechner, Mathias AU - Zikelic, Dorde ID - 14559 SN - 0302-9743 T2 - 21st International Symposium on Automated Technology for Verification and Analysis TI - Learning provably stabilizing neural controllers for discrete-time stochastic systems VL - 14215 ER - TY - CONF AB - We consider a natural problem dealing with weighted packet selection across a rechargeable link, which e.g., finds applications in cryptocurrency networks. The capacity of a link (u, v) is determined by how much nodes u and v allocate for this link. Specifically, the input is a finite ordered sequence of packets that arrive in both directions along a link. Given (u, v) and a packet of weight x going from u to v, node u can either accept or reject the packet. If u accepts the packet, the capacity on link (u, v) decreases by x. Correspondingly, v’s capacity on (u, v) increases by x. If a node rejects the packet, this will entail a cost affinely linear in the weight of the packet. A link is “rechargeable” in the sense that the total capacity of the link has to remain constant, but the allocation of capacity at the ends of the link can depend arbitrarily on the nodes’ decisions. The goal is to minimise the sum of the capacity injected into the link and the cost of rejecting packets. We show that the problem is NP-hard, but can be approximated efficiently with a ratio of (1+ε)⋅(1+√3) for some arbitrary ε>0. AU - Schmid, Stefan AU - Svoboda, Jakub AU - Yeo, Michelle X ID - 13238 SN - 0302-9743 T2 - SIROCCO 2023: Structural Information and Communication Complexity TI - Weighted packet selection for rechargeable links in cryptocurrency networks: Complexity and approximation VL - 13892 ER - TY - JOUR AB - Natural selection is usually studied between mutants that differ in reproductive rate, but are subject to the same population structure. Here we explore how natural selection acts on mutants that have the same reproductive rate, but different population structures. In our framework, population structure is given by a graph that specifies where offspring can disperse. The invading mutant disperses offspring on a different graph than the resident wild-type. We find that more densely connected dispersal graphs tend to increase the invader’s fixation probability, but the exact relationship between structure and fixation probability is subtle. We present three main results. First, we prove that if both invader and resident are on complete dispersal graphs, then removing a single edge in the invader’s dispersal graph reduces its fixation probability. Second, we show that for certain island models higher invader’s connectivity increases its fixation probability, but the magnitude of the effect depends on the exact layout of the connections.
Third, we show that for lattices the effect of different connectivity is comparable to that of different fitness: for large population size, the invader’s fixation probability is either constant or exponentially small, depending on whether it is more or less connected than the resident. AU - Tkadlec, Josef AU - Kaveh, Kamran AU - Chatterjee, Krishnendu AU - Nowak, Martin A. ID - 14657 IS - 208 JF - Journal of the Royal Society, Interface TI - Evolutionary dynamics of mutants that modify population structure VL - 20 ER - TY - JOUR AB - Many human interactions feature the characteristics of social dilemmas where individual actions have consequences for the group and the environment. The feedback between behavior and environment can be studied with the framework of stochastic games. In stochastic games, the state of the environment can change, depending on the choices made by group members. Past work suggests that such feedback can reinforce cooperative behaviors. In particular, cooperation can evolve in stochastic games even if it is infeasible in each separate repeated game. In stochastic games, participants have an interest in conditioning their strategies on the state of the environment. Yet in many applications, precise information about the state could be scarce. Here, we study how the availability of information (or lack thereof) shapes evolution of cooperation. Already for simple examples of two-state games we find surprising effects. In some cases, cooperation is only possible if there is precise information about the state of the environment. In other cases, cooperation is most abundant when there is no information about the state of the environment. We systematically analyze all stochastic games of a given complexity class, to determine when receiving information about the environment is better, neutral, or worse for evolution of cooperation. AU - Kleshnina, Maria AU - Hilbe, Christian AU - Simsa, Stepan AU - Chatterjee, Krishnendu AU - Nowak, Martin A. ID - 13258 JF - Nature Communications TI - The effect of environmental information on evolution of cooperation in stochastic games VL - 14 ER - TY - GEN AU - Kleshnina, Maria ID - 13336 TI - kleshnina/stochgames_info: The effect of environmental information on evolution of cooperation in stochastic games ER - TY - CONF AB - A classic solution technique for Markov decision processes (MDP) and stochastic games (SG) is value iteration (VI). Due to its good practical performance, this approximative approach is typically preferred over exact techniques, even though no practical bounds on the imprecision of the result could be given until recently. As a consequence, even the most used model checkers could return arbitrarily wrong results. Over the past decade, different works derived stopping criteria, indicating when the precision reaches the desired level, for various settings, in particular MDP with reachability, total reward, and mean payoff, and SG with reachability. In this paper, we provide the first stopping criteria for VI on SG with total reward and mean payoff, yielding the first anytime algorithms in these settings. To this end, we provide the solution in two flavours: First through a reduction to the MDP case and second directly on SG. The former is simpler and automatically utilizes any advances on MDP. The latter allows for more local computations, heading towards better practical efficiency. Our solution unifies the previously mentioned approaches for MDP and SG and their underlying ideas.
To achieve this, we isolate objective-specific subroutines as well as identify objective-independent concepts. These structural concepts, while surprisingly simple, form the very essence of the unified solution. AU - Kretinsky, Jan AU - Meggendorfer, Tobias AU - Weininger, Maximilian ID - 13967 SN - 1043-6871 T2 - 38th Annual ACM/IEEE Symposium on Logic in Computer Science TI - Stopping criteria for value iteration on stochastic games with quantitative objectives VL - 2023 ER - TY - JOUR AB - The input to the token swapping problem is a graph with vertices v1, v2, . . . , vn, and n tokens with labels 1,2, . . . , n, one on each vertex. The goal is to get token i to vertex vi for all i= 1, . . . , n using a minimum number of swaps, where a swap exchanges the tokens on the endpoints of an edge.Token swapping on a tree, also known as “sorting with a transposition tree,” is not known to be in P nor NP-complete. We present some partial results: 1. An optimum swap sequence may need to perform a swap on a leaf vertex that has the correct token (a “happy leaf”), disproving a conjecture of Vaughan. 2. Any algorithm that fixes happy leaves—as all known approximation algorithms for the problem do—has approximation factor at least 4/3. Furthermore, the two best-known 2-approximation algorithms have approximation factor exactly 2. 3. A generalized problem—weighted coloured token swapping—is NP-complete on trees, but solvable in polynomial time on paths and stars. In this version, tokens and vertices have colours, and colours have weights. The goal is to get every token to a vertex of the same colour, and the cost of a swap is the sum of the weights of the two tokens involved. AU - Biniaz, Ahmad AU - Jain, Kshitij AU - Lubiw, Anna AU - Masárová, Zuzana AU - Miltzow, Tillmann AU - Mondal, Debajyoti AU - Naredla, Anurag Murty AU - Tkadlec, Josef AU - Turcotte, Alexi ID - 12833 IS - 2 JF - Discrete Mathematics and Theoretical Computer Science SN - 1462-7264 TI - Token swapping on trees VL - 24 ER - TY - CONF AB - Payment channel networks (PCNs) are a promising technology to improve the scalability of cryptocurrencies. PCNs, however, face the challenge that the frequent usage of certain routes may deplete channels in one direction, and hence prevent further transactions. In order to reap the full potential of PCNs, recharging and rebalancing mechanisms are required to provision channels, as well as an admission control logic to decide which transactions to reject in case capacity is insufficient. This paper presents a formal model of this optimisation problem. In particular, we consider an online algorithms perspective, where transactions arrive over time in an unpredictable manner. Our main contributions are competitive online algorithms which come with provable guarantees over time. We empirically evaluate our algorithms on randomly generated transactions to compare the average performance of our algorithms to our theoretical bounds. We also show how this model and approach differs from related problems in classic communication networks. 
AU - Bastankhah, Mahsa AU - Chatterjee, Krishnendu AU - Maddah-Ali, Mohammad Ali AU - Schmid, Stefan AU - Svoboda, Jakub AU - Yeo, Michelle X ID - 14736 SN - 0302-9743 T2 - 27th International Conference on Financial Cryptography and Data Security TI - R2: Boosting liquidity in payment channel networks with online admission control VL - 13950 ER - TY - THES AB - Stochastic systems provide a formal framework for modelling and quantifying uncertainty in systems and have been widely adopted in many application domains. Formal verification and control of finite state stochastic systems, a subfield of formal methods also known as probabilistic model checking, is well studied. In contrast, formal verification and control of infinite state stochastic systems have received comparatively less attention. However, infinite state stochastic systems commonly arise in practice. For instance, probabilistic models that contain continuous probability distributions such as normal or uniform, or stochastic dynamical systems which are a classical model for control under uncertainty, both give rise to infinite state systems. The goal of this thesis is to contribute to laying theoretical and algorithmic foundations of fully automated formal verification and control of infinite state stochastic systems, with a particular focus on systems that may be executed over a long or infinite time. We consider formal verification of infinite state stochastic systems in the setting of static analysis of probabilistic programs and formal control in the setting of controller synthesis in stochastic dynamical systems. For both problems, we present some of the first fully automated methods for probabilistic (a.k.a. quantitative) reachability and safety analysis applicable to infinite time horizon systems. We also advance the state of the art of probability 1 (a.k.a. qualitative) reachability analysis for both problems. Finally, for formal controller synthesis in stochastic dynamical systems, we present a novel framework for learning neural network control policies in stochastic dynamical systems with formal guarantees on correctness with respect to quantitative reachability, safety or reach-avoid specifications. AU - Zikelic, Dorde ID - 14539 SN - 2663 - 337X TI - Automated verification and control of infinite state stochastic systems ER - TY - JOUR AB - We consider the almost-sure (a.s.) termination problem for probabilistic programs, which are a stochastic extension of classical imperative programs. Lexicographic ranking functions provide a sound and practical approach for termination of non-probabilistic programs, and their extension to probabilistic programs is achieved via lexicographic ranking supermartingales (LexRSMs). However, LexRSMs introduced in the previous work have a limitation that impedes their automation: all of their components have to be non-negative in all reachable states. This might result in a LexRSM not existing even for simple terminating programs. Our contributions are twofold. First, we introduce a generalization of LexRSMs that allows for some components to be negative. This standard feature of non-probabilistic termination proofs was hitherto not known to be sound in the probabilistic setting, as the soundness proof requires a careful analysis of the underlying stochastic process. Second, we present polynomial-time algorithms using our generalized LexRSMs for proving a.s. termination in broad classes of linear-arithmetic programs. 
AU - Chatterjee, Krishnendu AU - Kafshdar Goharshady, Ehsan AU - Novotný, Petr AU - Zárevúcky, Jiří AU - Zikelic, Dorde ID - 14778 IS - 2 JF - Formal Aspects of Computing KW - Theoretical Computer Science KW - Software SN - 0934-5043 TI - On lexicographic proof rules for probabilistic termination VL - 35 ER - TY - CONF AB - In this paper, we present novel algorithms that efficiently compute a shortest reconfiguration sequence between two given dominating sets in trees and interval graphs under the TOKEN SLIDING model. In this problem, a graph is provided along with its two dominating sets, which can be imagined as tokens placed on vertices. The objective is to find a shortest sequence of dominating sets that transforms one set into the other, with each set in the sequence resulting from sliding a single token in the previous set. While identifying any sequence has been well studied, our work presents the first polynomial algorithms for this optimization variant in the context of dominating sets. AU - Křišťan, Jan Matyáš AU - Svoboda, Jakub ID - 14456 SN - 0302-9743 T2 - 24th International Symposium on Fundamentals of Computation Theory TI - Shortest dominating set reconfiguration under token sliding VL - 14292 ER - TY - CONF AB - We study the problem of learning controllers for discrete-time non-linear stochastic dynamical systems with formal reach-avoid guarantees. This work presents the first method for providing formal reach-avoid guarantees, which combine and generalize stability and safety guarantees, with a tolerable probability threshold p in [0,1] over the infinite time horizon. Our method leverages advances in machine learning literature and it represents formal certificates as neural networks. In particular, we learn a certificate in the form of a reach-avoid supermartingale (RASM), a novel notion that we introduce in this work. Our RASMs provide reachability and avoidance guarantees by imposing constraints on what can be viewed as a stochastic extension of level sets of Lyapunov functions for deterministic systems. Our approach solves several important problems -- it can be used to learn a control policy from scratch, to verify a reach-avoid specification for a fixed control policy, or to fine-tune a pre-trained policy if it does not satisfy the reach-avoid specification. We validate our approach on 3 stochastic non-linear reinforcement learning tasks. AU - Zikelic, Dorde AU - Lechner, Mathias AU - Henzinger, Thomas A AU - Chatterjee, Krishnendu ID - 14830 IS - 10 KW - General Medicine SN - 2159-5399 T2 - Proceedings of the 37th AAAI Conference on Artificial Intelligence TI - Learning control policies for stochastic systems with reach-avoid guarantees VL - 37 ER - TY - CONF AB - A classical problem for Markov chains is determining their stationary (or steady-state) distribution. This problem has an equally classical solution based on eigenvectors and linear equation systems. However, this approach does not scale to large instances, and iterative solutions are desirable. It turns out that a naive approach, as used by current model checkers, may yield completely wrong results. We present a new approach, which utilizes recent advances in partial exploration and mean payoff computation to obtain a correct, converging approximation. 
AU - Meggendorfer, Tobias ID - 13139 SN - 0302-9743 T2 - TACAS 2023: Tools and Algorithms for the Construction and Analysis of Systems TI - Correct approximation of stationary distributions VL - 13993 ER - TY - GEN AB - The software artefact to evaluate the approximation of stationary distributions implementation. AU - Meggendorfer, Tobias ID - 14990 TI - Artefact for: Correct Approximation of Stationary Distributions ER - TY - CONF AB - Reinforcement learning has shown promising results in learning neural network policies for complicated control tasks. However, the lack of formal guarantees about the behavior of such policies remains an impediment to their deployment. We propose a novel method for learning a composition of neural network policies in stochastic environments, along with a formal certificate which guarantees that a specification over the policy's behavior is satisfied with the desired probability. Unlike prior work on verifiable RL, our approach leverages the compositional nature of logical specifications provided in SpectRL, to learn over graphs of probabilistic reach-avoid specifications. The formal guarantees are provided by learning neural network policies together with reach-avoid supermartingales (RASM) for the graph’s sub-tasks and then composing them into a global policy. We also derive a tighter lower bound compared to previous work on the probability of reach-avoidance implied by a RASM, which is required to find a compositional policy with an acceptable probabilistic threshold for complex tasks with multiple edge policies. We implement a prototype of our approach and evaluate it on a Stochastic Nine Rooms environment. AU - Zikelic, Dorde AU - Lechner, Mathias AU - Verma, Abhinav AU - Chatterjee, Krishnendu AU - Henzinger, Thomas A ID - 15023 T2 - 37th Conference on Neural Information Processing Systems TI - Compositional policy learning in stochastic control systems with formal guarantees ER - TY - CONF AB - Given a Markov chain M = (V, v_0, δ), with state space V and a starting state v_0, and a probability threshold ε, an ε-core is a subset C of states that is left with probability at most ε. More formally, C ⊆ V is an ε-core, iff ℙ[reach (V\C)] ≤ ε. Cores have been applied in a wide variety of verification problems over Markov chains, Markov decision processes, and probabilistic programs, as a means of discarding uninteresting and low-probability parts of a probabilistic system and instead being able to focus on the states that are likely to be encountered in a real-world run. In this work, we focus on the problem of computing a minimal ε-core in a Markov chain. Our contributions include both negative and positive results: (i) We show that the decision problem on the existence of an ε-core of a given size is NP-complete. This solves an open problem posed in [Jan Kretínský and Tobias Meggendorfer, 2020]. We additionally show that the problem remains NP-complete even when limited to acyclic Markov chains with bounded maximal vertex degree; (ii) We provide a polynomial time algorithm for computing a minimal ε-core on Markov chains over control-flow graphs of structured programs. A straightforward combination of our algorithm with standard branch prediction techniques allows one to apply the idea of cores to find a subset of program lines that are left with low probability and then focus any desired static analysis on this core subset. 
AU - Ahmadi, Ali AU - Chatterjee, Krishnendu AU - Goharshady, Amir Kafshdar AU - Meggendorfer, Tobias AU - Safavi Hemami, Roodabeh AU - Zikelic, Dorde ID - 12102 SN - 1868-8969 T2 - 42nd IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science TI - Algorithms and hardness results for computing cores of Markov chains VL - 250 ER - TY - CONF AB - Spatial games form a widely-studied class of games from biology and physics modeling the evolution of social behavior. Formally, such a game is defined by a square (d by d) payoff matrix M and an undirected graph G. Each vertex of G represents an individual that initially follows some strategy i ∈ {1,2,…,d}. In each round of the game, every individual plays the matrix game with each of its neighbors: An individual following strategy i meeting a neighbor following strategy j receives a payoff equal to the entry (i,j) of M. Then, each individual updates its strategy to its neighbors' strategy with the highest sum of payoffs, and the next round starts. The basic computational problems consist of reachability between configurations and the average frequency of a strategy. For general spatial games and graphs, these problems are in PSPACE. In this paper, we examine a restricted setting: the game is a prisoner’s dilemma; and G is a subgraph of a grid. We prove that basic computational problems for spatial games with prisoner’s dilemma on a subgraph of a grid are PSPACE-hard. AU - Chatterjee, Krishnendu AU - Ibsen-Jensen, Rasmus AU - Jecker, Ismael R AU - Svoboda, Jakub ID - 12101 SN - 1868-8969 T2 - 42nd IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science TI - Complexity of spatial games VL - 250 ER - TY - CONF AB - We treat the problem of risk-aware control for stochastic shortest path (SSP) on Markov decision processes (MDP). Typically, expectation is considered for SSP, which however is oblivious to the incurred risk. We present an alternative view, instead optimizing conditional value-at-risk (CVaR), an established risk measure. We treat both Markov chains as well as MDP and introduce, through novel insights, two algorithms, based on linear programming and value iteration, respectively. Both algorithms offer precise and provably correct solutions. Evaluation of our prototype implementation shows that risk-aware control is feasible on several moderately sized models. AU - Meggendorfer, Tobias ID - 12568 IS - 9 SN - 1577358767 T2 - Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022 TI - Risk-aware stochastic shortest path VL - 36 ER - TY - JOUR AB - A matching is compatible to two or more labeled point sets of size n with labels {1, …, n} if its straight-line drawing on each of these point sets is crossing-free. We study the maximum number of edges in a matching compatible to two or more labeled point sets in general position in the plane. We show that for any two labeled sets of n points in convex position there exists a compatible matching with ⌊√(2n+1) − 1⌋ edges. More generally, for any ℓ labeled point sets we construct compatible matchings of size Ω(n^(1/ℓ)). As a corresponding upper bound, we use probabilistic arguments to show that for any ℓ given sets of n points there exists a labeling of each set such that the largest compatible matching has O(n^(2/(ℓ+1))) edges.
Finally, we show that Θ(log n) copies of any set of n points are necessary and sufficient for the existence of labelings of these point sets such that any compatible matching consists only of a single edge. AU - Aichholzer, Oswin AU - Arroyo Guevara, Alan M AU - Masárová, Zuzana AU - Parada, Irene AU - Perz, Daniel AU - Pilz, Alexander AU - Tkadlec, Josef AU - Vogtenhuber, Birgit ID - 11938 IS - 2 JF - Journal of Graph Algorithms and Applications SN - 1526-1719 TI - On compatible matchings VL - 26 ER - TY - GEN AB - In modern sample-driven Prophet Inequality, an adversary chooses a sequence of n items with values v1,v2,…,vn to be presented to a decision maker (DM). The process follows in two phases. In the first phase (sampling phase), some items, possibly selected at random, are revealed to the DM, but she can never accept them. In the second phase, the DM is presented with the other items in a random order and online fashion. For each item, she must make an irrevocable decision to either accept the item and stop the process or reject the item forever and proceed to the next item. The goal of the DM is to maximize the expected value as compared to a Prophet (or offline algorithm) that has access to all information. In this setting, the sampling phase has no cost and is not part of the optimization process. However, in many scenarios, the samples are obtained as part of the decision-making process. We model this aspect as a two-phase Prophet Inequality where an adversary chooses a sequence of 2n items with values v1,v2,…,v2n and the items are randomly ordered. Finally, there are two phases of the Prophet Inequality problem with the first n-items and the rest of the items, respectively. We show that some basic algorithms achieve a ratio of at most 0.450. We present an algorithm that achieves a ratio of at least 0.495. Finally, we show that for every algorithm the ratio it can achieve is at most 0.502. Hence our algorithm is near-optimal. AU - Chatterjee, Krishnendu AU - Mohammadi, Mona AU - Saona Urmeneta, Raimundo J ID - 12677 T2 - arXiv TI - Repeated prophet inequality with near-optimal bounds ER - TY - JOUR AB - Transforming ω-automata into parity automata is traditionally done using appearance records. We present an efficient variant of this idea, tailored to Rabin automata, and several optimizations applicable to all appearance records. We compare the methods experimentally and show that our method produces significantly smaller automata than previous approaches. AU - Kretinsky, Jan AU - Meggendorfer, Tobias AU - Waldmann, Clara AU - Weininger, Maximilian ID - 10602 JF - Acta Informatica KW - computer networks and communications KW - information systems KW - software SN - 0001-5903 TI - Index appearance record with preorders VL - 59 ER - TY - JOUR AB - Motivated by COVID-19, we develop and analyze a simple stochastic model for the spread of disease in human population. We track how the number of infected and critically ill people develops over time in order to estimate the demand that is imposed on the hospital system. To keep this demand under control, we consider a class of simple policies for slowing down and reopening society and we compare their efficiency in mitigating the spread of the virus from several different points of view. We find that in order to avoid overwhelming of the hospital system, a policy must impose a harsh lockdown or it must react swiftly (or both). 
While reacting swiftly is universally beneficial, being harsh pays off only when the country is patient about reopening and when the neighboring countries coordinate their mitigation efforts. Our work highlights the importance of acting decisively when closing down and the importance of patience and coordination between neighboring countries when reopening. AU - Svoboda, Jakub AU - Tkadlec, Josef AU - Pavlogiannis, Andreas AU - Chatterjee, Krishnendu AU - Nowak, Martin A. ID - 10731 IS - 1 JF - Scientific Reports TI - Infection dynamics of COVID-19 virus under lockdown and reopening VL - 12 ER - TY - CONF AB - We present a novel approach to differential cost analysis that, given a program revision, attempts to statically bound the difference in resource usage, or cost, between the two program versions. Differential cost analysis is particularly interesting because of the many compelling applications for it, such as detecting resource-use regressions at code-review time or proving the absence of certain side-channel vulnerabilities. One prior approach to differential cost analysis is to apply relational reasoning that conceptually constructs a product program on which one can over-approximate the difference in costs between the two program versions. However, a significant challenge in any relational approach is effectively aligning the program versions to get precise results. In this paper, our key insight is that we can avoid the need for and the limitations of program alignment if, instead, we bound the difference of two cost-bound summaries rather than directly bounding the concrete cost difference. In particular, our method computes a threshold value for the maximal difference in cost between two program versions simultaneously using two kinds of cost-bound summaries---a potential function that evaluates to an upper bound for the cost incurred in the first program and an anti-potential function that evaluates to a lower bound for the cost incurred in the second. Our method has a number of desirable properties: it can be fully automated, it allows optimizing the threshold value on relative cost, it is suitable for programs that are not syntactically similar, and it supports non-determinism. We have evaluated an implementation of our approach on a number of program pairs collected from the literature, and we find that our method computes tight threshold values on relative cost in most examples. AU - Zikelic, Dorde AU - Chang, Bor-Yuh Evan AU - Bolignano, Pauline AU - Raimondi, Franco ID - 11459 SN - 9781450392655 T2 - Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation TI - Differential cost analysis with simultaneous potentials and anti-potentials ER - TY - JOUR AB - Structural balance theory is an established framework for studying social relationships of friendship and enmity. These relationships are modeled by a signed network whose energy potential measures the level of imbalance, while stochastic dynamics drives the network toward a state of minimum energy that captures social balance. It is known that this energy landscape has local minima that can trap socially aware dynamics, preventing it from reaching balance. Here we first study the robustness and attractor properties of these local minima. We show that a stochastic process can reach them from an abundance of initial states and that some local minima cannot be escaped by mild perturbations of the network. 
Motivated by these anomalies, we introduce best-edge dynamics (BED), a new plausible stochastic process. We prove that BED always reaches balance and that it does so fast in various interesting settings. AU - Chatterjee, Krishnendu AU - Svoboda, Jakub AU - Zikelic, Dorde AU - Pavlogiannis, Andreas AU - Tkadlec, Josef ID - 12257 IS - 3 JF - Physical Review E SN - 2470-0045 TI - Social balance on networks: Local minima and best-edge dynamics VL - 106 ER - TY - JOUR AB - In repeated interactions, players can use strategies that respond to the outcome of previous rounds. Much of the existing literature on direct reciprocity assumes that all competing individuals use the same strategy space. Here, we study both learning and evolutionary dynamics of players that differ in the strategy space they explore. We focus on the infinitely repeated donation game and compare three natural strategy spaces: memory-1 strategies, which consider the last moves of both players, reactive strategies, which respond to the last move of the co-player, and unconditional strategies. These three strategy spaces differ in the memory capacity that is needed. We compute the long-term average payoff that is achieved in a pairwise learning process. We find that smaller strategy spaces can dominate larger ones. For weak selection, unconditional players dominate both reactive and memory-1 players. For intermediate selection, reactive players dominate memory-1 players. Only for strong selection and a low cost-to-benefit ratio do memory-1 players dominate the others. We observe that the supergame between strategy spaces can be a social dilemma: maximum payoff is achieved if both players explore a larger strategy space, but smaller strategy spaces dominate. AU - Schmid, Laura AU - Hilbe, Christian AU - Chatterjee, Krishnendu AU - Nowak, Martin ID - 12280 IS - 6 JF - PLOS Computational Biology KW - Computational Theory and Mathematics KW - Cellular and Molecular Neuroscience KW - Genetics KW - Molecular Biology KW - Ecology KW - Modeling and Simulation KW - Ecology KW - Evolution KW - Behavior and Systematics TI - Direct reciprocity between individuals that use different strategy spaces VL - 18 ER - TY - JOUR AB - Partially observable Markov decision processes (POMDPs) are standard models for dynamic systems with probabilistic and nondeterministic behaviour in uncertain environments. We prove that in POMDPs with long-run average objective, the decision maker has approximately optimal strategies with finite memory. This implies notably that approximating the long-run value is recursively enumerable, as well as a weak continuity property of the value with respect to the transition function. AU - Chatterjee, Krishnendu AU - Saona Urmeneta, Raimundo J AU - Ziliotto, Bruno ID - 9311 IS - 1 JF - Mathematics of Operations Research KW - Management Science and Operations Research KW - General Mathematics KW - Computer Science Applications SN - 0364-765X TI - Finite-memory strategies in POMDPs with long-run average objectives VL - 47 ER - TY - CONF AB - We present PET, a specialized and highly optimized framework for partial exploration on probabilistic systems. Over the last decade, several significant advances in the analysis of Markov decision processes employed partial exploration. In a nutshell, this idea allows focusing computation on specific parts of the system, guided by heuristics, while maintaining correctness.
In particular, only relevant parts of the system are constructed on demand, which in turn potentially allows omitting the construction of large parts of the system. Depending on the model, this leads to dramatic speed-ups, in extreme cases even up to an arbitrary factor. PET unifies several previous implementations and provides a flexible framework to easily implement partial exploration for many further problems. Our experimental evaluation shows significant improvements compared to the previous implementations while vastly reducing the overhead required to add support for additional properties. AU - Meggendorfer, Tobias ID - 12170 SN - 0302-9743 T2 - 20th International Symposium on Automated Technology for Verification and Analysis TI - PET – A partial exploration tool for probabilistic verification VL - 13505 ER - TY - JOUR AB - Fixed-horizon planning considers a weighted graph and asks to construct a path that maximizes the sum of weights for a given time horizon T. However, in many scenarios, the time horizon is not fixed, but the stopping time is chosen according to some distribution such that the expected stopping time is T. If the stopping-time distribution is not known, then to ensure robustness, the distribution is chosen by an adversary as the worst-case scenario. A stationary plan always chooses the same outgoing edge for every vertex. For a fixed horizon or a fixed stopping-time distribution, stationary plans are not sufficient for optimality. Quite surprisingly, we show that when an adversary chooses the stopping-time distribution with expected stopping-time T, then stationary plans are sufficient. While computing optimal stationary plans for a fixed horizon is NP-complete, we show that computing optimal stationary plans under an adversarial stopping-time distribution can be achieved in polynomial time. AU - Chatterjee, Krishnendu AU - Doyen, Laurent ID - 11402 JF - Journal of Computer and System Sciences SN - 0022-0000 TI - Graph planning with expected finite horizon VL - 129 ER - TY - CONF AB - We consider the problem of approximating the reachability probabilities in Markov decision processes (MDP) with uncountable (continuous) state and action spaces. While there are algorithms that, for special classes of such MDP, provide a sequence of approximations converging to the true value in the limit, our aim is to obtain an algorithm with guarantees on the precision of the approximation. As this problem is undecidable in general, assumptions on the MDP are necessary. Our main contribution is to identify sufficient assumptions that are as weak as possible, thus approaching the "boundary" of which systems can be correctly and reliably analyzed. To this end, we also argue why each of our assumptions is necessary for algorithms based on processing finitely many observations. We present two solution variants. The first one provides converging lower bounds under weaker assumptions than typical ones from previous works concerned with guarantees. The second one then utilizes stronger assumptions to additionally provide converging upper bounds. Altogether, we obtain an anytime algorithm, i.e., an algorithm yielding a sequence of approximants with known and iteratively improving precision, converging to the true value in the limit. Besides, due to the generality of our assumptions, our algorithms are very general templates, readily allowing for various heuristics from the literature, in contrast to, e.g., a specific discretization algorithm.
Our theoretical contribution thus paves the way for future practical improvements without sacrificing correctness guarantees. AU - Grover, Kush AU - Kretinsky, Jan AU - Meggendorfer, Tobias AU - Weininger, Maximilian ID - 12775 SN - 1868-8969 T2 - 33rd International Conference on Concurrency Theory TI - Anytime guarantees for reachability in uncountable Markov decision processes VL - 243 ER - TY - CONF AB - We consider the quantitative problem of obtaining lower bounds on the probability of termination of a given non-deterministic probabilistic program. Specifically, given a non-termination threshold p∈[0,1], we aim for certificates proving that the program terminates with probability at least 1−p. The basic idea of our approach is to find a terminating stochastic invariant, i.e., a subset SI of program states such that (i) the probability of the program ever leaving SI is no more than p, and (ii) almost surely, the program either leaves SI or terminates. While stochastic invariants are already well-known, we provide the first proof that the idea above is not only sound, but also complete for quantitative termination analysis. We then introduce a novel sound and complete characterization of stochastic invariants that enables template-based approaches for easy synthesis of quantitative termination certificates, especially in affine or polynomial forms. Finally, by combining this idea with the existing martingale-based methods that are relatively complete for qualitative termination analysis, we obtain the first automated, sound, and relatively complete algorithm for quantitative termination analysis. Notably, our completeness guarantees for quantitative termination analysis are as strong as the best-known methods for the qualitative variant. Our prototype implementation demonstrates the effectiveness of our approach on various probabilistic programs. We also demonstrate that our algorithm certifies lower bounds on termination probability for probabilistic programs that are beyond the reach of previous methods. AU - Chatterjee, Krishnendu AU - Goharshady, Amir Kafshdar AU - Meggendorfer, Tobias AU - Zikelic, Dorde ID - 12000 SN - 0302-9743 T2 - Proceedings of the 34th International Conference on Computer Aided Verification TI - Sound and complete certificates for quantitative termination analysis of probabilistic programs VL - 13371 ER - TY - JOUR AB - We consider the problem of formally verifying almost-sure (a.s.) asymptotic stability in discrete-time nonlinear stochastic control systems. While verifying stability in deterministic control systems is extensively studied in the literature, verifying stability in stochastic control systems is an open problem. The few existing works on this topic either consider only specialized forms of stochasticity or make restrictive assumptions on the system, rendering them inapplicable to learning algorithms with neural network policies. In this work, we present an approach for general nonlinear stochastic control problems with two novel aspects: (a) instead of classical stochastic extensions of Lyapunov functions, we use ranking supermartingales (RSMs) to certify a.s. asymptotic stability, and (b) we present a method for learning neural network RSMs. We prove that our approach guarantees a.s. asymptotic stability of the system and provides the first method to obtain bounds on the stabilization time, which stochastic Lyapunov functions do not.
Finally, we validate our approach experimentally on a set of nonlinear stochastic reinforcement learning environments with neural network policies. AU - Lechner, Mathias AU - Zikelic, Dorde AU - Chatterjee, Krishnendu AU - Henzinger, Thomas A ID - 12511 IS - 7 JF - Proceedings of the AAAI Conference on Artificial Intelligence KW - General Medicine SN - 2159-5399 TI - Stability verification in stochastic control systems via neural network supermartingales VL - 36 ER - TY - GEN AB - In this work, we address the problem of learning provably stable neural network policies for stochastic control systems. While recent work has demonstrated the feasibility of certifying given policies using martingale theory, the problem of how to learn such policies is little explored. Here, we study the effectiveness of jointly learning a policy together with a martingale certificate that proves its stability using a single learning algorithm. We observe that the joint optimization problem becomes easily stuck in local minima when starting from a randomly initialized policy. Our results suggest that some form of pre-training of the policy is required for the joint optimization to repair and verify the policy successfully. AU - Zikelic, Dorde AU - Lechner, Mathias AU - Chatterjee, Krishnendu AU - Henzinger, Thomas A ID - 14601 T2 - arXiv TI - Learning stabilizing policies in stochastic control systems ER - TY - GEN AB - We study the problem of learning controllers for discrete-time non-linear stochastic dynamical systems with formal reach-avoid guarantees. This work presents the first method for providing formal reach-avoid guarantees, which combine and generalize stability and safety guarantees, with a tolerable probability threshold $p\in[0,1]$ over the infinite time horizon. Our method leverages advances in machine learning literature and it represents formal certificates as neural networks. In particular, we learn a certificate in the form of a reach-avoid supermartingale (RASM), a novel notion that we introduce in this work. Our RASMs provide reachability and avoidance guarantees by imposing constraints on what can be viewed as a stochastic extension of level sets of Lyapunov functions for deterministic systems. Our approach solves several important problems -- it can be used to learn a control policy from scratch, to verify a reach-avoid specification for a fixed control policy, or to fine-tune a pre-trained policy if it does not satisfy the reach-avoid specification. We validate our approach on $3$ stochastic non-linear reinforcement learning tasks. AU - Zikelic, Dorde AU - Lechner, Mathias AU - Henzinger, Thomas A AU - Chatterjee, Krishnendu ID - 14600 T2 - arXiv TI - Learning control policies for stochastic systems with reach-avoid guarantees ER - TY - CONF AB - A deterministic finite automaton (DFA) 𝒜 is composite if its language L(𝒜) can be decomposed into an intersection ⋂_{i = 1}^k L(𝒜_i) of languages of smaller DFAs. Otherwise, 𝒜 is prime. This notion of primality was introduced by Kupferman and Mosheiff in 2013, and while they proved that we can decide whether a DFA is composite, the precise complexity of this problem is still open, with a doubly-exponential gap between the upper and lower bounds. In this work, we focus on permutation DFAs, i.e., those for which the transition monoid is a group. 
We provide an NP algorithm to decide whether a permutation DFA is composite, and show that the difficulty of this problem comes from the number of non-accepting states of the instance: we give a fixed-parameter tractable algorithm with the number of rejecting states as the parameter. Moreover, we investigate the class of commutative permutation DFAs. Their structural properties allow us to decide compositionality in NL, and even in LOGSPACE if the alphabet size is fixed. Despite this low complexity, we show that complex behaviors still arise in this class: we provide a family of composite DFAs each requiring polynomially many factors with respect to its size. We also consider the variant of the problem that asks whether a DFA is k-factor composite, that is, decomposable into k smaller DFAs, for some given integer k ∈ ℕ. We show that, for commutative permutation DFAs, restricting the number of factors makes the decision computationally harder, and yields a problem with tight bounds: it is NP-complete. Finally, we show that in general, this problem is in PSPACE, and it is in LOGSPACE for DFAs with a singleton alphabet. AU - Jecker, Ismael R AU - Mazzocchi, Nicolas AU - Wolf, Petra ID - 10052 SN - 1868-8969 T2 - 32nd International Conference on Concurrency Theory TI - Decomposing permutation automata VL - 203 ER - TY - CONF AB - Graphs and games on graphs are fundamental models for the analysis of reactive systems, in particular, for model-checking and the synthesis of reactive systems. The class of ω-regular languages provides a robust specification formalism for the desired properties of reactive systems. In the classical infinitary formulation of the liveness part of an ω-regular specification, a "good" event must happen eventually without any bound between the good events. A stronger notion of liveness is bounded liveness, which requires that good events happen within d transitions. Given a graph or a game graph with n vertices, m edges, and a bounded liveness objective, the previous best-known algorithmic bounds are as follows: (i) O(dm) for graphs, which in the worst-case is O(n³); and (ii) O(n² d²) for games on graphs. Our main contributions improve these long-standing algorithmic bounds. For graphs we present: (i) a randomized algorithm with one-sided error with running time O(n^{2.5} log n) for the bounded liveness objectives; and (ii) a deterministic linear-time algorithm for the complement of bounded liveness objectives. For games on graphs, we present an O(n² d) time algorithm for the bounded liveness objectives. AU - Chatterjee, Krishnendu AU - Henzinger, Monika H AU - Kale, Sagar Sudhir AU - Svozil, Alexander ID - 10054 SN - 1868-8969 T2 - 48th International Colloquium on Automata, Languages, and Programming TI - Faster algorithms for bounded liveness in graphs and game graphs VL - 198 ER - TY - CONF AB - We study the expressiveness and succinctness of good-for-games pushdown automata (GFG-PDA) over finite words, that is, pushdown automata whose nondeterminism can be resolved based on the run constructed so far, but independently of the remainder of the input word. We prove that GFG-PDA recognise more languages than deterministic PDA (DPDA) but not all context-free languages (CFL). This class is orthogonal to unambiguous CFL. We further show that GFG-PDA can be exponentially more succinct than DPDA, while PDA can be double-exponentially more succinct than GFG-PDA. 
We also study GFGness in visibly pushdown automata (VPA), which enjoy better closure properties than PDA, and for which we show GFGness to be ExpTime-complete. GFG-VPA can be exponentially more succinct than deterministic VPA, while VPA can be exponentially more succinct than GFG-VPA. Both of these lower bounds are tight. Finally, we study the complexity of resolving nondeterminism in GFG-PDA. Every GFG-PDA has a positional resolver, a function that resolves nondeterminism and that is only dependent on the current configuration. Pushdown transducers are sufficient to implement the resolvers of GFG-VPA, but not those of GFG-PDA. GFG-PDA with finite-state resolvers are determinisable. AU - Guha, Shibashis AU - Jecker, Ismael R AU - Lehtinen, Karoliina AU - Zimmermann, Martin ID - 10075 SN - 1868-8969 T2 - 46th International Symposium on Mathematical Foundations of Computer Science TI - A bit of nondeterminism makes pushdown automata expressive and succinct VL - 202 ER - TY - CONF AB - In the Intersection Non-emptiness problem, we are given a list of finite automata A_1, A_2, …, A_m over a common alphabet Σ as input, and the goal is to determine whether some string w ∈ Σ^* lies in the intersection of the languages accepted by the automata in the list. We analyze the complexity of the Intersection Non-emptiness problem under the promise that all input automata accept a language in some level of the dot-depth hierarchy, or some level of the Straubing-Thérien hierarchy. Automata accepting languages from the lowest levels of these hierarchies arise naturally in the context of model checking. We identify a dichotomy in the dot-depth hierarchy by showing that the problem is already NP-complete when all input automata accept languages of the levels B_0 or B_{1/2} and already PSPACE-hard when all automata accept a language from the level B_1. Conversely, we identify a tetrachotomy in the Straubing-Thérien hierarchy. More precisely, we show that the problem is in AC^0 when restricted to level L_0; complete for L or NL, depending on the input representation, when restricted to languages in the level L_{1/2}; NP-complete when the input is given as DFAs accepting a language in L_1 or L_{3/2}; and finally, PSPACE-complete when the input automata accept languages in level L_2 or higher. Moreover, we show that the proof technique used to show containment in NP for DFAs accepting languages in L_1 or L_{3/2} does not generalize to the context of NFAs. To prove this, we identify a family of languages that provide an exponential separation between the state complexity of general NFAs and that of partially ordered NFAs. To the best of our knowledge, this is the first superpolynomial separation between these two models of computation. AU - Arrighi, Emmanuel AU - Fernau, Henning AU - Hoffmann, Stefan AU - Holzer, Markus AU - Jecker, Ismael R AU - De Oliveira Oliveira, Mateus AU - Wolf, Petra ID - 10630 SN - 1868-8969 T2 - 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science TI - On the complexity of intersection non-emptiness for star-free language classes VL - 213 ER - TY - CONF AB - Product graphs arise naturally in formal verification and program analysis. For example, the analysis of two concurrent threads requires the product of two component control-flow graphs, and for language inclusion of deterministic automata the product of two automata is constructed.
In many cases, the component graphs have constant treewidth, e.g., when the input contains control-flow graphs of programs. We consider the algorithmic analysis of products of two constant-treewidth graphs with respect to three classic specification languages, namely, (a) algebraic properties, (b) mean-payoff properties, and (c) initial credit for energy properties. Our main contributions are as follows. Consider a graph G that is the product of two constant-treewidth graphs of size n each. First, given an idempotent semiring, we present an algorithm that computes the semiring transitive closure of G in time Õ(n⁴). Since the output has size Θ(n⁴), our algorithm is optimal (up to polylog factors). Second, given a mean-payoff objective, we present an O(n³)-time algorithm for deciding whether the value of a starting state is non-negative, improving the previously known O(n⁴) bound. Third, given an initial credit for energy objective, we present an O(n⁵)-time algorithm for computing the minimum initial credit for all nodes of G, improving the previously known O(n⁸) bound. At the heart of our approach lies an algorithm for the efficient construction of strongly-balanced tree decompositions of constant-treewidth graphs. Given a constant-treewidth graph G' of n nodes and a positive integer λ, our algorithm constructs a binary tree decomposition of G' of width O(λ) with the property that the size of each subtree decreases geometrically with rate (1/2 + 2^{-λ}). AU - Chatterjee, Krishnendu AU - Ibsen-Jensen, Rasmus AU - Pavlogiannis, Andreas ID - 10629 SN - 1868-8969 T2 - 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science TI - Quantitative verification on product graphs of small treewidth VL - 213 ER - TY - CONF AB - In a two-player zero-sum graph game the players move a token throughout a graph to produce an infinite path, which determines the winner or payoff of the game. Traditionally, the players alternate turns in moving the token. In bidding games, however, the players have budgets, and in each turn, we hold an “auction” (bidding) to determine which player moves the token: both players simultaneously submit bids and the higher bidder moves the token. The bidding mechanisms differ in their payment schemes. Bidding games were largely studied with variants of first-price bidding in which only the higher bidder pays his bid. We focus on all-pay bidding, where both players pay their bids. Finite-duration all-pay bidding games were studied and shown to be technically more challenging than their first-price counterparts. We study for the first time, infinite-duration all-pay bidding games. Our most interesting results are for mean-payoff objectives: we portray a complete picture for games played on strongly-connected graphs. We study both pure (deterministic) and mixed (probabilistic) strategies and completely characterize the optimal and almost-sure (with probability 1) payoffs the players can respectively guarantee. We show that mean-payoff games under all-pay bidding exhibit the intriguing mathematical properties of their first-price counterparts; namely, an equivalence with random-turn games in which in each turn, the player who moves is selected according to a (biased) coin toss. The equivalences for all-pay bidding are more intricate and unexpected than for first-price bidding. 
AU - Avni, Guy AU - Jecker, Ismael R AU - Zikelic, Dorde ED - Marx, Dániel ID - 10694 SN - 978-1-61197-646-5 T2 - Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms TI - Infinite-duration all-pay bidding games ER - TY - CONF AB - We study the two-player zero-sum extension of the partially observable stochastic shortest-path problem where one agent has only partial information about the environment. We formulate this problem as a partially observable stochastic game (POSG): given a set of target states and negative rewards for each transition, the player with imperfect information maximizes the expected undiscounted total reward until a target state is reached. The second player with the perfect information aims for the opposite. We base our formalism on POSGs with one-sided observability (OS-POSGs) and give the following contributions: (1) we introduce a novel heuristic search value iteration algorithm that iteratively solves depth-limited variants of the game, (2) we derive the bound on the depth guaranteeing an arbitrary precision, (3) we propose a novel upper-bound estimation that allows early terminations, and (4) we experimentally evaluate the algorithm on a pursuit-evasion game. AU - Tomášek, Petr AU - Horák, Karel AU - Aradhye, Aditya AU - Bošanský, Branislav AU - Chatterjee, Krishnendu ID - 10847 SN - 1045-0823 T2 - 30th International Joint Conference on Artificial Intelligence TI - Solving partially observable stochastic shortest-path games ER - TY - CONF AB - A matching is compatible to two or more labeled point sets of size n with labels {1,…,n} if its straight-line drawing on each of these point sets is crossing-free. We study the maximum number of edges in a matching compatible to two or more labeled point sets in general position in the plane. We show that for any two labeled convex sets of n points there exists a compatible matching with ⌊√(2n)⌋ edges. More generally, for any ℓ labeled point sets we construct compatible matchings of size Ω(n^{1/ℓ}). As a corresponding upper bound, we use probabilistic arguments to show that for any ℓ given sets of n points there exists a labeling of each set such that the largest compatible matching has O(n^{2/(ℓ+1)}) edges. Finally, we show that Θ(log n) copies of any set of n points are necessary and sufficient for the existence of a labeling such that any compatible matching consists only of a single edge. AU - Aichholzer, Oswin AU - Arroyo Guevara, Alan M AU - Masárová, Zuzana AU - Parada, Irene AU - Perz, Daniel AU - Pilz, Alexander AU - Tkadlec, Josef AU - Vogtenhuber, Birgit ID - 9296 SN - 03029743 T2 - 15th International Conference on Algorithms and Computation TI - On compatible matchings VL - 12635 ER - TY - CHAP AB - Optimal decision making requires individuals to know their available options and to anticipate correctly what consequences these options have. In many social interactions, however, we refrain from gathering all relevant information, even if this information would help us make better decisions and is costless to obtain. This chapter examines several examples of “deliberate ignorance.” Two simple models are proposed to illustrate how ignorance can evolve among self-interested and payoff-maximizing individuals, and open problems are highlighted that lie ahead for future research to explore.
AU - Schmid, Laura AU - Hilbe, Christian ED - Hertwig, Ralph ED - Engel, Christoph ID - 9403 SN - 978-0-262-04559-9 T2 - Deliberate Ignorance: Choosing Not To Know TI - The evolution of strategic ignorance in strategic interaction VL - 29 ER - TY - CONF AB - Several problems in planning and reactive synthesis can be reduced to the analysis of two-player quantitative graph games. Optimization is one form of analysis. We argue that in many cases it may be better to replace the optimization problem with the satisficing problem, where instead of searching for optimal solutions, the goal is to search for solutions that adhere to a given threshold bound. This work defines and investigates the satisficing problem on a two-player graph game with the discounted-sum cost model. We show that while the satisficing problem can be solved using numerical methods just like the optimization problem, this approach does not render compelling benefits over optimization. When the discount factor is, however, an integer, we present another approach to satisficing, which is purely based on automata methods. We show that this approach is algorithmically more performant – both theoretically and empirically – and demonstrates the broader applicability of satisficing over optimization. AU - Bansal, Suguman AU - Chatterjee, Krishnendu AU - Vardi, Moshe Y. ID - 12767 SN - 0302-9743 T2 - 27th International Conference on Tools and Algorithms for the Construction and Analysis of Systems TI - On satisficing in quantitative games VL - 12651 ER - TY - CONF AB - Bayesian neural networks (BNNs) place distributions over the weights of a neural network to model uncertainty in the data and the network's prediction. We consider the problem of verifying safety when running a Bayesian neural network policy in a feedback loop with infinite time horizon systems. Compared to the existing sampling-based approaches, which are inapplicable to the infinite time horizon setting, we train a separate deterministic neural network that serves as an infinite time horizon safety certificate. In particular, we show that the certificate network guarantees the safety of the system over a subset of the BNN weight posterior's support. Our method first computes a safe weight set and then alters the BNN's weight posterior to reject samples outside this set. Moreover, we show how to extend our approach to a safe-exploration reinforcement learning setting, in order to avoid unsafe trajectories during the training of the policy. We evaluate our approach on a series of reinforcement learning benchmarks, including non-Lyapunovian safety specifications. AU - Lechner, Mathias AU - Žikelić, Ðorđe AU - Chatterjee, Krishnendu AU - Henzinger, Thomas A ID - 10667 T2 - 35th Conference on Neural Information Processing Systems TI - Infinite time horizon safety of Bayesian neural networks ER - TY - JOUR AB - We study optimal election sequences for repeatedly selecting a (very) small group of leaders among a set of participants (players) with publicly known unique ids. In every time slot, every player has to select exactly one player that it considers to be the current leader, oblivious to the selection of the other players, but with the overarching goal of maximizing a given parameterized global (“social”) payoff function in the limit. 
We consider a quite generic model, where the local payoff achieved by a given player depends, weighted by some arbitrary but fixed real parameter, on the number of different leaders chosen in a round, the number of players that choose the given player as the leader, and whether the chosen leader has changed w.r.t. the previous round or not. The social payoff can be the maximum, average or minimum local payoff of the players. Possible applications include quite diverse examples such as rotating coordinator-based distributed algorithms and long-haul formation flying of social birds. Depending on the weights and the particular social payoff, optimal sequences can be very different, from simple round-robin, where all players choose the same leader alternatingly in every time slot, to very exotic patterns, where a small group of leaders (at most 2) is elected in every time slot. Moreover, we study the question of whether and when a single player would not benefit w.r.t. its local payoff when deviating from the given optimal sequence, i.e., when our optimal sequences are Nash equilibria in the restricted strategy space of oblivious strategies. As this is the case for many parameterizations of our model, our results reveal that no punishment is needed to make it rational for the players to optimize the social payoff. AU - Zeiner, Martin AU - Schmid, Ulrich AU - Chatterjee, Krishnendu ID - 8793 IS - 1 JF - Discrete Applied Mathematics SN - 0166218X TI - Optimal strategies for selecting coordinators VL - 289 ER - TY - JOUR AB - A game of rock-paper-scissors is an interesting example of an interaction where none of the pure strategies strictly dominates all others, leading to a cyclic pattern. In this work, we consider an unstable version of rock-paper-scissors dynamics and allow individuals to make behavioural mistakes during the strategy execution. We show that such an assumption can break a cyclic relationship, leading to a stable equilibrium emerging with only one strategy surviving. We consider two cases: completely random mistakes when individuals have no bias towards any strategy and a general form of mistakes. Then, we determine conditions for a strategy to dominate all other strategies. However, given that individuals who adopt a dominating strategy are still prone to behavioural mistakes in the observed behaviour, we may still observe extinct strategies. That is, behavioural mistakes in strategy execution stabilise evolutionary dynamics, leading to an evolutionarily stable and, potentially, mixed co-existence equilibrium. AU - Kleshnina, Maria AU - Streipert, Sabrina S. AU - Filar, Jerzy A. AU - Chatterjee, Krishnendu ID - 9381 IS - 4 JF - PLoS Computational Biology SN - 1553734X TI - Mistakes can stabilise the dynamics of rock-paper-scissors games VL - 17 ER - TY - JOUR AB - Selection and random drift determine the probability that novel mutations fixate in a population. Population structure is known to affect the dynamics of the evolutionary process. Amplifiers of selection are population structures that increase the fixation probability of beneficial mutants compared to well-mixed populations. Over the past 15 years, extensive research has produced remarkable structures called strong amplifiers which guarantee that every beneficial mutation fixates with high probability. But strong amplification has come at the cost of considerably delaying the fixation event, which can slow down the overall rate of evolution. However, the precise relationship between fixation probability and time has remained elusive.
Here we characterize the slowdown effect of strong amplification. First, we prove that all strong amplifiers must delay the fixation event at least to some extent. Second, we construct strong amplifiers that delay the fixation event only marginally as compared to the well-mixed populations. Our results thus establish a tight relationship between fixation probability and time: Strong amplification always comes at a cost of a slowdown, but more than a marginal slowdown is not needed. AU - Tkadlec, Josef AU - Pavlogiannis, Andreas AU - Chatterjee, Krishnendu AU - Nowak, Martin A. ID - 9640 IS - 1 JF - Nature Communications TI - Fast and strong amplifiers of natural selection VL - 12 ER - TY - CONF AB - We consider the fundamental problem of deriving quantitative bounds on the probability that a given assertion is violated in a probabilistic program. We provide automated algorithms that obtain both lower and upper bounds on the assertion violation probability. The main novelty of our approach is that we prove new and dedicated fixed-point theorems which serve as the theoretical basis of our algorithms and enable us to reason about assertion violation bounds in terms of pre and post fixed-point functions. To synthesize such fixed-points, we devise algorithms that utilize a wide range of mathematical tools, including repulsing ranking supermartingales, Hoeffding's lemma, Minkowski decompositions, Jensen's inequality, and convex optimization. On the theoretical side, we provide (i) the first automated algorithm for lower-bounds on assertion violation probabilities, (ii) the first complete algorithm for upper-bounds of exponential form in affine programs, and (iii) provably and significantly tighter upper-bounds than the previous approaches. On the practical side, we show our algorithms can handle a wide variety of programs from the literature and synthesize bounds that are remarkably tighter than previous results, in some cases by thousands of orders of magnitude. AU - Wang, Jinyi AU - Sun, Yican AU - Fu, Hongfei AU - Chatterjee, Krishnendu AU - Goharshady, Amir Kafshdar ID - 9646 SN - 9781450383912 T2 - Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation TI - Quantitative analysis of assertion violations in probabilistic programs ER - TY - CONF AB - We consider the fundamental problem of reachability analysis over imperative programs with real variables. Previous works that tackle reachability are either unable to handle programs consisting of general loops (e.g. symbolic execution), or lack completeness guarantees (e.g. abstract interpretation), or are not automated (e.g. incorrectness logic). In contrast, we propose a novel approach for reachability analysis that can handle general and complex loops, is complete, and can be entirely automated for a wide family of programs. Through the notion of Inductive Reachability Witnesses (IRWs), our approach extends ideas from both invariant generation and termination to reachability analysis. We first show that our IRW-based approach is sound and complete for reachability analysis of imperative programs. Then, we focus on linear and polynomial programs and develop automated methods for synthesizing linear and polynomial IRWs. In the linear case, we follow the well-known approaches using Farkas' Lemma. Our main contribution is in the polynomial case, where we present a push-button semi-complete algorithm. 
We achieve this using a novel combination of classical theorems in real algebraic geometry, such as Putinar's Positivstellensatz and Hilbert's Strong Nullstellensatz. Finally, our experimental results show we can prove complex reachability objectives over various benchmarks that were beyond the reach of previous methods. AU - Asadi, Ali AU - Chatterjee, Krishnendu AU - Fu, Hongfei AU - Goharshady, Amir Kafshdar AU - Mahdavi, Mohammad ID - 9645 SN - 9781450383912 T2 - Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation TI - Polynomial reachability witnesses via Stellensätze ER - TY - CONF AB - We present a faster symbolic algorithm for the following central problem in probabilistic verification: Compute the maximal end-component (MEC) decomposition of Markov decision processes (MDPs). This problem generalizes the SCC decomposition problem of graphs and closed recurrent sets of Markov chains. The model of symbolic algorithms is widely used in formal verification and model-checking, where access to the input model is restricted to only symbolic operations (e.g., basic set operations and computation of one-step neighborhood). For an input MDP with n vertices and m edges, the classical symbolic algorithm from the 1990s for the MEC decomposition requires O(n²) symbolic operations and O(1) symbolic space. The only other symbolic algorithm for the MEC decomposition requires O(n√m) symbolic operations and O(√m) symbolic space. A main open question is whether the worst-case O(n²) bound for symbolic operations can be beaten. We present a symbolic algorithm that requires Õ(n^{1.5}) symbolic operations and Õ(√n) symbolic space. Moreover, the parametrization of our algorithm provides a trade-off between symbolic operations and symbolic space: for all 0 < ϵ ≤ 1/2 the symbolic algorithm requires Õ(n^{2−ϵ}) symbolic operations and Õ(n^ϵ) symbolic space (Õ hides poly-logarithmic factors). Using our techniques we present faster algorithms for computing the almost-sure winning regions of ω-regular objectives for MDPs. We consider the canonical parity objectives for ω-regular objectives, and for parity objectives with d priorities we present an algorithm that computes the almost-sure winning region with Õ(n^{2−ϵ}) symbolic operations and Õ(n^ϵ) symbolic space, for all 0 < ϵ ≤ 1/2. AU - Chatterjee, Krishnendu AU - Dvorak, Wolfgang AU - Henzinger, Monika H AU - Svozil, Alexander ID - 10002 KW - Computer science KW - Computational modeling KW - Markov processes KW - Probabilistic logic KW - Formal verification KW - Game Theory SN - 1043-6871 T2 - Proceedings of the 36th Annual ACM/IEEE Symposium on Logic in Computer Science TI - Symbolic time and space tradeoffs for probabilistic verification ER - TY - CONF AB - Markov chains are the de facto finite-state model for stochastic dynamical systems, and Markov decision processes (MDPs) extend Markov chains by incorporating non-deterministic behaviors. Given an MDP and rewards on states, a classical optimization criterion is the maximal expected total reward where the MDP stops after T steps, which can be computed by a simple dynamic programming algorithm. We consider a natural generalization of the problem where the stopping times can be chosen according to a probability distribution, such that the expected stopping time is T, to optimize the expected total reward.
Quite surprisingly we establish inter-reducibility of the expected stopping-time problem for Markov chains with the Positivity problem (which is related to the well-known Skolem problem), for which establishing either decidability or undecidability would be a major breakthrough. Given the hardness of the exact problem, we consider the approximate version of the problem: we show that it can be solved in exponential time for Markov chains and in exponential space for MDPs. AU - Chatterjee, Krishnendu AU - Doyen, Laurent ID - 10004 KW - Computer science KW - Heuristic algorithms KW - Memory management KW - Automata KW - Markov processes KW - Probability distribution KW - Complexity theory SN - 1043-6871 T2 - Proceedings of the 36th Annual ACM/IEEE Symposium on Logic in Computer Science TI - Stochastic processes with expected stopping time ER - TY - CONF AB - Repeated idempotent elements are commonly used to characterise iterable behaviours in abstract models of computation. Therefore, given a monoid M, it is natural to ask how long a sequence of elements of M needs to be to ensure the presence of consecutive idempotent factors. This question is formalised through the notion of the Ramsey function R_M associated to M, obtained by mapping every k ∈ ℕ to the minimal integer R_M(k) such that every word u ∈ M^* of length R_M(k) contains k consecutive non-empty factors that correspond to the same idempotent element of M. In this work, we study the behaviour of the Ramsey function R_M by investigating the regular 𝒟-length of M, defined as the largest size L(M) of a submonoid of M isomorphic to the set of natural numbers {1,2, …, L(M)} equipped with the max operation. We show that the regular 𝒟-length of M determines the degree of R_M, by proving that k^L(M) ≤ R_M(k) ≤ (k|M|⁴)^L(M). To allow applications of this result, we provide the value of the regular 𝒟-length of diverse monoids. In particular, we prove that the full monoid of n × n Boolean matrices, which is used to express transition monoids of non-deterministic automata, has a regular 𝒟-length of (n²+n+2)/2. AU - Jecker, Ismael R ID - 10055 SN - 1868-8969 T2 - 38th International Symposium on Theoretical Aspects of Computer Science TI - A Ramsey theorem for finite monoids VL - 187 ER - TY - CONF AB - Stateless model checking (SMC) is one of the standard approaches to the verification of concurrent programs. As scheduling non-determinism creates exponentially large spaces of thread interleavings, SMC attempts to partition this space into equivalence classes and explore only a few representatives from each class. The efficiency of this approach depends on two factors: (a) the coarseness of the partitioning, and (b) the time to generate representatives in each class. For this reason, the search for coarse partitionings that are efficiently explorable is an active research challenge. In this work we present RVF-SMC , a new SMC algorithm that uses a novel reads-value-from (RVF) partitioning. Intuitively, two interleavings are deemed equivalent if they agree on the value obtained in each read event, and read events induce consistent causal orderings between them. The RVF partitioning is provably coarser than recent approaches based on Mazurkiewicz and “reads-from” partitionings. Our experimental evaluation reveals that RVF is quite often a very effective equivalence, as the underlying partitioning is exponentially coarser than other approaches. 
Moreover, RVF-SMC generates representatives very efficiently, as the reduction in the partitioning is often met with significant speed-ups in the model checking task. AU - Agarwal, Pratyush AU - Chatterjee, Krishnendu AU - Pathak, Shreya AU - Pavlogiannis, Andreas AU - Toman, Viktor ID - 9987 SN - 0302-9743 T2 - 33rd International Conference on Computer-Aided Verification TI - Stateless model checking under a reads-value-from equivalence VL - 12759 ER - TY - JOUR AB - In this work we solve the algorithmic problem of consistency verification for the TSO and PSO memory models given a reads-from map, denoted VTSO-rf and VPSO-rf, respectively. For an execution of n events over k threads and d variables, we establish novel bounds that scale as n^{k+1} for TSO and as n^{k+1}·min(n^{k²}, 2^{k·d}) for PSO. Moreover, based on our solution to these problems, we develop an SMC algorithm under TSO and PSO that uses the RF equivalence. The algorithm is exploration-optimal, in the sense that it is guaranteed to explore each class of the RF partitioning exactly once, and spends polynomial time per class when k is bounded. Finally, we implement all our algorithms in the SMC tool Nidhugg, and perform a large number of experiments over benchmarks from existing literature. Our experimental results show that our algorithms for VTSO-rf and VPSO-rf provide significant scalability improvements over standard alternatives. Moreover, when used for SMC, the RF partitioning is often much coarser than the standard Shasha-Snir partitioning for TSO/PSO, which yields a significant speedup in the model checking task. AU - Bui, Truc Lam AU - Chatterjee, Krishnendu AU - Gautam, Tushar AU - Pavlogiannis, Andreas AU - Toman, Viktor ID - 10191 IS - OOPSLA JF - Proceedings of the ACM on Programming Languages KW - safety KW - risk KW - reliability and quality KW - software TI - The reads-from equivalence for the TSO and PSO memory models VL - 5 ER - TY - THES AB - The design and verification of concurrent systems remains an open challenge due to the non-determinism that arises from the inter-process communication. In particular, concurrent programs are notoriously difficult both to write correctly and to analyze formally, as complex thread interaction has to be accounted for. The difficulties are further exacerbated when concurrent programs get executed on modern-day hardware, which contains various buffering and caching mechanisms for efficiency reasons. This causes further subtle non-determinism, which can often produce very unintuitive behavior of the concurrent programs. Model checking is at the forefront of tackling the verification problem, where the task is to decide, given as input a concurrent system and a desired property, whether the system satisfies the property. The inherent state-space explosion problem in model checking of concurrent systems causes naïve explicit methods not to scale; thus, more inventive methods are required. One such method is stateless model checking (SMC), which explores, in a memory-efficient manner, the program executions rather than the states of the program. State-of-the-art SMC is typically coupled with partial order reduction (POR) techniques, which argue that certain executions provably produce identical system behavior, thus limiting the number of executions one needs to explore in order to cover all possible behaviors.
Another method to tackle the state-space explosion is symbolic model checking, where the considered techniques operate on a succinct implicit representation of the input system rather than explicitly accessing the system. In this thesis we present new techniques for verification of concurrent systems. We present several novel POR methods for SMC of concurrent programs under various models of semantics, some of which account for write-buffering mechanisms. Additionally, we present novel algorithms for symbolic model checking of finite-state concurrent systems, where the desired property of the systems is to ensure a formally defined notion of fairness. AU - Toman, Viktor ID - 10199 KW - concurrency KW - verification KW - model checking SN - 2663-337X TI - Improved verification techniques for concurrent systems ER - TY - JOUR AB - We consider planning problems for graphs, Markov Decision Processes (MDPs), and games on graphs in an explicit state space. While graphs represent the most basic planning model, MDPs represent interaction with nature and games on graphs represent interaction with an adversarial environment. We consider two planning problems with k different target sets: (a) the coverage problem asks whether there is a plan for each individual target set; and (b) the sequential target reachability problem asks whether the targets can be reached in a given sequence. For the coverage problem, we present a linear-time algorithm for graphs, and a quadratic conditional lower bound for MDPs and games on graphs. For the sequential target problem, we present a linear-time algorithm for graphs, a sub-quadratic algorithm for MDPs, and a quadratic conditional lower bound for games on graphs. Our results with conditional lower bounds, based on the boolean matrix multiplication (BMM) conjecture and strong exponential time hypothesis (SETH), establish (i) model-separation results showing that for the coverage problem MDPs and games on graphs are harder than graphs, and for the sequential reachability problem games on graphs are harder than MDPs and graphs; and (ii) problem-separation results showing that for MDPs the coverage problem is harder than the sequential target problem. AU - Chatterjee, Krishnendu AU - Dvořák, Wolfgang AU - Henzinger, Monika H AU - Svozil, Alexander ID - 9293 IS - 8 JF - Artificial Intelligence SN - 0004-3702 TI - Algorithms and conditional lower bounds for planning problems VL - 297 ER - TY - JOUR AB - We consider the core algorithmic problems related to verification of systems with respect to three classical quantitative properties, namely, the mean-payoff, the ratio, and the minimum initial credit for energy property. The algorithmic problem, given a graph and a quantitative property, asks to compute the optimal value (the infimum value over all traces) from every node of the graph. We consider graphs with bounded treewidth—a class that contains the control flow graphs of most programs. Let n denote the number of nodes of a graph, m the number of edges (for bounded treewidth m = O(n)) and W the largest absolute value of the weights. Our main theoretical results are as follows. First, for the minimum initial credit problem we show that (1) for general graphs the problem can be solved in O(n²·m) time and the associated decision problem in O(n·m) time, improving the previously known O(n³·m·log(n·W)) and O(n²·m) bounds, respectively; and (2) for bounded treewidth graphs we present an algorithm that requires O(n·log n) time.
Second, for bounded treewidth graphs we present an algorithm that approximates the mean-payoff value within a factor of 1+ϵ in time O(n·log(n/ϵ)), as compared to the classical exact algorithms on general graphs that require quadratic time. Third, for the ratio property we present an algorithm that for bounded treewidth graphs works in time O(n·log(|a·b|)) = O(n·log(n·W)), when the output is a/b, as compared to the previously best known algorithm on general graphs with running time O(n²·log(n·W)). We have implemented some of our algorithms and show that they present a significant speedup on standard benchmarks. AU - Chatterjee, Krishnendu AU - Ibsen-Jensen, Rasmus AU - Pavlogiannis, Andreas ID - 9393 JF - Formal Methods in System Design SN - 0925-9856 TI - Faster algorithms for quantitative verification in bounded treewidth graphs VL - 57 ER - TY - CONF AB - We present a new approach to proving non-termination of non-deterministic integer programs. Our technique is rather simple but efficient. It relies on a purely syntactic reversal of the program's transition system followed by a constraint-based invariant synthesis with constraints coming from both the original and the reversed transition system. The latter task is performed by a simple call to an off-the-shelf SMT-solver, which allows us to leverage the latest advances in SMT-solving. Moreover, our method offers a combination of features not present (as a whole) in previous approaches: it handles programs with non-determinism, provides relative completeness guarantees, and supports programs with polynomial arithmetic. The experiments performed with our prototype tool RevTerm show that our approach, despite its simplicity and stronger theoretical guarantees, is at least on par with the state-of-the-art tools, often achieving a non-trivial improvement under a proper configuration of its parameters. AU - Chatterjee, Krishnendu AU - Goharshady, Ehsan Kafshdar AU - Novotný, Petr AU - Zikelic, Dorde ID - 9644 SN - 9781450383912 T2 - Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation TI - Proving non-termination by program reversal ER - TY - CONF AB - We consider the almost-sure (a.s.) termination problem for probabilistic programs, which are a stochastic extension of classical imperative programs. Lexicographic ranking functions provide a sound and practical approach for termination of non-probabilistic programs, and their extension to probabilistic programs is achieved via lexicographic ranking supermartingales (LexRSMs). However, the LexRSMs introduced in previous work have a limitation that impedes their automation: all of their components have to be non-negative in all reachable states. This might result in a LexRSM not existing even for simple terminating programs. Our contributions are twofold: First, we introduce a generalization of LexRSMs which allows for some components to be negative. This standard feature of non-probabilistic termination proofs was hitherto not known to be sound in the probabilistic setting, as the soundness proof requires a careful analysis of the underlying stochastic process. Second, we present polynomial-time algorithms using our generalized LexRSMs for proving a.s. termination in broad classes of linear-arithmetic programs.
AU - Chatterjee, Krishnendu AU - Goharshady, Ehsan Kafshdar AU - Novotný, Petr AU - Zárevúcky, Jiří AU - Zikelic, Dorde ID - 10414 SN - 0302-9743 T2 - 24th International Symposium on Formal Methods TI - On lexicographic proof rules for probabilistic termination VL - 13047 ER - TY - THES AB - In this thesis, we consider several of the most classical and fundamental problems in static analysis and formal verification, including invariant generation, reachability analysis, termination analysis of probabilistic programs, data-flow analysis, quantitative analysis of Markov chains and Markov decision processes, and the problem of data packing in cache management. We use techniques from parameterized complexity theory, polyhedral geometry, and real algebraic geometry to significantly improve the state of the art, in terms of both scalability and completeness guarantees, for the mentioned problems. In some cases, our results are the first theoretical improvements for the respective problems in two or three decades. AU - Goharshady, Amir Kafshdar ID - 8934 SN - 2663-337X TI - Parameterized and algebro-geometric advances in static program analysis ER - TY - THES AB - Indirect reciprocity in evolutionary game theory is a prominent mechanism for explaining the evolution of cooperation among unrelated individuals. In contrast to direct reciprocity, which is based on individuals meeting repeatedly and conditionally cooperating by using their own experiences, indirect reciprocity is based on individuals’ reputations. If a player helps another, this increases the helper’s public standing, benefitting them in the future. This lets cooperation in the population emerge without individuals having to meet more than once. While the two modes of reciprocity are intertwined, they are difficult to compare. Thus, they are usually studied in isolation. Direct reciprocity can maintain cooperation with simple strategies, and is robust against noise even when players do not remember more than their partner’s last action. Meanwhile, indirect reciprocity requires its successful strategies, or social norms, to be more complex. Exhaustive search previously identified eight such norms, called the “leading eight”, which excel at maintaining cooperation. However, as the first result of this thesis, we show that the leading eight break down once we remove the fundamental assumption that information is synchronized and public, such that everyone agrees on reputations. Once we consider a more realistic scenario of imperfect information, where reputations are private, and individuals occasionally misinterpret or miss observations, the leading eight do not promote cooperation anymore. Instead, minor initial disagreements can proliferate, fragmenting populations into subgroups. As a next step, we consider ways to mitigate this issue. We first explore whether introducing “generosity” can stabilize cooperation when players use the leading eight strategies in noisy environments. This approach of modifying strategies to include probabilistic elements for coping with errors is known to work well in direct reciprocity. However, as we show here, it fails for the more complex norms of indirect reciprocity. Imperfect information still prevents cooperation from evolving. On the other hand, we succeeded in showing in this thesis that modifying the leading eight to use “quantitative assessment”, i.e.
tracking reputation scores on a scale beyond good and bad, and making overall judgments of others based on a threshold, is highly successful, even when noise increases in the environment. Cooperation can flourish when reputations are more nuanced, and players have a broader understanding of what it means to be “good.” Finally, we present a single theoretical framework that unites the two modes of reciprocity despite their differences. Within this framework, we identify a novel, simple, and successful strategy for indirect reciprocity, which can cope with noisy environments and has an analogue in direct reciprocity. We can also analyze decision making when different sources of information are available. Our results help highlight that, for sustaining cooperation, even the simplest rules of reciprocity can be sufficient. AU - Schmid, Laura ID - 10293 SN - 2663-337X TI - Evolution of cooperation via (in)direct reciprocity under imperfect information ER - TY - JOUR AB - Indirect reciprocity is a mechanism for the evolution of cooperation based on social norms. This mechanism requires that individuals in a population observe and judge each other’s behaviors. Individuals with a good reputation are more likely to receive help from others. Previous work suggests that indirect reciprocity is only effective when all relevant information is reliable and publicly available. Otherwise, individuals may disagree on how to assess others, even if they all apply the same social norm. Such disagreements can lead to a breakdown of cooperation. Here we explore whether the predominantly studied ‘leading eight’ social norms of indirect reciprocity can be made more robust by equipping them with an element of generosity. To this end, we distinguish between two kinds of generosity. According to assessment generosity, individuals occasionally assign a good reputation to group members who would usually be regarded as bad. According to action generosity, individuals occasionally cooperate with group members with whom they would usually defect. Using individual-based simulations, we show that the two kinds of generosity have a very different effect on the resulting reputation dynamics. Assessment generosity tends to add to the overall noise and allows defectors to invade. In contrast, a limited amount of action generosity can be beneficial in a few cases. However, even when action generosity is beneficial, the respective simulations do not result in full cooperation. Our results suggest that while generosity can favor cooperation when individuals use the simplest strategies of reciprocity, it is disadvantageous when individuals use more complex social norms. AU - Schmid, Laura AU - Shati, Pouya AU - Hilbe, Christian AU - Chatterjee, Krishnendu ID - 9997 IS - 1 JF - Scientific Reports KW - Multidisciplinary TI - The evolution of indirect reciprocity under action and assessment generosity VL - 11 ER - TY - JOUR AB - Direct and indirect reciprocity are key mechanisms for the evolution of cooperation. Direct reciprocity means that individuals use their own experience to decide whether to cooperate with another person. Indirect reciprocity means that they also consider the experiences of others. Although these two mechanisms are intertwined, they are typically studied in isolation. Here, we introduce a mathematical framework that allows us to explore both kinds of reciprocity simultaneously. 
We show that the well-known ‘generous tit-for-tat’ strategy of direct reciprocity has a natural analogue in indirect reciprocity, which we call ‘generous scoring’. Using an equilibrium analysis, we characterize under which conditions either of the two strategies can maintain cooperation. With simulations, we additionally explore which kind of reciprocity evolves when members of a population engage in social learning to adapt to their environment. Our results draw unexpected connections between direct and indirect reciprocity while highlighting important differences regarding their evolvability. AU - Schmid, Laura AU - Chatterjee, Krishnendu AU - Hilbe, Christian AU - Nowak, Martin A. ID - 9402 IS - 10 JF - Nature Human Behaviour TI - A unified framework of direct and indirect reciprocity VL - 5 ER - TY - CONF AB - The Price of Anarchy (PoA) is a well-established game-theoretic concept to shed light on coordination issues arising in open distributed systems. Leaving agents to selfishly optimize comes with the risk of ending up in sub-optimal states (in terms of performance and/or costs), compared to a centralized system design. However, the PoA relies on strong assumptions about agents' rationality (e.g., resources and information) and interactions, whereas in many distributed systems agents interact locally with bounded resources. They do so repeatedly over time (in contrast to "one-shot games"), and their strategies may evolve. Using a more realistic evolutionary game model, this paper introduces a realized evolutionary Price of Anarchy (ePoA). The ePoA allows an exploration of equilibrium selection in dynamic distributed systems with multiple equilibria, based on local interactions of simple memoryless agents. Considering a fundamental game related to virus propagation on networks, we present analytical bounds on the ePoA in basic network topologies and for different strategy update dynamics. In particular, deriving stationary distributions of the stochastic evolutionary process, we find that the Nash equilibria are not always the most abundant states, and that different processes can feature significant off-equilibrium behavior, leading to a significantly higher ePoA compared to the PoA studied traditionally in the literature. AU - Schmid, Laura AU - Chatterjee, Krishnendu AU - Schmid, Stefan ID - 7346 T2 - Proceedings of the 23rd International Conference on Principles of Distributed Systems TI - The evolutionary price of anarchy: Locally bounded agents in a dynamic virus game VL - 153 ER - TY - CONF AB - A vector addition system with states (VASS) consists of a finite set of states and counters. A transition changes the current state to the next state, and every counter is either incremented, or decremented, or left unchanged. A state and value for each counter is a configuration; and a computation is an infinite sequence of configurations with transitions between successive configurations. A probabilistic VASS consists of a VASS along with a probability distribution over the transitions for each state. Qualitative properties such as state and configuration reachability have been widely studied for VASS. In this work we consider multi-dimensional long-run average objectives for VASS and probabilistic VASS. For a counter, the cost of a configuration is the value of the counter; and the long-run average value of a computation for the counter is the long-run average of the costs of the configurations in the computation. 
The multi-dimensional long-run average problem, given a VASS and a threshold value for each counter, asks whether there is a computation such that for each counter the long-run average value for the counter does not exceed the respective threshold. For probabilistic VASS, instead of the existence of a computation, we consider whether the expected long-run average value for each counter does not exceed the respective threshold. Our main results are as follows: we show that the multi-dimensional long-run average problem (a) is NP-complete for integer-valued VASS; (b) is undecidable for natural-valued VASS (i.e., nonnegative counters); and (c) can be solved in polynomial time for probabilistic integer-valued VASS, and probabilistic natural-valued VASS when all computations are non-terminating. AU - Chatterjee, Krishnendu AU - Henzinger, Thomas A AU - Otop, Jan ID - 8600 SN - 18688969 T2 - 31st International Conference on Concurrency Theory TI - Multi-dimensional long-run average problems for vector addition systems with states VL - 171 ER - TY - CONF AB - Game of Life is a simple and elegant model to study dynamical systems over networks. The model consists of a graph where every vertex has one of two types, namely, dead or alive. A configuration is a mapping of the vertices to the types. An update rule describes how the type of a vertex is updated given the types of its neighbors. In every round, all vertices are updated synchronously, which leads to a configuration update. While, in general, Game of Life allows a broad range of update rules, we focus on two simple families of update rules, namely, underpopulation and overpopulation, that model several interesting dynamics studied in the literature. In both settings, a dead vertex requires at least a desired number of live neighbors to become alive. For underpopulation (resp., overpopulation), a live vertex requires at least (resp. at most) a desired number of live neighbors to remain alive. We study the basic computation problems, e.g., configuration reachability, for these two families of rules. For underpopulation rules, we show that these problems can be solved in polynomial time, whereas for overpopulation rules they are PSPACE-complete. AU - Chatterjee, Krishnendu AU - Ibsen-Jensen, Rasmus AU - Jecker, Ismael R AU - Svoboda, Jakub ID - 8533 SN - 18688969 T2 - 45th International Symposium on Mathematical Foundations of Computer Science TI - Simplified game of life: Algorithms and complexity VL - 170 ER - TY - CONF AB - A regular language L of finite words is composite if there are regular languages L₁,L₂,…,L_t such that L = ⋂_{i = 1}^t L_i and the index (number of states in a minimal DFA) of every language L_i is strictly smaller than the index of L. Otherwise, L is prime. Primality of regular languages was introduced and studied in [O. Kupferman and J. Mosheiff, 2015], where the complexity of deciding the primality of the language of a given DFA was left open, with a doubly-exponential gap between the upper and lower bounds. We study primality for unary regular languages, namely regular languages with a singleton alphabet. A unary language corresponds to a subset of ℕ, making the study of unary prime languages closer to that of primality in number theory. We show that the setting of languages is richer. In particular, while every composite number is the product of two smaller numbers, the number t of languages necessary to decompose a composite unary language induces a strict hierarchy. 
In addition, a primality witness for a unary language L, namely a word that is not in L but is in all products of languages that contain L and have an index smaller than L’s, may be of exponential length. Still, we are able to characterize compositionality by structural properties of a DFA for L, leading to a LogSpace algorithm for primality checking of unary DFAs. AU - Jecker, Ismael R AU - Kupferman, Orna AU - Mazzocchi, Nicolas ID - 8534 SN - 18688969 T2 - 45th International Symposium on Mathematical Foundations of Computer Science TI - Unary prime languages VL - 170 ER - TY - CONF AB - Simple stochastic games are turn-based 2½-player games with a reachability objective. The basic question asks whether one player can ensure reaching a given target with at least a given probability. A natural extension is games with a conjunction of such conditions as objective. Despite a plethora of recent results on the analysis of systems with multiple objectives, the decidability of this basic problem remains open. In this paper, we present an algorithm approximating the Pareto frontier of the achievable values to a given precision. Moreover, it is an anytime algorithm, meaning it can be stopped at any time returning the current approximation and its error bound. AU - Ashok, Pranav AU - Chatterjee, Krishnendu AU - Kretinsky, Jan AU - Weininger, Maximilian AU - Winkler, Tobias ID - 7955 SN - 9781450371049 T2 - Proceedings of the 35th Annual ACM/IEEE Symposium on Logic in Computer Science TI - Approximating values of generalized-reachability stochastic games ER - TY - JOUR AB - Resources are rarely distributed uniformly within a population. Heterogeneity in the concentration of a drug, the quality of breeding sites, or wealth can all affect evolutionary dynamics. In this study, we represent a collection of properties affecting the fitness at a given location using a color. A green node is rich in resources while a red node is poorer. More colors can represent a broader spectrum of resource qualities. For a population evolving according to the birth-death Moran model, the first question we address is which structures, identified by graph connectivity and graph coloring, are evolutionarily equivalent. We prove that all properly two-colored, undirected, regular graphs are evolutionarily equivalent (where “properly colored” means that no two neighbors have the same color). We then compare the effects of background heterogeneity on properly two-colored graphs to those with alternative schemes in which the colors are permuted. Finally, we discuss dynamic coloring as a model for spatiotemporal resource fluctuations, and we illustrate that random dynamic colorings often diminish the effects of background heterogeneity relative to a proper two-coloring. AU - Kaveh, Kamran AU - McAvoy, Alex AU - Chatterjee, Krishnendu AU - Nowak, Martin A. ID - 8767 IS - 11 JF - PLOS Computational Biology KW - Ecology KW - Modelling and Simulation KW - Computational Theory and Mathematics KW - Genetics KW - Ecology KW - Evolution KW - Behavior and Systematics KW - Molecular Biology KW - Cellular and Molecular Neuroscience SN - 1553-734X TI - The Moran process on 2-chromatic graphs VL - 16 ER - TY - JOUR AB - Cooperation is a ubiquitous and beneficial behavioural trait despite being prone to exploitation by free-riders. Hence, cooperative populations are prone to invasions by selfish individuals. However, a population consisting of only free-riders typically does not survive. 
Thus, cooperators and free-riders often coexist in some proportion. An evolutionary version of the Snowdrift Game has proved effective for analysing this phenomenon. However, what if the system has already reached its stable state but was perturbed due to a change in environmental conditions? Then, individuals may have to re-learn their effective strategies. To address this, we consider behavioural mistakes in strategic choice execution, which we refer to as incompetence. Parametrising the propensity to make such mistakes allows for a mathematical description of learning. We compare strategies based on their relative strategic advantage relying on both fitness and learning factors. When strategies are learned at distinct rates, allowing learning according to a prescribed order is optimal. Interestingly, the strategy with the lowest strategic advantage should be learnt first if we are to optimise fitness over the learning path. Then, the differences between strategies are balanced out in order to minimise the effect of behavioural uncertainty. AU - Kleshnina, Maria AU - Streipert, Sabrina AU - Filar, Jerzy AU - Chatterjee, Krishnendu ID - 8789 IS - 11 JF - Mathematics TI - Prioritised learning in snowdrift-type games VL - 8 ER - TY - JOUR AB - We consider a real-time setting where an environment releases sequences of firm-deadline tasks, and an online scheduler chooses on-the-fly the ones to execute on a single processor so as to maximize cumulated utility. The competitive ratio is a well-known performance measure for the scheduler: it gives the worst-case ratio, among all possible choices for the environment, of the cumulated utility of the online scheduler versus an offline scheduler that knows these choices in advance. Traditionally, competitive analysis is performed by hand, while automated techniques are rare and only handle static environments with independent tasks. We present a quantitative-verification framework for precedence-aware competitive analysis, where task releases may depend on preceding scheduling choices, i.e., the environment can respond to scheduling decisions dynamically. We consider two general classes of precedences: 1) follower precedences force the release of a dependent task upon the completion of a set of precursor tasks, while 2) pairing precedences modify the characteristics of a dependent task provided the completion of a set of precursor tasks. Precedences make competitive analysis challenging, as the online and offline schedulers operate on diverging sequences. We make a formal presentation of our framework, and use a GPU-based implementation to analyze ten well-known schedulers on precedence-based application examples taken from the existing literature: 1) a handshake protocol (HP); 2) network packet-switching; 3) query scheduling (QS); and 4) a sporadic-interrupt setting. Our experimental results show that precedences and task parameters can drastically change which scheduler performs best. Our framework thus supports application designers in choosing the best scheduler among a given set automatically. AU - Pavlogiannis, Andreas AU - Schaumberger, Nico AU - Schmid, Ulrich AU - Chatterjee, Krishnendu ID - 8788 IS - 11 JF - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems SN - 02780070 TI - Precedence-aware automated competitive analysis of real-time scheduling VL - 39 ER - TY - JOUR AB - In this paper we introduce and study all-pay bidding games, a class of two-player, zero-sum games on graphs. The game proceeds as follows. 
We place a token on some vertex in the graph and assign budgets to the two players. Each turn, each player submits a sealed legal bid (non-negative and below their remaining budget), which is deducted from their budget, and the highest bidder moves the token onto an adjacent vertex. The game ends once a sink is reached, and Player 1 pays Player 2 the outcome that is associated with the sink. The players attempt to maximize their expected outcome. Our games model settings where effort (of no inherent value) needs to be invested in an ongoing and stateful manner. On the negative side, we show that even in simple games on DAGs, optimal strategies may require a distribution over bids with infinite support. A central quantity in bidding games is the ratio of the players' budgets. On the positive side, we present a simple FPTAS for DAGs that, for each budget ratio, outputs an approximation of the optimal strategy for that ratio. We also implement it, show that it performs well, and find that it suggests interesting properties of these games. Then, given an outcome c, we present an algorithm for finding the necessary and sufficient initial ratio for guaranteeing outcome c with probability 1, together with a strategy that ensures it. Finally, while the general case has not previously been studied, solving the specific game in which Player 1 wins iff he wins the first two auctions has long been stated as an open question, which we solve. AU - Avni, Guy AU - Ibsen-Jensen, Rasmus AU - Tkadlec, Josef ID - 9197 IS - 02 JF - Proceedings of the AAAI Conference on Artificial Intelligence SN - 2159-5399 TI - All-pay bidding games on graphs VL - 34 ER - TY - JOUR AB - Coinfections with multiple pathogens can result in complex within‐host dynamics affecting virulence and transmission. While multiple infections are intensively studied in solitary hosts, it is so far unresolved how social host interactions interfere with pathogen competition, and if this depends on coinfection diversity. We studied how the collective disease defences of ants – their social immunity – influence pathogen competition in coinfections of same or different fungal pathogen species. Social immunity reduced virulence for all pathogen combinations, but interfered with spore production only in different‐species coinfections. Here, it decreased overall pathogen sporulation success while increasing co‐sporulation on individual cadavers and maintaining a higher pathogen diversity at the community level. Mathematical modelling revealed that host sanitary care alone can modulate competitive outcomes between pathogens, giving advantage to fast‐germinating, thus less grooming‐sensitive ones. Host social interactions can hence modulate infection dynamics in coinfected group members, thereby altering pathogen communities at the host level and population level. AU - Milutinovic, Barbara AU - Stock, Miriam AU - Grasse, Anna V AU - Naderlinger, Elisabeth AU - Hilbe, Christian AU - Cremer, Sylvia ID - 7343 IS - 3 JF - Ecology Letters SN - 1461-023X TI - Social immunity modulates competition between coinfecting pathogens VL - 23 ER - TY - GEN AB - Coinfections with multiple pathogens can result in complex within-host dynamics affecting virulence and transmission. Whilst multiple infections are intensively studied in solitary hosts, it is so far unresolved how social host interactions interfere with pathogen competition, and if this depends on coinfection diversity. 
We studied how the collective disease defenses of ants – their social immunity – influence pathogen competition in coinfections of same or different fungal pathogen species. Social immunity reduced virulence for all pathogen combinations, but interfered with spore production only in different-species coinfections. Here, it decreased overall pathogen sporulation success, whilst simultaneously increasing co-sporulation on individual cadavers and maintaining a higher pathogen diversity at the community level. Mathematical modeling revealed that host sanitary care alone can modulate competitive outcomes between pathogens, giving advantage to fast-germinating, thus less grooming-sensitive ones. Host social interactions can hence modulate infection dynamics in coinfected group members, thereby altering pathogen communities at the host and population level. AU - Milutinovic, Barbara AU - Stock, Miriam AU - Grasse, Anna V AU - Naderlinger, Elisabeth AU - Hilbe, Christian AU - Cremer, Sylvia ID - 13060 TI - Social immunity modulates competition between coinfecting pathogens ER - TY - CONF AB - Multiple-environment Markov decision processes (MEMDPs) are MDPs equipped with not one, but multiple probabilistic transition functions, which represent the various possible unknown environments. While previous research on MEMDPs focused on theoretical properties for long-run average payoff, we study them with discounted-sum payoff and focus on their practical advantages and applications. MEMDPs can be viewed as a special case of Partially observable and Mixed observability MDPs: the state of the system is perfectly observable, but not the environment. We show that the specific structure of MEMDPs allows for more efficient algorithmic analysis, in particular for faster belief updates. We demonstrate the applicability of MEMDPs in several domains. In particular, we formalize the sequential decision-making approach to contextual recommendation systems as MEMDPs and substantially improve over the previous MDP approach. AU - Chatterjee, Krishnendu AU - Chmelik, Martin AU - Karkhanis, Deep AU - Novotný, Petr AU - Royer, Amélie ID - 8193 SN - 23340835 T2 - Proceedings of the 30th International Conference on Automated Planning and Scheduling TI - Multiple-environment Markov decision processes: Efficient analysis and applications VL - 30 ER - TY - CONF AB - We study turn-based stochastic zero-sum games with lexicographic preferences over reachability and safety objectives. Stochastic games are standard models in control, verification, and synthesis of stochastic reactive systems that exhibit both randomness as well as angelic and demonic non-determinism. Lexicographic order allows us to consider multiple objectives with a strict preference order over the satisfaction of the objectives. To the best of our knowledge, stochastic games with lexicographic objectives have not been studied before. We establish determinacy of such games and present strategy and computational complexity results. For strategy complexity, we show that lexicographically optimal strategies exist that are deterministic, and memory is only required to remember the already satisfied and violated objectives. For a constant number of objectives, we show that the relevant decision problem is in NP∩coNP, matching the current known bound for single objectives; and in general the decision problem is PSPACE-hard and can be solved in NEXPTIME∩coNEXPTIME. 
We present an algorithm that computes the lexicographically optimal strategies via a reduction to computation of optimal strategies in a sequence of single-objective games. We have implemented our algorithm and report experimental results on various case studies. AU - Chatterjee, Krishnendu AU - Katoen, Joost P AU - Weininger, Maximilian AU - Winkler, Tobias ID - 8272 SN - 03029743 T2 - International Conference on Computer Aided Verification TI - Stochastic games with lexicographic reachability-safety objectives VL - 12225 ER - TY - JOUR AB - We study relations between evidence theory and S-approximation spaces. Both theories have their roots in the analysis of Dempster's multivalued mappings and lower and upper probabilities, and have close relations to rough sets. We show that an S-approximation space, satisfying a monotonicity condition, can induce a natural belief structure, which is a fundamental block in evidence theory. We also demonstrate that one can induce a natural belief structure on one set, given a belief structure on another set, if the two sets are related by a partial monotone S-approximation space. AU - Shakiba, A. AU - Goharshady, Amir Kafshdar AU - Hooshmandasl, M.R. AU - Alambardar Meybodi, M. ID - 8671 IS - 2 JF - Iranian Journal of Mathematical Sciences and Informatics SN - 1735-4463 TI - A note on belief structures and s-approximation spaces VL - 15 ER - TY - JOUR AB - The fixation probability of a single mutant invading a population of residents is among the most widely-studied quantities in evolutionary dynamics. Amplifiers of natural selection are population structures that increase the fixation probability of advantageous mutants, compared to well-mixed populations. Extensive studies have shown that many amplifiers exist for the Birth-death Moran process, some of them substantially increasing the fixation probability or even guaranteeing fixation in the limit of large population size. On the other hand, no amplifiers are known for the death-Birth Moran process, and computer-assisted exhaustive searches have failed to discover amplification. In this work we resolve this disparity by showing that any amplification under death-Birth updating is necessarily bounded and transient. Our boundedness result states that even if a population structure does amplify selection, the resulting fixation probability is close to that of the well-mixed population. Our transience result states that for any population structure there exists a threshold r⋆ such that the population structure ceases to amplify selection if the mutant fitness advantage r is larger than r⋆. Finally, we also extend the above results to δ-death-Birth updating, which is a combination of Birth-death and death-Birth updating. On the positive side, we identify population structures that maintain amplification for a wide range of values of r and δ. These results demonstrate that amplification of natural selection depends on the specific mechanisms of the evolutionary process. AU - Tkadlec, Josef AU - Pavlogiannis, Andreas AU - Chatterjee, Krishnendu AU - Nowak, Martin A. ID - 7212 JF - PLoS computational biology TI - Limits on amplifiers of natural selection under death-Birth updating VL - 16 ER - TY - THES AB - In this thesis we study certain mathematical aspects of evolution. The two primary forces that drive an evolutionary process are mutation and selection. Mutation generates new variants in a population. Selection chooses among the variants depending on the reproductive rates of individuals. 
Evolutionary processes are intrinsically random – a new mutation that is initially present in the population at low frequency can go extinct, even if it confers a reproductive advantage. The overall rate of evolution is largely determined by two quantities: the probability that an invading advantageous mutation spreads through the population (called fixation probability) and the time until it does so (called fixation time). Both those quantities crucially depend not only on the strength of the invading mutation but also on the population structure. In this thesis, we aim to understand how the underlying population structure affects the overall rate of evolution. Specifically, we study population structures that increase the fixation probability of advantageous mutants (called amplifiers of selection). Broadly speaking, our results are of three different types: We present various strong amplifiers, we identify regimes under which only limited amplification is feasible, and we propose population structures that provide different tradeoffs between high fixation probability and short fixation time. AU - Tkadlec, Josef ID - 7196 TI - A role of graphs in evolutionary processes ER - TY - GEN AB - Data and mathematica notebooks for plotting figures from Language learning with communication between learners AU - Ibsen-Jensen, Rasmus AU - Tkadlec, Josef AU - Chatterjee, Krishnendu AU - Nowak, Martin ID - 9814 TI - Data and mathematica notebooks for plotting figures from language acquisition with communication between learners ER - TY - CONF AB - The notion of program sensitivity (aka Lipschitz continuity) specifies that changes in the program input result in proportional changes to the program output. For probabilistic programs the notion is naturally extended to expected sensitivity. A previous approach develops a relational program logic framework for proving expected sensitivity of probabilistic while loops, where the number of iterations is fixed and bounded. In this work, we consider probabilistic while loops where the number of iterations is not fixed, but randomized and depends on the initial input values. We present a sound approach for proving expected sensitivity of such programs. Our sound approach is martingale-based and can be automated through existing martingale-synthesis algorithms. Furthermore, our approach is compositional for sequential composition of while loops under a mild side condition. We demonstrate the effectiveness of our approach on several classical examples from Gambler's Ruin, stochastic hybrid systems and stochastic gradient descent. We also present experimental results showing that our automated approach can handle various probabilistic programs in the literature. AU - Wang, Peixin AU - Fu, Hongfei AU - Chatterjee, Krishnendu AU - Deng, Yuxin AU - Xu, Ming ID - 8324 IS - POPL T2 - Proceedings of the ACM on Programming Languages TI - Proving expected sensitivity of probabilistic programs with randomized variable-dependent termination time VL - 4 ER - TY - JOUR AB - Markov decision processes (MDPs) are the de facto framework for sequential decision making in the presence of stochastic uncertainty. A classical optimization criterion for MDPs is to maximize the expected discounted-sum payoff, which ignores low-probability catastrophic events with highly negative impact on the system. 
On the other hand, risk-averse policies require the probability of undesirable events to be below a given threshold, but they do not account for optimization of the expected payoff. We consider MDPs with discounted-sum payoff with failure states, which represent catastrophic outcomes. The objective of risk-constrained planning is to maximize the expected discounted-sum payoff among risk-averse policies that ensure the probability of encountering a failure state is below a desired threshold. Our main contribution is an efficient risk-constrained planning algorithm that combines UCT-like search with a predictor learned through interaction with the MDP (in the style of AlphaZero) and with risk-constrained action selection via linear programming. We demonstrate the effectiveness of our approach with experiments on classical MDPs from the literature, including benchmarks with on the order of 10^6 states. AU - Brázdil, Tomáš AU - Chatterjee, Krishnendu AU - Novotný, Petr AU - Vahala, Jiří ID - 15055 IS - 06 JF - Proceedings of the 34th AAAI Conference on Artificial Intelligence KW - General Medicine SN - 2374-3468 TI - Reinforcement learning of risk-constrained policies in Markov decision processes VL - 34 ER -