{"status":"public","conference":{"location":"Sydney, Australia","start_date":"2014-11-03","name":"ALENEX: Algorithm Engineering and Experiments","end_date":"2014-11-07"},"author":[{"full_name":"Brázdil, Tomáš","last_name":"Brázdil","first_name":"Tomáš"},{"id":"2E5DCA20-F248-11E8-B48F-1D18A9856A87","full_name":"Chatterjee, Krishnendu","last_name":"Chatterjee","orcid":"0000-0002-4561-241X","first_name":"Krishnendu"},{"id":"3624234E-F248-11E8-B48F-1D18A9856A87","full_name":"Chmelik, Martin","last_name":"Chmelik","first_name":"Martin"},{"full_name":"Forejt, Vojtěch","first_name":"Vojtěch","last_name":"Forejt"},{"full_name":"Kretinsky, Jan","id":"44CEF464-F248-11E8-B48F-1D18A9856A87","orcid":"0000-0002-8122-2881","first_name":"Jan","last_name":"Kretinsky"},{"last_name":"Kwiatkowska","first_name":"Marta","full_name":"Kwiatkowska, Marta"},{"full_name":"Parker, David","first_name":"David","last_name":"Parker"},{"first_name":"Mateusz","last_name":"Ujma","full_name":"Ujma, Mateusz"}],"alternative_title":["LNCS"],"oa_version":"Submitted Version","date_updated":"2021-01-12T06:54:49Z","publist_id":"5046","page":"98 - 114","doi":"10.1007/978-3-319-11936-6_8","type":"conference","acknowledgement":"This research was funded in part by the European Research Council (ERC) under grant agreement 246967 (VERIWARE), by the EU FP7 project HIERATIC, by the Czech Science Foundation grant No P202/12/P612, by EPSRC project EP/K038575/1.","quality_controlled":"1","title":"Verification of markov decision processes using learning algorithms","main_file_link":[{"url":"http://arxiv.org/abs/1402.2967","open_access":"1"}],"project":[{"name":"Quantitative Reactive Modeling","call_identifier":"FP7","_id":"25EE3708-B435-11E9-9278-68D0E5697425","grant_number":"267989"},{"grant_number":"24696","_id":"26241A12-B435-11E9-9278-68D0E5697425","name":"LIGHT-REGULATED LIGAND TRAPS FOR SPATIO-TEMPORAL INHIBITION OF CELL 
SIGNALING"},{"grant_number":"279307","_id":"2581B60A-B435-11E9-9278-68D0E5697425","name":"Quantitative Graph Games: Theory and Applications","call_identifier":"FP7"},{"_id":"25F5A88A-B435-11E9-9278-68D0E5697425","grant_number":"S11402-N23","call_identifier":"FWF","name":"Moderne Concurrency Paradigms"},{"call_identifier":"FWF","name":"Game Theory","_id":"25863FF4-B435-11E9-9278-68D0E5697425","grant_number":"S11407"},{"name":"Modern Graph Algorithmic Techniques in Formal Verification","call_identifier":"FWF","grant_number":"P 23499-N23","_id":"2584A770-B435-11E9-9278-68D0E5697425"},{"_id":"2587B514-B435-11E9-9278-68D0E5697425","name":"Microsoft Research Faculty Fellowship"}],"intvolume":" 8837","date_published":"2014-11-01T00:00:00Z","citation":{"ieee":"T. Brázdil et al., “Verification of markov decision processes using learning algorithms,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Sydney, Australia, 2014, vol. 8837, pp. 98–114.","ama":"Brázdil T, Chatterjee K, Chmelik M, et al. Verification of markov decision processes using learning algorithms. In: Cassez F, Raskin J-F, eds. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol 8837. Society of Industrial and Applied Mathematics; 2014:98-114. doi:10.1007/978-3-319-11936-6_8","short":"T. Brázdil, K. Chatterjee, M. Chmelik, V. Forejt, J. Kretinsky, M. Kwiatkowska, D. Parker, M. Ujma, in:, F. Cassez, J.-F. Raskin (Eds.), Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Society of Industrial and Applied Mathematics, 2014, pp. 98–114.","mla":"Brázdil, Tomáš, et al. 
“Verification of Markov Decision Processes Using Learning Algorithms.” Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), edited by Franck Cassez and Jean-François Raskin, vol. 8837, Society of Industrial and Applied Mathematics, 2014, pp. 98–114, doi:10.1007/978-3-319-11936-6_8.","apa":"Brázdil, T., Chatterjee, K., Chmelik, M., Forejt, V., Kretinsky, J., Kwiatkowska, M., … Ujma, M. (2014). Verification of markov decision processes using learning algorithms. In F. Cassez & J.-F. Raskin (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8837, pp. 98–114). Sydney, Australia: Society of Industrial and Applied Mathematics. https://doi.org/10.1007/978-3-319-11936-6_8","ista":"Brázdil T, Chatterjee K, Chmelik M, Forejt V, Kretinsky J, Kwiatkowska M, Parker D, Ujma M. 2014. Verification of markov decision processes using learning algorithms. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). ALENEX: Algorithm Engineering and Experiments, LNCS, vol. 8837, 98–114.","chicago":"Brázdil, Tomáš, Krishnendu Chatterjee, Martin Chmelik, Vojtěch Forejt, Jan Kretinsky, Marta Kwiatkowska, David Parker, and Mateusz Ujma. “Verification of Markov Decision Processes Using Learning Algorithms.” In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), edited by Franck Cassez and Jean-François Raskin, 8837:98–114. Society of Industrial and Applied Mathematics, 2014. 
https://doi.org/10.1007/978-3-319-11936-6_8."},"ec_funded":1,"month":"11","year":"2014","editor":[{"first_name":"Franck","last_name":"Cassez","full_name":"Cassez, Franck"},{"last_name":"Raskin","first_name":"Jean-François","full_name":"Raskin, Jean-François"}],"date_created":"2018-12-11T11:55:17Z","volume":8837,"abstract":[{"text":"We present a general framework for applying machine-learning algorithms to the verification of Markov decision processes (MDPs). The primary goal of these techniques is to improve performance by avoiding an exhaustive exploration of the state space. Our framework focuses on probabilistic reachability, which is a core property for verification, and is illustrated through two distinct instantiations. The first assumes that full knowledge of the MDP is available, and performs a heuristic-driven partial exploration of the model, yielding precise lower and upper bounds on the required probability. The second tackles the case where we may only sample the MDP, and yields probabilistic guarantees, again in terms of both the lower and upper bounds, which provides efficient stopping criteria for the approximation. The latter is the first extension of statistical model checking for unbounded properties in MDPs. In contrast with other related techniques, our approach is not restricted to time-bounded (finite-horizon) or discounted properties, nor does it assume any particular properties of the MDP. We also show how our methods extend to LTL objectives. We present experimental results showing the performance of our framework on several examples.","lang":"eng"}],"oa":1,"_id":"2027","publication_status":"published","language":[{"iso":"eng"}],"publication":" Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)","user_id":"4435EBFC-F248-11E8-B48F-1D18A9856A87","day":"01","publisher":"Society of Industrial and Applied Mathematics","department":[{"_id":"KrCh"},{"_id":"ToHe"}]}