Finite-memory strategies in POMDPs with long-run average objectives
Chatterjee K, Saona Urmeneta RJ, Ziliotto B. 2022. Finite-memory strategies in POMDPs with long-run average objectives. Mathematics of Operations Research. 47(1), 100–119.
Preprint: https://arxiv.org/abs/1904.13360
Journal Article | Published | English
Scopus indexed
Abstract
Partially observable Markov decision processes (POMDPs) are standard models for dynamic systems with probabilistic and nondeterministic behaviour in uncertain environments. We prove that in POMDPs with a long-run average objective, the decision maker has approximately optimal strategies with finite memory. Notably, this implies that approximating the long-run value is recursively enumerable, and that the value satisfies a weak continuity property with respect to the transition function.
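As an illustration of the notion used in the abstract, a finite-memory strategy can be viewed as a finite automaton that reads observations and outputs actions. The sketch below is not taken from the paper. It assumes a hypothetical two-state toy POMDP, and it approximates the long-run average by the expected average reward over a finite horizon. All names (`average_reward`, `trans`, `obs`, `act`, `upd`) are illustrative assumptions.

```python
from collections import defaultdict


def average_reward(trans, obs, reward, init, act, upd, m0, horizon):
    """Expected average reward over `horizon` steps under a finite-memory strategy.

    trans[s][a]  : dict next_state -> probability
    obs[s_next]  : dict observation -> probability (emitted by the next state)
    reward[s][a] : stage reward
    init         : dict state -> initial probability
    act[m]       : action played in memory state m
    upd[m][o]    : next memory state after observing o
    m0           : initial memory state
    """
    # Distribution over pairs (hidden state, memory state).
    dist = {(s, m0): p for s, p in init.items()}
    total = 0.0
    for _ in range(horizon):
        nxt = defaultdict(float)
        for (s, m), p in dist.items():
            a = act[m]
            total += p * reward[s][a]
            for s2, pt in trans[s][a].items():
                for o, po in obs[s2].items():
                    nxt[(s2, upd[m][o])] += p * pt * po
        dist = dict(nxt)
    return total / horizon


# Hypothetical toy instance: the hidden state never changes, the observation
# reveals it, and the reward is 1 when the action matches the state.
trans = {s: {a: {s: 1.0} for a in (0, 1)} for s in (0, 1)}
obs = {0: {0: 1.0}, 1: {1: 1.0}}
reward = {s: {a: 1.0 if a == s else 0.0 for a in (0, 1)} for s in (0, 1)}
init = {0: 0.5, 1: 0.5}

# One memory state: always play action 0.
blind = average_reward(trans, obs, reward, init,
                       act={0: 0}, upd={0: {0: 0, 1: 0}}, m0=0, horizon=1000)

# Two memory states: remember the last observation and play it.
tracking = average_reward(trans, obs, reward, init,
                          act={0: 0, 1: 1},
                          upd={0: {0: 0, 1: 1}, 1: {0: 0, 1: 1}},
                          m0=0, horizon=1000)
```

In this toy instance the one-state strategy earns an average of 0.5, while the two-state strategy loses reward only in the first step and approaches 1 as the horizon grows. The finite horizon is a simplification; the objective studied in the article is defined as a limit of such averages.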
Date Published
2022-02-01
Journal Title
Mathematics of Operations Research
Publisher
Institute for Operations Research and the Management Sciences
Acknowledgement
Partially supported by Austrian Science Fund (FWF) NFN Grant No RiSE/SHiNE S11407, by CONICYT Chile through grant PII 20150140, and by ECOS-CONICYT through grant C15E03.
Volume
47
Issue
1
Page
100-119
Cite this
Chatterjee K, Saona Urmeneta RJ, Ziliotto B. Finite-memory strategies in POMDPs with long-run average objectives. Mathematics of Operations Research. 2022;47(1):100-119. doi:10.1287/moor.2020.1116
Chatterjee, K., Saona Urmeneta, R. J., & Ziliotto, B. (2022). Finite-memory strategies in POMDPs with long-run average objectives. Mathematics of Operations Research. Institute for Operations Research and the Management Sciences. https://doi.org/10.1287/moor.2020.1116
Chatterjee, Krishnendu, Raimundo J Saona Urmeneta, and Bruno Ziliotto. “Finite-Memory Strategies in POMDPs with Long-Run Average Objectives.” Mathematics of Operations Research. Institute for Operations Research and the Management Sciences, 2022. https://doi.org/10.1287/moor.2020.1116.
K. Chatterjee, R. J. Saona Urmeneta, and B. Ziliotto, “Finite-memory strategies in POMDPs with long-run average objectives,” Mathematics of Operations Research, vol. 47, no. 1. Institute for Operations Research and the Management Sciences, pp. 100–119, 2022.
Chatterjee, Krishnendu, et al. “Finite-Memory Strategies in POMDPs with Long-Run Average Objectives.” Mathematics of Operations Research, vol. 47, no. 1, Institute for Operations Research and the Management Sciences, 2022, pp. 100–19, doi:10.1287/moor.2020.1116.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Link(s) to Main File(s)
Access Level
Open Access
Sources
arXiv 1904.13360