Lower bound on Howard policy iteration for deterministic Markov Decision Processes

Asadi A, Chatterjee K, De Raaij J. 2025. Lower bound on Howard policy iteration for deterministic Markov Decision Processes. The 41st Conference on Uncertainty in Artificial Intelligence. UAI: Conference on Uncertainty in Artificial Intelligence, PMLR, vol. 286, 223–232.

Download
OA 2025_UAI_Asadi.pdf 317.10 KB [Published Version]
Conference Paper | Published | English

Scopus indexed
Author
Ali Asadi, Krishnendu Chatterjee, Jakob De Raaij

Corresponding author has ISTA affiliation

Series Title
PMLR
Abstract
Deterministic Markov Decision Processes (DMDPs) are a mathematical framework for decision-making where the outcomes and future possible actions are deterministically determined by the current action taken. A DMDP can be viewed as a finite directed weighted graph where, in each step, the controller chooses an outgoing edge. An objective is a measurable function on runs (or infinite trajectories) of the DMDP, and the value for an objective is the maximal cumulative reward (or weight) that the controller can guarantee. We consider the classical mean-payoff (aka limit-average) objective, which is a basic and fundamental objective. Howard's policy iteration algorithm is a popular method for solving DMDPs with mean-payoff objectives. Although Howard's algorithm performs well in practice, as experimental studies suggest, the best known upper bound is exponential, and the currently known lower bound is as follows: for input size $I$, the algorithm requires $\widetilde{\Omega}(\sqrt{I})$ iterations, where $\widetilde{\Omega}$ hides poly-logarithmic factors, i.e., the current lower bound on iterations is sub-linear with respect to the input size. Our main result is an improved lower bound for this fundamental algorithm: we show that for input size $I$, the algorithm requires $\widetilde{\Omega}(I)$ iterations.
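To make the algorithm in the abstract concrete: under a fixed policy, a run of a DMDP eventually enters a cycle, and its mean payoff $\liminf_{n\to\infty}\frac{1}{n}\sum_{i=0}^{n-1} w(e_i)$ equals that cycle's average weight. Below is a minimal, illustrative Python sketch of classical Howard policy iteration for mean-payoff DMDPs. It is not the paper's lower-bound construction; the graph encoding (`edges[v]` as a list of `(successor, weight)` pairs), the function names, and the tie-breaking are assumptions of this example.

```python
# Illustrative sketch of Howard policy iteration for mean-payoff DMDPs.
# A policy fixes one outgoing edge per vertex, so the chosen subgraph is
# functional: every run ends in a cycle whose average weight is its gain.

EPS = 1e-9  # tolerance for floating-point comparisons


def evaluate(policy, wt):
    """Gain g[v] (average weight of the cycle reached from v) and bias
    b[v] satisfying b[v] = wt[v] - g[v] + b[policy[v]], with one bias
    per cycle anchored at 0; wt[v] is the weight of v's chosen edge."""
    g, b = {}, {}
    for s in policy:
        if s in g:
            continue
        path, index, v = [], {}, s
        while v not in g and v not in index:
            index[v] = len(path)
            path.append(v)
            v = policy[v]
        if v not in g:                       # the walk closed a new cycle
            cyc = path[index[v]:]
            gain = sum(wt[u] for u in cyc) / len(cyc)
            g[cyc[0]], b[cyc[0]] = gain, 0.0
            for u in reversed(cyc[1:]):      # propagate bias backwards
                g[u], b[u] = gain, wt[u] - gain + b[policy[u]]
            path = path[:index[v]]           # tree prefix feeding the cycle
        for u in reversed(path):
            g[u] = g[policy[u]]
            b[u] = wt[u] - g[u] + b[policy[u]]
    return g, b


def howard(edges):
    """Howard policy iteration on a DMDP given as adjacency lists;
    returns (gain per vertex, number of iterations).  Assumes at most
    one edge between any ordered pair of vertices."""
    policy = {v: succ[0][0] for v, succ in edges.items()}  # arbitrary start
    weight = {(v, u): w for v, succ in edges.items() for u, w in succ}
    iterations = 0
    while True:
        iterations += 1
        wt = {v: weight[(v, policy[v])] for v in policy}
        g, b = evaluate(policy, wt)
        improved = False
        for v, succ in edges.items():
            best = policy[v]
            for u, w in succ:
                # lexicographic improvement: raise gain first, then bias
                if g[u] > g[best] + EPS or (
                    abs(g[u] - g[best]) <= EPS
                    and w + b[u] > weight[(v, best)] + b[best] + EPS
                ):
                    best = u
            if best != policy[v]:
                policy[v], improved = best, True
        if not improved:
            return g, iterations


if __name__ == "__main__":
    # Tiny example: the best cycle is the weight-2 self-loop at vertex 1.
    edges = {0: [(1, 0), (0, 1)], 1: [(0, 0), (1, 2)]}
    print(howard(edges))  # gains {0: 2.0, 1: 2.0} after a few iterations
```

The improvement step is the standard lexicographic rule for mean payoff: a switch must first increase the gain of the reached cycle and, among equal gains, increase the bias. The paper's lower bound concerns how many such improvement rounds the algorithm can be forced to perform.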
Publishing Year
2025
Date Published
2025-01-01
Proceedings Title
The 41st Conference on Uncertainty in Artificial Intelligence
Publisher
ML Research Press
Acknowledgement
This research was partially supported by the ERC CoG 863818 (ForM-SMArt) grant and Austrian Science Fund (FWF) 10.55776/COE12.
Volume
286
Page
223-232
Conference
UAI: Conference on Uncertainty in Artificial Intelligence
Conference Location
Rio de Janeiro, Brazil
Conference Date
2025-07-21 – 2025-07-25
eISSN
2640-3498
IST-REx-ID

Cite this

Asadi A, Chatterjee K, De Raaij J. Lower bound on Howard policy iteration for deterministic Markov Decision Processes. In: The 41st Conference on Uncertainty in Artificial Intelligence. Vol 286. ML Research Press; 2025:223-232.
Asadi, A., Chatterjee, K., & De Raaij, J. (2025). Lower bound on Howard policy iteration for deterministic Markov Decision Processes. In The 41st Conference on Uncertainty in Artificial Intelligence (Vol. 286, pp. 223–232). Rio de Janeiro, Brazil: ML Research Press.
Asadi, Ali, Krishnendu Chatterjee, and Jakob De Raaij. “Lower Bound on Howard Policy Iteration for Deterministic Markov Decision Processes.” In The 41st Conference on Uncertainty in Artificial Intelligence, 286:223–32. ML Research Press, 2025.
A. Asadi, K. Chatterjee, and J. De Raaij, “Lower bound on Howard policy iteration for deterministic Markov Decision Processes,” in The 41st Conference on Uncertainty in Artificial Intelligence, Rio de Janeiro, Brazil, 2025, vol. 286, pp. 223–232.
Asadi A, Chatterjee K, De Raaij J. 2025. Lower bound on Howard policy iteration for deterministic Markov Decision Processes. The 41st Conference on Uncertainty in Artificial Intelligence. UAI: Conference on Uncertainty in Artificial Intelligence, PMLR, vol. 286, 223–232.
Asadi, Ali, et al. “Lower Bound on Howard Policy Iteration for Deterministic Markov Decision Processes.” The 41st Conference on Uncertainty in Artificial Intelligence, vol. 286, ML Research Press, 2025, pp. 223–32.
All files available under the following license(s):
Creative Commons Attribution 4.0 International Public License (CC-BY 4.0):
Main File(s)
File Name: 2025_UAI_Asadi.pdf
Access Level: OA Open Access
Date Uploaded: 2025-09-09
MD5 Checksum: 4180c81bb6ed3b4f5c7a8e48d06520c6



Sources

arXiv 2506.12254
