Reinforcement learning from reachability specifications: PAC guarantees with expected conditional distance
Svoboda J, Bansal S, Chatterjee K. 2024. Reinforcement learning from reachability specifications: PAC guarantees with expected conditional distance. 41st International Conference on Machine Learning. ICML: International Conference on Machine Learning, PMLR, vol. 235, 47331–47344.
Conference Paper | Published | English
Scopus indexed
Author
Jakub Svoboda; Suguman Bansal; Krishnendu Chatterjee
Corresponding author has ISTA affiliation
Department
Series Title
PMLR
Abstract
Reinforcement Learning (RL) from temporal logical specifications is a fundamental problem in sequential decision making. One of the most basic such specifications is the reachability specification, which requires that a target set eventually be visited. Despite strong empirical results for RL from such specifications, the theoretical guarantees are bleak, including the impossibility of Probably Approximately Correct (PAC) guarantees for reachability specifications. Given this impossibility result, in this work we consider the problem of RL from reachability specifications together with information about the expected conditional distance (ECD). We present (a) lower bound results that establish the necessity of ECD information for PAC guarantees and (b) an algorithm that establishes PAC guarantees given the ECD information. To the best of our knowledge, this is the first approach to RL from reachability specifications that learns policies without making any assumptions on the underlying environment.
Publishing Year
2024
Date Published
2024-07-29
Proceedings Title
41st International Conference on Machine Learning
Publisher
ML Research Press
Volume
235
Page
47331-47344
Conference
ICML: International Conference on Machine Learning
Conference Location
Vienna, Austria
Conference Date
2024-07-21 – 2024-07-27
IST-REx-ID
Cite this
Svoboda J, Bansal S, Chatterjee K. Reinforcement learning from reachability specifications: PAC guarantees with expected conditional distance. In: 41st International Conference on Machine Learning. Vol 235. ML Research Press; 2024:47331-47344.
Svoboda, J., Bansal, S., & Chatterjee, K. (2024). Reinforcement learning from reachability specifications: PAC guarantees with expected conditional distance. In 41st International Conference on Machine Learning (Vol. 235, pp. 47331–47344). Vienna, Austria: ML Research Press.
Svoboda, Jakub, Suguman Bansal, and Krishnendu Chatterjee. “Reinforcement Learning from Reachability Specifications: PAC Guarantees with Expected Conditional Distance.” In 41st International Conference on Machine Learning, 235:47331–44. ML Research Press, 2024.
J. Svoboda, S. Bansal, and K. Chatterjee, “Reinforcement learning from reachability specifications: PAC guarantees with expected conditional distance,” in 41st International Conference on Machine Learning, Vienna, Austria, 2024, vol. 235, pp. 47331–47344.
Svoboda J, Bansal S, Chatterjee K. 2024. Reinforcement learning from reachability specifications: PAC guarantees with expected conditional distance. 41st International Conference on Machine Learning. ICML: International Conference on Machine Learning, PMLR, vol. 235, 47331–47344.
Svoboda, Jakub, et al. “Reinforcement Learning from Reachability Specifications: PAC Guarantees with Expected Conditional Distance.” 41st International Conference on Machine Learning, vol. 235, ML Research Press, 2024, pp. 47331–44.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]