_id,doi,title
645,10.1007/978-3-319-63387-9_10,Value iteration for long run average reward in markov decision processes