A Greedy Gradient Q-learning Approach for Constructing Optimal Policies in Infinite Time Horizon Settings
Ashkan Ertefaie, Video: A Greedy Gradient Q-learning Approach for Constructing Optimal Policies in Infinite Time Horizon Settings
Ashkan Ertefaie, A Greedy Gradient Q-learning Approach for Constructing Optimal Policies in Infinite Time Horizon Settings, Workshop on the Interface of Machine Learning and Statistical Inference, BIRS talk, 18w5054