Selected Papers on Theory of RL

Click here to return to the main page on reinforcement learning Satinder Singh.

Abstraction Selection in Model-Based Reinforcement Learning.
by Nan Jiang, Alex Kulesza, and Satinder Singh.
In 32nd International Conference on Machine Learning (ICML), 2015.
pdf.
The Dependence of Effective Planning Horizon on Model Accuracy.
by Nan Jiang, Alex Kulesza, Satinder Singh, and Richard Lewis.
In International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), 2015.
Best Paper Award
pdf.
Low-Rank Spectral Learning with Weighted Loss Functions.
by Alex Kulesza, Nan Jiang, and Satinder Singh.
In Eighteenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2015.
pdf.
Low-Rank Spectral Learning.
by Alex Kulesza, Nadakuditi Raj Rao, and Satinder Singh.
In Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2014.
pdf.
Characterizing EVOI-Sufficient k-Response-Query Sets in Decision Problems.
by Robert Cohn, Satinder Singh, and Edmund Durfee.
In Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2014.
pdf.
Dynamic Incentive Mechanisms.
by David C. Parkes, Ruggiero Cavallo, Florin Constantin and Satinder Singh.
In AI Magazine, Vol. 31, No. 4, pages 79-94, 2010.
pdf.
An Experts Algorithm for Transfer Learning.
by Erik Talvitie and Satinder Singh.
In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), pages 1065-1070, 2007.
pdf.
Predictive State Representations: A New Theory for Modeling Dynamical Systems by Satinder Singh, Michael R. James and Matthew R. Rudary. In Uncertainty in Artificial Intelligence: Proceedings of the Twentieth Conference (UAI), pages 512-519, 2004.
pdf.
A Nonlinear Predictive State Representation by Matthew Rudary and Satinder Singh. In Advances in Neural Information Processing Systems 16 (NIPS), pages 855-862, 2004.
pdf.
Near-Optimal Reinforcement Learning in Polynomial Time by Michael Kearns and Satinder Singh. In Machine Learning journal, Volume 49, Issue 2, pages 209-232, 2002.
(A shorter version appears in ICML 1998.)
gzipped postscript pdf.
Predictive Representations of State by Michael Littman, Richard Sutton and Satinder Singh. In Advances in Neural Information Processing Systems 14 (NIPS), pages 1555-1561, 2002.
gzipped postscript pdf.
"Bias-Variance" Error Bounds for Temporal Difference Updates by Michael Kearns and Satinder Singh. In Proceedings of the Thirteenth Annual Conference on Computational Learning Theory (COLT), pages 142-147, 2000.
gzipped postscript.
Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms by Satinder Singh, Tommi Jaakkola, Michael Littman, and Csaba Szepesvári. In Machine Learning Journal, vol 38(3), pages 287-308, 2000.
gzipped postscript.
Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning by Richard Sutton, Doina Precup and Satinder Singh. In Artificial Intelligence Journal, Volume 112, pages 181-211, 1999.
gzipped postscript.
On the Complexity of Policy Iteration by Yishay Mansour and Satinder Singh. In Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 401-408, 1999.
gzipped postscript.
Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms by Michael Kearns and Satinder Singh. In Advances in Neural Information Processing Systems 11 (NIPS), pages 996-1002, 1999.
gzipped postscript.
Near-Optimal Reinforcement Learning in Polynomial Time by Michael Kearns and Satinder Singh. In Proceedings of the Fifteenth International Conference on Machine Learning (ICML), pages 260-268, 1998.
gzipped postscript.
Theoretical Results on Reinforcement Learning with Temporally Abstract Behaviors by Doina Precup, Richard Sutton, and Satinder Singh. In Proceedings of the 10th European Conference on Machine Learning (ECML), pages 382-393, 1998.
gzipped postscript.
Analytical Mean Squared Error Curves for Temporal Difference Learning by Satinder Singh and Peter Dayan. In Machine Learning Journal, Volume 32, Issue 1, pages 5-40, 1998.
gzipped postscript.
A shorter version appears in the NIPS 9 Proceedings.
Analytical Mean Squared Error Curves for Temporal Difference Learning by Satinder Singh and Peter Dayan. In Advances in Neural Information Processing Systems 9 (NIPS), pages 1054-1060, 1997.
gzipped postscript.
Reinforcement Learning with Replacing Eligibility Traces by Satinder Singh and Richard Sutton. In Machine Learning journal, Volume 22, Issue 1, pages 123-158, 1996.
gzipped postscript abstract.
Learning Curve Bounds for Markov Decision Processes with Undiscounted Rewards by Lawrence Saul and Satinder Singh. In Proceedings of 9th Annual Conference on Computational Learning Theory (COLT), pages 147-156, 1996.
gzipped postscript.
Markov Decision Processes in Large State Spaces by Lawrence Saul and Satinder Singh. In Proceedings of 8th Annual Workshop on Computational Learning Theory (COLT), pages 281-288, 1995.
gzipped postscript.
Learning to Act using Real-Time Dynamic Programming by Andrew Barto, Steve Bradtke and Satinder Singh. In Artificial Intelligence, Volume 72, pages 81-138, 1995.
gzipped postscript.
On the Convergence of Stochastic Iterative Dynamic Programming Algorithms by Tommi Jaakkola, Michael Jordan and Satinder Singh. In Neural Computation, Volume 6, Number 6, pages 1185-1201, 1994.
gzipped postscript.
Stochastic Convergence of Iterative DP Algorithms by Tommi Jaakkola, Michael Jordan and Satinder Singh. In Advances in Neural Information Processing Systems 6 (NIPS), pages 703-710, 1994.
gzipped postscript pdf.
An Upper Bound on the Loss from Approximate Optimal-Value Functions by Satinder Singh and Richard Yee. In Machine Learning, Volume 16, Issue 3, pages 227-233, 1994.
gzipped postscript.
