Papers on Theory of RL
Click here to return to the main page on reinforcement learning Satinder Singh.
- Predictive State Representations: A New Theory for Modeling Dynamical Systems by Satinder Singh, Michael R. James and Matthew R. Rudary. In Uncertainty in Artificial Intelligence: Proceedings of the Twentieth Conference (UAI), pages 512-519, 2004.
pdf.
- Learning and Discovery of Predictive State Representations in Dynamical Systems with Reset by Michael James and Satinder Singh. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML), pages 417-424, 2004.
pdf.
- A Nonlinear Predictive State Representation by Matthew Rudary and Satinder Singh. In Advances in Neural Information Processing Systems 16 (NIPS), pages 855-862, 2004.
pdf.
- Near-Optimal Reinforcement Learning in Polynomial Time by Michael Kearns and Satinder Singh. In Machine Learning journal, Volume 49, Issue 2, pages 209-232, 2002.
(A shorter version appears in ICML 1998.)
gzipped postscript pdf.
- Predictive Representations of State by Michael Littman, Richard Sutton and Satinder Singh. In Advances in Neural Information Processing Systems 14 (NIPS), pages 1555-1561, 2002.
gzipped postscript pdf.
- "Bias-Variance" Error Bounds for Temporal Difference Updates by Michael Kearns and Satinder Singh. In Proceedings of the Thirteenth Annual Conference on Computational Learning Theory (COLT), pages 142-147, 2000.
gzipped postscript.
- Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms by Satinder Singh, Tommi Jaakkola, Michael Littman, and Csaba Szepesvári. In Machine Learning Journal, Volume 38, Issue 3, pages 287-308, 2000.
gzipped postscript.
- Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning by Richard Sutton, Doina Precup and Satinder Singh. In Artificial Intelligence Journal, Volume 112, pages 181-211, 1999.
gzipped postscript.
- On the Complexity of Policy Iteration by Yishay Mansour and Satinder Singh. In Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 401-408, 1999.
gzipped postscript.
- Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms by Michael Kearns and Satinder Singh. In Advances in Neural Information Processing Systems 11 (NIPS), pages 996-1002, 1999.
gzipped postscript.
- Near-Optimal Reinforcement Learning in Polynomial Time by Michael Kearns and Satinder Singh. In Proceedings of the Fifteenth International Conference on Machine Learning (ICML), pages 260-268, 1998.
gzipped postscript.
- Theoretical Results on Reinforcement Learning with Temporally Abstract Behaviors by Doina Precup, Richard Sutton, and Satinder Singh. In Proceedings of the 10th European Conference on Machine Learning (ECML), pages 382-393, 1998.
gzipped postscript.
- Analytical Mean Squared Error Curves for Temporal Difference Learning by Satinder Singh and Peter Dayan. In Machine Learning Journal, Volume 32, Issue 1, pages 5-40, 1998.
gzipped postscript.
A shorter version appears in the NIPS 9 Proceedings.
- Analytical Mean Squared Error Curves for Temporal Difference Learning by Satinder Singh and Peter Dayan. In Advances in Neural Information Processing Systems 9 (NIPS), pages 1054-1060, 1997.
gzipped postscript.
- Reinforcement Learning with Replacing Eligibility Traces by Satinder Singh and Richard Sutton. In Machine Learning journal, Volume 22, Issue 1, pages 123-158, 1996.
gzipped postscript abstract.
- Learning Curve Bounds for Markov Decision Processes with Undiscounted Rewards by Lawrence Saul and Satinder Singh. In Proceedings of the 9th Annual Conference on Computational Learning Theory (COLT), pages 147-156, 1996.
gzipped postscript.
- Markov Decision Processes in Large State Spaces by Lawrence Saul and Satinder Singh. In Proceedings of the 8th Annual Workshop on Computational Learning Theory (COLT), pages 281-288, 1995.
gzipped postscript.
- Learning to Act using Real-Time Dynamic Programming by Andrew Barto, Steve Bradtke and Satinder Singh. In Artificial Intelligence, Volume 72, pages 81-138, 1995.
gzipped postscript.
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms by Tommi Jaakkola, Michael Jordan and Satinder Singh. In Neural Computation, Volume 6, Number 6, pages 1185-1201, 1994.
gzipped postscript.
- Stochastic Convergence of Iterative DP Algorithms by Tommi Jaakkola, Michael Jordan and Satinder Singh. In Advances in Neural Information Processing Systems 6 (NIPS), pages 703-710, 1994.
gzipped postscript pdf.
- An Upper Bound on the Loss from Approximate Optimal-Value Functions by Satinder Singh and Richard Yee. In Machine Learning, Volume 16, Issue 3, pages 227-233, 1994.
gzipped postscript.