Papers in Reverse Chronological Order
Refereed Conference and Journal Papers
Constraint Satisfaction Algorithms for Graphical Games by Vishal Soni, Satinder Singh and Michael Wellman. In Proceedings of the 2007 International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2007.
pdf.
On Discovery and Learning of Models with Predictive Representations of State for Agents with Continuous Actions and Observations by David Wingate and Satinder Singh. In Proceedings of the 2007 International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2007.
pdf.
Relational Knowledge with Predictive State Representations by David Wingate, Vishal Soni, Britton Wolfe and Satinder Singh. In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), pages 2035-2040, 2007.
pdf.
An Experts Algorithm for Transfer Learning by Erik Talvitie and Satinder Singh. In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), pages 1065-1070, 2007.
pdf.
Cobot in LambdaMOO: An Adaptive Social Statistics Agent by Charles Isbell, Michael Kearns, Satinder Singh, Christian Shelton, Peter Stone and Dave Kormann. In Journal of Autonomous Agents and Multi-Agent Systems, 13(3), pages 327-354, 2006.
pdf.
Mixtures of Predictive Linear Gaussian Models for Nonlinear Stochastic Dynamical Systems by David Wingate and Satinder Singh. In Proceedings of the 21st National Conference on Artificial Intelligence (AAAI), 2006.
pdf.
Using Homomorphisms to Transfer Options Across Reinforcement Learning Domains by Vishal Soni and Satinder Singh. In Proceedings of the 21st National Conference on Artificial Intelligence (AAAI), 2006.
pdf.
Kernel Predictive Linear-Gaussian Models for Nonlinear Stochastic Dynamical Systems by David Wingate and Satinder Singh. In Proceedings of the 23rd International Conference on Machine Learning (ICML), pages 1017-1024, 2006.
pdf.
Predictive linear-Gaussian models of controlled stochastic dynamical systems by Matthew Rudary and Satinder Singh. In Proceedings of the 23rd International Conference on Machine Learning (ICML), pages 777-784, 2006.
pdf.
Predictive State Representations with Options by Britton Wolfe and Satinder Singh. In Proceedings of the 23rd International Conference on Machine Learning (ICML), pages 1025-1032, 2006.
pdf.
Optimal Coordinated Planning Amongst Self-Interested Agents with Private State by Ruggiero Cavallo, David C. Parkes and Satinder Singh. In Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence (UAI), 2006.
pdf.
Reinforcement Learning of Hierarchical Skills on the Sony Aibo Robot by Vishal Soni and Satinder Singh. In Proceedings of the 5th International Conference on Development and Learning (ICDL), 2006.
pdf.
Off-policy Learning with Options and Recognizers by Doina Precup, Richard Sutton, Cosmin Paduraru, Anna Koop and Satinder Singh. In Advances in Neural Information Processing Systems 18 (NIPS), pages 1097-1104, 2006.
pdf.
Intrinsically Motivated Reinforcement Learning by Satinder Singh, Andrew G. Barto and Nuttapong Chentanez. In Advances in Neural Information Processing Systems 17 (NIPS), pages 1281-1288, 2005.
pdf.
Approximately Efficient Online Mechanism Design by David Parkes, Satinder Singh and Dimah Yanovsky. In Advances in Neural Information Processing Systems 17 (NIPS), pages 1049-1056, 2005.
pdf.
Predictive linear-Gaussian models of stochastic dynamical systems by Matthew Rudary, Satinder Singh and David Wingate. In Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), pages 501-508, 2005.
pdf.
Intrinsically Motivated Learning of Hierarchical Collections of Skills by Andrew G. Barto, Satinder Singh, and Nuttapong Chentanez. To appear in Proceedings of the International Conference on Development and Learning (ICDL), 2004.
pdf.
Predictive State Representations: A New Theory for Modeling Dynamical Systems by Satinder Singh, Michael R. James and Matthew R. Rudary. In Uncertainty in Artificial Intelligence: Proceedings of the Twentieth Conference (UAI), pages 512-519, 2004.
pdf.
Learning and Discovery of Predictive State Representations in Dynamical Systems with Reset by Michael James and Satinder Singh. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML), pages 417-424, 2004.
pdf.
Adaptive Cognitive Orthotics: Combining Reinforcement Learning and Constraint-Based Temporal Reasoning by Matthew Rudary, Satinder Singh and Martha Pollack. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML), pages 719-726, 2004.
pdf.
Computing Approximate Bayes Nash Equilibria in Tree-Games of Incomplete Information by Satinder Singh, Vishal Soni and Michael Wellman. In Proceedings of the Fifth ACM Conference on Electronic Commerce (EC), pages 81-90, 2004.
pdf.
Distributed Feedback Control for Decision Making on Supply Chains by Christopher Kiekintveld, Michael P. Wellman, Satinder Singh, Joshua Estelle, Yevgeniy Vorobeychik, Vishal Soni and Matthew Rudary. In Proceedings of the 14th International Conference on Automated Planning and Scheduling (ICAPS), pages 384-392, 2004.
pdf.
Strategic Interactions in the TAC 2003 Supply Chain Tournament by Joshua Estelle, Yevgeniy Vorobeychik, Michael P. Wellman, Satinder Singh, Christopher Kiekintveld and Vishal Soni. In Proceedings of the Fourth International Conference on Computers and Games, 2004.
pdf.
A Nonlinear Predictive State Representation by Matthew Rudary and Satinder Singh. In Advances in Neural Information Processing Systems 16 (NIPS), pages 855-862, 2004.
pdf.
An MDP-Based Approach to Online Mechanism Design by David Parkes and Satinder Singh. In Advances in Neural Information Processing Systems 16 (NIPS), pages 791-798, 2004.
pdf.
Learning Predictive State Representations by Satinder Singh, Michael Littman, Nicholas Jong, David Pardoe and Peter Stone. In Proceedings of the Twentieth International Conference on Machine Learning (ICML), pages 712-719, 2003.
gzipped postscript.
Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System by Satinder Singh, Diane Litman, Michael Kearns and Marilyn Walker. In Journal of Artificial Intelligence Research (JAIR), Volume 16, pages 105-133, 2002.
gzipped postscript pdf.
CobotDS: A Spoken Dialogue System for Chat by Michael Kearns, Charles Isbell, Satinder Singh, Diane Litman, and J. Howe. In Proceedings of the Eighteenth National Conference on Artificial Intelligence (AAAI), pages 425-430, 2002.
gzipped postscript pdf.
Near-Optimal Reinforcement Learning in Polynomial Time by Michael Kearns and Satinder Singh. In Machine Learning journal, Volume 49, Issue 2, pages 209-232, 2002.
(A shorter version appears in ICML 1998.)
gzipped postscript pdf.
Predictive Representations of State by Michael Littman, Richard Sutton and Satinder Singh. In Advances in Neural Information Processing Systems 14 (NIPS), pages 1555-1561, 2002.
gzipped postscript pdf.
ATTac-2000: An Adaptive Autonomous Bidding Agent by Peter Stone, Michael Littman, Satinder Singh and Michael Kearns. In Journal of Artificial Intelligence Research (JAIR), Vol 15, pages 189-206, 2001.
gzipped postscript pdf.
(A shorter version also appears in AAAI'01 as listed below).
Graphical Models for Game Theory by Michael Kearns, Michael Littman and Satinder Singh. In Proceedings of the Seventeenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 253-260, 2001.
gzipped postscript pdf.
An Efficient Exact Algorithm for Singly Connected Graphical Games by Michael Littman, Michael Kearns and Satinder Singh. In Advances in Neural Information Processing Systems 14 (NIPS), pages 817-823, 2002.
gzipped postscript pdf.
ATTac-2000: An Adaptive Autonomous Bidding Agent by Peter Stone, Michael Littman, Satinder Singh and Michael Kearns. In Proceedings of the Fifth International Conference on Autonomous Agents (AGENTS), pages 238-245, 2001.
gzipped postscript pdf.
Cobot: A Social Reinforcement Learning Agent by Charles Isbell, Christian Shelton, Michael Kearns, Satinder Singh and Peter Stone. In Advances in Neural Information Processing Systems 14 (NIPS), pages 1393-1400, 2002.
gzipped postscript pdf.
A Social Reinforcement Learning Agent by Charles Isbell, Christian Shelton, Michael Kearns, Satinder Singh and Peter Stone. In Proceedings of the Fifth International Conference on Autonomous Agents (AGENTS), pages 377-384, 2001.
Winner of Best Paper Award.
gzipped postscript.
Empirical Evaluation of a Reinforcement Learning Spoken Dialogue System by Satinder Singh, Michael Kearns, Diane Litman, and Marilyn Walker. In Proceedings of the Seventeenth National Conference on Artificial Intelligence (AAAI), pages 645-651, 2000.
gzipped postscript pdf.
Cobot in LambdaMOO: A Social Statistics Agent by Charles Isbell, Michael Kearns, Dave Kormann, Satinder Singh and Peter Stone. In Proceedings of the Seventeenth National Conference on Artificial Intelligence (AAAI), pages 36-41, 2000.
gzipped postscript.
Automatic Optimization of Dialogue Management by Diane Litman, Michael Kearns, Satinder Singh and Marilyn Walker. In Proceedings of the 18th International Conference on Computational Linguistics (COLING), pages 502-508, 2000.
gzipped postscript pdf.
A Boosting Approach to Topic Spotting on Subdialogues by Kary Myers, Michael Kearns, Satinder Singh and Marilyn Walker. In Proceedings of the Seventeenth International Conference on Machine Learning (ICML), pages 655-662, 2000.
gzipped postscript pdf.
Eligibility Traces for Off-Policy Policy Evaluation by Doina Precup, Richard Sutton, and Satinder Singh. In Proceedings of the Seventeenth International Conference on Machine Learning (ICML), pages 759-766, 2000.
gzipped postscript pdf.
Nash Convergence of Gradient Dynamics in General-Sum Games by Satinder Singh, Michael Kearns and Yishay Mansour. In Proceedings of the Sixteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 541-548, 2000.
gzipped postscript.
Fast Planning in Stochastic Games by Michael Kearns, Yishay Mansour, and Satinder Singh. In Proceedings of the Sixteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 309-316, 2000.
gzipped postscript.
"Bias-Variance" Error Bounds for Temporal Difference Updates by Michael Kearns and Satinder Singh. In Proceedings of the Thirteenth Annual Conference on Computational Learning Theory (COLT), pages 142-147, 2000.
gzipped postscript.
Reinforcement Learning for Spoken Dialogue Systems by Satinder Singh, Michael Kearns, Diane Litman and Marilyn Walker. In Advances in Neural Information Processing Systems 12 (NIPS), 2000.
gzipped postscript.
Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms by Satinder Singh, Tommi Jaakkola, Michael Littman, and Csaba Szepesvari. In Machine Learning Journal, vol 38(3), pages 287-308, 2000.
gzipped postscript.
Policy Gradient Methods for Reinforcement Learning with Function Approximation by Richard Sutton, Dave McAllester, Satinder Singh and Yishay Mansour. In Advances in Neural Information Processing Systems 12 (NIPS), 2000.
gzipped postscript.
Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning by Richard Sutton, Doina Precup and Satinder Singh. In Artificial Intelligence Journal, Volume 112, pages 181-211, 1999.
gzipped postscript.
Approximate Planning for Factored POMDPs using Belief State Simplification by Dave McAllester and Satinder Singh. In Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 409-416, 1999.
gzipped postscript.
On the Complexity of Policy Iteration by Yishay Mansour and Satinder Singh. In Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 401-408, 1999.
gzipped postscript.
Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms by Michael Kearns and Satinder Singh. In Advances in Neural Information Processing Systems 11 (NIPS), pages 996-1002, 1999.
gzipped postscript.
Experimental Results on Learning Stochastic Memoryless Policies for Partially Observable Markov Decision Processes by John K. Williams and Satinder Singh. In Advances in Neural Information Processing Systems 11 (NIPS), pages 1073-1079, 1999.
gzipped postscript.
Optimizing admission control while ensuring quality of service in multimedia networks via reinforcement learning by Timothy Brown, Hong Tong, and Satinder Singh. In Advances in Neural Information Processing Systems 11 (NIPS), pages 982-988, 1999.
gzipped postscript.
Improved switching among temporally abstract actions by Richard Sutton, Satinder Singh, Doina Precup and Balaraman Ravindran. In Advances in Neural Information Processing Systems 11 (NIPS), pages 1066-1072, 1999.
gzipped postscript.
Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes by John Loch and Satinder Singh. In Proceedings of the Fifteenth International Conference on Machine Learning (ICML), pages 323-331, 1998.
gzipped postscript.
Near-Optimal Reinforcement Learning in Polynomial Time by Michael Kearns and Satinder Singh. In Proceedings of the Fifteenth International Conference on Machine Learning (ICML), pages 260-268, 1998.
gzipped postscript.
Intra-Option Learning about Temporally Abstract Actions by Richard Sutton, Doina Precup and Satinder Singh. In Proceedings of the Fifteenth International Conference on Machine Learning (ICML), pages 556-564, 1998.
gzipped postscript.
Theoretical Results on Reinforcement Learning with Temporally Abstract Behaviors by Doina Precup, Richard Sutton, and Satinder Singh. In Proceedings of the 10th European Conference on Machine Learning (ECML), pages 382-393, 1998.
gzipped postscript.
How to Dynamically Merge Markov Decision Processes by Satinder Singh and David Cohn. In Advances in Neural Information Processing Systems 10 (NIPS), pages 1057-1063, 1998.
gzipped postscript pdf.
Analytical Mean Squared Error Curves for Temporal Difference Learning by Satinder Singh and Peter Dayan. In Machine Learning Journal, Volume 32, Issue 1, pages 5-40, 1998.
gzipped postscript.
A shorter version appears in the NIPS 9 Proceedings.
Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems by Satinder Singh and Dimitri Bertsekas. In Advances in Neural Information Processing Systems 9 (NIPS), pages 974-980, 1997.
gzipped postscript.
Predicting Lifetimes in Dynamically Allocated Memory by David Cohn and Satinder Singh. In Advances in Neural Information Processing Systems 9 (NIPS), pages 939-945, 1997.
gzipped postscript pdf.
Analytical Mean Squared Error Curves for Temporal Difference Learning by Satinder Singh and Peter Dayan. In Advances in Neural Information Processing Systems 9 (NIPS), pages 1054-1060, 1997.
gzipped postscript.
Reinforcement Learning with Replacing Eligibility Traces by Satinder Singh and Richard Sutton. In Machine Learning journal, Volume 22, Issue 1, pages 123-158, 1996.
gzipped postscript abstract.
Learning Curve Bounds for Markov Decision Processes with Undiscounted Rewards by Lawrence Saul and Satinder Singh. In Proceedings of 9th Annual Conference on Computational Learning Theory (COLT), pages 147-156, 1996.
gzipped postscript.
Long Term Potentiation, Navigation and Dynamic Programming by Peter Dayan and Satinder Singh. In Proceedings of Computation and Neural Systems Meeting (CNS), 1996.
gzipped postscript.
Improving Policies Without Measuring Merits by Peter Dayan and Satinder Singh. In Advances in Neural Information Processing Systems 8 (NIPS), pages 1059-1065, 1996.
gzipped postscript.
Markov Decision Processes in Large State Spaces by Lawrence Saul and Satinder Singh. In Proceedings of 8th Annual Workshop on Computational Learning Theory (COLT), pages 281-288, 1995.
gzipped postscript.
Learning to Act using Real-Time Dynamic Programming by Andrew Barto, Steve Bradtke and Satinder Singh. In Artificial Intelligence, Volume 72, pages 81-138, 1995.
gzipped postscript.
On the Convergence of Stochastic Iterative Dynamic Programming Algorithms by Tommi Jaakkola, Michael Jordan and Satinder Singh. In Neural Computation, Volume 6, Number 6, pages 1185-1201, 1994.
gzipped postscript.
Reinforcement Learning With Soft State Aggregation by Satinder Singh, Tommi Jaakkola and Michael Jordan. In Advances in Neural Information Processing Systems 7 (NIPS), pages 361-368, 1995.
gzipped postscript pdf.
Stochastic Convergence of Iterative DP Algorithms by Tommi Jaakkola, Michael Jordan and Satinder Singh. In Advances in Neural Information Processing Systems 6 (NIPS), pages 703-710, 1994.
gzipped postscript pdf.
Reinforcement Learning Algorithm for Partially Observable Markov Problems by Tommi Jaakkola, Satinder Singh and Michael Jordan. In Advances in Neural Information Processing Systems 7 (NIPS), pages 345-352, 1995.
gzipped postscript pdf.
Reinforcement Learning Algorithms for Average-Payoff Markovian Decision Processes by Satinder Singh. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI), pages 700-705, 1994.
gzipped postscript.
Learning Without State-Estimation in Partially Observable Markovian Decision Processes by Satinder Singh, Tommi Jaakkola and Michael Jordan. In Machine Learning: Proceedings of the Eleventh International Conference (ICML), pages 284-292, 1994.
gzipped postscript pdf.
Robust Reinforcement Learning in Motion Planning by Satinder Singh, Andrew Barto, Roderic Grupen, and Christopher Connolly. In Advances in Neural Information Processing Systems 6 (NIPS), pages 655-662, 1994.
gzipped postscript (68 KBytes).
An Upper Bound on the Loss from Approximate Optimal-Value Functions by Satinder Singh and Richard Yee. In Machine Learning, Volume 16, Issue 3, pages 227-233, 1994.
gzipped postscript.
Distributed Representation of Limb Motor Programs in Arrays of Adjustable Pattern Generators by Neil Berthier, Satinder Singh, Andrew Barto, and Jim Houk. In Journal of Cognitive Neuroscience, vol 5:1, pages 56-78, 1993.
Reinforcement Learning with a Hierarchy of Abstract Models by Satinder Singh. In Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI), pages 202-207, 1992.
gzipped postscript.
A Cortico-Cerebellar model that learns to generate distributed motor commands to control a kinematic arm by Satinder Singh, Neil Berthier, Andrew Barto, and Jim Houk. In Advances in Neural Information Processing Systems 4 (NIPS), pages 611-618, 1992.
Scaling Reinforcement Learning Algorithms by Learning Variable Temporal Resolution Models by Satinder Singh. In Proceedings of the Ninth Machine Learning Conference, pages 406-415, 1992.
gzipped postscript.
Transfer of Learning by Composing Solutions of Elemental Sequential Tasks by Satinder Singh. In Machine Learning Journal, Volume 8, Issue 3, pages 323-339, 1992.
gzipped postscript.
The Efficient Learning of Multiple Task Sequences by Satinder Singh. In Advances in Neural Information Processing Systems 4 (NIPS), pages 251-258, 1992.
gzipped postscript.
Transfer of Learning Across Compositions of Sequential Tasks by Satinder Singh. In Machine Learning: Proceedings of the Eighth International Workshop, pages 348-352, 1991.
gzipped postscript.
Refereed Workshop Papers
Strategic Procurement in TAC/SCM: An Empirical Game-Theoretic Analysis by Joshua Estelle, Yevgeniy Vorobeychik, Michael P. Wellman, Satinder Singh, Christopher Kiekintveld, and Vishal Soni. In Workshop on Trading Agent Design and Analysis (TADA), 2004.
gzipped postscript.
Computing Approximate Equilibria in Graphical Games on Arbitrary Graphs by Vishal Soni, Michael P. Wellman, and Satinder Singh. In Sixth Workshop on Game Theoretic and Decision Theoretic Agents (GTDT), 2004.
gzipped postscript.
Learning Payoff Functions in Infinite Games by Yevgeniy Vorobeychik, Michael P. Wellman, and Satinder Singh. In AAAI Fall Symposium on Artificial Multi-Agent Learning, 2004.
gzipped postscript.
FAucs: An FCC Spectrum Auction Simulator for Autonomous Bidding Agents by Janos Csirik, Michael Littman, Satinder Singh and Peter Stone. In Electronic Commerce: Proceedings of the Second International Workshop, 2001.
gzipped postscript pdf.
Cobot in LambdaMOO: A Social Statistics Agent by Charles Isbell, Michael Kearns, Dave Kormann, Satinder Singh, and Peter Stone. In Workshop on Interactive Robotics and Entertainment (WIRE), 2000. (This is an early workshop version of the AAAI paper with the same title.)
gzipped postscript.
Hierarchical Optimal Control of MDPs by Amy McGovern, Doina Precup, Balaraman Ravindran, Satinder Singh and Richard Sutton. In Proceedings of the Tenth Yale Workshop on Adaptive and Learning Systems, 1998.
gzipped postscript pdf.
Planning with Closed-Loop Macro Actions by Doina Precup, Richard Sutton and Satinder Singh. In Proceedings of AAAI Fall Symposium on Model-directed Autonomous Systems, 1997.
gzipped postscript.
On Step-Size and Bias in Temporal-Difference Learning by Richard Sutton and Satinder Singh. In Proceedings of Eighth Yale Workshop on Adaptive and Learning Systems, 1994.
gzipped postscript pdf abstract.
Reinforcement Learning and Dynamic Programming by Andrew Barto and Satinder Singh. In Proceedings of Sixth Yale Workshop on Adaptive and Learning Systems, 1990.
Magazine Articles, Book Chapters and Others
Value-Driven Procurement in the TAC Supply Chain Game by Christopher Kiekintveld, Michael P. Wellman, Satinder Singh, and Vishal Soni. SIGecom Exchanges, Volume 4.3, pages 9-19, 2004.
pdf.
Reinforcement Learning for 3 vs. 2 Keepaway by Peter Stone, Richard Sutton and Satinder Singh. In RoboCup-2000: Robot Soccer World Cup IV, P. Stone, T. Balch, and G. Kraetzschmar, Eds., Springer Verlag.
pdf file.
An earlier version appeared in the Proceedings of the RoboCup-2000 Workshop, Melbourne, Australia.
Soft Dynamic Programming Algorithms: Convergence Proofs by Satinder Singh. In Proceedings of Workshop on Computational Learning and Natural Learning (CLNL), Provincetown, Massachusetts, 1993.
gzipped postscript.
On the Computational Economics of Reinforcement Learning by Andrew Barto and Satinder Singh. In Proceedings of Connectionist Summer School, 1990.
gzipped postscript.
An Adaptive Sensorimotor Network Inspired by the Physiology of the Cerebellum by Jim Houk, Satinder Singh, Charles Fisher, and Andrew Barto. Appears as a chapter in WT Miller, RS Sutton, and PJ Werbos, editors, Neural Networks for Control, pages 301-348, 1989.
My one paper in a non-technical journal!
How to Make Software Agents Do the Right Thing: An Introduction to Reinforcement Learning by Satinder Singh, Peter Norvig and David Cohn. In Dr. Dobb's Journal, March issue, 1997.
gzipped postscript [html version].
An Almost Tutorial on RL (extracted from my Thesis)
An (Almost) Tutorial on Reinforcement Learning. gzipped postscript. Extracted from my 1993 thesis.
Going Nowhere Papers
Asynchronous Modified Policy Iteration with Single-sided Updates. Satinder Singh and Vijay Gullapalli. Working Paper, 1993.
gzipped postscript.