Satinder Singh's Papers

Papers in Reverse Chronological Order

Refereed Conference and Journal Papers

Pairwise Weights for Temporal Credit Assignment
by Zeyu Zheng, Risto Vuorio, Richard Lewis, and Satinder Singh.
In 36th AAAI Conference on Artificial Intelligence, 2022
arXiv version.
Reward is Enough
by David Silver, Satinder Singh, Doina Precup, and Richard Sutton.
In Artificial Intelligence, vol 299, 2021
pdf.
On the Expressivity of Markov Reward
by David Abel, Will Dabney, Anna Harutyunyan, Mark K. Ho, Michael L. Littman, Doina Precup, and Satinder Singh.
In Neural Information Processing Systems (NeurIPS), 2021
Outstanding Paper Award
pdf.
Proper Value Equivalence
by Christopher Grimm, Andre Barreto, Gregory Farquhar, David Silver, and Satinder Singh.
In Neural Information Processing Systems (NeurIPS), 2021
arXiv version.
Discovery of Options via Meta-Learned Subgoals
by Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, and Satinder Singh.
In Neural Information Processing Systems (NeurIPS), 2021
arXiv version.
Learning State Representations from Random Deep Action-Conditional Predictions
by Zeyu Zheng, Vivek Veeriah, Risto Vuorio, Richard Lewis, and Satinder Singh.
In Neural Information Processing Systems (NeurIPS), 2021
arXiv version.
Reward is Enough for Convex MDPs
by Tom Zahavy, Brendan O'Donoghue, Guillaume Desjardins, and Satinder Singh.
In Neural Information Processing Systems (NeurIPS), 2021
arXiv version.
Reinforcement Learning of Implicit and Explicit Control Flow Instructions
by Ethan Brooks, Janarthanan Rajendran, Richard Lewis, and Satinder Singh.
In International Conference on Machine Learning (ICML), 2021
arXiv version.
Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in a First-Person Simulated 3D Environment
by Wilka Carvalho, Anthony Liang, Kimin Lee, Honglak Lee, Richard Lewis, and Satinder Singh.
In International Joint Conference on Artificial Intelligence (IJCAI), 2021
arXiv version.
Rational use of episodic and working memory: A normative account of prospective memory
by Ida Momennejad, Jarrod Lewis-Peacock, Kenneth A. Norman, Jonathan D. Cohen, Satinder Singh, and Richard L. Lewis.
In Neuropsychologia, vol 158, 2021
pdf.
Efficient Querying for Cooperative Probabilistic Commitments
by Qi Zhang, Edmund Durfee, and Satinder Singh.
In 35th AAAI Conference on Artificial Intelligence (AAAI), 2021
arXiv version.
The Value Equivalence Principle for Model-Based Reinforcement Learning
by Christopher Grimm, Andre Barreto, Satinder Singh, and David Silver.
In Thirty Fourth Conference on Neural Information Processing Systems (NeurIPS), 2020
arXiv version.
Discovering Reinforcement Learning Algorithms
by Junhyuk Oh, Matteo Hessel, Wojciech Czarnecki, Zhongwen Xu, Hado van Hasselt, Satinder Singh, and David Silver.
In Thirty Fourth Conference on Neural Information Processing Systems (NeurIPS), 2020
arXiv version.
Meta-Gradient Reinforcement Learning with an Objective Discovered Online
by Zhongwen Xu, Hado van Hasselt, Matteo Hessel, Junhyuk Oh, Satinder Singh, and David Silver.
In Thirty Fourth Conference on Neural Information Processing Systems (NeurIPS), 2020
arXiv version.
Learning to No-Press Diplomacy with Best Response Policy Iteration
by Thomas Anthony, Tom Eccles, Andrea Tacchetti, Janos Kramar, Ian Gemp, Thomas Hudson, Nicolas Porcel, Marc Lanctot, Julien Perolat, Richard Everett, Roman Werpachowski, Satinder Singh, Thore Graepel, and Yoram Bachrach.
In Thirty Fourth Conference on Neural Information Processing Systems (NeurIPS), 2020
arXiv version.
A Self-Tuning Actor-Critic Algorithm
by Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado van Hasselt, David Silver, and Satinder Singh.
In Thirty Fourth Conference on Neural Information Processing Systems (NeurIPS), 2020
arXiv version.
On Efficiency in Hierarchical Reinforcement Learning
by Zheng Wen, Doina Precup, Morteza Ibrahimi, Andre Barreto, Benjamin Van Roy, and Satinder Singh.
In Thirty Fourth Conference on Neural Information Processing Systems (NeurIPS), 2020
pdf.
What can Learned Intrinsic Rewards Capture?
by Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, and Satinder Singh.
In International Conference on Machine Learning (ICML), 2020.
arXiv version.
Sample Complexity of Reinforcement Learning Using Linearly Combined Model Ensembles
by Aditya Modi, Nan Jiang, Ambuj Tewari, and Satinder Singh.
In International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
arXiv version.
How Should An Agent Practice?
by Janarthanan Rajendran, Richard Lewis, Vivek Veeriah, Honglak Lee, and Satinder Singh.
In Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI), 2020.
pdf.
Modeling Probabilistic Commitments for Maintenance is Inherently Harder than for Achievement
by Qi Zhang, Edmund Durfee, and Satinder Singh.
In Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI), 2020.
pdf.
Querying to Find a Safe Policy under Uncertain Safety Constraints in Markov Decision Processes
by Shun Zhang, Edmund Durfee, and Satinder Singh.
In Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI), 2020.
pdf.
Discovery of Useful Questions as Auxiliary Tasks
by Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Richard Lewis, Janarthanan Rajendran, Junhyuk Oh, Hado van Hasselt, David Silver, and Satinder Singh.
In Neural Information Processing Systems (NeurIPS), 2019.
arXiv version.
Behavior Suite for Reinforcement Learning
by Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvari, Satinder Singh, Benjamin Van Roy, Richard Sutton, David Silver, and Hado van Hasselt.
In Neural Information Processing Systems (NeurIPS), 2019.
arXiv version.
Hindsight Credit Assignment
by Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Gheshlaghi Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Gregory Wayne, Satinder Singh, Doina Precup, and Remi Munos.
In Neural Information Processing Systems (NeurIPS), 2019.
pdf.
No Press Diplomacy: Modeling Multi-Agent Gameplay
by Philip Paquette, Yuchen Lu, Steven Bocco, Max O. Smith, Satya Ortiz-Gagne, Jonathan K. Kummerfeld, Satinder Singh, Joelle Pineau, and Aaron Courville.
In Neural Information Processing Systems (NeurIPS), 2019.
arXiv version.
Disentangled Cumulants Help Successor Representations Transfer to New Tasks
by Christopher Grimm, Irina Higgins, Andre Barreto, Denis Teplyashin, Markus Wulfmeier, Tim Hertweck, Raia Hadsell, and Satinder Singh.
arXiv version.
Deep Reinforcement Learning for Dynamic Multi-Driver Dispatching and Repositioning Problem
by John Holler, Risto Vuorio, Tiancheng Jin, Satinder Singh, Zhiwei Qin, Jieping Ye, Xiaocheng Tang, Yan Jiao, and Chenxi Wang.
In International Conference on Data Mining (ICDM-Short Paper), 2019.
pdf.
Learning Independently-Obtainable Reward Functions
by Christopher Grimm and Satinder Singh.
arXiv version.
Many-Goals Reinforcement Learning
by Vivek Veeriah, Junhyuk Oh, and Satinder Singh.
arXiv version.
Learning to Communicate and Solve Visual Blocks-World Tasks
by Qi Zhang, Richard Lewis, Satinder Singh, and Edmund Durfee.
In Thirty-Third AAAI Conference on Artificial Intelligence (AAAI), 2019.
pdf.
On Learning Intrinsic Rewards for Policy Gradient Methods
by Zeyu Zheng, Junhyuk Oh, and Satinder Singh.
In Neural Information Processing Systems (NIPS), 2018.
arXiv version.
Generative Adversarial Self-Imitation Learning
by Yijie Guo, Junhyuk Oh, Satinder Singh, and Honglak Lee.
In Neural Information Processing Systems (NeurIPS), 2018.
arXiv version.
Completing State Representations Using Spectral Learning
by Nan Jiang, Alex Kulesza, and Satinder Singh.
In Neural Information Processing Systems (NIPS), 2018.
pdf.
Learning End-to-End Goal-Oriented Dialog with Multiple Answers
by Janarthanan Rajendran, Jatin Ganhotra, Satinder Singh, and Lazaros Polymenakos.
In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018.
pdf.
Self-Imitation Learning
by Junhyuk Oh, Yijie Guo, Satinder Singh, and Honglak Lee.
In International Conference on Machine Learning (ICML), 2018.
arXiv version.
Minimax-Regret Querying on Side Effects for Safe Optimality in Factored Markov Decision Processes
by Shun Zhang, Edmund Durfee, and Satinder Singh.
In International Joint Conference on Artificial Intelligence (IJCAI), 2018.
pdf.
Markov Decision Processes with Continuous Side Information
by Aditya Modi, Nan Jiang, Satinder Singh, and Ambuj Tewari.
In International Conference on Algorithmic Learning Theory (ALT), 2018.
conf pdf, arXiv link.
The Advantage of Doubling: A Deep Reinforcement Learning Approach to Studying the Double Team in the NBA
by Jiaxuan Wang, Ian Fox, Jonathan Skaza, Nick Linck, Satinder Singh, and Jenna Wiens.
In Sloan Sports Analytics Conference, 2018.
arXiv link.
Value Prediction Networks
by Junhyuk Oh, Satinder Singh, and Honglak Lee.
In Neural Information Processing Systems (NIPS), 2017.
arXiv link.
Repeated Inverse Reinforcement Learning
by Kareem Amin, Nan Jiang, and Satinder Singh.
In Neural Information Processing Systems (NIPS), 2017.
arXiv link.
A Big Step for AI
by Satinder Singh.
In Nature: News & Views, 2017.
pdf.
Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning
by Junhyuk Oh, Satinder Singh, Honglak Lee, and Pushmeet Kohli.
In International Conference on Machine Learning (ICML), 2017.
pdf.
A Stackelberg Game Model for Botnet Data Exfiltration
by Thang Nguyen, Michael Wellman, and Satinder Singh.
In Proceedings of the 8th Conference on Decision and Game Theory for Security (GameSec), 2017.
pdf.
Learning to Query, Reason, and Answer Questions on Ambiguous Texts.
by Xiaoxiao Guo, Tim Klinger, Clemens Rosenbaum, Joseph Bigus, Murray Campbell, Ban Kawas, Kartik Talamadupula, Gerald Tesauro, and Satinder Singh.
In 5th International Conference on Learning Representations (ICLR), 2017.
pdf.
Approximately-Optimal Queries for Planning in Reward-Uncertain Markov Decision Processes.
by Shun Zhang, Edmund Durfee, and Satinder Singh.
In 27th International Conference on Automated Planning and Scheduling (ICAPS), 2017.
pdf.
Minimizing Maximum Regret in Commitment Constrained Sequential Decision Making.
by Qi Zhang, Satinder Singh, and Edmund Durfee.
In 27th International Conference on Automated Planning and Scheduling (ICAPS), 2017.
pdf.
Predicting Counselor Behaviors in Motivational Interviewing Encounters.
by Veronica Perez-Rosas, Rada Mihalcea, Kenneth Resnicow, Satinder Singh, Lawrence An, Kathy J. Goggin, and Delwyn Catley.
In Proceedings of the European Association of Computational Linguistics (EACL), 2017.
pdf.
Control of Memory, Active Perception, and Action in Minecraft.
by Junhyuk Oh, Valliappa Chockalingam, Satinder Singh, and Honglak Lee.
In 33rd International Conference on Machine Learning (ICML), 2016.
pdf.
Gradient Methods for Stackelberg Security Games.
by Kareem Amin, Satinder Singh, and Michael Wellman.
In Conference on Uncertainty in Artificial Intelligence (UAI), 2016.
pdf.
Deep Learning for Reward Design to Improve Monte Carlo Tree Search in ATARI Games.
by Xiaoxiao Guo, Satinder Singh, Richard Lewis, and Honglak Lee.
In 25th International Joint Conference on Artificial Intelligence (IJCAI), 2016.
pdf.
Commitment Semantics for Sequential Decision Making Under Reward Uncertainty.
by Qi Zhang, Edmund Durfee, Satinder Singh, Anna Chen, and Stefan Witwicki.
In 25th International Joint Conference on Artificial Intelligence (IJCAI), 2016.
pdf.
On Structural Properties of MDPs that Bound Loss Due to Shallow Planning.
by Nan Jiang, Satinder Singh and Ambuj Tewari.
In 25th International Joint Conference on Artificial Intelligence (IJCAI), 2016.
pdf.
On the Trustworthy Fulfillment of Commitments.
by Edmund Durfee and Satinder Singh.
In Proceedings of the 18th International Workshop on Trust in Agent Societies (TRUST), 2016.
pdf.
Building a Motivational Interviewing Dataset.
by Veronica Perez-Rosas, Rada Mihalcea, Kenneth Resnicow, Lawrence An, and Satinder Singh.
In Proceedings of the NAACL 2016 Workshop on Clinical Psychology, 2016.
pdf.
Patient-Centered Pain Care Using Artificial Intelligence and Mobile Health Tools: Protocol for a Randomized Study Funded by the US Department of Veterans Affairs Health Services Research and Development Program.
by Piette JD, Krein SL, Striplin D, Marinec N, Kerns RD, Farris KB, Singh S, An L, and Heapy AA.
In JMIR Research Protocols; 5(2) 2016.
pdf.
Improving Predictive State Representations via Gradient Descent.
by Nan Jiang, Alex Kulesza, and Satinder Singh.
In Thirtieth AAAI Conference on Artificial Intelligence (AAAI), 2016.
pdf.
Confirming the theoretical structure of expert-developed text messages to improve adherence to anti-hypertensive medications.
by Karen Farris, Teresa Salgado, Peter Batra, John Piette, Satinder Singh, Ahmed Guhad, Sean Newman, Vincent Marshall, and Larry An.
In Research in Social and Administrative Pharmacy, 2015.
pdf.
Action-Conditional Video Prediction Using Deep Networks in ATARI Games.
by Junhyuk Oh, Xiaoxiao Guo, Honglak Lee, Richard Lewis, and Satinder Singh.
In Neural Information Processing Systems, 2015.
online videos
arXiv pdf, NIPS pdf, NIPS Appendix pdf.
Multi-Task Seizure Detection: Addressing Inter-Patient and Intra-Patient Variations in Seizure Morphologies.
by Alex Van Esbroeck, Landon Smith, Zeeshan Syed, Satinder Singh, and Zahi Karam.
In Machine Learning, 2015.
pdf.
Abstraction Selection in Model-Based Reinforcement Learning.
by Nan Jiang, Alex Kulesza, and Satinder Singh.
In 32nd International Conference on Machine Learning (ICML), 2015.
pdf.
The Dependence of Effective Planning Horizon on Model Accuracy.
by Nan Jiang, Alex Kulesza, Satinder Singh, and Richard Lewis.
In International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), 2015.
Best Paper Award
pdf.
Low-Rank Spectral Learning with Weighted Loss Functions.
by Alex Kulesza, Nan Jiang, and Satinder Singh.
In Eighteenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2015.
pdf.
Spectral Learning of Predictive State Representations with Insufficient Statistics.
by Alex Kulesza, Nan Jiang, and Satinder Singh.
In Twenty-Ninth AAAI Conference, 2015.
pdf.
Optimal Rewards for Cooperative Agents.
by Bingyao Liu, Satinder Singh, Richard Lewis, and Shiyin Qin.
In IEEE Transactions on Autonomous Mental Development, Vol 6, Issue 4, 2014.
pdf.
Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning.
by Xiaoxiao Guo, Satinder Singh, Honglak Lee, Richard Lewis, and Xiaoshi Wang.
In Neural Information Processing Systems (NIPS), 2014.
pdf.
Computationally Rational Saccadic Control: An Explanation of Spillover Effects Based on Sampling from Noisy Perception and Memory.
by Michael Shvartsman, Richard L Lewis, and Satinder Singh.
In Cognitive Modeling and Computational Linguistics (CMCL), 2014.
pdf.
The Potential Impact of Intelligent Systems for Mobile Health Self-Management Support: Monte-Carlo Simulations of Text Message Support for Medication Adherence.
by John Piette, Karen Farris, Sean Newman, Larry An, Jeremy Sussman, and Satinder Singh.
In Annals of Behavioral Medicine, 2014.
pdf.
Low-Rank Spectral Learning.
by Alex Kulesza, Nadakuditi Raj Rao, and Satinder Singh.
In Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2014.
pdf.
Evaluating Trauma Patients: Addressing Missing Covariates with Joint Optimization.
by Alex Van Esbroeck, Satinder Singh, Ilan Rubinfeld, and Zeeshan Syed.
In 28th AAAI Conference on Artificial Intelligence (AAAI-14), 2014.
pdf.
Predicting Postoperative Atrial Fibrillation from Independent ECG Components.
by Chih-Chun Chia, James Blum, Zahi Karam, Satinder Singh, and Zeeshan Syed.
In 28th AAAI Conference on Artificial Intelligence (AAAI-14), 2014.
pdf.
Ecologically Valid Long-Term Mood Monitoring of Individuals with Bipolar Disorder Using Speech.
by Zahi Karam, Emily Mower Provost, Satinder Singh, Jennifer Montgomery, Christopher Archer, Gloria Harrington, and Melvin McInnis.
In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2014.
pdf.
Characterizing EVOI-Sufficient k-Response-Query Sets in Decision Problems.
by Robert Cohn, Satinder Singh, and Edmund Durfee.
In Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2014.
pdf.
Improving UCT Planning via Approximate Homomorphisms.
by Nan Jiang, Satinder Singh, and Richard Lewis.
In 13th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2014.
pdf.
Utility Maximization and Bounds on Human Information Processing.
by Andrew Howes, Richard L Lewis, and Satinder Singh.
In Topics in Cognitive Science, Volume 6, Issue 2, pages 198-203, 2014.
pdf.
Computing Solutions in infinite-horizon discounted adversarial patrolling games.
by Yevgeniy Vorobeychik, Bo An, Milind Tambe, and Satinder Singh.
In 24th International Conference on Automated Planning and Scheduling (ICAPS), 2014.
pdf.
Computational Rationality: Linking Mechanism and Behavior Through Utility Maximization.
by Richard L Lewis, Andrew Howes, and Satinder Singh.
In Topics in Cognitive Science, Volume 6, Issue 2, pages 279-311, 2014.
pdf.
Reward Mapping for Transfer in Long-Lived Agents.
by Xiaoxiao Guo, Satinder Singh, and Richard L Lewis.
In Advances in Neural Information Processing Systems (NIPS), 26, 2013.
pdf.
The adaptive nature of eye-movements in linguistic tasks: How payoff and architecture shape speed-accuracy tradeoffs.
by Richard L Lewis, Michael Shvartsman, and Satinder Singh.
In Topics in Cognitive Science, Vol. 5, Issue 3, pages 581-610, 2013.
pdf.
Linking Context to Evaluation in the Design of Safety Critical Interfaces.
by Michael Feary, Dorritt Billman, Xiuli Chen, Andrew Howes, Richard Lewis, Lance Sherry, and Satinder Singh.
In Proceedings of Human-Computer Interaction International, 2013.
pdf.
An Exploration of Low-Rank Spectral Learning.
by Alex Kulesza, Nadakuditi Raj Rao, and Satinder Singh.
In ICML Workshop on Spectral Learning, 2013.
pdf.
Maximizing the Value of Mobile Health Monitoring by Avoiding Redundant Patient Records: Prediction of Depression-Related Symptoms and Adherence Problems in Automated Health Assessment Services.
by John Piette, Jeremy Sussman, Paul Pfeiffer, Maria Silveira, Satinder Singh, and Mariel Lavieri.
In Journal of Medical Internet Research, Vol 15, No. 7, 2013.
Testing the Structure of SMS Messages for use in an Artificial Intelligence (AI)-driven SMS Antihypertensive Adherence Support Tool.
by Karen Farris, Sean Newman, Satinder Singh, Larry An, and John Piette.
Research Abstract in Wireless Health, 2013.
pdf.
Optimal Rewards in Multiagent Teams
by Bingyao Liu, Satinder Singh, Richard L. Lewis, and Shiyin Qin.
In International Conference on Development and Learning-EpiRob, 2012.
pdf.
Lossy Stochastic Game Abstraction with Bounds
by Tuomas Sandholm and Satinder Singh.
In Proceedings of the 13th ACM Conference on Electronic Commerce (EC), 2012.
pdf.
A previous version appears in Fifth International Workshop on Optimization in Multi-Agent Systems (OPTMAS), 2012.
Learning and Predicting Dynamic Networked Behavior with Graphical Multiagent Models
by Quang Duong, Michael P. Wellman, Satinder Singh, and Michael Kearns.
In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2012.
pdf.
Strong Mitigation: Nesting Search for Good Policies within Search for Good Reward
by Jeshua Bratman, Satinder Singh, Richard Lewis, and Jonathan Sorg.
In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2012.
pdf.
Security Games with Limited Surveillance
by Bo An, David Kempe, Christopher Kiekintveld, Eric Shieh, Satinder Singh, Milind Tambe, and Yevgeniy Vorobeychik.
In Proceedings of the Twenty-Sixth Conference on Artificial Intelligence (AAAI), 2012.
pdf.
Computing Stackelberg Equilibria in Discounted Stochastic Games
by Yevgeniy Vorobeychik and Satinder Singh.
In Proceedings of the Twenty-Sixth Conference on Artificial Intelligence (AAAI), 2012.
pdf.
(This is a corrected version of the paper that appeared in the conference proceedings.
Major thanks to Vincent Conitzer for finding a counterexample to the main theorem in the now-corrected submitted version.)
Planning Delayed-Response Queries and Transient Policies under Reward Uncertainty
by Rob Cohn, Edmund Durfee and Satinder Singh.
In Proceedings of the Seventh Annual Workshop on Multiagent Sequential Decision-Making Under Uncertainty (MSDM), held in conjunction with AAMAS, 2012.
pdf.
Planning and Evaluating Multiagent Influences Under Reward Uncertainty (Extended Abstract)
by Stefan Witwicki, Inn-Tung Chen, Edmund Durfee and Satinder Singh.
In 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2012.
pdf.
Learning to Make Predictions in Partially Observable Environments without a Generative Model
by Erik Talvitie and Satinder Singh.
In Journal of Artificial Intelligence Research, vol 42, pages 353-392, 2011.
pdf.
Optimal Rewards versus Leaf-Evaluation Heuristics in Planning Agents
by Jonathan Sorg, Satinder Singh, and Richard Lewis.
In Proceedings of the Twenty-Fifth Conference on Artificial Intelligence (AAAI), 2011.
pdf.
Comparing Action-Query Strategies in Semi-Autonomous Agents
by Robert Cohn, Edmund Durfee, and Satinder Singh.
In Proceedings of the Twenty-Fifth Conference on Artificial Intelligence (AAAI), 2011.
pdf.
An extended abstract also appears in the Proceedings of the 10th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), 2011.
Learning and Predicting Dynamic Behavior with Graphical Multiagent Models
by Quang Duong, Michael P. Wellman, Satinder Singh, and Michael Kearns.
In 5th International Workshop on Social Networks Mining and Analysis at KDD (SNACKDD-11), 2011.
pdf.
An extended abstract also appears in the Proceedings of the 2nd Workshop on Information in Networks (WIN-10), 2010.
Modeling Information Diffusion in Networks with Unobserved Links
by Quang Duong, Michael P. Wellman, and Satinder Singh.
In 3rd IEEE Conference on Social Computing (SocialCom-11), 2011.
pdf.
An earlier version also appears in the 5th International Workshop on Social Networks Mining and Analysis at KDD (SNACKDD-11), 2011.
Dynamic Incentive Mechanisms
by David C. Parkes, Ruggiero Cavallo, Florin Constantin and Satinder Singh.
In AI Magazine, Vol. 31, No. 4, pages 79-94, 2010.
pdf.
Reward Design via Online Gradient Ascent
by Jonathan Sorg, Satinder Singh, and Richard Lewis.
In Neural Information Processing Systems (NIPS), 2010.
pdf.
Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective
by Satinder Singh, Richard Lewis, Andrew Barto, and Jonathan Sorg.
In IEEE Transactions on Autonomous Mental Development, Vol 2, No 2, 2010.
pdf
Modeling Multiple-mode Systems with Predictive State Representations
by Britton Wolfe, Michael James and Satinder Singh.
In Proceedings of the 13th International IEEE Conference on Intelligent Transportation Systems, 2010.
pdf
Variance-Based Rewards for Approximate Bayesian Reinforcement Learning
by Jonathan Sorg, Satinder Singh, and Richard Lewis.
In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI), 2010.
pdf
Internal Rewards Mitigate Agent Boundedness
by Jonathan Sorg, Satinder Singh, and Richard Lewis.
In Proceedings of the 27th International Conference on Machine Learning (ICML), 2010.
pdf
A New Approach to Exploring Language Emergence as Boundedly Optimal Control in the Face of Environmental and Cognitive Constraints
by Jeshua Bratman, Michael Shvartsman, Richard Lewis, and Satinder Singh.
In Proceedings of the 10th International Conference on Cognitive Modeling (ICCM), 2010.
(Honorable mention for Allen Newell Best Student Paper Award at ICCM)
pdf
Selecting Operator Queries Using Expected Myopic Gain
by Robert Cohn, Michael Maxim, Edmund Durfee, and Satinder Singh.
In Proceedings of the International Conference on Intelligent Agent Technology (IAT), 2010.
pdf
History-Dependent Graphical Multiagent Models
by Quang Duong, Michael Wellman, Satinder Singh, and Yevgeniy Vorobeychik.
In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2010.
pdf
Linear Options
by Jonathan Sorg and Satinder Singh.
In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2010.
(Finalist for Pragnesh Jay Modi Best Student Paper Award)
pdf
Transfer via Soft Homomorphisms
by Jonathan Sorg and Satinder Singh.
In Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2009.
pdf
SarsaLandmark: an Algorithm for Learning in POMDPs with Landmarks
by Michael R. James and Satinder Singh.
In Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2009.
pdf
Learning Graphical Game Models
by Quang Duong, Yevgeniy Vorobeychik, Satinder Singh and Michael Wellman.
In Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI), 2009.
pdf
Where Do Rewards Come From?
by Satinder Singh, Richard L. Lewis and Andrew G. Barto.
In Proceedings of the Annual Conference of the Cognitive Science Society (CogSci), 2009.
pdf
Maintaining Predictions Over Time Without a Model
by Erik Talvitie and Satinder Singh.
In Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI), 2009.
pdf
Simple Local Models for Complex Dynamical Systems
by Erik Talvitie and Satinder Singh.
In Proceedings of the 22nd Annual Conference on Neural Information Processing Systems (NIPS), 2008.
pdf
Efficiently Learning Linear-Linear Exponential Family Predictive Representations of State
by David Wingate and Satinder Singh.
In Proceedings of the 25th International Conference on Machine Learning (ICML), pages 1176-1183, 2008.
pdf
Building Incomplete but Accurate Models
by Erik Talvitie, Britton Wolfe and Satinder Singh.
In Proceedings of the Tenth International Symposium on Artificial Intelligence and Mathematics (ISAIM), 2008.
pdf
Predictive Linear-Gaussian Models of Stochastic Dynamical Systems with Vector-Valued Actions and Observations
by Matthew Rudary and Satinder Singh.
In Proceedings of the Tenth International Symposium on Artificial Intelligence and Mathematics (ISAIM), 2008.
pdf
Knowledge Combination in Graphical Multiagent Models
by Quang Duong, Michael Wellman and Satinder Singh.
In Proceedings of the 24th Annual Conference on Uncertainty in Artificial Intelligence (UAI), 2008.
pdf
Approximate Predictive State Representations
by Britton Wolfe, Michael R. James and Satinder Singh.
In Proceedings of the 2008 International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2008.
(Finalist for Pragnesh Jay Modi Best Student Paper Award)
pdf
Learning Payoff Functions in Infinite Games
by Yevgeniy Vorobeychik, Michael Wellman and Satinder Singh.
Machine Learning Journal 67:145-168, 2007.
pdf
Constraint Satisfaction Algorithms for Graphical Games
by Vishal Soni, Satinder Singh and Michael Wellman.
In Proceedings of the 2007 International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2007.
pdf
On Discovery and Learning of Models with Predictive State Representations of State for Agents with Continuous Actions and Observations
by David Wingate and Satinder Singh.
In Proceedings of the 2007 International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2007.
pdf
Relational Knowledge with Predictive State Representations
by David Wingate, Vishal Soni, Britton Wolfe and Satinder Singh.
In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), pages 2035-2040, 2007.
pdf
An Experts Algorithm for Transfer Learning
by Erik Talvitie and Satinder Singh.
In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), pages 1065-1070, 2007.
pdf
Abstraction in Predictive State Representations
by Vishal Soni and Satinder Singh.
In Proceedings of the 22nd National Conference on Artificial Intelligence (AAAI), 2007.
pdf
Exponential Family Predictive Representations of State
by David Wingate and Satinder Singh.
In Proceedings of the Advances in Neural Information Processing Systems, 20 (NIPS), pages 1617-1624, 2007.
pdf
Cobot in LambdaMOO: An Adaptive Social Statistics Agent
by Charles Isbell, Michael Kearns, Satinder Singh, Christian Shelton, Peter Stone and Dave Kormann.
In Journal of Autonomous Agents and Multi-Agent Systems, 13(3), pages 327-354, 2006.
pdf
Mixtures of Predictive Linear Gaussian Models for Nonlinear Stochastic Dynamical Systems
by David Wingate and Satinder Singh.
In Proceedings of the 21st National Conference on Artificial Intelligence (AAAI), 2006.
pdf
Using Homomorphisms to Transfer Options Across Reinforcement Learning Domains
by Vishal Soni and Satinder Singh.
In Proceedings of the 21st National Conference on Artificial Intelligence (AAAI), 2006.
pdf
Kernel Predictive Linear-Gaussian Models for Nonlinear Stochastic Dynamical Systems
by David Wingate and Satinder Singh.
In Proceedings of the 23rd International Conference on Machine Learning (ICML), pages 1017-1024, 2006.
pdf
Predictive linear-Gaussian models of controlled stochastic dynamical systems
by Matthew Rudary and Satinder Singh.
In Proceedings of the 23rd International Conference on Machine Learning (ICML), pages 777-784, 2006.
pdf
Predictive State Representations with Options
by Britton Wolfe and Satinder Singh.
In Proceedings of the 23rd International Conference on Machine Learning (ICML), pages 1025-1032, 2006.
pdf
Empirical Game-Theoretic Analysis of Chaturanga
by Christopher Kiekintveld, Michael Wellman and Satinder Singh.
In Proceedings of AAMAS-06 Workshop on Game-Theoretic and Decision-Theoretic Agents, 2006.
pdf
Optimal Coordinated Planning Amongst Self-Interested Agents with Private State
by Ruggiero Cavallo, David C. Parkes and Satinder Singh.
In Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence (UAI), 2006.
pdf
Optimal Coordination of Loosely-Coupled Self-Interested Robots
by Ruggiero Cavallo, David C. Parkes, and Satinder Singh.
In Workshop on Auction Mechanisms for Robot Coordination at AAAI'06, 2006.
pdf
Reinforcement Learning of Hierarchical Skills on the Sony Aibo Robot
by Vishal Soni and Satinder Singh.
In Proceedings of the 5th International Conference on Development and Learning (ICDL), 2006.
pdf
Off-policy Learning with Options and Recognizers
by Doina Precup, Richard Sutton, Cosmin Paduraru, Anna Koop and Satinder Singh.
In Proceedings of Advances in Neural Information Processing Systems 18 (NIPS), pages 1097-1104, 2006.
pdf
Intrinsically Motivated Reinforcement Learning
by Satinder Singh, Andrew G. Barto and Nuttapong Chentanez.
In Proceedings of Advances in Neural Information Processing Systems 17 (NIPS), pages 1281-1288, 2005.
pdf
Approximately Efficient Online Mechanism Design
by David Parkes, Satinder Singh and Dimah Yanovsky.
In Proceedings of Advances in Neural Information Processing Systems 17 (NIPS), pages 1049-1056, 2005.
pdf
Predictive linear-Gaussian models of stochastic dynamical systems
by Matthew Rudary, Satinder Singh and David Wingate.
In Proceedings of the Uncertainty in Artificial Intelligence (UAI), pages 501-508, 2005.
pdf
Combining Memory and Landmarks with Predictive State Representations
by Michael R. James, Britton Wolfe and Satinder Singh.
In Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI), 2005.
pdf
Learning Payoff Functions in Infinite Games
by Yevgeniy Vorobeychik, Michael Wellman and Satinder Singh.
In Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI), 2005
pdf
(An expanded version was later published in the Machine Learning Journal; pdf)
Planning in Models that Combine Memory with Predictive Representations of State
by Michael R. James and Satinder Singh.
In Proceedings of the 20th National Conference on Artificial Intelligence (AAAI), pages 987-992, 2005.
pdf
Learning Predictive State Representations in Dynamical Systems Without Reset
by Britton Wolfe, Michael R. James and Satinder Singh.
In Proceedings of the 22nd International Conference on Machine Learning (ICML), 2005.
pdf
Intrinsically Motivated Learning of Hierarchical Collections of Skills
by Andrew G. Barto, Satinder Singh, and Nuttapong Chentanez.
In Proceedings of International Conference on Developmental Learning (ICDL), 2004.
pdf
Predictive State Representations: A New Theory for Modeling Dynamical Systems
by Satinder Singh, Michael R. James and Matthew R. Rudary.
In Uncertainty in Artificial Intelligence: Proceedings of the Twentieth Conference (UAI), pages 512-519, 2004.
pdf
Learning and Discovery of Predictive State Representations in Dynamical Systems with Reset
by Michael James and Satinder Singh.
In Proceedings of the Twenty-First International Conference on Machine Learning (ICML), pages 417-424, 2004.
pdf
Adaptive Cognitive Orthotics: Combining Reinforcement Learning and Constraint-Based Temporal Reasoning
by Matthew Rudary, Satinder Singh and Martha Pollack.
In Proceedings of the Twenty-First International Conference on Machine Learning (ICML), pages 719-726, 2004.
pdf
Planning with Predictive State Representations
by Michael R. James, Satinder Singh and Michael Littman.
In Proceedings of the International Conference on Machine Learning and Applications (ICMLA), pages 304-311, 2004.
pdf
Computing Approximate Bayes Nash Equilibria in Tree-Games of Incomplete Information
by Satinder Singh, Vishal Soni and Michael Wellman.
In Proceedings of the Fifth ACM Conference on Electronic Commerce (EC), pages 81-90, 2004.
pdf
Distributed Feedback Control for Decision Making on Supply Chains
by Christopher Kiekintveld, Michael P. Wellman, Satinder Singh, Joshua Estelle, Yevgeniy Vorobeychik, Vishal Soni and Matthew Rudary.
In Proceedings of the 14th International Conference on Automated Planning and Scheduling (ICAPS), pages 384-392, 2004.
pdf.
Strategic Interactions in the TAC 2003 Supply Chain Tournament
by Joshua Estelle, Yevgeniy Vorobeychik, Michael P. Wellman, Satinder Singh, Christopher Kiekintveld and Vishal Soni.
In Proceedings of the Fourth International Conference on Computer & Games, 2004.
pdf
A Nonlinear Predictive State Representation
by Matthew Rudary and Satinder Singh.
In Advances in Neural Information Processing Systems 16 (NIPS), pages 855-862, 2004.
pdf
Strategic Procurement in TAC/SCM: An Empirical Game-Theoretic Analysis
by Joshua Estelle, Yevgeniy Vorobeychik, Michael P. Wellman, Satinder Singh, Christopher Kiekintveld, and Vishal Soni.
In Workshop on Trading Agent Design and Analysis (TADA), 2004.
pdf
An MDP-Based Approach to Online Mechanism Design
by David Parkes and Satinder Singh.
In Advances in Neural Information Processing Systems 16 (NIPS), pages 791-798, 2004.
pdf
Learning Predictive State Representations
by Satinder Singh, Michael Littman, Nicholas Jong, David Pardoe and Peter Stone.
In Proceedings of the Twentieth International Conference on Machine Learning (ICML), pages 712-719, 2003.
pdf
Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System
by Satinder Singh, Diane Litman, Michael Kearns and Marilyn Walker.
In Journal of Artificial Intelligence Research (JAIR), Volume 16, pages 105-133, 2002.
pdf
CobotDS: A Spoken Dialogue System for Chat
by Michael Kearns, Charles Isbell, Satinder Singh, Diane Litman, and J. Howe.
In Proceedings of the Eighteenth National Conference on Artificial Intelligence (AAAI), pages 425-430, 2002.
pdf
Near-Optimal Reinforcement Learning in Polynomial Time
by Michael Kearns and Satinder Singh.
In Machine Learning journal, Volume 49, Issue 2, pages 209-232, 2002.
(A shorter version appears in ICML 1998).
pdf
Predictive Representations of State
by Michael Littman, Richard Sutton and Satinder Singh.
In Advances in Neural Information Processing Systems 14 (NIPS), pages 1555-1561, 2002.
pdf
ATTac-2000: An Adaptive Autonomous Bidding Agent
by Peter Stone, Michael Littman, Satinder Singh and Michael Kearns.
In Journal of Artificial Intelligence Research (JAIR), Vol 15, pages 189-206, 2001.
pdf.
(A shorter version also appears in AAAI'01 as listed below)
Graphical Models for Game Theory
by Michael Kearns, Michael Littman and Satinder Singh.
In Proceedings of the Seventeenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 253-260, 2001.
pdf
An Efficient Exact Algorithm for Singly Connected Graphical Games
by Michael Littman, Michael Kearns and Satinder Singh.
In Advances in Neural Information Processing Systems 14 (NIPS), pages 817-823, 2002.
pdf
FAucs: An FCC Spectrum Auction Simulator for Autonomous Bidding Agents
by Janos Csirik, Michael Littman, Satinder Singh and Peter Stone.
In Electronic Commerce: Proceedings of the Second International Workshop, 2001.
pdf
ATTac-2000: An Adaptive Autonomous Bidding Agent
by Peter Stone, Michael Littman, Satinder Singh and Michael Kearns.
In Proceedings of the Fifth International Conference on Autonomous Agents (AGENTS), pages 238-245, 2001.
pdf
Cobot: A Social Reinforcement Learning Agent
by Charles Isbell, Christian Shelton, Michael Kearns, Satinder Singh and Peter Stone.
In Advances in Neural Information Processing Systems 14 (NIPS), pages 1393-1400, 2002.
pdf
A Social Reinforcement Learning Agent
by Charles Isbell, Christian Shelton, Michael Kearns, Satinder Singh and Peter Stone.
In Proceedings of the Fifth International Conference on Autonomous Agents (AGENTS), pages 377-384, 2001.
Winner of Best Paper Award.
pdf
Empirical Evaluation of a Reinforcement Learning Spoken Dialogue System
by Satinder Singh, Michael Kearns, Diane Litman, and Marilyn Walker.
In Proceedings of the Seventeenth National Conference on Artificial Intelligence (AAAI), pages 645-651, 2000.
pdf
Cobot in LambdaMOO: A Social Statistics Agent
by Charles Isbell, Michael Kearns, Dave Korman, Satinder Singh and Peter Stone.
In Proceedings of the Seventeenth National Conference on Artificial Intelligence (AAAI), pages 36-41, 2000.
pdf
Automatic Optimization of Dialogue Management
by Diane Litman, Michael Kearns, Satinder Singh and Marilyn Walker.
In Proceedings of the 18th International Conference on Computational Linguistics (COLING), pages 502-508, 2000.
pdf
A Boosting Approach to Topic Spotting on Subdialogues
by Kary Myers, Michael Kearns, Satinder Singh and Marilyn Walker.
In Proceedings of the Seventeenth International Conference on Machine Learning (ICML), pages 655-662, 2000.
pdf
Eligibility Traces for Off-Policy Policy Evaluation
by Doina Precup, Richard Sutton, and Satinder Singh.
In Proceedings of the Seventeenth International Conference on Machine Learning (ICML), pages 759-766, 2000.
pdf
Nash Convergence of Gradient Dynamics in General-Sum Games
by Satinder Singh, Michael Kearns and Yishay Mansour.
In Proceedings of the Sixteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 541-548, 2000.
pdf
Fast Planning in Stochastic Games
by Michael Kearns, Yishay Mansour, and Satinder Singh.
In Proceedings of the Sixteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 309-316, 2000.
pdf
"Bias-Variance" Error Bounds for Temporal Difference Updates
by Michael Kearns and Satinder Singh.
In Proceedings of the Thirteenth Annual Conference on Computational Learning Theory (COLT), pages 142-147, 2000.
pdf
Reinforcement Learning for Spoken Dialogue Systems
by Satinder Singh, Michael Kearns, Diane Litman and Marilyn Walker.
In Advances in Neural Information Processing Systems 12 (NIPS), 2000.
pdf
Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms
by Satinder Singh, Tommi Jaakkola, Michael Littman, and Csaba Szepesvari.
In Machine Learning Journal, vol 38(3), pages 287-308, 2000.
pdf
Policy Gradient Methods for Reinforcement Learning with Function Approximation
by Richard Sutton, Dave McAllester, Satinder Singh and Yishay Mansour.
In Advances in Neural Information Processing Systems 12 (NIPS), 2000.
pdf
Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning
by Richard Sutton, Doina Precup and Satinder Singh.
In Artificial Intelligence Journal, Volume 112, pages 181-211, 1999.
pdf
Approximate Planning for Factored POMDPs using Belief State Simplification
by Dave McAllester and Satinder Singh.
In Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 409-416, 1999.
pdf
On the Complexity of Policy Iteration
by Yishay Mansour and Satinder Singh.
In Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 401-408, 1999.
pdf
Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms
by Michael Kearns and Satinder Singh.
In Advances in Neural Information Processing Systems 11 (NIPS), pages 996-1002, 1999.
pdf
Experimental Results on Learning Stochastic Memoryless Policies for Partially Observable Markov Decision Processes
by John K. Williams and Satinder Singh.
In Advances in Neural Information Processing Systems 11 (NIPS), pages 1073-1079, 1999.
pdf
Optimizing admission control while ensuring quality of service in multimedia networks via reinforcement learning
by Timothy Brown, Hong Tong, and Satinder Singh.
In Advances in Neural Information Processing Systems 11 (NIPS), pages 982-988, 1999.
pdf
Improved switching among temporally abstract actions
by Richard Sutton, Satinder Singh, Doina Precup and Balaraman Ravindran.
In Advances in Neural Information Processing Systems 11 (NIPS), pages 1066-1072, 1999.
pdf
Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes
by John Loch and Satinder Singh.
In Proceedings of the Fifteenth International Conference on Machine Learning (ICML), pages 323-331, 1998.
pdf
Near-Optimal Reinforcement Learning in Polynomial Time
by Michael Kearns and Satinder Singh.
In Proceedings of the Fifteenth International Conference on Machine Learning (ICML), pages 260-268, 1998.
pdf
Intra-Option Learning about Temporally Abstract Actions
by Richard Sutton, Doina Precup and Satinder Singh.
In Proceedings of the Fifteenth International Conference on Machine Learning (ICML), pages 556-564, 1998.
pdf
Theoretical Results on Reinforcement Learning with Temporally Abstract Behaviors
by Doina Precup, Richard Sutton, and Satinder Singh.
In Proceedings of the 10th European Conference on Machine Learning (ECML), pages 382-393, 1998.
pdf
Hierarchical Optimal Control of MDPs
by Amy McGovern, Doina Precup, Balaraman Ravindran, Satinder Singh and Richard Sutton.
In Proceedings of the Tenth Yale Workshop on Adaptive and Learning Systems, 1998.
pdf
How to Dynamically Merge Markov Decision Processes
by Satinder Singh and David Cohn.
In Advances in Neural Information Processing Systems 10 (NIPS), pages 1057-1063, 1998.
pdf
Analytical Mean Squared Error Curves for Temporal Difference Learning
by Satinder Singh and Peter Dayan.
In Machine Learning Journal, Volume 32, Issue 1, pages 5-40, 1998.
pdf.
A shorter version appears in the NIPS 9 Proceedings
Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems
by Satinder Singh and Dimitri Bertsekas.
In Advances in Neural Information Processing Systems 9 (NIPS), pages 974-980, 1997.
pdf
Planning with Closed-Loop Macro Actions
by Doina Precup, Richard Sutton and Satinder Singh.
In Proceedings of AAAI Fall Symposium on Model-directed Autonomous Systems, 1997.
pdf
Predicting Lifetimes in Dynamically Allocated Memory
by David Cohn and Satinder Singh.
In Advances in Neural Information Processing Systems 9 (NIPS), pages 939-945, 1997.
pdf
Analytical Mean Squared Error Curves for Temporal Difference Learning
by Satinder Singh and Peter Dayan.
In Advances in Neural Information Processing Systems 9 (NIPS), pages 1054-1060, 1997.
pdf
Reinforcement Learning with Replacing Eligibility Traces
by Satinder Singh and Richard Sutton.
In Machine Learning journal, Volume 22, Issue 1, pages 123-158, 1996.
pdf abstract
Learning Curve Bounds for Markov Decision Processes with Undiscounted Rewards
by Lawrence Saul and Satinder Singh.
In Proceedings of 9th Annual Conference on Computational Learning Theory (COLT), pages 147-156, 1996.
pdf
Long Term Potentiation, Navigation and Dynamic Programming
by Peter Dayan and Satinder Singh.
In Proceedings of Computation and Neural Systems Meeting (CNS), 1996.
pdf
Improving Policies Without Measuring Merits
by Peter Dayan and Satinder Singh.
In Advances in Neural Information Processing Systems 8 (NIPS), pages 1059-1065, 1996.
pdf
Markov Decision Processes in Large State Spaces
by Lawrence Saul and Satinder Singh.
In Proceedings of 8th Annual Workshop on Computational Learning Theory (COLT), pages 281-288, 1995.
pdf
Learning to Act using Real-Time Dynamic Programming
by Andrew Barto, Steve Bradtke and Satinder Singh.
In Artificial Intelligence, Volume 72, pages 81-138, 1995.
pdf
On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
by Tommi Jaakkola, Michael Jordan and Satinder Singh.
In Neural Computation, Volume 6, Number 6, pages 1185-1201, 1994.
pdf
Reinforcement Learning With Soft State Aggregation
by Satinder Singh, Tommi Jaakkola and Michael Jordan.
In Advances in Neural Information Processing Systems 7 (NIPS), pages 361-368, 1995.
pdf
Stochastic Convergence of Iterative DP Algorithms
by Tommi Jaakkola, Michael Jordan and Satinder Singh.
In Advances in Neural Information Processing Systems 6 (NIPS), pages 703-710, 1994.
pdf
Reinforcement Learning Algorithm for Partially Observable Markov Problems
by Tommi Jaakkola, Satinder Singh and Michael Jordan.
In Advances in Neural Information Processing Systems 7 (NIPS), pages 345-352, 1995.
pdf
Reinforcement Learning Algorithms for Average-Payoff Markovian Decision Processes
by Satinder Singh.
In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI), pages 700-705, 1994.
pdf
Learning Without State-Estimation in Partially Observable Markovian Decision Processes
by Satinder Singh, Tommi Jaakkola and Michael Jordan.
In Machine Learning: Proceedings of the Eleventh International Conference (ICML), pages 284-292, 1994.
pdf
On Step-Size and Bias in Temporal-Difference Learning
by Richard Sutton and Satinder Singh.
In Proceedings of Eighth Yale Workshop on Adaptive and Learning Systems, 1994.
pdf abstract
Robust Reinforcement Learning in Motion Planning
by Satinder Singh, Andrew Barto, Roderic Grupen, and Christopher Connolly.
In Advances in Neural Information Processing Systems 6 (NIPS), pages 655-662, 1994.
pdf
An Upper Bound on the Loss from Approximate Optimal-Value Functions
by Satinder Singh and Richard Yee.
In Machine Learning, Volume 16, Issue 3, pages 227-233, 1994.
pdf
Distributed Representation of Limb Motor Programs in Arrays of Adjustable Pattern Generators
by Neil Berthier, Satinder Singh, Andrew Barto, and Jim Houk.
In Journal of Cognitive Neuroscience, vol 5:1, pages 56-78, 1993.
pdf
Reinforcement Learning with a Hierarchy of Abstract Models
by Satinder Singh.
In Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI), pages 202-207, 1992.
pdf
A Cortico-Cerebellar model that learns to generate distributed motor commands to control a kinematic arm
by Satinder Singh, Neil Berthier, Andrew Barto, and Jim Houk.
In Advances in Neural Information Processing Systems 4 (NIPS), pages 611-618, 1992.
pdf
Scaling Reinforcement Learning Algorithms by Learning Variable Temporal Resolution Models
by Satinder Singh.
In Proceedings of the Ninth Machine Learning Conference, pages 406-415, 1992.
pdf
Transfer of Learning by Composing Solutions of Elemental Sequential Tasks
by Satinder Singh.
In Machine Learning Journal, Volume 8, Issue 3, pages 323-339, 1992.
pdf
The Efficient Learning of Multiple Task Sequences
by Satinder Singh.
In Advances in Neural Information Processing Systems 4 (NIPS), pages 251-258, 1992.
pdf
Transfer of Learning Across Compositions of Sequential Tasks
by Satinder Singh.
In Machine Learning: Proceedings of the Eighth International Workshop, pages 348-352, 1991.
pdf
Reinforcement Learning and Dynamic Programming
by Andrew Barto and Satinder Singh.
In Proceedings of Sixth Yale Workshop on Adaptive and Learning Systems, 1990.

Magazine Articles, Book Chapters and Others

Value-Driven Procurement in the TAC Supply Chain Game
by Christopher Kiekintveld, Michael P. Wellman, Satinder Singh, and Vishal Soni.
In SIGecom Exchanges, Volume 4.3, pages 9-19, 2004.
pdf
Reinforcement Learning for 3 vs. 2 Keepaway
by Peter Stone, Richard Sutton and Satinder Singh.
In RoboCup-2000: Robot Soccer World Cup IV, P. Stone, T. Balch, and G. Kraetzschmar, Eds., Springer Verlag.
pdf.
An earlier version appeared in the Proceedings of the RoboCup-2000 Workshop, Melbourne, Australia
Soft Dynamic Programming Algorithms: Convergence Proofs
by Satinder Singh.
In Proceedings of Workshop on Computational Learning and Natural Learning (CLNL), Provincetown, Massachusetts, 1993.
pdf
On the Computational Economics of Reinforcement Learning
by Andrew Barto and Satinder Singh.
In Proceedings of Connectionist Summer School, 1990.
pdf
An Adaptive Sensorimotor Network Inspired by the Physiology of the Cerebellum
by Jim Houk, Satinder Singh, Charles Fisher, and Andrew Barto.
Appears as a chapter in WT Miller, RS Sutton, and PJ Werbos, editors, Neural Networks for Control, pages 301-348, 1989.
My one paper in a non-technical journal!

How to Make Software Agents Do the Right Thing: An Introduction to Reinforcement Learning
by Satinder Singh, Peter Norvig and David Cohn.
In Dr. Dobb's Journal, March issue, 1997.

pdf [html version]
An Almost Tutorial on RL (extracted from my Thesis)

An (Almost) Tutorial on Reinforcement Learning
gzipped postscript. Extracted from my 1993 thesis.
Going Nowhere Papers

Asynchronous Modified Policy Iteration with Single-sided Updates
by Satinder Singh and Vijay Gullapalli.
Working Paper, 1993.
pdf