CSE logo University of Washington Department of Computer Science & Engineering
 590HK - Learning to Reason
  CSE Home  About Us    Search    Contact Info 

590HK - Learning to Reason

590 HK - Knowledge Representation and Reasoning - Spring 2001

Topic: Learning to Reason

When: Tuesdays 12:15-1:20 pm.  Note: seminar does not meet on Thursday.
Credits: 1 to 3
Where: LOW 220
Organizer: Henry Kautz <kautz@cs.washington.edu>
 

Mailing List

Please sign up for the CSE 590HK mailing list by sending a message to majordomo@cs.washington.edu with the text
subscribe cse590hk
in the body. You can view an archive of the mailing list online as well.

Description

Logical or probabilistic inference is, in general, computationally intractable, yet AI systems need to be able to reason and react in real time.  In this reading course we will discuss principles for handling computational resource limits in reasoning systems.  Topics will include: The overall theme of the course will be ways to build systems that learn to reason efficiently, as opposed to requiring a human expert to write lots of heuristic rules.

Our first set of papers will come from the the recent special issue of the Artificial Intelligence journal on "Computational Tradeoffs under Bounded Resources".  Click here for the introduction to the special issue describing many of the issues we will discuss.

Anyone is welcome to attend and participate in the discussion. Credit policy:
1 credit for showing up
3 credits for leading discussion on one of the papers

I particularly welcome participation from people interested in investigating connections with application areas such as robotics (Robocup), databases, comp bio, etc.

The schedule

  Note: if you have trouble printing some of the .pdf files trying checking the "print as image" print dialog box.
12:30pm Tuesday Topic/Papers Leader Cookies
4/3 Decision-theoretic control of computational resources provides a precise framework for talking about the value of computation.  When should an agent stop reasoning and act?  The notion of continual computation projects the value of computation into the future: when should an agent spend time thinking about possible future problems, rather than just solving the tasks at hand?

E.J. Horvitz, G.F. Cooper, D.E. Heckerman, Reflection and action under scarce resources: Theoretical principles and empirical study. Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, Detroit, MI. August 1989, pp. 1121-1127. International Joint Conference on Artificial Intelligence. 

E. Horvitz. Principles and Applications of Continual Computation, Artificial Intelligence Journal, Elsevier Science, February 2001. 

Dan Grossman Stani Vlaseva
4/10 Explanation-based Learning, or "EBL", is a classic technique for speeding up reasoning by learning new operators that (potentially) shorten the length of chains of reasoning.  It was originally developed in the context of the PRODIGY planning system in the late 1980's.  More recently EBL has been applied to modern Graphplan-style planners.

Minton, S, Carbonell, J., Knoblock, C, Kuokka, D., Etzioni, O, & Gil, Y. (1989) Explanation-based learning: A problem solving perspective.  Artificial Intelligence, 40, 63-118.  (Not online: hardcopy handed out in class.)

Steven Minton (1988), Quantitative Results Concerning the Utility of Explanation-Based Learning.  Proc. AAAI-88.

Don Patterson Yongshao Ruan
4/17 NO MEETING
4/24 The propositional form of EBL applied to backtracking search is called variously "look back", "clause learning", and "memoization".  A simple strategy called "relevance based learning" turns out to be surprising effective at solving the utility problem in the propositional case.

R. Bayardo & R. Schrag. Using CSP look-back techniques to solve real-world SAT instances. in Proc. of AAAI-97. pp. 203--208. AAAI Press/The MIT Press, 1997. 

Subbarao Kambhampati.  Improving Graphplan's search with EBL and DDB techniques. In Proc. IJCAI-1999.

Background on Graphplan planners: A. Blum and M. Furst. Fast planning through plan-graph analysis. Artificial Intelligence, 90:281--300, 1997.
W. Li and Jia-chi Wu Jia-chi Wu
5/1 PAC-learning results are concerned with what can kind of functions can possibily learned in polynomial time.  Work by Khardon considers supervised learning of reactive plans.

Roni Khardon, Learning to Take Actions, R. Khardon. In Proc. National Conference on Artificial Intelligence, pages 787--792. AAAI, 1996.

Long version: R. Khardon. Learning to take actions. Machine Learning, 35(1):57-90, 1999.
Background:  Learning to Reason, R. Khardon & D. Roth, AAAI-94, 682-687.
Learning to Reason, R. Khardon & D. Roth, Journal of the ACM, 44(5):697-725, Sept. 1997. (Long version of AAAI-94 paper).
Ana Popescu Dan Grossman
5/8 Aspects of EBL and "learning to take action" appear in the work by Huang on learning search control rules.  Like EBL there is no teacher, but unlike EBL the learned rules are not (necessarily) deductive consequences of the agent's previous knowledge. The learning algorithm employed is one called inductive logic programming.

"Learning Declarative Control Rules for Constraint-Based Planning". Yi-Cheng Huang, Bart Selman, and Henry Kautz. Proc. ICML-2000 (Seventeenth International Conference on Machine Learning), Stanford, CA.

Henry Kautz David Hsu
5/15 The most successful example of using machine learning for game-playing is also the prominent success story for the area of reinforcement learning.

Gerald Tesauro. Temporal difference learning and TD-Gammon. Communications of the ACM, 58--67, March 1995.

A good short overview of reinforcement learning is Section 4 of Machine Learing Research: Four Current Directions by Thomas Dietterich.  Artificial Intelligence Magazine. Winter 1997. 97-136.
Corin Anderson Henry Kautz
5/22 This paper introduces a methodology for solving combinatorial optimization problems through the application of reinforcement learning methods. The approach can be applied in cases where several similar instances of a combinatorial optimization problem must be solved. The key idea is to analyze a set of ``training'' problem instances and learn a search control policy for solving new problem instances. The search control policy has the twin goals of finding high-quality solutions and finding them quickly. Results of applying this methodology to a NASA scheduling problem show that the learned search control policy is much more effective than the best known non-learning search procedure.

Zhang, W., Dietterich, T. G. (submitted for publication). Solving Combinatorial Optimization Tasks by Reinforcement Learning: A General Methodology Applied to Resource-Constrained Scheduling.

A popular framework for planning under uncertainty is Markov Decision Processes (MDP). Traditional algorithms for solving MDP's require time that is polynomial in the size of the state space, which is far too large for most realistic domains. Kearns et al. show that (a least theoretically) one can find a good solution to an MDP in time that is independent of the actual number of states.

Kearns, M., Mansour, Y., & Ng, A. Y. (1999). A sparse sampling algorithm for near optimal planning in large Markov decision processes. In Proceedings of the Sixteenth International Joint Conference on Articial Intelligence, pp. 1324-1331. 

Zasha Nan Li
5/29 Reinforcement learning is applied to the problem of playing Robocup Soccer.

Stone, P., Sutton, R.S. (2001). Scaling reinforcement learning toward RoboCup soccer. Proceedings of the 18th International Conference on Machine Learning.

Yongshao Ruan Corey Anderson
5/30 - in 590AI

Wed 4:30pm

EE 003

Decision-theoretic control of reasoning is applied to the problem of learning "restart" strategies for problem solvers in new work by Horvitz et al.

A Bayesian Approach to Tackling Hard Computational Problems. Eric Horvitz, Yongshao Ruan, Carla Gomes, Henry Kautz, Bart Selman, and Max Chickering. To appear, 2001.

Yongshao Ruan

 


CSE logo Department of Computer Science & Engineering
University of Washington
Box 352350
Seattle, WA  98195-2350
(206) 543-1695 voice, (206) 543-2969 FAX
[comments to kautz]