CSE 574 - Automated Planning Winter 2003

### 574 Project Page.


#### Hands-on Planning Experimentation

Your first task is to choose (at least one) planner, fire it up, try a couple of problems, and perhaps write a new problem yourself. After you have a sense for its strengths and weaknesses, fill out a [Review]

• An ADD-based MDP solver called SPUDD - both snazzy and with a nice online interface.
• The SAPA and LPG temporal planners are available here.
Other recommended planners are: Geffner's GPT (which handles uncertainty) and Blackbox.

### Project Ideas

I hope that groups (1-3 people) will form and decide on their specific projects by Tues April 22. The two default proposals are:

#### Goal Selection

Modify a classical planner (e.g. LPG, HSPr, Blackbox, Graphplan, or whatever) to solve the following optimization problem. In addition to the standard inputs (init state, action schemata, and goals), your planner will take a resource bound. Each action will have an associated cost and each goal an associated utility. We'll assume that the agent's overall utility is a linear combination of the individual goal utilities.

Your planner should output a plan which maximizes (or comes as close to optimal as it can) the overall utility achievable without exceeding the resource bound. The straightforward way of solving this is doubly exponential, so don't be sad if you don't find a fast solution which is guaranteed optimal. The ideal project would identify one or more clever heuristics and/or auxiliary data structures. Is dynamic programming (i.e. generating minimal resource-usage plans for sets of one goal, sets of two, etc.) a useful strategy? Can you use a planning-graph variant to determine bounds on goal interactions?
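To make the goal-selection layer concrete, here is a toy Python sketch of the brute-force baseline, kept separate from any real planner. Everything here is invented for illustration: `plan_cost` is a stub standing in for an actual planner call (which would compute a minimal-resource plan for a goal subset), and the goals, utilities, costs, and "shared setup" discount are made up.

```python
from itertools import combinations

# Hypothetical per-goal utilities and solo plan costs (illustrative only).
UTILITY = {"g1": 10, "g2": 6, "g3": 8}
SOLO_COST = {"g1": 5, "g2": 3, "g3": 4}

def plan_cost(subset):
    """Stub for the real planner: pretend goals share a setup action,
    so each goal beyond the first gets a 1-unit discount."""
    if not subset:
        return 0
    return sum(SOLO_COST[g] for g in subset) - (len(subset) - 1)

def best_goal_set(goals, budget):
    """Brute force over all 2^n goal subsets. The DP idea in the text
    would instead build up from size-1 subsets, reusing their costs."""
    best, best_util = frozenset(), 0
    for r in range(1, len(goals) + 1):
        for subset in combinations(goals, r):
            cost = plan_cost(subset)
            util = sum(UTILITY[g] for g in subset)
            if cost <= budget and util > best_util:
                best, best_util = frozenset(subset), util
    return best, best_util

print(best_goal_set(list(UTILITY), budget=8))
```

Even this baseline makes the structure of the problem visible: the exponential blowup is in the subset enumeration, and each subset requires a (potentially exponential) planner call, which is where the doubly-exponential worst case comes from.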

#### Incremental Replanning in an Embedded Agent

Build a world simulator (or modify one from the AIMA textbook page). The ideal simulator would let you input the world physics with a set of planning operators and an init state - the same inputs you'd give to your planner. You'll want to assume that actions have an associated reward (or cost). The simulator should record the agent's actions and its performance (accumulated reward stream). Your simulator should talk to any agent, probably via sockets, maybe using an XML-based protocol to say which actions are being executed and to report sensor data back to the agent. You should probably assume that the world and the agent are synchronous - the world will wait for the agent to command an action before it changes the world. This eliminates a host of race conditions and means you don't need to start out focusing on how fast the planner is.
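The synchronous lockstep protocol above can be sketched in a few lines of Python. This is only an assumed shape for the design: a `socketpair` stands in for a real listening socket, plain newline-delimited strings stand in for the XML messages, and the "physics" (one unit of reward per `move`) is invented.

```python
import socket
import threading

def world(conn, steps=3):
    """Toy world: blocks on the agent's action, applies made-up physics,
    then reports a percept. Lockstep, so no race conditions."""
    reward = 0
    for t in range(steps):
        act = conn.recv(1024).decode().strip()        # wait for an action
        reward += 1 if act == "move" else 0           # illustrative reward
        conn.sendall(f"percept t={t} reward={reward}\n".encode())
    conn.close()

def agent(conn, steps=3):
    """Toy agent: commands an action, then blocks for the percept."""
    percepts = []
    for _ in range(steps):
        conn.sendall(b"move\n")
        percepts.append(conn.recv(1024).decode().strip())
    conn.close()
    return percepts

w, a = socket.socketpair()        # stand-in for a real network socket
log = []
t = threading.Thread(target=lambda: log.extend(agent(a)))
t.start()
world(w)
t.join()
print(log)
```

Because each side blocks until the other responds, the planner can take arbitrarily long per step without the world drifting underneath it - which is exactly why the synchronous assumption simplifies the first version of the project.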

Now take a planner (probably GPT or SPUDD) and connect it to the world. How does it perform?

Finally, extend the planner in some interesting way. My favorite idea here would be to allow the human user (at any time step) to add one-shot goals of state achievement (with associated utilities). Can you get the agent to satisfy these? One way would be to compile them into the MDP and re-solve using value iteration (see this aaai-96 paper), but maybe you can figure out a faster way?
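One hedged sketch of the compile-into-the-MDP route: fold each one-shot goal into the state as an "achieved" bit, so the goal's utility is paid at most once, then re-solve with plain value iteration. The four-cell line world, rewards, and discount below are all invented for illustration.

```python
# Toy MDP: agent moves left/right on a 4-cell line; a one-shot goal at
# cell GOAL pays BONUS once. State = (position, goal_already_achieved).
N, GOAL, BONUS, GAMMA = 4, 3, 10.0, 0.9
STATES = [(pos, done) for pos in range(N) for done in (False, True)]
ACTIONS = (-1, +1)  # move left / move right

def step(state, a):
    """Deterministic transition with a made-up unit cost per move."""
    pos, done = state
    npos = min(max(pos + a, 0), N - 1)
    r = -1.0
    if npos == GOAL and not done:   # one-shot: bonus paid exactly once
        r += BONUS
        done = True
    return (npos, done), r

def value_iteration(iters=100):
    V = {s: 0.0 for s in STATES}
    for _ in range(iters):
        V = {s: max(r + GAMMA * V[ns]
                    for a in ACTIONS
                    for ns, r in [step(s, a)])
             for s in STATES}
    return V

V = value_iteration()
# States where the goal is still pending should be worth strictly more.
assert V[(2, False)] > V[(2, True)]
```

The drawback the project would attack is visible even here: every new one-shot goal doubles the state space and forces a full re-solve, so a faster incremental scheme is exactly what's being asked for.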

Here are some notes on other ideas.
• Analyze graphplan in terms of CSP framework; compare mutex to k-consistency; can one bound k?
• Reimplement STAN's TIM analysis and symmetry detection algorithms
• Abstraction methods motivated by McDermott grid world
• Dynamic objects with universally quantified effects in graphplan
• One-shot goals in an MDP framework
• Planning where actions can take real-valued parameters (and hence the branching factor is infinite).