A* Search
An informed method for finding the minimum-cost path from the initial state to a goal
The ranking function is simply
- f’(n) = g(n) + h’(n), where h’(n) is the estimated minimum cost to goal
- how does this limit the agent’s reward structure?
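
As a rough sketch of how the ranking function orders candidates, the Python below scores three hypothetical frontier nodes and pops the one with the lowest f’. The node names and the g and h’ values are made-up assumptions for illustration, and `f_est` is just a name chosen here for f’.

```python
import heapq

# Hypothetical frontier: node -> (g, h_est),
# i.e. cost of the path so far and estimated minimum cost to goal.
frontier_info = {"A": (2, 6), "B": (4, 3), "C": (5, 1)}

def f_est(g, h_est):
    """Ranking function f'(n) = g(n) + h'(n)."""
    return g + h_est

# Build a min-heap keyed on f'; the node with the smallest f' is expanded next.
frontier = [(f_est(g, h), node) for node, (g, h) in frontier_info.items()]
heapq.heapify(frontier)
best_f, best_node = heapq.heappop(frontier)
```

Here A, B, and C score 8, 7, and 6, so C is selected even though it has the highest cost so far.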
What are the implications of getting h’ wrong?
- if h’(n) = h(n) for all n
- if h’(n) ≤ h(n) for all n, with h’(n) < h(n) for some n
- if h’(n) > h(n) for some n
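
One way to explore the three cases above is to run the same search with different heuristics and compare the results. The sketch below is a minimal A* in Python on a small hypothetical graph; the graph, the heuristic tables (`H_EXACT`, `H_UNDER`, `H_OVER`), and the function name `a_star` are all assumptions made for illustration, not part of the original notes.

```python
import heapq

# Hypothetical graph: node -> {successor: step cost}.
# The cheapest path from S to G is S-B-G with cost 5; S-A-G costs 6.
GRAPH = {
    "S": {"A": 1, "B": 4},
    "A": {"G": 5},
    "B": {"G": 1},
    "G": {},
}

H_EXACT = {"S": 5, "A": 5, "B": 1, "G": 0}   # h'(n) = h(n) for all n
H_UNDER = {"S": 0, "A": 0, "B": 0, "G": 0}   # h'(n) <= h(n), strictly less for some n
H_OVER = {"S": 5, "A": 5, "B": 10, "G": 0}   # h'(B) > h(B)

def a_star(graph, start, goal, h_est):
    """Return (path cost, nodes expanded), or (None, nodes expanded) if no path."""
    frontier = [(h_est[start], 0, start)]  # entries are (f', g, node)
    best_g = {start: 0}                    # cheapest known cost to each node
    expanded = 0
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if g > best_g.get(node, float("inf")):
            continue  # stale entry: a cheaper path to this node was found later
        expanded += 1
        if node == goal:
            return g, expanded
        for succ, cost in graph[node].items():
            new_g = g + cost
            if new_g < best_g.get(succ, float("inf")):
                best_g[succ] = new_g
                heapq.heappush(frontier, (new_g + h_est[succ], new_g, succ))
    return None, expanded

exact_result = a_star(GRAPH, "S", "G", H_EXACT)
under_result = a_star(GRAPH, "S", "G", H_UNDER)
over_result = a_star(GRAPH, "S", "G", H_OVER)
```

On this particular graph, the exact heuristic finds the cost-5 path after expanding 3 nodes, the underestimate finds the same path but expands 4 nodes, and the overestimate returns the cost-6 path because B looks too expensive to try before the goal is reached through A.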