CSE 473 Autumn 1998, Copyright, S. Tanimoto, Univ. of Washington 
Introduction to Artificial Intelligence (Oct 28, 1998)

"Genetic Search"

Genetic search: search in which a "population" of states is gradually changed according to "mutation" and "crossover" transformations with some combination of directed (by a fitness function) and random guidance.
 
 

Motivation

Large search spaces are often "irregular";
 it's easy to get stuck in local minima.

A combination of random and directed search is often effective.

"Natural selection" offers one compelling model.

We can take advantage of parallel processing.
 
 
 
 
 


 

Individuals, the Population, and Solutions

An individual may correspond to a state in a state space.
  In the usual biological model, each individual is represented as a string, as in a DNA molecule:  e.g., ACGTCCGACAGTT
(In Lisp, we can use a list.)

The population is a collection of individuals currently "alive" (under consideration).

Any solution to the problem should be in the state space.

We want the final population to contain a solution.
 

Mutations

A mutation is a (small) transformation to an individual.  It is analogous to a move in a state space.

Mutations are usually effected with some degree of randomness.

Examples:
Start with ACGTCCGACAGTT
1. random selection of location in the string, followed by replacement of that symbol by a randomly chosen, different one from the alphabet.
ACTTCCGACAGTT (third letter changed to T)

2. random selection of two locations in the string, followed by an exchange of the two selected elements.
AAGTCCGCCAGTT (second and eighth elements traded places)

3. random deletion of one element.
ACGCCGACAGTT  (fourth element, T, deleted)
4. random insertion of a random element.
 ACGTCACGACAGTT  (A inserted after 5th element).
 
 

Crossovers
 

Two individuals are selected.

Each is split into two parts

The first part of one is combined with the second part of the other, producing a new individual.

Example: Start with
ACGTCCGACAGTT
TTTTTTTTTTTTT

Cut both after the 5th character
ACGTC          CGACAGTT
TTTTT           TTTTTTTT
Combine 1st of 1st with 2nd of 2nd
ACGTCTTTTTTTT
 
 
 

Fitness Functions
 

A fitness function is a function that maps from the set of individuals to the real numbers and that is used to determine which individuals survive from one generation to the next.

Fitness functions serve several purposes:
1. to help discriminate between individuals that correspond to "feasible solutions" and those that correspond to infeasible solutions.
  For example, in the travelling salesman problem, we might interpret "feasible solution" to mean a sequence of cities corresponding to an actual tour (visiting all cities, repeating none).
  Or in the painted squares puzzle, a feasible solution might be thought of as one in which all four pieces have been placed, but not necessarily with matching sides.

2. to indicate, within the set of feasible solutions, which ones seem best or better than the others,
and within the set of infeasible solutions, which ones seem closest to being feasible.

3. to lend some direction to what otherwise might be a completely random search.

Fitness functions can actually hamper the search if (a) they are overly influenced by irrelevant criteria, or (b) influenced in the wrong way by relevant criteria, or (c) if they are not appropriately counterbalanced by enough randomness in the mutation and crossover operations.

Fitness functions tend to drive the search (and the whole population) towards minima in the space.

The randomness of mutation and crossover operations keep the search moving and can help escape from local, non-global minima.

These two kinds of mechanisms must be in balance for genetic search to work well.
 
 

Termination Criteria
 

In some problems such as puzzles, there are easily recognizable solutions.

At the end of each "generation" of new individuals, if any of them satisfies a test for being a solution to the problem, then we can stop there.

In other problems, the solution is defined to be the "best" among a set of candidates.
For example, in the TSP, the candidates are tours, and the best one is the one with the smallest total distance.
It is sometimes easy to identify a solution to such a problem, particularly if there is a method for computing tight lower bounds on the cost of a solution.
However, often no such method is available.

Some termination criteria are these...
1. Stop after a fixed number of generations, or a fixed amount of computer time has been spent.
2. Stop as soon as the population changes from one generation to the next get small enough.
(And if the population fails to change, then stop immediately.)

In addition to termination criteria, there are other policies to set and adjust:

1. Size of the population.
It should be big enough to accommodate a diversity of individuals.  Otherwise, crossovers will not lead to new individuals.
It should be small enough to keep the computational load manageable.

2. Balance of mutations and crossovers.
Crossovers seem to offer best results when the good parts of two existing individuals can be combined into something better.
Mutations seem to work best when something random needs to be done to produce an individual that is different enough from its "peers" to help the whole population get out if its local minimum.
 

Genetic search is related to "simulated annealing."
Related to these are "Boltzmann machines" and the "Metropolis algorithm".
 
 


 
 

Last modified: October 27, 1998

Steve Tanimoto

tanimoto@cs.washington.edu