Genetic Algorithms Overview
Genetic algorithms operate on set of possible solutions. Because of random nature of the genetic algorithm, solutions found by the algorithm can be good, poor or infeasible [defective, erroneous] so there should be a way to specify how good that solution is. This is done by assigning fitness value [or just fitness] to the solution. Chromosomes represent solutions within the genetic algorithm. Two basic component of chromosome are coded solution and its fitness value.
Chromosomes are grouped into population [set of solutions] on which the genetic algorithm operates. In each step [generation] genetic algorithm selects chromosomes form population [selection is usually based on fitness value of chromosome] and combines them to produce new chromosomes [offspring]. These offspring chromosomes form new population [or replace some of the chromosomes in the existing population] in hope that new population will be better then previous. Populations keep track of the worst and the best chromosomes and stores additional statistical information which can be used by genetic algorithm to determine stop criteria.
Chromosome in some way stores solution which it represents. This is called representation [encoding] of the solution. There are number of probabilities way to represent solution in such way that it is suitable for genetic algorithm [binary, real number, vector of real number, permutations, and so on] and they are mostly depend on nature of problem.
Diagram - Chromosome representations [for maximization of single-parameter function]
Diagram - Chromosome representations [Traveling Salesman Problem]
Genetic algorithms produce new chromosomes [solutions] by combining existing chromosomes. This operation is called crossover. Crossover operation takes parts of solution encodings from two existing chromosomes [parents] and combines them into single solution [new chromosome]. This operation depends on chromosome representation and can be very complicated. Although general crossover operations are easy to implement, building specialized crossover operation for specific problem can greatly improve performance of the genetic algorithm.
Diagram - Crossover operation examples
Before genetic algorithm finishes production of new chromosome, after it performs crossover operation, it performs mutation operation. Mutation operation makes random but small changes to encoded solution. This prevents falling of all solution into local optimum and extends search space of the algorithm. Mutation as well as crossover operation depends on chosen representation.
Diagram - Mutation operation examples [swap mutation is performed over the first and overt the second invert mutation is performed]
Crossover and mutation operations are not always performed when producing new chromosome. If crossover is not performed the genetic algorithm produce new chromosome by copying on of the parents. Rates of crossover and mutation operations are called crossover probability and mutation probability. The crossover probability is usually high [about 80%] and mutation probability should be relatively low [about 3%, but for some problems higher probability gives better results]. Higher mutation probability can turn the genetic algorithm in a random search algorithm.
The last operations defined by genetic algorithms used to manipulate the chromosomes are fitness operation and fitness comparator. Fitness operation measures quality of produced solution [chromosome]. This operation is specific to problem and it actually tells genetic algorithm what to optimize. Fitness comparators [as their name suggests] are used to compare chromosomes based their fitness. Basically fitness comparator tells genetic algorithm whether it should minimize or maximize fitness values of chromosomes.
Choosing parents for production of new chromosomes from population is called selection. Selection can be based on many different criterias but it is usually based on fitness value. The idea behind this is to select the best chromosomes for parents in hope that combining them will produce better offspring chromosomes. But selecting only the best chromosome has one major disadvantage, all chromosomes in population will start to look the same very quickly. This narrows exploration space, pushes genetic algorithm into local optimum, and prevents genetic algorithm to find possibly better solutions that reside in inaccessible area of exploration space. To preserver diversity of chromosomes [and wider exploration space] within the population, selection operations usually introduce factor of randomness in the selection process. Some implementations of selection operation are entirely random.
One problem may occur with selection operations that are based on fitness values. When there is a chromosome with dominant fitness value, it will be selected most of the times thus it will cause problem similar to previous. To prevent this fitness values can be scaled [transformed] to lower the difference between dominant chromosome(s) and the rest of population [this allows other chromosomes to be selected]. There are many ways to transformation fitness value. Usually they are implemented by applying mathematical transformation to fitness value, but there other methods like ranking based scaling that use rank [based on raw fitness values of chromosomes] of chromosome as scaled fitness value.
Diagram - Scaling fitness value [shows selection probability of chromosomes]
Coupling operation defines how the selected chromosomes [parents] are paired for mating [mating is done by performing crossover operation over paired parents and applying mutation operation to newly produced chromosome]. This operation gives better control over the production of new chromosomes but can it be skipped and new chromosomes can be produced as the selection operation selects parents from the population.
Diagram - Coupling operation flowchart
Diagram - Selection operation without coupling operation flowchart
Next step performed by genetic algorithm is introduction of new chromosomes into population. Offspring chromosomes can form new population and replace the entire [previous] population [non-overlapping population] or they can replace only few chromosomes in the current population [overlapping population]. For overlapping populations, replacement operation defines which chromosomes are removed [usually the worst chromosomes in current population] from the current population and which offspring chromosomes are inserted. By replacing chromosomes there is a chance that the genetic algorithm will loose the best chromosome[s] [found so far]. To prevent this concept of elitism is introduced into genetic algorithms. Elitism guarantees that the best chromosome[s] from the current generation is going to survive to next generation.
Algorithm performs previously described steps one by one in sequence and when they have been performed it is said that one generation has passed. At the end of each generation genetic algorithm checks stop criteria. Because the nature of the genetic algorithms, most of the times it is not clear when the algorithm should stop, so criteria is usually based on statistical information such as number of generation, fitness value of the best chromosome or average fitness value of chromosomes in population, duration of evolution process...
Diagram - Flowchart of a genetic algorithm [overlapping population coupling operation]
Diagram - Flowchart of a genetic algorithm [overlapping population without coupling operation]
Diagram - Flowchart of a genetic algorithm [non-overlapping population with coupling operation]
Links