 # Simple Evolutionary Genetic Optimisation code in Python

A step-through of a simple, minimalist framework for evolutionary genetic optimisation in Python, with examples.

The full, uninterrupted code is available in this gist.

For an introduction to genetic optimisation, see my previous post.

## Summary

Evolutionary Genetic Optimisation is an algorithm that creates and breeds generations of potential solutions, with the idea that each generation becomes on average better than the last. It borrows ideas from the biological process of evolution, where the more ‘fit’ solutions in a generation are more likely to be selected as parents of a child solution that is a mix of the parents.

## Genes and Genomes

First, we need a way of representing solutions and sets of solutions. We’ll borrow some terms from biology:

• Genome: How solutions are represented. We’ll use a list.
• Gene: The smallest parts that make up a solution; these are the possible elements that can appear in the list.
• Individual: Used interchangeably with ‘genome’ and ‘solution’.
• Population: A set of individuals. We’ll store these in a list (so it’s a list of lists).
• Fitness Function: A function that takes an individual and returns a numeric value representing how ‘good’ the solution is.

## Example Optimisation Problem

• Genome: List of 10 numbers
• Genes: Digits 0–9
• Fitness Function: The sum of the numbers in the list

We’ll cover the remaining parameters later and build up the functions that will eventually be used by the ‘optimise’ function.

```python
genes = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
genome_length = 10
fitness_func = lambda x: sum(x)
```
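To make the representation concrete, here’s a quick standalone sanity check of these parameters (the `worst` and `best` genomes are my own illustrative examples, not part of the framework):

```python
genes = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
genome_length = 10
fitness_func = lambda x: sum(x)

worst = [0] * genome_length  # all-zero genome: lowest possible fitness
best = [9] * genome_length   # all-nine genome: highest possible fitness

print(fitness_func(worst))  # 0
print(fitness_func(best))   # 90
```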

## Initialisation

We create an initial population with randomised genes, and a function that sorts individuals by their fitnesses. Note that these initialisation lines will later be placed inside the `optimise` function.

```python
import random

def population_fitnesses(population, fitness_func):
    # Takes a population list, returns the population (sorted by fitness, best first) and fitness lists
    fitnesses = [fitness_func(i) for i in population]
    fitnesses, population = zip(*sorted(zip(fitnesses, population), reverse=True))
    return list(population), list(fitnesses)

# Initialisation and ordering by fitness
population = [random.choices(genes, k=genome_length) for _ in range(population_size)]
population, fitnesses = population_fitnesses(population, fitness_func)
```
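As a quick check, here’s the sorting function on a tiny hand-made population (a standalone sketch re-stating the function from above, using `sum` as the fitness function):

```python
def population_fitnesses(population, fitness_func):
    # Takes a population list, returns the population (sorted by fitness, best first) and fitness lists
    fitnesses = [fitness_func(i) for i in population]
    fitnesses, population = zip(*sorted(zip(fitnesses, population), reverse=True))
    return list(population), list(fitnesses)

population = [[1, 1, 1], [9, 9, 9], [5, 5, 5]]
population, fitnesses = population_fitnesses(population, sum)
print(fitnesses)   # [27, 15, 3]
print(population)  # [[9, 9, 9], [5, 5, 5], [1, 1, 1]]
```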

## Create Child

We’ll define a function to handle the selection, crossover and mutation processes that go into creating new children for the next generation.

```python
def create_child(population, genes, tournament_size, mutation_rate):
    # Breed two parents, returning a (possibly mutated) offspring
    parent1 = select(population, tournament_size)
    parent2 = select(population, tournament_size)
    child = crossover(parent1, parent2)
    child = mutate(child, mutation_rate, genes)
    return child
```

## Select

We’ll use tournament selection to pick which genomes in a population become parents. Tournament selection randomly chooses a subset (of size `tournament_size`) of the population, then returns the fittest individual in it. Since our population is kept sorted best-first, the winner is simply the sampled individual with the lowest index.

```python
def select(population, tournament_size):
    # Population is sorted best-first, so the lowest sampled index is the tournament winner
    assert tournament_size <= len(population)
    tournament_winner_index = min(random.sample(range(len(population)), tournament_size))
    parent = population[tournament_winner_index]
    return parent.copy()
```
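A standalone demo of the lowest-index trick (re-stating the function from above). When the tournament spans the whole population, the fittest individual always wins:

```python
import random

def select(population, tournament_size):
    # Population is sorted best-first, so the lowest sampled index is the tournament winner
    assert tournament_size <= len(population)
    tournament_winner_index = min(random.sample(range(len(population)), tournament_size))
    return population[tournament_winner_index].copy()

population = [[9, 9], [5, 5], [1, 1]]  # already sorted best-first
# A tournament over the whole population must sample index 0, the fittest
print(select(population, 3))  # [9, 9]
```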

## Crossover

We’ll use single-point crossover: pick an index in the genome and splice the second parent’s tail onto the first.

```python
def crossover(parent1, parent2):
    # Returns the result of single-point crossover of the parents
    crossover_point = random.choice(range(len(parent1)))
    child = parent1.copy()
    child[crossover_point:] = parent2[crossover_point:]
    return child
```
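To see the splice in action, here’s a standalone sketch crossing two easily distinguished parents (all 1s and all 2s); the child is always a run of 1s followed by a run of 2s:

```python
import random

def crossover(parent1, parent2):
    # Copy parent1, then overwrite the tail from a random point with parent2's tail
    crossover_point = random.choice(range(len(parent1)))
    child = parent1.copy()
    child[crossover_point:] = parent2[crossover_point:]
    return child

random.seed(0)  # make the demo repeatable
child = crossover([1, 1, 1, 1, 1], [2, 2, 2, 2, 2])
print(child)  # a prefix of 1s followed by a suffix of 2s, e.g. [1, 1, 1, 2, 2]
```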

## Mutate

Mutation randomly replaces genes in a child’s genome with other possible genes. The idea is to introduce novel variation into the population.

```python
def mutate(child, mutation_rate, genes):
    # Each gene independently has a mutation_rate chance of being replaced with a random gene
    for gene in range(len(child)):
        if random.random() < mutation_rate:
            child[gene] = random.choice(genes)
    return child
```
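The two extremes of `mutation_rate` make the behaviour easy to check (a standalone sketch re-stating the function): a rate of 0 leaves the child untouched, while a rate of 1 redraws every gene from the gene pool.

```python
import random

def mutate(child, mutation_rate, genes):
    # Each gene independently has a mutation_rate chance of being replaced with a random gene
    for gene in range(len(child)):
        if random.random() < mutation_rate:
            child[gene] = random.choice(genes)
    return child

genes = [0, 1, 2]
print(mutate([9, 9, 9], 0.0, genes))  # [9, 9, 9] — nothing changes
print(mutate([9, 9, 9], 1.0, genes))  # every gene redrawn from the pool
```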

## Optimise

Now we can combine our functions into an iterative process that creates and evolves generations of solutions until we hit an iteration limit or a maximum fitness. Note how we don’t replace the best individuals in the population (elitism), and we allow optional early stopping once a max fitness has been reached.

```python
def optimise(iterations, genes, genome_length, fitness_func,
             population_size, elites_size, tournament_size, mutation_rate, max_fitness=None):
    # Initialisation and ordering by fitness
    population = [random.choices(genes, k=genome_length) for _ in range(population_size)]
    population, fitnesses = population_fitnesses(population, fitness_func)

    # Iterate through generations of populations
    for generation in range(iterations):
        new_population = [p.copy() for p in population]

        # Keep the best elites, replace the rest of the population with new children
        for p in range(elites_size, population_size):
            new_population[p] = create_child(population, genes, tournament_size, mutation_rate)

        # Compute fitnesses and sort the population by them
        population, fitnesses = population_fitnesses(new_population, fitness_func)
        print(f"Gen: {generation+1}, Best: {population[0]} with Fitness: {fitnesses[0]}")

        # Check if we've passed the max fitness stopping condition
        if max_fitness is not None and fitnesses[0] >= max_fitness:
            break

    return population[0], fitnesses[0]
```

## Example: List of numbers

We now just need to change `genes`, `genome_length` and `fitness_func` to apply the framework to different problems:

```python
if __name__ == "__main__":
    iterations = 8
    population_size = 200
    elites_size = 5
    tournament_size = 10
    mutation_rate = 0.1

    genes = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
    genome_length = 10
    fitness_func = lambda x: sum(x)

    best, best_fitness = optimise(iterations, genes, genome_length, fitness_func,
                                  population_size, elites_size, tournament_size, mutation_rate)
    print(f"Best: {best} with Fitness: {best_fitness}")
```
```
Gen: 1, Best: [7, 5, 7, 9, 9, 9, 7, 8, 8, 8] with Fitness: 77
Gen: 2, Best: [9, 9, 9, 6, 9, 6, 9, 7, 8, 7] with Fitness: 79
Gen: 3, Best: [9, 9, 9, 9, 9, 9, 7, 8, 8, 8] with Fitness: 85
Gen: 4, Best: [9, 9, 9, 9, 9, 9, 9, 9, 9, 7] with Fitness: 88
Gen: 5, Best: [9, 9, 9, 9, 9, 9, 9, 9, 9, 8] with Fitness: 89
Gen: 6, Best: [9, 9, 9, 9, 9, 9, 9, 9, 9, 9] with Fitness: 90
Gen: 7, Best: [9, 9, 9, 9, 9, 9, 9, 9, 9, 9] with Fitness: 90
Gen: 8, Best: [9, 9, 9, 9, 9, 9, 9, 9, 9, 9] with Fitness: 90
Best: [9, 9, 9, 9, 9, 9, 9, 9, 9, 9] with Fitness: 90
```

After 6 generations the solution converges to the maximum of 90. If we knew in advance that the best fitness (or the best we’d be satisfied with) was 90, but not what the solution looked like, we could pass 90 as the extra `max_fitness` parameter to `optimise`.

Let’s try a more interesting example. We’ll pretend we’re cracking a password, measuring the distance between the Unicode code points of the fixed password string and our attempt. As we want the distance to be 0, we’ll define `fitness_func` as the negative distance so that we’re still optimising towards a maximum.

```python
genes = "abcdefghijklmnopqrstuvwxyz123456789"
genome_length = 7
fitness_func = lambda x: sum([-abs(ord("pass123"[i]) - ord(''.join(x)[i])) for i in range(len(x))])
```
```
Gen: 1, Best: ['p', 'g', 'r', 'y', '8', '2', '1'] with Fitness: -22
Gen: 2, Best: ['p', 'a', 'o', 'v', '6', '1', '5'] with Fitness: -15
Gen: 3, Best: ['p', 'a', 'o', 's', '1', '8', '3'] with Fitness: -10
Gen: 4, Best: ['p', 'a', 'o', 's', '1', '2', '1'] with Fitness: -6
Gen: 5, Best: ['o', 'a', 's', 't', '2', '1', '3'] with Fitness: -4
Gen: 6, Best: ['p', 'a', 'r', 's', '1', '1', '3'] with Fitness: -2
Gen: 7, Best: ['p', 'a', 's', 's', '1', '2', '1'] with Fitness: -2
Gen: 8, Best: ['p', 'a', 's', 's', '1', '2', '3'] with Fitness: 0
Best: ['p', 'a', 's', 's', '1', '2', '3'] with Fitness: 0
```

We luckily get there on the 8th iteration.
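As a quick sanity check of this fitness function (a standalone sketch, with the per-character lookup written slightly more directly than in the post): the exact password scores 0, and each code point of distance costs 1.

```python
target = "pass123"
fitness_func = lambda x: sum(-abs(ord(target[i]) - ord(x[i])) for i in range(len(x)))

print(fitness_func(list("pass123")))  # 0, an exact match
print(fitness_func(list("pass124")))  # -1, the last character is one code point off
```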

In this example we know what the best fitness should be (0), so we can set a high iteration count (like 1000) and pass `max_fitness=0` to `optimise` so it stops as soon as it finds the password.

```python
if __name__ == "__main__":
    iterations = 1000
    population_size = 200
    elites_size = 5
    tournament_size = 10
    mutation_rate = 0.1

    genes = "abcdefghijklmnopqrstuvwxyz123456789"
    genome_length = 24
    fitness_func = lambda x: sum([-abs(ord("pleasedontcrackmepass123"[i]) - ord(''.join(x)[i])) for i in range(len(x))])

    best, best_fitness = optimise(iterations, genes, genome_length, fitness_func,
                                  population_size, elites_size, tournament_size, mutation_rate, 0)
    print(f"Best: {best} with Fitness: {best_fitness}")
```

The algorithm finds it hard to begin with…

```
Gen: 1, Best: ['x', 'g', 'g', '1', 'k', 'b', '5', 'm', 's', 'o', 'c', 'g', 'b', 'n', 'p', 'j', 'f', 'q', 'e', 'y', 'o', '2', '1', 'l'] with Fitness: -239
Gen: 2, Best: ['g', 'k', 'b', 'p', 'q', 'a', 'c', 'm', 'i', 'p', 'c', 'g', 'b', 'n', 'p', 'j', 'f', 'q', 'e', 'l', 'n', '9', 'g', '8'] with Fitness: -161
Gen: 3, Best: ['n', 'o', 'g', 'e', 'z', 'x', 'g', 'w', 'r', 'j', 'f', 'u', 'b', 'n', 'p', 'j', 'f', 'w', 'j', 'q', 'u', '4', '7', '2'] with Fitness: -118
```

… but gets there on the 108th generation and stops there.

```
Gen: 107, Best: ['p', 'l', 'e', 'a', 's', 'e', 'd', 'o', 'n', 't', 'c', 'r', 'a', 'c', 'k', 'm', 'e', 'p', 'a', 's', 's', '1', '3', '3'] with Fitness: -1
Gen: 108, Best: ['p', 'l', 'e', 'a', 's', 'e', 'd', 'o', 'n', 't', 'c', 'r', 'a', 'c', 'k', 'm', 'e', 'p', 'a', 's', 's', '1', '2', '3'] with Fitness: 0
Best: ['p', 'l', 'e', 'a', 's', 'e', 'd', 'o', 'n', 't', 'c', 'r', 'a', 'c', 'k', 'm', 'e', 'p', 'a', 's', 's', '1', '2', '3'] with Fitness: 0
```

## Conclusion

Evolutionary genetic optimisation offers a nice approach for converging on a solution through survival of the fittest. I hope this code helps demonstrate how it can be done in a minimalist, clean framework.

## Appendix: Code in Full

The full, uninterrupted code is available in this gist, or below:

```python
import random

def select(population, tournament_size):
    # Population is sorted best-first, so the lowest sampled index is the tournament winner
    assert tournament_size <= len(population)
    tournament_winner_index = min(random.sample(range(len(population)), tournament_size))
    parent = population[tournament_winner_index]
    return parent.copy()

def create_child(population, genes, tournament_size, mutation_rate):
    # Breed two parents, returning a (possibly mutated) offspring
    parent1 = select(population, tournament_size)
    parent2 = select(population, tournament_size)
    child = crossover(parent1, parent2)
    child = mutate(child, mutation_rate, genes)
    return child

def crossover(parent1, parent2):
    # Returns the result of single-point crossover of the parents
    crossover_point = random.choice(range(len(parent1)))
    child = parent1.copy()
    child[crossover_point:] = parent2[crossover_point:]
    return child

def mutate(child, mutation_rate, genes):
    # Each gene independently has a mutation_rate chance of being replaced with a random gene
    for gene in range(len(child)):
        if random.random() < mutation_rate:
            child[gene] = random.choice(genes)
    return child

def population_fitnesses(population, fitness_func):
    # Takes a list of individuals, returns the population (sorted by fitness, best first) and fitness lists
    fitnesses = [fitness_func(i) for i in population]
    fitnesses, population = zip(*sorted(zip(fitnesses, population), reverse=True))
    return list(population), list(fitnesses)

def optimise(iterations, genes, genome_length, fitness_func,
             population_size, elites_size, tournament_size, mutation_rate, max_fitness=None):
    # Initialisation and ordering by fitness
    population = [random.choices(genes, k=genome_length) for _ in range(population_size)]
    population, fitnesses = population_fitnesses(population, fitness_func)

    # Iterate through generations of populations
    for generation in range(iterations):
        new_population = [p.copy() for p in population]

        # Keep the best elites, replace the rest of the population with new children
        for p in range(elites_size, population_size):
            new_population[p] = create_child(population, genes, tournament_size, mutation_rate)

        # Compute fitnesses and sort the population by them
        population, fitnesses = population_fitnesses(new_population, fitness_func)
        print(f"Gen: {generation+1}, Best: {population[0]} with Fitness: {fitnesses[0]}")

        # Check if we've passed the max fitness stopping condition
        if max_fitness is not None and fitnesses[0] >= max_fitness:
            break

    return population[0], fitnesses[0]

if __name__ == "__main__":
    iterations = 20
    population_size = 200
    elites_size = 5
    tournament_size = 10
    mutation_rate = 0.1

    genes = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
    genome_length = 10
    fitness_func = lambda x: sum(x)

    best, best_fitness = optimise(iterations, genes, genome_length, fitness_func,
                                  population_size, elites_size, tournament_size, mutation_rate)
    print(f"Best: {best} with Fitness: {best_fitness}")
```