GA--sentence match

来源：互联网发布：qq农场辅助软件编辑：程序博客网时间：2024/06/05 10:20

Source of original code

Author: Morvan
Visualize Genetic Algorithm to match the target phrase.
Visit tutorial website for more: https://morvanzhou.github.io/tutorials/

The result

Absolutely, it’s satisfying.
With only 24 generations(Maybe a little more or less, randomly.), the sentence matches perfectly.

Algorithm & Code interpretation

the code component

The whole code is consists of three parts.
Some const variables , the class GA and the main function.

some const variables

TARGET_PHRASE = 'I love you!'       # target DNAPOP_SIZE = 300                      # population sizeCROSS_RATE = 0.4                    # mating probability (DNA crossover)MUTATION_RATE = 0.01                # mutation probabilityN_GENERATIONS = 1000                # max generationDNA_SIZE = len(TARGET_PHRASE)       # the size of DNA # target ASCII indices of the target string, convert string to numberTARGET_ASCII = np.fromstring(TARGET_PHRASE, dtype=np.uint8) ASCII_BOUND = [32, 126]             # the bounds of the ASCII charactors

main function

Firstly, we constuct a GA object ,named ga with the const variables above.

ga = GA(DNA_size=DNA_SIZE, DNA_bound=ASCII_BOUND, cross_rate=CROSS_RATE,    mutation_rate=MUTATION_RATE, pop_size=POP_SIZE)

Then, execute a for loop in the max generation

for generation in range(N_GENERATIONS):    fitness = ga.get_fitness()                      # get the fitness of the current population    best_DNA = ga.pop[np.argmax(fitness)]           # best matching DNA    best_phrase = ga.translateDNA(best_DNA)         # translate the DNA to string we can read    print('Gen', generation, ': ', best_phrase)     # print the string    if best_phrase == TARGET_PHRASE:                # as if the string matches perfectly, break the loop        break    ga.evolve()                                     # if not, evolve to the next generation

GA class

function init()

def __init__(self, DNA_size, DNA_bound, cross_rate, mutation_rate, pop_size):    self.DNA_size = DNA_size        # the size of DNA    DNA_bound[1] += 1    self.DNA_bound = DNA_bound      # ASCII Bound    self.cross_rate = cross_rate    self.mutate_rate = mutation_rate       self.pop_size = pop_size    # population    # define the population of first generation    self.pop = np.random.randint(*DNA_bound, size=(pop_size, DNA_size)).astype(np.int8)  # int8 for convert to ASCII

To initialize the class with const variables above

function translateDNA()

def translateDNA(self, DNA):                     # convert to readable string    return DNA.tostring().decode('ascii')

Such as translate
[126, 34, 45, 125, 46, 101, 32, 119, 38, 126, 33] to ‘~”-}.e w&~!’

function get_fitness()

def get_fitness(self):                      # count how many character matches    match_count = (self.pop == TARGET_ASCII).sum(axis=1)    return match_count

To calculate the fitness of every DNA in current generation.
List of array convert to list of int , int means match numbers

function select()

def select(self):    fitness = self.get_fitness() + 1e-4     # add a small amount to avoid all zero fitness    idx = np.random.choice(np.arange(self.pop_size), size=self.pop_size, replace=True, p=fitness/fitness.sum())    return self.pop[idx]

To select better fit DNA
Every new array element depends on the probabilities of old array elements
(probability = element.fitness/fitness.sum())
The probability is higher, the more it will appear in the new array

function crossover()

def crossover(self, parent, pop):    if np.random.rand() < self.cross_rate:        i_ = np.random.randint(0, self.pop_size, size=1)                        # select another individual from pop        cross_points = np.random.randint(0, 2, self.DNA_size).astype(np.bool)   # choose crossover points        parent[cross_points] = pop[i_, cross_points]                            # mating and produce one child    return parent

To cross this parent with another random parent in current generation

function mutate()

def mutate(self, child):    for point in range(self.DNA_size):        if np.random.rand() < self.mutate_rate:            child[point] = np.random.randint(*self.DNA_bound)  # choose a random ASCII index    return child

Traverse all the elements in a DNA
Randomly convert a element to a random ASCII charactor index

function evolve()

def evolve(self):    pop = self.select()    pop_copy = pop.copy()    for parent in pop:  # for every parent        child = self.crossover(parent, pop_copy)        child = self.mutate(child)        parent[:] = child    self.pop = pop

To evolve into the next generation.
Firstly, select the better fitting ones
Then, creat a copy of current generation
Then, traverse everyone in current generation,
cross it with another random one in current generation
mutate randomly.
Finally, upgrade to the next generation

Complete code

import numpy as npTARGET_PHRASE = 'I love you!'       # target DNAPOP_SIZE = 300                                # population sizeCROSS_RATE = 0.4                          # mating probability (DNA crossover)MUTATION_RATE = 0.01                   # mutation probabilityN_GENERATIONS = 1000DNA_SIZE = len(TARGET_PHRASE)TARGET_ASCII = np.fromstring(TARGET_PHRASE, dtype=np.uint8)  # convert string to numberASCII_BOUND = [32, 126]class GA(object):    def __init__(self, DNA_size, DNA_bound, cross_rate, mutation_rate, pop_size):        self.DNA_size = DNA_size        DNA_bound[1] += 1        self.DNA_bound = DNA_bound        self.cross_rate = cross_rate        self.mutate_rate = mutation_rate        self.pop_size = pop_size        self.pop = np.random.randint(*DNA_bound, size=(pop_size, DNA_size)).astype(np.int8)  # int8 for convert to ASCII    def translateDNA(self, DNA):                 # convert to readable string        return DNA.tostring().decode('ascii')    def get_fitness(self):                      # count how many character matches        match_count = (self.pop == TARGET_ASCII).sum(axis=1)        return match_count    def select(self):        fitness = self.get_fitness() + 1e-4     # add a small amount to avoid all zero fitness        idx = np.random.choice(np.arange(self.pop_size), size=self.pop_size, replace=True, p=fitness/fitness.sum())        return self.pop[idx]    def crossover(self, parent, pop):        if np.random.rand() < self.cross_rate:            i_ = np.random.randint(0, self.pop_size, size=1)                        # select another individual from pop            cross_points = np.random.randint(0, 2, self.DNA_size).astype(np.bool)   # choose crossover points            parent[cross_points] = pop[i_, cross_points]                            # mating and produce one child        return parent    def mutate(self, child):        for point in range(self.DNA_size):            if np.random.rand() < self.mutate_rate:                child[point] = np.random.randint(*self.DNA_bound)  # choose a random ASCII index        return child    def evolve(self):        pop = self.select()        pop_copy = pop.copy()        for parent in pop:  # for every parent            child = self.crossover(parent, pop_copy)            child = self.mutate(child)            parent[:] = child        self.pop = popif __name__ == '__main__':    ga = GA(DNA_size=DNA_SIZE, DNA_bound=ASCII_BOUND, cross_rate=CROSS_RATE,            mutation_rate=MUTATION_RATE, pop_size=POP_SIZE)    for generation in range(N_GENERATIONS):        fitness = ga.get_fitness()        best_DNA = ga.pop[np.argmax(fitness)]        best_phrase = ga.translateDNA(best_DNA)        print('Gen', generation, ': ', best_phrase)        if best_phrase == TARGET_PHRASE:            break        ga.evolve()

阅读全文

0 0