[笔记]遗传算法 genetic algorithm

来源：互联网发布：数据化决策编辑：程序博客网时间：2024/05/23 05:09

引言

遗传算法是最优化问题的常用方法，尤其是组合优化，常常可以得到近似最优解。

从物种的进化历史来看，物种的进化正是自我优化的过程。

从种群上来看，进化这一过程中，目标函数正是物种对环境的适应能力，限制条件就是严酷的自然环境，而状态表征量正是物种的遗传物质，基因，DNA，RNA等。

假设环境不变的话，每一代物种都比上一代物种更加适应这个环境，因为不能适应环境的基因已经被淘汰。这一代代的淘汰与繁衍，正是一代代的优化过程。很可惜，自然环境是变化的，是与物种相互作用的，所以恐龙灭绝了。:（

但是，作为数学假设，遗传算法的限制条件一般是不变的，当然这也可以使变化的。

遗传算法的求解步骤

1.设定求解框架

种群数量是有限的，所以要设置种群的数量 population
进化过程用种群繁殖的代数表示，同样，种群的代数也是有限值 generation number
作为基因，在繁殖的时候要交换染色体，一般设定交换概率P = 1，可以保证充分进化，较快得到优化的结果。

繁殖过程中存在变异。将变异的概率设定为P = 0.1，符合实际中变异发生率低的事实。

2.状态变量

遗传算法一般用来求解组合问题的优化结果。
用随机序列 w1,w2,...wn 表示基因排布。其中， 0<=wi<=1 。
比如， n = 9时，随机序列为 [0.23, 0.82, 0.45, 0.74,0.87,0.11, 0.56, 0.69, 0.78] ，解码方法一般是排序操作，对随机序列进行升序排列，得到序列由小到大的元素的索引值 [5, 0, 2, 6, 7, 3, 8, 1, 4]。

3. 确定目标函数

map(target_function, state_variable)

4.初始化种群

一般采用改良圈法得到。对随机序列

w1,w2,...,wu−1,wu,wu+1,...,wv−1,wv,wv+1,...wn

交换u 与v 之间的顺序，(0<u<v<n) 得到：

w1,w2,...,wu−1,wv,wv−1,...,wu+1,wu,wv+1,...wn

如果新的序列使目标函数减小，那么就覆盖原序列。直到，找到一个局部最优解。

5. 基因交叉

对种群随机两两匹配。交叉的方式有很多，这里只提及一种。
假设其中一组的两个序列分别为：
f1=[w1,w2,...wn]
f2=[w′1,w′2,...w′n]
随机选取t(0<t<=n) 作为交换节点。
那么子代child1的基因的前t 个与 f1 的前t 个相同，后n−t 个与f2 的后n−t 个相同。
同样，子代基因child2的基因的前t 个与 f2 的前t 个相同，后n−t 个与f1 的后n−t 个相同。

6. 基因变异

基因变异的方式有很多，记得生物学上DNA于RNA的变异不同。DNA的变异相对RNA的变异而言，与本体相比，差异小。而RNA变异则是大段基因的重组，所以RNA作为遗传物质的病毒，如HIV病毒，变异很快，适应环境的速度快。
这里为了快速得到最优解，可以选取类似RNA变异的算法。当然，还有许多其他方法。

已知，基因[w1,w2,...wn] 随机选取三个节点
u,v,w.(0<u<v<w<n)
将u,v 之间的基因段插入到 w 的后面

7.进化

在得到的父代以及子代和变异的的种群后，按状态序列对应的目标函数值进行排序，最后选取最大的或是最小的固定种群数个基因，进行迭代，直到达到最大迭代次数，也就是generation number。

举例

一架飞机的起飞地点的经纬度分别是(70,40)。飞机速度为1000km/h。飞机从基地出发侦查完所有目标后返回基地。侦查时间不计，最短要花费多长时间回到基地。
目标的经纬度在文件”sj.txt”中。
这个问题需要求解起点和终点之间的序列。
应用上面的步骤可以利用numpy实现近似最优化解。
注意：遗传算法每次运行得到的结果可能并不相同，应多次运行后，再根据局部最优解得到近似解。本例中，并未多次运行，实现算法只为了学习、理解。

#!/usr/bin/env python3# -*- coding: utf-8 -*-import numpy as npimport mathdef load_txt(filename):    """    return the [[x1,y1],[x2,y2]....]    """    data = []    with open(filename) as f:        for line in f:            print(line)            line = line.strip()            first, second = line.split(' ')            data.append([float(first),float(second)])    return datadef init_pop(population_num, distance_mat):    """    return array shape (population_num, 102)    """    population = np.zeros((population_num, 102))    for row in range(population_num):        init_one = list(range(1,101))        np.random.shuffle(init_one)        init_one = [0] + init_one + [101]        for t in range(102):            quit_flag = 0            for m in range(100):                for n in range(m+2, 101):                    if distance_mat[init_one[m], init_one[n]] + distance_mat[init_one[m+1],init_one[n+1]] \                        < distance_mat[init_one[m], init_one[m+1]] + distance_mat[init_one[n],init_one[n+1]]:                        temp = init_one[:]                        init_one = temp[:m+1] + temp[m+1:n+1][::-1] + temp[n+1:]                        quit_flag = 1            if quit_flag == 0:                for idx, val in enumerate(init_one):                    population[row,val] = idx                break    population /= 101.0 # encoding     return populationdef product(parent, distance_mat):    """    parent numpy array (population_num, 102)    distance_mat numpy array (102,102)    return new population    """    child = np.copy(parent)    row = parent.shape[0]    c = list(range(row))    np.random.shuffle(c)    # exchange gene    for i in range(0,row,2):        position = math.floor(np.random.random() * 102)        temp = np.copy(child[c[i], position:]) # ###### copy         child[c[i], position:] = child[c[i+1], position:]        child[c[i+1], position:] = temp    # mutation gene    change_row = np.array(range(row))[np.random.random(row)<0.1]    B = np.copy(child[change_row,:])    for i in range(len(change_row)):        mutation_pos = np.sort(np.floor(np.random.random(3)*102))        temp = np.copy(B[i,:])        B[i,:] = np.hstack((temp[:mutation_pos[0]], temp[mutation_pos[1]+1:mutation_pos[2]+1],\                           temp[mutation_pos[0]:mutation_pos[1]+1],temp[mutation_pos[2]+1:]))    family = np.vstack((parent, child, B))    family_idx = np.argsort(family, axis = 1) # decoding    row_num = family.shape[0]    length = np.zeros(row_num)    for j in range(row_num):        for i in range(101):            length[j] += distance_mat[family_idx[j,i],family_idx[j,i+1]]    idx = np.argsort(length)    population = family[idx[:row],:]    return populationif __name__ == '__main__':    sj = load_txt('sj.txt') #length is 100    start_point = [[70,40]]    sj = start_point + sj + start_point # the path is circular    sj = np.array(sj) # change to array    sj = sj * np.pi/180 # change to rads    distance_mat = np.zeros((102,102)) # initialize the distance matrix    for i in range(101):        for j in range(i+1,102):            point1 = np.array([np.cos(sj[i,0])*np.cos(sj[i,1]), np.sin(sj[i,0])*np.cos(sj[i,1]), np.sin(sj[i,1])])            point2 = np.array([np.cos(sj[j,0])*np.cos(sj[j,1]), np.sin(sj[j,0])*np.cos(sj[j,1]), np.sin(sj[j,1])])            distance_mat[i,j] = 6370 * math.acos(np.dot(point1, point2)\                /np.sqrt(np.dot(point1,point1) * np.dot(point2,point2)))    distance_mat = distance_mat + distance_mat.T # build the systematic matrix# set the algorithm parameter    population_num = 50    generation_num = 100    population = init_pop(population_num, distance_mat) # find the sectional optimize value# iterations    for _ in range(generation_num):        population = product(population, distance_mat)# population is the final group of genes# find the best gene    num = population.shape[0]    population_idx = np.argsort(population, axis=1)    length = np.zeros(num)    for j in range(num):        for i in range(101):            length[j] += distance_mat[population_idx[j,i],population_idx[j,i+1]]    idx = np.argsort(length)[0]    path = population_idx[idx]    print(length[idx])

0 0