Learning Python Multiprocessing
Source: Internet  Editor: 程序博客网  Date: 2024/05/27 16:41
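When we face a CPU-bound workload, we can consider solving it with multiple processes.
For example, I wrote a txt file that stores the numbers 1 to 999999 in order, written three times over. Counting how many times 999999 appears in it is compute-bound work, that is, a CPU-bound scenario, so it is a good case for trying multiprocessing.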
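The article does not show how the test files were produced. The following is a minimal sketch for generating them, assuming one number per line (the layout is a guess, chosen because the script counts matching lines and reports 3). The helper names `write_number_file` and `make_test_files` are hypothetical; only the file names 00001.txt, 00002.txt and so on come from the article.

```python
def write_number_file(path, upper=999999, repeats=3):
    """Write the integers 1..upper, one per line, `repeats` times over.

    The one-number-per-line layout is an assumption; the article does not
    show the exact file format.
    """
    with open(path, "w") as f:
        for _ in range(repeats):
            for number in range(1, upper + 1):
                f.write("%d\n" % number)


def make_test_files(count, upper=999999, repeats=3):
    """Create 00001.txt, 00002.txt, ... in the current directory.

    Returns the list of file names that were written.
    """
    paths = ["%05d.txt" % index for index in range(1, count + 1)]
    for path in paths:
        write_number_file(path, upper, repeats)
    return paths
```

Calling `make_test_files(6)` once would create the six files that the later examples read. Each file holds about three million short lines.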
The script below times a sequential run over three such files, then a run that starts one process per file.

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import time
import multiprocessing
import os


def get_count_number(file_path):
    # Count the lines in file_path that contain '999999' and print the count.
    count_num = 0
    print('the process pid is %s and the parent pid is %s : ' % (os.getpid(), os.getppid()))
    with open(file_path) as f:
        lines = f.readlines()
        for one_line in lines:
            if '999999' in one_line:
                count_num += 1
    print(count_num)


# The guard keeps child processes from re-running the script body
# on platforms that start workers by re-importing this module.
if __name__ == '__main__':
    txt_list = ['00001.txt', '00002.txt', '00003.txt']

    # Sequential run: process the files one after another.
    start_time = time.time()
    for file_path in txt_list:
        get_count_number(file_path)
    end_time = time.time()
    print('##########')
    print('the total time to run is: %s' % (end_time - start_time))

    # Parallel run: one process per file.
    multi_start_time = time.time()
    process_list = []
    for each_file in txt_list:
        each_process = multiprocessing.Process(target=get_count_number, args=(each_file,))
        process_list.append(each_process)
    for each_p in process_list:
        each_p.start()
    # Wait for every child process to finish before stopping the timer.
    for each_p in process_list:
        each_p.join()
    multi_end_time = time.time()
    print('##########')
    print('multi processing time is: %s' % (multi_end_time - multi_start_time))
With the multiple processes started, the output is as follows:
the process pid is 79873 and the parent pid is 78757 :
3
the process pid is 79873 and the parent pid is 78757 :
3
the process pid is 79873 and the parent pid is 78757 :
3
##########
the total time to run is: 1.18405485153
the process pid is 79874 and the parent pid is 79873 :
the process pid is 79875 and the parent pid is 79873 :
the process pid is 79876 and the parent pid is 79873 :
3
3
3
##########
multi processing time is: 0.489647865295
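As we can see, reading the 3 files one after another took 1.184 seconds, while the multiprocess version took 0.49 seconds in total.
When there are many files, running more processes concurrently is not always better. There is an optimal level of concurrency, and it depends on the configuration of the machine running the program. This is why we introduce the concept of a process pool, which sets how many processes run concurrently at any one time.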
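The article does not say how to find that optimal number. One common starting point for CPU-bound work, offered here as a heuristic rather than a rule from the article, is to use no more processes than the machine has CPU cores and no more than there are tasks. The helper name `pick_pool_size` below is hypothetical; `multiprocessing.cpu_count()` is the standard-library call that reports the core count.

```python
import multiprocessing


def pick_pool_size(task_count, cpu_count=None):
    """Suggest a process count for CPU-bound work.

    Heuristic (an assumption, not from the article): at most one process
    per CPU core, at most one per task, and never fewer than 1.
    """
    if cpu_count is None:
        cpu_count = multiprocessing.cpu_count()
    return max(1, min(task_count, cpu_count))
```

The value is only a first guess. Timing a few pool sizes on the target machine, as the article does next, remains the reliable way to choose.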
In this example I read 6 files with the process pool size set to 3.
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import time
import multiprocessing
import os


def get_count_number(file_path):
    # Count the lines in file_path that contain '999999' and print the count.
    count_num = 0
    print('the process pid is %s and the parent pid is %s : ' % (os.getpid(), os.getppid()))
    with open(file_path) as f:
        lines = f.readlines()
        for one_line in lines:
            if '999999' in one_line:
                count_num += 1
    print(count_num)


# The guard keeps worker processes from re-running the script body
# on platforms that start workers by re-importing this module.
if __name__ == '__main__':
    txt_list = ['00001.txt', '00002.txt', '00003.txt',
                '00004.txt', '00005.txt', '00006.txt']

    # Sequential run: process the files one after another.
    start_time = time.time()
    for file_path in txt_list:
        get_count_number(file_path)
    end_time = time.time()
    print('##########')
    print('the total time to run is: %s' % (end_time - start_time))

    # Parallel run: at most 3 worker processes share the 6 files.
    multi_start_time = time.time()
    pool = multiprocessing.Pool(processes=3)
    for each_file in txt_list:
        pool.apply_async(func=get_count_number, args=(each_file,))
    # close() must be called before join(); it stops new tasks being submitted.
    pool.close()
    pool.join()
    multi_end_time = time.time()
    print('##########')
    print('multi processing time is: %s' % (multi_end_time - multi_start_time))
The output is as follows:
the process pid is 79882 and the parent pid is 78757 :
3
the process pid is 79882 and the parent pid is 78757 :
3
the process pid is 79882 and the parent pid is 78757 :
3
the process pid is 79882 and the parent pid is 78757 :
3
the process pid is 79882 and the parent pid is 78757 :
3
the process pid is 79882 and the parent pid is 78757 :
3
##########
the total time to run is: 2.45317697525
the process pid is 79883 and the parent pid is 79882 :
the process pid is 79884 and the parent pid is 79882 :
the process pid is 79885 and the parent pid is 79882 :
3
3
3
the process pid is 79885 and the parent pid is 79882 :
the process pid is 79884 and the parent pid is 79882 :
the process pid is 79883 and the parent pid is 79882 :
3
3
3
##########
multi processing time is: 1.25623202324
As we can see, reading the 6 files sequentially took 2.45 seconds, while running them concurrently through the process pool took 1.26 seconds.
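The script above prints each count inside the worker and discards what apply_async returns. If the counts are needed back in the parent process, one option is `Pool.map`, which blocks until all tasks finish and returns the results in input order. This is a sketch of that alternative, not the article's code; the function names `count_matching_lines` and `count_in_files` are hypothetical.

```python
import multiprocessing


def count_matching_lines(file_path, needle="999999"):
    """Return the number of lines in file_path that contain needle."""
    count = 0
    with open(file_path) as f:
        for line in f:
            if needle in line:
                count += 1
    return count


def count_in_files(file_paths, processes=3):
    """Count matching lines in each file using a pool of worker processes.

    Returns the counts in the same order as file_paths.
    """
    pool = multiprocessing.Pool(processes=processes)
    try:
        # map() hands one file to each free worker and collects the results.
        return pool.map(count_matching_lines, file_paths)
    finally:
        pool.close()
        pool.join()
```

To use it, call `count_in_files(['00001.txt', '00002.txt', '00003.txt'])` from under an `if __name__ == '__main__':` guard so that worker processes can import the module safely.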
When the number of processes is set to 2, that is, pool = multiprocessing.Pool(processes=2),
the output is as follows:
the process pid is 79893 and the parent pid is 78757 :
3
the process pid is 79893 and the parent pid is 78757 :
3
the process pid is 79893 and the parent pid is 78757 :
3
the process pid is 79893 and the parent pid is 78757 :
3
the process pid is 79893 and the parent pid is 78757 :
3
the process pid is 79893 and the parent pid is 78757 :
3
##########
the total time to run is: 2.41508388519
the process pid is 79894 and the parent pid is 79893 :
the process pid is 79895 and the parent pid is 79893 :
3
3
the process pid is 79894 and the parent pid is 79893 :
the process pid is 79895 and the parent pid is 79893 :
3
3
the process pid is 79895 and the parent pid is 79893 :
the process pid is 79894 and the parent pid is 79893 :
3
3
##########
multi processing time is: 1.55285310745
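As we can see, reading the 6 files with a pool of 2 processes took 1.55 seconds, somewhat longer than with a pool of 3. This makes sense: three people working at the same time finish faster than two. However, a larger number of concurrent processes is not always better. Once there are more processes than CPU cores, the extra ones mostly add startup and scheduling overhead. For example, when reading 500 files, a pool of 100 processes is not necessarily faster than a pool of 50.
Summary: when we face a CPU-bound (compute-bound) workload, we can consider running it with multiple processes.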
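To compare the three runs on one scale, the reported timings can be turned into a speedup (sequential time divided by parallel time) and a per-process efficiency (speedup divided by the number of processes). The sketch below uses only the timings printed in the article; the function names and the `RUNS` table are illustrative.

```python
def speedup(sequential_seconds, parallel_seconds):
    """How many times faster the parallel run was than the sequential run."""
    return sequential_seconds / parallel_seconds


def efficiency(sequential_seconds, parallel_seconds, processes):
    """Speedup per process; 1.0 would mean perfect scaling."""
    return speedup(sequential_seconds, parallel_seconds) / processes


# Timings copied from the article's output:
# (label, processes, sequential seconds, parallel seconds)
RUNS = [
    ("3 files, 3 processes", 3, 1.18405485153, 0.489647865295),
    ("6 files, pool of 3", 3, 2.45317697525, 1.25623202324),
    ("6 files, pool of 2", 2, 2.41508388519, 1.55285310745),
]

for label, processes, seq, par in RUNS:
    print("%s: speedup %.2fx, efficiency %.2f"
          % (label, speedup(seq, par), efficiency(seq, par, processes)))
```

The speedups come out at roughly 2.4x, 1.95x and 1.56x. None reaches the process count, which is consistent with the point that adding processes gives diminishing returns.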