Python multithreaded programming summary (an experiment using multithreading to check whether URLs are online)
Source: Internet. Editor: 程序博客网. Time: 2024/05/19 16:49
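Here is an experiment that checks whether a set of URLs is online, comparing a multithreaded approach with a plain sequential one. The code and its output follow.
1. Without multithreading, the code is as follows: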
# encoding: utf-8
import threading
import urllib2


def online(url=''):
    """Check whether a URL is online."""
    req = urllib2.Request(url)
    try:
        response = urllib2.urlopen(req)
        if response.code == 200:
            print response.geturl(), ' this url is online'
        else:
            print 'not'
    except urllib2.URLError as e:
        # HTTPError is a subclass of URLError. In the output shown for this
        # script, HTTP errors such as 404 were reported by the first branch.
        if hasattr(e, 'reason'):
            print url, ' We failed to reach a server.'
            print 'Reason: ', e.reason
        elif hasattr(e, 'code'):
            print url, ' The server couldn\'t fulfill the request.'
            print 'Error code: ', e.code


def main():
    url_list = [
        'http://www.baidu.com',
        'http://www.hitwh.edu.cn',
        'http://www.13.com',
        'http://www.ifeng.com',
        'http://www.sina.com',
        'http://www.wewin.com.gr/2',
        'http://www.ifeng.com',
        'http://www.sina.com',
        'http://www.zeeif.com/int/',
        'http://www.zeeif.com/websc/verification/',
        'http://mjgds.org/classrooms/wp-content/plugins/10421312312/19890907.html',
        'http://login-resolution-center-case-475ec2aec1br.propesage-algerie.com/ID',
        'http://paypel-login-resolution-center.propesage-algerie.com/ID/',
        'http://radiotransilvania.ro/clujarena/rena.php',
        'http://kuleteknik.net/wp-includes/lol3.html',
        'http://kuleteknik.net/wp-includes/lol2.html',
    ]
    for url in url_list:
        # Sequential version: the threaded calls stay commented out.
        #t = threading.Thread(target = online,args = (url,))
        #t.start()
        online(url)


if __name__ == '__main__':
    main()
Output:
http://www.baidu.com this url is online
http://www.hitwh.edu.cn this url is online
http://www.13.com We failed to reach a server.
Reason: [Errno 11001] getaddrinfo failed
http://www.ifeng.com this url is online
http://www.sina.com.cn/ this url is online
http://www.wewin.com.gr/2 We failed to reach a server.
Reason: Unauthorized
http://www.ifeng.com this url is online
http://www.sina.com.cn/ this url is online
http://www.zeeif.com/int/ We failed to reach a server.
Reason: Not Found
http://www.zeeif.com/websc/verification/ We failed to reach a server.
Reason: Not Found
http://mjgds.org/classrooms/wp-content/plugins/10421312312/19890907.html We failed to reach a server.
Reason: Internal Server Error
http://login-resolution-center-case-475ec2aec1br.propesage-algerie.com/ID this url is online
http://paypel-login-resolution-center.propesage-algerie.com/ID/ this url is online
http://radiotransilvania.ro/clujarena/rena.php We failed to reach a server.
Reason: Not Found
http://kuleteknik.net/wp-includes/lol3.html this url is online
http://kuleteknik.net/wp-includes/lol2.html this url is online
[Finished in 5.2s]
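Explanation: the run took 5.2 seconds. With more URLs to check, and especially with more offline URLs among them, it would take even longer.
2. With multithreading, the code is as follows: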
# encoding: utf-8
import threading
import urllib2


def online(url=''):
    """Check whether a URL is online."""
    req = urllib2.Request(url)
    try:
        response = urllib2.urlopen(req)
        if response.code == 200:
            print response.geturl(), ' this url is online'
        else:
            print 'not'
    except urllib2.URLError as e:
        # HTTPError is a subclass of URLError. In the output shown for this
        # script, HTTP errors such as 404 were reported by the first branch.
        if hasattr(e, 'reason'):
            print url, ' We failed to reach a server.'
            print 'Reason: ', e.reason
        elif hasattr(e, 'code'):
            print url, ' The server couldn\'t fulfill the request.'
            print 'Error code: ', e.code


def main():
    url_list = [
        'http://www.baidu.com',
        'http://www.hitwh.edu.cn',
        'http://www.13.com',
        'http://www.ifeng.com',
        'http://www.sina.com',
        'http://www.wewin.com.gr/2',
        'http://www.ifeng.com',
        'http://www.sina.com',
        'http://www.zeeif.com/int/',
        'http://www.zeeif.com/websc/verification/',
        'http://mjgds.org/classrooms/wp-content/plugins/10421312312/19890907.html',
        'http://login-resolution-center-case-475ec2aec1br.propesage-algerie.com/ID',
        'http://paypel-login-resolution-center.propesage-algerie.com/ID/',
        'http://radiotransilvania.ro/clujarena/rena.php',
        'http://kuleteknik.net/wp-includes/lol3.html',
        'http://kuleteknik.net/wp-includes/lol2.html',
    ]
    for url in url_list:
        # Threaded version: start one thread per URL.
        t = threading.Thread(target=online, args=(url,))
        t.start()
        #online(url)


if __name__ == '__main__':
    main()
Output:
http://www.baidu.com this url is online
http://www.ifeng.com this url is online
http://www.13.com We failed to reach a server.
Reason: [Errno 11001] getaddrinfo failed
http://www.ifeng.com this url is online
http://paypel-login-resolution-center.propesage-algerie.com/ID/ this url is online
http://www.hitwh.edu.cn this url is online
http://login-resolution-center-case-475ec2aec1br.propesage-algerie.com/ID this url is online
http://www.sina.com.cn/ this url is online
http://www.sina.com.cn/ this url is online
http://mjgds.org/classrooms/wp-content/plugins/10421312312/19890907.html We failed to reach a server.
Reason: Internal Server Error
http://www.zeeif.com/websc/verification/ We failed to reach a server.
Reason: Not Found
http://www.zeeif.com/int/ We failed to reach a server.
Reason: Not Found
http://kuleteknik.net/wp-includes/lol2.html this url is online
http://kuleteknik.net/wp-includes/lol3.html this url is online
http://www.wewin.com.gr/2 We failed to reach a server.
Reason: Unauthorized
http://radiotransilvania.ro/clujarena/rena.php We failed to reach a server.
Reason: Not Found
[Finished in 1.7s]
Explanation: each URL check runs in its own thread, and the whole run took only 1.7 s.
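The speed-up comes from the fact that each check spends most of its time waiting on the network, so the waits of different threads can overlap. Below is a minimal sketch of that effect, written for Python 3 and using `time.sleep` as a stand-in for network latency. The function names and the 0.05 s delay are assumptions made for illustration, not measurements from the experiment above.

```python
import threading
import time


def fake_check(url, delay=0.05):
    """Stand-in for a network request: just wait for `delay` seconds."""
    time.sleep(delay)
    return url


def run_sequential(urls):
    """Check URLs one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for url in urls:
        fake_check(url)
    return time.perf_counter() - start


def run_threaded(urls):
    """Check every URL in its own thread and return the elapsed seconds."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_check, args=(url,)) for url in urls]
    for t in threads:
        t.start()
    # join() waits for every thread, so the timing covers the whole batch.
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == '__main__':
    # Hypothetical placeholder URLs; nothing is actually requested.
    urls = ['http://example.com/%d' % i for i in range(8)]
    print('sequential: %.2fs' % run_sequential(urls))
    print('threaded:   %.2fs' % run_threaded(urls))
```

The sequential time grows with the sum of the delays, while the threaded time stays close to the longest single delay.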
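Summary:
1. When there are many URLs to check, on the order of millions, the advantage of multithreading becomes very pronounced.
2. This multithreaded code creates one thread per URL. With too many URLs that approach clearly does not work, so the checking code needs to be optimized.
3. When the URLs live in a database, how to write the results back to the database efficiently also matters.
4. I don't think the function above is entirely correct at judging whether a URL is online, because of redirects: a URL may no longer exist, yet after a redirect it is still reported as online. This is something to improve later. If you have a better approach, leave me a comment so we can improve together; if I find one, I will publish it on the blog.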
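Points 2 and 4 can be sketched together. The following is one possible approach, not the author's code: it is written for Python 3, where `urllib.request` replaces `urllib2`, and it uses a fixed-size pool from `concurrent.futures` instead of one thread per URL. The names `probe`, `was_redirected`, and `check_all`, the 5-second timeout, and the pool size of 20 are all assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request


def probe(url, timeout=5):
    """Return (url, status, final_url); status is None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return url, response.status, response.geturl()
    except urllib.error.HTTPError as e:
        # The server answered, but with an error status such as 404.
        return url, e.code, e.geturl()
    except OSError:
        # URLError subclasses OSError: DNS failure, refused connection,
        # timeout, and similar problems all end up here.
        return url, None, None


def was_redirected(requested, final):
    """True when the response came from a different URL than requested."""
    return final is not None and final.rstrip('/') != requested.rstrip('/')


def check_all(urls, check=probe, max_workers=20):
    """Run `check` over `urls` using at most `max_workers` threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, regardless of finish order.
        return list(pool.map(check, urls))


if __name__ == '__main__':
    for url, status, final in check_all(['http://www.baidu.com']):
        note = ' (redirected to %s)' % final if was_redirected(url, final) else ''
        print(url, status, note)
```

With this shape, a request for `http://www.sina.com` that ends at `http://www.sina.com.cn/`, as in the output above, can be flagged as redirected rather than simply counted as online. Whether a redirect should count as "offline" is a policy decision the sketch leaves open.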
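Update (2014.10.30)
1. Using pycurl to check whether a URL is online is more efficient.
2. The checker has been connected to a database and the results are stored there (a small project of my own, already finished).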
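The post does not say which database was used or how the results were written. As a rough sketch of the idea, assuming SQLite through Python 3's standard `sqlite3` module and a hypothetical `url_status` table, the results can be collected first and then written in a single batch:

```python
import sqlite3


def save_results(conn, results):
    """Insert (url, status) rows in one transaction."""
    conn.execute(
        'CREATE TABLE IF NOT EXISTS url_status ('
        'url TEXT PRIMARY KEY, status INTEGER)'
    )
    with conn:  # commits once on success, rolls back on error
        conn.executemany(
            'INSERT OR REPLACE INTO url_status (url, status) VALUES (?, ?)',
            results,
        )


if __name__ == '__main__':
    conn = sqlite3.connect(':memory:')
    # Example rows; None stands for a URL that could not be reached.
    save_results(conn, [('http://www.baidu.com', 200), ('http://www.13.com', None)])
    for row in conn.execute('SELECT url, status FROM url_status ORDER BY url'):
        print(row)
    conn.close()
```

Writing all rows with one `executemany` call inside one transaction is usually much cheaper than committing after every row. Gathering the results in the main thread before writing also avoids sharing one connection across worker threads.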