A First Try at a Python Web Crawler

Source: Internet  Editor: 程序博客网  Date: 2024/06/04 19:39
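# coding=utf-8
import re
import urllib.request


def getHtml(url):
    # Download the page and decode the response bytes as UTF-8 text.
    page = urllib.request.urlopen(url)
    html = page.read()
    html = html.decode('utf-8')
    return html


def getImg(html):
    # Capture every .jpg URL in a src="..." attribute followed by pic_ext.
    reg = r'src="(.+?\.jpg)" pic_ext'
    imgre = re.compile(reg)
    imglist = imgre.findall(html)
    x = 0
    for imgurl in imglist:
        # Raw string keeps the backslashes in the Windows path literal.
        # The D:\jpg directory must already exist.
        urllib.request.urlretrieve(imgurl, r'D:\jpg\%s.jpg' % x)
        x += 1
        print(x)


html = getHtml("http://tieba.baidu.com/p/2460150866")
getImg(html)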
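The extraction step can be tried on its own, without touching the network. The sketch below applies the same regular expression to a small HTML fragment. Note that `SAMPLE_HTML`, the `example.com` URLs, and the helper name `extract_image_urls` are all made up for illustration; the real markup on the Tieba page may look different.

```python
import re

# Hypothetical HTML fragment, invented for this example only.
SAMPLE_HTML = (
    '<img class="BDE_Image" src="http://example.com/a.jpg" pic_ext="jpeg">'
    '<img class="BDE_Image" src="http://example.com/b.jpg" pic_ext="jpeg">'
    '<img src="http://example.com/logo.png">'
)

# Same pattern as in the script: capture a .jpg URL that is
# immediately followed by a pic_ext attribute.
IMG_PATTERN = re.compile(r'src="(.+?\.jpg)" pic_ext')


def extract_image_urls(html):
    """Return the list of captured .jpg URLs (hypothetical helper)."""
    return IMG_PATTERN.findall(html)


urls = extract_image_urls(SAMPLE_HTML)
for index, url in enumerate(urls):
    # The index is what the script uses as the file name (0.jpg, 1.jpg, ...).
    print(index, url)
```

With this fragment, only the two `.jpg` entries that carry a `pic_ext` attribute are picked up, and the `.png` image is skipped.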