Inflating Page Views with Python

Source: Internet  Editor: 程序博客网  Time: 2024/06/17 20:55

Python is a good thing. Life is short, so why not use Python?

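Below is some Python code for inflating page views. It uses multithreading and IP proxies. Enough small talk, let's begin.
We start with the free proxy site 西刺免费代理 (Xici free proxies), http://www.xicidaili.com/, which publishes proxy IPs at no charge. As for what an IP proxy is: put simply, it lets you visit a website from different IP addresses.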
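To make the idea concrete, here is a minimal sketch of routing requests through a proxy. It is written for Python 3's urllib.request, whereas the listings in this post use Python 2's urllib2, which offers ProxyHandler and build_opener under the same names. make_proxy_opener is a hypothetical helper, the proxy address is the sample one that appears in this post's code (and is very likely no longer live), and example.com is only a placeholder target.

```python
import urllib.request

def make_proxy_opener(proxy_url):
    """Return an opener that sends http:// requests through proxy_url."""
    handler = urllib.request.ProxyHandler({'http': proxy_url})
    return urllib.request.build_opener(handler)

opener = make_proxy_opener('http://43.243.112.79:3128')
# opener.open('http://example.com/', timeout=10) would go through the proxy.
# The call is left commented out because the sample proxy is probably dead.
```

Building a separate opener per proxy, rather than calling install_opener, keeps the proxy choice local to each request.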

At first I saved the IP addresses to a txt file:

    # Python 2 (urllib2)
    import urllib2
    from bs4 import BeautifulSoup  # import assumed; not shown in the original

    # Placeholder request headers; the original defines `header` elsewhere.
    header = {'User-Agent': 'Mozilla/5.0'}

    page = 1
    while page < 10:
        # To fetch the listing itself through a proxy, install an opener first:
        # proxy_support = urllib2.ProxyHandler({'http': 'http://43.243.112.79:3128'})
        # opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler)
        # urllib2.install_opener(opener)

        # Other listings on the same site: /nn/, /wn/, /wt/, /qq/
        url = 'http://www.xicidaili.com/nt/' + str(page)
        req = urllib2.Request(url, headers=header)
        res = urllib2.urlopen(req).read()
        soup = BeautifulSoup(res, 'html.parser')
        ips = soup.findAll('tr')
        # Append one "ip<TAB>port" line per table row, skipping the header row.
        with open("nt.txt", "a") as f:
            for x in range(1, len(ips)):
                tds = ips[x].findAll("td")
                ip_temp = tds[1].contents[0] + "\t" + tds[2].contents[0] + "\n"
                f.write(ip_temp)
        page += 1  # move on to the next page
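The row-parsing step can be tried offline. The sketch below runs on Python 3 and swaps BeautifulSoup for the standard library's html.parser so that it needs no extra installs. SAMPLE_HTML is a made-up table that only mimics the column order the listing relies on (second cell the IP, third cell the port), not the site's real markup, and RowCollector and extract_proxies are hypothetical names.

```python
from html.parser import HTMLParser

class RowCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._row = []
        elif tag == 'td' and self._row is not None:
            self._in_td = True
            self._row.append('')

    def handle_endtag(self, tag):
        if tag == 'td':
            self._in_td = False
        elif tag == 'tr' and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_td:
            self._row[-1] += data.strip()

# Hypothetical sample input, not the real page.
SAMPLE_HTML = """
<table>
  <tr><th>flag</th><th>IP</th><th>port</th></tr>
  <tr><td></td><td>43.243.112.79</td><td>3128</td></tr>
  <tr><td></td><td>10.0.0.1</td><td>8080</td></tr>
</table>
"""

def extract_proxies(html):
    parser = RowCollector()
    parser.feed(html)
    # Rows with fewer than three <td> cells (such as the header row) are skipped.
    return [row[1] + "\t" + row[2] for row in parser.rows if len(row) >= 3]

lines = extract_proxies(SAMPLE_HTML)
```

Checking the cell count per row, instead of skipping the first row by index, also guards against rows that carry no proxy data.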
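But then a problem came up: when I went to use them, the proxies no longer worked. Why? The reason is simple: a proxy IP is only valid for a limited time. Be sure to keep this in mind.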
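Because free proxies go stale quickly, it can help to test each one right before using it. Below is a rough Python 3 sketch of such a check. proxy_works and its parameters are hypothetical names, and the URL to probe is whatever page you choose.

```python
import urllib.request

def proxy_works(proxy_url, test_url, timeout=5):
    """Return True if test_url can be fetched through proxy_url in time."""
    handler = urllib.request.ProxyHandler({'http': proxy_url})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(test_url, timeout=timeout) as resp:
            return resp.status == 200
    except Exception:
        # Refused connections, timeouts and HTTP errors all count as "dead".
        return False
```

A short timeout matters here, since a dead proxy often hangs rather than failing outright.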

So we adjust our approach: we only take the IP addresses from the first 10 pages of the site.

    # Python 2 (urllib2)
    import urllib2
    from bs4 import BeautifulSoup  # import assumed; not shown in the original

    # Placeholders; the original defines `header` and `pagenum` elsewhere.
    header = {'User-Agent': 'Mozilla/5.0'}
    pagenum = '1'

    page = int(pagenum)
    while page < 999:
        # To fetch the listing itself through a proxy, install an opener first:
        # proxy_support = urllib2.ProxyHandler({'http': 'http://43.243.112.79:3128'})
        # opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler)
        # urllib2.install_opener(opener)

        # Other listings on the same site: /nn/, /wn/, /wt/, /qq/
        url = 'http://www.xicidaili.com/nt/' + str(page)
        req = urllib2.Request(url, headers=header)
        res = urllib2.urlopen(req).read()
        soup = BeautifulSoup(res, 'html.parser')
        ips = soup.findAll('tr')
        # Append one "ip<TAB>port" line per table row, skipping the header row.
        with open("nt.txt", "a") as f:
            for x in range(1, len(ips)):
                tds = ips[x].findAll("td")
                ip_temp = tds[1].contents[0] + "\t" + tds[2].contents[0] + "\n"
                f.write(ip_temp)
        page += 1  # move on to the next page
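The visiting loop that follows iterates over a list called proxys, which the post never shows being built. One plausible way to build it from the "ip<TAB>port" lines is sketched here in Python 3. to_proxy_dicts is a hypothetical helper, and the mapping shape is taken from the commented-out ProxyHandler example in the listings.

```python
def to_proxy_dicts(lines):
    """Turn 'ip<TAB>port' lines into mappings accepted by ProxyHandler."""
    proxys = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        ip, port = parts
        proxys.append({'http': 'http://%s:%s' % (ip, port)})
    return proxys

# Sample input in the same format the scraper writes to nt.txt.
proxys = to_proxy_dicts(["43.243.112.79\t3128\n", "\n"])
```

The same function works on lines read back from nt.txt or on lines kept in memory straight after scraping.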
Then visit the target page through each proxy in turn:

    # Python 2 (urllib2)
    import time
    import urllib2

    # Placeholders; the original defines `user_agent` and `proxys` elsewhere.
    # Each entry of `proxys` is a ProxyHandler mapping like the one below.
    user_agent = 'Mozilla/5.0'
    proxys = [{'http': 'http://43.243.112.79:3128'}]

    for proxy in proxys:
        print proxy
        try:
            proxy_support = urllib2.ProxyHandler(proxy)
            opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler)
            urllib2.install_opener(opener)
            url = 'http://blog.csdn.net/zhy421202048/article/details/51509155'
            request = urllib2.Request(url)
            request.add_header('User-Agent', user_agent)
            # Pass the Request object so the User-Agent header is actually sent.
            res = urllib2.urlopen(request, timeout=10).read()
            print res
            time.sleep(10)
        except Exception as e:
            # A dead or slow proxy ends up here; report it and try the next one.
            print proxy
            print e
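The introduction mentions multithreading, but the loop above visits one proxy at a time. A minimal Python 3 sketch of spreading the work over threads with concurrent.futures follows. visit_all and the visit callable are hypothetical names, fake_visit is only a stand-in, and a real visit function would wrap the request code from the loop.

```python
from concurrent.futures import ThreadPoolExecutor

def visit_all(proxys, visit, max_workers=4):
    """Call visit(proxy) for each proxy on a thread pool.

    Returns (proxy, ok) pairs in input order; ok is False if visit raised.
    """
    def safe_visit(proxy):
        try:
            visit(proxy)
            return (proxy, True)
        except Exception:
            return (proxy, False)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe_visit, proxys))

def fake_visit(proxy):
    # Stand-in for a real request; fails for one entry to show error handling.
    if proxy == 'bad':
        raise IOError('unreachable')

results = visit_all(['a', 'bad', 'b'], fake_visit)
```

Catching the exception inside each task means one dead proxy does not stop the others, which matches the try/except in the sequential loop.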

