How to Crawl Bilibili Info with Python in Parallel
Source: Internet | Published by: 淮安悠迅网络 | Editor: 程序博客网 | Time: 2024/06/05 22:44
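This builds on the basic code from "How to craw the Info of BiliBIli with python" and adds parallelism: each thread crawls its own block of consecutive av numbers, and all threads write into one shared table guarded by a lock.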
# -*- coding: utf-8 -*-
# Crawl Bilibili video stats in parallel
import threading

import requests
from prettytable import PrettyTable

# Guards tableItem, which is shared by all worker threads
lock = threading.Lock()
tableItem = PrettyTable(['av_num', 'view', 'danmaku', 'reply', 'favorite', 'coin', 'share'])


# The crawl logic: fetch crawNum consecutive videos starting at beginNum
def startCraw(url, beginNum, crawNum):
    for avNum in range(beginNum, beginNum + crawNum):
        try:
            myRequest = requests.get(url.format(avNum), timeout=10)
        except requests.RequestException as e:
            print('error:%s' % e)
            continue
        if myRequest.status_code != 200:
            print('the status_code is not 200,url:%s,status_code:%d'
                  % (url.format(avNum), myRequest.status_code))
            continue
        try:
            jsDict = myRequest.json()['data']
            row = ['av' + str(avNum)]
            for key in ('view', 'danmaku', 'reply', 'favorite', 'coin', 'share'):
                row.append(str(jsDict[key]))
        except Exception as e:
            print('error:%s' % e)
            continue
        # Hold the lock only while touching the shared table
        with lock:
            tableItem.add_row(row)


# Main function
if __name__ == '__main__':
    url = 'https://api.bilibili.com/x/web-interface/archive/stat?aid={}'
    beginNum = int(input('please enter the begin av_number:'))
    crawNum = int(input('please enter the number you want to crawl per thread:'))
    threadNum = int(input('please enter the number of threads:'))
    myThreads = []
    for i in range(threadNum):
        # Each thread gets its own block of crawNum av numbers
        myThreads.append(threading.Thread(target=startCraw, args=(url, beginNum, crawNum)))
        beginNum += crawNum
    for t in myThreads:
        t.start()
    for t in myThreads:
        t.join()
    print(tableItem)
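The way the main function hands out work can be hard to see at a glance: every thread receives a start number and crawls crawNum videos from there, so the total crawled is crawNum times threadNum. Below is a minimal sketch of just that partitioning and the lock-protected collection, with no network access. The names split_ranges, run_workers and the fetch callback are hypothetical helpers made up for illustration; they are not part of the original script.

```python
import threading


def split_ranges(begin_num, craw_num, thread_num):
    """Return one (start, stop) pair per thread; stop is exclusive."""
    return [(begin_num + i * craw_num, begin_num + (i + 1) * craw_num)
            for i in range(thread_num)]


def run_workers(ranges, fetch):
    """Call fetch(av) for every av number, one thread per range.

    Results are appended to a shared list under a lock and returned sorted,
    because threads may finish in any order.
    """
    results = []
    lock = threading.Lock()

    def worker(start, stop):
        for av in range(start, stop):
            item = fetch(av)  # stand-in for the HTTP request
            with lock:
                results.append(item)

    threads = [threading.Thread(target=worker, args=r) for r in ranges]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)


if __name__ == '__main__':
    ranges = split_ranges(100, 5, 3)
    print(ranges)
    print(run_workers(ranges, lambda av: av))
```

With a begin number of 100, 5 videos per thread and 3 threads, the three threads cover 100-104, 105-109 and 110-114. Rows arrive in the shared container in whatever order the threads finish, which is why the sketch sorts before returning; the table printed by the script above is not sorted.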