【python 爬虫】全国失信被执行人名单查询功能
来源:互联网 发布:vmware中linux上网 编辑:程序博客网 时间:2024/04/27 13:29
一、需求说明
利用百度的接口,实现一个全国失信被执行人名单查询功能。输入姓名,查询是否在全国失信被执行人名单中。
二、python实现
版本1:
# -*- coding:utf-8*-import sysreload(sys)sys.setdefaultencoding('utf-8')import timeimport requeststime1=time.time()import pandas as pdimport jsoniname=[]icard=[]def person_executed(name): for i in range(0,30): try: url="https://sp0.baidu.com/8aQDcjqpAAV3otqbppnN2DJv/api.php?resource_id=6899" \ "&query=%E5%A4%B1%E4%BF%A1%E8%A2%AB%E6%89%A7%E8%A1%8C%E4%BA%BA%E5%90%8D%E5%8D%95" \ "&cardNum=&" \ "iname="+str(name)+ \ "&areaName=" \ "&pn="+str(i*10)+ \ "&rn=10" \ "&ie=utf-8&oe=utf-8&format=json" html=requests.get(url).content html_json=json.loads(html) html_data=html_json['data'] for each in html_data: k=each['result'] for each in k: print each['iname'],each['cardNum'] iname.append(each['iname']) icard.append(each['cardNum']) except: passif __name__ == '__main__': name="郭家松" person_executed(name) print len(iname) #####################将数据组织成数据框########################### data=pd.DataFrame({"name":iname,"IDCard":icard}) #################数据框去重#################################### data1=data.drop_duplicates() print data1 print len(data1) #########################写出数据到excel######################################### pd.DataFrame.to_excel(data1,"F:\\iname_icard_query.xlsx",header=True,encoding='gbk',index=False) time2=time.time() print u'ok,爬虫结束!' print u'总共耗时:'+str(time2-time1)+'s'
三、效果展示
"D:\Program Files\Python27\python.exe" D:/PycharmProjects/learn2017/全国失信被执行人查询.py郭家松 34122319790****5119郭家松 32032119881****2419郭家松 32032119881****24193 IDCard name0 34122319790****5119 郭家松1 32032119881****2419 郭家松2ok,爬虫结束!总共耗时:7.72000002861sProcess finished with exit code 0
版本2:
# -*- coding:utf-8*-import sysreload(sys)sys.setdefaultencoding('utf-8')import timeimport requeststime1=time.time()import pandas as pdimport jsoniname=[]icard=[]courtName=[]areaName=[]caseCode=[]duty=[]performance=[]disruptTypeName=[]publishDate=[]def person_executed(name): for i in range(0,30): try: url="https://sp0.baidu.com/8aQDcjqpAAV3otqbppnN2DJv/api.php?resource_id=6899" \ "&query=%E5%A4%B1%E4%BF%A1%E8%A2%AB%E6%89%A7%E8%A1%8C%E4%BA%BA%E5%90%8D%E5%8D%95" \ "&cardNum=&" \ "iname="+str(name)+ \ "&areaName=" \ "&pn="+str(i*10)+ \ "&rn=10" \ "&ie=utf-8&oe=utf-8&format=json" html=requests.get(url).content html_json=json.loads(html) html_data=html_json['data'] for each in html_data: k=each['result'] for each in k: print each['iname'],each['cardNum'],each['courtName'],each['areaName'],each['caseCode'],each['duty'],each['performance'],each['disruptTypeName'],each['publishDate'] iname.append(each['iname']) icard.append(each['cardNum']) courtName.append(each['courtName']) areaName.append(each['areaName']) caseCode.append(each['caseCode']) duty.append(each['duty']) performance.append(each['performance']) disruptTypeName.append(each['disruptTypeName']) publishDate.append(each['publishDate']) except: passif __name__ == '__main__': name="郭家松" person_executed(name) print len(iname) #####################将数据组织成数据框########################### # data=pd.DataFrame({"name":iname,"IDCard":icard}) detail_data=pd.DataFrame({"name":iname,"IDCard":icard,"courtName":courtName,"areaName":areaName,"caseCode":caseCode,"duty":duty,"performance":performance,\ "disruptTypeName":disruptTypeName,"publishDate":publishDate}) #################数据框去重#################################### # data1=data.drop_duplicates() # print data1 # print len(data1) detail_data1=detail_data.drop_duplicates() # print detail_data1 # print len(detail_data1) #########################写出数据到excel######################################### pd.DataFrame.to_excel(detail_data1,"F:\\iname_icard_query.xlsx",header=True,encoding='gbk',index=False) time2=time.time() print u'ok,爬虫结束!' print u'总共耗时:'+str(time2-time1)+'s'
阅读全文
1 0
- 【python 爬虫】全国失信被执行人名单查询功能
- 【python 爬虫】全国失信被执行人名单爬虫
- 国家对“失信被执行人”惩戒手段,有哪些是你不知道的
- 哭晕!失信被执行人,最多可拘留27次(“老赖”全疯了)
- python爬虫获取全国天气信息
- 全国民办普通高校名单
- python爬虫高级功能
- Python爬虫登录功能
- 使用Spark SQL 探索“全国失信人数据”
- 使用Spark SQL 探索“全国失信人数据”
- python查询全国主要天气代码
- 全国“211工程”高校名单
- 2014年全国高等学校名单
- python实现简单爬虫功能
- python实现简单爬虫功能
- python实现简单爬虫功能
- python实现简单爬虫功能
- python实现简单爬虫功能
- 赢在面试之Java集合框架篇(3)
- 深度学习实践课程--fast.ai 资料整理
- 4922: Karp-de-Chant Number
- php导出excel表格文件
- spring分布式事务详解
- 【python 爬虫】全国失信被执行人名单查询功能
- windows 开机引导丢失dos命令
- <script type="text/html"></script> js模版使用
- Electron 环境搭建
- 动态规划---公式推导剖析
- activit工作流-会签流程(多实例)
- MYSQL:Access denied for user 'root'@'localhost' (using password:YES) 问题解决方法
- eclipse maven 操作文档
- Registration system