Code 1
Source: Internet  Published by: 淘宝刷好评兼职工作  Editor: 程序博客网  Date: 2024/05/13 15:36
# coding:utf-8
import requests
from bs4 import BeautifulSoup
from multiprocessing import Pool


def get_zhaopin(page):
    # Fetch one page of search results and print every job posting on it.
    url = 'http://sou.zhaopin.com/jobs/searchresult.ashx?jl=全国&kw=python&p={0}&kt=3'.format(page)
    print("Page {0}".format(page))
    header = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36',
        'Connection': 'keep-alive',
        'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'
    }
    wbdata = requests.get(url, headers=header).content
    soup = BeautifulSoup(wbdata, 'lxml')
    # Each selector returns one column of the results table.
    job_name = soup.select("table.newlist > tr > td.zwmc > div > a")
    salarys = soup.select("table.newlist > tr > td.zwyx")
    locations = soup.select("table.newlist > tr > td.gzdd")
    times = soup.select("table.newlist > tr > td.gxsj > span")
    # Walk the columns in parallel to rebuild one record per row.
    for name, salary, location, time in zip(job_name, salarys, locations, times):
        data = {
            'name': name.get_text(),
            'salary': salary.get_text(),
            'location': location.get_text(),
            'time': time.get_text(),
        }
        print(data)


if __name__ == '__main__':
    # Crawl result pages 1 through 403 with five worker processes.
    pool = Pool(processes=5)
    pool.map_async(get_zhaopin, range(1, 403+1))
    pool.close()
    pool.join()

# coding:utf-8
import requests
from bs4 import BeautifulSoup
import re

url = 'http://sou.zhaopin.com/jobs/searchresult.ashx?jl=全国&kw=python&p=1&kt=3'
header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36',
    'Connection': 'keep-alive',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'
}
wbdata = requests.get(url, headers=header).content
soup = BeautifulSoup(wbdata, 'lxml')
items = soup.select("div#newlist_list_content_table > table")
count = len(items) - 1
# Number of job postings per page
print(count)
job_count = re.findall(r"共<em>(.*?)</em>个职位满足条件", str(soup))[0]
# Number of result pages; ceiling division, so an exact multiple of
# `count` does not produce one extra, empty page
pages = (int(job_count) + count - 1) // count
print(pages)
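
The second script computes the number of result pages, which is the figure the first script hard-codes as 403. Below is a minimal sketch of that calculation, split from the network request so it can be tried offline. The helper names `parse_job_count` and `page_count`, the sample HTML fragment, and the figures 24156 and 60 are illustrative assumptions, not values taken from the site or from the original scripts.

```python
import re


def parse_job_count(html):
    """Extract the total number of matching jobs from result-page HTML."""
    match = re.search(r"共<em>(.*?)</em>个职位满足条件", html)
    if match is None:
        raise ValueError("job count not found in page")
    return int(match.group(1))


def page_count(job_count, per_page):
    """Ceiling division: how many pages are needed for job_count items."""
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    return (job_count + per_page - 1) // per_page


# Made-up fragment shaped like the text the regex expects.
sample = "<span>共<em>24156</em>个职位满足条件</span>"
total = parse_job_count(sample)
# 60 postings per page is an assumed value for illustration only.
print(page_count(total, 60))
```

With a result like this in hand, the crawler could pass `range(1, pages + 1)` to the pool instead of a fixed upper bound, so the range follows the current number of search results.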