[python] Saving crawled content to a local file


1. The open function
How to open a local file

    with open('path', 'w') as f:
        f.write('data')

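path is the path of the file. The mode 'r' opens it for reading and 'w' opens it for writing, so f.write only works when the file was opened in a write mode such as 'w'. f is the file object used to operate on the open file.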
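As a minimal sketch of the two modes, the snippet below writes a string with 'w' and reads it back with 'r'. The file name demo.txt and the temporary directory are assumptions for illustration and do not come from the original post.

```python
import os
import tempfile

# Assumed location: a temporary directory, so the example does not touch real files.
demo_dir = tempfile.mkdtemp()
path = os.path.join(demo_dir, 'demo.txt')

# 'w' creates the file (or truncates an existing one) and allows writing.
with open(path, 'w', encoding='utf-8') as f:
    f.write('data')

# 'r' opens the same file for reading only.
with open(path, 'r', encoding='utf-8') as f:
    content = f.read()

print(content)
```

Passing encoding='utf-8' explicitly avoids depending on the platform default, which matters once the text contains Chinese characters.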
2. String methods
format, for substitution: put {} inside a string, then call .format(new_str) on it to insert new_str where the {} is.
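A short sketch of format in use. The URL template and file name below are made-up examples, not the blog's real pattern.

```python
# Hypothetical template: the {} marks where the value is inserted.
template = 'http://example.com/page_{}.html'
url = template.format(3)
print(url)

# Several placeholders are filled in order, left to right.
filename = '{}_{}.txt'.format('article', 7)
print(filename)
```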
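split, for splitting: call .split() on a string and pass the separator inside the parentheses. The string is cut into a list of pieces, so you can pick out the part you want.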
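A small sketch of split. The sample string and the '_' separator are assumptions chosen for illustration.

```python
# Hypothetical string with '_' as the separator.
raw = 'My first post_some author_blog'
parts = raw.split('_')
print(parts)
print(parts[0])  # the piece before the first separator

# With no argument, split() cuts on any run of whitespace.
words = 'save   to  disk'.split()
print(words)
```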
3. Writing an article to a file

    from bs4 import BeautifulSoup
    import requests

    url = 'http://blog.sina.com.cn/s/blog_486e105c010001s9.html'
    resp = requests.get(url)
    resp.encoding = 'utf-8'
    soup = BeautifulSoup(resp.text, 'lxml')

    # Title element and body paragraphs, located by CSS selector.
    title = soup.select('#t_486e105c010001s9')
    pp = soup.select('#sina_keyword_ad_area2 > div > p > span > font')

    # Use the title text as the file name.
    for i in title:
        title1 = i.get_text()
    name = title1 + '.txt'

    # Write the text of each paragraph to the file.
    with open(name, 'w', encoding='utf-8') as fo:
        for i in pp:
            fo.write(i.get_text())
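A scraped title can contain characters that are not valid in file names, and the selector may match nothing. One way to guard against both is a small helper like the sketch below. save_article is a hypothetical function, not part of the original script, and the sample title and paragraphs are made up to stand in for the scraped values.

```python
import os
import re
import tempfile


def save_article(title, paragraphs, directory):
    """Write paragraphs to <directory>/<title>.txt and return the path."""
    # Replace characters that are not allowed in file names on common systems;
    # fall back to 'untitled' if nothing is left.
    safe_title = re.sub(r'[\\/:*?"<>|]', '_', title).strip() or 'untitled'
    path = os.path.join(directory, safe_title + '.txt')
    with open(path, 'w', encoding='utf-8') as fo:
        for text in paragraphs:
            fo.write(text + '\n')  # one paragraph per line
    return path


# Made-up data standing in for the scraped title and paragraph texts.
out_dir = tempfile.mkdtemp()
saved = save_article('A title: part 1', ['first paragraph', 'second paragraph'], out_dir)
print(saved)
```

In the script, the arguments would be the title text and the list of i.get_text() values from the paragraphs.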

The article URLs follow a pattern: only the trailing number differs, so you can use

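    # Put the common part of the URL in front of {}; avoid naming the list str, which shadows the built-in.
    urls = ['{}'.format(n) for n in range(1, 10)]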

to process them in batch.
The title can be cleaned up with the split() method.
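Putting the two ideas together might look like the sketch below. The URL template and the raw title format are assumptions, since the post does not give the real pattern. No request is made here, the code only builds the strings.

```python
# Hypothetical URL pattern: only the trailing number changes.
template = 'http://example.com/blog/article_{}.html'
urls = [template.format(n) for n in range(1, 10)]  # numbers 1 to 9

for url in urls[:3]:
    print(url)

# Hypothetical raw title; keep only the part before the first '_'.
raw_title = 'Article title_author_site'
name = raw_title.split('_')[0] + '.txt'
print(name)
```

Each entry in urls could then be passed to requests.get in a loop, with the cleaned title used as the output file name.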

0 0