Using pycurl with redirects

Source: Internet; published under: linux rpm安装java; editor: 程序博客网; time: 2024/06/14 20:54

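Given a DOI, look up the URL that the DOI resolver redirects to, without downloading the article itself. With FOLLOWLOCATION set to 0, pycurl stops at the resolver's redirect response, and the target URL is read out of that response's body. To fetch the full text instead, set FOLLOWLOCATION to 1.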

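import io
import re
import pycurl

doi = '10.1007/s12559-015-9328-x'
storage = io.BytesIO()  # Python 3; Python 2 stores the response differently
c = pycurl.Curl()  # initialize the Curl object
c.setopt(c.URL, 'http://dx.doi.org/' + doi)  # set the URL
c.setopt(pycurl.FOLLOWLOCATION, 0)  # 0 disables following redirects, 1 enables it
# c.setopt(c.HTTPHEADER, ["Accept: application/vnd.crossref.unixsd+xml"])  # set a request header
c.setopt(pycurl.WRITEFUNCTION, storage.write)  # write the response body into storage
# c.setopt(pycurl.HEADERFUNCTION, storage.write)  # write the response headers into storage
c.perform()  # run the request
# out = c.getinfo(pycurl.EFFECTIVE_URL)  # last URL used; the post-redirect URL if redirects were followed
# The body arrives as bytes and must be decoded. With FOLLOWLOCATION set to 1
# this would be the full text of the page reached after the redirects.
content = storage.getvalue().decode('utf-8')
# Pull the redirect URL out of the returned body with a regular expression
# (non-greedy, so the match stops at the first closing quote).
out = re.compile('(?<=href=").*?(?=">)').findall(content)[0]
# cu_w (database cursor), con (connection) and pmid come from the surrounding program.
cu_w.execute('insert into doi_link values(%s, %s, %s)', (pmid, doi, out))
con.commit()
c.close()
storage.close()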
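The snippet above reads the redirect target from the HTML body, which depends on how the resolver formats that page. A possible alternative, sketched here under assumptions, is to capture the response headers through HEADERFUNCTION (commented out in the original) and read the Location header instead. The helper names parse_location and fetch_redirect_target are hypothetical and not part of pycurl. The sketch also assumes the server answers with a redirect that carries a Location header.

```python
import io


def parse_location(raw_headers):
    """Return the Location header value from raw response headers, or None.

    raw_headers is the bytes collected by a HEADERFUNCTION callback.
    """
    # Header bytes are decoded as ISO-8859-1, which never fails on arbitrary bytes.
    for line in raw_headers.decode('iso-8859-1').splitlines():
        name, sep, value = line.partition(':')
        # Header names are case-insensitive; the status line has no colon.
        if sep and name.strip().lower() == 'location':
            return value.strip()
    return None


def fetch_redirect_target(url):
    """Request url without following redirects and return its Location header.

    Hypothetical helper: it performs a real network request when called.
    """
    import pycurl  # imported lazily so parse_location works without pycurl

    headers = io.BytesIO()
    body = io.BytesIO()  # the body is collected but not used
    c = pycurl.Curl()
    try:
        c.setopt(pycurl.URL, url)
        c.setopt(pycurl.FOLLOWLOCATION, 0)  # stop at the first response
        c.setopt(pycurl.WRITEFUNCTION, body.write)
        c.setopt(pycurl.HEADERFUNCTION, headers.write)
        c.perform()
    finally:
        c.close()
    return parse_location(headers.getvalue())
```

Calling fetch_redirect_target('http://dx.doi.org/' + doi) would then return the first hop of the redirect chain, or None if the response carries no Location header. The parsing step can be checked on its own by passing a hand-written header block to parse_location.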

1 0

pycurl 使用 跳转

pycurl 使用 跳转

pycurl 使用跳转

pycurl 使用跳转