pythonPkg_BeautifulSoup

Source: Internet  Editor: 程序博客网  Time: 2024/05/17 08:57
Beautiful Soup Documentation 4.0.0
http://www.crummy.com/software/BeautifulSoup/bs4/doc/#

Objects:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Tag
  name: .name
  attributes: tag['attr'], .attrs
NavigableString:
  .string
BeautifulSoup:

Navigating:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Down:
  using tag name: tag.tag2.tag3
  .contents => list
  .children => iterator, direct children
  .descendants => iterator, all descendants (children, grandchildren, ...)
  .string
  .strings, .stripped_strings
Up:
  .parent, .parents
Sideways:
  .next_sibling, .previous_sibling
  .next_siblings, .previous_siblings
Back and forth:
  .next_element, .previous_element
  .next_elements, .previous_elements

Searching:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
filters:
  string
  regular expression
  list
  function

find_all(name, attrs, recursive, text, limit, **kwargs)
  keyword arguments = kwargs
find(name, attrs, recursive, text, **kwargs)

find_parents(name, attrs, text, limit, **kwargs)
find_parent(name, attrs, text, **kwargs)

find_next_siblings(name, attrs, text, limit, **kwargs)
find_next_sibling(name, attrs, text, **kwargs)

find_previous_siblings(name, attrs, text, limit, **kwargs)
find_previous_sibling(name, attrs, text, **kwargs)

find_all_next(name, attrs, text, limit, **kwargs)
find_next(name, attrs, text, **kwargs)

find_all_previous(name, attrs, text, limit, **kwargs)
find_previous(name, attrs, text, **kwargs)

select

Modifying:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
change tag, attr, string:
  tag.name = ''
  tag['attr'] = ''
  tag.string = ''
append()
BeautifulSoup.new_string(), .new_tag()
insert()
insert_before(), insert_after()
clear()
extract()
decompose()
replace_with()
wrap(), unwrap()

Output:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Pretty-printing:
  prettify('encoding')
  formatter
get_text()
specify parser:
  BeautifulSoup(doc, 'lxml')
encoding:
  BeautifulSoup(doc, from_encoding='')
  UnicodeDammit(doc, [encoding, ...])
    .unicode_markup
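The cheat sheet above only lists names, so here is a minimal sketch that exercises one or two entries from each section. It is not taken from the original notes: the sample HTML, the example.com URLs, and all variable names are made up for illustration. It assumes a reasonably recent bs4 release (the `class_` keyword is not available in 4.0.0 itself) and uses the built-in `html.parser` so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document, invented for this sketch.
html = """
<html><body>
<p class="title"><b>Demo</b></p>
<p class="story">Links:
<a href="http://example.com/a" id="link1">A</a>,
<a href="http://example.com/b" id="link2">B</a>
</p>
</body></html>
"""

# Specify the parser explicitly; 'lxml' would also work if it is installed.
soup = BeautifulSoup(html, "html.parser")

# Objects: Tag name, attributes, NavigableString
first_p = soup.p
tag_name = first_p.name          # 'p'
tag_class = first_p["class"]     # ['title'] (class is multi-valued, so a list)
title_text = first_p.b.string    # NavigableString 'Demo'

# Navigating: up with .parent, down with .stripped_strings
link1 = soup.find("a", id="link1")
parent_name = link1.parent.name  # 'p'
story = soup.find("p", class_="story")
story_strings = list(story.stripped_strings)

# Searching: find_all, find_next_sibling, select (CSS selector)
hrefs = [a["href"] for a in soup.find_all("a")]
link2 = link1.find_next_sibling("a")
selected = soup.select("p.story a")

# Modifying: new_tag, insert_after, unwrap
new_link = soup.new_tag("a", href="http://example.com/c")
new_link.string = "C"
link2.insert_after(new_link)     # third link now follows link2
soup.p.b.unwrap()                # drop the <b> tag but keep its text

# Output: plain text and pretty-printed markup
text = soup.get_text()
pretty = soup.prettify()
print(pretty)
```

`find_all` and `select` return lists, while `find` and `find_next_sibling` return a single element or `None`, so check for `None` before indexing into a result when the input markup is not under your control.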
