A search spider i…

Reposted from http://blog.dharanasoft.com/2012/03/19/a-search-spider-in-ruby-using-capybara-webkit/

Posted: March 19, 2012 | Author: Deepak Prasanna | Filed under: Uncategorized

When I first looked at Nokogiri, it was a redefining moment (at least for me!) in how to screen scrape. Recently I found my love for cucumber and capybara-webkit. For newbies to capybara-webkit, it is a Capybara driver which simulates a WebKit browser for running tests. Perks? You get a simulated browser running in headless mode, it supports JavaScript, and it's bloody fast! For more info, please check out a previous article on how to get started. I was extremely bored this weekend, and all of a sudden an idea was born. I created a simple search spider using capybara-webkit which would fetch search results from Google. And here is how I did it.

require 'rubygems'
require 'capybara'
require 'capybara/dsl'
require 'capybara-webkit'

# We are driving a remote site, so no local Rack server is needed.
Capybara.run_server = false
Capybara.current_driver = :webkit
Capybara.app_host = "http://www.google.com/"

module Spider
  class Google
    include Capybara::DSL

    # Runs a Google search and prints each result title with its link.
    # The query comes from the first command-line argument, if given.
    def search
      visit('/')
      fill_in "q", :with => ARGV[0] || "I love Ruby!"
      click_button "Google Search"
      all("li.g h3").each do |h3|
        a = h3.find("a")
        puts "#{h3.text}  =>  #{a[:href]}"
      end
    end
  end
end

spider = Spider::Google.new
spider.search
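
The spider above prints each result as soon as it finds it. If you would rather hand the results to other code, one possible variation is to collect them into plain data first. The sketch below is not from the original post: `Result`, `collect_results`, `FakeLink` and `FakeHeading` are hypothetical names, and the fake classes only stand in for Capybara nodes so the extraction step can be tried without launching a browser. It assumes nothing beyond the two calls the spider already makes on each node, `text` and `find("a")[:href]`.

```ruby
# Hypothetical helper (not part of the original post): turn the matched
# heading nodes into plain data so the caller decides how to display them.
Result = Struct.new(:title, :url)

# `headings` is any enumerable whose items respond to #text and #find("a"),
# which is how the spider uses the nodes returned by all("li.g h3").
def collect_results(headings)
  headings.map do |h3|
    link = h3.find("a")
    Result.new(h3.text.strip, link[:href])
  end
end

# Stand-in for a link node: Struct#[] lets us read link[:href] directly.
FakeLink = Struct.new(:href)

# Stand-in for an <h3> node wrapping a single link.
class FakeHeading
  attr_reader :text

  def initialize(text, link)
    @text = text
    @link = link
  end

  # Ignores the selector and returns the one link it was built with.
  def find(_selector)
    @link
  end
end

sample = [
  FakeHeading.new(" Ruby Programming Language ",
                  FakeLink.new("https://www.ruby-lang.org/"))
]

collect_results(sample).each do |r|
  puts "#{r.title}  =>  #{r.url}"
end
```

Inside `Spider::Google#search`, the same helper could presumably be called as `collect_results(all("li.g h3"))` in place of the `each` loop, leaving the printing to the caller.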