Archive for August, 2009

Get all hyperlinks within a page using Nokogiri

August 27, 2009

Task: Create a Selenium script in Ruby that collects all the available links within a page.

In essence, we will create a method that parses the HTML source of the current page and collects every anchor element, selected with either the CSS selector 'a' or the XPath expression '//a'. First, let's try it in IRB.


1. Start your server and fire up IRB.

2. In your console, type

require 'nokogiri'
require 'open-uri'   # lets Ruby open a URL the same way it opens a file

3. Set the page we want to test. Say we want to get all the hyperlinks on the Google home page.

page = ""

4. Type the following commands

# needs require 'open-uri'; on Ruby older than 2.5 use open(page) instead
doc = Nokogiri::HTML(URI.open(page))
links = doc.css('a')
hrefs = links.map { |link| link.attribute('href').to_s }.uniq.sort.delete_if { |href| href.empty? }
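
The clean-up chain at the end of step 4 can be tried on its own, without a browser, a network connection or Nokogiri. The snippet below is only a sketch: the array stands in for the values that link.attribute('href') would return (nil when an anchor has no href), and the method name tidy_hrefs and the sample paths are made up for illustration.

```ruby
# Stand-ins for what link.attribute('href') returns for each anchor:
# a value, or nil when the anchor has no href (for example <a name="top">).
raw_attributes = ['/about', nil, '/contact', '/about', '']

# Hypothetical helper that mirrors the chain used in step 4.
def tidy_hrefs(attributes)
  # nil.to_s is "", so anchors without an href become empty strings here
  strings = attributes.map { |attr| attr.to_s }
  # drop duplicates, order alphabetically, then discard the empty strings
  strings.uniq.sort.delete_if { |href| href.empty? }
end

p tidy_hrefs(raw_attributes)
```

This is why the chain ends with delete_if: calling to_s on a missing attribute produces an empty string rather than an error, and those empty entries would otherwise end up in the result.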

Of course, we would not want to repeat this procedure in the console every time, so we can save it as a method in our class, like the following:

# method that will get all links using Nokogiri
# (needs require 'nokogiri' and require 'open-uri')
def get_all_hrefs_nokogiri
  page = self.get_location()
  doc = Nokogiri::HTML(URI.open(page))
  links = doc.css('a')

  hrefs = links.map { |link| link.attribute('href').to_s }.uniq.sort.delete_if { |href| href.empty? }
  return hrefs
end

# get all links without using Nokogiri
def get_all_hrefs
  hrefs = []
  self.get_xpath_count('//a').to_i.times do |i|
    if self.is_element_present("document.links[#{i}]")
      hrefs << self.get_attribute("document.links[#{i}]@href")
    end
  end
  return hrefs
end

Categories: ruby

Convert XML to CSV with Nokogiri Ruby gem

August 14, 2009

Once upon a time, in the exciting world of software testing… the Exist QA team had been using Testlink 1.8.3 as their open-source test management tool. They were happy, and it served them well, until their client requested a copy of the testcases with complete details in Excel format. Doomed! Testlink only generates test specifications in HTML, OpenOffice Writer and MS Word, but unfortunately not in Excel.

But just like a princess with a prince charming… along came Nokogiri ("saw" in Japanese), a Ruby gem that is an HTML, XML, SAX and Reader parser. It supports document searching via XPath and CSS3 selectors. Not to mention FasterCSV, another Ruby gem, which provides a complete interface to CSV files and data. It offers tools to read from and write to Strings or IO objects, as needed.

First, they installed these precious gems on their Windows machine by executing the following commands:

gem install nokogiri
gem install fastercsv

With these tools, Exist QA carefully planned a plot to solve their problem. Since Testlink can export a testsuite together with its testcases in XML format, they took advantage of this and passed the exported file as input to their Testlink parser code in Ruby. Here's their gameplan:

require 'rubygems'
require 'nokogiri'
require 'fastercsv'

FIELDS = %w{Testsuite ID Name Summary Steps Expected_Result }

# build one CSV row for a testcase and append it to the output
def new_testcase(csv, suite, id, name, summary, steps, expectedresult)
  testcases = []
  testcases << suite
  testcases << "GPC - #{id}"
  testcases << name
  testcases << summary
  testcases << steps
  testcases << expectedresult
  csv << FasterCSV::Row.new(FIELDS, testcases)
end

# ARGV[0] is the XML input file, ARGV[1] is the CSV output file
csv = FasterCSV.open(ARGV[1], "w")
csv << FIELDS

doc = Nokogiri::XML(open(ARGV[0]))
doc.xpath('//testsuite').each do |tsuite|
  puts "#{tsuite.attribute('name')}\n"
  # only the testcases that belong directly to this testsuite
  tsuite.xpath('./testcase').each do |tcase|
    new_testcase(csv,
                 tsuite.attribute('name'),
                 tcase.css('externalid').inner_text,
                 tcase.attribute('name'),
                 tcase.css('summary').inner_text,
                 tcase.css('steps').inner_text,
                 tcase.css('expectedresults').inner_text)
  end
end
csv.close

All they need to do is run the program from the console in this format:

ruby <filename> "<input>" "<output>"

where filename is the name of the Testlink parser script, input is the XML file exported from Testlink, and output is the CSV file where the parsed data will be saved.

ruby tlparser.rb "test.xml" "test.csv"

Nokogiri and FasterCSV save the day! Now they can provide the testcase report in no time, every time their client requests it. And Exist QA lives happily ever after…
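
To try the same testsuite-to-rows flow without a Testlink export or either gem, here is a minimal sketch that swaps in REXML and CSV, which ship with Ruby, in place of Nokogiri and FasterCSV. The element names follow the parser above, but the sample document is invented and much simpler than a real export, and the helper names child_text and testsuite_to_csv are hypothetical.

```ruby
require 'rexml/document'
require 'csv'

FIELDS = %w[Testsuite ID Name Summary Steps Expected_Result]

# Invented, simplified sample; a real Testlink export carries more fields.
SAMPLE_XML = <<~XML
  <testsuite name="Login">
    <testcase name="Valid login">
      <externalid>1</externalid>
      <summary>User can log in</summary>
      <steps>Enter valid credentials</steps>
      <expectedresults>Home page is shown</expectedresults>
    </testcase>
    <testcase name="Invalid login">
      <externalid>2</externalid>
      <summary>User is rejected</summary>
      <steps>Enter a wrong password</steps>
      <expectedresults>Error message is shown</expectedresults>
    </testcase>
  </testsuite>
XML

# Text of the first child element named tag, or "" when it is missing.
def child_text(node, tag)
  child = node.elements[tag]
  child ? child.text.to_s.strip : ''
end

# Returns the CSV as a String: one header row, then one row per testcase.
def testsuite_to_csv(xml)
  doc = REXML::Document.new(xml)
  CSV.generate do |csv|
    csv << FIELDS
    doc.elements.each('//testsuite') do |suite|
      # only the testcases that are direct children of this suite
      suite.elements.each('testcase') do |tcase|
        csv << [
          suite.attributes['name'],
          "GPC - #{child_text(tcase, 'externalid')}",
          tcase.attributes['name'],
          child_text(tcase, 'summary'),
          child_text(tcase, 'steps'),
          child_text(tcase, 'expectedresults')
        ]
      end
    end
  end
end

puts testsuite_to_csv(SAMPLE_XML)
```

Reading each testcase relative to its own testsuite keeps every row paired with the right suite name. On Ruby 1.9 and later, FasterCSV's code is what the standard csv library provides, which is why CSV is used here.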

Categories: ruby