I'm quite new to Nokogiri and Ruby and seeking a little help.
I am parsing a very large XML file using class MyDoc < Nokogiri::XML::SAX::Document. Now I want to traverse the inner part of a block.
Here's the format of my XML file:
<Content id="83087">
  <Title></Title>
  <PublisherEntity id="1067">eBooksLib</PublisherEntity>
  <Publisher>eBooksLib</Publisher>
  ......
</Content>
I can already tell when the "Content" tag is found. Now I want to know how to traverse inside it. Here's my shortened code:
class MyDoc < Nokogiri::XML::SAX::Document
  # check the start element. set flag for each element
  def start_element name, attrs = []
    if(name == 'Content')
      # get the <Title>
      # get the <PublisherEntity>
      # get the Publisher
    end
  end

  def cdata_block(string)
    characters(string)
  end

  def characters(str)
    puts str
  end
end
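It's trickier to do with SAX, because the parser only reports events (element start, element end, text) and never builds a tree you can walk, so you have to keep track of where you are yourself. I think the solution will need to look something like this: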
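class MyDoc < Nokogiri::XML::SAX::Document
  WANTED = %w[Title PublisherEntity Publisher].freeze

  def start_element(name, attrs = [])
    # Remember that we are inside a <Content> block and which element is open.
    @inside_content = true if name == 'Content'
    @current_element = name
  end

  def end_element(name)
    @inside_content = false if name == 'Content'
    # Reset so whitespace between elements is not attributed to a tag.
    @current_element = nil
  end

  def characters(str)
    # SAX may deliver one text node in several chunks, so this can fire more than once per element.
    return unless @inside_content && WANTED.include?(@current_element)
    puts "#{@current_element} - #{str}"
  end
end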