I have an XML file with this header:
<?xml version="1.0" encoding="utf-16"?>
and its root element is:
<transmission xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
When I use the SAX parser, the file will not parse. When I manually remove the encoding declaration and the attributes on the transmission element, parsing succeeds. Because the file is large, I can only use SAX. Is there a way to parse this XML file without manually removing the encoding and the transmission attributes?
Sample code:
require 'nokogiri'
include Nokogiri

# SAX handler that prints the name of every element as it opens and closes.
class P < Nokogiri::XML::SAX::Document
  def initialize
  end

  def start_element(element, attributes = [])
    puts element
  end

  # CDATA sections and text nodes are ignored.
  def cdata_block(string)
  end

  def characters(string)
  end

  def end_element(element)
    puts element
  end
end

parser = Nokogiri::XML::SAX::Parser.new(P.new)
parser.parse_file('file_dummy.xml')
After a lot of searching, I found a solution. It is based on the answer from @thetinman, although I have not fully understood it. I used a sed command to replace utf-16 with utf-8 in the file and then parsed it. The sed step is needed because Nokogiri has trouble with the utf-16 encoding declaration.
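The sed step could also be done in Ruby before parsing. This is only a minimal sketch, not the code from @thetinman's answer: `rewrite_declared_encoding` is a hypothetical helper name, and it assumes the bytes on disk are really UTF-8/ASCII and only the declaration says utf-16 (presumably the situation here, since a plain-text sed substitution would not match in a genuinely UTF-16 file). It streams the file in chunks, so the whole document is never loaded into memory.

```ruby
# Hypothetical helper: copy src to dst, changing only the encoding named in
# the XML declaration on the first line. The rest of the file is copied
# byte for byte in fixed-size chunks.
def rewrite_declared_encoding(src, dst, from: 'utf-16', to: 'utf-8')
  File.open(src, 'rb') do |input|
    File.open(dst, 'wb') do |output|
      first_line = input.gets
      return if first_line.nil? # empty input: nothing to rewrite

      pattern = /(encoding=["'])#{Regexp.escape(from)}(["'])/i
      output.write(first_line.sub(pattern) { "#{$1}#{to}#{$2}" })

      # Copy the remainder without touching it.
      while (chunk = input.read(64 * 1024))
        output.write(chunk)
      end
    end
  end
end
```

The rewritten copy can then be passed to parser.parse_file in place of the original file, as in the sample code above.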