How to use libxml2 to parse dirty HTML in C
The HTML may be dirty,
for example with a premature end of data in a tag.
How can I parse it? Thanks.
Solution
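Use the libxml2 HTML parser (declared in libxml/HTMLparser.h). It recovers from "dirty" HTML and builds a normalized document tree.
See htmlDocPtr htmlParseFile(const char * filename, const char * encoding). Pass NULL as the encoding to let the parser detect it, and free the returned document with xmlFreeDoc() when you are done.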