Parser for Wikipedia


I downloaded a Wikipedia dump and want to convert the wiki markup into my own object format. Is there a wiki parser available that converts the markup into XML?


Solution

  • See java-wikipedia-parser. I have never used it, but according to the docs:

    The parser comes with an HTML generator. You can however control the output that is being generated by passing your own implementation of the be.devijver.wikipedia.Visitor interface.
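
    To illustrate the idea of swapping the HTML generator for your own output, here is a minimal, self-contained sketch of the visitor approach. It does not use java-wikipedia-parser: the MarkupVisitor interface, its method names, and the tiny line-based parser are hypothetical stand-ins, and the real be.devijver.wikipedia.Visitor interface will have a different set of callbacks, so check the library's docs before adapting this. The sketch only recognizes headings (== Title ==) and plain paragraphs.

    ```java
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class WikiVisitorSketch {

        /** Hypothetical callback interface; NOT the library's Visitor. */
        interface MarkupVisitor {
            void startDocument();
            void heading(int level, String text);
            void paragraph(String text);
            void endDocument();
        }

        /** A visitor that emits XML instead of HTML. */
        static class XmlVisitor implements MarkupVisitor {
            private final StringBuilder out = new StringBuilder();

            public void startDocument() { out.append("<page>"); }

            public void heading(int level, String text) {
                out.append("<heading level=\"").append(level).append("\">")
                   .append(escape(text)).append("</heading>");
            }

            public void paragraph(String text) {
                out.append("<paragraph>").append(escape(text)).append("</paragraph>");
            }

            public void endDocument() { out.append("</page>"); }

            String result() { return out.toString(); }

            // Escape the characters that are unsafe in XML text content.
            private static String escape(String s) {
                return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
            }
        }

        // Matches lines such as "== History ==": same number of '=' on both sides.
        private static final Pattern HEADING =
                Pattern.compile("^(={1,6})\\s*(.+?)\\s*\\1$");

        /** Toy parser (assumption): one callback per non-empty line. */
        static void parse(String wikiText, MarkupVisitor visitor) {
            visitor.startDocument();
            for (String line : wikiText.split("\\R")) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue;
                }
                Matcher m = HEADING.matcher(trimmed);
                if (m.matches()) {
                    visitor.heading(m.group(1).length(), m.group(2));
                } else {
                    visitor.paragraph(trimmed);
                }
            }
            visitor.endDocument();
        }

        /** Convenience wrapper: parse wiki text and return the generated XML. */
        public static String toXml(String wikiText) {
            XmlVisitor visitor = new XmlVisitor();
            parse(wikiText, visitor);
            return visitor.result();
        }

        public static void main(String[] args) {
            System.out.println(toXml("== History ==\nTom & Jerry\n"));
        }
    }
    ```

    With the real library you would presumably implement its Visitor interface in the same spirit, appending XML elements (or building your own objects) in each callback instead of HTML.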