I've tried to convert bz2 to text with "Wikipedia Extractor(https://github.com/attardi/wikiextractor). I've downloaded wikipedia dump with bz2 extension then on command line used this line of code:
python Wikiextractor.py -b 85M -o extracted D:\wikiextractor-master\wikiextractor\zhwiki-latest-pages-articles.xml.bz2
After finishing preprocessing the pages, I came out with error like this: enter image description here
How can I fix this?
I encountered this problem. Likely caused by the StringIO issue with Windows. I re-run it on Windows Subsystem for Linux (WSL) and it went well.