java, json, base64, apache-poi, doc

Word Doc File to JSON


I need to parse a document in MS Word format and convert it to JSON (either directly or via an intermediate XML step). How would such parsing and conversion handle images embedded in the Word document, and how can those images be represented in JSON? Any pointers or a demo example would be appreciated.

I am thinking of using Apache POI as the parser and a custom Java class to build the JSON string.

Is there a readily available tool for this kind of parsing and conversion?


Solution

  • Try encoding the whole MS Word document as Base64 (a sequence of ASCII characters) and sending it as a string inside JSON or XML. The receiver can then decode it, and the resulting document should be byte-for-byte identical to the original, embedded images included.
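
The Base64 round trip described above might look roughly like the sketch below. It uses only the JDK (`java.util.Base64`, `java.nio.file.Files`). The class name `DocBase64Json`, the JSON field names `content` and `fileName`, and the hand-rolled JSON handling are all assumptions made for illustration; a real project would more likely use a JSON library such as Jackson or Gson.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper: wraps a document's raw bytes in a small JSON object.
public class DocBase64Json {

    // Assumed field name; Base64 output never contains '"' or '\',
    // so the encoded value needs no JSON escaping.
    private static final String CONTENT_KEY = "\"content\":\"";

    // Encode the document bytes as Base64 and build the JSON string.
    static String toJson(String fileName, byte[] docBytes) {
        String encoded = Base64.getEncoder().encodeToString(docBytes);
        // Minimal escaping for the file name (backslash and double quote only).
        String safeName = fileName.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{" + CONTENT_KEY + encoded + "\",\"fileName\":\"" + safeName + "\"}";
    }

    // Naive extraction of the "content" value, then Base64 decoding.
    // Only meant to read JSON produced by toJson above.
    static byte[] fromJson(String json) {
        int start = json.indexOf(CONTENT_KEY);
        if (start < 0) {
            throw new IllegalArgumentException("no content field found");
        }
        start += CONTENT_KEY.length();
        int end = json.indexOf('"', start);
        if (end < 0) {
            throw new IllegalArgumentException("unterminated content field");
        }
        return Base64.getDecoder().decode(json.substring(start, end));
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java DocBase64Json <path-to-word-file>");
            return;
        }
        Path in = Paths.get(args[0]);
        byte[] original = Files.readAllBytes(in);

        String json = toJson(in.getFileName().toString(), original);
        byte[] restored = fromJson(json);

        // The decoded bytes should match the original file exactly.
        System.out.println("Round trip identical: " + Arrays.equals(original, restored));
    }
}
```

Note that this treats the document as an opaque blob: the JSON carries the file, not its parsed text or structure, and the Base64 string is about a third larger than the original bytes.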