I'm working on an application that has to deal with docx files. I know that a docx file is just a zip archive containing XML, images, and other files.
My application would have to:
Import docx files and store their representation (the text, but also everything related to presentation, such as styles and fonts) in a database.
Provide a way to modify the text of each sentence on a web page.
Export the docx file with the new text while preserving the style and presentation.
The hard part is that I have to support nested tags. For instance, a tag that contains a sentence can also contain tags that make a single word bold.
I do not have any requirements on the database. It can be anything.
My question is more about how to represent the data and meet these requirements than about how to parse XML.
Thanks!
The question is not an easy one.
Here is a related question I answered: Creating RTF, DOC, or DOCX in iOS
After you read that, here is a real-world example:
<w:p w:rsidP="00CA7135" w:rsidR="00137C91" w:rsidRDefault="00137C91">
<w:r>
<w:t>Hello</w:t>
</w:r>
<w:r w:rsidR="008C194D">
<w:t xml:space="preserve"/>
</w:r>
<w:r>
<w:t>My name</w:t>
</w:r>
</w:p>
<w:p w:rsidP="00CA7135" w:rsidR="008C194D" w:rsidRDefault="00137C91">
<w:r>
<w:t xml:space="preserve">is</w:t>
</w:r>
<w:r w:rsidR="008C194D" w:rsidRPr="00E92392">
<w:rPr>
<w:b/>
</w:rPr>
<w:t xml:space="preserve">John Doe</w:t>
</w:r>
<w:proofErr w:type="spellStart"/>
<w:r w:rsidR="008C194D" w:rsidRPr="00E92392">
<w:rPr>
<w:b/>
</w:rPr>
<w:t/>
</w:r>
<w:proofErr w:type="spellEnd"/>
<w:r w:rsidR="008C194D" w:rsidRPr="00E92392">
<w:rPr>
<w:b/>
</w:rPr>
<w:t xml:space="preserve"/>
</w:r>
<w:r w:rsidR="008C194D">
<w:t xml:space="preserve"/>
</w:r>
<w:r>
<w:t>I want to</w:t>
</w:r>
<w:r w:rsidR="008C194D">
<w:t xml:space="preserve"/>
</w:r>
<w:r>
<w:t>show</w:t>
</w:r>
<w:r w:rsidR="00E92392">
<w:t xml:space="preserve">how difficult it is</w:t>
</w:r>
</w:p>
As you can see, the text of a paragraph is never stored in one piece; it is split across several <w:r> runs.
Answer to your questions:
Collect the text from the <w:t> tags and group it by the enclosing <w:p> tags. For example, 'Hello' and 'My name' are in the same <w:p>. You would then need a way to know where the text has been changed, and write the new text into the right <w:t>.
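The grouping described above could be sketched roughly as follows in Python, using only the standard library. This is a minimal sketch under several assumptions, not a complete solution: the sample document is a cut-down version of the XML above, the helper names `extract_runs` and `set_run_text` are made up for illustration, each run is assumed to hold at most one <w:t>, and bold is detected only by the presence of <w:b/>. Each record keeps the paragraph index and run index, which is the kind of position information you could store in the database so that edited text can be written back to the right <w:t>.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in document.xml).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Cut-down stand-in for word/document.xml, for illustration only.
SAMPLE = f"""<w:document xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:r><w:t>Hello</w:t></w:r>
      <w:r><w:t xml:space="preserve"> </w:t></w:r>
      <w:r><w:t>My name</w:t></w:r>
    </w:p>
    <w:p>
      <w:r><w:t xml:space="preserve">is </w:t></w:r>
      <w:r><w:rPr><w:b/></w:rPr><w:t>John Doe</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""


def extract_runs(root):
    """Return one list per <w:p>; each item describes one run that has text."""
    paragraphs = []
    for p_index, p in enumerate(root.iter(f"{{{W}}}p")):
        runs = []
        for r_index, r in enumerate(p.findall("w:r", NS)):
            t = r.find("w:t", NS)
            if t is None:
                continue  # run without text (e.g. only a break or a drawing)
            # Simplification: treat any <w:b/> in the run properties as bold.
            bold = r.find("w:rPr/w:b", NS) is not None
            runs.append({"p": p_index, "r": r_index,
                         "text": t.text or "", "bold": bold})
        paragraphs.append(runs)
    return paragraphs


def set_run_text(root, p_index, r_index, new_text):
    """Write edited text back into the <w:t> of one specific run."""
    p = list(root.iter(f"{{{W}}}p"))[p_index]
    r = p.findall("w:r", NS)[r_index]
    r.find("w:t", NS).text = new_text


root = ET.fromstring(SAMPLE)
paragraphs = extract_runs(root)
for runs in paragraphs:
    # Joining the runs of one paragraph gives the text a user would see.
    print("".join(run["text"] for run in runs))

# Change only the bold run; its <w:rPr> formatting is left untouched.
set_run_text(root, 1, 1, "Jane Doe")
```

Because only the text of the <w:t> is replaced, the run properties and every other element stay as they were, which is what preserves the presentation on export. In a real docx you would read and rewrite word/document.xml inside the zip archive instead of the inline string used here.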