xml xmlstarlet xmllint

Omit XML entities when formatting


I have an XML file with broken indentation and a lot of extra whitespace. It also contains entities, such as the one for LF (line feed).


I want to format and reindent the file for readability. I have tried xmllint and xmlstarlet, but both of them replace those entities with the characters they represent, so the entities no longer appear in the formatted document.

How can I format my XML while leaving those entities untouched?


Solution

  • Found a solution, for everyone arriving here:

    We can use the tidy utility. On Debian or Ubuntu Linux, install and run it like this:

    sudo apt-get install tidy
    tidy -o output.xml --preserve-entities yes -xml input.xml
    

    Some of these options may not be needed in your case, so experiment with them until the output meets your requirements. The full documentation is here: http://tidy.sourceforge.net/docs/tidy_man.html

    The most important option is --preserve-entities yes, which tells tidy to keep well-formed entities as they appear in the input.
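
To try the command above on a small file before running it on real data, something like the following sketch may help. The sample content and the file names input.xml and output.xml are made up for illustration, and the script assumes a tidy build that supports --preserve-entities. The -i (indent) and -q (quiet) flags are additions not used in the answer.

```shell
#!/bin/sh
# Create a small, badly indented sample file (hypothetical content).
# The quoted 'EOF' stops the shell from altering the &#10; reference.
cat > input.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<root><item>first line&#10;second line</item>
        <item>other</item></root>
EOF

if command -v tidy >/dev/null 2>&1; then
    # -xml: parse the input as XML; -i: indent; -q: suppress non-essential output
    tidy -q -xml -i --preserve-entities yes -o output.xml input.xml
    # Print how many output lines still contain the character reference
    grep -c '&#10;' output.xml
else
    echo "tidy is not installed; skipping the formatting step" >&2
fi
```

If the count printed by grep is zero, the reference was replaced, and the installed tidy version may not handle the option as described here.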