
How to remove indentation from an XML string?


I have an XML string. I have replaced the newlines, but I can't remove the indentation whitespace from it.

  <person id="13">
      <name>
          <first>John</first>
          <last>Doe</last>
      </name>
      <age>42</age>
      <Married>false</Married>
      <City>Hanga Roa</City>
      <State>Easter Island</State>
      <!-- Need more details. -->
  </person>

How can I remove the indentation whitespace from an XML string in Go?

I want the XML as a single-line string like this:

<person id="13"><name><first>John</first><last>Doe</last></name><age>42</age><Married>false</Married><City>Hanga Roa</City><State>Easter Island</State><!-- Need more details. --></person>

How can I do this in Go?


Solution

  • Remove Whitespace-Only Sequences Between XML Tags

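    // unformatXMLRegEx matches whitespace-only runs between a closing '>' and the next '<'.
    // It is compiled once at package level rather than on every call.
    var unformatXMLRegEx = regexp.MustCompile(`>\s+<`)

    func unformatXML(xmlString string) string {
        unformatBetweenTags := unformatXMLRegEx.ReplaceAllString(xmlString, "><") // remove whitespace between XML tags
        return strings.TrimSpace(unformatBetweenTags) // remove whitespace before and after XML
    }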
    

    RegEx Explanation

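    > and < - match the literal characters that end one tag and begin the next

    \s - matches any whitespace character: tab, newline, form feed, carriage return, or space

    + - matches one or more of the preceding element, here the whitespace character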

    RegEx syntax reference: https://golang.org/pkg/regexp/syntax/
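
    To make the explanation above concrete, here is a small sketch of what the pattern does and does not touch. The helper name collapseBetweenTags is made up for this illustration and is not part of the answer's code. Note the last case: the pattern treats the input as plain text, so a "> <" sequence inside a comment is rewritten too.

```go
package main

import (
	"fmt"
	"regexp"
)

// betweenTags matches a '>' and a '<' separated only by whitespace.
var betweenTags = regexp.MustCompile(`>\s+<`)

// collapseBetweenTags is a hypothetical helper used only for this demonstration.
func collapseBetweenTags(s string) string {
	return betweenTags.ReplaceAllString(s, "><")
}

func main() {
	// Tabs, newlines and carriage returns between tags are removed.
	fmt.Println(collapseBetweenTags("<a>\n\t<b>x</b>\r\n</a>")) // <a><b>x</b></a>

	// Whitespace inside text content is kept, because it is not
	// bounded by '>' on the left and '<' on the right.
	fmt.Println(collapseBetweenTags("<b>  John  Doe  </b>")) // <b>  John  Doe  </b>

	// Limitation: the regex does not know about XML structure, so a
	// "> <" sequence inside a comment is also collapsed.
	fmt.Println(collapseBetweenTags("<!-- a > < b -->")) // <!-- a >< b -->
}
```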

    Example

    package main
    
    import (
        "fmt"
        "regexp"
        "strings"
    )
    
    func main() {
        var s = `    
    <person id="13">
        <name>
            <first>John</first>
            <last>Doe</last>
        </name>
        <age>42</age>
        <Married>false</Married>
        <City>Hanga Roa</City>
        <State>Easter Island</State>
        <!-- Need more details. -->
    </person>   `
    
        s = unformatXML(s)
        fmt.Printf("'%s'\n", s) // single quotes used to confirm no leading or trailing whitespace
    }
    
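    // unformatXMLRegEx matches whitespace-only runs between a closing '>' and the next '<'.
    // It is compiled once at package level rather than on every call.
    var unformatXMLRegEx = regexp.MustCompile(`>\s+<`)

    func unformatXML(xmlString string) string {
        unformatBetweenTags := unformatXMLRegEx.ReplaceAllString(xmlString, "><") // remove whitespace between XML tags
        return strings.TrimSpace(unformatBetweenTags) // remove whitespace before and after XML
    }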
    

    Runnable Example in Go Playground

    https://play.golang.org/p/VS1LRNevicz
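
    If the input might contain comments or CDATA sections where a "> <" sequence must survive, a different technique may be worth considering: reading the document as tokens with the standard encoding/xml package and dropping whitespace-only text. The following is only a sketch of that alternative, not the approach used in the answer above. The name compactXML is invented for this example. The sketch re-serializes the document, so the output is not guaranteed to match the input byte for byte: CDATA sections come back as escaped text, and namespaced documents may be written differently.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// compactXML is a hypothetical helper: it copies every XML token from the
// input to the output, skipping character data that is only whitespace.
func compactXML(in string) (string, error) {
	dec := xml.NewDecoder(strings.NewReader(in))
	var out bytes.Buffer
	enc := xml.NewEncoder(&out) // no Indent call, so nothing is added between tokens

	for {
		tok, err := dec.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", err
		}
		// Indentation shows up as whitespace-only character data; drop it.
		if cd, ok := tok.(xml.CharData); ok && len(bytes.TrimSpace(cd)) == 0 {
			continue
		}
		if err := enc.EncodeToken(tok); err != nil {
			return "", err
		}
	}
	if err := enc.Flush(); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	s, err := compactXML("  <person id=\"13\">\n    <age>42</age>\n    <!-- a > < b -->\n  </person>  ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("'%s'\n", s)
}
```

    Unlike the regex version, this one returns an error for malformed input, which may or may not be what you want when the string is only being logged or compared.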