javascript regex template-literals

How to deal with extraneous whitespace in template literals


How do you process a multi-line template-literal string to remove the whitespace that is introduced by indenting it to match the surrounding JS code?

I have used a regex replace to get rid of the leading whitespace in my first test string. But what if the string is HTML with an indentation structure that I would like to preserve?

//problem: includes a lot of white space


    if(true){
        const aString = `This string
                        is multiline
                        and indented to match
                        the surrounding 
                        JS code`
        console.log(aString)
    }


//what I have tried

    if(true){
        const aString = `This string
                        is multiline
                        and indented to match
                        the surrounding 
                        JS code`
        
        const newString = aString.replace(/\n[ \t]+/g, " \n")
        console.log(newString)
    }
    
//but what about this

    if(true){
        const aString = `<div class="hello">
                            <div class="inner">
                               <span>Some text</span>
                            </div>
                         </div>`
        
        const newString = aString.replace(/\n[ \t]+/g, " \n")
        console.log(newString)
    }

I would like it to print:

<div class="hello">
   <div class="inner">
      <span>Some text</span>
   </div>
</div>

and not:

<div class="hello">
<div class="inner">
<span>Some text</span>
</div>
</div>

Solution

  • The common-tags package has a stripIndent tag that does this for you. However, the string must start on the line after the opening backtick for this to work:

    import { stripIndent } from 'common-tags';

    const aString = stripIndent`
                                <div class="hello">
                                  <div class="inner">
                                    <span>Some text</span>
                                  </div>
                                </div>
    `;
    
    

    In general, this cannot be done with a single simple regex. You need to count the leading spaces on each line, find the smallest count, and then remove exactly that many characters from the start of every line.

    A simple implementation is this:

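    function removeIndent(str) {
      const lines = str.split('\n');
      // Only handle strings whose first and last lines are blank (whitespace-only counts as blank)
      if (lines[0].trim() !== '' || lines[lines.length - 1].trim() !== '') {
        return str;
      }
      lines.shift();
      lines.pop();

      // Count leading spaces, skipping blank lines so they do not force the minimum to 0
      const lens = lines.filter(l => l.trim() !== '').map(l => l.match(/^ */)[0].length);
      const minLen = lens.length ? Math.min(...lens) : 0;
      // slice replaces the deprecated substr
      return '\n' + lines.map(l => l.slice(minLen)).join('\n') + '\n';
    }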
    
    const inStr = `
       foo
         bar
       baz
    `;
    
    const outStr = removeIndent(inStr);
    
    console.log(inStr);
    console.log(outStr);
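
The same minimum-indent idea can probably be adapted to the exact form used in the question, where the string starts on the same line as the opening backtick. The sketch below is only an illustration, not part of common-tags: the name dedentAfterFirstLine is made up, it leaves the first line untouched, and it assumes indentation is consistent (it counts characters, so mixing tabs and spaces would give odd results).

```javascript
// Hypothetical helper (name invented for this example): strips the common
// indentation from every line after the first, leaving the first line as-is.
function dedentAfterFirstLine(str) {
  const [first, ...rest] = str.split('\n');

  // Measure leading spaces/tabs on non-blank lines only.
  const indents = rest
    .filter(line => line.trim() !== '')
    .map(line => line.match(/^[ \t]*/)[0].length);

  // With no further lines there is nothing to strip.
  const minIndent = indents.length ? Math.min(...indents) : 0;

  return [first, ...rest.map(line => line.slice(minIndent))].join('\n');
}

const html = `<div class="hello">
                <div class="inner">
                   <span>Some text</span>
                </div>
             </div>`;

console.log(dedentAfterFirstLine(html));
```

Because the closing tag is the least-indented continuation line, it ends up flush left and the inner lines keep their relative indentation (3 and 6 spaces here).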