Bash: unusual line break


I wrote a small Bash script that reads some HTML and should print the href of a link:

#!/bin/bash

link=$(echo $source | sed -ne 's#^.*<a href="\([^"]*\)".*$#\1#p')

  if [ "$(echo "$link" | grep '/fonts/list/style')" ]
    then
      echo "http://www.domain.com$link/10000"
  fi

In my example, the variable source contains:

<li><span>19</span><a href="/fonts/list/style/home words">linktext</a></li>

The problem: the script does not print

http://www.domain.com/fonts/list/style/home words/10000

Instead, it prints

http://www.domain.com/fonts/list/style/home
words/10000

How can I remove or avoid this line break?


Solution

  • You have to escape the double quotes (") that appear inside the <li>... string when you assign it to source:

    This worked for me:

    #!/bin/bash
    
    source="<li><span>19</span><a href=\"/fonts/list/style/home words\">linktext</a></li>"
    
    # Quote "$source" so the shell passes the string through unsplit
    link=$(echo "$source" | sed -ne 's#^.*<a href="\([^"]*\)".*$#\1#p')
    
      if [ "$(echo "$link" | grep '/fonts/list/style')" ]
        then
          echo "http://www.domain.com$link/10000"
      fi
    

    Output

    http://www.domain.com/fonts/list/style/home words/10000
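
As a variation on the answer above, here is a minimal sketch that avoids the backslashes by putting the HTML in single quotes, and quotes every expansion so that the space in "home words" is never subject to word splitting. The function name extract_href and the variable url are hypothetical names introduced for this illustration; the sed expression is the one from the question.

```bash
#!/bin/bash

# Single quotes keep the inner double quotes literal, so no escaping is needed.
source='<li><span>19</span><a href="/fonts/list/style/home words">linktext</a></li>'

# extract_href is a hypothetical helper (not part of the original script).
# It prints the href captured by the sed expression from the question,
# or nothing if the line contains no matching <a href="...">.
extract_href() {
  printf '%s\n' "$1" | sed -ne 's#^.*<a href="\([^"]*\)".*$#\1#p'
}

# Quoting "$source" passes the whole string as one argument.
link=$(extract_href "$source")

# A case pattern replaces the echo | grep test and needs no extra process.
case $link in
  */fonts/list/style*)
    url="http://www.domain.com$link/10000"
    echo "$url"
    ;;
esac
```

With the sample input this prints the URL on a single line, the same as the output shown above. Single quotes only work when the HTML itself contains no single quote; otherwise escaping the double quotes, as in the answer, remains the simpler choice.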