How to use grep to find the value of an HTML tag attribute


I want to analyze some aspects of my web page. For example, I want to see the values of all alt attributes. To test this, I created a simple HTML file that contains a few alt attributes. Let's say the attributes in the code are:

alt='Text-01'
alt='Text 02'
alt=''
alt='Some long text'

Then I tried this command:

grep -o "alt='*'" my-page.html

The output is:

alt='
alt='
alt=''
alt='

I expected to see output like this:

Text-01
Text 02
empty line or alt=''
Some long text

or this one:

alt='Text-01'
alt='Text 02'
alt=''
alt='Some long text'

How can I achieve that?


Solution

  • If you know for certain that the value of alt is enclosed in single quotes, you could use:

    grep -o "alt='[^']*'" file
    

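    This searches for alt=, followed by a single quote, followed by any number of characters that are not single quotes, and finally a terminating single quote. Your original pattern failed because * in a regular expression is not a shell-style wildcard: it repeats the preceding character, so '* means "zero or more single quotes", and the match stops at the first quote.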
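
To get only the values (the first output format in the question), one possible approach is to pipe the matches through cut and keep the text between the quotes. The following is a minimal sketch, not the only way to do it: the sample file contents and the img/src markup are invented for illustration, and it still assumes every alt value is enclosed in single quotes.

```shell
# Build a small sample page (hypothetical content, for illustration only).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<img src="a.png" alt='Text-01'>
<img src="b.png" alt='Text 02'>
<img src="c.png" alt=''>
<img src="d.png" alt='Some long text'>
EOF

# grep -o prints only the matched part of each line, e.g. alt='Text-01'.
# cut splits each match on the single quote and keeps field 2, the value.
values=$(grep -o "alt='[^']*'" "$tmp" | cut -d"'" -f2)

printf '%s\n' "$values"
# Prints:
# Text-01
# Text 02
#
# Some long text

rm -f "$tmp"
```

An empty alt='' yields an empty line, because field 2 between the two quotes is empty. If you want the second output format instead, drop the cut stage and use the grep command on its own.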