bash curl tags extract wget

How to extract the source of a webpage without tags using bash?


We can download the source of a page with wget or curl, but I want the content of the page without the HTML tags. In other words, I want to extract it as plain text.


Solution

  • You can pipe the download through a simple sed command that deletes everything between < and >:

    curl www.gnu.org | sed 's/<\/*[^>]*>//g'
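
Since sed works line by line, the command above leaves behind any tag whose opening < and closing > sit on different lines. A minimal sketch of a variant that also handles those, assuming GNU sed, could look like the block below. The function name strip_tags is made up for illustration and is not part of the original answer.

```shell
#!/usr/bin/env bash

# strip_tags: hypothetical helper name (an assumption, not from the answer).
# Reads HTML on stdin and writes the text with tags removed to stdout.
strip_tags() {
    # :a            label marking the start of the loop
    # s/<[^>]*>//g  delete every complete tag in the pattern space
    # /</N          if a "<" is still left, append the next input line
    # //ba          reuse the last regex (/</) and loop while it matches
    sed -e :a -e 's/<[^>]*>//g;/</N;//ba'
}
```

It can then be used the same way as the one-liner, for example with curl's silent flag or with wget writing to stdout:

    curl -s www.gnu.org | strip_tags
    wget -qO- www.gnu.org | strip_tags

This is still a plain regex filter, not an HTML parser: the contents of script and style elements stay in the output, and entities such as &amp; are not decoded.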