I have raw html with some css classes inside for various tags.
Example:
Input:
<p class="opener" itemprop="description">Lorem ipsum dolor sit amet, consectetur adipisicing elit. Neque molestias natus iste labore a accusamus dolorum vel.</p>
and I would like to get just plain html like:
Output:
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit. Neque molestias natus iste labore a accusamus dolorum vel.</p>
I do not know names of these classes. I need to do this in JavaScript (node.js).
Any idea?
This can be done with Cheerio, as I noted in the comments.
To remove all attributes on all elements, you'd do:
var html = '<p class="opener" itemprop="description">Lorem ipsum dolor sit amet, consectetur adipisicing elit. Neque molestias natus iste labore a accusamus dolorum vel.</p>';
var $ = cheerio.load(html); // load the HTML
$('*').each(function() { // iterate over all elements
this.attribs = {}; // remove all attributes
});
var html = $.html(); // get the HTML back