Regex for removing consecutive character formatting tags

I need a regex to match and remove the consecutive character formatting tags that enclose the entire content of a paragraph tag, using Simple HTML DOM Parser.

Input:

<p><b><i>Lorem Ipsum Content</i></b></p>

Expected output: Lorem Ipsum Content

In the case below, the regex should match and remove only the one tag that encloses the entire paragraph content.

E.g. input: Text some more text text inside 

Output: Text some more text text inside 

Thanks.

Solution

It will look something like this:

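// $html is a simple_html_dom object, e.g. from str_get_html()
foreach ($html->find('p') as $p) {
  // Peel off one wrapping tag per pass. (\w+)\b captures only the tag name,
  // so tags with attributes still match; the s flag lets . span newlines.
  while (preg_match('/^\s*<(\w+)\b[^>]*>(.*)<\/\1>\s*$/s', $p->innertext, $m)) {
    $p->innertext = $m[2];
  }
}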

Note that the \1 in the regex is a backreference to the first capture group, so the closing tag has to carry the same tag name as the opening tag. It is probably not strictly necessary, but I added it as a bonus.
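
One caveat: a regex of this shape only checks that the paragraph starts with an opening tag and ends with a closing tag of the same name. It cannot tell whether those two tags are actually a pair, so content like <b>one</b> and <b>two</b> would also be stripped. Below is a rough sketch of a stricter check that works on the plain innertext string. The helper names strip_enclosing_tags and closes_at_end are made up for this illustration and are not part of Simple HTML DOM, and the sketch assumes reasonably well-formed markup.

```php
<?php
// Hypothetical helpers for illustration; neither is part of Simple HTML DOM.

// Returns true when the <$tag> tags inside $body are balanced, i.e. the
// wrapper's closing tag really is the last tag in the paragraph.
function closes_at_end(string $tag, string $body): bool
{
    $depth = 0;
    $pattern = '/<(\/?)' . preg_quote($tag, '/') . '\b[^>]*>/i';
    preg_match_all($pattern, $body, $tags, PREG_SET_ORDER);
    foreach ($tags as $t) {
        $depth += ($t[1] === '/') ? -1 : 1;
        if ($depth < 0) {
            return false; // a closing tag came first: the wrapper ended early
        }
    }
    return $depth === 0;
}

// Strips wrapping tags one layer at a time, stopping as soon as the outermost
// tag no longer encloses the whole string.
function strip_enclosing_tags(string $inner): string
{
    $pattern = '/^\s*<(\w+)\b[^>]*>(.*)<\/\1>\s*$/is';
    while (preg_match($pattern, $inner, $m) && closes_at_end($m[1], $m[2])) {
        $inner = $m[2];
    }
    return $inner;
}

echo strip_enclosing_tags('<b><i>Lorem Ipsum Content</i></b>'), "\n"; // Lorem Ipsum Content
echo strip_enclosing_tags('<b>one</b> and <b>two</b>'), "\n";         // left unchanged
```

Inside the foreach loop this could be used as $p->innertext = strip_enclosing_tags($p->innertext); in place of the while loop. If only character formatting tags should be removed, the (\w+) part could be narrowed to a fixed list such as (b|i|u|em|strong).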