Tags: php, html, regex, regex-group, regex-greedy

RegEx for HTML tag conversion


For some reason, I want to convert strings that contain

<p style=“text-align:center; others-style:value;”>Content</p>

to <center>Content</center> in PHP.

The text-align value can be left, right, or center. When the style attribute contains other declarations, I want to omit them.

How can I do that in PHP?

Edit:

Maybe I was not clear enough in my original question. What I mean is that I want content with text-align:center to be wrapped in <center>, and content with text-align:right to be wrapped in <right>. When there is no text-align styling, that element does not need any wrapping. Thank you.


Solution

  • You might use preg_replace to do so:

    Test 1:

    $test = preg_replace('/(<.*”>)(.*)(<\/.*)/s', '<center>$2</center>', '<p style=“text-align:center; others-style:value;”>Content</p>');
    
    var_dump($test);
    

    Output 1:

    It would return:

    string(24) "<center>Content</center>"
    

    RegEx 1:

    The RegEx divides your input into three capturing groups: the first group matches the opening p tag, the second group ($2) matches the content, and the third group matches the closing p tag.
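
    As a rough illustration of those three groups, the sketch below runs the same pattern through preg_match and prints each group. The variable names $input and $groups are only illustrative.

    ```php
    <?php
    // Sample input from the question (curly quotes kept as they appear there).
    $input = '<p style=“text-align:center; others-style:value;”>Content</p>';

    // $groups[0] holds the whole match; $groups[1] to $groups[3] hold the capturing groups.
    if (preg_match('/(<.*”>)(.*)(<\/.*)/s', $input, $groups)) {
        echo $groups[1], PHP_EOL; // the opening tag
        echo $groups[2], PHP_EOL; // Content
        echo $groups[3], PHP_EOL; // </p>
    }
    ```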

    RegEx 2:

    You can further expand it, if you wish, to cover other tags, quotation marks, and contents. The RegEx below accepts any tag whose opening tag ends with a quotation mark (" or ” or ' or ’) and divides it into four groups, where the third group ($3) is your target content. Because it relies on greedy (.*), it is best suited to strings that contain a single tag; with several tags in one string it would match across them.

    Test 2

    $test = preg_replace('/<(.*)(\"|\”|\'|\’)>(.*)(<\/.*)/s', '<center>$3</center>', '<p style=“text-align:center; others-style:value;”>Content</p>');
    
    var_dump($test);
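
    The reliance on greedy (.*) matters once a string holds more than one tag. The sketch below is only an illustration under assumptions that are not in the question: it uses plain double quotes, a made-up input with two paragraphs, and a hypothetical alternative pattern built from a lazy (.*?) and a backreference to the tag name.

    ```php
    <?php
    // Made-up input with two tags in one string (plain double quotes assumed).
    $html = '<p style="text-align:center;">One</p><p style="text-align:center;">Two</p>';

    // Greedy pattern in the style of RegEx 2: the first (.*) runs up to the last
    // quote followed by >, so only the content of the last tag survives.
    $greedy = preg_replace('/<(.*)(\"|\')>(.*)(<\/.*)/s', '<center>$3</center>', $html);

    // Hypothetical alternative: [^>]* stays inside one opening tag, (.*?) is lazy,
    // and \1 requires the closing tag to repeat the opening tag name.
    $lazy = preg_replace('/<(\w+)[^>]*>(.*?)<\/\1>/s', '<center>$2</center>', $html);

    var_dump($greedy); // string(20) "<center>Two</center>"
    var_dump($lazy);   // string(40) "<center>One</center><center>Two</center>"
    ```

    The lazy variant would still be confused by an element nested inside another element of the same tag name, so it is a sketch for flat markup only.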
    

    RegEx 3

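    If you wish to capture a specific property value in the style attribute, such as the text-align value, this RegEx might help. Here the fourth group ($4) is the alignment value and the seventh group ($7) is the content: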

    <(.*)(text-align:)(.*)(center|left|right|justify|inherit|none)(.*)(\"|\”|\'|\’)>(.*)(<\/.*)
    

    Test 3

    $tags = [
        '0' => '<p style=“text-align:center; others-style:value;”>Content</p>',
        '1' => '<div style=‘text-align:left; others-style:value;’ class=‘any class’>Any Content That You Wish</div>',
        '2' => '<span style=\'text-align:right; others-style:value;\' class=\'any class\'>Any Content That You Wish</span>',
        '3' => '<h1 style=“text-align:justify; others-style:value;” class="any class">Any Content That You Wish</h1>',
        '4' => '<h2 style=“text-align:inherit; others-style:value;” class=“any class">Any Content That You Wish</h2>',
        '5' => '<h3 style=“text-align:none; others-style:value;” class=“any class">Any Content That You Wish</h3>',
        '6' => '<h4 style=“others-style:value;” class=“any class">Any Content That You Wish</h4>',
    ];
    
    // Same pattern as RegEx 3: $4 is the alignment value and $7 is the content.
    $RegEx = '/<(.*)(text-align:)(.*)(center|left|right|justify|inherit|none)(.*)(\"|\”|\'|\’)>(.*)(<\/.*)/s';
    foreach ($tags as $key => $tag) {
        // preg_replace() returns the subject unchanged when the pattern does not match,
        // so a tag without text-align (key 6 here) is left as it is.
        $tags[$key] = preg_replace($RegEx, '<$4>$7</$4>', $tag);
    }
    
    var_dump($tags);
    

    Output 3

    It would return:

    array(7) {
      [0]=>
      string(24) "<center>Content</center>"
      [1]=>
      string(38) "<left>Any Content That You Wish</left>"
      [2]=>
      string(40) "<right>Any Content That You Wish</right>"
      [3]=>
      string(44) "<justify>Any Content That You Wish</justify>"
      [4]=>
      string(44) "<inherit>Any Content That You Wish</inherit>"
      [5]=>
      string(38) "<none>Any Content That You Wish</none>"
      [6]=>
      string(86) "<h4 style=“others-style:value;” class=“any class">Any Content That You Wish</h4>"
    }
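
    Going back to the edit in the question (wrap center in <center>, wrap right in <right>, leave everything else alone), one possible sketch is below. The function name convertAlignment and its pattern are assumptions made for illustration, not part of the answer above, and the sketch assumes that an element is never nested inside another element of the same tag name.

    ```php
    <?php
    // Hypothetical helper (not from the original answer): wraps the content of any
    // element whose opening tag declares text-align:center or text-align:right.
    function convertAlignment(string $html): string
    {
        // Group 1: tag name, group 2: alignment value, group 3: element content.
        $pattern = '/<(\w+)[^>]*text-align\s*:\s*(center|right)[^>]*>(.*?)<\/\1>/s';

        // preg_replace() returns null on error; fall back to the input in that case.
        return preg_replace($pattern, '<$2>$3</$2>', $html) ?? $html;
    }

    echo convertAlignment('<p style=“text-align:center; others-style:value;”>Content</p>'), PHP_EOL;
    echo convertAlignment('<div style="text-align:right;">Right side</div>'), PHP_EOL;
    echo convertAlignment('<h4 style="color:red;">Untouched</h4>'), PHP_EOL;
    ```

    Because [^>]* cannot cross a >, the text-align check stays inside the opening tag, and elements with text-align:left or with no text-align at all pass through unchanged.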