htmlpurifier add missing url protocol


With AutoFormat.Linkify enabled, HTMLPurifier converts text such as http://www.example.com into links. But many people write links without the protocol, such as www.example.com or example.com. Is there any way to make HTMLPurifier convert these into links as well?


Solution

  • I don't know of a way to make HTMLPurifier do this itself. But you can get it to work by adding http:// to every link that does not already contain a protocol. You can use a regular expression to do this:

    // Prefix bare domains with http://; URLs that already carry a protocol are left untouched.
    $html = preg_replace( '#(?<!\S)([a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,4})(?!\S)#i',
                          'http://$1', $html );
    


    Example:

    test example.com test<br>
    test www.example.com test<br>
    test http://example.com test
    

    becomes

    test http://example.com test
    test http://www.example.com test
    test http://example.com test
    

    Now HTMLPurifier's AutoFormat.Linkify should recognize all of these as URLs and turn them into links.
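
The sketch below puts the two steps together: a pre-processing pass that adds the missing protocol, followed by HTMLPurifier with AutoFormat.Linkify turned on. The helper name `add_missing_protocol` is made up for illustration and is not part of HTMLPurifier. The pattern uses lookarounds instead of matching the surrounding whitespace, so line breaks are preserved, two bare domains separated by a single space are both caught, and URLs that already start with http://, https:// or ftp:// are not rewritten. It assumes HTMLPurifier is installed and autoloadable; if it is not, the purify step is simply skipped.

```php
<?php
// Hypothetical helper (not part of HTMLPurifier): prefix bare domains with http://.
function add_missing_protocol(string $html): string
{
    // (?<!\S) and (?!\S) require whitespace or a string boundary on each side
    // without consuming it. A token such as "https://example.com" never matches,
    // because ":" and "/" are not allowed inside the domain part.
    $pattern = '#(?<!\S)([a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,4})(?!\S)#i';
    return preg_replace($pattern, 'http://$1', $html);
}

$html = "test example.com test\ntest www.example.com test\ntest https://example.com test";
$html = add_missing_protocol($html);

// Assumption: HTMLPurifier is available through an autoloader (for example Composer).
if (class_exists('HTMLPurifier')) {
    $config = HTMLPurifier_Config::createDefault();
    $config->set('AutoFormat.Linkify', true);
    $purifier = new HTMLPurifier($config);
    $html = $purifier->purify($html);
}

echo $html, "\n";
```

Two limitations to keep in mind: anything that looks like a domain (for example file.txt) also gets a protocol, and bare domains followed by a path (example.com/page) are not matched, because the pattern only accepts a host name standing on its own between whitespace.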