php regex special-characters preg-match-all

preg_match_all with special characters


I am trying to do a preg_match_all on data that contains multiple phrases that I want to extract.

Data:

'us/Llane/Hówl' then some other text then 'us/Casey/Hówl' and so on

I would like to extract the two names, Llane and Casey, into an array. I have been testing with http://www.phpliveregex.com/ as well as in my own code, but I find regex hard to follow even with the guides available online. To the best of my knowledge, this should work:

preg_match_all("/us\/(.*?)\/HÓWL'/",$data,$output);

But all I get is $output[0] and $output[1], both empty. I wasn't having this problem before, so it might be the special character; however, the only information I can find is about using preg_match_all to detect special characters, not about using them literally in a pattern. Any assistance would be great. I have been stuck on this for about four days and have spent many hours trying to work it out.


Solution

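  • Your pattern looks for HÓWL, but the data contains Hówl. Matching is case-sensitive by default, so nothing is found. Use the same case as the data: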

    $data = "'us/Llane/Hówl' then some other text then 'us/Casey/Hówl' and so on";
    preg_match_all("~us/(.*?)/Hówl~", $data, $output);
    print_r($output[1]);
    

    Output

    Array
    (
        [0] => Llane
        [1] => Casey
    )
    

    Alternatively, unless you know that Hówl will always follow the second forward slash, I would consider using the Unicode letter property \p{L}, which matches any letter, accented characters included. It needs the u modifier so that the pattern and subject are treated as UTF-8.

    preg_match_all("~us/(.*?)/\p{L}+~u", $data, $output);
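
To compare the options side by side, here is a minimal sketch. It assumes the script file and the data are both UTF-8 encoded, and the variable names ($wrongCase, $caseless, $anyWord) are made up for illustration. It shows the original uppercase pattern failing, the same pattern succeeding once the i and u modifiers are combined, and the \p{L} variant.

```php
<?php
// Sample data copied from the question (assumed to be UTF-8).
$data = "'us/Llane/Hówl' then some other text then 'us/Casey/Hówl' and so on";

// Uppercase literal, no modifiers: case-sensitive, so nothing matches.
$count = preg_match_all("~us/(.*?)/HÓWL~", $data, $wrongCase);

// Same literal with i (caseless) and u (UTF-8) combined.
// The u modifier lets the engine treat Ó and ó as one character each,
// so caseless matching can pair them.
preg_match_all("~us/(.*?)/HÓWL~iu", $data, $caseless);

// Any run of letters after the second slash, accented or not.
preg_match_all('~us/(.*?)/\p{L}+~u', $data, $anyWord);

// preg_match_all() returns false on error, for example when the subject
// is not valid UTF-8 while the u modifier is set.
if ($count === false) {
    echo "Regex error: " . preg_last_error_msg() . "\n"; // PHP 8+
}

echo "Wrong case, no modifiers: $count matches\n";
print_r($caseless[1]); // captured names from the caseless pattern
print_r($anyWord[1]);  // captured names from the \p{L} pattern
```

Both of the last two calls should capture Llane and Casey. The \p{L}+ form is the looser of the two: it accepts any word after the second slash, so prefer the literal if only Hówl entries are wanted.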