I have been having some trouble writing a regex to match the previous names on this page: http://steamcommunity.com/id/TripleThreat/namehistory
To be clear, I want the following in an array:
and so on..
I have already tried writing the regex, but it was a disaster (regex is something I struggle with).
Here's what I wrote:
$page = file_get_contents(sprintf("http://steamcommunity.com/id/TripleThreat/namehistory"));
preg_match_all("/<span class=\"historyDash\">-<\/span>((.|\n)*)<\/div>/", $page, $matches);
foreach ($matches[0] as $match) {
    echo($match . "<br/>");
}
Any help is much appreciated :)
You can try the following regex (the match is in the first capturing group):
"/<span class=\"historyDash\">-<\/span>\s*((?:[^\<]|\n)*?)\s*<\/div>/"
See it on Regex101.
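The changes I made: trimmed the whitespace before and after the name with \s*, and changed the . to [^\<] so that the group only matches characters that are not the start of a tag (i.e., the actual text). I also made the quantifier lazy (*?) and the inner group non-capturing (?:...), so the name ends up in capturing group 1.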
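To show how the first capturing group is read, here is a minimal sketch that runs the pattern against a small inline sample instead of the live page. The sample markup (including the historyItem class) and the names in it are made up for illustration; the only assumption taken from the question is that each name follows the historyDash span and is closed by a </div>.

```php
<?php
// Hypothetical sample markup: the real page's structure is assumed, not verified.
$page = <<<HTML
<div class="historyItem">
    <span class="historyDash">-</span>
    FirstName
</div>
<div class="historyItem">
    <span class="historyDash">-</span>
    Second Name
</div>
HTML;

// Same pattern as above; the name is captured by group 1.
$pattern = "/<span class=\"historyDash\">-<\/span>\s*((?:[^\<]|\n)*?)\s*<\/div>/";
preg_match_all($pattern, $page, $matches);

// $matches[0] holds the full matches (tags included); $matches[1] holds only the names.
$names = $matches[1];

foreach ($names as $name) {
    echo $name . "<br/>";
}
```

Note that the loop in the question iterates over $matches[0], which would print the surrounding tags as well; $matches[1] is the array of trimmed names.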
Note: As @PedroLobito pointed out, don't parse HTML with regex unless necessary. Use a library to parse the DOM instead when you can. I just provided an easy example to extend your work, but it might not be the best solution.
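For comparison, here is a rough sketch of the DOM-based approach using PHP's bundled DOMDocument and DOMXPath classes (this needs the dom extension, which is enabled in most builds). As before, the sample markup is hypothetical, and the XPath expression assumes the name is the text node that directly follows the historyDash span; adjust it if the real page is structured differently.

```php
<?php
// Hypothetical sample markup: the real page's structure is assumed, not verified.
$page = <<<HTML
<div class="historyItem">
    <span class="historyDash">-</span>
    FirstName
</div>
<div class="historyItem">
    <span class="historyDash">-</span>
    Second Name
</div>
HTML;

// Collect parser warnings internally instead of printing them,
// since real-world pages are rarely perfectly valid.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($page);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Select the first text node that follows each dash span.
$nodes = $xpath->query('//span[@class="historyDash"]/following-sibling::text()[1]');

$domNames = [];
foreach ($nodes as $node) {
    $name = trim($node->nodeValue);
    if ($name !== '') {
        $domNames[] = $name;
    }
}

foreach ($domNames as $name) {
    echo $name . "<br/>";
}
```

To run this against the live page, replace the sample string with the result of file_get_contents() as in the question.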