I want to extract specific links from a website.
The links look like this:
/topic/Funny/G1pdeJm
The links are always the same except for the random characters at the end.
I'm having a hard time combining these two parts:
(preg_match("/^http:\/\//i",$str) || is_file($str))
and
(preg_match("/Funny(.*)/", $str) || is_file($str))
The first check matches every link; the second matches only the links that contain the /topic/Funny/* part.
Unfortunately, I can't work out how to combine them. I also want to block these URLs:
/topic/Funny/viral
/topic/Funny/time
/topic/Funny/top
/topic/Funny/top/week
/topic/Funny/top/month
/topic/Funny/top/year
/topic/Funny/top/all
You could try using a negative lookahead to filter out the URLs you don't want:
.*\/Funny\/(?!viral|time|top\/week|top\/month|top\/year|top\/all|top(\n|$)).*
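To combine both checks into a single preg_match call, something along these lines might work. This is only a sketch under a few assumptions: the function name isFunnyTopicLink and the sample links are made up for illustration, the random ID is assumed to be purely alphanumeric, and the scheme and host are treated as optional so that both absolute and relative links pass. The is_file() fallback from the question is left out here. Unlike the pattern above, the lookahead is anchored with $, so an ID that merely starts with "top" or "time" is not rejected.

```php
<?php
// Hypothetical helper: true only for /topic/Funny/<id> links that are
// not one of the blocked listing pages.
function isFunnyTopicLink(string $str): bool
{
    // ~ is used as the delimiter so the slashes need no escaping.
    $pattern = '~^(?:https?://[^/]+)?'      // optional scheme and host
             . '/topic/Funny/'              // fixed part of the link
             // reject the blocked names, but only when they are the whole tail
             . '(?!(?:viral|time|top(?:/(?:week|month|year|all))?)$)'
             . '[A-Za-z0-9]+$~';            // assumed shape of the random ID

    return preg_match($pattern, $str) === 1;
}

// Made-up sample input for illustration.
$links = [
    'http://example.com/topic/Funny/G1pdeJm',
    '/topic/Funny/G1pdeJm',
    '/topic/Funny/viral',
    '/topic/Funny/top',
    '/topic/Funny/top/week',
    '/topic/Other/G1pdeJm',
];

// Keep only the links that pass the check.
$kept = array_values(array_filter($links, 'isFunnyTopicLink'));
print_r($kept);
```

If the IDs can contain other characters (for example - or _), the final character class would need to be widened accordingly.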