I am trying to scrape all the URLs on the home page of my client's site so I can migrate it to WordPress. The problem is I can't seem to arrive at a de-duplicated list of URLs.
Here's the code:
$html = file_get_contents('http://www.catwalkyourself.com');
$dom = new DOMDocument();
@$dom->loadHTML($html);
// grab all the links on the page
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");
for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    $url = $href->getAttribute('href');
    if ($url = preg_match_all('((www|http://)(www)?.catwalkyourself.com\/?.*)', $url, $matches[0])) {
        $urls = $matches[0][0][0];
        $list = implode(', ', array_unique(explode(", ", $urls)));
        echo $list . '<br/>';
        //print_r($list);
    }
}
Instead I am getting duplicates like this:
http://www.catwalkyourself.com/rss.php
http://www.catwalkyourself.com/rss.php
How do I fix this?
The last part of your code shouldn't be inside the loop. You're iterating over a list containing every link on the page, and each iteration handles only one link, so you're applying array_unique to an array that can never contain more than one element.
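To see why, consider what happens on each pass of your loop: you build a fresh one-element array and then de-duplicate it, which changes nothing. A quick illustration (the sample URL is just one of the duplicates from your output):

// array_unique has nothing to remove from a single-element array,
// so the same URL gets echoed on every iteration
$urls = array('http://www.catwalkyourself.com/rss.php');
print_r(array_unique($urls)); // still exactly one element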
Try something like this:
$html = file_get_contents('http://www.catwalkyourself.com');
$dom = new DOMDocument();
@$dom->loadHTML($html);
// grab all the links on the page
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");
$urls = array();
for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    $url = $href->getAttribute('href');
    if ($url = preg_match_all('((www|http://)(www)?.catwalkyourself.com\/?.*)', $url, $matches[0])) {
        $urls[] = $matches[0][0][0];
    }
}
$list = implode(', ', array_unique($urls));
echo $list . '<br/>';
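If you'd rather de-duplicate as you collect, a variation is to use the matched URL as an array key, since PHP array keys are unique by definition. This is just a sketch along the same lines, reusing your regex unchanged and not tested against the live site:

$html = file_get_contents('http://www.catwalkyourself.com');
$dom = new DOMDocument();
@$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");

$urls = array();
for ($i = 0; $i < $hrefs->length; $i++) {
    $url = $hrefs->item($i)->getAttribute('href');
    // only keep links pointing at catwalkyourself.com
    if (preg_match('((www|http://)(www)?.catwalkyourself.com\/?.*)', $url, $match)) {
        $urls[$match[0]] = true; // duplicate URLs simply overwrite the same key
    }
}

echo implode(', ', array_keys($urls)) . '<br/>';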