Strip all whitespace


I use curl to get the content of a website into a string. After that I want to strip all the whitespace. For that I use $content = preg_replace('/\s+/', '', $content);, but it doesn't work properly. What am I doing wrong?

I use this code to get the content:

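// Fetch the page into a string instead of printing it.
$curl_handle = curl_init();
curl_setopt($curl_handle, CURLOPT_URL, 'http://www.italiakalmar.se/ui/Article/show.aspx?id=185&m=165');
curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, true);
$content = curl_exec($curl_handle);
curl_close($curl_handle);

// Keep everything from the opening <body> tag onwards.
$pos = stripos($content, "<body");
$content = substr($content, $pos);

// Remove the HTML tags, then decode entities into UTF-8 characters.
$content = strip_tags($content);
$content = html_entity_decode($content, ENT_COMPAT, 'UTF-8');

// Try to strip all whitespace (the step that does not behave as expected).
$content = preg_replace('/\s+/', '', $content);

$content = mb_strtolower($content, 'utf-8');

// Replace en dashes with plain hyphens and print the result.
echo $content = str_replace("–", "-", $content);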

I then get this string: //fabrikenrestaurangenpizzerianintromenykvalitetallergihittatillosspizzeriaitaliapizzeriaitaliaöppnadedörrarnaförstagångenredan1977,ochdrivssedandessisammamiljöochsammakaraktäristiskastil.viharalltidutsöktapizzoraverkäntgodsmakochkvalitet.komintillpizzeriaitaliaochlåtossserveradigenutsöktpizza.elleromdetpassarbättre-låtosslevereradenhemtilldig!nukanmanävenbetalamedkortvidutkörning!öppettider:mån-torskl:15-21fredag  kl:15-22lördag  kl:12-22söndag kl:12-21ingårikalmarkrogar.se

As you can see, some whitespace is still there (for example, between "fredag" and "kl:15-22").


Solution

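  • $content = str_replace(' ', '', $content);

    A no-regex approach. It removes only ordinary ASCII spaces, so any other whitespace character left in the string (a non-breaking space, for example) needs its own search string.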
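
A likely cause, though the question does not confirm it, is that the page contains &nbsp; entities. html_entity_decode() turns those into non-breaking spaces (U+00A0, bytes C2 A0 in UTF-8), and a byte-oriented '/\s+/' pattern without the u modifier does not normally treat that character as whitespace. The sketch below assumes this is the case; the sample string is made up for illustration and is not the real page content.

```php
<?php
// Assumption: the leftover "spaces" are non-breaking spaces (U+00A0).
$nbsp = "\xC2\xA0"; // UTF-8 encoding of U+00A0

// Hypothetical sample standing in for the scraped text.
$content = "fredag" . $nbsp . $nbsp . "kl:15-22 \t\nkl:12-21";

// Regex variant: the u modifier makes the pattern UTF-8 aware, and
// \x{00A0} names the non-breaking space explicitly next to \s.
$stripped = preg_replace('/[\s\x{00A0}]+/u', '', $content);

// No-regex variant: list every character to remove, including U+00A0.
$noRegex = str_replace([' ', "\t", "\n", "\r", $nbsp], '', $content);

echo $stripped, PHP_EOL; // fredagkl:15-22kl:12-21
echo $noRegex, PHP_EOL;  // fredagkl:15-22kl:12-21
```

With the u modifier, preg_replace() returns null if the input is not valid UTF-8, so it is worth checking the result before using it.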