php asp-classic screen-scraping

How can I tell what kind of whitespace is in a string?


I am scraping some information from a 10-year-old website that was built in ASP using FrontPage (originally) and Dreamweaver (lately). I am using PHP.

I am getting back strings with whitespace that is not ordinary spaces. Using the PHP trim function removes some of the whitespace, but not all of it.

original string: string(47) "  School Calendar"
trimmed string: string(34) " School Calendar"

How do I figure out what the whitespace characters are so I can remove them?

My page showing var_dumps of the original and trimmed strings is here.


Solution

  • If you view the source of your page, it looks like your string contains &nbsp; entities: "spaces" that PHP's trim function does not remove.

    The best option is probably to replace these in advance, by calling str_replace prior to trim:

    $stringToTrim = str_replace("&nbsp;", " ", $original);

    $trimmed = trim($stringToTrim);

    (Not using standard code formatting because it wasn't handling the &nbsp; correctly)
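
To answer the "how do I tell what is in there" part more generally, one option is to dump the string's raw bytes in hex so that invisible characters become visible. The sketch below is illustrative only: hexBytes and trimWithNbsp are hypothetical helper names, not part of PHP, and the handling of the raw non-breaking space (bytes C2 A0) assumes the scraped text is UTF-8, which may not hold for an old ASP site.

```php
<?php
// Hypothetical helper: return the bytes of $s as space-separated hex pairs,
// e.g. an ordinary space shows up as "20" and a tab as "09".
function hexBytes(string $s): string
{
    return implode(' ', str_split(bin2hex($s), 2));
}

// Hypothetical helper: normalise non-breaking spaces, then trim.
function trimWithNbsp(string $s): string
{
    // The literal HTML entity, as in the str_replace call above.
    $s = str_replace('&nbsp;', ' ', $s);
    // Assumption: input is UTF-8, where a raw non-breaking space (U+00A0)
    // is the two-byte sequence C2 A0. trim() does not strip it by default.
    $s = str_replace("\xC2\xA0", ' ', $s);
    return trim($s);
}
```

Calling hexBytes on the scraped string and looking for anything other than 20 at the start or end shows which characters trim is leaving behind; a literal entity appears as the bytes 26 6e 62 73 70 3b. If the page turns out to be in a single-byte encoding such as Windows-1252, the raw non-breaking space would be the single byte a0 instead, and the replacement above would need adjusting.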