Tags: php, character-encoding, non-ascii-characters

Print non ASCII characters as their hex in PHP


How do I print a string in PHP so that all non-ASCII characters are converted to their hex values (e.g. 0x02) and displayed? I want users to know that they are entering non-ASCII values. I don't want to strip those characters. Instead, I would like to display them so that users can edit the text and correct their mistakes.

I want to allow users to enter standard tabs, newlines, etc. (maybe everything up to ASCII 127).

I tried quoted_printable_encode(), but it displays = as =3D and non-ASCII characters as =[HEXVAL]. The equals sign creates confusion.

I tried preg_replace('/[[:^print:]]/', '', $string) but it ended up removing tabs, new lines, etc.


Solution

  • This is hard to achieve when it comes to Unicode characters. Even valid Unicode characters (there are a great many of them) might not be displayable because the current font has no glyph for them. For example, a font designed for German text might not contain all valid Chinese characters.

    If you only care about ASCII, you can use ctype_print() to check whether a character is printable.

    Example:

    // test string contains printable and non-printable characters
    $string = "\x12\x12hello\nworld\x03";

    // control characters that should pass through unescaped
    // (note: a newline is "\n" or "\x0A"; "\x10" is a different control character)
    $allowed = array("\t", "\n", "\r");

    // iterate through the string byte by byte
    $length = strlen($string);
    for ($i = 0; $i < $length; $i++) {

        // print the byte as-is if it is printable ASCII or explicitly allowed
        if (ctype_print($string[$i]) || in_array($string[$i], $allowed, true)) {
            print $string[$i];
        } else {
            // otherwise use printf and ord to print the hex value
            // of the non-printable byte
            printf("\\x%02X", ord($string[$i]));
        }
    }
    

    Output:

    \x12\x12hello
    world\x03
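
The same idea can also be written as a single regular-expression replacement, which may be closer to the preg_replace() attempt in the question. This is only a sketch, not part of the original answer: the function name escape_non_printable() is hypothetical, and it assumes the input is handled byte by byte (no /u modifier), so a multibyte UTF-8 character appears as one \xNN escape per byte. Tab, line feed and carriage return are assumed to be the only control characters worth keeping.

```php
<?php
// Hypothetical helper: replace every byte that is neither printable ASCII
// (0x20-0x7E) nor tab / line feed / carriage return with a visible \xNN escape.
function escape_non_printable(string $input): string
{
    $result = preg_replace_callback(
        '/[^\x20-\x7E\t\n\r]/',              // byte-wise match, no /u modifier
        function (array $match): string {
            // $match[0] is the single offending byte
            return sprintf('\\x%02X', ord($match[0]));
        },
        $input
    );

    // preg_replace_callback() returns null on failure; fall back to the input
    return $result ?? $input;
}

// "é" is encoded as the two bytes C3 A9 in UTF-8
echo escape_non_printable("caf\xC3\xA9\tok\x03"), "\n";
// prints: caf\xC3\xA9<TAB>ok\x03   (where <TAB> is a literal tab character)
```

Unlike quoted_printable_encode(), this leaves the equals sign untouched, because = falls inside the printable ASCII range.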