Tags: php, utf-8, character-encoding, internationalization, iso

PHP encoding with Greek


Hi, I am reading a string written in Greek from a site encoded as "ISO-8859-7", using a PHP script run from cmd, with the intention of adding parts of it to a MySQL database. The problem is that when I echo the string before adding anything to the database, there seems to be an encoding problem. I am attaching the relevant part of the code.

    $fil1=hitFormGet("http://www.topsites.gr/gr_domain_list/".$site->href);
    $fil1=iconv("ISO-8859-7","UTF-8",$fil1);
    $html1=str_get_html($fil1);
    $data=$html1->find('td class="res3" table tbody table tbody tr');
    echo "ttl".$data[4]->plaintext."\n";

The output is: ttl Φιλοξενεί Site : Îαί

(It appears differently here too...)
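
One way to tell a conversion problem from a console display problem is to look at the raw bytes instead of the rendered text. The following is only a minimal sketch: the sample input is a hypothetical three-byte string standing in for the fetched page, and it assumes the iconv extension is available.

```php
<?php
// Hypothetical sample input: the word "Ναί" as ISO-8859-7 bytes
// (Ν = 0xCD, α = 0xE1, ί = 0xDF). Real input would be the fetched page.
$isoBytes = "\xCD\xE1\xDF";

// Same conversion as in the question.
$utf8 = iconv("ISO-8859-7", "UTF-8", $isoBytes);

// iconv() returns false on failure, so check before using the result.
if ($utf8 === false) {
    fwrite(STDERR, "Conversion failed\n");
    exit(1);
}

// Inspect the bytes directly instead of trusting what the terminal renders.
$hex = bin2hex($utf8);

// preg_match() with the /u modifier only succeeds on well-formed UTF-8.
$isValidUtf8 = (preg_match('//u', $utf8) === 1);

echo "hex: ", $hex, "\n";
echo "valid UTF-8: ", ($isValidUtf8 ? "yes" : "no"), "\n";
```

For this sample the hex dump is ce9dceb1ceaf, the UTF-8 encoding of "Ναί". If the bytes are correct but the echoed text still looks wrong, the terminal's code page is the likelier culprit than the conversion.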


Solution

  • Use UTF-8 whenever dealing with PHP and MySQL. Here is how I connect:

    $DB_USERNAME = '';
    $DB_PASSWORD = '';
    $DB_HOST = '';
    $DB_NAME = '';
    
    $pdo_attributes = array();
    $pdo_attributes[PDO::ATTR_EMULATE_PREPARES] = FALSE;
    
    $dsn = sprintf('mysql:dbname=%s;host=%s;charset=utf8', $DB_NAME, $DB_HOST);
    $pdo = new PDO($dsn, $DB_USERNAME, $DB_PASSWORD, $pdo_attributes);
    

    In PHP < 5.3.6 the charset parameter in the DSN is ignored, so also set this attribute:

    $pdo_attributes[1002] = 'SET NAMES utf8';
    

    The integer 1002 is the value of PDO::MYSQL_ATTR_INIT_COMMAND. The literal is used here because the named constant is broken in PHP < 5.3.1.

    Below is the original content of this answer, which is now outdated. It connects using the old mysql_* functions, which were deprecated in PHP 5.5 and removed in PHP 7.0, along with some other gibberish-preventing settings that were necessary at the time:

    mysql_connect ("localhost", "DB_USER", "DB_PASSWORD") or die (mysql_error());
    mysql_select_db ("DATABASE_NAME") or die (mysql_error());
    
    mysql_query("SET character_set_client=utf8"); 
    mysql_query("SET character_set_connection=utf8"); 
    mysql_query("SET character_set_database=utf8"); 
    mysql_query("SET character_set_results=utf8"); 
    mysql_query("SET character_set_server=utf8"); 
    mysql_query("SET NAMES 'utf8'");
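
The PDO settings described above can be gathered into one small helper that picks the right options for the running PHP version. This is only a sketch: buildUtf8PdoConfig() is a hypothetical name, not part of PDO, and the database name, host and credentials are placeholders.

```php
<?php
/**
 * Build the DSN and PDO options for a UTF-8 MySQL connection.
 * Hypothetical helper for illustration; not part of PDO itself.
 */
function buildUtf8PdoConfig($dbName, $host, $phpVersion = PHP_VERSION)
{
    $options = array(PDO::ATTR_EMULATE_PREPARES => false);

    if (version_compare($phpVersion, '5.3.6', '<')) {
        // Older PHP ignores charset= in the DSN, so set it on connect.
        // 1002 is the integer value of PDO::MYSQL_ATTR_INIT_COMMAND.
        $options[1002] = 'SET NAMES utf8';
    }

    $dsn = sprintf('mysql:dbname=%s;host=%s;charset=utf8', $dbName, $host);

    return array($dsn, $options);
}

// Placeholder values; replace with real connection details.
list($dsn, $options) = buildUtf8PdoConfig('example_db', 'localhost');

// Connecting needs a reachable MySQL server, so it is left commented out:
// $pdo = new PDO($dsn, 'example_user', 'example_password', $options);
```

Passing the PHP version as a parameter keeps the version check easy to exercise without actually running an old interpreter.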