php encoding xlsx

Accents and UTF-8 in PHP using strtolower


I read an XLSX file with the SimpleXLSX parser.

The first row of my spreadsheet is the header, and I need to read it.

For example, the spreadsheet has 3 columns, with these header names in the first row:

Columns_1      Columns_with_accent_à       Columns_3

The second column name contains an accented a: à

My editor is in UTF-8 mode and my PHP page declares UTF-8 as its encoding. I don't use any HTML on the page (it is an import-only PHP script), but I get this var_dump output:

<?php

header('Content-type: text/html; charset=UTF-8');

$xlsx = SimpleXLSX::parse("file.xlsx");

foreach ($xlsx->rows() as $indexrow => $r) {
    if ($indexrow == 0) {
        // HEADER
        var_dump(strtolower($r[1])); // second column
        // output WRONG:     Columns_with_accent_�
    }
}

?>

Any idea why strtolower() breaks my string? Without it, everything works fine.


Solution

  • The cause is the call to strtolower() (mentioned in the comments and in a later edit, not in the original post). The manual states:

    Note that 'alphabetic' is determined by the current locale. This means that e.g. in the default "C" locale, characters such as umlaut-A (Ä) will not be converted.

    The manual page for mb_strtolower(), on the other hand, says:

    By contrast to strtolower(), 'alphabetic' is determined by the Unicode character properties. Thus the behaviour of this function is not affected by locale settings and it can convert any characters that have 'alphabetic' property, such as A-umlaut (Ä).
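
A likely explanation for the broken character: in UTF-8, à is stored as the two bytes 0xC3 0xA0. strtolower() works byte by byte, so under a single-byte locale such as ISO-8859-1 it can treat 0xC3 as the letter Ã and lowercase it to 0xE3, which leaves an invalid UTF-8 sequence. (As of PHP 8.2, strtolower() ignores the locale and only converts ASCII letters, so the string would no longer be corrupted, but accented capitals would still not be lowercased.) Below is a minimal sketch of the fix, assuming the mbstring extension is loaded. The helper name normalize_header() is hypothetical and is not part of SimpleXLSX.

```php
<?php
// Hypothetical helper (not part of SimpleXLSX): lowercases a header cell
// as UTF-8 text. Requires the mbstring extension.
function normalize_header(string $cell): string
{
    // Pass the encoding explicitly instead of relying on
    // mb_internal_encoding().
    return mb_strtolower($cell, 'UTF-8');
}

// "\u{00E0}" is à and "\u{00C0}" is À, written as escapes so the example
// does not depend on how this file is saved.
$header = "Columns_with_accent_\u{00E0}";

var_dump(normalize_header($header));
var_dump(normalize_header("COLUMNS_WITH_ACCENT_\u{00C0}"));
```

In the loop from the question, this amounts to replacing strtolower($r[1]) with mb_strtolower($r[1], 'UTF-8').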