I am searching for the string version in text read from a Unicode little-endian file.
With $text containing 'version (apostrophe intended), I get:
echo strpos($text, "r"); // Returns 7.
echo strpos($text, "version"); // Prints nothing (strpos() returns false, not null).
I suspect that I need to convert either the needle or the haystack into the same format.
Any ideas?
Update after cmbuckley's answer.
$var = iconv('UTF-16LE', 'UTF-8', $fields[0]);
// Returns Notice: iconv(): Detected an incomplete multibyte character in ...input string in
So I checked the existing encoding and found:
echo mb_detect_encoding($fields[0], mb_detect_order(), false); // Returns 'ASCII'.
This is confusing. If the string is ASCII, why was I having trouble with the original strpos function?
Update 2
The hex encoding of 'version is 2700 5600 6500 7200 7300 6900 6f00 6e00.
What encoding is that?
I created a file with the hex contents you provided and managed to find a solution:
<?php
$text = file_get_contents(__DIR__.'/test');
$text = mb_convert_encoding($text, 'UTF-8', 'UTF-16LE');
var_dump(strpos($text, "r")); // int(3)
var_dump(strpos($text, "Version")); // int(1)
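As an alternative to converting the haystack, the needle can be converted to UTF-16LE instead. The following is only a sketch: it rebuilds the bytes from the hex dump in the question with hex2bin() rather than reading a file, and it assumes the mbstring extension is loaded. Keep in mind that strpos() then reports a byte offset, while mb_strpos() with an explicit encoding reports a character offset.

```php
<?php
// Bytes taken from the hex dump in the question: "'Version" in UTF-16LE.
$text = hex2bin('2700560065007200730069006f006e00');

// Convert the needle into the haystack's encoding.
$needle = mb_convert_encoding('Version', 'UTF-16LE', 'UTF-8');

// strpos() compares raw bytes, so the result is a byte offset.
$bytePos = strpos($text, $needle);

// mb_strpos() decodes both strings as UTF-16LE and counts characters.
$charPos = mb_strpos($text, $needle, 0, 'UTF-16LE');

var_dump($bytePos); // int(2)
var_dump($charPos); // int(1)
```

A plain byte search on UTF-16 data can in principle match at an odd offset, straddling two characters, so mb_strpos() is the safer of the two. Also note that the dump contains 0x56, an uppercase V, so a case-sensitive search for "version" would not match even with the encodings aligned.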
Contents of test (viewed in Hex Fiend): screenshot not reproduced here.
Version of PHP used: PHP 5.6.36
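On the 'ASCII' result from mb_detect_encoding() in the question: that is consistent with the hex dump, because every byte in it, including the 0x00 padding, is below 0x80 and so also passes as 7-bit ASCII when examined byte by byte. The sketch below only inspects the bytes from the dump; it does not call mb_detect_encoding(), whose behaviour may differ between PHP versions. The variable names are illustrative.

```php
<?php
// Bytes taken from the hex dump in the question.
$bytes = hex2bin('2700560065007200730069006f006e00');

// True when no byte has its high bit set, i.e. the data is valid 7-bit ASCII.
$allSevenBit = !preg_match('/[\x80-\xFF]/', $bytes);

// Count the NUL bytes that UTF-16LE adds after each ASCII-range character.
$nulCount = substr_count($bytes, "\0");

var_dump($allSevenBit);   // bool(true)
var_dump(strlen($bytes)); // int(16)
var_dump($nulCount);      // int(8)
```

Those NUL bytes sit between the letters, which is why a single-byte needle such as "version" never matches. As for the iconv() notice, one possible cause (an assumption, since the code that fills $fields is not shown) is that the field has an odd number of bytes, for example after splitting the raw data on a single-byte delimiter; checking strlen($fields[0]) % 2 would confirm or rule that out.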