I have two files:
1: template.html (utf-8 encoding) content:
<tag>
<output>
</output>
</tag>
2: parser.php (utf-8 encoding) content:
$fileContent = (file_get_contents('template.html'));
echo 'Pos #1: <b>'.$pos1 = mb_strpos($fileContent, '<'); echo '</b><br />';
echo 'Pos #2: <b>'.$pos2 = mb_strpos($fileContent, '>'); echo '</b><br />';
echo 'Substring by Pos1 & Pos2: <b>'.htmlentities(substr($fileContent, $pos1, $pos2)).'</b>';
I am trying to parse the tags and I need to know their correct positions. When I use substr I notice a problem. The output is:
Pos #1: 0
Pos #2: 10
Substring by Pos1 & Pos2: <tag
I need the correct way to do this. The result is supposed to be:
Pos #1: 0
Pos #2: 11
Substring by Pos1 & Pos2: <tag>
Extracting a substring takes a start, which is a position, and a length, which is not a position. In your code, $pos2 is passed as the length.
You can get the length by doing:
$length = $pos2 - $pos1 + 1;
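To make the difference between a position and a length concrete, here is a minimal sketch. It uses an inline ASCII string as a stand-in for your template.html (so byte and character offsets coincide), and the variable names $wrong and $right are only illustrative.

```php
<?php
// Illustrative input; stands in for the contents of template.html.
$text = "<tag>\n<output>\n</output>\n</tag>";

$pos1 = strpos($text, '<');   // position of the first '<'
$pos2 = strpos($text, '>');   // position of the first '>'

// Passing $pos2 as the third argument treats it as a length, not an end position.
$wrong = substr($text, $pos1, $pos2);

// Convert the two positions into a length that includes both ends.
$length = $pos2 - $pos1 + 1;
$right = substr($text, $pos1, $length);

echo $wrong, "\n";   // <tag
echo $right, "\n";   // <tag>
```

The + 1 is there because both the character at $pos1 and the character at $pos2 belong to the result.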
Also, you are processing a Unicode string and have the clarity of mind to use mb_strpos, yet you still use substr to extract the substring. You should use mb_substr.
mb_substr()
Performs a multi-byte safe substr() operation based on number of characters. Position is counted from the beginning of str. First character's position is 0. Second character position is 1, and so on.
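Mixing the byte-based and character-based functions goes wrong as soon as multi-byte characters sit between the two positions. The following is a small sketch, assuming the mbstring extension is enabled; the Cyrillic tag name is made up purely to force multi-byte characters and is not taken from your file.

```php
<?php
// Hypothetical tag name made of three 2-byte UTF-8 characters.
$text = "<тег>\n<output>\n</output>\n</тег>";

// mb_strpos() reports positions counted in characters.
$pos1 = mb_strpos($text, '<', 0, 'UTF-8');
$pos2 = mb_strpos($text, '>', 0, 'UTF-8');
$length = $pos2 - $pos1 + 1;   // 5 characters

// substr() counts bytes, so 5 bytes cover only '<' plus two 2-byte letters.
$byBytes = substr($text, $pos1, $length);

// mb_substr() counts characters, matching what mb_strpos() returned.
$byChars = mb_substr($text, $pos1, $length, 'UTF-8');

echo $byBytes, "\n";   // <те
echo $byChars, "\n";   // <тег>
```

Passing the encoding explicitly avoids depending on the configured internal encoding.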