I'm trying to parse an iCal file:
//open file
$calendar = file_get_contents('http://app.kigo.net/public/ics.php?c-7ca2eb67c1a7fa8b87b2434ed1096076-422-9871b35967bb29f999cd11ac72943011');
//debug purpose
echo $calendar;
//parse string
preg_match_all('#^BEGIN\:VEVENT.*?END\:VEVENT$#sm',$calendar,$results,PREG_SET_ORDER);
//output: empty!
print_r($results);
It returns an empty array.
However, if I copy and paste the content of $calendar into another variable and parse it with the same regex, it works fine.
Why does preg_match_all fail when I call it on the string that comes directly from file_get_contents?
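The remote file uses the sequence CR LF as its newline, which is why the anchor $ doesn't match: in multiline mode, $ by default only matches immediately before an LF (or at the end of the string), and here END:VEVENT is followed by a CR. When you copy/paste the file content to (or from) an application that uses only LF as its default newline, the sequence CR LF is probably silently replaced with LF, and your pattern works.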
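One way to check this explanation is to run the pattern from the question on two small strings that differ only in their line endings. This is a minimal sketch: the sample calendar lines are made up for illustration, and it assumes PCRE's default newline setting is LF, which is the usual case for PHP builds.

```php
<?php
// Hypothetical sample data: the same two events, once with CR LF and once with LF.
$lines = [
    'BEGIN:VCALENDAR',
    'BEGIN:VEVENT',
    'SUMMARY:First',
    'END:VEVENT',
    'BEGIN:VEVENT',
    'SUMMARY:Second',
    'END:VEVENT',
    'END:VCALENDAR',
];
$crlf = implode("\r\n", $lines) . "\r\n";
$lf   = implode("\n", $lines) . "\n";

// Same pattern as in the question (the backslashes before ":" are not needed).
$pattern = '#^BEGIN:VEVENT.*?END:VEVENT$#sm';

$countCrlf = preg_match_all($pattern, $crlf, $m1, PREG_SET_ORDER);
$countLf   = preg_match_all($pattern, $lf, $m2, PREG_SET_ORDER);

var_dump($countCrlf); // events found with CR LF line endings
var_dump($countLf);   // events found with LF line endings
```

Under that assumption, the CR LF version finds no event while the LF version finds both, which mirrors the behaviour described in the question.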
Several ways to solve the problem:
1) Write the carriage return explicitly in your pattern:
#^BEGIN:VEVENT.*?END:VEVENT\r$#sm
If you don't want the carriage return at the end of the match, use trim or put it in a lookahead assertion:
#^BEGIN:VEVENT.*?END:VEVENT(?=\r$)#sm
You can also remove the $ and use the \R alias, which matches \r, \r\n and \n.
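The three variants of this first approach can be compared side by side. The sketch below uses an invented one-event sample string; placing \R inside a lookahead is a choice made here so that all three results are identical, not something the approach requires.

```php
<?php
// Hypothetical sample: one event with CR LF line endings.
$calendar = "BEGIN:VCALENDAR\r\nBEGIN:VEVENT\r\nSUMMARY:Demo\r\nEND:VEVENT\r\nEND:VCALENDAR\r\n";

// Variant a: match the CR explicitly, then strip it from each result.
preg_match_all('#^BEGIN:VEVENT.*?END:VEVENT\r$#sm', $calendar, $a);
$trimmed = array_map('rtrim', $a[0]);

// Variant b: keep the CR out of the match with a lookahead.
preg_match_all('#^BEGIN:VEVENT.*?END:VEVENT(?=\r$)#sm', $calendar, $b);

// Variant c: no $, just require a line break (\R) after END:VEVENT.
preg_match_all('#^BEGIN:VEVENT.*?END:VEVENT(?=\R)#sm', $calendar, $c);

// All three should hold the same event text, without a trailing CR.
var_dump($trimmed[0] === $b[0][0]);
var_dump($b[0][0] === $c[0][0]);
```

Variant a uses rtrim rather than trim because only the trailing carriage return needs to go.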
2) Allow the $ to match whatever the newline sequence is, using the directive (*ANYCRLF):
#(*ANYCRLF)^BEGIN:VEVENT.*?END:VEVENT$#sm
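As a rough illustration of this directive, the sketch below runs the pattern on an invented two-event string with CR LF line endings. The sample data and variable names are assumptions made for the example.

```php
<?php
// Hypothetical sample: two events with CR LF line endings.
$calendar = "BEGIN:VCALENDAR\r\n"
          . "BEGIN:VEVENT\r\nSUMMARY:First\r\nEND:VEVENT\r\n"
          . "BEGIN:VEVENT\r\nSUMMARY:Second\r\nEND:VEVENT\r\n"
          . "END:VCALENDAR\r\n";

// (*ANYCRLF) must appear at the very start of the pattern.
$pattern = '#(*ANYCRLF)^BEGIN:VEVENT.*?END:VEVENT$#sm';

$count = preg_match_all($pattern, $calendar, $results, PREG_SET_ORDER);

foreach ($results as $result) {
    echo $result[0], PHP_EOL, '---', PHP_EOL;
}
```

With this directive the matched text stops right after END:VEVENT, so no trimming is needed afterwards.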
3) Don't use a pattern at all. After all, you are only looking for blocks between fixed lines, and if your file may be long, it is more elegant and saves memory to read the file line by line and to use a generator to return the blocks:
$filePath = 'http://app.kigo.net/public/ics.php?c-7ca2eb67c1a7fa8b87b2434ed1096076-422-9871b35967bb29f999cd11ac72943011';

try {
    if ( false === $fp = fopen($filePath, 'rb') )
        throw new Exception('Could not open the file!');
} catch (Exception $e) {
    echo 'Error (File: ' . $e->getFile() . ', line ' . $e->getLine() . '): ' . $e->getMessage();
    exit(1); // stop here: $fp is not a usable stream
}

// each block runs from the BEGIN:VEVENT line to the END:VEVENT line, both included
foreach (genBlocks($fp, "BEGIN:VEVENT\r\n", "END:VEVENT\r\n") as $block) {
    echo $block . PHP_EOL;
}

fclose($fp);

function genBlocks($fp, $start, $end, $buffer = 1024) {
    $block = false; // false means "not inside a block"

    while ( false !== $line = fgets($fp, $buffer) ) {
        if ( $line === $start ) {
            $block = $line; // a start line opens a new block
        } elseif ( $block !== false ) {
            $block .= $line;
            if ( $line === $end ) {
                yield $block; // an end line closes the block
                $block = false;
            }
        }
    }
}
Note: you can also use stream_get_line instead of fgets, since stream_get_line can return a line without its newline sequence (the delimiter you pass, for example "\r\n", is not included in the result).
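A possible variant of the generator built on stream_get_line is sketched below. The function name genBlocksNoEol, its $eol parameter, the choice to yield each block as an array of lines, and the in-memory sample data are all assumptions made for this example.

```php
<?php
// Hypothetical variant of genBlocks(): lines come back without "\r\n",
// so the start and end markers are compared without a newline sequence.
function genBlocksNoEol($fp, $start, $end, $eol = "\r\n", $buffer = 1024) {
    $block = false; // false means "not inside a block"

    while ( false !== $line = stream_get_line($fp, $buffer, $eol) ) {
        if ( $line === $start ) {
            $block = [$line];
        } elseif ( $block !== false ) {
            $block[] = $line;
            if ( $line === $end ) {
                yield $block; // one event as an array of lines
                $block = false;
            }
        }
    }
}

// In-memory stream standing in for the remote file (made-up sample data).
$fp = fopen('php://memory', 'w+b');
fwrite($fp, "BEGIN:VCALENDAR\r\nBEGIN:VEVENT\r\nSUMMARY:Demo\r\nEND:VEVENT\r\nEND:VCALENDAR\r\n");
rewind($fp);

$blocks = iterator_to_array(genBlocksNoEol($fp, 'BEGIN:VEVENT', 'END:VEVENT'), false);
fclose($fp);

print_r($blocks);
```

Yielding an array of lines rather than one string is just one option; it can make later per-property parsing (for example splitting on the first colon) a little simpler.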