
Lines of text appended with unwanted newlines while using `fgets()` to read a file


This works

$arr = array_merge(array_diff($words, array("the","an")));

Why doesn't this work?

$common is an array of 40 words read from a file.

$arr = array_merge(array_diff($words, $common));

Is there another solution for this?

For Reference:

<?php
error_reporting(0);
$str1= "the engine has two ways to run: batch or conversational. In batch, expert system has all the necessary data to process from the beginning";

common_words($str1);

function common_words(&$string) { 

    $file = fopen("common.txt", "r") or exit("Unable to open file!");
    $common = array();

    while(!feof($file)) {
      array_push($common,fgets($file));
    }
    fclose($file);
    $words = explode(" ",$string);
    $arr = array_merge(array_diff($words, array("the","an")));
    print_r($arr);
}
?>

Solution

  • Whitespace is evil, sometimes.

    fgets with only one parameter returns one line of data from the file handle provided.

    However, it does not strip the trailing newline ("\n", or whatever EOL sequence the file uses) from the line it returns.

    Since common.txt seems to have one word per line, this is why PHP doesn't find any matching elements when you use array_diff: "the\n" is not equal to "the".

    PHP: fgets - Manual

    parameter: length

    Reading ends when length - 1 bytes have been read, on a newline (which is included in the return value), or on EOF (whichever comes first). If no length is specified, it will keep reading from the stream until it reaches the end of the line.

    In other words:

    • Every entry of $common has a trailing line break the way you are reading the file now.
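The mismatch described above can be reproduced without the original file. The following is a minimal sketch, assuming a two-word list held in a `php://memory` stream as a stand-in for common.txt; the sample words and the variable names `$untrimmed` and `$trimmed` are illustrative and not part of the question.

```php
<?php
// Hypothetical stand-in for common.txt: an in-memory stream, one word per line.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "the\nan\n");
rewind($handle);

$common = array();
while (($line = fgets($handle)) !== false) {
    $common[] = $line; // each entry still ends with "\n"
}
fclose($handle);

$words = explode(' ', 'the engine has two ways to run');

// "the" is not equal to "the\n", so array_diff removes nothing.
$untrimmed = array_values(array_diff($words, $common));

// Stripping the line breaks first lets the comparison succeed.
$trimmed = array_values(array_diff($words, array_map('rtrim', $common)));

var_dump($common[0] === "the\n"); // the newline is part of the value
print_r($untrimmed);              // all seven words are still present
print_r($trimmed);                // "the" has been removed
```

Printing the entries with var_dump rather than print_r is a quick way to spot this kind of problem, because var_dump reports the string length, which includes the invisible newline.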

    Alternative solution 1

    If you don't need to process the entries in common.txt one at a time while reading, I'd recommend PHP's file function, which reads the whole file into an array of lines. Use it together with array_map and rtrim to strip the trailing whitespace from each line.

    $common = array_map ('rtrim', file ('common.txt')); // will do what you want
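One detail worth knowing about this approach: called with a single argument, rtrim strips all trailing whitespace, not only "\n". A small sketch with made-up input lines (the `$lines` array is illustrative, standing in for what file would return):

```php
<?php
// Illustrative input: a Windows line ending, a trailing space, and a last line without EOL.
$lines = array("the\r\n", "an \n", "of");

// rtrim with one argument removes trailing spaces, tabs, "\r", "\n", "\0" and "\x0B".
$clean = array_map('rtrim', $lines);

print_r($clean);
```

This is usually what you want for a word list, since a stray space or carriage return at the end of a line would otherwise break the comparison in the same way the newline does.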
    

    Alternative solution 2

    After seeing the solution above, @MarkBaker commented that you can instead pass a flag to file to get the same result; there is no need to call array_map to "fix" the entries returned.

    $common = file ('common.txt', FILE_IGNORE_NEW_LINES);
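Putting this together, the function from the question could be rewritten roughly as follows. This is a sketch under a few assumptions rather than a drop-in replacement: the name `remove_common_words` and the `$commonFile` parameter are hypothetical (the original hard-codes "common.txt"), the result is returned instead of printed, and `FILE_SKIP_EMPTY_LINES` is added so blank lines in the word list are ignored. The demo writes a temporary file so the example runs on its own.

```php
<?php
/**
 * Return the words of $string that do not appear in the word list file.
 * $commonFile is a hypothetical parameter; the question hard-codes "common.txt".
 */
function remove_common_words(string $string, string $commonFile): array
{
    // FILE_IGNORE_NEW_LINES drops the line ending from each entry.
    $common = file($commonFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($common === false) {
        throw new RuntimeException("Unable to open file: $commonFile");
    }

    $words = explode(' ', $string);

    // array_values reindexes the result, as array_merge does in the question.
    return array_values(array_diff($words, $common));
}

// Demo with a temporary word list standing in for common.txt.
$path = tempnam(sys_get_temp_dir(), 'common');
file_put_contents($path, "the\nan\nto\n");

$result = remove_common_words('the engine has two ways to run', $path);
unlink($path);

print_r($result);
```

Note that array_diff compares strings case-sensitively, so "In" would not match "in" unless both sides are lowercased first.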