
Read from Nth to Mth line of text file in Node.js


Although I have found many examples of reading a text file line by line, or of reading the Nth line, I cannot find anything on how to read from the Nth to the Mth line.

The file is somewhat big, ~5 Gigabytes (~10 million lines).

EDIT: The lines don't have fixed length.


Solution

  • You can use the readline module to read the file as a stream, without loading the whole file into RAM. Here is an example of how it can be done:

    const fs = require('fs');
    const readline = require('readline');
    
    function readFromN2M(filename, n, m, func) {
      const lineReader = readline.createInterface({
        input: fs.createReadStream(filename),
      });
    
      let lineNumber = 0;
    
      lineReader.on('line', function(line) {
        lineNumber++;
        if (lineNumber >= n && lineNumber < m) {
          func(line, lineNumber);
        }
      });
    }
    

    Let's try it:

    // whatever you would like to do with those lines
    const fnc = (line, number) => {
      // e.g. print them to console like this:
      console.log(`--- number: ${number}`);
      console.log(line);
    };
    
    // read from this very file, lines from 4 to 7 (excluding 7):
    readFromN2M(__filename, 4, 7, fnc);
    

    This gives the output:

    //  --- number: 4
    //  function readFromN2M(filename, n, m, func) {
    //  --- number: 5
    //    const lineReader = readline.createInterface({
    //  --- number: 6
    //      input: fs.createReadStream(filename),
    

    Lines are numbered starting from 1. To start from 0, just adjust the numbering slightly.

    UPDATE:

    I've just realized that this approach might not be 100% safe: if a file does not end with a newline character, the very last line of that file might not be read this way, depending on how readline handles the end of its input. To work around that, the file stream could be prepared in a slightly more sophisticated way, by appending a newline character to it when required. This would make the solution somewhat longer, but it is entirely possible.

    UPDATE 2

    As you've mentioned in the comment, the lineReader continues to walk through the file even after the desired lines have already been found, which slows down the application. I think we can stop it like this:

    lineReader.on('line', function(line) {
      lineNumber++;
      if (lineNumber >= n && lineNumber < m) {
        func(line, lineNumber);
      }

      // The next 3 lines should stop lineReader 'soon', but not immediately,
      // as explained in the official docs. Line m itself is excluded from
      // the range, so we can stop as soon as it is reached.
      if (lineNumber >= m) {
        lineReader.close();
      }
    });
    

    I believe this should do the trick.
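
    For completeness, here is a minimal sketch of how the pieces above could be combined into a single promise-based function that stops early. It is not the code from the answer: the function name readRange is hypothetical, it assumes a Node.js version where the readline interface supports `for await` iteration, it keeps the same convention as above (lines numbered from 1, m excluded), and error handling for a missing or unreadable file is left out.

```javascript
const fs = require('fs');
const readline = require('readline');

// Hypothetical helper: resolve with lines n..m-1 (1-based, m excluded)
// and stop reading as soon as line m is reached.
async function readRange(filename, n, m) {
  const input = fs.createReadStream(filename);
  const lineReader = readline.createInterface({ input, crlfDelay: Infinity });
  const result = [];
  let lineNumber = 0;

  try {
    for await (const line of lineReader) {
      lineNumber++;
      if (lineNumber >= m) {
        break; // past the requested range: leave the loop early
      }
      if (lineNumber >= n) {
        result.push(line);
      }
    }
  } finally {
    // Close the interface and release the file descriptor,
    // so the rest of a large file is not read.
    lineReader.close();
    input.destroy();
  }

  return result;
}
```

    It could be called as `readRange(filename, 4, 7).then((lines) => console.log(lines))`. Collecting the lines into an array is only reasonable for small ranges; for large ranges, calling a callback per line as in the original function keeps memory use flat.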