Faster file reading in C++

I need to read many different files in succession as fast as possible. It's not one big file, but many small ones. The files I'm trying to read are the stat files at /proc/<pid>/stat.

I am using std::ifstream and std::getline() to read the files.

Here is my current code:

std::ifstream statFile("/proc/" + pid + "/stat");
if (!statFile.is_open())
{
    std::cerr << "Error: Could not open file for PID " << pid << std::endl;
    return 0; // No fatal error because file may be deleted during read
}

std::string line;
if (!std::getline(statFile, line))
{
    std::cerr << "Error: Could not read from file for PID " << pid << std::endl;
    return 0; // No fatal error because file may be deleted during read
}

I tried using mmap(), but that doesn't seem to work in the /proc/ directory.

I also tried using a buffer, but that was slower.

Solution

I recommend opening the files as usual with std::ifstream and then using std::ifstream::read to read the whole file into a fixed-size char array. On my system, 957 bytes is enough, given the max (or min) values of all the fields in proc_pid_stat(5) plus a max comm string length of 16. I'd round it up to 1024 for good measure. If sizeof(int) is greater than 4 on your system, double the size of the buffer, or double it anyway. I doubt you'll notice a difference.
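As a minimal sketch of that single-read approach (the helper name read_small_file is made up for this illustration and is not part of the answer's code), reading one small file into a caller-supplied fixed-size buffer could look like this:

```cpp
#include <array>
#include <cstddef>
#include <fstream>
#include <string>
#include <string_view>

// Hypothetical helper: reads at most buf.size() bytes of `path` into `buf`
// with one read() call and returns a view of the bytes actually read.
// An empty view means the file could not be opened or contained nothing.
template <std::size_t N>
std::string_view read_small_file(const std::string& path,
                                 std::array<char, N>& buf) {
    std::ifstream is(path, std::ios::binary);
    if (!is) return {};
    is.read(buf.data(), static_cast<std::streamsize>(buf.size()));
    // For files shorter than the buffer, read() sets eofbit and failbit,
    // but gcount() still reports how many bytes were stored.
    return {buf.data(), static_cast<std::size_t>(is.gcount())};
}
```

Calling it with a std::array<char, 1024> and "/proc/self/stat" should return the whole stat line, including the trailing newline. The returned view is only valid for as long as the buffer is.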

For extracting the numerical values, I recommend using std::from_chars, which is supposed to provide the fastest way to convert char arrays into numerical types.
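To illustrate how std::from_chars walks through a buffer, here is a small sketch that parses space-separated integers. The function name parse_longs is an assumption made for this example only; the point is that the returned ptr becomes the start of the next conversion and ec reports failure without exceptions or locale handling.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>
#include <vector>

// Hypothetical helper: parses space-separated decimal integers from `text`
// and stops at the first token that is not a valid number.
std::vector<long> parse_longs(std::string_view text) {
    std::vector<long> out;
    const char* p = text.data();
    const char* const end = p + text.size();
    while (p != end) {
        // from_chars does not skip whitespace, so do it by hand.
        if (*p == ' ' || *p == '\n') { ++p; continue; }
        long value{};
        const auto res = std::from_chars(p, end, value);
        if (res.ec != std::errc{}) break;  // not a number or out of range
        out.push_back(value);
        p = res.ptr;                       // continue right after the number
    }
    return out;
}
```

Note that std::from_chars accepts a leading minus sign but not a leading plus sign or leading whitespace, which is why the separator is skipped manually.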

I'd start by defining a class that can hold the values:

struct proc_pid_stat {
    /*
    (1) pid  %d
           The process ID.
    */
    int pid;
    /*
    (2) comm  %s
           The filename of the executable, in parentheses.
           Strings longer than TASK_COMM_LEN (16) characters
           (including the terminating null byte) are silently
           truncated.  This is visible whether or not the
           executable is swapped out.
    */
    std::string comm;

    //... add all the fields with the correct types ... 

    /*
    (52) exit_code  %d  (since Linux 3.5)  [PT]
           The thread's exit status in the form reported by
           waitpid(2).
    */
    int exit_code;
};

To this class I'd add a "magic" value that can be used to indicate that extracting the information from the file failed. The last field in the class is set to this value when extraction starts, and it will be overwritten if extraction succeeds.

struct proc_pid_stat {
    // same as above goes here

    static constexpr int fail = std::numeric_limits<int>::min();
};

Then to the actual extraction. The only messy parts are the comm (2) and state (3) fields, which come early. The rest can be made into one big fold expression in which std::from_chars is used:

struct proc_pid_stat {
    // same as above goes here

    friend std::istream& operator>>(std::istream& is, proc_pid_stat& pps) {
        pps.exit_code = fail; // set the last field to a "fail" value
        char buf[1024]; // max length with all the fields incl. comm is 957
        // read the whole line:
        is.read(buf, static_cast<std::streamsize>(sizeof buf));
        const char* const end = buf + is.gcount();

        // extract fields:
        auto rptr = std::from_chars(buf, end, pps.pid).ptr; // (1)
        if(rptr == end) return is;
        ++rptr;
        if(std::distance(rptr, end) < kernel_thread_comm_len) return is;
        std::string_view comm(rptr, kernel_thread_comm_len);
        const auto cpos = comm.rfind(')');
        if(cpos == std::string_view::npos) return is;
        auto sp = rptr + cpos + 1;
        if(std::distance(sp, end) < 96) return is; // a reasonable amount left
        pps.comm.assign(rptr, sp);                 // (2)
        pps.state = *++sp;                         // (3)
        ++sp;

        // if extracting all the rest succeeds, the last field, exit_code,
        // will get a value other than "fail":
        [&](auto&&... rest) {
            (..., (sp = std::from_chars(sp + (sp != end), end, rest).ptr));
        }(pps.ppid /* (4) */, pps.pgrp /* (5) */, pps.session /* (6) */,
          pps.tty_nr /* (7) */, pps.tpgid /* (8) */, pps.flags /* (9) */,
          pps.minflt /* (10) */, pps.cminflt /* (11) */, pps.majflt /* (12) */,
          pps.cmajflt /* (13) */, pps.utime /* (14) */, pps.stime /* (15) */,
          pps.cutime /* (16) */, pps.cstime /* (17) */, pps.priority /* (18) */,
          pps.nice /* (19) */, pps.num_threads /* (20) */,
          pps.itrealvalue /* (21) */, pps.starttime /* (22) */,
          pps.vsize /* (23) */, pps.rss /* (24) */, pps.rsslim /* (25) */,
          pps.startcode /* (26) */, pps.endcode /* (27) */,
          pps.startstack /* (28) */, pps.kstkesp /* (29) */,
          pps.kstkeip /* (30) */, pps.signal /* (31) */, pps.blocked /* (32) */,
          pps.sigignore /* (33) */, pps.sigcatch /* (34) */,
          pps.wchan /* (35) */, pps.nswap /* (36) */, pps.cnswap /* (37) */,
          pps.exit_signal /* (38) */, pps.processor /* (39) */,
          pps.rt_priority /* (40) */, pps.policy /* (41) */,
          pps.delayacct_blkio_ticks /* (42) */, pps.guest_time /* (43) */,
          pps.cguest_time /* (44) */, pps.start_data /* (45) */,
          pps.end_data /* (46) */, pps.start_brk /* (47) */,
          pps.arg_start /* (48) */, pps.arg_end /* (49) */,
          pps.env_start /* (50) */, pps.env_end /* (51) */,
          pps.exit_code /* (52) */
        );
        return is;
    }
};

Note: kernel_thread_comm_len is a constant for dealing with comm fields that are longer than the 16 characters mentioned in the description of the comm field. The names of kernel tasks may be up to 64 characters long, so that's what I set the constant to.
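The reason for searching backwards for ')' is that the comm field can itself contain spaces and parentheses. The following is a reduced sketch of that idea, not the operator>> above: the stat_head type and parse_stat_head function are hypothetical, it only extracts the first four fields, it searches the whole line instead of a kernel_thread_comm_len window, and it stores comm without the surrounding parentheses.

```cpp
#include <charconv>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical reduced record holding only fields (1) to (4).
struct stat_head {
    int pid{};
    std::string comm;  // stored without the surrounding parentheses
    char state{};
    int ppid{};
};

std::optional<stat_head> parse_stat_head(std::string_view line) {
    stat_head h;
    const char* const begin = line.data();
    const char* const end = begin + line.size();

    // (1) pid
    if (std::from_chars(begin, end, h.pid).ec != std::errc{}) return std::nullopt;

    // (2) comm: first '(' up to the LAST ')', since comm may contain ')'
    const auto open = line.find('(');
    const auto close = line.rfind(')');
    if (open == std::string_view::npos || close == std::string_view::npos ||
        close < open) {
        return std::nullopt;
    }
    h.comm = std::string(line.substr(open + 1, close - open - 1));

    // Expected layout after comm: ") S 123", so ppid starts at close + 4.
    if (close + 4 >= line.size()) return std::nullopt;
    h.state = line[close + 2];  // (3) state

    // (4) ppid
    if (std::from_chars(begin + close + 4, end, h.ppid).ec != std::errc{}) {
        return std::nullopt;
    }
    return h;
}
```

Returning std::optional here plays the same role as the "fail" value above: the caller gets an empty result if any step of the extraction did not succeed.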

Next comes the question of which processes to collect the information for. If you have a std::vector of process IDs, you could add a function that populates a std::vector<proc_pid_stat>:

auto get_proc_pid_stats(std::ranges::random_access_range auto&& pids) {
    static const std::filesystem::path proc("/proc");
    std::vector<proc_pid_stat> ppss(std::ranges::size(pids));

    auto zw = std::views::zip(pids, ppss);

    auto fillfunc = [](auto&& pid_pps) {
        auto& [pid, pps] = pid_pps;
        auto path = proc / std::to_string(pid) / "stat";
        std::ifstream is(path);
        is >> pps;
    };

    std::for_each(std::execution::par, std::ranges::begin(zw),
                  std::ranges::end(zw), fillfunc);

    return ppss;
}

Note: The above uses the built-in thread pool (if your implementation supports it). You may need to link with the library implementing it for it to be useful. -ltbb is common. If you for some reason don't want to use the thread pool, change std::execution::par to std::execution::seq and measure the difference in time.

If you want all the processes, you can make it more efficient by not building the filename for every process file like I did in get_proc_pid_stats above. Just collect the filenames and use those instead of pids in the loop above:

std::vector<std::filesystem::path> pids;
for(auto& de : std::filesystem::directory_iterator("/proc")) {
    if(std::isdigit(
       static_cast<unsigned char>(de.path().filename().string().front()))) 
    {
        pids.emplace_back(de.path());
    }
}
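As a sketch of how that collection step could be wrapped up, the function below gathers the full path of each stat file, so that the loop can open the collected paths directly. The function name collect_stat_paths and the root parameter are assumptions for this example; taking the root as a parameter simply makes it possible to try the function on a directory other than /proc.

```cpp
#include <cctype>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical helper: returns "<root>/<entry>/stat" for every entry in
// `root` whose name starts with a digit (the per-process directories when
// root is /proc). Returns an empty vector if `root` cannot be opened.
std::vector<std::filesystem::path>
collect_stat_paths(const std::filesystem::path& root) {
    std::vector<std::filesystem::path> paths;
    std::error_code ec;  // avoids an exception if root does not exist
    for (const auto& de : std::filesystem::directory_iterator(root, ec)) {
        const std::string name = de.path().filename().string();
        if (!name.empty() &&
            std::isdigit(static_cast<unsigned char>(name.front()))) {
            paths.push_back(de.path() / "stat");
        }
    }
    return paths;
}
```

With the paths collected this way, the fill function would open each path as-is instead of building it from a process ID. A process may still exit between the directory scan and the open, so a failed open has to be tolerated.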

Full demo