
Unexpected output while getting the name of a file with some regex from a directory using boost::regex


I wrote a function findFile that checks whether a file matching the pattern file_name_regex exists in the directory dir_name. I tested it on Coliru:

#include <string>
#include <iostream>
#include <boost/regex.hpp>
#include <boost/filesystem.hpp>

namespace fs = boost::filesystem;

bool findFile(const std::string & dir_name, const std::string & file_name_regex)
{
    fs::path p(dir_name);
    if (!exists(p))
        return false;

    boost::regex file_regex(file_name_regex, boost::regex::basic);

    fs::directory_iterator end_itr;
    for (fs::directory_iterator itr(p);itr != end_itr; ++itr )
    {   
        if (!fs::is_directory(itr->path()))
        {               
            boost::sregex_iterator it(itr->path().filename().string().begin(),
                                   itr->path().filename().string().end(), 
                                   file_regex);
            boost::sregex_iterator end;
            for (; it != end; ++it){
                std::cout << it->str() << std::endl;
            }
        }   
        else {
            continue;
        }
    }   
    return false;
}

int main()
{
    findFile("/", "a.out" );
}

Compile and run it with the command:

g++ -std=c++11 -O2 -Wall -lboost_system -lboost_filesystem -lboost_regex main.cpp && ./a.out

It should print out:

a.out

But it prints this unexpected output instead:

.out

The code is based on the answer to "C++ Regular Expressions with Boost Regex".

I also reduced it to a simple test, again on Coliru:

#include <boost/regex.hpp>
#include <iostream>
#include <string>

int main()
{
    std::string text("a.out");
    const char * pattern = "a.out";    
    boost::regex ip_regex(pattern);

    boost::sregex_iterator it(text.begin(), text.end(), ip_regex);
    boost::sregex_iterator end;
    for (; it != end; ++it) {
        std::cout << it->str() << "\n";
        // v.push_back(it->str()); or something similar     
    }
}

It prints out the expected word a.out.

So what is wrong with my code?


Solution

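  • You have undefined behavior (UB) due to dangling iterators. filename() returns a path by value, so each itr->path().filename().string() expression refers to its own temporary, and every one of those temporaries is destroyed at the end of the following statement: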

            boost::sregex_iterator it(itr->path().filename().string().begin(),
                                   itr->path().filename().string().end(), 
                                   file_regex);
    

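    So begin() and end() come from two different temporaries, and once the statement ends both point into storage that no longer exists. Dereferencing the sregex_iterator then reads garbage.

    Store the string in a named variable so that it outlives the iterators: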

            std::string s = itr->path().filename().string();
            boost::sregex_iterator it(s.begin(), s.end(), file_regex);