c++ string pattern-matching fstream ofstream

Find a specific string from a line through all the file


I have a file of 5000 lines. I want to extract a specific string from a line and then compare that string against every line in the file. All lines that contain the string should be written to another text file, one beneath the other. So far I have managed to extract the specific string from the first line and write out all the lines that match it. The following code solves the problem for the first line only.

    #include <iostream>
    #include <fstream>
    #include <string>

    using namespace std;

    // gets the specific string from the first line
    string FirstLineSplitedString()
    {
        ifstream infile;
        infile.open("test.txt");
        string fline;
        string splited;
        if (infile.good())
        {
            // read the first line
            string sLine;
            getline(infile, sLine);
            fline = sLine;

            // specific string: from just after the first '_' up to the last '.'
            int first = fline.find('_');
            int last = fline.find_last_of('.');
            splited = fline.substr(first + 1, last - first);
        }
        return splited;
    }

    int main()
    {
        string SString = FirstLineSplitedString();
        ifstream stream1("test.txt");
        string line;
        ofstream stream2("output.txt");

        // copy every line that contains the specific string
        while (std::getline(stream1, line))
        {
            if (line.find(SString) != string::npos)
                stream2 << line << endl;
        }

        stream1.close();
        stream2.close();
        return 0;
    }

What I couldn't figure out: how to do this for the whole document. Once I have extracted the specific string from the first line and written all the lines that match it, how do I move on to the next line, repeat the same steps, and write all of its matching lines beneath each other? Also, if a line has no match, only that line itself should be written to the file.

For example, say I have a file test.txt that contains the following:

aaaaaa _men there here. there. so on
bbbb _men there here. there. so on
aaaabbbbbaa _from from. there. so on
zzzzzzzz _from from. there. so on
aaabbbbbaaa _men there here. there. so on
aabbbbaaaa _men there here. there. so on
nnnnnnn _from from. there. so on

When I run the code, I get the following lines in my output.txt:
aaaaaa _men there here. there. so on
bbbb _men there here. there. so on
aaabbbbbaaa _men there here. there. so on
aabbbbaaaa _men there here. there. so on

This is correct, because I split each line to get the specific string from the first (_) to the last (.). Now I want to do the same for the next line that differs from the first and collect its matches too. Below is the output.txt that I want to produce from test.txt:

aaaaaa _men there here. there. so on
bbbb _men there here. there. so on
aaabbbbbaaa _men there here. there. so on
aabbbbaaaa _men there here. there. so on


aaaabbbbbaa _from from. there. so on
zzzzzzzz _from from. there. so on
nnnnnnn _from from. there. so on


This pattern should continue until the last line of the file.

Sorry for writing so much, but I wanted to be as clear as possible. Any help would be appreciated.
Also note that the lines that match a specific string may come right below each other or may come 2000 lines later.


Solution

  • So I take it you need to group the lines of the input file by a substring key.

    The simplest way would be to populate in-memory collections of line groups as you read the file and then, after processing the entire input, flush the groups to the output file:

    #include <iostream>
    #include <string>
    #include <fstream>
    #include <deque>
    
    using namespace std;
    
    string findGroupKey(const string &line)
    {
        size_t first = line.find('_');
        if (first == string::npos)
            first = 0;
        size_t last = line.find_last_of('.');
        // Take the rest of the line if there is no '.' at or after the start position
        size_t len = (last == string::npos || last < first) ? string::npos : last - first + 1;
        // The key includes the start and end chars themselves in order to
        // distinguish lines like "x_test." and "xtest"
        return line.substr(first,len);
    }
    
    int main()
    {
        // Read the input file as a stream, line by line, into inLine
        ifstream inStream("test.txt");
        if (!inStream) {
            cerr << "Cannot open test.txt" << endl;
            return 1;
        }
        string inLine;
        // Keys met so far, in order of first appearance
        deque<string> keys;
        // linesGrouped[i] holds every line whose key is keys[i]
        deque<deque<string> > linesGrouped;

        // *** Read the input file and group the lines in memory collections
        while (getline(inStream, inLine)) {
            string groupKey = findGroupKey(inLine);

            // Find the key in our keys-met-so-far collection
            size_t keyIndex = 0;
            while (keyIndex < keys.size() && keys[keyIndex] != groupKey)
                ++keyIndex;

            if (keyIndex == keys.size()) {
                // First time we see this key: record it and start a new group
                keys.push_back(groupKey);
                linesGrouped.push_back(deque<string>());
            }
            // Add the line to its respective group
            linesGrouped[keyIndex].push_back(inLine);
        }

        // *** Write the groups into the output file
        ofstream outStream("output.txt");
        for (size_t i = 0; i < linesGrouped.size(); i++) {
            for (size_t j = 0; j < linesGrouped[i].size(); j++)
                outStream << linesGrouped[i][j] << endl;
            // Add a delimiter line (uncomment if you need one)
            //if (i + 1 < linesGrouped.size())
            //  outStream << "-------------------" << endl;
        }
        return 0;
    }
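
The linear search over the keys met so far is fine for 5000 lines, but it gets slow if the file contains many distinct keys. As a possible variation (a sketch, not part of the original answer), the key lookup can go through a `std::unordered_map` that maps each key to the index of its group. The names `groupKey`, `groupLines` and `groupText` are hypothetical, and the single blank line written between groups is an assumption based on the desired output shown in the question.

```cpp
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Same key rule as in the answer: from the first '_' through the last '.'.
// If there is no '_', the key starts at the beginning of the line.
std::string groupKey(const std::string &line)
{
    std::size_t first = line.find('_');
    if (first == std::string::npos)
        first = 0;
    std::size_t last = line.find_last_of('.');
    if (last == std::string::npos || last < first)
        return line.substr(first);
    return line.substr(first, last - first + 1);
}

// Reads every line from `in` and writes the lines to `out` grouped by key.
// Groups are written in order of first appearance.
void groupLines(std::istream &in, std::ostream &out)
{
    std::vector<std::vector<std::string> > groups;
    std::unordered_map<std::string, std::size_t> indexOfKey;

    std::string line;
    while (std::getline(in, line)) {
        // emplace only inserts when the key is new; `second` reports which case we hit
        auto result = indexOfKey.emplace(groupKey(line), groups.size());
        if (result.second)
            groups.emplace_back();
        groups[result.first->second].push_back(line);
    }

    for (std::size_t i = 0; i < groups.size(); ++i) {
        if (i > 0)
            out << '\n'; // assumed separator: one blank line between groups
        for (const std::string &l : groups[i])
            out << l << '\n';
    }
}

// Convenience wrapper for trying the grouping on an in-memory string.
std::string groupText(const std::string &text)
{
    std::istringstream in(text);
    std::ostringstream out;
    groupLines(in, out);
    return out.str();
}
```

To use it on files, pass an `std::ifstream` opened on test.txt and an `std::ofstream` opened on output.txt to `groupLines`. Taking streams as parameters rather than file names also makes the function easy to try with string streams, which is what `groupText` does.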