c++ arrays string io

How should I read in strings from 2 separate text files and compare for matches?


I have 2 text files of strings (a few hundred in each file). The idea is to compare the contents of each: if a match is found, the string is written to one file; if no match is found, the string is written to a different file. Essentially, input file 1 contains a master list of names, and I am comparing it to input file 2. So we take name 1 on the master list and compare that name to every name in the other input file.

The main part I am stuck on is writing an algorithm that will correctly traverse the files. I am also not sure whether I am comparing the strings correctly, but that could be a result of other errors. I am new to C++, so I am not fully aware of all the rules of the language.

There are 480 names on the master list and 303 names on the 2nd list, so there should be 303 names matching and 177 not matching. I made counters to check that the numbers match up. First I tried a simple while loop that ran as long as input was being read from the master file, but I ran into a problem where I wasn't matching all of the names (which makes sense). I thought I needed something a little more complex, so I tried reading all of the values from each input file into their own arrays and comparing the elements of the arrays. I read and successfully printed the arrays, but I ran into some other issues, namely a segmentation fault, apparently caused by sizeof(), which I am still trying to troubleshoot. I tried doing it like this:

//Had problems with making empty arrays
string arrMidasMaster[480];
string arrMidasMath[303];

for (int i = 0; i < sizeof(arrMidasMaster); ++i)
    {
        for (int j = 0; j < sizeof(arrMidasMath); ++j)
        {
            if (arrMidasMaster[i] == arrMidasMath[j]) //match
            {
                outData_Elig << arrMidasMaster[i] << endl;
                num_eligible ++; //counter
            }
            else                                      //No match
            {
                continue;
                //Where can I put these statements?
                //outData_Ineli << arrMidasMaster[i] << endl;
                //num_ineligible ++; //counter
            }
        }
    }

In the grand scheme of things this looks like it should be able to do what I need, but it still needs work. Other than the segmentation fault, the if/else statement is a problem. I need to use continue to keep going until a match is found, but if a match is never found, it looks like control just goes back to the outer loop and tests the next name, whereas I want it to execute the 2 commented-out statements shown above. I hope this is enough to go off of.
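The two issues described above can be sketched as follows. This is only a minimal illustration, not necessarily how the final program should look: `Counts`, `countOf` and `splitNames` are hypothetical names introduced here. It assumes the loop bound should be the number of elements (sizeof() on an array gives its size in bytes), and that the "no match" statements belong after the inner loop, guarded by a flag.

```cpp
#include <cstddef>
#include <iostream>
#include <string>

// Hypothetical result type for this sketch.
struct Counts
{
    int eligible = 0;
    int ineligible = 0;
};

// Number of elements in a built-in array. sizeof(arr) alone is the size in
// bytes, which is larger than the element count for an array of strings.
template <typename T, std::size_t N>
constexpr std::size_t countOf(const T (&)[N])
{
    return N;
}

// Hypothetical helper: writes each master name to `elig` if it also appears
// in `other`, otherwise to `inelig`, and returns both counters.
Counts splitNames(const std::string* master, std::size_t masterCount,
                  const std::string* other, std::size_t otherCount,
                  std::ostream& elig, std::ostream& inelig)
{
    Counts counts;
    for (std::size_t i = 0; i < masterCount; ++i)
    {
        bool found = false;
        for (std::size_t j = 0; j < otherCount; ++j)
        {
            if (master[i] == other[j])
            {
                found = true;
                break; // no need to look at the remaining names
            }
        }
        // The decision is made once per master name, after the inner loop.
        if (found)
        {
            elig << master[i] << '\n';
            ++counts.eligible;
        }
        else
        {
            inelig << master[i] << '\n';
            ++counts.ineligible;
        }
    }
    return counts;
}
```

With the arrays from the question, this could be called as splitNames(arrMidasMaster, countOf(arrMidasMaster), arrMidasMath, countOf(arrMidasMath), outData_Elig, outData_Ineli).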


Solution

  • I was able to find the correct answer to the problem with everyone's help, thank you. The find() function searches for an element equal to string line in the range from vComp's 1st element to its last, and returns vComp.end() if there is none. The order of comparison matters: the outer loop has to read the master file. If you compare the files in the reverse order (reading the shorter list line by line and searching the master list), you will only find matches, because string line then only ever holds names that are also on the master list, so the non-matching names never show up. This is always true when comparing a select group to the whole.

    #include <algorithm> // find
    #include <fstream>
    #include <iostream>
    #include <string>
    #include <vector>
    using namespace std;

    int main()
    {
        // Variable declarations
        ifstream inData_Master, inData_Comp;
        ofstream outData_Match, outData_NonMatch;
        int num_Matches = 0, num_NonMatches = 0; // counters
        vector<string> vComp; // holds the compare list
        string line;

        inData_Master.open("Input_File_Master.txt");
        inData_Comp.open("Input_File_Comp.txt");
        outData_Match.open("Output_File_Match.txt");
        outData_NonMatch.open("Output_File_NonMatch.txt");

        // Stop early if either input file could not be opened
        if (!inData_Master || !inData_Comp)
        {
            cerr << "Could not open an input file" << endl;
            return 1;
        }

        // Read the Comp file line by line and save every line in vComp
        while (getline(inData_Comp, line))
        {
            vComp.push_back(line); // appends line to the end of the vector
        }
        // Read the master file line by line and look each line up in vComp
        while (getline(inData_Master, line))
        {
            if (find(vComp.begin(), vComp.end(), line) != vComp.end()) // find returns vComp.end() when there is no match
            {
                outData_Match << line << endl; // match found
                num_Matches++; // counts match
            }
            else
            {
                outData_NonMatch << line << endl; // match not found
                num_NonMatches++; // counts non match
            }
        }
        cout << "Matches: " << num_Matches << endl;
        cout << "NonMatches: " << num_NonMatches << endl;
        return 0;
    }
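
    The point about comparison order can be checked on in-memory data. The sketch below is an illustration under assumptions, not the code above: Partition and partitionLines are hypothetical names, and a std::unordered_set lookup is swapped in for the linear find() search, which may be worth considering for larger lists.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical result type for this sketch.
struct Partition
{
    std::vector<std::string> matches;
    std::vector<std::string> nonMatches;
};

// Hypothetical helper: splits `outer` by whether each line also appears in
// `lookup`. A hash set replaces the linear find() search used above.
Partition partitionLines(const std::vector<std::string>& outer,
                         const std::vector<std::string>& lookup)
{
    const std::unordered_set<std::string> seen(lookup.begin(), lookup.end());
    Partition result;
    for (const std::string& line : outer)
    {
        if (seen.count(line) != 0)
            result.matches.push_back(line);    // line is on both lists
        else
            result.nonMatches.push_back(line); // line is only in `outer`
    }
    return result;
}
```

    Calling partitionLines(master, comp) with the master list as the outer list yields both matches and non-matches. Calling partitionLines(comp, master) with a comp list that is a subset of the master list yields an empty nonMatches vector, which is the "only find matches" effect described above.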