
Access violation reading location using binary file


First off, I know there are posts with similar problems, but I cannot find the solution to mine in any of them.

This is for a programming assignment using binary and text files to store "Corporate sales data." (Division name, quarter, and sales), and then to search for specified records inside the binary data file and display them.

Here are the important parts of my code:

#include stuff
...

// Struct to hold division data
struct DIVISION_DATA_S
{
    string divisionName;
    int quarter;
    double sales;
};

int main()
{
    ...

    // Open the data file
    fstream dataFile;
    dataFile.open(dataFilePath, ios::in | ios::out | ios::binary);

    ... Get data from user, store in an instance of my struct ...

    // Dump struct into binary file
    dataFile.write(reinterpret_cast<char *>(&divisionData), sizeof(divisionData));        

    // Cycle through the targets file and display the record from divisiondata.dat for each entry
while(targetsFile >> targetDivisionName)
{       
    int targetQuarter;  // Target quarter
    string targetQuarterStr;
    targetsFile.ignore();   // Ignore the residual '\n' from the ">>" read
    getline(targetsFile, targetQuarterStr);
    targetQuarter = atoi(targetQuarterStr.c_str()); // Parses into an int

    cout << "Target: " << targetDivisionName << " " << targetQuarter << endl;

    // Linear search the data file for the required name and quarter to find sales amount
    double salesOfTarget;
    bool isFound = false;
    while (!isFound && !dataFile.eof())
    {
        cout << "Found division data: " << targetDivisionName << " " << targetQuarter << endl;
        DIVISION_DATA_S divisionData;

        // Read an object from the file, cast as DIVISION_DATA_S
        dataFile.read(reinterpret_cast<char *>(&divisionData), sizeof(divisionData));
        cout << "Successfully read data for " << targetDivisionName << " " << targetQuarter << endl
            << "Name: " << divisionData.divisionName << ", Q: " << divisionData.quarter << ", "
            << "Sales: " << divisionData.sales << endl;

        // Test for a match of both fields
        if (divisionData.divisionName == targetDivisionName && divisionData.quarter == targetQuarter)
        {
            isFound = true;
            cout << "Match!" << endl;
            salesOfTarget = divisionData.sales;
        }
    }
    if (!isFound)   // Error message if record is not found in data file
    {
        cout << "\nError. Could not find record for " << targetDivisionName
            << " division, quarter " << targetQuarter << endl;
    }
    else
    {
        // Display the corresponding record
        cout << "Division: " << targetDivisionName << ", Quarter: " << targetQuarter
            << "Sales: " << salesOfTarget << endl;
        totalSales += salesOfTarget;    // Add current sales to the sales accumulator
        numberOfSalesFound++;   // Increment total number of sales found
    }
}

Sorry for the lack of indent for the while loop, copy/paste kind of messed it up.

My problem appears when attempting to access information read from the binary file. For instance, when it tries to execute the cout statement I added for debugging, it gives me this error:

Unhandled exception at 0x0FED70B6 (msvcp140d.dll) in CorporateSalesData.exe: 0xC0000005: Access violation reading location 0x310A0D68.

Now, from what I have read, it seems that this means something is trying to read from the very low regions of memory, AKA something somewhere has something to do with a null pointer, but I can't imagine how that would appear. This whole read operation is copied exactly from my textbook, and I have no idea what a reinterpret_cast is, let alone how it works or how to fix errors with it. Please help?

EDIT: Thanks for all the help. To avoid complications or using something I don't fully understand, I'm gonna go with switching to a c-string for the divisionName.


Solution

    // Dump struct into binary file
    dataFile.write(reinterpret_cast<char *>(&divisionData), sizeof(divisionData)); 
    
    /*...*/
    
    // Read an object from the file, cast as DIVISION_DATA_S
    dataFile.read(reinterpret_cast<char *>(&divisionData), sizeof(divisionData));
    

    This will categorically not work under any circumstances.

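    std::string typically keeps its characters in a separately allocated heap buffer and stores only a pointer to that buffer (along with some meta-data such as size and capacity) inside the object itself. What you're writing to the file is therefore not the contents of the string, but simply the address where the string's data happened to live at that moment. When you read those bytes back into a fresh object and then use it (like you are in the cout statement), the string follows a pointer into memory it does not own, and that is what produces the access violation.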
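
As a rough compile-time check of the point above: dumping an object with write() and restoring it with read() is only safe when the type is trivially copyable, and a struct that contains a std::string is not. This is a minimal sketch; the names WithString, WithArray and safe_for_raw_io are made up for illustration and do not come from the question.

```cpp
#include <string>
#include <type_traits>

// Hypothetical stand-ins for the two struct layouts discussed in this answer.
struct WithString { std::string divisionName; int quarter; double sales; };
struct WithArray  { char divisionName[500];   int quarter; double sales; };

// True only for types whose raw bytes can be copied out and back in.
template <typename T>
constexpr bool safe_for_raw_io() {
    return std::is_trivially_copyable<T>::value;
}

// sizeof(WithString) is fixed at compile time, so a raw write of that many
// bytes cannot be capturing the characters of an arbitrarily long name.
static_assert(!safe_for_raw_io<WithString>(), "std::string owns memory elsewhere");
static_assert(safe_for_raw_io<WithArray>(), "a char array is stored inline");
```

Adding a static_assert like this next to any raw read()/write() call turns this class of bug into a compile error instead of a crash at run time.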

    You have two options.

    If all you want is a struct that can be easily serialized, then simply convert it like so:

    // Struct to hold division data
    struct DIVISION_DATA_S
    {
        char divisionName[500];
        int quarter;
        double sales;
    };
    

    Of course, with this style, you're limited to interacting with the name as a c-string, and also are limited to 500 characters.

    The other option is to properly serialize this object.

    // Struct to hold division data
    // Requires <cstdint>, <cstring> and <string>.
    struct DIVISION_DATA_S
    {
        string divisionName;
        int quarter;
        double sales;
    
        // Layout: 8-byte name length, name bytes, 4-byte quarter, 8-byte sales,
        // all little-endian. Assumes a 32-bit int and a 64-bit double.
        string serialize() const { // Could also return std::vector<char>, but a string makes writing easier.
            string output;
            std::uint64_t size_of_string = divisionName.size();
            for(int i = 0; i < 8; i++) {
                output.push_back(static_cast<char>(size_of_string & 0xFF));
                size_of_string >>= 8;
            }
            output += divisionName;
            std::uint32_t temp_quarter = static_cast<std::uint32_t>(quarter);
            for(int i = 0; i < 4; i++) {
                output.push_back(static_cast<char>(temp_quarter & 0xFF));
                temp_quarter >>= 8;
            }
            // Copy the bits of the double; reinterpret_cast cannot convert a double to an integer.
            std::uint64_t temp_sales;
            std::memcpy(&temp_sales, &sales, sizeof(temp_sales));
            for(int i = 0; i < 8; i++) {
                output.push_back(static_cast<char>(temp_sales & 0xFF));
                temp_sales >>= 8;
            }
            return output;
        }
    
        // Returns the number of bytes consumed. Assumes input starts with one complete record.
        size_t unserialize(const string & input) {
            std::uint64_t size_of_string = 0;
            for(int i = 7; i >= 0; i--) {
                size_of_string <<= 8;
                size_of_string |= static_cast<unsigned char>(input[i]);
            }
            // The name starts right after the 8-byte length prefix.
            divisionName = input.substr(8, size_of_string);
            size_t pos = 8 + size_of_string;
            std::uint32_t temp_quarter = 0;
            for(int i = 3; i >= 0; i--) {
                temp_quarter <<= 8;
                temp_quarter |= static_cast<unsigned char>(input[pos + i]);
            }
            quarter = static_cast<int>(temp_quarter);
            pos += 4;
            std::uint64_t temp_sales = 0;
            for(int i = 7; i >= 0; i--) {
                temp_sales <<= 8;
                temp_sales |= static_cast<unsigned char>(input[pos + i]);
            }
            std::memcpy(&sales, &temp_sales, sizeof(sales));
            return pos + 8;
        }
    };
    

    Writing to files is pretty easy:

    dataFile << divisionData.serialize();
    

    Reading can be a little harder:

    stringstream ss;
    ss << dataFile.rdbuf();
    string file_data = ss.str();
    size_t size = divisionData.unserialize(file_data);
    file_data = file_data.substr(size);
    size = divisionData.unserialize(file_data);
    /*...*/
    

    By the way, I haven't checked my code for syntax or completeness. This example is meant to serve as a reference for the kind of code that you'd need to write to properly serialize/unserialize complex objects. I believe it to be correct, but I wouldn't just throw it in untested.