c++ memory-management neural-network

Uninitialized nested pointers after initializing them (Not deterministic) [C++]


I'm developing a neural network in C++. I have this code to initialize a 4D array whose dimensions are, from outermost to innermost: timesteps, the layers within a timestep, the neurons within a layer, and the important values each neuron calculated (e.g. the linear function and the activation function).

Code that initializes it:

double**** execution_results = new double*** [real_t_count];

for (size_t t = 0; t < real_t_count; t++)
{
    std::tuple<double***, double**> inference_execution_results = ExecuteStore(X[t]);
    execution_results[t] = std::get<0>(inference_execution_results);
}

Inside NN->ExecuteStore(double*):

double*** execution_results = new double** [shape_length];

// Not instantiated due to input layer not being instantiated
execution_results[0] = NULL;
for (size_t i = 1; i < shape_length; i++)
{
    size_t layer_length = shape[i];
    ILayer* current_layer = layers[i - 1];
    double** current_layer_execution_results = execution_results[i] = new double* [layer_length];
    for (size_t j = 0; j < layer_length; j++)
    {
        INeuron* current_neuron = current_layer->neurons[j];
        current_layer_execution_results[j] = current_neuron->ExecuteStore(network_activations);
        
    }

}
return std::tuple<double***, double**>(execution_results, network_activations);

INeuron*->ExecuteStore(double**) returns an array of the values mentioned above, and as far as I can tell there is no problem with it (at least for now; I don't know what to expect, because I thought this was deterministic).

for (int layer_i = shape_length - 1; layer_i >= 1; layer_i--)
{
    // The current layer has no allocated neurons when layer_i == 2 (layers[1] == NULL)
    // execution_results[0][2][0] holds no stored data
    //          [t][l][n]
    size_t layer_length = shape[layer_i];
    ILayer* current_layer = layers[layer_i - 1];
    if (layer_i == 2) // the exact layer where the second error occurs
        int x = 0;
    for (size_t i = 0; i < layer_length; i++)
    {
        INeuron* current_neuron = current_layer->neurons[i]; // ERROR HERE (second most frequent)

        current_neuron->GetGradients(execution_results, real_t_count, gradients, network_costs, network_activations);
    }
}

INeuron->GetGradients(double* execution_results, double neuron_cost):

double linear_function_gradient = neuron_cost * Derivatives::DerivativeOf(execution_results[0], this->activation_function); // ERROR HERE (most frequent)

ERRORS: Exception thrown: read access violation. execution_results was 0xFFFFFFFFFFFFFFFF.

There is a third error, raised when deleting the arrays after these two; it occurs least often.

There is also a fourth error reporting corrupted memory ("heap corruption detected"). I tried on a second computer and the same thing happens.

I have omitted other variables for simplicity.

Other related functions such as constructors, and layer generation:

    size_t image_resolution = 2;
    size_t fake_image_count = 3;

    double** fake_images = new double* [fake_image_count];
    double** Y = new double*[fake_image_count];
    // fake_images and Y are initialized here

    size_t shape_length = 7;
    size_t* shape = new size_t[shape_length];
    shape[0] = image_resolution * image_resolution;
    /* With this smaller shape, no errors are raised:
    shape[1] = 2;
    shape[2] = 1;*/
    shape[1] = image_resolution * image_resolution * fake_image_count;
    shape[2] = (image_resolution * image_resolution * fake_image_count) / 1.2;
    shape[3] = 128;
    shape[4] = (image_resolution * image_resolution * fake_image_count) / 1.5;
    shape[5] = fake_image_count;
    shape[6] = 1;

    ILayer** layers = new ILayer* [shape_length - 1];
    for (size_t i = 1; i < shape_length; i++)
    {
        layers[i - 1] = (ILayer*)new DenseNeuronLayer(shape, i, ActivationFunctions::Sigmoid);
    }

    NN* n = new NN(layers, shape, shape_length);

public:
    NN(ILayer** layers_not_including_input_layer, size_t* shape_including_input_layer, size_t shape_length)
    {
        this->layers = layers_not_including_input_layer;
        this->shape = shape_including_input_layer;

        this->shape_length = shape_length;
    }

class DenseNeuronLayer : public ILayer
{
public:
    DenseNeuronLayer(size_t* network_shape, size_t layer_i, ActivationFunctions::ActivationFunction activation_function)
    {
        this->layer_length = network_shape[layer_i];
        this->neurons = new INeuron*[layer_length];
        for (size_t i = 0; i < layer_length; i++)
        {
            DenseConnections* connections = new DenseConnections(layer_i, i, network_shape);
            Neuron* neuron = new Neuron(connections, 1, activation_function);
            this->neurons[i] = neuron;
        }
    }
};

I tried checking whether the error comes from layer instantiation. It also seems that the neuron somehow disappears when current_neuron is retrieved. I don't know.

If you want to look at the full code, here is its GitHub repo


Solution

  • As suggested in the comments, you almost certainly want to use a single std::vector for your underlying storage, and provide 4-dimensional addressing into it.

    You typically want to overload an operator for that. Unless you have a fairly specific reason to do otherwise, I'd overload operator() to do it. Here's a bit of demo code for a 2D version. Extending it out to 4 dimensions should be fairly straightforward:

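    #include <vector>

    template <class T>
    class matrix {
        int columns_;
        std::vector<T> data;   // columns * rows elements in one contiguous block
    public:
        matrix(int columns, int rows) : columns_(columns), data(columns*rows) {}

        // Elements of the same row are adjacent in memory.
        T &operator()(int column, int row) { return data[row*columns_+column]; }
    };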
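
    As a rough sketch of that extension, the same idea with four indices might look like the following. The class name tensor4 and its member names are made up for illustration, and it assumes every dimension has a fixed extent; a ragged layout like the one in the question, where layers differ in length, would need padding to the largest layer or per-layer offsets instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 4D counterpart of the 2D demo; tensor4 and its member
// names are illustrative assumptions, not part of the original code.
template <class T>
class tensor4 {
    std::size_t d1_, d2_, d3_;   // extents of the three inner dimensions
    std::vector<T> data_;        // one contiguous, value-initialized block
public:
    tensor4(std::size_t d0, std::size_t d1, std::size_t d2, std::size_t d3)
        : d1_(d1), d2_(d2), d3_(d3), data_(d0 * d1 * d2 * d3) {}

    // The last index varies fastest. The assert checks the inner indices
    // in debug builds; at() throws std::out_of_range if the flat offset
    // falls outside the vector (for example, when i0 is too large).
    T &operator()(std::size_t i0, std::size_t i1, std::size_t i2, std::size_t i3) {
        assert(i1 < d1_ && i2 < d2_ && i3 < d3_);
        return data_.at(((i0 * d1_ + i1) * d2_ + i2) * d3_ + i3);
    }

    std::size_t size() const { return data_.size(); }
};
```

    With something like this, tensor4<double> results(timesteps, layers, neurons, values) owns all of its memory in one allocation, every element starts out as 0.0, and nothing has to be deleted by hand.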
    

    If you really want to maintain the normal C++ array subscripting syntax, like myMatrix[t][z][y][x], you can do that, but it's a little more complex. The problem is that although you can overload operator[], you can only pass it one argument representing a location in one dimension. So, for 2D addressing, you overload operator[] in your matrix class, and have it return a proxy that: 1) stores the index it was passed, and 2) has its own overload of operator[] that takes the second index, combines it with the stored one, and returns a reference to data[y * width + x].
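
    A minimal sketch of that 2D case, with a single proxy level, could look like this. The names matrix2 and row_proxy are hypothetical and chosen only for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2D matrix supporting m[y][x]; matrix2 and row_proxy are
// illustrative names.
template <class T>
class matrix2 {
    std::size_t width_;
    std::vector<T> data_;
public:
    // Returned by the first operator[]; it remembers the row index y.
    class row_proxy {
        matrix2 &m_;
        std::size_t y_;
    public:
        row_proxy(matrix2 &m, std::size_t y) : m_(m), y_(y) {}

        // The second operator[] supplies x and resolves the element.
        T &operator[](std::size_t x) {
            assert(x < m_.width_);
            return m_.data_[y_ * m_.width_ + x];
        }
    };

    matrix2(std::size_t height, std::size_t width)
        : width_(width), data_(height * width) {}

    row_proxy operator[](std::size_t y) { return row_proxy(*this, y); }
};
```

    The proxy only holds a reference and an index, so an optimizing compiler can typically inline the whole m[y][x] expression down to the same arithmetic as the operator() version.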

    For 4D addressing, you'll need three levels of proxies, each of which stores the indices received so far, and overloads operator[] to let the user enter the next index and create the next proxy with the indices it received at creation plus the one it just received as an argument. Then the final overload of operator[] uses the indices from all the proxies, does the multiplication and addition to find the right spot in the vector, and returns a reference to the correct location. Here's a sample:

    #include <vector>
    
    template<class T>
    class matrix4 {
    public:    
        int line_length, page_size, book_size;
        std::vector<T> data;
    
        // The members above are public, so the nested proxies can read them directly.
    
        class proxy1 { 
            matrix4 &m_;
            int book_, page_, line_;
        public:
            proxy1(matrix4 &m, int book, int page, int line)
                : m_(m), book_(book), page_(page), line_(line)
            {}
    
            T &operator[](int line_pos) {
                return m_.data[book_*m_.book_size + page_* m_.page_size + line_ * m_.line_length + line_pos];
            }
        };
    
        class proxy2 { 
            matrix4 &m_;
            int book_, page_;
        public:
            proxy2(matrix4 &m, int book, int page) 
                : m_(m), book_(book), page_(page) 
            {}
    
            proxy1 operator[](int line) { 
                return proxy1(m_, book_, page_, line);
            }
        };
    
        class proxy3 {
            matrix4 &m_;
            int book_;
        public:
            proxy3(matrix4 &m, int book) : m_(m), book_(book) {}
    
            proxy2 operator[](int page) { 
                return proxy2(m_,book_, page);
            }
        };
    public:
    
        matrix4(int books, int pages, int lines_per_page, int line_length) 
            : line_length(line_length)
            , page_size(lines_per_page * line_length)
            , book_size(pages * page_size)
            , data(book_size * books)
        {}
    
        proxy3 operator[](int book) {
            return proxy3(*this, book);
        }
    };
    
    #ifdef TEST
    #include <iostream>
    
    int main() { 
        matrix4<int> m(4, 5, 6, 7);
    
        m[0][0][0][0] = 1;
        m[1][2][3][4] = 2;
    
        for (int book = 0; book < 4; book++) {        
            std::cout << "\n\nbook: " << book;
            for (int page = 0; page < 5; page++) {
                std::cout << "\npage: " << page;
                for (int line=0; line<6; line++) {                
                    std::cout << "\n";
                    for (int x = 0; x<7; x++)
                        std::cout << m[book][page][line][x] << " ";
                }
            }
        }
    }
    #endif
    

    Define TEST when you compile (for example, with -DTEST on g++ or clang++) to get a small test program.

    Depending on the order you use for the arguments, you can make this either row-major or column-major.
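
    To make that difference concrete, here is a small sketch of the two index calculations for a 2D matrix. The helper names row_major_index and column_major_index are made up for this example:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers: offset of element (row, column) inside a flat
// buffer that stores a rows x columns matrix.

// Row-major: elements of the same row are adjacent in memory.
inline std::size_t row_major_index(std::size_t row, std::size_t column,
                                   std::size_t rows, std::size_t columns) {
    assert(row < rows && column < columns);
    return row * columns + column;
}

// Column-major: elements of the same column are adjacent in memory.
inline std::size_t column_major_index(std::size_t row, std::size_t column,
                                      std::size_t rows, std::size_t columns) {
    assert(row < rows && column < columns);
    return column * rows + row;
}
```

    For a 3 x 4 matrix, element (1, 2) lands at offset 6 in the row-major layout and at offset 7 in the column-major one. Whichever layout you pick, iterating with the fastest-varying index in the innermost loop keeps memory accesses sequential.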