SystemC Error with wait() in SC_THREAD: "wait() is only allowed in SC_THREADs and SC_CTHREADs"

I'm working on a convolutional neural network simulation in SystemC for a school assignment. My code includes a Conv2d module with an SC_THREAD that performs the forward pass of a convolutional layer and waits for an input-ready signal before computing. When I call sc_start() to begin the simulation, I get an error about the use of wait() within the SC_THREAD. The exact error message is:

The Error Message

Error: (E519) wait() is only allowed in SC_THREADs and SC_CTHREADs: 
        in SC_METHODs use next_trigger() instead
In file: ../../../src/sysc/kernel/sc_wait.cpp:94
make: *** [all] Error 1

This happens even though I've clearly registered my forward pass function as an SC_THREAD. Here's the relevant part of my module definition:

The Module Definition (Conv2d.h)

// Conv2d.h
#ifndef CONV2D_H
#define CONV2D_H

#include <systemc.h>

#include <utility>  // std::pair
#include <vector>   // std::vector

SC_MODULE(Conv2d) {
private:
    // Layer configuration parameters
    unsigned int in_channels, out_channels;
    unsigned int kernel_height, kernel_width;
    unsigned int stride_height, stride_width;
    unsigned int padding_height, padding_width;
    bool apply_relu;    // Apply ReLU activation after convolution

    // Feature map dimensions
    unsigned int input_feature_map_height, input_feature_map_width;
    unsigned int output_feature_map_height, output_feature_map_width;
    // unsigned int input_feature_map_size, output_feature_map_size;    // Calculated from the above parameters

    // Layer parameters
    std::vector<std::vector<std::vector<std::vector<float>>>> weights;
    std::vector<float> bias;

public:
    // Define the ports for the module
    // Assuming input_feature_map_size and output_feature_map_size are calculated outside this module
    // We need this since the number of ports has to be determined for the module prior to forward passes
    sc_vector<sc_fifo_in<float>> input_feature_map;     // FIFOs offer buffering and prevent race conditions, in case we need to run successive inferences
    sc_vector<sc_fifo_out<float>> output_feature_map;
    sc_in<bool> input_ready; // Signal indicating input is ready
    sc_out<bool> output_ready; // Signal indicating output is ready

    // Constructor with configuration parameters
    SC_HAS_PROCESS(Conv2d);
    Conv2d(sc_module_name name)
        : in_channels(1), out_channels(1),
          kernel_height(3), kernel_width(3),
          stride_height(1), stride_width(1),
          padding_height(1), padding_width(1),
          apply_relu(false),
          input_feature_map_height(3), input_feature_map_width(3),
          output_feature_map_height(3), output_feature_map_width(3),
          input_ready("input_ready"), output_ready("output_ready") {
        // initialize parameters of the convolutional layer's weights and biases
        initialize_parameters();

        // Register the forward pass function with the SystemC kernel
        SC_THREAD(forward_pass);
        sensitive << input_ready.pos();
        dont_initialize();  // Ensure the thread is not triggered upon initialization
    }

    void configure(unsigned int in_c, unsigned int out_c,
                   std::pair<unsigned int, unsigned int> kernel_size,
                   std::pair<unsigned int, unsigned int> stride,
                   std::pair<unsigned int, unsigned int> padding,
                   bool relu,
                   unsigned int in_feature_map_size, unsigned int out_feature_map_size,
                   std::pair<unsigned int, unsigned int> in_feature_map_dimension, std::pair<unsigned int, unsigned int> out_feature_map_dimension) {
        // Configure the layer with the given parameters
        in_channels = in_c;
        out_channels = out_c;
        kernel_height = kernel_size.first;
        kernel_width = kernel_size.second;
        stride_height = stride.first;
        stride_width = stride.second;
        padding_height = padding.first;
        padding_width = padding.second;
        apply_relu = relu;

        // Initialize input and output feature maps
        input_feature_map_height = in_feature_map_dimension.first;
        input_feature_map_width = in_feature_map_dimension.second;
        input_feature_map.init(in_feature_map_size);
        output_feature_map_height = out_feature_map_dimension.first;
        output_feature_map_width = out_feature_map_dimension.second;
        output_feature_map.init(out_feature_map_size);

        // Re-initialize parameters
        initialize_parameters();
    }

    // Forward computation using pure C++ primitives
    // We're assuming that the input/output feature maps are in the shape of (C, H, W)
    // and operates directly on it without reconstructing the 1D array back to 3D
    void forward_pass() {
        while(true) {
            wait(); // Wait for input_ready signal
            for (unsigned int out_c = 0; out_c < out_channels; ++out_c) {
                for (unsigned int h = 0; h < output_feature_map_height; ++h) {
                    for (unsigned int w = 0; w < output_feature_map_width; ++w) {
                        float sum = 0.0;
                        for (unsigned int in_c = 0; in_c < in_channels; ++in_c) {
                            for (unsigned int kh = 0; kh < kernel_height; ++kh) {
                                for (unsigned int kw = 0; kw < kernel_width; ++kw) {
                                    // Calculate the input index, considering stride and padding
                                    int h_index = h * stride_height + kh - padding_height;
                                    int w_index = w * stride_width + kw - padding_width;

                                    if (h_index >= 0 && h_index < input_feature_map_height && w_index >= 0 && w_index < input_feature_map_width) {
                                        int input_index = in_c * input_feature_map_height * input_feature_map_width + h_index * input_feature_map_width + w_index;
                                        sum += input_feature_map[input_index].read() * weights[out_c][in_c][kh][kw];
                                    }
                                }
                            }
                        }
                        sum += bias[out_c];
                        if (apply_relu && sum < 0) {
                            sum = 0.0;
                        }
                        int output_index = out_c * output_feature_map_height * output_feature_map_width + h * output_feature_map_width + w;
                        output_feature_map[output_index].write(sum);
                    }
                }
            }

            output_ready.write(true);   // Indicate that output is ready
            wait(1, SC_NS); // Wait for 1 ns to ensure the signal is read before resetting
            output_ready.write(false);  // Reset
        }
    }
};

#endif // CONV2D_H

The code that instantiates the module and wires it to the test data is written directly in sc_main(). This might be the source of the problem, but I'm hesitant to wrap it in a separate testbench module and complicate matters further.

The Testing Code (main.cpp)

// main.cpp
#include <systemc.h>

#include <vector>
#include <tuple>
#include <iostream>
#include <iomanip>
#include <fstream>

#include "Conv2d.h"
#include "helpers.h"


int sc_main(int argc, char* argv[]) {
    // Example instantiation and configuration
    Conv2d conv_layer("ConvolutionalLayer");
    conv_layer.configure(
        3, 64,
        std::make_pair(11, 11),
        std::make_pair(4, 4),
        std::make_pair(2, 2),
        true,
        150528,
        193600,
        std::make_pair(224, 224),
        std::make_pair(55, 55)
        );

    // Assuming you know the dimensions and shape (C_out, C_in, H, W) of the convolutional layer
    auto conv_layer_shape = conv_layer.weight_shape();
    int out_channels = std::get<0>(conv_layer_shape);
    int in_channels = std::get<1>(conv_layer_shape);
    int rows = std::get<2>(conv_layer_shape);
    int cols = std::get<3>(conv_layer_shape);

    // Load weights from file
    auto weights = reshape_weights(load_weights("./data/conv1_weight.txt"), out_channels, in_channels, rows, cols); // Reshape the weights flat vector into the 4D weights vector
    auto biases = load_weights("./data/conv1_bias.txt");    // Load biases from file, no need to reshape
    conv_layer.load_parameters(weights, biases); // Load the weights and biases into the layer

    // Start the simulation
    // Load image data first
    auto image_data = load_image("./data/cat.txt");

    // Connect the input and output feature maps to the layer
    sc_vector<sc_fifo<float>> input_feature_map_sig("input_feature_map_sig", 150528);
    for (size_t i = 0; i < input_feature_map_sig.size(); i++) {
        conv_layer.input_feature_map[i](input_feature_map_sig[i]);
    }
    sc_vector<sc_fifo<float>> output_feature_map_sig("output_feature_map_sig", 193600);
    for (size_t i = 0; i < output_feature_map_sig.size(); i++) {
        conv_layer.output_feature_map[i](output_feature_map_sig[i]);
    }
    sc_signal<bool> input_ready_sig;
    conv_layer.input_ready(input_ready_sig);
    sc_signal<bool> output_ready_sig;
    conv_layer.output_ready(output_ready_sig);

    // feed the data and signal the layer
    for (size_t i = 0; i < input_feature_map_sig.size(); i++) {
        input_feature_map_sig[i].write(image_data[i]);
    }
    input_ready_sig.write(true);

    // Start the simulation if using SC_THREAD or SC_METHOD for computation
    sc_start(); // Run the simulation

    return 0;
}

I've ensured that my wait() call is indeed inside an SC_THREAD (specifically, the forward_pass method registered as an SC_THREAD in my Conv2d module's constructor). I was expecting the simulation to run without any errors related to wait() usage since, to my understanding, wait() is correctly used within SC_THREAD.

Again, this is the simplified version of my module definition above:

SC_MODULE(Conv2d) {
    // Constructor
    SC_HAS_PROCESS(Conv2d);
    Conv2d(sc_module_name name) {
        SC_THREAD(forward_pass);
        sensitive << input_ready.pos();
        dont_initialize();
    }

    void forward_pass() {
        while(true) {
            wait(); // Wait for input_ready signal
            // Forward pass computations follow...
        }
    }
};

I was expecting the simulation to start and the forward_pass method to wait for the input_ready signal as per the usual operation of an SC_THREAD. The error seems to suggest that wait() is being misused, but from my understanding and according to the SystemC documentation, its usage is correct in this context.

Solution

The complaint about wait() being called outside of a thread comes from the writes to the sc_fifo channels in sc_main(). sc_fifo::write() is a blocking call, which means it calls wait() if the FIFO cannot currently be written (i.e. it has a fixed size and is full).

To confirm that the sc_fifo writes are the origin of the error, you can place a breakpoint at the point where the error is reported and follow the stack trace back to the call that triggered it. Using gdb:

  gdb sim.exe
  break sc_report_error
  run
  bt

The solution to the problem is to create another module/thread to drive the stimulus to your Conv2d. Reading and writing sc_fifos and sc_signals should only be done inside processes (threads/methods) that the kernel runs once sc_start() has been called.