Tags: c++, struct, stdvector, subclassing

How to create a proper vector of substruct instances?


I'm trying to make a very basic tokenizer/lexer.

To do this, I'm making a main struct called Token that all types of tokens will inherit from, such as IntToken and PlusToken.

Every token type will include a type variable (a string) and a to_string function that returns a representation such as Token(PLUS) or Token(INT, 5), where 5 is replaced by the token's integer value.

I've looked at many questions on SO, and it looks like I need to make a vector of type std::shared_ptr<BaseClass> (in my case, BaseClass would be Token): https://stackoverflow.com/a/20127962/12101554

I first tried writing this the way I thought it should work. When that failed, I searched SO and found the answer linked above, but it doesn't seem to work either.

Am I following the answer wrong, did I make some other error, or is this not possible to do in C++ without a lot of other code?

(I have also tried converting all the structs to classes and adding public:, but that makes no difference.)

#include <cctype>    // std::isdigit
#include <iostream>
#include <memory>    // std::shared_ptr, std::make_shared
#include <string>
#include <vector>

struct Token {
    std::string type = "Uninitialized";
    virtual std::string to_string() { return "Not implemented"; };
};

struct IntToken : public Token {
    IntToken(int value) {
        this->value = value;
    }
    std::string type = "INT";
    int value;
    std::string to_string() {
        return "Token(INT, " + std::to_string(value) + ")";
    }
};

struct PlusToken : public Token {
    std::string type = "PLUS";
};

std::vector<std::shared_ptr<Token>> tokenize(std::string input) {
    std::vector<std::shared_ptr<Token>>  tokens;
    for (int i = 0; i < input.length(); i++) {
        char c = input[i];
        if (std::isdigit(c)) {
            std::cout << "Digit" << std::endl;
            IntToken t = IntToken(c - 48);
            std::cout << t.value << std::endl;
            tokens.push_back(std::make_shared<IntToken>(t));
        }
        else if (c == '+') {
            std::cout << "Plus" << std::endl;
            PlusToken p = PlusToken();
            tokens.push_back(std::make_shared<PlusToken>(p));
        }
    }
    return tokens;
}

int main()
{
    std::string input = "5+55";
    std::vector<std::shared_ptr<Token>> tokens = tokenize(input);
    for (int i = 0; i < tokens.size(); i++) {
        //std::cout << tokens[i].to_string() << std::endl;
        std::cout << tokens[i]->type << std::endl;
    }
}

Current Output:

Digit
5
Plus
Digit
5
Digit
5
Uninitialized
Uninitialized
Uninitialized
Uninitialized

Expected Output: (with current code)

Digit
5
Plus
Digit
5
Digit
5
Token(INT, 5)
Token(PLUS)
Token(INT, 5)
Token(INT, 5)

Note: Yes, I know that the proper tokenization would be (5) (+) (55), but I'm still creating the basic part.


Solution

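  • You are giving your derived classes their own type member variables. Data members are not virtual, so these new members hide Token::type rather than replace it. Because tokens[i] is a pointer to Token, tokens[i]->type reads the base-class member, which is still "Uninitialized". Instead, you should be setting the type that belongs to the base class inside the derived-class constructors.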
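
    A minimal sketch of that fix, assuming the rest of the design stays as in the question. The Token constructor that takes the type string, the virtual destructor, the const/override qualifiers, and the base-class to_string that formats Token(<type>) are additions made for illustration and are not part of the original code:

```cpp
#include <cctype>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Token {
    std::string type;  // the only `type` member; derived classes do not redeclare it

    // Assumed helper constructor: derived classes pass their type name up to the base.
    explicit Token(std::string t = "Uninitialized") : type(std::move(t)) {}

    // Good practice for any polymorphic base class.
    virtual ~Token() = default;

    // Assumed default representation for tokens that carry no value.
    virtual std::string to_string() const { return "Token(" + type + ")"; }
};

struct IntToken : Token {
    int value;

    // Initializes the base-class `type` instead of declaring a second member.
    explicit IntToken(int v) : Token("INT"), value(v) {}

    std::string to_string() const override {
        return "Token(INT, " + std::to_string(value) + ")";
    }
};

struct PlusToken : Token {
    PlusToken() : Token("PLUS") {}
};

// Same single-character tokenizer as in the question, minus the debug output.
std::vector<std::shared_ptr<Token>> tokenize(const std::string& input) {
    std::vector<std::shared_ptr<Token>> tokens;
    for (char c : input) {
        if (std::isdigit(static_cast<unsigned char>(c))) {
            tokens.push_back(std::make_shared<IntToken>(c - '0'));
        } else if (c == '+') {
            tokens.push_back(std::make_shared<PlusToken>());
        }
    }
    return tokens;
}
```

    With this version, calling tokenize("5+55") and printing tokens[i]->to_string() for each element should give Token(INT, 5), Token(PLUS), Token(INT, 5), Token(INT, 5), and tokens[i]->type reads INT or PLUS because there is only one type member left to read.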