c++ compiler-construction lexical-analysis static-variables

Creating a static unordered_set from keys of a static unordered_map


I'm writing a front-end for a compiler and currently working on implementing punctuator scanning functionality. I've got a Punctuator class that I'd like to use to represent punctuators from the input source file in my tokens, and additionally it'll be home to some static helper methods for scanning. Here is punctuator.h:

#ifndef PUNCTUATOR_H_
#define PUNCTUATOR_H_

#include <string>
#include <unordered_map>
#include <unordered_set>

#include "reserved_component.h"

enum class PunctuatorEnum {
    OPEN_BRACE,
    CLOSE_BRACE,
    TERMINATOR,
    EQUAL,
    PLUS,
    MINUS,
    MULTIPLY,
    DIVIDE,
};

class Punctuator : public ReservedComponent {
public:
    Punctuator(std::string lexeme);
    Punctuator(Punctuator&& punctuator);

    static bool IsPunctuator(std::string buffer);
    static PunctuatorEnum ForLexeme(std::string buffer);

private:
    PunctuatorEnum punctuator_enum;
    static std::unordered_map<std::string, PunctuatorEnum> dictionary;
};

#endif // PUNCTUATOR_H_

As you can see, I'm using a static unordered_map to store a "dictionary" of possible punctuators, and I use this map to check whether a given string represents a punctuator. I'm initializing the map inside the punctuator.c file here:

#include "punctuator.h"

std::unordered_map<std::string, PunctuatorEnum> Punctuator::dictionary = {
    { "{", PunctuatorEnum::OPEN_BRACE },
    { "}", PunctuatorEnum::CLOSE_BRACE },
    { ".", PunctuatorEnum::TERMINATOR },
    { "=", PunctuatorEnum::EQUAL },
    { "+", PunctuatorEnum::PLUS },
    { "-", PunctuatorEnum::MINUS },
    { "*", PunctuatorEnum::MULTIPLY },
    { "/", PunctuatorEnum::DIVIDE }
};

Punctuator::Punctuator(std::string lexeme) : ReservedComponent(lexeme) {
    this->punctuator_enum = dictionary.at(lexeme);
}

Punctuator::Punctuator(Punctuator&& punctuator) : ReservedComponent(std::move(punctuator)) {
    this->punctuator_enum = std::move(punctuator.punctuator_enum);
}

bool Punctuator::IsPunctuator(std::string buffer) {
    return dictionary.contains(buffer);
}

PunctuatorEnum Punctuator::ForLexeme(std::string lexeme) {
    return dictionary.at(lexeme);
}

I understand this is not the cleanest approach, since it forces me to keep PunctuatorEnum and the unordered_map in sync, but it's what I've chosen so far. I'd also like to implement a method StartsPunctuator(char c) to check whether a given char starts a punctuator.

My question is as follows: Is it possible to declare a static unordered_set variable in my Punctuator class, say punctuator_first_char, and have it initialized to store only the first chars of every entry of the key-set of the static member variable dictionary? This way I would be able to simply call the contains method on punctuator_first_char in my StartsPunctuator(char ch) method without having to iterate through dictionary.


Solution

  • Is it possible to declare a static unordered_set variable in my Punctuator class, say punctuator_first_char, and have it initialized to store only the first chars of every entry of the key-set of the static member variable dictionary?

    Yes:

    #include <unordered_set> 
    
    class Punctuator : public ReservedComponent {
       //...
    private:
        PunctuatorEnum punctuator_enum;
        static std::unordered_map<std::string, PunctuatorEnum> dictionary;
        static std::unordered_set<char> punctuator_first_char;
    };
    

    Initialization:

    #include <algorithm> // std::transform
    #include <iterator>  // std::inserter
    
    std::unordered_map<std::string, PunctuatorEnum> Punctuator::dictionary = { ... };
    
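    // Define this after `dictionary` in the same translation unit, so the map
    // is already initialized when the lambda below runs.
    std::unordered_set<char> Punctuator::punctuator_first_char = [] {
        std::unordered_set<char> rv;
        // Collect the first character of every key in the dictionary.
        std::transform(dictionary.begin(), dictionary.end(),
                       std::inserter(rv, rv.end()),
                       [](auto&& str_punctenum) { return str_punctenum.first[0]; });
        return rv;
    }();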
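
To show how the pieces fit together, here is a minimal self-contained sketch of the same idea with the StartsPunctuator method from the question added. It rests on some assumptions: the ReservedComponent base class and the constructors are left out so the example compiles on its own, the dictionary is shortened to four entries, both static members are made const, and the body of StartsPunctuator is only one possible implementation.

```cpp
#include <algorithm>      // std::transform
#include <iterator>       // std::inserter
#include <string>
#include <unordered_map>
#include <unordered_set>

enum class PunctuatorEnum { OPEN_BRACE, CLOSE_BRACE, TERMINATOR, EQUAL };

// Simplified stand-in for the class in the question (assumption: no base
// class and no constructors, so the sketch stays self-contained).
class Punctuator {
public:
    // Hypothetical implementation: true if some punctuator starts with `c`.
    static bool StartsPunctuator(char c) {
        return punctuator_first_char.count(c) != 0;
    }

private:
    static const std::unordered_map<std::string, PunctuatorEnum> dictionary;
    static const std::unordered_set<char> punctuator_first_char;
};

// Defined first, so it is initialized before the set that is built from it.
const std::unordered_map<std::string, PunctuatorEnum> Punctuator::dictionary = {
    { "{", PunctuatorEnum::OPEN_BRACE },
    { "}", PunctuatorEnum::CLOSE_BRACE },
    { ".", PunctuatorEnum::TERMINATOR },
    { "=", PunctuatorEnum::EQUAL }
};

// Built once from the keys of `dictionary` by an immediately invoked lambda.
const std::unordered_set<char> Punctuator::punctuator_first_char = [] {
    std::unordered_set<char> rv;
    std::transform(dictionary.begin(), dictionary.end(),
                   std::inserter(rv, rv.end()),
                   [](const auto& entry) { return entry.first[0]; });
    return rv;
}();
```

With these definitions, Punctuator::StartsPunctuator('{') is true and Punctuator::StartsPunctuator('a') is false. Both variables are initialized in definition order within this one source file; code that calls StartsPunctuator from the initializer of a static object in a different source file would not have that guarantee.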