c++ constexpr compile-time library-design

Raise compile-time error if a string has whitespace


I have a base class that is intended to be inherited by other users of the code I'm writing, and one of the abstract functions returns a name for the object. Due to the nature of the project that name cannot contain whitespace.

class MyBaseClass {

  public:

    // Return a name for this object. This should not include whitespace.
    virtual const char* Name() = 0;

};

Is there a way to check at compile-time if the result of the Name() function contains whitespace? I know compile-time operations are possible with constexpr functions but I'm not sure of the right way to signal to code users that their function returns a naughty string.

I'm also unclear on how to get a constexpr function to actually be executed by the compiler to perform such a check (if constexpr is even the way to go with this).


Solution

  • I think this is possible in C++20.

    Here is my attempt:

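    #include <algorithm>    // std::any_of (constexpr since C++20)
    #include <iterator>     // std::begin, std::end
    #include <stdexcept>    // std::logic_error
    #include <string_view>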
    
    constexpr bool is_whitespace(char c) {
        // Include your whitespaces here. The example contains the characters
        // documented by https://en.cppreference.com/w/cpp/string/wide/iswspace
        constexpr char matches[] = { ' ', '\n', '\r', '\f', '\v', '\t' };
        return std::any_of(std::begin(matches), std::end(matches), [c](char c0) { return c == c0; });
    }
    
    struct no_ws {
        consteval no_ws(const char* str) : data(str) {
            std::string_view sv(str);
            if (std::any_of(sv.begin(), sv.end(), is_whitespace)) {
                throw std::logic_error("string cannot contain whitespace");
            }
        }
        const char* data;
    };
    
    class MyBaseClass {
      public:
        // Return a name for this object. This should not include whitespace.
        constexpr const char* Name() { return internal_name().data; }
      private:
        constexpr virtual no_ws internal_name() = 0;
    };
    
    class Dog : public MyBaseClass {
        constexpr no_ws internal_name() override {
            return "Dog";
        }
    };
    
    class Cat : public MyBaseClass {
        constexpr no_ws internal_name() override {
            return "Cat";
        }
    };
    
    class BadCat : public MyBaseClass {
        constexpr no_ws internal_name() override {
            return "Bad cat";
        }
    };
    

    There are several ideas at play here:

    • Let's use the type system as documentation as well as a constraint. To that end, create a class (no_ws in the above example) that represents a string without whitespace.

    • For the type to enforce the constraints at compile-time, it must evaluate its constructor at compile time. So let's make the constructor consteval.

    • To ensure that derived classes don't break the contract, modify the virtual method to return no_ws.

    • If you want to keep the interface (i.e., returning const char*), make the virtual method private and call it from a public non-virtual method. This is the non-virtual interface (NVI) idiom.
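
To see the check on its own, here is a minimal sketch that works in C++17 as well. It uses a plain loop instead of std::any_of, which is only constexpr since C++20. The names is_ascii_space and has_whitespace are made up for this illustration and are not part of the answer's code. A static_assert is another context where the compiler has to evaluate a constexpr function at compile time:

```cpp
#include <string_view>

// Hypothetical helper: true for the same six characters that
// is_whitespace() checks in the example above.
constexpr bool is_ascii_space(char c) {
    return c == ' ' || c == '\n' || c == '\r' ||
           c == '\f' || c == '\v' || c == '\t';
}

// Hypothetical helper: true if any character of s is whitespace.
constexpr bool has_whitespace(std::string_view s) {
    for (char c : s) {
        if (is_ascii_space(c)) {
            return true;
        }
    }
    return false;
}

// static_assert requires a constant expression, so both calls
// are evaluated by the compiler.
static_assert(!has_whitespace("Dog"));
static_assert(has_whitespace("Bad cat"));
```

Unlike the consteval constructor, this only validates strings that someone remembers to put in a static_assert, which is why the answer wraps the check in a type.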

    Of course, this only checks a fixed set of whitespace characters and is locale-independent. I think it would be very tricky to handle locales at compile time, so a better approach (engineering-wise) might be to explicitly specify the set of ASCII characters allowed in names (a whitelist instead of a blacklist).
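
A whitelist version might look roughly like the sketch below. The allowed set (ASCII letters, digits and underscore), the rejection of empty names, and the names is_allowed_name_char and is_valid_name are all assumptions made for illustration; pick whatever your project actually permits. The letter range checks assume an ASCII-compatible execution character set.

```cpp
#include <string_view>

// Hypothetical policy: ASCII letters, digits and underscore only.
// Assumes an ASCII-compatible character set, so that 'a'..'z' and
// 'A'..'Z' are contiguous.
constexpr bool is_allowed_name_char(char c) {
    return (c >= 'a' && c <= 'z') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') ||
           c == '_';
}

// Hypothetical helper: true if every character of name is allowed.
// Rejecting empty names is an extra assumption, not a requirement
// stated in the question.
constexpr bool is_valid_name(std::string_view name) {
    if (name.empty()) {
        return false;
    }
    for (char c : name) {
        if (!is_allowed_name_char(c)) {
            return false;
        }
    }
    return true;
}

static_assert(is_valid_name("Dog"));
static_assert(!is_valid_name("Bad cat"));
```

The consteval constructor of no_ws could then throw when is_valid_name returns false, instead of searching for whitespace.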

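    The example above does not compile as written: "Bad cat" contains a space, so the consteval constructor reaches the throw, which is not allowed in a constant expression, and the compiler rejects the conversion in BadCat::internal_name(). Commenting out the BadCat class allows the code to compile.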
