c++ gcc optimization

Will gcc optimize away repeated function calls upon the same variable with same output for each call?


For one application, I'm in a situation where the same information exists in multiple forms: Base64 string, hex string, and char[].

For now, and for productivity's sake, instead of painstakingly declaring and initializing a converted variable once per function, I'm calling the conversion only at the obvious conversion points between the above forms. The reason is that there are points where the variable does not need to be transformed to another form, such as conditional comparisons.

From what I've read, compilers appear to be incredibly efficient and becoming more so by the day; however, when I try to read more in-depth analyses and descriptions, I often pass the limit of my experience, and my brain stack overflows.

If a function is repeatedly called upon a single variable to alter it into another form, say from a Base64 string to a hex string producing the same result each time, will the compiler optimize those calls away so that a variable declared for the entire scope is unnecessary?

In my case, I'm using -Ofast until there's something better.


Solution

  • Here's a bit of code to illustrate the concept, a class that caches each converted form the first time it is requested:

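    #include <string>
    
    // Conversion helpers, assumed to be defined elsewhere in the application.
    std::string to_hex(const std::string&);
    std::string to_base64(const std::string&);
    
    class TriString
    {
      public:
        typedef std::string::size_type size_type;
        static const size_type npos = std::string::npos;
    
        enum Format { Binary, Hex, Base64 };
    
        TriString(const std::string& s) : s_(s) { }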
    
        // mutators - must modify b_ and h_ accordingly or clear them
    
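        TriString& operator=(const std::string& rhs)
            { s_ = rhs; b_.clear(); h_.clear(); return *this; }
    
        TriString& erase(size_type index = 0, size_type count = npos)
        {
            s_.erase(index, count);
            b_.clear(); // base64 will need regeneration...
            // hex is exactly two characters per byte, so a cached copy can be trimmed in place
            if (!h_.empty())
                h_.erase(index * 2, count == npos ? npos : count * 2);
            return *this;
        }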
    
        char& operator[](size_type n)
        {
            h_.clear();
            b_.clear();
            return s_[n];
        }
    
        // ...add more as needed...
    
        // accessors
    
        const std::string& get(Format f) const
        {
            if (f == Binary || s_.empty())
                return s_;
            if (f == Hex)
            {
                if (h_.empty()) h_ = to_hex(s_);
                return h_;
            }
            // Format == Base64
            if (b_.empty()) b_ = to_base64(s_);
            return b_;
        }
    
        const char& operator[](size_type n) const { return s_[n]; }
    
        // ...add more as needed...
    
      private:
        std::string s_;          // normal string
    
        // "cached" conversions - invariant: valid if not empty(), or s_.empty() too
        // (mutable so get(Format) const can modify despite being const)
        mutable std::string b_;  // base64 encoded
        mutable std::string h_;  // hex encoded
    };
    

    It's not really safe to offer the usual std::string interface on top of this, as client code like the following won't work:

    TriString s("hello!");
    char& c = s[2];
    const std::string& h = s.get(TriString::Hex);  // triggers caching of hex conversion
    c = 'x';                                       // oops - modifies s_ without clearing/updating h_
    const std::string& h2 = s.get(TriString::Hex); // oops - gets old cached h_ despite changed s_
    

    You have to make a choice: either limit the interface so that it doesn't grant an ongoing ability to change the string (as non-const operator[], iterators, etc. do), return proxy objects (instead of, e.g., character references) that clear out the cached conversions when written through, or document some restrictions on client usage and hope for the best.
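
The proxy-object option might look roughly like the sketch below. It is only an illustration under some assumptions: the names CachedString and CharProxy are made up for this example, only the hex cache is shown, and the small to_hex here is a hypothetical stand-in for whatever conversion routine the application really uses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the application's real hex conversion.
static std::string to_hex(const std::string& s)
{
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (unsigned char c : s)
    {
        out += digits[c >> 4];
        out += digits[c & 0x0f];
    }
    return out;
}

class CachedString
{
  public:
    // Returned by non-const operator[] instead of char&.
    // Every write goes through operator=, which clears the cache.
    class CharProxy
    {
      public:
        CharProxy(CachedString& owner, std::size_t n) : owner_(owner), n_(n) { }

        CharProxy& operator=(char c)
        {
            owner_.s_[n_] = c;
            owner_.h_.clear();  // cached hex is now stale
            return *this;
        }

        // Reads behave like a plain char.
        operator char() const { return owner_.s_[n_]; }

      private:
        CachedString& owner_;
        std::size_t n_;
    };

    explicit CachedString(const std::string& s) : s_(s) { }

    CharProxy operator[](std::size_t n) { return CharProxy(*this, n); }
    char operator[](std::size_t n) const { return s_[n]; }

    const std::string& hex() const
    {
        if (h_.empty()) h_ = to_hex(s_);  // convert once, then reuse
        return h_;
    }

  private:
    std::string s_;
    mutable std::string h_;  // cached hex; empty means "not computed yet"
};
```

With this, a client can hold on to `auto c = s[2];`, call `s.hex()`, and later write `c = 'x';`; the write clears the cache, so the next `s.hex()` is regenerated from the modified string. The cost is that the proxy is not a real `char&`, so code that needs an actual reference or pointer to the character will not compile against it.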