Search code examples
c++erase-remove-idiom

Is there a way to use the erase-remove idiom in concert with other looping conditions?


I have a function that, given a block of text, should remove all punctuation characters, and make all letters lowercase, and eventually, should shift them according to a mono alphabetic cipher. The code below works:

class Cipher { 
  public:

  string keyword; 

  string decipheredText;

  deque<string> encipheredAlphabet;

    static bool is_punctuation (char c) {
      return c ==  '.' || c == ',' || c == '!' || c == '\''|| c == '?' || c 
      == ' ';
    }

  string encipher(string text) { 
    Alphabet a;
    encipheredAlphabet = a.cipherLetters(keyword);


    text.erase( remove_if(text.begin(), text.end(), is_punctuation), 
    text.end() );

    string::iterator it;
    for (it = text.begin(); it != text.end(); it++) { 
      *it = tolower(*it); 
      // encipher text according to shift
    }

    return text;
  }

};

The problem is, it currently makes two passes over the string, one to remove the punctuation, and one to do all the other stuff. This seems inefficient, since it seems like all the transformations could be accomplished in one pass through the string somehow. Is there a clean way to incorporate the erase-remove idiom with other loop conditions?


Solution

  • If you don't want to do two loops because you've measured and found that it's slower, write a custom algorithm:

    template <typename Iter, typename OutIter>
    OutIter lowercased_without_punctuation(Iter begin, Iter end, OutIter out) {
        while (begin != end) {
            // Ignoring things like std::move_iterator for brevity.
            if (!is_punctuation(*begin)) {
                *out = tolower(*begin);
                ++out;
            }
    
            // Use `++iter` rather than `iter++` when possible
            ++begin;
        }
    
        return out;
    }
    
    // ...
    
    string encipher(string text) {
        Alphabet a;
        encipheredAlphabet = a.cipherLetters(keyword);
    
        text.erase(
            lowercased_without_punctuation(text.begin(), text.end(), text.begin()),
            text.end());
    
        return text;
    }
    

    If you think about it some more, lowercased_without_punctuation is actually a special-case of a more general algorithm which might be called transform_if (relevant Q&A):

    template <typename Iter, typename OutIter, typename Pred, typename Transf>
    OutIter transform_if(Iter begin, Iter end, OutIter out, Pred p, Transf t) {
        while (begin != end) {
            if (p(*begin)) {
                *out = t(*begin);
                ++out;
            }
    
            ++begin;
        }
    
        return out;
    }
    
    // ...
    
    string encipher(string text) {
        Alphabet a;
        encipheredAlphabet = a.cipherLetters(keyword);
    
        text.erase(
            transform_if(text.begin(), text.end(), text.begin(),
                [](char c) { return !is_punctuation(c); },
                [](char c) { return tolower(c); }),
            text.end());
    
        return text;
    }