Search code examples
c++stringalgorithmdynamic-programmingpalindrome

Find number of palindromes that are anagrams in C++


I am taking part in an online programming contest for fun on http://www.hackerrank.com. One of the problems that I am working on is to find number of anagrams of a string that are palindrome as well. I have solve the problem but when I try to upload it it fails on a number of test cases. As the actual test cases are hidden from me I do not know for sure why its failing but I assume it could be a scalability issue. Now I do not want someone to give me a ready made solution because that would not be fair but I am just curious on what people think about my approach.

Here is the problem description from the website:

Now that the king knows how to find out whether a given word has an anagram which is a palindrome or not, he is faced with another challenge. He realizes that there can be more than one anagram which are palindromes for a given word. Can you help him find out how many anagrams are possible for a given word which are palindromes?

The king has many words. For each given word, the king needs to find out the number of anagrams of the string which are palindromes. As the number of anagrams can be large, the king needs number of anagrams % (109+ 7).

Input format :
A single line which will contain the input string

Output format :
A single line containing the number of anagram strings which are palindrome % (109 + 7).

Constraints :

  • 1<=length of string <= 105
  • Each character of the string is a lowercase alphabet.
  • Each testcase has atleast 1 anagram which is a palindrome.

Sample Input 01 :

aaabbbb

Sample Output 01 :

3 

Explanation :
Three permutation of given string which are palindrome can be given as abbabba , bbaaabb and bababab.

Sample Input 02 :

cdcdcdcdeeeef

Sample Output 02 :

90

As specified in the problem description input strings can be as large as 10^5, hence all palindromes for a large string is not possible since I will run into number saturation issues. Also since I only need to give a modulo (10^9 + 7) based answer, I thought of taking log of numbers with base (10^9 + 7) and after all computations take antilog of fraction part with base (10^9 + 7) of the answer since that will be the modulo anyway.

My algorithm is as follows:

  1. Store freq of each char (only need to look half of the string since second half should be same as first one by def of palindrome)
  2. If there are more than one char appearing with odd number of time no palindrome possible
  3. Else for each char's freq incrementally calculate number of palindromes (Dynamic Programming)

For the DP following is the subproblem: previous_count = 1

For each additional character added number of palindromes = previous_count * (number_of_char_already_seen + 1)/(number of char same as current char)

Here is my code:

#include <iostream>
#include <string>
#include <vector>
#include <map>
#include <cmath>
#include <cstdio>
#include <algorithm>
#include <fstream>
using namespace std;
#define MAX_SIZE 100001

void factorial2 (unsigned int num, unsigned int *fact) {
    fact[num]++;
    return;
}

double mylog(double x) {
    double normalizer = 1000000007.0; 
    return log10(x)/log10(normalizer);
}

int main() {
    string in;
    cin >> in;
    if (in.size() == 1) { 
        cout << 1 << endl;
        return 0;
    }
    map<char, int> freq;
    for(int i=0; i<in.size(); ++i) {
        if (freq.find(in.at(i)) == freq.end()) {
            freq[in.at(i)] = 1;
        } else {
            freq[in.at(i)]++;
        }
    }
    map<char, int> ::iterator itr = freq.begin();
    unsigned long long int count = 1;
    bool first = true;
    unsigned long long int normalizer = 1000000007;
    unsigned long long int size = 0;
    unsigned int fact[MAX_SIZE] = {0};
    vector<unsigned int> numerator;
    while (itr != freq.end()) {
        if (first == true) {
            first = false;
        } else {
            for (size_t i=1; i<=itr->second/2; ++i) {
                factorial2(i, fact);
                numerator.push_back(size+i);
            }
        }
        size += itr->second/2;
        ++itr;
    }
    //This loop is to cancel out common factors in numerator and denominator
    for (int i=MAX_SIZE-1; i>1; --i) {
        while (fact[i] != 0) {
            bool not_div = true;
            vector<unsigned int> newNumerator;
            for (size_t j=0; j<numerator.size(); ++j) {
                if (fact[i] && numerator[j]%i == 0) {
                    if (numerator[j]/i > 1)
                        newNumerator.push_back(numerator[j]/i);
                    fact[i]--;  //Do as many cancellations as possible
                    not_div = false;
                } else {
                    newNumerator.push_back(numerator[j]);
                }
            }
            numerator = newNumerator;
            if (not_div) {
                break;
            }
        }
    }
    double countD = 0.0;
    for (size_t i=0; i<numerator.size(); ++i) {
        countD += mylog(double(numerator[i]));
    }
    for (size_t i=2; i <MAX_SIZE; ++i) {
        if (fact[i]) {
            countD -= mylog((pow(double(i), double(fact[i]))));
            fact[i] = 0;
        }
    }
    //Get the fraction part of countD
    countD = countD - floor(countD);
    countD = pow(double(normalizer), countD);
    if (floor(countD + 0.5) > floor(countD)) {
        countD = ceil(countD);
    } else {
        countD = floor(countD);
    }
    count = countD;
    cout << count;
    return 0;
}

Now I have spent a lot of time on this problem and I just wonder if there is something wrong in my approach or am I missing something here. Any ideas?


Solution

  • The basic formula is:

    p!/(a!*b!...*z!)
    

    where p is floor of length/2 of the word and a,b,c..,z denotes the floor of half of frequency of occurrences a,b,c..,z in the word receptively.

    The only problem now you are left is how to calculate this. For that, check this out.