I'm working on a simple substitution-cipher decoder. I'm using frequency analysis to decrypt the ciphertext. Just looking at the frequency of unique letter isn't enough. I need to look at the occurrences of 2-letter sequences (maybe 3-letter sequences).
My code for counting the occurrences of each letter is below
int counterRaw[256][2] = {0};
for(int i = 0; i <inputString.length(); i++)
counterRaw[inputString[i]][1]++;
int counterLetter[26][2] = {0};
for(int i = 0 ; i<26 ; i++){
counterLetter[i][0] = 'A'+i;
counterLetter[i][1] = counterRaw['A'+i][1];
As you can see very simple yet effective !
But I don't know how to achieve a 2-letter sequence counter, do you have any idea which could help me to code this ?
Thanks !
EDIT : As an example Given AZAZ RTYU JKLM I want my program to output :
AZ : 2
ZA : 1
ZR : 1
RT : 1
...
Something like the following would do the trick, though you'd have to do some jiggery pokery to make it suit your own needs.
#include <iostream>
#include <map>
#include <string>
int main ()
{
std::string message("some string that you will probably get from some encrypted file");
std::map<std::string,int> occurences;
std::string seq(" ");
for(int i = 1; i < message.length() - 1; i++)
{
seq[0] = message[i-1];
seq[1] = message[i];
//ignore spaces
if (seq.compare(0,1, " ") && seq.compare(1,1, " "))
{
occurences[seq]++;
}
}
//let's have a look ...
for(auto iter = occurences.begin(); iter != occurences.end(); ++iter)
{
std::cout << iter->first << " " << iter->second << "\n";
}
return 0;
}
output:
ab 1
at 1
ba 1
bl 1
cr 1
ed 1
en 1
et 1
fi 1
fr 1
ge 1
ha 1
il 2
in 1
ll 1
ly 1
me 2
nc 1
ng 1
ob 1
om 3
ou 1
pr 1
pt 1
ri 1
ro 2
ry 1
so 2
st 1
te 1
th 1
tr 1
wi 1
yo 1
yp 1