Search code examples
c++parallel-processingopenmp

Prolem with parallel openmp multi loop in C++


I wrote a program to find key with 8 loops. It takes me 2 hours to found key. Now i want to use openmp to reduce the time but i can't get it to work, it found nothing. Please help me i'm new to openmp, thank you.

int main()
{
    cout << endl;
    clock_t start, end;
    cout << "\nStart\n";
    start = clock();
    int i1, i2, i3, i4, i5, i6, i7, i8;
#pragma omp parallel num_threads(12)
    {
#pragma omp for collapse(8)
        for (i1 = 'a'; i1 <= 'z'; i1++)
            for (i2 = 'a'; i2 <= 'z'; i2++)
                for (i3 = 'a'; i3 <= 'z'; i3++)
                    for (i4 = 'a'; i4 <= 'z'; i4++)
                        for (i5 = 'a'; i5 <= 'z'; i5++)
                            for (i6 = 'a'; i6 <= 'z'; i6++)
                                for (i7 = 'a'; i7 <= 'z'; i7++)
                                    for (i8 = 'a'; i8 <= 'z'; i8++)
                                    {
                                        unsigned short ot_2[6] = { 0x66c0, 0xc5cc, 0x0bcd ,0x7201, 0x0923,0xfc29 };
                                        unsigned short wt_2[6] = { 0xdb93, 0xe790, 0x1dc0 ,0xbf79, 0x1fdf,0x5bdb };
                                        unsigned short Key[4];
                                        Key[0] = ((i1 << 8) ^ i2);
                                        Key[1] = ((i3 << 8) ^ i4);
                                        Key[2] = ((i5 << 8) ^ i6);
                                        Key[3] = ((i7 << 8) ^ i8);
                                        if (encrypt(ot_2[0], Key) == wt_2[0])
                                            if (encrypt(ot_2[1], Key) == wt_2[1])
                                                if (encrypt(ot_2[2], Key) == wt_2[2])
                                                    if (encrypt(ot_2[3], Key) == wt_2[3])
                                                        if (encrypt(ot_2[4], Key) == wt_2[4])
                                                            if (encrypt(ot_2[5], Key) == wt_2[5])
                                                                printf("%c%c%c%c%c%c%c%c", i1, i2, i3, i4, i5, i6, i7, i8);
                                    }
    }
    end = clock();
    double time = (end - start) / CLOCKS_PER_SEC;
    int hour = time / 60 / 60, minutes = (time - hour * 60 * 60) / 60, second = time - hour * 60 * 60 - minutes * 60;
    cout << "Time: " << hour << ":" << minutes << ":" << second << endl;
}

I'm using g++ on windows by the way.

Update: here's the encrypt funtion

unsigned short encrypt(unsigned short x, unsigned short Key[])
{
    for (int i = 0; i < 2; i++)
    {
        x ^= Key[i];
        x = (Pi[x & 0xf] ^ (Pi[(x >> 4) & 0xf] << 4) ^ (Pi[x >> 8 & 0xf] << 8) ^ (Pi[x >> 12 & 0xf] << 12));
        x = functionP(x);
    }
    x ^= Key[2];
    x = (Pi[x & 0xf] ^ (Pi[(x >> 4) & 0xf] << 4) ^ (Pi[x >> 8 & 0xf] << 8) ^ (Pi[x >> 12 & 0xf] << 12));
    x ^= Key[3];
    return x;
}

Here's functionP

#include<iostream>
#include<fstream>
#include<time.h>
#include<omp.h>
#include<vector>
#include <string>
using namespace std;
unsigned char Pi[16] = { 0xa,5,6,4,0xb,3,0xd,0x9,0xc,0xf,0xe,0,7,1,0x8,2 };
unsigned char P[16] = { 0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15 };
unsigned short functionP(unsigned short x)
{
    unsigned short temp = x;
    x = 0;
    for (int i = 0; i < 16; i++)
        x ^= (((temp >> (15 - P[i])) & 0x1) << (15 - i));
    return x;
}

Solution

  • Edit: As the OP is new is openMP, I have added more details to my answer.

    The problem is with your code is that the collapsed loops has altogether 25^8=152,587,890,625 iterations, which is bigger than the maximum of (32 bit) int (2,147,483,647). The compiler (g++ 10.2) cannot handle this situation and produce incorrect code. There are 2 solutions to handle it:

    1. use long long instead of int. In this case collapsing all the loops is not a problem for the compiler.

    2. Remove collapse(8) or change it to collapse(n) n=2..6. If you remove collapse(8) the above mentioned problem is solved, but in this case your loop variables (i2,i3,..i8) become shared, as they are not loop variables of the parallel loop construct anymore and defined outside the parallel region. It causes data race as they used by different threads and gives incorrect result. Note that the loop variables (of the parallel loop construct) are privatized by the compiler so i1 is private by default, and if you use collapse(8) all the loop variables are privatized. To make your loop variables private you can explicitly set it by the private clause:

      #pragma omp parallel num_threads(12) private(i2,i3,i4,i5,i6,i7,i8) Since you are new to openMP, it is a good practice to use default(none) clause, so you are forced to define the sharing attribute of all of your variables. If you forget to do so, you will receive a compiler error. #pragma omp parallel num_threads(12) default(none) private(i2,i3,i4,i5,i6,i7,i8)

    Another comment is you should always define your variables in their minimum required scope. Since the loop variables are not used after the loop it is much better to define them in the init-statement of for loop:

    for (int i1 = 'a'; i1 <= 'z'; i1++)  
    

    In this case variables i1,i2..i8 defined inside the parallel region, so they are private by default, so you do not have not to worry about them. Note also that the 2 openmp directives can be merged to a so-called combined directive, so putting it together your code will look like this:

    #pragma omp parallel for default(none) num_threads(12) collapse(6)
            for (int i1 = 'a'; i1 <= 'z'; i1++)
                for (int i2 = 'a'; i2 <= 'z'; i2++)
                    for (int i3 = 'a'; i3 <= 'z'; i3++)
                        for (int i4 = 'a'; i4 <= 'z'; i4++)
                            for (int i5 = 'a'; i5 <= 'z'; i5++)
                                for (int i6 = 'a'; i6 <= 'z'; i6++)
                                    for (int i7 = 'a'; i7 <= 'z'; i7++)
                                        for (int i8 = 'a'; i8 <= 'z'; i8++)
                                        { .....