Search code examples
c#.netcudagpgpualeagpu

Trying to find large prime numbers with Alea GPU


An exception occurs when I try to find the 100,000th prime number using Alea GPU. The algorithm works fine if I try to find a smaller prime number e.g. the 10,000th prime number.

I am using Alea v3.0.4, NVIDIA GTX 970, Cuda 9.2 drivers.

I am new to GPU programming. Any help would be greatly appreciated.

long[] primeNumber = new long[1]; // nth prime number to find
int n = 100000; // find the 100,000th prime number
var worker = Gpu.Default; // GTX 970 CUDA v9.2 drivers
long count = 0;

worker.LongFor(count, n, x =>
{                
    long a = 2;
    while (count < n)
    {
        long b = 2;
        long prime = 1;
        while (b * b <= a)
        {
            if (a % b == 0)
            {
                prime = 0;
                break;
            }
            b++;
        }
        if (prime > 0)
        {
            count++;
        }
        a++;
    }

    primeNumber[0] = (a - 1);
}
);

Here are the exception details:

System.Exception occurred HResult=0x80131500 Message=[CUDAError] CUDA_ERROR_LAUNCH_FAILED Source=Alea StackTrace: at Alea.CUDAInterop.cuSafeCall@2939.Invoke(String message) at Alea.CUDAInterop.cuSafeCall(cudaError_enum result) at A.cf5aded17df9f7cc4c132234dda010fa7.Copy@918-22.Invoke(Unit _arg9)
at Alea.Memory.Copy(FSharpOption1 streamOpt, Memory src, IntPtr srcOffset, Memory dst, IntPtr dstOffset, FSharpOption1 lengthOpt)
at Alea.ImplicitMemoryTrackerEntry.cdd2cd00c052408bcdbf03958f14266ca(FSharpFunc2 c600c458623dca7db199a0e417603dff4, Object cd5116337150ebaa6de788dacd82516fa) at Alea.ImplicitMemoryTrackerEntry.c6a75c171c9cccafb084beba315394985(FSharpFunc2 c600c458623dca7db199a0e417603dff4, Object cd5116337150ebaa6de788dacd82516fa) at Alea.ImplicitMemoryTracker.HostReadWriteBarrier(Object instance) at Alea.GlobalImplicitMemoryTracker.HostReadWriteBarrier(Object instance) at A.cf5aded17df9f7cc4c132234dda010fa7.clo@2359-624.Invoke(Object arg00) at Microsoft.FSharp.Collections.SeqModule.Iterate[T](FSharpFunc2 action, IEnumerable1 source) at Alea.Kernel.LaunchRaw(LaunchParam lp, FSharpOption1 instanceOpt, FSharpList1 args) at Alea.Parallel.Device.DeviceFor.For(Gpu gpu, Int64 fromInclusive, Int64 toExclusive, Action1 op) at Alea.Parallel.GpuExtension.LongFor(Gpu gpu, Int64 fromInclusive, Int64 toExclusive, Action1 op) at TestingGPU.Program.Execute(Int32 t) in C:\Users..\source\repos\TestingGPU\TestingGPU\Program.cs:line 148
at TestingGPU.Program.Main(String[] args)

Working Solution:

static void Main(string[] args)
    {
        var devices = Device.Devices;

        foreach (var device in devices)
        {
            Console.WriteLine(device.ToString());                                
        }

        while (true)
        {
            Console.WriteLine("Enter a number to check if it is a prime number:");

            string line = Console.ReadLine();
            long checkIfPrime = Convert.ToInt64(line);

            Stopwatch sw = new Stopwatch();
            sw.Start();
            bool GPUisPrime = GPUIsItPrime(checkIfPrime+1);
            sw.Stop();

            Stopwatch sw2 = new Stopwatch();
            sw2.Start();
            bool CPUisPrime = CPUIsItPrime(checkIfPrime+1);
            sw2.Stop();

            Console.WriteLine($"GPU: is {checkIfPrime} prime? {GPUisPrime} Time Elapsed: {sw.ElapsedMilliseconds.ToString()}");
            Console.WriteLine($"CPU: is {checkIfPrime} prime? {CPUisPrime} Time Elapsed: {sw2.ElapsedMilliseconds.ToString()}");
        }            
    }        

    [GpuManaged]
    private static bool GPUIsItPrime(long n)
    {
        //Sieve of Eratosthenes Algorithm
        bool[] isComposite = new bool[n];
        var worker = Gpu.Default; 
        worker.LongFor(2, n, i =>
        {
            if (!(isComposite[i]))
            {
                for (long j = 2; (j * i) < isComposite.Length; j++)
                {
                    isComposite[j * i] = true;
                }
            }
        });
        return !isComposite[n-1];
    }

    private static bool CPUIsItPrime(long n)
    {
        //Sieve of Eratosthenes Algorithm
        bool[] isComposite = new bool[n];

        for (int i = 2; i < n; i++)
        {
            if (!isComposite[i])
            {
                for (long j = 2; (j * i) < n; j++)
                {
                    isComposite[j * i] = true;
                }
            }
        }

        return !isComposite[n-1];
    }

Solution

  • Your code doesn't look right. Given a parallel for-loop method here (LongFor), Alea will spawn "n" threads, with an index "x" used to identify what the thread number is. So, for example a simple example like For(0, n, x => a[x] = x); uses "x" to initialize a[] with { 0, 1, 2, ...., n - 1}. But, your kernel code does not use "x" anywhere in the code. Consequently, you run the same code "n" times with absolutely no difference. Why then run on a GPU? What I think you want is to do is to compute in thread "x" whether "x" is prime. With result in hand, set bool prime[x] = true or false. Then, afterwards, in the kernel after all that, add a sync call, followed with a test using a single thread (e.g., x == 0) to go through prime[] and pick the largest prime from the array. Otherwise, there's a lot of collisions for 'primeNumber[0] = (a - 1);' by n-threads on the GPU. I can't imagine how you would ever get the right result. Finally, you probably want to make sure using some Alea call that prime[] is never copied to/from the GPU. But, I don't know how you do that in Alea. The compiler might be smart enough to know that prime[] is only used in the kernel code.