Pages

Monday, September 17, 2012

Cudafy Me: Part 2 of 4

These posts are meant to inspire you to enter into the world of graphics processor programming.

CUDALibs

Posts in This Series

Full Source Code

http://cudafytsp.codeplex.com/

Recap

In the last post, we simply put together a single-threaded CPU solution to the traveling salesman problem. Here was the code that accomplished our work:

private Answer CpuTsp()
{
    var bestDistance = float.MaxValue;
    var bestPermutation = -1;
    var stopWatch = Stopwatch.StartNew();

    for (var permutation = 0; permutation < _permutations; permutation++)
    {
        var path = new int[1, Cities];
        var distance = FindPathDistance(_permutations, permutation, Cities, _latitudes, _longitudes, path, 0);
        if (distance < bestDistance)
        {
            bestDistance = distance;
            bestPermutation = permutation;
        }
    }

    return new Answer { Distance = bestDistance, Permutation = bestPermutation, Milliseconds = stopWatch.ElapsedMilliseconds };
}

Multi-Threaded CPU Solution

Now we will modify the previous solution to take advantage of a multi-core CPU.

private Answer MpuTsp()
{
    var bestDistance = float.MaxValue;
    var bestPermutation = -1;
    var locker = new Object();
    var stopWatch = Stopwatch.StartNew();

    Parallel.For(0, _permutations, permutation =>
    {
        var path = new int[1, Cities];
        var distance = FindPathDistance(_permutations, permutation, Cities, _latitudes, _longitudes, path, 0);
        lock (locker)
        {
            if (distance < bestDistance)
            {
                bestDistance = distance;
                bestPermutation = permutation;
            }
        }
    });

    return new Answer { Distance = bestDistance, Permutation = bestPermutation, Milliseconds = stopWatch.ElapsedMilliseconds };
}

The only difference in this solution is that we use a Parallel.For loop instead of a tradition for loop. Because the interior of the loop runs in parallel (concurrently), we need to use a bit of thread locking to ensure we find the shortest distance. The function FindPathDistance is identical to the single-threaded solution.

We could optimize the code above and use a more efficient locking scheme, but that is not the point of this series of posts.

Next Step

What we have left to do is to write a method that takes advantage of the code we have already written to run it on the massively parallel GPGPU.

>>> Part 3 – Massively Parallel GPGPU Solution

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.