While solving Project Euler problem 205 in F#, I was thrilled to find an opportunity to try GPGPU. The proper way to solve this problem is with a statistical method, that is, computing the probabilities exactly, but I wanted to simulate the process and see what I would get. Again, I do not intend to solve the problem this way; the goal is just to learn GPGPU.
Simulating the process is simple. I build a 9 x 1000 matrix A of random numbers from 1 to 4 and, similarly, a 6 x 1000 matrix B with elements ranging from 1 to 6 (px and py in the code). Each column is one round: summing each column and comparing the two sums shows who won that round. Accelerator v2 from Microsoft Research provides a good managed platform for GPGPU. The following is the code:
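// Not shown in the original listing and assumed here:
//   type PA  = Microsoft.ParallelArrays.ParallelArrays
//   type FPA = Microsoft.ParallelArrays.FloatParallelArray
//   random (assumed to be a System.Random), gridSize, tempResult, totalTest,
//   file and appendToFile are defined elsewhere in the script.
let dxTarget = new Microsoft.ParallelArrays.DX9Target()

// Runs one batch of gridSize rounds on the GPU and returns the number of wins.
let getWin index (gridSize:int) =
    let shape = [| gridSize; gridSize; |]
    let resultShape = [|gridSize|]
    let zero, one = new FPA(0.0f, resultShape), new FPA(1.0f, resultShape)
    // Random.Next(min, max) excludes max, so use 5 and 7 to get faces 1..4 and 1..6.
    let generatePrymaid i j = float32(random.Next(1,5))
    let generateDice i j = float32(random.Next(1,7))
    let px = new FPA(Array2D.init 9 gridSize generatePrymaid)
    let py = new FPA(Array2D.init 6 gridSize generateDice)
    // Total of each column: one total per round for each player.
    let pxSum = PA.Sum(px, 0)
    let pySum = PA.Sum(py, 0)
    // 1.0 where the pyramid dice beat the cubic dice, 0.0 otherwise; then count the wins.
    let cond = FPA.op_GreaterThan(pxSum, pySum)
    let gpu = PA.Sum(PA.Cond(cond, one, zero), 0)
    let a = dxTarget.ToArray1D(gpu)
    let result = Seq.head a
    tempResult <- tempResult + int64(result)
    // Report and log the running estimate every 500 batches.
    if (index%500=0) then
        let total = int64(index) * int64(gridSize) + totalTest
        printfn "Time = %A; Result = %A" index (float(tempResult) / float(total))
        appendToFile file (System.String.Format("{0}\t{1}", tempResult, total))
    result

let compute n = Seq.init n (fun i->getWin i gridSize) |> Seq.sum
let n = 250000
let start = System.DateTime.Now
printfn "%A" ((compute n) / float32(gridSize*n))
printfn "%A" (System.DateTime.Now - start)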
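The CPU version used for the timing comparison is not shown. As a rough reference only, the same experiment can be sketched in plain F# without Accelerator. The names rng, rollSum and estimateWinRate are hypothetical, the seed and trial count are arbitrary, and this is not necessarily how the original CPU version was written.

```fsharp
open System

// Hypothetical CPU-only reference; none of these names come from the original script.
let rng = Random(42)   // fixed seed (arbitrary) so runs are repeatable

// Total of `count` dice, each with faces 1..sides.
// Random.Next(min, max) excludes max, hence sides + 1.
let rollSum (count: int) (sides: int) =
    let mutable total = 0
    for _ in 1 .. count do
        total <- total + rng.Next(1, sides + 1)
    total

// Fraction of `trials` rounds in which nine 4-sided dice beat six 6-sided dice.
let estimateWinRate (trials: int) =
    let mutable wins = 0
    for _ in 1 .. trials do
        if rollSum 9 4 > rollSum 6 6 then wins <- wins + 1
    float wins / float trials

printfn "Estimated win rate: %f" (estimateWinRate 1000000)
```

This loop is sequential and single-threaded, so it only gives a baseline for comparing against the GPU timings.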
The timing results are interesting:
- On my powerful desktop, the GPU version is slower than the CPU version.
- On my laptop, the GPU version is twice as fast as the CPU version, and the CPU fan stays quiet during the computation.
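To check the simulated estimate, the exact probability can be computed by counting outcomes, which is the statistical approach mentioned at the start. The following is a minimal sketch; sumCounts and the other names are illustrative and not part of the original script.

```fsharp
// Distribution of the total of `count` dice with faces 1..sides:
// result.[s] is the number of ordered outcomes whose total is s.
let sumCounts (count: int) (sides: int) : int64[] =
    let mutable dist = [| 1L |]   // zero dice: total 0 in exactly one way
    for _ in 1 .. count do
        let next = Array.zeroCreate<int64> (dist.Length + sides)
        for s in 0 .. dist.Length - 1 do
            for face in 1 .. sides do
                next.[s + face] <- next.[s + face] + dist.[s]
        dist <- next
    dist

let peter = sumCounts 9 4   // nine 4-sided dice, totals 9..36
let colin = sumCounts 6 6   // six 6-sided dice, totals 6..36

// Number of outcome pairs in which the first total is strictly greater.
let winningPairs =
    seq {
        for p in 0 .. peter.Length - 1 do
            for c in 0 .. min (p - 1) (colin.Length - 1) do
                yield peter.[p] * colin.[c]
    }
    |> Seq.sum

let totalPairs = Array.sum peter * Array.sum colin   // 4^9 * 6^6
let exactWinProbability = float winningPairs / float totalPairs
printfn "Exact win probability: %.7f" exactWinProbability
```

The counts are kept as int64 because the number of outcome pairs exceeds the int32 range. The simulated ratio should approach this value as the number of rounds grows.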