Wednesday, September 22, 2010

C# Performance II

I like LINQ and functional programming very much. How about its performance?

this post is to check performance to check if a collection or list contain any element:

        public static IEnumerable l = Enumerable.Range(0, 10);

        public static void A()
            var b = new List(l).Count == 0;

        public static void B()
            var b = l.Count();

My first impression is the new + .Count could be slower than, but actually it is very fast.

  • A uses 7 ticks
  • B uses 1921 ticks
But thing got to change after increase the size of this list from 10 to 1000000
  • A uses 55858 ticks
  • B uses 26076 ticks
Let us not use Count, we use Any.

  • A uses 55012 ticks
  • B uses 3142 ticks
The conclusion will be:

for a small size collection, convert it to a List and check Count is faster. Otherwise, use Any() will be the best choice.

Monday, September 20, 2010

C# Performance

When C comes out, there must be some chaos about its performance is not as good as assembly language. I will start some posts to discuss the performance of C#, trying to help improve the performance of .NET language.

  1. var str = "";
  2. var str0 = String.Empty;
we can easily find out that str = str0; but which way is faster? 

the following is the disassembly from the first statement:
    Stloc_0 local_0

the following is the one from the second statement:
    Ldsfld System.String.Empty
    Stloc_1 local_1

It does not tell much, but when we actually run it using StopWatch, we can find the String.Empty is 5-6 times faster than empty string. I would not say the different is so huge because maybe it is my computer's configuration, but definitely the second statement is faster.

WP7 Lost Connection error

I am disappointed by the Windows Phone 7 emulator. Why? It just pops out "lost connection" error and I have to spend a day to figure out what is wrong.  I have a class which implements the INotifyCollectionChanged interface. In the item I was trying to hold a reference to its parent which is a INotifyCollectionChanged structure. Do not ask me why, this make the emulator crash. How to fix it? change its type to object. I guess some boxing is happening behind.

Hopefully the phone is as good as media said before.

Friday, September 17, 2010

Working on a .NET emulation enigne

How hard to do a software evaluation engine for .NET runtime. I spent 3 weeks. The CCI helps me a lot, although currently it does not read .NET opcode.

Monday, September 6, 2010

Euler Project Problem 81 with F#

module Problem81

let readfile fileName =
    let lines = System.IO.File.ReadAllLines(fileName)
    let data =
        let splitter = [|','|]
        |> (fun line -> line.Split(splitter) |> (fun element->float(element)))

let m = Matrix.ofSeq (readfile @".\matrix.txt")

let cols = m.NumCols - 1
let rows = m.NumRows - 1

let tempMatrix = Matrix.create (cols+1) (rows+1) -1.

let rec f x y =
    let v = tempMatrix.[x,y]
    if (v <> -1.) then
        let getResult =
            let d = m.[x,y]
            match x,y with
                | (a,b) when a=cols && b=rows -> d
                | (a,_) when a=cols -> d + f x (y+1)
                | (_,b) when b=rows -> d + f (x+1) y
                | _ ->
                    let right = f (x+1) y
                    let down = f x (y+1)
                    d + (min right down)
        tempMatrix.[x,y] <- float(getResult)

let solve = f 0 0

printfn "%A" solve

Sunday, September 5, 2010

Simulation on Euler Project 205

When I was trying to solve Euler Project 205 using F#, I was thrilled to find an opportunity to use GPGPU to solve this problem. The correct solution for this problem is to use statistic method, but I would like to simulate the process and see what I can get. Again, I do not intend to solve this problem by this approach, just to learn GPGPU.

Simulating this process is simple, I make a 9*1000 size matrix A with random number from 1 to 4. Similarly, matrix B is 6 * 1000 with element ranged from 1 to 6. The Accelerator v2 from Microsoft Research provides a good managed platform for GPGPU. The following is the code:

let dxTarget = new Microsoft.ParallelArrays.DX9Target()

let getWin index (gridSize:int) =
    let shape = [| gridSize; gridSize; |]
    let resultShape = [|gridSize|]
    let zero, one = new FPA(0.0f, resultShape), new FPA(1.0f, resultShape)

    let generatePrymaid i j = float32(random.Next(1,4))
    let generateDice i j = float32(random.Next(1,6))

    let px = new FPA(Array2D.init 9 gridSize generatePrymaid)
    let py = new FPA(Array2D.init 6 gridSize generateDice)
    let pxSum = PA.Sum(px, 0)
    let pySum = PA.Sum(py, 0)
    let cond = FPA.op_GreaterThan(pxSum, pySum)
    let gpu = PA.Sum(PA.Cond(cond, one, zero), 0)  
    let a = dxTarget.ToArray1D(gpu)
    let result = Seq.head a
    tempResult <- tempResult + int64(result)
    if (index%500=0) then
        let total = int64(index) * int64(gridSize) + totalTest
        printfn "Time = %A; Result = %A" index (float(tempResult) / float(total))
        appendToFile file (System.String.Format("{0}\t{1}", tempResult, total))

let compute n = Seq.init n (fun i->getWin i gridSize) |> Seq.sum

let n = 250000;
let mutable start = System.DateTime.Now;
printfn "%A" ((compute n) / float32(gridSize*n))
printfn "%A" (System.DateTime.Now - start);

The result is interesting:

  1. on my powerful desktop: the GPU version is slower than CPU version.
  2. on my laptop, the GPU version is 2 time faster than CPU version. The CPU fan is quiet when I do the computation.

Thursday, September 2, 2010

QuickSort in F# using Continuation Passing Style (CPS)

The recursive implementation is a good approach for QuickSort algorithm; however, this implementation can lead to "stackOverflow" exception. We will use tail recursive to get rid of this flaw.

let rec qs list cont =
    match list with
        | [] -> cont([])
        | [a] -> cont([a])
        | head::tail ->
            let lessList = tail |> List.filter (fun i-> i<=head)
            let moreList = tail |> List.filter (fun i-> i>head)
            qs lessList (fun lessListPara -> 
                qs moreList (fun moreListPara -> cont(lessListPara@[head]@moreListPara)))

let list = [7;2;6;8;4]
let result = qs list (fun a -> a)

Wednesday, September 1, 2010

Recursive call into Tree structure

let rec f2 tree cont =
    match tree with
        | EmptyTree -> cont([])
        | Tree(l, n, r) ->
            f2 l (fun lSum ->
                    f2 r (fun rSum->cont(lSum@[n]@rSum)))

Use continuation to implement the tree recursive function. This way is a better approach because it won't generate "stack over flow" exception.

Remember when you debug this in Visual Studio, remember to check the "Generate tail calls" at "build" tab when you open the project property.