Friday, April 29, 2011

Find the maximum difference in a sequence

Let us put another interview question solution with F# (functional way). The question is to find the x[a], x[b], where x[b] - x[a] -> maximum

we are going to find a tuple (a,b) where x[b] - x[a] -> maximum. Be careful b must be greater than a. So our first problem will be how to generate a sequence (a,b) where b>a and a between (0, L).

f(i) will generate a sequence from i to L.
f(i) -> [i..L] |> Seq.map (n-> (i,n))

[0..L]
|> Seq.map (i-> Seq.map (n->(i,n)) [i..L] ) //generate the (a,b) tuple
|> Seq.collect (n->n) //flatten the sequence
|> Seq.maxby ((a,b)->x[b]-x[a]) //find the max dif

Friday, April 15, 2011

F# and genetic algorithms

I have already three posts: fitness, chromosome, and population for the GA. Because my IEEE paper is about the chromosome encoding, so I am very interested in decouple the geno and phone mapping.

This post is to use F# implement the Genetic Algorithm (GA). This GA implementation is based on genotype and phenotype. You can add geno-to-pheno type mapping. Not like simple genetic algorithm, the chromosome is only 0 and 1, this implementation supports the integer. I believe you can support other chromosome representation easily because the beauty of F#. The chromosome representation can be generated by the initialization function. All the GA part is represented as function and you are given the liberty to pass in your function.

I also added the elitism for the evolution. (Aug 05, 2012).

Please use the link below to get the code.


Tuesday, April 12, 2011

Delegate wrap a function pointer

though C# does not support, but delegate does wrap a function pointer underlying. In CLR, it is calli.

Keep this in mind..

Monday, April 11, 2011

Function list for parallel - My journey to Parallel extension

Today I want to share my failure.. :-). Sometimes the failure make me "better experience"...

These days I am struggling the design patterns and parallel as I claimed design pattern is not parallel friendly. I bet a lot of people won't agree with me. Give me some time and let me think through it.

Let me explain the context; otherwise, you will doubt the rationale behind my test. I found many parallel algorithm or approach is trying to partition the data. how about play function (not data) to achieve parallelism?

Since I am a fan of F#, I really hope the function way will win. Let us wait and see.. :)

the code is

  • The first version is to partition data and make parallel. 
  • The second version is to apply a function list to the data. 

static void Main(string[] args)
        {
            List< double > a = new List< double >();
            List < double > b = new List < double >();
            Stopwatch sw = new Stopwatch();
            var size = 100000;

            for (int j = 0; j < 100; j++)
            {

                var l = Enumerable.Range(0, size).ToList();
                sw.Reset();
                sw.Start();
                Parallel.For(0, size, i => l[i]++);
                sw.Stop();
                a.Add(sw.ElapsedMilliseconds);

                var f = Enumerable.Range(0, size).Select(i => { return (() => l[i]++); }).ToList();
                sw.Reset();
                sw.Start();
                Parallel.ForEach(f, n => n.Invoke());
                sw.Stop();
                b.Add(sw.ElapsedMilliseconds);

            }

            a.Zip(b, (x, y) => new { x, y }).ToList().ForEach(n => Console.WriteLine("{0}, {1}", n.x, n.y));
        }

First of all, why I need to run 100 times? Parallel is based on some random factor, if we only run once, the result is not accurate. 

The results shows data partition version wins.. :-(

the first version is so simple, so the second version's function invoke fails me.. (sigh). But anyway I tried my luck.. 

Friday, April 8, 2011

function with LINQ

Today I work on a problem to find if a line is intersected with a circle.

so f1 is y = ax + b and f2 is x^2 + y^2 - r^2 = 0

It is true you can use for-loop to solve this problem, but since I hate for-loop, i will use LINQ.

the input to f1 is { x }, if the line is intersected with a circle means there is (a,b) where f2(a,b) <=0. OK, the data input is {x}, and it is transformed by f1 into some { (x,y) } tuple sequence. If there is a tuple in the sequence makes f2 <=0, we can draw the conclusion that the line is intersected with a circle.

The pseudo-code is like:

{ x } |> seq.map f1 |> seq.exists f2(x,y) <=0

Let me give detailed code tomorrow

Sunday, April 3, 2011

My understanding of parallel and functional programming

I just want to write down what is on my mind today. Maybe I'd say what a stupid in the future.

Days ago, I got complaints from a team member who wanted me to reduce the usage of LINQ because it is hard to debug. First of all, that is not true. Debugging a function intensive program is as easy as traditional program. When you use for (i = 0; i < 10; i++), you think about how to make this thing happen ten times. When you use Enumerable.Range(0, 10).Select(...), you might want to think about the input sequence is being pushed into the system. The system accept the input sequence and this sequence will be transformed by several functions, just like what is happening in an chip assembly factory. You pump in sand and get a CPU after several process. Nothing left on these machine exception the garbage nobody wants and care about. I do not really how people switch from for to parallel, but the input sequence way provide me some advantages.

If you want to generate more chip, what can you do? build new factories.
partition the data sequence into small chunk. Pump each small chunk into functions set and will get parallel. Actually you do not have to chunk the data, applying a function using LINQ will provide you the same functionality. Because we only use function which does not have memory about states, we are pretty safe to make the whole system parallel. That's why sometimes I hate to use class and complex design patterns to solve a problem, all of these are designed when parallel is not the main concern of the system design.

When you use class, most likely you will use field to keep some state information. Sooner or later, you will find these information is the enemy of parallel unless you keep them all non-public and non-static. I am not that disciplined to follow the rule, so.. :-)

Let me come back to this topic. This morning when I woke up, I realize "a program = data structure + algorithm" or "data structure + function". If I could partition the data, I can also partition the functions. For example, I have 10 element in an array, instead of thinking of 10 data elements, I can make up a new function list (length = 10) and apply this function list to the data set. People might say, yes, it is same. I would not agree. Once you manipulate the function, you are in the functional programming more, which is there I really want to explore more.