Sunday, January 19, 2020

SelfNote: Set Up spaCy Module in Visual Studio

I guess not many people use Visual Studio for NLP with Python. I had been searching for a way to add a spaCy model in Visual Studio. I kept getting a "cannot find en_core_web_sm module" error, and the only solution I could find was to run python -m spacy download en_core_web_sm. I struggled to find where in the UI to run that Python command. Finally, the solution came.

Did you notice the "Admin" icon? Yes, that is the key to solving the problem. With admin permission, you can open PowerShell and run python -m spacy download en_core_web_sm.
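A quick way to check whether the download landed in the right place is to ask the interpreter itself. This is a minimal sketch, assuming the error comes from the model package being missing in the Python environment that Visual Studio runs; the helper names model_installed and download_command are made up for this note and are not part of spaCy.

```python
import importlib.util
import sys


def model_installed(package_name: str) -> bool:
    """Return True if the current interpreter can find the package."""
    return importlib.util.find_spec(package_name) is not None


def download_command(model_name: str) -> list:
    """Build the spaCy download command for the interpreter running this script."""
    # Using sys.executable targets this exact environment, not whichever
    # "python" happens to be first on PATH.
    return [sys.executable, "-m", "spacy", "download", model_name]


model = "en_core_web_sm"
if model_installed(model):
    print(f"{model} is available to {sys.executable}")
else:
    # Only print the command, so nothing gets installed silently.
    print("Model not found. Run:", " ".join(download_command(model)))
```

Running this from the same environment the Visual Studio project uses shows which interpreter is missing the model.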




Monday, January 6, 2020

Stanford NLP Quick Setup on Win10 with WSL

After setting up Stanford NLP on my PC, I was struggling with how to run it faster. Then I ran into Windows Subsystem for Linux (WSL). I realized that modifying Stanford NLP is not an option for me, but I can use WSL to quickly start an NLP web service and get on with my work.

I ordered a more powerful VM from Azure and used PowerShell to set up the NLP environment.

  1. Enable WSL

    Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux

    It will trigger a reboot if WSL is not enabled
  2. Add Ubuntu 18.04 to WSL

    # download Ubuntu 18.04 and save it as Ubuntu.appx in the local directory
    Invoke-WebRequest -Uri https://aka.ms/wsl-ubuntu-1804 -OutFile Ubuntu.appx -UseBasicParsing

    # add Ubuntu.appx to WSL
    Add-AppxPackage .\Ubuntu.appx
  3. Download the Stanford NLP zip file and unzip it under the current folder

    # download the Stanford NLP and save the zip file locally as "corenlp.zip"
    Invoke-WebRequest -uri http://nlp.stanford.edu/software/stanford-corenlp-full-2018-10-05.zip -outfile corenlp.zip -UseBasicParsing

    # unzip the corenlp.zip
    Expand-Archive corenlp.zip -DestinationPath .\CoreNlp\
  4. Install Java in WSL. Stanford NLP does not require Oracle Java, so I use OpenJDK to keep the command short

    wsl sudo apt-get update
    wsl sudo apt-get install default-jdk
I go into WSL from PowerShell and launch Stanford NLP from WSL. 

  • go to WSL from PowerShell by using "wsl"
  • go to the folder that stores the Stanford NLP files unzipped in step 3
  • run java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer
Open Edge and go to http://localhost:9000/; it shows a UI similar to http://corenlp.run.

The PowerShell script used to call the service on localhost port 9000 is listed below:

$data = "The quick brown fox jumped over the lazy dog."
$url2 = 'http://localhost:9000/?properties={"annotators":"tokenize,ssplit,pos,lemma,ner,entitymentions,depparse,parse,relation,openie,dcoref,kbp","outputFormat":"json"}'
$r = Invoke-RestMethod -Uri $url2 -Method post -Body $data

The annotators are listed here in case you need them.
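The same call can be made from Python with only the standard library. This is a rough sketch, assuming the server started above is listening on http://localhost:9000; build_url and annotate are names invented for this note, and the short default annotator list is my choice to keep the request light.

```python
import json
import urllib.parse
import urllib.request


def build_url(base_url: str, annotators: str, output_format: str = "json") -> str:
    """Encode the CoreNLP properties as a URL query parameter."""
    props = {"annotators": annotators, "outputFormat": output_format}
    query = urllib.parse.urlencode({"properties": json.dumps(props)})
    return f"{base_url}/?{query}"


def annotate(text: str,
             base_url: str = "http://localhost:9000",
             annotators: str = "tokenize,ssplit,pos"):
    """POST the raw text to the server and return the parsed JSON reply."""
    request = urllib.request.Request(
        build_url(base_url, annotators),
        data=text.encode("utf-8"),
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, annotate("The quick brown fox jumped over the lazy dog.") should return a dictionary whose "sentences" entry holds the tokens. Percent-encoding the JSON properties avoids relying on the client to tolerate quotes and braces in the URL.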


Monday, September 2, 2019

6-Year-Old Rule?

The story begins with my recent indulgence in quantum computing programming, whose foundation involves a lot of linear algebra. The direct consequence is that I want to solve almost every problem using complex numbers, vectors, and matrices. The Mars Rover problem got my attention.

The rover on the Mars surface only takes L, R, and M instructions. L means turn left, R means turn right, and M means move 1 step forward. If the initial position is (0, 0) and the rover is facing North, what will its location and facing direction be after receiving a sequence of instructions?

This is a perfect opportunity to practice my math, I thought. The direction and location are both complex numbers. I can manipulate numbers to make the code concise! I converted the L, R, and M instructions to a complex number op:


  • turn left L is complex (0, 1) where the real part is 0 and the imaginary part is 1
  • turn right R is complex (0, -1)
  • move forward M is complex number (1, 0)
The new location and direction can be computed like the following:

  • new location += op.Real * Direction  //L and R's real part is 0, so it does not move location
  • new direction *= op  //M is 1, so it has no effect on the direction
It looks cool and the code is concise. I avoided those unpleasant switch statements.
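Here is a minimal Python sketch of the complex-number approach (not my original code). It assumes the real axis points East and the imaginary axis points North, so facing North is the complex number (0, 1); the names OPS, HEADINGS, and run are made up for illustration.

```python
# Each instruction maps to a complex "op", as in the list above.
OPS = {
    "L": complex(0, 1),   # turn left: rotate the heading by +90 degrees
    "R": complex(0, -1),  # turn right: rotate the heading by -90 degrees
    "M": complex(1, 0),   # move: real part 1 enables the step, heading unchanged
}

# Assumed convention: real axis = East, imaginary axis = North.
HEADINGS = {
    complex(0, 1): "N",
    complex(-1, 0): "W",
    complex(0, -1): "S",
    complex(1, 0): "E",
}


def run(instructions, location=complex(0, 0), direction=complex(0, 1)):
    """Apply the instructions and return (location, heading letter)."""
    for ch in instructions:
        op = OPS[ch]
        location += op.real * direction  # L and R have real part 0: no move
        direction *= op                  # M is 1: heading stays the same
    return location, HEADINGS[direction]


location, heading = run("MMRMM")
print(location, heading)
```

Starting at (0, 0) facing North, "MMRMM" moves two steps North, turns right, and moves two steps East, so the rover ends at 2+2j facing East.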

Everything looked satisfying until I realized those expensive multiply operations might be a performance problem. I wrote a simple solution using those switch statements. The simple solution doubled the code size. However, it ran 30+% faster than my "elegant" one!

I told my wife what was going on when I ran into her in the kitchen, with a little bit of frustration. She smiled and pointed at my 6-year-old son. "If the solution can be understood by him, it must be faster," she said. "It requires a smaller brain, so it runs faster."
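For comparison, a plain branch-based version might look like the Python sketch below. It is an illustration of the idea rather than the code I measured, it uses if/elif in place of a switch statement, and the names HEADINGS, STEPS, and run_plain are made up. The 30+% figure above comes from my own implementation, so this sketch makes no claim about speed.

```python
# Headings in clockwise order; turning right moves one entry forward.
HEADINGS = ["N", "E", "S", "W"]

# Unit step (dx, dy) per heading, with x pointing East and y pointing North.
STEPS = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}


def run_plain(instructions, x=0, y=0, heading="N"):
    """Apply the instructions with integer arithmetic only."""
    index = HEADINGS.index(heading)
    for ch in instructions:
        if ch == "L":
            index = (index - 1) % 4
        elif ch == "R":
            index = (index + 1) % 4
        elif ch == "M":
            dx, dy = STEPS[HEADINGS[index]]
            x, y = x + dx, y + dy
        else:
            raise ValueError(f"unknown instruction: {ch!r}")
    return x, y, HEADINGS[index]


print(run_plain("MMRMM"))
```

Every step here is an integer add or a table lookup, with no complex multiplication involved.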

Well, both algorithms are O(n). Maybe next time, I should think on the big O level. 

Sunday, March 3, 2019

Old stuff: my first IEEE paper on Computational Intelligence

Before I finished college, I managed to publish a paper on evolutionary computation (EC). The paper is on the encoding part. My next task is to rewrite my algorithm using quantum computing. Well, there is already a quantum genetic algorithm (QGA) out there. However, I think with the latest decision on Q# and quantum algorithms. The paper link is here.

Saturday, January 19, 2019

Microsoft Q# Day 1: Qubit Collapsed

After going through the quantum linear algebra, I am going to explore some quantum concepts using Q#.

The first thing I want to prove is that measurement collapses the qubit. I will use AssertProb to verify the outcome probabilities. The pseudocode is listed below:

  1. use H gate to set the qubit to superposition
  2. check that the probability of measuring One equals 0.5 (50%)
  3. measure the qubit
  4. check that the probability of the qubit matching the measured value equals 1 (100%). The qubit collapsed.
The code is listed below:


namespace Quantum.QSharpApplication2
{
    open Microsoft.Quantum.Primitive;
    open Microsoft.Quantum.Canon;
    open Microsoft.Quantum.Extensions.Convert;

    operation Operation1 () : Unit
    {
        body(...)
        {
            using (ancilla = Qubit())
            {
                // Step 1: put the qubit into superposition.
                H(ancilla);

                // Step 2: Zero and One are each measured with probability 0.5.
                AssertProb([PauliZ], [ancilla], Zero, 0.5, "error", 1e-5);
                AssertProb([PauliZ], [ancilla], One, 0.5, "error", 1e-5);

                // Step 3: measure in the Z basis.
                let gate = PauliZ;
                let mX = Measure([gate], [ancilla]);

                // Step 4: the qubit collapsed, so the same result now has probability 1.
                AssertProb([gate], [ancilla], mX, 1.0, "error", 1e-5);

                Message(ToStringD(mX == Zero ? 0. | 1.));

                Reset(ancilla);
            }
        }
    }
}
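The same four steps can be mimicked classically with a two-amplitude state vector. This is a toy Python sketch, not a replacement for the Q# simulator: the helpers hadamard, prob_one, and measure are made up for this note, and the state is just a pair (amplitude of 0, amplitude of 1).

```python
import math
import random


def hadamard(state):
    """Apply the H gate to a single-qubit state (amp0, amp1)."""
    a0, a1 = state
    s = 1 / math.sqrt(2)
    return (s * (a0 + a1), s * (a0 - a1))


def prob_one(state):
    """Probability of measuring 1 in the Z basis."""
    return abs(state[1]) ** 2


def measure(state, rng=random):
    """Measure in the Z basis; return (outcome, collapsed state)."""
    outcome = 1 if rng.random() < prob_one(state) else 0
    # After the measurement the state is the matching basis state.
    collapsed = (0.0, 1.0) if outcome == 1 else (1.0, 0.0)
    return outcome, collapsed


state = hadamard((1.0, 0.0))                              # step 1: superposition
assert math.isclose(prob_one(state), 0.5, abs_tol=1e-5)   # step 2: 50% for 1
outcome, state = measure(state)                           # step 3: measure
assert math.isclose(prob_one(state), float(outcome), abs_tol=1e-5)  # step 4
print("measured", outcome)
```

Measuring the collapsed state again always gives the same outcome, which is the behavior the final AssertProb call checks in the Q# version.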

Wednesday, December 26, 2018

Microsoft Q# is my choice

Today I am thinking of digging into real quantum programming after learning the foundations. There are three big players I have been looking at for a quantum programming framework/language. I hate to waste my time learning the framework or language itself; I want to focus on solving the problem. I have three choices:
  • IBM Q
  • Google Cirq
  • Microsoft Q#
It seems IBM Q and Cirq reuse Python to provide a framework, while Q# is a brand new language. It makes sense for IBM and Google to use Python as the base because it has good math libraries. However, I doubt this is the best choice. Like the GPU, quantum is special hardware. CUDA is a C-like language, but it has its own features. Quantum computing has a totally different way to compute. Therefore, I think the best way is to use a new language, even if it is just a DSL. C++ can be used to write a functional program, but the best way is still to use a functional programming language.

Also, the examples from those three companies show a huge difference. Q# provides the best sample pack so far. It covers quantum foundations, matrix computation, and the programming language. It will help me a lot in adopting this new stuff.

Setup and debugging were my last consideration. I had tried to step away from the Microsoft stack; however, debugging and setup needed more time than I expected. I was spoiled by double-click-and-it-just-works.

Anyway, the current Q# shows good progress, and its watch, star, and pull request numbers are bigger than the other players'. I will continue to focus on the quantum basics, like the matrix and math parts, anyway.

Sunday, December 23, 2018

SelfNotes: Quantum Algorithms

I believe quantum computing is the key to enabling real AI. No matter how complicated an AI algorithm is, the fundamental computing power will make a real difference.

The huge difference between a classical computer and a quantum computer is so intriguing! It is the first time that math can come back to computer science and guide its evolution. Quantum algorithms are also the first to introduce internal parallelism. Existing algorithms barely considered parallelism when they were designed. Now quantum can change this situation!

The following are some notes about quantum algorithms. They are not related to any language implementation; it is just pure math. The only tool I want to use is "that piece of meat between my ears". :)