Sunday, October 18, 2020

SelfNote: Enable Python in Visual Studio Interactive

This note documents how to configure the Python Interactive window in Visual Studio (not VS Code).

Reference a local library

When I added a new .py file to the project, I could not import it in the Python Interactive window. The reason is that the default working folder of the interactive window is under "Program Files". The following commands change the current working folder to a new folder:

import os
os.chdir(r"C:\path\to\your\project")  # placeholder path; replace with your own project folder

Make sure the path string starts with "r" (a raw string), so that the backslashes in a Windows path are not treated as escape characters.

Reload a library

After a library is changed, the new content is not loaded into the Python Interactive session by default. The following commands reload the library into the session.

import importlib

import myModule  # import my module

# the following reloads myModule
importlib.reload(myModule)

Please note: importlib.reload takes the module object itself, so myModule is not wrapped in single or double quotes.
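
To see the reload behavior end to end, here is a minimal sketch that can be pasted into any Python 3 session. The module name my_module_demo and its GREETING variable are made up for this demonstration; the module is written to a temporary folder so nothing in a real project is touched. Disabling bytecode caching is my own precaution so that two quick edits in a row cannot be masked by a stale cache file.

```python
import importlib
import pathlib
import sys
import tempfile

# Precaution (an assumption of this sketch): skip .pyc files so that a quick
# edit of the source is never hidden behind a cached compiled copy.
sys.dont_write_bytecode = True

# Create a throwaway module on disk; "my_module_demo" is a hypothetical name.
work_dir = pathlib.Path(tempfile.mkdtemp())
module_file = work_dir / "my_module_demo.py"
module_file.write_text("GREETING = 'first version'\n")

# Make the folder importable, then import the module as usual.
sys.path.insert(0, str(work_dir))
importlib.invalidate_caches()
import my_module_demo

before = my_module_demo.GREETING

# Edit the file on disk. The running session still holds the old content.
module_file.write_text("GREETING = 'second version, edited on disk'\n")
unchanged = my_module_demo.GREETING

# Reload: pass the module object itself, not its name as a string.
importlib.reload(my_module_demo)
after = my_module_demo.GREETING

print(before, "|", unchanged, "|", after)
```

The middle value shows that editing the file alone changes nothing in the session; only the reload call picks up the new content.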


Friday, August 7, 2020

Matrix Multiplication & Graph Reachable

I have been struggling to find a good way to represent a graph structure. There are a number of graph libraries in which Node and Edge are defined. I found that a matrix can also be used to represent a graph.

The rows and columns of the matrix represent the nodes of the graph. Each element of the matrix is the connection between two nodes. In a graph, this connection is called an Edge. The following diagram represents a graph with 3 nodes and 3 edges.

The same graph can be written as a matrix. The rows and columns are named A, B, and C. If there is an edge between two nodes, the corresponding cell is set to 1. For example, edge AB puts a 1 in cell (A, B) of the grid. We denote this matrix as M.
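
A minimal Python sketch of building such a matrix from an edge list. The diagram is not reproduced here, so apart from edge AB the edge list below is a made-up example, and treating the edges as directed is my assumption; the helper name adjacency_matrix is also hypothetical.

```python
def adjacency_matrix(nodes, edges):
    """Return a 0/1 matrix where cell (row, column) is 1 if an edge row -> column exists."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
    return matrix


nodes = ["A", "B", "C"]
# Hypothetical edges: only A -> B is mentioned in the text.
edges = [("A", "B"), ("B", "C"), ("C", "A")]

M = adjacency_matrix(nodes, edges)
for name, row in zip(nodes, M):
    print(name, row)
```

For an undirected graph, the same helper could simply set both cell (source, target) and cell (target, source).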

"Multiplying" the matrix by itself reveals whether a node can reach another node within two steps. This "multiplication" operation is different from the matrix multiplication of linear algebra.

K = M ⊗ M

Each element K(i, j) of K can be calculated by the following formula, where N is the number of nodes (that is, the number of columns of the matrix M):

K(i, j) = M(i, j) OR (M(i, 1) AND M(1, j)) OR ... OR (M(i, N) AND M(N, j))

Consider two nodes i and j. Node j is reachable from i if either of the following two conditions is met:
  • there is a direct edge between i and j
  • there is a node p which is reachable from i, and p can reach j
Written in mathematical terms on the matrix M:
  • M(i, j) = 1
  • ∃p, M(i, p) = 1 and M(p, j) = 1
Repeating the "multiplication" about log2(N) times, each time on the previous result, determines for every pair of nodes whether one is reachable from the other. Each repetition doubles the path length that is covered, and a path never needs more than N - 1 edges.
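
The operation and its repetition can be sketched in plain Python as follows. This is only a sketch under my own assumptions: the function names bool_square and reachability are invented, the matrix is a list of 0/1 lists, and the 4-node chain used as input is a made-up example rather than the graph from the diagram.

```python
def bool_square(m):
    """One "multiplication" step: K(i, j) = M(i, j) OR exists p with M(i, p) AND M(p, j)."""
    n = len(m)
    return [
        [
            1 if m[i][j] or any(m[i][p] and m[p][j] for p in range(n)) else 0
            for j in range(n)
        ]
        for i in range(n)
    ]


def reachability(m):
    """Repeat the step ceil(log2(N)) times so that paths of any length are covered."""
    n = len(m)
    steps = (n - 1).bit_length() if n > 0 else 0  # equals ceil(log2(n)) for n >= 1
    k = [row[:] for row in m]
    for _ in range(steps):
        k = bool_square(k)
    return k


# Hypothetical chain A -> B -> C -> D.
M = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

two_steps = bool_square(M)   # pairs connected by a path of at most 2 edges
closure = reachability(M)    # pairs connected by a path of any length
for row in closure:
    print(row)
```

After the first step A reaches B and C; after the second step A also reaches D, which is why two repetitions are enough for four nodes.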

Finally, I can move away from a complicated node/edge definition into a formal math world. Hopefully the existing linear algebra can help me down the road. 

Wednesday, February 19, 2020

SelfNote: VSCode setup

Recently I started to use an Azure cloud machine to do my coding work. I can still code on my laptop using full Visual Studio, but an Azure VM with VSCode is much more convenient. Low tech but works. :)

The following VSCode setup script helps me set up a new machine:

  1. open PowerShell 
  2. copy extension names to the clipboard (yes, Ctrl+C)
  3. run get-clipboard | % { code --install-extension $_ }


Monday, February 3, 2020

Docker Image build clean up using PowerShell on local PC

When I ran the docker image for debugging on my PC, I was struggling to keep the hard disk space consumption low while running the most recent binary in "Docker on Windows Desktop". This is still the development stage, so I won't consider the Azure cloud for hosting.

I need a script that reclaims hard disk space, removes the currently running container, and then builds and runs the new image in docker on windows desktop.

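The following code first removes unused images and then finds the container running on port 5555. Because docker prints a column-aligned text table, I have to play string-replace tricks in PowerShell to turn it into tab-delimited text that ConvertFrom-Csv can parse. Please check the "-replace" part for details.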

#prune unused image
docker image prune -f

#remove existing running docker container
$a = docker container ls --filter expose=5555
$a2 = $a | foreach-object {$_ -replace '\s{3,}', '  '} | foreach-object {$_ -replace '  ', "`t"}
$obj = ConvertFrom-Csv -InputObject $a2 -Delimiter "`t"
if ($null -ne $obj) {
    docker rm -f $obj.names
}
else {
    Write-Verbose "cannot find running container image so won't remove"
}

#build flask app and set its version to 1
docker build -t flaskapp:1 .
docker run -d -p 5555:5555 flaskapp:1

#test running instance
sleep -s 4
invoke-restmethod "http://localhost:5555/api/health"
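
For comparison, the same column-splitting trick can be sketched in Python. This is not part of the script above: the function name parse_docker_table is invented, and the sample text is made-up output in the general shape that docker container ls prints, so the container ID and name are placeholders. The sketch assumes, as the -replace trick does, that columns are separated by at least two spaces and that no cell is empty.

```python
import re

# Made-up sample in the shape of `docker container ls` output (placeholder values).
SAMPLE_OUTPUT = """\
CONTAINER ID   IMAGE        COMMAND           CREATED         STATUS         PORTS                    NAMES
0123456789ab   flaskapp:1   "python app.py"   5 minutes ago   Up 5 minutes   0.0.0.0:5555->5555/tcp   demo_container
"""


def parse_docker_table(text):
    """Split a column-aligned table on runs of 2+ whitespace characters."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = re.split(r"\s{2,}", lines[0].strip())
    rows = []
    for line in lines[1:]:
        cells = re.split(r"\s{2,}", line.strip())
        rows.append(dict(zip(header, cells)))
    return rows


containers = parse_docker_table(SAMPLE_OUTPUT)
names = [row["NAMES"] for row in containers]
print(names)
```

Single spaces inside a cell, such as "CONTAINER ID" or "5 minutes ago", survive because only runs of two or more spaces are treated as column separators. An empty cell would shift the remaining columns, so this approach is only as robust as the original trick.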

Sunday, January 19, 2020

SelfNote: Setup Spacy Module in Visual Studio

I guess not many people use Visual Studio for NLP with Python. I had been searching for a way to add the Spacy module in Visual Studio. I kept getting a "cannot find en_core_web_sm module" error, and the only solution I could find was to run python -m spacy download en_core_web_sm. I was struggling to find where in the UI to run that Python command. And finally, the solution came.

Did you notice the "Admin" icon? Yes, that is the key to solving the problem. With admin permission, you can open PowerShell and run python -m spacy download en_core_web_sm.

I installed en_core_web_sm, but it shows a warning message. Using the en_core_web_lg module instead eliminates the warning.

Monday, January 6, 2020

Stanford NLP Quick Setup on Win10 with WSL

After setting up Stanford NLP on my PC, I was struggling with how to run it faster. Then I ran into Windows Subsystem for Linux (WSL). I realized that modifying Stanford NLP is not an option for me. Instead, I can use WSL to quickly start an NLP web service and start my work.

I ordered a more powerful VM from Azure and used PowerShell to set up the NLP environment.

  1. Enable WSL

    Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux

    It will trigger a reboot if WSL is not enabled
  2. Add Ubuntu disco to WSL

    # download ubuntu 18.04 and save it as Ubuntu.appx in the local directory
    Invoke-WebRequest -Uri -OutFile Ubuntu.appx -UseBasicParsing

    # add Ubuntu.appx to WSL
    Add-AppxPackage .\Ubuntu.appx
  3. Download the Stanford NLP zip file and unzip it to the current folder

    # download the Stanford NLP and save the zip file locally as ""
    Invoke-WebRequest -uri -outfile -UseBasicParsing

    # unzip the
    Expand-Archive -DestinationPath .\CoreNlp\
  4. Install Java in WSL. Stanford NLP does not require Oracle Java, so I use OpenJDK (the default-jdk package) to keep the command short

    wsl sudo apt-get update
    wsl sudo apt-get install default-jdk
I go into WSL from PowerShell and launch Stanford NLP from WSL.

  • go to WSL from PowerShell by using "wsl"
  • go to the folder which stores the unzipped Stanford NLP files in step 3
  • run java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer
Open Edge and go to http://localhost:9000/ to see the server's web UI.

The PowerShell script used to call the server on localhost port 9000 is listed below:

$data = "The quick brown fox jumped over the lazy dog."
$url2 = 'http://localhost:9000/?properties={"annotators":"tokenize,ssplit,pos,lemma,ner, entitymentions,depparse,parse,relation,openie,dcoref,kbp","outputFormat":"json"}'
$r = Invoke-RestMethod -Uri $url2 -Method post -Body $data

The annotators are listed here in case you need them.
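
If you prefer Python over PowerShell, the same request can be sketched with the standard library only. This is a sketch, not something I ran as part of the setup above: the function names build_corenlp_url and annotate are invented, the annotator list is shortened, and it assumes the server from the previous step is listening on http://localhost:9000/.

```python
import json
import urllib.parse
import urllib.request


def build_corenlp_url(annotators, base_url="http://localhost:9000/"):
    """Build the request URL with the JSON "properties" query parameter, URL-encoded."""
    properties = {"annotators": ",".join(annotators), "outputFormat": "json"}
    query = urllib.parse.urlencode({"properties": json.dumps(properties)})
    return base_url + "?" + query


def annotate(text, annotators, base_url="http://localhost:9000/"):
    """POST the raw text to the server and return the parsed JSON response."""
    request = urllib.request.Request(
        build_corenlp_url(annotators, base_url),
        data=text.encode("utf-8"),
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, once the server is running (not executed here):
# result = annotate("The quick brown fox jumped over the lazy dog.",
#                   ["tokenize", "ssplit", "pos", "lemma", "ner"])
```

Building the URL in a separate function makes it easy to check the encoded properties without a running server.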

Monday, September 2, 2019

6-Year-Old Rule?

The story begins with my recent indulgence in quantum computing programming, whose foundation involves a lot of linear algebra. The direct consequence is that I want to solve almost every problem using complex numbers, vectors, and matrices. The Mars Rover problem got my attention.

The rover on the Mars surface only takes L, R, and M instructions. L means turn left. R means turn right. M means move 1 step forward. If the initial position is (0, 0) and the rover faces North, what will be its location and facing direction after receiving a sequence of instructions?

This is a perfect opportunity to practice my math, I thought. The direction and location are both complex numbers. I can manipulate numbers to make the code concise! I converted the L, R, and M instructions to a complex number op:

  • turn left L is complex (0, 1) where the real part is 0 and the imaginary part is 1
  • turn right R is complex (0, -1)
  • move forward M is complex number (1, 0)
The new location and direction can be computed like the following:

  • new location += op.Real * Direction  //L and R's real part is 0, so it does not move location
  • new direction *= op  //M is 1, so it has no effect on the direction
It looks cool and the code is concise. I avoided those unpleasant switch statements.
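
The two update rules above can be sketched in Python, which has complex numbers built in. My original code is not shown in this post, so the names OPS and run_rover are made up for the sketch; North is represented as 1j, following the (0, 1) convention of the list above.

```python
# Instruction -> complex "op", exactly as in the list above.
OPS = {
    "L": 1j,       # turn left:  (0, 1)
    "R": -1j,      # turn right: (0, -1)
    "M": 1 + 0j,   # move:       (1, 0)
}


def run_rover(instructions, location=0j, direction=1j):
    """Apply each instruction; direction 1j means facing North."""
    for ch in instructions:
        op = OPS[ch]
        location += op.real * direction  # real part is 0 for L and R, so no move
        direction *= op                  # op is 1 for M, so no turn
    return location, direction


location, direction = run_rover("MMRMM")
print(location, direction)
```

With "MMRMM" the rover moves two steps North, turns right to face East, and moves two steps East, so the location has real part 2 and imaginary part 2, and the direction is East (real part 1).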

Everything looked satisfying until I realized those expensive multiply operations might be a performance problem. I wrote a simple solution using those switch statements. The simple solution doubled the amount of code. However, it ran more than 30% faster than my "elegant" one!
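
A sketch of what such a simple, branch-based solution might look like in Python. It is an illustration only: the name run_rover_simple and the heading table are my own choices for this sketch, Python's if/elif stands in for the switch statement, and no timing is claimed for it.

```python
# Headings in clockwise order: North, East, South, West, as (dx, dy) steps.
HEADINGS = [(0, 1), (1, 0), (0, -1), (-1, 0)]


def run_rover_simple(instructions, x=0, y=0, heading=0):
    """Plain integer version: heading is an index into HEADINGS (0 = North)."""
    for ch in instructions:
        if ch == "L":
            heading = (heading - 1) % 4   # counter-clockwise
        elif ch == "R":
            heading = (heading + 1) % 4   # clockwise
        elif ch == "M":
            dx, dy = HEADINGS[heading]
            x += dx
            y += dy
        else:
            raise ValueError(f"unknown instruction: {ch!r}")
    return x, y, "NESW"[heading]


print(run_rover_simple("MMRMM"))
```

It is longer than the complex-number version, but every step is integer addition or a table lookup.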

I told my wife what was going on when I ran into her in the kitchen, with a little bit of frustration. She smiled and pointed at my 6-year-old son. "If the solution can be understood by him, it must be faster," she said. "It requires a smaller brain, so it runs faster."

Well, both algorithms are O(n). Maybe next time, I should think on the big O level.