
Monday, October 29, 2018

F# Stanford NLP is running

After some configuration, I got my first NLP project running with F#. Special thanks to Sergey's post, which is very informative. His solution is based on F# Interactive, while I prefer a project-based solution.

Sergey points out that one of the most common setup problems is the path problem. He is right: I was stuck on it for days. Here is the process I followed.


  • Open Visual Studio 2017 and create an F# console application.
    • I tried a .NET Core app; it does not work because IKVM depends on the .NET Framework.
  • Compile the F# console application and note the location of the debug folder.
  • Open NuGet and install Stanford NLP CoreNLP. The current package version is 3.9.1.
    • The current Stanford NLP release is 3.9.2, but I suggest downloading version 3.9.1 to match the NuGet package.
  • Download the Stanford NLP 3.9.1 zip file.
  • Unzip the 3.9.1 file into the debug folder of the F# console app.
  • Go to the unzipped folder and find the models JAR file.
  • Download WinRAR and use it to unzip the JAR file into a folder; this folder should contain a folder called "EDU".
  • Copy the "EDU" folder up to the debug folder, so that "EDU" sits directly inside the "DEBUG" folder.
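
Since the path problem is the usual stumbling block, a quick check of the folder layout can save some time. The following is only a minimal sketch: the helper name `missingEntries` is hypothetical, and it assumes the models were unpacked so that `edu/stanford/nlp/models` ends up next to the compiled executable.

```fsharp
open System
open System.IO

/// Returns the expected sub-paths that do not exist under the given folder.
let missingEntries (baseDir: string) (expected: string list) =
    expected
    |> List.filter (fun relative ->
        let full = Path.Combine(baseDir, relative)
        not (Directory.Exists full || File.Exists full))

// Assumption: this is where the unpacked models JAR keeps its model files.
let expectedEntries = [ Path.Combine("edu", "stanford", "nlp", "models") ]

// Folder the executable runs from (the debug folder when started from Visual Studio).
let outputDir = AppDomain.CurrentDomain.BaseDirectory

match missingEntries outputDir expectedEntries with
| [] -> printfn "Model folder found under %s" outputDir
| missing -> missing |> List.iter (printfn "Missing: %s")
```

If the check reports a missing entry, the "EDU" folder was probably copied one level too deep or too shallow.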
  

The F# file I used is listed below. Placing the "EDU" folder in the debug folder saves you from having to set the CurrentDirectory.


// Learn more about F# at http://fsharp.org
// See the 'F# Tutorial' project for more help.

open System
open System.IO
open java.util
open java.io
open edu.stanford.nlp.pipeline

[<EntryPoint>]
let main argv = 
    let text = "Kosgi Santosh sent an email to Stanford University. He didn't get a reply."

    // Annotation pipeline configuration
    let props = Properties()
    props.setProperty("annotators","tokenize, ssplit, pos, lemma, ner, parse, dcoref") |> ignore
    props.setProperty("ner.useSUTime","0") |> ignore

    let pipeline = StanfordCoreNLP(props)

    // Annotation
    let annotation = Annotation(text)
    pipeline.annotate(annotation)

    // Result - Pretty Print
    let stream = new ByteArrayOutputStream()
    pipeline.prettyPrint(annotation, new PrintWriter(stream))
    printfn "%O" <| stream.toString()
    stream.close()

    printfn "%A" argv
    0 // return an integer exit code
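
If the "EDU" folder cannot live in the debug folder, the alternative mentioned earlier is to set the current directory before the pipeline is created. Below is a rough sketch of that idea, not a tested recipe: `tryFindModelRoot` is a hypothetical helper, the `C:\stanford-corenlp\models` path is made up for illustration, and the sketch assumes the models are looked up relative to the current directory.

```fsharp
open System
open System.IO

/// Picks the first candidate folder that contains an "edu" sub-folder, if any.
let tryFindModelRoot (candidates: string list) =
    candidates
    |> List.tryFind (fun dir -> Directory.Exists(Path.Combine(dir, "edu")))

// Hypothetical location of the unpacked models JAR; adjust it to your machine.
let unpackedModels = @"C:\stanford-corenlp\models"

// Try the folder of the executable first, then the fallback location.
let candidates = [ AppDomain.CurrentDomain.BaseDirectory; unpackedModels ]

match tryFindModelRoot candidates with
| Some dir ->
    // Switch the working directory before creating the pipeline.
    Directory.SetCurrentDirectory dir
    printfn "Working directory set to %s" dir
| None ->
    printfn "No folder containing 'edu' was found"
```

In the program above, this would run before the `StanfordCoreNLP(props)` call.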

Executing the NLP program seems to take a lot of memory. My program uses about 2 GB of memory and takes a while to show the result. Hopefully, your computer is fast enough. :)
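
To put numbers on the memory and time cost, something like the sketch below could be wrapped around the pipeline code. The `measure` helper is hypothetical, and the sleep is only a placeholder for the real workload.

```fsharp
open System
open System.Diagnostics

/// Runs an action and returns the elapsed time and the peak working set in MB.
let measure (action: unit -> unit) =
    let watch = Stopwatch.StartNew()
    action ()
    watch.Stop()
    use proc = Process.GetCurrentProcess()
    let peakMb = float proc.PeakWorkingSet64 / (1024.0 * 1024.0)
    watch.Elapsed, peakMb

// Placeholder workload; in the program above this would wrap the
// pipeline creation and the annotate call.
let elapsed, peakMb = measure (fun () -> System.Threading.Thread.Sleep(100))

printfn "Elapsed: %.1f s, peak working set: %.0f MB" elapsed.TotalSeconds peakMb
```

The peak working set is reported by the operating system for the whole process, so it includes the .NET runtime and IKVM as well as the loaded models.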
