Natural-Language-Processing-Project-1

The program is easy to run. Go to PROJECT_PATH\src\Author Recognition and run the following command:

    java -jar Natural-Language-Processing-Project-1.jar

The program will ask for the directory in which the authors and their files are located.

Expected directory structure is the following:

input_path/authorName/fileX
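As a rough illustration of that layout, the sketch below walks input_path and groups the files by author folder. The class name `AuthorCorpusLayout` and the method `filesByAuthor` are hypothetical and are not taken from the project; the actual jar may read the directory differently.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the project: groups files by author folder.
class AuthorCorpusLayout {

    // Returns author name -> files found directly inside that author's folder.
    static Map<String, List<Path>> filesByAuthor(Path inputPath) throws IOException {
        Map<String, List<Path>> result = new TreeMap<>();
        try (DirectoryStream<Path> authors = Files.newDirectoryStream(inputPath)) {
            for (Path authorDir : authors) {
                if (!Files.isDirectory(authorDir)) {
                    continue; // skip stray files at the top level
                }
                List<Path> files = new ArrayList<>();
                try (DirectoryStream<Path> entries = Files.newDirectoryStream(authorDir)) {
                    for (Path entry : entries) {
                        if (Files.isRegularFile(entry)) {
                            files.add(entry);
                        }
                    }
                }
                result.put(authorDir.getFileName().toString(), files);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the input_path described above.
        Map<String, List<Path>> corpus = filesByAuthor(Paths.get(args[0]));
        corpus.forEach((author, files) ->
                System.out.println(author + ": " + files.size() + " file(s)"));
    }
}
```

Running it against a candidate input directory is a quick way to check that every author folder is picked up before starting the real program.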

Natural-Language-Processing-Project-2

This is the second project of the Natural Language Processing course at Bogazici University, Spring 2016 semester: an implementation of the Viterbi algorithm in Java on the Metubank dataset.

The easy way to RUN THEM ALL

I made a runner class that receives all the parameters and runs the 3 tasks one by one. Nothing has changed internally; this just makes life simpler.

RunThemAll.java

You can find the jar file in Natural-Language-Processing-Project-1\src\POS Tagger.

It receives several parameters. You can compile it yourself, but I also exported a jar file into the project, which can be run with the following command:

    java -jar RunThemAll.jar trainingFilePath [postag/cpostag] testFilePath outputPath goldStandardPath
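The sketch below shows one way the five arguments of that command could be checked before any of the three tasks starts. It is only an illustration: the class `RunnerArgs` is hypothetical, and how RunThemAll.java actually handles its arguments is not documented here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the project's code: validates the five arguments
// of the RunThemAll command line before any work starts.
class RunnerArgs {
    static final List<String> TAG_OPTIONS = Arrays.asList("postag", "cpostag");

    final String trainingFilePath;  // used by the training step
    final String tagOption;         // "postag" or "cpostag"
    final String testFilePath;      // used by the tagging step
    final String outputPath;        // written by tagging, read by validation
    final String goldStandardPath;  // used by the validation step

    private RunnerArgs(String[] args) {
        trainingFilePath = args[0];
        tagOption = args[1];
        testFilePath = args[2];
        outputPath = args[3];
        goldStandardPath = args[4];
    }

    static RunnerArgs parse(String[] args) {
        if (args.length != 5) {
            throw new IllegalArgumentException(
                    "Usage: trainingFilePath [postag/cpostag] testFilePath outputPath goldStandardPath");
        }
        if (!TAG_OPTIONS.contains(args[1])) {
            throw new IllegalArgumentException("Tag option must be postag or cpostag: " + args[1]);
        }
        return new RunnerArgs(args);
    }

    public static void main(String[] args) {
        RunnerArgs parsed = parse(args);
        System.out.println("train: " + parsed.trainingFilePath + " (" + parsed.tagOption + ")");
        System.out.println("test: " + parsed.testFilePath + " -> " + parsed.outputPath);
        System.out.println("validate: " + parsed.outputPath + " vs " + parsed.goldStandardPath);
    }
}
```

Failing early with a usage message is a design choice of this sketch; it avoids training a model only to find out afterwards that a later argument is missing.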

How to run the code manually

Assuming all files are compiled using: javac <filename>.java

Main.java

This program expects two parameters, as specified in the project description. The first is the path of the training data. The second is the POS tag option, either postag or cpostag. The program trains itself on the training data and creates the 3 output files listed below in the same directory:

posNamesToPosNamesProbablities.ser
posNamesToWordPossibilities.ser
posType.ser

If you don't provide any parameters, the program will crash.
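The .ser extension suggests that the model files are written with standard Java serialization, although this is an assumption and not something stated here. Under that assumption, saving and reloading a probability table could look like the sketch below; `ModelStore` is a hypothetical name.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical helper, assuming the .ser files hold Java-serialized objects.
class ModelStore {

    // Writes any serializable model (for example a HashMap) to the given path.
    static void save(Serializable model, String path) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(path))) {
            out.writeObject(model);
        }
    }

    // Reads the object back; the caller chooses the expected type.
    @SuppressWarnings("unchecked")
    static <T> T load(String path) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(path))) {
            return (T) in.readObject();
        }
    }
}
```

A training step would call `save` once per table, and a tagging step would call `load` with the same paths before processing the test data.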

Test.java

This program expects two parameters, as specified in the project description. The first is the path of the test data. The second is the path of the output file. The program tags the test data using the previously produced model, which it reads from the .ser files described above, and then writes its guesses to the output file in a readable form.
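For readers unfamiliar with the Viterbi algorithm that this project implements, here is a minimal sketch of bigram Viterbi decoding over two probability tables. It is not the code in Test.java: the class name, the `<s>` start marker, the table layout, and the probability floor for unseen events are all assumptions made for the example.

```java
import java.util.Map;

// Minimal bigram Viterbi sketch; names and smoothing are assumptions,
// not taken from the project's source.
class ViterbiSketch {
    static final double FLOOR = 1e-10; // assumed floor for unseen events

    // Looks up table[outer][inner] and returns its log, floored to avoid log(0).
    static double logProb(Map<String, Map<String, Double>> table, String outer, String inner) {
        Map<String, Double> row = table.get(outer);
        double p = (row == null) ? 0.0 : row.getOrDefault(inner, 0.0);
        return Math.log(Math.max(p, FLOOR));
    }

    // transition.get(prev).get(next) = P(next | prev); "<s>" marks the sentence start.
    // emission.get(tag).get(word) = P(word | tag).
    static String[] tag(String[] words, String[] tags,
                        Map<String, Map<String, Double>> transition,
                        Map<String, Map<String, Double>> emission) {
        int n = words.length;
        int k = tags.length;
        if (n == 0) {
            return new String[0];
        }
        double[][] score = new double[n][k]; // best log-probability ending in tag j at word i
        int[][] back = new int[n][k];        // index of the previous tag on that best path
        for (int j = 0; j < k; j++) {
            score[0][j] = logProb(transition, "<s>", tags[j]) + logProb(emission, tags[j], words[0]);
        }
        for (int i = 1; i < n; i++) {
            for (int j = 0; j < k; j++) {
                double best = Double.NEGATIVE_INFINITY;
                int arg = 0;
                for (int p = 0; p < k; p++) {
                    double s = score[i - 1][p] + logProb(transition, tags[p], tags[j]);
                    if (s > best) {
                        best = s;
                        arg = p;
                    }
                }
                score[i][j] = best + logProb(emission, tags[j], words[i]);
                back[i][j] = arg;
            }
        }
        // Pick the best final tag, then follow the back-pointers to the start.
        int last = 0;
        for (int j = 1; j < k; j++) {
            if (score[n - 1][j] > score[n - 1][last]) {
                last = j;
            }
        }
        String[] result = new String[n];
        for (int i = n - 1; i >= 0; i--) {
            result[i] = tags[last];
            last = back[i][last];
        }
        return result;
    }
}
```

Working in log space keeps long sentences from underflowing to zero, which is why the sketch adds log-probabilities where a naive version would multiply probabilities.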

Validator.java

This program expects two parameters, as specified in the project description. The first is the path of the output file described under Test.java. The second is the path of the gold_standard file. The program outputs the confusion matrix in a JSON-like form, for example:

Noun : {
        Noun : 7
        Adj : 2
        },
Punc : {
        Punc: 15
        }

The example states that the test data contained 9 words labelled as Noun, of which the program guessed 7 as Noun and 2 as Adj. In addition, there were 15 Punc tokens, and the program labelled all of them correctly.
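A confusion matrix of this shape can be built by counting (gold tag, guessed tag) pairs. The sketch below is an illustration only; `ConfusionMatrixSketch` and its methods are hypothetical names, and Validator.java may compute and print the matrix differently.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of counting a confusion matrix; not the project's Validator.
class ConfusionMatrixSketch {

    // matrix.get(gold).get(guessed) = number of words with that gold tag and that guess.
    static Map<String, Map<String, Integer>> build(String[] gold, String[] guessed) {
        if (gold.length != guessed.length) {
            throw new IllegalArgumentException("gold and guessed must have the same length");
        }
        Map<String, Map<String, Integer>> matrix = new TreeMap<>();
        for (int i = 0; i < gold.length; i++) {
            matrix.computeIfAbsent(gold[i], g -> new TreeMap<>())
                  .merge(guessed[i], 1, Integer::sum);
        }
        return matrix;
    }

    // Share of words whose guessed tag equals the gold tag (the matrix diagonal).
    static double accuracy(Map<String, Map<String, Integer>> matrix) {
        long correct = 0;
        long total = 0;
        for (Map.Entry<String, Map<String, Integer>> row : matrix.entrySet()) {
            for (Map.Entry<String, Integer> cell : row.getValue().entrySet()) {
                total += cell.getValue();
                if (row.getKey().equals(cell.getKey())) {
                    correct += cell.getValue();
                }
            }
        }
        return total == 0 ? 0.0 : (double) correct / total;
    }
}
```

For the example matrix above, the diagonal holds 7 + 15 = 22 of the 24 words, so `accuracy` would return 22/24.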

For Turkish-speaking developers: independently of the project, the distances between the provinces of Turkey are included in the repository as a JSON file (iller arasi mesafe.json).
