NLP-Final-Project

This is the final project for Whitman CS 357 Natural Language Processing (Fall 2019), created by Tracy Cui and Tina Tao.

Summary: This project adds a POS-tagging language model to the SpellChecker program. The program first detects the incorrect words in the text, then replaces each incorrect word with the candidate correction that has the highest probability according to the POS tagging. The input can be either a single sentence or a list of sentences.
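The selection step described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code in final.py: the counts in TAG_BIGRAM and WORD_TAG are made-up toy values, the tag names are placeholders, and scoring a candidate by (tag-bigram count) x (word-tag count) is an assumption about how "highest probability according to the POS tagging" might be computed.

```python
# Hypothetical toy counts; a real model would be estimated from a tagged corpus.
TAG_BIGRAM = {("DET", "NOUN"): 8, ("DET", "VERB"): 1, ("NOUN", "VERB"): 6}
WORD_TAG = {("dog", "NOUN"): 5, ("dig", "VERB"): 4, ("dug", "VERB"): 2}


def score(prev_tag, word):
    """Best count-based score of `word` over its known tags, given the previous tag."""
    return max(
        (TAG_BIGRAM.get((prev_tag, tag), 0) * count
         for (w, tag), count in WORD_TAG.items() if w == word),
        default=0,  # words the toy model has never seen score 0
    )


def best_candidate(prev_tag, candidates):
    """Pick the candidate correction that fits best after `prev_tag`."""
    # sorted() makes tie-breaking deterministic (alphabetical order).
    return max(sorted(candidates), key=lambda w: score(prev_tag, w))


# After a determiner the noun reading wins; after a noun a verb wins.
print(best_candidate("DET", {"dig", "dog", "dug"}))   # dog
print(best_candidate("NOUN", {"dig", "dog", "dug"}))  # dig
```

The same misspelling can therefore be corrected differently depending on the tag of the preceding word, which is the point of adding POS information to a plain spell checker.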

Reference/Sources: We used the pyspellchecker Python package to recognize the incorrect words and to generate a list of possible corrections for each incorrect word. Use "pip install pyspellchecker" to install the package.
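A small sketch of how pyspellchecker can be used for these two tasks is shown below. The function name find_misspellings and the optional checker parameter are hypothetical helpers introduced for illustration; only SpellChecker, unknown() and candidates() come from the package itself.

```python
def find_misspellings(words, checker=None):
    """Map each word the checker does not recognize to its candidate corrections."""
    if checker is None:
        # Imported lazily so the function can also be tried with a stand-in checker.
        from spellchecker import SpellChecker  # provided by the pyspellchecker package
        checker = SpellChecker()
    result = {}
    for word in checker.unknown(words):
        # Assumption: candidates() may return None when no suggestion is found,
        # so fall back to an empty set in that case.
        result[word] = set(checker.candidates(word) or ())
    return result
```

Calling find_misspellings(["the", "dgo", "barks"]) with the package installed should return a dictionary keyed by the unrecognized words; the exact candidate sets depend on the package's bundled word-frequency list.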

Corpora used: We used the Brown corpus from NLTK as our training set for probability calculation. This corpus contains lists of lists of words with tags. We also used the holbrook-tagged.dat corpus as a training set. This corpus contains sentences with incorrect words and their corrections (https://www.dcs.bbk.ac.uk/~roger/corpora.html).
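As a rough illustration of the probability calculation, the sketch below counts word/tag pairs and tag bigrams from sentences shaped like the Brown corpus data (each sentence a list of (word, tag) pairs). The function names and the two sample sentences are hypothetical, and the relative-frequency estimate is an assumption rather than the project's documented formula.

```python
from collections import Counter


def count_tags(tagged_sents):
    """Count (word, tag) pairs and (previous tag, tag) bigrams."""
    word_tag = Counter()
    tag_bigram = Counter()
    for sent in tagged_sents:
        prev = "<s>"  # sentence-start marker
        for word, tag in sent:
            word_tag[(word.lower(), tag)] += 1
            tag_bigram[(prev, tag)] += 1
            prev = tag
    return word_tag, tag_bigram


def transition_prob(tag_bigram, prev, tag):
    """Relative-frequency estimate of P(tag | prev); 0.0 if prev was never seen."""
    total = sum(c for (p, _), c in tag_bigram.items() if p == prev)
    return tag_bigram[(prev, tag)] / total if total else 0.0


# Two made-up sentences in the same shape as the Brown corpus data.
sample = [
    [("The", "AT"), ("dog", "NN"), ("barks", "VBZ")],
    [("The", "AT"), ("old", "JJ"), ("dog", "NN")],
]
word_tag, tag_bigram = count_tags(sample)
print(transition_prob(tag_bigram, "AT", "NN"))  # 0.5
```

With NLTK installed and the corpus downloaded via nltk.download("brown"), the same function could be fed brown.tagged_sents() from nltk.corpus in place of the sample list.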

