This is the final project for Whitman CS 357 Natural Language Processing. Created by Tracy Cui and Tina Tao

TracyCuiYating/NLP-Final-Project

NLP-Final-Project

This is the final project for Whitman CS 357 Natural Language Processing (2019 Fall). Created by Tracy Cui and Tina Tao

Summary: This project adds a POS-tagging language model to a spell checker. The program first detects the incorrect words in the text, then replaces each incorrect word with the candidate word that has the highest probability according to the POS tagging. The input can be either a single sentence or a list of sentences.
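The selection step described above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the probability tables, the tag names, and the helper names `score` and `best_correction` are all hypothetical, and the scoring rule (tag transition probability times word emission probability) is one plausible reading of "highest probability according to the POS tagging".

```python
# Hypothetical toy probabilities; a real system would estimate these
# from a tagged corpus rather than hard-coding them.
TRANSITION = {  # P(tag | previous tag), keyed by (previous tag, tag)
    ("AT", "NN"): 0.6,
    ("AT", "JJ"): 0.35,
    ("AT", "VB"): 0.05,
}
EMISSION = {  # P(word | tag), keyed by (word, tag)
    ("house", "NN"): 0.02,
    ("house", "VB"): 0.001,
    ("horse", "NN"): 0.01,
    ("hose", "NN"): 0.001,
}


def score(candidate, prev_tag):
    """Best P(tag | prev_tag) * P(candidate | tag) over the candidate's known tags."""
    best = 0.0
    for (word, tag), p_emit in EMISSION.items():
        if word == candidate:
            p_trans = TRANSITION.get((prev_tag, tag), 0.0)
            best = max(best, p_trans * p_emit)
    return best


def best_correction(candidates, prev_tag):
    """Pick the highest-scoring candidate; sorting first makes ties deterministic."""
    return max(sorted(candidates), key=lambda c: score(c, prev_tag))


# The misspelling "hause" follows an article (tag "AT" in this toy example).
print(best_correction({"house", "horse", "hose"}, prev_tag="AT"))
```

With these made-up numbers, a noun that is common after an article wins over rarer alternatives; different tables would of course give different rankings.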

Reference/Sources: We used pyspellchecker, a third-party Python spell-checking package, to recognize the incorrect words and to generate a list of possible words for each incorrect word. Use "pip install pyspellchecker" to install the package.
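A minimal sketch of how pyspellchecker might be used for these two steps. The helper name `misspelled_with_candidates` and the sample words are made up for illustration; `unknown` and `candidates` are methods of pyspellchecker's `SpellChecker` class. The helper takes the checker as an argument so it does not depend on how the project itself wires things together.

```python
def misspelled_with_candidates(spell, words):
    """Map each word the checker does not recognize to its candidate corrections."""
    result = {}
    for word in spell.unknown(words):
        # candidates() may return None in newer releases when nothing is found,
        # so fall back to an empty set.
        result[word] = spell.candidates(word) or set()
    return result


def demo():
    # Requires the third-party package: pip install pyspellchecker
    from spellchecker import SpellChecker

    spell = SpellChecker()
    print(misspelled_with_candidates(spell, ["the", "quik", "brown", "fox"]))
```

Calling `demo()` with the package installed prints the unrecognized words together with their candidate sets; the exact candidates depend on the package's bundled dictionary.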

Corpora used: We used the Brown corpus from NLTK as our training set for probability calculation. This corpus contains lists of lists of words with tags. We used the Holbrook-tagged.dat corpus as our training set. This corpus contains sentences with incorrect words and their corrections (https://www.dcs.bbk.ac.uk/~roger/corpora.html).
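As a rough illustration of the probability calculation, the sketch below estimates emission and tag-transition probabilities from tagged sentences by simple relative frequency. It is an assumption about the approach, not the project's code: the function name `estimate`, the `<s>` start marker, the lowercasing, and the lack of smoothing are all choices made for this example. The toy data mimics the list-of-lists-of-(word, tag) shape described above; with NLTK installed and the corpus downloaded, `nltk.corpus.brown.tagged_sents()` could be passed in instead.

```python
from collections import Counter


def estimate(tagged_sents):
    """Return (emission, transition) relative-frequency estimates.

    emission[(word, tag)]  approximates P(word | tag)
    transition[(prev, tag)] approximates P(tag | prev)
    """
    tag_counts = Counter()
    emit_counts = Counter()
    bigram_counts = Counter()
    prev_counts = Counter()
    for sent in tagged_sents:
        prev = "<s>"  # hypothetical sentence-start marker
        for word, tag in sent:
            tag_counts[tag] += 1
            emit_counts[(word.lower(), tag)] += 1
            bigram_counts[(prev, tag)] += 1
            prev_counts[prev] += 1
            prev = tag
    emission = {k: c / tag_counts[k[1]] for k, c in emit_counts.items()}
    transition = {k: c / prev_counts[k[0]] for k, c in bigram_counts.items()}
    return emission, transition


# Toy stand-in for a tagged corpus (not real Brown data).
toy_sents = [
    [("The", "AT"), ("dog", "NN"), ("runs", "VBZ")],
    [("The", "AT"), ("old", "JJ"), ("dog", "NN")],
]
emission, transition = estimate(toy_sents)
print(transition[("AT", "NN")], emission[("dog", "NN")])
```

Unsmoothed estimates like these assign zero probability to unseen word/tag pairs, so a practical system would likely add some form of smoothing.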
