Latin NLP Tools Comparison

This project evaluates the speed, accuracy, and usability of four prominent Natural Language Processing (NLP) tools for Latin texts:

Two samples are used for testing:

Project Goal

To provide a reproducible, comparative analysis of Latin NLP tools, covering tokenisation, lemmatisation, and POS-tagging accuracy as well as processing speed.

Project Structure

data/: Sample Latin texts (raw and preprocessed)
notebooks/: Jupyter notebooks for experiments
scripts/: Python scripts for preprocessing and tool execution
results/: Accuracy/speed metrics and visualizations

Installation

Clone the repo:

```
git clone https://github.com/YOUR_USERNAME/latin-nlp-comparison.git
cd latin-nlp-comparison
```

Create and activate a virtual environment:

```
python -m venv env
source env/bin/activate
```

Install dependencies:

```
pip install -r requirements.txt
```

Metrics Evaluated

Accuracy:
- Tokenisation
- Lemmatisation
- POS-tagging
Speed: time taken to process the data
Usability: observational assessment of set-up complexity, required packages, interface, and export options
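
As a rough illustration of how the accuracy and speed metrics could be computed, here is a minimal sketch. It is not the project's actual evaluation code: the function names `accuracy` and `timed` are hypothetical, the sample lemmas are made up, and it assumes each tool's output has already been aligned token-for-token with a gold-standard annotation.

```python
import time
from typing import Callable, Sequence, Tuple


def accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of positions where the predicted label equals the gold label."""
    # Assumption: both sequences are aligned token-for-token.
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    matches = sum(p == g for p, g in zip(predicted, gold))
    return matches / len(gold)


def timed(func: Callable[[str], object], text: str) -> Tuple[object, float]:
    """Run func(text) once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(text)
    return result, time.perf_counter() - start


# Toy data for illustration only; not taken from the project's samples.
gold_lemmas = ["arma", "vir", "cano"]
predicted_lemmas = ["arma", "vir", "canus"]
print(f"lemma accuracy: {accuracy(predicted_lemmas, gold_lemmas):.2f}")

# str.split stands in for a real Latin tokeniser here.
tokens, seconds = timed(str.split, "arma virumque cano")
print(tokens, f"{seconds:.6f}s")
```

The same `accuracy` function applies to tokenisation, lemmatisation, and POS tags once the sequences are aligned. A single timed run is noisy, so repeating the call and averaging would likely give a fairer speed comparison.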

Wiki

See the GitHub Wiki for documentation, tool setup guides, and detailed findings.

Acknowledgments

Supervised by Bernhard Bauer at the University of Graz.

Repository: TAJSchaaf/LatinNLPTools
