Training Data & Language Resources

fastText Language Identification

Download the pretrained fastText lid.176.bin model, which identifies 176 languages, into the training_data directory for automatic language detection:

wget -P training_data https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.bin
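
Once downloaded, the model can be queried from Python. The following is a minimal sketch, not this project's actual loader: it assumes the `fasttext` Python package is installed and that the file sits at `training_data/lid.176.bin`. The helper names (`load_model`, `label_to_code`, `detect_language`) are hypothetical and chosen only for illustration.

```python
from pathlib import Path

# Assumed location: the directory passed to `wget -P` above.
MODEL_PATH = Path("training_data") / "lid.176.bin"
LABEL_PREFIX = "__label__"


def load_model(path=MODEL_PATH):
    """Load the language-identification model (requires the fasttext package)."""
    import fasttext  # imported lazily so the helpers below work without it

    return fasttext.load_model(str(path))


def label_to_code(label):
    """Turn a fastText label such as '__label__de' into 'de'."""
    if label.startswith(LABEL_PREFIX):
        return label[len(LABEL_PREFIX):]
    return label


def detect_language(model, text):
    """Return (language_code, confidence) for the top prediction."""
    # fastText's predict() rejects newlines, so flatten the text first.
    labels, probs = model.predict(text.replace("\n", " "), k=1)
    return label_to_code(labels[0]), float(probs[0])
```

Calling `detect_language(load_model(), "Das ist ein Test.")` returns a language code together with the model's confidence. Passing the model in as an argument keeps the helper easy to exercise with a stub when the 126 MB-class binary is not available locally; the exact file size is not stated in this document and should be treated as approximate.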

Categories

Refer to public documentation for human-readable explanations:

Lookup Files

Generated by bin/sync_sqlite_data.py when refreshing rules:

Incorrect/Missing German Articles

Update ./training_data/de-DE/articles.csv based on:

https://www.verbformen.de/deklination/pronomen

Model Download Utilities

Context checker model helpers:

Regenerating Rule Data

See database-seed.md for details on regenerating database/dump.sql and rebuilding the lookup JSON files.


See Also