Python package for TrOCR training If you have a line based preprocessed dataset, you can train a handwritten text recognition TrOCR model with this package.
- 📦 UV - Ultra-fast Python package manager
- 🚀 Just - Modern command runner with powerful features
- 💅 Ruff - Lightning-fast linter and formatter
- 🔍 Mypy - Static type checker
- 🧪 Pytest - Testing framework with fixtures and plugins
- 🧾 Loguru - Python logging made simple
- 🛫 Pre-commit hooks
- 🐳 Docker support with multi-stage builds and distroless images
- 🔄 GitHub Actions CI/CD pipeline
The template is based on UV as package manager and Just as command runner. You need to have both installed in your system to use this template.
Once you have those, you can just run
just dev-syncto create a virtual environment and install all the dependencies, including the development ones. If instead you want to build a "production-like" environment, you can run
just prod-syncIn both cases, all extra dependencies will be installed (notice that the current pyproject.toml file has no extra dependencies).
You also need to install the pre-commit hooks with:
just install-hooksYou can configure Ruff by editing the .ruff.toml file. It is currently set to the default configuration.
Format your code:
just formatRun linters (ruff and mypy):
just lintRun tests:
just testDo all of the above:
just validateYou can run the code with:
just runThe template includes a multi stage Dockerfile, which produces an image with the code and the dependencies installed. You can build the image with:
just dockerizeThere are two Github Actions workflows from the template.
The first one runs tests and linters on every push on the main and dev branches. You can find the workflow file in
.github/workflows/main-list-test.yml.
The second one is triggered on every tag push and can also be triggered manually. It builds the distribution and uploads
it to PyPI. You can find the workflow file in .github/workflows/publish.yaml.