This repository contains materials for the Git and GitHub CI/CD workshop run by the NHS England Data Science and Applied AI team.
This workshop explores the automation features of Git and GitHub, focusing on CI/CD workflows for data science projects. We cover the mechanics and practical implementation of pre-commit hooks (including gitleaks, linting, formatting, and Jupyter notebook cleaning) to catch issues locally, alongside GitHub Actions for automated testing, formatting checks, and deployment. The demonstrations primarily use Python, though equivalent resources will be provided for R users. Before attending, please make sure you are comfortable with basic Git and GitHub operations and have a GitHub account.
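
To give a feel for what "Jupyter notebook cleaning" means in practice, here is a minimal sketch of the idea: remove outputs and execution counts from code cells so that only source is committed. The function name `strip_notebook_outputs` and the hand-written example notebook are illustrative assumptions; the workshop materials may use a dedicated tool for this, and this is not that tool's implementation.

```python
import copy
import json


def strip_notebook_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook with code-cell outputs and execution counts removed."""
    cleaned = copy.deepcopy(notebook)  # leave the caller's notebook untouched
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# A tiny hand-written notebook (nbformat 4 layout), used purely for illustration.
example = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["42\n"]}
            ],
            "source": ["print(42)"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

cleaned = strip_notebook_outputs(example)
print(json.dumps(cleaned["cells"][1], indent=2))
```

Cleaning notebooks this way keeps diffs small and avoids committing data that happened to be printed in a cell output.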
Head to Getting Started to fork this repository and set up your environment.
Workshop Website — start here.
There are two main activities. Complete one during the live session and work through the other in your own time.
| Activity | Description |
|---|---|
| Pre-Commit Hooks | Set up local automation to catch formatting issues, leaked secrets, and dirty notebooks before they are committed. |
| GitHub Actions | Build CI/CD workflows that run automated checks and tests on every push and pull request. |
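
As a rough illustration of how a local hook can catch a leaked secret, the sketch below scans text for one well-known credential shape (an AWS access key ID) and reports the offending line numbers. The names `SECRET_PATTERN`, `find_secrets`, and `main` are hypothetical, and the single regular expression is a simplification for teaching purposes, not the rule set that gitleaks actually applies.

```python
import re
from pathlib import Path

# Illustrative pattern only: "AKIA" followed by 16 upper-case letters or digits.
SECRET_PATTERN = re.compile(r"AKIA[0-9A-Z]{16}")


def find_secrets(text: str) -> list[int]:
    """Return the 1-based line numbers that match the illustrative pattern."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if SECRET_PATTERN.search(line)
    ]


def main(filenames: list[str]) -> int:
    """Scan each file; return 1 if anything suspicious is found, otherwise 0."""
    exit_code = 0
    for name in filenames:
        text = Path(name).read_text(encoding="utf-8", errors="ignore")
        for number in find_secrets(text):
            print(f"{name}:{number}: possible secret")
            exit_code = 1
    return exit_code


# Build a fake key at runtime so this example contains no key-shaped literal.
fake_key = "AKIA" + "A" * 16
sample = f"region = 'eu-west-2'\nkey = '{fake_key}'\n"
print(find_secrets(sample))
```

pre-commit passes the staged file names to a hook as command-line arguments and treats a non-zero exit status as a failure, so a script like this would typically finish with `sys.exit(main(sys.argv[1:]))`.
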

You should be comfortable with the following before attending:

| Pre-requisite | Description |
|---|---|
| Python | Knowledge of how to write and run Python code |
| Git | Basic command line usage and version control concepts |
| GitHub | Familiarity with repositories, Codespaces, and forking |
If you encounter any problems or have questions, please open an issue in this repository.
Unless stated otherwise, the codebase is released under the MIT Licence. This covers both the codebase and any sample code in the documentation.
HTML and Markdown documentation is Crown copyright and available under the terms of the Open Government Licence 3.0.