Thank you for your interest in contributing to Privlog.
This guide explains the project's architecture, development workflow, and where key logic lives.
Privlog is a privacy-aware linter for Python. The command-line interface is implemented using Typer, while the analysis engine combines Semgrep rules with a Python AST-based scanner.
To work on Privlog locally:
git clone https://github.com/privlog-dev/privlog.git
cd privlog
python -m venv .venv
source .venv/bin/activate
pip install -e .
This installs Privlog in editable mode so code changes immediately affect the CLI.
You can verify the installation by running a scan on the project itself:
privlog .
The key files and directories are:
- `pyproject.toml`
  - Purpose: Defines project metadata, dependencies, and the `privlog` entry point. It is also the location for user-defined configuration under the `[tool.privlog]` section.
- `README.md`
  - Purpose: Provides a high-level overview and usage instructions for users.
- `privlog/`
  - The main Python package directory.
- `privlog/cli.py`
  - Purpose: The main Typer entry point for the CLI application.
  - Responsibilities: Defines commands and arguments using Typer. Implements the `--warnings`/`-w` flag and filters findings based on severity.
- `privlog/runner.py`
  - Purpose: The main analysis engine.
  - Responsibilities:
    - Loads user configuration from `pyproject.toml` via the `_load_config` function.
    - Runs the Semgrep scanner.
    - Runs the AST checker, passing the loaded configuration to it.
    - Merges findings from both sources.
    - Determines the final exit code based only on the presence of `ERROR`-level findings.
- `privlog/formatter.py`
  - Purpose: Handles the presentation of results.
  - Responsibilities: Prints findings in a Flake8-like format, with color-coding for severities.
- `privlog/ast_checks.py`
  - Purpose: A high-precision Python linter using the `ast` module. It is the core of the tool's intelligence.
  - Responsibilities:
    - Severity System: Divides sensitive variable names into `HIGH_CONFIDENCE_SENSITIVE_NAMES` (ERROR) and `WARNING_SENSITIVE_NAMES` (WARNING).
    - Multi-Format Detection: Understands and inspects arguments within f-strings, `.format()` calls, and `%`-style formatting.
    - `print()` Check: Scans `print()` calls for sensitive variables.
    - Heuristic Analysis: Flags risky patterns like logging with `extra=...` or `json.dumps()`.
    - Custom Wrapper Analysis: Receives the `PrivlogConfig` object and inspects function calls to see if they match a name in the `custom_wrappers` configuration, checking their keyword arguments accordingly.
  - Finding Codes:
    - `PL2101`: A direct sensitive identifier was found in a logging call.
    - `PL2201-2203`: A heuristic pattern (like `extra=...` or `json.dumps`) was found in a logging call.
    - `PL2301-2303`: A sensitive identifier or heuristic pattern was found in a `print()` call.
    - `PL2401`: A sensitive argument was passed to a custom logging wrapper defined in the user's configuration.
- `privlog/rules/privlog.yml`
  - Purpose: The core Semgrep ruleset, which complements the AST checker.
  - Responsibilities: Defines rules for detecting sensitive identifiers, secrets, and unsafe logging patterns.
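The flow described above for `privlog/runner.py` (merge findings, then derive the exit code from `ERROR`-level findings only) and the severity filter in `privlog/cli.py` can be sketched roughly as follows. This is a minimal illustration, not Privlog's actual code: the `Finding` and `Severity` types, the function names, the de-duplication step, the 0/1 exit codes, and the assumption that `--warnings` reveals `WARNING` findings are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    WARNING = "WARNING"
    ERROR = "ERROR"


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record; Privlog's real type may differ."""
    path: str
    line: int
    code: str
    severity: Severity
    message: str


def merge_findings(semgrep_findings, ast_findings):
    """Combine both sources, drop duplicates (assumed), and sort by location."""
    seen = set()
    merged = []
    for finding in [*semgrep_findings, *ast_findings]:
        key = (finding.path, finding.line, finding.code)
        if key not in seen:
            seen.add(key)
            merged.append(finding)
    return sorted(merged, key=lambda f: (f.path, f.line, f.code))


def visible_findings(findings, show_warnings):
    """Assumed CLI behaviour: WARNING findings are shown only when requested."""
    if show_warnings:
        return list(findings)
    return [f for f in findings if f.severity is Severity.ERROR]


def exit_code(findings):
    """Only ERROR-level findings fail the run; warnings alone do not."""
    return 1 if any(f.severity is Severity.ERROR for f in findings) else 0
```

Keeping the exit-code decision separate from the display filter means that hiding warnings in the output can never change whether a run passes or fails.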
When contributing code:
- Keep the CLI interface stable
- Maintain clear error messages and finding codes
- Prefer AST-based detection when accuracy matters
- Keep rules deterministic and easy to understand
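To illustrate why AST-based detection is preferred when accuracy matters, here is a small sketch that inspects the arguments of logging calls with Python's standard `ast` module. It is not the code in `privlog/ast_checks.py`: the contents of the two name sets, the `LOG_METHODS` set, and the `scan` function are assumptions made for the example.

```python
import ast

# Hypothetical name sets; the real lists live in privlog/ast_checks.py.
HIGH_CONFIDENCE_SENSITIVE_NAMES = {"password", "ssn", "api_key"}
WARNING_SENSITIVE_NAMES = {"email", "phone"}
LOG_METHODS = {"debug", "info", "warning", "error", "critical", "exception"}


def _names_in(node):
    """Yield every identifier (bare name or attribute name) used under node."""
    for child in ast.walk(node):
        if isinstance(child, ast.Name):
            yield child.id
        elif isinstance(child, ast.Attribute):
            yield child.attr


def _is_logging_call(call):
    """Treat any method call named like a logging level as a logging call."""
    return isinstance(call.func, ast.Attribute) and call.func.attr in LOG_METHODS


def scan(source):
    """Return sorted (line, severity, name) tuples for sensitive names in logging calls."""
    findings = set()
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and _is_logging_call(node)):
            continue
        arguments = [*node.args, *(kw.value for kw in node.keywords)]
        for argument in arguments:
            for name in _names_in(argument):
                lowered = name.lower()
                if lowered in HIGH_CONFIDENCE_SENSITIVE_NAMES:
                    findings.add((node.lineno, "ERROR", name))
                elif lowered in WARNING_SENSITIVE_NAMES:
                    findings.add((node.lineno, "WARNING", name))
    return sorted(findings)


SAMPLE = "\n".join([
    'logger.info(f"login for {password}")',       # f-string
    'logger.info("password reset requested")',    # plain literal, no variable
    'logger.debug("contact: %s" % user.email)',   # %-style formatting
    'logger.warning("key={}".format(api_key))',   # .format() call
])

for finding in scan(SAMPLE):
    print(finding)
```

Because the scanner looks at identifiers rather than text, the second line of `SAMPLE` (a string literal that merely contains the word "password") produces no finding, whereas a plain text search would flag it.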
Before submitting a pull request:
- Ensure the CLI runs correctly
- Verify that findings behave as expected
- Update documentation if behavior changes
- Fork the repository.
- Create a feature branch (`git checkout -b feature/AmazingFeature`).
- Make your changes.
- Commit your changes (`git commit -m 'Add some AmazingFeature'`).
- Push to the branch (`git push origin feature/AmazingFeature`).
- Open a new Pull Request.
Clear explanations and examples are always appreciated.