fix(setup): resolve path bugs and missing config extras#164
Conversation
Greptile OverviewGreptile SummaryThis PR improves the setup workflow by removing automatic dependency installation from Key improvements:
Issues found:
Confidence Score: 4/5
|
| Filename | Overview |
|---|---|
| setup.py | Improved security with shlex.split and removed auto poetry install; error handling incomplete for FileNotFoundError |
| CONTRIBUTING.md | Corrected installation workflow order (deps before hooks), improved clarity with better formatting and comments |
Additional Comments (2)
Prompt To Fix With AIThis is a comment left during a code review.
Path: pyproject.toml
Line: 73:79
Comment:
**Broken `all` extra mapping**
`[tool.poetry.extras].all` is defined as `["pyspark", "databricks-connect", "boto3"]`, but extras are supposed to list *optional dependency keys* (e.g. `databricks-connect`, `boto3`) not other extras (and `pyspark` here is an extra name, while the optional dependency key is `pyspark`). This will make `poetry install -E all` fail to include PySpark (and can error depending on Poetry version). `all` should reference the optional dependency keys (e.g. include `"pyspark"` via the dependency key, and consider using the existing `databricks`/`aws` extras rather than raw dependency names).
How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix With AIThis is a comment left during a code review.
Path: setup.py
Line: 11:13
Comment:
**Shell invocation in setup**
`run_command()` uses `subprocess.run(command, ..., shell=True)` (setup.py:11-13) with a string. If any part of `command` ever becomes user-controlled (CLI args/env), this becomes a command-injection vector. Given this script’s purpose, it’s safer to avoid `shell=True` and pass args as a list (e.g. `['poetry','run','pre-commit','install']`).
How can I resolve this? If you propose a fix, please make it concise. |
There was a problem hiding this comment.
Pull request overview
This PR focuses on making local development setup more reliable by routing tooling through Poetry, aligning Poetry extras with documented install options, and cleaning up contributor documentation.
Changes:
- Update
setup.pyto install pre-commit hooks viapoetry run pre-commit installand add a module docstring. - Add
all,lightweight, anddeep-learningextras topyproject.tomland syncpoetry.lockaccordingly. - Improve
CONTRIBUTING.mdsetup/test instructions with clearer Poetry activation andpoetry run pytestguidance.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| setup.py | Adds module docstring and uses Poetry to run pre-commit installation. |
| pyproject.toml | Introduces new extras (all, lightweight, deep-learning) to match setup/documentation. |
| poetry.lock | Regenerates lockfile to reflect new extras/markers and Poetry version changes. |
| CONTRIBUTING.md | Refines onboarding commands for Poetry env activation and test execution. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
01a18e1 to
13db61c
Compare
marcellodebernardi
left a comment
There was a problem hiding this comment.
Thanks for the PR! Looks good, except that installing pyspark and databricks-connect at the same time leads to unexpected behaviour, so I think we should avoid that.
I think let's remove the all extras, other than that LGTM 👍
13db61c to
65ce20d
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 3 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@greptileai review again with latest changes, all issues have been addressed |
Additional Comments (1)
Prompt To Fix With AIThis is a comment left during a code review.
Path: setup.py
Line: 24:35
Comment:
**Setup ignores requested extras**
`main()` prints guidance about installing extras (e.g. `-E aws -E pyspark`), but the script always runs `poetry install` with no extras (line 31) and then installs pre-commit hooks. If a contributor follows the docs/print output expecting extras to be installed, they’ll still end up with the default lightweight environment, which is particularly confusing given the docs now recommend extras for common workflows. Consider either making setup.py accept/pass through extras (e.g. via argv) or removing the extras messaging so the script behavior matches what it says.
How can I resolve this? If you propose a fix, please make it concise. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 3 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Remove 'all' extra to prevent pyspark/databricks conflict - Use shlex.split() for safer command parsing - Fix unreachable error handling logic - Update setup.py to only configure hooks, not install deps - Correct CONTRIBUTING.md installation order (deps before hooks) - Remove outdated/misleading comments and print statements
b738c44 to
159e667
Compare
|
Hi @marcellodebernardi , Thanks for the feedback again! I've addressed all your points: ✅ Removed all extra to prevent pyspark/databricks conflict I have Tested end-to-end in fresh environment - all working as documented. Ready for review! |
marcellodebernardi
left a comment
There was a problem hiding this comment.
LGTM, thanks for the contribution @Virtuoso633!
|
@greptileai review again with latest changes, but this PR has already been approved and everything looks good |
Description
This PR resolves dependency conflicts and improves the developer setup workflow by addressing all code review feedback.
Changes Made
1. Removed all extra to prevent dependency conflicts
Removed the all extra from
pyproject.tomlthat was causing conflicts between pyspark and databricks-connect.Updated documentation to guide users to install specific extras instead (e.g.,
-E aws -E pyspark).2. Hardened setup.py for security and robustness
✅ Added
shlex.split()for safer command parsing (handles quoted arguments correctly)✅ Fixed unreachable error handling logic
✅ Removed automatic poetry install call to prevent overwriting user-specified extras
✅ Updated messaging to accurately reflect that
setup.pyonly configures pre-commit hooks3. Corrected CONTRIBUTING.md installation workflow
✅ Fixed installation order: dependencies installed before hooks configured
✅ Removed outdated comments about
setup.pyrunning poetry install✅ Clear, logical workflow:
poetry install -E <extras>→python setup.py→pytestTesting
Verified end-to-end in a fresh environment:
Addresses Review Feedback
Copilot: Dependency conflict between pyspark/databricksCopilot: setup.py string command handling safetyCopilot: Unreachable error handling codeCopilot: Misleading "Installing dependencies..." messagesCopilot: Backwards installation workflow in CONTRIBUTING.mdMaintainer: Remove all extraMaintainer: Document specific extras usage