Observation
Installing graphistry 0.53.16 from PyPI downloads a 625 kB wheel (2.75 MB uncompressed).
What's NOT the problem
Tests: not in wheel. graphistry/tests/ has no top-level __init__.py so find_packages() skips it entirely. Confirmed by inspecting the actual built wheel (212 files, 0 test files).
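The wheel inspection above can be reproduced with the standard library alone. This is a minimal sketch, not the script actually used for the check: the helper name `count_wheel_files` and the `/tests/` path marker are assumptions made for illustration.

```python
import zipfile


def count_wheel_files(wheel, test_marker="/tests/"):
    """Return (total_files, test_files) for a wheel path or file-like object.

    A wheel is a zip archive, so zipfile can list its contents directly.
    `test_marker` is an assumed convention for spotting test directories.
    """
    with zipfile.ZipFile(wheel) as zf:
        # Directory entries end with "/"; skip them so only real files count.
        names = [n for n in zf.namelist() if not n.endswith("/")]
    # Prefix "/" so a top-level "tests/..." entry is matched as well.
    test_files = [n for n in names if test_marker in f"/{n}"]
    return len(names), len(test_files)
```

Pointing it at a locally downloaded wheel (for example one fetched with `pip download graphistry==0.53.16 --no-deps`) should give the file count and test-file count to compare against the figures quoted above.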
Deps are already clean. Confirmed in Docker (python:3.12-slim):
| Use case | Required deps |
|---|---|
| import graphistry + basic GFQL | pandas, numpy, requests, pyarrow, typing_extensions, packaging |
| Cypher string GFQL | + lark (already lazy; only imported on first parse) |
| squarify, palettable, scipy, sklearn, igraph, cugraph | not imported unless explicitly used |
squarify is being eliminated separately.
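One way to re-check the "not imported unless explicitly used" claim is to import the package in a fresh interpreter and see which optional modules ended up in `sys.modules`. The sketch below is an assumption about how such a check could look, not the procedure used in the Docker run; `modules_loaded_after_import` is a hypothetical helper name.

```python
import subprocess
import sys


def modules_loaded_after_import(package, candidates):
    """Import `package` in a clean subprocess and report which candidates loaded.

    Returns {module_name: bool}. A subprocess is used so that modules already
    imported by the calling process do not pollute the result.
    """
    candidates = list(candidates)
    code = (
        "import importlib, sys\n"
        f"importlib.import_module({package!r})\n"
        f"for name in {candidates!r}:\n"
        "    print('LOADED' if name in sys.modules else 'ABSENT', name)\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", code], check=True, capture_output=True, text=True
    )
    report = {}
    for line in result.stdout.splitlines():
        # Ignore anything the package itself printed during import.
        status, _, name = line.partition(" ")
        if status in ("LOADED", "ABSENT") and name in candidates:
            report[name] = status == "LOADED"
    return report
```

With graphistry installed, `modules_loaded_after_import("graphistry", ["lark", "squarify", "palettable", "scipy", "sklearn", "igraph", "cugraph"])` would be expected to report every entry as `False` if the table above holds.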
What IS the problem
GFQL/Cypher engine files are large and may be bloated — grown organically without modularization passes:
| File | Uncompressed | Lines |
|---|---|---|
| graphistry/compute/gfql/cypher/lowering.py | 294 KB | 8,212 |
| graphistry/compute/gfql/row/pipeline.py | 181 KB | 3,976 |
| graphistry/PlotterBase.py | 150 KB | 3,768 |
| graphistry/feature_utils.py | 112 KB | 3,097 |
| graphistry/pygraphistry.py | 101 KB | 2,653 |
| graphistry/compute/gfql/cypher/parser.py | 84 KB | 1,949 |
| graphistry/compute/gfql/temporal_text.py | 74 KB | 2,073 |
| graphistry/compute/ast.py | 66 KB | 1,701 |
| graphistry/compute/gfql_unified.py | 58 KB | 1,548 |
| graphistry/compute/chain.py | 50 KB | 1,227 |
GFQL-specific files alone (lowering, pipeline, parser, temporal_text, ast, gfql_unified, chain) = ~808 KB uncompressed, ~29% of total wheel.
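A table like the one above, and the share-of-wheel figure, can be regenerated from any built wheel. The following is a rough sketch under assumptions: the function names are hypothetical, line counts are taken as newline counts, and files are matched by substring rather than by exact path.

```python
import zipfile


def largest_python_files(wheel, top=10):
    """Return [(name, uncompressed_bytes, line_count)], largest first."""
    rows = []
    with zipfile.ZipFile(wheel) as zf:
        for info in zf.infolist():
            if info.filename.endswith(".py"):
                # Count newlines as an approximation of the line count.
                lines = zf.read(info).count(b"\n")
                rows.append((info.filename, info.file_size, lines))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:top]


def share_of_total(wheel, name_fragments):
    """Fraction of the wheel's uncompressed bytes in files matching any fragment."""
    with zipfile.ZipFile(wheel) as zf:
        infos = zf.infolist()
    total = sum(info.file_size for info in infos)
    matched = sum(
        info.file_size
        for info in infos
        if any(fragment in info.filename for fragment in name_fragments)
    )
    return matched / total if total else 0.0
```

Passing the seven GFQL file paths listed above as `name_fragments` should reproduce a value near the ~29% quoted, assuming the same wheel.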
Scope
- Audit lowering.py (8,212 lines) and pipeline.py (3,976 lines) for dead code, duplication, or extractable sub-modules
- Same for temporal_text.py and ast.py
- Modularization into focused sub-files improves maintainability regardless of wheel impact
- Separately: PlotterBase.py and feature_utils.py are non-GFQL large files worth a similar pass
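As a possible starting point for the audit, the largest top-level definitions in each file can be listed with Python's standard `ast` module, since long functions and classes are the most likely candidates for extraction. This is only a sketch of one approach; `largest_definitions` is a hypothetical helper and it does not detect dead code or duplication by itself.

```python
import ast


def largest_definitions(source, top=5):
    """Return [(name, kind, line_span)] for top-level defs, longest first."""
    tree = ast.parse(source)
    rows = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # end_lineno is populated by ast.parse on Python 3.8+.
            span = node.end_lineno - node.lineno + 1
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            rows.append((node.name, kind, span))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:top]
```

For example, reading `graphistry/compute/gfql/cypher/lowering.py` from a source checkout and passing its text to `largest_definitions` would show which functions or classes account for most of the 8,212 lines.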
Not urgent: 625 kB downloads in ~18 ms, which corresponds to a link of roughly 280 Mbit/s. Worth doing as engineering hygiene when touching these files for other reasons.