Make Python API the source of truth for upstream coverage checks
Functions exposed in Python (e.g., as aliases of other Rust bindings)
were being falsely reported as missing because they lacked a dedicated
#[pyfunction] in Rust. The user-facing API is the Python layer, so
coverage should be measured there.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
You are auditing the datafusion-python project to find features from the upstream Apache DataFusion Rust library that are **not yet exposed** in this Python binding project. Your goal is to identify gaps and, if asked, implement the missing bindings.

+**IMPORTANT: The Python API is the source of truth for coverage.** A function or method is considered "exposed" if it exists in the Python API (e.g., `python/datafusion/functions.py`), even if there is no corresponding entry in the Rust bindings. Many upstream functions are aliases of other functions — the Python layer can expose these aliases by calling a different underlying Rust binding. Do NOT report a function as missing if it appears in the Python `__all__` list and has a working implementation, regardless of whether a matching `#[pyfunction]` exists in Rust.
+

## Areas to Check

The user may specify an area via `$ARGUMENTS`. If no area is specified or "all" is given, check all areas.
@@ -24,9 +26,9 @@ The user may specify an area via `$ARGUMENTS`. If no area is specified or "all"

**How to check:**
1. Fetch the upstream scalar function documentation page
-2. Compare against functions listed in `python/datafusion/functions.py` (check the `__all__` list)
-3. Also check `crates/core/src/functions.rs` for what's registered in `init_module()`
-4. Report functions that exist upstream but are missing from this project
+2. Compare against functions listed in `python/datafusion/functions.py` (check the `__all__` list and function definitions)
+3. A function is covered if it exists in the Python API — it does NOT need a dedicated Rust `#[pyfunction]`. Many functions are aliases that reuse another function's Rust binding.
+4. Only report functions that are missing from the Python `__all__` list / function definitions

### 2. Aggregate Functions
@@ -40,8 +42,9 @@ The user may specify an area via `$ARGUMENTS`. If no area is specified or "all"

**How to check:**
1. Fetch the upstream aggregate function documentation page
-2. Compare against aggregate functions in `python/datafusion/functions.py`
-3. Report missing aggregate functions
+2. Compare against aggregate functions in `python/datafusion/functions.py` (check `__all__` list and function definitions)
+3. A function is covered if it exists in the Python API, even if it aliases another function's Rust binding
+4. Report only functions missing from the Python API

### 3. Window Functions
@@ -55,8 +58,9 @@ The user may specify an area via `$ARGUMENTS`. If no area is specified or "all"

**How to check:**
1. Fetch the upstream window function documentation page
-2. Compare against window functions in `python/datafusion/functions.py`
-3. Report missing window functions
+2. Compare against window functions in `python/datafusion/functions.py` (check `__all__` list and function definitions)
+3. A function is covered if it exists in the Python API, even if it aliases another function's Rust binding
+4. Report only functions missing from the Python API

### 4. Table Functions
@@ -70,8 +74,9 @@ The user may specify an area via `$ARGUMENTS`. If no area is specified or "all"

**How to check:**
1. Fetch the upstream table function documentation
-2. Compare against what's available in this project
-3. Report missing table functions
+2. Compare against what's available in the Python API
+3. A function is covered if it exists in the Python API, even if it aliases another function's Rust binding
+4. Report only functions missing from the Python API

### 5. DataFrame Operations
@@ -84,9 +89,9 @@ The user may specify an area via `$ARGUMENTS`. If no area is specified or "all"

**How to check:**
1. Fetch the upstream DataFrame documentation page listing all methods
-2. Compare against methods in `python/datafusion/dataframe.py`
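
The coverage rule this commit introduces (a function counts as exposed when it is listed in the Python `__all__` and has a Python-level definition, whether or not a dedicated `#[pyfunction]` exists) could be checked mechanically along the following lines. This is only a sketch under stated assumptions: the helper names `covered_names` and `missing_from_python`, the inline `SAMPLE_SOURCE`, and the `UPSTREAM` set (including the placeholder `hypothetical_fn`) are illustrative and do not come from the project. A real audit would read `python/datafusion/functions.py` and the upstream documentation instead.

```python
import ast


def covered_names(source: str) -> set[str]:
    """Names that are listed in a module-level __all__ AND defined at top level.

    "Defined" here means a top-level `def` or a simple `name = ...` assignment,
    which is an assumption about how aliases might be written.
    """
    tree = ast.parse(source)
    exported: set[str] = set()
    defined: set[str] = set()
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            defined.add(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if not isinstance(target, ast.Name):
                    continue
                if target.id == "__all__":
                    # Collect string literals from a list or tuple __all__.
                    if isinstance(node.value, (ast.List, ast.Tuple)):
                        exported.update(
                            elt.value
                            for elt in node.value.elts
                            if isinstance(elt, ast.Constant)
                            and isinstance(elt.value, str)
                        )
                else:
                    defined.add(target.id)
    return exported & defined


def missing_from_python(upstream: set[str], source: str) -> list[str]:
    """Upstream names that the Python layer does not cover, sorted."""
    return sorted(upstream - covered_names(source))


# Illustrative stand-in for a Python API module; not the project's real file.
SAMPLE_SOURCE = '''
__all__ = ["abs", "array_length", "list_length"]

def abs(arg): ...

def array_length(arg): ...

def list_length(arg):
    # Alias: reuses another function instead of a dedicated Rust binding.
    return array_length(arg)
'''

# Illustrative upstream list; "hypothetical_fn" is a placeholder name.
UPSTREAM = {"abs", "array_length", "list_length", "hypothetical_fn"}

print(missing_from_python(UPSTREAM, SAMPLE_SOURCE))  # ['hypothetical_fn']
```

In this sketch the alias `list_length` is treated as covered because the check looks only at the Python layer, which is the behavior the commit message asks for. Parsing with `ast` rather than importing the module is a design choice that avoids needing the compiled Rust extension to be built.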