feat: Add bigframes.bigquery.st_regionstats method #2200
This commit adds the `bigframes.bigquery.st_regionstats` method, which allows users to compute statistics for a raster band within a given geography. The implementation includes:
- A new `StRegionStatsOp` in `bigframes/operations/geo_ops.py`.
- Compiler implementations for both the SQLGlot and Ibis backends.
- A unit test with a SQL snapshot.
- A code sample in `samples/snippets/wildfire_risk.py` that demonstrates the use of the new function.
raster: bigframes.series.Series,
band: str,
*,
options: Mapping[str, Union[str, int, float]] = {},
Missing "include".
Also, "options" might not work if it is keyword-only.
Also, don't use mutable values as the default.
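The point about mutable defaults can be illustrated with a minimal sketch. Both functions below are hypothetical stand-ins, not the actual `st_regionstats` signature; they only show why a `{}` default leaks state between calls and how a `None` default avoids it.

```python
from typing import Dict, Mapping, Optional, Union

OptionValue = Union[str, int, float]


def with_mutable_default(
    band: str,
    options: Dict[str, OptionValue] = {},  # anti-pattern, kept for illustration
) -> Dict[str, OptionValue]:
    # The default dict is created once, at definition time, and is shared by
    # every call that omits `options`, so entries accumulate across calls.
    options[band] = 1
    return options


def with_none_default(
    band: str,
    options: Optional[Mapping[str, OptionValue]] = None,
) -> Dict[str, OptionValue]:
    # Copy into a fresh dict so neither the caller's mapping nor later calls
    # are affected by what happens here.
    resolved = dict(options) if options is not None else {}
    resolved[band] = 1
    return resolved
```

Calling `with_mutable_default("a")` and then `with_mutable_default("b")` returns a dict containing both keys, whereas `with_none_default` starts from an empty dict each time.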
if op.options:
    args.append(bigframes_vendored.ibis.literal(op.options, type="json"))
return bigframes_vendored.ibis.remote_function(
    "st_regionstats", args, output_type="struct<min: float, max: float, sum: float, count: int, mean: float>"  # type: ignore
Per https://cloud.google.com/bigquery/docs/reference/standard-sql/geography_functions#st_regionstats it should also include area.
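One way to keep the output type easy to audit against the reference is to build the struct string from a field list. The sketch below is only an illustration: the helper name and the constant are hypothetical, the first five fields are copied from the reviewed code, and the `float` type for `area` is an assumption that should be checked against the linked documentation.

```python
from typing import List, Tuple

# Fields from the reviewed code, plus `area` as requested in the review.
# The type of `area` is an assumption, not confirmed by the source.
REGION_STATS_FIELDS: List[Tuple[str, str]] = [
    ("min", "float"),
    ("max", "float"),
    ("sum", "float"),
    ("count", "int"),
    ("mean", "float"),
    ("area", "float"),
]


def struct_type_string(fields: List[Tuple[str, str]]) -> str:
    """Render (name, type) pairs in the `struct<name: type, ...>` form."""
    inner = ", ".join(f"{name}: {dtype}" for name, dtype in fields)
    return f"struct<{inner}>"
```

The resulting string could then be passed as `output_type` in place of the hand-written literal.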
samples/snippets/wildfire_risk.py
Outdated
# See the License for the specific language governing permissions and
# limitations under the License.

import bigframes.bigquery as bbq
Should put this inside a test file so we actually run it.
pytest.importorskip("pytest_snapshot")


class TestGeoCompiler:
Should use pytest style not unittest style.
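A rough sketch of the difference, assuming the comment refers to grouping tests in a class versus writing module-level test functions. `compile_st_regionstats` is a hypothetical stand-in for the compiler under test and is not part of the PR.

```python
def compile_st_regionstats(band: str) -> str:
    # Hypothetical stand-in that returns a SQL fragment for the given band.
    return f"ST_REGIONSTATS(geo, raster, '{band}')"


# Class-based grouping, as in the reviewed code.
class TestGeoCompiler:
    def test_st_regionstats(self):
        sql = compile_st_regionstats("band1")
        assert "ST_REGIONSTATS" in sql


# pytest style: a plain module-level function with bare assertions.
def test_st_regionstats():
    sql = compile_st_regionstats("band1")
    assert "ST_REGIONSTATS" in sql
    assert "'band1'" in sql
```

The function form needs no class instance or shared state, and pytest collects it by its `test_` prefix.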
This commit adds the `bigframes.bigquery.st_regionstats` method, which allows users to compute statistics for a raster band within a given geography. The implementation includes:
- A new `StRegionStatsOp` in `bigframes/operations/geo_ops.py`.
- Compiler implementations for both the SQLGlot and Ibis backends.
- A unit test with a SQL snapshot.
- A system test that demonstrates the use of the new function.

This commit also addresses feedback from the code review, including:
- Adding `area` to the output struct of `st_regionstats`.
- Making the `options` parameter a positional argument.
- Adding comments to explain the use of `pass_op`.
- Converting the unit test to a pytest-style function.
- Moving the sample code to a system test.
…ython-bigquery-dataframes into feat-st-regionstats
…ython-bigquery-dataframes into feat-st-regionstats
This commit introduces the `st_regionstats` method in `bigframes.bigquery`, allowing users to compute statistics for a raster band within a given geography. Key changes:
- Added `StRegionStatsOp` in `bigframes/operations/geo_ops.py`.
- Implemented compilation logic for the operation in both Ibis and SQLGlot compilers.
- Exposed the `st_regionstats` function in `bigframes/bigquery/_operations/geo.py` and the public API.
- Added a new `_apply_ternary_op` method to `bigframes.series.Series`.
- Included a unit test with a snapshot to verify the generated SQL.
- Added a system test that demonstrates the functionality by converting a complex wildfire risk analysis query from SQL to BigFrames.
- Refactored compiler registries to support `pass_op=True` for ternary operations, enabling access to operator parameters during compilation.
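The `pass_op=True` idea in the last item can be sketched as follows. This is a simplified illustration and not the BigFrames implementation: `TernaryOpRegistry`, its methods, the `options` field, and the string-based "compilation" are all hypothetical; only the operator name `StRegionStatsOp` comes from the commit message.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Type


@dataclass(frozen=True)
class StRegionStatsOp:
    # Hypothetical parameter carried on the operator rather than as an input.
    options: str = ""


class TernaryOpRegistry:
    """Maps operator types to three-input compile functions (illustrative)."""

    def __init__(self) -> None:
        self._impls: Dict[Type[Any], Callable[[str, str, str, Any], str]] = {}

    def register(self, op_type: Type[Any], pass_op: bool = False):
        def decorator(impl: Callable[..., str]) -> Callable[..., str]:
            if pass_op:
                # Forward the operator instance so the implementation can
                # read its parameters.
                self._impls[op_type] = lambda a, b, c, op: impl(a, b, c, op)
            else:
                # Implementations that only need the three inputs.
                self._impls[op_type] = lambda a, b, c, op: impl(a, b, c)
            return impl

        return decorator

    def compile(self, op: Any, a: str, b: str, c: str) -> str:
        return self._impls[type(op)](a, b, c, op)


registry = TernaryOpRegistry()


@registry.register(StRegionStatsOp, pass_op=True)
def _compile_regionstats(geo: str, raster: str, band: str, op: StRegionStatsOp) -> str:
    args = [geo, raster, band]
    if op.options:
        # Only append the optional argument when the operator carries one.
        args.append(op.options)
    return f"ST_REGIONSTATS({', '.join(args)})"
```

With this shape, `registry.compile(StRegionStatsOp(), "g", "r", "'b'")` produces a three-argument call, and an operator constructed with `options` adds a fourth argument.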
Bad merge. Revert these changes.
Needs to be reverted too.
Here is the summary of changes. You are about to add 1 region tag.
This comment is generated by snippet-bot.
# TODO: Add st_simplify when it is available in BigFrames.
# https://github.com/googleapis/python-bigquery-dataframes/issues/1497
# countries["simplified_geometry"] = bq.st_simplify(countries["geometry"], 10000)
I'd like to add st_simplify first so that this sample can work without any todos for us.
# [START bigquery_dataframes_st_regionstats]
from typing import cast

import bigframes.bigquery as bq
Let's use bbq instead of bq.
Closing in favor of #2228.