[WIP] Add release api logic#1350
Conversation
…onverts cli params with hyphens to db friendly params with underscores. Also clean up docstring and add check for valid release type inputs
…use datetime values
…e. Clean up TODOs for future work. Return search results instead of static success message
There was a problem hiding this comment.
Pull request overview
Implements the query side of the Release API lambda (issue #1307): replaces the stub lambda_handler with logic that authenticates the request, validates query parameters, filters ReleaseFiles records by date/release type, returns only the latest version, and serializes results. Adds a test module covering filtering, validation, and authentication paths. Per the PR description, the update-status portion of the feature is still TODO.
Changes:
- Build a parameterized SQLAlchemy query over
ReleaseFileswith auth, parameter, and value validation. - Reduce results to the latest version and serialize datetimes for response.
- Add
tests/lambda_endpoints/test_release_api.pycovering happy-path and error scenarios.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| sds_data_manager/lambda_code/SDSCode/api_lambdas/release_api.py | Replaces stub handler with auth check, query-param filtering, latest-version reduction, and response serialization. |
| tests/lambda_endpoints/test_release_api.py | New tests for query body, invalid parameters, release_type/date filters, version selection, and auth rejection. |
Comments suppressed due to low confidence (7)
sds_data_manager/lambda_code/SDSCode/api_lambdas/release_api.py:151
- The
instrumentandrelease_numberparameters are listed as valid query parameters but the for-loop only handlesstart_date,end_date, andrelease_type. There is noelsebranch that applies a filter for other valid parameters (compare to query_api.py line 119-120 which hasquery = query.where(getattr(model, param) == value)). As a result, passinginstrument=sweorrelease_number=001silently does nothing — the parameter is accepted but never filters the query. Additionally,release_numberdoes not even exist as a column onReleaseFiles, so it cannot be supported by a genericgetattrfilter without further work.
# go through each query parameter to set up sqlalchemy query conditions
for param, value in query_params.items():
# confirm that the query parameter is valid
if param not in valid_parameters:
response = {
"statusCode": 400,
"body": json.dumps(
f"{param} is not a valid query parameter. "
+ f"Valid query parameters are: {valid_parameters}"
),
}
logger.debug(
f"Received an invalid query parameter [{param}], valid options are: {valid_parameters}"
)
return response
try:
if param == "start_date":
query = query.where(
model.start_date >= datetime.datetime.strptime(value, "%Y%m%d")
)
elif param == "end_date":
# the date queries will only look at the file start_date.
query = query.where(
model.end_date <= datetime.datetime.strptime(value, "%Y%m%d")
)
elif param == "release_type":
valid_release_types = [
"early-release",
"unrelease",
"withhold-data", # TODO: make this one default if release-type isn't given?
]
if value not in valid_release_types:
response = {
"statusCode": 400,
"body": json.dumps(
f"{param} is not a valid release_type parameter. "
+ f"Valid release_type parameters are: {valid_release_types}"
),
}
logger.debug(
f"Received an invalid release_type parameter [{param}], valid options are: {valid_release_types}"
)
return response
# filter release-type in filename using a "contains" query on the file path
query = query.where(model.file_path.contains(value, autoescape=True))
except ValueError:
response = {
"statusCode": 400,
"body": json.dumps(f"Invalid value for {param}: {value}"),
}
logger.debug(f"Invalid value for {param}: {value}")
return response
sds_data_manager/lambda_code/SDSCode/api_lambdas/release_api.py:156
- The "latest version" subquery selects the single global maximum version across all rows returned by the filter, then keeps only rows whose version equals that scalar. This is incorrect when the filtered result spans multiple logically distinct files (e.g., different instruments, descriptors, or date ranges): all rows that happen not to be at the global max version will be dropped, even though each distinct file group has its own version history. The "latest version" needs to be computed per group (e.g., GROUP BY instrument, descriptor, start_date, end_date) rather than as a single global scalar.
# Keep only rows at the highest version from the filtered result set.
filtered_subq = query.subquery()
max_version_subq = select(func.max(filtered_subq.c.version)).scalar_subquery()
query = select(filtered_subq).where(filtered_subq.c.version == max_version_subq)
sds_data_manager/lambda_code/SDSCode/api_lambdas/release_api.py:101
- When API Gateway receives a request with no query string,
event["queryStringParameters"]isNone, not{}. The callquery_params.items()on line 101 will raiseAttributeErrorin that case. The handler should treat a missing/None value as an empty dict (e.g.,query_params = event.get("queryStringParameters") or {}).
query_params = event["queryStringParameters"]
# get desired table for query
query_table = "release"
logger.info(f"Querying table: {query_table}")
model = getattr(table_models, query_table)
# select the given table for the query
query = select(model.__table__)
# get a list of all valid search parameters
valid_parameters = [
"instrument",
"start_date",
"end_date",
"release_type",
"release_number",
]
# go through each query parameter to set up sqlalchemy query conditions
for param, value in query_params.items():
sds_data_manager/lambda_code/SDSCode/api_lambdas/release_api.py:124
- The end_date filter
model.end_date <= datetime.datetime.strptime(value, "%Y%m%d")excludes rows whereend_date IS NULL, but theend_datecolumn onAncillaryFileBaseis nullable (models.py:220). Rows with no end_date will be silently dropped from results whenever anend_datequery parameter is provided. Consider whether NULL end_date rows should be included (e.g.,(end_date <= value) | (end_date.is_(None))).
elif param == "end_date":
# the date queries will only look at the file start_date.
query = query.where(
model.end_date <= datetime.datetime.strptime(value, "%Y%m%d")
)
sds_data_manager/lambda_code/SDSCode/api_lambdas/release_api.py:142
- The invalid-release-type error message uses
{param}(which is the literal string"release_type") rather than{value}(the offending user input). The message reads "release_type is not a valid release_type parameter..." which is confusing. The test at test_release_api.py:116 only checks the static substring "is not a valid release_type parameter" so it doesn't catch this. The same applies to the debug log on line 140 which also uses{param}instead of{value}.
if value not in valid_release_types:
response = {
"statusCode": 400,
"body": json.dumps(
f"{param} is not a valid release_type parameter. "
+ f"Valid release_type parameters are: {valid_release_types}"
),
}
logger.debug(
f"Received an invalid release_type parameter [{param}], valid options are: {valid_release_types}"
)
return response
sds_data_manager/lambda_code/SDSCode/api_lambdas/release_api.py:124
- Comment says "the date queries will only look at the file start_date", but this branch filters on
model.end_date, notstart_date. The comment appears to have been copy-pasted from query_api.py (line 102-103) where the original logic did use start_date for the end_date param. Please update the comment to match the actual filter, or reconsider whether end_date is the intended column here.
elif param == "end_date":
# the date queries will only look at the file start_date.
query = query.where(
model.end_date <= datetime.datetime.strptime(value, "%Y%m%d")
)
sds_data_manager/lambda_code/SDSCode/api_lambdas/release_api.py:144
- The release_type filter uses a substring
containsmatch onfile_path. This is fragile: the substring "withhold-data" would also match files that happen to have "withhold-data" anywhere in their path (e.g., directory components, future descriptors), and the substring approach cannot distinguish between e.g. "release" and "early-release" without ordering of the checks. Matching againstmodel.descriptor(the structured field that already encodes the release type) would be safer and more precise.
# filter release-type in filename using a "contains" query on the file path
query = query.where(model.file_path.contains(value, autoescape=True))
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "statusCode": 400, | ||
| "body": json.dumps( | ||
| f"API authentication failed: {event['body']}."), | ||
| } | ||
| logger.debug( | ||
| f"API authentication failed: {event['body']}." | ||
| ) |
| logger.debug( | ||
| f"Received an invalid query parameter [{param}], valid options are: {valid_parameters}" | ||
| ) | ||
| return response | ||
| try: | ||
| if param == "start_date": | ||
| query = query.where( | ||
| model.start_date >= datetime.datetime.strptime(value, "%Y%m%d") | ||
| ) | ||
| elif param == "end_date": | ||
| # the date queries will only look at the file start_date. | ||
| query = query.where( | ||
| model.end_date <= datetime.datetime.strptime(value, "%Y%m%d") | ||
| ) | ||
| elif param == "release_type": | ||
| valid_release_types = [ | ||
| "early-release", | ||
| "unrelease", | ||
| "withhold-data", # TODO: make this one default if release-type isn't given? | ||
| ] | ||
| if value not in valid_release_types: | ||
| response = { | ||
| "statusCode": 400, | ||
| "body": json.dumps( | ||
| f"{param} is not a valid release_type parameter. " | ||
| + f"Valid release_type parameters are: {valid_release_types}" | ||
| ), | ||
| } | ||
| logger.debug( | ||
| f"Received an invalid release_type parameter [{param}], valid options are: {valid_release_types}" | ||
| ) | ||
| return response | ||
| # filter release-type in filename using a "contains" query on the file path | ||
| query = query.where(model.file_path.contains(value, autoescape=True)) | ||
| except ValueError: | ||
| response = { | ||
| "statusCode": 400, | ||
| "body": json.dumps(f"Invalid value for {param}: {value}"), | ||
| } | ||
| logger.debug(f"Invalid value for {param}: {value}") |
| event = _build_event( | ||
| { | ||
| "instrument": "swe", | ||
| "release_type": "withhold-data", | ||
| "start_date": "20250101", | ||
| "end_date": "20250331", | ||
| } | ||
| ) | ||
| returned_query = release_api.lambda_handler(event=event, context={}) | ||
|
|
||
| assert returned_query["statusCode"] == 200 | ||
| body = json.loads(returned_query["body"]) | ||
| assert len(body) == 1 | ||
| assert body[0]["version"] == "v002" |
There was a problem hiding this comment.
this is a good catch
| TableModels = namedtuple( | ||
| "TableModels", ["release", "science", "ancillary"] | ||
| ) | ||
|
|
||
| table_models = TableModels( | ||
| science=models.ScienceFiles, | ||
| ancillary=models.AncillaryFiles, | ||
| release=models.ReleaseFiles | ||
| ) | ||
|
|
||
| # add session, pick model | ||
| query_params = event["queryStringParameters"] | ||
| # get desired table for query | ||
| query_table = "release" | ||
| logger.info(f"Querying table: {query_table}") | ||
| model = getattr(table_models, query_table) | ||
|
|
||
| # select the given table for the query | ||
| query = select(model.__table__) |
… with dictionary for simplicity
|
@vmartinez-cu Thank you for this work! This set me up nicely and I can pick up from here. I rebased from your working branch. Can I close this? |
|
I'm glad it was helpful! Yes, feel free to close this draft PR |
This pull request updates the Release API Lambda handler, replacing the previous stub with functionality that supports query-based filtering of release records and enforces authentication. It also adds a test file to validate the new API's behavior, including query filtering, error handling, and authentication checks.
NOTE: This addresses part of ticket #1307. It handles querying the new release table. Work is still needed to implement updating the release status of data products.
Release API Lambda Implementation:
Replaces the placeholder
lambda_handlerinrelease_api.pywith functionality that:instrument,start_date,end_date,release_type,release_number) with type and value checks.Adds logging for debugging and error tracking throughout the API handler.
Testing Enhancements:
test_release_api.pythat: