Skip to content

[WIP] Add release api logic#1350

Draft
vmartinez-cu wants to merge 7 commits into
IMAP-Science-Operations-Center:devfrom
vmartinez-cu:release-api-logic
Draft

[WIP] Add release api logic#1350
vmartinez-cu wants to merge 7 commits into
IMAP-Science-Operations-Center:devfrom
vmartinez-cu:release-api-logic

Conversation

@vmartinez-cu
Copy link
Copy Markdown
Contributor

This pull request updates the Release API Lambda handler, replacing the previous stub with functionality that supports query-based filtering of release records and enforces authentication. It also adds a test file to validate the new API's behavior, including query filtering, error handling, and authentication checks.

NOTE: This addresses part of ticket #1307. It handles querying the new release table. Work is still needed to implement updating the release status of data products.

Release API Lambda Implementation:

  • Replaces the placeholder lambda_handler in release_api.py with functionality that:

    • Validates user authentication.
    • Supports query parameters (instrument, start_date, end_date, release_type, release_number) with type and value checks.
    • Filters release records using SQLAlchemy based on provided parameters, including date parsing and release type validation.
    • Ensures only the latest version of each filtered record is returned.
    • Formats response data, including date formatting and field filtering.
  • Adds logging for debugging and error tracking throughout the API handler.

Testing Enhancements:

  • Introduces a new test suite test_release_api.py that:
    • Covers valid and invalid query scenarios, including parameter validation, release type filtering, date parsing, and version selection.
    • Verifies authentication enforcement and correct API responses for both successful and error cases.
    • Provides utility functions for building test events and inserting test records.

…onverts cli params with hyphens to db friendly params with underscores. Also clean up docstring and add check for valid release type inputs
…e. Clean up TODOs for future work. Return search results instead of static success message
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements the query side of the Release API lambda (issue #1307): replaces the stub lambda_handler with logic that authenticates the request, validates query parameters, filters ReleaseFiles records by date/release type, returns only the latest version, and serializes results. Adds a test module covering filtering, validation, and authentication paths. Per the PR description, the update-status portion of the feature is still TODO.

Changes:

  • Build a parameterized SQLAlchemy query over ReleaseFiles with auth, parameter, and value validation.
  • Reduce results to the latest version and serialize datetimes for response.
  • Add tests/lambda_endpoints/test_release_api.py covering happy-path and error scenarios.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
sds_data_manager/lambda_code/SDSCode/api_lambdas/release_api.py Replaces stub handler with auth check, query-param filtering, latest-version reduction, and response serialization.
tests/lambda_endpoints/test_release_api.py New tests for query body, invalid parameters, release_type/date filters, version selection, and auth rejection.
Comments suppressed due to low confidence (7)

sds_data_manager/lambda_code/SDSCode/api_lambdas/release_api.py:151

  • The instrument and release_number parameters are listed as valid query parameters but the for-loop only handles start_date, end_date, and release_type. There is no else branch that applies a filter for other valid parameters (compare to query_api.py line 119-120 which has query = query.where(getattr(model, param) == value)). As a result, passing instrument=swe or release_number=001 silently does nothing — the parameter is accepted but never filters the query. Additionally, release_number does not even exist as a column on ReleaseFiles, so it cannot be supported by a generic getattr filter without further work.
    # go through each query parameter to set up sqlalchemy query conditions
    for param, value in query_params.items():
        # confirm that the query parameter is valid
        if param not in valid_parameters:
            response = {
                "statusCode": 400,
                "body": json.dumps(
                    f"{param} is not a valid query parameter. "
                    + f"Valid query parameters are: {valid_parameters}"
                ),
            }
            logger.debug(
                f"Received an invalid query parameter [{param}], valid options are: {valid_parameters}"
            )
            return response
        try:
            if param == "start_date":
                query = query.where(
                    model.start_date >= datetime.datetime.strptime(value, "%Y%m%d")
                )
            elif param == "end_date":
                # the date queries will only look at the file start_date.
                query = query.where(
                    model.end_date <= datetime.datetime.strptime(value, "%Y%m%d")
                )
            elif param == "release_type":
                valid_release_types = [
                    "early-release",
                    "unrelease",
                    "withhold-data",  # TODO: make this one default if release-type isn't given?
                ]
                if value not in valid_release_types:
                    response = {
                        "statusCode": 400,
                        "body": json.dumps(
                            f"{param} is not a valid release_type parameter. "
                            + f"Valid release_type parameters are: {valid_release_types}"
                        ),
                    }
                    logger.debug(
                        f"Received an invalid release_type parameter [{param}], valid options are: {valid_release_types}"
                    )
                    return response
                # filter release-type in filename using a "contains" query on the file path
                query = query.where(model.file_path.contains(value, autoescape=True))
        except ValueError:
            response = {
                "statusCode": 400,
                "body": json.dumps(f"Invalid value for {param}: {value}"),
            }
            logger.debug(f"Invalid value for {param}: {value}")
            return response

sds_data_manager/lambda_code/SDSCode/api_lambdas/release_api.py:156

  • The "latest version" subquery selects the single global maximum version across all rows returned by the filter, then keeps only rows whose version equals that scalar. This is incorrect when the filtered result spans multiple logically distinct files (e.g., different instruments, descriptors, or date ranges): all rows that happen not to be at the global max version will be dropped, even though each distinct file group has its own version history. The "latest version" needs to be computed per group (e.g., GROUP BY instrument, descriptor, start_date, end_date) rather than as a single global scalar.
    # Keep only rows at the highest version from the filtered result set.
    filtered_subq = query.subquery()
    max_version_subq = select(func.max(filtered_subq.c.version)).scalar_subquery()
    query = select(filtered_subq).where(filtered_subq.c.version == max_version_subq)

sds_data_manager/lambda_code/SDSCode/api_lambdas/release_api.py:101

  • When API Gateway receives a request with no query string, event["queryStringParameters"] is None, not {}. The call query_params.items() on line 101 will raise AttributeError in that case. The handler should treat a missing/None value as an empty dict (e.g., query_params = event.get("queryStringParameters") or {}).
    query_params = event["queryStringParameters"]
    # get desired table for query
    query_table = "release"
    logger.info(f"Querying table: {query_table}")
    model = getattr(table_models, query_table)

    # select the given table for the query
    query = select(model.__table__)

    # get a list of all valid search parameters
    valid_parameters = [
        "instrument",
        "start_date",
        "end_date",
        "release_type",
        "release_number",
    ]

    # go through each query parameter to set up sqlalchemy query conditions
    for param, value in query_params.items():

sds_data_manager/lambda_code/SDSCode/api_lambdas/release_api.py:124

  • The end_date filter model.end_date <= datetime.datetime.strptime(value, "%Y%m%d") excludes rows where end_date IS NULL, but the end_date column on AncillaryFileBase is nullable (models.py:220). Rows with no end_date will be silently dropped from results whenever an end_date query parameter is provided. Consider whether NULL end_date rows should be included (e.g., (end_date <= value) | (end_date.is_(None))).
            elif param == "end_date":
                # the date queries will only look at the file start_date.
                query = query.where(
                    model.end_date <= datetime.datetime.strptime(value, "%Y%m%d")
                )

sds_data_manager/lambda_code/SDSCode/api_lambdas/release_api.py:142

  • The invalid-release-type error message uses {param} (which is the literal string "release_type") rather than {value} (the offending user input). The message reads "release_type is not a valid release_type parameter..." which is confusing. The test at test_release_api.py:116 only checks the static substring "is not a valid release_type parameter" so it doesn't catch this. The same applies to the debug log on line 140 which also uses {param} instead of {value}.
                if value not in valid_release_types:
                    response = {
                        "statusCode": 400,
                        "body": json.dumps(
                            f"{param} is not a valid release_type parameter. "
                            + f"Valid release_type parameters are: {valid_release_types}"
                        ),
                    }
                    logger.debug(
                        f"Received an invalid release_type parameter [{param}], valid options are: {valid_release_types}"
                    )
                    return response

sds_data_manager/lambda_code/SDSCode/api_lambdas/release_api.py:124

  • Comment says "the date queries will only look at the file start_date", but this branch filters on model.end_date, not start_date. The comment appears to have been copy-pasted from query_api.py (line 102-103) where the original logic did use start_date for the end_date param. Please update the comment to match the actual filter, or reconsider whether end_date is the intended column here.
            elif param == "end_date":
                # the date queries will only look at the file start_date.
                query = query.where(
                    model.end_date <= datetime.datetime.strptime(value, "%Y%m%d")
                )

sds_data_manager/lambda_code/SDSCode/api_lambdas/release_api.py:144

  • The release_type filter uses a substring contains match on file_path. This is fragile: the substring "withhold-data" would also match files that happen to have "withhold-data" anywhere in their path (e.g., directory components, future descriptors), and the substring approach cannot distinguish between e.g. "release" and "early-release" without ordering of the checks. Matching against model.descriptor (the structured field that already encodes the release type) would be safer and more precise.
                # filter release-type in filename using a "contains" query on the file path
                query = query.where(model.file_path.contains(value, autoescape=True))

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +60 to +66
"statusCode": 400,
"body": json.dumps(
f"API authentication failed: {event['body']}."),
}
logger.debug(
f"API authentication failed: {event['body']}."
)
Comment on lines +111 to +150
logger.debug(
f"Received an invalid query parameter [{param}], valid options are: {valid_parameters}"
)
return response
try:
if param == "start_date":
query = query.where(
model.start_date >= datetime.datetime.strptime(value, "%Y%m%d")
)
elif param == "end_date":
# the date queries will only look at the file start_date.
query = query.where(
model.end_date <= datetime.datetime.strptime(value, "%Y%m%d")
)
elif param == "release_type":
valid_release_types = [
"early-release",
"unrelease",
"withhold-data", # TODO: make this one default if release-type isn't given?
]
if value not in valid_release_types:
response = {
"statusCode": 400,
"body": json.dumps(
f"{param} is not a valid release_type parameter. "
+ f"Valid release_type parameters are: {valid_release_types}"
),
}
logger.debug(
f"Received an invalid release_type parameter [{param}], valid options are: {valid_release_types}"
)
return response
# filter release-type in filename using a "contains" query on the file path
query = query.where(model.file_path.contains(value, autoescape=True))
except ValueError:
response = {
"statusCode": 400,
"body": json.dumps(f"Invalid value for {param}: {value}"),
}
logger.debug(f"Invalid value for {param}: {value}")
Comment on lines +182 to +195
event = _build_event(
{
"instrument": "swe",
"release_type": "withhold-data",
"start_date": "20250101",
"end_date": "20250331",
}
)
returned_query = release_api.lambda_handler(event=event, context={})

assert returned_query["statusCode"] == 200
body = json.loads(returned_query["body"])
assert len(body) == 1
assert body[0]["version"] == "v002"
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a good catch

Comment thread sds_data_manager/lambda_code/SDSCode/api_lambdas/release_api.py
Comment on lines +71 to +89
TableModels = namedtuple(
"TableModels", ["release", "science", "ancillary"]
)

table_models = TableModels(
science=models.ScienceFiles,
ancillary=models.AncillaryFiles,
release=models.ReleaseFiles
)

# add session, pick model
query_params = event["queryStringParameters"]
# get desired table for query
query_table = "release"
logger.info(f"Querying table: {query_table}")
model = getattr(table_models, query_table)

# select the given table for the query
query = select(model.__table__)
@tech3371
Copy link
Copy Markdown
Contributor

@vmartinez-cu Thank you for this work! This set me up nicely and I can pick up from here. I rebased from your working branch. Can I close this?

@vmartinez-cu
Copy link
Copy Markdown
Contributor Author

I'm glad it was helpful! Yes, feel free to close this draft PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Data Release 1 Deadline: June 15, 2026 Infrastructure

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

3 participants