
Release v10.17.0 -- Write products to user buckets#3106

Open
jhkennedy wants to merge 56 commits into main from develop

@jhkennedy jhkennedy commented May 20, 2026

This release will allow HyP3 to publish products directly to S3 buckets managed by end users, once they apply an appropriate bucket policy.

First, a user would hit the new /bucket-policy/{bucket_name} endpoint to get a bucket policy for their bucket -- let's use jhk-lavas-test as an example:
https://hyp3-test-api.asf.alaska.edu/bucket-policy/jhk-lavas-test

And then apply it to their bucket:
image

This then allows them to provide the bucket and bucket_prefix parameters to any job they would like to run, and the output products will show up in the jhk-lavas-test bucket. See all the examples 👇.
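A minimal Python sketch of those two steps. The URL pattern comes from the example above; the response format (assumed here to be a JSON policy document), whether the endpoint needs authentication, and the helper names are all assumptions. Applying the policy uses boto3's put_bucket_policy, which requires credentials for the account that owns the bucket.

```python
import json
import urllib.parse
import urllib.request

# Test deployment URL taken from the example above.
HYP3_TEST_API = "https://hyp3-test-api.asf.alaska.edu"


def bucket_policy_url(bucket_name: str, api_url: str = HYP3_TEST_API) -> str:
    """Build the /bucket-policy/{bucket_name} URL for a deployment."""
    quoted = urllib.parse.quote(bucket_name, safe="")
    return f"{api_url.rstrip('/')}/bucket-policy/{quoted}"


def fetch_bucket_policy(bucket_name: str, api_url: str = HYP3_TEST_API) -> dict:
    """Fetch the policy. ASSUMPTION: a plain GET returns a JSON policy document."""
    url = bucket_policy_url(bucket_name, api_url)
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def apply_bucket_policy(bucket_name: str, policy: dict) -> None:
    """Apply the policy with boto3 (third-party), as the bucket owner."""
    import boto3  # imported lazily so the helpers above work without boto3

    boto3.client("s3").put_bucket_policy(
        Bucket=bucket_name, Policy=json.dumps(policy)
    )
```

With bucket-owner credentials configured, calling `apply_bucket_policy("jhk-lavas-test", fetch_bucket_policy("jhk-lavas-test"))` would do the same thing as pasting the policy into the bucket's permissions by hand.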

Importantly, this requires no downstream changes to the plugins! All plugins are already required to accept a bucket and bucket_prefix argument; we just hid these from users because they were fixed to the HyP3 content bucket and job ID, respectively.

Example jobs

First, let's start with a basic INSAR_ISCE_BURST job, to confirm existing behavior:

{
  "validate_only": false,
  "jobs": [
    {
      "job_type": "INSAR_ISCE_BURST",
      "name": "test-user-bucket",
      "job_parameters": {
        "granules": [
          "S1_136231_IW2_20200604T022312_VV_7C85-BURST",
          "S1_136231_IW2_20200616T022313_VV_5D11-BURST"
        ],
        "apply_water_mask": false,
        "looks": "20x4"
      }
    }
  ]
}

Submitting that job results in:
https://hyp3-test-api.asf.alaska.edu/jobs/8a731ede-7468-45c8-b22b-17d1d87dd577

{
  "status_code": "PENDING",
  "user_id": "jhkennedy",
  "credit_cost": 1,
  "job_parameters": {
    "apply_water_mask": false,
    "looks": "20x4",
    "granules": [
      "S1_136231_IW2_20200604T022312_VV_7C85-BURST",
      "S1_136231_IW2_20200616T022313_VV_5D11-BURST"
    ]
  },
  "priority": 8000,
  "bucket": "hyp3-edc-uat-contentbucket-1cesvjwsurrfn",
  "bucket_prefix": "8a731ede-7468-45c8-b22b-17d1d87dd577",
  "job_id": "8a731ede-7468-45c8-b22b-17d1d87dd577",
  "execution_started": false,
  "job_type": "INSAR_ISCE_BURST",
  "name": "test-user-bucket",
  "request_time": "2026-05-21T03:47:44+00:00"
}

Note how the job response now has bucket, which is the HyP3 content bucket for this deployment, and bucket_prefix, which is the job ID, maintaining the existing behavior for HyP3. Also, when this job succeeds (and it did!) the download URLs for products and images will be for the CloudFront distribution, since this is an EDC deployment.
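To make the two new response fields concrete, here is a small sketch that reads them from a job response. The helper names are hypothetical, and it assumes outputs land under bucket/bucket_prefix, which the log URL in the failed-job example later in this description suggests.

```python
def product_location(job: dict) -> str:
    """Return the S3 location a job's outputs are assumed to be written under."""
    return f"s3://{job['bucket']}/{job['bucket_prefix']}/"


def uses_default_prefix(job: dict) -> bool:
    """True when bucket_prefix is just the job ID (the existing HyP3 behavior)."""
    return job["bucket_prefix"] == job["job_id"]


# Trimmed copy of the job response above.
job = {
    "job_id": "8a731ede-7468-45c8-b22b-17d1d87dd577",
    "bucket": "hyp3-edc-uat-contentbucket-1cesvjwsurrfn",
    "bucket_prefix": "8a731ede-7468-45c8-b22b-17d1d87dd577",
}

print(product_location(job))
print(uses_default_prefix(job))
```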

Now, if we add a bucket parameter to the job, products will be written to the jhk-lavas-test bucket. After submitting this:

{
  "validate_only": false,
  "jobs": [
    {
      "job_type": "INSAR_ISCE_BURST",
      "name": "test-user-bucket",
      "bucket": "jhk-lavas-test",
      "job_parameters": {
        "granules": [
          "S1_136231_IW2_20200604T022312_VV_7C85-BURST",
          "S1_136231_IW2_20200616T022313_VV_5D11-BURST"
        ],
        "apply_water_mask": false,
        "looks": "20x4"
      }
    }
  ]
}

We'll get back:
https://hyp3-test-api.asf.alaska.edu/jobs/347a0c73-1604-4510-93e5-5520f31df67d

{
  "status_code": "PENDING",
  "user_id": "jhkennedy",
  "credit_cost": 1,
  "job_parameters": {
    "apply_water_mask": false,
    "looks": "20x4",
    "granules": [
      "S1_136231_IW2_20200604T022312_VV_7C85-BURST",
      "S1_136231_IW2_20200616T022313_VV_5D11-BURST"
    ]
  },
  "priority": 8000,
  "bucket": "jhk-lavas-test",
  "bucket_prefix": "347a0c73-1604-4510-93e5-5520f31df67d",
  "job_id": "347a0c73-1604-4510-93e5-5520f31df67d",
  "execution_started": false,
  "job_type": "INSAR_ISCE_BURST",
  "name": "test-user-bucket",
  "request_time": "2026-05-21T03:47:44+00:00"
}

Note: the bucket_prefix is still the job ID, but now bucket is what I specified. Importantly, when this job succeeds (and it did!) it will have S3 download URLs since this is a non-HyP3 bucket and not part of our CloudFront distribution -- these URLs will work if the bucket is public and return a 403 Access Denied response if not.
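A client could probe one of those S3 download URLs before trying to fetch it. The sketch below is one hedged way to do that with the standard library; the function name and the injectable opener argument are illustrative choices, not part of HyP3 or the SDK.

```python
import urllib.error
import urllib.request


def is_publicly_readable(url: str, opener=urllib.request.urlopen, timeout: float = 10.0) -> bool:
    """Send a HEAD request and report whether the object can be read anonymously.

    Returns False on 403 (the Access Denied case described above) and re-raises
    any other HTTP error so unrelated problems are not silently hidden.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 403:
            return False
        raise
```

The opener parameter only exists so the function can be exercised without network access; in normal use the default is enough.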

I can also specify bucket_prefix, which supports a couple of expansions:

  • If {job_id} is in the string, it will expand to the job ID
  • If {name} is in the string, it will expand to the job name

So, for example, I could do:

{
  "validate_only": false,
  "jobs": [
    {
      "job_type": "INSAR_ISCE_BURST",
      "name": "test-user-bucket",
      "bucket": "jhk-lavas-test",
      "bucket_prefix": "HyP3/{name}/{job_id}",
      "job_parameters": {
        "granules": [
          "S1_136231_IW2_20200604T022312_VV_7C85-BURST",
          "S1_136231_IW2_20200616T022313_VV_5D11-BURST"
        ],
        "apply_water_mask": false,
        "looks": "20x4"
      }
    }
  ]
}

We'll get back:
https://hyp3-test-api.asf.alaska.edu/jobs/7fc85d8d-088b-439b-977e-9e0e148b8337

{
  "job_id": "7fc85d8d-088b-439b-977e-9e0e148b8337",
  "user_id": "jhkennedy",
  "status_code": "PENDING",
  "execution_started": false,
  "request_time": "2026-05-21T04:04:05+00:00",
  "priority": 7998,
  "job_type": "INSAR_ISCE_BURST",
  "name": "test-user-bucket",
  "bucket": "jhk-lavas-test",
  "bucket_prefix": "HyP3/test-user-bucket/7fc85d8d-088b-439b-977e-9e0e148b8337",
  "job_parameters": {
    "apply_water_mask": false,
    "looks": "20x4",
    "granules": [
      "S1_136231_IW2_20200604T022312_VV_7C85-BURST",
      "S1_136231_IW2_20200616T022313_VV_5D11-BURST"
    ]
  },
  "credit_cost": 1
}

Note that the expansion happens immediately in the API, so the value stored in DynamoDB is the expanded value.
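The expansion itself can be sketched as plain string replacement. This is an illustration of the behavior described above, not the code in this PR, and the function name is hypothetical.

```python
def expand_bucket_prefix(bucket_prefix: str, job_id: str, name: str) -> str:
    """Replace the {job_id} and {name} placeholders; leave anything else untouched."""
    return bucket_prefix.replace("{job_id}", job_id).replace("{name}", name)


expanded = expand_bucket_prefix(
    "HyP3/{name}/{job_id}",
    job_id="7fc85d8d-088b-439b-977e-9e0e148b8337",
    name="test-user-bucket",
)
print(expanded)
```

Using str.replace rather than str.format means an unknown placeholder is left in the string instead of raising, so a later validation step can reject it.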

We do not provide expansion for the bucket. Fortunately, if I try to provide any expansion for the bucket name (e.g., "bucket": "{name}") or a bad expansion in the prefix (e.g., "bucket_prefix": "{bad_expansion}"), the job will be rejected by the API, as { and } are not allowable characters for an S3 key. This is enabled by the OpenAPI spec pattern provided for these parameters -- validation happens after expansion.
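As an illustration of why leftover braces get rejected, the sketch below validates already-expanded values against two regular expressions. Both patterns are hypothetical stand-ins (the bucket one follows the general S3 bucket naming rules, the prefix one allows the characters AWS documents as safe for keys, plus "/"); they are not the patterns in the HyP3 OpenAPI spec.

```python
import re

# HYPOTHETICAL patterns, not the ones in the HyP3 OpenAPI spec.
BUCKET_PATTERN = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
PREFIX_PATTERN = re.compile(r"[A-Za-z0-9!_.*'()/-]+")


def is_valid_bucket(bucket: str) -> bool:
    """Check a bucket name; no expansion is applied to bucket names."""
    return BUCKET_PATTERN.fullmatch(bucket) is not None


def is_valid_prefix(expanded_prefix: str) -> bool:
    """Check a prefix AFTER expansion, so any unexpanded {...} still fails."""
    return PREFIX_PATTERN.fullmatch(expanded_prefix) is not None


print(is_valid_bucket("{name}"))
print(is_valid_prefix("{bad_expansion}"))
print(is_valid_prefix("HyP3/test-user-bucket/7fc85d8d-088b-439b-977e-9e0e148b8337"))
```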

And, to see how a failure works, we can submit this AUTORIFT job we know will fail:

{
  "validate_only": false,
  "jobs": [
    {
      "job_type": "AUTORIFT",
      "name": "test-user-bucket",
      "bucket": "jhk-lavas-test",
      "bucket_prefix": "{job_id}",
      "job_parameters": {
        "granules": [
          "S2B_MSIL1C_20260415T005509_N0512_R059_T51DWE_20260415T031050",
          "S2B_MSIL1C_20251116T005509_N0511_R059_T51DWE_20251116T020121"
        ]
      }
    }
  ]
}

and when it fails, the job response will look like:

{
  "browse_images": [],
  "bucket": "jhk-lavas-test",
  "job_parameters": {
    "granules": [
      "S2B_MSIL1C_20260415T005509_N0512_R059_T51DWE_20260415T031050",
      "S2B_MSIL1C_20251116T005509_N0511_R059_T51DWE_20251116T020121"
    ]
  },
  "processing_times": null,
  "thumbnail_images": [],
  "request_time": "2026-05-21T04:30:04+00:00",
  "execution_started": true,
  "bucket_prefix": "0f8d12bf-4834-4001-971e-d8927dae8443",
  "job_type": "AUTORIFT",
  "files": [],
  "status_code": "FAILED",
  "user_id": "jhkennedy",
  "expiration_time": "2120-10-21T00:00:00+00:00",
  "credit_cost": 25,
  "priority": 7997,
  "logs": [
    "https://jhk-lavas-test.s3.us-west-2.amazonaws.com/0f8d12bf-4834-4001-971e-d8927dae8443/0f8d12bf-4834-4001-971e-d8927dae8443.log"
  ],
  "name": "test-user-bucket",
  "job_id": "0f8d12bf-4834-4001-971e-d8927dae8443"
}

Note how the log files are also written to the user bucket.
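The log URL in that response follows the virtual-hosted S3 URL style. A small sketch that reproduces it is below; the {job_id}.log key name and the us-west-2 region are inferred from this single example and may not hold for other deployments.

```python
def s3_log_url(bucket: str, bucket_prefix: str, job_id: str, region: str = "us-west-2") -> str:
    """Build a virtual-hosted-style S3 URL for a job's log file.

    ASSUMPTIONS: the log key is {bucket_prefix}/{job_id}.log and the bucket
    lives in us-west-2, both taken from the failed-job example above.
    """
    return f"https://{bucket}.s3.{region}.amazonaws.com/{bucket_prefix}/{job_id}.log"


job_id = "0f8d12bf-4834-4001-971e-d8927dae8443"
print(s3_log_url("jhk-lavas-test", job_id, job_id))
```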

Potential concerns

With this implementation, there are a few concerns that are worth mentioning.

  1. The bucket policy grants permissions to the entire HyP3 AWS account. That means potentially anyone with access to the HyP3 account can use those permissions.
    1. A HyP3 developer could list, get, and put items in user buckets themselves.
    2. Users could put their products in another user's bucket if they knew the other user's bucket name.
  2. The bucket policy grants permissions to the entire bucket.
  3. The best/only way to check that permissions are set up correctly is to run a job and see it succeed. If get-files or upload-log don't have the right permissions, you'll end up with a failed job that has no reported files and no logs (or even a log key in the job dict).
  4. We include the download URLs for the product's browse/thumbnail images, but they won't work when products are placed in an end-user bucket.
  5. Without updates to the SDK, if you want to submit jobs with the bucket/bucket_prefix parameters, you'll need to build the job dictionaries yourself and use the submit_prepared_jobs method in the SDK, or use the API directly.

First, I think (4) is largely fine since we also provide the S3 info for products in the job response, and clients like Vertex should be able to gracefully handle an access denied (403) response and will likely just not show the images.

As for the rest, I am planning on effectively soft-launching this feature and using it for internal projects (e.g., VolcSARvatory, LAVAs, AK FIRE SAFE) for a period, since none of these concerns apply to these projects -- we all have admin access to all the AWS accounts involved and we strictly control EDL accounts with access to the HyP3 deployments. For (5) specifically, we're already largely building job dictionaries, as the custom job types aren't available in HyP3 Basic or HyP3+ yet.
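For (5), building the job dictionary by hand might look like the sketch below. The with_user_bucket helper is hypothetical; the submit function uses the SDK's submit_prepared_jobs method and assumes, without having verified it, that the SDK forwards the extra top-level bucket and bucket_prefix keys to the API unchanged.

```python
import copy
from typing import Optional


def with_user_bucket(prepared_job: dict, bucket: str, bucket_prefix: Optional[str] = None) -> dict:
    """Return a copy of a prepared job dict with the new top-level parameters added."""
    job = copy.deepcopy(prepared_job)
    job["bucket"] = bucket
    if bucket_prefix is not None:
        job["bucket_prefix"] = bucket_prefix
    return job


# Same job as the INSAR_ISCE_BURST examples above.
prepared_job = {
    "job_type": "INSAR_ISCE_BURST",
    "name": "test-user-bucket",
    "job_parameters": {
        "granules": [
            "S1_136231_IW2_20200604T022312_VV_7C85-BURST",
            "S1_136231_IW2_20200616T022313_VV_5D11-BURST",
        ],
        "apply_water_mask": False,
        "looks": "20x4",
    },
}
job = with_user_bucket(prepared_job, "jhk-lavas-test", "HyP3/{name}/{job_id}")


def submit(jobs: list, api_url: str = "https://hyp3-test-api.asf.alaska.edu"):
    """Submit through hyp3_sdk (third-party; needs Earthdata Login credentials)."""
    import hyp3_sdk  # imported lazily so the helper above runs without the SDK

    return hyp3_sdk.HyP3(api_url=api_url).submit_prepared_jobs(jobs)
```

Calling `submit([job])` against the test deployment would then be equivalent to posting the third example request above.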

Before making it generally available (e.g., adding it to hyp3-docs and updating the SDK), we'll want to address some/most/all of those concerns, so I've opened a number of follow-on issues 👇 with more details for these concerns -- please discuss concerns in those issues, if appropriate and possible.

Follow on issues

@jhkennedy jhkennedy added the minor Bump the minor version number of this project label May 20, 2026
github-actions Bot commented May 20, 2026

Developer checklist

  • Indicated the level of changes to this package by affixing one of these labels:
    • major -- Major changes to the API that may break current workflows
    • minor -- Minor changes to the API that do not break current workflows
    • patch -- Patches and bugfixes for the current version that do not break current workflows
    • bumpless -- Changes to documentation, CI/CD pipelines, etc. that don't affect the software's version
  • (If applicable) Updated the dependencies and indicated any downstream changes that are required
  • Added/updated documentation for these changes
  • Added/updated tests for these changes
  • Verified changes in test deployment and summarized results, e.g. in PR description or comments on the related issue(s)
  • If the step function code has changed, have you drained the job queue before merging?
    • For example, if the interface for a Lambda function has changed to expect different input,
      then currently running jobs (which use the old step function definition) will call the new
      function with the old input. So we must drain the job queue before deployment, so that the new
      function is only called by the new step function definition.

Reviewer checklist

  • Have all dependencies been updated?
  • Is the level of changes labeled appropriately?
  • Are all the changes described appropriately in CHANGELOG.md?
  • Has the documentation been adequately updated?
  • Are the tests adequate?
  • Have the changes been verified in the test deployment?

@jhkennedy jhkennedy marked this pull request as ready for review May 21, 2026 04:39
@jhkennedy jhkennedy requested review from a team as code owners May 21, 2026 04:39

jhkennedy commented May 21, 2026

If the step function code has changed, have you drained the job queue before merging?

Caution

The Step Function code has changed! The job queue should be drained before merging!

Someone from @ASFHyP3/tools will have to do this and then merge the PR since there is no review-gate for EDC deployments.
