…gs in step function
Updates to user defined buckets
Co-authored-by: Joseph H Kennedy <me@jhkennedy.org>
User Defined Buckets
Caution: The Step Function code has changed! The job queue should be drained before merging! Someone from @ASFHyP3/tools will have to do this and then merge the PR since there is no review-gate for EDC deployments.
This release will allow HyP3 to publish products straight to end-user managed S3 buckets, once they apply an appropriate bucket policy.
First, a user would hit the new /bucket-policy/{bucket_name} endpoint to get a bucket policy for their bucket -- let's use jhk-lavas-test for example:
https://hyp3-test-api.asf.alaska.edu/bucket-policy/jhk-lavas-test
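As a small sketch of how a client might build that request URL, assuming only the endpoint path and test deployment shown above: the helper name `bucket_policy_url` is hypothetical, and the sketch deliberately stops short of making the request, since the response format and any authentication requirements are not described here.

```python
from urllib.parse import quote

# Test deployment URL taken from the example above.
HYP3_TEST_API = "https://hyp3-test-api.asf.alaska.edu"


def bucket_policy_url(bucket_name: str, api_url: str = HYP3_TEST_API) -> str:
    """Build the URL of the bucket-policy endpoint for one bucket (hypothetical helper)."""
    # quote() percent-encodes anything that is not URL-safe; valid S3 bucket
    # names (lowercase letters, digits, dots, hyphens) pass through unchanged.
    return f"{api_url.rstrip('/')}/bucket-policy/{quote(bucket_name, safe='')}"


print(bucket_policy_url("jhk-lavas-test"))
```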
And then apply it to their bucket:

which will then allow them to provide the bucket and bucket_prefix parameters to any job they would like to run and the output products will show up in the jhk-lavas-test bucket. See all the examples 👇.
Importantly, this requires no downstream changes to the plugins! All plugins already are required to accept a bucket and bucket_prefix argument; we just hid this from users as they are fixed to the HyP3 content bucket and job ID, respectively.
Example jobs
First, let's start with a basic INSAR_ISCE_BURST job, to confirm existing behavior:

    {
      "validate_only": false,
      "jobs": [
        {
          "job_type": "INSAR_ISCE_BURST",
          "name": "test-user-bucket",
          "job_parameters": {
            "granules": [
              "S1_136231_IW2_20200604T022312_VV_7C85-BURST",
              "S1_136231_IW2_20200616T022313_VV_5D11-BURST"
            ],
            "apply_water_mask": false,
            "looks": "20x4"
          }
        }
      ]
    }

Submitting that job results in:
https://hyp3-test-api.asf.alaska.edu/jobs/8a731ede-7468-45c8-b22b-17d1d87dd577

    {
      "status_code": "PENDING",
      "user_id": "jhkennedy",
      "credit_cost": 1,
      "job_parameters": {
        "apply_water_mask": false,
        "looks": "20x4",
        "granules": [
          "S1_136231_IW2_20200604T022312_VV_7C85-BURST",
          "S1_136231_IW2_20200616T022313_VV_5D11-BURST"
        ]
      },
      "priority": 8000,
      "bucket": "hyp3-edc-uat-contentbucket-1cesvjwsurrfn",
      "bucket_prefix": "8a731ede-7468-45c8-b22b-17d1d87dd577",
      "job_id": "8a731ede-7468-45c8-b22b-17d1d87dd577",
      "execution_started": false,
      "job_type": "INSAR_ISCE_BURST",
      "name": "test-user-bucket",
      "request_time": "2026-05-21T03:47:44+00:00"
    }

Note how the job response now has bucket, which is the HyP3 content bucket for this deployment, and bucket_prefix, which is the job ID, maintaining the existing behavior for HyP3. Also, when this job succeeds (and it did!) the download URLs for products and images will be for the CloudFront distribution, since this is an EDC deployment.
Now, if we add a bucket parameter to the job, products will be written to the jhk-lavas-test bucket. After submitting this:

    {
      "validate_only": false,
      "jobs": [
        {
          "job_type": "INSAR_ISCE_BURST",
          "name": "test-user-bucket",
          "bucket": "jhk-lavas-test",
          "job_parameters": {
            "granules": [
              "S1_136231_IW2_20200604T022312_VV_7C85-BURST",
              "S1_136231_IW2_20200616T022313_VV_5D11-BURST"
            ],
            "apply_water_mask": false,
            "looks": "20x4"
          }
        }
      ]
    }

We'll get back:
https://hyp3-test-api.asf.alaska.edu/jobs/347a0c73-1604-4510-93e5-5520f31df67d

    {
      "status_code": "PENDING",
      "user_id": "jhkennedy",
      "credit_cost": 1,
      "job_parameters": {
        "apply_water_mask": false,
        "looks": "20x4",
        "granules": [
          "S1_136231_IW2_20200604T022312_VV_7C85-BURST",
          "S1_136231_IW2_20200616T022313_VV_5D11-BURST"
        ]
      },
      "priority": 8000,
      "bucket": "jhk-lavas-test",
      "bucket_prefix": "347a0c73-1604-4510-93e5-5520f31df67d",
      "job_id": "347a0c73-1604-4510-93e5-5520f31df67d",
      "execution_started": false,
      "job_type": "INSAR_ISCE_BURST",
      "name": "test-user-bucket",
      "request_time": "2026-05-21T03:47:44+00:00"
    }

Note: the bucket_prefix is still the job ID but now bucket is what I specified. Importantly, when this job succeeds (and it did!) it will have S3 download URLs since this is a non-HyP3 bucket and not part of our CloudFront distribution -- these URLs will work if the bucket is public and return a 403 Access Denied response if not.
I can also specify bucket_prefix, but this parameter has some expansions that are possible:
- If {job_id} is in the string, it will expand to the job ID
- If {name} is in the string, it will expand to the job name

So, for example, I could do:

    {
      "validate_only": false,
      "jobs": [
        {
          "job_type": "INSAR_ISCE_BURST",
          "name": "test-user-bucket",
          "bucket": "jhk-lavas-test",
          "bucket_prefix": "HyP3/{name}/{job_id}",
          "job_parameters": {
            "granules": [
              "S1_136231_IW2_20200604T022312_VV_7C85-BURST",
              "S1_136231_IW2_20200616T022313_VV_5D11-BURST"
            ],
            "apply_water_mask": false,
            "looks": "20x4"
          }
        }
      ]
    }

We'll get back:
https://hyp3-test-api.asf.alaska.edu/jobs/7fc85d8d-088b-439b-977e-9e0e148b8337

    {
      "job_id": "7fc85d8d-088b-439b-977e-9e0e148b8337",
      "user_id": "jhkennedy",
      "status_code": "PENDING",
      "execution_started": false,
      "request_time": "2026-05-21T04:04:05+00:00",
      "priority": 7998,
      "job_type": "INSAR_ISCE_BURST",
      "name": "test-user-bucket",
      "bucket": "jhk-lavas-test",
      "bucket_prefix": "HyP3/test-user-bucket/7fc85d8d-088b-439b-977e-9e0e148b8337",
      "job_parameters": {
        "apply_water_mask": false,
        "looks": "20x4",
        "granules": [
          "S1_136231_IW2_20200604T022312_VV_7C85-BURST",
          "S1_136231_IW2_20200616T022313_VV_5D11-BURST"
        ]
      },
      "credit_cost": 1
    }

Note that the expansion happens immediately in the API, so the value placed in DynamoDB is the expanded value.
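A minimal sketch of what expand-then-validate could look like, not the actual HyP3 API code. The function names are hypothetical, and `PREFIX_PATTERN` is an assumed stand-in for the OpenAPI spec pattern (which is not reproduced in this description); the only property that matters for the sketch is that it excludes `{` and `}`.

```python
import re

# Hypothetical stand-in for the OpenAPI pattern on bucket_prefix. The real
# pattern may differ; this one simply excludes '{' and '}'.
PREFIX_PATTERN = re.compile(r"[A-Za-z0-9!_.*'()/-]+")


def expand_bucket_prefix(bucket_prefix: str, job_id: str, name: str) -> str:
    """Replace the two supported placeholders; anything else is left untouched."""
    return bucket_prefix.replace("{job_id}", job_id).replace("{name}", name)


def validate_bucket_prefix(expanded: str) -> str:
    """Reject an expanded prefix that still contains disallowed characters."""
    if not PREFIX_PATTERN.fullmatch(expanded):
        raise ValueError(f"invalid bucket_prefix: {expanded!r}")
    return expanded


job_id = "7fc85d8d-088b-439b-977e-9e0e148b8337"
expanded = expand_bucket_prefix("HyP3/{name}/{job_id}", job_id, "test-user-bucket")
print(validate_bucket_prefix(expanded))
```

Because the check runs on the expanded string, a placeholder the expansion does not recognize survives with its braces intact and fails the pattern.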
We do not provide expansion for the bucket. Fortunately, if I try to provide any expansion for the bucket name (e.g., "bucket": "{name}") or a bad expansion in the prefix (e.g., "bucket_prefix": "{bad_expansion}"), the job will be rejected by the API, as { and } are not allowable characters for an S3 key. This is enabled by the OpenAPI spec pattern provided for these parameters -- validation happens after expansion.
And, to see how a failure works, we can submit this AUTORIFT job we know will fail:

    {
      "validate_only": false,
      "jobs": [
        {
          "job_type": "AUTORIFT",
          "name": "test-user-bucket",
          "bucket": "jhk-lavas-test",
          "bucket_prefix": "{job_id}",
          "job_parameters": {
            "granules": [
              "S2B_MSIL1C_20260415T005509_N0512_R059_T51DWE_20260415T031050",
              "S2B_MSIL1C_20251116T005509_N0511_R059_T51DWE_20251116T020121"
            ]
          }
        }
      ]
    }

and when it fails, the job response will look like:

    {
      "browse_images": [],
      "bucket": "jhk-lavas-test",
      "job_parameters": {
        "granules": [
          "S2B_MSIL1C_20260415T005509_N0512_R059_T51DWE_20260415T031050",
          "S2B_MSIL1C_20251116T005509_N0511_R059_T51DWE_20251116T020121"
        ]
      },
      "processing_times": null,
      "thumbnail_images": [],
      "request_time": "2026-05-21T04:30:04+00:00",
      "execution_started": true,
      "bucket_prefix": "0f8d12bf-4834-4001-971e-d8927dae8443",
      "job_type": "AUTORIFT",
      "files": [],
      "status_code": "FAILED",
      "user_id": "jhkennedy",
      "expiration_time": "2120-10-21T00:00:00+00:00",
      "credit_cost": 25,
      "priority": 7997,
      "logs": [
        "https://jhk-lavas-test.s3.us-west-2.amazonaws.com/0f8d12bf-4834-4001-971e-d8927dae8443/0f8d12bf-4834-4001-971e-d8927dae8443.log"
      ],
      "name": "test-user-bucket",
      "job_id": "0f8d12bf-4834-4001-971e-d8927dae8443"
    }

Note how the log files are also written to the user bucket.
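For reference, the log URL in that response follows the S3 virtual-hosted-style form. The sketch below rebuilds it from the job fields; the helper names are hypothetical, the `<bucket_prefix>/<job_id>.log` layout is inferred from this single example, and the `us-west-2` default is simply the region that appears in the URL above. Keys containing characters that need percent-encoding are not handled.

```python
def s3_object_url(bucket: str, key: str, region: str = "us-west-2") -> str:
    """Virtual-hosted-style S3 URL for an object key (no percent-encoding applied)."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{key}"


def log_url(bucket: str, bucket_prefix: str, job_id: str) -> str:
    """Log location as seen in the failed-job response: <bucket_prefix>/<job_id>.log."""
    # Assumption: this layout is inferred from one example response.
    return s3_object_url(bucket, f"{bucket_prefix}/{job_id}.log")


job_id = "0f8d12bf-4834-4001-971e-d8927dae8443"
print(log_url("jhk-lavas-test", job_id, job_id))
```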
Potential concerns
With this implementation, there are a few concerns that are worth mentioning.
submit_prepared_jobs method in the SDK or the API directly.
First, I think (4) is largely fine since we also provide the S3 info for products in the job response, and clients like Vertex should be able to gracefully handle an access denied (403) response and will likely just not show the images.
As for the rest, I am planning on effectively soft-launching this feature and using it for internal projects (e.g., VolcSARvatory, LAVAs, AK FIRE SAFE) for a period since none of these concerns apply to these projects -- we all have admin access to all the AWS accounts involved and we strictly control EDL accounts with access to the HyP3 deployments. For (5) specifically, we're already largely building job-dictionaries as the custom job types aren't available in HyP3 Basic or HyP3+ yet.
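A rough sketch of building such a job dictionary with the new fields, shaped like the request bodies in the examples above. `prepare_job` is a hypothetical helper, not part of the SDK, and whether a given SDK version passes the bucket and bucket_prefix keys through unchanged is an assumption worth verifying before relying on it.

```python
from typing import Any, Optional


def prepare_job(
    job_type: str,
    name: str,
    job_parameters: dict[str, Any],
    bucket: Optional[str] = None,
    bucket_prefix: Optional[str] = None,
) -> dict[str, Any]:
    """Build one job dictionary shaped like the example requests (hypothetical helper)."""
    job: dict[str, Any] = {
        "job_type": job_type,
        "name": name,
        "job_parameters": dict(job_parameters),  # shallow copy of the caller's dict
    }
    # Only include the bucket fields when given, so jobs without them keep
    # the default HyP3 content-bucket behavior.
    if bucket is not None:
        job["bucket"] = bucket
    if bucket_prefix is not None:
        job["bucket_prefix"] = bucket_prefix
    return job


job = prepare_job(
    "INSAR_ISCE_BURST",
    "test-user-bucket",
    {
        "granules": [
            "S1_136231_IW2_20200604T022312_VV_7C85-BURST",
            "S1_136231_IW2_20200616T022313_VV_5D11-BURST",
        ],
        "apply_water_mask": False,
        "looks": "20x4",
    },
    bucket="jhk-lavas-test",
    bucket_prefix="HyP3/{name}/{job_id}",
)
payload = {"validate_only": False, "jobs": [job]}
```

The placeholders are left unexpanded on the client side, since the examples above show the API doing the expansion.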
Before making it generally available (e.g., adding it to hyp3-docs and updating the SDK), we'll want to address some/most/all of those concerns, so I've opened a number of follow-on issues 👇 with more details for these concerns -- please discuss concerns in those issues, if appropriate and possible.
Follow on issues
Principal in user bucket policies #3108