Issues after downgrade from v2.3 to 1.6.2 #1956

@sheetalchauhan24

Hi Team,

We run data.all in three environments: Dev, Test and Prod. Dev and Test share a common code repository, while Prod has a separate repo. I agree that a downgrade is not recommended, but we had to do it in the Dev environment for the following reason:
We started upgrading Dev from 1.6.2 to v2.3, but during the upgrade the Test environment started failing because of the ECR image retention limit of 200, and it stopped showing Admin tools.
To fix this 1.6 issue in the Test environment, we had no option other than downgrading the Dev repo code from 2.3 to 1.6. We did that, but we are now facing strange issues in the Dev environment: it seems the code is downgraded but the AWS infrastructure is not (CDK version, Python version, Lambda layer, etc.).

Issues: All environment stack updates have gone from UPDATE_COMPLETE state to UPDATE_ROLLBACK_COMPLETE state.

Error from CloudWatch logs: TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'

Root cause identified: The Lambda layer (dataallGlueProflingJobDeploymentAwsCliLayer2ABEAF10) attached to the Lambda function (dataall-environment-CustomCDKBucketDeployment) still targets Python 3.10+, but the downgraded code expects it to be on Python 3.9.
** The Lambda layer contains an AWS CLI version in which a module is missing or incompatible with Python 3.9.
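For reference, this TypeError is what Python 3.9 raises when code evaluates the `X | None` union syntax that only became valid at runtime in Python 3.10 (PEP 604). The following is a minimal sketch to check an interpreter for this behavior. It is not taken from the data.all or AWS CLI code, and `supports_union_operator` is a hypothetical helper name.

```python
import sys
from typing import Optional


def supports_union_operator() -> bool:
    """Return True if `type | None` is accepted at runtime (Python 3.10+)."""
    try:
        # Evaluated at runtime; Python 3.9 and older raise TypeError here.
        _ = int | None
    except TypeError as exc:
        # On 3.9: unsupported operand type(s) for |: 'type' and 'NoneType'
        print(f"TypeError: {exc}")
        return False
    return True


# Spelling of the same annotation that works on both 3.9 and 3.10+.
PortableOptionalInt = Optional[int]

if __name__ == "__main__":
    print(sys.version_info[:2], supports_union_operator())
```

Running this under the same runtime as the Lambda function should show whether the interpreter version alone explains the error.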

Fix attempted: We manually created a new Python 3.9 compatible Lambda layer.

Current blockers: The environment stack update is still failing with the following errors:

Blocker No. 1: The S3 copy from the CDK assets bucket to the environment-specific bucket is failing

The resource dataallGlueProflingJobDeploymentCustomResourceC9CF42F0 is in a CREATE_FAILED state
This Custom::CDKBucketDeployment resource is in a CREATE_FAILED state.
Received response status [FAILED] from custom resource. Message returned: Command '['/opt/awscli/aws', 's3', 'cp', 's3://cdk-hnb659fds-assets-985757210942-eu-west-1/bda8704101aab5f506bbd80c208427b6d7beb2f685b712197d0b181c1c521e41.zip', '/tmp/tmpsgm_eu1j/2a4eaa29-61ad-4185-8771-426818ad919f']' returned non-zero exit status 1. (RequestId: 909e7e1a-a77b-43e5-8f1b-a926ef47353f)
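The message above has the same shape as Python's `subprocess.CalledProcessError`, which reports only the exit status and not what the command printed. Below is a minimal sketch of re-running a command with stderr captured, so the underlying failure becomes visible. `run_and_capture` is a hypothetical helper, not part of CDK or data.all, and the demo uses a stand-in command rather than the real `/opt/awscli/aws` call.

```python
import subprocess
import sys
from typing import List, Tuple


def run_and_capture(cmd: List[str]) -> Tuple[int, str]:
    """Run cmd and return (exit code, stderr text) instead of raising."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stderr.strip()


if __name__ == "__main__":
    # Stand-in for a failing CLI call: writes to stderr and exits with 1.
    demo = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(1)"]
    code, err = run_and_capture(demo)
    print(f"exit status {code}, stderr: {err}")
```

Substituting the `aws s3 cp` arguments from the error message, in an environment that has the same layer contents, would be one way to see the CLI's own error output.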

Blocker No. 2: Behavior differs between creating a new environment stack and updating an existing environment stack
Using the manually created Python 3.9 compatible Lambda layer works for a new environment stack, but it doesn't work for an existing environment stack update: the CloudFormation stack is still trying to find the old CLI layer version.

The resource CustomCDKBucketDeployment8693BB64968944B69AAFB0CC9EB8756C81C01536 is in a UPDATE_FAILED state
This AWS::Lambda::Function resource is in a UPDATE_FAILED state.
Resource handler returned message: "Layer version arn:aws:lambda:eu-west-1:582667241002:layer:dataallGlueProflingJobDeploymentAwsCliLayer2ABEAF10:38 does not exist. (Service: Lambda, Status Code: 400, Request ID: a1bbac76-2719-4028-9177-b5e4ec449149) (SDK Attempt Count: 1)" (RequestToken: 8c7f35dd-772e-d2cc-658c-50f2d1b0fe83, HandlerErrorCode: InvalidRequest)
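To compare the layer version that the stack is looking for with the manually created one, the ARN in the message can be split into its parts. This is a minimal sketch; `parse_layer_version_arn` and `LayerVersionArn` are hypothetical names for illustration, not an AWS SDK API.

```python
from typing import NamedTuple


class LayerVersionArn(NamedTuple):
    partition: str
    region: str
    account_id: str
    layer_name: str
    version: int


def parse_layer_version_arn(arn: str) -> LayerVersionArn:
    """Split arn:<partition>:lambda:<region>:<account>:layer:<name>:<version>."""
    parts = arn.split(":")
    if len(parts) != 8 or parts[0] != "arn" or parts[2] != "lambda" or parts[5] != "layer":
        raise ValueError(f"not a Lambda layer version ARN: {arn!r}")
    return LayerVersionArn(parts[1], parts[3], parts[4], parts[6], int(parts[7]))


if __name__ == "__main__":
    # ARN copied from the UPDATE_FAILED message above.
    arn = (
        "arn:aws:lambda:eu-west-1:582667241002:layer:"
        "dataallGlueProflingJobDeploymentAwsCliLayer2ABEAF10:38"
    )
    print(parse_layer_version_arn(arn))
```

Comparing these fields with the ARN of the manually created layer would show whether the existing stack template still references the old layer name and version number.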

Blocker No. 3: CodePipeline stage "dataall-dev-backend-stage" failure

The resource S3ResourcesNestedStackS3ResourcesNestedStackResourceEF3D2964 is in a UPDATE_FAILED state
This AWS::CloudFormation::Stack resource is in a UPDATE_FAILED state.
Embedded stack arn:aws:cloudformation:eu-west-1:244469940082:stack/dataall-dev-backend-stage-backend-stack-S3ResourcesNestedStackS3ResourcesNestedStackRe-1BIDTIXL3YIAE/d6b284a0-fc0e-11ef-a4db-029b476e7481 was not successfully updated. Currently in UPDATE_ROLLBACK_IN_PROGRESS with reason: The following resource(s) failed to create: [PivotRoleDeploymentdevCustomResource90F356DC, CDKExecutionPolicyDeploymentdevCustomResourceA5234CC9].
Fix attempted: We commented out the code for these 2 nested stacks to bypass the error.
