Long Running AWS Connections Failing #336

@gsexton

I'm running into an issue with long-lived connections failing on AWS ECS containers, and I'm hoping for some help.

I'm using:

github.com/go-openapi/runtime v0.28.0
go: 1.23.6

The application has swagger-generated clients that talk to other services. I frequently see "context deadline exceeded" messages in my error logs. I opened a ticket with AWS, and the VPC support engineer said this was a keepalive/timeout issue: the NAT gateway has an idle connection timeout of 350 seconds.

https://docs.aws.amazon.com/vpc/latest/userguide/nat-gateway-troubleshooting.html#nat-gateway-troubleshooting-timeout

According to the SE, when the connection is closed by the NAT gateway, the client doesn't receive a notification, and it hangs until the context deadline expires. The solution is to either use keepalive packets, or to close the connections and re-open them before the idle connection timeout expires.

Does this diagnosis sound correct?

I looked through the docs and found Runtime.EnableConnectionReuse(), but from reading the docs and looking at the code, it doesn't look like it addresses my problem. I tried it anyhow, and it makes no difference.

I have not tried re-generating the clients using swagger. Would that help?

Any ideas would be really appreciated.
