ECONNRESET: aborted when pushing large multi-container builds #2768

@timwedde

Expected Behavior

Pushing arbitrarily-sized multi-container builds to Balena builders works fine and creates a new image successfully.

Actual Behavior

When pushing large multi-container docker-compose projects to the Balena builders, the push operation fails in about 90% of cases with the error message below:

ECONNRESET: aborted

Error: aborted
    at TLSSocket.socketCloseListener (node:_http_client:462:19)
    at TLSSocket.emit (node:events:532:35)
    at TLSSocket.emit (node:domain:488:12)
    at node:net:338:12
    at TCP.done (node:_tls_wrap:659:7)

The behavior is not consistent:

  • There seems to be no correlation to specific builders as far as I can tell
  • The build succeeds slightly more often on my M2 MacBook Air than on my M1 Mac Studio, but not by much. The setups are almost the same, except that the former uses Node 22 and the latter Node 20.
  • The aborted builds often still show up in the 'Releases' tab on Balena and sometimes complete successfully there. However, there is no way to know this without manually checking that tab every once in a while.

The command used to build is very simple:

balena push myFleet --release-tag description "debug" --draft
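
Since the failure is intermittent, one stopgap is to wrap the push in a retry loop. The following is a minimal sketch, not a fix: `retry` is a hypothetical helper (not part of balena CLI), and it assumes that `balena push` exits with a non-zero status when the connection is aborted, which has not been verified here. Because aborted builds sometimes still complete on the builder, a retry may also produce an extra draft release.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command until it succeeds or the attempt
# limit is reached.
#   $1 = maximum number of attempts
#   $2 = delay between attempts, in seconds
#   remaining arguments = the command to run
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while true; do
    # Run the command; a zero exit status counts as success.
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi
    echo "attempt ${attempt} failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Usage with the command from this report (assumes a non-zero exit on abort):
# retry 5 30 balena push myFleet --release-tag description "debug" --draft
```

The attempt count and delay are arbitrary; checking the 'Releases' tab before retrying would avoid duplicate releases.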

Here is one of the builds that failed, on the machine that has a slightly higher success rate:

❯ balena push myFleet --release-tag description "debug" --draft --debug
----------------------------------------------------------------------
[Warn] Node.js version "22.2.0" does not satisfy requirement "^20.6.0"
[Warn] This may cause unexpected behavior.
----------------------------------------------------------------------
[debug] new argv=[/opt/homebrew/Cellar/node/22.2.0/bin/node,/opt/homebrew/bin/balena,push,jetson-test,--release-tag,description,lpm debug,--draft] length=8
[debug] Deprecation check: 6.81196 days since last npm registry query for next major version release date.
[debug] Will not query the registry again until at least 7 days have passed.
[Debug]   Using build source directory: . 
(node:28123) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
[Debug]   Pushing to cloud for fleet: myFleet
[debug] Event tracking error: Timeout awaiting 'response' for 0ms
| Packaging the project source...[Debug]   Tarring all non-ignored files...
[Debug]   docker-compose.yml file found at "/Users/user/Documents/Work/project"
/ Packaging the project source...[Debug]   Tarring complete in 353 ms
[debug] Connecting to builder at https://builder.balena-cloud.com/v3/build?slug=gh_timwedde%2Fjetson-test&dockerfilePath=&emulated=false&nocache=false&headless=false&isdraft=true
\ Uploading source package to https://builder.balena-cloud.com[debug] received HTTP 200 OK
[debug] handling message: {"type":"metadata","resource":"buildLogId","value":"3047436"}
[debug] handling message: {"message":"\u001b[36m[Info]\u001b[39m         Starting build for myFleet, user gh_timwedde"}
[Info]         Starting build for myFleet, user gh_timwedde
[debug] handling message: {"message":"\u001b[36m[Info]\u001b[39m         Dashboard link: https://dashboard.balena-cloud.com/apps/ID/devices"}
[Info]         Dashboard link: https://dashboard.balena-cloud.com/apps/ID/devices
ECONNRESET: aborted

Error: aborted
    at TLSSocket.socketCloseListener (node:_http_client:462:19)
    at TLSSocket.emit (node:events:532:35)
    at TLSSocket.emit (node:domain:488:12)
    at node:net:338:12
    at TCP.done (node:_tls_wrap:659:7)

For further help or support, visit:
https://www.balena.io/docs/reference/balena-cli/#support-faq-and-troubleshooting


[debug] Timeout reporting error to sentry.io

Steps to Reproduce the Problem

Hard to say; I don't know whether this is generally reproducible. It seems to occur with larger multi-container builds, though.
My particular build is massive (in terms of final Docker image sizes, at least), ending up at about 40-50 GB. This is bad and I'm aware of that, but since I'm building for a Jetson and need multiple distinct containers that use GPU acceleration, I have to ship the entire driver stack several times, which bloats the image sizes considerably. I assume I'm getting kicked off the builders because of cache or image sizes, but the error message does not say so, and I could not find any documented hard limits, so I'm unsure what the source of the issue is.

Specifications

  • balena CLI version: 18.2.4
  • Cloud backend: balenaCloud?
  • Operating system version: macOS 14.5
  • 32/64 bit OS and processor: 64-bit OS, ARM processor
  • Install method: Executable installer
