Skip to content

Threads blocked causing missing attestations and proposals #46

@slavik0329

Description

@slavik0329

Describe the bug

In the last 3-4 weeks I have had missed and orphaned blocks and also have been noticing missed attestations as well. Looking at the logs of web3signer I notice multiple errors relating to threads being blocked. I have tried rebooting and even erasing the web3signer volume and starting with a clean db and is still happening. I have been running this node since genesis.

To Reproduce

Its just happening with a standard geth + prysm setup

Screenshots

2023-07-03 12:43:45.488+00:00 | ForkJoinPool-1-worker-1 | INFO  | SignerLoader | Total signers loaded from configuration files: 36 in 00:00:22.193 with error count 0
2023-07-03 12:43:45.641+00:00 | pool-2-thread-1 | INFO  | RegisteredValidators | Validators registered successfully in database:36
2023-07-03 12:43:45.644+00:00 | pool-2-thread-1 | INFO  | DefaultArtifactSignerProvider | Total signers (keys) currently loaded in memory: 36
2023-07-03 12:43:46.370+00:00 | main | INFO  | Runner | Web3Signer has started with TLS disabled, and ready to handle signing requests on 0.0.0.0:9000
2023-07-03 13:02:53.482+00:00 | vertx-blocked-thread-checker | WARN  | BlockedThreadChecker | Thread Thread[vert.x-eventloop-thread-0,5,main] has been blocked for 2975 ms, time limit is 2000 ms
2023-07-03 13:08:34.412+00:00 | ForkJoinPool.commonPool-worker-1 | WARN  | PoolBase | HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@1b124b66 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2023-07-03 13:08:34.412+00:00 | ForkJoinPool.commonPool-worker-2 | WARN  | PoolBase | HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@141d3d43 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2023-07-03 13:09:48.319+00:00 | vertx-blocked-thread-checker | WARN  | BlockedThreadChecker | Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 2899 ms, time limit is 2000 ms
2023-07-03 13:10:05.912+00:00 | ForkJoinPool.commonPool-worker-1 | WARN  | PoolBase | HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@3a36c51f (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2023-07-03 13:10:10.175+00:00 | ForkJoinPool.commonPool-worker-2 | WARN  | PoolBase | HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@33dcae3a (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2023-07-03 13:11:36.168+00:00 | ForkJoinPool.commonPool-worker-2 | WARN  | PoolBase | HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@4b45e7b2 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2023-07-03 13:12:24.410+00:00 | ForkJoinPool.commonPool-worker-2 | WARN  | PoolBase | HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@60283682 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2023-07-03 13:12:24.410+00:00 | ForkJoinPool.commonPool-worker-1 | WARN  | PoolBase | HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@62118fa2 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2023-07-03 13:13:16.461+00:00 | ForkJoinPool.commonPool-worker-1 | WARN  | PoolBase | HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@4aedcb96 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2023-07-03 13:13:18.991+00:00 | ForkJoinPool.commonPool-worker-5 | WARN  | PoolBase | HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@508ba67a (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2023-07-03 13:13:18.991+00:00 | ForkJoinPool.commonPool-worker-3 | WARN  | PoolBase | HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@4e5387ab (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2023-07-03 13:12:29.492+00:00 | HikariPool-1 housekeeper | WARN  | HikariPool | HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=46s409ms341?s316ns).
2023-07-03 13:12:29.492+00:00 | HikariPool-2 housekeeper | WARN  | HikariPool | HikariPool-2 - Thread starvation or clock leap detected (housekeeper delta=46s409ms339?s713ns).
2023-07-03 13:14:17.168+00:00 | HikariPool-1 housekeeper | WARN  | HikariPool | HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=3m45s331ms846?s383ns).
2023-07-03 13:14:33.544+00:00 | ForkJoinPool.commonPool-worker-3 | WARN  | PoolBase | HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@40946011 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2023-07-03 13:14:29.179+00:00 | ForkJoinPool.commonPool-worker-1 | WARN  | PoolBase | HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@46fff1e9 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2023-07-03 13:15:00.657+00:00 | HikariPool-2 housekeeper | WARN  | HikariPool | HikariPool-2 - Thread starvation or clock leap detected (housekeeper delta=4m29s928ms532?s768ns).
2023-07-03 13:12:28.788+00:00 | vertx-blocked-thread-checker | WARN  | BlockedThreadChecker | Thread Thread[vert.x-eventloop-thread-0,5,main] has been blocked for 7895 ms, time limit is 2000 ms
io.vertx.core.VertxException: Thread blocked
	at io.netty.handler.timeout.IdleStateHandler$AllIdleTimeoutTask.run(IdleStateHandler.java:575) ~[netty-handler-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.handler.timeout.IdleStateHandler$AbstractIdleTask.run(IdleStateHandler.java:475) ~[netty-handler-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[netty-common-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) ~[netty-common-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) ~[netty-common-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[netty-common-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:566) ~[netty-transport-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[netty-common-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.89.Final.jar:4.1.89.Final]
	at java.lang.Thread.run(Unknown Source) ~[?:?]
2023-07-03 13:16:52.231+00:00 | vertx-blocked-thread-checker | WARN  | BlockedThreadChecker | Thread Thread[vert.x-eventloop-thread-0,5,main] has been blocked for 2384 ms, time limit is 2000 ms
2023-07-03 13:17:11.093+00:00 | vertx-blocked-thread-checker | WARN  | BlockedThreadChecker | Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 2991 ms, time limit is 2000 ms
2023-07-03 13:17:48.080+00:00 | ForkJoinPool.commonPool-worker-7 | WARN  | PoolBase | HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@60f28e14 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2023-07-03 13:17:55.773+00:00 | ForkJoinPool.commonPool-worker-2 | WARN  | PoolBase | HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@6cdf291b (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2023-07-03 13:19:37.137+00:00 | vertx-blocked-thread-checker | WARN  | BlockedThreadChecker | Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 2232 ms, time limit is 2000 ms
2023-07-03 13:20:24.477+00:00 | ForkJoinPool.commonPool-worker-2 | WARN  | PoolBase | HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@45d1f567 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2023-07-03 13:21:44.160+00:00 | HikariPool-1 housekeeper | WARN  | HikariPool | HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=1m35s355ms152?s275ns).
2023-07-03 13:21:43.602+00:00 | HikariPool-2 housekeeper | WARN  | HikariPool | HikariPool-2 - Thread starvation or clock leap detected (housekeeper delta=1m47s794ms58?s593ns).
2023-07-03 13:21:45.075+00:00 | vertx-blocked-thread-checker | WARN  | BlockedThreadChecker | Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 9318 ms, time limit is 2000 ms
io.vertx.core.VertxException: Thread blocked
	at sun.nio.ch.EPoll.wait(Native Method) ~[?:?]
	at sun.nio.ch.EPollSelectorImpl.doSelect(Unknown Source) ~[?:?]
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source) ~[?:?]
	at sun.nio.ch.SelectorImpl.select(Unknown Source) ~[?:?]
	at io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:62) ~[netty-transport-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:883) ~[netty-transport-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:526) ~[netty-transport-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[netty-common-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.89.Final.jar:4.1.89.Final]
	at java.lang.Thread.run(Unknown Source) ~[?:?]
2023-07-03 13:22:45.008+00:00 | HikariPool-2 housekeeper | WARN  | HikariPool | HikariPool-2 - Thread starvation or clock leap detected (housekeeper delta=1m4s315ms662?s122ns).
2023-07-03 13:22:45.008+00:00 | HikariPool-1 housekeeper | WARN  | HikariPool | HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=1m4s451ms566?s440ns).
2023-07-03 13:22:51.989+00:00 | vertx-blocked-thread-checker | WARN  | BlockedThreadChecker | Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 2483 ms, time limit is 2000 ms
2023-07-03 13:22:59.380+00:00 | ForkJoinPool.commonPool-worker-1 | WARN  | PoolBase | HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@46f50c99 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2023-07-03 13:22:59.380+00:00 | vertx-blocked-thread-checker | WARN  | BlockedThreadChecker | Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 7427 ms, time limit is 2000 ms
io.vertx.core.VertxException: Thread blocked
	at sun.nio.ch.EPoll.wait(Native Method) ~[?:?]
	at sun.nio.ch.EPollSelectorImpl.doSelect(Unknown Source) ~[?:?]
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source) ~[?:?]
	at sun.nio.ch.SelectorImpl.select(Unknown Source) ~[?:?]
	at io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:62) ~[netty-transport-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:883) ~[netty-transport-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:526) ~[netty-transport-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[netty-common-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.89.Final.jar:4.1.89.Final]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.89.Final.jar:4.1.89.Final]
	at java.lang.Thread.run(Unknown Source) ~[?:?]
2023-07-03 13:26:16.295+00:00 | HikariPool-1 housekeeper | WARN  | HikariPool | HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=48s895ms838?s270ns).
2023-07-03 13:26:19.627+00:00 | ForkJoinPool.commonPool-worker-1 | WARN  | PoolBase | HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@75fbdc85 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2023-07-03 13:26:19.667+00:00 | vertx-blocked-thread-checker | WARN  | BlockedThreadChecker | Thread Thread[vert.x-eventloop-thread-0,5,main] has been blocked for 2097 ms, time limit is 2000 ms
2023-07-03 13:26:16.295+00:00 | HikariPool-2 housekeeper | WARN  | HikariPool | HikariPool-2 - Thread starvation or clock leap detected (housekeeper delta=48s895ms843?s668ns).
2023-07-03 13:26:27.445+00:00 | vertx-blocked-thread-checker | WARN  | BlockedThreadChecker | Thread Thread[vert.x-worker-thread-19,5,main] has been blocked for 65391 ms, time limit is 60000 ms
io.vertx.core.VertxException: Thread blocked
	at sun.management.ThreadImpl.findMonitorDeadlockedThreads(Unknown Source) ~[?:?]
	at io.prometheus.client.hotspot.ThreadExports.addThreadMetrics(ThreadExports.java:79) ~[simpleclient_hotspot-0.9.0.jar:?]
	at io.prometheus.client.hotspot.ThreadExports.collect(ThreadExports.java:123) ~[simpleclient_hotspot-0.9.0.jar:?]
	at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.findNextElement(CollectorRegistry.java:190) ~[simpleclient-0.9.0.jar:?]
	at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:223) ~[simpleclient-0.9.0.jar:?]
	at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:144) ~[simpleclient-0.9.0.jar:?]
	at io.prometheus.client.exporter.common.TextFormat.write004(TextFormat.java:22) ~[simpleclient_common-0.9.0.jar:?]
	at org.hyperledger.besu.metrics.prometheus.MetricsHttpService.lambda$metricsRequest$4(MetricsHttpService.java:205) ~[metrics-core-22.10.3.jar:22.10.3]
	at org.hyperledger.besu.metrics.prometheus.MetricsHttpService$$Lambda$683/0x000000010062be40.handle(Unknown Source) ~[?:?]
	at io.vertx.core.impl.ContextBase.lambda$null$0(ContextBase.java:137) ~[vertx-core-4.3.8.jar:4.3.8]
	at io.vertx.core.impl.ContextBase$$Lambda$687/0x000000010062c6d8.handle(Unknown Source) ~[?:?]
	at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:264) ~[vertx-core-4.3.8.jar:4.3.8]
	at io.vertx.core.impl.ContextBase.lambda$executeBlocking$1(ContextBase.java:135) ~[vertx-core-4.3.8.jar:4.3.8]
	at io.vertx.core.impl.ContextBase$$Lambda$685/0x000000010062c290.run(Unknown Source) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.89.Final.jar:4.1.89.Final]
	at java.lang.Thread.run(Unknown Source) ~[?:?]

DAppNode version:

Core DAppNode Packages versions
bind.dnp.dappnode.eth: 0.2.6
core.dnp.dappnode.eth: 0.2.73
dappmanager.dnp.dappnode.eth: 0.2.68, commit: 4369cbdc
ipfs.dnp.dappnode.eth: 0.2.19
vpn.dnp.dappnode.eth: 0.2.8, commit: f9a8743e
wifi.dnp.dappnode.eth: 0.2.8
System info
dockerComposeVersion: 1.25.5
dockerServerVersion: 19.03.8
dockerCliVersion: 19.03.8
os: debian
versionCodename: bullseye
architecture: amd64
kernel: 5.9.0-1-amd64
Disk usage: 46%

  • Package version:
  • OS:
  • Browser

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions