
SYNC plugin breaks when one node is down #1657

@Rikysonic

Description


What installation are you running?

Production (netalertx) 📦

Is there an existing issue for this?

The issue occurs in the following browsers. Select at least 2.

  • Firefox
  • Chrome
  • Edge
  • Safari (unsupported) - PRs welcome
  • N/A - This is an issue with the backend

Current Behavior

We have 4 nodes in the hub's node list. Today the second node in the list went down. The hub downloaded the device list from the first node, but when the download from the second node failed, the entire SYNC flow was aborted: the third and fourth nodes were skipped and their devices were set to "Offline".

Expected Behavior

The hub should report the failed download from the unreachable node and continue downloading from the other nodes in the list.
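To illustrate the behavior I would expect, here is a minimal sketch of per-node error isolation. It is not the actual code from sync.py: `sync_all_nodes` and `pull_node` are hypothetical names, and `pull_node` stands in for whatever the plugin does to download and store one node's data.

```python
def sync_all_nodes(node_urls, pull_node):
    """Call pull_node(url) for every node; one failing node never stops the loop.

    pull_node is a hypothetical callable (an assumption for this sketch) that
    downloads and stores the data of a single node and raises on any error.
    """
    succeeded, failed = [], []
    for url in node_urls:
        try:
            pull_node(url)
        except Exception as exc:
            # Record the failure and move on to the next node in the list.
            print(f'[SYNC] Node "{url}" failed, continuing: {exc!r}')
            failed.append(url)
        else:
            succeeded.append(url)
    return succeeded, failed
```

With a loop shaped like this, only the unreachable node would end up in `failed`, and the nodes listed after it would still be queried.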

Steps To Reproduce

1 - Add more than one node (or add fake IPs to the nodes list)
2 - As soon as SYNC fails to download from an unreachable node, the entire operation is aborted and all devices managed by the nodes listed after the failing one are set to "Offline"

Relevant app.conf settings

docker-compose.yml

Debug or Trace enabled

  • I have read and followed the steps in the wiki link above and provided the required debug logs and the log section covers the time when the issue occurs.

Relevant app.log section

12:01:05 [Plugin utils] ---------------------------------------------
12:01:05 [Plugin utils]  display_name :  Sync Hub
12:01:05 [Plugins] CMD: python3 /app/front/plugins/sync/sync.py
12:01:05 [Plugins] Timeout: 30
12:01:06 [SYNC] In script
12:01:06 [SYNC] Mode 2: PULL (HUB) - This is a HUB as SYNC_nodes is set
12:01:06 [SYNC] SYNC_hub_url not defined, skipping posting "Plugins" and "Devices" data
12:01:06 [SYNC] Getting data from node: "http://10.3.0.172:20212"
Starting new HTTP connection (1): 10.3.0.172:20212
http://10.3.0.172:20212 "GET /sync HTTP/1.1" 200 421930
12:01:06 [SYNC] Tried endpoint: http://10.3.0.172:20212/sync, status: 200
12:01:06 [SYNC] Device data from node "vin_netalertx_node" written to last_result.vin_netalertx_node.log
12:01:06 [SSE] Broadcasted event: unread_notifications_count_update
12:01:06 [SYNC] Getting data from node: "http://10.5.0.132:20212"
12:01:05 [Plugin utils] Pre-Resolved CMD:  python3 /app/front/plugins/sync/sync.py
12:01:05 [Plugins] Executing: python3 /app/front/plugins/sync/sync.py
12:01:05 [Plugins] Resolved : ['python3', '/app/front/plugins/sync/sync.py']
12:01:06 [plugin_helper] reading config file
12:01:06 [SYNC] In script
12:01:06 [SYNC] Mode 2: PULL (HUB) - This is a HUB as SYNC_nodes is set
12:01:06 [SYNC] SYNC_hub_url not defined, skipping posting "Plugins" and "Devices" data
12:01:06 [SYNC] Getting data from node: "http://10.3.0.172:20212"
12:01:06 [SYNC] Tried endpoint: http://10.3.0.172:20212/sync, status: 200
12:01:06 [SYNC] Device data from node "vin_netalertx_node" written to last_result.vin_netalertx_node.log
12:01:06 [SSE] Broadcasted event: unread_notifications_count_update
12:01:06 [SYNC] Getting data from node: "http://10.5.0.132:20212"
12:01:06 [SYNC] Error calling http://10.5.0.132:20212/sync: HTTPConnectionPool(host='10.5.0.132', port=20212): Max retries exceeded with url: /sync (Caused by NewConnectionError("HTTPConnection(host='10.5.0.132', port=20212): Failed to establish a new connection: [Errno 111] Connection refused"))
12:01:06 [SYNC] Failed to get data from "http://10.5.0.132:20212" via all endpoints
12:01:06 [SSE] Broadcasted event: unread_notifications_count_update
Traceback (most recent call last):
  File "/app/front/plugins/sync/sync.py", line 353, in <module>
    main()
  File "/app/front/plugins/sync/sync.py", line 134, in main
    node_name = response_json.get('node_name', 'unknown_node')
                ^^^^^^^^^^^^^^^^^
AttributeError: 'str' object has no attribute 'get'

12:01:06 [Plugins] ⚠ ERROR - enable LOG_LEVEL=debug and check logs
12:01:06 [Plugins] No output received from the plugin "SYNC"
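
The traceback shows that `response_json` was a `str` when line 134 called `.get` on it, which suggests that the value returned after the failed request was not a parsed JSON object. I have not checked what the fetch step in sync.py returns on failure, so the following is only a sketch of a type guard, with `node_name_or_none` as a hypothetical helper name:

```python
def node_name_or_none(response_json):
    """Return the node name, or None when the response is not a JSON object.

    Hypothetical helper for illustration only; the real sync.py may be
    structured differently.
    """
    if isinstance(response_json, dict):
        return response_json.get('node_name', 'unknown_node')
    # A str (or anything else) means the node returned no usable data.
    return None
```

A caller could skip the node whenever this returns `None` instead of raising `AttributeError`.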

Docker Logs

Nothing more than the default lines when container starts.

Labels

  • Waiting for reply ⏳: Waiting for the original poster to respond, or discussion in progress.
  • bug 🐛: Something isn't working
  • next release/in dev image 🚀: This is coming in the next release or was already released if the issue is Closed.
