@kravietz

* Address stateful active response issues #4738 
* Use Wazuh recommended [Python template](https://documentation.wazuh.com/current/user-manual/capabilities/active-response/custom-active-response-scripts.html) for active response script
* Improve logging
* Tested in a real-life environment
@AdSchellevis
Member

@kravietz I can accept changes to my script, but not a rewrite I'm afraid. The existing script is not terribly difficult to read, so please stay within the lines of the existing one and update what is needed to fix the issue in question.

@kravietz
Author

kravietz commented Dec 15, 2025

@AdSchellevis Please do not take this personally, but the existing script has very confusing control flow and error handling. I initially tried to edit it, but kept running into problems caused by exactly that, mostly control flow issues that stem from non-exhaustive evaluation of states. The Wazuh-supplied script, on the other hand, does the same thing in a very clean way, literally leaving a couple of placeholders to customise the action, which I did. I'm not sure why you're reluctant to replace it, but from a maintainability point of view it's a 100% win. In other words, having already spent a few hours trying to get the old code working, I'm not ready to spend more time on it. The new one simply does the job.

@fichtner
Member

It's not personal. It's just high risk imposed on the project for little benefit.

Cheers,
Franco

@kravietz
Author

But the current code is not working at all due to the abort logic error - so the risk has already materialised 🤷🏻 The new code is tested and working. If you want to fix the old code, the primary bug is in line 124:

"parameters": {
    "keys": [event['parameters']['alert']['rule']['id']]
}

The rule_id parameter must be replaced by srcip, like below:

"parameters": {
    "keys": [srcip]
}

That's the theory. Even with that change the script behaved weirdly when faced with abort messages, which I believe is caused by the control flow issues. That is why I gave the Wazuh one a try, and it works like a charm.
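To make the proposed change concrete, here is a minimal sketch of the stateful exchange keyed on the source IP. The message layout follows my reading of the Wazuh custom active response documentation and is not copied from the existing script; the function names and the `opnsense-fw` origin name are hypothetical.

```python
import json


def build_check_keys(srcip, name="opnsense-fw"):
    """Build the check_keys control message, keyed on the source IP.

    The origin name is a placeholder chosen for this sketch.
    """
    return json.dumps({
        "version": 1,
        "origin": {"name": name, "module": "active-response"},
        "command": "check_keys",
        "parameters": {"keys": [srcip]},
    })


def parse_reply(raw):
    """Map the agent's reply to 'continue', 'abort' or 'invalid'.

    Every input ends up in exactly one of the three states, so the
    caller never has to deal with a half-parsed message.
    """
    try:
        command = json.loads(raw).get("command")
    except (ValueError, TypeError, AttributeError):
        return "invalid"
    return command if command in ("continue", "abort") else "invalid"
```

Returning one of three explicit states is a design choice aimed at the control flow problem mentioned above: unparsable input and unknown commands are handled the same way instead of falling through.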

@paracetamol32

paracetamol32 commented Dec 17, 2025

I have commented out this part, and now pfctl -t __wazuh_agent_drop -T show shows multiple IPs and not only one:

#        if params.input == '/dev/stdin':
#            timeout_event = None
#            try:
#                timeout_event=json.loads(read_data(params.input))
#            except ValueError:
#                pass
#            if timeout_event:
#                send_log('Received : %s' % json.dumps(timeout_event))
#                if timeout_event.get('command') == 'abort':
#                    send_log('Aborted')
#                    return 0
#                elif timeout_event.get('command') != 'continue':
#                    send_log('Invalid command')
#                    return -1

@kravietz
Author

@paracetamol32 Thanks, what do you mean by "multiple IPs"? That several different IPs are added to the table, or the same IP added several times? I would expect that the table deduplicates added entries, but only the decoded srcip should be added for each run of the active response script.

I have done some more digging into the wazuh-agent code to understand why that abort is needed at all.

From the code linked above, it seems the sole purpose of abort is to prevent adding the same banned entry over and over again, which in the case of a file-based response (e.g. hosts.deny) would result in a series of duplicates when the same alert triggers the same response within a short period of time. That's why they introduced the stateful protocol.

But in the case of the pf response we don't really care about it, since adding an existing entry to the table is simply a no-op:

root@nagual:~ # pfctl -t __wazuh_agent_drop -T show | grep 51.68
   51.68.107.157
   51.68.111.244
   51.68.236.73
root@nagual:~ # pfctl -t __wazuh_agent_drop -T add 51.68.111.214
1/1 addresses added.
root@nagual:~ # pfctl -t __wazuh_agent_drop -T add 51.68.111.214
0/1 addresses added.
root@nagual:~ # pfctl -t __wazuh_agent_drop -T show | grep 51.68
   51.68.107.157
   51.68.111.214
   51.68.111.244
   51.68.236.73

Should we consider switching back to the much simpler stateless protocol, which only involves:

  1. parse JSON, extract srcip
  2. add to table
  3. end the response without sending any messages back to the agent or waiting for abort/continue

When the configured timeout expires, the Wazuh Agent simply calls the script again with the delete action.
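The stateless flow above can be sketched roughly as follows. This is an illustration only: the function name is hypothetical, the location of srcip under parameters.alert.data is an assumption about the event layout, and the function only builds the pfctl argument list rather than running it.

```python
import ipaddress
import json

TABLE = "__wazuh_agent_drop"  # table name taken from the pfctl output above


def pfctl_args(raw_event, table=TABLE):
    """Turn one active response event into a pfctl argument list.

    Assumes srcip lives at parameters.alert.data.srcip; adjust the
    lookup if the decoded alert stores it elsewhere.
    """
    event = json.loads(raw_event)
    command = event.get("command")
    if command not in ("add", "delete"):
        raise ValueError("unsupported command: %r" % command)
    srcip = event["parameters"]["alert"]["data"]["srcip"]
    # Raises ValueError if srcip is not a valid IPv4/IPv6 address.
    ipaddress.ip_address(srcip)
    return ["pfctl", "-t", table, "-T", command, srcip]
```

The resulting list could be handed to subprocess.run as-is. No check_keys message is sent and nothing is read back from the agent, so the same code path serves both the initial add and the later delete.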

@mbedworth
Contributor

mbedworth commented Dec 27, 2025

I can say I ran into a similar issue when my firewall detected the same rule being triggered by more than one IP. The "keys": [event['parameters']['alert']['rule']['id']] value would be the same for each external device that breached the rule, so only the first one would be added via the add -> continue flow. Each breach of the rule after that would be an add -> abort. Wazuh was therefore not truly useful, since each rule would only work once until the first entry expired, and then it would work again once. I resolved this by changing the line (line 103, I believe) into two lines:
unique_key = "%s-%s" % (event['parameters']['alert']['rule']['id'], srcip)
"keys": [unique_key]
This seems to make the key unique, so more than one IP can breach the rule and get added to the block list. I can write this all up in more detail if you need it.

edit: I put my better writeup in this issue (it more directly applies): #4738
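The effect of the key choice described above can be reproduced with a toy model. This is a simplification: the agent's bookkeeping is modelled as a plain set of active keys, which is an assumption and not the actual wazuh-execd implementation, and the rule id 5712 is only an example value.

```python
def simulate(keys_per_alert):
    """Return the modelled agent answer for each submitted key.

    A key that is already active is answered with 'abort'; a new key
    is recorded and answered with 'continue'.
    """
    active = set()
    answers = []
    for key in keys_per_alert:
        if key in active:
            answers.append("abort")
        else:
            active.add(key)
            answers.append("continue")
    return answers


rule_id = "5712"  # example value only
ips = ["51.68.107.157", "51.68.111.244"]

# Keyed on the rule id alone: the second IP collides with the first.
by_rule = simulate([rule_id for _ in ips])

# Keyed on rule id plus source IP: each IP gets its own key.
by_rule_and_ip = simulate(["%s-%s" % (rule_id, ip) for ip in ips])
```

In this model `by_rule` contains one continue followed by one abort, while `by_rule_and_ip` contains two continues, which matches the behaviour reported in the comment.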

@AdSchellevis
Member

close in favor of #5104


5 participants