Description
Hi there, we are using L4WFPPROXY to route traffic to and from a sidecar proxy that runs in a pod on a Windows node. We are running this setup on an AKS Kubernetes cluster whose nodes run image version AKSWindows-2022-containerd-20348.4297.251027. Our workflow is essentially as follows:
- A CNI plugin is invoked for every newly created pod
- The CNI resolves the pod metadata to an endpoint ID
- An L4WFPPROXY policy is applied to that endpoint
The policy that we add looks like:
{
"FilterTuple": {
"Protocols": "6"
},
"InboundExceptions": {
"PortExceptions": [
"4193",
"4192"
]
},
"InboundProxyPort": "4143",
"OutboundProxyPort": "4140",
"Type": "L4WFPPROXY",
"UserSID": "S-1-5-20"
}
From my perspective this looks quite vanilla. The problem is that after a certain number of policies have been added (~15), policy application starts to fail even though the API call succeeds. It works for the first 10 pods or so and then starts failing at random. We observe this when scaling a single workload to multiple replicas, so there is nothing apparently different between these workloads/endpoints. The behavior is isolated to a single node.
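For anyone trying to reproduce this, here is a minimal sketch of how the policy body above could be generated per endpoint. The helper name `build_l4_proxy_policy` and its parameters are hypothetical; the field names and values are copied from the policy shown above, and the sketch does not call HNS itself.

```python
import json


def build_l4_proxy_policy(inbound_port, outbound_port, exception_ports,
                          user_sid="S-1-5-20", protocol=6):
    """Build the L4WFPPROXY policy body shown in this issue (hypothetical helper).

    All values are emitted as strings because that is how they appear
    in the policy above. Protocol 6 is TCP; S-1-5-20 is NETWORK SERVICE.
    """
    return {
        "FilterTuple": {"Protocols": str(protocol)},
        "InboundExceptions": {
            "PortExceptions": [str(port) for port in exception_ports],
        },
        "InboundProxyPort": str(inbound_port),
        "OutboundProxyPort": str(outbound_port),
        "Type": "L4WFPPROXY",
        "UserSID": user_sid,
    }


policy = build_l4_proxy_policy(4143, 4140, [4193, 4192])
print(json.dumps(policy, indent=2))
```

The same body is produced for every replica, which matches the observation that nothing differs between the endpoints that succeed and those that fail.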
Running Get-WinEvent -ProviderName "Microsoft-Windows-Host-Network-Service" | Where-Object { $_.Message -like "*HNS-Policy-Apply*" } | ConvertTo-Json -Depth 20 returns the following for a successful policy application:
{
"Id": 1057,
"Version": 0,
"Qualifiers": null,
"Level": 4,
"Task": 0,
"Opcode": 0,
"Keywords": -9223372036854775808,
"RecordId": 41,
"ProviderName": "Microsoft-Windows-Host-Network-Service",
"ProviderId": "0c885e0d-6eb6-476c-a048-2457eed3a5c1",
"LogName": "Microsoft-Windows-Host-Network-Service-Admin",
"ProcessId": 3500,
"ThreadId": 5732,
"MachineName": "akswin000000",
"UserId": {
"BinaryLength": 12,
"AccountDomainSid": null,
"Value": "S-1-5-18"
},
"TimeCreated": "/Date(1765202244020)/",
"ActivityId": null,
"RelatedActivityId": null,
"ContainerLog": "Microsoft-Windows-Host-Network-Service-Admin",
"MatchedQueryIds": [],
"Bookmark": {},
"LevelDisplayName": "Information",
"OpcodeDisplayName": "Info",
"TaskDisplayName": null,
"KeywordsDisplayNames": [],
"Properties": [
{
"Value": "HNS-Policy-Apply"
},
{
"Value": "ba96c595-9544-4b97-9195-a163bc92818e"
},
{
"Value": "358bad56-900f-4dc7-8f12-9e9009279bdc"
},
{
"Value": 17
},
{
"Value": 0
}
],
"Message": "HNS-Policy-Apply :- \r\n Endpoint id = '{ba96c595-9544-4b97-9195-a163bc92818e}'.\r\n Network id = '{358bad56-900f-4dc7-8f12-9e9009279bdc}'.\r\n Policy type = 'L4WFPPROXY'.\r\n Result code = '0x0'."
}
But often I get:
{
"Id": 1056,
"Version": 0,
"Qualifiers": null,
"Level": 2,
"Task": 0,
"Opcode": 0,
"Keywords": -9223372036854775808,
"RecordId": 3863,
"ProviderName": "Microsoft-Windows-Host-Network-Service",
"ProviderId": "0c885e0d-6eb6-476c-a048-2457eed3a5c1",
"LogName": "Microsoft-Windows-Host-Network-Service-Admin",
"ProcessId": 3500,
"ThreadId": 972,
"MachineName": "akswin000000",
"UserId": {
"BinaryLength": 12,
"AccountDomainSid": null,
"Value": "S-1-5-18"
},
"TimeCreated": "/Date(1765204269427)/",
"ActivityId": null,
"RelatedActivityId": null,
"ContainerLog": "Microsoft-Windows-Host-Network-Service-Admin",
"MatchedQueryIds": [],
"Bookmark": {},
"LevelDisplayName": "Error",
"OpcodeDisplayName": "Info",
"TaskDisplayName": null,
"KeywordsDisplayNames": [],
"Properties": [
{
"Value": "HNS-Policy-Apply"
},
{
"Value": "e12c4dae-567b-459d-948a-4b7dc72b119b"
},
{
"Value": "358bad56-900f-4dc7-8f12-9e9009279bdc"
},
{
"Value": 17
},
{
"Value": -2144206793
}
],
"Message": "HNS-Policy-Apply :- \r\n Endpoint id = '{e12c4dae-567b-459d-948a-4b7dc72b119b}'.\r\n Network id = '{358bad56-900f-4dc7-8f12-9e9009279bdc}'.\r\n Policy type = 'L4WFPPROXY'.\r\n Result code = '0x80320037'."
}
Is this a problem in my configuration or in the way I am applying the policy, or am I hitting some internal threshold? I would expect to be able to apply more than 15 policies of this type. Any pointers would be greatly appreciated.
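To relate the two events, note that the signed value in the last property (-2144206793) is the same number as the hex result code in the message (0x80320037). Below is a small sketch that decodes it and tallies results from the exported events. The property order (index 0 = event name, index 4 = result code) is an assumption taken from the two events shown above, and the function names are hypothetical. The facility field decodes to 0x32, which to my knowledge is the range used by Windows Filtering Platform (FWP_E_*) errors; please treat that as an assumption to check against winerror.h.

```python
import json


def to_hresult(value: int) -> str:
    """Render a signed 32-bit result code as an unsigned hex HRESULT string."""
    return f"0x{value & 0xFFFFFFFF:08X}"


def split_hresult(value: int) -> dict:
    """Split an HRESULT into its severity bit, 11-bit facility and 16-bit code."""
    unsigned = value & 0xFFFFFFFF
    return {
        "failure": bool(unsigned >> 31),
        "facility": (unsigned >> 16) & 0x7FF,
        "code": unsigned & 0xFFFF,
    }


def summarize(events: list) -> dict:
    """Count HNS-Policy-Apply events per result code.

    Assumes each event has the "Properties" layout seen in this issue.
    """
    counts = {}
    for event in events:
        props = [prop["Value"] for prop in event["Properties"]]
        if props[0] != "HNS-Policy-Apply":
            continue
        key = to_hresult(props[4])
        counts[key] = counts.get(key, 0) + 1
    return counts


# Trimmed copies of the two events above; only "Properties" is kept.
# ConvertTo-Json emits an array when more than one event is returned.
sample = json.loads("""
[
  {"Properties": [{"Value": "HNS-Policy-Apply"},
                  {"Value": "ba96c595-9544-4b97-9195-a163bc92818e"},
                  {"Value": "358bad56-900f-4dc7-8f12-9e9009279bdc"},
                  {"Value": 17}, {"Value": 0}]},
  {"Properties": [{"Value": "HNS-Policy-Apply"},
                  {"Value": "e12c4dae-567b-459d-948a-4b7dc72b119b"},
                  {"Value": "358bad56-900f-4dc7-8f12-9e9009279bdc"},
                  {"Value": 17}, {"Value": -2144206793}]}
]
""")

print(summarize(sample))
print(split_hresult(-2144206793))
```

Running this over the full export, sorted by TimeCreated, should show after how many successful applications the first failure appears, which would help confirm or rule out a fixed threshold.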