net/tcp/udp: Modify calls of net_iob*alloc calls to be unthrottled for sends, throttled for receives. #17358
|
This looks very promising! Thanks @mennovf. |
|
In practical work scenarios, unless the application deliberately never reads its packets, the IOBs held in readahead will always be consumed, and ioc_alloc_committed ensures that a sending thread that was previously waiting obtains the freed IOB. If the sending direction is unthrottled, however, the write queue can fill up with IOBs during a TCP send. The driver is then unable to receive any new packet (especially a TCP ACK), so the buffers in the write queue are never released and the entire IP protocol stack hangs. This is a fatal problem. For the overall robustness of the IP protocol stack, I therefore recommend using throttled IOBs for the sending direction and unthrottled IOBs for the receiving direction. If you agree with my viewpoint, we should modify the description in Kconfig. This is my understanding of the issue, offered for your reference. |
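The hang described in this comment can be sketched with a toy model. This is only an illustration under stated assumptions: `ack_can_arrive` and its parameters are hypothetical names, not NuttX functions, and the model assumes the driver's allocation for an incoming ACK is unthrottled while the sender greedily queues every buffer it is allowed to take.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, not the NuttX implementation.
 *
 * nbuffers       - total buffers in the shared pool
 * reserve        - plays the role of CONFIG_IOB_THROTTLE
 * send_throttled - whether the send path allocates throttled
 *
 * Returns true if the driver can still obtain one buffer for an
 * incoming TCP ACK after the sender has filled its write queue.
 */
static bool ack_can_arrive(int nbuffers, int reserve, bool send_throttled)
{
    assert(nbuffers >= 0 && reserve >= 0);

    /* A throttled allocation may not dip into the reserve. */
    int floor = send_throttled ? reserve : 0;
    int nfree = nbuffers;

    /* The sender queues as many buffers as it is allowed to take. */
    while (nfree > floor) {
        nfree--;
    }

    /* Assumption: the driver's receive allocation is unthrottled,
     * so it only needs one free buffer of any kind.
     */
    return nfree > 0;
}
```

In this model, `ack_can_arrive(8, 2, true)` leaves the reserve available for the ACK, while `ack_can_arrive(8, 2, false)` and `ack_can_arrive(8, 0, true)` do not, which mirrors the scenario where the write queue can never drain.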
ae4686d to
8cbab7d
Compare
|
Yes, that's the dual of the problem I'm experiencing. I think the essential issue is that neither write nor read should be able to starve the other's memory, so compiling with either NET_SEND_BUFSIZE or NET_RECV_BUFSIZE too small to contain a packet is just not correct. Both the receive and transmit ends should always have enough memory to hold at least one packet to ensure progress. |
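The progress condition in this comment could be expressed as a sanity check over the configuration values. The sketch below is hypothetical: the struct, its field names, and `budget_allows_progress` do not exist in NuttX, and the final check (both budgets fitting in the pool together) is one possible reading of "neither side should starve the other", not something the comment states outright.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the relevant settings, in bytes
 * (except iob_nbuffers, which is a count).
 */
struct net_budget {
    size_t iob_nbuffers;  /* stands in for CONFIG_IOB_NBUFFERS */
    size_t iob_bufsize;   /* stands in for CONFIG_IOB_BUFSIZE */
    size_t send_bufsize;  /* stands in for the send buffer limit */
    size_t recv_bufsize;  /* stands in for the receive buffer limit */
    size_t max_packet;    /* largest packet either side must hold */
};

/* Returns true if each direction can hold at least one packet and
 * (assumption) both budgets fit in the pool at the same time.
 */
static bool budget_allows_progress(const struct net_budget *b)
{
    assert(b != NULL);

    size_t pool = b->iob_nbuffers * b->iob_bufsize;

    if (b->send_bufsize < b->max_packet) {
        return false;  /* the send side cannot hold one packet */
    }
    if (b->recv_bufsize < b->max_packet) {
        return false;  /* the receive side cannot hold one packet */
    }
    return b->send_bufsize + b->recv_bufsize <= pool;
}
```

A check of this shape would be run against the board's configuration at build or boot time; the numbers used to exercise it are illustrative only.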
|
Does this close/resolve #17299 or is it just one piece of the puzzle? |
|
@mennovf suggestion: since you faced these issues with the net IOB throttle, maybe you could include a small section in Documentation/components/net/netdriver.rst about this feature, to give an overview for someone willing to use it, along with the implications of disabling it and/or read-ahead. Maybe also add some testing examples at https://nuttx.apache.org/docs/latest/guides/testingtcpip.html This is just a suggestion, because NuttX documentation is very shy! So we need to use every opportunity we have to improve it. :-) |
…r sends, throttle for receives.
The Kconfig option CONFIG_IOB_THROTTLE is used to limit the amount allocated by TCP (and UDP) receives so as not to starve sends. This distinction is made via a 'throttle' argument passed to the iob_*alloc functions, which are wrapped by net_iob*alloc. Previously the udp/tcp_wrbuffer_write functions incorrectly allocated with throttle=true, effectively making the IOB_THROTTLE option useless. This patch modifies the calls in udp/tcp_wrbuffer_write to allocate unthrottled, and fixes an unthrottled allocation in the TCP receive path. There were also several locations in the receive path that incorrectly allocated unthrottled.
Signed-off-by: Menno Vanfrachem <mennovanfrachem@hotmail.com>
Modify receives to allocate throttled
8cbab7d to
a0dc172
Compare
@mennovf, with your change, how do we handle the case described by @zhhyu7? "If the sending direction is unthrottled, during the TCP sending process, if the buffer in the write queue is full of IOB, the driver will be unable to receive any new packet (especially TCP-ACK), and the buffer in the write queue will never be released, causing the entire IP protocol stack to hang. This is a fatal problem." |
|
Tried running the IPERF test and it works at first, but then it fails on edit: it is also failing on master. I think it is still starving the IOBs, but I'm not sure. |
Yes, the TCP sending will be blocked. Note that this is already an issue if _IOB_THROTTLE=0, regardless of whether sending or receiving is marked "throttled".
@fdcavalcanti I can't reproduce it. Which relevant options are you using? I did seemingly run into another bug using iperf, where the socket doesn't seem to get cleaned up properly on Ctrl-C. |
I use the default defconfig on |
|
I've seen this IOB deadlock issue too. I've been working around it by simply increasing the number of buffers available, but that isn't a good solution. Are you guys planning to move this pull request ahead, or are you parking it for now? |
|
I think the THROTTLE option is a dead-end due to the TCP remarks. IMO the memory management needs an overhaul. In the meantime I can resolve my issue by fixing RECV/SEND_BUFSIZE in a fork and correctly configuring the system with these parameters. |
Can you show an example of this approach? |
|
I recently upgraded from NuttX 10.3.0 to NuttX 12.12.0 and encountered this issue, which wasn’t a problem before. Previously, I was running with CONFIG_IOB_THROTTLE=0 without any issues. @xiaoxiang781216 @acassis Could you clarify why the network stack appears to have regressed to the point where deadlocks are possible? This is quite concerning. |
|
@PetervdPerk-NXP this was already an issue with NuttX 10.3.0, but it seems that it only manifests under certain conditions / network pressure. This ticket and PR were opened by @mennovf because we are hitting this issue with current PX4 on our custom board. See PX4/PX4-Autopilot#25956. NuttX 12.12 might make the conditions that trigger this easier to reach, but it was definitely an issue before. |
Surely, but with 10.3.0 at least the IOB pool could get full without deadlocking this easily. Right now the first occurrence of a full IOB pool with TCP traffic yields a deadlock for me. This is most likely more of a logical/state-machine problem.
NuttX 10.3.0 working fine with this config
NuttX 12.12.0 config needed to avoid deadlocks
Overall, the throughput appears quite inconsistent, even though the CPU load remains relatively low, at around 6% from the thread generating the TCP data. I understand that @xiaoxiang781216 and his team have been working on rewriting the NET/IOB stack for several years, but I was expecting performance improvements rather than a regression.
Edit: the change in NuttX 12.12.0 default settings seems to make it easier to reproduce. |
Could you share the hardware/defconfig and repro steps? If we have the hardware, @zhhyu7 could help to identify the root cause. |
Hi @PetervdPerk-NXP, the protocol stack can get stuck when the transmit queue fills up the IOB pool: the driver then fails to allocate an IOB, TCP cannot process the TCP ACK, and the TCP write buffer resources are never released. This issue should always have existed, especially in scenarios where the total amount of IOB memory is small, such as less than 16k. Another scenario that can cause the protocol stack to get stuck is when the application never reads the packets in the protocol stack's readahead. The probability of the second scenario occurring can be reduced by limiting CONFIG_NET_RECV_BUFSIZE. Let's focus on the first scenario and on avoiding its occurrence. Optimizing performance requires further specialized analysis and solution design. On different products, it is necessary to configure CONFIG_IOB_NBUFFERS, CONFIG_IOB_BUFSIZE, CONFIG_NET_RECV_BUFSIZE, CONFIG_NET_SEND_BUFSIZE, CONFIG_IOB_THROTTLE, etc. reasonably, taking the memory allocation situation into account, so that the protocol stack works more efficiently and robustly. |
I pushed an enhanced patch for this scenario: #18011
Summary
Addresses this issue: #17299
The Kconfig option CONFIG_IOB_THROTTLE is used to limit the amount allocated by TCP (and UDP) receives so as not to starve sends. This distinction is made via a 'throttle' argument passed to the iob_*alloc functions, which are wrapped by net_iob*alloc. Previously the udp/tcp_wrbuffer_write functions incorrectly allocated with throttle=true, effectively making the IOB_THROTTLE option useless.
This patch modifies the calls in udp/tcp_wrbuffer_write to allocate unthrottled.
There were also several locations in the receive path that incorrectly allocated unthrottled; these now allocate throttled.
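As a rough illustration of the throttle semantics described above, the following toy model shows how a 'throttle' flag can keep a reserve of buffers out of reach of one class of callers. It is a sketch only: `toy_pool` and `toy_alloc` are hypothetical names and do not reproduce the real iob_*alloc signatures or their blocking behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy pool, not the NuttX IOB allocator. */
struct toy_pool {
    int nfree;    /* buffers currently free */
    int reserve;  /* plays the role of CONFIG_IOB_THROTTLE */
};

/* Takes one buffer if allowed and returns true, else returns false.
 * A throttled caller may only allocate while more than 'reserve'
 * buffers are free; an unthrottled caller may drain the pool.
 */
static bool toy_alloc(struct toy_pool *pool, bool throttled)
{
    assert(pool != NULL);

    int floor = throttled ? pool->reserve : 0;

    if (pool->nfree <= floor) {
        return false;  /* nothing left for this class of caller */
    }

    pool->nfree--;
    return true;
}
```

With this PR's intent, the receive path would pass `throttled = true` and the send path `throttled = false`, so that a burst of received packets stops allocating once only the reserve is left and sends can still obtain buffers.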
Impact
This should not have an effect during normal operation. This change is only in effect when CONFIG_IOB_THROTTLE > 0 and there's high reception load. In that case the system should keep operating and not deadlock on a sendto() call on a blocking socket without timeout.
Testing
I ran iperf -s -B 10.0.1.2 -u & in the simulator target, with the other end, iperf -c -b 100M ..., on the host machine, while periodically executing cat /proc/iobinfo to check the state of the IOB MM. The sim target is compiled with CONFIG_IOB_THROTTLE=128.
When compiled from master, I get (worst-case):
Here you can see that nfree < CONFIG_IOB_THROTTLE(=128) even though only the receive path is exercised.
After the changes the worst-case is:
So nfree > CONFIG_IOB_THROTTLE.
There are still several places in other drivers that don't respect the throttle parameter e.g. drivers/wireless/ieee802154/xbee/xbee.c:265