Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lwip - Fix memory leak in k64f cyclic-buffer overflow #3135

Merged
merged 2 commits into from
Nov 1, 2016

Conversation

geky
Copy link
Contributor

@geky geky commented Oct 25, 2016

This is hopefully a fix for #3118, where @infinnovation put in a lot of work in tracking down the exact problem.

This was actually several bugs colluding together.

  1. Confusion on the buffer-semaphore paradigm used led to misuse of the tx semaphore and potential for odd behaviour.
  2. Equality tests on tx_consume_index and tx_produce_index did not handle overflow correctly. This would allow tx_consume_index to catch up to tx_produce_index and trick the k64f_rx_reclaim function into forgetting about a whole buffer of pbufs.
  3. On top of all of that, the ENET_BUFFDESCRIPTOR_TX_READ_MASK was not correctly read immediately after being set due to either a compiler optimization or hardware delays. This caused k64f_low_level_output to eagerly overrun existing buff-descriptors before they had been completely sent. Adopting the counting-semaphore paradigm for 1 avoided this concern.

As pointed out by @infinnovation, the overflow only occurs in the rare case that the 120MHz CPU can actually generate packets faster than the ENET hardware can transmit on a 100Mbps link.

related issues #3118, #2553
superseeds #3124
cc @infinnovation, @bogdanm, @kjbracey-arm

This was actually several bugs colluding together.

1. Confusion on the buffer-semaphore paradigm used led to misuse of the
tx semaphore and potential for odd behaviour.

2. Equality tests on tx_consume_index and tx_produce_index did not
handle overflow correctly. This would allow tx_consume_index to catch
up to tx_produce_index and trick the k64f_rx_reclaim function into
forgetting about a whole buffer of pbufs.

3. On top of all of that, the ENET_BUFFDESCRIPTOR_TX_READ_MASK was not
correctly read immediately after being set due to either a compiler
optimization or hardware delays. This caused k64f_low_level_output
to eagerly overrun existing buff-descriptors before they had been
completely sent. Adopting the counting-semaphore paradigm for 1 avoided
this concern.

As pointed out by @infinnovation, the overflow only occurs in the rare
case that the 120MHz CPU can actually generate packets faster than the
ENET hardware can transmit on a 100Mbps link.
@bogdanm
Copy link
Contributor

bogdanm commented Oct 25, 2016

👍

@geky
Copy link
Contributor Author

geky commented Oct 25, 2016

/morph test-nightly

@mbed-bot
Copy link

Result: FAILURE

Your command has finished executing! Here's what you wrote!

/morph test-nightly

Output

mbed Build Number: 947

Test failed!

@infinnovation-dev
Copy link
Contributor

LGTM - successfully tested at 100MBps with a few variations of burst size, also at 10MBps. FYI, with 5.2.0 on a 10MBps link, a burst of more than 8 packets caused a lock-up as I expected.

Would it be churlish to suggest that the only thing missing from your PR was the 20-line comment explaining the Theory of Operation to the next poor soul to visit the code? ;-)

@@ -526,15 +524,14 @@ static err_t k64f_low_level_output(struct netif *netif, struct pbuf *p)

/* Wait until a descriptor is available for the transfer. */
/* THIS WILL BLOCK UNTIL THERE ARE A DESCRIPTOR AVAILABLE */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this comment still correct ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, though less directly, I can remove the bolded portion

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this code actually getting used? I don't see where xTXDCountSem is released. If it isn't getting released, then this would essential lockup lwip if it occurred.

Copy link
Contributor Author

@geky geky Oct 27, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@0xc0170
Copy link
Contributor

0xc0170 commented Oct 26, 2016

Would it be churlish to suggest that the only thing missing from your PR was the 20-line comment explaining the Theory of Operation to the next poor soul to visit the code? ;-)

Can you be more specific provide line numbers or code block that should get more documentation?

@@ -526,15 +524,14 @@ static err_t k64f_low_level_output(struct netif *netif, struct pbuf *p)

/* Wait until a descriptor is available for the transfer. */
/* THIS WILL BLOCK UNTIL THERE ARE A DESCRIPTOR AVAILABLE */
while (g_handle.txBdCurrent->control & ENET_BUFFDESCRIPTOR_TX_READY_MASK)
osSemaphoreWait(k64f_enet->xTXDCountSem.id, osWaitForever);
osSemaphoreWait(k64f_enet->xTXDCountSem.id, osWaitForever);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It strikes me that there are two possible behaviours when output packets are generated faster than the hardware can send them: (a) wait for the hardware; (b) drop packets. Depending on the sizes of packets, the code here may do either: drop them (if there is not enough memory in the pool) or block waiting for a slot. Which is a little inconsistent.

If the preferred policy is to drop packets, then maybe it should use a zero timeout when getting the semaphore and bail out if none is available.

If the preferred policy is to block, then maybe it should defer allocating the pbuf until the slot is obtained (at which point the memory allocation is guaranteed to succeed).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, maybe it isn't guaranteed, since the pool is shared with the receive side.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I haven't been able to look into this yet. Do you adopting a blocking memory allocation would be the correct solution (if such exists in lwip)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking is well defined at the Network socket API layer, so that is another option. If it was done there, then the LWIP thread would be free to continue processing RX packets.

Copy link
Contributor Author

@geky geky Oct 27, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The k64f_low_level_output function is responsible for sending ethernet frames and is significantly detached from the socket layer. LWIP only maintains the state for non-blocking network operations, not hardware operations.

In the situation we can't continue, we have two choices:

  1. Block the lwip thread
    • Adds potential delay to every socket operation
    • Still valid for non-blocking socket operations
  2. Drop the packet
    • Relies on higher-level packet-loss logic to recover
    • Adds overhead to network transaction

Looking at this, my gut says the current behaviour is reasonable.

Full hardware buffers is not an uncommon state for low-power devices, and blocking effectively throttles the network logic in the application without adding any overhead that would further slow down the device. Running out of memory, however, is much more concerning.

Dropping the packet and bailing out of any ip logic will probably give the device the best chance to recover.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would normally say always drop packets. A buffer overflow is a buffer overflow. IP has to be resilient to buffer overflows in every device and every router. There's no reason to take extra precautions locally in your own Ethernet driver to avoid buffer overflow drops - you just need "enough" buffering.

Only pathological case you need to worry about generally is IP fragmentation - that's the only thing that pushes in multiple frames in one go and requires them all to go, and fails if they don't. As long as you have enough buffering to take the largest fragmented datagram you expect to attempt, dropping should be fine.

Blocking the network stack for any significant time could be very bad - it would be quite possible to pull out your Ethernet cable from one interface and bring the entire network stack (all interfaces) to a halt. Network stacks are usually single-threaded, and won't be able to do anything while a transmit call to one driver is blocking.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kjbracey-arm, you make a good argument. I'm still hesitant, but suspect I'm weighing efficiency too heavily over robustness and consistency.

I've pushed up a new patch that drops frames when the hardware buffers are exhausted. As expected, @infinnovation's example now only gets ~6 big packets through each iteration, but congestion control should be able to work around that.

I would be interested in seeing a performance comparison of the two strategies on a 10mbps link with something high level like tcp, but I don't currently have the hardware available.

@bridadan
Copy link
Contributor

BTW, nightly test looked ok, there's still an issue with the NCS36510 on master

@infinnovation-dev
Copy link
Contributor

infinnovation-dev commented Oct 27, 2016

Can you be more specific provide line numbers or code block that should get more documentation?

It's not necessarily about specific lines of code. E.g. declarations and code dealing with the pool of buffers are spread across (at least) 4 functions, three threads, one .c file and two .h files. What were the design decisions and how were they arrived at? Why were the sizes of tx and rx ring buffers chosen? Why use these buffers at all for tx? (Guess: lwip may send a packet as a chain of pbufs; we could use one buffer descriptor for each pbuf but that adds extra complexity; so we copy to ensure contiguous memory and one buffer per packet, with the alignment required by uDMA.) What are the circumstances in which packets may get dropped? How do task priorities affect operation? (E.g. if tx reclaim thread has insufficient priority, rx packets may get dropped too.) What are the invariants? And so on...

Previously, exhausting hardware buffers would begin blocking the lwip
thread. This patch changes the emac layer to simply drop ethernet
frames, leaving recovery up to a higher level protocol.

This is consistent with the behaviour of the emac layer when unable
to allocate dynamic memory.
@geky
Copy link
Contributor Author

geky commented Oct 28, 2016

Honestly, I'm guessing at this code as much as you are. This driver has been around since before my time, and I suspect there have been enough contributors patching this code that there is no longer a single author who could explain it. I'm hesitant to add much documentation without knowing the facts.

Modifying this driver has felt a bit like eating pasta with a teaspoon, and there have been other concerns raised about this driver's heap and thread usage. There is motivation to refactor the driver, but it should probably occur as a separate pull request.

@0xc0170
Copy link
Contributor

0xc0170 commented Oct 31, 2016

/morph test-nightly

@0xc0170
Copy link
Contributor

0xc0170 commented Oct 31, 2016

@infinnovation Happy with this patch or any other reviewer mentioned above?

@infinnovation-dev
Copy link
Contributor

LGTM

@mbed-bot
Copy link

Result: FAILURE

Your command has finished executing! Here's what you wrote!

/morph test-nightly

Output

mbed Build Number: 976

Test failed!

@bridadan
Copy link
Contributor

Results look good, currently chasing down an issue in Greentea that's causing it to show up as a failure.

@sg- sg- merged commit fcfc7b4 into ARMmbed:master Nov 1, 2016
oter pushed a commit to oter/mbed-os that referenced this pull request Nov 21, 2016
Release mbed OS 5.2.2 and mbed lib v129

Known issues in this release

There is currently a DNS resolution failure in Thread mode with this release. This causes a failure in the
mbed-os-example-client. This will be fixed in a subsequent release. This can be worked around by reverting
to mbed-os-5.2.0

Ports for Upcoming Targets

3011: Add u-blox Sara-N target. ARMmbed#3011
3099: MAX32625 ARMmbed#3099
3151: Add support for FRDM-K82F ARMmbed#3151
3177: New mcu k22512 fixing pr 3136 ARMmbed#3177

Fixes and Changes

2990: [tools] Parallel building of tests ARMmbed#2990
3008: NUCLEO_F072RB: Fix wrong timer channel number on pwm PB_5 pin ARMmbed#3008
3013: STM32xx - Change how the ADC internal pins are checked before pinmap_ ARMmbed#3013
3023: digital_loop tests update for STM32 ARMmbed#3023
3041: [nRF5] - added implementation of API of serial port flow control configuration. ARMmbed#3041
3092: [tools + tests] Adding parallelized build option for iar and uvision exporters ARMmbed#3092
3084: [nrf5] fix in Digital I/O : a gpioe pin was uninitialized badly ARMmbed#3084
3009: TRNG enabled. TRNG APIs implemented. REV A/B/C/D flags removed. Warnings removed ARMmbed#3009
3139: Handle [NOT_SUPPORTED] exception in make.py ARMmbed#3139
3074: Target stm init gcc alignement ARMmbed#3074
3140: [tests] Replacing getchar with RawSerial getc in greentea-client ARMmbed#3140
3158: Added support for 6lowpan PAN ID filter to mbed mesh api configuration ARMmbed#3158
2988: Update of can_api.c fixing ARMmbed#2987 ARMmbed#2988
3175: Updating IAR definition for the NCS36510 for IAR EW v7.8 ARMmbed#3175
3170: [tests] Preventing test from printing before Greentea __sync ARMmbed#3170
3169: [Update of ARMmbed#3014] Usb updates ARMmbed#3169
3143: CFStore fix needed for the Cloud Client ARMmbed#3143
3135: lwip - Fix memory leak in k64f cyclic-buffer overflow ARMmbed#3135
3048: Make update.py test compile examples prior to updating mbed-os version. ARMmbed#3048
3162: lwip/nsapi - Clean up warnings in network code ARMmbed#3162
3161: nsapi - Add better heuristic for the default record of DNS queries ARMmbed#3161
3173: [Exporters] Add a device_name to microbit entry in targets.json ARMmbed#3173
3072: i2c_loop tests update for STM32 ARMmbed#3072
2958: Allowing mbed_app.json files to be discovered for tests. ARMmbed#2958
2969: [nRF52] - switch irq priorities of driver handlers to the lowest level ARMmbed#2969
3078: lwip: Allow several configuration macros to be set externally (bis) ARMmbed#3078
3165: Add address type checks to NanostackInterface ARMmbed#3165
3166: nsapi_dns: Provide 2 IPv6-hosted default servers ARMmbed#3166
3171: [tools] Fixing project.py -S printing problem ARMmbed#3171
3172: [Exporters] New export-build tests ARMmbed#3172
3184: ARMmbed#3183 Compiler warning in trng_api.c with K64F  ARMmbed#3184
3185: Update tests to fix build failures. Also make the code similar to oth ARMmbed#3185
3104: [NuMaker] Support CAN and fix PWM CLK error ARMmbed#3104
3182: Exporter documentation ARMmbed#3182
3186: MultiTech mDot - add back SPI3 pins ARMmbed#3186
3187: [Export-Make] Use internal class variable for resolving templates in makefiles ARMmbed#3187
3195: [Exporters - Make-based] Quote the shell call in mkdir and rmdir ARMmbed#3195
3204: [Export build-test] Directory traversal error ARMmbed#3204
3189: [Exporters - Make-based] Force make exporter to search PATH for compilers ARMmbed#3189
3200: Using Popen for uVision and unifying the structure of the build function ARMmbed#3200
3075: nsapi - Add standardized return types for size and errors ARMmbed#3075
3221: u-blox odin w2 drivers update ARMmbed#3221
LMESTM added a commit to LMESTM/mbed that referenced this pull request Apr 27, 2017
Before this patch, many warnings like below were generated
during compilation with ArmCC
[Warning] lwip_ethernet.h@57,0:  ARMmbed#3135-D: attribute does not apply to any entity

This happens here as ``--gnu`` option of ArmCC is being used, which
enables the GNU compiler extensions that the ARM compiler supports.

This is solve by adding a extra check on __CCARM .
adbridge pushed a commit that referenced this pull request May 5, 2017
Before this patch, many warnings like below were generated
during compilation with ArmCC
[Warning] lwip_ethernet.h@57,0:  #3135-D: attribute does not apply to any entity

This happens here as ``--gnu`` option of ArmCC is being used, which
enables the GNU compiler extensions that the ARM compiler supports.

This is solve by adding a extra check on __CCARM .
adbridge pushed a commit that referenced this pull request May 5, 2017
Before this patch, many warnings like below were generated
during compilation with ArmCC
[Warning] lwip_ethernet.h@57,0:  #3135-D: attribute does not apply to any entity

This happens here as ``--gnu`` option of ArmCC is being used, which
enables the GNU compiler extensions that the ARM compiler supports.

This is solve by adding a extra check on __CCARM .
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants