Optimize recv() for embedded systems with batched reads#32
Optimize recv() for embedded systems with batched reads#32dcliftreaves wants to merge 1 commit intotehmaze:masterfrom
Conversation
|
Thank you @dcliftreaves for the contribution, it's very appreciated. I think I get the idea here, previously:
You propose to
Which helps with timing issues for some systems, presumably where OS read calls are demanded to be fast enough to keep up with the endpoint (perhaps some very fast baudrate transfer on a serial line without flow control?). I'd only be certain about merging if there are test cases to cover at least some of the edge cases here -- bad sequence numbers, badly sized blocks, timeouts, etc. that this code change makes adjustments to, I'd be worried about introducing new errors for existing users in this case, so test cases would make me much more comfortable. I'll hope to try to take a look at this / write test cases soon but no promises, anyone else please join review! |
|
This PR pretty much mirrors the issue for #19, which was for send() |
|
Thank you for your kind response! Your comment: "perhaps some very fast baudrate transfer on a serial line without flow control?" Is right on the money. Most modern UARTs on embedded systems only use two IO lines which mean that they can only use software flow control which isn't normally used during xmodem transfers. Also, yes to this: This PR pretty much mirrors the issue for #19 I am not sure I can help too much with test cases as they are fairly complex. I will try to look at adding one or two test cases. |
|
Hope to get to this over the holiday! |
1 similar comment
Read sequence bytes, packet data, and checksum in a single getc() call instead of multiple separate calls. This reduces timing issues when communicating with embedded systems over fast serial lines without hardware flow control, where multiple reads with separate timeouts can stack up and cause buffer overruns. Also fix the inner header loop to break and NAK on bad headers rather than sleeping for the full timeout duration, which caused problems with embedded systems. Add comprehensive unit tests for recv() covering: - Successful single-block and multi-block transfers - Bad sequence numbers and bad sequence complements - Corrupted CRC checksums - Short (badly sized) blocks - Timeout during block data read - Checksum mode (crc_mode=0) - Cancellation via 2xCAN during data phase - Retry limit exhaustion on repeated errors Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Hey @jquast 👋 — I've rebased this PR on top of the current What changed since the original PR
All existing upstream unit tests continue to pass. Would love to get this merged — happy to address any additional feedback! |
|
Thanks I'll take a look soon |
| time.sleep(timeout) | ||
|
|
||
| # Reads next char from stream | ||
| char = self.getc(1, timeout=1) |
There was a problem hiding this comment.
This "Reads next char from stream" was important for the side effect of the CAN above
| # cancel at two consecutive cancels | ||
| if cancel: | ||
| self.log.info('Transmission canceled: received 2xCAN ' | ||
| 'at block %d', sequence) |
There was a problem hiding this comment.
It's not obvious and that's ok -- this sort of thing is the cause of all bugs in this library it's difficult to see,
this "CAN" treatment depended on the deleted line marked below to receive the second CAN char, with these changes, "line noise" of a single SOH would fail, not a big problem. I have included a test to exercise the "line noise" possibility of missing second CAN with TDD
| data = data[2:] | ||
| if len(data) != (packet_size + 1 + crc_mode): | ||
| self.log.warning('recv: expected %d data bytes, got %d', | ||
| packet_size + 1 + crc_mode, len(data)) |
There was a problem hiding this comment.
Need to set data = None here also to prevent possible IndexError depending when size of 'data' is too small
jquast
left a comment
There was a problem hiding this comment.
I have reviewed and made the suggested changes, thanks for the submission I'll release soon
Summary
Optimizes the
recv()method for reliable communication with embedded systems by reading the sequence bytes, packet data, and checksum in a singlegetc()call instead of multiple separate calls.Problem
When receiving data from embedded systems over fast serial lines without hardware flow control (the most common configuration for modern UARTs with only 2 IO lines), the original
recv()implementation would:getc(1)for the first sequence bytegetc(1)for the second sequence bytegetc(packet_size + 1 + crc_mode)for the packet data + checksumEach call had its own timeout, and the delays between reads created windows where incoming data could be missed or buffers could overrun. Additionally, when the inner header loop received garbage/unexpected bytes, it would
time.sleep(timeout)for the full timeout duration before retrying, causing further timing issues.Changes
xmodem/__init__.py:getc(2 + packet_size + 1 + crc_mode)call, minimizing the time window for buffer overrunserror_countis only reset after successful packet reception (not unconditionally before each read), ensuring the retry limit works correctly across retransmissionstest/unit/test_xmodem.py— 11 new recv() unit tests:test_xmodem_recv_successful_single_blocktest_xmodem_recv_successful_multi_blocktest_xmodem_recv_bad_sequence_numbertest_xmodem_recv_bad_sequence_complementtest_xmodem_recv_bad_crctest_xmodem_recv_short_blocktest_xmodem_recv_timeout_on_block_readtest_xmodem_recv_checksum_modetest_xmodem_recv_cancel_by_can_cantest_xmodem_recv_exceeds_retry_limittest_xmodem1k_receive_successful_when_timeout_after_first_packetBackward Compatibility
recv()accepts the same parameters and returns the same valuescallbackparameter and all upstream features (empty file handling, etc.) are preservedMotivation
This mirrors the optimization discussed in issue #19 for
send(). Most modern embedded UARTs use only two IO lines (TX/RX) without hardware flow control. Reading all expected bytes upfront lets the caller/OS buffer management work optimally rather than issuing multiple small reads that can miss data.🤖 Generated with Claude Code