Search code examples
linuxwindowstimeoutlinux-device-driverlibusb

usb bulk transfer timeout under Linux while it works under Windows


[Edit: I found the reason, see below]

The problem:

I created a "driver" for a device in Windows using Python (PyUSB and libusb-win32). While this software works seamlessly on multiple PCs under Windows, using my Linux (Kubuntu 18.10) test laptop, a sequence of bulk writes of length 512 bytes each times out after the second 512 byte bulk transfer.

Interesting: I also tried the same using VirtualBox. And it turns out using a Windows guest via VirtualBox on the same Linux host, the same error still occurs. So it is not because of

The question:

What can happen under Linux does not happen under Windows that causes a timeout [Errno 110]?

More information, in case it helps:

  • Under Windows, Wireshark shows timing differences between two of the bulk writes of 6 ms for the first one and 5 ms for every following, while under Linux the delta is only round about 3 ms, which are mostly resulting from a sleep operation (relevant Python source code is attached). Doubling that time does nothing.
  • dmesg shows messages like 'bulk endpoint ## has invalid maxpacket 64', where ## is 0x01, 0x08 and 0x81.
  • The device only has one configuration.
  • The test laptop has only USB 3.0 connectors, where the Windows PCs have both USB 3.0 and 2.0. I tested all.
  • Wireshark shows the device answering with another (empty) bulk on every bulk write under Linux, while it does not show that under Windows. As far as I understand, that is because USBPcap cannot capture handshakes under Windows. But I am not shure with that, because I do not know if this type of response would really be classified as "URB_BULK out".
  • I tried libusb0, libusb1 and OpenUSB as backends under Linux, without success.
  • The bulk transfer in question is the transfer of FPGA firmware to the device.
  • I am able to communicate with the device before the multiple-512 byte-chunk bulk operation on the same endpoints using only a few bytes. The code that then causes the timeout is the following one in the second iteration of this for loop:
for chunk in chunks: # chunks: array of bytearrays with 512 bytes each
    self.write(0x01,chunk)
    time.sleep(0.003)

[Edit] The reason I found out that this only occurs on my test laptop using xhci, not on a second Linux test machine using ehci. So this might be caused by xhci. I do not yet have a workaround, but this at least gives an explanation.


Solution

  • It turns out that the device requested less bytes per packet, the desired amount of bytes (64) could be found in dmesg as already written in the question. Since xhci doesn't support that officially, Linux decided to ignore that request. Windows seemingly went with it and split larger packets up in the requested packet size. So the solution was to manually split the data to packets of 64 byte size before transferring it.