USB module delayed reads
Colin (478) 2433 posts |
I’m hoping Jeffrey Lee is reading this. In version 1.41 of the usbmodule you added delayed_read and xfer_busy. I’d like to take out the delayed_read and was wondering if there was a particular problem that the delayed_read part solved (other than what the cvs log says). |
Jeffrey Lee (213) 6048 posts |
Hello!
I can’t remember what bug I was looking into at the time, but the basic sequence of events was:
Although the change did fix the infinite loop in usbd_dump_queue(), unfortunately I can’t remember if it fixed the original bug I was investigating (there’s a possibility it did – I vaguely remember running into some issues with the Pi USB driver where the DWC layer was being asked to reinitialise a transfer that was still in progress). Also, after looking at the BSD code a bit closer on a later date, I’m not sure if my original fix fully fixes the problem of transfers being re-inserted before the BSD layer has finished with them. If you look at the usb_transfer_complete() function in usbdi.c you’ll spot that the transfer complete callback (at line 971) isn’t the last thing that happens in the function – it still interacts with the xfer further (e.g. by calling pipe→methods→done(xfer)), so it’s probably not a good idea to allow code to reuse an xfer from within the transfer complete callback. I’m not sure exactly how BSD drivers work around this problem – the transfer complete callback will always be called from within a splusb() block, so maybe there’s some documentation somewhere which lays out a rule that you aren’t allowed to start transfers from within callbacks, period? (i.e. you should always wake up another thread, and use splusb() to synchronise) But at the same time, things like the if_aue driver definitely do call usbd_setup_xfer from within their transfer complete callback (see aue_rxeof()), so maybe it is OK? Good luck! |
Colin (478) 2433 posts |
Hi – thanks for responding. Regarding the BSD code On my iyonix I only have one outstanding problem and that is when doing separate reads of 1 byte in quick succession it sometimes fails with an IOERROR – its actually an error triggered when the transfer is retired in softintr in ohci.c. I haven’t looked into that yet. I too came to the conclusion that dealing directly with the BSD layer looked more appealing but in application space its a bit raw. What I’m trying to do at the moment is get osgbpb work the same nonblocking as it does blocking. At present in blocking mode it does 1 RXWakeup and waits for the data to arrive, returning when it has all arrived – leaving no delayed_reads. Nonblocking mode is a real mess. Every read does a wakeup which sets nextcount/delayed_read for start_read. So if fetching 4 bytes returns 0 bytes the fetch gets started and the next time OSGBPB reads the requested data it starts a new fetch – which you don’t want. Anyway I can make it so that nonblocking OS_GBPB can read the data without initiating new fetches. Thats where getting rid of delayed_read comes in. I had thought you were using it as a queue’d read to try and speed things up but realised this couldn’t work in blocking mode. I have a feeling EtherUSB will call DeviceFS_CallDevice RXWakeUp directly with a very large number of bytes (totalcount). However If it has NOPAD set its depending on a bug. If NOPAD is not set then padding occurs when a short transfer is returned, at which point it retires the whole totalcount transaction – not very successfully I may add. With NOPAD set it always retires the whole totalcount transaction after the 1st transfer instead of after a short transfer. Ideally I’d get rid of padding and fix os_gbpb to complete early – but who knows what that would break :-) Unfortunately I don’t have a pi to test so I’ll have to stiff someone elses :-) I only wanted to fix a ‘little’ bug, its turned out like Father Ted knocking the dents out of his car. Thanks for taking the time to put your thoughts down. |
Colin (478) 2433 posts |
Re transfer_complete and callbacks You can’t reuse xfer in the callback because methods→done of the previous transfer comes after methods→start of the next transfer. You can’t know from the callback what methods→done does as it is controller device dependent. Just looking at bulk and ohci, methods→done nulls hcpriv which is only used for retiring interrupt endpoints so in that case we get away with it. It looks like it has been a problem because there’s a ‘riscos’ null test on hcpriv added to softintr in ohci.c We appear to be using usbdi.c from FREEBSD55 or earlier. After that, non repeating endpoints have methods→done called before the call back. The last version that looks similar to ours – and hence the least work is FREEBSD74. Any volunteers :-) |