How to do USB audio
Colin (478) 2433 posts |
:-) iMic – I pictured a microphone. |
Dave Higton (1515) 3526 posts |
I’ve got some prototype code going that extracts information from the descriptors, and has a call that returns the interface, alternate and endpoint etc. corresponding to a sample rate, resolution and number of channels. I don’t think I’ll be able to integrate it into my test app until mid-week, though, because of the demo on Tuesday at SAUG (all welcome, see csa.announce). It’s in BASIC. If it works, it will be time to rewrite it in C in a module. |
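The call Dave describes — map a wanted sample rate, resolution and channel count onto an interface, alternate setting and endpoint — could look something like this in C once the BASIC prototype is rewritten. This is a sketch only; the function name, the `AudioFormat` structure and the idea of a pre-parsed table are all hypothetical, not the module's actual API.

```c
/* Sketch: look up the (interface, alternate, endpoint) triple for a
   requested audio format in a table built earlier from the parsed
   descriptors.  All names and the table layout are hypothetical. */
#include <stddef.h>

typedef struct {
    int interface;      /* bInterfaceNumber */
    int alternate;      /* bAlternateSetting */
    int endpoint;       /* bEndpointAddress */
    unsigned rate;      /* samples per second */
    int resolution;     /* bits per sample */
    int channels;
} AudioFormat;

/* Returns a pointer into the table, or NULL if no alternate matches. */
const AudioFormat *usbaudio_find_format(const AudioFormat *table, size_t n,
                                        unsigned rate, int resolution,
                                        int channels)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].rate == rate &&
            table[i].resolution == resolution &&
            table[i].channels == channels)
            return &table[i];
    }
    return NULL;
}
```

The point of returning the whole record is that the caller then has everything it needs to select the alternate setting and open the isochronous pipe in one step.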
jim lesurf (2082) 1438 posts |
“Well “adaptive” can mean almost whatever the individual designer decides!”
Alas, this is yet another example where the practice doesn’t always match what “should” happen! :-/ The web page I gave shows a clear case where ‘adaptive’ involves two clocks. One is in the host, sending the data. The other is in the DAC. The problem isn’t in the defined method; it is in the implementation. This is why async/iso was also devised, as ‘adaptive’ isn’t necessarily good enough: it leaves you open to two clocks ‘arguing’. However, in principle you can avoid this if one clock locks to the long-term rate of the other and plays uniformly. Indeed, this is a way for DACs to reduce ‘jitter’ and is used in other areas. The asynchronous isochronous method avoids these effects because it allows one clock to take more control. The DAC’s clock not only times the transfers but also the playout. Any buffering levels then become ‘invisible’ so far as the output is concerned, provided the buffers never overflow or empty. Jim |
Colin (478) 2433 posts |
If you are using an adaptive DAC and the data is from a computer hard disc, then the clock is internal to the USB controller, so the computer becomes an asynchronous source. If the adaptive sink (DAC) needs to be told the frequency to play, that is so that it can lock on to the input frequency quicker. This probably has the side effect of being less tolerant of drift in the computer timing. If the computer timing were more accurate than your isochronous DAC, then you might find that an adaptive DAC gives better results. It’s not the adaptive DAC that is bad, it’s the asynchronous computer. So, as I see it, your web page is just describing the difference in clock quality between the adaptive DAC (clock internal to the computer) and the asynchronous DAC (clock internal to the DAC). |
jim lesurf (2082) 1438 posts |
This depends on what you mean by clock “quality”. The pattern of regular switches in output (DAC) clock rate shows it is trying to accommodate being sent samples at a medium-term rate that doesn’t agree with the DAC’s clock rate(s). If the source (host computer) were simply sending batches of data as and when requested, and this was being buffered OK, there would be no need for the output rate to regularly hop about like this. But the DAC is trying to keep “adapting” to a host computer rate that isn’t precisely a rate the DAC clock can produce. In effect, the computer clocking may be stable, but differ from the rate(s) of playout available in the DAC by just a few ppm. I’ve seen this in other comms kit where a receiver has a ‘comb’ of possible clock rates that are derived using an N/M ratio PLL locked to a crystal. The receiver (DAC) chooses N and M integers to get as close as it can, and every now and then changes N and/or M to keep roughly up. In such cases the DAC has no single internal clock rate when told something like “this should be played at 48k”, but a set of rates ‘laddered’ around it. Both computer and DAC agree that the rate is 48k (say), but have different definitions of ‘1 second’. The DAC designer knows this and uses the adaption to deal with it. FWIW, the BBC tend to litter async reconverters through their distribution chains, in my experience. This allows for time drifts in the sync of the audio; they allow for ‘drift’ rather than such hopping. Other engineers will use analogue-like locking. This avoids the regular hopping effects, but lets through more clock fluctuations from the source. So in practice some of the systems don’t always work quite as theory leads people to expect. :-) Jim |
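The N/M ratio PLL Jim describes can be sketched as a small search: pick integers N and M so that crystal × N / M lands as close as possible to the wanted rate. The crystal frequency and search limits below are made up for illustration; real receivers fix these in hardware.

```c
/* Sketch of choosing PLL divider integers N and M so that
   crystal * N / M approximates a target sample rate.  A receiver
   with such a 'comb' of rates picks the nearest rung of the ladder. */
#include <math.h>

typedef struct {
    int n, m;
    double rate;   /* achieved rate, samples/s */
    double ppm;    /* error relative to the target, in ppm */
} NmChoice;

NmChoice best_nm(double crystal, double target, int max_n, int max_m)
{
    NmChoice best = {1, 1, crystal, 1e12};
    for (int n = 1; n <= max_n; n++) {
        for (int m = 1; m <= max_m; m++) {
            double r = crystal * n / m;
            double ppm = fabs(r - target) / target * 1e6;
            if (ppm < best.ppm) {
                best.n = n; best.m = m; best.rate = r; best.ppm = ppm;
            }
        }
    }
    return best;
}
```

With a 12 MHz crystal, 48 kHz is hit exactly (N=1, M=250), but many target rates can only be approximated to within a few ppm — which is exactly the mismatch that makes an adaptive DAC hop between rungs.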
Colin (478) 2433 posts |
Jim. I’ve replied by email to the address at the bottom of your web page – don’t know if you still read messages to that address. |
Dave Higton (1515) 3526 posts |
I’ve made a simple start on a USBAudio module. However, it would be crazy of me to just come up with an API all on my own. So the question is how best to put it up for scrutiny and suggestions? My first thought is to put it up as a series of pages on the wiki here, linked from “Documentation” → “RISC OS programmer documentation”, and marked as “WIP – for comment” all the way from the top. Suggestions are welcome :-) What I have so far is calls to enumerate USB audio devices (returning e.g. “USB9,USB12”), and to get their string descriptors (e.g. “Griffin Technology, Inc” and “iMic USB audio system”). Next target is a call to return the interface, alternate and endpoint numbers required to open a given device (e.g. “USB9”) for audio output at given sample rate, word length and number of channels. It would also be possible to enumerate the sample rates, word lengths and numbers of channels, although the results might surprise you with their inconsistency. In fact all sorts of results throughout work with USB audio devices might surprise you with their inconsistency. |
Rick Murray (539) 13840 posts |
…? If I were looking at the API of a module called USBaudio, I would expect to specify the sample rate, word length and number of channels, and have the module hand me back a handle for the device that it had already opened for me. More specifically, the module should take care of the nasty stuff and the inconsistencies, letting the programmer work at a higher level of abstraction. Even better would be for the module to, if a certain audio format is not supported, perform conversion to something that is on the fly. However, the problem with this approach is the lack of integration into the RISC OS sound system. Should a USBaudio module be doing all of the work from sample to sound? Perhaps it might be better to look at the RISC OS audio system as a whole for something akin to: Samples —> Conversions —> Mixing —> Output, where the data to be played is “Samples” (I deliberately avoided the term “input”), the “Conversions” subsystem handles all the stuff that would make Jim happy regarding playing such-and-such an audio format in so-and-so’s way (i.e. 44.1kHz on a 96kHz setup), “Mixing” handles shoving the system beep into music playback, and so on, while “Output” is a driver for the specific output technology (VIDC, OMAP, USB, etc). I still believe that the entire audio system should operate with primary and secondary audio sources; the difference being that – as far as is possible – the audio system adapts to the current data format of the primary audio source. So – for instance – a 16-bit stereo 44.1kHz source will cause the audio system to lock to that all the way across, so for 99.9% of the time the entire system doesn’t need to do any more than pass on each wodge of data it receives. The secondary audio path is for “other things”, most notably system beeps. These can either be muted or mixed in; and for the mixing it can have a rough and ready, quick’n’dirty sample rate conversion to that which is used by the primary path. 
You don’t need quality for a short beep; and if this is done it will one-up Android 2.3.7 which still mutes music playback in order to sound a notification tone. ;-) Of course, you’ll want to consider my first suggestion as I don’t think anybody has the time to design a new API for RISC OS audio, and write a whole new implementation! |
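Rick’s “rough and ready” conversion for the secondary path can be as crude as nearest-sample resampling — fine for a short beep, audibly poor for music. A minimal mono sketch, with hypothetical names; a real mixer would also need to handle stereo and scaling.

```c
/* Quick'n'dirty nearest-sample rate conversion, the sort of thing
   that is good enough for mixing a system beep into the primary
   stream.  Mono 16-bit samples; no filtering, so expect aliasing. */
#include <stddef.h>

/* Resample src (at src_rate) into dst (at dst_rate) by picking the
   nearest earlier source sample; returns output samples written. */
size_t resample_nearest(const short *src, size_t src_len, unsigned src_rate,
                        short *dst, size_t dst_max, unsigned dst_rate)
{
    size_t out = (size_t)((unsigned long long)src_len * dst_rate / src_rate);
    if (out > dst_max)
        out = dst_max;
    for (size_t i = 0; i < out; i++)
        dst[i] = src[i * src_rate / dst_rate];
    return out;
}
```

Doubling the rate simply repeats each sample; halving it drops every other one. That is exactly the trade-off Rick describes: no quality, almost no cost, and the music keeps playing.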
jim lesurf (2082) 1438 posts |
The model is probably best looked at as being akin to SharedSound, etc. So, yes, expect the ‘playing’ application to have to choose the device (by number or returned name) and specify the rate, number of channels, and bit depth. At present, rate conversions are probably best left to be done somewhere other than in the USB software. Better to get the playing application to decide either to switch the system rate to match or to handle rate conversions. That said, at some point we ideally need a globally available, decent quality resampling facility. But I suspect that would be best fitted into SharedSound in the end, so all apps could use it and the user could have general control over the choices and methods. I don’t think it should do ‘format’ conversions beyond byte-order shuffling or similar data re-arrangements. It should only handle LPCM and leave higher-level software to do things like turning FLAC into LPCM. An alternative model is a ‘SoundUSB’ module that has SWIs, etc., that follow the behaviour of SoundDMA as closely as possible, with any ‘device controls’ being modelled on SoundControl. This would make it far easier for others to get existing programs/modules like PlayIt, DigitalRenderer, etc., to work via USB. The aim being to mimic the existing APIs and SWI behaviours as closely as practical. It would make changing PlayIt, etc., far easier, so more likely to happen quickly, and would encourage the use of USB audio devices. Can you approach this in terms of two modules, ‘SoundUSB’ (which follows the SoundDMA model) and ‘SoundControlUSB’ (modelled on SoundControl)? Given the existence of PlayIt, DiskSample, etc., it would be best to try to avoid having to re-invent or alter them all. However you do this, for quality reasons it is important to ensure the system can play without mixing or scaling if the user wishes. When in doubt, pass the parcel and leave it to the playing app to do any ‘conversions’ required. 
FWIW At present IIUC SharedSound expects ‘someone else’ to do things like rate conversions. It just tells drivers what it expects, knowing the system rate. Jim |
jim lesurf (2082) 1438 posts |
I’ll agree with that given that I’m still struggling to write ROSS documentation that will come anywhere close to being up-to-date now! Mind you, I’ll be delighted to have to add a nice new section on USB audio in due course. 8-] BTW I hope to put up a new draft of the ROSS doc soon, covering more than before. Jim |
Colin (478) 2433 posts |
I can’t see much point in writing an Audio Class module. Does it make things much easier than programming USB directly? |
Dave Higton (1515) 3526 posts |
Firstly: everyone so far seems to have missed my original point about where to put a proposed USBAudio API for discussion and improvement. Suggestions are welcome.
If you mean doing things via a USBAudio module rather than programming USB directly, my answer is an emphatic yes. You’ve got to go through a lot of USB operations simply to find out what parameters to open an isochronous pipe with, and a whole lot more to discover where the controls are. The idea of the USBAudio module is to get that information from easy calls. If you mean something different, then please clarify what you mean.
I agree. It’s an easy step from getting the information, to using the information and opening a pipe. The two are not mutually exclusive, though. It may be possible at some future stage to make calls to set volume on, and mute, an open pipe, but I expect that reaching that stage will make my head hurt lots. |
Rick Murray (539) 13840 posts |
Up to you! Does your ISP offer web space? If you think this might be a collaboration effort, you might consider Google Drive as world-readable with interested people having write access? Failing all of that, I can drop an HTML or txt file on my site…
Par for the course with USB isn’t it? :-) |
Dave Higton (1515) 3526 posts |
I don’t find USB programming – at application level – difficult. The stuff required for the USBAudio module, especially that of finding and making available the feature unit controls, is a big step up in difficulty. As a human being I can decode the descriptors, but doing it algorithmically is the challenge. |
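The algorithmic part Dave mentions is tractable because a USB configuration descriptor is a flat byte stream of blocks, each starting { bLength, bDescriptorType, … }, so finding the class-specific audio descriptors is a matter of stepping bLength bytes at a time. The descriptor-type constants are from the USB spec; the callback shape is just one possible design, not the module’s actual one.

```c
/* Walk a raw USB configuration descriptor stream and visit every
   class-specific interface descriptor (where the audio class hides
   its format and feature-unit information). */
#include <stddef.h>

#define USB_DT_INTERFACE     0x04
#define USB_DT_CS_INTERFACE  0x24  /* class-specific interface descriptor */

typedef void (*desc_cb)(const unsigned char *desc, void *ctx);

/* Returns the number of class-specific interface descriptors found;
   cb (may be NULL) is called once per match with a pointer to it. */
int find_cs_interfaces(const unsigned char *buf, size_t len,
                       desc_cb cb, void *ctx)
{
    int found = 0;
    size_t pos = 0;
    while (pos + 2 <= len) {
        unsigned char blen = buf[pos];
        if (blen < 2 || pos + blen > len)
            break;               /* malformed: stop rather than loop */
        if (buf[pos + 1] == USB_DT_CS_INTERFACE) {
            found++;
            if (cb)
                cb(buf + pos, ctx);
        }
        pos += blen;
    }
    return found;
}
```

The hard part, as Dave says, is not the walk itself but interpreting what each class-specific block means once you have found it.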
Dave Higton (1515) 3526 posts |
You are aware that I have a web site, I hope? |
Colin (478) 2433 posts |
In what way? The control class interface doesn’t need to be in a module at all. The streaming class carries the information to configure the sample rate, data format, etc. All you can do for another program is repackage information it can read directly from the descriptors. What information in those descriptors do you leave out? An audio program has to be in a module so that sound transfer isn’t interrupted. It has to be told if the device has been removed/inserted, which it can find out using USB. It can read all USB AudioStreaming speaker devices’ information. As far as I see it, making only a subset of a device’s capabilities available isn’t a solution; that’s what we have now. Parsing descriptors may be ugly, but if there were a better solution to supplying all relevant information to the programmer, usb.org would have used it. All I would do is create a back end for the existing sound system, so that it is no longer limited to fixed hardware, and leave that for system/legacy applications. If you want hi-fi, plug in a second USB device and control it directly. |
Dave Higton (1515) 3526 posts |
Agreed. The idea is to save application developers from having to learn to parse the descriptors, and from having to write descriptor-parsing code in their applications. It’s part of the general process of abstraction, making it easier for people. Take care of lower level information by automating the process of getting it, and of using it. The application developers have got enough to work on without all this low level stuff. As for a subset of the information: I’m sure the USBAudio module will evolve. It will only provide a subset of the available information at first, and get filled out as time goes on; but, importantly, that first subset will be the most important and most useful subset. |
jim lesurf (2082) 1438 posts |
Just to point out that I’ve just released http://jcgl.orpheusweb.co.uk/temp/ROSSDocument.pdf which has a ‘placeholder’ for future addition of USB audio. :-) Jim |
Colin (478) 2433 posts |
Can you change the fonts to standard RISC OS fonts? Edit: a shorter page size would be nice too, so a page can fit a 1024 screen without scaling. The Programmer’s Reference Manual size works well: 190.5 × 228.6 |
jim lesurf (2082) 1438 posts |
If anyone wants to do that, or change the layout/page size/etc., I can let them have a copy of the TW document. FWIW, once I have a version that seems good enough to be ‘version 1’ and is decently edited, I am planning to look at producing an HTML version and putting that on the web. People may find that more convenient for on-screen reading if they dislike the PDF. It may also be more convenient as a basis for anyone else who wants to generate an alternative version, etc. Jim |
Colin (478) 2433 posts |
I’d rather have a PDF version I can read – I’d do it myself but haven’t got the time. |
Dave Higton (1515) 3526 posts |
The USBAudio module is coming along nicely. It can enumerate audio devices, get descriptors, and do a lot of the parsing of the device descriptors for the information we need. A couple more evenings of work should see it able to open a stream, given the sample rate, resolution, number of channels, etc. Which should give a nice simplification of the process that application writers will need to use. I’m still awaiting a registration from ROOL. I must put up an API discussion document. |
jim lesurf (2082) 1438 posts |
Good to hear about the progress! :-) Look forward to seeing the discussion document. Jim |
Dave Higton (1515) 3526 posts |
I’ve put up the beginning of my USBAudio API discussion document at http://davehigton.me.uk/Audio/USBAudioAPI.htm At the moment it just lists the facilities that I think we need, and points out a little of why we need them. |
Dave Higton (1515) 3526 posts |
It was very late last night when I threw that document together. The obvious feature I omitted was to be able to set the controls in a path. |