22050 Hz audio on Pandaboard
Colin (478) 2433 posts |
It’s the same problem. For filling, the application has to poll the filling routine faster than bytes are removed from the buffer by the device. For emptying, the application has to poll the emptying routine faster than bytes are inserted into the buffer by the device. Either way you need a quality-of-service guarantee for the application’s routine. |
Dave Higton (1515) 3497 posts |
One particular interest I have is VoIP. Groups of bytes (typically 160) from the microphone have to be packaged up with a few bytes of header information (different for each packet) and sent out over UDP. The reverse has to happen on the way in from the network – the header has to be checked, and then the payload (again typically 160 bytes) sent to the audio output if the header is OK. I’m interested in whether these actions can be reduced to easy recipes that get them performed in the background. Sorry if the topic is veering, but clearly it’s closely related.

There have to be some buffers (minimum two buffers, each of 160 bytes, in each direction), and, since the foreground task is not going to be performing the operations, the buffers have to be in a DA or the RMA. So does the code. There has to be a way of calling the code when the audio input buffer has at least 160 bytes (or another value, chosen by the main app, as appropriate). Correspondingly there has to be a way of calling the other piece of code when a UDP packet arrives; I know that’s OT here, but I’m interested in generalised methods and interfaces.

This stuff may already exist; I don’t know. There are various bits of RISC OS that I haven’t tried using because I don’t understand the documentation. (I’ve seen this commonly in Linux docs – you can’t understand the docs unless you already know how to do it, in which case the docs are largely unnecessary.) |
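Dave’s description – a 160-byte audio payload behind a small per-packet header, sent over UDP – is essentially RTP framing (RFC 3550). As a rough illustration only (the function names are hypothetical, and payload type 0 / PCMU at 8 kHz is assumed), building and checking such a packet in C might look like:

```c
#include <stdint.h>
#include <string.h>
#include <stddef.h>

#define PAYLOAD_LEN 160            /* 20 ms of 8 kHz 8-bit G.711 audio */
#define RTP_HDR_LEN 12

/* Build an RTP packet: 12-byte header followed by the payload.
   Returns the total packet length written into 'out'. */
size_t rtp_build(uint8_t *out, uint16_t seq, uint32_t timestamp,
                 uint32_t ssrc, const uint8_t *payload)
{
    out[0]  = 0x80;                /* V=2, no padding/extension/CSRC */
    out[1]  = 0x00;                /* marker=0, payload type 0 (PCMU) */
    out[2]  = (uint8_t)(seq >> 8);
    out[3]  = (uint8_t)(seq & 0xFF);
    out[4]  = (uint8_t)(timestamp >> 24);
    out[5]  = (uint8_t)(timestamp >> 16);
    out[6]  = (uint8_t)(timestamp >> 8);
    out[7]  = (uint8_t)(timestamp);
    out[8]  = (uint8_t)(ssrc >> 24);
    out[9]  = (uint8_t)(ssrc >> 16);
    out[10] = (uint8_t)(ssrc >> 8);
    out[11] = (uint8_t)(ssrc);
    memcpy(out + RTP_HDR_LEN, payload, PAYLOAD_LEN);
    return RTP_HDR_LEN + PAYLOAD_LEN;
}

/* Check an incoming packet's header before handing the 160 payload
   bytes to the audio output.  Returns the payload pointer, or NULL. */
const uint8_t *rtp_check(const uint8_t *pkt, size_t len)
{
    if (len != RTP_HDR_LEN + PAYLOAD_LEN) return NULL;
    if ((pkt[0] & 0xC0) != 0x80) return NULL;   /* version must be 2 */
    if ((pkt[1] & 0x7F) != 0) return NULL;      /* expect PCMU */
    return pkt + RTP_HDR_LEN;
}
```

A real implementation would also track sequence numbers and timestamps for jitter handling; this just shows how little per-packet work the background code actually has to do.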
Jeffrey Lee (213) 6048 posts |
The general method would be to write the code as a module, so that you don’t need to worry about the code being paged out when data comes in/out of the system. I’m not an expert on the RISC OS network stack, but I’d assume there’s an event or service call of some kind which you can listen out for to indicate that a socket has received new data, or has free space in its transmit buffer.

For audio input we have the benefit of being able to come up with an API from scratch. So we can decide whether it should be a broadcast event (RISC OS event, service call, vector, etc.) or something more targeted like a function call directly into the recipient’s code. Personally I’m not a fan of the broadcast approach (the CPU load doesn’t scale very well as the number of event producers or event consumers increases), so I’d rather see a function call/callback based approach be used.

To allow software to cooperatively share an input stream, we could allow multiple clients to register an interest in a given stream, and have their callbacks be called in turn whenever new data arrives. The callback would be given the buffer pointer and the length, and the data would only be guaranteed to be valid for the duration of the callback (e.g. for builtin audio the pointer could be a direct pointer to a circular DMA buffer). Essentially it would be the same interface SharedSound offers to its clients, except they’d be reading data out of the buffer instead of writing (or mixing) data into it. |
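Jeffrey’s register-an-interest scheme could be sketched like this in C. All names here are hypothetical (no such API exists yet); the point is that each registered client sees the same read-only block in turn, and the pointer is only valid for the duration of the callback:

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CLIENTS 8

/* Client callback: 'samples' is only valid for the duration of the
   call (it may point straight into a circular DMA buffer). */
typedef void (*audio_in_cb)(const int16_t *samples, size_t count, void *ws);

static struct { audio_in_cb cb; void *ws; } clients[MAX_CLIENTS];

/* Register an interest in the input stream; returns a handle or -1. */
int stream_register(audio_in_cb cb, void *ws)
{
    for (int i = 0; i < MAX_CLIENTS; i++) {
        if (clients[i].cb == NULL) {
            clients[i].cb = cb;
            clients[i].ws = ws;
            return i;
        }
    }
    return -1;
}

void stream_deregister(int handle)
{
    clients[handle].cb = NULL;
}

/* Called by the driver whenever a new block of input arrives:
   every registered client's callback is invoked in turn. */
void stream_deliver(const int16_t *samples, size_t count)
{
    for (int i = 0; i < MAX_CLIENTS; i++)
        if (clients[i].cb)
            clients[i].cb(samples, count, clients[i].ws);
}

/* Example client: just counts the samples it has seen. */
static size_t seen;
static void count_cb(const int16_t *s, size_t n, void *ws)
{
    (void)s; (void)ws;
    seen += n;
}
```

In a real module the workspace pointer (`ws`) would carry the client’s private state, mirroring how SharedSound passes a handler parameter to its clients.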
Colin (478) 2433 posts |
Dave. You can’t write the whole application in application space without running into the problem of latency and poor multitasking. So your application space program is limited to starting/stopping the call and possibly doing some monitoring. The call would be done in a module. For a USB mic, data from the microphone is signaled by UpCall_DeviceRxDataPresent. I’d avoid OS_GBPB and use the buffer SWIs, or better still use the buffer functions directly, to read the data from the buffer. The VoIP data would be signaled by the Internet event. |
Dave Higton (1515) 3497 posts |
I think that’s a restatement of what I wrote earlier.
How often would I get this UpCall if I’m dealing with 8 kHz sampling rate? Would I get an UpCall every millisecond? Can I wait for 20 ms (160 samples) before I collect data? |
Colin (478) 2433 posts |
Doesn’t matter – but it would be every 2 ms, because I’m transferring in 2 ms blocks (it works better that way).
No. You must empty the buffer as data arrives otherwise you don’t get another UpCall_DeviceRxDataPresent which only happens when the buffer goes from empty to containing data. Basically the upcall handler would transfer data to the socket. |
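Colin’s point is that UpCall_DeviceRxDataPresent is edge-triggered: it fires only when the buffer goes from empty to non-empty. A toy model (plain C, nothing RISC OS specific) shows why the handler must drain the buffer completely before returning:

```c
#include <stddef.h>

/* Toy model of an edge-triggered "data present" notification:
   the upcall fires only on the empty -> non-empty transition. */
static size_t fill;                 /* bytes currently buffered */
static int upcalls;                 /* count of simulated upcalls */

/* Device side: insert n bytes; raise the "upcall" only on the
   empty -> non-empty edge, like UpCall_DeviceRxDataPresent. */
void device_insert(size_t n)
{
    int was_empty = (fill == 0);
    fill += n;
    if (was_empty) upcalls++;       /* only the edge triggers a call */
}

/* A correct handler drains everything that has arrived... */
size_t handler_drain_all(void)
{
    size_t n = fill;
    fill = 0;
    return n;
}

/* ...whereas a handler that takes only a fixed chunk can leave
   data behind, and no further upcall will ever arrive for it. */
size_t handler_drain_some(size_t max)
{
    size_t n = (fill < max) ? fill : max;
    fill -= n;
    return n;
}
```

If the handler used `handler_drain_some` and left bytes behind, the buffer would never again make the empty-to-non-empty transition, and the stream would stall – exactly the failure mode Colin warns about.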
Dave Higton (1515) 3497 posts |
I just have to revisit my code and try some of this. |
Jon Abbott (1421) 2641 posts |
The application can specify its requirements when it registers the stream.
That’s how it was designed to be – ref. the PRMs regarding audio.
There’s not really a problem: with a big enough buffer the application can take it in batches. If there’s a requirement for real time, as in the VoIP example, it should be within a module. We don’t need to reinvent the wheel for the odd app; RISC OS isn’t pre-emptively multitasked, so it would always be a half measure to get apps dealing with near-realtime data.
I agree with you on this point. Service broadcasts aren’t efficient with regard to realtime data. We just need to extend Sound_Configure to allow a stream handler to be registered (referring to sound output here – CD streaming, MP3 playback etc.) with a buffer size it specifies, and let SoundDMA poll the app/module at sensible intervals to ensure the buffer is kept reasonably full. |
Dave Higton (1515) 3497 posts |
UpCall_DeviceRxDataPresent – does the UpCall only come to me in response to my buffer having data? |
Colin (478) 2433 posts |
Yes. It occurs when an empty buffer has data inserted. |
jim lesurf (2082) 1438 posts |
It also should accept 24 bit. Where possible this should be passed to the hardware as 24 bit, but some hardware may need it cut down. So in practice it’s safest to assume that the situation for replay can be: accept 32/24/16/8 bit input as supplied by the file, etc., and either pad or snip it if necessary to whichever of 32/24/16/8 the hardware accepts. (Although I hope that less than 16 bits would rarely be required by hardware these days!) And you may face a similar situation with USB ADCs: these may send 32 or 24 bit words which carry 24 bits of audio. FWIW, if a new SoundDMA can handle that it may well also work with some trendy new (sic) formats like DSD/DXD as well, because they tend to allow the system to ‘pretend’ the data is LPCM. But in reality the main ‘high quality’ requirement tends to be for 24 bit LPCM sent as three or four bytes. (Plus the need for rates > 48k, which is, I assume, not really a problem here.) Jim |
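Jim’s “pad or snip” can be made concrete. A minimal sketch, assuming packed little-endian 24-bit samples as a USB ADC might deliver them (a real driver would likely dither rather than truncate when snipping):

```c
#include <stdint.h>
#include <stddef.h>

/* Pad: unpack little-endian packed 24-bit samples into 32-bit words,
   with the audio data in the top 24 bits and the low byte zero. */
void pcm24_to_32(const uint8_t *in, int32_t *out, size_t samples)
{
    for (size_t i = 0; i < samples; i++) {
        out[i] = (int32_t)((uint32_t)in[3*i]     << 8  |
                           (uint32_t)in[3*i + 1] << 16 |
                           (uint32_t)in[3*i + 2] << 24);
    }
}

/* Snip: keep only the top 16 bits of each 24-bit sample.  A real
   driver might dither here rather than simply truncating. */
void pcm24_to_16(const uint8_t *in, int16_t *out, size_t samples)
{
    for (size_t i = 0; i < samples; i++)
        out[i] = (int16_t)((uint32_t)in[3*i + 1] |
                           (uint32_t)in[3*i + 2] << 8);
}
```

The same pattern extends to 24-in-32-bit transport: the conversion is just a shift and mask per sample, which is why handling extra depths in SoundDMA should be cheap.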
Jon Abbott (1421) 2641 posts |
It should be easy enough to get SoundDMA (if that’s where this ends up) to handle any bit rate/depth in/out and deal with the necessary conversion. Implementing rate conversion in ADFFS was fairly trivial and bit depth is just one additional instruction to shift the output. We’d probably want to shift this out of SoundDMA though and allow upsamplers to register in a similar fashion to the linear handler, future proofing it for new and improved methods and user provided upsamplers. There’s also the possibility of registering AC3/AAC handlers and feeding digital streams straight out over HDMI. |
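Jon says rate conversion was fairly trivial in ADFFS. A generic linear-interpolation converter (not ADFFS’s actual code – just the standard 16.16 fixed-point technique) looks like this:

```c
#include <stdint.h>
#include <stddef.h>

/* Linear-interpolation rate converter using 16.16 fixed point.
   Writes out_count samples interpolated from 'in'; the caller must
   supply enough source samples to cover the last output position
   plus one (each output reads in[idx] and in[idx + 1]). */
void resample_linear(const int16_t *in, int16_t *out, size_t out_count,
                     uint32_t in_rate, uint32_t out_rate)
{
    uint32_t step = (uint32_t)(((uint64_t)in_rate << 16) / out_rate);
    uint32_t pos = 0;               /* 16.16 position within 'in' */

    for (size_t i = 0; i < out_count; i++) {
        size_t  idx  = pos >> 16;
        int32_t frac = (int32_t)(pos & 0xFFFF);
        int32_t a = in[idx], b = in[idx + 1];
        out[i] = (int16_t)(a + (((b - a) * frac) >> 16));
        pos += step;
    }
}
```

This is the “linear handler” style of upsampler; registering better ones (windowed-sinc, etc.) alongside it is exactly the pluggable scheme Jon describes.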
Dave Higton (1515) 3497 posts |
Well, a bit of good news, anyway. I’ve been playing with IyoPhone, and I’ve successfully received a couple of phone calls with the audio in and out via a Plantronics analogue headset and my old iMic USB audio device. Which means that, in principle, it would be possible to run it on any of the current RISC OS platforms that support isochronous USB.

A couple of asides: I’ve been surprised at how few audio devices support 8 kHz sampling directly. I have a Plantronics headset at work that only supports 44.1 kHz in and out. Of all the sampling rates, that seems to me the least helpful for a headset. And IyoPhone, when I used it, was always plagued by a buzz from the microphone, which I’ve finally traced to interference from the DECT phone that I was using for the other end of the call.

So far, it’s still with programmed I/O for the audio transfers. No UpCalls yet. As far as I can see, to get all the audio transfers (in and out) to happen in the background will require an RTP module to be written. The local microphone signal has to be processed and sent out to the network; the incoming signal from the network will have to generate an event that will be handled by the RTP module, which will then process the audio and feed bytes to the USB audio output. I don’t see any way of copying arbitrary C functions into permanently paged-in RAM, and I really don’t fancy doing that in assembler. |
Colin (478) 2433 posts |
Does it need to run in the background? If it was a file transfer program instead of an audio transfer I don’t think it would worry you that it was in application space, and ideally that’s where it should be. Moving functions into a module has only 2 benefits that I can see:
1) audio won’t stop while using the desktop
Whether these extra features are worth the effort is debatable. (1) would be nice but not essential. For (2), the program you have now will tell you if that is a problem. Effort may be better served producing a nice front end. |
Dave Higton (1515) 3497 posts |
It’s only (1) that I’m concerned with. (2) is not a problem. |
Colin Ferris (399) 1809 posts |
What about using a ‘Circular Buffer’ – in the RMA? |
Dave Higton (1515) 3497 posts |
It’s not just a case of copying from one buffer to another – the samples have to be processed. Code has to run. |
Colin (478) 2433 posts |
I agree that you will need to write a module to do it. In the module you could just sit in a CallEvery routine and do what you are doing now: i.e. read the socket non-blocking and, if it reads something, process it and send it to the speaker; then in the same routine read from the microphone non-blocking and, if there’s data, process it and write it to the socket. No need to use UpCalls or Internet Events. |
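Colin’s CallEvery scheme can be mocked up off RISC OS with POSIX non-blocking sockets (OS_CallEvery itself is RISC OS; here `poll_once` is just a plain function standing in for the periodically-called routine, and the file descriptors stand in for the socket and the USB buffers – all names are illustrative):

```c
#include <sys/socket.h>
#include <unistd.h>
#include <stdint.h>
#include <stddef.h>

#define FRAME 160   /* one 20 ms frame at 8 kHz, 8 bits per sample */

/* Stand-in for the codec/processing step. */
typedef void (*process_fn)(uint8_t *buf, size_t n);

/* Trivial processor used for demonstration: leaves data unchanged. */
static void pass_through(uint8_t *buf, size_t n) { (void)buf; (void)n; }

/* Body of the periodically-called routine: poll both directions
   non-blocking, process whatever is ready, and return immediately.
   Returns how many directions had data on this tick. */
int poll_once(int net_fd, int mic_fd, int spk_fd, process_fn process)
{
    uint8_t buf[FRAME];
    int did = 0;
    ssize_t n;

    /* network -> speaker */
    n = recv(net_fd, buf, sizeof buf, MSG_DONTWAIT);
    if (n > 0) {
        process(buf, (size_t)n);
        if (write(spk_fd, buf, (size_t)n) == n) did++;
    }

    /* microphone -> network */
    n = recv(mic_fd, buf, sizeof buf, MSG_DONTWAIT);
    if (n > 0) {
        process(buf, (size_t)n);
        if (write(net_fd, buf, (size_t)n) == n) did++;
    }

    return did;
}
```

Because nothing blocks, the routine always returns quickly – which is exactly the property a CallEvery handler needs, since it runs in the background and must not stall the machine.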