Audio: future improvements?
Colin (478) 2433 posts |
I can’t see prioritising streams (hifi, media, legacy etc.) working. Media streaming, for example, may end up being hifi, and even if it’s not, it’s for the listener to decide that they don’t want interrupting (mixing), regardless of the quality. Having a stream that doesn’t mix doesn’t make sense to me either. If I’m watching a video and start a sound file, I don’t want my click on the sound file to be ignored because I had the video on. It should be analogous to watching TV and switching the hifi on. |
Jess Hampshire (158) 865 posts |
Then the player would choose hifi. (The media option is a fallback for when the desired sample rate isn’t available. It would usually be better for the player to output a supported rate than for the sound system to convert it.)
Hence the reason I suggested it. The main reason for this suggestion is that the audio source doesn’t need to know about the hardware. It would for example allow hifi destination sound to come from your hifi, telephone sound from a phone style handset and alerts from the computer’s speaker. It would suit a media player, where you don’t want noises annoying you, or a workstation, where alerts are more important than your background music. |
jim lesurf (2082) 1438 posts |
Agreed. But would mean that in situations like the present one with the ARMiniX you’d immediately have the option to get past the problem.
All the good USB ones I’ve tested work with the same open-standard isochronous/asynchronous method on Linux. When one works, the others do. No doubt there will be non-standard ones. But there will be a list of ones that work to choose from. And once you have one, if RO supports the method, you can just shift the USB to your new machine and it should work as before. It means you can spend more on a better DAC in the confidence that you can go on using it even when you change your host computer. Many also work with laptops and don’t need a mains PSU. So in essence, once you have one of these DACs, and the right open standard stack, it becomes plug in and go. I’ve tested a number of them, and currently use three different ones on different machines with various distros, and have some others for special purposes. The main annoyance with them on Linux is that PulseAudio – as set by the distro developers – may ‘nanny knows best’ assume you must want to use the internal soundcard. Beyond that, they just work. Bit perfect. Sample perfect, etc. TBH, once you’ve used these for a while the idea of an internal soundcard seems as weird as an ‘internal’ monitor for anything but a portable machine. [conversions being ‘simple’]
Indeed! This is why mixing becomes a problem. As soon as the user expects to play sources at two different rates (usually 44.1 and 48k) then some resampling becomes unavoidable. We now have mixing. So we have to both look carefully at resampling and ensure users can opt for blocking ‘pass the parcel’. Jim |
jim lesurf (2082) 1438 posts |
Personally, I agree that we should not be too concerned about audio quality for things like system beeps or old 8-bit audio. However some people may have reason to be concerned. TBH I’m more concerned about ensuring things like DigitalRender and *nix ports work OK. That’s a slightly different issue, though.
I agree. That’s what I set up with my Linux systems. Note also that some USB DACs provide their own volume control which is applied as part of the conversion process. e.g.s being the Cambridge Audio DACMagic Plus and 851C (which also has a balance control because I asked for one ;→ ). These can work with precision without having the host do the grunt work.
Bear in mind that 96k/24bit is well on the way to being a common standard. The chip in the ARMiniX will play this if we can get the data to it. And the decent USB DACs all accept it as standard. Oh, and there are good commercial reasons why we can expect iToons, etc, to start offering higher bitrates, LPCM, etc. If nothing else there is the urge of Meeja companies to get everyone to buy another version and make the ‘newer better’ format fashionable. In a few years’ time, with 4G, etc, etc, no fashionista or tendu will want to be caught using mere 44.1k rate mp3! I assume that you’re now playing flac OK. DigitalCD/DiskSample play them, and of course sox or the flac tool will handle them.
Depends what you mean by format. If you mean various types/rates of LPCM then a mixer/resampler is needed somewhere. Proper design would allow that to be optional. i.e. The host knows what rates the DAC can accept. So it sends ‘pass the parcel’ ones that are OK and carefully resamples ones that aren’t on the list. Ideally flagging the user so they know what is happening. Should be handled via the playing app and driver. One point to add. Any resampling/volume changes/mixing is much better done as 24bit or 32bit even for 16bit inputs, which means you can avoid having to worry so much about employing added processes for dithering and noise shaping to keep down quantisation effects. So any volume/mixing processes really should all be 32bit. My guess is that this doesn’t make things much harder for RO + ARM systems. Given being able to forget dithering, etc, it should end up being faster and more convenient. If by format you mean ‘flac’, say, rather than ‘lpcm wav’ then it is down to the playing software app/driver normally. That has to ‘add water to reconstitute the lpcm defined’. Jim |
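A minimal sketch of the "do the volume/mixing arithmetic wide, round once at the end" point above. Everything here is an assumption for illustration: the function name `mix_16bit`, the Q16 fixed-point gain format (65536 meaning unity gain), and the use of Python's unbounded integers to stand in for a 32bit-or-wider accumulator. It is not how the RISC OS sound system does it.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def mix_16bit(streams, gains):
    """Mix lists of signed 16bit samples using a wide accumulator.

    gains are hypothetical Q16 fixed-point factors: 65536 == unity gain.
    """
    length = max(len(s) for s in streams)
    out = []
    for i in range(length):
        acc = 0  # wide accumulator: nothing is rounded while summing
        for samples, gain in zip(streams, gains):
            if i < len(samples):
                acc += samples[i] * gain
        # one rounding step at the very end, then clip back to 16 bits
        value = (acc + 32768) >> 16
        out.append(max(INT16_MIN, min(INT16_MAX, value)))
    return out

# Two streams, each at half volume
print(mix_16bit([[1000, -1000], [2000, 2000]], [32768, 32768]))
```

The design point is only that each stream is scaled and summed at full precision, so there is a single rounding per output sample instead of one per stream. A real 32bit implementation would need to pick a gain format that leaves headroom for the sum.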
Colin (478) 2433 posts |
You have sources (various places where sound data can come from) and sound device destinations (speaker, headphone socket on sound device A etc.), and what you want to be able to do is say: play this source at that destination. You want music sound to go to destination x, which is your hifi connection. This isn’t a quality consideration; it purely depends on where you’ve got the hifi plugged in – you may have multiple identical USB DACs attached. Audio data quality is of no importance. You may want to redirect sound on the fly, e.g. you are listening to the internet via your default destination (the computer) and decide you want to hear it on your hifi (or headphones), so you just change the destination. You can’t use the ‘quality of the audio’ to determine the destination: an mp3 could be a hifi sound, a telephone sound or an alert. Splitting sounds into system sounds and everything else sort of makes sense, as you probably don’t want a 100dB alert beep while listening to Beethoven’s 7th at full blast. On a single sound device system, system sounds could be automatically muted; otherwise they could be redirected to a device of choice. Other sounds can be controlled by not playing them. Whether the sound data is then passed through unchanged or not is of no concern; I want the sound from destination X. Obviously if destination X can handle the sound unmodified, it should do. |
Jess Hampshire (158) 865 posts |
Missed this earlier:
Complex parts of the RISC OS system yes. Of the sound theory, no.
Sort of. More accurately, when you convert D to A you get extra high frequency components above half the sample frequency. These need to be filtered out to get the original sound. If you do it with an analogue filter it introduces severe phase errors at the top end of the frequency range. So if you resample to 88KHz this removes the problem, because the top half of the frequency range isn’t used. (The sample rate needs to be over twice the highest frequency to capture all the information and not cause distortion.) Resampling is mathematically converting the stream from one sample rate to another. This can cause rounding errors and, if not done well, can cause drops in quality. (Obviously doubling or quadrupling is relatively simple.) This is what needs to be avoided if possible (apart from doubling or quadrupling it). Bitrate for MP3s is different to sample rate. (Think of bitrate being like the size of a jpeg, and sample rate being like the picture size.) Also, if you use too low a sample rate on a simple DAC, then the high frequency components won’t be filtered out and will sound horrible.
Because if a program can adjust its output for what the system can use directly, it is likely to result in a better sound quality. (Potentially cut out one set of resampling). The media option is there as a fallback, I would hope it to be mainly used for low quality media where the source doesn’t warrant the effort the programmer would have to put in. Basically you cut out the possibility of accidentally degrading the quality of a hifi stream. “This is why I lumped alerts in with legacy – as it is likely that legacy style code will be used to play the beeps.” The reason I separated them was that legacy would include all programs that already exist and haven’t been modified, including games and media players. For which you would want the option of sending to your hifi. Alerts are a system thing, so could be modified. And if they are lumped in with legacy, if you use an old media player, your system bleeps would also come out of the hifi (not nice).
That is my point about not being a programmer. I used the term in the most generic way ever, and how it is actually delivered to the sound system is a matter for programmers with the relevant experience to decide. You could think of my destinations as sets of rules for the sound system.
Simple, you just don’t get that facility. e.g. Ideally on a Pi you could configure HiFi to go to the HDMI and alerts to the analogue. But it seems unlikely yet, so it would be no different to a RiscPC in effect.
Yes, I was referring to whether the program takes action or not, not how that action is achieved.
You would have to resample to the system sample rate. Which while you were playing a hifi stream would be that rate (or 2 x or 4x). The quality would have to depend on the available processing power. |
Rick Murray (539) 13850 posts |
…except that it is. The current problem with the sound system, as I understand it, is that the sound system makes some odd assumptions about the underlying hardware and does a lot of unnecessary work – perhaps a throwback to the VIDC days when the hardware wasn’t so smart? At any rate, this degrades the sound quality.
That would be nice; although I tend to listen to stuff with headphones or one set of speakers. All I’d ask is for the beeps to play at half volume, and to mix in to the main audio, not replace it (as happens with Android, can’t it mix?).
I know the difference. On the PC, my MP3 program says something like “320kbit (CBR) full stereo 44100Hz 16 bit”. As for my FLAC issue – my (older) WinAMP didn’t want to know, SMPlayer choked, most of the stuff I found for Android was either non free or had really dodgy permissions. I never thought to use RISC OS. Good to know that it works, I’ll have to remember that. |
Jess Hampshire (158) 865 posts |
A further thought: Suppose you are using an expensive external DAC. If you wish to play a 44KHz stream the odds are it will sound better at that rate than if you oversample to 88KHz and send it (because the DAC will probably oversample better). However, if you are processing that stream in any non-trivial way, then the likelihood is that oversampling to 88KHz then processing would give a better end result. But if the main purpose is to play a quality stream, you may prefer maximum quality most of the time and accept that just for the duration of an alert the sound isn’t optimum. (And if you have it configured to allow alerts to distract you, it’s because you need to act on them.) |
Colin (478) 2433 posts |
Rick. We are talking about improvements here. The failing of the current system is due to it being designed for old devices. I’ll also add that a new system should take all sound data types directly (mp3, wma etc.); that way, if a device can do everything, the data can be sent directly to it. The only way to design the system is to not assume device capabilities and to be able to overcome a device’s shortcomings via sound system plugins. So the sound system needs to know the device’s capabilities, and if no conversion is needed the data is passed straight through. So at each stage in the process between audio data and device, the sound system asks the device driver if it can handle the data at this stage directly. So you have Play mp3 The point being that any conversions are minimised, and as the programmer of the sound system I don’t care about the quality of the audio data or the quality of the device. |
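One way to picture the "ask the driver at each stage" idea in the post above, as a rough sketch only. The function `plan_playback`, the format strings and the converter table are all hypothetical names made up for illustration; no such RISC OS API is implied. The assumption is that the driver reports a set of formats it accepts, and each plugin converts one format into one other.

```python
def plan_playback(source_format, device_formats, converters):
    """Return the conversion steps needed before the device accepts the data.

    source_format  -- e.g. "mp3"
    device_formats -- set of formats the (hypothetical) driver says it accepts
    converters     -- dict mapping a format to the format its plugin produces
    """
    steps = []
    current = source_format
    seen = {current}
    # Ask at every stage: can the device take the data as it stands?
    while current not in device_formats:
        if current not in converters:
            raise ValueError(f"no plugin can convert {current!r}")
        produced = converters[current]
        if produced in seen:
            raise ValueError("converter chain loops")
        steps.append((current, produced))
        seen.add(produced)
        current = produced
    return steps

plugins = {"mp3": "pcm 44.1k/16", "pcm 44.1k/16": "pcm 48k/16"}
print(plan_playback("mp3", {"mp3"}, plugins))         # device decodes mp3 itself
print(plan_playback("mp3", {"pcm 48k/16"}, plugins))  # decode, then resample
```

An empty list means the data is passed straight through; otherwise only the conversions the device actually needs are applied, which is the "conversions are minimised" property.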
jim lesurf (2082) 1438 posts |
Need to unpack that to some extent. There are a number of problems/imperfections at present. Most are in the “devil in the details” category and depend on the hardware, etc. One example being the clipping of the ARMiniX. Another being that the low system rates offered munch the sound into mono plus ultrasonic hash. Another being that the only resampling method currently on offer is linear interpolation. Some problems are bugs. Others are more that attention hasn’t been devoted to providing something better. e.g. The linear interpolation. That works OK for x2 conversions since any garbage tends to be in the ultrasonic region. I’ve ‘fixed’ it on my ARMiniX by adding a good analogue filter to the output. That also suppresses the delta-sigma hash produced by the chip which is a ‘feature’ of these kinds of chip. But it would be much better to have improved interpolation being offered for a number of reasons – some of which have already been mentioned. Look at Linux as an example here. There ALSA (and apps like sox) can provide really good resampling if you set the system to use it. When a conversion isn’t needed, still best avoided. But if one is, do it well. And this matters much more if the user wants mixing and to play things at more than one source rate. Jim |
jim lesurf (2082) 1438 posts |
Yes. The better DACs upsample everything to rates/depths like 384k/32 bit anyway. Often higher and deeper. This means they can do a lot of fiddling with the details to get optimum results on dedicated hardware designed to pass muster when used by demanding customers.
It may be better to look at that from a different POV and regard the process as being that doing any processing tends to be easier if you upsample a fair way as an inherent part of all the processing required. Indeed, some systems work by using a high rate that is an integer multiple of both 44.1k and 48k so each upsample at the start can be an integer ratio. Note that the TI chip in the ARMiniX uses internal clocks in the region around 10MHz and above. Jim |
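To put a number on the "integer multiple of both 44.1k and 48k" remark, here is a small sketch. The helper name `common_rate` is made up for illustration; the arithmetic is just a least common multiple.

```python
from math import gcd

def common_rate(rate_a, rate_b):
    """Lowest rate that is an integer multiple of both inputs,
    plus the integer upsampling factor needed from each."""
    lcm = rate_a * rate_b // gcd(rate_a, rate_b)
    return lcm, lcm // rate_a, lcm // rate_b

print(common_rate(44100, 48000))  # (7056000, 160, 147)
```

So the lowest common rate is about 7.056MHz, reached by upsampling 44.1k by 160 and 48k by 147.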
Colin Ferris (399) 1818 posts |
Have you any ideas – on how to improve the sound output of the Iyonix? |
jim lesurf (2082) 1438 posts |
I can suggest two routes. A) That someone implement a good sample rate convertor from 44.1k to 48k which would process the values cleanly before sending to the hardware. or B) People use something like sox to generate 48k versions of their 44.1k source material. Of the two, sox is open to the user right now. The snag being that the process would be slow and mean you have a ‘double inventory’ of files. But sox can do very clean conversions in my experience. (I’m assuming, though, that the RO port works as nicely as the Linux version.) The ‘better resampling’ option may fall out from it being done for newer hardware. However good 44.1k to 48k may be too much work for the Iyonix to keep up with when playing audio. Dunno, may be fine. The point being that the Iyonix hardware is nailed to a 48k clock and the resampling distortions tend to be nasty if you start with 44.1k material. In addition, an output filter + analogue stage would help a little as you can avoid sacrificing the DAC range when you want a lower level of output. The circuit I’m about to write up for the ARMiniX would be suitable. However note that this won’t fix problems produced by poor 44.1k to 48k conversion. So it isn’t a substitute for the above options! |
jim lesurf (2082) 1438 posts |
I’ll add a few more points which exemplify the kinds of things we may need to take into account at one level or another. A) I may decide to load a series of files into my choice of player app. These have an assortment of rates and depths. e.g. some from CD audio or iPlayer will be 44.1k/16bit. Others from TV or DVD will be 48k/16. Others from downloaded files or DVD-A will be 96k/24. Some may even be ‘surround’ from HDTV or DVD-V.1 In this situation I expect the player to just play the files in sequence, with the sample rate set for the DAC ‘following the file’. Indeed, I hope for essentially ‘gapless playback’ with no clicks or pauses between items. B) A few days ago I was listening to a series of 44.1k files using DigitalCD with the ARMiniX system rate set to 88.2k. During this I was searching for another file I wanted to play. I accidentally clicked on and ran MusicMan2 rather than the directory of files beside it. The playback glitched severely, volume changed, and then I could not alter the volume at all. The reason seems to be that – as supplied – MM2 when run issues a command to set the system rate to 44.1k. So this happened whilst I was playing music with the rate set to 88.2k. I had tried to tick the options to ‘block’. But still the system rate ‘change’ caused a foul-up. We need to think about cases like these in terms of extended API, etc. Jim 1 The Proms on HDTV have 5.1 surround sound. If you record them using something like a suitable ‘PC TV’ tuner you can then extract the surround audio. A detail here is that the ‘.1’ part has a lower sample rate than the main channels. 8-] |
Lee Shepherd (435) 51 posts |
Would this resampling of 44.1k to 48k be the cause of this – http://www.drobe.co.uk/article.php?id=1438&hlt=iyonix Lee |
Jess Hampshire (158) 865 posts |
As I understand it, the Iyonix makes a mess of 44.1 KHz material. However I’ve never really noticed a problem; presumably DigitalCD resamples as needed and makes a decent job of it. (My Iyonix has never been played via my hifi, though.) |
jim lesurf (2082) 1438 posts |
If the Iyonix is set to 44.1k system rate then DCD will just feed it 44.1k assuming the hardware will cope. The Iyonix then simply ‘repeats’ samples every now and then because it has to output 48000 per second (per channel) even if being fed at a lower rate. This causes distortion. If the Iyonix system rate is set to 48k and you have set DCD to ‘interpolate’ then DCD/Driver will linearly interpolate. A little better than the above shambles, but it also causes distortion. How much you will notice or be bothered depends entirely on the circumstances. And expectations to some extent. If you want to check, use sox to generate a 48k version of some of your music files. Set the system rate (and check DCD settings), and see how that compares with playing the 44.1k original. Slainte, Jim |
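For anyone who wants to see the two cases above side by side without hardware, here is a rough simulation. It is a sketch under stated assumptions, not a model of the actual Iyonix driver: "repeating" is taken to mean each 48k output slot plays the most recent 44.1k input sample, the test tone is 3kHz at unit amplitude, and all function names are made up for illustration.

```python
import math

SRC_RATE = 44100   # rate of the file
DST_RATE = 48000   # fixed hardware clock

def tone(freq, rate, count):
    """A unit-amplitude sine sampled at the given rate."""
    return [math.sin(2 * math.pi * freq * i / rate) for i in range(count)]

def by_repeating(samples, count):
    """Play the most recent input sample; some inputs get played twice."""
    return [samples[(i * SRC_RATE) // DST_RATE] for i in range(count)]

def by_linear(samples, count):
    """Linear interpolation between the two neighbouring input samples."""
    out = []
    for i in range(count):
        whole, rem = divmod(i * SRC_RATE, DST_RATE)
        frac = rem / DST_RATE
        out.append(samples[whole] * (1 - frac) + samples[whole + 1] * frac)
    return out

def rms_error(played, freq):
    """RMS difference from the tone an ideal 48k output would have produced."""
    ideal = tone(freq, DST_RATE, len(played))
    return math.sqrt(sum((p - w) ** 2 for p, w in zip(played, ideal)) / len(played))

source = tone(3000, SRC_RATE, 4500)          # a little over 0.1s of input
err_repeat = rms_error(by_repeating(source, 4800), 3000)
err_linear = rms_error(by_linear(source, 4800), 3000)
print(f"repeat: {err_repeat:.4f}  linear: {err_linear:.4f}")
```

Under these assumptions linear interpolation comes out clearly better than repeating samples, but neither error is zero, which matches the "a little better, but it also causes distortion" description. A single RMS figure says nothing about how audible the result is.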
jim lesurf (2082) 1438 posts |
Dunno. People react in different ways, and it will depend on the details of the test waveform and setup. Crackling could also be some other kind of timing or interrupt problem. However a frequency sweep would mean that the (anharmonic) alias generated by the problem change their relationship to the test waveform as the frequency changes. So it would have different effects as the frequency changed, and vary in audibility. I did an ‘Archive’ article on this some years ago. I guess I should sometime get around to poking all those old articles onto the web. Jim |
jim lesurf (2082) 1438 posts |
Here yer go… http://jcgl.orpheusweb.co.uk/temp/badtime.png I covered this in the 12th article I wrote for Archive on audio. (The next article went into how to do resampling.) The above is a couple of the figures from the article. The top graph shows what happens with a 3kHz test tone. The lower one 300 Hz. In each case the machine is trying to play a 44.1k file having been told the system rate is 44.1k. Since the Iyonix couldn’t actually do that you get results like those shown. The broken blue line is what the player thinks is being played. The red line is what you get. The black line is the difference. Pretty, isn’t it! 8-] The problem with these distortions is that they rise swiftly with frequency. They don’t notice so much on low frequency tones. But more complex tones with high frequencies are altered rather more. It’s a form of phase modulation. And it doesn’t depend on the amplitude of the waveforms. Just the frequencies. Jim |
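A short worked version of the "rises with frequency, independent of amplitude" point, as a sketch. The assumption (not taken from the Archive article) is that the fault can be modelled as playing each sample late by a varying lag $\tau(t)$ of somewhere between zero and one input sample period, i.e. $0 \le \tau < 1/44100\,\mathrm{s} \approx 22.7\,\mu\mathrm{s}$.

```latex
% Wanted tone and what is actually played (timing lag tau)
x(t) = A \sin(2\pi f t), \qquad y(t) = A \sin\bigl(2\pi f\,(t - \tau(t))\bigr)

% For small f\tau the error is approximately
e(t) = y(t) - x(t) \approx -2\pi f\,\tau(t)\,A \cos(2\pi f t)

% so the error relative to the signal amplitude is bounded by the phase shift
\frac{|e|}{A} \lesssim 2\pi f\,\tau_{\max}
  \approx 0.043 \text{ at } f = 300\,\mathrm{Hz}, \qquad
  \approx 0.43 \text{ at } f = 3\,\mathrm{kHz}
```

The lag appears only inside the sine, hence phase modulation; the relative error scales with $f$ and the amplitude $A$ cancels out. The small-angle step is only rough at 3kHz, so treat the second figure as an order-of-magnitude estimate.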
nemo (145) 2556 posts |
I think hoping for gapless playback of tracks compressed using the same codec at the same sample rate is an entirely reasonable aspiration and, these days, expectation. But I don’t expect gapless playback of entirely different sample rates. At least, not ‘proper’ gapless playback. Cross-mixing is reasonable enough I suppose. Synchronising the interpolated representation of the abrupt end of one track with the abrupt continuation of the same wave cycle at a different sample rate from a different codec is probably asking a little too much.
I have to admit that the red line doesn’t look like the result of ‘repeating samples’, unless there’s some smoothing going on afterwards. Does this graph show something else? |
Jeffrey Lee (213) 6048 posts |
Especially since most hardware we’re targeting these days (OMAP3, OMAP4, Pi) requires you to stop and restart the audio output in order to switch sample rates. And stopping and starting the chip can introduce noise on the output! (Nasty clicks and pops on the Pi, and a high-pitch whine on OMAP3. Presumably the same on OMAP4.) However, for IOMD and I think Iyonix, gapless sample rate changes are possible, and are currently supported by the OS. |
jim lesurf (2082) 1438 posts |
In general, gapless playback is only likely to be required (in my experience) when what you are playing does consist of a series of files sharing the same sample rate, bit depth, and format. That’s because they’ll probably be something like chunks ‘tracked’ out of a continuous work. For most people that might be from a CD/‘album’. For me it also tends to be some of what I’m working on at present. Proms recordings that I make as a continuous recording, but then snip into sections. That way I can either play a part, or the entire thing.1 Using a player on Linux I get gapless playback. But there are pauses when I try this with something like DigitalCD. That can mean silences in the audience noise, but can also mean a pause in music for some long works. I can see that rate switching would be problematic. But in practice I doubt gapless playback would be needed for cases where the rate changes from file to file. In that context a pause may be a good thing to help space out different items. Jim 1 FWIW I tend to record an entire concert, then snip it about later, keeping all sections. So I can then listen to an entire concert or a chosen work/part. |
Rick Murray (539) 13850 posts |
Gapless playback? Oh you youngsters! ;-) I have Jeff Wayne’s War Of The Worlds. On vinyl. Four sides of vinyl. Not only is it not gapless, but you need to physically manipulate the playback mechanism so there can be lengthy gaps in between the sections. Oh, and let’s not even talk about the issues involved in getting the playback mechanism to run at the exact correct sampling rate (measured in RPMs, not Hz!), plus additional complications such as a gunky read-head, plus reconstituting the correct sound from the RIAA munging (think of it like an early form of DRM if you must ;-) ). And, to my knowledge, nobody has commercialised a Walkman-style turntable 1. Perhaps they learned from the big pile of fail that was the in-car turntable? Really, you people think you have it hard these days. :-) :-) :-) :-) 1 On the other hand, I don’t see why you couldn’t clamp an LP into a caddy that holds it top and bottom (like a CD player) and then use a tracked laser (again, like a CD player) to bounce off of the grooves, with the optical deflection of the laser being able to be sensed and thus converted into a waveform… |
Steve Pampling (1551) 8172 posts |
Uh? There’s another version. Oh, yes the wife bought a copy on CD, later. |
jim lesurf (2082) 1438 posts |
I used to do things like that with my LPs… until I changed to simply making digital copies. Makes it far easier to find a track, etc. Oh, and ‘Finial’ developed a ‘laser player’ which I think is still available if your pockets are deep enough. The snag is that the discs have to be ultra-clean. Otherwise you hear all the clicks from small grains of dirt that the stylus otherwise kicks out of the groove as you play the LP. IIRC it was Audio-Technica1 who made an LP deck where the center (sic) spindle auto-adjusted to ‘center’ an LP with its hole off-center and so null ‘wow’ whilst playing. Jim 1 Not sure that’s the right company, but one of the Japanese ones fanatical at the time about direct-drive, etc. |