In any digital audio system a digital filter is needed to remove what are called aliases of the signal. These aliases are not imperfections in the design in any sense; they are a mathematical artifact of the act of sampling a continuous signal into a series of digital numbers taken at different points in time.
Aliases are not present in the ‘older’ analog recording formats: tape and vinyl records capture continuous signals, and do not create these artifacts.
Perhaps your first introduction to aliasing due to a finite sample rate was watching cowboy movies in the sixties and seventies: sometimes the wheels of the wagon trains would appear to turn the wrong way, or even slow and reverse direction, despite the wagon clearly continuing to move.
This was due to the camera used to make the movie: it was sampling the scene at 24 frames per second, but the wheel spokes were moving much faster than that. Between one frame and the next a wheel had rotated by more than one spoke spacing, and the camera therefore generated artifacts in the playback that showed the wheels moving at the wrong rate.
This effect will occur every time something is represented in a non-continuous fashion. Physicists first discovered this phenomenon when they looked at vibrations in crystals. Something very odd was happening as the frequency of the vibration increased: the energy was coming out in the wrong place!
It took some very clever physicists to realize what was happening: the crystal is made up of discrete atoms, all the same distance apart, and the vibrations (called phonons) passing through the crystal were being sampled by those regularly spaced atoms. So, because these pieces of crystal were made of a finite number of atoms all the same distance apart, when the phonon frequency was such that it moved more than one cycle in the distance between atoms (equivalent to the wagon wheel moving more than one spoke spacing between frames), the phonon frequency was changed; it was wrong.
Léon Brillouin, a French physicist, was among the first to figure out what was going on, back in the 1920s, and what are called “Brillouin Zones” define how a crystal creates phonon aliases.
Our problem in the audio world is much simpler than Brillouin’s, because it is only in one dimension, and engineers are used to thinking about the Brillouin zones as just certain frequencies that cannot be exceeded before there is “a problem”.
The frequency where problems start to occur is half the equivalent sampling rate. So, for example, in digital music recorded on a CD, the studio has sampled the signal at 44.1kS/s, and what physicists would call the first Brillouin zone ends at half this rate: at 22.05kHz. Engineers just call this the half sample rate, or sometimes the Nyquist limit.
If we ask the studio to encode a sound of 30kHz onto the CD, 30kHz will not come out when we play it back. Rather, 14.1kHz will come out. You can perhaps see where the 14.1kHz comes from: it is the difference between the 30kHz we applied and the 44.1kS/s we used to sample the signal.
Nothing is wrong and nothing is faulty in this scenario: each element is operating at mathematical perfection. It is just that a signal of 30kHz cannot be captured in a series of samples taken at 44.1kS/s, because it exceeds half the sample rate: it exceeds 22.05kHz.
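If you like to see the arithmetic, here is a minimal sketch in Python (an illustration only, not code from any product; the function name is ours) of the folding rule just described:

```python
def alias_frequency(f, fs):
    """Frequency (Hz) at which a tone of f Hz appears after being
    sampled at fs samples per second."""
    f = f % fs              # sampling cannot distinguish f from f + k*fs
    if f > fs / 2:          # above the Nyquist limit: the tone folds back
        f = fs - f
    return f

fs = 44_100
print(alias_frequency(30_000, fs))   # 14100: the 14.1kHz alias of 30kHz
print(alias_frequency(10_000, fs))   # 10000: in-band tones pass unchanged
```

Any tone above half the sample rate folds back below it; nothing in the sampled data distinguishes the alias from a real tone at that frequency.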
How can we cope with this? What if the music content has a cymbal sound with energy above 22.05kHz in it?
The answer assumed by the clever engineers at Philips and Sony, who first came up with the CD, was to argue as follows: since the human ear cannot hear above about 20kHz, let us make an analog filter that removes everything above 20kHz; then there can be no problem, because 20kHz is just below 22.05kHz, and no aliases will be created if there is no signal above 20kHz.
You may ask why they did not simply increase the sample rate to, say, 100kS/s, so that the first problem would not occur until 50kHz. The answer is that they could not, because that would have more than doubled the number of samples on the CD, and the CD had to play for at least 45 minutes so that it could capture one whole vinyl album. In other words, there were commercial considerations that dictated that the sample rate be as low as possible. Not good for us audiophiles, and it has taken us years to break this constraint: now we can finally get 24-bit music sampled at 192kHz without compromise.
But let us return to what Philips and Sony had to do in the 1970s to make CDs viable. The problem they faced is that any signal above 22.05kHz will alias (some engineers use the term “fold back”) into the audio band, and so there has to be a filter, an analog filter, that removes all the sounds above 22.05kHz (in fact they chose 20kHz) to prevent this problem.
This is not trivial: they are asking the analog designer to make a filter that lets through 20kHz but blocks all signals above 22.05kHz! Any analog designer would tell you this is far from easy: 22.05kHz is much too close to 20kHz. “Can you not give me a break and say let through 20kHz and block off, say, 50kHz?” the analog designer would ask, to which the company has to reply, “No. If you can’t do this, we can’t fit an album on a CD, and who would buy that?”
Fortunately, we may say to ourselves, this filter is in the recording studio, before the digital samples are taken, and so the studio can spend a lot of money and time on it. And indeed they do: this filter, called the anti-aliasing filter, sits prior to analog-to-digital conversion and has to be linear phase and very, very steep.
But our sigh of relief at this being the studio’s problem is short-lived. There is a problem lurking that is a little difficult to understand: when the studio’s filter has succeeded in blocking the signals that would alias into the audio band and sound awful, it has not removed the aliases that fall outside the audio band.
We need to say more about this, and an example may help. When 10kHz is sampled at a 44.1kS/s sample rate, it creates the 10kHz we hope and expect to be there, but it also creates a signal at 34.1kHz. We have to allow it to do that: we cannot block the 10kHz signal, which is right where we need to hear it in playback, but mathematically, and inescapably, this will put a 34.1kHz component into the digital samples. If we record 16kHz, a signal at (44.1 − 16)kHz, i.e. 28.1kHz, is created.
This is why that anti-aliasing filter at the studio was needed: 30kHz would have come out at (44.1 − 30)kHz, i.e. 14.1kHz, so we had to block that 30kHz from getting into the ADC.
But the signals we want, the 0 to 20kHz signals of the music, we cannot block, and we cannot stop them from creating these aliases in the 24.1 to 44.1kHz region. What are we going to do?
Another filter is needed: a filter that lets through 20kHz and stops 24.1kHz (24.1kHz is as low as we can get; since the studio has let through only 20kHz, the lowest alias is at 44.1 − 20 = 24.1kHz).
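To make those image frequencies concrete, here is a small Python sketch (illustrative only; the function name is ours) that lists the images created around multiples of the sample rate for any in-band tone:

```python
def first_images(f, fs, count=3):
    """Image (alias) frequencies created around the first `count`
    multiples of the sample rate fs when a tone of f Hz is sampled."""
    images = []
    for k in range(1, count + 1):
        images += [k * fs - f, k * fs + f]   # images sit at k*fs ± f
    return images

fs = 44_100
print(first_images(10_000, fs, count=1))   # [34100, 54100]
print(first_images(20_000, fs, count=1))   # [24100, 64100]
```

With the studio passband reaching 20kHz, the lowest possible image is at 44.1 − 20 = 24.1kHz, which is exactly where the reconstruction filter’s stopband must begin.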
You can see this is almost as tough as the filter at the studio. And this is where cost and commercial considerations again cause compromise. There are very few CD playback systems that have this analog filter to the degree that it is needed. It can be argued that Sony and Philips’ original assertion, that the ear cannot hear above 20kHz, means that the filter is not needed at all! And perhaps in the lowest-cost CD players we could get away with a very simple filter and it would sound OK.
This, then, was the state of the art in the early years of the CD. CD players were a fabulous piece of technology, sold on their indestructibility, and we all bought Dark Side of the Moon again (and our other favorites) on CD. But audiophiles started to say “they do not sound as good as vinyl” despite all the supporting technical arguments. And they were right: no-compromise, careful designs that did make a good analog reconstruction filter (the filter at playback is called the reconstruction filter) were worth the money, and were appreciated by the community.
But technology moves on, and one astonishing innovation, too much to go into in detail here, was the ascendancy of so-called ‘sigma delta’ DACs. These offered the low- to mid-range audio manufacturer the two things that cannot be ignored: they were “better” and cheaper at the same time. They much reduced the cost of manufacture because they could achieve something else: they greatly reduced the cost of the analog reconstruction filter that had separated the great designs from the merely good designs.
They could do this through a process called ‘oversampling’, and they could therefore move the tough filter design from the expensive, precise and expert-driven analog world into the cheap and easier-to-design digital domain. And what is more, the customers would love it: in a marketing triumph that defies the common sense of any technically literate person, ‘digital’ had been successfully associated with superior, and ‘analog’ with old and inferior.
In mathematical terms it works like this: the reconstruction filter is needed whenever a digitally sampled signal is rendered back into the analog world, that is, whenever it again becomes a continuous signal that we can amplify and listen to. The problem is very tough because of what Philips and Sony did: they wanted a commercial success and sampled at the low rate of 44.1kS/s, and so gave rise to this need for a very good analog filter. Had they chosen, say, 200kS/s as the standard, the analog filter would be simple: just a resistor and a capacitor, and you cannot get better or lower cost than that.
They could get away with this because, absent that good analog filter, it still sounds OK: they are correct when they say that the human ear primarily responds only up to about 20kHz. But the audiophiles knew better, and were not slow to listen to CDs and find ‘something not right’, largely fixable if you paid for the excellent analog filter (and fabulous low-jitter clock) that did a technically excellent job and justified the high-end, high-cost, remarkable CD players.
But as technology advanced to the sigma-delta modulator and low-cost digital signal processing became available, an opportunity arrived: what if, entirely in the digital domain, we could do what Philips and Sony should have done? What if we could move the signal from the 44.1kS/s domain into, say, an 11.29MS/s domain? Now two things are possible: the sigma-delta “trick” works much better at the higher rate, and the pesky analog reconstruction filter can be a simple resistor-capacitor circuit costing 10¢, and it is going to sound great!
However, there is no free lunch. It turns out that a filter is needed when data sampled at 44.1kS/s is taken into an 11.29MS/s domain. We should perhaps have expected it, since those tones at 44.1kHz minus the audio signal are still there. But tools to design digital filters are commonly available, and it is not difficult to design a digital filter that sits between the 44.1kS/s input and the much higher (256× higher) 11.29MS/s output and removes the aliases as needed.
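For the curious, the idea can be sketched in a few lines of Python. This is a toy 4× oversampler, not the 256× filter in any real DAC; the tap count and the Hann window are arbitrary choices for illustration:

```python
import math

def upsample(samples, L, taps=48):
    """Oversample by an integer factor L: insert L-1 zeros between
    samples, then low-pass with a windowed-sinc FIR to remove the
    images that zero-stuffing leaves around multiples of the old rate."""
    # 1) zero-stuff: raises the sample rate but leaves the images behind
    stuffed = []
    for s in samples:
        stuffed.append(s)
        stuffed.extend([0.0] * (L - 1))
    # 2) windowed-sinc low-pass with cutoff at the old Nyquist frequency
    h = []
    for n in range(taps):
        m = n - (taps - 1) / 2.0
        x = m / L
        sinc = math.sin(math.pi * x) / (math.pi * x) if x else 1.0
        hann = 0.5 - 0.5 * math.cos(2 * math.pi * n / (taps - 1))
        h.append(sinc * hann)
    g = L / sum(h)                     # gain L makes up for the zero-stuffing
    h = [c * g for c in h]
    # 3) convolve to produce the interpolated high-rate output
    out = []
    for i in range(len(stuffed)):
        acc = 0.0
        for j, c in enumerate(h):
            if 0 <= i - j < len(stuffed):
                acc += c * stuffed[i - j]
        out.append(acc)
    return out
```

Feeding it a constant signal returns (away from the edges) the same constant at four times the rate; the low-pass stage is precisely the “digital oversampling filter” the text describes.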
The manufacturer had to sell this idea, and knowing that rejection (that is, the degree of removal of the unwanted signals) was key, the first digital oversampling filters, as they are called, were linear phase and had outrageous rejections of 110dB or even higher. Problem solved. A low-cost, awesome digital oversampling filter drives a low-cost, awesome sigma-delta modulator and supports an ultra-cheap analog non-filter on the output; everyone is happy, and the money rolls in. And it did.
It took a little longer than the CD revolution, but after a while audiophiles again began to note ‘this just does not sound right’, and struggled to understand where, in this world of digital signal processing sold as nothing short of perfection, the error could be creeping in.
Many retreated to vinyl and traded the undoubtedly higher noise level of vinyl for the absence of these almost-hard-to-believe artifacts that they perceived in the new systems. Others suspected that it was the act of oversampling itself that was the problem: something there did not ‘sound right’, and they retreated to non-oversampling DACs operating at the lower sample rates, since at least the analog reconstruction filter could be understood, despite its difficulty of cost and construction. An expert could do it right.
Even the manufacturers did not understand the source of the audiophiles’ concern. No laboratory instrumentation could see a problem, and yet the higher-end manufacturers began to have an “audio meister” on staff who, after all the specifications were met, nevertheless had the last word: his “thumbs down” was enough to cancel a project. They were admitting that something about the audiophile ear was superior to the instrumentation and the design procedures.
The first thing that came to be understood was that it was the sigma-delta DAC itself that had artifacts. It was the absence of these artifacts in the ESS Sabre DACs that allowed ESS to promote the part as something new, and a review of the uses of the Sabre DAC in many high-end products seems to support this view. There is a lot on the web from ESS that you may care to read if you are interested in knowing more.
Resonessence uses Sabre DACs and uses techniques of digital oversampling filters. Until software release 2.1, Resonessence relied on the pre-installed digital oversampling filters in the Sabre DAC itself (the Sabre DAC has two built-in selectable oversampling filters), but as of release 2.1 we have taken control of the oversampling filters ourselves. We do this by re-programming the internal digital filter engine in the Sabre itself with our customized code.
This is because we are audiophiles too, and we recognized that our customers, and ourselves, would prefer certain filter characteristics that are not in the default configuration of the Sabre DAC chip. What can be the source of these preferences, since the pre-installed Sabre DAC filters are almost mathematical perfection?
Again, we just have to trust the listening process, and that shows us (and certain key customers who act as beta-testers for us) that there is a predictable preference, in the listening experience, for what is not a mathematically precise filter. Interestingly, not all listeners agree: some choose the mathematical perfection of the built-in Sabre filters, but we judge that the majority of our key customers choose a slightly different filter, and in response, software versions 2.1 and later provide a choice of seven filters. To explain why there are seven, we need to say more about digital filters and how they are perceived by the human ear.
Digital filters fall into two classes, called “Infinite Impulse Response” (IIR) and “Finite Impulse Response” (FIR) filters. In fact, if either of these filters is placed in a “black box”, it is impossible in practice to determine which kind it is: a property of discrete-time filters is that measuring the impulse response yields a set of equivalent FIR coefficients. So why is a distinction made between IIR and FIR filters, since both, in practice, could be FIR filters?
The answer lies in the design methods used for IIR and FIR filters. In the commonly available design tools it is much easier to replicate a known analog filter as an IIR filter, whereas a mathematically ‘perfect’ filter is much easier to design as an FIR filter.
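A tiny Python sketch (illustrative only) shows the black-box equivalence: sampling the impulse response of a one-pole IIR filter yields coefficients that an FIR filter could use to mimic it to whatever precision we retain.

```python
def iir_one_pole_impulse(a, n):
    """First n samples of the impulse response of the one-pole IIR
    filter y[k] = x[k] + a*y[k-1]. These samples are exactly the
    coefficients of an FIR filter that mimics it to this length."""
    y, out = 0.0, []
    for k in range(n):
        x = 1.0 if k == 0 else 0.0   # unit impulse input
        y = x + a * y
        out.append(y)
    return out

print(iir_one_pole_impulse(0.5, 6))  # [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125]
```

The IIR response decays geometrically and never quite reaches zero, which is why it is called “infinite”; but truncated at any practical precision it is indistinguishable from a (long) FIR filter.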
For example, if it is known that a certain analog filter ‘sounds good’, that may be because it has low dispersion, low group delay, or essentially no “pre-ringing”. We may therefore decide to replicate it as an IIR filter. Alternatively, we may conclude that group delay is not relevant and choose instead to optimize linearity of phase, which is easy in an FIR filter but hard in an IIR filter. To understand more, we need to define some of these terms and ask how they may be perceived by the ear.
Dispersion is the change in delay with frequency. Our ear has evolved in a world where it has never encountered a dispersive audio medium, and we may ask what the ear will make of a dispersive filter.
Perhaps, dispersion not being present in the natural world, we will be motivated to minimize it in our audio systems. But dispersion means only that different frequencies will arrive at our ear at different times; it is not a mechanism of distortion.
This different arrival time may seem undesirable at first consideration, but in practice, for those sound sources that tend to have limited frequency generation range, it may translate into a pleasant increase in depth of sound field.
For example, if the triangle and cymbals are co-located near the bass drums, the sounds of the triangle and cymbals, being largely high frequency, and of the drums, being low frequency, will arrive at different times through the dispersive filter.
This will be interpreted by the ear as a different distance to the instruments, and not as a distortion, and it may be desirable.
The piano, however, having a wide range of frequency outputs, will be experienced differently: a dispersive filter may cause the piano to sound nearer the listener, since at distances close to a piano the frequencies emerge from spatially distinct places (each end of the frame), which the dispersion will tend to replicate.
But, depending on the orientation of the piano and distance to the microphone, it may introduce an abstract ‘unrealistic’ sound.
Group delay relates to the average time that elapses between the application of a signal to the filter and the appearance of the output. It is the variation in group delay with frequency that causes dispersion. A certain mathematical identity relates group delay to phase shift in the filter: group delay is the negative of the derivative of phase with respect to frequency.
Therefore, in a non-dispersive filter the phase change with frequency must be linear (making its derivative constant). This is the origin of the often used phrase ‘linear phase filter’ because a linear phase filter is free from dispersion.
Any FIR filter can be made perfectly linear phase by arranging for its coefficients to be symmetric. Therefore, if a non-dispersive, hence linear phase, filter is desired, an FIR filter is a good choice.
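This is easy to check numerically. The Python sketch below (illustrative only) evaluates the frequency response of a small symmetric FIR filter and shows that its phase is a straight line of slope -(N-1)/2 samples, i.e. constant group delay at every frequency:

```python
import cmath

def freq_response(h, w):
    """H(e^jw) of an FIR filter with coefficients h, at radian frequency w."""
    return sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(h))

h = [1.0, 2.0, 3.0, 2.0, 1.0]      # symmetric coefficients -> linear phase
for w in (0.1, 0.2, 0.4):
    phase = cmath.phase(freq_response(h, w))
    print(round(phase / w, 6))      # -2.0 every time: group delay = (N-1)/2 = 2 samples
```

However steep or gentle the amplitude response, the symmetry alone guarantees the constant delay, which is why linear phase FIR design is so routine.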
But a symmetric FIR filter, while having linear phase, will exhibit certain other, perhaps unexpected, phenomena. The first is not problematic in most cases: the group delay, the delay of the signal through the filter, will be half of the total delay time of the filter.
And it turns out that the total delay time of the filter is much increased if the filter is asked to reject unwanted signals to a large degree. A filter that suppresses unwanted signals to, say, -110dB will of necessity have a group delay significantly longer than a filter that achieves only, say, -60dB rejection of unwanted signals.
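A standard rule of thumb, Kaiser’s estimate of the required FIR length, makes the trade-off visible. The sketch below is our own illustration (real filter designs differ in detail), applied to the CD case of a 20kHz passband and a 24.1kHz stopband:

```python
import math

def kaiser_fir_length(rejection_db, f_pass, f_stop, fs):
    """Kaiser's rule-of-thumb estimate of the FIR length needed for a
    given stopband rejection across a given transition band."""
    dw = 2 * math.pi * (f_stop - f_pass) / fs   # transition width, rad/sample
    return math.ceil((rejection_db - 7.95) / (2.285 * dw))

fs, f_pass, f_stop = 44_100, 20_000, 24_100
for a in (60, 110):
    n = kaiser_fir_length(a, f_pass, f_stop, fs)
    delay_ms = (n / 2) / fs * 1000              # linear-phase group delay
    print(a, "dB ->", n, "taps,", round(delay_ms, 2), "ms delay")
```

Going from 60dB to 110dB of rejection roughly doubles the filter length, and hence the group delay, which is exactly the trend described above.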
It may seem that group delay should be irrelevant: what does it matter if the delay is as much as 1ms on the playback of a digital music source? Indeed it does not matter, if all the music comes through the same channel. But in those systems where surround-sound channels are separated into different DACs, differences in group delay are very problematic: the sound stage is totally destroyed by large differences in group delay.
The second phenomenon present in linear phase FIR filters is sometimes called “pre-ringing”. It is a tendency of the filter to output a small signal of increasing amplitude just prior to the main “step” of the signal, and then, after that step has passed, to ring a little at the new output level. The ringing after the step has passed is common in analog filters as well, and is due to the high Q that some filters exhibit.
Indeed this is the origin of the word “ringing”: having struck a bell with a hammer we expect it to ring a little thereafter. Pre-ringing of the symmetric FIR filter seems bizarre: it is as though the bell knows when you are going to strike it, and makes a little ringing sound before it is hit. This is very counter intuitive, and a great concern to many audiophiles, who therefore seek a filter that has no pre-ringing.
This, however, cannot be achieved to perfection with linear phase filters: any filter designed to suppress pre-ringing completely is dispersive. So-called “Minimum Phase” filters (sometimes called “Minimum Delay” filters) are those filters designed to show virtually no pre-ringing (and consequently they tend to have low group delay, because the maximum values of the impulse response are near the beginning of the filter). They do not have linear phase, and they consequently have dispersion.
A compromise is possible: a filter can be designed as linear phase, and hence having a symmetrical impulse response, but the coefficient list can be ‘shaped’ by what is called a ‘window’ function. This window function suppresses pre-ringing to a certain degree, but the more it suppresses pre-ringing, the more it compromises the action of the filter.
That is, the more the filter fails to block the signals it is trying to stop. Such filters that apply window functions to filter coefficients in order to reduce pre-ringing are sometimes called Apodizing filters. The word means “to remove the foot”.
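The effect of apodizing can be seen in a small Python experiment (illustrative only; a real oversampling filter is longer and more carefully designed, and the Hann window is just one possible choice). We build a truncated-sinc low-pass with and without the window and compare the ringing well before the step:

```python
import math

def lowpass_taps(n_taps, cutoff, apodize=False):
    """Truncated-sinc low-pass FIR (cutoff as a fraction of the sample
    rate); apodize=True shapes the coefficient list with a Hann window."""
    h = []
    for n in range(n_taps):
        m = n - (n_taps - 1) / 2.0
        x = 2.0 * cutoff * m
        c = 2.0 * cutoff * (math.sin(math.pi * x) / (math.pi * x) if x else 1.0)
        if apodize:
            c *= 0.5 - 0.5 * math.cos(2 * math.pi * n / (n_taps - 1))
        h.append(c)
    s = sum(h)
    return [c / s for c in h]           # normalize for unity gain at DC

def step_response(h):
    out, acc = [], 0.0                  # running sum = response to a unit step
    for c in h:
        acc += c
        out.append(acc)
    return out

for apodized in (False, True):
    taps = lowpass_taps(63, 0.25, apodize=apodized)
    early = max(abs(v) for v in step_response(taps)[:8])  # ringing well before the step
    print("apodized" if apodized else "plain", round(early, 4))
```

The apodized version shows an early pre-ring many times smaller than the plain truncated sinc, but its transition band is correspondingly wider: the filter blocks the unwanted signals less well, exactly the compromise described above.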
All the various trade-offs discussed here, but particularly Apodizing filters, show us a fundamental relationship: as a filter is designed to be optimum in the time domain, it cannot be optimum in the frequency domain.
The two are related in a very fundamental way; it is the same relationship that governs the whole world of physics. Heisenberg’s Uncertainty Principle, which says that if you know a particle’s position exactly you cannot know its momentum, or vice versa, arises because mathematically one (the position, for example) is the Fourier transform of the other (the momentum), and as the “width” in position becomes better defined, the “width” in the momentum domain, being its Fourier transform, becomes less well defined.
In our perhaps less profound world of audio, the behavior in frequency is the Fourier transform of the behavior in time, and because of this, one gets worse as one gets better.
But there is some good news: all this discussion is predicated on the use of a much too slow sampling rate. We must deal with that, since a great deal of our digital music is encoded at 44.1kHz (or 48kHz on DVD), but in the coming years, as sample rates extend to 192kHz and beyond, this problem becomes much less severe. We will be able to essentially remove pre-ringing entirely, and provide alias suppression and absence of dispersion at the same time. The Products already take advantage of this and simplify the filtering process if you provide a source at 88.2kHz or higher.
Perhaps now it is clear why we provide a total of seven filters. They are as follows. All of these filters are available in the Products; two of them (IIR and Apodizing) are available in Contero.