In a very clever, thick book for audio engineers I read about the theoretical approach to "ideal" digital sound.
It says I need to convolve my audio signal with a Sinc function to remove all frequencies beyond the Nyquist frequency, so that they do not alias back into the audible band when the signal is sampled. This is done by a special analog linear filter. Then I need to take discrete samples at regular intervals, at a rate of twice the Nyquist frequency. Then I convert these sample levels to digital numbers for storage.
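To see why everything above the Nyquist frequency has to go before sampling, here is a minimal sketch in Python. The tone frequencies are arbitrary examples chosen for illustration, not values from the book. A 30000 Hz cosine sampled at 44100 Hz gives the same samples as a 14100 Hz cosine, so once sampled the two can no longer be told apart.

```python
import math

FS = 44100  # sample rate in Hz; the Nyquist frequency is FS / 2 = 22050 Hz

def sample_cosine(freq_hz, count, fs=FS):
    """Return `count` samples of a unit cosine at freq_hz, taken at rate fs."""
    return [math.cos(2.0 * math.pi * freq_hz * n / fs) for n in range(count)]

# Illustrative frequencies (assumptions, not from the book).
above = sample_cosine(30000, 64)       # above Nyquist, no filter applied
alias = sample_cosine(FS - 30000, 64)  # 14100 Hz, inside the audio band

# Largest difference between the two sample sequences (rounding error only).
max_diff = max(abs(a - b) for a, b in zip(above, alias))
print(max_diff)
```

The printed difference is at the level of floating-point rounding, which is why the filter has to act on the analog signal: after sampling there is nothing left to separate the two tones.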
Now I have the theoretically ideal digital sound and I want to play it back. Again, I convert these numbers to a series of amplitude levels. Then I generate a regular series of thin, short impulses with these levels. Finally, I again need a filter to convolve this series with a Sinc function to get a continuous analog signal. Finally!
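The playback step can be written down directly as a sum of shifted Sinc functions, one per sample. This is only a sketch of the textbook formula, with time measured in sample periods and a made-up five-sample signal; `reconstruct` and the sample values are my own illustrative names and numbers.

```python
import math

def sinc(t):
    """Normalized sinc: sin(pi t) / (pi t), with sinc(0) = 1."""
    if t == 0.0:
        return 1.0
    return math.sin(math.pi * t) / (math.pi * t)

def reconstruct(samples, t):
    """Value of the reconstructed signal at time t (in sample periods).

    Each stored sample contributes one sinc centred on its own instant.
    """
    return sum(x * sinc(t - n) for n, x in enumerate(samples))

samples = [0.0, 0.5, 1.0, 0.5, 0.0]   # hypothetical stored values
at_sample = reconstruct(samples, 2)   # lands exactly on sample 2
between = reconstruct(samples, 2.5)   # halfway between samples 2 and 3
print(at_sample, between)
```

At a sample instant every other Sinc passes through zero, so the stored value comes back unchanged. Between instants every sample contributes, which is where the long tails enter.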
Okay, but how close is my signal to the original? Well, taking the Fourier transform of the filter (the Sinc function), I see that it has a flat amplitude response up to the Nyquist frequency. Theory says that is good. Also, the phase response is linear, so there is no dispersion. Theory says that is good. I lose all frequencies beyond the Nyquist frequency. That is the price we pay for digital sound. The only problem is exactly at the Nyquist frequency. But it is a single frequency, so why worry? We must have got an ideal sample!
Yes and no.
Imagine we catch a stray high-energy electron from the Large Hadron Collider while sampling our signal. This gives us a delta-function-like interference impulse. It is very short, but powerful enough to put a click in the recording. Actually, a click is what we would expect in the analog case too. The filter converts this click to a Sinc function. If the impulse is not exactly in phase with the sampling instants, we get a very long response in the digitized samples. But suppose we are lucky and our record contains a loud click that occupies just a single sample. During playback, this click turns exactly into a Sinc function!
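The difference between the lucky and the unlucky case can be checked numerically. The sketch below samples a band-limited click (a Sinc) once with its peak exactly on a sample instant and once with the peak halfway between two instants. The offsets and the span of 8 samples on each side are arbitrary choices for illustration.

```python
import math

def sinc(t):
    """Normalized sinc: sin(pi t) / (pi t), with sinc(0) = 1."""
    if t == 0.0:
        return 1.0
    return math.sin(math.pi * t) / (math.pi * t)

def sampled_click(offset, half_span):
    """Samples of a filtered click centred at `offset` (in sample periods),
    taken at integer instants n = -half_span .. half_span."""
    return [sinc(n - offset) for n in range(-half_span, half_span + 1)]

in_phase = sampled_click(0.0, 8)    # click falls exactly on a sample instant
off_phase = sampled_click(0.5, 8)   # click falls halfway between two samples

# Columns: sample index, in-phase value, off-phase value.
for n, a, b in zip(range(-8, 9), in_phase, off_phase):
    print(f"{n:+3d}  {a:+.4f}  {b:+.4f}")
```

In the in-phase column only one sample is non-zero (up to rounding). In the off-phase column every sample is non-zero, alternating in sign, with magnitude 1/(pi * |n - 0.5|), which is the slow 1/t decay.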
What are the properties of a sound in the form of a Sinc function? First of all, it is not as instantaneous as the original click. The Sinc function has long hyperbolic tails, which yield a prolonged oscillating precursor, the smoothed impulse itself, and a prolonged oscillating tail. Okay, I may rely on the psychoacoustic properties of my ears to mask the tail, but not the precursor! This precursor certainly should distort my sense of the sound! A simple approximation says that 99% of the energy of a Sinc impulse is contained in the central 21 samples, which at 44100 Hz last 1/2100 of a second, and that may be audible. And 1% of the energy is still left! Similarly, 99.9% of the energy occupies 200 samples.
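These energy figures are easy to check by integrating the squared Sinc numerically. The sketch below uses a plain midpoint rule; the function name and the step count are my own choices. It relies on the fact that the squared normalized Sinc integrates to 1 over the whole time axis, so the integral over a window is directly the energy fraction.

```python
import math

def sinc(t):
    """Normalized sinc: sin(pi t) / (pi t), with sinc(0) = 1."""
    if t == 0.0:
        return 1.0
    return math.sin(math.pi * t) / (math.pi * t)

def energy_fraction(half_width, steps_per_sample=200):
    """Fraction of the total sinc energy inside |t| <= half_width.

    t is in sample periods. Midpoint rule on [0, half_width],
    doubled because sinc^2 is symmetric about t = 0.
    """
    n = int(round(half_width * steps_per_sample))
    dt = half_width / n
    half = sum(sinc((i + 0.5) * dt) ** 2 for i in range(n))
    return 2.0 * half * dt

central_21 = energy_fraction(10.5)    # 21 samples centred on the peak
central_200 = energy_fraction(100.0)  # 200 samples centred on the peak
print(central_21, central_200)
```

The leftover energy outside a half-width T comes out close to 1/(pi^2 * T), which gives about 1% for T = 10.5 and about 0.1% for T = 100, in line with the figures above.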
We can still rely on the fact that the Sinc oscillation frequency of 22050 Hz is not audible, so we should not hear Sinc artefacts. However, when played back, these oscillations can contribute to non-linear distortion in my stereo system and become audible, spoiling my listening.
Okay. Everything above is about ideal digital sound. Real digital sound is not ideal.
First of all, the Sinc impulse response is too long for Sinc resampling to be used as is during playback. To make Sinc resampling 99.9% accurate, we need to delay the sound by at least 100 samples (1/441 of a second). Better precision means more delay.
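The delay figure follows from the filter length alone. As a sketch, assume the truncated Sinc is implemented as a causal, symmetric FIR filter (an assumption of mine, the simplest case): such a filter with N taps delays the signal by (N - 1) / 2 samples.

```python
FS = 44100  # sample rate in Hz

def fir_latency_seconds(taps, fs=FS):
    """Delay of a causal, symmetric FIR filter with `taps` coefficients."""
    return (taps - 1) / 2 / fs

short_delay = fir_latency_seconds(21)    # kernel holding about 99% of the energy
long_delay = fir_latency_seconds(201)    # kernel holding about 99.9% of the energy
print(short_delay * 1000.0, long_delay * 1000.0)  # delays in milliseconds
```

The 201-tap case gives 100 samples, that is 1/441 of a second, or a little over 2 ms.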
Another thing is the DAC itself. Most DACs now are 1-bit (or few-bit) sigma-delta noise-shaping modulators. Audiophiles say that sigma-delta sound is terrible and that old-school multibit DACs sound much better. Those are almost impossible to get now. But engineers and their measurements say that sigma-delta DACs give almost ideal sound, with an SNR of 120 dB. Are they lying? Why would sigma-delta sound so bad? Yes, these DACs are sensitive to jitter and power supply noise. Is that the reason? Or are the manufacturers not telling us something else important about them? That is very, very strange. Could the reason be the way DACs resample the sound? Could it be that 64x resampling in sigma-delta DACs produces too neat a Sinc function, while old-school DACs are too imperfect for that and instead have some smooth but narrow impulse response on each sample? What is the truth here?
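For readers who have not met the idea, here is a minimal sketch of a first-order 1-bit sigma-delta modulator. It is the textbook toy version, not the design of any real DAC chip (real converters use higher-order loops and oversampling filters), and the function name and test input are my own. It shows the basic principle: the output is only ever +1 or -1, yet its running average follows the input.

```python
def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator (toy model).

    Input values are expected in the range -1..1.
    Returns a list of +1.0 / -1.0 output levels.
    """
    integrator = 0.0
    feedback = 0.0   # previous output level, fed back into the loop
    bits = []
    for x in samples:
        integrator += x - feedback          # accumulate the error so far
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(feedback)
    return bits

# Hypothetical constant input of 0.3, heavily "oversampled".
bits = sigma_delta_1bit([0.3] * 10000)
average = sum(bits) / len(bits)
print(average)
```

The average of the 1-bit stream settles near the input level of 0.3. The quantization error is still there, but the feedback loop pushes it toward high frequencies, where a later low-pass filter is supposed to remove it.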
And this is not all. Some music is composed synthetically now. But most sounds are digitized first. The fact is, most audio ADCs now are sigma-delta as well! So even if we get rid of the sigma-delta DAC, we can't get rid of the sigma-delta ADC the music was recorded with! And if we believe that sigma-delta has weird artefacts, we are doomed to have distorted sound!
Sigma-delta ADCs downsample the sound too. Whenever they use a Sinc function, the digitized series may contain prolonged oscillations caused by the Sinc function, in the same way as during playback. If an impulse is not in phase with the digitizer, we get a sequence of numbers decaying as 1/t, which will still be turned into Sinc-like oscillations during playback, even if the DAC does not use a Sinc function.
One more thing worth noting. No ADC is ideal. No DAC is ideal. They work in different ways. Theoretically, for each ADC we could create a corresponding DAC that would recover the analog signal in the best way, and vice versa. But in practice these imperfections are likely to compound, bringing even more distortion to the original sound.
Now, what is the solution to these fundamental problems?