The 24-Bit Delusion

UPDATED: 10.3.17

INCLUDING INFORMATION ON MQA


Introduction:

In the past decade, more and more music has become available in “high-definition” (HD) digital formats, such as 24-bit 192KHz downloads, 24-bit 192KHz MQA streaming, and DSD. Now I hear talk about developing a new 32-bit 384KHz standard for HD music. Interestingly enough, not everyone agrees that greater bit depth and higher sampling rates are good things.

This blog will explain the math and physics of digital recording and musical reproduction in layman's terms so that you can decide for yourself if this is progress or simply marketing madness.

If you're not sure if you should believe the statements in this blog that contradict much of the marketing hype, myth, and legend in the audiophile industry, feel free to check the references at the end of this blog that were written by recording engineers, such as Dan Lavry, and companies that manufacture electronics used in recording studios, such as Antelope Audio.

If you don’t want to wade through a lot of technical data, you may want to skip to the summary, where I hit all the major points. You also may want to refer to my other blog on “DSD vs. PCM: Myth vs. Truth.”


Bits, Bytes, and Digital Words:

So why did 24-bit become the new standard?

When digital data is transferred and manipulated, it’s moved in bytes rather than as individual bits. There are 8 bits to a byte, and each audio sample is stored as a whole number of bytes, known as a digital word. This is why bit depths in the digital world are divisible by 8. So 16 bits = 2 bytes and 24 bits = 3 bytes, and both 16 bits and 24 bits became standard because each represented the next word size up.

Historical note: The 16-bit format existed long before 16-bit digital-to-analog converters (DAC) were commercially available. The same is true of the 24-bit format.


Sampling Rate and Bit Depth:

The process of converting analog sound waves into digital numbers is known as “quantization,” which is often represented as points plotted on an XY axis. The horizontal X axis represents time or sampling frequency and the vertical Y axis represents amplitude or bit-depth. In the graphic below the white wave form represents the musical signal being quantized and the green step pattern overlaid on the white wave form represents the quantized values.

PCM Quantizing

Sampling rate is the frequency at which the amplitude of the analog sound wave is sampled. The 44.1KHz sampling frequency specified for Red Book CDs samples the amplitude of the music 44,100 times each second. The 96KHz sampling frequency used in the 7.1 channel audio embedded into DVDs and Blu-Rays samples the amplitude 96,000 times each second. And the 192KHz sampling frequency used in HD music files and MQA samples the amplitude 192,000 times each second.

Bit depth translates to the number of steps the amplitude of the analog sound wave is divided into at each sampling. A 16-bit recording has 65,536 steps, a 20-bit recording has 1,048,576 steps, and a 24-bit recording has 16,777,216 steps. Yes, you read that correctly: a 24-bit recording has 256 times the number of potential amplitude steps as a 16-bit recording. 256 times?!?!? Doesn't that seem like a rather excessive amount?

The more bits and/or the higher the sampling rate used in quantization, the higher the theoretical resolution. So a 16-bit 44.1KHz Red Book CD has 2,890,137,600 potential sampling points each second (44,100 x 65,536). And a 20-bit 96KHz recording has 100,663,296,000 potential sampling points each second (96,000 x 1,048,576). This means a 20-bit 96KHz recording has roughly 35 times the resolution of a 16-bit 44.1KHz recording, and a 24-bit 192KHz recording has roughly 1,100 times the resolution of a 16-bit 44.1KHz recording. No small difference.
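If you want to check the arithmetic yourself, here is a minimal Python sketch. The function names are made up for illustration, and "potential sampling points" is simply the sampling rate multiplied by the number of amplitude steps, as defined above.

```python
def amplitude_steps(bits: int) -> int:
    """Number of amplitude steps available at a given bit depth."""
    return 2 ** bits


def potential_points_per_second(sample_rate_hz: int, bits: int) -> int:
    """Sampling rate multiplied by amplitude steps (the theoretical resolution)."""
    return sample_rate_hz * amplitude_steps(bits)


# Compare each format against a 16-bit 44.1KHz Red Book CD.
cd = potential_points_per_second(44_100, 16)
for rate, bits in [(44_100, 16), (96_000, 20), (192_000, 24)]:
    points = potential_points_per_second(rate, bits)
    print(f"{bits}-bit {rate} Hz: {amplitude_steps(bits):,} steps, "
          f"{points:,} points/s, {points / cd:.1f}x CD")
```

Keep in mind this only measures theoretical resolution, not what you can actually hear.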

So why is it that HD recordings sound only slightly better than 16-bit 44.1KHz recordings? Later in this blog I’ll explain the difference between theoretical and actual resolution.


Dynamic Range and Bit Depth:

Dynamic range is the difference in volume between the quietest and the loudest passage, commonly measured in decibels (dB).

  • 16-bit Red Book CDs have a dynamic range of over 96dB.
  • 20-bit digital master tape has a dynamic range of over 120dB.
  • 24-bit modern HD formats have a dynamic range of over 144dB.

But wait…isn’t the background noise in a quiet room 30dB?

So you can’t actually hear the difference between the dynamic range of a 16-bit recording and a 20-bit recording unless you turn the volume up high enough above the 30dB background noise that it would cause permanent hearing loss. And actually listening to the dynamic range of a 24-bit recording (144dB) would cause permanent hearing loss.
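The dynamic range figures above follow from the bit depth: each extra bit doubles the number of amplitude steps, which adds about 6dB. The sketch below, with made-up function names, works out the theoretical dynamic range for each bit depth and how loud the peaks would have to be for the quietest detail to sit right at a 30dB room noise floor. It ignores dither and noise shaping, so treat it as a rough illustration rather than a measurement.

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range: ratio of full scale to one step, in dB."""
    return 20 * math.log10(2 ** bits)


def peak_level_db(bits: int, room_noise_db: float = 30.0) -> float:
    """Peak level needed so the quietest step just clears the room noise."""
    return room_noise_db + dynamic_range_db(bits)


for bits in (16, 20, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB range, "
          f"peaks at {peak_level_db(bits):.1f} dB in a 30 dB room")
```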

So why on Earth would they even create a digital music recording format that can't even be listened to?!?!?!?!?

Later in this blog I'll explain how insanely high bit depths and sampling rates far above the range of human hearing are used during the editing, mixing, and mastering process to lower noise in the significantly lower resolution commercially released recordings we listen to.

Just for reference, here are some examples of dynamic range that most of us can relate to:

  • The sound of a mosquito flying 3 meters away is 0dB.
  • The hum of an incandescent bulb at 1 meter away is 10dB.
  • The background noise in a quiet recording studio is 20dB.
  • The background noise in a normal quiet room is about 30dB.
  • Early analog master tape had a dynamic range of only 60dB.
  • LP micro-groove records have a dynamic range of 65dB.
  • Dolby increased analog master tape dynamic range to 90dB.
  • The sound of a jackhammer at 1 meter away is 110dB.
  • The sound of a full orchestra at 1 meter away is 120dB.
  • Over 130dB causes irreparable hearing loss.
  • The sound of a jet aircraft at takeoff is 140dB.

Of course dynamic range and quantization noise are not the only factors: the noise floor in the power supply of your DAC will determine the Least Significant Bits (LSB) that can be resolved at the output of your DAC despite whatever bit-depth the manufacturer may claim it decodes.

Sorry to burst anyone's bubble and contradict marketing hype, myth, and legend in the audiophile industry, but just because a DAC is capable of decoding 24-bits doesn't mean it is capable of actually resolving that bit-depth in the output stage.

And don't even get me started on DACs with tube output stages: the lowest noise floor on a tube output stage is about 90dB which means despite whatever the manufacturers may claim they can't even resolve a 16-bit recording.


Noise Floor:

Dynamic range expresses the loudest possible sound, and noise floor expresses the quietest. If you want to hear the LSB on a recording, the volume (or voltage) of that bit has to be above the noise floor of both the room and the equipment in your system.

We already know that a quiet room has a background noise level of about 30dB that we need to rise above. Even after the equipment is playing above the 30dB room noise, the power supply of the electronics will mask the LSB if the peak-to-peak voltage of the noise in the power supply is not less than the voltage of the LSB.

Based on a 2.5V output on a DAC (higher than average), below are the voltages power supply noise must be below in order to hear the LSB:

  • 16-bit LSB noise floor voltage = 76uV
  • 18-bit LSB noise floor voltage = 19uV
  • 20-bit LSB noise floor voltage = 4.75uV
  • 24-bit LSB noise floor voltage = 0.3uV

For reference, a common LM317 regulator, the quality used in most commercial electronics, has about 150uV peak-to-peak noise, and the world’s lowest noise power supplies (we’re talking NASA, not audiophile) have about 5uV of peak-to-peak noise.
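The LSB voltages above can be reproduced with a few lines of Python. This sketch assumes the 2.5V output swings both positive and negative, for 5V peak to peak in total, since that is the reading that matches the numbers in the list. The function and constant names are made up for illustration.

```python
FULL_SCALE_PEAK_TO_PEAK_V = 5.0  # assumption: a 2.5V output read as +/-2.5V


def lsb_voltage(bits: int, full_scale_v: float = FULL_SCALE_PEAK_TO_PEAK_V) -> float:
    """Voltage of one least significant bit at a given bit depth."""
    return full_scale_v / (2 ** bits)


# Peak-to-peak supply noise figures quoted in the text, in volts.
SUPPLY_NOISE_V = {"LM317 regulator": 150e-6, "lowest-noise supply": 5e-6}

for bits in (16, 18, 20, 24):
    lsb = lsb_voltage(bits)
    # A supply masks the LSB when its noise is not below the LSB voltage.
    masked_by = [name for name, noise in SUPPLY_NOISE_V.items() if noise >= lsb]
    print(f"{bits}-bit LSB = {lsb * 1e6:.2f} uV, masked by: {masked_by or 'neither'}")
```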

According to the experts that manufacture the finest DAC chips, resistors, and power regulators, there is theoretically no way to make electronics that are capable of discerning greater than a 20-bit resolution (120dB dynamic range). Any company that claims greater than 20-bit resolution from their DAC is simply full of shit. Oh they can decode 24-bits, because 24-bits does exist in software, but the output from their DAC has less than 20-bits of resolution and dynamic range.


Theoretical vs. Actual Resolution:

According to mathematical theory, sampling at more than twice the maximum audible frequency only plots more points along the same curves when the digital signal is converted back into an analog waveform. So in order to correctly sample a 20KHz note, the maximum frequency human ears can hear, you would need to sample at greater than 40KHz. The 44.1KHz sampling rate of a Red Book CD was engineered to allow a 20KHz sound to be recorded accurately.
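To see why the sampling rate has to be more than twice the highest frequency you want to capture, here is a small sketch of what happens to a tone above that limit: it folds back, or "aliases," to a lower frequency. The function names are made up for illustration, and the example assumes an ideal sampler with no filtering in front of it.

```python
def nyquist_limit_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sampling rate can represent."""
    return sample_rate_hz / 2


def apparent_frequency_hz(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency a sampled tone appears at after folding around the sampling rate."""
    nearest_multiple = round(tone_hz / sample_rate_hz) * sample_rate_hz
    return abs(tone_hz - nearest_multiple)


print(f"Limit at 44,100 Hz sampling: {nyquist_limit_hz(44_100):.0f} Hz")
for tone in (10_000, 20_000, 25_000):
    print(f"{tone} Hz tone sampled at 44,100 Hz appears at "
          f"{apparent_frequency_hz(tone, 44_100):.0f} Hz")
```

Tones below the limit come back at their own frequency, while a tone above it lands somewhere else in the audible range.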

So why would there be any need for higher sampling frequencies than 44.1KHz if those mathematical theories are correct?

One reason is quantization noise. Since quantization noise is present around the sampling frequency of a PCM recording, a 44.1KHz recording has quantization noise one octave above the human hearing limit of 20KHz. This quantization noise needs to be filtered out, so all DACs have a low-pass filter at the output. Because the quantization noise is only one octave above audibility, the filters must have a very steep slope so as not to filter out the desirable high frequencies. These steeply sloped low-pass digital filters are commonly known as "brick wall" filters.
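How much room does the filter have to work with? Following the framing above, where the noise sits around the sampling frequency, this sketch measures the gap between the 20KHz hearing limit and the sampling frequency in octaves. The function and constant names are made up for illustration.

```python
import math

HEARING_LIMIT_HZ = 20_000  # upper limit of human hearing used in this blog


def octaves_above_hearing(sample_rate_hz: float) -> float:
    """Octaves between the 20 kHz hearing limit and the sampling frequency."""
    return math.log2(sample_rate_hz / HEARING_LIMIT_HZ)


for rate in (44_100, 96_000, 192_000):
    print(f"{rate} Hz: {octaves_above_hearing(rate):.2f} octaves of room for the filter")
```

The more octaves of room there are, the gentler the filter slope can be.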

Though you hear a lot about "brick wall" filters in the top end of early Red Book CD players causing an audible distortion, the fact is that was only a small part of the reason early Red Book CDs had an unnatural sounding top end. Most of the hard, harsh, unnatural sounding high frequencies in early digital recordings had more to do with flaws in the power supplies and flaws in the recording process, not "brick wall" filters.

Sorry to be the one to burst your bubble, but despite what many audiophiles may believe, less than one person in a thousand can hear anything above 20KHz as a child, and there is almost no one over the age of 40 that can hear much above 15KHz.

Of course if a higher sampling frequency and greater bit depth are used for the recording, mixing and mastering, there is much lower quantization noise when the recording is output to the significantly lower resolution commercial formats we listen to.

This is why professional formats, such as 24-bit 352.8KHz DXD, were originally developed for recording studios. One of the reasons 24-bit DAC chips were developed was so that the recording engineers could monitor their editing, mixing, and mastering in real-time without having to down-sample. There is in fact no rational need to use 24-bits of dynamic range in commercially marketed recordings, and no company has ever released a commercial recording with anything even close to 24-bits of dynamic range (144dB).

Even though many recordings are advertised as being 24-bit, all 24 bits of dynamic range were only used in the recording studio to reduce quantization noise. The consumer versions of most so-called 24-bit recordings are mastered with a dynamic range equal to or less than that of a 16-bit recording (<96dB). Refer back to the above section on dynamic range. If they were to sell commercial recordings with more than 96dB of dynamic range, you wouldn't be able to hear any of the low-level details above your room's 30dB noise floor, and the peaks would clip most amplifiers at a very low volume. There are more details on how dynamic range affects electronics in the following section on "Playback Equipment Requirements."

So what do they do with commercially marketed so-called 24-bit recordings? They simply fill some of the Most Significant Bits (MSB) with 1s and some of the Least Significant Bits (LSB) with 0s to pad the overall volume up to the target level. They could have released a recording of identical performance in 16-bits, but naive consumers insist on 24-bits, so the record companies trick them by centering 16-bits of dynamic range in a 24-bit frame. How silly.

DSD is no different. Any type of digital recording produces quantization noise that requires a low-pass filter at the output of the converter. But instead of having quantization noise centered around the sampling frequency like PCM, DSD64 SACD has significant amounts of noise just above 25KHz, as is shown in the graphic below.

To get around this problem single-bit native DSD DACs have noise-shaping algorithms and upsample to ridiculously high frequencies to move the quantization noise far outside of the human hearing limit to allow quantization noise to be filtered out with a minimum of distortion in the audible range. This is one of the reasons why player software that upsamples PCM or DSD64 SACD to Double-Rate DSD or Quad-Rate DSD makes such an improvement in single-bit DAC performance. For more detailed information on this topic refer to our blog on “DSD vs. PCM: Myth vs. Truth.”

Another consideration of higher sampling rates and greater bit depth is system resources. Both require more storage space, more RAM, and faster processors. Though the optimal sampling frequency and bit-depth that are required to reproduce accurate music are a matter of heated debate, there is no doubt that excessive resolution unnecessarily uses up system resources and unnecessarily increases the size and cost of components.
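To put numbers on the storage cost, uncompressed PCM takes the sampling rate times the bytes per sample times the number of channels. Here is a rough sketch, assuming plain stereo PCM with no compression or file headers. The function name is made up for illustration.

```python
def pcm_bytes_per_second(sample_rate_hz: int, bits: int, channels: int = 2) -> int:
    """Raw data rate of uncompressed PCM audio.

    Assumes bit depths that are whole bytes (16, 24, 32).
    """
    return sample_rate_hz * (bits // 8) * channels


cd = pcm_bytes_per_second(44_100, 16)
for rate, bits in [(44_100, 16), (96_000, 24), (192_000, 24)]:
    data_rate = pcm_bytes_per_second(rate, bits)
    megabytes_per_minute = data_rate * 60 / 1_000_000
    print(f"{bits}-bit {rate} Hz stereo: {megabytes_per_minute:.1f} MB per minute, "
          f"{data_rate / cd:.1f}x CD")
```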


Playback Equipment Requirements:

There are very few systems, even among the best-of-the-best, that can accurately play back the full 120dB dynamic range of a 20-bit recording. This is why few recordings are even released at the 96dB dynamic range of a 16-bit recording, let alone the 144dB dynamic range of a 24-bit recording. Keep in mind that the maximum dynamic range of LP records is only about 60dB. Even Dolby analog master tapes had a maximum of about 90dB dynamic range.

So that 120dB live music can be played on most high-end audiophile systems, recording studios need to limit the dynamic range using a process called “dynamic compression.” The process of dynamic compression makes the quieter passages relatively louder and the louder passages relatively more quiet. This makes it easier to discern low-level details from the louder passages. Dynamic compression is part of what gives recorded music the illusion of having more detail and focus than live music.
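As a rough sketch of the idea, here is a simple static compressor working on levels in dB. Everything above a threshold is scaled down by a ratio, then the whole signal is raised by a make-up gain so the quiet passages end up louder. The threshold, ratio, and make-up gain values are made up for illustration, and a real studio compressor also has attack and release timing that this ignores.

```python
def compress_db(level_db: float, threshold_db: float = 80.0,
                ratio: float = 3.0, makeup_gain_db: float = 10.0) -> float:
    """Map an input level in dB to an output level in dB."""
    if level_db > threshold_db:
        # Above the threshold, output rises only 1 dB for every `ratio` dB of input.
        level_db = threshold_db + (level_db - threshold_db) / ratio
    # Make-up gain lifts everything, which brings up the quiet passages.
    return level_db + makeup_gain_db


quiet, loud = 40.0, 120.0
print(f"Input range:  {loud - quiet:.0f} dB")
print(f"Output range: {compress_db(loud) - compress_db(quiet):.1f} dB")
```

With these example settings the quiet passage comes out louder, the loud passage comes out quieter, and the overall range shrinks.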

There was wisdom to the LP record and the analog tape standards. The relationship between amplifier wattage and decibels (dB) of volume is logarithmic, not linear. Manufacturers knew that for every 3dB consumers raised their volume, they would have to double the wattage of their amplifier and double the output of the speakers. So keeping the dynamic range of home audio under 60dB is much of what allowed home entertainment equipment to be affordable, of modest size, and relatively high-fidelity. The fact is that most commercially distributed HD recordings have a dynamic range that is intentionally less than half the 144dB dynamic range of a 24-bit recording and significantly less than the 96dB dynamic range of even a 16-bit recording.

Think about it: a 60dB dynamic range on top of a 30dB background noise equals 90dB. How much louder than 90dB do you want to listen to music in your home? More importantly, for every additional 3dB you increase dynamic range, you would need to double the wattage of your amplifier and double the output of your speakers.

All things being equal, to go from 90dB output up to 99dB, you would need an amplifier with 8 times the wattage and speakers with 8 times the output. To accurately reproduce a recording at 120dB, you would need an amplifier with 1,000 times the wattage and speakers with 1,000 times the output that it would take to reproduce the same recording at 90dB. I don’t know about you, but a system like that will fit neither my room nor my budget.
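The wattage figures come from the logarithmic relationship between decibels and power: the power ratio is 10 raised to the dB difference divided by 10. Here is a minimal sketch, assuming all else is equal, including speaker efficiency. The function name is made up for illustration.

```python
def power_ratio(db_increase: float) -> float:
    """How many times the amplifier power a given dB increase requires."""
    return 10 ** (db_increase / 10)


for start_db, end_db in [(90, 93), (90, 99), (90, 120)]:
    ratio = power_ratio(end_db - start_db)
    print(f"{start_db} dB to {end_db} dB: about {ratio:,.0f}x the wattage")
```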


Summary:

Well, all that’s a real ear opener, isn’t it?

When digital data is transferred and manipulated, it’s moved in bytes rather than as individual bits. There are 8 bits to a byte, and each audio sample is stored as a whole number of bytes, known as a digital word. Both 16 bits and 24 bits became a standard because each represented the next word size up.

Bit depth translates to the number of steps in the amplitude of a digital recording. A 16-bit recording has 65,536 steps, a 20-bit recording has 1,048,576 steps, and a 24-bit recording has 16,777,216 steps.

Sampling rate is the frequency at which the amplitude of the analog sound wave is sampled. At a 44.1KHz sampling frequency the amplitude of the music is sampled 44,100 times each second. At a 96KHz sampling frequency the amplitude is sampled 96,000 times each second. At a 192KHz sampling frequency the amplitude is sampled 192,000 times each second.

The more bits and/or the higher the sampling rate used in quantization, the higher the theoretical resolution. So a 16-bit 44.1KHz Red Book CD has 2,890,137,600 potential sampling points each second (44,100 x 65,536). And a 20-bit 96KHz recording has 100,663,296,000 potential sampling points each second (96,000 x 1,048,576). This means a 20-bit 96KHz recording has roughly 35 times the resolution of a 16-bit 44.1KHz recording, and a 24-bit 192KHz recording has roughly 1,100 times the resolution of a 16-bit 44.1KHz recording. No small difference.

According to the experts that manufacture the finest DAC chips, resistors, and power regulators, there is theoretically no way to make electronics that are capable of discerning greater than a 20-bit resolution (120dB dynamic range). Any company that claims greater than 20-bit resolution from their DAC is simply full of shit. Oh they can decode 24-bits, because 24-bits does exist in software, but the output from their DAC has less than 20-bits of resolution and dynamic range.

In order to reproduce anywhere near the dynamic range these high-res formats offer, you would need amplification with several times the wattage and a fraction of the noise floor of what is currently available to the high-end audiophile.

Of course that doesn’t even account for the significant amount of distortion added by signal cables, amplification, and speakers, and the background noise in a listening room, all of which would not allow hearing the full resolution and dynamic range of even a 16-bit recording.

In order to hear the difference in dynamic range between a 16-bit and a 20-bit recording in a normal quiet listening room, you would have to play the music so loud it would cause permanent hearing loss.

When people claim to hear differences between 16-bit, 20-bit, and 24-bit recordings, it is not the difference between the bit depths that they are hearing, but rather the difference in the quality of the digital mastering. The fact is that even most so-called 24-bit recordings are mastered with less than the 96dB dynamic range of a 16-bit recording (and wisely so).

So what do they do with commercially marketed so-called 24-bit recordings? They simply fill some of the Most Significant Bits (MSB) with 1s and some of the Least Significant Bits (LSB) with 0s to pad the overall volume up to the target level. They could have released a recording of identical performance in 16-bits, but naive consumers insist on 24-bits, so the record companies trick them by centering 16-bits of dynamic range in a 24-bit frame. How silly.

Part of why some HD recordings sound sterile has to do with lower dynamic compression that doesn’t allow the subtle low-level detail to rise above the noise floor. When music is sanely dynamically compressed, it allows you to listen at a reasonable volume and still hear all the subtle harmonic cues that reveal the tone, timbre, and room acoustics in the recording.

Another consideration of higher sampling rates and greater bit depth is system resources. Both require more storage space, more RAM, and faster processors. Though the optimal sampling frequency and bit-depth that are required to reproduce accurate music are a matter of heated debate, there is no doubt that excessive resolution unnecessarily uses up system resources and unnecessarily increases the size and cost of components.

Of course most recordings are engineered to sound best on a car stereo or portable device as opposed to on a high-end audiophile system. It’s a well-known fact that artists and producers will often listen to tracks on an MP3 player or car stereo before approving the final mix.

The quality of the recording plays a far more significant role than the format or resolution it is distributed in. But to increase profits, many modern recording studio executives insist that errors be edited out in post-production, significantly compromising the quality of the original master tapes. So no matter what format these recordings are released in, the music will always sound mediocre, since you can never have higher performance than what is on the original masters.

In contrast, some of my favorite digital recordings were digitally mastered from 1950s analog recordings. Many of these recordings were done as a group of musicians playing in a room with one take per track and a minimum of post-production editing. Though these recordings have much higher background noise, being limited by old-school pre-Dolby master tapes with a 60dB dynamic range, they retain an organic character that can't be duplicated any other way. When you hear the organic character and coherent in-the-room harmonics, it is clear why so many audiophiles prize these recordings.

The simpler the signal path and the lower the power supply noise, the better the digital-to-analog conversion. Hence our decades of obsession with R-2R non-oversampling DACs and ultralow-noise power supplies, as are used in our Mystique DAC.


Hear It for Yourself:

Are you curious about the potential of digital-to-analog conversion?

Mojo Audio’s Mystique DAC has the purest digital conversion possible.

  • A true non-oversampling R-2R multi-bit design.
  • No noise-shaping, upsampling, or oversampling algorithms.
  • MSB zero-crossing voltage adjustment circuitry to optimize linearity.
  • Perfectly bit-aligned left and right channel hardware-based demultiplexing.
  • Direct-coupled: no capacitors or transformers to distort phase and time or narrow bandwidth.

The Mystique is in a class by itself. Explosive micro-dynamics combined with harmonically coherent micro-details reveal the true time, tune, tone, and timbre of the original performance.

With Mojo Audio’s 45-day no-risk audition, you can hear the Mystique DAC for yourself, in your own system, with no risk and no restocking fees. Experience all the purity and emotional content digital music is capable of delivering.

If you like what you read in this blog and are interested in getting more free tips and tricks, check out the rest of the blogs on our website. Also, sign up for our e-newsletter to get more useful info as well as discount coupons, special offers, and first looks at new products. Plus, don’t forget to “like us” on Facebook.

Enjoy!

Benjamin Zwickel
Owner, Mojo Audio


References:

http://www.lavryengineering.com/lavry-white-papers/

http://www.homestudiocorner.com/24-bit-vs-16-bit/

http://electronics.forumsee.com/a/m/s/p12-37984-047253–24bit-16bit-the-myth-exploded.html

http://www.tested.com/tech/1905-the-real-differences-between-16-bit-and-24-bit-audio/

http://www.highendnews.info/technology/oversampling_and_bitstream_metho.htm

http://www.grimmaudio.com/site/assets/files/1088/dsd_myth.pdf

http://bitperfectsound.blogspot.com/2014/12/dst-compression.html

http://www.soundonsound.com/sos/sep07/articles/digitalmyths.htm

http://www.digitalpreservation.gov/formats/fdd/fdd000230.shtml

https://en.wikipedia.org/wiki/Direct_Stream_Digital

http://hometheaterreview.com/super-audio-compact-disc-sacd/

http://en.antelopeaudio.com/blog/

http://benchmarkmedia.com/blogs/news/15121729-audio-myth-24-bit-audio-has-more-resolution-than-16-bit-audio


Note: many of the graphics used in this blog were adapted from graphics taken from these reference sources.