DSD vs. PCM: Myth vs. Truth

Introduction

Direct Stream Digital (DSD) has become the big thing in high-end audio. Simplified encoding and decoding, along with ultra-high sampling frequencies, promise unparalleled performance. Is this what we’ve been waiting for, or just mass-marketing hype? This blog separates the hype from the technical facts. I’ll explain in what ways DSD has the advantage, and in what ways pulse-code modulation (PCM) is better. 

If you don’t want a history lesson and don’t want to wade through a lot of technical data, you may want to skip to the summary, where I hit all the major points. You also may want to refer to my other blog on “The 24-bit Delusion.”


A Brief History

In 1857, Édouard-Léon Scott de Martinville invented the phonautograph, which could graphically record sound waves. In early 1877, Charles Cros devised a way to reverse that process on a photoengraving to form a groove that could be traced by a stylus, causing vibrations that could be passed on to a diaphragm, recreating sound waves.

In late 1877, Thomas Edison used Cros’ theories to invent the cylinder phonograph, allowing music lovers to experience recorded music in their homes for the first time. Can you imagine a modern cylinder phonograph? Tangential tracking…no arc error…no skating error. The concept was flawless.

In 1887, Emile Berliner invented the technically inferior disk phonograph. Since disks are much cheaper to produce, fit nicely in display bins at stores, and can include larger cover art and notes, they became the standard. And so began the long history of the recorded music industry being more about consumer convenience and optimal profits than about optimal fidelity.

The digital revolution was no different. Philips and Sony collaborated on the new standard for a consumer digital format in 1979. Philips wanted a 20 cm disk, but Sony insisted on a 12 cm disk that could be played in a smaller portable device. In 1980, they published the Red Book CD-DA standard, and mass-market digital music was born. Many in the early days of digital joked that CD stood for “compromised disk.”

In the early 1980s, when digital recording became readily available, studios converted from analog to digital to save money. For studios, this cost less for the equipment, required less space for both recording and archiving, and made it easier to mix and edit tracks in postproduction. For consumers, there weren't many advantages. Most of the early digital recordings were produced with relatively low resolution and sounded so fatiguing they would make you want to tear your ears off.

The switch from PCM to DSD was no different. In the early 1990s, Sony wanted a future-proof, less expensive medium to archive their analog masters. In 1995, they concluded that storing the 1-bit signal straight from the analog-to-digital converter would allow them to output to any conceivable consumer digital format (LOL!). This new 1-bit technology was achieved by outputting from the monitoring pin on Crystal’s new 1-bit 2.8MHz Bit Stream DAC chip.

Later, Sony’s consumer division caught wind of DSD and collaborated with Philips to create the SACD format. Of course, from the time the SACD was conceived until the time it came to market, DAC chip manufacturers had advanced from 64fs to a higher 128fs sampling rate and from 1-bit to a higher-resolution 5-bit format. Oops.

Long before the DVD, SACD, or DSD formats were developed, the Bit Stream DAC chip was introduced to the consumer market as a lower-cost alternative to the significantly more expensive R-2R multi-bit DAC chip. Bit Stream DAC chips have built-in algorithms to convert PCM input to DSD, which is then converted to analog. Once again, the result was a huge cost saving at the expense of fidelity.

It was in part Bit Stream DAC technology that allowed the development of our modern 7.1 channel audio that’s embedded into video formats. This also allowed electronics manufacturers to market DVD players in small chassis with cheap power supplies that could retail for under $70. Once again, the audio purist never stood a chance.

In contrast, not only do multi-bit R-2R DAC chips cost significantly more to manufacture than single-bit DAC chips, but they also require much larger and more sophisticated power supplies. If you were to make a 7.1 channel R-2R CD/DVD/SACD player, it would cost several times the price of Bit Stream technology, and it would be several times the size. Certainly not what the average consumer is looking for.

To sum things up, the recorded music industry has made decision after decision to maximize profits and mass consumer appeal at the expense of the audio purist. History lesson over.


DSD vs. PCM Technology

PCM recordings are commercially available in 16-bit or 24-bit and in several sampling rates from 44.1kHz to 192kHz. The most common format is the Red Book CD, with 16 bits sampled at 44.1kHz. DSD recordings are commercially available in 1-bit with a sample rate of 2.8224MHz. This format is used for SACD and is also known as DSD64.

There are more modern, higher-resolution DSD formats, such as DSD128, DSD256, and DSD512, which I will explain later. These formats were created for recording studios and comprise only a very small portion of the recordings that are commercially available.

Though you can’t make a direct comparison between the resolution of DSD and PCM, various experts have tried. One estimate is that 1-bit 2.8224MHz DSD has similar resolution to 20-bit 96kHz PCM. Another estimate is that 1-bit 2.8224MHz DSD is equal to 20-bit 141.12kHz PCM or 24-bit 117.6kHz PCM.
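The second estimate appears to line up with a simple bit-rate comparison. Here is a minimal sketch of that arithmetic, using only the figures quoted above. The function name `bits_per_second` is made up for illustration, and matching bit rate is not the same thing as matching audible resolution.

```python
def bits_per_second(bit_depth: int, sample_rate_hz: int) -> int:
    """Raw data rate of a single channel, in bits per second."""
    return bit_depth * sample_rate_hz

# Formats named in the text (one channel each).
formats = {
    "DSD64 (1-bit, 2.8224MHz)": bits_per_second(1, 2_822_400),
    "PCM 20-bit, 141.12kHz": bits_per_second(20, 141_120),
    "PCM 24-bit, 117.6kHz": bits_per_second(24, 117_600),
    "PCM 24-bit, 96kHz": bits_per_second(24, 96_000),
    "PCM 16-bit, 44.1kHz (Red Book)": bits_per_second(16, 44_100),
}

for name, rate in formats.items():
    print(f"{name}: {rate / 1e6:.4f} Mbit/s per channel")
```

The 20-bit 141.12kHz and 24-bit 117.6kHz figures carry exactly the same number of bits per second as DSD64, which is presumably where that estimate comes from.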

In other words, DSD64, or SACD, has higher resolution than a 16-bit 44.1kHz Red Book CD, roughly the same resolution as a 24-bit 96kHz PCM recording, and not as much resolution as a 24-bit 192kHz PCM recording.

Both DSD and PCM are “quantized,” meaning numeric values are set to approximate the analog signal. Both DSD and PCM have quantization errors. Both DSD and PCM have linearity errors. Both DSD and PCM have quantization noise that requires a low-pass filter at the output of the converter so as not to overload amplification and speakers with ultrahigh-frequency noise. In other words, neither one is perfect.

PCM encodes the amplitude of the analog signal sampled at uniform intervals, and each sample is quantized to the nearest value within a range of digital steps. The range of steps is based on the bit depth of the recording. A 16-bit recording has 65,536 steps, a 20-bit recording has 1,048,576 steps, and a 24-bit recording has 16,777,216 steps. 
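To make the step counts concrete, here is a minimal sketch of PCM quantization. The names `step_count` and `quantize` are hypothetical, the input is assumed to be normalized to the range -1.0 to 1.0, and real converters add dither and other refinements that are left out here.

```python
def step_count(bit_depth: int) -> int:
    """Number of distinct quantization steps for a given bit depth."""
    return 2 ** bit_depth

def quantize(sample: float, bit_depth: int) -> int:
    """Round a normalized sample (-1.0 to 1.0) to the nearest signed integer code."""
    half = 2 ** (bit_depth - 1)
    code = round(sample * half)
    # Clamp to the representable range, e.g. -32768..32767 for 16-bit.
    return max(-half, min(half - 1, code))

for bits in (16, 20, 24):
    print(f"{bits}-bit: {step_count(bits):,} steps")

# The same analog value lands on a finer grid as the bit depth goes up.
for bits in (16, 24):
    print(f"0.3 quantized at {bits}-bit -> code {quantize(0.3, bits)}")
```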

The more bits and/or the higher the sampling rate, the higher the resolution. That translates to a 20-bit 96kHz recording having roughly 33 times the resolution of a 16-bit 44.1kHz recording. No small difference.
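One simple way to arrive at a figure like this is to multiply the ratio of quantization steps by the ratio of sampling rates. The sketch below assumes that is the intended metric (the function name `resolution_ratio` is made up for illustration). Computed this way the product comes out a little under 35, in the same ballpark as the rough figure above.

```python
def resolution_ratio(bits_a: int, rate_a_hz: int, bits_b: int, rate_b_hz: int) -> float:
    """Steps-times-samples ratio of format A over format B (an assumed metric)."""
    step_ratio = 2 ** bits_a / 2 ** bits_b
    rate_ratio = rate_a_hz / rate_b_hz
    return step_ratio * rate_ratio

ratio = resolution_ratio(20, 96_000, 16, 44_100)
print(f"20-bit 96kHz vs 16-bit 44.1kHz: about {ratio:.1f} times")
```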

DSD encodes music using pulse-density modulation, a sequence of single-bit values at a sampling rate of 2.8224MHz. This translates to 64 times the Red Book CD sampling rate of 44.1kHz, but at only 1⁄32,768th of its 16-bit resolution.
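For a feel of how pulse-density modulation works, here is a toy first-order delta-sigma modulator. It is only a sketch: the names `pdm_encode` and `density` are hypothetical, and real DSD encoders use higher-order noise shaping that is not shown here. The point is simply that the density of 1s in the bit stream tracks the level of the input.

```python
def pdm_encode(samples):
    """Toy first-order delta-sigma modulator: samples in -1.0..1.0 -> list of 0/1 bits."""
    bits = []
    integrator = 0.0
    feedback = 0.0  # previous output, expressed as +1.0 or -1.0
    for x in samples:
        # Accumulate the difference between the input and the last output.
        integrator += x - feedback
        bit = 1 if integrator >= 0.0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits

def density(bits):
    """Fraction of 1s in the bit stream."""
    return sum(bits) / len(bits)

for level in (-0.5, 0.0, 0.5):
    stream = pdm_encode([level] * 4000)
    print(f"input {level:+.1f} -> density of 1s {density(stream):.3f}")
```

A steady input halfway up the positive range produces a stream that is about three-quarters 1s, silence produces about half 1s, and so on.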

 

A graphical representation of PCM as a dual-axis quantization, and DSD as a single-axis quantization, makes clear why the accuracy of DSD reproduction is so much more dependent on the accuracy of the clock than PCM. Of course, the accuracy of the voltage of each bit is just as important in DSD as in PCM, so the regulation of the reference voltage is equally important in both types of converters. In addition, the accuracy of the clocking during the recording process is as important as, if not more important than, the accuracy of the clocking during playback.

There are other DSD formats that use higher sampling rates, such as DSD128 (aka Double-Rate DSD), with a sampling rate of 5.6448MHz; DSD256 (aka Quad-Rate DSD), with a sampling rate of 11.2896MHz; and DSD512 (aka Octuple-Rate DSD), with a sampling rate of 22.5792MHz. All of these higher-resolution DSD formats were intended for studio use as opposed to consumer use, though there are some obscure companies selling recordings in these formats.

Note that Double-, Quad-, and Octuple-Rate DSD each come in two versions: one at a multiple of 44.1kHz and one at a multiple of 48kHz. That allows 100% equal division down to 44.1kHz Red Book or to the 96kHz and 192kHz High-Definition PCM formats.
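The rates quoted above are all simple multiples of a base rate, which a few lines of arithmetic can confirm. This is a sketch only; the function name `dsd_rate_hz` and the constant names are made up for illustration.

```python
BASE_44K1_HZ = 44_100  # Red Book CD sampling rate
BASE_48K_HZ = 48_000   # base rate of the 96kHz / 192kHz family

def dsd_rate_hz(multiple: int, base_hz: int = BASE_44K1_HZ) -> int:
    """Sampling rate of a DSD format given its multiple of the base rate."""
    return multiple * base_hz

print(f"DSD64: {dsd_rate_hz(64) / 1e6:.4f}MHz")

for name, multiple in [("DSD128", 128), ("DSD256", 256), ("DSD512", 512)]:
    rate_44k1 = dsd_rate_hz(multiple)
    rate_48k = dsd_rate_hz(multiple, BASE_48K_HZ)
    print(f"{name}: {rate_44k1 / 1e6:.4f}MHz (44.1kHz family), "
          f"{rate_48k / 1e6:.4f}MHz (48kHz family)")
```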


The Problems:

There are three major areas where both PCM and DSD fall short of perfection: quantization errors, quantization noise, and non-linearity.

Quantization errors can occur in several ways. One way that was most common in the early days of digital recording had to do with the resolution being too low. You can’t quantize to a fraction of a bit, and you can’t quantize to a fraction of a sample period. When the value of the analog signal falls between two quantization values, the digital recording ends up recreating the sound lower or higher in volume and/or slower or faster in frequency, distorting the time, tune, and intensity of the original music. Often this creates unnatural, odd harmonics that result in the hard, fatiguing sound associated with early digital recordings.

Though modern sampling rates are high enough to fool the human ear, quantization errors still occur when translating from one format to another. For example, when Sony decided to archive their analog master libraries to DSD64 back in the mid-1990s, they were wrong to believe that these masters would be future-proof and able to reproduce any consumer format. The fact is, these masters could only properly reproduce a format that was divisible by 44.1kHz. So any modern 96kHz or 192kHz recording created from DSD64 master files has quantization errors.
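The divisibility point is easy to check. The sketch below divides the DSD64 rate by several PCM rates; the helper name `rate_ratio` is hypothetical. A whole-number ratio means the conversion can be a straight decimation, while a fractional ratio means the converter has to resample by a non-integer factor. The code only shows the arithmetic, not how audible the result is.

```python
from fractions import Fraction

DSD64_HZ = 2_822_400

def rate_ratio(source_hz: int, target_hz: int) -> Fraction:
    """Exact ratio between two sampling rates, reduced to lowest terms."""
    return Fraction(source_hz, target_hz)

for target_hz in (44_100, 88_200, 176_400, 96_000, 192_000):
    ratio = rate_ratio(DSD64_HZ, target_hz)
    kind = "whole number" if ratio.denominator == 1 else "fractional"
    print(f"DSD64 -> {target_hz / 1000:g}kHz: ratio {ratio} ({kind})")
```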

This leads me to one of the many things that enrage me about the recorded entertainment industry. If 44.1kHz was the standard that was engineered to put aliasing errors in less critical audio frequencies, then why did they start using multiples of 48kHz?!?!?!? All they had to do was go with 88.2kHz and 176.4kHz as the modern HD consumer formats, and all of this mess could have been avoided. They made DXD, a 24-bit 352.8kHz studio format, equally divisible by 44.1kHz. What blithering idiot decided to put a wrench in the works with 96kHz and 192kHz HD audio?!?!?!?

Forgive me for my ranting righteous indignation. Silly me…applying technically sound logic to decisions made by so-called marketing professionals.

Quantization noise is unavoidable. No matter what format you digitize in, ultrasonic artifacts are created. The more bits you have, the lower the noise floor. Noise floor is lowered by roughly 6 dB for each bit. So as you can imagine, 1-bit DSD has significantly more ultrasonic noise than even 16-bit PCM. With PCM, you have to deal with significant noise at the sampling frequency. This is why Sony and Philips engineered the Red Book CD to sample at 44.1kHz, which is over twice the human high-frequency hearing limit of 20kHz. A simple low-pass filter at the output of a PCM converter set around 30kHz does a fine job of cleaning that up.
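The 6 dB per bit rule of thumb comes from the fact that each added bit doubles the number of steps, and a doubling of amplitude is 20 times log10 of 2, or about 6.02 dB. Here is a minimal sketch of that calculation. The function name `dynamic_range_db` is made up for illustration, and the result is the simple step-count ratio, ignoring dither and noise shaping.

```python
import math

def dynamic_range_db(bit_depth: int) -> float:
    """Ratio of full scale to one quantization step, in decibels."""
    return 20 * math.log10(2 ** bit_depth)

print(f"per bit: {dynamic_range_db(1):.2f} dB")
for bits in (16, 20, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
```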

Of course DSD64 is another story: above 25kHz the quantization noise rises sharply, requiring far more sophisticated filters and/or noise-shaping algorithms. When you filter the output of DSD64 with a simple low-pass filter, the result is distorted phase/time and some rather nasty artifacts in the audible range. The solution is noise-shaping algorithms that move the noise to less audible frequencies and/or higher sampling rates. This is why DSD128 (Double-Rate DSD) and the other higher-sample-rate DSD formats came into being. This is also why advanced player software, such as JRiver, offers Double-Rate DSD output.

Jitter is defined as inconsistencies in playback frequency caused by inaccurate clocking. The result is non-linearity, which is observable as distortion of the time and tune of the music. Often the pattern of the inconsistency of frequency can result in an analog waveform that has an unnatural, odd harmonic frequency. This results in the fatiguing character commonly known as “digititis.”
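The effect of clock jitter on a sampled signal can be sketched with a small simulation. The code below samples a sine wave at the ideal instants and again at randomly jittered instants, then reports the RMS difference. Everything here is an assumption for illustration: the function name `jitter_error_rms`, the Gaussian jitter model, and the test tone are mine, and real clocks can have correlated, non-Gaussian jitter.

```python
import math
import random

def jitter_error_rms(freq_hz, sample_rate_hz, jitter_rms_s, n=4096, seed=0):
    """RMS error of a full-scale sine caused by random sampling-time jitter."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = 0.0
    for i in range(n):
        t_ideal = i / sample_rate_hz
        # Each sampling instant is displaced by a small random timing error.
        t_actual = t_ideal + rng.gauss(0.0, jitter_rms_s)
        ideal = math.sin(2 * math.pi * freq_hz * t_ideal)
        actual = math.sin(2 * math.pi * freq_hz * t_actual)
        total += (actual - ideal) ** 2
    return math.sqrt(total / n)

for label, jitter_s in [("1 ns", 1e-9), ("100 ps", 1e-10), ("1 ps", 1e-12)]:
    err = jitter_error_rms(10_000, 44_100, jitter_s)
    print(f"10kHz tone, {label} RMS jitter: error {20 * math.log10(err):.1f} dB re full scale")
```

In this model the error grows with both the amount of jitter and the frequency of the signal, which is one way to see why clock accuracy gets so much attention.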

  

Non-linearity can occur when either the converter’s clock rate or voltage per step is off. This is why we are hearing so much about “super clocks” and “femto clocks.” The more accurate the clock, the more linear the analog output. This is also why ultrahigh-performance PCM converters, such as Mojo Audio’s Mystique, have a way to adjust the voltage of the most-significant-bit (MSB) at the zero crossing to optimize linearity. The question is, why don’t other companies have a way to optimize MSB voltage in addition to these super clocks they are all bragging about?


The Myth of Pure DSD:

Despite the marketing hype, there are almost no pure DSD recordings available to consumers. This is because there is currently no way to edit, mix, and master DSD files. Only the rare DSD recordings made from a mixed and mastered analog recording, or those recorded direct to DSD without any postproduction, are pure DSD. Most so-called pure DSD recordings are, in fact, edited in PCM. The marketing hype DSD flow chart you see below rarely exists anywhere but in theory. Yikes…the secret is out.

There are several generations and levels of quality in purely digital DSD recordings. The least pure are DSD recordings made from old PCM masters. Many of these PCM masters had low resolution, as well as significantly higher quantization errors and lower linearity than modern PCM recordings. Since you can never get better than the original masters, these DSD recordings sound as bad as or worse than the original low-resolution PCM masters. The purest DSD recordings come from modern DSD masters that are recorded in Wide-DSD, which is in fact a 5-bit or 8-bit PCM format at ultrahigh DSD sampling rates.

As you can see from the above flow chart, most commercially available DSD recordings have to be converted back and forth to a PCM format in order to do postproduction editing, mixing, and mastering. In each of these conversions, more quantization noise and/or quantization errors are added to the recording. This leads many to ask: why degrade performance by adding the additional step to convert to DSD when the master is already in PCM?

Another common marketing myth about DSD vs. PCM is that when blind listening tests were done comparing DSD to PCM, there was a consensus that PCM had a fatiguing quality and DSD had a more analog-like quality. This was proved to be total marketing BS. One way that marketing lie was perpetuated was with hybrid SACDs that have DSD64 and 16-bit 44.1kHz PCM on the same disk. The DSD64 tracks have roughly 33 times the resolution of the 16-bit 44.1kHz tracks, so DSD could be made to sound better than PCM in comparisons. The truth is that recent blind studies have found high-resolution PCM and DSD to be statistically indistinguishable from one another. Considering that nearly all DSD recordings were edited, mixed, and mastered in PCM, it is no wonder.


Summary:

Historically, most decisions related to mass-marketed recordings were based on consumer convenience and higher profits, rather than technical advantages and higher fidelity.

R-2R ladder DAC chips, and the circuitry that supports them, cost significantly more to manufacture and are significantly larger in size than DSD or Bit Stream technology.

DSD64, or SACD, has higher resolution than a 16-bit 44.1kHz Red Book CD, roughly the same resolution as a 24-bit 96kHz PCM recording, and not as much resolution as a 24-bit 192kHz PCM recording.

Even though many recordings are advertised as being 24-bit, all 24 bits were used only in the recording studio to reduce quantization noise. The consumer version was mastered at a much lower bit depth, usually at or below 20 bits.

The DSD64 tracks on a hybrid SACD have roughly 33 times the resolution of the 16-bit 44.1kHz PCM tracks. This was done purposely so that they could sell more SACD players by fooling potential customers into believing that they were making a fair comparison when they played music from the same disk.

DSD has significantly higher quantization noise than PCM, and the noise is much closer to audible frequencies, requiring significantly more sophisticated digital filters, as well as noise-shaping and upsampling algorithms, that can result in distortion of the analog signal.

Pure DSD recordings, as pictured in the flow charts used in DSD marketing hype, are almost nonexistent. There is currently no technology to edit, mix, or master DSD. High-definition 5-bit and 8-bit PCM (Wide-DSD) is used in the recording and postproduction editing, mixing, and mastering of nearly all modern DSD recordings.

High-resolution PCM and DSD formats are statistically indistinguishable from one another in blind listening tests.

When a PCM file is played on a DSD or Bit Stream converter, the DAC chip has to convert the PCM to DSD in real time. This is one of the major reasons people claim DSD sounds better than PCM, when in fact, it is just that the chip in most modern single-bit DACs does a poor job of decoding PCM.

Of course most recordings are engineered to sound best on a car stereo or portable device as opposed to on a high-end audiophile system. It’s a well-known fact that artists and producers will often listen to tracks on an MP3 player or car stereo before approving the final mix.

I believe that the quality of the recording plays a far more significant role than the format or resolution it is distributed in. Too bad most of the big recording houses don’t agree with me. To increase profits, recording studio executives insisted that errors be edited out in postproduction, significantly compromising the quality of the original master tapes. In my opinion, this was the end of the golden age of recording.

In contrast, some of my favorite digital recordings were digitally mastered from 1950s analog recordings made on tube-based reel-to-reels. When you hear the organic character and coherent in-the-room harmonics, it is clear why so many audiophiles prize these recordings.

I also believe that the simpler the signal path and the lower the power supply noise, the better the digital-to-analog conversion. Hence my decades of obsession with R-2R nonoversampling conversion and ultralow-noise power supplies, as are used in the Mystique DAC.


Hear It for Yourself:

Are you curious about the potential of digital-to-analog conversion? Mojo Audio’s Mystique DAC has the purest digital conversion possible.

  • A true nonoversampling R-2R multi-bit design
  • No noise-shaping, upsampling, or oversampling algorithms
  • MSB zero-crossing voltage adjustment circuitry to optimize linearity
  • Perfectly bit-aligned left and right channel hardware-based demultiplexing
  • Direct-coupled: no capacitors or transformers to distort phase and time or narrow bandwidth

The Mystique is in a class by itself. Explosive micro-dynamics combined with harmonically coherent micro-details reveal the true time, tune, tone, and timbre of the original performance.

Of course with Mojo Audio’s 45-day no-risk audition, you can hear the Mystique DAC for yourself, in your own system. Experience all the purity and emotional content digital music is capable of delivering.

If you like what you read in this blog and are interested in getting more free tips and tricks, sign up for Mojo Audio’s Audiofiles blog. Also, sign up for our e-newsletter to get more useful info as well as coupons, special offers, and first looks at new products. Plus, don’t forget to “like us” on Facebook.

Enjoy!

Benjamin Zwickel

Owner, Mojo Audio


References:

http://www.lavryengineering.com/lavry-white-papers...

http://www.iear-app.com/24-bit-myth/

http://www.homestudiocorner.com/24-bit-vs-16-bit/

http://electronics.forumsee.com/a/m/s/p12-37984-04...

http://www.tested.com/tech/1905-the-real-differenc...

http://www.highendnews.info/technology/oversamplin...

www.grimmaudio.com/site/assets/files/1088/dsd_myth...

http://bitperfectsound.blogspot.com/2014/12/dst-co...

http://www.soundonsound.com/sos/sep07/articles/dig...

http://www.digitalpreservation.gov/formats/fdd/fdd...

https://en.wikipedia.org/wiki/Direct_Stream_Digita...

http://hometheaterreview.com/super-audio-compact-d...

http://www.antelopeaudio.com/blog/direct-stream-di...

http://benchmarkmedia.com/blogs/news/15121729-audi...

Note: many of the graphics used in this blog were adapted from graphics taken from these reference sources.