This contentious topic is up there with Mac versus PC,
tonewheel versus clonewheel, and of course, the Michael Jackson versus
Elvis Presley Epic Rap Battle of History. (Yes, that’s a thing.) So
what’s the reality? Here’s what my recent experiments determined.
Fig. 1. Oversampling options in various programs,
clockwise from top: iZotope Ozone, Cakewalk Z3ta+ 2, two processors from
Native Instruments Kontakt, and Peavey’s ReValver amp simulator.
The debate. The “by the numbers” camp references
studies showing people can’t differentiate between 96kHz source material
and the same material played back through a 44.1kHz converter. They
cite the Nyquist theorem and studies on human perception. Their verdict:
96kHz is snake oil.
That same camp has often characterized the pro-96kHz
contingent as simply thinking, “it goes up to 11, so it must be better.”
That’s a bit of a straw man argument, though. Proponents of 96kHz, in
fact, point to studies on converter performance at higher sample rates
and on localization information in audio streams, and bring out the one
argument no one can refute: “It sounds better to me.” Whether it’s the
placebo effect, hearing acuity, or some as yet unquantified technical
issue, some engineers and producers with impeccable audio credentials
swear that files recorded at 96kHz simply sound better.
So far, however, the controversy has centered on playback. I wanted to find out whether recording at 96kHz made a difference—regardless of whether the files were then played back at 96kHz or 44.1kHz.
Surprise! You’re already using higher sample rates. I
used to mix DAW tracks with Panasonic’s DA7 digital mixer because the
EQs sounded better than what was in my DAW. I wondered if I was just
hearing things, so one day I cornered the DA7’s head engineer. He
informed me that the EQs ran at twice the sample rate internally. So
while I was mixing at 44.1kHz, the EQs were running at 88.2kHz.
With today’s virtual instruments and signal processors,
clicking on the “oversampling” button or its equivalent (see Figure 1)
generally oversamples by a factor of at least two, and
also increases the hit on your CPU. This can result in an audible
improvement, especially from soft synths rich in harmonics or algorithms
that generate distortion. This is because oversampling reduces . . .
Foldover distortion. What’s this? Basically, if something inside the computer generates audio above half the clock frequency (the Nyquist limit; e.g.,
22.05kHz for a 44.1kHz clock), this audio creates difference signals that “fold back” into
the audible range, producing nasty distortions. These signals won’t have
huge amplitudes, but they’ll still be annoying. This process is called aliasing. Increasing
the sample rate pushes that limit beyond the range of the more prominent
harmonics. Also, when distortion folds back at 96kHz, it will likely
land above the audible range, so you won’t hear it anyway.
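To put numbers on foldover, here is a minimal Python sketch. It assumes ideal sampling with no anti-alias filtering, and the function name `alias_frequency` and the 5kHz test tone are illustrative choices, not taken from any particular plug-in. It computes where each of a few odd harmonics of the tone lands after sampling.

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a pure tone shows up after ideal sampling."""
    nyquist = sample_rate_hz / 2
    folded = freq_hz % sample_rate_hz      # sampling cannot tell f from f + k*fs
    if folded > nyquist:                   # upper half of the band mirrors down
        folded = sample_rate_hz - folded
    return folded

fundamental_hz = 5000  # hypothetical test tone, chosen for illustration
for sample_rate_hz in (44100, 96000):
    print(f"{sample_rate_hz} Hz clock:")
    for n in (3, 5, 7, 9):
        harmonic_hz = n * fundamental_hz
        landed_hz = alias_frequency(harmonic_hz, sample_rate_hz)
        print(f"  harmonic {n} at {harmonic_hz} Hz lands at {landed_hz} Hz")
```

With a 44.1kHz clock, the 25, 35, and 45kHz harmonics land at 19.1kHz, 9.1kHz, and 900Hz, none of which is harmonically related to the 5kHz note. With a 96kHz clock, all four stay where they were generated, and the ones above roughly 20kHz are inaudible.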
Your audio interface already includes input filtering to
remove harmonics above the audio range, so you’re unlikely to encounter
problems from real-world signals. However, processes created entirely
inside the computer are a different matter. If all plug-ins were
designed with oversampling options and high-quality filtering, you
shouldn’t hear aliasing because any frequencies that could go above
half the clock frequency wouldn’t exist. However, some (many?) plug-ins and
virtual instruments, especially older ones, don’t include oversampling
and can produce aliasing at lower sample rates. Your four options are:
don’t use them, hope the designers come up with an improved version,
accept the distortion, or run your project at a higher sample rate.
Perhaps aliasing is a key reason for the 96kHz
controversy. You might run plug-ins that aren’t prone to aliasing, and
sound the same at 44.1kHz or 96kHz. So you conclude 96kHz is hype. Or,
you might run projects with plug-ins that sound significantly better at
96kHz—so that means 96kHz is better, right? No, it means that running at
96kHz effectively adds an oversampling button to plug-ins lacking that
option. It has nothing to do with the frequency range our ears can hear.
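The following Python sketch illustrates that point under loudly stated assumptions: a cubic waveshaper stands in for a distortion plug-in with no oversampling, the 10kHz note is a hypothetical test signal, and the helper names `cubed_sine` and `tone_level` are made up for this example. Cubing a sine adds a third harmonic (30kHz here), and the code measures how much of it turns up at 14.1kHz, which is where 30kHz folds back with a 44.1kHz clock.

```python
import math

def cubed_sine(freq_hz, sample_rate_hz):
    """0.1 s of a sine pushed through a cubic waveshaper, with no oversampling."""
    count = sample_rate_hz // 10
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz) ** 3
            for i in range(count)]

def tone_level(samples, freq_hz, sample_rate_hz):
    """Amplitude of the component at freq_hz (a single-bin DFT)."""
    re = im = 0.0
    for i, s in enumerate(samples):
        phase = 2 * math.pi * freq_hz * i / sample_rate_hz
        re += s * math.cos(phase)
        im += s * math.sin(phase)
    return 2 * math.hypot(re, im) / len(samples)

note_hz = 10000                   # hypothetical high note
alias_hz = 44100 - 3 * note_hz    # 14100 Hz: where 30kHz folds at 44.1kHz

for sample_rate_hz in (44100, 96000):
    samples = cubed_sine(note_hz, sample_rate_hz)
    level = tone_level(samples, alias_hz, sample_rate_hz)
    print(f"{sample_rate_hz} Hz: level at {alias_hz} Hz = {level:.4f}")
```

At 44.1kHz the measured level is about 0.25, the full amplitude of the third harmonic, now sitting at an audible and inharmonic 14.1kHz. At 96kHz the level is essentially zero, because the harmonic stays at 30kHz where it was generated.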
Pros and cons. To minimize aliasing, one valid
strategy is to switch your sample rate preference to 96kHz (see Figure
2), and not enable oversampling on plug-ins unless they’re intended to
handle oversampling at high sample rates. Any subtle quality improvement
spread over multiple plug-ins has a cumulative and potentially
audible—not just theoretical—effect.
Fig. 2. Switching a program’s sample rate to 96kHz usually
involves a setting in a preferences menu. Clockwise from top:
Propellerhead Reason, Steinberg Cubase, MOTU Digital Performer, PreSonus
Studio One Pro. Center: Cakewalk Sonar.
However, 96kHz stresses out your computer more and limits
the maximum number of audio streams between your computer and your USB or
FireWire audio interface. Also, some plug-ins won’t operate at 96kHz. You can achieve
lower latency—a nice bonus—if your computer can handle it. But if you
need to double your sample buffer setting in order to run at 96kHz,
there’s no real improvement.
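The latency arithmetic is simple enough to sketch in Python. The buffer sizes below are assumed example values, and the function `buffer_latency_ms` is a hypothetical helper that counts only the time to fill one buffer; real round-trip latency also includes converter and driver overhead.

```python
def buffer_latency_ms(buffer_samples, sample_rate_hz):
    """Time needed to fill one audio buffer, in milliseconds."""
    return 1000 * buffer_samples / sample_rate_hz

# Assumed example settings: same buffer at both rates, then a doubled buffer at 96kHz
for buffer_samples, sample_rate_hz in ((256, 44100), (256, 96000), (512, 96000)):
    ms = buffer_latency_ms(buffer_samples, sample_rate_hz)
    print(f"{buffer_samples} samples at {sample_rate_hz} Hz: {ms:.2f} ms")
```

A 256-sample buffer takes about 5.80 ms to fill at 44.1kHz and about 2.67 ms at 96kHz. Double the buffer to 512 samples at 96kHz and the figure is about 5.33 ms, nearly back where it started.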
Furthermore, a well-designed plug-in might actually sound
better when oversampled from a lower sample rate. For example, IK
Multimedia's AmpliTube guitar amp simulation suite applies different
amounts of oversampling to different processes within the emulation so
that no process uses more than it needs. A higher sample rate oversamples
everything, whether it needs it or not.
The verdict. Ultimately, for recording, both camps
are right. Recording at 96kHz can improve the sound quality, but it can
also make no difference, depending on your collection of plug-ins and
the musical material. In any event, you do have to consider the CPU
As to playback at 96kHz vs. 44.1kHz, I’m not
touching that one; no one enjoys hate mail! However, I did note that if
rendered audio sounded better at 96kHz, then was sample-rate converted
to 44.1kHz and played back at 44.1kHz, the improved quality was still there. This makes sense: If the improvement at 96kHz falls into the audible range, 44.1kHz has no trouble reproducing what’s in the audible range—it just has a hard time with frequencies above the audible range.
So, if your intuitive attitude has been “might as well
record at 96kHz; it can’t hurt and it might help,” your intuition was
right. On the other hand, if you haven’t noticed any difference between
recording at 44.1kHz and 96kHz on your particular system, it’s probably
because the gestalt of everything you use is such that there isn’t any difference.
Then again, you can just not worry about it, and write a better song. That’s all your audience really cares about anyway.
About the Audio Example
Both files are from Cakewalk Z3ta+ 2, which was set for no
oversampling (if set for oversampling, there’s no difference between
recording it at a higher or lower sample rate). I played high notes,
then copied and transposed them up an octave to make sure there would be
plenty of high frequencies to run up against the limits of the 44.1kHz clock. The point here is not which
file sounds “better.” This is a test file that was created with more
highs than normal, so the fact that the 44.1kHz file doesn’t reproduce
them makes the file more musically useful. However, the one that was
recorded at 96kHz and converted to 44.1kHz audibly has more
high-frequency components, and if you listen closely, they’re cleaner as
well—you don’t hear any low-level “wooliness” from foldover distortion.