This contentious topic is up there with Mac versus PC, tonewheel versus clonewheel, and of course, the Michael Jackson versus Elvis Presley Epic Rap Battle of History. (Yes, that’s a thing.) So what’s the reality? Here’s what recent experiments of mine determined.
Fig. 1 above: Oversampling options in various programs, clockwise from top: iZotope Ozone, Cakewalk Z3ta+ 2, two processors from Native Instruments Kontakt, and Peavey’s ReValver amp simulator.
The debate. The “by the numbers” camp references studies showing people can’t differentiate between 96kHz source material and the same material played back through a 44.1kHz converter. They cite the Nyquist theorem and studies on human perception. Their verdict: 96kHz is snake oil.
That same camp has often characterized the pro-96kHz contingent as simply thinking, “it goes up to 11, so it must be better.” That’s a bit of a straw man argument, though. Proponents of 96kHz, in fact, point to studies on converter performance at higher sample rates and on localization information in audio streams, and then bring out the one argument no one can refute: “It sounds better to me.” Whether it’s the placebo effect, hearing acuity, or some as yet unquantified technical issue, some engineers and producers with impeccable audio credentials swear that files recorded at 96kHz simply sound better.
So far, however, the controversy has centered on playback. I wanted to find out whether recording at 96kHz made a difference—regardless of whether the files were then played back at 96kHz or 44.1kHz.
Surprise! You’re already using higher sample rates. I used to mix DAW tracks with Panasonic’s DA7 digital mixer because the EQs sounded better than what was in my DAW. I wondered if I was just hearing things, so one day I cornered the DA7’s head engineer. He informed me that the EQs ran at twice the sample rate internally. So while I was mixing at 44.1kHz, the EQs were running at 88.2kHz.
With today’s virtual instruments and signal processors, clicking on the “oversampling” button or its equivalent (see Figure 1 above) generally runs the processing at two or more times the project sample rate, and also increases the hit on your CPU. This can result in an audible improvement, especially from soft synths rich in harmonics or algorithms that generate distortion. This is because oversampling reduces . . .
Foldover distortion. What’s this? Basically, if something inside the computer generates audio above half the sample rate (the Nyquist frequency, which is 22.05kHz when the clock runs at 44.1kHz), that audio creates difference signals that “fold back” into the audible range, producing nasty distortions. These signals won’t have huge amplitudes, but they’ll still be annoying. This process is called aliasing. Increasing the sample rate raises that limit (to 48kHz at a 96kHz sample rate), which puts it beyond the more prominent harmonics. Also, when distortion does fold back at 96kHz, it will likely land above the audible range, so you won’t hear it anyway.
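Here’s a rough sketch of the fold-back arithmetic. It assumes an ideal sampler with no anti-aliasing filter in the way, which is the situation for audio generated entirely inside the computer. The function name is made up for this illustration and doesn’t come from any plug-in or library.

```python
def alias_frequency(freq_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a component shows up after ideal, unfiltered sampling."""
    nyquist_hz = sample_rate_hz / 2.0
    # The sampled spectrum repeats every sample_rate_hz, so wrap into one period.
    wrapped_hz = freq_hz % sample_rate_hz
    # Anything in the upper half of that period reflects ("folds") around Nyquist.
    if wrapped_hz > nyquist_hz:
        return sample_rate_hz - wrapped_hz
    return wrapped_hz


if __name__ == "__main__":
    for rate in (44_100.0, 96_000.0):
        for freq in (30_000.0, 60_000.0):
            landed = alias_frequency(freq, rate)
            print(f"{freq:>7.0f} Hz sampled at {rate:>6.0f} Hz lands at {landed:>7.0f} Hz")
```

Under these assumptions, a 30kHz component generated at a 44.1kHz sample rate lands at 14.1kHz, squarely in the audible range. At 96kHz the same component stays at 30kHz, where you won’t hear it.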
Your audio interface already includes input filtering to remove harmonics above the audio range, so you’re unlikely to encounter problems from real-world signals. However, processes created entirely inside the computer are a different matter. If all plug-ins were designed with oversampling options and high-quality filtering, you wouldn’t hear aliasing, because any frequencies that could go above half the sample rate would be filtered out before they could fold back. However, some (many?) plug-ins and virtual instruments, especially older ones, don’t include oversampling and can produce aliasing at lower sample rates. Your four options are: don’t use them, hope the designers come up with an improved version, accept the distortion, or run your project at a higher sample rate.
Perhaps aliasing is a key reason for the 96kHz controversy. You might run plug-ins that aren’t prone to aliasing, and sound the same at 44.1kHz or 96kHz. So you conclude 96kHz is hype. Or, you might run projects with plug-ins that sound significantly better at 96kHz—so that means 96kHz is better, right? No, it means that running at 96kHz effectively adds an oversampling button to plug-ins lacking that option. It has nothing to do with the frequency range our ears can hear.
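To put some numbers on that idea, here’s a small sketch that counts how many harmonics of a bright, unfiltered tone would fold back into the audible range at each project sample rate. Everything specific in it is an assumption chosen for illustration: the 3kHz fundamental, the 20 harmonics, the 20kHz hearing limit, and the function names. A real plug-in’s harmonic content will differ.

```python
AUDIBLE_LIMIT_HZ = 20_000.0  # assumed upper limit of hearing


def folded(freq_hz: float, sample_rate_hz: float) -> float:
    """Where a component lands after ideal sampling with no filtering."""
    wrapped_hz = freq_hz % sample_rate_hz
    if wrapped_hz > sample_rate_hz / 2.0:
        return sample_rate_hz - wrapped_hz
    return wrapped_hz


def audible_aliases(fundamental_hz: float, num_harmonics: int,
                    sample_rate_hz: float) -> list[tuple[int, float]]:
    """List (harmonic number, landing frequency) for aliases you could hear."""
    nyquist_hz = sample_rate_hz / 2.0
    found = []
    for n in range(1, num_harmonics + 1):
        harmonic_hz = n * fundamental_hz
        if harmonic_hz <= nyquist_hz:
            continue  # reproduced correctly, so it is not an alias
        landed_hz = folded(harmonic_hz, sample_rate_hz)
        if landed_hz <= AUDIBLE_LIMIT_HZ:
            found.append((n, landed_hz))
    return found


if __name__ == "__main__":
    for rate in (44_100.0, 96_000.0):
        hits = audible_aliases(3_000.0, 20, rate)
        print(f"{rate:.0f} Hz project: {len(hits)} audible aliased harmonics")
```

With these made-up values, a dozen harmonics fold into the audible range at 44.1kHz and none do at 96kHz, even though the “plug-in” itself hasn’t changed. That’s the oversampling button effect in miniature.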
Pros and cons. To minimize aliasing, one valid strategy is to switch your sample rate preference to 96kHz (see Figure 2 at left), and not enable oversampling on plug-ins unless they’re intended to handle oversampling at high sample rates. Any subtle quality improvement spread over multiple plug-ins has a cumulative and potentially audible—not just theoretical—effect.
Fig. 2. Switching a program’s sample rate to 96kHz usually involves a setting in a preferences menu. Clockwise from top: Propellerhead Reason, Steinberg Cubase, MOTU Digital Performer, PreSonus Studio One Pro. Center: Cakewalk Sonar.
However, 96kHz puts more stress on your computer and limits the maximum number of audio streams between your computer and your USB or FireWire audio interface. Also, some plug-ins won’t operate at 96kHz. You can achieve lower latency, a nice bonus, if your computer can handle it. But if you need to double your sample buffer setting in order to run at 96kHz, there’s no real improvement.
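The buffer arithmetic behind that last point can be sketched as follows. This counts only the time one buffer takes to fill and ignores converter and driver overhead. The function name and the 256- and 512-sample buffer sizes are illustrative assumptions, not settings from any particular interface.

```python
def buffer_latency_ms(buffer_samples: int, sample_rate_hz: float) -> float:
    """Time in milliseconds needed to fill one buffer at the given sample rate."""
    return 1000.0 * buffer_samples / sample_rate_hz


if __name__ == "__main__":
    cases = [
        (256, 44_100.0),  # baseline project
        (256, 96_000.0),  # same buffer, higher rate
        (512, 96_000.0),  # buffer doubled to keep the CPU happy
    ]
    for samples, rate in cases:
        latency = buffer_latency_ms(samples, rate)
        print(f"{samples} samples at {rate:.0f} Hz: {latency:.2f} ms")
```

Under these assumptions, 256 samples take about 5.8ms at 44.1kHz and about 2.7ms at 96kHz. Double the buffer to 512 samples at 96kHz and you’re back to about 5.3ms, which is close to where you started.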
Furthermore, a well-designed plug-in might actually sound better when oversampled from a lower sample rate. For example, IK Multimedia’s AmpliTube guitar amp simulation suite applies different amounts of oversampling to different processes within the emulation, so that no process uses more than it needs. A higher sample rate oversamples everything, whether it needs it or not.
The verdict. Ultimately, for recording, both camps are right. Recording at 96kHz can improve the sound quality, but it can also make no difference, depending on your collection of plug-ins and the musical material. In any event, you do have to consider the tradeoff in CPU resources.
As to playback at 96kHz vs. 44.1kHz, I’m not touching that one; no one enjoys hate mail! However, I did note that if rendered audio sounded better at 96kHz, and was then sample-rate converted to 44.1kHz and played back at 44.1kHz, the improved quality was still there. This makes sense: If the improvement at 96kHz falls into the audible range, 44.1kHz has no trouble reproducing it. It’s only frequencies above the audible range that give 44.1kHz a hard time.
So, if your intuitive attitude has been “might as well record at 96kHz; it can’t hurt and it might help,” your intuition was right. On the other hand, if you haven’t noticed any difference between recording at 44.1kHz and 96kHz on your particular system, it’s probably because the gestalt of everything you use is such that there isn’t any difference.
Then again, you can just not worry about it, and write a better song. That’s all your audience really cares about anyway.
About the Audio Example
Both files are from Cakewalk Z3ta+ 2, which was set for no oversampling (if set for oversampling, there’s no difference between recording it at a higher or lower sample rate). I played high notes, then copied and transposed them up an octave to make sure there would be plenty of high frequencies to push past what a 44.1kHz sample rate can handle. The point here is not which file sounds “better.” This is a test file that was created with more highs than normal, so the fact that the 44.1kHz file doesn’t reproduce them arguably makes that file more musically useful. However, the one that was recorded at 96kHz and converted to 44.1kHz audibly has more high-frequency components, and if you listen closely, they’re cleaner as well: you don’t hear any low-level “wooliness” from foldover distortion.