(No argument for a better. Steaks, hamburger and sloppy joes are all great. Let’s just know what we are eating….)
Any avid music fan has probably had the argument with a friend (or foe) about what the best format is for listening to music. Since Napster shattered the customs of the music world in the late 90’s, mp3s have become synonymous with contemporary music. The iPod then came along and told us we no longer needed shelves for our music collection, just a regular sized pocket. These developments are currently pushing the CD format closer and closer to its extinction. Yet ironically, as the CD slowly dies, vinyl records are storming back into popularity. So it appears that while the MP3 has unquestionably made music more portable and “share-able” (it is damn awesome to bring your entire music collection on a plane ride!), it doesn’t seem to have what it takes to wipe out other formats completely. So let’s take a look at the science behind music formats and how we hear in general. You may be surprised by what you didn’t know; an educated listener is a better listener indeed.
We must start by examining sound in general.
All right, let’s get some simple things straight about the way sound works for us humans and our brains. In general the human ear picks up frequencies between 20 hertz (Hz) and 20k Hz; hertz meaning the number of vibrations per second (“sound” is simply our brains perceiving minuscule air pressure changes, or vibrations). Yet the truth is most adults only hear up to around 16k Hz (a little higher for females, you lucky ladies) because we lose higher frequencies as we age. Sounds do exist below 20 Hz (think of when you feel deep bass without actually hearing it) and well beyond 20k Hz (think of a dog whistle, we don’t hear it but the pups sure do!). So while we can pick up an important swath of the sound-spectrum, there exists a great deal of sonic information we just never hear. [Note: this phenomenon also exists with our eyes; we only see a tiny portion of the electro-magnetic spectrum, which we call light & color]
So who cares about these sounds our brains cannot even perceive? What the heck does that have to do with mp3s and CDs and listening to your tunes? Again, we have to look at some science basics (bear with me!). Sound is mathematical. Let’s say you play an A note on an instrument. The fundamental frequency of that A (the A above middle C) is 440 Hz, so that will be the most present frequency we hear, yet it will not be the only one. Here is the math: that A note will also create and sound out its harmonics (or “overtones”), which are always multiples of itself. This means that the 440 Hz A note will create another “harmonic” at 880 Hz (440 x 2), another at 1320 Hz (440 x 3), another at 1760 Hz (440 x 4), and it goes on. Harmonics are what make notes played by instruments interesting to our ears. Different instruments (or vocal cords for that matter) inherently create different harmonic relations to the fundamental frequency, and this is the reason there is a difference in sound from instrument to instrument, even when they play the same mathematical musical note. This is also referred to as an instrument’s “timbre”. Think of a computer created “true tone” (a pure sine wave), one with no harmonics; it’s a shrill, sterile and annoying sound. So…. you may be thinking now, well if the chords and notes that make up our music have harmonics that reach beyond our hearing range, do those sounds affect what we do hear? AHAH, hold onto that thought (!); for now, we can begin our discussion of music formats!
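To put numbers on the harmonic series described above, here is a minimal sketch. It assumes an idealized note whose harmonics are exact integer multiples of 440 Hz and uses the nominal 20k Hz hearing limit; real instruments produce harmonics of varying (usually fading) strength, which the sketch ignores. The function and variable names are just for illustration.

```python
def harmonics(fundamental_hz, count):
    """Return the first `count` harmonics (integer multiples) of a fundamental."""
    return [fundamental_hz * n for n in range(1, count + 1)]

A_NOTE_HZ = 440             # the A note from the text
HEARING_LIMIT_HZ = 20_000   # nominal upper limit of human hearing from the text

series = harmonics(A_NOTE_HZ, 50)
audible = [f for f in series if f <= HEARING_LIMIT_HZ]
ultrasonic = [f for f in series if f > HEARING_LIMIT_HZ]

print(series[:4])      # [440, 880, 1320, 1760]
print(len(audible))    # 45 harmonics sit at or below 20k Hz
print(ultrasonic[0])   # 20240, the first harmonic above the hearing limit
```

So in this idealized picture, everything from the 46th harmonic upward lands outside the range we are said to hear.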
Let’s start with the coasters. A CD, due to its sample rate, can only reproduce sounds between 20 Hz and 22.05k Hz. This is because a CD takes 44,100 digital samples of sound a second, and it takes roughly 2 samples to recreate one cycle of a frequency, hence you can only recreate up to 22,050 Hz (44.1k/2 = 22.05k). This mathematical phenomenon is known as the Nyquist Theorem. Yet as one might infer, the more samples per second taken of a sound source, the higher the fidelity of the copy. So on a CD the higher frequencies, while present, are captured with fewer samples per cycle, and so arguably with lower fidelity, than the lower frequencies. And even though the original recorded music contains sonic information above 22k Hz, that information is forever lost. But at least the information between 20 Hz and 22.05k Hz is present and not altered by a compression algorithm. However, this is also why an album on a CD may take upward of 700 MB; the information is not compressed. What does compressed mean? Let’s introduce the mp3.
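The arithmetic behind those CD numbers can be sketched in a few lines. The 44,100 samples per second comes from the text; the 16 bits per sample and 2 stereo channels are the standard CD audio values, which the text does not state, so treat them as assumptions of this sketch. The 60 minute album length is also just an example.

```python
SAMPLE_RATE_HZ = 44_100   # samples per second, from the text
BITS_PER_SAMPLE = 16      # assumption: standard CD audio, not stated in the text
CHANNELS = 2              # assumption: stereo

def nyquist_limit(sample_rate_hz):
    """Highest frequency that can be captured at a given sample rate."""
    return sample_rate_hz / 2

def bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Raw, uncompressed data rate of the audio stream."""
    return sample_rate_hz * bits_per_sample * channels // 8

def album_size_mb(minutes):
    """Uncompressed size in megabytes (1 MB = 1,000,000 bytes here)."""
    rate = bytes_per_second(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
    return rate * minutes * 60 / 1_000_000

print(nyquist_limit(SAMPLE_RATE_HZ))                                # 22050.0
print(bytes_per_second(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS))  # 176400
print(album_size_mb(60))                                            # 635.04
```

A 60 minute album already comes to roughly 635 MB under these assumptions, which is in line with the hundreds of megabytes mentioned above.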
MP3s are great because they don’t take up as much memory as a CD, making it possible to collect an amazing amount of music on one’s hard drive. But how is this done? It is done by taking out information from the original CD recording, both sonic information and mathematically redundant information. In the end, about 91% of the information will be removed. The mathematical algorithms that change WAV files (essentially the CD’s format) to mp3s rely on some tactics to do what they do. Basically, the algorithms take into account some of the principles of sound and human hearing we have been discussing, a field known as psychoacoustics.
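Where might a figure like 91% come from? The text does not say which mp3 bit rate it has in mind, so this is only a sketch: it assumes the standard CD audio rate (44,100 samples x 16 bits x 2 channels, where the 16 bits and 2 channels are not stated in the text) and compares it with a few common mp3 bit rates chosen for illustration.

```python
# Assumption: standard CD audio, 44,100 samples/s x 16 bits x 2 channels.
CD_KBPS = 44_100 * 16 * 2 / 1000   # 1411.2 kilobits per second

def fraction_removed(mp3_kbps):
    """Share of the CD data rate that an mp3 at this bit rate does not keep."""
    return 1 - mp3_kbps / CD_KBPS

for kbps in (128, 192, 320):       # common mp3 bit rates, chosen for illustration
    print(kbps, f"{fraction_removed(kbps):.1%}")
# 128 90.9%
# 192 86.4%
# 320 77.3%
```

Under these assumptions the 91% figure lines up with a 128 kbps mp3; higher bit rates keep noticeably more of the data.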
Here are some of the basics of mp3’s psychoacoustic compression techniques:
-One; they compress the sound information in a “lossy” way, which means that it can never be uncompressed back into the original file again. They are essentially taking samples of the samples from the CD version. Sonic information is simply lost forever, hence “lossy” compression.
IMPORTANT NOTE: This means when you burn a CD from an mp3, you are not getting CD quality sound. Also, the bit-rate of the mp3 greatly affects the fidelity of the music. The mp3 algorithm is essentially taking samples of information from a CD, which is itself just samples of sound. That means there are fewer samples of the sound per second, so sounds which pack lots of information into a short amount of time (known as transients) will not be reproduced as well. What kinds of sounds have these “transients”? For one, percussive ones; the attack of a snare drum is more difficult for an mp3 to replicate than an acoustic guitar. Keep this in mind.
Side note: There are plenty of lossless compression techniques available that don’t use psychoacoustic data removal techniques, such as Free Lossless Audio Codec (FLAC), Apple Lossless (ALAC), MPEG-4 ALS, Monkey’s Audio, and TTA.
-Two; it is known that the human ear only hears certain sounds, and some better than others, so all frequencies above 15.5k Hz are cut out of mp3s completely. Also, much of the very low end (20-80 Hz) is flattened, as few stereos or headphones will accurately reproduce it. They slightly boost the frequencies between 1-4k Hz as well, which the human ear is most sensitive to (because that’s where much of the human voice falls).
IMPORTANT NOTE: Sonic information from the original recording is being thrown away, and altered, before being presented to your ears and brain.
-Three; it is known that the human ear gets better at picking up which direction a sound is coming from the higher the frequency of the sound (think of an ambulance siren vs a car with loud subs, which one can you pinpoint the location of better?). So they reduce the stereo information of non-high frequency sounds.
IMPORTANT NOTE: This means there is just less stereo information. Music from an MP3 is just more “mono’ish”.
-Four; it is known that the human ear has trouble hearing certain sounds underneath other, louder sounds. Usually when one sound is 6dB louder than another, the human ear doesn’t initially pick up the quieter one; it is “masked.” Mp3 encoders notice when there are quieter sounds that you might not initially hear, and will cut out that sonic information to focus on the main sounds you can easily hear.
IMPORTANT NOTE: This means the quiet subtle sounds are being cut out, and everything must be about as loud as everything else to be heard. This will reduce the dynamic range that can be heard in an MP3, and reduce the ability to have quiet and loud sounds presented at the same time.
-Five; it is known that the human ear can have trouble perceiving sounds that are closely related in time. So while you will notice if two sounds (think of two drum taps) occur 10 milliseconds apart, you won’t notice them 2 milliseconds apart (you will think it is the same sound). Mp3s take advantage of this by scanning for similar sounds that are so close in time that the human ear won’t perceive them separately, and then removing that sonic information.
IMPORTANT NOTE: This has an implication for sounds that have pre-echo, and echo (commonly known as reverb), as well as for tightly timed choruses (as when someone sings over their own singing). This means that reverb or cymbal sounds (which have a pre-echo type sound) will be reproduced differently for an mp3 than for a CD.
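To make the “lossy” vs “lossless” distinction from point one concrete, here is a toy sketch. It is NOT how the mp3 algorithm works: the lossless half uses Python’s general purpose zlib compressor as a stand-in for something like FLAC, and the lossy half simply throws away the low 8 bits of each sample instead of applying a psychoacoustic model. The sample values are made up for illustration.

```python
import zlib

# Hypothetical 16-bit sample values standing in for a short stretch of audio.
samples = [0, 1200, -3400, 15000, -15001, 257, 32767, -32768]

def to_bytes(values):
    """Pack signed 16-bit samples into raw bytes."""
    return b"".join(v.to_bytes(2, "little", signed=True) for v in values)

def from_bytes(data):
    """Unpack raw bytes back into signed 16-bit samples."""
    return [int.from_bytes(data[i:i + 2], "little", signed=True)
            for i in range(0, len(data), 2)]

# Lossless round trip: compress, decompress, and get the exact samples back.
packed = zlib.compress(to_bytes(samples))
restored = from_bytes(zlib.decompress(packed))

# Crude lossy round trip: keep only the top 8 bits of every sample.
def lossy_encode(values):
    return [v >> 8 for v in values]

def lossy_decode(codes):
    return [c << 8 for c in codes]

approx = lossy_decode(lossy_encode(samples))

print(restored == samples)   # True: nothing was lost
print(approx == samples)     # False: the discarded detail cannot be recovered
print(samples[1], approx[1]) # 1200 1024
```

The point carries over to the real thing: once the encoder has discarded information, no decoder can bring it back, which is why burning an mp3 to a CD does not restore CD quality.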
So what is all this fuss, you ask? In the end it seems as if the mp3 is just a customized & streamlined way to listen to audio; indeed, the mp3 is an amazing invention. It is truly amazing that so much fidelity can be conserved with such an amazing reduction in memory needs, and I in no way regret the mp3! I f*&$ing love being able to bring my entire album collection in my pocket when I travel, don’t doubt it! But these psychoacoustic techniques of the mp3 take out sounds we usually cannot consciously perceive; do they affect the quality of our music?
[Images: brain activity pictures from the first study for the Baseline, LCS, HCS and FRS conditions]
Well, let’s take a look at two Japanese studies done on sounds that people cannot perceive. The first study (http://jn.physiology.org/cgi/content-nw/full/83/6/3548) examined people’s EEG, or the location and intensity of electric signals in the brain, while they were exposed to the same recording played with different frequency content. Take a look at the pictures above. Baseline = our brains without sound. LCS = our brains when just sound information above 20k Hz is played. HCS = our brains when only sound within the normal hearing range is played (like a CD). FRS = our brains when all sounds are played, the normal range plus the range we normally don’t perceive. It is when the recording is played with the frequencies we “can’t” hear included that the brain gets the most excited! To quote the study’s results, “When the conditions with audible sounds (i.e., FRS or HCS) were compared with those without audible sounds (i.e., LCS or baseline), the bilateral temporal cortex, presumably the primary and secondary auditory cortex, always showed significantly increased rCBF as expected. More importantly, when FRS was compared with HCS, deep-lying structures in the brain were significantly more activated during the presentation of FRS than during that of HCS.”
The second study, also done in Japan (http://www.jstage.jst.go.jp/article/ast/24/4/197/_pdf), took a different approach. They exposed people to music twice. Each exposure consisted of the volume being set for a first run, then on a second run the participants were allowed to set the volume level themselves to whatever they deemed a comfortable listening level. On one exposure they were listening to HCS (like a CD) music, on the other they were exposed to the FRS (same music with the additional frequencies we “can’t hear”). On average (the experiment was done many times on many people) when people were listening to the FRS they considered their comfortable listening level to be close to 1 dB higher. While this may seem like very little change to us, scientifically if we cannot hear a difference, there shouldn’t be any at all. To quote the study, “The averaged comfortable listening level of the sounds containing HFC above 22 kHz was significantly higher than that of the sounds from which HFC above 22 kHz have been removed.”
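For a feel of how big these decibel figures are, here is a small sketch converting decibels to ratios using the standard definitions (20 x log10 for amplitude, 10 x log10 for power). The 1 dB value is the study’s result and the 6 dB value is the masking rule of thumb from point four earlier; the function names are just for illustration.

```python
def db_to_amplitude_ratio(db):
    """Decibel difference expressed as a ratio of sound pressure (amplitude)."""
    return 10 ** (db / 20)

def db_to_power_ratio(db):
    """Decibel difference expressed as a ratio of power."""
    return 10 ** (db / 10)

print(round(db_to_amplitude_ratio(1), 3))  # 1.122, about 12% more amplitude
print(round(db_to_power_ratio(1), 3))      # 1.259, about 26% more power
print(round(db_to_amplitude_ratio(6), 3))  # 1.995, roughly double the amplitude
```

So the roughly 1 dB shift in comfortable listening level is small next to the 6 dB masking figure, but it is still a measurable change in the signal.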
In short (maybe actually in long), you don’t have to take my word for it; the presence of these frequencies changes the listening experience. I am not suggesting we all buy crazy expensive audio systems that can reproduce up to 100k Hz and use only Super-Audio CDs; just that sonic information we cannot perceive at face value still affects the end product, our music. Even though we can’t consciously perceive those frequencies alone, they seem to have at least a subconscious effect when played along with the frequencies we can hear (remember our harmonics talk..?). Is this actually surprising? Isn’t music a subconscious experience in many ways? What genres we inherently like, or what memories or thoughts music brings up, this is all subconscious! We should not confuse sub-conscious with un-important.
So now what about that mp3… Do you feel any different about taking out frequencies that actually lie within the realm we can hear, when we see that taking out frequencies we cannot hear has an effect? Yes, maybe we can do a psychoacoustic removal technique here and there without degrading the fidelity much, but do you really now think that we can take out that much sonic information (remember our 5 points from above) and not create an effect on the music and then the listening experience?
Again, I want to stress I am in support of using mp3s, as long as you know what you are dealing with. We have to be honest with ourselves about what the mp3 is. It is a flattener of music, there is little to no information in the very low and high end. It is a mono’izer of music, there is less stereo information and a reduction of the stereo image. It is less dynamic, quiet sounds or similar sounding sounds are thrown out. This means you will have to “bass boost”, you won’t hear the two different acoustic guitars as well, reverb will be reproduced very poorly, drums will sound a little more bland, and whether the lead singer whispers or screams you’ll probably hear it at the same volume.
But it’s not just that the mp3 changes how music is digitally stored, it’s that it now changes the way it is recorded. Why have a great stereo image? Why have complicated delicate sounds? Why have dynamics, where a song may be quiet at one point and louder later? Why use delicate or complex reverbs? If the public is just going to take in the music on an mp3, then why get complicatedly creative, when it’s going to sound simple? HERE is the crux: when the public eats mp3s and doesn’t even realize what they are taking in, then we are feeding them burgers and they think they are getting steak! Again, I love burgers; you can get em anywhere, you can eat them in your car or on a plane, they are cheap and plentiful. But when I am home and have time and I really want a good meal, I want to eat me some steak! Imagine not even knowing that steak was an option… this is what I fear our listening culture is moving towards; having the simpler option become the only option.
One more thing… wax. Vinyl records, just to let you know, can reproduce frequencies well above 20k Hz. Depending on the quality of the system they can reach 50k Hz, and some say even higher. And because vinyl is a literal copy of a sound wave and is not comprised of “samples” it reproduces frequencies with equal quality. This means the high frequency information is replicated with as much fidelity as the low frequency information. And personally, with a decent audio system, there is nothing like listening to sound from a needle vibrating in a wax groove. Music is comprised of complex vibrations and is “analog” by nature, and I think the experience of music is superior when it is created from analog vibrations!
[But that isn’t the main point. Vinyls have to be taken care of, you have to flip em and you need a decent system to appreciate them. You can’t play them in your car, and they need physical storage space. There are certain drawbacks and it is not for everyone. But vinyl LPs do make you care about the quality of your sound, something that seems to be lost on many people these days. If you haven’t tried it, give it a whirl.]
So what have we learned from all this bickering about audio format fidelity? That no matter what format you use to record the sonic information of a guitar playing, they all pale in comparison to the amount of sonic information you get by being in the guitar’s presence; i.e. no format is of higher quality than live music! Also, that mp3s are compressed and derived from CDs and the two are not equals, yet that’s okay (but know that when you decide to purchase one or the other). What isn’t okay is when we eat hamburgers and don’t know steak exists.
If you love music, then learn about how you are listening to it, and you will love it all the more!
Written by Sean Poynton Brna
In case you want to check some of the stats, here is where I got some of my information, although this was also written with information I have gained from my schooling and career.
- http://www.mp3-converter.com/mp3codec/breakitdown.htm
- http://en.wikipedia.org/wiki/Audio_compression_(data)
- http://jn.physiology.org/cgi/content-nw/full/83/6/3548
- http://www.jstage.jst.go.jp/article/ast/24/4/197/_pdf
- http://en.wikipedia.org/wiki/Psychoacoustics
- http://en.wikipedia.org/wiki/MP3