Sanders Sound Systems - Audio Equipment Testing White Paper

Wouldn't you like to know for sure if that new, ten thousand dollar amplifier you want to buy is really better than your old one? Do different brands of tubes sound different from one another? Do multi-thousand dollar interconnects really sound better than ordinary ones? Do high power solid state amps really sound bad when playing quietly? Does negative feedback make an amp sound worse than one without feedback? Does the class of amplifier operation affect the sound? Do MOSFET amplifiers sound different from those using bipolar transistors? Do cables really sound better when connected in a particular direction?

These are just some of the questions audiophiles want answered. These need to be answered with certainty before an audiophile spends thousands of dollars on expensive audio equipment.

But there is something very strange about high end audio. Although sound reproduction is a highly scientific engineering exercise, most audiophiles base their purchase decisions almost totally on subjective listening tests, anecdotal information, and testimonials from self-proclaimed "experts" rather than on engineering measurements. Therefore, it is hard to know for a fact which components really have high sound quality.

Subjective listening tests can be useful and accurate. But if not done well, their results can be confusing, misleading, and invalid. Worse yet, poor testing makes it possible for unsuspecting music lovers to be deceived and fail to get the performance they are seeking.

Unfortunately, there are many unscrupulous manufacturers and dealers who take advantage of this situation by making false claims based on "voodoo science" to sell inferior, grossly overpriced, or even worthless products to uninformed audiophiles. I find it amazing that this state of affairs exists for such expensive products.

Audiophiles need to know -- and deserve to know -- the truth about the performance of the audio components they are considering. Only then can they make intelligent and informed decisions.

This requires accurate test information, which generally is not available. This information can be obtained by objective measurements by instruments and by valid listening tests. Unfortunately, unscrupulous businessmen in the audio industry have managed to convince audiophiles that measurements cannot be trusted. So most audiophiles use listening tests to compare two similar items to evaluate which sounds better.

But most listening tests produce conflicting and unreliable results as proven by all the controversy and conflicting opinions about the merits of various components. After all, quality testing will clearly and unquestionably reveal the superior product, so there should be no confusion or disputes about it.

For example, the ability of a digital camera to produce detailed images is fundamentally linked to the number of megapixels in its CCD. So the specifications and measurements of the number of pixels are accepted as an important factor when choosing a camera. Therefore, you don't find videophiles arguing over this.

The same is true of the technical aspects of audio equipment like frequency response, noise, and distortion. But audiophiles have been told that such measurements are not to be trusted.

But the results of most audiophile subjective testing are variable and uncertain. So different listeners come to different conclusions about what they hear. As a result, there is very little agreement about the quality of the performance of audio equipment.

Why is this so? We all hear in a similar way, so what is going on with subjective listening tests that is so confusing?

The purpose of this paper is to investigate testing and answer that question. Actually, the answer is simple, but requires great elaboration of the details to explain the problem and what must be done to correct it.

So let's eliminate the suspense and immediately answer the question of why the typical audiophile listening test produces vague and conflicting results. The answer is that most listening tests have multiple, uncontrolled variables in them. Therefore, there is no way to know what is causing the differences in sound that are heard. Allow me to explain this in detail.

This issue of controlling the variables in a test lies at the heart of all testing. Audiophiles need to understand this and control the variables so that they can do accurate listening tests that produce reliable results.

What is a "variable"? A variable is any factor that can affect the result of a test.

An "uncontrolled" variable is the one variable in a test that is allowed to vary because we are trying to evaluate its effect. It is absolutely essential that any and all other variables in a test be "controlled" so that they do not influence the results.

If there is more than one uncontrolled variable in a test, then you will not be able to determine which variable caused what you heard.   Therefore, having multiple uncontrolled variables makes it impossible to draw any cause/effect relationships and conclusions from the test. Since our listening tests are trying to find cause/effect relationships, a test done with multiple, uncontrolled variables can't answer the question, so it is useless and invalid.

Let me be very clear about this by giving an example of how a typical audiophile listening test is performed and then analyze it for uncontrolled variables. Let's assume that an audiophile is considering buying a new amplifier that costs $10,000 and wants to know if the new amplifier is really better than his current one and worth the large price that is being asked for it. His testing will go something like this:

He may listen to his old amp briefly before listening to the new one, or he may not even bother and just assume he can remember the sound of it from long experience. He will then turn off his system, unplug the cables from his old amp, put the new amp in place, hook up all the cables, turn everything back on, then listen to the new amp for a while. He will then make a judgment regarding which amp sounded better.

He will usually go one step further and draw some sort of cause/effect relationship as to the CAUSE of why one amp sounded better. Typical examples of such cause/effect relationships might be that one amp had feedback while the other didn't, one was Class A while the other was Class D, one had tubes while the other had transistors, one had the latest boutique capacitors or resistors, while the other one didn't, etc. For the remainder of this article, I will refer to this type of test as "open loop" testing.

Now what would happen if I were to intervene in the above test and change the loudspeakers at the same time that the audiophile switched amplifiers? I think we would all agree that changing the loudspeakers would add another variable and that this would make it impossible to determine the cause of the difference in sound that would be heard.

We simply would have no way of knowing if the different sound that we heard was caused by the speakers or the amplifier (or both) because there are two uncontrolled variables in the test. Therefore, the test would be invalid and could not be used to determine which amplifier sounds better.

Now this is not new information. All rational audiophiles understand this concept of only having one uncontrolled variable in a test. They readily agree that you can only test one thing at a time. They make a sincere attempt to follow this process by testing only one component at a time in their listening tests.

But they unknowingly break the "one variable" rule in their listening tests. Let's look carefully at the amplifier test previously described and analyze it for uncontrolled variables.

When asked, the audiophile will honestly claim that his test only had one uncontrolled variable, which would be the amplifier. But he would be mistaken.

His test actually had five uncontrolled variables. Any one of them, or several of them together, could have caused the differences in sound he heard. He needs to control all the variables except for the amplifier under test. So what are the other uncontrolled variables?

1) LEVEL DIFFERENCES.

If one amplifier played louder than the other, then it will sound better. Louder music sounds better to us. That is why we like to listen to our music loudly.

The gain and power of amplifiers vary. Therefore, for a specific volume control setting on the preamp used in the test, different amplifiers will play at slightly different loudness levels.

But the audiophile in the example above probably didn't even attempt to set the preamp level at exactly the same level for both amplifiers. He probably just turned up the level to where it sounded good to him. He made no attempt to match the levels at all because he was unaware that this was an uncontrolled variable.

In any case, the amps probably would have had different loudness levels even if the preamp setting was identical. This is because amplifiers have different gain and power levels.

Note that human hearing is extremely sensitive to loudness. Scientific tests show that we can hear and accurately detect very tiny differences in loudness (1/4 dB is possible). At the same time, we don't recognize obvious differences in the level of music until there are a couple of dB of difference. This is due to the transient and dynamic nature of music, which makes subtle level differences hard to recognize.

Therefore when music is just a little louder, we hear it as "better" rather than as "louder." It is essential that you understand that two identical components will sound different if one simply plays a little louder than the other. The louder one will sound better to us even if the two actually sound identical.

This is a serious problem in listening tests. Consider the amplifier test above and for purposes of this discussion, let's assume that both amplifiers sound exactly the same, but that the new one will play a bit louder because it has slightly more gain. This means that the new amp will sound better than the old one in an open loop test even though the two actually sound identical.

The audiophile will then draw the conclusion that the new amp is better and will spend $10,000 to buy it. But in fact, the new amp didn't really sound any better and it was the difference in loudness that caused the listener to perceive that it was better.

So the audiophile would have drawn a false conclusion about the new amp sounding better. This erroneous conclusion cost him $10,000. I think you can see from this example that you absolutely, positively must not have more than one uncontrolled variable in your tests.

2) TIME DELAY.

Humans can only remember SUBTLE differences in sound for about two seconds. Oh sure, you can tell the difference between your mother's and your father's voices after many years. But those differences aren't subtle.

Most audiophiles are seeking differences like "air", "clarity", "imaging", "dynamics", etc. that are elusive and rather hard to hear and define. They are not obvious. We cannot remember them for more than a few seconds. To be able to really hear subtle differences accurately and reliably requires that you be able to switch between the amplifiers immediately.

Equally important is that you should make many comparisons between the components as this will greatly improve the reliability of your testing. This is particularly important when dealing with music as different types of music have a big influence on the sensitivity of what you can hear during your testing. You really need to test with many types of music using many comparisons.

Open loop testing only provides a single comparison, which is separated by a relatively long delay while components are changed. This makes it very difficult to determine with certainty if subtle differences in sound are present.

3) PSYCHOLOGICAL BIAS.

Humans harbor biases. These prejudices influence what we hear. In other words, if you EXPECT one component to sound better than another -- it will.

It doesn't matter what causes your bias. The audiophile in the previous test had a bias towards the new amp, which is why he brought it home for testing. He expected it to sound better than his old amp, so it did. It was especially easy for his bias to influence him due to the time delay involved as he changed cables.

That bias may have been because he expects tubes to sound better (or worse) than transistors, or that the new amp had (or didn't have) feedback, or it was more expensive than his old amp, or that it looked better, or that he read a great review on it, or that it had a particular class of operation, etc. Bias is bias regardless of the cause and it will affect the performance that an audiophile perceives. It must be eliminated from the test.

Don't think you are immune from the effects of bias. Even if you try hard to be fair and open-minded in a test, you simply can't will your biases away. You are human. You have biases. Accept it.

4) CLIPPING.

Clipping is when an amplifier is being driven beyond its power and voltage abilities. This produces massive amounts of distortion, compression of the dynamic range, loss of clarity and detail, a sense of strain, harshness, and generally bad performance.

It doesn't matter what good features an amplifier has -- if it is clipping, it is performing horribly and any potentially subtle improvements in sound due to a particular feature will be totally swamped by the massive distortion and general misbehavior of an amplifier when clipping. Therefore no test is valid if either amplifier is clipping.

If one amplifier in the above test was clipping, while the other wasn't, then of course the two will sound different from each other. The amp that is clipping will sound worse than the one that isn't. But you must not compare a clipping amp (that is grossly misbehaving) to one that isn't clipping (and is performing well). That is not a valid test at all and doesn't tell you how an amp sounds when it is performing properly and within its design parameters.

Most audiophiles simply don't recognize when their amps are clipping. This is because the clipping usually only occurs on musical peaks where it is very transient, and does not occur at the average power level. Transient clipping is not recognized as clipping by most listeners because the average levels are relatively much longer than the peaks. Since the average levels aren't obviously distorted, the listeners think the amp is performing within its design parameters -- even when it is not.

Peak clipping really messes up the performance of the amplifier as its power supply voltages and circuits take several milliseconds to recover from clipping. During that time, the amp is operating far outside its design parameters, has massive distortion, and it will not sound good, even though it doesn't sound grossly distorted to the listener.

Instead of distortion, the listener will describe an amp that is clipping peaks as sounding "dull" (due to compressed dynamics), "muddy" (due to high transient distortion and compressed dynamics), "congested", "harsh", "strained", etc. In other words, the listener will recognize that the amp doesn't sound good, but he won't recognize the cause as simple amplifier clipping. Instead, he will likely assume that the differences in sound he hears are due to some minor feature like feedback, capacitors, type of tubes, bias level, class of operation, etc. rather than simply lack of power.

But his opinion would be just that -- an assumption that is totally unsupported and unproven by any evidence. Most likely his guess would not be the actual cause of the problem.

Because different audiophiles will make different assumptions about the causes of the differences they hear, it is easy to see why there is so much confusion and inaccuracy about the performance of components when open loop testing is used.

It is easy to show that most speaker systems require about 500 watts to play musical peaks cleanly. Most audiophiles use amps with far less power. Therefore audiophiles are comparing clipping amps most of the time. This variable must be eliminated if you want to compare amplifiers operating as their designers intended.
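The paper does not show its working for the 500 watt figure. As a rough illustration only, the sketch below uses the standard free-field relationship between speaker sensitivity, listening distance, and amplifier power. The sensitivity (87 dB at 1 W / 1 m), the listening distance (3 m), and the peak level (105 dB) are hypothetical values chosen for the example, not measurements from the text, and the estimate ignores room reflections and the contribution of a second speaker.

```python
import math

def required_power_watts(peak_spl_db, sensitivity_db, distance_m):
    """Estimate the amplifier power needed to reach a peak SPL at the seat.

    Free-field assumption: level drops by 20*log10(distance) dB relative
    to the 1 m sensitivity rating, and every extra 10 dB costs ten times
    the power.
    """
    needed_db = peak_spl_db - sensitivity_db + 20.0 * math.log10(distance_m)
    return 10.0 ** (needed_db / 10.0)

# Hypothetical example values (assumptions, not taken from the paper).
watts = required_power_watts(peak_spl_db=105.0, sensitivity_db=87.0, distance_m=3.0)
print(f"Estimated peak power: {watts:.0f} W")
```

With these assumed numbers the estimate lands in the hundreds of watts, which is the order of magnitude the text refers to. Different speakers, rooms, and listening levels will give different answers.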

5) The last uncontrolled variable is THE AMPLIFIER

This is the one variable that we want to test. So we do not need to control it.

The above information should make it clear why open loop testing is fraught with error and confusion. It is easy to see why we can easily be tricked by open loop testing, particularly when there is a significant time delay which will allow bias to strongly influence what we hear and make it difficult to recognize level differences. All these uncontrolled variables simply make it impossible to draw valid conclusions from open loop testing, even though we may be doing our best and being totally sincere in our attempt to determine how the two components sound.

But it doesn't have to be that way. It is possible to control the variables so that subjective listening test results are accurate and useful. Here's how:

1) LEVEL DIFFERENCES

are easily eliminated by matching the levels before starting the listening test. This is done by feeding a continuous sound (anything from a sine wave to white or pink noise) into the amps and measuring their output using an ordinary AC volt meter. The input level to the louder of the two amps will need to be attenuated until the levels of the amps are matched as closely as possible (must be matched to within 1/10 dB).
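The voltmeter readings can be turned into a level difference in decibels with the standard voltage ratio formula, 20 * log10(V1 / V2). The following is a minimal sketch of that arithmetic. The function names and the example readings are hypothetical, and the 0.1 dB tolerance is the one stated above.

```python
import math

def level_difference_db(volts_a, volts_b):
    """Level of amp A relative to amp B in dB, from AC voltmeter readings."""
    return 20.0 * math.log10(volts_a / volts_b)

def is_matched(volts_a, volts_b, tolerance_db=0.1):
    """True if the two readings are within the tolerance (0.1 dB in the text)."""
    return abs(level_difference_db(volts_a, volts_b)) <= tolerance_db

# Hypothetical readings taken with the same steady test tone on both amps.
reading_a, reading_b = 2.83, 2.60
diff = level_difference_db(reading_a, reading_b)
print(f"Amp A is {diff:+.2f} dB relative to amp B")
if is_matched(reading_a, reading_b):
    print("Levels are matched")
else:
    print("Attenuate the louder amp and measure again")
```

Note how small the voltage difference is at the tolerance limit: 0.1 dB corresponds to roughly a 1% difference between the two readings, so the meter needs to resolve at least that.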

Need a signal generator to produce a steady test tone? You can buy a dedicated one for around $50 on eBay. Or you can download one for free as software for your laptop computer at this link: http://www.dr-jordan-design.de/signalgen.htm

2) TIME DELAY

must be eliminated by using a switch to compare the amps instantly and repeatedly. This is done by placing a double pole, double throw switch or relay in a box that will switch the amplifier outputs. If you want to compare preamps, CD players, DACs, or other line-level components, then you need a suitable switch and connectors for them.

Attenuators should be placed on the box so you can adjust levels on amplifiers and line level components. Of course, the box will have both input and output connectors for the amplifiers and other types of components so that they can simply be plugged into the box for testing.

Need a test box? You can make one or borrow mine. You can reach me by phone at 303 838 8130.

3) PSYCHOLOGICAL BIAS

must be eliminated by doing the test blind. Listeners must not know which component they are hearing during the test. This will force them to make judgments based solely on the sound they hear and prevent their biases from influencing the results.

Scientists are so concerned about biases that they do double-blind testing. This should be done during audio tests too. Double blind audio testing means that the person plugging in the equipment (who will know which component is connected to which side of the test box) must not be involved in the listening tests. If he is present during the test, he may give clues to the listeners either deliberately or accidentally about which component is playing.

There is one more thing that must be done to assure that bias is eliminated. There must be an "X" condition during the tests. By this I mean that you can't simply do an "A-B" test where you switch back and forth between components. A straight "A-B" test makes it possible for listeners to claim they hear a difference each time the switch is thrown, even if there are no differences.

So you need to do an "ABX" test, in which throwing the switch sometimes continues to test the same component rather than switching to the other. Of course, if the component is not switched out, the sound will not change, so this will force listeners to be accurate and only indicate differences in sound when they are indeed present.

This is not a trick. It is done to assure accuracy and meaningful results. Listeners should be told prior to the test that this will be an ABX test so sometimes there will be no difference and they must be careful and be sure they really hear a difference.
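The switching schedule described above, where some throws of the switch change the component and others do not, can be prepared in advance as a master sheet. The following is a minimal sketch of one way to generate it. The function name, the 50/50 split between "switch" and "stay" trials, and the use of a random seed are assumptions made for illustration, not part of the procedure in the text.

```python
import random

def make_master_sheet(num_trials, seed=None):
    """Build a random switching schedule.

    Each entry is (component_before, component_after, expected_answer).
    On a random subset of trials the switch keeps the same component
    (the "X" condition), so the correct answer is "same".
    """
    rng = random.Random(seed)
    current = rng.choice("AB")
    sheet = []
    for _ in range(num_trials):
        stay = rng.random() < 0.5  # assumed 50/50 split, not from the text
        following = current if stay else ("B" if current == "A" else "A")
        sheet.append((current, following, "same" if stay else "different"))
        current = following
    return sheet

for number, (before, after, expected) in enumerate(make_master_sheet(10, seed=1), start=1):
    print(f"Trial {number:2d}: {before} -> {after}   expected answer: {expected}")
```

Only the person operating the switch should see this sheet. For a double-blind test, that person stays out of the listening room.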

4) CLIPPING

can be eliminated by connecting an oscilloscope to the amps and monitoring it during the test. A 'scope is very fast and can accurately follow the musical peaks -- something your volt meter cannot do. Clipping is easy to see as the trace will appear to run into a brick wall. If clipping is seen during initial testing, the listening level must be turned down until it no longer occurs. Only when no clipping is present can you proceed with the test.
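If the output is captured digitally (for example with a software 'scope), the "brick wall" can also be looked for in the samples: clipping shows up as several consecutive samples stuck at the same ceiling. The sketch below illustrates the idea on simulated sine waves. The function name, the 1% tolerance, the minimum run length of 3 samples, and the assumption that the clipping ceiling is already known are all hypothetical choices for this example.

```python
import math

def count_clipped_runs(samples, ceiling, min_run=3, tolerance=0.01):
    """Count runs of at least `min_run` consecutive samples stuck at the ceiling."""
    threshold = ceiling * (1.0 - tolerance)
    runs = 0
    run_length = 0
    for value in samples:
        if abs(value) >= threshold:
            run_length += 1
        else:
            if run_length >= min_run:
                runs += 1
            run_length = 0
    if run_length >= min_run:  # a run that reaches the end of the capture
        runs += 1
    return runs

def sine(amplitude, cycles=10, samples_per_cycle=48):
    """Simulated test tone, standing in for a captured amplifier output."""
    return [amplitude * math.sin(2.0 * math.pi * n / samples_per_cycle)
            for n in range(cycles * samples_per_cycle)]

CEILING = 1.0  # hypothetical level at which the amplifier flattens out
clean = sine(0.8)
overdriven = [max(-CEILING, min(CEILING, s)) for s in sine(1.5)]

print("Flat-topped runs in clean signal:     ", count_clipped_runs(clean, CEILING))
print("Flat-topped runs in overdriven signal:", count_clipped_runs(overdriven, CEILING))
```

On real music the peaks are brief and irregular, so a check like this would need to run over the whole listening session, just as the 'scope is watched throughout the test.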

Need an oscilloscope? You can easily find a good used one on eBay for around $100. Alternatively, there is software you can use to turn your computer into a 'scope.

The variables above apply to amplifiers. There usually are different variables involved for different components. You have to use some thoughtful logic to determine what variables are present and design your test to control them.

For example, preamplifiers and CD players don't clip in normal operation. So you don't need to bother with a 'scope. Cables, interconnects, power conditioners, and power cords don't have any gain.   So you don't need to do any level matching. Just use a switch and do the test blind.

Also, consider your comparison references. In the case of an amplifier, you can only compare it to another amplifier because power and gain are required. But when testing a preamp, you don't have to compare it to another preamp. You can compare it to the most perfect reference possible -- a straight, short, piece of wire.

This usually takes the form of a short interconnect, or you can go one better and use a very short piece of wire soldered across the terminals of the test box switch. You need then only set the preamp to unity gain to match the wire and do your testing blind.

There are many variations of the ABX test. A rigorous, scientifically valid ABX test will be done with a panel of listeners to eliminate any hearing faults that might be present with a single listener, and it will always be done double-blind.

But you can cut corners a little bit and still have a valid test. For example, you can do the test single-blind with one listener. What this means of course, is that you will do the listening test by yourself.

But you must "blind" yourself. The best way to do this is to have someone else connect the cables to the equipment so you don't know which one is "A" and which one is "B." You can then set levels and proceed to listening tests.

When doing ABX testing with others, it is important to give them a little training. Tell them that they will only be asked if they can hear any DIFFERENCE between components. Obviously, if one component sounds better (or worse) than the other, it must also sound different.

You need not be concerned about making judgments on subjective quality factors initially. Just ask the listeners if they hear any differences of any type and if any exist, you can test that separately later.

Because a good test will involve many comparisons, it is helpful to use a score sheet for listening groups. The sheet has a check box for "different" and "same" that they check after each comparison. You can then use a master sheet that shows where differences are possible (A-B test) and where they are not (A-X or B-X). It is then easy to score their sheets quickly after the test. I find that listeners are very accurate and that there is usually complete agreement on what is heard.
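Scoring a listener's sheet against the master sheet is a simple comparison, trial by trial. The following sketch shows one way to do it. The data layout (a list of "same" / "different" answers per listener) and the example answers are hypothetical.

```python
def score_sheet(master, answers):
    """Return (number_correct, total) for one listener's sheet."""
    if len(master) != len(answers):
        raise ValueError("sheet length does not match the master sheet")
    correct = sum(1 for expected, given in zip(master, answers) if expected == given)
    return correct, len(master)

# Hypothetical master sheet and listener answers for six comparisons.
master = ["different", "same", "different", "different", "same", "same"]
listeners = {
    "listener 1": ["different", "same", "different", "different", "same", "same"],
    "listener 2": ["different", "different", "different", "same", "same", "same"],
}

for name, answers in listeners.items():
    correct, total = score_sheet(master, answers)
    print(f"{name}: {correct}/{total} correct")
```

A listener who marks "different" on trials where the component was not switched is reporting a difference that cannot be there, which is exactly what the "X" condition is designed to reveal.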

When testing by yourself, you can use a score sheet and you can only do A-B testing. This type of testing isn't well controlled (although it is vastly better than open loop testing), but you can usually get a good idea of what the components sound like. If you need really reliable results, you should back up your personal testing with others using a full ABX test to be certain of the results.

When training a new group of listeners, I deliberately make a small error in setup (usually a level difference of 1 dB on one channel) and have them start listening. The ABX test is extremely revealing and much more sensitive than open loop testing. So even with such a tiny difference, unskilled listeners quickly become very good at detecting it. Once the listeners are confident in how the test operates, we move on to the actual testing.

During testing, you may use any source equipment and source material you and the listeners like. I let listeners take turns doing the switching. I encourage them to listen for as long as they wish and switch whenever and as often as they like while listening to any music they wish. They can go back and listen to the same section of music over and over if they wish.

There are no tricks involved. This is science and I want them to be sure of what they hear.

Different types of source material make a big difference in how easy it is to hear differences. Generally, it is more difficult to hear differences on highly dynamic, transient music than on slow, sustained music. For example, it is harder to hear differences on pop music than on lyrical piano music with long tones.

Actually, music isn't even the best material for hearing some types of differences. Steady-state signals such as white noise, pink noise, and MLS test signals are far more revealing of frequency response errors than music. So I usually include some noise during a part of my testing.

You don't have to use "golden ear" listeners for an ABX test. I encourage the disinterested wives of audiophiles to join in the fun. I find that they are just as good as, or better than, their audiophile husbands at hearing differences.

If you find differences, you can then explore their cause by being a bit more creative in designing the test. For example, let's say you want to know if negative feedback affects the sound. To do so, you will need to have one amplifier with feedback and one without that you can compare.

But if the amplifiers are different in other ways, such as one being solid state and the other being tubed and the two amplifiers are from different manufacturers with different circuitry, then you will have multiple uncontrolled variables so you won't be able to draw any conclusions about feedback.

Therefore you will need to use identical amplifiers that have switchable or adjustable feedback. Several tube amplifiers have this feature. You would then set one amp for maximum feedback and the other for zero feedback for your testing.

You will find that feedback has a big effect on output levels (feedback reduces the level), so even though you are using identical amps, you will still need to match levels very accurately before starting the test.
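The size of the level change can be estimated from the standard negative feedback relationship: closed-loop gain = A / (1 + A * B), where A is the gain without feedback and B is the fraction of the output fed back. The sketch below works one example. The values of A and B are hypothetical and do not describe any particular amplifier.

```python
import math

def closed_loop_gain(open_loop_gain, feedback_fraction):
    """Voltage gain with negative feedback applied: A / (1 + A * B)."""
    return open_loop_gain / (1.0 + open_loop_gain * feedback_fraction)

def level_drop_db(open_loop_gain, feedback_fraction):
    """How many dB quieter the amp plays once the feedback is switched in."""
    return 20.0 * math.log10(1.0 + open_loop_gain * feedback_fraction)

# Hypothetical amplifier: gain of 100 without feedback, 4% of output fed back.
A, B = 100.0, 0.04
print(f"Gain without feedback: {A:.0f}")
print(f"Gain with feedback:    {closed_loop_gain(A, B):.0f}")
print(f"Level drop:            {level_drop_db(A, B):.1f} dB")
```

A drop of this size is far larger than the 1/10 dB matching tolerance, which is why the levels have to be measured and matched again even when the two amplifiers are otherwise identical.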

You can even do the test with a single stereo amp where you compare one channel with feedback to the other without it -- as long as the feedback is independently adjustable for each channel. You will then do the test in monaural, but that is perfectly okay as you don't need to listen in stereo to hear differences.

It goes without saying that everything else in the signal path must be left alone during the testing process. Only the components under test can be switched.

Along this line, it is also true that it doesn't matter what flaws the other equipment in the signal chain might have. This is because the signal chain is identical for both test components so any differences that are heard can only be caused by the components under test.

For example, I have had listeners complain that the attenuators in the switch box may be changing the sound. I point out that even if true, since both components have attenuators, they would be affecting both components under test equally. So they would not be a variable and any difference in sound could only be caused by differences in the components under test.

If you only use ABX testing, you will find many components that sound different from each other. You then need to determine the cause of the differences you hear.

Sometimes it's easy to determine the cause of the difference you heard, such as when one component is much noisier than the other and you hear hiss when you switch to that component. But sometimes it's difficult, such as when there is a frequency response difference between the components. How do you determine which one is accurate?

Because ABX testing takes a lot of time and effort, I always subject the equipment to instrument tests first to assure it meets the basic quality criteria (BQC) for high fidelity sound. I find that a significant amount of equipment fails to meet BQC on instrument testing.

The BQC are:

1) Inaudible noise levels (a S/N of 86 dB or better is required)

2) Inaudible wow and flutter (less than 0.01%)

3) Linear frequency response across the audio bandwidth (20 Hz - 20 kHz, +/- 0.1 dB)

4) Harmonic distortion of less than 1%
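The four criteria can be written as a simple pass/fail check on a set of instrument measurements. The following is a minimal sketch. The function and parameter names are hypothetical, and the example measurements are made up. Only the four limits come from the list above.

```python
def check_bqc(snr_db, wow_flutter_pct, response_dev_db, thd_pct):
    """Return the list of criteria a component fails (empty if it passes).

    response_dev_db is the largest deviation from flat, in dB, measured
    anywhere between 20 Hz and 20 kHz.
    """
    failures = []
    if snr_db < 86.0:                  # 1) S/N of 86 dB or better
        failures.append("noise")
    if wow_flutter_pct >= 0.01:        # 2) wow and flutter less than 0.01%
        failures.append("wow and flutter")
    if abs(response_dev_db) > 0.1:     # 3) flat within +/- 0.1 dB
        failures.append("frequency response")
    if thd_pct >= 1.0:                 # 4) harmonic distortion less than 1%
        failures.append("harmonic distortion")
    return failures

# Hypothetical measurements for two components.
print("Component 1 fails:", check_bqc(snr_db=95.0, wow_flutter_pct=0.0,
                                      response_dev_db=0.05, thd_pct=0.02))
print("Component 2 fails:", check_bqc(snr_db=80.0, wow_flutter_pct=0.0,
                                      response_dev_db=0.5, thd_pct=0.02))
```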

If components fail the BQC, they will sound different on an ABX test.   But if this is so, why bother to go to all the trouble of doing an ABX test on them? After all, you will already know the cause of the differences you will hear because you found it using instrument testing.

Specifically, if a component has a poor S/N, you will hear hiss on an ABX test that will cause the component to sound different from one that has a good S/N and is silent. If the frequency response isn't linear, the sound will be different from one that has linear response -- and the instrument measurement will tell you which one is accurate. If high levels of distortion are present, you will hear that as lack of clarity, muddy sound, a sense of strain, poor imaging, and most of the other subjective comments audiophiles use to describe the sound they hear. If wow and flutter is high on one component, you will not need to ABX test it to know that.

The results of ABX testing usually are quite surprising to most audiophiles. They quickly discover that components that meet the BQC always sound identical to each other. Only if components fail the BQC (and many do), will they sound different.

Now I understand that many audiophiles will find that hard to believe. But don't shoot me, I'm just the messenger. If you don't believe that components that meet the BQC sound identical, then you need to do some well-controlled listening tests and prove it to yourself.

Let me stress a very important point. Many audiophiles immediately become defensive and think that what I just said is that all audio equipment sounds identical. NOTHING COULD BE FURTHER FROM THE TRUTH! Of course many components sound different from each other.

But they think I said that all components sound the same (which is untrue). They have heard differences between such components, so immediately disregard the whole idea of ABX testing. This is a tragedy.

I said that components "that meet the BQC sound identical." This is true. They do. But a great many components do not meet the BQC, so do NOT sound identical.

The point of doing controlled testing is to find out what is causing the differences in sound that are heard. I am not saying that audiophiles are deaf. I am trying to help them understand how to do testing that will show the true causes of the differences they hear between components.

Valid testing requires that you apply some basic scientific principles to the task. Science is not incompatible with the audiophile world. In fact, it is an essential and very helpful tool in finding out the facts and determining what is causing the differences we all hear between components.

So don't dismiss science. After all, it is science and engineering that provided you with the components you now enjoy. There is no magic and magicians don't design audio equipment -- engineers do.

Amplifiers are particularly surprising in ABX tests. When the test is started, obvious differences usually are heard. But it is also quickly discovered that the 'scope shows the amps to be clipping. Unless you are testing very powerful amplifiers that can deliver hundreds of watts per channel, you will find that you have to turn the level way down before peak clipping stops. At quiet levels where there is no clipping, the amps will sound identical -- assuming that they pass the BQC.

If all components that pass the BQC sound identical, then we are led to the logical conclusion that listening tests aren't needed. Why not just measure the component in question to find your answer?

INSTRUMENT TESTING is a lot easier to do than ABX listening tests. Instruments are far more sensitive than human hearing. As a result, you can learn a great deal more about an electronic component with instruments than by listening. So why aren't audiophiles measuring their equipment?

Mostly this is because of ignorance of modern testing procedures, the mistaken belief that quality test equipment is very expensive, and the fact that audiophiles are often told that measurements are not to be trusted. All this has changed with the development of the computer-based FFT (Fast Fourier Transform) spectrum analyzer.

A spectrum analyzer is an amazing tool that will evaluate the BQC quickly, easily, and in incredible detail with simply astonishing sensitivity. You can now buy one for less than $500 as computer software and I've even seen free software for them on the internet.

So just what does a spectrum analyzer do and how does it work? A spectrum analyzer will show you the "spectrum" of frequencies produced when you input a test signal into the device under test.

Conceptually, what it does is quite simple. A perfect component will show the test signal frequency only. No other frequencies will be present. If any other frequencies are present, they are distortion or noise.

The spectrum analyzer will show a graph with frequency on the horizontal axis and magnitude on the vertical axis. If you input, say, a 1 KHz sine wave, you will see a very large spike on the graph at 1 KHz. With a perfect component, you would see nothing else.

Of course, no component is perfect, so you will see many other frequencies above and below 1 KHz. These frequencies will take two forms. One will be harmonically related to the test tone and the others will not.

Those frequencies that are harmonically related are harmonic distortion. You will see frequencies as multiples of the test tone. So if you use a 1 KHz tone, you will see harmonics at 2 KHz (the second harmonic), 3 KHz (the 3rd harmonic), at 4 KHz (the 4th harmonic), and so on.
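To make the idea concrete, here is a minimal Python sketch of what an analyzer does for each harmonic. It is only an illustration under stated assumptions, not how any particular analyzer is implemented: the 48 kHz sample rate, the 0.1 second capture, and the made-up harmonic levels (2nd at -60 dB, 3rd at -80 dB) are all invented for the example. It evaluates single DFT bins directly using only the standard library, where a real analyzer would run a full FFT.

```python
import cmath
import math

SAMPLE_RATE = 48_000    # samples per second (assumed for this example)
NUM_SAMPLES = 4_800     # 0.1 s, so the 1 kHz tone completes exactly 100 cycles
FUNDAMENTAL_HZ = 1_000  # the test tone


def synth_signal():
    """Test tone plus invented distortion: 2nd harmonic at -60 dB, 3rd at -80 dB."""
    samples = []
    for n in range(NUM_SAMPLES):
        t = n / SAMPLE_RATE
        samples.append(
            math.sin(2 * math.pi * FUNDAMENTAL_HZ * t)
            + 1e-3 * math.sin(2 * math.pi * 2 * FUNDAMENTAL_HZ * t)
            + 1e-4 * math.sin(2 * math.pi * 3 * FUNDAMENTAL_HZ * t)
        )
    return samples


def amplitude_at(samples, freq_hz):
    """Amplitude of one frequency component, computed from a single DFT bin."""
    total = len(samples)
    k = round(freq_hz * total / SAMPLE_RATE)  # index of the DFT bin
    acc = sum(
        x * cmath.exp(-2j * math.pi * k * n / total)
        for n, x in enumerate(samples)
    )
    return 2 * abs(acc) / total


def harmonic_levels_db(samples, highest_order=5):
    """Level of each harmonic in dB relative to the test tone."""
    reference = amplitude_at(samples, FUNDAMENTAL_HZ)
    levels = {}
    for order in range(2, highest_order + 1):
        amp = amplitude_at(samples, order * FUNDAMENTAL_HZ)
        # Clamp to avoid log10(0) when a harmonic is absent.
        levels[order] = 20 * math.log10(max(amp, 1e-12) / reference)
    return levels


if __name__ == "__main__":
    for order, level in harmonic_levels_db(synth_signal()).items():
        print(f"harmonic {order}: {level:.1f} dB")
```

The capture length is chosen so that every harmonic falls exactly on a DFT bin. With real measurements that is rarely the case, which is why practical analyzers apply a window function before the transform.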

Those frequencies that are not harmonically related are noise. They are random, present at all frequencies, and hopefully should be at a very low level.

Some noise frequencies represent problems that need to be addressed. For example, if you see a big noise spike at 60 Hz, you will know that you have hum (if your mains is 60 Hz). If you see harmonics of 60 Hz at 120 Hz, 180 Hz, etc., then you will know that you have a ground loop.

If you see significant noise spikes at higher frequencies with a component that has digital control circuitry, you may suspect digital noise is bleeding into the analog circuits. Anytime you see significant noise spikes, something is amiss and you need to find out why and get it fixed.

Most spectrum analyzers will identify the harmonics and label them with their magnitude relative to the reference test tone, or you can interpret the level from the graph lines. Each harmonic will be defined as a certain number of dB below the reference level (the peak of the test tone).

The analyzer will combine and compute all the harmonics into THD (Total Harmonic Distortion). This will be expressed either as a number of dB below the reference tone or as a percentage of the test tone. You can choose whichever you prefer.

The two are related. For example, if the distortion is 100 dB below the reference level, it will also be 0.001% distortion.
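The relationship is the standard amplitude-ratio formula: percent = 100 x 10^(dB/20). A small Python sketch of the conversion in both directions follows; the function names are made up for this example.

```python
import math


def db_to_percent(level_db):
    """Convert a level in dB relative to the test tone into a percentage."""
    return 100 * 10 ** (level_db / 20)


def percent_to_db(percent):
    """Convert a distortion percentage into dB relative to the test tone."""
    return 20 * math.log10(percent / 100)


if __name__ == "__main__":
    print(db_to_percent(-100))  # about 0.001 (percent)
    print(percent_to_db(0.5))   # about -46 (dB)
```

Every 20 dB step is a factor of ten, so -40 dB is 1%, -60 dB is 0.1%, -80 dB is 0.01%, and -100 dB is 0.001%.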

The spectrum analyzer will also compute the THD+N (THD + Noise), which will be a little higher than the THD alone. If the THD value is very low, you can consider the THD+N to be the S/N. If the THD is high, you must subtract the THD from the THD+N to isolate the noise and get an accurate S/N.

graph1

To demonstrate, I have attached two photos of my spectrum analyzer showing the performance of one of my Magtech amplifiers.

The first one shows the distortion when there is insufficient bias to completely eliminate crossover distortion. You see the first 20 harmonics and their levels in small boxes above each harmonic (although it is hard to read the values in the pictures). The large blue box at the top of the screen shows the THD, which in this case is about a half of one percent.

graph2

The second photo shows the same amplifier with the bias adjusted to eliminate crossover distortion. You can see that all of the harmonic spikes above the 4th harmonic have disappeared into the noise floor -- they simply don't exist. The second harmonic is the highest, and even it is 99.6 dB below the reference tone. The 3rd harmonic is -102.4 dB, the 4th harmonic is -110 dB and the remainder are unmeasurable. The THD is incredibly low at around one thousandth of one percent.
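As a cross-check on those figures, the individual harmonic levels can be combined into a THD value. The sketch below assumes the usual root-sum-square definition of THD, in which the harmonic powers are added, and it assumes the unmeasurable harmonics contribute nothing. The analyzer's exact method is not stated here, so treat this as an approximation.

```python
import math


def thd_percent(harmonic_levels_db):
    """Combine harmonic levels (dB relative to the test tone) into THD in percent."""
    # Convert each level to a power ratio, add the powers, then take the
    # square root to get back to an amplitude ratio.
    power_sum = sum(10 ** (level / 10) for level in harmonic_levels_db)
    return 100 * math.sqrt(power_sum)


# 2nd, 3rd and 4th harmonic levels quoted for the properly biased amplifier.
measured_levels_db = [-99.6, -102.4, -110.0]

if __name__ == "__main__":
    print(f"THD = {thd_percent(measured_levels_db):.4f}%")  # roughly 0.0013%
```

The result, roughly 0.0013%, agrees with the figure of around one thousandth of one percent.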

The amp is being tested under the worst condition, which is at just 1 watt. This is tough because crossover distortion is a greater percentage of the total distortion at low power levels than at high power levels.

Also, the S/N will be much better at high power levels, because the noise floor is fixed and greater output greatly increases the difference between the noise and the signal. But even at this very low power level you see that the noise floor is about 118 dB below the reference signal. This is a very quiet amplifier.
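The effect of output power on the noise figure can be estimated with simple arithmetic. The sketch below assumes, as stated above, that the noise floor stays fixed while the signal grows, so the gap widens by 10 x log10 of the power ratio. The 100 watt level is a hypothetical value chosen only for illustration.

```python
import math


def noise_re_signal_db(noise_db_at_ref, power_w, ref_power_w=1.0):
    """Noise level relative to the signal at a new output power.

    Assumes the absolute noise floor does not change with output power.
    """
    return noise_db_at_ref - 10 * math.log10(power_w / ref_power_w)


if __name__ == "__main__":
    # -118 dB at 1 watt is the figure quoted above; 100 watts is hypothetical.
    print(noise_re_signal_db(-118.0, 100.0))  # about -138 dB
```

Each tenfold increase in power moves the noise floor another 10 dB further below the signal.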

In other words, the spectrum analyzer shows that this amplifier has lower distortion and noise than you will find in most preamps! That is truly spectacular performance. Who says that high power, solid state amps sound bad at low power levels? That is a myth.

Because the spectrum shows each harmonic, you can evaluate the component for its harmonic structure. For example, many audiophiles have come to believe that tubes have a greater percentage of low order (2nd, 3rd, and 4th) harmonics than solid state equipment which is widely believed to have a greater percentage of high order, odd harmonics (7th, 9th, 11th, etc.). Low order harmonics sound less objectionable than high order, odd harmonics, so the theory goes that tube equipment should sound better than solid state.

I haven't found this to always be true. Some tube equipment has lots of high order harmonics and some SS gear has more 2nd and 3rd harmonic structure. It varies between amplifiers so it is inaccurate to make a blanket statement about this.

Refer again to the spectrum of the Magtech shown above. Its greatest distortion is the 2nd harmonic and all others are lower -- just like a tube amp is supposed to behave. However, since even the 2nd harmonic is about 100 dB below a 1 watt output level, the distortion is far too quiet for a human to hear, so the point is moot.

The spectrum analyzer can plot the frequency response of a component. It can also measure wow and flutter, and any wow and flutter that is present will be visible as instability in the graph.

So you can see that a spectrum analyzer will tell you an enormous amount about your equipment. It is shockingly better than the human ear. For example, scientific studies show that humans can hear distortion down to only about 2%. That's why I say that the distortion in the BQC must be less than 1% to be inaudible.

But a spectrum analyzer will show distortion down to around one ten-thousandth of one percent, and it will show all the various harmonics and separate those from the noise. This is far more sensitive than human hearing.

For example, my amplifiers have only a few thousandths of a percent distortion when their bias is properly adjusted. I wouldn't (and simply couldn't) set the bias levels by listening for the distortion. Instead, I get it extremely precisely set by adjusting the 5th harmonic to -110 dB.

I think it is quite interesting that humans can't hear the difference between the two conditions shown in the spectrum analyzer graphs above, even though there is a great deal of difference in the measured performance of the amplifier. This is because what appears to be high distortion in the first graph is only about 1/2 of one percent, which is below the human limit for detecting distortion. Imagine what a clipping amplifier's spectrum looks like when it is producing many tens of percent distortion.

In summary, listening tests, if properly controlled, are useful. But they are far less sensitive and precise than instrument testing. They are also much more difficult to do.

I hope this discussion has convinced you that open loop listening tests are useless unless components sound grossly different. For example, you don't need to do an ABX test to detect differences between a 5 watt, SET tube amplifier and a 500 watt solid state amplifier.

But for the typical audiophile test where subtle differences are the norm, you must eliminate the multiple, uncontrolled variables that make their results meaningless. You simply must do ABX testing to obtain valid results from listening tests.
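One common way to judge the result of an ABX session is to ask how likely the score would be if the listener were simply guessing. The sketch below uses a standard one-sided binomial calculation. This scoring method is an assumption added for illustration, not a procedure specified in this paper, and the 16-trial session with 12 correct answers is a made-up example.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of scoring at least `correct` out of `trials` by pure guessing."""
    # Each trial is a 50/50 guess, so all 2**trials answer patterns are equally likely.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    print(f"{abx_p_value(12, 16):.4f}")  # about 0.038
    print(f"{abx_p_value(8, 16):.4f}")   # about 0.598
```

A small value (below 0.05 is a common convention) suggests the listener really is hearing a difference. A value near 0.5 or higher is what guessing produces.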

But a spectrum analyzer will tell all there is to know about the performance of electronics. These are cheap, easy to use, and every audiophile should have one. Considering the tens of thousands of dollars most audiophiles spend on equipment, wouldn't a few hundred dollars for a spectrum analyzer be a good investment?

Both spectrum analyzers and blind, ABX, listening tests are accurate and will reveal the true causes of any differences you may hear between components. By using these tests you can determine with accuracy the quality of the components you are using or want to buy. You will then have total freedom and confidence as you know the truth and can ignore all the controversy and confusion that is the bane of the audiophile industry.
