Psychoacoustics for Musicians
by
1. Introduction
I have given several courses at the University of Birmingham on the Physics of Music. I was asked to give a talk at a meeting of the National Flute Association in 2003, and for this, I selected some topics from my lecture courses, on the subject of Psychoacoustics. The material is in no way specific to the flute or to any other instrument; it applies to musical sound in general. I have selected particularly those topics that I believe will be useful to performers, composers and arrangers.
Psychoacoustics is the study of the relation between the way a physicist describes a sound wave and the way a musician describes the resulting sound. A physicist talks about quantities such as frequency, wavelength, velocity, waveform, amplitude and harmonic content, while a musician describes a sound as loud or soft, high or low in pitch, harsh or pleasant sounding, etc. What is the connection between these two descriptions? In section 2, I discuss how a physicist describes a sound and in section 3, I talk about the musician’s description.
Click here to join the BFS
2. The physicist's description
Sound is a wave travelling through the air. You can't see it, in part because you can't see air. If you want to picture waves that are familiar to you, think of water waves since these are easily visible. You can see them in the sea or in a bath; just put your hand in the surface of the water and move it up and down and waves travel outwards from that point. The wave looks like an undulating pattern, such as that in fig. 1, travelling along the surface of the water. The cone of a loudspeaker produces sound waves in a similar way, by moving backwards and forwards in the air.
Fig. 1. Graph of a simple sound wave, called a sine wave. It can be regarded
either as a graph of displacement of the air in the wave from its equilibrium position
against time (at a fixed place), or against position (at a fixed time). Each of these
interpretations can be seen by observing water waves.
A physicist describes these waves by several quantities:
2.1 Frequency.
This is the number of oscillations per second. When making water waves, if you move your hand up and down 5 times per second, the wave will have a frequency of 5 cycles/sec, also called 5 Hertz (abbreviated 5 Hz). Sound waves in air have rather higher frequencies than this; they cover the range from about 20 to 20,000 Hz. One can make waves in air with frequencies below or above this range, but the ear doesn't respond to waves with frequencies outside the range 20 to 20,000 Hz, so they are not called sound waves. (Waves below this range are called infrasonic, or subsonic; those above it are called ultrasonic.) This is another reason why you can’t see sound waves; anything that vibrates as quickly as this looks like just a blur.
The higher the frequency of a sound wave, the higher the pitch of the resulting sound. Middle C has a frequency of about 256 Hz. But more about all that later. As a quick illustration, sound clip 1 contains notes of 200 Hz, 500 Hz, 2000 Hz and 4200 Hz.
Track 1:
(To play the clip, click on the play button - you may need to click twice, once to activate the control and then again to play - it will take a few moments to start playing the sound. If the plug-in fails to start, there are links to all the files at the bottom of the article.)
2.2 Wavelength.
In a wave, such as that in fig. 1, the distance between 2 peaks is called the wavelength. For water waves, this can be typically from a few cm to a few metres, though almost any value is possible. Sound waves have similar wavelengths - they vary from about 2 cm to 20 metres. Middle C has a wavelength of about 1.3 m.
2.3 Speed (or velocity).
Water waves travel rather slowly, just a few cm/sec. Sound waves in air travel much faster. The speed is about 330 - 340 m/sec, depending mainly on the temperature of the air. It's important to realise that the water in a water wave, or the air in a sound wave, is not moving along with this speed. The shape of the wave moves, but if you float a cork in the water, it doesn’t move along; it just bobs up and down as the wave passes that point. Only the shape travels along.
A simple bit of algebra shows that the three quantities introduced so far are related:
Frequency = speed / wavelength,
or, if you prefer,
Wavelength = speed / frequency,
or, if you prefer,
Speed = frequency x wavelength.
This has an important consequence. A wind instrument produces a note of a particular wavelength (in many cases, the lowest note has a wavelength twice the length of the instrument), whereas the pitch depends on the frequency. But the relation between the frequency and wavelength depends on the speed, which gets higher as the temperature increases. That's why wind instruments go sharper as the air in them warms up.
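The relation above, and its consequence for tuning, can be checked with a few lines of Python. This is only a rough sketch: the formula used for the speed of sound is a common linear approximation for dry air, not something taken from this article, the function names are invented for illustration, and a real instrument is not at one uniform temperature. The "cent" used here is one hundredth of an equal-tempered semitone.

```python
import math

def speed_of_sound(temp_c):
    """Approximate speed of sound in dry air, in m/s.

    Assumption: the usual linear fit 331.3 + 0.606 * T (T in degrees C).
    """
    return 331.3 + 0.606 * temp_c

def wavelength(frequency_hz, speed):
    """Wavelength = speed / frequency."""
    return speed / frequency_hz

def frequency(wavelength_m, speed):
    """Frequency = speed / wavelength."""
    return speed / wavelength_m

def cents(frequency_ratio):
    """Size of a frequency ratio in cents (1200 cents = one octave)."""
    return 1200 * math.log2(frequency_ratio)

# The instrument fixes the wavelength; the air temperature fixes the speed.
cold, warm = speed_of_sound(15), speed_of_sound(25)
fixed_wavelength = wavelength(256, cold)           # tuned to 256 Hz at 15 C
warm_frequency = frequency(fixed_wavelength, warm)  # same tube, warmer air

print(f"wavelength of 256 Hz at 15 C: {fixed_wavelength:.2f} m")
print(f"same wavelength at 25 C: {warm_frequency:.1f} Hz, "
      f"{cents(warm_frequency / 256):+.0f} cents")
```

Under these assumptions, a 10-degree rise sharpens the note by roughly 30 cents, which is about a third of a semitone and easily audible.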
2.4 Waveform.
So far, I have assumed that the shape of the wave is the rather smooth curve shown in fig. 1. But it doesn't have to be. It might have one of the less smooth shapes shown in fig. 2, or any other shape. All of the shapes shown in fig. 2 are “repetitive”, i.e. the same pattern repeats over and over. (This must be the case, otherwise we couldn’t define the frequency or wavelength.) The first shape in fig. 2 (which is the same as the shape in fig. 1) is called a “sine wave”. All of the waves as drawn have the same wavelength and, therefore, the same frequency and the same pitch (approximately; see section 3.2 below), but they sound different.

Fig. 2. Waveforms of various shapes. (a) A sine wave, as in fig. 1. (b) A
sine wave plus its second harmonic. (c) A square wave. (d) A sawtooth
wave. (e) A sound wave from an oboe.
Physicists tell us that the waveforms in figs. 2(b)-(e) are actually combinations of the sine wave (fig. 2(a)) and higher-frequency sine waves, called harmonics. These harmonics have frequencies twice, 3 times, 4 times, etc. that of the original sine wave, which is called the fundamental. When you think about it, this is a rather surprising statement; if you look at the sine and square waves in fig. 2(a) and (c), you don't see any higher frequencies in fig. 2(c). If you listen to these sounds, you don't hear any higher frequencies in the square wave - you hear the same pitch but a different tone colour. So what do physicists mean by these high harmonics (also called Fourier components)? To a physicist, it's simply arithmetic. Fig. 3 shows how, by adding together several sine waves, we can produce something close to a square wave. But we shall see that these harmonics are not just an abstract physical concept. They are very real, and I will give several demonstrations of their importance. So please believe it for now; I'll say more about it in section 3.3.1 later.
Fig. 3. This shows how a square wave results if a sine wave and several
harmonics are added. The right-hand plot shows the sum of the sine
waves in the left-hand column for 3 cases: (a) just the fundamental
note (so the sum is just a sine wave), (b) the fundamental and third
harmonic, (c) the fundamental and harmonics up to the 9th. The square
wave happens to contain only odd harmonics. The more harmonics that are
included, the closer the result approximates to a square wave.
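The addition shown in fig. 3 can be tried numerically. The following is a minimal Python sketch, assuming the standard Fourier series for a square wave, in which the nth odd harmonic has amplitude 1/n (the amplitudes actually used to draw fig. 3 are not stated). The function name square_partial_sum is invented for illustration.

```python
import math

def square_partial_sum(phase, highest_harmonic):
    """Add the odd harmonics 1, 3, ..., highest_harmonic of a square wave.

    phase is the position within one cycle (0.0 to 1.0). The factor
    4/pi scales the full series so that it tends to +1 or -1.
    """
    total = 0.0
    for n in range(1, highest_harmonic + 1, 2):
        total += math.sin(2 * math.pi * n * phase) / n
    return 4 / math.pi * total

# The three cases of fig. 3: fundamental only, up to the 3rd, up to the 9th.
for top in (1, 3, 9):
    samples = [square_partial_sum(k / 16, top) for k in range(1, 8)]
    print(f"up to harmonic {top}:", " ".join(f"{s:5.2f}" for s in samples))
```

As more harmonics are included, the printed values across the first half of the cycle flatten out towards 1, which is the flat top of the square wave.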
With this description, then, you can describe the waveform of a sound either by the shape of the wave (e.g. by pictures like those in fig. 2) or by the amplitudes of all the harmonics present in the sound. As a quick illustration, sound clip 2 demonstrates a sine wave and a square wave, each with a frequency of 500 Hz. The sine wave is in some ways a “simple” sound, perhaps a rather uninteresting one, but sounding somehow rather “basic”. It’s difficult to define these qualitative descriptions, but you probably know what I mean.
Track 2:
In fig. 2, I included an actual musical instrument, the oboe. I chose this because the double-reed instruments have a very complicated waveform: the shape of fig. 2(e) looks very different from the sine wave of fig. 2(a). Equivalently, we could say that the oboe and bassoon sounds contain a lot of harmonics. The shape of the waveform and the harmonic content are alternative descriptions of the same thing.
To end this section, here is a summary of the differences between sound waves and the water waves that I used as an example of wave motion:
- Sound waves are in air, not water.
- Sound waves travel at about 330 m/sec, much faster than water waves.
- A sound wave is a ripple in the pressure in the air, whereas a water wave is a ripple in the surface of the water.
- For real physicists – sound waves are longitudinal, water waves are transverse.
3. The musician's description
To the musician, the properties of a sound are summarised by some of the following quantities.
3.1 Loudness.
The loudness of a sound depends on the amplitude of the wave, i.e. on the distance that the air moves backwards and forwards in the wave. This is quite small, often a tiny fraction of a millimetre. But the loudness also depends a lot on the pitch of the sound, because the sensitivity of the ear varies drastically with frequency. The ear is most sensitive at a frequency of about 3000 Hz. This is a fairly high note, roughly an octave above the treble stave. So for most of the range of pitches of musical sounds, the ear's sensitivity is increasing as the pitch rises. The sensitivity of the ear is plotted, as a function of frequency, in fig. 4.

Fig. 4. The sensitivity of the human ear. The curve plots the amplitude of the
least sound that the ear can hear, so the ear is most sensitive at the lowest point of this
curve, i.e. about 3000 Hz.
This effect is quite strong. For example, the ear is about 100 times more sensitive at C above the treble stave as it is at C in the bass stave. This is at least partly why a bass flute sounds much quieter than a concert flute or piccolo. If you play a duet with a concert flute and the lower register of a bass flute, the bass flute sounds much the quieter of the two.
At first sight, this statement may seem to be dubious, because a bassoon can play in the same region as the bass flute, and if you listen to the same duet played by a concert flute and a bassoon, the two instruments will not sound so different in loudness. There is no balance problem in this case. But remember that the bassoon sound, like the oboe, is particularly rich in harmonics. This means that when the bassoon plays the C in the bass stave (which is about 128 Hz), most of the sound it produces is not really at 128 Hz but at much higher harmonics of this frequency, where the ear is a lot more sensitive. Apparently, the ear is able to analyse the bassoon sound into its harmonics and to respond strongly to the higher harmonics, as expected for its greater sensitivity at these higher frequencies. We shall see later (sect. 3.3.1, fig. 7) that the bass flute sound has much lower amplitudes of these higher harmonics, so it sounds quieter than a bassoon. We have here a good demonstration that these harmonics are very real.
3.2 Pitch.
Pitch depends basically on frequency; a higher frequency gives a higher pitch. For the more mathematically minded, the relation is logarithmic, i.e. a given ratio of frequencies corresponds to a particular musical interval. The simplest intervals have the following simple frequency ratios:
Octave 2
Fifth 3/2
Fourth 4/3
Major third 5/4
(at least approximately; the precise values depend on the temperament of the scale in use.)
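The remark about temperament can be made concrete. The sketch below compares the simple ratios listed above with the same intervals in equal temperament, which is an assumption introduced here for illustration: 12 equal semitones to the octave, each a frequency ratio of 2 to the power 1/12. Differences are given in cents (1200 cents to the octave), which also shows the logarithmic relation at work.

```python
import math

# Simple frequency ratios from the list above.
SIMPLE_RATIOS = {"Octave": 2 / 1, "Fifth": 3 / 2, "Fourth": 4 / 3, "Major third": 5 / 4}
# Size of each interval in equal-tempered semitones (assumed temperament).
EQUAL_SEMITONES = {"Octave": 12, "Fifth": 7, "Fourth": 5, "Major third": 4}

def cents(ratio):
    """Size of a frequency ratio in cents; equal ratios give equal sizes."""
    return 1200 * math.log2(ratio)

for name, ratio in SIMPLE_RATIOS.items():
    equal_ratio = 2 ** (EQUAL_SEMITONES[name] / 12)
    difference = cents(equal_ratio / ratio)
    print(f"{name:12s} simple {ratio:.4f}  equal-tempered {equal_ratio:.4f}  "
          f"difference {difference:+.1f} cents")
```

In this comparison the octave agrees exactly, the fifth and fourth differ by about 2 cents, and the major third by nearly 14 cents, which is why the list can only be approximate for a tempered scale.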
If you listen more carefully, the pitch may change slightly with loudness, even if the frequency doesn't change. Sound clip 3 contains a sine wave of 150 Hz, alternating loud and soft. To most people, the loud note sounds lower in pitch, even though the frequency doesn’t change. Clip 4 shows the same thing for a sine wave at 3200 Hz. Here, the effect is much less and to some people, it even reverses (the loud note sounding higher). However, if one uses a more complicated sound than a sine wave, the effect largely goes away; sound clip 5 repeats the soft-loud alternation for 150 Hz, as in clip 3, but for a square wave. For most people, the effect has essentially vanished for a square wave.
Track 3: (NOTE: This track produces a very low note and requires speakers/headphones that are responsive to lower notes. It cannot be heard very well on some speakers e.g. built-in laptop speakers, with a small responsive range.)
Track 4:
Track 5:
For most people, this effect is summarised in this table:
Frequency | Waveform | Observed effect
Low (around 150 Hz) | Sine | Louder notes sound lower
High (around 2000 Hz) | Sine | Little if any effect
Any | Square | Little if any effect
At first sight, this effect seems to suggest that when there is a diminuendo, the music would begin to rise in pitch. This would be an added complication in performance. But look at the last line of the table; for a realistic musical sound, with many harmonics, the effect disappears. A square wave has many harmonics, just like a realistic musical instrument. Apparently, the ear somehow hears the average effect over the range of frequencies of the harmonics in the waveform, and when there is a wide range of frequencies, as in any realistic musical sound, the effect goes away. There are many examples of odd acoustic “tricks” that can be demonstrated using sine waves, but which go away for realistic musical sounds.
When tuning an ensemble, one needs a sound of well-defined pitch. The above suggests that a sine wave would be the worst choice and a sound with substantial harmonic content should be used. Of course, orchestras are aware of this and tune to an oboe. But when the ensemble doesn't contain an oboe, you have to use something else. A flute is not a very good choice here (see sect. 3.3.1, fig. 7). An electronic device producing a square wave (or any other waveform that is far from a pure sine wave) is preferable.
3.3 Tone colour.
3.3.1 The steady sound.
Tone colour depends a lot on the shape of the waveform or, equivalently, the harmonics present in the waveform. As mentioned above, physicists will tell you that any waveform other than a sine wave can be expressed as a sum of sine waves of many higher frequencies (called harmonics) that are multiples of the lowest frequency (called the fundamental). We have seen in sect. 2.4 how this happens as a purely algebraic effect, apart from any relevance to sound.
To the musician, this would be quite irrelevant except that the ear is apparently able to separate a complicated waveform into the various harmonics present. Although you don't immediately hear these harmonics, we have already seen two instances where the ear appears to respond to the harmonics as well as the fundamental, and a third example appears below. So this analysis into harmonics (called Fourier analysis) seems to be relevant to the way in which the ear behaves. But please remember that when one talks about the frequency of a sound, one means that of the fundamental. Thus “tuning A” is usually referred to as being 440 Hz, but typically, the wave also contains 880 Hz, 1320 Hz, 1760 Hz, etc.
The effect is demonstrated in sound clips 6 and 7. Firstly, in clip 6 you hear a sine wave of 256 Hz, which is about middle C. Then, in clip 7, all the harmonics are played individually, up to the 16th (which is a sine wave with frequency 16 x 256 = 4096 Hz). Then the 256 Hz sine wave and all the harmonics are played together. What you hear is just one pitch, middle C, but with quite a different tone quality, somewhat like the square-wave sound. Finally, you hear the 256 Hz sine wave again, so that you can compare it with the sine wave plus harmonics. The pitch of a sound is always that of the fundamental, whatever harmonics are present.
Track 6:
Track 7:
Figs. 5 and 6 show the harmonics present in the sounds of a range of instruments. In fig. 5, the waveform is plotted, while in fig. 6, the amplitudes of the various harmonics are shown.
You might be surprised by three observations:
(i) The wide range of harmonics present, often up to the 10th harmonic (at 10 times the fundamental frequency) and beyond.
(ii) Sometimes the sound is almost all harmonics. Look at the oboe sound; if it's playing the tuning A, which has a frequency of 440 Hz, there is very little 440 Hz in the note. Most of the sound is in harmonics up to at least the 10th (i.e. 4400 Hz).
(iii) It's very hard to guess what an instrument sounds like from these pictures. To the ear, we know that the four violins in fig. 6 would sound much more similar to each other than to the other instruments. But you would hardly guess that from the waveforms in fig. 6.

Fig. 5. Waveforms of sounds from some musical instruments.
Fig. 6. Amplitudes of harmonics present in various sounds.
The oboe has almost no sound at the actual fundamental frequency, even though it is just that pitch that you hear when listening to it. The same is true of the bassoon sound. In fact, even if the fundamental frequency is omitted altogether, the ear still hears the same pitch as if it were there. Sound clip 8 shows this. Firstly, you hear a 256 Hz sine wave. Then you hear an entire set of harmonics from 1 to 16 (i.e. frequencies 256, 512, 768, 1024,…, 4096 Hz), which produce a different sound but still with a pitch corresponding to 256 Hz. Thirdly, you hear just the harmonics from 2 to 16, i.e. sine waves of frequencies 512, 768, 1024,…, 4096 Hz, but omitting 256 Hz. Surprisingly, the ear still hears one pitch, that corresponding to 256 Hz, even though there is no such sine wave there. This is referred to as “reconstruction of the fundamental”.
Track 8:
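One piece of arithmetic makes this result a little less mysterious. A sum of sine waves at whole-number frequencies repeats at a rate equal to the greatest common divisor of those frequencies, so the waveform of harmonics 2 to 16 still repeats 256 times per second even though it contains no 256 Hz component. The sketch below only illustrates this arithmetic; it is not a model of how the ear actually arrives at the pitch, and the function name repetition_rate is invented here.

```python
from functools import reduce
from math import gcd

def repetition_rate(frequencies_hz):
    """Rate (in Hz) at which a sum of sine waves repeats.

    Only valid for whole-number frequencies: the combined waveform
    repeats at the greatest common divisor of its components.
    """
    return reduce(gcd, frequencies_hz)

all_harmonics = [256 * n for n in range(1, 17)]        # 256, 512, ..., 4096 Hz
without_fundamental = [256 * n for n in range(2, 17)]  # 512, 768, ..., 4096 Hz

print(repetition_rate(all_harmonics))        # 256
print(repetition_rate(without_fundamental))  # 256
```

Removing further harmonics can change the answer: with only the even harmonics 512, 1024, 1536 Hz, etc. left, the waveform repeats 512 times per second.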
Fig. 7 shows the harmonics present in the sound of one instrument, the flute, but in different registers and at different dynamics. These sounds are quite different - almost as different as the various instruments in figs. 5 and 6. Sound clips 9 and 10 demonstrate this for a horn sound. In clip 9, you hear a short phrase played at 3 different pitches, in three octaves. Then in clip 10, you hear the phrase played at the same three pitches, but for the lower pitches, the tape was speeded up on playback so that they all sound at the same pitch. If the waveform of the horn were the same at all pitches, these three versions of the phrase would sound the same, but in fact, this is very far from being the case.
Track 9:
Track 10:

Fig. 7. Amplitudes of harmonics present in a flute sound for the middle
and low registers, and for two dynamic levels.
3.3.2 Transients.
When a sound on a musical instrument starts or stops, or moves legato from one note to the next, the waveform is quite different (for a brief period of time) from that during the “middle” of a note, which was discussed in sect. 3.3.1. This short burst with a different waveform, referred to as a transient, is quite characteristic of each instrument and is just as important in determining an instrument’s characteristic sound as is the waveform in the “middle” of a note. If you listen to the “middle” of a note (i.e. remove the transient somehow and listen only to the sound after the note has started and before it stops), it can be quite difficult to distinguish one instrument from another.
To illustrate this, sound clip 11 contains the following four items:
(i) The “middle” of a note. This was recorded by suppressing (electronically) the transient at the start and end of the note.
(ii) The same as (i) but on a different instrument.
(iii) The same instrument as in (i) plays a little phrase. The transient at the beginning of the first note is suppressed, but there are many transients later in the phrase.
(iv) The same as (iii), but played by the same instrument as in (ii).
In (i) and (ii), with no transients, it can be hard to tell what the two instruments are. Non-musicians often cannot tell. Musicians usually know, but it is certainly more obvious what the two instruments are in (iii) and (iv) as soon as the note changes, so that there are transients as well as just the middles of notes.
Track 11:
This should contain a useful message for players. If you can only distinguish a flute from a trumpet (for example) by listening to the transients, the transient is probably also important in distinguishing a good sound from a bad sound. So when practising for a good sound, it is probably just as important to practise tonguings, bowings, legato movements, etc., as it is to practise the sound of a long note.
An amusing demonstration of the importance of transients in characterising a sound is shown in sound clip 12. A passage is played on the piano, but with the tape reversed, so that starting transients come at the end of the note and vice versa. Also, the middle of the note is reversed in time. It’s surprising how different it sounds. The same passage is then played in clip 13 with the tape running in the correct direction, in case you didn’t identify the piece played with the tape direction reversed.
Track 12:
Track 13:
4. Two sounds together.
The response of the ear to two or more simultaneous sounds is a complicated subject, which contains the theory of scale construction and the phenomenology of harmony. This is more than space permits in this note.
A less familiar topic involving two or more sounds together is masking. By masking, we mean when one sound obscures another so that you can’t hear it. Musically, this is normally not wanted, so we study masking mainly in order to learn how to avoid it.
Masking usually doesn’t occur when the two sounds are of very different pitches. It would be difficult to obscure a double bass with a flute or vice versa, even if one is much louder than the other. But it is important to realise that it is not actually the pitch that matters but the frequency range. These may not be the same thing if we are talking about two instruments with different harmonic content in their sounds. This is demonstrated in sound clip 14, which contains the Westminster chimes, in which each note is played twice. The first note of each pair is a sine wave, so the frequencies are a few hundred Hz. The second note of each pair contains very little of the fundamental and a lot of high harmonics, so even though the pitch is the same as that of the first note, the frequencies present in the sound are much higher. The notes sound a bit like flute and oboe respectively, though they are actually electronically generated. Then the phrase is repeated and noise in the region of a few hundred Hz is added to the playback. This obscures the “flute-like” sound, because it is in the same frequency range. But the “oboe-like” sound is still audible, since it is mostly harmonics with much higher frequencies than that of the noise. The noise goes away and you can hear both notes of each pair. Then, noise is added that has frequencies in the range of a few thousand Hz. Now, the “oboe-like” sound is obscured and the “flute-like” sound remains. So to decide if a sound is likely to be masked, you need to look at the actual frequencies present in the sound, not just the frequency that you would guess from the pitch (i.e. the fundamental frequency).
Track 14:
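The rule that emerges from this demonstration (look at the frequencies actually present, not at the pitch) can be sketched in a few lines. Everything numerical below is hypothetical and chosen only to mimic the description of clip 14: a fundamental of 400 Hz, a "flute-like" sound that is a pure sine wave, an "oboe-like" sound with a weak fundamental and strong harmonics 4 to 10, and two made-up noise bands. The sketch simply asks what fraction of each sound's power lies inside each noise band, taking power as proportional to amplitude squared; it is not a model of hearing.

```python
def fraction_in_band(fundamental_hz, harmonic_amplitudes, band_hz):
    """Fraction of a sound's power that lies inside a frequency band.

    harmonic_amplitudes maps harmonic number -> amplitude;
    power is taken as proportional to amplitude squared.
    """
    low, high = band_hz
    total = sum(a * a for a in harmonic_amplitudes.values())
    inside = sum(a * a for n, a in harmonic_amplitudes.items()
                 if low <= n * fundamental_hz <= high)
    return inside / total

FUNDAMENTAL_HZ = 400.0  # hypothetical "few hundred Hz" note
SOUNDS = {
    "flute-like": {1: 1.0},                                   # pure sine wave
    "oboe-like": {1: 0.1, **{n: 1.0 for n in range(4, 11)}},  # weak fundamental
}
NOISE_BANDS = {
    "medium-frequency noise": (200.0, 800.0),   # hypothetical band
    "high-frequency noise": (1000.0, 5000.0),   # hypothetical band
}

for sound_name, amplitudes in SOUNDS.items():
    for band_name, band in NOISE_BANDS.items():
        share = fraction_in_band(FUNDAMENTAL_HZ, amplitudes, band)
        print(f"{sound_name:10s} in {band_name}: {share:.0%} of its power")
```

With these made-up numbers, both sounds have the same pitch, yet almost all of the flute-like sound falls inside the medium-frequency band and almost all of the oboe-like sound inside the high-frequency band, matching which noise obscures which sound in the clip.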
This has some practical consequences for orchestrators and arrangers. There exists in the orchestral repertoire a piece with a bassoon solo accompanied by violins. At first sight, this is a reasonable thing to do, but remember that the bassoon sound has a lot of harmonics and very little fundamental, so that most of the actual frequencies present are many times those of the fundamental, and lie in the violin register. A professional bassoonist tells me that the solo is well known for being difficult to bring out over the violin accompaniment, though he never understood why until he learned about masking. There are times when one might actually want masking to occur, namely when two sounds are supposed to blend. Composers sometimes write quiet passages for cellos and bassoon in unison. But the bassoon tends not to blend well with the cellos, since its sound occupies predominantly a different frequency range; it tends to stick out from the texture. And, of course, all of us who play wind quintets are well aware that the other double-reed instrument, the oboe, can never be masked by any of the other players in the group, however hard we try.
In preparing these notes, I have benefited much from interaction with Prof. Charles Taylor and his group at University of Wales at Cardiff. I learned a lot of physics while talking to them, and they provided me with the Westminster chimes tape used in sound clip 14.
Note on the sound clips
These sound clips can be played on any equipment designed for playing music. High quality is not required except possibly for clips 3 to 5. For these clips, careful adjustment of the playback level is necessary. It is important that the loud dynamic is loud enough, but it is also necessary to avoid undue distortion, since it is essential to the demonstration that the sounds in clips 3 and 4 are reasonably pure sine waves. Some experimentation may be necessary to give the most convincing demonstration. The speakers that are supplied with many computers may not be adequate. Headphones of reasonable quality work quite well.
Clip contents
1. Track1.wav.
Sine waves at 200, 500, 2000 and 4200 Hz.
2. Track2.wav.
Alternating sine and square waves at 500 Hz.
3. Track3.wav. Alternating loud and quiet sounds; sine wave at 150 Hz.
4. Track4.wav. Alternating loud and quiet sounds; sine wave at 3200 Hz.
5. Track5.wav. Alternating loud and quiet sounds; square wave at 150 Hz.
6. Track6.wav. Pure 256 Hz sine wave.
7. Track7.wav
(i) All harmonics of the 256 Hz sine wave, up to the 16th.
(ii) 256 Hz sine wave and all harmonics played together.
(iii) 256 Hz sine wave alone.
8. Track8.wav
(i) 256 Hz sine wave.
(ii) 256 Hz sine wave and harmonics up to the 16th.
(iii) Harmonics 2 to 16 of the 256 Hz sine wave, but omitting the 256 Hz fundamental.
9. Track9.wav.
A passage played on a horn at three different pitches separated by an octave.
10. Track10.wav.
The same passage as in 9, but the lower pitched passages were played at one half or one quarter of the speed and the playback was speeded up by the same factor.
11. Track11.wav. Sounds of two instruments with the starting and ending transients removed during recording:
(i) Instrument A, both transients removed.
(ii) Instrument B, both transients removed.
(iii) Instrument A, starting transient removed.
(iv) Instrument B, starting transient removed.
12. Track12.wav.
A passage played on a piano, with the tape played backwards, to interchange starting and ending transients.
13. Track13.wav.
The same passage with the tape replayed forwards.
14. Track14.wav.
Masking demonstration:
(i) Westminster chimes. Each note is played twice, firstly as a sine wave and then as a sound rich in harmonics.
(ii) The same, but with medium-frequency noise superimposed, to mask the sine-wave sound.
(iii) The same, but with high-frequency noise superimposed, to mask the sound rich in harmonics.
If you found this article of interest, why not join the BFS for access to our quarterly magazine "Pan - The Flute Magazine" which always includes a wide range of articles on the flute and flute playing, as well as all the latest news and reviews?