Chapter 4: The Human Auditory System

As with speech, the human auditory system has several aspects, and it relies on several remarkable pieces of 'hardware' and 'software' that together endow us with the ability to hear.
These aspects include the physical sound waves that travel through the air, which can be picked up and analysed (or generated) by digital computer hardware, as well as the interpretation of those signals by the human brain.

The difference between what we perceive and what is physically present is due in part to 'hardware' limitations, such as the frequency response of the outer ear, but also to the brain processing that is our 'software'. That level of hearing is usually studied as part of the fascinating topic of psychoacoustics (which is the subject of Chapter 5).

The main hardware components are shown in the figure below:

Probably the most important and interesting part of this is within the inner ear, the cochlea. Within the cochlea, the basilar membrane and the hair cells that are hosted there are the parts of the ear that are sensitive to sounds. As described within the book, it is the vibration of the hair cells in response to rarefactions or compressions of the fluid in the inner ear that gives rise to nerve signals and subsequently to the sensation of hearing.

An excellent resource that describes this hardware in some detail is at www.cochlea.ea, which even shows electron microscopy of the cochlea and hair cells. Another source of useful background information is Wikipedia.


4.2.2 Equal loudness

Tones of the same physical amplitude are perceived as having different degrees of loudness. Try this example from the book:

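% three tones of equal amplitude at a 44.1 kHz sample rate
lo= tonegen(250, 44100, 2);
mi= tonegen(1200, 44100, 2);
hi= tonegen(11000, 44100, 2);
% wait for each sound to finish before playing the next
soundsc(lo, 44100);
soundsc(mi, 44100);
soundsc(hi, 44100);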
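
The examples on this page rely on tonegen, which comes from the book and is not a MATLAB built-in. For readers without the book's code to hand, the following is a minimal stand-in, assuming the argument order seen in the calls here (tone frequency in Hz, sample rate in Hz, duration in seconds) and a row-vector output. It is a sketch under those assumptions and may differ in detail from the book's own function.

```matlab
% Hypothetical stand-in for the book's tonegen(Ft, Fs, Td):
%   Ft = tone frequency in Hz, Fs = sample rate in Hz, Td = duration in seconds.
% Returns a unit-amplitude sine wave as a row vector.
tonegen = @(Ft, Fs, Td) sin(2*pi*Ft*(1:round(Fs*Td))/Fs);

% Example: two seconds of a 250 Hz tone sampled at 44.1 kHz
s = tonegen(250, 44100, 2);
num_samples = length(s);
```

Defining it as an anonymous function keeps the sketch inside a single script; saving the same expression in a file called tonegen.m would work equally well.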

4.2.3.1 2f1-f2 tone induction

Here is an example from the book:

fs=44100; %sample frequency 
f1=tonegen(1800,fs,2);
f2=chirp([1:fs*2]/fs,2000,2,2200,'q');
left=[f1,zeros(1,fs),f1]; % repeat sounds twice
right=[f2,zeros(1,fs),f2];

soundsc(left,fs)
%wait for this to finish
%
%
% then
soundsc(right,fs)

%
%
% Now try this
% (soundsc expects one column per channel, hence the transpose)
soundsc([left;right].',fs)
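
To know where to listen for an induced tone, it helps to work out where the 2f1-f2 product would fall during the sweep. The sketch below assumes the quadratic sweep rises from 2000 Hz at the start to 2200 Hz after 2 seconds, following the usual definition of a quadratic chirp; the variable names are chosen only for illustration.

```matlab
f1 = 1800;                              % fixed tone frequency in Hz
t = linspace(0, 2, 5);                  % a few instants across the 2 s sweep
f2 = 2000 + (2200 - 2000)*(t/2).^2;     % instantaneous chirp frequency (quadratic sweep)
induced = 2*f1 - f2;                    % frequency of the 2f1-f2 product at each instant
```

Under these assumptions the product starts at 1600 Hz and falls to 1400 Hz, so any induced tone moves downwards while the chirp moves upwards.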

4.2.3.2 f2-f1 tone induction

Here is an example from the book:

t1=tonegen(196, 8000, 2);
t2=tonegen(196*2, 8000, 2); 
t3=tonegen(196*3, 8000, 2);
t4=tonegen(196*4, 8000, 2);

soundsc(t1, 8000); 
%
% then
soundsc(t1+t2+t3+t4, 8000);
%
% then
soundsc(t2+t3+t4, 8000);
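
The last of these signals contains no component at 196 Hz, although its harmonics are spaced 196 Hz apart. A quick way to confirm this is to look at the amplitude spectrum. The sketch below rebuilds the signal directly with sin so that it runs on its own; the helper names tone and amp_at are hypothetical and used only here.

```matlab
Fs = 8000;                               % sample rate in Hz
n = 0:2*Fs-1;                            % sample indices for a 2 s signal
tone = @(f) sin(2*pi*f*n/Fs);            % unit-amplitude sine at frequency f
x = tone(196*2) + tone(196*3) + tone(196*4);   % harmonics without the fundamental

X = abs(fft(x))/(length(x)/2);           % scaled so a unit sine reads as 1
amp_at = @(f0) X(round(f0*length(x)/Fs) + 1);  % amplitude at frequency f0

fundamental_amp = amp_at(196);           % effectively zero
second_harmonic_amp = amp_at(392);       % close to 1
```

Every tone here completes a whole number of cycles in the 2 second window, so there is no spectral leakage to blur the result.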

4.2.9 Masking

To illustrate the explanation in the book, try the following:

lo=0.1*tonegen(800, 8000, 2); 
hi=tonegen(880, 8000, 2);

%Let us visualise this
pwelch(hi+lo,4096,2048,4096,8000)

% first adjust volume
sound(hi/2, 8000);
% that was the louder sound
sound(lo/2, 8000);
% that was the quieter sound - now together:
sound((lo+hi)/2, 8000);
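
The size of the amplitude mismatch follows directly from the 0.1 scaling factor. As a minimal check, the sketch below rebuilds both tones with sin (so it does not depend on tonegen) and compares their RMS levels in decibels; rms_of is a hypothetical helper defined only for this example.

```matlab
Fs = 8000;                               % sample rate in Hz
n = 1:2*Fs;                              % sample indices for a 2 s signal
hi = sin(2*pi*880*n/Fs);                 % the louder tone
lo = 0.1*sin(2*pi*800*n/Fs);             % the quieter tone, one tenth the amplitude

rms_of = @(x) sqrt(mean(x.^2));          % root-mean-square level
level_diff_dB = 20*log10(rms_of(hi)/rms_of(lo));   % level difference in dB
freq_ratio = 880/800;                    % how close the tones are in frequency
```

The quieter tone sits 20 dB below the louder one, and the two frequencies differ by only 10 percent.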

The visualised signals are plotted as shown below, indicating that they are very close in frequency but of mismatched amplitude:


4.2.12 Frequency discrimination

Can you hear the difference between these tones?

t1=tonegen(1000, 8000, 2); 
t2=tonegen(1002, 8000, 2);
%listen to both, 1/8 second pause between (1000 samples at 8 kHz)
soundsc([t1,zeros(1,1000),t2], 8000);

You probably can't, unless you really have 'perfect pitch'. But even those who are slightly 'tone deaf' will probably notice the difference when we remove the gap:

soundsc([t1,t2], 8000);

Incidentally, most of our human senses are more sensitive to changes than they are to steady-state, absolute conditions. This is true of motion, sight, touch, taste and temperature. In fact I would argue it is also true of emotions: anger, love, happiness, sadness and so on.


4.2.14 Mistuning of harmonics

Listen to these three tones played together:

note =440;
t1=tonegen(note, 8000, 1); 
t2=tonegen(note*2^(3/12), 8000, 1);
t3=tonegen(note*2^(8/12), 8000, 1);
%
% try it
soundsc(t1+t2+t3, 8000);

Does it sound pleasant? Now try it with some mistuning:

m2=tonegen(note*1.05*2^(3/12), 8000, 1); 
soundsc(t1+m2+t3, 8000);
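
It can help to see the actual frequencies involved and how large the mistuning is in musical terms. The sketch below recomputes the three tone frequencies and expresses the 5 percent shift in cents (hundredths of an equal-tempered semitone); the variable names are chosen only for illustration.

```matlab
note = 440;                              % reference note in Hz
chord = note*2.^([0 3 8]/12);            % 0, 3 and 8 semitones above the reference
mistuned = note*1.05*2^(3/12);           % middle tone raised by 5 percent
cents = 1200*log2(mistuned/chord(2));    % size of the mistuning in cents
```

The three tones lie at roughly 440, 523 and 698 Hz, and the mistuning comes to about 84 cents, a little under a full semitone.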

4.2.15 The precedence effect

% assumes a recording is already loaded as vector 'audio' with sample rate 'Fs'
audio=reshape(audio ,1,length(audio)); 
for echo =0.01:0.020:0.1
	pad=zeros(1,fix(Fs*echo));
	input('Press Enter to hear next echo');
	soundsc([audio,pad]+[pad,audio],Fs);
end
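
The loop mixes the recording with a delayed copy of itself by padding one copy at the end and the other at the start. A minimal sketch with a synthetic click in place of the recording makes the construction visible; the click signal, the Fs value and the single 50 ms delay are assumptions made purely for illustration.

```matlab
Fs = 8000;                               % assumed sample rate in Hz
audio = [1, zeros(1, Fs-1)];             % one second of silence starting with a click
delay = 0.05;                            % echo delay in seconds
pad = zeros(1, round(Fs*delay));         % silence lasting as long as the delay
mix = [audio, pad] + [pad, audio];       % direct sound plus the delayed copy
peaks = find(mix);                       % sample positions of the two clicks
```

The mixed signal holds the click at sample 1 and its echo 400 samples later, and it is longer than the original by the length of the pad.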

4.2.16 Speech perception

First load some speech into an array called 'speech', with its sample rate in 'Fs', then:

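speech=speech(:); %force a column vector (mono speech assumed)
soundsc(speech ,Fs) 
ls=length(speech); 
ws=round(Fs*0.1); %segment length: 0.1 s as a whole number of samples
s2 =[];
for i=1:floor(ls/ws) 
	s2=[s2;speech(i*ws:-1:1+(i-1)*ws)];
end
%now listen to the reversed segments
soundsc(s2,Fs)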

Can you still understand this? If so, try changing the value of ws...
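
To see exactly what the loop does to the samples, it can be run on a short vector of numbers instead of speech. This is only an illustration: the ten-element vector and the segment length of 4 are made-up values chosen to keep the result readable.

```matlab
speech = (1:10)';                        % stand-in "signal": the numbers 1 to 10
ws = 4;                                  % segment length in samples
s2 = [];
for i = 1:floor(length(speech)/ws)
    % take segment i and append it in reverse order
    s2 = [s2; speech(i*ws:-1:1+(i-1)*ws)];
end
% s2 is now [4 3 2 1 8 7 6 5]'
```

Each segment is reversed in place while the segments themselves stay in their original order. Samples left over after the last whole segment (9 and 10 here) are dropped.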