SPHSC 462 HEARING DEVELOPMENT
Overview and Review of Hearing Science: Introduction
Overview of course and requirements. Lecture/discussion; lecture notes on the website: http://faculty.washington.edu/lawerner/sphsc462/ No text; chapter and other readings are on the website. Requirements: two take-home exams (40% of grade each) and discussion questions/in-class discussion of articles (20% of grade).
Exams: take-home, essay, open book. You must work alone; the penalty for breaking this rule is a failing grade.
Article discussion. Articles every week starting Oct 15. I will provide some questions to guide you through the articles. You will submit one question about things you didn't understand, one question that you think would be interesting to discuss, and an answer to the question, "How is this article relevant to me?" by 5:30 am on discussion day.
Drop boxes. Your discussion questions: https://catalysttools.washington.edu/collectit/dropbox/lawerner/7177 Take-home exams: https://catalysttools.washington.edu/collectit/dropbox/lawerner/7143
Review of Hearing Science. I'm going to go pretty quickly through the highlights of what we understand about hearing. I will talk about some of these topics in more detail when we get to the part of the course where each is relevant, but today the point is to get us all on the same page about what it is that could be developing when we talk about hearing development.
Hearing in a nutshell. [Slide figure: a listener ("Wow! Psychophysics is interesting!") and a spectrogram-like display, frequency vs. time.] This process is illustrated in this slide. Sources in the environment produce sounds that are conducted into the ear and transduced into a neural response. The output of the ear is carried in a series of frequency-labeled lines--like a spectrogram (go through). The amount of activity in each line represents the intensity of sound in a given frequency band at a given moment in time. The brain takes this output, extracts information, and then constructs a representation of the sources, their locations, what they're saying, etc. So another way of describing infants' situation is that the coding aspects of the process--the way the periphery represents sound--are mature, but the process by which the listener extracts information from the output is immature.
What are the characteristics of sound represented in the auditory nerve response? Intensity, frequency, and temporal characteristics (changes in intensity or frequency over time). There are three basic characteristics that we need in order to identify a sound: its intensity, its frequency, and the way that its intensity and frequency change over time. An example of a temporal characteristic is the formant transitions in speech. The ear has to have some way of representing those characteristics of sound in the response that it sends to the brain. Sometimes I'll refer to the way that the ear represents a characteristic of sound as the code for that characteristic. So we have to have a code for intensity, a code for frequency, and a code for the temporal characteristics of sound.
How does the ear come to represent these characteristics of sound? Conduction, transduction, and the traveling wave and active mechanism. There are three mechanisms in the ear that produce the neural code for sound: the conduction of sound into the inner ear, accomplished by the outer and middle ear; the transduction of sound in the inner ear, which changes the sound into a neural response; and finally, special mechanisms based on the traveling wave and the active mechanism (which I will review in a few slides) that provide an extra code for frequency.
Conduction. [Slide figure: amplitude spectra, level (dB) vs. frequency (Hz), at successive stages of conduction.] Sound is conducted into the inner ear through the external, or outer, ear and the middle ear. If I play a sound into the ear with an amplitude spectrum like the one at the top left of the slide and I measure the amplitude spectrum of the sound in the ear canal, it will look like the figure in the middle left graph--certain frequencies are present at higher intensities than others because of resonance effects. If I were to take the sound in the ear canal and somehow lead it directly to the inner ear, and then measure the amplitude spectrum of the sound in the inner ear, the sound intensity would be vastly reduced, because of the impedance mismatch between the air in the ear canal and the fluids in the inner ear. The middle ear overcomes this impedance mismatch, mainly because the tympanic membrane is much bigger than the stapes footplate that conducts sound into the inner ear. So if sound goes through the middle ear into the inner ear, and I measure its amplitude spectrum in the inner ear, a lot of sound gets through.
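To put rough numbers on the middle ear's pressure gain, here is a small sketch using common textbook approximations; the area and lever values are illustrative assumptions, not figures from this lecture:

```python
import math

# Approximate textbook values (assumptions, not from the lecture):
# tympanic membrane effective area ~55 mm^2, stapes footplate ~3.2 mm^2,
# ossicular lever ratio ~1.3.
tm_area_mm2 = 55.0
footplate_area_mm2 = 3.2
lever_ratio = 1.3

# Pressure gain: roughly the same force concentrated onto a smaller area,
# boosted a little further by the ossicular lever.
pressure_gain = (tm_area_mm2 / footplate_area_mm2) * lever_ratio
gain_db = 20 * math.log10(pressure_gain)

print(f"pressure gain ~{pressure_gain:.1f}x, ~{gain_db:.0f} dB")
```

With these values the gain comes out to roughly 25-27 dB, which is in the ballpark of how much would otherwise be lost to the air-to-fluid impedance mismatch.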
Transduction: changing acoustic energy to electrochemical energy. [Slide figure: panels A-F.] The part of the inner ear concerned with hearing is the cochlea. The cochlea is a coiled tube, divided along its length by two membranes, forming three fluid-filled sections. It is shown in A as if it were uncoiled. Sound reaches the inner ear via the stapes footplate through the oval window. An increase in pressure caused by inward motion of the stapes footplate is counteracted by a decrease in pressure at the round window; a decrease in pressure caused by outward motion of the stapes footplate is counteracted by an increase in pressure at the round window. This pressure difference across the cochlear membranes sets the membranes into motion, sort of like a wave moving along a flicked rope. If you looked at the cochlear tube in cross section, you would see the two membranes and the three sections of the tube, as in B. The membranes are called Reissner's membrane and the basilar membrane. On top of the basilar membrane is the organ of Corti. The organ of Corti contains hair cells, which are the actual transducers in the ear. The fluid above the hair cells has a higher concentration of positive potassium ions than the fluids below the hair cells, so there is an electrical potential difference across the hair cell.
Where does the code for intensity come from? Low level vs. high level: the combined firing rate of auditory nerve fibers with the same best frequency. At this point we can see how the ear can represent the intensity of sound. A more intense sound will produce a bigger stapes displacement, a bigger pressure difference in the cochlear fluids, and a bigger basilar membrane displacement, which means more displacement of the stereocilia, more ions flowing into the hair cell, more neurotransmitter released, and more action potentials (or "spikes") in the auditory nerve fibers. So the basic code for intensity is the firing rate of auditory nerve fibers. Of course, you may remember that each auditory nerve fiber can only cover part of the sound intensity range, so the brain has to combine the firing rates of nerve fibers to get the total firing rate. We'll return to the idea of best frequency shortly.
Where does the code for frequency come from? At this point, we can also see where one of the two codes for frequency, and the code for the temporal characteristics of sound, could come from. This movie shows a hair cell as it is being stimulated by a pure tone. The middle panel shows the time waveform of the tone, and as the movie plays, a dot moves along the waveform to indicate the pressure at each point in time. When the stereocilia are displaced to the right in this picture, the tip links between the stereocilia are stretched, and ions flow into the hair cell, which leads to neurotransmitter release, and sometimes to an action potential, which is shown as the lighting up of the nerve fibers contacting the hair cell and of the little traveling dot. The graph at the bottom is a histogram that counts how many action potentials occur at each point in time. Notice that action potentials only occur when the pressure is positive and the stereocilia are displaced to the right, never when the pressure is negative and the stereocilia are displaced to the left. Notice also that the higher the positive pressure, the more action potentials occur. This is the phenomenon known as phase locking--action potentials tend to occur at the positive peaks in the sound waveform.
Coding of the time waveform of sound: tone. [Slide figure: pressure waveform, hair cell potential, and spike histogram vs. time.] Think about the distribution of action potentials in a bunch of auditory nerve fibers over a period of time. Let's say the sound is a pure tone like this one. The pressure varies symmetrically around zero (shown on the left), and the hair cell only gets depolarized when the sound pressure is positive, so the hair cell's electrical response looks like the waveform at the top right. If we count how many action potentials we get at each point in time, over a bunch of auditory nerve fibers, we get a histogram like the one on the bottom left: the most action potentials when the positive peak in the waveform occurs, fewer action potentials when the pressure is positive but not at its peak, and no action potentials when the pressure is negative. If the brain wanted to figure out the frequency from this, it could do it pretty well by looking at the timing of the action potentials.
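A toy simulation can make phase locking concrete. The sketch below (an illustrative model, not the lecture's movie) treats spike probability as proportional to the half-wave rectified pressure and then folds the spike times onto one cycle of the tone; all parameter values are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 100_000          # sample rate (Hz)
f = 500               # a low-frequency tone (Hz)
t = np.arange(0, 0.1, 1 / fs)
pressure = np.sin(2 * np.pi * f * t)

# The hair cell depolarizes only for positive pressure (half-wave
# rectification), and spike probability grows with depolarization.
drive = np.clip(pressure, 0, None)
spikes = rng.random(t.size) < 0.02 * drive   # Bernoulli spike per sample

# Period histogram: fold spike times onto one cycle of the tone
# (phase 0.25 is the positive peak, phase 0.75 the negative peak).
phase = (t[spikes] * f) % 1.0
counts, edges = np.histogram(phase, bins=10, range=(0, 1))
print(counts)
```

The histogram shows spikes piling up around the positive peak of the cycle and none at all in the negative half-cycle, which is the phase-locking pattern described above.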
Coding of the time waveform of sound: complex waveform. [Slide figure: pressure waveform, hair cell potential, and spike histogram vs. time.] Now let's say we have a complex sound--a sound with more than one frequency in it (shown on the left). The hair cell's electrical response still follows the time waveform of the sound--at least the positive parts--but now the peaks in the waveform aren't evenly spaced as they were for a pure tone. The most action potentials still occur at the positive peaks in the time waveform. It may not be really obvious to you, but a smart brain could still look at the timing of the action potentials and figure out which frequencies are in there, because the timing of the action potentials depends on which frequencies are in there.
Coding of the time waveform of sound: high-frequency tone. [Slide figure: pressure waveform, hair cell potential, and spike histogram vs. time.] The situation is different for a high-frequency tone. Ions flow into the hair cell when the tone comes on, but the hair cell can't respond fast enough to clear out the ions before the next positive peak in the time waveform occurs. So the electrical potential in the hair cell builds up and just stays about the same the whole time the high-frequency tone is on. As a result, action potentials occur spread out evenly over the time the sound is on. You can't tell what the frequency is by looking at the timing of the action potentials.
Coding of the time waveform of sound: high-frequency complex. [Slide figure: pressure waveform, hair cell potential, and spike histogram vs. time.] If it's a high-frequency complex, the situation is similar. The pressure is varying rapidly, but the electrical potential in the hair cell cannot change that rapidly. Notice, though, that the overall intensity of the sound changes over time--sometimes the positive peaks are higher than at other times. As the overall intensity of the sound changes, you do get some changes in the hair cell electrical potential. That's because these overall intensity changes are fairly slow compared to the rapid changes that tell us the sound frequency. So the action potentials in the auditory nerve fibers follow these slower overall changes in the sound. You can't tell what frequencies are in there by looking at the timing of the action potentials, but you could tell that the sound was changing over time in some way--a point to which we will return momentarily.
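The idea that the hair cell follows the slow envelope but not the rapid fine structure can be sketched as a rectifier followed by a sluggish (low-pass) membrane. Everything below is an illustrative toy model; the carrier frequency, envelope frequency, and 1 ms membrane time constant are arbitrary assumptions:

```python
import numpy as np

fs = 40_000
t = np.arange(0, 0.05, 1 / fs)

# A high-frequency "complex": a 4000 Hz carrier whose amplitude
# (the envelope) varies slowly at 100 Hz.
envelope = 0.5 * (1 + np.sin(2 * np.pi * 100 * t))
sound = envelope * np.sin(2 * np.pi * 4000 * t)

# Toy hair-cell model: half-wave rectification, then a sluggish membrane,
# modeled as a one-pole low-pass filter with a ~1 ms time constant.
rectified = np.clip(sound, 0, None)
tau = 0.001
alpha = (1 / fs) / tau
potential = np.zeros_like(rectified)
for i in range(1, len(rectified)):
    potential[i] = potential[i - 1] + alpha * (rectified[i] - potential[i - 1])

# The potential tracks the 100 Hz envelope, not the 4000 Hz fine structure.
corr_env = np.corrcoef(potential, envelope)[0, 1]
corr_fine = np.corrcoef(potential, np.sin(2 * np.pi * 4000 * t))[0, 1]
print(corr_env, corr_fine)
```

The correlation with the envelope comes out high while the correlation with the carrier is near zero: the membrane is too slow to follow the fine structure, so only the overall intensity changes survive in the response.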
Where does the code for frequency come from? For low frequencies, a code for frequency is carried in the timing of auditory nerve action potentials. So for some situations, the auditory nerve carries a code for frequency in the timing of action potentials, which results from phase locking. We usually teach that phase locking does not occur for frequencies above about 5000 Hz, but it appears that phase locking deteriorates, and is not as useful, for frequencies above about 2000 Hz.
Where does the code for temporal characteristics come from? Finally, we now know something about how the temporal characteristics of a sound could be coded in the auditory nerve response. Again, this is based on phase locking.
Coding of the time waveform of sound: temporal characteristics. [Slide figure: two pressure waveforms and their spike histograms vs. time.] In the case of the high-frequency complex on the top right here, we learned that the hair cell can only follow the slow changes in the sound's intensity over time--well, those are the temporal characteristics of this sound. And the number and timing of action potentials in the auditory nerve carry that information as well. It may not be as obvious with the low-frequency complex--but its overall intensity is varying over time, and the number and timing of action potentials in the auditory nerve reflect this. So: for low-frequency sounds, both the frequency and the temporal characteristics are coded by the timing of action potentials. For high-frequency sounds, only the temporal characteristics can be coded. We illustrate this with complex sounds, because they change in amplitude over time. But even with a pure tone, the timing of action potentials is telling us about the temporal characteristics; it's just that by definition the pure tone doesn't vary in amplitude over time.
Traveling wave and active mechanism: the other code for frequency. So if we can't code high frequencies by the timing of action potentials, how can we code them? The answer is the place code: different places along the basilar membrane respond maximally to different frequencies. This is accomplished through the mechanical properties of the basilar membrane itself and sharpened by the action of the outer hair cells.
Basilar membrane motion: base to apex. So if we played just one cycle of a tone into the ear, the response of the basilar membrane would look (vaguely) like this. The motion looks like a wave that travels up the membrane, so it's referred to as a traveling wave. The size of the wave increases as it moves up the membrane--to a certain point--and then the wave quickly disappears. The point at which the wave reaches its greatest size (i.e., where basilar membrane displacement is greatest) varies systematically with frequency. High frequencies produce the greatest response near the base of the basilar membrane and low frequencies produce the greatest response near the apex. Because bigger basilar membrane displacements lead to bigger displacement of the stereocilia on the hair cells, which leads to a bigger electrical response in the hair cells, which leads to a greater amount of neurotransmitter release--you get more action potentials in the nerve fibers innervating the portion of the basilar membrane that is displaced the most. So by looking at which nerve fibers are responding, you can tell what the frequency is. This code works for ALL frequencies--low and high--but it is apparently not as accurate a code for low frequencies as the temporal code is.
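One standard way to put numbers on this place code is Greenwood's frequency-position map for the human cochlea; this formula is an outside reference, not something given in the lecture:

```python
def greenwood_cf(x_fraction):
    """Greenwood's human frequency-position map (an assumption here,
    not from the lecture): best frequency in Hz at fractional distance
    x_fraction from the apex (0 = apex, 1 = base)."""
    # Commonly cited human constants: A = 165.4, a = 2.1, k = 0.88.
    return 165.4 * (10 ** (2.1 * x_fraction) - 0.88)

# Apex responds best to low frequencies, base to high frequencies.
print(round(greenwood_cf(0.0)), "Hz near the apex")
print(round(greenwood_cf(1.0)), "Hz near the base")
```

With these constants the map spans roughly 20 Hz at the apex to about 20,000 Hz at the base, matching the ordering described above: high frequencies at the base, low frequencies at the apex.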
Traveling wave. So if we played a continuous high-frequency tone into the ear, basilar membrane motion might look something like the panel on the left; a continuous low-frequency tone might produce basilar membrane displacement like the panel on the right. You'll notice that the 8000 Hz traveling wave seems to be faster than the 1000 Hz traveling wave--and it is. Remember that each point on the basilar membrane is moving up and down at the frequency of stimulation. So if you focus on one position on the basilar membrane rather than following the wave, you'll see that position going up and down faster in the left panel than in the right panel.
Traveling wave. And of course for complex sounds, there will be peaks in the traveling wave at lots of places. This example is just for a complex of two tones, at 8000 and 1000 Hz.
The active mechanism. This is the last topic about basic sound coding. It turns out that if the point of maximum displacement in the traveling wave were determined only by the stiffness of the basilar membrane at different positions, the maximum displacement would be pretty broad--covering a big space on the basilar membrane. If that were the case, then there would be a lot of overlap between the responses to different frequencies and it would be hard to use the place code to figure out which frequencies are in there. It is the outer hair cells that make the maximum displacement of the basilar membrane even bigger at their particular location, so that the biggest displacement occurs only over a restricted section of the basilar membrane. We believe that the outer hair cells accomplish this mechanically--by changing the length of their cell bodies at the frequency of stimulation, or perhaps by moving their stereocilia at that frequency. This vibration of the hair cell feeds back into the basilar membrane motion, making it bigger at a particular spot.
How are these three characteristics of sound represented in the auditory nerve response? Intensity: the combined firing rate of auditory nerve fibers with the same CF. Frequency: the place code and the temporal code. Temporal characteristics: phase locking, but with limitations.
So the message in the auditory nerve is sort of like a spectrogram. [Slide figure: a speech spectrogram, with frequency corresponding to the position of an auditory nerve fiber along the basilar membrane.] (In a spectrogram, frequency is on the y-axis and time is on the x-axis. The intensity of sound is indicated by how dark the trace is.) In the auditory nerve, the frequency axis is replaced by position along the basilar membrane. Within each nerve fiber (which is carrying information about a narrow band of frequencies), the amount of activity changes over time as the components of the sound change in intensity and frequency. This spectrogram is of speech--notice how much the intensity and frequency change over time--that is important information in speech.
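The spectrogram analogy can be sketched in code: slice a signal into short windows and take the spectrum of each window, so rows are frequency and columns are time. The rising tone below is just a stand-in for a formant transition; all the numbers are arbitrary choices:

```python
import numpy as np

fs = 8000
t = np.arange(0, 1.0, 1 / fs)
# A tone whose frequency rises over time (500 -> 1500 Hz), loosely
# like a formant transition in speech.
freq = 500 + 1000 * t
signal = np.sin(2 * np.pi * np.cumsum(freq) / fs)

# Short-time Fourier transform: windowed slices, one spectrum per slice.
win = 256
hop = 128
frames = [signal[i:i + win] * np.hanning(win)
          for i in range(0, len(signal) - win, hop)]
spec = np.abs(np.fft.rfft(frames, axis=1)).T   # shape: (freqs, times)

# The peak frequency climbs from early frames to late frames.
freqs = np.fft.rfftfreq(win, 1 / fs)
early_peak = freqs[spec[:, 0].argmax()]
late_peak = freqs[spec[:, -1].argmax()]
print(early_peak, "->", late_peak)
```

Tracking the dark band in such a display over time is exactly the kind of information the lecture says the brain reads out of the auditory nerve's "spectrogram."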
Hearing in a nutshell. But to return to hearing in a nutshell--once the brain gets this message from the ear, it still has a lot of things to do with it, like figuring out which frequencies are coming from which sound sources, what the sound sources are, where the sound sources are, and finally what they mean. I want to talk very briefly about two of these processes: sound source localization and sound source segregation.
Sound source segregation. So the basic sound source segregation problem is this: the sound from several sound sources is intermingled in the ear. The sources often have overlapping spectra. Somehow we figure out that certain frequency components come from one source, while other components come from a different source. People often talk about this as the cocktail party problem. Lots of people are talking at the cocktail party, but you are still able to pull out the voice of one talker to follow--and you can even switch between different talkers if you want to. So there has to be some information in the message coming up the auditory nerve that allows us to do this.
Cues that could be used to segregate components into sources: spectral separation, spectral profile, harmonicity, spatial separation, temporal separation, temporal onsets and offsets, and temporal modulations. These are some of the types of acoustic information that we might use to group frequency components together. Spectral separation refers to how far apart the components are in frequency. In general, the components from a single source aren't very far apart. (Of course, two components can be close together and still not come from the same source.) Spectral profile refers to the fact that the components from a single source tend to maintain the same intensity ratio to each other when the level of the sound changes. So even though the intensity of the sound is greater, the intensities of the first and second components change together so that the first is always 5 dB higher than the second, for example. Harmonicity refers to the fact that for some sorts of sources, the components from that source are harmonically related--they are integer multiples of the same fundamental frequency. Spatial separation refers to the fact that the components coming from different sources will be heard at different spatial locations. The last three cues refer to temporal properties of the components: Do they occur around the same time? Do they come on and go off together? Do they change over time in the same way?
Hearing in a nutshell and sound source segregation. So I want to make three points about sound source segregation. First, it is clearly a neural process. Some parts of the brain have got to be finding and grouping together the components from the same source. BUT--second, to be able to have sound source segregation, the ear has to provide information about the frequency, intensity, and temporal characteristics of the sound. If peripheral coding is poor, sound source segregation will be poor. The third point is this: sometimes it is hard to tell the difference between a problem in sound source segregation and a problem in selective attention. To find out if you can segregate these sound sources, I might ask you to tell me what the people are saying. If you can't segregate their voices from the sounds produced by the frog and bus, you won't be able to do that. But imagine that you can segregate the sources, but that you keep getting distracted from the voices by the sound of that cute frog. That is, you can pick out the voices, but you can't focus your attention on them. That would be a problem with selective attention. So if a person (say an infant) doesn't seem to segregate sounds, it will look very similar to a problem in selective attention.
Acoustic cues used in localization: interaural intensity differences, interaural time differences, and spectral shape cues. The final review topic is sound localization. If we use the spatial location of the source to figure out which components go together, we must have some way of figuring out where the source is. The information we use to localize sounds comes from the ears, but the location has to be calculated by the brain. Sound is more intense at the ear nearer the source, and the difference gets bigger as the source moves away from the midline. So the intensity difference between the ears can be used to tell where the source is. Sound also arrives first at the ear nearer the source, and that difference gets bigger as the source moves away from the midline. So the arrival-time difference between the ears can be used to tell where the source is. Finally, remember that the external ear shapes the spectrum of sound coming into the ear. It does that differently depending on where the source is located. So information about the shape of the sound spectrum can be used to tell where the source is. All three types of information are available for position in azimuth (the horizontal plane). Only spectral shape cues are available for localizing sounds in elevation (the vertical plane).
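For the interaural time difference, a common back-of-the-envelope model is Woodworth's spherical-head formula. This is an illustrative approximation, not something given in the lecture, and the head radius is an assumed typical value:

```python
import math

def itd_seconds(azimuth_deg, head_radius_m=0.0875, c=343.0):
    """Woodworth's spherical-head approximation of the interaural time
    difference: ITD = (r / c) * (theta + sin(theta)) for a source at
    azimuth theta. Head radius and speed of sound are assumed values."""
    theta = math.radians(azimuth_deg)
    return (head_radius_m / c) * (theta + math.sin(theta))

# The ITD is zero at the midline and grows as the source moves to the side.
print(f"{itd_seconds(0) * 1e6:.0f} microseconds at 0 degrees")
print(f"{itd_seconds(90) * 1e6:.0f} microseconds at 90 degrees")
```

With these assumptions the ITD grows from 0 at the midline to roughly 650-660 microseconds for a source directly to the side, which matches the lecture's point that the time difference gets bigger as the source moves away from the midline.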
For next time: read pp. 1-14 of the Human auditory development chapter, "Intro" and "Frequency representation."