Speech processing schemes for cochlear implants
Stuart Rosen
Professor of Speech and Hearing Science
Speech, Hearing and Phonetic Sciences
Division of Psychology & Language Sciences

What you're in for
- How does an implant work?
- What does an implant do to speech sounds?
- What works well in implants?
- What works poorly in implants?

Who are cochlear implants for?
- People who receive relatively little benefit from a hearing aid in the implanted ear.
- Implants seem to work best in:
  - adults who had a significant period of relatively good hearing before becoming profoundly deaf, and who developed good language
  - children who are young enough to develop language through an implant

The bottom line
- Cochlear implants effectively bypass the cochlea by artificially stimulating the auditory nerve fibres directly.
- So you still need some residual auditory nerve, although auditory brainstem implants (ABI) may be appropriate when no auditory nerve is present.
Components of a Cochlear Implant System
[Figure labels: speech processor; implant; the implant in place; implanted radio receiver; electrode inserted in the cochlea; microphone; transmitter; electrode array]
1. Sound is received by the microphone of the speech processor.
2. The sound is digitized, analyzed and transformed into coded signals.
3. Coded signals are sent to the transmitter.
4. The transmitter sends the code across the skin to the internal implant, where it is converted to electric signals.
5. Electric signals are sent to the electrode array to stimulate the residual auditory nerve fibres in the cochlea.
6. Signals travel to the brain, carrying information about sound.
The electrode array
[Videos: two Winston videos; electrode insertion]

What are the essential purposes of a speech processor?
- To transduce acoustical signals into an electrical form.
- To extract the appropriate speech information.
- To convert (or code) the resulting electrical signals into a form appropriate for stimulation of the auditory nerve.

What other functions can and might be implemented in a speech processor?
- Enhancing speech features that contribute most to speech intelligibility.
- Minimising the effects of background noise.
- The possibility of different processing schemes for different situations.
What should an implant do?
- Mimic the most important functions of the normal ear.

So what does a normal ear do?
- frequency analysis
- amplitude compression
- preservation of temporal features

Frequency analysis
[Figure: basilar membrane motion in response to two sinusoids of different frequency]

Compression
[Figure: compression revealed through input/output functions on the basilar membrane; figure label: 10 kHz]
Preservation of temporal features
[Figure: limits on temporal features, Joris et al. 2004; slower features too (4 Hz modulations)]

Think of the ear, not as an organ, but as a signal processor
- frequency shaping
- frequency analysis & compression
- a CI substitutes for all this

Common elements in speech processing
- A microphone to transduce acoustic signals into electrical ones.
- Amplitude compression to address the very limited dynamic range of electrocochlear stimulation.
- Use of the place principle for multiple electrodes (mapping low to high frequency components onto apical to basal cochlear places).

But speech processing schemes vary significantly in other ways
- Pulsatile vs. continuously varying ("wavy") stimulation.
  - Not to be confused with analogue vs. digital implementations.
  - All electrical stimulation is analogue.
- Simultaneous vs. non-simultaneous presentation of currents to different electrodes.
  - Non-simultaneous stimulation requires pulsatile stimulation.
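The amplitude compression listed under the common elements can be illustrated with a small sketch. This is only one plausible mapping (a logarithmic one), not the scheme of any particular device; the function name, the 60 dB input range and the electrical limits `t_level` and `c_level` (in arbitrary units) are all assumptions made for illustration.

```python
import math

def compress(envelope, env_min=0.001, env_max=1.0, t_level=100.0, c_level=200.0):
    """Map an acoustic envelope value onto an electrical level with a log curve.

    All names and defaults are illustrative assumptions: env_min..env_max is
    the acoustic input range (a 60 dB span here), and t_level..c_level is a
    much narrower electrical output range in arbitrary units.
    """
    # Clip the input so out-of-range sounds cannot leave the output range.
    clipped = min(max(envelope, env_min), env_max)
    # Position of the input within the acoustic range on a log scale (0 to 1).
    fraction = math.log(clipped / env_min) / math.log(env_max / env_min)
    return t_level + fraction * (c_level - t_level)
```

With these assumed values, an input halfway through the range in dB (30 dB above `env_min`) lands halfway between `t_level` and `c_level`, so a wide acoustic range is squeezed into a narrow electrical one.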
Multi-channel systems
- All contemporary systems present different waveforms to different electrodes to mimic the frequency analysis of the normal mammalian cochlea.
- Think of the peripheral auditory system as analogous to a filter bank.

What is a filter bank?
- A set of bandpass filters with centre frequencies covering the desired frequency range.

What does a filter bank do to a speech waveform?
[Figure: narrow bands of speech at different frequencies; individual outputs from a 6-channel filter bank]
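A filter bank of this kind can be sketched in a few lines. The example below is a minimal illustration, assuming second-order band-pass filters (standard biquad formulas) and the six band edges that appear on the compressed analogue slide in this deck; real processors use steeper filters, and the sampling rate and function names are assumptions.

```python
import math

def bandpass_coeffs(lo_hz, hi_hz, fs_hz):
    """Normalised second-order band-pass (biquad) coefficients for one channel."""
    centre = math.sqrt(lo_hz * hi_hz)      # geometric-mean centre frequency
    q = centre / (hi_hz - lo_hz)           # narrower band gives a higher Q
    w0 = 2.0 * math.pi * centre / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)     # feed-forward terms
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)  # feedback terms
    return b, a

def filter_signal(signal, b, a):
    """Run a list of samples through the biquad (direct form I)."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in signal:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out

def filter_bank(signal, band_edges, fs_hz):
    """Return one band-limited output per (low, high) band edge pair."""
    return [filter_signal(signal, *bandpass_coeffs(lo, hi, fs_hz))
            for lo, hi in band_edges]

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

FS = 20000  # assumed sampling rate in Hz
BANDS = [(100, 400), (400, 1000), (1000, 2000),
         (2000, 3500), (3500, 5000), (5000, 8000)]

def tone(freq_hz, n=4000):
    """A pure tone, used here as a stand-in for one frequency component."""
    return [math.sin(2.0 * math.pi * freq_hz * i / FS) for i in range(n)]

def loudest_channel(signal):
    """Index of the channel whose output has the largest RMS level."""
    levels = [rms(ch) for ch in filter_bank(signal, BANDS, FS)]
    return levels.index(max(levels))
```

Feeding in a 300 Hz tone and then a 3000 Hz tone shows the point of the analogy: each tone comes out most strongly from the channel whose band contains it, just as different frequencies excite different places along the cochlea.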
The filter bank analogy
- Imagine each afferent auditory nerve fibre has a bandpass filter attached to its input, with centre frequencies decreasing from base to apex.

The no-brainer cochlear implant speech processing strategy
- Use an electronic filter bank to substitute for the auditory filter bank (the mechanics of the basilar membrane).

A simple speech processing scheme for a cochlear implant: Compressed Analogue (CA)

Noise-excited vocoding as a simulation of what a CI might sound like
[Diagram: acoustic signal, split into six bands (100-400 Hz, 400-1000 Hz, 1000-2000 Hz, 2000-3500 Hz, 3500-5000 Hz, 5000-8000 Hz), giving the electrical signal]

How does a CI hear?
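One step of noise-excited vocoding can be sketched as follows: take the output of a single band, extract its slowly varying envelope, and use that envelope to modulate noise. This is a simplified illustration under stated assumptions. The 30 Hz smoothing cutoff and the function names are assumptions, and a full vocoder would also band-limit the noise to the same band and sum all channels.

```python
import math
import random

def extract_envelope(band_signal, fs_hz, cutoff_hz=30.0):
    """Full-wave rectify, then smooth with a one-pole low-pass filter."""
    # Smoothing coefficient for the assumed cutoff frequency.
    smoothing = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs_hz)
    envelope, state = [], 0.0
    for sample in band_signal:
        state += smoothing * (abs(sample) - state)
        envelope.append(state)
    return envelope

def noise_carrier(envelope, seed=0):
    """Replace the band's fine structure with noise that follows its envelope."""
    rng = random.Random(seed)
    return [e * rng.uniform(-1.0, 1.0) for e in envelope]
```

The output keeps the band's amplitude fluctuations but discards its fine structure, which is roughly what the simulation is meant to convey about implant hearing.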
Separate channels in a 6-channel simulation
... and when summed together.
"Children like strawberries."
Never mind the quality... feel the intelligibility.

Effects of channel number
[Channel numbers shown: 1, 2, 4, 8, 16]

Connected Discourse Tracking shows that an unintelligible auditory signal can nevertheless significantly improve lipreading ability.
[Figure labels: lipreading alone; lipreading with voice pitch]
Intonation & voicing can be important aids to lipreading.

A much more common speech processing scheme: Continuous Interleaved Sampling (CIS)
from Philipos Loizou: http://www.utdallas.edu/~loizou/cimplants/tutorial/
- Note the similarity of CIS, noise vocoding and normal cochlear processing.

Continuous Interleaved Sampling
[Diagram: channels at 0.5 kHz, 1.0 kHz, 2.0 kHz and 4.0 kHz; the filtering stage is just like CA]
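The "interleaved" part of CIS can be made concrete with a short sketch: within each stimulation frame, every channel is given its own time slot, so no two electrodes are ever pulsed at the same moment. The function name, the tuple layout and the use of microseconds are assumptions for illustration, and the envelopes are taken to be already extracted and compressed.

```python
def cis_pulse_schedule(envelopes, frame_period_us):
    """Turn per-channel envelope frames into a non-overlapping pulse sequence.

    envelopes[c][k] is the (already compressed) envelope of channel c in
    frame k. Returns a list of (time_us, channel, amplitude) tuples.
    """
    n_channels = len(envelopes)
    n_frames = len(envelopes[0])
    # Each channel gets its own slot within the frame, so pulses never coincide.
    slot_us = frame_period_us / n_channels
    schedule = []
    for frame in range(n_frames):
        for channel in range(n_channels):
            time_us = frame * frame_period_us + channel * slot_us
            schedule.append((time_us, channel, envelopes[channel][frame]))
    return schedule
```

With four channels and a 1000 microsecond frame, each channel is pulsed 1000 times per second and successive pulses on different electrodes are 250 microseconds apart.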
CIS in detail
[Figure: CIS stimulation pattern]

Necessity is the mother of invention: Spectral Peak Strategy, SPEAK (n of m strategies)
- The problem: how could devices running at relatively slow rates be used for CIS, which required high rates of pulsatile stimulation?
- The solution: pick and present pulses only at the significant peaks in the spectrum.
[Diagram: acoustic signal, passed to 20 programmable filters (100-200 Hz, 200-350 Hz, 350-500 Hz, 500-800 Hz, etc., up to 7500-8000 Hz), then to a spectral peak extractor (sample of sound spectrum, 100 Hz to 8 kHz), giving the electrical signal]
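The peak-picking idea behind n-of-m strategies reduces to a simple selection step. The sketch below is a minimal illustration, not the SPEAK implementation: given the level in each of m filter channels for one analysis frame, it keeps only the n largest. The function name and example levels are made up.

```python
def pick_maxima(channel_levels, n):
    """Return the indices of the n largest channel levels, in channel order."""
    ranked = sorted(range(len(channel_levels)),
                    key=lambda c: channel_levels[c], reverse=True)
    return sorted(ranked[:n])

# Example: six channel levels for one frame, keeping the three largest.
levels = [0.1, 0.9, 0.3, 0.8, 0.05, 0.7]
selected = pick_maxima(levels, 3)
```

Only the electrodes in `selected` would be stimulated in that frame, which is how a device with a limited overall pulse rate can still cover many filter channels.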
[Figure: SPEAK stimulation pattern]

SPEAK in quiet
"...the ladder's near the door..."

Advantages of peak-picking in noise
- Electrodograms show advantages of peak-picking in pink noise. ("The clown had a funny face.")
- "The matches lie on the shelf."
- "The car engine's running."
[Figure: electrodograms in two panels, A (6 of 20) and B (20 of 20); electrodes 1-20 plotted against a time axis running from 499 to 2499. Kendall (1998 PhD Thesis)]
Advanced Combination Encoders: ACE (a faster n of m)
[Diagram: acoustic signal, passed to 20 programmable filters (100-200 Hz, 200-350 Hz, 350-500 Hz, 500-800 Hz, etc., up to 7500-8000 Hz), then to a spectral peak extractor (sample of sound spectrum, 100 Hz to 8 kHz), giving the electrical signal]
- High spectral content: lots of electrodes
- Optimal sites stimulated
- Optimal temporal content
- Stimulation rate: 250 Hz-2400 Hz
- Any of 22 channels may be stimulated, and up to 20 are chosen per spectral sweep.
[Figure: ACE stimulation pattern]

Other modifications to speech processing
- Current steering
  - requires simultaneous stimulation
  - e.g., AB HiRes 120
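Current steering can be sketched as a weighted split of one current between two neighbouring electrodes, delivered at the same time. This is an illustrative sketch only: the function names and the weighting parameter `alpha` are assumptions, and the figures of 16 contacts with 8 steering steps per adjacent pair are an assumption about how a count of 120 could arise, not something stated on the slide.

```python
def steer_current(total_current, alpha):
    """Split a current between the two electrodes of an adjacent pair.

    alpha = 0 puts all the current on the first electrode, alpha = 1 puts it
    all on the second; values in between are intended to place the peak of
    excitation somewhere between the two contacts.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return (1.0 - alpha) * total_current, alpha * total_current

def count_virtual_channels(n_electrodes, steps_per_pair):
    """Number of steered sites available across all adjacent electrode pairs."""
    return (n_electrodes - 1) * steps_per_pair
```

Because both electrodes of a pair carry current at once, this approach cannot be combined with strictly non-simultaneous stimulation, which is the constraint the slide points out.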
What implant users are good at
[Figure: spectral dynamics are encoded in across-channel envelope modulations; example sentence "These days a chicken leg is a rare dish"]
- Following relatively slow modulations in amplitude
  - rates up to 20 Hz or so lead to high levels of intelligibility
- As long as there is a reasonable degree of analysis into distinct frequency regions
  - nowhere near as good as a normal ear (20+ channels)
  - but not too bad (3-6 channels?)

What implant users are bad at
- Following fast modulations in amplitude
- Hearing out the periodicity in speech melody

What do we mean by speech melody?
[Figure: waveform, spectrogram and melody for "The New Zealand rugby team is called the All Blacks"]
Why is speech melody (voice pitch) important to hear?
[Figure: melody coded as periodicity in rapid within-channel patterns; example sentence "These days a chicken leg is a rare dish"]
- Contributes to speech intelligibility in all languages
- A good supplement to lipread information
- May play an important role in separating speech from background noises
- Appears to play a more crucial role for the young child developing language
- Crucial in so-called tone languages

The representation of melody can be messy!

What happens when an electrode is incompletely inserted?
Simulations of incomplete insertions
[Shifts simulated: 0 mm, 2.2 mm, 4.3 mm, 6.5 mm]

Can the deleterious effects of spectral shifting be overcome over time?
[Figure: words in sentences, pre-training vs. post-training, over 3 hours of experience using CDT; normal listeners in simulations: Rosen et al. 1999 J Acoust Soc Am]

Hair cell substitution?

Why is a CI not as good as normal hearing?
- It's a damaged auditory system, presumably with accompanying neural degeneration (e.g. dead regions).
- Electrodes cannot extend fully along the length of the basilar membrane (BM), so there is mis-matched tuning and restricted access to apical regions (where nerve survival is typically greatest).
- 3000 IHCs vs. a couple of dozen electrodes, hence poorer frequency selectivity.
- Current spreads across the BM, hence poorer frequency selectivity.
- Less independence of firing across nerve fibres, which appears to affect temporal coding.
- Small dynamic ranges, but intensity JNDs are not correspondingly smaller, hence fewer discriminable steps in loudness.
- But good temporal and intensity resolution.
from Lynne Werner: http://depts.washington.edu/sphsc461/ci_notes.htm
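The size of the mismatch produced by a shallow insertion can be estimated with Greenwood's frequency-position function for the human cochlea. The sketch below uses the commonly quoted human constants (A = 165.4, a = 0.06 per mm, k = 0.88, with place measured from the apex). It is an illustration of the idea under those assumptions and may not match the exact procedure used in the simulations cited above; the function names are made up.

```python
import math

# Greenwood human constants, with place x measured in mm from the apex.
A, a, K = 165.4, 0.06, 0.88

def place_to_frequency(x_mm):
    """Characteristic frequency (Hz) at a place x_mm from the apex."""
    return A * (10.0 ** (a * x_mm) - K)

def frequency_to_place(freq_hz):
    """Inverse mapping: place (mm from the apex) tuned to freq_hz."""
    return math.log10(freq_hz / A + K) / a

def shifted_frequency(freq_hz, shift_mm):
    """Frequency normally analysed at the place shift_mm basal to freq_hz's place."""
    return place_to_frequency(frequency_to_place(freq_hz) + shift_mm)

# A 1000 Hz component delivered at each of the four simulated shifts.
shifts_mm = [0.0, 2.2, 4.3, 6.5]
heard_as = [shifted_frequency(1000.0, s) for s in shifts_mm]
```

Under these assumptions, a component at 1000 Hz delivered 6.5 mm too far towards the base falls at a place normally tuned to well over 2 kHz, which gives a sense of how large the spectral shift in the most extreme condition is.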
Ways to improve a cochlear implant
- Need more independent frequency channels
  - clever schemes for steering stimulation?
  - better electrode design?
  - neural regeneration?
- Need to signal voice melody better
  - better transmission of so-called fine-structure or rapid envelope information
- In the right users, complement the CI by effective exploitation of residual hearing
  - a cheap solution
  - what even poor acoustic hearing does well, the CI does poorly, and vice versa

To conclude
- Cochlear implants have been around for no more than about 50 years, with most work beginning in the 1970s.
- Outstanding progress, both in basic research and clinical implementation, in that time.
- And there's lots more to do!