Auditory Signal Processing: Physiology, Psychoacoustics, and Models. Pressnitzer, D., de Cheveigné, A., McAdams, S., and Collet, L. (Eds). Springer Verlag, 2004.

Discrimination of temporal fine structure by birds and mammals

Marjorie Leek 1, Robert Dooling 2, Otto Gleich 3, and Micheal L. Dent 4

1 Walter Reed Army Medical Center, Marjorie.Leek@na.amedd.army.mil
2 Department of Psychology, University of Maryland, dooling@psyc.umd.edu
3 ENT Department, University of Regensburg, otto.gleich@klinik.uni-regensburg.de
4 Department of Physiology, University of Wisconsin, dent@physiology.wisc.edu

1 Introduction

In a series of studies involving masking and discrimination of variants of Schroeder-phase waveforms, we have reported two findings in birds. First, in contrast to humans, birds show nearly identical masking by positive- and negative-Schroeder-phase maskers. Second, their ability to discriminate the fine structure in the positive- and negative-phase stimuli is maintained for very short fundamental periods. While humans require periods on the order of 3-4 ms, several bird species can make these discriminations over periods as short as 1-2 ms. We have suggested that the discrimination of fine structure over very short periods may reflect a generally enhanced capability in birds to process extremely precise temporal differences (Dooling, Leek, Gleich, and Dent 2002). In that earlier paper, we argued that the differences in compound action potential (CAP) responses in birds to positive and negative Schroeder complexes with different fundamental frequencies parallel the discriminability of those complexes. Here, we extend those earlier studies by asking how well changes in the distribution of energy throughout a harmonic period can be discriminated. Such discriminations may be based either on changes in on-off temporal ratios within the waveform periods (duty cycle) or on rates of change of instantaneous frequency over time (rates of frequency sweeps).

2 Methods

2.1 Stimuli

All stimuli were created digitally at a sampling rate of 40 kHz and stored as files for playback during the experiments. Harmonic complexes were generated with 49 equal-amplitude components with component frequencies from 200 to 5000 Hz. The fundamental frequency was 100 Hz. Stimulus duration was 260 ms, including a
20-ms cosine-squared rise-fall time. Component starting phases were selected according to a modification of the algorithm given by Schroeder (1970):

$$\theta_n = C \pi n (n + 1) / N \qquad (1)$$

where θ_n represents the phase of the n-th harmonic, N is the total number of harmonics, and C is a scalar. The original negative and positive Schroeder-phase waveforms used in previous studies of masking by harmonic complexes (e.g., Leek, Dent, and Dooling 2000) are generated by assigning C a value of -1 or +1, respectively. Assigning a value of 0 to the scalar produces a cosine-phase wave, with all component phases set to 0 degrees. Scalars between 0 and ±1.0 produce periodic temporal waveforms whose silent interval within each period shrinks as the scalar magnitude grows. Fig. 1 shows samples of positive-Schroeder waveforms constructed with several scalar values. The negative-phase waveforms of the same scalar are identical, but reversed in time.

Fig. 1. Examples of waveforms created by varying C in Eq. 1 (panels: scalar = 0.0, 0.2, 0.8, and 1.0).

The instantaneous frequencies within these waveforms rise (negative scalars) or fall (positive scalars) over the course of each period. As the low-energy portions of the stimuli increase in duration, this sweep in frequency occurs over shorter time sections of the period, producing increasing rates of frequency sweep. Smaller scalars produce faster frequency sweeps, and larger scalars produce slower frequency sweeps.

Temporal characteristics may be extracted from the scalar stimuli in order to translate the data into a more intuitive measure of the distribution of energy across periods. To assign a duty cycle to each stimulus, the duration of the frequency sweep for each stimulus was assumed to be the on-time within the period. The ratio of the on-time to the full period (10 ms for these stimuli) was taken as the duty cycle, reflecting the distribution of energy in the waveforms across each period.
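The stimulus construction and the duty-cycle idea can be illustrated with a minimal sketch. It assumes a 100-Hz fundamental, harmonic numbers 2 through 50 (49 components), and 400 samples per 10-ms period; the digits in the extracted text are damaged, so these values are assumptions. The function names are hypothetical, the sweep-based duty cycle uses a stationary-phase approximation that is not taken from the chapter, and the envelope-based summary (an equivalent rectangular duration of the squared envelope) is one possible choice rather than the authors' procedure.

```python
import cmath
import math

HARMONICS = range(2, 51)      # assumed harmonic numbers 2..50 (49 components)
SAMPLES_PER_PERIOD = 400      # assumed: 40-kHz sampling of one 10-ms period


def schroeder_phases(scalar, harmonics=HARMONICS):
    """Eq. 1: theta_n = C * pi * n * (n + 1) / N, N = number of components."""
    n_total = len(harmonics)
    return {n: scalar * math.pi * n * (n + 1) / n_total for n in harmonics}


def analytic_period(scalar):
    """One period of sum_n exp(j * (2*pi*n*f0*t + theta_n)).

    The real part is the equal-amplitude harmonic complex; the magnitude is
    its envelope (what a Hilbert transform would give for this signal).
    """
    phases = schroeder_phases(scalar)
    samples = []
    for k in range(SAMPLES_PER_PERIOD):
        frac = k / SAMPLES_PER_PERIOD  # time as a fraction of the period
        samples.append(sum(cmath.exp(1j * (2 * math.pi * n * frac + theta))
                           for n, theta in phases.items()))
    return samples


def waveform_period(scalar):
    """Real-valued waveform for one period."""
    return [z.real for z in analytic_period(scalar)]


def sweep_duty_cycle(scalar, harmonics=HARMONICS):
    """Sweep duration / period from a stationary-phase approximation.

    Component n is centred near t_n = -C * (2n + 1) / (2 * N * f0), so the
    sweep from the lowest to the highest harmonic spans
    |C| * (n_max - n_min) / N of the period.
    """
    span = max(harmonics) - min(harmonics)
    return min(1.0, abs(scalar) * span / len(harmonics))


def envelope_duty_cycle(scalar):
    """Equivalent rectangular duration of the squared envelope (0..1)."""
    power = [abs(z) ** 2 for z in analytic_period(scalar)]
    mean_power = sum(power) / len(power)
    mean_power_sq = sum(p * p for p in power) / len(power)
    return mean_power ** 2 / mean_power_sq


if __name__ == "__main__":
    for c in (0.0, 0.2, 0.8, 1.0):
        print(c, round(sweep_duty_cycle(c), 3), round(envelope_duty_cycle(c), 3))
```

Under these assumptions both estimates grow roughly in proportion to the scalar magnitude, which matches the pairing of scalars with duty cycles on the axes of Fig. 2.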
Similar values could be extracted by calculating the envelope of each scaled waveform using a Hilbert transform.

2.2 Behavioral and physiological procedures

Three zebra finches (Taeniopygia guttata), two budgerigars (Melopsittacus undulatus), and three humans were tested on discrimination among selected waveforms. Birds were trained by operant conditioning and tested in a go/no-go task using a repeating background procedure and the method of constant stimuli (see Dooling and Okanoya 1995 for details of these procedures). Human listeners
were tested using the same procedures as the birds except that sounds were heard through earphones, and the pecking response was replaced by pushing buttons on a response box. On each block of 100 trials, waveforms with either a 0-scalar or a ±1.0-scalar were assigned as the background. The comparison stimuli on each block were intermediate scalar values, maintaining the sign of the scalars when the background scalar was ±1.0. All sounds were played through a loudspeaker in the free field at a sound pressure level that was randomly varied between 75 and 85 dB, measured at the location of the bird's head. The level of each stimulus was roved over this 10-dB range in order to reduce possible loudness cues to the discriminations. On each block of 100 trials, only one of the three background sounds (0, +1.0, or -1.0 scalars) was presented and the other sounds were used as targets.

Cochlear microphonics (CM) and compound action potentials (CAP) in response to each of the scaled Schroeder-phase waves were recorded in three budgerigars, one zebra finch, and three gerbils using standard procedures described previously (Dooling et al. 2002). Responses to each scalar stimulus were averaged over 1024 presentations, occurring at 2 per second. Following each set of presentations, responses to an inverted version of the stimulus were recorded. The CAP was extracted by adding the normal and inverted responses and scaling the sum by half, thereby cancelling the CM component. The CM was subsequently determined by subtracting the derived CAP from the trace recorded in response to the normal stimulus. The response was then averaged over the 10-ms fundamental period, and response amplitudes were measured from these period histograms. Both the peak-to-peak amplitude for the CAP and the total root-mean-square (rms) amplitude for the CM were calculated for each stimulus.
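The CAP/CM separation described above amounts to a few lines of arithmetic, sketched below. The function names and the synthetic traces are hypothetical and serve only as a self-check. The sketch assumes the CM reverses sign with stimulus polarity while the CAP does not, which is the premise of the add-and-halve step.

```python
import math


def separate_cap_cm(normal, inverted):
    """Split averaged traces into CAP and CM estimates.

    CAP = (normal + inverted) / 2 cancels the polarity-following CM;
    CM  = normal - CAP is what remains of the normal-polarity trace.
    """
    cap = [(a + b) / 2.0 for a, b in zip(normal, inverted)]
    cm = [a - c for a, c in zip(normal, cap)]
    return cap, cm


def period_average(trace, samples_per_period):
    """Fold a trace onto one fundamental period (a period histogram)."""
    n_periods = len(trace) // samples_per_period
    if n_periods == 0:
        raise ValueError("trace is shorter than one period")
    return [
        sum(trace[p * samples_per_period + k] for p in range(n_periods)) / n_periods
        for k in range(samples_per_period)
    ]


def peak_to_peak(values):
    """Peak-to-peak amplitude, used here for the CAP."""
    return max(values) - min(values)


def rms(values):
    """Root-mean-square amplitude, used here for the CM."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def make_synthetic_traces(samples_per_period=40, n_periods=5):
    """Hypothetical test data: a polarity-following sinusoid (CM-like) plus a
    polarity-independent negative bump once per period (CAP-like)."""
    normal, inverted = [], []
    for i in range(samples_per_period * n_periods):
        k = i % samples_per_period
        cm_like = 2.0 * math.sin(2 * math.pi * 3 * k / samples_per_period)
        cap_like = -5.0 * math.exp(-((k - 10) / 3.0) ** 2)
        normal.append(cap_like + cm_like)
        inverted.append(cap_like - cm_like)
    return normal, inverted


if __name__ == "__main__":
    normal, inverted = make_synthetic_traces()
    cap, cm = separate_cap_cm(normal, inverted)
    print("CAP p-p:", peak_to_peak(period_average(cap, 40)))
    print("CM rms:", rms(period_average(cm, 40)))
```

Because the split is linear, it can be applied either before or after folding onto the period; the sketch folds afterwards, following the order given in the text.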

3 Results

3.1 Behavioral measures of discrimination

Discrimination of the scaled stimuli from a repeating background of either the 0-scalar stimulus or the ±1.0-scalar stimuli is shown in Fig. 2. (Note that although only one 0-scaled stimulus was used, for convenience the data are referred to as positive or negative, to indicate the signs of the comparison stimuli in each set.) Data are averaged across birds (circles) and across humans (triangles), showing the contrast between small birds and humans on these discriminations. The duty cycles are shown on the abscissa (associated scalars are shown at the top) and the percent correct is plotted on the ordinate. Responses to positive and negative scaled waveforms are shown separately for each group (solid and open symbols, respectively). At both ends of the scalar continuum, birds can make finer discriminations than humans until the scalar values differ from the background by about 0.4-0.5 scalar units, at which point all subjects performed near 100% correct. The slopes of the psychometric functions are shallower for humans than for birds. Differences between the positive and negative scalar sets are small for the birds, but are somewhat larger in the human data for a background scalar of ±1.0. For that end of the continuum, in humans, the
discrimination is better for negative scalars than for positive, for nearly all target scalars.

Fig. 2. Percent correct discriminations as a function of duty cycle (%), for birds and humans with positive and negative scalars. The scalar values are shown along the top. Left panel shows discrimination from a scalar of 0.0; right panel shows discrimination from a scalar of ±1.0.

Thresholds (50% correct discriminations) are shown in Fig. 3 for each subject group and condition in terms of the duty cycle of the stimuli. Thresholds for comparisons to the ±1.0-scalars are subtracted from 100% for this display. Thresholds for all conditions for the birds fall in the range of duty cycles of 10-20%, while humans require a 30-40% duty cycle to support the discrimination. Separate ANOVAs for the two types of background indicate that there is a significant difference in discrimination performance between birds and humans (F(1,7)=25.3, p<.05 for the 0-scalar background; F(1,7)=7.9, p<.05 for the ±1.0-scalar background), but there were no significant differences between responses to positive and negative scalars, nor a significant interaction between sign of the stimuli and species (p>.05 for all).

Fig. 3. Threshold duty cycles (%) for discrimination by birds and humans, for positive and negative scalars with backgrounds of 0 and ±1.0.

In order to detect that the energy in a waveform is not all contained within the one small interval corresponding to the peak of the cosine-phase wave, energy must be spread out over at least 3-4 ms for humans. For birds, the energy needs to be spread over only 1-2 ms. Further, about the same difference in energy distribution is necessary to discriminate a target stimulus from a waveform with a very flat envelope (±1.0 scalars), in which energy is distributed across the entire period.
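How a threshold at 50% correct might be read from a psychometric function is sketched below. The chapter does not state how thresholds were interpolated, so linear interpolation between adjacent points is an assumption, the function names are hypothetical, and the numbers in the usage example are invented for illustration and are not data from this study. Expressing each target as a distance from the background duty cycle mirrors the "subtracted from 100%" convention used for the ±1.0-scalar background.

```python
def distance_from_background(target_duty_cycles, background_duty_cycle):
    """Duty-cycle difference (%) between each target and the background.

    With a 0-scalar background (about 0% duty cycle) this is the target's
    duty cycle; with a +/-1.0-scalar background (100%) it is 100 minus it.
    """
    return [abs(d - background_duty_cycle) for d in target_duty_cycles]


def threshold_at_criterion(distances, percent_correct, criterion=50.0):
    """Linearly interpolate the distance at which performance first reaches
    the criterion. Returns None if the function never crosses it from below.
    """
    points = sorted(zip(distances, percent_correct))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < criterion <= y1:
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    return None


if __name__ == "__main__":
    # Invented example values, not measurements from the chapter.
    peaky_background = distance_from_background([10, 20, 30, 40], 0)
    print(threshold_at_criterion(peaky_background, [20, 60, 90, 100]))
    flat_background = distance_from_background([90, 80, 70, 60], 100)
    print(threshold_at_criterion(flat_background, [10, 30, 70, 95]))
```

A fitted psychometric function would give smoother estimates than point-to-point interpolation; the simpler form is used here only to make the threshold definition concrete.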

3.2 CM and CAP responses to scaled stimuli

As has been reported previously (e.g., Dooling et al. 2002), the cochlear microphonic responses to these harmonic complexes follow the stimulus waveforms very precisely. Consistent with our previous data, the CM response in gerbils was much higher (4-6 µV) than in birds (5-1 µV). The CM rms amplitude showed only a small systematic increase from 0.1 to 1.0 scalars in gerbils and birds, and little difference in CM amplitude between the positive and negative scaled stimuli.

Fig. 4. Compound action potentials (peak-to-peak amplitude, µV) as a function of scalar, for birds and gerbils with negative and positive scalars.

The peak-to-peak amplitudes of the CAP responses to these stimuli are shown in Fig. 4. Compared to the CM, the overall amplitude of the CAPs is more similar between birds and gerbils, but there are systematic differences between the species in the pattern of responses across positive and negative scalars. The CAP amplitudes in response to the negative scalars in birds are considerably higher (4-7 µV) than the responses to positive stimuli, which are quite low (2-3 µV). Gerbil responses are intermediate, with scalars less than 0.5 showing a larger response to the negative (4-5 µV) than to the positive (3-4 µV) stimuli. These data show a more highly synchronized neural response in birds to the negative scalar stimuli than to the positive, as well as greater synchronization in birds than in gerbils to the negative stimuli. The smaller differences between responses to positive and negative waveforms in gerbils are reminiscent of our earlier findings describing smaller positive/negative differences in gerbils than in birds for Schroeder-phase stimuli with different fundamental frequencies.
These physiological differences between birds and gerbils corresponded well with the enhanced behavioral abilities of birds compared with humans to discriminate among Schroeder-phase stimuli with increasing fundamental frequency (Dooling et al. 2002). In the present experiment on scalar discrimination, the enhanced behavioral sensitivity of birds compared with humans remains. However, in comparing the CM and the CAP responses of birds and gerbils, there is no clue either in the CM or in the pattern of CAP amplitudes to these same stimuli that would provide insight into the physiological basis of this enhanced sensitivity to temporal fine structure in birds compared to mammals.

4 Discussion

The present research is an extension of earlier work showing that birds could discriminate between positive and negative Schroeder-phase stimuli at higher fundamental frequencies than humans and, correspondingly, that birds showed greater differences in CAP responses to positive and negative Schroeder-phase stimuli than did gerbils (Dooling et al. 2002). Here we approached the temporal resolution question from a different angle, asking whether there are species differences in sensitivity to temporal on/off characteristics of the stimuli or to the rates of change of instantaneous frequency over a given time period. The behavioral discrimination experiment shows that birds require very little difference within the fine structure of the temporal waveforms of harmonic complexes in order to discriminate them. Recall that these stimuli are identical in their long-term spectra, and in other characteristics such as number of components and fundamental frequency. When the phase selections are scaled, the only change within the waveforms involves the distribution of energy across the periods. With this converging discrimination task, we demonstrate again that whereas humans may require temporal discriminanda to differ by around 3 ms or more, birds can discriminate fine structure differences on the order of 1-2 ms.

We have argued earlier that the roots of these temporal discriminations must lie within the encoding provided by the hair cells or auditory nerves in the cochlea, an encoded discrimination of some sort that is preserved through the brainstem nuclei and the cortex in the ascending auditory pathway in order to produce the behavioral response. This argument was supported by our earlier study, in which there appeared to be a relationship between the bandwidths of the peripheral auditory system in mammals (humans) and the highest fundamental frequency that would support a discrimination between the positive and negative Schroeder-phase stimuli.
The data from birds were somewhat puzzling, however, since a similar relationship did not emerge. Here, the salient pattern found in the discrimination data, of better time resolution in birds compared with humans, is not obvious in the physiological data comparing birds and gerbils. Instead, the primary feature is that there are rather large differences in CAP amplitude between the negative and positive stimuli. Given that the magnitude of the CAP is determined by the degree of synchronization of firing among neurons in the auditory nerve, it would appear that these amplitudes reflect a greater degree of synchronization for the negative scaled stimuli than for the positive scaled stimuli. They also suggest a higher degree of synchronization for scalars closer to the 0-scalar than for the more dispersed ±1.0 scalars, as the amplitudes are usually higher for the lower-valued scalar stimuli. These observations are consistent with a synchronization of firing when the stimulus frequencies appear within the periods from low to high (i.e., for negative-scalar stimuli), and when all frequencies occur over a very short duration, as for the lower-valued scalars. Dau, Wegner, Mellert and Kollmeier (2000) attempted to synchronize neural firing by optimizing a chirp stimulus for the human cochlea for use in auditory brainstem response measurement. They described their optimized chirp as gliding in frequency from low to high, as the negative scalars do here, and
with a duration that is meant to reflect the neural response delays in the human ear. Those durations would no doubt be different in the bird ears measured here, and probably also in gerbils, because the size of the cochlear structures among all these species differs considerably. Nonetheless, neural delays increase from high to low frequencies in all these species, and therefore a negative scalar would certainly be necessary to compensate for the travel time. In the stimuli used here, frequency changes linearly over time within each cycle. Consequently, the different degrees of synchronization observed in the CAP data from birds and gerbils may reflect differences in frequency-specific cochlear response delays.

5 Conclusions

The present study confirms that birds show an enhanced ability to discriminate temporal fine structure in harmonic complex stimuli when compared with human abilities. The physiological basis of this enhanced performance is not seen in the CM or the CAP of either birds or gerbils (our human surrogate).

Acknowledgments

This work was supported by NIH Grants DC-00198 to RJD and DC-00626 to MRL. The opinions or assertions contained herein are the private views of the authors and are not to be construed as official or as reflecting the views of the Department of the Army or the Department of Defense.

References

Dau, T., Wegner, O., Mellert, V., and Kollmeier, B. (2000) Auditory brainstem responses with optimized chirp signals compensating basilar-membrane dispersion. J. Acoust. Soc. Am. 107, 1530-1540.
Dooling, R.J., Leek, M.R., Gleich, O., and Dent, M.L. (2002) Auditory temporal resolution in birds: Discrimination of harmonic complexes. J. Acoust. Soc. Am. 112, 748-759.
Dooling, R.J. and Okanoya, K. (1995) The Method of Constant Stimuli in testing auditory sensitivity in small birds. In G.M. Klump, R.J. Dooling, R.R. Fay and W.C.
Stebbins (Eds), Methods in Comparative Psychoacoustics. Birkhäuser Verlag, Basel, pp. 161-169.
Leek, M.R., Dent, M.L., and Dooling, R.J. (2000) Masking by harmonic complexes in budgerigars (Melopsittacus undulatus). J. Acoust. Soc. Am. 107, 1737-1744.
Schroeder, M.R. (1970) Synthesis of low-peak-factor signals and binary sequences with low autocorrelation. IEEE Trans. Inf. Theory IT-16, 85-89.