LISTENER EXPERIENCE AND PERCEPTION OF VOICE QUALITY


Journal of Speech and Hearing Research, Volume 33, March 1990

LISTENER EXPERIENCE AND PERCEPTION OF VOICE QUALITY

JODY KREIMAN, BRUCE R. GERRATT, KRISTIN PRECODA
VA Medical Center, West Los Angeles, and UCLA School of Medicine

Five speech-language clinicians and 5 naive listeners rated the similarity of pairs of normal and dysphonic voices. Multidimensional scaling was used to determine the voice characteristics that were perceptually important for each voice set and listener group. Solution spaces were compared to determine if clinical experience affects perceptual strategies. Naive and expert listeners attended to different aspects of voice quality when judging the similarity of voices, for both normal and pathological voices. All naive listeners used similar perceptual strategies; however, individual clinicians differed substantially in the parameters they considered important when judging similarity. These differences were large enough to suggest that care must be taken when using data averaged across clinicians, because averaging obscures important aspects of an individual's perceptual behavior.

KEY WORDS: voice, vocal quality, perception of voices, listener perceptions

Surprisingly little is known about the cognitive and perceptual processes underlying voice discrimination and recognition, despite a long history of active research in this area (see Bricker & Pruzansky, 1976, and Hecker, 1971, for reviews). Most studies of voice perception have focused on stimulus characteristics rather than on listener behavior. Researchers traditionally have favored designs in which stimulus conditions are varied and differences in listener performance are measured as a function of these variations. Such studies argue that changes in recognition and discrimination performance emerge because the manipulated stimulus dimensions are important for voice perception or recognition.

Few studies have examined variations in the perceptual strategies used by different listeners or listener groups to evaluate voice quality, and those that have done so have failed to detect differences. Murry, Singh, and Sargent (1977) examined the relative perceptual importance of different voice characteristics by informally comparing subject weights for highly and moderately experienced clinicians in a multidimensional scaling study of pathological voice quality. No apparent group differences were observed. Kempster (1984) also compared subject weights in another multidimensional scaling study of abnormal voice quality. She found good agreement among listeners (graduate students in speech-language pathology) on the relative importance of the obtained perceptual characteristics, with 19 of 25 listeners relying most heavily on the first perceptual dimension. Her listeners did differ in their relative reliance on the second and third dimensions in the study, however. Kreiman (1987; Kreiman & Papcun, 1986) used multidimensional scaling to argue that listeners who differed significantly in discrimination accuracy did not differ correspondingly in their perceptual strategies. In all these studies, it is possible that differences among listeners in perceptual strategies failed to emerge simply because the investigators did not use research designs that were sensitive to such differences. For example, it is not clear that the Murry et al. (1977) or Kreiman (1987) listeners differed enough in their level of experience for differences in perceptual strategies to occur.

A recently proposed model of long-term memory for voice quality (Papcun, Kreiman, & Davis, 1989) suggests that such differences should exist. This model states that listeners code voice information in terms of a prototype, or central category member, and a set of deviations from that prototype. A number of studies using artificial visual stimuli (e.g., Homa, Cross, Cornell, Goldman, & Schwartz, 1973; Posner & Keele, 1968, 1970; see Mervis & Rosch, 1981, for review) have demonstrated that prototypes are built up with repeated exposure to a class of stimuli: Subjects who have seen sets of patterns that vary around prototypical values "recognize" the prototypes (which they have not in fact seen) with greater certainty than the figures with which they were actually trained. Grieser and Kuhl (1989) recently have demonstrated that infants as young as 6 months have developed auditory prototypes for vowels. Papcun et al. (1989) contend that listeners, by virtue of their life-long experience with voices, have developed central category members for vocal quality and use them when judging or remembering voices. Because prototypes derive from perceptual experience, listeners who differ significantly in experience presumably would differ in perceptual strategy.

In this study, we used multidimensional scaling to determine the characteristics of dysphonic and normal voices that are perceptually important for listeners with and without clinical training. We hypothesized that the perceptual dimensions used by naive listeners to evaluate vocal quality would differ significantly from those used by listeners with extensive training in the clinical evaluation of voices.

METHOD

Voice Samples

The voices of 18 male speakers with voice disorders were selected from a library of 67 audio recordings.

Because we were interested in comparing listener groups, rather than in the specific perceptual qualities of the stimulus voices per se, voice selection was random, although mildly and severely disordered vocal qualities (as judged by the second author) were approximately equally represented. Samples of 18 normal male voices were also selected at random from a similar library of samples. No attempt was made to match normal and pathological speakers. All speakers were originally recorded using a Bruel and Kjaer condenser microphone and a reel-to-reel tape recorder (Revox B77, model MK II). Speakers were asked to sustain the vowel /a/ as long as possible at a conversational level of pitch and loudness. Only native speakers of American English served as speakers.

Voice samples were lowpass filtered using two 4-pole Butterworth filters with cutoff frequencies of 6300 Hz and two with cutoff frequencies of 7500 Hz, for a total reduction in amplitude of 3.2 dB at 5.6 kHz and 39.4 dB at 9 kHz. They were then sampled at 17.8 kHz using a 16-bit A/D converter. A 1.67-second sample was taken from the middle portion of each speaker's /a/. These digitized segments were normalized for peak voltage, and onsets and offsets were multiplied by a 10-ms ramp to eliminate click artifacts. Stimuli were then output through a 16-bit D/A converter using the same antialiasing filters.

An experimental tape was constructed for each set of voices (normal and disordered). Each tape included both orders (AB and BA) of all possible pairs of the 18 voices, for a total of 306 trials per voice set. Voice samples within a pair were separated by 1 second; pairs were separated by 6 seconds. All listeners heard the voice pairs in the same random order; because our primary interest was in comparing listener groups, and because both orders of all pairs were used, it was judged unnecessary to re-randomize stimuli for each listener. Each voice pair was preceded on the experimental tape by its consecutive number.

Acoustic Measures and Perceptual Ratings

Both time- and frequency-domain measurements were made on each voice sample, for use in interpreting the derived perceptual dimensions. These measures are routinely made on voice samples recorded in our laboratory and together generally provide a good description of voices (see, e.g., Baken, 1987). The fundamental frequency (F0) and the frequencies of the first three formants (F1, F2, and F3) were measured from spectrographic displays (Kay Elemetrics Model 5500). F0 was measured from narrow-band displays with a frequency range of 0-1 kHz; the center frequencies of the three clearest harmonics were measured to ensure accuracy. Formants were measured with reference to both narrow- and wide-band displays (with a frequency range of 0-4 kHz), and to displays of line spectra of the vowels. Measurements were taken from sections of the display where the formants appeared most steady and level. For jitter and shimmer measurements, a point on each waveform cycle that could be identified reliably from cycle to cycle was selected by hand.

¹The voice of 1 pathological speaker was clearly diplophonic; thus, only formant measurements and perceptual ratings were available for him.

Measurements of mean jitter, standard deviation of jitter, percent jitter, directional jitter, and the coefficient of variation for jitter were then calculated, using parabolic interpolation when the point marked was a peak and linear interpolation when a zero crossing was marked (Titze, Horii, & Scherer, 1987). Analogous shimmer measures were also calculated, using the difference in dB between the highest and lowest points in each marked cycle as the amplitude (except that percent shimmer was not calculated, because these measures were already normalized by the use of dB); both sets of measures are illustrated in the sketch below.

Several additional acoustic measures were also obtained. The natural logarithm of the standard deviation of the period lengths (LNSD; see Wolfe & Steinfatt, 1987) was calculated for each voice sample, as was LNSD divided by the mean of the period lengths. The harmonics-to-noise ratio (HTN) was calculated as described by Yumoto, Gould, and Baer (1982), and the ratio of the amplitude of the first to the second harmonic (H1/H2; Bickley, 1982; Ladefoged, 1981) and the number of visible harmonics were calculated using a smoothed linear magnitude spectrum. Finally, an unnamed algorithm described by Ladefoged, Maddieson, and Jackson (1988), which we will call a "partial period comparison," was used to calculate the "measured roughness" of the voice samples. This algorithm is a time-domain comparison of the standard deviations of differences between moving vectors (i.e., portions of the acoustic signal, each about 0.6 times the estimated period in length). In order to generate this measure over a long segment of phonation while limiting computational time, our analysis differed somewhat from that reported previously. We applied the algorithm to a sample approximately three glottal cycles long, skipped the next two cycles, applied it to the next three cycles, and so on for the duration of the vowel sample. The mean and standard deviation of the indices generated for the entire voice sample constitute our roughness measures.²

²Several parameters were set differently from those described by Ladefoged, Maddieson, and Jackson (1988). The vector length was 0.6 times the estimated cycle length; the searching window was 2.3 times the estimated cycle length; the reference vector moved in 1-ms steps; and the comparison vectors moved by one point at a time.

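The following Python sketch illustrates how perturbation measures of this general kind can be computed from hand-marked cycle points. It is an illustration only, not the laboratory's analysis software: the function and variable names are ours, and the formulas follow common definitions of jitter, shimmer, and LNSD rather than necessarily the exact procedures of Titze, Horii, and Scherer (1987); the cycle-mark interpolation step described above is also omitted.

```python
import numpy as np

def perturbation_measures(cycle_times, cycle_peak_range_db):
    """Illustrative jitter/shimmer/LNSD measures from hand-marked cycles.

    cycle_times         : times (s) of one reliably identifiable point per cycle
    cycle_peak_range_db : difference (dB) between highest and lowest points
                          in each marked cycle, used as that cycle's amplitude
    """
    periods = np.diff(cycle_times)          # period lengths (s)
    dP = np.diff(periods)                   # cycle-to-cycle period changes

    mean_jitter = np.mean(np.abs(dP))
    jitter_sd = np.std(np.abs(dP))
    percent_jitter = 100.0 * mean_jitter / np.mean(periods)
    # Directional jitter: proportion of successive period changes that change
    # algebraic sign (one common definition).
    directional_jitter = 100.0 * np.mean(np.sign(dP[1:]) != np.sign(dP[:-1]))
    # One common definition of the coefficient of variation for jitter.
    jitter_cov = jitter_sd / mean_jitter

    dA = np.diff(cycle_peak_range_db)       # cycle-to-cycle amplitude changes (dB)
    mean_shimmer = np.mean(np.abs(dA))      # already normalized via the dB scale
    shimmer_sd = np.std(np.abs(dA))
    shimmer_cov = shimmer_sd / mean_shimmer

    lnsd = np.log(np.std(periods))          # LNSD (Wolfe & Steinfatt, 1987)
    lnsd_over_mean = lnsd / np.mean(periods)

    return {
        "mean_jitter": mean_jitter, "jitter_sd": jitter_sd,
        "percent_jitter": percent_jitter, "directional_jitter": directional_jitter,
        "jitter_cov": jitter_cov, "mean_shimmer": mean_shimmer,
        "shimmer_sd": shimmer_sd, "shimmer_cov": shimmer_cov,
        "lnsd": lnsd, "lnsd_over_mean": lnsd_over_mean,
    }
```
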

In addition to these acoustic measures, ratings of the pathological voices gathered 1 year prior to the present study were used to help interpret the derived perceptual dimensions described below. Note that these ratings were used only for interpretation, and did not affect the structure of the perceptual spaces in any way. Five clinicians (four of whom participated in this experiment) rated the pathological voice samples (along with 49 other voices) for their breathiness, roughness, instability, and overall abnormality. Listeners heard the voice samples individually over a loudspeaker in a sound-treated booth, and rated them using a visual-analog scale by marking a point along a 7-inch line to indicate the extent to which each voice possessed a given characteristic. Ratings from the five clinicians were averaged to produce a single value for each scale for each voice.

Listeners

Two groups of five listeners participated in the two listening tests. Each group included 2 females and 3 males. The first group (expert listeners) consisted of three speech pathologists and two otolaryngologists, each with a minimum of 4 years' experience evaluating and treating voice disorders. The second group (naive listeners) included five listeners with no training in linguistics, audiology, or speech pathology, and with no previous formal exposure to pathological voices.

Listening Tasks

Each listener participated in two experimental sessions, one for each voice set. Sessions were held on separate days; order of presentation of the voice sets was counterbalanced across listeners. Listeners were tested individually in a sound-treated room. Stimuli were presented at a constant level (approximately 80 dB SPL) in free field over two loudspeakers equidistant from the listener. At each session, listeners were first instructed that we were interested in how they as individuals judged the similarity of each voice pair. They were asked to listen carefully to each pair, and to rate the similarity of the voices on a 7-point equal-appearing-interval scale ranging from exact same (1) to most different (7). Listeners then heard four practice pairs of voices (normal or disordered, as appropriate, but not drawn from the stimulus set). Finally, they heard all 18 voice samples in random order, to familiarize them with the range of stimuli they would encounter at that session. Listeners then heard the experimental tape and rated the voice pairs. They were given a break half-way through the task. Each test session lasted approximately 1 1/2 hours.

RESULTS

Reliability of Individual Similarity Ratings

Because the experimental task required listeners to concentrate for extended intervals, individual similarity matrices were examined to ensure that the ratings of the two presentation orders of each voice pair did not differ systematically. Each instance where a single listener's ratings for the AB and BA orders of a pair of voices differed by a scale value of three or more was noted, and the total pattern of such asymmetrical ratings across all listeners was examined. Listeners showed no tendency to agree on which pairs to rate asymmetrically: Across the four experimental conditions, only 28 voice pairs attracted more than a single asymmetrical rating (25 voice pairs were asymmetrically rated by two listeners, three by three listeners, and none by more than three listeners). Further, asymmetries did not cluster in any particular portion of the experimental tape. The experimental tape was divided into thirds, and the frequency of asymmetrical ratings for AB/BA combinations occurring in different parts of the tape (e.g., both AB and BA orders in the first third of the tape, AB in the second third and BA in the last third, etc.) was determined. No significant difference in rates of asymmetrical ratings was observed when these portions of the test tape were compared (χ² = 3.01, df = 5). Rating asymmetries thus did not seem to represent practice or fatigue effects. Therefore, all differences in ratings were treated as noise in the data, and similarity matrices were symmetrized by averaging across the diagonal in all subsequent analyses.

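As an illustration of this bookkeeping, the sketch below (Python; the function and variable names are ours, not the authors') flags AB/BA rating pairs that differ by three or more scale values and then symmetrizes a listener's rating matrix by averaging across the diagonal.

```python
import numpy as np

def flag_asymmetries(ratings, threshold=3):
    """Return index pairs (i, j) whose AB and BA ratings differ by >= threshold.

    ratings : n x n matrix of similarity ratings (1-7); ratings[i, j] is the
              rating given when voice i was played first and voice j second.
    """
    diff = np.abs(ratings - ratings.T)
    i, j = np.where(np.triu(diff, k=1) >= threshold)
    return [(int(a), int(b)) for a, b in zip(i, j)]

def symmetrize(ratings):
    """Average each AB rating with its BA counterpart (average across the diagonal)."""
    return (ratings + ratings.T) / 2.0

# Example with a toy 3 x 3 rating matrix (diagonal unused).
r = np.array([[0, 2, 6],
              [5, 0, 4],
              [6, 3, 0]], dtype=float)
print(flag_asymmetries(r))   # [(0, 1)]: ratings 2 vs. 5 differ by 3
print(symmetrize(r))
```
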
Multidimensional Scaling Analyses

Separate nonmetric multidimensional scaling solutions in 2-6 dimensions were found for each listener group and voice set using the individual differences model of SAS PROC ALSCAL (SAS Institute, Inc., 1983; Schiffman, Reynolds, & Young, 1981). R² values, which indicate how much variance in the original similarity ratings is accounted for by each solution, are shown in Figure 1. Based on the location of elbows in these plots, shown by arrows in the figure, and on interpretability, solutions were selected as indicated.³ The solutions selected fit the data quite well, accounting for an average of 76.5% of the variance in the underlying similarity ratings. Note that R² values are higher for the naive listeners for both voice sets; we will return to this point below.

³Solution selection for the pathological voices was straightforward: Both stress and R² values clearly pointed to the same solutions, and these solutions had reasonable interpretations. Solution selection was more complicated for the normal voice set. The R² curve for the expert group had elbows at 3 and 5 dimensions, but stress values suggested the 3-dimensional solution was the correct one. For naive listeners, it was not clear whether to select the 2- or the 5-dimensional solution. Both stress and R² values seemingly pointed to the 5-dimensional solution, which did account for 9% more variance in similarity ratings than did the 2-dimensional solution. However, the 2- and 3-dimensional solutions differed by less than 1% in their fit to the data. Further, the correlation between the unscaled similarity ratings and F0 was .76, suggesting that approximately half the variance in similarity ratings could be accounted for by a single dimension, and thus that there was a true elbow at 2 dimensions for these data. Finally, when the 5-dimensional solution was interpreted, dimensions were found to relate to the same acoustic parameters as did dimensions in the 2-dimensional solution. Because the extra dimensions did not provide any additional information about parameters underlying perceived similarity, and because the three extra dimensions added only 9% to the variance accounted for, the 2-dimensional solution was ultimately chosen for these data.

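The original analyses used the individual-differences (INDSCAL-type) model in SAS PROC ALSCAL, which is not reproduced here. As a rough modern stand-in, the sketch below fits an ordinary nonmetric MDS solution to one group's averaged, symmetrized dissimilarities with scikit-learn; it recovers stimulus coordinates but not the per-listener dimension weights that ALSCAL provides.

```python
import numpy as np
from sklearn.manifold import MDS

def group_mds(symmetric_ratings, n_dims):
    """Nonmetric MDS on a group's averaged dissimilarity matrix.

    symmetric_ratings : list of symmetrized n x n rating matrices
                        (1 = exact same, 7 = most different), one per listener.
    n_dims            : number of dimensions to fit.
    Returns stimulus coordinates (n x n_dims) and the final stress value.
    """
    # Average the listeners' symmetrized ratings into one dissimilarity matrix.
    dissim = np.mean(np.stack(symmetric_ratings), axis=0)
    np.fill_diagonal(dissim, 0.0)

    mds = MDS(n_components=n_dims, metric=False,
              dissimilarity="precomputed", random_state=0, n_init=8)
    coords = mds.fit_transform(dissim)
    return coords, mds.stress_

# Typical use: fit solutions in 2-6 dimensions and look for an elbow in the
# fit statistics, as the authors did with the R-squared and stress plots.
```
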

FIGURE 1. Values of R² for the four multidimensional scaling analyses. Arrows indicate elbows in the curves. Stress values for the selected solutions are given in parentheses. [Figure not reproduced; its panels plot R² against number of dimensions for expert and naive listeners, for the pathological and normal voice sets.]

Perceptual Spaces for the Pathological Voice Set

ALSCAL calculates coordinates for each stimulus voice on each dimension, such that voices that are perceptually similar are close together in the space described by the coordinates. Both the expert and naive listener group spaces were interpreted by examining the correlation of these stimulus coordinates with the acoustic measures and descriptive scale ratings described above. Because many of the variables used to interpret the dimensions are themselves intercorrelated, multiple regression was used to determine which measures explained unique parts of the variance on each dimension (a minimal sketch of this interpretation step follows Table 1). Results for the pathological voice solutions are given in Table 1; complete intercorrelation matrices are included as Appendix A. Table 1 also includes the values of R² for each dimension, which indicate the average importance of a dimension in the perceptual space. R² values for individual dimensions sum to the value for the entire space, which represents the amount of variance in the underlying similarity ratings that is accounted for by the solution as a whole.

As Table 1 shows, expert and naive listeners differed both in the stimulus characteristics on which they relied when making their similarity judgments, and in the relative importance of the dimensions they shared. For the pathological voices, the first dimension (D1) in the expert space was correlated with F0, rated breathiness, and H1/H2; multiple regression indicates that all three explained independent parts of the variance on this dimension (multiple R = .94). The second dimension (D2) was correlated with various shimmer measures, with rated roughness and abnormality, and with measured roughness (partial period comparison); measured roughness provided the best interpretation (R = .82). The third dimension (D3) was significantly correlated only with F0 (R = .69). Note that on the average the expert listeners weighed these dimensions almost equally: The R² values in Table 1 indicate that each explains about the same amount of variance in the similarity ratings.

TABLE 1. Multiple regression results, pathological voices.
Expert listeners
  Dimension 1 (R² = .27): multiple R = .94; F0, H1/H2, rated breathiness
  Dimension 2 (R² = .25): multiple R = .82; measured roughness
  Dimension 3 (R² = .23): multiple R = .69; F0
Naive listeners
  Dimension 1 (R² = .35): multiple R = .92; F0
  Dimension 2 (R² = .21): multiple R = .65; rated abnormality
  Dimension 3 (R² = .14): multiple R = .55; directional jitter
  Dimension 4 (R² = .12): multiple R = .83; H1/H2, rated roughness (B = .404)
Note: Only significant coefficients are listed.

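As an illustration of that interpretation step, the sketch below (Python; all names are ours) correlates the stimulus coordinates on one MDS dimension with a set of acoustic measures or ratings, and then regresses the dimension on the candidate predictors to estimate standardized coefficients and a multiple R.

```python
import numpy as np
from numpy.linalg import lstsq

def interpret_dimension(coords, measures):
    """Correlate one MDS dimension with candidate measures, then regress.

    coords   : (n_voices,) stimulus coordinates on one dimension.
    measures : dict name -> (n_voices,) array of acoustic measures or ratings.
    Returns zero-order correlations, standardized coefficients, and multiple R.
    """
    names = list(measures)
    X = np.column_stack([measures[m] for m in names])

    # Zero-order correlations between the dimension and each measure.
    corrs = {m: np.corrcoef(coords, measures[m])[0, 1] for m in names}

    # Multiple regression of coordinates on all candidate measures;
    # standardize first so the coefficients are comparable across measures.
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    yz = (coords - coords.mean()) / coords.std()
    design = np.column_stack([np.ones(len(yz)), Xz])
    beta, *_ = lstsq(design, yz, rcond=None)
    multiple_r = np.corrcoef(yz, design @ beta)[0, 1]
    return corrs, dict(zip(names, beta[1:])), multiple_r
```

In the paper, only predictors whose coefficients reached significance were retained; that selection step is omitted from this sketch.
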

For naive listeners, D1 in the space for pathological voices was correlated only with F0 (R = .92). D2 was correlated with rated abnormality and rated breathiness; abnormality provided the best interpretation (R = .65). D3 was correlated with directional jitter (R = .51); and the fourth dimension (D4) was significantly correlated with both rated roughness and H1/H2, each of which contributed significantly to the explained variance (multiple R = .83). Unlike the expert listeners, who weighed their dimensions roughly equally, naive listeners as a group relied primarily on the first dimension (F0), which accounted for nearly half the variance explained by the total solution (35% out of 83%).

Further correlations compared stimulus coordinates in the expert and naive listeners' perceptual spaces (Table 2). All three expert dimensions were highly correlated with dimensions in the naive listeners' space. The first expert dimension (F0 + breathiness + H1/H2) was related most strongly to D4 (H1/H2 + roughness) in the naive space, supporting a "breathy" interpretation for both. The experts' D2 (correlated with measured roughness) was significantly related to D2 (rated abnormality) in the naive space. Expert D3 (F0) was related to naive D1 (F0). Note that naive D3 (jitter) was not significantly related to any dimension in the expert space, and evidently represented a perceptual dimension unique to naive listeners in this study.

TABLE 2. Correlations between dimensions for pathological voices (expert dimensions by naive dimensions). [Values not reproduced.]

To summarize, experts' ratings of the similarity of pairs of pathological voices may be explained primarily in terms of breathiness (measured by H1/H2 and by previously obtained voice quality ratings), measured and rated roughness, and fundamental frequency. These dimensions were weighed approximately equally, and together accounted for 75.3% of the variance in the original similarity ratings. Naive listeners judged the similarity of pathological voices primarily in terms of differences in F0, although rated abnormality, jitter, and roughness/breathiness (H1/H2) also played a role. The four dimensions together accounted for 82.3% of the variance in the similarity ratings. Expert listeners paid more attention to breathiness and roughness than did naive listeners; naive listeners relied more on F0 than did the experts.

Perceptual Spaces for the Normal Voice Set

For the normal voice set, the three-dimensional solution for the expert listeners accounted for 71.6% of the variance in the similarity ratings. Significant correlations between stimulus coordinates and voice characteristics are given in Table 3, along with the variance explained by each dimension. (The complete intercorrelation matrix is included as Appendix B.) Multiple regression was again used to eliminate redundancies in these correlations, but detailed results for expert listeners will not be presented because in each case a single parameter provided the best interpretation of the dimension. The first dimension in the expert space was not significantly correlated with any rated or measured voice characteristics. D2 was correlated with various jitter and shimmer measures, with shimmer standard deviation providing the best interpretation⁴ (R = -.83).

D3 was significantly correlated with F0, directional jitter, and the natural logarithm of the standard deviation of the periods; only F0 contributed uniquely to the variance on this dimension (R = -.94). As in the pathological voice space, on the average experts weighed each dimension in this solution roughly equally.

The two-dimensional solution for naive listeners and normal voices accounted for 77% of the variance in this set of similarity ratings. As for the pathological voices, naive listeners relied primarily on vocal pitch when judging the similarity of normal voices. The first dimension accounted for most of the variance in the solution space (56.5% out of 77%), and was highly correlated with F0 (R = -.98). The dimension was also significantly correlated with a variety of jitter and shimmer measures. D2 was correlated with shimmer and formant frequencies; a combination of F2/F1 and shimmer covariance provided the best interpretation (multiple R = .73).

Additional correlations again tested the equivalence of the expert and naive listeners' perceptual spaces. Results are given in Table 4. D1 (uninterpreted) in the experts' space was significantly related to D2 (resonances + shimmer) in the naive listeners' space.

⁴Note that the correlations between coordinates and several jitter/shimmer measurements are roughly equal. Although in each case we have selected the one variable best correlated with these dimensions, they are perhaps best thought of as generalized "jitter/shimmer" dimensions (although, as shown by D2 in the expert pathological voice space, it is possible to have one without the other).

TABLE 3. Multiple regression results, normal voices.
Expert listeners
  Dimension 1 (R² = .26): no significant correlates
  Dimension 2 (R² = .23): multiple R = .84; shimmer SD
  Dimension 3 (R² = .23): multiple R = .94; F0
Naive listeners
  Dimension 1 (R² = .57): multiple R = .98; F0
  Dimension 2 (R² = .21): multiple R = .73; shimmer covariance (B = .492), F2/F1 (B = .484)
Note: Only significant coefficients are listed.


Expert D2 (shimmer) was only weakly related to both D1 and D2 (resonances + shimmer) in the naive space, and may represent a dimension unique to experts. Expert D3 (F0) was well correlated with D1 (F0) in the naive space. Thus, for normal voices, as for pathological voices, expert and naive listeners differed both in the particular aspects of voice quality to which they attended when making their similarity judgments, and in the relative importance of common dimensions. F0 and shimmer appeared in both solutions; F0 was again much more important for naive listeners than for experts, and shimmer was more important to experts than to naive listeners. Naive listeners also relied on formant frequencies/vowel quality information, which experts largely ignored: Although expert D1 was significantly related to naive D2 (which was correlated with both F2/F1 and F3), the correlation between expert D1 and F2/F1 was not significant, and the correlation between expert D1 and F3 (.38) was also not significant.⁵

⁵This apparent contradiction (one dimension significantly correlated with another without being significantly correlated with the interpretation of that dimension) may be explained by the relatively low values of both correlations. Note that only about 53% of the variance on naive D2 is explained by the combination of resonances and shimmer. The correlation between expert D1 and naive D2 (which is only -.66) may involve the "left over" 47% of variance on this dimension.

TABLE 4. Correlations between dimensions for normal voices (expert dimensions by naive dimensions). [Values not reproduced.]

Differences Among Individual Listeners

In addition to calculating the overall importance of each dimension in a group perceptual space, ALSCAL calculates the importance of each dimension to each individual subject and outputs a set of weights for each subject on each dimension. These weights, which reflect the relative importance of a dimension to an individual listener, are given in Tables 5 and 6. These tables show that expert listeners differed significantly in which dimension they weighed most heavily: For both pathological and normal voices, each dimension was both most and least salient to some listener, and no significant agreement on perceptual importance was found (Kendall's coefficient of concordance: pathological voices, W = .04, S = 2; normal voices, W = .04, S = 2). In contrast, naive listeners differed in precisely how heavily they weighed each dimension. However, they agreed significantly on the relative order of importance of the dimensions, with the first dimension most important for every listener, the second dimension usually second most important, and so on, for both voice sets. (Kendall's coefficient of concordance for the pathological voices: W = .712, S = 89. No statistic was calculated for the normal voice set because of the small number of degrees of freedom.)

TABLE 5. Relative importance of individual dimensions to expert listeners: weight on each dimension, by voice set (pathological, normal) and listener. [Values not reproduced.] Squared weights sum to R² for the individual subject.

TABLE 6. Relative importance of individual dimensions to naive listeners: weight on each dimension, by voice set (pathological, normal) and listener. [Values not reproduced.] Squared weights sum to R² for the individual subject.

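For readers who want to reproduce this kind of agreement check, a minimal sketch is given below. It computes Kendall's coefficient of concordance from a listeners-by-dimensions matrix of weights using the standard tie-free formula; the names are ours, and the significance test is omitted.

```python
import numpy as np
from scipy.stats import rankdata

def kendalls_w(weights):
    """Kendall's coefficient of concordance for agreement among raters.

    weights : (m_raters, n_items) array, e.g., listeners x dimensions, where
              larger values mean the dimension was more important to that listener.
    Uses the standard tie-free formula W = 12 * S / (m**2 * (n**3 - n)).
    """
    m, n = weights.shape
    ranks = np.apply_along_axis(rankdata, 1, weights)  # rank dimensions per listener
    rank_sums = ranks.sum(axis=0)
    s = np.sum((rank_sums - rank_sums.mean()) ** 2)    # S: spread of the rank sums
    return 12.0 * s / (m ** 2 * (n ** 3 - n)), s

# Toy example: 5 listeners, 3 dimensions; perfect agreement gives W = 1.
w, s = kendalls_w(np.array([[3, 2, 1]] * 5))
print(round(w, 3), s)   # 1.0 50.0
```

With the values reported above, the same formula reproduces the published coefficient for the naive listeners: for 5 listeners ranking 4 dimensions with S = 89, W = 12 x 89 / (25 x 60) = .712.
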
DISCUSSION

Table 7 summarizes the multidimensional scaling solutions found for naive and expert listeners and for pathological and normal voices. As suggested by prototype models (Papcun et al., 1989), a listener's background affects the perceptual strategy used when judging pairs of voices. Naive listeners differed from experts both in the dimensions that emerged and in the relative salience of those dimensions the two groups shared. These results suggest that, because a listener's background and experience affect perceptual strategy, models of voice perception should incorporate the notion of a "population of listeners." A similar phenomenon has been reported by Terbeek (1977), whose multidimensional scaling study of vowel quality perception revealed systematic differences in perceptual spaces for listeners whose native languages differed in vowel inventories.

TABLE 7. Summary of multidimensional scaling results.
Pathological voices
  Expert listeners: D1: F0 + breathiness + H1/H2; D2: roughness (+ shimmer); D3: F0
  Naive listeners: D1: F0; D2: rated abnormality; D3: jitter; D4: rated roughness + H1/H2
Normal voices
  Expert listeners: D1: uninterpreted; D2: jitter/shimmer; D3: F0
  Naive listeners: D1: F0; D2: resonances + shimmer (+ breathiness)

Recall that the scaling solutions found for naive listeners accounted for a greater portion of the variance in similarity ratings than did the expert listeners' solutions. This finding is consistent with our view that voice perception involves both "features," which are useful across voice pairs, and idiosyncratic details of vocal quality, which may be useful when comparing one pair of voices but not another (see, e.g., Kreiman & Papcun, 1988; Van Lancker & Kreiman, 1987). Naive listeners in this study seem to have relied heavily on features: They apparently applied a rather inflexible perceptual strategy in roughly the same fashion across voice pairs, so their similarity judgments could be accounted for very well with only a few perceptual dimensions. Experts, on the other hand, may have varied their strategies somewhat, depending on the characteristics of the voice pair in question. Clinicians' training and experience allowed them to access a larger range of information than naive listeners were able to use. This sort of flexible strategy is not easy to summarize in a few features of the sort extracted by multidimensional scaling. To the extent that clinicians are able to adjust their perceptual strategies to the demands of a given task, their R² values would be expected to be lower than those of naive listeners.

Individual naive listeners showed the same pattern of subject weights on their perceptual dimensions: They agreed about which voice quality dimensions were most important, for both pathological and normal voices. Expert listeners, on the other hand, did not agree about the relative importance of different aspects of voice quality. These data suggest that clinical training and experience cause listeners to differ more, not less, in how they perceive voice quality, at least in tasks that involve unstructured similarity judgments. The differences between clinicians were large enough to suggest that averaging data across subjects may produce misleading results and obscure important aspects of an individual subject's perceptual behavior. Recall that examination of expert group data gave the impression that each dimension was about equally important to judgments of the similarity of these voices. Subject weights show that this is not the case. Rather, the substantial differences among individual clinicians are averaged away when group data are considered. Care must therefore be taken when using data averaged across clinicians.

These differences in clinicians' perceptual strategies have several possible explanations, including differences in training (e.g., speech pathology vs. otolaryngology) and in the populations of patients most frequently treated (e.g., cancer patients, stroke patients, patients with neurological disorders). Although it is not possible to eliminate either explanation based on our sample of five experts, both sources of variability are probably implicated in our results. It is also possible that clinicians have developed more than one prototype for different sorts of pathological voices. More detailed analyses of individual data from a larger set of clinicians will help answer these questions.
The finding that clinicians differ significantly in their perceptual strategies apparently contradicts a large number of studies reporting good agreement among experts in rating scale tasks (e.g., Bassich & Ludlow, 1986; Darley, Aronson, & Brown, 1969; Kruel & Hecker, 1971). Such studies examined the extent to which clinicians can be made to agree, via task-specific training and restriction of the rating task to a few dimensions (e.g., breathiness, hoarseness). In contrast, our purpose was to determine how listeners vary when left to structure the perceptual task for themselves. Listeners in the present study were free to attend to whatever voice characteristics seemed relevant for a given stimulus pair, and were not forced to restrict their judgments to a single feature that might provide little information about that pair. Recent studies suggest that voice quality information is normally processed in a very flexible fashion, with individual voice features attended to or ignored, as appropriate for a given voice or voice pair (see Van Lancker, Kreiman, & Cummings, in press; Van Lancker, Kreiman, & Emmorey, 1985; Van Lancker, Kreiman, & Wickens, 1985). Because the dimensions along which listeners judged the voices were not specified in advance, and because clinicians rarely attend perceptual training sessions like those used in structured rating scale tasks in the course of their everyday clinical work, the present task may more nearly approximate the sort of perceptual judgment listeners make every day. Structured rating scale tasks may well force listeners to behave in ways that are not consistent with normal perceptual processing, and we question the external validity of conclusions such studies draw regarding the evaluation of vocal quality.

Our results also suggest that those parameters which emerge as perceptually salient from a multidimensional scaling analysis depend in part on the populations of voices under study, at least for naive listeners. Experts evidently used roughly the same parameters (F0, breathiness, roughness, and jitter/shimmer) for both voice sets.

Naive listeners relied primarily on F0 for both voice sets, but also attended to abnormality and breathiness for the pathological voices (but not for normal voices), and to resonance information for normal voices (but not for the pathological set). However, the large differences between results of other multidimensional scaling studies in the literature suggest that within-population differences are as great as those between different populations of voices, and thus that the concept of populations of voices may not be justified. Table 8 summarizes the previous studies (Kempster, 1984; Matsumoto, Hiki, Sone, & Nimura, 1973; Murry & Singh, 1980; Murry et al., 1977) using steady-state vowel stimuli to examine the perception of normal and pathological voice quality. Some of the differences shown in this table are attributable to methodological variations; for example, stimuli were equated for amplitude by Murry et al. (1977) but not by Kempster (1984). Nevertheless, only a single parameter, F0, is common to all solutions. Murry and Singh (1980) suggested, "... besides the F0/pitch measure, there is no common set of acoustic parameters for judging voices applicable to both sexes and phonation conditions [i.e., for both vowel and phrase stimuli]" (p. 1300). Our finding that groups differed in their perceptions of the same voices argues further that, apart from F0, a set of parameters that is common to different populations of listeners judging the same voices may not exist. Perceptual heterogeneity seems to be the rule in voice quality evaluations.

TABLE 8. Summary of previous multidimensional scaling studies of voice quality perception.
Normal voices
  Murry & Singh (1980): 20 male speakers; 10 naive listeners; dimensions: F0, resonances, nasality, F2; variance accounted for: 54%
  Murry & Singh (1980): 20 female speakers; 10 naive listeners; dimensions: F0, breathiness, F2/F1, effort/hoarseness; variance accounted for: 42%
  Matsumoto, Hiki, Sone, & Nimura (1973): 8 male speakers; 6 naive listeners; dimensions: F0, glottal source spectrum, F1 + F2; variance accounted for: not given
Pathological voices
  Murry, Singh, & Sargent (1977): 20 male speakers; 16 expert listeners; dimensions: periodicity, +/- tumor, volume velocity, F0, uninterpreted; variance accounted for: 48%
  Kempster (1984): 30 female speakers; 25 expert listeners; dimensions: intensity + HTN, F0, perturbation; variance accounted for: 60%

ACKNOWLEDGMENTS

This research was supported by NINCDS award NS20707, by a NINCDS post-doctoral traineeship to the first author (NS07059), and by Veterans Administration Rehabilitation Research and Development grant C468-R. We thank Gerald Berke, David Hanson, Jean Holle, and Jill Zweier for serving as subjects. We also thank Peter Ladefoged and the UCLA Phonetics Laboratory for use of their digital spectrograph.

REFERENCES

BAKEN, R. J. (1987). Clinical measurement of speech and voice. Boston: College-Hill.
BASSICH, C. J., & LUDLOW, C. L. (1986). The use of perceptual methods by new clinicians for assessing voice quality. Journal of Speech and Hearing Disorders, 51.
BICKLEY, C. (1982). Acoustic analysis and perception of breathy vowels. M.I.T., R.L.E. Speech Communications Group: Working Papers, 1.
BRICKER, P., & PRUZANSKY, S. (1976). Speaker recognition. In N. J. Lass (Ed.), Contemporary issues in experimental phonetics. New York: Academic Press.
DARLEY, F. L., ARONSON, A. E., & BROWN, J. R. (1969). Differential diagnostic patterns of dysarthria. Journal of Speech and Hearing Research, 12.
GRIESER, D., & KUHL, P. (1989). Categorization of speech by infants: Support for speech-sound prototypes. Developmental Psychology, 25.
HECKER, M. H. L. (1971). Speaker recognition: An interpretive survey of the literature. ASHA Monographs, 16.

HOMA, D., CROSS, J., CORNELL, D., GOLDMAN, D., & SCHWARTZ, S. (1973). Prototype abstraction and classification of new instances as a function of number of instances defining the prototype. Journal of Experimental Psychology, 101.
KEMPSTER, G. (1984). A multidimensional analysis of vocal quality in two dysphonic groups. Unpublished doctoral dissertation, Northwestern University.
KREIMAN, J. (1987). Human memory for unfamiliar voices. Unpublished doctoral dissertation, University of Chicago.
KREIMAN, J., & PAPCUN, G. (1986, May). The perception of voice quality: Multidimensional scaling evidence. Paper presented at the 111th Meeting of the Acoustical Society of America, Cleveland, OH.

KREIMAN, J., & PAPCUN, G. (1988, May). Voice "features" in long-term memory. Paper presented at the 115th Meeting of the Acoustical Society of America, Seattle, WA.
KRUEL, E. J., & HECKER, M. H. L. (1971). Descriptions of the speech of patients with cancer of the vocal folds. Part II: Judgments of age and voice quality. Journal of the Acoustical Society of America, 49.
LADEFOGED, P. (1981, May). The relative nature of voice quality. Paper presented at the 101st Meeting of the Acoustical Society of America, Ottawa, Ontario.
LADEFOGED, P., MADDIESON, I., & JACKSON, M. (1988). Investigating phonation types in different languages. In O. Fujimura (Ed.), Vocal fold physiology: Voice production, mechanisms and functions. New York: Raven Press.
MATSUMOTO, H., HIKI, S., SONE, T., & NIMURA, T. (1973). Multidimensional representation of personal quality of vowels and its acoustical correlates. IEEE Transactions on Audio and Electroacoustics, AU-21.
MERVIS, C., & ROSCH, E. (1981). Categorization of natural objects. Annual Review of Psychology, 32.
MURRY, T., & SINGH, S. (1980). Multidimensional analysis of male and female voices. Journal of the Acoustical Society of America, 68.
MURRY, T., SINGH, S., & SARGENT, M. (1977). Multidimensional classification of abnormal voice qualities. Journal of the Acoustical Society of America, 61.
PAPCUN, G., KREIMAN, J., & DAVIS, A. (1989). Long-term memory for unfamiliar voices. Journal of the Acoustical Society of America, 85.
POSNER, M., & KEELE, S. (1968). On the genesis of abstract ideas. Journal of Experimental Psychology, 77.
POSNER, M., & KEELE, S. (1970). Retention of abstract ideas. Journal of Experimental Psychology, 83.
SAS INSTITUTE, INC. (1983). SUGI supplemental library user's guide. Cary, NC: SAS Institute, Inc.
SCHIFFMAN, S. S., REYNOLDS, M. L., & YOUNG, F. W. (1981). Introduction to multidimensional scaling. New York: Academic.
TERBEEK, D. (1977). A cross-language multidimensional scaling study of vowel perception. UCLA Working Papers in Phonetics, 37.
TITZE, I., HORII, Y., & SCHERER, R. (1987). Some technical considerations in voice perturbation measurements. Journal of Speech and Hearing Research, 30.
VAN LANCKER, D., & KREIMAN, J. (1987). Voice discrimination and recognition are separate abilities. Neuropsychologia, 25.
VAN LANCKER, D., KREIMAN, J., & CUMMINGS, J. (in press). Voice perception deficits: Neuroanatomical correlates of phonagnosia. Journal of Clinical and Experimental Neuropsychology.
VAN LANCKER, D., KREIMAN, J., & EMMOREY, K. (1985). Familiar voice recognition: Patterns and parameters. Part I: Recognition of backward voices. Journal of Phonetics, 13.
VAN LANCKER, D., KREIMAN, J., & WICKENS, T. (1985). Familiar voice recognition: Patterns and parameters. Part II: Recognition of rate-altered voices. Journal of Phonetics, 13.
WOLFE, V., & STEINFATT, T. (1987). Prediction of vocal severity within and across voice types. Journal of Speech and Hearing Research, 30.
YUMOTO, E., GOULD, W., & BAER, T. (1982). Harmonics-to-noise ratio as an index of the degree of hoarseness. Journal of the Acoustical Society of America, 71.

Received May 5, 1989
Accepted August 18, 1989

Requests for reprints should be sent to Jody Kreiman, VA Medical Center, Audiology and Speech Pathology (126), Wilshire and Sawtelle Boulevards, Los Angeles, CA

APPENDIX A

CORRELATIONS BETWEEN DIMENSIONS AND STIMULUS CHARACTERISTICS FOR THE PATHOLOGICAL VOICES

Key to abbreviations: ExDn = stimulus coordinates on expert dimension n; NaDn = stimulus coordinates on naive dimension n; Abnormal = rated abnormality; Breathy = rated breathiness; Instable = rated instability; Rough = rated roughness; # Harmon = the number of visible harmonics; HTN = harmonics-to-noise ratio; ShCovar = covariance of shimmer; DirSh = directional shimmer; ShMean = mean shimmer; ShSD = shimmer standard deviation; JitCovar = covariance of jitter; DirJit = directional jitter; JitMean = mean jitter; JitSD = jitter standard deviation; LNSD = the natural logarithm of the standard deviation of the period lengths; LNSDMN = LNSD divided by the mean period length; MeasRough = measured roughness; RoughSD = standard deviation of the roughness measure (see text for details).

[Correlation matrix not reproduced.]


APPENDIX B

CORRELATIONS BETWEEN DIMENSIONS AND STIMULUS CHARACTERISTICS FOR THE NORMAL VOICES

Key to abbreviations: ExDn = stimulus coordinates on expert dimension n; NaDn = stimulus coordinates on naive dimension n; # Harmon = the number of visible harmonics; HTN = harmonics-to-noise ratio; ShCovar = covariance of shimmer; DirSh = directional shimmer; ShMean = mean shimmer; ShSD = shimmer standard deviation; JitCovar = covariance of jitter; DirJit = directional jitter; JitMean = mean jitter; JitSD = jitter standard deviation; LNSD = the natural logarithm of the standard deviation of the period lengths; LNSDMN = LNSD divided by the mean period length; MeasRough = measured roughness; RoughSD = standard deviation of the roughness measure (see text for details).

[Correlation matrix not reproduced.]


More information

Scale Invariance and Primacy and Recency Effects in an Absolute Identification Task

Scale Invariance and Primacy and Recency Effects in an Absolute Identification Task Neath, I., & Brown, G. D. A. (2005). Scale Invariance and Primacy and Recency Effects in an Absolute Identification Task. Memory Lab Technical Report 2005-01, Purdue University. Scale Invariance and Primacy

More information

Speech Intelligibility Measurements in Auditorium

Speech Intelligibility Measurements in Auditorium Vol. 118 (2010) ACTA PHYSICA POLONICA A No. 1 Acoustic and Biomedical Engineering Speech Intelligibility Measurements in Auditorium K. Leo Faculty of Physics and Applied Mathematics, Technical University

More information

EFFECTS OF TEMPORAL FINE STRUCTURE ON THE LOCALIZATION OF BROADBAND SOUNDS: POTENTIAL IMPLICATIONS FOR THE DESIGN OF SPATIAL AUDIO DISPLAYS

EFFECTS OF TEMPORAL FINE STRUCTURE ON THE LOCALIZATION OF BROADBAND SOUNDS: POTENTIAL IMPLICATIONS FOR THE DESIGN OF SPATIAL AUDIO DISPLAYS Proceedings of the 14 International Conference on Auditory Display, Paris, France June 24-27, 28 EFFECTS OF TEMPORAL FINE STRUCTURE ON THE LOCALIZATION OF BROADBAND SOUNDS: POTENTIAL IMPLICATIONS FOR THE

More information

FREQUENCY COMPRESSION AND FREQUENCY SHIFTING FOR THE HEARING IMPAIRED

FREQUENCY COMPRESSION AND FREQUENCY SHIFTING FOR THE HEARING IMPAIRED FREQUENCY COMPRESSION AND FREQUENCY SHIFTING FOR THE HEARING IMPAIRED Francisco J. Fraga, Alan M. Marotta National Institute of Telecommunications, Santa Rita do Sapucaí - MG, Brazil Abstract A considerable

More information

Effects of speaker's and listener's environments on speech intelligibili annoyance. Author(s)Kubo, Rieko; Morikawa, Daisuke; Akag

Effects of speaker's and listener's environments on speech intelligibili annoyance. Author(s)Kubo, Rieko; Morikawa, Daisuke; Akag JAIST Reposi https://dspace.j Title Effects of speaker's and listener's environments on speech intelligibili annoyance Author(s)Kubo, Rieko; Morikawa, Daisuke; Akag Citation Inter-noise 2016: 171-176 Issue

More information

Hearing. Juan P Bello

Hearing. Juan P Bello Hearing Juan P Bello The human ear The human ear Outer Ear The human ear Middle Ear The human ear Inner Ear The cochlea (1) It separates sound into its various components If uncoiled it becomes a tapering

More information

Prelude Envelope and temporal fine. What's all the fuss? Modulating a wave. Decomposing waveforms. The psychophysics of cochlear

Prelude Envelope and temporal fine. What's all the fuss? Modulating a wave. Decomposing waveforms. The psychophysics of cochlear The psychophysics of cochlear implants Stuart Rosen Professor of Speech and Hearing Science Speech, Hearing and Phonetic Sciences Division of Psychology & Language Sciences Prelude Envelope and temporal

More information

CONTRIBUTION OF DIRECTIONAL ENERGY COMPONENTS OF LATE SOUND TO LISTENER ENVELOPMENT

CONTRIBUTION OF DIRECTIONAL ENERGY COMPONENTS OF LATE SOUND TO LISTENER ENVELOPMENT CONTRIBUTION OF DIRECTIONAL ENERGY COMPONENTS OF LATE SOUND TO LISTENER ENVELOPMENT PACS:..Hy Furuya, Hiroshi ; Wakuda, Akiko ; Anai, Ken ; Fujimoto, Kazutoshi Faculty of Engineering, Kyushu Kyoritsu University

More information

Audibility, discrimination and hearing comfort at a new level: SoundRecover2

Audibility, discrimination and hearing comfort at a new level: SoundRecover2 Audibility, discrimination and hearing comfort at a new level: SoundRecover2 Julia Rehmann, Michael Boretzki, Sonova AG 5th European Pediatric Conference Current Developments and New Directions in Pediatric

More information

Sound Texture Classification Using Statistics from an Auditory Model

Sound Texture Classification Using Statistics from an Auditory Model Sound Texture Classification Using Statistics from an Auditory Model Gabriele Carotti-Sha Evan Penn Daniel Villamizar Electrical Engineering Email: gcarotti@stanford.edu Mangement Science & Engineering

More information

Temporal offset judgments for concurrent vowels by young, middle-aged, and older adults

Temporal offset judgments for concurrent vowels by young, middle-aged, and older adults Temporal offset judgments for concurrent vowels by young, middle-aged, and older adults Daniel Fogerty Department of Communication Sciences and Disorders, University of South Carolina, Columbia, South

More information

The role of low frequency components in median plane localization

The role of low frequency components in median plane localization Acoust. Sci. & Tech. 24, 2 (23) PAPER The role of low components in median plane localization Masayuki Morimoto 1;, Motoki Yairi 1, Kazuhiro Iida 2 and Motokuni Itoh 1 1 Environmental Acoustics Laboratory,

More information

Although considerable work has been conducted on the speech

Although considerable work has been conducted on the speech Influence of Hearing Loss on the Perceptual Strategies of Children and Adults Andrea L. Pittman Patricia G. Stelmachowicz Dawna E. Lewis Brenda M. Hoover Boys Town National Research Hospital Omaha, NE

More information

Telephone Based Automatic Voice Pathology Assessment.

Telephone Based Automatic Voice Pathology Assessment. Telephone Based Automatic Voice Pathology Assessment. Rosalyn Moran 1, R. B. Reilly 1, P.D. Lacy 2 1 Department of Electronic and Electrical Engineering, University College Dublin, Ireland 2 Royal Victoria

More information

Effect of spectral content and learning on auditory distance perception

Effect of spectral content and learning on auditory distance perception Effect of spectral content and learning on auditory distance perception Norbert Kopčo 1,2, Dávid Čeljuska 1, Miroslav Puszta 1, Michal Raček 1 a Martin Sarnovský 1 1 Department of Cybernetics and AI, Technical

More information

Twenty subjects (11 females) participated in this study. None of the subjects had

Twenty subjects (11 females) participated in this study. None of the subjects had SUPPLEMENTARY METHODS Subjects Twenty subjects (11 females) participated in this study. None of the subjects had previous exposure to a tone language. Subjects were divided into two groups based on musical

More information

11 Music and Speech Perception

11 Music and Speech Perception 11 Music and Speech Perception Properties of sound Sound has three basic dimensions: Frequency (pitch) Intensity (loudness) Time (length) Properties of sound The frequency of a sound wave, measured in

More information

Auditory scene analysis in humans: Implications for computational implementations.

Auditory scene analysis in humans: Implications for computational implementations. Auditory scene analysis in humans: Implications for computational implementations. Albert S. Bregman McGill University Introduction. The scene analysis problem. Two dimensions of grouping. Recognition

More information

Computational Perception /785. Auditory Scene Analysis

Computational Perception /785. Auditory Scene Analysis Computational Perception 15-485/785 Auditory Scene Analysis A framework for auditory scene analysis Auditory scene analysis involves low and high level cues Low level acoustic cues are often result in

More information

Auditory Scene Analysis

Auditory Scene Analysis 1 Auditory Scene Analysis Albert S. Bregman Department of Psychology McGill University 1205 Docteur Penfield Avenue Montreal, QC Canada H3A 1B1 E-mail: bregman@hebb.psych.mcgill.ca To appear in N.J. Smelzer

More information

PATTERN ELEMENT HEARING AIDS AND SPEECH ASSESSMENT AND TRAINING Adrian FOURCIN and Evelyn ABBERTON

PATTERN ELEMENT HEARING AIDS AND SPEECH ASSESSMENT AND TRAINING Adrian FOURCIN and Evelyn ABBERTON PATTERN ELEMENT HEARING AIDS AND SPEECH ASSESSMENT AND TRAINING Adrian FOURCIN and Evelyn ABBERTON Summary This paper has been prepared for a meeting (in Beijing 9-10 IX 1996) organised jointly by the

More information

! Can hear whistle? ! Where are we on course map? ! What we did in lab last week. ! Psychoacoustics

! Can hear whistle? ! Where are we on course map? ! What we did in lab last week. ! Psychoacoustics 2/14/18 Can hear whistle? Lecture 5 Psychoacoustics Based on slides 2009--2018 DeHon, Koditschek Additional Material 2014 Farmer 1 2 There are sounds we cannot hear Depends on frequency Where are we on

More information

Congruency Effects with Dynamic Auditory Stimuli: Design Implications

Congruency Effects with Dynamic Auditory Stimuli: Design Implications Congruency Effects with Dynamic Auditory Stimuli: Design Implications Bruce N. Walker and Addie Ehrenstein Psychology Department Rice University 6100 Main Street Houston, TX 77005-1892 USA +1 (713) 527-8101

More information

METHOD. The current study was aimed at investigating the prevalence and nature of voice

METHOD. The current study was aimed at investigating the prevalence and nature of voice CHAPTER - 3 METHOD The current study was aimed at investigating the prevalence and nature of voice problems in call center operators with the following objectives: To determine the prevalence of voice

More information

Voice Analysis in Individuals with Chronic Obstructive Pulmonary Disease

Voice Analysis in Individuals with Chronic Obstructive Pulmonary Disease ORIGINAL ARTICLE Voice Analysis in Individuals with Chronic 10.5005/jp-journals-10023-1081 Obstructive Pulmonary Disease Voice Analysis in Individuals with Chronic Obstructive Pulmonary Disease 1 Anuradha

More information

Hearing Lectures. Acoustics of Speech and Hearing. Auditory Lighthouse. Facts about Timbre. Analysis of Complex Sounds

Hearing Lectures. Acoustics of Speech and Hearing. Auditory Lighthouse. Facts about Timbre. Analysis of Complex Sounds Hearing Lectures Acoustics of Speech and Hearing Week 2-10 Hearing 3: Auditory Filtering 1. Loudness of sinusoids mainly (see Web tutorial for more) 2. Pitch of sinusoids mainly (see Web tutorial for more)

More information

Frequency Tracking: LMS and RLS Applied to Speech Formant Estimation

Frequency Tracking: LMS and RLS Applied to Speech Formant Estimation Aldebaro Klautau - http://speech.ucsd.edu/aldebaro - 2/3/. Page. Frequency Tracking: LMS and RLS Applied to Speech Formant Estimation ) Introduction Several speech processing algorithms assume the signal

More information

Separate What and Where Decision Mechanisms In Processing a Dichotic Tonal Sequence

Separate What and Where Decision Mechanisms In Processing a Dichotic Tonal Sequence Journal of Experimental Psychology: Human Perception and Performance 1976, Vol. 2, No. 1, 23-29 Separate What and Where Decision Mechanisms In Processing a Dichotic Tonal Sequence Diana Deutsch and Philip

More information

Spectrograms (revisited)

Spectrograms (revisited) Spectrograms (revisited) We begin the lecture by reviewing the units of spectrograms, which I had only glossed over when I covered spectrograms at the end of lecture 19. We then relate the blocks of a

More information

Trading Directional Accuracy for Realism in a Virtual Auditory Display

Trading Directional Accuracy for Realism in a Virtual Auditory Display Trading Directional Accuracy for Realism in a Virtual Auditory Display Barbara G. Shinn-Cunningham, I-Fan Lin, and Tim Streeter Hearing Research Center, Boston University 677 Beacon St., Boston, MA 02215

More information

Kaylah Lalonde, Ph.D. 555 N. 30 th Street Omaha, NE (531)

Kaylah Lalonde, Ph.D. 555 N. 30 th Street Omaha, NE (531) Kaylah Lalonde, Ph.D. kaylah.lalonde@boystown.org 555 N. 30 th Street Omaha, NE 68131 (531)-355-5631 EDUCATION 2014 Ph.D., Speech and Hearing Sciences Indiana University minor: Psychological and Brain

More information

Gick et al.: JASA Express Letters DOI: / Published Online 17 March 2008

Gick et al.: JASA Express Letters DOI: / Published Online 17 March 2008 modality when that information is coupled with information via another modality (e.g., McGrath and Summerfield, 1985). It is unknown, however, whether there exist complex relationships across modalities,

More information

Carnegie Mellon University Annual Progress Report: 2011 Formula Grant

Carnegie Mellon University Annual Progress Report: 2011 Formula Grant Carnegie Mellon University Annual Progress Report: 2011 Formula Grant Reporting Period January 1, 2012 June 30, 2012 Formula Grant Overview The Carnegie Mellon University received $943,032 in formula funds

More information

Changes in the Role of Intensity as a Cue for Fricative Categorisation

Changes in the Role of Intensity as a Cue for Fricative Categorisation INTERSPEECH 2013 Changes in the Role of Intensity as a Cue for Fricative Categorisation Odette Scharenborg 1, Esther Janse 1,2 1 Centre for Language Studies and Donders Institute for Brain, Cognition and

More information

Role of F0 differences in source segregation

Role of F0 differences in source segregation Role of F0 differences in source segregation Andrew J. Oxenham Research Laboratory of Electronics, MIT and Harvard-MIT Speech and Hearing Bioscience and Technology Program Rationale Many aspects of segregation

More information

Hearing in the Environment

Hearing in the Environment 10 Hearing in the Environment Click Chapter to edit 10 Master Hearing title in the style Environment Sound Localization Complex Sounds Auditory Scene Analysis Continuity and Restoration Effects Auditory

More information

A neural network model for optimizing vowel recognition by cochlear implant listeners

A neural network model for optimizing vowel recognition by cochlear implant listeners A neural network model for optimizing vowel recognition by cochlear implant listeners Chung-Hwa Chang, Gary T. Anderson, Member IEEE, and Philipos C. Loizou, Member IEEE Abstract-- Due to the variability

More information

The development of a modified spectral ripple test

The development of a modified spectral ripple test The development of a modified spectral ripple test Justin M. Aronoff a) and David M. Landsberger Communication and Neuroscience Division, House Research Institute, 2100 West 3rd Street, Los Angeles, California

More information

Topic 4. Pitch & Frequency. (Some slides are adapted from Zhiyao Duan s course slides on Computer Audition and Its Applications in Music)

Topic 4. Pitch & Frequency. (Some slides are adapted from Zhiyao Duan s course slides on Computer Audition and Its Applications in Music) Topic 4 Pitch & Frequency (Some slides are adapted from Zhiyao Duan s course slides on Computer Audition and Its Applications in Music) A musical interlude KOMBU This solo by Kaigal-ool of Huun-Huur-Tu

More information

Digital. hearing instruments have burst on the

Digital. hearing instruments have burst on the Testing Digital and Analog Hearing Instruments: Processing Time Delays and Phase Measurements A look at potential side effects and ways of measuring them by George J. Frye Digital. hearing instruments

More information

Although most previous studies on categorization

Although most previous studies on categorization Japanese Psychological Research 1987, Vol.29, No.3, 120-130 Effects of category structure on children's categorization TAKESHI SUGIMURA and TOYOKO INOUE Department of Psychology, Nara University of Education,

More information

Tactile Communication of Speech

Tactile Communication of Speech Tactile Communication of Speech RLE Group Sensory Communication Group Sponsor National Institutes of Health/National Institute on Deafness and Other Communication Disorders Grant 2 R01 DC00126, Grant 1

More information

New Approaches to Studying Auditory Processing in Marine Mammals

New Approaches to Studying Auditory Processing in Marine Mammals DISTRIBUTION STATEMENT A. Approved for public release; distribution is unlimited. New Approaches to Studying Auditory Processing in Marine Mammals James J. Finneran Space and Naval Warfare Systems Center

More information

Signals, systems, acoustics and the ear. Week 1. Laboratory session: Measuring thresholds

Signals, systems, acoustics and the ear. Week 1. Laboratory session: Measuring thresholds Signals, systems, acoustics and the ear Week 1 Laboratory session: Measuring thresholds What s the most commonly used piece of electronic equipment in the audiological clinic? The Audiometer And what is

More information

A FRÖHLICH EFFECT IN MEMORY FOR AUDITORY PITCH: EFFECTS OF CUEING AND OF REPRESENTATIONAL GRAVITY. Timothy L. Hubbard 1 & Susan E.

A FRÖHLICH EFFECT IN MEMORY FOR AUDITORY PITCH: EFFECTS OF CUEING AND OF REPRESENTATIONAL GRAVITY. Timothy L. Hubbard 1 & Susan E. In D. Algom, D. Zakay, E. Chajut, S. Shaki, Y. Mama, & V. Shakuf (Eds.). (2011). Fechner Day 2011: Proceedings of the 27 th Annual Meeting of the International Society for Psychophysics (pp. 89-94). Raanana,

More information

Auditory temporal order and perceived fusion-nonfusion

Auditory temporal order and perceived fusion-nonfusion Perception & Psychophysics 1980.28 (5). 465-470 Auditory temporal order and perceived fusion-nonfusion GREGORY M. CORSO Georgia Institute of Technology, Atlanta, Georgia 30332 A pair of pure-tone sine

More information

Sinlultaneous vs sequential discriminations of Markov-generated stimuli 1

Sinlultaneous vs sequential discriminations of Markov-generated stimuli 1 Sinlultaneous vs sequential discriminations of Markov-generated stimuli 1 BLL R. BROWN AND THOMAS J. REBBN2 UNlVERSTY OF LOUSVLLE This experiment required Ss to make Markov-generated histoforms that were

More information

SoundRecover2 the first adaptive frequency compression algorithm More audibility of high frequency sounds

SoundRecover2 the first adaptive frequency compression algorithm More audibility of high frequency sounds Phonak Insight April 2016 SoundRecover2 the first adaptive frequency compression algorithm More audibility of high frequency sounds Phonak led the way in modern frequency lowering technology with the introduction

More information

THE EFFECT OF A REMINDER STIMULUS ON THE DECISION STRATEGY ADOPTED IN THE TWO-ALTERNATIVE FORCED-CHOICE PROCEDURE.

THE EFFECT OF A REMINDER STIMULUS ON THE DECISION STRATEGY ADOPTED IN THE TWO-ALTERNATIVE FORCED-CHOICE PROCEDURE. THE EFFECT OF A REMINDER STIMULUS ON THE DECISION STRATEGY ADOPTED IN THE TWO-ALTERNATIVE FORCED-CHOICE PROCEDURE. Michael J. Hautus, Daniel Shepherd, Mei Peng, Rebecca Philips and Veema Lodhia Department

More information

PERCEPTION OF UNATTENDED SPEECH. University of Sussex Falmer, Brighton, BN1 9QG, UK

PERCEPTION OF UNATTENDED SPEECH. University of Sussex Falmer, Brighton, BN1 9QG, UK PERCEPTION OF UNATTENDED SPEECH Marie Rivenez 1,2, Chris Darwin 1, Anne Guillaume 2 1 Department of Psychology University of Sussex Falmer, Brighton, BN1 9QG, UK 2 Département Sciences Cognitives Institut

More information

AUDL GS08/GAV1 Signals, systems, acoustics and the ear. Pitch & Binaural listening

AUDL GS08/GAV1 Signals, systems, acoustics and the ear. Pitch & Binaural listening AUDL GS08/GAV1 Signals, systems, acoustics and the ear Pitch & Binaural listening Review 25 20 15 10 5 0-5 100 1000 10000 25 20 15 10 5 0-5 100 1000 10000 Part I: Auditory frequency selectivity Tuning

More information

An active unpleasantness control system for indoor noise based on auditory masking

An active unpleasantness control system for indoor noise based on auditory masking An active unpleasantness control system for indoor noise based on auditory masking Daisuke Ikefuji, Masato Nakayama, Takanabu Nishiura and Yoich Yamashita Graduate School of Information Science and Engineering,

More information

Resonating memory traces account for the perceptual magnet effect

Resonating memory traces account for the perceptual magnet effect Resonating memory traces account for the perceptual magnet effect Gerhard Jäger Dept. of Linguistics, University of Tübingen, Germany Introduction In a series of experiments, atricia Kuhl and co-workers

More information

Speech Generation and Perception

Speech Generation and Perception Speech Generation and Perception 1 Speech Generation and Perception : The study of the anatomy of the organs of speech is required as a background for articulatory and acoustic phonetics. An understanding

More information

Lecturer: Rob van der Willigen 11/9/08

Lecturer: Rob van der Willigen 11/9/08 Auditory Perception - Detection versus Discrimination - Localization versus Discrimination - - Electrophysiological Measurements Psychophysical Measurements Three Approaches to Researching Audition physiology

More information

Study of perceptual balance for binaural dichotic presentation

Study of perceptual balance for binaural dichotic presentation Paper No. 556 Proceedings of 20 th International Congress on Acoustics, ICA 2010 23-27 August 2010, Sydney, Australia Study of perceptual balance for binaural dichotic presentation Pandurangarao N. Kulkarni

More information

Goodness of Pattern and Pattern Uncertainty 1

Goodness of Pattern and Pattern Uncertainty 1 J'OURNAL OF VERBAL LEARNING AND VERBAL BEHAVIOR 2, 446-452 (1963) Goodness of Pattern and Pattern Uncertainty 1 A visual configuration, or pattern, has qualities over and above those which can be specified

More information

Discrete Signal Processing

Discrete Signal Processing 1 Discrete Signal Processing C.M. Liu Perceptual Lab, College of Computer Science National Chiao-Tung University http://www.cs.nctu.edu.tw/~cmliu/courses/dsp/ ( Office: EC538 (03)5731877 cmliu@cs.nctu.edu.tw

More information

Lecturer: Rob van der Willigen 11/9/08

Lecturer: Rob van der Willigen 11/9/08 Auditory Perception - Detection versus Discrimination - Localization versus Discrimination - Electrophysiological Measurements - Psychophysical Measurements 1 Three Approaches to Researching Audition physiology

More information

Development of a new loudness model in consideration of audio-visual interaction

Development of a new loudness model in consideration of audio-visual interaction Development of a new loudness model in consideration of audio-visual interaction Kai AIZAWA ; Takashi KAMOGAWA ; Akihiko ARIMITSU 3 ; Takeshi TOI 4 Graduate school of Chuo University, Japan, 3, 4 Chuo

More information

Sound Preference Development and Correlation to Service Incidence Rate

Sound Preference Development and Correlation to Service Incidence Rate Sound Preference Development and Correlation to Service Incidence Rate Terry Hardesty a) Sub-Zero, 4717 Hammersley Rd., Madison, WI, 53711, United States Eric Frank b) Todd Freeman c) Gabriella Cerrato

More information