MODELING WITHIN-PAIR ORDER EFFECTS IN PAIRED-COMPARISON JUDGMENTS.

Similar documents
Extraction of Auditory Features and Elicitation of Attributes for the Assessment of Multichannel Reproduced Sound*

TESTING A NEW THEORY OF PSYCHOPHYSICAL SCALING: TEMPORAL LOUDNESS INTEGRATION

Convention Paper Presented at the 118th Convention 2005 May Barcelona, Spain

Effects of Sequential Context on Judgments and Decisions in the Prisoner s Dilemma Game

CONTRIBUTION OF DIRECTIONAL ENERGY COMPONENTS OF LATE SOUND TO LISTENER ENVELOPMENT

TIME-ORDER EFFECTS FOR AESTHETIC PREFERENCE

Modeling Human Perception

Spatial audio and sensory evaluation techniques context, history and aims

Goodness of Pattern and Pattern Uncertainty 1

Evaluation of Perceived Spatial Audio Quality

The influence of binaural incoherence on annoyance reported for unpleasant low frequency sound

A comparison between Japanese and Chinese adjectives which express auditory impressions

Influence of music-induced floor vibration on impression of music in concert halls

Framework for Comparative Research on Relational Information Displays

Classical Psychophysical Methods (cont.)

Pushing the Right Buttons: Design Characteristics of Touch Screen Buttons

Handling Partial Preferences in the Belief AHP Method: Application to Life Cycle Assessment

Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data

Decisions based on verbal probabilities: Decision bias or decision by belief sampling?

Loudness Processing of Time-Varying Sounds: Recent advances in psychophysics and challenges for future research

EFFECTS OF RHYTHM ON THE PERCEPTION OF URGENCY AND IRRITATION IN 1 AUDITORY SIGNALS. Utrecht University

STUDY OF THE BISECTION OPERATION

Empirical Formula for Creating Error Bars for the Method of Paired Comparison

Lec 02: Estimation & Hypothesis Testing in Animal Ecology

MEASURING AFFECTIVE RESPONSES TO CONFECTIONARIES USING PAIRED COMPARISONS

SHORT REPORT Subsymmetries predict auditory and visual pattern complexity

Lecturer: Rob van der Willigen 11/9/08

Lecturer: Rob van der Willigen 11/9/08

SUPPLEMENTARY INFORMATION. Table 1 Patient characteristics Preoperative. language testing

Toward an objective measure for a stream segregation task

Audio Quality Assessment

Evaluation of Spatial Audio Reproduction Methods (Part 2): Analysis of Listener Preference

A Memory Model for Decision Processes in Pigeons

MODELS FOR THE ADJUSTMENT OF RATING SCALES 1

Learning to classify integral-dimension stimuli

Caveats for the Spatial Arrangement Method: Supplementary Materials

The effects of subthreshold synchrony on the perception of simultaneity. Ludwig-Maximilians-Universität Leopoldstr 13 D München/Munich, Germany

A Brief (very brief) Overview of Biostatistics. Jody Kreiman, PhD Bureau of Glottal Affairs

Changing expectations about speed alters perceived motion direction

SCALAR TIMING (EXPECTANCY) THEORY: A COMPARISON BETWEEN PROSPECTIVE AND RETROSPECTIVE DURATION. Abstract

Effect of spectral content and learning on auditory distance perception

Congruency Effects with Dynamic Auditory Stimuli: Design Implications

Formulating Emotion Perception as a Probabilistic Model with Application to Categorical Emotion Classification

Implicit Information in Directionality of Verbal Probability Expressions

THE RELATION BETWEEN SPATIAL IMPRESSION AND THE PRECEDENCE EFFECT. Masayuki Morimoto

Categorical Perception

Adapting to Remapped Auditory Localization Cues: A Decision-Theory Model

Encoding of Elements and Relations of Object Arrangements by Young Children

3-D Sound and Spatial Audio. What do these terms mean?

Absolute Identification is Surprisingly Faster with More Closely Spaced Stimuli

Some methodological aspects for measuring asynchrony detection in audio-visual stimuli

Effect of Choice Set on Valuation of Risky Prospects

Considering the perception of combined railway noise and vibration as a multidimensional phenomenon

Introduction & Basics

A model of parallel time estimation

SYSTEMATIC EVALUATION OF PERCEIVED SPATIAL QUALITY

Unit 1 Exploring and Understanding Data

Sensory Cue Integration

innate mechanism of proportionality adaptation stage activation or recognition stage innate biological metrics acquired social metrics

Comparing Direct and Indirect Measures of Just Rewards: What Have We Learned?

Deriving auditory features from triadic comparisons

AUTOCORRELATION AND CROSS-CORRELARION ANALYSES OF ALPHA WAVES IN RELATION TO SUBJECTIVE PREFERENCE OF A FLICKERING LIGHT

Synthesis of Spatially Extended Virtual Sources with Time-Frequency Decomposition of Mono Signals

Journal of Mathematical Psychology. Theory and tests of the conjoint commutativity axiom for additive conjoint measurement

STATISTICAL PROCESSING OF SUBJECTIVE TEST DATA FOR SOUND QUALITY EVALUATION OF AUTOMOTIVE HORN

IN EAR TO OUT THERE: A MAGNITUDE BASED PARAMETERIZATION SCHEME FOR SOUND SOURCE EXTERNALIZATION. Griffin D. Romigh, Brian D. Simpson, Nandini Iyer

Auditory Perception: Sense of Sound /785 Spring 2017

Evaluating normality of procedure up down trasformed response by Kolmogorov Smirnov test applied to difference thresholds for seat vibration

Introduction to Bayesian Analysis 1

On testing dependency for data in multidimensional contingency tables

USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES

Deconstructing the Receptive Field: Information Coding in Macaque area MST

Assessment of auditory temporal-order thresholds A comparison of different measurement procedures and the influences of age and gender

Perceptual Plasticity in Spatial Auditory Displays

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Today s Agenda. Human abilities Cognition Review for Exam1

EFFECTS OF TEMPORAL FINE STRUCTURE ON THE LOCALIZATION OF BROADBAND SOUNDS: POTENTIAL IMPLICATIONS FOR THE DESIGN OF SPATIAL AUDIO DISPLAYS

SUMMER 2011 RE-EXAM PSYF11STAT - STATISTIK

STOCHASTIC CHOICE HEURISTICS *

Consonant Perception test

VISUAL PERCEPTION OF STRUCTURED SYMBOLS

PSYC 441 Cognitive Psychology II

Wendy J. Reece (INEEL) Leroy J. Matthews (ISU) Linda K. Burggraff (ISU)

Correlation and Regression

It is Whether You Win or Lose: The Importance of the Overall Probabilities of Winning or Losing in Risky Choice

Risk attitude in decision making: A clash of three approaches

Time- order effects in timed brightness and size discrimination of paired visual stimuli

Quantifying temporal ventriloquism in audio-visual synchrony perception

Louis Leon Thurstone in Monte Carlo: Creating Error Bars for the Method of Paired Comparison

Neurobiology: The nerve cell. Principle and task To use a nerve function model to study the following aspects of a nerve cell:

Understanding Users. - cognitive processes. Unit 3

Technical Specifications

An algorithm modelling the Irrelevant Sound Effect (ISE)

Chapter 1: Explaining Behavior

Introduction to Computational Neuroscience

Perceptual Audio Evaluation

9/3/2014. Which impairs the ability to integrate these experiences in an adaptive manner.

Blame the Skilled. Introduction. Achievement Motivation: Ability and Effort. Skill, Expectation and Control

Readings: Textbook readings: OpenStax - Chapters 1 11 Online readings: Appendix D, E & F Plous Chapters 10, 11, 12 and 14

Sinlultaneous vs sequential discriminations of Markov-generated stimuli 1

Transcription:

MODELING WITHIN-PAIR ORDER EFFECTS IN PAIRED-COMPARISON JUDGMENTS Florian Wickelmaier 1,2 and Sylvain Choisel 1,3 1 Sound Quality Research Unit, Dept. of Acoustics, Aalborg University, Denmark 2 Dept. of Psychiatry, Ludwig-Maximilians-University of Munich, Germany 3 Bang & Olufsen A/S, Struer, Denmark florian.wickelmaier@med.uni-muenchen.de; sc@acoustics.aau.dk Abstract Probabilistic choice models which generalize the Bradley-Terry-Luce (BTL) model (Luce, 1959) and the elimination-by-aspects (EBA) model (Tversky, 1972) are presented. These models account for the effect of presentation order within a pair of stimuli, and thereby allow for quantifying judgmental biases for the first or second presentation interval. Data were collected in an experiment where 39 subjects made pairwise choices between short musical excerpts reproduced in eight different audio formats (mono, stereo, and multichannel). Choice criteria were overall preference, and eight specific (spatial and timbral) auditory attributes. The results indicate that biases are well accounted for by the generalized modeling, and that for seven of the nine attributes, including preference, they significantly favored the second presentation interval. Time and order effects in perception and decision making are phenomena met with great research interest by psychophysicists (see Hellström, 1985, for a review). In pairedcomparison experiments stimuli are often presented sequentially. When subjects are asked to choose one of two stimuli with respect to a given criterion, their judgments might be influenced by the presentation order within that pair. As a consequence, there might be a perceptual or judgmental bias favoring either the first or the second presentation interval. In the following section probabilistic choice models that account for the effect of presentation order within a pair are briefly introduced. Subsequently, an application is presented, where subjects were asked to judge upon overall preference and specific auditory sensations in an experiment on perceptual evaluation of reproduced sound. The models introduced were employed to quantifying within-pair order effects in the pairedcomparison judgments. Probabilistic choice models without order effect Among the most widely used choice models for paired comparisons is the Bradley-Terry- Luce (BTL) model (Bradley & Terry, 1952; Luce, 1959). The BTL model predicts the probability, P xy, of choosing stimulus x over stimulus y by P xy = u(x) u(x) + u(y), (1) where u( ) is a ratio scale (Luce, 1959) representing the strength or the weight of the stimuli. The BTL model requires the so-called independence of irrelevant alternatives (IIA) which in essence demands that the pairwise choices be made independently of the context introduced by a given pair. The IIA assumption has been criticized both on theoretical (e. g., Debreu, 1960) and empirical (e. g., Rumelhart & Greeno, 1971; Zimmer

et al., 2004) grounds. In order to relax this assumption, Tversky (1972) introduced the elimination-by-aspects (EBA) model. According to EBA, each stimulus is characterized by a set of features or aspects. When choosing between two alternatives, only those aspects which the two alternatives do not have in common influence the decision. The probability of choosing x over y is then defined as P xy = α x \y u(α) α x \y u(α) + β y \x u(β), (2) where x (y ) denotes the set of aspects belonging to stimulus x (y), and u( ) is a ratio scale of the aspects. The set of aspects, x does not share with y, is indicated by x \ y. Probabilistic choice models with order effect Neither BTL nor EBA model take into account the presentation order of the stimuli within a pair, and that there might be a bias for the first or the second presentation interval. Therefore, Davidson & Beaver (1977) extended the BTL model and suggested a multiplicative order effect, ϑ xy 0, that depends only on a given pair {x,y}. Let P xy x denote the probability that x is chosen over y, given that x was presented first. Then the choice probabilities for the ordered pair (x,y) are defined as P xy x = u(x) u(x) + ϑ xy u(y) P xy y = ϑ xy u(x) ϑ xy u(x) + u(y), (3) where u( ) is a ratio scale and ϑ xy is unique (Augustin, 2004). If ϑ xy is smaller (greater) than one, the bias favors the first (second) choice interval; there is no order effect if ϑ xy = 1. A special case arises when ϑ xy = ϑ, where the order effect is constant for all pairs. In a similar manner, the EBA model may be extended to account for judgmental bias by introducing a multiplicative order effect. Choice probabilities are given by u(α) α x P xy x = \y u(α) + ϑ xy u(β), (4) α x \y β y \x where parameters are defined as in Eq. 2 and Eq. 3. The probability P xy y may be specified accordingly. Note that this generalized EBA model includes both the BTL model and the Davidson-Beaver model as special cases. Method Thirty-nine subjects took part in an experiment on the perceived quality of different audio formats (mono, stereo, and multichannel). Figure 1 displays the playback setup placed in an acoustically treated insulated room. Using this setup, four musical excerpts (program materials: two pop, two classical) of 5 s duration were played back to the listeners in eight different formats (reproduction modes), summarized in Table 1. Details of the stimulus rendering and presentation can be found in Choisel & Wickelmaier (in press).

C L R LL 30 45 2.5 m RR Screen LS 110 Curtain RS Figure 1: Playback setup consisting of seven loudspeakers: left (L), right (R), center (C), left-of-left (LL), right-of-right (RR), left surround (LS) and right surround (RS). This setup was symmetrically placed with respect to the width of the room and was hidden from the subject by an acoustically transparent curtain. A computer flat screen was used as a response interface. For each pair of reproduction modes, the subjects were asked (in Danish) Which of the two sounds is more... followed by one of the following adjectives: wide (bred), elevated (høj oppe), spacious (rummelig), enveloping (omsluttende), far ahead (langt foran), bright (lys), clear (tydelig) and natural (naturlig). Two buttons on a computer screen, labeled A and B, were visually emphasized in turn (by changing their size) during playback to indicate which sound was playing. The response was given by clicking the button corresponding to the chosen sound. Each pair was judged only once. A completely balanced design (David, 1988, chapter 5) was employed to ensure that each stimulus occurred equally often in both presentation intervals. In addition, the within-pair order was balanced across subjects by having half of the subjects receiving the pairs in reverse order. The between-pair order was randomized for each subject. Each participant gave 28 judgments per program material and auditory attribute in an experimental block. For a given attribute, all four program materials were completed sequentially in about 25 minutes. The order of the attributes and program materials was balanced across subjects using a Graeco-Latin square design. Overall pref- Table 1: The eight reproduction modes used in the experiment: full name and loudspeakers active during playback (see Figure 1). Name Speakers mono C phantom mono L,R stereo L,R wide stereo LL,RR matrix upmixing L,R,LS,RS Dolby Pro Logic II L,R,C,LS,RS DTS Neo:6 L,R,C,LS,RS original 5.0 L,R,C,LS,RS

erence judgments were collected in a similar fashion, except that each subject received each pair in both orders, resulting in 56 preference judgments per musical excerpt and attribute. The choice data collected in each presentation order were aggregated over subjects. Analysis of the choice frequencies thus obtained proceeded as follows: First, the two presentation orders were collapsed and the resulting data were modeled according to BTL (Eq. 1). Where BTL showed significant (α =.05) lack of fit, the EBA model was employed (Eq. 2). Second, order-effect models were applied to the data observed in the two presentation orders; according to the previous results, either the Davidson-Beaver model (Eq. 3) or the EBA order-effect model (Eq. 4) were applied. Parameter estimation and model testing was performed using software described in Wickelmaier & Schmid (2004). Results and discussion Generally, the BTL model was found to fit the choice data well. For two attributes (width and envelopment) in one of the musical excerpts (Steely Dan), however, BTL had to be rejected; this indicates that IIA was violated and that more complex choice mechanisms might have played a role. In these two cases, EBA models (with nine aspect parameters each) accounted for the collected judgments sufficiently well (results not shown). Table 2 shows the outcome of fitting order-effect models to the choice data. In columns three to five, the deviance statistic is reported as a lack-of-fit measure. Across all attributes and excerpts, the fit was fair to good apart from few exceptions (for lack of space, Table 2 displays the results for only four of the nine attributes). Columns six to eight show point and interval estimates of the order parameter, which was restricted to be equal for all pairs of a given attribute/excerpt data set. Estimates for ϑ ranged from 0.81 to 2.19 (mean 1.41). To illustrate, an estimate of ˆϑ = 2 would indicate that if two audio formats are equal with respect to a given attribute, the probability of choosing the second sound is two times greater than choosing the first sound. For seven of the nine attributes, order effect parameters were significantly greater than one. This indicates that for the majority of attributes (including preference) there was a bias favoring the second presentation interval. For brightness, the order effect was not significant. Only for distance, order effect parameters were less than one, indicating that only for this attribute there is a tendency towards an advantage of the first interval. Although the goodness of fit of the order effect models was generally found to be satisfactory, occasional misfits were observed. This might indicate that the assumption of a constant order effect across all pairs, ϑ xy = ϑ, is too restrictive, and that the magnitude or even the direction of the bias varies with the pair. Further analysis to that effect is currently ongoing. In the present study, the first sound played was always associated with the left button on the computer screen, and the second sound with the right button. Thus, the temporal effect of presentation order is confounded with a potential visual (or spatial) bias favoring the left or right button. Therefore, the current findings should be confirmed in further experiments where temporal presentation order and assignment to left/right button are counterbalanced. In summary, the within-pair order effects encountered in this study were systematic and rather large. Across a variety of auditory attributes, judgments tended to be biased towards the second choice interval. The EBA order-effect model presented al-

Table 2: Goodness-of-fit statistics and parameter estimates for the order-effect models. Reported are the results of likelihood ratio tests of the Davidson-Beaver model (where df = 48) and the EBA order-effect model (where df = 47) vs. a saturated binomial model. Order effects (ϑ) and 95% confidence intervals have been estimated by maximum likelihood. Attribute Excerpt χ 2 df p ˆϑ 2.5% 97.5% Preference Beethoven 59.80 48 0.118 1.55 1.38 1.72 Rachmaninov 82.44 48 0.001 1.86 1.66 2.06 Steely Dan 48.76 48 0.442 1.41 1.27 1.56 Sting 67.40 48 0.034 1.26 1.14 1.38 Envelopment Beethoven 56.71 48 0.182 1.80 1.51 2.09 Rachmaninov 73.89 48 0.010 1.73 1.47 1.99 Steely Dan 63.38 47 0.056 1.36 1.16 1.57 Sting 39.99 48 0.788 1.15 0.99 1.31 Spaciousness Beethoven 61.82 48 0.087 2.07 1.74 2.39 Rachmaninov 61.63 48 0.089 2.19 1.86 2.53 Steely Dan 61.89 48 0.086 1.27 1.08 1.46 Sting 63.06 48 0.071 1.48 1.24 1.72 Distance Beethoven 57.92 48 0.155 0.83 0.73 0.93 Rachmaninov 41.57 48 0.732 0.89 0.78 1.00 Steely Dan 41.88 48 0.720 0.81 0.72 0.91 Sting 42.47 48 0.698 0.93 0.81 1.05 lows for quantification of such order effects, when context independence of the paired comparison judgments cannot be readily assumed. Acknowledgements This work was supported by the Centercontract on Sound Quality, Aalborg University. References Augustin, T. (2004). Bradley-Terry-Luce models to incorporate within-pair order effects: representation and uniqueness theorems. British Journal of Mathematical and Statistical Psychology, 57, 281 294. Bradley, R. A. & Terry, M. E. (1952). Rank analysis of incomplete block designs: I. The method of paired comparisons. Biometrika, 39, 324 345. Choisel, S. & Wickelmaier, F. (in press). Evaluation of multichannel reproduced sound: scaling auditory attributes underlying listener preference. Journal of the Acoustical Society of America. David, H. A. (1988). The Method of Paired Comparisons. New York: Oxford University Press. Davidson, R. R. & Beaver, R. J. (1977). On extending the Bradley-Terry model to incorporate within-pair order effects. Biometrics, 33, 693 702. Debreu, G. (1960). Review of R. D. Luce s Individual choice behavior: A theoretical analysis. American Economic Review, 50, 186 188. Hellström, Å. (1985). The time-order error and its relatives: mirrors of cognitive processes in comparing. Psychological Bulletin, 97, 35 61.

Luce, R. D. (1959). Individual choice behavior: A theoretical analysis. New York: Wiley. Rumelhart, D. L. & Greeno, J. G. (1971). Similarity between stimuli: An experimental test of the Luce and Restle choice models. Journal of Mathematical Psychology, 8, 370 381. Tversky, A. (1972). Elimination by aspects: a theory of choice. Psychological Review, 79, 281 299. Wickelmaier, F. & Schmid, C. (2004). A Matlab function to estimate choice model parameters from paired-comparison data. Behavior Research Methods, Instruments, & Computers, 36, 29 40. Zimmer, K., Ellermeier, W., & Schmid, C. (2004). Using probabilistic choice models to investigate auditory unpleasantness. Acta Acustica united with Acustica, 90, 1019 1028.