Richard Williams, Notre Dame Sociology
rwilliam@nd.edu
http://www.nd.edu/~rwilliam
Meetings of the European Survey Research Association, Ljubljana, Slovenia, July 19, 2013
Comparing Logit and Probit Coefficients across groups
We often want to compare the effects of variables across groups, e.g. we want to see if the effect of education is the same for men as it is for women.
But many/most researchers do not realize that methods typically used with continuous dependent variables to compare effects across groups may be problematic when the dependent variable is binary or ordinal.
We often think that the observed binary or ordinal variable y is a collapsed version of a latent continuous unobserved variable y*.
Because y* is unobserved, its metric has to be fixed in some way. This is typically done by scaling y* so that its residual variance is π²/3 ≈ 3.29.
But this creates problems similar to those encountered when analyzing standardized coefficients in OLS: unless the residual variance really is the same in both groups (i.e. errors are homoskedastic), the coefficients will be scaled differently and will not be comparable.
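The scaling problem can be written out in a few lines. What follows is a sketch for the logit case only; the intercept $\alpha$, the scale parameter $\sigma$, and the symbol $\Lambda$ for the standard logistic CDF are notation introduced here for illustration, not symbols taken from the slides.

```latex
\begin{align*}
y_i^* &= \alpha + x_i\beta + \sigma\varepsilon_i,
  \qquad \operatorname{Var}(\varepsilon_i) = \pi^2/3 \\
\Pr(y_i = 1) &= \Pr(y_i^* > 0)
  = \Pr\!\left(\varepsilon_i > -\frac{\alpha + x_i\beta}{\sigma}\right)
  = \Lambda\!\left(\frac{\alpha + x_i\beta}{\sigma}\right)
\end{align*}
```

Under these assumptions the data identify only $\alpha/\sigma$ and $\beta/\sigma$, never $\beta$ and $\sigma$ separately. Two groups with the same $\beta$ but different $\sigma$ would therefore show different estimated coefficients.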
Case 1: True coefficients are equal, residual variances differ
True coefficients:
  Group 0: $y_i^* = x_{i1} + x_{i2} + x_{i3} + \varepsilon_i$
  Group 1: $y_i^* = x_{i1} + x_{i2} + x_{i3} + 2\varepsilon_i$
Standardized coefficients:
  Group 0: $y_i^* = x_{i1} + x_{i2} + x_{i3} + \varepsilon_i$
  Group 1: $y_i^* = .5x_{i1} + .5x_{i2} + .5x_{i3} + \varepsilon_i$
In Case 1, the true coefficients all equal 1 in both groups. But, because the residual standard deviation is twice as large for group 1 as it is for group 0, the standardized βs (i.e. the ones reported by most logistic regression programs) are only half as large for group 1 as for group 0.
Naive comparisons of coefficients can indicate differences where none exist.
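The arithmetic behind Case 1 can be checked numerically. The sketch below is illustrative only: the function names and the covariate values in `x` are hypothetical, and the error term is assumed to be standard logistic. It shows that a group with true coefficients of 1 and an error term of 2ε yields the same probabilities as a group with coefficients of .5 and an error term of ε, which is why the program reports the halved values.

```python
import math

def logistic_cdf(t):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-t))

def prob_positive(x, betas, sigma):
    """Pr(y* > 0) when y* = sum(beta_k * x_k) + sigma * eps, eps standard logistic."""
    index = sum(b * xk for b, xk in zip(betas, x))
    return logistic_cdf(index / sigma)

def standardized(betas, sigma):
    """Coefficients after rescaling so the error term has unit scale."""
    return [b / sigma for b in betas]

true_betas = [1.0, 1.0, 1.0]
group0 = standardized(true_betas, 1.0)   # error term is eps
group1 = standardized(true_betas, 2.0)   # error term is 2 * eps

x = (0.3, -1.2, 2.0)                      # hypothetical covariate values
p_true_scale = prob_positive(x, true_betas, 2.0)
p_standardized = prob_positive(x, group1, 1.0)

print(group0)  # [1.0, 1.0, 1.0]
print(group1)  # [0.5, 0.5, 0.5]
print(abs(p_true_scale - p_standardized) < 1e-12)  # True
```

Because both parameterizations imply the same probabilities, the observed binary outcomes alone cannot tell them apart.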
Substantive Example: Allison's (1999) model for group comparisons
Allison (Sociological Methods and Research, 1999) analyzes a data set of 301 male and 177 female biochemists.
Allison uses logistic regressions to predict the probability of promotion to associate professor.
Table 1: Results of Logit Regressions Predicting Promotion to Associate Professor for Male and Female Biochemists (Adapted from Allison 1999, p. 188)
                           Men                  Women                Ratio of      Chi-Square
Variable                   Coefficient  SE      Coefficient  SE      Coefficients  for Difference
Intercept                  -7.6802***   .6814   -5.8420***   .8659   .76           2.78
Duration                   1.9089***    .2141   1.4078***    .2573   .74           2.24
Duration squared           -0.1432***   .0186   -0.0956***   .0219   .67           2.74
Undergraduate selectivity  0.2158***    .0614   0.0551       .0717   .25           2.90
Number of articles         0.0737***    .0116   0.0340**     .0126   .46           5.37*
Job prestige               -0.4312***   .1088   -0.3708*     .1560   .86           0.10
Log likelihood             -526.54              -306.19
Error variance             3.29                 3.29
*p < .05, **p < .01, ***p < .001
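The last two columns of Table 1 can be reproduced from the coefficients and standard errors. The sketch below assumes the ratio is the women's coefficient divided by the men's, and that the chi-square is the usual one-degree-of-freedom Wald statistic for the difference between two coefficients estimated on independent samples. The function name is hypothetical; the slides do not state the formula, but this one matches the reported values.

```python
def ratio_and_wald(b_men, se_men, b_women, se_women):
    """Return (women/men coefficient ratio, Wald chi-square for the difference)."""
    ratio = b_women / b_men
    # Independent samples: the variance of the difference is the sum of variances.
    chi2 = (b_men - b_women) ** 2 / (se_men ** 2 + se_women ** 2)
    return ratio, chi2

# Number of articles row of Table 1
ratio, chi2 = ratio_and_wald(0.0737, 0.0116, 0.0340, 0.0126)
print(f"ratio = {ratio:.2f}, chi-square = {chi2:.2f}")
# prints: ratio = 0.46, chi-square = 5.37
```

The same function applied to the intercept row gives a chi-square of 2.78, again in line with the table.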
As his Table 1 shows, the effect of number of articles on promotion is about twice as great for males (.0737) as it is for females (.0340).
If accurate, this difference suggests that men get a greater payoff from their published work than do females, "a conclusion that many would find troubling" (Allison 1999:186).
BUT, Allison warns, women may have more heterogeneous career patterns, and unmeasured variables affecting chances for promotion may be more important for women than for men.
Allison argued that "The apparent difference in the coefficients for article counts in Table 1 does not necessarily reflect a real difference in causal effects. It can be readily explained by differences in the degree of residual variation between men and women."
Allison proposed one way for dealing with group comparisons, but there are others.
Solution I: Modify the Model & Make the hetero go away
Williams (2010) notes that often the appearance of heteroskedasticity is actually caused by other problems in model specification, e.g. variables are omitted, variables should be transformed (e.g. logged), squared terms should be added.
Williams (2010) shows that the heteroskedasticity issues in Allison's models go away if articles^2 is added to the model.
Solution II: Heterogeneous Choice Models
Heterogeneous choice/location-scale models explicitly specify the determinants of heteroskedasticity in an attempt to correct for it.
In the tenure problem, Allison and Williams both let residual variability differ by gender (but more complicated variance models are also possible).
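The Heterogeneous Choice (aka Location-Scale) Model
Can be used for binary or ordinal models.
Two equations, choice & variance.
Binary case:
$$\Pr(y_i = 1) = g\left(\frac{x_i\beta}{\exp(z_i\gamma)}\right) = g\left(\frac{x_i\beta}{\exp(\ln(\sigma_i))}\right) = g\left(\frac{x_i\beta}{\sigma_i}\right)$$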
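As a rough illustration of the binary case, the sketch below evaluates Pr(y = 1) with a logit link, where the numerator is the choice equation and the denominator comes from the variance equation. The function name, the covariate layout, and every coefficient value are hypothetical, chosen only so that σ equals 1 for men and 2 for women.

```python
import math

def hetero_choice_prob(x, beta, z, gamma):
    """Pr(y = 1) = g(x*beta / exp(z*gamma)), with g the logistic CDF (logit link)."""
    xb = sum(b * xk for b, xk in zip(beta, x))    # choice equation
    zg = sum(c * zk for c, zk in zip(gamma, z))   # variance equation, equals ln(sigma)
    sigma = math.exp(zg)
    return 1.0 / (1.0 + math.exp(-xb / sigma))

# Hypothetical setup: x = (constant, articles), z = (female,)
beta = [-1.0, 0.07]
gamma = [math.log(2.0)]   # assumed: sigma = 1 for men, sigma = 2 for women

p_men = hetero_choice_prob([1.0, 20.0], beta, [0.0], gamma)
p_women = hetero_choice_prob([1.0, 20.0], beta, [1.0], gamma)
print(p_men, p_women)
```

In this setup both groups share the same β, yet the women's index is divided by 2, so a logit fitted to women alone would report coefficients half the size of the men's.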
Problem: Radically different interpretations are possible
Hauser and Andrew (2006) noted that the effects of SES variables on educational attainment declined with each educational transition.
They modeled this via what they called the logistic response model with proportionality constraints (LRPC).
If the LRPC holds, the effects of variables differ only by a scale factor across each transition (or group), e.g. the model could hold if each SES variable only had half as large an effect on transition 2 as it did on transition 1.
Models compared
Williams (2010) showed that, even though the rationales behind the models are totally different, heterogeneous choice models produce identical fits to the LRPC models estimated by Hauser and Andrew.
Indeed, when the models are both applied to Allison's tenure data, the estimated coefficients are exactly identical or can be easily converted from one parameterization to the other.
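One way to see why the two models can fit identically is to write both for two groups with a logit link. The parameterization below is an illustrative sketch, not the notation of Hauser and Andrew or of Williams: $d_i$ is a group dummy, $\lambda$ is the LRPC scale factor for group 1, and $\delta$ and $\gamma$ are the group terms in the choice and variance equations.

```latex
\begin{align*}
\text{LRPC:}\quad
  \operatorname{logit}\Pr(y_i = 1) &=
  \begin{cases}
    \alpha_0 + x_i\beta,         & d_i = 0 \\
    \alpha_1 + \lambda\,x_i\beta, & d_i = 1
  \end{cases} \\[4pt]
\text{Heterogeneous choice:}\quad
  \operatorname{logit}\Pr(y_i = 1) &=
  \frac{\alpha_0^* + \delta d_i + x_i\beta}{\exp(\gamma d_i)} \\[4pt]
\text{Conversion:}\quad
  \lambda = \exp(-\gamma), \qquad
  \alpha_0 &= \alpha_0^*, \qquad
  \alpha_1 = (\alpha_0^* + \delta)\exp(-\gamma)
\end{align*}
```

Under these assumptions the mapping is one-to-one, so the fitted probabilities and the log likelihood coincide. The numbers alone cannot say whether $\lambda$ reflects smaller effects or a larger residual scale.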
But, the theoretical concerns that motivate the models lead to radically different interpretations of the results.
Those who believed that the LRPC was the theoretically correct model would likely conclude that there is substantial gender inequality in the tenure promotion process, because every variable has a smaller effect on women than it does on men.
Somebody looking at these exact same numbers from the standpoint of the hetero choice model would conclude there is no inequality; effects of variables are the same for both men and women and only appear different because differences in residual variability cause coefficients to get scaled differently.
Solution III: Compare Predicted Probabilities across groups
Long (2009) proposes a different analytical approach that he says avoids the problems with the previous approaches.
Long estimates models that allow for, say, every variable to interact with gender.
He then creates graphs like the following that plot differences in predicted probabilities of tenure for men and women.
[Figure: Contrasts of Adjusted Predictions of male with 95% CIs. Horizontal axis: total number of articles (0 to 50). Vertical axis: 0 to .8.]
This simple example shows that the predicted probabilities of tenure for men and women differ little for those with small numbers of articles.
But, the differences become greater as the number of articles increases. For example, a woman with 40 articles is predicted to be 45 percent less likely to get tenure than a man with 40 articles.
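The idea of contrasting predicted probabilities can be sketched with the group-specific logit coefficients from Table 1, since separate models by gender are equivalent to interacting every variable with gender. This is an illustration only. The covariate values held fixed (duration, selectivity, prestige) and the function names are hypothetical, and the coefficients come from Allison's models rather than Long's, so the output is not expected to reproduce the graph or the 45 percent figure.

```python
import math

# Logit coefficients copied from Table 1 (Allison 1999), in the order:
# intercept, duration, duration squared, selectivity, articles, prestige
MEN = [-7.6802, 1.9089, -0.1432, 0.2158, 0.0737, -0.4312]
WOMEN = [-5.8420, 1.4078, -0.0956, 0.0551, 0.0340, -0.3708]

def predicted_prob(coefs, duration, selectivity, articles, prestige):
    """Predicted probability of promotion from a logit model."""
    x = [1.0, duration, duration ** 2, selectivity, articles, prestige]
    index = sum(b * xk for b, xk in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-index))

def gap(articles, duration=7.0, selectivity=5.0, prestige=2.5):
    """Men's minus women's predicted probability; default covariates are assumed values."""
    p_men = predicted_prob(MEN, duration, selectivity, articles, prestige)
    p_women = predicted_prob(WOMEN, duration, selectivity, articles, prestige)
    return p_men - p_women

for a in range(0, 51, 10):
    print(a, round(gap(a), 3))
```

At these assumed covariate values the gap is close to zero for people with no articles and widens as the article count rises, which is the qualitative pattern described above.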
Critique of Long
Once differences in predicted probabilities are discovered, policy makers may decide that some sort of corrective action should be considered, i.e. the graphs will show you whether there is a reason to be concerned in the first place.
At the same time, Long's approach may be frustrating because it doesn't try to explain why the differences exist, i.e. is it because the effects of variables differ across groups or is it because of differences in residual variability?
From a policy standpoint, we would like to know what is causing these observed differences in predicted probabilities.
If it is because women are rewarded less for each article they write, we may want to examine if women's work is not being evaluated fairly.
If it is because of differences in residual variability, we may want to further examine why that is. For example, if family obligations create more career hurdles for women than they do for men, how can we make the workplace more family-friendly?
But if we do not know what is causing the differences, we aren't even sure where to start if we want to eliminate them.
But, as we have seen, when we try to explain group differences, the coefficients can be interpreted in radically different ways.
Given such ambiguity, some might argue that you should settle for description and not strive for explanation (at least not with the current data).
Others might argue that you should go with the model that you think makes most theoretical sense, while acknowledging that alternative interpretations of the results are possible.
Conclusions
Researchers need to be aware that comparisons of effects across groups are much more difficult with logit and ordered logit models than with OLS.
But unfortunately the proposed ways for dealing with these issues have problems of their own.
At this point, it is probably fair to say that the descriptions of the problems with group comparisons may be better, or at least more clear-cut, than the various proposed solutions.
Selected References
Allison, Paul. 1999. Comparing Logit and Probit Coefficients Across Groups. Sociological Methods and Research 28(2): 186-208.
Hauser, Robert M. and Megan Andrew. 2006. Another Look at the Stratification of Educational Transitions: The Logistic Response Model with Partial Proportionality Constraints. Sociological Methodology 36(1): 1-26.
Hoetker, Glenn. 2004. Confounded Coefficients: Extending Recent Advances in the Accurate Comparison of Logit and Probit Coefficients Across Groups. Working Paper, October 22, 2004. Retrieved September 27, 2011 (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=609104).
Keele, Luke and David K. Park. 2006. Difficult Choices: An Evaluation of Heterogeneous Choice Models. Working Paper, March 3, 2006. Retrieved March 21, 2006 (http://www.nd.edu/~rwilliam/oglm/ljk-021706.pdf).
Long, J. Scott. 2009. Group comparisons in logit and probit using predicted probabilities. Working Paper, June 25, 2009. Retrieved September 27, 2011 (http://www.indiana.edu/~jslsoc/files_research/groupdif/groupwithprobabilities/groups-with-prob-2009-06-25.pdf).
Long, J. Scott and Jeremy Freese. 2006. Regression Models for Categorical Dependent Variables Using Stata, 2nd Edition. College Station, Texas: Stata Press.
Williams, Richard. 2009. Using Heterogeneous Choice Models to Compare Logit and Probit Coefficients across Groups. Sociological Methods & Research 37(4): 531-559. A pre-publication version is available at http://www.nd.edu/~rwilliam/oglm/rw_hetero_choice.pdf.
Williams, Richard. 2010. Fitting Heterogeneous Choice Models with oglm. The Stata Journal 10(4): 540-567. A pre-publication version is available at http://www.nd.edu/~rwilliam/oglm/oglm_stata.pdf.
For more information, see: http://www3.nd.edu/~rwilliam