
STRUCTURAL EQUATION MODELING (SEM): A STEP BY STEP APPROACH (PART 1)

By: Zuraidah Zainol (PhD)
Faculty of Management & Economics, Universiti Pendidikan Sultan Idris
zuraidah@fpe.upsi.edu.my
2016

INTRODUCTION

Structural equation modeling (SEM) is simply another statistical technique for analysing data. It is not a new technique, but an extension of existing statistical techniques. However, even with some statistical background, reading and understanding SEM's output in a research report can seem very difficult and can lead to total frustration. In fact, SEM procedures are easy to perform and the results are easy to interpret. The difficulty lies in the many uncommon statistical terms used to describe almost every aspect of SEM, as if SEM had its own language, which makes explanations of SEM procedures and results hard to follow. For instance, instead of the familiar terms independent and dependent variables, SEM uses terms such as observed variables, latent variables, endogenous variables and exogenous variables to distinguish the types of variables.

By analogy, learning SEM is like learning to drive a car with automatic transmission. Even without any driving experience, the skill is easy to master. But for someone who already has the skills to drive a manual transmission car, mastering an automatic is much easier. Likewise, with a clear understanding of the basic concepts of SEM, anyone can conduct an SEM analysis and interpret and report its results. Thus, whether the aim is to conduct SEM procedures and report the results, or only to follow what is explained in an SEM-based research report, a clear idea of the basic concepts of SEM is a must. Accordingly, this manual was written

based on three main objectives, namely to clarify what SEM is and its basic concepts, the steps to perform an SEM analysis, and how to report the results.

WHAT IS SEM?

Structural equation modeling (SEM) is a multivariate technique that allows for simultaneous analysis of a series of direct or indirect dependence relationships between multiple independent and dependent variables (Garson, 2012b; Groenland & Stalpers, 2012; Hair, Black, Babin, & Anderson, 2010; Ho, 2006; Tabachnick & Fidell, 2007). Hence, it can be used to test a simple direct relationship between two variables, a direct relationship between one or several independent variables and several dependent variables, and even an indirect relationship that runs through a mediator. In that regard, SEM is an advanced statistical technique, as it incorporates many existing statistical techniques, including factor analysis, path analysis, correlation, analysis of variance (ANOVA) and multiple regression (Garson, 2012b; Ho, 2006). Because it can test many types of relationship, direct and indirect, as well as mediating and moderating effects, SEM is considered a comprehensive analysis for testing hypotheses about the relationships between observed and latent variables.

SEM APPROACHES

Two approaches to conducting SEM are widely identified in the literature, namely covariance-based SEM (CB-SEM) and partial least squares SEM (PLS-SEM). CB-SEM is considered more appropriate when confirming the relationships between latent variables is the major research concern, the structural model is of low to moderate complexity, the relationship between indicators and latent variables is modelled in reflective mode, and the sample size is large (Haenlein & Kaplan, 2004; Hair, Ringle, & Sarstedt,

2011; Henseler, Ringle, & Sinkovics, 2009; Henseler & Sarstedt, 2013; Urbach & Ahlemann, 2010). By contrast, PLS-SEM is deemed appropriate for prediction, particularly when the proposed model is relatively new and lacks theoretical support, the structural model is complex, the measurement model is modelled in reflective and/or formative mode, and the sample is small (Hair, et al., 2011; Henseler, et al., 2009; Henseler & Sarstedt, 2013; Urbach & Ahlemann, 2010). Thus, the two approaches to SEM are complementary and neither is superior to the other (Hair, Sarstedt, Ringle, & Mena, 2012).

To perform CB-SEM, software such as Analysis of Moment Structures (AMOS), EQS, LISREL and Mplus can be used (Garson, 2012b; Hair, et al., 2011; Kline, 2011; Tabachnick & Fidell, 2007), while for PLS-SEM, SmartPLS, SPSS PLS, PLS Graph and the R package semPLS can be used (Garson, 2012a; Hair, et al., 2011; Wan Mohamad Asyraf, 2013; Wong, 2013).

To decide which approach to use, three aspects need to be considered, i.e. the research objective, the model set-up and the data characteristics (Hair, et al., 2012; Ronkko & Evermann, 2013). If the research aim is to verify proposed relationships between constructs that are derived from strong theory, using reflective measurement of latent variables and a large sample, CB-SEM is deemed the better SEM approach for the study. More importantly, it has been emphasized that PLS-SEM should be considered as an alternative where CB-SEM cannot be used for any reason (Oke, Ogunsami, & Ogunlana, 2012, p. 91). Therefore, whenever the data collected fail to satisfy the CB-SEM assumptions, PLS-SEM can be considered as the alternative approach. Details regarding the selection of SEM approach are shown in

Table 1.

Table 1: Rules Of Thumb For Selecting PLS-SEM or CB-SEM

Criteria: Research Goals
PLS-SEM:
- Predicting key target constructs or identifying key driver constructs
- Exploratory research, theory building or an extension of an existing structural theory
- Testing and validating an exploratory model in the early stage of theoretical development
- Studies that intend to explain variance in the dependent variables and predict the dependent variable when an a priori model does not exist
- Investigating a relatively new phenomenon for which measurement models need to be newly developed
- The relationship between indicators and latent variables (LV) has to be modelled in different modes (formative and reflective)
- Prediction is more important than parameter estimation
CB-SEM:
- Theory testing, confirmation, selection or comparison of alternative theories
- Theory-oriented research (prior theory is strong)
- Testing the structural relationships (parameter estimation) between the latent constructs is the primary concern
- Established constructs and reflective measurement models are available
- Confirmatory studies in which the structural model is of low to moderate complexity
- More frequently accepted for rigorous model validation purposes; an established approach with recognized goodness-of-fit (GOF) metrics
Source: (Hair, et al., 2011; Henseler, et al., 2009; Henseler & Sarstedt, 2013); Chin & Newsted (1999) as cited in (Urbach & Ahlemann, 2010)

Criteria: Model set-up
PLS-SEM:
- If formative constructs are part of the structural model
- If the structural model is complex (many constructs and indicators), e.g. 100 constructs and 1000 indicators, but lacks theoretical support
CB-SEM:
- If error terms require additional specification
- If the model is of small to moderate complexity, e.g. fewer than 100 indicators, with strong prior theory
Source: (Hair, et al., 2011; Henseler, et al., 2009; Urbach & Ahlemann, 2010)

Criteria: Data characteristics
PLS-SEM:
- If latent variable scores are needed in subsequent analyses
- Small sample size and non-normal distribution
  * more accurate results for studies with a small sample (minimum sample of 30 to 100 cases)
- In dealing with the problem of consistency at large for large data
CB-SEM:
- If the research requires a global goodness-of-fit criterion
- If measurement model invariance needs to be tested
- If the data meet the CB-SEM assumptions
  * under normal data, CB-SEM provides slightly more precise model estimates
- For a large data set; sample size: minimum 200 to 800 (Haenlein & Kaplan, 2004)
Source: (Hair, et al., 2011; Urbach & Ahlemann, 2010)
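The rules of thumb in Table 1 can be read as a simple checklist. The sketch below is only an illustration of that reading, not a procedure from the cited sources: the class StudyProfile, the function suggest_sem_approach, the equal weighting of the criteria and the cut-off of 200 cases (taken from the lower bound of the CB-SEM sample size in Table 1) are all assumptions made here for demonstration.

```python
from dataclasses import dataclass


@dataclass
class StudyProfile:
    """Hypothetical summary of a planned study (field names are illustrative)."""
    goal: str                       # "prediction" or "confirmation"
    has_formative_constructs: bool  # any formative measurement in the model?
    sample_size: int                # number of usable cases
    data_normal: bool               # do the data look approximately normal?
    needs_global_fit: bool          # is a global goodness-of-fit index required?


def suggest_sem_approach(study, large_sample=200):
    """Tally a few Table 1 rules of thumb.

    Assumption: every rule counts equally. The source does not prescribe
    any weighting, so treat the result as a prompt for judgement only.
    """
    pls, cb = [], []

    if study.goal == "prediction":
        pls.append("goal: prediction or theory building")
    elif study.goal == "confirmation":
        cb.append("goal: theory testing or confirmation")
    else:
        raise ValueError("goal must be 'prediction' or 'confirmation'")

    if study.has_formative_constructs:
        pls.append("formative constructs in the model")
    if study.needs_global_fit:
        cb.append("global goodness-of-fit criterion required")

    if study.sample_size >= large_sample:
        cb.append("large sample")
    else:
        pls.append("small sample")

    if study.data_normal:
        cb.append("data approximately normal")
    else:
        pls.append("non-normal data")

    if len(pls) > len(cb):
        return "PLS-SEM", pls
    if len(cb) > len(pls):
        return "CB-SEM", cb
    return "either", pls + cb


# Example: a confirmatory study with reflective measures and 400 cases.
approach, reasons = suggest_sem_approach(
    StudyProfile("confirmation", False, 400, True, True)
)
print(approach)
for reason in reasons:
    print(" -", reason)
```

A tie is reported as "either", which matches the point made earlier that the two approaches are complementary rather than one being superior.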

WHEN TO USE SEM?

Although there are two approaches to SEM, this manual concentrates on CB-SEM rather than PLS-SEM. Therefore, CB-SEM is referred to simply as SEM hereafter. SEM is a statistical technique for researchers who follow the positivist, quantitative and deductive approach. The basic aim of such a study is to test a theory-driven hypothesis, that is, to confirm a theory, to test a modified model or to compare competing theoretical models. To confirm a theory means to test the standard theoretical model in a different context in order to prove its applicability, while to test a modified theory is to test an extended model (a modification of the standard model through the addition and/or deletion of constructs). When comparing competing theoretical models, the aim is to choose the best-fitting model by testing several models, whether standard and/or modified. Basically, SEM is used to test a model comprising a minimum of 2 independent variables (IVs) and more than 1 dependent variable (DV), and all the variables must be measured on an interval or ratio scale.

TEST YOUR UNDERSTANDING

Can we use SEM to test the following research objectives?
1. to develop and verify a scale to measure brand economic and social investment
2. to determine the effect of customer commitment on customer engagement
3. to investigate the effect of green attitude, subjective norm, perceived behavioural control and green practice consequences on Malaysian customers' intention to recycle and spread positive word-of-mouth

4. to test the mediating role of trust in the effect of satisfaction on customer commitment
5. to assess the moderating effect of relationship duration in the relationship between customer engagement and its antecedent variables