University Depression Rankings Using Twitter Data
Dept. of CIS - Senior Design

Ashwin Baweja (ashwinb@seas.upenn.edu), Jason Kong (jkon@seas.upenn.edu), Yaou Wang (yaouwang@seas.upenn.edu), Tommy Pan Fang (tommyp@seas.upenn.edu)
Univ. of Pennsylvania, Philadelphia, PA

ABSTRACT

With the rise of social media, university rankings are playing an increasingly influential role in the selection process for prospective university students. Simultaneously, mental health has risen to the forefront of discussions among universities nationwide, in light of calls for increased mental illness awareness. Previous attempts at ranking schools by happiness and mental illness have centered around paper or electronic surveys taken by only a small fraction of the student body. Our work posits a new methodology for constructing college depression rankings through analysis of the language students use on social media platforms. Using a corpus of 78 million Tweets generated from September 2014 to March 2015 and leveraging existing research into depression language analysis, we produce a set of meaningful rankings comparing depression among schools.

1. INTRODUCTION

In this paper, we propose a novel approach to ranking universities along the dimension of depression. Such rankings influence not only the prestige of a university but also the decisions of high school students determining where to spend the next four years of their lives. However, current methodologies for computing depression rankings are neither robust nor scalable. Our approach leverages social media data and existing depression models to produce rankings that provide significant improvements in robustness and scalability.
The resulting rankings provide key insights into correlations between depression at universities and characteristics of those universities, such as the prevalence of a pre-professional culture. Finally, although our rankings are specific to depression, our approach can be generalized and applied to other areas as well. For example, our method could be used to rank universities along dimensions of health. Alternatively, the model could be refined to make the rankings more fine-grained (e.g., at the student-group level) or more coarse-grained (e.g., at the region level).

2. BACKGROUND

Advisors: H. Andrew Schwartz (hansens@seas.upenn.edu) and Chris Callison-Burch (ccb@cis.upenn.edu).

For universities, rankings play an important role in influencing prestige, endowments, administrative decisions, and, perhaps most importantly, the college choices of prospective students in high school. One set of rankings rising in importance is happiness rankings: listings of universities based on the purported happiness of their students. With mental health awareness rising to the forefront of national attention through frequent and high-profile suicides, students and administrators are pushing for more effort in monitoring mental health and depression levels at their respective universities. Accompanying the additional importance placed on college rankings has been an increased number of published rankings by today's media. Joining established and recognized publications such as The U.S. News and World Report [8] and The Princeton Review in creating rankings are up-and-coming viral media websites such as BuzzFeed and The Huffington Post. Rather surprisingly, the methodology used by these publishers to construct such happiness rankings has not kept up with the swell in technology that has led to their greater prominence.
According to writers at The Princeton Review, their methodology for constructing their annual set of rankings consists of distributing an 83-question survey to university students through a physical booth and through other channels [10]. The questions are all multiple choice, with answers on a 1-5 scale, where 1 is "strongly disagree" and 5 is "strongly agree." On average, fewer than four hundred students at each university take the survey, and official surveying is completed only once every three years. The lack of granularity in the data is troubling considering the amount of emphasis placed on the findings. For example, The Princeton Review's recent "Happiest Colleges" ranking appeared in headlines on numerous high-traffic websites such as The Huffington Post, College Atlas, and The University Herald. Upon deeper investigation into the methodology, it was found that the rankings were calculated solely by averaging each university's student answers to the question "How happy are you?" [10]

At the same time, the conclusions of these emotional health ranking reports have great influence on their readers. High school students and families refer to college rankings as an important source of information during the college decision process. College students find rankings a helpful tool to understand outward perception of their university. Administrators look to rankings to evaluate their performance with regard to student mental health and to formulate policy decisions to manage their reputation. Given the importance placed on these rankings, a mismatch exists between the accuracy of current rankings and the decisions made by consulting them.

3. RELATED WORK

There is existing academic research centered on using natural language models to predict depression in the general population. However, only a small subset of these studies focus on college students. Below, we highlight previous work relevant to our study.

As early as 2004, Rude et al. [11] conducted a study of the language used by depressed and depression-prone college students. The paper took a linguistic approach, analyzing the diction used in essays written by depressed college students. This study was the first to establish that there is a significant difference in language use between depressed and non-depressed college students.

In 2006, Stephenson et al. [14] published a study examining predictors of suicidal ideation among college students, primarily contrasting indicators of suicide between male and female college students. Unlike our work, Stephenson's study focuses on suicidal ideation rather than depression. However, many of the indicators of suicide proposed by Stephenson et al. indicate depression as well, making the study relevant to our work.

In later years, several papers applied the same linguistic approach taken by Rude et al. to analyze depression and its symptoms. However, most of these studies lacked a demographic focus, choosing instead to analyze depression patients across the entire population. A paper by Neuman et al. [7] used data from the Internet, as well as the expertise of linguistic scholars, to construct a predictive model that identifies depression from a piece of writing.
The model was able to achieve an 84% classification rate. A more recent study by Howes, Purver, and McCabe [5] used linguistic indicators to track depression patients through an online text-based therapy, finding that linguistic models could predict important measures of depression with a high degree of accuracy.

In recent years, research has applied linguistic models to social media data to predict depression. A study by De Choudhury et al. [3] used Twitter data to predict which patients were depressed before they even received a formal diagnosis. The study used the Twitter data of diagnosed depression patients from the year prior to their diagnosis to test whether a diagnosis of depression could be gauged at the individual level. The model achieved a predictive accuracy of 72%. A recent study by Schwartz et al. [12] refined this predictive approach by predicting a depression score on a continuous scale rather than simply producing a classification of depressed versus not depressed.

Several recent studies have used social media data to analyze depression for only a subset of the population. One study, conducted by Thompson, Poulin, and Bryan [15], used social media posts to predict suicide risk in military personnel and veterans. Another study, conducted by Moreno et al., studied disclosures of depression on Facebook by already-diagnosed college students. However, this study did not focus on building a model of depression for college students; instead, it studied how often students diagnosed with depression display negative emotions on Facebook.

Additionally, this work also considers using PERMA scores, a positive psychology methodology, as an alternative means of validating the depression approach. PERMA is a scheme developed by Dr. Martin Seligman to capture emotional well-being along five dimensions [13].
For each dimension, a corpus of keywords (as well as their relative weights) was developed at the University of Pennsylvania. PERMA scores are generated by aggregating the normalized frequencies of these keywords for a given input text. Specifically, the output has five dimensions, with each dimension having two directions (positive and negative) and a score for each direction in each dimension (see footnote 1). The model for calculating these scores is already established [13].

4. SYSTEM MODEL

Figure 1: Block Diagram of Full Model

Our approach, outlined in Figure 1, leverages language features from Tweets and the World Well-Being Project's (WWBP) existing depression model (see footnote 2) to construct university depression rankings. However, this approach can be generalized to a broader set of applications. In particular, these rankings can be produced over any set of groups, not just universities. Additionally, alternative language models can be substituted for WWBP's model to provide more flexibility in how depression is measured.

Footnote 1: PERMA stands for Positive Emotion, Engagement, Relationships, Meaning, and Accomplishment. Each of these dimensions has a positive direction and a negative direction. Hence, there are a total of 10 scores, one for each direction on each dimension.

Footnote 2: The World Well-Being Project is a collaboration among computer scientists and psychologists at the University of Pennsylvania, aimed at studying psychology through modern machine learning techniques. For more details, please refer to the project's website.

This more generalized approach has two main components: a language model and a data set with user-level social media messages. These two components are combined to produce user-level scores per the language model, and these scores are finally aggregated to produce the final rankings. We now describe in more detail each of these four components/stages, particularly for our attribute of interest, depression.

4.1 Prerequisite: A Depression Scale

The overall model first decides on a numerical scale for depression. This scale is used to label users in the training data set on their levels of depression. It is also used by the depression model to output depression scores for each user.

4.2 Depression Model

The depression rankings produced in this work use WWBP's depression model, which produces depression scores given social media data. However, our approach is not restricted to this model; other existing individual-level depression models can also be substituted for WWBP's. The overall requirements for such a model are rather loose. The model must take as training input a set of users, where each user has some text and a depression score. The model then takes text from a test set of users and produces a depression score for each of those users. This outputted score should be on the same scale as the depression scores in the training data.

4.3 Data Set Construction

Our work aims to leverage social media language to generate depression rankings. That said, the data set need not consist strictly of social media messages: any written text or combination of written texts will do. For example, for each user, one may choose to use a combination of Facebook status updates and college application essays. The only requirement is that the text in the training and ranking data sets (described below) be similar.
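Concretely, the training and ranking data sets just mentioned might be represented as follows. This is only a schematic; the field names are hypothetical and chosen to illustrate the structure, not taken from the authors' implementation.

```python
# Schematic of the two required data sets. Field names are hypothetical.

# Training set: per-user text plus a depression label on the chosen scale.
training_set = [
    {"user_id": 1, "text": "status updates ...", "depression_score": 4.2},
    {"user_id": 2, "text": "more text ...", "depression_score": 1.1},
]

# Ranking set: per-user text plus the group (university) the user belongs to.
ranking_set = [
    {"user_id": 10, "text": "tweets ...", "group": "Univ A"},
    {"user_id": 11, "text": "tweets ...", "group": "Univ B"},
]
```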
As with any statistical model, the quality of the results depends on whether the data set is large enough to produce statistically significant results. However, the size of the data set required to produce meaningful rankings will depend on the depression model chosen in the system implementation. Thus, this system model leaves it to the user to ensure that the data set is sufficiently large for the chosen depression model.

The required data set can be broken down into two parts. First, the model requires a training data set that will be used to train the depression model. Here, we require that the training data consist of enough users to train the depression model; for each user, the data set should contain sufficient text as well as a depression label on the scale described above. Second, we need a ranking data set that will be used to produce the final rankings. For each group to be ranked, this data set should contain a sufficient number of users labeled as belonging to that group; for each user, the data set should contain enough text to produce a depression score for that user.

4.4 User-Level Scores

In the third stage, we compute the depression score for each individual user. This is done by training the chosen depression model on the training data described previously. Then, for each user in the ranking data set described above, that user's text is simply run through the depression model to compute a depression score.

4.5 Depression Ranking Output

In the final stage, we compute rankings for our groups using the user-level depression scores. This simply requires an aggregation function that takes as input the depression score of each user in a group and returns a single depression score for that group.
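As a sketch, this aggregation step might look like the following; the function and variable names are illustrative, not taken from the authors' implementation.

```python
# Sketch of the aggregation stage: collapse per-user depression scores
# into one score per group, then sort groups by that score.
# Names are illustrative, not taken from the authors' implementation.

def aggregate_group_scores(user_scores, aggregate=None):
    """Map each group to a single depression score.

    user_scores: dict of group name -> list of user-level scores.
    aggregate:   reduces a list of scores to one number (default: mean).
    """
    if aggregate is None:
        aggregate = lambda xs: sum(xs) / len(xs)
    return {group: aggregate(xs) for group, xs in user_scores.items()}

def rank_groups(group_scores):
    """Return group names sorted from highest score (most depressed) down."""
    return sorted(group_scores, key=group_scores.get, reverse=True)

scores = {"Univ A": [2.0, 3.0, 2.5], "Univ B": [1.0, 1.5]}
ranking = rank_groups(aggregate_group_scores(scores))  # ["Univ A", "Univ B"]
```

Any reduction over a group's user scores can be swapped in for the default mean via the `aggregate` parameter.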
The simplest such function takes the average of all the depression scores, but a more sophisticated approach that weights users based on the total number of words in their text, or on other qualities of the users, may also be used. The final output is a depression score for each group, and the groups can then be sorted on this score to produce the final rankings.

5. SYSTEM IMPLEMENTATION

Below, we discuss how each stage of the model described above is implemented.

5.1 Depression Model

The method in the baseline depression model developed by WWBP is called differential language analysis. Differential language analysis involves first working with a set of labeled training data, choosing a set of features that best predict the labels (e.g., n-grams, topics), and then fitting a corresponding model (e.g., Naive Bayes, regression, SVM). The resulting weightings can then be applied to a novel data set in three steps: sanitizing and converting the data set into a message table, extracting the desired features from this message table, and performing correlation analysis and visualization. This is our overarching approach to constructing a predictive model based on social media language.

WWBP Library

We base our work on the low-level implementation needed to run and tune the WWBP model. This machine learning library provides an interface through which a range of tasks can be accomplished, including model creation (covering a range of regression and classification models), feature extraction, and data visualization. The library uses a MySQL database to store the necessary data and Python code to run the machine learning tasks. The parameters for each model are stored in memory as local variables in the Python methods, which necessitates creating a .pickle file for each model that needs to be accessed later.
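As an illustration of this caching pattern, fitted parameters can be serialized and reloaded with Python's pickle module. The actual library's file layout and model objects will differ; the names below are hypothetical.

```python
# Illustration of the .pickle caching pattern described above; the real
# library's model objects and file layout will differ.
import os
import pickle
import tempfile

def save_model(model_params, path):
    """Serialize fitted model parameters for reuse in later runs."""
    with open(path, "wb") as f:
        pickle.dump(model_params, f)

def load_model(path):
    """Reload previously saved model parameters from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)

# Hypothetical regression parameters standing in for a fitted model.
params = {"weights": [0.2, -0.1, 0.05], "intercept": 1.3}
path = os.path.join(tempfile.gettempdir(), "depression_model.pickle")
save_model(params, path)
```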
This library provides a convenient interface with a relatively fast run time.

WWBP Depression Model

This work uses the model developed by Schwartz et al. [12] as part of the World Well-Being Project as the baseline model to predict individual levels of depression. The model is trained and tested on data from 28,749 Facebook users who opted into a study in which they completed a personality questionnaire and provided access to their status updates between June 2009 and March. The personality
questionnaire measures levels of depression along seven facets, based on a methodology developed at Cambridge University [12]. The survey averages all seven facets to output a depression score, termed "degree of depression," that ranges from 0 to 12. We use this degree of depression score as the depression metric for this work.

Schwartz et al.'s [12] model uses the following features to output the aggregate depression score:

1. 1- to 3-grams: the relative frequency of n-grams, restricted to those used by at least 5% of all users.
2. Topics: 2,000 topics derived via latent Dirichlet allocation (LDA) on the Facebook data, in addition to 64 Linguistic Inquiry and Word Count (LIWC) categories [9].
3. Number of words: the total number of words a user has posted.

The model first applies principal component analysis to reduce its feature space, and then uses an L2-penalized regression model to predict the depression score. The model provides a more nuanced prediction of depression (producing a score on a scale rather than just a binary output) while still maintaining decent accuracy. It achieved a Pearson R value of 0.39 and a mean squared error of 0.78 on its out-of-sample test set, significantly outperforming the sentiment analysis baseline.

Figures 2 and 3 are visualizations of the data set used to construct the regression model. The word clouds are generated by computing the correlation value of each feature with the labeled depression scores, and then emitting the unigrams and bigrams with the highest absolute values. Interestingly, the words in the data set that correlate most strongly with depression tend to be in the first person (e.g., "I," "myself"), whereas those most negatively correlated with depression tend to be activity-related or to refer to a collective entity (e.g., "our," "team," "game").

Figure 2: Words Most Negatively Correlated with Depression in the WWBP Model

Figure 3: Words Most Positively Correlated with Depression in the WWBP Model

Message and User Table Conversion

Assuming a data set of social media messages, the first step in the model training implementation is to convert the raw messages into a well-formatted table. This requires sanitizing the data for unsupported languages as well as labeling features such as links and re-tweets. Ultimately, the well-formatted table contains the text of messages as well as supporting metadata, such as the user id, the timestamp of the message, and the geographical location of the post, if available. We also create another table that contains user information (e.g., user id, bio) as well as a label for the school to which each user was mapped.

Feature Extraction

To extract features from our messages, we tokenize the text. The tokenizer has been modified to recognize emoticons common in social media text (e.g., <3, :-) ). From the tokenized text, we then create n-grams (sequences of one, two, or three words), which allow greater context than a simple bag-of-words model. We also use lexical and topical features to find language characterizing depression. For lexica, we use LIWC (Linguistic Inquiry and Word Count) lists; each LIWC word list is associated with a semantic or syntactic category, such as engagement or leisure. For topics, we use clusters of lexico-semantically related words derived from latent Dirichlet allocation (LDA) [12].

In addition, we refine our features. We use a pointwise mutual information criterion, which compares the actual rate at which two words occur together to the rate expected by chance; 2-grams and 3-grams not meeting the criterion are discarded. We also restrict our words and phrases to those used by at least 5% of the sample. While longer phrases could be considered, computation becomes increasingly challenging because the number of combinations grows exponentially with n-gram size. Word and phrase counts are normalized by the total number of words written by the user and transformed using the Anscombe transformation to stabilize variance.

Correlation Analysis

After feature extraction, using the training set, we run a correlation analysis between our features and depression scores. We use an ordinary least squares linear regression over standardized variables, producing a linear function and a Pearson R value, as well as a set of weightings for each feature. These weightings can then be applied to the features extracted from the message table, aggregated on a per-user basis, to get a degree of depression score for each user. This degree of depression score is then aggregated at the university level to generate the rankings, as discussed below.

5.2 Data Set Construction

Next, we construct our data set. For each university we wish to rank, our data set contains a set of Tweets posted by Twitter users from that university.

Approach Overview

Unlike Facebook profiles, which track many details about users such as age, university affiliation, and work history, Twitter profiles are very simple. The sign-up process consists of merely entering one's name and email address, and users can later upload a profile picture and write a very short (160-character max) bio about themselves. Without any explicit age or university labels on Twitter accounts, drawing conclusions about which users are from a given university is difficult.

For our approach, we aim to construct a data set with high precision: of the Tweets we find for each school, we want a very high percentage to actually have been posted by Twitter users at that school. To construct the data set, we make use of two observations. First, most colleges across America have a Twitter account, and many students who attend a university and have Twitter accounts follow the college's account. Second, although Twitter doesn't directly store university affiliation for each user, many student users choose to list their university affiliation in their bio.

Approach Details

As per the description of the data set construction approach outlined in the System Model section, we take a four-step approach to constructing our data set.
In the first step, we manually find (through a Google search) the main Twitter account for each university in our data set. While there are many Twitter accounts containing a university's name, most of which are controlled by third parties not affiliated with the university, we track only verified accounts. Verification is done by Twitter and "establish[es] authenticity of key individuals and brands on Twitter" [2]. Because such accounts are verified as actually affiliated with the university, they are more easily found by users in searches and have a larger number of followers.

Second, for each university Twitter account found in the previous step, we use the Twitter API [2] to find all Twitter users with public accounts who follow the university account. The reasoning behind this step follows from the observation made previously that students at universities usually follow their school's Twitter account. Then, for each of these Twitter accounts, again using the Twitter API [2], we pull the account's bio information (if it exists).

Although we now have information for all Twitter users who follow the school's Twitter account, it is very unlikely that all of these users are actually students at the university. In fact, many of these users may be prospective students, fans of the school's sports teams, or faculty at the school. Thus, in the third step, we use the gathered bios to filter the Twitter users down to only those who attend the university. To do so, we use a regular expression that searches for two components in the Twitter bio. First, we look for some affiliation with the university, matching either the full school name (e.g., University of Michigan) or a well-known abbreviation for that school (e.g., umich). All such searches are case insensitive. However, looking for just a university affiliation is not sufficient. Alumni, parents of students, faculty, and even sports fans may list the university name in their profile. Thus, to filter these out, we also look for a typical graduation year for 4-year students at the university (i.e., a year between 2015 and 2018) or the keyword "student." An example of such a Twitter bio would be: "UMich, Class of ..."

Finally, for the Twitter users found in the step above, we pull all Tweets made during the school year through the Twitter API. We classify Tweets made during the school year as those dated after August 31st, 2014.

5.3 Depression Ranking Output

From the first two stages of the model, we are left with a degree of depression score for each user in our data set, the number of words each user has Tweeted (since August 31st, 2014), and a label for the university the user attends. To generate our desired set of comparative rankings for universities, we need a methodology for aggregating user scores to the university level. We choose to aggregate user degree of depression scores using a weighted average based on the number of words Tweeted: users who have Tweeted more words should be given greater weight, as their degree of depression score is less volatile given the amount of data behind it.

When deciding between weighting schemes, we considered logarithmic, square root, and linear scales for the number of words. We select a linear scale because the model is considerably more volatile for users with a small amount of Tweet data, so we want to be conservative when weighting users with little data backing their score. Furthermore, we choose a ceiling of 500 words, at which the weighting stops increasing, because previous research [12] indicates that beyond this point, degree of depression scores are relatively stable, and we did not want to overweight users who Tweet excessively.
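This capped linear weighting can be sketched as follows. This is a simplified illustration; the function names are ours, not the production code.

```python
# Sketch of the capped linear weighting for aggregating user degree of
# depression scores to the university level (cap at 500 words).
# Function names are ours; this is not the authors' production code.

def user_weight(word_count, cap=500):
    """Weight grows linearly with word count and stops increasing at cap."""
    return min(1.0, word_count / cap)

def university_score(users):
    """Weighted average of user scores for one school.

    users: list of (degree_of_depression_score, word_count) tuples.
    """
    total_weight = sum(user_weight(n) for _, n in users)
    weighted_sum = sum(score * user_weight(n) for score, n in users)
    return weighted_sum / total_weight

# A 1000-word user is weighted the same as a 500-word user; a 250-word
# user counts half as much.
sample = [(3.0, 1000), (1.0, 250)]
```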
We decide on this weighting schema, rather than only considering Twitter users who Tweeted above a certain threshold, because we believe that even users who seldom Tweet provide an indication of overall school well-being, and we do not want to arbitrarily filter down our data set. Thus, the formula used for aggregating user scores to the university level is:

DepScore = ( Σ_{user ∈ university} score_user · W_user ) / ( Σ_{user ∈ university} W_user ), where W_user = min(1, count(words_user) / 500)

From here, we rank universities according to this aggregate depression score to generate our results.

6. RESULTS

The primary result is a depression ranking of the top 25 academic universities as chosen by the U.S. News and World Report in 2014 [8]. The exact ranking is shown in Figure 4. The universities have 409 students on average in the data set; Rice University has the fewest students (88) and the University of Southern California the most (827). We note that the California Institute of Technology (Caltech) is removed from the ranking because the data set construction process did not yield enough students (only 26) from Caltech for its score to be meaningful. Duke received the lowest score, making it the least depressed school according to our rankings, while Penn received the highest score, topping our depression rankings.

Figure 4: Depression Ranking for Top 25 Academic Universities

7. ANALYSIS OF RESULTS

Interestingly, we see some trends and surprising results emerging from our rankings. At the top of the rankings are schools that appear to place a greater emphasis on pre-professional development. The University of Pennsylvania, the University of California Los Angeles, Carnegie Mellon, Emory University, Johns Hopkins University, and the University of Virginia share a focus on undergraduate pre-professional programs: all but UCLA have undergraduate business programs, and all but Emory University have undergraduate engineering programs. By comparison, Duke and Stanford, two prestigious schools at the low end of the depression rankings, offer only a school of humanities and a school of engineering, and over 80% of students are enrolled in their respective schools of arts and sciences, per the schools' websites.

Furthermore, schools at the lower end of the rankings tend to have strong athletic programs and a sense of school spirit: Duke University has an almost religious basketball following, Notre Dame excels at basketball and football, and Stanford at football, among other sports. In short, schools with a heavier emphasis on pre-professional development (e.g., University of Pennsylvania, University of California Los Angeles, Johns Hopkins University) tend to have a higher depression score in our ranking, whereas schools with strong athletic programs tend to rank much lower (e.g., Duke University, Stanford University, University of Notre Dame) (see footnote 3).

Additionally, Cornell appears surprisingly low in our rankings (16th overall, 6th among Ivy League schools), as the media often portrays either Cornell or Yale (7th overall, 2nd among Ivy League schools) as the most depressed Ivy League university. We believe that Cornell's lower-than-expected ranking relative to public perception may result from poor publicity relating to the campus. Public perception may be negative due to sensationalized reporting of Cornell suicides, which have occurred by jumping from bridges into the gorges. The Huffington Post supports our finding that Cornell does not have an above-average suicide rate compared to other universities [4].

In addition to the previous ranking, we use the same methodology to generate a set of depression rankings for the largest U.S. universities by student enrollment, per the Department of Education [16]. Additionally, we generate PERMA scores as well as PERMA rankings for the top 25 academic universities as a parallel ranking, in order to validate our depression rankings, as discussed below. Please refer to the appendix for these outputs.

Footnote 3: We note that our Tweets were gathered up until the beginning of March, prior to the start of the 2015 NCAA March Madness Tournament. Thus, Duke's NCAA Men's Basketball Championship win, as a one-time event, did not deflate their depression scores, although their performance during the regular season may have played a factor.

8. EVALUATION OF RESULTS

There currently exists no established set of university depression rankings that is widely accepted in the research community.
As a result, we are unable to provide a benchmark against which to evaluate the results of this work. Consequently, we rely primarily on human evaluation to assess the two main components of our system: the depression model and the data set mapping. Furthermore, we perform a correlation analysis between the depression rankings produced by our model and happiness rankings backed by existing work in psychology.

Depression Model

To evaluate the depression model, we create a web application that displays two Twitter users from our data set and asks testers to identify which user appears more depressed. For each pair of Twitter users, the web application ensures that one user scores high on our depression model (score > 2.7), and thus exhibits traits of depression according to the model, while the other user scores low (score < 2.3). The tester then examines the Tweets of each of the two users and judges which user appears more depressed based on the Tweets displayed. We compare the human judgments against the outputs of the depression model, which selects the user with the higher depression score as more depressed. When the depression model and the human agree on the more depressed user, we have a concordant pair; in the opposite case, we have a discordant pair. From these human-produced evaluations, we use the concordant and discordant pairs to compute a Kendall's tau coefficient for our model using the equation:

    τ = (n_c − n_d) / (n(n − 1) / 2)

where n_c is the number of concordant pairs, n_d is the number of discordant pairs, and n is the total number of pairs in the test set. This statistic is commonly used to measure the association between two measured quantities, and it ranges over −1 ≤ τ ≤ 1. Our model yields a τ coefficient of 0.651, demonstrating a strong positive correlation between human evaluation and our model outputs. Finally, the results can be used to calculate a p-value for our model, using the standard normal approximation with the test statistic:

    z = 6(n_c − n_d) / √(2n(n − 1)(2n + 5))

Validation with PERMA

The PERMA model is provided to us by WWBP, and we run the model against our Twitter data set for the positive-emotion element. After running the data, we calculate the correlations between depression, positive emotion (Pos P), negative emotion (Neg P), and a standardized metric for happiness, which we compute as the Z-score of Pos P in the sample minus the Z-score of Neg P in the sample (Pos P − Neg P Z). Professor Martin Seligman, the father of positive psychology, previously identified in his work the correlation of depression to happiness [12].
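As a concrete sketch of the statistics described above, the following minimal Python (not the authors' code; the pair counts are invented, and n here denotes the number of ranked items, so that n(n − 1)/2 is the number of pairs) computes τ and a two-sided p-value via one standard form of the normal approximation:

```python
import math

def kendall_tau(n_c: int, n_d: int, n: int) -> float:
    """Kendall's tau from concordant/discordant pair counts over
    n ranked items (n * (n - 1) / 2 pairs in total)."""
    return (n_c - n_d) / (n * (n - 1) / 2)

def tau_p_value(n_c: int, n_d: int, n: int) -> float:
    """Two-sided p-value for tau under the standard normal
    approximation to its null distribution."""
    z = 6 * (n_c - n_d) / math.sqrt(2 * n * (n - 1) * (2 * n + 5))
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))  # Phi(|z|)
    return 2 * (1 - phi)

# Invented counts: 37 concordant, 8 discordant over 10 ranked items (45 pairs)
print(round(kendall_tau(37, 8, 10), 3))  # 0.644
```

A τ near 0.65, as reported for the model, corresponds to strong agreement; with enough pairs, the associated two-sided p-value falls well below conventional significance thresholds.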
Our own correlation between depression rankings and standardized happiness is shown in Table 1. Furthermore, our depression rankings show a very low negative correlation with positive emotion and a moderate correlation with negative emotion, a result supported by Seligman's previous work [12]; this lends further confidence to the methodology of this work.

    Correlations   Pos P   Neg P   Pos P − Neg P Z
    Dep Rank
    Pos P Rank
    Neg P Rank

Table 1: PERMA Correlation with Depression Rankings

Validation with Other Metrics

Additionally, we have computed the correlation between university depression scores (as well as the PERMA scores) and some simple, easy-to-find metrics commonly used to rank universities by academic prestige [8] [10]. The values (shown in Table 2) match common intuition. We see that retention rate, defined as the percentage of freshmen who enroll as sophomores at the same university, is negatively correlated with the model's depression score. This is expected, as a higher retention rate indicates that more students return to school after spending a year at the university. Interestingly, the acceptance rate of the university correlates positively with the depression score. This seems to indicate that students at exclusive universities are less depressed. This is supported further by the correlation between depression score and US News ranking, which can serve as a proxy for a university's prestige. In addition, we note that university enrollment is correlated with depression. The average depression score of the top 25 academic universities is lower than the score for the 40 largest schools, which supports the correlations in Table 2.

    Correlations       Dep Score   Pos P Score   Neg P Score
    Tuition and fees
    Total enrollment
    Acceptance rate
    Retention rate
    US News ranking

Table 2: Other Factors' Correlation with Depression Rankings

Data Set Mapping

In constructing our data set, we use the approach and implementation outlined in the previous sections to find Tweets for the 40 largest universities in the United States, by total enrollment as reported by the U.S. Department of Education [16], as well as for the top 25 academic universities as ranked by U.S. News and World Report [8]. We focus our evaluation on the data set for the top 25 academic universities, as this is the selection from which our primary ranking is generated. The data set contains, on average, 409 Twitter users and 145,000 Tweets per school. More detailed statistics about the results (at the school level) are shown in Table 3 below:

    Statistic            min   average   max
    # of Twitter users         409
    # of Tweets                145,000

Table 3: Results of Data Set Construction

Recall

As previously mentioned, our data set construction approach aims for high precision. However, as expected, there is a trade-off between precision and recall. We can roughly measure our recall with the following calculation. Using the U.S. Department of Education's statistics, we find that the top 25 academic schools have, on average, about 17,601 students per school. Additionally, a study by Digiday [6] reports that as of November 2013, approximately 43.7% of college students are on Twitter. Using this, we calculate that for the
25 top academic universities, there are, on average, approximately 7,692 Twitter users per university. Because our data set contains only 409 users per university, we obtain a recall of approximately 5.3%. Although our data set has very low recall and captures only a small fraction of Tweets for each university, it is of sufficient size for our model, which requires a minimum of 10,000 Tweets per school to produce meaningful results [12]. All 25 schools in our ranking have at least 10,000 Tweets, with the average being much higher (over 100,000 Tweets).

Precision

To evaluate the quality of the data set mapping phase, we construct a web application that provides an interface for testers, a selected group of colleagues at Penn, to review a sample of our data and verify the accuracy of the mapping between Twitter bios and universities. The webpage displays the Twitter biography of a randomly chosen user from our data set along with the university to which that user was mapped. Testers then use this information to determine whether the biography identifies the user as a current student at the listed university. Based on this validation, our university data mapping yields an accuracy of 86.9% on a sample of 390 Twitter user bios.

Drawbacks

The ideal data set for a university consists either of all Tweets posted by users at that university or of a random subset of those Tweets. However, because of our method of finding Twitter users at each university, there is a systematic bias in which Tweets are captured by our data set: the bias is towards Twitter users who list their school and graduation year in their Twitter bio and who follow the school's Twitter account. One may argue that those who are more likely to list their school affiliation in their Twitter bio are less likely to be depressed, or one may argue the opposite.
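The recall and precision estimates above reduce to simple arithmetic. The sketch below reproduces the reported figures (an illustration, not the authors' pipeline; the count of 339 correct bios is an assumption inferred from the reported 86.9% on 390 samples):

```python
# Figures taken from the text: ~17,601 students per top-25 school on average,
# ~43.7% of college students on Twitter (Digiday), 409 mapped users per school.
students_per_school = 17601
twitter_share = 0.437
mapped_users = 409

expected_twitter_users = students_per_school * twitter_share  # ~7,692
recall = mapped_users / expected_twitter_users                # ~0.053

# Precision from the human check: assuming 339 of 390 sampled bios were
# judged correct, which matches the reported 86.9%.
precision = 339 / 390

print(round(expected_twitter_users), round(recall, 3), round(precision, 3))
```

The estimate treats the Digiday share as uniform across schools, so it is only a rough upper bound on the reachable user population per university.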
However, regardless of such arguments, we make the underlying assumption that any such biases introduced into our data have an equal effect on the data of all universities in our data set. Therefore, these biases do not impact our results.

9. FUTURE WORK

There are several useful extensions of our work that may be explored further. First, our novel message mapping approach is a useful way to label Twitter profiles with metadata about university affiliations. Using it, we were able to build a data set of Twitter users for each university. No such data set currently exists, so it may be useful to explore further applications of this data set. Additionally, our rankings looked at select groups of schools, such as academically prestigious undergraduate institutions and the largest schools in the United States. For a complete set of rankings, we need to incorporate other universities into our data set. Furthermore, a limiting constraint in our work is the number of users mapped to each university. For most universities, the mapping technique captures enough Twitter profiles to perform the analysis detailed in this work. However, in our sample of the top academic universities, there is one outlier, Caltech, which is mapped to only a few dozen users. This is because Caltech has a very small student body, with an undergraduate enrollment of fewer than 1,000 students in 2012 [1]. To include such outliers in rankings and analysis, more sophisticated methods for university mapping, which improve recall without a significant trade-off in precision, should be developed. The framework developed in our work may also be extended to depression analysis at institutions other than universities. For example, the system may be used as a Human Resources tool to evaluate worker morale based on the language used in e-mails or on enterprise social media platforms such as Yammer.
This would allow such companies not only to increase employee satisfaction but also to improve in areas such as worker retention.

10. ETHICS

Although the depression model is validated with some degree of confidence, it cannot be used as a tool to diagnose individuals with depression. As discussed in prior sections of this work, the amount that an individual writes on social media affects their depression score. Furthermore, the language a single individual uses may not be enough to indicate their mental well-being. The depression model has not been verified medically and cannot supplant the opinion of professional services. Another concern is that the data is collected from a publicly available source, Twitter, and is therefore not anonymized. It is possible to identify a user from the data we have collected, such as their user ID, biography, or Tweets, and the model connects a user with sensitive information about their mental well-being. As a result, our data set must be anonymized, and insights drawn from the data, especially about individuals, must be filtered to avoid defamation and to ensure user confidentiality. On a final note, while the framework developed in this work may be applied to studying depression beyond the university level, there are privacy and security concerns with regard to collecting user data. For example, if a depression model is applied to employee e-mails, there will likely be public concern about using this data to draw conclusions about the mental state of employees.

11. CONCLUSION

In this work, we create a set of university depression rankings using Twitter data. We develop a novel approach for mapping student Twitter accounts to universities to construct a data set of college student Tweets. Then, using differential language analysis and machine learning, we generate individual depression scores and aggregate them to the university level to create our depression rankings.
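The aggregation step described above can be sketched in a few lines of Python. This is a hypothetical illustration with invented school names and user scores, assuming a simple mean as the aggregation; it is not the authors' implementation:

```python
from collections import defaultdict

def rank_schools(user_scores):
    """Average per-user depression scores by school, then rank
    schools from highest (most depressed) to lowest mean score."""
    by_school = defaultdict(list)
    for school, score in user_scores:
        by_school[school].append(score)
    means = {s: sum(v) / len(v) for s, v in by_school.items()}
    return sorted(means, key=means.get, reverse=True)

# Hypothetical (school, user-level depression score) pairs
scores = [("School A", 2.9), ("School A", 2.5),
          ("School B", 2.1), ("School B", 2.3),
          ("School C", 2.8), ("School C", 2.4)]
print(rank_schools(scores))  # ['School A', 'School C', 'School B']
```

Averaging per user rather than per Tweet keeps the student as the unit of analysis, so a few prolific accounts cannot dominate a school's aggregate score.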
Our data set of college student Tweets and our depression model form a useful tool for understanding and ranking depression among students at universities. We find, on average, 409 students for each of the top 25 academic universities, with a mapping accuracy of 86.9%. Furthermore, we develop a model that achieves a Kendall's τ of 0.651 against human evaluation. While there is still room for improvement in our system, we have built a strong foundation for understanding depression across universities and for conducting other rankings and analyses at the university level. Student well-being is a serious and relevant topic on college campuses, and we hope that our model and insights about depression provide value
for and help students, faculty, and administrators.

12. REFERENCES

[1] CalTech Undergraduate Admissions: Facts and Stats.
[2] Twitter.
[3] Munmun De Choudhury, Michael Gamon, Scott Counts, and Eric Horvitz. Predicting depression via social media. In Emre Kiciman, Nicole B. Ellison, Bernie Hogan, Paul Resnick, and Ian Soboroff, editors, ICWSM. The AAAI Press.
[4] Rob Fishman. Cornell suicides: Do Ithaca's gorges invite jumpers? The Huffington Post.
[5] Christine Howes, Matthew Purver, and Rose McCabe. Linguistic indicators of severity and progress in online text-based therapy for depression. In Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, pages 7–16, Baltimore, Maryland, USA, June. Association for Computational Linguistics.
[6] John McDermott. Facebook losing its edge among college-aged adults. Digiday.
[7] Yair Neuman, Yohai Cohen, Dan Assaf, and Gabi Kedma. Proactive screening for depression through metaphorical and automatic text analysis. Artificial Intelligence in Medicine, 56(1):19–25.
[8] US News. US News and World Report's Annual College Rankings. Web. Accessed 19 Oct.
[9] James W. Pennebaker, C. K. Chung, M. Ireland, A. Gonzales, and R. J. Booth. The development and psychometric properties of LIWC. Austin, TX: LIWC.net.
[10] Princeton Review. Surveying Students: How It Works. Web. Accessed 28 Apr.
[11] Stephanie Rude, Eva-Maria Gortner, and James Pennebaker. Language use of depressed and depression-vulnerable college students.
[12] H. Andrew Schwartz, Johannes Eichstaedt, Margaret L. Kern, Gregory Park, Maarten Sap, David Stillwell, Michal Kosinski, and Lyle Ungar. Towards assessing changes in degree of depression through Facebook. In Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, Baltimore, Maryland, USA, June. Association for Computational Linguistics.
[13] Martin E. P. Seligman. Flourish: A Visionary New Understanding of Happiness and Well-being. Atria Books, reprint edition.
[14] Hugh Stephenson, Judith Pena-Shaff, and Priscilla Quirk. Predictors of college student suicidal ideation: Gender differences.
[15] Paul Thompson, Craig Bryan, and Chris Poulin. Predicting military and veteran suicide risk: Cultural aspects. In Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, pages 1–6, Baltimore, Maryland, USA, June. Association for Computational Linguistics.
[16] U.S. Department of Education, National Center for Education Statistics. Selected statistics for degree-granting postsecondary institutions enrolling more than 15,000 students in 2012, by selected institution and student characteristics: Selected years, 1990 through. May.
APPENDIX

A. ADDITIONAL FIGURES

Figure 5: Depression Rankings for Top 25 Academic Schools

Figure 6: Depression Rankings for Top 40 Largest Schools

Figure 7: Depression and PERMA Rankings for Top 25 Academic Schools
More informationDesirability Bias: Do Desires Influence Expectations? It Depends on How You Ask.
University of Iowa Honors Theses University of Iowa Honors Program Spring 2018 Desirability Bias: Do Desires Influence Expectations? It Depends on How You Ask. Mark Biangmano Follow this and additional
More informationHow are Journal Impact, Prestige and Article Influence Related? An Application to Neuroscience*
How are Journal Impact, Prestige and Article Influence Related? An Application to Neuroscience* Chia-Lin Chang Department of Applied Economics and Department of Finance National Chung Hsing University
More informationA Bayesian Network Model of Knowledge-Based Authentication
Association for Information Systems AIS Electronic Library (AISeL) AMCIS 2007 Proceedings Americas Conference on Information Systems (AMCIS) December 2007 A Bayesian Network Model of Knowledge-Based Authentication
More informationRuffalo Noel-Levitz Student Satisfaction Inventory Results: All Students Gallaudet University Spring 2018 Report
Ruffalo Noel-Levitz Student Satisfaction Inventory Results: All Students Gallaudet University Spring 2018 Report Student Success and Academic Quality Office of Institutional Research August 03, 2018 Gallaudet
More informationImpact of Personal Attitudes on Propensity to Use Autonomous Vehicles for Intercity Travel
Impact of Personal Attitudes on Propensity to Use Autonomous Vehicles for Intercity Travel FINAL RESEARCH REPORT Megan Ryerson (PI), Ivan Tereshchenko Contract No. DTRT12GUTG11 DISCLAIMER The contents
More informationChapter 11. Experimental Design: One-Way Independent Samples Design
11-1 Chapter 11. Experimental Design: One-Way Independent Samples Design Advantages and Limitations Comparing Two Groups Comparing t Test to ANOVA Independent Samples t Test Independent Samples ANOVA Comparing
More informationGamma Phi Beta Fraternity/Sorority Annual Evaluation Process Gettysburg College
Gamma Phi Beta Fraternity/Sorority Annual Evaluation Process Gettysburg College 2016 Academic Achievement and Intellectual Engagement Criteria 5 pts 10 pts 15 pts Bonus Points (1-5) Academic Support Plan
More informationNot All Moods are Created Equal! Exploring Human Emotional States in Social Media
Not All Moods are Created Equal! Exploring Human Emotional States in Social Media Munmun De Choudhury Scott Counts Michael Gamon Microsoft Research, Redmond {munmund, counts, mgamon}@microsoft.com [Ekman,
More informationAthletic Identity and Life Roles of Division I and Division III Collegiate Athletes
ATHLETIC IDENTITY AND LIFE ROLES OF DIVISION I AND DIVISION III COLLEGIATE ATHLETES 225 Athletic Identity and Life Roles of Division I and Division III Collegiate Athletes Katie A. Griffith and Kristine
More informationExploiting Ordinality in Predicting Star Reviews
Exploiting Ordinality in Predicting Star Reviews Alim Virani UBC - Computer Science alim.virani@gmail.com Chris Cameron UBC - Computer Science cchris13@cs.ubc.ca Abstract Automatically evaluating the sentiment
More informationThe psychology publication situation in Cyprus
Psychology Science Quarterly, Volume 51, 2009 (Supplement 1), pp. 135-140 The psychology publication situation in Cyprus MARIA KAREKLA 1 Abstract The aim of the present paper was to review the psychological
More informationExploring Normalization Techniques for Human Judgments of Machine Translation Adequacy Collected Using Amazon Mechanical Turk
Exploring Normalization Techniques for Human Judgments of Machine Translation Adequacy Collected Using Amazon Mechanical Turk Michael Denkowski and Alon Lavie Language Technologies Institute School of
More informationIDENTIFYING STRESS BASED ON COMMUNICATIONS IN SOCIAL NETWORKS
IDENTIFYING STRESS BASED ON COMMUNICATIONS IN SOCIAL NETWORKS 1 Manimegalai. C and 2 Prakash Narayanan. C manimegalaic153@gmail.com and cprakashmca@gmail.com 1PG Student and 2 Assistant Professor, Department
More informationReview of Veterinary Epidemiologic Research by Dohoo, Martin, and Stryhn
The Stata Journal (2004) 4, Number 1, pp. 89 92 Review of Veterinary Epidemiologic Research by Dohoo, Martin, and Stryhn Laurent Audigé AO Foundation laurent.audige@aofoundation.org Abstract. The new book
More informationLEAF Marque Assurance Programme
Invisible ISEAL Code It is important that the integrity of the LEAF Marque Standard is upheld therefore the LEAF Marque Standards System has an Assurance Programme to ensure this. This document outlines
More informationTitle: What 'outliers' tell us about missed opportunities for TB control: a cross-sectional study of patients in Mumbai, India
Author's response to reviews Title: What 'outliers' tell us about missed opportunities for TB control: a cross-sectional study of patients in Authors: Anagha Pradhan (anp1002004@yahoo.com) Karina Kielmann
More informationSaville Consulting Wave Professional Styles Handbook
Saville Consulting Wave Professional Styles Handbook PART 4: TECHNICAL Chapter 19: Reliability This manual has been generated electronically. Saville Consulting do not guarantee that it has not been changed
More informationCHAPTER 5: PRODUCING DATA
CHAPTER 5: PRODUCING DATA 5.1: Designing Samples Exploratory data analysis seeks to what data say by using: These conclusions apply only to the we examine. To answer questions about some of individuals
More informationPlease take time to read this document carefully. It forms part of the agreement between you and your counsellor and Insight Counselling.
Informed Consent Please take time to read this document carefully. It forms part of the agreement between you and your counsellor and Insight Counselling. AGREEMENT FOR COUNSELLING SERVICES CONDUCTED BY
More informationSLEEP DISTURBANCE ABOUT SLEEP DISTURBANCE INTRODUCTION TO ASSESSMENT OPTIONS. 6/27/2018 PROMIS Sleep Disturbance Page 1
SLEEP DISTURBANCE A brief guide to the PROMIS Sleep Disturbance instruments: ADULT PROMIS Item Bank v1.0 Sleep Disturbance PROMIS Short Form v1.0 Sleep Disturbance 4a PROMIS Short Form v1.0 Sleep Disturbance
More informationHow Does Analysis of Competing Hypotheses (ACH) Improve Intelligence Analysis?
How Does Analysis of Competing Hypotheses (ACH) Improve Intelligence Analysis? Richards J. Heuer, Jr. Version 1.2, October 16, 2005 This document is from a collection of works by Richards J. Heuer, Jr.
More informationIntroductory: Coding
Introductory: Coding Sandra Jo Wilson Editor, Education Coordinating Group Associate Director, Peabody Research Institute Research Assistant Professor, Dept. of Special Education Vanderbilt University,
More informationExploring the Role of Time Alone in Modern Culture
Article 56 Exploring the Role of Time Alone in Modern Culture Paper based on a program presented at the 2013 American Counseling Association Conference, March 20-24, Cincinnati, OH. William Z. Nance and
More informationDMA will take your dental practice to the next level
DMA will take your dental practice to the next level A membership payment plan created by dentists for dentists and patients Traditionally dentists have only been able to grow their practices by a mix
More informationUptake and outcome of manuscripts in Nature journals by review model and author characteristics
McGillivray and De Ranieri Research Integrity and Peer Review (2018) 3:5 https://doi.org/10.1186/s41073-018-0049-z Research Integrity and Peer Review RESEARCH Open Access Uptake and outcome of manuscripts
More informationPerceived Emotional Aptitude of Clinical Laboratory Sciences Students Compared to Students in Other Healthcare Profession Majors
Perceived Emotional Aptitude of Clinical Laboratory Sciences Students Compared to Students in Other Healthcare Profession Majors AUSTIN ADAMS, KRISTIN MCCABE, CASSANDRA ZUNDEL, TRAVIS PRICE, COREY DAHL
More informationReal-time Summarization Track
Track Jaime Arguello jarguell@email.unc.edu February 6, 2017 Goal: developing systems that can monitor a data stream (e.g., tweets) and push content that is relevant, novel (with respect to previous pushes),
More informationScientific evaluation of Charles Dickens
Scientific evaluation of Charles Dickens Mikhail Simkin Department of Electrical Engineering, University of California, Los Angeles, CA 90095-1594 email: simkin@ee.ucla.edu cell phone: 415-407-6542 About
More informationHow do we identify a good healthcare provider? - Patient Characteristics - Clinical Expertise - Current best research evidence
BSC206: INTRODUCTION TO RESEARCH METHODOLOGY AND EVIDENCE BASED PRACTICE LECTURE 1: INTRODUCTION TO EVIDENCE- BASED MEDICINE List 5 critical thinking skills. - Reasoning - Evaluating - Problem solving
More informationOnline Journal for Weightlifters & Coaches
Online Journal for Weightlifters & Coaches WWW.WL-LOG.COM Genadi Hiskia Dublin, October 7, 2016 WL-LOG.com Professional Online Service for Weightlifters & Coaches Professional Online Service for Weightlifters
More information