Title
From User Usage to Subject Analysis: A Case Study on the Oncogene

Author
Chung-Yen Yu, National Taiwan Normal University, Taiwan
Jiann-Cherng Shieh, National Taiwan Normal University, Taiwan

Objective
By analyzing the electronic medical resources that users downloaded, this study explores the subject distribution of Oncogene articles. We first describe the procedures of URL analysis and retracing used to identify which articles library users downloaded. We can then analyze the number, time span, and subject distribution of the Oncogene-related articles.

Method
A case study method was adopted. The theme and time distribution of the articles downloaded by users were determined by analyzing the electronic resource download records from the library of a medical university.

Dataset: The dataset came from the electronic resource records of a medical university's library, collected from 2009/08/17 to 2012/07/06. The selection conditions were: (1) file type: Application/PDF; (2) URL prefix: http://www.nature.com/onc/. A Ruby program was run on the download log to analyze each URL and determine the digital object identifier (DOI) and title of each article. The DOIs and article titles were then used as keywords for searching PubMed to acquire the articles' Medical Subject Headings (MeSH). Preliminary statistics showed 15,345 downloads; after duplicates were removed, 4,501 articles remained.

Result
This study computed four types of descriptive statistics: (a) monthly number of downloads; (b) statistics of MeSH; (c) distribution of article publication years; and (d) MeSH major topics.

(a) Monthly number of downloads: the monthly average was 426 PDF files. The month with the highest number of downloads was October 2010 (648), followed by November 2010 (607), June 2010 (582), May 2010 (574), March 2011 (571), March 2010 (566), November 2011 (531), and April 2010 (507); August 2011 had the lowest count (271).

(b) Statistics of MeSH: the 4,501 articles carried 68,105 Medical Subject Headings (MeSH), an average of 15 MeSH per article. Further analysis identified 21,942 MeSH major topics, an average of 4.77 MeSH major topics per article.

(c) Distribution of article publication years: the 4,501 articles were published between 1997 and 2012. The year 2010 had the highest number of publications (494), followed by 2012 (396), 2008 (392), 2004 (343), and 2011 (341); 1997 had the lowest count (78).

(d) MeSH major topic: the top ten MeSH major topics were examined, including *Gene Expression Regulation, Neoplastic (250); inhibitors/*metabolism (201); Signal Transduction (176); Apoptosis (158); Cell Transformation, Neoplastic (107); Apoptosis/*drug effects (97); Signal Transduction/*physiology (97); Apoptosis/*physiology (90); and Genes, Tumor Suppressor (82).

Conclusion
To understand how electronic resources are used, this study took Oncogene as an example and analyzed its monthly number of downloads, statistics of subject headings, distribution of article publication years, and MeSH major topics. The results help the library better understand users' requirements and provide topic information for scholars and users.
From User Usage to Subject Analysis: A Case Study on the Oncogene
Chung-Yen Yu (1) and Jiann-Cherng Shieh (2)
(1) Ph.D. Student, Graduate Institute of Library & Information Studies, National Taiwan Normal University, Taipei, Taiwan. Email: jcshieh@ntnu.edu.tw
(2) Professor, Graduate Institute of Library & Information Studies, National Taiwan Normal University, Taipei, Taiwan. Email: emmet.yu@gmail.com
EBLIP 7, Saskatoon, Saskatchewan. July 16, 2013
Outline
Research Motivation
Research Method
Expected Findings
Research Findings
Conclusion
Research Motivation
According to one university's record of journal article downloads, Oncogene had the second largest number of downloads among all journals published by Nature Publishing Group.
In the SJR (SCImago Journal Rank) list for the area of Cancer Research, Oncogene was ranked 12th.
Which subject headings in Oncogene receive the most attention from faculty and students?
Research Motivation
This study explains how to carry out data cleaning, data assessment, and pre-assessment when assessing the original data from users' log files.
It uses the cancer literature that users downloaded from the journal Oncogene, hosted on the Nature database, as an example to analyze the number of authors, the subject headings, and the publication years of the articles.
Research Method
Data collection period: 2009/08/17 to 2012/07/06
Log filter: (1) file type: Application/PDF; (2) URL prefix: http://www.nature.com/onc/
A total of 4,501 articles remained after duplicate downloads were removed.
The DOI and title of each article were used as keywords to search the PubMed database.
The MeSH terms retrieved were used for further analysis.
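The filtering and de-duplication steps on this slide can be sketched as follows. The authors report using a Ruby program, which is not shown; the Python below is a swapped-in illustration only. The record layout (timestamp, MIME type, URL) and the sample URLs are assumptions made for the example, not the library's actual log format.

```python
from collections import Counter

URL_PREFIX = "http://www.nature.com/onc/"
PDF_TYPE = "application/pdf"

# Hypothetical log records: (timestamp, MIME type, URL).
# The paths after the prefix are invented placeholders.
SAMPLE_LOG = [
    ("2010-10-01 09:12:00", "Application/PDF", URL_PREFIX + "example/article-a.pdf"),
    ("2010-10-03 14:30:00", "Application/PDF", URL_PREFIX + "example/article-a.pdf"),
    ("2010-11-02 08:00:00", "Application/PDF", URL_PREFIX + "example/article-b.pdf"),
    ("2010-11-02 08:01:00", "text/html", URL_PREFIX + "example/article-b.html"),
    ("2010-11-05 10:00:00", "Application/PDF", "http://www.nature.com/nature/example/article-c.pdf"),
]


def filter_downloads(records):
    """Keep only PDF downloads whose URL starts with the Oncogene prefix."""
    return [
        record
        for record in records
        if record[1].lower() == PDF_TYPE and record[2].startswith(URL_PREFIX)
    ]


def unique_articles(downloads):
    """Drop repeated downloads of the same URL, keeping first-seen order."""
    return list(dict.fromkeys(url for _, _, url in downloads))


def monthly_counts(downloads):
    """Count downloads per calendar month, keyed as YYYY-MM."""
    return Counter(timestamp[:7] for timestamp, _, _ in downloads)


downloads = filter_downloads(SAMPLE_LOG)
articles = unique_articles(downloads)
per_month = monthly_counts(downloads)

print(len(downloads), "downloads,", len(articles), "unique articles")
print(dict(per_month))
```

In the study itself, the same two-stage reduction took 15,345 downloads down to 4,501 unique articles; the sketch de-duplicates on the raw URL, whereas the authors resolved each URL to a DOI and title first.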
Expected Findings
Understand the needs of library users.
Provide subject heading information for faculty and students.
Reduce unnecessary database purchases so that the budget can be allocated to databases covering the more popular subject headings (e.g. the most downloaded subjects).
Research Findings
[Figure: Publication year distribution of the Oncogene articles users downloaded. X-axis: publication year (1997 to 2013); y-axis: article count (0 to 500).]
Research Findings
[Figure: Distribution of the number of authors per downloaded Oncogene article. X-axis: number of authors (1 to 29); y-axis: article count (0 to 600).]
Research Findings
[Figure: Distribution of the number of MeSH subject headings per downloaded Oncogene article. X-axis: number of MeSH headings in bins of ten (0-9 to 60-69); y-axis: article count (0 to 3000).]
Research Findings
[Figure: Number of major headings per article. X-axis: number of major headings (1 to 15); y-axis: number of documents (0 to 1000).]
Research Findings
Major heading statistics: top ten major headings

Major Heading                            Articles   Major heading only (no subheading)
Apoptosis                                533        159
Neoplasms                                462        0
DNA-Binding Proteins                     448        59
Breast Neoplasms                         424        0
Transcription Factors                    403        27
Signal Transduction                      389        184
Tumor Suppressor Protein p53             377        2
Gene Expression Regulation, Neoplastic   372        260
Proto-Oncogene Proteins                  302        41
Cell Transformation, Neoplastic          276        107
Research Findings
Major heading statistics of downloaded documents: major heading only, not including subheadings

Major Heading                             Articles
*Gene Expression Regulation, Neoplastic   260
*Signal Transduction                      184
*Apoptosis                                159
*Cell Transformation, Neoplastic          107
*Genes, Tumor Suppressor                  84
*Mutation                                 83
*Transcription, Genetic                   75
*Gene Expression Regulation               72
*DNA Methylation                          67
*DNA Damage                               67
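The difference between counting a heading as a major topic in any form and counting it only when it appears as a major heading without a subheading can be illustrated with a small sketch. It assumes MEDLINE-style MeSH strings in which an asterisk marks the major topic, either on the heading itself (`*Apoptosis`) or on a subheading (`Apoptosis/*drug effects`). The sample articles are hypothetical, and this is one plausible reading of the two counts, not the authors' program.

```python
from collections import Counter


def parse_mesh(heading):
    """Split a MEDLINE-style MeSH string into heading and major-topic flags."""
    parts = heading.split("/")
    starred_heading = parts[0].startswith("*")
    starred_subheading = any(part.startswith("*") for part in parts[1:])
    return {
        "descriptor": parts[0].lstrip("*"),
        # Major topic in any form: asterisk on the heading or on a subheading.
        "is_major": starred_heading or starred_subheading,
        # Major heading only: asterisk on the heading and no subheading at all.
        "major_without_subheading": starred_heading and len(parts) == 1,
    }


def count_major_topics(articles):
    """Count, per heading, the articles where it is a major topic."""
    any_major = Counter()
    heading_only = Counter()
    for headings in articles.values():
        parsed = [parse_mesh(h) for h in headings]
        # Sets ensure each article is counted at most once per heading.
        any_major.update({p["descriptor"] for p in parsed if p["is_major"]})
        heading_only.update(
            {p["descriptor"] for p in parsed if p["major_without_subheading"]}
        )
    return any_major, heading_only


# Hypothetical articles with invented MeSH assignments.
SAMPLE_ARTICLES = {
    "article-1": ["*Apoptosis", "Signal Transduction/*physiology", "Humans"],
    "article-2": ["Apoptosis/*drug effects", "*Gene Expression Regulation, Neoplastic"],
    "article-3": ["Breast Neoplasms/genetics/*metabolism", "*Apoptosis"],
}

any_major, heading_only = count_major_topics(SAMPLE_ARTICLES)
print(any_major.most_common())
print(heading_only.most_common())
```

Under this reading, a heading such as Breast Neoplasms can be a major topic in many articles and still score zero in the heading-only count, because it is always indexed with a subheading.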
Research Findings
Major heading statistics for the Oncogene journal, by top-level MeSH category
Chemicals and Drugs had the highest count (27,306 articles).
Diseases had the second highest (11,458 articles).
Phenomena and Processes had the third highest (8,895 articles).
Anatomy had the fourth highest (1,799 articles).
Named Groups had the lowest (1 article).
Research Findings
MeSH major heading statistics: Chemicals and Drugs (Category D)
Amino Acids, Peptides, and Proteins (D12) had the highest count among the 16 subcategories under Chemicals and Drugs (18,775 articles).
Enzymes and Coenzymes (D08) was next (3,672 articles).
Biological Factors (D23) was third (1,797 articles).
Research Findings
MeSH major heading statistics: Amino Acids, Peptides, and Proteins (D12)

MeSH Major Heading   Tree No.   Articles
Amino Acids          D12.125    117
Peptides             D12.644    2,637
Proteins             D12.776    16,021
Research Findings
MeSH major heading statistics: Proteins (D12.776), subcategories with more than 1,000 articles

MeSH Major Heading                             Tree No.      Articles
Neoplasm Proteins                              D12.776.624   2,203
DNA-Binding Proteins                           D12.776.260   2,081
Intracellular Signaling Peptides and Proteins  D12.776.476   1,733
Transcription Factors                          D12.776.930   1,708
Membrane Proteins                              D12.776.543   1,615
Nuclear Proteins                               D12.776.660   1,188
Intercellular Signaling Peptides and Proteins  D12.776.467   1,023
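The level-by-level breakdown above (category D, then D12, then D12.776, then its subcategories) follows from the dotted structure of MeSH tree numbers. A minimal sketch of that roll-up is shown below. The sample list is hypothetical, and it assumes one tree number per heading occurrence, although a MeSH heading can sit at several places in the tree; how the authors handled that case is not stated.

```python
from collections import Counter


def tree_levels(tree_number):
    """Return the category letter followed by every prefix of a tree number."""
    parts = tree_number.split(".")
    prefixes = [".".join(parts[:i]) for i in range(1, len(parts) + 1)]
    return [tree_number[0]] + prefixes


def roll_up(tree_numbers):
    """Count occurrences at every level of the MeSH tree."""
    counts = Counter()
    for tree_number in tree_numbers:
        counts.update(tree_levels(tree_number))
    return counts


# Hypothetical major-heading occurrences, one tree number each.
SAMPLE_TREE_NUMBERS = [
    "D12.776.624",  # Neoplasm Proteins
    "D12.776.260",  # DNA-Binding Proteins
    "D12.776.260",
    "D12.125",      # Amino Acids
    "D08",          # Enzymes and Coenzymes
]

level_counts = roll_up(SAMPLE_TREE_NUMBERS)
print(tree_levels("D12.776.624"))  # ['D', 'D12', 'D12.776', 'D12.776.624']
print(level_counts["D"], level_counts["D12"], level_counts["D12.776"])  # 5 4 3
```

Because each occurrence is counted at every ancestor level, the totals at a higher level always equal or exceed the sum shown for any one of its subcategories, which matches the pattern in the slides.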
Conclusion
The research gives librarians more information about users' usage, drawing on evidence of electronic resource use as a reference for evaluating and subscribing to electronic resources.
The research also gives users more choices by offering relevant subject headings derived from the topic analysis of the literature.
THANK YOU FOR YOUR ATTENTION!