Poor multi-rater reliability in TCM pattern diagnoses and variation in the use of symptoms to obtain a diagnosis


Original paper

Oddveig Birkeflet (1), Petter Laake (2), Nina K Vøllestad (1)

1 Institute of Health and Society, University of Oslo, Oslo, Norway
2 Department of Biostatistics, Institute of Basic Medical Sciences, University of Oslo, Oslo, Norway

Correspondence to Oddveig Birkeflet, Institute of Health and Society, University of Oslo, N-0318 Oslo, Norway; oddveig.birkeflet@medisin.uio.no

Received 12 October 2013
Accepted 10 April 2014

To cite: Birkeflet O, Laake P, Vøllestad NK. Acupunct Med Published Online First: [please include Day Month Year] doi:10.1136/acupmed-2013-010473

Birkeflet O, et al. Acupunct Med 2014;0:1-8. doi:10.1136/acupmed-2013-010473
Copyright 2014 by British Medical Journal Publishing Group.

ABSTRACT

Background Pattern differentiation and diagnosis are fundamental principles of Traditional Chinese Medicine (TCM). Studies have shown low inter-rater reliability in TCM pattern diagnoses. This variability may originate from both the identification and the interpretation of symptoms and signs.

Objective To examine the inter-rater reliability in TCM pattern diagnoses made in the style of Maciocia for 25 case histories by eight acupuncturists and to explore the impact of demographic factors on the diagnostic conclusion. Further, the association between the diagnosis and the presence of symptoms was examined for a single TCM diagnosis.

Methods Eight acupuncturists independently diagnosed 25 women (15 fertile, 10 infertile) based on written case histories. Descriptive statistics, logistic regression and inter-rater reliability (κ) were used.

Results Poor inter-rater reliability on TCM patterns (κ<0.20) and large variation in the number of TCM pattern diagnoses were found. Sex, duration of practice and education had a highly significant effect (p<0.001) on the use of TCM patterns and working hours had a significant effect (p=0.029). There was considerable intra- and inter-rater variation in the use of symptoms to make a diagnosis. Symptoms occurring frequently as well as infrequently were inconsistently used to diagnose Liver Qi Stagnation. The study was limited by a small sample size.

Conclusions The results showed extensive variation and poor inter-rater reliability in TCM diagnoses. Demographic variables influenced the frequency of diagnoses and symptoms were used inconsistently to set a diagnosis. The variability shown could impede individually tailored treatment.

INTRODUCTION

Pattern differentiation is essential in Traditional Chinese Medicine (TCM) in the process of making a diagnosis from symptoms and signs [1]. According to TCM theory presented by Maciocia, diseases are reflected in symptoms and signs that may be interpreted as imbalanced internal organs [2]. The signs and symptoms are used to establish TCM pattern diagnoses and to form the basis of individual treatment, which is the selection of acupuncture points tailored to each patient's diagnosis [2-4]. Based on these principles, one might expect that a given set of symptoms and signs results in the same TCM diagnosis across acupuncturists, that is, that the diagnosis has high reliability. However, previous studies have shown low inter-rater reliability and large variability in TCM pattern diagnoses [5-10]. It is unknown whether this low reliability is due to variability in identification of symptoms and signs or variability in the interpretation of them with regard to diagnosis.

Two previous studies have examined the agreement between acupuncturists in the recognition of symptoms and signs in the examination of patients. Hua et al found low agreement between two TCM practitioners in their identification of symptoms by visual inspection and palpation [11]. O'Brien et al compared three TCM practitioners, and the level of agreement on the inspection, auscultation and palpation variables ranged from slight agreement to almost perfect agreement [12]. Hence, the low reliability in making a TCM diagnosis might be due to a low agreement in identifying symptoms and signs, but this does not preclude variability in the interpretation of symptoms and signs. It is important also to examine the variability in the interpretation of the diagnostic process to provide a comprehensive foundation for securing precision and reliability. By presenting a set of symptoms and signs to a group of TCM practitioners, one may be able to examine how they use and interpret the information to set TCM diagnoses.

The diagnostic process might also be influenced by personal and demographic factors. It has been proposed that the individual patient's symptoms and signs play a lesser role in TCM diagnosis [6], and that the final diagnosis depends on the acupuncturists' subjective interpretation, their education, clinical experience and which TCM books are consulted as the foundation for pattern differentiation [13]. Unreliable TCM pattern diagnosis does not ensure optimal treatment tailored to the individual patient and provides a weak basis for TCM research. Evaluation of the TCM pattern identification process is therefore essential [10].

The objective of this study was to examine the inter-rater reliability in TCM pattern diagnoses made by eight acupuncturists from 25 case histories. We also explored the impact of the demographic background of the acupuncturists on the diagnostic results. Finally, we examined how the acupuncturists used symptoms and signs to diagnose one common TCM pattern, namely Liver Qi Stagnation.

MATERIAL AND METHODS

Study design

In this cross-sectional study, 25 case histories were selected from 54 Norwegian women (24 fertile and 30 infertile) included in a previous study on TCM acupuncture diagnostics [5]. None had sought acupuncture for health problems apart from infertility. The women had given written informed consent to distribute an anonymous description of their symptoms and signs to other acupuncturists for the purpose of the present study.
The 54 women had all been through a TCM consultation by a senior acupuncturist (first author of the paper), who was educated at the Norwegian Acupuncture College, which offers a Bachelor's degree in TCM acupuncture, and who holds the Advanced Course at Nanjing College of TCM. Data recording was based on the Four Diagnostic Methods of inquiry as described by Maciocia, namely case history taking, palpation, observation and auscultation [14]. An operationalised structured interview guide based on Maciocia's symptoms and signs in gynaecology [14] ensured that all the participants were asked identical questions. Supplementary information was collected according to individual symptoms.

We randomly sampled 25 women, 15 fertile and 10 infertile, to construct the case histories. All the symptoms and signs described by the women during the consultation were presented in the case histories (an example is shown in figure 1). The quality of the pulse and the tongue was described as perceived by the senior acupuncturist and reported without using any TCM organ names. The qualities of the pulses were specified according to Maciocia's statements. The pulse qualities were described as felt in each of the three pulse depths (superficial, middle and deep) and in the three positions (front, middle and rear) [15]. Photographs of the tongues were not used because the pictures appeared with different colours on different computers. The tongues were therefore described according to Maciocia's aspects of tongue diagnosis: vitality of colour/tongue spirit, body colour, body shape, tongue coating and tongue moisture [16]. These data were included in the case histories distributed to the acupuncturists to be diagnosed according to their knowledge of TCM criteria.

Participating acupuncturists

Members of the Norwegian Acupuncture Association (NAA) were invited to participate. Membership of NAA requires an equivalent to a Bachelor's degree in TCM acupuncture, including supervised practice.
All eight acupuncturists were educated at the same school, which operates in collaboration with the Nanjing College of TCM. Invitations including information about the study were sent to 16 acupuncturists selected to represent a variation in geographical location, additional education and age. The invitation was followed up by telephone. Eight acupuncturists (labelled Acu1 to Acu8 for this report) volunteered to participate and received the case histories and instructions electronically. The acupuncturists consisted of three men and five women of mean age 50 years (range 33 to 59 years). Two were physiotherapists, one a registered nurse and five had 1 year of basic medical courses (BM). The average experience was 12 years (range 4 to 20 years). The average working hours was 24 h per week (range 15 to 40 h). The acupuncturists who declined to participate were seven women and one man. Two were physiotherapists, one nurse, one ergonomist, one homeopath and four were BM.

The acupuncturists were informed that exactly the same questions were given to all the women and that only the symptoms actually experienced were reported. If a problem with, for instance, the stomach was not mentioned, it meant that the woman did not have such a problem. Some women were unable to specify symptoms and signs, such as the character of pain. In these cases their non-specific descriptions were given. If the acupuncturists found the case history description obscure, they were free to ask the researchers. Each of the 25 case histories was distributed as a separate Word document, with space for the acupuncturists to type the TCM diagnoses. There was no instruction regarding the number of TCM patterns to be diagnosed. The acupuncturists were asked to identify all symptoms and signs that were used to substantiate each TCM pattern. The acupuncturists were blinded to any clinical or personal information about the cases apart from that specified in the case reports.

Figure 1 Example of a case history with the presentation of symptoms and signs as presented to the acupuncturists. Bold type marks the headings for the symptoms. The pulse qualities are described as they were felt in three depths and in the three positions (front, middle and rear).

Analysis of symptoms and signs used to diagnose a single TCM pattern

The most common TCM pattern, Liver Qi Stagnation, was chosen as an analytical example to explore how the acupuncturists used symptoms and signs to make a diagnosis. For each case, the acupuncturists were asked to identify all symptoms and signs that substantiated their diagnosis. We counted the use of each symptom and sign and present the data as frequencies and as percentages of their occurrence in the cases.

The presence of symptoms and signs varied from occurring in all 25 cases to occurring in only one case. It might be hypothesised that the differences in frequency influenced how the practitioners used them as the foundation for a diagnosis. Common symptoms might be too non-specific and might thus be ignored, whereas less common symptoms might provide more specific information. Hence, the use of symptoms was examined according to how frequently they occurred. Three groups of symptoms were analysed: frequent (in 13 to 25 cases), less frequent (in 6 to 12 cases) and rare symptoms (in 1 to 5 cases). The four most common symptoms in each group were used in the analyses (table 4).

Statistics

To examine the variation in frequencies of TCM diagnoses among acupuncturists and with respect to demographic covariates, binary logistic regression was used.
The number of analytical categories was reduced by merging the single patterns into the corresponding excess and deficiency patterns as shown in figure 2. The frequencies of the merged patterns were used for reliability analysis. Only the seven merged TCM patterns that the acupuncturists used on average more than five times were used for the analysis. To examine the inter-rater reliability, that is, the level of agreement between the acupuncturists beyond that expected by chance, the κ statistic for multiple raters was used. κ values <0.20 were considered as poor agreement, 0.21 to 0.40 as fair, 0.41 to 0.60 as moderate and 0.61 to 0.80 as good. Values of 0.81 to 1.00 were regarded as very good agreement [17]. The calculations were performed in SPSS V.18.0 for Windows. Variation in the use of symptoms and signs to diagnose a TCM pattern was examined by descriptive analysis.
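As an illustration of what a multi-rater κ measures, the following is a minimal sketch of Fleiss' κ, one common chance-corrected agreement statistic for more than two raters. It is an assumption that this matches the variant computed in SPSS for the paper; the function names and the toy ratings are hypothetical, and the band boundaries follow the cut-offs quoted above (treating exactly 0.20 as poor is also an assumption).

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list with one label per rater.

    Every case must be rated by the same number of raters.
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for case in ratings for label in case})

    # counts[i][j] = number of raters who assigned category j to case i
    counts = []
    for case in ratings:
        tally = Counter(case)
        counts.append([tally[c] for c in categories])

    # Observed agreement: mean over cases of the share of agreeing rater pairs
    per_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_case) / n_cases

    # Chance agreement from the overall proportion of each category
    proportions = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(len(categories))
    ]
    p_chance = sum(p * p for p in proportions)

    if p_chance == 1.0:
        return float("nan")  # undefined when only one category is ever used
    return (p_observed - p_chance) / (1.0 - p_chance)


def agreement_band(kappa):
    """Map a kappa value to the verbal bands quoted in the Statistics section."""
    if kappa <= 0.20:  # assumption: the boundary value itself counts as poor
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Toy data (hypothetical): 1 = pattern diagnosed, 0 = not diagnosed,
# four cases rated by eight raters in complete agreement.
perfect = [[1] * 8, [0] * 8, [1] * 8, [0] * 8]
print(fleiss_kappa(perfect), agreement_band(fleiss_kappa(perfect)))
```

For a merged pattern, each case history would contribute one row of eight present/absent labels, one per acupuncturist.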

RESULTS

All acupuncturists diagnosed all case histories. Acu2 reported that the lack of opportunity to interview the patient was a limitation. Lack of the biomedical diagnosis to facilitate the TCM diagnosis was reported as a limitation by Acu7 and Acu4. All reported Maciocia as a main source.

The acupuncturists diagnosed a total of 114 different single TCM patterns. Some of the patterns were only used once while others were used on almost all cases by all acupuncturists. After grouping the single patterns into the merged patterns, there was still a wide variation between the acupuncturists (table 1). The total number of TCM pattern diagnoses used by each of the acupuncturists varied from 63 to 203. Acu7 diagnosed two to three times as many patterns as the other acupuncturists. There was also a wide variation between the acupuncturists with respect to the use of each individual merged pattern. Five patterns varied in use from 0 to >10: Heart Deficiency, Blood Excess, Blood Deficiency, Lung Deficiency and Stomach Deficiency. One acupuncturist diagnosed Blood Deficiency and Stomach Deficiency on 11 cases whereas three acupuncturists did not use these patterns at all.

Logistic regression analysis (table 2) showed that sex, duration of practice and education had a highly significant effect (p<0.001) on the use of merged patterns (ie, the total numbers of patterns summarised in table 1). The number of working hours was also significant (p=0.029). Age had no significant impact on the frequency of diagnoses, but the acupuncturists diagnosed fewer TCM patterns if they were women, had long practical experience or had longer working hours. An additional 10 years of practice implies a 65% reduction in the odds of diagnosing TCM patterns, whereas 10 more hours per week implies a 16% reduction in the odds. Acupuncturists with a background in nursing or physiotherapy were more likely to diagnose TCM patterns than acupuncturists with BM education.
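The 10-year and 10-hour figures follow from the per-unit odds ratios in table 2: in a logistic model, the odds ratio for a change of k units is the per-unit odds ratio raised to the power k. The sketch below shows that arithmetic; the helper names are hypothetical and the inputs are the rounded odds ratios as printed, so the results are approximate.

```python
def scaled_odds_ratio(or_per_unit, units):
    """Odds ratio for a change of `units` in a covariate with per-unit OR."""
    return or_per_unit ** units


def percent_reduction_in_odds(odds_ratio):
    """Percentage reduction in the odds implied by an odds ratio below 1."""
    return (1.0 - odds_ratio) * 100.0


# Per-unit odds ratios as reported in table 2
OR_PRACTICE_PER_YEAR = 0.900
OR_WORKING_HOURS_PER_HOUR = 0.982

ten_years = scaled_odds_ratio(OR_PRACTICE_PER_YEAR, 10)
ten_hours = scaled_odds_ratio(OR_WORKING_HOURS_PER_HOUR, 10)

# Roughly 65% for practice and between 16% and 17% for working hours,
# given that the inputs are rounded to three decimals.
print(round(percent_reduction_in_odds(ten_years), 1))
print(round(percent_reduction_in_odds(ten_hours), 1))
```

The same scaling applies to the confidence limits, since they are also on the multiplicative odds scale.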
Agreement on diagnosis

Six merged TCM patterns in table 1 were used by all the acupuncturists: Liver Excess, Liver Deficiency, Spleen Deficiency, Kidney Deficiency, Damp Excess and Heat Excess. Except for Liver Deficiency and Heat Excess, these patterns appeared at least twice as often as the other patterns. They were on average used in more than half of the cases. The inter-rater reliability test was performed on the seven merged TCM patterns that were diagnosed on average more than five times (table 3). There was poor agreement between the acupuncturists with regard to the use of all these merged patterns (κ<0.20). To explore factors that could contribute to the poor agreement, we selected a frequently used TCM pattern diagnosis and examined all the symptoms used to diagnose Liver Qi Stagnation.

Symptoms and signs used to diagnose Liver Qi Stagnation

Liver Qi Stagnation was frequently used by five acupuncturists, on 21 to 24 cases each (table 4). The remaining three acupuncturists used the pattern on 11 to 16 cases each.

Figure 2 The single patterns merged into the respective Excess and Deficiency pattern. The seven merged Traditional Chinese Medicine patterns included in the analyses are shown in the grey boxes.

Table 1 The 17 merged Traditional Chinese Medicine (TCM) pattern diagnoses set on 25 case histories by eight acupuncturists (Acu1 to Acu8): frequencies, mean and SD

                            Acu1  Acu2  Acu3  Acu4  Acu5  Acu6  Acu7  Acu8   Sum   Mean    SD
Merged TCM pattern diagnoses used by all acupuncturists
Liver Excess                  23    25    17    19    25    21    24    23   177   22.1   2.9
Spleen Deficiency             21    23    12     8    21    17    25    23   150   18.9   6.0
Damp Excess                   16    22     9    15    22    11    24    18   137   17.1   5.4
Kidney Deficiency             15    15     7     6    14    19    22    15   113   14.1   5.4
Liver Deficiency               4    15     7     6     6    12     9     3    62    7.8   4.1
Heat Excess                    3     1     2     5     7     5    15     2    40    5.0   4.5
Merged TCM pattern diagnoses not used by all acupuncturists
Heart Deficiency               7    11     4     0     7     1    19     6    55    6.9   6.0
Blood Excess                   7     1     0     1     3    17    17     7    53    6.9   6.9
Merged TCM pattern diagnoses used on average less than 5 times
Blood Deficiency               6     0     2     0     0     5    11     6    30    3.8   4.0
Lung Deficiency                2     1     0     1     1     5    10     2    22    2.8   3.3
Stomach Deficiency             1     2     0     0     0     4    11     1    19    2.4   3.7
Cold Excess                    2     2     2     1     0     2     3     1    13    1.6   0.9
Cold Empty                     0     0     0     0     1     0     3     0     4    0.5   1.1
Stomach Excess                 1     1     1     1     0     0     5     2    11    1.4   1.6
Empty Heat                     1     3     3     0     2     2     3     2    16    2.0   1.1
Qi Deficiency                  1     0     3     0     0     1     2     0     7    0.9   1.1
Heart Excess                   1     2     2     0     0     0     0     0     5    0.6   0.9
Average                                                                     53.8    6.7   3.5
Total number of patterns     111   124    71    63   109   122   203   111        114.3
Average                      6.5   7.3   4.2   3.7   6.4   7.2  11.9   6.5          6.7
SD                           7.5   9.1   4.8   5.7   8.6   7.4   8.6   8.0          7.5

All cases were diagnosed as Liver Qi Stagnation by at least one acupuncturist. Four of the 25 cases were diagnosed by all eight acupuncturists while one case was diagnosed by only one acupuncturist. For the remaining cases there was a variation in the number of acupuncturists who gave the diagnosis (range 4 to 7). Altogether, 179 different symptoms and signs, including description of the pulse and tongue, were used to describe the 25 case histories. Overall, the acupuncturists used 147 of these symptoms to diagnose Liver Qi Stagnation. For individual cases the number of symptoms ranged from 32 to 106.
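The Sum, Mean and SD columns of table 1 are plain row summaries of the eight per-acupuncturist counts. As a hedged sketch, the snippet below recomputes them for two rows; the helper name is hypothetical, and the use of the sample standard deviation (n-1 denominator) is an assumption, chosen because it reproduces the printed values for these two rows.

```python
from statistics import mean, stdev


def summarise(counts):
    """Return (sum, mean, sample SD) for one row, rounded to one decimal."""
    return sum(counts), round(mean(counts), 1), round(stdev(counts), 1)


# Per-acupuncturist counts (Acu1 to Acu8) copied from table 1
liver_excess = [23, 25, 17, 19, 25, 21, 24, 23]
heat_excess = [3, 1, 2, 5, 7, 5, 15, 2]

print(summarise(liver_excess))
print(summarise(heat_excess))
```

With only eight raters, a single outlying acupuncturist such as Acu7 in the Heat Excess row moves the SD considerably, which is worth keeping in mind when reading the SD column.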
There was a large variation in how the acupuncturists used symptoms and signs to diagnose Liver Qi Stagnation (table 4). In general, the acupuncturists used a symptom or sign to make the diagnosis only in some of the cases, even though it was present in other cases. Some acupuncturists never used a symptom to diagnose whereas others used it on all cases presenting with the symptom. For instance, Acu1 never used red edge on tongue whereas Acu7 used it on all cases. Wiry pulse was reported in all cases and was used to diagnose Liver Qi Stagnation in 32% to 84% of the cases (table 4). Even larger variability among the acupuncturists was seen for the other symptoms. Acu7 used the most symptoms, ranging from 3 to 37 symptoms on a single case with an average of 15 symptoms per case. Acu4 used the fewest symptoms, ranging from 1 to 10 symptoms with an average of three symptoms per case. There seemed to be a rather consistent variation between the acupuncturists with regard to the number of symptoms used to make a diagnosis. Some acupuncturists used few symptoms whereas others seemed to use a larger number of symptoms. The symptoms occurring less frequently were used to diagnose Liver Qi Stagnation from none to all cases with the symptom. A similar variability was observed for the rare symptoms. Hence, there seemed to be a wide variation among the acupuncturists in the use of symptoms for setting the diagnosis, irrespective of how common the symptom is.

Table 2 Logistic regression analysis of the total number of merged Traditional Chinese Medicine pattern diagnoses versus demographic variables

Variable                                                       Estimate     SE  p Value     OR  95% CI for OR
Sex (0=male, 1=female)                                           -0.781  0.142   <0.001  0.458  0.347 to 0.608
Age (in years)                                                    0.002  0.009    0.871  1.002  0.984 to 1.020
Practice (in years)                                              -0.105  0.017   <0.001  0.900  0.870 to 0.931
Working hours (hours/week)                                       -0.018  0.008    0.029  0.982  0.966 to 0.998
Education (0=Basic Medical course, 1=physiotherapist, nurse)      1.934  0.291   <0.001  6.915  3.911 to 12.225

Education: physiotherapists and nurses were merged to constitute acupuncturists with additional professional training.

Table 3 Multiple κ and CI for the most used merged Traditional Chinese Medicine (TCM) patterns

TCM merged pattern        κ  CI
Liver Excess          0.014  0.000 to 0.036
Liver Deficiency      0.118  0.056 to 0.179
Spleen Deficiency     0.029  0.001 to 0.056
Heart Deficiency      0.179  0.095 to 0.263
Kidney Deficiency     0.043  0.001 to 0.076
Damp Excess           0.042  0.011 to 0.071
Blood Excess          0.144  0.065 to 0.222

DISCUSSION

Although acupuncturists with the same TCM background evaluated exactly the same information, there was extensive variation and poor inter-rater agreement on the merged TCM pattern diagnoses as well as for single patterns. Disagreement can occur in several stages of the diagnostic process. Variability in the data collection phase was eliminated, and the poor inter-rater reliability occurred in the understanding and interpretation of the symptoms and signs. The acupuncturists interpreted the symptoms and signs differently and turned the information into different diagnoses.

There was a large variation in how symptoms and signs were used to set a diagnosis. Wiry pulse existed in all cases and is, according to Maciocia, a clinical manifestation of Liver Qi Stagnation [2], yet it was used to diagnose only 69% of the cases, indicating that symptoms are used inconsistently in setting a diagnosis. The study examined only how single symptoms were used to set a diagnosis and not how diagnoses were reinforced or rejected in the presence of other symptoms.
Nevertheless, it is expected that acupuncturists should agree on the principles, consider the same symptoms and from that conclude with the same TCM patterns. The variation in the use of symptoms and signs to make a diagnosis reflects an unpredictable diagnostic process, both between and within individual acupuncturists. Scheid found that personal interpretations of the textual sources were of importance. Practitioners used the same terms from the canonical literature but applied them differently [1]. With a large influence of personal interpretation, the diagnostic process may be individual for each acupuncturist. In addition, there was considerable variability in how each acupuncturist used the symptoms and signs, suggesting that additional factors other than inherent personal differences also play a role. Hence, there seems to be a need for the development of guidelines to achieve a more reliable diagnostic process across acupuncturists.

Table 4 Symptoms used to diagnose Liver Qi Stagnation distributed into three groups: frequent, moderate and rare symptoms (four symptoms in each group)

                                     n  Acu1  Acu2  Acu3  Acu4  Acu5  Acu6  Acu7  Acu8
Liver Qi Stagnation
# of cases diagnosed                      22    24    16    14    11    21    24    22
Frequent symptoms (13 to 25 cases)
Wiry Pulse                          25    80    72    80    56    52    32    84    84
Premenstrual irritation             22    82    41    14     0    73    73     0    91
General Menstruation Pain           19    37    11    21    11     5    32    32    32
Premenstrual Headache               13    54    46     8     8    69    62     8    77
Less frequent symptoms (6 to 12 cases)
Red Edge on Tongue                  12     0    33    50    33    33    42   100    25
Moderate Menstrual Pain             12    58     8     0     0     8    17    50    75
Premenstrual breast pain            11    73    27    18     0    82    91     0    82
Lumps in Menstruation blood         10     0    20    20    30    10    60    80    70
Rare symptoms (1 to 5 cases)
Pain first day of menstruation       5    80    20     0     0    20    40    60    40
Stress                               5     0    80    60    40     0    60    60    80
Stool altering consistence           5    20    40    40    40    40    60    80    80
Irregular Menstruation               4    25    25    25    25    25     0     0    50

n is the number of cases in which the symptom occurred. The use of the symptoms is expressed as a percentage according to how often they occurred.

The results showed that the diagnostic process was influenced by demographic factors. Acupuncturists with long clinical experience diagnosed fewer TCM patterns. Liu claims that experienced acupuncturists have more in-depth understanding of TCM theory and better communicating skills that will ensure a correct diagnosis [18]. Although the communicating skill was eliminated in the present study, we found variation in diagnoses. The acupuncturists did not agree as to how symptoms should be interpreted into TCM patterns. Our sample size was too small to examine whether different educational backgrounds affected the interpretation of symptoms and signs.

The use of case histories without the possibility for the acupuncturist to gain additional information restricts the interpretation of the present study. However, it provides data on a wide range of interpretations of given sets of symptoms, and supports our previous finding that two acupuncturists examining the same patient at the same time ended up with a substantial variation in TCM diagnoses. The variability reported in the present study is a challenge for the application of the TCM principles [5-10]. Some attempts to improve the agreement on TCM patterns, such as guided training, have been tested [19]. However, the effects have been meagre, with slight to substantial improvement on common diagnoses and a lack of improvement on less common diagnoses [19]. This indicates that there is still a need for further development of the TCM diagnostic process.
CONCLUSIONS
There is extensive variation and poor multi-rater reliability in TCM diagnosis among acupuncturists. Demographic variables of acupuncturists influenced the frequency of diagnosis. Symptoms were used differently and inconsistently to set diagnoses. This variability in the diagnostic process is a threat to the aim of reliable diagnosis as a basis for individually tailored treatment.

Summary points
Traditional Chinese Medicine diagnosis involves identifying symptoms and signs, then interpreting them.
We asked eight acupuncturists to interpret 25 case histories.
Their diagnoses were varied and inter-rater reliability was poor.
There was wide variability in how clinical information was used to arrive at the diagnosis.

Acknowledgements The authors acknowledge the participating acupuncturists from the Norwegian Acupuncture Association and the women for providing study data.
Contributors All of the authors have contributed to the conception and design of the study, analysis and interpretation of data, drafting the article or revising it critically for important intellectual content and final approval of the version to be published.
Competing interests None.
Patient consent Obtained.
Ethics approval The Norwegian regional committee for medical and health research ethics, REC South East, Oslo, approved this study.
Provenance and peer review Not commissioned; externally peer reviewed.

REFERENCES
1 Scheid V. Chinese medicine in contemporary China, plurality and synthesis. Durham & London: Duke University Press, 2002.
2 Maciocia G. The foundations of Chinese medicine. A comprehensive text for acupuncturists and herbalists. London: Churchill Livingstone, 1989.
3 MacPherson H, Sherman K, Hammerschlag R, et al. The clinical evaluation of traditional East Asian systems of medicine. Clin Acupunct Oriental Med 2002;3:16–19.
4 Maciocia G. The practice of Chinese medicine. The treatment of diseases with acupuncture and Chinese herbs. London: Churchill Livingstone, 1994.
5 Birkeflet O, Laake P, Vøllestad N. Low inter-rater reliability in traditional Chinese medicine for female infertility. Acupunct Med 2011;1–7.
6 Sherman KJ, Cherkin DC, Hogeboom CJ. The diagnosis and treatment of patients with chronic low-back pain by traditional Chinese medical acupuncturists. J Altern Complement Med 2001;7:641–50.
7 Coeytaux RR, Chen W, Lindemuth CE, et al. Variability in the diagnosis and point selection for persons with frequent headache by traditional Chinese medicine acupuncturists. J Altern Complement Med 2006;12:863–72.
8 Hogeboom CJ, Sherman KJ, Cherkin DC. Variation in diagnosis and treatment of chronic low back pain by traditional Chinese medicine acupuncturists. Complement Ther Med 2001;9:154–66.
9 Zhang G, Lee W, Bausell B, et al. Variability in the Traditional Chinese Medicine (TCM) diagnoses and herbal prescriptions provided by three TCM practitioners for 40 patients with rheumatoid arthritis. J Altern Complement Med 2005;11:415–21.
10 O'Brien KA, Abbas E, Zhang J, et al. An investigation into the reliability of Chinese medicine diagnosis according to eight guiding principles and Zang-Fu theory in Australians with hypercholesterolaemia. J Altern Complement Med 2009;15:259–66.
11 Hua B, Abbas, Hayes A, et al. Reliability of Chinese medicine diagnostic variables in the examination of patients with osteoarthritis of the knee. J Altern Complement Med 2012;18:1–10.
12 O'Brien KA, Abbas E, Zhang J, et al. Understanding the reliability of diagnostic variables in a Chinese medicine examination. J Altern Complement Med 2009;15:727–34.
13 Bonzak S, Blitstein R, Fisk A, et al. Examination of cyclic vomiting syndrome in children using traditional Chinese medical pattern differentiation. Med Acupunct 2007;19:85–94.
14 Maciocia G. Obstetrics and gynecology in Chinese medicine. London: Churchill Livingstone, 1998.
15 Maciocia G. Diagnosis in Chinese medicine: a comprehensive guide. London: Churchill Livingstone, 2004.
16 Maciocia G. Tongue diagnosis in Chinese medicine. Seattle: Eastland Press, 1987.
17 Altman DG. Practical statistics for medical research. London: Chapman & Hall/CRC, 1999.
18 Liu T. Role of acupuncturists in acupuncture treatment. eCAM 2007;4:3–6.
19 Mist S, Ritenbaugh C, Aickin M. Effects of questionnaire-based diagnosis and training on inter-rater reliability among practitioners of traditional Chinese medicine. J Altern Complement Med 2009;15:703–9.

Birkeflet O, et al. Acupunct Med 2014;0:1–8. doi:10.1136/acupmed-2013-010473. First published 8 May 2014.