Detection of First Heart Sound Using Sequence Alignment Algorithm

The Sixth PSU Engineering Conference, 8-9 May 2008

P. Bangcharoensap (1), S. Kamolphiwong (2), T. Kamolphiwong (2), M. Karnjanadecha (2), S. Sea-Wong (2) and S. Cheewatanakornkul (3)
Centre for Network Research (CNR), Department of Computer Engineering, Faculty of Engineering, Prince of Songkla University, Thailand
E-mail: phiradet@gmail.com, ksinchai@coe.psu.ac.th, kthossaporn@coe.psu.ac.th, montri@fivedots.coe.psu.ac.th, suthon@coe.psu.ac.th, sirichai.chee@gmail.com

Abstract- Heart sound auscultation is a helpful, non-invasive diagnostic tool for detecting heart dysfunction. Segmentation of the First Heart Sound (S1) into its major sound components is the first step in automated cardiac diagnosis. However, detecting S1 requires proficiency and considerable technical experience. A new method for detecting the S1 component of the heart sound without an ECG reference is proposed. The algorithm is based on Sequence Alignment, which is often used in biology, for example to find relationships between primary sequences of DNA, RNA, or protein. The experiment results show that the proposed algorithm achieves an accuracy of 94% (an Overall Error Rate of 6%), which is better than other algorithms. Furthermore, this research can be applied to assist new physicians in heart sound auscultation. It can help compensate for the shortage of experienced doctors in rural areas, and it can help medical students learn heart sound auscultation.

Keywords: First Heart Sound (S1), Sequence Alignment, Heart Sound Segmentation

I. INTRODUCTION
Many heart dysfunctions can be effectively diagnosed using auscultation techniques. Heart sound auscultation is one of the most reliable, inexpensive and non-invasive tools because it provides useful information about the integrity and function of the heart valves and the hemodynamics of the heart. The phonocardiogram (PCG) is the recording of the heart sounds and murmurs [2]. It is a multicomponent signal comprising the fundamental heart sound components (S1 and S2) and other components such as the Opening Snap and Ejection Click.

Therefore, segmentation of the first heart sound into its associated cardiac cycle is a primary step prior to the analysis of heart sounds for diagnostic purposes. This is because the S1 sound marks the start of the cardiac cycle, as shown in Figure 1. Segmentation of the First Heart Sound (S1) usually uses an electrocardiogram (ECG) signal and/or the carotid pulse as a reference, but this is expensive and non-portable. Once S1 is detected, diagnostic features may subsequently be extracted for each type of sound.

However, S1 sound detection is one of the major problems in heart sound analysis [3]. A previous attempt at a segmentation algorithm resulted in a 93% success rate [1]. The purpose of this study is to develop an algorithm for detecting the first heart sound component using the Sequence Alignment Algorithm.

Figure 1. The section of heart sound signal

(1) Buranarumluk school, Trang, Thailand
(2) Department of Computer Engineering, Faculty of Engineering, Prince of Songkla University, Thailand
(3) Faculty of Medicine, Prince of Songkla University, Thailand
This project was supported by the National Electronics and Computer Technology Center (NECTEC), National Science and Technology Development Agency (NSTDA), Ministry of Science, Thailand, under the Young Scientist Competition 2008 (YSC) and the 10th Junior Science Talent Project (JSTP #10).
II. SEQUENCE ALIGNMENT
In bioinformatics, a Sequence Alignment is a way of arranging the primary sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. Aligned sequences of nucleotide or amino acid residues are typically represented as rows within a matrix. Gaps are inserted between the residues so that residues with identical or similar characters are aligned in successive columns. Sequence alignment can also be used for non-biological sequences, such as identifying similarities in a series of letters and words in human language. In this paper, we are interested in applying the sequence alignment technique to heart sound signals. The objective of this research is therefore to study how sequence alignment can be applied to analyze the heart sound signal.

Many sequence alignment methods exist; this paper considers two main ones, the Needleman-Wunsch algorithm [8] and the Smith-Waterman algorithm [7].

The Needleman-Wunsch algorithm performs a global alignment. Global alignments, which attempt to align every residue in every sequence, are most useful when the sequences in the query set are similar and of roughly equal size. The algorithm is commonly used in bioinformatics to align protein or nucleotide sequences. It was proposed in 1970 by Saul Needleman and Christian Wunsch [8].

The Smith-Waterman algorithm performs a local alignment. Local alignments are more useful for dissimilar sequences that are suspected to contain regions of similarity or similar sequence motifs within their larger sequence context, that is, for determining similar regions between two nucleotide or protein sequences. Instead of looking at the total sequence, the Smith-Waterman algorithm compares segments of all possible lengths and optimizes the similarity measure. The algorithm was first proposed in 1981 by Temple Smith and Michael Waterman [7].

III. METHODOLOGY
Pre-processing
All signals used in the experiment are normalized according to Equation (1).

Proposed S1 Detection Algorithm
The scheme for deciding whether a signal is S1 or not is given by the following pseudo-code.

    FreqA = FFT(A)
    for i ← 1 to N do {
        FreqT = FFT(Template_i)
        dis = SeqAlig(FreqA, FreqT)
        if dis < Threshold then {
            score ← score + 1
        }
    }
    if score [comparison not legible in the source] then
        A is S1 Signal
    else
        A is not S1 Signal

- A: the suspect signal.
- N: the number of templates.
- FFT(x): the Fast Fourier Transform (FFT), which converts a time-domain signal to a frequency-domain signal.
- SeqAlig(A,B): the function that calculates the Minimum Distance Mapping of signal A and signal B.

In our work, we study two algorithms for calculating the Minimum Distance Mapping of two signals, as follows.

1. S-W algorithm: This algorithm is based on the Smith-Waterman algorithm. It calculates the distance that maximizes the local match between the two sequences. It can be computed by dynamic programming as follows.
   [Recurrence not legible in the source; surviving term: $A_m - B_n$]
   m = length of A, n = length of B

2. N-W algorithm: This algorithm is based on the Needleman-Wunsch algorithm. It calculates the distance that maximizes the global match between the two sequences. It can be computed by dynamic programming as follows.
   [Recurrence not legible in the source; surviving term: $A_m - B_n$]
   the distance = $S(m,n)$; m = length of A, n = length of B

Note: the purpose of these algorithms is to match data set A ($A_i$ represents an element of set A) against data set B ($B_j$ represents an element of set B).

IV. PERFORMANCE MEASUREMENT
To evaluate the effectiveness of first heart sound detection, three commonly used error rates are defined as follows:

1. False Accept Rate (FAR): the percentage of other heart sound components (such as S2) that are detected as S1 although they are not S1. It is calculated according to Equation (2).
2. False Reject Rate (FRR): the percentage of S1 signals that are missed.
3. Overall Error Rate (OER): the intersection of the FAR and FRR curves (ROC), as shown in Figure 2.

Figure 2. Overall Error Rate (OER)

V. EXPERIMENT RESULTS
The heart sounds were received from Anthony Ricke, M.D., GE Healthcare, WI, USA, and were collected at GE Healthcare from nine different human subjects using a CardioLab system and a Dash family patient monitor. The CardioLab system sampled data from ECG leads I, II and III and an electronic stethoscope signal at 977 samples per second [1].

According to the graph in Figure 4, the Overall Error Rate of SW is 26% (when N = 100 and Threshold = 0.009).

Figure 3. The FAR and FRR of SW algorithm

According to the graph in Figure 5, the Overall Error Rate of NW is 6% (when N = 100 and Threshold = 1.9).

Figure 4. The FAR and FRR of SW algorithm

VI. DISCUSSION
According to the experiment results, the NW algorithm is clearly better than the SW algorithm.

Figure 5. The section of S1 (template) and S1 (input).

Figure 6. The section of S1 (template) and S2 (input).

Figure 6 shows the frequency content of S1 and S1, and Figure 7 shows the frequency content of S1 and S2. Considered over the whole signal, S1 (input) and S1 (template) are quite similar, while S1 and S2 are quite different. Within short regions, however, S1 (template) and S1 (input) differ in places. On the other hand, there are a few short regions in which S2 happens to resemble S1, as shown in Figure 6. Considering the data as a whole, as the NW algorithm does, is therefore better than considering short regions, as the SW algorithm does. That is why NW performs better than SW. A concept similar to the NW algorithm has been applied in speech recognition under the name Dynamic Time Warping.

However, further study should be conducted. First of all, a combination of the NW and SW algorithms using logical operations (AND, OR) needs to be developed. In addition, the relationship between the number of templates and the error rate may be studied further.

VII. CONCLUSION
Novel first heart sound detection approaches based on Sequence Alignment have been proposed in this paper. The prime aim of this research is to develop algorithms that can separate the first heart sound component (S1) from the other heart sound components. The minimum overall error rate, or best case, was 6%, obtained with the NW algorithm at a threshold of 1.9 and 100 templates. This minimum error rate is a promising result, indicating that the algorithm could be implemented as a complement to the actual system.

VIII. ACKNOWLEDGMENT
I would like to thank Associate Professor Dr. Sinchai Kamolphiwong and Associate Professor Dr. Montri Karnjanadecha, Department of Computer Engineering, Faculty of Engineering, Prince of Songkla University, who continuously gave me advice. I would also like to thank Associate Professor Dr. Thossaporn Kamolphiwong, Suthon Sea-Wong and Sirichai Cheewatanakornkul, M.D., for their valuable support. I wish to thank the National Electronics and Computer Technology Center (NECTEC), National Science and Technology Development Agency (NSTDA), for their support under the Young Scientist Competition (YSC) and the Junior Science Talent Project (JSTP). I wish to thank my family for their unending encouragement. Last, I would like to thank everyone who was involved in this project.

IX. REFERENCES
[1] A. D. Ricke, R. J. Povinelli, M. T. Johnson, Automatic Segmentation of Heart Sound Signals Using Hidden Markov Models, Computers in Cardiology 2005, Sept. 25-28, 2005, pp. 953-956.
[2] P. Wang, Y. Kim, L. H. Ling, C. B. Soh, First Heart Sound Detection for Phonocardiogram Segmentation, Proceedings of the 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference, Shanghai, China, September 1-4, 2005.
[3] D. Kumar, P. Carvalho, M. Antunes, J. Henriques, L. Eugenio, R. Schmidt, J. Habetha, Detection of S1 and S2 Heart Sound by High Frequency Signatures, Cardiothoracic Surgery Centre, University Hospital of Coimbra, Portugal.
[4] T. F. Smith, M. S. Waterman, Identification of Common Molecular Subsequences, J. Mol. Biol. 147, pp. 195-197, 1981.
[5] S. Needleman, C. Wunsch, A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins, J. Mol. Biol. 48(3), pp. 443-453, 1970.
[6] M. J. Barrett, A. Saxena, K. A. Thomas, Rapid Rise in Cardiac Auscultation Skill After a Single 90 Minute Intervention: A Quality Improvement Study, Temple University School of Medicine, 2007.
[7] T. F. Smith, M. S. Waterman, Identification of Common Molecular Subsequences, J. Mol. Biol. 147, pp. 195-197, 1981.
[8] S. Needleman, C. Wunsch, A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins, J. Mol. Biol. 48(3), pp. 443-453, 1970.
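
The detection scheme of Section III and the error rates of Section IV can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the peak normalization, the local cost |a_i - b_j|, the gap penalty `gap`, the vote threshold `min_votes`, and the FAR/FRR denominators are all hypothetical choices made for the example, because the corresponding equations are not legible in the text. The alignment shown is a global (Needleman-Wunsch style) alignment written in cost-minimizing form.

```python
import numpy as np


def spectrum(signal):
    """Peak-normalize a signal (assumed normalization) and return its FFT magnitude."""
    x = np.asarray(signal, dtype=float)
    peak = np.max(np.abs(x)) if x.size else 0.0
    if peak > 0:
        x = x / peak
    return np.abs(np.fft.rfft(x))


def nw_distance(a, b, gap=1.0):
    """Global alignment cost between two numeric sequences.

    S[i, j] is the minimum cost of aligning a[:i] with b[:j]. The local
    cost |a[i-1] - b[j-1]| and the gap penalty are illustrative assumptions.
    """
    m, n = len(a), len(b)
    S = np.zeros((m + 1, n + 1))
    S[:, 0] = gap * np.arange(m + 1)  # a[:i] aligned against nothing
    S[0, :] = gap * np.arange(n + 1)  # b[:j] aligned against nothing
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            S[i, j] = min(
                S[i - 1, j - 1] + abs(a[i - 1] - b[j - 1]),  # align a_i with b_j
                S[i - 1, j] + gap,  # a_i aligned to a gap
                S[i, j - 1] + gap,  # b_j aligned to a gap
            )
    return float(S[m, n])


def is_s1(candidate, templates, threshold, min_votes):
    """Count templates closer than `threshold`; accept if enough of them vote."""
    freq_a = spectrum(candidate)
    score = 0
    for template in templates:
        if nw_distance(freq_a, spectrum(template)) < threshold:
            score += 1
    return score >= min_votes  # assumed decision rule


def far_frr(decisions, labels):
    """Return (FAR, FRR) in percent; True means 'S1' in both arrays.

    Assumed denominators: FAR over all non-S1 inputs, FRR over all S1 inputs.
    """
    decisions = np.asarray(decisions, dtype=bool)
    labels = np.asarray(labels, dtype=bool)
    far = 100.0 * np.sum(decisions & ~labels) / np.sum(~labels)
    frr = 100.0 * np.sum(~decisions & labels) / np.sum(labels)
    return float(far), float(frr)
```

Sweeping `threshold` over a range of values and evaluating `far_frr` at each value gives the two curves whose crossing point corresponds to the Overall Error Rate described in Section IV.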