Shape-based Retrieval of Heart Sounds for Disease Similarity Detection Tanveer Syeda-Mahmood, Fei Wang


IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120. {stf,wangfei}@almaden.ibm.com

Abstract. Retrieval of similar heart sounds from a sound database has applications in physician training, diagnostic screening, and decision support. In this paper, we exploit a visual rendering of heart sounds and model the morphological variations of audio envelopes through a constrained non-rigid translation transform. Similar heart sounds are then retrieved by recovering the corresponding alignment transform using a variant of shape-based dynamic time warping. Results of similar heart sound retrieval are demonstrated for various diseases on a large database of heart sounds.

Keywords: Sound pattern analysis, audio retrieval, curve analysis, healthcare application.

1 Introduction

The field of pattern recognition is becoming increasingly applicable to new modalities. In this paper we explore the application of computer vision techniques to an important class of audio signals, namely, heart sounds. Heart auscultation, i.e., listening to the sounds produced by the heart, is a common practice in the screening of heart disease. Although different diseases produce characteristic sounds, forming a diagnosis based on sounds heard through a stethoscope is a skill that takes years to perfect. Numerous studies have shown that as many as 87% of patient referrals to cardiologists for evaluation are the result of false alarms [2]. Thus software diagnostic tools that aid physicians in their diagnosis of heart sounds are needed.

Auditory discrimination of heart sounds is inherently difficult, as these sounds are faint and lie at the lower end of the audible frequency range [1]. Although there are tools to visually render the sounds [3], we recently observed that the visual representation of heart sounds actually brings out the differentiating characteristics of various diseases more readily than the audio signal. Figure 1 illustrates this by recording the visual appearance of the audio signal within a single heart beat duration, from patients with various diseases. As can be seen, different diseases show characteristically different shape patterns. Further, it is also easier to spot the similarity in the sounds across patients with similar diseases through their visual representations. Figure 2 illustrates this for different patients diagnosed with the same disease. From these examples, it appears plausible that disease similarity can be inferred by developing a measure that captures the visual similarity of audio signals.

In this paper we address the similarity retrieval of heart sounds using spatial representations of heart sounds. Specifically, we extract perceptual envelopes of audio shapes that capture the essential low frequency disease-specific information. We then model the morphological variations in the spatial envelopes from the same disease class as a constrained non-rigid translation transform. Matching involves recovering the corresponding alignment transform using a new shape-based dynamic time warping algorithm. Results of heart sound retrieval are reported for various diseases on a large database of heart sounds. Our experiments demonstrate that our method outperforms state-of-the-art heart sound signal analysis methods when they are adapted for use in similar heart sound retrieval.

The paper makes several important contributions. First, it addresses, for the first time, the problem of similarity retrieval of heart sounds to aid physician decision support (a new use of content-based retrieval methods). Second, it demonstrates the advantages of visual analysis of heart sounds over conventional audio processing methods for disease similarity detection. Third, it proposes the use of envelope curves of heart sounds as an adequate representation to discriminate between diseases. Finally, the audio envelope representation used here is invariant to a number of factors, including heart rate differences between young patients and adults, the number of heart beat samples, noise in recordings, etc.

The rest of the paper describes this approach in detail. In Section 2 we introduce the domain of heart sounds and the need for modeling non-rigid time deformations for detecting visual similarity of audio signals. In Section 3, we discuss related work. In Sections 4-5, we develop a general method of modeling the shape changes in audio signals corresponding to similar sounds and present a method of audio shape matching. In Section 6, we describe the various pre-processing steps for enabling the shape matching.
Finally, in Section 7 we present results demonstrating the effectiveness of shape similarity measures for discovering audio similarity among heart sounds.

2 The heart sound matching problem

Heart sounds are produced during the heart cycle, where events such as the motion of the valves and the movement of the blood generate vibrations [1]. In healthy adults, the first heart sound (S1, "lub") and the second heart sound (S2, "dub") are produced by the closure of the AV valves and the semi-lunar valves, as shown in Figure 3. In addition to these normal sounds, a variety of other sounds may be present, including S3, which may be heard at the beginning of diastole during the rapid filling of the ventricles; S4, which may be heard in late diastole during atrial contraction; regurgitation sounds, i.e., murmurs, which are noise-like sounds heard between the two major heart sounds during systole or diastole; and adventitious sounds, or clicks.

Matching heart sounds is difficult as they are highly non-stationary signals of short duration. Further, the energy content of low-frequency vibrations is much higher than that of high-frequency vibrations. Thus much of the discriminating heart sound information is captured in their low frequency envelopes, which must be extracted. Next, the variations in heart beat must be taken into account. This requires extraction of single heart beats from signals which are not always periodic (particularly for cardiac arrhythmias). Further, amplitude variations due to recording levels and varying digital stethoscope qualities affect the matching. Finally, due to variability in the systolic and diastolic phases across patients, both advancement and postponement of cardiac audio phases (sounds S1 and S2) are possible.

Figure 4 illustrates the matching problem for heart sounds using two patients both diagnosed with Atrial Septal Defect (ASD). Figures 4a and 4b show the original sounds. Due to sampling rate differences, a single heart beat corresponds to 6172 samples in Figure 4a, while there are 34542 samples within a single heart beat for the sound in Figure 4b. The two signals nevertheless sound very similar. Direct comparison of the two signals hardly reveals any similarity, as can be seen through direct superposition in Figure 4c. Once the signals are normalized in both time and amplitude, by scaling using the sampling rate and to the range of 0 to 1, the similarity can be seen under a shift as shown in Figure 4d. Using cross-correlation, we can recover the shift to roughly align them as shown in Figure 4e. From this figure, it is clear that although a single fixed shift can bring them into rough alignment, matching peaks require different amounts of shift in different regions. Furthermore, the shift can be in opposite directions in different phases, corresponding to cases where the systolic and diastolic phases are affected differently (due to the disease). Figure 4f illustrates this non-uniformity in the time shift (marked by arrows in the figure) for the signals from Figure 4d, which are already normalized in time. Thus any measure of similarity in sounds for the same disease class needs to model non-uniformity in the scaling of time.

3 Related Work

While we are not aware of work on similarity retrieval of heart sounds, there is considerable work in audio signal analysis and retrieval in general [10,16,12,15,11], and heart sound signal analysis [6,7,4], in particular.
Further, there is also work on computer-aided diagnosis of heart sounds [19,20], although no commercial systems exist in practice. Classical methods of audio analysis divide the audio into short-time segments and extract spatio-temporal features such as the zero-crossing rate (ZCR) [10], average energy in short-time segments, hidden Markov model features [14], etc. Methods based on frequency-domain information of the audio signal often form the speech spectrogram and extract features such as Mel-frequency cepstral coefficients (MFCC) [12], subband spectral flux, harmonicity prominence, spectral roll-off [15], beat frequency estimation [11], multi-resolution wavelets, etc. Often, the spectral representations are generated from a moving overlapping window analysis of the time domain signal.

The short-time signal analysis methods are not suitable for modeling heart sounds due to (a) their highly non-stationary nature, (b) variability in heart rate, (c) disease-specific variations in the relative placement of the S1, S2, S3, and S4 phases, and (d) the selective importance that needs to be given to the lower frequencies of the spectrum. Further, the choice of window size is not clear for heart sounds. Often, the segments picked are end points of the S1 or S2 phases. However, for diseased cases, there is smearing of the S1 and S2 phases (e.g., mitral regurgitation). Further, additional sounds such as S3 and S4 may be present. In such cases, it is not clear how to appropriately extract a coherent segment for spatial-frequency analysis. The method of MFCC suffers from the same limitation due to overlapping window analysis. Further, MFCC is well known to be sensitive to even small amounts of noise.

Automatic analysis of heart sounds has been attempted for abnormality detection within a single sound. Early work on phonocardiograms found correlation between sounds and various heart defects [19]. However, the sounds were studied by isolating a single heart beat duration. The predominant approach in heart sound similarity detection has been based on feature extraction and classification, as is conventional for audio analysis. In [19], a neural network was trained on heart sounds to classify several valve-related disorders. However, the identification of disorders was based on classifying sounds into intuitive groups such as normal, split sound, open snap, etc., based on detection of the S1 and S2 sounds and noting their separation. For many diseases, such as pericardial rub, the periods S1 and S2 are hardly distinct enough to detect and hence are not reliable for audio similarity detection. In [20], again, a neural net classifier was used in conjunction with wavelet processing for heart sound classification. However, the system was trained on a single sample for each disease using different heart beat cycles. Heart sound analysis using time-frequency representations has also been common, including recent applications of MFCC to heart sounds [5]. In all these approaches, no explicit variational modeling across different disease instances was performed other than using bounds on the sizes of the S1 and S2 phases. Finally, most of these methods assume that a single period of the heart sound can be obtained. However, in a long recording, particularly in the case of arrhythmias, finding periodic repetitions by simple correlation is insufficient.
The visual modeling of heart sounds is analogous to the visual modeling of music spectrograms reported in [25], although the spectral features were different. Finally, several methods of curve matching have been developed in the literature [23,24]. Methods exist to match two curves using curvature-scale space [24], and to dynamically warp the curves into one another using dynamic time warping [23]. The curve matching approaches, in general, are sensitive to spurious and missing data. This condition is quite likely in modeling disease instances, as the S1 or S2 phases are prolonged. The dynamic time warping algorithms for curve matching have mostly used time order and time proximity constraints. Shape-based dynamic time warping using combined shape cues, including the angle of corners and the orientation of their bisectors in combination with proximity, has not been proposed previously.

4 Representing heart sounds using perceptual envelopes

In our approach, we model the heart sound (within a single heart beat) through a perceptual envelope. Various algorithms are available in the literature for envelope extraction from signals, including homomorphic filtering [8]. While these algorithms are less sensitive to noise-related fluctuations, they frequently extract the low frequency component of the signal rather than render a faithful approximation of the perceptual envelope, as can be seen from Figure 5.

Figure 1. Illustration of different visual appearances of heart sounds for different diseases, indicating that visual discrimination of heart sounds is possible. Panels (a)-(c), (d)-(f), (g)-(i) run left to right, top to bottom.

To extract the audio envelope curve, we perform noise filtering using wavelet filters to remove the hissing noise that comes from digital stethoscopes. We then form a line segment approximation to the audio signal. This is a standard split-and-merge algorithm for line segment approximation that uses two thresholds, namely, a distance threshold and a line length threshold $l$, to recursively partition the audio signal into a set of line segments. Each consecutive pair of line segments then defines a corner feature $C_j$. With a distance threshold of 0.01 and $l = 5$, a faithful rendering of the audio signal is made possible through a line segment approximation while retaining only 10% of the samples. The thresholds for curve parameterization are not critical here, as the shape matching algorithm we present next is robust to missing and spurious features.

We define the audio envelope (AE) as the set of points $AE = \{P_i\}$ where $P_i = C_j$ for some $j$ such that $P_i(y) > P_{i-1}(y)$, $P_i(y) > P_{i+1}(y)$, and $P_i(y) > \text{baseline}$, i.e., the corners that are local maxima above the baseline. Only the values of the signal above the baseline are retained in the maxima envelope curve. Similarly, all peaks below the baseline are retained for the minima envelope curve.
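The envelope definition above can be illustrated with a small sketch. It assumes the corners from the line segment approximation are already available as (time, amplitude) pairs, and it compares each corner with its neighbouring corners; the function names, the strict inequalities, and the default baseline of 0 are illustrative assumptions rather than details taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (time, amplitude) of a corner


def maxima_envelope(corners: List[Point], baseline: float = 0.0) -> List[Point]:
    """Keep corners that are local maxima and lie above the baseline."""
    envelope = []
    for i in range(1, len(corners) - 1):
        y_prev = corners[i - 1][1]
        y = corners[i][1]
        y_next = corners[i + 1][1]
        if y > y_prev and y > y_next and y > baseline:
            envelope.append(corners[i])
    return envelope


def minima_envelope(corners: List[Point], baseline: float = 0.0) -> List[Point]:
    """Keep corners that are local minima and lie below the baseline."""
    envelope = []
    for i in range(1, len(corners) - 1):
        y_prev = corners[i - 1][1]
        y = corners[i][1]
        y_next = corners[i + 1][1]
        if y < y_prev and y < y_next and y < baseline:
            envelope.append(corners[i])
    return envelope


# Hypothetical corner list for one normalized heart beat.
example_corners = [(0, 0.0), (1, 0.8), (2, -0.5), (3, 0.3), (4, -0.9), (5, 0.1), (6, 0.0)]
upper = maxima_envelope(example_corners)
lower = minima_envelope(example_corners)
```

The two envelopes are kept separate here so that either one can be passed on to corner-based shape matching.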

Figure 2. Illustration of similarity retrieval for heart sounds. The top matches in each case are matching patients with the same disease. Panels (a)-(b), (c)-(d), (e)-(f) run left to right, top to bottom.

Figure 3. Illustration of the four major heart sounds.

Figure 4. Illustration of the alignment problem for heart sounds.

Figure 5. Illustration of envelope curve extraction of sounds. (a) Homomorphic filtering-based low frequency approximation (shown in red), (b) perceptual envelope.

Figure 5b illustrates the envelope curve extracted from a heart sound. As can be seen, the boundary shape of the signal is well captured in the envelope curve. Once the envelope curve $f(t)$ is extracted, its shape can be represented by the curvature change points, or corners, on the envelope curve. The shape information at each corner is captured using the following parameters

$$S(f(t_i)) = \langle t_i,\ f(t_i),\ \theta(t_i),\ \psi(t_i) \rangle \tag{1}$$

These features were chosen to facilitate matching of audio envelopes. Using the angle of the corner ensures that wider complexes are not matched to narrow complexes, as these can change the disease interpretation. The angular bisector, on the other hand, ensures that polarity reversals such as inverted waves can be captured. Here $\theta(t_i)$ is the included angle in the corner at $t_i$, and $\psi(t_i)$ is the orientation of the bisector at corner $t_i$.

5 Modeling morphological variations

We now develop a model of morphological variations in the shape of heart sound envelopes for different instances of a disease.

Figure 6. Illustration of the shape matching problem for envelope curves.

Referring to Figure 6, consider an envelope curve $g(t)$ corresponding to a heart sound. Consider another curve $f(t)$ that is a potential match to $g(t)$, i.e., it comes from a different patient diagnosed with the same disease. The curve $f(t)$ is considered perceptually similar to $g(t)$ if a non-rigid transform characterized by $[a, b, \Gamma]$ can be found such that

$$\| f'(t) - g(t) \| \le \delta \tag{2}$$

where $\|\cdot\|$ represents the distance metric that measures the difference between $f'(t)$ and $g(t)$, the simplest being the Euclidean norm, and

$$f'(t) = a\, f(\phi(t)) \quad \text{with} \quad \phi(t) = bt + \Gamma(t) \tag{3}$$

where $bt$ is the linear component of the transform and $\Gamma(t)$ is the non-linear translation component.

The parameters $a$ and $b$ can be recovered by normalizing in amplitude and time. We can normalize in amplitude by transforming $f(t)$ and $g(t)$ such that

$$\hat{f}(t) = \frac{f(t) - f_{\min}}{f_{\max} - f_{\min}} \quad \text{and} \quad \hat{g}(t) = \frac{g(t) - g_{\min}}{g_{\max} - g_{\min}} \tag{4}$$

so that $a = 1$. To eliminate solving for $b$, we can normalize the time axis by dividing by the heart beat period. Suppose the sampling rate of points on the curve $f(t)$ is $F_S$. Let a periodicity detection algorithm signal the heart beat period to be $T_1$ (periodicity detection is discussed in Section 6.1). Then, dividing by the heart period in samples, we can ensure that all time instants lie in the range [0,1]. Thus the time normalization can be easily achieved as:

$$\tilde{f}(t) = \hat{f}(t / T_1), \qquad \tilde{g}(t) = \hat{g}(t / T_2) \tag{5}$$

where $T_1$ and $T_2$ are the heart beat durations of $f(t)$ and $g(t)$ respectively. With this time normalization, $b = 1$. Such amplitude and time normalization automatically makes the shape modeling invariant to amplitude variations in audio recordings, as well as to variations in heart rate across patients.

Since the non-uniform translation $\Gamma$ is a function of $t$, we can avoid computational overhead by recovering it at important fiducial points such as the corners, and recover the overall shape approximation by interpolation. Let there be $K$ features extracted from $\tilde{f}(t)$ as $F_K = \{(t_1, f_1(t_1)), (t_2, f_2(t_2)), \ldots, (t_K, f_K(t_K))\}$ at times $\{t_1, t_2, \ldots, t_K\}$ respectively. Let there be $M$ fiducial points extracted from $\tilde{g}(t)$ as $G_M = \{(t'_1, g_1(t'_1)), (t'_2, g_2(t'_2)), \ldots, (t'_M, g_M(t'_M))\}$ at times $\{t'_1, t'_2, \ldots, t'_M\}$ respectively. If we can find a set of $N$ matching fiducial points $C_\Gamma = \{(t_i, t'_j)\}$, then the non-uniform translation transform can be defined through the warped time $\phi(t) = t + \Gamma(t)$ as:

$$\phi(t) = \begin{cases} t'_j & \text{if } t = t_i \text{ and } (t_i, t'_j) \in C_\Gamma \\[4pt] t_r + \dfrac{t_s - t_r}{t_l - t_k}\,(t - t_k) & \text{otherwise, where } (t_k, t_r), (t_l, t_s) \in C_\Gamma \end{cases} \tag{6}$$

Here $t_k$ is the highest of $\{t_i\} \le t$ and $t_l$ is the lowest of $\{t_i\} \ge t$ that have a valid mapping in $C_\Gamma$. Other interpolation methods besides linear (e.g., spline) are also possible.

Using Equations 5 and 6, the shape approximation error between the two curves is then given by:

$$\| f'(t) - g(t) \| = \| \tilde{f}(\phi(t)) - \tilde{g}(t) \| \tag{7}$$

For each $g(t)$, we would like to select $\Gamma$ such that it minimizes the approximation error in (7) while maximizing the size of the match $C_\Gamma$. Finding the best matching audio based on shape can then be formulated as finding the $g(t)$ such that

$$g_{best} = \arg\min_{g} \| \tilde{f}(\phi(t)) - \tilde{g}(t) \| \tag{8}$$

while choosing the best $\Gamma$ for each respective candidate match $g(t)$.

Figure 7. Illustration of shape-based dynamic time warping.

Figure 8. Illustration of the comparison between shape-based time warping and the MFCC algorithm for pair-wise discrimination. (a) Confusion matrix using our algorithm, (b) confusion matrix using MFCC.

5.1 Shape-based dynamic time warping

If we consider the feature sets $F_K$ and $G_M$ extracted from the respective curves as sequences, the problem of computing the best $\Gamma$ reduces to finding the best global subsequence alignment using the dynamic programming principle. The best global alignment maximizes the match of the curve fragments while allowing for possible gaps and insertions. Gaps and insertions correspond to signal fragments from feature set $G_M$ that do not find a match in set $F_K$, and vice versa. In fact, the alignment can be computed using a dynamic programming matrix $H$ where the element $H_{i,j}$ is the cost of matching up to the $i$th and $j$th elements in the respective sequences. As more features find a match, we want the cost to increase as little as possible. The dynamic programming step in our case becomes:

$$H_{i,j} = \min \begin{cases} H_{i-1,j-1} + d(f(t_i), g(t'_j)) \\ H_{i-1,j} + d(f(t_i), 0) \\ H_{i,j-1} + d(0, g(t'_j)) \end{cases} \tag{9}$$

with initialization $H_{0,0} = 0$, and $H_{0,j} = \infty$ and $H_{i,0} = \infty$ for all $0 < i \le K$ and $0 < j \le M$. Here $d(\cdot)$ is the cost of matching the individual features, described next. The first term represents the cost of matching the feature point $f(t_i)$ to the feature point $g(t'_j)$, which is low if the features are similar. The second term represents the choice where no match is assigned to feature $f(t_i)$.

5.2 Shape similarity of envelope curves

After the transformation is recovered, the similarity between two envelope curves is given by the cost function

$$d(f(t_i), g(t'_j)) = \begin{cases} (t_i - t'_j)^2 + (f(t_i) - g(t'_j))^2 + (\theta(t_i) - \theta(t'_j))^2 + (\psi(t_i) - \psi(t'_j))^2 & \text{if } |t_i - t'_j| \le \lambda_1,\ |f(t_i) - g(t'_j)| \le \lambda_2,\ |\theta(t_i) - \theta(t'_j)| \le \lambda_3,\ |\psi(t_i) - \psi(t'_j)| \le \lambda_4 \\[4pt] \infty & \text{otherwise} \end{cases} \tag{10}$$

The thresholds $(\lambda_1, \lambda_2, \lambda_3, \lambda_4)$ are determined through a training phase in which the expected variations per disease class are noted. The cost function $d(f(t_i), 0)$ can be computed by substituting $t'_j = 0$, $g(t'_j) = 0$, $\theta(t'_j) = 0$, and $\psi(t'_j) = 0$ in Equation 10. The cost function $d(0, g(t'_j))$ can be computed similarly. Thus, using this approach, two heart sounds are considered similar if a sufficient number of fiducial points between the query and target envelope curves can be matched using shape-based dynamic time warping. In general, due to period estimation offset errors, the signals may have to be circularly shifted by a fixed translation for a rough alignment before the fine non-rigid alignment described above.
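The recurrence of Section 5.1 and the thresholded feature cost of Section 5.2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each fiducial point is a tuple (time, value, corner angle, bisector orientation), unmatched features pay a constant `gap` penalty instead of a cost obtained by substituting zeros, and the border cells of the matrix accumulate gap penalties so that leading features may be skipped. The threshold values and all names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# (time, value, corner angle, bisector orientation) -- assumed feature layout
Feature = Tuple[float, float, float, float]


def match_cost(f: Feature, g: Feature, thresholds: Sequence[float]) -> float:
    """Sum of squared differences if every component is within its threshold, else inf."""
    diffs = [abs(a - b) for a, b in zip(f, g)]
    if any(d > lam for d, lam in zip(diffs, thresholds)):
        return math.inf
    return sum(d * d for d in diffs)


def shape_dtw(F: List[Feature], G: List[Feature], gap: float = 1.0,
              thresholds: Sequence[float] = (0.1, 0.2, 0.5, 0.5)):
    """Return (total cost, matched index pairs) for the best global alignment."""
    K, M = len(F), len(G)
    H = [[math.inf] * (M + 1) for _ in range(K + 1)]
    back = [[""] * (M + 1) for _ in range(K + 1)]
    H[0][0] = 0.0
    for i in range(1, K + 1):          # assumption: borders accumulate gap costs
        H[i][0], back[i][0] = i * gap, "up"
    for j in range(1, M + 1):
        H[0][j], back[0][j] = j * gap, "left"
    for i in range(1, K + 1):
        for j in range(1, M + 1):
            options = (
                (H[i - 1][j - 1] + match_cost(F[i - 1], G[j - 1], thresholds), "diag"),
                (H[i - 1][j] + gap, "up"),     # F[i-1] left unmatched
                (H[i][j - 1] + gap, "left"),   # G[j-1] left unmatched
            )
            H[i][j], back[i][j] = min(options, key=lambda o: o[0])
    # Trace back to recover the matched fiducial pairs.
    pairs = []
    i, j = K, M
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "diag":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move == "up":
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return H[K][M], pairs


# Hypothetical query and database features on normalized time and amplitude.
query = [(0.10, 0.90, 1.0, 0.0), (0.40, 0.50, 1.2, 0.1), (0.80, 0.70, 0.9, 0.0)]
target = [(0.12, 0.88, 1.0, 0.0), (0.60, 0.20, 2.5, 1.5), (0.83, 0.72, 0.95, 0.0)]
cost, matches = shape_dtw(query, target)
```

The number of matched pairs relative to the number of query features gives one simple way to express the "sufficient number of fiducial points" criterion, and the matched pairs are the correspondences from which a non-uniform translation could be interpolated.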

6 Feature pre-processing

The above formulation assumed that a signal of a single heart beat duration could be isolated prior to initiating shape matching. We now describe the pre-processing steps that lead to such an isolation.

6.1 Periodicity detection

While normal heart sounds show good repetition, the periodicity is surprisingly difficult to spot for abnormal heart sounds, where the repetitions can be nested or irregular (arrhythmias). Simple auto-correlation is often insufficient for this purpose. A robust periodicity detector was therefore developed that treats periodicity detection as the problem of recovering a shift/translation that best aligns two self-similar curves. Consider a periodic curve $g(t)$ with period $T$. Then by definition $g(t) = g(t + kT)$ for all multiples $k = 0, 1, 2, \ldots$. Consider a candidate period $\tau$. Form a curve $f(t)$ by shifting $g(t)$ by $\tau$, i.e.,

$$f(t) = \begin{cases} g(t - \tau) & \text{if } t \ge \tau \\ g(t) & \text{otherwise} \end{cases} \tag{11}$$

Then define a function $R(\tau)$ that records the number of curve features that can be verified to satisfy the periodicity condition based on the current estimate of the period as

$$R(\tau) = \frac{\left|\{ g(t_i) \ \text{s.t.}\ g(t_i) = f(t_i) \}\right|}{\max\{N, \tau\}} \tag{12}$$

where $N$ is the total number of points on the curve. The above function can be computed in linear time, in comparison to the quadratic time for the autocorrelation function. The function $R(\tau)$ shows peaks at precisely those shifts which correspond to periodic repetitions of the curve. The most likely period is then taken as the smallest $\tau$ with the most integer multiples in the allowed range of heart beats (40-180 beats/minute). In general, there may be more than one candidate for the period, particularly when the periodic repetitions are nested. Our algorithm finds all such overlapping repetitions and tests the dynamic time warping algorithm with each such choice.

7 Results

The audio shape matching algorithm was tested on a large database of heart sounds. The data came from various teaching hospitals and reference CDs provided by digital stethoscope makers (Littmann). Thus the ground truth labels for diseases were known during the evaluation. Currently, the collection has several hundred heart sound examples for various kinds of murmurs, mitral regurgitation, mitral stenosis, septal defects, cardiomyopathy, etc. Each disease was represented by at least 6-10 patient samples in the database.

Figure 7 illustrates the steps involved in audio shape matching through a sample retrieval. Figure 7a shows the query audio signal and Figure 7b shows a matching audio sound from the database. The result of periodicity detection is shown overlaid in the respective figures in red. Figures 7c and 7d show the audio envelopes extracted from the respective signals, normalized in amplitude and time over a single heart beat duration. Figures 7e and 7f show the fiducial features extracted for matching from the query and database signals. Direct alignment of the two signals by cross-correlation is shown in Figure 7g. As can be seen, alignment errors are still present. The shape-based dynamic time warping alignment is indicated in Figure 7i. The projection of fiducial features from the query (green circles) onto the matching database curve (in magenta) using the non-rigid alignment is indicated in Figure 7h. In this example, 110 of the 114 query fiducial features from Figure 7e have found a match with fiducial features of the database curve. From this figure, it is clear that although a single fixed shift can bring the signals into rough alignment, matching them requires different amounts of shift in different regions. Furthermore, the direction of the shift can be opposite in different sections when the systolic and diastolic phases are affected differently by the disease. Figures 2a-f illustrate three other examples of heart sound similarity retrieval. In each case, the query is shown on the left and the best matching heart sound is shown on the right.

7.1 Comparison with MFCC

We compared the performance of our algorithm with a traditional signal processing approach using Mel-frequency cepstral coefficients (MFCC) [12]. The MFCC method was chosen as it has been the most successful of the audio analysis approaches and has recently been used to classify heart diseases [5].
We implemented a version of MFCC in which the raw audio signals (without periodicity detection) were divided into short-time segments of 400 samples, and MFCC coefficients were extracted from each segment using the short-time Fourier transform (STFT). The coefficients of all segments were used to form the feature vector, and Euclidean distance was used to find the nearest matches.

We tested the two techniques for grouping sounds from similar diseases and discriminating sounds from different diseases using the pair-wise similarity matrix. A technique that discriminates well forms bands along the diagonal of the similarity matrix. Figures 8a and 8b show the similarity matrices obtained with our algorithm and with MFCC, respectively. As can be seen, the banding structure is clearer with our method than with MFCC, where the banding is evident mostly for normal heart sounds. Further, the outlier (the orange colored signal) that has no match in the database clearly stands out using the DTW alignment.

7.2 Precision recall performance

Finally, we evaluated the precision and recall of the two methods by using all available samples per disease as queries and retrieving the top K matches from the database for various choices of K. Precision and recall were defined as

$$\text{Recall} = \frac{\#\ \text{correct matches selected}}{\#\ \text{correct matches present}}$$

$$\text{Precision} = 1 - \frac{\#\ \text{incorrect matches selected}}{\#\ \text{matches returned}}$$

The precision and recall values were averaged over the queries tested for the respective disease classes. Figure 8c shows the performance of the two methods on the entire audio database. Again, as can be seen, both precision and recall are higher using the more precise audio shape matching algorithm than using MFCC. Due to the averaging over a large number of queries, the precision values are somewhat lower than expected here. Sample precision and recall values for some diseases, averaged over all queries of the same disease class, are shown in Table 1 below and indicate good precision and recall for specific diseases. In general, we observed that the top K matches returned by our method often contained signals whose visual appearance was similar and whose corresponding auscultation sounds were similar.

Table 1. Illustration of precision and recall.

Disease                      DTW Precision   DTW Recall   MFCC Precision   MFCC Recall
Mitral regurgitation         56.67%          70.83%       43.33%           50.0%
Patent Ductus Arteriosus     100%            75%          50.0%            37.5%
Pericardial Rub              67.2%           85.6%        54.4%            45.7%
Mitral stenosis              76.2%           78.3%        45.2%            39.8%
Ventricular septal defect    65.4%           82.3%        54.2%            48.9%
Atrial Fibrillation (AF)     78.2%           84.5%        64.7%            54.3%

8. Conclusions

In this paper, we have presented a novel algorithm for shape-based retrieval of heart sounds. Unlike existing work based on feature extraction and classification, we take the approach of non-rigid shape alignment for retrieval. The algorithm is independent of disease specifics and is potentially applicable to other audio signals whose visual appearance is sufficiently discriminatory.
Our experiments demonstrate that this approach significantly outperforms the current state-of-the-art approaches based on heart sound signal analysis.

References

[1] Erikson B., Heart sounds and murmurs: A practical guide. Mosby-Year, 1997, 9-12.
[2] A.F. Pease, If the heart could speak, online reference from w4.siemens.de/fui/en/archiv/pof/heft2_01/artikel19/index.html.
[3] http://www.zargis.com/whitepaperintro.asp
[4] See survey of papers on auscultation analysis at http://www.biosignetics.com/bioengineeringpapers.htm.
[5] I. Kamarulafizam et al., Heart Sound Analysis Using MFCC and Time Frequency Distribution, in Proc. 3rd Intl. Conf. on Biomedical Eng., Kuala Lumpur, 2006.
[6] O.A. Alim, N. Hamdy, and M.A. El-Hanjouri, Heart diseases diagnosis using heart sounds, in Proc. National Radio Science Conference (NRSC), pp. 634-640, 2002.
[7] Hebden, J.E.; Torry, J.N. Identification of aortic stenosis and mitral regurgitation by heart sound analysis. Computers in Cardiology 1997, 7-10 Sept. 1997, pp. 109-112.
[8] I. Rezek and S.J. Roberts, Envelope Extraction via Complex Homomorphic Filtering, Research Report TR-98-9, June 1998.
[9] Univ. of Washington Medical School (http://depts.washington.edu/physdx/index.htm).
[10] L. Rabiner and R. Schafer, Digital Processing of Speech Signals, Prentice Hall: NJ.
[11] Jonathan Foote, Matthew Cooper. Audio Retrieval by Rhythmic Similarity. Proceedings of ISMIR 2002, Paris, France, October 2002.
[12] J. T. Foote, Content-based retrieval of music and audio, Proc. of SPIE, vol. 3229, pp. 138-147, 1997.
[13] L. Lu, H.-J. Zhang, H. Jiang, Content Analysis for Audio Classification and Segmentation, IEEE Trans. on Speech and Audio Processing, vol. 10, no. 7, pp. 504-516, 2002.
[14] Lie Lu, Rui Cai, and Alan Hanjalic. Towards A Unified Framework for Content-based Audio Analysis, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2005, vol. II, pp. 1069-1072, 2005.
[15] R. Cai, L. Lu, H.-J. Zhang, Using structure patterns of temporal and spectral feature in audio similarity measure, Proc. ACM Multimedia 2003, pp. 219-222.
[16] Cai, R., Lu, L., Hanjalic, A., Zhang, H.-J., and Cai, L.-H. A flexible framework for key audio effects detection and auditory context inference. In IEEE Trans. Speech Audio Processing, May 2006.
[17] Guodong Guo and Stan Z. Li, Content-Based Audio Classification and Retrieval by Support Vector Machines, IEEE Transactions on Neural Networks, vol. 14, no. 1, Jan. 2003.
[18] Jandre, F.C.; Souza, M.N. Wavelet analysis of phonocardiograms: differences between normal and abnormal heart sounds. Proceedings of the 19th Annual International Conference of the IEEE, vol. 4, 30 Oct-2 Nov 1997, pp. 1642-1644.
[19] O.A. Alim et al., Heart diseases diagnosis using heart sounds, in Proc. 19th National Radio Science Conference, Alexandria, March 2002.
[20] Say, O.; Dokur, Z.; Olmez, T. Classification of heart sounds by using wavelet transform, 24th Annual Conference and the Annual Fall Meeting of the Biomedical Engineering Society, EMBS/BMES Conference, vol. 1, 2002, pp. 128-129.
[21] Iead Rezek and Stephen J. Roberts (1998). Envelope Extraction via Complex Homomorphic Filtering. Research Report TR-98-9, June 1998.
[22] Elfeky, M.G.; Aref, W.G.; Elmagarmid, A.K. WARP: time warping for periodicity detection. In Fifth Intl. Conf. on Data Mining, p. 8, Nov. 2005.
[23] H.J. Wolfson, On curve matching, in IEEE Trans. PAMI, vol. 12, pp. 483-489, May 1990.
[24] B. Avants and J. Gee, Continuous curve matching with scale-space curvature and extrema-based scale selection, in Proc. Scale-space Methods in Computer Vision, p. 1079, 2003.
[25] Y. Ke, D. Hoiem, and R. Sukthankar. Computer vision for music identification, in Proc. CVPR 2000.