A Lip Reading Application on MS Kinect Camera
Alper Yargıç, Muzaffer Doğan
Computer Engineering Department, Anadolu University, Eskisehir, Turkey

Abstract— Hearing-impaired people can read lips, and lip reading applications may help them improve their lip imitation skills. The speech of normal-hearing people can be recognized even by cellular phones, but lip reading systems using only visual features remain important for hearing-impaired people. This paper aims to develop an application using the MS Kinect camera to recognize Turkish color names, to be used in the education of hearing-impaired children. Predefined lip points are located with depth information by the MS Kinect Face Tracking SDK. Words are segmented from the speech, and the angles between the lip points are used as features to classify the words. The angles are computed from the 3D coordinates of the lip points. The KNN classifier is used to classify the words with Manhattan and Euclidean distances, and the best feature vectors are sought. As a result, the isolated words are classified with a success rate of 78.22%.

Keywords— lip reading; MS Kinect camera; 3D face tracking; k-nearest neighbor classifier; lip activation; lip motion detection

I. INTRODUCTION

With developing technology, there has been increasing interest in automatic lip reading systems. Lip reading systems are commonly used to support hearing-impaired people: the system measures how successful a hearing-impaired person is when trying to mimic mouth motions, and also scores pronunciation accuracy [1, 2]. The first step of a lip reading system is to acquire the speaker's mouth images and to find the coordinates of predefined points on the mouth. The motions of those points are then analyzed with classifier methods, such as artificial neural networks [3], hidden Markov models [4], and the k-nearest neighbor classifier [5], to recognize the spoken word.
Audio information can also be used as a supplementary resource alongside visual mouth images in lip reading systems to increase recognition success. However, the importance of visual features, such as lip movements and the shape of the mouth, cannot be underestimated [6, 7]. Classification systems using only visual data are available in the literature [8, 9]. Both hearing-impaired people and people with normal hearing use sound signals and visual features while listening to a conversation [10]. In the real world, audio and visual features complement each other, and the same holds for real-time lip reading applications. Classification based on audio or video data alone is less accurate, so combining audio and video data increases the accuracy of speech recognition systems [11]. A data set created in a low-noise environment can reach accuracy over 95% [12], but classification accuracy drops as the noise level in the environment increases. The consistency and classification rate of audio-visual systems are better than those of visual-only or audio-only systems. However, hearing-impaired people cannot extract the audio in a conversation and cannot use audio information to recognize lip motions. Therefore, increasing the accuracy of visual-only lip reading systems is important for applications that model hearing-impaired people's lip reading behavior. (This research was supported by the Scientific Research Projects Commission of Anadolu University, Project number: 1302F039.)

Visual-only classification systems can be divided into two groups: shape-based and appearance-based systems [9]. Shape-based systems use predefined points on the lips as features, such as the distance between two points or the angle between three points on the lip [11]. Appearance-based systems are based on pixel intensity values around the mouth area of the image [13].
In this paper, the MS Kinect camera, which detects predefined lip points using an appearance-based algorithm, is used, and some isolated Turkish words whose lip points are obtained by the camera are classified with the k-nearest neighbor (KNN) classifier. The aim of the study is to eventually develop an application that helps hearing-impaired people improve their lip reading imitations. In Sec. II, face tracking with the MS Kinect sensor is explained. Sec. III describes the data preprocessing before classification. Sec. IV describes lip activation detection, lip movement tracking, active lip motion interval detection, and segmentation of an isolated word. Sec. V describes the classification of the isolated words with KNN. Experimental results are presented in Sec. VI and discussed in Sec. VII.

II. FACE TRACKING WITH KINECT SENSOR

The Microsoft Kinect sensor is an effective camera for face tracking and real-time movement tracking applications, because it has an integrated infrared sensor and captures a stream of color images together with the depth data of each image [14]. The Kinect sensor obtains 3D data using these components:

- A color camera
- An infrared emitter
- An infrared receiver

The Kinect sensor is supported by the Kinect Face Tracking Software Development Kit (Face Tracking SDK), whose
versions after 1.5 provide tools for developing real-time face tracking and lip reading applications on the .NET Framework. The Face Tracking engine analyzes the input images to compute the 3D position of the head and locates 121 predefined points on the face model. These face points are shown in Fig. 1. The SDK engine also measures the distances of these face points to the camera and makes these values available to an application within the same processing step. The Face Tracking SDK uses an Active Appearance Model as a 2D feature tracker, and the 2D data obtained from the Kinect sensor are extended to 3D by adding depth information [15]. Face tracking is then accomplished in the 3D Kinect coordinate system, where depth values and skeleton space coordinates are expressed in meters [16]. The x and y axes represent the skeleton space coordinates and the z axis represents the depth, as shown in Fig. 3. In this study, the Kinect sensor and the Kinect Face Tracking SDK are used for face detection, lip detection, and lip motion tracking.

18 of the 121 face points represent the lips: 8 of them lie on the inner lip and 10 on the outer lip. Each lip point is assigned an integer ID value to identify it in this paper. The lip points and their IDs are shown in Fig. 2.

Fig. 1. The 121 face points tracked by the Kinect sensor
Fig. 2. The 18 lip feature points and their assigned ID values
Fig. 3. Kinect coordinate space

III. DATA PREPROCESSING

In real-time face tracking and lip reading systems, the speaker's head pose, head movements, camera-to-head distance, and head direction are important parameters that affect the lip reading performance and robustness of the system. In real-time lip reading systems, the most frequently used features are the distance between the upper and lower lips and the width of the mouth. The width and height deformation rate of the mouth is also used as a feature in the classification process.
Classification methods that use these features have reached effective word classification rates, because the biggest deformation of the lip contour occurs in these features [12]. Input images should be preprocessed before classification because of three main problems that may occur while collecting data:

- People have different lip contour shapes and mouth height/width ratios, and these differences may cause different classifications.
- The distance between the camera and the speaker affects the pixel distances between lip points on the image, and classification algorithms based on these distances may suffer from those differences.
- If the speaker turns his/her head slightly to the left or right, the height/width ratio of the mouth may change, which can lead to incorrect classifications.

To prevent these problems, 3D data containing depth information [17] and the angles between the lip points [18] can be used instead of width and height information between pixels on the input images. The depth information, i.e., the 3D data, eliminates the distance problem, and the angles eliminate the shape and height/width ratio problems. Therefore, no further data preprocessing is required when 3D information and angles are used to classify the isolated words.

IV. WORD SEGMENTATION

In lip reading systems, the first step is to determine the starting and ending points of a word in the speech video. Precisely determining these points increases the success of the classification. Detecting where a word starts and finishes is called word segmentation. Word segmentation can be performed after lip activation detection, which decides whether the movements of the lips are meaningful or not.
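To illustrate why the angle features of Sec. III are robust to camera distance, the angle at a vertex lip point can be computed directly from 3D coordinates; scaling all coordinates (i.e., moving the speaker closer or farther) leaves the angle unchanged. This is a minimal sketch with a hypothetical helper name, not the authors' implementation:

```python
import math

def lip_angle(a, b, c):
    """Angle in degrees at vertex b, formed by the rays b->a and b->c.

    a, b, c are 3D points, e.g. the (x, y, z) coordinates of lip points
    from the Face Tracking SDK (IDs as in Fig. 2)."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(p * q for p, q in zip(ba, bc))
    norms = math.dist(a, b) * math.dist(c, b)
    # clamp to guard against floating-point drift outside [-1, 1]
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norms))))
```

For example, the vertical-opening feature would be the angle at lip point 9 between points 12 and 17, i.e. `lip_angle(p12, p9, p17)` with the three tracked coordinates as inputs.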
During the pronunciation of a word, lip activation can be detected using the standard deviation of a specified lip feature over the past N image frames, where the value of N is chosen experimentally according to the frame rate of the video data. The standard deviation of a feature f over the N frames ending at the i-th frame is computed by the formula

\sigma_i = \sqrt{ \frac{1}{N} \sum_{k=i-N+1}^{i} \left( f_k - \bar{f}_i \right)^2 }   (1)

where f_k is the k-th value of the feature f and \bar{f}_i is the mean value of f over the past N frames [12].

In this paper, 4 points on the outer lip, specified with the IDs 9, 12, 15, and 17 in Fig. 2, are used to define 2 features. The first feature, f1, is the angle between the lip points 12, 9, and 17; the second feature, f2, is the angle between the lip points 9, 12, and 15, as shown in Fig. 4. These features represent how much the mouth is opened in the vertical and horizontal axes, respectively. The number of frames, N, used to detect lip activation is taken as 5. The standard deviation of f1 over the last 5 frames is computed according to Eq. (1). If \sigma_i for the i-th frame is greater than 1, the lips are assumed to be active at the i-th frame; otherwise they are assumed to be passive. This threshold value of 1 was determined from preliminary experiments.

A computer program was developed that displays a word to the speaker, and the speaker reads that word while the Kinect camera records the data. If 10 consecutive frames are classified as active (1) or passive (0) in the order [0 0 0 0 0 1 1 1 1 1], where 5 passive frames are followed by 5 active frames, then the frame where 1 first occurs, i.e., the 6th frame, is taken as the starting frame of the word. If the activity of the frames occurs in the order [1 1 1 1 1 0 0 0 0 0], where 5 active frames are followed by 5 passive frames, then the first occurrence of 0 is taken as the ending frame of the word. Using these starting and ending frames, the words can be segmented from the input video.

Fig. 5. Word segmentation steps: input features, lip activation detection, active interval detection, word segmentation
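The activation and boundary rules above can be sketched as follows. This is an illustrative reconstruction under the paper's stated parameters (N = 5, threshold 1, the 5-passive/5-active patterns); the function names are hypothetical. Note that Eq. (1) divides by N, i.e., it is a population standard deviation:

```python
import statistics

N = 5            # frames in the rolling window (Sec. IV)
THRESHOLD = 1.0  # activation threshold on sigma_i, from preliminary experiments

def activity(feature_values):
    """Flag each frame 1 (active) if the std of the last N feature values exceeds the threshold."""
    flags = []
    for i in range(len(feature_values)):
        window = feature_values[max(0, i - N + 1): i + 1]
        # population std (divide by N), matching Eq. (1); 0 until a full window exists
        sd = statistics.pstdev(window) if len(window) == N else 0.0
        flags.append(1 if sd > THRESHOLD else 0)
    return flags

def word_boundaries(flags):
    """Return (start, end) frame indices via the 5-passive/5-active patterns."""
    start = end = None
    for i in range(len(flags) - 9):
        chunk = flags[i:i + 10]
        if start is None and chunk == [0] * 5 + [1] * 5:
            start = i + 5            # first active frame: word start
        elif start is not None and chunk == [1] * 5 + [0] * 5:
            end = i + 5              # first passive frame after the word: word end
            break
    return start, end
```

Frames between `start` and `end` would then be cut out of the recording as one isolated word.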
The word segmentation steps are shown in Fig. 5.

V. WORD CLASSIFICATION

A. Preprocessing (Normalization)

Pronunciations of different words take different time intervals. For example, the word "beyaz" in Turkish (meaning white) may take about 1 second, while the word "kahverengi" (meaning brown) may take 1.25 seconds. If the frame rate of the video is 12 fps, then "beyaz" takes 12 frames and "kahverengi" takes 16 frames. Furthermore, two different persons may pronounce the same word over different intervals, and even the same person may pronounce the same word over different intervals at two different times, depending on his/her mood. Thus, the frame-lengths of the words should be normalized for a successful classification. The linear regression method is widely used to normalize segmented word data for speaker adaptation in real-time lip reading systems [19]. Here, cubic spline interpolation [20] is used for normalization, so that each word ends up with the same frame-length of 15.

B. K-Nearest Neighbor Classifier

The KNN classifier is a popular classifier for speech recognition systems because of its simplicity and strong classification rates in lip reading applications [5]. Input data are classified by using the distance to the nearest neighbor. During the classification process, each isolated Turkish word is assigned to the nearest neighbor based on the Euclidean or Manhattan distance. The KNN classifier does not need retraining when a new word is added to the training data set, and it does not need additional training data for classification [21].

Fig. 4. The angle features f1 and f2

VI. EXPERIMENTAL RESULTS AND DISCUSSION

A computer application was developed to acquire and analyze images of speakers from the MS Kinect camera. The video images are recorded in a normally lit environment, at 12 frames per second and a resolution of 1280x960. The Face Tracking SDK is used to detect the face and extract the lip points.
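The length normalization of Sec. V.A can be sketched with SciPy's cubic spline, resampling each segmented word's feature sequence onto a fixed 15-frame axis. This is a sketch under the assumption that SciPy is available; the function name is hypothetical:

```python
import numpy as np
from scipy.interpolate import CubicSpline

TARGET_LEN = 15  # every segmented word is resampled to 15 frames (Sec. V.A)

def normalize_length(frames, target_len=TARGET_LEN):
    """Resample a per-frame feature sequence to a fixed length via cubic spline.

    frames: shape (n_frames,) or (n_frames, n_features)."""
    frames = np.asarray(frames, dtype=float)
    t = np.linspace(0.0, 1.0, num=len(frames))      # original (word-relative) time axis
    t_new = np.linspace(0.0, 1.0, num=target_len)   # normalized time axis
    spline = CubicSpline(t, frames, axis=0)         # interpolate each feature column
    return spline(t_new)
```

A 12-frame word ("beyaz" at 12 fps) and a 16-frame word ("kahverengi") both come out as 15-frame sequences, so their feature vectors become directly comparable in the classifier.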
During the recordings, the 2D video images are stored along with the 2D coordinates of the lip points and the 3D information, which contains the depth of each lip point. The data are then processed offline to segment the words, and the words are classified using KNN. The whole process is summarized in Fig. 6.
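The classification step described in Sec. V.B can be sketched as a nearest-neighbor lookup over the normalized feature vectors, with the distance metric selectable between Manhattan and Euclidean. This is illustrative code with hypothetical names, not the authors' implementation:

```python
def knn_classify(sample, train_set, metric="manhattan"):
    """Nearest-neighbor (k = 1) classification of a word's feature vector (Sec. V.B).

    train_set: list of (feature_vector, word_label) pairs."""
    def dist(u, v):
        if metric == "manhattan":
            return sum(abs(a - b) for a, b in zip(u, v))
        # euclidean
        return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
    # assign the sample the label of its closest training vector
    return min(train_set, key=lambda pair: dist(sample, pair[0]))[1]
```

Adding a new word to the vocabulary only requires appending its examples to `train_set`, which matches the paper's point that KNN needs no retraining.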
Fig. 6. The flowchart for the lip reading system: the input image of the speaker passes through the Face Tracking SDK (face detection, lip detection, lip feature detection) and then through offline data processing (lip activation detection, activation interval detection, word segmentation, and word classification with KNN)

This study aims to eventually develop an application that improves the lip imitation skills of hearing-impaired children with the aid of an instructor. For this reason, the most frequently used Turkish color words were chosen to construct a dataset. These 15 color names and their English translations are listed in Table I.

TABLE I. LIP READING DATA SET
Beyaz - White
Bordo - Burgundy
Gri - Gray
Kahverengi - Brown
Kırmızı - Red
Lacivert - Navy blue
Mavi - Blue
Menekşe - Violet
Mor - Purple
Pembe - Pink
Sarı - Yellow
Siyah - Black
Turkuaz - Turquoise
Turuncu - Orange
Yeşil - Green

The Kinect camera is located about 90 cm away from the speaker's face, and each speaker repeats each of the 15 isolated Turkish words 5 times. For the experiments, 10 volunteers read the words, so the data set consists of 750 words in total.

Each of the angles on the lip, listed in Table II, is separately used to classify the words with the KNN algorithm. The second and third columns of Table II show the percentages of correctly classified words over the total 750 spellings when the Manhattan and Euclidean distances, respectively, are used in the KNN classifier. The Manhattan distance gives better classification rates than the Euclidean distance. It is also seen that four of the angles give the best classification rates; these four angles are shown in Fig. 7.

TABLE II. ACCURACY OF WORD CLASSIFICATION USING KNN (percentage of correctly classified words per angle, with Manhattan and Euclidean distances)
Fig. 7. Visual representation of the four best angles

As the second experiment, the four best angles are used together to classify the words, instead of using only one feature as in the first experiment. In the third experiment, all of the angles are used for the classification. The Manhattan distance is used in both experiments. The percentages of correctly recognized words are presented in Table III. Using all of the angles gives a better classification rate, but the computation takes more time.

TABLE III. PERCENTAGE OF CORRECTLY CLASSIFIED WORDS
Four best angles - 72.44
All of the angles - 78.22

VII. CONCLUSION

In this paper, the MS Kinect camera and the Face Tracking SDK are used to acquire video images and detect the 3D coordinates of predefined points on the lips. A set of isolated words consisting of Turkish color names is constructed, and the words are classified by the KNN classifier. The features are selected from the angles between points on the lips, and reasonable results are obtained using these angle features. Each angle is used separately to classify the words, and the best angle features are determined. It is shown that the best single angle alone correctly classifies 63.33% of the words. As the number of features increases, the classification rate also increases, but computation gets slower. The success of the classifier is 78.22% when all angle features are used and 72.44% when only the best four angles are used. Using four angles decreases the computation time, so they can reasonably be used instead of all angles. The KNN classifier is used to classify the words, and it is shown that the Manhattan distance yields better results than the Euclidean distance in the KNN classifier. As future work, the experiments will be repeated with the new version of the MS Kinect camera for Windows, which supports a near mode in which the face can be tracked from only 40 cm away.

REFERENCES

[1] L. Neumeyer, H. Franco, V. Digalakis and M.
Weintraub, "Automatic scoring of pronunciation quality," Speech Commun., vol. 30.
[2] O. Turk and L. Arslan, "Speech recognition methods for speech therapy," in Signal Processing and Communications Applications Conference, Proceedings of the IEEE 12th, 2004.
[3] A. Bagai, H. Gandhi, R. Goyal, M. Kohli and T. Prasad, "Lip-reading using neural networks," IJCSNS, vol. 9, pp. 108.
[4] L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77.
[5] T. Pao, W. Liao and Y. Chen, "Audio-visual speech recognition with weighted KNN-based classification in mandarin database," in Intelligent Information Hiding and Multimedia Signal Processing, IIHMSP Third International Conference on, 2007.
[6] W. H. Sumby and I. Pollack, "Visual contribution to speech intelligibility in noise," J. Acoust. Soc. Am., vol. 26, pp. 212.
[7] K. K. Neely, "Effect of visual factors on the intelligibility of speech," J. Acoust. Soc. Am., vol. 28, pp. 1275.
[8] W. C. Yau, D. K. Kumar and S. P. Arjunan, "Visual recognition of speech consonants using facial movement features," Integrated Computer-Aided Engineering, vol. 14.
[9] W. C. Yau, D. K. Kumar, S. P. Arjunan and S. Kumar, "Visual speech recognition using image moments and multiresolution wavelet images," in Computer Graphics, Imaging and Visualisation, 2006 International Conference on, 2006.
[10] P. Duchnowski, U. Meier and A. Waibel, "See me, hear me: Integrating automatic speech recognition and lip-reading," in Proc. Int. Conf. Spoken Lang. Process., 1994.
[11] E. Petajan, "Automatic lipreading to enhance speech recognition," in Proceedings of the Global Telecommunications Conference, Atlanta, GA.
[12] J. Shin, J. Lee and D. Kim, "Real-time lip reading system for isolated Korean word recognition," Pattern Recognit., vol. 44.
[13] L. Liang, X. Liu, Y. Zhao, X. Pi and A. V. Nefian, "Speaker independent audio-visual continuous speech recognition," in Multimedia and Expo, ICME'02, Proceedings IEEE International Conference on, 2002.
[14] D. Catuhe, Programming with the Kinect for Windows Software Development Kit: Add Gesture and Posture Recognition to Your Applications. O'Reilly Media, Inc.
[15] Q. Wang and X. Ren, "Facial feature locating using active appearance models with contour constraints from consumer depth cameras," Journal of Theoretical and Applied Information Technology, vol. 45, no. 2, 2012.
[16] J. Webb and J. Ashley, Beginning Kinect Programming with the Microsoft Kinect SDK. Apress.
[17] G. Loy, E. Holden and R. Owens, "3D head tracker for an automatic lipreading system," in Proc. Australian Conf. on Robotics and Automation (ACRA2000), 2000.
[18] T. Yoshinaga, S. Tamura, K. Iwano and S. Furui, "Audio-visual speech recognition using new lip features extracted from side-face images," in COST278 and ISCA Tutorial and Research Workshop (ITRW) on Robustness Issues in Conversational Interaction, 2004.
[19] K. Chen, W. Liau, H. Wang and L. Lee, "Fast speaker adaptation using eigenspace-based maximum likelihood linear regression," in Proc. ICSLP, 2000.
[20] Y. Cheung, X. Liu and X. You, "A local region based approach to lip tracking," Pattern Recognit.
[21] C. Rodriguez, F. Boto, I. Soraluze and A. Pérez, "An incremental and hierarchical k-NN classifier for handwritten characters," in Pattern Recognition, Proceedings, 16th International Conference on, 2002.
More informationCOMPARATIVE STUDY ON FEATURE EXTRACTION METHOD FOR BREAST CANCER CLASSIFICATION
COMPARATIVE STUDY ON FEATURE EXTRACTION METHOD FOR BREAST CANCER CLASSIFICATION 1 R.NITHYA, 2 B.SANTHI 1 Asstt Prof., School of Computing, SASTRA University, Thanjavur, Tamilnadu, India-613402 2 Prof.,
More informationQUANTIFICATION OF PROGRESSION OF RETINAL NERVE FIBER LAYER ATROPHY IN FUNDUS PHOTOGRAPH
QUANTIFICATION OF PROGRESSION OF RETINAL NERVE FIBER LAYER ATROPHY IN FUNDUS PHOTOGRAPH Hyoun-Joong Kong *, Jong-Mo Seo **, Seung-Yeop Lee *, Hum Chung **, Dong Myung Kim **, Jeong Min Hwang **, Kwang
More information1 Pattern Recognition 2 1
1 Pattern Recognition 2 1 3 Perceptrons by M.L. Minsky and S.A. Papert (1969) Books: 4 Pattern Recognition, fourth Edition (Hardcover) by Sergios Theodoridis, Konstantinos Koutroumbas Publisher: Academic
More informationResearch Article Sleep Monitoring System Using Kinect Sensor
Distributed Sensor Networks Volume 2015, Article ID 875371, 9 pages http://dx.doi.org/10.1155/2015/875371 Research Article Sleep Monitoring System Using Kinect Sensor Jaehoon Lee, 1 Min Hong, 2 andsungyongryu
More informationImage Understanding and Machine Vision, Optical Society of America, June [8] H. Ney. The Use of a One-Stage Dynamic Programming Algorithm for
Image Understanding and Machine Vision, Optical Society of America, June 1989. [8] H. Ney. The Use of a One-Stage Dynamic Programming Algorithm for Connected Word Recognition. IEEE International Conference
More informationOnline Speaker Adaptation of an Acoustic Model using Face Recognition
Online Speaker Adaptation of an Acoustic Model using Face Recognition Pavel Campr 1, Aleš Pražák 2, Josef V. Psutka 2, and Josef Psutka 2 1 Center for Machine Perception, Department of Cybernetics, Faculty
More informationFEATURE EXTRACTION USING GAZE OF PARTICIPANTS FOR CLASSIFYING GENDER OF PEDESTRIANS IN IMAGES
FEATURE EXTRACTION USING GAZE OF PARTICIPANTS FOR CLASSIFYING GENDER OF PEDESTRIANS IN IMAGES Riku Matsumoto, Hiroki Yoshimura, Masashi Nishiyama, and Yoshio Iwai Department of Information and Electronics,
More informationKeywords Artificial Neural Networks (ANN), Echocardiogram, BPNN, RBFNN, Classification, survival Analysis.
Design of Classifier Using Artificial Neural Network for Patients Survival Analysis J. D. Dhande 1, Dr. S.M. Gulhane 2 Assistant Professor, BDCE, Sevagram 1, Professor, J.D.I.E.T, Yavatmal 2 Abstract The
More informationImproved Intelligent Classification Technique Based On Support Vector Machines
Improved Intelligent Classification Technique Based On Support Vector Machines V.Vani Asst.Professor,Department of Computer Science,JJ College of Arts and Science,Pudukkottai. Abstract:An abnormal growth
More informationMEM BASED BRAIN IMAGE SEGMENTATION AND CLASSIFICATION USING SVM
MEM BASED BRAIN IMAGE SEGMENTATION AND CLASSIFICATION USING SVM T. Deepa 1, R. Muthalagu 1 and K. Chitra 2 1 Department of Electronics and Communication Engineering, Prathyusha Institute of Technology
More informationIDENTIFICATION OF REAL TIME HAND GESTURE USING SCALE INVARIANT FEATURE TRANSFORM
Research Article Impact Factor: 0.621 ISSN: 2319507X INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK IDENTIFICATION OF REAL TIME
More informationPosture Monitor. User Manual. Includes setup, guidelines and troubleshooting information for your Posture Monitor App
Posture Monitor User Manual Includes setup, guidelines and troubleshooting information for your Posture Monitor App All rights reserved. This manual shall not be copied, in whole or in part, without the
More informationSign Language to English (Slate8)
Sign Language to English (Slate8) App Development Nathan Kebe El Faculty Advisor: Dr. Mohamad Chouikha 2 nd EECS Day April 20, 2018 Electrical Engineering and Computer Science (EECS) Howard University
More informationLearning Utility for Behavior Acquisition and Intention Inference of Other Agent
Learning Utility for Behavior Acquisition and Intention Inference of Other Agent Yasutake Takahashi, Teruyasu Kawamata, and Minoru Asada* Dept. of Adaptive Machine Systems, Graduate School of Engineering,
More informationA Vision-based Affective Computing System. Jieyu Zhao Ningbo University, China
A Vision-based Affective Computing System Jieyu Zhao Ningbo University, China Outline Affective Computing A Dynamic 3D Morphable Model Facial Expression Recognition Probabilistic Graphical Models Some
More informationCharacterization of 3D Gestural Data on Sign Language by Extraction of Joint Kinematics
Human Journals Research Article October 2017 Vol.:7, Issue:4 All rights are reserved by Newman Lau Characterization of 3D Gestural Data on Sign Language by Extraction of Joint Kinematics Keywords: hand
More informationAutomatic Hemorrhage Classification System Based On Svm Classifier
Automatic Hemorrhage Classification System Based On Svm Classifier Abstract - Brain hemorrhage is a bleeding in or around the brain which are caused by head trauma, high blood pressure and intracranial
More informationPerformance of Gaussian Mixture Models as a Classifier for Pathological Voice
PAGE 65 Performance of Gaussian Mixture Models as a Classifier for Pathological Voice Jianglin Wang, Cheolwoo Jo SASPL, School of Mechatronics Changwon ational University Changwon, Gyeongnam 64-773, Republic
More informationFREQUENCY COMPRESSION AND FREQUENCY SHIFTING FOR THE HEARING IMPAIRED
FREQUENCY COMPRESSION AND FREQUENCY SHIFTING FOR THE HEARING IMPAIRED Francisco J. Fraga, Alan M. Marotta National Institute of Telecommunications, Santa Rita do Sapucaí - MG, Brazil Abstract A considerable
More informationABSTRACT I. INTRODUCTION
2018 IJSRSET Volume 4 Issue 2 Print ISSN: 2395-1990 Online ISSN : 2394-4099 National Conference on Advanced Research Trends in Information and Computing Technologies (NCARTICT-2018), Department of IT,
More informationGesture Recognition using Marathi/Hindi Alphabet
Gesture Recognition using Marathi/Hindi Alphabet Rahul Dobale ¹, Rakshit Fulzele², Shruti Girolla 3, Seoutaj Singh 4 Student, Computer Engineering, D.Y. Patil School of Engineering, Pune, India 1 Student,
More informationN RISCE 2K18 ISSN International Journal of Advance Research and Innovation
The Computer Assistance Hand Gesture Recognition system For Physically Impairment Peoples V.Veeramanikandan(manikandan.veera97@gmail.com) UG student,department of ECE,Gnanamani College of Technology. R.Anandharaj(anandhrak1@gmail.com)
More informationNAÏVE BAYESIAN CLASSIFIER FOR ACUTE LYMPHOCYTIC LEUKEMIA DETECTION
NAÏVE BAYESIAN CLASSIFIER FOR ACUTE LYMPHOCYTIC LEUKEMIA DETECTION Sriram Selvaraj 1 and Bommannaraja Kanakaraj 2 1 Department of Biomedical Engineering, P.S.N.A College of Engineering and Technology,
More informationANALYSIS OF FACIAL FEATURES OF DRIVERS UNDER COGNITIVE AND VISUAL DISTRACTIONS
ANALYSIS OF FACIAL FEATURES OF DRIVERS UNDER COGNITIVE AND VISUAL DISTRACTIONS Nanxiang Li and Carlos Busso Multimodal Signal Processing (MSP) Laboratory Department of Electrical Engineering, The University
More informationBrain Tumour Detection of MR Image Using Naïve Beyer classifier and Support Vector Machine
International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 3 ISSN : 2456-3307 Brain Tumour Detection of MR Image Using Naïve
More informationThe 29th Fuzzy System Symposium (Osaka, September 9-, 3) Color Feature Maps (BY, RG) Color Saliency Map Input Image (I) Linear Filtering and Gaussian
The 29th Fuzzy System Symposium (Osaka, September 9-, 3) A Fuzzy Inference Method Based on Saliency Map for Prediction Mao Wang, Yoichiro Maeda 2, Yasutake Takahashi Graduate School of Engineering, University
More informationDesign and Implementation study of Remote Home Rehabilitation Training Operating System based on Internet
IOP Conference Series: Materials Science and Engineering PAPER OPEN ACCESS Design and Implementation study of Remote Home Rehabilitation Training Operating System based on Internet To cite this article:
More informationNeuro-Inspired Statistical. Rensselaer Polytechnic Institute National Science Foundation
Neuro-Inspired Statistical Pi Prior Model lfor Robust Visual Inference Qiang Ji Rensselaer Polytechnic Institute National Science Foundation 1 Status of Computer Vision CV has been an active area for over
More informationAccuracy and validity of Kinetisense joint measures for cardinal movements, compared to current experimental and clinical gold standards.
Accuracy and validity of Kinetisense joint measures for cardinal movements, compared to current experimental and clinical gold standards. Prepared by Engineering and Human Performance Lab Department of
More informationKeywords Fuzzy Logic, Fuzzy Rule, Fuzzy Membership Function, Fuzzy Inference System, Edge Detection, Regression Analysis.
Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Modified Fuzzy
More information1 Introduction. Abstract: Accurate optic disc (OD) segmentation and fovea. Keywords: optic disc segmentation, fovea detection.
Current Directions in Biomedical Engineering 2017; 3(2): 533 537 Caterina Rust*, Stephanie Häger, Nadine Traulsen and Jan Modersitzki A robust algorithm for optic disc segmentation and fovea detection
More informationCS 2310 Software Engineering Final Project Report. Arm Motion Assessment for Remote Physical Therapy Using Kinect
I. Introduction & Motivation CS 2310 Software Engineering Final Project Report Arm Motion Assessment for Remote Physical Therapy Using Kinect Kinect based remote physical therapy systems are becoming increasingly
More informationBMMC (UG SDE) IV SEMESTER
UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION BMMC (UG SDE) IV SEMESTER GENERAL COURSE IV COMMON FOR Bsc ELECTRONICS, COMPUTER SCIENCE, INSTRUMENTATION & MULTIMEDIA BASICS OF AUDIO & VIDEO MEDIA QUESTION
More informationValence-arousal evaluation using physiological signals in an emotion recall paradigm. CHANEL, Guillaume, ANSARI ASL, Karim, PUN, Thierry.
Proceedings Chapter Valence-arousal evaluation using physiological signals in an emotion recall paradigm CHANEL, Guillaume, ANSARI ASL, Karim, PUN, Thierry Abstract The work presented in this paper aims
More informationDiagnosis of Liver Tumor Using 3D Segmentation Method for Selective Internal Radiation Therapy
Diagnosis of Liver Tumor Using 3D Segmentation Method for Selective Internal Radiation Therapy K. Geetha & S. Poonguzhali Department of Electronics and Communication Engineering, Campus of CEG Anna University,
More informationCSE 118/218 Final Presentation. Team 2 Dreams and Aspirations
CSE 118/218 Final Presentation Team 2 Dreams and Aspirations Smart Hearing Hearing Impairment A major public health issue that is the third most common physical condition after arthritis and heart disease
More informationIncorporation of Imaging-Based Functional Assessment Procedures into the DICOM Standard Draft version 0.1 7/27/2011
Incorporation of Imaging-Based Functional Assessment Procedures into the DICOM Standard Draft version 0.1 7/27/2011 I. Purpose Drawing from the profile development of the QIBA-fMRI Technical Committee,
More informationAudio-Visual Integration in Multimodal Communication
Audio-Visual Integration in Multimodal Communication TSUHAN CHEN, MEMBER, IEEE, AND RAM R. RAO Invited Paper In this paper, we review recent research that examines audiovisual integration in multimodal
More informationComparison of ANN and Fuzzy logic based Bradycardia and Tachycardia Arrhythmia detection using ECG signal
Comparison of ANN and Fuzzy logic based Bradycardia and Tachycardia Arrhythmia detection using ECG signal 1 Simranjeet Kaur, 2 Navneet Kaur Panag 1 Student, 2 Assistant Professor 1 Electrical Engineering
More informationDimensional Emotion Prediction from Spontaneous Head Gestures for Interaction with Sensitive Artificial Listeners
Dimensional Emotion Prediction from Spontaneous Head Gestures for Interaction with Sensitive Artificial Listeners Hatice Gunes and Maja Pantic Department of Computing, Imperial College London 180 Queen
More informationReal Time Sign Language Processing System
Real Time Sign Language Processing System Dibyabiva Seth (&), Anindita Ghosh, Ariruna Dasgupta, and Asoke Nath Department of Computer Science, St. Xavier s College (Autonomous), Kolkata, India meetdseth@gmail.com,
More informationSign Language Recognition using Webcams
Sign Language Recognition using Webcams Overview Average person s typing speed Composing: ~19 words per minute Transcribing: ~33 words per minute Sign speaker Full sign language: ~200 words per minute
More informationComputer based delineation and follow-up multisite abdominal tumors in longitudinal CT studies
Research plan submitted for approval as a PhD thesis Submitted by: Refael Vivanti Supervisor: Professor Leo Joskowicz School of Engineering and Computer Science, The Hebrew University of Jerusalem Computer
More informationRecognition of sign language gestures using neural networks
Recognition of sign language gestures using neural s Peter Vamplew Department of Computer Science, University of Tasmania GPO Box 252C, Hobart, Tasmania 7001, Australia vamplew@cs.utas.edu.au ABSTRACT
More informationEmotion Recognition using a Cauchy Naive Bayes Classifier
Emotion Recognition using a Cauchy Naive Bayes Classifier Abstract Recognizing human facial expression and emotion by computer is an interesting and challenging problem. In this paper we propose a method
More informationHierarchical Age Estimation from Unconstrained Facial Images
Hierarchical Age Estimation from Unconstrained Facial Images STIC-AmSud Jhony Kaesemodel Pontes Department of Electrical Engineering Federal University of Paraná - Supervisor: Alessandro L. Koerich (/PUCPR
More informationRetinal Blood Vessel Segmentation Using Fuzzy Logic
Retinal Blood Vessel Segmentation Using Fuzzy Logic Sahil Sharma Chandigarh University, Gharuan, India. Er. Vikas Wasson Chandigarh University, Gharuan, India. Abstract This paper presents a method to
More informationGesture Control in a Virtual Environment. Presenter: Zishuo Cheng (u ) Supervisors: Prof. Tom Gedeon and Mr. Martin Henschke
Gesture Control in a Virtual Environment Presenter: Zishuo Cheng (u4815763) Supervisors: Prof. Tom Gedeon and Mr. Martin Henschke 2 Outline Background Motivation Methodology Result & Discussion Conclusion
More informationSign Language Recognition System Using SIFT Based Approach
Sign Language Recognition System Using SIFT Based Approach Ashwin S. Pol, S. L. Nalbalwar & N. S. Jadhav Dept. of E&TC, Dr. BATU Lonere, MH, India E-mail : ashwin.pol9@gmail.com, nalbalwar_sanjayan@yahoo.com,
More informationLabview Based Hand Gesture Recognition for Deaf and Dumb People
International Journal of Engineering Science Invention (IJESI) ISSN (Online): 2319 6734, ISSN (Print): 2319 6726 Volume 7 Issue 4 Ver. V April 2018 PP 66-71 Labview Based Hand Gesture Recognition for Deaf
More informationGene Selection for Tumor Classification Using Microarray Gene Expression Data
Gene Selection for Tumor Classification Using Microarray Gene Expression Data K. Yendrapalli, R. Basnet, S. Mukkamala, A. H. Sung Department of Computer Science New Mexico Institute of Mining and Technology
More informationGender Discrimination Through Fingerprint- A Review
Gender Discrimination Through Fingerprint- A Review Navkamal kaur 1, Beant kaur 2 1 M.tech Student, Department of Electronics and Communication Engineering, Punjabi University, Patiala 2Assistant Professor,
More informationSign Language Recognition with the Kinect Sensor Based on Conditional Random Fields
Sensors 2015, 15, 135-147; doi:10.3390/s150100135 Article OPEN ACCESS sensors ISSN 1424-8220 www.mdpi.com/journal/sensors Sign Language Recognition with the Kinect Sensor Based on Conditional Random Fields
More information