IEEE International Workshop on Machine Learning for Signal Processing, September 8-,, Beijing, China

DETERMINING THE NUMBER OF SOURCES IN HIGH-DENSITY EEG RECORDINGS OF EVENT-RELATED POTENTIALS BY MODEL ORDER SELECTION

Fengyu Cong, Zhaoshui He,, Jarmo Hämäläinen, Andrzej Cichocki, Tapani Ristaniemi

Department of Mathematical Information Technology, University of Jyväskylä,, Finland; Lab for Advanced Brain Signal Processing, RIKEN Brain Science Institute, Japan; Faculty of Automation, Guangdong University of Technology, Guangzhou, 6; Department of Psychology, University of Jyväskylä,, Finland

ABSTRACT

For high-density electroencephalography (EEG) recordings, determining the number of sources in order to separate the signal subspace from the noise subspace is very important. A commonly used criterion is that the percentage of variance of the raw data explained by the selected principal components composing the signal subspace should be over 9%. Recently, a model order selection method named GAP has been proposed. We investigated the two methods by performing independent component analysis (ICA) on the estimated signal subspace, assuming that the number of selected principal components composing the signal subspace equals the number of sources of brain activity. In wavelet-filtered EEG recordings (8 electrodes) of ERPs, ICA with reference to GAP decomposed the selected principal components reliably into independent components, whereas the ICA decomposition with the variance explained method was not reliable, indicating that the number of sources, as well as the signal subspace, can be well estimated through GAP.

Index Terms: Event-related potential, independent/principal component analysis, model order selection, number of sources, reliability, wavelet filter

INTRODUCTION

Electroencephalography (EEG) recordings can be modelled as a linear transformation of latent variables at EEG frequencies [].
They represent the summation of scaled versions of electrical brain activities and artifacts, such as eye blinks and muscle activities, produced by participants during the experiment [-]. Thus, it is always desirable to remove the artifacts and extract the electrical brain activities of interest from the recordings at the scalp. EEG can reveal two types of electrical brain activity: spontaneous ongoing activity and event-related potentials (ERPs) []. To produce ERPs, EEG recordings of many single trials are usually collected and averaged over those trials [6]. However, the averaged EEG recordings of ERPs are still mixtures of electrical brain activities. To extract ERPs from EEG recordings, digital filters, wavelet filters, principal component analysis (PCA), independent component analysis (ICA), and other methods have been applied [7]. PCA and ICA are based on the linear transformation model of EEG recordings. In this model, the electrical activities in the brain are the sources, and the EEG recordings at the scalp are the mixtures [-]. For EEG collected by a high-density array, the number of sources is thought to be smaller than the number of electrodes under the assumption of the discrete source model [8]. In this case, dimension reduction is often carried out before ICA [9, ]. To achieve that, PCA is first applied to the data; the principal components corresponding to the k largest eigenvalues are then selected to separate the signal subspace from the noise subspace, assuming that there are k sources in the signal subspace [9-]. Subsequently, ICA can be performed on the signal subspace to estimate the desired components of brain activity [9, ]. Hence, the problem in this context is how many principal components should be chosen. Usually, the number of selected components is determined by prior knowledge, which mostly reflects people's experience.
For example, the variance explained by the selected principal components is usually required to be over 8% or 9% of the variance of the mixtures []. However, different people may have different experience, so the selection of principal components becomes uncertain when different percentages are used as the threshold. ICA has become an important tool in the study of ERPs [], and EEG recordings are collected with more and more electrodes to represent the electrical brain activities as completely as possible [, ]. Consequently, a principled way to determine the number of sources for separating the signal and the noise subspace through PCA in high-density EEG recordings becomes very important. However, this problem has not yet been well addressed for ERP studies. There are two obstacles to resolving it. One is that the true number of sources in the brain is unknown; the other is that it is very difficult to explicitly validate the effectiveness and rationality of the estimated number. Recently, a simple yet efficient model order selection method to separate the signal and the noise subspace, known as the GAP method [, 6], has been developed to estimate the
number of components in multivariate data analysis, such as PCA and ICA. It has been successfully applied to detect the number of clusters for probabilistic clustering [6] and to determine the rank for each mode of a Tucker tensor model []. In [], it was shown that the GAP method significantly outperformed the existing model order selection methods in terms of the percentage of correct selections. It may therefore become a promising tool for detecting the number of sources to separate the signal and the noise subspace in high-density EEG recordings. Hence, this study investigates GAP to estimate the number of sources in the ordinary averaged EEG recordings of ERPs collected by 8 electrodes, and in their wavelet-filtered counterparts. ICA under the determined model is then performed on the selected principal components, and the reliability of each extracted independent component is analyzed by ICASSO [7]. The idea is that if the number of sources is correctly estimated, the signal and the noise subspace are well separated by PCA, and thus the ICA decomposition can be reliable.

METHOD

Data description

The goal of the experiment was originally to identify the ERPs to pitch and rise time change in children with reading disabilities and typically reading children through a passive oddball paradigm []. In this study, only the responses to the pitch change of the healthy children ( girls and 6 boys with mean age of 9. years and a standard deviation (STD) of. year) were taken for analysis. The stimuli consisted of pairs of harmonic sinusoidal tones. In the middle of a tone pair, a silent gap of ms separated the two tones. The durations of the two tones in sequence were ms and ms. The fundamental frequencies of the first and second tones were Hz and Hz, respectively. Each tone had three additional harmonics of its fundamental frequency. The inter-stimulus interval (ISI) was 6ms between the pairs in this study.
In the oddball paradigm, the probability of one deviant was., and - standard sounds occurred between the presentations of deviant sounds. For further details on the participants, stimuli and procedure, refer to the study of Hämäläinen et al. []. During the experiment, the participants were instructed to attend to a silenced video or to a computer game. Data was collected with 8 electrodes using the vertex as the reference; a high-pass filter of .Hz and a low-pass filter of Hz were applied to the collected data, which was then down-sampled to a rate of Hz. After the artifact rejection (with the criteria of μV for peak-to-peak amplitude and μV for transients), at least 6 trials among the collected ones remained for each subject. The EEG data was filtered offline with a band-pass filter from .Hz to Hz to further remove noise. After being averaged over the retained single trials, the data was re-referenced to the average over all electrodes. Such recordings are called the averaged EEG recordings of ERPs in this study. The averaged trace lasted 9ms, and the first ms were the prestimulus recordings, which served as the baseline. The data was then ready for further processing.

Signal model and GAP method

In this section, we briefly introduce the previously developed GAP method [6]. Consider a multiple-input-multiple-output (MIMO) signal model: the signals $x_1(t), \ldots, x_m(t)$ sensed by an array of $m$ electrodes come from $n$ sources $s_1(t), \ldots, s_n(t)$ through a gain (or mixing) matrix $\mathbf{A}$, i.e.,

$$\mathbf{x}(t) = \mathbf{A}\,\mathbf{s}(t) + \mathbf{e}(t), \tag{1}$$

where $\mathbf{x}(t) = [x_1(t), \ldots, x_m(t)]^T$, $\mathbf{s}(t) = [s_1(t), \ldots, s_n(t)]^T$, and $\mathbf{e}(t) = [e_1(t), \ldots, e_m(t)]^T$ is the noise vector. In problem (1), both $\mathbf{A}$ and $\mathbf{s}(t)$, as well as the number of sources $n$, are unknown. The task in this study is to seek the number of sources $n$ from the observed data $\mathbf{x}(t)$.
To achieve this goal, we first make three basic assumptions: (i) $\mathbf{A}$ is a tall matrix ($m > n$) with full column rank; (ii) the noise signals $e_1(t), \ldots, e_m(t)$ are mutually independent and follow an identical Gaussian distribution $\mathcal{N}(0, \sigma^2)$; (iii) the noise is statistically independent of the dipole sources $s_1(t), \ldots, s_n(t)$ [6]. Then, we obtain

$$\mathbf{R}_x = E[\mathbf{x}(t)\mathbf{x}^T(t)] = \mathbf{A}\mathbf{R}_s\mathbf{A}^T + \sigma^2\mathbf{I}, \tag{2}$$

where $E$ denotes the mathematical expectation, $\mathbf{I}$ is the identity matrix, and $\mathbf{R}_s = E[\mathbf{s}(t)\mathbf{s}^T(t)]$. Since the rank of $\mathbf{A}\mathbf{R}_s\mathbf{A}^T$ is $n$, one can readily derive

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n > \lambda_{n+1} = \cdots = \lambda_m = \sigma^2, \tag{3}$$

where $\lambda_1, \ldots, \lambda_m$ are the eigenvalues of the matrix $\mathbf{R}_x$ in descending order.

Usually, when the first $k$ principal components explain over 8% or 9% of the variance of the observed data, those components are assumed to compose the signal subspace and are selected for further analysis [9-]. However, this method is not adaptive to the data, which is the main obstacle to obtaining optimal results with it. The recently developed GAP is adaptive to the data and can detect the number of clusters in n-way probabilistic clustering [6]. In this study, we attempt to use this method to determine the number of sources for separating the signal and the noise subspace in high-density EEG recordings. For completeness, we briefly introduce the GAP algorithm as follows. In order to identify the parameter $n$ by searching for the gap between $\lambda_n$ and $\lambda_{n+1}$ in (3), a gap measure [6] has been defined:

$$\mathrm{GAP}(p) = \frac{\mathrm{var}\big[\{\nabla\lambda_i\}_{i=p+1}^{m-1}\big]}{\mathrm{var}\big[\{\nabla\lambda_i\}_{i=p}^{m-1}\big]}, \tag{4}$$

where $p = 1, \ldots, m-2$, and

$$\mathrm{var}\big[\{\nabla\lambda_i\}_{i=p}^{m-1}\big] = \frac{1}{m-p}\sum_{i=p}^{m-1}\Big(\nabla\lambda_i - \frac{1}{m-p}\sum_{j=p}^{m-1}\nabla\lambda_j\Big)^2 \tag{5}$$

denotes the sample variance of the sequence $\{\nabla\lambda_i\}_{i=p}^{m-1}$, and $\nabla\lambda_i = \lambda_i - \lambda_{i+1}$, $i = 1, \ldots, m-1$. Then, we determine the number of sources by the criterion [, 6]:

$$\hat{n} = \arg\min_{p = 1, \ldots, m-3} \mathrm{GAP}(p). \tag{6}$$

Data processing

The ordinary averaged EEG recordings of ERPs were filtered by a wavelet filter, and the numbers of sources in the ordinary averaged EEG recordings and in the filtered ones were estimated by the PCA-based variance explained method (denoted as PCA hereinafter) and by GAP, respectively. Then, the corresponding principal components were selected to compose the signal subspace. Finally, ICA decomposition was performed on the signal subspace. Regarding PCA, when the percentage of the variance explained by the selected principal components was over 9%, the number of selected principal components was taken as the number of sources to separate the signal and the noise subspace.

Wavelet filter: The experiment here was designed to elicit a negative ERP named mismatch negativity (MMN) []. Thus, before the application of wavelet analysis, the spectral properties of MMN should be studied [8]. In most cases, MMN has its main energy below Hz in the frequency domain, with its considerably high frequency content below Hz, and such information is the basis for designing a reasonable wavelet filter [9]. In this study, the reverse biorthogonal wavelet of order 6.8 [] was used, because this wavelet has proved very effective in extracting a well-defined MMN from the ordinary averaged trace in a very large dataset including children [9]. The data at each channel of each subject was decomposed into nine levels, and the coefficients from the sixth to the eighth level were used to reconstruct the desired MMN. These levels were selected because the frequency response of the wavelet filter under this configuration should match the spectral properties of MMN [9].
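To illustrate the model order selection step of this section, the following minimal sketch applies the GAP criterion (4)-(6) and the variance explained rule to covariance eigenvalues. It is an illustration under stated assumptions, not the authors' implementation: the function names, the simulated data (20 sensors, 4 Laplacian sources, noise level 0.1), the treatment of a zero denominator as an infinite GAP value, and the threshold 0.9 are all hypothetical choices made for the example.

```python
import numpy as np


def gap_order(eigvals):
    """Estimate the number of sources from covariance eigenvalues via GAP.

    Returns (n_hat, gap), where gap[p - 1] holds GAP(p) for p = 1..m-2.
    """
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]  # descending order
    m = lam.size
    if m < 4:
        raise ValueError("need at least 4 eigenvalues")
    diff = lam[:-1] - lam[1:]  # diff[i - 1] = lambda_i - lambda_{i+1}
    gap = np.full(m - 2, np.inf)
    for p in range(1, m - 1):  # p = 1..m-2
        den = np.var(diff[p - 1:])  # sample variance over i = p..m-1
        num = np.var(diff[p:])      # sample variance over i = p+1..m-1
        if den > 0:                 # assumption: GAP(p) = +inf when den == 0
            gap[p - 1] = num / den
    # p = m-2 is excluded from the search: its numerator is the variance
    # of a single value, which is always zero.
    n_hat = int(np.argmin(gap[: m - 3])) + 1
    return n_hat, gap


def variance_explained_order(eigvals, threshold):
    """Smallest k whose leading eigenvalues explain at least `threshold`."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    ratio = np.cumsum(lam) / lam.sum()
    k = int(np.searchsorted(ratio, threshold)) + 1
    return min(k, lam.size)


def simulated_eigenvalues(m=20, n=4, n_samples=5000, sigma=0.1, seed=0):
    """Eigenvalues of the sample covariance of x = A s + e (hypothetical data)."""
    rng = np.random.default_rng(seed)
    s = rng.laplace(size=(n, n_samples))         # non-Gaussian sources
    a = rng.standard_normal((m, n))              # tall mixing matrix
    x = a @ s + sigma * rng.standard_normal((m, n_samples))
    return np.linalg.eigvalsh(np.cov(x))[::-1]   # descending order


if __name__ == "__main__":
    lam = simulated_eigenvalues()
    n_gap, _ = gap_order(lam)
    n_var = variance_explained_order(lam, threshold=0.9)  # illustrative threshold
    print("GAP estimate:", n_gap)
    print("variance explained estimate:", n_var)
```

The variance explained rule depends on a user-chosen threshold, whereas the GAP search only uses the differences between adjacent eigenvalues, which is the sense in which the text calls it adaptive to the data.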
Component extraction and validation: ICA has proved to be a very useful method to extract desired ERP components from EEG recordings []. Most ICA algorithms are adaptive, and in practice an adaptive algorithm quite frequently converges to a local optimum rather than the global one [7]. As a result, the components extracted by a single run of ICA may be uncertain and unsatisfactory. To overcome this problem, the software ICASSO was introduced in [7], and it has been used extensively since then. One paradigm of ICASSO is to randomly initialize the unmixing matrix many times and to run the same ICA algorithm with each initialized unmixing matrix to extract a number of components. The logic of this paradigm is that if a certain component is extracted in most of the runs, the estimate of this component is reliable and stable [7]. Hence, all the components extracted over the runs are grouped into a number of clusters, and each cluster represents one component. ICASSO calculates a stability index, denoted by Iq, for each such component. The Iq ranges from 0 to 1. When it approaches 1, the corresponding component is reliably extracted; otherwise, the component is likely to have been produced randomly. Hence, the Iq was chosen as the criterion to validate the reliability and stability of the ICA decomposition in this study. Within ICASSO, FastICA [] was chosen for the decomposition, and it was run times with different randomly initialized unmixing matrices. For the clustering in ICASSO, agglomerative hierarchical clustering with the average-linkage criterion was chosen [7].

RESULT

To demonstrate the effectiveness of the proposed data processing method, the data of one typical subject is presented first. Fig. describes the selected principal components extracted by PCA from the wavelet-filtered EEG recordings. Fig.
shows the independent components extracted by ICASSO from those principal components. The independent components are ranked according to their contribution to the mixture [, ], but their polarities remain ambiguous. In this example, GAP estimated fourteen sources, and the Iq averaged over the components was 0.86 with STD of.. This means the ICA decomposition under the determined model was reliable and stable, indicating that the number of sources was correctly estimated and the signal subspace was well extracted.

Fig. Fourteen principal components derived from filtered EEG recordings of ERPs (panel title: Recordings to ICA).

Fig. Fourteen independent components derived from principal components (panel title: Extracted Component; vertical axis: Amplitude).

Meanwhile, Fig. displays the logarithm of the eigenvalues obtained by PCA from the ordinary averaged EEG recordings and from the filtered ones in this typical example. GAP itself does not use the logarithm of the eigenvalues; the logarithm is taken here only for better display. In this example, the difference between the 14th and 15th eigenvalues was very pronounced in the filtered EEG recordings, and GAP predicted 14 sources there. In the averaged EEG recordings, however, the difference between the 6th and 7th eigenvalues was considerable, and GAP predicted 6 sources.

Fig. Eigenvalues of averaged EEG recordings of ERPs and the filtered for one typical subject, respectively (title: The logarithm of the eigenvalue estimated by PCA; vertical axis: Amplitude/dB; horizontal axis: Eigenvalue; curves: filtered data, raw data).

Furthermore, Figs. and demonstrate the topography of the desired ERP, namely MMN, estimated from the ordinary averaged EEG recordings and extracted by ICA. For the former, the recordings between 8ms and ms at every electrode were averaged to represent the peak amplitude of MMN, because MMN peaked around 9ms in this experiment []; for the latter, the first component in Fig. was projected back to the electrode field, and the peak amplitude was then computed in the same way. The topographies were obtained with EEGLAB []. In contrast to the topography estimated from the ordinary averaged EEG recordings, the dipole source activity denoted by the intense red and blue colors in the topography estimated by ICA was more evident, which is desired for dipole source analysis [, ]. These two figures suggest that the ICA estimate was reasonable and precise in revealing the brain activity.

Fig. Topography of the desired ERP-MMN estimated from the ordinarily averaged EEG recordings (panel title: Topography of ERP estimated from raw data).

Fig. Topography of the desired ERP-MMN extracted by ICA (panel title: Topography of ERP estimated by ICA).

Fig.6 depicts the number of sources estimated by the variance explained procedure and by GAP from the ordinary averaged EEG recordings of ERPs and from the filtered ones for each subject. With the variance explained procedure for model order selection, the mean number of sources in the ordinary averaged EEG was 8 (STD: ), and the mean number in the wavelet-filtered EEG was 6 (STD:.8). With GAP for model order selection, the mean number for the averaged EEG was about (STD: ), and the estimated number of sources for the filtered EEG was for 6 subjects, while another two subjects corresponded to. Fig.7 exhibits the mean of the Iqs of the independent components extracted by ICASSO for each subject. For the filtered EEG recordings with reference to GAP, the results of all subjects were very similar to the typical example above, indicating that the ICA decomposition was stable for the data of every subject. However, with reference to the variance explained procedure for model order selection, the ICA decomposition was not reliable for most subjects.

Fig.6 Number of sources estimated from averaged EEG recordings of ERPs and the filtered for each subject, respectively (vertical axis: Estimated number of sources; horizontal axis: Subject; curves: GAP filtered data, PCA filtered data, GAP raw data, PCA raw data).

Fig.7 Stability analysis of ICA decomposition on selected principal components derived from filtered EEG recordings of ERPs (title: Stability of ICA decomposition; vertical axis: Magnitude of Iq; horizontal axis: Subject; curves: GAP filtered data, PCA filtered data, PCA raw data).

DISCUSSION

This study aimed to estimate the number of sources for separating the signal and the noise subspace in EEG recordings of ERPs collected by a high-density array. The recently developed method named GAP [6] estimated that the number of sources in the ordinary averaged EEG recordings of auditory ERPs collected by 8 electrodes was about, and that the number of sources was reduced to about after the wavelet filter designed for the target ERP was applied. With the conventional variance explained approach, the numbers of sources were 8 and 6 in the ordinary averaged EEG recordings and the filtered counterparts, respectively. The results through GAP indicate that the number of sources was dramatically reduced by the wavelet filter; in contrast, through the variance explained approach, the two numbers were comparable. Our previous study has shown that the wavelet filter can strongly reduce the number of sources []. Hence, the results through GAP should be more reasonable. This study assumes the discrete source model [8]; thus, the EEG recordings collected by a high-density array are overdetermined in terms of the linear transformation model. In this case, to extract the desired ERP component by ICA, it is necessary to perform dimension reduction, or in other words, to separate the signal and the noise subspace [9, ]. PCA is the first step to achieve this goal.
After that, the issue is to determine how many principal components composing the signal subspace should be selected for the subsequent ICA decomposition. With the eigenvalues derived from PCA, GAP can determine the number of sources in EEG recordings by efficiently locating the hidden gap between the noise/outlier eigenvalues and the signal eigenvalues. Subsequently, the appropriate principal components composing the signal subspace can be selected, which at the same time performs the dimension reduction that converts the overdetermined model of the data into a determined one. Furthermore, this number is also regarded as the number of sources in the signal subspace. Consequently, the ICA decomposition under the determined model by ICASSO [7] can validate whether the number of sources estimated by GAP is correct, because ICASSO indicates whether the extraction of independent components is reliable and stable. In this study, with reference to GAP in the estimation of the number of sources, ICASSO demonstrated that the ICA decomposition on the filtered EEG recordings of ERPs was reliable, indicating the success of the application of GAP. Hence, for the study of ERPs collected by a high-density array, the data processing may follow three steps: 1) wavelet filter and PCA to preprocess the averaged EEG recordings of ERPs, 2) GAP to estimate the number of sources in the filtered EEG recordings to separate the signal and the noise subspace, and 3) ICASSO to decompose the selected principal components in the signal subspace into the same number of independent components. This procedure can also be used to analyze other ERPs. Moreover, this study analyzed the EEG recordings averaged over single trials, because ERPs are usually produced by such averaging [6]. ICA can be performed either on the recordings of the concatenated single trials or on the recordings averaged over single trials [].
Both paradigms assume that the linear transformation model is invariant across single trials. From this point of view, the two paradigms are identical. The difference is that the former supplies many more samples to ICA, and the latter provides better shaped waveforms to ICA. ICA places a requirement on the number of samples of the signal: it should be at least several times the square of the number of estimated sources []. In this study, the number of sources in the filtered EEG recordings was about as estimated by GAP, and the number of samples of the averaged trace was. Hence, these parameters just met the requirement of ICA mentioned above. Moreover, the wavelet filter in this study is designed according to the spectral properties of the target ERP, and it removes most of the other sources of no interest in the averaged EEG recordings []. This is the reason why the wavelet filter can reduce the number of sources.
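The validation logic used in this study, namely repeating ICA from different random initializations and checking whether the same components reappear, can be illustrated with the following minimal sketch. It is a simplified stand-in and not ICASSO itself: ICASSO clusters the estimates with average-linkage hierarchical clustering and reports the Iq, whereas this sketch only measures how well the components of the first run are matched in the other runs. The small symmetric FastICA with a tanh nonlinearity, the function names, and the three-source demo mixture are assumptions made for the example.

```python
import numpy as np


def whiten(x):
    """Center and whiten the rows of x (channels x samples)."""
    xc = x - x.mean(axis=1, keepdims=True)
    d, e = np.linalg.eigh(np.cov(xc))
    return (e / np.sqrt(d)).T @ xc  # D^(-1/2) E^T xc


def _sym_decorrelate(w):
    """Return (w w^T)^(-1/2) w so that the rows of w become orthonormal."""
    d, e = np.linalg.eigh(w @ w.T)
    return (e / np.sqrt(d)) @ e.T @ w


def fastica_symmetric(z, rng, n_iter=200, tol=1e-6):
    """Minimal symmetric FastICA (tanh nonlinearity) on whitened data z."""
    k, n_samples = z.shape
    w = _sym_decorrelate(rng.standard_normal((k, k)))  # random initialization
    for _ in range(n_iter):
        g = np.tanh(w @ z)
        # Fixed-point update: E[z g(w^T z)] - E[g'(w^T z)] w, row by row.
        w_new = g @ z.T / n_samples - (1.0 - g ** 2).mean(axis=1)[:, None] * w
        w_new = _sym_decorrelate(w_new)
        converged = np.max(np.abs(np.abs(np.sum(w_new * w, axis=1)) - 1.0)) < tol
        w = w_new
        if converged:
            break
    return w @ z


def run_to_run_similarity(z, n_runs=10, seed=0):
    """Mean best absolute correlation of each first-run component with later runs."""
    rng = np.random.default_rng(seed)
    runs = [fastica_symmetric(z, rng) for _ in range(n_runs)]
    ref, k = runs[0], runs[0].shape[0]
    score = np.zeros(k)
    for y in runs[1:]:
        c = np.abs(np.corrcoef(ref, y)[:k, k:])  # cross-correlation block
        score += c.max(axis=1)
    return score / (n_runs - 1)


def demo_mixture(n_samples=5000, seed=1):
    """Hypothetical determined mixture of three non-Gaussian sources."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples)
    s = np.vstack([
        rng.laplace(size=n_samples),
        rng.uniform(-1.0, 1.0, size=n_samples),
        np.sign(np.sin(2 * np.pi * t / 97.0)),
    ])
    return rng.standard_normal((3, 3)) @ s


if __name__ == "__main__":
    z = whiten(demo_mixture())
    print("run-to-run similarity per component:", run_to_run_similarity(z))
```

Values close to 1 mean that a component is recovered regardless of the initialization, which mirrors the reasoning behind using a stability index to judge whether the assumed number of sources yields a reliable decomposition.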
Furthermore, determining the number of sources for separating the signal and the noise subspace in EEG recordings remains challenging, and it is worth comparing GAP in the future with other methods such as the Laplace approximation to the model order (LAP), the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the minimum description length (MDL) [].

ACKNOWLEDGEMENT

Cong F. thanks the international mobility grants (Spring- ) of University of Jyväskylä for sponsoring the research. He Z. is supported in part by the Fundamental Research Funds for the Central Universities, SCUT (Grant 9ZZ9) and the Guangdong International cooperative research project (grant 9B7).

6. REFERENCES

[] P. Nunez and R. Srinivasan, Electric Fields of the Brain: The Neurophysics of EEG. New York: Oxford University Press,.
[] F. Cong, I. Kalyakin and T. Ristaniemi, "Can backprojection fully resolve polarity indeterminacy of ICA in study of ERP?" Biomed. Signal Process. Control, vol. DOI:.6/j.bspc...6,.
[] S. Makeig, T. P. Jung, A. J. Bell, D. Ghahremani and T. J. Sejnowski, "Blind separation of auditory event-related brain responses into independent components," Proc. Natl. Acad. Sci. U. S. A., vol. 9, pp. 979-98, Sep, 997.
[] B. Blankertz, S. Lemm, M. Treder, S. Haufe and K. R. Muller, "Single-trial analysis and classification of ERP components - A tutorial," Neuroimage DOI:.6/j.neuroimage..6.8,.
[] S. J. Luck, An Introduction to the Event-Related Potential Technique. The MIT Press,.
[6] T. W. Picton, S. Bentin, P. Berg, E. Donchin, S. A. Hillyard, R. Johnson Jr, G. A. Miller, W. Ritter, D. S. Ruchkin, M. D. Rugg and M. J. Taylor, "Guidelines for using human event-related potentials to study cognition: recording standards and publication criteria," Psychophysiology, vol. 7, pp. 7-, Mar,.
[7] S. Sanei and J. A. Chambers, EEG Signal Processing. Wiley, 7.
[8] M. Scherg and D. Von Cramon, "Two bilateral sources of the late AEP as identified by a spatio-temporal dipole model," Electroencephalogr. Clin. Neurophysiol., vol. 6, pp. -, Jan, 98.
[9] N. Kovacevic and A. R. McIntosh, "Groupwise independent component decomposition of EEG data and partial least square analysis," Neuroimage, vol., pp. -, Apr, 7.
[] V. A. Vakorin, N. Kovacevic and A. R. McIntosh, "Exploring transient transfer entropy based on a group-wise ICA decomposition of EEG data," Neuroimage, vol. 9, pp. 9-6, Jan,.
[] I. Jolliffe, Principal Component Analysis. New York: Springer-Verlag,.
[] R. Vigario and E. Oja, "BSS and ICA in Neuroinformatics: From Current Practices to Open Challenges," IEEE Reviews in Biomedical Engineering, vol., pp. -6, 8.
[] J. A. Hamalainen, P. H. Leppanen, T. K. Guttorm and H. Lyytinen, "Event-related potentials to pitch and rise time change in children with reading disabilities and typically reading children," Clin. Neurophysiol., vol. 9, pp. -, Jan, 8.
[] J. A. Hämäläinen, S. Ortiz-Mantilla and A. A. Benasich, "Source localization of event-related potentials to pitch change mapped onto age-appropriate MRIs at 6 months of age," NeuroImage DOI:.6/j.neuroimage...6,.
[] Z. S. He, A. Cichocki and S. Xie, "Efficient method for Tucker model selection," Electron. Lett., vol., pp. 8-86, 9.
[6] Z. He, A. Cichocki, S. Xie and K. Choi, "Detecting the number of clusters in n-way probabilistic clustering," IEEE Trans. Pattern Anal. Mach. Intell., vol., pp. 6-, Nov,.
[7] J. Himberg, A. Hyvarinen and F. Esposito, "Validating the independent components of neuroimaging time series via clustering and visualization," Neuroimage, vol., pp. -, Jul,.
[8] E. Basar, M. Schurmann, T. Demiralp, C. Basar-Eroglu and A. Ademoglu, "Event-related oscillations are 'real brain responses'--wavelet analysis and new strategies," Int. J. Psychophysiol., vol. 9, pp. 9-7, Jan,.
[9] F. Cong, Y. Huang, I. Kalyakin, H. Li, T. Huttunen-Scott, H. Lyytinen and T. Ristaniemi, "Frequency Response based Wavelet Decomposition to Extract Mismatch Negativity of Children in Uninterrupted Sound," Journal of Medical and Biological Engineering, (Under the nd round review)
[] I. Daubechies, Ten Lectures on Wavelets. Society for Industrial and Applied Mathematics, 99.
[] A. Hyvarinen, "Fast and robust fixed-point algorithms for independent component analysis," IEEE Trans. Neural Netw., vol., pp. 66-6, 999.
[] F. Cong, P. H. T. Leppänen, P. Astikainen, J. Hämäläinen, J. K. Hietanen and T. Ristaniemi, "Dimension Reduction: Additional Benefit of an Optimal Filter for Independent Component Analysis to Extract Event-related Potentials," Journal of Neuroscience Methods,, DOI:.6/j.jneumeth..7.
[] A. Delorme and S. Makeig, "EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis," J. Neurosci. Methods, vol., pp. 9-, Mar,.
[] A. Anderson, I. D. Dinov, J. E. Sherin, J. Quintana, A. L. Yuille and M. S. Cohen, "Classification of spatially unaligned fMRI scans," NeuroImage, vol. 9, no. pp. 9-9,.