Assessment of nonnegative matrix factorization algorithms for electroencephalography spectral analysis

Background: Nonnegative matrix factorization (NMF) has been successfully used for electroencephalography (EEG) spectral analysis. Since NMF was proposed in the 1990s, many adaptive algorithms have been developed; however, their performance for EEG data analysis has not been fully compared. Here, we provide a comparison of four NMF algorithms in terms of accuracy of estimation, stability (repeatability of the results) and time complexity on simulated data. In practical applications of NMF algorithms, stability plays an important role and was an emphasis of the comparison. A hierarchical clustering algorithm was implemented to evaluate the stability of the NMF algorithms.
Results: In a simulation-based comprehensive analysis of fit, stability, accuracy of estimation and time complexity, the hierarchical alternating least squares low-rank NMF algorithm (lraNMF_HALS) outperformed the other three NMF algorithms. In the application of lraNMF_HALS to real resting-state EEG data analysis, stable and interpretable features were extracted.
Conclusion: Based on the results of the assessment, our recommendation is to use lraNMF_HALS, as it provides the most accurate and robust estimation.

physical meaning, and they are consistent with an intuitive perception of the brain [38]. Compared with ICA, NMF extracted some interesting components with a higher distinguishing ratio [50].
The importance of algorithm stability is self-evident for scalp EEG data analysis. Stability is a common problem of adaptive algorithms: like other blind source separation (BSS) methods such as ICA, NMF faces the same issue. Theoretically, just as for the stability analysis of ICA [51, 52], the uniqueness of NMF can be verified by geometric interpretation [2, 49]. In practical applications, however, algorithms may converge to local optima, which has been analyzed theoretically with Lyapunov's first and second methods [53]. Thus, the stability of algorithms is the most important aspect in the assessment of NMF algorithms.
In this study, the performance of four NMF algorithms was compared in terms of accuracy of estimation, stability and time complexity on simulated data. The algorithm that performed best on the simulated data was then applied to real resting-state EEG data analysis, and, based on the stability analysis method, stable and interpretable features were extracted.

Simulation and results
Given a signal H ∈ R_+^{10×1000} as the component matrix shown in Fig. 1, we generated a mixing matrix W ∈ R_+^{10×64}. We then constructed V* = W^T H ∈ R_+^{64×1000} and V = V* + E, where E denotes independent noise and V denotes the 64-channel simulated time-series EEG data. The signal-to-noise ratio (SNR) was used to quantitatively describe the quality of the data. In this simulation, the performance of four NMF algorithms (NMF_MU, HALS, lraNMF_MU, lraNMF_HALS) was compared. The number of extracted components r was chosen as 10. Each algorithm was run 50 times, so 500 components in total were clustered. All four algorithms were run in Matlab under Windows 10 (Intel Xeon CPU 3.5 GHz, 32 GB RAM). (Hu et al. BioMed Eng OnLine (2020) 19:61)

Simulated data with SNR = 20 dB

Simulated data with an SNR of 20 dB was used to demonstrate the clustering-related results. The 10 estimated sources and the clustering results of all 500 components estimated by the HALS and NMF_MU algorithms are shown in Fig. 2. Ideally, the number of components in each cluster equals the number of times NMF was run (50 here), provided the number of clusters equals the number of extracted components (10 here). Clearly, HALS outperformed NMF_MU in terms of the stability of the extracted components: in Fig. 2, the denser a cluster, the more stable the components extracted by an NMF algorithm. The purpose of an NMF algorithm is to accurately estimate the underlying components in the data, so it is also necessary to check the accuracy of the estimated sources. Here, readers can gain an intuitive feeling for the accuracy of the estimation; a quantitative comparison in terms of estimation accuracy is given in the next section. The centroid of each cluster was selected as the component extracted by NMF, as in ICASSO [54]. Figure 3 shows the waveforms of the 10 components extracted by the two NMF algorithms.
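The construction of the simulated data described above can be sketched in a few lines of NumPy. This is a minimal illustration: the uniform-random sources and mixing weights stand in for the component waveforms of Fig. 1, which are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_eeg(n_sources=10, n_channels=64, n_samples=1000, snr_db=20.0):
    """Build V = W^T H + E as described above: nonnegative sources H,
    a nonnegative mixing matrix W, and independent nonnegative noise E
    scaled so that 10*log10(P_signal / P_noise) equals snr_db.
    (A sketch; the paper's exact source waveforms differ.)"""
    H = rng.random((n_sources, n_samples))   # component (source) matrix
    W = rng.random((n_sources, n_channels))  # mixing matrix
    V_clean = W.T @ H                        # noise-free 64-channel data
    E = rng.random(V_clean.shape)            # independent noise
    p_signal = np.mean(V_clean ** 2)
    p_noise = np.mean(E ** 2)
    E *= np.sqrt(p_signal / (p_noise * 10.0 ** (snr_db / 10.0)))
    return V_clean + E, W, H

V, W, H = simulate_eeg()  # V is the 64 x 1000 simulated EEG matrix
```

Because both the clean mixture and the noise are nonnegative, the resulting V is a valid NMF input.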
By visual inspection, HALS outperformed NMF_MU in estimating the sources.

Fig. 2 Clustering 500 extracted components from 50 runs of NMF with 10 components in each run: a inner similarity of each cluster for HALS; b inner similarity of each cluster for NMF_MU; c stability index (I_q) and the number of components in each cluster for HALS; d stability index (I_q) and the number of components in each cluster for NMF_MU

Simulated data with multiple SNRs

To test the performance of the different NMF algorithms under different SNRs, Fig. 4 illustrates the results with SNR ranging from − 10 dB to 20 dB in steps of 5 dB. Figure 4a shows that the fits of the four NMF algorithms were very similar to one another. Figure 4b reveals the estimation accuracy of the four algorithms, measured as the average correlation coefficient (Pearson correlation in this study) between the ground truth and the estimated components. The four algorithms yielded different correlation coefficients, meaning their estimation accuracy differs, and the picture changes with SNR. When the SNR is lower than − 5 dB, estimation accuracy is dominated by signal quality, and the accuracy of all four algorithms is very low. As the SNR increases, the accuracy of the estimated components also increases. When the SNR is larger than 0 dB, the four algorithms begin to differ in estimation accuracy: the components estimated by lraNMF_HALS are the most accurate, followed by HALS, then lraNMF_MU, with NMF_MU the worst. Figure 4c presents, for each SNR, the mean over the 10 I_q values of the 10 components extracted by each algorithm; I_q indicates the stability of the components over multiple runs of NMF. The stability of the extracted components differed considerably across the four NMF algorithms. Figure 4d illustrates the computation time of the four algorithms on the same data.
The difference in computational time is mainly determined by the number of iterations and the time complexity of the algorithms, which are introduced in the "Methods" section. The computation-time results show that lraNMF_HALS needs the least time to converge. Evidently, a larger correlation coefficient indicates better estimation of the sources in Fig. 4b, and a higher I_q implies better stability of the extracted components in Fig. 4c. Based on a comprehensive analysis of fit, stability, estimation accuracy and time complexity, lraNMF_HALS outperformed the other three NMF algorithms and was therefore used in the real EEG data analysis.

Real EEG results
In this section, the application of the NMF algorithm to EEG data analysis is introduced. The stability-validating method was used not only for evaluating algorithm stability but also for determining the number of extracted components. The flowchart of the data processing is illustrated in Fig. 5. After the EEG data were collected, a standard EEG preprocessing procedure, including filtering and artifact removal, was applied. The EEG data were then transformed into the frequency domain and fed into the NMF algorithm. The performance of several NMF algorithms was evaluated in terms of algorithm stability and time complexity. Next, the model order of the NMF algorithm was selected in terms of the reproducibility of the estimated components. Based on these rigorous steps, repeatable and interpretable components were finally obtained. Each step is described in detail in the following content.

Data description
To evaluate the performance of the four NMF algorithms, we tested them on an EEG dataset recorded at The First Affiliated Hospital of Dalian Medical University. Sixteen primary insomnia patients and 17 normal sleepers were included in the study; gender, age and education were matched between the two groups. All patients met the DSM-IV criteria for primary insomnia. Patients and normal sleepers were required to complete the Pittsburgh Sleep Quality Index (PSQI), Insomnia Severity Index (ISI), Hyperarousal Scale (HAS), Hamilton Anxiety Scale (HAMA) and Hamilton Depression Scale (HRSD-17) for clinical assessment. Continuous EEG recordings were obtained digitally with a 62-channel Neuroscan SynAmps2 system; EEG was recorded in a resting awake condition with the eyes closed for 20 min.

Data preprocessing
The EEG data were recorded at a sampling rate of 1000 Hz and subsequently resampled to 500 Hz. DC drifts were removed with a 0.5 Hz high-pass filter, and a 50 Hz notch filter was applied to minimize line-noise artifacts. PCA was used to identify linear components associated with eye blinks and saccades, and epochs with strong eye-movement or other movement artifacts were removed by manual inspection. For all further analyses, the EEG signals were transformed into the frequency domain with the discrete Fourier transform (DFT), and the 4-40 Hz frequency band of interest was retained. Thus, the size of the preprocessed data is 73 (frequency bins) by 62 (channels) by 33 (subjects; two groups with 16 primary insomnia patients and 17 healthy controls). We then rearranged the three-dimensional array from the EEG data of all subjects into a matrix (channels × (frequency bins × subjects), i.e., 62 × 2409, since 73 × 33 = 2409) and used it as input to the space-by-frequency decomposition. After this processing, all elements of the matrix fed into NMF are nonnegative.
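The rearrangement of the preprocessed spectra into the NMF input matrix can be written as follows. This is a sketch with made-up values; real entries would be DFT magnitudes and hence nonnegative.

```python
import numpy as np

# Hypothetical preprocessed spectra: 73 frequency bins x 62 channels x 33
# subjects, as described above (random placeholders stand in for real data).
rng = np.random.default_rng(1)
spectra = rng.random((73, 62, 33))

# Rearrange to channels x (frequency bins * subjects) = 62 x 2409, the
# input of the space-by-frequency decomposition.
V = spectra.transpose(1, 0, 2).reshape(62, 73 * 33)
```

The `transpose` puts channels first; the `reshape` then flattens frequency bins and subjects into one sample axis in row-major order.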

Determination of the model order
Here, lraNMF_HALS was run 50 times with random initialization, with the model order (number of extracted components) ranging from 2 to 30. The coefficient matrix was used to evaluate the stability of the algorithm. Figure 6a shows the stability (denoted by I_q) of the lraNMF_HALS decomposition for each model order; each I_q in Fig. 6a is the I_q averaged over the components. When the number of extracted components equals 9, the algorithm is the most stable. As mentioned earlier, besides the stability index of the NMF decomposition, it is also necessary to check Fig. 6b, c to ensure that the clustering result is good enough.
The algorithms took different times to finish the 50 runs: NMF_MU took 1157.007 s, HALS 1173.178 s, lraNMF_MU 106.010 s and lraNMF_HALS 72.315 s. Obviously, lraNMF_HALS took the shortest time to complete the decomposition. The stability indexes of the four algorithms also differ: 0.760 for NMF_MU, 0.717 for HALS, 0.622 for lraNMF_MU and 0.833 for lraNMF_HALS. Consistent with the simulation results, lraNMF_HALS is the most stable algorithm for these data, so only the results of lraNMF_HALS are presented in the following subsection.

Feature selection
After the matrix is decomposed by lraNMF_HALS with 9 components, the features need to be further filtered. Usually, the criteria for feature selection are determined by the purpose of the subsequent feature analysis. The collected EEG data consist of brain activities of interest, brain activities of no interest, and interference as well as noise. Therefore, even though artifacts are removed in the preprocessing phase, NMF may extract features of no interest alongside the features of interest, and a feature selection step is necessary to choose the features of brain activities of interest for further analysis [55]. The spectral features and the corresponding spatial distributions are illustrated in Fig. 7. In Fig. 7a, the power spectrum values were log-transformed. Group differences between insomnia patients and normal sleepers were examined in the absolute 4-40 Hz spectrum for each component of interest; the insomnia group is marked by a red line and the normal-sleeper group by a black line.
Furthermore, the nine features were statistically tested with two-sample t tests. Features #1 and #8 revealed a significant main effect of group in the theta band (P < 0.05); together with the corresponding brain maps in Fig. 7b, this indicates a group effect in the theta band over the right occipital region. Feature #4 also revealed a significant main effect in the low gamma band (30-40 Hz, P < 0.05); on the basis of its brain map in Fig. 7b, there was a significant difference in the low gamma band at the frontal lobe. Feature #5 also revealed a significant main effect in the beta band (P < 0.05); in accordance with its brain map in Fig. 7b, there was a significant difference in the beta band at the parietal lobe. The significant differences are shown in Fig. 8. For the other five features, no main effect of group was observed.

Discussion
In this study, we provided a comparison of four NMF algorithms in terms of accuracy of estimation, stability (repeatability of the results) and time complexity on simulated data. In practical applications of NMF algorithms, stability plays an important role and was an emphasis of the comparison; a hierarchical clustering algorithm was implemented to evaluate it. In the simulation, based on a comprehensive analysis of fit, stability, accuracy of estimation and time complexity, the hierarchical alternating least squares low-rank NMF algorithm (lraNMF_HALS) outperformed the other three NMF algorithms. In the application of lraNMF_HALS to real resting-state EEG data analysis, stable and interpretable features were extracted. Based on our assessment, lraNMF_HALS is the most stable algorithm. To guarantee the reliability of the estimated components, the model order was selected in terms of algorithm stability; with these strategies, stable components were reconstructed. The results of lraNMF_HALS for scalp resting-state EEG data analysis are consistent with previous findings and also provide further support for pathological mechanisms of insomnia. Corsi-Cabrera et al. found that patients with insomnia exhibited significantly higher gamma power over frontoparietal regions [28], consistent with the result of Comp#4. Szelenberger et al. pointed out that the theta band in insomniacs is lower than in normal subjects, whereas the beta band is higher [26], in accordance with the results of Comp#1 and Comp#5. These consistent results show that the proposed method is effective for spectrum analysis of EEG. In addition, we also found some results that have not been reported before, which provide support for pathological mechanisms of insomnia.
Comp#4 and Comp#5, with their higher power, support the view that wakefulness is higher in insomniacs than in normal subjects, which may contribute to their insomnia. Based on the evaluation of algorithm stability, the repeatability of the results is guaranteed. This finding provides theoretical support for the clinical treatment of insomnia. When an adaptive algorithm is applied in a real-world application, more attention needs to be paid to its stability: since the initialization may differ between runs, the algorithm may converge to a local optimum, making the results differ across runs, which matters greatly for scientific research. The stability of algorithms can be evaluated by clustering analysis of the components extracted from multiple runs of the same algorithm. The study of algorithm stability can not only quantitatively describe the reproducibility of the components, but also provide an effective criterion for comparing different algorithms and selecting the model order.

Conclusion
In this study, we proposed a method to compare different NMF algorithms so as to extract stable components. Specifically, we provided a comparison of four NMF algorithms in terms of accuracy of estimation, stability and time complexity on simulated data. The performance of the NMF algorithms was also evaluated on scalp resting-state EEG data from the aspects of stability and time complexity. In the application of lraNMF_HALS to real resting-state EEG data analysis, stable and interpretable features were extracted. In both the simulation and the real-world application, lraNMF_HALS outperformed the other three NMF algorithms in terms of algorithm stability and computational time. The simulation comparison is illustrated in Fig. 4, from which we also find that lraNMF_HALS estimates the underlying components most accurately. In the real-world application, lraNMF_HALS (72.315 s) is faster than NMF_MU (1157.007 s), HALS (1173.178 s) and lraNMF_MU (106.010 s), and in terms of algorithm stability, lraNMF_HALS (0.833) is more stable than NMF_MU (0.760), HALS (0.717) and lraNMF_MU (0.622). Based on the results of the assessment, our recommendation is to use lraNMF_HALS, which provides the most accurate and robust estimation as well as an intuitive interpretation. In terms of the reproducibility of the final estimated components, this work provides a useful pipeline for applying NMF to spectrum analysis of EEG data, comprising selection of the algorithm, selection of the model order and stability analysis of the estimated components. Based on this pipeline, components related to the pathology of insomnia were extracted, providing new ideas for the clinical diagnosis and treatment of insomnia. For further study, the method is worth applying to other EEG spectrum data to identify more meaningful features.
When an investigator prepares to use NMF for subsequent analysis, NMF performance needs to be considered in the experimental design to obtain stable and reliable results. This research shows that the choice of algorithm has a great impact on the decomposition results. In future work, it would be worthwhile to develop an NMF algorithm specifically tailored to EEG spectrum analysis.

NMF algorithms
For a given nonnegative data matrix V ∈ R_+^{m×n} and a number of extracted components r < min(m, n), NMF attempts to find nonnegative matrices W ∈ R_+^{m×r} and H ∈ R_+^{r×n} which minimize the cost function

  min_{W≥0, H≥0} ||V − WH||_F^2,    (1)

where H and W are the coefficient matrix and the component matrix, respectively, and their product WH is a rank-r approximation of V. In practical applications, the selection of r is critical and greatly influences the results; in this study, the model order r was determined by checking the stability of the model for different values.

To optimize Eq. (1), Lee and Seung suggested a very popular multiplicative update rule with computational complexity O(mnr) per iteration [6]:

  H ← H ⊛ (W^T V) ⊘ (W^T W H),    W ← W ⊛ (V H^T) ⊘ (W H H^T),    (2)

where ⊛ and ⊘ denote element-wise multiplication and division. In 1998, Rasmus Bro proposed another, column-wise set of update formulas [56]; this was later extended into the HALS algorithm, in which each column update can be computed with complexity O(mn) [2]. The columns of W and the corresponding rows of H are updated sequentially:

  w_j ← [V_j h_j / (h_j^T h_j)]_+,    h_j ← [V_j^T w_j / (w_j^T w_j)]_+,    (3)

where V_j = V − Σ_{k≠j} w_k h_k^T, w_j is the jth column of W, h_j^T is the jth row of H, and [·]_+ sets negative entries to zero. Only one column of W and one row of H are updated in each inner iteration of HALS. Since W and H each contain r columns (rows), NMF_MU and HALS essentially have equivalent space and time complexity; however, in practical applications, HALS is usually faster than NMF_MU [57].
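The MU and HALS update rules above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' MATLAB implementation; `eps` guards against division by zero and the function names are our own.

```python
import numpy as np

def nmf_mu(V, r, n_iter=200, eps=1e-12, seed=0):
    """Multiplicative-update NMF (Lee & Seung): alternately rescale H and W
    so that ||V - W H||_F^2 decreases."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W, H = rng.random((m, r)), rng.random((r, n))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def nmf_hals(V, r, n_iter=200, eps=1e-12, seed=0):
    """HALS NMF: update one row of H (and one column of W) at a time,
    each by a nonnegatively clipped least-squares step."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W, H = rng.random((m, r)), rng.random((r, n))
    for _ in range(n_iter):
        WtV, WtW = W.T @ V, W.T @ W
        for j in range(r):
            H[j] = np.maximum(H[j] + (WtV[j] - WtW[j] @ H) / (WtW[j, j] + eps), 0.0)
        VHt, HHt = V @ H.T, H @ H.T
        for j in range(r):
            W[:, j] = np.maximum(W[:, j] + (VHt[:, j] - W @ HHt[:, j]) / (HHt[j, j] + eps), 0.0)
    return W, H
```

Note that the HALS inner loop never forms the residual V_j explicitly; the equivalent correction term (WtV[j] − WtW[j] @ H) is used instead, which is the standard efficient formulation.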
Although HALS is faster than NMF_MU, both algorithms still have considerable room for improvement in computing speed. The common bottleneck is the size of the matrix V: in the process of updating W and H, the large original data matrix V is used many times, which causes not only slow convergence but also heavy consumption of computer memory. The remedy is to replace the large matrix with a low-rank approximation

  V ≈ Â B̂,  Â ∈ R^{m×l}, B̂ ∈ R^{l×n},  l = pr ≪ m,    (7)

where p is a small positive constant, so that the cost function becomes min_{W≥0, H≥0} ||Â B̂ − WH||_F^2 and every product involving V in the update rules is replaced by products with the much smaller factors Â and B̂. The prototypical low-rank NMF algorithms were originated by Guoxu Zhou and Andrzej Cichocki [57]. Applying this substitution to the multiplicative update rule yields lraNMF_MU, the low-rank approximation-based multiplicative update; the initialization of Â and B̂ comes from a truncated singular value decomposition (tSVD). Applying the same substitution column-wise, with the residual formed from the submatrices W_1 ∈ R^{m×(r−1)} and H_1 obtained by removing the jth column of W and the jth row of H, yields the low-rank approximation-based HALS, named lraNMF_HALS, which can be computed with complexity O(nr) [57].
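The low-rank speed-up can be sketched as follows. This is illustrative only: the function names are ours, the truncated SVD plays the role of the tSVD initialization described above, and negative entries that the low-rank approximation may introduce are clipped, as in [57], to keep the factors nonnegative.

```python
import numpy as np

def lra(V, l):
    """Rank-l approximation V ~= A @ B via truncated SVD (the tSVD step)."""
    U, s, Vt = np.linalg.svd(V, full_matrices=False)
    return U[:, :l] * s[:l], Vt[:l]          # A (m x l), B (l x n)

def lra_nmf_mu(V, r, l=None, n_iter=200, eps=1e-12, seed=0):
    """Sketch of lraNMF_MU: the MU rule with V replaced by A @ B, so no
    update ever multiplies by the full m x n matrix."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    l = l if l is not None else min(2 * r, m, n)   # l = p*r with p = 2 here
    A, B = lra(V, l)
    W, H = rng.random((m, r)), rng.random((r, n))
    for _ in range(n_iter):
        # (W.T @ A) @ B costs O((m + n) l r) instead of O(m n r).
        H = np.maximum(H * np.maximum((W.T @ A) @ B, 0.0) / (W.T @ W @ H + eps), eps)
        W = np.maximum(W * np.maximum(A @ (B @ H.T), 0.0) / (W @ H @ H.T + eps), eps)
    return W, H
```

The same parenthesization trick, applied inside the HALS column updates, gives the lraNMF_HALS variant recommended in this paper.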
We also provide a summary comparison of the different NMF algorithms studied in various published methods, listed in Table 1, which may serve as a reference for method selection according to the data types available to the analyst.

Fitness of the algorithm
The fit is defined as follows:

  fit(V, V̂) = 1 − ||V − V̂||_F / ||V||_F,

where V̂ = WH is the reconstructed version of V obtained from the estimated factors; fit(V, V̂) equals 1 for a perfect reconstruction [2, 57]. In previous studies, fit was used as the performance index of the decomposition [2]. However, we find that even when the fits of two NMF decompositions are similar, the extracted components can differ; hence fit alone is not enough to evaluate the stability of the decomposition.
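The fit index can be computed in one line (a sketch; `fit_index` is our own helper name):

```python
import numpy as np

def fit_index(V, V_hat):
    """fit(V, V_hat) = 1 - ||V - V_hat||_F / ||V||_F: equals 1 for a
    perfect reconstruction and decreases as the residual grows."""
    return 1.0 - np.linalg.norm(V - V_hat) / np.linalg.norm(V)
```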

Hierarchical clustering
Hierarchical clustering is one of the most popular clustering methods. In contrast to partitional clustering, which directly decomposes the features into a set of disjoint clusters, hierarchical clustering transforms a proximity matrix into a nested partition that can be graphically represented by a tree called a dendrogram. Cutting the tree at different heights yields partitions at different precisions, so the precision is "tuned" by the cut [58]. This approach was applied to validate the stability of ICA components and named ICASSO [59]. Hierarchical clustering algorithms come in two types: agglomerative clustering (bottom-up) and divisive clustering (top-down). In this study, agglomerative clustering was used, and the dendrogram is formed from the bottom up. In this method, at the first iteration the number of clusters equals the number of objects N; at the second iteration, the two most similar clusters merge into a new cluster, so the number of clusters becomes N − 1; at the third iteration, it becomes N − 2.
[Table 1 (excerpt): lraNMF_MU: dimension reduction at the start of the algorithm; fixes the problem of the size of the matrix to be decomposed; faster than NMF_MU both in theory and in practical application. lraNMF_HALS: dimension reduction at the start of the algorithm; faster than HALS both in theory and in practical application.]

As the clustering proceeds, the number of clusters eventually becomes 1. In this way, the dendrogram relating the different objects is established. In the stability algorithm, the clustering result is the dendrogram cut at the level equal to the number of features extracted by NMF.
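The bottom-up merging and the cut at a chosen number of clusters can be illustrated with a SciPy toy example (the points below are invented for demonstration):

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

# Six toy "components" in two tight groups.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])

# Bottom-up merging: 6 clusters -> 5 -> ... -> 1, encoded in the dendrogram.
Z = linkage(pdist(pts), method="average")

# Cut the dendrogram at 2 clusters; in the stability analysis, the cut is
# made at the number of extracted components r.
labels = fcluster(Z, 2, criterion="maxclust")
```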
A conservative cluster quality index I_q, defined in a previous study [59], reveals the compactness and isolation of a cluster. It is computed as the difference between the average intra-cluster similarity and the average extra-cluster similarity:

  I_q(C_m) = (1 / |C_m|^2) Σ_{i,j∈C_m} σ_ij − (1 / (|C_m| |C_−m|)) Σ_{i∈C_m, j∈C_−m} σ_ij,

where C_m denotes the features of the mth cluster, σ_ij is the similarity between the ith and jth features, and C_−m = C − C_m denotes the features that do not belong to this cluster. Ideally, the same component from different runs falls into the same cluster and different components fall into different clusters; the number of clusters then equals the number of extracted components, and the number of features in each cluster equals the number of runs of the algorithm. The first term, (1 / |C_m|^2) Σ_{i,j∈C_m} σ_ij, is the average similarity within the mth cluster (average intra-cluster similarity) and describes the similarity of the components in the same cluster. The second term is the average similarity between the features that belong to the mth cluster and those that do not (average extra-cluster similarity) and describes the dissimilarity of components in different clusters. I_q ranges from 0 to 1; the closer the value is to 1, the more stable the component.
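Given a similarity matrix and cluster labels, the index can be computed directly (a minimal sketch; the helper name `iq` and the dictionary output are our own illustration):

```python
import numpy as np

def iq(similarity, labels):
    """Stability index I_q per cluster: average intra-cluster similarity
    minus average similarity to all features outside the cluster."""
    labels = np.asarray(labels)
    out = {}
    for c in np.unique(labels):
        inside = labels == c
        intra = similarity[np.ix_(inside, inside)].mean()
        extra = similarity[np.ix_(inside, ~inside)].mean()
        out[c] = intra - extra
    return out
```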

Evaluation of stability of NMF algorithms
Given a nonnegative data matrix V ∈ R_+^{m×n}, we used an NMF algorithm to decompose the data. To evaluate the stability and reliability of the components decomposed by NMF, three steps are used, as introduced in our previous study [60]: Step 1: Run the NMF algorithm K times, each time with random initialization.
Step 2: Cluster all extracted components according to their mutual similarities, using agglomerative clustering.
Step 3: Select the centroid of each cluster as the component extracted by NMF. The cluster quality index I_q serves as the stability index for each component.
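The three steps above can be sketched end-to-end with SciPy's agglomerative clustering. This is an illustration, not the authors' code: `nmf_run(V, r, seed)` is assumed to be any NMF routine returning (W, H), similarity is measured by absolute Pearson correlation, and the centroid is taken as the member most similar, on average, to its own cluster.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def stability_analysis(V, nmf_run, r, K=10):
    """Step 1: run NMF K times with random initialization.
    Step 2: agglomeratively cluster all K*r components by mutual similarity.
    Step 3: return cluster labels and one centroid component per cluster."""
    comps = np.vstack([nmf_run(V, r, seed)[1] for seed in range(K)])  # (K*r, n)
    sim = np.abs(np.corrcoef(comps))                     # mutual similarities
    dist = np.clip(1.0 - sim[np.triu_indices_from(sim, 1)], 0.0, None)
    labels = fcluster(linkage(dist, method="average"), r, criterion="maxclust")
    centroids = []
    for c in range(1, r + 1):
        members = np.where(labels == c)[0]
        if members.size == 0:
            continue
        # Centroid: the member most similar, on average, to its own cluster.
        best = members[np.argmax(sim[np.ix_(members, members)].mean(axis=1))]
        centroids.append(comps[best])
    return labels, np.array(centroids)
```

The condensed distance vector passed to `linkage` is 1 minus the upper triangle of the similarity matrix, matching SciPy's expected pdist ordering.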