[Journal article][Full-length article]


Robust multi-view non-negative matrix factorization with adaptive graph and diversity constraints

Authors:
Chenglu Li;Hangjun Che;Man-Fai Leung;Cheng Liu;Zheng Yan;

Year of publication: 2023

Pages: 587-607
Publisher: Elsevier BV


Abstract:

Multi-view clustering (MVC) has received extensive attention due to its efficient processing of high-dimensional data. Most existing multi-view clustering methods are based on non-negative matrix factorization (NMF), which achieves dimensionality reduction and an interpretable representation. However, existing research has the following issues: (1) Existing NMF-based methods that use the Frobenius norm are sensitive to noise and outliers. (2) Many methods use only the information shared across views and ignore the diverse information between views. (3) The data graph constructed by the conventional K Nearest Neighbors (KNN) method may misclassify neighbors and degrade clustering performance. To address these problems, we propose a novel robust multi-view clustering method. Specifically, the $\ell_{2,1}$-norm is introduced to measure the factorization error and improve the robustness of NMF. Additionally, a diversity constraint is used to learn the diverse relationships in multi-view data, and an adaptive graph method based on information entropy is designed to avoid misclassifying neighbors. Finally, an iterative updating algorithm is developed to solve the optimization model; it makes the objective function monotonically non-increasing. The effectiveness of the proposed method is substantiated by comparison with eleven state-of-the-art methods on five real-world and four synthetic multi-view datasets for clustering tasks.

Introduction

Clustering is a fundamental technique in machine learning [1] and data mining [2]. Real-world datasets exhibit high dimensionality and multiple views [3], and many clustering methods have been proposed to process such complex data efficiently. NMF [4] factorizes the original data matrix into a basis matrix and a coefficient matrix. The coefficient matrix is a low-rank representation of the original data matrix.
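As an illustration of the factorization just described, the following is a minimal NumPy sketch of Frobenius-norm NMF using the classical multiplicative update rules. It is not the algorithm proposed in this paper: the function names `nmf_multiplicative` and `l21_norm`, the iteration count, the smoothing constant `eps`, and the synthetic data are all assumptions made for the example. The helper `l21_norm` shows how an $\ell_{2,1}$ error measure, as mentioned in the abstract, sums the Euclidean norms of the per-sample residual columns.

```python
import numpy as np


def nmf_multiplicative(X, k, n_iter=200, eps=1e-10, seed=0):
    """Frobenius-norm NMF: X (m x n) ~= U (m x k) @ V (k x n), with U, V >= 0."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k)) + eps   # basis matrix, strictly positive start
    V = rng.random((k, n)) + eps   # coefficient matrix, strictly positive start
    errors = []
    for _ in range(n_iter):
        # Update the coefficient matrix with the basis matrix held fixed.
        V *= (U.T @ X) / (U.T @ U @ V + eps)
        # Update the basis matrix with the coefficient matrix held fixed.
        U *= (X @ V.T) / (U @ V @ V.T + eps)
        errors.append(np.linalg.norm(X - U @ V, "fro") ** 2)
    return U, V, errors


def l21_norm(E):
    """Sum of the Euclidean norms of the columns of E (one column per sample)."""
    return np.linalg.norm(E, axis=0).sum()


# Hypothetical synthetic data: a non-negative matrix of rank at most 3.
rng = np.random.default_rng(42)
X = rng.random((20, 3)) @ rng.random((3, 50))
U, V, errors = nmf_multiplicative(X, k=3)
print("squared Frobenius error after the last iteration:", errors[-1])
print("l2,1 error after the last iteration:", l21_norm(X - U @ V))
```

Because each sample's residual enters `l21_norm` without being squared, a single badly corrupted column inflates this measure less than it inflates the squared Frobenius error.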
NMF is particularly successful on high-dimensional data as an effective framework for learning low-rank representations [5], [6]. Data are generally drawn from low-dimensional manifolds embedded in high-dimensional spaces [7], but NMF fails to preserve the geometric structure of the data. To solve this problem, a graph regularized NMF (GNMF) algorithm based on the manifold hypothesis and NMF is proposed in [7]. For NMF, a sparser solution indicates a better parts-based representation [7], so several methods based on sparse non-negative matrix factorization (SNMF) are proposed in [8], [9], [10]. In [11], an optimization algorithm for SNMF based on discrete-time projection neural networks is proposed.

The above methods are typically used to analyze single-view data. In multi-view data, different views contain complementary and consistent features, which provide more comprehensive information than single-view data [12], [13], [14]. Therefore, multi-view data cannot simply be processed by applying single-view methods. To address this issue, numerous multi-view clustering methods have been developed. How to effectively learn the similarity information and a consensus representation of multi-view data is a significant problem in multi-view clustering [15], [16], [17], [18]. In [15], MultiNMF learns a consensus representation by pushing the coefficient matrices of all views toward a consensus matrix and mining a consistent clustering structure. In [16], uniform distribution multi-view NMF obtains a consistent representation by introducing a consensus regularization. In [17], inter-view similarity is captured by minimizing the discrepancies between the coefficient matrices of different views. In [18], a consensus regularization is used to process the similarity information between views, and a pairwise regularization is used to extract complementary information.
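The graph regularization idea behind GNMF [7] can be illustrated with a small NumPy sketch. Assuming a coefficient matrix V of size k × n that stores one column per sample and a symmetric affinity matrix W, the regularizer is commonly written as $\mathrm{Tr}(V L V^{\top})$ with the graph Laplacian $L = D - W$, which equals a weighted sum of squared distances between the representations of neighboring samples. The sketch builds a conventional KNN graph. The 0/1 weights, the symmetrization step, and the names `knn_graph` and `graph_regularizer` are assumptions for illustration, and the entropy-based adaptive graph proposed in this paper is not reproduced here.

```python
import numpy as np


def knn_graph(X, k):
    """Symmetric 0/1 k-nearest-neighbour affinity matrix for the columns of X (m x n)."""
    n = X.shape[1]
    sq = (X ** 2).sum(axis=0)
    # Squared Euclidean distances between all pairs of columns.
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    np.fill_diagonal(d2, np.inf)             # a sample is not its own neighbour
    nearest = np.argsort(d2, axis=1)[:, :k]  # indices of the k closest samples
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(W, W.T)                # keep an edge if either side chose it


def graph_regularizer(V, W):
    """Tr(V L V^T) with the graph Laplacian L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    return np.trace(V @ L @ V.T)


# Hypothetical random data and representation, for illustration only.
rng = np.random.default_rng(0)
X = rng.random((5, 30))   # 30 samples with 5 features each
V = rng.random((3, 30))   # a 3-dimensional representation of the same samples
W = knn_graph(X, k=4)
reg = graph_regularizer(V, W)

# The same quantity written as a weighted sum of pairwise squared distances.
diff = V[:, :, None] - V[:, None, :]                  # shape (3, 30, 30)
pairwise = 0.5 * (W * (diff ** 2).sum(axis=0)).sum()
print(reg, pairwise)
```

A small value of the regularizer means that samples connected in the graph receive similar low-dimensional representations, which is why the quality of the neighbor assignment in W matters.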
NMF is known to produce better clustering results when combined with manifold learning, which preserves the geometric structure of the data distribution [7]. Thus, many multi-view NMF methods with graph regularization have been proposed [19], [20], [21]. Specifically, an extension of MultiNMF is proposed in [19], which constrains the coefficient matrix of each view with a graph regularization. In [20], sparse feature and data graphs of all views are constructed according to the self-expressiveness principle, which exploits the graph structure more effectively than previous multi-view methods. In [21], a multi-manifold multi-view NMF method is proposed, which uses a manifold regularization to constrain the coefficient matrix and the consensus matrix.

View weight learning is essential in multi-view clustering because view differences are widespread in multi-view data [22]. In [23], different weights are imposed on each view, which uses view information reasonably and reduces the influence of noisy views. In [24], an implicit view weight learning method is proposed. Furthermore, different clusters also have different influence on obtaining the correct clustering results. In [25], a clusterwise weight learning method is proposed, which assigns a weight to each basis vector, i.e., each cluster center.

Recently, deep learning techniques have been applied to matrix factorization. Deep matrix factorization (DMF), developed in [26], can fully explore the complicated hierarchical features in data [27]. Moreover, several multi-view DMF methods have been proposed in [28], [29], [30], [31], [32]. Specifically, in [28], in order to obtain mutual information, the coefficient matrix obtained in the last layer of the different views is unique, and a graph regularization is introduced to ensure that the data after dimensionality reduction have the same structural characteristics as the original data.
In [29], the top-layer representation matrices of all views are forced toward a consistent representation matrix, and a data graph is constructed to protect the geometric structure of the data. In [30], an auto-weighted multi-view DMF method is proposed, which automatically assigns weights to different views to obtain diverse information and uses the $\ell_{2,1}$-norm to increase robustness. In [31], DMF and structural constraints are exploited to obtain structural information in sequential data. In [32], partition-level information is used in DMF to obtain the final consistent partition matrix for data analysis.

Although NMF-based clustering methods have made great progress, the following problems remain: (1) the Frobenius norm is sensitive to noise and outliers; (2) the diverse information in multi-view data cannot be fully utilized; (3) constructing the data graph with the KNN method may fail to assign neighbors to each sample correctly. To solve these issues, this paper proposes a multi-view method based on $\ell_{2,1}$-norm NMF with adaptive graph and diversity constraints; its flow chart is shown in Fig. 1. The contributions are as follows:

• We validate the performance of multi-view $\ell_{2,1}$-norm NMF on datasets containing Poisson noise, which is rarely considered in the existing literature. Experimental data show that multi-view $\ell_{2,1}$-norm NMF can effectively process Poisson noise.
• Taking advantage of the diverse characteristics of multi-view data, we use a diversity constraint to obtain a more comprehensive data representation.
• Information entropy is introduced to adaptively construct data graphs for multi-view clustering. It overcomes the neighbor misclassification that occurs with the conventional KNN method.
• For the proposed optimization model, we design a corresponding alternating iterative optimization algorithm. The objective function is monotonically non-increasing after each iteration of the optimization algorithm. We compare and analyze the computational complexity of the proposed algorithm and the comparison algorithms. Extensive experiments demonstrate the effectiveness of the proposed method.

The remainder of this paper is organized as follows: Sections 2 and 3 present related works and preliminaries. Section 4 provides a detailed description of the proposed multi-view clustering method. Sections 5 and 6 describe and analyze the proposed optimization algorithm. Section 7 presents the corresponding experimental results. Conclusions are provided in the last section.

Section snippets

Related works

Robust data clustering

Data are generally corrupted by noise during the sampling process. NMF using the Frobenius norm is sensitive to noise and outliers. To enhance robustness, NMF using the $\ell_{2,1}$-norm ($\ell_{2,1}$-norm NMF) is proposed in [33]; it is more robust than NMF using the Frobenius norm and can reduce the impact of noise. In [34], a robust multi-view feature selection approach is proposed to jointly perform clustering and sparse learning in an efficient and robust manner. In [23], the

Preliminaries

Define the data matrix as $X = \{x_1, x_2, \ldots, x_n\} \in \mathbb{R}^{m \times n}$, where $x_i \in \mathbb{R}^{m}$ ($i = 1, 2, \ldots, n$). NMF can be expressed as the following optimization problem:

$$\min_{U, V} \|X - UV\|_F^2 = \min_{U, V} \sum_{i=1}^{n} \|x_i - U v_i\|_2^2 \quad \text{s.t.} \quad U \ge 0, \; V \ge 0,$$

where $\|\cdot\|_F$ denotes the Frobenius norm of a matrix, $\|X\|_F = \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{m} x_{ji}^2}$, $U \in \mathbb{R}^{m \times k}$ is the basis matrix and $V = \{v_1, v_2, \ldots, v_n\} \in \mathbb{R}^{k \times n}$ is the coefficient matrix. To enhance the robustness of NMF, the $\ell_{2,1}$-norm is used in place of the Frobenius norm, and the $\ell_{2,1}$-norm NMF is formulated as follows:

$$\min_{U, V} \|X - UV\|_{2,1}$$

The proposed method

Suppose that $X = \{X^1, X^2, \ldots, X^P\}$ is a multi-view dataset containing $P$ views.
The data matrix corresponding to the $p$th view is $X^p = \{x_1^p, x_2^p, \ldots, x_n^p\} \in \mathbb{R}^{m_p \times n}$, where $x_i^p \in \mathbb{R}^{m_p}$ ($i = 1, 2, \ldots, n$; $p = 1, 2, \ldots, P$). Then, robust multi-view NMF using the $\ell_{2,1}$-norm is expressed as

$$\min_{U^p, V^p} \sum_{p=1}^{P} \|X^p - U^p V^p\|_{2,1} \quad \text{s.t.} \quad U^p \ge 0, \; V^p \ge 0, \; p \in \{1, 2, \ldots, P\}.$$

The basis matrix and the coefficient matrix of the $p$th view are $U^p \in \mathbb{R}^{m_p \times k}$ and $V^p \in \mathbb{R}^{k \times n}$, respectively. Multi-view data is richer in content than single-view data, that is, it has diverse

Optimization

In this section, an alternating updating technique is employed to solve the optimization Problem (9). Specifically, we update each variable while keeping the other variables fixed. The detailed optimization algorithm is outlined in Algorithm 1.

Convergence analysis

The mathematical proof of convergence of the proposed algorithm is given in this section. We only give the convergence analysis for $V^p$; it can be extended to the proof for $U^p$.

Definition 1. $G(h, h')$ is an auxiliary function of $F(h)$ if the following conditions are satisfied:

$$F(h) \le G(h, h'), \qquad F(h) = G(h, h).$$

Lemma 1. If $G$ is an auxiliary function of $F(h)$, then $F(h)$ is non-increasing under the following updating rule, where $h' = h^{t}$ is the current iterate:

$$h^{t+1} = \arg\min_{h} G(h, h').$$

Proof. $F(h^{t+1}) \le G(h^{t+1}, h^{t}) \le G(h^{t}, h^{t}) = F(h^{t})$. Eq. (35) shows that $F(h)$ is non-increasing under the

Experiments

In this section, we perform extensive experiments to demonstrate the effectiveness and robustness of the proposed method.

Conclusion

In this paper, a novel multi-view clustering method is proposed, which enhances the robustness of NMF by using the $\ell_{2,1}$-norm and captures the diverse information between views by using a diversity constraint.
Furthermore, the proposed method constructs a data graph based on information entropy, which makes the data after dimensionality reduction retain the geometric structure of the original data. An efficient iterative updating algorithm is developed to solve the proposed optimization model,

CRediT authorship contribution statement

Chenglu Li: Coding, Writing – original draft preparation. Hangjun Che: Methodology, Supervision. Manfai Leung: Investigation, Revision. Cheng Liu: Formal analysis, Investigation. Zheng Yan: Review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 62003281) and the Natural Science Foundation of Chongqing, China (Grant No. cstc2021jcyj-msxmX1169).

References (50)

Hao Wang et al. A study of graph-based system for multi-view clustering. Knowl.-Based Syst. (2019)
Maria Brbić et al. Multi-view low-rank sparse subspace clustering. Pattern Recognit. (2018)
Wen-Sheng Chen et al. A survey of deep nonnegative matrix factorization. Neurocomputing (2022)
Yanyong Huang et al. Adaptive graph-based generalized regression model for unsupervised feature selection. Knowl.-Based Syst. (2021)
Naiyao Liang et al. Multi-view clustering by non-negative matrix factorization with co-orthogonal constraints. Knowl.-Based Syst. (2020)
Shudong Huang et al. Auto-weighted multi-view clustering via deep matrix decomposition. Pattern Recognit. (2020)
Jianqiang Li et al. Deep graph regularized non-negative matrix factorization for multi-view clustering. Neurocomputing (2020)
Qianli Zhao et al. Multi-view clustering via clusterwise weights learning. Knowl.-Based Syst. (2020)
Shudong Huang et al. Robust multi-view data clustering with multi-view capped-norm k-means. Neurocomputing (2018)
Linlin Zong et al. Multi-view clustering via multi-manifold regularized non-negative matrix factorization. Neural Netw. (2017)
Peng Luo et al. Dual regularized multi-view non-negative matrix factorization for clustering. Neurocomputing (2018)
Lin Feng et al. Re-weighted multi-view clustering via triplex regularized non-negative matrix factorization. Neurocomputing (2021)
Baicheng Pan et al. Nonconvex low-rank tensor approximation with graph and consistent regularizations for multi-view subspace learning. Neural Netw. (2023)
Zhe Xue et al. Deep low-rank subspace ensemble for multi-view clustering. Inf. Sci. (2019)
Chong Peng et al. Log-based sparse nonnegative matrix factorization for data representation. Knowl.-Based Syst. (2022)
Hangjun Che et al. A nonnegative matrix factorization algorithm based on a discrete-time projection neural network. Neural Netw. (2018)
Hao Cai et al. Semi-supervised multi-view clustering based on orthonormality-constrained nonnegative matrix factorization. Inf. Sci. (2020)
Guoqing Chao et al. A survey on multiview clustering. IEEE Trans. Artif. Intell. (2021)
Cheng Liu et al. Self-guided partial graph propagation for incomplete multiview clustering. IEEE Trans. Neural Netw. Learn. Syst. (2023)
Daniel D. Lee et al. Learning the parts of objects by non-negative matrix factorization. Nature (1999)
Khanh Luong et al. A novel approach to learning consensus and complementary information for multi-view data clustering.
Deng Cai et al. Graph regularized nonnegative matrix factorization for data representation. IEEE Trans. Pattern Anal. Mach. Intell. (2010)
Maoguo Gong et al. Multiobjective sparse non-negative matrix factorization. IEEE Trans. Cybern. (2018)
Keyi Chen et al. Graph non-negative matrix factorization with alternative smoothed $\ell_0$ regularizations. Neural Comput. Appl. (2022)
Hangjun Che et al. Bicriteria sparse nonnegative matrix factorization via two-timescale duplex neurodynamic optimization. IEEE Trans. Neural Netw. Learn. Syst. (2021)

© 2023 Elsevier Inc. All rights reserved.



Keywords:

None available


Journal
Information Sciences
ISSN: 0020-0255
Source: Elsevier BV