To facilitate high-dimensional KNN queries, an optimal index, namely Bit-Code based iDistance (BC-iDistance), is proposed based on the techniques of approximate vector representation and one-dimensional transformation. To overcome the heavy information loss that iDistance incurs in its one-dimensional transformation, BC-iDistance compresses a d-dimensional vector into a two-dimensional vector, and uses a bit code and a one-dimensional distance to reflect, respectively, the location and the similarity of a data point relative to its reference point. By employing the classical B+-tree, this representation realizes a two-level pruning process and allows a single index structure to be used, which further speeds up processing. Experimental evaluations on synthetic and real data demonstrate that BC-iDistance outperforms iDistance and sequential scan for KNN search in high-dimensional spaces.
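The two-component representation described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's definition: the key formula `i * c + distance` follows the usual iDistance convention, while the bit-code rule (one bit per dimension, set when the coordinate is at least the reference coordinate), the constant `c`, and all function names are assumptions made for the example.

```python
import math

def nearest_reference(point, refs):
    """Return (index, distance) of the reference point closest to `point`."""
    dists = [math.dist(point, r) for r in refs]
    i = min(range(len(refs)), key=dists.__getitem__)
    return i, dists[i]

def bit_code(point, ref):
    """Hypothetical bit code: bit j is 1 when point[j] >= ref[j]."""
    code = 0
    for x, r in zip(point, ref):
        code = (code << 1) | (1 if x >= r else 0)
    return code

def encode(point, refs, c=10.0):
    """Map a d-dimensional point to a (1D key, bit code) pair.

    The key uses the iDistance-style formula i * c + dist(point, refs[i]);
    c is assumed to be larger than any partition radius.
    """
    i, dist = nearest_reference(point, refs)
    return i * c + dist, bit_code(point, refs[i])

# Two assumed reference points in a 3-dimensional space.
refs = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
key, code = encode((0.9, 1.0, 0.5), refs)
```

The 1D key is what a B+-tree would store; the bit code gives a second, cheap filter for candidates whose keys fall inside the search range.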
The paper proposes a novel symmetrical encoding-based index structure, called the EDD-tree (encoding-based dual distance tree), to support fast k-nearest neighbor (k-NN) search in high-dimensional spaces. In the EDD-tree, all data points are first grouped into clusters by a k-means clustering algorithm. Then the uniform ID number of each data point is obtained by a dual-distance-driven encoding scheme, in which each cluster sphere is partitioned twice according to the dual distances of start-distance and centroid-distance. Finally, the uniform ID number and the centroid-distance of each data point are combined into a uniform index key, which is then indexed through a partition-based B+-tree. Thus, given a query point, its k-NN search in high-dimensional spaces can be transformed into a search in a one-dimensional space with the aid of the EDD-tree index. Extensive performance studies are conducted to evaluate the effectiveness and efficiency of the proposed scheme, and the results demonstrate that this method outperforms state-of-the-art high-dimensional search techniques such as the X-tree, VA-file, iDistance and NB-tree, especially when the query radius is not very large.
The boom of Internet and multimedia technology has led to an explosion of multimedia information, especially images, which has created an urgent need to quickly retrieve similar images of interest from huge image collections. The content-based high-dimensional indexing mechanism holds the key to achieving this goal by efficiently organizing the content of images and storing them in computer memory. In the past decades, many important developments in high-dimensional image indexing technologies have occurred to cope with the 'curse of dimensionality'. The high-dimensional indexing mechanisms can mainly be divided into three categories: tree-based indexes, hashing-based indexes, and visual-words-based inverted indexes. In this paper we review the technologies with respect to these three categories of mechanisms, and make several recommendations for future research issues.
This paper studies the target controllability of multi-layer complex networked systems, in which the nodes are high-dimensional linear time-invariant (LTI) dynamical systems and the network topology is directed and weighted. The influence of inter-layer couplings on the target controllability of multi-layer networks is discussed. It is found that even if there exists a layer which is not target controllable, the entire multi-layer network can still be target controllable due to the inter-layer couplings. For multi-layer networks with general structure, a necessary and sufficient condition for target controllability is given by establishing the relationship between the uncontrollable subspace and the output matrix. By the derived condition, it can be found that the system may be target controllable even if it is not state controllable. On this basis, two corollaries are derived, which clarify the relationship between target controllability, state controllability and output controllability. For multi-layer networks where the inter-layer couplings are directed chains and directed stars, sufficient conditions for target controllability of networked systems are given, respectively. These conditions are easier to verify than the classic criterion.
Speech emotion recognition (SER) uses acoustic analysis to find features for emotion recognition and examines variations in voice that are caused by emotions. The number of features acquired with acoustic analysis is extremely high, so we introduce a hybrid filter-wrapper feature selection algorithm based on an improved equilibrium optimizer for constructing an emotion recognition system. The proposed algorithm implements multi-objective emotion recognition with the minimum number of selected features and maximum accuracy. First, we use the information gain and Fisher Score to sort the features extracted from signals. Then, we employ a multi-objective ranking method to evaluate these features and assign different importance to them. Features with high rankings have a large probability of being selected. Finally, we propose a repair strategy to address the problem of duplicate solutions in multi-objective feature selection, which can improve the diversity of solutions and avoid falling into local traps. Using random forest and K-nearest neighbor classifiers, four English speech emotion datasets are employed to test the proposed algorithm (MBEO) as well as other multi-objective emotion identification techniques. The results illustrate that it performs well in inverted generational distance, hypervolume, Pareto solutions, and execution time, and MBEO is appropriate for high-dimensional English SER.
The objective of reliability-based design optimization (RBDO) is to minimize the optimization objective while satisfying the corresponding reliability requirements. However, the nested-loop characteristic reduces the efficiency of RBDO algorithms, which hinders their application to high-dimensional engineering problems. To address these issues, this paper proposes an efficient decoupled RBDO method combining high dimensional model representation (HDMR) and the weight-point estimation method (WPEM). First, we decouple the RBDO model using HDMR and WPEM. Second, Lagrange interpolation is used to approximate a univariate function. Finally, based on the results of the first two steps, the original nested-loop reliability optimization model is completely transformed into a deterministic design optimization model that can be solved by a series of mature constrained optimization methods without any additional calculations. Two numerical examples, a planar 10-bar structure and an aviation hydraulic piping system with 28 design variables, are analyzed to illustrate the performance and practicability of the proposed method.
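The second step named in this abstract, approximating a univariate function by Lagrange interpolation, can be illustrated with a short sketch. The sample function, the node positions, and the function name are assumptions chosen for the example; the paper's own component functions and nodes are not given in the abstract.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        # Basis polynomial L_i(x): equals 1 at xi and 0 at every other node.
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total

# Hypothetical univariate component g(x) = x**2 sampled at three nodes.
nodes = [0.0, 1.0, 2.0]
values = [v * v for v in nodes]
approx = lagrange_interpolate(nodes, values, 1.5)
```

Three nodes reproduce a quadratic exactly, so `approx` matches `1.5 ** 2` here; for a general component function the interpolant is only an approximation whose quality depends on the number and placement of nodes.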
In this paper, we introduce the censored composite conditional quantile coefficient (cCCQC) to rank the relative importance of each predictor in high-dimensional censored regression. The cCCQC takes advantage of all useful information across quantiles and can effectively detect nonlinear effects, including interactions and heterogeneity. Furthermore, the proposed screening method based on cCCQC is robust to the existence of outliers and enjoys the sure screening property. Simulation results demonstrate that the proposed method performs competitively on survival datasets with high-dimensional predictors, particularly when the variables are highly correlated.
The estimation of covariance matrices is very important in many fields, such as statistics. In real applications, data are frequently influenced by high dimensions and noise. However, most relevant studies are based on complete data. This paper studies the optimal estimation of high-dimensional covariance matrices based on missing and noisy sample under the norm. First, the model with sub-Gaussian additive noise is presented. The generalized sample covariance is then modified to define a hard thresholding estimator, and the minimax upper bound is derived. After that, the minimax lower bound is derived, and it is concluded that the estimator presented in this article is rate-optimal. Finally, numerical simulation analysis is performed. The result shows that for missing samples with sub-Gaussian noise, if the true covariance matrix is sparse, the hard thresholding estimator outperforms the traditional estimate method.
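The hard thresholding step mentioned in this abstract can be sketched on a small matrix. This example only shows the thresholding rule itself; it does not build the generalized sample covariance for missing data, and both the threshold value `tau` and the choice to leave the diagonal untouched are assumptions of the sketch.

```python
def hard_threshold(cov, tau):
    """Zero every off-diagonal entry whose magnitude is at most tau.

    Keeping the diagonal (the variances) untouched is an assumption here.
    """
    n = len(cov)
    return [
        [cov[i][j] if i == j or abs(cov[i][j]) > tau else 0.0 for j in range(n)]
        for i in range(n)
    ]

# Hypothetical 3 x 3 sample covariance with two small, noise-like entries.
sample_cov = [
    [1.00, 0.02, 0.40],
    [0.02, 1.00, -0.03],
    [0.40, -0.03, 1.00],
]
sparse_cov = hard_threshold(sample_cov, tau=0.1)
```

Entries that survive are kept at their original value rather than shrunk, which is what distinguishes hard from soft thresholding.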
This paper studies the re-adjusted cross-validation method and a semiparametric regression model called the varying index coefficient model (VICM). We use the profile spline modal estimator method to estimate the coefficients of the parametric part of the VICM, while the unknown function part is expanded using B-splines. Moreover, we combine the above two estimation methods under the assumption of high-dimensional data. The results of data simulation and empirical analysis show that for the varying index coefficient model, the re-adjusted cross-validation method is better in terms of accuracy and stability than traditional methods based on ordinary least squares.
Various index structures have recently been proposed to facilitate high-dimensional KNN queries, among which the techniques of approximate vector representation and one-dimensional (1D) transformation can break the curse of dimensionality. Based on these two techniques, a novel high-dimensional index is proposed, called the Bit-code and Distance based index (BD). BD is based on a special partitioning strategy which is optimized for high-dimensional data. Using the definitions of the bit code and the transformation function, a high-dimensional vector is first approximately represented and then transformed into a 1D value, the key managed by a B+-tree. A new KNN search algorithm is also proposed that exploits the bit code and distance to prune the search space more effectively. Results of extensive experiments using both synthetic and real data demonstrate that BD outperforms the existing index structures for KNN search in high-dimensional spaces.
k-means is a popular clustering algorithm because of its simplicity and its scalability to large datasets. However, one of its drawbacks is the difficulty of identifying the correct value of the hyperparameter k. Tuning this value correctly is critical for building effective k-means models. The traditional elbow method for identifying this value has a long-standing literature. However, with certain datasets the method produces smooth curves, which makes the k-value difficult to identify. On the other hand, the various internal validation indexes proposed as a solution to this issue may be inconsistent. Although various techniques for solving smooth elbow challenges exist, k-hyperparameter tuning in high-dimensional spaces remains intractable and an open research issue. In this paper, we first review the existing techniques for solving smooth elbow challenges. The identified research gaps are then used to develop a new technique. The new technique, referred to as the ensemble-based technique of a self-adapting autoencoder and internal validation indexes, is then validated in high-dimensional space clustering. The optimal k-value, tuned by this technique using a voting scheme, is a trade-off between the number of clusters visualized in the autoencoder's latent space, the k-value from the ensemble internal validation index score, and the k-value that yields a value of 0 or close to 0 for the derivative expression $f'''(k)\,(1+f'(k)^{2}) - 3 f'(k)\,f''(k)^{2}$ at the elbow. Experimental results based on Cochran's Q test, ANOVA, and McNemar's score indicate a relatively good performance of the newly developed technique in k-hyperparameter tuning.
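The derivative expression quoted in this abstract appears to be the numerator of the derivative of the curvature of the elbow curve y = f(k). Assuming that reading (the abstract itself does not spell it out), the relation can be derived as follows, where κ denotes the curvature:

```latex
\kappa(k) = \frac{f''(k)}{\left(1 + f'(k)^{2}\right)^{3/2}},
\qquad
\kappa'(k)
  = \frac{f'''(k)\left(1 + f'(k)^{2}\right) - 3\, f'(k)\, f''(k)^{2}}
         {\left(1 + f'(k)^{2}\right)^{5/2}} .
```

Since the denominator is always positive, κ'(k) = 0 exactly when the numerator vanishes, which is consistent with locating the elbow at a point of extreme curvature.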
Guaranteed cost consensus analysis and design problems for high-dimensional multi-agent systems with time-varying delays are investigated. The idea of guaranteed cost control is introduced into consensus problems for high-dimensional multi-agent systems with time-varying delays, where a cost function is defined based on state errors among neighboring agents and control inputs of all the agents. By the state space decomposition approach and the linear matrix inequality (LMI), sufficient conditions for guaranteed cost consensus and consensualization are given. Moreover, a guaranteed cost upper bound of the cost function is determined. It should be mentioned that these LMI criteria depend on the change rate of the time delays and the maximum time delay, whereas the guaranteed cost upper bound depends only on the maximum time delay and is independent of the Laplacian matrix. Finally, numerical simulations are given to demonstrate the theoretical results.
Parallel multi-thread processing in advanced intelligent processors is the core of realizing high-speed and high-capacity signal processing systems. The optical neural network (ONN) has the native advantages of high parallelization, large bandwidth, and low power consumption to meet the demand of big data. Here, we demonstrate a dual-layer ONN with a Mach-Zehnder interferometer (MZI) network and a nonlinear layer, in which the nonlinear activation function is achieved by optical-electronic signal conversion. Two frequency components from the microcomb source carrying digit datasets are simultaneously imposed and intelligently recognized through the ONN. We successfully achieve the digit classification of different frequency components by demultiplexing the output signal and testing the power distribution. Efficient parallelization feasibility with wavelength division multiplexing is demonstrated in our high-dimensional ONN. This work provides a high-performance architecture for future parallel high-capacity optical analog computing.
The performance of conventional similarity measurement methods is seriously affected by the curse of dimensionality of high-dimensional data. The reason is that the data difference in sparse and noisy dimensions accounts for a large proportion of the similarity, leading to dissimilarity between any pair of results. A similarity measurement method for high-dimensional data based on a normalized net lattice subspace is proposed. The data range of each dimension is divided into several intervals, and the components in different dimensions are mapped onto the corresponding intervals. Only components in the same or adjacent intervals are used to calculate the similarity. To validate this method, three data types are used, and seven common similarity measurement methods are compared. The experimental results indicate that the relative difference of the method increases with the dimensionality and is approximately two or three orders of magnitude higher than that of the conventional methods. In addition, the similarity range of this method in different dimensions is [0,1], which is suitable for similarity analysis after dimensionality reduction.
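The interval-based idea in this abstract can be sketched as follows. Only the rule "use a component when both values fall in the same or adjacent intervals" comes from the abstract; the equal-width intervals, the per-dimension score `1 - |x_i - y_i| / (2 * width)`, the averaging over all dimensions, and the function names are assumptions made so the example yields a value in [0, 1].

```python
def interval_index(value, lo, hi, n_intervals):
    """Index of the equal-width interval of [lo, hi] that contains value."""
    width = (hi - lo) / n_intervals
    return min(int((value - lo) / width), n_intervals - 1)

def lattice_similarity(x, y, lo, hi, n_intervals):
    """Similarity in [0, 1] that ignores components lying in distant intervals.

    The per-dimension score is an assumption of this sketch, not the
    formula from the paper.
    """
    width = (hi - lo) / n_intervals
    total = 0.0
    for xi, yi in zip(x, y):
        gap = abs(interval_index(xi, lo, hi, n_intervals)
                  - interval_index(yi, lo, hi, n_intervals))
        if gap <= 1:  # same or adjacent interval: the component counts
            total += 1.0 - abs(xi - yi) / (2.0 * width)
    return total / len(x)

# The third dimension differs strongly and is therefore ignored.
a = (0.10, 0.50, 0.90, 0.30)
b = (0.15, 0.55, 0.10, 0.30)
score = lattice_similarity(a, b, lo=0.0, hi=1.0, n_intervals=10)
```

Because a distant component contributes nothing instead of a large difference, a single noisy dimension cannot dominate the result the way it does in a plain Euclidean distance.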
The quantum state transmission through the medium of a high-dimensional many-particle system (boson or spinless fermion) is generally studied with a symmetry analysis. We discover that, if the spectrum of a Hamiltonian matches the symmetry of a fermion or boson system in a certain fashion, a perfect quantum state transfer can be implemented without any operation on the medium with pre-engineered nearest neighbor (NN). We also study a simple but realistic near half-filled tight-binding fermion system with uniform NN hopping integral. We show that an arbitrary many-particle state near the Fermi surface can be perfectly transferred to its translational counterpart.
Image matching technology is theoretically significant and practically promising in the field of autonomous navigation. To address the shortcomings of existing image matching navigation technologies, the concept of a high-dimensional combined feature is presented based on sequence image matching navigation. To balance between the distribution of high-dimensional combined features and the shortcomings of using geometric relations alone, we propose a method based on Delaunay triangulation to improve the feature, and add the regional characteristics of the features together with their geometric characteristics. Finally, the k-nearest neighbor (KNN) algorithm is adopted to optimize the searching process. Simulation results show that matching can be realized at rotation angles of −8° to 8° and scale factors of 0.9 to 1.1, and when the image size is 160 pixel × 160 pixel, the matching time is less than 0.5 s. Therefore, the proposed algorithm can substantially reduce computational complexity, improve the matching speed, and exhibit robustness to rotation and scale changes.
Latent factor (LF) models are highly effective in extracting useful knowledge from High-Dimensional and Sparse (HiDS) matrices, which are commonly seen in various industrial applications. An LF model usually adopts iterative optimizers, which may consume many iterations to reach a local optimum, resulting in considerable time cost. Hence, determining how to accelerate the training process for LF models has become a significant issue. To address this, this work proposes a randomized latent factor (RLF) model. It incorporates the principle of randomized learning techniques from neural networks into the LF analysis of HiDS matrices, thereby greatly alleviating the computational burden. It also extends a standard learning process for randomized neural networks in the context of LF analysis to make the resulting model represent an HiDS matrix correctly. Experimental results on three HiDS matrices from industrial applications demonstrate that, compared with state-of-the-art LF models, RLF is able to achieve significantly higher computational efficiency and comparable prediction accuracy for missing data. It provides an important alternative approach to LF analysis of HiDS matrices, which is especially desired for industrial applications demanding highly efficient models.
High-dimensional and sparse (HiDS) matrices commonly arise in various industrial applications, e.g., recommender systems (RSs), social networks, and wireless sensor networks. Since they contain rich information, how to accurately represent them is of great significance. A latent factor (LF) model is one of the most popular and successful ways to address this issue. Current LF models mostly adopt an L2-norm-oriented Loss to represent an HiDS matrix, i.e., they sum the errors between observed data and predicted ones with the L2-norm. Yet the L2-norm is sensitive to outlier data. Unfortunately, outlier data usually exist in such matrices. For example, an HiDS matrix from RSs commonly contains many outlier ratings due to some heedless or malicious users. To address this issue, this work proposes a smooth L1-norm-oriented latent factor (SL-LF) model. Its main idea is to adopt the smooth L1-norm rather than the L2-norm to form its Loss, giving it both strong robustness and high accuracy in predicting the missing data of an HiDS matrix. Experimental results on eight HiDS matrices generated by industrial applications verify that the proposed SL-LF model not only is robust to outlier data but also has significantly higher prediction accuracy than state-of-the-art models when they are used to predict the missing data of HiDS matrices.
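The robustness argument in this abstract can be made concrete by comparing the two per-entry losses on a few prediction errors. The smooth L1 form used below (quadratic below a threshold `delta`, linear above it) is the commonly used definition and is assumed here; the abstract does not state the exact form used by the SL-LF model, and the error values are made up for illustration.

```python
def l2_loss(err):
    """Squared error, as used by L2-norm-oriented losses."""
    return err * err

def smooth_l1_loss(err, delta=1.0):
    """Quadratic for |err| < delta, linear beyond it (assumed common form)."""
    a = abs(err)
    if a < delta:
        return 0.5 * a * a / delta
    return a - 0.5 * delta

# Hypothetical prediction errors; the last one plays the role of an outlier rating.
errors = [0.2, -0.5, 4.0]
l2_total = sum(l2_loss(e) for e in errors)
sl1_total = sum(smooth_l1_loss(e) for e in errors)
```

With the squared error the single outlier contributes 16 of the total 16.29, whereas under the smooth L1 form it contributes 3.5 of 3.645, so it pulls the fitted latent factors far less.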
Because all the known integrable models possess Schwarzian forms with Möbius transformation invariance, starting from suitable Möbius-transformation-invariant equations may be one of the best ways to find new integrable models. In this paper, we study the Painlevé integrability of some special (3+1)-dimensional Schwarzian models.
This paper deals with the representation of the solutions of a polynomial system, and concentrates on the high-dimensional case. Based on the rational univariate representation of zero-dimensional polynomial systems, we give a new description, called rational representation, for the solutions of a high-dimensional polynomial system and propose an algorithm for computing it. In this way all the solutions of any high-dimensional polynomial system can be represented by a set of so-called rational-representation sets.
Funding (BC-iDistance paper): Sponsored by the National High Technology Research and Development Program of China (863 Program) (Grant No. [2005]555).
Funding (EDD-tree paper): the Key Program of the National Natural Science Foundation of China (Grant No. 60533090); the National Natural Science Fund for Distinguished Young Scholars (Grant No. 60525108); the China-America Academic Digital Library Project.
Funding (image indexing review): supported by the National Natural Science Foundation of China (Nos. 61173114, 61202300, and 61272202) and the Guangdong Provincial Research Project (No. 2011B090400251).
Funding (target controllability paper): supported by the National Natural Science Foundation of China (U1808205) and the Hebei Natural Science Foundation (F2000501005).
Funding (decoupled RBDO paper): supported by the Innovation Fund Project of the Gansu Education Department (Grant No. 2021B-099).
Funding: Outstanding Youth Foundation of Hunan Provincial Department of Education (Grant No. 22B0911).
Abstract: In this paper, we introduce the censored composite conditional quantile coefficient (cCCQC) to rank the relative importance of each predictor in high-dimensional censored regression. The cCCQC takes advantage of all useful information across quantiles and can effectively detect nonlinear effects, including interactions and heterogeneity. Furthermore, the proposed screening method based on cCCQC is robust to the existence of outliers and enjoys the sure screening property. Simulation results demonstrate that the proposed method performs competitively on survival datasets with high-dimensional predictors, particularly when the variables are highly correlated.
Abstract: The estimation of covariance matrices is very important in many fields, such as statistics. In real applications, data are frequently influenced by high dimensions and noise. However, most relevant studies are based on complete data. This paper studies the optimal estimation of high-dimensional covariance matrices based on missing and noisy samples under the norm. First, the model with sub-Gaussian additive noise is presented. The generalized sample covariance is then modified to define a hard thresholding estimator, and the minimax upper bound is derived. After that, the minimax lower bound is derived, and it is concluded that the estimator presented in this article is rate-optimal. Finally, numerical simulation analysis is performed. The result shows that for missing samples with sub-Gaussian noise, if the true covariance matrix is sparse, the hard thresholding estimator outperforms the traditional estimation method.
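As a rough illustration of the hard thresholding idea mentioned in this abstract, the sketch below thresholds an ordinary sample covariance matrix computed from complete data. It is not the paper's estimator, which modifies a generalized sample covariance for missing and noisy samples; the function names and the threshold parameter `lam` are assumptions introduced here.

```python
def sample_covariance(rows):
    """Unbiased sample covariance of complete data (list of equal-length rows)."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]


def hard_threshold(cov, lam):
    """Keep an entry only if its magnitude reaches lam (hypothetical threshold);
    smaller entries are set to zero, which yields a sparse estimate."""
    return [[v if abs(v) >= lam else 0.0 for v in row] for row in cov]


if __name__ == "__main__":
    data = [[1.0, 2.0, 0.0], [2.0, 4.0, 1.0], [3.0, 6.0, 0.0], [4.0, 8.0, 1.0]]
    cov = sample_covariance(data)
    for row in hard_threshold(cov, lam=1.0):
        print(row)
```

In practice the threshold would be chosen from the sample size and dimension; here it is simply passed in by hand.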
Abstract: This paper studies the re-adjusted cross-validation method and a semiparametric regression model called the varying index coefficient model (VICM). We use the profile spline modal estimator method to estimate the coefficients of the parametric part of the VICM, while the unknown function part is expanded with B-splines. Moreover, we combine the above two estimation methods under the assumption of high-dimensional data. The results of data simulation and empirical analysis show that, for the varying index coefficient model, the re-adjusted cross-validation method is better in terms of accuracy and stability than traditional methods based on ordinary least squares.
Funding: Project (No. [2005]555) supported by the Hi-Tech Research and Development Program (863) of China.
Abstract: Various index structures have recently been proposed to facilitate high-dimensional KNN queries, among which the techniques of approximate vector representation and one-dimensional (1D) transformation can break the curse of dimensionality. Based on these two techniques, a novel high-dimensional index is proposed, called the Bit-code and Distance based index (BD). BD is based on a special partitioning strategy which is optimized for high-dimensional data. By the definitions of bit code and transformation function, a high-dimensional vector can be first approximately represented and then transformed into a 1D vector, the key managed by a B+-tree. A new KNN search algorithm is also proposed that exploits the bit code and distance to prune the search space more effectively. Results of extensive experiments using both synthetic and real data demonstrate that BD outperforms the existing index structures for KNN search in high-dimensional spaces.
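The abstract does not give the bit-code definition or the transformation function, so the following is only a sketch of how a bit code and a 1D distance key could support two-level pruning. It assumes a single reference point, one bit per dimension (set when the coordinate is at least the reference coordinate), and a sorted list standing in for the B+-tree; all function names are hypothetical.

```python
import bisect
import math


def bit_code(point, ref):
    """One bit per dimension: 1 if the coordinate is >= the reference coordinate."""
    return tuple(1 if x >= r else 0 for x, r in zip(point, ref))


def build_index(points, ref):
    """Sorted list of (distance_to_ref, bit_code, point); stands in for a B+-tree."""
    return sorted((math.dist(p, ref), bit_code(p, ref), tuple(p)) for p in points)


def range_query(index, ref, query, radius):
    """Return all indexed points within `radius` of `query`."""
    dq = math.dist(query, ref)
    q_code = bit_code(query, ref)
    keys = [entry[0] for entry in index]
    # Level 1: by the triangle inequality, hits have keys in [dq - r, dq + r].
    lo = bisect.bisect_left(keys, dq - radius)
    hi = bisect.bisect_right(keys, dq + radius)
    hits = []
    for _, code, p in index[lo:hi]:
        # Level 2: where the bits differ, the reference coordinate lies between
        # the two points, so |q_j - p_j| >= |q_j - ref_j| in that dimension.
        lower_sq = sum(
            (q - r) ** 2
            for q, r, a, b in zip(query, ref, q_code, code)
            if a != b
        )
        if lower_sq > radius ** 2:
            continue
        # Final check with the exact distance.
        if math.dist(p, query) <= radius:
            hits.append(p)
    return hits


if __name__ == "__main__":
    pts = [(0, 0), (1, 1), (5, 5), (9, 9), (4, 6)]
    idx = build_index(pts, ref=(5, 5))
    print(range_query(idx, (5, 5), (1, 1.5), 1.0))
```

A KNN search could be layered on top by growing the radius until k points are found, in the manner of iDistance-style indexes.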
Abstract: k-means is a popular clustering algorithm because of its simplicity and scalability to large datasets. However, one of its drawbacks is the challenge of identifying the correct value of the hyperparameter k. Tuning this value correctly is critical for building effective k-means models. The traditional elbow method has long been used to help identify this value. However, with certain datasets the method produces smooth curves, making the k-value hard to identify. On the other hand, the various internal validation indexes proposed as a solution to this issue may be inconsistent. Although various techniques for solving smooth elbow challenges exist, k-hyperparameter tuning in high-dimensional spaces remains an open research issue. In this paper, we first review the existing techniques for solving smooth elbow challenges. The identified research gaps are then utilized in the development of a new technique. The new technique, referred to as the ensemble-based technique of a self-adapting autoencoder and internal validation indexes, is then validated in high-dimensional space clustering. The optimal k-value, tuned by this technique using a voting scheme, is a trade-off between the number of clusters visualized in the autoencoder’s latent space, the k-value from the ensemble internal validation index score, and the one that generates a value of 0 or close to 0 for the derivative expression $f'''(k)\,(1+f'(k)^{2}) - 3 f'(k)\,f''(k)^{2}$ at the elbow. Experimental results based on Cochran’s Q test, ANOVA, and McNemar’s score indicate a relatively good performance of the newly developed technique in k-hyperparameter tuning.
Funding: Supported by the Shaanxi Province Natural Science Foundation of Research Projects (2016JM6014), the Innovation Foundation of High-Tech Institute of Xi’an (2015ZZDJJ03), and the Youth Foundation of High-Tech Institute of Xi’an (2016QNJJ004).
Abstract: Guaranteed cost consensus analysis and design problems for high-dimensional multi-agent systems with time-varying delays are investigated. The idea of guaranteed cost control is introduced into consensus problems for high-dimensional multi-agent systems with time-varying delays, where a cost function is defined based on state errors among neighboring agents and control inputs of all the agents. By the state space decomposition approach and the linear matrix inequality (LMI), sufficient conditions for guaranteed cost consensus and consensualization are given. Moreover, a guaranteed cost upper bound of the cost function is determined. It should be mentioned that these LMI criteria are dependent on the change rate of time delays and the maximum time delay, while the guaranteed cost upper bound is only dependent on the maximum time delay and is independent of the Laplacian matrix. Finally, numerical simulations are given to demonstrate the theoretical results.
Funding: Peng Xie acknowledges the support from the China Scholarship Council (Grant No. 201804910829).
Abstract: Parallel multi-thread processing in advanced intelligent processors is the core of realizing high-speed and high-capacity signal processing systems. The optical neural network (ONN) has the native advantages of high parallelization, large bandwidth, and low power consumption to meet the demand of big data. Here, we demonstrate a dual-layer ONN with a Mach-Zehnder interferometer (MZI) network and a nonlinear layer, in which the nonlinear activation function is achieved by optical-electronic signal conversion. Two frequency components from the microcomb source carrying digit datasets are simultaneously imposed and intelligently recognized through the ONN. We successfully achieve the digit classification of different frequency components by demultiplexing the output signal and testing the power distribution. Efficient parallelization feasibility with wavelength division multiplexing is demonstrated in our high-dimensional ONN. This work provides a high-performance architecture for future parallel high-capacity optical analog computing.
Funding: Supported by the National Natural Science Foundation of China (No. 61502475) and the Importation and Development of High-Caliber Talents Project of the Beijing Municipal Institutions (No. CIT&TCD201504039).
Abstract: The performance of conventional similarity measurement methods is seriously affected by the curse of dimensionality of high-dimensional data. The reason is that the data differences in sparse and noisy dimensions occupy a large proportion of the similarity, so that any two data points appear dissimilar. A similarity measurement method for high-dimensional data based on a normalized net lattice subspace is proposed. The data range of each dimension is divided into several intervals, and the components in different dimensions are mapped onto the corresponding intervals. Only the components in the same or adjacent intervals are used to calculate the similarity. To validate this method, three data types are used, and seven common similarity measurement methods are compared. The experimental results indicate that the relative difference of the method increases with the dimensionality and is approximately two or three orders of magnitude higher than that of the conventional methods. In addition, the similarity range of this method in different dimensions is [0,1], which is suitable for similarity analysis after dimensionality reduction.
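The interval-based idea described above can be sketched as follows. The abstract does not state the actual similarity formula, so the per-dimension score used here (one minus the difference divided by twice the interval width, averaged over all dimensions) is an assumption chosen only so that the result stays in [0,1]; the function names and the `n_intervals` parameter are likewise hypothetical, and input values are assumed to lie within the given ranges.

```python
def interval_index(value, lo, hi, n_intervals):
    """Map a value in [lo, hi] to the index of its interval (0 .. n_intervals - 1)."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_intervals)
    return min(max(idx, 0), n_intervals - 1)


def lattice_similarity(x, y, ranges, n_intervals=10):
    """Average per-dimension similarity, counting only dimensions in which the
    two components fall into the same or adjacent intervals."""
    total = 0.0
    for xj, yj, (lo, hi) in zip(x, y, ranges):
        width = (hi - lo) / n_intervals
        if width == 0:
            # Degenerate dimension: every value is identical, so count it as a match.
            total += 1.0
            continue
        ix = interval_index(xj, lo, hi, n_intervals)
        iy = interval_index(yj, lo, hi, n_intervals)
        if abs(ix - iy) <= 1:
            # Same or adjacent interval implies |xj - yj| <= 2 * width.
            total += 1.0 - abs(xj - yj) / (2 * width)
        # Dimensions with distant intervals contribute nothing.
    return total / len(x)


if __name__ == "__main__":
    ranges = [(0.0, 10.0)] * 3
    print(lattice_similarity((1.0, 5.0, 9.0), (1.0, 5.5, 2.0), ranges))
```

Dimensions in which the two points fall far apart are simply ignored rather than penalized, which is what keeps sparse or noisy dimensions from dominating the score.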
Funding: The project was supported by the National Natural Science Foundation of China under Grant Nos. 90203018, 10474104, and 10447133, the Knowledge Innovation Program (KIP) of the Chinese Academy of Sciences, and the National Fundamental Research Program of China under Grant No. 2001CB309310.
Abstract: The quantum state transmission through the medium of a high-dimensional many-particle system (boson or spinless fermion) is generally studied with a symmetry analysis. We discover that, if the spectrum of a Hamiltonian matches the symmetry of a fermion or boson system in a certain fashion, a perfect quantum state transfer can be implemented without any operation on the medium with pre-engineered nearest neighbor (NN). We also study a simple but realistic near half-filled tight-binding fermion system with uniform NN hopping integral. We show that an arbitrary many-particle state near the Fermi surface can be perfectly transferred to its translational counterpart.
Funding: Supported by the National Natural Science Foundation of China (Nos. 51205193, 51475221).
Abstract: Image matching technology is theoretically significant and practically promising in the field of autonomous navigation. Addressing shortcomings of existing image matching navigation technologies, the concept of a high-dimensional combined feature is presented based on sequence image matching navigation. To balance between the distribution of high-dimensional combined features and the shortcomings of using geometric relations alone, we propose a method based on Delaunay triangulation to improve the feature, and add the regional characteristics of the features together with their geometric characteristics. Finally, the k-nearest neighbor (KNN) algorithm is adopted to optimize the searching process. Simulation results show that the matching can be realized at rotation angles of −8° to 8° and scale factors of 0.9 to 1.1, and when the image size is 160 × 160 pixels, the matching time is less than 0.5 s. Therefore, the proposed algorithm can substantially reduce computational complexity, improve the matching speed, and exhibit robustness to rotation and scale changes.
Funding: Supported in part by the National Natural Science Foundation of China (61772493, 91646114), in part by the Chongqing research program of technology innovation and application (cstc2017rgzn-zdyfX0020), and in part by the Pioneer Hundred Talents Program of Chinese Academy of Sciences.
Abstract: Latent factor (LF) models are highly effective in extracting useful knowledge from High-Dimensional and Sparse (HiDS) matrices, which are commonly seen in various industrial applications. An LF model usually adopts iterative optimizers, which may consume many iterations to reach a local optimum, resulting in considerable time cost. Hence, determining how to accelerate the training process for LF models has become a significant issue. To address this, this work proposes a randomized latent factor (RLF) model. It incorporates the principle of randomized learning techniques from neural networks into the LF analysis of HiDS matrices, thereby greatly alleviating the computational burden. It also extends a standard learning process for randomized neural networks in the context of LF analysis to make the resulting model represent an HiDS matrix correctly. Experimental results on three HiDS matrices from industrial applications demonstrate that, compared with state-of-the-art LF models, RLF is able to achieve significantly higher computational efficiency and comparable prediction accuracy for missing data. It provides an important alternative approach to LF analysis of HiDS matrices, which is especially desired for industrial applications demanding highly efficient models.
Funding: Supported in part by the National Natural Science Foundation of China (61702475, 61772493, 61902370, 62002337), in part by the Natural Science Foundation of Chongqing, China (cstc2019jcyj-msxmX0578, cstc2019jcyjjqX0013), in part by the Chinese Academy of Sciences “Light of West China” Program, in part by the Pioneer Hundred Talents Program of Chinese Academy of Sciences, and by the Technology Innovation and Application Development Project of Chongqing, China (cstc2019jscx-fxydX0027).
Abstract: High-dimensional and sparse (HiDS) matrices commonly arise in various industrial applications, e.g., recommender systems (RSs), social networks, and wireless sensor networks. Since they contain rich information, how to accurately represent them is of great significance. A latent factor (LF) model is one of the most popular and successful ways to address this issue. Current LF models mostly adopt an L2-norm-oriented Loss to represent an HiDS matrix, i.e., they sum the errors between observed data and predicted ones with the L2-norm. Yet the L2-norm is sensitive to outlier data. Unfortunately, outlier data usually exist in such matrices. For example, an HiDS matrix from RSs commonly contains many outlier ratings due to some heedless or malicious users. To address this issue, this work proposes a smooth L1-norm-oriented latent factor (SL-LF) model. Its main idea is to adopt the smooth L1-norm rather than the L2-norm to form its Loss, giving it both strong robustness and high accuracy in predicting the missing data of an HiDS matrix. Experimental results on eight HiDS matrices generated by industrial applications verify that the proposed SL-LF model not only is robust to outlier data but also has significantly higher prediction accuracy than state-of-the-art models when they are used to predict the missing data of HiDS matrices.
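To illustrate why a smooth L1-norm loss is less sensitive to outliers than an L2-norm loss, the sketch below compares the two on a small set of prediction errors. It uses the commonly used smooth L1 form (quadratic below a cutoff, linear above it); the abstract does not give the exact definition used by SL-LF, so the cutoff parameter `delta` and the function names are assumptions.

```python
def l2_loss(error):
    """Squared error: grows quadratically, so a single outlier can dominate."""
    return error * error


def smooth_l1_loss(error, delta=1.0):
    """Quadratic for |error| < delta and linear beyond it (assumed common form)."""
    a = abs(error)
    if a < delta:
        return 0.5 * a * a / delta
    return a - 0.5 * delta


if __name__ == "__main__":
    # Two small errors and one outlier, e.g. a rating from a careless user.
    errors = [0.1, -0.2, 5.0]
    print("L2 total:       ", sum(l2_loss(e) for e in errors))
    print("smooth L1 total:", sum(smooth_l1_loss(e) for e in errors))
```

With these errors the outlier accounts for nearly all of the L2 total, while its contribution to the smooth L1 total grows only linearly.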
Abstract: Because all the known integrable models possess Schwarzian forms with Möbius transformation invariance, one of the best ways to find new integrable models may be to start from suitable Möbius transformation invariant equations. In this paper, we study the Painlevé integrability of some special (3+1)-dimensional Schwarzian models.
Funding: The National Grand Fundamental Research 973 Program (2004CB318000) of China.
Abstract: This paper deals with the representation of the solutions of a polynomial system, and concentrates on the high-dimensional case. Based on the rational univariate representation of zero-dimensional polynomial systems, we give a new description, called rational representation, for the solutions of a high-dimensional polynomial system and propose an algorithm for computing it. In this way, all the solutions of any high-dimensional polynomial system can be represented by a set of so-called rational-representation sets.