An algorithm, Clustering Algorithm Based On Sparse Feature Vector (CABOSFV), was proposed for the high dimensional clustering of binary sparse data. The algorithm compresses the data effectively using a tool called the 'Sparse Feature Vector', thereby reducing the data scale enormously, and can obtain the clustering result with only one data scan. Both theoretical analysis and empirical tests showed that CABOSFV has low computational complexity. The algorithm finds clusters in large high dimensional datasets efficiently and handles noise effectively.
Information analysis of high dimensional data was carried out through the application of a similarity measure. High dimensional data were considered as a typical structure. Additionally, overlapped and non-overlapped data were introduced, and similarity measure analysis was illustrated and compared with the conventional similarity measure. As a result, the similarity of overlapped data could be expressed with the conventional similarity measure, while the similarity analysis of non-overlapped data provided the clue to solving the similarity of high dimensional data. The high dimensional data analysis was designed with consideration of neighborhood information. Conservative and strict solutions were proposed. The proposed similarity measure was applied to express financial fraud among multi-dimensional datasets. In an illustrative example, financial fraud similarity with respect to age, gender, qualification and job was presented. With the proposed similarity measure, high dimensional personal data were evaluated for how similar they are to financial fraud. Calculation results show that actual fraud cases have a rather high similarity measure compared to the average, ranging from a minimum of 0.0609 to a maximum of 0.1667.
In this paper, the global controllability of a class of high dimensional polynomial systems is investigated and a constructive algebraic criterion algorithm for their global controllability is obtained. With the criterion algorithm, global controllability can be determined in a finite number of arithmetic operations. The algorithm is imposed on the coefficients of the polynomials only, and the analysis technique is based on the Sturm Theorem in real algebraic geometry and its modern progress. Finally, the authors give some examples to show the application of the results.
To solve the high-dimensionality issue and improve accuracy in credit risk assessment, a high-dimensionality-trait-driven learning paradigm is proposed for feature extraction and classifier selection. The proposed paradigm consists of three main stages: categorization of high dimensional data, high-dimensionality-trait-driven feature extraction, and high-dimensionality-trait-driven classifier selection. In the first stage, according to the definition of high dimensionality and the relationship between sample size and feature dimensions, the high-dimensionality traits of a credit dataset are categorized into two types: 100 < feature dimensions < sample size, and feature dimensions ≥ sample size. In the second stage, some typical feature extraction methods are tested on the two categories of high dimensionality. In the final stage, four types of classifiers are applied to evaluate credit risk under the different high-dimensionality traits. For illustration and verification, credit classification experiments are performed on two publicly available credit risk datasets. The results show that the proposed paradigm is effective in handling high-dimensional credit classification issues and improves credit classification accuracy relative to the benchmark models listed in the study.
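The first-stage categorization described above can be sketched in a few lines of Python. This is only an illustrative reading of the two traits named in the abstract: the function name `dimensionality_trait` and the returned labels are hypothetical, and the abstract does not say how datasets with 100 or fewer features and more samples than features are treated, so the sketch simply labels them as outside both categories.

```python
def dimensionality_trait(n_samples: int, n_features: int) -> str:
    """Classify a dataset by the two high-dimensionality traits in the abstract.

    The label strings are hypothetical names chosen for this sketch.
    """
    if n_features >= n_samples:
        # Trait 2: feature dimensions >= sample size
        return "features>=samples"
    if n_features > 100:
        # Trait 1: 100 < feature dimensions < sample size
        return "100<features<samples"
    # Not covered by either category in the abstract (assumption of this sketch)
    return "not-high-dimensional"
```

For example, a dataset with 1000 samples and 200 features would fall under the first trait, while one with 100 samples and 200 features would fall under the second.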
Markowitz portfolio theory underestimates the risk associated with the return of a portfolio in the case of high dimensional data. El Karoui proved this mathematically in [1] and suggested improved estimators for unbiased estimation of this risk under specific model assumptions. Norm-constrained portfolios have recently been studied to keep the effective dimension low. In this paper we consider three sets of high dimensional data: the stock market prices for three countries, namely the US, the UK and India. We compare the Markowitz efficient frontier to those obtained by unbiasedness corrections and by imposing norm constraints in these real data scenarios. We also study the out-of-sample performance of the different procedures. We find that the 2-norm constrained portfolio has the best overall performance.
This paper proposes a test procedure for the regression coefficients in high dimensional partially linear models based on the F-statistic. In the partially linear model, the authors first estimate the unknown nonlinear component by nonparametric methods and then generalize the F-statistic to test the regression coefficients under some regularity conditions. In this procedure, the estimation of the nonlinear component makes it challenging to explore the properties of the generalized F-test. The authors obtain asymptotic properties of the generalized F-test in more general cases, including the asymptotic normality and the power of the test with p/n ∈ (0, 1), without a normality assumption. The asymptotic result is general, and by adding some constraint conditions similar conclusions can be obtained for high dimensional linear models. Through simulation studies, the authors demonstrate good finite-sample performance of the proposed test in comparison with the theoretical results. The practical utility of the method is illustrated by a real data example.
Three high dimensional spatial standardization algorithms are used for diffusion tensor image (DTI) registration, and seven kinds of methods are used to evaluate their performance. First, the template used in this paper was obtained by spatial transformation of 16 subjects by means of tensor-based standardization. Then, high dimensional standardization algorithms for diffusion tensor images were performed, including an FA (fractional anisotropy) based diffeomorphic registration algorithm, an FA based elastic registration algorithm, and a tensor-based registration algorithm. Finally, 7 kinds of evaluation methods, including normalized standard deviation, dyadic coherence, diffusion cross-correlation, overlap of eigenvalue-eigenvector pairs, Euclidean distance of diffusion tensor, and Euclidean distance of the deviatoric tensor and deviatoric of tensors, were used to qualitatively compare and summarize the above standardization algorithms. Experimental results revealed that the high-dimensional tensor-based standardization algorithms perform well and can maintain the consistency of anatomical structures.
This paper considers tests for regression coefficients in high dimensional partially linear models. The authors first use the B-spline method to estimate the unknown smooth function so that it can be expressed linearly. Then, the authors propose an empirical likelihood method to test the regression coefficients. The authors derive the asymptotic chi-squared distribution with two degrees of freedom of the proposed test statistics under the null hypothesis. In addition, the method is extended to testing with nuisance parameters. Simulations show that the proposed method performs well in controlling the type-I error rate and in power. The proposed method is also employed to analyze a dataset of Skin Cutaneous Melanoma (SKCM).
The covariance matrix plays an important role in risk management, asset pricing, and portfolio allocation. Covariance matrix estimation becomes challenging when the dimensionality is comparable to or much larger than the sample size. A widely used approach for reducing dimensionality is based on multi-factor models. Although this approach has been well studied and is quite successful in many applications, the quality of the estimated covariance matrix is often degraded by a nontrivial amount of missing data in the factor matrix, for both technical and cost reasons. Since the factor matrix is only approximately low rank or even has full rank, existing matrix completion algorithms are not applicable. We consider a new matrix completion paradigm that uses the factor models directly and apply the alternating direction method of multipliers for the recovery. Numerical experiments show that the nuclear-norm matrix completion approaches are not suitable, but our proposed models and algorithms are promising.
We propose a two-step variable selection procedure for censored quantile regression with high dimensional predictors. To account for censored data in the high dimensional case, we employ effective dimension reduction and the idea of informative subsets. Under some regularity conditions, we show that our procedure enjoys model selection consistency. A simulation study and a real data analysis are conducted to evaluate the finite sample performance of the proposed approach.
Clustering high dimensional data is challenging because data dimensionality increases the distance between data points, resulting in sparse regions that degrade clustering performance. Subspace clustering is a common approach for processing high-dimensional data by finding relevant features for each cluster in the data space. Subspace clustering methods extend traditional clustering to account for the constraints imposed by data streams. Data streams are not only high-dimensional, but also unbounded and evolving. This necessitates the development of subspace clustering algorithms that can handle high dimensionality and adapt to the unique characteristics of data streams. Although many articles have contributed to the literature review on data stream clustering, there is currently no specific review on subspace clustering algorithms in high-dimensional data streams. Therefore, this article aims to systematically review the existing literature on subspace clustering of data streams in high-dimensional streaming environments. The review follows a systematic methodological approach and includes 18 articles for the final analysis. The analysis focused on two research questions related to the general clustering process and dealing with the unbounded and evolving characteristics of data streams. The main findings relate to six elements: clustering process, cluster search, subspace search, synopsis structure, cluster maintenance, and evaluation measures. Most algorithms use a two-phase clustering approach consisting of an initialization stage, a refinement stage, a cluster maintenance stage, and a final clustering stage. The density-based top-down subspace clustering approach is more widely used than the others because it is able to distinguish true clusters and outliers using projected microclusters. Most algorithms implicitly adapt to the evolving nature of the data stream by using a time fading function that is sensitive to outliers. Future work can focus on the clustering framework, parameter optimization, subspace search techniques, memory-efficient synopsis structures, explicit cluster change detection, and intrinsic performance metrics. This article can serve as a guide for researchers interested in high-dimensional subspace clustering methods for data streams.
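As a rough illustration of the time fading idea mentioned above, the sketch below uses the exponential form f(Δt) = 2^(−λ·Δt), a common choice in density-based stream clustering (for example in DenStream). The abstract does not say which fading function the reviewed algorithms use, so the functional form, the `decay_rate` parameter, and the `FadingWeight` class are all assumptions made for this example.

```python
from dataclasses import dataclass


def fading(elapsed: float, decay_rate: float) -> float:
    """Exponential fading factor 2^(-decay_rate * elapsed); assumed form."""
    return 2.0 ** (-decay_rate * elapsed)


@dataclass
class FadingWeight:
    """Weight of a hypothetical microcluster that decays between updates."""
    decay_rate: float
    weight: float = 0.0
    last_update: float = 0.0

    def add_point(self, timestamp: float) -> None:
        # Fade the accumulated weight to the current time, then count the new point.
        self.weight *= fading(timestamp - self.last_update, self.decay_rate)
        self.weight += 1.0
        self.last_update = timestamp

    def weight_at(self, timestamp: float) -> float:
        # Weight as seen at a later time, without modifying the summary.
        return self.weight * fading(timestamp - self.last_update, self.decay_rate)
```

With such a scheme, a summary that stops receiving points loses weight over time, which is one way an algorithm could let outdated clusters fade without explicitly detecting change.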
Dynamic transient responses of a rotating twisted plate under air-blast loading and step loading, respectively, are investigated using classical shallow shell theory, taking geometrically nonlinear relationships into account. By applying the energy principle, a novel high dimensional nonlinear dynamic system of the rotating cantilever twisted plate is derived for the first time. Using variable mode functions built from polynomial functions according to the twist angles and geometry of the plate describes the dynamic system more accurately than using the classic cantilever beam functions and the free-free beam functions. Comparison studies between the present results and the literature are carried out to validate the present model, formulation and computational procedure. The equations of motion describing the transient high dimensional nonlinear dynamic response are reduced to a four-degree-of-freedom dynamic system expressed in terms of the out-of-plane displacement. The effects of twist angle, stagger angle, rotation speed, load intensity and viscous damping on the nonlinear dynamic transient responses of the twisted plate are investigated. It is important to note that although a homogeneous and isotropic material is used here, the approach might also be helpful for laminated composites and functionally graded materials, as long as the equivalent material parameters are obtained.
The query space of a similarity query is usually narrowed down by pruning inactive query subspaces, which contain no query results, and keeping active query subspaces, which may contain objects corresponding to the request. However, some active query subspaces may contain no query results at all; these are called false active query subspaces. The performance of query processing clearly degrades in the presence of false active query subspaces. Our experiments show that this problem becomes serious when the data are high dimensional, and that the number of accesses to false active subspaces increases as the dimensionality increases. To solve this problem, this paper proposes a space mapping approach to reduce such unnecessary accesses. A given query space can be refined by filtering within its mapped space. To do so, a mapping strategy called maxgap is proposed to improve the efficiency of the refinement processing. Based on the mapping strategy, an index structure called MS-tree and query processing algorithms are presented. Finally, the performance of MS-tree is compared with that of other competitors in terms of range queries on a real data set.
This paper aims to develop a new robust U-type test for high dimensional regression coefficients using the estimated U-statistic of order two and refitted cross-validation error variance estimation. It is proved that the limiting null distribution of the proposed test is normal under two kinds of ordinary models. We further study the local power of the proposed test and compare it with other competitive tests for high dimensional data. The refitted cross-validation approach is used to reduce the bias of the sample variance in the estimation of the test statistic. Our theoretical results indicate that the proposed test can have even more substantial power gain than the test by Zhong and Chen (2011) when testing a hypothesis with outlying observations and heavy-tailed distributions. We assess the finite-sample performance of the proposed test by examining its size and power via Monte Carlo studies. We also illustrate the application of the proposed test by an empirical analysis of a real data example.
In this paper a high-dimension multiparty quantum secret sharing scheme is proposed by using Einstein-Podolsky-Rosen pairs and local unitary operators. This scheme has the advantage of not only having higher capacity, but also saving storage space. The security analysis is also given.
We deal with the boundedness of solutions to a class of fully parabolic quasilinear repulsion chemotaxis systems
$$\begin{cases} u_t = \nabla\cdot(\phi(u)\nabla u) + \nabla\cdot(\psi(u)\nabla v), & (x,t)\in\Omega\times(0,T),\\ v_t = \Delta v - v + u, & (x,t)\in\Omega\times(0,T), \end{cases}$$
under homogeneous Neumann boundary conditions in a smooth bounded domain $\Omega\subset\mathbb{R}^N$ ($N\ge 3$), where $0<\psi(u)\le K(u+1)^{\alpha}$ and $K_1(s+1)^m\le\phi(s)\le K_2(s+1)^m$ with $\alpha, K, K_1, K_2>0$ and $m\in\mathbb{R}$. It is shown that if $\alpha-m<4/(N+2)$, then for any sufficiently smooth initial data, the classical solutions to the system are uniformly-in-time bounded. This extends the known result for the corresponding model with linear diffusion.
This paper establishes the asymptotic independence between the quadratic form z^T A z and the maximum max_{1≤i≤p} |z_i| of a sequence of independent sub-Gaussian random variables z = (z_1, …, z_p)^T. Based on this theoretical result, we find the asymptotic joint distribution of the quadratic form and the maximum, which can be applied to high-dimensional testing problems. By combining the sum-type test and the max-type test, we propose Fisher's combination tests for the one-sample mean test and the two-sample mean test. Under this novel general framework, several strong assumptions in the existing literature are relaxed. Monte Carlo simulations show that the proposed tests are strongly robust to both sparse and dense data.
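Fisher's method for combining two p-values can be sketched as follows. This is the textbook combination under independence, not the authors' specific test statistics: `p_sum` and `p_max` are hypothetical placeholders for the p-values of a sum-type and a max-type test, and the asymptotic independence established in the paper is what would justify treating them as independent.

```python
import math


def fisher_combine(p_sum: float, p_max: float) -> float:
    """Combined p-value from two independent p-values via Fisher's method.

    T = -2 (ln p_sum + ln p_max) follows a chi-squared distribution with
    4 degrees of freedom under the null, whose survival function has the
    closed form exp(-t/2) * (1 + t/2).
    """
    if not (0.0 < p_sum <= 1.0 and 0.0 < p_max <= 1.0):
        raise ValueError("p-values must lie in (0, 1]")
    t = -2.0 * (math.log(p_sum) + math.log(p_max))
    return math.exp(-t / 2.0) * (1.0 + t / 2.0)
```

The combined test rejects when either component p-value is very small, which is the intuition behind its robustness to both sparse alternatives (favoring the max-type test) and dense alternatives (favoring the sum-type test).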
This paper investigates the moment selection and parameter estimation problem for high-dimensional unconditional moment conditions. First, the authors propose a Fantope projection and selection (FPS) approach to distinguish the informative and uninformative moments in high-dimensional unconditional moment conditions. Second, for the selected unconditional moment conditions, the authors present a generalized empirical likelihood (GEL) approach to estimate the unknown parameters. The proposed method is computationally feasible and can efficiently avoid the well-known ill-posed problem of the GEL approach in the analysis of high-dimensional unconditional moment conditions. Under some regularity conditions, the authors show the consistency of the selected moment conditions, and the consistency and asymptotic normality of the proposed GEL estimator. Two simulation studies are conducted to investigate the finite sample performance of the proposed methodologies. The proposed method is illustrated by a real example.
This paper reviews the adaptive sparse grid discontinuous Galerkin (aSG-DG) method for computing high dimensional partial differential equations (PDEs) and its software implementation. The C++ software package AdaM-DG, which implements the aSG-DG method, is available on GitHub at https://github.com/JuntaoHuang/adaptive-multiresolution-DG. The package is capable of treating a large class of high dimensional linear and nonlinear PDEs. We review the essential components of the algorithm and the functionality of the software, including the multiwavelets used, the assembly of bilinear operators, and the fast matrix-vector product for data with hierarchical structures. We further demonstrate the performance of the package by reporting the numerical error and the CPU cost for several benchmark tests, including linear transport equations, wave equations, and Hamilton-Jacobi (HJ) equations.
Controlled quantum teleportation (CQT), which is regarded as the prelude and backbone for a genuine quantum internet, reveals the cooperation, supervision, and control relationship among the sender, receiver, and controller in the quantum network within the simplest unit. Compared with low-dimensional counterparts, high-dimensional CQT can exhibit larger information transmission capacity and higher superiority of the controller's authority. In this article, we report a proof-of-principle experimental realization of three-dimensional (3D) CQT with a fidelity of 97.4% ± 0.2%. To reduce the complexity of the circuit, we simulate a standard 4-qutrit CQT protocol in a 9×9-dimensional two-photon system with high-quality operations. The corresponding control powers in the experiment are 48.1% ± 0.2% for teleporting a qutrit and 52.8% ± 0.3% for teleporting a qubit, both higher than the theoretical value of the control power in the 2-dimensional CQT protocol (33%). The results fully demonstrate the advantages of high-dimensional multi-partite entangled networks and provide new avenues for constructing complex quantum networks.
Funding: Project (RDF 11-02-03) supported by the Research Development Fund of XJTLU, China
Funding: Supported by the Natural Science Foundation of China under Grant Nos. 60804008, 61174048 and 11071263, the Fundamental Research Funds for the Central Universities, and the Guangdong Province Key Laboratory of Computational Science at Sun Yat-Sen University
Funding: This work is partially supported by grants from the Key Program of the National Natural Science Foundation of China (NSFC Nos. 71631005 and 71731009) and the Major Program of the National Social Science Foundation of China (No. 19ZDA103).
Funding: Supported by the Natural Science Foundation of China under Grant Nos. 11231010, 11471223 and 11501586, BCMIIS, and the Key Project of Beijing Municipal Educational Commission under Grant No. KZ201410028030
Funding: Supported by the National Key Research and Development Program of China (2016YFC0100300), the National Natural Science Foundation of China (61402371, 61771369), the Natural Science Basic Research Plan in Shaanxi Province of China (2017JM6008), and the Fundamental Research Funds for the Central Universities of China (3102017zy032, 3102018zy020)
Funding: Supported by the University of Chinese Academy of Sciences under Grant No. Y95401TXX2, the Beijing Natural Science Foundation under Grant No. Z190004, and the Key Program of Joint Funds of the National Natural Science Foundation of China under Grant No. U19B2040.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 10971122, 11101274 and 11322109), the Scientific and Technological Projects of Shandong Province (Grant No. 2009GG10001012), and the Excellent Young Scientist Foundation of Shandong Province (Grant No. BS2012SF025).
Abstract: The covariance matrix plays an important role in risk management, asset pricing, and portfolio allocation. Covariance matrix estimation becomes challenging when the dimensionality is comparable to or much larger than the sample size. A widely used approach for reducing dimensionality is based on multi-factor models. Although this approach has been well studied and is quite successful in many applications, the quality of the estimated covariance matrix is often degraded by a nontrivial amount of missing data in the factor matrix, for both technical and cost reasons. Since the factor matrix is only approximately low rank or even has full rank, existing matrix completion algorithms are not applicable. We consider a new matrix completion paradigm that uses the factor models directly and apply the alternating direction method of multipliers for the recovery. Numerical experiments show that the nuclear-norm matrix completion approaches are not suitable, whereas our proposed models and algorithms are promising.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11401383, 11301391 and 11271080).
Abstract: We propose a two-step variable selection procedure for censored quantile regression with high-dimensional predictors. To account for censored data in the high-dimensional case, we employ effective dimension reduction and the idea of informative subsets. Under some regularity conditions, we show that our procedure enjoys model selection consistency. A simulation study and a real data analysis are conducted to evaluate the finite-sample performance of the proposed approach.
Abstract: Clustering high-dimensional data is challenging because data dimensionality increases the distance between data points, resulting in sparse regions that degrade clustering performance. Subspace clustering is a common approach for processing high-dimensional data by finding relevant features for each cluster in the data space. Subspace clustering methods extend traditional clustering to account for the constraints imposed by data streams. Data streams are not only high-dimensional, but also unbounded and evolving. This necessitates the development of subspace clustering algorithms that can handle high dimensionality and adapt to the unique characteristics of data streams. Although many articles have contributed to the literature review on data stream clustering, there is currently no specific review on subspace clustering algorithms in high-dimensional data streams. Therefore, this article aims to systematically review the existing literature on subspace clustering of data streams in high-dimensional streaming environments. The review follows a systematic methodological approach and includes 18 articles for the final analysis. The analysis focused on two research questions related to the general clustering process and dealing with the unbounded and evolving characteristics of data streams. The main findings relate to six elements: clustering process, cluster search, subspace search, synopsis structure, cluster maintenance, and evaluation measures. Most algorithms use a two-phase clustering approach consisting of an initialization stage, a refinement stage, a cluster maintenance stage, and a final clustering stage. The density-based top-down subspace clustering approach is more widely used than the others because it is able to distinguish true clusters and outliers using projected microclusters. Most algorithms implicitly adapt to the evolving nature of the data stream by using a time fading function that is sensitive to outliers. Future work can focus on the clustering framework, parameter optimization, subspace search techniques, memory-efficient synopsis structures, explicit cluster change detection, and intrinsic performance metrics. This article can serve as a guide for researchers interested in high-dimensional subspace clustering methods for data streams.
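Illustrative note: the abstract above mentions a time fading function but does not specify its form. The sketch below assumes the exponential form 2^(-λ·Δt) that is common in damped-window stream clustering. The class name `FadingMicroCluster`, the parameter `decay_rate` and the methods `fade_to` and `absorb` are hypothetical names chosen for illustration and are not taken from any of the reviewed algorithms.

```python
class FadingMicroCluster:
    """Tracks only the decayed weight of a microcluster (assumed design)."""

    def __init__(self, decay_rate, created_at):
        # decay_rate (lambda) is an assumed tuning parameter: larger values
        # make old points lose influence faster.
        self.decay_rate = decay_rate
        self.weight = 0.0
        self.last_update = created_at

    def fade_to(self, now):
        # Apply the assumed fading function 2^(-lambda * dt) to the weight.
        dt = now - self.last_update
        self.weight *= 2.0 ** (-self.decay_rate * dt)
        self.last_update = now

    def absorb(self, now):
        # Decay the existing weight first, then count the new point fully.
        self.fade_to(now)
        self.weight += 1.0


mc = FadingMicroCluster(decay_rate=1.0, created_at=0.0)
mc.absorb(0.0)   # first point arrives at t = 0
mc.absorb(1.0)   # second point arrives one time unit later
mc.fade_to(2.0)  # no new points; the weight keeps decaying
print(mc.weight)
```

Under this assumption, a microcluster whose weight falls below a chosen threshold can be treated as outdated, which is one way an algorithm can adapt implicitly to an evolving stream.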
Funding: Supported by the National Natural Science Foundation of China through Grant Nos. 11872127, 11832002 and 11732005, the Fundamental Research Program of Shenzhen Municipality (No. JCYJ20160608153749600), the Project of High-level Innovative Team Building Plan for Beijing Municipal Colleges and Universities (No. IDHT20180513), and the Qin Xin Talents Cultivation Program of Beijing Information Science & Technology University (QXTCP A201901).
Abstract: The dynamic transient responses of a rotating twisted plate under air-blast loading and step loading, with geometric nonlinearity taken into account, are investigated using classical shallow shell theory. By applying the energy principle, a novel high-dimensional nonlinear dynamic system of the rotating cantilever twisted plate is derived for the first time. Variable mode functions, built from polynomial functions according to the twist angle and geometry of the plate, describe the dynamic system more accurately than the classic cantilever beam functions and free-free beam functions. The present results are compared with those in the literature to validate the present model, formulation and computational procedure. The equations of motion describing the transient high-dimensional nonlinear dynamic response are reduced to a four-degree-of-freedom dynamic system expressed in terms of the out-of-plane displacement. The effects of twist angle, stagger angle, rotation speed, load intensity and viscous damping on the nonlinear dynamic transient responses of the twisted plate are investigated. Although a homogeneous and isotropic material is used here, the approach may also be helpful for laminated composites and functionally graded materials as long as the equivalent material parameters are obtained.
Funding: Supported by the National Basic Research Program of China (Grant No. 2006CB303103), the National Natural Science Foundation of China (Grant Nos. 60873011, 60802026, 60773219, 60773021), and the High Technology Program (Grant No. 2007AA01Z192).
Abstract: The query space of a similarity query is usually narrowed down by pruning inactive query subspaces, which contain no query results, and keeping active query subspaces, which may contain objects corresponding to the request. However, some active query subspaces may contain no query results at all; these are called false active query subspaces. The performance of query processing clearly degrades in the presence of false active query subspaces. Our experiments show that this problem becomes serious when the data are high-dimensional, and that the number of accesses to false active subspaces increases as the dimensionality increases. To solve this problem, this paper proposes a space mapping approach that reduces such unnecessary accesses. A given query space can be refined by filtering within its mapped space. To do so, a mapping strategy called maxgap is proposed to improve the efficiency of the refinement processing. Based on the mapping strategy, an index structure called MS-tree and query processing algorithms are presented. Finally, the performance of MS-tree is compared with that of other competitors in terms of range queries on a real data set.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11071022, 11231010 and 11471223), the Beijing Center for Mathematics and Information Interdisciplinary Science, and the Key Project of Beijing Municipal Educational Commission (Grant No. KZ201410028030).
Abstract: This paper aims to develop a new robust U-type test for high-dimensional regression coefficients using the estimated U-statistic of order two and refitted cross-validation error variance estimation. It is proved that the limiting null distribution of the proposed test is normal under two kinds of ordinary models. We further study the local power of the proposed test and compare it with other competitive tests for high-dimensional data. The refitted cross-validation approach is used to reduce the bias of the sample variance in the estimation of the test statistic. Our theoretical results indicate that the proposed test can have an even more substantial power gain than the test by Zhong and Chen (2011) when testing a hypothesis with outlying observations and heavy-tailed distributions. We assess the finite-sample performance of the proposed test by examining its size and power via Monte Carlo studies. We also illustrate the application of the proposed test through an empirical analysis of a real data example.
Funding: Project supported by the National Fundamental Research Program (Grant No 001CB309308), the China National Natural Science Foundation (Grant Nos 60433050, 10325521, 10447106), the Hang-Tian Science Fund, the SRFDP program of the Education Ministry of China, and the Beijing Education Committee (Grant No XK100270454).
Abstract: In this paper, a high-dimension multiparty quantum secret sharing scheme is proposed using Einstein-Podolsky-Rosen pairs and local unitary operators. The scheme has the advantages of both higher capacity and reduced storage space. A security analysis is also given.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11601140, 11401082, 11701260) and a program funded by the Education Department of Liaoning Province (Grant No. LN2019Q15).
Abstract: We deal with the boundedness of solutions to a class of fully parabolic quasilinear repulsion chemotaxis systems $u_t=\nabla\cdot(\phi(u)\nabla u)+\nabla\cdot(\psi(u)\nabla v)$, $v_t=\Delta v-v+u$, $(x,t)\in\Omega\times(0,T)$, under homogeneous Neumann boundary conditions in a smooth bounded domain $\Omega\subset\mathbb{R}^N$ ($N\geq 3$), where $0<\psi(u)\leq K(u+1)^{\alpha}$ and $K_1(s+1)^m\leq\phi(s)\leq K_2(s+1)^m$ with $\alpha,K,K_1,K_2>0$ and $m\in\mathbb{R}$. It is shown that if $\alpha-m<\frac{4}{N+2}$, then for any sufficiently smooth initial data, the classical solutions to the system are uniformly-in-time bounded. This extends the known result for the corresponding model with linear diffusion.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 12101335 and 12271271); the Natural Science Foundation of Tianjin (Grant No. 21JCQNJC00020); the Fundamental Research Funds for the Central Universities, Nankai University (Grant Nos. 63211088 and 63221050); the National Natural Science Foundation of China (Grant No. 12101332); Shenzhen Wukong Investment Company; the Fundamental Research Funds for the Central Universities (Grant No. ZB22000105); the China National Key R&D Program (Grant Nos. 2019YFC1908502, 2022YFA1003703, 2022YFA1003802, 2022YFA1003803); and the National Natural Science Foundation of China (Grant Nos. 12271271, 11925106, 12231011, 11931001 and 11971247).
Abstract: This paper establishes the asymptotic independence between the quadratic form $z^{T}Az$ and the maximum $\max_{1\leq i\leq p}|z_i|$ of a sequence of independent sub-Gaussian random variables $z=(z_1,\ldots,z_p)^{T}$. Based on this theoretical result, we find the asymptotic joint distribution of the quadratic form and the maximum, which can be applied to high-dimensional testing problems. By combining the sum-type test and the max-type test, we propose Fisher's combination tests for the one-sample mean test and the two-sample mean test. Under this novel general framework, several strong assumptions in the existing literature are relaxed. Monte Carlo simulations show that the proposed tests are strongly robust to both sparse and dense data.
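Illustrative note: the combination step described in the abstract above can be sketched with the classical Fisher rule, assuming that the sum-type and max-type tests each yield a p-value and that the two p-values are (asymptotically) independent under the null. The statistic is then -2(log p_sum + log p_max), referred to a chi-squared distribution with 4 degrees of freedom. The function names and the way the two input p-values are obtained are assumptions for illustration and are not the authors' implementation.

```python
import math


def chi2_sf_df4(x):
    # Survival function of the chi-squared distribution with 4 degrees of
    # freedom, which has the closed form exp(-x/2) * (1 + x/2) for x >= 0.
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


def fisher_combine(p_sum, p_max):
    # p_sum and p_max are hypothetical inputs: the p-values of a sum-type
    # test and a max-type test, assumed independent under the null.
    for p in (p_sum, p_max):
        if not 0.0 < p <= 1.0:
            raise ValueError("p-values must lie in (0, 1]")
    stat = -2.0 * (math.log(p_sum) + math.log(p_max))
    return stat, chi2_sf_df4(stat)


stat, p_combined = fisher_combine(0.05, 0.05)
print(stat, p_combined)
```

The combined test rejects when either component p-value is very small, which is the intuition behind robustness to both sparse alternatives (picked up by the max-type test) and dense alternatives (picked up by the sum-type test).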
Funding: Supported by the Postdoctoral Oriented Fund of Yunnan Province under Grant No. W8163007 and the National Natural Science Foundation of China under Grant No. 11961079.
Abstract: This paper investigates the moment selection and parameter estimation problem for high-dimensional unconditional moment conditions. First, the authors propose a Fantope projection and selection (FPS) approach to distinguish the informative and uninformative moments in high-dimensional unconditional moment conditions. Second, for the selected unconditional moment conditions, the authors present a generalized empirical likelihood (GEL) approach to estimate the unknown parameters. The proposed method is computationally feasible and can efficiently avoid the well-known ill-posed problem of the GEL approach in the analysis of high-dimensional unconditional moment conditions. Under some regularity conditions, the authors show the consistency of the selected moment conditions, and the consistency and asymptotic normality of the proposed GEL estimator. Two simulation studies are conducted to investigate the finite-sample performance of the proposed methodologies. The proposed method is illustrated by a real example.
Funding: Supported by the NSF grant DMS-2111383, Air Force Office of Scientific Research FA9550-18-1-0257, and the NSF grant DMS-2011838.
Abstract: This paper reviews the adaptive sparse grid discontinuous Galerkin (aSG-DG) method for computing high-dimensional partial differential equations (PDEs) and its software implementation. The C++ software package AdaM-DG, which implements the aSG-DG method, is available on GitHub at https://github.com/JuntaoHuang/adaptive-multiresolution-DG. The package is capable of treating a large class of high-dimensional linear and nonlinear PDEs. We review the essential components of the algorithm and the functionality of the software, including the multiwavelets used, the assembly of bilinear operators, and the fast matrix-vector product for data with hierarchical structures. We further demonstrate the performance of the package by reporting the numerical error and the CPU cost for several benchmark tests, including linear transport equations, wave equations, and Hamilton-Jacobi (HJ) equations.
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2021YFE0113100); the National Natural Science Foundation of China (Grant Nos. 11904357, 12174367, 12204458, 12374338, 62071064, and 62322513); the Innovation Program for Quantum Science and Technology (Grant No. 2021ZD0301200); the Fundamental Research Funds for the Central Universities; the USTC Tang Scholarship; the Science and Technological Fund of Anhui Province for Outstanding Youth (Grant No. 2008085J02); the China Postdoctoral Science Foundation (Grant No. 2021M700138); the China Postdoctoral for Innovative Talents (Grant No. BX2021289); and the Shanghai Municipal Science and Technology Fundamental Project (Grant No. 21JC1405400).
Abstract: Controlled quantum teleportation (CQT), which is regarded as the prelude and backbone for a genuine quantum internet, reveals the cooperation, supervision, and control relationship among the sender, receiver, and controller in the quantum network within the simplest unit. Compared with low-dimensional counterparts, high-dimensional CQT can exhibit larger information transmission capacity and higher superiority of the controller's authority. In this article, we report a proof-of-principle experimental realization of three-dimensional (3D) CQT with a fidelity of 97.4% ± 0.2%. To reduce the complexity of the circuit, we simulate a standard 4-qutrit CQT protocol in a 9×9-dimensional two-photon system with high-quality operations. The corresponding control powers are 48.1% ± 0.2% for teleporting a qutrit and 52.8% ± 0.3% for teleporting a qubit in the experiment, both of which are higher than the theoretical value of the control power in the 2-dimensional CQT protocol (33%). The results fully demonstrate the advantages of high-dimensional multi-partite entangled networks and provide new avenues for constructing complex quantum networks.