Funding: This research is supported in part by HKRGC Grant 7017/07P, HKU CRCG Grants, the HKU strategic theme grant on computational sciences, the HKU Hung Hing Ying Physical Science Research Grant, National Natural Science Foundation of China Grant No. 10971075 and Guangdong Provincial Natural Science Grant No. 9151063101000021. A preliminary version of this paper was presented at the OSB2009 conference and published in the corresponding conference proceedings [25]. The authors would like to thank the anonymous referees for their helpful comments and suggestions.
Abstract: Predicting protein functions is an important issue in the post-genomic era. This paper studies several network-based kernels, including the local linear embedding (LLE) kernel, the diffusion kernel and the Laplacian kernel, to uncover the relationship between protein functions and protein-protein interactions (PPI). The authors first construct kernels based on PPI networks, then apply support vector machine (SVM) techniques to classify proteins into different functional groups. Five-fold cross-validation is then applied to 359 selected GO terms to compare the performance of the different kernels against guilt-by-association methods, including neighbor counting and Chi-square methods. Finally, the authors predict the functions of some unknown genes and verify the accuracy of these predictions in part using information from other data sources.
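The kernel construction step can be illustrated with a minimal sketch of a diffusion kernel, K = exp(-βL), where L is the graph Laplacian of the interaction network. Everything concrete here is an assumption for illustration: the six-protein toy network, the value β = 0.5, the helper names, and the truncated Taylor series used to keep the example dependency-free. A simple kernel-weighted vote stands in for the SVM classifier described in the abstract; it is not the authors' method.

```python
def graph_laplacian(n, edges):
    """Unnormalised Laplacian L = D - A of an undirected graph on n nodes."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][j] -= 1.0
        L[j][i] -= 1.0
        L[i][i] += 1.0
        L[j][j] += 1.0
    return L


def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def diffusion_kernel(L, beta=0.5, terms=40):
    """Approximate K = exp(-beta * L) with a truncated Taylor series.

    Adequate for a small toy graph; a linear algebra library would be
    the usual choice for a real PPI network.
    """
    n = len(L)
    M = [[-beta * x for x in row] for row in L]
    K = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in K]  # current term M^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, M)]
        K = [[K[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return K


def kernel_vote(K, labels):
    """Score each unlabelled node by the kernel-weighted sum of known labels.

    labels maps node index -> +1 (has the function) or -1 (lacks it).
    This is a stand-in for an SVM trained on the kernel matrix.
    """
    return {i: sum(K[i][j] * y for j, y in labels.items())
            for i in range(len(K)) if i not in labels}


# Hypothetical toy network: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
K = diffusion_kernel(graph_laplacian(6, edges), beta=0.5)
scores = kernel_vote(K, {0: +1, 5: -1})

for node, s in sorted(scores.items()):
    print(node, "has function" if s > 0 else "lacks function", round(s, 4))
```

In this toy network, proteins in the same triangle as the positively labelled protein receive positive scores, which is the intuition behind using network kernels: similarity decays with distance in the interaction graph rather than depending on direct neighbours only.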
Funding: Supported by the Key-Area Research and Development Program of Guangdong Province (No. 2020B010166004) and the National Natural Science Foundation of China (No. 52077081).
Abstract: As wind farms are commonly installed in areas with abundant wind resources, the spatial dependence of wind speed among nearby wind farms should be considered when modeling a power system with large-scale wind power. In this paper, a novel bivariate non-parametric copula, the bivariate diffusive kernel (BDK) copula, is proposed to formulate the dependence between random variables. The BDK copula is then extended to higher dimensions using the pair-copula method; the result, named the pair diffusive kernel (PDK) copula, offers flexibility in formulating the complicated dependence structure of multiple random variables. A quasi-Monte Carlo sampling procedure, based on the combination of the Sobol sequence and the Rosenblatt transformation of the PDK copula, is also developed to generate correlated wind speed samples. The proposed method is applied to solve probabilistic optimal power flow (POPF) problems. The validity of the BDK copula is checked against the definition of a copula. Then, three different data sets are used in various goodness-of-fit tests to verify the superior performance of the PDK copula in formulating the dependence structure of wind speeds at different wind farms. Furthermore, samples obtained from the PDK copula are used to solve POPF problems modeled on three modified IEEE 57-bus power systems. Compared with the Gaussian, t, and parametric pair copulas, the PDK copula gives superior results in formulating the complicated dependence and thus in solving the POPF problems.
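The sampling idea (low-discrepancy points pushed through an inverse Rosenblatt transformation to obtain correlated samples) can be sketched as follows. The abstract does not give the PDK copula's form, so a Clayton copula with a closed-form conditional inverse is used here as a stand-in. A Halton sequence replaces the Sobol sequence to keep the example dependency-free. The Weibull marginals, all parameter values and all function names are likewise hypothetical.

```python
import math


def van_der_corput(i, base):
    """i-th element (i >= 1) of the radical-inverse sequence in the given base."""
    x, f = 0.0, 1.0 / base
    while i > 0:
        i, digit = divmod(i, base)
        x += digit * f
        f /= base
    return x


def halton_2d(n):
    """First n points of the 2-D Halton sequence (bases 2 and 3), all in (0, 1)."""
    return [(van_der_corput(i, 2), van_der_corput(i, 3)) for i in range(1, n + 1)]


def clayton_inverse_rosenblatt(w1, w2, theta):
    """Map independent uniforms (w1, w2) to a Clayton-dependent pair (u1, u2).

    u1 = w1, and u2 solves C(u2 | u1) = w2 for the Clayton copula (theta > 0).
    """
    u1 = w1
    inner = u1 ** (-theta) * (w2 ** (-theta / (1.0 + theta)) - 1.0) + 1.0
    return u1, inner ** (-1.0 / theta)


def weibull_ppf(u, shape, scale):
    """Inverse CDF of a Weibull distribution (assumed wind speed marginal)."""
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)


def kendall_tau(pairs):
    """Sample Kendall's tau: (concordant - discordant) / number of pairs."""
    n, s = len(pairs), 0
    for a in range(n):
        for b in range(a + 1, n):
            prod = (pairs[a][0] - pairs[b][0]) * (pairs[a][1] - pairs[b][1])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)


theta = 2.0  # hypothetical dependence parameter; Clayton tau = theta / (theta + 2)
uniforms = [clayton_inverse_rosenblatt(w1, w2, theta) for w1, w2 in halton_2d(512)]

# Hypothetical Weibull marginals (shape 2.0, scale 8.0 m/s) for two wind farms.
wind_speeds = [(weibull_ppf(u1, 2.0, 8.0), weibull_ppf(u2, 2.0, 8.0))
               for u1, u2 in uniforms]

tau = kendall_tau(uniforms)
print("sample Kendall tau:", round(tau, 3), "theoretical:", theta / (theta + 2.0))
```

Because the marginal transforms are monotone, the wind speed pairs inherit the rank dependence of the copula samples. Extending this to more than two wind farms would chain further conditional inverses, one per additional variable.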