Funding: Supported by the Natural Science Foundation of Jiangsu Province (No. BK2004142) and partly by the National Natural Science Foundation of China (No. 60275007).
Abstract: G-protein coupled receptors (GPCRs) are a class of seven-helix transmembrane proteins that have been used in bioinformatics as targets to facilitate drug discovery for human diseases. Although thousands of GPCR sequences have been collected, the ligand specificity of many GPCRs is still unknown and only one crystal structure of the rhodopsin-like family has been solved. Therefore, identifying GPCR types from sequence data alone has become an important research issue. In this study, a novel technique for identifying GPCR types is introduced, based on the weighted Levenshtein distance between two receptor sequences and the nearest neighbor method (NNM); it can deal directly with receptor sequences of different lengths. In our experiments on classifying four classes (acetylcholine, adrenoceptor, dopamine, and serotonin) of the rhodopsin-like family of GPCRs, the error rates from the leave-one-out procedure and the leave-half-out procedure were 0.62% and 1.24%, respectively. These results are superior to those of the covariant discriminant algorithm, the support vector machine method, and the NNM with Euclidean distance.
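As an illustration of the idea in this abstract, the following is a minimal sketch of a weighted Levenshtein distance combined with a 1-nearest-neighbor rule. It is not the authors' implementation: the function names, the unit default costs, and the toy sequences and labels are all assumptions made for the example, and the abstract does not state which substitution or insertion/deletion weights were used.

```python
from typing import Callable, List, Optional, Tuple


def unit_substitution(x: str, y: str) -> float:
    """Assumed default weight: 0 for a match, 1 for any mismatch."""
    return 0.0 if x == y else 1.0


def weighted_levenshtein(
    a: str,
    b: str,
    sub_cost: Callable[[str, str], float] = unit_substitution,
    indel_cost: float = 1.0,
) -> float:
    """Edit distance with a pluggable substitution weight.

    Works on sequences of different lengths, because insertions and
    deletions absorb the length difference.
    """
    # prev[j] holds the distance between a[:i-1] and b[:j]
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * indel_cost]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + indel_cost,            # delete ca
                curr[j - 1] + indel_cost,        # insert cb
                prev[j - 1] + sub_cost(ca, cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def nearest_neighbor_label(
    query: str, training: List[Tuple[str, str]]
) -> Optional[str]:
    """Return the label of the training sequence closest to the query."""
    best_label, best_dist = None, float("inf")
    for sequence, label in training:
        dist = weighted_levenshtein(query, sequence)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Hypothetical toy data, not real GPCR sequences.
toy_training = [("MKTLLA", "dopamine"), ("GGSWWPQ", "serotonin")]
print(nearest_neighbor_label("MKTLA", toy_training))
```

A residue-aware `sub_cost` (for example, a lower cost for chemically similar amino acids) could be passed in place of the unit cost; that choice is what makes the distance "weighted".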
Abstract: In this paper, the Edgeworth expansion for the nearest neighbor-kernel estimate and the random weighting approximation of the conditional density are given, and the consistency and convergence rate are proved.
Funding: Project (2012CB725403) supported by the National Basic Research Program of China; Projects (71210001, 51338008) supported by the National Natural Science Foundation of China; Project supported by the World Capital Cities Smooth Traffic Collaborative Innovation Center and the Singapore National Research Foundation under its Campus for Research Excellence and Technology Enterprise (CREATE) Programme.
Abstract: Short-term traffic flow prediction is one of the essential issues in intelligent transportation systems (ITS). A new two-stage traffic flow prediction method, named the AKNN-AVL method, is presented; it combines an advanced k-nearest neighbor (AKNN) method with a balanced binary tree (AVL) data structure to improve prediction accuracy. The AKNN method applies pattern recognition twice in the search process and considers the previous sequences of traffic flow to forecast the future traffic state. A clustering method and the balanced binary tree technique are introduced to build the case database and reduce the search time. To illustrate the effects of these developments, the prediction accuracies of the AKNN-AVL method, the k-nearest neighbor (KNN) method, and the auto-regressive and moving average (ARMA) method are compared. These methods are calibrated and evaluated with real-time data from a freeway traffic detector near the North 3rd Ring Road in Beijing under both normal and incident traffic conditions. The comparisons show that the AKNN-AVL method with the optimal neighbor and pattern size outperforms both the KNN method and the ARMA method under both normal and incident traffic conditions. In addition, combining the clustering method and the balanced binary tree technique with the prediction method can increase the search speed and respond rapidly to case database fluctuations.
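The pattern-matching idea behind the KNN baseline mentioned in this abstract can be sketched as follows. This is a plain KNN forecaster only: it does not reproduce the two-pass AKNN search, the clustering step, or the AVL-tree case database, whose details the abstract does not give. The function name `knn_forecast`, the Euclidean distance, the unweighted average, and the toy flow values are assumptions.

```python
import math
from typing import Sequence


def knn_forecast(history: Sequence[float], pattern_size: int, k: int) -> float:
    """Forecast the next flow value from the k most similar past patterns.

    The current pattern is the last `pattern_size` observations. Every
    earlier window of the same length with a known successor is a candidate.
    """
    if pattern_size < 1 or k < 1:
        raise ValueError("pattern_size and k must be positive")
    if len(history) <= pattern_size:
        raise ValueError("history must be longer than pattern_size")

    current = list(history[-pattern_size:])
    candidates = []
    # Each start index yields a window plus the value observed right after it.
    for start in range(len(history) - pattern_size):
        window = list(history[start:start + pattern_size])
        successor = history[start + pattern_size]
        candidates.append((math.dist(window, current), successor))

    # Keep the k closest windows and average their successors.
    candidates.sort(key=lambda item: item[0])
    nearest = candidates[:k]
    return sum(successor for _, successor in nearest) / len(nearest)


# Hypothetical flow counts (vehicles per interval), not data from the study.
toy_history = [10, 20, 30, 10, 20, 30, 10, 20]
print(knn_forecast(toy_history, pattern_size=2, k=2))
```

The linear scan over all candidate windows is the step a tree-structured case database would be meant to speed up.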
Funding: Supported by the National Natural Science Foundation of China (41805080); the Natural Science Foundation of Anhui Province, China (1708085QD89); the Key Research and Development Program Projects of Anhui Province, China (201904a07020099); and the Open Foundation Project of the Shenyang Institute of Atmospheric Environment, China Meteorological Administration (2016SYIAE14).
Abstract: In this paper, the application of an algorithm for precipitation retrieval based on Himawari-8 (H8) satellite infrared data is studied. Based on GPM precipitation data and H8 infrared spectrum channel brightness temperature data, a corresponding "precipitation field dictionary" and "channel brightness temperature dictionary" are formed. The retrieval of the precipitation field from brightness temperature data is studied through the classification rule of the k-nearest neighbor domain (KNN) and a regularization constraint. Firstly, the corresponding "dictionary" is constructed from the training sample database of matched GPM precipitation data and H8 brightness temperature data. Secondly, because precipitation characteristics of small-scale organizations in different storm environments are often repeated, KNN is used to identify the spectral brightness temperature signal as "precipitation" or "non-precipitation" based on "the dictionary". Finally, the precipitation field retrieval is carried out in the precipitation signal "subspace" using the regularization term constraint method. In the retrieval process, the contribution rate of the brightness temperature retrieval of each channel is determined by a Bayesian model averaging (BMA) model. The preliminary experimental results based on the "quantitative" evaluation indexes show that the precipitation retrieved from H8 correlates well with the GPM truth value, with a small error and a similar structure.
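The dictionary-based detection step described in this abstract can be sketched roughly as follows. This is a simplified stand-in rather than the published algorithm: the KNN majority vote follows the abstract, but the final estimate uses an inverse-distance-weighted mean of the raining neighbors in place of the regularized retrieval and BMA channel weighting, which the abstract does not specify. The name `knn_retrieve`, the units (K and mm/h), and all dictionary values are assumptions.

```python
import math
from typing import List, Sequence, Tuple

# One paired dictionary entry:
# (channel brightness temperatures in K, matched rain rate in mm/h)
DictEntry = Tuple[Sequence[float], float]


def knn_retrieve(
    brightness: Sequence[float],
    dictionary: List[DictEntry],
    k: int = 3,
    rain_threshold: float = 0.0,
) -> float:
    """Detect precipitation by KNN vote, then estimate a rain rate."""
    # Rank dictionary entries by distance in brightness-temperature space.
    ranked = sorted(dictionary, key=lambda e: math.dist(e[0], brightness))[:k]
    raining = [e for e in ranked if e[1] > rain_threshold]

    # Majority vote: "non-precipitation" unless most neighbors are raining.
    if len(raining) * 2 <= len(ranked):
        return 0.0

    # Stand-in for the regularized retrieval: inverse-distance weighting.
    weights = [1.0 / (math.dist(e[0], brightness) + 1e-6) for e in raining]
    return sum(w * e[1] for w, e in zip(weights, raining)) / sum(weights)


# Hypothetical two-channel dictionary, not GPM or H8 data.
toy_dictionary: List[DictEntry] = [
    ([210.0, 215.0], 8.0),
    ([212.0, 216.0], 6.0),
    ([280.0, 285.0], 0.0),
    ([282.0, 284.0], 0.0),
    ([278.0, 286.0], 0.0),
]
print(knn_retrieve([211.0, 215.5], toy_dictionary))
print(knn_retrieve([281.0, 285.0], toy_dictionary))
```

Restricting the estimate to the raining neighbors loosely mirrors the idea of retrieving within the precipitation signal "subspace" after detection.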