Funding: Supported by the National Natural Science Foundation of China (60573095).
Abstract: This work proposes Kendall correlation based collaborative filtering algorithms for recommender systems. The Kendall correlation method measures the correlation among users by considering the relative order of their ratings. Because the Kendall-based algorithm rests on a more general model, it could be applied more widely in e-commerce. The work also finds that, for both the Pearson and Kendall algorithms, using only positively correlated neighbors in prediction achieves higher accuracy than using all neighbors, with only a small loss of coverage.
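The idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tau-b tie handling, the mean-centered weighted prediction rule, the function names `kendall_tau` and `predict`, and the toy `ratings` data are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt

def kendall_tau(ratings_u, ratings_v):
    """Kendall tau-b between two users, computed over co-rated items only.

    Each argument is a dict mapping item -> rating.
    """
    common = sorted(set(ratings_u) & set(ratings_v))
    concordant = discordant = ties_u = ties_v = 0
    for i, j in combinations(common, 2):
        du = ratings_u[i] - ratings_u[j]
        dv = ratings_v[i] - ratings_v[j]
        if du == 0 and dv == 0:
            continue              # tied for both users: ignored by tau-b
        if du == 0:
            ties_u += 1           # tied only for user u
        elif dv == 0:
            ties_v += 1           # tied only for user v
        elif du * dv > 0:
            concordant += 1       # both users order the pair the same way
        else:
            discordant += 1       # the users order the pair oppositely
    denom = sqrt((concordant + discordant + ties_u)
                 * (concordant + discordant + ties_v))
    return (concordant - discordant) / denom if denom else 0.0

def predict(target, item, all_ratings, positive_only=True):
    """Mean-centered weighted prediction (an assumed, common CF rule).

    Returns None when no usable neighbor rated the item, which is how
    restricting to positive neighbors can reduce coverage.
    """
    r_t = all_ratings[target]
    mean_t = sum(r_t.values()) / len(r_t)
    num = den = 0.0
    for user, r in all_ratings.items():
        if user == target or item not in r:
            continue
        w = kendall_tau(r_t, r)
        if w == 0 or (positive_only and w < 0):
            continue
        mean_u = sum(r.values()) / len(r)
        num += w * (r[item] - mean_u)
        den += abs(w)
    return mean_t + num / den if den else None

# Hypothetical toy data: bob ranks items like alice, carol ranks them oppositely.
ratings = {
    "alice": {"a": 5, "b": 3, "c": 1},
    "bob":   {"a": 4, "b": 2, "c": 1, "d": 5},
    "carol": {"a": 1, "b": 3, "c": 5, "d": 1},
}
print(predict("alice", "d", ratings, positive_only=True))
print(predict("alice", "d", ratings, positive_only=False))
```

Because only the relative order of ratings enters `kendall_tau`, two users who use the rating scale differently but rank items identically still receive a correlation of 1.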
Funding: This research was supported by the National Natural Science Foundation of China under Grant No. 61772280, by the China Special Fund for Meteorological Research in the Public Interest under Grant GYHY201306070, and by the Jiangsu Province Innovation and Entrepreneurship Training Program for College Students under Grant No. 201810300079X.
Abstract: The Apriori algorithm is often used in traditional association rule mining to search for high-frequency patterns. Correlation rules are then obtained by checking the correlation of the item sets, but this tends to ignore association rules with low support and high correlation. To address this problem, some scholars proposed a positive correlation coefficient based on the Phi correlation, which can mine item sets with low support but high correlation. Although that algorithm prunes the search space, it does not clearly reduce the running time on large data sets, and the correlation pairs it finds can be meaningless. This paper presents an improved mining algorithm with new association rules based on interestingness for correlation pairs, which uses an upper bound on the interestingness of supersets to prune the search space. It greatly reduces the running time and filters out meaningless correlation pairs according to redundancy constraints. Compared with the algorithm based on the Phi correlation coefficient, the new algorithm significantly reduces the running time and prunes redundant correlation pairs from the result, improving both mining efficiency and accuracy.
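To make the "low support, high correlation" case concrete, the sketch below computes the standard Phi correlation coefficient of an item pair from its supports. It illustrates only the baseline measure mentioned in the abstract, not the paper's interestingness measure or its pruning bound, which the abstract does not specify. The function names, the toy transactions, and the 0.2 support threshold are assumptions.

```python
from math import sqrt

def support(transactions, *items):
    """Fraction of transactions containing every item in `items`."""
    return sum(all(i in t for i in items) for t in transactions) / len(transactions)

def phi(transactions, a, b):
    """Phi correlation coefficient of items a and b, written with supports."""
    s_a = support(transactions, a)
    s_b = support(transactions, b)
    s_ab = support(transactions, a, b)
    denom = sqrt(s_a * s_b * (1 - s_a) * (1 - s_b))
    return (s_ab - s_a * s_b) / denom if denom else 0.0

# Hypothetical baskets: bread and milk are frequent but statistically
# independent; caviar and vodka are rare but always bought together.
transactions = [
    {"bread", "milk"}, {"bread", "milk"},
    {"bread"}, {"bread"}, {"bread"},
    {"milk"}, {"milk"},
    {"eggs"}, {"eggs"},
    {"caviar", "vodka"},
]

MIN_SUPPORT = 0.2  # assumed threshold for a frequency-based miner

print(support(transactions, "caviar", "vodka"), phi(transactions, "caviar", "vodka"))
print(support(transactions, "bread", "milk"), phi(transactions, "bread", "milk"))
```

In this toy data the caviar/vodka pair falls below the assumed support threshold, so a frequency-first search would discard it, even though its Phi value is the highest possible.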
Funding: Mainly provided by the National Natural Science Foundation of China (Grant Nos. 40537034 and 40775012), the Natural Science Fund for Universities in Jiangsu Province (Grant Nos. 06KJA17021 and 08KJA170002), the Meteorology Fund of the Ministry of Science and Technology [Grant No. GYHY (QX) 2007-6-26], the Qing-Lan Project for cloud-fog-precipitation-aerosol study in Jiangsu Province, and the Graduate Student Innovation Plan in the Universities of Jiangsu Province (CX09B 226Z).
Abstract: The microphysical properties of a long-lasting heavy fog event are examined based on the results from a comprehensive field campaign conducted during the winter of 2006 at Pancheng (32.2°N, 118.7°E), Jiangsu Province, China. The key microphysical properties (liquid water content, fog droplet concentration, mean radius, and standard deviation) generally exhibited positive correlations with one another, and the 5-min-average maximum value of fog liquid water content was sometimes greater than 0.5 g m⁻³. Further analysis shows that the unique combination of positive correlations likely arose from the simultaneous supply of moist air and fog condensation nuclei associated with the advection of warm air, which further led to high liquid water content. High values of liquid water content and droplet concentration together caused low visibility (<50 m) for a prolonged period of about 40 h. Examination of the microphysical relationships conditioned by the corresponding autoconversion threshold functions shows that the collision-coalescence process was sometimes likely to occur, weakening the positive correlations induced by droplet activation and condensational growth. Statistical analysis shows that the observed droplet size distribution can be described well by the Gamma distribution.
Abstract: Applications based on social network text are developing rapidly, and studying text features and classification is of great value for extracting important information. This paper introduces common feature selection algorithms and feature representation methods, the basic principles, advantages, and disadvantages of SVM and KNN, and the evaluation metrics for classification algorithms. For the mutual information feature selection function, it describes the processing flow, shortcomings, and optimizations. To address its failure to balance positively and negatively correlated features, a balance weight attribute factor and a feature difference factor are introduced. The experimental section describes the specific process: word segmentation, stop-word removal, feature selection with various algorithms (including the optimized mutual information), and TF-IDF weighting. Under the two classification algorithms, SVM and KNN, the feature selection algorithms are compared according to the evaluation metrics. Experiments show that the optimized mutual information feature selection performs well, and that it performs better under the SVM classifier than under KNN, which supports its validity.
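The weakness described above can be seen in a minimal sketch of the baseline (unoptimized) mutual information score between a term and a class, in the pointwise form commonly used for text feature selection. The balance weight attribute factor and feature difference factor are not defined in the abstract, so they are not reproduced here. The function name `pointwise_mi` and the toy corpus are assumptions.

```python
from math import log

def pointwise_mi(docs, labels, term, cls):
    """log( P(term, cls) / (P(term) * P(cls)) ), estimated from document counts.

    `docs` is a list of token sets; `labels` is the parallel list of classes.
    """
    n = len(docs)
    n_t = sum(term in d for d in docs)                 # docs containing term
    n_c = sum(l == cls for l in labels)                # docs in the class
    n_tc = sum(term in d and l == cls for d, l in zip(docs, labels))
    if n_tc == 0 or n_t == 0 or n_c == 0:
        return float("-inf")                           # never co-occur
    return log(n_tc * n / (n_t * n_c))

# Hypothetical, already segmented and stop-word-filtered corpus.
docs = [
    {"goal", "match"}, {"goal", "team"}, {"goal", "vote"},
    {"vote", "team"}, {"vote", "law"}, {"vote", "law"},
]
labels = ["sport", "sport", "sport", "politics", "politics", "politics"]

for term in ("goal", "team", "vote"):
    print(term, pointwise_mi(docs, labels, term, "sport"))
```

Here "goal" scores positive and "vote" scores negative for the class "sport", although "vote" is just as informative (its presence argues against the class). Ranking features by the raw score therefore favors positively correlated terms, which is the imbalance the abstract says the added factors are meant to correct.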
Funding: Supported by the National Natural Science Foundation of China (No. 91231108), the Breakthrough Project of Strategic Priority Program of the Chinese Academy of Sciences (No. XDB13000000), the Key Research Program of the Chinese Academy of Sciences, and the Youth Innovation Promotion Association, Chinese Academy of Sciences (to GDW).
Abstract: Hypoxia represents one of the most extreme environmental conditions for both human beings and animals living at high altitudes (Zhao et al., 2009). Over the past few years, great attention has been focused on the genetic bases of adaptation to high-altitude environments (Bigham et al., 2010; Simonson et al., 2010). The domestic dog (Canis familiaris) is the first animal that developed an intimate relationship with human beings. Dogs migrated with human beings and have adapted to a variety of ecological niches (Savolainen et al., 2002). Our previous research revealed parallel evolution and convergent evolution in the adaptation of dogs and humans to the high-altitude environment of the Tibetan plateau (Wang et al., 2013, 2014), suggesting that exploring the adaptation of domestic dogs to high-altitude hypoxia is an interesting and important question.
Funding: Supported in part by the Creative Research Group Fund of the National Natural Science Foundation of China (No. 10121101) and the "985" Project from the Ministry of Education in China.
Abstract: The stochastic comparison and preservation of positive correlations for Lévy-type processes on $\mathbb{R}^d$ are studied under the condition that the Lévy measure $\nu$ satisfies $\int_{\{0<|z|\le 1\}} |z|\,|\nu(x,\mathrm{d}z) - \nu(x,\mathrm{d}(-z))| < \infty$ for $x \in \mathbb{R}^d$, and sufficient conditions and necessary conditions for them are obtained. In some cases the conditions for stochastic comparison are not only sufficient but also necessary.