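Abstract: This paper proposes MEMR (multi-instance embedding learning with instance affinity mining and reinforcement), an instance affinity mining and reinforcement algorithm for multi-instance embedding learning within multi-instance learning (MIL). MEMR comprises three techniques. The affinity mining technique uses a custom affinity metric to first select an initial set of negative representative instances from the negative instance space, and then selects an initial set of positive representative instances according to the differences between positive and negative instances. The affinity reinforcement technique evaluates the positive and negative affinity of the initial positive and negative representative instance sets with respect to the entire instance space, yielding representative instance sets with stronger overall affinity. The bag embedding technique uses an embedding function to convert each bag into a single vector for learning. Experiments cover four application domains and seven comparison algorithms. The results show that MEMR is overall more accurate than the comparison algorithms, with a significant advantage on image retrieval and web page recommendation datasets.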
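The bag embedding step can be illustrated with a minimal sketch. The abstract does not give MEMR's embedding function, so the nearest-instance distance used here is an assumption, and `embed_bag`, `bag`, and `representatives` are hypothetical names chosen for illustration only.

```python
import math

def embed_bag(bag, representatives):
    """Map a bag (a list of instance vectors) to a single vector.

    Assumption: entry k is the Euclidean distance from representative k
    to its nearest instance in the bag. This is a generic embedding,
    not the function defined in the paper.
    """
    return [min(math.dist(inst, rep) for inst in bag) for rep in representatives]

# Hypothetical toy data: one bag with two 2-D instances,
# and two representative instances.
bag = [(0.0, 0.0), (3.0, 4.0)]
representatives = [(0.0, 0.0), (6.0, 8.0)]
embedding = embed_bag(bag, representatives)
```

Each bag becomes a vector with one entry per representative instance, so an ordinary single-instance classifier can then be trained on the embedded bags.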
Funding: The National Natural Science Foundation of China (No. 50674086), the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20060290508), and the Science and Technology Fund of China University of Mining and Technology (No. 2007B016).
Abstract: An association rule mining method based on semantic relativity is proposed to address the large number of candidate itemsets and the high time complexity of traditional association rule mining. The method uses the semantic relativity of ontology concepts to describe complicated domain relationships. Candidate itemsets with low semantic relativity are filtered out, which reduces the number of candidate itemsets. In the semantic relativity computation, the ontology hierarchy is treated as a directed acyclic graph rather than a hierarchy tree. The computation takes into account not only direct hierarchy relationships but also indirect hierarchy relationships and other typical semantic relationships. Experimental results show that the proposed method effectively reduces the number of candidate itemsets and improves the efficiency of association rule mining.
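The filtering step can be sketched as follows. This is only an illustration under stated assumptions: the pairwise scores in `table`, the `threshold` value, and the rule that every pair of items must reach the threshold are all hypothetical, and the paper's ontology-based relativity computation is replaced here by a simple lookup.

```python
from itertools import combinations

def relatedness(a, b, table):
    """Look up a pairwise score; unknown pairs count as unrelated (0.0)."""
    return table.get(frozenset((a, b)), 0.0)

def filter_candidates(candidates, table, threshold):
    """Keep itemsets whose every item pair scores at least `threshold`.

    Assumption: an itemset is judged by its weakest pair. The abstract
    does not say how pairwise scores are aggregated.
    """
    kept = []
    for itemset in candidates:
        pairs = combinations(sorted(itemset), 2)
        if all(relatedness(a, b, table) >= threshold for a, b in pairs):
            kept.append(itemset)
    return kept

# Hypothetical scores that an ontology-based measure might produce.
table = {
    frozenset(("bread", "butter")): 0.8,
    frozenset(("bread", "nails")): 0.1,
    frozenset(("butter", "nails")): 0.05,
}
candidates = [{"bread", "butter"}, {"bread", "nails"}, {"bread", "butter", "nails"}]
kept = filter_candidates(candidates, table, threshold=0.5)
```

Only the itemsets that survive the filter need their support counted, which is where the reduction in work would come from.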
Funding: Supported in part by the National High-tech R&D Program of China under Grants No. 2012AA012600, 2011AA010702, 2012AA01A401, and 2012AA01A402; the National Natural Science Foundation of China under Grant No. 60933005; the National Science and Technology Ministry of China under Grant No. 2012BAH38B04; and the National 242 Information Security of China under Grant No. 2011A010.
Abstract: Similarity search is one of the fundamental components of time series data mining tasks such as clustering, classification, and association rule mining. Many methods have been proposed to measure the similarity between time series, including Euclidean distance, Manhattan distance, and dynamic time warping (DTW). Compared with the other measures, DTW has been suggested to provide a more robust similarity measure and to find the optimal alignment between time series. However, due to its quadratic time and space complexity, DTW is not suitable for large time series datasets. Many improved algorithms have been proposed for DTW search in large databases, such as approximate search and exact indexed search. Unlike these modified algorithms, this paper presents a novel parallel scheme for fast DTW-based similarity search, called MRDTW (MapReduce-based DTW). The experimental results show that the approach not only retains the accuracy of the original DTW but also greatly improves the efficiency of similarity measurement on large time series.
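For reference, the quadratic cost of DTW mentioned above comes from filling a full cost table. The sketch below is the textbook dynamic-programming DTW, not the MRDTW scheme; it assumes one-dimensional series and absolute difference as the local cost, and `dtw_distance` is a name chosen for illustration.

```python
def dtw_distance(x, y):
    """Classic DTW between two 1-D sequences.

    Fills an (n+1) x (m+1) table, hence O(n*m) time and space.
    Assumption: local cost is the absolute difference of two points.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# A series and a time-stretched copy of it align perfectly.
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
# Two constant series offset by 1 pay that offset at each aligned step.
shifted = dtw_distance([0, 0], [1, 1])
```

Because each pairwise comparison is independent of the others, a similarity search over many series can in principle be spread across workers, which is the kind of parallelism a MapReduce-style scheme relies on.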