Funding: supported by the Fundamental Research Funds for the Central Universities under Grants No. ZYGX2014J051 and No. ZYGX2014J066; Science and Technology Projects in Sichuan Province under Grants No. 2015JY0178, No. 2016FZ0002, No. 2014GZ0109, No. 2015KZ002 and No. 2015JY0030; and the China Postdoctoral Science Foundation under Grant No. 2015M572464.
Abstract: Most association rule algorithms currently target Boolean attributes and single-level rule mining. Real-world data, however, comes in many types, and multi-level and quantitative attributes are receiving increasing attention. The most important step is mining frequent itemsets. This paper proposes an algorithm called fuzzy multiple-level association (FMA) rules to mine frequent itemsets. Unlike many previously proposed algorithms, which build on Apriori, it is based on an improved Eclat algorithm. Frequent itemsets of quantitative data are analyzed using fuzzy theory, by partitioning the concept hierarchy and softening the boundaries of attribute values and frequencies. The method is described using the vertical data format and the improved Eclat algorithm, and is applied to Beijing logistics route data. Experiments show that the algorithm performs well, with better effectiveness and high efficiency.
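The combination of a vertical data layout with fuzzy membership can be illustrated with a minimal sketch. It is not the paper's improved Eclat or its FMA algorithm, whose details the abstract does not give. It is plain depth-first Eclat under several assumptions: each fuzzy item maps transaction ids to a membership degree, the degree of an itemset in a transaction is the minimum of its items' degrees, and support is the sum of degrees divided by the number of transactions. The function name `fuzzy_eclat` and the item names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Vertical layout (assumed): fuzzy item -> {transaction id: membership degree}.
TidMap = Dict[int, float]


def fuzzy_eclat(vertical: Dict[str, TidMap], n_transactions: int,
                min_support: float) -> Dict[FrozenSet[str], float]:
    """Depth-first Eclat over fuzzy tid-maps; returns itemset -> fuzzy support."""
    frequent: Dict[FrozenSet[str], float] = {}

    def support(tids: TidMap) -> float:
        # Assumed fuzzy support: summed membership over all transactions.
        return sum(tids.values()) / n_transactions

    def extend(prefix: FrozenSet[str], candidates: List[Tuple[str, TidMap]]) -> None:
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = support(tids)
            next_candidates = []
            for other, other_tids in candidates[i + 1:]:
                # Intersect tid-maps; min is the assumed fuzzy AND.
                joined = {t: min(d, other_tids[t])
                          for t, d in tids.items() if t in other_tids}
                if support(joined) >= min_support:
                    next_candidates.append((other, joined))
            extend(itemset, next_candidates)

    start = [(item, tids) for item, tids in sorted(vertical.items())
             if support(tids) >= min_support]
    extend(frozenset(), start)
    return frequent


# Hypothetical fuzzy items for four logistics records (illustrative data only).
vertical = {
    "distance.short": {1: 0.75, 2: 0.5, 3: 0.25},
    "load.heavy": {1: 0.5, 2: 0.75, 4: 1.0},
    "cost.low": {3: 0.25},
}
result = fuzzy_eclat(vertical, n_transactions=4, min_support=0.25)
```

Because each item already carries its transaction list, the support of a larger itemset comes from intersecting two maps rather than rescanning the data, which is the usual reason for choosing Eclat over Apriori.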
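Abstract: To address the low recognition accuracy and the difficulty of determining entity boundaries in entity recognition for maritime cargo emails, a recognition method combining deep learning with rule matching is proposed. The deep learning component adds character-level features of words to the BiLSTM-CRF (Bidirectional Long Short-Term Memory-Conditional Random Field) model and incorporates a multi-head attention mechanism to capture long-distance dependencies in email text; the rule matching component recognizes entities using rules formulated from the characteristics of domain entities. Based on the characteristics of cargo emails, the corpus is annotated and divided into five categories: cargo name, cargo weight, loading/discharging port, laycan, and commission. Several groups of comparative experiments on a self-built corpus show that the proposed method achieves an F1 score of 79.3% on entity recognition for maritime cargo emails.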