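Abstract: This paper explores the application of weighted association pattern mining to Vietnamese-English cross-language query expansion. It first proposes WARM-SCPIRI-CLQE, a weighted association pattern mining algorithm for cross-language query expansion built on a support-CPIR (Conditional Probability Increment Ratio)-interest evaluation framework, together with a Vietnamese-English cross-language query expansion model. It then proposes a Vietnamese-English cross-language query expansion algorithm based on user relevance feedback and weighted association pattern mining between terms. The algorithm translates the Vietnamese query into English with a machine translation system and retrieves English documents. The top-ranked documents from this initial retrieval are judged for relevance by the user, which yields the initial relevant document set. WARM-SCPIRI-CLQE then mines weighted association rules from that set, and expansion terms related to the original query are extracted from the rules to perform post-translation expansion of the Vietnamese-English cross-language query. In experiments on the NTCIR-5 CLIR corpus that compare the proposed algorithm with existing ones, the proposed algorithm improves Vietnamese-English cross-language retrieval performance and is more effective for long queries.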
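The rule-mining step described above can be sketched roughly as follows. This is a simplified, unweighted illustration and not the WARM-SCPIRI-CLQE algorithm itself: the abstract gives neither the term-weighting scheme nor the interest measure, so the sketch uses plain document-level support and assumes the commonly cited form CPIR(t|q) = (P(t|q) - P(t)) / (1 - P(t)). The function name, thresholds, and sample documents are hypothetical.

```python
def mine_expansion_terms(docs, query_terms, min_support=0.5, min_cpir=0.1):
    """Rank candidate expansion terms from rules q -> t (unweighted sketch).

    docs: list of token lists, standing in for the user-judged relevant set.
    query_terms: tokens of the translated (English) query.
    """
    n = len(docs)
    if n == 0:
        return []
    doc_sets = [set(d) for d in docs]
    query = set(query_terms)
    vocab = set().union(*doc_sets) - query

    scores = {}
    for q in query:
        p_q = sum(q in d for d in doc_sets) / n
        if p_q == 0:
            continue  # query term never occurs, so no rule q -> t exists
        for t in vocab:
            p_t = sum(t in d for d in doc_sets) / n
            p_qt = sum(q in d and t in d for d in doc_sets) / n
            # Skip infrequent pairs; also skip terms present in every
            # document, for which the assumed CPIR form is undefined.
            if p_qt < min_support or p_t >= 1.0:
                continue
            cpir = (p_qt / p_q - p_t) / (1.0 - p_t)  # assumed CPIR form
            if cpir >= min_cpir:
                scores[t] = max(scores.get(t, 0.0), cpir)
    # Highest CPIR first, ties broken alphabetically
    return sorted(scores, key=lambda t: (-scores[t], t))


# Hypothetical relevant-document set for the query "tsunami"
docs = [
    ["tsunami", "warning", "coast"],
    ["tsunami", "warning", "sea"],
    ["tsunami", "warning"],
    ["market", "price"],
]
print(mine_expansion_terms(docs, ["tsunami"]))
```

In this toy set only "warning" co-occurs with the query term often enough to pass the support threshold, so it is the single expansion term returned.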
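Abstract: To address query topic drift and term mismatch in natural language processing, this paper proposes an association pattern mining and rule expansion algorithm based on the CSC (Copulas-based Support and Confidence) framework. It then fuses statistically derived association patterns with word vectors that carry contextual semantic information, yielding a pseudo-relevance feedback query expansion model that combines association pattern mining with word vector learning. The model mines rule expansion terms from the pseudo-relevance feedback document set, trains word embeddings on the initially retrieved document set to obtain word vectors, computes the vector similarity between each rule expansion term and the original query, and keeps the rule expansion terms whose similarity is not below a threshold as the final expansion terms. Experimental results show that the proposed model effectively reduces query topic drift and term mismatch and improves retrieval performance. Compared with existing query expansion methods based on association patterns and on word vectors, the average gain in MAP (Mean Average Precision) reaches up to 17.52%, and the model is more effective for short queries. The proposed mining method can also be applied to other text mining tasks and to recommender systems to improve their performance.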
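The final filtering step (keep only rule expansion terms whose vector similarity to the query reaches a threshold) can be sketched as below. The abstract does not say how the query is represented or which similarity is used, so this sketch assumes cosine similarity against the mean of the query term vectors. The function names, the threshold value, and the toy vectors are hypothetical; in practice the vectors would come from embeddings trained on the initially retrieved documents.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def filter_expansion_terms(candidates, query_terms, vectors, threshold=0.5):
    """Keep candidates whose similarity to the query centroid is >= threshold."""
    q_vecs = [vectors[t] for t in query_terms if t in vectors]
    if not q_vecs:
        return []
    dim = len(q_vecs[0])
    # Assumption: the query is represented by the mean of its term vectors.
    centroid = [sum(v[i] for v in q_vecs) / len(q_vecs) for i in range(dim)]
    return [
        t for t in candidates
        if t in vectors and cosine(vectors[t], centroid) >= threshold
    ]


# Toy 2-dimensional vectors standing in for trained word embeddings
vectors = {
    "tsunami": [1.0, 0.0],
    "warning": [0.9, 0.1],
    "price": [0.0, 1.0],
}
print(filter_expansion_terms(["warning", "price"], ["tsunami"], vectors))
```

Here "price" is dropped because its vector is orthogonal to the query vector, which illustrates how the embedding check can prune rule-derived terms that would otherwise cause topic drift.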
Funding: The National Natural Science Foundation of China (71774105); the National Social Science Foundation of China (19BTJ053); the Soft Science Research Project in Shanxi Province of China (2019041012-4); the Institutions of Higher Learning in Shanxi Province Key Research Base of Humanities and Social Sciences Fund Project (2017324).
Abstract: Shanxi Province is a typical resource-based region. Despite years of economic transformation, its air quality has long remained poor. The rationalization of industrial structure gauges the effect of economic transformation, and it may also have an important impact on air quality. It is therefore necessary to study the non-linear impact that the rationalization of industrial structure has had on air quality at different stages, which is of positive significance for the continued transformation and upgrading of resource-based regions. This study constructs a threshold regression model based on panel data for 11 cities in Shanxi Province from 2004 to 2016. The results show that the rationalization of industrial structure had a double threshold effect on air quality under different threshold variables. On the whole, optimizing the rationalization of industrial structure in Shanxi Province can improve air quality, with the improvement effect first weakening and then strengthening. The current energy transformation and upgrading process should therefore focus on the rationalization of industrial structure to resolve the conflict between air quality and economic development.
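The abstract does not state the model specification. For orientation, a double-threshold panel regression is commonly written in the following form, given here only as an assumed sketch: $AQ_{it}$ is air quality in city $i$ and year $t$, $R_{it}$ is the rationalization of industrial structure, $q_{it}$ is the threshold variable, $\gamma_1 < \gamma_2$ are the two thresholds, $X_{it}$ are control variables, $\mu_i$ is a city fixed effect, and $I(\cdot)$ is the indicator function. All symbols are illustrative and not taken from the paper.

```latex
AQ_{it} = \mu_i
        + \beta_1 R_{it}\, I(q_{it} \le \gamma_1)
        + \beta_2 R_{it}\, I(\gamma_1 < q_{it} \le \gamma_2)
        + \beta_3 R_{it}\, I(q_{it} > \gamma_2)
        + \theta^{\top} X_{it} + \varepsilon_{it}
```

Under this form, the reported pattern of an effect that first weakens and then strengthens would correspond to the coefficients $\beta_1, \beta_2, \beta_3$ differing in magnitude across the three regimes.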