轮廓查询(Skyline)是一种典型的多目标优化问题.动态轮廓查询(Dynamic Skyline)是轮廓查询的一个重要变种,其目标是对于一个给定的查询点q,返回在各维度上最接近q的所有点.对比轮廓查询,动态轮廓查询根据查询点q的位置变动,可以更加灵...轮廓查询(Skyline)是一种典型的多目标优化问题.动态轮廓查询(Dynamic Skyline)是轮廓查询的一个重要变种,其目标是对于一个给定的查询点q,返回在各维度上最接近q的所有点.对比轮廓查询,动态轮廓查询根据查询点q的位置变动,可以更加灵活地返回查询结果.文中关注数据流上动态轮廓查询处理,此问题在多目标决策方面具有非常重要的应用.为有效地解决该问题,首先提出了一种组合式索引结构来管理数据流上的点,该索引结构包括两个部分:对整体数据使用分层次划分结构进行维护;对子划分内部数据采用倒排索引结构进行维护.该组合式索引结构具有更新快、过滤性能高、适合任意数据分布等优点,可以提高动态轮廓的查询处理效率.然后,基于该组合式索引结构,提出了基础的数据流上动态轮廓查询算法(Basic Dynamic Skyline Query over Data Stream,BDS2).通过维护少量的数据,BDS2可以快速地计算出数据流上的动态轮廓集合.然而BDS2在处理个别更新时,会有较大的时间延迟,为了更稳定地计算数据流上的动态轮廓,避免更新某些点时计算量急剧增加,进一步提出了改进的数据流上动态轮廓查询算法(Improved Dynamic Skyline Query over Data Stream,IDS2).最后,通过一系列的实验验证了文中所提出算法的有效性.展开更多
For the complex questions of Chinese question answering system, we propose an answer extraction method with discourse structure feature combination. This method uses the relevance of questions and answers to learn to ...For the complex questions of Chinese question answering system, we propose an answer extraction method with discourse structure feature combination. This method uses the relevance of questions and answers to learn to rank the answers. Firstly, the method analyses questions to generate the query string, and then submits the query string to search engines to retrieve relevant documents. Sec- ondly, the method makes retrieved documents seg- mentation and identifies the most relevant candidate answers, in addition, it uses the rhetorical relations of rhetorical structure theory to analyze the relationship to determine the inherent relationship between para- graphs or sentences and generate the answer candi- date paragraphs or sentences. Thirdly, we construct the answer ranking model,, and extract five feature groups and adopt Ranking Support Vector Machine (SVM) algorithm to train ranking model. Finally, it re-ranks the answers with the training model and fred the optimal answers. Experiments show that the proposed method combined with discourse structure features can effectively improve the answer extrac- ting accuracy and the quality of non-factoid an- swers. The Mean Reciprocal Rank (MRR) of the an- swer extraction reaches 69.53%.展开更多
文摘轮廓查询(Skyline)是一种典型的多目标优化问题.动态轮廓查询(Dynamic Skyline)是轮廓查询的一个重要变种,其目标是对于一个给定的查询点q,返回在各维度上最接近q的所有点.对比轮廓查询,动态轮廓查询根据查询点q的位置变动,可以更加灵活地返回查询结果.文中关注数据流上动态轮廓查询处理,此问题在多目标决策方面具有非常重要的应用.为有效地解决该问题,首先提出了一种组合式索引结构来管理数据流上的点,该索引结构包括两个部分:对整体数据使用分层次划分结构进行维护;对子划分内部数据采用倒排索引结构进行维护.该组合式索引结构具有更新快、过滤性能高、适合任意数据分布等优点,可以提高动态轮廓的查询处理效率.然后,基于该组合式索引结构,提出了基础的数据流上动态轮廓查询算法(Basic Dynamic Skyline Query over Data Stream,BDS2).通过维护少量的数据,BDS2可以快速地计算出数据流上的动态轮廓集合.然而BDS2在处理个别更新时,会有较大的时间延迟,为了更稳定地计算数据流上的动态轮廓,避免更新某些点时计算量急剧增加,进一步提出了改进的数据流上动态轮廓查询算法(Improved Dynamic Skyline Query over Data Stream,IDS2).最后,通过一系列的实验验证了文中所提出算法的有效性.
基金supported by the National Nature Science Foundation of China under Grants No.60863011,No.61175068,No.61100205,No.60873001the Fundamental Research Funds for the Central Universities under Grant No.2009RC0212+1 种基金the National Innovation Fund for Technology based Firms under Grant No.11C26215305905the Open Fund of Software Engineering Key Laboratory of Yunnan Province under Grant No.2011SE14
文摘For the complex questions of Chinese question answering system, we propose an answer extraction method with discourse structure feature combination. This method uses the relevance of questions and answers to learn to rank the answers. Firstly, the method analyses questions to generate the query string, and then submits the query string to search engines to retrieve relevant documents. Sec- ondly, the method makes retrieved documents seg- mentation and identifies the most relevant candidate answers, in addition, it uses the rhetorical relations of rhetorical structure theory to analyze the relationship to determine the inherent relationship between para- graphs or sentences and generate the answer candi- date paragraphs or sentences. Thirdly, we construct the answer ranking model,, and extract five feature groups and adopt Ranking Support Vector Machine (SVM) algorithm to train ranking model. Finally, it re-ranks the answers with the training model and fred the optimal answers. Experiments show that the proposed method combined with discourse structure features can effectively improve the answer extrac- ting accuracy and the quality of non-factoid an- swers. The Mean Reciprocal Rank (MRR) of the an- swer extraction reaches 69.53%.