Rules String Extracting Method for B2B System Based on Double-array Trie

Abstract To address the difficulty of extracting product specification information in B2B vertical search engines, a rule-string extraction method based on the double-array trie is proposed. Product specification parameters in B2B systems take the form "parameter name: parameter value". The method constructs rule strings from this feature and generates a double-array trie from the rule database. To improve storage efficiency, the subtree with the most branch nodes is processed first during construction. The method obtains all rule strings in a single scan of the search text. Constraint conditions added to the rules filter the candidate strings, which improves extraction accuracy. Experimental results show that the method reduces algorithmic complexity compared with traditional rule-string lookup; the time complexity of finding rule strings is O(n).
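Source Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2013, No. 5, pp. 206-208, 223 (4 pages)
Funding Supported by the National Natural Science Foundation of China (61175048, 60875029) and the Ministry of Science and Technology Special Project on Innovation Methods (2010IM020900)
Keywords Double-array Trie; Vertical search; Rules string; B2B system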
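
The pipeline summarized in the abstract (build a double-array trie from parameter names, scan the text once, filter candidates with constraints) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `DoubleArrayTrie`, `extract`, and `constraints`, the separator and delimiter characters, and the sample text are all hypothetical. The "most children first" placement order is only one plausible reading of the storage heuristic, and terminal states are kept in a plain dictionary for brevity. The scan restarts the trie walk at each position, so its cost per character is bounded by the longest key; that is linear in the text length for a fixed rule set, but it is not a reproduction of the paper's O(n) analysis.

```python
import heapq
import re
from itertools import count


class DoubleArrayTrie:
    """Minimal double-array trie: the child of state s on code c sits at base[s] + c."""

    def __init__(self, keys):
        chars = sorted({ch for key in keys for ch in key})
        self.code = {ch: i + 1 for i, ch in enumerate(chars)}  # codes start at 1
        self.base = [1]      # slot 0 is the root state
        self.check = [0]     # check[t] = parent state of t; -1 marks a free slot
        self.terminal = {}   # state -> full key ending there (simplification)

        # Step 1: build an ordinary trie; the "" entry marks the end of a key.
        root = {}
        for key in keys:
            node = root
            for ch in key:
                node = node.setdefault(ch, {})
            node[""] = key

        # Step 2: place nodes, always taking the pending node with most children
        # (assumed interpretation of the paper's storage heuristic).
        tie = count()  # tie-breaker so the heap never compares dicts
        heap = [(-self._fanout(root), next(tie), 0, root)]
        while heap:
            _, _, state, node = heapq.heappop(heap)
            if "" in node:
                self.terminal[state] = node[""]
            kids = [(self.code[ch], child) for ch, child in node.items() if ch]
            if not kids:
                continue
            b = self._find_base([c for c, _ in kids])
            self.base[state] = b
            for c, child in kids:
                self.check[b + c] = state  # reserve the slot for this child
                heapq.heappush(heap, (-self._fanout(child), next(tie), b + c, child))

    @staticmethod
    def _fanout(node):
        return sum(1 for ch in node if ch)

    def _find_base(self, codes):
        """Smallest base >= 1 whose target slots are all free; grows the arrays."""
        b = 1
        while True:
            need = b + max(codes) + 1 - len(self.check)
            if need > 0:
                self.check.extend([-1] * need)
                self.base.extend([0] * need)
            if all(self.check[b + c] == -1 for c in codes):
                return b
            b += 1

    def step(self, state, ch):
        """Follow one character; return the next state or -1 if no transition."""
        c = self.code.get(ch)
        if c is None:
            return -1
        t = self.base[state] + c
        return t if t < len(self.check) and self.check[t] == state else -1


def extract(text, trie, constraints=None, seps=":：", stop=" ,;，；\n"):
    """Left-to-right scan returning (name, value) pairs that pass their constraint."""
    constraints = constraints or {}
    results, i, n = [], 0, len(text)
    while i < n:
        state, j, hit = 0, i, None
        while j < n:
            state = trie.step(state, text[j])
            if state < 0:
                break
            j += 1
            if state in trie.terminal:
                hit = (trie.terminal[state], j)  # keep the longest match
        if hit and hit[1] < n and text[hit[1]] in seps:
            name, start = hit[0], hit[1] + 1
            while start < n and text[start] == " ":
                start += 1
            k = start
            while k < n and text[k] not in stop:
                k += 1
            value = text[start:k]
            if constraints.get(name, bool)(value):  # default: value is non-empty
                results.append((name, value))
                i = k
                continue
        i += 1
    return results


trie = DoubleArrayTrie(["Voltage", "Power", "Weight", "Volume"])
rules = {"Weight": lambda v: re.fullmatch(r"\d+(\.\d+)?\s*(kg|g)", v)}
found = extract("Model X200 Voltage: 220V, Power:1500W; Weight: heavy Volume 3L", trie, rules)
print(found)
```

In this example, `Weight: heavy` is rejected by its constraint and `Volume 3L` is skipped because no separator follows the parameter name, which mirrors the role the abstract assigns to constraint-based filtering of candidate strings.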