3 articles found
1. Context Information and Fragments Based Cross-Domain Word Segmentation (cited by: 8)
Authors: Huang Degen, Tong Deqin. China Communications (SCIE, CSCD), 2012, No. 3, pp. 49-57 (9 pages).
A new joint decoding strategy that combines character-based and word-based conditional random field models is proposed. In this segmentation framework, fragments are used to generate candidate out-of-vocabulary (OOV) words. After the initial segmentation, the segmentation fragments are divided into two classes: "combination" (combining several fragments into an unknown word) and "segregation" (splitting a fragment into several words). As a result, more OOV words can be recalled. Moreover, given the characteristics of cross-domain segmentation, context information is used to guide Chinese Word Segmentation (CWS). Experiments on the test data from the Sighan Bakeoffs 2007 and 2010 show that the method is effective: OOV recall improves and overall segmentation performance is good.
Keywords: cross-domain CWS; Conditional Random Fields (CRFs); joint decoding; context variables; segmentation fragments
2. A CONDITIONAL RANDOM FIELDS APPROACH TO BIOMEDICAL NAMED ENTITY RECOGNITION (cited by: 4)
Authors: Wang Haochang, Zhao Tiejun, Li Sheng, Yu Hao. Journal of Electronics (China), 2007, No. 6, pp. 838-844 (7 pages).
Named entity recognition is a fundamental task in biomedical data mining. In this letter, a named entity recognition system based on CRFs (Conditional Random Fields) for biomedical texts is presented. The system makes extensive use of a diverse set of features, including local features, full text features and external resource features. All features incorporated in this system are described in detail, and the impacts of different feature sets on the performance of the system are evaluated. To improve the performance of the system, post-processing modules are used to deal with abbreviation phenomena, cascaded named entities and the identification of boundary errors. The evaluation shows that feature selection has an important impact on system performance, and that the post-processing makes an important contribution to achieving better results.
Keywords: Conditional Random Fields (CRFs); named entity recognition; feature selection; post-processing
3. A probabilistic model with multi-dimensional features for object extraction
Authors: Jing WANG, Zhijing LIU, Hui ZHAO. Frontiers of Computer Science (SCIE, EI, CSCD), 2012, No. 5, pp. 513-526 (14 pages).
To identify recruitment information in different domains, we propose a novel model of hierarchical tree-structured conditional random fields (HT-CRFs). In our approach, first, the concept of a Web object (WOB) is discussed for the description of special Web information. Second, in contrast to traditional methods, the Boolean model and multi-rule are introduced to denote a one-dimensional text feature for a better representation of Web objects. Furthermore, a two-dimensional semantic texture feature is developed to discover the layout of a WOB, which can emphasize the structural attributes and the specific semantics term attributes of WOBs. Third, an optimal WOB information extraction (IE) based on HT-CRF is performed, addressing the problem of a model having an excessive dependence on the page structure and optimizing the efficiency of the model's training. Finally, we compare the proposed model with existing decoupled approaches for WOB IE. The experimental results show that the accuracy rate of WOB IE is significantly improved and that time complexity is reduced.
Keywords: feature extraction; conditional random fields (CRFs); information extraction (IE)