TextRank is a popular tool for obtaining words or phrases that are important for many Natural Language Processing (NLP) tasks. This paper presents a practical domain-specific TextRank approach using Field Association (FA) words. We present a keyphrase extraction technique not for a single document but for a particular domain. The first step builds a specific domain field. The second collects a list of ideal FA terms and compound FA terms from the specific domain that are considered candidate keyphrases. We then combine two-word node weights and field tree relationships into a new approach to generate keyphrases from a particular domain. Experiments using the modified approach to extract keyphrases show that techniques including FA terms outperform those that use ordinary words, with precision reaching 90%.
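As a rough illustration of the baseline the abstract builds on, the following is a minimal sketch of generic TextRank keyword ranking: words are linked when they co-occur within a small window, and a PageRank-style iteration scores them. It does not implement the FA-word or field-tree weighting described above; the function name `textrank_keywords` and the parameters `window`, `damping`, and `iterations` are assumptions chosen for illustration.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50):
    """Rank words by a PageRank-style score over a co-occurrence graph.

    Generic TextRank sketch only; the FA-term weighting from the
    abstract is NOT modelled here.
    """
    # Link each word to the distinct words in the next `window` positions.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window + 1]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Every node starts at 1.0 and repeatedly receives an equal share
    # of each neighbor's score, damped as in PageRank.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }

    # Highest-scoring words first.
    return sorted(scores, key=scores.get, reverse=True)
```

A domain-specific variant along the lines of the abstract would presumably replace the uniform edge shares with weights derived from FA terms, but the abstract does not give enough detail to reproduce that step.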
In our study, we chose Python as the programming platform for building an automatic Bengali document summarizer. English has sufficient tools for processing and summarizing documents. However, no such tool is specifically applicable to Bengali, since Bengali has a lot of ambiguity and differs from English in grammar. The language nonetheless holds an important place because it is spoken by 26 crore people all over the world. We therefore developed a new method to summarize Bengali documents. The proposed system was designed with the following stages: pre-processing the input document, word tagging, pronoun replacement, sentence ranking, and summary generation. Pronoun replacement is used to reduce the incidence of dangling pronouns in the generated summary. We ranked sentences based on sentence frequency, numerical figures, and pronoun replacement. We check the similarity between pairs of sentences and exclude one sentence of each similar pair to reduce duplication. We took 3,000 data items from newspaper and book documents as input and learned the words so that they are appropriate with respect to syntax. In addition, to evaluate the performance of the designed summarizer, the system was tested on different documents. According to the evaluation, the recall, precision, and F-score were 0.70, 0.82, and 0.74, respectively. Pronouns were replaced correctly in 72% of cases.
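The abstract does not say how the F-score was computed. As a worked check, assuming the balanced F1 measure (the harmonic mean of precision and recall), the reported precision and recall can be combined as follows; the function name `f_score` is hypothetical.

```python
def f_score(precision, recall):
    """Balanced F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
print(round(f_score(0.82, 0.70), 3))  # prints 0.755
```

Under this assumption the result is about 0.755, slightly above the reported 0.74. One possible explanation, which the abstract does not confirm, is that the F-score was averaged over individual documents rather than computed from the aggregate precision and recall.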
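To address the problem that current general-purpose search engines cannot effectively index and rank query results from dynamic web pages, a ranking algorithm for dynamic web pages, ContentRank, is proposed, and its application to e-commerce websites is described. The algorithm mines and analyzes the content related to dynamic pages and computes a page ranking score, thereby ranking the query results of such dynamic pages and improving ranking accuracy.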
Funding: National Natural Science Foundation of China (Grant Nos. 60373095 and 60673039); National High-Tech Research and Development Plan of China (863 Program) (Grant No. 2006AA01Z151)