
Enhancing Amharic Information Retrieval System Based on Statistical Co-Occurrence Technique

Abstract: Information retrieval (IR) systems are designed to help information seekers retrieve relevant information from vast document collections. The need to find relevant information in a large volume of documents gave birth to IR systems. Even though different IR systems exist, they cannot meet all users' expectations. Users differ in their level of knowledge, so they express their queries in different ways. As a result, the system may miss the core meaning of a user's query and retrieve unsatisfactory results. This happens mainly because of the ambiguity of words in natural languages and the mismatch between the expressions used by users and those used by authors. The ambiguities in the Amharic language have a negative impact on the performance of Amharic IR systems. Among the sources of ambiguity are spelling variants of the same word, polysemous terms, and synonymous terms. Users who are not fully knowledgeable about the information domain will mostly formulate weak queries, and thus end up frustrated with the results returned by an IR system. This research aims to improve on the recall of previous work. A statistical co-occurrence technique is used to expand query terms. The main reason for performing query expansion is to retrieve documents that are relevant to the user's query and that satisfy the user's information need. The statistical co-occurrence method considers terms that frequently appear together with the query term, regardless of their position. The effectiveness of the proposed technique was tested on a prototype system, and the results were compared with those of the previous study. Improvements of 6% in recall and 2% in F-measure were obtained. Hence, the statistical co-occurrence method outperformed the bi-gram based IR system.
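The query expansion idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`cooccurrence_counts`, `expand_query`), the `top_k` parameter, the whitespace tokenization, the document-level co-occurrence window, and the English toy corpus are all assumptions made for illustration. A real Amharic system would presumably also need normalization and stemming, which are omitted here.

```python
from collections import Counter


def cooccurrence_counts(docs):
    """For each term, count how many documents it shares with every other term.

    Position inside the document is ignored: two terms co-occur if they
    appear anywhere in the same document (an assumed window choice).
    """
    counts = {}
    for doc in docs:
        terms = set(doc.split())  # assumption: plain whitespace tokenization
        for term in terms:
            row = counts.setdefault(term, Counter())
            for other in terms:
                if other != term:
                    row[other] += 1
    return counts


def expand_query(query, docs, top_k=2):
    """Append to each query term its top_k most frequently co-occurring terms."""
    counts = cooccurrence_counts(docs)
    expanded = query.split()
    for term in query.split():
        neighbours = counts.get(term, Counter())
        # Sort by descending count, then alphabetically so ties are deterministic.
        ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
        for candidate, _ in ranked[:top_k]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


# Hypothetical toy corpus, in English for readability.
docs = [
    "coffee export ethiopia",
    "coffee export price",
    "coffee farm ethiopia",
    "football match",
]
print(expand_query("coffee", docs))
```

With this toy corpus, "ethiopia" and "export" each share two documents with "coffee", so they are the terms added to the query. The expanded query can then be passed to the retrieval engine in place of the original, which tends to raise recall at some possible cost in precision.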
Source: Journal of Computer and Communications, 2015, No. 12, pp. 67-76 (10 pages)
Keywords: Statistical Co-Occurrence; Information Retrieval; Query Expansion; Amharic