Funding: Supported by a research grant from Seoul Women’s University (2020); financially supported by Hansung University.
Abstract: Purpose: This study attempts to propose an abstract model by gathering concepts that focus on resource representation and description in digital curation, and to suggest a conceptual model that emphasizes semantic enrichment within a digital curation model. Design/methodology/approach: This study conducts a literature review to analyze the preceding curation models: DCC CLM, DCC&U, UC3, and DCN. Findings: The concept of semantic enrichment is expressed in this study as a single word, SEMANTIC. The Semantic Enrichment Model, SEMANTIC, has eight elements: subject, extraction, multi-language, authority, network, thing, identity, and connect. Research limitations: This study does not reflect the actual information environment, because it focuses on concepts for the representation of digital objects. Practical implications: This study presents the main considerations for creating and reinforcing the description and representation of digital objects when building and developing digital curation models in specific institutions. Originality/value: This study summarizes the elements that should be emphasized in the representation of digital objects in terms of information organization.
Funding: Supported by the Foundation of the State Key Laboratory of Software Development Environment (No. SKLSDE-2015ZX-04).
Abstract: Long-document semantic measurement is important in many applications, such as semantic search, plagiarism detection, and automatic technical surveys. However, research efforts have mainly focused on the semantic similarity of short texts. Document-level semantic measurement remains an open issue due to problems such as the omission of background knowledge and topic transition. In this paper, we propose a novel semantic matching method for long documents in the academic domain. To accurately represent the general meaning of an academic article, we construct a semantic profile in which key semantic elements, such as the research purpose, methodology, and domain, are included and enriched. We can then obtain the overall semantic similarity of two papers by computing the distance between their profiles. The distances between the concepts of two different semantic profiles are measured with word vectors. To improve the semantic representation quality of word vectors, we propose a joint word-embedding model that incorporates a domain-specific semantic relation constraint into the traditional context constraint. Our experimental results demonstrate that, in the measurement of document semantic similarity, our approach achieves substantial improvement over state-of-the-art methods, and that our joint word-embedding model produces significantly better word representations than traditional word-embedding models.
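The profile comparison described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the abstract does not say how concept distances are aggregated, so the symmetric best-match average used here is an assumption, as are the function names, the field names (`purpose`, `methodology`, `domain`), and the hand-made three-dimensional toy vectors, which stand in for trained word embeddings.

```python
import math

# Toy 3-dimensional word vectors. Purely illustrative values (an assumption),
# not embeddings produced by the joint word-embedding model in the abstract.
TOY_VECTORS = {
    "classification": [0.9, 0.1, 0.0],
    "clustering": [0.8, 0.2, 0.1],
    "embedding": [0.1, 0.9, 0.2],
    "word2vec": [0.2, 0.8, 0.1],
    "biology": [0.0, 0.1, 0.9],
    "linguistics": [0.1, 0.3, 0.8],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def field_similarity(terms_a, terms_b, vectors):
    """Symmetric best-match average between two lists of concept terms.

    Terms without a vector are ignored. This aggregation is a hypothetical
    choice for the sketch, not one taken from the paper.
    """
    a = [t for t in terms_a if t in vectors]
    b = [t for t in terms_b if t in vectors]
    if not a or not b:
        return 0.0

    def directed(src, dst):
        # For each source term, take its most similar target term, then average.
        best = [max(cosine(vectors[s], vectors[d]) for d in dst) for s in src]
        return sum(best) / len(src)

    return 0.5 * (directed(a, b) + directed(b, a))


def profile_similarity(profile_a, profile_b, vectors):
    """Average the field-level similarities over fields both profiles share."""
    fields = sorted(set(profile_a) & set(profile_b))
    if not fields:
        return 0.0
    scores = [field_similarity(profile_a[f], profile_b[f], vectors) for f in fields]
    return sum(scores) / len(scores)


# Hypothetical semantic profiles for three papers.
paper_a = {"purpose": ["classification"], "methodology": ["embedding"], "domain": ["linguistics"]}
paper_b = {"purpose": ["clustering"], "methodology": ["word2vec"], "domain": ["linguistics"]}
paper_c = {"purpose": ["embedding"], "methodology": ["biology"], "domain": ["classification"]}

sim_ab = profile_similarity(paper_a, paper_b, TOY_VECTORS)
sim_ac = profile_similarity(paper_a, paper_c, TOY_VECTORS)
```

With these toy vectors, `paper_a` scores as closer to `paper_b` than to `paper_c`, because their purpose and methodology terms point in similar directions. Comparing field by field, rather than pooling all terms, keeps a match on domain from masking a mismatch on purpose, which is one plausible reading of why the abstract separates the profile into distinct semantic elements.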