The advent of self-attention mechanisms within Transformer models has significantly propelled the advancement of deep learning algorithms, yielding outstanding achievements across diverse domains. Nonetheless, self-attention mechanisms falter when applied to datasets with intricate semantic content and extensive dependency structures. In response, this paper introduces a Diffusion Sampling and Label-Driven Co-attention Neural Network (DSLD), which adopts a diffusion sampling method to capture more comprehensive semantic information from the data. Additionally, the model leverages the joint correlation information of labels and data in the computation of text representations, correcting semantic representation biases in the data and increasing the accuracy of semantic representation. Finally, the model computes the corresponding classification results by synthesizing these rich semantic representations. Experiments on seven benchmark datasets show that the proposed model achieves competitive results compared to state-of-the-art methods.
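As a reference point for where the abstract says self-attention falters, the mechanism itself is compact. The sketch below (plain NumPy, with identity projections in place of learned query/key/value matrices; this is not the paper's DSLD model) computes scaled dot-product self-attention over a toy sequence:

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over a sequence of d-dim vectors.

    X: (seq_len, d) array. For clarity the query/key/value projections
    are identities; a real Transformer learns these matrices.
    """
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                      # pairwise similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ X                                 # weighted mix of values

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
out = self_attention(X)
print(out.shape)  # (3, 2)
```

Each output row is a context-weighted mixture of all input rows, which is exactly what degrades when dependencies become long and semantically intricate.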
The meaning of a word includes a conceptual meaning and a distributive meaning. Word embedding based on distribution suffers from insufficient conceptual semantic representation caused by data sparsity, especially for low-frequency words. In knowledge bases, manually annotated semantic knowledge is stable and the essential attributes of words are accurately denoted. In this paper, we propose a Conceptual Semantics Enhanced Word Representation (CEWR) model, computing the synset embedding and hypernym embedding of Chinese words based on the Tongyici Cilin thesaurus and aggregating them with distributed word representations, so that both distributional information and conceptual meaning are encoded in the representation of words. We evaluate the CEWR model on two tasks: word similarity computation and short text classification. The Spearman correlations between model results and human judgement improve to 64.71%, 81.84%, and 85.16% on Wordsim297, MC30, and RG65, respectively. Moreover, CEWR improves the F1 score by 3% on the short text classification task. The experimental results show that CEWR represents words more informatively than distributed word embedding alone. This demonstrates that conceptual semantics, especially hypernymous information, is a good complement to distributed word representation.
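The aggregation step in CEWR can be illustrated with toy data. This is only a sketch of the general shape of such a method: the paper derives synsets and hypernyms from the Tongyici Cilin thesaurus and trains real distributed embeddings, whereas the words, vectors, and groupings below are invented for illustration.

```python
import numpy as np

# Toy distributed vectors (stand-ins for trained word embeddings).
dist = {"cat": np.array([0.2, 0.9]), "feline": np.array([0.3, 0.8]),
        "animal": np.array([0.5, 0.5])}
# Hand-made synset and hypernym members; the paper derives these
# from the Tongyici Cilin thesaurus, so these entries are invented.
synset = {"cat": ["cat", "feline"]}
hypernym = {"cat": ["animal"]}

def cewr_vector(word):
    """Concatenate the distributed vector with the synset mean and the
    hypernym mean, so both distributional and conceptual meaning are
    encoded in one representation."""
    syn = np.mean([dist[w] for w in synset[word]], axis=0)
    hyp = np.mean([dist[w] for w in hypernym[word]], axis=0)
    return np.concatenate([dist[word], syn, hyp])

v = cewr_vector("cat")
print(v.shape)  # (6,)
```

Low-frequency words with poor distributed vectors still receive a stable conceptual component from the synset and hypernym means, which is the intuition the abstract describes.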
The drastic growth of coastal observation sensors results in copious data that provide weather information. The intricacies in sensor-generated big data are heterogeneity and interpretation, driving high-end Information Retrieval (IR) systems. The Semantic Web (SW) can solve this issue by integrating data into a single platform for information exchange and knowledge retrieval. This paper focuses on exploiting an SW-based system to provide interoperability through ontologies by mapping data concepts to ontology classes. The paper presents a 4-phase weather data model: data processing, ontology creation, SW processing, and query engine. The developed Oceanographic Weather Ontology helps to enhance data analysis, discovery, IR, and decision making. The paper also evaluates the developed ontology against other state-of-the-art ontologies. The proposed ontology's quality improved by 39.28% in terms of completeness, its structural complexity decreased by 45.29%, and precision and accuracy improved by 11% and 37.7%, respectively. Ocean data from the Indian meteorological satellite INSAT-3D serves as a typical test case for the proposed model. The experimental results show the effectiveness of the proposed data model and its advantages in machine understanding and IR.
Abstract: This paper discusses how to reflect, during semantic network knowledge representation processing, the internal relations between judgment and identification, the two most fundamental modes of thinking (cognitive operations). A new extended Petri net is defined based on qualitative mapping, which strengthens the ability to express features of thinking and the brain's modes of action. A model of semantic network knowledge representation based on the new Petri net is given, providing a more efficient representation and reasoning mechanism. This model not only reflects the associative-memory characteristics of semantic network knowledge representation, but also uses the Petri net to express criterion changes in recognition judgment and their laws, especially cognitive operations based on the extraction and integration of sensory features, thereby expressing the transition of human cognition from quantitative change to qualitative change.
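The paper's extended Petri net builds on classical place/transition semantics, which a few lines can show. This sketch models only the standard firing rule, not the qualitative-mapping extension the paper contributes; the place and transition names are invented.

```python
# Minimal classical Petri net: places hold tokens; a transition is
# enabled when every input place has a token, and firing it moves
# tokens from input places to output places.

def enabled(marking, transition):
    return all(marking.get(p, 0) >= 1 for p in transition["in"])

def fire(marking, transition):
    assert enabled(marking, transition)
    m = dict(marking)
    for p in transition["in"]:
        m[p] -= 1
    for p in transition["out"]:
        m[p] = m.get(p, 0) + 1
    return m

# Invented example: a sensed feature enables a judgment operation.
t = {"in": ["sensed_feature"], "out": ["judgment"]}
m0 = {"sensed_feature": 1}
m1 = fire(m0, t)
print(m1)  # {'sensed_feature': 0, 'judgment': 1}
```

A qualitative-mapping extension would attach a predicate to each transition so that firing also depends on whether a quantity crosses a qualitative criterion, matching the quantitative-to-qualitative transition the abstract describes.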
Many text classifications depend on statistical term measures to implement document representation. Such representations ignore the lexical semantic content of terms and the distilled mutual information, leading to text classification errors. This work proposes a document representation method, a WordNet-based lexical semantic VSM, to solve the problem. Using WordNet, the method constructs a data structure of semantic-element information to characterize lexical semantic content, and adjusts EM modeling to disambiguate word stems. Then, in the lexical-semantic space of the corpus, a lexical-semantic eigenvector for document representation is built by calculating the weight of each synset, and applied to the widely recognized NWKNN algorithm. On the Reuters-21578 corpus and a version of it adjusted by lexical replacement, experimental results show that the lexical-semantic eigenvector achieves better F1 scores and lower dimensionality than the term-statistic eigenvector based on TF-IDF. The formation of document representation eigenvectors gives the method broad prospects for classification applications in text corpus analysis.
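The advantage of a synset-level space over raw term statistics can be seen in a toy example. The sketch below is illustrative only: the word-to-synset map is hand-made (the paper builds it from WordNet with EM-based disambiguation), and the TF-IDF weight shown for contrast is the textbook form.

```python
import math
from collections import Counter

# Illustrative word-to-synset map; the paper derives this from WordNet.
SYNSETS = {"car": "vehicle.n.01", "automobile": "vehicle.n.01",
           "bank": "bank.n.01"}

def synset_vector(tokens):
    """Project a document onto synset space: synonyms share one axis,
    so 'car' and 'automobile' reinforce the same coordinate instead of
    splitting weight across two term dimensions."""
    return Counter(SYNSETS.get(t, t) for t in tokens)

def tfidf(term, doc, docs):
    """Textbook TF-IDF weight of a term in one document of a corpus."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in docs if term in d)
    return tf * math.log(len(docs) / df)

d1 = ["car", "automobile"]
d2 = ["bank"]
print(synset_vector(d1))                      # Counter({'vehicle.n.01': 2})
print(round(tfidf("bank", d2, [d1, d2]), 3))  # 0.693
```

In term space, d1 has two dimensions with weight one each; in synset space it has a single dimension with weight two, which is the dimensionality reduction and synonym-awareness the abstract reports.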
Joint learning of words and entities benefits various NLP tasks, but most work focuses on the single-language setting. Cross-lingual representation learning has received much attention recently but is still restricted by the availability of parallel data. In this paper, a method is proposed to jointly embed texts and entities using comparable data. In addition to evaluation on public semantic textual similarity datasets, a cross-lingual text extraction task was proposed to assess similarities between texts and contribute to this dataset. Results show that the proposed method outperforms cross-lingual representation methods that use parallel data on cross-lingual tasks, and achieves competitive results on mono-lingual tasks.
User representation learning is crucial for capturing different user preferences, but it is also critically challenging because user intentions are latent and dispersed in complex and varied patterns of user-generated data, and thus cannot be measured directly. Text-based models can learn user representations by mining latent semantics, which enhances the semantic function of user representations. However, these techniques only extract common features from historical records and cannot represent changes in user intentions. Sequential features, by contrast, can express user interests and intentions that change over time, but sequential recommendation based on item-level user representations lacks interpretability of preference factors. To address these issues, we propose a novel model with a Dual-Layer User Representation, named DLUR, where the user's intention is learned from two different layer representations. Specifically, the latent semantic layer adds a Transformer-based interactive layer to extract keywords and key sentences from text, which serve as a basis for interpretation. The sequence layer uses a Transformer model to encode the user's preference intention and clarify how it changes. This dual-layer user model is therefore more comprehensive than a single text or sequence model and can effectively improve recommendation performance. Our extensive experiments on five benchmark datasets demonstrate DLUR's advantage over state-of-the-art recommendation models. In addition, DLUR's ability to explain recommendation results is demonstrated through specific cases.
Caring has long been recognized as central to nursing and is increasingly posited as a core concept, although developing a theoretical description of caring that is adequate for the 21st century continues to be a difficult task for nursing scholars. Consequently, verifying existing theoretical structures of caring remains an ongoing challenge. The aim of this article is to provide empirical verification of the caring processes of “knowing,” “being with,” “doing for,” “enabling” and “maintaining belief” from Swanson’s Middle Range Caring Theory, based on the categorization of nursing actions from a systematic literature review on care. Methods: A systematic literature review was conducted in the fields of nursing sciences, medicine and psychology. Purposeful sampling was carried out covering the period 2003-2013. The final sample included 25 articles. Results: Major themes of nursing actions included “knowing,” which consisted of centering, nurturing, informed understanding, assessment skills, communication and respect for individual differences. “Being with” was characterized by intimate relationship, connecting, presencing, emotional adaptability, awareness of self/other and decentering. “Doing for” included competence, knowledge, professional/technical skills, helping actions, anticipatory and multidisciplinary action, and preserving dignity. “Enabling” was characterized by self-care, commitment, complexity of care, appropriate communication, information/education, sharing power, enabling choice and ongoing validation. Finally, “maintaining belief” was characterized by spiritual being, humanistic view, harmonious balance, hope, love and compassion, meaning, and religious and spiritual orientation.
Conclusion: Empirical verification was shown for the caring processes described in Swanson’s Caring Theory, grounded in concrete nursing actions.
Current Chinese event detection methods commonly use word embeddings to capture semantic representations, but they find it difficult to capture the dependency relationships between trigger words and other words in the same sentence. A simple evaluation shows that a dependency parser can effectively capture such relationships and improve the accuracy of event categorisation. This study proposes a novel architecture that models a hybrid representation to summarise semantic and structural information from both characters and words. The model can capture rich semantic features for the event detection task by incorporating the semantic representation generated by the dependency parser. The authors evaluate different models on the KBP 2017 corpus. The experimental results show that the proposed method significantly improves performance in Chinese event detection.
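The kind of structural feature a dependency parser contributes can be sketched on a pre-parsed toy sentence. The parse below is hand-written and in English for readability; the paper works on Chinese text with a real parser, and the feature set here is a simplified assumption.

```python
# A pre-parsed toy sentence: (index, word, head_index, relation).
# A real system would obtain these tuples from a dependency parser.
parse = [
    (0, "police", 1, "nsubj"),
    (1, "arrested", -1, "root"),
    (2, "suspect", 1, "dobj"),
]

def trigger_features(parse, idx):
    """Collect the head relation and child relations of a candidate
    trigger word -- structural context for event classification that a
    bag of word embeddings alone would miss."""
    _, word, _, rel = parse[idx]
    children = [r for (i, _, h, r) in parse if h == idx]
    return {"word": word, "head_rel": rel, "child_rels": children}

f = trigger_features(parse, 1)
print(f)  # {'word': 'arrested', 'head_rel': 'root', 'child_rels': ['nsubj', 'dobj']}
```

The subject/object relations attached to "arrested" signal an arrest event with its participants, which is the dependency information the hybrid representation folds in.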
One of the critical hurdles, and breakthroughs, in the field of Natural Language Processing (NLP) in the last two decades has been the development of techniques for text representation that solve the so-called curse of dimensionality, a problem which plagues NLP in general given that the feature set for learning starts as a function of the size of the language in question, typically upwards of hundreds of thousands of terms. As such, much of the research and development in NLP in the last two decades has been devoted to finding and optimizing solutions to this problem, that is, to effective feature selection in NLP. This paper traces the development of these various techniques, which leverage a variety of statistical methods resting on linguistic theories advanced in the middle of the last century, namely the distributional hypothesis, which suggests that words found in similar contexts generally have similar meanings. In this survey paper we examine some of the most popular of these techniques from a mathematical as well as a data structure perspective, from Latent Semantic Analysis to Vector Space Models to their more modern variants, typically referred to as word embeddings. In this review of algorithms such as Word2Vec, GloVe, ELMo and BERT, we explore the idea of semantic spaces more generally, beyond their applicability to NLP.
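The distributional hypothesis can be demonstrated directly with co-occurrence counts and cosine similarity, the raw ingredients that LSA, vector space models, and word embeddings compress into dense representations. A small sketch on an invented toy corpus:

```python
import math
from collections import defaultdict

corpus = ["the cat sat on the mat",
          "the dog sat on the rug",
          "stocks fell on the market"]

def cooc_vectors(sentences, window=2):
    """Count context words within a window -- the sparse statistics
    that embedding models later compress into dense vectors."""
    vecs = defaultdict(lambda: defaultdict(int))
    for s in sentences:
        toks = s.split()
        for i, w in enumerate(toks):
            for j in range(max(0, i - window), min(len(toks), i + window + 1)):
                if i != j:
                    vecs[w][toks[j]] += 1
    return vecs

def cosine(u, v):
    keys = set(u) | set(v)
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in keys)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv)

V = cooc_vectors(corpus)
# Words in similar contexts ('cat'/'dog') score higher than unrelated ones.
print(cosine(V["cat"], V["dog"]) > cosine(V["cat"], V["stocks"]))  # True
```

The curse of dimensionality is visible even here: each word's vector has one dimension per vocabulary item, which is what dimensionality-reduction techniques from LSA onward exist to tame.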
Quick response (QR) code based artificial labels are applied to provide semantic concepts and relations of the surroundings, addressing the complexity and limitations of recognizing semantics and scenes with robot vision alone. By imitating the human spatial cognition mechanism, the robot continually receives information from artificial labels at cognitive guide points across a wide, structured environment to achieve environmental perception and navigation. An immune network algorithm forms the environmental awareness mechanism with "distributed representation". Color recognition and the SIFT feature matching algorithm are fused to achieve memory and cognition of scenario tags. A cognition-guide-action based cognizing semantic map is then built. As the map grows richer, the robot no longer needs to rely on the artificial labels and can plan paths and navigate freely. Experimental results show that the artificial labels designed in this work improve the robot's cognitive ability, navigate the robot in semi-unknown environments, and support building the cognizing semantic map.
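Once the cognizing semantic map exists as a graph of labelled guide points, label-free path planning reduces to graph search. The sketch below runs breadth-first search over an invented map; the actual map structure and planner in the paper are richer than this.

```python
from collections import deque

# Cognizing semantic map built from label readings: each node is a
# labelled guide point, edges are traversable links (invented data).
semantic_map = {
    "entrance": ["corridor"],
    "corridor": ["entrance", "lab", "office"],
    "lab": ["corridor"],
    "office": ["corridor"],
}

def plan_path(graph, start, goal):
    """Breadth-first search: once the map is learned, the robot can
    plan a shortest hop-count route without re-reading the labels."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(plan_path(semantic_map, "entrance", "lab"))  # ['entrance', 'corridor', 'lab']
```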
Every day, the media report numerous crimes that are viewed by large numbers of users and accumulate on a regular basis. Crime news exists on the Internet in unstructured formats such as books, websites, documents, and journals. From such heterogeneous data, extracting relevant information is very challenging, a time-consuming yet critical task for the public and law enforcement agencies. Keyword-based Information Retrieval (IR) systems rely on statistics to retrieve results, making it difficult to obtain relevant ones: they cannot understand the user's query and thus face word mismatches due to context changes and the inevitable polysemy of a given word. Therefore, such datasets need to be organized in a structured configuration, with the goal of manipulating the data efficiently while respecting its semantics. An ontological semantic IR system is needed that can find the right investigative information and important clues to solve criminal cases. A semantic system retrieves information based on the similarity of semantics between indexed data and user queries. In this paper, we develop an ontology-based semantic IR system that leverages the latest semantic technologies, including the Resource Description Framework (RDF), the SPARQL Protocol and RDF Query Language (SPARQL), the Semantic Web Rule Language (SWRL), and the Web Ontology Language (OWL). We conducted two experiments. In the first, we implemented a keyword-based textual IR system using Apache Lucene. In the second, we implemented a semantic system that uses an ontology to store the data and retrieves precise results with high accuracy using SPARQL queries. The keyword-based system filtered results with 51% accuracy, while the semantic system filtered results with 95% accuracy, a significant improvement that opens new horizons for researchers.
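The contrast between keyword matching and triple-pattern retrieval can be sketched without any RDF tooling. The miniature store below is plain Python with invented case data; the paper uses OWL ontologies queried with real SPARQL.

```python
# A miniature triple store with pattern matching -- the core idea
# behind RDF + SPARQL retrieval, here in plain Python. The subjects,
# predicates, and objects are invented for illustration.
triples = [
    ("case42", "type", "Robbery"),
    ("case42", "location", "Downtown"),
    ("case7", "type", "Fraud"),
]

def query(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard,
    like a variable in a SPARQL query."""
    return [(ts, tp, to) for (ts, tp, to) in triples
            if s in (None, ts) and p in (None, tp) and o in (None, to)]

# 'Which cases are robberies?' -- analogous to
# SELECT ?s WHERE { ?s :type :Robbery }
print(query(triples, p="type", o="Robbery"))  # [('case42', 'type', 'Robbery')]
```

A keyword engine matches the literal string "robbery" wherever it occurs; the triple pattern matches the structured assertion, which is why the semantic system avoids word-mismatch errors.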
Purpose: This study attempts to propose an abstract model by gathering concepts that focus on resource representation and description in a digital curation model, and to suggest a conceptual model that emphasizes semantic enrichment in digital curation. Design/methodology/approach: This study conducts a literature review to analyze the preceding curation models DCC CLM, DCC&U, UC3, and DCN. Findings: This study expresses the concept of semantic enrichment in a single word, SEMANTIC. The Semantic Enrichment Model, SEMANTIC, has the elements subject, extraction, multi-language, authority, network, thing, identity, and connect. Research limitations: This study does not reflect the actual information environment because it focuses on the concepts of the representation of digital objects. Practical implications: This study presents the main considerations for creating and reinforcing the description and representation of digital objects when building and developing digital curation models in specific institutions. Originality/value: This study summarizes the elements that should be emphasized in the representation of digital objects in terms of information organization.
Design changes to 2D & 3D geometry are among the most important features of the product design process. Constraint modeling for variational geometry based on geometric reasoning is one of the best approaches to this goal. However, existing systems find it difficult to maintain the consistency and completeness of the constraint model of design objects. To change this situation, a semantic model and its control approach are presented, aiming at the integration of the data, knowledge, and methods related to design objects. A constraint definition system for interactively defining the semantic model and a prototype modeler based on the semantic model are also implemented to examine the idea, which is extended to 3D geometric design as well.
Funding (DSLD paper): the Communication University of China (CUC230A013) and the Fundamental Research Funds for the Central Universities.
Funding (CEWR paper): supported by the National Science Foundation of China (grants 61772278, Qu, W., and 61472191, Zhou, J.; http://www.nsfc.gov.cn/), the National Social Science Foundation of China (grant 18BYY127, Li, B.; http://www.cssn.cn), the Philosophy and Social Science Foundation of Jiangsu Higher Institutions (grant 2019SJA0220, Wei, T.; https://jyt.jiangsu.gov.cn), and the Jiangsu Higher Institutions' Excellent Innovative Team for Philosophy and Social Science (grant 2017STD006, Gu, W.; https://jyt.jiangsu.gov.cn).
Funding (oceanographic weather ontology paper): financially supported by the Ministry of Earth Science (MoES), Government of India (Grant No. MoES/36/OOIS/Extra/45/2015; https://www.moes.gov.in).
Funding (lexical semantic VSM paper): Project 2012AA011205 of the National High-Tech Research and Development Program (863 Program) of China; Projects 61272150, 61379109, M1321007, 61301136, and 61103034 of the National Natural Science Foundation of China; Project 20120162110077 of the Research Fund for the Doctoral Program of Higher Education of China; and Project 11JJ1012 of the Excellent Youth Foundation of the Hunan Scientific Committee, China.
Funding (cross-lingual joint embedding paper): supported by the National Natural Science Foundation of China (71402157), the Natural Science Foundation of Guangdong Province, China (2014A030313753), a CityU start-up grant (7200399), and the Center for Adaptive Super Computing Software-Multi Threaded Architectures (CASS-MT) at the U.S. Department of Energy's Pacific Northwest National Laboratory; Pacific Northwest National Laboratory is operated by Battelle Memorial Institute (Contract DE-ACO6-76RL01830).
Funding (DLUR paper): supported by the Applied Research Center of Artificial Intelligence, Wuhan College (Grant No. X2020113) and the Wuhan College Research Project (Grant No. KYZ202009).
Funding: 973 Program, Grant/Award Number: 2014CB340504; The State Key Program of the National Natural Science Foundation of China, Grant/Award Number: 61533018; National Natural Science Foundation of China, Grant/Award Number: 61402220; The Philosophy and Social Science Foundation of Hunan Province, Grant/Award Number: 16YBA323; Natural Science Foundation of Hunan Province, Grant/Award Number: 2020JJ4525; Scientific Research Fund of Hunan Provincial Education Department, Grant/Award Numbers: 18B279, 19A439.
Abstract: Current Chinese event detection methods commonly use word embeddings to capture semantic representations, but they find it difficult to capture the dependency relationships between trigger words and other words in the same sentence. A simple evaluation shows that a dependency parser can effectively capture dependency relationships and improve the accuracy of event categorisation. This study proposes a novel architecture that models a hybrid representation to summarise semantic and structural information from both characters and words. The model can capture rich semantic features for the event detection task by incorporating the semantic representation generated by the dependency parser. The authors evaluate different models on the KBP 2017 corpus. The experimental results show that the proposed method significantly improves performance on Chinese event detection.
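As a rough illustration of the hybrid character/word representation described above, the sketch below tags a trigger-word candidate with both its characters and its dependency arcs. The sentence, the arcs, and the feature format are all invented for illustration and are not taken from the paper.

```python
# Hypothetical parsed sentence: "Police arrested the suspect".
sentence = ["警方", "逮捕", "了", "嫌疑人"]
# (head, dependent, relation) arcs as a dependency parser might emit them.
dependencies = [("逮捕", "警方", "nsubj"),
                ("逮捕", "嫌疑人", "dobj")]

def hybrid_features(word, dependencies):
    """Merge a candidate trigger word's characters with the dependency
    relations it heads, giving one flat hybrid feature list."""
    char_feats = [f"char:{c}" for c in word]
    dep_feats = [f"dep:{rel}:{dep}" for head, dep, rel in dependencies
                 if head == word]
    return char_feats + dep_feats

print(hybrid_features("逮捕", dependencies))
# → ['char:逮', 'char:捕', 'dep:nsubj:警方', 'dep:dobj:嫌疑人']
```

In the actual model these symbolic features would be replaced by learned embeddings, but the principle is the same: the trigger candidate is represented jointly by its characters and its structural context.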
Abstract: One of the critical hurdles, and breakthroughs, in the field of Natural Language Processing (NLP) in the last two decades has been the development of techniques for text representation that address the so-called curse of dimensionality, a problem which plagues NLP in general given that the feature set for learning starts as a function of the size of the language in question, typically upwards of hundreds of thousands of terms. As such, much of the research and development in NLP over this period has gone into finding and optimizing solutions to this problem, in effect into feature selection for NLP. This paper looks at the development of these techniques, which leverage a variety of statistical methods resting on linguistic theories advanced in the middle of the last century, namely the distributional hypothesis, which suggests that words found in similar contexts generally have similar meanings. In this survey paper we look at the development of some of the most popular of these techniques from a mathematical as well as a data-structure perspective, from Latent Semantic Analysis to Vector Space Models to their more modern variants, typically referred to as word embeddings. In this review of algorithms such as Word2Vec, GloVe, ELMo, and BERT, we explore the idea of semantic spaces more generally, beyond their applicability to NLP.
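The distributional hypothesis mentioned above is usually operationalized by comparing word vectors with cosine similarity: words appearing in similar contexts get nearby vectors, so the angle between them is small. The sketch below uses made-up 3-dimensional vectors purely to illustrate the computation; real embeddings have hundreds of dimensions and learned values.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional "embeddings" (illustrative values only).
king  = [0.80, 0.65, 0.10]
queen = [0.75, 0.70, 0.15]
apple = [0.10, 0.20, 0.90]

print(cosine_similarity(king, queen))  # high: similar contexts
print(cosine_similarity(king, apple))  # lower: dissimilar contexts
```

All the methods surveyed, from LSA through Word2Vec to BERT's contextual vectors, ultimately support this kind of geometric comparison in a semantic space.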
Funding: Projects (61203330, 61104009, 61075092) supported by the National Natural Science Foundation of China; Project (2013M540546) supported by the China Postdoctoral Science Foundation; Projects (ZR2012FM031, ZR2011FM011, ZR2010FM007) supported by the Shandong Provincial Natural Science Foundation, China; Projects (2011JC017, 2012TS078) supported by the Independent Innovation Foundation of Shandong University, China; Project (201203058) supported by the Shandong Provincial Postdoctoral Innovation Foundation, China.
Abstract: Quick response (QR) code-based artificial labels are applied to provide semantic concepts and relations about the surroundings, addressing the complexity and limitations of semantic recognition and scene understanding with robot vision alone. Imitating the spatial cognition mechanism of humans, the robot continually receives information from artificial labels at cognitive guide points across a wide, structured environment to achieve environmental perception and robot navigation. An immune network algorithm was used to form an environmental awareness mechanism with "distributed representation". Color recognition and the SIFT feature matching algorithm were fused to achieve memory and cognition of scenario tags. A cognition-guide-action based cognizing semantic map was then built. As the map becomes progressively richer, the robot no longer needs to rely on the artificial labels and can plan paths and navigate freely. Experimental results show that the artificial label designed in this work can improve the cognitive ability of the robot, navigate the robot in a semi-unknown environment, and support building the cognizing semantic map.
Abstract: Every day, the media reports numerous crimes that are viewed by a large number of users and accumulate on a regular basis. Crime news exists on the Internet in unstructured formats such as books, websites, documents, and journals. From such heterogeneous data, it is very challenging to extract relevant information, which is a time-consuming and critical task for the public and law enforcement agencies. Keyword-based Information Retrieval (IR) systems rely on statistics to retrieve results, making it difficult to obtain relevant results. They are unable to understand the user's query and thus face word mismatches due to context changes and the inherent ambiguity of a given word. Therefore, such datasets need to be organized in a structured configuration, with the goal of efficiently manipulating the data while respecting its semantics. An ontological semantic IR system is needed that can find the right investigative information and important clues to solve criminal cases. The semantic system retrieves information based on the similarity of semantics between indexed data and user queries. In this paper, we develop an ontology-based semantic IR system that leverages the latest semantic technologies, including the Resource Description Framework (RDF), the SPARQL Protocol and RDF Query Language (SPARQL), the Semantic Web Rule Language (SWRL), and the Web Ontology Language (OWL). We have conducted two experiments. In the first experiment, we implemented a keyword-based textual IR system using Apache Lucene. In the second experiment, we implemented a semantic system that uses an ontology to store the data and retrieves precise results with high accuracy using SPARQL queries. The keyword-based system filtered results with 51% accuracy, while the semantic system filtered results with 95% accuracy, leading to significant improvements in the field and opening up new horizons for researchers.
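The word-mismatch problem described above can be sketched in a few lines: a purely keyword-based matcher misses a document that uses a synonym, while expanding the query with ontology-derived relations retrieves it. Here a hand-written synonym map stands in for the RDF/OWL ontology, and the corpus, query, and synonyms are all hypothetical.

```python
# Hypothetical toy corpus of crime-news snippets (contents invented).
docs = {
    1: "suspect arrested after bank robbery downtown",
    2: "police detain man following theft at city bank",
}

def keyword_search(query, docs):
    """Keyword IR: a document matches only if it shares literal query terms."""
    q = set(query.lower().split())
    return [d for d, text in docs.items() if q & set(text.split())]

# A tiny synonym map standing in for the ontology's semantic relations.
synonyms = {"robbery": {"theft"}, "arrested": {"detain"}}

def semantic_search(query, docs):
    """Expand each query term with ontology-derived synonyms before matching."""
    q = set(query.lower().split())
    for term in list(q):
        q |= synonyms.get(term, set())
    return [d for d, text in docs.items() if q & set(text.split())]

print(keyword_search("robbery", docs))   # finds only doc 1
print(semantic_search("robbery", docs))  # also finds doc 2 via "theft"
```

A real SPARQL-backed system would resolve such relations by traversing typed graph edges rather than a flat synonym table, but the retrieval gap it closes is the same.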
Funding: Supported by a research grant from Seoul Women's University (2020) and financially supported by Hansung University.
Abstract: Purpose: This study attempts to propose an abstract model by gathering concepts that focus on resource representation and description in a digital curation model, and suggests a conceptual model that emphasizes semantic enrichment in digital curation. Design/methodology/approach: This study conducts a literature review analyzing the preceding curation models DCC CLM, DCC&U, UC3, and DCN. Findings: The concept of semantic enrichment is expressed in a single word, SEMANTIC, in this study. The Semantic Enrichment Model, SEMANTIC, has the elements subject, extraction, multi-language, authority, network, thing, identity, and connect. Research limitations: This study does not reflect the actual information environment because it focuses on concepts for the representation of digital objects. Practical implications: This study presents the main considerations for creating and reinforcing the description and representation of digital objects when building and developing digital curation models in specific institutions. Originality/value: This study summarizes the elements that should be emphasized in the representation of digital objects in terms of information organization.
Abstract: Design changes for 2D and 3D geometry are among the most important features in the product design process. Constraint modeling for variational geometry based on geometric reasoning is one of the best approaches to this goal. However, it is difficult for existing systems to maintain the consistency and completeness of the constraint model of the design objects. To change this situation, a semantic model and its control approach are presented, aiming at the integration of the data, knowledge, and methods related to design objects. A constraint definition system for interactively defining the semantic model and a prototype modeler based on the semantic model are also implemented to examine the idea, which is extended to 3D geometric design as well.