Traditional topic models have been widely used for analyzing semantic topics in electronic documents. However, the topic words they produce are poor in readability and consistency, so only domain experts can guess their meaning. In fact, phrases are the main unit people use to express semantics. This paper presents Distributed Representation-Phrase Latent Dirichlet Allocation (DR-Phrase LDA), a phrase topic model. Specifically, the model enhances the semantic information of phrases via distributed representation. The experimental results show that the topics acquired by our model are more readable and consistent than those of other similar topic models.
Text classification is an increasingly crucial topic in natural language processing. Traditional text classification methods based on machine learning have many disadvantages, such as dimension explosion, data sparsity, and limited generalization ability. This paper presents an extensive study of text classification models based on deep learning, including Convolutional Neural Network-based (CNN-based), Recurrent Neural Network-based (RNN-based), and attention-mechanism-based models. Many studies have shown that text classification methods based on deep learning outperform traditional methods when processing large-scale and complex datasets. The main reasons are that they avoid a cumbersome feature extraction process and achieve higher prediction accuracy on large sets of unstructured data. We also summarize the shortcomings of traditional text classification methods and introduce the text classification process based on deep learning, including text preprocessing, distributed representation of text, model construction, and performance evaluation.
As a key technology for rapid and low-cost drug development, drug repositioning is becoming popular. In this study, a text mining approach to discovering unknown drug-disease relations was tested. Using a word embedding algorithm, the senses of over 1.7 million words were well represented in sufficiently short feature vectors. The feasibility of the approach was tested through various analyses, including clustering and classification. Finally, our trained classification model achieved 87.6% accuracy in predicting drug-disease relations in cancer treatment and succeeded in discovering novel drug-disease relations that were actually reported in recent studies.
A new environment representation and object localization scheme is proposed to accomplish object operation tasks more efficiently in intelligent space. First, a distributed environment representation method is put forward to reduce the storage burden and improve the system's stability. The layered topological maps are stored separately in different landmarks attached to key positions in the intelligent space, so that the robot can search for the landmarks, read the map information from their QR codes, and then build the environment map autonomously. Map building is an important prerequisite for object search. An object search scheme based on RFID and vision technology is then proposed. RFID tags are attached to the target objects and reference objects in the indoor environment. A fixed RFID system is built to monitor the rough position (room and local area) of the target, and a mobile RFID system is constructed to detect targets that are outside the coverage of the fixed system. The area in which the target is located is determined from the time sequence of reference tags and target tags, and the accurate position is obtained by the onboard vision system at short distance. The experiments demonstrate that the proposed distributed environment representation fully meets the requirements of object localization, and that the positioning scheme has high search efficiency, high localization accuracy and precision, and strong anti-interference ability in complex indoor environments.
In the context of collaborative robotics, distributed situation awareness is essential for supporting collective intelligence in teams of robots and human agents, where it can be used for both individual and collective decision support. This is particularly important in applications pertaining to emergency rescue and crisis management. During operational missions, data and knowledge are gathered incrementally and in different ways by heterogeneous robots and humans. We describe this as the creation of Hastily Formed Knowledge Networks (HFKNs). The focus of this paper is the specification and prototyping of a general distributed system architecture that supports the creation of HFKNs by teams of robots and humans. The information collected ranges from low-level sensor data to high-level semantic knowledge, the latter represented in part as RDF Graphs. The framework includes a synchronization protocol and associated algorithms that allow for the automatic distribution and sharing of data and knowledge between agents. This is done through the distributed synchronization of RDF Graphs shared between agents. High-level semantic queries specified in SPARQL can be used by robots and humans alike to acquire both knowledge and data content from team members. The system is empirically validated, and complexity results for the proposed algorithms are provided. Additionally, a field robotics case study is described in which a 3D mapping mission was executed using several UAVs in a collaborative emergency rescue scenario while using the full HFKN Framework.
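The idea of agents converging on shared RDF Graphs can be illustrated with a minimal sketch. It assumes each agent's graph is a plain Python set of (subject, predicate, object) triples and that synchronization is a naive set union. The abstract does not describe the actual protocol, so the `sync` and `match` helpers and the example triples are hypothetical, and `match` is only a stand-in for a SPARQL query.

```python
def sync(graph_a, graph_b):
    """Exchange missing triples so both graphs end up as the union.

    Returns the number of triples transferred in either direction.
    """
    missing_in_a = graph_b - graph_a
    missing_in_b = graph_a - graph_b
    graph_a |= missing_in_a
    graph_b |= missing_in_b
    return len(missing_in_a) + len(missing_in_b)


def match(graph, s=None, p=None, o=None):
    """Return sorted triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Hypothetical knowledge gathered separately by two UAVs.
uav1 = {("victim1", "locatedAt", "sectorA")}
uav2 = {("sectorA", "scannedBy", "uav2")}

sync(uav1, uav2)
# After synchronization either agent can answer the same query.
print(match(uav2, p="locatedAt"))
```

A set union is the simplest merge that is independent of exchange order, which is why it is used here; a real protocol would also have to handle deletions, partial connectivity, and bandwidth limits.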
Recently, the emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs for NLP. We first briefly introduce language representation learning and its research progress. Then we systematically categorize existing PTMs based on a taxonomy from four different perspectives. Next, we describe how to adapt the knowledge of PTMs to downstream tasks. Finally, we outline some potential directions of PTMs for future research. This survey is intended to be a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.
Knowledge graph representation has been a long-standing goal of artificial intelligence. In this paper, we consider a method for knowledge graph embedding of hyper-relational data, which are commonly found in knowledge graphs. Previous models such as Trans(E, H, R) and CTransR are either insufficient for embedding hyper-relational data or focus on projecting an entity into multiple embeddings, which might neither generalize effectively nor accurately reflect real knowledge. To overcome these issues, we propose the novel model TransHR, which transforms the hyper-relations in a pair of entities into an individual vector, serving as a translation between them. We experimentally evaluate our model on two typical tasks: link prediction and triple classification. The results demonstrate that TransHR significantly outperforms Trans(E, H, R) and CTransR, especially for hyper-relational data.
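To make the notion of a relation vector "serving as a translation" concrete, here is a minimal sketch of translation-style scoring as used in the TransE family, where a plausible triple satisfies head + relation ≈ tail. It is not the model proposed in the abstract above. The toy 2-d vectors, the entity names, and the `score` and `rank_tails` helpers are assumptions for illustration only.

```python
import math


def score(head, relation, tail):
    """Euclidean distance ||head + relation - tail||; lower is more plausible."""
    return math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


# Hypothetical hand-set embeddings (not learned from data).
entities = {
    "Paris": [1.0, 0.0],
    "France": [1.0, 1.0],
    "Berlin": [3.0, 0.0],
}
capital_of = [0.0, 1.0]


def rank_tails(head_name, relation):
    """Link prediction: rank every other entity as a candidate tail."""
    head = entities[head_name]
    candidates = [name for name in entities if name != head_name]
    return sorted(candidates, key=lambda n: score(head, relation, entities[n]))


print(rank_tails("Paris", capital_of))
```

Triple classification can reuse the same `score` function by accepting a triple when its distance falls below a threshold chosen on validation data.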
Funding (DR-Phrase LDA paper): This work was supported by the Project of Industry and University Cooperative Research of Jiangsu Province, China (No. BY2019051). Ma, J. would like to thank the Jiangsu Eazytec Information Technology Company (www.eazytec.com) for their financial support.
Funding (deep learning text classification paper): This work was supported in part by the National Natural Science Foundation of China under Grant 61872134, in part by the Natural Science Foundation of Hunan Province under Grant 2018JJ2062, in part by the Science and Technology Development Center of the Ministry of Education under Grant 2019J01020, and in part by the 2011 Collaborative Innovative Center for Development and Utilization of Finance and Economics Big Data Property, Universities of Hunan Province.
Funding (environment representation and object localization paper): Supported by the National High Technology Research and Development Program of China (No. 2009AA04Z220) and the National Natural Science Foundation of China (No. 61075092).
Funding (HFKN paper): This work has been supported by the ELLIIT Network Organization for Information and Communication Technology, Sweden (Project B09) and the Swedish Foundation for Strategic Research SSF (Smart Systems Project RIT15-0097). The first author is also supported by an RExperts Program Grant 2020A1313030098 from the Guangdong Department of Science and Technology, China, in addition to a Sichuan Province International Science and Technology Innovation Cooperation Project Grant 2020YFH0160.
Funding (pre-trained models survey): Supported by the National Natural Science Foundation of China (Grant Nos. 61751201 and 61672162), the Shanghai Municipal Science and Technology Major Project (Grant No. 2018SHZDZX01), and ZJLab.
Funding (hyper-relational knowledge graph embedding paper): Partially supported by the National Natural Science Foundation of China (Nos. 61302077, 61520106007, 61421061, and 61602048).