Abstract: Representing the relationships between ontologies is the key problem in semantic annotation based on multiple ontologies. Traditional approaches can express only simple concept subsumption relations between ontologies. By analyzing and classifying the relationships between ontologies, the idea of a bridge ontology is proposed, which can express complex relationships between concepts, and between relations, across multiple ontologies. A new approach that employs the bridge ontology is also proposed to address the multi-ontology semantic annotation problem. The bridge ontology is a special-purpose ontology that can be created and maintained conveniently and is effective for multi-ontology semantic annotation. The approach is low-cost, scalable, and robust in the Web environment, and it avoids unnecessary ontology extension and integration. Key words: semantic web; bridge ontology; multi-ontologies; semantic annotation. CLC number: TP 391. Foundation item: Supported by the National Natural Science Foundation of China (60373066, 60303024), the National Grand Fundamental Research 973 Program of China (2002CB312000), and the National Research Foundation for the Doctoral Program of Higher Education of China (20020286004). Biography: WANG Peng (1977-), male, Ph.D. candidate; research direction: semantic web, ontology, and knowledge representation on the Web.
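The bridge ontology idea can be pictured as a separate store of typed links between terms that live in different ontologies, so that neither source ontology has to be extended or merged. The following is only a minimal sketch under that reading; the class names, the relation labels ("equivalent", "subsumes"), and the example ontologies UnivA and UnivB are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Term:
    ontology: str  # name of the source ontology (hypothetical)
    name: str      # concept or relation name inside that ontology


@dataclass(frozen=True)
class Bridge:
    kind: str      # illustrative relation label, e.g. "subsumes"
    source: Term
    target: Term


@dataclass
class BridgeOntology:
    """Holds cross-ontology links; the source ontologies stay untouched."""
    bridges: list = field(default_factory=list)

    def add(self, kind, source, target):
        self.bridges.append(Bridge(kind, source, target))

    def related(self, term):
        """Return (kind, target) pairs for every bridge starting at `term`."""
        return [(b.kind, b.target) for b in self.bridges if b.source == term]


bo = BridgeOntology()
bo.add("equivalent", Term("UnivA", "Professor"), Term("UnivB", "FacultyMember"))
bo.add("subsumes", Term("UnivA", "Person"), Term("UnivB", "Student"))
print(bo.related(Term("UnivA", "Professor")))
```

Keeping the links in their own structure is what lets an annotator consult several ontologies at once without integrating them.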
Funding: Supported by the National Natural Science Foundation of China (60273051).
Abstract: To address the difficulty of obtaining semantic information about each problem in problem set archives, we propose a new ontology-based semantic annotation method for such archives. The method uses a domain ontology of programming knowledge to add semantic annotations to problems on the Web. The system we developed stores the semantic annotation for each problem in Extensible Markup Language (XML). Our method overcomes the difficulty of extracting semantics from problem set archives, and its efficiency is demonstrated through a case study. With semantically annotated problems, a student can efficiently locate the problems that correspond to his or her knowledge.
Funding: Supported by the National Social Science Foundation of China (Grant No. 11BTQ024) and the Foundation for Humanities and Social Sciences of the Chinese Ministry of Education (Grant No. 10YJC87004).
Abstract: Purpose: To design an efficient, high-performance algorithm for semantic annotation of biodiversity documents in Chinese. Design/methodology/approach: The data set consists of 1,000 randomly selected documents from Flora of China. The proposed approach is evaluated against a Naive Bayes algorithm previously developed for the same purpose. Findings: Experimental results show that the heuristics-based algorithm outperformed the Naive Bayes algorithm. The use of leading words helped improve annotation performance, while prioritizing rule application by rule weight had no significant impact on performance. Research limitations: ICTCLAS was used off the shelf to identify word boundaries, without optimization for the biodiversity domain, so the tool may not have been used to best effect. Practical implications & Originality/value: The heuristics-based approach, enhanced by leading-word analysis, reached an F value of 0.9216, which is sufficiently accurate for practical use.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 90818001 and the Natural Science Foundation of Shandong Province of China under Grant No. Y2007G24.
Abstract: Semantic annotation of Web objects is a key problem for Web information extraction. The Web contains an abundance of useful semi-structured information about real-world objects, and empirical study shows that Web information about objects of the same type, across different Web sites, exhibits strong two-dimensional sequence characteristics and correlative characteristics. Conditional Random Fields (CRFs) are the state-of-the-art approach for exploiting sequence characteristics to improve labeling. However, because of the correlative characteristics between Web object elements, previous CRFs are limited for semantic annotation of Web objects and cannot handle long-distance dependencies between Web object elements efficiently. To better incorporate long-distance dependencies, this paper first describes them with correlative edges, which are built from structured information and the characteristics of records in external databases. It then presents two-dimensional Correlative-Chain Conditional Random Fields (2DCC-CRFs) for semantic annotation of Web objects. This approach extends a classic model, two-dimensional Conditional Random Fields (2DCRFs), by adding correlative edges. Experimental results on a large amount of real-world data collected from diverse domains show that the proposed approach can significantly improve the accuracy of semantic annotation of Web objects.
Abstract: A large semantic gap exists between content-based image retrieval (CBIR) and high-level semantics, so additional semantic information should be attached to images. This involves three aspects: the semantic representation model, the building of semantic information, and semantic retrieval techniques. In this paper, we introduce an associated semantic network and an automatic semantic annotation system. The system employs a semantic network model as its semantic representation model, which uses semantic keywords, a linguistic ontology, and low-level features to calculate semantic similarity. Through several rounds of user relevance feedback, the semantic network is enriched automatically. Semantic seeds and semantic loners are employed to speed up the growth of the semantic network and to obtain balanced annotation.
Funding: Supported by the National Basic Research Program of China (No. 2013CB329502), the National Natural Science Foundation of China (No. 61202212), the Special Research Project of the Educational Department of Shaanxi Province of China (No. 15JK1038), and the Key Research Project of Baoji University of Arts and Sciences (No. ZK16047).
Abstract: Automatic image annotation has been an active research topic in computer vision and pattern recognition for decades. A two-stage automatic image annotation method based on a Gaussian mixture model (GMM) and a random walk model (abbreviated GMM-RW) is presented. First, a GMM fitted by the rival penalized expectation maximization (RPEM) algorithm is employed to estimate the posterior probability of each annotation keyword. Then a random walk over a constructed label similarity graph further mines the potential correlations among the candidate annotations to obtain refined results, which plays a crucial role in semantic-based image retrieval. The work makes three contributions. First, a GMM is used to produce the initial semantic annotations; in particular, the RPEM algorithm used to train the model can determine the number of GMM components automatically. Second, a label similarity graph is constructed from a weighted linear combination of label similarity and the visual similarity of the images associated with the corresponding labels, which helps avoid polysemy and synonymy problems during annotation. Third, the random walk over the constructed label graph further refines the candidate annotation set generated by the GMM. Experiments on the standard Corel5k dataset demonstrate that GMM-RW significantly outperforms several state-of-the-art methods in both effectiveness and efficiency for automatic image annotation.
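The refinement stage described above can be illustrated with a small random walk over a label graph. The following is a sketch only: it assumes a random walk with restart, in which candidate scores (standing in for the GMM posteriors) act as the restart distribution, and the graph, the scores, and the parameter alpha are made-up illustrative values. The abstract does not state the exact update rule, so this is not the authors' formulation.

```python
def random_walk_refine(initial, similarity, alpha=0.5, iters=50):
    """Refine label scores by a random walk with restart (assumed variant).

    initial:    {label: score} from a first-stage model (illustrative values)
    similarity: {label: {other_label: weight}} label similarity graph
    alpha:      probability of following a graph edge instead of restarting
    """
    labels = list(initial)

    # Row-normalize edge weights into transition probabilities.
    trans = {}
    for a in labels:
        total = sum(similarity[a].get(b, 0.0) for b in labels if b != a)
        trans[a] = {
            b: (similarity[a].get(b, 0.0) / total if total else 0.0)
            for b in labels if b != a
        }

    # The normalized initial scores serve as the restart distribution.
    norm = sum(initial.values())
    prior = {a: initial[a] / norm for a in labels}

    scores = dict(prior)
    for _ in range(iters):
        scores = {
            b: alpha * sum(scores[a] * trans[a].get(b, 0.0) for a in labels if a != b)
            + (1 - alpha) * prior[b]
            for b in labels
        }
    return scores


initial = {"sky": 0.5, "cloud": 0.1, "car": 0.4}
similarity = {
    "sky": {"cloud": 0.9, "car": 0.1},
    "cloud": {"sky": 0.9, "car": 0.1},
    "car": {"sky": 0.1, "cloud": 0.1},
}
refined = random_walk_refine(initial, similarity)
print(sorted(refined, key=refined.get, reverse=True))
```

In this toy graph the weakly scored label "cloud" gains weight because it is strongly tied to the confidently predicted "sky", which is the kind of correlation the second stage is meant to exploit.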
Funding: Supported by the National Social Science Foundation of China (Grant No. 11CTQ003).
Abstract: Purpose: This paper tests the effect of ontology-based semantic annotation on document retrieval performance. Design/methodology/approach: An integrated document retrieval method is put forward in which the entities in documents are annotated with an upper ontology and a domain ontology, and the documents are then indexed by these entity annotations as well as by traditional keywords. Findings: The results show that the ontology-based entity index enables structured entity retrieval and relation retrieval, which are beyond the ability of traditional keyword-based retrieval. The experiment also shows that the recall and precision of document retrieval are improved. Research limitations: Because our current tourism domain ontology is small, document retrieval with the ontology-based semantic index is limited by the size of the ontology and the precision of semantic annotation. In addition, the semantic annotation algorithm relies mainly on the current information extraction strategy of the KIM Platform, so the performance of the disambiguation and relation extraction algorithms needs further improvement. Practical implications: Our method can improve the efficiency of document retrieval systems, which facilitates knowledge and document management in corporations, governments, and other organizations. Originality/value: The proposed integrated document retrieval method combines an entity index, based on the general ontology and the domain ontology, with a keyword index. Our results verify the effectiveness of this combined index strategy.
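The combined index strategy can be sketched as two inverted indexes that are queried together. This is a minimal illustration only: the class and method names, the ontology classes ("ScenicSpot", "City"), and the sample documents are hypothetical, and the sketch does not use or imitate the KIM Platform API.

```python
from collections import defaultdict


class CombinedIndex:
    """Keyword index plus an index over ontology classes of annotated entities."""

    def __init__(self):
        self.keyword_index = defaultdict(set)  # token -> document ids
        self.entity_index = defaultdict(set)   # ontology class -> document ids

    def add(self, doc_id, text, entities):
        """entities: iterable of (surface_form, ontology_class) annotations."""
        for token in text.lower().split():
            self.keyword_index[token].add(doc_id)
        for _surface, ontology_class in entities:
            self.entity_index[ontology_class].add(doc_id)

    def search(self, keyword=None, entity_class=None):
        """Intersect the keyword hits with the entity-class hits."""
        results = None
        if keyword is not None:
            results = set(self.keyword_index.get(keyword.lower(), set()))
        if entity_class is not None:
            docs = self.entity_index.get(entity_class, set())
            results = set(docs) if results is None else results & docs
        return results or set()


idx = CombinedIndex()
idx.add("d1", "Visit the West Lake in Hangzhou",
        [("West Lake", "ScenicSpot"), ("Hangzhou", "City")])
idx.add("d2", "Hangzhou hotel prices rise", [("Hangzhou", "City")])
print(idx.search(keyword="hangzhou", entity_class="ScenicSpot"))
```

A plain keyword query for "hangzhou" would return both documents; adding the entity-class constraint narrows the result to the one that mentions a scenic spot, which is the kind of structured query a keyword index alone cannot answer.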
Funding: Supported by the National Natural Science Foundation of China (No. 60773110) and the Youth Education Fund of Hunan Province (No. 07B014).
Abstract: To detect, in real time, abnormalities of elderly occupants and devices in an empty-nest home, multi-modal joint sensors are used to collect discrete action sequences of behavior, and an improved hierarchical hidden Markov model is adopted to abstract these action sequences into an occupant's high-level behaviors (events). Structured representation models of occupant normality are then built from large amounts of spatio-temporal data and used as classifiers of normality to detect an occupant's abnormal behavior. To express the context information needed for reasoning and detection, a multi-media ontology (MMO) is designed to annotate and reason about the media information in the smart monitoring system. A pessimistic emotion model (PEM) is improved to analyze multiple interleaved events of multiple active devices in the home. Experiments demonstrate that the PEM can enhance the accuracy and reliability of detecting active devices when these devices are in blind regions or are occluded. The approach performs well in detecting abnormalities involving occupants and devices in real time.
Funding: Supported by the Knowledge Innovation Program of the Chinese Academy of Sciences and the National High-Tech R&D Program of China (2008BAK49B05).
Abstract: More and more web pages use AJAX (Asynchronous JavaScript and XML) for its rich interactivity and incremental communication. Observation shows that AJAX content, which traditional crawlers cannot see, is generally well structured and belongs to one specific domain. Extracting structured data from AJAX content and annotating its semantics are therefore significant for further applications. This paper proposes a structured AJAX data extraction method for the agricultural domain based on an agricultural ontology. First, Crawljax, an open AJAX crawling tool, was adapted to explore and retrieve the AJAX content. Second, the retrieved content was partitioned into items, using HTML tags and punctuation to segment it into entity items, which were then classified with the help of the agricultural ontology. Finally, the entity items were clustered, and semantic annotations were assigned to the clustering results according to the agricultural ontology. Experimental evaluation shows that the proposed approach is effective in resource exploration, entity extraction, and semantic annotation.
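The segmentation and classification steps can be illustrated with a few lines of Python. This is a rough sketch under stated assumptions: the regular expressions, the dictionary used as a stand-in for the agricultural ontology, and the sample markup are all hypothetical, and the crawling stage (Crawljax) and the clustering stage are not shown.

```python
import re


def segment_items(html):
    """Split retrieved content into candidate entity items.

    HTML tags and a small set of punctuation marks act as separators;
    the full stop is left alone so that decimals such as 2.4 survive.
    """
    text = re.sub(r"<[^>]+>", "|", html)          # every tag becomes a separator
    parts = re.split(r"[|,;:，；：。]+", text)     # split on separators and punctuation
    return [p.strip() for p in parts if p.strip()]


def classify_items(items, ontology):
    """Label each item with its ontology class, or "Unknown" if absent."""
    return [(item, ontology.get(item, "Unknown")) for item in items]


# Hypothetical stand-in for an agricultural ontology: term -> class.
toy_ontology = {"Wheat": "Crop", "Corn": "Crop", "Hebei": "Region"}

items = segment_items("<li>Wheat: 2.4 yuan/kg</li><li>Corn; Hebei</li>")
print(classify_items(items, toy_ontology))
```

Items that the ontology does not cover, such as the price string, stay unlabelled here; in the described method such items would be handled by the later clustering step.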
Funding: Supported by the National Natural Science Foundation of China (61202193, 61202304), the Major Projects of Chinese National Social Science Foundation (11&ZD189), and the Chinese Postdoctoral Science Foundation (2013M540593, 2014T70722).
Abstract: In this paper we propose a novel model, the "recursive directed graph", based on feature structure, and apply it to represent the semantic relations of postpositive attributive structures in biomedical texts. Postpositive attributives are complex and variable in usage, especially three categories: present participle phrases, past participle phrases, and prepositional phrases used as postpositive attributives, all of which make automatic parsing difficult. We summarize these categories and annotate their semantic information. Compared with dependency structure, feature structure, formalized as a recursive directed graph, enhances semantic information extraction in the biomedical field. The annotation results show that the recursive directed graph is better suited to extracting complex semantic relations for biomedical text mining.
Abstract: To address the irregularity, poor efficiency, and low reusability of resources, a hierarchical model of an ontology-based e-learning system is proposed. Some key techniques used in the project are also discussed, such as ontology construction and the content ontology for describing the semantics of learning materials.
Funding: Supported by the National Natural Science Foundation of China (61202193, 61202304), the Major Projects of Chinese National Social Science Foundation (11&ZD189), the Chinese Postdoctoral Science Foundation (2013M540593, 2014T70722), the Accomplishments of Listed Subjects in Hubei Prime Subject Development, and the Open Foundation of Shandong Key Lab of Language Resource Development and Application.
Abstract: It is difficult to analyze semantic relations automatically, especially those of special Chinese sentence patterns. In this paper, we apply a novel model, feature structure, to represent Chinese semantic relations; it is formalized as a "recursive directed graph". We focus on special Chinese sentence patterns, including complex noun phrases, verb-complement structures, pivotal sentences, serial verb sentences, and subject-predicate predicate sentences. Feature structure supports richer extraction of Chinese semantic information than dependency structure. The results show that the recursive directed graph is better suited to extracting complex Chinese semantic relations.