As software grows in scale, software update and maintenance have become increasingly important. However, frequent code updates make the software more likely to introduce new defects, so predicting defects in software changes quickly and accurately has become an important problem for developers. Current defect prediction methods often cannot reflect the feature information of a defect comprehensively, and their detection performance is not ideal. We therefore propose a novel defect prediction model named ITNB (Improved Transfer Naive Bayes), based on an improved transfer Naive Bayes algorithm, which considers two aspects: (1) because the edge data of the test set may affect the similarity calculation and the final prediction result, we remove the edge data of the test set when calculating the data similarity between the training set and the test set; (2) because each feature dimension affects defect prediction differently, we construct a formula for training data weight based on feature dimension weight and data gravity, and then calculate the prior probability and the conditional probability of the training data from the weight information, so as to construct a weighted Bayesian classifier for software defect prediction. To evaluate the ITNB model, we use six datasets from large open source projects, namely Bugzilla, Columba, Mozilla, JDT, Platform and PostgreSQL, and compare it with the transfer Naive Bayes (TNB) model. The experimental results show that ITNB achieves better results than TNB in terms of accuracy, precision and pd for both within-project and cross-project defect prediction.
SaaS software that provides services through a cloud platform is now widely used. However, running SaaS software can suffer performance faults due to factors such as its structural design or a complex environment. Diagnosing the software quickly and accurately when a performance fault occurs is a major challenge. To address it, we propose a novel performance fault diagnosis method for SaaS software based on the GBDT (Gradient Boosting Decision Tree) algorithm. In particular, we use monitoring means to obtain the performance log and warning log while the SaaS software system runs, establish the set of performance fault types, and determine the performance log features. We also annotate the performance log with performance fault types by combining it with the analysis of the warning log. Moreover, we handle incomplete performance logs by mean filling within the same type, and handle the imbalance among types by combining SMOTE (Synthetic Minority Oversampling Technique) with undersampling. Finally, we conduct an empirical study on a disaster reduction system deployed on a cloud platform, which demonstrates that the proposed method diagnoses performance faults of a running SaaS software system with high efficiency and accuracy.
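The "mean filling for the same type" step can be pictured with a minimal sketch. It assumes each performance log record is a list of numeric features (with `None` marking a missing value) plus a fault type label; the record layout, the function name `fill_missing_by_class_mean`, and the sample labels are hypothetical and not taken from the paper.

```python
from collections import defaultdict
from typing import List, Optional, Tuple

# Hypothetical record layout: (feature values, fault type label).
Record = Tuple[List[Optional[float]], str]


def fill_missing_by_class_mean(records: List[Record]) -> List[Tuple[List[float], str]]:
    """Replace each missing feature with the mean of that feature
    over the records that share the same fault type."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))

    # First pass: accumulate per-type, per-feature sums and counts.
    for features, label in records:
        for j, value in enumerate(features):
            if value is not None:
                sums[label][j] += value
                counts[label][j] += 1

    # Second pass: fill the gaps with the per-type mean.
    filled = []
    for features, label in records:
        row = []
        for j, value in enumerate(features):
            if value is None:
                if counts[label][j] == 0:
                    raise ValueError(f"no observed value for feature {j} in type {label!r}")
                value = sums[label][j] / counts[label][j]
            row.append(value)
        filled.append((row, label))
    return filled


logs = [
    ([1.0, None], "cpu"),
    ([3.0, 4.0], "cpu"),
    ([None, 10.0], "mem"),
    ([8.0, 6.0], "mem"),
]
filled_logs = fill_missing_by_class_mean(logs)
```

Filling within a fault type, rather than with a global mean, keeps an imputed value from pulling a record toward the profile of a different fault type. The SMOTE and undersampling steps are not shown here.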
Software defect prediction plays an important role in software quality assurance. However, the performance of a prediction model is susceptible to irrelevant and redundant features. In addition, previous studies mostly regard software defect prediction as a single-objective optimization problem, and multi-objective software defect prediction has not been thoroughly investigated. To address these two issues, we propose the following solutions: (1) we leverage an advanced deep neural network, the Stacked Contractive AutoEncoder (SCAE), to extract robust deep semantic features from the original defect features, which have stronger discrimination capacity for the different classes (defective or non-defective); (2) we propose a novel multi-objective defect prediction model named SMONGE, which uses the multi-objective NSGA-II algorithm to optimize an advanced neural network, the Extreme Learning Machine (ELM), based on state-of-the-art Pareto optimal solutions over the features extracted by SCAE. We mainly consider two objectives. One is to maximize the performance of ELM, which corresponds to the benefit of the SMONGE model. The other is to minimize the output weight norm of ELM, which is related to the cost of the SMONGE model. We compare SCAE with six state-of-the-art feature extraction methods, and compare the SMONGE model with multiple baseline models, comprising four classic defect predictors and the MONGE model without SCAE, across 20 open source software projects. The experimental results verify the superiority of SCAE and SMONGE on seven evaluation metrics.
Software defect prediction is a research hotspot in the field of software engineering. However, due to the limitations of current machine learning algorithms, good defect prediction cannot be achieved by using machine learning algorithms alone. In previous studies, some researchers used the extreme learning machine (ELM) for defect prediction. However, the initial weights and biases of the ELM are determined randomly, which reduces its prediction performance. Motivated by the idea of search-based software engineering, we propose a novel software defect prediction model named KAEA, based on kernel principal component analysis (KPCA), an adaptive genetic algorithm, the extreme learning machine and the Adaboost algorithm. It has three main advantages: (1) KPCA can extract optimal representative features by leveraging a nonlinear mapping function; (2) the adaptive genetic algorithm optimizes the initial weights and biases of ELM, improving its generalization ability and prediction capacity; (3) the Adaboost algorithm integrates multiple ELM base predictors optimized by the adaptive genetic algorithm into a strong predictor, which further improves defect prediction. To evaluate KAEA, we use eleven datasets from large open source projects and compare KAEA with four basic machine learning classifiers, ELM and three ELM variants. The experimental results show that KAEA is superior to these baseline models in most cases.
Software defect prediction plays a very important role in software quality assurance; it aims to inspect as many potentially defect-prone software modules as possible. However, the performance of a prediction model is susceptible to the high dimensionality of a dataset that contains irrelevant and redundant features. In addition, the software metrics used for defect prediction are almost entirely traditional features, in contrast to the deep semantic feature representations produced by deep learning techniques. To address these two issues, we propose two solutions: (1) we leverage a novel non-linear manifold learning method, SOINN Landmark Isomap (SL-Isomap), to extract representative features by automatically selecting a reasonable number and position of landmarks, which can reveal the complex intrinsic structure hidden behind the defect data; (2) we propose a novel defect prediction model named DLDD based on hybrid deep learning techniques, which leverages a denoising autoencoder to learn true input features that are not contaminated by noise, and uses a deep neural network to learn abstract deep semantic features. We combine the squared error loss function of the denoising autoencoder with the cross entropy loss function of the deep neural network, and achieve the best prediction performance by adjusting a hyperparameter. We compare SL-Isomap with seven state-of-the-art feature extraction methods and compare the DLDD model with six baseline models across 20 open source software projects. The experimental results verify the superiority of SL-Isomap and DLDD on four evaluation indicators.
A substantial number of studies have been published since the Ninth International Workshop on Tropical Cyclones (IWTC-9) in 2018, improving our understanding of the effect of climate change on tropical cyclones (TCs) and associated hazards and risks. These studies have reinforced the robustness of increases in TC intensity and associated TC hazards and risks due to anthropogenic climate change. New modeling and observational studies suggested the potential influence of anthropogenic climate forcings, including greenhouse gases and aerosols, on global and regional TC activity at decadal and century time scales. However, substantial uncertainties remain, owing to model uncertainty in simulating historical TC decadal variability in the Atlantic and to the limitations of observed TC records. The projected future change in the global number of TCs has become more uncertain since IWTC-9 because a few climate models project increases in TC frequency. A new paradigm, TC seeds, has been proposed, and there is currently a debate on whether seeds can help explain the physical mechanism behind the projected changes in global TC frequency. New studies also highlighted the importance of large-scale environmental fields, such as snow cover and air-sea interactions, for TC activity. Future projections of TC translation speed and medicanes are new focus topics in our report. Recommendations and future research directions are proposed to address the remaining scientific questions and assist policymakers.
The majority of adult parasitoid wasps are unable to synthesize lipids and therefore face a trade-off between investing lipids in eggs and in the maintenance of soma. It has been shown that resource allocation should depend on body size in parasitoids. Given that smaller females have shorter expected lifetimes, they should concentrate their reproductive effort into early life. To test this prediction, we investigated the relationship between body size and the timing of egg production in parasitoids. We measured body size, lipid reserves, and reproductive investment (number of eggs, ovigeny index equivalent [OIE] and egg size) at eclosion in five species of Asobara (Hymenoptera: Braconidae) originating from different geographic and climatic environments. Our results show significant interspecific variation in all these traits. A diagnostic test for phylogenetic independence revealed that closely related species did not resemble each other more closely than expected by chance for all traits measured. Lipid reserves scaled positively with body size both between and within species. In agreement with theory, the ovigeny index correlated negatively with body size both between and within species. Total egg area at eclosion correlated negatively with lipid reserves both between and within species. This indicates the existence of a trade-off between allocation of lipids to current reproduction and to survival/future reproduction. With the exception of the most extreme pro-ovigenic species, A. persimilis, we found that pro-ovigeny was compensated for by small egg size. Our results indicate the role of habitats in shaping interspecific variation in resource allocation strategies.
Background: Axons, crucial for impulse transmission and cellular trafficking, are thought to be primary targets of neurodegeneration in Parkinson's disease (PD) and dementia with Lewy bodies (DLB). Axonal degeneration occurs early, preceding and exceeding neuronal loss, and contributes to the spread of pathology, yet is poorly described outside the nigrostriatal circuitry. The insula, a cortical brain hub, was recently discovered to be highly vulnerable to pathology and plays a role in cognitive deficits in PD and DLB. The aim of this study was to evaluate morphological features as well as the burden of proteinopathy and axonal degeneration in the anterior insular sub-regions in PD, PD with dementia (PDD), and DLB. Methods: α-Synuclein, phosphorylated (p-)tau, and amyloid-β pathology load were evaluated in the anterior insular (agranular and dysgranular) sub-regions of post-mortem human brains (n=27). Axonal loss was evaluated using modified Bielschowsky silver staining and quantified using stereology. Cytoskeletal damage was comprehensively studied using immunofluorescent multi-labelling and 3D confocal laser-scanning microscopy. Results: Compared to PD and PDD, DLB showed significantly higher α-synuclein and p-tau pathology load, argyrophilic grains, and more severe axonal loss, particularly in the anterior agranular insula. In contrast, the dysgranular insula showed a significantly higher load of amyloid-β pathology, and its axonal density correlated with cognitive performance. p-Tau contributed most to axonal loss in the DLB group, was highest in the anterior agranular insula, and significantly correlated with CDR global scores for dementia. Neurofilament and myelin showed degenerative changes including swellings, demyelination, and detachment of the axon-myelin unit. Conclusions: Our results highlight the selective vulnerability of the anterior insular sub-regions to various converging pathologies, leading to impaired axonal integrity in PD, PDD and DLB, disrupting their functional properties and potentially contributing to cognitive, emotional, and autonomic deficits.
Semantic representation of evidence-based medical guidelines supports data interoperability and has found many applications in the medical domain. In this paper, we describe an approach to the semantic representation of evidence-based medical guidelines that is based on the Semantic Web Technology standards. We discuss several use cases of this semantic representation of evidence-based medical guidelines, and show that they are potentially useful for medical applications.
Literature searches on the Web return great volumes of query results. A model is presented here to refine the search process using user interests. User interests are analyzed to calculate the semantic similarity among the interest terms and so refine the query. Traditional general-purpose similarity measures may not always fit a domain-specific context. This paper presents a similarity measure for medical literature searches, the normalized MEDLINE distance, which is based on the biomedical literature knowledge source "MEDLINE" and more reasonably reflects the relevance between medical terms. The measure gives more accurate user interest descriptions by calculating the similarities of user interest terms and using them to rerank the interest term list. The accurate user interest descriptions can be used for query refinement in keyword searches to give more personalized results for the user. The measure also improves the personalization of search results by controlling the number of results returned for each topic of interest.
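The abstract does not state the formula for the normalized MEDLINE distance. As a purely illustrative sketch, the code below assumes it follows the same form as the well-known normalized Google distance, with MEDLINE hit counts in place of web hit counts. The function name `normalized_distance` and all counts are hypothetical.

```python
import math


def normalized_distance(fx: int, fy: int, fxy: int, n: int) -> float:
    """Normalized-Google-distance-style measure (assumed form, not the paper's).

    fx, fy -- number of documents containing term x, term y
    fxy    -- number of documents containing both terms
    n      -- total number of documents in the collection
    Smaller values mean the terms co-occur more often than chance suggests.
    """
    if fx <= 0 or fy <= 0:
        raise ValueError("each term must occur in at least one document")
    if fxy == 0:
        return math.inf  # terms never co-occur
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


# Hypothetical hit counts for two interest terms in a collection of 1,000,000 records.
identical_usage = normalized_distance(1000, 1000, 1000, 1_000_000)
rare_cooccurrence = normalized_distance(100, 10, 1, 1000)
```

Under this assumed form, interest terms could be reranked by their average distance to the other terms in the user's list, which is one way to read the reranking described in the abstract.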
In the absence of a central naming authority on the Semantic Web, it is common for different data sets to refer to the same thing by different names. Whenever multiple names are used to denote the same thing, owl:sameAs statements are needed in order to link the data and foster reuse. Studies dating back as far as 2009 observed that the owl:sameAs property is sometimes used incorrectly. In our previous work, we presented an identity graph containing over 500 million explicit and 35 billion implied owl:sameAs statements, and presented a scalable approach for automatically calculating an error degree for each identity statement. In this paper, we generate subgraphs of the overall identity graph that correspond to certain error degrees. We show that even though the Semantic Web contains many erroneous owl:sameAs statements, it is still possible to use Semantic Web data while minimising the adverse effects of misusing owl:sameAs.
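Because owl:sameAs is reflexive, symmetric and transitive, a small set of explicit statements implies many more. The sketch below groups terms into identity sets with a union-find structure. It only illustrates how implied identity arises, and is not the authors' pipeline or their error-degree computation. The function name `identity_sets` and the `ex:` terms are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def identity_sets(same_as_pairs: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group terms connected by explicit owl:sameAs statements into identity sets."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        # Locate the representative of x, halving paths along the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Each explicit statement merges the sets of its two terms.
    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups: Dict[str, Set[str]] = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    # Sort for a deterministic result.
    return sorted(groups.values(), key=lambda s: sorted(s))


explicit = [("ex:a", "ex:b"), ("ex:b", "ex:c"), ("ex:d", "ex:e")]
sets_before = identity_sets(explicit)

# A single additional (possibly erroneous) link merges two identity sets.
sets_after = identity_sets(explicit + [("ex:c", "ex:d")])
```

The second call shows why erroneous statements matter: one wrong link makes every term in one set identical to every term in the other.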
Funding: the National Natural Science Foundation of China under Grant Nos. 60573063 and 60573064; the National Basic Research Program of China under Grant Nos. G1999032701 and 2003CB317008; the National Laboratory of Intelligent Information Processing of China; the National Laboratory of Software Development Environment of China.
Funding (ITNB defect prediction model): This work is supported in part by the National Science Foundation of China (Nos. 61672392, 61373038) and in part by the National Key Research and Development Program of China (No. 2016YFC1202204).
Funding (SaaS performance fault diagnosis): This work is supported in part by the National Science Foundation of China (Nos. 61672392, 61373038) and in part by the National Key Research and Development Program of China (No. 2016YFC1202204).
Funding (SMONGE model): This work is supported in part by the National Science Foundation of China (Grant Nos. 61672392, 61373038) and in part by the National Key Research and Development Program of China (Grant No. 2016YFC1202204).
Funding (KAEA model): This work is supported in part by the National Science Foundation of China (Nos. 61672392, 61373038) and in part by the National Key Research and Development Program of China (No. 2016YFC1202204).
Funding (DLDD model): This work is supported in part by the National Science Foundation of China (Grant Nos. 61672392, 61373038) and in part by the National Key Research and Development Program of China (Grant No. 2016YFC1202204).
Funding (tropical cyclones and climate change): support from NSF (AGS 20-43142 and AGS 22-17618), NOAA (NA21OAR4310344), DOE (DE SC0023333), and the Vetlesen Foundation. SSC acknowledges funding support from the Climate Systems Hub of the Australian Government's National Environmental Science Program (NESP); funded by the Korea government (MSIT) (No. RS-2022-00144325); the Ministry of Education (Basic Science Research Program, 2021R1A2C1005287).
Funding (semantic representation of medical guidelines): Supported by the European Commission under the 7th Framework EURECA Project (FP7-ICT-2011-7, 288048), the Key Projects of National Social Science Foundation of China (11ZD&189), and the Natural Science Foundation of Hubei Province (2014CFB247).
Funding (normalized MEDLINE distance): Supported by the European Commission under the 7th Framework Programme, the Large Knowledge Collider (LarKC) Project (No. FP7-215535).