Abstract: A new ontology-based question expansion (OBQE) method is proposed for question similarity calculation in a frequently asked question (FAQ) answering system. Traditional question similarity calculation methods compose the question vector from individual words, so the semantic relations between words are ignored. OBQE treats these relations as an important part of the representation. The new system works in three steps: ① build a two-layered domain ontology with reference to WordNet and a domain corpus; ② expand question trunks into domain cases; ③ use vectors composed of domain cases to calculate question similarity. The experimental results show that OBQE improves the performance of question similarity calculation.
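The contrast between word vectors and expanded vectors can be illustrated with a minimal sketch. It is not the OBQE implementation: the toy ontology, the function names, and the choice of cosine similarity are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter

# Hypothetical toy ontology (an assumption, not from the paper):
# each surface word maps to the domain concepts it expands to.
TOY_ONTOLOGY = {
    "laptop": ["computer", "device"],
    "notebook": ["computer", "device"],
    "price": ["cost"],
    "cost": ["cost"],
}


def expand(question):
    """Replace each word by its ontology concepts; unknown words are kept as-is."""
    concepts = []
    for word in question.lower().split():
        concepts.extend(TOY_ONTOLOGY.get(word, [word]))
    return Counter(concepts)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def word_similarity(q1, q2):
    """Baseline: compare the raw bags of words."""
    return cosine(Counter(q1.lower().split()), Counter(q2.lower().split()))


def expanded_similarity(q1, q2):
    """Compare the questions after ontology-based expansion."""
    return cosine(expand(q1), expand(q2))
```

With this toy ontology, "laptop price" and "notebook cost" share no words, so the word-level similarity is zero, while their expanded vectors coincide.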
Funding: This work was supported by the National Key R&D Program of China (2019YFE0102900).
Abstract: Infrared image recognition plays an important role in the inspection of power equipment. Existing technologies dedicated to this purpose often require manually selected features, which are neither transferable nor interpretable, and have limited training data. To address these limitations, this paper proposes an automatic infrared image recognition framework, which includes an object recognition module based on a deep self-attention network and a temperature distribution identification module based on a multi-factor similarity calculation. First, the features of an input image are extracted and embedded using a multi-head attention encoding-decoding mechanism. Thereafter, the embedded features are used to predict the equipment component category and location. In the located area, preliminary segmentation is performed. Finally, similar areas are gradually merged, and the temperature distribution of the equipment is obtained to identify a fault. Our experiments indicate that the proposed method demonstrates significantly improved accuracy compared with other related methods and, hence, provides a good reference for the automation of power equipment inspection.
Funding: This work is supported in part by the National Science Foundation of China (Nos. 61672392, 61373038) and in part by the National Key Research and Development Program of China (No. 2016YFC1202204).
Abstract: With the continuous expansion of software scale, software update and maintenance have become more and more important. However, frequent code updates make the software more likely to introduce new defects, so predicting defects in software changes quickly and accurately has become an important problem for software developers. Current defect prediction methods often cannot reflect the feature information of a defect comprehensively, and their detection performance is not ideal. Therefore, in this paper we propose a novel defect prediction model named ITNB (Improved Transfer Naive Bayes), based on an improved transfer Naive Bayes algorithm, which considers the following two aspects: (1) because the edge data of the test set may affect the similarity calculation and the final prediction result, we remove the edge data of the test set when calculating the data similarity between the training set and the test set; (2) because each feature dimension has a different effect on defect prediction, we construct the formula for training data weight from the feature dimension weight and data gravity, and then calculate the prior probability and the conditional probability of the training data from the weight information, so as to construct a weighted Bayesian classifier for software defect prediction. To evaluate the performance of the ITNB model, we use six datasets from large open-source projects, namely Bugzilla, Columba, Mozilla, JDT, Platform, and PostgreSQL. We compare the ITNB model with the transfer Naive Bayes (TNB) model. The experimental results show that our ITNB model achieves better results than the TNB model in terms of accuracy, precision, and pd for within-project and cross-project defect prediction.
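The idea of deriving prior and conditional probabilities from per-row training weights can be sketched as follows. This is a simplified illustration under stated assumptions, not the ITNB model: the gravity-style weight `s / (k - s + 1) ** 2`, the Laplace smoothing, the discrete features, and all function names are assumptions, and the sketch omits the edge-data removal and the feature dimension weights described in the abstract.

```python
def instance_weights(train_X, test_X):
    """Weight each training row by how many of its features fall inside
    the per-feature [min, max] range of the test set (assumed formula)."""
    k = len(train_X[0])
    lows = [min(row[j] for row in test_X) for j in range(k)]
    highs = [max(row[j] for row in test_X) for j in range(k)]
    weights = []
    for row in train_X:
        s = sum(lows[j] <= row[j] <= highs[j] for j in range(k))
        weights.append(s / (k - s + 1) ** 2)
    return weights


def weighted_prior(labels, weights, n_classes=2):
    """Laplace-smoothed class prior where each row counts by its weight."""
    total = sum(weights)
    prior = {}
    for c in range(n_classes):
        mass = sum(w for y, w in zip(labels, weights) if y == c)
        prior[c] = (mass + 1.0) / (total + n_classes)
    return prior


def weighted_conditional(train_X, labels, weights, j, value, c, n_values):
    """Laplace-smoothed P(feature j == value | class c) using row weights."""
    num = sum(w for row, y, w in zip(train_X, labels, weights)
              if y == c and row[j] == value)
    den = sum(w for y, w in zip(labels, weights) if y == c)
    return (num + 1.0) / (den + n_values)


def predict(train_X, labels, weights, x, n_classes=2):
    """Return the class with the largest weighted naive Bayes score."""
    prior = weighted_prior(labels, weights, n_classes)
    best_class, best_score = None, -1.0
    for c in range(n_classes):
        score = prior[c]
        for j, value in enumerate(x):
            n_values = len({row[j] for row in train_X})
            score *= weighted_conditional(
                train_X, labels, weights, j, value, c, n_values)
        if score > best_score:
            best_class, best_score = c, score
    return best_class
```

In this sketch a training row whose features all lie outside the test set's range receives weight zero, so it contributes nothing beyond the smoothing terms.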
Funding: This work was supported by the National Natural Science Foundation of China (61871062, 61771082, 61901071), the Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN201800615), and the General Project of Natural Science Foundation of Chongqing (cstc2019jcyj-msxmX0303).
Abstract: In recent years, a large number of intelligent sensing devices have been deployed in the physical world, which creates great difficulties for existing entity search. As the number of intelligent sensing devices increases, the accuracy with which the search system finds entities matching the user's request falls, and the delay of entity search rises. We use mobile edge technology to alleviate this problem by processing user requests on the edge side, and propose a similar physical entity matching strategy for mobile edge search. First, the raw data collected by the sensor is converted to a lightweight representation to reduce the storage overhead of the observed data. Furthermore, a physical entity matching degree estimation method is proposed, in which the similarity between a sensor in the network and the given sensor is estimated, and the matching search for the user request is performed according to this similarity. Simulation results show that the proposed method can effectively reduce the data storage overhead and improve the precision of the sensor search system.
Abstract: For machine translation, Japanese sentences with determiners such as “shika...nai”, “tyoutto...dakedeha”, and “tada...dake” are a special case with a comparatively regular sentence structure. This research collects and classifies Japanese sentences that contain such determiners. The classification follows the characteristics of the Japanese sentences and the translation habits of Chinese sentences. Through further abstraction and simplification, translation templates are extracted by gathering grammar rule information, studying syntax, and analyzing the collocation patterns of the sentences. These determiners express a confirmed meaning, and the corresponding Chinese translations have the same characteristic. By analyzing the characteristics of sentences with determiners and formalizing their structure, the translation templates are abstracted. By investigating the structural characteristics of the original sentences against the translation templates, a similarity algorithm is defined. The threshold value for the similarity calculation was obtained in preliminary experiments, and the Japanese-Chinese translation experiments were carried out on a small corpus. The experimental results for several kinds of Japanese sentences with determiners show a translation accuracy rate of 68.6% and a template coverage rate of 83.3%. Finally, analysis of the translation errors leads to the following conclusions: the results of morphological analysis are erroneous, because word segmentation errors cause part-of-speech tagging errors, so the grammatical structure cannot be matched with the templates; the original sentences are long, especially the complex sentences; the templates are too complicated; the similarity calculation method needs further discussion; and so on.
Funding: Supported by the US National Science Foundation (No. CHE-1051396).
Abstract: Atmospheric nucleation is a process of phase transformation that plays a significant role in many atmospheric and technological processes. To simulate atmospheric nucleation activities, molecular models with three-dimensional (3-D) structures are generated. Analyzing these 3-D molecular models can help promote understanding of nucleation processes. Unfortunately, the ability to understand atmospheric nucleation processes is greatly restricted by the lack of efficient visual data exploration tools. In this paper, we present a data visualization solution to visualize and classify 3-D molecular crystals. We developed a novel algorithm for calculating the similarity between 3-D molecular crystals, and further improved the overall system performance with GPU (graphics processing unit) acceleration.
Funding: Supported by the Natural Science Foundation of Jiangxi Province (20212BAB202018), the Provincial Virtual Simulation Experiment Education Project of Jiangxi Education Department (2020-2-0048), and the Science and Technology Research Project of Jiangxi Province Educational Department (GJJ210333).
Abstract: An improved Hybrid Collaborative Filtering algorithm (H-CF) is proposed, addressing the data sparsity, low recommendation accuracy, and poor scalability of traditional collaborative filtering algorithms. The core of H-CF is a linear weighted hybrid algorithm based on the Latent Factor Model (LFM) and the Improved Item Clustering and Similarity Calculation Collaborative Filtering Algorithm (ITCSCF). To begin with, the items are clustered based on their attribute dimension, which accelerates the computation of the nearest neighbor set. Subsequently, H-CF enhances the formula for scoring similarity by penalizing popular items and optimizing unpopular items. This improvement makes the scoring similarity more reasonable and reduces the impact of data sparsity. Furthermore, a weighting function is employed to combine the improved algorithms. The balance factor of the weighting function is dynamically adjusted to attain the optimal recommendation list. To address real-time and scalability concerns, the algorithm leverages the Spark distributed cluster computing framework. Experiments were conducted on the public MovieLens dataset, where the improved algorithm's performance was compared against the algorithm before enhancement and against the algorithm running on a single machine. The experimental results demonstrate that the improved algorithm performs better in terms of data sparsity, recommendation personalization, accuracy, recall, and efficiency.
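Two of the ingredients named above, a popularity-penalized item similarity and a linear weighted hybrid with a tunable balance factor, can be sketched as below. The abstract does not give H-CF's formulas, so everything here is an assumption: the similarity is the common co-occurrence cosine whose denominator grows with item popularity, and the balance factor is chosen by a simple grid search on RMSE rather than by the paper's dynamic adjustment. All function names are hypothetical.

```python
import math


def penalized_item_similarity(users_i, users_j):
    """Co-occurrence similarity between two items, given the sets of users
    who rated each; dividing by sqrt(|N(i)| * |N(j)|) damps popular items."""
    if not users_i or not users_j:
        return 0.0
    common = len(users_i & users_j)
    return common / math.sqrt(len(users_i) * len(users_j))


def hybrid_score(lfm_score, cf_score, balance):
    """Linear weighted combination of two predicted scores."""
    return balance * lfm_score + (1.0 - balance) * cf_score


def best_balance(lfm_preds, cf_preds, truths, steps=10):
    """Grid-search the balance factor in [0, 1] that minimizes RMSE
    on held-out ratings (a stand-in for dynamic adjustment)."""
    best, best_rmse = 0.0, float("inf")
    for step in range(steps + 1):
        balance = step / steps
        squared_error = sum(
            (hybrid_score(l, c, balance) - t) ** 2
            for l, c, t in zip(lfm_preds, cf_preds, truths)
        )
        rmse = math.sqrt(squared_error / len(truths))
        if rmse < best_rmse:
            best, best_rmse = balance, rmse
    return best, best_rmse
```

For example, when one predictor overshoots and the other undershoots by the same amount, the search settles on an equal weighting of the two.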