Abstract: Similarity has long played an important role in computer science, artificial intelligence (AI), and data science. However, similarity intelligence has been ignored in these disciplines. Similarity intelligence is a process of discovering intelligence through similarity. This article explores similarity intelligence, similarity-based reasoning, and similarity computing and analytics. More specifically, it looks at similarity as a form of intelligence and at its impact on several areas of the real world. It shows how similarity intelligence, accompanying experience-based intelligence, knowledge-based intelligence, and data-based intelligence, plays an important role in computer science, AI, and data science. The article explores similarity-based reasoning (SBR) and proposes three similarity-based inference rules. It then examines similarity computing and analytics and a multiagent SBR system. The main contributions of this article are: 1) similarity intelligence is discovered from experience-based intelligence, which consists of data-based intelligence and knowledge-based intelligence; and 2) similarity-based reasoning, computing, and analytics can be used to create similarity intelligence. The proposed approach will facilitate research and development in similarity intelligence, similarity computing and analytics, machine learning, and case-based reasoning.
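As a concrete illustration of the retrieve-and-reuse step behind similarity-based reasoning, the minimal Python sketch below returns the conclusion of the most similar stored case when the similarity clears a threshold; the case base, cosine measure, and threshold are illustrative assumptions, not the article's three formal inference rules.

```python
# Minimal sketch of similarity-based reasoning (SBR): given a new case,
# retrieve the most similar stored case and reuse its conclusion.
# The case base, features, and threshold are illustrative assumptions.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

case_base = [
    ({"features": [1.0, 0.2, 0.0]}, "conclusion A"),
    ({"features": [0.1, 0.9, 0.4]}, "conclusion B"),
]

def sbr_infer(query_features, threshold=0.8):
    """If sim(query, case) >= threshold, reuse that case's conclusion."""
    best_conclusion, best_sim = None, -1.0
    for case, conclusion in case_base:
        s = cosine(query_features, case["features"])
        if s > best_sim:
            best_conclusion, best_sim = conclusion, s
    return best_conclusion if best_sim >= threshold else None

print(sbr_infer([0.9, 0.3, 0.1]))  # reuses "conclusion A" if similar enough
```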
Funding: Supported by the China Postdoctoral Science Foundation (Grant No. 20060400002), the Sichuan Youth Science and Technology Foundation of China (Grant No. 08JJ0109), the National Natural Science Foundation of China (Grant Nos. 60473051, 60503037), the National High-tech Research and Development Program of China (Grant No. 2006AA01Z230), and the Beijing Natural Science Foundation (Grant No. 4062018).
Abstract: The paper proposes a new text similarity computing method based on concept similarity for Chinese text processing. The new method first converts text into a word vector space model and then splits words into sets of concepts. By computing the inner products between concepts, it obtains the similarity between words. Finally, it computes the similarity of texts based on the similarity of words. The contributions of the paper include: 1) a new similarity formula between words; 2) a new text similarity computing method based on word similarity; 3) a successful application of the method to similarity computation for Web news; and 4) validation of the method through extensive experiments.
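The following sketch illustrates the two-level idea in the abstract: word similarity from inner products over concept vectors, then text similarity aggregated from word similarities. The toy concept vectors and the averaging scheme are assumptions for illustration, not the paper's exact formulas.

```python
# Hedged sketch: word similarity via normalized inner products of concept
# vectors, text similarity via symmetric best-match averaging of word
# similarities. Concept vectors below are illustrative assumptions.
import numpy as np

concept_vectors = {
    "car":   np.array([1.0, 0.8, 0.0]),
    "auto":  np.array([0.9, 0.9, 0.1]),
    "train": np.array([0.7, 0.1, 0.9]),
}

def word_sim(w1, w2):
    """Normalized inner product between the words' concept vectors."""
    v1, v2 = concept_vectors[w1], concept_vectors[w2]
    return float(v1 @ v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))

def text_sim(words_a, words_b):
    """Average best-match word similarity, symmetrized over both texts."""
    def directed(src, dst):
        return np.mean([max(word_sim(w, u) for u in dst) for w in src])
    return 0.5 * (directed(words_a, words_b) + directed(words_b, words_a))

print(round(text_sim(["car", "train"], ["auto", "train"]), 3))
```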
Funding: Supported by the National Natural Science Foundation of China (Grant No. 70371052).
Abstract: Ontology heterogeneity is the primary obstacle to interoperation between ontologies. Ontology mapping is the best way to solve this problem, and the key to ontology mapping is similarity computation. At present, methods of similarity computation are imperfect and computationally expensive. To solve these problems, an ontology-mapping framework with a hybrid architecture is put forward, together with an improved method of similarity computation. Different areas have different local ontologies; two ontologies describing classes and teachers in a university are taken as examples to explain the mapping framework and the improved similarity computation. The experimental results show that the framework and improved method increase the accuracy of computation to a certain extent while decreasing the amount of computation.
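Below is a hedged sketch of the kind of concept-to-concept similarity such mapping frameworks combine, mixing lexical similarity of labels with structural overlap of neighboring concepts; the weights and the toy concepts are illustrative assumptions, not the paper's improved method.

```python
# Hedged sketch of concept similarity for ontology mapping: a weighted blend
# of label similarity (edit-distance ratio) and structural similarity
# (Jaccard overlap of neighboring concepts). Weights are assumptions.
from difflib import SequenceMatcher

def label_sim(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def structural_sim(neigh_a, neigh_b):
    union = neigh_a | neigh_b
    return len(neigh_a & neigh_b) / len(union) if union else 0.0

def concept_sim(label_a, neigh_a, label_b, neigh_b, w_label=0.6, w_struct=0.4):
    return w_label * label_sim(label_a, label_b) + w_struct * structural_sim(neigh_a, neigh_b)

# Toy example: "Teacher" in ontology A vs. "Lecturer" in ontology B.
print(round(concept_sim("Teacher", {"Course", "Department"},
                        "Lecturer", {"Course", "Faculty"}), 3))
```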
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60272088).
Abstract: FAQ (frequently asked question) lists are widely used on the Internet, but most FAQ asking and answering is not automatic. This paper introduces the design and implementation of an automatic FAQ answering system based on semantic similarity computation, covering the choice of computation model, analysis of FAQ characteristics, formal expression of FAQ data, feature vector indexing, and weight computation. Given the characteristics of FAQs, namely short sentences, question-answer pairing, and strong domain specificity, a vector space model with special semantic processing was selected for the system, and a corresponding similarity computation algorithm was proposed. Experiments show that the system performs well on highly frequent and common questions.
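A minimal vector-space retrieval sketch in the spirit of the abstract: FAQ questions are indexed as TF-IDF vectors and the answer of the most similar question is returned. The tiny FAQ list and the cutoff are illustrative assumptions; the paper's special semantic processing is not reproduced.

```python
# Hedged sketch of vector-space FAQ retrieval with TF-IDF and cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

faq = [
    ("How do I reset my password?", "Use the 'Forgot password' link on the login page."),
    ("How can I change my email address?", "Edit it under Account Settings."),
]

vectorizer = TfidfVectorizer()
question_matrix = vectorizer.fit_transform([q for q, _ in faq])

def answer(query, min_sim=0.2):
    """Return the answer of the most similar FAQ question, if similar enough."""
    query_vec = vectorizer.transform([query])
    sims = cosine_similarity(query_vec, question_matrix)[0]
    best = sims.argmax()
    return faq[best][1] if sims[best] >= min_sim else "No matching FAQ entry."

print(answer("reset password"))
```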
Abstract: Universities produce a large number of composed documents in the teaching process, and most of them must be checked for similarity for validation. A similarity computation system is constructed for composed documents containing both images and text. First, each document is split into two parts, images and text. Then the documents are compared by computing the similarities of the images and of the text contents independently. Using Hadoop, the text contents are separated easily and quickly. Experimental results show that the proposed system is efficient and practical.
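A hedged sketch of the split-and-combine scheme: text similarity and image similarity are computed separately and then combined with weights. The Jaccard text measure, the grayscale-histogram image measure, and the equal weights are assumptions for illustration, not the paper's exact choices.

```python
# Hedged sketch: compare the text parts and image parts of two composed
# documents independently, then take a weighted combination of the scores.
import numpy as np

def text_similarity(text_a, text_b):
    tokens_a, tokens_b = set(text_a.split()), set(text_b.split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0

def image_similarity(img_a, img_b, bins=32):
    # Histogram intersection of two grayscale images (2-D uint8 arrays).
    ha, _ = np.histogram(img_a, bins=bins, range=(0, 255))
    hb, _ = np.histogram(img_b, bins=bins, range=(0, 255))
    ha = ha / max(ha.sum(), 1)
    hb = hb / max(hb.sum(), 1)
    return float(np.minimum(ha, hb).sum())

def document_similarity(text_a, imgs_a, text_b, imgs_b, w_text=0.5, w_img=0.5):
    img_sims = [image_similarity(a, b) for a, b in zip(imgs_a, imgs_b)] or [0.0]
    return w_text * text_similarity(text_a, text_b) + w_img * float(np.mean(img_sims))
```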
Funding: Supported by the Major Scientific and Technological Projects of CNPC under Grant ZD2019-183-006, partially supported by the Shandong Provincial Natural Science Foundation, China, under Grant ZR2020MF006, and partially supported by the Fundamental Research Funds for the Central Universities of China University of Petroleum (East China) under Grants 20CX05017A and 18CX02139A.
Abstract: In recent years, with the development of the social Internet of Things (IoT), all kinds of data have accumulated on the network. These data contain a great deal of social information and opinions, yet they are rarely analyzed fully, which is a major obstacle to the intelligent development of the social IoT. In this paper, we propose a sentence similarity analysis model to analyze the similarity of people's opinions on hot topics in social media and news pages. Most of these data are unstructured or semi-structured sentences, so the accuracy of sentence similarity analysis largely determines the model's performance. To improve accuracy, we propose a novel method of sentence similarity computation that extracts the syntactic and semantic information of semi-structured and unstructured sentences. We mainly consider the subjects, predicates, and objects of sentence pairs and use the Stanford Parser to classify dependency relation triples in order to calculate the syntactic and semantic similarity between two sentences. Finally, we verify the performance of the model on the Microsoft Research Paraphrase Corpus (MRPC), which consists of 4076 training sentence pairs and 1725 test sentence pairs, most of them drawn from news and social data. Extensive simulations demonstrate that our method outperforms other state-of-the-art methods in terms of the correlation coefficient and the mean deviation.
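The sketch below illustrates subject-predicate-object extraction and a simple triple-overlap score; spaCy is used here as a stand-in for the Stanford Parser, and the overlap score is an illustrative assumption rather than the paper's similarity formula.

```python
# Hedged sketch of subject-predicate-object (SPO) based sentence similarity,
# with spaCy as a stand-in dependency parser.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes this English model is installed

def spo_triple(sentence):
    """Extract a rough (subject, predicate, object) triple from a sentence."""
    doc = nlp(sentence)
    subj = pred = obj = ""
    for token in doc:
        if token.dep_ == "ROOT":
            pred = token.lemma_
        elif token.dep_ in ("nsubj", "nsubjpass"):
            subj = subj or token.lemma_
        elif token.dep_ in ("dobj", "pobj", "attr"):
            obj = obj or token.lemma_
    return subj, pred, obj

def spo_similarity(s1, s2):
    t1, t2 = spo_triple(s1), spo_triple(s2)
    # Fraction of matching, non-empty slots (subject, predicate, object).
    return sum(a == b and a != "" for a, b in zip(t1, t2)) / 3.0

print(spo_similarity("The company bought the startup.",
                     "The firm bought a startup."))
```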
Funding: Guangzhou Municipal Medicine & Health Program (Grant No. 2006-YB-177) and Guangdong Province Medicine Scientific Research Program (Grant No. A2007290).
Abstract: Objective: To investigate the feasibility of a 4D-CT reconstruction method based on the similarity principle of spatially adjacent images and a mutual information measure. Methods: A motor-driven sinusoidal motion platform made in house was used to create one-dimensional periodic motion along the longitudinal axis of the CT couch. The amplitude of the sinusoidal motion was set to ±1 cm, and its period, which was adjustable, was set to 3.5 s. Phantom objects consisting of two eggs were placed in a Styrofoam block, which in turn was placed on the motion platform; these objects simulated volumes of interest undergoing ideal periodic motion. CT data of the static phantom were acquired with a multi-slice General Electric (GE) LightSpeed 16-slice CT scanner in axial mode, and CT data of the periodically moving phantom were acquired in axial cine-mode scans. A software program was developed with VC++ and VTK to re-sort the CT data and reconstruct the 4D-CT. All CT data with the same phase were sorted by the program into the same series based on the similarity principle of spatially adjacent images and the mutual information measure among them, and 3D reconstructions of the different phase CT data were completed with the software. Results: All of the CT data were sorted accurately into different series based on the similarity principle and mutual information measures. Compared with the unsorted CT data, the motion artifacts in the 3D reconstruction of the sorted CT data were reduced significantly, and all of the sorted CT series produced a 4D-CT that reflected the characteristics of the periodically moving phantom. Conclusion: Time-resolved 4D-CT reconstruction can be implemented with any general multi-slice CT scanner based on the similarity principle of spatially adjacent images and mutual information. The 4D-CT data acquisition and reconstruction process is not restricted by the hardware or software of the CT scanner, making the method feasible and widely applicable.
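A minimal sketch of the mutual-information measure used to group slices of the same respiratory phase, estimated from a joint intensity histogram; the bin count and usage are illustrative, and the full sorting pipeline is not reproduced here.

```python
# Hedged sketch: mutual information between two equally sized grayscale CT
# slices, estimated from their joint intensity histogram. Slices with higher
# MI against their spatial neighbors are grouped into the same phase series.
import numpy as np

def mutual_information(img_a, img_b, bins=64):
    """Mutual information (in nats) between two images of equal shape."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of image A
    py = pxy.sum(axis=0, keepdims=True)   # marginal of image B
    nonzero = pxy > 0
    return float(np.sum(pxy[nonzero] * np.log(pxy[nonzero] / (px @ py)[nonzero])))
```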
Funding: Supported by the Natural Science Foundation of Shanghai (Grant No. 08ZR1408200), the Shanghai Leading Academic Discipline Project (Grant No. J50103), and the Open Project Program of the National Laboratory of Pattern Recognition.
Abstract: In this paper, we propose a parallel computing technique for content-based image retrieval (CBIR) systems. The technique is mainly intended for a single node with a multi-core processor, which differs from approaches based on cluster or network computing architectures. Because of its specific applications (such as medical image processing) and demanding hardware resource requirements, the CBIR system has been prevented from being widely used. With the increasing volume of image databases, the widespread use of multi-core processors, and the demand for retrieval accuracy and speed, a retrieval strategy based on multi-core processors is needed to make retrieval faster and more convenient than before. Experimental results demonstrate that this parallel architecture can significantly improve the performance of the retrieval system. In addition, we propose an efficient parallel technique that combines cluster and multi-core techniques, which is intended to fit the emerging trend of cloud computing.
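A hedged sketch of multi-core matching for CBIR: feature-distance computations are distributed across processor cores with multiprocessing. The random feature database and the Euclidean distance are stand-ins for the system's actual features and metric.

```python
# Hedged sketch: split the feature database into chunks and compute distances
# to the query in parallel worker processes, then merge and rank the results.
from multiprocessing import Pool
import numpy as np

FEATURE_DIM = 128

def chunk_distances(chunk, query):
    # Euclidean distance from the query to every feature vector in the chunk.
    return np.linalg.norm(chunk - query, axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    database = rng.random((10_000, FEATURE_DIM))   # stand-in feature database
    query = rng.random(FEATURE_DIM)
    chunks = np.array_split(database, 8)           # one chunk per worker core
    with Pool(processes=8) as pool:
        distances = np.concatenate(
            pool.starmap(chunk_distances, [(c, query) for c in chunks]))
    top10 = np.argsort(distances)[:10]             # indices of the best matches
    print(top10)
```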
Abstract: Sentence similarity computation plays an important role in machine question-answering systems, machine translation systems, information retrieval, and automatic abstracting systems. This article first reviews several methods for calculating similarity between sentences and then presents a new method that takes all relevant factors into consideration, including critical words, semantic information, sentence form, and sentence length. On this basis, an automatic abstracting system based on the LexRank algorithm is implemented, with improvements in both sentence weight computation and redundancy resolution. The system described in this article can handle single- or multi-document summarization in both English and Chinese. Evaluations on two corpora show that the system produces better summaries to a certain degree. We also show that the system is quite insensitive to noise in the data that may result from an imperfect topical clustering of documents. Finally, existing problems and development trends in automatic summarization technology are discussed.
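A minimal LexRank-style sketch: a cosine-similarity graph is built over TF-IDF sentence vectors and centrality scores are obtained by damped power iteration. The threshold, damping factor, and toy sentences are illustrative; the article's improvements to sentence weighting and redundancy resolution are not shown.

```python
# Hedged sketch of LexRank-style sentence ranking for extractive summarization.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def lexrank(sentences, threshold=0.1, damping=0.85, iters=100):
    tfidf = TfidfVectorizer().fit_transform(sentences)
    sim = cosine_similarity(tfidf)
    adj = (sim >= threshold).astype(float)      # similarity graph adjacency
    np.fill_diagonal(adj, 0.0)
    row_sums = adj.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    transition = adj / row_sums                 # row-stochastic transitions
    n = len(sentences)
    scores = np.full(n, 1.0 / n)
    for _ in range(iters):                      # damped power iteration
        scores = (1 - damping) / n + damping * transition.T @ scores
    return scores

sentences = [
    "The new model improves sentence similarity computation.",
    "Sentence similarity is important for summarization.",
    "The weather was pleasant yesterday.",
]
print(lexrank(sentences).round(3))  # higher score = more central sentence
```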
Funding: Supported by the Youth Science Foundation of Sichuan Province (Nos. 22NSFSC3816 and 2022NSFSC1231), the General Project of the National Natural Science Foundation of China (Nos. 12075039 and 41874121), and the Key Project of the National Natural Science Foundation of China (No. U19A2086).
Abstract: Owing to constraints on the fabrication of γ-ray coding plates with many pixels, few studies have been carried out on γ-ray computational ghost imaging; the development of coding plates with fewer pixels is therefore essential. Based on the regional similarity between Hadamard sub-coding plates, this study presents an optimization method to reduce the number of pixels of Hadamard coding plates. First, a moving-distance matrix was obtained to describe the regional similarity quantitatively. Second, based on this matrix, two ant colony optimization arrangement algorithms were used to maximize the reuse of pixels in the regions of similarity and obtain new compressed coding plates. At full sampling, the two algorithms improved the pixel utilization of the coding plate, with compression ratios of 54.2% and 58.9%, respectively. In addition, three undersampled sequences (the Haar, Russian dolls, and cake-cutting sequences) with different sampling rates were tested and discussed. At every sampling rate, our method reduced the number of pixels of all three sequences, especially the Russian dolls and cake-cutting sequences. Therefore, our method can reduce the number of pixels, the manufacturing cost, and the difficulty of fabricating the coding plate, which is beneficial for the implementation and application of γ-ray computational ghost imaging.
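For context, the sketch below shows the computational ghost imaging reconstruction that such coding plates enable: correlating bucket (single-pixel) measurements with the known patterns. The random binary patterns and the simple correlation estimator are illustrative assumptions, not the paper's compressed Hadamard plates.

```python
# Hedged sketch of computational ghost imaging (CGI) reconstruction:
# G(x, y) = <B * P(x, y)> - <B><P(x, y)>, where B are bucket measurements
# and P are the known coding/illumination patterns.
import numpy as np

rng = np.random.default_rng(1)
size = 16
obj = np.zeros((size, size))
obj[4:12, 6:10] = 1.0                                 # toy object to be imaged

n_patterns = 2048
patterns = rng.integers(0, 2, size=(n_patterns, size, size)).astype(float)
bucket = np.einsum("nij,ij->n", patterns, obj)        # single-pixel measurements

recon = (np.einsum("n,nij->ij", bucket, patterns) / n_patterns
         - bucket.mean() * patterns.mean(axis=0))     # correlation estimate
print(np.corrcoef(recon.ravel(), obj.ravel())[0, 1])  # similarity to the object
```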
Abstract: The large finite element global stiffness matrix is an algebraic, discrete, even-order differential operator with zero row sums. Direct application of the practically convenient, readily applied Gershgorin eigenvalue bounding theorem to this matrix inherently fails to foresee its positive definiteness, predictably and routinely failing to produce a nontrivial lower bound on the least eigenvalue of this matrix, which theory assures to be positive definite. Considered here are practical methods for producing an optimal similarity transformation for the finite element global stiffness matrix, after which nontrivial, realistic lower bounds on the least eigenvalue can be located and then further improved. The technique is restricted here to the common case of a global stiffness matrix having only non-positive off-diagonal entries. For such a matrix, application of the Gershgorin bounding method may be carried out by a mere matrix-vector multiplication.
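A minimal sketch of the matrix-vector form of the bound: for a symmetric matrix K with non-positive off-diagonal entries and any positive vector v, applying Gershgorin's theorem to the similarity transform diag(v)^{-1} K diag(v) gives lambda_min(K) >= min_i (Kv)_i / v_i, i.e., a single matrix-vector multiplication. The 1-D stiffness-like matrix and the particular choice of v below are illustrative assumptions.

```python
# Hedged sketch of the diagonal-similarity Gershgorin lower bound:
# lambda_min(K) >= min_i (K v)_i / v_i for positive v and non-positive
# off-diagonal K, obtained from one matrix-vector multiplication.
import numpy as np

def gershgorin_lower_bound(K, v):
    """Lower bound on the least eigenvalue of K (non-positive off-diagonals)."""
    return float(np.min((K @ v) / v))

n = 50
# Stand-in 1-D stiffness-like matrix: tridiagonal [-1, 2, -1] with fixed ends.
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

v_plain = np.ones(n)                                   # plain Gershgorin (row sums)
v_shaped = np.sin(np.pi * np.arange(1, n + 1) / (n + 1))  # positive shaping vector

print(gershgorin_lower_bound(K, v_plain))    # trivial bound (0 from interior rows)
print(gershgorin_lower_bound(K, v_shaped))   # nontrivial, positive lower bound
print(np.linalg.eigvalsh(K).min())           # exact least eigenvalue, for comparison
```

In this toy case the shaping vector happens to be the lowest eigenvector, so the improved bound is exact; in general any positive v yields a valid, often much tighter, bound than the plain row-sum estimate.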
Abstract: One of the interesting topics in grid computing systems is resource discovery. After the failure of a resource in a chain of resources assembled for a specific task in a grid environment, discovering and finding a new resource that reconstructs the chain is an important problem. In this study, by defining a new agent, called the task agent, and by proposing an algorithm, we increase fault tolerance against the probable failure of a resource in the resource chain.
Abstract: In this paper, a method for detecting illegal access to cloud infrastructure is proposed. The detection process is based on a collaborative filtering algorithm constructed on the cloud model. First, the normal behavior of the user is captured in the form of a cloud model; these models are then compared with each other using the cosine similarity method, and deviations from normal behavior are evaluated by applying the collaborative filtering method. If the deviation value is above the threshold, the user who gained access to the system is judged to be illegitimate; otherwise, the user is treated as a genuine user.
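A hedged sketch of the threshold decision described above: a session's behavior vector is compared with the user's stored normal-behavior profile by cosine similarity and flagged when the deviation exceeds a threshold. The feature vectors and the 0.7 threshold are illustrative assumptions; the cloud-model construction and the collaborative filtering step are not reproduced here.

```python
# Hedged sketch: flag a session as illegal access when its behavior vector
# deviates too far (in cosine terms) from the user's normal-behavior profile.
import numpy as np

def cosine_similarity(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def is_illegal_access(profile, session, threshold=0.7):
    """True if the session's similarity to the profile falls below threshold."""
    return cosine_similarity(profile, session) < threshold

normal_profile = np.array([0.9, 0.1, 0.3, 0.0])   # e.g., login time, API mix, volume
session_vector = np.array([0.1, 0.9, 0.8, 0.7])
print(is_illegal_access(normal_profile, session_vector))  # True: flagged as illegal
```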
Abstract: A fundamental open question in the analysis of social networks is to understand the interplay between similarity and group social ties. In general, two groups are similar for two distinct reasons: first, they gradually change their behaviors to match the other group because of social influence; second, they tend to merge into one group because of already similar behaviors, a process sociologists often term selection. It is important to understand why two groups merge and what leads to high similarity among members of a group: influence or selection. In this paper, techniques for identifying and modeling the interactions between social influence and selection for different groups are developed. Similarities are computed in three phases, as groups come into being, before, and after, according to the number of common edits in Wikipedia. Experimental results show that selection plays the more important role in the merging of two groups.
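One plausible way to compute the group similarities described above is the Jaccard overlap of the article sets two groups edit, measured before and after merging; the toy edit sets and phase labels are illustrative assumptions, not the paper's phase definitions.

```python
# Hedged sketch: Jaccard similarity of the article sets edited by two groups,
# compared across phases to separate selection from social influence.
def jaccard(edits_a, edits_b):
    union = edits_a | edits_b
    return len(edits_a & edits_b) / len(union) if union else 0.0

# Articles edited by each group in two phases (assumed data).
group_a = {"before": {"Physics", "Chemistry", "Biology"},
           "after":  {"Physics", "Chemistry", "Biology", "Geology"}}
group_b = {"before": {"History", "Chemistry"},
           "after":  {"Physics", "Chemistry", "Geology"}}

for phase in ("before", "after"):
    print(phase, round(jaccard(group_a[phase], group_b[phase]), 3))
# High similarity already before the merge points to selection; a rise in
# similarity only after the merge points to social influence.
```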