As social networks become increasingly complex, contemporary fake news often includes textual descriptions of events accompanied by corresponding images or videos. Fake news in multiple modalities is more likely to create a misleading perception among users. While early research primarily focused on text-based features for fake news detection, there has been relatively limited exploration of learning shared representations in multimodal (text and visual) contexts. To address these limitations, this paper introduces a multimodal model for detecting fake news that relies on similarity reasoning and adversarial networks. The model employs Bidirectional Encoder Representations from Transformers (BERT) and a Text Convolutional Neural Network (Text-CNN) to extract textual features, while utilizing the pre-trained Visual Geometry Group 19-layer network (VGG-19) to extract visual features. Subsequently, the model establishes similarity representations between the textual features extracted by Text-CNN and the visual features through similarity learning and reasoning. Finally, these features are fused to enhance the accuracy of fake news detection, and adversarial networks are employed to investigate the relationship between fake news and events. The proposed model is validated on publicly available multimodal datasets from Weibo and Twitter. Experimental results demonstrate that the proposed approach achieves superior performance on Twitter, with an accuracy of 86%, surpassing traditional unimodal models and existing multimodal models. On the Weibo dataset, the model likewise outperforms the benchmark models across multiple metrics.
The application of similarity reasoning and adversarial networks in multimodal fake news detection significantly enhances detection effectiveness. However, the current research is limited to the fusion of only text and image modalities. Future research should aim to further integrate features from additional modalities to comprehensively represent the multifaceted information of fake news.
Existing multi-source contour matching studies have focused on matching methods that consider topological relations and on similarity measurement based on spatial Euclidean distance, but they largely neglect contour geometric features, which may lead to mismatching at map boundaries and in areas with dense contours or extreme terrain changes. In light of this, a coarse-to-fine matching strategy based on contour geometric features is put forward. The proposed strategy can be described as follows. Firstly, the point sequence is converted to a feature sequence according to a feature descriptive function based on curvature and the angle of the normal vector. Then the level of similarity among multi-source contours is calculated using the longest common subsequence solution. Accordingly, identical contours can be matched based on the calculated results. In the experiments, the reliability and efficiency of the matching method are verified using simulated and real datasets respectively. The results show that the proposed contour matching strategy has high matching precision and good applicability.
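The coarse-to-fine idea above hinges on the longest common subsequence (LCS) of two contour feature sequences. A minimal sketch of that similarity computation follows; the feature values and any quantization of the feature-descriptive function are assumptions, since the abstract does not specify them.

```python
def lcs_length(a, b):
    """Classic O(len(a) * len(b)) dynamic-programming LCS length."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

def contour_similarity(seq_a, seq_b):
    """LCS length normalized by the longer sequence; result lies in [0, 1]."""
    if not seq_a or not seq_b:
        return 0.0
    return lcs_length(seq_a, seq_b) / max(len(seq_a), len(seq_b))
```

For example, two quantized feature sequences [1, 2, 3, 4] and [1, 2, 4] share the subsequence [1, 2, 4], giving a similarity of 0.75; contours whose similarity exceeds a threshold would then be taken as identical.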
Similarity measurement is one of the key operations for retrieving "desired" images from an image database. As a well-known psychological similarity measure, the Feature Contrast (FC) model is defined as a linear combination of both common and distinct features. In this paper, an adaptive feature contrast (AdaFC) model is proposed to measure similarity between satellite images for image retrieval. In the AdaFC, an adaptive function is used to model the variable role of distinct features in similarity measurement. Specifically, given some distinct features in a satellite image, e.g., a COAST image, they might play a significant role when the image is compared with an image with different semantics, e.g., a SEA image, and might be trivial when it is compared with a third image with the same semantics, e.g., another COAST image. Experimental results on satellite images show that the proposed model can consistently improve similarity retrieval effectiveness for satellite images containing multiple geo-objects, for example COAST images.
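The underlying feature contrast model, and one way an "adaptive" weighting of distinct features might look, can be sketched as follows. This is a hedged illustration: the Tversky-style contrast is the standard formulation, but the adaptive function shown is an assumption, as the abstract does not give the paper's exact form.

```python
def feature_contrast(a, b, theta=1.0, alpha=0.5, beta=0.5):
    """Tversky-style feature contrast: common features raise similarity,
    distinct features lower it."""
    a, b = set(a), set(b)
    return theta * len(a & b) - alpha * len(a - b) - beta * len(b - a)

def adaptive_feature_contrast(a, b, theta=1.0):
    """Illustrative adaptive variant: the more two images share, the less
    their distinct features are penalized (an assumed adaptive function)."""
    a, b = set(a), set(b)
    common = len(a & b)
    total = len(a | b) or 1
    weight = 1.0 - common / total  # distinct features matter less when overlap is high
    return theta * common - weight * (len(a - b) + len(b - a))
```

Under this sketch, two COAST images sharing most features incur almost no penalty for their few distinct features, while a COAST/SEA pair with little overlap is penalized nearly in full.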
Content-based medical image retrieval (CBMIR) is a technique for retrieving medical images based on automatically derived image features. There are many applications of CBMIR, such as teaching, research, diagnosis and electronic patient records. Several methods have been applied to enhance the retrieval performance of CBMIR systems. Developing new and effective similarity measures and feature fusion methods are two of the most powerful and effective strategies for improving these systems. This study proposes the relative difference-based similarity measure (RDBSM) for CBMIR. The new measure was first used in the similarity calculation stage of CBMIR using an unweighted fusion method of traditional color and texture features. Furthermore, the study also proposes a weighted fusion method for medical image features extracted using pre-trained convolutional neural network (CNN) models. The proposed RDBSM outperformed the standard well-known similarity and distance measures on two popular medical image datasets, Kvasir and PH2, in terms of recall and precision retrieval measures. The effectiveness and quality of the proposed similarity measure are also proved using a significance test and statistical confidence bounds.
Facial landmarks can provide valuable information for expression-related tasks. However, most approaches only use landmarks for segmentation preprocessing or directly input them into the neural network through fully connected layers. Such simple combination not only fails to pass the spatial information to the network, but also increases the amount of calculation. The method proposed in this paper aims to integrate a facial landmarks-driven representation into the triplet network. The spatial information provided by landmarks is introduced into the feature extraction process, so that the model can better capture location relationships. In addition, coordinate information is also integrated into the triplet loss calculation to further enhance similarity prediction. Specifically, for each image, the coordinates of 68 landmarks are detected, and then a region attention map based on these landmarks is generated. The feature map output by the shallow convolutional layer is multiplied with the attention map to correct the feature activation, so as to strengthen key regions and weaken unimportant regions. Finally, the optimized embedding output can be further used for downstream tasks. The three embeddings output by the network for three images can be regarded as a triplet representation for similarity computation. The effectiveness of this optimized feature extraction is verified on the CK+ dataset. It is then applied to facial expression similarity tasks. The results on the facial expression comparison (FEC) dataset show that the accuracy is significantly improved after the landmark information is introduced.
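The landmark-driven attention step described above, multiplying a shallow feature map element-wise by a region attention map, can be sketched minimally. This is illustrative Python on plain nested lists; real implementations operate on tensors and the attention values come from the detected 68 landmarks.

```python
def apply_region_attention(feature_map, attention_map):
    """Element-wise reweighting of a feature map by a landmark-driven
    attention map: key regions are strengthened (attention near 1),
    unimportant regions suppressed (attention near 0)."""
    return [[f * a for f, a in zip(frow, arow)]
            for frow, arow in zip(feature_map, attention_map)]
```

For example, an attention map that is 1 on landmark regions and 0 elsewhere zeroes out activations outside the regions of interest before the deeper layers see them.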
The existence of shadows leads to the degradation of image quality and the loss of ground object information. Shadow removal is therefore an essential research topic in the image processing field. The biggest challenge of shadow removal is how to restore the content of shadow areas correctly while removing the shadow from the image. A paired-region shadow removal approach based on multiple features is proposed, in which shadow removal is only performed on related sunlit areas. The feature distance between regions is calculated to find the optimal paired regions, comprehensively considering multiple features (texture, gradient, etc.). Images in different scenes are chosen for experiments, evaluated with the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) indexes. The results are compared with six existing methods by visual and quantitative assessments, which verifies that the proposed method shows an excellent shadow removal effect; the brightness and color of the removed shadow area and the surrounding non-shadow area can be naturally fused.
Venusian coronae are large(60-2600 km diameter)tectono-magmatic features characterized by quasi-circular graben-fissure systems and topographic features such as a central dome,central depression,circular rim or circular
Multi-label learning deals with data associated with a set of labels simultaneously. Dimensionality reduction is an important but challenging task in multi-label learning. Feature selection is an efficient technique for dimensionality reduction that searches for an optimal feature subset preserving the most relevant information. In this paper, we propose an effective feature evaluation criterion for multi-label feature selection, called the neighborhood relationship preserving score. This criterion is inspired by similarity preservation, which is widely used in single-label feature selection. It evaluates each feature subset by measuring its capability to preserve the neighborhood relationship among samples. Unlike similarity preservation, we address the order of sample similarities, which expresses the neighborhood relationship among samples well, not just the pairwise sample similarity. With this criterion, we also design one ranking algorithm and one greedy algorithm for the feature selection problem. The proposed algorithms are validated on six publicly available data sets from a machine learning repository. Experimental results demonstrate their superiority over the compared state-of-the-art methods.
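The criterion's core idea, scoring a feature subset by how well it preserves the ordering of sample similarities, can be sketched as follows. This is an illustrative stand-in, not the paper's exact score: it simply checks whether each sample's nearest-neighbour ordering is unchanged under the feature subset.

```python
import math

def euclid(p, q):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def similarity_order(X, i):
    """Indices of all other samples sorted by closeness to sample i."""
    others = [j for j in range(len(X)) if j != i]
    return sorted(others, key=lambda j: euclid(X[i], X[j]))

def order_preservation_score(X_full, X_subset):
    """Fraction of samples whose neighbour ordering is identical under the
    feature subset -- a crude proxy for preserving the order of sample
    similarities rather than their exact pairwise values."""
    hits = sum(similarity_order(X_full, i) == similarity_order(X_subset, i)
               for i in range(len(X_full)))
    return hits / len(X_full)
```

A subset that keeps every sample's neighbour ranking intact scores 1.0 even if the distances themselves shrink, which is exactly the distinction the criterion draws against plain pairwise-similarity preservation.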
A non-local denoising (NLD) algorithm for point-sampled surfaces (PSSs) is presented based on similarities, including the geometry intensity and features of sample points. Using a trilateral filtering operator, the differential signal of each sample point is determined and called the "geometry intensity". Based on covariance analysis, a regular grid of geometry intensity is constructed for each sample point, and the geometry-intensity similarity of two points is measured according to their grids. Based on mean shift clustering, the PSSs are clustered in terms of local geometry-feature similarity. The smoothed geometry intensity, i.e., the offset distance, of each sample point is estimated according to the two similarities. Using the resulting intensity, the noise component is finally removed from the PSSs by adjusting the position of each sample point along its own normal direction. Experimental results demonstrate that the algorithm is robust and can produce a more accurate denoising result with better feature preservation.
To avoid the curse of dimensionality, text categorization (TC) algorithms based on machine learning (ML) have to use a feature selection (FS) method to reduce the dimensionality of the feature space. Although widely used, the FS process generally causes information loss and thus has considerable side effects on the overall performance of TC algorithms. On the basis of the sparsity characteristic of text vectors, a new TC algorithm based on lazy feature selection (LFS) is presented. As a new type of embedded feature selection approach, the LFS method can greatly reduce the dimension of features without any information loss, improving both the efficiency and the performance of the algorithm. Experiments show the new algorithm can simultaneously achieve much higher performance and efficiency than several classical TC algorithms.
In this paper, we present a novel and efficient scheme for extracting, indexing and retrieving color images. Our motivation was to reduce the space overhead of partition-based approaches by taking advantage of the fact that only a relatively low number of distinct values of a particular visual feature is present in most images. To extract color features and build indices into our image database, we take into consideration factors such as human color perception and perceptual range, and the image is partitioned into a set of regions using a simple classifying scheme. The compact color feature vector and the spatial color histogram, which are extracted from the segmented image region, are used to represent the color and spatial information in the image. We have also developed region-based distance measures to compare the similarity of two images. Extensive tests on a large image collection were conducted to demonstrate the effectiveness of the proposed approach.
The essence of feature matching technology lies in how to measure the similarity of spatial entities. Among all possible similarity measures, the shape similarity measure is one of the most important because its necessary parameters are easy to collect and it matches human intuition well. In this paper a new shape similarity measure for linear entities, based on the differences of direction change along each line, is presented and its effectiveness is illustrated.
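A minimal sketch of a direction-change-based shape similarity for polylines, in the spirit of the measure described above; the normalization 1/(1 + mean difference) is an illustrative choice, not necessarily the paper's formula.

```python
import math

def direction_changes(points):
    """Signed change of heading at each interior vertex of a polyline."""
    headings = [math.atan2(q[1] - p[1], q[0] - p[0])
                for p, q in zip(points, points[1:])]
    return [b - a for a, b in zip(headings, headings[1:])]

def shape_similarity(line_a, line_b):
    """1 / (1 + mean absolute difference of direction changes); 1.0 means
    the direction-change profiles agree exactly."""
    da, db = direction_changes(line_a), direction_changes(line_b)
    n = min(len(da), len(db))
    if n == 0:
        return 0.0
    diff = sum(abs(x - y) for x, y in zip(da, db)) / n
    return 1.0 / (1.0 + diff)
```

Because the measure depends only on heading changes, it is invariant to translation of the entities, which fits the intuition that two lines of the same shape should match regardless of where they sit on the map.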
With the rapid development of the Internet, the types of webpages are more abundant than in previous decades. However, people are facing increasingly significant network security risks and enormous losses caused by phishing webpages, which imitate the interface of real webpages and deceive victims. To better identify and distinguish phishing webpages, a visual feature extraction method and a visual similarity algorithm are proposed. First, the visual feature extraction method improves the Vision-based Page Segmentation (VIPS) algorithm to extract visual blocks and calculates their signatures by perceptual hash technology. Second, the visual similarity algorithm establishes a one-to-one correspondence based on the visual blocks' coordinates and thresholds. Then weights are assigned according to the tree structure, and the similarity of the visual blocks is calculated based on the Hamming distance of their visual features. Further, the visual similarity of webpages is generated by integrating the similarity and weight of the different visual blocks. Finally, multiple pairs of phishing webpages and legitimate webpages are evaluated to verify the feasibility of the algorithm. The experimental results achieve excellent performance and demonstrate that the method can achieve 94% accuracy.
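The block-signature step, perceptual hashing followed by Hamming-distance comparison, can be sketched with a toy average hash. Real perceptual hashes first resize the block and typically apply a DCT; this simplified version only illustrates the hash-and-compare idea.

```python
def average_hash(pixels):
    """Toy average hash over a flat list of grayscale pixels:
    bit i is 1 if pixel i is above the mean brightness."""
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]

def hamming(h1, h2):
    """Number of differing bits between two equal-length hashes."""
    return sum(a != b for a, b in zip(h1, h2))

def block_similarity(h1, h2):
    """Similarity in [0, 1]: 1 minus the normalized Hamming distance."""
    return 1.0 - hamming(h1, h2) / len(h1)
```

A phishing block that pixel-for-pixel imitates a legitimate one yields a near-identical hash and a similarity close to 1, which is what lets the algorithm match corresponding blocks before weighting them by the tree structure.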
During product family design, it is necessary to reduce the variety of components and share common components among many products. The major benefits are lessened design effort and reduced costs. Therefore, this paper presents an approach to standardize the components of a product family. Form feature modeling for components is discussed. Based on similarity analysis, a step-by-step method to standardize the feature architectures of components is described. The algorithms for standardization are identified as well. A case of standardizing the components of an auto-body family is used to demonstrate the validity of this approach.
To ensure that the large-scale application of photovoltaic power generation does not affect the stability of the grid, accurate photovoltaic (PV) power generation forecasts are essential. A short-term PV power generation forecast method using the combination of K-means++, grey relational analysis (GRA) and support vector regression (SVR) based on feature selection (Hybrid K-means-GRA-SVR, HKGSVR) is proposed. The historical power data were clustered by the multi-index K-means++ algorithm and divided into ideal and non-ideal weather. The GRA algorithm was used to match the similar day and the nearest-neighbor similar day of the prediction day, and appropriate input features were selected for different weather types to train the SVR model. Under ideal weather, the average values of MAE, RMSE and R2 were 0.8101, 0.9608 kW and 99.66%, respectively, and the method reduced the average training time by 77.27% compared with the standard SVR model. Under non-ideal weather conditions, the average values of MAE, RMSE and R2 were 1.8337, 2.1379 kW and 98.47%, respectively, and the method reduced the average training time of the standard SVR model by 98.07%. The experimental results show that the prediction accuracy of the proposed model is significantly improved compared with the other five models, verifying the effectiveness of the method.
In recent years, many types of semantic similarity measures have been proposed to measure the similarity between two concepts. It is necessary to define the differences between the measures, their performance, and their evaluations. The major contribution of this paper is to choose, among different similarity measures, the best measure, i.e., the one that gives good results with a low error rate. The experiment was done on a taxonomy built to measure the semantic distance between two concepts in the health domain, which are represented as nodes in the taxonomy. Similarity measure methods were evaluated relative to human experts' ratings. The experiment was applied to the ICD10 taxonomy to determine the similarity value between two concepts. The similarity between 30 pairs of health-domain concepts was evaluated using different types of semantic similarity equations. The experimental results discussed in this paper show that the Hoa A. Nguyen and Hisham Al-Mubaid measure achieved a high matching score according to the experts' judgment.
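Path-based measures of the kind evaluated here typically score two taxonomy concepts by the length of the shortest path between their nodes. A minimal sketch of the simple 1/(1 + path length) measure follows; this is not the Al-Mubaid measure itself, whose exact formula is not given in the abstract, and the tiny taxonomy in the usage example is purely illustrative.

```python
from collections import deque

def shortest_path_len(edges, a, b):
    """BFS shortest path (in edges) between two concepts in a taxonomy,
    treating parent-child links as undirected."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, d = queue.popleft()
        if node == b:
            return d
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, d + 1))
    return None  # no path: concepts in disconnected taxonomies

def path_similarity(edges, a, b):
    """Classic path-length similarity: 1 / (1 + shortest path)."""
    d = shortest_path_len(edges, a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)
```

With edges [("disease", "infection"), ("disease", "cancer")], the two siblings are two edges apart, so their similarity is 1/3, while a concept compared with itself scores 1.0.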
Background: Image matching is crucial in numerous computer vision tasks such as 3D reconstruction and simultaneous visual localization and mapping. The accuracy of the matching significantly impacts subsequent studies. Because of their local similarity, when image pairs contain comparable patterns but feature pairs are positioned differently, incorrect recognition can occur if global motion consistency is disregarded. Methods: This study proposes an image-matching filtering algorithm based on global motion consistency. It can be used as a subsequent matching filter for the initial matching results generated by other matching algorithms based on the principle of motion smoothness. A particular matching algorithm can first be used to perform the initial matching; then, the rotation and movement information of the global feature vectors is combined to effectively identify outlier matches. The principle is that if the matching result is accurate, the feature vectors formed by any matched points should have similar rotation angles and moving distances. Thus, global motion direction and global motion distance consistencies are used to reject outliers caused by similar patterns in different locations. Results: Four datasets were used to test the effectiveness of the proposed method. Three datasets with similar patterns in different locations were used to test the results for similar images that could easily be incorrectly matched by other algorithms, and one commonly used dataset was used to test the results for the general image-matching problem. The experimental results suggest that the proposed method is more accurate than other state-of-the-art algorithms in identifying mismatches in the initial matching set. Conclusions: The proposed outlier rejection matching method can significantly improve the matching accuracy for similar images with locally similar feature pairs in different locations and can provide more accurate matching results for subsequent computer vision tasks.
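The consistency test described in the Methods can be sketched as follows: compute the rotation angle and length of each match's displacement vector, then reject matches that deviate from the global motion. Using the median as the global estimate, and the tolerance values, are illustrative assumptions.

```python
import math

def motion_vector(p, q):
    """Rotation angle and length of the displacement from matched point p to q."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    return math.atan2(dy, dx), math.hypot(dx, dy)

def reject_outliers(matches, angle_tol=0.3, dist_tol=5.0):
    """Keep matches whose displacement angle and length are close to the
    median over all matches -- a simple reading of global motion direction
    and distance consistency."""
    angles, dists = zip(*(motion_vector(p, q) for p, q in matches))
    med_a = sorted(angles)[len(angles) // 2]
    med_d = sorted(dists)[len(dists) // 2]
    return [m for m, a, d in zip(matches, angles, dists)
            if abs(a - med_a) <= angle_tol and abs(d - med_d) <= dist_tol]
```

A match produced by a repeated pattern in a different part of the image has a displacement pointing the wrong way, so its angle (or length) falls outside the tolerance and the match is dropped even though its local descriptors agree.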
Funding: National Natural Science Foundation of China (No. 62302540; https://www.nsfc.gov.cn/); Open Foundation of Henan Key Laboratory of Cyberspace Situation Awareness (No. HNTS2022020; http://xt.hnkjt.gov.cn/data/pingtai/); Natural Science Foundation of Henan Province Youth Science Fund Project (No. 232300420422; https://kjt.henan.gov.cn/2022/09-02/2599082.html); Natural Science Foundation of Zhongyuan University of Technology (No. K2023QN018; https://www.zut.edu.cn/).
Funding: National Science Foundation of China (Nos. 41801388, 41901397).
Funding: Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, Saudi Arabia, under Grant No. (G:146-830-1441).
Funding: Supported by the National Natural Science Foundation of China (Nos. 41971356, 41701446) and the Open Fund of the Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Natural Resources (No. KF-2022-07-001).
文摘The existence of shadow leads to the degradation of the image qualities and the defect of ground object information.Shadow removal is therefore an essential research topic in image processing filed.The biggest challenge of shadow removal is how to restore the content of shadow areas correctly while removing the shadow in the image.Paired regions for shadow removal approach based on multi-features is proposed, in which shadow removal is only performed on related sunlit areas.Feature distance between regions is calculated to find the optimal paired regions with considering of multi-features(texture, gradient feature, etc.) comprehensively.Images in different scenes with peak signal-to-noise ratio(PSNR) and structural similarity(SSIM) evaluation indexes are chosen for experiments.The results are shown with six existing comparison methods by visual and quantitative assessments, which verified that the proposed method shows excellent shadow removal effect, the brightness, color of the removed shadow area, and the surrounding non-shadow area can be naturally fused.
Abstract: Venusian coronae are large (60-2600 km diameter) tectono-magmatic features characterized by quasi-circular graben-fissure systems and topographic features such as a central dome, central depression, circular rim, or circular
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 61379049, 61772120).
Abstract: Multi-label learning deals with data associated with a set of labels simultaneously. Dimensionality reduction is an important but challenging task in multi-label learning. Feature selection is an efficient dimensionality reduction technique that searches for an optimal feature subset preserving the most relevant information. In this paper, we propose an effective feature evaluation criterion for multi-label feature selection, called the neighborhood relationship preserving score. This criterion is inspired by similarity preservation, which is widely used in single-label feature selection. It evaluates each feature subset by measuring its capability to preserve the neighborhood relationships among samples. Unlike similarity preservation, we address the order of sample similarities, which expresses the neighborhood relationships among samples better than pairwise sample similarity alone. With this criterion, we also design one ranking algorithm and one greedy algorithm for the feature selection problem. The proposed algorithms are validated on six publicly available data sets from a machine learning repository. Experimental results demonstrate their superiority over the compared state-of-the-art methods.
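The idea of scoring a feature subset by how well it preserves the *order* of sample similarities can be illustrated with a k-nearest-neighbor overlap: rank neighbors in the full feature space and in the subset space, and measure how much the top-k sets agree. This is a hedged sketch of the concept, not the paper's exact criterion (the function name and the overlap-based score are assumptions).

```python
import numpy as np

def similarity_order(X):
    """For each sample, indices of the other samples ranked by increasing
    Euclidean distance (i.e., decreasing similarity)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)   # exclude each sample from its own ranking
    return np.argsort(d, axis=1)

def neighborhood_preserving_score(X, feature_subset, k=3):
    """Mean fraction of each sample's k nearest neighbors (full space)
    that are still among its k nearest neighbors in the subset space."""
    full = similarity_order(X)[:, :k]
    sub = similarity_order(X[:, feature_subset])[:, :k]
    overlap = [len(set(full[i]) & set(sub[i])) / k for i in range(len(X))]
    return float(np.mean(overlap))
```

A greedy selector would repeatedly add the feature that maximizes this score; a ranking selector would score each feature individually and sort.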
Funding: Supported by the Hi-Tech Research and Development Program (863) of China (Nos. 2007AA01Z311 and 2007AA04Z1A5) and the Research Fund for the Doctoral Program of Higher Education of China (No. 20060335114).
Abstract: A non-local denoising (NLD) algorithm for point-sampled surfaces (PSSs) is presented based on similarities, including the geometry intensity and features of sample points. Using a trilateral filtering operator, the differential signal of each sample point is determined and called the "geometry intensity". Based on covariance analysis, a regular grid of geometry intensity is constructed for each sample point, and the geometry-intensity similarity of two points is measured according to their grids. Based on mean-shift clustering, the PSSs are clustered in terms of local geometry-feature similarity. The smoothed geometry intensity, i.e., the offset distance, of each sample point is estimated according to the two similarities. Using the resulting intensity, the noise component is finally removed from the PSSs by adjusting the position of each sample point along its own normal direction. Experimental results demonstrate that the algorithm is robust and produces a more accurate denoising result with better feature preservation.
Abstract: To avoid the curse of dimensionality, text categorization (TC) algorithms based on machine learning (ML) have to use a feature selection (FS) method to reduce the dimensionality of the feature space. Although widely used, the FS process generally causes information loss and thus has considerable side effects on the overall performance of TC algorithms. Based on the sparsity of text vectors, a new TC algorithm based on lazy feature selection (LFS) is presented. As a new type of embedded feature selection approach, the LFS method can greatly reduce the feature dimensionality without any information loss, improving both the efficiency and performance of the algorithm. Experiments show that the new algorithm simultaneously achieves much higher performance and efficiency than several classical TC algorithms.
Abstract: In this paper, we present a novel and efficient scheme for extracting, indexing, and retrieving color images. Our motivation was to reduce the space overhead of partition-based approaches by exploiting the fact that only a relatively small number of distinct values of a particular visual feature is present in most images. To extract color features and build indices into our image database, we take into consideration factors such as human color perception and perceptual range, and the image is partitioned into a set of regions using a simple classifying scheme. The compact color feature vector and the spatial color histogram extracted from each segmented image region are used to represent the color and spatial information in the image. We have also developed region-based distance measures to compare the similarity of two images. Extensive tests on a large image collection were conducted to demonstrate the effectiveness of the proposed approach.
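A spatial color histogram combined with a region-based distance can be sketched as follows: quantize colors coarsely, build one normalized histogram per cell of a spatial grid, and compare images by histogram intersection. This is an illustrative toy (the bin counts, grid size, and intersection measure are assumptions), not the paper's perceptually tuned scheme.

```python
import numpy as np

def spatial_color_histogram(image, bins=4, grid=2):
    """Quantize an (H, W, 3) image to `bins` levels per channel and build
    one normalized histogram per cell of a grid x grid spatial partition."""
    image = np.asarray(image)
    h, w, _ = image.shape
    q = (image.astype(int) * bins // 256)                 # coarse quantization
    codes = q[..., 0] * bins * bins + q[..., 1] * bins + q[..., 2]
    hists = []
    for gy in range(grid):
        for gx in range(grid):
            cell = codes[gy * h // grid:(gy + 1) * h // grid,
                         gx * w // grid:(gx + 1) * w // grid]
            hist = np.bincount(cell.ravel(), minlength=bins ** 3).astype(float)
            hists.append(hist / max(cell.size, 1))        # normalize per cell
    return np.concatenate(hists)

def histogram_intersection(h1, h2):
    """Similarity in [0, grid*grid]; higher means more similar."""
    return float(np.minimum(h1, h2).sum())
```

Because each cell's histogram is compared in place, two images with the same global colors but different spatial layouts score lower than identical images, which is the point of adding the spatial partition.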
Abstract: The essence of feature matching technology lies in how to measure the similarity of spatial entities. Among all possible similarity measures, the shape similarity measure is one of the most important because its parameters are easy to collect and it matches human intuition well. In this paper, a new shape similarity measure for linear entities, based on the differences in direction change along each line, is presented, and its effectiveness is illustrated.
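A direction-change-based measure can be illustrated by computing the turning angle at each interior vertex of a polyline and comparing the two angle sequences. The mapping of mean angle difference to a (0, 1] similarity below is an assumption for illustration; the paper's exact formula is not reproduced.

```python
import numpy as np

def direction_changes(line):
    """Turning angles (radians, wrapped to (-pi, pi]) at each interior
    vertex of a polyline given as a list of (x, y) points."""
    pts = np.asarray(line, dtype=float)
    vecs = np.diff(pts, axis=0)
    angles = np.arctan2(vecs[:, 1], vecs[:, 0])
    turns = np.diff(angles)
    return (turns + np.pi) % (2 * np.pi) - np.pi

def shape_similarity(line_a, line_b):
    """Similarity in (0, 1]: 1 when the direction-change sequences match.
    Assumes both polylines have the same number of vertices."""
    da, db = direction_changes(line_a), direction_changes(line_b)
    return float(1.0 / (1.0 + np.abs(da - db).mean()))
```

Note that the measure is invariant to translation and rotation of the whole line, since only relative direction changes enter the comparison, which matches the intuition that shape should not depend on placement.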
Funding: This work is supported by the National Key R&D Program of China (2016QY05X1000) and the National Natural Science Foundation of China (201561402137).
Abstract: With the rapid development of the Internet, the types of webpages are more abundant than in previous decades. However, people are facing increasingly significant network security risks and enormous losses caused by phishing webpages, which imitate the interfaces of real webpages to deceive victims. To better identify and distinguish phishing webpages, a visual feature extraction method and a visual similarity algorithm are proposed. First, the visual feature extraction method improves the Vision-based Page Segmentation (VIPS) algorithm to extract visual blocks and calculates each block's signature using perceptual hash technology. Second, the visual similarity algorithm establishes a one-to-one correspondence based on the visual blocks' coordinates and thresholds. Weights are then assigned according to the tree structure, and the similarity of the visual blocks is calculated from the Hamming distance of their visual features. Further, the visual similarity of webpages is generated by integrating the similarities and weights of the different visual blocks. Finally, multiple pairs of phishing and legitimate webpages are evaluated to verify the feasibility of the algorithm. The experimental results demonstrate that our method achieves 94% accuracy.
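The per-block signature and comparison step can be sketched with a standard average (perceptual) hash and a normalized Hamming similarity. This toy assumes grayscale blocks whose dimensions are multiples of the hash size; the paper's improved VIPS segmentation and tree-structure weighting are not shown.

```python
import numpy as np

def average_hash(block, size=8):
    """Average hash of a 2-D grayscale block: box-downsample to
    size x size, threshold at the mean, flatten to a bit vector.
    Assumes the block's height and width are multiples of `size`."""
    block = np.asarray(block, dtype=float)
    h, w = block.shape
    small = block.reshape(size, h // size, size, w // size).mean(axis=(1, 3))
    return (small > small.mean()).astype(np.uint8).ravel()

def hamming_similarity(hash_a, hash_b):
    """1 - normalized Hamming distance; 1.0 means identical hashes."""
    return 1.0 - float(np.count_nonzero(hash_a != hash_b)) / len(hash_a)
```

A page-level score would then be a weighted sum of these block similarities over the matched block pairs, with weights derived from the blocks' positions in the segmentation tree.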
Funding: Science & Technology Foundation of Shanghai (Grant No. 05JC14021).
Abstract: During product family design, it is necessary to reduce the variety of components and share common components among many products; the major benefits are reduced design effort and lower costs. This paper therefore presents an approach to standardizing the components of a product family. Form-feature modeling of components is discussed. Based on similarity analysis, a step-by-step method for standardizing the feature architectures of components is described, and the algorithms for standardization are identified. A case study standardizing the components of an auto-body family demonstrates the validity of the approach.
Abstract: To ensure that the large-scale application of photovoltaic (PV) power generation does not affect grid stability, accurate PV power generation forecasting is essential. A short-term PV power forecasting method combining K-means++, grey relational analysis (GRA), and support vector regression (SVR) based on feature selection (hybrid K-means-GRA-SVR, HKGSVR) is proposed. The historical power data were clustered by a multi-index K-means++ algorithm and divided into ideal and non-ideal weather. The GRA algorithm was used to match the similar day and the nearest-neighbor similar day of the prediction day, and appropriate input features were selected for each weather type to train the SVR model. Under ideal weather, the average values of MAE, RMSE, and R² were 0.8101, 0.9608 kW, and 99.66%, respectively, and the method reduced the average training time by 77.27% compared with the standard SVR model. Under non-ideal weather conditions, the average values of MAE, RMSE, and R² were 1.8337, 2.1379 kW, and 98.47%, respectively, and the method reduced the average training time of the standard SVR model by 98.07%. The experimental results show that the prediction accuracy of the proposed model is significantly better than that of the other five models, verifying the effectiveness of the method.
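The GRA similar-day matching step can be sketched with the standard grey relational grade: compare each historical day's feature sequence against the prediction day's, using the global minimum and maximum absolute differences and a resolution coefficient (conventionally rho = 0.5). This is a minimal illustration of GRA only; the K-means++ clustering and SVR training stages are assumed to happen around it.

```python
import numpy as np

def grey_relational_grades(reference, history, rho=0.5):
    """Grey relational grade of each historical day's feature sequence
    against the reference (prediction) day; values in (0, 1]."""
    ref = np.asarray(reference, dtype=float)
    diffs = np.abs(np.asarray(history, dtype=float) - ref)  # (days, features)
    dmin, dmax = diffs.min(), diffs.max()
    if dmax == 0:                      # all candidates identical to reference
        return np.ones(len(diffs))
    coeff = (dmin + rho * dmax) / (diffs + rho * dmax)
    return coeff.mean(axis=1)

def most_similar_day(reference, history, rho=0.5):
    """Index of the historical day with the highest grey relational grade."""
    return int(np.argmax(grey_relational_grades(reference, history, rho)))
```

The chosen similar day's features then become candidate inputs for the SVR model of the matching weather type.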
Abstract: In recent years, many types of semantic similarity measures have been proposed for measuring the similarity between two concepts, and it is necessary to characterize the differences among the measures, their performance, and their evaluations. The major contribution of this paper is to choose, among different similarity measures, the one that gives good results with the lowest error rate. The experiment was done on a taxonomy built to measure the semantic distance between two concepts in the health domain, which are represented as nodes in the taxonomy. The similarity measures were evaluated relative to human experts' ratings. Our experiment was applied to the ICD-10 taxonomy to determine the similarity value between two concepts. The similarity between 30 pairs of health-domain concepts was evaluated using different semantic similarity measure equations. The experimental results discussed in this paper show that the measure of Hoa A. Nguyen and Hisham Al-Mubaid achieved the highest matching score against the experts' judgment.
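As one representative of the family of taxonomy-based measures compared in such studies, the classic Wu-Palmer similarity scores two concepts by the depth of their least common subsumer relative to their own depths. The tiny parent-map taxonomy below is a made-up illustration, not ICD-10, and Wu-Palmer is only one of the candidate measures, not necessarily the winning one.

```python
def ancestors(node, parent):
    """Path from a node up to the root, the node itself included."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def depth(node, parent):
    """Depth of a node in the taxonomy (the root has depth 1)."""
    return len(ancestors(node, parent))

def wu_palmer(c1, c2, parent):
    """Wu-Palmer similarity: 2 * depth(LCS) / (depth(c1) + depth(c2)),
    where LCS is the least common subsumer of the two concepts."""
    anc1 = ancestors(c1, parent)
    anc2 = set(ancestors(c2, parent))
    lcs = next(a for a in anc1 if a in anc2)
    return 2 * depth(lcs, parent) / (depth(c1, parent) + depth(c2, parent))
```

Evaluating a measure then reduces to computing such scores for the 30 concept pairs and correlating them with the experts' ratings.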
Funding: Supported by the Natural Science Foundation of China (62072388, 62276146), the Industry Guidance Project Foundation of the Science and Technology Bureau of Fujian Province (2020H0047), the Natural Science Foundation of the Science and Technology Bureau of Fujian Province (2019J01601), the Creation Fund Project of the Science and Technology Bureau of Fujian Province (JAT190596), and the Putian University Research Project (2022034).
Abstract: Background: Image matching is crucial in numerous computer vision tasks, such as 3D reconstruction and simultaneous visual localization and mapping, and matching accuracy significantly impacts subsequent studies. When image pairs contain comparable patterns but the feature pairs are positioned differently, their local similarity can cause incorrect recognition if global motion consistency is disregarded. Methods: This study proposes an image-matching filtering algorithm based on global motion consistency. It can serve as a subsequent filter for the initial matching results generated by other matching algorithms, based on the principle of motion smoothness. A particular matching algorithm is first used to perform the initial matching; then, the rotation and movement information of the global feature vectors is combined to effectively identify outlier matches. The principle is that if a matching result is accurate, the vectors formed by any matched points should have similar rotation angles and moving distances. Thus, consistency of global motion direction and global motion distance is used to reject outliers caused by similar patterns in different locations. Results: Four datasets were used to test the effectiveness of the proposed method: three datasets with similar patterns in different locations tested the results on similar images that other algorithms could easily mismatch, and one commonly used dataset tested the results on the general image-matching problem. The experimental results suggest that the proposed method is more accurate than other state-of-the-art algorithms in identifying mismatches in the initial matching set. Conclusions: The proposed outlier-rejection matching method can significantly improve matching accuracy for similar images with locally similar feature pairs in different locations and can provide more accurate matching results for subsequent computer vision tasks.
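The consistency test described above can be sketched directly: form the displacement vector of each matched point pair, then reject matches whose direction or length deviates too far from the dominant (here, median) global motion. This is a minimal illustration under a pure-translation assumption; the tolerances are made-up parameters, and the paper's handling of rotation between views is not reproduced.

```python
import numpy as np

def motion_consistency_filter(pts_a, pts_b, angle_tol=0.3, dist_tol=0.3):
    """Keep matches whose displacement vector agrees with the dominant
    global motion in both direction (radians) and relative length."""
    pts_a = np.asarray(pts_a, dtype=float)
    pts_b = np.asarray(pts_b, dtype=float)
    vecs = pts_b - pts_a                       # one motion vector per match
    angles = np.arctan2(vecs[:, 1], vecs[:, 0])
    dists = np.linalg.norm(vecs, axis=1)
    med_angle, med_dist = np.median(angles), np.median(dists)
    # angular deviation wrapped into [0, pi]
    angle_dev = np.abs((angles - med_angle + np.pi) % (2 * np.pi) - np.pi)
    dist_dev = np.abs(dists - med_dist) / max(med_dist, 1e-9)
    return (angle_dev < angle_tol) & (dist_dev < dist_tol)
```

Applied after any initial matcher, the returned boolean mask selects the inlier subset, which is exactly the role of a "subsequent matching filter" in the pipeline the abstract describes.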