Funding: This paper is supported in part by the Natural Science Foundation of China (Nos. 71772107, 62072288) and the Shandong Nature Science Foundation of China (Grant Nos. ZR2019MF003, ZR2020MF044).
Abstract: It is not uncommon for malicious sellers to collude with fake reviewers (also called spammers) to write fake reviews for multiple products, either to demote competitors or to promote their own products' reputations, forming a gray industry chain. To detect spammer groups in a heterogeneous network with rich semantic information from both buyers and sellers, researchers have conducted extensive research using Frequent Item Mining-based and graph-based methods. However, these methods cannot detect spammer groups that launch cross-product attacks, and they do not jointly consider structural features, attribute features, and structure-attribute correlation, resulting in poorer detection performance. Therefore, we propose a collaborative training-based spammer group detection algorithm that constructs a heterogeneous induced sub-network based on the target product set to detect cross-product attack spammer groups. To jointly consider all available features, we use the collaborative training method to learn the feature representations of nodes. In addition, we use the DBSCAN clustering method to generate candidate groups, exclude innocent ones, and rank the remainder to obtain spammer groups. The experimental results on real-world datasets indicate that the overall detection performance of the proposed method is better than that of the baseline methods.
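The candidate-group step in this abstract can be illustrated with a minimal sketch. It assumes that each reviewer node already has a learned embedding vector and that candidate groups are simply the DBSCAN clusters over those vectors. The function names, the toy embeddings, and the `eps`/`min_pts` values are hypothetical, and the paper's filtering and ranking criteria are not given in the abstract, so they are not shown.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: return one cluster label per point (NOISE if unclustered)."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (point i included).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be re-labeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


def candidate_groups(reviewer_ids, labels):
    """Group reviewer ids by cluster label, dropping noise points."""
    groups = {}
    for rid, label in zip(reviewer_ids, labels):
        if label != NOISE:
            groups.setdefault(label, []).append(rid)
    return groups


# Toy 2-D embeddings (assumed): two tight groups and one isolated reviewer.
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
              (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
              (9.0, 0.0)]
reviewers = ["r1", "r2", "r3", "r4", "r5", "r6", "r7"]
groups = candidate_groups(reviewers, dbscan(embeddings, eps=0.3, min_pts=3))
```

In this toy run the isolated reviewer `r7` is marked as noise and appears in no candidate group; a real pipeline would then score and rank the remaining groups.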
Funding: Supported by the National Natural Science Foundation of China (Nos. 41971349, 41930107, 42090010 and 41501434) and the National Key Research and Development Program of China (Nos. 2017YFB0503704 and 2018YFC0809806).
Abstract: Without an explicit description of map application themes, it is difficult for users to discover desired map resources from massive online Web Map Services (WMS). However, metadata-based map application theme extraction is a challenging multi-label text classification task due to limited training samples, mixed vocabularies, and the variable length and arbitrary content of text fields. In this paper, we propose a novel multi-label text classification method, Text GCN-SW-KNN, based on geographic semantics and collaborative training to improve classification accuracy. The semi-supervised collaborative training adopts two base models: a Text Graph Convolutional Network (Text GCN) modified by utilizing the Semantic Web, named Text GCN-SW, and the widely used Multi-Label K-Nearest Neighbor (ML-KNN). Text GCN-SW improves on Text GCN by adjusting the adjacency matrix of the heterogeneous word-document graph with the shortest semantic distances between themes and words in the metadata text. The distances are calculated with the Semantic Web of Earth and Environmental Terminology (SWEET) and WordNet dictionaries. Experiments on both the WMS and layer metadata show that the proposed methods achieve higher F1-score and accuracy than state-of-the-art baselines, and demonstrate better stability in repeated experiments and robustness to less training data. Text GCN-SW-KNN can be extended to other multi-label text classification scenarios to better support metadata enhancement and geospatial resource discovery in the Earth Science domain.
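The collaborative training idea in this abstract, in which two base models trade confident pseudo-labels, can be sketched generically. This is not the Text GCN-SW-KNN method. It is a single-label toy under stated assumptions: two deliberately simple stand-in learners (a nearest-centroid model and a 1-nearest-neighbor model), a shared labeled pool, and a distance-margin confidence score. All class names, function names, and thresholds are hypothetical.

```python
from math import dist


def _closest(dist_by_label):
    """Return (best_label, margin) where margin is the gap to the runner-up."""
    ranked = sorted((d, label) for label, d in dist_by_label.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin


class NearestCentroid:
    """Stand-in base model 1: classify by the closest class centroid."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for k, v in enumerate(x):
                acc[k] += v
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}
        return self

    def predict_with_margin(self, x):
        return _closest({c: dist(x, m) for c, m in self.centroids.items()})


class OneNN:
    """Stand-in base model 2: classify by the single nearest labeled sample."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict_with_margin(self, x):
        best = {}
        for p, label in zip(self.X, self.y):
            d = dist(x, p)
            if label not in best or d < best[label]:
                best[label] = d
        return _closest(best)


def co_train(models, X_lab, y_lab, X_unl, rounds=5, min_margin=1.0):
    """Each round, every model nominates its most confident unlabeled sample."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(rounds):
        for m in models:
            m.fit(X_lab, y_lab)
        picks = {}  # pool index -> pseudo-label (first nominating model wins)
        for m in models:
            best = None
            for i, x in enumerate(pool):
                label, margin = m.predict_with_margin(x)
                if margin >= min_margin and (best is None or margin > best[0]):
                    best = (margin, i, label)
            if best is not None:
                picks.setdefault(best[1], best[2])
        if not picks:
            break  # nothing confident enough is left (or the pool is empty)
        for i in sorted(picks, reverse=True):  # pop high indices first
            X_lab.append(pool.pop(i))
            y_lab.append(picks[i])
    for m in models:
        m.fit(X_lab, y_lab)  # final fit on the enlarged labeled set
    return models, X_lab, y_lab
```

A multi-label variant would replace the single predicted label with a set of labels per sample and would need a per-label confidence rule, which this sketch omits.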
Funding: Supported in part by the National Natural Science Foundation of China (NSFC) (Nos. 62033010, 62102134), in part by the Leading Talents of Science and Technology in the Central Plain of China (No. 224200510004), in part by the Key R&D Projects in Henan Province, China (No. 231111222600), in part by the Aeronautical Science Foundation of China (No. 2019460T5001), and in part by the Scientific and Technological Innovation Talents of Colleges and Universities in Henan Province, China (No. 22HASTIT014).
Abstract: Recently, the Cooperative Training Algorithm (CTA), a well-known Semi-Supervised Learning (SSL) technique, has garnered significant attention in the field of image classification. However, traditional CTA approaches face challenges such as high computational complexity and low classification accuracy. To overcome these limitations, we present a novel approach called the Weighted fusion based Cooperative Training Algorithm (W-CTA), which leverages the cooperative training technique and unlabeled data to enhance classification performance. Moreover, we introduce the K-means Cooperative Training Algorithm (km-CTA) to prevent the occurrence of local optima during the training phase. Finally, we conduct various experiments to verify the performance of the proposed methods. Experimental results show that W-CTA and km-CTA are effective and efficient on the CIFAR-10 dataset.
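The abstract does not say how W-CTA weights its classifiers, so the following is only an illustration of what a weighted fusion step could look like. It assumes two classifiers that output class-probability vectors, weights proportional to each classifier's validation accuracy, and a confidence threshold for accepting a pseudo-label. The weighting rule, the threshold, and the function names are all assumptions and are not taken from the paper.

```python
def fuse(prob_a, prob_b, acc_a, acc_b):
    """Blend two probability vectors with accuracy-proportional weights (assumed rule)."""
    w_a = acc_a / (acc_a + acc_b)
    w_b = 1.0 - w_a
    return [w_a * pa + w_b * pb for pa, pb in zip(prob_a, prob_b)]


def pseudo_label(prob_a, prob_b, acc_a, acc_b, threshold=0.8):
    """Return the fused class index if it is confident enough, else None."""
    fused = fuse(prob_a, prob_b, acc_a, acc_b)
    best = max(range(len(fused)), key=fused.__getitem__)
    return best if fused[best] >= threshold else None
```

With this rule, an unlabeled image on which the two classifiers disagree ends up with a flat fused distribution and is left unlabeled, while one on which they agree strongly receives a pseudo-label for the next training round.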