Funding: This work is supported by the Academic Research Project of Henan Police College (Grants HNJY-2021-QN-14 and HNJY202220) and the Key Technology R&D Program of Henan Province (Grant 222102210041).
Abstract: Images have become an essential medium for expressing meaning and disseminating information. Many images are uploaded to the Internet, and some of them are pornographic, with adverse effects on public psychological health. To create a clean and positive Internet environment, network enforcement agencies need an automatic and efficient pornographic image recognition tool. Previous studies on pornographic images rely mainly on convolutional neural networks (CNNs). Because CNNs have many parameters, they depend on a large labeled training dataset, which is laborious to build. To reduce the effect of the dataset on recognition performance, many researchers treat pornographic image recognition as a binary classification task. In practice, however, when such a model faces pornographic images with diverse features, its recognition accuracy often decreases. In addition, the pornographic content usually lies in several small local regions that make up only a small proportion of the image. A CNN, as a strongly supervised learning method, usually cannot automatically focus on these regions, which lowers recognition accuracy. This paper establishes an image dataset with seven classes by crawling pornographic websites and the Baidu Image Library, and proposes a weakly supervised pornographic image recognition method based on multiple instance learning (MIL). The Squeeze-and-Excitation (SE) module is introduced in feature extraction to strengthen critical information and weaken the influence of non-key and useless information on the recognition result. To meet the requirements of the pooling operation in MIL, an attention mechanism is used to compute a weighted average of the instances. The experimental results show that the proposed method achieves better accuracy and F1 scores than other methods.
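The attention-based pooling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring function, so the form used here (a tanh projection followed by a softmax over the instances in a bag) is an assumption borrowed from common attention-MIL formulations, and the names `attention_pool`, `V`, and `w` are hypothetical. In a real model `V` and `w` would be learned; here they are fixed by hand.

```python
import math

def attention_pool(instances, V, w):
    """Attention-weighted average of the instance embeddings in one bag.

    instances: list of K embeddings, each a list of D floats
    V: L x D projection matrix (hypothetical parameter, assumed learned)
    w: length-L scoring vector (hypothetical parameter, assumed learned)
    Returns (bag_embedding, attention_weights).
    """
    scores = []
    for h in instances:
        # Assumed scoring function: score = w . tanh(V h)
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(a * t for a, t in zip(w, hidden)))

    # Numerically stable softmax over the instances of the bag
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # The bag embedding is the weighted average of the instance embeddings
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return bag, weights

# Three image patches (instances) embedded in 2-D; the third scores highest.
bag, weights = attention_pool(
    instances=[[1.0, 0.0], [0.0, 1.0], [4.0, 4.0]],
    V=[[1.0, 0.0], [0.0, 1.0]],
    w=[1.0, 1.0],
)
```

Because the weights sum to one, the pooled vector stays in the same range as the instance embeddings, and the weights themselves indicate which regions contributed most to the bag-level decision.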
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60105004 and 60325207. Acknowledgements: The author wants to thank Min-Ling Zhang for running the experiments, Giancarlo Ruffo for providing the code of RELIC, and Nicolas Bredeche for providing the code of RIPPER-MI. A preliminary version of this paper was presented at ECML'03 (the 14th European Conference on Machine Learning).
Abstract: In multi-instance learning, the training set comprises labeled bags that are composed of unlabeled instances, and the task is to predict the labels of unseen bags. This paper studies multi-instance learning from the view of supervised learning. First, by analyzing some representative learning algorithms, this paper shows that multi-instance learners can be derived from supervised learners by shifting their focus from discrimination on the instances to discrimination on the bags. Second, considering that ensemble learning paradigms can effectively enhance supervised learners, this paper proposes to build multi-instance ensembles to solve multi-instance problems. Experiments on a real-world benchmark test show that ensemble learning paradigms can significantly enhance multi-instance learners.
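The idea of a multi-instance ensemble can be sketched as bagging over bags: each base learner is trained on a bootstrap sample of whole bags, and predictions are combined by majority vote. The base learner below is a deliberately simple stand-in (a threshold on the largest instance value in a bag), not one of the algorithms analyzed in the paper, and all function names are hypothetical.

```python
import random

def bag_score(bag):
    # Bag-level discrimination: score a bag by its most positive instance.
    return max(bag)

def fit_threshold(bags, labels):
    """Toy base learner: choose the bag-score threshold with fewest training errors."""
    candidates = sorted({bag_score(b) for b in bags})
    best_t, best_err = candidates[0], len(bags) + 1
    for t in candidates:
        err = sum((bag_score(b) >= t) != y for b, y in zip(bags, labels))
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def fit_ensemble(bags, labels, n_models=11, seed=0):
    """Train each base learner on a bootstrap sample drawn at the bag level."""
    rng = random.Random(seed)
    thresholds = []
    for _ in range(n_models):
        idx = [rng.randrange(len(bags)) for _ in bags]
        thresholds.append(
            fit_threshold([bags[i] for i in idx], [labels[i] for i in idx])
        )
    return thresholds

def predict(thresholds, bag):
    """Majority vote of the base learners on one unseen bag."""
    votes = sum(bag_score(bag) >= t for t in thresholds)
    return votes * 2 > len(thresholds)

# Each bag is a list of 1-D instance values; only the bags carry labels.
bags = [[0.1, 0.2], [0.3, 0.9], [0.2, 0.1, 0.15], [0.8, 0.05], [0.25], [0.7, 0.6]]
labels = [False, True, False, True, False, True]
ensemble = fit_ensemble(bags, labels)
```

Resampling whole bags rather than individual instances keeps every bag intact, which matters because the labels are only defined at the bag level.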
Abstract: We investigate a problem of object-oriented (OO) software quality estimation from a multi-instance (MI) perspective. Each set of classes that share an inheritance relation, called a 'class hierarchy', is regarded as a bag, while each class in the set is regarded as an instance. The learning task in this study is to estimate the label of unseen bags, i.e., the fault-proneness of untested class hierarchies. A fault-prone class hierarchy contains at least one fault-prone (negative) class, while a non-fault-prone (positive) one has no negative class. Based on the modification records (MRs) of the previous project releases and OO software metrics, the fault-proneness of an untested class hierarchy can be predicted. Several selected MI learning algorithms were evaluated on five datasets collected from an industrial software project. Among the MI learning algorithms investigated in the experiments, the kernel method using a dedicated MI-kernel was better than the others at accurately predicting the fault-proneness of the class hierarchies. In addition, when compared to a supervised support vector machine (SVM) algorithm, the MI-kernel method still had competitive performance at much lower cost.
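The bag construction described above can be sketched as follows: classes are grouped by the hierarchy they belong to, and a hierarchy is marked fault-prone as soon as one of its classes is. The record layout, the two metrics per class, and the name `build_bags` are illustrative assumptions; the abstract does not specify which OO metrics were used.

```python
from collections import defaultdict

def build_bags(classes):
    """Group classes into class-hierarchy bags and derive bag labels.

    classes: iterable of (hierarchy_id, metrics, is_fault_prone) records,
             where `metrics` is a tuple of OO metrics (hypothetical layout).
    Returns (bags, fault_prone): bags maps a hierarchy to its instances,
    fault_prone maps a hierarchy to True if any of its classes is fault-prone.
    """
    bags = defaultdict(list)
    fault_prone = defaultdict(bool)
    for hierarchy_id, metrics, is_fault_prone in classes:
        bags[hierarchy_id].append(metrics)
        # One fault-prone class is enough to make the whole hierarchy fault-prone.
        fault_prone[hierarchy_id] = fault_prone[hierarchy_id] or is_fault_prone
    return dict(bags), dict(fault_prone)

# Hypothetical records: (hierarchy, (inheritance depth, method count), fault-prone?)
records = [
    ("Shape", (1, 12), False),
    ("Shape", (2, 30), True),
    ("Logger", (1, 5), False),
]
bags, fault_prone = build_bags(records)
```

At prediction time only the bags of an untested release would be available, and an MI learner trained on earlier releases would estimate the `fault_prone` value for each hierarchy.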