Crowdsourcing provides an effective and low-cost way to collect labels from crowd workers. Because crowd workers often lack professional knowledge, the quality of crowdsourced labels is relatively low. A common approach to addressing this issue is to collect multiple labels for each instance from different crowd workers and then use a label integration method to infer its true label. However, to our knowledge, almost all existing label integration methods use only the original attribute information and pay no attention to the quality of each instance's multiple noisy label set. To address these issues, this paper proposes a novel three-stage label integration method called attribute augmentation-based label integration (AALI). In the first stage, we design an attribute augmentation method to enrich the original attribute space. In the second stage, we develop a filter to single out reliable instances with high-quality multiple noisy label sets. In the third stage, we use majority voting to initialize the integrated labels of reliable instances and then use cross-validation to build multiple component classifiers on the reliable instances to predict all instances. Experimental results on simulated and real-world crowdsourced datasets demonstrate that AALI outperforms all the other state-of-the-art competitors.
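The second and third stages can be illustrated with a minimal sketch. It is not the AALI implementation: the abstract does not specify the filter criterion, the component classifier, or how component predictions are combined, so the agreement threshold, the nearest-centroid classifier, and the final vote over component predictions are all assumptions made for illustration. The attribute augmentation stage is omitted because the abstract gives no detail on it. All function names are hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def agreement(labels):
    """Fraction of workers whose label matches the majority label."""
    return Counter(labels).most_common(1)[0][1] / len(labels)


def select_reliable(label_sets, threshold=0.8):
    """Assumed filter: keep instances whose label agreement meets the threshold."""
    return [i for i, ls in enumerate(label_sets) if agreement(ls) >= threshold]


def fit_centroids(X, y):
    """Assumed component classifier: one mean vector (centroid) per class."""
    sums, counts = {}, Counter(y)
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroid(centroids, x):
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def integrate(X, label_sets, k=3, threshold=0.8):
    """Infer one label per instance from its attributes and noisy label set."""
    reliable = select_reliable(label_sets, threshold)
    # Majority voting initializes the integrated labels of reliable instances.
    init = {i: majority_vote(label_sets[i]) for i in reliable}
    # Cross-validation style split: each component classifier is trained on
    # the reliable instances outside one fold.
    models = []
    for fold in range(k):
        train = [i for pos, i in enumerate(reliable) if pos % k != fold]
        if train:
            models.append(fit_centroids([X[i] for i in train],
                                        [init[i] for i in train]))
    if not models:
        # No reliable instances: fall back to plain majority voting.
        return [majority_vote(ls) for ls in label_sets]
    # Every component classifier predicts every instance; the predictions
    # are combined by voting (an assumption of this sketch).
    return [majority_vote([predict_centroid(m, x) for m in models]) for x in X]


# Toy data: one attribute, two classes, five workers per instance.
X = [[0.0], [0.1], [0.2], [0.3], [1.0], [1.1], [1.2], [0.9]]
label_sets = [
    [0, 0, 0, 0, 0], [0, 0, 0, 0, 1], [0, 0, 0, 0, 0], [1, 1, 0, 1, 0],
    [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 0], [0, 1, 0, 1, 0],
]
print(integrate(X, label_sets))
```

In this toy data, the fourth and eighth instances have low-agreement label sets, so the assumed filter excludes them from training. Their final labels then come from classifiers trained on the remaining instances rather than from their own noisy votes.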
Funding: Supported by the Science and Technology Project of Hubei Province-Unveiling System (2021BEC007) and the Industry-University-Research Innovation Funds for Chinese Universities (2020ITA05008).