Funding: National Natural Science Foundation of China (No. 62006039); Shanghai Special Fund for Software and Integrated Circuit Industry Development, China (No. 180330).
Abstract: Clothing parsing, also known as clothing image segmentation, is the problem of assigning a clothing category label to each pixel in a clothing image. To address the lack of positional and global priors in existing clothing parsing algorithms, this paper proposes an enhanced positional attention module (EPAM) to collect positional information in the vertical direction of each pixel, and an efficient global prior module (GPM) to aggregate contextual information from different sub-regions. The EPAM- and GPM-based residual network (EG-ResNet) can effectively exploit the intrinsic features of clothing images while capturing information across different scales and sub-regions. Experimental results show that the proposed EG-ResNet achieves promising performance in clothing parsing on the colorful fashion parsing dataset (CFPD), with a mean Intersection over Union (mIoU) of 51.12% and a pixel-wise accuracy (PA) of 92.79%, compared with other state-of-the-art methods.
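The two metrics quoted in this abstract, mIoU and PA, are commonly computed from a per-class confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code. The function names, the toy labels, and the choice to average IoU only over classes that appear in the prediction or the ground truth are assumptions made for illustration.

```python
def confusion_matrix(pred, truth, num_classes):
    """Count pixels: rows are ground-truth labels, columns are predicted labels."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, truth):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of pixels whose predicted label equals the ground-truth label."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:  # assumption: skip classes absent from both pred and truth
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Hypothetical flattened label maps for a six-pixel image with three classes.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(pred, truth, num_classes=3)
pa = pixel_accuracy(cm)
miou = mean_iou(cm)
```

On real segmentation output, the two label maps would be flattened to one list of class indices per image (or accumulated into a single confusion matrix over the whole test set) before calling these functions.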
Funding: Supported by the National High Technology Research and Development Programme of China (No. 2015AA015405).
Abstract: Social media platforms such as Twitter serve as a novel news medium and have become increasingly popular since their establishment. Large-scale, first-hand, user-generated tweets motivate automatic event detection on Twitter. Previous unsupervised approaches detected events by clustering words, using burstiness, which measures surges in word frequency within certain time windows. However, event clusters represented by sets of individual words are difficult to understand. This issue is addressed by building a document-level event detection model that directly calculates the burstiness of tweets, leveraging distributed word representations to model semantic information and thereby avoid sparsity. Results show that the document-level model not only offers event summaries that are directly human-readable, but also gives significantly improved accuracy compared with previous unsupervised tweet event detection methods, which are based on words or segments.
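The abstract describes burstiness only as a measure of surging frequency within a time window and gives no formula. As a rough illustration of the idea, the sketch below scores a window by how far its count lies above the historical mean, in units of the historical standard deviation. The z-score form, the `smoothing` constant, and the example counts are all assumptions; the paper's actual burstiness measure may differ.

```python
import statistics


def burstiness(history_counts, current_count, smoothing=1.0):
    """Hypothetical burstiness score: smoothed z-score of the current window.

    history_counts: counts of a word (or tweet cluster) in earlier time windows.
    current_count:  count in the window being scored.
    smoothing:      assumed constant added to the deviation to avoid division by zero.
    """
    mean = statistics.mean(history_counts)
    std = statistics.pstdev(history_counts)
    return (current_count - mean) / (std + smoothing)


# Hypothetical hourly counts for one term, followed by a sudden surge.
history = [2, 4, 4, 4, 5, 5, 7, 9]
score_surge = burstiness(history, 15)  # well above the historical mean
score_flat = burstiness(history, 5)    # exactly at the historical mean
```

A detector built on such a score would flag windows whose score exceeds some threshold; choosing that threshold is outside what the abstract specifies.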
Funding: National Natural Science Foundation of China (No. 62006039); National Key Research and Development Program of China (No. 2019YFE0190500).
Abstract: Near infrared-visible (NIR-VIS) face recognition aims to match an NIR face image to a VIS image. The main challenges of NIR-VIS face recognition are the gap between the two modalities and the lack of sufficient paired NIR-VIS face images to train models. This paper focuses on the generation of paired NIR-VIS face images and proposes a dual variational generator based on ResNeSt (RS-DVG). RS-DVG can generate a large number of paired NIR-VIS face images from noise, and these generated images can be used as training data together with real NIR-VIS face images. In addition, a triplet loss function is introduced and a novel triplet selection method is proposed specifically for training the face recognition model, which maximizes the inter-class distance and minimizes the intra-class distance among the input face images. The proposed method was evaluated on the CASIA NIR-VIS 2.0 and BUAA-VisNir datasets, and relatively good results were obtained.
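The triplet loss mentioned in this abstract is usually written as max(0, d(a, p) - d(a, n) + m), where a is an anchor embedding, p an embedding of the same identity, n an embedding of a different identity, and m a margin. The sketch below shows that standard form only. It does not reproduce the paper's triplet selection method, and the Euclidean distance, the margin value, and the toy embeddings are assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Standard triplet loss: zero once the negative is farther than the
    positive by at least `margin` (the margin value here is an assumption)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical 2-D embeddings: `same_id` is close to the anchor, `other_id` is far.
anchor = [0.0, 0.0]
same_id = [0.0, 1.0]   # distance 1 from the anchor
other_id = [3.0, 4.0]  # distance 5 from the anchor

loss_satisfied = triplet_loss(anchor, same_id, other_id)  # constraint already met
loss_violated = triplet_loss(anchor, other_id, same_id)   # roles swapped on purpose
```

Minimizing this loss pulls same-identity embeddings together and pushes different-identity embeddings apart, which matches the inter-class and intra-class distance goals stated in the abstract.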
Funding: National Natural Science Foundation of China (No. 62006039).
Abstract: Supervised models for event detection, especially neural models, usually require large-scale human-annotated training data. A data augmentation technique is proposed to improve event detection performance by generating paraphrase sentences that enrich the expressions in the original data. Specifically, starting from an existing human-annotated event detection dataset, we first automatically build a paraphrase dataset and label it with a purpose-designed event annotation alignment algorithm. To mitigate possible wrong labels in the generated paraphrase dataset, a multi-instance learning (MIL) method is adopted for joint training on both the gold human-annotated data and the generated paraphrase dataset. Experimental results on the widely used ACE2005 dataset show the effectiveness of our approach.