Retinal images play an essential role in the early diagnosis of ophthalmic diseases. Automatic segmentation of retinal vessels in color fundus images is challenging due to the morphological differences between the retinal vessels and the low-contrast background. At the same time, automated models struggle to capture representative and discriminative retinal vascular features. To fully utilize the structural information of the retinal blood vessels, we propose a novel deep learning network called Pre-Activated Convolution Residual and Triple Attention Mechanism Network (PCRTAM-Net). PCRTAM-Net uses the pre-activated dropout convolution residual method to improve the feature learning ability of the network. In addition, the residual atrous convolution spatial pyramid is integrated into both ends of the network encoder to extract multiscale information and improve blood vessel information flow. A triple attention mechanism is proposed to extract the structural information between vessel contexts and to learn long-range feature dependencies. We evaluate the proposed PCRTAM-Net on four publicly available datasets: DRIVE, CHASE_DB1, STARE, and HRF. Our model achieves state-of-the-art performance of 97.10%, 97.70%, 97.68%, and 97.14% for ACC and 83.05%, 82.26%, 84.64%, and 81.16% for F1, respectively.
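The abstract does not spell out the residual atrous convolution spatial pyramid, so the following is only a one-dimensional sketch of the general idea: the same kernel is applied at several dilation rates to see context at several scales, the branches are fused, and the input is added back as a residual. The function names, the choice of rates, and fusion by averaging are all assumptions made for illustration, not the authors' design.

```python
def atrous_conv1d(signal, kernel, rate):
    """Dilated 1D cross-correlation with zero padding; output has the input length.

    Assumes an odd-length kernel so that it can be centered on each position.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # taps are spaced `rate` samples apart
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def residual_atrous_pyramid(signal, kernel, rates=(1, 2, 4)):
    """Run one kernel at several dilation rates, average the branches,
    and add the input back (residual connection). Averaging is an assumption."""
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    fused = [sum(vals) / len(rates) for vals in zip(*branches)]
    return [x + f for x, f in zip(signal, fused)]


# A single impulse shows how each rate widens the receptive field.
impulse = [0, 0, 0, 1, 0, 0, 0]
box = [1, 1, 1]
rate1 = atrous_conv1d(impulse, box, rate=1)
rate2 = atrous_conv1d(impulse, box, rate=2)
pyramid = residual_atrous_pyramid(impulse, box, rates=(1, 2))
```

With rate 1 the three taps cover adjacent samples, while with rate 2 the same three taps span five samples, which is how a pyramid of rates gathers multiscale context without adding kernel weights.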
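To address the complexity of entity overlap in natural language text and the difficulty of extracting multiple relational triples, a joint triple extraction model that fuses pointer networks with relation embeddings is proposed. First, the BERT (Bidirectional Encoder Representations from Transformers) pre-trained model is used to encode the input sentence. Then, start and end pointer tagging is used to extract all subjects in the sentence, and a subject- and relation-guided attention mechanism is used to distinguish how important each relation label is to each word, thereby adding relation label information to the sentence embedding. Finally, for each subject and each relation, pointer tagging and a cascade structure are used to extract the corresponding objects and generate relational triples. Extensive experiments were conducted on two datasets, New York Times (NYT) and Web Natural Language Generation (WebNLG). The results show that, compared with the current best model, the cascade binary tagging framework (CasRel), the overall performance of the proposed model improves by 1.9 and 0.7 percentage points, respectively. Compared with the span-based extract-then-label method (ETL-Span), it achieves performance gains of more than 6.0% and more than 3.7%, respectively, in comparative experiments on sentences containing 1 to 5 triples. In particular, on complex sentences containing more than 5 triples, the F1 score of the proposed model improves by 8.5 and 1.3 percentage points, respectively, and the model maintains stable extraction ability while capturing more entity pairs, which further verifies its effectiveness on the overlapping triple problem.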
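The start and end pointer tagging step can be illustrated with a small decoding sketch. It assumes the tagger emits one start probability and one end probability per token, and it pairs every start with the nearest end at or after it. The threshold of 0.5, the nearest-end pairing rule, the function name `decode_spans`, and the example sentence are all assumptions, since the abstract does not give the decoding rule.

```python
def decode_spans(start_probs, end_probs, threshold=0.5):
    """Turn per-token start/end probabilities into (start, end) index pairs.

    Each start above the threshold is paired with the nearest end at or
    after it (an assumed heuristic; the abstract does not specify one).
    """
    starts = [i for i, p in enumerate(start_probs) if p > threshold]
    ends = [i for i, p in enumerate(end_probs) if p > threshold]
    spans = []
    for s in starts:
        following = [e for e in ends if e >= s]
        if following:
            spans.append((s, following[0]))
    return spans


# Hypothetical tagger output for a seven-token sentence.
tokens = ["Jackie", "Chan", "was", "born", "in", "Hong", "Kong"]
start_probs = [0.9, 0.1, 0.0, 0.0, 0.1, 0.8, 0.2]
end_probs = [0.2, 0.95, 0.0, 0.0, 0.0, 0.1, 0.9]

spans = decode_spans(start_probs, end_probs)
entities = [" ".join(tokens[s:e + 1]) for s, e in spans]
```

In a cascade structure, the same decoding would be run again for each extracted subject and each relation to find the matching objects, which is what allows one subject to take part in several overlapping triples.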
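To address the problem that detection of multi-oriented text remains unsatisfactory because of large scale variation and interference from complex backgrounds, this paper proposes a multi-oriented text detection method based on an attention mechanism. First, considering the abundance of interfering information in natural scenes, a text feature extraction network (text feature information ResNet50, TF-ResNet) is constructed to extract text feature information from the image. Second, a text attention module (TAM) is added to the feature fusion model to suppress irrelevant information while highlighting text information, thereby strengthening the latent relationships among text features. Finally, a progressive expansion module gradually fuses and expands the multiple segmentation results of different scales obtained in the preceding stages to produce accurate detection results. The method is validated and analyzed experimentally on the CTW1500 and ICDAR2015 datasets, where its F-measure reaches 80.4% and 83.0%, respectively, improvements of 2.0% and 2.4% over the second-best methods, indicating that the method is competitive with other methods for multi-oriented text detection.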
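The abstract names a progressive expansion module without describing it. One plausible reading, sketched below purely as an assumption, is breadth-first scale expansion in the style of PSENet: connected components of the smallest segmentation map seed the text instances, and each instance then grows into the next larger map, with contested pixels going to whichever instance reaches them first. The function names and the toy masks are hypothetical.

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity


def label_components(mask):
    """Label 4-connected foreground regions of a binary mask with 1, 2, ..."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not labels[r][c]:
                next_label += 1
                labels[r][c] = next_label
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in NEIGHBORS:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels


def progressive_expand(masks):
    """Grow instance labels from the smallest mask into each larger one.

    `masks` must be ordered from the smallest scale to the largest.
    """
    labels = label_components(masks[0])
    h, w = len(labels), len(labels[0])
    for mask in masks[1:]:
        queue = deque((r, c) for r in range(h) for c in range(w) if labels[r][c])
        while queue:
            y, x = queue.popleft()
            for dy, dx in NEIGHBORS:
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and mask[ny][nx] and not labels[ny][nx]):
                    labels[ny][nx] = labels[y][x]  # first instance to arrive wins
                    queue.append((ny, nx))
    return labels


# Two small seeds that would merge into one blob in the large mask alone.
small = [[0, 1, 0, 0, 0, 1, 0]]
large = [[1, 1, 1, 1, 1, 1, 1]]
instances = progressive_expand([small, large])
```

Seeding from the smallest map is what keeps adjacent text lines apart: in the toy example the large mask is a single connected region, yet the result still contains two separate instances.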
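To address the problem that most current cross-modality person re-identification algorithms have weak clustering ability and struggle to extract highly discriminative features, a multi-granularity cross-modality person re-identification algorithm is proposed. First, a non-local attention module is added to the ResNet50 backbone to attend to relationships between distant pixels and preserve detail information. Second, a multi-branch network extracts feature information at different granularities, strengthening the model's ability to extract discriminative features. Finally, a sample-based triplet loss and a center-based triplet loss jointly supervise training to accelerate model convergence. The proposed algorithm achieves a Rank-1 accuracy and mean average precision (mAP) of 62.83% and 58.10%, respectively, in the all-search mode of the SYSU-MM01 dataset, and a Rank-1 accuracy and mAP of 87.78% and 76.22%, respectively, in the visible-to-infrared mode of the RegDB dataset.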
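The difference between the two losses can be shown with a minimal sketch. It assumes the standard margin-based triplet form, max(0, d(a, p) - d(a, n) + margin), applied once to individual feature vectors and once to per-identity feature means. The margin of 0.3, the Euclidean distance, the way centers are formed per modality, and all feature values are assumptions, since the abstract gives none of these details.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on the gap between anchor-positive and anchor-negative distances."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def class_center(features):
    """Mean feature vector of one identity in one modality."""
    n = len(features)
    return [sum(col) / n for col in zip(*features)]


# Hypothetical 2-D features: identity A in both modalities, identity B in visible.
visible_a = [[0.0, 0.0], [0.0, 2.0]]
infrared_a = [[0.0, 1.0], [2.0, 1.0]]
visible_b = [[4.0, 1.0], [6.0, 1.0]]

# Sample-based: one anchor sample, one positive sample, one negative sample.
sample_loss = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])

# Center-based: the same formula applied to identity centers.
center_loss = triplet_loss(
    class_center(visible_a), class_center(infrared_a), class_center(visible_b)
)
```

In this toy setup the sample-based term is large because a single hard positive lies far from its anchor, while the center-based term is already zero because averaging smooths out that outlier. The two terms therefore give complementary training signals.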
Funding: Supported by the Open Funds from Guangxi Key Laboratory of Image and Graphic Intelligent Processing under Grant No. GIIP2209, the National Natural Science Foundation of China under Grant Nos. 62172120 and 62002082, and the Natural Science Foundation of Guangxi Province of China under Grant Nos. 2019GXNSFAA245014 and 2020GXNSFBA238014.