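To address weak feature extraction and low segmentation accuracy in metal coating defect image segmentation, an improved U²-Net segmentation model is proposed. First, an improved light receptive field block (RFB_l) that enlarges the receptive field is embedded in the residual U-block (RSU) to form a new feature extraction layer, which strengthens the learning of detail features and resolves the low segmentation accuracy caused by the network's limited receptive field. Second, an effective contour enhanced attention (CEA) mechanism is introduced in the decoding stage of the U²-Net model to suppress redundant features in the network and obtain feature attention maps with detailed positional information, which increases the contrast between boundary and background information and thereby yields more accurate segmentation. Experimental results show that, on two datasets of metal coating peeling and corrosion, the model achieves a mean intersection over union, accuracy, precision, recall, and F1-measure of 80.36%, 96.29%, 87.43%, 84.61%, and 86.00%, respectively, a considerable improvement over the commonly used SegNet, U-Net, and U²-Net segmentation networks.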
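The segmentation metrics named in this abstract can be sketched from per-class pixel counts. This is a minimal illustration using the standard definitions, not the authors' evaluation code; the function names and the toy counts are hypothetical. The last lines check that the reported precision and recall are consistent with the reported F1-measure.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def segmentation_metrics(tp: int, fp: int, fn: int):
    """Return (precision, recall, F1, IoU) for one class.

    tp, fp, fn are counts of true-positive, false-positive and
    false-negative pixels for that class.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, f1_from(precision, recall), iou


# Toy counts (hypothetical): 8 correct pixels, 2 false alarms, 2 misses.
print(segmentation_metrics(8, 2, 2))

# Consistency check on the figures quoted in the abstract.
print(round(f1_from(0.8743, 0.8461), 4))
```

Mean intersection over union is the average of the per-class IoU values; the averaging across classes and datasets used in the paper is not stated in the abstract.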
Funding: Scientific and Technological Innovation Project of Colleges and Universities in Shanxi Province, Grant/Award Number: 2020L0294; Shanxi Province Science Foundation for Youths, Grant/Award Number: 201901D211249.
Abstract: Rapid coal-rock identification is one of the key technologies for intelligent and unmanned coal mining. Existing image recognition algorithms cannot yet satisfy practical needs in recognition speed and accuracy. Given the evident differences between coal and rock in visual attributes such as color, gloss, and texture, the complete local binary pattern (CLBP) image feature descriptor is introduced for coal and rock image recognition. Because the original algorithm oversimplifies local texture features by ignoring imaging information from higher-order pixels and the concave and convex areas between adjacent sampling points, this paper proposes a higher-order differential median CLBP image feature descriptor that replaces the original CLBP center pixel gray value with a local gray median and replaces the binary differential with a second-order differential. To address the high dimensionality and feature redundancy of the CLBP descriptor histogram, deep learning receptive field theory is introduced to achieve nonlinear dimensionality reduction and deep feature extraction. The experiments support the following conclusions: (1) compared with the original CLBP, the recognition accuracy of the improved CLBP algorithm is greatly improved and stabilizes above 94.3% under strong noise interference; (2) the coal-rock image recognition model that fuses the improved CLBP with receptive field theory takes 0.0035 s to recognize a single image, 71.0% less than the original CLBP model and 97.0% less than the improved CLBP model without receptive field fusion, while its accuracy remains above 98.5%. The method offers a valuable technical reference for the fields of mineral development and deep mining.
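The idea of thresholding at a local gray median instead of the center pixel can be sketched as follows. This is only an illustration under stated assumptions: a 3x3 neighborhood, a median taken over all nine pixels, a clockwise bit order, and a `>=` comparison are choices made here, not details given in the abstract. The second-order differential and the magnitude and center components of CLBP are omitted because the abstract does not define them. The function name `lbp_sign_code` is hypothetical.

```python
import numpy as np


def lbp_sign_code(patch: np.ndarray, use_median: bool = True) -> int:
    """8-bit sign code of a 3x3 patch.

    use_median=True thresholds the neighbors at the patch median
    (assumed variant); use_median=False thresholds at the center pixel,
    as in the classic local binary pattern.
    """
    if patch.shape != (3, 3):
        raise ValueError("expected a 3x3 patch")
    threshold = np.median(patch) if use_median else patch[1, 1]
    # Neighbors in clockwise order, starting at the top-left corner.
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    bits = [1 if patch[r, c] >= threshold else 0 for r, c in order]
    return sum(bit << i for i, bit in enumerate(bits))


# A smooth gradient whose center pixel is corrupted by a noise spike (250).
patch = np.array([[10, 20, 30],
                  [40, 250, 60],
                  [70, 80, 90]])

print(lbp_sign_code(patch, use_median=False))  # center threshold: spike hides the gradient
print(lbp_sign_code(patch, use_median=True))   # median threshold: gradient still encoded
```

With the center threshold, the noisy center exceeds every neighbor and the code collapses to zero; with the median threshold, the lower-right half of the gradient still sets its bits, which is one plausible reading of why a median helps under strong noise.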
Abstract: The study was performed on neurons with direction-selective (DS) receptive fields (RFs) in the primary visual cortex of the cat. Preferred directions (PDs) of these cells were compared for a single light spot and for a system of two identical light spots moving across the RF with a given angle between them. Directional interactions appeared when the angle between the directions of the two moving spots was 30° or 60°. For 56% of the cells, the PD coincided with the bisector of these angles. These cells responded to a combination of the two moving stimuli as if only one stimulus moved in the RF in an intermediate direction. This direction coincided with the PD of the DS neuron for a single spot. The investigation also revealed that DS neurons responded to stimuli moving at an angle of 180° (in the preferred and opposite directions simultaneously). In a further experiment, we investigated responses of DS cells in the primary visual cortex to a pair of spots moving across the RF with an angle of 60° between their directions. These cells responded to a combination of the two moving stimuli as if only one stimulus moved in the RF in an intermediate direction. The higher the relative luminance of one spot in the pair, the closer the intermediate direction came to the direction of that spot.
Funding: Supported by the National Science and Technology Innovation 2030 Major Program (2022ZD0204802, 2022ZD0204804), the National Natural Science Foundation of China (31930053, 32171039), and the Beijing Academy of Artificial Intelligence (BAAI).
Abstract: The concept of the receptive field (RF) is central to sensory neuroscience. Neuronal RF properties have been studied extensively in animals, while those in humans remain nearly unexplored. Here, we measured neuronal RFs with intracranial local field potentials (LFPs) and spiking activity in human visual cortex (V1/V2/V3). We recorded LFPs via macro-contacts and found that RF sizes estimated from low-frequency activity (LFA, 0.5–30 Hz) were larger than those estimated from low-gamma activity (LGA, 30–60 Hz) and high-gamma activity (HGA, 60–150 Hz). We then took a rare opportunity to record LFPs and spiking activity simultaneously via microwires in V1. RF sizes and temporal profiles measured from LGA and HGA closely matched those from spiking activity. In sum, this study shows that the spiking activity of neurons in human visual cortex can be well approximated by LGA and HGA in RF estimation and temporal profile measurement, implying pivotal functions of LGA and HGA in early visual information processing.
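Abstract: Because low-light images suffer from low contrast, severe loss of detail, and heavy noise, existing object detection algorithms perform poorly on them. This paper therefore proposes a low-light object detection method that combines a spatial-aware attention mechanism with multi-scale feature fusion (Spatial-aware Attention Mechanism and Multi-Scale Feature Fusion, SAM-MSFF). The method first fuses multi-scale features through a multi-scale interactive memory pyramid, which enhances the useful information in low-light image features, and uses memory vectors to store sample features and capture latent correlations between samples. It then introduces a spatial-aware attention mechanism to obtain long-range contextual information and local information in the spatial domain, which enhances target features in low-light images and suppresses interference from background information and noise. Finally, a multi-receptive-field enhancement module expands the receptive field of the features and applies grouped re-weighting to features with different receptive fields, so that the detection network adaptively adjusts its receptive field size according to the multi-scale input. In experiments on the ExDark dataset, the method reaches a mean average precision (mAP) of 77.04%, which is 2.6% to 14.34% higher than existing mainstream object detection methods.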
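Abstract: Object detection from a UAV viewpoint faces difficulties such as small targets, dense distribution, and class imbalance, and the hardware constraints of UAVs limit model size, which leads to low accuracy. An improved YOLOv8s model that fuses multiple attention mechanisms is proposed. In the backbone, receptive-field attention convolution and the CBAM (convolutional block attention module) attention mechanism are introduced to improve the convolution module; this resolves the sharing of attention weight parameters across receptive-field features while adding attention weights in the channel and spatial dimensions, which strengthens feature extraction. Drawing on the idea of large separable convolution attention, the spatial pyramid pooling layer is rebuilt to increase information exchange between features at different levels. The neck structure is optimized by adding a feature layer rich in small-target semantic information. The MPDIoU (minimum point distance based IoU) function is improved using the idea of the inner-IoU loss, and innerMPDIoU replaces the original loss function to improve learning on hard samples. Experimental results show that the improved YOLOv8s model raises mAP, P, and R on the VisDrone dataset by 16.1%, 9.3%, and 14.9%, respectively, outperforms YOLOv8m, and can be applied effectively to object detection on UAV platforms.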
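One way to read the innerMPDIoU idea is sketched below. It assumes the commonly cited MPDIoU form (IoU minus the squared distances between the two boxes' top-left and bottom-right corners, each normalized by the squared image diagonal) and an inner-IoU term computed on boxes shrunk about their centers by a ratio. How the paper actually combines the two is not stated in the abstract, so the combination, the default `ratio=0.8`, and all function names here are assumptions.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def inner_box(box, ratio):
    """Scale a box about its center by `ratio` (auxiliary inner box)."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    hw, hh = (box[2] - box[0]) * ratio / 2, (box[3] - box[1]) * ratio / 2
    return (cx - hw, cy - hh, cx + hw, cy + hh)


def inner_mpdiou_loss(pred, target, img_w, img_h, ratio=0.8):
    """Assumed combination: 1 - (inner IoU - corner-distance penalties)."""
    norm = img_w ** 2 + img_h ** 2
    d1 = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2  # top-left corners
    d2 = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2  # bottom-right corners
    inner = iou(inner_box(pred, ratio), inner_box(target, ratio))
    return 1.0 - (inner - d1 / norm - d2 / norm)


# Identical boxes give zero loss; disjoint boxes give a loss above 1.
print(inner_mpdiou_loss((0, 0, 10, 10), (0, 0, 10, 10), 100, 100))
print(inner_mpdiou_loss((0, 0, 10, 10), (20, 20, 30, 30), 100, 100))
```

Unlike a plain IoU loss, the corner-distance terms keep the loss informative even when the boxes do not overlap, which is relevant for the small, densely packed targets the abstract describes.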
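Abstract: To address the high false detection rate and miss rate of pedestrian detection in complex environments, an improved model based on YOLOv5s, YOLOv5s-RFDH, is proposed. While retaining the YOLOv5s baseline network, the model optimizes the feature extraction and detection parts to improve the accuracy and robustness of pedestrian detection in complex scenes. Pedestrian detection is performed on the CrowdHuman and WiderPerson datasets. Because pedestrians in these datasets are dense and heavily occluded, the K-Means++ clustering algorithm is used to re-cluster the datasets and obtain anchor boxes suited to the data. A receptive field block (RFB) is introduced for feature extraction; it uses dilated convolutions in different branches to enlarge the receptive field and extract deeper feature information, and finally fuses these features, which improves detection accuracy for small pedestrian targets. Because a decoupled head can address the scale invariance problem in object detection, a decoupled detection head is introduced to separate the classification and regression tasks, so that targets of different scales and sizes are detected more accurately. Comparative experiments on test sets split from the CrowdHuman and WiderPerson datasets show that the improved model achieves higher detection accuracy and a lower miss rate: on the two datasets, detection accuracy rises by 1.4% and 1.2%, and the miss rate falls by 2.0% and 1.7%, respectively.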
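How dilated convolutions enlarge the receptive field can be made concrete with the standard arithmetic: a kernel of size k with dilation d spans k + (k - 1)(d - 1) input positions. The sketch below applies that formula to single branches and to a stack of layers. The dilation rates 1, 3, and 5 are illustrative values chosen here, not the configuration of the RFB used in YOLOv5s-RFDH, and the function names are hypothetical.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Input span covered by one dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers) -> int:
    """Receptive field of a stack of conv layers.

    layers: iterable of (kernel_size, stride, dilation), input to output.
    """
    rf, jump = 1, 1  # jump = spacing of adjacent outputs measured in input pixels
    for kernel_size, stride, dilation in layers:
        rf += (effective_kernel(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


# Three parallel 3x3 branches with different (illustrative) dilation rates.
for d in (1, 3, 5):
    print(d, effective_kernel(3, d))

# A plain 3x3 conv followed by a 3x3 conv with dilation 3, both stride 1.
print(stacked_receptive_field([(3, 1, 1), (3, 1, 3)]))
```

Each branch keeps the same nine weights per channel but sees a wider context, so fusing the branches mixes fine and coarse context at no extra parameter cost per kernel.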