Journal Articles
1,013 articles found
1. SHEL: a semantically enhanced hardware-friendly entity linking method
Authors: 亓东林, CHEN Shudong, DU Rong, TONG Da, YU Yong. High Technology Letters (EI, CAS), 2024, No. 1, pp. 13-22.
With the help of pre-trained language models, the accuracy of the entity linking task has made great strides in recent years. However, most models with excellent performance require fine-tuning on a large amount of training data using large pre-trained language models, which is a hardware threshold to accomplish this task. Some researchers have achieved competitive results with less training data through ingenious methods, such as utilizing information provided by the named entity recognition model. This paper presents a novel semantic-enhancement-based entity linking approach, named semantically enhanced hardware-friendly entity linking (SHEL), which is designed to be hardware friendly and efficient while maintaining good performance. Specifically, SHEL's semantic enhancement approach consists of three aspects: (1) semantic compression of entity descriptions using a text summarization model; (2) maximizing the capture of mention contexts using asymmetric heuristics; (3) calculating a fixed-size mention representation through pooling operations. This series of semantic enhancement methods effectively improves the model's ability to capture semantic information while taking hardware constraints into account, and significantly improves the model's convergence speed by more than 50% compared with the strong baseline model proposed in this paper. In terms of performance, SHEL is comparable to the previous method, with superior performance on six well-established datasets, even though SHEL is trained using a smaller pre-trained language model as the encoder.
Keywords: entity linking (EL), pre-trained models, knowledge graph, text summarization, semantic enhancement
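The fixed-size mention representation described in the abstract above is obtained by pooling over the encoder's token embeddings. A minimal sketch of one common way to do this (masked mean pooling) is shown below; the tensor shapes and the function name are illustrative assumptions, not the authors' implementation.

```python
import torch

def pooled_mention_representation(token_embeddings: torch.Tensor,
                                  mention_mask: torch.Tensor) -> torch.Tensor:
    """Collapse a variable-length mention span into one fixed-size vector.

    token_embeddings: (batch, seq_len, hidden) encoder outputs.
    mention_mask:     (batch, seq_len) with 1 for tokens inside the mention span.
    Returns:          (batch, hidden) mean-pooled mention representation.
    """
    mask = mention_mask.unsqueeze(-1).float()       # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)   # sum over mention tokens only
    counts = mask.sum(dim=1).clamp(min=1.0)         # avoid division by zero
    return summed / counts                          # fixed size regardless of span length

# Toy usage: a batch of 2 sequences, hidden size 8.
emb = torch.randn(2, 5, 8)
mask = torch.tensor([[0, 1, 1, 0, 0], [1, 1, 1, 1, 0]])
print(pooled_mention_representation(emb, mask).shape)  # torch.Size([2, 8])
```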
2. Detection of semantically similar code (cited by: 1)
Authors: Tiantian WANG, Kechao WANG, Xiaohong SU, Peijun MA. Frontiers of Computer Science (SCIE, EI, CSCD), 2014, No. 6, pp. 996-1011.
The traditional similar code detection approaches are limited in detecting semantically similar codes, impeding their applications in practice. In this paper, we have improved the traditional metrics-based approach as well as the graph-based approach and presented a metrics-based and graph-based combined approach. First, source codes are represented as augmented system dependence graphs. Then, metrics-based candidate similar code extraction is performed to filter out most of the dissimilar code pairs so as to lower the computational complexity. After that, code normalization is performed on the candidate similar codes to remove code variations so as to detect similar code at the semantic level. Finally, program matching is performed on the normalized control dependence trees to output semantically similar codes. Experiment results show that our approach can detect similar codes with code variations, and it can be applied to large software.
Keywords: similar code detection, system dependence graph, code normalization, semantically equivalent
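The metrics-based candidate extraction step above filters out clearly dissimilar code pairs cheaply before the expensive graph matching. A hedged sketch of that idea is given below, using a made-up metric vector (lines of code, cyclomatic complexity, fan-out) and a simple distance threshold; the specific metrics and threshold are illustrative, not those of the paper.

```python
from itertools import combinations
from math import dist

# Each code fragment is summarized by a small metric vector; here the three
# components are assumed to be (lines of code, cyclomatic complexity, fan-out).
fragments = {
    "f1": (40, 6, 3),
    "f2": (42, 6, 4),
    "f3": (200, 25, 12),
}

def candidate_pairs(metrics: dict, threshold: float = 10.0):
    """Keep only pairs whose metric vectors are close; only these survive to
    the costlier graph-based matching stage."""
    for a, b in combinations(metrics, 2):
        if dist(metrics[a], metrics[b]) <= threshold:
            yield a, b

print(list(candidate_pairs(fragments)))  # [('f1', 'f2')]
```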
3. Part-Whole Relational Few-Shot 3D Point Cloud Semantic Segmentation
Authors: Shoukun Xu, Lujun Zhang, Guangqi Jiang, Yining Hua, Yi Liu. Computers, Materials & Continua (SCIE, EI), 2024, No. 3, pp. 3021-3039.
This paper focuses on the task of few-shot 3D point cloud semantic segmentation. Despite some progress, this task still encounters many issues due to the insufficient samples given, e.g., incomplete object segmentation and inaccurate semantic discrimination. To tackle these issues, we first leverage part-whole relationships in the task of 3D point cloud semantic segmentation to capture semantic integrity, which is empowered by dynamic capsule routing with the module of 3D Capsule Networks (CapsNets) in the embedding network. Concretely, the dynamic routing amalgamates geometric information of the 3D point cloud data to construct higher-level feature representations, which capture the relationships between object parts and their wholes. Secondly, we design a multi-prototype enhancement module to enhance prototype discriminability. Specifically, the single-prototype enhancement mechanism is expanded to a multi-prototype enhancement version for capturing rich semantics. Besides, the shot-correlation within the category is calculated via the interaction of different samples to enhance the intra-category similarity. Ablation studies prove that the involved part-whole relations and the proposed multi-prototype enhancement module help to achieve complete object segmentation and improve semantic discrimination. Moreover, under the integration of these two modules, quantitative and qualitative experiments on two public benchmarks, including S3DIS and ScanNet, indicate the superior performance of the proposed framework on the task of 3D point cloud semantic segmentation, compared to some state-of-the-art methods.
Keywords: few-shot, point cloud semantic segmentation, CapsNets
4. CrossFormer Embedding DeepLabv3+ for Remote Sensing Images Semantic Segmentation
Authors: Qixiang Tong, Zhipeng Zhu, Min Zhang, Kerui Cao, Haihua Xing. Computers, Materials & Continua (SCIE, EI), 2024, No. 4, pp. 1353-1375.
High-resolution remote sensing image segmentation is a challenging task. In urban remote sensing, the presence of occlusions and shadows often results in blurred or invisible object boundaries, thereby increasing the difficulty of segmentation. In this paper, an improved network with a cross-region self-attention mechanism for multi-scale features based on DeepLabv3+ is designed to address the difficulties of small object segmentation and blurred target edge segmentation. First, we use CrossFormer as the backbone feature extraction network to achieve the interaction between large- and small-scale features, and establish self-attention associations between features at both large and small scales to capture global contextual feature information. Next, an improved atrous spatial pyramid pooling module is introduced to establish multi-scale feature maps with large- and small-scale feature associations, and attention vectors are added in the channel direction to enable adaptive adjustment of multi-scale channel features. The proposed network model is validated using the Potsdam and Vaihingen datasets. The experimental results show that, compared with existing techniques, the network model designed in this paper can extract and fuse multi-scale information, more clearly extract edge information and small-scale information, and segment boundaries more smoothly. Experimental results on public datasets demonstrate the superiority of our method compared with several state-of-the-art networks.
Keywords: semantic segmentation, remote sensing, multiscale, self-attention
5. Call for Evaluation Tasks | Jointly Organized by the China Conference on Knowledge Graph and Semantic Computing and the International Joint Conference on Knowledge Graph (CCKS-IJCKG 2024)
《中文信息学报》 (Journal of Chinese Information Processing; CSCD, PKU Core), 2024, No. 3, p. 162.
The joint CCKS-IJCKG 2024 event is co-organized by the 18th China Conference on Knowledge Graph and Semantic Computing 2024 (CCKS 2024) and the 13th International Joint Conference on Knowledge Graph 2024 (IJCKG 2024).
Keywords: semantic computing, knowledge graph, SEMANTIC, CCK, GRAPH
6. A Video Captioning Method by Semantic Topic-Guided Generation
Authors: Ou Ye, Xinli Wei, Zhenhua Yu, Yan Fu, Ying Yang. Computers, Materials & Continua (SCIE, EI), 2024, No. 1, pp. 1071-1093.
In video captioning methods based on an encoder-decoder, limited visual features are extracted by an encoder, and a natural sentence describing the video content is generated using a decoder. However, this kind of method depends on a single video input source and few visual labels, and there is a problem with semantic alignment between video contents and generated natural sentences, which is not suitable for accurately comprehending and describing video contents. To address this issue, this paper proposes a video captioning method by semantic topic-guided generation. First, a 3D convolutional neural network is utilized to extract the spatiotemporal features of videos during the encoding. Then, the semantic topics of video data are extracted using the visual labels retrieved from similar video data. In the decoding, a decoder is constructed by combining a novel Enhance-TopK sampling algorithm with a Generative Pre-trained Transformer-2 deep neural network, which decreases the influence of "deviation" in the semantic mapping process between videos and texts by jointly decoding a baseline and semantic topics of video contents. During this process, the designed Enhance-TopK sampling algorithm can alleviate the long-tail problem by dynamically adjusting the probability distribution of the predicted words. Finally, experiments are conducted on two publicly used datasets, Microsoft Research Video Description and Microsoft Research-Video to Text. The experimental results demonstrate that the proposed method outperforms several state-of-the-art approaches. Specifically, the performance indicators Bilingual Evaluation Understudy, Metric for Evaluation of Translation with Explicit Ordering, Recall-Oriented Understudy for Gisting Evaluation-longest common subsequence, and Consensus-based Image Description Evaluation of the proposed method are improved by 1.2%, 0.1%, 0.3%, and 2.4% on the Microsoft Research Video Description dataset, and 0.1%, 1.0%, 0.1%, and 2.8% on the Microsoft Research-Video to Text dataset, respectively, compared with existing video captioning methods. As a result, the proposed method can generate video captions that are more closely aligned with human natural language expression habits.
Keywords: video captioning, encoder-decoder, semantic topic, jointly decoding, Enhance-TopK sampling
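The Enhance-TopK step described in the abstract above reshapes the next-word distribution before sampling so that plausible tail words are not starved of probability mass. The sketch below shows plain top-k sampling plus one possible reweighting (a temperature flattening inside the top-k set); the flattening rule is an assumption for illustration, not the paper's exact algorithm.

```python
import torch

def enhance_topk_sample(logits: torch.Tensor, k: int = 10, temperature: float = 1.5) -> int:
    """Sample a token id from the top-k entries of `logits` (shape: vocab_size).

    A temperature > 1 flattens the distribution inside the top-k set, giving
    lower-ranked (long-tail) candidates a better chance of being chosen.
    """
    top_values, top_indices = torch.topk(logits, k)
    probs = torch.softmax(top_values / temperature, dim=-1)
    choice = torch.multinomial(probs, num_samples=1)
    return int(top_indices[choice])

# Toy usage with a fake vocabulary of 50 words.
logits = torch.randn(50)
print(enhance_topk_sample(logits))
```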
7. Nonlinear Registration of Brain Magnetic Resonance Images with Cross Constraints of Intensity and Structure
Authors: Han Zhou, Hongtao Xu, Xinyue Chang, Wei Zhang, Heng Dong. Computers, Materials & Continua (SCIE, EI), 2024, No. 5, pp. 2295-2313.
Many deep learning-based registration methods rely on a single-stream encoder-decoder network for computing deformation fields between 3D volumes. However, these methods often lack constraint information and overlook semantic consistency, limiting their performance. To address these issues, we present a novel approach for medical image registration called Dual-VoxelMorph, featuring a dual-channel cross-constraint network. This innovative network utilizes both intensity and segmentation images, which share identical semantic information and feature representations. Two encoder-decoder structures calculate deformation fields for intensity and segmentation images, as generated by the dual-channel cross-constraint network. This design facilitates bidirectional communication between grayscale and segmentation information, enabling the model to better learn the corresponding grayscale and segmentation details of the same anatomical structures. To ensure semantic and directional consistency, we introduce constraints and apply the cosine similarity function to enhance semantic consistency. Evaluation on four public datasets demonstrates superior performance compared to the baseline method, achieving Dice scores of 79.9%, 64.5%, 69.9%, and 63.5% for OASIS-1, OASIS-3, LPBA40, and ADNI, respectively.
Keywords: medical image registration, cross constraint, semantic consistency, directional consistency, dual-channel
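The semantic-consistency constraint mentioned above is enforced with a cosine similarity term between the two branches' outputs. A minimal sketch of such a loss, assuming the two deformation (or feature) fields are same-shaped tensors, is shown below; it illustrates the general idea rather than the exact Dual-VoxelMorph formulation.

```python
import torch
import torch.nn.functional as F

def cosine_consistency_loss(field_intensity: torch.Tensor,
                            field_segmentation: torch.Tensor) -> torch.Tensor:
    """Encourage the deformation fields predicted from the intensity and
    segmentation images to point in the same direction at every voxel.

    Both inputs: (batch, 3, D, H, W). Returns a scalar loss in [0, 2].
    """
    cos = F.cosine_similarity(field_intensity, field_segmentation, dim=1)  # per-voxel similarity
    return (1.0 - cos).mean()  # 0 when the two fields are perfectly aligned

# Toy usage on small random fields.
a = torch.randn(1, 3, 4, 4, 4)
print(float(cosine_consistency_loss(a, a)))                     # ~0.0
print(float(cosine_consistency_loss(a, torch.randn_like(a))))   # > 0
```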
8. Enhancing Deep Learning Semantics: The Diffusion Sampling and Label-Driven Co-Attention Approach
Authors: Chunhua Wang, Wenqian Shang, Tong Yi, Haibin Zhu. Computers, Materials & Continua (SCIE, EI), 2024, No. 5, pp. 1939-1956.
The advent of self-attention mechanisms within Transformer models has significantly propelled the advancement of deep learning algorithms, yielding outstanding achievements across diverse domains. Nonetheless, self-attention mechanisms falter when applied to datasets with intricate semantic content and extensive dependency structures. In response, this paper introduces a Diffusion Sampling and Label-Driven Co-attention Neural Network (DSLD), which adopts a diffusion sampling method to capture more comprehensive semantic information from the data. Additionally, the model leverages the joint correlation information of labels and data to introduce the computation of text representation, correcting semantic representation biases in the data and increasing the accuracy of semantic representation. Ultimately, the model computes the corresponding classification results by synthesizing these rich data semantic representations. Experiments on seven benchmark datasets show that our proposed model achieves competitive results compared to state-of-the-art methods.
Keywords: semantic representation, sampling attention, label-driven co-attention, attention mechanisms
9. Weakly Supervised Network with Scribble-Supervised and Edge-Mask for Road Extraction from High-Resolution Remote Sensing Images
Authors: Supeng Yu, Fen Huang, Chengcheng Fan. Computers, Materials & Continua (SCIE, EI), 2024, No. 4, pp. 549-562.
Significant advancements have been achieved in road surface extraction based on high-resolution remote sensing image processing. Most current methods rely on fully supervised learning, which necessitates enormous human effort to label the images. Within this field, other research endeavors utilize weakly supervised methods. These approaches aim to reduce the expenses associated with annotation by leveraging sparsely annotated data, such as scribbles. This paper presents a novel technique called a weakly supervised network using scribble-supervised and edge-mask (WSSE-net). This network is a three-branch network architecture, whereby each branch is equipped with a distinct decoder module dedicated to road extraction tasks. One of the branches is dedicated to generating edge masks using edge detection algorithms and optimizing road edge details. The other two branches supervise the model's training by employing scribble labels and spreading scribble information throughout the image. To address the historical flaw of pseudo-labels that are not updated as the network trains, we use mixup to blend prediction results dynamically and continually update new pseudo-labels to steer network training. Our solution demonstrates efficient operation by simultaneously considering both edge-mask aid and dynamic pseudo-label support. The studies are conducted on three separate road datasets, which consist primarily of high-resolution remote-sensing satellite photos and drone images. The experimental findings suggest that our methodology performs better than advanced scribble-supervised approaches and certain traditional fully supervised methods.
Keywords: semantic segmentation, road extraction, weakly supervised learning, scribble supervision, remote sensing image
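The dynamic pseudo-label update described above uses mixup to blend the network's current prediction with the previous pseudo-label instead of freezing pseudo-labels once created. A small sketch of that blending step is given below; the mixing-coefficient choice and variable names are illustrative assumptions rather than the paper's exact procedure.

```python
import numpy as np

def update_pseudo_label(old_pseudo: np.ndarray,
                        new_prediction: np.ndarray,
                        alpha: float = 0.4) -> np.ndarray:
    """Mixup-style blending of the existing pseudo-label map with the
    network's latest prediction, so pseudo-labels keep tracking training.

    Both arrays hold per-pixel road probabilities in [0, 1] with the same shape.
    """
    lam = np.random.beta(alpha, alpha)          # mixup draws the weight from Beta(alpha, alpha)
    return lam * old_pseudo + (1.0 - lam) * new_prediction

# Toy usage on a 4x4 probability map.
old = np.zeros((4, 4))
new = np.ones((4, 4))
blended = update_pseudo_label(old, new)
print(blended.min(), blended.max())  # both equal 1 - lam
```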
10. Call for Papers: The 18th China Conference on Knowledge Graph and Semantic Computing (CCKS 2024)
《中文信息学报》 (Journal of Chinese Information Processing; CSCD, PKU Core), 2024, No. 3, p. 55.
The China Conference on Knowledge Graph and Semantic Computing (CCKS) is organized by the Technical Committee on Language and Knowledge Computing of the Chinese Information Processing Society of China. The conference grew out of the Chinese Knowledge Graph Symposium (CKGS) and the Chinese Semantic Web and Web Science Conference (CSWS).
Keywords: semantic computing, Chinese information, semantic web, knowledge graph, World Wide Web, Web, SEMANTIC, Graph
11. Call for Papers: The 18th China Conference on Knowledge Graph and Semantic Computing (CCKS 2024)
《中文信息学报》 (Journal of Chinese Information Processing; CSCD, PKU Core), 2024, No. 2, p. F0003.
The China Conference on Knowledge Graph and Semantic Computing (CCKS) is organized by the Technical Committee on Language and Knowledge Computing of the Chinese Information Processing Society of China. The conference grew out of the Chinese Knowledge Graph Symposium (CKGS) and the Chinese Semantic Web and Web Science Conference (CSWS); the two events merged in 2016, and CCKS 2016 through 2023 were held in Beijing, Chengdu, Tianjin, Hangzhou, Nanchang, Guangzhou (online), Qinhuangdao, and Shenyang, respectively.
Keywords: semantic computing, Chinese information, knowledge graph, semantic web, World Wide Web, Web, Semantic, CCK
12. A Random Fusion of Mix 3D and Polar Mix to Improve Semantic Segmentation Performance in 3D Lidar Point Cloud
Authors: Bo Liu, Li Feng, Yufeng Chen. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 7, pp. 845-862.
This paper focuses on the effective utilization of data augmentation techniques for 3D lidar point clouds to enhance the performance of neural network models. These point clouds, which represent spatial information through a collection of 3D coordinates, have found wide-ranging applications. Data augmentation has emerged as a potent solution to the challenges posed by limited labeled data and the need to enhance model generalization capabilities. Much of the existing research is devoted to crafting novel data augmentation methods specifically for 3D lidar point clouds. However, there has been a lack of focus on making the most of the numerous existing augmentation techniques. Addressing this deficiency, this research investigates the possibility of combining two fundamental data augmentation strategies. The paper introduces PolarMix and Mix3D, two commonly employed augmentation techniques, and presents a new approach, named RandomFusion. Instead of using a fixed or predetermined combination of augmentation methods, RandomFusion randomly chooses one method from a pool of options for each instance or sample. This innovative data augmentation technique randomly augments each point in the point cloud with either PolarMix or Mix3D. The crux of this strategy is the random choice between PolarMix and Mix3D for the augmentation of each point within the point cloud data set. The results of the experiments conducted validate the efficacy of the RandomFusion strategy in enhancing the performance of neural network models for 3D lidar point cloud semantic segmentation tasks. This is achieved without compromising computational efficiency. By examining the potential of merging different augmentation techniques, the research contributes significantly to a more comprehensive understanding of how to utilize existing augmentation methods for 3D lidar point clouds. The RandomFusion data augmentation technique offers a simple yet effective method to leverage the diversity of augmentation techniques and boost the robustness of models. The insights gained from this research can pave the way for future work aimed at developing more advanced and efficient data augmentation strategies for 3D lidar point cloud analysis.
Keywords: 3D lidar point cloud, data augmentation, RandomFusion, semantic segmentation
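RandomFusion, as described above, simply makes a random choice between the two existing augmentations for every sample rather than fixing a pipeline. A minimal sketch of that dispatch logic is shown below; `polar_mix` and `mix3d` are stand-in placeholders for the real augmentation implementations, which the paper takes from prior work.

```python
import random
import numpy as np

def polar_mix(points: np.ndarray, mix_points: np.ndarray) -> np.ndarray:
    """Placeholder for the PolarMix augmentation (swapping azimuthal sectors)."""
    return points  # real implementation omitted in this sketch

def mix3d(points: np.ndarray, mix_points: np.ndarray) -> np.ndarray:
    """Placeholder for the Mix3D augmentation (concatenating two scenes)."""
    return np.concatenate([points, mix_points], axis=0)

def random_fusion(points: np.ndarray, mix_points: np.ndarray) -> np.ndarray:
    """For each training sample, randomly apply exactly one of the two
    augmentations - the core idea behind RandomFusion."""
    augment = random.choice([polar_mix, mix3d])
    return augment(points, mix_points)

# Toy usage with two fake scans of (x, y, z, label) rows.
scan_a = np.random.rand(100, 4)
scan_b = np.random.rand(120, 4)
print(random_fusion(scan_a, scan_b).shape)
```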
13. A Survey of Knowledge Graph Construction Using Machine Learning
Authors: Zhigang Zhao, Xiong Luo, Maojian Chen, Ling Ma. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 4, pp. 225-257.
A knowledge graph (KG) serves as a specialized semantic network that encapsulates intricate relationships among real-world entities within a structured framework. This framework facilitates a transformation in information retrieval, transitioning it from mere string matching to far more sophisticated entity matching. In this transformative process, the advancement of artificial intelligence and intelligent information services is invigorated. Meanwhile, machine learning methods play an important role in the construction of KGs, and these techniques have already achieved initial success. This article embarks on a comprehensive journey through the latest strides in the field of KG via machine learning. With a profound amalgamation of cutting-edge research in machine learning, this article undertakes a systematic exploration of KG construction methods in three distinct phases: entity learning, ontology learning, and knowledge reasoning. Especially, a meticulous dissection of machine learning-driven algorithms is conducted, spotlighting their contributions to critical facets such as entity extraction, relation extraction, entity linking, and link prediction. Moreover, this article also provides an analysis of the unresolved challenges and emerging trajectories that beckon within the expansive application of machine learning-fueled, large-scale KG construction.
Keywords: knowledge graph (KG), semantic network, relation extraction, entity linking, knowledge reasoning
14. Enhancing Relational Triple Extraction in Specific Domains: Semantic Enhancement and Synergy of Large Language Models and Small Pre-Trained Language Models
Authors: Jiakai Li, Jianpeng Hu, Geng Zhang. Computers, Materials & Continua (SCIE, EI), 2024, No. 5, pp. 2481-2503.
In the process of constructing domain-specific knowledge graphs, the task of relational triple extraction plays a critical role in transforming unstructured text into structured information. Existing relational triple extraction models face multiple challenges when processing domain-specific data, including insufficient utilization of semantic interaction information between entities and relations, difficulties in handling challenging samples, and the scarcity of domain-specific datasets. To address these issues, our study introduces three innovative components: relation semantic enhancement, data augmentation, and a voting strategy, all designed to significantly improve the model's performance in tackling domain-specific relational triple extraction tasks. We first propose an innovative attention interaction module. This method significantly enhances the semantic interaction capabilities between entities and relations by integrating semantic information from relation labels. Second, we propose a voting strategy that effectively combines the strengths of large language models (LLMs) and fine-tuned small pre-trained language models (SLMs) to reevaluate challenging samples, thereby improving the model's adaptability in specific domains. Additionally, we explore the use of LLMs for data augmentation, aiming to generate domain-specific datasets to alleviate the scarcity of domain data. Experiments conducted on three domain-specific datasets demonstrate that our model outperforms existing comparative models in several aspects, with F1 scores exceeding the state-of-the-art models by 2%, 1.6%, and 0.6%, respectively, validating the effectiveness and generalizability of our approach.
Keywords: relational triple extraction, semantic interaction, large language models, data augmentation, specific domains
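The voting strategy above routes only the hard cases to the large language model while the fine-tuned small model handles the rest. A hedged sketch of one way such a hand-off could be organized (a confidence threshold plus an agreement check) is shown below; the functions `slm_extract` and `llm_extract` and the threshold are assumptions for illustration, not the paper's actual interface.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

def slm_extract(sentence: str) -> Tuple[List[Triple], float]:
    """Placeholder: the fine-tuned small model returns triples and a confidence."""
    return [("aspirin", "treats", "headache")], 0.62

def llm_extract(sentence: str) -> List[Triple]:
    """Placeholder: the large language model re-evaluates a challenging sample."""
    return [("aspirin", "treats", "headache")]

def vote(sentence: str, confidence_threshold: float = 0.8) -> List[Triple]:
    """Easy samples are decided by the SLM alone; low-confidence (challenging)
    samples are re-checked by the LLM, and only triples both agree on are kept."""
    slm_triples, confidence = slm_extract(sentence)
    if confidence >= confidence_threshold:
        return slm_triples
    llm_triples = llm_extract(sentence)
    return [t for t in slm_triples if t in llm_triples]

print(vote("Aspirin is commonly used to treat headache."))
```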
15. A semantic segmentation-based underwater acoustic image transmission framework for cooperative SLAM
Authors: Jiaxu Li, Guangyao Han, Shuai Chang, Xiaomei Fu. Defence Technology (防务技术) (SCIE, EI, CAS, CSCD), 2024, No. 3, pp. 339-351.
With the development of underwater sonar detection technology, the simultaneous localization and mapping (SLAM) approach has attracted much attention in the underwater navigation field in recent years. But the weak detection ability of a single vehicle limits SLAM performance in wide areas. Thereby, cooperative SLAM using multiple vehicles has become an important research direction. The key factor of cooperative SLAM is timely and efficient sonar image transmission among underwater vehicles. However, the limited bandwidth of underwater acoustic channels conflicts with the large amount of sonar image data. It is essential to compress the images before transmission. Recently, deep neural networks have shown great value in image compression by virtue of their powerful learning ability, but existing sonar image compression methods based on neural networks usually focus on pixel-level information without semantic-level information. In this paper, we propose a novel underwater acoustic transmission scheme called UAT-SSIC that includes a semantic segmentation-based sonar image compression (SSIC) framework and a joint source-channel codec, to improve the accuracy of the semantic information of the reconstructed sonar image at the receiver. The SSIC framework consists of an Auto-Encoder structure-based sonar image compression network, which is measured by a semantic segmentation network's residual. Considering that sonar images have the characteristics of blurred target edges, the semantic segmentation network uses a special dilated convolution neural network (DiCNN) to enhance segmentation accuracy by expanding the range of receptive fields. A joint source-channel codec with unequal error protection is proposed that adjusts the power level of the transmitted data, to deal with sonar image transmission errors caused by the severe underwater acoustic channel. Experiment results demonstrate that our method preserves more semantic information, with advantages over existing methods at the same compression ratio. It also improves the error tolerance and packet loss resistance of transmission.
Keywords: semantic segmentation, sonar image transmission, learning-based compression
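The DiCNN backbone mentioned above relies on dilated convolutions to enlarge the receptive field without extra downsampling, which helps with the blurred edges of sonar targets. A generic dilated-convolution block in PyTorch is sketched below; the channel counts and dilation rates are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    """Two 3x3 convolutions whose dilation enlarges the receptive field:
    with dilation d, a 3x3 kernel covers a (2d+1) x (2d+1) neighborhood."""

    def __init__(self, in_channels: int = 1, out_channels: int = 32, dilation: int = 2):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=3,
                      padding=dilation, dilation=dilation),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=3,
                      padding=2 * dilation, dilation=2 * dilation),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x)  # padding keeps the spatial size unchanged

# Toy usage on a fake single-channel sonar image.
sonar = torch.randn(1, 1, 64, 64)
print(DilatedBlock()(sonar).shape)  # torch.Size([1, 32, 64, 64])
```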
16. A Joint Entity Relation Extraction Model Based on Relation Semantic Template Automatically Constructed
Authors: Wei Liu, Meijuan Yin, Jialong Zhang, Lunchong Cui. Computers, Materials & Continua (SCIE, EI), 2024, No. 1, pp. 975-997.
The joint entity relation extraction model that integrates the semantic information of relations is favored by relevant researchers because of its effectiveness in solving the overlapping of entities, and the method of manually defining relation semantic templates is particularly prominent in extraction effect because it can obtain deep semantic information about relations. However, this method has some problems, such as relying on expert experience and poor portability. Inspired by the rule-based entity relation extraction method, this paper proposes a joint entity relation extraction model based on automatically constructed relation semantic templates, abbreviated as RSTAC. This model refines the extraction rules of relation semantic templates from a relation corpus through dependency parsing and realizes the automatic construction of relation semantic templates. Based on the relation semantic template, the processes of relation classification and triple extraction are constrained, and finally, the entity relation triple is obtained. The experimental results on three major Chinese datasets, DuIE, SanWen, and FinRE, show that the RSTAC model successfully obtains rich deep semantics of relations and improves the extraction effect of entity relation triples, with F1 scores increased by an average of 0.96% compared with classical joint extraction models such as CasRel, TPLinker, and RFBFN.
Keywords: natural language processing, deep learning, information extraction, relation extraction, relation semantic template
17. Generative Multi-Modal Mutual Enhancement Video Semantic Communications
Authors: Yuanle Chen, Haobo Wang, Chunyu Liu, Linyi Wang, Jiaxin Liu, Wei Wu. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 6, pp. 2985-3009.
Recently, there have been significant advancements in the study of semantic communication in single-modal scenarios. However, the ability to process information in multi-modal environments remains limited. Inspired by the research and applications of natural language processing across different modalities, our goal is to accurately extract frame-level semantic information from videos and ultimately transmit high-quality videos. Specifically, we propose a deep learning-based Multi-Modal Mutual Enhancement Video Semantic Communication system, called M3E-VSC. Built upon a Vector-Quantized Generative Adversarial Network (VQGAN), our system aims to leverage mutual enhancement among different modalities by using text as the main carrier of transmission. With it, the semantic information can be extracted from key-frame images and audio of the video and differentially encoded to ensure that the extracted text conveys accurate semantic information with fewer bits, thus improving the capacity of the system. Furthermore, a multi-frame semantic detection module is designed to facilitate semantic transitions during video generation. Simulation results demonstrate that our proposed model maintains high robustness in complex noise environments, particularly in low signal-to-noise ratio conditions, significantly improving the accuracy and speed of semantic transmission in video communication by approximately 50 percent.
Keywords: generative adversarial networks, multi-modal mutual enhancement, video semantic transmission, deep learning
18. Audio-Text Multimodal Speech Recognition via Dual-Tower Architecture for Mandarin Air Traffic Control Communications
Authors: Shuting Ge, Jin Ren, Yihua Shi, Yujun Zhang, Shunzhi Yang, Jinfeng Yang. Computers, Materials & Continua (SCIE, EI), 2024, No. 3, pp. 3215-3245.
In air traffic control communications (ATCC), misunderstandings between pilots and controllers could result in fatal aviation accidents. Fortunately, advanced automatic speech recognition technology has emerged as a promising means of preventing miscommunications and enhancing aviation safety. However, most existing speech recognition methods merely incorporate external language models on the decoder side, leading to insufficient semantic alignment between speech and text modalities during the encoding phase. Furthermore, it is challenging to model acoustic context dependencies over long distances due to the longer speech sequences than text, especially for the extended ATCC data. To address these issues, we propose a speech-text multimodal dual-tower architecture for speech recognition. It employs cross-modal interactions to achieve close semantic alignment during the encoding stage and strengthen its capabilities in modeling auditory long-distance context dependencies. In addition, a two-stage training strategy is elaborately devised to derive semantics-aware acoustic representations effectively. The first stage focuses on pre-training the speech-text multimodal encoding module to enhance inter-modal semantic alignment and aural long-distance context dependencies. The second stage fine-tunes the entire network to bridge the input modality variation gap between the training and inference phases and boost generalization performance. Extensive experiments demonstrate the effectiveness of the proposed speech-text multimodal speech recognition method on the ATCC and AISHELL-1 datasets. It reduces the character error rate to 6.54% and 8.73%, respectively, and exhibits substantial performance gains of 28.76% and 23.82% compared with the best baseline model. The case studies indicate that the obtained semantics-aware acoustic representations aid in accurately recognizing terms with similar pronunciations but distinctive semantics. The research provides a novel modeling paradigm for semantics-aware speech recognition in air traffic control communications, which could contribute to the advancement of intelligent and efficient aviation safety management.
Keywords: speech-text multimodal, automatic speech recognition, semantic alignment, air traffic control communications, dual-tower architecture
19. FusionNN: A Semantic Feature Fusion Model Based on Multimodal for Web Anomaly Detection
Authors: Li Wang, Mingshan Xia, Hao Hu, Jianfang Li, Fengyao Hou, Gang Chen. Computers, Materials & Continua (SCIE, EI), 2024, No. 5, pp. 2991-3006.
With the rapid development of mobile communication and the Internet, previous web anomaly detection and identification models were built relying on security experts' empirical knowledge and attack features. Although this approach can achieve higher detection performance, it requires huge human labor and resources to maintain the feature library. In contrast, semantic feature engineering can dynamically discover new semantic features and optimize feature selection by automatically analyzing the semantic information contained in the data itself, thus reducing dependence on prior knowledge. However, current semantic features still have the problem of semantic expression singularity, as they are extracted from a single semantic mode such as word segmentation, character segmentation, or arbitrary semantic feature extraction. This paper extracts features of web requests at dual semantic granularity and proposes a semantic feature fusion method to solve the above problems. The method first preprocesses web requests, then extracts word-level and character-level semantic features of URLs via a convolutional neural network (CNN), respectively. Three loss functions are constructed to reduce the losses between features, labels, and categories. Experiments on the HTTP CSIC 2010, Malicious URLs, and HttpParams datasets verify the proposed method. Results show that, compared with machine learning, deep learning methods, and the BERT model, the proposed method has better detection performance, achieving the best detection rate of 99.16% on the HttpParams dataset.
Keywords: feature fusion, web anomaly detection, multimodal, convolutional neural network (CNN), semantic feature extraction
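The dual-granularity idea above runs a CNN over character-level and word-level views of the same URL and fuses the two feature vectors before classification. A compact sketch of that fusion, with made-up vocabulary sizes and embedding dimensions, is shown below; it illustrates the structure rather than reproducing FusionNN.

```python
import torch
import torch.nn as nn

class DualGranularityEncoder(nn.Module):
    """Extract char-level and word-level features of a URL with two small
    CNN branches and fuse them by concatenation."""

    def __init__(self, char_vocab: int = 128, word_vocab: int = 5000,
                 embed_dim: int = 32, channels: int = 64):
        super().__init__()
        self.char_embed = nn.Embedding(char_vocab, embed_dim)
        self.word_embed = nn.Embedding(word_vocab, embed_dim)
        self.char_conv = nn.Conv1d(embed_dim, channels, kernel_size=3, padding=1)
        self.word_conv = nn.Conv1d(embed_dim, channels, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveMaxPool1d(1)
        self.classifier = nn.Linear(2 * channels, 2)   # normal vs. anomalous request

    def _branch(self, ids, embed, conv):
        x = embed(ids).transpose(1, 2)      # (batch, embed_dim, seq_len) for Conv1d
        return self.pool(torch.relu(conv(x))).squeeze(-1)

    def forward(self, char_ids: torch.Tensor, word_ids: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self._branch(char_ids, self.char_embed, self.char_conv),
                           self._branch(word_ids, self.word_embed, self.word_conv)], dim=1)
        return self.classifier(fused)

# Toy usage: a batch of 2 requests, 40 characters and 8 words each.
model = DualGranularityEncoder()
logits = model(torch.randint(0, 128, (2, 40)), torch.randint(0, 5000, (2, 8)))
print(logits.shape)  # torch.Size([2, 2])
```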
20. Automatic Road Tunnel Crack Inspection Based on Crack Area Sensing and Multiscale Semantic Segmentation
Authors: Dingping Chen, Zhiheng Zhu, Jinyang Fu, Jilin He. Computers, Materials & Continua (SCIE, EI), 2024, No. 4, pp. 1679-1703.
The detection of crack defects on the walls of road tunnels is a crucial step in the process of ensuring travel safety and performing routine tunnel maintenance. The automatic and accurate detection of cracks on the surface of road tunnels is the key to improving the maintenance efficiency of road tunnels. Machine vision technology combined with a deep neural network model is an effective means to realize the localization and identification of crack defects on the surface of road tunnels. We propose a complete set of automatic inspection methods for identifying cracks on the walls of road tunnels as a solution to the problem of difficulty in identifying cracks during manual maintenance. First, a set of equipment applied to the real-time acquisition of high-definition images of walls in road tunnels is designed. Images of walls in road tunnels are acquired based on the designed equipment, where images containing crack defects are manually identified and selected. Subsequently, the training and validation sets used to construct the crack inspection model are obtained based on the acquired images, whereas the regions containing cracks and the pixels of the cracks are finely labeled. After that, a crack area sensing module is designed based on the proposed you-only-look-once version 7 model combined with a coordinate attention mechanism (CA-YOLO V7) network to locate the crack regions in road tunnel surface images. Only subimages containing cracks are acquired and sent to the multiscale semantic segmentation module for extraction of the pixels to which the cracks belong, based on the DeepLab V3+ network. The precision and recall of the crack region localization on the surface of a road tunnel based on our proposed method are 82.4% and 93.8%, respectively. Moreover, the mean intersection over union (MIoU) and pixel accuracy (PA) values for achieving pixel-level detection accuracy are 76.84% and 78.29%, respectively. The experimental results on the dataset show that our proposed two-stage detection method outperforms other state-of-the-art models in crack region localization and detection. Based on our proposed method, the images captured on the surface of a road tunnel can complete crack detection at a speed of ten frames per second, and the detection accuracy can reach 0.25 mm, which meets the requirements for maintenance of an actual project. The designed CA-YOLO V7 network enables precise localization of the area to which a crack belongs in images acquired under different environmental and lighting conditions in road tunnels. The improved DeepLab V3+ network based on lightweighting is able to extract crack morphology in a given region more quickly while maintaining segmentation accuracy. The established model combines defect localization and segmentation models for the first time, realizing pixel-level defect localization and extraction on the surface of road tunnels in complex environments, and is capable of determining the actual size of cracks based on the physical coordinate system after camera calibration. The trained model has high accuracy and can be extended and applied to embedded computing devices for the assessment and repair of damaged areas in different types of road tunnels.
Keywords: road tunnel crack inspection, crack area sensing, multiscale semantic segmentation, CA-YOLO V7, DeepLab V3+