In the context of internationalization, China-UK Joint Education Programs are receiving increasing attention from universities. Based on the difficulties faced by China-UK Joint Education Programs, this paper uses a questionnaire survey to study the learning effectiveness of students majoring in digital media technology in the China-UK Joint Education Program at Guangxi University of Finance and Economics, focusing on four aspects: learning materials, learning content, teacher conditions, and student learning outcomes. The analysis in this paper not only supports the construction of this China-UK Joint Education Program but also offers a reference for other China-UK Joint Education Programs.
Human Interaction Recognition (HIR) is one of the challenging issues in computer vision research because it involves multiple individuals and their mutual interactions within video frames generated from their movements. HIR requires more sophisticated analysis than Human Action Recognition (HAR), since HAR focuses solely on individual activities like walking or running, while HIR involves the interactions between people. This research aims to develop a robust system for recognizing five common human interaction classes (hugging, kicking, pushing, pointing, and no interaction) from video sequences using multiple cameras. In this study, a hybrid Deep Learning (DL) and Machine Learning (ML) model was employed to improve classification accuracy and generalizability. The dataset was collected in an indoor environment with four-channel cameras capturing the five types of interactions among 13 participants. The data was processed using a DL model with a fine-tuned ResNet (Residual Network) architecture based on 2D Convolutional Neural Network (CNN) layers for feature extraction. Subsequently, six commonly used ML algorithms (SVM, KNN, RF, DT, NB, and XGBoost) were trained and used for interaction classification. The results demonstrate a high accuracy of 95.45% in classifying human interactions. The hybrid approach enabled effective learning, resulting in highly accurate performance across different interaction types. Future work will apply this architecture to more complex scenarios involving multiple individuals.
Considerable progress has been made in human action recognition (HAR), significantly impacting computer vision applications in sports analytics. However, identifying dynamic and complex movements in sports like badminton remains challenging due to the need for precise recognition accuracy and better management of complex motion patterns. Deep learning techniques like convolutional neural networks (CNNs), long short-term memory (LSTM), and graph convolutional networks (GCNs) improve recognition in large datasets, while traditional machine learning methods like support vector machines (SVM), random forest (RF), and logistic regression (LR), combined with handcrafted features and ensemble approaches, perform well but struggle with the complexity of fast-paced sports like badminton. We propose an ensemble learning model combining SVM, LR, RF, and adaptive boosting (AdaBoost) for badminton action recognition. The data in this study consist of video recordings of badminton stroke techniques, which have been extracted into spatiotemporal data. The three-dimensional distance between each skeleton point and the right hip represents the spatial features. The temporal features are the results of Fast Dynamic Time Warping (FDTW) calculations applied to 15 frames of each video sequence. The weighted ensemble model employs soft voting classifiers from SVM, LR, RF, and AdaBoost to enhance the accuracy of badminton action recognition. The E2 ensemble model, which combines SVM, LR, and AdaBoost, achieves the highest accuracy of 95.38%.
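The weighted soft voting described in this abstract can be illustrated with a minimal sketch. It assumes each base classifier outputs a class-probability vector; the stroke labels, probabilities, and weights below are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def soft_vote(probas: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Return the weighted average of per-classifier class-probability vectors."""
    if not probas or len(probas) != len(weights):
        raise ValueError("need one weight per classifier")
    total = sum(weights)
    n_classes = len(probas[0])
    return [
        sum(w * p[c] for p, w in zip(probas, weights)) / total
        for c in range(n_classes)
    ]


def predict(probas: Sequence[Sequence[float]], weights: Sequence[float],
            labels: Sequence[str]) -> str:
    """Pick the label with the highest weighted-average probability."""
    avg = soft_vote(probas, weights)
    best = max(range(len(avg)), key=avg.__getitem__)
    return labels[best]


# Hypothetical stroke classes and per-classifier probabilities (illustration only).
LABELS = ["smash", "clear", "drop"]
SVM_P = [0.6, 0.3, 0.1]
LR_P = [0.2, 0.5, 0.3]
ADA_P = [0.3, 0.4, 0.3]
```

With these made-up numbers, weighting SVM twice as heavily as the other two classifiers changes the predicted class relative to equal weights, which is the effect a weighted ensemble relies on.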
In geotechnical and tunneling engineering, accurately determining the mechanical properties of jointed rock is of great significance for project safety assessments. Peak shear strength (PSS), the paramount mechanical property of joints, has been a focal point in the research field. Current PSS prediction models for jointed rock have three limitations: (i) the models do not comprehensively consider the various influencing factors, and no PSS prediction model has been established that covers all seven factors, namely the sampling interval of the joints, the surface roughness of the joints, the normal stress, the basic friction angle, the uniaxial tensile strength, the uniaxial compressive strength, and the joint size for coupled joints; (ii) the datasets used to train the models are relatively limited; and (iii) there is controversy over whether compressive or tensile strength should be used as the strength term among the influencing factors. To overcome these limitations, we developed four machine learning models covering these seven influencing factors, three relying on Support Vector Regression (SVR) with different kernel functions (linear, polynomial, and Radial Basis Function (RBF)) and one using deep learning (DL). Based on these seven influencing factors, we compiled a dataset comprising the outcomes of 493 published direct shear tests for the training and validation of these four models. We compared the prediction performance of these four machine learning models with Tang's and Tatone's models. The prediction errors of Tang's and Tatone's models are 21.8% and 17.7%, respectively, while SVR_linear is at 16.6%, SVR_poly is at 14.0%, and SVR_RBF is at 12.1%. DL outperforms the two existing models with an error of only 8.5%. Additionally, we performed shear tests on granite joints to validate the predictive capability of the DL-based model. With the DL approach, the results suggest that uniaxial tensile strength is recommended as the material strength term in the PSS model for more reliable outcomes.
The emotion cause extraction (ECE) task, which aims to extract potential trigger events of certain emotions, has attracted extensive attention recently. However, current work neglects implicit emotion expressed without any explicit emotional keywords, which appears more frequently in application scenarios. The lack of explicit emotion information makes it extremely hard to extract emotion causes using only the local context. Moreover, an entire event usually spans multiple clauses, while existing work merely extracts cause events at the clause level and cannot effectively capture complete cause event information. To address these issues, events are first redefined at the tuple level, and a span-based tuple-level algorithm is proposed to extract events from different clauses. Based on this algorithm, a corpus for implicit emotion cause extraction, which targets the causes of implicit emotions, is constructed. The authors propose a knowledge-enriched joint-learning model of implicit emotion recognition and implicit emotion cause extraction tasks (KJ-IECE), which leverages commonsense knowledge from ConceptNet and NRC_VAD to better capture connections between emotions and their corresponding cause events. Experiments on both implicit and explicit emotion cause extraction datasets demonstrate the effectiveness of the proposed model.
Vehicle re-identification (ReID) aims to retrieve the target vehicle from an extensive image gallery through its appearance from various views in a cross-camera scenario. It has gradually become a core technology of intelligent transportation systems. Most existing vehicle re-identification models adopt joint learning of global and local features. However, they directly use the extracted global features, resulting in insufficient feature expression. Moreover, local features are primarily obtained through advanced annotation and complex attention mechanisms, which incur additional costs. To solve these issues, a multi-feature learning model with enhanced local attention for vehicle re-identification (MFELA) is proposed in this paper. The model consists of global and local branches. The global branch utilizes both middle-level and high-level semantic features of ResNet50 to enhance the global representation capability. In addition, multi-scale pooling operations are used to obtain multi-scale information. The local branch utilizes the proposed Region Batch Dropblock (RBD), which encourages the model to learn discriminative features for different local regions and simultaneously drops the same corresponding areas at random across a batch during training to enhance attention to local regions. Features from both branches are then combined to provide a more comprehensive and distinctive feature representation. Extensive experiments on the VeRi-776 and VehicleID datasets show that the method performs well.
Multimodal Sentiment Analysis (SA) is gaining popularity due to its broad application potential. Existing studies have focused on the SA of single modalities, such as texts or photos, which makes it difficult to handle social media data with multiple modalities effectively. Moreover, most multimodal research has concentrated on merely combining the two modalities rather than exploring their complex correlations, leading to unsatisfactory sentiment classification results. Motivated by this, we propose a new visual-textual sentiment classification model named Multi-Model Fusion (MMF), which uses a mixed fusion framework for SA to effectively capture the essential information and the intrinsic relationship between the visual and textual content. The proposed model comprises three deep neural networks. Two different neural networks are proposed to extract the most emotionally relevant aspects of image and text data, so that more discriminative features are gathered for accurate sentiment classification. Then, a multichannel joint fusion model with a self-attention technique is proposed to exploit the intrinsic correlation between visual and textual characteristics and obtain emotionally rich information for joint sentiment classification. Finally, the results of the three classifiers are integrated using a decision fusion scheme to improve the robustness and generalizability of the proposed model. An interpretable visual-textual sentiment classification model is further developed using the Local Interpretable Model-agnostic Explanation model (LIME) to ensure the model's explainability and resilience. The proposed MMF model has been tested on four real-world sentiment datasets, achieving 99.78% accuracy on Binary_Getty (BG), 99.12% on Binary_iStock (BIS), 95.70% on Twitter, and 79.06% on the Multi-View Sentiment Analysis (MVSA) dataset. These results demonstrate the superior performance of our MMF model compared to single-model approaches and current state-of-the-art techniques based on model evaluation criteria.
A distributed reinforcement learning (RL) based resource management framework is proposed for a mobile edge computing (MEC) system with both latency-sensitive and latency-insensitive services. We investigate joint optimization of both computing and radio resources to achieve efficient on-demand matching of multi-dimensional resources to the diverse requirements of users. A multi-objective integer programming problem is formulated as two subproblems, i.e., access point (AP) selection and subcarrier allocation, which can be solved jointly by our proposed distributed RL-based approach with a heuristic iteration algorithm. The proposed algorithm reduces complexity, since each user needs to consider only its own AP selection without knowing full global information. Simulation results show that our algorithm can achieve near-optimal performance while reducing computational complexity significantly. Compared with other algorithms that optimize only one of the two subproblems, the proposed algorithm can serve more users with much less power consumption and content delivery latency.
To address the problem that many valid triples are missing from knowledge graphs (KGs), a novel model based on a convolutional neural network (CNN), called ConvKG, is proposed, which employs a joint learning strategy for knowledge graph completion (KGC). Related research has shown the superiority of CNNs in extracting semantic features of triple embeddings. However, these studies use only a single filter shape and fail to extract semantic features of different granularity. To solve this problem, ConvKG exploits multi-shaped filters that co-convolve on the triple embeddings, jointly learning semantic features of different granularity. Differently shaped filters cover different sizes on the triple embeddings and capture pairwise interactions of different granularity among triple elements. Experimental results confirm the strength of joint learning: compared with state-of-the-art CNN-based KGC models, ConvKG achieves better mean rank (MR) and Hits@10 on the WN18RR dataset, and better MR on the FB15k-237 dataset.
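The two link-prediction metrics named in this abstract, mean rank (MR) and Hits@10, can be sketched as follows. This is a generic illustration rather than ConvKG's evaluation code; it assumes that a higher score means a more plausible triple and ignores the filtered-ranking variant, and the function names are hypothetical.

```python
from typing import Sequence


def rank_of_true(scores: Sequence[float], true_index: int) -> int:
    """1-based rank of the true candidate (assumes higher score = more plausible)."""
    true_score = scores[true_index]
    return 1 + sum(1 for s in scores if s > true_score)


def mean_rank(ranks: Sequence[int]) -> float:
    """Average rank of the true entity over all test triples (lower is better)."""
    return sum(ranks) / len(ranks)


def hits_at_k(ranks: Sequence[int], k: int = 10) -> float:
    """Fraction of test triples whose true entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)
```

For example, ranks of 1, 3, 12, 7, and 50 over five test triples give an MR of 14.6 and a Hits@10 of 0.6.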
Extracting valuable information from biomedical texts is a current research hotspot of concern to a wide range of scholars. The biomedical corpus contains numerous complex long sentences and overlapping relational triples, which makes most general-domain joint modeling methods difficult to apply effectively in this field. To handle the complex semantic environment of biomedical texts, this paper proposes a novel perspective on joint entity and relation extraction. Existing studies divide the extraction of relation triples into several steps or modules. However, the three elements in a relation triple are interdependent and inseparable, so we regard joint extraction as a tripartite classification problem. From the perspective of triple classification, we design a multi-granularity 2D convolution to refine the word pair table and better utilize the dependencies between biomedical word pairs. Finally, we use a biaffine predictor to assist in predicting the labels of word pairs for relation extraction. Our model, Multi-granularity Convolutional Tokens Pairs of Labeling (MCTPL), better utilizes the elements of triples and improves the ability to extract overlapping triples compared to previous approaches. We evaluated our model on two publicly accessible datasets. The experimental results show that, on the CPI dataset, our model improves the F1 score for relation triple extraction by 2.34% compared to the current best model. On the DDI dataset, the F1 score improves by 1.68% compared to the current best model. Our model achieved state-of-the-art performance compared to other baseline models in biomedical text entity relation extraction.
With the advances in machine vision research and its expanding applications, scholars have increasingly focused on applying vision methods in industrial settings. Detecting vehicle floor welding points poses particular challenges, including high operational costs and limited portability in practical settings. To address these challenges, this paper integrates template matching and the Faster RCNN algorithm, presenting an industrial fusion cascaded solder joint detection algorithm that combines template matching with deep learning techniques. The algorithm weights and fuses the optimized features of both methods, enhancing the overall detection capability. Furthermore, it introduces an optimized multi-scale and multi-template matching approach, which uses a diverse set of templates and image pyramid algorithms to improve the accuracy and resilience of object detection. By integrating deep learning algorithms with this multi-scale and multi-template matching strategy, the cascaded target matching algorithm accurately identifies solder joint types and positions. To validate the algorithm's performance, a comprehensive welding point dataset, labeled by experts specifically for vehicle detection, was constructed from images of authentic industrial environments. Experiments demonstrate the algorithm's performance in industrial scenarios: it outperforms the single-template matching algorithm by 21.3%, the multi-scale and multi-template matching algorithm by 3.4%, the Faster RCNN algorithm by 19.7%, and the YOLOv9 algorithm by 17.3% in terms of solder joint detection accuracy. The optimized algorithm is robust and portable, and is suited for detecting solder joints across diverse vehicle workpieces. The dataset and feature fusion approach of this study can also be a resource for other algorithms seeking to enhance their solder joint detection capabilities. This work thus not only presents a novel and effective solution for industrial solder joint detection but also lays the groundwork for future advances in this area.
This paper analyzes the progress of handwritten Chinese character recognition technology from two perspectives: traditional recognition methods and deep learning-based recognition methods. First, the complexity of Chinese character recognition is pointed out, including the large number of categories, complex structure, and the problem of similar characters, and especially the variability of handwritten Chinese characters. Subsequently, recognition methods based on feature optimization, model optimization, and fusion techniques are highlighted. Studies that fuse feature optimization with model improvement are then explored; these studies further enhance recognition performance through complementary advantages. Finally, the article summarizes the current challenges of Chinese character recognition technology, including accuracy improvement, model complexity, and real-time performance, and looks ahead to future research directions.
Visual localization and object detection both play important roles in various tasks. In many indoor application scenarios where some detected objects have fixed positions, the two techniques work closely together. However, few researchers consider these two tasks simultaneously, because of a lack of datasets and the little attention paid to such environments. In this paper, we explore multi-task network design and joint refinement of detection and localization. To address the dataset problem, we construct a medium indoor scene of an aviation exhibition hall through a semi-automatic process. The dataset provides localization and detection information, and is publicly available at https://drive.google.com/drive/folders/1U28zk0N4_I0db zkqyIAK1A15k9oUKOjI?usp=sharing for benchmarking localization and object detection tasks. Targeting this dataset, we have designed a multi-task network, JLDNet, based on YOLO v3, that outputs a target point cloud and object bounding boxes. For dynamic environments, the detection branch also promotes the perception of dynamics. JLDNet includes image feature learning, point feature learning, feature fusion, detection construction, and point cloud regression. Moreover, object-level bundle adjustment is used to further improve localization and detection accuracy. To test JLDNet and compare it to other methods, we have conducted experiments on 7 static scenes, our constructed dataset, and the dynamic TUM RGB-D and Bonn datasets. Our results show state-of-the-art accuracy for both tasks, and the benefit of jointly working on both tasks is demonstrated.
In the era of big data, learning discriminant feature representations from network traffic has been identified as an essential task for improving the detection ability of an intrusion detection system (IDS). Owing to the lack of accurately labeled network traffic data, many unsupervised feature representation learning models have been proposed with state-of-the-art performance. Yet these models fail to consider the classification error while learning the feature representation. Intuitively, the learnt feature representation may degrade the performance of the classification task. For the first time in the field of intrusion detection, this paper proposes an unsupervised IDS model that leverages a deep autoencoder (DAE) for learning a robust feature representation and a one-class support vector machine (OCSVM) for finding a more compact decision hyperplane for intrusion detection. Specifically, the proposed model defines a new unified objective function to minimize the reconstruction and classification errors simultaneously. This contribution not only enables the model to support joint learning of feature representation and classifier training but also guides the model to learn a robust feature representation that can improve the discrimination ability of the classifier for intrusion detection. Three sets of evaluation experiments are conducted to demonstrate the potential of the proposed model. First, an ablation evaluation on the benchmark dataset NSL-KDD validates the design decisions of the proposed model. Next, a performance evaluation on the recent intrusion dataset UNSW-NB15 indicates the stable performance of the proposed model. Finally, a comparative evaluation verifies the efficacy of the proposed model against recently published state-of-the-art methods.
Artificial intelligence-based dialog systems are getting attention from both business and academic communities. The key parts of such intelligent chatbot systems are domain classification, intent detection, and named entity recognition. Various supervised, unsupervised, and hybrid approaches are used to detect each field. Such intelligent systems, also called natural language understanding systems, analyze user requests in sequential order: domain classification, then intent and entity recognition based on the semantic rules of the classified domain. This sequential approach propagates errors downstream; i.e., if the domain classification model fails to classify the domain, intent and entity recognition fail. Furthermore, training such an intelligent system requires a large number of user-annotated datasets for each domain. To address these issues, this study proposes a single joint predictive deep neural network framework based on long short-term memory that uses only a small user-annotated dataset. It investigates the value added by incorporating unlabeled data from user chat logs into multi-domain spoken language understanding systems. Systematic experimental analysis of the proposed joint frameworks, along with the semi-supervised multi-domain model, using open-source annotated and unannotated utterances shows robust improvement in the predictive performance of the proposed multi-domain intelligent chatbot over a base joint model and a joint model based on adversarial learning.
Traditional recommendation algorithms, represented by collaborative filtering, are the most classical and most widely used algorithms in industry. Most book recommendation systems also use collaborative filtering. However, these traditional algorithms cannot deal well with data sparsity. Collaborative filtering uses only shallow features of the interaction between readers and books, so it fails to achieve high-level abstract learning of the relevant attribute features of readers and books, leading to a decline in recommendation performance. Given these problems, this study uses deep learning to model readers' book borrowing probability. It builds a recommendation system model with a multi-layer neural network, inputs the features extracted from readers and books into the network, and then deeply integrates the features of readers and books through the multi-layer neural network. The hidden deep interaction between readers and books is explored accordingly, and the quality of book recommendations is thereby significantly improved. In the experiment, the HR@10, MRR, and NDCG evaluation indexes of the deep neural network recommendation model constructed in this paper are higher than those of the traditional recommendation algorithm, which verifies the effectiveness of the model for book recommendation.
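The three ranking metrics named in this abstract (HR@10, MRR, and NDCG) can be sketched under a common simplifying assumption that the abstract does not confirm: a leave-one-out protocol in which each reader has exactly one held-out book, whose 1-based position in the recommendation list is its rank. The function names are hypothetical.

```python
import math
from typing import Sequence


def hit_ratio_at_k(ranks: Sequence[int], k: int = 10) -> float:
    """Fraction of readers whose held-out book appears in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over readers (untruncated variant)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def ndcg_at_k(ranks: Sequence[int], k: int = 10) -> float:
    """NDCG@k with one relevant item per reader, so the ideal DCG is 1."""
    return sum(1.0 / math.log2(r + 1) for r in ranks if r <= k) / len(ranks)
```

With a single relevant item per reader, NDCG@k reduces to 1/log2(rank + 1) for ranks inside the top k and 0 otherwise, which is why no separate normalization step appears.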
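Entity-relation extraction from Chinese electronic medical records is an important foundation for building medical knowledge graphs and serving downstream subtasks. At present, entity-relation extraction from Chinese electronic medical records still suffers from inaccurate recognition of medical terms, caused by the complex relations and high entity density of medical text. To address this problem, AMFRel (adversarial learning and multi-feature fusion for relation triple extraction), a joint entity-relation extraction model for Chinese electronic medical records based on adversarial learning and multi-feature fusion, is proposed. The model extracts text and part-of-speech features from the medical records to obtain encoding vectors that fuse part-of-speech information; combines the encoding vectors with perturbations produced by adversarial training to generate adversarial samples and extract the subject of each sentence; and uses an information fusion module to enrich textual structural features and, based on specific relation information, extract the corresponding objects, yielding the triples of the medical text. Experiments were conducted on the CHIP2020 relation extraction dataset and a diabetes dataset. On the CHIP2020 relation extraction dataset, AMFRel achieves a Precision of 63.922%, a Recall of 57.279%, and an F1 of 60.418%; on the diabetes dataset, Precision, Recall, and F1 are 83.914%, 67.021%, and 74.522%, respectively, demonstrating that the model's triple extraction performance is better than that of the other baseline models.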
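As a quick consistency check on the figures reported in this abstract, the F1 values can be recomputed from the stated Precision and Recall using the standard harmonic-mean definition. The helper name below is hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and Recall as reported in the abstract, in percent.
chip2020_f1 = f1_score(63.922, 57.279)
diabetes_f1 = f1_score(83.914, 67.021)
```

Both recomputed values agree with the reported 60.418% and 74.522% to three decimal places.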
Funding (China-UK Joint Education Program study): Guangxi Key Laboratory of Financial Big Data Fund Project (Guikejizi [2021] No. 5); Research on the Innovation of Teaching Models for Foreign Professional Courses in China-UK Joint Education Under the Background of Internationalization: Taking Guangxi University of Finance and Economics as an Example (2023XJJG26); Exploration and Practice of Digital Media Technology Talent Training Models in the Context of New Productive Forces (XGK202423).
Funding (Human Interaction Recognition study): Supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. RS-2023-00218176) and by the Soonchunhyang University Research Fund.
Funding (badminton action recognition study): Supported by the Center for Higher Education Funding (BPPT) and the Indonesia Endowment Fund for Education (LPDP), as acknowledged in decree number 02092/J5.2.3/BPI.06/9/2022.
Funding: Supported by the National Key Research and Development Program of China (2022YFC3080100); the National Natural Science Foundation of China (Nos. 52104090, 52208328 and 12272353); the Open Fund of Anhui Province Key Laboratory of Building Structure and Underground Engineering, Anhui Jianzhu University (No. KLBSUE-2022-06); and the Open Research Fund of Key Laboratory of Construction and Safety of Water Engineering of the Ministry of Water Resources, China Institute of Water Resources and Hydropower Research (Grant No. IWHR-ENGI-202302).
Abstract: In geotechnical and tunneling engineering, accurately determining the mechanical properties of jointed rock is of great significance for project safety assessments. Peak shear strength (PSS), the paramount mechanical property of joints, has been a focal point of research. Current PSS prediction models for jointed rock have three limitations: (i) the models do not comprehensively consider the influencing factors, and no PSS prediction model has been established that covers all seven factors, namely the sampling interval of the joints, the surface roughness of the joints, the normal stress, the basic friction angle, the uniaxial tensile strength, the uniaxial compressive strength, and the joint size for coupled joints; (ii) the datasets used to train the models are relatively limited; and (iii) it is disputed whether compressive or tensile strength should be used as the strength term among the influencing factors. To overcome these limitations, we developed four machine learning models covering these seven influencing factors: three relying on Support Vector Regression (SVR) with different kernel functions (linear, polynomial, and Radial Basis Function (RBF)) and one using deep learning (DL). Based on these seven influencing factors, we compiled a dataset comprising the outcomes of 493 published direct shear tests for the training and validation of these four models. We compared the prediction performance of these four machine learning models with Tang’s and Tatone’s models. The prediction errors of Tang’s and Tatone’s models are 21.8% and 17.7%, respectively, while SVR_linear is at 16.6%, SVR_poly is at 14.0%, and SVR_RBF is at 12.1%. DL outperforms the two existing models with an error of only 8.5%. Additionally, we performed shear tests on granite joints to validate the predictive capability of the DL-based model. Results from the DL approach suggest that uniaxial tensile strength should be used as the material strength term in the PSS model for more reliable outcomes.
Funding: National Natural Science Foundation of China, Grant/Award Numbers: 61671064, 61732005; National Key Research & Development Program, Grant/Award Number: 2018YFC0831700.
Abstract: The emotion cause extraction (ECE) task, which aims at extracting potential trigger events of certain emotions, has attracted extensive attention recently. However, current work neglects implicit emotion, which is expressed without any explicit emotional keywords and appears more frequently in application scenarios. The lack of explicit emotion information makes it extremely hard to extract emotion causes from the local context alone. Moreover, an entire event usually spans multiple clauses, while existing work merely extracts cause events at the clause level and cannot effectively capture complete cause event information. To address these issues, events are first redefined at the tuple level, and a span-based tuple-level algorithm is proposed to extract events from different clauses. Based on it, a corpus for implicit emotion cause extraction, which targets the causes of implicit emotions, is constructed. The authors propose a knowledge-enriched joint-learning model of implicit emotion recognition and implicit emotion cause extraction tasks (KJ-IECE), which leverages commonsense knowledge from ConceptNet and NRC_VAD to better capture connections between emotions and their corresponding cause events. Experiments on both implicit and explicit emotion cause extraction datasets demonstrate the effectiveness of the proposed model.
Funding: This work was supported in part by the National Natural Science Foundation of China under Grant Numbers 61502240, 61502096, 61304205, 61773219; in part by the Natural Science Foundation of Jiangsu Province under Grant Numbers BK20201136, BK20191401; in part by the Postgraduate Research & Practice Innovation Program of Jiangsu Province under Grant Number SJCX21_0363; and in part by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund.
Abstract: Vehicle re-identification (ReID) aims to retrieve a target vehicle from an extensive image gallery through its appearance from various views in a cross-camera scenario. It has gradually become a core technology of intelligent transportation systems. Most existing vehicle re-identification models adopt joint learning of global and local features. However, they use the extracted global features directly, resulting in insufficient feature expression. Moreover, local features are primarily obtained through advance annotation and complex attention mechanisms, which incur additional costs. To solve these issues, a multi-feature learning model with enhanced local attention for vehicle re-identification (MFELA) is proposed in this paper. The model consists of global and local branches. The global branch utilizes both middle-level and high-level semantic features of ResNet50 to enhance the global representation capability. In addition, multi-scale pooling operations are used to obtain multi-scale information. The local branch utilizes the proposed Region Batch Dropblock (RBD), which encourages the model to learn discriminative features for different local regions and randomly drops the same corresponding areas across a batch during training to enhance attention to local regions. Features from both branches are then combined to provide a more comprehensive and distinctive feature representation. Extensive experiments on the VeRi-776 and VehicleID datasets show that the method performs well.
Abstract: Multimodal Sentiment Analysis (SA) is gaining popularity due to its broad application potential. Existing studies have focused on the SA of single modalities, such as texts or photos, which makes it difficult to handle social media data with multiple modalities effectively. Moreover, most multimodal research has concentrated on merely combining the two modalities rather than exploring their complex correlations, leading to unsatisfactory sentiment classification results. Motivated by this, we propose a new visual-textual sentiment classification model named Multi-Model Fusion (MMF), which uses a mixed fusion framework for SA to effectively capture the essential information and the intrinsic relationship between the visual and textual content. The proposed model comprises three deep neural networks. Two different neural networks are proposed to extract the most emotionally relevant aspects of image and text data, so that more discriminative features are gathered for accurate sentiment classification. Then, a multichannel joint fusion model with a self-attention technique is proposed to exploit the intrinsic correlation between visual and textual characteristics and obtain emotionally rich information for joint sentiment classification. Finally, the results of the three classifiers are integrated using a decision fusion scheme to improve the robustness and generalizability of the proposed model. An interpretable visual-textual sentiment classification model is further developed using the Local Interpretable Model-agnostic Explanation model (LIME) to ensure the model’s explainability and resilience. The proposed MMF model has been tested on four real-world sentiment datasets, achieving 99.78% accuracy on Binary_Getty (BG), 99.12% on Binary_iStock (BIS), 95.70% on Twitter, and 79.06% on the Multi-View Sentiment Analysis (MVSA) dataset. These results demonstrate the superior performance of our MMF model compared to single-model approaches and current state-of-the-art techniques based on model evaluation criteria.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 61671074 and in part by Project No. A01B02C01202015D0.
Abstract: A distributed reinforcement learning (RL) based resource management framework is proposed for a mobile edge computing (MEC) system with both latency-sensitive and latency-insensitive services. We investigate joint optimization of both computing and radio resources to achieve efficient on-demand matching between multi-dimensional resources and the diverse requirements of users. A multi-objective integer programming problem is formulated as two subproblems, i.e., access point (AP) selection and subcarrier allocation, which can be solved jointly by our proposed distributed RL-based approach with a heuristic iteration algorithm. The proposed algorithm reduces complexity, since each user needs to consider only its own selection of AP without knowing full global information. Simulation results show that our algorithm can achieve near-optimal performance while reducing computational complexity significantly. Compared with other algorithms that optimize only one of the two subproblems, the proposed algorithm can serve more users with much less power consumption and content delivery latency.
Funding: Supported by the National Natural Science Foundation of China (No. 61876144).
Abstract: To address the problem that knowledge graphs (KGs) are missing many valid triples, a novel model based on a convolutional neural network (CNN), called ConvKG, is proposed, which employs a joint learning strategy for knowledge graph completion (KGC). Related research has shown the superiority of CNNs in extracting semantic features of triple embeddings. However, these studies use only a single filter shape and fail to extract semantic features of different granularity. To solve this problem, ConvKG applies multi-shaped filters that co-convolve over the triple embeddings, jointly learning semantic features of different granularity. Filters of different shapes cover regions of different sizes on the triple embeddings and capture pairwise interactions of different granularity among triple elements. Experimental results confirm the strength of joint learning: compared with state-of-the-art CNN-based KGC models, ConvKG achieves better mean rank (MR) and Hits@10 on the WN18RR dataset, and better MR on the FB15k-237 dataset.
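The two evaluation metrics named in the abstract, mean rank (MR) and Hits@10, can be computed from the rank each test triple's correct entity receives. The sketch below uses the standard definitions; the rank values are invented for illustration and are not results from the paper.

```python
from typing import Sequence


def mean_rank(ranks: Sequence[int]) -> float:
    """Average 1-based rank of the correct entity; lower is better."""
    return sum(ranks) / len(ranks)


def hits_at_k(ranks: Sequence[int], k: int = 10) -> float:
    """Fraction of test triples whose correct entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks of the correct entity for five test triples.
ranks = [1, 3, 12, 7, 250]
mr = mean_rank(ranks)          # (1 + 3 + 12 + 7 + 250) / 5
hits10 = hits_at_k(ranks, 10)  # three of the five ranks are <= 10
```

The single outlier rank of 250 dominates MR but barely affects Hits@10, which is one reason the two metrics are usually reported together.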
Funding: Supported by the National Natural Science Foundation of China (Nos. 62002206 and 62202373) and the open topic of the Green Development Big Data Decision-Making Key Laboratory (DM202003).
Abstract: Extracting valuable information from biomedical texts is a current research hotspot of concern to a wide range of scholars. The biomedical corpus contains numerous complex long sentences and overlapping relational triples, making most general-domain joint modeling methods difficult to apply effectively in this field. To handle the complex semantic environment of biomedical texts, this paper proposes a novel perspective on joint entity and relation extraction. Existing studies divide the extraction of relation triples into several steps or modules. However, the three elements of a relation triple are interdependent and inseparable, so we regard joint extraction as a tripartite classification problem. From this triple classification perspective, we design a multi-granularity 2D convolution to refine the word pair table and better utilize the dependencies between biomedical word pairs. Finally, we use a biaffine predictor to assist in predicting the labels of word pairs for relation extraction. Our model, Multi-granularity Convolutional Tokens Pairs of Labeling (MCTPL), better utilizes the elements of triples and improves the ability to extract overlapping triples compared to previous approaches. We evaluated the model on two publicly accessible datasets. The experimental results show that on the CPI dataset the model improves the F1 score for relation triple extraction by 2.34% compared to the current optimal model, and on the DDI dataset it improves the F1 score by 1.68%. Our model achieved state-of-the-art performance compared to other baseline models in biomedical text entity relation extraction.
Funding: Supported in part by the National Key Research Project of China under Grant No. 2023YFA1009402 and by the General Science and Technology Plan Items in Zhejiang Province (ZJKJT-2023-02).
Abstract: With the advances in machine vision research and its expanding applications, scholars have increasingly focused on applying vision methodologies in industry. Detecting vehicle floor welding points poses particular challenges, including high operational costs and limited portability in practical settings. To address these challenges, this paper integrates template matching and the Faster RCNN algorithm, presenting an industrial fusion cascaded solder joint detection algorithm that combines template matching with deep learning techniques. The algorithm weights and fuses the optimized features of both methodologies, enhancing the overall detection capability. Furthermore, it introduces an optimized multi-scale and multi-template matching approach, using a diverse set of templates and image pyramid algorithms to improve the accuracy and resilience of object detection. By integrating deep learning algorithms with this multi-scale and multi-template matching strategy, the cascaded target matching algorithm accurately identifies solder joint types and positions. To validate the algorithm’s performance, a comprehensive welding point dataset, labeled by experts specifically for vehicle detection, was constructed from images of authentic industrial environments. Experiments demonstrate the algorithm’s performance in industrial scenarios: in terms of solder joint detection accuracy, it outperforms the single-template matching algorithm by 21.3%, the multi-scale and multi-template matching algorithm by 3.4%, the Faster RCNN algorithm by 19.7%, and the YOLOv9 algorithm by 17.3%. The optimized algorithm is robust and portable, making it well suited to detecting solder joints across diverse vehicle workpieces. The dataset and feature fusion approach of this study can also be a valuable resource for other algorithms seeking to enhance their solder joint detection capabilities. This work thus presents a novel and effective solution for industrial solder joint detection and lays the groundwork for future advances in this area.
Abstract: This paper analyzes the progress of handwritten Chinese character recognition technology from two perspectives: traditional recognition methods and deep learning-based recognition methods. First, the complexity of Chinese character recognition is pointed out, including the large number of categories, the complex structure, and the problem of similar characters, especially given the variability of handwritten Chinese characters. Next, recognition methods based on feature optimization, model optimization, and fusion techniques are highlighted. Studies that fuse feature optimization with model improvement are then explored; these further enhance recognition performance through complementary advantages. Finally, the article summarizes the current challenges of Chinese character recognition technology, including accuracy improvement, model complexity, and real-time performance, and looks ahead to future research directions.
Funding: Supported by the National Natural Science Foundation of China (No. 62072020), Key-Area Research and the Leading Talents in Innovation and Entrepreneurship of Qingdao (No. 19-3-2-21-zhc).
Abstract: Visual localization and object detection both play important roles in various tasks. In many indoor application scenarios where some detected objects have fixed positions, the two techniques work closely together. However, few researchers consider these two tasks simultaneously, because of a lack of datasets and the little attention paid to such environments. In this paper, we explore multi-task network design and joint refinement of detection and localization. To address the dataset problem, we construct a medium indoor scene of an aviation exhibition hall through a semi-automatic process. The dataset provides localization and detection information, and is publicly available at https://drive.google.com/drive/folders/1U28zk0N4_I0db zkqyIAK1A15k9oUKOjI?usp=sharing for benchmarking localization and object detection tasks. Targeting this dataset, we have designed a multi-task network, JLDNet, based on YOLO v3, that outputs a target point cloud and object bounding boxes. For dynamic environments, the detection branch also promotes the perception of dynamics. JLDNet includes image feature learning, point feature learning, feature fusion, detection construction, and point cloud regression. Moreover, object-level bundle adjustment is used to further improve localization and detection accuracy. To test JLDNet and compare it to other methods, we have conducted experiments on 7 static scenes, our constructed dataset, and the dynamic TUM RGB-D and Bonn datasets. Our results show state-of-the-art accuracy for both tasks, and the benefit of jointly working on both tasks is demonstrated.
Funding: This work was supported by the Research Deanship of Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia (Grant No. 2020/01/17215). The author also thanks the Deanship of the College of Computer Engineering and Sciences for the technical support provided to complete the project successfully.
Abstract: In the era of big data, learning discriminative feature representations from network traffic has been identified as an essential task for improving the detection ability of an intrusion detection system (IDS). Owing to the lack of accurately labeled network traffic data, many unsupervised feature representation learning models have been proposed with state-of-the-art performance. Yet these models fail to consider the classification error while learning the feature representation, so the learnt representation may degrade the performance of the classification task. For the first time in the field of intrusion detection, this paper proposes an unsupervised IDS model that leverages a deep autoencoder (DAE) for learning a robust feature representation and a one-class support vector machine (OCSVM) for finding a more compact decision hyperplane for intrusion detection. Specifically, the proposed model defines a new unified objective function that minimizes the reconstruction and classification errors simultaneously. This contribution not only enables the model to support joint learning of feature representation and classifier training but also guides it to learn a robust feature representation that can improve the discrimination ability of the classifier for intrusion detection. Three sets of evaluation experiments are conducted to demonstrate the potential of the proposed model. First, an ablation evaluation on the benchmark dataset NSL-KDD validates the design decisions of the proposed model. Next, a performance evaluation on the more recent intrusion dataset UNSW-NB15 shows the stable performance of the proposed model. Finally, a comparative evaluation verifies the efficacy of the proposed model against recently published state-of-the-art methods.
Funding: This research was supported by the BK21 FOUR (Fostering Outstanding Universities for Research) program funded by the Ministry of Education (MOE, Korea) and the National Research Foundation of Korea (NRF).
Abstract: Artificial intelligence-based dialog systems are attracting attention from both the business and academic communities. The key components of such intelligent chatbot systems are domain classification, intent detection, and named entity recognition. Various supervised, unsupervised, and hybrid approaches are used for each component. Such systems, also called natural language understanding systems, analyze user requests in sequential order: domain classification, intent detection, and entity recognition based on the semantic rules of the classified domain. This sequential approach propagates errors downstream; i.e., if the domain classification model misclassifies the domain, intent detection and entity recognition also fail. Furthermore, training such a system requires a large user-annotated dataset for each domain. To address these issues, this study proposes a single joint predictive deep neural network framework based on long short-term memory that uses only a small user-annotated dataset. It investigates the value added by incorporating unlabeled data from user chat logs into multi-domain spoken language understanding systems. Systematic experimental analysis of the proposed joint frameworks, along with the semi-supervised multi-domain model, using open-source annotated and unannotated utterances shows robust improvement in the predictive performance of the proposed multi-domain intelligent chatbot over a base joint model and a joint model based on adversarial learning.
Funding: This work was partly supported by the Basic Ability Improvement Project for Young and Middle-aged Teachers in Guangxi Colleges and Universities (2021KY1800, 2021KY1804).
Abstract: Traditional recommendation algorithms, represented by collaborative filtering, are the most classical and widely used recommendation algorithms in industry, and most book recommendation systems use them. However, collaborative filtering cannot handle data sparsity well. It uses only shallow features of the interaction between readers and books, so it fails to learn high-level abstractions of the attribute features of readers and books, leading to a decline in recommendation performance. To address these problems, this study uses deep learning to model readers’ book borrowing probability. It builds a recommendation model with a multi-layer neural network: the features extracted from readers and books are fed into the network, which then deeply integrates them and uncovers the hidden deep interactions between readers and books, improving the quality of book recommendation. In the experiments, the HR@10, MRR, and NDCG scores of the deep neural network recommendation model constructed in this paper are higher than those of the traditional recommendation algorithm, which verifies the effectiveness of the model for book recommendation.
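The three ranking metrics reported in the abstract (HR@10, MRR, and NDCG) can be sketched as follows. The sketch assumes a leave-one-out protocol in which each reader has exactly one held-out relevant book; the abstract does not state the evaluation protocol, so this is an assumption, and the ranks are invented for illustration.

```python
import math
from typing import Sequence


def hit_ratio_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of readers whose held-out book appears in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Mean of 1 / rank of the held-out book over all readers."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def ndcg_at_k(ranks: Sequence[int], k: int) -> float:
    """NDCG@k with one relevant item per reader, so the ideal DCG is 1."""
    gains = [1.0 / math.log2(r + 1) if r <= k else 0.0 for r in ranks]
    return sum(gains) / len(ranks)


# Hypothetical 1-based ranks of the held-out book for four readers.
ranks = [1, 3, 7, 20]
hr10 = hit_ratio_at_k(ranks, 10)
mrr = mean_reciprocal_rank(ranks)
ndcg10 = ndcg_at_k(ranks, 10)
```

HR@10 only records whether the book made the top ten, whereas MRR and NDCG also reward placing it nearer the top of the list.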
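Abstract: Entity and relation extraction from Chinese electronic medical records is an important foundation for building medical knowledge graphs and serving downstream tasks. At present, such extraction still suffers from inaccurate recognition of medical terms, caused by the complex relations and high entity density of medical text. To address this problem, a joint entity and relation extraction model for Chinese electronic medical records based on adversarial learning and multi-feature fusion, AMFRel (adversarial learning and multi-feature fusion for relation triple extraction), is proposed. The model extracts text and part-of-speech features from the records to obtain encoding vectors fused with part-of-speech information; combines the encoding vectors with perturbations produced by adversarial training to generate adversarial samples and extract sentence subjects; and uses an information fusion module to enrich the structural features of the text and extract the corresponding objects according to specific relation information, yielding the triples of the medical text. Experiments were conducted on the CHIP2020 relation extraction dataset and a diabetes dataset. On the CHIP2020 relation extraction dataset, AMFRel achieves a Precision of 63.922%, a Recall of 57.279%, and an F1 score of 60.418%; on the diabetes dataset, its Precision, Recall, and F1 score are 83.914%, 67.021%, and 74.522%, respectively, demonstrating that the model's triple extraction performance is better than that of other baseline models.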
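The reported F1 values can be checked against the reported Precision and Recall, assuming the usual definition of F1 as their harmonic mean (the abstract does not state the formula). The function name below is illustrative; the numbers are those given in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and Recall (in percent) as reported in the abstract.
chip2020_f1 = f1_score(63.922, 57.279)
diabetes_f1 = f1_score(83.914, 67.021)
```

Under that assumption both results agree with the reported 60.418% and 74.522% to three decimal places.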