Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. RS-2023-00218176) and the Soonchunhyang University Research Fund.
Abstract: Human Interaction Recognition (HIR) is one of the challenging problems in computer vision research because it involves multiple individuals and their mutual interactions within video frames generated from their movements. HIR requires more sophisticated analysis than Human Action Recognition (HAR): HAR focuses solely on individual activities such as walking or running, while HIR involves the interactions between people. This research aims to develop a robust system for recognizing five common human interactions, namely hugging, kicking, pushing, pointing, and no interaction, from video sequences captured by multiple cameras. A hybrid Deep Learning (DL) and Machine Learning (ML) model was employed to improve classification accuracy and generalizability. The dataset was collected in an indoor environment, with four-channel cameras capturing the five types of interactions among 13 participants. The data was processed by a DL model with a fine-tuned ResNet (Residual Network) architecture based on 2D Convolutional Neural Network (CNN) layers for feature extraction. Machine learning models were then trained for interaction classification using six commonly used ML algorithms: SVM, KNN, RF, DT, NB, and XGBoost. The results demonstrate a high accuracy of 95.45% in classifying human interactions. The hybrid approach enabled effective learning, resulting in highly accurate performance across the different interaction types. Future work will extend this architecture to more complex scenarios involving larger groups of individuals.
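As a rough illustration of the hybrid pipeline described above, the sketch below pairs a pretrained 2D-CNN ResNet backbone (torchvision's ResNet18 standing in for the authors' fine-tuned ResNet) with a scikit-learn SVM, one of the six classifiers the paper tries. The data shapes, input size, and hyperparameters are placeholders, not the paper's configuration.

```python
import torch
import torchvision.models as models
from torchvision.models import ResNet18_Weights
from sklearn.svm import SVC

# Pretrained ResNet with the classification head removed -> 512-d frame features.
backbone = models.resnet18(weights=ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

@torch.no_grad()
def extract_features(frames: torch.Tensor) -> torch.Tensor:
    """frames: (N, 3, 224, 224) normalized video frames -> (N, 512) features."""
    return backbone(frames)

# Placeholder data: N frames with interaction labels 0..4 (hug, kick, push,
# point, no interaction); a real run would load the multi-camera videos.
X_frames = torch.randn(40, 3, 224, 224)
y = torch.randint(0, 5, (40,)).numpy()

feats = extract_features(X_frames).numpy()
clf = SVC(kernel="rbf").fit(feats, y)   # SVM as the downstream ML classifier
print(clf.predict(feats[:5]))
```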
Funding: Supported by the National Key Research and Development Program of China (2022YFC3080100), the National Natural Science Foundation of China (Nos. 52104090, 52208328 and 12272353), the Open Fund of Anhui Province Key Laboratory of Building Structure and Underground Engineering, Anhui Jianzhu University (No. KLBSUE-2022-06), and the Open Research Fund of Key Laboratory of Construction and Safety of Water Engineering of the Ministry of Water Resources, China Institute of Water Resources and Hydropower Research (Grant No. IWHR-ENGI-202302).
Abstract: In geotechnical and tunneling engineering, accurately determining the mechanical properties of jointed rock holds great significance for project safety assessments. Peak shear strength (PSS), the paramount mechanical property of joints, has been a focal point in this research field. Current PSS prediction models for jointed rock have three limitations: (i) the models do not comprehensively consider the various influencing factors, and no PSS prediction model has been established that covers all seven factors, namely the sampling interval of the joints, the surface roughness of the joints, the normal stress, the basic friction angle, the uniaxial tensile strength, the uniaxial compressive strength, and the joint size for coupled joints; (ii) the datasets used to train the models are relatively limited; and (iii) there is controversy regarding whether compressive or tensile strength should be used as the strength term among the influencing factors. To overcome these limitations, we developed four machine learning models covering these seven influencing factors: three relying on Support Vector Regression (SVR) with different kernel functions (linear, polynomial, and Radial Basis Function (RBF)) and one using deep learning (DL). Based on these seven influencing factors, we compiled a dataset comprising the outcomes of 493 published direct shear tests for the training and validation of the four models. We compared the prediction performance of these four machine learning models with Tang's and Tatone's models. The prediction errors of Tang's and Tatone's models are 21.8% and 17.7%, respectively, while SVR_linear achieves 16.6%, SVR_poly 14.0%, and SVR_RBF 12.1%. DL outperforms the two existing models with only an 8.5% error. Additionally, we performed shear tests on granite joints to validate the predictive capability of the DL-based model. Based on the DL results, uniaxial tensile strength is recommended as the material strength term in PSS models for more reliable outcomes.
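As an illustration of the SVR branch of this study, the sketch below fits scikit-learn SVR regressors with the three kernels on a seven-column feature table. The synthetic data and default hyperparameters are placeholders, not the paper's 493-test dataset or tuned settings.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_percentage_error

rng = np.random.default_rng(0)
# Columns: sampling interval, joint surface roughness, normal stress, basic
# friction angle, uniaxial tensile strength, uniaxial compressive strength,
# joint size. Values here are random stand-ins for the shear-test records.
X = rng.random((493, 7))
y_pss = rng.random(493)  # placeholder peak shear strength labels

models = {k: make_pipeline(StandardScaler(), SVR(kernel=k))
          for k in ("linear", "poly", "rbf")}
for name, model in models.items():
    model.fit(X, y_pss)
    err = mean_absolute_percentage_error(y_pss, model.predict(X))
    print(f"SVR_{name}: {err:.1%} training error")
```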
Abstract: Multimodal Sentiment Analysis (SA) is gaining popularity due to its broad application potential. Existing studies have focused on the SA of single modalities, such as texts or photos, posing challenges in effectively handling social media data with multiple modalities. Moreover, most multimodal research has concentrated on merely combining the two modalities rather than exploring their complex correlations, leading to unsatisfactory sentiment classification results. Motivated by this, we propose a new visual-textual sentiment classification model named Multi-Model Fusion (MMF), which uses a mixed fusion framework for SA to effectively capture the essential information and the intrinsic relationship between visual and textual content. The proposed model comprises three deep neural networks. Two different neural networks extract the most emotionally relevant aspects of the image and text data, so that more discriminative features are gathered for accurate sentiment classification. A multichannel joint fusion model with a self-attention technique then exploits the intrinsic correlation between visual and textual characteristics and obtains emotionally rich information for joint sentiment classification. Finally, the results of the three classifiers are integrated using a decision fusion scheme to improve the robustness and generalizability of the proposed model. An interpretable visual-textual sentiment classification model is further developed using Local Interpretable Model-agnostic Explanations (LIME) to ensure the model's explainability and resilience. The proposed MMF model has been tested on four real-world sentiment datasets, achieving 99.78% accuracy on Binary_Getty (BG), 99.12% on Binary_iStock (BIS), 95.70% on Twitter, and 79.06% on the Multi-View Sentiment Analysis (MVSA) dataset. These results demonstrate the superior performance of our MMF model compared to single-model approaches and current state-of-the-art techniques based on model evaluation criteria.
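A minimal sketch of the fusion idea, not the MMF paper's exact architecture: two modality feature vectors are mixed by self-attention for a joint classifier, and the three classifiers' outputs are combined by a simple softmax average standing in for the decision fusion scheme. Dimensions, head count, and the averaging rule are assumptions.

```python
import torch
import torch.nn as nn

class JointFusion(nn.Module):
    """Multichannel joint fusion: self-attention over the two modality tokens."""
    def __init__(self, dim=256, num_classes=2):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, img_feat, txt_feat):
        # Stack modalities as a 2-token sequence; self-attention mixes them.
        tokens = torch.stack([img_feat, txt_feat], dim=1)   # (B, 2, dim)
        fused, _ = self.attn(tokens, tokens, tokens)
        return self.head(fused.mean(dim=1))                 # (B, num_classes)

B, D = 8, 256
img_feat, txt_feat = torch.randn(B, D), torch.randn(B, D)
img_logits = nn.Linear(D, 2)(img_feat)   # stand-in image-only classifier
txt_logits = nn.Linear(D, 2)(txt_feat)   # stand-in text-only classifier
joint_logits = JointFusion(D, 2)(img_feat, txt_feat)

# Decision fusion: average the three classifiers' softmax scores.
probs = torch.stack([l.softmax(-1) for l in (img_logits, txt_logits, joint_logits)])
print(probs.mean(dim=0).argmax(dim=1))
```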
Funding: This work was supported by the Research Deanship of Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia (Grant No. 2020/01/17215). The author also thanks the Deanship of the College of Computer Engineering and Sciences for the technical support provided to complete the project successfully.
Abstract: In the era of Big Data, learning discriminant feature representations from network traffic is an essential task for improving the detection ability of an intrusion detection system (IDS). Owing to the lack of accurately labeled network traffic data, many unsupervised feature representation learning models have been proposed with state-of-the-art performance. Yet, these models fail to consider the classification error while learning the feature representation; intuitively, the learned representation may therefore degrade the performance of the classification task. For the first time in the field of intrusion detection, this paper proposes an unsupervised IDS model leveraging the benefits of a deep autoencoder (DAE) for learning a robust feature representation and a one-class support vector machine (OCSVM) for finding a more compact decision hyperplane for intrusion detection. Specifically, the proposed model defines a new unified objective function to minimize the reconstruction and classification errors simultaneously. This contribution not only enables the model to support joint learning of the feature representation and the classifier but also guides it to learn a robust representation that improves the discrimination ability of the classifier. Three sets of evaluation experiments demonstrate the potential of the proposed model. First, an ablation evaluation on the benchmark dataset NSL-KDD validates the design decisions of the proposed model. Next, a performance evaluation on the recent intrusion dataset UNSW-NB15 shows the model's stable performance. Finally, a comparative evaluation verifies the efficacy of the proposed model against recently published state-of-the-art methods.
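The exact form of the paper's unified objective is not given in the abstract, so the sketch below is one plausible construction: a deep autoencoder's reconstruction loss is optimized jointly with the primal linear one-class SVM loss on the latent code, so the representation and the hyperplane are learned together. The architecture, the weighting factor lambda, and nu are illustrative guesses.

```python
import torch
import torch.nn as nn

class DAE_OCSVM(nn.Module):
    def __init__(self, in_dim=41, latent=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 32), nn.ReLU(), nn.Linear(32, in_dim))
        self.w = nn.Parameter(torch.randn(latent))   # OCSVM hyperplane normal
        self.rho = nn.Parameter(torch.zeros(()))     # OCSVM offset

    def forward(self, x):
        z = self.enc(x)
        return z, self.dec(z)

model = DAE_OCSVM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(64, 41)   # placeholder "normal" traffic feature vectors
nu, lam = 0.1, 1.0

for _ in range(100):
    z, x_hat = model(x)
    recon = ((x - x_hat) ** 2).mean()                    # reconstruction error
    scores = z @ model.w - model.rho                     # signed distance to plane
    ocsvm = (0.5 * model.w.dot(model.w) - model.rho
             + torch.clamp(-scores, min=0).mean() / nu)  # one-class hinge loss
    loss = recon + lam * ocsvm                           # unified objective
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    print("scores (negative = flagged):", (model.enc(x) @ model.w - model.rho)[:5])
```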
Funding: Supported by the National Natural Science Foundation of China (Nos. 62002206 and 62202373) and the open topic of the Green Development Big Data Decision-Making Key Laboratory (DM202003).
Abstract: Extracting valuable information from biomedical texts is a current research hotspot attracting a wide range of scholars. The biomedical corpus contains numerous complex long sentences and overlapping relational triples, making most generalized-domain joint modeling methods difficult to apply effectively in this field. For the complex semantic environment of biomedical texts, this paper proposes a novel perspective on joint entity and relation extraction. Existing studies divide relation-triple extraction into several steps or modules; however, the three elements of a relation triple are interdependent and inseparable, so we regard joint extraction as a tripartite classification problem. From this triple-classification perspective, we design a multi-granularity 2D convolution to refine the word-pair table and better utilize the dependencies between biomedical word pairs, and we use a biaffine predictor to assist in predicting the labels of word pairs for relation extraction. Our model, MCTPL (Multi-granularity Convolutional Tokens Pairs of Labeling), better utilizes the elements of triples and improves the ability to extract overlapping triples compared to previous approaches. We evaluated our model on two publicly accessible datasets. The experimental results show that our model improves the F1 score by 2.34% over the current optimal model on the CPI dataset and by 1.68% on the DDI dataset, achieving state-of-the-art performance against the baseline models for biomedical entity-relation extraction.
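A minimal sketch of the two named components, under assumed dimensions rather than MCTPL's actual configuration: a word-pair table refined by parallel 2D convolutions at several kernel granularities, plus a biaffine scorer over token representations.

```python
import torch
import torch.nn as nn

B, L, D, C = 2, 16, 64, 5          # batch, sentence length, hidden size, labels
h = torch.randn(B, L, D)           # token representations from some encoder

# Word-pair table: concatenate every (head, tail) token pair -> (B, 2D, L, L).
table = torch.cat([h.unsqueeze(2).expand(B, L, L, D),
                   h.unsqueeze(1).expand(B, L, L, D)], dim=-1).permute(0, 3, 1, 2)

# Multi-granularity 2D convolution: parallel branches with different kernels,
# so each word pair is refined using neighborhoods of several sizes.
branches = nn.ModuleList([nn.Conv2d(2 * D, 32, k, padding=k // 2)
                          for k in (1, 3, 5)])
refined = torch.cat([b(table) for b in branches], dim=1)     # (B, 96, L, L)
conv_logits = nn.Conv2d(96, C, 1)(refined)                   # per-pair labels

# Biaffine predictor: bilinear score for every (head, tail) pair per label.
U = nn.Parameter(torch.randn(C, D, D))
biaffine_logits = torch.einsum("bid,cde,bje->bcij", h, U, h)  # (B, C, L, L)

print((conv_logits + biaffine_logits).shape)  # fused word-pair label scores
```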
Funding: Supported in part by the National Key Research Project of China under Grant No. 2023YFA1009402 and by General Science and Technology Plan Items in Zhejiang Province (ZJKJT-2023-02).
Abstract: With the remarkable advancements in machine vision research and its ever-expanding applications, scholars have increasingly focused on harnessing various vision methodologies within the industrial realm. Specifically, detecting vehicle floor welding points poses unique challenges, including high operational costs and limited portability in practical settings. To address these challenges, this paper integrates template matching and the Faster RCNN algorithm, presenting an industrial cascaded solder joint detection algorithm that blends template matching with deep learning techniques. The algorithm weights and fuses the optimized features of both methodologies, enhancing the overall detection capability. Furthermore, it introduces an optimized multi-scale and multi-template matching approach, leveraging a diverse array of templates and image pyramid algorithms to bolster the accuracy and resilience of object detection. By integrating deep learning with this multi-scale and multi-template matching strategy, the cascaded target matching algorithm accurately identifies solder joint types and positions. A comprehensive welding point dataset, labeled by experts specifically for vehicle detection, was constructed from images of authentic industrial environments to validate the algorithm's performance. Experiments demonstrate the algorithm's compelling performance in industrial scenarios: in solder joint detection accuracy, it outperforms the single-template matching algorithm by 21.3%, the multi-scale and multi-template matching algorithm by 3.4%, the Faster RCNN algorithm by 19.7%, and the YOLOv9 algorithm by 17.3%. The optimized algorithm exhibits remarkable robustness and portability, making it well suited for detecting solder joints across diverse vehicle workpieces. Notably, this study's dataset and feature fusion approach can be a valuable resource for other algorithms seeking to enhance their solder joint detection capabilities. This work thus not only presents a novel and effective solution for industrial solder joint detection but also lays the groundwork for future advancements in this critical area.
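A minimal sketch of the multi-scale, multi-template matching step in isolation (the Faster RCNN branch and the feature-level fusion are omitted): each template is matched at several scales with OpenCV's matchTemplate, and the best normalized score across scales and templates wins. The scales, sizes, and random placeholder images are illustrative only.

```python
import cv2
import numpy as np

def multi_scale_match(image_gray, templates, scales=(0.75, 1.0, 1.25)):
    """Return (score, top_left, (w, h)) of the best match over all templates/scales."""
    best = None
    for tmpl in templates:
        for s in scales:
            t = cv2.resize(tmpl, None, fx=s, fy=s)
            th, tw = t.shape[:2]
            if th > image_gray.shape[0] or tw > image_gray.shape[1]:
                continue  # template larger than image at this scale
            res = cv2.matchTemplate(image_gray, t, cv2.TM_CCOEFF_NORMED)
            _, score, _, loc = cv2.minMaxLoc(res)
            if best is None or score > best[0]:
                best = (score, loc, (tw, th))
    return best

# Placeholder image and solder-joint templates; real use loads industrial images.
img = np.random.randint(0, 255, (480, 640), dtype=np.uint8)
tmps = [np.random.randint(0, 255, (40, 40), dtype=np.uint8) for _ in range(3)]
score, top_left, (w, h) = multi_scale_match(img, tmps)
print(f"best score {score:.2f} at {top_left}, box {w}x{h}")
```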
Funding: Supported by the "Human Resources Program in Energy Technology" of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), granted financial resources from the Ministry of Trade, Industry & Energy, Republic of Korea (No. 20204010600090).
Abstract: X-ray knee imaging is widely used to detect knee osteoarthritis due to its ready availability and lower cost. However, manual categorization of knee joint disorders is time-consuming, requires an expert, and is costly. This article proposes a new approach to classifying knee osteoarthritis using deep learning and a whale optimization algorithm. Two pre-trained deep learning models (EfficientNet-b0 and DenseNet201) are employed for training and feature extraction. Deep transfer learning with fixed hyperparameter values is used to train both selected models on the knee X-ray images. In the next step, fusion is performed using a canonical correlation approach, yielding a feature vector that carries more information than the original feature vectors. After that, an improved whale optimization algorithm is developed for dimensionality reduction. The selected features are finally passed to machine learning algorithms, such as a fine-tuned support vector machine (SVM) and neural networks, for classification. Experiments with the proposed framework on a publicly available dataset achieved a maximum accuracy of 90.1%. The system is also explained using the Explainable Artificial Intelligence (XAI) technique called occlusion, and the results are compared with recent research, showing that the proposed method significantly improves accuracy over recent techniques.
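A minimal sketch of the fuse-select-classify stage, with scikit-learn's CCA standing in for the canonical correlation fusion and a univariate selector standing in for the improved whale optimization algorithm; the feature shapes, k, and labels are placeholders.

```python
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Placeholder feature matrices; the real backbones yield 1280-d (EfficientNet-b0)
# and 1920-d (DenseNet201) vectors, shrunk here to keep the sketch fast.
f_eff = rng.random((200, 128))
f_dense = rng.random((200, 192))
y = rng.integers(0, 5, 200)        # placeholder severity-grade labels

# Canonical correlation fusion: project both views into correlated subspaces,
# then concatenate the aligned components into one fused vector.
cca = CCA(n_components=32).fit(f_eff, f_dense)
a, b = cca.transform(f_eff, f_dense)
fused = np.concatenate([a, b], axis=1)

# Simple selector in place of the improved whale optimizer, then SVM classify.
selected = SelectKBest(f_classif, k=24).fit_transform(fused, y)
clf = SVC(kernel="rbf").fit(selected, y)
print("train accuracy:", clf.score(selected, y))
```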