Image captioning spans two major modalities (image and sentence): it converts a given image into language that adheres to its visual semantics. Almost all methods first extract image features to reduce the difficulty of visual-semantic embedding and then use a caption model to generate fluent sentences. Convolutional neural networks (CNNs) are often used to extract image features in image captioning, and object detection networks that extract region features have achieved great success. However, the region features obtained this way are object-level and miss fine-grained details because of the detection model's limitations. To address this issue, we propose an approach that generates more accurate captions by fusing fine-grained features with region features. First, we extract fine-grained features using a panoramic segmentation algorithm. Second, we propose two fusion methods and compare their fusion results. Both fusion methods are built on the X-Linear Attention Network (X-LAN). Experiments on the COCO dataset show that the two-branch fusion approach is superior. Notably, on the COCO Karpathy test split, CIDEr rises to 134.3% compared with the baseline, demonstrating the effectiveness and feasibility of our method.
With the rapid development of deepfake technology, synthetic fake content of many kinds is becoming increasingly realistic, which poses potential security threats to people's daily life and social stability. Currently, most algorithms treat deepfake detection as a binary classification problem: global features are first extracted with a backbone network and then fed into a binary classifier to discriminate real from fake. However, the differences between real and fake samples are often subtle and local, so such global feature-based detection algorithms are not optimal in efficiency or accuracy. To enhance the extraction of forgery details in deepfake samples, we propose a multi-branch deepfake detection algorithm based on fine-grained features, approached from the perspective of fine-grained classification. First, to address the critical problem of locating discriminative feature regions in fine-grained classification tasks, we investigate a method for locating multiple distinct discriminative regions and design a lightweight feature localization module that obtains crucial feature representations by augmenting the most significant parts of the feature map. Second, exploiting information complementarity, we introduce a correlation-guided fusion module to enhance the discriminative feature information of the different branches. Finally, we use a global attention module in the multi-branch model to improve the cross-dimensional interaction of spatial-domain and channel-domain information and to increase the weights of crucial feature regions and feature channels. We conduct extensive ablation and comparative experiments. The results show that, on the FaceForensics++ and Celeb-DF-v2 datasets, the algorithm outperforms representative detection algorithms of recent years in detection accuracy and effectiveness.
Fine-grained image search is one of the most challenging tasks in computer vision; it aims to retrieve images that are similar at the fine-grained level to a given query image. The key objective is to learn discriminative fine-grained features by training deep models such that similar images are clustered and dissimilar images are separated in a low-dimensional embedding space. Previous works primarily focused on defining local structure loss functions such as triplet loss and pairwise loss. However, these approaches take a long time to train and achieve poor accuracy. Additionally, the representations they learn tend to tighten up in the embedding space and lose generalizability to unseen classes. To mitigate these issues, this paper proposes a noise-assisted representation learning method for fine-grained image retrieval. In the proposed work, class manifold learning is performed in which positive pairs are created with a noise insertion operation instead of tightening class clusters, and the other instances within the same cluster are treated as negatives. A loss function is then defined that penalizes the model when the distance between instances of the same class becomes too small relative to the noise pair of that class in the embedding space. The proposed approach is validated on the CARS-196 and CUB-200 datasets and achieves better retrieval results (85.38% recall@1 on CARS-196 and 70.13% recall@1 on CUB-200) than existing methods.
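The recall@1 figures quoted above follow a standard retrieval protocol: each image is used as a query, its single nearest neighbour among the other images is retrieved, and the query counts as a hit when the neighbour shares its class. A minimal sketch of that metric, assuming Euclidean distance and using made-up toy embeddings (the function name and data are illustrative, not taken from the paper):

```python
import math

def recall_at_1(embeddings, labels):
    """Fraction of queries whose nearest other item shares the query's label."""
    hits = 0
    for i, query in enumerate(embeddings):
        best_j, best_d = None, math.inf
        for j, candidate in enumerate(embeddings):
            if j == i:
                continue  # a query must not retrieve itself
            d = math.dist(query, candidate)
            if d < best_d:
                best_j, best_d = j, d
        hits += labels[best_j] == labels[i]
    return hits / len(embeddings)

# Toy data (hypothetical): the last point of class "b" sits next to class "a".
toy_embeddings = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.2, 0.1)]
toy_labels = ["a", "a", "b", "b", "b"]
score = recall_at_1(toy_embeddings, toy_labels)
```

On this toy set four of the five queries retrieve a same-class neighbour, so the score is 0.8. Real evaluations use a held-out set of unseen classes and usually a vectorised nearest-neighbour search.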
The infamous type IV failure within the fine-grained heat-affected zone (FGHAZ) of G115 steel weldments seriously threatens the safe operation of ultra-supercritical (USC) power plants. In this work, the traditional thermo-mechanical treatment was modified by replacing hot rolling with cold rolling, giving a normalizing, cold rolling, and tempering (NCT) treatment developed to improve the creep strength of the FGHAZ in G115 steel weldments. The NCT treatment effectively promoted the dissolution of preformed M23C6 particles and relieved the boundary segregation of C and Cr during welding thermal cycling, which accelerated the dispersed reprecipitation of M23C6 particles within the freshly reaustenitized grains during post-weld heat treatment. In addition, the precipitation of Cu-rich phases and MX particles was evidently promoted by the deformation-induced dislocations. As a result, the interactions between precipitates, dislocations, and boundaries during creep were considerably reinforced. Following this strategy, the creep rupture life of the FGHAZ in G115 steel weldments can be prolonged by 18.6%, which can further promote the application of G115 steel in USC power plants.
Traumatic spinal cord injury is potentially catastrophic and can lead to permanent disability or even death. China has the largest population of patients with traumatic spinal cord injury. Previous studies of traumatic spinal cord injury in China have mostly been regional in scope; national-level studies have been rare. To the best of our knowledge, no national-level study of treatment status and economic burden has been performed. This retrospective study aimed to examine the epidemiological and clinical features, treatment status, and economic burden of traumatic spinal cord injury in China at the national level. We included 13,465 traumatic spinal cord injury patients who were injured between January 2013 and December 2018 and treated in 30 hospitals in 11 provinces/municipalities representing all geographical divisions of China. Patient epidemiological and clinical features, treatment status, and total and daily costs were recorded. Trends in the cost of care and in the percentage of traumatic spinal cord injuries, both among all hospitalized patients and among patients hospitalized in the orthopedic department, were assessed by annual percentage change using the Joinpoint Regression Program. The percentage of traumatic spinal cord injuries among all hospitalized patients and among patients hospitalized in the orthopedic department did not change significantly overall (annual percentage change, -0.5% and 2.1%, respectively). A total of 10,053 (74.7%) patients underwent surgery. Only 2.8% of patients who underwent surgery did so within 24 hours of injury. A total of 2005 (14.9%) patients were treated with high-dose (≥500 mg) methylprednisolone sodium succinate/methylprednisolone (MPSS/MP); 615 (4.6%) received it within 8 hours. The total cost for acute traumatic spinal cord injury decreased over the study period (-4.7%), while daily cost did not change significantly (1.0% increase). Our findings indicate that public health initiatives should aim at improving hospitals' ability to complete early surgery within 24 hours, which is associated with improved sensorimotor recovery, and at increasing awareness of clinical guidelines on high-dose MPSS/MP to reduce the use of a treatment with insufficient evidence.
BACKGROUND Gastric cystica profunda (GCP) is a rare condition characterized by cystic dilation of gastric glands within the mucosal and/or submucosal layers. GCP is often linked to, or may progress into, early gastric cancer (EGC). AIM To provide a comprehensive evaluation of the endoscopic features of GCP while assessing the efficacy of endoscopic treatment, thereby offering guidance for diagnosis and treatment. METHODS This retrospective study involved 104 patients with GCP who underwent endoscopic resection. Demographic and clinical data were collected, and regular patient follow-ups were conducted to assess local recurrence. RESULTS Among the 104 patients diagnosed with GCP who underwent endoscopic resection, 12.5% had a history of previous gastric procedures. The most commonly affected primary site was the cardia (38.5%, n=40). GCP commonly exhibited intraluminal growth (99%), regular presentation (74.0%), and ulcerative mucosa (61.5%). The leading endoscopic feature was the mucosal lesion type (59.6%, n=62). The average maximum diameter was 20.9±15.3 mm, with mucosal involvement in 60.6% (n=63). Procedures lasted 73.9±57.5 min, achieving complete resection in 91.3% (n=95). Recurrence (4.8%) was managed by either surgical intervention (n=1) or endoscopic resection (n=4). Final pathology confirmed that 59.6% of GCP cases were associated with EGC. Univariate analysis indicated that elderly males were more susceptible to GCP associated with EGC. Conversely, multivariate analysis identified lesion morphology and endoscopic features as significant risk factors. Survival analysis demonstrated no statistically significant difference in recurrence between GCP with and without EGC (P=0.72). CONCLUSION The findings suggest that endoscopic resection might serve as an effective and minimally invasive treatment for GCP with or without EGC.
Fine-grained recognition of ships based on remote sensing images is crucial to safeguarding maritime rights and interests and maintaining national security. With the emergence of massive high-resolution multi-modality images, the use of multi-modality images for fine-grained recognition has become a promising technology. Fine-grained recognition of multi-modality images imposes higher requirements on the dataset samples. The key to the problem is how to extract and fuse the complementary features of multi-modality images to obtain more discriminative fusion features. The attention mechanism helps the model pinpoint the key information in the image, resulting in a significant improvement in the model's performance. In this paper, a dataset for fine-grained recognition of ships based on visible and near-infrared multi-modality remote sensing images is first proposed, named the Dataset for Multimodal Fine-grained Recognition of Ships (DMFGRS). It includes 1,635 pairs of visible and near-infrared remote sensing images divided into 20 categories, collated from digital orthophoto models provided by commercial remote sensing satellites. DMFGRS provides two types of annotation format files, as well as segmentation mask images corresponding to the ship targets. Then, a Multimodal Information Cross-Enhancement Network (MICE-Net) that fuses features of visible and near-infrared remote sensing images is proposed. In the network, a dual-branch feature extraction and fusion module is designed to obtain more expressive features. The Feature Cross Enhancement Module (FCEM) achieves fusion enhancement of the two modal features by making channel attention and spatial attention work cross-functionally on the feature map. A benchmark is established by evaluating state-of-the-art object recognition algorithms on DMFGRS. In experiments on DMFGRS, MICE-Net reached a precision, recall, mAP0.5, and mAP0.5:0.95 of 87%, 77.1%, 83.8%, and 63.9%, respectively. Extensive experiments demonstrate that the proposed MICE-Net achieves superior performance on DMFGRS. Built on the lightweight YOLO network, the model has excellent generalizability and thus good potential for application in real-life scenarios.
With the rapid spread of Internet information and of fake news, fake news detection is becoming more and more important. Traditional detection methods often rely on a single emotional or semantic feature to identify fake news, but these methods have limitations when dealing with news in specific domains. To solve the problem of weak feature correlation between data from different domains, a model for detecting fake news by integrating domain-specific emotional and semantic features is proposed. The method makes full use of the attention mechanism, captures the correlation between different features, and effectively improves feature fusion. The algorithm first extracts the semantic features of news text through a Bi-LSTM (Bidirectional Long Short-Term Memory) layer to capture the contextual relevance of the text. Senta-BiLSTM is then used to extract emotional features and predict the probability of positive and negative emotions in the text. Next, domain features are used as an enhancement feature, together with an attention mechanism, to capture more fine-grained emotional features associated with that domain. Finally, the fused features are taken as the input of the fake news detection classifier, combined with the multi-task representation of information, and an MLP with a Softmax function is used for classification. The experimental results show that on the Chinese dataset Weibo21, the F1 value of this model is 0.958, 4.9% higher than that of the second-best model; on the English dataset FakeNewsNet, the F1 value of this model is 0.845, 1.8% higher than that of the second-best model, demonstrating that the model is advanced and feasible.
Sign language recognition is vital for enhancing communication accessibility among the Deaf and hard-of-hearing communities. In Japan, approximately 360,000 individuals with hearing and speech disabilities rely on Japanese Sign Language (JSL) for communication. However, existing JSL recognition systems have faced significant performance limitations due to inherent complexities. In response to these challenges, we present a novel JSL recognition system that employs a strategic fusion approach, combining joint skeleton-based handcrafted features and pixel-based deep learning features. Our system incorporates two distinct streams: the first stream extracts crucial handcrafted features, emphasizing the capture of hand and body movements within JSL gestures, while the second, a deep learning-based transfer learning stream, captures hierarchical representations of JSL gestures. We then concatenate the critical information of the first stream with the hierarchical features of the second stream to produce multi-level fusion features, aiming to create a comprehensive representation of the JSL gestures. After reducing the dimensionality of the features, a feature selection approach and a kernel-based support vector machine (SVM) are used for classification. To assess the effectiveness of our approach, we conducted extensive experiments on our Lab JSL dataset and a publicly available Arabic sign language (ArSL) dataset. Our results demonstrate that the fusion approach significantly enhances JSL recognition accuracy and robustness compared to individual feature sets or traditional recognition methods.
Objective To construct a precise model for identifying traditional Chinese medicine (TCM) constitutions, thereby offering optimized guidance for clinical diagnosis and treatment planning, and ultimately enhancing medical efficiency and treatment outcomes. Methods First, TCM full-body inspection data acquisition equipment was employed to collect full-body standing images of healthy people, from which the constitutions were labelled and defined in accordance with the Constitution in Chinese Medicine Questionnaire (CCMQ), and a dataset of labelled constitutions was constructed. Second, the hue, saturation, value (HSV) color space and an improved local binary patterns (LBP) algorithm were leveraged to extract features such as facial complexion and body shape. In addition, a dual-branch deep network was employed to collect deep features from the full-body standing images. Last, the random forest (RF) algorithm was utilized to learn the extracted multifeatures, which were subsequently employed to establish a TCM constitution identification model. Accuracy, precision, and F1 score were the three measures selected to assess the performance of the model. Results The accuracy, precision, and F1 score of the proposed multifeature model for identifying TCM constitutions were 0.842, 0.868, and 0.790, respectively. In comparison with identification models based on a single feature (a facial complexion feature, a body shape feature, or deep features), the accuracy of the model incorporating all of these features was higher by 0.105, 0.105, and 0.079, the precision by 0.164, 0.164, and 0.211, and the F1 score by 0.071, 0.071, and 0.084, respectively. Conclusion The research findings affirmed the viability of the proposed model, which incorporates multifeatures including the facial complexion feature, the body shape feature, and the deep feature. In addition, by employing the proposed model, the objectification and intelligence of constitution identification in TCM practice could be improved.
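The abstract above relies on an improved LBP variant whose details are not given here. As background only, the following is a minimal sketch of the basic 8-neighbour LBP descriptor that such variants build on: each pixel is encoded by comparing its eight neighbours with the centre value, and the image is summarised as a 256-bin histogram of those codes. The neighbour ordering and the `>=` threshold rule are common conventions assumed for illustration, and the function names are hypothetical.

```python
def lbp_code(patch):
    """Basic 8-neighbour LBP code for the centre pixel of a 3x3 patch."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left pixel (assumed order).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:  # neighbour at least as bright sets its bit
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels of a grayscale image."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist

# A flat patch sets every bit; a single bright top-left neighbour sets only bit 0.
flat = lbp_code([[5, 5, 5], [5, 5, 5], [5, 5, 5]])
corner = lbp_code([[9, 1, 1], [1, 5, 1], [1, 1, 1]])
```

A histogram of this kind is a fixed-length texture feature that could be concatenated with colour statistics and fed to a classifier such as a random forest, which matches the general shape of the pipeline described above.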
Although sentiment analysis is pivotal to understanding user preferences, existing models face significant challenges in handling context-dependent sentiments, sarcasm, and nuanced emotions. This study addresses these challenges by integrating ontology-based methods with deep learning models, thereby enhancing sentiment analysis accuracy in complex domains such as film reviews and restaurant feedback. The framework comprises explicit topic recognition, followed by implicit topic identification to mitigate topic interference in subsequent sentiment analysis. For sentiment analysis, we develop an expanded sentiment lexicon based on domain-specific corpora by leveraging techniques such as word-frequency analysis and word embedding. Furthermore, we introduce a sentiment recognition method based on both ontology-derived sentiment features and sentiment lexicons. We evaluate the performance of our system on a dataset of 10,500 restaurant reviews, focusing on sentiment classification accuracy. The incorporation of specialized lexicons and ontology structures enables the framework to discern subtle sentiment variations and context-specific expressions, thereby improving overall sentiment analysis performance. Experimental results demonstrate that the integration of ontology-based methods and deep learning models significantly improves sentiment analysis accuracy.
Image captioning has gained increasing attention in recent years. Visual characteristics found in input images play a crucial role in generating high-quality captions. Prior studies have used visual attention mechanisms to dynamically focus on localized regions of the input image, improving the effectiveness of identifying relevant image regions at each step of caption generation. Beyond this, giving image captioning models the capability to select the most relevant visual features from the input image and attend to them can significantly improve the utilization of these features, which in turn enhances captioning network performance. In light of this, we present an image captioning framework that efficiently exploits the extracted representations of the image. Our framework comprises three key components: the Visual Feature Detector module (VFD), the Visual Feature Visual Attention module (VFVA), and the language model. The VFD module is responsible for detecting a subset of the most pertinent features from the local visual features, creating an updated visual features matrix. Subsequently, the VFVA directs its attention to the visual features matrix generated by the VFD, resulting in an updated context vector that the language model uses to generate an informative description. Integrating the VFD and VFVA modules introduces an additional layer of processing for the visual features, thereby enhancing the image captioning model's performance. Using the MS-COCO dataset, our experiments show that the proposed framework competes well with state-of-the-art methods, effectively leveraging visual representations to improve performance. The implementation code can be found here: https://github.com/althobhani/VFDICM (accessed on 30 July 2024).
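The select-then-attend idea in the abstract above can be illustrated with a toy sketch. This is not the VFD or VFVA implementation from the linked repository: the top-k selection by dot-product relevance, the dot-product attention scores, and all function names are assumptions chosen only to show how a subset of features is kept and then summarised into a context vector.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def select_top_k(features, query, k):
    """Hypothetical stand-in for feature selection: keep the k most relevant rows."""
    return sorted(features, key=lambda f: dot(f, query), reverse=True)[:k]

def attend(features, query):
    """Soft attention: weights from softmax over scores, context as the weighted sum."""
    weights = softmax([dot(f, query) for f in features])
    dim = len(features[0])
    context = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, context

# Toy region features and a decoder-state query (all values are made up).
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [1.0, 0.0]
kept = select_top_k(regions, state, k=2)
weights, context = attend(kept, state)
```

In a captioning model the query would be the language model's hidden state at the current decoding step, and the context vector would be fed back into that model to predict the next word.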
The fingerprinting-based approach using the wireless local area network (WLAN) is widely used for indoor localization. However, constructing the fingerprint database is quite time-consuming. In particular, when the position of an access point (AP) or a wall changes, updating the fingerprint database in real time is difficult. An indoor localization approach that has a low implementation cost, excellent real-time performance, and high localization accuracy, and that fully considers complex indoor environment factors, is preferred in location-based services (LBS) applications. In this paper, we propose a fine-grained grid computing (FGGC) model to achieve decimeter-level localization accuracy. Reference points (RPs) are generated in the grid by the FGGC model. Then, the received signal strength (RSS) values at each RP are calculated from attenuation factors such as the frequency band, the three-dimensional propagation distance, and walls in complex environments. As a result, the fingerprint database can be established automatically without manual measurement, and the FGGC model builds it more efficiently and at lower cost than previous methods. The proposed indoor localization approach, which estimates the position step by step from the approximate grid location to the fine-grained location, achieves higher real-time performance and localization accuracy simultaneously. The mean error of the proposed model is 0.36 m, far lower than that of previous approaches. Thus, the proposed model can improve the efficiency and accuracy of Wi-Fi indoor localization. It also delivers high accuracy with a fast running speed even under a large-size grid. The results indicate that the proposed method is also suitable for precision marketing, indoor navigation, and emergency rescue.
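The idea of computing a fingerprint database instead of measuring it can be sketched with a simple propagation model. The code below is not the FGGC model: it assumes free-space path loss (distance in metres, frequency in MHz) plus a fixed per-wall penalty, walls modelled as planes at given x-coordinates, and nearest-fingerprint matching. The transmit power, wall loss, and every function name are hypothetical values chosen for illustration.

```python
import math

def rss_dbm(ap, rp, tx_power_dbm=20.0, freq_mhz=2400.0, wall_x=(), wall_loss_db=5.0):
    """Predicted RSS at reference point rp from access point ap (3-D coordinates in m)."""
    d = max(math.dist(ap, rp), 0.1)  # clamp to avoid log10(0) at the AP itself
    # Free-space path loss in dB for d in metres and frequency in MHz.
    fspl = 20 * math.log10(d) + 20 * math.log10(freq_mhz) - 27.55
    lo, hi = sorted((ap[0], rp[0]))
    walls = sum(1 for x in wall_x if lo < x < hi)  # assumed walls: planes x = const
    return tx_power_dbm - fspl - walls * wall_loss_db

def build_fingerprints(aps, grid_points, **model):
    """Map every grid reference point to its vector of predicted RSS values."""
    return {rp: [rss_dbm(ap, rp, **model) for ap in aps] for rp in grid_points}

def locate(observed, fingerprints):
    """Return the reference point whose fingerprint is closest to the observation."""
    return min(fingerprints, key=lambda rp: math.dist(observed, fingerprints[rp]))

# Hypothetical 10 m x 10 m room with three ceiling APs and one wall at x = 5 m.
aps = [(0.0, 0.0, 2.5), (10.0, 0.0, 2.5), (0.0, 10.0, 2.5)]
grid = [(x, y, 1.0) for x in range(0, 11, 2) for y in range(0, 11, 2)]
database = build_fingerprints(aps, grid, wall_x=(5.0,))
estimate = locate(database[(4, 6, 1.0)], database)
```

A coarse-to-fine scheme like the one described above could call `locate` first on a sparse grid and then on a denser grid around the coarse estimate, so that the search cost stays low as the grid grows.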
The number of blogs and other forms of opinionated online content has increased dramatically in recent years. Many fields, including academia and national security, place an emphasis on automated detection of the orientation of political articles. Political articles (especially in the Arab world) differ from other articles in their subjectivity: the author's beliefs and political affiliation might have a significant influence on a political article. With categories representing the main political ideologies, this problem may be thought of as a subset of text categorization (classification). In general, the performance of machine learning models for text classification is sensitive to hyperparameter settings. Furthermore, the feature vector used to represent a document must capture, to some extent, the complex semantics of natural language. To this end, this paper presents an intelligent system for detecting the orientation of political Arabic articles that adapts the categorical boosting (CatBoost) method combined with a multi-level feature concept. Extracting features at multiple levels can enhance the model's ability to discriminate between different classes or patterns. Each level may capture different aspects of the input data, contributing to a more comprehensive representation. CatBoost, a robust and efficient gradient-boosting algorithm, is utilized to effectively learn and predict the complex relationships between these features and the political orientation labels associated with the articles. A dataset of political Arabic texts collected from diverse sources, including postings and articles, is used to assess the suggested technique. Conservative, reform, and revolutionary are the three subcategories of these opinions. The results of this study demonstrate that, compared to other frequently used machine learning models for text classification, the CatBoost method using multi-level features performs better, with an accuracy of 98.14%.
The safety and stability of high-speed maglev trains traveling on viaducts in crosswinds critically depend on their aerodynamic characteristics. Therefore, this paper uses an improved delayed detached eddy simulation (IDDES) method to investigate the aerodynamic features of high-speed maglev trains with different marshaling lengths under crosswinds. The effects of marshaling length (varying from 3-car to 8-car groups) on the train's aerodynamic performance, surface pressure, and surrounding flow field were investigated using the three-dimensional unsteady compressible Navier-Stokes (N-S) equations. The results showed that marshaling length had minimal influence on the aerodynamic performance of the head and middle cars. Conversely, marshaling length is negatively correlated with the time-averaged side force coefficient (CS) and time-averaged lift force coefficient (Cl) of the tail car. Compared to the tail car of the 3-car group, the CS and Cl of the tail car of the 8-car group fell by 27.77% and 18.29%, respectively. It is essential to pay more attention to the operational safety of the head car, as it exhibits the highest time-averaged CS. Additionally, the mean pressure difference between the two sides of the tail car body increased with marshaling length, and the side force direction on the tail car was opposite to that on the head and middle cars. Furthermore, the turbulent kinetic energy of the wake structure on the windward side decreased quickly as marshaling length increased.
BACKGROUND Duodenal neuroendocrine tumours (DNETs) are rare neoplasms. However, the incidence of DNETs has been increasing in recent years, especially as an incidental finding during endoscopic studies. Regrettably, there is no consensus regarding the ideal treatment of DNETs, and there are few studies on their clinical features and survival analysis. AIM To analyze the clinical characteristics and prognostic factors of patients with duodenal neuroendocrine tumours. METHODS The clinical data of DNETs diagnosed in the First Affiliated Hospital of Air Force Military Medical University from June 2011 to July 2022 were collected. Neuroendocrine tumours located in the ampulla area of the duodenum were assigned to the ampullary region group; neuroendocrine tumours in any part of the duodenum outside the ampullary area were assigned to the nonampullary region group. In this retrospective study, the clinical characteristics of the two groups and the risk factors affecting the survival of DNET patients were analysed. RESULTS Twenty-nine DNET patients were screened. The male to female ratio was 1:1.9, and females comprised the majority. The ampullary region group accounted for 24.1% (7/29), while the nonampullary region group accounted for 75.9% (22/29). At diagnosis, the clinical symptoms of the ampullary region group were mainly abdominal pain (85.7%), while those of the nonampullary region group were mainly abdominal distension (59.1%). There were differences in the composition of tumour staging between the two groups (Fisher's exact probability method, P=0.001), with stage II tumours (68.2%) being the main stage in the nonampullary group (P<0.05). After the diagnosis of DNETs, the survival rate of the ampullary region group was 14.3% (1/7), which was lower than the 72.7% (16/22) in the nonampullary region group (Fisher's exact probability method, P=0.011). The survival time of the ampullary region group was shorter than that of the nonampullary region group (P<0.000). The median survival time of the ampullary region group was 10.0 months and that of the nonampullary region group was 451.0 months. Multivariate analysis showed that tumours in the ampulla region and no surgical treatment after diagnosis were independent risk factors for the survival of DNET patients (HR=0.029, 95%CI: 0.004-0.199, P<0.000; HR=12.609, 95%CI: 2.889-55.037, P=0.001). Further analysis of nonampullary DNET patients showed that the survival time of patients with a tumour diameter <2 cm was longer than that of patients with a tumour diameter ≥2 cm (t=7.243, P=0.048). As of follow-up, the 6 patients who died of nonampullary DNETs had a tumour diameter ≥2 cm, and 3 patients in stage IV had liver metastasis. Patients with a tumour diameter <2 cm underwent surgical treatment, and all survived after surgery. CONCLUSION Surgical treatment is a protective factor for prolonging the survival of DNET patients. Compared to patients with DNETs in the ampullary region, patients in the nonampullary region group had a longer survival period. The liver is the organ most susceptible to distant metastasis of nonampullary DNETs.
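The survival comparison above (1 of 7 survivors in the ampullary group versus 16 of 22 in the nonampullary group) is a 2x2 table small enough to check by hand. The sketch below computes a two-sided Fisher exact p-value from the hypergeometric distribution. The two-sided convention used here, summing the probabilities of all tables no more likely than the observed one, is an assumption, since the abstract does not state which convention the authors applied.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell with fixed margins.
        return comb(col1, k) * comb(n - col1, row1 - k) / denom

    p_observed = prob(a)
    k_min, k_max = max(0, row1 - (n - col1)), min(row1, col1)
    # Sum over every table at most as likely as the observed one (small tolerance for ties).
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_observed * (1 + 1e-9))

# Rows: ampullary, nonampullary; columns: survived, died (counts from the abstract).
p_survival = fisher_exact_two_sided(1, 6, 16, 6)
```

With these counts the function returns roughly 0.0106, which rounds to the P=0.011 reported above.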
Electrocatalytic nitrogen reduction to ammonia has garnered significant attention with the rise of single-atom catalysts (SACs), which show potential for sustainable and energy-efficient ammonia production. However, cost-effectively designing and screening efficient electrocatalysts remains a challenge. In this study, we establish interpretable machine learning (ML) models that evaluate the catalytic activity of SACs by directly and accurately predicting the reaction Gibbs free energy. The models were trained on features that require no density functional theory (DFT) calculation, using a dataset of 90 graphene-supported SACs. The gradient boosting regression (GBR) model gives the best prediction accuracy for both ΔG(N₂→NNH) and ΔG(NH₂→NH₃), with coefficients of determination (R²) of 0.972 and 0.984 and root mean square errors (RMSE) of 0.051 and 0.085 eV, respectively. Moreover, feature importance analysis indicates that the high accuracy of the GBR model stems from its capture of characteristics pertinent to the active center and coordination environment, revealing the significance of elementary descriptors, with the covalent radius playing a dominant role. Additionally, Shapley additive explanations (SHAP) analysis provides global and local interpretation of the working mechanism of the GBR model. Our analysis identifies that a pyrrole-type coordination (flag = 0), d-orbitals with a moderate occupation (N_d = 5), and a moderate difference in covalent radius (r_TM-ave near 140 pm) are conducive to high activity. Furthermore, we extend the activity prediction to more catalysts without additional DFT calculations, validating the reliability of our feature engineering, model training, and design strategy. These findings not only highlight a new opportunity for accelerating catalyst design using non-DFT calculated features, but also shed light on the working mechanism of "black box" ML models. Moreover, the model provides valuable guidance for catalytic material design in multiple proton-electron coupling reactions, particularly in driving sustainable CO₂, O₂, and N₂ conversion.
With the adoption of cutting-edge communication technologies such as 5G/6G systems and the extensive development of devices, crowdsensing systems in the Internet of Things (IoT) are now conducting complicated video analysis tasks such as behaviour recognition. These applications have dramatically increased the diversity of IoT systems. Specifically, behaviour recognition in videos usually requires a combinatorial analysis of the spatial information about objects and information about their dynamic actions in the temporal dimension. In contrast to image-based computer vision tasks, which focus on understanding spatial information, behaviour recognition may rely even more on modeling temporal information that contains short-range and long-range motions. However, current solutions fail to jointly and comprehensively analyse short-range motions between adjacent frames and long-range temporal aggregations at large scales in videos. In this paper, we propose a novel behaviour recognition method based on the integration of multigranular (IMG) motion features, which can support the deployment of video analysis in multimedia IoT crowdsensing systems. In particular, we achieve reliable motion information modeling by integrating a channel attention-based short-term motion feature enhancement module (CSEM) and a cascaded long-term motion feature integration module (CLIM). We evaluate our model on several action recognition benchmarks, such as HMDB51, Something-Something and UCF101. The experimental results demonstrate that our approach outperforms the previous state-of-the-art methods, which confirms its effectiveness and efficiency.
Numerical weather prediction (NWP) models have always presented large forecasting errors of surface wind speeds over regions with complex terrain. In this study, surface wind forecasts from an operational NWP model, the SMS-WARR (Shanghai Meteorological Service-WRF ADAS Rapid Refresh System), are analyzed to quantify the relationships between forecasted surface wind speed errors and terrain features, with the intent of providing clues for better applying the NWP model to complex terrain regions. The terrain features are described by three parameters: the standard deviation of the model grid-scale orography, the terrain height error of the model, and the slope angle. The results show that the forecast bias has a unimodal distribution with respect to the standard deviation of orography. The minimum ME (the mean value of bias) is 1.2 m s⁻¹ when the standard deviation is between 60 and 70 m. A positive correlation exists between bias and terrain height error, with the ME increasing by 10% to 30% for every 200 m increase in terrain height error. The ME decreases by 65.6% when the slope angle increases from 0.5°-1.5° to larger than 3.5° for uphill winds, but increases by 35.4% when the absolute value of the slope angle increases from 0.5°-1.5° to 2.5°-3.5° for downhill winds. Several sensitivity experiments are carried out with a model output statistics (MOS) calibration model for surface wind speeds; introducing the terrain parameters reduces the ME by 90% and the RMSE by 30%, demonstrating the value of this study.
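The two error measures named in the abstract above, ME and RMSE, can be illustrated with a minimal sketch. The function names and the sample wind speeds below are hypothetical and are not taken from the study; ME is assumed here to be the mean of forecast minus observation, following the abstract's description of it as the mean value of bias.

```python
import math

def mean_error(forecast, observed):
    """Mean error (bias): average of forecast minus observation."""
    diffs = [f - o for f, o in zip(forecast, observed)]
    return sum(diffs) / len(diffs)

def rmse(forecast, observed):
    """Root mean square error of the forecast against observations."""
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical surface wind speeds in m/s (illustrative values only).
forecast = [5.0, 6.5, 4.0, 7.5]
observed = [4.0, 5.0, 3.5, 6.0]

print(mean_error(forecast, observed))  # positive value: the forecast is too strong on average
print(rmse(forecast, observed))
```

A positive ME indicates systematic overestimation, while RMSE also penalizes scatter, which is consistent with a calibration step reducing the two measures by different amounts.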
Since its discovery in 2009, Candida auris has become a severe threat to human health due to its rapid spread and its high worldwide resistance to systemic antifungal drugs. In resource-constrained settings, where several conditions are met for its emergence and spread, this worrisome fungus could cause large hospital and/or community-based outbreaks. This review aimed to summarize the available data on C. auris in Africa, focusing on its epidemiology and antifungal resistance profile. Major databases were searched for articles on the epidemiology and antifungal resistance profile of C. auris in Africa. Out of 2,521 articles identified, 22 met the inclusion criteria. Nearly 89% of African countries have no published data on C. auris. The prevalence of C. auris in Africa was 8.74%. The case fatality rate of C. auris infection in Africa was 39.46%. The main C. auris risk factors reported in Africa were cardiovascular disease, renal failure, diabetes, HIV, recent intake of antimicrobial drugs, ICU admissions, surgery, hemodialysis, parenteral nutrition and indwelling devices. Four phylogenetic clades were reported in Africa, namely clades I, II, III and IV. Across Africa, Candida auris showed a very high resistance rate to fluconazole, moderate resistance to amphotericin B, and high susceptibility to echinocandins. Finally, C. auris clade-specific mutations were observed within the ERG2, ERG3, ERG9, ERG11, FKS1, TAC1b and MRR1 genes in Africa. This systematic review showed the presence of C. auris on the African continent and a worrying unavailability of data on this resilient fungus in most African countries.
Funding: Supported in part by the National Natural Science Foundation of China (NSFC) under Grant 6150140 and in part by the Youth Innovation Project (21032158-Y) of Zhejiang Sci-Tech University.
Abstract: Image captioning involves two major modalities (image and sentence) and converts a given image into language that adheres to its visual semantics. Almost all methods first extract image features to reduce the difficulty of visual semantic embedding and then use a caption model to generate fluent sentences. The Convolutional Neural Network (CNN) is often used to extract image features in image captioning, and the use of object detection networks to extract region features has achieved great success. However, the region features retrieved by this method are object-level and do not capture fine-grained details because of the detection model's limitations. To address this issue, we offer an approach that generates captions more accurately by fusing fine-grained features and region features. First, we extract fine-grained features using a panoramic segmentation algorithm. Second, we propose two fusion methods and compare their fusion outcomes. An X-linear Attention Network (X-LAN) serves as the foundation for both fusion methods. According to experimental findings on the COCO dataset, the two-branch fusion approach is superior. Notably, on the COCO Karpathy test split, CIDEr is increased up to 134.3% in comparison to the baseline, highlighting the effectiveness and viability of our method.
Funding: Supported by the 2023 Open Project of Key Laboratory of Ministry of Public Security for Artificial Intelligence Security (RGZNAQ-2304) and the Fundamental Research Funds for the Central Universities of PPSUC (2023JKF01ZK08).
Abstract: With the rapid development of deepfake technology, the realism of various types of fake synthetic content is increasing rapidly, which brings potential security threats to people's daily life and social stability. Currently, most algorithms define deepfake detection as a binary classification problem, i.e., global features are first extracted using a backbone network and then fed into a binary classifier to discriminate real from fake. However, the differences between real and fake samples are often subtle and local, and such global feature-based detection algorithms are not optimal in efficiency and accuracy. To enhance the extraction of forgery details in deepfake samples, we propose a multi-branch deepfake detection algorithm based on fine-grained features from the perspective of fine-grained classification. First, to address the critical problem of locating discriminative feature regions in fine-grained classification tasks, we investigate a method for locating multiple different discriminative regions and design a lightweight feature localization module that obtains crucial feature representations by augmenting the most significant parts of the feature map. Second, using information complementation, we introduce a correlation-guided fusion module to enhance the discriminative feature information of different branches. Finally, we use a global attention module in the multi-branch model to improve the cross-dimensional interaction of spatial domain and channel domain information and to increase the weights of crucial feature regions and feature channels. We conduct extensive ablation and comparative experiments. The experimental results show that the algorithm outperforms representative detection algorithms of recent years in detection accuracy and effectiveness on the FaceForensics++ and Celeb-DF-v2 datasets.
Abstract: Fine-grained image search is one of the most challenging tasks in computer vision; it aims to retrieve images that are similar at the fine-grained level to a given query image. The key objective is to learn discriminative fine-grained features by training deep models such that similar images are clustered and dissimilar images are separated in the low-dimensional embedding space. Previous works primarily focused on defining local structure loss functions such as triplet loss and pairwise loss. However, training with these approaches takes a long time and yields poor accuracy. Additionally, representations learned through them tend to tighten up in the embedding space and lose generalizability to unseen classes. This paper proposes a noise-assisted representation learning method for fine-grained image retrieval to mitigate these issues. In the proposed work, class manifold learning is performed in which positive pairs are created with a noise insertion operation instead of tightening class clusters, and other instances within the same cluster are treated as negatives. A loss function is then defined that penalizes the model when the distance between instances of the same class becomes too small relative to the noise pair of that class in the embedding space. The proposed approach is validated on the CARS-196 and CUB-200 datasets and achieves better retrieval results (85.38% recall@1 for CARS-196 and 70.13% recall@1 for CUB-200) than other existing methods.
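The recall@1 metric reported above can be illustrated with a minimal sketch: a query counts as a hit when its nearest neighbour in the embedding space (excluding itself) shares its class label. The function name, the Euclidean distance, and the toy 2-D embeddings are assumptions made for illustration; they are not the paper's evaluation code.

```python
import math

def recall_at_1(embeddings, labels):
    """Fraction of items whose nearest other item has the same label."""
    hits = 0
    for i, query in enumerate(embeddings):
        best_index, best_distance = None, math.inf
        for j, candidate in enumerate(embeddings):
            if j == i:
                continue  # a query must not retrieve itself
            distance = math.dist(query, candidate)
            if distance < best_distance:
                best_index, best_distance = j, distance
        if labels[best_index] == labels[i]:
            hits += 1
    return hits / len(embeddings)

# Hypothetical 2-D embeddings: two tight classes plus one class-"a" item
# that sits next to the class-"b" cluster.
embeddings = [(0.0, 0.0), (0.2, 0.0), (1.0, 1.0), (1.2, 1.0), (1.0, 0.9)]
labels = ["a", "a", "b", "b", "a"]

print(recall_at_1(embeddings, labels))
```

In this toy set the misplaced item both misses its own class and becomes the wrong nearest neighbour for one class-"b" item, so two of the five queries fail.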
Funding: Financially supported by the National Key R&D Program of China (No. 2022YFB3705300), the National Natural Science Foundation of China (Nos. U1960204 and 51974199), and the Postdoctoral Fellowship Program of CPSF (No. GZB20230515).
Abstract: The infamous type IV failure within the fine-grained heat-affected zone (FGHAZ) in G115 steel weldments seriously threatens the safe operation of ultra-supercritical (USC) power plants. In this work, the traditional thermo-mechanical treatment was modified by replacing hot rolling with cold rolling, giving a normalizing, cold rolling, and tempering (NCT) treatment developed to improve the creep strength of the FGHAZ in G115 steel weldments. The NCT treatment effectively promoted the dissolution of preformed M₂₃C₆ particles and relieved the boundary segregation of C and Cr during welding thermal cycling, which accelerated the dispersed reprecipitation of M₂₃C₆ particles within the fresh reaustenitized grains during post-weld heat treatment. In addition, the precipitation of Cu-rich phases and MX particles was evidently promoted by the deformation-induced dislocations. As a result, the interactions between precipitates, dislocations, and boundaries during creep were reinforced considerably. Following this strategy, the creep rupture life of the FGHAZ in G115 steel weldments can be prolonged by 18.6%, which can further advance the application of G115 steel in USC power plants.
Funding: Supported by the National Key Research and Development Project, No. 2019YFA0112100 (to SF).
Abstract: Traumatic spinal cord injury is potentially catastrophic and can lead to permanent disability or even death. China has the largest population of patients with traumatic spinal cord injury. Previous studies of traumatic spinal cord injury in China have mostly been regional in scope; national-level studies have been rare. To the best of our knowledge, no national-level study of treatment status and economic burden has been performed. This retrospective study aimed to examine the epidemiological and clinical features, treatment status, and economic burden of traumatic spinal cord injury in China at the national level. We included 13,465 traumatic spinal cord injury patients who were injured between January 2013 and December 2018 and treated in 30 hospitals in 11 provinces/municipalities representing all geographical divisions of China. Patient epidemiological and clinical features, treatment status, and total and daily costs were recorded. Trends in the percentage of traumatic spinal cord injuries among all hospitalized patients and among patients hospitalized in the orthopedic department, and trends in cost of care, were assessed by annual percentage change using the Joinpoint Regression Program. The percentage of traumatic spinal cord injuries among all hospitalized patients and among patients hospitalized in the orthopedic department did not significantly change overall (annual percentage change, -0.5% and 2.1%, respectively). A total of 10,053 (74.7%) patients underwent surgery. Only 2.8% of patients who underwent surgery did so within 24 hours of injury. A total of 2005 (14.9%) patients were treated with high-dose (≥ 500 mg) methylprednisolone sodium succinate/methylprednisolone (MPSS/MP); 615 (4.6%) received it within 8 hours. The total cost for acute traumatic spinal cord injury decreased over the study period (-4.7%), while daily cost did not significantly change (1.0% increase). Our findings indicate that public health initiatives should aim at improving hospitals' ability to complete early surgery within 24 hours, which is associated with improved sensorimotor recovery, and at increasing awareness of clinical guidelines on high-dose MPSS/MP to reduce the use of this treatment, for which evidence is insufficient.
Funding: Supported by the 74th General Support of China Postdoctoral Science Foundation, No. 2023M740675; the National Natural Science Foundation of China, No. 82170555; Shanghai Academic/Technology Research Leader, No. 22XD1422400; Shuguang Program of Shanghai Education Development Foundation and Shanghai Municipal Education Commission, No. 2022SG06; and Shanghai "Rising Stars of Medical Talent" Youth Development Program, No. 20224Z0005.
Abstract: BACKGROUND Gastric cystica profunda (GCP) is a rare condition characterized by cystic dilation of gastric glands within the mucosal and/or submucosal layers. GCP is often linked to, or may progress into, early gastric cancer (EGC). AIM To provide a comprehensive evaluation of the endoscopic features of GCP while assessing the efficacy of endoscopic treatment, thereby offering guidance for diagnosis and treatment. METHODS This retrospective study involved 104 patients with GCP who underwent endoscopic resection. Alongside the collection of demographic and clinical data, regular patient follow-ups were conducted to assess local recurrence. RESULTS Among the 104 patients diagnosed with GCP who underwent endoscopic resection, 12.5% had a history of previous gastric procedures. The most commonly affected site was the cardia (38.5%, n=40). GCP commonly exhibited intraluminal growth (99%), regular presentation (74.0%), and ulcerative mucosa (61.5%). The leading endoscopic feature was the mucosal lesion type (59.6%, n=62). The average maximum diameter was 20.9±15.3 mm, with mucosal involvement in 60.6% (n=63). Procedures lasted 73.9±57.5 min, achieving complete resection in 91.3% (n=95). Recurrence (4.8%) was managed by either surgical intervention (n=1) or endoscopic resection (n=4). Final pathology confirmed that 59.6% of GCP cases were associated with EGC. Univariate analysis indicated that elderly males were more susceptible to GCP associated with EGC. Conversely, multivariate analysis identified lesion morphology and endoscopic features as significant risk factors. Survival analysis demonstrated no statistically significant difference in recurrence between GCP with and without EGC (P=0.72). CONCLUSION The findings suggest that endoscopic resection might serve as an effective and minimally invasive treatment for GCP with or without EGC.
Abstract: Fine-grained recognition of ships based on remote sensing images is crucial to safeguarding maritime rights and interests and maintaining national security. With the emergence of massive high-resolution multi-modality images, the use of multi-modality images for fine-grained recognition has become a promising technology. Fine-grained recognition of multi-modality images imposes higher requirements on the dataset samples. The key to the problem is how to extract and fuse the complementary features of multi-modality images to obtain more discriminative fusion features. The attention mechanism helps the model pinpoint the key information in the image, resulting in a significant improvement in the model's performance. In this paper, a dataset for fine-grained recognition of ships based on visible and near-infrared multi-modality remote sensing images is first proposed, named Dataset for Multimodal Fine-grained Recognition of Ships (DMFGRS). It includes 1,635 pairs of visible and near-infrared remote sensing images divided into 20 categories, collated from the digital orthophoto model provided by commercial remote sensing satellites. DMFGRS provides two types of annotation format files, as well as segmentation mask images corresponding to the ship targets. Then, a Multimodal Information Cross-Enhancement Network (MICE-Net), which fuses features of visible and near-infrared remote sensing images, is proposed. In the network, a dual-branch feature extraction and fusion module is designed to obtain more expressive features. The Feature Cross Enhancement Module (FCEM) achieves fusion enhancement of the two modal features by making channel attention and spatial attention work cross-functionally on the feature map. A benchmark is established by evaluating state-of-the-art object recognition algorithms on DMFGRS. In experiments on DMFGRS, MICE-Net reached a precision, recall, mAP0.5 and mAP0.5:0.95 of 87%, 77.1%, 83.8% and 63.9%, respectively. Extensive experiments demonstrate that the proposed MICE-Net performs better on DMFGRS. Built on the lightweight YOLO network, the model has excellent generalizability and thus good potential for application in real-life scenarios.
Funding: The authors are highly thankful to the National Social Science Foundation of China (20BXW101, 18XXW015); the Innovation Research Project for the Cultivation of High-Level Scientific and Technological Talents (Top-Notch Talents of the Discipline) (ZZKY2022303); the National Natural Science Foundation of China (Nos. 62102451, 62202496); and the Basic Frontier Innovation Project of Engineering University of People's Armed Police (WJX202316). This work is also supported by the National Natural Science Foundation of China (No. 62172436); Engineering University of PAP's Funding for Scientific Research Innovation Team; Engineering University of PAP's Funding for Basic Scientific Research; Engineering University of PAP's Funding for Education and Teaching; and the Natural Science Foundation of Shaanxi Province (No. 2023-JCYB-584).
Abstract: With the rapid spread of Internet information and of fake news, the detection of fake news is becoming more and more important. Traditional detection methods often rely on a single emotional or semantic feature to identify fake news, but these methods have limitations when dealing with news in specific domains. To solve the problem of weak feature correlation between data from different domains, a model for detecting fake news by integrating domain-specific emotional and semantic features is proposed. This method makes full use of the attention mechanism, grasps the correlation between different features, and effectively improves the effect of feature fusion. The algorithm first extracts the semantic features of news text through a Bi-LSTM (Bidirectional Long Short-Term Memory) layer to capture the contextual relevance of the text. Senta-BiLSTM is then used to extract emotional features and predict the probability of positive and negative emotions in the text. Domain features are then used as an enhancement feature, together with the attention mechanism, to capture more fine-grained emotional features associated with that domain. Finally, the fused features are taken as the input of the fake news detection classifier, combined with the multi-task representation of information, and an MLP and the Softmax function are used for classification. The experimental results show that on the Chinese dataset Weibo21, the F1 value of this model is 0.958, 4.9% higher than that of the second-best model; on the English dataset FakeNewsNet, the F1 value of this model is 0.845, 1.8% higher than that of the second-best model, indicating that the approach is effective and feasible.
Funding: Supported by the Competitive Research Fund of the University of Aizu, Japan.
Abstract: Sign language recognition is vital for enhancing communication accessibility among the Deaf and hard-of-hearing communities. In Japan, approximately 360,000 individuals with hearing and speech disabilities rely on Japanese Sign Language (JSL) for communication. However, existing JSL recognition systems have faced significant performance limitations due to inherent complexities. In response to these challenges, we present a novel JSL recognition system that employs a strategic fusion approach, combining joint skeleton-based handcrafted features and pixel-based deep learning features. Our system incorporates two distinct streams: the first stream extracts crucial handcrafted features, emphasizing the capture of hand and body movements within JSL gestures, while the second, a deep learning-based transfer learning stream, captures hierarchical representations of JSL gestures. We then concatenate the critical information of the first stream with the hierarchical features of the second stream to produce multi-level fusion features, aiming to create a comprehensive representation of the JSL gestures. After reducing the dimensionality of the features, a feature selection approach and a kernel-based support vector machine (SVM) are used for classification. To assess the effectiveness of our approach, we conducted extensive experiments on our Lab JSL dataset and a publicly available Arabic sign language (ArSL) dataset. Our results demonstrate that our fusion approach significantly enhances JSL recognition accuracy and robustness compared to individual feature sets or traditional recognition methods.
Funding: National Key Research and Development Program of China (2022YFC3502302), National Natural Science Foundation of China (82074580), and Graduate Research Innovation Program of Jiangsu Province (KYCX23_2078).
Abstract: Objective To construct a precise model for identifying traditional Chinese medicine (TCM) constitutions, thereby offering optimized guidance for clinical diagnosis and treatment planning, and ultimately enhancing medical efficiency and treatment outcomes. Methods First, TCM full-body inspection data acquisition equipment was employed to collect full-body standing images of healthy people, from which the constitutions were labelled and defined in accordance with the Constitution in Chinese Medicine Questionnaire (CCMQ), and a dataset of labelled constitutions was constructed. Second, the hue-saturation-value (HSV) color space and an improved local binary patterns (LBP) algorithm were used to extract features such as facial complexion and body shape. In addition, a dual-branch deep network was employed to collect deep features from the full-body standing images. Last, the random forest (RF) algorithm was used to learn the extracted multifeatures, which were subsequently employed to establish a TCM constitution identification model. Accuracy, precision, and F1 score were the three measures selected to assess the performance of the model. Results The accuracy, precision, and F1 score of the proposed multifeature model for identifying TCM constitutions were 0.842, 0.868, and 0.790, respectively. In comparison with identification models that use a single feature (the facial complexion feature, the body shape feature, or the deep features alone), the accuracy of the model incorporating all of these features was higher by 0.105, 0.105, and 0.079; the precision by 0.164, 0.164, and 0.211; and the F1 score by 0.071, 0.071, and 0.084, respectively. Conclusion The research findings affirmed the viability of the proposed model, which incorporates multifeatures including the facial complexion feature, the body shape feature, and the deep feature. In addition, by employing the proposed model, the objectivity and intelligence of constitution identification in TCM practice could be improved.
Funding: Supported by the BK21 FOUR Program of the National Research Foundation of Korea funded by the Ministry of Education (NRF5199991014091). Seok-Won Lee's work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) under the Artificial Intelligence Convergence Innovation Human Resources Development (IITP-2024-RS-2023-00255968) grant funded by the Korea government (MSIT).
Abstract: Although sentiment analysis is pivotal to understanding user preferences, existing models face significant challenges in handling context-dependent sentiments, sarcasm, and nuanced emotions. This study addresses these challenges by integrating ontology-based methods with deep learning models, thereby enhancing sentiment analysis accuracy in complex domains such as film reviews and restaurant feedback. The framework comprises explicit topic recognition, followed by implicit topic identification to mitigate topic interference in subsequent sentiment analysis. In the context of sentiment analysis, we develop an expanded sentiment lexicon based on domain-specific corpora by leveraging techniques such as word-frequency analysis and word embedding. Furthermore, we introduce a sentiment recognition method based on both ontology-derived sentiment features and sentiment lexicons. We evaluate the performance of our system using a dataset of 10,500 restaurant reviews, focusing on sentiment classification accuracy. The incorporation of specialized lexicons and ontology structures enables the framework to discern subtle sentiment variations and context-specific expressions, thereby improving the overall sentiment-analysis performance. Experimental results demonstrate that the integration of ontology-based methods and deep learning models significantly improves sentiment analysis accuracy.
Funding: Supported by the National Natural Science Foundation of China (Nos. U22A2034, 62177047), the High Caliber Foreign Experts Introduction Plan funded by MOST, and the Central South University Research Programme of Advanced Interdisciplinary Studies (No. 2023QYJC020).
Abstract: Image captioning has gained increasing attention in recent years. Visual characteristics found in input images play a crucial role in generating high-quality captions. Prior studies have used visual attention mechanisms to dynamically focus on localized regions of the input image, improving the effectiveness of identifying relevant image regions at each step of caption generation. However, giving image captioning models the capability to select the most relevant visual features from the input image and attend to them can significantly improve the utilization of these features, which in turn enhances captioning network performance. In light of this, we present an image captioning framework that efficiently exploits the extracted representations of the image. Our framework comprises three key components: the Visual Feature Detector module (VFD), the Visual Feature Visual Attention module (VFVA), and the language model. The VFD module detects a subset of the most pertinent features from the local visual features, creating an updated visual features matrix. Subsequently, the VFVA directs its attention to the visual features matrix generated by the VFD, resulting in an updated context vector that the language model uses to generate an informative description. Integrating the VFD and VFVA modules introduces an additional layer of processing for the visual features, thereby enhancing the image captioning model's performance. Using the MS-COCO dataset, our experiments show that the proposed framework competes well with state-of-the-art methods, effectively leveraging visual representations to improve performance. The implementation code can be found here: https://github.com/althobhani/VFDICM (accessed on 30 July 2024).
Funding: The Open Project of Sichuan Provincial Key Laboratory of Philosophy and Social Science for Language Intelligence in Special Education under Grant No. YYZN-2023-4 and the Ph.D. Fund of Chengdu Technological University under Grant No. 2020RC002.
Abstract: The fingerprinting-based approach using the wireless local area network (WLAN) is widely used for indoor localization. However, the construction of the fingerprint database is quite time-consuming. In particular, when the position of an access point (AP) or a wall changes, updating the fingerprint database in real time is difficult. Location-based services (LBSs) applications call for an indoor localization approach that has a low implementation cost, excellent real-time performance, and high localization accuracy, and that fully considers complex indoor environment factors. In this paper, we propose a fine-grained grid computing (FGGC) model to achieve decimeter-level localization accuracy. Reference points (RPs) are generated in the grid by the FGGC model. Then, the received signal strength (RSS) values at each RP are calculated with attenuation factors such as the frequency band, the three-dimensional propagation distance, and walls in complex environments. As a result, the fingerprint database can be established automatically without manual measurement, and the FGGC model builds the fingerprint database more efficiently and at lower cost than previous methods. The proposed indoor localization approach, which estimates the position step by step from the approximate grid location to the fine-grained location, can achieve higher real-time performance and localization accuracy simultaneously. The mean error of the proposed model is 0.36 m, far lower than that of previous approaches. Thus, the proposed model is a feasible way to improve the efficiency and accuracy of Wi-Fi indoor localization. It also shows high accuracy with a fast running speed even under a large-size grid. The results indicate that the proposed method can also be suitable for precision marketing, indoor navigation, and emergency rescue.
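The idea of computing RSS at grid reference points instead of measuring it can be sketched as follows. This is not the FGGC model itself, whose details the abstract does not give: the sketch assumes a standard free-space path loss term (distance in metres, frequency in MHz) plus a fixed per-wall loss, and the function names, the 5 dB wall loss, the 20 dBm transmit power, and the grid layout are all hypothetical.

```python
import math

def predicted_rss(tx_power_dbm, ap_xyz, rp_xyz, freq_mhz,
                  walls_crossed, wall_loss_db=5.0):
    """Estimate RSS (dBm) at a reference point from one access point.

    Assumed model: free-space path loss over the 3-D distance, plus a
    fixed attenuation per wall crossed (hypothetical value).
    """
    distance_m = max(math.dist(ap_xyz, rp_xyz), 0.1)  # avoid log10(0)
    fspl_db = 20 * math.log10(distance_m) + 20 * math.log10(freq_mhz) - 27.55
    return tx_power_dbm - fspl_db - walls_crossed * wall_loss_db

def build_fingerprints(ap_xyz, width_m, depth_m, step_m, freq_mhz,
                       tx_power_dbm=20.0, rp_height_m=1.0):
    """Generate grid reference points and a computed RSS value for each.

    Walls are ignored here (walls_crossed=0) to keep the sketch short.
    """
    fingerprints = {}
    for ix in range(int(width_m / step_m) + 1):
        for iy in range(int(depth_m / step_m) + 1):
            rp = (ix * step_m, iy * step_m, rp_height_m)
            fingerprints[rp] = predicted_rss(tx_power_dbm, ap_xyz, rp,
                                             freq_mhz, walls_crossed=0)
    return fingerprints

# Hypothetical 4 m x 2 m room, 1 m grid, one 2.4 GHz AP near the ceiling.
database = build_fingerprints((0.0, 0.0, 2.5), 4.0, 2.0, 1.0, 2400.0)
print(len(database))  # number of reference points in the grid
```

Because every entry is computed rather than surveyed, moving the access point or adding a wall only requires rerunning the calculation, which is the property the abstract emphasizes.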
Abstract: The number of blogs and other forms of opinionated online content has increased dramatically in recent years. Many fields, including academia and national security, place an emphasis on automated detection of the orientation of political articles. Political articles (especially in the Arab world) differ from other articles in their subjectivity: the author's beliefs and political affiliation can strongly influence the text. With categories representing the main political ideologies, this problem can be treated as an instance of text categorization (classification). In general, the performance of machine learning models for text classification is sensitive to hyperparameter settings. Furthermore, the feature vector used to represent a document must capture, to some extent, the complex semantics of natural language. To this end, this paper presents an intelligent system for detecting the orientation of political Arabic articles that adapts the categorical boosting (CatBoost) method and combines it with a multi-level feature concept. Extracting features at multiple levels can enhance the model's ability to discriminate between different classes or patterns. Each level may capture different aspects of the input data, contributing to a more comprehensive representation. CatBoost, a robust and efficient gradient-boosting algorithm, is used to learn and predict the complex relationships between these features and the political orientation labels associated with the articles. A dataset of political Arabic texts collected from diverse sources, including postings and articles, is used to assess the proposed technique. The texts fall into three categories: conservative, reform, and revolutionary. The results demonstrate that, compared with other frequently used machine learning models for text classification, the CatBoost method using multi-level features performs better, with an accuracy of 98.14%.
Funding: supported by the Wuyi University Hong Kong and Macao Joint Research and Development Fund (Grant Nos. 2021WGALH15, 2019WGALH17, 2019WGALH15); the National Natural Science Foundation of China-Guangdong Joint Fund (Grant No. 2019A1515111052); the National Natural Science Foundation of China (Grant No. 52202426); a grant from the Research Grants Council (RGC) of the Hong Kong Special Administrative Region (SAR), China (Grant No. 15205723); and a grant from the Hong Kong Polytechnic University (Grant No. P0045325).
Abstract: The safety and stability of high-speed maglev trains traveling on viaducts in crosswinds critically depend on their aerodynamic characteristics. Therefore, this paper uses an improved delayed detached eddy simulation (IDDES) method to investigate the aerodynamic features of high-speed maglev trains with different marshaling lengths under crosswinds. The effects of marshaling length (varying from 3-car to 8-car groups) on the train's aerodynamic performance, surface pressure, and surrounding flow field were investigated using the three-dimensional unsteady compressible Navier-Stokes (N-S) equations. The results showed that the marshaling length had minimal influence on the aerodynamic performance of the head and middle cars. Conversely, the marshaling length is negatively correlated with the time-average side force coefficient (CS) and time-average lift force coefficient (Cl) of the tail car. Compared with the tail car of the 3-car group, the CS and Cl of the tail car of the 8-car group fell by 27.77% and 18.29%, respectively. More attention should be paid to the operational safety of the head car, as it exhibits the highest time-average CS. Additionally, the mean pressure difference between the two sides of the tail car body increased with the marshaling length, and the side force direction on the tail car was opposite to that on the head and middle cars. Furthermore, the turbulent kinetic energy of the wake structure on the windward side decreased quickly as the marshaling length increased.
Funding: The study protocol was approved by the Clinical Research Ethics Committee of Honghui Hospital, Xi'an Jiaotong University (No. 202401004).
Abstract: BACKGROUND Duodenal neuroendocrine tumours (DNETs) are rare neoplasms. However, the incidence of DNETs has been increasing in recent years, especially as an incidental finding during endoscopic studies. Regrettably, there is no consensus regarding the ideal treatment of DNETs, and there are few studies on their clinical features and survival. AIM To analyze the clinical characteristics and prognostic factors of patients with duodenal neuroendocrine tumours. METHODS The clinical data of DNETs diagnosed in the First Affiliated Hospital of Air Force Military Medical University from June 2011 to July 2022 were collected. Neuroendocrine tumours located in the ampulla area of the duodenum were assigned to the ampullary region group; neuroendocrine tumours in any part of the duodenum outside the ampullary area were assigned to the nonampullary region group. In this retrospective study, the clinical characteristics of the two groups and the risk factors affecting the survival of DNET patients were analysed. RESULTS Twenty-nine DNET patients were screened. The male to female ratio was 1:1.9, and females comprised the majority. The ampullary region group accounted for 24.1% (7/29), while the nonampullary region group accounted for 75.9% (22/29). At diagnosis, the main clinical symptom in the ampullary region group was abdominal pain (85.7%), while that in the nonampullary region group was abdominal distension (59.1%). The composition of tumour staging differed between the two groups (Fisher's exact test, P=0.001), with stage II (68.2%) being the main stage among nonampullary tumours (P<0.05). After the diagnosis of DNETs, the survival rate of the ampullary region group was 14.3% (1/7), lower than the 72.7% (16/22) in the nonampullary region group (Fisher's exact test, P=0.011). The survival time of the ampullary region group was shorter than that of the nonampullary region group (P<0.000). The median survival time of the ampullary region group was 10.0 months and that of the nonampullary region group was 451.0 months. Multivariate analysis showed that tumours in the ampulla region and no surgical treatment after diagnosis were independent risk factors for the survival of DNET patients (HR=0.029, 95% CI: 0.004-0.199, P<0.000; HR=12.609, 95% CI: 2.889-55.037, P=0.001). Further analysis of nonampullary DNET patients showed that the survival time of patients with a tumour diameter <2 cm was longer than that of patients with a tumour diameter ≥2 cm (t=7.243, P=0.048). At follow-up, the 6 patients who died of nonampullary DNETs had a tumour diameter ≥2 cm, and 3 patients in stage IV had liver metastasis. Patients with a tumour diameter <2 cm underwent surgical treatment, and all survived after surgery. CONCLUSION Surgical treatment is a protective factor for prolonging the survival of DNET patients. Compared with patients with DNETs in the ampullary region, patients in the nonampullary region group had a longer survival period. The liver is the organ most susceptible to distant metastasis of nonampullary DNETs.
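The survival comparison reported above (1 of 7 survivors in the ampullary group versus 16 of 22 in the nonampullary group) can be checked with a short sketch of a two-sided Fisher's exact test. This uses the common convention of summing all tables that are no more probable than the observed one. Which two-sided convention the authors' software applied is not stated, so that choice is an assumption, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d

    def weight(k):
        # Unnormalised probability that the top-left cell equals k;
        # kept as an integer so the comparison below is exact.
        return comb(col1, k) * comb(n - col1, row1 - k)

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    observed = weight(a)
    extreme = sum(weight(k) for k in range(lowest, highest + 1)
                  if weight(k) <= observed)
    return extreme / comb(n, row1)


if __name__ == "__main__":
    # Rows: ampullary, nonampullary; columns: survived, died.
    p_value = fisher_exact_two_sided(1, 6, 16, 6)
    print(f"p = {p_value:.3f}")  # p = 0.011
```

Under this convention the result agrees with the reported P=0.011 for the survival-rate comparison.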
Funding: supported by the Research Grants Council of Hong Kong (CityU 11305919 and 11308620); the NSFC/RGC Joint Research Scheme N_CityU104/19; and the Hong Kong Research Grant Council Collaborative Research Fund (C1002-21G and C1017-22G).
Abstract: Electrocatalytic nitrogen reduction to ammonia has garnered significant attention with the rise of single-atom catalysts (SACs), which show potential for sustainable and energy-efficient ammonia production. However, cost-effectively designing and screening efficient electrocatalysts remains a challenge. In this study, we establish interpretable machine learning (ML) models that evaluate the catalytic activity of SACs by directly and accurately predicting the reaction Gibbs free energy. Our models were trained using non-density functional theory (DFT) calculated features from a dataset comprising 90 graphene-supported SACs. Our results underscore the superior prediction accuracy of the gradient boosting regression (GBR) model for both ΔG(N_(2)→NNH) and ΔG(NH_(2)→NH_(3)), with coefficient of determination (R^(2)) scores of 0.972 and 0.984 and root mean square errors (RMSE) of 0.051 and 0.085 eV, respectively. Moreover, feature importance analysis shows that the high accuracy of the GBR model stems from its capture of characteristics pertinent to the active center and coordination environment, revealing the significance of elementary descriptors, with the covalent radius playing a dominant role. Additionally, Shapley additive explanations (SHAP) analysis provides global and local interpretation of the working mechanism of the GBR model. Our analysis identifies that a pyrrole-type coordination (flag=0), d-orbitals with a moderate occupation (N_(d)=5), and a moderate difference in covalent radius (r_(TM-ave) near 140 pm) are conducive to high activity. Furthermore, we extend the activity prediction to more catalysts without additional DFT calculations, validating the reliability of our feature engineering, model training, and design strategy. These findings not only highlight a new opportunity for accelerating catalyst design using non-DFT calculated features, but also shed light on the working mechanism of the "black box" ML model. Moreover, the model provides valuable guidance for catalytic material design in multiple proton-electron coupling reactions, particularly in driving sustainable CO_(2), O_(2), and N_(2) conversion.
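For readers unfamiliar with the two reported metrics, the sketch below shows how R^(2) and RMSE are computed from true and predicted values. The sample numbers are made up for illustration and are not taken from the study. The function names are hypothetical.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error, in the same unit as the target (eV here)."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Illustrative values only: hypothetical free energies in eV.
    dft_values = [0.0, 1.0, 2.0]
    ml_values = [0.0, 1.0, 3.0]
    print(r2_score(dft_values, ml_values))
    print(rmse(dft_values, ml_values))
```

An R^(2) close to 1 together with an RMSE that is small relative to the spread of the target values is what the reported scores describe.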
Funding: supported by the National Natural Science Foundation of China under Grant Nos. 62271125 and 62273071; by the Sichuan Science and Technology Program (Nos. 2022YFG0038, 2021YFG0018); by the Xinjiang Science and Technology Program (No. 2022273061); and by the Fundamental Research Funds for the Central Universities (Nos. ZYGX2020ZB034, ZYGX2021J019).
Abstract: With the adoption of cutting-edge communication technologies such as 5G/6G systems and the extensive development of devices, crowdsensing systems in the Internet of Things (IoT) are now conducting complicated video analysis tasks such as behaviour recognition. These applications have dramatically increased the diversity of IoT systems. Specifically, behaviour recognition in videos usually requires a combinatorial analysis of the spatial information about objects and information about their dynamic actions in the temporal dimension. Behaviour recognition may even rely more on the modeling of temporal information containing short-range and long-range motions, in contrast to computer vision tasks involving images that focus on understanding spatial information. However, current solutions fail to jointly and comprehensively analyse short-range motions between adjacent frames and long-range temporal aggregations at large scales in videos. In this paper, we propose a novel behaviour recognition method based on the integration of multigranular (IMG) motion features, which can provide support for deploying video analysis in multimedia IoT crowdsensing systems. In particular, we achieve reliable motion information modeling by integrating a channel attention-based short-term motion feature enhancement module (CSEM) and a cascaded long-term motion feature integration module (CLIM). We evaluate our model on several action recognition benchmarks, such as HMDB51, Something-Something and UCF101. The experimental results demonstrate that our approach outperforms the previous state-of-the-art methods, which confirms its effectiveness and efficiency.
Funding: supported by the National Natural Science Foundation of China (No. U2142206).
Abstract: Numerical weather prediction (NWP) models have long shown large forecast errors in surface wind speed over regions with complex terrain. In this study, surface wind forecasts from an operational NWP model, the SMS-WARR (Shanghai Meteorological Service-WRF ADAS Rapid Refresh System), are analyzed to quantify the relationships between forecast surface wind speed errors and terrain features, with the intent of providing clues for better applying the NWP model to complex terrain regions. The terrain features are described by three parameters: the standard deviation of the model grid-scale orography, the terrain height error of the model, and the slope angle. The results show that the forecast bias has a unimodal distribution with respect to the standard deviation of orography. The minimum ME (the mean value of bias) is 1.2 m s^(-1), which occurs when the standard deviation is between 60 and 70 m. A positive correlation exists between bias and terrain height error, with the ME increasing by 10% to 30% for every 200 m increase in terrain height error. For uphill winds, the ME decreases by 65.6% when the slope angle increases from 0.5°-1.5° to larger than 3.5°; for downhill winds, it increases by 35.4% when the absolute value of the slope angle increases from 0.5°-1.5° to 2.5°-3.5°. Several sensitivity experiments are carried out with a model output statistics (MOS) calibration model for surface wind speeds; introducing terrain parameters reduces the ME by 90% and the RMSE by 30%, demonstrating the value of this study.
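The kind of analysis described above, relating forecast bias to a terrain parameter, can be sketched as a mean error (ME) computed within bins of the orography standard deviation. This is a minimal illustration, not the study's procedure. The 10 m bin width, the function name, and the sample values are assumptions made for the example.

```python
from collections import defaultdict


def mean_error_by_bin(forecast, observed, terrain_std_m, bin_width_m=10.0):
    """Mean forecast-minus-observation error grouped by terrain bin.

    Each sample is assigned to the bin whose lower edge is the largest
    multiple of bin_width_m not exceeding its orography standard deviation.
    Returns a dict mapping bin lower edge (m) to mean error (m/s).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for fcst, obs, std in zip(forecast, observed, terrain_std_m):
        lower_edge = int(std // bin_width_m) * bin_width_m
        sums[lower_edge] += fcst - obs
        counts[lower_edge] += 1
    return {edge: sums[edge] / counts[edge] for edge in sorted(sums)}


if __name__ == "__main__":
    # Made-up wind speeds (m/s) and orography standard deviations (m).
    fcst = [5.0, 6.0, 4.0]
    obs = [3.0, 4.0, 3.0]
    std = [62.0, 68.0, 15.0]
    print(mean_error_by_bin(fcst, obs, std))
```

Plotting the resulting ME against the bin edges is one way to look for a unimodal relationship of the kind the abstract reports.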
Abstract: Since its discovery in 2009, Candida auris has become a severe threat to human health because of its rapid spread and its high worldwide resistance to systemic antifungal drugs. In resource-constrained settings, where several conditions favour its emergence and spread, this worrisome fungus could cause large hospital and/or community-based outbreaks. This review aimed to summarize the available data on C. auris in Africa, focusing on its epidemiology and antifungal resistance profile. Major databases were searched for articles on the epidemiology and antifungal resistance profile of C. auris in Africa. Out of 2,521 articles identified, 22 met the inclusion criteria. Nearly 89% of African countries have no published data on C. auris. The prevalence of C. auris in Africa was 8.74%. The case fatality rate of C. auris infection in Africa was 39.46%. The main C. auris risk factors reported in Africa were cardiovascular disease, renal failure, diabetes, HIV, recent intake of antimicrobial drugs, ICU admissions, surgery, hemodialysis, parenteral nutrition and indwelling devices. Four phylogenetic clades were reported in Africa, namely clades I, II, III and IV. Across Africa, Candida auris showed a very high resistance rate to fluconazole, moderate resistance to amphotericin B, and high susceptibility to echinocandins. Finally, C. auris clade-specific mutations were observed within the ERG2, ERG3, ERG9, ERG11, FKS1, TAC1b and MRR1 genes in Africa. This systematic review showed the presence of C. auris on the African continent and a worrying unavailability of data on this resilient fungus in most African countries.