Significant advancements have been achieved in road surface extraction based on high-resolution remote sensingimage processing. Most current methods rely on fully supervised learning, which necessitates enormous human...Significant advancements have been achieved in road surface extraction based on high-resolution remote sensingimage processing. Most current methods rely on fully supervised learning, which necessitates enormous humaneffort to label the image. Within this field, other research endeavors utilize weakly supervised methods. Theseapproaches aim to reduce the expenses associated with annotation by leveraging sparsely annotated data, such asscribbles. This paper presents a novel technique called a weakly supervised network using scribble-supervised andedge-mask (WSSE-net). This network is a three-branch network architecture, whereby each branch is equippedwith a distinct decoder module dedicated to road extraction tasks. One of the branches is dedicated to generatingedge masks using edge detection algorithms and optimizing road edge details. The other two branches supervise themodel’s training by employing scribble labels and spreading scribble information throughout the image. To addressthe historical flaw that created pseudo-labels that are not updated with network training, we use mixup to blendprediction results dynamically and continually update new pseudo-labels to steer network training. Our solutiondemonstrates efficient operation by simultaneously considering both edge-mask aid and dynamic pseudo-labelsupport. The studies are conducted on three separate road datasets, which consist primarily of high-resolutionremote-sensing satellite photos and drone images. The experimental findings suggest that our methodologyperforms better than advanced scribble-supervised approaches and specific traditional fully supervised methods.展开更多
Accurate forecasting of time series is crucial across various domains.Many prediction tasks rely on effectively segmenting,matching,and time series data alignment.For instance,regardless of time series with the same g...Accurate forecasting of time series is crucial across various domains.Many prediction tasks rely on effectively segmenting,matching,and time series data alignment.For instance,regardless of time series with the same granularity,segmenting them into different granularity events can effectively mitigate the impact of varying time scales on prediction accuracy.However,these events of varying granularity frequently intersect with each other,which may possess unequal durations.Even minor differences can result in significant errors when matching time series with future trends.Besides,directly using matched events but unaligned events as state vectors in machine learning-based prediction models can lead to insufficient prediction accuracy.Therefore,this paper proposes a short-term forecasting method for time series based on a multi-granularity event,MGE-SP(multi-granularity event-based short-termprediction).First,amethodological framework for MGE-SP established guides the implementation steps.The framework consists of three key steps,including multi-granularity event matching based on the LTF(latest time first)strategy,multi-granularity event alignment using a piecewise aggregate approximation based on the compression ratio,and a short-term prediction model based on XGBoost.The data from a nationwide online car-hailing service in China ensures the method’s reliability.The average RMSE(root mean square error)and MAE(mean absolute error)of the proposed method are 3.204 and 2.360,lower than the respective values of 4.056 and 3.101 obtained using theARIMA(autoregressive integratedmoving average)method,as well as the values of 4.278 and 2.994 obtained using k-means-SVR(support vector regression)method.The other experiment is conducted on stock data froma public data set.The proposed method achieved an average RMSE and MAE of 0.836 and 0.696,lower than the respective values of 1.019 and 0.844 obtained using the ARIMA method,as well as the values of 1.350 and 1.172 obtained using the k-means-SVR method.展开更多
Extracting valuable information frombiomedical texts is one of the current research hotspots of concern to a wide range of scholars.The biomedical corpus contains numerous complex long sentences and overlapping relati...Extracting valuable information frombiomedical texts is one of the current research hotspots of concern to a wide range of scholars.The biomedical corpus contains numerous complex long sentences and overlapping relational triples,making most generalized domain joint modeling methods difficult to apply effectively in this field.For a complex semantic environment in biomedical texts,in this paper,we propose a novel perspective to perform joint entity and relation extraction;existing studies divide the relation triples into several steps or modules.However,the three elements in the relation triples are interdependent and inseparable,so we regard joint extraction as a tripartite classification problem.At the same time,fromthe perspective of triple classification,we design amulti-granularity 2D convolution to refine the word pair table and better utilize the dependencies between biomedical word pairs.Finally,we use a biaffine predictor to assist in predicting the labels of word pairs for relation extraction.Our model(MCTPL)Multi-granularity Convolutional Tokens Pairs of Labeling better utilizes the elements of triples and improves the ability to extract overlapping triples compared to previous approaches.Finally,we evaluated our model on two publicly accessible datasets.The experimental results show that our model’s ability to extract relation triples on the CPI dataset improves the F1 score by 2.34%compared to the current optimal model.On the DDI dataset,the F1 value improves the F1 value by 1.68%compared to the current optimal model.Our model achieved state-of-the-art performance compared to other baseline models in biomedical text entity relation extraction.展开更多
Recently,weak supervision has received growing attention in the field of salient object detection due to the convenience of labelling.However,there is a large performance gap between weakly supervised and fully superv...Recently,weak supervision has received growing attention in the field of salient object detection due to the convenience of labelling.However,there is a large performance gap between weakly supervised and fully supervised salient object detectors because the scribble annotation can only provide very limited foreground/background information.Therefore,an intuitive idea is to infer annotations that cover more complete object and background regions for training.To this end,a label inference strategy is proposed based on the assumption that pixels with similar colours and close positions should have consistent labels.Specifically,k-means clustering algorithm was first performed on both colours and coordinates of original annotations,and then assigned the same labels to points having similar colours with colour cluster centres and near coordinate cluster centres.Next,the same annotations for pixels with similar colours within each kernel neighbourhood was set further.Extensive experiments on six benchmarks demonstrate that our method can significantly improve the performance and achieve the state-of-the-art results.展开更多
Text classification,by automatically categorizing texts,is one of the foundational elements of natural language processing applications.This study investigates how text classification performance can be improved throu...Text classification,by automatically categorizing texts,is one of the foundational elements of natural language processing applications.This study investigates how text classification performance can be improved through the integration of entity-relation information obtained from the Wikidata(Wikipedia database)database and BERTbased pre-trained Named Entity Recognition(NER)models.Focusing on a significant challenge in the field of natural language processing(NLP),the research evaluates the potential of using entity and relational information to extract deeper meaning from texts.The adopted methodology encompasses a comprehensive approach that includes text preprocessing,entity detection,and the integration of relational information.Experiments conducted on text datasets in both Turkish and English assess the performance of various classification algorithms,such as Support Vector Machine,Logistic Regression,Deep Neural Network,and Convolutional Neural Network.The results indicate that the integration of entity-relation information can significantly enhance algorithmperformance in text classification tasks and offer new perspectives for information extraction and semantic analysis in NLP applications.Contributions of this work include the utilization of distant supervised entity-relation information in Turkish text classification,the development of a Turkish relational text classification approach,and the creation of a relational database.By demonstrating potential performance improvements through the integration of distant supervised entity-relation information into Turkish text classification,this research aims to support the effectiveness of text-based artificial intelligence(AI)tools.Additionally,it makes significant contributions to the development ofmultilingual text classification systems by adding deeper meaning to text content,thereby providing a valuable addition to current NLP studies and setting an important reference point for future research.展开更多
The two universes multi-granularity fuzzy rough set model is an effective tool for handling uncertainty problems between two domains with the help of binary fuzzy relations. This article applies the idea of neighborho...The two universes multi-granularity fuzzy rough set model is an effective tool for handling uncertainty problems between two domains with the help of binary fuzzy relations. This article applies the idea of neighborhood rough sets to two universes multi-granularity fuzzy rough sets, and discusses the two-universes multi-granularity neighborhood fuzzy rough set model. Firstly, the upper and lower approximation operators are defined in the two universes multi-granularity neighborhood fuzzy rough set model. Secondly, the properties of the upper and lower approximation operators are discussed. Finally, the properties of the two universes multi-granularity neighborhood fuzzy rough set model are verified through case studies.展开更多
To effectively deal with fuzzy and uncertain information in public engineering emergencies,an emergency decision-making method based on multi-granularity language information is proposed.Firstly,decision makers select...To effectively deal with fuzzy and uncertain information in public engineering emergencies,an emergency decision-making method based on multi-granularity language information is proposed.Firstly,decision makers select the appropriate language phrase set according to their own situation,give the preference information of the weight of each key indicator,and then transform the multi-granularity language information through consistency.On this basis,the sequential optimization technology of the approximately ideal scheme is introduced to obtain the weight coefficient of each key indicator.Subsequently,the weighted average operator is used to aggregate the preference information of each alternative scheme with the relative importance of decision-makers and the weight of key indicators in sequence,and the comprehensive evaluation value of each scheme is obtained to determine the optimal scheme.Lastly,the effectiveness and practicability of the method are verified by taking the earthwork collapse accident in the construction of a reservoir as an example.展开更多
The federated self-supervised framework is a distributed machine learning method that combines federated learning and self-supervised learning, which can effectively solve the problem of traditional federated learning...The federated self-supervised framework is a distributed machine learning method that combines federated learning and self-supervised learning, which can effectively solve the problem of traditional federated learning being difficult to process large-scale unlabeled data. The existing federated self-supervision framework has problems with low communication efficiency and high communication delay between clients and central servers. Therefore, we added edge servers to the federated self-supervision framework to reduce the pressure on the central server caused by frequent communication between both ends. A communication compression scheme using gradient quantization and sparsification was proposed to optimize the communication of the entire framework, and the algorithm of the sparse communication compression module was improved. Experiments have proved that the learning rate changes of the improved sparse communication compression module are smoother and more stable. Our communication compression scheme effectively reduced the overall communication overhead.展开更多
The aim of this paper is to broaden the application of Stochastic Configuration Network (SCN) in the semi-supervised domain by utilizing common unlabeled data in daily life. It can enhance the classification accuracy ...The aim of this paper is to broaden the application of Stochastic Configuration Network (SCN) in the semi-supervised domain by utilizing common unlabeled data in daily life. It can enhance the classification accuracy of decentralized SCN algorithms while effectively protecting user privacy. To this end, we propose a decentralized semi-supervised learning algorithm for SCN, called DMT-SCN, which introduces teacher and student models by combining the idea of consistency regularization to improve the response speed of model iterations. In order to reduce the possible negative impact of unsupervised data on the model, we purposely change the way of adding noise to the unlabeled data. Simulation results show that the algorithm can effectively utilize unlabeled data to improve the classification accuracy of SCN training and is robust under different ground simulation environments.展开更多
Visible-infrared Cross-modality Person Re-identification(VI-ReID)is a critical technology in smart public facilities such as cities,campuses and libraries.It aims to match pedestrians in visible light and infrared ima...Visible-infrared Cross-modality Person Re-identification(VI-ReID)is a critical technology in smart public facilities such as cities,campuses and libraries.It aims to match pedestrians in visible light and infrared images for video surveillance,which poses a challenge in exploring cross-modal shared information accurately and efficiently.Therefore,multi-granularity feature learning methods have been applied in VI-ReID to extract potential multi-granularity semantic information related to pedestrian body structure attributes.However,existing research mainly uses traditional dual-stream fusion networks and overlooks the core of cross-modal learning networks,the fusion module.This paper introduces a novel network called the Augmented Deep Multi-Granularity Pose-Aware Feature Fusion Network(ADMPFF-Net),incorporating the Multi-Granularity Pose-Aware Feature Fusion(MPFF)module to generate discriminative representations.MPFF efficiently explores and learns global and local features with multi-level semantic information by inserting disentangling and duplicating blocks into the fusion module of the backbone network.ADMPFF-Net also provides a new perspective for designing multi-granularity learning networks.By incorporating the multi-granularity feature disentanglement(mGFD)and posture information segmentation(pIS)strategies,it extracts more representative features concerning body structure information.The Local Information Enhancement(LIE)module augments high-performance features in VI-ReID,and the multi-granularity joint loss supervises model training for objective feature learning.Experimental results on two public datasets show that ADMPFF-Net efficiently constructs pedestrian feature representations and enhances the accuracy of VI-ReID.展开更多
A novel multi-granularity flexible-grid switching optical-node architecture is proposed in this paper.In our system,the photonic lanterns are used as mode division multiplexing/demultiplexing(MD-Mux/MD-Demux)for selec...A novel multi-granularity flexible-grid switching optical-node architecture is proposed in this paper.In our system,the photonic lanterns are used as mode division multiplexing/demultiplexing(MD-Mux/MD-Demux)for selecting mode.The wavelength division multiplexer/demultiplexer(WDMux/WD-Demux)and the fiber bragg gratings(FBGs)are used to select wavelength channels with the various grid.The experimental results show that the transmission bandwidth covers the C+L band,the average transmission loss is-13.4 dB,and the average crosstalk is-30.5 dB.The optical-node architecture is suit for mode division multiplexing(MDM)optical communication system.展开更多
The coronavirus disease 2019(COVID-19)has severely disrupted both human life and the health care system.Timely diagnosis and treatment have become increasingly important;however,the distribution and size of lesions va...The coronavirus disease 2019(COVID-19)has severely disrupted both human life and the health care system.Timely diagnosis and treatment have become increasingly important;however,the distribution and size of lesions vary widely among individuals,making it challenging to accurately diagnose the disease.This study proposed a deep-learning disease diagnosismodel based onweakly supervised learning and clustering visualization(W_CVNet)that fused classification with segmentation.First,the data were preprocessed.An optimizable weakly supervised segmentation preprocessing method(O-WSSPM)was used to remove redundant data and solve the category imbalance problem.Second,a deep-learning fusion method was used for feature extraction and classification recognition.A dual asymmetric complementary bilinear feature extraction method(D-CBM)was used to fully extract complementary features,which solved the problem of insufficient feature extraction by a single deep learning network.Third,an unsupervised learning method based on Fuzzy C-Means(FCM)clustering was used to segment and visualize COVID-19 lesions enabling physicians to accurately assess lesion distribution and disease severity.In this study,5-fold cross-validation methods were used,and the results showed that the network had an average classification accuracy of 85.8%,outperforming six recent advanced classification models.W_CVNet can effectively help physicians with automated aid in diagnosis to determine if the disease is present and,in the case of COVID-19 patients,to further predict the area of the lesion.展开更多
Single-molecule force spectroscopy(SMFS)measurements of the dynamics of biomolecules typically require identifying massive events and states from large data sets,such as extracting rupture forces from force-extension ...Single-molecule force spectroscopy(SMFS)measurements of the dynamics of biomolecules typically require identifying massive events and states from large data sets,such as extracting rupture forces from force-extension curves(FECs)in pulling experiments and identifying states from extension-time trajectories(ETTs)in force-clamp experiments.The former is often accomplished manually and hence is time-consuming and laborious while the latter is always impeded by the presence of baseline drift.In this study,we attempt to accurately and automatically identify the events and states from SMFS experiments with a machine learning approach,which combines clustering and classification for event identification of SMFS(ACCESS).As demonstrated by analysis of a series of data sets,ACCESS can extract the rupture forces from FECs containing multiple unfolding steps and classify the rupture forces into the corresponding conformational transitions.Moreover,ACCESS successfully identifies the unfolded and folded states even though the ETTs display severe nonmonotonic baseline drift.Besides,ACCESS is straightforward in use as it requires only three easy-to-interpret parameters.As such,we anticipate that ACCESS will be a useful,easy-to-implement and high-performance tool for event and state identification across a range of single-molecule experiments.展开更多
N-11-azaartemisinins potentially active against Plasmodium falciparum are designed by combining molecular electrostatic potential (MEP), ligand-receptor interaction, and models built with supervised machine learning m...N-11-azaartemisinins potentially active against Plasmodium falciparum are designed by combining molecular electrostatic potential (MEP), ligand-receptor interaction, and models built with supervised machine learning methods (PCA, HCA, KNN, SIMCA, and SDA). The optimization of molecular structures was performed using the B3LYP/6-31G* approach. MEP maps and ligand-receptor interactions were used to investigate key structural features required for biological activities and likely interactions between N-11-azaartemisinins and heme, respectively. The supervised machine learning methods allowed the separation of the investigated compounds into two classes: cha and cla, with the properties ε<sub>LUMO+1</sub> (one level above lowest unoccupied molecular orbital energy), d(C<sub>6</sub>-C<sub>5</sub>) (distance between C<sub>6</sub> and C<sub>5</sub> atoms in ligands), and TSA (total surface area) responsible for the classification. The insights extracted from the investigation developed and the chemical intuition enabled the design of sixteen new N-11-azaartemisinins (prediction set), moreover, models built with supervised machine learning methods were applied to this prediction set. The result of this application showed twelve new promising N-11-azaartemisinins for synthesis and biological evaluation.展开更多
A large variety of complaint reports reflect subjective information expressed by citizens.A key challenge of text summarization for complaint reports is to ensure the factual consistency of generated summary.Therefore...A large variety of complaint reports reflect subjective information expressed by citizens.A key challenge of text summarization for complaint reports is to ensure the factual consistency of generated summary.Therefore,in this paper,a simple and weakly supervised framework considering factual consistency is proposed to generate a summary of city-based complaint reports without pre-labeled sentences/words.Furthermore,it considers the importance of entity in complaint reports to ensure factual consistency of summary.Experimental results on the customer review datasets(Yelp and Amazon)and complaint report dataset(complaint reports of Shenyang in China)show that the proposed framework outperforms state-of-the-art approaches in ROUGE scores and human evaluation.It unveils the effectiveness of our approach to helping in dealing with complaint reports.展开更多
In recent years, the place occupied by the various manifestations of cyber-crime in companies has been considerable. Indeed, due to the rapid evolution of telecommunications technologies, companies, regardless of thei...In recent years, the place occupied by the various manifestations of cyber-crime in companies has been considerable. Indeed, due to the rapid evolution of telecommunications technologies, companies, regardless of their size or sector of activity, are now the target of advanced persistent threats. The Work 2035 study also revealed that cyber crimes (such as critical infrastructure hacks) and massive data breaches are major sources of concern. Thus, it is important for organizations to guarantee a minimum level of security to avoid potential attacks that can cause paralysis of systems, loss of sensitive data, exposure to blackmail, damage to reputation or even a commercial harm. To do this, among other means, hardening is used, the main objective of which is to reduce the attack surface within a company. The execution of the hardening configurations as well as the verification of these are carried out on the servers and network equipment with the aim of reducing the number of openings present by keeping only those which are necessary for proper operation. However, nowadays, in many companies, these tasks are done manually. As a result, the execution and verification of hardening configurations are very often subject to potential errors but also highly consuming human and financial resources. The problem is that it is essential for operators to maintain an optimal level of security while minimizing costs, hence the interest in automating hardening processes and verifying the hardening of servers and network equipment. It is in this logic that we propose within the framework of this work the reinforcement of the security of the information systems (IS) by the automation of the mechanisms of hardening. In our work, we have, on the one hand, set up a hardening procedure in accordance with international security standards for servers, routers and switches and, on the other hand, designed and produced a functional application which makes it possible to: 1) Realise the configuration of the hardening;2) Verify them;3) Correct the non conformities;4) Write and send by mail a verification report for the configurations;5) And finally update the procedures of hardening. Our web application thus created allows in less than fifteen (15) minutes actions that previously took at least five (5) hours of time. This allows supervised network operators to save time and money, but also to improve their security standards in line with international standards.展开更多
Nowadays, in data science, supervised learning algorithms are frequently used to perform text classification. However, African textual data, in general, have been studied very little using these methods. This article ...Nowadays, in data science, supervised learning algorithms are frequently used to perform text classification. However, African textual data, in general, have been studied very little using these methods. This article notes the particularity of the data and measures the level of precision of predictions of naive Bayes algorithms, decision tree, and SVM (Support Vector Machine) on a corpus of computer jobs taken on the internet. This is due to the data imbalance problem in machine learning. However, this problem essentially focuses on the distribution of the number of documents in each class or subclass. Here, we delve deeper into the problem to the word count distribution in a set of documents. The results are compared with those obtained on a set of French IT offers. It appears that the precision of the classification varies between 88% and 90% for French offers against 67%, at most, for Cameroonian offers. The contribution of this study is twofold. Indeed, it clearly shows that, in a similar job category, job offers on the internet in Cameroon are more unstructured compared to those available in France, for example. Moreover, it makes it possible to emit a strong hypothesis according to which sets of texts having a symmetrical distribution of the number of words obtain better results with supervised learning algorithms.展开更多
Contrastive self‐supervised representation learning on attributed graph networks with Graph Neural Networks has attracted considerable research interest recently.However,there are still two challenges.First,most of t...Contrastive self‐supervised representation learning on attributed graph networks with Graph Neural Networks has attracted considerable research interest recently.However,there are still two challenges.First,most of the real‐word system are multiple relations,where entities are linked by different types of relations,and each relation is a view of the graph network.Second,the rich multi‐scale information(structure‐level and feature‐level)of the graph network can be seen as self‐supervised signals,which are not fully exploited.A novel contrastive self‐supervised representation learning framework on attributed multiplex graph networks with multi‐scale(named CoLM^(2)S)information is presented in this study.It mainly contains two components:intra‐relation contrast learning and interrelation contrastive learning.Specifically,the contrastive self‐supervised representation learning framework on attributed single‐layer graph networks with multi‐scale information(CoLMS)framework with the graph convolutional network as encoder to capture the intra‐relation information with multi‐scale structure‐level and feature‐level selfsupervised signals is introduced first.The structure‐level information includes the edge structure and sub‐graph structure,and the feature‐level information represents the output of different graph convolutional layer.Second,according to the consensus assumption among inter‐relations,the CoLM^(2)S framework is proposed to jointly learn various graph relations in attributed multiplex graph network to achieve global consensus node embedding.The proposed method can fully distil the graph information.Extensive experiments on unsupervised node clustering and graph visualisation tasks demonstrate the effectiveness of our methods,and it outperforms existing competitive baselines.展开更多
Rare labeled data are difficult to recognize by using conventional methods in the process of radar emitter recogni-tion.To solve this problem,an optimized cooperative semi-supervised learning radar emitter recognition...Rare labeled data are difficult to recognize by using conventional methods in the process of radar emitter recogni-tion.To solve this problem,an optimized cooperative semi-supervised learning radar emitter recognition method based on a small amount of labeled data is developed.First,a small amount of labeled data are randomly sampled by using the bootstrap method,loss functions for three common deep learning net-works are improved,the uniform distribution and cross-entropy function are combined to reduce the overconfidence of softmax classification.Subsequently,the dataset obtained after sam-pling is adopted to train three improved networks so as to build the initial model.In addition,the unlabeled data are preliminarily screened through dynamic time warping(DTW)and then input into the initial model trained previously for judgment.If the judg-ment results of two or more networks are consistent,the unla-beled data are labeled and put into the labeled data set.Lastly,the three network models are input into the labeled dataset for training,and the final model is built.As revealed by the simula-tion results,the semi-supervised learning method adopted in this paper is capable of exploiting a small amount of labeled data and basically achieving the accuracy of labeled data recognition.展开更多
As educational reforms intensify and societal emphasis shifts towards empowerment,the traditional discourse paradigm of management and control in educational supervision faces growing challenges.This paper explores th...As educational reforms intensify and societal emphasis shifts towards empowerment,the traditional discourse paradigm of management and control in educational supervision faces growing challenges.This paper explores the transformation of this discourse paradigm through the lens of empowerment,analyzing its distinct characteristics,potential pathways,and effective strategies.This paper begins by reviewing the concept of empowerment and examining the current research landscape surrounding the discourse paradigm in educational supervision.Subsequently,we conduct a comparative analysis of the“control”and“empowerment”paradigms,highlighting their essential differences.This analysis illuminates the key characteristics of an empowerment-oriented approach to educational supervision,particularly its emphasis on dialogue,collaboration,participation,and,crucially,empowerment itself.Ultimately,this research advocates for a shift in educational supervision towards an empowerment-oriented discourse system.This entails a multi-pronged approach:transforming ingrained beliefs,embracing renewed pedagogical concepts,fostering methodological innovation,and optimizing existing mechanisms and strategies within educational supervision.These changes are proposed to facilitate the more effective alignment of educational supervision with the pursuit of high-quality education.展开更多
基金the National Natural Science Foundation of China(42001408,61806097).
文摘Significant advancements have been achieved in road surface extraction based on high-resolution remote sensingimage processing. Most current methods rely on fully supervised learning, which necessitates enormous humaneffort to label the image. Within this field, other research endeavors utilize weakly supervised methods. Theseapproaches aim to reduce the expenses associated with annotation by leveraging sparsely annotated data, such asscribbles. This paper presents a novel technique called a weakly supervised network using scribble-supervised andedge-mask (WSSE-net). This network is a three-branch network architecture, whereby each branch is equippedwith a distinct decoder module dedicated to road extraction tasks. One of the branches is dedicated to generatingedge masks using edge detection algorithms and optimizing road edge details. The other two branches supervise themodel’s training by employing scribble labels and spreading scribble information throughout the image. To addressthe historical flaw that created pseudo-labels that are not updated with network training, we use mixup to blendprediction results dynamically and continually update new pseudo-labels to steer network training. Our solutiondemonstrates efficient operation by simultaneously considering both edge-mask aid and dynamic pseudo-labelsupport. The studies are conducted on three separate road datasets, which consist primarily of high-resolutionremote-sensing satellite photos and drone images. The experimental findings suggest that our methodologyperforms better than advanced scribble-supervised approaches and specific traditional fully supervised methods.
基金funded by the Fujian Province Science and Technology Plan,China(Grant Number 2019H0017).
文摘Accurate forecasting of time series is crucial across various domains.Many prediction tasks rely on effectively segmenting,matching,and time series data alignment.For instance,regardless of time series with the same granularity,segmenting them into different granularity events can effectively mitigate the impact of varying time scales on prediction accuracy.However,these events of varying granularity frequently intersect with each other,which may possess unequal durations.Even minor differences can result in significant errors when matching time series with future trends.Besides,directly using matched events but unaligned events as state vectors in machine learning-based prediction models can lead to insufficient prediction accuracy.Therefore,this paper proposes a short-term forecasting method for time series based on a multi-granularity event,MGE-SP(multi-granularity event-based short-termprediction).First,amethodological framework for MGE-SP established guides the implementation steps.The framework consists of three key steps,including multi-granularity event matching based on the LTF(latest time first)strategy,multi-granularity event alignment using a piecewise aggregate approximation based on the compression ratio,and a short-term prediction model based on XGBoost.The data from a nationwide online car-hailing service in China ensures the method’s reliability.The average RMSE(root mean square error)and MAE(mean absolute error)of the proposed method are 3.204 and 2.360,lower than the respective values of 4.056 and 3.101 obtained using theARIMA(autoregressive integratedmoving average)method,as well as the values of 4.278 and 2.994 obtained using k-means-SVR(support vector regression)method.The other experiment is conducted on stock data froma public data set.The proposed method achieved an average RMSE and MAE of 0.836 and 0.696,lower than the respective values of 1.019 and 0.844 obtained using the ARIMA method,as well as the values of 1.350 and 1.172 obtained using the k-means-SVR method.
基金supported by the National Natural Science Foundation of China(Nos.62002206 and 62202373)the open topic of the Green Development Big Data Decision-Making Key Laboratory(DM202003).
文摘Extracting valuable information frombiomedical texts is one of the current research hotspots of concern to a wide range of scholars.The biomedical corpus contains numerous complex long sentences and overlapping relational triples,making most generalized domain joint modeling methods difficult to apply effectively in this field.For a complex semantic environment in biomedical texts,in this paper,we propose a novel perspective to perform joint entity and relation extraction;existing studies divide the relation triples into several steps or modules.However,the three elements in the relation triples are interdependent and inseparable,so we regard joint extraction as a tripartite classification problem.At the same time,fromthe perspective of triple classification,we design amulti-granularity 2D convolution to refine the word pair table and better utilize the dependencies between biomedical word pairs.Finally,we use a biaffine predictor to assist in predicting the labels of word pairs for relation extraction.Our model(MCTPL)Multi-granularity Convolutional Tokens Pairs of Labeling better utilizes the elements of triples and improves the ability to extract overlapping triples compared to previous approaches.Finally,we evaluated our model on two publicly accessible datasets.The experimental results show that our model’s ability to extract relation triples on the CPI dataset improves the F1 score by 2.34%compared to the current optimal model.On the DDI dataset,the F1 value improves the F1 value by 1.68%compared to the current optimal model.Our model achieved state-of-the-art performance compared to other baseline models in biomedical text entity relation extraction.
文摘Recently,weak supervision has received growing attention in the field of salient object detection due to the convenience of labelling.However,there is a large performance gap between weakly supervised and fully supervised salient object detectors because the scribble annotation can only provide very limited foreground/background information.Therefore,an intuitive idea is to infer annotations that cover more complete object and background regions for training.To this end,a label inference strategy is proposed based on the assumption that pixels with similar colours and close positions should have consistent labels.Specifically,k-means clustering algorithm was first performed on both colours and coordinates of original annotations,and then assigned the same labels to points having similar colours with colour cluster centres and near coordinate cluster centres.Next,the same annotations for pixels with similar colours within each kernel neighbourhood was set further.Extensive experiments on six benchmarks demonstrate that our method can significantly improve the performance and achieve the state-of-the-art results.
文摘Text classification,by automatically categorizing texts,is one of the foundational elements of natural language processing applications.This study investigates how text classification performance can be improved through the integration of entity-relation information obtained from the Wikidata(Wikipedia database)database and BERTbased pre-trained Named Entity Recognition(NER)models.Focusing on a significant challenge in the field of natural language processing(NLP),the research evaluates the potential of using entity and relational information to extract deeper meaning from texts.The adopted methodology encompasses a comprehensive approach that includes text preprocessing,entity detection,and the integration of relational information.Experiments conducted on text datasets in both Turkish and English assess the performance of various classification algorithms,such as Support Vector Machine,Logistic Regression,Deep Neural Network,and Convolutional Neural Network.The results indicate that the integration of entity-relation information can significantly enhance algorithmperformance in text classification tasks and offer new perspectives for information extraction and semantic analysis in NLP applications.Contributions of this work include the utilization of distant supervised entity-relation information in Turkish text classification,the development of a Turkish relational text classification approach,and the creation of a relational database.By demonstrating potential performance improvements through the integration of distant supervised entity-relation information into Turkish text classification,this research aims to support the effectiveness of text-based artificial intelligence(AI)tools.Additionally,it makes significant contributions to the development ofmultilingual text classification systems by adding deeper meaning to text content,thereby providing a valuable addition to current NLP studies and setting an important reference point for future research.
文摘The two universes multi-granularity fuzzy rough set model is an effective tool for handling uncertainty problems between two domains with the help of binary fuzzy relations. This article applies the idea of neighborhood rough sets to two universes multi-granularity fuzzy rough sets, and discusses the two-universes multi-granularity neighborhood fuzzy rough set model. Firstly, the upper and lower approximation operators are defined in the two universes multi-granularity neighborhood fuzzy rough set model. Secondly, the properties of the upper and lower approximation operators are discussed. Finally, the properties of the two universes multi-granularity neighborhood fuzzy rough set model are verified through case studies.
文摘To effectively deal with fuzzy and uncertain information in public engineering emergencies,an emergency decision-making method based on multi-granularity language information is proposed.Firstly,decision makers select the appropriate language phrase set according to their own situation,give the preference information of the weight of each key indicator,and then transform the multi-granularity language information through consistency.On this basis,the sequential optimization technology of the approximately ideal scheme is introduced to obtain the weight coefficient of each key indicator.Subsequently,the weighted average operator is used to aggregate the preference information of each alternative scheme with the relative importance of decision-makers and the weight of key indicators in sequence,and the comprehensive evaluation value of each scheme is obtained to determine the optimal scheme.Lastly,the effectiveness and practicability of the method are verified by taking the earthwork collapse accident in the construction of a reservoir as an example.
文摘The federated self-supervised framework is a distributed machine learning method that combines federated learning and self-supervised learning, which can effectively solve the problem of traditional federated learning being difficult to process large-scale unlabeled data. The existing federated self-supervision framework has problems with low communication efficiency and high communication delay between clients and central servers. Therefore, we added edge servers to the federated self-supervision framework to reduce the pressure on the central server caused by frequent communication between both ends. A communication compression scheme using gradient quantization and sparsification was proposed to optimize the communication of the entire framework, and the algorithm of the sparse communication compression module was improved. Experiments have proved that the learning rate changes of the improved sparse communication compression module are smoother and more stable. Our communication compression scheme effectively reduced the overall communication overhead.
文摘The aim of this paper is to broaden the application of Stochastic Configuration Network (SCN) in the semi-supervised domain by utilizing common unlabeled data in daily life. It can enhance the classification accuracy of decentralized SCN algorithms while effectively protecting user privacy. To this end, we propose a decentralized semi-supervised learning algorithm for SCN, called DMT-SCN, which introduces teacher and student models by combining the idea of consistency regularization to improve the response speed of model iterations. In order to reduce the possible negative impact of unsupervised data on the model, we purposely change the way of adding noise to the unlabeled data. Simulation results show that the algorithm can effectively utilize unlabeled data to improve the classification accuracy of SCN training and is robust under different ground simulation environments.
基金supported in part by the National Natural Science Foundation of China under Grant 62177029,62307025in part by the Startup Foundation for Introducing Talent of Nanjing University of Posts and Communications under Grant NY221041in part by the General Project of The Natural Science Foundation of Jiangsu Higher Education Institution of China 22KJB520025,23KJD580.
文摘Visible-infrared Cross-modality Person Re-identification(VI-ReID)is a critical technology in smart public facilities such as cities,campuses and libraries.It aims to match pedestrians in visible light and infrared images for video surveillance,which poses a challenge in exploring cross-modal shared information accurately and efficiently.Therefore,multi-granularity feature learning methods have been applied in VI-ReID to extract potential multi-granularity semantic information related to pedestrian body structure attributes.However,existing research mainly uses traditional dual-stream fusion networks and overlooks the core of cross-modal learning networks,the fusion module.This paper introduces a novel network called the Augmented Deep Multi-Granularity Pose-Aware Feature Fusion Network(ADMPFF-Net),incorporating the Multi-Granularity Pose-Aware Feature Fusion(MPFF)module to generate discriminative representations.MPFF efficiently explores and learns global and local features with multi-level semantic information by inserting disentangling and duplicating blocks into the fusion module of the backbone network.ADMPFF-Net also provides a new perspective for designing multi-granularity learning networks.By incorporating the multi-granularity feature disentanglement(mGFD)and posture information segmentation(pIS)strategies,it extracts more representative features concerning body structure information.The Local Information Enhancement(LIE)module augments high-performance features in VI-ReID,and the multi-granularity joint loss supervises model training for objective feature learning.Experimental results on two public datasets show that ADMPFF-Net efficiently constructs pedestrian feature representations and enhances the accuracy of VI-ReID.
文摘A novel multi-granularity flexible-grid switching optical-node architecture is proposed in this paper.In our system,the photonic lanterns are used as mode division multiplexing/demultiplexing(MD-Mux/MD-Demux)for selecting mode.The wavelength division multiplexer/demultiplexer(WDMux/WD-Demux)and the fiber bragg gratings(FBGs)are used to select wavelength channels with the various grid.The experimental results show that the transmission bandwidth covers the C+L band,the average transmission loss is-13.4 dB,and the average crosstalk is-30.5 dB.The optical-node architecture is suit for mode division multiplexing(MDM)optical communication system.
基金funded by the Open Foundation of Anhui EngineeringResearch Center of Intelligent Perception and Elderly Care,Chuzhou University(No.2022OPA03)the Higher EducationNatural Science Foundation of Anhui Province(No.KJ2021B01)and the Innovation Team Projects of Universities in Guangdong(No.2022KCXTD057).
文摘The coronavirus disease 2019(COVID-19)has severely disrupted both human life and the health care system.Timely diagnosis and treatment have become increasingly important;however,the distribution and size of lesions vary widely among individuals,making it challenging to accurately diagnose the disease.This study proposed a deep-learning disease diagnosismodel based onweakly supervised learning and clustering visualization(W_CVNet)that fused classification with segmentation.First,the data were preprocessed.An optimizable weakly supervised segmentation preprocessing method(O-WSSPM)was used to remove redundant data and solve the category imbalance problem.Second,a deep-learning fusion method was used for feature extraction and classification recognition.A dual asymmetric complementary bilinear feature extraction method(D-CBM)was used to fully extract complementary features,which solved the problem of insufficient feature extraction by a single deep learning network.Third,an unsupervised learning method based on Fuzzy C-Means(FCM)clustering was used to segment and visualize COVID-19 lesions enabling physicians to accurately assess lesion distribution and disease severity.In this study,5-fold cross-validation methods were used,and the results showed that the network had an average classification accuracy of 85.8%,outperforming six recent advanced classification models.W_CVNet can effectively help physicians with automated aid in diagnosis to determine if the disease is present and,in the case of COVID-19 patients,to further predict the area of the lesion.
基金the support from the Physical Research Platform in the School of Physics of Sun Yat-sen University(PRPSP,SYSU)Project supported by the National Natural Science Foundation of China(Grant No.12074445)the Open Fund of the State Key Laboratory of Optoelectronic Materials and Technologies of Sun Yat-sen University(Grant No.OEMT-2022-ZTS-05)。
文摘Single-molecule force spectroscopy(SMFS)measurements of the dynamics of biomolecules typically require identifying massive events and states from large data sets,such as extracting rupture forces from force-extension curves(FECs)in pulling experiments and identifying states from extension-time trajectories(ETTs)in force-clamp experiments.The former is often accomplished manually and hence is time-consuming and laborious while the latter is always impeded by the presence of baseline drift.In this study,we attempt to accurately and automatically identify the events and states from SMFS experiments with a machine learning approach,which combines clustering and classification for event identification of SMFS(ACCESS).As demonstrated by analysis of a series of data sets,ACCESS can extract the rupture forces from FECs containing multiple unfolding steps and classify the rupture forces into the corresponding conformational transitions.Moreover,ACCESS successfully identifies the unfolded and folded states even though the ETTs display severe nonmonotonic baseline drift.Besides,ACCESS is straightforward in use as it requires only three easy-to-interpret parameters.As such,we anticipate that ACCESS will be a useful,easy-to-implement and high-performance tool for event and state identification across a range of single-molecule experiments.
文摘N-11-azaartemisinins potentially active against Plasmodium falciparum are designed by combining molecular electrostatic potential (MEP), ligand-receptor interaction, and models built with supervised machine learning methods (PCA, HCA, KNN, SIMCA, and SDA). The optimization of molecular structures was performed using the B3LYP/6-31G* approach. MEP maps and ligand-receptor interactions were used to investigate key structural features required for biological activities and likely interactions between N-11-azaartemisinins and heme, respectively. The supervised machine learning methods allowed the separation of the investigated compounds into two classes: cha and cla, with the properties ε<sub>LUMO+1</sub> (one level above lowest unoccupied molecular orbital energy), d(C<sub>6</sub>-C<sub>5</sub>) (distance between C<sub>6</sub> and C<sub>5</sub> atoms in ligands), and TSA (total surface area) responsible for the classification. The insights extracted from the investigation developed and the chemical intuition enabled the design of sixteen new N-11-azaartemisinins (prediction set), moreover, models built with supervised machine learning methods were applied to this prediction set. The result of this application showed twelve new promising N-11-azaartemisinins for synthesis and biological evaluation.
基金supported by National Natural Science Foundation of China(62276058,61902057,41774063)Fundamental Research Funds for the Central Universities(N2217003)Joint Fund of Science&Technology Department of Liaoning Province and State Key Laboratory of Robotics,China(2020-KF-12-11).
文摘A large variety of complaint reports reflect subjective information expressed by citizens.A key challenge of text summarization for complaint reports is to ensure the factual consistency of generated summary.Therefore,in this paper,a simple and weakly supervised framework considering factual consistency is proposed to generate a summary of city-based complaint reports without pre-labeled sentences/words.Furthermore,it considers the importance of entity in complaint reports to ensure factual consistency of summary.Experimental results on the customer review datasets(Yelp and Amazon)and complaint report dataset(complaint reports of Shenyang in China)show that the proposed framework outperforms state-of-the-art approaches in ROUGE scores and human evaluation.It unveils the effectiveness of our approach to helping in dealing with complaint reports.
文摘In recent years, the place occupied by the various manifestations of cyber-crime in companies has been considerable. Indeed, due to the rapid evolution of telecommunications technologies, companies, regardless of their size or sector of activity, are now the target of advanced persistent threats. The Work 2035 study also revealed that cyber crimes (such as critical infrastructure hacks) and massive data breaches are major sources of concern. Thus, it is important for organizations to guarantee a minimum level of security to avoid potential attacks that can cause paralysis of systems, loss of sensitive data, exposure to blackmail, damage to reputation or even a commercial harm. To do this, among other means, hardening is used, the main objective of which is to reduce the attack surface within a company. The execution of the hardening configurations as well as the verification of these are carried out on the servers and network equipment with the aim of reducing the number of openings present by keeping only those which are necessary for proper operation. However, nowadays, in many companies, these tasks are done manually. As a result, the execution and verification of hardening configurations are very often subject to potential errors but also highly consuming human and financial resources. The problem is that it is essential for operators to maintain an optimal level of security while minimizing costs, hence the interest in automating hardening processes and verifying the hardening of servers and network equipment. It is in this logic that we propose within the framework of this work the reinforcement of the security of the information systems (IS) by the automation of the mechanisms of hardening. In our work, we have, on the one hand, set up a hardening procedure in accordance with international security standards for servers, routers and switches and, on the other hand, designed and produced a functional application which makes it possible to: 1) Realise the configuration of the hardening;2) Verify them;3) Correct the non conformities;4) Write and send by mail a verification report for the configurations;5) And finally update the procedures of hardening. Our web application thus created allows in less than fifteen (15) minutes actions that previously took at least five (5) hours of time. This allows supervised network operators to save time and money, but also to improve their security standards in line with international standards.
文摘Nowadays, in data science, supervised learning algorithms are frequently used to perform text classification. However, African textual data, in general, have been studied very little using these methods. This article notes the particularity of the data and measures the level of precision of predictions of naive Bayes algorithms, decision tree, and SVM (Support Vector Machine) on a corpus of computer jobs taken on the internet. This is due to the data imbalance problem in machine learning. However, this problem essentially focuses on the distribution of the number of documents in each class or subclass. Here, we delve deeper into the problem to the word count distribution in a set of documents. The results are compared with those obtained on a set of French IT offers. It appears that the precision of the classification varies between 88% and 90% for French offers against 67%, at most, for Cameroonian offers. The contribution of this study is twofold. Indeed, it clearly shows that, in a similar job category, job offers on the internet in Cameroon are more unstructured compared to those available in France, for example. Moreover, it makes it possible to emit a strong hypothesis according to which sets of texts having a symmetrical distribution of the number of words obtain better results with supervised learning algorithms.
基金support by the National Natural Science Foundation of China(NSFC)under grant number 61873274.
文摘Contrastive self‐supervised representation learning on attributed graph networks with Graph Neural Networks has attracted considerable research interest recently.However,there are still two challenges.First,most of the real‐word system are multiple relations,where entities are linked by different types of relations,and each relation is a view of the graph network.Second,the rich multi‐scale information(structure‐level and feature‐level)of the graph network can be seen as self‐supervised signals,which are not fully exploited.A novel contrastive self‐supervised representation learning framework on attributed multiplex graph networks with multi‐scale(named CoLM^(2)S)information is presented in this study.It mainly contains two components:intra‐relation contrast learning and interrelation contrastive learning.Specifically,the contrastive self‐supervised representation learning framework on attributed single‐layer graph networks with multi‐scale information(CoLMS)framework with the graph convolutional network as encoder to capture the intra‐relation information with multi‐scale structure‐level and feature‐level selfsupervised signals is introduced first.The structure‐level information includes the edge structure and sub‐graph structure,and the feature‐level information represents the output of different graph convolutional layer.Second,according to the consensus assumption among inter‐relations,the CoLM^(2)S framework is proposed to jointly learn various graph relations in attributed multiplex graph network to achieve global consensus node embedding.The proposed method can fully distil the graph information.Extensive experiments on unsupervised node clustering and graph visualisation tasks demonstrate the effectiveness of our methods,and it outperforms existing competitive baselines.
文摘Rare labeled data are difficult to recognize by using conventional methods in the process of radar emitter recogni-tion.To solve this problem,an optimized cooperative semi-supervised learning radar emitter recognition method based on a small amount of labeled data is developed.First,a small amount of labeled data are randomly sampled by using the bootstrap method,loss functions for three common deep learning net-works are improved,the uniform distribution and cross-entropy function are combined to reduce the overconfidence of softmax classification.Subsequently,the dataset obtained after sam-pling is adopted to train three improved networks so as to build the initial model.In addition,the unlabeled data are preliminarily screened through dynamic time warping(DTW)and then input into the initial model trained previously for judgment.If the judg-ment results of two or more networks are consistent,the unla-beled data are labeled and put into the labeled data set.Lastly,the three network models are input into the labeled dataset for training,and the final model is built.As revealed by the simula-tion results,the semi-supervised learning method adopted in this paper is capable of exploiting a small amount of labeled data and basically achieving the accuracy of labeled data recognition.
文摘As educational reforms intensify and societal emphasis shifts towards empowerment,the traditional discourse paradigm of management and control in educational supervision faces growing challenges.This paper explores the transformation of this discourse paradigm through the lens of empowerment,analyzing its distinct characteristics,potential pathways,and effective strategies.This paper begins by reviewing the concept of empowerment and examining the current research landscape surrounding the discourse paradigm in educational supervision.Subsequently,we conduct a comparative analysis of the“control”and“empowerment”paradigms,highlighting their essential differences.This analysis illuminates the key characteristics of an empowerment-oriented approach to educational supervision,particularly its emphasis on dialogue,collaboration,participation,and,crucially,empowerment itself.Ultimately,this research advocates for a shift in educational supervision towards an empowerment-oriented discourse system.This entails a multi-pronged approach:transforming ingrained beliefs,embracing renewed pedagogical concepts,fostering methodological innovation,and optimizing existing mechanisms and strategies within educational supervision.These changes are proposed to facilitate the more effective alignment of educational supervision with the pursuit of high-quality education.