Given the increasing volume of text documents, automatic summarization is an important tool for using such sources quickly and effectively. Automatic summarization is a text compression process that produces a shorter document so that the important goals and main features of the input document can be accessed quickly. In this study, a novel method is introduced for selective text summarization using a genetic algorithm and the generation of repetitive patterns. An important feature of the proposed method, compared with previous methods, is that it identifies and extracts the relationships between the main features of the input text and creates repetitive patterns in order to produce and optimize the vector of main document features used to generate the summary. The study attempts to cover all the main requirements of a summary text, including an unambiguous summary with the highest precision, continuity, and consistency. To investigate the efficiency of the proposed algorithm, the results were evaluated with respect to the precision and recall criteria. The evaluation showed that the method optimizes the dimensions of the feature vector and generates a sequence of summary sentences that is most consistent with the main goals and features of the input document.
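The precision and recall criteria mentioned in this abstract can be illustrated with a minimal sketch. It assumes a summary is represented as a set of selected sentence indices and is compared against a reference set; the function name and this representation are assumptions made for illustration, not details taken from the paper.

```python
def precision_recall(selected, reference):
    """Precision and recall of selected sentence indices against a reference set.

    Hypothetical helper: the paper does not specify how summaries are encoded.
    """
    selected, reference = set(selected), set(reference)
    overlap = len(selected & reference)
    # Precision: share of selected sentences that are also in the reference.
    precision = overlap / len(selected) if selected else 0.0
    # Recall: share of reference sentences that were selected.
    recall = overlap / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    p, r = precision_recall([0, 2, 5], [0, 1, 2, 3])
    print(f"precision={p:.3f} recall={r:.3f}")
```

A summary that selects few sentences can score high precision with low recall, which is why the two criteria are usually reported together.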
An invariant can be described as an essential relationship between program variables. Invariants are very useful in software checking and verification. The tools used to detect invariants are called invariant detectors, and there are two types: dynamic invariant detectors and static invariant detectors. Daikon is an available program that implements a special case of a dynamic invariant detection algorithm. Daikon runs the tested program several times, gathers the values of its variables, and finally detects relationships between the variables based on a simple statistical analysis. This method has some drawbacks, one of the biggest being its high time complexity. It is observed that the runtime of the Daikon invariant detection tool depends on the ordering of traces in the trace file. A mechanism is proposed to reduce the differences in adjacent trace files by applying special mutation and crossover techniques from genetic algorithms (GA). An experiment is run to assess the benefits of this approach. The experimental findings reveal that, with these improvements, the runtime of the proposed dynamic invariant detection algorithm is better than that of the original approach.
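The idea of reordering traces so that neighbours differ less can be sketched as follows. This is a simplified stand-in and not the paper's algorithm: it assumes each trace is a tuple of variable values, measures the difference between neighbours as a Hamming distance, and uses only a swap mutation with greedy acceptance instead of a full genetic algorithm with crossover. All names are hypothetical.

```python
import random


def adjacent_difference(order, traces):
    """Total Hamming distance between consecutive traces in the given order."""
    return sum(
        sum(a != b for a, b in zip(traces[i], traces[j]))
        for i, j in zip(order, order[1:])
    )


def swap_mutation_search(traces, steps=200, seed=0):
    """Greedy search: apply random swap mutations, keep those that do not hurt."""
    rng = random.Random(seed)
    order = list(range(len(traces)))
    best = adjacent_difference(order, traces)
    for _ in range(steps):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        cost = adjacent_difference(order, traces)
        if cost <= best:
            best = cost
        else:
            # Undo the mutation when it increases the adjacent difference.
            order[i], order[j] = order[j], order[i]
    return order, best


if __name__ == "__main__":
    traces = [(0, 0), (1, 1), (0, 0), (1, 1)]
    order, cost = swap_mutation_search(traces)
    print(order, cost)
```

The search needs at least two traces, since a swap picks two distinct positions.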
Imbalanced datasets (IDs), in which one class (usually of two) contains considerably fewer samples than the other(s), arise in many real-world problems, such as health care and disease diagnosis systems, anomaly detection, fraud detection, and stream-based malware detection. These datasets cause problems in the classification process and its applications, such as under-training of the minority class(es), over-training of the majority class(es), and bias towards the majority class(es). They have therefore attracted the attention of researchers in many fields, and several solutions have been proposed. The main aim of this study is to deal with IDs by resampling the borderline samples discovered by Support Vector Data Description (SVDD). There are two kinds of resampling: under-sampling (U-S) and over-sampling (O-S). The main drawback of O-S is that it may cause over-fitting; the main drawback of U-S is that it can cause significant information loss. To avoid these drawbacks, we focus on the samples that may be misclassified, which are taken to be the borderline data points lying on the border(s) between the majority and minority class(es). First, we find the borderline examples using SVDD; then, data resampling is applied to them. Next, the base classifier is trained on the newly created dataset. Finally, we compare our method with other state-of-the-art methods in terms of Area Under Curve (AUC), F-measure, and G-mean. In our experimental study, our method obtains better results than the other state-of-the-art methods.
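Two of the evaluation measures named in this abstract, F-measure and G-mean, can be computed from a binary confusion matrix as in the sketch below. It uses the standard definitions (F-measure as the harmonic mean of precision and recall, G-mean as the geometric mean of sensitivity and specificity) and treats the minority class as positive; the function name and the example counts are illustrative assumptions, not values from the paper.

```python
import math


def f_measure_and_g_mean(tp, fp, tn, fn):
    """F-measure and G-mean from confusion-matrix counts (minority = positive)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0        # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    g_mean = math.sqrt(recall * specificity)
    return f_measure, g_mean


if __name__ == "__main__":
    # A classifier that labels everything as majority looks accurate (90%)
    # on a 10/90 split, yet both measures are zero.
    print(f_measure_and_g_mean(tp=0, fp=0, tn=90, fn=10))
    print(f_measure_and_g_mean(tp=8, fp=2, tn=88, fn=2))
```

Both measures drop to zero when the minority class is never predicted, which plain accuracy does not reveal on imbalanced data.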
To improve the performance and robustness of clustering, it has been proposed to generate and aggregate a number of primary clusters via the clustering ensemble technique. Fuzzy clustering ensemble approaches attempt to improve the performance of fuzzy clustering tasks. However, these approaches have paid little attention to cluster (or clustering) reliability, which makes them weak in dealing with low-quality base clustering methods. In this paper, we use cluster unreliability estimation and a local weighting strategy to propose a new fuzzy clustering ensemble method, which introduces Reliability Based weighted co-association matrix Fuzzy C-Means (RBFCM), Reliability Based Graph Partitioning (RBGP), and Reliability Based Hyper Clustering (RBHC) as three new fuzzy clustering consensus functions. Our approach is based on fuzzy cluster unreliability estimation. Cluster unreliability is estimated according to an entropic criterion using the cluster labels in the entire ensemble. To do so, a new metric is defined to estimate the fuzzy cluster unreliability; then, the reliability value of any cluster is determined using a Reliability Driven Cluster Indicator (RDCI). The time complexities of RBHC and RBGP are linear in the number of data objects. The performance and robustness of the proposed method are experimentally evaluated on several benchmark datasets. The experimental results demonstrate the efficiency and suitability of the proposed method.
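An entropic unreliability criterion of the kind described in this abstract can be sketched for the simpler crisp (non-fuzzy) case. The assumption here is that a cluster is unreliable when the other base clusterings split its members across many of their own clusters, so the entropy of that split is summed over the ensemble. The paper's fuzzy metric and its RDCI are not reproduced; the functions below are hypothetical illustrations only.

```python
import math
from collections import Counter


def cluster_entropy(members, labels):
    """Entropy (bits) of how one base clustering's labels split a cluster's members."""
    counts = Counter(labels[i] for i in members)
    n = len(members)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def cluster_unreliability(members, ensemble):
    """Sum of split entropies over every base clustering in the ensemble."""
    return sum(cluster_entropy(members, labels) for labels in ensemble)


if __name__ == "__main__":
    members = [0, 1, 2, 3]            # indices of objects in the cluster under test
    ensemble = [
        [0, 0, 0, 0],                 # agrees with the cluster: entropy 0
        [0, 0, 1, 1],                 # splits it in half: entropy 1 bit
    ]
    print(cluster_unreliability(members, ensemble))
```

A value of zero means every base clustering keeps the cluster's members together; larger values indicate more disagreement across the ensemble.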
This paper considers algebraic ordinary differential equations (AODEs) and studies their polynomial and rational solutions. The authors first prove a sufficient condition for the existence of a bound on the degree of the possible polynomial solutions to an AODE. An AODE satisfying this condition is called noncritical. The authors then prove that some common classes of low-order AODEs are noncritical. For rational solutions, the authors determine a class of AODEs, called maximally comparable, such that the possible poles of any rational solution are recognizable from the coefficients of the equation. This generalizes the well-known fact that any pole of a rational solution to a linear ODE is contained in the set of zeros of its leading coefficient. Finally, the authors develop an algorithm to compute all rational solutions of certain maximally comparable AODEs, which is applicable to 78.54% of the AODEs in Kamke's collection of standard differential equations.
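The well-known fact about linear ODEs cited in this abstract can be checked on a small worked example. The equation below is chosen here for illustration and does not come from the paper.

```latex
% Illustrative first-order linear ODE with polynomial coefficients:
%   a_1(x) y' + a_0(x) y = 0,  with  a_1(x) = x,  a_0(x) = 1.
\begin{align*}
  x\,y' + y &= 0 \\
  (x\,y)'   &= 0 \\
  y         &= \frac{c}{x}, \qquad c \in \mathbb{C}.
\end{align*}
% For c \neq 0 the only pole of the rational solution y = c/x is x = 0,
% which is exactly the zero of the leading coefficient a_1(x) = x.
```

In this example the candidate poles can be read off from the leading coefficient before solving, which is the property the maximally comparable class extends to certain nonlinear AODEs.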
Funding: grants to HAR and HP. HAR is supported by a UNSW Scientia Program Fellowship and is a member of the UNSW Graduate School of Biomedical Engineering.
Funding: supported by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under Grant No. 101.04-2019.06; the Austrian Science Fund (FWF) under Grant No. P29467-N32; the UTD startup fund under Grant No. P-1-03246; and the Natural Science Foundations of USA under Grant Nos. CF-1815108 and CCF-1708884.