The emergence of digital networks and the wide adoption of information on internet platforms have given rise to threats against users’ private information. Many intruders actively seek such private data either for sale or other inappropriate purposes. Similarly, national and international organizations have country-level and company-level private information that could be accessed by different network attacks. Therefore, the need for a Network Intruder Detection System (NIDS) becomes essential for protecting these networks and organizations. In the evolution of NIDS, Artificial Intelligence (AI) assisted tools and methods have been widely adopted to provide effective solutions. However, the development of NIDS still faces challenges at the dataset and machine learning levels, such as large deviations in numeric features, the presence of numerous irrelevant categorical features resulting in reduced cardinality, and class imbalance in multiclass-level data. To address these challenges and offer a unified solution to NIDS development, this study proposes a novel framework that preprocesses datasets and applies a Box-Cox transformation to linearly transform the numeric features and bring them into closer alignment. Cardinality reduction was applied to categorical features through the binning method. Subsequently, the class imbalance in the dataset was addressed using the adaptive synthetic sampling data generation method. Finally, the preprocessed, refined, and oversampled feature set was divided into training and test sets with an 80–20 ratio, and two experiments were conducted. In Experiment 1, binary classification was executed using four machine learning classifiers, with the extra trees classifier achieving the highest accuracy of 97.23% and an AUC of 0.9961. In Experiment 2, multiclass classification was performed, and the extra trees classifier emerged as the most effective, achieving an accuracy of 81.27% and an AUC of 0.97. The results were evaluated based on training, testing, and total time, and a comparative analysis with state-of-the-art studies proved the robustness and significance of the applied methods in developing a timely and precision-efficient solution to NIDS.
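A minimal sketch of the pipeline this abstract describes, assuming a labelled flow dataset loaded into pandas (the file name and column names below are hypothetical): scikit-learn's Box-Cox PowerTransformer for the numeric features, frequency-based binning for cardinality reduction, imbalanced-learn's ADASYN for the adaptive synthetic sampling step, an 80-20 split, and an extra trees classifier.

```python
# Sketch of the described pipeline (CSV path and column names are hypothetical).
import pandas as pd
from imblearn.over_sampling import ADASYN
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PowerTransformer

NUM_COLS = ["duration", "src_bytes", "dst_bytes"]   # hypothetical numeric features
CAT_COLS = ["protocol", "service"]                  # hypothetical categorical features

df = pd.read_csv("nids_dataset.csv")                # hypothetical dataset file

# Box-Cox requires strictly positive inputs, so shift each numeric column first.
shifted = df[NUM_COLS] - df[NUM_COLS].min() + 1.0
df[NUM_COLS] = PowerTransformer(method="box-cox").fit_transform(shifted)

# Cardinality reduction by binning: keep the 10 most frequent categories per column.
for col in CAT_COLS:
    top = df[col].value_counts().nlargest(10).index
    df[col] = df[col].where(df[col].isin(top), other="OTHER")

X = pd.get_dummies(df.drop(columns=["label"]), columns=CAT_COLS)
y = df["label"]

# Adaptive synthetic sampling to balance the classes, then the 80-20 split.
X_res, y_res = ADASYN(random_state=0).fit_resample(X, y)
X_tr, X_te, y_tr, y_te = train_test_split(X_res, y_res, test_size=0.2, random_state=0)

clf = ExtraTreesClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))   # binary labels
```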
In this paper, we propose a new relational schema (R-schema) to XML schema translation algorithm, VQT, which analyzes the value cardinality and user query patterns and extracts the implicit referential integrities by using the cardinality property of foreign key constraints between columns and the equi-join characteristic in user queries. The VQT algorithm can apply the extracted implied referential integrity relation information to the R-schema and create an XML schema as the final result. Therefore, the VQT algorithm prevents the R-schema from being incorrectly converted into the XML schema, and it richly and powerfully represents all the information in the R-schema by creating an XML schema, rather than an XML DTD, as the translation result.
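As a toy illustration of why recovered referential integrities matter (this is not the VQT algorithm itself): once a foreign-key relationship between two tables is known, child rows can be nested under their parent elements instead of being emitted as flat siblings linked by ID/IDREF. All table and column names below are hypothetical.

```python
# Toy nesting of relational data into XML once the referential integrity
# dept(id) <- emp(dept_id) is known (illustration only, not the VQT algorithm).
import xml.etree.ElementTree as ET

depts = [{"id": 1, "name": "R&D"}, {"id": 2, "name": "Sales"}]          # hypothetical rows
emps = [{"id": 10, "dept_id": 1, "name": "Kim"},
        {"id": 11, "dept_id": 1, "name": "Lee"},
        {"id": 12, "dept_id": 2, "name": "Park"}]

root = ET.Element("company")
for d in depts:
    d_el = ET.SubElement(root, "dept", id=str(d["id"]))
    ET.SubElement(d_el, "name").text = d["name"]
    for e in emps:
        if e["dept_id"] == d["id"]:               # the foreign-key join condition
            e_el = ET.SubElement(d_el, "emp", id=str(e["id"]))
            ET.SubElement(e_el, "name").text = e["name"]

print(ET.tostring(root, encoding="unicode"))
```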
An excellent cardinality estimation can make the query optimiser produce a good execution plan. Although there are some studies on cardinality estimation, the prediction results of existing cardinality estimators are inaccurate and query efficiency cannot be guaranteed either. In particular, it is difficult for them to accurately capture the complex relationships between multiple tables in complex database systems. When dealing with complex queries, the existing cardinality estimators cannot achieve good results. In this study, a novel cardinality estimator is proposed. It uses the BiLSTM network structure as its core technique and adds the attention mechanism. First, the columns involved in the query statements in the training set are sampled and compressed into bitmaps. Then, the Word2vec model is used to embed the query statements as word vectors. Finally, the BiLSTM network and attention mechanism are employed to process the word vectors. The proposed model takes into consideration not only the correlation between tables but also the processing of complex predicates. Extensive experiments and an evaluation of the BiLSTM-Attention Cardinality Estimator (BACE) on the IMDB datasets are conducted. The results show that the deep learning model can significantly improve the quality of cardinality estimation, which plays a vital role in query optimisation for complex databases.
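A minimal sketch of a BiLSTM-plus-attention regressor of the kind described, assuming the query tokens have already been mapped to integer ids (e.g. via a Word2vec-style vocabulary); layer sizes and shapes are illustrative only, not the paper's BACE architecture.

```python
# Sketch of a BiLSTM + attention cardinality regressor (illustrative sizes).
import torch
import torch.nn as nn

class BiLSTMAttnEstimator(nn.Module):
    def __init__(self, vocab_size: int = 1000, emb_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)      # one attention score per token
        self.head = nn.Linear(2 * hidden, 1)      # predicts log-cardinality

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        h, _ = self.lstm(self.emb(token_ids))                # (B, T, 2H)
        w = torch.softmax(self.attn(h).squeeze(-1), dim=1)   # (B, T) attention weights
        ctx = (h * w.unsqueeze(-1)).sum(dim=1)               # (B, 2H) weighted summary
        return self.head(ctx).squeeze(-1)                    # (B,) predicted log(card)

model = BiLSTMAttnEstimator()
fake_queries = torch.randint(0, 1000, (4, 20))   # batch of 4 encoded queries, 20 tokens
print(model(fake_queries).shape)                  # torch.Size([4])
```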
The Cardinality Constraint-Based Optimization problem is investigated in this note. In the portfolio optimization problem, the cardinality constraint allows one to invest in at most K assets out of a universe of N assets, for a prespecified value of K. It is generally agreed that choosing a “small” value of K forces the implementation of diversification in small portfolios. However, the question of how small K must be has remained unanswered. In the present work, using a comparative approach, we show computationally that optimal portfolio selection with a relatively small or large number of assets, K, may produce similar results with differentiated reliabilities.
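A small numerical illustration of how the choice of K can be compared: for each cardinality K, enumerate the K-asset subsets of a small universe and take the best fully-invested minimum-variance portfolio (closed-form weights). The data here is synthetic and the brute-force enumeration is only feasible for small N; this is a generic experiment, not the paper's comparative procedure.

```python
# Brute-force comparison of minimum achievable variance for each cardinality K.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
N = 8                                         # small universe so enumeration is feasible
A = rng.normal(size=(N, N))
Sigma = A @ A.T + N * np.eye(N)               # a positive-definite covariance matrix

def min_var(sub):
    s_inv = np.linalg.inv(Sigma[np.ix_(sub, sub)])
    ones = np.ones(len(sub))
    w = s_inv @ ones / (ones @ s_inv @ ones)  # closed-form min-variance weights
    return w @ Sigma[np.ix_(sub, sub)] @ w

for K in range(2, N + 1):
    best = min(min_var(list(c)) for c in combinations(range(N), K))
    print(f"K={K}: best achievable variance {best:.4f}")
```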
This paper studied a cardinality constrained portfolio with integer weights. We suggested two optimization models and used two genetic algorithms to solve them. After finding well-matching stocks according to the investor’s target by using the first genetic algorithm, we obtained the optimal integer weights of the portfolio over those well-matching stocks by using the second genetic algorithm. Through numerical comparisons with other feasible portfolios, we verified the advantages of the portfolio designed with the two genetic algorithms. For the numerical comparison, we used a prepared dataset consisting of 18 stocks listed in the S&P 500, and the numerical example strongly supported the portfolio designed in this paper. We also made all comparisons visible through the feasible efficient frontiers.
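A compact, generic genetic algorithm for an integer-weight portfolio (not the paper's exact two-stage design): chromosomes are integer lot counts over the assets, a repair step keeps the total number of lots fixed, and fitness trades mean return against variance. All data and parameters are synthetic.

```python
# Generic GA sketch for an integer-weight portfolio (illustration only).
import numpy as np

rng = np.random.default_rng(1)
N, TOTAL_LOTS, POP, GENS = 10, 100, 60, 200
mu = rng.normal(0.05, 0.02, N)                 # synthetic expected returns
A = rng.normal(size=(N, N))
Sigma = (A @ A.T) / N * 0.01                   # synthetic covariance matrix
LAMBDA = 5.0                                   # risk-aversion coefficient

def repair(x):
    x = np.maximum(x, 0)
    while x.sum() != TOTAL_LOTS:               # push lot counts back onto the budget
        i = rng.integers(N)
        x[i] += 1 if x.sum() < TOTAL_LOTS else (-1 if x[i] > 0 else 0)
    return x

def fitness(x):
    w = x / TOTAL_LOTS
    return mu @ w - LAMBDA * (w @ Sigma @ w)

pop = np.array([repair(rng.integers(0, TOTAL_LOTS, N)) for _ in range(POP)])
for _ in range(GENS):
    scores = np.array([fitness(x) for x in pop])
    parents = pop[np.argsort(scores)[-POP // 2:]]            # truncation selection
    children = []
    while len(children) < POP - len(parents):
        a, b = parents[rng.integers(len(parents), size=2)]
        cut = rng.integers(1, N)
        child = np.concatenate([a[:cut], b[cut:]])           # one-point crossover
        child[rng.integers(N)] += rng.integers(-3, 4)        # small integer mutation
        children.append(repair(child))
    pop = np.vstack([parents] + children)

best = pop[np.argmax([fitness(x) for x in pop])]
print("best integer lot allocation:", best, "fitness:", fitness(best))
```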
The optimization problem of the cardinality constrained mean-variance (CCMV) model for sparse portfolio selection is considered. To overcome the difficulties caused by the cardinality constraint, an exact penalty approach is employed, and the CCMV problem is transferred into a difference-of-convex-functions (DC) problem. By exploiting the DC structure of the resulting problem and the superlinear convergence of the semismooth Newton (ssN) method, an inexact proximal DC algorithm with a sieving strategy based on a majorized ssN method (siPDCA-mssN) is proposed. For solving the inner problems of siPDCA-mssN from the dual, the second-order information is wisely incorporated and an efficient mssN method is employed. The global convergence of the sequence generated by siPDCA-mssN is proved. To solve the large-scale CCMV problem, a decomposed siPDCA-mssN (DsiPDCA-mssN) is introduced. To demonstrate the efficiency of the proposed algorithms, siPDCA-mssN and DsiPDCA-mssN are compared with the penalty proximal alternating linearized minimization method and the CPLEX (12.9) solver by performing numerical experiments on real-world market data and large-scale simulated data. The numerical results demonstrate that siPDCA-mssN and DsiPDCA-mssN outperform the other methods in terms of computation time and optimal value. The out-of-sample experiment results show that the solutions of the CCMV model are better than those of other portfolio selection models in terms of Sharpe ratio and sparsity.
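For orientation, a standard way to write the CCMV model and the exact-penalty DC reformulation that abstracts of this kind allude to (notation is generic and mine, not necessarily the paper's): the cardinality constraint can be expressed through the gap between the l1 norm and the sum of the K largest absolute entries, both of which are convex, so the penalty term is a difference of convex functions.

```latex
\begin{aligned}
\text{(CCMV)}\qquad & \min_{w}\; w^{\top}\Sigma w
  \quad \text{s.t.}\quad \mu^{\top}w \ge r_{0},\quad \mathbf{1}^{\top}w = 1,\quad \|w\|_{0}\le K,\\[4pt]
& \|w\|_{0}\le K \;\Longleftrightarrow\; \|w\|_{1}-\|w\|_{(K)}=0,
  \qquad \|w\|_{(K)} := \text{sum of the } K \text{ largest } |w_i|,\\[4pt]
\text{(penalty DC)}\qquad & \min_{w}\; w^{\top}\Sigma w
  + \rho\bigl(\|w\|_{1}-\|w\|_{(K)}\bigr)
  \quad \text{s.t.}\quad \mu^{\top}w \ge r_{0},\quad \mathbf{1}^{\top}w = 1 .
\end{aligned}
```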
Multiparty private set intersection (PSI) allows several parties, each holding a set of elements, to jointly compute the intersection without leaking any additional information. With the development of cloud computing, PSI has a wide range of applications in privacy protection. However, it is complex to build an efficient and reliable scheme to protect user privacy. To address this issue, we propose EMPSI, an efficient PSI (with cardinality) protocol in a multiparty setting. EMPSI avoids using heavy cryptographic primitives (relying mainly on symmetric-key encryption) to achieve better performance. In addition, both the PSI and the PSI-with-cardinality variants of EMPSI are secure against semi-honest adversaries and allow any number of colluding clients (with at least one honest client). We also conduct experiments to compare EMPSI with some state-of-the-art works. The experimental results show that the proposed EMPSI(-CA) has better performance and is scalable in the number of clients and the set size.
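As a toy illustration only (a naive keyed-hash intersection count, not the EMPSI protocol and without its security guarantees), the sketch below shows how symmetric-key primitives can stand in for heavy public-key operations: clients tag their elements with an HMAC under a jointly shared key, and an aggregator intersects only the tags to learn the intersection cardinality.

```python
# Naive keyed-hash PSI-CA sketch (illustration only, NOT the EMPSI protocol):
# clients share a secret key out of band, tag elements with HMAC-SHA256, and an
# aggregator intersects the tags, learning only the intersection size.
import hashlib
import hmac
import secrets

shared_key = secrets.token_bytes(32)          # assumed agreed upon by the clients

def tag_set(elements, key):
    return {hmac.new(key, e.encode(), hashlib.sha256).hexdigest() for e in elements}

client_sets = [
    {"alice@example.com", "bob@example.com", "carol@example.com"},
    {"bob@example.com", "carol@example.com", "dave@example.com"},
    {"carol@example.com", "bob@example.com", "erin@example.com"},
]
tagged = [tag_set(s, shared_key) for s in client_sets]

intersection_tags = set.intersection(*tagged)   # the aggregator sees only tags
print("intersection cardinality:", len(intersection_tags))   # -> 2
```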
In this paper, a cardinality compensation method based on the Information-weighted Consensus Filter (ICF) using data clustering is proposed in order to accurately estimate the cardinality of the Cardinalized Probability Hypothesis Density (CPHD) filter. Although the joint propagation of the intensity and the cardinality distribution in the CPHD filter process allows for more reliable estimation of the cardinality (target number) than the PHD filter, tracking loss may occur when noise and clutter are high in the measurements in a practical situation. For that reason, a cardinality compensation process is included in the CPHD filter, based on an information fusion step that combines the cardinality estimated by the CPHD filter with the cardinality measured through data clustering. Here, the ICF is used for information fusion. To verify the performance of the proposed method, simulations were carried out, and it was confirmed that the multi-target tracking performance was improved because the cardinality was estimated more accurately than with the existing techniques.
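A simplified stand-in for the fusion step: information-weighted (inverse-variance) combination of two cardinality estimates, one from the filter and one from data clustering. This is a generic consensus-style fusion rule with hypothetical numbers, not the paper's full ICF update.

```python
# Simplified information-weighted fusion of two cardinality estimates: each
# estimate is weighted by its information (inverse variance).
import numpy as np

def fuse(estimates, variances):
    info = 1.0 / np.asarray(variances, dtype=float)       # information = 1 / variance
    fused_var = 1.0 / info.sum()
    fused_est = fused_var * (info * np.asarray(estimates, dtype=float)).sum()
    return fused_est, fused_var

# Hypothetical numbers: cardinality from the CPHD filter vs. from data clustering.
cphd_card, cphd_var = 4.6, 1.2
cluster_card, cluster_var = 5.0, 0.4
est, var = fuse([cphd_card, cluster_card], [cphd_var, cluster_var])
print(f"fused cardinality estimate: {est:.2f} (variance {var:.2f})")
```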
Although popular database systems perform well on query optimization, they still produce poor query execution plans when the join operations across multiple tables are complex. Bad execution planning usually results from bad cardinality estimations. The cardinality estimation models in traditional databases cannot provide high-quality estimation, because they are not capable of capturing the correlation between multiple tables in an effective fashion. Recently, state-of-the-art learning-based cardinality estimation has been shown to work better than the traditional empirical methods; basically, deep neural networks are used to compute the relationships and correlations of tables. In this paper, we propose a vertical scanning convolutional neural network (abbreviated as VSCNN) to capture the relationships between words in the word vector in order to generate a feature map. The proposed learning-based cardinality estimator converts Structured Query Language (SQL) queries from a sentence to a word vector; we encode table names with one-hot encoding and the samples into bitmaps separately, and then merge them to obtain enough semantic information from data samples. In particular, the feature map obtained by VSCNN contains semantic information about the tables, joins, and predicates of SQL queries. Importantly, in order to improve the accuracy of cardinality estimation, we propose a negative sampling method for training the word vector by gradient descent from the base table and compress it into a bitmap. Extensive experiments are conducted and the results show that the q-error of the proposed vertical scanning convolutional neural network based model is reduced by at least 14.6% when compared with the estimators in traditional databases.
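A compact sketch of the input encoding this kind of estimator uses: a one-hot vector over the table names used by the query is concatenated with a bitmap marking which sampled base-table rows satisfy the query's predicates, producing the feature vector a CNN-style model could consume. Table names, sample size, and the predicate below are illustrative, not the paper's exact encoding.

```python
# Sketch of the query encoding: one-hot table names + bitmap over sampled rows.
import numpy as np

TABLES = ["title", "movie_info", "cast_info"]          # hypothetical table universe
SAMPLE_SIZE = 16                                       # rows sampled per base table

def one_hot_tables(used_tables):
    v = np.zeros(len(TABLES), dtype=np.float32)
    for t in used_tables:
        v[TABLES.index(t)] = 1.0
    return v

def sample_bitmap(sampled_rows, predicate):
    # 1 where the sampled row satisfies the predicate, 0 otherwise.
    return np.array([1.0 if predicate(r) else 0.0 for r in sampled_rows],
                    dtype=np.float32)

sampled_title_rows = [{"production_year": 1990 + i} for i in range(SAMPLE_SIZE)]
features = np.concatenate([
    one_hot_tables(["title", "movie_info"]),
    sample_bitmap(sampled_title_rows, lambda r: r["production_year"] > 2000),
])
print(features.shape)   # (len(TABLES) + SAMPLE_SIZE,) -> input to the neural estimator
```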
Mathematical programming problems with semi-continuous variables and cardinality constraint have many applications, including production planning, portfolio selection, compressed sensing and subset selection in regression. This class of problems can be modeled as mixed-integer programs with special structures and are in general NP-hard. In the past few years, based on new reformulations, approximation and relaxation techniques, promising exact and approximate methods have been developed. We survey in this paper these recent developments for this challenging class of mathematical programming problems.
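For concreteness, the standard mixed-integer reformulation of semi-continuous variables with a cardinality constraint, which underlies the special structure mentioned above (generic notation, with lower bounds ℓ_i > 0):

```latex
\begin{aligned}
& x_i \in \{0\} \cup [\ell_i, u_i], \quad i = 1,\dots,n, \qquad \|x\|_0 \le K \\[4pt]
\Longleftrightarrow\quad
& \ell_i z_i \le x_i \le u_i z_i, \quad z_i \in \{0,1\}, \quad i = 1,\dots,n,
  \qquad \sum_{i=1}^{n} z_i \le K .
\end{aligned}
```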
In this paper, we study the joint bandwidth allocation and path selection problem, which is an extension of the well-known network utility maximization (NUM) problem, via solving a multi-objective minimization problem under path cardinality constraints. Specifically, such a problem formulation captures various types of objectives, including proportional fairness and average delay, as well as load balancing. In addition, in order to handle "unsplittable flows", path cardinality constraints are added, making the resulting optimization problem quite challenging to solve due to intrinsic nonsmoothness and nonconvexity. Almost all existing works deal with such a problem using relaxation techniques to transform it into a convex optimization problem. However, we provide a novel solution framework based on the linearized alternating direction method of multipliers (LADMM) to split the original problem with coupling terms into several subproblems. We then show that these subproblems, albeit nonconvex and nonsmooth, are actually simple to solve and easy to implement, which can be of independent interest. Under some mild assumptions, we prove that any limiting point of the sequence generated by the proposed algorithm is a stationary point. Numerical simulations are performed to demonstrate the advantages of our proposed algorithm compared with various baselines.
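A generic way to write the multi-objective NUM problem with per-flow path-cardinality constraints referred to above (the notation is mine, not necessarily the paper's): x_{k,p} is the rate of flow k on path p, c_l are link capacities, and each flow may use at most P_k of its candidate paths.

```latex
\begin{aligned}
\min_{x \ge 0}\quad & \sum_{k} -U_k\Bigl(\sum_{p \in \mathcal{P}_k} x_{k,p}\Bigr)
  \;+\; \beta \sum_{l} D_l\Bigl(\sum_{(k,p):\, l \in p} x_{k,p}\Bigr) \\
\text{s.t.}\quad & \sum_{(k,p):\, l \in p} x_{k,p} \le c_l \qquad \forall\, l, \\
& \bigl\| (x_{k,p})_{p \in \mathcal{P}_k} \bigr\|_0 \le P_k \qquad \forall\, k,
\end{aligned}
```

where each U_k is a concave utility (for example the logarithm, giving proportional fairness) and each D_l is a convex link-cost or delay term; the l0 constraints are what make the problem nonsmooth and nonconvex.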
Counting the cardinality of flows for massive high-speed traffic over sliding windows is still a challenging task under time and space constraints, but plays a key role in many network applications, such as traffic management and routing optimization in software-defined networks. In this paper, we propose a novel data structure (called LRU-Sketch) to address the problem. The significant contributions are as follows. 1) The proposed data structure adapts a well-known probabilistic sketch to the sliding-window model; 2) By using the least-recently-used (LRU) replacement policy, we design a highly time-efficient algorithm for timely forgetting stale information, which takes constant (O(1)) time per time slot; 3) Moreover, a further memory-reducing scheme is given at the cost of very little loss of accuracy; 4) Comprehensive experiments, performed on two real IP trace files, confirm that the proposed scheme attains high accuracy and high time efficiency.
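To make the LRU-based forgetting idea concrete, the sketch below keeps an exact, non-probabilistic record of the last time slot in which each (flow, element) pair was seen, maintains the entries in least-recently-seen order, and evicts everything older than the window. The real LRU-Sketch is probabilistic and far more memory-efficient; this is only an illustration of the eviction pattern.

```python
# Exact illustration of sliding-window per-flow distinct counting with LRU eviction.
from collections import OrderedDict

class SlidingWindowCardinality:
    def __init__(self, window_slots: int):
        self.window = window_slots
        self.last_seen = OrderedDict()        # (flow, element) -> last time slot

    def observe(self, flow, element, slot):
        key = (flow, element)
        self.last_seen[key] = slot
        self.last_seen.move_to_end(key)       # LRU order: most recent at the end
        # Evict stale pairs from the front; amortized O(1) per insertion.
        while self.last_seen and next(iter(self.last_seen.values())) <= slot - self.window:
            self.last_seen.popitem(last=False)

    def cardinality(self, flow):
        return sum(1 for (f, _) in self.last_seen if f == flow)

sw = SlidingWindowCardinality(window_slots=3)
stream = [("f1", "a"), ("f1", "b"), ("f2", "a"), ("f1", "a"), ("f1", "c"), ("f1", "d")]
for slot, (flow, elem) in enumerate(stream):
    sw.observe(flow, elem, slot)
print(sw.cardinality("f1"))   # distinct f1 elements seen within the last 3 slots -> 3
```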
The notion of quasi-biorthogonal frame wavelets is a generalization of the notion of orthogonal wavelets. A quasi-biorthogonal frame wavelet with cardinality r consists of r pairs of functions. In this paper we first analyze the local property of the quasi-biorthogonal frame wavelet and show that each of its pairs of functions generates reconstruction formulas of the corresponding subspaces. Next we show that the lower bound of its cardinalities depends on a pair of dual frame multiresolution analyses deriving it. Finally, we present a split trick and show that any quasi-biorthogonal frame wavelet can be split into a new quasi-biorthogonal frame wavelet with an arbitrarily large cardinality. For generality, we work in the setting of matrix dilations.
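For context, the kind of reconstruction formula meant here is the textbook dual-frame expansion (generic notation, not the paper's specific construction): if {φ_k} is a frame of a closed subspace V and {φ̃_k} is a dual frame of it, then

```latex
f \;=\; \sum_{k}\langle f,\tilde{\varphi}_{k}\rangle\,\varphi_{k}
  \;=\; \sum_{k}\langle f,\varphi_{k}\rangle\,\tilde{\varphi}_{k}
  \qquad \text{for every } f \in V .
```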
The cardinality constrained mean–variance (CCMV) portfolio selection model aims to identify a subset of the candidate assets such that the constructed portfolio has a guaranteed expected return and minimum variance. By formulating this model as a mixed-integer quadratic program (MIQP), the exact solution can be obtained by a branch-and-bound algorithm. However, computational efficiency is the central issue in time-sensitive portfolio investment due to the problem's NP-hardness. To accelerate the solution of CCMV portfolio optimization problems, we develop various heuristic methods based on techniques such as continuous relaxation, l1-norm approximation, integer optimization, and relaxation of semi-definite programming (SDP). We evaluate our heuristic methods by applying them to the US equity market dataset. The experimental results show that our SDP-based method is effective in terms of computation time and approximation ratio. Our SDP-based method performs even better than a commercial MIQP solver when the computational time is limited. In addition, several investment companies in China have adopted our methods, gaining good returns. This paper sheds light on computation optimization for financial investments.
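One of the heuristic directions mentioned (an l1-norm / continuous-relaxation screen followed by a restricted re-solve) can be sketched as below with cvxpy on synthetic data; this is a generic two-stage heuristic, not the paper's exact procedure or its SDP relaxation.

```python
# Generic l1-relaxation heuristic for a CCMV-style problem (illustration only):
# 1) solve a relaxed problem with an l1 penalty, 2) keep the K largest weights,
# 3) re-solve the mean-variance problem restricted to those assets.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
N, K, r0 = 20, 5, 0.01
mu = rng.normal(0.03, 0.02, N)
A = rng.normal(size=(N, N))
Sigma = A @ A.T / N + 0.01 * np.eye(N)
Sigma = (Sigma + Sigma.T) / 2                        # ensure exact symmetry

# Stage 1: relaxed problem with an l1 penalty to encourage sparsity.
w = cp.Variable(N)
relaxed = cp.Problem(cp.Minimize(cp.quad_form(w, Sigma) + 0.1 * cp.norm1(w)),
                     [mu @ w >= r0, cp.sum(w) == 1])
relaxed.solve()
support = np.argsort(np.abs(w.value))[-K:]           # keep the K largest weights

# Stage 2: mean-variance re-solve restricted to the selected support.
ws = cp.Variable(K)
restricted = cp.Problem(cp.Minimize(cp.quad_form(ws, Sigma[np.ix_(support, support)])),
                        [mu[support] @ ws >= r0, cp.sum(ws) == 1])
restricted.solve()
print("selected assets:", sorted(support.tolist()))
print("restricted variance:", restricted.value)
```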
§1. Introduction. If we denote by o(X) the number of all open subsets of X, I. Juhász raised a famous question in [1]: whether o(X)^ω = o(X) for every infinite strongly Hausdorff space. In the same paper, he also proved that it is true for T3 hereditarily paracompact spaces and for topological groups. In [2], van Douwen and Zhou Hao-Xuan showed it is true for perfectly normal spaces and suggested the question of whether it holds for any hereditarily normal space X. On the other hand, we know from [4] that paracompactness can be characterized by collectionwise normality (CWN) together with θ-refinability
An edge coloring of a hypergraph H is a function φ: E(H) → {1, 2, …} such that φ(e1) ≠ φ(e2) holds for any pair of intersecting edges e1, e2. The minimum number of colors in edge colorings of H is called the chromatic index of H and is denoted by χ′(H). Erdős, Faber and Lovász proposed a famous conjecture that χ′(H) ≤ n holds for any loopless linear hypergraph H with n vertices. In this paper, we show that the conjecture is true for gap-restricted hypergraphs. Our result extends a result of Alesandroni in 2021.
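For reference, the Erdős–Faber–Lovász conjecture in the chromatic-index form used above (a linear hypergraph is one in which any two distinct edges share at most one vertex):

```latex
\chi'(H) \;\le\; n
\qquad \text{for every loopless linear hypergraph } H \text{ on } n \text{ vertices.}
```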
Up to now, the study of the cardinal numbers of fuzzy sets has advanced slowly, since it is very hard to give them an appropriate definition. Although a definition was given in [1], it relies on some harsh terms and is not reasonable, as we point out in this paper. In this paper, we give a general definition of fuzzy cardinal numbers. Based on this definition, we not only obtain a large part of the results with respect to cardinal numbers, but also give a few new properties of fuzzy cardinal numbers.