Abstract: This paper studies mass structured data storage and sorting algorithms and methodology for SQL databases in the big data environment. As the data storage market develops, the storage model is shifting from a server-centric model to a data-centric one. Storage has traditionally been treated as simply keeping a series of data, and management systems and storage devices have rarely considered the intrinsic value of the stored data. The prosperity of the Internet has changed data storage worldwide and has brought many new applications. Theoretically, the proposed algorithm is able to deal with massive data; numerically, it could improve processing accuracy and speed.
Funding: Supported by NSF B55101680, NTIF B2090571, B2110140, SCUT x2rjD2116860, Y1080170, Y1090160, Y1100030, Y1100050, Y1110020 and S1010561121, G101056137.
Abstract: "Data Structure and Algorithm", an important major subject in computer science, faces many problems in teaching. This paper introduces and analyzes the current situation and the problems in the study of this course. A "programming factory" method is then proposed, which serves as a practice-oriented platform for the teaching and study process. Good results have been obtained with this method.
Abstract: This paper describes a nearest neighbor (NN) search algorithm on the GBD (generalized BD) tree. The GBD tree is a spatial data structure suitable for two- or three-dimensional data and has good performance characteristics in dynamic data environments. In GIS and CAD systems, the R-tree and its successors have been used, and an NN search algorithm has also been proposed for the R-tree in an attempt to obtain good performance. The GBD tree, on the other hand, is superior to the R-tree with respect to exact match retrieval, because it holds auxiliary data that uniquely determines the position of an object in the structure. The proposed NN search algorithm relies on this property of the GBD tree. The NN search algorithm on the GBD tree was studied and its performance was evaluated through experiments.
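As a rough illustration of how an NN search can prune a spatial tree, here is a minimal Python sketch of best-first search over a generic hierarchy of bounding boxes. It is not the GBD-tree algorithm from the abstract: the `Node` layout, the `min_dist` bound and the function names are all assumptions made for the example, and the GBD tree's auxiliary data is not modeled.

```python
import heapq
import math
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Node:
    """Hypothetical tree node: a bounding box holding points and/or children."""
    lo: Point
    hi: Point
    points: List[Point] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def min_dist(q: Point, node: Node) -> float:
    """Lower bound on the distance from q to anything inside node's box."""
    dx = max(node.lo[0] - q[0], 0.0, q[0] - node.hi[0])
    dy = max(node.lo[1] - q[1], 0.0, q[1] - node.hi[1])
    return math.hypot(dx, dy)


def nearest(root: Node, q: Point) -> Tuple[Optional[Point], float]:
    """Best-first NN search: visit boxes in order of their lower bound."""
    best: Optional[Point] = None
    best_d = math.inf
    counter = 0  # tie-breaker so the heap never compares Node objects
    heap = [(min_dist(q, root), counter, root)]
    while heap:
        bound, _, node = heapq.heappop(heap)
        if bound >= best_d:
            break  # no remaining box can contain a closer point
        for p in node.points:
            d = math.dist(p, q)
            if d < best_d:
                best, best_d = p, d
        for child in node.children:
            child_bound = min_dist(q, child)
            if child_bound < best_d:
                counter += 1
                heapq.heappush(heap, (child_bound, counter, child))
    return best, best_d
```

The search stops as soon as the closest unexplored box is farther away than the best point found so far, which is what keeps the number of visited nodes small on well-separated data.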
Abstract: PL/SQL is the most common language for Oracle database applications. It allows the developer to create stored program units (procedures, functions, and packages) to improve software reusability and to hide the complexity of executing a specific operation behind a name. It also acts as an interface between the SQL database and DEVELOPER. It is therefore important to test these modules, which consist of procedures and functions. In this paper, a new genetic algorithm (GA) is used as a search technique to find the test data required, according to the branch criterion, to test stored PL/SQL program units. The experimental results show that this goal was not fully achieved: the test target is not reached in some branches, and the coverage percentage is 98%. A problem arises when the target branch depends on data retrieved from tables; in this case, the GA is not able to generate test cases for that branch.
Funding: This project was supported by the National Natural Science Foundation (No. 69934020).
Abstract: Input-output data fitting methods are often used to model nonlinear systems of unknown structure. Based on the model-on-demand approach, a multiple-model approach to modeling nonlinear systems is presented. The basic idea is to find, within vast historical input-output data sets of the system, the data that match the current working point, and then to build a local model using the Local Polynomial Fitting (LPF) algorithm. As the working point changes, multiple local models are built, which together realize exact modeling of the global system. Compared with other methods, the simulation results show good performance, with simple, effective and reliable estimation.
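The model-on-demand idea described above can be sketched in a few lines of Python. This is only a minimal single-input example under stated assumptions: "matching the working point" is taken to mean the `k` nearest historical samples, the local polynomial is of degree 1, and the Gaussian weighting is an arbitrary choice; none of these details come from the abstract.

```python
import math
from typing import Sequence


def local_linear_predict(xs: Sequence[float], ys: Sequence[float],
                         x0: float, k: int = 5) -> float:
    """Fit a weighted line to the k samples nearest x0 and evaluate it at x0."""
    # 1. Select the historical samples closest to the current working point.
    nearest = sorted(zip(xs, ys), key=lambda s: abs(s[0] - x0))[:k]
    # 2. Bandwidth = distance to the farthest selected sample (assumed rule).
    h = max(abs(x - x0) for x, _ in nearest) or 1.0
    # 3. Accumulate the weighted least-squares sums for y ~ a + b * (x - x0).
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in nearest:
        d = x - x0
        w = math.exp(-0.5 * (d / h) ** 2)  # Gaussian weight, always positive
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    det = s0 * s2 - s1 * s1
    if abs(det) < 1e-12:
        return t0 / s0  # degenerate case: fall back to the weighted mean
    # 4. The intercept a is the local model's prediction at x0.
    return (s2 * t0 - s1 * t1) / det
```

Calling the function again with a different `x0` builds a fresh local model from a different neighborhood, which is the sense in which several local models cover the global system.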
Funding: Supported by the Research Center of the College of Computer and Information Sciences, King Saud University, Saudi Arabia.
Abstract: The data structures used by an algorithm can have a great impact on its performance, particularly when solving large and complex problems such as multi-objective optimization problems (MOPs). Multi-objective evolutionary algorithms (MOEAs) are considered an attractive approach for solving MOPs, since they are able to explore several parts of the Pareto front simultaneously. The data structures used for storing and updating populations and non-dominated solutions (archives) may affect the efficiency of the search process. This article describes, in a comparative way, the data structures used in MOEAs for realizing populations and archives, emphasizing their computational requirements and general applicability as reported in the original work.
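To make the notion of an archive of non-dominated solutions concrete, the following Python sketch shows the simplest possible realization, an unsorted list that is scanned on every update. It is a baseline for illustration only, not one of the structures surveyed in the article; minimization of every objective and the function names are assumptions.

```python
from typing import List, Tuple

Objectives = Tuple[float, ...]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def update_archive(archive: List[Objectives],
                   candidate: Objectives) -> List[Objectives]:
    """Return the archive after offering it one candidate solution."""
    # Reject the candidate if an archive member dominates or duplicates it.
    if any(dominates(m, candidate) or m == candidate for m in archive):
        return archive
    # Otherwise drop every member the candidate dominates, then add it.
    return [m for m in archive if not dominates(candidate, m)] + [candidate]
```

Each update costs time linear in the archive size, which is the kind of computational requirement that motivates more elaborate archive structures.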
Abstract: Bayesian networks are a powerful class of graphical decision models used to represent causal relationships among variables. However, the reliability and integrity of learned Bayesian network models depend heavily on the quality of incoming data streams. One of the primary challenges with Bayesian networks is their vulnerability to adversarial data poisoning attacks, in which malicious data is injected into the training dataset to negatively influence the Bayesian network models and impair their performance. In this paper, we propose an efficient framework for detecting data poisoning attacks against Bayesian network structure learning algorithms. Our framework uses latent variables to quantify the amount of belief between every two nodes in each causal model over time. With regard to four different forms of data poisoning attacks, we specifically aim to strengthen the security and dependability of Bayesian network structure learning techniques such as the PC algorithm. In doing so, we explore the complexity of this area and offer workable methods for identifying and mitigating these threats. Our research also investigates one particular use case, the "Visit to Asia Network", and explores the practical consequences of using uncertainty as a way to spot cases of data poisoning. Our results demonstrate the promising efficacy of latent variables in detecting and mitigating the threat of data poisoning attacks. In addition, our proposed latent-based framework proves to be sensitive in detecting malicious data poisoning attacks in the context of stream data.
Abstract: A robust and efficient algorithm is presented for building multiresolution models (MRMs) of arbitrary meshes without requiring subdivision connectivity. To overcome the sampling difficulty of arbitrary meshes, edge contraction and vertex expansion are used as the downsampling and upsampling methods. Our MRM of a mesh is composed of a base mesh and a series of edge split operations, which are organized as a directed graph. Each split operation encodes two kinds of information: the modification to the mesh, and the dependency relation among splits. This organization ensures the efficiency and robustness of our MRM algorithm. Examples demonstrate the functionality of our method.
Funding: Supported by the Natural Science Foundation of Fujian under Grant No. A0510008.
Abstract: A new variant of HEAPSORT is presented in this paper. The algorithm is not an internal sorting algorithm in the strong sense, since extra storage for n integers is necessary. The new algorithm is similar to the classical sorting algorithm HEAPSORT, but it rebuilds the heap in a different way. Its basic idea is to use only one comparison at each node: the algorithm walks down a path in the heap until a leaf is reached. The requirement of placing the element at the root immediately at its destination is relaxed. The new algorithm requires about n log n - 0.788928n comparisons in the worst case and n log n - n comparisons on average, which is only about 0.4n more than necessary. On average it beats even the clever variants of QUICKSORT, provided n is not very small. The difference between the worst case and the best case indicates that there is still room for improving the new algorithm by constructing the heap more carefully.
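The "one comparison per node while walking down to a leaf" idea can be illustrated with the well-known bottom-up style of heap repair. The Python sketch below is that classic technique, offered only as an approximation of the idea: it is not the variant from the abstract, it does not use the extra storage for n integers the paper mentions, and the function names are assumptions.

```python
from typing import List, Sequence


def leaf_search(a: List[int], i: int, end: int) -> int:
    """Follow the larger child from i down to a leaf: one comparison per level."""
    j = i
    while 2 * j + 2 < end:  # node j has two children
        j = 2 * j + 1 if a[2 * j + 1] > a[2 * j + 2] else 2 * j + 2
    if 2 * j + 1 < end:     # a single child at the very end of the heap
        j = 2 * j + 1
    return j


def sift_down(a: List[int], i: int, end: int) -> None:
    """Restore the max-heap property below i within a[:end]."""
    x = a[i]
    j = leaf_search(a, i, end)
    while a[j] < x:         # climb back up to the slot where x belongs
        j = (j - 1) // 2
    while j > i:            # place x there and shift the path up one level
        a[j], x = x, a[j]
        j = (j - 1) // 2
    a[i] = x


def bottom_up_heapsort(items: Sequence[int]) -> List[int]:
    """Return a sorted copy of items using bottom-up heap repair."""
    a = list(items)
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):  # build the max-heap
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):      # repeatedly move the max to the end
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)
    return a
```

The saving comes from the walk down: a standard sift-down spends two comparisons per level (child against child, then child against the moving element), whereas here the moving element is compared only during the usually short climb back up.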
Funding: Supported by the National Natural Science Foundation of China (60233010, 60273034, 60403014), the 863 Program of China (2002AA116010), and the 973 Program of China (2002CB312002).
Abstract: Tree logic, inherited from ambient logic, was introduced as the formal foundation of related programming languages and type systems. In this paper, we introduce recursion into this logic system, so that it can describe tree data more clearly and concisely. By distinguishing between propositions and predicates, a concise semantic interpretation of our modal logic is given. We also develop a model checking algorithm for the logic without the △ operator, and show the correctness of the algorithm. This work can be seen as the basis for a semi-structured data processing language and a more flexible type system.
Abstract: Classical survival analysis assumes that all subjects will experience the event of interest, but in some cases a portion of the population may never encounter the event. These survival methods further assume independent survival times, which is not valid for honey bees, which live in nests. The study introduces a semi-parametric marginal proportional hazards mixture cure (PHMC) model with an exchangeable correlation structure, using generalized estimating equations for survival data analysis. The model was tested on clustered, right-censored bee survival data with a cured fraction, in which two bee species were exposed to different entomopathogens to test the effect of the entomopathogens on the survival of the bee species. The Expectation-Solution algorithm is used to estimate the parameters. The study notes a weak positive association between cure statuses (ρ1=0.0007) and between survival times for uncured bees (ρ2=0.0890), emphasizing their importance. The odds of being uncured are higher for A. mellifera than for M. ferruginea. The bee species A. mellifera is more susceptible to the entomopathogens icipe 7, icipe 20, and icipe 69. The Cox-Snell residuals show that the proposed semi-parametric PH model generally fits the data well compared with a model that assumes an independent correlation structure. Thus, the semi-parametric marginal proportional hazards mixture cure model is a parsimonious model for correlated bee survival data.
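For orientation, the usual textbook form of a proportional hazards mixture cure model is written out below. The notation is an assumption, since the abstract gives no formulas: \pi(z) is the probability of being uncured given covariates z, S_u is the survival function of uncured subjects given covariates x, and S_0 is an unspecified baseline survival function. The exchangeable correlation structure and the estimating equations of the study are not shown.

```latex
% Population survival: cured subjects never experience the event.
S_{\mathrm{pop}}(t \mid x, z) = 1 - \pi(z) + \pi(z)\, S_u(t \mid x)

% Incidence part: logistic model for the probability of being uncured.
\pi(z) = \frac{\exp(\gamma^{\top} z)}{1 + \exp(\gamma^{\top} z)}

% Latency part: proportional hazards for the uncured subjects.
S_u(t \mid x) = S_0(t)^{\exp(\beta^{\top} x)}
```

Under this form, a statement about the odds of being uncured refers to \pi(z) / (1 - \pi(z)) = \exp(\gamma^{\top} z).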