One of the most dangerous safety hazards in underground coal mines is roof falls during retreat mining. Roof falls may cause life-threatening and non-fatal injuries to miners and impede mining and transportation operations. As a result, a reliable roof fall prediction model is essential to tackle such challenges. Different parameters that substantially impact roof falls are ill-defined and intangible, making this an uncertain and challenging research issue. The National Institute for Occupational Safety and Health assembled a national database of roof performance from 37 coal mines to explore the factors contributing to roof falls. The data acquired for the 37 mines are limited due to several restrictions, which increases the likelihood of incompleteness. Fuzzy logic is a technique for coping with ambiguity, incompleteness, and uncertainty. Therefore, this paper presents a fuzzy inference method that employs a genetic algorithm to create fuzzy rules based on 109 records of roof fall data, and pattern search to refine the membership functions of the parameters. The performance of the deployed model is evaluated using statistical measures such as the root-mean-square error, mean absolute error, and coefficient of determination (R²). Based on these criteria, the suggested model outperforms existing models, precisely predicting roof fall rates with fewer fuzzy rules.
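The fuzzy inference idea in the abstract above can be illustrated with a minimal sketch. The membership function shape, the "depth of cover" input, and all parameter values below are illustrative assumptions, not taken from the paper:

```python
# Hypothetical sketch of a triangular membership function and a single
# fuzzy rule firing, of the kind a genetic algorithm could generate and
# pattern search could refine. All breakpoints here are made up.

def tri_mf(x, a, b, c):
    """Triangular membership: rises from a to a peak at b, falls to c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Illustrative fuzzification of one roof-fall driver, e.g. depth of cover (m).
depth = 210.0
low = tri_mf(depth, 0, 100, 250)
high = tri_mf(depth, 150, 300, 450)

# A Mamdani-style rule "IF depth is high THEN roof fall rate is high"
# fires with strength equal to its antecedent membership.
rule_strength = high
```

Pattern search would then shift the breakpoints (a, b, c) of each membership function to minimize prediction error over the 109 records.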
Evolutionary algorithms (EAs) have been used in high utility itemset mining (HUIM) to address the problem of discovering high utility itemsets (HUIs) in an exponential search space. EAs have good running and mining performance, but they still require huge computational resources and may miss many HUIs. Given how well EAs combine with graphics processing units (GPUs), we propose a parallel genetic algorithm (GA) for mining HUIs on the GPU platform (PHUI-GA). The improved evolution steps are performed on the central processing unit (CPU), and the computationally intensive steps are offloaded to the GPU for evaluation by multi-threaded processors. Experiments show that the mining performance of PHUI-GA outperforms existing EAs. When mining 90% of HUIs, PHUI-GA is up to 188 times faster than existing EAs and up to 36 times faster than a CPU-parallel approach.
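The core fitness evaluation in GA-based HUIM can be sketched in a few lines: the utility of a candidate itemset is the sum, over every transaction containing it, of its items' utilities in that transaction. The toy database and item names below are assumptions for illustration, not data from the PHUI-GA paper:

```python
# Minimal utility computation for a candidate itemset, the step a GA
# repeats for every individual in the population. In a GPU design such as
# the one the abstract describes, many candidates would be scored in
# parallel; here we show the single-candidate, pure-Python version.

def itemset_utility(itemset, transactions):
    """transactions: list of dicts mapping item -> utility in that transaction."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):  # itemset occurs in t
            total += sum(t[item] for item in itemset)
    return total

db = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 3, "c": 4},
    {"b": 6, "c": 2},
]

u_ac = itemset_utility({"a", "c"}, db)  # (5 + 1) + (3 + 4) = 13
```

An itemset is a HUI when this score meets a user-defined minimum utility threshold; the GA searches the itemset space for such candidates.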
Sparse large-scale multi-objective optimization problems (SLMOPs) are common in science and engineering. However, large-scale problems have high-dimensional decision spaces, requiring algorithms to traverse a vast expanse with limited computational resources. Furthermore, owing to sparsity, most variables in Pareto optimal solutions are zero, making it difficult for algorithms to identify non-zero variables efficiently. This paper is dedicated to addressing the challenges posed by SLMOPs. To start, we introduce innovative objective functions customized to mine maximum and minimum candidate sets. This enhancement dramatically improves the efficacy of frequent pattern mining: candidate sets are no longer selected based on the number of non-zero variables they contain but on a higher proportion of non-zero variables within specific dimensions. Additionally, we unveil a novel approach to association rule mining that delves into the relationships between non-zero variables. This methodology aids in identifying sparse distributions that can potentially expedite reductions in the objective function value. We extensively tested our algorithm across eight benchmark problems and four real-world SLMOPs. The results demonstrate that our approach achieves competitive solutions across various challenges.
By analyzing the correlation between courses in students' grades, we can provide a decision-making basis for the revision of courses and syllabi, rationally optimize curricula, and further improve teaching effects. With the help of the IBM SPSS Modeler data mining software, this paper uses the Apriori algorithm for association rule mining to conduct an in-depth analysis of the grades of nursing students at Shandong College of Traditional Chinese Medicine and to explore the correlation between professional basic courses and professional core courses. Lastly, through detailed analysis of the mining results, valuable curriculum information is extracted from the actual teaching data.
The risk of rockbursts is one of the main threats in hard coal mines. Compared to other underground mines, the number of factors contributing to rockbursts at underground coal mines is much greater. Factors such as the coal seam's tendency to rockbursts, the thickness of the coal seam, and the stress level in the seam have to be considered, and the entire coal seam-surrounding rock system has to be evaluated when trying to predict rockbursts. Moreover, in hard coal mines there are stroke or stress-stroke rockbursts, in which the fracture of a thick layer of sandstone plays an essential role in prediction. The occurrence of rockbursts in coal mines is complex, and their prediction is even more difficult than in other mines. In recent years, interest in machine learning algorithms for solving complex nonlinear problems has increased, which also applies to the geosciences. This study attempts to use machine learning algorithms, i.e. neural networks, decision trees, random forests, gradient boosting, and extreme gradient boosting (XGB), to assess the rockburst hazard of an active hard coal mine in the Upper Silesian Coal Basin. The rock mass bursting tendency index WTG, which describes the tendency of the seam-surrounding rock system to rockbursts, and the anomaly of the vertical stress component were applied for this purpose. In particular, the decision tree and neural network models proved effective in correctly distinguishing rockbursts from tremors after which the excavation was not damaged. On average, these models correctly classified about 80% of the rockbursts in the testing datasets.
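The classification task above, separating rockbursts from harmless tremors by feature thresholds, can be illustrated with a one-split decision stump, the simplest ancestor of the decision trees the study used. The feature values and labels below are synthetic, standing in for the real WTG and stress-anomaly data:

```python
# Illustrative decision stump: find the single threshold on one feature
# that best separates rockbursts (label 1) from damage-free tremors
# (label 0). A decision tree repeats this split search recursively.

def best_stump(samples):
    """samples: list of (feature_value, label). Returns (threshold, accuracy)."""
    best = (None, 0.0)
    for thr in sorted(v for v, _ in samples):
        # Predict "rockburst" whenever the feature meets the threshold.
        correct = sum((v >= thr) == bool(y) for v, y in samples)
        acc = correct / len(samples)
        if acc > best[1]:
            best = (thr, acc)
    return best

# Synthetic "bursting tendency" values; 1 = rockburst, 0 = tremor.
data = [(0.2, 0), (0.4, 0), (0.5, 0), (0.7, 1), (0.9, 1), (1.1, 1)]
thr, acc = best_stump(data)
```

On real data a single threshold would not classify perfectly; the ~80% test accuracy the study reports comes from full trees and neural networks over several features.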
An optimal layout, or three-dimensional spatial distribution of stopes, guarantees maximum profitability over the life span of an underground mining operation. Thus, stope optimization is one of the key areas in underground mine planning practice. However, the computational complexity of developing an optimal stope layout has limited the availability of algorithms offering a solution to this problem. This article presents a new and efficient heuristic algorithm that takes a three-dimensional ore body model as input, maximizes the economic value, and satisfies the physical mining and geotechnical constraints to generate an optimal stope layout. An implementation at a copper deposit demonstrates the applicability and robustness of the algorithm. A parallel-processing-based modification that improves the performance of the original algorithm, saving enormous computational time, is also presented.
Mining frequent patterns in transaction databases, time-series databases, and many other kinds of databases has been studied extensively in data mining research. Most previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is very costly. Han J. proposed a novel algorithm, FP-growth, that can generate frequent patterns without a candidate set. Based on an analysis of FP-growth, this paper proposes the concept of an equivalent FP-tree and an improved algorithm, denoted FP-growth*, which is much faster and easier to implement. FP-growth* adopts a modified structure of the FP-tree and header table, only generates a header table in each recursive operation, and projects the tree onto the original FP-tree. The two algorithms produce the same frequent pattern set on the same transaction database, but performance studies show that the improved algorithm, FP-growth*, is at least twice as fast as FP-growth.
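To make the "no candidate generation" idea concrete, here is a compact Eclat-style miner that grows frequent itemsets depth-first by intersecting transaction-id sets. It is not FP-growth* (which works on a compressed FP-tree), only a small runnable illustration of pattern growth without Apriori's generate-and-test loop:

```python
# Eclat-style frequent itemset mining: vertical tid-set layout, depth-first
# prefix growth, support counted by set intersection. Toy data below.
from collections import defaultdict

def eclat(transactions, min_support):
    # Vertical layout: item -> set of ids of transactions containing it.
    tidsets = defaultdict(set)
    for tid, t in enumerate(transactions):
        for item in t:
            tidsets[item].add(tid)
    frequent = {}

    def grow(prefix, items):
        for i, (item, tids) in enumerate(items):
            if len(tids) >= min_support:
                new_prefix = prefix + (item,)
                frequent[new_prefix] = len(tids)
                # Extend the prefix: intersect with the remaining items' tid-sets.
                suffix = [(other, otids & tids) for other, otids in items[i + 1:]]
                grow(new_prefix, suffix)

    grow((), sorted(tidsets.items()))
    return frequent

db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
freq = eclat(db, min_support=2)  # e.g. ("a", "b") has support 2
```

No candidate set is ever materialized and tested against the database: supports fall out of the intersections as the prefix grows.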
Under the condition of a designated collection ratio and interfused ratio of mullock, and to ensure the least energy consumption, the parameters of the collecting head (the feed speed, the axis height of the collecting head, and the rotation speed) are chosen as the optimization parameters. According to the force on the cutting pick, the collecting size of the cobalt crust and bedrock, and the optimized energy consumption of the collecting head, the optimized design model of the collecting head is built. Taking two hundred groups of seabed microtopography as samples, in the range of depth displacement from 4.5 to 5.5 cm, and making use of an improved simulated annealing genetic algorithm (SAGA), the corresponding optimized results are obtained. At the same time, in order to speed up the control of the collecting head, the optimization results are analyzed using regression analysis, and a conclusion on the second parameter of the seabed microtopography is drawn.
Because data warehouses change frequently, incremental data renders formerly mined knowledge unavailable. In order to maintain the discovered knowledge and patterns dynamically, this study presents a novel algorithm for updating global frequent patterns, IPARUC. First, a rapid clustering method is introduced to divide the database into n parts, such that the data within each part are similar. Then, the nodes in the tree are adjusted dynamically during insertion by "pruning and laying back" to keep the frequencies in descending order, so that nodes can be shared for near-optimal performance. Finally, local frequent itemsets mined from each local dataset are merged into global frequent itemsets. The results of the experimental study are very encouraging: experiments show that IPARUC is more effective and efficient than the two contrasting methods. Furthermore, there is significant application potential in a prototype Web log analyzer for web usage mining, which can help discover useful knowledge effectively and even support managers' decision making.
This paper integrates genetic algorithm and neural network techniques to build new temporal predictive analysis tools for geographic information systems (GIS). These new GIS tools can be readily applied in spatial and temporal research to patch the gaps in GIS data mining and knowledge discovery functions. The specific achievement here is the integration of related artificial intelligence technologies into GIS software to establish a conceptual spatial and temporal analysis framework, and, using this framework, the development of an artificial intelligent spatial and temporal information analyst (ASIA) system, which is then fully utilized in an existing GIS package. A study of air pollutant forecasting provides a practical geographical case that demonstrates the soundness of the conceptual temporal analysis framework.
Many business applications rely on their historical data to predict their business future. The marketing of products is one of the core processes for a business, and customer needs give a useful piece of information that helps to market the appropriate products at the appropriate time. Moreover, services have recently come to be considered products, and the development of education and health services depends on historical data. Furthermore, reducing problems and crimes on online social media networks needs a significant source of information. Data analysts need an efficient classification algorithm to predict the future of such businesses; however, dealing with a huge quantity of data requires great processing time. Data mining involves many useful techniques that are used to predict statistical data in a variety of business applications, and classification is one of the most widely used techniques, with a variety of algorithms. In this paper, various classification algorithms are reviewed in terms of accuracy in different areas of data mining applications. A comprehensive analysis is made after a dedicated reading of 20 papers in the literature. This paper aims to help data analysts choose the most suitable classification algorithm for different business applications, including business in general, online social media networks, agriculture, health, and education. Results show that FFBPN is the most accurate algorithm in the business domain. The Random Forest algorithm is the most accurate in classifying online social network (OSN) activities. The Naïve Bayes algorithm is the most accurate for agriculture datasets. OneR is the most accurate algorithm for the health domain. The C4.5 decision tree algorithm is the most accurate for classifying students' records to predict degree completion time.
The Apriori algorithm is a classical method of association rule mining. Based on an analysis of this theory, the paper provides an improved Apriori algorithm that combines a hash table technique with the reduction of candidate itemsets to enhance the usage efficiency of resources as well as the individualized service of the data library.
A new classification algorithm for web mining is proposed on the basis of general classification algorithms for data mining, in order to implement personalized information services. A tree-building method that detects class thresholds is used to construct the decision tree according to the concept of user expectation, so as to find classification rules in different layers. Compared with the traditional C4.5 algorithm, the overfitting shortcoming of C4.5 is improved, so that classification results not only have much higher accuracy but also statistical meaning.
The generation of maximum frequent patterns from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures between items hidden in a large database. By exploiting quantum computing, we propose an efficient quantum search algorithm design to discover the maximum frequent patterns. We modified Grover's search algorithm so that a subspace of arbitrary symmetric states is used instead of the whole search space. We present a novel quantum oracle design that employs a quantum counter to count the maximum frequent items and a quantum comparator to check against a minimum support threshold. The proposed algorithm increases the rate of correct solutions, since the search runs only in a subspace. Furthermore, our algorithm significantly scales down and optimizes the required number of qubits in the design, which reflects positively on performance. Our proposed design can accommodate more transactions and items and still perform well with a small number of qubits.
Edit distance measures the similarity between two strings (as the minimum number of change, insert, or delete operations that transform one string into the other). An edit sequence s is a sequence of such operations and can be used to represent the string resulting from applying s to a reference string. We present a modification to Ukkonen's edit distance algorithm based upon representing strings by edit sequences. We conclude with a demonstration of how this representation can improve mitochondrial DNA query throughput in a distributed computing environment.
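As background for the modification the abstract describes, the textbook dynamic-programming edit distance looks like this; Ukkonen's banded refinement and the edit-sequence representation themselves are not shown:

```python
# Standard Levenshtein edit distance via dynamic programming, keeping
# only two rows of the DP table. dp cell (i, j) = distance between
# s[:i] and t[:j].

def edit_distance(s, t):
    m, n = len(s), len(t)
    prev = list(range(n + 1))  # row for i = 0: j inserts
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            cur[j] = min(prev[j] + 1,         # delete s[i-1]
                         cur[j - 1] + 1,      # insert t[j-1]
                         prev[j - 1] + cost)  # change or match
        prev = cur
    return prev[n]
```

For example, `edit_distance("kitten", "sitting")` is 3 (two changes and one insert). Ukkonen's algorithm computes the same quantity while visiting only a diagonal band of this table.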
In this research article, we analyze multimedia data mining and classification algorithms based on database optimization techniques. High-performance application requirements of various kinds are constantly springing up, which makes parallel computer architectures increasingly valued, yet the development of the corresponding software systems lags far behind that of the hardware; this is especially evident in the field of database technology applications. Multimedia mining differs from low-level computer multimedia processing technology: the former focuses on patterns extracted from huge multimedia collections, while the latter focuses on understanding or extracting specific features from single multimedia objects. Our research provides a new paradigm for this methodology, which is meaningful and necessary.
The study aims to recognize how efficiently Educational Data Mining (EDM) integrates with Artificial Intelligence (AI) to develop skills for predicting students' performance. The study used a survey questionnaire and collected data from 300 undergraduate students of Al Neelain University. In the first step, initial population placements were created using Particle Swarm Optimization (PSO). Then, using an adaptive feature space search, Educational Grey Wolf Optimization (EGWO) was employed to choose the optimal attribute combination. The second stage uses an SVM classifier to forecast classification accuracy. Different classifiers were utilized to evaluate the performance of students. According to the results, AI could forecast the final grades of students with an accuracy rate of 97% on the test dataset. Furthermore, the present study showed that successful students could be selected by the Decision Tree model with an efficiency rate of 87.50% and could be categorized as having equal information gain ratio after the semester, while random forest provided an accuracy of 28%. These findings indicate a higher accuracy rate when these models were implemented on the dataset, providing significantly more accurate results than a linear regression model (12% accuracy). The study concluded that the methodology used here can prove helpful for students and teachers in upgrading academic performance, reducing the chances of failure, and taking appropriate steps at the right time to raise the standards of education. The study also motivates academics to assess and explore EDM at other universities.
The Travelling Salesman Problem (TSP) is a classical optimization problem and one of the class of NP problems. The purpose of this work is to apply data mining methodologies to explore the patterns in data generated by an Ant Colony Algorithm (ACA) performing a search operation, and to develop a rule-set searcher that approximates the ACA's searcher. An attribute-oriented induction methodology was used to explore the relationship between an operation sequence and its attributes, and a set of rules has been developed. The experimental results show that the proposed approach performs well with respect to both the quality of the solution and the speed of computation.
Due to the difficulties in obtaining large-deformation mining subsidence using differential Interferometric Synthetic Aperture Radar (D-InSAR) alone, a new algorithm was proposed to extract large-deformation mining subsidence using the D-InSAR technique and the probability integral method. The details of the algorithm are as follows: first, a control-point set was established, containing correctly phase-unwrapped points on the subsidence basin edge generated by D-InSAR and several observation points (near the maximum subsidence and inflection points); a genetic algorithm (GA) was then used to optimize the parameters of the probability integral method; finally, the surface subsidence was deduced from the optimum parameters. The results of an experiment in the Huaibei mining area, China, show that the presented method can generate the correct mining subsidence basin with a few surface observations, and the relative error of the maximum subsidence point is about 8.3%, which is much better than that of conventional D-InSAR (relative error 68.0%).
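The GA-based parameter fitting step can be sketched schematically: candidate parameter vectors are scored by their misfit against the observed subsidence points, and the population evolves toward the best fit. The one-parameter stand-in model and the observation values below are assumptions for illustration; the real method fits the multi-parameter probability integral formulas (subsidence factor, tangent of the major influence angle, etc.):

```python
# Schematic GA fitting one model parameter to observed subsidence points.
# model() is a stand-in curve, NOT the probability integral method.
import random

random.seed(0)

def model(x, w):
    return w * x * (1 - x)  # toy subsidence profile over normalized position x

obs = [(0.25, 0.9375), (0.5, 1.25), (0.75, 0.9375)]  # generated with w = 5

def misfit(w):
    # Sum of squared residuals against the observation points.
    return sum((model(x, s_obs := None) if False else model(x, w) - s) ** 2
               for x, s in obs)

pop = [random.uniform(0, 10) for _ in range(30)]
for _ in range(60):
    pop.sort(key=misfit)          # elitist selection
    parents = pop[:10]
    pop = parents + [             # averaging crossover + Gaussian mutation
        (random.choice(parents) + random.choice(parents)) / 2
        + random.gauss(0, 0.1)
        for _ in range(20)
    ]
best = min(pop, key=misfit)       # should recover w close to 5
```

Once the parameters are recovered, the full subsidence basin is reconstructed by evaluating the (real) model everywhere, which is how the method fills in where D-InSAR phase unwrapping fails.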
文摘One of the most dangerous safety hazard in underground coal mines is roof falls during retreat mining.Roof falls may cause life-threatening and non-fatal injuries to miners and impede mining and transportation operations.As a result,a reliable roof fall prediction model is essential to tackle such challenges.Different parameters that substantially impact roof falls are ill-defined and intangible,making this an uncertain and challenging research issue.The National Institute for Occupational Safety and Health assembled a national database of roof performance from 37 coal mines to explore the factors contributing to roof falls.Data acquired for 37 mines is limited due to several restrictions,which increased the likelihood of incompleteness.Fuzzy logic is a technique for coping with ambiguity,incompleteness,and uncertainty.Therefore,In this paper,the fuzzy inference method is presented,which employs a genetic algorithm to create fuzzy rules based on 109 records of roof fall data and pattern search to refine the membership functions of parameters.The performance of the deployed model is evaluated using statistical measures such as the Root-Mean-Square Error,Mean-Absolute-Error,and coefficient of determination(R_(2)).Based on these criteria,the suggested model outperforms the existing models to precisely predict roof fall rates using fewer fuzzy rules.
基金This work was supported by the National Natural Science Foundation of China(62073155,62002137,62106088,62206113)the High-End Foreign Expert Recruitment Plan(G2023144007L)the Fundamental Research Funds for the Central Universities(JUSRP221028).
文摘Evolutionary algorithms(EAs)have been used in high utility itemset mining(HUIM)to address the problem of discover-ing high utility itemsets(HUIs)in the exponential search space.EAs have good running and mining performance,but they still require huge computational resource and may miss many HUIs.Due to the good combination of EA and graphics processing unit(GPU),we propose a parallel genetic algorithm(GA)based on the platform of GPU for mining HUIM(PHUI-GA).The evolution steps with improvements are performed in central processing unit(CPU)and the CPU intensive steps are sent to GPU to eva-luate with multi-threaded processors.Experiments show that the mining performance of PHUI-GA outperforms the existing EAs.When mining 90%HUIs,the PHUI-GA is up to 188 times better than the existing EAs and up to 36 times better than the CPU parallel approach.
基金support by the Open Project of Xiangjiang Laboratory(22XJ02003)the University Fundamental Research Fund(23-ZZCX-JDZ-28,ZK21-07)+5 种基金the National Science Fund for Outstanding Young Scholars(62122093)the National Natural Science Foundation of China(72071205)the Hunan Graduate Research Innovation Project(CX20230074)the Hunan Natural Science Foundation Regional Joint Project(2023JJ50490)the Science and Technology Project for Young and Middle-aged Talents of Hunan(2023TJZ03)the Science and Technology Innovation Program of Humnan Province(2023RC1002).
文摘Sparse large-scale multi-objective optimization problems(SLMOPs)are common in science and engineering.However,the large-scale problem represents the high dimensionality of the decision space,requiring algorithms to traverse vast expanse with limited computational resources.Furthermore,in the context of sparse,most variables in Pareto optimal solutions are zero,making it difficult for algorithms to identify non-zero variables efficiently.This paper is dedicated to addressing the challenges posed by SLMOPs.To start,we introduce innovative objective functions customized to mine maximum and minimum candidate sets.This substantial enhancement dramatically improves the efficacy of frequent pattern mining.In this way,selecting candidate sets is no longer based on the quantity of nonzero variables they contain but on a higher proportion of nonzero variables within specific dimensions.Additionally,we unveil a novel approach to association rule mining,which delves into the intricate relationships between non-zero variables.This novel methodology aids in identifying sparse distributions that can potentially expedite reductions in the objective function value.We extensively tested our algorithm across eight benchmark problems and four real-world SLMOPs.The results demonstrate that our approach achieves competitive solutions across various challenges.
文摘By analyzing the correlation between courses in students’grades,we can provide a decision-making basis for the revision of courses and syllabi,rationally optimize courses,and further improve teaching effects.With the help of IBM SPSS Modeler data mining software,this paper uses Apriori algorithm for association rule mining to conduct an in-depth analysis of the grades of nursing students in Shandong College of Traditional Chinese Medicine,and to explore the correlation between professional basic courses and professional core courses.Lastly,according to the detailed analysis of the mining results,valuable curriculum information will be found from the actual teaching data.
基金supported by the Ministry of Science and Higher Education, Republic of Poland (Statutory Activity of the Central Mining Institute, Grant No. 11133010)
文摘The risk of rockbursts is one of the main threats in hard coal mines. Compared to other underground mines, the number of factors contributing to the rockburst at underground coal mines is much greater.Factors such as the coal seam tendency to rockbursts, the thickness of the coal seam, and the stress level in the seam have to be considered, but also the entire coal seam-surrounding rock system has to be evaluated when trying to predict the rockbursts. However, in hard coal mines, there are stroke or stress-stroke rockbursts in which the fracture of a thick layer of sandstone plays an essential role in predicting rockbursts. The occurrence of rockbursts in coal mines is complex, and their prediction is even more difficult than in other mines. In recent years, the interest in machine learning algorithms for solving complex nonlinear problems has increased, which also applies to geosciences. This study attempts to use machine learning algorithms, i.e. neural network, decision tree, random forest, gradient boosting, and extreme gradient boosting(XGB), to assess the rockburst hazard of an active hard coal mine in the Upper Silesian Coal Basin. The rock mass bursting tendency index WTGthat describes the tendency of the seam-surrounding rock system to rockbursts and the anomaly of the vertical stress component were applied for this purpose. Especially, the decision tree and neural network models were proved to be effective in correctly distinguishing rockbursts from tremors, after which the excavation was not damaged. On average, these models correctly classified about 80% of the rockbursts in the testing datasets.
文摘An optimal layout or three-dimensional spatial distribution of stopes guarantees the maximum profitability over life span of an underground mining operation.Thus,stope optimization is one of the key areas in underground mine planning practice.However,the computational complexity in developing an optimal stope layout has been a reason for limited availability of the algorithms offering solution to this problem.This article shares a new and efficient heuristic algorithm that considers a three-dimensional ore body model as an input,maximizes the economic value,and satisfies the physical mining and geotechnical constraints for generating an optimal stope layout.An implementation at a copper deposit demonstrates the applicability and robustness of the algorithm.A parallel processing based modification improving the performance of the original algorithm in terms of enormous computational time saving is also presented.
基金theFundoftheNationalManagementBureauofTraditionalChineseMedicine(No .2 0 0 0 J P 5 4 )
Abstract: Mining frequent patterns in transaction databases, time-series databases, and many other kinds of databases has been studied widely in data mining research. Most previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is very costly. Han J. proposed a novel algorithm, FP-growth, that can generate frequent patterns without a candidate set. Based on an analysis of FP-growth, this paper proposes the concept of an equivalent FP-tree and an improved algorithm, denoted FP-growth*, which is much faster and easy to implement. FP-growth* adopts a modified structure of the FP-tree and header table, and only generates a header table in each recursive operation, projecting the tree onto the original FP-tree. The two algorithms produce the same frequent pattern set on the same transaction database, but performance studies show that the improved algorithm, FP-growth*, is at least twice as fast as FP-growth.
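For readers unfamiliar with the data structure the paper modifies, the sketch below builds a plain FP-tree with its header table; the mining recursion itself is omitted. The transactions and support threshold are illustrative only.

```python
from collections import defaultdict

# A minimal FP-tree with a header table, the data structure underlying
# FP-growth (and the modified variant described above). Mining itself is
# omitted; this only shows the candidate-free compression of transactions.

class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count, self.children, self.link = 1, {}, None

def build_fp_tree(transactions, min_support):
    # First pass: global item frequencies, then drop infrequent items
    freq = defaultdict(int)
    for t in transactions:
        for item in t:
            freq[item] += 1
    freq = {i: c for i, c in freq.items() if c >= min_support}
    root, header = Node(None, None), {}
    for t in transactions:
        # keep frequent items, ordered by descending global frequency
        items = sorted((i for i in t if i in freq), key=lambda i: (-freq[i], i))
        node = root
        for item in items:
            if item in node.children:
                node.children[item].count += 1
            else:
                child = Node(item, node)
                node.children[item] = child
                # thread the new node onto the header-table link list
                child.link, header[item] = header.get(item), child
            node = node.children[item]
    return root, header

transactions = [{"a", "b"}, {"b", "c"}, {"a", "b", "c"}, {"b"}]
root, header = build_fp_tree(transactions, 2)
print(root.children["b"].count)  # all four transactions share the "b" prefix
```

Because transactions sharing a frequent prefix share a path, the tree is typically far smaller than the database, which is what makes candidate-free mining possible.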
Funding: Project (50875265) supported by the National Natural Science Foundation of China; Project (20080440992) supported by the Postdoctoral Science Foundation of China; Project (2009SK3159) supported by the Technology Support Plan of Hunan Province, China
Abstract: Under the condition of the designated collection ratio and the interfused ratio of mullock, and to ensure the least energy consumption, the parameters of the collecting head (feed speed, axis height of the collecting head, and rotation speed) are chosen as the optimization parameters. According to the force on the cutting pick, the collecting size of the cobalt crust and bedrock, and the optimized energy consumption of the collecting head, the optimization model of the collecting head is built. Taking two hundred groups of seabed microtopography as samples, in the range of depth displacement from 4.5 to 5.5 cm, the corresponding optimized results are obtained using an improved simulated annealing genetic algorithm (SAGA). At the same time, in order to speed up the control of the collecting head, the optimization results are analyzed using regression analysis, and a conclusion about the second parameter of the seabed microtopography is drawn.
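The annealing half of the hybrid SAGA optimizer can be sketched as below. The energy function here is an invented stand-in for the collecting-head energy-consumption model (which in the paper depends on feed speed, head height, and rotation speed), and all parameters are illustrative.

```python
import math
import random

# A bare-bones simulated-annealing loop of the kind combined with a GA in the
# study above. The quadratic "energy" is a synthetic stand-in for the
# collecting-head energy-consumption model.

def anneal(energy, x0, step=0.5, t0=1.0, cooling=0.995, iters=2000, seed=42):
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    t = t0
    for _ in range(iters):
        cand = x + rng.uniform(-step, step)
        ec = energy(cand)
        # accept better moves always, worse moves with Boltzmann probability
        if ec < e or rng.random() < math.exp((e - ec) / t):
            x, e = cand, ec
        t *= cooling
    return x, e

# Stand-in energy: minimum value 1 at x = 2
best_x, best_e = anneal(lambda x: (x - 2) ** 2 + 1, x0=10.0)
print(best_x, best_e)
```

Early on, the high temperature lets the search accept uphill moves and escape local minima; as the temperature cools, it degenerates into hill climbing around the best basin found.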
Funding: Supported by the National Natural Science Foundation of China (60472099) and the Ningbo Natural Science Foundation (2006A610017)
Abstract: Because data warehouses change frequently, incremental data can invalidate knowledge that was mined earlier. In order to maintain the discovered knowledge and patterns dynamically, this study presents IPARUC, a novel updating algorithm for global frequent patterns. First, a rapid clustering method divides the database into n parts, with similar data placed in the same part. Then, the nodes in the tree are adjusted dynamically during insertion by "pruning and laying back" to keep the frequencies in descending order, so that nodes can be shared for near-optimal performance. Finally, local frequent itemsets mined from each local dataset are merged into global frequent itemsets. The experimental results are very encouraging: IPARUC is more effective and efficient than the two contrasted methods. Furthermore, there is significant application potential for a prototype Web log analyzer in web usage mining, which can help discover useful knowledge effectively and even support managerial decision-making.
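The final merge step rests on a classical partition property: an itemset that is globally frequent must be locally frequent in at least one partition, so the union of local results is a complete candidate set that one global scan can verify. A brute-force sketch of that idea (not IPARUC's tree machinery) follows, with toy data.

```python
from itertools import combinations

# Sketch of the partition-then-merge idea behind IPARUC's last phase: local
# frequent itemsets from each partition form the candidates, and one pass over
# the whole database confirms which are globally frequent.

def local_frequent(part, min_ratio):
    """Brute-force frequent itemsets within one partition."""
    items = sorted({i for t in part for i in t})
    result = set()
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            s = set(combo)
            if sum(s <= t for t in part) >= min_ratio * len(part):
                result.add(frozenset(s))
    return result

def global_frequent(partitions, min_ratio):
    # Union of local winners = complete global candidate set
    candidates = set().union(*(local_frequent(p, min_ratio) for p in partitions))
    db = [t for p in partitions for t in p]
    return {c for c in candidates
            if sum(c <= t for t in db) >= min_ratio * len(db)}

p1 = [{"a", "b"}, {"a", "c"}]
p2 = [{"a", "b"}, {"b", "c"}]
result = global_frequent([p1, p2], min_ratio=0.5)
print(sorted("".join(sorted(s)) for s in result))  # ['a', 'ab', 'b', 'c']
```

IPARUC replaces the brute force with its clustered partitions and adjusted-tree mining, but the completeness of the merged candidate set is the same guarantee.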
Abstract: This paper integrates genetic algorithm and neural network techniques to build new temporal predictive analysis tools for geographic information systems (GIS). These new GIS tools can be readily applied in a practical and appropriate manner in spatial and temporal research to patch the gaps in GIS data mining and knowledge discovery functions. The specific achievement here is the integration of related artificial intelligence technologies into GIS software to establish a conceptual spatial and temporal analysis framework, and, by using this framework, the development of an artificial intelligent spatial and temporal information analyst (ASIA) system, which is then fully utilized in the existing GIS package. A study of air pollutant forecasting provides a practical geographical case to prove the soundness of the conceptual temporal analysis framework.
Abstract: Many business applications rely on their historical data to predict their business future. The product marketing process is one of the core processes for a business. Customer needs give a useful piece of information that helps to market the appropriate products at the appropriate time. Moreover, services have recently come to be considered as products. The development of education and health services depends on historical data. Furthermore, reducing online social media network problems and crimes needs a significant source of information. Data analysts need to use an efficient classification algorithm to predict the future of such businesses. However, dealing with a huge quantity of data requires great time to process. Data mining involves many useful techniques that are used to predict statistical data in a variety of business applications. The classification technique is one of the most widely used, with a variety of algorithms. In this paper, various classification algorithms are reviewed in terms of accuracy in different areas of data mining applications. A comprehensive analysis is made after a careful reading of 20 papers in the literature. This paper aims to help data analysts choose the most suitable classification algorithm for different business applications, including business in general, online social media networks, agriculture, health, and education. Results show FFBPN is the most accurate algorithm in the business domain. The Random Forest algorithm is the most accurate in classifying online social network (OSN) activities. The Naïve Bayes algorithm is the most accurate for classifying agriculture datasets. OneR is the most accurate algorithm for classifying instances within the health domain. The C4.5 Decision Tree algorithm is the most accurate for classifying students' records to predict degree completion time.
Abstract: The Apriori algorithm is a classical method of association rule mining. Based on an analysis of this theory, the paper puts forward an improved Apriori algorithm that combines a hash table technique with reduction of candidate itemsets, to enhance the usage efficiency of resources as well as the individualized service of the data library.
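A compact Apriori pass illustrating the two ingredients the paper combines, hash-table counting (a Python dict here) and subset-based pruning of candidate itemsets, is sketched below. The dataset and support threshold are illustrative, not the paper's.

```python
from collections import defaultdict
from itertools import combinations

# Minimal Apriori with hash-table counting and candidate pruning: a k-itemset
# survives as a candidate only if all of its (k-1)-subsets were frequent.

def apriori(transactions, min_support):
    # L1: frequent single items, counted in a hash table
    counts = defaultdict(int)
    for t in transactions:
        for item in t:
            counts[frozenset([item])] += 1
    frequent = {s for s, c in counts.items() if c >= min_support}
    result, k = set(frequent), 2
    while frequent:
        # candidate generation with subset-based pruning
        candidates = {a | b for a in frequent for b in frequent
                      if len(a | b) == k and
                      all(frozenset(sub) in frequent
                          for sub in combinations(a | b, k - 1))}
        counts = defaultdict(int)
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {s for s, c in counts.items() if c >= min_support}
        result |= frequent
        k += 1
    return result

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
res = apriori(transactions, min_support=2)
print(sorted("".join(sorted(s)) for s in res))  # ['a', 'ab', 'ac', 'b', 'bc', 'c']
```

Real implementations hash candidates into buckets during the previous scan to prune even earlier; the dict-based counting above keeps the sketch short.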
Abstract: A new classification algorithm for web mining is proposed on the basis of a general classification algorithm for data mining, in order to implement personalized information services. A tree-building method that detects class thresholds is used to construct the decision tree according to the concept of user expectation, so as to find classification rules in different layers. Compared with the traditional C4.5 algorithm, the excessive-adaptation (overfitting) problem of C4.5 is mitigated, so that the classification results not only have much higher accuracy but are also statistically meaningful.
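For reference, the gain-ratio criterion underlying the C4.5 baseline can be computed as below. The tiny dataset is invented; gain ratio divides information gain by the split information, penalising attributes with many distinct values.

```python
import math

# C4.5's split criterion: gain ratio = information gain / split information.

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n)
                for c in {l: labels.count(l) for l in set(labels)}.values())

def gain_ratio(rows, labels, attr):
    """rows: list of dicts; attr: the attribute name to evaluate."""
    n = len(rows)
    base = entropy(labels)
    split_ent, split_info = 0.0, 0.0
    for v in {r[attr] for r in rows}:
        sub = [labels[i] for i, r in enumerate(rows) if r[attr] == v]
        split_ent += len(sub) / n * entropy(sub)          # expected entropy
        split_info -= len(sub) / n * math.log2(len(sub) / n)
    return (base - split_ent) / split_info if split_info else 0.0

# Hypothetical web-usage rows: the attribute separates the classes perfectly
rows = [{"browser": "ie"}, {"browser": "ie"},
        {"browser": "ff"}, {"browser": "ff"}]
labels = ["buy", "buy", "skip", "skip"]
print(gain_ratio(rows, labels, "browser"))  # 1.0: a perfect balanced split
```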
Abstract: Maximum frequent pattern generation from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures between items hidden in a large database. By exploiting quantum computing, we propose an efficient quantum search algorithm design to discover the maximum frequent patterns. We modified Grover's search algorithm so that a subspace of arbitrary symmetric states is used instead of the whole search space. We present a novel quantum oracle design that employs a quantum counter to count the maximum frequent items and a quantum comparator to check against a minimum support threshold. The derived algorithm increases the rate of correct solutions, since the search is confined to a subspace. Furthermore, our algorithm significantly scales and optimizes the required number of qubits in the design, which directly benefits performance. Our proposed design can accommodate more transactions and items and still perform well with a small number of qubits.
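The quantum search itself is not reproduced here; the classical brute force below merely defines the target the oracle checks for: the largest itemsets whose support meets the minimum threshold. Data and threshold are illustrative.

```python
from itertools import combinations

# Classical definition of "maximum frequent patterns": scan itemset sizes from
# largest to smallest and return the first size at which any itemset meets the
# minimum support. This exponential search is what the quantum design speeds up.

def maximum_frequent(transactions, min_support):
    items = sorted({i for t in transactions for i in t})
    for k in range(len(items), 0, -1):   # largest itemsets first
        hits = [set(c) for c in combinations(items, k)
                if sum(set(c) <= t for t in transactions) >= min_support]
        if hits:
            return hits
    return []

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}]
print(maximum_frequent(transactions, min_support=2))
```

Here {a, b} is the unique maximum frequent pattern: it appears in all three transactions, while every 3-itemset appears at most once.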
Abstract: Edit distance measures the similarity between two strings (as the minimum number of change, insert, or delete operations that transform one string into the other). An edit sequence s is a sequence of such operations and can be used to represent the string resulting from applying s to a reference string. We present a modification to Ukkonen's edit distance algorithm based upon representing strings by edit sequences. We conclude with a demonstration of how this representation can improve mitochondrial DNA query throughput in a distributed computing environment.
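The two notions the abstract builds on can be sketched together: the standard dynamic-programming edit distance, with a backtrack that recovers an edit sequence. (Ukkonen's refinement, which the paper modifies, restricts the table to a diagonal band; the full table is shown here for clarity.)

```python
# Edit distance by dynamic programming, plus backtracking to recover an edit
# sequence: a list of (op, position[, char]) tuples whose length equals the
# distance. Matches cost nothing and produce no operation.

def edit_distance(a, b):
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                               # delete all of a[:i]
    for j in range(n + 1):
        d[0][j] = j                               # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # delete
                          d[i][j - 1] + 1,        # insert
                          d[i - 1][j - 1] + cost) # change / match
    # backtrack from the corner to read off one optimal edit sequence
    ops, i, j = [], m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            if a[i - 1] != b[j - 1]:
                ops.append(("change", i - 1, b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("delete", i - 1))
            i -= 1
        else:
            ops.append(("insert", j - 1, b[j - 1]))
            j -= 1
    return d[m][n], list(reversed(ops))

dist, seq = edit_distance("kitten", "sitting")
print(dist, seq)  # distance 3; the sequence has exactly 3 operations
```

Representing each database string by such a sequence against a shared reference, rather than by its raw characters, is the compression the paper exploits for mtDNA queries.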
Abstract: In this research article, we analyze multimedia data mining and classification algorithms based on database optimization techniques. High-performance application requirements of various kinds are constantly emerging, which makes parallel computer architectures increasingly valued; however, the corresponding software systems lag far behind the development of the hardware, and this is especially obvious in the field of database technology. Multimedia mining differs from low-level computer multimedia processing technology: the former focuses on extracting patterns from huge multimedia collections, while the latter focuses on understanding or extracting specific features from a single multimedia object. Our research provides a new paradigm for this methodology, which is meaningful and necessary.
Funding: supported via funding from Prince Sattam bin Abdulaziz University, Project Number (PSAU/2024/R/1445)
Abstract: The study aims to recognize how efficiently Educational Data Mining (EDM) integrates into Artificial Intelligence (AI) to develop skills for predicting students' performance. The study used a survey questionnaire and collected data from 300 undergraduate students of Al Neelain University. The first step's initial population placements were created using Particle Swarm Optimization (PSO). Then, using adaptive feature space search, Educational Grey Wolf Optimization (EGWO) was employed to choose the optimal attribute combination. The second stage uses the SVM classifier to forecast classification accuracy. Different classifiers were utilized to evaluate the performance of students. According to the results, AI could forecast the final grades of students with an accuracy rate of 97% on the test dataset. Furthermore, the present study showed that successful students could be selected by the Decision Tree model with an efficiency rate of 87.50% and could be categorized as having equal information gain ratio after the semester, while the Random Forest provided an accuracy of only 28%. These findings indicate a higher accuracy rate when these models were implemented on the dataset, providing significantly more accurate results than a linear regression model (12% accuracy). The study concluded that the methodology used in this study can prove helpful for students and teachers in upgrading academic performance, reducing chances of failure, and taking appropriate steps at the right time to raise the standards of education. The study also motivates academics to assess and explore EDM at several other universities.
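A minimal particle-swarm optimizer of the kind used to seed the initial population above can be sketched as follows. It minimizes a stand-in real-valued objective; in the paper, positions would encode feature subsets rather than real vectors, and the coefficients below are common textbook defaults, not the study's settings.

```python
import random

# Bare-bones PSO: each particle keeps its personal best (pbest) and is pulled
# toward both pbest and the swarm's global best (g) with random strengths.

def pso(objective, dim=2, n_particles=20, iters=200, seed=7):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = pbest[min(range(n_particles), key=lambda i: pbest_val[i])][:]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                # inertia + cognitive pull (pbest) + social pull (gbest)
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (pbest[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (g[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < objective(g):
                    g = pos[i][:]
    return g, objective(g)

# Sphere function: minimum 0 at the origin
best, val = pso(lambda p: sum(x * x for x in p))
print(best, val)
```

For feature selection, as in the study, each dimension is typically squashed through a sigmoid and thresholded to decide whether the corresponding attribute is included.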
Abstract: The Travelling Salesman Problem (TSP) is a classical optimization problem and one of the class of NP problems. The purpose of this work is to apply data mining methodologies to explore the patterns in data generated by an Ant Colony Algorithm (ACA) performing a search operation, and to develop a rule-set searcher that approximates the ACA's searcher. An attribute-oriented induction methodology was used to explore the relationship between an operation's sequence and its attributes, and a set of rules has been developed. The experimental results show that the proposed approach performs well with respect to both solution quality and computation speed.
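The ant-colony search whose traces are mined above can be sketched as a small pheromone-guided tour builder. City coordinates are synthetic, and the parameters follow common defaults (alpha: pheromone weight, beta: heuristic weight, rho: evaporation rate), not values from the paper.

```python
import math
import random

# Minimal ant colony optimisation for TSP: ants build tours probabilistically
# from pheromone and inverse-distance heuristics, then deposit pheromone in
# proportion to tour quality.

def tour_length(tour, dist):
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def aco_tsp(coords, n_ants=10, iters=50, alpha=1.0, beta=2.0, rho=0.5, seed=1):
    rng = random.Random(seed)
    n = len(coords)
    dist = [[math.dist(a, b) for b in coords] for a in coords]
    tau = [[1.0] * n for _ in range(n)]          # pheromone matrix
    best, best_len = None, float("inf")
    for _ in range(iters):
        tours = []
        for _ in range(n_ants):
            tour = [rng.randrange(n)]
            while len(tour) < n:
                cur = tour[-1]
                choices = [j for j in range(n) if j not in tour]
                weights = [tau[cur][j] ** alpha * (1.0 / dist[cur][j]) ** beta
                           for j in choices]
                tour.append(rng.choices(choices, weights)[0])
            tours.append(tour)
        # evaporate, then deposit pheromone proportional to tour quality
        for i in range(n):
            for j in range(n):
                tau[i][j] *= (1 - rho)
        for tour in tours:
            l = tour_length(tour, dist)
            if l < best_len:
                best, best_len = tour, l
            for i in range(n):
                a, b = tour[i], tour[(i + 1) % n]
                tau[a][b] += 1.0 / l
                tau[b][a] += 1.0 / l
    return best, best_len

coords = [(0, 0), (0, 1), (1, 1), (1, 0), (2, 0), (2, 1)]
best, best_len = aco_tsp(coords)
print(best, round(best_len, 2))
```

Logging each ant's choices (current city, pheromone level, heuristic value, city chosen) yields exactly the kind of operation-sequence data the attribute-oriented induction step mines into rules.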
Funding: Project (BK20130174) supported by the Basic Research Project of Jiangsu Province (Natural Science Foundation); Project (1101109C) supported by Jiangsu Planned Projects for Postdoctoral Research Funds, China; Project (201325) supported by the Key Laboratory of Geo-informatics of State Bureau of Surveying and Mapping, China; Project (SZBF2011-6-B35) supported by the Priority Academic Program Development of Jiangsu Higher Education Institutions, China
Abstract: Due to the difficulties in obtaining large-deformation mining subsidence using differential Interferometric Synthetic Aperture Radar (D-InSAR) alone, a new algorithm was proposed to extract large-deformation mining subsidence using the D-InSAR technique and the probability integral method. The details of the algorithm are as follows: the control point set, containing correctly phase-unwrapped points on the subsidence basin edge generated by D-InSAR and several observation points (near the maximum subsidence and inflection points), was established first; a genetic algorithm (GA) was then used to optimize the parameters of the probability integral method; finally, the surface subsidence was deduced according to the optimum parameters. The results of the experiment in the Huaibei mining area, China, show that the presented method can generate the correct mining subsidence basin with a few surface observations, and the relative error of the maximum subsidence point is about 8.3%, which is much better than that of conventional D-InSAR (relative error 68.0%).
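The subsidence profile that the GA fits can be written down directly. For a semi-infinite extraction, the probability integral method commonly gives the surface subsidence at horizontal distance x from the inflection point as W(x) = (W0/2)(erf(sqrt(pi) x / r) + 1), with W0 = m q cos(alpha) the maximum subsidence (m: mining height, q: subsidence factor, alpha: seam dip) and r the major influence radius. The parameter values below are purely illustrative.

```python
import math

# Probability-integral subsidence profile for a semi-infinite extraction.
# The GA in the method above searches for the parameters (q, r, ...) that
# make this profile fit the D-InSAR edge points and the few field observations.

def subsidence(x, m, q, alpha_rad, r):
    """W(x) = (W0/2) * (erf(sqrt(pi) * x / r) + 1), W0 = m*q*cos(alpha)."""
    w0 = m * q * math.cos(alpha_rad)
    return 0.5 * w0 * (math.erf(math.sqrt(math.pi) * x / r) + 1.0)

m, q, alpha, r = 2.0, 0.8, 0.0, 100.0   # illustrative parameter values
w0 = m * q * math.cos(alpha)            # maximum subsidence = 1.6 m
print(subsidence(0, m, q, alpha, r) / w0)           # 0.5 at the inflection point
print(round(subsidence(5 * r, m, q, alpha, r), 3))  # 1.6: approaches W0
```

The inflection point carrying exactly half the maximum subsidence is why the method's observation points are concentrated near the maximum-subsidence and inflection locations.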