Distributed Data Mining is expected to discover preciously unknown, implicit and valuable information from massive data set inherently distributed over a network. In recent years several approaches to distributed data...Distributed Data Mining is expected to discover preciously unknown, implicit and valuable information from massive data set inherently distributed over a network. In recent years several approaches to distributed data mining have been developed, but only a few of them make use of intelligent agents. This paper provides the reason for applying Multi-Agent Technology in Distributed Data Mining and presents a Distributed Data Mining System based on Multi-Agent Technology that deals with heterogeneity in such environment. Based on the advantages of both the CS model and agent-based model, the system is being able to address the specific concern of increasing scalability and enhancing performance.展开更多
The authors designed the spatial data mining system for ore-forming prediction based on the theory and methods of data mining as well as the technique of spatial database,in combination with the characteristics of geo...The authors designed the spatial data mining system for ore-forming prediction based on the theory and methods of data mining as well as the technique of spatial database,in combination with the characteristics of geological information data.The system consists of data management,data mining and knowledge discovery,knowledge representation.It can syncretize multi-source geosciences data effectively,such as geology,geochemistry,geophysics,RS.The system digitized geological information data as data layer files which consist of the two numerical values,to store these files in the system database.According to the combination of the characters of geological information,metallogenic prognosis was realized,as an example from some area in Heilongjiang Province.The prospect area of hydrothermal copper deposit was determined.展开更多
The existing data mining methods are mostly focused on relational databases and structured data, but not on complex structured data (like in extensible markup language(XML)). By converting XML document type descriptio...The existing data mining methods are mostly focused on relational databases and structured data, but not on complex structured data (like in extensible markup language(XML)). By converting XML document type description to the relational semantic recording XML data relations, and using an XML data mining language, the XML data mining system presents a strategy to mine information on XML.展开更多
Background:Erzhu Erchen decoction(EZECD),which is based on Erchen decoction and enhanced with Atractylodes lancea and Atractylodes macrocephala,is widely used for the treatment of dampness and heat(The clinical manife...Background:Erzhu Erchen decoction(EZECD),which is based on Erchen decoction and enhanced with Atractylodes lancea and Atractylodes macrocephala,is widely used for the treatment of dampness and heat(The clinical manifestations of Western medicine include thirst,inability to drink more,diarrhea,yellow urine,red tongue,et al.)internalized disease.Nevertheless,the mechanism of EZECD on damp-heat internalized Type 2 diabetes(T2D)remains unknown.We employed data mining,pharmacology databases and experimental verification to study how EZECD treats damp-heat internalized T2D.Methods:The main compounds or genes of EZECD and damp-heat internalized T2D were obtained from the pharmacology databases.Succeeding,the overlapped targets of EZECD and damp-heat internalized T2D were performed by the Gene Ontology,kyoto encyclopedia of genes and genomes analysis.And the compound-disease targets-pathway network were constructed to obtain the hub compound.Moreover,the hub genes and core related pathways were mined with weighted gene co-expression network analysis based on Gene Expression Omnibus database,the capability of hub compound and genes was valid in AutoDock 1.5.7.Furthermore,and violin plot and gene set enrichment analysis were performed to explore the role of hub genes in damp-heat internalized T2D.Finally,the interactions of hub compound and genes were explored using Comparative Toxicogenomics Database and quantitative polymerase chain reaction.Results:First,herb-compounds-genes-disease network illustrated that the hub compound of EZECD for damp-heat internalized T2D could be quercetin.Consistently,the hub genes were CASP8,CCL2,and AHR according to weighted gene co-expression network analysis.Molecular docking showed that quercetin could bind with the hub genes.Further,gene set enrichment analysis and Gene Ontology represented that CASP8,or CCL2,is negatively involved in insulin secretion response to the TNF or lipopolysaccharide process,and AHR or CCL2 positively regulated lipid and atherosclerosis,and/or including NOD-like receptor signaling pathway,and TNF signaling pathway.Ultimately,the quantitative polymerase chain reaction and western blotting analysis showed that quercetin could down-regulated the mRNA and protein experssion of CASP8,CCL2,and AHR.It was consistent with the results in Comparative Toxicogenomics Database databases.Conclusion:These results demonstrated quercetin could inhibit the expression of CASP8,CCL2,AHR in damp-heat internalized T2D,which improves insulin secretion and inhibits lipid and atherosclerosis,as well as/or including NOD-like receptor signaling pathway,and TNF signaling pathway,suggesting that EZECD may be more effective to treat damp-heat internalized T2D.展开更多
The teaching quality evaluation system based on data mining technology can accurately and fairly identify the core driving factors to improve teaching quality.This method adopts the analysis of big data correlation ru...The teaching quality evaluation system based on data mining technology can accurately and fairly identify the core driving factors to improve teaching quality.This method adopts the analysis of big data correlation rules,including data collection and processing preparation steps,builds the data warehouse of association rules,and then generates an educational quality evaluation framework using the principle of data mining.Based on this,this paper analyzes the construction design and method of the teaching evaluation system under data mining,hoping to provide help for the improvement of the teaching evaluation system and the improvement of teaching quality.展开更多
Bioinformatic analysis of large and complex omics datasets has become increasingly useful in modern day biology by providing a great depth of information,with its application to neuroscience termed neuroinformatics.Da...Bioinformatic analysis of large and complex omics datasets has become increasingly useful in modern day biology by providing a great depth of information,with its application to neuroscience termed neuroinformatics.Data mining of omics datasets has enabled the generation of new hypotheses based on differentially regulated biological molecules associated with disease mechanisms,which can be tested experimentally for improved diagnostic and therapeutic targeting of neurodegenerative diseases.Importantly,integrating multi-omics data using a systems bioinformatics approach will advance the understanding of the layered and interactive network of biological regulation that exchanges systemic knowledge to facilitate the development of a comprehensive human brain profile.In this review,we first summarize data mining studies utilizing datasets from the individual type of omics analysis,including epigenetics/epigenomics,transcriptomics,proteomics,metabolomics,lipidomics,and spatial omics,pertaining to Alzheimer's disease,Parkinson's disease,and multiple sclerosis.We then discuss multi-omics integration approaches,including independent biological integration and unsupervised integration methods,for more intuitive and informative interpretation of the biological data obtained across different omics layers.We further assess studies that integrate multi-omics in data mining which provide convoluted biological insights and offer proof-of-concept proposition towards systems bioinformatics in the reconstruction of brain networks.Finally,we recommend a combination of high dimensional bioinformatics analysis with experimental validation to achieve translational neuroscience applications including biomarker discovery,therapeutic development,and elucidation of disease mechanisms.We conclude by providing future perspectives and opportunities in applying integrative multi-omics and systems bioinformatics to achieve precision phenotyping of neurodegenerative diseases and towards personalized medicine.展开更多
In today’s highly competitive retail industry,offline stores face increasing pressure on profitability.They hope to improve their ability in shelf management with the help of big data technology.For this,on-shelf ava...In today’s highly competitive retail industry,offline stores face increasing pressure on profitability.They hope to improve their ability in shelf management with the help of big data technology.For this,on-shelf availability is an essential indicator of shelf data management and closely relates to customer purchase behavior.RFM(recency,frequency,andmonetary)patternmining is a powerful tool to evaluate the value of customer behavior.However,the existing RFM patternmining algorithms do not consider the quarterly nature of goods,resulting in unreasonable shelf availability and difficulty in profit-making.To solve this problem,we propose a quarterly RFM mining algorithmfor On-shelf products named OS-RFM.Our algorithmmines the high recency,high frequency,and high monetary patterns and considers the period of the on-shelf goods in quarterly units.We conducted experiments using two real datasets for numerical and graphical analysis to prove the algorithm’s effectiveness.Compared with the state-of-the-art RFM mining algorithm,our algorithm can identify more patterns and performs well in terms of precision,recall,and F1-score,with the recall rate nearing 100%.Also,the novel algorithm operates with significantly shorter running times and more stable memory usage than existing mining algorithms.Additionally,we analyze the sales trends of products in different quarters and seasonal variations.The analysis assists businesses in maintaining reasonable on-shelf availability and achieving greater profitability.展开更多
[Objectives]To explore the trend of brands towards the design of waist protection products through data mining,and to provide reference for the design concept of the contour of waist protection pillow.[Methods]The str...[Objectives]To explore the trend of brands towards the design of waist protection products through data mining,and to provide reference for the design concept of the contour of waist protection pillow.[Methods]The structural design information of all waist protection equipment was collected from the national Internet platform,and the data were classified and a database was established.IBM SPSS 26.0 and MATLAB 2018a were used to analyze the data and tabulate them in Tableau 2022.4.After the association rules were clarified,the data were imported into Cinema 4D R21 to create the concept contour of waist protection pillow.[Results]The average and standard deviation of the single airbag design were the highest in all groups,with an average of 0.511 and a standard deviation of 0.502.The average and standard deviation of the upper and lower dual airbags were the lowest in all groups,with an average of 0.015 and a standard deviation of 0.120;the correlation coefficient between single airbag and 120°arc stretching was 0.325,which was positively correlated with each other(P<0.01);the correlation coefficient between multiple airbags and 360°encircling fitting was 0.501,which was positively correlated with each other and had the highest correlation degree(P<0.01).[Conclusions]The single airbag design is well recognized by companies,and has received the highest attention among all brand products.While focusing on single airbag design,most brands will consider the need to add 120°arc stretching elements in product design.At the time of focusing on multiple airbag design,some brands believe that 360°encircling fitting elements need to be added to the product,and the correlation between the two is the highest among all groups.展开更多
Background:Diabetic retinopathy(DR)is currently the leading cause of blindness in elderly individuals with diabetes.Traditional Chinese medicine(TCM)prescriptions have shown remarkable effectiveness for treating DR.Th...Background:Diabetic retinopathy(DR)is currently the leading cause of blindness in elderly individuals with diabetes.Traditional Chinese medicine(TCM)prescriptions have shown remarkable effectiveness for treating DR.This study aimed to screen a novel TCM prescription against DR from patents and elucidate its medication rule and molecular mechanism using data mining,network pharmacology,molecular docking and molecular dynamics(MD)simulation.Method:TCM prescriptions for treating DR was collected from patents and a novel TCM prescription was identified using data mining.Subsequently,the mechanism of the novel TCM prescription against DR was explored by constructing a network of core TCMs-core active ingredients-core targets-core pathways.Finally,molecular docking and MD simulation were employed to validate the findings from network pharmacology.Result:The TCMs of the collected prescriptions primarily possessed bitter and cold properties with heat-clearing and supplementing effects,attributed to the liver,lung and kidney channels.Notably,a novel TCM prescription for treating DR was identified,composed of Lycii Fructus,Chrysanthemi Flos,Astragali Radix and Angelicae Sinensis Radix.Twenty core active ingredients and ten core targets of the novel TCM prescription for treating DR were screened.Moreover,the novel TCM prescription played a crucial role for treating DR by inhibiting inflammatory response,oxidative stress,retinal pigment epithelium cell apoptosis and retinal neovascularization through various pathways,such as the AGE-RAGE signaling pathway in diabetic complications and the MAPK signaling pathway.Finally,molecular docking and MD simulation demonstrated that almost all core active ingredients exhibited satisfactory binding energies to core targets.Conclusions:This study identified a novel TCM prescription and unveiled its multi-component,multi-target and multi-pathway characteristics for treating DR.These findings provide a scientific basis and novel insights into the development of drugs for DR prevention and treatment.展开更多
The study aims to recognize how efficiently Educational DataMining(EDM)integrates into Artificial Intelligence(AI)to develop skills for predicting students’performance.The study used a survey questionnaire and collec...The study aims to recognize how efficiently Educational DataMining(EDM)integrates into Artificial Intelligence(AI)to develop skills for predicting students’performance.The study used a survey questionnaire and collected data from 300 undergraduate students of Al Neelain University.The first step’s initial population placements were created using Particle Swarm Optimization(PSO).Then,using adaptive feature space search,Educational Grey Wolf Optimization(EGWO)was employed to choose the optimal attribute combination.The second stage uses the SVMclassifier to forecast classification accuracy.Different classifiers were utilized to evaluate the performance of students.According to the results,it was revealed that AI could forecast the final grades of students with an accuracy rate of 97%on the test dataset.Furthermore,the present study showed that successful students could be selected by the Decision Tree model with an efficiency rate of 87.50%and could be categorized as having equal information ratio gain after the semester.While the random forest provided an accuracy of 28%.These findings indicate the higher accuracy rate in the results when these models were implemented on the data set which provides significantly accurate results as compared to a linear regression model with accuracy(12%).The study concluded that the methodology used in this study can prove to be helpful for students and teachers in upgrading academic performance,reducing chances of failure,and taking appropriate steps at the right time to raise the standards of education.The study also motivates academics to assess and discover EDM at several other universities.展开更多
The aviation industry is a sector that is developing, changing and growing every day in terms of technological and legal framework. There are generally three factors that enable airlines to hold on to the market. Thes...The aviation industry is a sector that is developing, changing and growing every day in terms of technological and legal framework. There are generally three factors that enable airlines to hold on to the market. These factors are safety, service quality and price. Airline companies can analyze the customers in the market with a focus on price and quality and develop a business model according to their expectations. For example, business class and economy class passenger expectations are different from each other, so the service and price to be offered to them will be different. However, all customers have one common expectation and that is safety. No matter how high quality the service is or how cheap the price is, no one wants to fly with an airline or plane that is not safe. From an airline company’s point of view, an accident or breakdown of one of the company’s aircraft can cause irreparable image loss and financial damage. If we look at past examples, we see that there are many airline companies or maintenance organizations that could not recover after an accident and went bankrupt. Safety is an indispensable factor. Therefore, there is a unit in the sector called the safety management system (SMS), which collects data by taking a proactive and reactive approach. The way and purpose of the safety management system is to take a proactive approach to recognize and prevent unsafe situations before they cause accidents or breakdowns, or to take a reactive approach to find the causes of accidents and breakdowns that have occurred as a result of certain factors and to take the necessary measures to prevent the same situations from happening again in the sector. The field of data mining, which is necessary to predict the future behavior of customers in the field of marketing, is an area that marketing also values. In this study, data mining studies to ensure safety in the aviation industry and the security of customer information in marketing will be emphasized, firstly, the concept and importance of data mining will be mentioned.展开更多
In light of the rapid growth and development of social media, it has become the focus of interest in many different scientific fields. They seek to extract useful information from it, and this is called (knowledge), s...In light of the rapid growth and development of social media, it has become the focus of interest in many different scientific fields. They seek to extract useful information from it, and this is called (knowledge), such as extracting information related to people’s behaviors and interactions to analyze feelings or understand the behavior of users or groups, and many others. This extracted knowledge has a very important role in decision-making, creating and improving marketing objectives and competitive advantage, monitoring events, whether political or economic, and development in all fields. Therefore, to extract this knowledge, we need to analyze the vast amount of data found within social media using the most popular data mining techniques and applications related to social media sites.展开更多
Background:Using network pharmacology to explore the potential molecular mechanism of traditional Chinese medicine in treating polycystic ovary syndrome(PCOS)with kidney deficiency and blood stasis syndrome.Method:Col...Background:Using network pharmacology to explore the potential molecular mechanism of traditional Chinese medicine in treating polycystic ovary syndrome(PCOS)with kidney deficiency and blood stasis syndrome.Method:Collect the related literature materials of PCOS with kidney deficiency and blood stasis syndrome treated by traditional Chinese medicine in four databases in recent ten years,extract the information of prescriptions and complete the frequency analysis.Traditional Chinese Medicine Systems Pharmacology Database was used to screen out the effective components.Use Online Mendelian Inheritance in Man and other databases to screen PCOS disease targets.The intersection targets obtained by clustering prescription and PCOS disease targets were submitted to STRING database for protein-protein interaction network analysis,and Gene Ontology(GO)and Kyoto Encyclopedia of Genes and Genomes pathways were analysed by Metascape.Result:There are 155 kinds of traditional Chinese medicines used in the literature.The most commonly utilized ones are Cuscutae Semen,Angelicae Sinensis Radix,and Rehmanniae Radix Praeparata.The results of the cluster analysis indicated that the plants most commonly found throughout the prescription were Leonuri Herba,Lycopi Herba,Dipsaci Radix,etc.GO results show that biological processes include cell reaction to organic nitrogen compounds and cell reaction to nitrogen compounds.The functional display of GO molecule includes cytokine receptor binding,signal receptor regulator activity and so on.Kyoto Encyclopedia of Genes and Genomes results show that the possible mechanisms of action are cancer pathway,an endocrine resistance signal pathway.Conclusion:Through data mining,the cluster prescription for PCOS with kidney deficiency and blood stasis syndrome is Leonuri Herba,Lycopi Herba,Dipsaci Radix,etc.The network pharmacology research of cluster prescription shows that the main drug components for treating PCOS with kidney deficiency and blood stasis syndrome are quercetin,kaempferol,luteolin,tanshinone IIA,etc.,which act on PTGS2,NCOA2,and other targets,and treat PCOS with kidney deficiency and blood stasis syndrome through cancer and endocrine resistance.展开更多
Traditional distribution network planning relies on the professional knowledge of planners,especially when analyzing the correlations between the problems existing in the network and the crucial influencing factors.Th...Traditional distribution network planning relies on the professional knowledge of planners,especially when analyzing the correlations between the problems existing in the network and the crucial influencing factors.The inherent laws reflected by the historical data of the distribution network are ignored,which affects the objectivity of the planning scheme.In this study,to improve the efficiency and accuracy of distribution network planning,the characteristics of distribution network data were extracted using a data-mining technique,and correlation knowledge of existing problems in the network was obtained.A data-mining model based on correlation rules was established.The inputs of the model were the electrical characteristic indices screened using the gray correlation method.The Apriori algorithm was used to extract correlation knowledge from the operational data of the distribution network and obtain strong correlation rules.Degree of promotion and chi-square tests were used to verify the rationality of the strong correlation rules of the model output.In this study,the correlation relationship between heavy load or overload problems of distribution network feeders in different regions and related characteristic indices was determined,and the confidence of the correlation rules was obtained.These results can provide an effective basis for the formulation of a distribution network planning scheme.展开更多
Data mining process involves a number of steps fromdata collection to visualization to identify useful data from massive data set.the same time,the recent advances of machine learning(ML)and deep learning(DL)models ca...Data mining process involves a number of steps fromdata collection to visualization to identify useful data from massive data set.the same time,the recent advances of machine learning(ML)and deep learning(DL)models can be utilized for effectual rainfall prediction.With this motivation,this article develops a novel comprehensive oppositionalmoth flame optimization with deep learning for rainfall prediction(COMFO-DLRP)Technique.The proposed CMFO-DLRP model mainly intends to predict the rainfall and thereby determine the environmental changes.Primarily,data pre-processing and correlation matrix(CM)based feature selection processes are carried out.In addition,deep belief network(DBN)model is applied for the effective prediction of rainfall data.Moreover,COMFO algorithm was derived by integrating the concepts of comprehensive oppositional based learning(COBL)with traditional MFO algorithm.Finally,the COMFO algorithm is employed for the optimal hyperparameter selection of the DBN model.For demonstrating the improved outcomes of the COMFO-DLRP approach,a sequence of simulations were carried out and the outcomes are assessed under distinct measures.The simulation outcome highlighted the enhanced outcomes of the COMFO-DLRP method on the other techniques.展开更多
Prediction and diagnosis of cardiovascular diseases(CVDs)based,among other things,on medical examinations and patient symptoms are the biggest challenges in medicine.About 17.9 million people die from CVDs annually,ac...Prediction and diagnosis of cardiovascular diseases(CVDs)based,among other things,on medical examinations and patient symptoms are the biggest challenges in medicine.About 17.9 million people die from CVDs annually,accounting for 31%of all deaths worldwide.With a timely prognosis and thorough consideration of the patient’s medical history and lifestyle,it is possible to predict CVDs and take preventive measures to eliminate or control this life-threatening disease.In this study,we used various patient datasets from a major hospital in the United States as prognostic factors for CVD.The data was obtained by monitoring a total of 918 patients whose criteria for adults were 28-77 years old.In this study,we present a data mining modeling approach to analyze the performance,classification accuracy and number of clusters on Cardiovascular Disease Prognostic datasets in unsupervised machine learning(ML)using the Orange data mining software.Various techniques are then used to classify the model parameters,such as k-nearest neighbors,support vector machine,random forest,artificial neural network(ANN),naïve bayes,logistic regression,stochastic gradient descent(SGD),and AdaBoost.To determine the number of clusters,various unsupervised ML clustering methods were used,such as k-means,hierarchical,and density-based spatial clustering of applications with noise clustering.The results showed that the best model performance analysis and classification accuracy were SGD and ANN,both of which had a high score of 0.900 on Cardiovascular Disease Prognostic datasets.Based on the results of most clustering methods,such as k-means and hierarchical clustering,Cardiovascular Disease Prognostic datasets can be divided into two clusters.The prognostic accuracy of CVD depends on the accuracy of the proposed model in determining the diagnostic model.The more accurate the model,the better it can predict which patients are at risk for CVD.展开更多
Imagine numerous clients,each with personal data;individual inputs are severely corrupt,and a server only concerns the collective,statistically essential facets of this data.In several data mining methods,privacy has ...Imagine numerous clients,each with personal data;individual inputs are severely corrupt,and a server only concerns the collective,statistically essential facets of this data.In several data mining methods,privacy has become highly critical.As a result,various privacy-preserving data analysis technologies have emerged.Hence,we use the randomization process to reconstruct composite data attributes accurately.Also,we use privacy measures to estimate how much deception is required to guarantee privacy.There are several viable privacy protections;however,determining which one is the best is still a work in progress.This paper discusses the difficulty of measuring privacy while also offering numerous random sampling procedures and statistical and categorized data results.Further-more,this paper investigates the use of arbitrary nature with perturbations in privacy preservation.According to the research,arbitrary objects(most notably random matrices)have"predicted"frequency patterns.It shows how to recover crucial information from a sample damaged by a random number using an arbi-trary lattice spectral selection strategy.Thisfiltration system's conceptual frame-work posits,and extensive practicalfindings indicate that sparse data distortions preserve relatively modest privacy protection in various situations.As a result,the research framework is efficient and effective in maintaining data privacy and security.展开更多
It is crucial,while using healthcare data,to assess the advantages of data privacy against the possible drawbacks.Data from several sources must be combined for use in many data mining applications.The medical practit...It is crucial,while using healthcare data,to assess the advantages of data privacy against the possible drawbacks.Data from several sources must be combined for use in many data mining applications.The medical practitioner may use the results of association rule mining performed on this aggregated data to better personalize patient care and implement preventive measures.Historically,numerous heuristics(e.g.,greedy search)and metaheuristics-based techniques(e.g.,evolutionary algorithm)have been created for the positive association rule in privacy preserving data mining(PPDM).When it comes to connecting seemingly unrelated diseases and drugs,negative association rules may be more informative than their positive counterparts.It is well-known that during negative association rules mining,a large number of uninteresting rules are formed,making this a difficult problem to tackle.In this research,we offer an adaptive method for negative association rule mining in vertically partitioned healthcare datasets that respects users’privacy.The applied approach dynamically determines the transactions to be interrupted for information hiding,as opposed to predefining them.This study introduces a novel method for addressing the problem of negative association rules in healthcare data mining,one that is based on the Tabu-genetic optimization paradigm.Tabu search is advantageous since it removes a huge number of unnecessary rules and item sets.Experiments using benchmark healthcare datasets prove that the discussed scheme outperforms state-of-the-art solutions in terms of decreasing side effects and data distortions,as measured by the indicator of hiding failure.展开更多
Complex repairable system is composed of thousands of components.Some maintenance management and decision problems in maintenance management and decision need to classify a set of components into several classes based...Complex repairable system is composed of thousands of components.Some maintenance management and decision problems in maintenance management and decision need to classify a set of components into several classes based on data mining.Furthermore,with the complexity of industrial equipment increasing,the managers should pay more attention to the key components and carry out the lean management is very important.Therefore,the idea"customer segmentation"of"precise marketing"can be used in the maintenance management of the multi-component system.Following the idea of segmentation,the components of multicomponent systems should be subdivied into groups based on specific attributes relevant to maintenance,such as maintenance cost,mean time between failures,and failure frequency.For the target specific groups of parts,the optimal maintenance policy,health assessment and maintenance scheduling can be determined.The proposed analysis framework will be given out.In order to illustrate the effectiveness of this method,a numerical example is given out.展开更多
Protecting the privacy of data in the multi-cloud is a crucial task.Data mining is a technique that protects the privacy of individual data while mining those data.The most significant task entails obtaining data from...Protecting the privacy of data in the multi-cloud is a crucial task.Data mining is a technique that protects the privacy of individual data while mining those data.The most significant task entails obtaining data from numerous remote databases.Mining algorithms can obtain sensitive information once the data is in the data warehouse.Many traditional algorithms/techniques promise to provide safe data transfer,storing,and retrieving over the cloud platform.These strategies are primarily concerned with protecting the privacy of user data.This study aims to present data mining with privacy protection(DMPP)using precise elliptic curve cryptography(PECC),which builds upon that algebraic elliptic curve infinitefields.This approach enables safe data exchange by utilizing a reliable data consolidation approach entirely reliant on rewritable data concealing techniques.Also,it outperforms data mining in terms of solid privacy procedures while maintaining the quality of the data.Average approximation error,computational cost,anonymizing time,and data loss are considered performance measures.The suggested approach is practical and applicable in real-world situations according to the experimentalfindings.展开更多
文摘Distributed Data Mining is expected to discover preciously unknown, implicit and valuable information from massive data set inherently distributed over a network. In recent years several approaches to distributed data mining have been developed, but only a few of them make use of intelligent agents. This paper provides the reason for applying Multi-Agent Technology in Distributed Data Mining and presents a Distributed Data Mining System based on Multi-Agent Technology that deals with heterogeneity in such environment. Based on the advantages of both the CS model and agent-based model, the system is being able to address the specific concern of increasing scalability and enhancing performance.
文摘The authors designed the spatial data mining system for ore-forming prediction based on the theory and methods of data mining as well as the technique of spatial database,in combination with the characteristics of geological information data.The system consists of data management,data mining and knowledge discovery,knowledge representation.It can syncretize multi-source geosciences data effectively,such as geology,geochemistry,geophysics,RS.The system digitized geological information data as data layer files which consist of the two numerical values,to store these files in the system database.According to the combination of the characters of geological information,metallogenic prognosis was realized,as an example from some area in Heilongjiang Province.The prospect area of hydrothermal copper deposit was determined.
文摘The existing data mining methods are mostly focused on relational databases and structured data, but not on complex structured data (like in extensible markup language(XML)). By converting XML document type description to the relational semantic recording XML data relations, and using an XML data mining language, the XML data mining system presents a strategy to mine information on XML.
基金supported by a grant from Hubei Key Laboratory of Diabetes and Angiopathy Program of Hubei University of Science and Technology(2020XZ10)Project of Education Commission of Hubei Province(B2022192).
文摘Background:Erzhu Erchen decoction(EZECD),which is based on Erchen decoction and enhanced with Atractylodes lancea and Atractylodes macrocephala,is widely used for the treatment of dampness and heat(The clinical manifestations of Western medicine include thirst,inability to drink more,diarrhea,yellow urine,red tongue,et al.)internalized disease.Nevertheless,the mechanism of EZECD on damp-heat internalized Type 2 diabetes(T2D)remains unknown.We employed data mining,pharmacology databases and experimental verification to study how EZECD treats damp-heat internalized T2D.Methods:The main compounds or genes of EZECD and damp-heat internalized T2D were obtained from the pharmacology databases.Succeeding,the overlapped targets of EZECD and damp-heat internalized T2D were performed by the Gene Ontology,kyoto encyclopedia of genes and genomes analysis.And the compound-disease targets-pathway network were constructed to obtain the hub compound.Moreover,the hub genes and core related pathways were mined with weighted gene co-expression network analysis based on Gene Expression Omnibus database,the capability of hub compound and genes was valid in AutoDock 1.5.7.Furthermore,and violin plot and gene set enrichment analysis were performed to explore the role of hub genes in damp-heat internalized T2D.Finally,the interactions of hub compound and genes were explored using Comparative Toxicogenomics Database and quantitative polymerase chain reaction.Results:First,herb-compounds-genes-disease network illustrated that the hub compound of EZECD for damp-heat internalized T2D could be quercetin.Consistently,the hub genes were CASP8,CCL2,and AHR according to weighted gene co-expression network analysis.Molecular docking showed that quercetin could bind with the hub genes.Further,gene set enrichment analysis and Gene Ontology represented that CASP8,or CCL2,is negatively involved in insulin secretion response to the TNF or lipopolysaccharide process,and AHR or CCL2 positively regulated lipid and atherosclerosis,and/or including NOD-like receptor signaling pathway,and TNF signaling pathway.Ultimately,the quantitative polymerase chain reaction and western blotting analysis showed that quercetin could down-regulated the mRNA and protein experssion of CASP8,CCL2,and AHR.It was consistent with the results in Comparative Toxicogenomics Database databases.Conclusion:These results demonstrated quercetin could inhibit the expression of CASP8,CCL2,AHR in damp-heat internalized T2D,which improves insulin secretion and inhibits lipid and atherosclerosis,as well as/or including NOD-like receptor signaling pathway,and TNF signaling pathway,suggesting that EZECD may be more effective to treat damp-heat internalized T2D.
文摘The teaching quality evaluation system based on data mining technology can accurately and fairly identify the core driving factors to improve teaching quality.This method adopts the analysis of big data correlation rules,including data collection and processing preparation steps,builds the data warehouse of association rules,and then generates an educational quality evaluation framework using the principle of data mining.Based on this,this paper analyzes the construction design and method of the teaching evaluation system under data mining,hoping to provide help for the improvement of the teaching evaluation system and the improvement of teaching quality.
基金supported by a Lee Kong Chian School of Medicine Dean’s Postdoctoral Fellowship(021207-00001)from Nanyang Technological University(NTU)Singapore and a Mistletoe Research Fellowship(022522-00001)from the Momental Foundation USA.Jialiu Zeng is supported by a Presidential Postdoctoral Fellowship(021229-00001)from NTU Singapore and an Open Fund Young Investigator Research Grant(OF-YIRG)(MOH-001147)from the National Medical Research Council(NMRC)SingaporeSu Bin Lim is supported by the National Research Foundation(NRF)of Korea(Grant Nos.:2020R1A6A1A03043539,2020M3A9D8037604,2022R1C1C1004756)a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute(KHIDI),funded by the Ministry of Health&Welfare,Republic of Korea(Grant No.:HR22C1734).
文摘Bioinformatic analysis of large and complex omics datasets has become increasingly useful in modern day biology by providing a great depth of information,with its application to neuroscience termed neuroinformatics.Data mining of omics datasets has enabled the generation of new hypotheses based on differentially regulated biological molecules associated with disease mechanisms,which can be tested experimentally for improved diagnostic and therapeutic targeting of neurodegenerative diseases.Importantly,integrating multi-omics data using a systems bioinformatics approach will advance the understanding of the layered and interactive network of biological regulation that exchanges systemic knowledge to facilitate the development of a comprehensive human brain profile.In this review,we first summarize data mining studies utilizing datasets from the individual type of omics analysis,including epigenetics/epigenomics,transcriptomics,proteomics,metabolomics,lipidomics,and spatial omics,pertaining to Alzheimer's disease,Parkinson's disease,and multiple sclerosis.We then discuss multi-omics integration approaches,including independent biological integration and unsupervised integration methods,for more intuitive and informative interpretation of the biological data obtained across different omics layers.We further assess studies that integrate multi-omics in data mining which provide convoluted biological insights and offer proof-of-concept proposition towards systems bioinformatics in the reconstruction of brain networks.Finally,we recommend a combination of high dimensional bioinformatics analysis with experimental validation to achieve translational neuroscience applications including biomarker discovery,therapeutic development,and elucidation of disease mechanisms.We conclude by providing future perspectives and opportunities in applying integrative multi-omics and systems bioinformatics to achieve precision phenotyping of neurodegenerative diseases and towards personalized medicine.
基金partially supported by the Foundation of State Key Laboratory of Public Big Data(No.PBD2022-01).
文摘In today’s highly competitive retail industry,offline stores face increasing pressure on profitability.They hope to improve their ability in shelf management with the help of big data technology.For this,on-shelf availability is an essential indicator of shelf data management and closely relates to customer purchase behavior.RFM(recency,frequency,andmonetary)patternmining is a powerful tool to evaluate the value of customer behavior.However,the existing RFM patternmining algorithms do not consider the quarterly nature of goods,resulting in unreasonable shelf availability and difficulty in profit-making.To solve this problem,we propose a quarterly RFM mining algorithmfor On-shelf products named OS-RFM.Our algorithmmines the high recency,high frequency,and high monetary patterns and considers the period of the on-shelf goods in quarterly units.We conducted experiments using two real datasets for numerical and graphical analysis to prove the algorithm’s effectiveness.Compared with the state-of-the-art RFM mining algorithm,our algorithm can identify more patterns and performs well in terms of precision,recall,and F1-score,with the recall rate nearing 100%.Also,the novel algorithm operates with significantly shorter running times and more stable memory usage than existing mining algorithms.Additionally,we analyze the sales trends of products in different quarters and seasonal variations.The analysis assists businesses in maintaining reasonable on-shelf availability and achieving greater profitability.
基金Supported by Municipal Public Welfare Science and Technology Project of Zhoushan Science and Technology Bureau,Zhejiang Province(2021C31064).
文摘[Objectives]To explore the trend of brands towards the design of waist protection products through data mining,and to provide reference for the design concept of the contour of waist protection pillow.[Methods]The structural design information of all waist protection equipment was collected from the national Internet platform,and the data were classified and a database was established.IBM SPSS 26.0 and MATLAB 2018a were used to analyze the data and tabulate them in Tableau 2022.4.After the association rules were clarified,the data were imported into Cinema 4D R21 to create the concept contour of waist protection pillow.[Results]The average and standard deviation of the single airbag design were the highest in all groups,with an average of 0.511 and a standard deviation of 0.502.The average and standard deviation of the upper and lower dual airbags were the lowest in all groups,with an average of 0.015 and a standard deviation of 0.120;the correlation coefficient between single airbag and 120°arc stretching was 0.325,which was positively correlated with each other(P<0.01);the correlation coefficient between multiple airbags and 360°encircling fitting was 0.501,which was positively correlated with each other and had the highest correlation degree(P<0.01).[Conclusions]The single airbag design is well recognized by companies,and has received the highest attention among all brand products.While focusing on single airbag design,most brands will consider the need to add 120°arc stretching elements in product design.At the time of focusing on multiple airbag design,some brands believe that 360°encircling fitting elements need to be added to the product,and the correlation between the two is the highest among all groups.
基金supported by the National Natural Science Foundation of China(Grant No.82104701)Science Fund Program for Outstanding Young Scholars in Universities of Anhui Province(Grant No.2022AH030064)+3 种基金Key Project at Central Government Level:the Ability Establishment of Sustainable Use for Valuable Chinese Medicine Resources(Grant No.2060302)Foundation of Anhui Province Key Laboratory of Pharmaceutical Preparation Technology and Application(Grant No.2021KFKT10)China Agriculture Research System of MOF and MARA(Grant No.CARS-21)Talent Support Program of Anhui University of Chinese Medicine(Grant No.2020rcyb007).
文摘Background:Diabetic retinopathy(DR)is currently the leading cause of blindness in elderly individuals with diabetes.Traditional Chinese medicine(TCM)prescriptions have shown remarkable effectiveness for treating DR.This study aimed to screen a novel TCM prescription against DR from patents and elucidate its medication rule and molecular mechanism using data mining,network pharmacology,molecular docking and molecular dynamics(MD)simulation.Method:TCM prescriptions for treating DR was collected from patents and a novel TCM prescription was identified using data mining.Subsequently,the mechanism of the novel TCM prescription against DR was explored by constructing a network of core TCMs-core active ingredients-core targets-core pathways.Finally,molecular docking and MD simulation were employed to validate the findings from network pharmacology.Result:The TCMs of the collected prescriptions primarily possessed bitter and cold properties with heat-clearing and supplementing effects,attributed to the liver,lung and kidney channels.Notably,a novel TCM prescription for treating DR was identified,composed of Lycii Fructus,Chrysanthemi Flos,Astragali Radix and Angelicae Sinensis Radix.Twenty core active ingredients and ten core targets of the novel TCM prescription for treating DR were screened.Moreover,the novel TCM prescription played a crucial role for treating DR by inhibiting inflammatory response,oxidative stress,retinal pigment epithelium cell apoptosis and retinal neovascularization through various pathways,such as the AGE-RAGE signaling pathway in diabetic complications and the MAPK signaling pathway.Finally,molecular docking and MD simulation demonstrated that almost all core active ingredients exhibited satisfactory binding energies to core targets.Conclusions:This study identified a novel TCM prescription and unveiled its multi-component,multi-target and multi-pathway characteristics for treating DR.These findings provide a scientific basis and novel insights into the development of drugs for DR prevention and treatment.
基金supported via funding from Prince Sattam bin Abdulaziz University Project Number(PSAU/2024/R/1445).
文摘The study aims to recognize how efficiently Educational DataMining(EDM)integrates into Artificial Intelligence(AI)to develop skills for predicting students’performance.The study used a survey questionnaire and collected data from 300 undergraduate students of Al Neelain University.The first step’s initial population placements were created using Particle Swarm Optimization(PSO).Then,using adaptive feature space search,Educational Grey Wolf Optimization(EGWO)was employed to choose the optimal attribute combination.The second stage uses the SVMclassifier to forecast classification accuracy.Different classifiers were utilized to evaluate the performance of students.According to the results,it was revealed that AI could forecast the final grades of students with an accuracy rate of 97%on the test dataset.Furthermore,the present study showed that successful students could be selected by the Decision Tree model with an efficiency rate of 87.50%and could be categorized as having equal information ratio gain after the semester.While the random forest provided an accuracy of 28%.These findings indicate the higher accuracy rate in the results when these models were implemented on the data set which provides significantly accurate results as compared to a linear regression model with accuracy(12%).The study concluded that the methodology used in this study can prove to be helpful for students and teachers in upgrading academic performance,reducing chances of failure,and taking appropriate steps at the right time to raise the standards of education.The study also motivates academics to assess and discover EDM at several other universities.
文摘The aviation industry is a sector that is developing, changing and growing every day in terms of technological and legal framework. There are generally three factors that enable airlines to hold on to the market. These factors are safety, service quality and price. Airline companies can analyze the customers in the market with a focus on price and quality and develop a business model according to their expectations. For example, business class and economy class passenger expectations are different from each other, so the service and price to be offered to them will be different. However, all customers have one common expectation and that is safety. No matter how high quality the service is or how cheap the price is, no one wants to fly with an airline or plane that is not safe. From an airline company’s point of view, an accident or breakdown of one of the company’s aircraft can cause irreparable image loss and financial damage. If we look at past examples, we see that there are many airline companies or maintenance organizations that could not recover after an accident and went bankrupt. Safety is an indispensable factor. Therefore, there is a unit in the sector called the safety management system (SMS), which collects data by taking a proactive and reactive approach. The way and purpose of the safety management system is to take a proactive approach to recognize and prevent unsafe situations before they cause accidents or breakdowns, or to take a reactive approach to find the causes of accidents and breakdowns that have occurred as a result of certain factors and to take the necessary measures to prevent the same situations from happening again in the sector. The field of data mining, which is necessary to predict the future behavior of customers in the field of marketing, is an area that marketing also values. In this study, data mining studies to ensure safety in the aviation industry and the security of customer information in marketing will be emphasized, firstly, the concept and importance of data mining will be mentioned.
文摘In light of the rapid growth and development of social media, it has become the focus of interest in many different scientific fields. They seek to extract useful information from it, and this is called (knowledge), such as extracting information related to people’s behaviors and interactions to analyze feelings or understand the behavior of users or groups, and many others. This extracted knowledge has a very important role in decision-making, creating and improving marketing objectives and competitive advantage, monitoring events, whether political or economic, and development in all fields. Therefore, to extract this knowledge, we need to analyze the vast amount of data found within social media using the most popular data mining techniques and applications related to social media sites.
基金supported by Clinical observation on the treatment of diabetic peripheral neuropathy by supplementing qi,promoting blood circulation and tonifying kidney (grant mumber YJ202324).
文摘Background:Using network pharmacology to explore the potential molecular mechanism of traditional Chinese medicine in treating polycystic ovary syndrome(PCOS)with kidney deficiency and blood stasis syndrome.Method:Collect the related literature materials of PCOS with kidney deficiency and blood stasis syndrome treated by traditional Chinese medicine in four databases in recent ten years,extract the information of prescriptions and complete the frequency analysis.Traditional Chinese Medicine Systems Pharmacology Database was used to screen out the effective components.Use Online Mendelian Inheritance in Man and other databases to screen PCOS disease targets.The intersection targets obtained by clustering prescription and PCOS disease targets were submitted to STRING database for protein-protein interaction network analysis,and Gene Ontology(GO)and Kyoto Encyclopedia of Genes and Genomes pathways were analysed by Metascape.Result:There are 155 kinds of traditional Chinese medicines used in the literature.The most commonly utilized ones are Cuscutae Semen,Angelicae Sinensis Radix,and Rehmanniae Radix Praeparata.The results of the cluster analysis indicated that the plants most commonly found throughout the prescription were Leonuri Herba,Lycopi Herba,Dipsaci Radix,etc.GO results show that biological processes include cell reaction to organic nitrogen compounds and cell reaction to nitrogen compounds.The functional display of GO molecule includes cytokine receptor binding,signal receptor regulator activity and so on.Kyoto Encyclopedia of Genes and Genomes results show that the possible mechanisms of action are cancer pathway,an endocrine resistance signal pathway.Conclusion:Through data mining,the cluster prescription for PCOS with kidney deficiency and blood stasis syndrome is Leonuri Herba,Lycopi Herba,Dipsaci Radix,etc.The network pharmacology research of cluster prescription shows that the main drug components for treating PCOS with kidney deficiency and blood stasis syndrome are quercetin,kaempferol,luteolin,tanshinone IIA,etc.,which act on PTGS2,NCOA2,and other targets,and treat PCOS with kidney deficiency and blood stasis syndrome through cancer and endocrine resistance.
基金supported by the Science and Technology Project of China Southern Power Grid(GZHKJXM20210043-080041KK52210002).
文摘Traditional distribution network planning relies on the professional knowledge of planners,especially when analyzing the correlations between the problems existing in the network and the crucial influencing factors.The inherent laws reflected by the historical data of the distribution network are ignored,which affects the objectivity of the planning scheme.In this study,to improve the efficiency and accuracy of distribution network planning,the characteristics of distribution network data were extracted using a data-mining technique,and correlation knowledge of existing problems in the network was obtained.A data-mining model based on correlation rules was established.The inputs of the model were the electrical characteristic indices screened using the gray correlation method.The Apriori algorithm was used to extract correlation knowledge from the operational data of the distribution network and obtain strong correlation rules.Degree of promotion and chi-square tests were used to verify the rationality of the strong correlation rules of the model output.In this study,the correlation relationship between heavy load or overload problems of distribution network feeders in different regions and related characteristic indices was determined,and the confidence of the correlation rules was obtained.These results can provide an effective basis for the formulation of a distribution network planning scheme.
基金the Deanship of Scientific Research at King Khalid University for funding this work under grant number(RGP 2/180/43)Princess Nourah bint Abdulrahman UniversityResearchers Supporting Project number(PNURSP2022R235)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.The authors would like to thank the Deanship of Scientific Research atUmmAl-Qura University for supporting this work by Grant Code:(22UQU4270206DSR01).
文摘Data mining process involves a number of steps fromdata collection to visualization to identify useful data from massive data set.the same time,the recent advances of machine learning(ML)and deep learning(DL)models can be utilized for effectual rainfall prediction.With this motivation,this article develops a novel comprehensive oppositionalmoth flame optimization with deep learning for rainfall prediction(COMFO-DLRP)Technique.The proposed CMFO-DLRP model mainly intends to predict the rainfall and thereby determine the environmental changes.Primarily,data pre-processing and correlation matrix(CM)based feature selection processes are carried out.In addition,deep belief network(DBN)model is applied for the effective prediction of rainfall data.Moreover,COMFO algorithm was derived by integrating the concepts of comprehensive oppositional based learning(COBL)with traditional MFO algorithm.Finally,the COMFO algorithm is employed for the optimal hyperparameter selection of the DBN model.For demonstrating the improved outcomes of the COMFO-DLRP approach,a sequence of simulations were carried out and the outcomes are assessed under distinct measures.The simulation outcome highlighted the enhanced outcomes of the COMFO-DLRP method on the other techniques.
文摘Prediction and diagnosis of cardiovascular diseases(CVDs)based,among other things,on medical examinations and patient symptoms are the biggest challenges in medicine.About 17.9 million people die from CVDs annually,accounting for 31%of all deaths worldwide.With a timely prognosis and thorough consideration of the patient’s medical history and lifestyle,it is possible to predict CVDs and take preventive measures to eliminate or control this life-threatening disease.In this study,we used various patient datasets from a major hospital in the United States as prognostic factors for CVD.The data was obtained by monitoring a total of 918 patients whose criteria for adults were 28-77 years old.In this study,we present a data mining modeling approach to analyze the performance,classification accuracy and number of clusters on Cardiovascular Disease Prognostic datasets in unsupervised machine learning(ML)using the Orange data mining software.Various techniques are then used to classify the model parameters,such as k-nearest neighbors,support vector machine,random forest,artificial neural network(ANN),naïve bayes,logistic regression,stochastic gradient descent(SGD),and AdaBoost.To determine the number of clusters,various unsupervised ML clustering methods were used,such as k-means,hierarchical,and density-based spatial clustering of applications with noise clustering.The results showed that the best model performance analysis and classification accuracy were SGD and ANN,both of which had a high score of 0.900 on Cardiovascular Disease Prognostic datasets.Based on the results of most clustering methods,such as k-means and hierarchical clustering,Cardiovascular Disease Prognostic datasets can be divided into two clusters.The prognostic accuracy of CVD depends on the accuracy of the proposed model in determining the diagnostic model.The more accurate the model,the better it can predict which patients are at risk for CVD.
文摘Imagine numerous clients,each with personal data;individual inputs are severely corrupt,and a server only concerns the collective,statistically essential facets of this data.In several data mining methods,privacy has become highly critical.As a result,various privacy-preserving data analysis technologies have emerged.Hence,we use the randomization process to reconstruct composite data attributes accurately.Also,we use privacy measures to estimate how much deception is required to guarantee privacy.There are several viable privacy protections;however,determining which one is the best is still a work in progress.This paper discusses the difficulty of measuring privacy while also offering numerous random sampling procedures and statistical and categorized data results.Further-more,this paper investigates the use of arbitrary nature with perturbations in privacy preservation.According to the research,arbitrary objects(most notably random matrices)have"predicted"frequency patterns.It shows how to recover crucial information from a sample damaged by a random number using an arbi-trary lattice spectral selection strategy.Thisfiltration system's conceptual frame-work posits,and extensive practicalfindings indicate that sparse data distortions preserve relatively modest privacy protection in various situations.As a result,the research framework is efficient and effective in maintaining data privacy and security.
文摘It is crucial,while using healthcare data,to assess the advantages of data privacy against the possible drawbacks.Data from several sources must be combined for use in many data mining applications.The medical practitioner may use the results of association rule mining performed on this aggregated data to better personalize patient care and implement preventive measures.Historically,numerous heuristics(e.g.,greedy search)and metaheuristics-based techniques(e.g.,evolutionary algorithm)have been created for the positive association rule in privacy preserving data mining(PPDM).When it comes to connecting seemingly unrelated diseases and drugs,negative association rules may be more informative than their positive counterparts.It is well-known that during negative association rules mining,a large number of uninteresting rules are formed,making this a difficult problem to tackle.In this research,we offer an adaptive method for negative association rule mining in vertically partitioned healthcare datasets that respects users’privacy.The applied approach dynamically determines the transactions to be interrupted for information hiding,as opposed to predefining them.This study introduces a novel method for addressing the problem of negative association rules in healthcare data mining,one that is based on the Tabu-genetic optimization paradigm.Tabu search is advantageous since it removes a huge number of unnecessary rules and item sets.Experiments using benchmark healthcare datasets prove that the discussed scheme outperforms state-of-the-art solutions in terms of decreasing side effects and data distortions,as measured by the indicator of hiding failure.
基金National Natural Science Foundations of China(No.71501103)Natural Science Foundation of Inner Mongolia,China(No.2015BS0705)the Program of Higher-Level Talents of Inner Mongolia University,China(No.20700-5145131)
文摘Complex repairable system is composed of thousands of components.Some maintenance management and decision problems in maintenance management and decision need to classify a set of components into several classes based on data mining.Furthermore,with the complexity of industrial equipment increasing,the managers should pay more attention to the key components and carry out the lean management is very important.Therefore,the idea"customer segmentation"of"precise marketing"can be used in the maintenance management of the multi-component system.Following the idea of segmentation,the components of multicomponent systems should be subdivied into groups based on specific attributes relevant to maintenance,such as maintenance cost,mean time between failures,and failure frequency.For the target specific groups of parts,the optimal maintenance policy,health assessment and maintenance scheduling can be determined.The proposed analysis framework will be given out.In order to illustrate the effectiveness of this method,a numerical example is given out.
文摘Protecting the privacy of data in the multi-cloud is a crucial task.Data mining is a technique that protects the privacy of individual data while mining those data.The most significant task entails obtaining data from numerous remote databases.Mining algorithms can obtain sensitive information once the data is in the data warehouse.Many traditional algorithms/techniques promise to provide safe data transfer,storing,and retrieving over the cloud platform.These strategies are primarily concerned with protecting the privacy of user data.This study aims to present data mining with privacy protection(DMPP)using precise elliptic curve cryptography(PECC),which builds upon that algebraic elliptic curve infinitefields.This approach enables safe data exchange by utilizing a reliable data consolidation approach entirely reliant on rewritable data concealing techniques.Also,it outperforms data mining in terms of solid privacy procedures while maintaining the quality of the data.Average approximation error,computational cost,anonymizing time,and data loss are considered performance measures.The suggested approach is practical and applicable in real-world situations according to the experimentalfindings.