Data mining techniques are used to discover knowledge from GIS databases in order to improve remote sensing image classification. Two learning granularities are proposed for inductive learning from spatial data: spatial object granularity and pixel granularity. We also present an approach that combines inductive learning with conventional image classification methods by selecting the class probabilities of Bayes classification as learning attributes. A land use classification experiment is performed in the Beijing area using a SPOT multispectral image and GIS data. Rules about spatial distribution patterns and shape features are discovered by the C5.0 inductive learning algorithm, and the image is then reclassified by deductive reasoning. Compared with the result produced by Bayes classification alone, the overall accuracy increased by 11%, and the accuracy of some classes, such as garden and forest, increased by about 30%. The results indicate that inductive learning can resolve spectral confusion to a great extent. Combining the Bayes method with inductive learning not only improves classification accuracy greatly but also extends the classification by subdividing some classes with the discovered knowledge.
Social media has grown rapidly and has become a focus of interest in many scientific fields. Researchers seek to extract useful information from it, referred to as knowledge, such as information about people’s behaviors and interactions that can be used to analyze sentiment or to understand the behavior of users or groups. This extracted knowledge plays an important role in decision-making, in creating and improving marketing objectives and competitive advantage, in monitoring political or economic events, and in development in all fields. Extracting this knowledge therefore requires analyzing the vast amount of data found within social media using the most popular data mining techniques and applications related to social media sites.
The growth of geo-technologies and the development of methods for spatial data collection have resulted in large spatial data repositories that require techniques for spatial information extraction in order to transform raw data into useful, previously unknown information. However, because spatial data mining is highly complex and requires an understanding of spatial relationships and their characteristics, efforts have been directed towards improving algorithms to increase performance and the quality of results. Spatial data mining has likewise been applied to several domains, including environmental management, which is the focus of this paper. The main original contribution of this work is a demonstration of spatial data mining using a novel algorithm with a multi-relational approach, applied to a database of water resources from a region of São Paulo State, Brazil, together with a discussion of the results obtained. Some characteristics involving the location of water resources and the profile of those administering water exploration were discovered and discussed.
Since the impoundment of the Three Gorges Reservoir (TGR) in 2003, numerous slopes have experienced noticeable movement or destabilization owing to reservoir level changes and seasonal rainfall. One case is the Outang landslide, a large-scale, active landslide on the south bank of the Yangtze River. The latest available monitoring data and site investigations are analyzed to establish the spatial and temporal characteristics of landslide deformation. Data mining technology, including two-step clustering and the Apriori algorithm, is then used to identify the dominant triggers of landslide movement. In the data mining process, the two-step clustering method groups the candidate triggers and the displacement rate into several clusters, and the Apriori algorithm generates correlation criteria for cause and effect. The analysis considers multiple locations on the landslide and incorporates two time scales: long-term deformation on a monthly basis and short-term deformation on a daily basis. The analysis shows that the deformation of the Outang landslide is driven by both rainfall and reservoir water, while the deformation varies spatiotemporally mainly because of differences in local responses to hydrological factors. The data mining results reveal different dominant triggering factors depending on the monitoring frequency: the monthly and bi-monthly cumulative rainfall control the monthly deformation, and the 10-d cumulative rainfall and the 5-d cumulative drop of the reservoir water level dominate the daily deformation of the landslide. It is concluded that the spatiotemporal deformation pattern and the data mining rules associated with precipitation and reservoir water level have the potential to be broadly implemented for improving landslide prevention and control in dam reservoirs and other landslide-prone areas.
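As a rough illustration of the rule-generation step described in this abstract, the sketch below computes support and confidence for rules of the form "trigger labels -> fast movement" over discretized monitoring records. The records, label names, and thresholds are hypothetical and are not taken from the study, and the enumeration is a brute-force stand-in for Apriori (it omits Apriori's candidate pruning).

```python
from itertools import combinations

# Hypothetical discretized monitoring records (labels are illustrative only):
# each set holds the labels that were true for one monitoring period.
records = [
    {"rain_high", "level_drop", "move_fast"},
    {"rain_high", "move_fast"},
    {"rain_low", "level_drop", "move_fast"},
    {"rain_low", "move_slow"},
    {"rain_high", "level_drop", "move_fast"},
    {"rain_low", "move_slow"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rules_for(consequent, transactions, min_support=0.3,
              min_confidence=0.8, max_len=2):
    """Enumerate rules 'antecedent -> consequent' that meet both thresholds.

    Brute-force enumeration; a real Apriori run would prune candidates
    whose subsets are already infrequent.
    """
    items = sorted(set().union(*transactions) - {consequent})
    found = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            s_ante = support(antecedent, transactions)
            s_both = support(set(antecedent) | {consequent}, transactions)
            if s_ante == 0 or s_both < min_support:
                continue
            confidence = s_both / s_ante
            if confidence >= min_confidence:
                found.append((antecedent, s_both, confidence))
    return found


for antecedent, sup, conf in rules_for("move_fast", records):
    print(antecedent, "-> move_fast", round(sup, 2), round(conf, 2))
```

Support measures how often a trigger combination and fast movement occur together, while confidence measures how reliably the combination is followed by fast movement; both thresholds here are arbitrary choices for the toy data.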
Objective: To explore the medication rules of traditional Chinese medicine (TCM) and the mechanism of action of hub herb pairs for treating insomnia. Methods: A total of 104 prescriptions were statistically analyzed. The association rule algorithm was applied to mine the hub herb pairs. Network pharmacology was used to analyze the mechanism of the hub herb pairs, while molecular docking was applied to simulate the interaction between receptors and herb molecules, thereby predicting their binding affinities. Results: The most frequently used herbs in TCM prescriptions for treating insomnia included Semen Ziziphi Spinosae, Radix Glycyrrhizae, Radix et Rhizoma Ginseng, and Poria cum Radix Pini. Among them, the most commonly used were supplementing herbs, followed by heat-clearing, mind-calming, and exterior-releasing herbs; their properties were warm and cold, their flavors sweet, pungent, and bitter, and their meridian tropisms the liver, lungs, spleen, kidneys, heart, and stomach. The hub herb pairs based on the association rules included Radix Astragali-Radix et Rhizoma Ginseng, Rhizoma Chuanxiong-Radix Glycyrrhizae, Semen Platycladi-Semen Ziziphi Spinosae, Pericarpium Citri Reticulatae-Radix Glycyrrhizae, Radix Polygalae-Semen Ziziphi Spinosae, and Radix Astragali-Semen Ziziphi Spinosae. Network pharmacology revealed that the cAMP signaling pathway might play a key role in treating insomnia, acting synergistically with the HIF-1 signaling pathway, prolactin signaling pathway, chemical carcinogenesis receptor activation, and PI3K-Akt signaling pathway. Molecular docking indicated good binding between the active ingredients of the hub herb pairs and the hub targets. Conclusions: This study identified six hub herb pairs for treating insomnia in TCM. These hub herb pairs may treat insomnia through the cAMP signaling pathway, acting synergistically with the HIF-1 signaling pathway, prolactin signaling pathway, chemical carcinogenesis receptor activation, and PI3K-Akt signaling pathway.
Background: Erzhu Erchen decoction (EZECD), which is based on Erchen decoction and enhanced with Atractylodes lancea and Atractylodes macrocephala, is widely used for the treatment of damp-heat internalized disease (clinical manifestations in Western medical terms include thirst, inability to drink more, diarrhea, yellow urine, and red tongue, among others). Nevertheless, the mechanism of EZECD in damp-heat internalized type 2 diabetes (T2D) remains unknown. We employed data mining, pharmacology databases, and experimental verification to study how EZECD treats damp-heat internalized T2D. Methods: The main compounds and genes of EZECD and damp-heat internalized T2D were obtained from pharmacology databases. Next, the overlapping targets of EZECD and damp-heat internalized T2D were analyzed by Gene Ontology and Kyoto Encyclopedia of Genes and Genomes enrichment analysis, and a compound-disease targets-pathway network was constructed to obtain the hub compound. Moreover, the hub genes and core related pathways were mined with weighted gene co-expression network analysis based on the Gene Expression Omnibus database, and the binding capability of the hub compound and genes was validated in AutoDock 1.5.7. Furthermore, violin plots and gene set enrichment analysis were used to explore the role of the hub genes in damp-heat internalized T2D. Finally, the interactions of the hub compound and genes were explored using the Comparative Toxicogenomics Database and quantitative polymerase chain reaction. Results: First, the herb-compounds-genes-disease network illustrated that the hub compound of EZECD for damp-heat internalized T2D could be quercetin. Consistently, the hub genes were CASP8, CCL2, and AHR according to weighted gene co-expression network analysis. Molecular docking showed that quercetin could bind with the hub genes. Further, gene set enrichment analysis and Gene Ontology indicated that CASP8 or CCL2 is negatively involved in insulin secretion in response to the TNF or lipopolysaccharide process, and that AHR or CCL2 positively regulates lipid and atherosclerosis and/or the NOD-like receptor signaling pathway and the TNF signaling pathway. Ultimately, quantitative polymerase chain reaction and western blotting analysis showed that quercetin could down-regulate the mRNA and protein expression of CASP8, CCL2, and AHR, consistent with the results in the Comparative Toxicogenomics Database. Conclusion: These results demonstrated that quercetin could inhibit the expression of CASP8, CCL2, and AHR in damp-heat internalized T2D, which improves insulin secretion and inhibits lipid and atherosclerosis as well as the NOD-like receptor signaling pathway and the TNF signaling pathway, suggesting that EZECD may be more effective in treating damp-heat internalized T2D.
[Objectives] To explore brand trends in the design of waist protection products through data mining, and to provide a reference for the design concept of the contour of a waist protection pillow. [Methods] Structural design information for all waist protection equipment was collected from the national Internet platform, and the data were classified and a database was established. IBM SPSS 26.0 and MATLAB 2018a were used to analyze the data, which were tabulated in Tableau 2022.4. After the association rules were clarified, the data were imported into Cinema 4D R21 to create the concept contour of the waist protection pillow. [Results] The mean and standard deviation of the single airbag design were the highest of all groups, with a mean of 0.511 and a standard deviation of 0.502. The mean and standard deviation of the upper and lower dual airbags were the lowest of all groups, with a mean of 0.015 and a standard deviation of 0.120. The correlation coefficient between single airbag and 120° arc stretching was 0.325, a positive correlation (P<0.01); the correlation coefficient between multiple airbags and 360° encircling fitting was 0.501, a positive correlation and the strongest among all groups (P<0.01). [Conclusions] The single airbag design is well recognized by companies and has received the most attention among all brand products. While focusing on single airbag design, most brands consider adding 120° arc stretching elements to the product design. When focusing on multiple airbag design, some brands believe that 360° encircling fitting elements need to be added to the product, and the correlation between the two is the highest among all groups.
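The reported means and standard deviations (for example 0.511 and 0.502) are what one would expect if each design feature were coded as a 0/1 indicator per product, since a binary variable with proportion p has a standard deviation close to sqrt(p(1-p)). That coding is an assumption here, not something the abstract states. Under that assumption, the sketch below shows how a mean, a sample standard deviation, and a Pearson correlation between two indicators could be computed; the two indicator vectors are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical 0/1 indicators, one entry per product (not the study's data).
single_airbag = [1, 1, 0, 1, 0, 0, 1, 0]
arc_120 = [1, 0, 0, 1, 0, 1, 1, 0]

print("mean:", mean(single_airbag))
print("sample sd:", round(stdev(single_airbag), 3))
print("correlation:", pearson(single_airbag, arc_120))
```

For two binary indicators the Pearson coefficient coincides with the phi coefficient, so a positive value means the two design elements tend to appear on the same products.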
The study aims to determine how efficiently Educational Data Mining (EDM) integrates with Artificial Intelligence (AI) to develop skills for predicting students’ performance. The study used a survey questionnaire and collected data from 300 undergraduate students of Al Neelain University. In the first stage, the initial population placements were created using Particle Swarm Optimization (PSO). Then, using adaptive feature space search, Educational Grey Wolf Optimization (EGWO) was employed to choose the optimal attribute combination. The second stage used the SVM classifier to forecast classification accuracy. Different classifiers were used to evaluate the performance of students. The results showed that AI could forecast the final grades of students with an accuracy of 97% on the test dataset. Furthermore, the study showed that successful students could be selected by the Decision Tree model with an efficiency rate of 87.50% and could be categorized as having equal information ratio gain after the semester, while the random forest provided an accuracy of 28%. These findings indicate that these models give significantly more accurate results on the data set than a linear regression model, which had an accuracy of 12%. The study concluded that the methodology used can help students and teachers improve academic performance, reduce the chances of failure, and take appropriate steps at the right time to raise the standards of education. The study also encourages academics to assess and explore EDM at other universities.
Background: Diabetic retinopathy (DR) is currently the leading cause of blindness in elderly individuals with diabetes. Traditional Chinese medicine (TCM) prescriptions have shown remarkable effectiveness for treating DR. This study aimed to screen a novel TCM prescription against DR from patents and to elucidate its medication rule and molecular mechanism using data mining, network pharmacology, molecular docking, and molecular dynamics (MD) simulation. Method: TCM prescriptions for treating DR were collected from patents, and a novel TCM prescription was identified using data mining. Subsequently, the mechanism of the novel TCM prescription against DR was explored by constructing a network of core TCMs-core active ingredients-core targets-core pathways. Finally, molecular docking and MD simulation were employed to validate the findings from network pharmacology. Result: The TCMs of the collected prescriptions primarily possessed bitter and cold properties with heat-clearing and supplementing effects, attributed to the liver, lung, and kidney channels. Notably, a novel TCM prescription for treating DR was identified, composed of Lycii Fructus, Chrysanthemi Flos, Astragali Radix, and Angelicae Sinensis Radix. Twenty core active ingredients and ten core targets of the novel TCM prescription for treating DR were screened. Moreover, the novel TCM prescription played a crucial role in treating DR by inhibiting inflammatory response, oxidative stress, retinal pigment epithelium cell apoptosis, and retinal neovascularization through various pathways, such as the AGE-RAGE signaling pathway in diabetic complications and the MAPK signaling pathway. Finally, molecular docking and MD simulation demonstrated that almost all core active ingredients exhibited satisfactory binding energies to the core targets. Conclusions: This study identified a novel TCM prescription and unveiled its multi-component, multi-target, and multi-pathway characteristics for treating DR. These findings provide a scientific basis and novel insights for the development of drugs for DR prevention and treatment.
The aviation industry is a sector that is developing, changing, and growing every day in terms of its technological and legal framework. Three factors generally enable airlines to hold on to the market: safety, service quality, and price. Airline companies can analyze the customers in the market with a focus on price and quality and develop a business model according to their expectations. For example, business class and economy class passengers have different expectations, so the service and price offered to them will differ. However, all customers share one expectation: safety. No matter how high the quality of service or how low the price, no one wants to fly with an airline or on a plane that is not safe. From an airline company’s point of view, an accident or breakdown of one of its aircraft can cause irreparable image loss and financial damage. Past examples show that many airline companies and maintenance organizations could not recover after an accident and went bankrupt. Safety is an indispensable factor. For this reason, the sector has a unit called the safety management system (SMS), which collects data by taking both a proactive and a reactive approach. The purpose of the safety management system is either to take a proactive approach to recognize and prevent unsafe situations before they cause accidents or breakdowns, or to take a reactive approach to find the causes of accidents and breakdowns that have already occurred and to take the measures necessary to prevent the same situations from happening again in the sector. Data mining, which is needed to predict the future behavior of customers, is also valued in marketing. This study emphasizes data mining studies aimed at ensuring safety in the aviation industry and the security of customer information in marketing; first, the concept and importance of data mining are introduced.
A teaching quality evaluation system based on data mining technology can accurately and fairly identify the core driving factors for improving teaching quality. The method applies association rule analysis to big data: it includes data collection and preprocessing steps, builds a data warehouse for the association rules, and then generates an educational quality evaluation framework using the principles of data mining. On this basis, this paper analyzes the design and construction methods of a teaching evaluation system under data mining, with the aim of supporting the improvement of the teaching evaluation system and of teaching quality.
Background: To explore, using network pharmacology, the potential molecular mechanism of traditional Chinese medicine in treating polycystic ovary syndrome (PCOS) with kidney deficiency and blood stasis syndrome. Method: Literature from the past ten years on PCOS with kidney deficiency and blood stasis syndrome treated by traditional Chinese medicine was collected from four databases, the prescription information was extracted, and a frequency analysis was completed. The Traditional Chinese Medicine Systems Pharmacology Database was used to screen the effective components. Online Mendelian Inheritance in Man and other databases were used to screen PCOS disease targets. The intersection of the targets of the cluster prescription and the PCOS disease targets was submitted to the STRING database for protein-protein interaction network analysis, and Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes pathways were analyzed with Metascape. Result: A total of 155 kinds of traditional Chinese medicines were used in the literature. The most commonly used were Cuscutae Semen, Angelicae Sinensis Radix, and Rehmanniae Radix Praeparata. The cluster analysis indicated that the herbs most commonly grouped in the cluster prescription were Leonuri Herba, Lycopi Herba, Dipsaci Radix, etc. The GO results show that the biological processes include cellular response to organic nitrogen compounds and cellular response to nitrogen compounds. The GO molecular functions include cytokine receptor binding, signaling receptor regulator activity, and so on. The Kyoto Encyclopedia of Genes and Genomes results show that the possible mechanisms of action are the cancer pathway and the endocrine resistance signaling pathway. Conclusion: Through data mining, the cluster prescription for PCOS with kidney deficiency and blood stasis syndrome is Leonuri Herba, Lycopi Herba, Dipsaci Radix, etc. The network pharmacology study of the cluster prescription shows that the main drug components for treating PCOS with kidney deficiency and blood stasis syndrome are quercetin, kaempferol, luteolin, tanshinone IIA, etc., which act on PTGS2, NCOA2, and other targets and treat PCOS with kidney deficiency and blood stasis syndrome through the cancer and endocrine resistance pathways.
The authors designed a spatial data mining system for ore-forming prediction based on the theory and methods of data mining and on spatial database techniques, in combination with the characteristics of geological information data. The system consists of data management, data mining and knowledge discovery, and knowledge representation. It can effectively integrate multi-source geoscience data, such as geology, geochemistry, geophysics, and remote sensing (RS). The system digitizes geological information data as data layer files consisting of two numerical values and stores these files in the system database. By combining the characteristics of the geological information, metallogenic prognosis was carried out for an example area in Heilongjiang Province, and the prospect area of a hydrothermal copper deposit was determined.
Data mining is the process of extracting implicit but potentially useful information from incomplete, noisy, and fuzzy data. Data mining offers excellent nonlinear modeling and self-organized learning, and it can play a vital role in the interpretation of well logging data of complex reservoirs. We used data mining to identify the lithologies in a complex reservoir. The reservoir lithologies served as the classification task target and were identified using feature extraction, feature selection, and modeling of data streams. We used independent component analysis to extract information from well curves. We then used the branch-and-bound algorithm to look for the optimal feature subsets and eliminate redundant information. Finally, we used the C5.0 decision-tree algorithm to set up disaggregated models of the well logging curves. The modeling and actual logging data were in good agreement, showing the usefulness of data mining methods in complex reservoirs.
An intrusion detection (ID) model is proposed based on the fuzzy data mining method. A major difficulty of anomaly ID is that patterns of normal behavior change with time. In addition, an actual intrusion with a small deviation may match normal patterns, so the intrusion cannot be detected by the detection system. To solve this problem, a fuzzy data mining technique is used to extract patterns representing the normal behavior of a network. A set of fuzzy association rules mined from the network data serves as a model of “normal behaviors”. To detect anomalous behaviors, fuzzy association rules are generated from new audit data, and their similarity with the sets mined from “normal” data is computed. If the similarity value is lower than a threshold, an alarm is raised. Furthermore, genetic algorithms are used to adjust the fuzzy membership functions and to select an appropriate set of features.
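The detection step in this abstract compares a rule set mined from new audit data with a rule set mined from normal traffic and raises an alarm when the similarity falls below a threshold. The abstract does not give the membership functions or the similarity formula, so the sketch below substitutes common choices: triangular membership functions, fuzzy support with the minimum t-norm, and a weighted Jaccard (min over max) similarity. The rule names, supports, and the 0.5 threshold are hypothetical.

```python
def triangular(x, left, peak, right):
    """Triangular membership value of x, in [0, 1]."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_support(records, memberships):
    """Mean over records of the minimum membership across the given features.

    memberships maps a feature name to a function returning a value in [0, 1].
    """
    total = 0.0
    for rec in records:
        total += min(fn(rec[name]) for name, fn in memberships.items())
    return total / len(records)


def similarity(normal, observed):
    """Weighted Jaccard similarity of two rule sets given as {rule: support}.

    A stand-in measure (assumption): rules absent from one set count as
    support 0 in that set.
    """
    rules = set(normal) | set(observed)
    shared = sum(min(normal.get(r, 0.0), observed.get(r, 0.0)) for r in rules)
    union = sum(max(normal.get(r, 0.0), observed.get(r, 0.0)) for r in rules)
    return shared / union if union else 1.0


# Hypothetical rule sets (rule label -> fuzzy support).
normal_rules = {"syn_low -> fin_low": 0.6, "syn_med -> fin_med": 0.3}
audit_rules = {"syn_low -> fin_low": 0.3, "syn_high -> fin_low": 0.3}

THRESHOLD = 0.5  # arbitrary value chosen for the example
score = similarity(normal_rules, audit_rules)
alarm = score < THRESHOLD
print(round(score, 2), alarm)
```

Because memberships are graded rather than crisp, a small shift in a measured value changes the rule supports gradually, which is the property the abstract relies on to tolerate drift in normal behavior.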
Existing data mining methods mostly focus on relational databases and structured data rather than on complex structured data such as Extensible Markup Language (XML). By converting the XML document type description into a relational semantic record of the XML data relations, and by using an XML data mining language, the XML data mining system presents a strategy for mining information from XML.
Data mining enables us to form forecasts and models regarding the future by making use of past data. Any method that helps to discover data can be used as a data mining method. Enterprises gain an important competitive advantage through data mining methods. Data mining is used in different fields. In finance, it is especially used in portfolio management, fraud detection, payment prediction, loan risk analysis, mortgage scoring, detecting transaction manipulation, financial risk management, customer profiling, and the foreign exchange market. Gaining knowledge can be costly, risky, and time-consuming for enterprises. Thus, enterprises today use data mining as an innovative competitive means. The aim of the study is to determine the importance of data mining in financial applications.
This article investigates the relationship between knowledge of competitors and the development of new products in the field of capital medical equipment. To identify the criteria for measuring competitors’ knowledge and the development of new capital medical equipment products, marketing experts were interviewed, and a researcher-made questionnaire was then compiled and distributed among the statistical sample of the research. The questionnaire was distributed to and collected from a selected sample of 100 members of the statistical population. To analyze the gathered data, the structural equation modeling (SEM) method was used in the SMART PLS 2 software to estimate the model, and the k-means approach was then used to cluster the capital medical equipment market based on the knowledge of actual and potential competitors. The results show that the knowledge of potential and actual competitors has a positive and significant effect on the development of new products in the capital medical equipment market. From the point of view of the knowledge of actual competitors, the “MRI”, “Ultrasound” and “SPECT” markets are grouped in the low knowledge cluster; the “Pet MRI”, “CT Scan”, “Mammography”, “Radiography, Fluoroscopy and CRM”, “Pet CT”, “SPECT CT” and “Gamma Camera” markets are grouped in the medium knowledge cluster; and the “Angiography” and “CBCT” markets are located in the high knowledge cluster. From the perspective of the knowledge of potential competitors, the “Angiography”, “Mammography”, “SPECT” and “SPECT CT” markets are in the low knowledge cluster; the “CT Scan”, “Radiography, Fluoroscopy and CRM”, “Pet CT” and “CBCT” markets are in the medium knowledge cluster; and the “MRI”, “Pet MRI”, “Ultrasound” and “Gamma Camera” markets are in the high knowledge cluster.
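The clustering step in this abstract groups markets into low, medium, and high knowledge clusters with k-means. As a minimal sketch of that idea, the code below runs Lloyd's algorithm on one-dimensional scores with k = 3. The market labels and scores are invented for illustration and do not come from the study, and the deterministic initialization (spreading starting centers over the sorted values) is a convenience assumed here, not the procedure the authors used.

```python
def kmeans_1d(values, k=3, max_iter=100):
    """Lloyd's algorithm on scalar values; returns (centers, labels).

    Requires k >= 2. Initial centers are spread over the sorted values
    so that the result is deterministic.
    """
    ordered = sorted(values)
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]

    def assign(cs):
        # Index of the nearest center for every value.
        return [min(range(k), key=lambda c: abs(v - cs[c])) for v in values]

    for _ in range(max_iter):
        labels = assign(centers)
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # Keep the old center if a cluster ends up empty.
            updated.append(sum(members) / len(members) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return centers, assign(centers)


# Hypothetical competitor-knowledge scores per market (illustrative only).
scores = {
    "market_a": 1.0, "market_b": 1.2, "market_c": 0.9,
    "market_d": 3.0, "market_e": 3.2,
    "market_f": 5.0, "market_g": 5.1,
}

centers, labels = kmeans_1d(list(scores.values()), k=3)
names = ["low", "medium", "high"]  # centers come out in ascending order here
for market, label in zip(scores, labels):
    print(market, names[label])
```

With these toy scores the three clusters are well separated, so the algorithm converges after one update; real questionnaire scores would typically be multi-dimensional and less cleanly separated.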
This paper describes web data mining technology in detail and analyzes the relationships among the data on a tourism e-commerce website (including the server log, tourism commodity database, user database, and shopping cart) to obtain relevant information on user preferences for tourism commodities. Based on these models, the paper presents recommendation strategies for the site's registered users, gives the corresponding formulas for calculating the recommendation value of certain items for the current user together with the corresponding recommendation algorithm, and shows how the system can produce recommendations for the user.
Unstable angina (UA) is the most dangerous type of coronary heart disease (CHD), causing increasing mortality and morbidity worldwide. Identifying biomarkers for UA at the level of proteomics and metabolomics is a better avenue for understanding its inner mechanism, and feature-selection-based data mining methods are well suited to identifying such biomarkers. In this study, we carried out a clinical epidemiology study to collect plasma from UA in-patients and controls. Proteomics and metabolomics data were obtained via two-dimensional difference gel electrophoresis and gas chromatography techniques. We present a novel computational strategy to select as few biomarkers as possible for UA from the two groups of data. First, a decision tree was used to select biomarkers for UA, and 3-fold cross validation was used to evaluate computational performances for the three methods. Alternatively, we combined the independent t test and a classification-based data mining method with the backward elimination technique to select as few protein and metabolite biomarkers as possible with the best classification performance. With this method, we selected 6 proteins and 5 metabolites for UA. The novel method presented here provides better insight into the pathology of a disease.
Abstract: Data mining techniques are used to discover knowledge from GIS databases in order to improve remote sensing image classification. Two learning granularities are proposed for inductive learning from spatial data: spatial object granularity and pixel granularity. We also present an approach that combines inductive learning with conventional image classification methods by selecting the class probabilities of Bayes classification as learning attributes. A land use classification experiment is performed in the Beijing area using a SPOT multispectral image and GIS data. Rules about spatial distribution patterns and shape features are discovered by the C5.0 inductive learning algorithm, and the image is then reclassified by deductive reasoning. Compared with the result produced by Bayes classification alone, the overall accuracy increased by 11%, and the accuracy of some classes, such as garden and forest, increased by about 30%. The results indicate that inductive learning can resolve spectral confusion to a great extent. Combining the Bayes method with inductive learning not only improves classification accuracy greatly but also extends the classification by subdividing some classes with the discovered knowledge.
Abstract: With its rapid growth and development, social media has become a focus of interest in many scientific fields. Researchers seek to extract useful information, referred to as knowledge, from it, such as information about people's behaviors and interactions that can be used to analyze sentiment or to understand the behavior of users or groups. This extracted knowledge plays an important role in decision-making, in creating and improving marketing objectives and competitive advantage, in monitoring political or economic events, and in development in all fields. To extract this knowledge, the vast amount of data found within social media must be analyzed using the most popular data mining techniques and applications related to social media sites.
Abstract: The growth of geo-technologies and the development of methods for spatial data collection have resulted in large spatial data repositories that require techniques for spatial information extraction in order to transform raw data into useful, previously unknown information. However, because spatial data mining is highly complex and requires an understanding of spatial relationships and their characteristics, efforts have been directed towards improving algorithms to increase performance and the quality of results. Likewise, spatial data mining has been applied to several issues, including environmental management, which is the focus of this paper. The main original contribution of this work is a demonstration of spatial data mining using a novel algorithm with a multi-relational approach, applied to a database of water resources from a region of São Paulo State, Brazil, together with a discussion of the results obtained. Some characteristics involving the location of water resources and the profile of those administering water exploration were discovered and discussed.
Funding: Supported by the Natural Science Foundation of Shandong Province, China (Grant No. ZR2021QD032).
Abstract: Since the impoundment of the Three Gorges Reservoir (TGR) in 2003, numerous slopes have experienced noticeable movement or destabilization owing to reservoir level changes and seasonal rainfall. One case is the Outang landslide, a large-scale and active landslide on the south bank of the Yangtze River. The latest available monitoring data and site investigations are analyzed to establish the spatial and temporal deformation characteristics of the landslide. Data mining technology, including two-step clustering and the Apriori algorithm, is then used to identify the dominant triggers of landslide movement. In the data mining process, the two-step clustering method clusters the candidate triggers and the displacement rate into several groups, and the Apriori algorithm generates association rules linking cause and effect. The analysis considers multiple locations on the landslide and incorporates two time scales: long-term deformation on a monthly basis and short-term deformation on a daily basis. The analysis shows that the deformation of the Outang landslide is driven by both rainfall and reservoir water, and that it varies spatiotemporally mainly because of differences in local responses to hydrological factors. The data mining results reveal different dominant triggering factors depending on the monitoring frequency: the monthly and bi-monthly cumulative rainfall control the monthly deformation, while the 10-d cumulative rainfall and the 5-d cumulative drop of the reservoir water level dominate the daily deformation. It is concluded that the spatiotemporal deformation pattern and the data mining rules associated with precipitation and reservoir water level have the potential to be broadly implemented for improving landslide prevention and control in dam reservoirs and other landslide-prone areas.
Funding: National Natural Science Foundation of China (82360905); Gansu Provincial University Teachers' Innovation Fund Projects (2023A-092 and 2024B-109).
Abstract: Objective: To explore the medication rules of traditional Chinese medicine (TCM) and the mechanism of action of hub herb pairs for treating insomnia. Methods: A total of 104 prescriptions were statistically analyzed. The association rule algorithm was applied to mine the hub herb pairs. Network pharmacology was used to analyze the mechanism of the hub herb pairs, while molecular docking was applied to simulate the interaction between receptors and herb molecules, thereby predicting their binding affinities. Results: The most frequently used herbs in TCM prescriptions for treating insomnia included Semen Ziziphi Spinosae, Radix Glycyrrhizae, Radix et Rhizoma Ginseng, and Poria cum Radix Pini. The most commonly used were supplementing herbs, followed by heat-clearing, mind-calming, and exterior-releasing ones, with warm and cold properties; sweet, pungent, and bitter flavors; and meridian tropisms of liver, lungs, spleen, kidneys, heart, and stomach. The hub herb pairs based on the association rules included Radix Astragali-Radix et Rhizoma Ginseng, Rhizoma Chuanxiong-Radix Glycyrrhizae, Semen Platycladi-Semen Ziziphi Spinosae, Pericarpium Citri Reticulatae-Radix Glycyrrhizae, Radix Polygalae-Semen Ziziphi Spinosae, and Radix Astragali-Semen Ziziphi Spinosae. Network pharmacology revealed that the cAMP signaling pathway might play a key role in treating insomnia, acting synergistically with the HIF-1 signaling pathway, the prolactin signaling pathway, chemical carcinogenesis receptor activation, and the PI3K-Akt signaling pathway. Molecular docking indicated good binding between the active ingredients of the hub herb pairs and the hub targets. Conclusions: This study identified six hub herb pairs for treating insomnia in TCM. These hub herb pairs may treat insomnia through the cAMP signaling pathway, acting synergistically with the HIF-1 signaling pathway, the prolactin signaling pathway, chemical carcinogenesis receptor activation, and the PI3K-Akt signaling pathway.
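The abstract above does not say which association rule implementation was used or which thresholds were set. As a minimal sketch of the general idea only, the Python below counts pairwise co-occurrence in a prescription list and reports support and confidence for each herb pair. The function name `mine_pairs`, the thresholds, and the four toy prescriptions are all assumptions made for illustration; they are not the study's data or method, and a full Apriori run would also consider itemsets larger than pairs.

```python
from collections import Counter
from itertools import combinations


def mine_pairs(prescriptions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for frequent herb pairs.

    Hypothetical helper: a pair-only simplification of association rule mining.
    """
    n = len(prescriptions)
    item_counts = Counter()
    pair_counts = Counter()
    for herbs in prescriptions:
        unique = sorted(set(herbs))  # ignore repeated herbs within a prescription
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of prescriptions containing both herbs
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]  # P(y in prescription | x in prescription)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    # Strongest rules first; names break ties so the order is deterministic.
    return sorted(rules, key=lambda r: (-r[2], -r[3], r[0], r[1]))


# Toy prescriptions invented for illustration (not taken from the 104 analyzed).
toy = [
    ["Semen Ziziphi Spinosae", "Radix Polygalae", "Radix Glycyrrhizae"],
    ["Semen Ziziphi Spinosae", "Radix Polygalae"],
    ["Semen Ziziphi Spinosae", "Radix Glycyrrhizae"],
    ["Radix Astragali", "Radix Glycyrrhizae"],
]
rules = mine_pairs(toy)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

On real data the thresholds would normally be tuned to the size of the prescription set, and lift is often reported alongside support and confidence.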
Funding: Supported by a grant from the Hubei Key Laboratory of Diabetes and Angiopathy Program of Hubei University of Science and Technology (2020XZ10) and the Project of Education Commission of Hubei Province (B2022192).
Abstract: Background: Erzhu Erchen decoction (EZECD), which is based on Erchen decoction and enhanced with Atractylodes lancea and Atractylodes macrocephala, is widely used for the treatment of damp-heat internalized disease (whose clinical manifestations in Western medicine include thirst, inability to drink more, diarrhea, yellow urine, and red tongue, among others). Nevertheless, the mechanism of EZECD in damp-heat internalized type 2 diabetes (T2D) remains unknown. We employed data mining, pharmacology databases and experimental verification to study how EZECD treats damp-heat internalized T2D. Methods: The main compounds and genes of EZECD and damp-heat internalized T2D were obtained from pharmacology databases. The overlapping targets of EZECD and damp-heat internalized T2D were then subjected to Gene Ontology and Kyoto Encyclopedia of Genes and Genomes analysis, and a compound-disease targets-pathway network was constructed to obtain the hub compound. Moreover, the hub genes and core related pathways were mined with weighted gene co-expression network analysis based on the Gene Expression Omnibus database, and the binding capability of the hub compound and genes was validated in AutoDock 1.5.7. Furthermore, violin plots and gene set enrichment analysis were used to explore the role of the hub genes in damp-heat internalized T2D. Finally, the interactions of the hub compound and genes were explored using the Comparative Toxicogenomics Database and quantitative polymerase chain reaction. Results: First, the herb-compounds-genes-disease network illustrated that the hub compound of EZECD for damp-heat internalized T2D could be quercetin. Consistently, the hub genes were CASP8, CCL2, and AHR according to weighted gene co-expression network analysis. Molecular docking showed that quercetin could bind with the hub genes. Further, gene set enrichment analysis and Gene Ontology indicated that CASP8 or CCL2 is negatively involved in the insulin secretion response to the TNF or lipopolysaccharide process, and that AHR or CCL2 positively regulates lipid and atherosclerosis, including the NOD-like receptor signaling pathway and the TNF signaling pathway. Ultimately, quantitative polymerase chain reaction and western blotting analysis showed that quercetin could down-regulate the mRNA and protein expression of CASP8, CCL2, and AHR, which was consistent with the results in the Comparative Toxicogenomics Database. Conclusion: These results demonstrated that quercetin could inhibit the expression of CASP8, CCL2, and AHR in damp-heat internalized T2D, which improves insulin secretion and inhibits lipid and atherosclerosis, including the NOD-like receptor signaling pathway and the TNF signaling pathway, suggesting that EZECD may be more effective in treating damp-heat internalized T2D.
Funding: Supported by the Municipal Public Welfare Science and Technology Project of Zhoushan Science and Technology Bureau, Zhejiang Province (2021C31064).
Abstract: [Objectives] To explore brand trends in the design of waist protection products through data mining, and to provide a reference for the contour design concept of a waist protection pillow. [Methods] The structural design information of all waist protection equipment was collected from the national Internet platform, and the data were classified and a database was established. IBM SPSS 26.0 and MATLAB 2018a were used to analyze the data, which were tabulated in Tableau 2022.4. After the association rules were clarified, the data were imported into Cinema 4D R21 to create the concept contour of the waist protection pillow. [Results] The single airbag design had the highest average and standard deviation of all groups, with an average of 0.511 and a standard deviation of 0.502. The upper and lower dual airbag design had the lowest average and standard deviation of all groups, with an average of 0.015 and a standard deviation of 0.120. The correlation coefficient between single airbag and 120° arc stretching was 0.325, a positive correlation (P<0.01). The correlation coefficient between multiple airbags and 360° encircling fitting was 0.501, a positive correlation and the highest degree of correlation observed (P<0.01). [Conclusions] The single airbag design is well recognized by companies and has received the most attention among all brand products. While focusing on single airbag design, most brands consider adding 120° arc stretching elements to the product design. When focusing on multiple airbag design, some brands believe that 360° encircling fitting elements need to be added to the product, and the correlation between these two is the highest among all groups.
Funding: Supported via funding from Prince Sattam bin Abdulaziz University, Project Number (PSAU/2024/R/1445).
Abstract: The study aims to determine how efficiently Educational Data Mining (EDM) integrates with Artificial Intelligence (AI) to develop skills for predicting students' performance. The study used a survey questionnaire and collected data from 300 undergraduate students of Al Neelain University. In the first stage, the initial population placements were created using Particle Swarm Optimization (PSO). Then, using adaptive feature space search, Educational Grey Wolf Optimization (EGWO) was employed to choose the optimal attribute combination. The second stage uses the SVM classifier to forecast classification accuracy. Different classifiers were used to evaluate the performance of students. The results revealed that AI could forecast the final grades of students with an accuracy rate of 97% on the test dataset. Furthermore, the study showed that successful students could be selected by the Decision Tree model with an efficiency rate of 87.50% and could be categorized as having equal information ratio gain after the semester. The random forest, by contrast, provided an accuracy of 28%. These findings indicate that these models achieve a higher accuracy rate on the data set than a linear regression model, whose accuracy was 12%. The study concluded that the methodology used can help students and teachers improve academic performance, reduce the chances of failure, and take appropriate steps at the right time to raise the standards of education. The study also encourages academics to assess and explore EDM at other universities.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 82104701); Science Fund Program for Outstanding Young Scholars in Universities of Anhui Province (Grant No. 2022AH030064); Key Project at Central Government Level: the Ability Establishment of Sustainable Use for Valuable Chinese Medicine Resources (Grant No. 2060302); Foundation of Anhui Province Key Laboratory of Pharmaceutical Preparation Technology and Application (Grant No. 2021KFKT10); China Agriculture Research System of MOF and MARA (Grant No. CARS-21); Talent Support Program of Anhui University of Chinese Medicine (Grant No. 2020rcyb007).
Abstract: Background: Diabetic retinopathy (DR) is currently the leading cause of blindness in elderly individuals with diabetes. Traditional Chinese medicine (TCM) prescriptions have shown remarkable effectiveness in treating DR. This study aimed to screen a novel TCM prescription against DR from patents and to elucidate its medication rule and molecular mechanism using data mining, network pharmacology, molecular docking and molecular dynamics (MD) simulation. Method: TCM prescriptions for treating DR were collected from patents, and a novel TCM prescription was identified using data mining. Subsequently, the mechanism of the novel TCM prescription against DR was explored by constructing a network of core TCMs, core active ingredients, core targets and core pathways. Finally, molecular docking and MD simulation were employed to validate the findings from network pharmacology. Result: The TCMs of the collected prescriptions primarily possessed bitter and cold properties with heat-clearing and supplementing effects, and were attributed to the liver, lung and kidney channels. Notably, a novel TCM prescription for treating DR was identified, composed of Lycii Fructus, Chrysanthemi Flos, Astragali Radix and Angelicae Sinensis Radix. Twenty core active ingredients and ten core targets of the novel TCM prescription were screened. Moreover, the novel TCM prescription played a crucial role in treating DR by inhibiting the inflammatory response, oxidative stress, retinal pigment epithelium cell apoptosis and retinal neovascularization through various pathways, such as the AGE-RAGE signaling pathway in diabetic complications and the MAPK signaling pathway. Finally, molecular docking and MD simulation demonstrated that almost all core active ingredients exhibited satisfactory binding energies to the core targets. Conclusions: This study identified a novel TCM prescription and unveiled its multi-component, multi-target and multi-pathway characteristics for treating DR. These findings provide a scientific basis and novel insights into the development of drugs for DR prevention and treatment.
Abstract: The aviation industry is a sector that develops, changes and grows every day in terms of its technological and legal framework. Three factors generally enable airlines to hold on to the market: safety, service quality and price. Airline companies can analyze the customers in the market with a focus on price and quality and develop a business model according to their expectations. For example, business class and economy class passengers have different expectations, so the service and price offered to them will differ. However, all customers share one expectation: safety. No matter how high the quality of the service or how low the price, no one wants to fly with an airline or on a plane that is not safe. From an airline company's point of view, an accident or breakdown involving one of its aircraft can cause irreparable loss of image and financial damage. Past examples show that many airline companies and maintenance organizations could not recover after an accident and went bankrupt. Safety is therefore an indispensable factor, and the sector has a unit called the safety management system (SMS), which collects data by taking both a proactive and a reactive approach. The purpose of the safety management system is either to recognize and prevent unsafe situations proactively, before they cause accidents or breakdowns, or to find reactively the causes of accidents and breakdowns that have already occurred and take the measures needed to prevent the same situations from happening again. Data mining, which is needed to predict the future behavior of customers, is also an area that marketing values. This study focuses on data mining studies aimed at ensuring safety in the aviation industry and the security of customer information in marketing; it begins with the concept and importance of data mining.
Abstract: A teaching quality evaluation system based on data mining technology can accurately and fairly identify the core driving factors for improving teaching quality. The method relies on association rule analysis of big data: it covers the preparatory steps of data collection and processing, builds a data warehouse for the association rules, and then generates an educational quality evaluation framework using the principles of data mining. On this basis, this paper analyzes the design and construction method of a teaching evaluation system under data mining, in the hope of helping to improve both the teaching evaluation system and teaching quality.
Funding: Supported by the project "Clinical observation on the treatment of diabetic peripheral neuropathy by supplementing qi, promoting blood circulation and tonifying kidney" (grant number YJ202324).
Abstract: Background: This study used network pharmacology to explore the potential molecular mechanism of traditional Chinese medicine in treating polycystic ovary syndrome (PCOS) with kidney deficiency and blood stasis syndrome. Method: Literature from the past ten years on PCOS with kidney deficiency and blood stasis syndrome treated by traditional Chinese medicine was collected from four databases, the prescription information was extracted, and a frequency analysis was completed. The Traditional Chinese Medicine Systems Pharmacology Database was used to screen the effective components. Online Mendelian Inheritance in Man and other databases were used to screen PCOS disease targets. The intersection of the clustering prescription targets and the PCOS disease targets was submitted to the STRING database for protein-protein interaction network analysis, and Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes pathways were analyzed with Metascape. Result: A total of 155 kinds of traditional Chinese medicines were used in the literature. The most commonly used were Cuscutae Semen, Angelicae Sinensis Radix, and Rehmanniae Radix Praeparata. The cluster analysis indicated that the herbs most commonly found across the prescriptions were Leonuri Herba, Lycopi Herba, Dipsaci Radix, etc. The GO results show that the biological processes include cellular responses to organic nitrogen compounds and to nitrogen compounds. The GO molecular functions include cytokine receptor binding, signal receptor regulator activity and so on. The Kyoto Encyclopedia of Genes and Genomes results show that the possible mechanisms of action involve the cancer pathway and the endocrine resistance signaling pathway. Conclusion: Through data mining, the cluster prescription for PCOS with kidney deficiency and blood stasis syndrome is Leonuri Herba, Lycopi Herba, Dipsaci Radix, etc. The network pharmacology study of the cluster prescription shows that the main drug components for treating PCOS with kidney deficiency and blood stasis syndrome are quercetin, kaempferol, luteolin, tanshinone IIA, etc., which act on PTGS2, NCOA2, and other targets, and treat PCOS with kidney deficiency and blood stasis syndrome through the cancer and endocrine resistance pathways.
Abstract: The authors designed a spatial data mining system for ore-forming prediction based on the theory and methods of data mining and on spatial database techniques, in combination with the characteristics of geological information data. The system consists of data management, data mining and knowledge discovery, and knowledge representation. It can effectively integrate multi-source geoscience data, such as geology, geochemistry, geophysics, and remote sensing (RS). The system digitizes geological information as data layer files, each consisting of two numerical values, and stores these files in the system database. Metallogenic prognosis was carried out by combining the characteristics of the geological information, using an area in Heilongjiang Province as an example, and the prospect area of a hydrothermal copper deposit was determined.
Funding: Sponsored by the National Science and Technology Major Project (No. 2011ZX05023-005-006).
Abstract: Data mining is the process of extracting implicit but potentially useful information from incomplete, noisy, and fuzzy data. Data mining offers excellent nonlinear modeling and self-organized learning, and it can play a vital role in the interpretation of well logging data of complex reservoirs. We used data mining to identify the lithologies in a complex reservoir. The reservoir lithologies served as the classification task target and were identified using feature extraction, feature selection, and modeling of data streams. We used independent component analysis to extract information from well curves. We then used the branch-and-bound algorithm to look for the optimal feature subsets and eliminate redundant information. Finally, we used the C5.0 decision-tree algorithm to set up disaggregated models of the well logging curves. The modeling and actual logging data were in good agreement, showing the usefulness of data mining methods in complex reservoirs.
Abstract: An intrusion detection (ID) model is proposed based on the fuzzy data mining method. A major difficulty of anomaly ID is that patterns of normal behavior change with time. In addition, an actual intrusion with a small deviation may match normal patterns, so the intrusion cannot be detected by the detection system. To solve this problem, a fuzzy data mining technique is used to extract patterns representing the normal behavior of a network. A set of fuzzy association rules mined from the network data serves as a model of "normal behaviors". To detect anomalous behaviors, fuzzy association rules are generated from new audit data, and their similarity with the sets mined from "normal" data is computed. If the similarity value is lower than a threshold value, an alarm is given. Furthermore, genetic algorithms are used to adjust the fuzzy membership functions and to select an appropriate set of features.
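The abstract above does not specify how the similarity between two fuzzy rule sets is computed. The Python sketch below shows only the alarm logic it describes: compare the rules mined from new audit data with the "normal" rule set and raise an alarm when the similarity falls below a threshold. The weighted Jaccard measure, the function names, the threshold of 0.5, and the toy rules are all assumptions chosen for illustration, not the paper's definitions.

```python
def rule_set_similarity(normal, observed):
    """Weighted Jaccard similarity between two {rule: confidence} mappings.

    Assumed measure for illustration; the source does not name its similarity function.
    """
    keys = set(normal) | set(observed)
    if not keys:
        return 1.0  # two empty rule sets are treated as identical
    overlap = sum(min(normal.get(k, 0.0), observed.get(k, 0.0)) for k in keys)
    total = sum(max(normal.get(k, 0.0), observed.get(k, 0.0)) for k in keys)
    return overlap / total if total else 1.0


def raise_alarm(normal, observed, threshold=0.5):
    """Return True when the observed rules differ too much from the normal model."""
    return rule_set_similarity(normal, observed) < threshold


# Hypothetical fuzzy rules: (antecedent, consequent) -> confidence.
normal_rules = {("syn=LOW", "fin=LOW"): 0.9, ("syn=MED", "fin=MED"): 0.8}
similar_rules = {("syn=LOW", "fin=LOW"): 0.85, ("syn=MED", "fin=MED"): 0.8}
anomalous_rules = {("syn=HIGH", "fin=LOW"): 0.9}

print(raise_alarm(normal_rules, similar_rules))    # similarity close to 1
print(raise_alarm(normal_rules, anomalous_rules))  # no shared rules
```

In the scheme the abstract describes, the membership functions that produce labels such as LOW and HIGH would themselves be tuned by a genetic algorithm; that step is outside this sketch.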
Abstract: Existing data mining methods mostly focus on relational databases and structured data, not on complex structured data such as extensible markup language (XML). By converting the XML document type description into relational semantics that record the relations among XML data, and by using an XML data mining language, the XML data mining system presents a strategy for mining information from XML.
Abstract: Data mining enables us to build forecasts and models of the future from past data. Any method that helps to discover information in data can be used as a data mining method. Enterprises gain an important competitive advantage from data mining methods, which are used in many fields. In finance, data mining is used especially in portfolio management, fraud detection, payment prediction, loan risk analysis, mortgage scoring, detecting transaction manipulation, financial risk management, customer profiling and the foreign exchange market. Gaining knowledge can be costly, risky and time-consuming for enterprises, so today they use data mining as an innovative competitive tool. The aim of the study is to determine the importance of data mining in financial applications.
Abstract: This article investigates the relationship between knowledge of competitors and the development of new products in the field of capital medical equipment. To identify the criteria for measuring competitors' knowledge and for developing new capital medical equipment products, marketing experts were interviewed, and a researcher-made questionnaire was then compiled and distributed to the statistical sample; responses were collected from 100 selected members of the statistical population. To analyze the gathered data, the structural equation modeling (SEM) method was used in the SMART PLS 2 software to estimate the model, and the K-means approach was then used to cluster the capital medical equipment market based on knowledge of actual and potential competitors. The results show that knowledge of potential and actual competitors has a positive and significant effect on the development of new products in the capital medical equipment market. In terms of knowledge of actual competitors, the “MRI”, “Ultrasound” and “SPECT” markets fall in the low knowledge cluster; the “Pet MRI”, “CT Scan”, “Mammography”, “Radiography, Fluoroscopy and CRM”, “Pet CT”, “SPECT CT” and “Gamma Camera” markets fall in the medium knowledge cluster; and the “Angiography” and “CBCT” markets fall in the high knowledge cluster. In terms of knowledge of potential competitors, the “angiography”, “mammography”, “SPECT” and “SPECT CT” markets fall in the low knowledge cluster; the “CT scan”, “radiography, fluoroscopy and CRM”, “pet CT” and “CBCT” markets fall in the medium knowledge cluster; and the “MRI”, “pet MRI”, “ultrasound” and “gamma camera” markets fall in the high knowledge cluster.
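The abstract above groups markets into low, medium and high knowledge clusters with K-means but gives no scores or settings. As a rough sketch of how such a three-way split can arise, the Python below runs a one-dimensional K-means with a deterministic start. The function name `kmeans_1d`, the initialization rule, and the toy scores are assumptions for illustration; they are not the study's questionnaire data or its software configuration.

```python
def kmeans_1d(values, k=3, iterations=100):
    """Cluster scalar scores into k groups (k >= 2); return (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted values.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Recompute each centroid as the mean of its members; keep it if the cluster is empty.
        updated = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    labels = [min(range(k), key=lambda i: abs(v - centroids[i])) for v in values]
    return centroids, labels


# Hypothetical knowledge scores for eight markets (invented for illustration).
scores = [1.0, 1.2, 0.9, 3.0, 3.2, 2.9, 5.0, 5.1]
centroids, labels = kmeans_1d(scores, k=3)
names = ["low", "medium", "high"]
print([names[label] for label in labels])
```

Because the initial centroids are taken from the sorted scores, cluster 0 ends up with the lowest scores and cluster 2 with the highest here, which is what lets the labels map directly onto low, medium and high.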
Abstract: This paper describes web data mining technology in detail and analyzes the relationships among the data on a tourism e-commerce web site (including the server log, the tourism commodity database, the user database, and the shopping cart) to obtain information about users' preferences for tourism commodities. Based on these models, the paper presents recommendation strategies for the site's registered users, gives formulas for calculating the recommendation value of a given item for the current user together with the corresponding recommendation algorithm, and shows how the system produces a recommendation for the user.
Funding: Supported by the National Basic Research Program of China (No. 2011CB505106); the National Natural Science Foundation of China (No. 30902020); the Foundation of National Department of Public Benefit Research of China (No. 200807007); the Creation Fund for Significant New Drugs of China (No. 2009ZX09502-018); the Foundation of International Science and Technology Cooperation of China (No. 2008DFA30610).
Abstract: Unstable angina (UA) is the most dangerous type of coronary heart disease (CHD) and causes increasing mortality and morbidity worldwide. Identifying biomarkers for UA at the level of proteomics and metabolomics is a better avenue for understanding its underlying mechanism, and feature-selection-based data mining is well suited to identifying such biomarkers. In this study, we carried out a clinical epidemiological survey to collect plasma from UA in-patients and controls. Proteomics and metabolomics data were obtained via two-dimensional difference gel electrophoresis and gas chromatography techniques. We present a novel computational strategy to select as few biomarkers as possible for UA in the two groups of data. First, a decision tree was used to select biomarkers for UA, and 3-fold cross validation was used to evaluate the computational performances of the three methods. Alternatively, we combined the independent t test, a classification-based data mining method, and the backward elimination technique to select as few protein and metabolite biomarkers as possible with the best classification performances. With this method, we selected 6 proteins and 5 metabolites for UA. The novel method presented here provides better insight into the pathology of a disease.
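The abstract above mentions backward elimination but does not give the stopping rule or the classifier used. The Python sketch below illustrates the general technique only: start from all candidate features and repeatedly drop the one whose removal hurts the score least, stopping as soon as any further removal would lower it. The function `backward_elimination`, the feature names, and the stand-in `toy_score` are assumptions for illustration; in a study like this one the score would typically be a cross-validated classification accuracy.

```python
def backward_elimination(features, score, min_features=1):
    """Greedily drop features while the score does not get worse.

    `score` is any callable mapping a list of features to a number (higher is better).
    """
    selected = list(features)
    best = score(selected)
    while len(selected) > min_features:
        # Score every subset obtained by removing exactly one remaining feature.
        candidates = [
            (score([f for f in selected if f != drop]), drop) for drop in selected
        ]
        cand_score, drop = max(candidates, key=lambda c: c[0])
        if cand_score < best:
            break  # every possible removal hurts, so stop
        selected.remove(drop)
        best = cand_score
    return selected, best


# Stand-in score: counts how many hypothetical "informative" markers are kept.
INFORMATIVE = {"protein_1", "metabolite_2"}


def toy_score(subset):
    return len(INFORMATIVE & set(subset))


all_features = ["protein_1", "protein_2", "metabolite_1", "metabolite_2"]
selected, best = backward_elimination(all_features, toy_score)
print(selected, best)
```

Accepting removals that leave the score unchanged is what pushes the result towards the smallest marker set, which matches the stated goal of selecting as few biomarkers as possible.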