Genome-wide association mapping studies(GWAS)based on Big Data are a potential approach to improve marker-assisted selection in plant breeding.The number of available phenotypic and genomic data sets in which medium-s...Genome-wide association mapping studies(GWAS)based on Big Data are a potential approach to improve marker-assisted selection in plant breeding.The number of available phenotypic and genomic data sets in which medium-sized populations of several hundred individuals have been studied is rapidly increasing.Combining these data and using them in GWAS could increase both the power of QTL discovery and the accuracy of estimation of underlying genetic effects,but is hindered by data heterogeneity and lack of interoperability.In this study,we used genomic and phenotypic data sets,focusing on Central European winter wheat populations evaluated for heading date.We explored strategies for integrating these data and subsequently the resulting potential for GWAS.Establishing interoperability between data sets was greatly aided by some overlapping genotypes and a linear relationship between the different phenotyping protocols,resulting in high quality integrated phenotypic data.In this context,genomic prediction proved to be a suitable tool to study relevance of interactions between genotypes and experimental series,which was low in our case.Contrary to expectations,fewer associations between markers and traits were found in the larger combined data than in the individual experimental series.However,the predictive power based on the marker-trait associations of the integrated data set was higher across data sets.Therefore,the results show that the integration of medium-sized to Big Data is an approach to increase the power to detect QTL in GWAS.The results encourage further efforts to standardize and share data in the plant breeding community.展开更多
Ocean temperature is an important physical variable in marine ecosystems,and ocean temperature prediction is an important research objective in ocean-related fields.Currently,one of the commonly used methods for ocean...Ocean temperature is an important physical variable in marine ecosystems,and ocean temperature prediction is an important research objective in ocean-related fields.Currently,one of the commonly used methods for ocean temperature prediction is based on data-driven,but research on this method is mostly limited to the sea surface,with few studies on the prediction of internal ocean temperature.Existing graph neural network-based methods usually use predefined graphs or learned static graphs,which cannot capture the dynamic associations among data.In this study,we propose a novel dynamic spatiotemporal graph neural network(DSTGN)to predict threedimensional ocean temperature(3D-OT),which combines static graph learning and dynamic graph learning to automatically mine two unknown dependencies between sequences based on the original 3D-OT data without prior knowledge.Temporal and spatial dependencies in the time series were then captured using temporal and graph convolutions.We also integrated dynamic graph learning,static graph learning,graph convolution,and temporal convolution into an end-to-end framework for 3D-OT prediction using time-series grid data.In this study,we conducted prediction experiments using high-resolution 3D-OT from the Copernicus global ocean physical reanalysis,with data covering the vertical variation of temperature from the sea surface to 1000 m below the sea surface.We compared five mainstream models that are commonly used for ocean temperature prediction,and the results showed that the method achieved the best prediction results at all prediction scales.展开更多
Joint probabilistic data association is an effective method for tracking multiple targets in clutter, but only the target kinematic information is used in measure-to-track association. If the kinematic likelihoods are...Joint probabilistic data association is an effective method for tracking multiple targets in clutter, but only the target kinematic information is used in measure-to-track association. If the kinematic likelihoods are similar for different closely spaced targets, there is ambiguity in using the kinematic information alone; the correct association probability will decrease in conventional joint probabilistic data association algorithm and track coalescence will occur easily. A modified algorithm of joint probabilistic data association with classification-aided is presented, which avoids track coalescence when tracking multiple neighboring targets. Firstly, an identification matrix is defined, which is used to simplify validation matrix to decrease computational complexity. Then, target class information is integrated into the data association process. Performance comparisons with and without the use of class information in JPDA are presented on multiple closely spaced maneuvering targets tracking problem. Simulation results quantify the benefits of classification-aided JPDA for improved multiple targets tracking, especially in the presence of association uncertainty in the kinematic measurement and target maneuvering. Simulation results indicate that the algorithm is valid.展开更多
Knowledge Discovery in Databases is gaining attention and raising new hopes for traditional Chinese medicine (TCM) researchers. It is a useful tool in understanding and deciphering TCM theories. Aiming for a better ...Knowledge Discovery in Databases is gaining attention and raising new hopes for traditional Chinese medicine (TCM) researchers. It is a useful tool in understanding and deciphering TCM theories. Aiming for a better understanding of Chinese herbal property theory (CHPT), this paper performed an improved association rule learning to analyze semistructured text in the book entitled Shennong's Classic of Materia Medica. The text was firstly annotated and transformed to well-structured multidimensional data. Subsequently, an Apriori algorithm was employed for producing association rules after the sensitivity analysis of parameters. From the confirmed 120 resulting rules that described the intrinsic relationships between herbal property (qi, flavor and their combinations) and herbal efficacy, two novel fundamental principles underlying CHPT were acquired and further elucidated: (1) the many-to-one mapping of herbal efficacy to herbal property; (2) the nonrandom overlap between the related efficacy of qi and flavor. This work provided an innovative knowledge about CHPT, which would be helpful for its modern research.展开更多
A specialized Hungarian algorithm was developed here for the maximum likelihood data association problem with two implementation versions due to presence of false alarms and missed detections. The maximum likelihood d...A specialized Hungarian algorithm was developed here for the maximum likelihood data association problem with two implementation versions due to presence of false alarms and missed detections. The maximum likelihood data association problem is formulated as a bipartite weighted matching problem. Its duality and the optimality conditions are given. The Hungarian algorithm with its computational steps, data structure and computational complexity is presented. The two implementation versions, Hungarian forest (HF) algorithm and Hungarian tree (HT) algorithm, and their combination with the naYve auction initialization are discussed. The computational results show that HT algorithm is slightly faster than HF algorithm and they are both superior to the classic Munkres algorithm.展开更多
A novel data association algorithm is developed based on fuzzy geneticalgorithms (FGAs). The static part of data association uses one FGA to determine both the lists ofcomposite measurements and the solutions of m-bes...A novel data association algorithm is developed based on fuzzy geneticalgorithms (FGAs). The static part of data association uses one FGA to determine both the lists ofcomposite measurements and the solutions of m-best S-D assignment. In the dynamic part of dataassociation, the results of the m-best S-D assignment are then used in turn, with a Kalman filterstate estimator, in a multi-population FGA-based dynamic 2D assignment algorithm to estimate thestates of the moving targets over time. Such an assignment-based data association algorithm isdemonstrated on a simulated passive sensor track formation and maintenance problem. The simulationresults show its feasibility in multi-sensor multi-target tracking. Moreover, algorithm developmentand real-time problems are briefly discussed.展开更多
Aiming at three-passive-sensor location system, a generalized 3-dimension (3-D) assignment model is constructed based on property information, and a multi-target programming model is proposed based on direction-find...Aiming at three-passive-sensor location system, a generalized 3-dimension (3-D) assignment model is constructed based on property information, and a multi-target programming model is proposed based on direction-finding and property fusion information. The multi-target programming model is transformed into a single target programming problem to resolve, and its data association result is compared with the results which are solved by using one kind of information only. Simulation experiments show the effectiveness of the multi-target programming algorithm with higher data association accuracy and less calculation.展开更多
The conventional complete association rule set was replaced by the least association rule set in data warehouse association rule mining process. The least association rule set should comply with two requirements: 1) i...The conventional complete association rule set was replaced by the least association rule set in data warehouse association rule mining process. The least association rule set should comply with two requirements: 1) it should be the minimal and the simplest association rule set; 2) its predictive power should in no way be weaker than that of the complete association rule set so that the precision of the association rule set analysis can be guaranteed. By adopting the least association rule set, the pruning of weak rules can be effectively carried out so as to greatly reduce the number of frequent itemset, and therefore improve the mining efficiency. Finally, based on the classical Apriori algorithm, the upward closure property of weak rules is utilized to develop a corresponding efficient algorithm.展开更多
In most of the passive tracking systems, only the target kinematical information is used in the measurement-to-track association, which results in error tracking in a multitarget environment, where the targets are too...In most of the passive tracking systems, only the target kinematical information is used in the measurement-to-track association, which results in error tracking in a multitarget environment, where the targets are too close to each other. To enhance the tracking accuracy, the target signal classification information (TSCI) should be used to improve the data association. The TSCI is integrated in the data association process using the JPDA (joint probabilistic data association). The use of the TSCI in the data association can improve discrimination by yielding a purer track and preserving continuity. To verify the validity of the application of TSCI, two simulation experiments are done on an air target-tracing problem, that is, one using the TSCI and the other not using the TSCI. The final comparison shows that the use of the TSCI can effectively improve tracking accuracy.展开更多
Datamining plays a crucial role in extractingmeaningful knowledge fromlarge-scale data repositories,such as data warehouses and databases.Association rule mining,a fundamental process in data mining,involves discoveri...Datamining plays a crucial role in extractingmeaningful knowledge fromlarge-scale data repositories,such as data warehouses and databases.Association rule mining,a fundamental process in data mining,involves discovering correlations,patterns,and causal structures within datasets.In the healthcare domain,association rules offer valuable opportunities for building knowledge bases,enabling intelligent diagnoses,and extracting invaluable information rapidly.This paper presents a novel approach called the Machine Learning based Association Rule Mining and Classification for Healthcare Data Management System(MLARMC-HDMS).The MLARMC-HDMS technique integrates classification and association rule mining(ARM)processes.Initially,the chimp optimization algorithm-based feature selection(COAFS)technique is employed within MLARMC-HDMS to select relevant attributes.Inspired by the foraging behavior of chimpanzees,the COA algorithm mimics their search strategy for food.Subsequently,the classification process utilizes stochastic gradient descent with a multilayer perceptron(SGD-MLP)model,while the Apriori algorithm determines attribute relationships.We propose a COA-based feature selection approach for medical data classification using machine learning techniques.This approach involves selecting pertinent features from medical datasets through COA and training machine learning models using the reduced feature set.We evaluate the performance of our approach on various medical datasets employing diverse machine learning classifiers.Experimental results demonstrate that our proposed approach surpasses alternative feature selection methods,achieving higher accuracy and precision rates in medical data classification tasks.The study showcases the effectiveness and efficiency of the COA-based feature selection approach in identifying relevant features,thereby enhancing the diagnosis and treatment of various diseases.To provide further validation,we conduct detailed experiments on a benchmark medical dataset,revealing the superiority of the MLARMCHDMS model over other methods,with a maximum accuracy of 99.75%.Therefore,this research contributes to the advancement of feature selection techniques in medical data classification and highlights the potential for improving healthcare outcomes through accurate and efficient data analysis.The presented MLARMC-HDMS framework and COA-based feature selection approach offer valuable insights for researchers and practitioners working in the field of healthcare data mining and machine learning.展开更多
One of the leading cancers for both genders worldwide is lung cancer.The occurrence of lung cancer has fully augmented since the early 19th century.In this manuscript,we have discussed various data mining techniques t...One of the leading cancers for both genders worldwide is lung cancer.The occurrence of lung cancer has fully augmented since the early 19th century.In this manuscript,we have discussed various data mining techniques that have been employed for cancer diagnosis.Exposure to air pollution has been related to various adverse health effects.This work is subject to analysis of various air pollutants and associated health hazards and intends to evaluate the impact of air pollution caused by lung cancer.We have introduced data mining in lung cancer to air pollution,and our approach includes preprocessing,data mining,testing and evaluation,and knowledge discovery.Initially,we will eradicate the noise and irrelevant data,and following that,we will join the multiple informed sources into a common source.From that source,we will designate the information relevant to our investigation to be regained from that assortment.Following that,we will convert the designated data into a suitable mining process.The patterns are abstracted by utilizing a relational suggestion rule mining process.These patterns have revealed information,and this information is categorized with the help of an Auto Associative Neural Network classification method(AANN).The proposed method is compared with the existing method in various factors.In conclusion,the projected Auto associative neural network and relational suggestion rule mining methods accomplish a high accuracy status.展开更多
This paper is aimed to develop an algorithm for extracting association rules,called Context-Based Association Rule Mining algorithm(CARM),which can be regarded as an extension of the Context-Based Positive and Negativ...This paper is aimed to develop an algorithm for extracting association rules,called Context-Based Association Rule Mining algorithm(CARM),which can be regarded as an extension of the Context-Based Positive and Negative Association Rule Mining algorithm(CBPNARM).CBPNARM was developed to extract positive and negative association rules from Spatiotemporal(space-time)data only,while the proposed algorithm can be applied to both spatial and non-spatial data.The proposed algorithm is applied to the energy dataset to classify a country’s energy development by uncovering the enthralling interdependencies between the set of variables to get positive and negative associations.Many association rules related to sustainable energy development are extracted by the proposed algorithm that needs to be pruned by some pruning technique.The context,in this paper serves as a pruning measure to extract pertinent association rules from non-spatial data.Conditional Probability Increment Ratio(CPIR)is also added in the proposed algorithm that was not used in CBPNARM.The inclusion of the context variable and CPIR resulted in fewer rules and improved robustness and ease of use.Also,the extraction of a common negative frequent itemset in CARM is different from that of CBPNARM.The rules created by the proposed algorithm are more meaningful,significant,relevant and insightful.The accuracy of the proposed algorithm is compared with the Apriori,PNARM and CBPNARM algorithms.The results demonstrated enhanced accuracy,relevance and timeliness.展开更多
Data-mining techniques have been developed to turn data into useful task-oriented knowledge. Most algorithms for mining association rules identify relationships among transactions using binary values and find rules at...Data-mining techniques have been developed to turn data into useful task-oriented knowledge. Most algorithms for mining association rules identify relationships among transactions using binary values and find rules at a single-concept level. Extracting multilevel association rules in transaction databases is most commonly used in data mining. This paper proposes a multilevel fuzzy association rule mining model for extraction of implicit knowledge which stored as quantitative values in transactions. For this reason it uses different support value at each level as well as different membership function for each item. By integrating fuzzy-set concepts, data-mining technologies and multiple-level taxonomy, our method finds fuzzy association rules from transaction data sets. This approach adopts a top-down progressively deepening approach to derive large itemsets and also incorporates fuzzy boundaries instead of sharp boundary intervals. Comparing our method with previous ones in simulation shows that the proposed method maintains higher precision, the mined rules are closer to reality, and it gives ability to mine association rules at different levels based on the user’s tendency as well.展开更多
The issue of privacy protection for mobile social networks is a frontier topic in the field of social network applications.The existing researches on user privacy protection in mobile social network mainly focus on pr...The issue of privacy protection for mobile social networks is a frontier topic in the field of social network applications.The existing researches on user privacy protection in mobile social network mainly focus on privacy preserving data publishing and access control.There is little research on the association of user privacy information,so it is not easy to design personalized privacy protection strategy,but also increase the complexity of user privacy settings.Therefore,this paper concentrates on the association of user privacy information taking big data analysis tools,so as to provide data support for personalized privacy protection strategy design.展开更多
Background:The purpose of this study was to identify the characteristics and principles of acupoints applied for treating chronic hepatitis B infection.Methods:The published clinical studies on acupuncture for the tre...Background:The purpose of this study was to identify the characteristics and principles of acupoints applied for treating chronic hepatitis B infection.Methods:The published clinical studies on acupuncture for the treatment of chronic hepatitis B infection were gathered from various databases,including SinoMed,Chongqing Vip,China National Knowledge Infrastructure,Wanfang,the Cochrane Library,PubMed,Web of Science and Embase.Excel 2019 was utilized to establish a database of acupuncture prescriptions and conduct statistics on the frequency,meridian application,distribution and specific points,as well as SPSS Modeler 18.0 and SPSS Statistics 26.0 to conduct association rule analysis and cluster analysis to investigate the characteristics and patterns of acupoint selection.Results:A total of 42 studies containing 47 acupoints were included,with a total frequency of 286 acupoints.The top five acupoints used were Zusanli(ST36),Ganshu(BL18),Yanglingquan(GB34),Sanyinjiao(SP6)and Taichong(LR3),and the most commonly used meridians was the Bladder Meridian of Foot-Taiyang.The majority of acupuncture points are located in the lower limbs,back,and lumbar regions,with a significant percentage of them being Five-Shu acupoints.The strongest acupoint combination identified was Zusanli(ST36)–Ganshu(BL18),in addition to which 13 association rules and 4 valid clusters were obtained.Conclusion:Zusanli(ST36)–Ganshu(BL18)could be considered a relatively reasonable prescription for treating chronic hepatitis B infection in clinical practice.However,further high-quality studies are needed.展开更多
Based upon a multisensor sequential processing filter, the target states in a3D Cartesian system are projected into the measurement space of each sensor to extend thejoint probabilistic data association (JPDA) algorit...Based upon a multisensor sequential processing filter, the target states in a3D Cartesian system are projected into the measurement space of each sensor to extend thejoint probabilistic data association (JPDA) algorithm into the multisensor tracking systemsconsisting of heterogeneous sensors for the data association.展开更多
Due to the advantages of ant colony optimization (ACO) in solving complex problems, a new data association algorithm based on ACO in a cluttered environment called DACDA is proposed. In the proposed method, the conc...Due to the advantages of ant colony optimization (ACO) in solving complex problems, a new data association algorithm based on ACO in a cluttered environment called DACDA is proposed. In the proposed method, the concept for tour and the length of tour are redefined. Additionally, the directional information is incorporated into the proposed method because it is one of the most important factors that affects the performance of data association. Kalman filter is employed to estimate target states. Computer simulation results show that the proposed method could carry out data association in an acceptable CPU time, and the correct data association rate is higher than that obtained by the data association (DA) algorithm not combined with directional information.展开更多
Indoor multi-tracking is more challenging compared with outdoor tasks due to frequent occlusion, view-truncation, severe scale change and pose variation, which may bring considerable unreliability and ambiguity to tar...Indoor multi-tracking is more challenging compared with outdoor tasks due to frequent occlusion, view-truncation, severe scale change and pose variation, which may bring considerable unreliability and ambiguity to target representation and data association. So discriminative and reliable target representation is vital for accurate data association in multi-tracking. Pervious works always combine bunch of features to increase the discriminative power, but this is prone to error accumulation and unnecessary computational cost, which may increase ambiguity on the contrary. Moreover, reliability of a same feature in different scenes may vary a lot, especially for currently widespread network cameras, which are settled in various and complex indoor scenes, previous fixed feature selection schemes cannot meet general requirements. To properly handle these problems, first, we propose a scene-adaptive hierarchical data association scheme, which adaptively selects features with higher reliability on target representation in the applied scene, and gradually combines features to the minimum requirement of discriminating ambiguous targets; second, a novel depth-invariant part-based appearance model using RGB-D data is proposed which makes the appearance model robust to scale change, partial occlusion and view-truncation. The introduce of RGB-D data increases the diversity of features, which provides more types of features for feature selection in data association and enhances the final multi-tracking performance. We validate our method from several aspects including scene-adaptive feature selection scheme, hierarchical data association scheme and RGB-D based appearance modeling scheme in various indoor scenes, which demonstrates its effectiveness and efficiency on improving multi-tracking performances in various indoor scenes.展开更多
To bridge the performance gap between original probability data association (PDA) algorithm and the optimum maximum a posterior (MAP) algorithm for multi-input multi-output (MIMO) detection, a grouped PDA (GP-...To bridge the performance gap between original probability data association (PDA) algorithm and the optimum maximum a posterior (MAP) algorithm for multi-input multi-output (MIMO) detection, a grouped PDA (GP-PDA) detection algorithm is proposed. The proposed GP-PDA method divides all the transmit antennas into groups, and then updates the symbol probabilities group by group using PDA computations. In each group, joint a posterior probability (APP) is computed to obtain the APP of a single symbol in this group, like the MAP algorithm. Such new algorithm combines the characters of MAP and PDA. MAP and original PDA algorithm can be regarded as a special case of the proposed GP-PDA. Simulations show that the proposed GP-PDA provides a performance and complexity trade, off between original PDA and MAP algorithm.展开更多
Hotspots (active fires) indicate spatial distribution of fires. A study on determining influence factors for hotspot occurrence is essential so that fire events can be predicted based on characteristics of a certain a...Hotspots (active fires) indicate spatial distribution of fires. A study on determining influence factors for hotspot occurrence is essential so that fire events can be predicted based on characteristics of a certain area. This study discovers the possible influence factors on the occurrence of fire events using the association rule algorithm namely Apriori in the study area of Rokan Hilir Riau Province Indonesia. The Apriori algorithm was applied on a forest fire dataset which containeddata on physical environment (land cover, river, road and city center), socio-economic (income source, population, and number of school), weather (precipitation, wind speed, and screen temperature), and peatlands. The experiment results revealed 324 multidimensional association rules indicating relationships between hotspots occurrence and other factors.The association among hotspots occurrence with other geographical objects was discovered for the minimum support of 10% and the minimum confidence of 80%. The results show that strong relations between hotspots occurrence and influence factors are found for the support about 12.42%, the confidence of 1, and the lift of 2.26. These factors are precipitation greater than or equal to 3 mm/day, wind speed in [1m/s, 2m/s), non peatland area, screen temperature in [297K, 298K), the number of school in 1 km2 less than or equal to 0.1, and the distance of each hotspot to the nearest road less than or equal to 2.5 km.展开更多
基金funding within the Wheat BigData Project(German Federal Ministry of Food and Agriculture,FKZ2818408B18)。
文摘Genome-wide association mapping studies(GWAS)based on Big Data are a potential approach to improve marker-assisted selection in plant breeding.The number of available phenotypic and genomic data sets in which medium-sized populations of several hundred individuals have been studied is rapidly increasing.Combining these data and using them in GWAS could increase both the power of QTL discovery and the accuracy of estimation of underlying genetic effects,but is hindered by data heterogeneity and lack of interoperability.In this study,we used genomic and phenotypic data sets,focusing on Central European winter wheat populations evaluated for heading date.We explored strategies for integrating these data and subsequently the resulting potential for GWAS.Establishing interoperability between data sets was greatly aided by some overlapping genotypes and a linear relationship between the different phenotyping protocols,resulting in high quality integrated phenotypic data.In this context,genomic prediction proved to be a suitable tool to study relevance of interactions between genotypes and experimental series,which was low in our case.Contrary to expectations,fewer associations between markers and traits were found in the larger combined data than in the individual experimental series.However,the predictive power based on the marker-trait associations of the integrated data set was higher across data sets.Therefore,the results show that the integration of medium-sized to Big Data is an approach to increase the power to detect QTL in GWAS.The results encourage further efforts to standardize and share data in the plant breeding community.
基金The National Key R&D Program of China under contract No.2021YFC3101603.
文摘Ocean temperature is an important physical variable in marine ecosystems,and ocean temperature prediction is an important research objective in ocean-related fields.Currently,one of the commonly used methods for ocean temperature prediction is based on data-driven,but research on this method is mostly limited to the sea surface,with few studies on the prediction of internal ocean temperature.Existing graph neural network-based methods usually use predefined graphs or learned static graphs,which cannot capture the dynamic associations among data.In this study,we propose a novel dynamic spatiotemporal graph neural network(DSTGN)to predict threedimensional ocean temperature(3D-OT),which combines static graph learning and dynamic graph learning to automatically mine two unknown dependencies between sequences based on the original 3D-OT data without prior knowledge.Temporal and spatial dependencies in the time series were then captured using temporal and graph convolutions.We also integrated dynamic graph learning,static graph learning,graph convolution,and temporal convolution into an end-to-end framework for 3D-OT prediction using time-series grid data.In this study,we conducted prediction experiments using high-resolution 3D-OT from the Copernicus global ocean physical reanalysis,with data covering the vertical variation of temperature from the sea surface to 1000 m below the sea surface.We compared five mainstream models that are commonly used for ocean temperature prediction,and the results showed that the method achieved the best prediction results at all prediction scales.
基金Defense Advanced Research Project "the Techniques of Information Integrated Processing and Fusion" in the Eleventh Five-Year Plan (513060302).
文摘Joint probabilistic data association is an effective method for tracking multiple targets in clutter, but only the target kinematic information is used in measure-to-track association. If the kinematic likelihoods are similar for different closely spaced targets, there is ambiguity in using the kinematic information alone; the correct association probability will decrease in conventional joint probabilistic data association algorithm and track coalescence will occur easily. A modified algorithm of joint probabilistic data association with classification-aided is presented, which avoids track coalescence when tracking multiple neighboring targets. Firstly, an identification matrix is defined, which is used to simplify validation matrix to decrease computational complexity. Then, target class information is integrated into the data association process. Performance comparisons with and without the use of class information in JPDA are presented on multiple closely spaced maneuvering targets tracking problem. Simulation results quantify the benefits of classification-aided JPDA for improved multiple targets tracking, especially in the presence of association uncertainty in the kinematic measurement and target maneuvering. Simulation results indicate that the algorithm is valid.
文摘Knowledge Discovery in Databases is gaining attention and raising new hopes for traditional Chinese medicine (TCM) researchers. It is a useful tool in understanding and deciphering TCM theories. Aiming for a better understanding of Chinese herbal property theory (CHPT), this paper performed an improved association rule learning to analyze semistructured text in the book entitled Shennong's Classic of Materia Medica. The text was firstly annotated and transformed to well-structured multidimensional data. Subsequently, an Apriori algorithm was employed for producing association rules after the sensitivity analysis of parameters. From the confirmed 120 resulting rules that described the intrinsic relationships between herbal property (qi, flavor and their combinations) and herbal efficacy, two novel fundamental principles underlying CHPT were acquired and further elucidated: (1) the many-to-one mapping of herbal efficacy to herbal property; (2) the nonrandom overlap between the related efficacy of qi and flavor. This work provided an innovative knowledge about CHPT, which would be helpful for its modern research.
基金This project was supported by the National Natural Science Foundation of China (60272024).
文摘A specialized Hungarian algorithm was developed here for the maximum likelihood data association problem with two implementation versions due to presence of false alarms and missed detections. The maximum likelihood data association problem is formulated as a bipartite weighted matching problem. Its duality and the optimality conditions are given. The Hungarian algorithm with its computational steps, data structure and computational complexity is presented. The two implementation versions, Hungarian forest (HF) algorithm and Hungarian tree (HT) algorithm, and their combination with the naYve auction initialization are discussed. The computational results show that HT algorithm is slightly faster than HF algorithm and they are both superior to the classic Munkres algorithm.
文摘A novel data association algorithm is developed based on fuzzy geneticalgorithms (FGAs). The static part of data association uses one FGA to determine both the lists ofcomposite measurements and the solutions of m-best S-D assignment. In the dynamic part of dataassociation, the results of the m-best S-D assignment are then used in turn, with a Kalman filterstate estimator, in a multi-population FGA-based dynamic 2D assignment algorithm to estimate thestates of the moving targets over time. Such an assignment-based data association algorithm isdemonstrated on a simulated passive sensor track formation and maintenance problem. The simulationresults show its feasibility in multi-sensor multi-target tracking. Moreover, algorithm developmentand real-time problems are briefly discussed.
基金This project was supported by the National Natural Science Foundation of China (60172033) the Excellent Ph.D.PaperAuthor Foundation of China (200036 ,200237) .
文摘Aiming at three-passive-sensor location system, a generalized 3-dimension (3-D) assignment model is constructed based on property information, and a multi-target programming model is proposed based on direction-finding and property fusion information. The multi-target programming model is transformed into a single target programming problem to resolve, and its data association result is compared with the results which are solved by using one kind of information only. Simulation experiments show the effectiveness of the multi-target programming algorithm with higher data association accuracy and less calculation.
文摘The conventional complete association rule set was replaced by the least association rule set in data warehouse association rule mining process. The least association rule set should comply with two requirements: 1) it should be the minimal and the simplest association rule set; 2) its predictive power should in no way be weaker than that of the complete association rule set so that the precision of the association rule set analysis can be guaranteed. By adopting the least association rule set, the pruning of weak rules can be effectively carried out so as to greatly reduce the number of frequent itemset, and therefore improve the mining efficiency. Finally, based on the classical Apriori algorithm, the upward closure property of weak rules is utilized to develop a corresponding efficient algorithm.
基金the Youth Science and Technology Foundection of University of Electronic Science andTechnology of China (JX0622).
文摘In most of the passive tracking systems, only the target kinematical information is used in the measurement-to-track association, which results in error tracking in a multitarget environment, where the targets are too close to each other. To enhance the tracking accuracy, the target signal classification information (TSCI) should be used to improve the data association. The TSCI is integrated in the data association process using the JPDA (joint probabilistic data association). The use of the TSCI in the data association can improve discrimination by yielding a purer track and preserving continuity. To verify the validity of the application of TSCI, two simulation experiments are done on an air target-tracing problem, that is, one using the TSCI and the other not using the TSCI. The final comparison shows that the use of the TSCI can effectively improve tracking accuracy.
基金Deputyship for Research&Innovation,Ministry of Education in Saudi Arabia for funding this research work through the Project Number RI-44-0444.
文摘Datamining plays a crucial role in extractingmeaningful knowledge fromlarge-scale data repositories,such as data warehouses and databases.Association rule mining,a fundamental process in data mining,involves discovering correlations,patterns,and causal structures within datasets.In the healthcare domain,association rules offer valuable opportunities for building knowledge bases,enabling intelligent diagnoses,and extracting invaluable information rapidly.This paper presents a novel approach called the Machine Learning based Association Rule Mining and Classification for Healthcare Data Management System(MLARMC-HDMS).The MLARMC-HDMS technique integrates classification and association rule mining(ARM)processes.Initially,the chimp optimization algorithm-based feature selection(COAFS)technique is employed within MLARMC-HDMS to select relevant attributes.Inspired by the foraging behavior of chimpanzees,the COA algorithm mimics their search strategy for food.Subsequently,the classification process utilizes stochastic gradient descent with a multilayer perceptron(SGD-MLP)model,while the Apriori algorithm determines attribute relationships.We propose a COA-based feature selection approach for medical data classification using machine learning techniques.This approach involves selecting pertinent features from medical datasets through COA and training machine learning models using the reduced feature set.We evaluate the performance of our approach on various medical datasets employing diverse machine learning classifiers.Experimental results demonstrate that our proposed approach surpasses alternative feature selection methods,achieving higher accuracy and precision rates in medical data classification tasks.The study showcases the effectiveness and efficiency of the COA-based feature selection approach in identifying relevant features,thereby enhancing the diagnosis and treatment of various diseases.To provide further validation,we conduct detailed experiments on a benchmark medical dataset,revealing the superiority of the MLARMCHDMS model over other methods,with a maximum accuracy of 99.75%.Therefore,this research contributes to the advancement of feature selection techniques in medical data classification and highlights the potential for improving healthcare outcomes through accurate and efficient data analysis.The presented MLARMC-HDMS framework and COA-based feature selection approach offer valuable insights for researchers and practitioners working in the field of healthcare data mining and machine learning.
基金support from Taif University Researchers supporting Project Number(TURSP-2020/215),Taif University,Taif,Saudi Arabia.
文摘One of the leading cancers for both genders worldwide is lung cancer.The occurrence of lung cancer has fully augmented since the early 19th century.In this manuscript,we have discussed various data mining techniques that have been employed for cancer diagnosis.Exposure to air pollution has been related to various adverse health effects.This work is subject to analysis of various air pollutants and associated health hazards and intends to evaluate the impact of air pollution caused by lung cancer.We have introduced data mining in lung cancer to air pollution,and our approach includes preprocessing,data mining,testing and evaluation,and knowledge discovery.Initially,we will eradicate the noise and irrelevant data,and following that,we will join the multiple informed sources into a common source.From that source,we will designate the information relevant to our investigation to be regained from that assortment.Following that,we will convert the designated data into a suitable mining process.The patterns are abstracted by utilizing a relational suggestion rule mining process.These patterns have revealed information,and this information is categorized with the help of an Auto Associative Neural Network classification method(AANN).The proposed method is compared with the existing method in various factors.In conclusion,the projected Auto associative neural network and relational suggestion rule mining methods accomplish a high accuracy status.
文摘This paper is aimed to develop an algorithm for extracting association rules,called Context-Based Association Rule Mining algorithm(CARM),which can be regarded as an extension of the Context-Based Positive and Negative Association Rule Mining algorithm(CBPNARM).CBPNARM was developed to extract positive and negative association rules from Spatiotemporal(space-time)data only,while the proposed algorithm can be applied to both spatial and non-spatial data.The proposed algorithm is applied to the energy dataset to classify a country’s energy development by uncovering the enthralling interdependencies between the set of variables to get positive and negative associations.Many association rules related to sustainable energy development are extracted by the proposed algorithm that needs to be pruned by some pruning technique.The context,in this paper serves as a pruning measure to extract pertinent association rules from non-spatial data.Conditional Probability Increment Ratio(CPIR)is also added in the proposed algorithm that was not used in CBPNARM.The inclusion of the context variable and CPIR resulted in fewer rules and improved robustness and ease of use.Also,the extraction of a common negative frequent itemset in CARM is different from that of CBPNARM.The rules created by the proposed algorithm are more meaningful,significant,relevant and insightful.The accuracy of the proposed algorithm is compared with the Apriori,PNARM and CBPNARM algorithms.The results demonstrated enhanced accuracy,relevance and timeliness.
文摘Data-mining techniques have been developed to turn data into useful task-oriented knowledge. Most algorithms for mining association rules identify relationships among transactions using binary values and find rules at a single-concept level. Extracting multilevel association rules in transaction databases is most commonly used in data mining. This paper proposes a multilevel fuzzy association rule mining model for extraction of implicit knowledge which stored as quantitative values in transactions. For this reason it uses different support value at each level as well as different membership function for each item. By integrating fuzzy-set concepts, data-mining technologies and multiple-level taxonomy, our method finds fuzzy association rules from transaction data sets. This approach adopts a top-down progressively deepening approach to derive large itemsets and also incorporates fuzzy boundaries instead of sharp boundary intervals. Comparing our method with previous ones in simulation shows that the proposed method maintains higher precision, the mined rules are closer to reality, and it gives ability to mine association rules at different levels based on the user’s tendency as well.
基金We thank the anonymous reviewers and editors for their very constructive comments.the National Social Science Foundation Project of China under Grant 16BTQ085.
文摘The issue of privacy protection for mobile social networks is a frontier topic in the field of social network applications.The existing researches on user privacy protection in mobile social network mainly focus on privacy preserving data publishing and access control.There is little research on the association of user privacy information,so it is not easy to design personalized privacy protection strategy,but also increase the complexity of user privacy settings.Therefore,this paper concentrates on the association of user privacy information taking big data analysis tools,so as to provide data support for personalized privacy protection strategy design.
基金supported by Chongqing Municipal Health and Family Planning Commission and Chongqing Municipal Science and Technology Commission Jointly Funded Key Research Projects in Traditional Chinese Medicine(ZY201801007).
文摘Background:The purpose of this study was to identify the characteristics and principles of acupoints applied for treating chronic hepatitis B infection.Methods:The published clinical studies on acupuncture for the treatment of chronic hepatitis B infection were gathered from various databases,including SinoMed,Chongqing Vip,China National Knowledge Infrastructure,Wanfang,the Cochrane Library,PubMed,Web of Science and Embase.Excel 2019 was utilized to establish a database of acupuncture prescriptions and conduct statistics on the frequency,meridian application,distribution and specific points,as well as SPSS Modeler 18.0 and SPSS Statistics 26.0 to conduct association rule analysis and cluster analysis to investigate the characteristics and patterns of acupoint selection.Results:A total of 42 studies containing 47 acupoints were included,with a total frequency of 286 acupoints.The top five acupoints used were Zusanli(ST36),Ganshu(BL18),Yanglingquan(GB34),Sanyinjiao(SP6)and Taichong(LR3),and the most commonly used meridians was the Bladder Meridian of Foot-Taiyang.The majority of acupuncture points are located in the lower limbs,back,and lumbar regions,with a significant percentage of them being Five-Shu acupoints.The strongest acupoint combination identified was Zusanli(ST36)–Ganshu(BL18),in addition to which 13 association rules and 4 valid clusters were obtained.Conclusion:Zusanli(ST36)–Ganshu(BL18)could be considered a relatively reasonable prescription for treating chronic hepatitis B infection in clinical practice.However,further high-quality studies are needed.
文摘Based upon a multisensor sequential processing filter, the target states in a3D Cartesian system are projected into the measurement space of each sensor to extend thejoint probabilistic data association (JPDA) algorithm into the multisensor tracking systemsconsisting of heterogeneous sensors for the data association.
文摘Due to the advantages of ant colony optimization (ACO) in solving complex problems, a new data association algorithm based on ACO in a cluttered environment called DACDA is proposed. In the proposed method, the concept for tour and the length of tour are redefined. Additionally, the directional information is incorporated into the proposed method because it is one of the most important factors that affects the performance of data association. Kalman filter is employed to estimate target states. Computer simulation results show that the proposed method could carry out data association in an acceptable CPU time, and the correct data association rate is higher than that obtained by the data association (DA) algorithm not combined with directional information.
基金This work is supported by National Natural Science Foundation of China (NSFC, No. 61340046), National High Technology Research and Development Program of China (863 Program, No. 2006AA04Z247), Scientific and Technical Innovation Commission of Shenzhen Municipality (JCYJ20130331144631730, JCYJ20130331144716089), Specialized Research Fund for the Doctoral Program of Higher Education (No. 20130001110011).
文摘Indoor multi-tracking is more challenging compared with outdoor tasks due to frequent occlusion, view-truncation, severe scale change and pose variation, which may bring considerable unreliability and ambiguity to target representation and data association. So discriminative and reliable target representation is vital for accurate data association in multi-tracking. Pervious works always combine bunch of features to increase the discriminative power, but this is prone to error accumulation and unnecessary computational cost, which may increase ambiguity on the contrary. Moreover, reliability of a same feature in different scenes may vary a lot, especially for currently widespread network cameras, which are settled in various and complex indoor scenes, previous fixed feature selection schemes cannot meet general requirements. To properly handle these problems, first, we propose a scene-adaptive hierarchical data association scheme, which adaptively selects features with higher reliability on target representation in the applied scene, and gradually combines features to the minimum requirement of discriminating ambiguous targets; second, a novel depth-invariant part-based appearance model using RGB-D data is proposed which makes the appearance model robust to scale change, partial occlusion and view-truncation. The introduce of RGB-D data increases the diversity of features, which provides more types of features for feature selection in data association and enhances the final multi-tracking performance. We validate our method from several aspects including scene-adaptive feature selection scheme, hierarchical data association scheme and RGB-D based appearance modeling scheme in various indoor scenes, which demonstrates its effectiveness and efficiency on improving multi-tracking performances in various indoor scenes.
基金Sponsored by the National Natural Science Foundation of China(60572120)
文摘To bridge the performance gap between original probability data association (PDA) algorithm and the optimum maximum a posterior (MAP) algorithm for multi-input multi-output (MIMO) detection, a grouped PDA (GP-PDA) detection algorithm is proposed. The proposed GP-PDA method divides all the transmit antennas into groups, and then updates the symbol probabilities group by group using PDA computations. In each group, joint a posterior probability (APP) is computed to obtain the APP of a single symbol in this group, like the MAP algorithm. Such new algorithm combines the characters of MAP and PDA. MAP and original PDA algorithm can be regarded as a special case of the proposed GP-PDA. Simulations show that the proposed GP-PDA provides a performance and complexity trade, off between original PDA and MAP algorithm.
文摘Hotspots (active fires) indicate spatial distribution of fires. A study on determining influence factors for hotspot occurrence is essential so that fire events can be predicted based on characteristics of a certain area. This study discovers the possible influence factors on the occurrence of fire events using the association rule algorithm namely Apriori in the study area of Rokan Hilir Riau Province Indonesia. The Apriori algorithm was applied on a forest fire dataset which containeddata on physical environment (land cover, river, road and city center), socio-economic (income source, population, and number of school), weather (precipitation, wind speed, and screen temperature), and peatlands. The experiment results revealed 324 multidimensional association rules indicating relationships between hotspots occurrence and other factors.The association among hotspots occurrence with other geographical objects was discovered for the minimum support of 10% and the minimum confidence of 80%. The results show that strong relations between hotspots occurrence and influence factors are found for the support about 12.42%, the confidence of 1, and the lift of 2.26. These factors are precipitation greater than or equal to 3 mm/day, wind speed in [1m/s, 2m/s), non peatland area, screen temperature in [297K, 298K), the number of school in 1 km2 less than or equal to 0.1, and the distance of each hotspot to the nearest road less than or equal to 2.5 km.