Funding: Supported by the National Natural Science Foundation of China (No. 50975193) and the Specialized Research Fund for the Doctoral Program of Higher Education of China (No. 20060056016).
Abstract: By analyzing the existing prefix-tree data structure, an improved pattern tree was introduced for processing new transactions. It first stored transactions in a lexicographically ordered tree and then restructured the tree by sorting each path in frequency-descending order. When the improved pattern tree was updated, there was no need to rescan the entire new database or to reconstruct a new tree for incremental updating. A test was performed on the synthetic dataset T10I4D100K, with 100 000 transactions and 870 items. Experimental results show that the smaller the minimum support threshold, the larger the speed advantage of the improved pattern tree over CanTree on all datasets. As the minimum support threshold increased from 2% to 3.5%, the runtime of the improved pattern tree decreased from 452.71 s to 186.26 s, while the runtime required by CanTree decreased from 1 367.03 s to 432.19 s. When the database was updated, the execution time of the improved pattern tree consisted of the construction of the original improved pattern trees and the reconstruction of the initial tree. The experimental results showed that the runtime was about 15% lower than that of CanTree. As the number of transactions increased, the runtime of the improved pattern tree was about 25% shorter than that of FP-tree. The improved pattern tree also required less memory than CanTree.
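The two phases described in this abstract (lexicographic storage, then frequency-descending restructuring of each path) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Node`, `insert`, `build_lexicographic`, `paths`, `restructure` and `size` are hypothetical, and ties in frequency are broken alphabetically as an assumption.

```python
from collections import Counter


class Node:
    """One prefix-tree node: an item label, a count, and child nodes."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def insert(root, items):
    """Insert one transaction along a path, sharing common prefixes."""
    node = root
    for item in items:
        node = node.children.setdefault(item, Node(item))
        node.count += 1


def build_lexicographic(transactions):
    """Phase 1: store each transaction with its items in lexicographic order."""
    root = Node()
    for t in transactions:
        insert(root, sorted(set(t)))
    return root


def paths(node, prefix=()):
    """Yield (path, n) where n transactions end exactly at that path."""
    for child in node.children.values():
        path = prefix + (child.item,)
        ending_here = child.count - sum(c.count for c in child.children.values())
        if ending_here > 0:
            yield path, ending_here
        yield from paths(child, path)


def restructure(root):
    """Phase 2: re-insert every stored path sorted by descending item frequency."""
    stored = list(paths(root))
    freq = Counter()
    for path, n in stored:
        for item in path:
            freq[item] += n
    new_root = Node()
    for path, n in stored:
        ordered = sorted(path, key=lambda i: (-freq[i], i))  # ties: alphabetical (assumption)
        for _ in range(n):
            insert(new_root, ordered)
    return new_root, freq


def size(node):
    """Number of nodes below `node` (the root itself is not counted)."""
    return sum(1 + size(c) for c in node.children.values())


# Toy database, not taken from the paper.
db = [["b", "a"], ["c", "b"], ["b"], ["a", "c", "b"]]
lex_root = build_lexicographic(db)
new_root, freq = restructure(lex_root)
print(size(lex_root), size(new_root))
```

On this toy database the restructured tree places the most frequent item `b` at the top, so all four transactions share one prefix and the tree needs one node fewer than the lexicographic one.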
Funding: Funded by the National Natural Science Foundation of China (31570627), the Hunan Forestry Science and Technology Project (XLK201740), the Hunan Science and Technology Innovation Platform and Talent Plan (2017TP1022), and the Hunan Science and Technology Plan Project (2015WK3017).
Abstract: It is important to quantify and analyze forest spatial patterns for studying biological characteristics, population interaction and the relationship between the population and the environment. In this study, the forest spatial structure unit was generated based on the Delaunay triangulation model (DTM), and the weights were generated using the comprehensive values of tree diameter at breast height, total height and crown width. The distance between neighbors determined by the DTM was weighted to transform the original coordinates of trees into logical coordinates. Then, a weighted spatial pattern (WSP) was developed. After weighting, neighboring trees were replaced at a ratio of 38.3%, and this involved 57.4% of the central trees. Correlation analysis showed that the uniform angle index of the WSP was significantly correlated with the tree size standard deviation under uniformity (r = 0.932) and randomness (r = 0.711). The DTM method considers not only the spatial distance between trees but also their non-spatial attributes. By changing the spatial topological relation between trees, this method further improves the measurement of forest spatial structure.
Abstract: The automatic generation of attack patterns based on attack trees is studied. An extended definition of the attack tree is proposed, and an algorithm for generating attack trees is presented. A method for automatically generating attack patterns from attack trees is then described and tested on concrete attack instances. The results show that the algorithm is effective and efficient. With this approach, the efficiency of generating attack patterns is improved and the attack trees can be reused.
Funding: The project was supported by the Science and Engineering Research Fund of Southwest Jiaotong University (1999 XM02) and the Startup F
Abstract: Operating equipment with cross-linked polyethylene (XLPE) insulation requires condition monitoring and diagnostics. Traditionally, insulation diagnostics are carried out by means of partial discharge (PD) detection. However, identifying a defect in this way, for example a void, an inclusion or treeing, does not indicate how dangerous it is with respect to breakdown of the full insulation gap and failure of the insulation construction. For this purpose, a 29 kV CN-CV cable sample is studied. The experiment investigates the dependencies between the PD characteristics in XLPE over time and the three-dimensional PD patterns of the corresponding treeing. The investigations were carried out by means of electrical measurement of the PD current and simultaneous optical recording of the treeing image. A needle-plane electrode arrangement is used. As a result, φ-q-n PD patterns can be obtained, which serve as the basis for identifying bush tree initiation and growth. Test results show that PD pattern recognition can be applied as a powerful tool for recognizing electrical tree initiation and growth. This can provide a good basis for on-line condition monitoring of high-voltage power cables.
Abstract: In the XML community, exact queries allow users to specify exactly what they want to check and/or retrieve in an XML document. When they are applied to a semi-structured document or to a document with an overly complex model, the lack or ignorance of the explicit document model (a DTD, i.e. Document Type Definition, a Schema, etc.) increases the risk of obtaining an empty result set when the query is too specific, or an overly large result set when it is too vague (e.g. it contains wildcards such as “*”). The reason is that in both cases users write queries according to the document model they have in mind; this can be very far from the one that can actually be extracted from the document. In contrast to exact queries, preference queries are more flexible and can be relaxed to expand the search space during their evaluation. Indeed, during evaluation, certain constraints (the preferences they contain) can be relaxed if necessary, precisely to avoid empty results; moreover, the returned answers can be filtered to retain only the best ones. This paper presents an algorithm for evaluating such queries, inspired by the TreeMatch algorithm proposed by Yao et al. for exact queries. In the proposed algorithm, the best answers are obtained by adapting the Skyline operator (defined in relational databases) to the context of documents (trees) in order to incrementally filter, within the set of partial solutions, those that satisfy the maximum number of preferential constraints. The only restriction imposed on documents is No-Self-Containment.
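The incremental Skyline filtering mentioned in this abstract can be illustrated with a generic Pareto-dominance sketch in Python. This is the standard relational Skyline idea, not the paper's tree-based adaptation: the functions `dominates`, `skyline_insert` and `skyline`, and the encoding of each partial answer as a tuple of 0/1 flags (one per preference, 1 meaning satisfied), are assumptions made for illustration.

```python
def dominates(a, b):
    """True if a is at least as good as b on every preference and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline_insert(current, candidate):
    """Incrementally add one (name, score) candidate to the current skyline."""
    _, score = candidate
    if any(dominates(s, score) for _, s in current):
        return current  # the candidate is dominated, so it is discarded
    # Drop every answer the candidate dominates, then keep the candidate.
    kept = [(n, s) for n, s in current if not dominates(score, s)]
    kept.append(candidate)
    return kept


def skyline(candidates):
    """Fold all candidates into a skyline, one at a time."""
    result = []
    for c in candidates:
        result = skyline_insert(result, c)
    return result


# Hypothetical partial answers: one flag per preference (1 = satisfied).
answers = [
    ("m1", (1, 0, 0)),
    ("m2", (1, 0, 1)),
    ("m3", (0, 1, 1)),
    ("m4", (0, 0, 1)),
]
best = skyline(answers)
print([name for name, _ in best])
```

Here `m1` and `m4` are dropped because `m2` satisfies everything they satisfy and more, while `m2` and `m3` are incomparable and both remain.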
Abstract: Introduction: Adult married women who experience an unplanned pregnancy often feel surprised and shocked. An unplanned pregnancy also affects the other members of the family system. Therefore, when married women face the choice between “birth” and “abortion”, they weigh many thoughts, decision criteria and decision patterns under various physical, psychological, mental and social influences. The purpose of this study was to investigate the criteria considered and the decision patterns involved when adult married women decide whether to terminate or continue an unplanned pregnancy. Methods: The study used Ethnographic Decision Tree Modeling [1] to build a model of the decision criteria and decision patterns involved when adult married women make a decision about their unplanned pregnancy. The research method has three stages. In the “Pilot Study”, two groups were interviewed, each consisting of 4 married adult women with unplanned pregnancies who had decided whether to terminate or continue the pregnancy, in order to identify the decision criteria that affect the choice between “birth” and “abortion”. In “Building of the Model”, those criteria were ranked by importance and the model was built with these two groups of women. In “Testing of the Model”, the criteria considered and the decision patterns involved when adult married women decide whether to terminate or continue an unplanned pregnancy were investigated. In total, the study interviewed 34 married adult women with 43 unplanned pregnancies.
Results: The study identified 12 decision criteria: whether the pregnancy was planned; stability of feelings for the married partner; views on life; the influence of the mother, the mother-in-law and the husband's emphasis on male children; the meaning of children; the financial burden; career and time planning; past pregnancy experiences; the current status of raising children; the health of the parents and the fetus; the effect of the living environment; and social and cultural views. In addition, four decision patterns were found among married adult women with unplanned pregnancies: “actively choosing abortion”; “giving birth as long as the pregnancy occurred naturally”; “being hesitant and changeable”; and “being forced by important others”. Conclusion: By building the decision tree model, we found several decision criteria and patterns, as well as possible courses of action, which offer insight into adult married women's decision-making and inner struggles about unplanned pregnancy.
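Abstract: Mining maximum frequent itemsets is a key problem in many data mining applications. Most previous studies adopt an Apriori-like candidate itemset generation-and-test approach. However, generating candidate itemsets is costly, especially when there are many strong patterns and/or long patterns. This paper proposes DMFIA (discover maximum frequent itemsets algorithm), a fast algorithm based on the frequent pattern tree (FP-tree) for mining maximum frequent itemsets, together with its updating algorithm UMFIA (update maximum frequent itemsets algorithm). UMFIA makes full use of previous mining results to reduce the cost of finding the new maximum frequent itemsets in an updated database.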
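To make the target of such algorithms concrete, the following Python sketch enumerates maximum (often called maximal) frequent itemsets on a toy database by brute force. It is deliberately the Apriori-style generate-and-test approach that the abstract describes as costly, not DMFIA or UMFIA; the function names and the toy data are assumptions.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_count):
    """Level-wise generate-and-test over all item combinations."""
    items = sorted(set().union(*transactions))
    found = []
    for k in range(1, len(items) + 1):
        level = [frozenset(c) for c in combinations(items, k)
                 if support(frozenset(c), transactions) >= min_count]
        if not level:
            break  # no frequent k-itemset, so no larger itemset can be frequent
        found.extend(level)
    return found


def maximal(itemsets):
    """Keep only itemsets that have no frequent proper superset."""
    return [s for s in itemsets if not any(s < other for other in itemsets)]


# Toy database, not taken from the paper.
db = [frozenset("abc"), frozenset("abc"), frozenset("ab"),
      frozenset("cd"), frozenset("d")]
freq_sets = frequent_itemsets(db, min_count=2)
max_sets = sorted(sorted(s) for s in maximal(freq_sets))
print(max_sets)
```

With a minimum support count of 2, eight itemsets are frequent but only `{a, b, c}` and `{d}` are maximum frequent itemsets, which shows why mining them directly yields a much more compact result.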
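Abstract: Selective ensemble (ensemble pruning) selects only a subset of the base classifiers to take part in the ensemble, thereby improving the generalization ability of the ensemble classifier and reducing prediction cost. However, existing selective ensemble algorithms are generally time-consuming. By applying data mining techniques to selective ensemble, a fast selective ensemble algorithm based on the FP-Tree (frequent pattern tree) is proposed: CPM-EP (coverage based pattern mining for ensemble pruning). The algorithm organizes the classification results of the base classifiers on the validation sample set into a transaction database, so that the selective ensemble problem can be transformed into a problem of processing a transaction dataset. For each possible ensemble size, CPM-EP first obtains a reduced transaction database and builds an FP-Tree to store its contents; it then derives an ensemble classifier of the corresponding size from this FP-Tree. Among all the ensemble classifiers obtained, the one with the highest prediction accuracy on the validation sample set is the output of the algorithm. Experimental results show that CPM-EP achieves superior generalization ability at very low computational cost: its classifier selection time is about 1/19 that of GASEN and 1/8 that of Forward-Selection, its generalization ability is significantly better than that of the other methods compared, and the resulting ensemble classifiers contain fewer base classifiers.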
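The idea of organizing classification results into a transaction database can be sketched in Python as follows. The abstract does not specify the encoding, so this sketch assumes one transaction per validation sample whose items are the base classifiers that labelled it correctly; it also assumes binary labels and majority voting. The functions `to_transactions` and `vote_accuracy` and all data are hypothetical, and the FP-Tree construction and pattern mining steps of CPM-EP are not reproduced.

```python
def to_transactions(predictions, labels):
    """One transaction per validation sample: the classifiers that got it right.

    predictions[name] is a list of predicted labels, one per validation sample.
    """
    return [frozenset(name for name, preds in predictions.items() if preds[i] == y)
            for i, y in enumerate(labels)]


def vote_accuracy(transactions, ensemble):
    """Fraction of samples on which more than half of the ensemble is correct.

    For binary labels this equals the majority-vote accuracy of the ensemble.
    """
    members = set(ensemble)
    hits = sum(1 for t in transactions if 2 * len(t & members) > len(members))
    return hits / len(transactions)


# Hypothetical validation labels and base-classifier predictions.
labels = [1, 0, 1, 1]
predictions = {
    "c1": [1, 0, 0, 1],
    "c2": [1, 1, 1, 1],
    "c3": [0, 0, 1, 0],
}
txs = to_transactions(predictions, labels)
print(vote_accuracy(txs, ["c1", "c2", "c3"]), vote_accuracy(txs, ["c1"]))
```

Once the results are in this form, comparing candidate ensembles needs only set operations on the transactions, with no further calls to the classifiers themselves.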
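Abstract: The maximum frequent itemset algorithm DMFIA (discover maximum frequent itemsets algorithm) generates a large number of candidate itemsets when the dimensionality of the candidate itemsets is large while that of the maximum frequent itemsets is small. To address this problem, an improved FP-Tree (frequent pattern tree) based algorithm for mining maximum frequent itemsets, FP-EMFIA, is proposed. During mining, the algorithm follows the item header table and adopts a bidirectional search strategy that combines top-down and bottom-up search. It uses the frequent items in the conditional pattern base and lower-dimensional infrequent itemsets to reduce the dimensionality of the candidate itemsets and to prune them, thereby reducing the number of candidate itemsets and speeding up candidate counting. Experimental results on the classic datasets mushroom, chess and connect show that FP-EMFIA is more time-efficient than DMFIA, IDMFIA (improved algorithm of DMFIA) and BDRFI (algorithm for mining frequent itemsets based on decreasing dimensionality reduction of frequent itemsets) when the support is small, which indicates that FP-EMFIA has a relative advantage when the dimensionality of the candidate itemsets is large.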