Funding: Funded by the Enterprise Ireland Innovation Partnership Programme with Ericsson under grant agreement IP/2011/0135 [6]; supported by the National Natural Science Foundation of China (Nos. 61373131, 61303039, 61232016, 61501247); the PAPD and CICAEET funds.
Abstract: The rapid development of network technology and its evolution toward heterogeneous networks have increased the demand for automatic monitoring and management of heterogeneous wireless communication networks. This paper presents a multilevel pattern mining architecture to support automatic network management by discovering interesting patterns from telecom network monitoring data. This architecture leverages and combines existing frequent itemset discovery over data streams, association rule deduction, frequent sequential pattern mining, and frequent temporal pattern mining techniques while also making use of distributed processing platforms to achieve high-volume throughput.
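To make two of the building blocks named in this abstract concrete, the sketch below shows frequent itemset discovery followed by association rule deduction. It is a minimal batch Apriori-style illustration, not the paper's streaming or distributed architecture; the alarm names and the thresholds are hypothetical, and the input is assumed to be a non-empty list of item sets.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    candidates = [frozenset([item]) for item in sorted({i for t in transactions for i in t})]
    result = {}
    size = 1
    while candidates:
        # Support = fraction of transactions that contain the candidate itemset.
        level = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        result.update(level)
        # Join frequent itemsets of the current size into candidates one item larger.
        size += 1
        candidates = list({a | b for a, b in combinations(level, 2) if len(a | b) == size})
    return result


def association_rules(itemsets, min_confidence):
    """Deduce rules (antecedent, consequent, confidence) from frequent itemsets."""
    rules = []
    for itemset, support in itemsets.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                antecedent = frozenset(antecedent)
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = support / itemsets[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Hypothetical monitoring windows, each listing the alarms observed together.
windows = [
    {"link_down", "high_latency"},
    {"link_down", "high_latency", "packet_loss"},
    {"link_down"},
    {"high_latency", "packet_loss"},
]
for antecedent, consequent, confidence in association_rules(frequent_itemsets(windows, 0.5), 0.9):
    print(sorted(antecedent), "->", sorted(consequent), round(confidence, 2))
```

With these hypothetical windows, the only rule that reaches 0.9 confidence is that `packet_loss` implies `high_latency`.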
Funding: Supported by the National Basic Research Program of China (973 Program: 2013CB329004).
Abstract: The recent emergence of diverse services has led to explosive traffic growth in cellular data networks. Understanding the service dynamics in large cellular networks is important for network design, troubleshooting, quality of service (QoE) support, and resource allocation. In this paper, we present our study to reveal the distributions and temporal patterns of different services in a cellular data network from two perspectives, namely service request times and service duration. Our study is based on big traffic data captured over a week-long period from a tier-1 mobile operator's network in China and parsed into readable records by our Hadoop-based packet parsing platform. We propose a Zipf's ranked model to characterize the distributions of traffic volume, packets, request times, and duration of cellular services. A two-stage method (Self-Organizing Map combined with k-means) is first used to cluster the service time series into four request patterns and three duration patterns. These seven patterns are then combined to better understand the fine-grained temporal patterns of services in the cellular network. The results of our distribution models and temporal patterns give cellular network operators a better understanding of the request and duration characteristics of services, which is of great importance in network design, service generation, and resource allocation.
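As an illustration of what a Zipf-type ranked model expresses, the sketch below fits value ≈ C / rank^s to a list of per-service totals by ordinary least squares in log-log space. This is a generic fitting approach chosen for illustration, not necessarily the procedure used in the paper; the function name and the synthetic data are hypothetical, and at least two positive values are assumed.

```python
import math


def fit_zipf(values):
    """Fit value ~ C / rank**s; return (s, C) from a log-log least-squares line."""
    ranked = sorted((v for v in values if v > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(v) for v in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # A Zipf law is a straight line with slope -s in log-log coordinates.
    return -slope, math.exp(intercept)


# Synthetic request counts for 50 hypothetical services that follow 1000 / rank.
exponent, scale = fit_zipf([1000.0 / rank for rank in range(1, 51)])
print(f"s = {exponent:.3f}, C = {scale:.1f}")
```

On real traffic data the points rarely fall exactly on a line, so the fitted exponent should be read together with a goodness-of-fit check.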
Abstract: Rockburst is an important phenomenon that has affected many deep underground mines around the world. An understanding of this phenomenon is relevant to the management of such events, which can lead to saving both costs and lives. Laboratory experiments are one way to obtain a deeper and better understanding of the mechanisms of rockburst. In a previous study by these authors, a database of rockburst laboratory tests was created; in addition, with the use of data mining (DM) techniques, models to predict rockburst maximum stress and rockburst risk indexes were developed. In this paper, we focus on the analysis of a database of in situ cases of rockburst in order to build influence diagrams, list the factors that interact in the occurrence of rockburst, and understand the relationships between these factors. The in situ rockburst database was further analyzed using different DM techniques ranging from artificial neural networks (ANNs) to naive Bayesian classifiers. The aim was to predict the type of rockburst (that is, the rockburst level) based on geologic and construction characteristics of the mine or tunnel. Conclusions are drawn at the end of the paper.
Abstract: The frequency and scale of blasting events are increasing to boost limestone production. Mines are approaching inhabited areas due to growing population and limited availability of land resources, which has challenged management to achieve safe blasts, with special reference to opencast mining. The study aims to predict the distance covered by the flyrock induced by blasting using an artificial neural network (ANN) and multi-variate regression analysis (MVRA) for better assessment. Blast design and geotechnical parameters, such as linear charge concentration, burden, stemming length, specific charge, unconfined compressive strength (UCS), and rock quality designation (RQD), have been selected as input parameters, and flyrock distance has been used as the output parameter. The ANN has been trained using 95 datasets of experimental blasts conducted in 4 opencast limestone mines in India. Thirty datasets have been used for testing and validation of the trained neural network. Flyrock distances have been predicted by ANN and MVRA, further calculated using motion analysis of flyrock projectiles, and compared with the observed data. The back propagation neural network (BPNN) has proven to be a superior predictive tool when compared with MVRA. © 2014 Institute of Rock and Soil Mechanics, Chinese Academy of Sciences. Production and hosting by Elsevier B.V. All rights reserved.
Abstract: AMiner is a novel online academic search and mining system, and it aims to provide a systematic modeling approach to help researchers and scientists gain a deeper understanding of the large and heterogeneous networks formed by authors, papers, conferences, journals, and organizations. The system extracts researchers' profiles automatically from the Web and integrates them with published papers by way of a process that first performs name disambiguation. Then a generative probabilistic model is devised to simultaneously model the different entities while providing a topic-level expertise search. In addition, AMiner offers a set of researcher-centered functions, including social influence analysis, relationship mining, collaboration recommendation, similarity analysis, and community evolution. The system has been in operation since 2006 and has been accessed from more than 8 million independent IP addresses residing in more than 200 countries and regions.
Funding: Supported by the American Cancer Society Internal Research Grant (to JZ); the National Cancer Institute Informatics Technology for Cancer Research U01 grant (Grant No. CA188547 to JZ and KH); the Indiana University Precision Health Initiative (to JZ and KH); and the support from Indiana University Information Technologies and Advanced Biomedical IT Core.
Abstract: Gene co-expression network (GCN) mining identifies gene modules with highly correlated expression profiles across samples/conditions. It enables researchers to discover latent gene/molecule interactions, identify novel gene functions, and extract molecular features from certain disease/condition groups, thus helping to identify disease biomarkers. However, an easy-to-use tool package is lacking for users to mine GCN modules that are relatively small in size with tightly connected genes that can be convenient for downstream gene set enrichment analysis, as well as modules that may share common members. To address this need, we developed an online GCN mining tool package: TSUNAMI (Tools SUite for Network Analysis and MIning). TSUNAMI incorporates our state-of-the-art lmQCM algorithm to mine GCN modules for both public and user-input data (microarray, RNA-seq, or any other numerical omics data), and then performs downstream gene set enrichment analysis for the identified modules. It has several features and advantages: 1) a user-friendly interface and real-time co-expression network mining through a web server; 2) direct access and search of NCBI Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases, as well as user-input gene expression matrices for GCN module mining; 3) multiple co-expression analysis tools to choose from, all of which are highly flexible with regard to parameter selection options; 4) identified GCN modules are summarized to eigengenes, which are convenient for users to check their correlation with other clinical traits; 5) integrated downstream Enrichr enrichment analysis and links to other gene set enrichment tools; and 6) visualization of gene loci by Circos plot in any step of the process. The web service is freely accessible at https://biolearns.medicine.iu.edu/. Source code is available at https://github.com/huangzhii/TSUNAMI/.
Funding: Supported by the National Basic Research and Development Program of China (No. 2010CB732004); the joint funding of the National Natural Science Foundation and Shanghai Baosteel Group Corporation of China (No. 51074177).
Abstract: Because there is neither waste rock nor mill tailings in the gypsum mine, and the buildings above the goaf of the gypsum mine need to be protected, this research proposes a clay filling technology scheme. Gypsum, cement, lime, and water glass were used as adhesives, and the strength of different material ratios was investigated in this study. The factors influencing clay strength were ranked in the order of cement, gypsum, water glass, and lime. The results show that the cement content is the determining factor and that gypsum has positive effects, while the water glass can enhance both the clay strength and the fluidity of the filling slurry. Furthermore, by combining a chaotic optimization method with a neural network, the optimal ratio of the composite cementing agent was obtained. The results show that the optimal ratio of water glass, cement, lime, and clay (by mass) is 1.17:6.74:4.17:87.92 in the process of bottom self-flow filling, while the optimal ratio is 1.78:9.58:4.71:83.93 for roof-contacted filling. A novel filling process to fill the gypsum mine goaf with clay is established. Engineering practice shows that the filling cost is low, and thus a notable economic benefit is achieved.
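The two reported ratios each sum to 100, so they can be read directly as mass percentages. The short sketch below converts such a ratio into component masses for a batch; the 1000 kg batch size and the function name are hypothetical and serve only to illustrate the arithmetic.

```python
def batch_masses(ratio, total_mass):
    """Split total_mass among components in proportion to their ratio parts."""
    total_parts = sum(ratio.values())
    return {name: total_mass * part / total_parts for name, part in ratio.items()}


# Ratios quoted in the abstract (water glass : cement : lime : clay).
bottom_self_flow = {"water glass": 1.17, "cement": 6.74, "lime": 4.17, "clay": 87.92}
roof_contacted = {"water glass": 1.78, "cement": 9.58, "lime": 4.71, "clay": 83.93}

# Hypothetical 1000 kg batch of solids for each filling stage.
for label, ratio in (("bottom self-flow", bottom_self_flow), ("roof-contacted", roof_contacted)):
    masses = batch_masses(ratio, 1000.0)
    print(label, {name: round(mass, 1) for name, mass in masses.items()})
```

For the bottom self-flow ratio this gives roughly 11.7 kg of water glass, 67.4 kg of cement, 41.7 kg of lime, and 879.2 kg of clay per 1000 kg.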
Abstract: One of the most serious conundrums facing stope production in underground metalliferous mining is uneven break (UB: unplanned dilution and ore loss). Although UB has a huge economic impact on the entire mining process, it is practically unavoidable due to its complex causative mechanism. In this study, the contribution of ten major UB causative parameters has been scrutinised based on a published UB-predicting artificial neural network (ANN) model in order to bring UB under engineering management. Two typical ANN sensitivity analysis methods, i.e., the connection weight algorithm (CWA) and the profile method (PM), have been applied. As a result of the CWA and PM applications, adjusted Qrate (AQ) was revealed as the most influential parameter for UB, with a contribution of 22.40% in CWA and 20.48% in PM. The findings of this study can be used as an important reference in the stope design, production, and reconciliation stages of underground stoping mines.
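For readers unfamiliar with connection-weight sensitivity analysis, the sketch below shows one common formulation for a network with a single hidden layer and a single output: each input's raw importance is the sum, over hidden neurons, of its input-to-hidden weight times that neuron's hidden-to-output weight, and the absolute values are then normalised to percentages. Whether the cited study used exactly this normalisation is an assumption, and the weights below are made up for illustration.

```python
def connection_weight_contributions(w_input_hidden, w_hidden_output):
    """Return each input's percentage contribution from products of connection weights.

    w_input_hidden[i][h] is the weight from input i to hidden neuron h;
    w_hidden_output[h] is the weight from hidden neuron h to the single output.
    """
    raw = [
        sum(w_ih * w_ho for w_ih, w_ho in zip(row, w_hidden_output))
        for row in w_input_hidden
    ]
    total = sum(abs(score) for score in raw)
    # Normalise absolute scores so that the contributions add up to 100%.
    return [100.0 * abs(score) / total for score in raw]


# Made-up weights for a toy network with 3 inputs and 2 hidden neurons.
weights_input_hidden = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.3]]
weights_hidden_output = [1.0, -0.5]
for index, share in enumerate(connection_weight_contributions(weights_input_hidden, weights_hidden_output)):
    print(f"input {index}: {share:.2f}%")
```

The sign of each raw score, discarded here by the absolute value, indicates whether the input pushes the output up or down.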
Funding: Supported by Chinese "Disease" Sub-health Medicine Research and Intervention of the Eleventh Five-Year Science and Technology Support Project of China (No. 2006BAI13B01); Financial Support Case Studies of Traditional Chinese Medicine Treatment of Disease and Health Management Ideas of Shanghai Health Bureau (No. 2010227); Scientific Innovation Research Funds of Shanghai Municipal Education Commission (No. 14YZ061); Teacher Academic Community Fund of Shanghai University of Traditional Chinese Medicine (No. 2013JXG03); Chinese Culture and Its Core Value System Modernization Transformation of the National Social Science Funds (No. 12AZD094).
Abstract: OBJECTIVE: To apply data mining methods to research on the state of sub-mental health among residents in eight provinces and cities in China and to mine latent knowledge about many conditions through data mining and analysis of data on 3970 sub-mentally healthy individuals selected from 13385 relevant questionnaires. METHODS: The strategic tree algorithm was used to identify the main manifestations of the state of sub-mental health. The backpropagation artificial neural network was used to analyze the main manifestations of sub-healthy mental states of three different degrees. A sub-mental health evaluation model was then established to achieve predictive evaluation results. RESULTS: Using classifications from the Scale of Chinese Sub-healthy State, the main manifestations of sub-mental health selected using the strategic tree were F1101 (Do you lack peace of mind?), F1102 (Are you easily nervous when something comes up?), and F1002 (Do you often sigh?). The relative intensity of manifestations of sub-mental health was highest for F1101, followed by F1102, and then F1002. Through study of the neural network, better differentiation could be made between moderate and severe and between mild and severe states of sub-mental health. The differentiation between mild and moderate sub-mental health states was less apparent. Additionally, the sub-mental health state evaluation model, which could be used to predict states of sub-mental health of different individuals, was established using F1101, F1102, F1002, and the mental self-assessment total score. CONCLUSION: The main manifestations of the state of sub-mental health can be discovered using data mining methods to research and analyze the latent laws and knowledge hidden in research evidence on the state of sub-mental health. The state of sub-mental health of different individuals can be rapidly predicted using the model established here. This can provide a basis for assessment and intervention for sub-mental health. It can also replace the relatively outdated approaches to research on sub-health in the technical era of information and digitization by combining the study of states of sub-mental health with information techniques and by further quantifying the relevant information.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61573299, 61174140, 61472127, and 61272395); the Social Science Foundation of Hunan Province (No. 16ZDA07); China Postdoctoral Science Foundation (Nos. 2013M540628 and 2014T70767); the Natural Science Foundation of Hunan Province (Nos. 14JJ3107 and 2017JJ5064); the Excellent Youth Scholars Project of Hunan Province (No. 15B087).
Abstract: The distance dynamics model is an excellent tool for uncovering the community structure of a complex network. However, one issue that must be addressed by this model is its very long computation time in large-scale networks. To identify the community structure of a large-scale network with high speed and high quality, in this paper, we propose a fast community detection algorithm, the F-Attractor, which is based on the distance dynamics model. The main contributions of the F-Attractor are as follows. First, we propose the use of two prejudgment rules from two different perspectives: node and edge. Based on these two rules, we develop a strategy of internal edge prejudgment for predicting the internal edges of the network. Internal edge prejudgment can reduce the number of edges and their neighbors that participate in the distance dynamics model. Second, we introduce a triangle distance to further enhance the speed of the interaction process in the distance dynamics model. This triangle distance uses two known distances to measure a third distance without any extra computation. We combine the above techniques to improve the distance dynamics model and then describe the community detection process of the F-Attractor. The results of an extensive series of experiments demonstrate that the F-Attractor offers high-speed community detection and high partition quality.