Data mining is the process of extracting hidden, previously unknown, but potentially useful information from massive data. Big Data has a great impact on scientific discoveries and value creation. Data mining (DM) with Big Data has been widely used in the lifecycle of electronic products, from the design and production stages to the service stage. A comprehensive examination of DM with Big Data, together with a survey of its application across the stages of this lifecycle, can help researchers develop solid research. Recently, big data has become a buzzword, which has pushed analysts to extend existing data mining techniques to cope with the evolving nature of data and to develop new analytic techniques. In this paper, we develop an empirical evaluation method based on the principle of Design of Experiment. We apply this method to evaluate data mining tools and AI algorithms towards building big data analytics for telecommunication monitoring data. Two case studies are conducted to provide insights into the relations between the requirements of data analysis and the choice of a tool or algorithm in the context of data analysis workflows.
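As a rough illustration of what a Design of Experiment style evaluation can look like, the sketch below enumerates a full factorial design, that is, one evaluation run per combination of factor levels. The factor names and levels (`tool`, `algorithm`, `data_size`) are hypothetical placeholders; the abstract does not say which factors, levels, or design type the authors actually used.

```python
from itertools import product

# Hypothetical factors and levels, chosen only for illustration.
# The abstract does not list the actual tools, algorithms, or data sizes.
FACTORS = {
    "tool": ["tool_a", "tool_b"],
    "algorithm": ["decision_tree", "k_means"],
    "data_size": ["small", "large"],
}


def full_factorial(factors):
    """Return one run configuration per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


runs = full_factorial(FACTORS)
print(len(runs))  # 2 * 2 * 2 = 8 runs
for run in runs:
    print(run)
```

Each dictionary in `runs` describes one experiment to execute and measure. A full factorial design grows multiplicatively with the number of levels, so a real study might use a fractional design instead.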
With increasingly complex website structures and continuously advancing web technologies, accurately recognizing user clicks in massive HTTP data, which is critical for web usage mining, has become more difficult. In this paper, we propose a dependency graph model to describe the relationships between web requests. Based on this model, we design and implement a heuristic parallel algorithm that distinguishes user clicks with the assistance of cloud computing technology. We evaluate the proposed algorithm on a large real-world dataset. The dataset, collected from a mobile core network, is 228.7 GB in size and covers more than three million users. The experimental results demonstrate that the proposed algorithm achieves higher accuracy than previous methods.
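To make the idea of a dependency graph over web requests concrete, here is a minimal sequential sketch. It assumes that dependencies are taken from the HTTP `Referer` header and that a request is a click candidate when it returns HTML and triggers at least one further request. Both assumptions, along with the `Request` type and the function names, are illustrative only; the abstract does not describe the paper's actual graph construction, heuristics, or parallelization.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Request:
    """One logged HTTP request (hypothetical record layout)."""
    url: str
    referer: Optional[str]
    content_type: str


def build_dependency_graph(requests):
    """Map each referring URL to the URLs requested with it as Referer."""
    children = defaultdict(list)
    for req in requests:
        if req.referer is not None:
            children[req.referer].append(req.url)
    return children


def candidate_clicks(requests):
    """Assumed heuristic: an HTML request that triggers further requests."""
    children = build_dependency_graph(requests)
    return [
        req.url
        for req in requests
        if req.content_type.startswith("text/html") and children.get(req.url)
    ]


log = [
    Request("http://a.example/", None, "text/html"),
    Request("http://a.example/style.css", "http://a.example/", "text/css"),
    Request("http://a.example/logo.png", "http://a.example/", "image/png"),
    Request("http://a.example/news", "http://a.example/", "text/html"),
    Request("http://a.example/news.js", "http://a.example/news", "application/javascript"),
    Request("http://ads.example/frame", "http://a.example/news", "text/html"),
]
print(candidate_clicks(log))
```

In this toy log the embedded frame from `ads.example` is not reported, because it returns HTML but triggers no further requests. Real traffic would need additional signals such as timing, since `Referer` headers can be missing or misleading.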
Funding: Supported in part by the Fundamental Research Funds for the Central Universities under Grant No. 2013RC0114 and the 111 Project of China under Grant No. B08004.