This paper presents a new obstacle avoidance control method for cars based on big data and just-in-time modeling. Just-in-time modeling is a data-driven control technique that has emerged in the age of big data and is used in various real systems. The main feature of the proposed method is that the gain and the control time, which are the parameters of the control input used to avoid an encountered obstacle, are computed from a database containing a large amount of driving data recorded in various situations. An important advantage of the method is its short computation time, which makes real-time obstacle avoidance control for cars possible. Numerical simulations show that the new control method enables the car to avoid various obstacles more efficiently than the previous method.
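As a rough illustration of the just-in-time idea described in this abstract, the sketch below looks up the stored driving situations closest to the current one and blends their control parameters. The abstract does not specify the database layout, the features, or the local model, so the feature pair (distance to obstacle, car speed), the numbers in DATABASE, the function name jit_estimate, and the inverse-distance weighting are all assumptions made for illustration only.

```python
import math

# Hypothetical database rows: (query features, (gain, control_time)).
# Features are (distance to obstacle [m], car speed [m/s]); the paper's
# actual features, units, and values are not given in the abstract.
DATABASE = [
    ((10.0, 5.0), (0.80, 1.2)),
    ((20.0, 5.0), (0.50, 1.8)),
    ((10.0, 10.0), (1.10, 0.9)),
    ((20.0, 10.0), (0.70, 1.4)),
    ((30.0, 10.0), (0.40, 2.0)),
]


def jit_estimate(query, database, k=3, eps=1e-9):
    """Estimate (gain, control_time) from the k nearest stored situations,
    weighting each neighbour by the inverse of its distance to the query."""
    nearest = sorted(database, key=lambda row: math.dist(row[0], query))[:k]
    weights = [1.0 / (math.dist(features, query) + eps) for features, _ in nearest]
    total = sum(weights)
    gain = sum(w * out[0] for w, (_, out) in zip(weights, nearest)) / total
    control_time = sum(w * out[1] for w, (_, out) in zip(weights, nearest)) / total
    return gain, control_time


# Query a situation that is not stored in the database.
gain, control_time = jit_estimate((15.0, 7.5), DATABASE)
print(f"gain={gain:.3f}, control_time={control_time:.3f}")
```

No model is fitted in advance: each query costs one nearest-neighbour search and a small weighted average, which is consistent with the short computation time the abstract emphasizes.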
Offshore waters provide resources for human beings but also threaten them through marine disasters. Ocean stations are part of offshore observation networks, and the quality of their data is of great significance for exploiting and protecting the ocean. We used hourly real-time observations of mean wave height, temperature, and pressure taken at the Xiaomaidao station (Qingdao, China) from June 1, 2017, to May 31, 2018, to examine data quality using eight quality control methods and to identify the most effective method for the Xiaomaidao station. After the eight quality control methods were applied, the percentages of the mean wave height, temperature, and pressure data that passed the tests were 89.6%, 88.3%, and 98.6%, respectively. Checked against the marine disaster (wave alarm report) data, the values that failed the tests did so mainly because of aging observation equipment and missing data transmissions. Because the mean wave height is often affected by dynamic marine disasters, the continuity test is not effective for it; a correlation test against other related parameters would be more useful.
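The abstract does not list the eight quality control methods, so the sketch below shows only two generic checks of the kind it mentions: a range test and a continuity test on consecutive hourly values. The thresholds, the sample data, and the function names are made up for illustration and are not taken from the paper.

```python
def range_test(values, low, high):
    """Flag each value True if it is present and lies within [low, high]."""
    return [v is not None and low <= v <= high for v in values]


def continuity_test(values, max_step):
    """Flag each value True if it differs from the previous present value
    by at most max_step. Missing values fail; the first present value passes."""
    flags, previous = [], None
    for v in values:
        if v is None:
            flags.append(False)
            continue
        flags.append(previous is None or abs(v - previous) <= max_step)
        previous = v
    return flags


def pass_rate(flags):
    """Percentage of samples that passed."""
    return 100.0 * sum(flags) / len(flags)


# Made-up hourly mean wave heights in metres; None marks a missing transmission.
wave_height = [0.6, 0.7, None, 0.8, 3.9, 0.9, 1.0, 25.0]
in_range = range_test(wave_height, low=0.0, high=20.0)
continuous = continuity_test(wave_height, max_step=1.5)
passed_both = [a and b for a, b in zip(in_range, continuous)]
print(pass_rate(in_range), pass_rate(passed_both))
```

In this toy series the jump to 3.9 m fails the continuity test, and so does the return to 0.9 m. A genuine storm-driven jump would be flagged in the same way as a faulty reading, which is consistent with the abstract's remark that the continuity test is not effective for mean wave height.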
1 Introduction. Information technology has been playing an ever-increasing role in geoscience. Sophisticated database platforms are essential for the storage, analysis, and exchange of geological Big Data (Feblowitz, 2013; Zhang et al., 2016; Teng et al., 2016; Tian and Li, 2018). The United States has built an information-sharing platform for state-owned scientific data as a national strategy.
Opinion (sentiment) analysis of big data streams, from the constantly generated text streams on social media networks to hundreds of millions of online consumer reviews, gives organizations in every field opportunities to discover valuable intelligence in massive user-generated text. However, traditional content analysis frameworks cannot efficiently handle the unprecedented volume of unstructured text streams or the complexity of the text analysis tasks required for real-time opinion analysis on big data streams. In this paper, we propose a parallel real-time sentiment analysis system, the Social Media Data Stream Sentiment Analysis Service (SMDSSAS), which performs multiple phases of sentiment analysis on social media text streams in real time, using two fully analytic opinion mining models to cope with the scale of the text data streams and the complexity of sentiment analysis on unstructured text. We propose two aspect-based opinion mining models, a Deterministic and a Probabilistic sentiment model, for real-time sentiment analysis of data streams related to a user-given topic. In experiments on Twitter stream traffic captured during the pre-election weeks of the 2016 Presidential election, analyzing public opinion toward two presidential candidates in real time, the proposed system correctly predicted Donald Trump as the winner of the 2016 Presidential election. Cross-validation showed that the proposed sentiment models, together with the real-time streaming components of our framework, analyzed opinions on the two presidential candidates effectively, with an average accuracy of 81% for the Deterministic model and 80% for the Probabilistic model, an improvement of 1% to 22% over results in the existing literature.
Since the concept of big data was proposed, the theory of big data has attracted attention from the public, academics, market watchers, researchers, and others, and people have explored all aspects of the big data era. Beyond academia, big data has had an impact on all areas of marketing. In this article, we collect a number of papers and extract their viewpoints on theory and methods, in the hope of supporting research on the theory of big data in the field of marketing.
Paleogeographic analysis is an essential part of geological research, making important contributions to the reconstruction of depositional environments and tectonic evolution histories (Ingalls et al., 2016; Merdith et al., 2017), the prediction of mineral resource distributions in continental sedimentary basins (Sun and Wang, 2009), and the investigation of climate patterns and ecosystems (Cox, 2016).
Causal analysis is a powerful tool for unraveling data complexity and hence for providing clues to achieving, for example, better platform design and more efficient interoperability and service management. Data science will surely benefit from advances in this field. Here we introduce to this community a recent finding in physics on causality and the rigorous, quantitative causality analysis that follows from it. The resulting formula is concise in form, involving only common statistics, namely sample covariances. A corollary is that causation implies correlation, but not vice versa, resolving the long-standing philosophical debate over correlation versus causation. The applicability to big data analysis is validated with time series purportedly generated with hidden processes. As a demonstration, a preliminary application to the gross domestic product (GDP) data of the United States, China, and Japan reveals some subtle USA-China-Japan relations in certain periods.
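The abstract does not state the formula itself. One published covariance-only estimator of this kind is the Liang information flow between two time series, and the sketch below assumes that this is the formula meant; that identification, the function names, and the synthetic test series are assumptions, not details taken from the abstract. With C_ij the sample covariance of series i and j, and C_i,d1 the covariance of series i with the forward difference of series 1, the estimated flow from series 2 to series 1 is (C11 C12 C2,d1 - C12^2 C1,d1) / (C11^2 C22 - C11 C12^2).

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def information_flow(x1, x2, dt=1.0):
    """Estimate the information flow from x2 to x1 from sample covariances."""
    # Euler forward difference of x1; drop the last sample so lengths match.
    dx1 = [(x1[i + 1] - x1[i]) / dt for i in range(len(x1) - 1)]
    a, b = x1[:-1], x2[:-1]
    c11, c22, c12 = cov(a, a), cov(b, b), cov(a, b)
    c1d1, c2d1 = cov(a, dx1), cov(b, dx1)
    return (c11 * c12 * c2d1 - c12 ** 2 * c1d1) / (c11 ** 2 * c22 - c11 * c12 ** 2)


# Synthetic check: x2 drives x1, but x1 does not feed back into x2.
rng = random.Random(0)
x1, x2 = [0.0], [0.0]
for _ in range(5000):
    x1.append(0.5 * x1[-1] + 0.5 * x2[-1] + rng.gauss(0.0, 1.0))
    x2.append(0.6 * x2[-1] + rng.gauss(0.0, 1.0))

t21 = information_flow(x1, x2)  # flow from x2 to x1
t12 = information_flow(x2, x1)  # flow from x1 to x2
print(t21, t12)
```

Every term in the numerator carries a factor of C12, so the estimate vanishes whenever the two series are uncorrelated. That matches the corollary in the abstract: causation implies correlation, while a nonzero correlation alone does not guarantee a nonzero flow.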
With the advent of Big Data, the fields of Statistics and Computer Science coexist in current information systems. In addition, technological advances in embedded systems, in particular Internet of Things technologies, make it possible to develop real-time applications. These developments are disrupting Software Engineering, because using large amounts of real-time data requires advanced thinking about software architecture. The purpose of this article is to propose an architecture that unifies not only Software Engineering and Big Data activities, but also batch and streaming architectures for the exploitation of massive data. The advantage of this architecture is that it makes it possible to develop applications and digital services that exploit very large volumes of data in real time, both for management needs and for analytical purposes. The architecture was tested on COVID-19 data during the development of an application for real-time monitoring of the evolution of the pandemic in Côte d’Ivoire, using PostgreSQL, Elasticsearch, Kafka, Kafka Connect, NiFi, Spark, Node-RED and MoleculerJS to operationalize it.
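To address the slow processing speed and low accuracy of current techniques for monitoring the state of electrical equipment, this paper proposes an online state monitoring method for electrical equipment based on multiple change-point detection and a template matching strategy. Building on a buffer model and a sliding-window model, the method uses the multi-way search tree change-point detection algorithm (Ternary Search Tree and Kolmogorov-Smirnov, TSTKS) to form feature vectors in the window dimension and the buffer dimension, and then matches the operating state of the equipment and locates the moments of state switching through template matching in both dimensions. Simulation results based on a household refrigerator show that the proposed method offers fast detection and high accuracy and can serve as a reference for the field of electrical equipment state monitoring.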
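The abstract gives no details of the TSTKS algorithm, so the sketch below is not TSTKS. It only illustrates the Kolmogorov-Smirnov idea underneath it: within one sliding window, try each split point and keep the one whose two sides have the most different empirical distributions. The exhaustive scan, the min_size parameter, the function names, and the sample power readings are assumptions for illustration.

```python
from bisect import bisect_right


def ks_statistic(left, right):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical distribution functions of the two samples."""
    ls, rs = sorted(left), sorted(right)
    return max(
        abs(bisect_right(ls, v) / len(ls) - bisect_right(rs, v) / len(rs))
        for v in ls + rs
    )


def find_change_point(window, min_size=3):
    """Return (index, statistic) for the split of `window` whose two sides
    differ most under the KS statistic. Exhaustive scan, for illustration."""
    best_index, best_stat = None, 0.0
    for i in range(min_size, len(window) - min_size + 1):
        stat = ks_statistic(window[:i], window[i:])
        if stat > best_stat:
            best_index, best_stat = i, stat
    return best_index, best_stat


# Made-up power readings in watts: standby, then compressor running.
power = [5, 6, 5, 7, 6, 5, 120, 118, 122, 119, 121, 120]
index, stat = find_change_point(power)
print(index, stat)
```

The exhaustive scan recomputes the statistic at every candidate split, which becomes slow on large windows. According to its name, TSTKS combines a search tree with the Kolmogorov-Smirnov test, presumably to avoid testing every split; that tree-based search is not reproduced here.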
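To address the low detection accuracy and slow detection speed of existing methods in the analysis of large-scale time-series medical data, this paper proposes a deep-learning-based method for classifying pathological segments of time-series data. Building on the TSTKS (Ternary Search Trees and modified Kolmogorov-Smirnov) algorithm and sliding-window theory, the method uses deep learning to classify pathological data segments quickly and accurately. The classification results are then used as the basis for dynamically adjusting the sliding-window size. Analysis of real epileptic electroencephalogram (EEG) signals shows that the proposed classification method, and the dynamic sliding-window adjustment mechanism built on it, offer fast detection and relatively high accuracy, providing a new option for the rapid analysis of large-scale time-series data.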
Funding (Xiaomaidao ocean station data quality study): Supported by the National Key Research and Development Program of China (Nos. 2016YFC1402000, 2018YFC1407003, 2017YFC1405300).
Funding (geoscience database platform article): Granted by the National Science & Technology Major Projects of China (Grant No. 2016ZX05033).
Funding (paleogeographic analysis article): Granted by the National Natural Science Foundation of China (Grant No. 41802126) and the Open Fund of the Key Laboratory of Sedimentary Mineralization and Sedimentary Minerals in Shandong Province (Grant No. DMSM2017006).