In the synthesis of control algorithms for complex systems, we are often faced with imprecise or unknown mathematical models of the dynamical systems, or even with difficulties in finding a mathematical model of the system in the open loop. To tackle these difficulties, this paper proposes an approach to data-driven model identification and control algorithm design based on the maximum stability degree criterion. The data-driven model identification procedure derives the mathematical model of the system from the underdamped transient response of the closed-loop system. The system is approximated with an inertial model whose coefficients are calculated from the values of the critical transfer coefficient and the amplitude and period of oscillation of the underdamped response of the closed-loop system. In the data-driven control design, the tuning parameters of the controller are calculated from the parameters obtained in the identification step, and expressions for calculating these tuning parameters are presented. The results of the data-driven model identification and controller synthesis algorithm were verified by computer simulation.
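The abstract does not reproduce the paper's tuning expressions, so as an illustration of the same workflow (computing controller parameters from the critical transfer coefficient and the oscillation period of the closed loop), here is a sketch using the classical Ziegler-Nichols rules instead; the numeric inputs are hypothetical.

```python
# Illustrative only: this uses the classical Ziegler-Nichols PID rules,
# not the paper's maximum-stability-degree expressions. The workflow is
# the same: measure the critical (ultimate) gain K_u and the oscillation
# period T_u of the closed loop, then compute tuning parameters from them.
def zn_pid(K_u, T_u):
    """Classical Ziegler-Nichols PID tuning from ultimate gain and period."""
    K_p = 0.6 * K_u
    T_i = 0.5 * T_u    # integral time
    T_d = 0.125 * T_u  # derivative time
    return K_p, K_p / T_i, K_p * T_d  # (Kp, Ki, Kd)

# Hypothetical measured values from a sustained-oscillation experiment:
Kp, Ki, Kd = zn_pid(K_u=4.0, T_u=2.0)
```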
This study addresses the challenges of big data visualization by applying data reduction based on feature selection methods, with the aim of reducing the volume of big data and minimizing model training time (Tt) while maintaining data quality. We compare the embedded "Select from model" (SFM) method, driven by the random forest importance (RFI) algorithm, with the filter method "Select percentile" (SP), based on the chi-square (Chi2) test, for selecting the most important features, which are then fed into a classification process using the logistic regression (LR) and k-nearest neighbor (KNN) algorithms. The classification accuracy (AC) of LR is compared with that of KNN in Python on eight data sets to determine which combination performs best when feature selection is applied. The study concludes that feature selection methods have a significant impact on the analysis and visualization of the data once repetitive data and data that do not affect the goal have been removed. After several comparisons, the study recommends SFMLR: SFM based on the RFI algorithm for feature selection, combined with the LR algorithm for classification. The proposal proved its efficacy when its results were compared with recent literature.
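The two feature-selection routes the abstract compares map directly onto scikit-learn; the sketch below is illustrative (the dataset and parameter values are not those of the study), but the SFM-with-RFI versus SP-with-Chi2 structure is as described.

```python
# Embedded route: SelectFromModel (SFM) using random-forest importances,
# vs. filter route: SelectPercentile (SP) scored by chi-square, each
# followed by logistic-regression classification. Dataset and percentile
# are illustrative stand-ins for the study's eight data sets.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel, SelectPercentile, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Embedded method: keep features whose RF importance exceeds the mean.
sfm_lr = make_pipeline(
    SelectFromModel(RandomForestClassifier(n_estimators=100, random_state=0)),
    LogisticRegression(max_iter=5000),
).fit(X_train, y_train)

# Filter method: chi2 requires non-negative inputs, hence the scaling.
sp_lr = make_pipeline(
    MinMaxScaler(),
    SelectPercentile(chi2, percentile=50),
    LogisticRegression(max_iter=5000),
).fit(X_train, y_train)

acc_sfm = sfm_lr.score(X_test, y_test)
acc_sp = sp_lr.score(X_test, y_test)
```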
A flood is a significantly damaging natural calamity that causes loss of life and property. Earlier work on the construction of flood prediction models aimed to reduce risks, suggest policies, reduce mortality, and limit property damage caused by floods. The massive amount of data generated by social media platforms such as Twitter opens the door to flood analysis. Because of the real-time nature of Twitter data, some government agencies and authorities have used it to track natural catastrophe events in order to build a more rapid rescue strategy. However, due to the short length of tweets, it is difficult to construct a perfect prediction model for detecting floods. Machine learning (ML) and deep learning (DL) approaches can be used to statistically develop flood prediction models. At the same time, the vast number of tweets necessitates a big data analytics (BDA) tool for flood prediction. In this regard, this work provides an optimal deep learning-based flood forecasting model with big data analytics (ODLFF-BDA) based on Twitter data. The suggested ODLFF-BDA technique aims to anticipate the existence of floods from tweets in a big data setting. The technique comprises data pre-processing to convert the input tweets into a usable format. In addition, a Bidirectional Encoder Representations from Transformers (BERT) model is used to generate emotive contextual embeddings from tweets. Furthermore, a gated recurrent unit (GRU) with a multilayer convolutional neural network (MLCNN) is used to extract local features and predict floods. Finally, an Equilibrium Optimizer (EO) is used to fine-tune the hyperparameters of the GRU and MLCNN models in order to increase prediction performance. Memory usage is kept below 3.5 MB, lower than that of the other techniques compared. The ODLFF-BDA technique's performance was validated on a benchmark Kaggle dataset, and the findings showed that it significantly outperformed other recent approaches.
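To make the recurrent component concrete, here is a minimal NumPy sketch of a single GRU cell, the gating unit the ODLFF-BDA pipeline combines with the MLCNN. The weights below are random placeholders; in the paper they would be learned and then fine-tuned by the Equilibrium Optimizer.

```python
import numpy as np

# One GRU step: update gate z, reset gate r, candidate state h_tilde.
# Dimensions and weights are illustrative, not the paper's.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, W, U, b):
    """Advance the hidden state h by one time step given input x."""
    z = sigmoid(W["z"] @ x + U["z"] @ h + b["z"])          # update gate
    r = sigmoid(W["r"] @ x + U["r"] @ h + b["r"])          # reset gate
    h_tilde = np.tanh(W["h"] @ x + U["h"] @ (r * h) + b["h"])
    return (1 - z) * h + z * h_tilde                       # gated blend

rng = np.random.default_rng(0)
d_in, d_h = 4, 3
W = {k: rng.normal(size=(d_h, d_in)) for k in "zrh"}       # input weights
U = {k: rng.normal(size=(d_h, d_h)) for k in "zrh"}        # recurrent weights
b = {k: np.zeros(d_h) for k in "zrh"}

h = np.zeros(d_h)
for x in rng.normal(size=(5, d_in)):                       # run 5 time steps
    h = gru_cell(x, h, W, U, b)
```

Because each new state is a convex combination of the previous state and a tanh candidate, every component of `h` stays strictly inside (-1, 1).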
In recent years, the internet has stimulated explosive progress in knowledge discovery from big-volume data resources, mining valuable hidden rules by computation. Simultaneously, wireless channel measurement data exhibit big-volume features, given the massive antennas, huge bandwidth, and versatile application scenarios involved. This article first presents a comprehensive survey of channel measurement and modeling research for mobile communication, especially for the 5th generation (5G) and beyond. In light of progress in big data research, a cluster-nuclei based model is then proposed, which takes advantage of both stochastic and deterministic models. The novel model has low complexity, with a limited number of cluster nuclei, while each cluster nucleus has a physical mapping to a real propagation object. By combining the principles by which channel properties vary with antenna size, frequency, mobility, and scenario, mined from the channel data, the proposed model can be extended to versatile applications to support future mobile research.
Hydrocarbon production from shale has attracted much attention in recent years. When it comes to these prolific, hydrocarbon-rich resource plays, our understanding of the complexities of the flow mechanism (the sorption process and flow behavior in complex fracture systems, induced or natural) leaves much to be desired. In this paper, we present and discuss a novel approach to modeling and history matching of hydrocarbon production from a Marcellus shale asset in southwestern Pennsylvania using advanced data mining, pattern recognition, and machine learning technologies. In this new approach, instead of imposing our understanding of the flow mechanism, the impact of multi-stage hydraulic fractures, and the production process on the reservoir model, we allow the production history, well log, completion, and hydraulic fracturing data to guide our model and determine its behavior. The uniqueness of this technology is that it incorporates the so-called "hard data" directly into the reservoir model, so that the model can be used to optimize the hydraulic fracturing process. "Hard data" refers to field measurements taken during the hydraulic fracturing process, such as fluid and proppant type and amount, injection pressure and rate, and proppant concentration. This novel approach contrasts with the current industry focus on using "soft data" (non-measured, interpretive data such as fracture length, width, height, and conductivity) in reservoir models. The study focuses on a Marcellus shale asset comprising 135 wells with multiple pads, different landing targets, well lengths, and reservoir properties. The full-field history matching process was successfully completed using this data-driven approach, capturing the production behavior with acceptable accuracy for individual wells and for the entire asset.
Although the Internet of Things has been widely applied, problems remain in applying cloud computing to digital smart-medical big data collection, processing, analysis, and storage, especially the low efficiency of medical diagnosis. With the wide application of the Internet of Things and big data in the medical field, medical big data is growing geometrically, resulting in cloud service overload, insufficient storage, communication delay, and network congestion. To solve these medical and network problems, a medical big-data-oriented fog computing architecture and an application of the BP algorithm are proposed, and their structural advantages and characteristics are studied. This architecture enables the medical big data generated by medical edge devices and the existing data in the cloud service center to be computed, compared, and analyzed at the fog nodes through the Internet of Things. The diagnosis results are designed to reduce business processing delay and improve the diagnostic effect. Considering the weak computing power of each edge device, the artificial-intelligence BP neural network algorithm is used in the core computing model of the medical diagnosis system to improve system computing power, enhance intelligence-aided medical decision-making, and improve the efficiency of clinical diagnosis and treatment. In the application process, combined with the characteristics of medical big data technology, through fog architecture design and big data technology integration, we investigate the processing and analysis of heterogeneous data in the medical diagnosis system in the context of the Internet of Things. The results are promising: the medical platform network is smooth, the data storage space is sufficient, data processing and analysis are fast, and the diagnostic effect is remarkable, making the system a good assistant to doctors. It not only effectively addresses low efficiency and quality in clinical diagnosis and treatment, but also reduces patients' waiting time, eases the tension between doctors and patients, and improves the quality and management level of medical services.
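The BP (backpropagation) algorithm named above can be sketched in a few lines. The network below is a toy of the kind a fog node could run for diagnosis support; the architecture, synthetic features, and label rule are all illustrative assumptions, not the paper's actual medical model.

```python
import numpy as np

# Toy BP network: 4 inputs -> 8 tanh hidden units -> 1 sigmoid output,
# trained by plain full-batch gradient descent on synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))                  # 4 synthetic "measurements"
y = (X[:, 0] + X[:, 1] > 0).astype(float)      # synthetic diagnosis label

W1, b1 = rng.normal(size=(4, 8)) * 0.5, np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)) * 0.5, np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(500):
    h = np.tanh(X @ W1 + b1)                   # forward: hidden layer
    p = sigmoid(h @ W2 + b2).ravel()           # forward: output probability
    g = (p - y)[:, None] / len(y)              # dLoss/dlogit (cross-entropy)
    W2 -= 0.5 * h.T @ g;  b2 -= 0.5 * g.sum(0)
    gh = (g @ W2.T) * (1 - h ** 2)             # backprop through tanh
    W1 -= 0.5 * X.T @ gh; b1 -= 0.5 * gh.sum(0)

acc = ((p > 0.5) == (y > 0.5)).mean()          # training accuracy
```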
We examine the role of big data and machine learning in cancer research. We describe an example in cancer research where gene-level data from The Cancer Genome Atlas (TCGA) consortium are interpreted using a pathway-level model. As the complexity of computational models increases, their sample requirements grow exponentially. This growth stems from the fact that the number of combinations of variables grows exponentially as the number of variables increases; thus, a large sample size is needed. The number of variables in a computational model can be reduced by incorporating biological knowledge. One particularly successful way of doing this is to use available gene regulatory, signaling, metabolic, or context-specific pathway information. We conclude that incorporating existing biological knowledge is essential for progress in using big data for cancer research.
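The claim that sample requirements grow exponentially can be made concrete: with n binary variables there are 2**n joint configurations, so observing each configuration even once needs exponentially many samples, and collapsing genes into a handful of pathways shrinks that count dramatically. The numbers below are illustrative, not from TCGA.

```python
# Combinatorial motivation for pathway-level models: variable count
# drives the number of joint configurations exponentially.
def joint_configurations(n_binary_vars: int) -> int:
    """Number of joint on/off configurations of n binary variables."""
    return 2 ** n_binary_vars

configs_gene_level = joint_configurations(20)    # 20 genes   -> 1,048,576
configs_pathway_level = joint_configurations(4)  # 4 pathways -> 16
```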
In big data business services and transactions, it is impossible for the cyber system to provide complete information to both parties of a service, so some service providers exploit malicious services to gain additional benefit. Trust management is an effective way to deal with such malicious actions. This paper presents a trust computing model based on service recommendation in big data. The model takes into account the difference in recommendation trust between familiar nodes and stranger nodes. To ensure the accuracy of recommendation trust computing, the paper proposes a fine-granularity similarity computing method based on the similarity of the service-concept domain ontology. The model computes the trust value of cyber service nodes more accurately and better prevents cheating and attacks by malicious service nodes. Experimental results show that the model is effective.
Multi-level searching is called drill-down search. At present, no drill-down search feature is available in existing search engines such as Google, Yahoo, Bing, and Baidu. Drill-down search is very useful for end users to find the exact results among huge paginated search results. Deeper drill-down search with category-based search yields the most accurate results, but it increases the number and size of the files in the system. The purpose of this manuscript is to implement a storage-reducing binary file system model for a category-based drill-down search engine that offers fast multi-level filtering. The proposed model stores the search engine data in a binary file system. To verify the effectiveness of the proposed file system model, 5 million unique keyword records are stored in a binary file and the system's efficiency is analyzed. Experimental results on real data demonstrate the speed and superiority of the storage model. Experiments show that the file system's expansion ratio is constant, that it reduces disk storage space by up to 30% compared with a conventional database/file system, and that it increases search performance at any level of search. The paper begins with a short introduction to drill-down search, followed by a detailed discussion of the key technologies used to implement the big data storage reduction system.
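A hedged sketch of the core idea behind such a binary file system: store each keyword entry as a fixed-size packed record rather than a database row, so any record can be located by offset arithmetic, which is what makes multi-level filtering fast. The record layout here (a 32-byte keyword plus two category ids) is a guess for illustration, not the paper's actual format.

```python
import struct

# Fixed-size record: 32 keyword bytes + 16-bit category and subcategory
# ids, little-endian with no padding, so record k lives at byte offset
# k * RECORD.size in the binary file.
RECORD = struct.Struct("<32sHH")

def pack(keyword: str, cat: int, subcat: int) -> bytes:
    """Serialize one keyword entry into a fixed-size binary record."""
    return RECORD.pack(keyword.encode("utf-8")[:32], cat, subcat)

def unpack(blob: bytes):
    """Deserialize a record; strip the NUL padding from the keyword."""
    kw, cat, subcat = RECORD.unpack(blob)
    return kw.rstrip(b"\x00").decode("utf-8"), cat, subcat

rec = pack("drill down search", 3, 7)
kw, cat, subcat = unpack(rec)
```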
This paper provides a new obstacle avoidance control method for cars based on big data and just-in-time modeling. Just-in-time modeling is a new kind of data-driven control technique in the age of big data and is used in various real systems. The main property of the proposed method is that the gain and the control time, which are the parameters of the control input for avoiding an encountered obstacle, are computed from a database containing a large amount of driving data from various situations. An important advantage of the method is its small computation time, which enables real-time obstacle avoidance control for cars. Numerical simulations show that the new control method makes the car avoid various obstacles more efficiently than the previous method.
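The just-in-time idea can be sketched as a nearest-neighbor lookup: rather than fitting one global model, keep a database of past driving situations and, when an obstacle appears, retrieve the most similar stored situations and reuse their control parameters. The data, the three situation features, and the averaging rule below are illustrative assumptions, not the paper's method.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Synthetic "database" of 1000 past driving situations and the control
# parameters (gain, control time) that were used in each of them.
rng = np.random.default_rng(2)
situations = rng.uniform(size=(1000, 3))   # e.g. speed, distance, lateral offset
params = rng.uniform(size=(1000, 2))       # stored (gain, control_time)

index = NearestNeighbors(n_neighbors=5).fit(situations)

query = np.array([[0.4, 0.6, 0.5]])        # the situation encountered now
_, idx = index.kneighbors(query)           # 5 most similar past situations
gain, control_time = params[idx[0]].mean(axis=0)  # local average = JIT estimate
```

The cheap index lookup, rather than a model refit, is what keeps the per-obstacle computation time small.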
In recent years we have been witnessing tremendous growth in the volume and availability of data. This results primarily from the emergence of a multitude of sources (e.g. computers, mobile devices, sensors, and social networks) that continuously produce structured, semi-structured, or unstructured data. Database Management Systems and Data Warehouses are no longer the only technologies used to store and analyze datasets, mainly because the volume and complex structure of today's data degrade their performance and scalability. Big Data is one of the current challenges, since it implies new requirements for data storage, processing, and visualization. Properly analyzing Big Data can nevertheless yield great advantages, because it allows the discovery of patterns and correlations in datasets. Users can use this processed information to gain deeper insights and to obtain business advantages. Thus, data modeling and data analytics have evolved so that huge amounts of data can be processed without compromising performance and availability, instead "relaxing" the usual ACID properties. This paper provides a broad view and discussion of the current state of this subject, with a particular focus on data modeling and data analytics, describing and clarifying the main differences between the three main approaches in these respects: operational databases, decision support databases, and Big Data technologies.
Big data has convincing merits for developing risk stratification strategies for diseases. The six "V"s of big data, namely volume, velocity, variety, veracity, value, and variability, have shown promise in real-world scenarios. Big data can be applied to analyze health data and advance research in preclinical biology, medicine, and especially disease initiation, development, and control. A study design comprises data selection, inclusion and exclusion criteria, standard confirmation and cohort establishment, follow-up strategy, and events of interest. The development and efficiency verification of a prognosis model consists of deciding on the data source, taking previous models as references while selecting candidate predictors, assessing model performance, choosing appropriate statistical methods, and optimizing the model. The model should be able to inform disease development and outcomes, such as predicting variceal rebleeding in patients with cirrhosis. Our work goes beyond that of other colleagues with respect to cirrhosis patient screening and the data source regarding variceal bleeding.
When designing large complex machinery products, the design focus is always on overall performance; however, no performance-driven design theory and method exist. In view of this deficiency in existing design theory, and according to the performance features of complex mechanical products, performance indices are introduced into the traditional "Requirement-Function-Structure" design theory to construct a new five-domain design theory of "Client Requirement-Function-Performance-Structure-Design Parameter". To support design practice based on this new theory, a product data model is established using performance indices and the mapping relationships between them and the other four domains. Applying the product data model to high-speed train design, and drawing on existing research results and relevant standards, the corresponding data model and its five-domain structure for high-speed trains are established. This provides technical support for studying the relationships between typical performance indices and design parameters and for the rapid creation of a high-speed train scheme design. The five domains provide a reference for the design specification and evaluation criteria of high-speed trains and a new idea for the train's parameter design.
In the transition of China's economy from high-speed growth to high-quality growth in the new era, economic practice is oriented toward fostering new growth drivers, developing new industries, and forming new models. Based on the data flow, big data effectively integrates technology, material, fund, and human resource flows and reveals new paths for the development of new growth drivers, new industries, and new models. Adopting an analytical framework with "macro-meso-micro" levels, this paper elaborates the theoretical mechanisms by which big data drives high-quality growth through efficiency improvements, upgrades of industrial structures, and business model innovations. It also explores the practical foundations of big data-driven high-quality growth, including technological advancement in big data, the development of big data industries, and the formulation of big data strategies. Finally, the paper proposes policy options for big data to promote high-quality growth: developing the digital economy, consolidating big data infrastructure construction, expediting the convergence of big data and the real economy, advocating a big data culture, and expanding financing options for big data.
With market competition becoming fiercer, enterprises must update their products by constantly assimilating new big data knowledge and private knowledge to maintain their market shares at different points in time in the big data environment. Typically, successive knowledge transfers influence one another if the time interval between them is not too long, so it is necessary to study the problem of continuous knowledge transfer in the big data environment. Building on research on one-time knowledge transfer, a model of continuous knowledge transfer is presented that can account for the interaction between transfers and determine the optimal knowledge transfer time at different points in time in the big data environment. Simulation experiments were performed by adjusting several parameters. The experimental results verify the model's validity and support conclusions regarding its practical application value. The results can help enterprises that must carry out continuous knowledge transfer in the big data environment make more effective decisions.
The 19th National Congress of the Communist Party of China has put forward higher requirements for Chinese government governance. Government governance has developed to a higher stage; meanwhile, it faces more challenges, such as a lack of top-level design and of information sharing. To develop an innovative model of government governance decision-making, we should make good use of big data mining in grassroots government data management networks. Both the characteristics of the times and practical experience have proven that big data can empower government governance and promote the construction of a service-oriented government.
In recent years, more and more research has focused on the intrinsic characteristics and spatial location of housing and explored the factors influencing urban housing prices from a micro perspective. As representatives of big cities, the national central cities' spatial distribution patterns of housing prices have attracted much attention. To connect the spatial distribution pattern of housing prices with research on their influencing factors, the reasons behind the spatial distribution patterns of housing prices in three national central cities, Beijing, Wuhan, and Chongqing, are explored. The results show that: ① urban housing prices are affected by many factors, and owing to the different social and economic conditions in each city, proximity to expressways, city squares, universities, and living facilities, as well as the characteristics of companies and enterprises, influence prices in Beijing, Wuhan, and Chongqing in different directions; ② the various factors have different value-added effects on housing prices in different cities, with ring-road location having the greatest price-increasing effect in Beijing and Wuhan, and metro stations having the greatest price-increasing effect in Chongqing.
This paper describes an automatic system for 3D big data face modeling using front- and side-view images taken with an ordinary digital camera, with orthogonal viewing directions. The paper covers four key aspects of 3D visualization. First, we study 3D big data face modeling, including facial feature extraction from 2D images. Second, we present techniques from computer vision and image processing, together with a new method for extracting information from images and creating 3D models. Third, 3D face modeling from 2D images is implemented in C# using the EMGU CV library and the XNA framework. Finally, we design experiments, run tests, and record results to measure the performance of our method.
Recently, haze in China has become more and more serious, yet it is very difficult to model and control. Here, a data-driven model is introduced for the simulation and monitoring of China's haze. First, a multi-dimensional evaluation system is built to evaluate government performance on haze control. Second, a data-driven model is employed to reveal the operating mechanism of China's haze, described as a multi-input, multi-output system. Third, a prototype system is set up to verify the proposed scheme, and the result provides a graphical tool for monitoring different haze control strategies.
Car-following models are the research basis of traffic flow theory and microscopic traffic simulation. In previous work, theory-driven models are dominant, while data-driven ones are relatively rare. In recent years, the technologies of Intelligent Transportation Systems (ITS), represented by Vehicle-to-Everything (V2X) technology, have been developing rapidly. Using these ITS technologies, large-scale, high-quality microscopic vehicle trajectory data can be acquired, which provides the research foundation for modeling car-following behavior with data-driven methods. Accordingly, a data-driven car-following model based on the Random Forest (RF) method was constructed in this work, and the Next Generation Simulation (NGSIM) dataset was used to calibrate and train the model. An Artificial Neural Network (ANN) model, the GM model, and the Full Velocity Difference (FVD) model were employed to comparatively verify the proposed model. The results suggest that the proposed model can accurately describe car-following behavior, with better performance under multiple performance indicators.
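A minimal sketch of a random-forest car-following model in the spirit described above: predict the following car's acceleration from the gap to the leader, the speed difference, and the follower's own speed. Synthetic data stand in for NGSIM trajectories, and both the feature set and the stimulus-response law that generates the labels are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic car-following samples: (gap, speed difference, own speed)
# -> acceleration. The generating law below mimics a simple
# stimulus-response model plus noise; real work would use NGSIM data.
rng = np.random.default_rng(3)
n = 2000
gap = rng.uniform(5, 50, n)      # spacing to the leader [m]
dv = rng.normal(0, 2, n)         # leader speed minus follower speed [m/s]
v = rng.uniform(0, 20, n)        # follower speed [m/s]
acc = 0.5 * dv + 0.05 * (gap - 2 * v) + rng.normal(0, 0.1, n)

X = np.column_stack([gap, dv, v])
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, acc)
r2 = model.score(X, acc)         # in-sample fit quality (R^2)
```

In practice the model would be evaluated on held-out trajectories rather than by in-sample R^2, which is where the comparison against the ANN, GM, and FVD models comes in.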
Funding: Supported in part by the National Natural Science Foundation of China (61322110, 6141101115) and the Doctoral Fund of the Ministry of Education (201300051100013).
Abstract: Recently, the internet has stimulated explosive progress in knowledge discovery from big-volume data resources, mining valuable hidden rules by computation. Simultaneously, wireless channel measurement data exhibit big-volume features, considering massive antennas, huge bandwidth, and versatile application scenarios. This article first presents a comprehensive survey of channel measurement and modeling research for mobile communication, especially for the 5th Generation (5G) and beyond. Considering progress in big data research, a cluster-nuclei based model is then proposed, which takes advantage of both stochastic and deterministic models. The novel model has low complexity owing to the limited number of cluster nuclei, while each cluster nucleus maps physically to a real propagation object. By combining channel property variation principles with the antenna size, frequency, mobility, and scenario information dug from the channel data, the proposed model can be extended to versatile applications to support future mobile research.
Funding: RPSEA and the U.S. Department of Energy partially funded this study.
Abstract: Hydrocarbon production from shale has attracted much attention in recent years. When applied to these prolific and hydrocarbon-rich resource plays, our understanding of the complexities of the flow mechanism (the sorption process and flow behavior in complex fracture systems, induced or natural) leaves much to be desired. In this paper, we present and discuss a novel approach to modeling and history matching of hydrocarbon production from a Marcellus shale asset in southwestern Pennsylvania using advanced data mining, pattern recognition, and machine learning technologies. In this new approach, instead of imposing our understanding of the flow mechanism, the impact of multi-stage hydraulic fractures, and the production process on the reservoir model, we allow the production history, well logs, completion, and hydraulic fracturing data to guide our model and determine its behavior. The uniqueness of this technology is that it incorporates the so-called "hard data" directly into the reservoir model, so that the model can be used to optimize the hydraulic fracturing process. The "hard data" refers to field measurements during the hydraulic fracturing process, such as fluid and proppant type and amount, injection pressure and rate, and proppant concentration. This novel approach contrasts with the current industry focus on the use of "soft data" (non-measured, interpretive data such as fracture length, width, height, and conductivity) in reservoir models. The study focuses on a Marcellus shale asset that includes 135 wells with multiple pads, different landing targets, well lengths, and reservoir properties. The full-field history matching process was successfully completed using this data-driven approach, capturing the production behavior with acceptable accuracy for individual wells and for the entire asset.
Funding: Supported by the 2020 Foshan Science and Technology Project (No. 2020001005356), granted to Baoling Qin.
Abstract: Although the Internet of Things (IoT) has been widely applied, problems remain in applying cloud computing to the collection, processing, analysis, and storage of digital smart-medical big data, especially the low efficiency of medical diagnosis. With the wide application of the IoT and big data in the medical field, medical big data is growing geometrically, resulting in cloud service overload, insufficient storage, communication delay, and network congestion. To solve these medical and network problems, a medical-big-data-oriented fog computing architecture with an application of the BP algorithm is proposed, and its structural advantages and characteristics are studied. This architecture enables the medical big data generated by medical edge devices and the existing data in the cloud service center to be calculated, compared, and analyzed at fog nodes through the IoT. The diagnosis results are designed to reduce business processing delay and improve the diagnosis effect. Considering the weak computing power of each edge device, the artificial-intelligence BP neural network algorithm is used in the core computing model of the medical diagnosis system to improve system computing power, enhance intelligent medical decision support, and improve clinical diagnosis and treatment efficiency. In the application process, combining the characteristics of medical big data technology, fog architecture design, and big data technology integration, we study the processing and analysis of the heterogeneous data of a medical diagnosis system in the IoT context. The results are promising: the medical platform network is smooth, the data storage space is sufficient, data processing and analysis are fast, and the diagnosis effect is remarkable, making the system a good assistant to doctors. It not only effectively solves the problem of low clinical diagnosis and treatment efficiency and quality, but also reduces the waiting time of patients, effectively eases doctor-patient tension, and improves medical service quality and management.
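The core computing step named above, a back-propagation (BP) neural network, can be sketched in plain Python. The XOR-style toy data, layer width, learning rate, and epoch count are illustrative stand-ins, not the paper's diagnostic model:

```python
import math
import random

random.seed(0)

# Toy stand-in for diagnostic feature vectors with binary labels.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]

H = 4  # hidden-layer width
w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
w2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = [0.0]  # boxed in a list so the train loop can mutate it

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x):
    h = [sigmoid(w1[j][0] * x[0] + w1[j][1] * x[1] + b1[j]) for j in range(H)]
    y = sigmoid(sum(w2[j] * h[j] for j in range(H)) + b2[0])
    return h, y

def loss():
    return sum((forward(x)[1] - t) ** 2 for x, t in data)

def train(epochs, lr=0.5):
    for _ in range(epochs):
        for x, t in data:
            h, y = forward(x)
            d_out = (y - t) * y * (1.0 - y)  # delta at the output unit
            for j in range(H):
                # back-propagated delta, computed before w2[j] is updated
                d_hid = d_out * w2[j] * h[j] * (1.0 - h[j])
                w2[j] -= lr * d_out * h[j]
                w1[j][0] -= lr * d_hid * x[0]
                w1[j][1] -= lr * d_hid * x[1]
                b1[j] -= lr * d_hid
            b2[0] -= lr * d_out

before = loss()
train(4000)
after = loss()
```

The smallness of this loop is the point: a network like this can run on a weak fog/edge node, with the cloud reserved for bulk storage and model refresh.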
Abstract: We examine the role of big data and machine learning in cancer research. We describe an example in cancer research where gene-level data from The Cancer Genome Atlas (TCGA) consortium are interpreted using a pathway-level model. As the complexity of computational models increases, their sample requirements grow exponentially, because the number of combinations of variables grows exponentially as the number of variables increases; thus, a large sample size is needed. The number of variables in a computational model can be reduced by incorporating biological knowledge. One particularly successful way of doing this is to use available gene regulatory, signaling, metabolic, or context-specific pathway information. We conclude that the incorporation of existing biological knowledge is essential for progress in using big data for cancer research.
Abstract: In big data business services and transactions, it is impossible for a cyber system to provide complete information to both parties of a service, so some service providers exploit malicious services to gain extra benefit. Trust management is an effective solution for dealing with such malicious actions. This paper presents a trust computing model based on service recommendation in big data. The model accounts for the difference in recommendation trust between familiar nodes and stranger nodes. To ensure the accuracy of recommendation-trust computing, the paper proposes a fine-granularity similarity computing method based on the similarity of the service-concept domain ontology. The model computes the trust value of cyber service nodes more accurately and better prevents cheating and attacks by malicious service nodes. Experimental results illustrate that the model is effective.
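A rough illustration of similarity-weighted recommendation trust follows. The paper's ontology-based concept-domain similarity is replaced here by a simple rating-difference measure, and all node names and numbers are invented:

```python
# A node's trust in a service blends direct experience with recommendations,
# where each recommender's weight is scaled by how similar its past ratings
# are to ours (familiar, like-minded nodes count more than strangers).

def similarity(ratings_a, ratings_b):
    """Similarity over commonly rated services: 1 minus mean absolute difference."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    return 1.0 - sum(abs(ratings_a[s] - ratings_b[s]) for s in common) / len(common)

def trust(direct, recommendations, my_ratings, alpha=0.6):
    """Blend direct trust with similarity-weighted recommended trust."""
    pairs = [(similarity(my_ratings, r["ratings"]), r["trust"]) for r in recommendations]
    total = sum(w for w, _ in pairs)
    rec = sum(w * t for w, t in pairs) / total if total else direct
    return alpha * direct + (1 - alpha) * rec

my_ratings = {"s1": 0.9, "s2": 0.8}
recs = [
    {"trust": 0.9, "ratings": {"s1": 0.85, "s2": 0.8}},  # familiar, similar node
    {"trust": 0.1, "ratings": {"s1": 0.2, "s2": 0.1}},   # stranger with divergent ratings
]
val = trust(0.8, recs, my_ratings)
print(round(val, 3))
```

The divergent stranger gets a small weight, so a dishonest recommendation moves the final trust value only slightly, which is the cheating-resistance property the abstract describes.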
Abstract: Multi-level searching is called drill-down search. Currently, no drill-down search feature is available in existing search engines such as Google, Yahoo, Bing, and Baidu. Drill-down search is very useful for end users to find exact results among huge paginated search results. A deeper drill-down search with category-based searching yields the most accurate results, but it increases the number and size of the files in the file system. The purpose of this manuscript is to implement a big data storage-reduction binary file system model for a category-based drill-down search engine that offers fast multi-level filtering. The proposed model stores the search engine data in a binary file system. To verify the effectiveness of the proposed file system model, 5 million unique keyword records are stored in a binary file, and the proposed file system is analyzed for efficiency. Experimental results based on real data show the speed and superiority of our storage model: the file expansion ratio is constant, disk storage space is reduced by up to 30% compared with a conventional database/file system, and search performance increases for any level of search. The paper starts with a short introduction to drill-down search, followed by a detailed discussion of the key technologies used to implement the big data storage-reduction system.
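One common way a binary file beats a text-based store for this kind of index is fixed-width records, which allow constant-time seeks. This sketch is illustrative only; the record layout is invented, not the paper's format:

```python
# Fixed-width binary records let a drill-down index jump straight to the
# i-th entry with a single seek, instead of scanning variable-length rows.
import struct

REC = struct.Struct("<16sII")  # keyword (16 bytes), category id, result count

def write_index(path, rows):
    with open(path, "wb") as f:
        for kw, cat, cnt in rows:
            f.write(REC.pack(kw.encode()[:16].ljust(16, b"\0"), cat, cnt))

def read_record(path, i):
    with open(path, "rb") as f:
        f.seek(i * REC.size)  # constant-time jump to record i
        kw, cat, cnt = REC.unpack(f.read(REC.size))
        return kw.rstrip(b"\0").decode(), cat, cnt

write_index("idx.bin", [("big data", 1, 42), ("drill down", 2, 7)])
print(read_record("idx.bin", 1))  # → ('drill down', 2, 7)
```

Because every record costs exactly `REC.size` bytes, the file grows linearly with the keyword count, matching the constant expansion ratio the abstract reports.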
Abstract: This paper provides a new obstacle avoidance control method for cars based on big data and just-in-time modeling. Just-in-time modeling is a new kind of data-driven control technique in the age of big data and is used in various real systems. The main property of the proposed method is that a gain and a control time, which are parameters in the control input for avoiding an encountered obstacle, are computed from a database containing a large amount of driving data from various situations. An important advantage of the method is its small computation time, which enables real-time obstacle avoidance control for cars. Numerical simulations show that the new control method makes the car avoid various obstacles more efficiently than the previous method.
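The lazy, database-lookup flavor of just-in-time modeling can be sketched as a nearest-neighbor query. The state encoding, the stored records, and the inverse-distance weighting below are invented for illustration, not taken from the paper:

```python
# Instead of one global model, retrieve the k most similar past driving
# situations and locally average their stored gain and control time.
import math

# database rows: (relative_distance_m, relative_speed_mps, gain, control_time_s)
db = [
    (10.0, 5.0, 0.80, 1.2),
    (12.0, 4.0, 0.70, 1.1),
    (30.0, 2.0, 0.30, 0.6),
    (28.0, 1.5, 0.35, 0.7),
]

def jit_query(state, k=2):
    """Inverse-distance-weighted average over the k nearest records."""
    dist = lambda row: math.dist(state, row[:2])
    nearest = sorted(db, key=dist)[:k]
    w = [1.0 / (dist(r) + 1e-9) for r in nearest]
    gain = sum(wi * r[2] for wi, r in zip(w, nearest)) / sum(w)
    t = sum(wi * r[3] for wi, r in zip(w, nearest)) / sum(w)
    return gain, t

gain, t = jit_query((11.0, 4.5))
```

The per-query cost is one neighbor search over the database, which is why the computation time stays small enough for real-time use.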
Abstract: In recent years we have been witnessing tremendous growth in the volume and availability of data. This results primarily from the emergence of a multitude of sources (e.g., computers, mobile devices, sensors, and social networks) that continuously produce structured, semi-structured, or unstructured data. Database management systems and data warehouses are no longer the only technologies used to store and analyze datasets, largely because the volume and complex structure of today's data degrade their performance and scalability. Big Data is one of the recent challenges, since it implies new requirements for data storage, processing, and visualization. Despite that, properly analyzing Big Data can confer great advantages, because it allows discovering patterns and correlations in datasets; users can use this processed information to gain deeper insights and business advantages. Thus, data modeling and data analytics have evolved so that we can process huge amounts of data without compromising performance and availability, instead "relaxing" the usual ACID properties. This paper provides a broad view and discussion of the current state of this subject, with a particular focus on data modeling and data analytics, describing and clarifying the main differences between the three main approaches in these respects: operational databases, decision support databases, and Big Data technologies.
Abstract: Big data has convincing merits for developing risk stratification strategies for diseases. The six "V"s of big data, namely volume, velocity, variety, veracity, value, and variability, have shown promise for real-world scenarios. Big data can be applied to analyze health data and advance research in preclinical biology, medicine, and especially disease initiation, development, and control. A study design comprises data selection, inclusion and exclusion criteria, standard confirmation and cohort establishment, follow-up strategy, and events of interest. The development and efficiency verification of a prognosis model consists of deciding the data source, taking previous models as references while selecting candidate predictors, assessing model performance, choosing appropriate statistical methods, and model optimization. The model should be able to inform disease development and outcomes, such as predicting variceal rebleeding in patients with cirrhosis. Our work has merits beyond those of other studies with respect to cirrhosis patient screening and the data source for variceal bleeding.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 51275432, 51505390), Sichuan Application Foundation Projects (Grant No. 2016JY0098), and the Independent Research Project of TPL (Grant No. TPL1501).
Abstract: When designing large-sized complex machinery products, the design focus is always on overall performance; however, no performance-driven design theory and method exists. In view of this deficiency in existing design theory, and according to the performance features of complex mechanical products, performance indices are introduced into the traditional "Requirement-Function-Structure" design theory to construct a new five-domain design theory of "Client Requirement-Function-Performance-Structure-Design Parameter". To support design practice based on this new theory, a product data model is established using performance indices and the mapping relationships between them and the other four domains. The product data model is applied to high-speed train design; combined with existing research results and relevant standards, the corresponding data model and its five-domain structure for high-speed trains are established, which can provide technical support for studying the relationships between typical performance indices and design parameters and for quickly achieving a high-speed train scheme design. The five domains provide a reference for the design specification and evaluation criteria of high-speed trains and a new idea for the train's parameter design.
Funding: Funded by the Program for "Sanqin Scholar" Innovation Teams in Shaanxi Province (SZTZ [2018] No. 34); "Research on the Mechanism, Effect Evaluation, and Policy Support of Replacing Business Tax with VAT in Promoting the Industrial Structure Upgrade of China", funded by the Humanities and Social Science Youth Foundation of the Ministry of Education of China (18YJC790078); and "Evaluation and Study of the Effect of Promoting Industrial Transformation and Upgrade of Shaanxi by Replacing Business Tax with Value-Added Tax", funded by the Social Science Foundation Project of Shaanxi Province (2017D037).
Abstract: In the transition of China's economy from high-speed growth to high-quality growth in the new era, economic practice is oriented toward fostering new growth drivers, developing new industries, and forming new models. Based on the data flow, big data effectively integrates technology, material, fund, and human resource flows and reveals new paths for the development of new growth drivers, new industries, and new models. Adopting an analytical framework with macro, meso, and micro levels, this paper elaborates on the theoretical mechanisms by which big data drives high-quality growth through efficiency improvements, upgrades of industrial structures, and business model innovations. It also explores the practical foundations for big-data-driven high-quality growth, including technological advancements in big data, the development of big data industries, and the formulation of big data strategies. Finally, this paper proposes policy options for big data promoting high-quality growth in terms of developing the digital economy, consolidating big data infrastructure construction, expediting the convergence of big data and the real economy, advocating a big data culture, and expanding financing options for big data.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 71704016, 71331008), the Natural Science Foundation of Hunan Province (Grant No. 2017JJ2267), Key Projects of the Chinese Ministry of Education (17JZD022), and the Project of the China Scholarship Council for Overseas Studies (201208430233, 201508430121), which are acknowledged.
Abstract: With market competition becoming fiercer, enterprises must update their products by constantly assimilating new big data knowledge and private knowledge to maintain their market shares at different time points in the big data environment. Typically, successive knowledge transfers influence one another if the time interval between them is not too long, so it is necessary to study the problem of continuous knowledge transfer in the big data environment. Building on research on one-time knowledge transfer, a model of continuous knowledge transfer is presented that can account for the interaction between knowledge transfers and determine the optimal knowledge transfer time at different time points in the big data environment. Simulation experiments were performed by adjusting several parameters. The experimental results verified the model's validity and supported conclusions regarding its practical application value. The results can inform more effective decisions for enterprises that must carry out continuous knowledge transfer in the big data environment.
Abstract: The 19th National Congress of the Communist Party of China put forward higher requirements for Chinese government governance. Government governance has developed to a higher stage, but it also faces more challenges, such as the lack of top-level design and of information sharing. To develop an innovative model of government governance decision-making, we should make good use of big data mining in grassroots government data management networks. Both the characteristics of the times and practical experience have proven that big data can empower government governance and promote the construction of a service-oriented government.
Funding: Sponsored by the National Natural Science Foundation of China (51808413), the General Project of the Hubei Social Science Fund (2018193), the Innovation and Entrepreneurship Training Program for College Students in Hubei Province (S201910490024), and the University-level Graduate Innovation Fund of Wuhan Institute of Technology (CX2019036).
Abstract: In recent years, more and more research has focused on the intrinsic characteristics and spatial location of housing, exploring the influencing factors of urban housing prices from a micro perspective. As representatives of big cities, the spatial distribution patterns of housing prices in national central cities have attracted much attention. In order to relate the spatial distribution pattern of housing prices to research on their influencing factors, the reasons behind the spatial distribution patterns of housing prices in three national central cities, Beijing, Wuhan, and Chongqing, are explored. The results show that: ① Urban housing prices are affected by many factors. Owing to the different social and economic conditions of each city, the direction of influence of proximity to expressways, city squares, universities, and living facilities, and of the characteristics of companies and enterprises, differs among Beijing, Wuhan, and Chongqing. ② The various factors have different value-added effects on housing prices in different cities. The ring-road location has the greatest price-increasing effect in Beijing and Wuhan, while proximity to a metro station has the greatest effect in Chongqing.
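The kind of analysis behind such findings is hedonic regression. Below is a minimal one-variable ordinary-least-squares sketch with made-up figures; real studies regress price on many locational and structural attributes simultaneously:

```python
# Regress unit price on distance to the nearest metro station by OLS,
# echoing the finding that metro proximity adds value in Chongqing.
# Observations are invented: (distance_to_metro_km, price_thousand_yuan_per_sqm)
obs = [(0.2, 52.0), (0.5, 48.0), (1.0, 45.0), (2.0, 38.0), (3.0, 33.0)]

n = len(obs)
mean_x = sum(d for d, _ in obs) / n
mean_y = sum(p for _, p in obs) / n

# Closed-form simple-regression estimates.
slope = (sum((d - mean_x) * (p - mean_y) for d, p in obs)
         / sum((d - mean_x) ** 2 for d, _ in obs))
intercept = mean_y - slope * mean_x

print(slope, intercept)  # slope < 0: each extra km from the metro lowers the price
```

A negative, statistically significant slope is what a "price-increasing effect of metro stations" means in regression terms; the multi-city comparisons in the abstract amount to comparing such coefficients across cities.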
Funding: The paper is partly supported by: 1. the Fund for PhD Supervisors from the China Institute Committee (20132304110018); 2. the Natural Science Fund of Heilongjiang Province (F201246); 3. the National Natural Science Foundation of China under Grant 61272184.
Abstract: This paper describes an automatic system for 3D big data face modeling using front- and side-view images taken by an ordinary digital camera in orthogonal directions. The paper consists of four key parts in 3D visualization. First, we study 3D big data face modeling, including facial feature extraction from 2D images. The second part presents techniques from computer vision and image processing, together with a new method for extracting information from images and creating a 3D model. Third, 3D face modeling based on 2D images is implemented in the C# language with the EMGU CV library and the XNA framework. Finally, we design experiments, run tests, and record results to measure the performance of our method.
Abstract: Recently, the haze in China has become more and more serious, but it is very difficult to model and control. Here, a data-driven model is introduced for the simulation and monitoring of China's haze. First, a multi-dimensional evaluation system is built to evaluate government performance on haze control. Second, a data-driven model is employed to reveal the operating mechanism of China's haze, described as a multi-input, multi-output system. Third, a prototype system is set up to verify the proposed scheme, and the result provides a graphical tool for monitoring different haze control strategies.
Abstract: Car-following models are the research basis of traffic flow theory and microscopic traffic simulation. In previous work, theory-driven models have been dominant, while data-driven ones are relatively rare. In recent years, the technologies of Intelligent Transportation Systems (ITS), represented by Vehicle-to-Everything (V2X) technology, have been developing rapidly. Using these ITS technologies, large-scale, high-quality microscopic vehicle trajectory data can be acquired, which provides the research foundation for modeling car-following behavior with data-driven methods. Accordingly, a data-driven car-following model based on the Random Forest (RF) method was constructed in this work, and the Next Generation Simulation (NGSIM) dataset was used to calibrate and train the constructed model. The Artificial Neural Network (ANN), GM, and Full Velocity Difference (FVD) models were employed to comparatively verify the proposed model. The research results suggest that the proposed model can accurately describe car-following behavior, with better performance under multiple performance indicators.
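A minimal sketch of the approach, assuming scikit-learn and substituting synthetic trajectories for the NGSIM data (the input features and the GM-like rule generating the labels are illustrative assumptions):

```python
# A Random Forest maps (spacing, relative speed, own speed) to the
# follower's acceleration, the supervised framing used by data-driven
# car-following models.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500
spacing = rng.uniform(5, 50, n)      # headway to the leader, m
rel_speed = rng.uniform(-5, 5, n)    # leader speed minus follower speed, m/s
speed = rng.uniform(0, 30, n)        # follower speed, m/s

# A GM-like stimulus-response rule plus noise generates "observed" accelerations.
accel = 0.5 * rel_speed * speed / spacing + rng.normal(0, 0.05, n)

X = np.column_stack([spacing, rel_speed, speed])
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, accel)
r2 = model.score(X, accel)  # in-sample fit quality
```

With real trajectory data, the fitted model replaces a hand-derived stimulus-response equation: given the current driving state, it predicts the follower's acceleration at each simulation step.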