Brain tissue is one of the softest parts of the human body, composed of white matter and grey matter. The mechanical behavior of brain tissue plays an essential role in regulating brain morphology and brain function. Moreover, traumatic brain injury (TBI) and various brain diseases are also greatly influenced by the brain's mechanical properties. In both white and grey matter, brain tissue contains multiscale structures composed of neurons, glial cells, fibers, blood vessels, etc., each with different mechanical properties. As such, brain tissue exhibits complex mechanical behavior, usually with strong nonlinearity, heterogeneity, and directional dependence. Building a constitutive law for multiscale brain tissue using traditional function-based approaches can be very challenging. Instead, this paper proposes a data-driven approach to establish the desired mechanical model of brain tissue. We focus on blood vessels with internal pressure embedded in a white or grey matter matrix material to demonstrate our approach. The matrix is described by an isotropic or anisotropic nonlinear elastic model. A representative unit cell (RUC) with blood vessels is built and used to generate stress-strain data under different internal blood pressures and various proportional displacement loading paths. The generated stress-strain data are then used to train a mechanical law, via artificial neural networks, that predicts the macroscopic mechanical response of brain tissue under different internal pressures. Finally, the trained material model is implemented in finite element software to predict the mechanical behavior of a whole brain under intracranial pressure and distributed body forces. Compared with a direct numerical simulation that employs a reference material model, our proposed approach greatly reduces the computational cost and improves modeling efficiency. The predictions made by our trained model demonstrate sufficient accuracy. Specifically, we find that the level of internal blood pressure can greatly influence the stress distribution and determine the possible related damage behaviors.
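The surrogate-modeling step described above can be illustrated with a toy sketch: a one-hidden-layer network fitted by stochastic gradient descent to synthetic uniaxial stress-strain pairs. The synthetic response function, network size, and learning rate are illustrative assumptions, not the paper's actual RUC data or architecture.

```python
import math
import random

random.seed(0)

# Synthetic "RUC" data: a nonlinear stress response sigma(eps) (illustrative only).
data = [(e, 1.5 * e + 4.0 * e ** 3) for e in [i / 50.0 for i in range(51)]]

# One-hidden-layer network: sigma_hat = sum_j v[j] * tanh(w[j] * eps + b[j]) + c
H = 8
w = [random.uniform(-1, 1) for _ in range(H)]
b = [0.0] * H
v = [random.uniform(-1, 1) for _ in range(H)]
c = 0.0

def predict(eps):
    return sum(v[j] * math.tanh(w[j] * eps + b[j]) for j in range(H)) + c

def mse():
    return sum((predict(e) - s) ** 2 for e, s in data) / len(data)

lr = 0.02
initial = mse()
for epoch in range(3000):
    for e, s in data:
        h = [math.tanh(w[j] * e + b[j]) for j in range(H)]
        err = predict(e) - s                 # d(loss)/d(sigma_hat), up to a factor of 2
        c -= lr * err
        for j in range(H):
            g = err * v[j] * (1 - h[j] ** 2)  # back-propagate through tanh
            v[j] -= lr * err * h[j]
            w[j] -= lr * g * e
            b[j] -= lr * g
final = mse()
```

Once trained, such a surrogate can be evaluated point-wise inside a finite element loop far more cheaply than resolving the full RUC at every integration point.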
This paper presents a method for measuring stress fields within the framework of coupled data models, aimed at determining stress fields in isotropic material structures exhibiting localized deterioration behavior without relying on constitutive equations in the deteriorated region. This approach contributes to advancing the field of constitutive-equation-free mechanics. The methodology combines measured strain fields with data-model coupling driven algorithms. The gradient and Canny operators are used to process the strain field data, enabling the location of the deterioration region to be determined. Meanwhile, an adaptive model-building method is proposed for constructing coupling-driven models. To address the issue of unknown datasets during computation, a dataset-updating strategy based on a differential evolution algorithm is introduced. The resulting optimal dataset is then used to generate stress field results. Validation against finite element method calculations demonstrates the accuracy of the proposed method in obtaining full-field stresses in specimens with local degradation behavior.
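The localization step can be sketched as follows: compute the finite-difference gradient magnitude of a measured strain field and flag cells where it exceeds a threshold. The synthetic field, patch location, and threshold are illustrative assumptions; the paper additionally applies a Canny edge detector, which is omitted here.

```python
# Locate a deterioration region by thresholding the strain-gradient magnitude.
N = 40
# Smooth background strain with a localized "deterioration" patch added.
field = [[0.001 * (i + j) for j in range(N)] for i in range(N)]
for i in range(18, 23):
    for j in range(18, 23):
        field[i][j] += 0.05

def gradient_magnitude(f):
    g = [[0.0] * N for _ in range(N)]
    for i in range(1, N - 1):
        for j in range(1, N - 1):
            gx = (f[i + 1][j] - f[i - 1][j]) / 2.0   # central difference in x
            gy = (f[i][j + 1] - f[i][j - 1]) / 2.0   # central difference in y
            g[i][j] = (gx * gx + gy * gy) ** 0.5
    return g

g = gradient_magnitude(field)
threshold = 0.01  # above the smooth background gradient (~0.0014)
flagged = [(i, j) for i in range(N) for j in range(N) if g[i][j] > threshold]

# Bounding box of the detected deterioration region:
rows = [i for i, _ in flagged]
cols = [j for _, j in flagged]
box = (min(rows), max(rows), min(cols), max(cols))
```

In the paper's workflow, the region found this way is excluded from the constitutive description and handled by the data-model coupling instead.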
Hydrocarbon production from shale has attracted much attention in recent years. When applied to these prolific, hydrocarbon-rich resource plays, our understanding of the complexities of the flow mechanism (the sorption process and flow behavior in complex fracture systems, induced or natural) leaves much to be desired. In this paper, we present and discuss a novel approach to modeling and history matching of hydrocarbon production from a Marcellus shale asset in southwestern Pennsylvania using advanced data mining, pattern recognition, and machine learning technologies. In this new approach, instead of imposing our understanding of the flow mechanism, the impact of multi-stage hydraulic fractures, and the production process on the reservoir model, we allow the production history, well log, completion, and hydraulic fracturing data to guide our model and determine its behavior. The uniqueness of this technology is that it incorporates the so-called "hard data" directly into the reservoir model, so that the model can be used to optimize the hydraulic fracturing process. "Hard data" refers to field measurements taken during the hydraulic fracturing process, such as fluid and proppant type and amount, injection pressure and rate, and proppant concentration. This novel approach contrasts with the current industry focus on the use of "soft data" (non-measured, interpretive data such as fracture length, width, height, and conductivity) in reservoir models. The study focuses on a Marcellus shale asset that includes 135 wells with multiple pads, different landing targets, well lengths, and reservoir properties. The full-field history matching process was successfully completed using this data-driven approach, capturing the production behavior with acceptable accuracy for individual wells and for the entire asset.
During the past few decades, mobile wireless communications have experienced four generations of technological revolution, from 1G to 4G, and the deployment of the latest 5G networks is expected to take place in 2019. One fundamental question is how we can push forward the development of mobile wireless communications when they have become extremely complex and sophisticated systems. We believe that the answer lies in the huge volumes of data produced by the network itself, and machine learning may become the key to exploiting such information. In this paper, we elaborate on why the conventional model-based paradigm, which has been widely proven useful in pre-5G networks, can be less efficient or even impractical in future 5G and beyond mobile networks. Then, we explain how the data-driven paradigm, using state-of-the-art machine learning techniques, can become a promising solution. Finally, we provide a typical use case of the data-driven paradigm, proactive load balancing, in which online learning is used to adjust cell configurations in advance to avoid burst congestion caused by rapid traffic changes.
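The proactive load balancing use case can be sketched with a toy online forecaster: an exponentially weighted average predicts each cell's next-interval load, and users are shifted away from a cell before its predicted load crosses capacity. The traffic traces, capacity, trigger levels, and smoothing factor are all illustrative assumptions, not the paper's actual scheme.

```python
capacity = 100.0
alpha = 0.5  # smoothing factor for the online forecast

def run(loads_a, loads_b):
    """Return the time steps at which a proactive handover is triggered."""
    fa = loads_a[0]
    fb = loads_b[0]
    handovers = []
    for t in range(1, len(loads_a)):
        # Online update of the forecasts from the latest measurements.
        fa = alpha * loads_a[t] + (1 - alpha) * fa
        fb = alpha * loads_b[t] + (1 - alpha) * fb
        if fa > 0.9 * capacity and fb < 0.5 * capacity:
            # Act in advance: move some users from cell A to cell B
            # before A actually congests.
            handovers.append(t)
    return handovers

# Cell A ramps toward congestion while cell B stays lightly loaded.
ramp_a = [40, 55, 70, 85, 95, 99]
flat_b = [30, 30, 32, 31, 30, 29]
events = run(ramp_a, flat_b)
```

A reactive scheme would wait until the measured load exceeded capacity; the forecast lets the trigger fire one interval earlier, which is the essence of "proactive" here.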
Fault prognosis mainly refers to the estimation of the operating time before a failure occurs, which is vital for ensuring the stability, safety, and long lifetime of degrading industrial systems. Based on the results of fault prognosis, the maintenance strategy for the underlying industrial systems can shift from passive maintenance to active maintenance. With the increased complexity and improved automation level of industrial systems, fault prognosis techniques have become more and more indispensable. In particular, data-driven prognosis approaches, which tend to find the hidden fault factors and determine the specific fault occurrence time of the system by analysing historical or real-time measurement data, have gained great attention from different industrial sectors. In this context, the major task of this paper is to present a systematic overview of data-driven fault prognosis for industrial systems. Firstly, the characteristics of different prognosis methods are reviewed, with data-based ones highlighted. Moreover, based on the different data characteristics that exist in industrial systems, the corresponding fault prognosis methodologies are illustrated, with emphasis on analyses and comparisons of the different prognosis methods. Finally, we reveal current research trends and look forward to future challenges in this field. This review is expected to serve as a tutorial and source of references for fault prognosis researchers.
To achieve zero-defect production in computer numerical control (CNC) machining processes, it is imperative to develop effective diagnosis systems that detect anomalies efficiently. However, due to the dynamic conditions of the machine and tooling during machining, the diagnosis systems currently adopted in industry are inadequate. To address this issue, this paper presents a novel data-driven diagnosis system for anomalies. In this system, power data for condition monitoring are continuously collected during dynamic machining processes to support online diagnosis analysis. To facilitate the analysis, preprocessing mechanisms have been designed to de-noise, normalize, and align the monitored data. Important features are extracted from the monitored data, and thresholds are defined to identify anomalies. Considering the dynamic conditions of the machine and tooling during machining, the thresholds used to identify anomalies can vary. Based on historical data, the threshold values are optimized using a fruit fly optimization (FFO) algorithm to achieve more accurate detection. Practical case studies were used to validate the system, demonstrating its potential and effectiveness for industrial applications.
This paper presents a simple nonparametric regression approach to data-driven computing in elasticity. We apply kernel regression to the material data set and formulate a system of nonlinear equations that is solved to obtain the static equilibrium state of an elastic structure. Preliminary numerical experiments illustrate that, compared with existing methods, the proposed method finds a reasonable solution even when the data points in a given material data set are sparsely distributed.
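The kernel regression step can be illustrated with Nadaraya-Watson smoothing over a scattered one-dimensional (strain, stress) data set: each prediction is a kernel-weighted average of the observed stresses. The data points and bandwidth below are illustrative assumptions, not the paper's material data.

```python
import math

# Scattered (strain, stress) observations (illustrative values).
points = [(0.0, 0.0), (0.1, 0.21), (0.2, 0.38),
          (0.3, 0.63), (0.4, 0.77), (0.5, 1.02)]
h = 0.08  # Gaussian kernel bandwidth

def kernel_regress(x):
    """Nadaraya-Watson estimate of stress at strain x."""
    weights = [math.exp(-((x - xi) ** 2) / (2 * h * h)) for xi, _ in points]
    total = sum(weights)
    return sum(wi * yi for wi, (_, yi) in zip(weights, points)) / total

sigma = kernel_regress(0.25)  # query between two observations
```

In the paper's setting this smooth estimate replaces a closed-form constitutive law inside the equilibrium equations, so sparse data still yield a continuous response.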
Data-driven fault diagnosis methods can improve the reliability of analog circuits by using the data generated from them. These data have characteristics such as randomness and incompleteness, which make the diagnostic results sensitive to specific values and random noise. This paper presents a data-driven fault diagnosis method for analog circuits based on robust competitive agglomeration (RCA), which alleviates the incompleteness of the data through clustering with a competitive process. The robustness of the diagnostic results is enhanced by applying robust statistics within RCA. A series of experiments demonstrates that RCA can classify incomplete data with high accuracy. The experimental results show that RCA is robust both to the data to be classified and to the parameters that need to be adjusted. The effectiveness of RCA in practical use is demonstrated on two analog circuits.
In this study, the medium-term response of beach profiles was investigated at two sites: a gently sloping sandy beach and a steeper mixed sand and gravel beach. The former is the Duck site in North Carolina, on the east coast of the USA, which is exposed to Atlantic Ocean swells and storm waves; the latter is the Milford-on-Sea site at Christchurch Bay, on the south coast of England, which is partially sheltered from Atlantic swells but has a directionally bimodal wave exposure. The data sets comprise detailed bathymetric surveys of beach profiles covering a period of more than 25 years for the Duck site and over 18 years for the Milford-on-Sea site. The structure of the data sets and the data-driven methods are described. Canonical correlation analysis (CCA) was used to find linkages between the wave characteristics and the beach profiles. The sensitivity of these linkages was investigated by deploying a wave height threshold to filter out the smaller waves incrementally. The results of the analysis indicate that, for the gently sloping sandy beach, waves of all heights are important to the morphological response. For the mixed sand and gravel beach, filtering out the smaller waves improves the statistical fit, suggesting that low-height waves do not play a primary role in the medium-term morphological response, which is primarily driven by the intermittent larger storm waves.
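The incremental wave-height filtering described above can be sketched as a simple preprocessing pass: records below a threshold are dropped, and the threshold is raised step by step before the CCA is rerun on each filtered subset. The wave records and threshold values below are illustrative, not the survey data.

```python
# Significant wave heights (metres) from a hypothetical record (illustrative).
waves = [0.4, 1.2, 0.8, 2.5, 3.1, 0.6, 1.9, 4.2, 0.3, 2.8]

def filter_waves(records, threshold):
    """Keep only waves at or above the height threshold."""
    return [h for h in records if h >= threshold]

# Raise the threshold incrementally and count how many records survive;
# each surviving subset would then be paired with the profile data for CCA.
retained = {th: len(filter_waves(waves, th)) for th in [0.0, 1.0, 2.0, 3.0]}
```

Tracking how the CCA fit statistic changes across these thresholds is what distinguishes the two beaches: the sandy beach degrades as soon as small waves are removed, while the mixed beach improves.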
The recently proposed data-driven pole placement method is able to use measurement data to simultaneously identify a state-space model and derive a pole placement state feedback gain. It can achieve this precisely for systems that are linear time-invariant and for which noiseless measurement datasets are available. However, for nonlinear systems, and/or when the only measurement datasets available are contaminated by noise, this approach is unable to yield satisfactory results. In this study, we investigated the effect on data-driven pole placement performance of introducing a prefilter to reduce the noise present in the datasets. Using numerical simulations of a self-balancing robot, we demonstrate the important role that prefiltering can play in reducing the interference caused by noise.
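The prefiltering step can be sketched with a centered moving average applied to a noisy measurement sequence before it is handed to the identification stage. The signal, noise level, and window length below are illustrative assumptions, not the paper's robot data or filter design.

```python
import math
import random

random.seed(2)

def moving_average(x, window=5):
    """Centered moving average; the window is truncated at the sequence edges."""
    half = window // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

# A slow sinusoidal "measurement" corrupted by Gaussian noise.
clean = [0.5 * math.sin(0.1 * k) for k in range(200)]
noisy = [c + random.gauss(0, 0.1) for c in clean]
filtered = moving_average(noisy)

def mse(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)
```

The design trade-off is the usual one: a longer window suppresses more noise but distorts fast dynamics, which matters when the filtered data must still excite the modes being placed.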
The application scope and future development directions of machine learning models (supervised learning, transfer learning, and unsupervised learning) that have driven energy material design are discussed.
In this paper, a real-time online data-driven adaptive method is developed to deal with uncertainties such as high nonlinearity, strong coupling, parameter perturbation, and external disturbances in the attitude control of fixed-wing unmanned aerial vehicles (UAVs). Firstly, a model-free adaptive control (MFAC) method requiring only input/output (I/O) data and no model information is adopted for the control scheme design of the angular velocity subsystem, which contains all the model information and the aforementioned uncertainties. Secondly, the internal model control (IMC) method, featuring fewer tuning parameters and a convenient tuning process, is adopted for the control scheme design of the certain Euler angle subsystem. Simulation results show that the developed method clearly outperforms the cascade PID (CPID) method and the nonlinear dynamic inversion (NDI) method.
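The MFAC idea can be sketched in compact form on a scalar plant: a pseudo-partial-derivative (PPD) estimate is updated from I/O increments only, and the control law uses it to step the input toward the setpoint. The plant, gains, and setpoint below are illustrative assumptions; the paper applies MFAC to the UAV angular-velocity subsystem, not this toy system.

```python
eta, mu = 1.0, 1.0      # PPD-estimation step size and weight
rho, lam = 0.6, 1.0     # control step size and weight
y_ref = 1.0             # desired output

def plant(y, u):
    # Dynamics unknown to the controller (illustrative nonlinear system).
    return 0.6 * y + 0.8 * u + 0.1 * y * u

y, y_prev = 0.0, 0.0
u, u_prev = 0.0, 0.0
phi = 1.0               # pseudo-partial-derivative (PPD) estimate
errors = []
for k in range(60):
    du = u - u_prev
    # Update the PPD estimate from the latest I/O increments.
    if abs(du) > 1e-8:
        phi += eta * du / (mu + du * du) * ((y - y_prev) - phi * du)
    # Compact-form MFAC control update.
    u_next = u + rho * phi / (lam + phi * phi) * (y_ref - y)
    y_next = plant(y, u_next)
    y_prev, y = y, y_next
    u_prev, u = u, u_next
    errors.append(abs(y_ref - y))
```

Note that the controller never sees the plant equation; only the measured increments drive both the PPD estimate and the input update, which is the sense in which the scheme is "model-free."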
Data mining (also known as knowledge discovery in databases, KDD) is defined as the nontrivial extraction of implicit, previously unknown, and potentially useful information from data. The aim of data mining is to discover knowledge of interest to user needs, and it is a genuinely useful tool in many domains such as marketing and decision making. However, some basic issues of data mining are often ignored: What is data mining? What is the product of a data mining process? What are we doing in a data mining process? Are there any rules we should obey in a data mining process? In order to discover patterns and knowledge that are really interesting and actionable in the real world, Zhang et al. proposed a domain-driven, human-machine-cooperated data mining process, and Zhao and Yao proposed an interactive user-driven classification method using the granule network. In our work, we find that data mining is a kind of knowledge-transforming process that transforms knowledge from a data format into a symbol format. Thus, no new knowledge can be generated in a data mining process; knowledge is merely transformed from a data format, which is not understandable to humans, into a symbol format, which is understandable to humans and easy to use. It is similar to translating a book from Chinese into English: the knowledge in the English book should be the same as the knowledge in the Chinese one, and only its format changes; otherwise, there must be mistakes in the translation. That is, in a data mining process we transform knowledge from one format into another without producing new knowledge. The knowledge is originally stored in data (data is a representation format of knowledge); unfortunately, we cannot read, understand, or use it directly, since we cannot understand raw data. With this understanding of data mining, we proposed a data-driven knowledge acquisition method based on rough sets, which also improved the performance of classical knowledge acquisition methods. In fact, we also find that domain-driven data mining and user-driven data mining do not conflict with our data-driven data mining; they can be integrated into domain-oriented, data-driven data mining. This is analogous to views of a database: users with different views see different portions of the data, and thus users with different tasks or objectives may wish to, or can, discover different (partial) knowledge from the same database. However, all of this partial knowledge must already exist in the database. A domain-oriented, data-driven data mining method therefore helps us extract knowledge that actually exists in a database and is really interesting and actionable in the real world.
Recent advances in computing, communications, digital storage technologies, and high-throughput data-acquisition technologies make it possible to gather and store incredible volumes of data. This creates unprecedented opportunities for large-scale knowledge discovery from databases. Data mining is an emerging area of computational intelligence that offers new theories, techniques, and tools for processing large volumes of data, supporting tasks such as data analysis and decision making. Many researchers are working on designing efficient data mining techniques, methods, and algorithms. Unfortunately, most data mining researchers pay much attention to the technical problems of developing data mining models and methods, but little to the basic issues of data mining. In this paper, we propose a new understanding of data mining: the domain-oriented, data-driven data mining (3DM) model. Some data-driven data mining algorithms developed in our lab are also presented to demonstrate its validity.
Unlike consumers in malls or supermarkets, online consumers are "intangible," and their purchasing behaviors are affected by multiple factors, including product pricing, promotions and discounts, the quality of products and brands, and the platforms where they search for the product. In this research, I study the relationship between product sales and consumer characteristics, the relationship between product sales and product qualities, demand curve analysis, and the search friction effect for different platforms. I used data from a randomized field experiment involving more than 400 thousand customers and 30 thousand products on JD.com, one of the world's largest online retailing platforms. The research has two focuses: 1) how different consumer characteristics affect sales; and 2) how to set prices and manage search friction for different channels. I find that JD Plus membership, education level, and age have no significant relationship with product sales, while a higher user level leads to higher sales. Sales are highly skewed, with very high numbers of products sold making up only a small percentage of the total. Consumers living in more industrialized cities have more purchasing power, and women and singles spend more. Also, the better a product performs, the more it sells, and moderate pricing can increase product sales. Based on the results on search volume in different channels, it is suggested that it is better to focus on app sales. Knowing these results, producers can adjust the target consumers for different products and run targeted advertisements in order to maximize sales. An appropriate price for a product is also crucial to a seller. Furthermore, knowing the search friction of different channels can help producers rearrange platform layouts so that search friction is reduced and more potential deals can be made.
Recently, haze in China has become more and more serious, yet it is very difficult to model and control. Here, a data-driven model is introduced for the simulation and monitoring of China's haze. First, a multi-dimensional evaluation system is built to evaluate government performance on haze control. Second, a data-driven model is employed to reveal the operating mechanism of China's haze, described as a multi-input, multi-output system. Third, a prototype system is set up to verify the proposed scheme, and the result provides a graphical tool for monitoring different haze control strategies.
Using Louisiana's Interstate system, this paper demonstrates how data can be used to evaluate the reliability, economy, and safety of truck freight operations to improve decision-making. Data mainly from the National Performance Management Research Data Set (NPMRDS) and the Louisiana Crash Database were used to analyze the Truck Travel Time Reliability (TTTR) Index, commercial vehicle user delay costs, and commercial vehicle safety. The results indicate that while Louisiana's Interstate system remained reliable over the years, some segments were unreliable, though these annually amounted to less than 12% of the state's Interstate mileage. The user delay costs incurred by commercial vehicles on these unreliable segments averaged 65.45% of the user delay costs of all vehicles on the Interstate system between 2016 and 2019, 53.10% between 2020 and 2021, and 70.36% in 2022, which is considerably high. These disproportionate ratios indicate the economic impact of Interstate unreliability on commercial vehicle operations. Additionally, although annual crash frequencies remained relatively constant, an increasing proportion of commercial vehicles were involved in crashes, and the segments (mileposts) with high crash frequencies appear to correspond to locations with recurring congestion. The study highlights the potential of using data to identify areas of transportation systems that need improvement, supporting better decision-making.
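The reliability metric above can be sketched directly: the TTTR index is the ratio of the 95th-percentile to the 50th-percentile travel time on a segment. The travel times below are illustrative values rather than NPMRDS records, and the 1.5 unreliability cutoff is an assumed convention for the sketch.

```python
def percentile(values, p):
    """Percentile with linear interpolation between adjacent order statistics."""
    s = sorted(values)
    idx = (len(s) - 1) * p / 100.0
    lo, hi = int(idx), min(int(idx) + 1, len(s) - 1)
    frac = idx - lo
    return s[lo] * (1 - frac) + s[hi] * frac

# Segment travel times in seconds, mostly free-flow with two congested epochs.
travel_times = [300, 305, 310, 298, 302, 307, 480, 299, 303, 308,
                301, 306, 295, 620, 304, 309, 297, 300, 302, 305]

tttr = percentile(travel_times, 95) / percentile(travel_times, 50)
# Flag the segment as unreliable when TTTR exceeds 1.5 (assumed cutoff).
unreliable = tttr > 1.5
```

Because the numerator is a high percentile, a handful of badly congested intervals is enough to push a segment over the cutoff even when its median travel time looks healthy, which matches the paper's finding that a small mileage share carries a large delay-cost share.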
Car-following models are the research basis of traffic flow theory and microscopic traffic simulation. In previous work, theory-driven models have been dominant, while data-driven ones are relatively rare. In recent years, the technologies of the Intelligent Transportation System (ITS), represented by Vehicles-to-Everything (V2X) technology, have been developing rapidly. Using these technologies, large-scale, high-quality microscopic vehicle trajectory data can be acquired, which provides the research foundation for modeling car-following behavior with data-driven methods. Accordingly, a data-driven car-following model based on the Random Forest (RF) method was constructed in this work, and the Next Generation Simulation (NGSIM) dataset was used to calibrate and train the model. The Artificial Neural Network (ANN) model, GM model, and Full Velocity Difference (FVD) model were employed to comparatively verify the proposed model. The results suggest that the proposed model can accurately describe car-following behavior, with better performance under multiple performance indicators.
Funding: supported by the Fundamental Research Fund for the Central Universities (Grant No. BLX202226).
Funding: supported by the State Key Program of the National Natural Science Foundation of China (60834001) and the National Natural Science Foundation of China (60774022). Acknowledgement: The authors would like to thank the NSFC organizers and participants who shared their ideas and work with us during the NSFC workshop on data-based control, decision making, scheduling, and fault diagnosis. In particular, the authors would like to thank Chai Tian-You, Sun You-Xian, Wang Hong, Yan Hong-Sheng, and Gao Fu-Rong for discussing the concept of the design model shown in Fig. 12, the concept of temporal multi-scale shown in Fig. 8, the concept of fault diagnosis shown in Fig. 14, the concept of dynamic scheduling shown in Fig. 15, and the concept of the interval model shown in Fig. 16, respectively.
Funding: RPSEA and the U.S. Department of Energy partially funded this study.
Funding: Partially supported by the National Natural Science Foundation of China (61751306, 61801208, 61671233), the Jiangsu Science Foundation (BK20170650), the Postdoctoral Science Foundation of China (BX201700118, 2017M621712), the Jiangsu Postdoctoral Science Foundation (1701118B), and the Fundamental Research Funds for the Central Universities (021014380094).
Abstract: During the past few decades, mobile wireless communications have experienced four generations of technological revolution, namely from 1G to 4G, and the deployment of the latest 5G networks is expected to take place in 2019. One fundamental question is how we can push forward the development of mobile wireless communications now that they have become extremely complex and sophisticated systems. We believe that the answer lies in the huge volumes of data produced by the network itself, and machine learning may become the key to exploiting this information. In this paper, we elaborate on why the conventional model-based paradigm, which has been widely proved useful in pre-5G networks, can be less efficient or even impractical in future 5G and beyond mobile networks. Then, we explain how the data-driven paradigm, using state-of-the-art machine learning techniques, can become a promising solution. Finally, we provide a typical use case of the data-driven paradigm, i.e., proactive load balancing, in which online learning is utilized to adjust cell configurations in advance to avoid burst congestion caused by rapid traffic changes.
Funding: Supported by the National Natural Science Foundation of China (61773087), the National Key Research and Development Program of China (2018YFB1601500), and the High-tech Ship Research Project of the Ministry of Industry and Information Technology: Research of Intelligent Ship Testing and Verification ([2018]473).
Abstract: Fault prognosis mainly refers to the estimation of the operating time before a failure occurs, which is vital for ensuring the stability, safety, and long lifetime of degrading industrial systems. According to the results of fault prognosis, the maintenance strategy for the underlying industrial systems can shift from passive maintenance to active maintenance. With the increased complexity and improved automation level of industrial systems, fault prognosis techniques have become more and more indispensable. In particular, data-driven prognosis approaches, which tend to find the hidden fault factors and determine the specific fault occurrence time of the system by analysing historical or real-time measurement data, have gained great attention from different industrial sectors. In this context, the major task of this paper is to present a systematic overview of data-driven fault prognosis for industrial systems. Firstly, the characteristics of different prognosis methods are reviewed, with the data-based ones highlighted. Moreover, based on the different data characteristics that exist in industrial systems, the corresponding fault prognosis methodologies are illustrated, with emphasis on analyses and comparisons of the different prognosis methods. Finally, we reveal the current research trends and look forward to the future challenges in this field. This review is expected to serve as a tutorial and source of references for fault prognosis researchers.
Funding: This work received funding from the EU Smarter project (PEOPLE-2013-IAPP-610675).
Abstract: To achieve zero-defect production during computer numerical control (CNC) machining processes, it is imperative to develop effective diagnosis systems that detect anomalies efficiently. However, due to the dynamic conditions of the machine and tooling during machining processes, the diagnosis systems currently adopted in industry are inadequate. To address this issue, this paper presents a novel data-driven diagnosis system for anomalies. In this system, power data for condition monitoring are continuously collected during dynamic machining processes to support online diagnosis analysis. To facilitate the analysis, preprocessing mechanisms have been designed to de-noise, normalize, and align the monitored data. Important features are extracted from the monitored data, and thresholds are defined to identify anomalies. Considering the dynamic conditions of the machine and tooling during machining processes, the thresholds used to identify anomalies can vary. Based on historical data, the threshold values are optimized using a fruit fly optimization (FFO) algorithm to achieve more accurate detection. Practical case studies were used to validate the system, demonstrating its potential and effectiveness for industrial applications.
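The threshold-tuning step described in this abstract can be sketched with a minimal fruit-fly-style search loop: each generation samples candidate thresholds around the current best and keeps the one with the highest detection accuracy. This is an illustrative numpy-only sketch, not the paper's implementation; the synthetic power-feature scores, the accuracy objective, and all parameter values below are hypothetical.

```python
import numpy as np

def ffo_threshold(scores, labels, iters=200, pop=20, seed=0):
    """Tune an anomaly threshold with a minimal FFO-style loop.

    Each generation, `pop` candidate thresholds are drawn around the
    current best ("smell-based" random search); the best-scoring
    candidate replaces the incumbent if it improves accuracy.
    """
    rng = np.random.default_rng(seed)

    def acc(th):
        # Fraction of samples whose anomaly flag matches the label.
        return np.mean((scores > th) == labels)

    best = float(scores.mean())
    best_acc = acc(best)
    for _ in range(iters):
        cands = best + rng.normal(scale=scores.std() * 0.1, size=pop)
        accs = np.array([acc(c) for c in cands])
        if accs.max() > best_acc:
            best, best_acc = float(cands[accs.argmax()]), accs.max()
    return best, best_acc

# Hypothetical power features: normal cycles near 1.0, anomalies near 2.0.
rng = np.random.default_rng(3)
scores = np.concatenate([rng.normal(1.0, 0.2, 300), rng.normal(2.0, 0.2, 60)])
labels = np.concatenate([np.zeros(300, bool), np.ones(60, bool)])
th, accuracy = ffo_threshold(scores, labels)
```

On this synthetic data the search settles on a threshold between the two clusters; in the paper's setting the objective would be evaluated on historical monitoring data instead.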
Funding: Supported by the National Basic Research Program of China (973 Program) (2013CB035500), the National Natural Science Foundation of China (61233004, 61221003, 61074061), the International Cooperation Program of the Shanghai Science and Technology Commission (12230709600), and the Higher Education Research Fund for the Doctoral Program of China (20120073130006).
Funding: Supported by JSPS KAKENHI (Grants 17K06633 and 18K18898).
Abstract: This paper presents a simple nonparametric regression approach to data-driven computing in elasticity. We apply kernel regression to the material data set and formulate a system of nonlinear equations that is solved to obtain a static equilibrium state of an elastic structure. Preliminary numerical experiments illustrate that, compared with existing methods, the proposed method finds a reasonable solution even if the data points are distributed coarsely in a given material data set.
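The kernel-regression idea behind this approach can be illustrated with a one-dimensional Nadaraya-Watson estimator over a coarse material data set. This is a minimal sketch under assumed data, not the paper's full formulation (which couples the regression with the equilibrium equations of the structure); the strain-stress law and the bandwidth below are hypothetical.

```python
import numpy as np

def kernel_regression(strain_data, stress_data, strain_query, bandwidth=0.05):
    """Nadaraya-Watson kernel regression over a 1-D material data set.

    Gaussian weights favor data points whose strain is close to the
    query strain; the predicted stress is the weighted average of the
    sampled stresses, so no functional form is imposed on the law.
    """
    w = np.exp(-0.5 * ((strain_data - strain_query) / bandwidth) ** 2)
    return np.sum(w * stress_data) / np.sum(w)

# Coarse synthetic data set for a nonlinear elastic response
# (illustrative law, not taken from the paper).
strains = np.linspace(0.0, 0.5, 11)
stresses = 2.0 * strains + 4.0 * strains ** 2

sigma = kernel_regression(strains, stresses, 0.22)
```

Even with only 11 data points, the estimate at an unsampled strain stays close to the underlying law, which mirrors the paper's observation about coarsely distributed data.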
Funding: Supported by the National Natural Science Foundation of China (61202078, 61071139) and the National High Technology Research and Development Program of China (863 Program) (SQ2011AA110101).
Abstract: Data-driven fault diagnosis methods can improve the reliability of analog circuits by using the data generated from them. These data have characteristics such as randomness and incompleteness, which make the diagnostic results sensitive to specific values and random noise. This paper presents a data-driven fault diagnosis method for analog circuits based on robust competitive agglomeration (RCA), which can alleviate the incompleteness of the data by clustering through a competing process. The robustness of the diagnostic results is enhanced by using robust statistics in RCA. A series of experiments demonstrates that RCA can classify incomplete data with high accuracy. The experimental results show that RCA is robust both to the data being classified and to the parameters that need to be adjusted. The effectiveness of RCA in practical use is demonstrated on two analog circuits.
Funding: Supported by the UK Natural Environment Research Council (Grant No. NE/J005606/1), the UK Engineering and Physical Sciences Research Council (Grant No. EP/C005392/1), and the Ensemble Estimation of Flood Risk in a Changing Climate (EFRaCC) project funded by the British Council under its Global Innovation Initiative.
Abstract: In this study, the medium-term response of beach profiles was investigated at two sites: a gently sloping sandy beach and a steeper mixed sand and gravel beach. The former is the Duck site in North Carolina, on the east coast of the USA, which is exposed to Atlantic Ocean swells and storm waves; the latter is the Milford-on-Sea site at Christchurch Bay, on the south coast of England, which is partially sheltered from Atlantic swells but has a directionally bimodal wave exposure. The data sets comprise detailed bathymetric surveys of beach profiles covering a period of more than 25 years for the Duck site and over 18 years for the Milford-on-Sea site. The structure of the data sets and the data-driven methods are described. Canonical correlation analysis (CCA) was used to find linkages between the wave characteristics and beach profiles. The sensitivity of the linkages was investigated by deploying a wave height threshold to filter out the smaller waves incrementally. The results of the analysis indicate that, for the gently sloping sandy beach, waves of all heights are important to the morphological response. For the mixed sand and gravel beach, filtering out the smaller waves improves the statistical fit, suggesting that low-height waves do not play a primary role in the medium-term morphological response, which is primarily driven by the intermittent larger storm waves.
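The CCA step used here can be sketched in numpy: both data blocks are centered and whitened, and the singular values of the whitened cross-covariance are the canonical correlations linking one block (e.g. wave characteristics) to the other (e.g. profile modes). The synthetic "wave" and "profile" arrays below are hypothetical stand-ins for the survey data, not the study's data sets.

```python
import numpy as np

def cca_correlations(X, Y):
    """Canonical correlations between two data blocks (rows = samples).

    Each block is centered and whitened via the symmetric inverse
    square root of its covariance; the singular values of the whitened
    cross-covariance matrix are the canonical correlations.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    def inv_sqrt(S):
        vals, vecs = np.linalg.eigh(S)
        return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

    K = inv_sqrt(X.T @ X / n) @ (X.T @ Y / n) @ inv_sqrt(Y.T @ Y / n)
    return np.linalg.svd(K, compute_uv=False)

# Hypothetical example: three "wave" drivers (say height, period,
# direction) linearly linked to four "profile" modes plus noise.
rng = np.random.default_rng(0)
waves = rng.normal(size=(500, 3))
profiles = waves @ rng.normal(size=(3, 4)) + 0.1 * rng.normal(size=(500, 4))
rho = cca_correlations(waves, profiles)
```

A leading canonical correlation near 1 indicates a strong linear linkage between the blocks, which is the quantity the study tracks as smaller waves are filtered out.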
Abstract: The recently proposed data-driven pole placement method is able to make use of measurement data to simultaneously identify a state space model and derive a pole placement state feedback gain. It can achieve this precisely for systems that are linear time-invariant and for which noiseless measurement datasets are available. However, for nonlinear systems, and/or when only noisy measurement datasets are available, this approach is unable to yield satisfactory results. In this study, we investigated the effect on data-driven pole placement performance of introducing a prefilter to reduce the noise present in the datasets. Using numerical simulations of a self-balancing robot, we demonstrated the important role that prefiltering can play in reducing the interference caused by noise.
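The prefiltering idea can be illustrated with a simple moving-average low-pass filter applied to a noisy measurement trace before it is fed to the identification step. The abstract does not specify the filter used, so this numpy sketch is an assumed stand-in; the signal, noise level, and window length are all hypothetical.

```python
import numpy as np

def prefilter(signal, window=5):
    """Moving-average prefilter: a simple low-pass stand-in for the
    prefilter discussed in the abstract."""
    kernel = np.ones(window) / window
    # mode="same" keeps the length; samples near the edges are
    # attenuated slightly because the kernel overlaps the boundary.
    return np.convolve(signal, kernel, mode="same")

# Hypothetical measurement: slow dynamics of interest plus sensor noise.
rng = np.random.default_rng(1)
t = np.linspace(0, 1, 200)
clean = np.sin(2 * np.pi * 2 * t)
noisy = clean + 0.3 * rng.normal(size=t.size)
filtered = prefilter(noisy, window=9)

# RMS error against the clean signal, before and after prefiltering.
err_raw = np.sqrt(np.mean((noisy - clean) ** 2))
err_filt = np.sqrt(np.mean((filtered - clean) ** 2))
```

The filtered trace tracks the underlying dynamics much more closely than the raw one, which is exactly the effect the study exploits to keep noise out of the identified model.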
Funding: Supported by the National Key R&D Program of China (Grant No. 2021YFC2100100), the National Natural Science Foundation of China (Grant No. 21901157), the Shanghai Science and Technology Project of China (Grant No. 21JC1403400), and the SJTU Global Strategic Partnership Fund (Grant No. 2020 SJTUHUJI).
Abstract: The application scope and future development directions of the machine learning models (supervised learning, transfer learning, and unsupervised learning) that have driven energy material design are discussed.
Abstract: In this paper, a real-time online data-driven adaptive method is developed to deal with uncertainties such as high nonlinearity, strong coupling, parameter perturbation, and external disturbances in the attitude control of fixed-wing unmanned aerial vehicles (UAVs). Firstly, a model-free adaptive control (MFAC) method requiring only input/output (I/O) data and no model information is adopted for the control scheme design of the angular velocity subsystem, which contains all the model information and the aforementioned uncertainties. Secondly, the internal model control (IMC) method, featuring fewer tuning parameters and a convenient tuning process, is adopted for the control scheme design of the Euler angle subsystem, whose model is certain. Simulation results show that the developed method is clearly superior to the cascade PID (CPID) method and the nonlinear dynamic inversion (NDI) method.
Abstract: Data mining (also known as Knowledge Discovery in Databases, KDD) is defined as the nontrivial extraction of implicit, previously unknown, and potentially useful information from data. The aim of data mining is to discover knowledge of interest to user needs. Data mining is a genuinely useful tool in many domains, such as marketing and decision making. However, some basic issues of data mining are often ignored: What is data mining? What is the product of a data mining process? What are we doing in a data mining process? Are there any rules we should obey in a data mining process? In order to discover patterns and knowledge that are really interesting and actionable in the real world, Zhang et al. proposed a domain-driven, human-machine-cooperated data mining process. Zhao and Yao proposed an interactive user-driven classification method using the granule network. In our work, we find that data mining is a kind of knowledge-transforming process that transforms knowledge from a data format into a symbol format. Thus, no new knowledge can be generated (born) in a data mining process. In a data mining process, knowledge is merely transformed from the data format, which is not understandable to humans, into the symbol format, which is understandable to humans and easy to use. It is similar to the process of translating a book from Chinese into English. In this translating process, the knowledge itself in the book should remain unchanged; only the format of the knowledge changes. That is, the knowledge in the English book should be kept the same as the knowledge in the Chinese one; otherwise, there must be some mistakes in the translating process. In other words, in a data mining process we are transforming knowledge from one format into another, not producing new knowledge. The knowledge is originally stored in data (data is a representation format of knowledge). Unfortunately, we cannot read, understand, or use it, since we cannot understand data.
With this understanding of data mining, we proposed a data-driven knowledge acquisition method based on rough sets. It also improved the performance of classical knowledge acquisition methods. In fact, we find that domain-driven data mining and user-driven data mining do not conflict with our data-driven data mining; they can be integrated into domain-oriented, data-driven data mining. It is just like the views of a database: users with different views can look at different parts of the data in a database. Thus, users with different tasks or objectives wish to, or can, discover different (partial) knowledge from the same database. However, all this partial knowledge should originally exist in the database. So, a domain-oriented, data-driven data mining method would help us extract the knowledge that really exists in a database and is really interesting and actionable in the real world.
Abstract: Recent advances in computing, communications, digital storage technologies, and high-throughput data-acquisition technologies make it possible to gather and store incredible volumes of data. This creates unprecedented opportunities for large-scale knowledge discovery from databases. Data mining is an emerging area of computational intelligence that offers new theories, techniques, and tools for processing large volumes of data, supporting tasks such as data analysis and decision making. Many researchers are working on designing efficient data mining techniques, methods, and algorithms. Unfortunately, most data mining researchers pay much attention to the technical problems of developing data mining models and methods, while paying little attention to the basic issues of data mining. In this paper, we propose a new understanding of data mining, namely the domain-oriented data-driven data mining (3DM) model. Some data-driven data mining algorithms developed in our lab are also presented to show its validity.
Abstract: Unlike consumers in malls or supermarkets, online consumers are "intangible", and their purchasing behaviors are affected by multiple factors, including product pricing, promotions and discounts, the quality of products and brands, and the platforms where they search for the product. In this research, I study the relationship between product sales and consumer characteristics, the relationship between product sales and product qualities, demand curve analysis, and the search friction effect for different platforms. I utilized data from a randomized field experiment involving more than 400 thousand customers and 30 thousand products on JD.com, one of the world's largest online retailing platforms. There are two focuses of the research: 1) how different consumer characteristics affect sales; 2) how to set prices and possible search friction for different channels. I find that JD Plus membership, education level, and age have no significant relationship with product sales, while a higher user level leads to higher sales. Sales are highly skewed, with the best-selling products making up only a small percentage of the total. Consumers living in more industrialized cities have more purchasing power. Women and singles generate higher spending. Also, the better a product performs, the more it sells, and moderate pricing can increase product sales. Based on the results on search volume in different channels, it is suggested that sellers focus on app sales. Knowing these results, producers can adjust the target consumers for different products and run targeted advertisements in order to maximize sales. An appropriate price for a product is also crucial to a seller. Furthermore, knowing the search friction of different channels can help producers rearrange platform layouts so that search friction is reduced and more potential deals may be made.
Abstract: Recently, haze in China has become more and more serious, yet it is very difficult to model and control. Here, a data-driven model is introduced for the simulation and monitoring of China's haze. First, a multi-dimensional evaluation system is built to evaluate government performance on haze control. Second, a data-driven model is employed to reveal the operating mechanism of China's haze, described as a multi-input, multi-output system. Third, a prototype system is set up to verify the proposed scheme, and the result provides us with a graphical tool for monitoring different haze control strategies.
Abstract: Using Louisiana's Interstate system, this paper aims to demonstrate how data can be used to evaluate the freight movement reliability, economy, and safety of truck freight operations to improve decision-making. Data mainly from the National Performance Management Research Data Set (NPMRDS) and the Louisiana Crash Database were used to analyze the Truck Travel Time Reliability Index, commercial vehicle User Delay Costs, and commercial vehicle safety. The results indicate that while Louisiana's Interstate system remained reliable over the years, some segments were found to be unreliable, annually amounting to less than 12% of the state's Interstate system mileage. The User Delay Costs incurred by commercial vehicles on these unreliable segments were, on average, 65.45% of the User Delay Costs incurred by all vehicles on the Interstate highway system between 2016 and 2019, 53.10% between 2020 and 2021, and 70.36% in 2022, which are considerably high. These disproportionate ratios indicate the economic impact of the unreliability of the Interstate system on commercial vehicle operations. Additionally, though the annual crash frequencies remained relatively constant, an increasing proportion of commercial vehicles are involved in crashes, with segments (mileposts) that have high crash frequencies appearing to correspond with locations of recurring congestion on the Interstate highway system. The study highlights the potential of using data to identify areas that need improvement in transportation systems to support better decision-making.
Abstract: Car-following models are the research basis of traffic flow theory and microscopic traffic simulation. In previous work, theory-driven models have been dominant, while data-driven ones are relatively rare. In recent years, the technologies of Intelligent Transportation Systems (ITS), represented by Vehicle-to-Everything (V2X) technology, have been developing rapidly. Utilizing these ITS technologies, large-scale, high-quality vehicle microscopic trajectory data can be acquired, which provides the research foundation for modeling car-following behavior with data-driven methods. Accordingly, a data-driven car-following model based on the Random Forest (RF) method was constructed in this work, and the Next Generation Simulation (NGSIM) dataset was used to calibrate and train the model. The Artificial Neural Network (ANN) model, GM model, and Full Velocity Difference (FVD) model were employed to comparatively verify the proposed model. The research results suggest that the proposed model can accurately describe car-following behavior, with better performance under multiple performance indicators.
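The RF car-following idea can be sketched with scikit-learn: features such as the gap to the leader, the relative speed, and the follower's own speed predict the follower's acceleration. The synthetic stimulus-response data below stand in for the NGSIM trajectories used in the paper; the feature set and all coefficients are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical training set: each sample holds (gap, relative speed,
# follower speed); the target is the follower's acceleration. A simple
# linear stimulus-response rule plus noise stands in for real
# trajectory data.
rng = np.random.default_rng(42)
n = 2000
gap = rng.uniform(5.0, 60.0, n)            # m
rel_speed = rng.uniform(-5.0, 5.0, n)      # m/s (leader minus follower)
speed = rng.uniform(0.0, 20.0, n)          # m/s
accel = 0.5 * rel_speed + 0.02 * (gap - 2.0 * speed)
accel += 0.05 * rng.normal(size=n)         # measurement noise

X = np.column_stack([gap, rel_speed, speed])
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, accel)

# Predicted acceleration for a 30 m gap, closing at 2 m/s, at 10 m/s.
pred = model.predict(np.array([[30.0, 2.0, 10.0]]))
```

Because the forest averages over many trees fit to trajectory samples, it can capture nonlinear car-following responses without a hand-specified functional form, which is the contrast with the GM and FVD models drawn in the abstract.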