During the past few decades, mobile wireless communications have experienced four generations of technological revolution, from 1G to 4G, and deployment of the latest 5G networks is expected to take place in 2019. One fundamental question is how we can push forward the development of mobile wireless communications now that they have become extremely complex and sophisticated systems. We believe the answer lies in the huge volumes of data produced by the network itself, and that machine learning may become the key to exploiting this information. In this paper, we elaborate why the conventional model-based paradigm, which has proved widely useful in pre-5G networks, can be less efficient or even impractical in future 5G and beyond mobile networks. We then explain how the data-driven paradigm, using state-of-the-art machine learning techniques, can become a promising solution. Finally, we provide a typical use case of the data-driven paradigm, proactive load balancing, in which online learning is used to adjust cell configurations in advance to avoid burst congestion caused by rapid traffic changes.
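The proactive load-balancing use case can be illustrated with a minimal sketch. All class names, the exponential-smoothing update, and the thresholds below are hypothetical illustrations, not the paper's algorithm: an online predictor tracks each cell's load one step ahead and plans offloading before the forecast crosses a congestion threshold.

```python
# Minimal sketch of proactive load balancing (hypothetical names/thresholds):
# an online exponential-smoothing learner forecasts each cell's load; if the
# forecast exceeds a threshold, traffic is shifted to the least-loaded cell
# before congestion actually occurs.

class ProactiveBalancer:
    def __init__(self, cells, alpha=0.7, threshold=0.8):
        self.alpha = alpha            # smoothing factor of the online learner
        self.threshold = threshold    # load level that triggers offloading
        self.estimate = {c: 0.0 for c in cells}

    def observe(self, cell, load):
        # online update: new estimate = alpha*observation + (1-alpha)*old
        e = self.estimate[cell]
        self.estimate[cell] = self.alpha * load + (1 - self.alpha) * e

    def plan(self):
        # offload from every cell whose predicted load exceeds the threshold
        # to the currently least-loaded cell
        target = min(self.estimate, key=self.estimate.get)
        return [(c, target) for c, e in self.estimate.items()
                if e > self.threshold and c != target]

balancer = ProactiveBalancer(["cellA", "cellB"])
for load in (0.7, 0.9, 0.95):        # rapidly rising traffic in cellA
    balancer.observe("cellA", load)
balancer.observe("cellB", 0.2)
print(balancer.plan())               # -> [('cellA', 'cellB')]
```

The point of the sketch is the "proactive" part: the decision is made from the predicted load, not the instantaneous one, so a burst is absorbed before it arrives.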
The application scope and future development directions of machine learning models (supervised learning, transfer learning, and unsupervised learning) that have driven energy material design are discussed.
Data mining (also known as Knowledge Discovery in Databases, KDD) is defined as the nontrivial extraction of implicit, previously unknown, and potentially useful information from data. Its aim is to discover knowledge of interest to user needs, and it has proved a useful tool in many domains such as marketing and decision making. However, some basic issues of data mining are often ignored. What is data mining? What is the product of a data mining process? What are we doing in a data mining process? Are there rules we should obey in a data mining process? To discover patterns and knowledge that are really interesting and actionable in the real world, Zhang et al. proposed a domain-driven, human-machine-cooperated data mining process, and Zhao and Yao proposed an interactive user-driven classification method using the granule network. In our work, we find that data mining is a knowledge-transforming process that converts knowledge from a data format into a symbol format. Thus, no new knowledge can be generated in a data mining process: knowledge is merely transformed from the data format, which humans cannot understand, into a symbol format, which humans can understand and easily use. This is similar to translating a book from Chinese into English: the knowledge in the book should remain unchanged, and only its format changes. That is, the knowledge in the English book should be the same as in the Chinese one; otherwise, mistakes were made in translation. Likewise, in a data mining process we transform knowledge from one format into another without producing new knowledge. The knowledge is originally stored in data (data is one representation format of knowledge); unfortunately, we cannot read, understand, or use it directly, since we cannot understand data.
With this understanding of data mining, we proposed a data-driven knowledge acquisition method based on rough sets, which also improved the performance of classical knowledge acquisition methods. In fact, domain-driven data mining and user-driven data mining do not conflict with our data-driven data mining; they can be integrated into domain-oriented data-driven data mining. This is analogous to views of a database: users with different views see different parts of the data, so users with different tasks or objectives can discover different (partial) knowledge from the same database. However, all this partial knowledge must already exist in the database. A domain-oriented data-driven data mining method therefore helps us extract the knowledge that really exists in a database and is really interesting and actionable in the real world.
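The rough-set machinery behind such data-driven knowledge acquisition can be sketched in a few lines. The decision table and attribute names below are a hypothetical toy example, not the authors' implementation:

```python
# Toy sketch of rough-set approximations: objects indiscernible under the
# chosen attributes form equivalence blocks; a target set X is approximated
# from below (blocks fully inside X) and from above (blocks intersecting X).
# Knowledge "mined" this way is already present in the table -- the rough-set
# machinery only changes its representation format.

def blocks(objects, attrs):
    # partition objects by their values on the chosen attributes
    part = {}
    for name, row in objects.items():
        key = tuple(row[a] for a in attrs)
        part.setdefault(key, set()).add(name)
    return list(part.values())

def approximations(objects, attrs, target):
    lower, upper = set(), set()
    for b in blocks(objects, attrs):
        if b <= target:     # block certainly belongs to the concept
            lower |= b
        if b & target:      # block possibly belongs to the concept
            upper |= b
    return lower, upper

table = {   # hypothetical decision table
    "o1": {"colour": "red",  "size": "big"},
    "o2": {"colour": "red",  "size": "big"},
    "o3": {"colour": "blue", "size": "small"},
}
lo, up = approximations(table, ["colour", "size"], {"o1", "o3"})
# o1 and o2 are indiscernible, so o1 cannot be in the lower approximation
print(sorted(lo), sorted(up))   # -> ['o3'] ['o1', 'o2', 'o3']
```

The gap between the two approximations is exactly the knowledge the data cannot express under the chosen attributes, which matches the view that mining extracts only what the data already contains.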
The field of fluid simulation is developing rapidly, and data-driven methods provide many frameworks and techniques for fluid simulation. This paper presents a survey of data-driven methods used in fluid simulation in computer graphics in recent years. First, we provide a brief introduction to physics-based fluid simulation methods organized by their spatial discretization, including Lagrangian, Eulerian, and hybrid methods. The characteristics of these underlying structures and their inherent connection with data-driven methodologies are then analyzed. Subsequently, we review studies spanning a wide range of applications, including data-driven solvers, detail enhancement, animation synthesis, fluid control, and differentiable simulation. Finally, we discuss related issues and potential directions in data-driven fluid simulation. We conclude that fluid simulation combined with data-driven methods has advantages over traditional methods under the same parameters, such as higher simulation efficiency, richer details, and varied pattern styles. Data-driven fluid simulation is thus feasible and has broad prospects.
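To give a flavour of the Eulerian building blocks that data-driven solvers learn to accelerate or replace, here is a minimal 1D semi-Lagrangian advection step. It is an illustrative sketch only, not taken from any surveyed paper:

```python
# Minimal 1D semi-Lagrangian advection step (illustrative sketch): each grid
# point traces backwards along the velocity field and linearly interpolates
# the old quantity there. This is the kind of Eulerian kernel that several
# data-driven solvers learn to speed up or substitute.

def advect(q, u, dt, dx):
    n = len(q)
    out = [0.0] * n
    for i in range(n):
        x = i - u[i] * dt / dx          # back-trace in grid coordinates
        x = max(0.0, min(n - 1.0, x))   # clamp to the domain
        j = int(x)
        t = x - j
        j2 = min(j + 1, n - 1)
        out[i] = (1 - t) * q[j] + t * q[j2]   # linear interpolation
    return out

q = [0.0, 1.0, 0.0, 0.0]        # a bump of "smoke" at cell 1
u = [1.0] * 4                   # uniform velocity to the right
print(advect(q, u, dt=1.0, dx=1.0))   # -> [0.0, 0.0, 1.0, 0.0]
```

The unconditional stability of this back-tracing scheme is one reason it is a popular baseline in graphics, and also why its numerical dissipation motivates the detail-enhancement methods the survey covers.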
Cloud storage is widely used by large companies to store vast amounts of data and files, offering flexibility, financial savings, and security. However, information shoplifting poses significant threats, potentially leading to poor performance and privacy breaches. Blockchain-based cognitive computing can help protect and maintain information security and privacy in cloud platforms, ensuring businesses can focus on business development. To ensure data security in cloud platforms, this research proposes a blockchain-based Hybridized Data Driven Cognitive Computing (HD2C) model. The proposed HD2C framework addresses breaches of the privacy information of mixed participants of the Internet of Things (IoT) in the cloud. HD2C is developed by combining Federated Learning (FL) with a Blockchain consensus algorithm to connect smart contracts with Proof of Authority. The "Data Island" problem can be solved by FL's emphasis on privacy and fast processing, while Blockchain provides a decentralized incentive structure that is impervious to poisoning. FL with Blockchain allows quick consensus through smart member selection and verification. The HD2C paradigm significantly improves the computational processing efficiency of intelligent manufacturing. Extensive analysis results derived from IIoT datasets confirm the superiority of HD2C. Compared to other consensus algorithms, the foundational cost of Blockchain PoA is significant. The accuracy and memory utilization evaluation results predict the total benefits of the system. Compared with the values 0.004 and 0.04, a value of 0.4 achieves good accuracy. According to the experiment results, the number of transactions per second has minimal impact on memory requirements. The findings of this study resulted in the development of a brand-new IIoT framework based on blockchain technology.
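The federated-learning side of HD2C rests on aggregating local model updates without moving raw data. A minimal federated-averaging sketch follows; the weights are hypothetical and the blockchain/PoA verification layer is omitted:

```python
# Minimal federated-averaging (FedAvg) sketch: each participant trains
# locally and only model weights leave the device; the aggregator combines
# them weighted by local data size. In a design like HD2C this aggregation
# would additionally be recorded and verified on a Proof-of-Authority
# blockchain (omitted here).

def fed_avg(client_weights, client_sizes):
    total = sum(client_sizes)
    dim = len(client_weights[0])
    agg = [0.0] * dim
    for w, n in zip(client_weights, client_sizes):
        for i in range(dim):
            agg[i] += w[i] * n / total    # size-weighted average
    return agg

# two hypothetical IIoT participants with different data volumes
print(fed_avg([[1.0, 2.0], [3.0, 4.0]], [100, 300]))   # -> [2.5, 3.5]
```

Because only the weight vectors cross the network, each participant's raw data stays on its own "island", which is the privacy property the abstract appeals to.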
Under China's "dual carbon" goals, the penetration of wind power continues to rise, which will profoundly affect the structure and operating mechanisms of power systems. This paper proposes a control strategy for wind-storage systems based on a bidirectional long short-term memory (Bi-LSTM) recurrent neural network. The Bi-LSTM network extracts the temporal relationships among the control results, the actual wind farm output, and the state of the energy storage; by building a control model on this network, the wind farm can obtain storage-system regulation results quickly and accurately under a variety of operating conditions. Simulation results based on real wind farm data show that the proposed control strategy keeps the control error of the wind-storage system within 0.50%-1.37% while preserving a degree of economic benefit.
Particulate nitrate, a key component of fine particles, forms through an intricate gas-to-particle conversion process. This process is regulated by the gas-to-particle conversion coefficient of nitrate, ε(NO₃⁻). The mechanism linking ε(NO₃⁻) to its drivers is highly complex and nonlinear, and can be characterized by machine learning methods. However, conventional machine learning often yields results that lack clear physical meaning and may even contradict established physical/chemical mechanisms owing to the influence of ambient factors. An alternative approach is urgently needed that offers transparent physical interpretation and deeper insight into what shapes ε(NO₃⁻). Here we introduce a supervised machine learning approach, the multilevel nested random forest guided by theory. Our approach robustly identifies NH₄⁺, SO₄²⁻, and temperature as pivotal drivers of ε(NO₃⁻). Notably, substantial disparities exist between the outcomes of traditional random forest analysis and the anticipated actual results. Furthermore, our approach underscores the significance of NH₄⁺ during both daytime (30%) and nighttime (40%) periods, while appropriately downplaying the influence of less relevant drivers compared with conventional random forest analysis. This research underscores the transformative potential of integrating domain knowledge with machine learning in atmospheric studies.
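The quantity being modelled can be made concrete. The gas-to-particle conversion coefficient of nitrate is commonly defined as the particulate fraction of total nitrate, ε(NO₃⁻) = NO₃⁻ / (HNO₃ + NO₃⁻) on a molar basis; this definition is stated here as an assumption, since the abstract does not spell it out:

```python
# Assumed definition (common in atmospheric chemistry, not quoted from the
# paper): eps(NO3-) is the fraction of total nitrate found in the particle
# phase. The driver analysis then relates this ratio to NH4+, SO4^2- and
# temperature.

def epsilon_no3(no3_particulate, hno3_gas):
    total = no3_particulate + hno3_gas
    if total == 0:
        raise ValueError("no nitrate present")
    return no3_particulate / total

# hypothetical molar concentrations: 3 units particulate, 1 unit gaseous
print(epsilon_no3(3.0, 1.0))   # -> 0.75, mostly particle phase
```

A value near 1 means nearly all nitrate has partitioned into particles, which is why the coefficient governs how much particulate nitrate the conversion process yields.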
With the rapid development of urban power grids and the large-scale integration of renewable energy, traditional power grid fault diagnosis techniques struggle to address the complexities of diagnosing faults in intricate power grid systems. Although artificial intelligence technologies offer new solutions for power grid fault diagnosis, the difficulty of acquiring labeled grid data limits the development of AI technologies in this area. In response to these challenges, this study proposes a semi-supervised learning framework with self-supervised loss and adaptive threshold (SAT-SSL) for fault detection and classification in power grids. Compared to other methods, our method reduces the dependence on labeled data while maintaining high recognition accuracy. First, we apply frequency-domain analysis to power grid data to filter abnormal events, then classify and label these events based on visual features to create a power grid dataset. Subsequently, we employ the Yule–Walker algorithm to extract features from the power grid data. We then construct a semi-supervised learning framework, incorporating a self-supervised loss and a dynamic threshold to enhance the model's information extraction capabilities and its adaptability across different scenarios. Finally, the power grid dataset along with two benchmark datasets is used to validate the model's functionality. The results indicate that our model achieves a low error rate across various scenarios and amounts of labels. On the power grid dataset, when retaining just 5% of the labels, the error rate is only 6.15%, which shows that this method can achieve accurate grid fault detection and classification with a limited amount of labeled data.
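The Yule–Walker feature-extraction step can be sketched in pure Python. This is an illustrative sketch: the window length, model order, and downstream classifier are not specified in the abstract, so the choices below are hypothetical:

```python
# Sketch of Yule-Walker feature extraction (illustrative, not the authors'
# exact pipeline): fit an AR(p) model to a signal window by solving the
# Yule-Walker equations with Levinson-Durbin recursion; the AR coefficients
# then serve as a compact feature vector for fault classification.
import math

def autocorr(x, lag):
    n = len(x)
    m = sum(x) / n
    return sum((x[i] - m) * (x[i + lag] - m) for i in range(n - lag)) / n

def yule_walker(x, p):
    r = [autocorr(x, k) for k in range(p + 1)]
    a = [0.0] * p             # AR coefficients
    e = r[0]                  # prediction-error power
    for k in range(p):        # Levinson-Durbin recursion
        acc = r[k + 1] - sum(a[j] * r[k - j] for j in range(k))
        refl = acc / e        # reflection coefficient
        a_new = a[:]
        a_new[k] = refl
        for j in range(k):
            a_new[j] = a[j] - refl * a[k - 1 - j]
        a, e = a_new, e * (1 - refl * refl)
    return a

# a clean sinusoid is close to an AR(2) process x[t] ~ 2cos(w)x[t-1] - x[t-2]
signal = [math.sin(0.3 * t) for t in range(200)]
coeffs = yule_walker(signal, 2)     # two-number feature vector
print(coeffs)
```

For the sinusoid above, the fitted coefficients approach the theoretical pair (2cos 0.3, −1), which is what makes AR coefficients a compact signature of a waveform's dominant dynamics.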
Multi-area combined economic/emission dispatch (MACEED) problems are generally studied using analytical functions. However, as the scale of power systems increases, existing solutions become time-consuming and may not meet operational constraints. To overcome excessive computational expense in high-dimensional MACEED problems, a novel data-driven surrogate-assisted method is proposed. First, a cosine-similarity-based deep belief network combined with a back-propagation (DBN+BP) neural network is utilized to replace the cost and emission functions. Second, transfer learning is applied with a pretraining and fine-tuning method to improve the DBN+BP regression surrogate models, thus realizing fast construction of surrogate models between different regional power systems. Third, a multi-objective antlion optimizer with a novel general single-dimension retention bi-objective optimization policy is proposed to execute MACEED optimization and obtain scheduling decisions. The proposed method not only ensures the convergence, uniformity, and extensibility of the Pareto front, but also greatly reduces computational time. Finally, a 4-area 40-unit test system with different constraints is employed to demonstrate the effectiveness of the proposed method.
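The cosine-similarity ingredient of the surrogate construction can be sketched as follows. The feature vectors and area names are hypothetical, and the DBN+BP surrogate network itself is omitted:

```python
# Sketch of the cosine-similarity step (illustrative): similarity between a
# new operating-condition feature vector and stored reference vectors can be
# used to decide which pretrained surrogate to reuse when fine-tuning for
# another regional power system.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# hypothetical per-area operating-condition features
stored = {"areaA": [0.9, 0.1, 0.3], "areaB": [0.1, 0.8, 0.9]}
query = [0.85, 0.15, 0.25]
best = max(stored, key=lambda k: cosine_similarity(query, stored[k]))
print(best)   # -> areaA: the most similar pretrained surrogate is reused
```

Reusing the nearest pretrained model before fine-tuning is the step that makes surrogate construction across regions fast, since most of the training cost is paid once.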
Funding: partially supported by the National Natural Science Foundation of China (61751306, 61801208, 61671233), the Jiangsu Science Foundation (BK20170650), the Postdoctoral Science Foundation of China (BX201700118, 2017M621712), the Jiangsu Postdoctoral Science Foundation (1701118B), and the Fundamental Research Funds for the Central Universities (021014380094).
Funding: supported by the National Key R&D Program of China (Grant No. 2021YFC2100100), the National Natural Science Foundation of China (Grant No. 21901157), the Shanghai Science and Technology Project of China (Grant No. 21JC1403400), and the SJTU Global Strategic Partnership Fund (Grant No. 2020 SJTUHUJI).
Funding: supported by the National Key Research and Development Program of China (2018YFB1004902) and the Natural Science Foundation of China (61772329, 61373085).
Funding: supported by the National Natural Science Foundation of China (42077191), the National Key Research and Development Program of China (2022YFC3703400), the Blue Sky Foundation, the Tianjin Science and Technology Plan Project (18PTZWHZ00120), and the Fundamental Research Funds for the Central Universities (63213072 and 63213074).
Funding: supported by the National Natural Science Foundation of China under Grant 62073232; the Science and Technology Project of Shenzhen, China (KCXST20221021111402006, JSGG20220831105800002); the "Nanling Team Project" of Shaoguan City; the Science and Technology Project of Tianjin, China (22YFYSHZ00330); Shenzhen Excellent Innovative Talents (RCYX20221008093036022); the Shenzhen-Hong Kong Joint Funding Project (A) (SGDX20230116092053005); and the Shenzhen Undertaking the National Major Science and Technology Program, China (CJGJZD20220517141405012).