The power Internet of Things (IoT) is a significant trend in technology and a requirement for national strategic development. With the deepening digital transformation of the power grid, China's power system has initially built a power IoT architecture comprising perception, network, and platform application layers. However, owing to the structural complexity of the power system, the construction of the power IoT continues to face problems such as complex access management of massive heterogeneous equipment, diverse IoT protocol access methods, high concurrency of network communications, and weak data security protection. To address these issues, this study optimizes the existing architecture of the power IoT and designs an integrated management framework for the access of multi-source heterogeneous data in the power IoT, comprising cloud, pipe, edge, and terminal parts. It further reviews and analyzes the key technologies involved in the power IoT, such as the unified management of the physical model, high-concurrency access, multi-protocol access, multi-source heterogeneous data storage management, and data security control, to provide a more flexible, efficient, secure, and easy-to-use solution for multi-source heterogeneous data access in the power IoT.
Long runout landslides involve a massive amount of energy and can be extremely hazardous owing to their long movement distance, high mobility, and strong destructive power. Numerical methods have been widely used to predict landslide runout, but a fundamental remaining problem is how to determine reliable numerical parameters. This study proposes a framework to predict the runout of potential landslides through multi-source data collaboration and numerical analysis of historical landslide events. Specifically, for historical landslide cases, the landslide-induced seismic signal, geophysical surveys, and possible in-situ drone/phone videos (multi-source data collaboration) can validate the numerical results in terms of landslide dynamics and deposit features and help calibrate the numerical (rheological) parameters. Subsequently, the calibrated numerical parameters can be used to numerically predict the runout of potential landslides in a region with a geological setting similar to that of the recorded events. Application of the runout prediction approach to the 2020 Jiashanying landslide in Guizhou, China gives reasonable results in comparison with the field observations. The numerical parameters are determined from the multi-source data collaboration analysis of a historical case in the region (the 2019 Shuicheng landslide). The proposed framework for landslide runout prediction can be of great utility for landslide risk assessment and disaster reduction in mountainous regions worldwide.
Urban functional area (UFA) is a core scientific issue affecting urban sustainability. The current knowledge gap is mainly reflected in the lack of multi-scale quantitative interpretation methods from the perspective of human-land interaction. In this paper, based on multi-source big data including 250 m × 250 m resolution cell phone data, 1.81 × 10^5 Points of Interest (POI) records, and administrative boundary data, we built a UFA identification method and demonstrated it empirically in Shenyang City, China. We argue that the method can effectively identify multi-scale, multi-type UFAs based on human activity and further reveal the spatial correlation between urban facilities and human activity. The empirical study suggests that the employment functional zones in Shenyang City are more concentrated in the central city than other single functional zones. There are more mixed functional areas in the central city areas, while the planned industrial new cities in Shenyang need to develop comprehensive functions. UFAs have scale effects and human-land interaction patterns. We suggest that city decision makers apply multi-source big data to measure urban functional services in a more refined manner from a supply-demand perspective.
With the increased availability of experimental measurements aiming at probing wind resources and wind turbine operations, machine learning (ML) models are poised to advance our understanding of the physics underpinning the interaction between the atmospheric boundary layer and wind turbine arrays, the generated wakes and their interactions, and wind energy harvesting. However, the majority of the existing ML models for predicting wind turbine wakes merely recreate computational fluid dynamics (CFD) simulated data with analogous accuracy but reduced computational costs, thus providing surrogate models rather than enhanced data-enabled physics insights. Although ML-based surrogate models are useful to overcome current limitations associated with the high computational costs of CFD models, using ML to unveil processes from experimental data or enhance modeling capabilities is deemed a potential research direction to pursue. In this letter, we discuss recent achievements in the realm of ML modeling of wind turbine wakes and operations, along with new promising research strategies.
Multi-source data plays an important role in the evolution of media convergence. Its fusion processing enables the further mining of data and utilization of data value and broadens the path for the sharing and dissemination of media data. However, it also faces serious problems in terms of protecting user and data privacy. Many privacy protection methods have been proposed to solve the problem of privacy leakage during the process of data sharing, but they suffer from two flaws: 1) the lack of algorithmic frameworks for specific scenarios such as dynamic datasets in the media domain; 2) the inability to solve the problem of the high computational complexity of ciphertext in multi-source data privacy protection, resulting in long encryption and decryption times. In this paper, we propose a multi-source data privacy protection method based on homomorphic encryption and blockchain technology, which solves the privacy protection problem of multi-source heterogeneous data in the dissemination of media and reduces ciphertext processing time. We deployed the proposed method on the Hyperledger platform for testing and compared it with privacy protection schemes based on k-anonymity and differential privacy. The experimental results show that the key generation, encryption, and decryption times of the proposed method are lower than those of data privacy protection methods based on k-anonymity and differential privacy. This significantly reduces the processing time of multi-source data, which gives the method potential for use in many applications.
Current methodologies for cleaning wind power anomaly data exhibit limited capabilities in identifying abnormal data within extensive datasets and struggle to accommodate the considerable variability and intricacy of wind farm data. Consequently, a method for cleaning wind power anomaly data by combining image processing with community detection algorithms (CWPAD-IPCDA) is proposed. To precisely identify and initially clean anomalous data, wind power curve (WPC) images are converted into graph structures, to which the Louvain community recognition algorithm and graph-theoretic methods are applied for community detection and segmentation. Furthermore, the mathematical morphology operation (MMO) determines the main part of the initially cleaned wind power curve images and maps it back to the normal wind power points to complete the final cleaning. The CWPAD-IPCDA method was applied to clean datasets from 25 wind turbines (WTs) in two wind farms in northwest China to validate its feasibility. A comparison was conducted using the density-based spatial clustering of applications with noise (DBSCAN) algorithm, an improved isolation forest algorithm, and an image-based (IB) algorithm. The experimental results demonstrate that the CWPAD-IPCDA method surpasses the other three algorithms, achieving an approximately 7.23% higher average data cleaning rate. The mean value of the sum of the squared errors (SSE) of the dataset after cleaning is approximately 6.887 lower than that of the other algorithms. Moreover, the mean overall accuracy, as measured by the F1-score, exceeds that of the other methods by approximately 10.49%; this indicates that the CWPAD-IPCDA method is more conducive to improving the accuracy and reliability of wind power curve modeling and wind farm power forecasting.
In traditional medicine and ethnomedicine, medicinal plants have long been recognized as the basis for materials in therapeutic applications worldwide. In particular, the remarkable curative effect of traditional Chinese medicine during the coronavirus disease 2019 (COVID-19) pandemic has attracted extensive attention globally. Medicinal plants have, therefore, become increasingly popular among the public. However, with increasing demand for and profit from medicinal plants, commercial fraud such as adulteration or counterfeiting sometimes occurs, which poses a serious threat to clinical outcomes and the interests of consumers. With rapid advances in artificial intelligence, machine learning can be used to mine information on various medicinal plants to establish an ideal resource database. We herein present a review that mainly introduces common machine learning algorithms and discusses their application in multi-source data analysis of medicinal plants. The combination of machine learning algorithms and multi-source data analysis facilitates a comprehensive analysis and aids in the effective evaluation of the quality of medicinal plants. The findings of this review provide new possibilities for promoting the development and utilization of medicinal plants.
Distribution networks are important public infrastructure necessary for people's livelihoods. However, extreme natural disasters, such as earthquakes, typhoons, and mudslides, severely threaten the safe and stable operation of distribution networks and the power supplies needed for daily life. Therefore, considering the requirements for distribution network disaster prevention and mitigation, there is an urgent need for in-depth research on risk assessment methods for distribution networks under extreme natural disaster conditions. This paper accesses multi-source data, presents data quality improvement methods for distribution networks, and conducts active fault diagnosis and disaster damage analysis and evaluation using data-driven theory. Furthermore, the paper realizes real-time, accurate access to distribution network disaster information. In a case study, the proposed approach performs an accurate and rapid assessment of cross-sectional risk, and the minimum average annual outage time in the ring network can be reduced to 3 h/a. The approach proposed in this paper can provide technical support for further improving the ability of distribution networks to cope with extreme natural disasters.
In first-tier cities, the subway has become an important carrier and focus of people's daily travel activities. By studying the distribution of POIs of public service facilities around Metro Line 10, using GIS to quantitatively analyze the formats surrounding subway stations, and discussing the functional attributes of subway stations and the distribution of urban functions from a new perspective, this paper provides guidance and advice for the construction of service facilities.
The slow traffic system is an important component of urban transportation, and establishing a good urban slow traffic system is a prerequisite and necessary condition for Beijing to continue promoting "green priority". Shijingshan District of Beijing is taken as the research object. By analyzing and processing population distribution data, POI data, and shared bicycle data, the shortcomings and deficiencies of the current slow traffic system in Shijingshan District are explored, and corresponding solutions are proposed, in order to provide new ideas and methods for future urban planning from the perspective of data.
Randomness and fluctuations in wind power output may cause changes in important parameters (e.g., grid frequency and voltage), which in turn affect the stable operation of a power system. However, owing to external factors (such as weather), there are often various anomalies in wind power data, such as missing numerical values and unreasonable data. This significantly affects the accuracy of wind power generation predictions and operational decisions. Therefore, developing and applying reliable wind power interpolation methods is important for promoting the sustainable development of the wind power industry. In this study, the causes of abnormal data in wind power generation were first analyzed from a practical perspective. Second, an improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) method combined with a generative adversarial interpolation network (GAIN) was proposed to preprocess wind power generation data and interpolate missing wind power generation sub-components. Finally, a complete wind power generation time series was reconstructed. Compared to traditional methods, the proposed ICEEMDAN-GAIN combined interpolation model has a higher interpolation accuracy and can effectively reduce the error caused by fluctuations in the wind power generation sequence.
Wind and wave data are essential in climatological and engineering design applications. In this study, data from 15 buoys located throughout the South China Sea (SCS) were used to evaluate the ERA5 wind and wave data. Applicability assessments are beneficial for gaining insight into the reliability of the ERA5 data in the SCS. The bias between the ERA5 and observed wind-speed data ranged from -0.78 to 0.99 m/s. The result indicates that, while underestimation of the ERA5 wind-speed data was dominant, overestimation existed as well. Additionally, the ERA5 data underestimated annual maximum wind speed by up to 38%, with a correlation coefficient > 0.87. The bias between the ERA5 and observed significant wave height (SWH) data varied from -0.24 to 0.28 m. The ERA5 data showed positive SWH bias, which implied a general underestimation at all locations, except those in the Beibu Gulf and central-western SCS, where overestimation was observed. Under extreme conditions, annual maximum SWH in the ERA5 data was underestimated by up to 30%. The correlation coefficients between the ERA5 and observed SWH data at all locations were greater than 0.92, except in the central-western SCS (0.84). The bias between the ERA5 and observed mean wave period (MWP) data varied from -0.74 to 0.57 s. The ERA5 data showed negative MWP biases, implying a general overestimation at all locations, except for B1 (the Beibu Gulf) and B7 (the northeastern SCS), where underestimation was observed. The correlation coefficient between the ERA5 and observed MWP data in the Beibu Gulf was the smallest (0.56), and those of other locations fluctuated within a narrow range from 0.82 to 0.90. The intercomparison indicates that during the analyzed time span, the ERA5 data generally underestimated wind speed and SWH, but overestimated MWP. Under non-extreme conditions, the ERA5 wind-speed and SWH data can be used with confidence in most regions of the SCS, except in the central-western SCS.
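The bias and correlation statistics quoted in the ERA5 abstract above can be illustrated with a minimal Python sketch. It assumes that bias is the mean of (observed minus ERA5), a sign convention inferred from the abstract's statement that a positive SWH bias implies underestimation; the function names and sample values are hypothetical and are not taken from the paper.

```python
from math import sqrt


def mean_bias(observed, reanalysis):
    """Mean of (observed - reanalysis).

    Assumed convention: a positive value means the reanalysis underestimates.
    """
    if len(observed) != len(reanalysis) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    return sum(o - r for o, r in zip(observed, reanalysis)) / len(observed)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


# Hypothetical buoy and reanalysis significant wave heights, in metres.
buoy_swh = [1.2, 1.8, 2.5, 3.1, 0.9]
era5_swh = [1.0, 1.7, 2.2, 2.8, 0.9]

print("bias (m):", mean_bias(buoy_swh, era5_swh))
print("correlation:", pearson_r(buoy_swh, era5_swh))
```

With the opposite convention (ERA5 minus observed), the sign of the bias flips while the correlation is unchanged, so the convention should be checked against the paper before comparing numbers.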
Due to the high inherent uncertainty of renewable energy, probabilistic day-ahead wind power forecasting is crucial for modeling and controlling the uncertainty of renewable energy smart grids in smart cities. However, the accuracy and reliability of high-resolution day-ahead wind power forecasting are constrained by unreliable local weather prediction and incomplete power generation data. This article proposes a physics-informed artificial intelligence (AI) surrogate method to augment the incomplete dataset and quantify its uncertainty to improve wind power forecasting performance. The incomplete dataset, built with numerical weather prediction data, historical wind power generation, and weather factor data, is augmented based on generative adversarial networks. After augmentation, the enriched data are fed into a multiple AI surrogates model constructed from two extreme learning machine networks to train the forecasting model for wind power. The forecasting model's accuracy and generalization ability are thereby improved by mining the implicit physics information from the incomplete dataset. An incomplete dataset gathered from a wind farm in North China, containing only 15 days of weather and wind power generation data with missing points caused by occasional shutdowns, is utilized to verify the proposed method's performance. Compared with other probabilistic forecasting methods, the proposed method shows better accuracy and probabilistic performance on the same incomplete dataset, which highlights its potential for more flexible and sensitive maintenance of smart grids in smart cities.
In order to estimate vehicular queue length at signalized intersections accurately and overcome the shortcomings and restrictions of existing studies, especially those based on shockwave theory, a new methodology is presented for estimating vehicular queue length using data from both point detectors and probe vehicles. The methodology applies shockwave theory to model queue evolution over time and space. Using probe vehicle locations and times as well as traffic states measured by point detectors, analytical formulations for calculating the maximum and minimum (residual) queue length are developed. The proposed methodology is verified using ground truth data collected from numerical experiments conducted in Shanghai, China. It is found that the methodology has a mean absolute percentage error of 17.09%, which is reasonably effective in estimating the queue length at signalized intersections. Limitations of the proposed models and algorithms are also discussed in the paper.
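For readers unfamiliar with the error metric quoted above, the following is a minimal Python sketch of the mean absolute percentage error (MAPE). It uses the standard definition; the function name and the queue lengths are hypothetical illustration values and do not come from the study.

```python
def mape(actual, estimated):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(estimated) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs(a - e) / abs(a) for a, e in zip(actual, estimated))
    return 100.0 * total / len(actual)


# Hypothetical ground-truth and estimated queue lengths, in metres.
true_queue = [50.0, 80.0, 120.0]
est_queue = [45.0, 88.0, 108.0]

# Each estimate is off by exactly 10% of the true value.
print(f"MAPE = {mape(true_queue, est_queue):.2f}%")  # prints "MAPE = 10.00%"
```

Because MAPE divides by the true value, cycles with a zero or near-zero queue need separate handling, which is why the sketch rejects zero actual values.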
Data fusion can effectively process multi-sensor information to obtain more accurate and reliable results than a single sensor. Water quality data in the environment come from different sensors, so the data must be fused. In our research, a self-adaptive weighted data fusion method is used to integrate, separately, the pH, temperature, dissolved oxygen, and NH3 concentration data of the water quality environment. Based on the fusion, the Grubbs method is used to detect abnormal data so as to provide data support for the estimation, prediction, and early warning of water quality.
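The two steps named in the abstract above can be sketched in Python. The abstract does not give formulas, so this sketch assumes the common inverse-variance form of adaptive weighting (each sensor's weight is proportional to 1/variance of its readings) and computes only the Grubbs test statistic G = max|x - mean| / s; the critical value must be supplied from a Grubbs table for the chosen sample size and significance level. All function names and readings are hypothetical.

```python
from statistics import mean, stdev, variance


def adaptive_weights(variances):
    """Inverse-variance weights (assumed scheme); they sum to 1."""
    if any(v <= 0 for v in variances):
        raise ValueError("each sensor needs a strictly positive variance")
    inverse = [1.0 / v for v in variances]
    total = sum(inverse)
    return [w / total for w in inverse]


def fuse(readings_per_sensor):
    """Fuse repeated readings from several sensors measuring one quantity.

    Returns the weighted estimate and the weight given to each sensor.
    """
    means = [mean(r) for r in readings_per_sensor]
    weights = adaptive_weights([variance(r) for r in readings_per_sensor])
    fused = sum(w * m for w, m in zip(weights, means))
    return fused, weights


def grubbs_statistic(values):
    """Return (G, index) for the value farthest from the sample mean."""
    m, s = mean(values), stdev(values)
    if s == 0:
        raise ValueError("Grubbs statistic is undefined for constant data")
    idx = max(range(len(values)), key=lambda i: abs(values[i] - m))
    return abs(values[idx] - m) / s, idx


def is_outlier(values, critical_value):
    """Flag the most extreme value if G exceeds a table-supplied threshold."""
    g, idx = grubbs_statistic(values)
    return g > critical_value, idx


# Hypothetical pH readings from three sensors; the second is the noisiest.
ph_sensors = [
    [7.02, 7.05, 6.98, 7.01],
    [7.20, 6.85, 7.10, 6.90],
    [7.00, 7.04, 7.03, 6.99],
]
fused_ph, weights = fuse(ph_sensors)
print("fused pH:", fused_ph)
print("weights:", weights)
```

Under this scheme noisier sensors contribute less to the fused value, which is the usual motivation for adaptive weighting; other weighting rules would fit the abstract's description equally well.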
For reservoirs with complex non-Gaussian geological characteristics, such as carbonate reservoirs or reservoirs with sedimentary facies distribution, it is difficult to implement history matching directly, especially with ensemble-based data assimilation methods. In this paper, we propose a multi-source information fused generative adversarial network (MSIGAN) model, which is used for the parameterization of complex geologies. In MSIGAN, various information, such as facies distribution, microseismic data, and inter-well connectivity, can be integrated to learn the geological features. Two major generative models in deep learning, the variational autoencoder (VAE) and the generative adversarial network (GAN), are combined in our model. The proposed MSIGAN model is then integrated into the ensemble smoother with multiple data assimilation (ESMDA) method to conduct history matching. We tested the proposed method on two reservoir models with fluvial facies. The experimental results show that the proposed MSIGAN model can effectively learn the complex geological features, which can improve the accuracy of history matching.
Cyber Threat Intelligence (CTI) is a valuable resource for cybersecurity defense, but it also poses challenges due to its multi-source and heterogeneous nature. Security personnel may be unable to use CTI effectively to understand the condition and trend of a cyberattack and respond promptly. To address these challenges, we propose a novel approach that consists of three steps. First, we construct the attack and defense analysis of the cybersecurity ontology (ADACO) model by integrating multiple cybersecurity databases. Second, we develop the threat evolution prediction algorithm (TEPA), which can automatically detect threats at device nodes, correlate and map multi-source threat information, and dynamically infer the threat evolution process. TEPA leverages knowledge graphs to represent comprehensive threat scenarios and achieves better performance in simulated experiments by combining structural and textual features of entities. Third, we design the intelligent defense decision algorithm (IDDA), which can provide intelligent recommendations for security personnel regarding the most suitable defense techniques. IDDA outperforms the baseline methods in the comparative experiment.
The development of 3D geological models involves the integration of large amounts of geological data, as well as additional accessible proprietary lithological, structural, geochemical, geophysical, and borehole data. Luanchuan, the case study area, in southwestern Henan Province, is an important molybdenum-tungsten-lead-zinc polymetallic belt in China.
The salinity of a salt lake is an important factor in evaluating whether it contains mineral resources, and a fault buried in the salt lake can control the abundance of salinity. Therefore, it is of great geological importance to identify faults buried in salt lakes. Taking the Gasikule Salt Lake in China as an example, this paper establishes a new method to identify faults buried in a salt lake based on multi-source remote sensing data, including Landsat TM, SPOT-5, and ASTER data. The method comprises the acquisition and selection of the multi-source remote sensing data, data preprocessing, lake waterfront extraction, spectrum extraction of brine with different salinity, salinity index construction, salinity separation, analysis of abnormal salinity and identification of the buried fault, temperature inversion of brine, and fault verification. As a result, the study identified an important fault buried in the east of the Gasikule Salt Lake that controls the highest salinity anomaly. Because the level of salinity is positively correlated with mineral abundance, the result provides an important reference for identifying water bodies rich in mineral resources in the salt lake.
Funding (power IoT multi-source data access study): Supported by the National Key Research and Development Program of China (grant number 2019YFE0123600).
Funding (landslide runout prediction study): Supported by the National Natural Science Foundation of China (41977215).
Funding (urban functional area study): Under the auspices of the Natural Science Foundation of China (No. 41971166).
Funding (wind turbine wake ML letter): Supported by the National Science Foundation (NSF) CBET, Fluid Dynamics CAREER program (Grant No. 2046160), program manager Ron Joslin.
Funding (multi-source data privacy protection study): Funded by the High-Quality and Cutting-Edge Discipline Construction Project for Universities in Beijing (Internet Information, Communication University of China).
Abstract: Multi-source data plays an important role in the evolution of media convergence. Its fusion processing enables the further mining of data and utilization of data value and broadens the path for the sharing and dissemination of media data. However, it also faces serious problems in terms of protecting user and data privacy. Many privacy protection methods have been proposed to solve the problem of privacy leakage during the process of data sharing, but they suffer from two flaws: 1) the lack of algorithmic frameworks for specific scenarios such as dynamic datasets in the media domain; 2) the inability to solve the problem of the high computational complexity of ciphertext in multi-source data privacy protection, resulting in long encryption and decryption times. In this paper, we propose a multi-source data privacy protection method based on homomorphic encryption and blockchain technology, which solves the privacy protection problem of multi-source heterogeneous data in the dissemination of media and reduces ciphertext processing time. We deployed the proposed method on the Hyperledger platform for testing and compared it with the privacy protection schemes based on k-anonymity and differential privacy. The experimental results show that the key generation, encryption, and decryption times of the proposed method are lower than those of data privacy protection methods based on k-anonymity and differential privacy. This significantly reduces the processing time of multi-source data, which gives the method potential for use in many applications.
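The abstract above does not say which homomorphic scheme the authors use. As an illustration only, the sketch below uses a toy Paillier cryptosystem (an assumption, not the paper's method) to show what "homomorphic" means in practice: two ciphertexts can be combined so that the result decrypts to the sum of the plaintexts. The primes are deliberately tiny and the function names are hypothetical; this is not secure and is not tied to Hyperledger.

```python
import math
import random

def keygen(p, q):
    """Build a toy Paillier key pair from two small primes (illustrative only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid because g = n + 1 is used below
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public key n with generator g = n + 1."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:  # r must be invertible modulo n
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, private_key, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = private_key
    l_value = (pow(c, lam, n * n) - 1) // n  # the L function: L(x) = (x - 1) / n
    return (l_value * mu) % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts corresponds to adding the underlying plaintexts."""
    return (c1 * c2) % (n * n)

# Two parties encrypt their values; a third party sums them without decrypting.
n, private_key = keygen(61, 53)
total_cipher = add_encrypted(n, encrypt(n, 12), encrypt(n, 30))
print(decrypt(n, private_key, total_cipher))
```

The modular exponentiations on numbers of size n² are the main cost, which is the kind of ciphertext processing time the abstract refers to.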
Funding: Supported by the National Natural Science Foundation of China (Project No. 51767018) and the Natural Science Foundation of Gansu Province (Project No. 23JRRA836).
Abstract: Current methodologies for cleaning wind power anomaly data exhibit limited capabilities in identifying abnormal data within extensive datasets and struggle to accommodate the considerable variability and intricacy of wind farm data. Consequently, a method for cleaning wind power anomaly data by combining image processing with community detection algorithms (CWPAD-IPCDA) is proposed. To precisely identify and initially clean anomalous data, wind power curve (WPC) images are converted into graph structures, which employ the Louvain community recognition algorithm and graph-theoretic methods for community detection and segmentation. Furthermore, the mathematical morphology operation (MMO) determines the main part of the initially cleaned wind power curve images and maps them back to the normal wind power points to complete the final cleaning. The CWPAD-IPCDA method was applied to clean datasets from 25 wind turbines (WTs) in two wind farms in northwest China to validate its feasibility. A comparison was conducted using the density-based spatial clustering of applications with noise (DBSCAN) algorithm, an improved isolation forest algorithm, and an image-based (IB) algorithm. The experimental results demonstrate that the CWPAD-IPCDA method surpasses the other three algorithms, achieving an approximately 7.23% higher average data cleaning rate. The mean value of the sum of the squared errors (SSE) of the dataset after cleaning is approximately 6.887 lower than that of the other algorithms. Moreover, the mean of overall accuracy, as measured by the F1-score, exceeds that of the other methods by approximately 10.49%; this indicates that the CWPAD-IPCDA method is more conducive to improving the accuracy and reliability of wind power curve modeling and wind farm power forecasting.
Funding: Supported by the National Natural Science Foundation of China (Grant No. U2202213) and the Special Program for the Major Science and Technology Projects of Yunnan Province, China (Grant Nos. 202102AE090051-1-01 and 202202AE090001).
Abstract: In traditional medicine and ethnomedicine, medicinal plants have long been recognized as the basis for materials in therapeutic applications worldwide. In particular, the remarkable curative effect of traditional Chinese medicine during the coronavirus disease 2019 (COVID-19) pandemic has attracted extensive attention globally. Medicinal plants have, therefore, become increasingly popular among the public. However, with increasing demand for and profit from medicinal plants, commercial fraud such as adulteration or counterfeiting sometimes occurs, which poses a serious threat to clinical outcomes and the interests of consumers. With rapid advances in artificial intelligence, machine learning can be used to mine information on various medicinal plants to establish an ideal resource database. We herein present a review that mainly introduces common machine learning algorithms and discusses their application in multi-source data analysis of medicinal plants. The combination of machine learning algorithms and multi-source data analysis facilitates a comprehensive analysis and aids in the effective evaluation of the quality of medicinal plants. The findings of this review provide new possibilities for promoting the development and utilization of medicinal plants.
Abstract: Distribution networks are important public infrastructure necessary for people's livelihoods. However, extreme natural disasters, such as earthquakes, typhoons, and mudslides, severely threaten the safe and stable operation of distribution networks and the power supplies needed for daily life. Therefore, considering the requirements for distribution network disaster prevention and mitigation, there is an urgent need for in-depth research on risk assessment methods for distribution networks under extreme natural disaster conditions. This paper accesses multi-source data, presents data quality improvement methods for distribution networks, and conducts active fault diagnosis and disaster damage analysis and evaluation using data-driven theory. Furthermore, the paper realizes real-time, accurate access to distribution network disaster information. A case study shows that the proposed approach performs an accurate and rapid assessment of cross-sectional risk and that the minimal average annual outage time in the ring network can be reduced to 3 h/a. The approach proposed in this paper can provide technical support for further improving the ability of distribution networks to cope with extreme natural disasters.
Funding: Beijing Municipal Social Science Foundation (22GLC062), "Research on service function renewal of Beijing subway station living circle driven by multiple big data"; Beijing Municipal Education Commission Social Science Project (KM202010009002); Young YuYou Talents Training Plan of North China University of Technology.
Abstract: In first-tier cities, the subway has become an important carrier and focus of people's daily travel activities. By studying the distribution of POIs of public service facilities around Metro Line 10, using GIS to quantitatively analyze the formats surrounding subway stations, discussing the functional attributes of subway stations, and examining the distribution of urban functions from a new perspective, this paper provides guidance and advice for the construction of service facilities.
Funding: Sponsored by the Beijing Natural Science Foundation General Project (8212009); Construction of Philosophy and Social Sciences Base in Beijing, "Research on Beijing Urban Renewal and Comprehensive Management of Old Community Environment"; 2023 Education Reform Project of North China University of Technology (108051360023XN264-25).
Abstract: The slow traffic system is an important component of urban transportation, and establishing a good urban slow traffic system is a prerequisite for Beijing to continue promoting "green priority". Shijingshan District of Beijing is taken as the research object. By analyzing and processing population distribution data, POI data, and shared bicycle data, the shortcomings of the current slow traffic system in Shijingshan District are explored and corresponding solutions are proposed, in order to provide new ideas and methods for future urban planning from the perspective of data.
Funding: We gratefully acknowledge the support of the National Natural Science Foundation of China (NSFC) (Grant Nos. 51977133 and U2066209).
Abstract: Randomness and fluctuations in wind power output may cause changes in important parameters (e.g., grid frequency and voltage), which in turn affect the stable operation of a power system. However, owing to external factors (such as weather), there are often various anomalies in wind power data, such as missing numerical values and unreasonable data. This significantly affects the accuracy of wind power generation predictions and operational decisions. Therefore, developing and applying reliable wind power interpolation methods is important for promoting the sustainable development of the wind power industry. In this study, the causes of abnormal data in wind power generation were first analyzed from a practical perspective. Second, an improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) method combined with a generative adversarial interpolation network (GAIN) was proposed to preprocess wind power generation data and interpolate missing wind power generation sub-components. Finally, a complete wind power generation time series was reconstructed. Compared to traditional methods, the proposed ICEEMDAN-GAIN combination interpolation model has a higher interpolation accuracy and can effectively reduce the error impact caused by wind power generation sequence fluctuations.
Funding: Supported by the Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai) (No. SML2021SP102), the Key Laboratory of Marine Environmental Survey Technology and Application, Ministry of Natural Resources (Nos. MESTA-2020-C003, MESTA-2020-C004), the Key Research and Development Project of Guangdong Province (No. 2020B1111020003), and the Science and Technology Research Project of Jiangxi Provincial Department of Education (No. GJJ200330).
Abstract: Wind and wave data are essential in climatological and engineering design applications. In this study, data from 15 buoys located throughout the South China Sea (SCS) were used to evaluate the ERA5 wind and wave data. Applicability assessments are beneficial for gaining insight into the reliability of the ERA5 data in the SCS. The bias between the ERA5 and observed wind-speed data ranged from -0.78 to 0.99 m/s. This result indicates that, while underestimation of wind speed by ERA5 was dominant, overestimation also occurred. Additionally, the ERA5 data underestimated annual maximum wind speed by up to 38%, with a correlation coefficient > 0.87. The bias between the ERA5 and observed significant wave height (SWH) data varied from -0.24 to 0.28 m. The ERA5 data showed positive SWH bias, which implied a general underestimation, at all locations except those in the Beibu Gulf and central-western SCS, where overestimation was observed. Under extreme conditions, annual maximum SWH in the ERA5 data was underestimated by up to 30%. The correlation coefficients between the ERA5 and observed SWH data at all locations were greater than 0.92, except in the central-western SCS (0.84). The bias between the ERA5 and observed mean wave period (MWP) data varied from -0.74 to 0.57 s. The ERA5 data showed negative MWP biases, implying a general overestimation, at all locations except B1 (the Beibu Gulf) and B7 (the northeastern SCS), where underestimation was observed. The correlation coefficient between the ERA5 and observed MWP data in the Beibu Gulf was the smallest (0.56), and those of other locations fluctuated within a narrow range from 0.82 to 0.90. The intercomparison indicates that during the analyzed time span, the ERA5 data generally underestimated wind speed and SWH but overestimated MWP. Under non-extreme conditions, the ERA5 wind-speed and SWH data can be used with confidence in most regions of the SCS, except in the central-western SCS.
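The two statistics reported in the abstract above, bias and correlation coefficient, can be sketched as follows. The sign convention (bias = observed minus ERA5, so a positive bias means ERA5 underestimates) is inferred from the abstract's wording, and the sample values are hypothetical rather than taken from the study.

```python
import math

def bias(observed, model):
    """Mean of (observed - model); positive values mean the model underestimates."""
    return sum(o - m for o, m in zip(observed, model)) / len(observed)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical significant wave heights in metres at one buoy location.
buoy_swh = [1.2, 1.8, 2.5, 3.1, 0.9]
era5_swh = [1.0, 1.7, 2.2, 2.8, 1.0]

print(f"bias = {bias(buoy_swh, era5_swh):+.2f} m")
print(f"r    = {pearson_r(buoy_swh, era5_swh):.3f}")
```

Under this convention a high correlation can coexist with a sizeable bias, which is why the abstract reports both numbers for each variable.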
Funding: Funded by the National Natural Science Foundation of China under Grant 62273022.
Abstract: Due to the high inherent uncertainty of renewable energy, probabilistic day-ahead wind power forecasting is crucial for modeling and controlling the uncertainty of renewable energy smart grids in smart cities. However, the accuracy and reliability of high-resolution day-ahead wind power forecasting are constrained by unreliable local weather prediction and incomplete power generation data. This article proposes a physics-informed artificial intelligence (AI) surrogates method to augment the incomplete dataset and quantify its uncertainty to improve wind power forecasting performance. The incomplete dataset, built with numerical weather prediction data, historical wind power generation, and weather factors data, is augmented based on generative adversarial networks. After augmentation, the enriched data is fed into a multiple AI surrogates model constructed from two extreme learning machine networks to train the forecasting model for wind power. The forecasting model's accuracy and generalization ability are thereby improved by mining the implicit physics information from the incomplete dataset. An incomplete dataset gathered from a wind farm in North China, containing only 15 days of weather and wind power generation data with missing points caused by occasional shutdowns, is utilized to verify the proposed method's performance. Compared with other probabilistic forecasting methods, the proposed method shows better accuracy and probabilistic performance on the same incomplete dataset, which highlights its potential for more flexible and sensitive maintenance of smart grids in smart cities.
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 51138003).
Abstract: In order to estimate vehicular queue length at signalized intersections accurately and overcome the shortcomings and restrictions of existing studies, especially those based on shockwave theory, a new methodology is presented for estimating vehicular queue length using data from both point detectors and probe vehicles. The methodology applies the shockwave theory to model queue evolution over time and space. Using probe vehicle locations and times as well as traffic states measured by point detectors, analytical formulations for calculating the maximum and minimum (residual) queue length are developed. The proposed methodology is verified using ground truth data collected from numerical experiments conducted in Shanghai, China. It is found that the methodology has a mean absolute percentage error of 17.09%, which is reasonably effective in estimating the queue length at signalized intersections. Limitations of the proposed models and algorithms are also discussed in the paper.
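Two quantities named in the abstract above can be made concrete with a short sketch: the shockwave speed between two traffic states (the standard flow-difference over density-difference relation from shockwave theory) and the mean absolute percentage error used to score the estimates. The function names and all numbers are hypothetical; the paper's own formulations for maximum and residual queue length are not reproduced here.

```python
def shockwave_speed(q_up, k_up, q_down, k_down):
    """Speed of the boundary between an upstream state and a downstream state.

    q is flow (veh/h), k is density (veh/km); the result is in km/h.
    A negative value means the boundary moves upstream, i.e. the queue grows.
    """
    return (q_down - q_up) / (k_down - k_up)

def mape(actual, estimated):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - e) / abs(a) for a, e in zip(actual, estimated)]
    return 100.0 * sum(errors) / len(errors)

# Arriving traffic (900 veh/h at 30 veh/km) meets a stopped queue
# (zero flow at an assumed jam density of 150 veh/km) during red.
w = shockwave_speed(q_up=900.0, k_up=30.0, q_down=0.0, k_down=150.0)
print(f"queue-forming shockwave speed: {w} km/h")

# Hypothetical ground-truth versus estimated queue lengths in metres.
print(f"MAPE: {mape([100.0, 80.0, 120.0], [90.0, 88.0, 120.0]):.2f}%")
```

Because MAPE divides by the true value, cycles with very short true queues can dominate the average, which is worth keeping in mind when reading a single headline figure such as 17.09%.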
Funding: This study was supported by the National Key Research and Development Project (Project No. 2017YFD0301506), the National Social Science Foundation (Project No. 71774052), and the Hunan Education Department Scientific Research Project (Project No. 17K04417A092).
Abstract: Data fusion can effectively process multi-sensor information to obtain more accurate and reliable results than a single sensor. Water quality data in the environment come from different sensors, so the data must be fused. In our research, a self-adaptive weighted data fusion method is used to integrate, for each parameter separately, the sensor data on pH value, temperature, dissolved oxygen, and NH3 concentration of the water environment. After fusion, the Grubbs method is used to detect abnormal data so as to provide data support for estimation, prediction, and early warning of water quality.
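The abstract above does not give its formulas, so the sketch below assumes a common formulation of self-adaptive weighted fusion in which each sensor is weighted by the inverse of its estimated variance, followed by the Grubbs statistic for the most extreme reading. Function names and sample values are hypothetical. The critical value for the Grubbs test depends on sample size and significance level and has to be taken from a table or a t-distribution routine, so it is left to the caller.

```python
from statistics import mean, stdev, variance

def adaptive_weighted_fusion(sensor_readings):
    """Fuse several sensors measuring the same quantity.

    Each sensor's weight is proportional to 1 / (sample variance of its readings),
    so noisier sensors contribute less. Assumes every sensor has nonzero variance.
    """
    means = [mean(r) for r in sensor_readings]
    inverse_variances = [1.0 / variance(r) for r in sensor_readings]
    total = sum(inverse_variances)
    weights = [iv / total for iv in inverse_variances]
    fused = sum(w * m for w, m in zip(weights, means))
    return fused, weights

def grubbs_statistic(values):
    """Return (G, index) for the reading farthest from the sample mean."""
    m, s = mean(values), stdev(values)
    index = max(range(len(values)), key=lambda i: abs(values[i] - m))
    return abs(values[index] - m) / s, index

# Hypothetical pH readings from two sensors at the same site.
fused_ph, weights = adaptive_weighted_fusion([[7.0, 7.2, 6.8], [7.1, 7.3, 7.2]])
print(fused_ph, weights)

# Hypothetical fused series with one suspicious value.
g, suspect = grubbs_statistic([7.1, 7.2, 7.0, 7.1, 9.5])
print(g, suspect)  # compare g against the tabulated critical value for n = 5
```

A reading is flagged as abnormal when its statistic exceeds the critical value; after removing it, the test is usually repeated on the remaining data.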
Funding: Supported by the National Natural Science Foundation of China under Grants 51722406, 52074340, and 51874335, the Shandong Provincial Natural Science Foundation under Grant JQ201808, the Fundamental Research Funds for the Central Universities under Grant 18CX02097A, the Major Scientific and Technological Projects of CNPC under Grant ZD2019-183-008, the Science and Technology Support Plan for Youth Innovation of University in Shandong Province under Grant 2019KJH002, the National Research Council of Science and Technology Major Project of China under Grant 2016ZX05025001-006, the 111 Project under Grant B08028, and the Sinopec Science and Technology Project under Grant P20050-1.
Abstract: For reservoirs with complex non-Gaussian geological characteristics, such as carbonate reservoirs or reservoirs with sedimentary facies distribution, it is difficult to implement history matching directly, especially with ensemble-based data assimilation methods. In this paper, we propose a multi-source information fused generative adversarial network (MSIGAN) model, which is used for parameterization of complex geologies. In MSIGAN, various information, such as facies distribution, microseismic data, and inter-well connectivity, can be integrated to learn the geological features. Two major generative models in deep learning, the variational autoencoder (VAE) and the generative adversarial network (GAN), are combined in our model. The proposed MSIGAN model is then integrated into the ensemble smoother with multiple data assimilation (ESMDA) method to conduct history matching. We tested the proposed method on two reservoir models with fluvial facies. The experimental results show that the proposed MSIGAN model can effectively learn the complex geological features, which can improve the accuracy of history matching.
Abstract: Cyber Threat Intelligence (CTI) is a valuable resource for cybersecurity defense, but it also poses challenges due to its multi-source and heterogeneous nature. Security personnel may be unable to use CTI effectively to understand the condition and trend of a cyberattack and respond promptly. To address these challenges, we propose a novel approach that consists of three steps. First, we construct the attack and defense analysis of the cybersecurity ontology (ADACO) model by integrating multiple cybersecurity databases. Second, we develop the threat evolution prediction algorithm (TEPA), which can automatically detect threats at device nodes, correlate and map multi-source threat information, and dynamically infer the threat evolution process. TEPA leverages knowledge graphs to represent comprehensive threat scenarios and achieves better performance in simulated experiments by combining structural and textual features of entities. Third, we design the intelligent defense decision algorithm (IDDA), which can provide intelligent recommendations for security personnel regarding the most suitable defense techniques. IDDA outperforms the baseline methods in the comparative experiment.
Abstract: The development of 3D geological models involves the integration of large amounts of geological data, as well as additional accessible proprietary lithological, structural, geochemical, geophysical, and borehole data. Luanchuan, the case study area in southwestern Henan Province, is an important molybdenum-tungsten-lead-zinc polymetallic belt in China.
Funding: This work was supported by the National Advance Research Program (Item No. Y1601-1).
Abstract: The salinity of a salt lake is an important factor in evaluating whether it contains mineral resources, and a fault buried in the salt lake can control the salinity level. Therefore, it is of great geological importance to identify faults buried in salt lakes. Taking the Gasikule Salt Lake in China as an example, the paper established a new method to identify the fault buried in the salt lake based on multi-source remote sensing data including Landsat TM, SPOT-5, and ASTER data. The method includes the acquisition and selection of the multi-source remote sensing data, data preprocessing, lake waterfront extraction, spectrum extraction of brine with different salinity, salinity index construction, salinity separation, analysis of the abnormal salinity and identification of the fault buried in the salt lake, temperature inversion of brine, and fault verification. As a result, the study identified an important fault buried in the east of the Gasikule Salt Lake that controls the highest salinity anomaly. Because the level of salinity is positively correlated with mineral abundance, the result provides an important reference for identifying water bodies rich in mineral resources in the salt lake.