Funding: Sponsored by the National Natural Science Foundation of China (Nos. 61972208, 62102194 and 62102196); National Natural Science Foundation of China (Youth Project) (No. 62302237); Six Talent Peaks Project of Jiangsu Province (No. RJFW-111); China Postdoctoral Science Foundation Project (No. 2018M640509); Postgraduate Research and Practice Innovation Program of Jiangsu Province (Nos. KYCX22_1019, KYCX23_1087, KYCX22_1027, KYCX23_1087, SJCX24_0339 and SJCX24_0346); Innovative Training Program for College Students of Nanjing University of Posts and Telecommunications (No. XZD2019116); Nanjing University of Posts and Telecommunications College Students Innovation Training Program (Nos. XZD2019116, XYB2019331).
Abstract: The scale and complexity of big data are growing continuously, posing severe challenges to traditional data processing methods, especially in clustering analysis. To address this issue, this paper introduces a new method named Big Data Tensor Multi-Cluster Distributed Incremental Update (BDTMCDIncreUpdate), which combines distributed computing, storage technology, and incremental update techniques to provide an efficient and effective means for clustering analysis. First, the original dataset is divided into multiple sub-blocks, and distributed computing resources process the sub-blocks in parallel, enhancing efficiency. Then, initial clustering is performed on each sub-block using tensor-based multi-clustering techniques to obtain preliminary results. When new data arrive, incremental update technology is employed to update the core tensor and factor matrix, ensuring that the clustering model can adapt to changes in the data. Finally, by combining the updated core tensor and factor matrix with historical computational results, refined clustering results are obtained, achieving real-time adaptation to dynamic data. In experimental simulations on the Aminer dataset, BDTMCDIncreUpdate achieved an accuracy (ACC) of 90% and a normalized mutual information (NMI) score of 0.85, outperforming existing methods such as TClusInitUpdate and TKLClusUpdate in most scenarios. The method therefore offers an innovative solution for big data analysis, integrating distributed computing, incremental updates, and tensor-based multi-clustering techniques. It improves efficiency and scalability in processing large-scale, high-dimensional datasets, and its effectiveness and accuracy have been validated experimentally. The method shows great potential in real-world applications where dynamic data growth is common and is significant for advancing data analysis technology.
Funding: Jointly sponsored by the Shenzhen Science and Technology Innovation Commission (Grant No. KCXFZ20201221173610028) and the Key Program of the National Natural Science Foundation of China (Grant No. 42130605).
Abstract: In the traditional incremental analysis update (IAU) process, all analysis increments are treated as constant forcing in a model’s prognostic equations over a certain time window. This approach effectively reduces high-frequency oscillations introduced by data assimilation. However, because increments of different scales have their own evolutionary speeds and life histories in a numerical model, the traditional IAU scheme cannot fully meet the requirements of short-term forecasting for the damping of high-frequency noise and may even cause systematic drifts. Therefore, a multi-scale IAU scheme is proposed in this paper. Analysis increments were divided into different scale parts using a spatial filtering technique. For each scale increment, the optimal relaxation time in the IAU scheme was determined by the skill of the forecasting results. Finally, the different scales of analysis increments were added to the model integration during their optimal relaxation times. The multi-scale IAU scheme can effectively reduce the noise and further improve the balance between large-scale and small-scale increments in the model initialization stage. To evaluate its performance, several numerical experiments were conducted to simulate the path and intensity of Typhoon Mangkhut (2018). They showed that: (1) the multi-scale IAU scheme had an obvious effect on noise control at the initial stage of data assimilation; (2) the optimal relaxation times for large-scale and small-scale increments were estimated as 6 h and 3 h, respectively; (3) the forecast performance of the multi-scale IAU scheme in the prediction of Typhoon Mangkhut (2018) was better than that of the traditional IAU scheme. The results demonstrate the superiority of the multi-scale IAU scheme.
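The constant-forcing idea in this abstract can be illustrated with a toy scalar model. The following is a minimal sketch only, not the authors' implementation: the function name `integrate_with_iau`, the forward-Euler time stepping, and the scalar state are all assumptions made for illustration, and the spatial filtering that separates the scales is not shown. Each increment is spread evenly over its own relaxation time (in hours), so the large-scale and small-scale parts can use different windows such as the 6 h and 3 h reported above.

```python
def integrate_with_iau(x0, tendency, increments, dt, n_steps):
    """Forward-Euler integration of dx/dt = tendency(x) + IAU forcing.

    increments is a list of (increment, relaxation_time) pairs. Each
    increment is applied as the constant forcing increment / relaxation_time
    while the model time t is inside that increment's window (t < relaxation_time).
    All names here are hypothetical and chosen for this sketch.
    """
    x = x0
    for step in range(n_steps):
        t = step * dt
        # Sum the constant forcings of all increments whose window is still open.
        forcing = sum(inc / tau for inc, tau in increments if t < tau)
        x += dt * (tendency(x) + forcing)
    return x


# Toy usage: no model dynamics, a large-scale increment spread over 6 h
# and a small-scale increment spread over 3 h, with a 0.5 h time step.
no_dynamics = lambda x: 0.0
scales = [(2.0, 6.0), (-0.5, 3.0)]
print(integrate_with_iau(10.0, no_dynamics, scales, dt=0.5, n_steps=12))
```

With the dynamics switched off, the state at the end of the longest window equals the initial state plus the sum of the increments, which is the property that distinguishes IAU from inserting the full increment at the initial time.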
Funding: Supported by NOAA’s Hurricane Forecast Improvement Project.
Abstract: The four-dimensional variational (4D-Var) data assimilation systems used in most operational and research centers use initial condition increments as control variables and adjust initial increments to find optimal analysis solutions. This approach may sometimes create discontinuities in analysis fields and produce undesirable spin-ups and spin-downs. This study explores using incremental analysis updates (IAU) in 4D-Var to reduce the analysis discontinuities. IAU-based 4D-Var has almost the same mathematical formula as conventional 4D-Var if the initial condition increments are replaced with time-integrated increments as control variables. The IAU technique was implemented in the NASA/GSFC 4D-Var prototype and compared against a control run without IAU. The results showed that the initial precipitation spikes were removed and that other discontinuities were also reduced, especially for the analysis of surface temperature.
Funding: Science and Technology Project of Zhejiang Province (LGF20D050001); East China Regional Meteorological Science and Technology Innovation Fund Cooperation Project (QYHZ201805); Meteorological Science and Technology Project of Zhejiang Meteorological Service (2018ZD01, 2019ZD11).
Abstract: Initialization of tropical cyclones plays an important role in typhoon numerical prediction. This study applied a typhoon initialization scheme based on the incremental analysis updates (IAU) technique in a rapid refresh system to improve the prediction of Typhoon Lekima (2019). Two numerical sensitivity experiments with or without application of the IAU technique after performing vortex relocation and wind adjustment procedures were conducted for comparison with the control experiment, which did not involve a typhoon initialization scheme. Analysis of the initial fields indicated that the relocation procedure shifted the typhoon circulation to the observed typhoon region, and the wind speeds became closer to the observations following the wind adjustment procedure. Comparison of the results of the sensitivity and control experiments revealed that the vortex relocation and wind adjustment procedures could improve the prediction of typhoon track and intensity in the first 6-h period, and that these improvements were extended throughout the first 12-h period of the prediction by the IAU technique. The new typhoon initialization scheme also improved the simulated typhoon structure in terms of not only the wind speed and warm core prediction but also the organization of the eye of Typhoon Lekima. Diagnosis of the tendencies of variables showed that use of the IAU technique in a typhoon initialization scheme is efficacious in resolving the spurious high-frequency noise problem such that the model is able to reach equilibrium as soon as possible.
Abstract: Based on the relationship among geographic events, spatial changes and database operations, a new automatic (semi-automatic) incremental updating approach for spatio-temporal databases (STDB), named event-based incremental updating (E-BIU), is proposed in this paper. First, the relationship among the events, spatial changes and the database operations is analyzed. Then an overall architecture for the E-BIU implementation is designed, which includes an event queue, three managers and two sets of rules; each component is presented in detail. The E-BIU process for the master STDB is then described. Finally, an example of the incremental updating of buildings is given to illustrate the approach. The result shows that E-BIU is an efficient automatic updating approach for the master STDB.
Funding: Under the auspices of the National High Technology Research and Development Program of China (No. 2007AA12Z242).
Abstract: Incremental updating, which can better guarantee that a navigational map reflects the real-time situation, is the direction in which navigational road network updating is developing. The data center of a vehicle navigation system is in charge of storing incremental data, and the spatio-temporal data model used to store the incremental data affects how efficiently the data center responds to requests for incremental data from the vehicle terminal. Following an analysis of the shortcomings of several typical spatio-temporal data models used in the data center, and building on the base map with overlay model, the reverse map with overlay model (RMOM) was put forward so that the data center can respond rapidly to incremental data requests. RMOM lets the data center store not only the current complete road network data but also the overlays of incremental data from the time when each road network change occurred to the current moment. Moreover, the storage mechanism and index structure of the incremental data were designed, and the implementation algorithm of RMOM was developed. Taking the navigational road network of Guangzhou City as an example, a simulation test was conducted to validate the efficiency of RMOM. Results show that with RMOM the navigation database in the data center can respond to a request for incremental data with only one query and in less time. Compared with the base map with overlay model, the data center does not need to overlay incremental data temporarily with RMOM, so the response time is significantly reduced. RMOM greatly improves the efficiency of response and provides strong support for keeping the navigational road network up to date in real time.
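The storage idea described in this abstract (the current complete network plus, for every earlier version, the accumulated changes from that version to now) can be sketched as follows. This is only one reading of the abstract under stated assumptions, not the paper's storage mechanism or index structure: the class name `ReverseOverlayStore`, the integer version counter, and the representation of a change as a dictionary mapping a road ID to new attributes (or `None` for a removed road) are all hypothetical.

```python
class ReverseOverlayStore:
    """Toy store: current road network plus per-version overlays to 'now'."""

    def __init__(self, roads):
        self.version = 0
        self.current = dict(roads)  # complete current road network
        self.overlays = {}          # version -> net changes from that version to current

    def apply_change(self, change):
        """Record a change set: road_id -> new attributes, or None if removed."""
        # Fold the new change into every stored overlay at update time,
        # so that nothing has to be merged when a request arrives.
        for overlay in self.overlays.values():
            overlay.update(change)
        self.overlays[self.version] = dict(change)
        for road_id, attrs in change.items():
            if attrs is None:
                self.current.pop(road_id, None)
            else:
                self.current[road_id] = attrs
        self.version += 1

    def increment_since(self, client_version):
        """Return the net changes a client at client_version needs: one lookup."""
        if client_version == self.version:
            return {}
        return self.overlays[client_version]


store = ReverseOverlayStore({1: "Road A"})
store.apply_change({2: "Road B"})                  # version 0 -> 1
store.apply_change({1: None, 2: "Road B, widened"})  # version 1 -> 2
print(store.increment_since(0))
print(store.current)
```

The design trade-off in this sketch is that merging work is moved from request time to update time: each update touches every stored overlay, while each terminal request is served by a single dictionary lookup.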
Funding: The Major Research Plan of the National Natural Science Foundation of China (No. 90715021).
Abstract: To carry out large-deformation finite element analysis of spatial curved beams, the total Lagrangian (TL) and updated Lagrangian (UL) incremental formulations for arbitrary spatial curved beam elements are established with displacement vector interpolation, which improves on the component interpolation used for straight beam displacement. A strategy of replacing the actual curve with an isoparametric curve is used to expand the applications of the UL formulation. The examples indicate that the process of establishing the curved beam element is correct and that the accuracy obtained with the curved beam element is clearly higher than that obtained with the straight beam element. Generally, the same level of computational accuracy can be achieved with one fifth as many curved beam elements as straight beam elements.
Abstract: We propose a novel solution schema called the Hierarchical Labeling Schema (HLS) to answer reachability queries in directed graphs. Unlike many existing approaches that focus on static directed acyclic graphs (DAGs), our schema focuses on directed cyclic graphs (DCGs) where vertices or arcs can be added to a graph incrementally. HLS does not require the graph to be acyclic when constructing its index, so it can be applied to both DAGs and DCGs. When vertices or arcs are added to a graph, HLS updates the index incrementally instead of recomputing it from scratch each time, making it more efficient than many other approaches in practice. The basic idea of HLS is to create a tree for each vertex in a graph and link the trees together so that, given any two vertices, we can immediately determine whether there is a path between them by referring to the appropriate trees. We conducted extensive experiments on both real-world and synthesized datasets. We compared the performance of HLS, in terms of index construction time, query processing time and space consumption, with two state-of-the-art methods, the path-tree method and the 3-hop method. We also conducted simulations to model the situation in which a graph is updated incrementally, and we studied how the other algorithms compare with HLS on static graphs. Our results show that HLS is highly competitive in practice and is particularly useful where graphs are updated frequently.
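The abstract does not give enough detail to reproduce HLS itself, so the following sketch swaps in a much simpler technique, a naive incremental transitive closure, purely to illustrate what updating a reachability index incrementally on a graph that may contain cycles looks like. The class and method names are hypothetical, and this index uses far more space than a labeling scheme would.

```python
from collections import defaultdict


class IncrementalReachability:
    """Naive incremental transitive closure (not HLS); cycles are allowed."""

    def __init__(self):
        self.desc = defaultdict(set)  # vertex -> vertices reachable from it
        self.anc = defaultdict(set)   # vertex -> vertices that can reach it

    def add_arc(self, u, v):
        # Every vertex that reaches u (or is u) now reaches v and
        # everything v already reaches; no other pairs change.
        sources = self.anc[u] | {u}
        targets = self.desc[v] | {v}
        for s in sources:
            self.desc[s] |= targets
        for t in targets:
            self.anc[t] |= sources

    def reachable(self, u, v):
        return u == v or v in self.desc[u]


index = IncrementalReachability()
index.add_arc("a", "b")
index.add_arc("b", "c")
print(index.reachable("a", "c"))  # True
print(index.reachable("c", "a"))  # False
index.add_arc("c", "a")           # closes a cycle; the index is updated in place
print(index.reachable("c", "b"))  # True
```

Each arc insertion only touches the ancestors of the tail and the descendants of the head, so the index never has to be rebuilt from scratch, which is the behavior the abstract attributes to HLS at the level of the interface.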
Funding: Supported by the National Natural Science Foundation of China (U1811464) and the Science and Technology Planning Project of Guangdong Province, China (2018B020208004).
Abstract: In traditional simulations of heavy rainfall events, the regional model is often initialized by using a global reanalysis dataset and a cold start method. An alternative to using global analysis data is to gradually introduce the analysis field via an incremental analysis update (IAU) method under the replay configuration. We found substantial differences in the forecast of a heavy rainfall event in southern China between a precipitation forecast using the traditional method and a forecast using the IAU method in the Tropical Regional Atmospheric Modeling System (TRAMS), based on the ECMWF global analysis. The IAU method is efficient in removing spurious high-frequency gravity wave noise, especially when the relaxation time is more than 90 min. The regional model needs to be pre-integrated for about 12 h to warm up the convective system in the background field. The improvement by the IAU method is supported by verification of simulations over 1 month (1–30 April 2019). In general, the IAU technique improves the initialization and spin-up process in the simulation of the heavy rainfall event.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60373081, 60673135), the Natural Science Foundation of Guangdong Province (Grant No. 05003348), and the Program for New Century Excellent Talents of the Ministry of Education of China (Grant No. NCET-04-0805).
Abstract: This paper addresses the mathematical relations on a set of periods and temporal indexing constructions, as well as their applications. First, we introduce two concepts, temporal connection and temporal inclusion, which are an equivalence relation and a preorder relation, respectively. Second, by studying some basic topics such as the division of "large" equivalence classes and the overlaps of preorder relational sets, we propose a temporal data index model (TDIM) with a tree structure consisting of a root node, equivalence class nodes and linearly ordered branch nodes. Third, we study algorithms for temporal querying and incremental updating, as well as dynamic management, within the framework of TDIM. Resting on a sound mathematical foundation, TDIM can be applied to significant practical cases such as temporal relational data and temporal XML data.