This paper proposes a method of feature selection using Bayes' theorem. The purpose of the proposed method is to reduce the computational complexity and increase the classification accuracy of the selected feature subsets. The dependence between two (binary) attributes is determined based on the probabilities of their joint values that contribute to positive and negative classification decisions. If opposing sets of attribute values do not lead to opposing classification decisions (zero probability), the two attributes are considered independent of each other; otherwise they are dependent, and one of them can be removed, reducing the number of attributes. The process is repeated over all combinations of attributes. The paper also evaluates the approach by comparing it with existing feature selection algorithms on 8 datasets from the University of California, Irvine (UCI) machine learning repository. The proposed method shows better results in terms of the number of selected features, classification accuracy, and running time than most existing algorithms.
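A minimal sketch of the pairwise dependence test and greedy attribute removal described above, for binary attributes and labels held in NumPy arrays. The purity threshold (0.9) and the synthetic data are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def decision(y_cell):
    """Classification decision in one joint-value cell: 1/0 when the cell
    is (near-)pure, None when it is empty or mixed."""
    if y_cell.size == 0:
        return None
    p = y_cell.mean()
    return 1 if p >= 0.9 else (0 if p <= 0.1 else None)

def attributes_dependent(a, b, y):
    """Two binary attributes are flagged dependent if some pair of opposing
    joint values, (va, vb) vs. (1-va, 1-vb), leads to opposing decisions."""
    for va, vb in [(0, 0), (0, 1)]:
        d1 = decision(y[(a == va) & (b == vb)])
        d2 = decision(y[(a == 1 - va) & (b == 1 - vb)])
        if d1 is not None and d2 is not None and d1 != d2:
            return True
    return False

def select_features(X, y):
    """Greedy filter: from every dependent pair, drop the second attribute."""
    keep = list(range(X.shape[1]))
    i = 0
    while i < len(keep):
        j = i + 1
        while j < len(keep):
            if attributes_dependent(X[:, keep[i]], X[:, keep[j]], y):
                keep.pop(j)
            else:
                j += 1
        i += 1
    return keep

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 6))
X[:, 3] = 1 - X[:, 0]         # attribute 3 is the negation of attribute 0
y = X[:, 0]                   # class fully determined by attribute 0
print(select_features(X, y))  # every pair against attribute 0 opposes -> [0]
```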
Faced with a growing number of large-scale data sets, the affinity propagation (AP) clustering algorithm must build a full similarity matrix during computation, which brings huge storage and computational costs. This paper therefore proposes an improved affinity propagation clustering algorithm. First, subtractive clustering is added, using the density values of the data points to obtain initial cluster points. Then, the similarity distances between the initial cluster points are calculated and, borrowing the idea of semi-supervised clustering, pairwise-constraint information is added to construct a sparse similarity matrix. Finally, AP clustering is run on the cluster representative points until a suitable cluster division is reached. Experimental results show that the algorithm greatly reduces the amount of computation, also reduces the storage required for the similarity matrix, and outperforms the original algorithm in both clustering quality and processing speed.
Funding: This research has been partially supported by the National Natural Science Foundation of China (51175169) and the National Science and Technology Support Program (2012BAF02B01).
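For context, a compact NumPy implementation of the standard dense AP message-passing updates that the improved algorithm builds on (Frey and Dueck's responsibility/availability rules). The damping factor, iteration count, and median preference are conventional defaults, not values from the paper.

```python
import numpy as np

def affinity_propagation(S, damping=0.9, iters=200):
    """Dense affinity propagation. S is an n-by-n similarity matrix whose
    diagonal holds the exemplar preferences. Returns an exemplar index per point."""
    n = S.shape[0]
    idx = np.arange(n)
    R = np.zeros((n, n))  # responsibilities
    A = np.zeros((n, n))  # availabilities
    for _ in range(iters):
        # r(i,k) <- s(i,k) - max_{k' != k} [ a(i,k') + s(i,k') ]
        M = A + S
        best = M.argmax(axis=1)
        first = M[idx, best]
        M[idx, best] = -np.inf
        second = M.max(axis=1)
        R_new = S - first[:, None]
        R_new[idx, best] = S[idx, best] - second
        R = damping * R + (1 - damping) * R_new
        # a(i,k) <- min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        Rp = np.maximum(R, 0)
        Rp[idx, idx] = R[idx, idx]
        col = Rp.sum(axis=0)
        A_new = np.minimum(0, col[None, :] - Rp)
        A_new[idx, idx] = col - Rp[idx, idx]  # a(k,k) has no min(0, .) clamp
        A = damping * A + (1 - damping) * A_new
    return (A + R).argmax(axis=1)

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(4, 0.5, (20, 2))])
S = -((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # negative squared distances
np.fill_diagonal(S, np.median(S))                    # shared preference
print(np.unique(affinity_propagation(S)))            # expect one exemplar per blob
```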
Background: A task assigned to space exploration satellites involves detecting the physical environment within a certain space. However, space detection data are complex and abstract. These data are not conducive to researchers' visual perception of the evolution and interaction of events in the space environment. Methods: A time-series dynamic data sampling method for large-scale space was proposed to sample detection data in space and time, and the corresponding relationships between data location features and other attribute features were established. A tone-mapping method based on statistical histogram equalization was proposed and applied to the final attribute feature data. The visualization process is optimized for rendering by merging materials, reducing the number of patches, and performing other operations. Results: Sampling, feature extraction, and uniform visualization of detection data of complex types, long duration spans, and uneven spatial distributions were achieved. The real-time visualization of large-scale spatial structures using augmented reality devices, particularly low-performance devices, was also investigated. Conclusions: The proposed visualization system can reconstruct the three-dimensional structure of a large-scale space, express the structure and changes of the spatial environment using augmented reality, and assist in intuitively discovering spatial environmental events and evolutionary rules.
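A minimal sketch of histogram-equalization tone mapping as described in the Methods: attribute values are pushed through their empirical CDF so display tones are used roughly uniformly even for heavy-tailed data. The bin count and the synthetic data are illustrative assumptions.

```python
import numpy as np

def equalize_tone_map(values, bins=256):
    """Map scalar attribute values to [0, 1] tones via the empirical CDF
    (histogram equalization), so skewed data still spans the full tone range."""
    hist, edges = np.histogram(values, bins=bins)
    cdf = np.cumsum(hist).astype(float)
    cdf /= cdf[-1]
    return np.interp(values, edges[1:], cdf)  # piecewise-linear CDF lookup

rng = np.random.default_rng(0)
v = rng.lognormal(mean=0.0, sigma=1.5, size=10_000)    # skewed "detection" attribute
tones = equalize_tone_map(v)
print(np.quantile(tones, [0.25, 0.5, 0.75]).round(2))  # ~[0.25 0.5 0.75]
```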
Large-scale wireless sensor networks (WSNs) play a critical role in monitoring dangerous scenarios and responding to medical emergencies. However, the inherent instability and error-prone nature of wireless links present significant challenges, necessitating efficient data collection and reliable transmission services. This paper addresses the limitations of existing data transmission and recovery protocols by proposing a systematic end-to-end design tailored for medical event-driven cluster-based large-scale WSNs. The primary goal is to enhance the reliability of data collection and transmission services, ensuring a comprehensive and practical approach. Our approach focuses on refining the hop-count-based routing scheme to achieve fairness in forwarding reliability. Additionally, it emphasizes reliable data collection within clusters and establishes robust data transmission over multiple hops. These systematic improvements are designed to optimize the overall performance of the WSN in real-world scenarios. Simulation results of the proposed protocol validate its exceptional performance compared to other prominent data transmission schemes. The evaluation spans varying sensor densities, wireless channel conditions, and packet transmission rates, showcasing the protocol's superiority in ensuring reliable and efficient data transfer. Our systematic end-to-end design successfully addresses the challenges posed by the instability of wireless links in large-scale WSNs. By prioritizing fairness, reliability, and efficiency, the proposed protocol demonstrates its efficacy in enhancing data collection and transmission services, thereby offering a valuable contribution to the field of medical event-driven WSNs.
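The paper's routing scheme is not reproduced here, but the fairness idea can be illustrated with a toy reliability budget: give each hop an equal share of the end-to-end delivery target, so nodes farther from the sink receive a larger per-hop retransmission budget. The model and all numbers below are illustrative assumptions, not the paper's protocol.

```python
import math

def retransmission_budget(p_link: float, hops: int, target_e2e: float) -> int:
    """Per-hop retransmission count needed so a packet crossing `hops`
    unreliable links still meets an end-to-end delivery target, keeping
    reliability fair between near and far nodes (illustrative model)."""
    per_hop_target = target_e2e ** (1.0 / hops)  # equal reliability share per hop
    # smallest n with 1 - (1 - p_link)^n >= per_hop_target
    return math.ceil(math.log(1.0 - per_hop_target) / math.log(1.0 - p_link))

for hops in (1, 3, 6, 10):
    # more hops -> each hop must be more reliable -> larger per-hop budget
    print(hops, retransmission_budget(p_link=0.7, hops=hops, target_e2e=0.95))
```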
Both computer science and archival science are concerned with archiving large-scale data, but they have different focuses. Large-scale data archiving in computer science focuses on technical aspects that can reduce the cost of data storage and improve the reliability and efficiency of Big Data management; its weaknesses lie in inadequate and non-standardized management. Archiving in archival science focuses on the management aspects and neglects the necessary technical considerations, resulting in high storage and retention costs and a poor ability to manage Big Data. Therefore, the integration of large-scale data archiving and archival theory can balance the existing research limitations of the two fields, and it suggests two topics for related research: archival management of Big Data, and large-scale management of archived Big Data.
Funding: Supported by the National Natural Science Foundation of China (grant number 72074214).
Traditional large-scale multi-objective optimization algorithms (LSMOEAs) encounter difficulties when dealing with sparse large-scale multi-objective optimization problems (SLMOPs) where most decision variables are zero. As a result, many algorithms use a two-layer encoding approach to optimize the binary variable Mask and the real variable Dec separately. Nevertheless, existing optimizers often focus on locating non-zero variable positions to optimize the binary variable Mask. However, approximating the sparse distribution of the real Pareto optimal solutions does not necessarily mean that the objective function is optimized. In data mining, it is common to mine frequent itemsets appearing together in a dataset to reveal the correlation between data. Inspired by this, we propose a novel two-layer encoding learning swarm optimizer based on frequent itemsets (TELSO) to address these SLMOPs. TELSO mines the frequent items of multiple particles with better objective values to find mask combinations that can obtain better objective values for fast convergence. Experimental results on five real-world problems and eight benchmark sets demonstrate that TELSO outperforms existing state-of-the-art sparse large-scale multi-objective evolutionary algorithms (SLMOEAs) in terms of performance and convergence speed.
Funding: Supported by the Scientific Research Project of Xiang Jiang Lab (22XJ02003), the University Fundamental Research Fund (23-ZZCX-JDZ-28), the National Science Fund for Outstanding Young Scholars (62122093), the National Natural Science Foundation of China (72071205), the Hunan Graduate Research Innovation Project (ZC23112101-10), the Hunan Natural Science Foundation Regional Joint Project (2023JJ50490), the Science and Technology Project for Young and Middle-aged Talents of Hunan (2023TJ-Z03), and the Science and Technology Innovation Program of Hunan Province (2023RC1002).
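Not TELSO itself, but a minimal sketch of the frequent-itemset step it builds on: mining variable-index sets that co-occur (are jointly non-zero) in the masks of good particles. Support thresholds and the synthetic masks are illustrative assumptions.

```python
import numpy as np
from itertools import combinations
from collections import Counter

def frequent_itemsets(masks, min_support, max_size=2):
    """Count variable-index sets that are jointly non-zero in at least
    `min_support` of the given binary masks -- the raw ingredient a
    TELSO-style optimizer uses to assemble promising new masks."""
    counts = Counter()
    for m in masks:
        ones = [int(i) for i in np.flatnonzero(m)]
        for size in range(1, max_size + 1):
            for items in combinations(ones, size):
                counts[items] += 1
    return {s: c for s, c in counts.items() if c >= min_support}

rng = np.random.default_rng(2)
masks = rng.random((30, 12)) < 0.1  # sparse random masks over 12 variables
masks[:20, [2, 7]] = True           # variables 2 and 7 co-occur in good particles
print(frequent_itemsets(masks, min_support=15))  # {(2,): .., (7,): .., (2, 7): ..}
```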
Sparse large-scale multi-objective optimization problems (SLMOPs) are common in science and engineering. However, the large scale of the problem implies a high-dimensional decision space, requiring algorithms to traverse a vast expanse with limited computational resources. Furthermore, owing to the sparsity, most variables in the Pareto optimal solutions are zero, making it difficult for algorithms to identify non-zero variables efficiently. This paper is dedicated to addressing the challenges posed by SLMOPs. To start, we introduce innovative objective functions customized to mine maximum and minimum candidate sets. This substantial enhancement dramatically improves the efficacy of frequent pattern mining. In this way, selecting candidate sets is no longer based on the quantity of non-zero variables they contain but on a higher proportion of non-zero variables within specific dimensions. Additionally, we unveil a novel approach to association rule mining, which delves into the intricate relationships between non-zero variables. This methodology aids in identifying sparse distributions that can potentially expedite reductions in the objective function value. We extensively tested our algorithm across eight benchmark problems and four real-world SLMOPs. The results demonstrate that our approach achieves competitive solutions across various challenges.
Funding: Supported by the Open Project of Xiangjiang Laboratory (22XJ02003), the University Fundamental Research Fund (23-ZZCX-JDZ-28, ZK21-07), the National Science Fund for Outstanding Young Scholars (62122093), the National Natural Science Foundation of China (72071205), the Hunan Graduate Research Innovation Project (CX20230074), the Hunan Natural Science Foundation Regional Joint Project (2023JJ50490), the Science and Technology Project for Young and Middle-aged Talents of Hunan (2023TJZ03), and the Science and Technology Innovation Program of Hunan Province (2023RC1002).
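A minimal sketch of association-rule mining between non-zero variables, the second ingredient mentioned above: pairwise rules "variable i non-zero implies variable j non-zero" scored by support and confidence. The thresholds and synthetic masks are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def association_rules(masks, min_conf=0.8, min_support=4):
    """Mine pairwise rules (i -> j) from binary masks of good solutions:
    keep a rule when variable i is active often enough (support) and, when
    it is, variable j is almost always active too (confidence)."""
    masks = masks.astype(bool)
    rules = []
    for i in range(masks.shape[1]):
        support_i = int(masks[:, i].sum())
        if support_i < min_support:
            continue
        for j in range(masks.shape[1]):
            if i == j:
                continue
            conf = (masks[:, i] & masks[:, j]).sum() / support_i
            if conf >= min_conf:
                rules.append((i, j, round(float(conf), 2)))
    return rules

rng = np.random.default_rng(3)
masks = rng.random((40, 10)) < 0.2
masks[:, 4] |= masks[:, 1]       # whenever variable 1 is active, so is variable 4
print(association_rules(masks))  # should include the rule (1, 4, 1.0)
```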
Assessment of past-climate simulations of regional climate models (RCMs) is important for understanding the reliability of RCMs when used to project future regional climate. Here, we assess the performance, and discuss possible causes of biases, of a WRF-based RCM with a grid spacing of 50 km, named WRFG, from the North American Regional Climate Change Assessment Program (NARCCAP) in simulating wet season precipitation over the Central United States for a period when observational data are available. The RCM reproduces key features of the precipitation distribution during late spring to early summer, although it tends to underestimate the magnitude of precipitation. This dry bias is partially due to the model's lack of skill in simulating nocturnal precipitation, related to the absence of eastward-propagating convective systems in the simulation. Inaccuracy in reproducing large-scale circulation and environmental conditions is another contributing factor. The simulated pressure gradient between the Rocky Mountains and the Gulf of Mexico is too weak, resulting in weaker southerly winds in between and leading to a reduction of warm moist air transport from the Gulf to the Central Great Plains. The simulated low-level horizontal convergence fields are also less favorable for upward motion than in the NARR, and hence for the development of moist convection. Therefore, a careful examination of an RCM's deficiencies and the identification of the sources of error are important when using the RCM to project precipitation changes in future climate scenarios.
Accurate positioning is one of the essential requirements for numerous applications of remote sensing data, especially in the event of a noisy or unreliable satellite signal. Toward this end, we present a novel framework for aircraft geo-localization over a large range that only requires a downward-facing monocular camera, an altimeter, a compass, and an open-source Vector Map (VMAP). The algorithm combines matching and particle filter methods. A shape vector and the correlation between two building contour vectors are defined, and a coarse-to-fine building vector matching (CFBVM) method is proposed in the matching stage, for which the original matching results are described by a Gaussian mixture model (GMM). Subsequently, an improved resampling strategy is designed to reduce computing expenses with a huge number of initial particles, and a credibility indicator is designed to avoid location mistakes in the particle filter stage. An experimental evaluation of the approach based on flight data is provided. On a flight at a height of 0.2 km over a flight distance of 2 km, the aircraft is geo-localized in a reference map of 11,025 km² using 0.09 km² aerial images without any prior information. The absolute localization error is less than 10 m.
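The paper's CFBVM matching and credibility indicator are beyond a short sketch, but the particle filter stage such pipelines rely on can be illustrated generically: propagate particles with a motion model, reweight them against a (here simulated) map-matching fix, and resample when the effective sample size collapses. All numbers and the 1-D geometry are made up for the demo.

```python
import numpy as np

rng = np.random.default_rng(4)

n, steps = 2000, 30
true_x = 0.0
particles = rng.uniform(-50.0, 50.0, n)  # large initial position uncertainty
weights = np.full(n, 1.0 / n)

for _ in range(steps):
    true_x += 1.0                                   # aircraft advances 1 unit
    particles += 1.0 + rng.normal(0.0, 0.2, n)      # motion model with noise
    z = true_x + rng.normal(0.0, 1.0)               # noisy map-matching fix
    weights *= np.exp(-0.5 * (particles - z) ** 2)  # Gaussian likelihood, sigma=1
    weights /= weights.sum()
    # systematic resampling once the effective sample size collapses
    if 1.0 / (weights ** 2).sum() < n / 2:
        u = (rng.random() + np.arange(n)) / n
        idx = np.minimum(np.searchsorted(np.cumsum(weights), u), n - 1)
        particles, weights = particles[idx], np.full(n, 1.0 / n)

estimate = (particles * weights).sum()
print(abs(estimate - true_x))  # localization error, typically well under 1 unit
```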
This article introduces the concept of load aggregation, which involves a comprehensive analysis of loads to acquire their external characteristics for the purpose of modeling and analyzing power systems. Online identification is a computer-based approach to data collection, processing, and system identification, commonly used for adaptive control and prediction. This paper proposes a method for dynamically aggregating large-scale adjustable loads to support high proportions of new energy integration, aiming to study the aggregation characteristics of regional large-scale adjustable loads using online identification techniques and feature extraction methods. The experiment selected 300 central air conditioners as the research subject and analyzed their regulation characteristics, economic efficiency, and comfort. The experimental results show that as the adjustment time of the air conditioners increases from 5 minutes to 35 minutes, the stable adjustment quantity during the adjustment period decreases from 28.46 to 3.57, indicating that air-conditioning loads can be controlled over a long period and have better adjustment effects in the short term. Overall, the experimental results demonstrate that analyzing the aggregation characteristics of regional large-scale adjustable loads using online identification techniques and feature extraction algorithms is effective.
Funding: Supported by the State Grid Science & Technology Project (5100-202114296A-0-0-00).
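To make the online identification step concrete, here is a minimal recursive least squares (RLS) sketch that identifies a first-order aggregate load model y[k] = a*y[k-1] + b*u[k] from streaming data. The model order, forgetting factor, and synthetic signals are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def rls_identify(u, y, lam=0.99):
    """Recursive least squares with forgetting factor `lam` for the
    first-order model y[k] = a*y[k-1] + b*u[k]; returns [a, b]."""
    theta = np.zeros(2)        # parameter estimate [a, b]
    P = np.eye(2) * 1000.0     # large initial covariance (uninformative prior)
    for k in range(1, len(y)):
        phi = np.array([y[k - 1], u[k]])
        g = P @ phi / (lam + phi @ P @ phi)    # gain vector
        theta += g * (y[k] - phi @ theta)      # prediction-error update
        P = (P - np.outer(g, phi) @ P) / lam   # covariance update
    return theta

rng = np.random.default_rng(5)
u = rng.normal(size=500)       # control signal (e.g., setpoint changes)
y = np.zeros(500)
for k in range(1, 500):
    y[k] = 0.9 * y[k - 1] + 0.5 * u[k] + rng.normal(0, 0.01)
print(rls_identify(u, y).round(3))  # ~[0.9, 0.5]
```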
The large-scale multi-objective optimization algorithm (LSMOA) based on the grouping of decision variables is an advanced method for handling high-dimensional decision variables. However, in practical problems the interaction among decision variables is intricate, leading to large group sizes and suboptimal optimization effects; hence, a large-scale multi-objective optimization algorithm based on weighted overlapping grouping of decision variables (MOEAWOD) is proposed in this paper. Initially, the decision variables are perturbed and categorized into convergence and diversity variables; subsequently, the convergence variables are subdivided into groups based on the interactions among different decision variables. If the size of a group surpasses the set threshold, that group undergoes a process of weighting and overlapping grouping. Specifically, the interaction strength is evaluated based on the interaction frequency and the number of objectives involved among the various decision variables. The decision variable with the highest interaction in the group is identified and set aside, and the remaining variables are then reclassified into subgroups. Finally, the decision variable with the strongest interaction is added to each subgroup. MOEAWOD minimizes the interactivity between different groups and maximizes the interactivity of decision variables within groups, which contributes to the optimization of convergence and diversity exploration within the different groups. MOEAWOD was tested on 18 benchmark large-scale optimization problems, and the experimental results demonstrate the effectiveness of our method; compared with the other algorithms, it remains at an advantage.
Funding: Supported in part by the Central Government Guides Local Science and Technology Development Funds (Grant No. YDZJSX2021A038), in part by the National Natural Science Foundation of China (Grant No. 61806138), and in part by the China University Industry-University-Research Collaborative Innovation Fund (Future Network Innovation Research and Application Project) (Grant 2021FNA04014).
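A minimal finite-difference interaction test of the kind grouping-based algorithms use to decide which decision variables belong in the same group: variables i and j interact if the effect of perturbing i depends on the value of j. MOEAWOD's frequency- and objective-count-based weighting is not reproduced here; thresholds and the test function are illustrative.

```python
import numpy as np

def interacting(f, x, i, j, delta=1.0, eps=1e-6):
    """Flag variables i and j as interacting if perturbing i changes f by a
    different amount depending on whether j has been perturbed (non-additivity)."""
    x1 = x.copy(); x1[i] += delta
    d1 = f(x1) - f(x)            # effect of moving i at the base point
    x2 = x.copy(); x2[j] += delta
    x3 = x2.copy(); x3[i] += delta
    d2 = f(x3) - f(x2)           # effect of moving i after moving j
    return abs(d1 - d2) > eps

# separable in x0, but x1 and x2 interact through a product term
f = lambda x: x[0] ** 2 + x[1] * x[2]
x = np.zeros(4)
print(interacting(f, x, 0, 1), interacting(f, x, 1, 2))  # False True
```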
With the development of big data and social computing, large-scale group decision making (LGDM) is now merging with social networks. Using social network analysis (SNA), this study proposes an LGDM consensus model that considers the trust relationships among decision makers (DMs). In the consensus-measurement process, the social network is constructed according to the social relationships among DMs, and the Louvain method is introduced to partition the social network into subgroups. In this study, the weights of each decision maker and each subgroup are computed from comprehensive network weights and trust weights. In the consensus-improvement process, a feedback mechanism with four identification rules and two direction rules is designed to guide the improvement process. Based on the trust relationships among DMs, the preferences are modified and the corresponding social network is updated to accelerate consensus. Compared with previous research, the proposed model not only allows the subgroups to be reconstructed and updated during the adjustment process but also improves the accuracy of the adjustment through the feedback mechanism. Finally, an example analysis is conducted to verify the effectiveness and flexibility of the proposed method, and its superiority over previous studies in solving the LGDM problem is highlighted.
Funding: The work was supported by the Humanities and Social Sciences Fund of the Ministry of Education (No. 22YJA630119), the National Natural Science Foundation of China (No. 71971051), and the Natural Science Foundation of Hebei Province (No. G2021501004).
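A minimal sketch of the consensus-measurement idea: each decision maker's pairwise-preference matrix is compared with the weighted group aggregate, and a low score marks a candidate for the feedback mechanism. This is a standard GDM-style measure for illustration; the paper's trust and network weighting are not reproduced.

```python
import numpy as np

def consensus_degree(prefs, weights):
    """Consensus level of each decision maker: one minus the mean absolute
    deviation of their preference matrix from the weighted group aggregate."""
    group = np.tensordot(weights, prefs, axes=1)       # weighted aggregate matrix
    return 1.0 - np.abs(prefs - group).mean(axis=(1, 2))

rng = np.random.default_rng(6)
base = rng.random((4, 4))                   # shared opinion on 4 alternatives
prefs = np.clip(base + rng.normal(0, 0.05, (5, 4, 4)), 0, 1)
prefs[4] = rng.random((4, 4))               # one outlier decision maker
w = np.full(5, 0.2)                         # equal weights for the demo
print(consensus_degree(prefs, w).round(3))  # the outlier typically scores lowest
```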
The deformation and fracture evolution mechanisms of the strata overlying mines mined using sublevel caving were studied via numerical simulations. Moreover, an expression for the normal force acting on the side face of a steeply dipping superimposed cantilever beam in the surrounding rock was deduced based on limit equilibrium theory. The results show the following: (1) surface displacement above metal mines with steeply dipping discontinuities shows significant step characteristics, and (2) the behavior of the strata as they fail exhibits superimposition characteristics. Generally, failure first occurs in certain superimposed strata slightly far from the goaf. Subsequently, with the constant downward excavation of the orebody, the superimposed strata become damaged both upwards away from and downwards toward the goaf. This process continues until the deep part of the steeply dipping superimposed strata forms a large-scale deep fracture plane that connects with the goaf. The deep fracture plane generally makes an angle of 12°-20° with the normal to the steeply dipping discontinuities. The constant outward transfer of strata movement, due to the constant outward failure of the superimposed strata in metal mines with steeply dipping discontinuities, causes the scope of the strata movement in these mines to be larger than expected. The strata in metal mines with steeply dipping discontinuities mainly show flexural toppling failure. However, the steeply dipping structural strata near the goaf mainly exhibit shear slipping failure, in which case the mechanical model used to describe them can be simplified by treating them as steeply dipping superimposed cantilever beams. By taking the steeply dipping superimposed cantilever beam that first experiences failure as the key stratum, the failure scope of the strata (and criteria for the stability of metal mines with steeply dipping discontinuities mined using sublevel caving) can be obtained via iterative computations from the key stratum, moving downward toward and upwards away from the goaf.
Funding: Financial support for this work was provided by the Youth Fund Program of the National Natural Science Foundation of China (No. 42002292), the General Program of the National Natural Science Foundation of China (No. 42377175), and the General Program of the Hubei Provincial Natural Science Foundation, China (No. 2023AFB631).
The efficacy of insecticide-treated nets (ITNs) is increasingly compromised by the prevalence of malaria vectors resistant to pyrethroids. In response to this issue, a new generation of ITNs has been developed that incorporates synergists such as piperonyl butoxide (PBO). The purpose of this study is to provide entomological evidence for the efficacy of a PBO-based ITN brand at the village level, serving as a basis for decision-making before large-scale net deployment. During the high malaria transmission period, ITNs were distributed in each group and vector sampling was conducted biweekly in selected households. Bionomic data were collected to assess the resistance of wild An. gambiae populations to various chemical insecticides. There was a significant disparity in total An. gambiae s.l. collected between the intervention arm (ITN arms) and the control arm (P = 0.003). An. coluzzii was identified as the predominant species in the study area, as confirmed by PCR analysis. Analysis of the blood-feeding inhibition rate revealed that the permethrin + PBO ITN exhibited significantly greater inhibition (100%) than the permethrin-only ITN (66.81%). According to the log-time probit regression analysis, permethrin exhibited a knockdown time of 256 min without a synergist, which decreased to 139 min (P = 0.001) with pre-exposure to PBO. The evidence from this trial supports the use of PBO ITNs over standard pyrethroid-only ITNs to combat pyrethroid resistance and improve protection against malaria for both individuals and communities, particularly in areas with high pyrethroid resistance.
The financial aspects of large-scale engineering construction projects profoundly influence their success. Strengthening cost control and establishing a scientific financial evaluation system can enhance the project's economic benefits, minimize unnecessary costs, and provide decision-makers with a robust financial foundation. Additionally, implementing an effective cash flow control mechanism and conducting a comprehensive assessment of potential project risks can ensure financial stability and mitigate the risk of fund shortages. Developing a practical and feasible fundraising plan, along with stringent fund management practices, can prevent fund wastage and optimize fund utilization efficiency. These measures not only facilitate smooth project progression and improve project management efficiency but also enhance the project's economic and social outcomes.
Social media data created a paradigm shift in assessing situational awareness during natural disasters or emergencies such as wildfires, hurricanes, and tropical storms. Twitter, as an emerging data source, is an effective and innovative digital platform for observing trends from the perspective of social media users who are direct or indirect witnesses of the calamitous event. This paper aims to collect and analyze Twitter data related to the recent wildfire in California to perform a trend analysis by classifying firsthand and credible information from Twitter users. This work investigates tweets on the recent wildfire in California and classifies them based on witnesses into two types: 1) direct witnesses and 2) indirect witnesses. The collected and analyzed information can be useful for law enforcement agencies and humanitarian organizations for communication and verification of situational awareness during wildfire hazards. Trend analysis is an aggregated approach that includes sentiment analysis and topic modeling performed through domain-expert manual annotation and machine learning. Trend analysis ultimately builds a fine-grained analysis to assess evacuation routes and provide valuable information to firsthand emergency responders.
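A toy sketch of the direct- vs. indirect-witness split described above, using simple lexical cues. A real system would use annotated data and a trained classifier; the cue lists and sample tweets here are invented for illustration.

```python
# Invented cue lists -- a stand-in for the paper's annotation + ML pipeline.
DIRECT_CUES = ("i see", "i smell", "my house", "evacuating", "near me", "outside")
INDIRECT_CUES = ("reported", "news", "officials", "heard that", "according to")

def witness_type(tweet: str) -> str:
    """Classify a tweet as coming from a direct or indirect witness."""
    t = tweet.lower()
    if any(c in t for c in DIRECT_CUES):
        return "direct"
    if any(c in t for c in INDIRECT_CUES):
        return "indirect"
    return "unknown"

tweets = [
    "I can smell smoke from my backyard, flames near me on the ridge",
    "Officials reported the wildfire has crossed the highway, per the news",
    "Thinking of everyone in California tonight",
]
for t in tweets:
    print(witness_type(t), "|", t)
```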
A kernel is a kind of data summary which is elaborately extracted from a large dataset. Given a problem, the solution obtained from the kernel is an approximate version of the solution obtained from the whole dataset, with a provable approximation ratio. It is widely used in geometric optimization, clustering, approximate query processing, and other areas, for scaling them up to massive data. In this paper, we focus on the minimum ε-kernel (MK) computation that asks for a kernel of the smallest size for large-scale data processing. For the open problem presented by Wang et al. of whether the minimum ε-coreset (MC) problem and the MK problem can be reduced to each other, we first formalize the MK problem and analyze its complexity. Due to the NP-hardness of the MK problem in three or higher dimensions, an approximate algorithm, namely the Set Cover-Based Minimum ε-Kernel algorithm (SCMK), is developed to solve it. We prove that the MC problem and the MK problem can be Turing-reduced to each other. Then, we discuss the update of MK under insertion and deletion operations, respectively. Finally, a randomized algorithm, called the Randomized Algorithm of the Set Cover-Based Minimum ε-Kernel algorithm (RA-SCMK), is utilized to further reduce the complexity of SCMK. The efficiency and effectiveness of SCMK and RA-SCMK are verified by experimental results on real-world and synthetic datasets. Experiments show that the kernel sizes of SCMK are 2x and 17.6x smaller than those of an ANN-based method on real-world and synthetic datasets, respectively. The speedup ratio of SCMK over the ANN-based method is 5.67 on synthetic datasets. RA-SCMK runs up to three times faster than SCMK on synthetic datasets.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61732003, 61832003, 61972110 and U19A2059, the National Key Research and Development Program of China under Grant No. 2019YFB2101902, and the CCF-Baidu Open Fund under Grant No. OF2021011.
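SCMK reduces minimum kernel computation to set cover; for reference, here is the classic greedy set-cover routine (ln n approximation) that such reductions lean on. The instance below is a made-up example, not the paper's construction.

```python
def greedy_set_cover(universe, subsets):
    """Classic greedy set cover: repeatedly pick the subset covering the
    most still-uncovered elements. Returns the indices of chosen subsets."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(range(len(subsets)), key=lambda i: len(subsets[i] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("universe not coverable by the given subsets")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen

universe = range(1, 10)
subsets = [{1, 2, 3, 4}, {4, 5, 6}, {6, 7, 8, 9}, {1, 5, 9}, {2, 7}]
print(greedy_set_cover(universe, subsets))  # e.g. [0, 2, 1]
```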
The global energy transition is a widespread phenomenon that requires international exchange of experiences and mutual learning. Germany's success in the first phase of its energy transition can be attributed to its adoption of smart energy technology and implementation of electricity futures and spot marketization, which enabled the achievement of multiple energy spatial-temporal complementarities and overall grid balance through energy conversion and reconversion technologies. While China can draw from Germany's experience to inform its own energy transition efforts, its 11-fold higher annual electricity consumption requires a distinct approach. We recommend a clean energy system based on smart sector coupling (ENSYSCO) as a suitable pathway for achieving sustainable energy in China, given that renewable energy is expected to supply 85% of China's energy production by 2060, requiring significant future electricity storage capacity. Nonetheless, renewable energy storage remains a significant challenge. We propose four large-scale underground energy storage methods based on ENSYSCO to address this challenge, while considering China's national conditions. These proposals have culminated in pilot projects for large-scale underground energy storage in China, which we believe is a necessary choice for achieving carbon neutrality in China and enabling efficient and safe grid integration of renewable energy within the framework of ENSYSCO.
Funding: Supported by the Henan Institute for Chinese Development Strategy of Engineering & Technology (No. 2022HENZDA02) and the Science & Technology Department of Sichuan Province (No. 2021YFH0010).
This paper investigates wireless communication with a novel architecture of antenna arrays, termed the modular extremely large-scale array (XL-array), where array elements of an extremely large number/size are regularly mounted on a shared platform with both horizontally and vertically interlaced modules. Each module consists of a moderate/flexible number of array elements with the inter-element distance typically on the order of the signal wavelength, while different modules are separated by a relatively large inter-module distance for convenience of practical deployment. By accurately modelling the signal amplitudes and phases, as well as the projected apertures across all modular elements, we analyse the near-field signal-to-noise ratio (SNR) performance of modular XL-array communications. Based on the non-uniform spherical wave (NUSW) modelling, a closed-form SNR expression is derived in terms of key system parameters, such as the overall modular array size, the distances of adjacent modules along all dimensions, and the user's three-dimensional (3D) location. In addition, with the number of modules in different dimensions increasing infinitely, the asymptotic SNR scaling laws are revealed. Furthermore, we show that our proposed near-field modelling and performance analysis include the results for existing array architectures/modellings as special cases, e.g., the collocated XL-array architecture, the uniform plane wave (UPW) based far-field modelling, and the one-dimensional modular extremely large-scale uniform linear array (XL-ULA). Extensive simulation results are presented to validate our findings.
Funding: Supported by the National Key R&D Program of China with Grant number 2019YFB1803400, the National Natural Science Foundation of China under Grant number 62071114, and the Fundamental Research Funds for the Central Universities of China under grant numbers 3204002004A2 and 2242022k30005.
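A toy numerical illustration of the NUSW idea, not the paper's closed-form analysis: in the near field, every element keeps its own exact distance to the user, so per-element received powers (~1/r²) are summed individually instead of assuming one shared far-field distance. The geometry, constants, and the omission of phase and projected-aperture terms are all simplifying assumptions.

```python
import numpy as np

def nearfield_snr(user, module_centers, n_elem=16, spacing=0.05, g=1e-3):
    """Toy near-field SNR bound under optimal (MRC-style) combining: sum
    g / r^2 over every element, with each element's own exact distance."""
    snr = 0.0
    for c in module_centers:                    # modules strung along the y-axis
        offs = (np.arange(n_elem) - (n_elem - 1) / 2) * spacing
        elems = np.stack([np.zeros(n_elem), c + offs], axis=1)
        r2 = ((user - elems) ** 2).sum(axis=1)  # squared element-user distances
        snr += (g / r2).sum()
    return snr

user = np.array([30.0, 0.0])  # user 30 m broadside of the array
# same element count, spread-out modular layout vs. a tighter collocated one
print(nearfield_snr(user, module_centers=np.arange(-5, 6) * 2.0))
print(nearfield_snr(user, module_centers=np.arange(-5, 6) * 0.8))
```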
Major interactions are known to trigger star formation in galaxies and alter their color. We study major interactions in filaments and sheets using SDSS data to understand the influence of large-scale environments on galaxy interactions. We identify the galaxies in filaments and sheets using the local dimension and also find the major pairs residing in these environments. The star formation rate (SFR) and color of the interacting galaxies as a function of pair separation are separately analyzed in filaments and sheets. The analysis is repeated for three volume-limited samples covering different magnitude ranges. The major pairs residing in the filaments show a significantly higher SFR and bluer color than those residing in the sheets up to a projected pair separation of ~50 kpc. We observe a complete reversal of this behavior for both the SFR and color of the galaxy pairs having a projected separation larger than 50 kpc. Some earlier studies report that galaxy pairs align with the filament axis. Such alignment inside filaments indicates anisotropic accretion that may cause these differences. We do not observe these trends in the brighter galaxy samples. The pairs in filaments and sheets from the brighter galaxy samples trace relatively denser regions in these environments. The absence of these trends in the brighter samples may be explained by the dominant effect of the local density over the effects of the large-scale environment.
Funding: Financial support from the SERB, DST, Government of India through the project CRG/2019/001110; IUCAA, Pune for providing support through an associateship program; and IISER Tirupati for support through a postdoctoral fellowship. Funding for the SDSS and SDSS-II has been provided by the Alfred P. Sloan Foundation, the U.S. Department of Energy, the National Aeronautics and Space Administration, the Japanese Monbukagakusho, the Max Planck Society, and the Higher Education Funding Council for England.
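A minimal sketch of the comparison underlying the analysis above: bin galaxy pairs by projected separation and compare the mean SFR between two environments. The mock data below encode a toy separation-dependent SFR boost and are not SDSS measurements.

```python
import numpy as np

rng = np.random.default_rng(7)

def mean_in_bins(sep, sfr, edges):
    """Mean SFR of galaxy pairs in projected-separation bins."""
    idx = np.digitize(sep, edges) - 1
    return np.array([sfr[idx == b].mean() for b in range(len(edges) - 1)])

# mock pairs: an interaction-driven boost decaying with separation,
# made stronger in "filaments" than in "sheets" (toy numbers)
sep_f = rng.uniform(10, 100, 3000)
sep_s = rng.uniform(10, 100, 3000)
sfr_f = 1.0 + 2.0 * np.exp(-sep_f / 25) + rng.normal(0, 0.3, 3000)
sfr_s = 1.0 + 1.0 * np.exp(-sep_s / 25) + rng.normal(0, 0.3, 3000)

edges = np.arange(10, 101, 15)  # separation bins in kpc
print(mean_in_bins(sep_f, sfr_f, edges).round(2))  # filament pairs
print(mean_in_bins(sep_s, sfr_s, edges).round(2))  # sheet pairs
```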