To ensure agreement between theoretical calculations and experimental data, parameters of selected nuclear physics models are perturbed and fine-tuned in nuclear data evaluations. This approach assumes that the chosen set of models accurately represents the ‘true’ distribution of the considered observables. Furthermore, the models are chosen globally, indicating their applicability across the entire energy range of interest. However, this approach overlooks uncertainties inherent in the models themselves. In this work, we propose that, instead of globally selecting a winning model set and proceeding with it as if it were the ‘true’ model set, we take a weighted average over multiple models within a Bayesian model averaging (BMA) framework, each weighted by its posterior probability. The method involves executing a set of TALYS calculations by randomly varying multiple nuclear physics models and their parameters to yield a vector of calculated observables. Next, the likelihood function values computed at each incident energy point were combined with the prior distributions to obtain updated posterior distributions for selected cross sections and the elastic angular distributions. As the cross sections and elastic angular distributions were updated locally on a per-energy-point basis, the approach typically results in discontinuities or “kinks” in the cross section curves, and these were addressed using spline interpolation. The proposed BMA method was applied to the evaluation of proton-induced reactions on ^(58)Ni between 1 and 100 MeV. The results demonstrated a favorable comparison with experimental data as well as with the TENDL-2023 evaluation.
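The posterior weighting described in the BMA abstract above can be illustrated with a minimal sketch for a single incident energy point. It assumes a Gaussian likelihood proportional to exp(-chi2/2) and uniform priors by default; the function name `bma_average` and the inputs are hypothetical and are not taken from the paper or from TALYS.

```python
import math

def bma_average(predictions, chi2, priors=None):
    """Posterior-weighted mean and spread of model predictions at one energy point.

    predictions: calculated observable from each model/parameter draw
    chi2: chi-square of each draw against experimental data (assumed Gaussian errors)
    priors: prior probability of each draw (uniform if omitted)
    """
    n = len(predictions)
    if priors is None:
        priors = [1.0 / n] * n
    # Likelihood up to a constant factor; subtracting the minimum chi2
    # avoids underflow and cancels out after normalisation.
    c_min = min(chi2)
    unnormalised = [p * math.exp(-0.5 * (c - c_min)) for p, c in zip(priors, chi2)]
    total = sum(unnormalised)
    weights = [u / total for u in unnormalised]
    mean = sum(w * y for w, y in zip(weights, predictions))
    variance = sum(w * (y - mean) ** 2 for w, y in zip(weights, predictions))
    return mean, math.sqrt(variance), weights

# Two hypothetical cross-section values (mb) with equal chi2 get equal weight.
mean, spread, weights = bma_average([100.0, 120.0], [1.0, 1.0])
```

Because each energy point is updated independently in this way, neighbouring points can receive different weights, which is where the "kinks" mentioned in the abstract would come from.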
The precise correction of atmospheric zenith tropospheric delay (ZTD) is significant for the Global Navigation Satellite System (GNSS) performance regarding positioning accuracy and convergence time. In the past decades, many empirical ZTD models based on either gridded or scattered ZTD products have been proposed and widely used in GNSS positioning applications. But there is no comprehensive evaluation of these models for the whole China region, which features complicated topography and climate. In this study, we comprehensively assess three typical empirical models, the IGGtropSH model (gridded, non-meteorology), the SHAtropE model (scattered, non-meteorology), and the GPT3 model (gridded, meteorology), using the Crustal Movement Observation Network of China (CMONOC). In general, the results show that the three models share consistent performance, with RMSE/bias of 37.45/1.63, 37.13/2.20, and 38.27/1.34 mm for the GPT3, SHAtropE, and IGGtropSH models, respectively. However, the models perform differently with respect to geographical distribution, elevation, seasonal variation, and daily variation. In the southeastern region of China, RMSE values are around 50 mm, much higher than those in the western region, which are approximately 20 mm. The SHAtropE model exhibits better performance for areas with large variations in elevation. The GPT3 model and the IGGtropSH model are more stable across different months, and the SHAtropE model, which is based on GNSS data, exhibits superior performance across various UTC epochs.
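The RMSE/bias figures quoted in the ZTD abstract above are standard error statistics. A minimal sketch of how they can be computed for one station follows; it assumes the difference is taken as model minus reference, a sign convention the abstract does not state, and the sample values are made up.

```python
import math

def bias_and_rmse(model_ztd_mm, reference_ztd_mm):
    """Return (bias, RMSE) in mm of modelled ZTD against reference ZTD."""
    # Assumed convention: difference = model - reference.
    diffs = [m - r for m, r in zip(model_ztd_mm, reference_ztd_mm)]
    bias = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmse

# Hypothetical values: the model is 10 mm high at one epoch and 10 mm low at the next.
bias, rmse = bias_and_rmse([2410.0, 2390.0], [2400.0, 2400.0])
```

A small bias combined with a large RMSE, as in this toy case, indicates errors that cancel on average but are individually large.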
A patient co-infected with COVID-19 and viral hepatitis B can be at greater risk of severe complications than one with a single infection. This study develops a comprehensive stochastic model to assess the epidemiological impact of vaccine booster doses on the co-dynamics of viral hepatitis B and COVID-19. The model is fitted to real COVID-19 data from Pakistan. The proposed model incorporates logistic growth and saturated incidence functions. Rigorous analyses using the tools of stochastic calculus are performed to study appropriate conditions for the existence of unique global solutions, stationary distribution in the sense of ergodicity, and disease extinction. The stochastic threshold estimated from the data fitting is R_(0)^(S)=3.0651. Numerical assessments are implemented to illustrate the impact of double-dose vaccination and saturated incidence functions on the dynamics of both diseases. The effects of stochastic white noise intensities are also highlighted.
Long-term navigation ability based on consumer-level wearable inertial sensors plays an essential role in various emerging fields, for instance, smart healthcare, emergency rescue, and soldier positioning. The performance of existing long-term navigation algorithms is limited by the cumulative error of inertial sensors, disturbed local magnetic fields, and complex motion modes of the pedestrian. This paper develops a robust data and physical model dual-driven based trajectory estimation (DPDD-TE) framework, which can be applied to long-term navigation tasks. A Bi-directional Long Short-Term Memory (Bi-LSTM) based quasi-static magnetic field (QSMF) detection algorithm is developed for extracting useful magnetic observations for heading calibration, and another Bi-LSTM is adopted for walking speed estimation by considering hybrid human motion information over a specific time period. In addition, a data and physical model dual-driven based multi-source fusion model is proposed to integrate basic INS mechanization with multi-level constraints and observations for maintaining accuracy in long-term navigation tasks, and it is enhanced by a loop detection algorithm assisted by magnetic and trajectory features. Real-world experiments indicate that the proposed DPDD-TE outperforms existing algorithms, and the final estimated heading and positioning accuracy reach 5° and less than 2 m, respectively, over a time period of 30 min.
Differences in the imaging subgroups of cerebral small vessel disease (CSVD) need to be further explored. First, we use propensity score matching to obtain balanced datasets. Then random forest (RF) is adopted to classify the subgroups, in comparison with support vector machine (SVM) and extreme gradient boosting (XGBoost), and to select the features. The top 10 important features are included in the stepwise logistic regression, and the odds ratio (OR) and 95% confidence interval (CI) are obtained. There are 41290 adult inpatient records diagnosed with CSVD. The accuracy and area under the curve (AUC) of RF are close to 0.7, and RF performs best in classification compared to SVM and XGBoost. The OR and 95% CI of hematocrit for white matter lesions (WMLs), lacunes, microbleeds, atrophy, and enlarged perivascular space (EPVS) are 0.9875 (0.9857−0.9893), 0.9728 (0.9705−0.9752), 0.9782 (0.9740−0.9824), 1.0093 (1.0081−1.0106), and 0.9716 (0.9597−0.9832). The OR and 95% CI of red cell distribution width for WMLs, lacunes, atrophy, and EPVS are 0.9600 (0.9538−0.9662), 0.9630 (0.9559−0.9702), 1.0751 (1.0686−1.0817), and 0.9304 (0.8864−0.9755). The OR and 95% CI of platelet distribution width for WMLs, lacunes, and microbleeds are 1.1796 (1.1636−1.1958), 1.1663 (1.1476−1.1853), and 1.0416 (1.0152−1.0687). This study proposes a new analytical framework to select important clinical markers for CSVD with machine learning based on a common data model, which has low cost, fast speed, large sample size, and continuous data sources.
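The odds ratios and 95% confidence intervals reported in the CSVD abstract above are typically obtained by exponentiating a logistic regression coefficient and its Wald interval. The sketch below shows that conversion; the coefficient and standard error are hypothetical values chosen for illustration and are not results from the study.

```python
import math

def odds_ratio_ci(beta, standard_error, z=1.959963984540054):
    """Convert a logistic regression coefficient to (OR, CI lower, CI upper).

    z defaults to the two-sided 95% normal quantile (Wald interval).
    """
    return (
        math.exp(beta),
        math.exp(beta - z * standard_error),
        math.exp(beta + z * standard_error),
    )

# Hypothetical coefficient for a per-unit increase in a blood marker.
or_value, ci_low, ci_high = odds_ratio_ci(beta=-0.0126, standard_error=0.0009)
```

An interval that lies entirely below 1, as in this illustration, corresponds to a marker whose increase is associated with lower odds of the outcome.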
Since the launch of the Google Earth Engine (GEE) cloud platform in 2010, it has been widely used, leading to a wealth of valuable information. However, the potential of GEE for forest resource management has not been fully exploited. To extract dominant woody plant species, GEE combined Sentinel-1 (S1) and Sentinel-2 (S2) data with the addition of the National Forest Resources Inventory (NFRI) and topographic data, resulting in a 10 m resolution multimodal geospatial dataset for subtropical forests in southeast China. Spectral and texture features, red-edge bands, and vegetation indices of S1 and S2 data were computed. A hierarchical model obtained information on forest distribution and area and the dominant woody plant species. The results suggest that combining data sources from the S1 winter and S2 yearly ranges enhances accuracy in forest distribution and area extraction compared to using either data source independently. Similarly, for dominant woody species recognition, using S1 winter and S2 data across all four seasons was accurate. Including terrain factors and removing spatial correlation from NFRI sample points further improved the recognition accuracy. The optimal forest extraction achieved an overall accuracy (OA) of 97.4% and a map-level image classification efficacy (MICE) of 96.7%. OA and MICE were 83.6% and 80.7% for dominant species extraction, respectively. The high accuracy and efficacy values indicate that the hierarchical recognition model based on multimodal remote sensing data performed extremely well for extracting information about dominant woody plant species. Visualizing the results using the GEE application allows for an intuitive display of forest and species distribution, offering significant convenience for forest resource monitoring.
This paper was motivated by the existing problems of cloud data storage at Imo State University, Nigeria, such as outsourced data causing the loss of data and misuse of customer information by unauthorized users or hackers, thereby leaving customer/client data visible and unprotected. This also exposed clients/customers to enormous risk due to defective equipment, bugs, faulty servers, and specious actions. The aim of this paper therefore is to analyze a secure model using Unicode Transformation Format (UTF) base 64 algorithms for storing data securely in the cloud. The Object Oriented Hypermedia Analysis and Design Methodology (OOHADM) was adopted. Python was used to develop the security model; role-based access control (RBAC) and multi-factor authentication (MFA) were integrated to enhance security in the information system, which was developed with HTML 5, JavaScript, Cascading Style Sheets (CSS) version 3, and PHP 7. This paper also discusses the following concepts: development of cloud computing, characteristics of cloud computing, cloud deployment models, cloud service models, etc. The results showed that the proposed enhanced security model for information systems of a corporate platform handled multiple authorization and authentication threats, in that a single login page directs all login requests from the different modules to one Single Sign On Server (SSOS). This in turn redirects users to their requested resources/modules once authenticated, leveraging geo-location integration for physical location validation. The newly developed system will address the shortcomings of the existing systems and reduce the time and resources incurred while using them.
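The cloud storage abstract above mentions UTF and base 64 without giving details, so the following is only a minimal sketch of a UTF-8 plus Base64 round trip using the Python standard library. The helper names `encode_record` and `decode_record` are hypothetical. Base64 is a reversible encoding rather than encryption, so in this sketch any confidentiality would have to come from separate controls such as the RBAC and MFA layers the abstract describes.

```python
import base64

def encode_record(text):
    """Encode a text record as UTF-8 bytes, then as an ASCII Base64 string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")

def decode_record(token):
    """Reverse encode_record: Base64 string back to the original text."""
    return base64.b64decode(token.encode("ascii")).decode("utf-8")

# Round trip on a hypothetical record containing non-ASCII characters.
token = encode_record("student: Adaeze, fee: ₦5000")
original = decode_record(token)
```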
Smart metering has gained considerable attention as a research focus due to its reliability and energy-efficient nature compared to traditional electromechanical metering systems. Existing methods primarily focus on data management rather than emphasizing efficiency. Accurate prediction of electricity consumption is crucial for enabling intelligent grid operations, including resource planning and demand-supply balancing. Smart metering solutions offer users the benefits of effectively interpreting their energy utilization and optimizing costs. Motivated by this, this paper presents an Intelligent Energy Utilization Analysis using Smart Metering Data (IUA-SMD) model to determine energy consumption patterns. The proposed IUA-SMD model comprises three major processes: data pre-processing, feature extraction, and classification with parameter optimization. We employ the extreme learning machine (ELM) based classification approach within the IUA-SMD model to derive optimal energy utilization labels. Additionally, we apply the shell game optimization (SGO) algorithm to enhance the classification efficiency of the ELM by optimizing its parameters. The effectiveness of the IUA-SMD model is evaluated using an extensive dataset of smart metering data, and the results are analyzed in terms of accuracy and mean square error (MSE). The proposed model demonstrates superior performance, achieving a maximum accuracy of 65.917% and a minimum MSE of 0.096. These results highlight the potential of the IUA-SMD model for enabling efficient energy utilization through intelligent analysis of smart metering data.
The Qilian Mountains, a national key ecological function zone in Western China, play a pivotal role in ecosystem services. However, the distribution of its dominant tree species, Picea crassifolia (Qinghai spruce), has decreased dramatically in the past decades due to climate change and human activity, which may have influenced its ecological functions. To restore these ecological functions, reasonable reforestation is the key measure. Many previous efforts have predicted the potential distribution of Picea crassifolia, which provides guidance on regional reforestation policy. However, all of them were performed at low spatial resolution, thus ignoring the naturally patchy distribution of Picea crassifolia. Here, we modeled the distribution of Picea crassifolia with species distribution models at high spatial resolutions. For many models, the area under the receiver operating characteristic curve (AUC) is larger than 0.9, suggesting excellent precision. The AUC of models at 30 m is higher than that of models at 90 m, and the current potential distribution of Picea crassifolia is more closely aligned with its actual distribution at 30 m, demonstrating that finer data resolution improves model performance. In addition, for models at 90 m resolution, annual precipitation (Bio12) had the greatest influence on the distribution of Picea crassifolia, while aspect became the most important factor at 30 m, indicating the crucial role of finer topographic data in modeling species with patchy distribution. The current distribution of Picea crassifolia was concentrated in the northern and central parts of the study area, and this pattern will be maintained under future scenarios, although some habitat loss in the central parts and gain in the eastern regions is expected owing to increasing temperatures and precipitation. Our findings can guide protective and restoration strategies for the Qilian Mountains, which would benefit regional ecological balance.
We estimate tree heights using polarimetric interferometric synthetic aperture radar (PolInSAR) data constructed from dual-polarization (dual-pol) SAR data and the random volume over ground (RVoG) model. Considering the Sentinel-1 SAR dual-pol configuration (S_(VV), vertically transmitted and vertically received, and S_(VH), vertically transmitted and horizontally received), one notes that S_(HH), the horizontally transmitted and horizontally received scattering element, is unavailable. The S_(HH) data were constructed using the S_(VH) data, and polarimetric SAR (PolSAR) data were obtained. The proposed approach was first verified in simulation with satisfactory results. It was next applied to construct PolInSAR data from a pair of dual-pol Sentinel-1A acquisitions at Duke Forest, North Carolina, USA. According to local observations and forest descriptions, the range of estimated tree heights was reasonable overall. Comparing the heights with the ICESat-2 tree heights at 23 sampling locations, relative errors of 5 points were within ±30%. Errors of 8 points ranged from 30% to 40%, but errors of the remaining 10 points were >40%. The results are encouraging, as error reduction is possible. For instance, the construction of PolSAR data should not be limited to using S_(VH), and a combination of S_(VH) and S_(VV) should be explored. Also, an ensemble of tree heights derived from multiple PolInSAR data sets can be considered, since tree heights do not vary much over a time frame of months or one season.
Compositional data, such as relative information, is a crucial aspect of machine learning and other related fields. It is typically recorded as closed data, that is, data that sums to a constant such as 100%. The statistical linear model is the most used technique for identifying hidden relationships between underlying random variables of interest. However, data quality is a significant challenge in machine learning, especially when missing data is present. The linear regression model is a commonly used statistical modeling technique applied in various settings to find relationships between variables of interest. When estimating linear regression parameters, which are useful for tasks such as future prediction and partial effects analysis of independent variables, maximum likelihood estimation (MLE) is the method of choice. However, many datasets contain missing observations, which can lead to costly and time-consuming data recovery. To address this issue, the expectation-maximization (EM) algorithm has been suggested as a solution for situations involving missing data. The EM algorithm iteratively finds maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models that depend on unobserved variables or data. Using the current estimate as input, the expectation (E) step constructs the expected log-likelihood function. The maximization (M) step then finds the parameters that maximize the expected log-likelihood determined in the E step. This study examined how well the EM algorithm performed on a simulated compositional dataset with missing observations. It used both robust least squares and ordinary least squares regression techniques. The efficacy of the EM algorithm was compared with two alternative imputation techniques, k-Nearest Neighbor (k-NN) and mean imputation, in terms of Aitchison distances and covariance.
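The compositional data abstract above compares imputation methods using Aitchison distances. A minimal sketch of that metric follows, computed as the Euclidean distance between centred log-ratio (clr) transforms; the function names are illustrative, and all parts are assumed to be strictly positive.

```python
import math

def clr(composition):
    """Centred log-ratio transform of a composition with strictly positive parts."""
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def aitchison_distance(x, y):
    """Aitchison distance: Euclidean distance between the clr transforms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))

# Distance between an original composition and a hypothetical imputed version.
d = aitchison_distance([0.2, 0.3, 0.5], [0.25, 0.25, 0.5])
```

The distance is unchanged when either composition is rescaled, for example from proportions to percentages, which is why it suits closed data better than a plain Euclidean distance.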
This study proposes the use of the MERISE conceptual data model to create indicators for monitoring and evaluating the effectiveness of vocational training in the Republic of Congo. The importance of MERISE for structuring and analyzing data is underlined, as it enables the measurement of the adequacy between training and the needs of the labor market. The innovation of the study lies in the adaptation of the MERISE model to the local context, the development of innovative indicators, and the integration of a participatory approach including all relevant stakeholders. Contextual adaptation and local innovation: The study suggests adapting MERISE to the specific context of the Republic of Congo, considering the local particularities of the labor market. Development of innovative indicators and new measurement tools: It proposes creating indicators to assess skills matching and employer satisfaction, which are crucial for evaluating the effectiveness of vocational training. Participatory approach and inclusion of stakeholders: The study emphasizes actively involving training centers, employers, and recruitment agencies in the evaluation process. This participatory approach ensures that the perspectives of all stakeholders are considered, leading to more relevant and practical outcomes. Using the MERISE model allows for: • Rigorous data structuring, organization, and standardization: Clearly defining entities and relationships facilitates data organization and standardization, crucial for effective data analysis. • Facilitation of monitoring, analysis, and relevant indicators: Developing both quantitative and qualitative indicators helps measure the effectiveness of training in relation to the labor market, allowing for a comprehensive evaluation. • Improved communication and common language: By providing a common language for different stakeholders, MERISE enhances communication and collaboration, ensuring that all parties have a shared understanding. The study’s approach and contribution to existing research lie in: • Structured theoretical and practical framework and holistic approach: The study offers a structured framework for data collection and analysis, covering both quantitative and qualitative aspects, thus providing a comprehensive view of the training system. • Reproducible methodology and international comparison: The proposed methodology can be replicated in other contexts, facilitating international comparison and the adoption of best practices. • Extension of knowledge and new perspective: By integrating a participatory approach and developing indicators adapted to local needs, the study extends existing research and offers new perspectives on vocational training evaluation.
In this paper, a new multimedia data model, namely the object-relation hypermedia data model (O-RHDM), is proposed and designed based on the extension and integration of the non-first-normal-form (NF2) multimedia data model. Its principle, mathematical description, algebraic operations, organization method, and storage model are also discussed. A specific application example in multimedia spatial data management is given, based on the Hainan multimedia touring information system.
Because radiation belt electrons can pose a potential threat to the safety of satellites orbiting in space, it is of great importance to develop a reliable model that can predict the highly dynamic variations in outer radiation belt electron fluxes. In the present study, we develop a forecast model of radiation belt electron fluxes based on the data assimilation method, using Van Allen Probe measurements combined with three-dimensional radiation belt numerical simulations. Our forecast model can cover the entire outer radiation belt with a high temporal resolution (1 hour) and a spatial resolution of 0.25 L over a wide range of both electron energy (0.1-5.0 MeV) and pitch angle (5°-90°). On the basis of this model, we forecast hourly electron fluxes for the next 1, 2, and 3 days during an intense geomagnetic storm and evaluate the corresponding prediction performance. Our model can reasonably predict the storm-time evolution of radiation belt electrons with high prediction efficiency (up to ~0.8-1). The best prediction performance is found for ~0.3-3 MeV electrons at L=~3.25-4.5, which extends to higher L and lower energies with increasing pitch angle. Our results demonstrate that the forecast model developed can be a powerful tool to predict the spatiotemporal changes in outer radiation belt electron fluxes, and the model has both scientific significance and practical implications.
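The radiation belt abstract above reports prediction efficiency without defining it. The sketch below uses a definition common in forecast verification, one minus the ratio of the squared forecast error to the variance of the observations; whether the authors use exactly this form, and whether they apply it to fluxes or to their logarithms, is an assumption here.

```python
def prediction_efficiency(observed, forecast):
    """PE = 1 - sum((obs - fc)^2) / sum((obs - mean(obs))^2).

    PE = 1 is a perfect forecast; PE = 0 is no better than always
    predicting the mean of the observations.
    """
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / variance_sum

# Hypothetical log10 flux values at three successive hours.
pe = prediction_efficiency([3.0, 4.0, 5.0], [3.1, 3.9, 5.2])
```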
A flood is a significant damaging natural calamity that causes loss of life and property. Earlier work on the construction of flood prediction models was intended to reduce risks, suggest policies, reduce mortality, and limit property damage caused by floods. The massive amount of data generated by social media platforms such as Twitter opens the door to flood analysis. Because of the real-time nature of Twitter data, some government agencies and authorities have used it to track natural catastrophe events in order to build a more rapid rescue strategy. However, due to the short duration of Tweets, it is difficult to construct a perfect prediction model for determining floods. Machine learning (ML) and deep learning (DL) approaches can be used to statistically develop flood prediction models. At the same time, the vast amount of Tweets necessitates the use of a big data analytics (BDA) tool for flood prediction. In this regard, this work provides an optimal deep learning-based flood forecasting model with big data analytics (ODLFF-BDA) based on Twitter data. The suggested ODLFF-BDA technique intends to anticipate the existence of floods using tweets in a big data setting. The ODLFF-BDA technique comprises data pre-processing to convert the input tweets into a usable format. In addition, a Bidirectional Encoder Representations from Transformers (BERT) model is used to generate emotive contextual embeddings from tweets. Furthermore, a gated recurrent unit (GRU) with a Multilayer Convolutional Neural Network (MLCNN) is used to extract local data and predict the flood. Finally, an Equilibrium Optimizer (EO) is used to fine-tune the hyper-parameters of the GRU and MLCNN models in order to increase prediction performance. Memory usage is reduced to less than 3.5 MB compared with the other techniques. The ODLFF-BDA technique’s performance was validated using a benchmark Kaggle dataset, and the findings showed that it outperformed other recent approaches significantly.
Remaining useful life (RUL) prediction is one of the most crucial elements in prognostics and health management (PHM). To address imperfect prior information, this paper proposes an RUL prediction method based on a nonlinear random coefficient regression (RCR) model that fuses failure time data. Firstly, some interesting properties of parameter estimation based on the nonlinear RCR model are given. Based on these properties, the failure time data can reasonably be fused as prior information. Specifically, the fixed parameters are calculated from the field degradation data of the evaluated equipment, and the prior information of the random coefficient is estimated by fusing the failure time data of congeneric equipment. Then, the prior information of the random coefficient is updated online under the Bayesian framework, and the probability density function (PDF) of the RUL is derived considering the limitation of the failure threshold. Finally, two case studies are used for experimental verification. Compared with the traditional Bayesian method, the proposed method can effectively reduce the influence of imperfect prior information and improve the accuracy of RUL prediction.
The effectiveness of the Business Intelligence (BI) system mainly depends on the quality of knowledge it produces. The decision-making process is hindered, and the user’s trust is lost, if the knowledge offered is undesired or of poor quality. A Data Warehouse (DW) is a huge collection of data gathered from many sources and an important part of any BI solution to assist management in making better decisions. The Extract, Transform, and Load (ETL) process is the backbone of a DW system, and it is responsible for moving data from source systems into the DW system. The more mature the ETL process, the more reliable the DW system. In this paper, we propose the ETL Maturity Model (EMM) that assists organizations in achieving a high-quality ETL system and thereby enhancing the quality of knowledge produced. The EMM is made up of five levels of maturity, i.e., Chaotic, Acceptable, Stable, Efficient, and Reliable. Each level of maturity contains Key Process Areas (KPAs) that have been endorsed by industry experts and include all critical features of a good ETL system. Quality Objectives (QOs) are defined procedures that, when implemented, result in a high-quality ETL process. Each KPA has its own set of QOs, the execution of which meets the requirements of that KPA. Multiple brainstorming sessions with relevant industry experts helped to enhance the model. EMM was deployed in two key projects utilizing multiple case studies to supplement the validation process and support our claim. This model can assist organizations in improving their current ETL process and transforming it into a more mature ETL system. This model can also provide high-quality information to assist users in making better decisions and gaining their trust.
Non-contact remote sensing techniques, such as terrestrial laser scanning (TLS) and unmanned aerial vehicle (UAV) photogrammetry, have been globally applied for landslide monitoring in high and steep mountainous areas. These techniques acquire terrain data and enable ground deformation monitoring. However, practical application of these technologies still faces many difficulties due to complex terrain, limited access, and dense vegetation. For instance, monitoring high and steep slopes can obstruct the TLS sightline, and the accuracy of the UAV model may be compromised by the absence of ground control points (GCPs). This paper proposes a TLS- and UAV-based method for monitoring landslide deformation in high mountain valleys using traditional real-time kinematics (RTK)-based control points (RCPs), low-precision TLS-based control points (TCPs), and assumed control points (ACPs) to achieve high-precision surface deformation analysis under obstructed vision and impassable conditions. The effects of GCP accuracy, GCP quantity, and automatic tie point (ATP) quantity on the accuracy of UAV modeling and surface deformation analysis were comprehensively analyzed. The results show that, by adding low-accuracy GCPs, the proposed method allows the monitoring accuracy of landslides to exceed the accuracy of the GCPs themselves. The proposed method was implemented for monitoring the Xinhua landslide in Baoxing County, China, and was validated against data from multiple sources.
The curse of dimensionality refers to the problem o increased sparsity and computational complexity when dealing with high-dimensional data.In recent years,the types and vari ables of industrial data have increased si...The curse of dimensionality refers to the problem o increased sparsity and computational complexity when dealing with high-dimensional data.In recent years,the types and vari ables of industrial data have increased significantly,making data driven models more challenging to develop.To address this prob lem,data augmentation technology has been introduced as an effective tool to solve the sparsity problem of high-dimensiona industrial data.This paper systematically explores and discusses the necessity,feasibility,and effectiveness of augmented indus trial data-driven modeling in the context of the curse of dimen sionality and virtual big data.Then,the process of data augmen tation modeling is analyzed,and the concept of data boosting augmentation is proposed.The data boosting augmentation involves designing the reliability weight and actual-virtual weigh functions,and developing a double weighted partial least squares model to optimize the three stages of data generation,data fusion and modeling.This approach significantly improves the inter pretability,effectiveness,and practicality of data augmentation in the industrial modeling.Finally,the proposed method is verified using practical examples of fault diagnosis systems and virtua measurement systems in the industry.The results demonstrate the effectiveness of the proposed approach in improving the accu racy and robustness of data-driven models,making them more suitable for real-world industrial applications.展开更多
Climate change and global warming result in natural hazards, including flash floods. Flash floods can create blue spots: areas where transport networks (roads, tunnels, bridges, passageways) and other engineering structures within them are at flood risk. Assessments of the economic and social impact of flooding show that the damage caused by flash floods leading to blue spots is very high, both in dollar amount and in direct impacts on people's lives. The impact of flooding within blue spots is either infrastructural or social, affecting lives and properties. Currently, more than 16.1 million properties in the U.S. are vulnerable to flooding, and this is projected to increase by 3.2% within the next 30 years. Several models have been developed for flood risk analysis and management, including hydrological models, algorithms, machine learning models, and geospatial models. The models and methods reviewed are based on location data collection, statistical analysis and computation, and visualization (mapping). This research aims to create a blue spots model for the State of Tennessee using an ArcGIS visual programming language (model) and a data analytics pipeline.
Funding: funding from the Paul Scherrer Institute, Switzerland, through the NES/GFA-ABE Cross Project.
Abstract: To ensure agreement between theoretical calculations and experimental data, parameters of selected nuclear physics models are perturbed and fine-tuned in nuclear data evaluations. This approach assumes that the chosen set of models accurately represents the 'true' distribution of the considered observables. Furthermore, the models are chosen globally, which implies that they apply across the entire energy range of interest. However, this approach overlooks uncertainties inherent in the models themselves. In this work, we propose that, instead of globally selecting a winning model set and proceeding as if it were the 'true' model set, we take a weighted average over multiple models within a Bayesian model averaging (BMA) framework, each weighted by its posterior probability. The method involves executing a set of TALYS calculations in which multiple nuclear physics models and their parameters are varied randomly to yield a vector of calculated observables. Next, the likelihood function values computed at each incident energy point are combined with the prior distributions to obtain updated posterior distributions for selected cross sections and the elastic angular distributions. Because the cross sections and elastic angular distributions are updated locally on a per-energy-point basis, the approach typically results in discontinuities or "kinks" in the cross section curves, which were addressed using spline interpolation. The proposed BMA method was applied to the evaluation of proton-induced reactions on ^(58)Ni between 1 and 100 MeV. The results compare favorably with experimental data as well as with the TENDL-2023 evaluation.
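The per-energy-point weighting described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian likelihood, the uniform prior over model sets, the function names, and the numbers are all assumptions chosen for illustration, and no TALYS call is made.

```python
import math

def gaussian_log_likelihood(calc, exp_value, exp_sigma):
    """Log-likelihood (up to a constant) of one calculated value given one measurement."""
    z = (calc - exp_value) / exp_sigma
    return -0.5 * z * z

def bma_weights(log_likelihoods, priors):
    """Posterior model weights: w_k proportional to prior_k * exp(log_likelihood_k)."""
    # Subtract the maximum log-likelihood for numerical stability.
    m = max(log_likelihoods)
    unnorm = [p * math.exp(ll - m) for ll, p in zip(log_likelihoods, priors)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def bma_estimate(calculated, exp_value, exp_sigma, priors=None):
    """Weighted average of model calculations at a single incident energy."""
    n = len(calculated)
    if priors is None:
        priors = [1.0 / n] * n  # assumption: uniform prior over model sets
    lls = [gaussian_log_likelihood(c, exp_value, exp_sigma) for c in calculated]
    weights = bma_weights(lls, priors)
    mean = sum(w * c for w, c in zip(weights, calculated))
    return mean, weights

# Hypothetical cross sections (mb) from three model sets at one energy point.
calculated = [95.0, 100.0, 110.0]
mean, weights = bma_estimate(calculated, exp_value=100.0, exp_sigma=5.0)
print(mean, weights)
```

Repeating this at every energy point gives a locally updated curve, which is why the abstract mentions a smoothing step afterwards.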
Funding: supported by the National Natural Science Foundation of China (42204022, 52174160, 52274169), the Open Fund of Hubei Luojia Laboratory (230100031), the Open Fund of State Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University (23P02), the Fundamental Research Funds for the Central Universities (2023ZKPYDC10), and the China University of Mining and Technology-Beijing Innovation Training Program for College Students (202302014, 202202023).
Abstract: The precise correction of atmospheric zenith tropospheric delay (ZTD) is significant for Global Navigation Satellite System (GNSS) performance in terms of positioning accuracy and convergence time. In the past decades, many empirical ZTD models based on either gridded or scattered ZTD products have been proposed and widely used in GNSS positioning applications. However, there has been no comprehensive evaluation of these models for the whole China region, which features complicated topography and climate. In this study, we comprehensively assess three typical empirical models, the IGGtropSH model (gridded, non-meteorology), the SHAtropE model (scattered, non-meteorology), and the GPT3 model (gridded, meteorology), using the Crustal Movement Observation Network of China (CMONOC). In general, the results show that the three models share consistent performance, with RMSE/bias of 37.45/1.63, 37.13/2.20, and 38.27/1.34 mm for the GPT3, SHAtropE, and IGGtropSH models, respectively. However, the models perform differently with respect to geographical distribution, elevation, seasonal variation, and daily variation. In the southeastern region of China, RMSE values are around 50 mm, much higher than those in the western region, which are approximately 20 mm. The SHAtropE model performs better in areas with large variations in elevation. The GPT3 and IGGtropSH models are more stable across different months, and the SHAtropE model, which is based on GNSS data, performs best across various UTC epochs.
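The RMSE/bias pairs quoted in this abstract are standard error statistics. The sketch below shows how they are commonly computed; the sign convention (model minus reference) and the sample values are assumptions, not taken from the study.

```python
import math

def bias_and_rmse(model_ztd_mm, reference_ztd_mm):
    """Return (bias, RMSE) of modeled ZTD against reference ZTD, both in mm."""
    # Assumed convention: difference = model value minus reference value.
    diffs = [m - r for m, r in zip(model_ztd_mm, reference_ztd_mm)]
    n = len(diffs)
    bias = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    return bias, rmse

# Hypothetical ZTD values (mm) at four epochs for one station.
model = [2400.0, 2380.0, 2450.0, 2410.0]
reference = [2390.0, 2400.0, 2440.0, 2400.0]
bias, rmse = bias_and_rmse(model, reference)
print(bias, rmse)
```

A small bias with a much larger RMSE, as in the reported figures, indicates errors that largely cancel on average but remain sizeable at individual epochs.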
Abstract: A patient co-infected with COVID-19 and viral hepatitis B can be at greater risk of severe complications than a patient with a single infection. This study develops a comprehensive stochastic model to assess the epidemiological impact of vaccine booster doses on the co-dynamics of viral hepatitis B and COVID-19. The model is fitted to real COVID-19 data from Pakistan. The proposed model incorporates logistic growth and saturated incidence functions. Rigorous analyses using the tools of stochastic calculus are performed to study appropriate conditions for the existence of unique global solutions, a stationary distribution in the sense of ergodicity, and disease extinction. The stochastic threshold estimated from the data fitting is R_(0)^(S) = 3.0651. Numerical assessments are carried out to illustrate the impact of double-dose vaccination and saturated incidence functions on the dynamics of both diseases. The effects of stochastic white noise intensities are also highlighted.
Abstract: Long-term navigation ability based on consumer-level wearable inertial sensors plays an essential role in various emerging fields, for instance, smart healthcare, emergency rescue, and soldier positioning. The performance of existing long-term navigation algorithms is limited by the cumulative error of inertial sensors, disturbed local magnetic fields, and the complex motion modes of pedestrians. This paper develops a robust data and physical model dual-driven trajectory estimation (DPDD-TE) framework, which can be applied to long-term navigation tasks. A Bi-directional Long Short-Term Memory (Bi-LSTM) based quasi-static magnetic field (QSMF) detection algorithm is developed to extract useful magnetic observations for heading calibration, and another Bi-LSTM is adopted for walking speed estimation by considering hybrid human motion information over a specific time period. In addition, a data and physical model dual-driven multi-source fusion model is proposed to integrate basic INS mechanization with multi-level constraints and observations to maintain accuracy in long-term navigation tasks, and it is enhanced by a loop detection algorithm assisted by magnetic and trajectory features. Real-world experiments indicate that the proposed DPDD-TE outperforms existing algorithms, with final estimated heading and positioning accuracy reaching 5° and less than 2 m, respectively, over a time period of 30 min.
Funding: supported by the National Natural Science Foundation of China (Nos. 72204169 and 81825007), the Beijing Outstanding Young Scientist Program (No. BJJWZYJH01201910025030), the Capital's Funds for Health Improvement and Research (No. 2022-2-2045), the National Key R&D Program of China (Nos. 2022YFF1501500, 2022YFF1501501, 2022YFF1501502, 2022YFF1501503, 2022YFF1501504, and 2022YFF1501505), the Youth Beijing Scholar Program (No. 010), the Beijing Laboratory of Oral Health (No. PXM2021_014226_000041), the Beijing Talent Project-Class A: Innovation and Development (No. 2018A12), the National Ten-Thousand Talent Plan-Leadership of Scientific and Technological Innovation, and the National Key R&D Program of China (Nos. 2017YFC1307900 and 2017YFC1307905).
Abstract: Differences in the imaging subgroups of cerebral small vessel disease (CSVD) need to be further explored. First, we use propensity score matching to obtain balanced datasets. Then random forest (RF) is adopted to classify the subgroups, in comparison with support vector machine (SVM) and extreme gradient boosting (XGBoost), and to select the features. The top 10 important features are included in a stepwise logistic regression, and the odds ratio (OR) and 95% confidence interval (CI) are obtained. There are 41290 adult inpatient records diagnosed with CSVD. The accuracy and area under the curve (AUC) of RF are close to 0.7, the best classification performance among RF, SVM, and XGBoost. The OR and 95% CI of hematocrit for white matter lesions (WMLs), lacunes, microbleeds, atrophy, and enlarged perivascular space (EPVS) are 0.9875 (0.9857−0.9893), 0.9728 (0.9705−0.9752), 0.9782 (0.9740−0.9824), 1.0093 (1.0081−1.0106), and 0.9716 (0.9597−0.9832). The OR and 95% CI of red cell distribution width for WMLs, lacunes, atrophy, and EPVS are 0.9600 (0.9538−0.9662), 0.9630 (0.9559−0.9702), 1.0751 (1.0686−1.0817), and 0.9304 (0.8864−0.9755). The OR and 95% CI of platelet distribution width for WMLs, lacunes, and microbleeds are 1.1796 (1.1636−1.1958), 1.1663 (1.1476−1.1853), and 1.0416 (1.0152−1.0687). This study proposes a new analytical framework for selecting important clinical markers for CSVD with machine learning based on a common data model, which offers low cost, fast speed, a large sample size, and continuous data sources.
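Odds ratios and confidence intervals of the kind listed in this abstract are conventionally derived from a logistic regression coefficient and its standard error. The sketch below shows that conversion with a Wald interval; the coefficient and standard error are hypothetical values of a similar magnitude, not numbers from the study, and the study's exact interval method is not stated in the abstract.

```python
import math

def odds_ratio_ci(beta, se, z=1.959964):
    """Odds ratio and Wald confidence interval from a logistic regression coefficient.

    beta: estimated coefficient (log-odds per unit increase of the predictor)
    se:   standard error of beta
    z:    normal quantile (default gives an approximate 95% interval)
    """
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return odds_ratio, lower, upper

# Hypothetical coefficient and standard error for one marker.
or_value, ci_low, ci_high = odds_ratio_ci(beta=-0.0126, se=0.0009)
print(or_value, ci_low, ci_high)
```

An interval that lies entirely below 1 corresponds to a negative association with the outcome, and one entirely above 1 to a positive association.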
Funding: supported by the National Technology Extension Fund of Forestry, Forest Vegetation Carbon Storage Monitoring Technology Based on Watershed Algorithm ([2019]06), and the Fundamental Research Funds for the Central Universities (No. PTYX202107).
Abstract: Since the launch of the Google Earth Engine (GEE) cloud platform in 2010, it has been widely used, leading to a wealth of valuable information. However, the potential of GEE for forest resource management has not been fully exploited. To extract dominant woody plant species, GEE was used to combine Sentinel-1 (S1) and Sentinel-2 (S2) data with National Forest Resources Inventory (NFRI) and topographic data, resulting in a 10 m resolution multimodal geospatial dataset for subtropical forests in southeast China. Spectral and texture features, red-edge bands, and vegetation indices of the S1 and S2 data were computed. A hierarchical model obtained information on forest distribution and area and on the dominant woody plant species. The results suggest that combining S1 winter data with S2 yearly data enhances accuracy in forest distribution and area extraction compared to using either data source independently. Similarly, for dominant woody species recognition, using S1 winter data and S2 data across all four seasons was accurate. Including terrain factors and removing spatial correlation from NFRI sample points further improved the recognition accuracy. The optimal forest extraction achieved an overall accuracy (OA) of 97.4% and a map-level image classification efficacy (MICE) of 96.7%. OA and MICE were 83.6% and 80.7%, respectively, for dominant species extraction. The high accuracy and efficacy values indicate that the hierarchical recognition model based on multimodal remote sensing data performed extremely well for extracting information about dominant woody plant species. Visualizing the results using the GEE application allows for an intuitive display of forest and species distribution, offering significant convenience for forest resource monitoring.
Abstract: This paper was motivated by existing problems of cloud data storage at Imo State University, Nigeria, such as outsourced data leading to loss of data and misuse of customer information by unauthorized users or hackers, thereby leaving customer/client data visible and unprotected. This also exposed clients/customers to enormous risk due to defective equipment, bugs, faulty servers, and specious actions. The aim of this paper is therefore to analyze a secure model using Unicode Transformation Format (UTF) base 64 algorithms for storing data securely in the cloud. The Object-Oriented Hypermedia Analysis and Design Methodology (OOHADM) was adopted. Python was used to develop the security model; role-based access control (RBAC) and multi-factor authentication (MFA) were integrated to enhance security in the information system, which was developed with HTML 5, JavaScript, Cascading Style Sheets (CSS) version 3, and PHP 7. This paper also discusses the following concepts: development of computing in the cloud, characteristics of computing, cloud deployment models, cloud service models, etc. The results showed that the proposed enhanced security model for information systems of a corporate platform handled the menace of multiple authorization and authentication, in that a single login page directs all login requests from the different modules to one Single Sign-On Server (SSOS). This in turn redirects users to their requested resources/modules once authenticated, leveraging geo-location integration for physical location validation. The newly developed system will address the shortcomings of the existing systems and reduce the time and resources incurred while using them.
Abstract: Smart metering has gained considerable attention as a research focus due to its reliability and energy-efficient nature compared to traditional electromechanical metering systems. Existing methods primarily focus on data management rather than on efficiency. Accurate prediction of electricity consumption is crucial for enabling intelligent grid operations, including resource planning and demand-supply balancing. Smart metering solutions offer users the benefits of effectively interpreting their energy utilization and optimizing costs. Motivated by this, this paper presents an Intelligent Energy Utilization Analysis using Smart Metering Data (IUA-SMD) model to determine energy consumption patterns. The proposed IUA-SMD model comprises three major processes: data pre-processing, feature extraction, and classification with parameter optimization. We employ an extreme learning machine (ELM) based classification approach within the IUA-SMD model to derive optimal energy utilization labels. Additionally, we apply the shell game optimization (SGO) algorithm to enhance the classification efficiency of the ELM by optimizing its parameters. The effectiveness of the IUA-SMD model is evaluated using an extensive dataset of smart metering data, and the results are analyzed in terms of accuracy and mean square error (MSE). The proposed model demonstrates superior performance, achieving a maximum accuracy of 65.917% and a minimum MSE of 0.096. These results highlight the potential of the IUA-SMD model for enabling efficient energy utilization through intelligent analysis of smart metering data.
Funding: supported by the National Natural Science Foundation of China (No. 42071057).
Abstract: The Qilian Mountains, a national key ecological function zone in Western China, play a pivotal role in ecosystem services. However, the distribution of its dominant tree species, Picea crassifolia (Qinghai spruce), has decreased dramatically in the past decades due to climate change and human activity, which may have affected its ecological functions. Reasonable reforestation is the key measure for restoring these functions. Many previous efforts have predicted the potential distribution of Picea crassifolia, providing guidance on regional reforestation policy. However, all of them were performed at low spatial resolution, thus ignoring the naturally patchy distribution of Picea crassifolia. Here, we modeled the distribution of Picea crassifolia with species distribution models at high spatial resolutions. For many models, the area under the receiver operating characteristic curve (AUC) is larger than 0.9, suggesting excellent precision. The AUC of models at 30 m is higher than that of models at 90 m, and the current potential distribution of Picea crassifolia is more closely aligned with its actual distribution at 30 m, demonstrating that finer data resolution improves model performance. In addition, for models at 90 m resolution, annual precipitation (Bio12) had the greatest influence on the distribution of Picea crassifolia, while aspect became the most important factor at 30 m, indicating the crucial role of finer topographic data in modeling species with patchy distributions. The current distribution of Picea crassifolia is concentrated in the northern and central parts of the study area, and this pattern will be maintained under future scenarios, although some habitat loss in the central parts and gain in the eastern regions is expected owing to increasing temperatures and precipitation.
Our findings can guide protective and restoration strategies for the Qilian Mountains, which would benefit regional ecological balance.
Abstract: We estimate tree heights using polarimetric interferometric synthetic aperture radar (PolInSAR) data constructed from dual-polarization (dual-pol) SAR data and the random volume over the ground (RVoG) model. Considering the Sentinel-1 SAR dual-pol configuration (S_(VV), vertically transmitted and vertically received, and S_(VH), vertically transmitted and horizontally received), one notes that S_(HH), the horizontally transmitted and horizontally received scattering element, is unavailable. The S_(HH) data were constructed using the S_(VH) data, and polarimetric SAR (PolSAR) data were obtained. The proposed approach was first verified in simulation with satisfactory results. It was next applied to construct PolInSAR data from a pair of dual-pol Sentinel-1A acquisitions at Duke Forest, North Carolina, USA. According to local observations and forest descriptions, the range of estimated tree heights was reasonable overall. Comparing the heights with the ICESat-2 tree heights at 23 sampling locations, relative errors of 5 points were within ±30%. Errors of 8 points ranged from 30% to 40%, but errors of the remaining 10 points were >40%. The results are encouraging because error reduction is possible. For instance, the construction of PolSAR data should not be limited to using S_(VH), and a combination of S_(VH) and S_(VV) should be explored. Also, an ensemble of tree heights derived from multiple PolInSAR datasets can be considered, since tree heights do not vary much over a time frame of months or one season.
Abstract: Compositional data, which carry relative information, are a crucial aspect of machine learning and other related fields. They are typically recorded as closed data, that is, data that sum to a constant such as 100%. The statistical linear model is the most widely used technique for identifying hidden relationships between underlying random variables of interest. However, data quality is a significant challenge in machine learning, especially when missing data are present. The linear regression model is a commonly used statistical modeling technique applied in various settings to find relationships between variables of interest. When estimating linear regression parameters, which are useful for tasks such as future prediction and partial effects analysis of independent variables, maximum likelihood estimation (MLE) is the method of choice. However, many datasets contain missing observations, which can lead to costly and time-consuming data recovery. To address this issue, the expectation-maximization (EM) algorithm has been suggested as a solution for situations involving missing data. The EM algorithm iteratively finds maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models that depend on unobserved variables or data. Using the current estimate as input, the expectation (E) step constructs the expected log-likelihood function. The maximization (M) step then finds the parameters that maximize the expected log-likelihood determined in the E step. This study examined how well the EM algorithm performed on a simulated compositional dataset with missing observations, using both robust least squares and ordinary least squares regression techniques. The efficacy of the EM algorithm was compared with two alternative imputation techniques, k-Nearest Neighbor (k-NN) and mean imputation, in terms of Aitchison distances and covariance.
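The Aitchison distance used as a comparison metric in this abstract can be computed as the Euclidean distance between centered log-ratio (clr) transforms of two compositions. The sketch below illustrates that definition only; the function names and sample compositions are illustrative assumptions, and all parts must be strictly positive.

```python
import math

def clr(composition):
    """Centered log-ratio transform; requires strictly positive parts."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def aitchison_distance(x, y):
    """Euclidean distance between the clr transforms of two compositions."""
    cx, cy = clr(x), clr(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cx, cy)))

# Hypothetical three-part compositions: an observed row and an imputed row.
observed = [0.2, 0.3, 0.5]
imputed = [0.1, 0.6, 0.3]
distance = aitchison_distance(observed, imputed)
print(distance)
```

Because the clr transform removes any common scale factor, the distance is the same whether the parts are expressed as proportions or as percentages, which suits closed data.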
Abstract: This study proposes the use of the MERISE conceptual data model to create indicators for monitoring and evaluating the effectiveness of vocational training in the Republic of Congo. The importance of MERISE for structuring and analyzing data is underlined, as it enables measurement of how well training matches the needs of the labor market. The innovation of the study lies in the adaptation of the MERISE model to the local context, the development of innovative indicators, and the integration of a participatory approach including all relevant stakeholders. Contextual adaptation and local innovation: The study suggests adapting MERISE to the specific context of the Republic of Congo, considering the local particularities of the labor market. Development of innovative indicators and new measurement tools: It proposes creating indicators to assess skills matching and employer satisfaction, which are crucial for evaluating the effectiveness of vocational training. Participatory approach and inclusion of stakeholders: The study emphasizes actively involving training centers, employers, and recruitment agencies in the evaluation process. This participatory approach ensures that the perspectives of all stakeholders are considered, leading to more relevant and practical outcomes. Using the MERISE model allows for: • Rigorous data structuring, organization, and standardization: Clearly defining entities and relationships facilitates data organization and standardization, crucial for effective data analysis. • Facilitation of monitoring, analysis, and relevant indicators: Developing both quantitative and qualitative indicators helps measure the effectiveness of training in relation to the labor market, allowing for a comprehensive evaluation. • Improved communication and common language: By providing a common language for different stakeholders, MERISE enhances communication and collaboration, ensuring that all parties have a shared understanding.
The study’s approach and contribution to existing research lie in: • Structured theoretical and practical framework and holistic approach: The study offers a structured framework for data collection and analysis, covering both quantitative and qualitative aspects, thus providing a comprehensive view of the training system. • Reproducible methodology and international comparison: The proposed methodology can be replicated in other contexts, facilitating international comparison and the adoption of best practices. • Extension of knowledge and new perspective: By integrating a participatory approach and developing indicators adapted to local needs, the study extends existing research and offers new perspectives on vocational training evaluation.
Abstract: In this paper, a new multimedia data model, the object-relation hypermedia data model (O-RHDM), is proposed and designed by extending and integrating the non-first-normal-form (NF2) multimedia data model. Its principle, mathematical description, algebraic operations, organization method, and storage model are also discussed. An application example in multimedia spatial data management is given, based on the Hainan multimedia touring information system.
Funding: supported by the National Natural Science Foundation of China (Grant Nos. 42025404, 42188101, and 42241143), the National Key R&D Program of China (Grant Nos. 2022YFF0503700 and 2022YFF0503900), the B-type Strategic Priority Program of the Chinese Academy of Sciences (Grant No. XDB41000000), and the Fundamental Research Funds for the Central Universities (Grant No. 2042022kf1012).
Abstract: Because radiation belt electrons can pose a potential threat to the safety of satellites orbiting in space, it is of great importance to develop a reliable model that can predict the highly dynamic variations in outer radiation belt electron fluxes. In the present study, we develop a forecast model of radiation belt electron fluxes based on the data assimilation method, using Van Allen Probe measurements combined with three-dimensional radiation belt numerical simulations. Our forecast model can cover the entire outer radiation belt with a high temporal resolution (1 hour) and a spatial resolution of 0.25 L over a wide range of both electron energy (0.1-5.0 MeV) and pitch angle (5°-90°). On the basis of this model, we forecast hourly electron fluxes for the next 1, 2, and 3 days during an intense geomagnetic storm and evaluate the corresponding prediction performance. Our model can reasonably predict the storm-time evolution of radiation belt electrons with high prediction efficiency (up to ~0.8-1). The best prediction performance is found for ~0.3-3 MeV electrons at L = ~3.25-4.5, which extends to higher L and lower energies with increasing pitch angle. Our results demonstrate that the forecast model developed can be a powerful tool to predict the spatiotemporal changes in outer radiation belt electron fluxes, and the model has both scientific significance and practical implications.
Abstract: A flood is a significant damaging natural calamity that causes loss of life and property. Earlier work on the construction of flood prediction models was intended to reduce risks, suggest policies, reduce mortality, and limit property damage caused by floods. The massive amount of data generated by social media platforms such as Twitter opens the door to flood analysis. Because of the real-time nature of Twitter data, some government agencies and authorities have used it to track natural catastrophe events in order to build a more rapid rescue strategy. However, due to the brevity of tweets, it is difficult to construct a perfect prediction model for detecting floods. Machine learning (ML) and deep learning (DL) approaches can be used to statistically develop flood prediction models. At the same time, the vast number of tweets necessitates the use of a big data analytics (BDA) tool for flood prediction. In this regard, this work provides an optimal deep learning-based flood forecasting model with big data analytics (ODLFF-BDA) based on Twitter data. The suggested ODLFF-BDA technique intends to anticipate the existence of floods using tweets in a big data setting. The ODLFF-BDA technique comprises data pre-processing to convert the input tweets into a usable format. In addition, a Bidirectional Encoder Representations from Transformers (BERT) model is used to generate emotive contextual embeddings from tweets. Furthermore, a gated recurrent unit (GRU) with a Multilayer Convolutional Neural Network (MLCNN) is used to extract local data and predict the flood. Finally, an Equilibrium Optimizer (EO) is used to fine-tune the hyper-parameters of the GRU and MLCNN models in order to increase prediction performance. Memory usage is reduced to less than 3.5 MB when compared with the other techniques. The ODLFF-BDA technique's performance was validated using a benchmark Kaggle dataset, and the findings showed that it outperformed other recent approaches significantly.
Funding: supported by the National Natural Science Foundation of China (61703410, 61873175, 62073336, 61873273, 61773386, 61922089).
Abstract: Remaining useful life (RUL) prediction is one of the most crucial elements in prognostics and health management (PHM). To address imperfect prior information, this paper proposes an RUL prediction method based on a nonlinear random coefficient regression (RCR) model that fuses failure time data. Firstly, some notable properties of parameter estimation based on the nonlinear RCR model are given. Based on these properties, the failure time data can reasonably be fused as prior information. Specifically, the fixed parameters are calculated from the field degradation data of the evaluated equipment, and the prior information of the random coefficient is estimated by fusing the failure time data of congeneric equipment. Then, the prior information of the random coefficient is updated online under the Bayesian framework, and the probability density function (PDF) of the RUL is derived, taking the limitation of the failure threshold into account. Finally, two case studies are used for experimental verification. Compared with the traditional Bayesian method, the proposed method can effectively reduce the influence of imperfect prior information and improve the accuracy of RUL prediction.
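The online Bayesian update of a random coefficient mentioned in this abstract can be illustrated with a deliberately simplified sketch. It assumes a linear degradation path x(t) = intercept + theta * t + noise with a normal prior on theta and known noise variance, which gives a closed-form conjugate update; the paper's model is nonlinear, so this is an illustration of the idea rather than the proposed method, and all names and numbers are hypothetical.

```python
def update_random_coefficient(mu0, var0, times, observations, intercept, noise_var):
    """Normal-normal conjugate update for the slope theta of a linear degradation path.

    Prior: theta ~ N(mu0, var0).
    Data:  x_i = intercept + theta * t_i + e_i, with e_i ~ N(0, noise_var).
    Returns the posterior mean and variance of theta.
    """
    precision = 1.0 / var0 + sum(t * t for t in times) / noise_var
    weighted = mu0 / var0 + sum(
        t * (x - intercept) for t, x in zip(times, observations)
    ) / noise_var
    post_var = 1.0 / precision
    post_mean = weighted * post_var
    return post_mean, post_var

# Hypothetical prior (e.g. from failure times of similar units) and field data.
times = [1.0, 2.0, 3.0]
observations = [2.1, 3.9, 6.2]
post_mean, post_var = update_random_coefficient(
    mu0=1.0, var0=1.0, times=times, observations=observations,
    intercept=0.0, noise_var=0.25,
)
print(post_mean, post_var)
```

As more degradation measurements arrive, the posterior moves away from the prior mean toward the slope supported by the data, which is how an imperfect prior loses influence over time.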
Funding: King Saud University funded this work through Researchers Supporting Project Number (RSP-2021/387), King Saud University, Riyadh, Saudi Arabia.
Abstract: The effectiveness of a Business Intelligence (BI) system mainly depends on the quality of the knowledge it produces. The decision-making process is hindered, and the user's trust is lost, if the knowledge offered is undesired or of poor quality. A Data Warehouse (DW) is a huge collection of data gathered from many sources and an important part of any BI solution for assisting management in making better decisions. The Extract, Transform, and Load (ETL) process is the backbone of a DW system, and it is responsible for moving data from source systems into the DW system. The more mature the ETL process, the more reliable the DW system. In this paper, we propose the ETL Maturity Model (EMM), which assists organizations in achieving a high-quality ETL system and thereby enhancing the quality of the knowledge produced. The EMM is made up of five levels of maturity, i.e., Chaotic, Acceptable, Stable, Efficient, and Reliable. Each level of maturity contains Key Process Areas (KPAs) that have been endorsed by industry experts and include all critical features of a good ETL system. Quality Objectives (QOs) are defined procedures that, when implemented, result in a high-quality ETL process. Each KPA has its own set of QOs, the execution of which meets the requirements of that KPA. Multiple brainstorming sessions with relevant industry experts helped to enhance the model. The EMM was deployed in two key projects, using multiple case studies to supplement the validation process and support our claim. This model can assist organizations in improving their current ETL process and transforming it into a more mature ETL system. This model can also provide high-quality information to assist users in making better decisions and to gain their trust.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. U2240221 and 41977229) and the Sichuan Youth Science and Technology Innovation Research Team Project (Grant No. 2020JDTD0006).
Abstract: Non-contact remote sensing techniques, such as terrestrial laser scanning (TLS) and unmanned aerial vehicle (UAV) photogrammetry, have been applied globally for landslide monitoring in high and steep mountainous areas. These techniques acquire terrain data and enable ground deformation monitoring. However, their practical application still faces many difficulties due to complex terrain, limited access, and dense vegetation. For instance, on high and steep slopes the TLS sightline can be obstructed, and the accuracy of the UAV model may be compromised by the absence of ground control points (GCPs). This paper proposes a TLS- and UAV-based method for monitoring landslide deformation in high mountain valleys that uses traditional real-time kinematics (RTK)-based control points (RCPs), low-precision TLS-based control points (TCPs), and assumed control points (ACPs) to achieve high-precision surface deformation analysis under obstructed vision and impassable conditions. The effects of GCP accuracy, GCP quantity, and automatic tie point (ATP) quantity on the accuracy of UAV modeling and surface deformation analysis were comprehensively analyzed. The results show that, by adding low-accuracy GCPs, the proposed method allows the monitoring accuracy of landslides to exceed the accuracy of the GCPs themselves. The proposed method was implemented for monitoring the Xinhua landslide in Baoxing County, China, and was validated against data from multiple sources.
Funding: Supported in part by the National Natural Science Foundation of China (NSFC) (92167106, 61833014) and the Key Research and Development Program of Zhejiang Province (2022C01206).
Abstract: The curse of dimensionality refers to the problem of increased sparsity and computational complexity when dealing with high-dimensional data. In recent years, the types and variables of industrial data have increased significantly, making data-driven models more challenging to develop. To address this problem, data augmentation technology has been introduced as an effective tool to solve the sparsity problem of high-dimensional industrial data. This paper systematically explores and discusses the necessity, feasibility, and effectiveness of augmented industrial data-driven modeling in the context of the curse of dimensionality and virtual big data. Then, the process of data augmentation modeling is analyzed, and the concept of data boosting augmentation is proposed. Data boosting augmentation involves designing the reliability weight and actual-virtual weight functions, and developing a double weighted partial least squares model to optimize the three stages of data generation, data fusion, and modeling. This approach significantly improves the interpretability, effectiveness, and practicality of data augmentation in industrial modeling. Finally, the proposed method is verified using practical examples of fault diagnosis systems and virtual measurement systems in industry. The results demonstrate the effectiveness of the proposed approach in improving the accuracy and robustness of data-driven models, making them more suitable for real-world industrial applications.
Abstract: Climate change and global warming result in natural hazards, including flash floods. Flash floods can create blue spots: areas where transport networks (roads, tunnels, bridges, passageways) and other engineering structures are at risk of flooding. Assessments of the economic and social impact of flooding show that the damage caused by flash floods in blue spots is very high, both in dollar amount and in direct impacts on people's lives. The impact of flooding within blue spots is either infrastructural or social, affecting lives and properties. Currently, more than 16.1 million properties in the U.S. are vulnerable to flooding, and this number is projected to increase by 3.2% within the next 30 years. Several models have been developed for flood risk analysis and management, including hydrological models, algorithms, and machine learning and geospatial models. The models and methods reviewed are based on location data collection, statistical analysis and computation, and visualization (mapping). This research aims to create a blue spot model for the State of Tennessee using the ArcGIS visual programming language (model) and a data analytics pipeline.