Big Data is reforming many industrial domains by providing decision support through analyzing large data volumes. Big Data testing aims to ensure that Big Data systems run smoothly and error-free while maintaining performance and data quality. However, because of the diversity and complexity of the data, testing Big Data is challenging. Though numerous research efforts deal with Big Data testing, a comprehensive review addressing the testing techniques and challenges of Big Data is not yet available. Therefore, we have systematically reviewed the evidence on Big Data testing techniques published in the period 2010–2021. This paper discusses the testing of data processing by highlighting the techniques used in every processing phase. Furthermore, we discuss the challenges and future directions. Our findings show that diverse functional, non-functional, and combined (functional and non-functional) testing techniques have been used to solve specific problems related to Big Data. At the same time, most of the testing challenges are faced during the MapReduce validation phase. In addition, combinatorial testing is one of the most frequently applied techniques, often in combination with other techniques (i.e., random testing, mutation testing, input space partitioning, and equivalence testing), to find various functional faults in Big Data testing.
Data obtained from accelerated life testing (ALT) when there are two or more failure modes, commonly referred to as competing failure modes, are often incomplete. The incompleteness is mainly due to censoring, as well as masking, where the failure time is observed but its corresponding failure mode is not identified, because identifying the failure mode may be expensive or very difficult owing to a lack of appropriate diagnostics. A method is proposed for analyzing incomplete data of constant-stress ALT with competing failure modes. It is assumed that the failure modes have s-independent latent lifetimes and that the log lifetime of each failure mode can be written as a linear function of stress. The parameters of the model are estimated by using the expectation-maximization (EM) algorithm with incomplete data. Simulation studies are performed to check model validity and investigate the properties of the estimates. For further validation, the method is also illustrated by an example, which shows the process of analyzing incomplete data from an ALT of an insulation system. Because it accounts for the incompleteness of the data in modeling and makes use of the EM algorithm in estimation, the method is flexible in ALT analysis.
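As a concrete illustration of the estimation scheme, the sketch below runs EM for two competing exponential failure modes with masked causes and right censoring. It is a deliberately simplified stand-in for the paper's model: the log-linear stress dependence is dropped, constant failure rates are assumed, and the data in the usage line are hypothetical.

```python
import numpy as np

def em_competing_exponential(times, modes, n_iter=200):
    """EM for two s-independent competing exponential failure modes.

    times : observed failure or censoring times
    modes : 1 or 2 when the failure mode is identified, 0 when the
            failure is observed but its mode is masked, -1 when the
            unit is right-censored
    """
    times = np.asarray(times, float)
    modes = np.asarray(modes)
    lam = np.array([1.0 / times.mean()] * 2)   # crude starting rates
    total_time = times.sum()                   # total time on test
    for _ in range(n_iter):
        # E-step: expected number of failures attributable to each mode.
        n = np.array([np.sum(modes == 1), np.sum(modes == 2)], float)
        n_masked = np.sum(modes == 0)
        w1 = lam[0] / lam.sum()                # P(mode 1 | masked failure)
        n += n_masked * np.array([w1, 1 - w1])
        # M-step: complete-data MLE, rate = expected failures / exposure.
        lam = n / total_time
    return lam

print(em_competing_exponential([2.1, 0.7, 3.0, 1.2, 4.5], [1, 2, 0, -1, 0]))
```

The E-step attributes each masked failure to a mode in proportion to the current rate estimates; the M-step then reduces to the complete-data maximum likelihood formula.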
In this paper, two tests for varying dispersion of binomial data are discussed in the framework of nonlinear logistic models with random effects, which are widely used in analyzing longitudinal binomial data. The first is an individual test, with power calculation, for varying dispersion through testing the randomness of cluster effects, which extends Dean (1992) and Commenges et al. (1994). The second is a composite test for varying dispersion through simultaneously testing the randomness of cluster effects and the equality of random-effect means. The score test statistics are constructed and expressed in simple, easy-to-use matrix formulas. The authors illustrate their test methods using the insecticide data (Giltinan, Capizzi & Malani (1988)).
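For intuition about what a score test for varying dispersion computes, the sketch below implements Tarone's classical Z statistic, a score test for binomial overdispersion in the simpler fixed-effects setting; the paper's statistics for random-effect nonlinear logistic models are more elaborate but follow the same score-test logic.

```python
import numpy as np
from scipy.stats import norm

def tarone_z(successes, trials):
    """Tarone's score test for overdispersion relative to the binomial.

    Returns the Z statistic and its one-sided p-value; large Z means
    more cluster-to-cluster variation than the binomial allows.
    """
    x = np.asarray(successes, float)
    m = np.asarray(trials, float)
    p = x.sum() / m.sum()                        # pooled success probability
    s = np.sum((x - m * p) ** 2) / (p * (1 - p))
    z = (s - m.sum()) / np.sqrt(2 * np.sum(m * (m - 1)))
    return z, norm.sf(z)

# Hypothetical clustered binomial data: successes out of trials per cluster.
print(tarone_z([3, 9, 2, 8, 1, 7], [10, 10, 10, 10, 10, 10]))
```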
Many search-based algorithms have been successfully applied in several software engineering activities. Genetic algorithms (GAs) are the most used by scholars in the scientific domains to solve software testing problems. They imitate the theory of natural selection and evolution. The harmony search algorithm (HSA) is one of the most recent search algorithms; it imitates the behavior of a musician finding the best harmony. Scholars have estimated the similarities and differences between genetic algorithms and the harmony search algorithm in diverse research domains. The test data generation process represents a critical task in software validation. Unfortunately, no prior work compares the performance of genetic algorithms and the harmony search algorithm in the test data generation process. This paper studies the similarities and differences between genetic algorithms and the harmony search algorithm based on the ability and speed of finding the required test data. The current research performs an empirical comparison of the HSA and the GAs, and the significance of the results is then estimated using the t-Test. The study investigates the efficiency of the harmony search algorithm and the genetic algorithms according to (1) the time performance, (2) the significance of the generated test data, and (3) the adequacy of the generated test data to satisfy a given testing criterion. The results showed that the harmony search algorithm is significantly faster than the genetic algorithms, because the t-Test showed that the p-value of the time values is 0.026 < α (where α is the significance level, 0.05, at the 95% confidence level). In contrast, there is no significant difference between the two algorithms in generating adequate test data, because the t-Test showed that the p-value of the fitness values is 0.25 > α.
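The significance check described here is a standard two-sample t-Test. A minimal sketch follows, with hypothetical runtimes standing in for the measured values (the paper itself reports p = 0.026 for time and p = 0.25 for fitness); Welch's variant is used, which does not assume equal variances.

```python
import numpy as np
from scipy.stats import ttest_ind

# Hypothetical runtimes (seconds) over repeated runs of each algorithm.
ga_time = np.array([4.1, 3.8, 4.5, 4.0, 4.3, 3.9, 4.2, 4.4])
hsa_time = np.array([3.2, 3.0, 3.5, 3.1, 3.4, 2.9, 3.3, 3.2])

t_stat, p_value = ttest_ind(hsa_time, ga_time, equal_var=False)  # Welch's t-test
alpha = 0.05                                   # 95% confidence level
print(f"t = {t_stat:.2f}, p = {p_value:.4f}, significant: {p_value < alpha}")
```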
Many fields, such as neuroscience, are experiencing the vast proliferation of cellular data, underscoring the need for organizing and interpreting large datasets. A popular approach partitions data into manageable subsets via hierarchical clustering, but objective methods to determine the appropriate classification granularity are missing. We recently introduced a technique to systematically identify when to stop subdividing clusters based on the fundamental principle that cells must differ more between than within clusters. Here we present the corresponding protocol to classify cellular datasets by combining data-driven unsupervised hierarchical clustering with statistical testing. These general-purpose functions are applicable to any cellular dataset that can be organized as two-dimensional matrices of numerical values, including molecular, physiological, and anatomical datasets. We demonstrate the protocol using cellular data from the Janelia MouseLight project to characterize morphological aspects of neurons.
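A minimal sketch of the stopping principle is given below: a candidate two-way split is accepted only if between-cluster distances statistically exceed within-cluster distances. The Ward linkage, Euclidean metric, and Mann-Whitney test here are illustrative choices, not necessarily those of the published protocol.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist, squareform
from scipy.stats import mannwhitneyu

def split_is_justified(X, alpha=0.05):
    """Accept a 2-way split only if cells differ more between clusters
    than within them (one-sided Mann-Whitney on pairwise distances)."""
    d = squareform(pdist(X))                       # pairwise distance matrix
    labels = fcluster(linkage(X, method="ward"), t=2, criterion="maxclust")
    same = labels[:, None] == labels[None, :]      # same-cluster mask
    iu = np.triu_indices_from(d, k=1)              # each pair counted once
    within, between = d[iu][same[iu]], d[iu][~same[iu]]
    _, p = mannwhitneyu(between, within, alternative="greater")
    return p < alpha, p

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 5)), rng.normal(4, 1, (30, 5))])
print(split_is_justified(X))                       # (True, tiny p-value)
```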
In Brazil and various regions globally, the initiation of landslides is frequently associated with rainfall; yet the spatial arrangement of geological structures and stratification considerably influences landslide occurrences. The multifaceted nature of these influences makes the surveillance of mass movements a highly intricate task, requiring an understanding of numerous interdependent variables. Recent years have seen a surge in scholarly research aimed at integrating geophysical and geotechnical methodologies. The joint examination of geophysical and geotechnical data offers an enhanced perspective on subsurface structures. Within this work, a methodology is proposed for the synchronous analysis of electrical resistivity geophysical data and geotechnical data, specifically those extracted from the Light Dynamic Penetrometer (DPL) and Standard Penetration Test (SPT). This study involved a linear fitting process to correlate resistivity with N10/SPT N-values from DPL/SPT soundings, culminating in a 2D profile of N10/SPT N-values predicated on electrical profiles. The findings of this research furnish invaluable insights into slope stability by allowing for a two-dimensional representation of penetration resistance properties. Through the synthesis of geophysical and geotechnical data, this project aims to augment the comprehension of subsurface conditions, with potential implications for refining landslide risk evaluations. This endeavor offers insight into the formulation of more effective and precise slope management protocols and disaster prevention strategies.
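The correlation step is, at its core, an ordinary least-squares line relating resistivity to blow counts; a sketch with hypothetical co-located measurements follows.

```python
import numpy as np

# Hypothetical co-located measurements: resistivity (ohm·m) from the
# electrical profile, N10 blow counts from adjacent DPL soundings.
rho = np.array([120., 180., 250., 310., 400., 520.])
n10 = np.array([4., 6., 9., 11., 15., 19.])

slope, intercept = np.polyfit(rho, n10, 1)       # least-squares fit
pred = slope * rho + intercept
r2 = 1 - np.sum((n10 - pred) ** 2) / np.sum((n10 - n10.mean()) ** 2)
print(f"N10 ≈ {slope:.4f}·rho + {intercept:.2f}  (R² = {r2:.3f})")
```

Applying the fitted line to every cell of a 2D resistivity section yields the 2D profile of predicted N10/SPT N-values described in the abstract.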
Seeing is an important index to evaluate the quality of an astronomical site. To estimate seeing at the Muztagh-Ata site with height and time quantitatively, the European Centre for Medium-Range Weather Forecasts reanalysis database (ERA5) is used. Seeing calculated from ERA5 agrees well with the Differential Image Motion Monitor seeing at the height of 12 m. Results show that seeing decays exponentially with height at the Muztagh-Ata site. Seeing decays with height fastest in fall 2021 and most slowly in summer. The seeing condition is better in fall than in summer. The median value of seeing at 12 m is 0.89 arcsec; the maximum is 1.21 arcsec in August and the minimum is 0.66 arcsec in October. The median value of seeing at 12 m is 0.72 arcsec in the nighttime and 1.08 arcsec in the daytime. Seeing is a combination of annual and roughly biannual variations with the same phase as temperature and wind speed, indicating that the variation of seeing with time is influenced by temperature and wind speed. The Richardson number Ri is used to analyze the atmospheric stability, and the variations of seeing are consistent with Ri between layers. These quantitative results can provide an important reference for a telescopic observation strategy.
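The stability analysis rests on the gradient Richardson number. A sketch of its standard definition, applied to vertical profiles such as ERA5 model levels, is given below; the profile values are hypothetical.

```python
import numpy as np

def richardson_number(z, theta, u, v, g=9.81):
    """Gradient Richardson number:
    Ri = (g/theta)·(dtheta/dz) / ((du/dz)² + (dv/dz)²)."""
    dtheta = np.gradient(theta, z)                 # K per m
    shear2 = np.gradient(u, z) ** 2 + np.gradient(v, z) ** 2
    return (g / theta) * dtheta / np.maximum(shear2, 1e-12)

z = np.array([10., 50., 100., 200., 400.])             # height (m)
theta = np.array([300.0, 300.5, 301.2, 302.0, 303.5])  # potential temp. (K)
u = np.array([2.0, 4.0, 6.0, 8.0, 10.0])               # zonal wind (m/s)
v = np.zeros_like(u)                                   # meridional wind (m/s)
print(richardson_number(z, theta, u, v))  # Ri < 0.25 suggests turbulent layers
```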
Cable-stayed bridges have been widely used in high-speed railway infrastructure. The accurate determination of the cables' representative temperatures is vital during the intricate processes of design, construction, and maintenance of cable-stayed bridges. However, the representative temperatures of stay cables are not specified in the existing design codes. To address this issue, this study investigates the distribution of the cable temperature and determines its representative temperature. First, an experimental investigation, spanning a period of one year, was carried out near the bridge site to obtain temperature data. Statistical analysis of the measured data reveals that the temperature distribution is generally uniform along the cable cross-section, without a significant temperature gradient. Then, based on the limited data, the Monte Carlo, gradient boosted regression trees (GBRT), and univariate linear regression (ULR) methods are employed to predict the cable's representative temperature throughout the service life. These methods effectively overcome the limitations of insufficient monitoring data and accurately predict the representative temperature of the cables. However, each method has its own advantages and limitations in terms of applicability and accuracy. A comprehensive evaluation of the performance of these methods is conducted, and practical recommendations are provided for their application. The proposed methods and representative temperatures provide a good basis for the operation and maintenance of in-service long-span cable-stayed bridges.
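Of the three prediction methods named, GBRT is the most involved. The sketch below fits a gradient boosted regression tree on synthetic stand-in data, with air temperature and solar radiation as hypothetical predictors of cable temperature, since the monitoring data themselves are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
# Synthetic stand-ins: columns are air temperature (°C) and solar
# radiation (W/m²); the target is the cable's measured temperature.
X = rng.uniform([-10.0, 0.0], [40.0, 1000.0], size=(500, 2))
y = 0.9 * X[:, 0] + 0.01 * X[:, 1] + rng.normal(0.0, 1.0, 500)

model = GradientBoostingRegressor(n_estimators=200, max_depth=3)
model.fit(X[:400], y[:400])                       # train on the first 400
print("held-out R²:", model.score(X[400:], y[400:]))
```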
Testing is an integral part of software development. Current fast-paced system development has rendered traditional testing techniques obsolete. Therefore, automated testing techniques are needed to keep up with the speed of such system development. Model-based testing (MBT) is a technique that uses system models to generate and execute test cases automatically. It was identified that the test data generation (TDG) in many existing model-based test case generation (MB-TCG) approaches was still manual. An automatic and effective TDG can further reduce testing cost while detecting more faults. This study proposes an automated TDG approach in MB-TCG using the extended finite state machine (EFSM) model. The proposed approach integrates MBT with combinatorial testing. The information available in an EFSM model and the boundary value analysis strategy are used to automate the domain input classifications, which were done manually in the existing approach. The results showed that the proposed approach was able to detect 6.62 percent more faults than the conventional MB-TCG, but at the same time generated 43 more tests. The proposed approach effectively detects faults, but further treatment of the generated tests, such as test case prioritization, should be done to increase the effectiveness and efficiency of testing.
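Boundary value analysis is the part of this pipeline that is easiest to show in isolation: from an input domain extracted from an EFSM transition guard, it selects the bounds, their neighbours inside the domain, and one invalid value on each side. The guard in the sketch is hypothetical.

```python
def boundary_values(lo, hi):
    """Boundary value analysis for an integer domain [lo, hi]: the two
    bounds, their interior neighbours, a nominal value, and one invalid
    value just outside each bound."""
    return sorted({lo - 1, lo, lo + 1, (lo + hi) // 2, hi - 1, hi, hi + 1})

# Hypothetical guard on an EFSM transition: 1 <= attempts <= 3
print(boundary_values(1, 3))        # [0, 1, 2, 3, 4]
```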
Introduction: The present work compared the prediction power of different data mining techniques used to develop an HIV testing prediction model. Four popular data mining algorithms (decision tree, Naive Bayes, neural network, logistic regression) were used to build a model that predicts whether an individual had been tested for HIV among adults in Ethiopia, using EDHS 2011. The final experimentation results indicated that the decision tree (random tree algorithm) performed best with an accuracy of 96%; the decision tree induction method (J48) came out to be the second best with a classification accuracy of 79%, followed by the neural network (78%). Logistic regression achieved the lowest classification accuracy, 74%. Objectives: The objective of this study is to compare the prediction power of the different data mining techniques used to develop the HIV testing prediction model. Methods: The Cross-Industry Standard Process for Data Mining (CRISP-DM) was used to build the model for HIV testing and explore association rules between HIV testing and the selected attributes. Data preprocessing was performed, and missing values for categorical variables were replaced by the modal value of the variable. Different data mining techniques were used to build the predictive model. Results: The target dataset contained 30,625 study participants, of whom 16,515 (54%) were women and 14,110 (46%) were men. The age of the participants ranged from 15 to 59 years, with a modal age group of 15-19 years. Among the study participants, 17,719 (58%) had never been tested for HIV, while the remaining 12,906 (42%) had been tested. Residence, educational level, wealth index, HIV-related stigma, HIV-related knowledge, region, age group, risky sexual behaviour attributes, knowledge about where to test for HIV, and knowledge of family planning through mass media were found to be predictors of HIV testing. Conclusion and Recommendation: The results obtained from this research reveal that data mining is crucial in extracting relevant information for the effective utilization of HIV testing services, which has clinical, community, and public health importance at all levels. It is vital to apply different data mining techniques to the same settings and compare the model performances (based on accuracy, sensitivity, and specificity) with each other. Furthermore, this study invites interested researchers to explore further applications of data mining techniques in the healthcare industry or in related settings in the future.
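A minimal version of the decision tree experiment looks like the sketch below; the feature encoding and labels are synthetic stand-ins for the EDHS 2011 records, which cannot be reproduced here.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)
# Synthetic stand-ins for encoded attributes such as residence,
# education level, wealth index, and age group; target: ever tested.
X = rng.integers(0, 5, size=(1000, 4))
y = (X[:, 1] + X[:, 2] + rng.integers(0, 3, 1000) > 4).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = DecisionTreeClassifier(max_depth=5).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```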
In order to realize visualization of a three-dimensional data field (TDDF) in an instrument, two methods of TDDF visualization and the usual manner of quick graphic and image processing are analyzed. How to use the OpenGL technique and the characteristics of the analyzed data to construct a TDDF, along with the ways of realistic processing and interactive processing, are described. Then the medium geometric element and a related realistic model are constructed by means of the first algorithm. Models obtained for attaching the third dimension in a three-dimensional data field are presented. An example of TDDF realization for machine measuring is provided. The analysis of the resulting graphics indicates that the three-dimensional graphics built by the developed method feature good realism, fast processing, and strong interactivity.
In general, simple subsystems like series or parallel are integrated to produce a complex hybrid system. The reliability of a system is determined by the reliability of its constituent components. It is often extremely difficult or impossible to get specific information about the component that caused the system to fail. Unknown failure causes are instances in which the actual cause of system failure is unknown. On the other side, thanks to current advanced technology based on computers, automation, and simulation, products have become incredibly dependable and trustworthy; as a result, obtaining failure data for testing such exceptionally reliable items has become a very costly and time-consuming procedure. Therefore, because of its capacity to produce rapid and adequate failure data in a short period of time, accelerated life testing (ALT) is the most utilized approach in the field of product reliability and life testing. Based on progressively hybrid censored (PrHC) data from a three-component parallel-series hybrid system that failed owing to unknown causes, this paper investigates the challenging problem of parameter estimation and reliability assessment under a step-stress partially accelerated life test (SSPALT). Failures of components are considered to follow a power linear hazard rate (PLHR), which can be used when the failure rate displays linear, decreasing, increasing, or bathtub failure patterns. The tempered random variable (TRV) model is considered to reflect the effect of the high stress level used to induce early failure data. The maximum likelihood estimation (MLE) approach is used to estimate the parameters of the PLHR distribution and the acceleration factor. A variance-covariance matrix (VCM) is then obtained to construct the approximate confidence intervals (ACIs). In addition, studentized bootstrap confidence intervals (ST-B CIs) are also constructed and compared with ACIs in terms of their respective interval lengths (ILs). Moreover, a simulation study is conducted to demonstrate the performance of the estimation procedures and the methodology discussed in this paper. Finally, real failure data from the air conditioning systems of an airplane are used to further illustrate the performance of the suggested estimation technique.
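To make the MLE step concrete, the sketch below maximizes a censored-data likelihood for the simplest linear hazard h(t) = a + b·t, a special case of the PLHR family; the step-stress tempering, the progressive hybrid censoring scheme, and the confidence intervals are omitted, and the data are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(params, t, failed):
    """Negative log-likelihood for a linear hazard h(t) = a + b·t with
    right censoring; the cumulative hazard is H(t) = a·t + b·t²/2."""
    a, b = params
    if a <= 0 or b < 0:
        return np.inf                       # keep the hazard positive
    log_h = np.log(a + b * t[failed])       # failures contribute log h(t)
    H = a * t + b * t ** 2 / 2              # every unit contributes -H(t)
    return -(log_h.sum() - H.sum())

rng = np.random.default_rng(2)
t = rng.weibull(1.5, 300)                   # hypothetical lifetimes
failed = rng.random(300) < 0.8              # ~20% right-censored units
res = minimize(neg_loglik, x0=[0.5, 0.5], args=(t, failed),
               method="Nelder-Mead")
print("MLE of (a, b):", res.x)
```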
Background: Human Immunodeficiency Virus Self-Testing (HIVST) is a process where an individual who wants to know their HIV status collects a specimen, performs a test, and interprets the result by themselves. HIVST data from the Zimbabwe AIDS and TB Program (ATP) directorate showed that between 2019 and 2020, only 31% of the target HIVST kits were distributed in the country. Mashonaland West Province was one of the least performing provinces in meeting targets for HIVST kit distribution. Gaps in the implementation of HIVST in the province ultimately affect the nationwide scale-up of targeted testing, a key enabler in achieving HIV epidemic control. We analyzed HIVST trends in Mashonaland West Province to inform HIV testing services programming. Methods: We conducted a cross-sectional study using HIVST secondary data obtained from the District Health Information Software 2 (DHIS2) electronic database. We conducted regression analysis for trends using Epi Info 7.2, and tables, bar graphs, pie charts, and linear graphs were used for data presentation. Results: A total of 31,070 clients accessed HIVST kits in Mashonaland West Province from 2019 to 2020. A slightly higher proportion of females as compared to males accessed HIVST kits in 2019 and 2020 (50.4% and 51.7%, respectively). Overall, an increase in the trend of HIVST kit uptake was recorded (males R² = 0.3945, p-value = 0.003; females R² = 0.4739, p-value = 0.001). There was generally a decline in the trend of community-based distribution of HIVST kits from the third quarter of 2019 throughout 2020 (R² = 0.2441, p-value = 0.006). Primary distribution of HIVST kits remained the dominant method of distribution, constituting more than half of the kits distributed in both 2019 (67%) and 2020 (86%). Conclusion: Mashonaland West Province was mainly utilising the facility-based distribution model for HIVST rather than the community-based distribution model. We recommend training more community-based distribution agents to increase community distribution of HIVST kits.
Test data compression and test resource partitioning (TRP) are essential to reduce the amount of test data in system-on-chip testing. A novel variable-to-variable-length compression code is designed, called advanced frequency-directed run-length (AFDR) codes. Different from frequency-directed run-length (FDR) codes, AFDR encodes both 0-runs and 1-runs and assigns the same codewords to runs of equal length. It also modifies the codes for 00 and 11 to improve the compression performance. Experimental results for ISCAS 89 benchmark circuits show that AFDR codes achieve a higher compression ratio than FDR and other compression codes.
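The run-extraction step underlying such codes is easy to sketch: the bit-stream is split into alternating 0-runs and 1-runs, and each run length is mapped to a variable-length codeword. The Elias gamma code below is purely an illustrative stand-in for the published AFDR codeword table; on short runs it can expand the data, and real gains depend on the run-length distribution of the test set.

```python
def runs(bits):
    """Split a bit-stream into alternating (symbol, run_length) pairs;
    like AFDR, this treats 0-runs and 1-runs symmetrically."""
    out, cur, n = [], bits[0], 1
    for b in bits[1:]:
        if b == cur:
            n += 1
        else:
            out.append((cur, n))
            cur, n = b, 1
    out.append((cur, n))
    return out

def elias_gamma(n):
    """Variable-length prefix code for run lengths (illustrative only)."""
    b = bin(n)[2:]
    return "0" * (len(b) - 1) + b

data = "0001111001100000"
encoded = "".join(elias_gamma(n) for _, n in runs(data))
print(runs(data), "->", encoded, f"({len(encoded)} vs {len(data)} bits)")
```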
This paper presents a new test data compression/decompression method for SoC testing, called hybrid run-length codes. The method makes a full analysis of the factors which influence test parameters: compression ratio, test application time, and area overhead. To improve the compression ratio, the new method is based on variable-to-variable run-length codes, and a novel algorithm is proposed to reorder the test vectors and fill the unspecified bits in the pre-processing step. With a novel on-chip decoder, low test application time and low area overhead are obtained by hybrid run-length codes. Finally, an experimental comparison on ISCAS 89 benchmark circuits validates the proposed method.
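The pre-processing step can be sketched as a greedy procedure: order the vectors by nearest Hamming distance (ignoring don't-cares), then fill each 'X' bit with the corresponding bit of the previous vector so runs grow longer. This is a simplified stand-in for the paper's reordering and filling algorithm.

```python
def hamming(a, b):
    """Hamming distance that skips don't-care ('X') positions."""
    return sum(x != y for x, y in zip(a, b) if "X" not in (x, y))

def reorder_and_fill(vectors):
    remaining = list(vectors)
    ordered = [remaining.pop(0)]
    while remaining:                        # greedy nearest-neighbour order
        nxt = min(remaining, key=lambda v: hamming(ordered[-1], v))
        remaining.remove(nxt)
        ordered.append(nxt)
    filled = [ordered[0].replace("X", "0")]
    for v in ordered[1:]:                   # copy the previous vector's bit
        filled.append("".join(p if c == "X" else c
                              for c, p in zip(v, filled[-1])))
    return filled

print(reorder_and_fill(["01X0", "0110", "X111", "1X00"]))
```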
To analyze the errors in data processing, the testing principle for jet elements is introduced and the properties of the testing system are studied theoretically and experimentally. On this basis, the method of processing the data is presented and the error formulae, which are functions of the testing system's properties, are derived. Finally, methods of reducing the errors are provided. The measured results agree with the theoretical conclusions.
We developed an inversion technique to determine in situ stresses for elliptical boreholes of arbitrary trajectory. In this approach, borehole geometry, drilling-induced fracture information, and other available leak-off test data were used to construct a mathematical model, which was in turn applied to finding the inverse of an overdetermined system of equations. The method has been demonstrated by a case study in the Appalachian Basin, USA. The calculated horizontal stresses are in reasonable agreement with the reported regional stress study of the area, although there are no field measurement data for the studied well for direct calibration. The results also indicate that a 2% axis difference in the elliptical borehole geometry can cause a 5% difference in the minimum horizontal stress calculation and a 10% difference in the maximum horizontal stress calculation.
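Finding the inverse of an overdetermined system amounts to a least-squares solve. The sketch below uses an illustrative (hypothetical) coefficient matrix in place of the model assembled from borehole geometry, fracture orientations, and leak-off pressures.

```python
import numpy as np

# Hypothetical linearized system A·s = b, where s = [S_Hmax, S_hmin]
# are the unknown horizontal stresses and each row encodes one
# fracture/leak-off observation (coefficients illustrative only).
A = np.array([[1.8, -0.6],
              [1.5, -0.4],
              [2.1, -0.9],
              [1.7, -0.5],
              [1.9, -0.7]])
b = np.array([42.0, 38.5, 47.2, 40.8, 44.1])     # MPa

s, residual, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print("least-squares stresses (MPa):", s, "residual:", residual)
```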
This paper presents a substructure online hybrid test system that is extensible for geographically distributed tests. This system consists of a set of devices conventionally used for cyclic tests to load the tested substructures to the target displacement or the target force. Due to their robustness and portability, individual sets of conventional loading devices can be transported and reconfigured to realize physical loading in geographically remote laboratories. Another appealing feature is the flexible displacement-force mixed control, which is particularly suitable for specimens having large disparities in stiffness during various performance stages. To conduct a substructure online hybrid test, an extensible framework is developed, which is equipped with a generalized interface to encapsulate each substructure. Multiple tested substructures and analyzed substructures using various structural program codes can be accommodated within the single framework, simply interfaced with the boundary displacements and forces. A coordinator program is developed to keep the boundaries among all substructures compatible and equilibrated. An Internet-based data exchange scheme is also devised to transfer data among computers equipped with different software environments. A series of online hybrid tests are introduced, and the portability, flexibility, and extensibility of the online hybrid test system are demonstrated.
A new structural damage identification method using limited static test displacements, based on grey system theory, is proposed in this paper. The grey relation coefficient of displacement curvature is defined and used to locate damage in the structure, and an iterative estimation scheme for solving nonlinear optimization programming problems based on the quadratic programming technique is used to identify the damage magnitude. A numerical example of a cantilever beam with single or multiple damages is used to examine the capability of the proposed grey-theory-based method to localize and identify damage. The factors of measurement noise and incomplete test data are also discussed. The numerical results showed that damage in the structure can be localized correctly using the grey relation coefficient of displacement curvature, and the damage magnitude can be identified with a high degree of accuracy, regardless of the number of measured displacement nodes. The proposed method only requires limited static test data, which is easily available in practice, and has wide applications in structural damage detection.
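A minimal sketch of the locating step follows: curvatures are formed from static displacements by central differences, and Deng's grey relation coefficient, ξ(k) = (Δmin + ρΔmax)/(Δ(k) + ρΔmax) with ρ = 0.5, is evaluated between the intact and measured curvature sequences; low coefficients flag candidate damage locations. The displacement values are hypothetical.

```python
import numpy as np

def displacement_curvature(d, h=1.0):
    """Central-difference curvature of a static displacement vector."""
    return (d[:-2] - 2 * d[1:-1] + d[2:]) / h ** 2

def grey_relation_coefficient(ref, comp, rho=0.5):
    """Deng's grey relation coefficient between a reference sequence
    (intact curvature) and a comparison sequence (measured curvature)."""
    delta = np.abs(np.asarray(ref) - np.asarray(comp))
    return (delta.min() + rho * delta.max()) / (delta + rho * delta.max())

# Hypothetical cantilever displacements; a stiffness loss near node 4
# perturbs the measured curve.
intact = np.array([0.0, 1.0, 3.9, 8.5, 14.8, 22.6, 31.8])
damaged = np.array([0.0, 1.0, 3.9, 8.9, 15.6, 23.5, 32.8])
xi = grey_relation_coefficient(displacement_curvature(intact),
                               displacement_curvature(damaged))
print(xi)   # coefficients dip where the curvature change is largest
```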