In this paper, three smoothed empirical log-likelihood ratio functions are proposed for the parameters of nonlinear models with missing responses. Under some regularity conditions, the corresponding Wilks phenomena are established, and confidence regions for the parameters can be constructed easily.
Fear of missing out (FoMO) is a term introduced in 2004 to describe a phenomenon observed on social networking sites. FoMO involves two processes: first, the perception of missing out, followed by compulsive behavior to maintain these social connections. We are interested in understanding the complex construct of FoMO and its relation to the need to belong and to form stable interpersonal relationships. Because it is considered a problematic attachment to social media, FoMO is associated with a range of negative life experiences and feelings. We provide a general review of the literature and summarize the findings in relation to mental health, social functioning, sleep, academic performance and productivity, neurodevelopmental disorders, and physical well-being. We also discuss the treatment options available for FoMO based on cognitive behavior therapy. It is imperative that new findings on FoMO are communicated to the clinical community, as FoMO has diagnostic implications and could be a confounding variable in patients who do not respond to treatment as usual.
Most real-world processes are complex nonlinear systems with incomplete information, and it is difficult to estimate a model under the assumption that the data set is governed by a single global model. Moreover, the data available from real processes usually contain missing values. To overcome the shortcomings of global modeling and missing data values, a new modeling method is proposed. First, an incomplete data set with missing values is partitioned into several clusters by a K-means with soft constraints (KSC) algorithm, which incorporates soft constraints to enable clustering with missing values. Then a local model is developed for each cluster using the support vector regression (SVR) algorithm, which adopts a missing value insensitive (MVI) kernel to address the missing value estimation problem. The valid area of each local model is obtained as well. Simulation results demonstrate the effectiveness of the local models and the estimation algorithm.
The first thunderstorm of the year appeared in southern Shenyang on May 2, 2010. It did not cause a severe lightning disaster in the Shenyang region, but the forecast service performed poorly because the thunderstorm was not forecast accurately. In this paper, the reasons for the missed forecast of this thunderstorm are analyzed, and the thunderstorm potential is analyzed by means of the mesoscale analysis technique, providing technical indices and a reference point for the prediction of thunderstorm potential. The results show that the reasons for the missed forecast of this weather process were as follows: the surface temperature in the preceding period was persistently low, which was unfavorable for the development of convective weather; the ability to interpret and analyze numerical forecast products needs to be improved; the forecast result of the T639 model was better than that of the Japanese numerical forecast; and the study and application of the mesoscale analysis technique should be strengthened. This service was formally launched after the thunderstorm weather on June 1, 2010.
Missing vias are a known defect in semiconductor manufacturing, especially in foundries. Solving the problem is attractive for yield improvement in relatively mature technologies, since each percentage point of improvement means a significant increase in profit margin. However, the root cause of the missing via defect is not easy to determine, because many factors, such as defocus, material re-deposition, and inadequate development, can lead to missing vias. Knowing the exact cause of each defect type is therefore the key. In this paper, we present the analysis methodology used in our company. In the experiments, we observed three types of missing vias. The first type consists of large, usually circular, areas of missing patterns, located primarily near the wafer edge. The second type consists of isolated sites with single partially opened or completely unopened vias. The third type consists of relatively small circular areas within which the entire via pattern is missing. We first optimized the developing recipe and found that the first type can be largely removed by tuning the rinse process, which improves the cleaning efficiency for developing residue. However, this method does not remove missing vias of the second and third types. We found that the second type is related to local defocus caused by topographical distribution. To resolve the third type, we performed extensive experiments with different types of developer nozzles and photomasks, and found no distinct dependence of the defect density on either the nozzle or the mask type. We also studied the defect density of three resists with different resolution capabilities and found a correlation between defect density and resist resolution: in general, lower-resolution resists appear to have lower defect density. The results are presented in the paper.
In this study, we investigate the effects of missing data when estimating HIV/TB co-infection. We revisit the concept of missing data and examine three available approaches for dealing with missingness. The main objective is to identify the best method for correcting missing data in the HIV/TB co-infection setting. We employ both empirical data analysis and an extensive simulation study to examine the effects of missing data on the accuracy, sensitivity, specificity, and training and test error of the different approaches. The novelty of this work lies in the use of modern statistical learning algorithms to treat missingness. In the empirical analysis, imputation was performed on both the HIV data and the HIV/TB co-infection data, and the missing values were imputed using the different approaches. In the simulation study, sets of 0% (complete case), 10%, 30%, 50% and 80% of the data were drawn randomly and replaced with missing values. The results show that the complete-case analysis gave a co-infection rate (95% confidence interval) of 29% (25%, 33%), the weighted method 27% (23%, 31%), the likelihood-based approach 26% (24%, 28%), and the multiple imputation (MI) approach 21% (20%, 22%). In conclusion, MI remains the best approach for dealing with missing data, and failure to apply it results in overestimation of the HIV/TB co-infection rate by 8 percentage points (29% versus 21%).
Funding: Supported by Graduate Funded Project (No. JY2022A017).
Abstract: The frequent missing values in radar-derived time-series tracks of aerial targets (RTT-AT) pose significant challenges for subsequent data-driven tasks. However, most imputation research focuses on random missing (RM) patterns, which differ significantly from the missing patterns common in RTT-AT. Methods designed for RM may suffer performance degradation or fail outright when applied to RTT-AT imputation, and conventional autoregressive deep learning methods are prone to error accumulation and loss of long-term dependencies. In this paper, a non-autoregressive imputation model is proposed that addresses missing value imputation for two common missing patterns in RTT-AT. The model consists of two probabilistic sparse diagonal masking self-attention (PSDMSA) units and a weight fusion unit. It learns the missing values by combining the representations output by the two units, aiming to minimize the difference between the imputed values and their actual values. The PSDMSA units capture temporal dependencies and attribute correlations between time steps, improving imputation quality. The weight fusion unit automatically updates the weights of the output representations from the two units to obtain a more accurate final representation. The experimental results indicate that, across varying missing rates in the two missing patterns, the model consistently outperforms other methods in imputation performance and exhibits a low frequency of deviations in its estimates for specific missing entries. Compared with the state-of-the-art autoregressive deep learning imputation model Bidirectional Recurrent Imputation for Time Series (BRITS), the proposed model reduces mean absolute error (MAE) by 31% to 50%. Additionally, the model trains 4 to 8 times faster than both BRITS and a standard Transformer model on the same dataset. Finally, the ablation experiments demonstrate that the PSDMSA, the weight fusion unit, the cascade network design, and the imputation loss each enhance imputation performance, confirming the efficacy of the design.
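The abstract names diagonal masking self-attention without giving its formulation. The following is a minimal sketch of one common reading, assuming the diagonal mask forbids each time step from attending to itself so that every step is reconstructed only from the other steps. The function name `diagonal_masked_attention` is hypothetical, the probabilistic sparse selection of queries is not shown, and the sequence is assumed to have at least two steps.

```python
import math


def diagonal_masked_attention(queries, keys, values):
    """Scaled dot-product attention in which step i may not attend to itself.

    Illustrative assumption only: the paper's PSDMSA unit is not reproduced here.
    Inputs are lists of equal-length float vectors, one per time step (>= 2 steps).
    """
    n, d = len(queries), len(queries[0])
    outputs = []
    for i in range(n):
        # Scores against every step; the diagonal entry (j == i) is masked out.
        scores = [
            -math.inf if j == i
            else sum(q * k for q, k in zip(queries[i], keys[j])) / math.sqrt(d)
            for j in range(n)
        ]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]  # masked entries become 0.0
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of value vectors, column by column.
        outputs.append([
            sum(w * values[j][c] for j, w in enumerate(weights))
            for c in range(len(values[0]))
        ])
    return outputs


# With only two steps, each step can attend solely to the other one.
track = [[1.0, 2.0], [3.0, 4.0]]
swapped = diagonal_masked_attention(track, track, track)
```

Because a step never sees its own input, the output at a missing position cannot simply copy a placeholder value, which is the usual motivation for this kind of mask in imputation models.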
Funding: Supported by the National Natural Science Foundation of China (No. 61871400) and the Natural Science Foundation of Jiangsu Province of China (No. BK20171401).
Abstract: In wireless sensor networks (WSNs), the performance of related applications depends heavily on the quality of the data collected. Unfortunately, missing data is almost inevitable during data acquisition and transmission. Existing methods often rely on prior information, such as low-rank characteristics or spatiotemporal correlation, when recovering missing WSN data. However, in realistic application scenarios, it is very difficult to obtain this prior information from incomplete data sets. Therefore, we aim to recover missing WSN data effectively without depending on prior information. By designing a measurement matrix that captures the positions of the missing data, together with a sparse representation matrix, a compressive sensing (CS) based missing data recovery model is established. We then design a comparison standard to select the best sparse representation basis and introduce average cross-correlation to examine the rationality of the established model. Furthermore, an improved fast matching pursuit algorithm is proposed to solve the model. Simulation results show that the proposed method can effectively recover missing WSN data.
Abstract: Ethical statements were not included in the published version of the following articles that appeared in previous issues of Journal of Integrative Agriculture. The appropriate statements provided by the Authors are included below.
Abstract: The estimation of covariance matrices is very important in many fields, such as statistics. In real applications, data are frequently affected by high dimensionality and noise. However, most relevant studies are based on complete data. This paper studies the optimal estimation of high-dimensional covariance matrices based on missing and noisy samples under the norm. First, the model with sub-Gaussian additive noise is presented. The generalized sample covariance is then modified to define a hard thresholding estimator, and the minimax upper bound is derived. After that, the minimax lower bound is derived, and it is concluded that the estimator presented in this article is rate-optimal. Finally, a numerical simulation analysis is performed. The results show that, for missing samples with sub-Gaussian noise, the hard thresholding estimator outperforms the traditional estimation method when the true covariance matrix is sparse.
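To make the idea of hard thresholding on incomplete data concrete, here is a minimal sketch. It assumes a pairwise-complete sample covariance as a stand-in for the generalized sample covariance and uses an arbitrary threshold `tau`. The paper's noise correction and its theoretically chosen threshold are not reproduced, and both function names are hypothetical.

```python
def pairwise_covariance(rows):
    """Covariance from rows that use None for missing entries.

    Each entry (a, b) uses only the rows in which both variables are observed.
    This is an illustrative stand-in, not the estimator defined in the paper.
    """
    p = len(rows[0])
    cov = [[0.0] * p for _ in range(p)]
    for a in range(p):
        for b in range(a, p):
            pairs = [(r[a], r[b]) for r in rows
                     if r[a] is not None and r[b] is not None]
            if len(pairs) < 2:
                continue  # not enough jointly observed rows; leave the entry at 0.0
            mean_a = sum(x for x, _ in pairs) / len(pairs)
            mean_b = sum(y for _, y in pairs) / len(pairs)
            value = sum((x - mean_a) * (y - mean_b) for x, y in pairs) / (len(pairs) - 1)
            cov[a][b] = cov[b][a] = value
    return cov


def hard_threshold(cov, tau):
    """Keep the diagonal; set off-diagonal entries with |value| <= tau to zero."""
    p = len(cov)
    return [[cov[i][j] if i == j or abs(cov[i][j]) > tau else 0.0
             for j in range(p)] for i in range(p)]


# Toy data: the first two variables are strongly related, the third is nearly flat.
samples = [
    [1.0, 2.0, None],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 1.1],
    [4.0, None, 0.9],
]
estimate = hard_threshold(pairwise_covariance(samples), tau=0.5)
```

In this toy example the small covariances involving the third variable are set to zero while the strong covariance between the first two variables survives, which is the behavior that favors thresholding when the true matrix is sparse.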
Funding: Supported by the National Natural Science Foundation of China (No. 81973705).
Abstract: Background: Missing data occur frequently in clinical studies. With the development of precision medicine, interest in N-of-1 trials has increased. Bayesian models are one of the main statistical methods for analyzing data from N-of-1 trials. This simulation study aimed to compare two statistical methods for handling missing values of quantitative data in Bayesian N-of-1 trials. Methods: Simulated N-of-1 trial data with different autocorrelation coefficients, effect sizes and missing ratios were generated with SAS 9.1. The missing values were filled by mean filling and by regression filling under each combination of autocorrelation coefficient, effect size and missing ratio using SPSS 25.0. Bayesian models were built to estimate the posterior means using WinBUGS 14. Results: When the missing ratio is relatively small, e.g. 5%, missing values have relatively little effect on the results. Therapeutic effects may be underestimated when the autocorrelation coefficient increases and no filling is used. However, they may be overestimated when mean or regression filling is used, and the results after mean filling are closer to the actual effect than those after regression filling. With a moderate missing ratio, the estimated effect after mean filling is closer to the actual effect than that after regression filling. With a large missing ratio (20%), missing data can lead to significant underestimation of the effect. In this case, the estimated effect after regression filling is closer to the actual effect than that after mean filling. Conclusion: Missing data can affect the therapeutic effects estimated with Bayesian models in N-of-1 trials. The present study suggests that mean filling can be used when the missing ratio is ≤10%; otherwise, regression filling may be preferable.
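The two filling strategies compared in the abstract can be illustrated with a short sketch. It assumes a single series with `None` marking missing observations, and it assumes that regression filling means an ordinary least-squares line fitted on the time index. The abstract does not say which predictors were used in SPSS, so that choice and the function names are illustrative only.

```python
def mean_fill(series):
    """Replace each missing value with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in series]


def regression_fill(series):
    """Replace each missing value with the prediction of a least-squares line.

    Assumption: the value is regressed on its time index (0, 1, 2, ...).
    Requires at least two observed points at distinct time indices.
    """
    points = [(t, v) for t, v in enumerate(series) if v is not None]
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in points)
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return [intercept + slope * t if v is None else v
            for t, v in enumerate(series)]


# A trending series with the last observation missing.
outcomes = [1.0, 2.0, 3.0, 4.0, None]
filled_by_mean = mean_fill(outcomes)              # last value becomes 2.5
filled_by_regression = regression_fill(outcomes)  # last value becomes 5.0
```

The example shows why the two methods can diverge: mean filling pulls a missing value toward the overall level of the series, while regression filling follows the fitted trend.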
Abstract: In this paper, a model averaging method is proposed for varying-coefficient models with responses missing at random by establishing a weight selection criterion based on cross-validation. Under certain regularity conditions, it is proved that the proposed method is asymptotically optimal in the sense of achieving the minimum squared error.
Funding: This work was supported in part by the National Natural Science Foundation of China under project contracts No. 61971113 and 61901095, in part by the National Key R&D Program under project contract No. 2018AAA0103203, in part by the Guangdong Provincial Research and Development Plan in Key Areas under project contracts No. 2019B010141001 and 2019B010142001, in part by the Sichuan Provincial Science and Technology Planning Program under project contracts No. 2020YFG0039, No. 2021YFG0013 and No. 2021YFH0133, in part by the Ministry of Education China Mobile Fund Program under project contract No. MCM20180104, in part by the Yibin Science and Technology Program-Key Projects under project contracts No. 2018ZSF001 and 2019GY001, in part by the Central University Business Fee Program under project contract No. A03019023801224, and by the Central Universities under Grant ZYGX2019Z022.
Abstract: Radio Frequency Identification (RFID) technology has been widely used to identify missing items. In many applications, rapidly pinpointing key tags that are attached to favorable or valuable items is critical. To realize this goal, interference from ordinary tags should be avoided, while key tags should be verified efficiently. Despite many previous studies, how to rapidly and dynamically filter out ordinary tags when the ratio of ordinary tags changes has not been addressed. Moreover, how to efficiently verify missing key tags in groups rather than one by one has not been explored, especially with varying missing rates. In this paper, we propose an Efficient and Robust missing Key tag Identification (ERKI) protocol that consists of a filtering mechanism and a verification mechanism. Specifically, the filtering mechanism adopts a Bloom filter to quickly filter out ordinary tags and uses a labeling vector to optimize the Bloom filter's performance when the key tag ratio is high. The verification mechanism dynamically verifies key tags according to the missing rate, mapping an appropriate number of key tags to a slot and verifying them at once. Moreover, we theoretically analyze the parameters of the ERKI protocol to minimize its execution time. Extensive numerical results show that ERKI can accelerate the execution time by more than 2.14 compared with state-of-the-art solutions.
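The filtering step relies on a Bloom filter, which can be sketched as follows. This is a generic Bloom filter, not the ERKI protocol itself: the labeling vector, the slot mapping and the parameter optimization are not shown, and the tag identifiers, filter size and hash construction (salted SHA-256) are illustrative assumptions.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter: no false negatives, a tunable rate of false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # True means "possibly a key tag"; False means "definitely not".
        return all(self.bits[pos] for pos in self._positions(item))


# The reader encodes the key tag IDs; a tag that fails the check can stay silent.
key_tags = ["tag-001", "tag-002", "tag-003"]  # hypothetical tag identifiers
bloom = BloomFilter(num_bits=256, num_hashes=3)
for tag in key_tags:
    bloom.add(tag)
```

A key tag always passes `might_contain`, so no key tag is filtered out by mistake. An ordinary tag passes only on a false positive, and the probability of that falls as `num_bits` grows relative to the number of key tags.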
Funding: Supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2018-0-01405) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).
Abstract: Recently, the importance of data analysis has increased significantly due to the rapid growth of data. In particular, vehicle communication data, considered a significant challenge in Intelligent Transportation Systems (ITS), has spatiotemporal characteristics and many missing values. A high proportion of missing values decreases the predictive performance of models. Existing missing value imputation models ignore the topology of transportation networks due to the structural connection of road networks, although physical distances are close in spatiotemporal image data. Additionally, the learning process of missing value imputation models requires complete data, but there are limitations in securing complete vehicle communication data. To address these issues, this study proposes a missing value imputation model based on an adversarial autoencoder using spatiotemporal feature extraction. The proposed method replaces missing values by reflecting the spatiotemporal characteristics of transportation data using temporal convolution and spatial convolution. Experimental results show that the proposed model has the lowest error rate, 5.92%, demonstrating excellent predictive accuracy. This makes it possible to mitigate the data sparsity problem and improve traffic safety through better predictive performance.
Abstract: Next Generation Sequencing (NGS) provides an effective basis for estimating the survival time of cancer patients, but it also poses the problem of high data dimensionality. In addition, some patients drop out of the study, which leaves data missing, so a method is needed for estimating the mean of a response variable with missing values in ultra-high dimensional datasets. In this paper, we propose a two-stage ultra-high dimensional variable screening method, RF-SIS, based on random forest regression, which effectively addresses the problem of estimating missing values under excessive data dimension. After dimension reduction by RF-SIS, mean interpolation is applied to the missing responses. The results on simulated data show that, compared with the estimation method of directly deleting missing observations, the estimation results of RF-SIS-MI have significant advantages in terms of the proportion of intervals covered, the average length of intervals, and the average absolute deviation.
Abstract: In this paper, three smoothed empirical log-likelihood ratio functions for the parameters of nonlinear models with missing responses are suggested. Under some regularity conditions, the corresponding Wilks phenomena are obtained, and confidence regions for the parameters can be constructed easily.
Abstract: Fear of missing out (FoMO) is a term introduced in 2004 to describe a phenomenon observed on social networking sites. FoMO includes two processes: first, the perception of missing out, followed by compulsive behavior to maintain these social connections. We are interested in understanding the complex construct of FoMO and its relation to the need to belong and to form stable interpersonal relationships. Because it is considered a problematic attachment to social media, FoMO is associated with a range of negative life experiences and feelings. We provide a general review of the literature and summarize the findings in relation to mental health, social functioning, sleep, academic performance and productivity, neuro-developmental disorders, and physical well-being. We also discuss the treatment options available for FoMO based on cognitive behavior therapy. It is imperative that new findings on FoMO are communicated to the clinical community, as it has diagnostic implications and could be a confounding variable in those who do not respond to treatment as usual.
Funding: Supported by the Key Discipline Construction Program of Beijing Municipal Commission of Education (XK10008043).
Abstract: Most real application processes are complex nonlinear systems with incomplete information. It is difficult to estimate a model by assuming that the data set is governed by a global model. Moreover, in real processes, the available data set usually contains missing values. To overcome the shortcomings of global modeling and missing data values, a new modeling method is proposed. First, an incomplete data set with missing values is partitioned into several clusters by a K-means with soft constraints (KSC) algorithm, which incorporates soft constraints to enable clustering with missing values. Then a local model based on each group is developed using the SVR algorithm, which adopts a missing value insensitive (MVI) kernel to address the missing value estimation problem. For each local model, its valid area is obtained as well. Simulation results demonstrate the effectiveness of the local model and the estimation algorithm.
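To make the idea of clustering incomplete records concrete, here is a minimal sketch of one K-means iteration that tolerates missing entries. It deliberately substitutes a simpler, well-known device, the partial-distance strategy, for the abstract's KSC soft constraints, which are not specified here. The function names and the use of `None` to mark a missing value are assumptions for illustration.

```python
def partial_sq_distance(point, centroid):
    """Squared distance over observed coordinates, rescaled to full dimension."""
    pairs = [(p, c) for p, c in zip(point, centroid) if p is not None]
    if not pairs:
        raise ValueError("point has no observed coordinates")
    scale = len(point) / len(pairs)
    return scale * sum((p - c) ** 2 for p, c in pairs)


def assign_clusters(points, centroids):
    """Assign each point to the centroid with the smallest partial distance."""
    return [
        min(range(len(centroids)),
            key=lambda k: partial_sq_distance(pt, centroids[k]))
        for pt in points
    ]


def update_centroids(points, labels, centroids):
    """Recompute each centroid coordinate from observed member values only."""
    updated = []
    for k, old in enumerate(centroids):
        members = [pt for pt, lab in zip(points, labels) if lab == k]
        row = []
        for d in range(len(old)):
            observed = [pt[d] for pt in members if pt[d] is not None]
            # Keep the previous value if no member observes this coordinate.
            row.append(sum(observed) / len(observed) if observed else old[d])
        updated.append(row)
    return updated
```

Alternating `assign_clusters` and `update_centroids` until the labels stop changing gives a basic clustering of incomplete data. A separate local model could then be fitted per cluster, which is the role the abstract gives to SVR.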
Abstract: The first thunderstorm of the year appeared in southern Shenyang on May 2, 2010. It did not cause a severe lightning disaster in the Shenyang region, but the forecast service performed poorly because the thunderstorm was not forecast accurately. In this paper, the reasons for the missed forecast of this thunderstorm are analyzed, and thunderstorm potential is analyzed by means of the mesoscale analysis technique, providing technical indices and a vantage point for the prediction of thunderstorm potential. The results show that the reasons for the missed forecast of this weather process were as follows: the surface temperature in the preceding period was persistently low, which was unfavorable for the development of convective weather; the ability to interpret and analyze numerical forecast products should be improved; the forecast result of the T639 model was better than that of the Japanese numerical forecast; and the study and application of the mesoscale analysis technique should be strengthened. This service was formally developed after the thunderstorm weather on June 1, 2010.
Abstract: Missing via has been a defect in semiconductor manufacturing, especially in foundries. Solving it is attractive for yield improvement in relatively mature technologies, since each percentage point of improvement means a significant profit margin enhancement. However, the root cause of the missing via defect is not easy to determine, since many factors, such as defocus, material re-deposition, and inadequate development, can lead to missing via defects. Therefore, knowing the exact cause of each defect type is the key. In this paper, we present the analysis methodology used in our company. In the experiments, we have observed three types of missing vias. The first type consists of large areas, usually circular, of missing patterns, which are primarily located near the wafer edge. The second type consists of isolated sites with single partially opened vias or completely unopened vias. The third type consists of relatively small circular areas within which the entire via pattern is missing. We first tried optimizing the developing recipe and found that the first type of missing via can be largely removed by tuning the rinse process, which improves the cleaning efficiency of the developing residue. However, this method does not remove missing vias of the second and third types. We found that the second type of missing via is related to local defocus caused by topographical distribution. To resolve the third type of missing via defect, we performed extensive experiments with different types of developer nozzles and different types of photomasks, and found no distinct dependence of the defect density on either the nozzle or the mask type. Moreover, we studied the defect density of three resists with different resolution capabilities and found a correlation between the defect density and the resist resolution. It seems that, in general, lower resolution resists also have lower defect density. The results will be presented in the paper.
Abstract: In this study, we investigate the effects of missing data when estimating HIV/TB co-infection. We revisit the concept of missing data and examine three available approaches for dealing with missingness. The main objective is to identify the best method for correcting for missing data in the TB/HIV co-infection setting. We employ both empirical data analysis and an extensive simulation study to examine the effects of missing data on the accuracy, sensitivity, specificity, and train and test error of the different approaches. The novelty of this work lies in the use of modern statistical learning algorithms when treating missingness. In the empirical analysis, both HIV data and TB-HIV co-infection data imputations were performed, and the missing values were imputed using different approaches. In the simulation study, sets of 0% (complete case), 10%, 30%, 50% and 80% of the data were drawn randomly and replaced with missing values. Results show that the complete-case analysis gave a co-infection rate (95% confidence interval) of 29% (25%, 33%), the weighted method 27% (23%, 31%), the likelihood-based approach 26% (24%, 28%), and the multiple imputation (MI) approach 21% (20%, 22%). In conclusion, MI remains the best approach for dealing with missing data, and failure to apply it results in overestimation of the HIV/TB co-infection rate by 8 percentage points (29% versus 21%).
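The simulation design in this abstract (randomly replacing a fixed share of the data with missing values, then estimating a rate with a 95% interval) can be sketched as below. This is an illustrative sketch only: the function names, the use of `None` for a missing value, the fixed seed, and the Wald interval are assumptions, and only the complete-case estimator is shown, not the weighted, likelihood-based, or multiple imputation approaches.

```python
import math
import random


def mask_completely_at_random(values, missing_rate, seed=0):
    """Replace a fixed share of entries, chosen uniformly at random, with None."""
    rng = random.Random(seed)
    n_missing = round(len(values) * missing_rate)
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]


def complete_case_proportion(values, z=1.96):
    """Proportion of 1s among observed 0/1 entries with a Wald interval."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values")
    n = len(observed)
    p = sum(observed) / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


if __name__ == "__main__":
    # Hypothetical 0/1 co-infection indicators, not data from the study.
    indicators = [1] * 30 + [0] * 70
    for rate in (0.0, 0.1, 0.3, 0.5, 0.8):
        masked = mask_completely_at_random(indicators, rate)
        print(rate, complete_case_proportion(masked))
```

Under this completely-at-random masking the complete-case estimate is not expected to be systematically biased, only less precise as the missing rate grows. The sketch therefore does not reproduce the abstract's reported rates, which come from real data where missingness may depend on patient characteristics.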