Funding: supported by the National Key Research and Development Program of China (No. 2021YFB3300503), the Regional Innovation and Development Joint Fund of the National Natural Science Foundation of China (No. U22A20167), and the National Natural Science Foundation of China (No. 61872260).
Abstract: Recently, self-supervised learning has shown great potential in Graph Neural Networks (GNNs) through contrastive learning, which aims to learn discriminative features for each node without label information. The key to graph contrastive learning is data augmentation. The anchor node regards its augmented samples as positive samples, and the rest of the samples are regarded as negative samples, some of which may actually be positive. We call these mislabeled samples "false negative" samples, and they seriously degrade the final learning effect. Since such semantically similar samples are ubiquitous in a graph, the false negative sample problem is significant. To address this issue, the paper proposes a novel model, False negative sample Detection for Graph Contrastive Learning (FD4GCL), which uses attribute and structure awareness to detect false negative samples. Experimental results on seven datasets show that FD4GCL outperforms state-of-the-art baselines and even exceeds several supervised methods.
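To make the attribute- and structure-aware detection idea concrete, the following is a minimal sketch (not the authors' FD4GCL implementation; the similarity measures, the weighting factor alpha, and the threshold are illustrative assumptions). It flags candidate negatives whose combined feature and neighborhood similarity to the anchor is high, so they can be excluded from the contrastive loss.

```python
# Hedged sketch: flag likely "false negative" samples by combining attribute
# similarity (cosine over node features) with structural similarity (Jaccard
# overlap of 1-hop neighborhoods). All thresholds/names are illustrative.
import numpy as np

def cosine_sim(anchor_vec, candidate_mat):
    """Cosine similarity between one anchor vector and each candidate row."""
    a = anchor_vec / (np.linalg.norm(anchor_vec) + 1e-12)
    b = candidate_mat / (np.linalg.norm(candidate_mat, axis=1, keepdims=True) + 1e-12)
    return b @ a

def jaccard_overlap(adj, anchor, candidates):
    """Structural similarity: Jaccard overlap of 1-hop neighborhoods."""
    n_anchor = set(np.flatnonzero(adj[anchor]))
    scores = []
    for c in candidates:
        n_c = set(np.flatnonzero(adj[c]))
        union = n_anchor | n_c
        scores.append(len(n_anchor & n_c) / len(union) if union else 0.0)
    return np.asarray(scores)

def detect_false_negatives(features, adj, anchor, candidates,
                           alpha=0.5, threshold=0.8):
    """Return candidate negatives whose combined attribute/structure similarity
    to the anchor exceeds the threshold (likely same semantic class)."""
    attr = cosine_sim(features[anchor], features[candidates])
    struct = jaccard_overlap(adj, anchor, candidates)
    combined = alpha * attr + (1 - alpha) * struct
    return np.asarray(candidates)[combined > threshold]
```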
Abstract: If the population is rare and clustered, then simple random sampling gives a poor estimate of the population total. For such populations, adaptive cluster sampling is useful, but it loses control over the final sample size, so the cost of sampling increases substantially. To overcome this problem, surveyors often use auxiliary information that is easy to obtain and inexpensive, and an attempt is made to control the final sample size through this auxiliary information. In this article, we propose a two-stage negative adaptive cluster sampling design. It is a new design that combines two-stage sampling with negative adaptive cluster sampling. In this design, we consider an auxiliary variable which is highly negatively correlated with the variable of interest and for which the auxiliary information is completely known. In the first stage, an initial random sample is drawn using the auxiliary information. Then, using Thompson's (J Am Stat Assoc 85:1050-1059, 1990) adaptive procedure, networks in the population are discovered. These networks serve as the primary-stage units (PSUs). In the second stage, random samples of unequal sizes are drawn from the PSUs to obtain the secondary-stage units (SSUs). The values of the auxiliary variable and the variable of interest are recorded for these SSUs. A regression estimator is proposed to estimate the population total of the variable of interest. A new estimator, the Composite Horvitz-Thompson (CHT)-type estimator, is also proposed; it is based only on the information on the variable of interest. Variances of the two estimators, along with their unbiased estimators, are derived. Using the proposed methodology, a sample survey was conducted in the Western Ghats of Maharashtra, India. The performance of these estimators and the methodology is compared with other existing methods, and a cost-benefit analysis is given.
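The regression estimator mentioned above follows the general survey-sampling pattern of adjusting a design-weighted total by a known auxiliary total. The sketch below is a generic version of that pattern (not the paper's two-stage estimator; the weights and the known auxiliary total are assumptions supplied by the caller).

```python
# Hedged sketch: a generic regression estimator of a population total, using an
# auxiliary variable whose population total is known. Weights are design weights
# (1 / inclusion probability); variable names are illustrative.
import numpy as np

def regression_total_estimator(y, x, weights, total_x_known):
    """Estimate the population total of y from sampled units."""
    y, x, w = map(np.asarray, (y, x, weights))
    ht_y = np.sum(w * y)                     # Horvitz-Thompson-type total for y
    ht_x = np.sum(w * x)                     # same for the auxiliary variable
    # Weighted least-squares slope of y on x
    xbar, ybar = ht_x / np.sum(w), ht_y / np.sum(w)
    b = np.sum(w * (x - xbar) * (y - ybar)) / np.sum(w * (x - xbar) ** 2)
    # Adjust the design-weighted estimate using the known auxiliary total
    return ht_y + b * (total_x_known - ht_x)
```

When the auxiliary variable is strongly (here, negatively) correlated with the variable of interest, the adjustment term removes much of the sampling error of the plain design-weighted total.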
Funding: supported by the National Natural Science Foundation of China (No. 69972024) and the National High Technology Research and Development Program of China (No. 2001A4114081).
Abstract: A novel face verification algorithm using competitive negative samples is proposed. In the algorithm, the tested face is matched not only against the claimed client face but also against competitive negative samples, and all the matching scores are combined to make the final decision. Based on the algorithm, three schemes are designed: the closest-negative-sample scheme, the all-negative-sample scheme, and the closest-few-negative-sample scheme. They are tested and compared with the traditional similarity-based verification approach on several databases with different features and classifiers. Experiments demonstrate that the three schemes reduce the verification error rate by 25.15%, 30.24%, and 30.97% on average, respectively.
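A minimal sketch of how the three schemes could combine scores is given below (reconstructed from the description, not the authors' code; the score convention "higher means more similar", the number of closest negatives k, and the decision threshold are assumptions).

```python
# Hedged sketch: accept the claimed identity only if its matching score beats a
# reference score derived from competitive negative samples.
import numpy as np

def verify(score_claimed, scores_negative, scheme="closest", k=3, threshold=0.0):
    """Return True (accept) if the claimed-client match beats the negatives.

    score_claimed   : similarity between the test face and the claimed client
    scores_negative : similarities between the test face and negative samples
    """
    neg = np.sort(np.asarray(scores_negative))[::-1]   # descending similarity
    if scheme == "closest":            # closest-negative-sample scheme
        reference = neg[0]
    elif scheme == "all":              # all-negative-sample scheme
        reference = neg.mean()
    elif scheme == "closest_few":      # closest-few-negative-sample scheme
        reference = neg[:k].mean()
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return (score_claimed - reference) > threshold
```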
Funding: funded by the National Key R&D Program of China (Project No. 2019YFC1509605), the High-end Foreign Expert Introduction Program (Nos. G20200022005 and DL2021165001L), and the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. HZ2021001).
Abstract: Landslide susceptibility mapping is a crucial tool for analyzing geohazards in a region. Recent publications have popularized data-driven models, particularly machine learning-based methods, owing to their strong capability in dealing with complex nonlinear problems. However, a significant proportion of these models neglect qualitative aspects during analysis, resulting in a lack of interpretability throughout the process and causing inaccuracies in negative sample extraction. In this study, Scoops 3D was employed as a physics-informed tool to qualitatively assess slope stability in the study area (the Hubei Province section of the Three Gorges Reservoir Area). Non-landslide samples were extracted based on the calculated factor of safety (FS). Subsequently, the random forest algorithm was employed for data-driven landslide susceptibility analysis, with the area under the receiver operating characteristic curve (AUC) serving as the model evaluation index. Compared to the benchmark model (i.e., the standard method using the pure random forest algorithm), the proposed method's AUC value improved by 20.1%, validating the effectiveness of the dual-driven (physics-informed, data-driven) method.
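A minimal sketch of the dual-driven workflow is shown below (not the study's pipeline; the column names, the FS cutoff, and the balanced 1:1 sampling are illustrative assumptions). Negatives are drawn only from cells that the physics-based FS rates as clearly stable, and a random forest is then scored by AUC.

```python
# Hedged sketch: physics-informed negative sampling (FS from Scoops 3D) followed
# by a data-driven random forest evaluated with AUC.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def build_dataset(cells: pd.DataFrame, landslides: pd.DataFrame, fs_cutoff=1.5):
    """cells: all mapping units with conditioning factors plus an 'FS' column.
    landslides: inventoried landslide units with the same conditioning factors.
    Negatives are sampled only from units with FS above the cutoff."""
    negatives = cells[cells["FS"] > fs_cutoff].sample(n=len(landslides), random_state=0)
    X = pd.concat([landslides, negatives]).drop(columns=["FS"], errors="ignore")
    y = np.r_[np.ones(len(landslides)), np.zeros(len(negatives))]
    return X, y

def evaluate(X, y):
    """Train a random forest and report AUC on a held-out split."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    model = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
    return roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
```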