To study how industrial location differs among industries, this article tests spatial agglomeration across industries and firm sizes at the city level. Our research is based on a unique plant-level data set of Beijing and employs a distance-based approach, which treats space as continuous. Unlike previous studies, we set two sets of references, for service and manufacturing industries respectively, to adapt the investigation to the intra-urban area. Comparing eight types of industries and different firm sizes, we find that: 1) producer services, high-tech industries and labor-intensive manufacturing industries are more likely to cluster, whereas personal services and capital-intensive industries tend to be randomly dispersed in Beijing; 2) the spillover from the co-location of firms is more important to knowledge-intensive industries and has a more significant impact on their allocation than on that of business-oriented services in the intra-urban area; 3) the spatial agglomeration of service industries is driven by larger establishments, whereas the pattern for manufacturing industries is mixed.
Nei's improved genetic distance (DA) and gene flow (Nm) were measured using sixteen microsatellite markers. Dendrograms based on DA genetic distance using the neighbor-joining (NJ) method, together with the STRUCTURE program, were used to analyze the genetic structure and relationships among 10 Chinese indigenous chicken breeds. The NJ dendrograms of DA genetic distance divided the 10 chicken breeds into two main clusters: one consisted of breeds of low body weight (CHA, TIB, XIA, GUS and BAI), and the other contained heavier breeds (LAN, DAG, YOU, XIS and LUY). Among the lighter breeds, TIB and CHA clustered together, as did XIA and GUS. Among the heavier breeds, XIS and LUY were clustered together in one branch, but LAN, DAG and YOU clustered in independent branches. The results were consistent with the Nm estimates among the 10 indigenous chicken breeds. The STRUCTURE program properly inferred the presence of genetic structure without the origin of individuals being pre-defined. The genetic clusters inferred by STRUCTURE were basically the same as those from the DA distance clustering method. An advantage of the STRUCTURE program was its ability to identify migrants and admixed individuals in the 10 chicken populations, which could not be achieved with the DA distance clustering method.
Detecting the boundaries of protein domains is an important and challenging task in both experimental and computational structural biology. In this paper, a promising method for detecting the domain structure of a protein from sequence information alone is presented. The method is based on analyzing multiple sequence alignments derived from a database search. Multiple measures are defined to quantify the domain information content of each position along the sequence. They are then combined into a single predictor using a support vector machine (SVM). More importantly, domain detection is first cast as an imbalanced data learning problem. A novel undersampling method based on distance-based maximal entropy in the SVM feature space is proposed. The overall precision is about 80%. Simulation results demonstrate that the method can help not only in predicting the complete 3D structure of a protein but also in machine learning systems on general imbalanced datasets.
The structure and level of urban transit fares can largely affect passengers' travel behavior and route choices. The transit fare policies commonly used in present transit networks lead to unbalanced transit assignment and improper distribution of transit resources. In order to distribute transit passenger flow evenly and efficiently, this paper introduces a new distance-based fare pattern with Euclidean distance. A bi-level programming model is developed for determining the optimal distance-based fare pattern, with the path-based stochastic transit assignment (STA) problem with elastic demand formulated at the lower level. The upper level addresses a principal-agent game between transport authorities and transit enterprises, which pursue maximization of social welfare and financial interest, respectively. A genetic algorithm (GA) is implemented to solve the bi-level model. A numerical example verifies the model and illustrates that the proposed nonlinear distance-based fare pattern yields better financial performance and a better distribution effect than other fare structures.
A new update strategy, the distance-based update strategy, is presented for Location Dependent Continuous Query (LDCQ) under error limitation. Moving objects at different distances from the query boundary have different possibilities of intersecting it, and therefore have different influences on the query result. We set different deviation limits for different moving objects according to these distances. A great number of unnecessary updates are thereby avoided and the load on the system is relieved.
The distance-based regression model has many applications in the analysis of multivariate response regression in various fields, such as ecology, genomics, genetics, human microbiomics, and neuroimaging. It yields a pseudo F test statistic that assesses the relation between the distance (dissimilarity) of the subjects and the predictors of interest. Despite its popularity in recent decades, the statistical properties of the pseudo F test statistic have not, to our knowledge, been revealed. This study derives the asymptotic properties of the pseudo F test statistic using spectral decomposition under the matrix normal assumption, when the dissimilarity measure used is the Euclidean or Mahalanobis distance. The pseudo F test statistic with the Euclidean distance has the same distribution as the quotient of two chi-squared-type mixtures. The numerator and denominator of the quotient are each approximated by a random variable of the form ξχ_d^2 + η, and a bound on the approximation error is given. The pseudo F test statistic with the Mahalanobis distance follows an F distribution. In simulation studies, the approximated distribution matched well the "exact" distribution obtained by the permutation procedure. The obtained distribution was further validated on H1N1 influenza data, aging human brain data, and embryonic imprint data.
Distance-based range search is crucial in many real applications. In particular, given a database and a query issuer, a distance-based range search retrieves all the objects in the database whose distances from the query issuer are less than or equal to a given threshold. Often, due to the accuracy of positioning devices, updating protocols or characteristics of applications (for example, location privacy protection), data obtained from the real world are imprecise or uncertain. Therefore, existing approaches over exact databases cannot be directly applied to the uncertain scenario. In this paper, we redefine the distance-based range query in the context of uncertain databases as the probabilistic uncertain distance-based range (PUDR) query, which obtains objects with confidence guarantees. We categorize the topological relationships between uncertain objects and uncertain search ranges into six cases and present the probability evaluation in each case. Experiments verify that our approach outperforms the Monte Carlo method used in most existing work in both precision and time cost for the uniform uncertainty distribution. The approach also approximates, with acceptable errors, the probabilities of objects following other practical uncertainty distributions, such as the Gaussian distribution. Since the retrieval of a PUDR query requires accessing all the objects in the database, which is quite costly, we propose spatial pruning and probabilistic pruning techniques to reduce the search space. Two metrics, the false positive rate and the false negative rate, are introduced to measure the quality of query results. An extensive empirical study has been conducted to demonstrate the efficiency and effectiveness of our proposed algorithms under various experimental settings.
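As an illustration of the Monte Carlo baseline that the abstract above compares against (not the paper's six-case analytical method), the following is a minimal sketch. It assumes each uncertain object is uniformly distributed in a disk and that the query issuer's position is exact; the function names, the tuple layout `(cx, cy, radius)` and the sample count are hypothetical choices made for this example.

```python
import math
import random


def sample_in_disk(cx, cy, radius, rng):
    """Draw one point uniformly from the disk centred at (cx, cy)."""
    r = radius * math.sqrt(rng.random())  # sqrt gives uniform density over area
    theta = 2.0 * math.pi * rng.random()
    return cx + r * math.cos(theta), cy + r * math.sin(theta)


def range_probability(obj, query, threshold, samples=2000, seed=0):
    """Estimate P(dist(obj, query) <= threshold) by sampling the object's disk."""
    cx, cy, radius = obj
    qx, qy = query
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    hits = 0
    for _ in range(samples):
        x, y = sample_in_disk(cx, cy, radius, rng)
        if math.hypot(x - qx, y - qy) <= threshold:
            hits += 1
    return hits / samples


def probabilistic_range_query(objects, query, threshold, min_prob):
    """Return (name, probability) for objects meeting the confidence bound."""
    result = []
    for name, obj in objects.items():
        p = range_probability(obj, query, threshold)
        if p >= min_prob:
            result.append((name, p))
    return result


# Hypothetical data: (centre x, centre y, uncertainty radius)
objects = {
    "a": (0.0, 0.0, 1.0),   # entirely inside the search range
    "b": (5.0, 0.0, 1.0),   # straddles the range boundary
    "c": (20.0, 0.0, 1.0),  # entirely outside the search range
}
answer = probabilistic_range_query(objects, (0.0, 0.0), 5.0, min_prob=0.3)
```

Every object is sampled here, which is the cost that the spatial and probabilistic pruning described in the abstract is meant to avoid.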
In this study, we developed a microfluidic paper analysis device (μPAD) for distance-based detection of Ag⁺ in water. The μPAD was manufactured by wax printing on filter paper. A layer of gold nanoparticles (AuNPs) was then deposited and ascorbic acid was printed on the channel. During detection, Ag⁺ was reduced by ascorbic acid and coated on the surface of the AuNPs on the channel, forming Au@Ag core/shell nanoparticles. Owing to capillary flow, different concentrations of Ag⁺ formed color ribbons of different lengths. Thus, quantitative detection of Ag⁺ can be achieved by measuring the length of the color ribbon. The detection limit of this method was as low as 1 mg·L⁻¹ within 15 min, and the interference of common metal ions in water can be eliminated. In conclusion, this method moves from colorimetry to direct reading, offering fast readout and easy manipulation at low cost.
Data obtained from the real world are imprecise or uncertain due to the accuracy of positioning devices, updating protocols or characteristics of applications. On the other hand, users sometimes prefer to express their requests qualitatively with vague conditions, and in some applications different parts of the search region are not equally important. We address the problem of efficiently processing fuzzy range queries for uncertain moving objects whose whereabouts in time are not known exactly. The basic form of such a query is: find objects that are always/sometimes near the query issuer, with a qualifying guarantee no less than a given threshold, during a given temporal interval. We model the location uncertainty of moving objects using probability density functions and describe the indeterminate boundary of the query range with a fuzzy set. We present the qualifying guarantee evaluation of objects, and propose pruning techniques based on the α-cut of the fuzzy set to shrink the search space efficiently. We also design rules to reject non-qualifying objects and validate qualifying objects in order to avoid unnecessary costly numeric integrations in the refinement step. An extensive empirical study has been conducted to demonstrate the efficiency and effectiveness of algorithms under various experimental
Hyperspectral images contain both spectral and spatial information, which increases the ability and precision of object classification. Despite the value of hyperspectral imaging for classification in various applications, users often find it difficult to apply effectively in practice because of the effects of light, temperature and wind in outdoor environments. This research presents a new classification model for outdoor farmland objects based on near-infrared (NIR) hyperspectral images. It involves two steps: region of interest (ROI) acquisition and establishment of classifiers. First, a distance-based method for quantitative analysis is proposed to optimize the reference pixels in ROI acquisition. Then maximum likelihood (ML) and support vector machine (SVM) classifiers are used for farmland object classification. The results show that the total classification accuracy based on the reference pixels was over 97.5%, and that the SVM-M model reached 99.5%. The research provides an effective method for outdoor farmland image classification.
α-diversity describes species diversity at local scales. The Simpson's and Shannon-Wiener indices are widely used to characterize α-diversity based on species abundances within a fixed study site (e.g., a quadrat or plot). Although such indices provide overall diversity estimates that can be analyzed, their values are not spatially continuous, nor are they applicable in theory to any point within the study region, and thus they cannot be treated as spatial covariates for analyses of other variables. Herein, we extended the Simpson's and Shannon-Wiener indices to create point estimates of α-diversity for any location based on spatially explicit species occurrences within different bandwidths (i.e., radii, with the location of interest as the center). For an arbitrary point in the study region, species occurrences within the circle defined by the bandwidth were weighted according to their distance from the center using a tri-cube kernel function, with occurrences closer to the center having greater weight than more distant ones. These novel kernel-based α-diversity indices were tested using a tree dataset from a 400 m × 400 m study region comprising a 200 m × 200 m core region surrounded by a 100-m-wide buffer zone. Our newly extended α-diversity indices did not disagree qualitatively with the traditional indices, and the former were slightly lower than the latter, by <2%, at medium and large bandwidths. The present work demonstrates the feasibility of using kernel-based α-diversity indices to estimate diversity at any location in the study region and allows them to be used as quantifiable spatial covariates or predictors for other dependent variables of interest in future ecological studies. Spatially continuous α-diversity indices are useful for comparing and monitoring species trends in space and time, which is valuable for conservation practitioners.
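The kernel weighting described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the Gini-Simpson form 1 − Σp² for the Simpson index and the natural logarithm for the Shannon-Wiener index, and the function names and the `(species, x, y)` record layout are hypothetical.

```python
import math
from collections import defaultdict


def tricube(distance, bandwidth):
    """Tri-cube kernel weight: (1 - u^3)^3 for u = d/h < 1, else 0."""
    u = distance / bandwidth
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def kernel_diversity(point, occurrences, bandwidth):
    """Return (Simpson, Shannon) at `point` from distance-weighted occurrences.

    occurrences: iterable of (species, x, y) records.
    """
    px, py = point
    weight_by_species = defaultdict(float)
    for species, x, y in occurrences:
        w = tricube(math.hypot(x - px, y - py), bandwidth)
        if w > 0.0:
            weight_by_species[species] += w
    total = sum(weight_by_species.values())
    if total == 0.0:
        return 0.0, 0.0  # no occurrence falls inside the bandwidth
    # Weighted proportions replace raw abundance proportions.
    props = [w / total for w in weight_by_species.values()]
    simpson = 1.0 - sum(p * p for p in props)
    shannon = -sum(p * math.log(p) for p in props)
    return simpson, shannon


# Hypothetical trees: two species equidistant from the origin, one out of range.
trees = [("oak", 1.0, 0.0), ("pine", -1.0, 0.0), ("fir", 50.0, 0.0)]
simpson, shannon = kernel_diversity((0.0, 0.0), trees, bandwidth=10.0)
```

Because the indices can be evaluated at any coordinate, calling `kernel_diversity` on a grid of points would give the spatially continuous surface the abstract refers to.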
Congestion pricing is an important component of urban intelligent transport systems. The efficiency, equity and environmental impacts associated with road pricing schemes are key issues that should be considered before such schemes are implemented. This paper focuses on cordon-based pricing with distance tolls, where the tolls are determined by a nonlinear function of a vehicle's travel distance within a cordon, termed the toll charge function. The optimal tolls can give rise to: 1) higher total social benefits, 2) better levels of equity, and 3) reduced environmental impacts (e.g., lower emissions). First, a deterministic user equilibrium (DUE) model with elastic demand is presented to evaluate any given toll charge function. Because the distance tolls are non-additive, a modified path-based gradient projection algorithm is developed to solve the DUE model. Then, to quantify the equity level of each toll charge function, the Gini coefficient is adopted to measure the equity of the flows in the entire transport network based on equilibrium flows. The total emission level is used to reflect the impact of distance tolls on the environment. With these measures of efficiency, equity and environmental impact, together with the DUE model, a multi-objective bi-level programming model is then developed to determine optimal distance tolls. The multi-objective model is converted to a single-level model using goal programming, and a genetic algorithm (GA) is adopted to find solutions. Finally, a numerical example is presented to verify the methodology.
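For readers unfamiliar with the equity measure named in the abstract above, here is a minimal sketch of the Gini coefficient using the standard mean-absolute-difference formula. What quantity is fed in (for example, a per-OD-pair value derived from equilibrium flows) is left open here; the input lists below are hypothetical.

```python
def gini(values):
    """Gini coefficient: sum_i sum_j |x_i - x_j| / (2 * n * sum(x)).

    Returns 0.0 for perfectly equal values; approaches 1 as one
    element holds everything. Assumes non-negative inputs.
    """
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    abs_diff_sum = sum(abs(a - b) for a in values for b in values)
    return abs_diff_sum / (2.0 * n * total)


equal_case = gini([3.0, 3.0, 3.0, 3.0])    # everyone bears the same burden
skewed_case = gini([0.0, 0.0, 0.0, 1.0])   # one element bears everything
```

The double loop is quadratic in the number of values, which is acceptable for a sketch; a sorted-order formula would be the usual choice for large inputs.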
The primary objective of the study was to determine whether a distance-based educational intervention would result in positive health outcomes for persons with both diabetes mellitus (DM) and cognitive impairment. Older adults with Type 2 diabetes who also have cognitive impairment, such as Mild Cognitive Impairment (MCI) or early-stage dementia, are both challenged and at risk when attempting to live independently. The ability to effectively monitor blood glucose levels and to follow diet and exercise regimens is often severely constrained by the combination of DM and MCI or early-stage dementia. We describe an exploratory study funded by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) in which Certified Diabetic Educators (CDEs) were linked with 40 older adults with DM and cognitive impairment using iPads and the internet. CDEs presented personalized education sessions to participants, and 18 of the participants also received a cognitive intervention called Spaced Retrieval (SR), which is designed to train the effective use of strategies to enhance medication compliance and reach other goals. Blood glucose and cholesterol measures were assessed at baseline and at 2, 4, and 6 months post intervention. Hemoglobin A1c (HbA1c) levels initially declined from baseline after treatment but returned to baseline levels after 6 months. For low-density lipoprotein (LDL) cholesterol, a significant Group × Time interaction was found: LDL levels increased from baseline after treatment in the control group but declined in the SR group. Goals that were initially learned were, in general, retained at short-term follow-up, and self-efficacy increased significantly after training. The results show the need for follow-up and support after initial treatment, as well as the need to see whether the effects produced by SR can be replicated and sustained with continued contact.
To address the optimal distance toll design problem for cordon-based congestion pricing while incorporating the issue of equity, this paper presents a toll user equilibrium (TUE) model, based on a transformed network with elastic demand, to evaluate any given toll charge function. A bi-level programming model is developed for determining the optimal toll levels, with the TUE represented at the lower level. The upper level optimizes the total equity level over the transport network, represented by the Gini coefficient, subject to a constraint on the total travel impedance of each OD pair after the levy. A genetic algorithm (GA) is implemented to solve the bi-level model, which is verified by a numerical example.
Silica has three major crystalline varieties. Quartz is the main and most abundant ingredient in the Earth's crust, while the other varieties are formed by heating quartz. Silica quartz has a rich chemical structure with numerous properties. Any chemical network or structure can be transformed into a graph in which atoms become vertices and bonds become edges between vertices. This makes a complex network easier to visualize and work with. There are many ways to study chemical structures in terms of graph theory, and the resolvability parameters of a graph are an advanced and applicable topic. Resolvability parameters give a graph a unique representation: each vertex or edge receives a unique identification with respect to some selected vertices, which depends on the distances between vertices and their pattern in the particular graph. We study some resolvability parameters of SiO2 quartz. We compute the resolving set for the quartz structure and its variants, and prove that all the variants of resolvability parameters of quartz structures are constant and do not depend on the order of the graph.
The value difference metric (VDM) is one of the best-known and most widely used distance functions for nominal attributes. This work applies the instance weighting technique to improve VDM and proposes an instance weighted value difference metric (IWVDM). Unlike prior work, IWVDM uses naive Bayes (NB) to find weights for training instances. Because early work has shown that there is a close relationship between VDM and NB, some work on NB can be applied to VDM. The weight of a training instance x that belongs to class c is assigned according to the difference between the conditional probability P(c|x) estimated by NB and the true conditional probability P(c|x), and the weight is adjusted iteratively. Compared with previous work, IWVDM has the advantage of reducing the time complexity of finding weights while simultaneously improving the performance of VDM. Experimental results on 36 UCI datasets validate the effectiveness of IWVDM.
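The base metric that the abstract above builds on can be sketched for a single nominal attribute as follows. This shows plain, unweighted VDM in its commonly stated form, the sum over classes of |P(c|a) − P(c|b)|^q with probabilities estimated from counts; it does not implement the instance weighting of IWVDM. The function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def fit_vdm(values, labels):
    """Count class frequencies per attribute value: counts[value][class]."""
    counts = defaultdict(Counter)
    for v, c in zip(values, labels):
        counts[v][c] += 1
    return counts


def vdm(counts, a, b, classes, q=1):
    """Distance between nominal values a and b of one attribute.

    Sums |P(c | a) - P(c | b)| ** q over all classes, with the
    conditional probabilities estimated as relative frequencies.
    """
    if a not in counts or b not in counts:
        raise ValueError("both values must occur in the training data")
    na = sum(counts[a].values())
    nb = sum(counts[b].values())
    return sum(abs(counts[a][c] / na - counts[b][c] / nb) ** q for c in classes)


# Hypothetical training data for one attribute ("colour") and a binary class.
colours = ["red", "red", "blue", "blue", "green", "green"]
labels = ["y", "y", "n", "n", "y", "n"]
counts = fit_vdm(colours, labels)
d_red_blue = vdm(counts, "red", "blue", classes=["y", "n"])
d_red_green = vdm(counts, "red", "green", classes=["y", "n"])
```

An instance-weighted variant would replace the unit increments in `fit_vdm` with per-instance weights; how those weights are derived from naive Bayes is specific to the paper and is not reproduced here.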
Uncertain data are common due to the increasing use of sensors, radio frequency identification (RFID), GPS and similar devices for data collection. The causes of uncertainty include limitations of measurements, inclusion of noise, inconsistent supply voltage, and delay or loss of data in transfer. In order to manage, query or mine such data, data uncertainty needs to be considered. Hence, this paper studies the problem of top-k distance-based outlier detection from uncertain data objects. In this work, an uncertain object is modelled by the probability density function of a Gaussian distribution. The naive approach to distance-based outlier detection uses a nested loop. This approach is very costly due to the expensive distance function between two uncertain objects. Therefore, a populated-cells list (PC-list) approach to outlier detection is proposed. Using the PC-list, the proposed top-k outlier detection algorithm needs to consider only a fraction of the dataset objects and hence quickly identifies candidate objects for top-k outliers. Two approximate top-k outlier detection algorithms are presented to further increase the efficiency of the top-k outlier detection algorithm. An extensive empirical study on synthetic and real datasets is also presented to demonstrate the accuracy, efficiency and scalability of the proposed algorithms.
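The naive nested-loop approach mentioned in the abstract above can be sketched as follows. This is a simplified illustration under two assumptions that are not taken from the paper: points are exact rather than Gaussian-distributed, and an object's outlier score is the number of other objects within a fixed radius (fewer neighbours means more outlying). The function name and data are hypothetical.

```python
import math


def top_k_outliers(points, radius, k):
    """Return indices of the k points with the fewest neighbours within radius.

    Nested loop: every pair is compared, so the cost is O(n^2) distance
    evaluations. Ties are broken by the lower index.
    """
    scores = []
    for i, p in enumerate(points):
        neighbours = 0
        for j, q in enumerate(points):
            if i != j and math.dist(p, q) <= radius:
                neighbours += 1
        scores.append((neighbours, i))
    scores.sort()  # fewest neighbours first
    return [i for _, i in scores[:k]]


# Hypothetical data: a tight cluster plus one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
outliers = top_k_outliers(points, radius=2.0, k=1)
```

With uncertain objects each `math.dist` call would become an expensive probabilistic distance evaluation, which is why the abstract proposes a cell-based list to avoid examining every pair.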
Funding (spatial agglomeration of industries in Beijing): State Key Program of National Natural Science of China (No. 41230632); National Natural Science Foundation of China (No. 41301123, 41201169).
Funding (genetic structure of Chinese indigenous chicken breeds): supported by the Program of National Technological Basis from the Ministry of Science and Technology of China (No. 2005DKA21101) and the National Natural Science Foundation of China (No. 30700572).
Funding (protein domain boundary detection): National Natural Science Foundation of China (Grant No. 60433020, 60673099, 60673023); "985" project of Jilin University.
Funding (distance-based transit fare pattern): the Humanities and Social Science Foundation of the Ministry of Education of China (Grant No. 20YJCZH121).
Funding (pseudo F test statistic in distance-based regression): supported by the National Natural Science Foundation of China (Grant No. 11722113).
基金supported by the National High Technology Research and Development 863 Program of China under Grant No. 2007AA01Z404the Program of Jiangsu Province under Grant No.BE2008135.
Abstract: Distance-based range search is crucial in many real applications. Given a database and a query issuer, a distance-based range search retrieves all the objects in the database whose distances from the query issuer are less than or equal to a given threshold. Because of the limited accuracy of positioning devices, updating protocols, or characteristics of applications (for example, location privacy protection), data obtained from the real world are often imprecise or uncertain. Existing approaches designed for exact databases therefore cannot be applied directly to the uncertain scenario. In this paper, we redefine the distance-based range query in the context of uncertain databases as the probabilistic uncertain distance-based range (PUDR) query, which returns objects with confidence guarantees. We categorize the topological relationships between uncertain objects and uncertain search ranges into six cases and present the probability evaluation for each case. Experiments verify that our approach outperforms the Monte Carlo method used in most existing work in both precision and time cost for the uniform uncertainty distribution. The approach also approximates, with acceptable errors, the probabilities of objects following other practical uncertainty distributions, such as the Gaussian distribution. Since answering a PUDR query requires accessing all the objects in the database, which is quite costly, we propose spatial pruning and probabilistic pruning techniques to reduce the search space. Two metrics, false positive rate and false negative rate, are introduced to measure the quality of query results. An extensive empirical study has been conducted to demonstrate the efficiency and effectiveness of the proposed algorithms under various experimental settings.
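As a rough illustration of the definitions above, the sketch below shows an exact distance-based range search and a Monte Carlo estimate of the qualification probability for one uncertain object. It is only the baseline the abstract compares against, not the paper's six-case evaluation. The uniform-disc uncertainty model, the exact (non-uncertain) query issuer, and the names `range_search` and `mc_qualify_prob` are assumptions.

```python
import math
import random

def range_search(objects, query, threshold):
    """Exact case: ids of objects whose distance to the query issuer
    is less than or equal to the threshold."""
    return [oid for oid, pos in objects.items()
            if math.dist(pos, query) <= threshold]

def mc_qualify_prob(center, radius, query, threshold, samples=20000, seed=0):
    """Monte Carlo estimate of P(dist(object, query) <= threshold) for an
    object assumed to be uniformly distributed in a disc."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Draw a point uniformly from the disc (sqrt keeps the density uniform).
        r = radius * math.sqrt(rng.random())
        t = 2.0 * math.pi * rng.random()
        p = (center[0] + r * math.cos(t), center[1] + r * math.sin(t))
        if math.dist(p, query) <= threshold:
            hits += 1
    return hits / samples

exact = range_search({"a": (0, 0), "b": (3, 4), "c": (10, 0)}, (0, 0), 6)
# An object spread over a disc of radius 2 centred on the issuer, with
# threshold 1, qualifies with probability close to the area ratio 1/4.
prob = mc_qualify_prob((0.0, 0.0), 2.0, (0.0, 0.0), 1.0)
```

A probabilistic query would then keep the objects whose estimated probability is at least the required confidence.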
Funding: Supported by the Graduate Student Innovation Project of China University of Petroleum (East China) in 2020 (No. YCX2020054), the National Natural Science Foundation of China (Nos. 21876206, 21505157), the Key Fundamental Research Fund of Shandong Province (ZR2020ZD13), and the Youth Innovation and Technology Projects of Universities in Shandong Province (2020KJC007, ZR2020MB064).
Abstract: In this study, we developed a microfluidic paper analysis device (μPAD) for distance-based detection of Ag^(+) in water. The μPAD was manufactured by wax printing on filter paper. A layer of gold nanoparticles (AuNPs) was then deposited and ascorbic acid was printed on the channel. During detection, Ag^(+) was reduced by ascorbic acid and coated onto the surface of the AuNPs on the channel, forming Au@Ag core/shell nanoparticles. Owing to capillary flow, different concentrations of Ag^(+) produced color ribbons of different lengths. Quantitative detection of Ag^(+) can thus be achieved by measuring the length of the color ribbon. The detection limit of this method was as low as 1 mg·L^(-1) within 15 min, and the interference of common metal ions in water could be eliminated. In conclusion, this method moves from colorimetry to direct reading, offering fast readout and easy operation at low cost.
Funding: Supported by the National High Technology Research and Development 863 Program of China under Grant No. 2007AA01Z404, the National Research Foundation for the Doctoral Program of Higher Education of China under Grant No. 20103218110017, the Science & Technology Pillar Program of Jiangsu Province of China under Grant No. BE2008135, and the Postdoctoral Science Foundation of China under Grant No. 20100481133.
Abstract: Data obtained from the real world are imprecise or uncertain because of the limited accuracy of positioning devices, updating protocols, or characteristics of applications. In addition, users sometimes prefer to express their requests qualitatively with vague conditions, and in some applications different parts of the search region are not equally important. We address the problem of efficiently processing fuzzy range queries for uncertain moving objects whose whereabouts over time are not known exactly. The basic form of such a query is: find the objects that are always/sometimes near the query issuer, with a qualifying guarantee no less than a given threshold, during a given temporal interval. We model the location uncertainty of moving objects with probability density functions and describe the indeterminate boundary of the query range with a fuzzy set. We present the evaluation of the qualifying guarantee of objects and propose pruning techniques based on the α-cut of the fuzzy set to shrink the search space efficiently. We also design rules to reject non-qualifying objects and validate qualifying objects, in order to avoid unnecessary and costly numeric integrations in the refinement step. An extensive empirical study has been conducted to demonstrate the efficiency and effectiveness of algorithms under various experimental
Funding: Supported by the Shaanxi Key Laboratory of Complex System Control and Intelligent Information Processing, Xi'an University of Technology, under Grant No. 2016CP01; the Xi'an Science and Technology Plan Projects under Grant No. NC1504(2); the National Natural Science Foundation of China under Grant No. 31101075; the National High Technology Research and Development Program of China (863 Program) under Grant No. 2013AA10230402; the Natural Science Fundamental Research Plan of Shaanxi Province under Grant No. 2016JM6038; and the Fundamental Research Funds for the Central Universities, NWSUAF, China, under Grant No. 2452015060.
Abstract: A hyper-spectral image contains both spectral and spatial information, which increases the capability and precision of object classification. Despite the value of hyper-spectral imaging for classification in various applications, users often find it difficult to apply effectively in practice because of the effects of light, temperature, and wind in outdoor environments. This research presents a new classification model for outdoor farmland objects based on near-infrared (NIR) hyper-spectral images. It involves two steps: region of interest (ROI) acquisition and the establishment of classifiers. First, a distance-based method for quantitative analysis was proposed to optimize the reference pixels in ROI acquisition. Then maximum likelihood (ML) and support vector machine (SVM) classifiers were used for farmland object classification. With the proposed method, the total classification accuracy based on the reference pixels was over 97.5%, and the SVM-M model reached 99.5%. The research provides an effective method for outdoor farmland image classification.
Funding: Supported by the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01A213).
Abstract: α-diversity describes species diversity at local scales. The Simpson's and Shannon-Wiener indices are widely used to characterize α-diversity based on species abundances within a fixed study site (e.g., a quadrat or plot). Although such indices provide overall diversity estimates that can be analyzed, their values are not spatially continuous, nor are they applicable in theory to any point within the study region, so they cannot be treated as spatial covariates in analyses of other variables. Herein, we extended the Simpson's and Shannon-Wiener indices to create point estimates of α-diversity for any location, based on spatially explicit species occurrences within different bandwidths (i.e., radii, with the location of interest as the center). For an arbitrary point in the study region, species occurrences within the circle defined by the bandwidth were weighted according to their distance from the center using a tri-cube kernel function, with occurrences closer to the center receiving greater weight than more distant ones. These novel kernel-based α-diversity indices were tested using a tree dataset from a 400 m × 400 m study region comprising a 200 m × 200 m core region surrounded by a 100-m-wide buffer zone. Our newly extended α-diversity indices did not disagree qualitatively with the traditional indices, and the former were lower than the latter by less than 2% at medium and large bandwidths. The present work demonstrates the feasibility of using kernel-based α-diversity indices to estimate diversity at any location in the study region and allows them to be used as quantifiable spatial covariates or predictors for other dependent variables of interest in future ecological studies. Spatially continuous α-diversity indices are useful for comparing and monitoring species trends in space and time, which is valuable for conservation practitioners.
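The kernel weighting described above can be sketched as follows. The abstract does not give the exact formulas, so this sketch assumes that kernel-weighted proportions simply replace the abundance proportions in the usual Simpson (1 − Σp²) and Shannon-Wiener (−Σp ln p) indices. The function names and the `(x, y, species)` input layout are hypothetical.

```python
import math
from collections import defaultdict

def tricube(d, h):
    """Tri-cube kernel: weight 1 at the center, falling to 0 at distance h."""
    if d >= h:
        return 0.0
    return (1.0 - (d / h) ** 3) ** 3

def kernel_diversity(point, occurrences, bandwidth):
    """Return (Simpson, Shannon-Wiener) at `point` from occurrences given as
    (x, y, species) tuples, or None if nothing lies inside the bandwidth."""
    weights = defaultdict(float)
    for x, y, species in occurrences:
        w = tricube(math.dist(point, (x, y)), bandwidth)
        if w > 0.0:
            weights[species] += w
    total = sum(weights.values())
    if total == 0.0:
        return None
    props = [w / total for w in weights.values()]
    simpson = 1.0 - sum(p * p for p in props)
    shannon = -sum(p * math.log(p) for p in props)
    return simpson, shannon

# Two species at equal distance from the point get equal weight.
trees = [(1.0, 0.0, "oak"), (-1.0, 0.0, "pine"), (50.0, 50.0, "birch")]
simpson, shannon = kernel_diversity((0.0, 0.0), trees, bandwidth=10.0)
```

Evaluating `kernel_diversity` on a grid of points would give the spatially continuous surface that the abstract proposes to use as a covariate.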
Funding: Projects (61304198, 61374195) supported by the National Natural Science Foundation of China; Projects (2013M530159, 2014T70351) supported by the China Postdoctoral Science Foundation.
Abstract: Congestion pricing is an important component of urban intelligent transport systems. The efficiency, equity, and environmental impacts associated with road pricing schemes are key issues that should be considered before such schemes are implemented. This paper focuses on cordon-based pricing with distance tolls, where the tolls are determined by a nonlinear function of a vehicle's travel distance within a cordon, termed the toll charge function. The optimal tolls can give rise to: 1) higher total social benefits, 2) better levels of equity, and 3) reduced environmental impacts (e.g., lower emissions). First, a deterministic user equilibrium (DUE) model with elastic demand is presented to evaluate any given toll charge function. Because the distance tolls are non-additive, a modified path-based gradient projection algorithm is developed to solve the DUE model. Then, to quantify the equity level of each toll charge function, the Gini coefficient is adopted to measure equity across the entire transport network based on the equilibrium flows. The total emission level is used to reflect the impact of distance tolls on the environment. With these two indexes/measurements for the efficiency, equity, and environmental issues, together with the DUE model, a multi-objective bi-level programming model is developed to determine the optimal distance tolls. The multi-objective model is converted to a single-level model using goal programming, and a genetic algorithm (GA) is adopted to find solutions. Finally, a numerical example is presented to verify the methodology.
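For readers unfamiliar with the equity measure, the sketch below computes a standard Gini coefficient from a list of non-negative values. The abstract does not specify exactly which network quantity the coefficient is applied to, so the per-traveller cost figures used here are purely hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values: 0 means perfect equality,
    and (n - 1) / n is reached when a single entry holds everything."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the ascending-sorted values.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1) / n

# Hypothetical generalized travel costs under two candidate toll functions.
equal_costs = gini([4.0, 4.0, 4.0, 4.0])
skewed_costs = gini([0.0, 0.0, 0.0, 16.0])
```

In a bi-level setting such as the one described, a value like this would be recomputed from the lower-level equilibrium each time the upper level proposes a new toll charge function.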
Abstract: The primary objective of the study was to determine whether a distance-based educational intervention would result in positive health outcomes for persons with both diabetes mellitus (DM) and cognitive impairment. Older adults with Type 2 diabetes who also have cognitive impairment, such as Mild Cognitive Impairment (MCI) or early-stage dementia, are both challenged and at risk when attempting to live independently. The combination of DM and MCI or early-stage dementia often severely constrains the ability to monitor blood glucose levels, diet, and exercise regimens effectively. We describe an exploratory study funded by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) in which Certified Diabetic Educators (CDEs) were linked with 40 older adults with DM and cognitive impairment using iPads and the internet. CDEs presented personalized education sessions to participants, and 18 of the participants also received a cognitive intervention called Spaced Retrieval (SR), which is designed to train the effective use of strategies to enhance medication compliance and reach other goals. Blood glucose and cholesterol measures were assessed at baseline and at 2, 4, and 6 months post intervention. Hemoglobin A1c (HbA1c) levels initially declined from baseline after treatment but returned to baseline levels after 6 months. For low-density lipoprotein (LDL) cholesterol, a significant Group × Time interaction was found: LDL levels increased from baseline after treatment in the control group but declined from baseline in the SR group. Goals that were initially learned were, in general, retained at short-term follow-up, and self-efficacy increased significantly after training. The results show the need for follow-up and support after initial treatment, as well as the need to determine whether the effects produced by SR can be replicated and sustained with continued contact.
Funding: Sponsored by the National Natural Science Foundation of China (Grant Nos. 61374195 and 71501038), the Fundamental Research Funds for the Central Universities (Grant No. 2242015R30036), and the Natural Science Foundation of Jiangsu Province in China (Grant No. BK20150603).
Abstract: To address the optimal distance toll design problem for cordon-based congestion pricing while incorporating equity, this paper presents a toll user equilibrium (TUE) model, based on a transformed network with elastic demand, to evaluate any given toll charge function. A bi-level programming model is developed to determine the optimal toll levels, with the TUE represented at the lower level. The upper level optimizes the total equity level over the transport network, represented by the Gini coefficient, subject to a constraint on the total travel impedance of each OD pair after the levy. A genetic algorithm (GA) is implemented to solve the bi-level model, which is verified by a numerical example.
Funding: This research is supported by the University Program of Advanced Research (UPAR) and UAEU-AUA grants of United Arab Emirates University (UAEU) via Grant No. G00003271 and Grant No. G00003461.
Abstract: Silica has three major crystalline varieties. Quartz is the main and most abundant constituent of the Earth's crust, while the other varieties are formed by heating quartz. Silica quartz has a rich chemical structure with many notable properties. Any chemical network or structure can be transformed into a graph, where atoms become vertices and bonds become edges between vertices. This makes a complex network easier to visualize and work with. Graph theory offers many concepts for studying chemical structures, and the resolvability parameters of a graph are an advanced and widely applicable topic. Resolvability parameters give a graph a unique representation: each vertex or edge is uniquely identified by its distances to a selected set of vertices, which depends on the distance pattern of the particular graph. We study several resolvability parameters of SiO2 quartz. We computed the resolving set for the quartz structure and its variants, and we proved that all the variants of resolvability parameters of quartz structures are constant and do not depend on the order of the graph.
Abstract: The value difference metric (VDM) is one of the best-known and most widely used distance functions for nominal attributes. This work applies the instance weighting technique to improve VDM and proposes an instance weighted value difference metric (IWVDM). Unlike prior work, IWVDM uses naive Bayes (NB) to find weights for training instances. Because earlier work has shown that there is a close relationship between VDM and NB, some work on NB can be applied to VDM. The weight of a training instance x that belongs to class c is assigned according to the difference between the conditional probability P(c|x) estimated by NB and the true conditional probability P(c|x), and the weight is adjusted iteratively. Compared with previous work, IWVDM reduces the time complexity of finding weights while also improving the performance of VDM. Experimental results on 36 UCI datasets validate the effectiveness of IWVDM.
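The following sketch shows the standard (unweighted) VDM, in which the distance between two values of a nominal attribute is the sum over classes of |P(c|v1) − P(c|v2)|^q. The optional `weights` argument only indicates where per-instance weights would enter the probability estimates. The NB-based iterative weight search of IWVDM is not reproduced here, and the function names are hypothetical.

```python
from collections import defaultdict

def fit_vdm(X, y, weights=None):
    """Estimate P(c | attribute j = v) from training rows X with labels y.
    Each instance contributes its weight (1.0 by default) to the counts."""
    if weights is None:
        weights = [1.0] * len(X)
    classes = sorted(set(y))
    counts = defaultdict(lambda: defaultdict(float))  # (j, v) -> class -> weight
    for row, c, w in zip(X, y, weights):
        for j, v in enumerate(row):
            counts[(j, v)][c] += w
    probs = {}
    for key, by_class in counts.items():
        total = sum(by_class.values())
        if total > 0:
            probs[key] = {c: by_class.get(c, 0.0) / total for c in classes}
    return classes, probs

def vdm_distance(a, b, classes, probs, q=1):
    """VDM distance between two instances; raises KeyError for attribute
    values that never occurred in the training data."""
    return sum(abs(probs[(j, va)][c] - probs[(j, vb)][c]) ** q
               for j, (va, vb) in enumerate(zip(a, b))
               for c in classes)

X = [("red",), ("red",), ("blue",), ("blue",), ("green",), ("green",)]
y = ["A", "A", "B", "B", "A", "B"]
classes, probs = fit_vdm(X, y)
d_red_blue = vdm_distance(("red",), ("blue",), classes, probs)    # 2.0
d_red_green = vdm_distance(("red",), ("green",), classes, probs)  # 1.0
```

Values that predict the same classes end up close together even though the attribute itself has no natural ordering, which is why VDM suits nominal data.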
Funding: Supported by a Grant-in-Aid for Scientific Research (A) (#24240015A).
Abstract: Uncertain data are common because of the increasing use of sensors, radio frequency identification (RFID), GPS, and similar devices for data collection. The causes of uncertainty include measurement limitations, noise, inconsistent supply voltage, and delay or loss of data in transfer. Data uncertainty must be taken into account in order to manage, query, or mine such data. This paper therefore studies the problem of top-k distance-based outlier detection from uncertain data objects. In this work, an uncertain object is modelled by the probability density function of a Gaussian distribution. The naive approach to distance-based outlier detection uses a nested loop, which is very costly because the distance function between two uncertain objects is expensive to evaluate. A populated-cells list (PC-list) approach to outlier detection is therefore proposed. Using the PC-list, the proposed top-k outlier detection algorithm needs to consider only a fraction of the dataset objects and hence quickly identifies candidate objects for the top-k outliers. Two approximate top-k outlier detection algorithms are presented to further increase the efficiency of top-k outlier detection. An extensive empirical study on synthetic and real datasets is also presented to demonstrate the accuracy, efficiency, and scalability of the proposed algorithms.
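The naive nested-loop baseline mentioned above can be sketched as follows. This is not the PC-list algorithm, and the abstract does not state the outlier score. The sketch assumes isotropic 2-D Gaussian objects given as `(mean, sigma)` and ranks them by expected number of neighbours within distance `D`, with each pairwise probability estimated by Monte Carlo sampling. All names are hypothetical.

```python
import math
import random

def prob_within(a, b, D, rng, samples=2000):
    """Monte Carlo estimate of P(||A - B|| <= D) for independent isotropic
    2-D Gaussian objects a = (mean, sigma) and b = (mean, sigma)."""
    (ma, sa), (mb, sb) = a, b
    sigma = math.sqrt(sa * sa + sb * sb)  # per-coordinate std of A - B
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    hits = 0
    for _ in range(samples):
        if math.hypot(rng.gauss(dx, sigma), rng.gauss(dy, sigma)) <= D:
            hits += 1
    return hits / samples

def top_k_outliers(objects, D, k, seed=0):
    """Nested loop over all pairs: the k objects with the fewest expected
    neighbours within distance D are reported as the top-k outliers."""
    rng = random.Random(seed)
    n = len(objects)
    expected = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            p = prob_within(objects[i], objects[j], D, rng)
            expected[i] += p
            expected[j] += p
    return sorted(range(n), key=lambda i: expected[i])[:k]

cluster = [((x, y), 0.2) for x, y in [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]]
objects = cluster + [((100.0, 100.0), 0.2)]
outliers = top_k_outliers(objects, D=5.0, k=1)
```

Every pair requires a sampled probability, which is the quadratic cost that motivates pruning most objects before any expensive distance evaluation.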