Abstract: A peak norm is defined for Lp spaces of E-valued Bochner integrable functions, where E is a Banach space, and best approximations from a sun to elements of the space are characterized. Applications are given to some families of simultaneous best approximation problems.
Funding: Project (No. [2005]555) supported by the Hi-Tech Research and Development Program (863) of China
Abstract: Various index structures have recently been proposed to facilitate high-dimensional KNN queries, among which the techniques of approximate vector representation and one-dimensional (1D) transformation can mitigate the curse of dimensionality. Based on these two techniques, a novel high-dimensional index is proposed, called the Bit-code and Distance based index (BD). BD is based on a special partitioning strategy that is optimized for high-dimensional data. Using the definitions of the bit code and the transformation function, a high-dimensional vector is first approximately represented and then transformed into a 1D value, which serves as the key managed by a B+-tree. A new KNN search algorithm is also proposed that exploits the bit code and distance to prune the search space more effectively. Results of extensive experiments using both synthetic and real data demonstrate that BD outperforms the existing index structures for KNN search in high-dimensional spaces.
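The 1D transformation mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give BD's partitioning strategy or transformation function, so the code below assumes an iDistance-style key (partition number times a stride, plus the distance to that partition's reference point) and uses a sorted list with binary search as a stand-in for the B+-tree. The names `OneDimIndex`, `stride`, and `range_query` are hypothetical.

```python
import bisect
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class OneDimIndex:
    """Sorted (key, id) list standing in for a B+-tree (assumption)."""

    def __init__(self, points, refs, stride):
        # stride must exceed every point-to-reference distance so that
        # the key ranges of different partitions do not overlap.
        self.points, self.refs, self.stride = points, refs, stride
        entries = []
        for pid, p in enumerate(points):
            # Assign the point to its nearest reference point.
            part = min(range(len(refs)), key=lambda i: dist(p, refs[i]))
            entries.append((part * stride + dist(p, refs[part]), pid))
        entries.sort()
        self.keys = [k for k, _ in entries]
        self.ids = [pid for _, pid in entries]

    def range_query(self, q, r, eps=1e-9):
        """Return ids of all points within distance r of q."""
        found = set()
        for part, ref in enumerate(self.refs):
            dq = dist(q, ref)
            # Triangle inequality: a hit p satisfies
            # dq - r <= dist(p, ref) <= dq + r.
            lo = part * self.stride + max(0.0, dq - r - eps)
            hi = part * self.stride + min(self.stride, dq + r + eps)
            start = bisect.bisect_left(self.keys, lo)
            stop = bisect.bisect_right(self.keys, hi)
            for pid in self.ids[start:stop]:
                # Candidates are verified against the full vectors.
                if dist(self.points[pid], q) <= r:
                    found.add(pid)
        return sorted(found)
```

Only the key interval selected by the triangle inequality is scanned in each partition. A KNN search can be built on top of such a range query by enlarging `r` until k verified results are found.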
Funding: Sponsored by the National High Technology Research and Development Program of China (863 Program) (Grant No. [2005]555)
Abstract: To facilitate high-dimensional KNN queries, an optimal index named Bit-Code based iDistance (BC-iDistance) is proposed, based on the techniques of approximate vector representation and one-dimensional transformation. To overcome the heavy information loss that iDistance incurs in its one-dimensional transformation, BC-iDistance adopts a novel representation that compresses a d-dimensional vector into a two-dimensional vector, using a bit code and a one-dimensional distance to reflect, respectively, the location and the similarity of a data point relative to its reference point. Built on the classical B+-tree, this representation realizes a two-level pruning process and allows a single index structure to be used, which further speeds up processing. Experimental evaluations using synthetic and real data demonstrate that BC-iDistance outperforms iDistance and sequential scan for KNN search in high-dimensional spaces.
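The two-level pruning idea can be sketched as follows. The abstract does not define the bit code, so this example assumes one bit per dimension that records whether a coordinate lies at or above the reference point's coordinate. Under that assumption, every dimension where the codes of the query and a data point disagree contributes at least the squared gap between the query and the reference point to their squared distance, which gives a lower bound. For brevity the sketch scans the compressed table sequentially instead of using a B+-tree. The names `bit_code`, `build`, and `knn` are hypothetical.

```python
import heapq
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bit_code(p, ref):
    """Assumed code: bit j is 1 if p[j] >= ref[j], else 0."""
    return tuple(1 if x >= c else 0 for x, c in zip(p, ref))


def build(points, refs):
    """Compress each point to (partition, bit code, distance, id)."""
    table = []
    for pid, p in enumerate(points):
        part = min(range(len(refs)), key=lambda i: dist(p, refs[i]))
        ref = refs[part]
        table.append((part, bit_code(p, ref), dist(p, ref), pid))
    return table


def knn(points, refs, table, q, k):
    """Exact KNN; returns (sorted (distance, id) list, full distance count)."""
    q_dist = [dist(q, ref) for ref in refs]
    q_code = [bit_code(q, ref) for ref in refs]
    best = []  # max-heap of the k best so far, stored as (-distance, id)
    full = 0
    for part, code, d_ref, pid in table:
        radius = -best[0][0] if len(best) == k else math.inf
        # Level 1: the triangle inequality on the 1D distance.
        if abs(d_ref - q_dist[part]) > radius:
            continue
        # Level 2: lower bound from dimensions whose bits disagree.
        ref = refs[part]
        lb2 = sum((q[j] - ref[j]) ** 2
                  for j in range(len(q)) if code[j] != q_code[part][j])
        if math.sqrt(lb2) > radius:
            continue
        full += 1
        d = dist(points[pid], q)
        if len(best) < k:
            heapq.heappush(best, (-d, pid))
        elif d < radius:
            heapq.heapreplace(best, (-d, pid))
    return sorted((-nd, pid) for nd, pid in best), full
```

Both tests use only the compressed pair, so the full d-dimensional vector is read only for candidates that survive both levels. The second value returned by `knn` counts those full distance computations.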
Funding: Supported by the National Natural Science Foundation of China (No. 61071093), the Research and Innovation Projects for Graduates of Jiangsu Province (Nos. CXZZ12 0483 and CXLX12 0481), the Science and Technology Support Program of Jiangsu Province (No. BE2012849), and the Priority Academic Program Development of Jiangsu Higher Education Institutions (No. yx002001)
Abstract: Current cloud computing systems use simple key-value data processing, which cannot support similarity search effectively because efficient index structures are lacking, and as dimensionality increases, the existing tree-like index structures suffer from "the curse of dimensionality". In this paper, a novel VF-CAN indexing scheme is proposed. VF-CAN integrates a content addressable network (CAN) based routing protocol with an improved vector approximation file (VA-file) index. The scheme has two index levels: a global index and a local index. The local index, VAK-file, is built for the data in each storage node. VAK-file is the k-means clustering result of the VA-file approximation vectors according to their degree of proximity. Each cluster forms a separate local index file, and each file stores the approximation vectors contained in the cluster. The vector of each cluster center is stored in the cluster center information file of the corresponding storage node. In the global index, storage nodes are organized into a CAN overlay network, and to reduce the cost of calculation, only the clustering information of the local index is published to the entire overlay network through the CAN interface. The experimental results show that VF-CAN reduces the index storage space and improves query performance effectively.
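The local index construction described in this abstract can be sketched under stated assumptions. The abstract does not specify the quantization or clustering details, so the example assumes a uniform grid of 2**bits cells per dimension for the VA-file approximation, plain Lloyd's k-means seeded with the first k approximation vectors, and in-memory lists in place of index files. CAN routing is not modeled: publishing only the cluster centers is represented by returning them separately. All function names are hypothetical.

```python
def va_approx(vec, bits, lo=0.0, hi=1.0):
    """Quantize each coordinate into one of 2**bits equal-width cells."""
    cells = 1 << bits
    width = (hi - lo) / cells
    return tuple(min(cells - 1, max(0, int((x - lo) / width))) for x in vec)


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(v, centers):
    """Index of the center closest to v."""
    return min(range(len(centers)), key=lambda c: sq_dist(v, centers[c]))


def kmeans(vectors, k, iters=20):
    """Plain Lloyd's k-means seeded with the first k vectors."""
    centers = [[float(x) for x in v] for v in vectors[:k]]
    for _ in range(iters):
        assign = [nearest(v, centers) for v in vectors]
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    # Final assignment against the final centers.
    return centers, [nearest(v, centers) for v in vectors]


def build_local_index(vectors, bits, k):
    """Cluster the approximations; each cluster acts as one local index file."""
    approx = [va_approx(v, bits) for v in vectors]
    centers, assign = kmeans(approx, k)
    files = {c: [] for c in range(k)}
    for vid, (a, c) in enumerate(zip(approx, assign)):
        files[c].append((vid, a))
    # Only `centers` would be published to the global index.
    return centers, files


def candidate_files(query, centers, bits, n_probe=1):
    """Pick the n_probe local files whose centers are closest to the query."""
    qa = va_approx(query, bits)
    order = sorted(range(len(centers)), key=lambda c: sq_dist(qa, centers[c]))
    return order[:n_probe]
```

A query then reads only the selected files rather than every approximation vector on the node. Note that probing a limited number of clusters makes the search approximate unless additional distance bounds are checked.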