Funding: Supported by the Major Program of the Natural Science Foundation of Jiangsu Higher Education Institutions of China under Grant Nos. 19KJA610002 and 19KJB520050, and the National Natural Science Foundation of China under Grant No. 61902270.
Abstract: Author name disambiguation (AND) is a central task in academic search, and it has received increasing attention as the numbers of authors and academic publications grow. To tackle the AND problem, existing studies have proposed various approaches based on different types of information, such as raw document features (e.g., co-authors, titles, and keywords), the fusion feature (e.g., a hybrid publication embedding based on multiple raw document features), the local structural information (e.g., a publication's neighborhood information on a graph), and the global structural information (e.g., interactive information between a node and others on a graph). However, no existing work takes all of the above information into account while taking full advantage of the contribution of each raw document feature to the AND problem. To fill this gap, we propose a novel framework named EAND (Towards Effective Author Name Disambiguation by Hybrid Attention). Specifically, we design a novel feature extraction model, consisting of three hybrid attention mechanism layers, to extract key information from the global and local structural information, which are generated from six similarity graphs constructed based on different similarity coefficients, raw document features, and the fusion feature. Each hybrid attention mechanism layer contains three key modules: a local structural perception, a global structural perception, and a feature extractor. Additionally, a mean absolute error term in the joint loss function is used to introduce the structural information loss of the vector space. Experimental results on two real-world datasets demonstrate that EAND achieves superior performance, outperforming state-of-the-art methods by at least 2.74% in terms of the micro-F1 score and 3.31% in terms of the macro-F1 score.
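The abstract says only that a mean absolute error (MAE) term in the joint loss introduces the structural information loss of the vector space, without giving its exact form. As a rough, hedged illustration (the weighting factor, the distance quantities, and the pair set below are assumptions, not the paper's definitions), such a joint objective could take a form like:

```latex
\mathcal{L} \;=\; \mathcal{L}_{\mathrm{AND}} \;+\; \lambda \cdot \frac{1}{|P|}\sum_{(i,j)\in P}\bigl|\, d_{\mathrm{emb}}(i,j) - d_{\mathrm{graph}}(i,j) \,\bigr|
```

where L_AND is the disambiguation loss, P is a set of publication pairs, d_emb is their distance in the learned vector space, d_graph is a reference distance derived from the similarity graphs, and λ balances the two terms.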
Funding: Supported by the National Natural Science Foundation of China under Grant No. 62272332 and the Major Program of the Natural Science Foundation of Jiangsu Higher Education Institutions of China under Grant No. 22KJA520006.
Abstract: Inductive knowledge graph embedding (KGE) aims to embed unseen entities in emerging knowledge graphs (KGs). Most recent studies of inductive KGE embed unseen entities by aggregating information from their neighboring entities and relations with graph neural networks (GNNs). However, these methods rely on the existing neighbors of unseen entities and suffer from two common problems: data sparsity and feature smoothing. First, the data sparsity problem means that unseen entities usually emerge with few triplets containing insufficient information. Second, the effectiveness of the features extracted from the original KGs degrades when these features are repeatedly propagated to represent unseen entities in emerging KGs, which we term the feature smoothing problem. To tackle the two problems, we propose a novel model entitled Meta-Learning Based Memory Graph Convolutional Network (MMGCN), which consists of three components: 1) a two-layer information transforming module (TITM), developed to effectively transform information from the original KGs to emerging KGs; 2) a hyper-relation feature initializing module (HFIM), proposed to extract type-level features shared between KGs and obtain a coarse-grained representation for each entity with these features; and 3) a meta-learning training module (MTM), designed to simulate few-shot emerging KGs and train the model in a meta-learning framework. Extensive experiments on the few-shot link prediction task for emerging KGs demonstrate the superiority of our proposed model MMGCN over state-of-the-art methods.
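The generic inductive-KGE step that the abstract builds on — representing an unseen entity by aggregating the embeddings of its few neighboring entities and relations — can be sketched in a few lines. This is a minimal illustration of that one-hop aggregation idea only, not MMGCN's TITM/HFIM/MTM modules; all names and the translation-style message function are assumptions.

```python
import numpy as np

def embed_unseen_entity(neighbors, entity_emb, relation_emb, dim=2):
    """Represent an unseen entity by mean-aggregating translated neighbor
    embeddings (a generic one-hop GNN-style step, not MMGCN itself).

    neighbors: list of (neighbor_entity_id, relation_id) pairs
    entity_emb, relation_emb: dicts mapping ids to np.ndarray vectors
    """
    if not neighbors:  # the data sparsity problem: no triplets at all
        return np.zeros(dim)
    messages = [entity_emb[e] + relation_emb[r] for e, r in neighbors]
    return np.mean(messages, axis=0)

# toy usage: an unseen entity connected to two known entities via relation r1
entity_emb = {"e1": np.array([0.1, 0.2]), "e2": np.array([0.4, 0.0])}
relation_emb = {"r1": np.array([0.0, 0.1])}
print(embed_unseen_entity([("e1", "r1"), ("e2", "r1")], entity_emb, relation_emb))
```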
Funding: Project supported by the National Natural Science Foundation of China (No. 51775506), the Zhejiang Provincial Natural Science Foundation of China (No. LY18E050022), the Public Welfare Technology Application Research Project of Zhejiang Province (Nos. LGG19E050022 and 2017C33115), the Zhejiang Provincial Science & Technology Project for Medicine & Health (No. 2018KY878), and the Open Foundation of the Zhejiang Provincial Top Key Discipline of Mechanical Engineering of Hangzhou Dianzi University, China.
Abstract: In maxillofacial surgery, there is a significant need for the design and fabrication of porous scaffolds with customizable bionic structures and mechanical properties suitable for bone tissue engineering. In this paper, we characterize porous Ti6Al4V implants; this alloy is one of the most promising and attractive materials for biomedical applications because its modulus is similar to that of human bone. We describe the mechanical properties of this implant, which we suggest is capable of providing important biological functions for bone tissue regeneration, and we characterize a novel bionic design and fabrication process for porous implants. A design concept of “reducing dimensions and designing layer by layer” was used to construct layered slice and rod-connected mesh structure (LSRCMS) implants. Porous LSRCMS implants with different parameters and porosities were fabricated by selective laser melting (SLM). Printed samples were evaluated by microstructure characterization, their mechanical properties were analyzed by mechanical tests, and finite element analysis was used to calculate the stress characteristics of the LSRCMS under loading forces. Our results show that the samples fabricated by SLM had good printing quality with reasonable pore sizes. The porosity, pore size, and strut thickness of the manufactured samples ranged from (60.95±0.27)% to (81.23±0.32)%, (480±28) to (685±31) μm, and (263±28) to (265±28) μm, respectively. The compression results show that the Young's modulus and the yield strength ranged from (2.23±0.03) to (6.36±0.06) GPa and (21.36±0.42) to (122.85±3.85) MPa, respectively. We also show that the Young's modulus and yield strength of the LSRCMS samples can be predicted by the Gibson-Ashby model. Further, we prove the structural stability of our novel design by finite element analysis. Our results illustrate that our novel SLM-fabricated porous Ti6Al4V scaffolds based on an LSRCMS are a promising material for bone implants and are potentially applicable to the field of bone defect repair.
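For readers unfamiliar with the Gibson-Ashby model mentioned above: in its standard open-cell form it relates the modulus and strength of a porous structure (starred quantities) to those of the solid material through the relative density, with prefactors fitted to measured data. The equations below are the textbook form of the model, not values taken from this paper:

```latex
\frac{E^{*}}{E_{s}} = C_{1}\left(\frac{\rho^{*}}{\rho_{s}}\right)^{2},
\qquad
\frac{\sigma^{*}}{\sigma_{s}} = C_{2}\left(\frac{\rho^{*}}{\rho_{s}}\right)^{3/2},
\qquad
\frac{\rho^{*}}{\rho_{s}} = 1 - \text{porosity}
```

Fitting C1 and C2 to the compression data is what allows a porous scaffold's stiffness and strength to be predicted from its porosity.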
Funding: This work was supported by the National High Technology Research and Development 863 Program of China under Grant No. 2013AA01A603, the Pilot Project of the Chinese Academy of Sciences under Grant No. XDA06010600, and the National Natural Science Foundation of China under Grant No. 61402312.
Abstract: To support the large amount of GPS data generated by various moving objects, back-end servers usually store low-sampling-rate trajectories. Therefore, no precise position information can be obtained directly from the back-end servers, and uncertainty is an inherent characteristic of the spatio-temporal data. How to deal with this uncertainty thus becomes a basic and challenging problem. A lot of research has been conducted on the uncertainty of a moving object in isolation from the context from which it is derived. However, we discover that the uncertainty of moving objects can be efficiently reduced and effectively ranked using context-aware information. In this paper, we focus on context-aware information and propose an integrated framework, Context-Based Uncertainty Reduction and Ranking (CURR), to reduce and rank the uncertainty of trajectories. Specifically, given two consecutive samplings, we aim to infer and rank the possible trajectories in accordance with the information extracted from context. Since some context-aware information can be used to reduce the uncertainty while other context-aware information can be used to rank it, CURR naturally consists of two complementary stages: a reduction stage and a ranking stage. We also implement a prototype system to validate the effectiveness of our solution. Extensive experiments are conducted, and the evaluation results demonstrate the efficiency and high accuracy of CURR.
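To make the core operation concrete — given two consecutive low-sampling-rate samplings, enumerate the possible trajectories between them and rank them — the sketch below enumerates candidate paths on a toy road network and orders them by a placeholder context score. It only illustrates the problem setting; the scoring function, graph, and names are hypothetical, not CURR's actual reduction and ranking algorithms.

```python
from itertools import islice
import networkx as nx

# toy road network; edge weights are travel times between consecutive samplings A and D
G = nx.Graph()
G.add_weighted_edges_from([("A", "B", 2), ("B", "D", 2), ("A", "C", 1), ("C", "D", 4)])

def candidate_trajectories(graph, start, end, k=3):
    """Enumerate up to k candidate paths between two consecutive samplings."""
    return list(islice(nx.shortest_simple_paths(graph, start, end, weight="weight"), k))

def context_score(path):
    """Placeholder for a context-aware score (e.g., agreement with speed limits,
    traffic conditions, or co-moving objects); here shorter travel time scores higher."""
    return -sum(G[u][v]["weight"] for u, v in zip(path, path[1:]))

for path in sorted(candidate_trajectories(G, "A", "D"), key=context_score, reverse=True):
    print(path, context_score(path))
```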
Funding: This research was partially supported by the National Natural Science Foundation of China under Grant No. 61572335 and the Natural Science Foundation of Jiangsu Province of China under Grant No. BK20151223.
Abstract: With the development and prevalence of online social networks, there is an obvious tendency for people to attend and share group activities with friends or acquaintances. This motivates the study of group recommendation, which aims to meet the needs of a group of users instead of only individual users. However, how to aggregate the different preferences of different group members is still a challenging problem: 1) the choice of a member in a group is influenced by various factors, e.g., personal preference, group topic, and social relationship; 2) users have different influences in different groups. In this paper, we propose a generative geo-social group recommendation model (GSGR) to recommend points of interest (POIs) for groups. Specifically, GSGR models personal preference as shaped by geographical information, group topics, and social influence for recommendation. Moreover, when making recommendations, GSGR aggregates the preferences of group members with different weights to estimate the preference score of a group for a POI. Experimental results on two datasets show that GSGR is effective in group recommendation and outperforms state-of-the-art methods.
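The aggregation step described above — combining member preferences with member-specific weights into a group score for a POI — has the following generic form; the notation is ours, and in GSGR the weights are derived from personal preference, group topic, and social influence rather than set by hand:

```latex
s(g, v) \;=\; \sum_{u \in g} w_{u,g}\, s(u, v), \qquad \sum_{u \in g} w_{u,g} = 1
```

where s(u, v) is member u's predicted preference for POI v and w_{u,g} is u's influence weight within group g.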
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61872258, 61772356, 61876117, and 61802273, and the Priority Academic Program Development of Jiangsu Higher Education Institutions of China.
Abstract: Entity linking is a new technique in recommender systems that links users' interaction behaviors in different domains in order to improve the performance of the recommendation task. Linking-based cross-domain recommendation aims to alleviate the data sparsity problem by utilizing domain-sharable knowledge from auxiliary domains. However, existing methods fail to prevent domain-specific features from being transferred, resulting in suboptimal results. In this paper, we address this issue by proposing an adversarial transfer learning based model, ATLRec, which effectively captures domain-sharable features for cross-domain recommendation. In ATLRec, we leverage adversarial learning to generate representations of user-item interactions in both the source and the target domains such that a discriminator cannot identify which domain they belong to, thereby obtaining domain-sharable features. Meanwhile, each domain learns its domain-specific features with a private feature extractor. The recommendation in each domain considers both domain-specific and domain-sharable features. We further adopt an attention mechanism to learn the item latent factors of both domains by utilizing the shared users with interaction history, so that the representations of all items can be learned sufficiently in a shared space, even when few or no items are shared by the different domains. In this way, we can represent all items from the source and the target domains in a shared space, which better links items in different domains and captures cross-domain item-item relatedness to facilitate the learning of domain-sharable knowledge. The proposed model is evaluated on various real-world datasets and demonstrated to outperform several state-of-the-art single-domain and cross-domain recommendation methods in terms of recommendation accuracy.
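The adversarial mechanism described above — a discriminator tries to tell which domain a shared user-item representation came from, while the shared extractor is trained to make the two domains indistinguishable — is commonly implemented with a gradient reversal layer. The PyTorch sketch below illustrates that standard setup under assumed dimensions and names; it is not ATLRec's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates gradients in the backward pass,
    so the shared extractor learns to fool the domain discriminator."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

shared_extractor = nn.Sequential(nn.Linear(64, 32), nn.ReLU())  # domain-sharable features
discriminator = nn.Linear(32, 2)                                # predicts: source or target domain

def adversarial_loss(interaction_repr, domain_labels):
    """interaction_repr: (batch, 64) user-item interaction representations;
    domain_labels: (batch,) with 0 = source domain, 1 = target domain."""
    shared = shared_extractor(interaction_repr)
    domain_logits = discriminator(GradReverse.apply(shared))
    return F.cross_entropy(domain_logits, domain_labels)

# toy usage
loss = adversarial_loss(torch.randn(8, 64), torch.randint(0, 2, (8,)))
print(loss.item())
```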
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2018AAAO10190, the Natural Science Foundation of Jiangsu Province of China under Grant No. BK20191420, the National Natural Science Foundation of China under Grant No. 61632016, the Natural Science Research Project of Jiangsu Higher Education Institution under Grant No. 17KJA520003, the Priority Academic Program Development of Jiangsu Higher Education Institutions, and the Suda-Toycloud Data Intelligence Joint Laboratory.
Abstract: Entity linking (EL) is the task of determining the identity of textual entity mentions given a predefined knowledge base (KB). Plenty of existing efforts tackle this task using either "local" information (contextual information of the mention in the text) or "global" information (relations among candidate entities). However, either local or global information may be insufficient, especially when the given text is short. To obtain richer local and global information for entity linking, we propose to enrich the context information of mentions by retrieving extra contexts from the web through web search engines (WSE). Based on this intuition, two novel attempts are made. The first adds web-searched results to an embedding-based method to expand the mention's local information, where we try two different ways to generate high-quality web contexts: one applies the attention mechanism and the other uses an abstract extraction method. The second uses the web contexts to extend the global information, i.e., finding and utilizing more relevant mentions from the web contexts with a graph-based model. Finally, we combine the two proposed models to use both the extended local and global information from the extra web contexts. Our empirical study on six real-world datasets shows that using extra web contexts to extend the local and the global information can effectively improve the F1 score of entity linking.
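The abstract says the two models are combined but not how. Purely as an illustrative reading — the interpolation form, the symbols, and the hyper-parameter α are assumptions, not the paper's actual combination — one common way to merge such scores for a mention m and candidate entity e is a weighted mixture:

```latex
\mathrm{score}(m, e) \;=\; \alpha\, \phi_{\mathrm{local}}(m, e) \;+\; (1-\alpha)\, \phi_{\mathrm{global}}(e),
\qquad
\hat{e} \;=\; \arg\max_{e \in C(m)} \mathrm{score}(m, e)
```

where φ_local scores the candidate against the web-expanded mention context, φ_global comes from the graph-based model over the extra mentions, and C(m) is the candidate set from the KB.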
Funding: This work is supported by the National Natural Science Foundation of China under Grant No. 61572335 and the Natural Science Foundation of Jiangsu Province of China under Grant No. BK20151223.
Abstract: With the popularity of storing large data graphs in the cloud, subgraph pattern matching on a remote cloud has emerged. Typically, subgraph pattern matching is defined in terms of subgraph isomorphism, which is an NP-complete problem and sometimes too strict to find useful matches in certain applications. Moreover, how to protect the privacy of data graphs in subgraph pattern matching without undermining the matching results is an important concern. Thus, we propose a novel framework to achieve privacy-preserving subgraph pattern matching in the cloud. In order to protect the structural privacy of data graphs, we first develop a k-automorphism model based method. Additionally, we use a cost-model based label generalization method to protect label privacy in both data graphs and pattern graphs. During the generation of the k-automorphic graph, a large number of noise edges or vertices might be introduced to the original data graph. Thus, we use the outsourced graph, which is only a subset of the k-automorphic graph, to answer the subgraph pattern matching. The efficiency of the pattern matching process can be greatly improved in this way. Extensive experiments on real-world datasets demonstrate the high efficiency of our framework.
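For concreteness, the baseline semantics mentioned above — subgraph pattern matching defined by subgraph isomorphism — can be checked directly with NetworkX's VF2 matcher, as in the toy snippet below. This only illustrates the (non-private) matching problem itself, not the k-automorphism or label-generalization techniques of the framework.

```python
import networkx as nx
from networkx.algorithms import isomorphism

# toy data graph G and pattern graph P (a triangle)
G = nx.Graph([("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")])
P = nx.Graph([(1, 2), (2, 3), (3, 1)])

matcher = isomorphism.GraphMatcher(G, P)
print(matcher.subgraph_is_isomorphic())          # True: G contains a triangle
for mapping in matcher.subgraph_isomorphisms_iter():
    print(mapping)                               # data-graph node -> pattern node
```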
Abstract: We are delighted to present this special section of the Journal of Computer Science and Technology on "Spatio-Temporal Big Data Analytics". The fast development of the mobile Internet has given rise to an extremely large volume of spatio-temporal data. These data contain rich information about both individuals and groups, and are thus invaluable for traffic control, route planning, urban planning, and many other intelligent applications. Spatio-temporal big data analytics deals with managing and making sense of large amounts of spatio-temporal data so as to provide actionable insights at the right time.
Funding: This work was supported by the National Natural Science Foundation of China under Grant Nos. 61572335, 61472263, 61402312, and 61402313, the Natural Science Foundation of Jiangsu Province of China under Grant No. BK20151223, and the Collaborative Innovation Center of Novel Software Technology and Industrialization, Jiangsu, China.
Abstract: A point of interest (POI) is a specific point location that someone may find useful. With the development of urban modernization, a large number of functional organized POI groups (FOPGs), such as shopping malls, electronic malls, and snack streets, are springing up in cities, and they have a great influence on people's lives. We aim to discover functional organized POI groups for spatial keyword recommendation, because FOPG-based recommendation is superior to POI-based recommendation in efficiency and flexibility. To discover FOPGs, we design clustering algorithms to obtain organized POI groups (OPGs) and utilize an OPGs-LDA (Latent Dirichlet Allocation) model to reveal the functions of OPGs for further recommendation. To the best of our knowledge, we are the first to study functional organized POI groups, which have important applications in urban planning and social marketing.
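The pipeline the abstract describes — spatially cluster POIs into organized POI groups (OPGs), then apply an LDA-style topic model over the POI categories in each group to reveal its function — can be sketched with off-the-shelf components as below. The specific algorithms (DBSCAN, scikit-learn's LDA), parameter values, and variable names are illustrative assumptions, not the paper's actual clustering algorithms or OPGs-LDA model.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# toy POIs: (longitude, latitude) and a category tag
coords = np.array([[120.610, 31.300], [120.612, 31.301], [120.620, 31.310], [120.621, 31.309]])
categories = ["restaurant", "cafe", "phone_shop", "laptop_shop"]

# 1) spatial clustering -> organized POI groups (OPGs)
labels = DBSCAN(eps=0.005, min_samples=2).fit_predict(coords)

# 2) one "document" of category words per OPG
docs = {}
for label, cat in zip(labels, categories):
    if label != -1:                       # skip noise points
        docs.setdefault(label, []).append(cat)
corpus = [" ".join(words) for words in docs.values()]

# 3) LDA over OPG documents -> functional topic mixture per group
counts = CountVectorizer().fit_transform(corpus)
topics = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(counts)
print(topics)                             # each row: an OPG's topic (function) mixture
```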
Abstract: Entity matching (EM) identifies records referring to the same entity within or across databases. Existing methods using structured attribute values (such as digital, date, or short string values) may fail when the structured information is not enough to reflect the matching relationships between records. Nowadays, more and more databases have an unstructured textual attribute containing extra consolidated textual information (CText) about each record, but little work has been done on using CText for EM. Conventional string similarity metrics such as edit distance or bag-of-words are unsuitable for measuring the similarities between pieces of CText, since each piece of CText contains hundreds or thousands of words, while existing topic models cannot work well because there are no obvious gaps between topics in CText. In this paper, we propose a novel co-occurrence-based topic model to identify various sub-topics from each piece of CText, and then measure the similarity between pieces of CText on the multiple sub-topic dimensions. To avoid ignoring hidden but important sub-topics, we let the crowd help us decide the weights of different sub-topics when doing EM. Our empirical study on two real-world datasets, based on the Amazon Mechanical Turk crowdsourcing platform, shows that our method outperforms state-of-the-art EM methods and text understanding models.
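The matching score sketched by the abstract — compare two pieces of CText along multiple sub-topic dimensions, with crowd-supplied weights deciding how much each sub-topic matters — can be written compactly as below; the notation and the choice of a per-sub-topic cosine similarity are our illustrative assumptions, not the paper's exact measure:

```latex
\mathrm{sim}(r_1, r_2) \;=\; \sum_{k=1}^{K} w_k \,\cos\!\bigl(\theta_{1,k}, \theta_{2,k}\bigr),
\qquad \sum_{k=1}^{K} w_k = 1,\; w_k \ge 0
```

where θ_{i,k} is record r_i's representation on sub-topic k and the weights w_k are elicited from the crowd.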