Funding: Supported by the National Natural Science Foundation of China (No. 61532004).
Abstract: The Internet-based cyber-physical world has profoundly changed the information environment for the development of artificial intelligence (AI), bringing a new wave of AI research and promoting it into the new era of AI 2.0. As one of the most prominent characteristics of research in the AI 2.0 era, crowd intelligence has attracted much attention from both industry and research communities. Specifically, crowd intelligence provides a novel problem-solving paradigm that gathers the intelligence of crowds to address challenges. In particular, due to the rapid development of the sharing economy, crowd intelligence has not only become a new approach to solving scientific challenges, but has also been integrated into all kinds of application scenarios in daily life, e.g., online-to-offline (O2O) applications, real-time traffic monitoring, and logistics management. In this paper, we survey existing studies of crowd intelligence. First, we describe the concept of crowd intelligence and explain its relationship to existing related concepts, e.g., crowdsourcing and human computation. Then, we introduce four categories of representative crowd intelligence platforms. We summarize three core research problems and the state-of-the-art techniques of crowd intelligence. Finally, we discuss promising future research directions of crowd intelligence.
Funding: This work is supported by the National Key Research and Development Program of China under Grant No. 2016YFB1000105 and the National Natural Science Foundation of China under Grant Nos. 61272154 and 61421091.
Abstract: Cloud computing has been widely adopted by enterprises because of its on-demand and elastic resource usage paradigm. Currently, most cloud applications run on a single cloud. However, more and more applications need to run across several clouds to satisfy requirements such as best cost efficiency, avoidance of vendor lock-in, and geolocation-sensitive service. JointCloud computing is a new research effort initiated by Chinese institutes to address the computing issues concerned with multiple clouds. In JointCloud, users' diverse and dynamic requirements on cloud resources are satisfied by providing users with a virtual cloud (VC) for special purposes. A virtual cloud for special purposes is in essence a user's specific cloud working environment with customized software stacks, configurations, and computing resources readily available. This paper first introduces what JointCloud computing is and then describes the design rationales, motivating examples, mechanisms, and enabling technologies of VC in JointCloud.
Funding: This work was supported by the National Key Research and Development Program of China under Grant No. 2016YFB1000805 and the National Natural Science Foundation of China under Grant Nos. 61432020, 61303064, 61472430 and 61502512.
Abstract: Code reviews in the pull-based model are open to community users on GitHub. Various participants take part in the review discussions, and the review topics concern not only the improvement of code contributions but also project evolution and social interaction. A comprehensive understanding of the review topics in the pull-based model would be useful to better organize the code review process and optimize review tasks such as reviewer recommendation and pull-request prioritization. In this paper, we first conduct a qualitative study on three popular open-source software projects hosted on GitHub and construct a fine-grained two-level taxonomy covering four level-1 categories (code correctness, pull-request decision-making, project management, and social interaction) and 11 level-2 subcategories (e.g., defect detecting, reviewer assigning, contribution encouraging). Second, we conduct a preliminary quantitative analysis on a large set of review comments that were labeled by TSHC (a two-stage hybrid classification algorithm), which automatically classifies review comments by combining rule-based and machine-learning techniques. Through the quantitative study, we explore the typical review patterns. We find that the three projects present similar distributions of comments across the subcategories. Pull-requests submitted by inexperienced contributors tend to contain potential issues even though they have passed the tests. Furthermore, external contributors are more likely to break project conventions in their early contributions.
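The abstract says only that TSHC combines rule-based and machine-learning techniques in two stages; it does not give the rules, the features, or the learner. The following Python sketch shows one plausible arrangement under stated assumptions: hand-written patterns decide first, and a learned scorer handles whatever the rules miss. The patterns in `RULES`, the `TokenScorer` class (a toy token-frequency scorer standing in for the machine-learning stage), and the training examples are all hypothetical; only the three label names come from the abstract.

```python
import re
from collections import Counter

# Hypothetical stage-1 rules: (pattern, label). Not taken from the paper.
RULES = [
    (re.compile(r"\b(lgtm|thanks|thank you|great work)\b", re.I),
     "contribution encouraging"),
    (re.compile(r"@\w+.*\b(review|take a look)\b", re.I),
     "reviewer assigning"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


class TokenScorer:
    """Toy stand-in for the learned stage: scores each label by how many
    of the comment's tokens occurred in that label's training examples."""

    def __init__(self, examples):
        self.counts = {}
        for text, label in examples:
            self.counts.setdefault(label, Counter()).update(tokenize(text))

    def predict(self, text):
        tokens = tokenize(text)

        def score(label):
            counter = self.counts[label]
            total = sum(counter.values()) or 1  # avoid division by zero
            return sum(counter[t] for t in tokens) / total

        return max(self.counts, key=score)


def classify(comment, scorer):
    """Stage 1: return the label of the first matching rule.
    Stage 2: otherwise fall back to the learned scorer."""
    for pattern, label in RULES:
        if pattern.search(comment):
            return label
    return scorer.predict(comment)


# Hypothetical labeled comments for the fallback stage.
scorer = TokenScorer([
    ("this loop has an off by one bug", "defect detecting"),
    ("thanks for the patch", "contribution encouraging"),
])
print(classify("@alice can you review this?", scorer))
print(classify("there is a bug in this loop", scorer))
```

Putting the rules first keeps high-precision patterns cheap to apply and leaves only ambiguous comments to the learned stage; a real system would replace the toy scorer with a trained classifier.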
Funding: Supported by the National Basic Research Program (973) of China (No. 2011CB302601), the National Natural Science Foundation of China (No. 90818028), and the National High-Tech R&D Program (863) of China (No. 2007AA010301).
Abstract: Network virtualization is recognized as an effective way to overcome the ossification of the Internet. However, the virtual network mapping problem (VNMP) is a critical challenge, focusing on how to map virtual networks to the substrate network while using infrastructure resources efficiently. The problem can be divided into two phases: the node mapping phase and the link mapping phase. In the node mapping phase, existing algorithms usually map virtual nodes with a purely greedy strategy, without considering the topology among these virtual nodes, which results in overly long substrate paths (with multiple hops). To address this problem, we propose a topology-aware mapping algorithm, which considers the topology among the virtual nodes. In the link mapping phase, the new algorithm adopts the k-shortest path algorithm. Simulation results show that the new algorithm greatly increases the long-term average revenue, the acceptance ratio, and the long-term revenue-to-cost ratio (R/C).
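To make the link mapping phase concrete, here is a minimal Python sketch, assuming an undirected substrate graph with non-negative link costs. The abstract does not specify which k-shortest path procedure is used or how a path is chosen among the k candidates, so both parts are assumptions: `k_shortest_paths` is a simple best-first enumeration of loop-free paths (not Yen's algorithm, and exponential in the worst case), and `first_feasible_path` picks the cheapest candidate whose every link has enough residual bandwidth. The graph, the bandwidth table, and all function names are hypothetical.

```python
import heapq


def k_shortest_paths(graph, source, target, k):
    """Return up to k loop-free paths from source to target as
    (cost, path) pairs in nondecreasing cost order.

    graph maps node -> {neighbor: non-negative link cost}.
    Partial paths are expanded best-first, so a complete path is
    emitted only when no cheaper partial path remains.
    """
    heap = [(0, [source])]
    found = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            found.append((cost, path))
            continue
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in path:  # keep paths loop-free
                heapq.heappush(heap, (cost + weight, path + [neighbor]))
    return found


def first_feasible_path(graph, bandwidth, source, target, demand, k=3):
    """Assumed selection rule: take the cheapest of the k candidate paths
    whose links all offer at least `demand` residual bandwidth.

    bandwidth maps frozenset({u, v}) -> residual bandwidth of link u-v.
    """
    for _cost, path in k_shortest_paths(graph, source, target, k):
        links = zip(path, path[1:])
        if all(bandwidth[frozenset(link)] >= demand for link in links):
            return path
    return None


# Hypothetical four-node substrate network.
graph = {
    "A": {"B": 1, "C": 2},
    "B": {"A": 1, "C": 1, "D": 3},
    "C": {"A": 2, "B": 1, "D": 1},
    "D": {"B": 3, "C": 1},
}
bandwidth = {
    frozenset(("A", "B")): 10,
    frozenset(("A", "C")): 10,
    frozenset(("B", "C")): 10,
    frozenset(("B", "D")): 10,
    frozenset(("C", "D")): 1,  # bottleneck link
}
paths = k_shortest_paths(graph, "A", "D", 3)
chosen = first_feasible_path(graph, bandwidth, "A", "D", demand=5, k=3)
print(paths, chosen)
```

In this example the two cheapest paths both cross the low-bandwidth link C-D, so a demand of 5 is placed on the third candidate. This illustrates why considering k candidates rather than a single shortest path can raise the acceptance ratio.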
Funding: This work was supported by the National Key Research and Development Program of China under Grant No. 2018YFB1004202 and the National Natural Science Foundation of China under Grant No. 61702534.
Abstract: Communication and coordination between OSS developers who do not work physically in the same location have always been challenging issues. The pull-based development model, as the state-of-the-art collaborative development mechanism, provides high openness and transparency to improve the visibility of contributors' work. However, duplicate contributions may still be submitted by more than one contributor to solve the same problem due to the parallel and uncoordinated nature of this model. If not detected in time, duplicate pull-requests can cause contributors and reviewers to waste time and energy on redundant work. In this paper, we propose an approach combining textual and change similarities to automatically detect duplicate contributions in the pull-based model at submission time. For a newly arriving contribution, we first compute the textual similarity and change similarity between it and the existing contributions. Our method then returns a list of candidate duplicate contributions that are most similar to the new contribution in terms of the combined textual and change similarity. The evaluation shows that 83.4% of the duplicates can be found on average when we use the combined textual and change similarity, compared to 54.8% using only textual similarity and 78.2% using only change similarity.
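The ranking step described above can be sketched in Python as follows. The abstract does not say how either similarity is computed or how the two are combined, so every concrete choice here is an assumption: textual similarity is the cosine of raw term-frequency vectors over title and description, change similarity is the Jaccard overlap of touched file paths, and the two are mixed with a hypothetical weight `w_text`. The pull-request dictionaries and their field names are illustrative as well.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(a, b):
    """Jaccard overlap between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_similarity(new_pr, old_pr, w_text=0.5):
    """Weighted mix of textual and change similarity (assumed form)."""
    text_new = Counter(tokenize(new_pr["title"] + " " + new_pr["body"]))
    text_old = Counter(tokenize(old_pr["title"] + " " + old_pr["body"]))
    text_sim = cosine(text_new, text_old)
    change_sim = jaccard(set(new_pr["files"]), set(old_pr["files"]))
    return w_text * text_sim + (1 - w_text) * change_sim


def rank_candidates(new_pr, existing_prs, top_k=5):
    """Return ids of the top_k existing pull-requests most similar
    to the new one, most similar first."""
    scored = [(combined_similarity(new_pr, pr), pr["id"]) for pr in existing_prs]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pr_id for _score, pr_id in scored[:top_k]]


# Hypothetical pull-requests.
existing = [
    {"id": 1, "title": "Fix null pointer in parser", "body": "",
     "files": ["parser.py"]},
    {"id": 2, "title": "Update docs", "body": "",
     "files": ["README.md"]},
]
new_pr = {"id": 3, "title": "Fix parser null pointer crash", "body": "",
          "files": ["parser.py"]}
print(rank_candidates(new_pr, existing, top_k=1))
```

Combining the two signals helps when one is weak on its own: two pull-requests may describe the same fix in different words but touch the same files, or share wording while changing unrelated code.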