Funding: We thank the anonymous reviewers and editors for their very constructive comments. Supported by the National Social Science Foundation Project of China under Grant 16BTQ085.
Abstract: Privacy protection for mobile social networks is a frontier topic in the field of social network applications. Existing research on user privacy protection in mobile social networks mainly focuses on privacy-preserving data publishing and access control. There is little research on the association of user privacy information, which makes it difficult to design personalized privacy protection strategies and increases the complexity of user privacy settings. This paper therefore concentrates on the association of user privacy information, using big data analysis tools, so as to provide data support for the design of personalized privacy protection strategies.
Abstract: The public is increasingly using social media platforms such as Twitter and Facebook to express their views on a variety of topics. As a result, social media has emerged as the most effective and largest open source for obtaining public opinion. Single-node computational methods are inefficient for sentiment analysis on such large datasets. Supercomputers and parallel or distributed processing are two options for dealing with such large amounts of data. Most parallel programming frameworks, such as MPI (Message Passing Interface), are difficult to use and scale in environments where supercomputers are expensive. Using the Apache Spark parallel model, this work presents a scalable system for sentiment analysis on Twitter. A Spark-based Naive Bayes training technique is suggested for this purpose; unlike prior research, this algorithm does not need any disk access. Millions of tweets have been classified using the trained model. Experiments with clusters of various sizes reveal that the suggested strategy is extremely scalable and cost-effective for larger datasets. It is nearly 12 times quicker than the MapReduce-based model and nearly 21 times faster than the Naive Bayes classifier in Apache Mahout. To evaluate the framework's scalability, we gathered a large training corpus from Twitter. The accuracy of the classifier trained with this new dataset was more than 80%.
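As a rough illustration of the Naive Bayes step, the sketch below trains a multinomial classifier with add-one smoothing on a single machine in plain Python. It is not the paper's Spark implementation: the function names, the smoothing choice, and the four toy tweets are assumptions made for illustration only. Training reduces to summing word counts per class, and such counts can be computed per partition and merged, which is what makes this kind of training amenable to a distributed engine.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_docs):
    """Count documents and words per class; labeled_docs yields (tokens, label)."""
    class_counts = Counter()            # number of documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocab = set()
    for tokens, label in labeled_docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def classify(tokens, model):
    """Return the class with the highest log-posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            # Add-one (Laplace) smoothing keeps unseen words from zeroing the product.
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus, invented for this example (not data from the paper).
TOY_TWEETS = [
    ("i love this phone".split(), "pos"),
    ("what a great day".split(), "pos"),
    ("i hate this traffic".split(), "neg"),
    ("what a terrible day".split(), "neg"),
]
MODEL = train_naive_bayes(TOY_TWEETS)
```

Calling `classify("i love this great phone".split(), MODEL)` on this toy model picks the positive class, since three of the five words occur only in positive examples.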
Abstract: Objective: To study the research status, research hotspots and development trends in the field of real-world data (RWD) through social network analysis and knowledge graph analysis. Methods: RWD literature from the past 10 years was retrieved from CNKI, and bibliometric analysis was performed using UCINET and CiteSpace. Results and Conclusion: The frequency and centrality of related keywords such as real-world study, hospital information system (HIS), drug combination, data mining and TCM are high. The clusters labeled clinical medication and RWD contain more keywords. In the most recent 4 years, more articles have involved the keywords data specification, data authenticity, data security and information security. Among them, compound Kushen injection, HIS database and RWD are the top three keywords. Using HIS to study clinical medication, clinical characteristics, diseases and injections is a long-term research hotspot in both Chinese and Western medicine. In addition, research on RWD databases has shifted from construction to standardized collection and governance, which can make RWD effective. Data authenticity, data security and information security will become the new hotspots in RWD research.
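The keyword frequency and centrality figures mentioned in this abstract come from UCINET and CiteSpace. The sketch below only shows, in plain Python, what such measures look like on a keyword co-occurrence network: frequency is the number of articles carrying a keyword, and degree centrality is the share of other keywords it co-occurs with. The function name and the three toy keyword lists are hypothetical, and degree centrality is an assumed choice, since the abstract does not say which centrality measure was used.

```python
from collections import Counter
from itertools import combinations


def keyword_stats(articles):
    """Return (frequency, degree centrality) for keywords in a co-occurrence network.

    articles: iterable of keyword lists, one list per article.
    """
    freq = Counter()
    neighbors = {}
    for keywords in articles:
        unique = sorted(set(keywords))
        freq.update(unique)  # count each keyword once per article
        for kw in unique:
            neighbors.setdefault(kw, set())
        # Two keywords are linked if they appear in the same article.
        for a, b in combinations(unique, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    centrality = {
        kw: (len(nb) / (n - 1) if n > 1 else 0.0) for kw, nb in neighbors.items()
    }
    return freq, centrality


# Hypothetical keyword lists, invented for illustration.
TOY_ARTICLES = [
    ["RWD", "HIS", "data mining"],
    ["RWD", "TCM"],
    ["HIS", "RWD"],
]
FREQ, CENTRALITY = keyword_stats(TOY_ARTICLES)
```

In this toy network "RWD" co-occurs with every other keyword, so its degree centrality is 1.0, while "TCM" is linked to one of the three other keywords.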
Abstract: This paper provides a comprehensive overview of evolution and innovation in social network analysis leading to the paradigm of social networking. It explains how the development of sociological theory and the structural properties of social groups matter to computer science and communications. Authors such as Moreno, John Barnes and Harrison C. White provide evidence of a growing body of literature addressing the networking of people, organizations and communities to explain the structure of society. This perspective has passed from sociology to other fields, changing understandings of social phenomena. Social networks remain a potent concept for analyzing computer science and communications. This paper shows how and why this has occurred and examines substantive areas in which social network analysis (SNA) has been applied, mainly how the advantages of graphic visualization and computer software packages have influenced SNA and led to the unfolding of social networking to different audiences and publics.
Abstract: The rising popularity of online social networks (OSNs), such as Twitter, Facebook, MySpace, and LinkedIn, in recent years has sparked great interest in sentiment analysis on their data. While many methods exist for identifying sentiment in OSNs, such as communication pattern mining and classification based on emoticons and parts of speech, the majority of them use a suboptimal batch-mode learning approach when analyzing large amounts of real-time data. As an alternative, we present a stream algorithm using Modified Balanced Winnow for sentiment analysis on OSNs. Tested on three real-world network datasets, our sentiment predictions perform close to batch learning while also detecting important features dynamically in data streams. These top features reveal keywords important to the analysis of sentiment.
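To make the stream-learning idea concrete, here is a minimal sketch of a basic Balanced Winnow learner that processes one example at a time and updates its weights multiplicatively only when it makes a mistake. This is the plain algorithm, not the Modified variant named in the abstract, and the parameter values, the initial weights, the class and function names, and the toy stream are all assumptions for illustration. Ranking features by the gap between their positive and negative weights is one simple way to surface influential words as the stream goes by.

```python
from collections import defaultdict


class BalancedWinnow:
    """Mistake-driven online learner with binary (present/absent) features."""

    def __init__(self, alpha=1.5, beta=0.5, threshold=1.0):
        self.alpha = alpha          # promotion factor, > 1
        self.beta = beta            # demotion factor, between 0 and 1
        self.threshold = threshold
        # Assumed initialisation: both weight vectors start at 1.0.
        self.pos = defaultdict(lambda: 1.0)
        self.neg = defaultdict(lambda: 1.0)

    def score(self, features):
        return sum(self.pos[f] - self.neg[f] for f in features) - self.threshold

    def predict(self, features):
        return 1 if self.score(features) > 0 else -1

    def learn(self, features, label):
        """Process one example; return True if a mistake triggered an update."""
        if self.predict(features) == label:
            return False
        # On a missed positive, promote pos weights and demote neg weights;
        # on a missed negative, do the opposite.
        up, down = (self.alpha, self.beta) if label == 1 else (self.beta, self.alpha)
        for f in features:
            self.pos[f] *= up
            self.neg[f] *= down
        return True

    def top_features(self, k):
        """Features with the largest absolute weight gap, most influential first."""
        ranked = sorted(self.pos, key=lambda f: -abs(self.pos[f] - self.neg[f]))
        return ranked[:k]


def run_stream(model, stream):
    """Feed (features, label) pairs one by one; return the number of mistakes."""
    return sum(model.learn(features, label) for features, label in stream)


# Hypothetical toy stream, invented for illustration; labels are +1 / -1.
TOY_STREAM = [
    (["good", "happy"], 1),
    (["bad", "sad"], -1),
    (["good", "fun"], 1),
    (["bad", "awful"], -1),
]
```

With these settings the learner makes two mistakes on a first pass over `TOY_STREAM` and none on a second, after which "good" carries the largest weight gap.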
Abstract: The inter-city linkage heat data provided by Baidu Migration is used to characterize inter-city linkages, in order to study the network linkage characteristics and hierarchical structure of the urban agglomeration in the Greater Bay Area with the social network analysis method. This is the first application of big data based on location services to the study of urban agglomeration network structure, and it offers a novel research perspective on this topic. The study reveals that the density of network linkages in the Greater Bay Area urban agglomeration has reached 100%, indicating a mature network-like spatial structure. This structure has given rise to three distinct communities: Shenzhen-Dongguan-Huizhou, Guangzhou-Foshan-Zhaoqing, and Zhuhai-Zhongshan-Jiangmen. Additionally, cities within the Greater Bay Area urban agglomeration play different roles, suggesting that varying development strategies may be necessary to achieve staggered development. The study demonstrates that large datasets represented by LBS can offer novel insights and methodologies for the examination of urban agglomeration network structures, contingent on the appropriate mining and processing of the data.
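The 100% density figure can be read through the usual definition of network density: the number of observed links divided by the number of possible links between distinct nodes. The sketch below computes this for an undirected network. The function name and the four placeholder cities A to D are hypothetical, and treating the network as undirected is an assumption, since the abstract does not specify it.

```python
def network_density(nodes, edges):
    """Density of an undirected network: observed links / possible links."""
    nodes = set(nodes)
    if len(nodes) < 2:
        return 0.0
    # Ignore self-loops, duplicate pairs, and links to unknown nodes.
    observed = {
        frozenset(edge)
        for edge in edges
        if len(set(edge)) == 2 and set(edge) <= nodes
    }
    possible = len(nodes) * (len(nodes) - 1) // 2
    return len(observed) / possible


# Hypothetical placeholder cities, not the cities analysed in the study.
CITIES = ["A", "B", "C", "D"]
FULLY_LINKED = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D")]
CHAIN = [("A", "B"), ("B", "C"), ("C", "D")]
```

Here `network_density(CITIES, FULLY_LINKED)` is 1.0, the situation the abstract describes as 100% density, while the chain of three links gives 0.5.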
Funding: Supported in part by the National Natural Science Foundation of China under Grant No. 60972010; the Beijing Natural Science Foundation under Grant No. 4102047; the Major Program for Research on Philosophy & Humanity Social Sciences of the Ministry of Education of China under Grant No. 08WL1101; and the Service Business of Scientists and Engineers Project under Grant No. 2009GJA00048.
Abstract: This paper analyzes and models user reading and replying activities in a bulletin board system (BBS) social network. By analyzing a dataset from a famous Chinese BBS social network, we show how some user activities are distributed and reveal several important features that might characterize user dynamics. We propose a method to model user activities in the BBS social network. The model can reproduce power-law and non-power-law distributions of user activities at the same time. Our results show that user reading and replying activities can be simulated through simple agent-based models. Specifically, how the BBS server interacts with Internet users in the Web 2.0 application, how users organize their reading lists, and how user behavioral traits are distributed are the important factors in the formation of activity patterns.
Abstract: Half a century of follow-up surveys has enabled architects and urban planners to design rationally with the aid of planning. Nonetheless, planning has run into limitations because the city keeps changing its uses in accordance with its users' demands. In this paper, the authors propose a method to analyze the traits of users in market areas near stations by analyzing a location-based social network. After data were collected from geotagged tweets, the GPS (global positioning system) data were plotted on a map obtained from the Yahoo Open Location Platform. A morphological analysis and terminology extraction system then extracted the keywords and their scores. After calculating the distance between stations and users' GPS coordinates, the authors extracted the array of keywords and corresponding scores in a given station market area. Lastly, ratios between all users' scores and the city's scores were calculated to examine locality. The full combination of data collection, natural language processing and visualization enabled the authors to envisage the distribution of collective background in the city.
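The step of measuring the distance between a station and each geotagged tweet can be sketched with the haversine great-circle formula, followed by a simple radius filter. The abstract does not say which distance formula the authors used, so haversine is an assumption here, as are the function names, the 6371 km mean Earth radius, the radius value, and the sample coordinates.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def tweets_near_station(station, tweets, radius_km):
    """Keep the tweets whose coordinates fall within radius_km of the station.

    station: (lat, lon); tweets: iterable of (lat, lon, text).
    """
    s_lat, s_lon = station
    return [
        (lat, lon, text)
        for lat, lon, text in tweets
        if haversine_km(s_lat, s_lon, lat, lon) <= radius_km
    ]


# Hypothetical coordinates and texts, invented for illustration.
STATION = (35.0, 139.0)
TOY_TWEETS = [
    (35.0, 139.001, "coffee near the station"),  # roughly 90 m away
    (35.1, 139.0, "hiking today"),               # roughly 11 km away
]
NEARBY = tweets_near_station(STATION, TOY_TWEETS, radius_km=0.5)
```

With a 0.5 km radius only the first toy tweet is kept; the keyword scores for a station's market area would then be computed over the retained tweets.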
Abstract: Despite the recent development of many worldwide initiatives, there is still a need for observation frameworks that provide a comprehensive view of SDI use. Among the many remaining challenges, a thorough analysis of the information flows between existing SDIs, their respective uses, and the way these evolve over time is an important issue to explore. The research presented in this paper introduces a methodological framework for studying the use of SDIs from a diachronic perspective. The approach is based on Social Network Analysis (SNA) and questionnaires collected through online surveys. We develop a structural and diachronic analysis based on a series of graph-based measures that identify the main patterns appearing over time. The methodological framework is applied to a series of French SDIs and users involved in environmental management. The study identifies a series of structural differences in the data flows that emerge between the users and SDIs. Finally, the diachronic network analysis provides an overall understanding of how data flows evolve over time at different institutional levels.