This paper poses a question: how many types of social relations can be distinguished in the Chinese context? In social networks, tie strength represents the degree of intimacy between nodes better than a simple indication of whether a link exists. Previous research shows that Granovetter measured tie strength to distinguish strong ties from weak ties, and that the Dunbar circle theory may offer a plausible approach to identifying five types of relations from interaction frequency via unsupervised learning (e.g., clustering interaction data between users on Facebook and Twitter). In this paper, we differentiate the layers of an ego-centered network by measuring different dimensions of users' online interaction data, based on the Dunbar circle theory. To label the types of Chinese guanxi, we conduct a survey to collect ground truth from the real world and link the survey data to big data collected from a widely used social network platform in China. After repeating the Dunbar experiments, we modify our computing methods and the indicators computed from the big data to obtain the model that best fits the ground truth. At the same time, a comprehensive set of effective predictors is selected to engage with existing theories of tie strength. Finally, by combining guanxi theory with Dunbar circle studies, we find four types of guanxi, which represent a four-layer model of a Chinese ego-centered network.
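The idea of deriving layers from interaction frequency via unsupervised learning can be illustrated with a small sketch. It is not the authors' method: it assumes a single indicator (a contact count per alter), a log transform, and plain one-dimensional k-means with k = 4. The names `kmeans_1d` and `layer_contacts` and the sample counts are hypothetical.

```python
import math


def kmeans_1d(values, k, iters=100):
    """Plain 1-D k-means (Lloyd's algorithm) with deterministic initial centers."""
    ordered = sorted(values)
    n = len(ordered)
    # Spread the initial centers evenly over the sorted values.
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def layer_contacts(contact_counts, k=4):
    """Map each alter to a layer index; 0 is the most frequently contacted layer."""
    names = list(contact_counts)
    # Log-transform so a few very heavy contacts do not dominate the distances.
    logs = [math.log1p(contact_counts[name]) for name in names]
    labels, centers = kmeans_1d(logs, k)
    # Rank clusters by their center, highest frequency first.
    order = sorted(range(k), key=lambda c: -centers[c])
    rank = {c: r for r, c in enumerate(order)}
    return {name: rank[lab] for name, lab in zip(names, labels)}


# Hypothetical interaction counts between one ego and eight alters.
counts = {"a": 900, "b": 800, "c": 100, "d": 90, "e": 12, "f": 10, "g": 1, "h": 1}
layers = layer_contacts(counts, k=4)
```

In the study itself, the layers are validated against survey ground truth and several interaction dimensions are used; the sketch only shows the clustering step on one dimension.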
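The rapid development of the mobile Internet and social media has greatly increased the demand, across many fields, for processing network media data such as text, images, and video. Such data are high-dimensional, dynamically updated, and complex in content, which makes feature computation and classification more difficult. Moreover, current feature selection methods for network media data mainly target static data and place strict requirements on data format. To address these problems and ensure real-time feature extraction from dynamic network media data, this paper proposes an Unsupervised Feature Selection Algorithm for Dynamic Network Media Based on User Correlation (UFSDUC). First, the relationships among interacting users in the social network are analyzed and used as constraints for unsupervised feature selection. Then, a user-correlation feature selection model is built using the Laplacian operator to quantify the strength of the relationships between related users, and the Lagrange multiplier method is used to derive the optimal user relationships in the model. Finally, a threshold for dynamic network media data is set based on gradient descent and used to compute the nonzero feature weights that update the optimal feature subset, so that network media data can be classified effectively. The algorithm can select features from dynamic network media data accurately and in real time while preserving the integrity of user correlations. Three standard network media datasets are used, and the algorithm is compared with five currently popular algorithms of the same type to verify its effectiveness.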
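The role a graph Laplacian can play in unsupervised feature selection can be sketched with the classic Laplacian score, which is a simpler, swapped-in technique and not UFSDUC itself. The sketch assumes a symmetric user-affinity matrix `W` with at least one edge and a static feature matrix `X`; it omits the Lagrange multiplier derivation, the gradient-descent threshold, and the dynamic updates described above. The function names and the toy data are hypothetical.

```python
def laplacian_scores(X, W):
    """Laplacian score of each feature column of X under a symmetric affinity matrix W.

    A lower score means the feature changes little between strongly related users.
    Assumes W has at least one nonzero entry.
    """
    n, d = len(X), len(X[0])
    degree = [sum(row) for row in W]
    total = sum(degree)
    scores = []
    for j in range(d):
        f = [X[i][j] for i in range(n)]
        # Remove the degree-weighted mean so a constant feature carries no signal.
        mean = sum(fi * di for fi, di in zip(f, degree)) / total
        f = [fi - mean for fi in f]
        # f^T L f with L = D - W is half the weighted sum of squared differences.
        smooth = 0.5 * sum(
            W[a][b] * (f[a] - f[b]) ** 2 for a in range(n) for b in range(n)
        )
        # f^T D f, the degree-weighted variance of the feature.
        spread = sum(di * fi * fi for fi, di in zip(f, degree))
        scores.append(smooth / spread if spread > 0 else float("inf"))
    return scores


def select_features(X, W, m):
    """Indices of the m features with the lowest Laplacian score."""
    scores = laplacian_scores(X, W)
    return sorted(range(len(scores)), key=lambda j: scores[j])[:m]


# Toy data: users 0-1 interact with each other, and so do users 2-3.
W = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
# Feature 0 follows the user groups; feature 1 cuts across them.
X = [
    [1, 1],
    [1, 5],
    [5, 1],
    [5, 5],
]
scores = laplacian_scores(X, W)
best = select_features(X, W, 1)
```

On the toy data, the feature that agrees with the user groups receives the lower score and is selected first, which is the intuition behind using user relationships as a constraint.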
Funding: project number 20182001706; support of the Tsinghua-Gottingen Student Exchange Project IDS-SSP-2017001.