Funding: National Natural Science Foundation of China (No. 61262044)
Abstract: The eXtreme gradient boosting (XGBoost) algorithm is used to identify abnormal users. First, the raw data were cleaned. Then, user power characteristics were extracted from different aspects. Finally, the XGBoost classifier was used to identify abnormal users in both a balanced and an unbalanced sample set. For comparison, k-nearest neighbor (KNN), back-propagation (BP) neural network, and random forest classifiers were applied to the same features on the two sample sets. The experimental results show that the XGBoost classifier has a higher recognition rate and a faster running speed. The performance improvement is especially clear on the imbalanced data set.
Abstract: Smartphones have become ubiquitous in our home and work environments; however, users normally rely on explicit but inefficient identification processes in a controlled environment. Therefore, when a device is stolen, a thief can gain access to the owner's personal information and to services protected by stored passwords. To address this scenario, this work proposes an automatic legitimate user identification system based on gait biometrics extracted from user walking patterns captured by smartphone sensors. A set of preprocessing schemes is applied to calibrate noisy and invalid samples and to augment the gait-induced time- and frequency-domain features, which are then further optimized using a non-linear unsupervised feature selection method. The selected features create an underlying gait biometric representation able to discriminate among individuals and identify them uniquely. Different classifiers are adopted to achieve accurate legitimate user identification. Extensive experiments on a group of 16 individuals in an indoor environment show the effectiveness of the proposed solution: with 5 to 70 samples per window, KNN and bagging classifiers achieve 87–99% accuracy, ELM achieves 82–98%, and SVM achieves 81–94%. The proposed pipeline achieves a 100% true-positive rate and a 0% false-negative rate for almost all classifiers.
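The windowing and time/frequency feature step mentioned in this abstract can be illustrated roughly as follows. This is only a minimal sketch, not the authors' pipeline: the function names `windows` and `features`, the non-overlapping windows, and the particular features (mean, standard deviation, dominant DFT bin) are all assumptions chosen for illustration, and the signal is synthetic.

```python
import cmath
import math

def windows(signal, size):
    """Split a 1-D signal into consecutive non-overlapping windows (assumed scheme)."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

def features(window):
    """Return (mean, std, dominant_bin) for one window.

    mean and std are time-domain features; dominant_bin is the index of the
    strongest non-DC bin of a naive discrete Fourier transform.
    """
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    mags = []
    for k in range(1, n // 2 + 1):  # skip bin 0 (the DC component)
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(window))
        mags.append(abs(coeff))
    dominant_bin = 1 + mags.index(max(mags)) if mags else 0
    return (mean, std, dominant_bin)

# Synthetic stand-in for a sensor trace: two cycles per 20-sample window.
signal = [math.sin(2 * math.pi * 2 * t / 20) for t in range(40)]
feature_rows = [features(w) for w in windows(signal, 20)]
```

Each row of `feature_rows` could then be passed to any of the classifiers the abstract lists; the classifier itself is left out here.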
Abstract: In digital fingerprinting, preventing piracy of images by colluders is an important and tedious issue. Each image is embedded with a unique User IDentification (UID) code, which is the fingerprint for tracking the authorized user. The proposed hiding scheme uses a random number generator to scramble two copies of a UID, which are then hidden in randomly selected medium-frequency coefficients of the host image. A linear support vector machine (SVM) is trained on the normalized correlation (NC) calculated for the two-class UID codes. The trained classifiers are the models used for identifying unreadable UID codes. Experimental results showed that applying SVM increases the success rate of predicting unreadable UID codes. The proposed scheme can be used to protect the intellectual property rights of digital images and to keep track of users to prevent collaborative piracy.
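The abstract does not spell out how the normalized correlation (NC) is computed. A minimal sketch under one common definition (the dot product divided by the product of the vector norms, with bits mapped from {0, 1} to {-1, +1}) might look like this; the mapping, the function name, and the sample codes are assumptions for illustration only.

```python
import math

def normalized_correlation(a, b):
    """Normalized correlation of two equal-length bit sequences.

    Bits are first mapped from {0, 1} to {-1, +1} (an assumption of this
    sketch), so identical codes score 1.0 and complementary codes -1.0.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    x = [2 * bit - 1 for bit in a]
    y = [2 * bit - 1 for bit in b]
    dot = sum(p * q for p, q in zip(x, y))
    norm = math.sqrt(sum(p * p for p in x) * sum(q * q for q in y))
    return dot / norm

uid = [1, 0, 1, 1, 0, 0, 1, 0]        # hypothetical embedded UID code
damaged = [1, 0, 1, 0, 0, 0, 1, 0]    # same code with one bit flipped
nc_same = normalized_correlation(uid, uid)
nc_damaged = normalized_correlation(uid, damaged)
```

NC values such as `nc_damaged` are the kind of scalar feature a linear SVM could be trained on; the SVM training step is omitted here.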
Funding: This work was supported by the National Natural Science Foundation of China under Grant Nos. 62072084, 62172082 and 62072086; the Science Research Foundation of Liaoning Province of China under Grant No. LJKZ0094; the Natural Science Foundation of Liaoning Province of China under Grant No. 2022-MS-171; the Science and Technology Plan Major Project of Liaoning Province of China under Grant No. 2022JH1/10400009; and the Fundamental Research Funds for the Central Universities of China under Grant No. N2116008.
Abstract: Identifying accounts across different online social networks that belong to the same user has attracted extensive attention. However, existing techniques rely on given user seeds and ignore the dynamic changes of online social networks, and therefore fail to generate high-quality identification results. To solve this problem, we propose an incremental user identification method based on a user-guider similarity index (called CURIOUS), which efficiently identifies users and captures the changes of user features over time. Specifically, we first construct a novel user-guider similarity index (called USI) to speed up the matching between users. Second, we propose a two-phase user identification strategy consisting of USI-based bidirectional user matching and seed-based user matching, which is effective even for incomplete networks. Finally, we propose incremental maintenance for both USI and the identification results, which dynamically captures the instant states of social networks. We conduct experimental studies on three real-world social networks. The experiments demonstrate the effectiveness and efficiency of the proposed method: compared with traditional methods, it improves precision, recall and rank score by an average of 0.19, 0.16 and 0.09 respectively, and reduces the time cost by an average of 81%.
Funding: This work was supported in part by the National Natural Science Foundation of China (Nos. 61622101 and 61571020) and in part by the Natural Science Foundation (Nos. DMS-1521746 and DMS-1737795).
Abstract: Mobile big data collected by mobile network operators is of interest to many research communities and industries for its remarkable value. However, such spatiotemporal information may pose a serious threat to subscribers' privacy. This work focuses on assessing subscriber privacy vulnerability in terms of user identifiability across two datasets, using a mobility representation with significantly reduced detail. We propose an innovative semantic spatiotemporal representation for each subscriber based on geographic information, termed the daily habitat region, which approximates the subscriber's daily mobility coverage with far less information than the original mobility traces. The daily habitat region is realized via convex hull extraction on the user's daily spatiotemporal traces. As a result, user identification can be formulated as matching two records with the maximum similarity score between two convex hull sets, obtained by our proposed similarity measures based on cosine distance and a permutation hypothesis test. Experiments are conducted to evaluate the proposed mobility representation and user identification algorithms; they also demonstrate that subscribers' mobile privacy is under severe threat even with significantly reduced spatiotemporal information.
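The convex hull extraction behind the daily habitat region can be sketched as below. This is an illustrative example using Andrew's monotone chain algorithm on planar points, which is a standard technique and not necessarily the one the authors used; the coordinates, the name `day_trace`, and the treatment of locations as flat (x, y) pairs are assumptions. The similarity measures from the abstract are not reproduced.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]

# Hypothetical one-day trace of visited locations as planar coordinates.
day_trace = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
hull = convex_hull(day_trace)
```

Here six visited points reduce to four hull vertices, which illustrates how the representation keeps the coverage area while discarding interior detail.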