Abstract: In this paper, a C5.0 decision tree and neural network models are proposed to classify recessions in the US using 12 common financial indices, and new financial stress indices inferred from the neural network models are created. A detailed experiment demonstrates that the neural network models, with proper regularization and dropout, achieve 98% accuracy on the training set, 97% accuracy on the validation set, and 100% accuracy on the test set. The new financial stress indices outperform existing financial stress indices in many scenarios and can accurately locate crisis events, including the most recent 2018 US bear market. With these models and new indices, a contraction can be detected before NBER’s announcement, and action can be taken as soon as the situation worsens.
Abstract: The critical functionality and huge influence of the hot trend/topic page (HTP) in microblogging sites have driven the creation of a new kind of underground service called the bogus traffic service (BTS). BTS is an illegal service that hijacks the HTP by pushing controlled topics into it for malicious customers, with the goal of guiding public opinion. To hijack the HTP, BTS agents maintain an army of black-market accounts called bogus traffic accounts (BTAs) and control them to generate a burst of fake traffic by massively retweeting tweets that contain the customer-desired topic (hashtag). Although this service has been extensively exploited by malicious customers, little has been done to understand it. In this paper, we conduct a systematic measurement study of the BTS. We first investigate and collect 125 BTS agents from a variety of sources and set up a honeypot account to capture BTAs from these agents. We then build a BTA detector that detects 162218 BTAs from Weibo, the largest Chinese microblogging site, with a precision of 94.5%. We further use them as a bridge to uncover 296916 topics that might be involved in bogus traffic. Finally, we uncover the operating mechanism from the perspectives of the attack cycle and the attack entity. The highlights of our findings include the temporal attack patterns and intelligent evasion tactics of the BTAs. These findings bring BTS into the spotlight. Our work will help in understanding and ultimately eliminating this threat.
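Abstract: Multi-dimensional analytical (MDA) queries under local differential privacy have attracted wide attention from researchers. Existing perturbation methods based on the optimal local hashing (OLH) mechanism and a hierarchical tree structure risk leaking the privacy of the root node. To address the shortcomings of existing local perturbation mechanisms that rely on hierarchical trees, this paper proposes H4MDA (hierarchical structure for MDA), an effective MDA query algorithm that satisfies local differential privacy. The algorithm exploits the horizontal and vertical structure of the hierarchical tree to design three local perturbation algorithms based on user-grouping strategies: HGRR, LGRR-FD, and LGRR. HGRR combines the horizontal structure of the tree with the GRR mechanism to locally perturb users' tuple data, and answers MDA queries by discarding root-node combinations. Unlike HGRR, LGRR-FD uses the vertical structure of the tree together with the GRR mechanism to perturb local data, and adds fake data to avoid privacy leakage at the leaf nodes. LGRR perturbs local data vertically while discarding the leaf-node level. The collector combines the perturbed results from LGRR with a local consistency technique to reconstruct the last two levels of the tree, and answers MDA queries by adding virtual leaf nodes whose counts sum to the count of their parent node. Experiments comparing HGRR, LGRR-FD, and LGRR with existing perturbation algorithms on three datasets show that they answer MDA queries more accurately than comparable algorithms.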
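The GRR mechanism named in this abstract is commonly expanded as generalized randomized response. The sketch below illustrates only that basic building block on a flat domain, under the standard textbook definition: a user reports the true value with probability p = e^ε / (e^ε + d − 1) and any other value with probability q = 1 / (e^ε + d − 1), and the collector debiases the observed counts. It is not the HGRR, LGRR-FD, or LGRR algorithm from the paper. The function names `grr_perturb` and `grr_estimate` are hypothetical, and the code assumes a domain of at least two values and ε > 0.

```python
import math
import random
from collections import Counter


def grr_perturb(value, domain, epsilon, rng=random):
    """Perturb one user's value with generalized randomized response.

    Assumes len(domain) >= 2 and epsilon > 0 (illustrative sketch only).
    """
    d = len(domain)
    e = math.exp(epsilon)
    p = e / (e + d - 1)  # probability of reporting the true value
    if rng.random() < p:
        return value
    # Otherwise report one of the other d - 1 values uniformly at random,
    # so each of them is reported with probability q = 1 / (e + d - 1).
    others = [v for v in domain if v != value]
    return rng.choice(others)


def grr_estimate(reports, domain, epsilon):
    """Unbiased count estimates computed by the collector from noisy reports."""
    n = len(reports)
    d = len(domain)
    e = math.exp(epsilon)
    p = e / (e + d - 1)
    q = 1.0 / (e + d - 1)
    counts = Counter(reports)
    # E[count_v] = n_v * p + (n - n_v) * q, solved for the true count n_v.
    return {v: (counts[v] - n * q) / (p - q) for v in domain}
```

The estimated counts always sum to the number of reports, but individual estimates are noisy and can be negative for rare values, which is one reason collectors often apply a consistency post-processing step.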
Funding: funded in part by the Kennesaw State University College of Science and Mathematics Interdisciplinary Research Opportunities (IDROP) Program; the Provincial Key Research and Development Program of Zhejiang, China (No. 2016C01G2010916); the Fundamental Research Funds for the Central Universities; the Alibaba-Zhejiang University Joint Research Institute for Frontier Technologies (A.Z.F.T.) (No. XT622017000118); and the CCF-Tencent Open Research Fund (No. AGR20160109).
Abstract: Social networks are important media for spreading information, ideas, and influence among individuals. Most existing research focuses on understanding the characteristics of social networks, investigating how information is spread through the "word-of-mouth" effect of social networks, or exploring social influences among individuals and groups. However, most studies ignore negative influences among individuals and groups. Motivated by the goal of alleviating social problems, such as drinking, smoking, and gambling, and by influence-spreading problems, such as promoting new products, we consider both positive and negative influences and propose a new optimization problem called the Minimum-sized Positive Influential Node Set (MPINS) selection problem: identify the minimum set of influential nodes such that every node in the network is positively influenced by the selected nodes by no less than a given threshold. Our contributions are threefold. First, we prove that, under the independent cascade model considering positive and negative influences, MPINS is APX-hard. Subsequently, we present a greedy approximation algorithm to address the MPINS selection problem. Finally, to validate the proposed greedy algorithm, we conduct extensive simulations and experiments on random graphs and seven different real-world data sets that represent small-, medium-, and large-scale networks.
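To make the shape of a greedy selection for this kind of problem concrete, here is a minimal sketch under strongly simplifying assumptions that are not taken from the paper. Influence is treated as a deterministic signed weight `weights[u][v]` from node `u` to node `v`, rather than as a probability under the independent cascade model. A node's received influence is the plain sum over selected nodes, and selected nodes count as satisfied. The names `total_deficit`, `greedy_positive_set`, and `theta` are hypothetical.

```python
def total_deficit(weights, selected, theta):
    """Sum over unselected nodes of how far each falls short of theta.

    weights[u][v] is the signed influence of u on v (assumed, simplified model).
    """
    deficit = 0.0
    for v in weights:
        if v in selected:
            continue  # assumption: a selected node counts as satisfied
        score = sum(weights[u].get(v, 0.0) for u in selected)
        deficit += max(0.0, theta - score)
    return deficit


def greedy_positive_set(weights, theta):
    """Greedily add the node that most reduces the remaining deficit."""
    selected = set()
    current = total_deficit(weights, selected, theta)
    while current > 0:
        best_node, best_deficit = None, current
        for u in weights:
            if u in selected:
                continue
            candidate = total_deficit(weights, selected | {u}, theta)
            if candidate < best_deficit:
                best_node, best_deficit = u, candidate
        if best_node is None:
            # Negative weights can make every single addition unhelpful.
            raise ValueError("no single node reduces the remaining deficit")
        selected.add(best_node)
        current = best_deficit
    return selected


# Toy star graph: the hub "a" influences every leaf strongly.
toy = {
    "a": {"b": 0.6, "c": 0.6, "d": 0.6},
    "b": {"a": 0.2},
    "c": {"a": 0.2},
    "d": {"a": 0.2},
}
chosen = greedy_positive_set(toy, theta=0.5)
```

On the toy graph, selecting the hub alone lifts every leaf above the threshold, so the greedy loop stops after one step. A greedy choice of this kind carries no optimality guarantee in this simplified form.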
Funding: supported by the National Natural Science Foundation of China (Nos. 61772466 and U1836202); the Zhejiang Provincial Natural Science Foundation for Distinguished Young Scholars (No. LR19F020003); the Provincial Key Research and Development Program of Zhejiang Province (No. 2017C01055); and the Alibaba-ZJU Joint Research Institute of Frontier Technologies.
Abstract: Image captchas have recently become very popular and are widely deployed across the Internet to defend against abusive programs. However, the ever-advancing capabilities of computer vision have gradually diminished the security of image captchas and made them vulnerable to attack. In this paper, we first classify the currently popular image captchas into three categories: selection-based captchas, slide-based captchas, and click-based captchas. Second, we propose simple yet powerful attack frameworks against each of these categories of image captchas. Third, we systematically evaluate our attack frameworks against 10 popular real-world image captchas, including captchas from tencent.com, google.com, and 12306.cn. Fourth, we compare our attacks against nine online image recognition services and against human labor from eight underground captcha-solving services. Our evaluation results show that (1) each of the popular image captchas that we study is vulnerable to our attacks; (2) our attacks yield the highest captcha-breaking success rate compared with state-of-the-art methods in almost all scenarios; and (3) our attacks achieve almost as high a success rate as human labor while being much faster. Based on our evaluation, we identify some design flaws in these popular schemes, along with some best practices and design principles for more secure captchas. We also examine the underground market for captcha-solving services, identifying 152 such services. We then seek to measure this underground market with data from these services. Our findings shed light on the scale, impact, and commercial landscape of the underground market for captcha solving.
Abstract: Online short-term rental platforms, such as Airbnb, have become popular, and a better pricing strategy is imperative for hosts of new listings. In this paper, we analyzed the relationship between the description of each listing and its price, and proposed a text-based price recommendation system called TAPE to recommend a reasonable price for newly added listings. We used deep learning techniques (e.g., feedforward network, long short-term memory, and mean shift) to design and implement TAPE. Using two chronologically extracted datasets of the same four cities, we revealed important factors (e.g., indoor equipment and high-density area) that positively or negatively affect each property’s price, and evaluated our preliminary and enhanced models. Our models achieved a Root-Mean-Square Error (RMSE) of 33.73 in Boston, 20.50 in London, 34.68 in Los Angeles, and 26.31 in New York City, which is comparable to an existing model that uses more features.
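For readers unfamiliar with the metric reported above, RMSE is the square root of the mean squared difference between predicted and actual values, so it is expressed in the same unit as the price. The sketch below shows the standard definition only. The function name `rmse` and the sample prices are illustrative assumptions, not data from the paper.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length, non-empty sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical nightly prices: errors of 3 and 4 give sqrt((9 + 16) / 2).
example_error = rmse([100.0, 200.0], [103.0, 196.0])
```

Because errors are squared before averaging, a few badly mispriced listings raise RMSE more than many small misses do.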