Funding: This work is supported in part by the National Key Research and Development Program of China under grants 2018YFC0830602 and 2016QY03D0501, and in part by the National Natural Science Foundation of China (NSFC) under grants 61872111, 61732022, and 61601146.
Abstract: The task of prison term prediction is to predict the term of penalty from the textual fact description of a certain type of criminal case. Recent advances in deep learning frameworks inspire us to propose a two-step method to address this problem. To obtain a better understanding and a more specific representation of legal texts, we summarize a judgment model according to the relevant law articles and then apply it to extract case features from judgment documents. By formalizing prison term prediction as a regression problem, we adopt a linear regression model and a neural network model to train the prison term predictor. In experiments, we construct a real-world dataset of theft case judgment documents. Experimental results demonstrate that our method can effectively extract judgment-specific case features from textual fact descriptions. The best-performing predictor achieves a mean absolute error of 3.2087 months, with accuracies of 72.54% and 90.01% at error upper bounds of three and six months, respectively.
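The regression step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the single numeric case feature and the training values are hypothetical, and ordinary least squares on one feature stands in for the linear-regression predictor.

```python
def fit_ols(xs, ys):
    """Fit y = a*x + b by ordinary least squares (single feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

def mean_absolute_error(model, xs, ys):
    """MAE in months, the paper's main evaluation metric."""
    a, b = model
    return sum(abs((a * x + b) - y) for x, y in zip(xs, ys)) / len(ys)

# Hypothetical training data: (numeric case feature, prison term in months).
train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [6.0, 9.0, 12.0, 15.0, 18.0]   # perfectly linear: y = 3x + 3

model = fit_ols(train_x, train_y)
assert mean_absolute_error(model, train_x, train_y) < 1e-9
```

In the paper the inputs are judgment-specific features extracted from the fact description; here a single synthetic feature keeps the arithmetic transparent.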
Funding: Supported by the National Key Research and Development Program of China (No. 2017YFC0820603).
Abstract: Many improved authentication solutions have been put forward with the aim of authenticating more quickly and securely. However, neither the overuse of hash functions nor additional symmetric encryption can truly increase overall security; instead, the extra computation cost degrades performance. These schemes remain vulnerable to a variety of threats, such as smart card loss attacks and impersonation attacks, due to hidden loopholes and flaws. Even worse, a user's identity can be parsed in an insecure environment and may even become traceable. Aiming to protect identity, a lightweight mutual authentication scheme is proposed. Redundant operations are removed, which makes the verification process more explicit, and the scheme achieves better performance at average cost compared with similar schemes. Cryptanalysis shows that the proposed scheme can resist common attacks and achieve user anonymity. Its formal security is further verified using the widely accepted Automated Validation of Internet Security Protocols and Applications (AVISPA) tool.
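The abstract does not give the protocol's message flow, so the following is only a generic hash-based mutual challenge-response sketch in the spirit it describes; the shared key, nonce sizes, and message layout are illustrative assumptions, not the proposed scheme.

```python
import hashlib
import secrets

# Hypothetical shared secret established at registration time.
SHARED_KEY = b"user-card-shared-secret"

def h(*parts: bytes) -> bytes:
    """A single hash invocation -- avoiding redundant hashing."""
    digest = hashlib.sha256()
    for p in parts:
        digest.update(p)
    return digest.digest()

def server_challenge() -> bytes:
    return secrets.token_bytes(16)

def client_response(key: bytes, nonce_s: bytes):
    """Client proves knowledge of the key and issues its own challenge."""
    nonce_c = secrets.token_bytes(16)
    return nonce_c, h(key, nonce_s, nonce_c)

def server_verify_and_reply(key, nonce_s, nonce_c, proof):
    if h(key, nonce_s, nonce_c) != proof:
        return None                      # reject: client failed to authenticate
    return h(key, nonce_c, nonce_s)      # proof for the client to check

# One full round trip of mutual authentication.
ns = server_challenge()
nc, client_proof = client_response(SHARED_KEY, ns)
server_proof = server_verify_and_reply(SHARED_KEY, ns, nc, client_proof)
assert server_proof is not None                 # server accepted the client
assert server_proof == h(SHARED_KEY, nc, ns)    # client accepts the server
```

Fresh nonces on both sides are what gives mutuality and replay resistance; anonymity mechanisms (e.g., pseudonym rotation) are omitted here.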
Funding: Supported in part by the Key Research and Development Program for Guangdong Province (No. 2019B010136001); in part by Hainan Major Science and Technology Projects (No. ZDKJ2019010); in part by the National Key Research and Development Program of China (No. 2016YFB0800803 and No. 2018YFB1004005); in part by the National Natural Science Foundation of China (No. 81960565, No. 81260139, No. 81060073, No. 81560275, No. 61562021, No. 30560161, and No. 61872110); in part by Hainan Special Projects of Social Development (No. ZDYF2018103 and No. 2015SF39); and in part by the Hainan Association for Academic Excellence Youth Science and Technology Innovation Program (No. 201515).
Abstract: Objective: To determine the most influential data features and to develop machine learning approaches that best predict hospital readmissions among patients with diabetes. Methods: In this retrospective cohort study, we surveyed patient statistics and performed feature analysis to identify the data features most strongly associated with readmissions. Classification of all-cause, 30-day readmission outcomes was modeled using logistic regression, an artificial neural network, and Easy Ensemble. The F1 statistic, sensitivity, and positive predictive value were used to evaluate model performance. Results: We identified the 14 most influential data features (4 numeric and 10 categorical) and evaluated 3 machine learning models with numerous sampling methods (oversampling, undersampling, and hybrid techniques). The deep learning model offered no improvement over the traditional models (logistic regression and Easy Ensemble) for predicting readmission, whereas the other two algorithms led to much smaller differences between the training and testing datasets. Conclusions: Machine learning approaches applied to electronic health record data offer a promising method for improving readmission prediction in patients with diabetes, but more work is needed to construct datasets with more clinical variables beyond the standard risk factors and to fine-tune and optimize the models.
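The evaluation metrics named above (F1, sensitivity, positive predictive value) and a simple undersampling step can be sketched as follows; the confusion counts and class sizes are hypothetical, not the study's results.

```python
import random

def sensitivity(tp, fn):
    """Recall: the share of actual readmissions that were caught."""
    return tp / (tp + fn)

def ppv(tp, fp):
    """Positive predictive value: the share of flagged cases that were real."""
    return tp / (tp + fp)

def f1(tp, fp, fn):
    """Harmonic mean of PPV and sensitivity."""
    p, r = ppv(tp, fp), sensitivity(tp, fn)
    return 2 * p * r / (p + r)

def undersample(majority, minority, seed=0):
    """Randomly drop majority-class samples to match the minority count."""
    rng = random.Random(seed)
    return rng.sample(majority, len(minority)) + list(minority)

# Hypothetical confusion counts for a readmission classifier.
tp, fp, fn = 60, 40, 20
assert abs(sensitivity(tp, fn) - 0.75) < 1e-9
assert abs(ppv(tp, fp) - 0.60) < 1e-9

# 900 non-readmitted vs. 100 readmitted records -> balanced 100/100 set.
balanced = undersample(list(range(900)), list(range(100)))
assert len(balanced) == 200
```

With imbalanced outcomes like 30-day readmission, accuracy alone is misleading, which is why the study leans on F1, sensitivity, and PPV together with resampling.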
Funding: Supported by the National Key Research and Development Program (Grant No. 2018YFB1800702) and Peng Cheng Laboratory (Grant No. PCL2021A02).
Abstract: Ripple acts as a real-time settlement and payment system connecting banks and payment providers. As the consensus layer that ensures the consistency of the Ripple network, the Ripple consensus protocol has attracted wide attention in recent years. Compared with Byzantine fault tolerant protocols, Ripple differs significantly in that the system can reach agreement under a decentralized trust model. However, Ripple has many problems in both theory and practice, as noted in previous research. This paper presents Ripple+, an improved scheme for the Ripple consensus protocol that improves Ripple in three aspects: (1) Ripple+ employs a specific trust model and a corresponding guideline for Unique Node List selection, which makes it easy to deploy in practice while meeting the safety and liveness conditions; (2) primary and view change mechanisms are added to solve the problem, discussed in previous research, that Ripple may lose liveness in some extreme scenarios; (3) we remove the strongly synchronized clock and timeouts during consensus periods to make the protocol suitable for a weak synchrony assumption. We implemented a prototype of Ripple+ and conducted experiments showing that Ripple+ can achieve a throughput of tens of thousands of transactions per second with no more than half a minute of latency, and that the view change mechanism hardly incurs additional cost.
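A toy illustration of the Unique Node List idea: each validator accepts a decision only when enough of the nodes on its own UNL agree. The 80% quorum and the validator names are illustrative conventions of Ripple-style protocols, not Ripple+'s actual parameters.

```python
# A node trusts only the validators on its own Unique Node List (UNL),
# so agreement is checked against that list rather than the whole network.
QUORUM = 0.8   # illustrative threshold, in the spirit of Ripple-style quorums

def reaches_quorum(unl, yes_votes):
    """Return True once votes from this node's trusted validators
    reach the quorum fraction of its UNL."""
    agreeing = sum(1 for v in unl if v in yes_votes)
    return agreeing / len(unl) >= QUORUM

unl_a = ["v1", "v2", "v3", "v4", "v5"]      # hypothetical UNL of node A
votes = {"v1", "v2", "v3", "v4"}            # 4 of 5 trusted validators agree
assert reaches_quorum(unl_a, votes)         # 0.8 >= 0.8 -> accept
assert not reaches_quorum(unl_a, {"v1", "v2", "v3"})   # 0.6 < 0.8 -> wait
```

Whether overlapping UNLs across nodes suffice for network-wide safety is exactly the selection-guideline problem that point (1) of the abstract addresses.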
Funding: This work was partially supported by the National Natural Science Foundation of China (U1636215, 61572492, 61650202, 61772236, and 61372191) and the National Key Research and Development Program (2016YFB0800802, 2016YFB0800803, 2016YFB0800804, 2017YFB0802204, 2016QY03D0601, 2016QY03D0603, and 2016YFB0800303).
Abstract: Security technology is a special kind of companion technology, developed for the underlying applications it serves. It is becoming increasingly critical in today's society as these underlying applications become more and more interconnected, pervasive, and intelligent. In recent years, we have witnessed the proliferation of cutting-edge computing and information technologies in a wide range of emerging areas, such as cloud computing.
Funding: This research was supported by the Key Program of the National Natural Science Foundation of China (71231002), Major Projects of the National Social Science Foundation of China (16ZDA055), the National Natural Science Foundation of China (91546121 and U1636215), and the National Key Research and Development Program of China (2017YFB0803300).
Abstract: Although many different views of social media coexist in the field of information systems (IS), such theories are usually not introduced in a consistent framework based on philosophical foundations. This paper introduces the dimensions of lifeworld and consideration of others. The concept of lifeworld includes Descartes' rationality and Heidegger's historicity, while consideration of others is based on instrumentalism and Heidegger's "being-with." These philosophical foundations elaborate a framework in which different archetypal theories applied to social media may be compared: Goffman's presentation of self, Bourdieu's social capital, Sartre's existential project, and Heidegger's "shared-world." While Goffman has become a frequent reference in social media research, the other three references are innovative in IS research. The concepts of these four theories of social media are compared with empirical findings in the IS literature. While some of these concepts match the empirical findings, others have not yet been investigated in the use of social media, suggesting future research directions.
Abstract: Web information systems (WIS) are frequently used and indispensable in daily social life, providing information services in many scenarios such as electronic commerce, communities, and edutainment. Data cleaning plays an essential role in various WIS scenarios to improve the quality of data services. In this paper, we present a review of state-of-the-art methods for data cleaning in WIS. According to the characteristics of data cleaning, we extract the critical elements of WIS, such as interactive objects, application scenarios, and core technology, to classify the existing works. Then, after elaborating and analyzing each category, we summarize the descriptions and challenges of data cleaning methods with sub-elements such as data and user interaction, data quality rules, models, crowdsourcing, and privacy preservation. Finally, we analyze various types of problems and provide suggestions for future research on data cleaning in WIS from the technological and interactive perspectives.
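One of the sub-elements above, the data quality rule, can be illustrated with a functional dependency check: rows whose values contradict a declared dependency are flagged for cleaning. The rule `zip -> city` and the sample rows are hypothetical.

```python
def fd_violations(rows, lhs, rhs):
    """Return the indices of rows whose `rhs` value conflicts with an
    earlier row sharing the same `lhs` value (violating lhs -> rhs)."""
    seen = {}
    bad = []
    for i, row in enumerate(rows):
        key, val = row[lhs], row[rhs]
        if key in seen and seen[key] != val:
            bad.append(i)
        else:
            seen.setdefault(key, val)
    return bad

rows = [
    {"zip": "100080", "city": "Beijing"},
    {"zip": "100080", "city": "Beijin"},     # typo -> violates zip -> city
    {"zip": "200030", "city": "Shanghai"},
]
assert fd_violations(rows, "zip", "city") == [1]
```

Real WIS pipelines combine many such rules with models, crowdsourcing, and user interaction to decide which conflicting value to repair.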
Funding: This work was supported by the National Key Research and Development Program of China (Nos. 2016QY03D0501 and 2017YFB0803300), the National Natural Science Foundation of China (Nos. 61601146 and 61732022), and the Sichuan Science and Technology Program (No. 2019YFSY0049).
Abstract: As a real-time and authoritative source, the official Web pages of organizations contain a large amount of information. The diversity of Web content and formats makes pre-processing essential to obtain unified attributed data, which is valuable for organizational analysis and mining. Existing research on handling multiple Web scenarios and on accuracy is insufficient. This paper proposes a method to transform organizational official Web pages into data with attributes. After locating the active blocks in the Web pages, structural and content features are proposed to classify information with a specific model. Extraction methods based on a trigger lexicon and LSTM (Long Short-Term Memory) are proposed, which efficiently process the classified information and extract data matching the attributes. Together, these steps form an accurate and efficient method for classifying and extracting information from organizational official Web pages. Experimental results show that our approach improves the performance indicators and exceeds the state of the art on a real dataset of organizational official Web pages.
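The trigger-lexicon extraction step can be sketched as follows: cue phrases that tend to precede an attribute's value trigger a pattern that captures it. The lexicon entries, attribute names, and regular expressions here are illustrative assumptions, not the paper's actual lexicon.

```python
import re

# Hypothetical trigger lexicon mapping attribute names to patterns whose
# cue phrases ("Contact:", "Tel:") tend to precede the attribute's value.
TRIGGERS = {
    "email": re.compile(r"(?:E-?mail|Contact)\s*[:：]\s*(\S+@\S+)", re.I),
    "phone": re.compile(r"(?:Tel|Phone)\s*[:：]\s*([\d\-\+\s]{6,})", re.I),
}

def extract_attributes(text):
    """Apply each trigger pattern and keep the first captured value."""
    found = {}
    for attr, pattern in TRIGGERS.items():
        m = pattern.search(text)
        if m:
            found[attr] = m.group(1).strip()
    return found

block = "Contact: press@example.org  Tel: 010-1234-5678"
out = extract_attributes(block)
assert out["email"] == "press@example.org"
assert out["phone"] == "010-1234-5678"
```

In the paper, this rule-based pass is complemented by an LSTM for values that no fixed trigger reliably marks.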
Funding: This work is partially supported by the National Key Research and Development Program (2018YFB1800702).
Abstract: As a critical Internet infrastructure, the domain name system (DNS) protects the authenticity and integrity of domain resource records through its security extensions (DNSSEC). DNSSEC builds a single-center, hierarchical resource authentication architecture, which brings management convenience but places the DNS at risk from a single point of failure. When the root key suffers a leak or misconfiguration, a top-level domain (TLD) authority cannot independently protect the authenticity of TLD data in the root zone. In this paper, we propose the self-certificating root, a lightweight security enhancement mechanism for the root zone that is compatible with the DNS/DNSSEC protocols. By adding the TLD's public key and a signature over the glue records to the root zone, this mechanism enables the TLD authority to certify its self-submitted data in the root zone and protects the TLD authority from the risk of root key failure. The mechanism is implemented on open-source software, namely the Berkeley Internet Name Domain (BIND), and evaluated in terms of performance, compatibility, and effectiveness. Evaluation results show that the proposed mechanism enables a resolver that only supports DNS/DNSSEC to authenticate the root zone's TLD data effectively with minimal performance difference.
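The verification idea can be sketched in toy form. Real DNSSEC uses public-key signatures (e.g., RSA or ECDSA); here HMAC merely stands in for the TLD's signature over its glue records, and the record names and key are illustrative, not real zone data.

```python
import hashlib
import hmac

# Toy stand-in: HMAC plays the role of the TLD's signature purely for
# illustration -- the actual mechanism uses DNSSEC public-key signatures.
TLD_KEY = b"example-tld-signing-key"

def canonical(glue_records):
    """Deterministic serialization of the TLD's glue records."""
    return "\n".join(sorted(f"{name} A {ip}" for name, ip in glue_records)).encode()

def sign_glue(key, glue_records):
    """The TLD authority signs its self-submitted root-zone data."""
    return hmac.new(key, canonical(glue_records), hashlib.sha256).hexdigest()

def verify_glue(key, glue_records, signature):
    """A resolver checks the glue data against the TLD's signature,
    independently of the root key."""
    return hmac.compare_digest(sign_glue(key, glue_records), signature)

glue = [("ns1.example.", "192.0.2.1"), ("ns2.example.", "192.0.2.2")]
sig = sign_glue(TLD_KEY, glue)
assert verify_glue(TLD_KEY, glue, sig)

tampered = [("ns1.example.", "198.51.100.9"), ("ns2.example.", "192.0.2.2")]
assert not verify_glue(TLD_KEY, tampered, sig)
```

The point of the mechanism is exactly this independence: even if the root key fails, tampering with the TLD's glue data is still detectable against the TLD's own key.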
Abstract: Analyzing and modeling BitTorrent (BT) resource popularity and swarm evolution is important for better understanding the current BT system and designing accurate BT simulators. Although many measurement studies on BT cover almost every important aspect, little work reflects the recent development of the BT system. In this paper, we develop a hybrid measurement system incorporating both active and passive approaches. By exploiting the DHT (Distributed Hash Table) and PEX (Peer Exchange) protocols, we collect more extensive information than prior measurement systems. Based on the measurement results, we study resource popularity and swarm evolution for different populations at minute/hour/day scales, and discover that: 1) resources in the BT system exhibit a markedly unbalanced distribution and a hotspot phenomenon, in that 74.6% of torrents have no more than 1000 peers; 2) in terms of swarm population, the lifetime of torrents can be divided into a fast growing stage, a dramatically shrinking stage, a sustaining stage, and a slowly fading-out stage; 3) users' interest and diurnal periodicity are the main factors influencing swarm evolution: the former dominates the first two stages, while the latter is decisive in the third stage. We propose an improved peer arrival rate model to describe the variation of the swarm population. Comparison results show that our model outperforms the state-of-the-art approach in terms of root mean square error and correlation coefficient.
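A peer arrival rate of the general kind described, combining decaying user interest with diurnal periodicity, might look like the following. The functional form and all parameter values are illustrative assumptions, not the paper's fitted model.

```python
import math

def arrival_rate(t_hours, lam0=50.0, decay=0.05, diurnal=0.3):
    """Hypothetical peer arrival rate (peers/hour): exponential decay of
    user interest, modulated by a 24-hour diurnal cycle."""
    base = lam0 * math.exp(-decay * t_hours)            # fading interest
    cycle = 1.0 + diurnal * math.sin(2 * math.pi * t_hours / 24.0)
    return base * cycle

# Interest decay dominates the early stages; the diurnal term only
# modulates the rate around its decaying trend.
assert arrival_rate(0) == 50.0
assert arrival_rate(48) < arrival_rate(0)
```

Such a model reproduces the qualitative stages in the abstract: a high initial rate (growth), rapid decay (shrinking), and a long diurnal-dominated tail (sustaining, then fading out).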
Funding: Supported by the National Basic Research Program of China (2013CB329601 and 2013CB329606) and the National Natural Science Foundation of China (91124002, 61372191, and 61303190).
Abstract: In recent years, with the rapid growth of social network services (SNS), social networks have come to pervade nearly every aspect of our daily lives. Social networks are influencing today's societal and cultural issues and changing the way people see themselves. To fully understand the running mechanisms of social networks, in this paper we address a series of closely interrelated and important elements of online social networks. We mainly focus on three important yet open research problems: (1) structural properties and evolution laws, (2) social crowds and their interaction behaviors, and (3) information and its diffusion. We review the related work on these three problems, then briefly introduce some interesting research directions and our progress on them.
Funding: This work was supported by the State Key Development Program of Basic Research of China (2013CB329605) and the National Natural Science Foundation of China (Grant Nos. 61300014 and 61372191).
Abstract: Social networks are fundamental media for the diffusion of information: contagions appear at some node of the network and are propagated over the edges. Prior research has mainly focused on each contagion spreading independently, disregarding the interactions of multiple contagions as they propagate at the same time. In the real world, simultaneous news and events usually have to compete for users' attention to get propagated; in other cases, they can cooperate with each other and achieve more influence. In this paper, an evolutionary game theoretic framework is proposed to model the interactions among multiple contagions. The basic idea is that different contagions in social networks resemble multiple organisms in a population, and the diffusion process unfolds as organisms interact and evolve from one state to another. The framework statistically learns the payoffs as contagions interact with each other and builds the payoff matrix. Since learning payoffs for all pairs of contagions is almost impossible (quadratic in the number of contagions), a contagion clustering method is proposed to decrease the number of parameters to fit, making the approach efficient and scalable. To verify the proposed framework, we conduct experiments using a real-world information-spreading dataset from Digg. Experimental results show that the proposed game theoretic framework helps to better comprehend the information diffusion process and can predict users' forwarding behaviors more accurately than previous studies. Analyses of the evolution dynamics of contagions and of evolutionarily stable strategies reveal whether a contagion can be promoted or suppressed by others in the diffusion process.
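The evolution dynamics can be sketched with a standard replicator-dynamics update over a payoff matrix. The 2x2 matrix below is an illustrative placeholder, whereas the paper learns payoffs statistically from diffusion data and clusters contagions to keep the matrix small.

```python
def replicator_step(x, payoff, dt=0.1):
    """One Euler step of replicator dynamics, dx/dt = x * (f_A - f_avg),
    where x is the population share forwarding contagion A."""
    (aa, ab), (ba, bb) = payoff
    f_a = aa * x + ab * (1 - x)          # fitness of forwarding A
    f_b = ba * x + bb * (1 - x)          # fitness of forwarding B
    f_avg = x * f_a + (1 - x) * f_b      # population-average fitness
    return x + dt * x * (f_a - f_avg)

# Illustrative payoffs in which contagion A strictly dominates B,
# so A should take over the population (an evolutionarily stable outcome).
payoff = ((3.0, 2.0), (1.0, 1.0))
x = 0.2                                  # initial share forwarding A
for _ in range(100):
    x = replicator_step(x, payoff)
assert x > 0.99                          # A is promoted, B is suppressed
```

With competitive payoffs instead (off-diagonal entries penalized), the same update drives one contagion's share toward zero, which is the suppression case the abstract mentions.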