The task of prison term prediction is to predict the term of a penalty from the textual fact description of a certain type of criminal case. Recent advances in deep learning frameworks inspire us to propose a two-step method to address this problem. To obtain a better understanding and a more specific representation of the legal texts, we summarize a judgment model according to the relevant law articles and then apply it to extract case features from judgment documents. By formalizing prison term prediction as a regression problem, we adopt a linear regression model and a neural network model to train the prison term predictor. In experiments, we construct a real-world dataset of theft case judgment documents. Experimental results demonstrate that our method can effectively extract judgment-specific case features from textual fact descriptions. The best-performing predictor achieves a mean absolute error of 3.2087 months and accuracies of 72.54% and 90.01% at error upper bounds of three and six months, respectively.
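The two evaluation measures named in this abstract, mean absolute error and accuracy at an error upper bound, can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names and the sample prison terms are made up, and treating the bound as inclusive (an error of exactly three months counts as correct) is an assumption.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between true and predicted terms (months)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy_within(y_true: Sequence[float], y_pred: Sequence[float],
                    bound_months: float) -> float:
    """Share of cases whose absolute error is at most bound_months.

    Assumption: the bound is inclusive.
    """
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= bound_months)
    return hits / len(y_true)


# Hypothetical prison terms in months (not data from the paper).
true_terms = [6.0, 12.0, 18.0, 36.0]
pred_terms = [7.0, 10.0, 23.0, 36.0]

mae = mean_absolute_error(true_terms, pred_terms)
acc_3 = accuracy_within(true_terms, pred_terms, 3)
acc_6 = accuracy_within(true_terms, pred_terms, 6)
```

With these sample values the absolute errors are 1, 2, 5 and 0 months, so three of the four cases fall within the three-month bound and all four fall within the six-month bound.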
Exploiting random access for the underlying connectivity provisioning has great potential to incorporate massive machine-type communication (MTC) devices in an Internet of Things (IoT) network. However, massive access attempts from versatile MTC devices may congest the IIoT network, thereby hindering the growth of IIoT application services. In this paper, an intelligence-enabled physical (PHY-) layer user signature code acquisition (USCA) algorithm is proposed to overcome the random access congestion problem with reduced signaling and control overhead. In the proposed scheme, the detector aims to approximate the optimal observation for both active user detection and user data reception by iteratively learning and predicting the convergence of the user signature codes that are active. A cross-entropy based low-complexity iterative updating rule is presented to guarantee that the proposed USCA algorithm is computationally feasible. A closed-form bit error rate (BER) performance analysis is carried out to show the efficiency of the proposed intelligent USCA algorithm. Simulation results confirm that the proposed USCA algorithm provides an inherent tradeoff between performance and complexity and allows the detector to achieve approximately optimal performance with reasonable computational complexity.
Ripple acts as a real-time settlement and payment system that connects banks and payment providers. As the consensus mechanism that ensures the consistency of the Ripple network, the Ripple consensus protocol has attracted wide attention in recent years. Compared with Byzantine fault tolerant protocols, Ripple differs significantly in that the system can reach agreement under a decentralized trust model. However, Ripple has many problems in both theory and practice, which have been discussed in previous research. This paper presents Ripple+, an improved scheme of the Ripple consensus protocol, which improves Ripple in three aspects: (1) Ripple+ employs a specific trust model and a corresponding guideline for Unique Node List selection, which makes it easy to deploy in practice while meeting the safety and liveness conditions; (2) a primary and a view change mechanism are added to solve the problem, discussed in previous research, that Ripple may lose liveness in some extreme scenarios; (3) we remove the strongly synchronous clock and the timeout during consensus periods to make the protocol suitable for a weak synchrony assumption. We implemented a prototype of Ripple+ and conducted experiments showing that Ripple+ can achieve a throughput of tens of thousands of transactions per second with a latency of no more than half a minute, and that the view change mechanism incurs hardly any additional cost.
Internet communication protocols define the behavior rules that network components follow when they communicate with each other. With the continuous development of network technologies, private or unknown network protocols keep emerging in various network environments. As a result, the relevant protocol specifications become difficult to interpret or unavailable in many situations, such as network security management and intrusion detection. Although protocol reverse engineering has been investigated in recent years to recover the specifications of unknown protocols, most existing methods have proven to be time-consuming and of limited efficiency, especially when applied to unknown protocol state machines. This paper proposes a state merging algorithm based on EDSM (Evidence-Driven State Merging) to infer the transition rules of unknown protocols, in the form of state machines, with high efficiency. Compared with a classical state machine inference method based on the Exbar algorithm, the experimental results demonstrate that our proposed method runs faster, especially when dealing with massive training data sets. In addition, the state machines produced by this method have higher similarity to the reference state machines constructed from public specifications.
Web information systems (WIS) are frequently used and indispensable in daily social life. WIS provide information services in many scenarios, such as electronic commerce, communities, and edutainment. Data cleaning plays an essential role in various WIS scenarios to improve the quality of data services. In this paper, we present a review of the state-of-the-art methods for data cleaning in WIS. According to the characteristics of data cleaning, we extract the critical elements of WIS, such as interactive objects, application scenarios, and core technology, to classify the existing works. Then, after elaborating on and analyzing each category, we summarize the descriptions and challenges of data cleaning methods with sub-elements such as data and user interaction, data quality rules, models, crowdsourcing, and privacy preservation. Finally, we analyze various types of problems and provide suggestions for future research on data cleaning in WIS from the technology and interaction perspectives.
As a real-time and authoritative source, the official Web pages of organizations contain a large amount of information. The diversity of Web content and formats makes pre-processing essential for obtaining unified attributed data, which is valuable for organizational analysis and mining. Existing research is insufficient in handling multiple Web scenarios and in accuracy. This paper proposes a method to transform organizational official Web pages into data with attributes. After the active blocks in the Web pages are located, structural and content features are used to classify information with a specific model. Extraction methods based on a trigger lexicon and LSTM (Long Short-Term Memory) are proposed, which efficiently process the classified information and extract data that matches the attributes. Together, these steps form an accurate and efficient method to classify and extract information from organizational official Web pages. Experimental results show that our approach improves the performance indicators and exceeds the state of the art on a real data set of organizational official Web pages.
With the development of Internet technology and the growing public awareness of the rule of law, online legal consultation has become an important means for the general public to seek legal advice. However, different people have different ways of expressing themselves and different legal backgrounds, so the same legal question may be described in different ways. How to accurately understand the true intentions behind different users' legal consulting statements is an important issue that urgently needs to be solved in the field of legal consulting services. Traditional intent understanding algorithms rely heavily on the lexical and semantic information in the original data, are not scalable, and often require taxing manual annotation work. This article proposes a new approach, TdBrnn, which is based on a normalized tensor decomposition method and Bi-LSTM, to learn users' intentions in legal consulting. First, we represent the users' legal consulting statements as a tensor. Then we use the normalized tensor decomposition layer proposed in this article to extract the tensor elements and structural information of the original tensor that best represent the users' intention of legal consultation, namely the core tensor. The core tensor relies less on the lexical and semantic information of the original legal consulting statements, reduces the dimension of the original tensor, and greatly reduces the computational complexity of the subsequent Bi-LSTM algorithm. Furthermore, we use a large number of core tensors, obtained by applying the tensor decomposition layer to the tensors of users' legal consulting statements, as inputs to continuously train the Bi-LSTM, and finally derive a legal consultation intention classification model that can comprehensively understand the user's legal consultation intention. Experiments show that our method converges faster and achieves higher accuracy than traditional recurrent neural networks.
A Scalable Multi-Hash (SMH) name lookup method is proposed, which is based on hierarchical name decomposition to aggregate names sharing common prefixes and on multiple scalable hash tables to minimize collisions among prefixes. We take the component, instead of the entire name, as the key in the hash functions. The SMH method achieves lookup speeds of 21.45 and 20.87 Mbps on prefix tables with 2 million and 3.6 million names, respectively. The proposed method is the fastest of the four methods considered and requires 61.63 and 89.17 Mb of memory on the prefix tables with 2 million and 3.6 million names, respectively. The required memory is slightly larger than that of the best method. The scalability of SMH outperforms that of the other two methods.
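The idea of hashing one name component at a time, rather than the entire name, can be sketched as below. This is a simplified illustration and not the SMH data structure itself: the class `ComponentPrefixTable`, its methods, and the sample names are hypothetical, and plain Python dictionaries (one per name level) stand in for the scalable hash tables mentioned in the abstract.

```python
from typing import Dict, List, Optional, Tuple


def split_name(name: str) -> List[str]:
    """Decompose a hierarchical name such as '/com/example/video' into components."""
    return [part for part in name.split("/") if part]


class ComponentPrefixTable:
    """One hash table per level, keyed by (parent node id, component)."""

    def __init__(self) -> None:
        self.levels: List[Dict[Tuple[int, str], int]] = []
        self.values: Dict[int, str] = {}   # node id -> value of a stored prefix
        self._next_id = 1                  # node id 0 is the root

    def insert(self, prefix: str, value: str) -> None:
        node = 0
        for depth, component in enumerate(split_name(prefix)):
            if depth == len(self.levels):
                self.levels.append({})
            table = self.levels[depth]
            key = (node, component)
            if key not in table:
                # Names sharing a prefix reuse the same node ids up to here.
                table[key] = self._next_id
                self._next_id += 1
            node = table[key]
        self.values[node] = value

    def longest_prefix_match(self, name: str) -> Optional[str]:
        node: Optional[int] = 0
        best = None
        for depth, component in enumerate(split_name(name)):
            if depth >= len(self.levels):
                break
            node = self.levels[depth].get((node, component))
            if node is None:
                break
            if node in self.values:
                best = self.values[node]   # remember the deepest stored prefix
        return best


table = ComponentPrefixTable()
table.insert("/com/example", "face-A")
table.insert("/com/example/video", "face-B")
```

Because each key is a single component plus its parent node, names that share a prefix are aggregated onto the same entries, which is the property the abstract attributes to hierarchical name decomposition.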
Road networks have been used in a wide range of applications to reduce the cost of transportation and improve the quality of related services. Shortest road distance computation is considered one of the most fundamental operations in road network computation. To alleviate concerns about location privacy leaks during road distance computation, a secure and efficient road distance computation approach is desirable. In this paper, we propose two secure road distance computation approaches, which compute road distances over encrypted data efficiently. An approximate road distance computation approach is designed using Partially Homomorphic Encryption and road network set embedding. An exact road distance computation approach is built using Somewhat Homomorphic Encryption and road network hypercube embedding. We implement the two approaches and evaluate them on a real city-scale road network. Evaluation results show that our approaches are accurate and efficient.
As a critical Internet infrastructure, the domain name system (DNS) protects the authenticity and integrity of domain resource records through the introduction of security extensions (DNSSEC). DNSSEC builds a single-center, hierarchical resource authentication architecture, which brings management convenience but exposes the DNS to the risk of a single point of failure. When the root key is leaked or misconfigured, a top level domain (TLD) authority cannot independently protect the authenticity of TLD data in the root zone. In this paper, we propose self-certificating root, a lightweight security enhancement mechanism for the root zone that is compatible with the DNS/DNSSEC protocol. By adding the TLD public key and the signature of the glue records to the root zone, this mechanism enables the TLD authority to certify the self-submitted data in the root zone and protects the TLD authority from the risk of root key failure. The mechanism is implemented on open-source software, namely Berkeley Internet Name Domain (BIND), and evaluated in terms of performance, compatibility, and effectiveness. Evaluation results show that the proposed mechanism enables a resolver that only supports DNS/DNSSEC to authenticate the root zone TLD data effectively with minimal performance difference.
Deep Packet Inspection (DPI) plays a major role at the core of many monitoring appliances, such as NIDS and NIPS. DPI helps content providers and censors monitor network traffic. However, the surge of network traffic has put tremendous pressure on the performance of DPI. In fact, the sensitive content being monitored is only a small part of network traffic; that is, most traffic is undesired. Taking a close look at network traffic, we found that it contains much undesired high-frequency content (UHC) that is not monitored. The key to improving DPI performance is to skip as many useless characters as possible. Nevertheless, researchers generally study algorithms that skip useless characters based on the sensitive content, ignoring the high-frequency non-sensitive content. To fill this gap, in this paper we design a model, named Fast AC Model with Skipping (FAMS), to quickly skip UHC while scanning traffic. The model consists of a standard AC automaton, where the input traffic is scanned byte by byte, and an additional sub-model, which includes a mapping set and a UHC matching model. The mapping set is a bridge between the state nodes of the AC automaton and the UHC matching model, while the latter selects a matching function from hash and fingerprint functions. Our experiments show promising results: we achieve a throughput of 1.3-2.6 times the original throughput and 1.1-1.3 times that of Barr's double path method.
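The standard AC (Aho-Corasick) automaton that FAMS builds on, scanning the input byte by byte, can be sketched as follows. This covers only the baseline automaton; the mapping set and the UHC matching sub-model described in the abstract are not reproduced here. The function names and the sample patterns are illustrative assumptions.

```python
from collections import deque
from typing import Dict, List, Tuple

Automaton = Tuple[List[Dict[int, int]], List[int], List[List[bytes]]]


def build_automaton(patterns: List[bytes]) -> Automaton:
    """Build goto, failure and output tables for a set of byte patterns."""
    goto: List[Dict[int, int]] = [{}]
    fail: List[int] = [0]
    output: List[List[bytes]] = [[]]
    # Phase 1: insert every pattern into a trie.
    for pattern in patterns:
        state = 0
        for byte in pattern:
            nxt = goto[state].get(byte)
            if nxt is None:
                nxt = len(goto)
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][byte] = nxt
            state = nxt
        output[state].append(pattern)
    # Phase 2: breadth-first pass to compute failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for byte, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and byte not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(byte, 0)
            # Inherit patterns that end at the failure state.
            output[nxt] = output[nxt] + output[fail[nxt]]
    return goto, fail, output


def scan(data: bytes, automaton: Automaton) -> List[Tuple[int, bytes]]:
    """Scan data one byte at a time; return (start offset, pattern) pairs."""
    goto, fail, output = automaton
    state = 0
    matches = []
    for pos, byte in enumerate(data):
        while state and byte not in goto[state]:
            state = fail[state]
        state = goto[state].get(byte, 0)
        for pattern in output[state]:
            matches.append((pos - len(pattern) + 1, pattern))
    return matches


automaton = build_automaton([b"he", b"she", b"hers"])
found = scan(b"ushers", automaton)
```

Every input byte passes through the main loop of `scan`, which is why skipping frequent but uninteresting content, as FAMS aims to do, can raise throughput.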
Concentrations of lead (Pb), cadmium (Cd), chromium (Cr), copper (Cu), zinc (Zn) and manganese (Mn) were measured in various organs (such as liver and muscle) from 9 species of economically important freshwater fishes collected from the northeast area of Guangdong Province. The metal concentrations were measured by inductively coupled plasma atomic emission spectrometry (ICP-AES). Results showed that the levels of metals in the hepatopancreas of the fishes followed the order Zn > Pb > Cu > Hg > Cd, while those in muscles followed Zn > Cr > Pb > Mn > Cu > Cd. In general, the metal concentrations were significantly higher in liver samples than in muscle samples. Based on the "pollution index of single factor", the fishes were polluted to one degree or another by Pb, Cd, Cr, Cu and Zn, and the pollution levels mostly followed the order Pb > Cd > Cr > Cu > Zn. In most cases, the Pb and Cd indexes measured in the hepatopancreas of the fishes exceeded the national food safety criteria of China. Moreover, the contents of heavy metals in the fishes did not vary with the trophic level to which they belong. In summary, the fishes were polluted by Pb, Cd, Cr, Cu and Zn to some extent, which indicates a hidden danger of heavy metal pollution to the ecological environment and to the safety of fishery production in the area.
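A single-factor pollution index is commonly computed as the measured concentration divided by the permitted limit, with values above 1 indicating that the limit is exceeded. The sketch below assumes that definition; the abstract does not spell out its formula. The concentrations and limits are made-up placeholders, not values from the study or from any national standard.

```python
from typing import Dict, List


def single_factor_index(concentration: float, limit: float) -> float:
    """Return P = C / S, the measured concentration over the permitted limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return concentration / limit


# Hypothetical concentrations and limits in mg/kg (placeholders only).
measured: Dict[str, float] = {"Pb": 0.8, "Cd": 0.05, "Zn": 20.0}
limits: Dict[str, float] = {"Pb": 0.5, "Cd": 0.1, "Zn": 50.0}

indices = {metal: single_factor_index(measured[metal], limits[metal])
           for metal in measured}

# Rank metals from most to least polluting, and list those above the limit.
ranking: List[str] = sorted(indices, key=lambda metal: indices[metal], reverse=True)
exceeded: List[str] = [metal for metal in ranking if indices[metal] > 1.0]
```

Sorting the indices in descending order gives a ranking of the same kind as the Pb > Cd > Cr > Cu > Zn ordering reported in the abstract.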
Analyzing and modeling BitTorrent (BT) resource popularity and swarm evolution is important for better understanding the current BT system and for designing accurate BT simulators. Although many measurement studies on BT cover almost every important aspect, little work reflects the recent development of the BT system. In this paper, we develop a hybrid measurement system incorporating both active and passive approaches. By exploiting the DHT (Distributed Hash Table) and PEX (Peer Exchange) protocols, we collect more extensive information than prior measurement systems. Based on the measurement results, we study resource popularity and swarm evolution with different populations at minute, hour and day scales, and discover that: 1) the resources in the BT system show an obviously unbalanced distribution and a hotspot phenomenon, in that 74.6% of torrents have no more than 1000 peers; 2) the lifetime of torrents can be divided, in terms of swarm population, into a fast growing stage, a dramatically shrinking stage, a sustaining stage and a slowly fading out stage; 3) users' interest and diurnal periodicity are the main factors that influence swarm evolution. The former dominates the first two stages, while the latter is decisive in the third stage. We propose an improved peer arrival rate model to describe the variation of the swarm population. Comparison results show that our model outperforms the state-of-the-art approach in terms of root mean square error and correlation coefficient.
Measuring and characterizing peer-to-peer (P2P) file-sharing systems benefits the optimization and management of P2P systems. Though many measurement studies on BitTorrent cover almost every important aspect, few of them focus on the measurement issues and the corresponding solutions, which can strongly influence the accuracy of measurement results. This paper analyzes the key difficulties of measuring BitTorrent and presents a measurement system that combines active and passive approaches, which handles these problems well and balances efficiency and integrity. Compared to other work, a more complete and representative measurement was then performed for nearly two months, and several characteristics were examined: 1) diverse content is shared in the BitTorrent system, but multimedia files larger than 100 MB are the most common; 2) Distributed Hash Tables have indeed enhanced the ability of peer discovery, though there are some pitfalls to be addressed; 3) pieces are distributed uniformly after the early stage and there are few rare pieces. Furthermore, the peer arrival rate shows a periodic pattern, which was not well modeled before. An improved model is then proposed, and the experimental results indicate that the new model fits the actual measurement results with high accuracy.
Social network platforms such as Twitter, Instagram and Facebook are among the fastest and most convenient means for sharing digital images. Digital images are generally accepted as credible news, but, owing to the existence of powerful image editing software, they may undergo manipulation before being shared without leaving any obvious traces of tampering. Copy-move forgery is a very simple and common type of image forgery, where a part of the image is copied and then pasted in the same image to replicate or hide some parts of the image. In this paper, we propose a copy-scale-move forgery detection method based on the Scale Invariant Feature Operator (SFOP) detector. The keypoints are then described using the MROGH descriptor. Experimental results show that the proposed method is able to detect and locate the forgery even under some geometric transformations such as scaling.
This paper analyzes the particularity of the civil aviation passenger auxiliary service recommendation scenario. Because traditional recommendation algorithms have certain limitations in recommending civil aviation auxiliary services, a context-aware SVR recommendation algorithm for civil aviation auxiliary services is proposed. Civil aviation passenger travel data are analyzed to construct a passenger preference model, which is then used to recommend auxiliary services to passengers. Building on traditional two-dimensional user-item recommendation, the algorithm considers user characteristics, item attributes and user contextual information during recommendation, which can reduce data sparseness to some degree. In addition, when there is a new user or a new item, similar users or items can be found according to the user or item attributes, which alleviates the cold-start problem to some extent. The experimental results show that the algorithm recommends auxiliary services for passengers more accurately, which provides convenience for passengers and improves the quality of airlines' services.
Monitoring of sweat pH plays important roles in physiological health, nutritional balance, psychological stress, and sports performance. However, the combination of functional MOFs with phosphorescent materials to acquire real-time physiological information, as well as its application in dual-mode anti-counterfeiting, has seldom been reported. Herein, we developed multifunctional gel films based on MOFs and phosphorescent dyes that respond to H+ ions, and the related mechanism was studied in detail. Upon exposure to H+, the composite gel film exhibited a decreased fluorescent signal but enhanced room temperature phosphorescence (RTP), which could be utilized for dual-mode sweat pH sensing. Moreover, the multifunctional gel films showed potential for application in information encryption and anti-counterfeiting through the design of stimulus-responsive multiple patterns. This research provides a new avenue for portable and non-invasive sweat pH monitoring methods while also offering insights into stimulus-responsive multifunctional materials.
A prescribed performance control scheme based on a three-inflection-point hyperbolic function and a predefined time performance function is proposed to solve the trajectory tracking problem of a forward-tilting morphing aerospace vehicle with time-varying actuator faults. To accurately estimate the degree of actuator effectiveness loss, an immersion and invariance observer based on a predefined time dynamic scale factor is designed to estimate and compensate for it. A composite dynamic sliding mode surface is designed using a three-inflection-point hyperbolic function, and a novel three-inflection-point sliding mode control framework is proposed. The convergence domain of the sliding manifold is adjusted by parameters, so the system error convergence is controllable. A transfer function is designed to eliminate the sensitivity of the three-inflection-point hyperbolic sliding mode to the unknown initial state and is combined with a barrier Lyapunov function to enforce the performance constraint of the system. The global asymptotic stability of the system is demonstrated with a strict mathematical proof. The effectiveness and superiority of the proposed control scheme are shown by simulation experiments.
Funding (prison term prediction): This work is supported in part by the National Key Research and Development Program of China under grants 2018YFC0830602 and 2016QY03D0501, and in part by the National Natural Science Foundation of China (NSFC) under grants 61872111, 61732022 and 61601146.
Funding (USCA random access): Supported in part by the Natural Science Foundation of Heilongjiang Province of China under Grant YQ2021F003, in part by the National Natural Science Foundation of China under Grant 61901140, in part by the China Postdoctoral Science Foundation Funded Project under Grant 2019M650067, and in part by the Science and Technology on Communication Networks Laboratory under Grant SCX21641X003.
Funding (Ripple+): The National Key Research and Development Program (Grant No. 2018YFB1800702) and Peng Cheng Laboratory (Grant No. PCL2021A02).
Funding (protocol state machine inference): This work is supported by the National Natural Science Foundation of China (Grant Numbers: 61471141, 61361166006, 61301099), the Basic Research Project of Shenzhen, China (Grant Number: JCYJ20150513151706561), and the National Defense Basic Scientific Research Program of China (Grant Number: JCKY2018603B006).
文摘Internet communication protocols define the behavior rules of network components when they communicate with each other.With the continuous development of network technologies,many private or unknown network protocols are emerging in endlessly various network environments.Herein,relevant protocol specifications become difficult or unavailable to translate in many situations such as network security management and intrusion detection.Although protocol reverse engineering is being investigated in recent years to perform reverse analysis on the specifications of unknown protocols,most existing methods have proven to be time-consuming with limited efficiency,especially when applied on unknown protocol state machines.This paper proposes a state merging algorithm based on EDSM(Evidence-Driven State Merging)to infer the transition rules of unknown protocols in form of state machines with high efficiency.Compared with another classical state machine inferring method based on Exbar algorithm,the experiment results demonstrate that our proposed method could run faster,especially when dealing with massive training data sets.In addition,this method can also make the state machines have higher similarities with the reference state machines constructed from public specifications.
Abstract: Web information systems (WIS) are frequently used and indispensable in daily social life. WIS provide information services in many scenarios, such as electronic commerce, communities, and edutainment. Data cleaning plays an essential role in various WIS scenarios in improving the quality of data services. In this paper, we present a review of the state-of-the-art methods for data cleaning in WIS. According to the characteristics of data cleaning, we extract the critical elements of WIS, such as interactive objects, application scenarios, and core technology, to classify the existing works. Then, after elaborating on and analyzing each category, we summarize the descriptions and challenges of data cleaning methods in terms of sub-elements such as data and user interaction, data quality rules, models, crowdsourcing, and privacy preservation. Finally, we analyze various types of problems and provide suggestions for future research on data cleaning in WIS from the technology and interaction perspectives.
Funding: This work was supported by the National Key Research and Development Program of China (Nos. 2016QY03D0501, 2017YFB0803300), the National Natural Science Foundation of China (Nos. 61601146, 61732022), and the Sichuan Science and Technology Program (No. 2019YFSY0049).
Abstract: As real-time and authoritative sources, the official Web pages of organizations contain a large amount of information. Because Web content and formats are diverse, pre-processing is essential to obtain unified attributed data, which is valuable for organizational analysis and mining. Existing research handles multiple Web scenarios poorly and achieves insufficient accuracy. This paper proposes a method to transform organizational official Web pages into data with attributes. After the active blocks in the Web pages are located, structural and content features are used to classify information with a specific model. Extraction methods based on a trigger lexicon and LSTM (Long Short-Term Memory) are then proposed, which efficiently process the classified information and extract data that matches the attributes. Together, these steps form an accurate and efficient method to classify and extract information from organizational official Web pages. Experimental results on a real data set of organizational official Web pages show that our approach improves the performance indicators and exceeds the state of the art.
Funding: This work is supported by the National Key Research and Development Program of China (2018YFC0830602, 2016QY03D0501) and the National Natural Science Foundation of China (61872111).
Abstract: With the development of Internet technology and growing public awareness of the rule of law, online legal consultation has become an important channel through which the general public seeks legal advice. However, different people express themselves differently and have different legal backgrounds, which can lead to different descriptions of the same legal question. How to accurately understand the true intentions behind different users' legal consulting statements is an important issue that urgently needs to be solved in the field of legal consulting services. Traditional intent understanding algorithms rely heavily on the lexical and semantic information in the original data, are not scalable, and often require taxing manual annotation work. This article proposes a new approach, TdBrnn, which is based on the normalized tensor decomposition method and Bi-LSTM, to learn users' intentions in legal consulting. First, we represent the users' legal consulting statements as a tensor. We then use the normalized tensor decomposition layer proposed in this article to extract the tensor elements and structural information of the original tensor that best represent users' intentions in legal consultation, namely the core tensor. The core tensor relies less on the lexical and semantic information of the original statements, reduces the dimension of the original tensor, and greatly reduces the computational complexity of the subsequent Bi-LSTM algorithm. Furthermore, we use a large number of core tensors, obtained by applying the tensor decomposition layer to users' legal consulting statement tensors, as inputs to continuously train the Bi-LSTM, and finally derive a classification model that can comprehensively understand users' legal consultation intentions. Experiments show that our method converges faster and achieves higher accuracy than traditional recurrent neural networks.
Funding: Sponsored by the National Basic Research Program of China (973 Program) (Grant No. 2011CB302605), the National High Technology Research and Development Program of China (863 Program) (Grant Nos. 2011AA010705, 2012AA012502, 2012AA012506), the National Key Technology R&D Program of China (Grant No. 2012BAH37B01), the National Science Foundation of China (Grant Nos. 61202457, 61402149), and the CNNIC (Grant No. K201211043).
Abstract: A Scalable Multi-Hash (SMH) name lookup method is proposed. It is based on hierarchical name decomposition, which aggregates names sharing common prefixes, and on multiple scalable hash tables, which minimize collisions among prefixes. We take the component, instead of the entire name, as the key in the hash functions. The SMH method achieves lookup speeds of 21.45 and 20.87 Mbps on prefix tables with 2 million and 3.6 million names, respectively. The proposed method is the fastest of the four methods considered and requires 61.63 and 89.17 Mb of memory on the prefix tables with 2 million and 3.6 million names, respectively. The required memory is slightly larger than that of the best method. The scalability of SMH outperforms that of the other two methods.
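The idea of keying hash tables on name components rather than whole names can be illustrated with a minimal sketch. This is not the SMH implementation: the class `ComponentPrefixTable`, the helper `split_name`, the `/`-separated name format, and the stored values are all assumptions made for illustration, and Python dictionaries stand in for the scalable hash tables.

```python
def split_name(name):
    """Split a hierarchical name such as '/com/example/video' into components."""
    return [part for part in name.split("/") if part]


class ComponentPrefixTable:
    """Prefix table in which every level is a hash table keyed by one component."""

    def __init__(self):
        # Each node holds a dict of children (the per-level hash table)
        # and an optional value if a stored prefix ends at this node.
        self._root = {"children": {}, "value": None}

    def insert(self, prefix, value):
        node = self._root
        for component in split_name(prefix):
            node = node["children"].setdefault(
                component, {"children": {}, "value": None}
            )
        node["value"] = value

    def longest_prefix_match(self, name):
        """Return the value of the longest stored prefix of name, or None."""
        node = self._root
        best = None
        for component in split_name(name):
            node = node["children"].get(component)
            if node is None:
                break
            if node["value"] is not None:
                best = node["value"]
        return best


table = ComponentPrefixTable()
table.insert("/com/example", "face-1")
table.insert("/com/example/video", "face-2")
print(table.longest_prefix_match("/com/example/video/clip.mp4"))
```

Names that share the prefix `/com/example` reuse the same nodes, which is the aggregation effect the abstract attributes to hierarchical decomposition; how SMH sizes and scales its tables is not described in the abstract and is not modeled here.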
Funding: This work was partially supported by the National Natural Science Foundation of China (Grant Nos. 61601146, 61732022) and the National Key R&D Program of China (Grant No. 2016QY05X1000).
Abstract: Road networks have been used in a wide range of applications to reduce the cost of transportation and improve the quality of related services. Shortest road distance computation is considered one of the most fundamental operations in road network computation. To alleviate concerns about location privacy leaks during road distance computation, a secure and efficient road distance computation approach is desirable. In this paper, we propose two secure road distance computation approaches, which can compute road distance over encrypted data efficiently. An approximate road distance computation approach is designed using Partially Homomorphic Encryption and road network set embedding. An exact road distance computation approach is built using Somewhat Homomorphic Encryption and road network hypercube embedding. We implement both approaches and evaluate them on a real city-scale road network. Evaluation results show that our approaches are accurate and efficient.
Funding: This work is partially supported by the National Key Research and Development Program (2018YFB1800702).
Abstract: As a critical Internet infrastructure, the domain name system (DNS) protects the authenticity and integrity of domain resource records through the introduction of security extensions (DNSSEC). DNSSEC builds a single-center, hierarchical resource authentication architecture, which brings management convenience but places the DNS at risk from a single point of failure. When the root key suffers a leak or misconfiguration, the top level domain (TLD) authority cannot independently protect the authenticity of TLD data in the root zone. In this paper, we propose self-certificating root, a lightweight security enhancement mechanism for the root zone that is compatible with the DNS/DNSSEC protocol. By adding the TLD public key and the signature of the glue records to the root zone, this mechanism enables the TLD authority to certify the self-submitted data in the root zone and protects the TLD authority from the risk of root key failure. This mechanism is implemented on open-source software, namely Berkeley Internet Name Domain (BIND), and evaluated in terms of performance, compatibility, and effectiveness. Evaluation results show that the proposed mechanism enables a resolver that only supports DNS/DNSSEC to authenticate the root zone TLD data effectively with minimal performance difference.
Funding: This work was supported by the National Natural Science Foundation of China under Grants (Nos. 61771166, 61771166, 61402137) and the National Key Research & Development Plan of China under Grant 2016QY05X1000.
Abstract: Deep Packet Inspection (DPI) plays a major role at the core of many monitoring appliances, such as NIDS and NIPS. DPI helps content providers and censors monitor network traffic. However, the surge of network traffic has put tremendous pressure on the performance of DPI. In fact, the sensitive content being monitored makes up only a minority of network traffic; that is to say, most traffic is undesired. Taking a close look at the network traffic, we found that it contains a great deal of undesired high-frequency content (UHC) that is not monitored. It is well known that the key to improving DPI performance is to skip as many useless characters as possible. Nevertheless, researchers generally study algorithms that skip useless characters based on the sensitive content, ignoring the high-frequency non-sensitive content. To fill this gap, in this paper we design a model, named Fast AC Model with Skipping (FAMS), to quickly skip UHC while scanning traffic. The model consists of a standard AC automaton, where the input traffic is scanned byte by byte, and an additional sub-model, which includes a mapping set and a UHC matching model. The mapping set is a bridge between the state nodes of the AC automaton and the UHC matching model, while the latter selects a matching function from hash and fingerprint functions. Our experiments show promising results: we achieve a throughput gain of 1.3-2.6 times the original throughput and 1.1-1.3 times that of Barr's double path method.
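The "standard AC automaton" that scans input byte by byte can be sketched as follows. This is a minimal textbook Aho-Corasick matcher only; it does not include the mapping set or the UHC skipping sub-model of FAMS, and the function names `build_automaton` and `scan` as well as the sample patterns are assumptions chosen for illustration.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto, failure and output tables for a set of byte patterns."""
    goto = [{}]   # goto[state][byte] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # patterns recognized when a state is reached

    # Phase 1: build the trie of all patterns.
    for pattern in patterns:
        state = 0
        for byte in pattern:
            if byte not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][byte] = len(goto) - 1
            state = goto[state][byte]
        out[state].append(pattern)

    # Phase 2: breadth-first traversal to set failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for byte, child in goto[state].items():
            queue.append(child)
            fallback = fail[state]
            while fallback and byte not in goto[fallback]:
                fallback = fail[fallback]
            fail[child] = goto[fallback].get(byte, 0)
            # Inherit matches that end at the failure target.
            out[child] = out[child] + out[fail[child]]
    return goto, fail, out


def scan(data, automaton):
    """Scan data one byte at a time and return (start_offset, pattern) hits."""
    goto, fail, out = automaton
    state = 0
    hits = []
    for index, byte in enumerate(data):
        while state and byte not in goto[state]:
            state = fail[state]
        state = goto[state].get(byte, 0)
        for pattern in out[state]:
            hits.append((index - len(pattern) + 1, pattern))
    return hits


automaton = build_automaton([b"he", b"she", b"his", b"hers"])
print(scan(b"ushers", automaton))
```

Every input byte causes at least one table lookup in this baseline, which is the per-byte cost that a skipping scheme such as the one described in the abstract aims to avoid for undesired high-frequency content.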
Funding: Supported by the Natural Science Foundation of Guangdong (S2013010013693) and the Outstanding Young Teacher Training Program of Colleges and Universities in Guangdong Province (Yq2013152).
Abstract: Concentrations of lead (Pb), cadmium (Cd), chromium (Cr), copper (Cu), zinc (Zn) and manganese (Mn) were measured in various organs (such as liver and muscle) from 9 species of freshwater economic fishes collected from the northeast area of Guangdong Province. The metal concentrations were measured by inductively coupled plasma atomic emission spectrometry (ICP-AES). Results showed that the levels of metals in the hepatopancreas of the fishes followed the order Zn > Pb > Cu > Hg > Cd, while in muscles the order was Zn > Cr > Pb > Mn > Cu > Cd. In general, the metal concentrations were significantly higher in liver samples than in muscle samples. Based on the "pollution index of single factor", the fishes were, to one degree or another, polluted by Pb, Cd, Cr, Cu and Zn, and the pollution levels mostly followed the order Pb > Cd > Cr > Cu > Zn. The indexes of Pb and Cd tested in the hepatopancreas of the fishes mostly exceeded the national food safety criteria of China. Moreover, the contents of heavy metals in the fishes did not vary with the trophic level to which they belong. In summary, the fishes were polluted by Pb, Cd, Cr, Cu and Zn to some extent, which indicates that heavy metal pollution poses a hidden danger to the ecological environment and to the safety of fishery production in the area.
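For reference, the single-factor pollution index is commonly defined as the ratio of a measured concentration to its permitted limit. The abstract does not state the formula or the limits it used, so the symbols below are assumptions: $C_i$ is the measured concentration of metal $i$ in a tissue sample and $S_i$ is the corresponding limit from the applicable food safety standard, in the same units.

```latex
P_i = \frac{C_i}{S_i}
```

Under this common definition, $P_i > 1$ means the measured concentration of metal $i$ exceeds the limit, and a larger $P_i$ indicates heavier pollution by that metal.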
Abstract: Analyzing and modeling BitTorrent (BT) resource popularity and swarm evolution is important for better understanding the current BT system and for designing accurate BT simulators. Although many measurement studies on BT cover almost every important aspect, little work reflects the recent development of the BT system. In this paper, we develop a hybrid measurement system incorporating both active and passive approaches. By exploiting the DHT (Distributed Hash Table) and PEX (Peer Exchange) protocols, we collect more extensive information than prior measurement systems. Based on the measurement results, we study the resource popularity and swarm evolution with different populations on minute, hour and day scales, and discover that: 1) the resources in the BT system show an obviously unbalanced distribution and a hotspot phenomenon, in that 74.6% of torrents have no more than 1000 peers; 2) the lifetime of torrents can be divided into a fast growing stage, a dramatically shrinking stage, a sustaining stage and a slowly fading out stage in terms of swarm population; 3) users' interest and diurnal periodicity are the main factors that influence the swarm evolution. The former dominates the first two stages, while the latter is decisive in the third stage. We propose an improved peer arrival rate model to describe the variation of the swarm population. Comparison results show that our model outperforms the state-of-the-art approach according to root mean square error and correlation coefficient.
Abstract: Measuring and characterizing peer-to-peer (P2P) file-sharing systems benefits the optimization and management of P2P systems. Though there are many measurement studies on BitTorrent covering almost every important aspect, few of them focus on the measurement issues and the corresponding solutions, which can strongly influence the accuracy of measurement results. This paper analyzes the key difficulties of measuring BitTorrent and presents a measurement system that combines active and passive methods, which handles these problems well and balances efficiency and integrity. Compared to other work, a more complete and representative measurement was then performed for nearly two months, and several characteristics were examined: 1) diverse content is shared in the BitTorrent system, but multimedia files larger than 100 MB are the most common; 2) Distributed Hash Tables have indeed enhanced the ability of peer discovery, though there are some pitfalls to be addressed; 3) pieces are distributed uniformly after the early stage and there are few rare pieces. Furthermore, the peer arrival rate shows a periodic pattern, which was not well modeled before. An improved model is then proposed, and the experimental results indicate that the new model fits the actual measurement results with high accuracy.
Funding: The authors would like to thank all anonymous reviewers for their insightful comments. Additionally, this work is supported by the National Natural Science Foundation of China (Grant Numbers: 61471141, 61301099, 61361166006) and the Fundamental Research Funds for the Central Universities (Grant Numbers: HIT. KISTP. 201416, HIT. KISTP. 201414).
Abstract: Social network platforms such as Twitter, Instagram and Facebook are among the fastest and most convenient means of sharing digital images. Digital images are generally accepted as credible news, but, owing to the existence of powerful image editing software, they may undergo manipulation before being shared without leaving any obvious traces of tampering. Copy-move forgery is a very simple and common type of image forgery, in which a part of the image is copied and then pasted into the same image to replicate or hide some parts of the image. In this paper, we propose a copy-scale-move forgery detection method based on the Scale Invariant Feature Operator (SFOP) detector. The keypoints are then described using the MROGH descriptor. Experimental results show that the proposed method is able to detect and locate the forgery even under some geometric transformations such as scaling.
Abstract: This work analyzes the particularity of the civil aviation passenger auxiliary service recommendation scenario. Because traditional recommendation algorithms have certain limitations in recommending civil aviation auxiliary services, a context-aware SVR recommendation algorithm for civil aviation auxiliary services is proposed. The algorithm analyzes civil aviation passenger travel data, constructs a passenger preference model, and then recommends auxiliary services to passengers. Building on traditional two-dimensional user-item recommendation, it considers user characteristics, item attributes, and user contextual information during recommendation, which can reduce data sparseness to some degree. In addition, when a new user or a new item appears, similar users or items can be found according to the user or item attributes, which can, to some extent, solve the cold start problem. The experimental results show that the algorithm can recommend auxiliary services to passengers more accurately, which offers convenience to passengers and improves the quality of airlines' services.
Funding: Supported by the Basic Research Fund for the Central Universities (WK3450000006) and the National Natural Science Foundation of China (52373122).
Abstract: Monitoring of sweat pH plays an important role in assessing physiological health, nutritional balance, psychological stress, and sports performance. However, the combination of functional MOFs with phosphorescent materials to acquire real-time physiological information, as well as its application in dual-mode anti-counterfeiting, has seldom been reported. Herein, we developed multifunctional gel films based on MOFs and phosphorescent dyes that respond to H+ ions, and the related mechanism was studied in detail. Upon exposure to H+, the composite gel film exhibited a decreased fluorescent signal but enhanced room temperature phosphorescence (RTP), which could be utilized for dual-mode sweat pH sensing. Moreover, the multifunctional gel films showed potential for information encryption and anti-counterfeiting through the design of stimulus-responsive multiple patterns. This research provides a new avenue for portable and non-invasive sweat pH monitoring while also offering insights into stimulus-responsive multifunctional materials.
Funding: Co-supported by the Xinjiang Uygur Autonomous Region Natural Science Foundation, China (No. 2022D01C86), the National Natural Science Foundation of China (No. 62263030), and the Open Research Fund Program of Beijing National Research Center for Information Science and Technology, China (No. BR2023KF02011).
Abstract: A prescribed performance control scheme based on the three-inflection-point hyperbolic function and a predefined time performance function is proposed to solve the trajectory tracking problem of the forward-tilting morphing aerospace vehicle with time-varying actuator faults. To accurately estimate the degree of loss caused by actuator faults, an immersion and invariance observer based on the predefined time dynamic scale factor is designed to estimate and compensate for it. A composite dynamic sliding mode surface is designed using a three-inflection-point hyperbolic function, and a novel three-inflection-point sliding mode control framework is proposed. The convergent domain of the sliding manifold is adjusted by parameters, and the system error convergence is controllable. A transfer function is designed to eliminate the sensitivity of the three-inflection-point hyperbolic sliding mode to the unknown initial state; combined with the barrier Lyapunov function, it realizes the performance constraint of the system. The global asymptotic stability of the system is demonstrated using a strict mathematical proof. The effectiveness and superiority of the proposed control scheme are demonstrated by simulation experiments.