Funding: National High-Tech Research and Development Program of China (863 Program) and the National Natural Science Foundation of China
Abstract: The Hidden Markov Model (HMM) is a main solution to ambiguities in Chinese segmentation and POS (part-of-speech) tagging. While most previous works on HMM-based Chinese segmentation and POS tagging consult POS information in contexts, they do not utilize lexical information, which is crucial for resolving certain morphological ambiguities. This paper proposes a method which incorporates lexical information and wider context information into the HMM. Model induction and the related smoothing technique are presented in detail. Experiments indicate that this technique improves segmentation and tagging accuracy by nearly 1%.
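To make the decoding step concrete, below is a minimal sketch of Viterbi decoding for an HMM tagger over a toy tag set with invented probability tables; the word-conditioned emission lookup merely illustrates where lexical information can enter the model and is not the paper's exact formulation.

```python
# Minimal Viterbi sketch for an HMM POS tagger; the tag set and all
# probabilities are invented for illustration, not taken from the paper.
import math

transitions = {("N", "V"): 0.4, ("N", "N"): 0.3, ("V", "N"): 0.5, ("V", "V"): 0.1}
emissions = {("N", "书"): 0.02, ("V", "书"): 0.001, ("N", "读"): 0.001, ("V", "读"): 0.03}
start = {"N": 0.6, "V": 0.4}
tags = ["N", "V"]

def viterbi(words):
    """Return the most probable tag sequence for a segmented word list."""
    # delta[i][t] = best log-probability of any path ending in tag t at position i
    delta = [{t: math.log(start[t] * emissions.get((t, words[0]), 1e-8)) for t in tags}]
    back = [{}]
    for i, w in enumerate(words[1:], start=1):
        delta.append({})
        back.append({})
        for t in tags:
            emit = math.log(emissions.get((t, w), 1e-8))   # lexical (word-conditioned) score
            best_prev = max(tags, key=lambda p: delta[i - 1][p]
                            + math.log(transitions.get((p, t), 1e-8)))
            delta[i][t] = (delta[i - 1][best_prev]
                           + math.log(transitions.get((best_prev, t), 1e-8)) + emit)
            back[i][t] = best_prev
    # Trace the best path backwards from the final position.
    last = max(tags, key=lambda t: delta[-1][t])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return list(reversed(path))

print(viterbi(["读", "书"]))   # e.g. ['V', 'N']
```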
Funding: Project (IRT0725) supported by the Changjiang Innovative Group of the Ministry of Education, China
Abstract: Data deduplication, as a compression method, has been widely used in most backup systems to improve bandwidth and space efficiency. As the volume of data to be backed up explodes, the two main challenges in data deduplication are the CPU-intensive chunking and hashing work and the I/O-intensive disk-index access latency. Since the CPU-intensive work has been vastly parallelized and sped up by multi-core and many-core processors, I/O latency is likely becoming the bottleneck in data deduplication. To alleviate the challenge of I/O latency in multi-core systems, a multi-threaded deduplication (Multi-Dedup) architecture was proposed. The main idea of Multi-Dedup is to use parallel deduplication threads to hide the I/O latency. A prefix-based concurrent index was designed to maintain the internal consistency of the deduplication index with low synchronization overhead. In addition, a collision-less cache array was designed to preserve locality and similarity within the parallel threads. In experiments on various real-world datasets, Multi-Dedup achieves 3-5 times performance improvements when incorporated with the locality-based ChunkStash and the local-similarity-based SiLo methods. Multi-Dedup also dramatically decreases synchronization overhead and achieves 1.5-2 times performance improvements compared with traditional lock-based synchronization methods.
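For intuition about how a prefix-partitioned index reduces lock contention, here is an illustrative Python sketch (not the authors' implementation): chunk fingerprints are routed to per-prefix partitions, each guarded by its own lock, so parallel deduplication threads rarely block each other.

```python
# Toy prefix-partitioned fingerprint index: the first hex digit of a chunk's
# SHA-1 picks the partition, so concurrent threads mostly take different locks.
import hashlib
import threading

class PrefixIndex:
    def __init__(self, partitions=16):
        self.parts = [dict() for _ in range(partitions)]
        self.locks = [threading.Lock() for _ in range(partitions)]

    def insert_if_new(self, chunk: bytes) -> bool:
        """Return True if the chunk is new (must be stored), False if it is a duplicate."""
        fp = hashlib.sha1(chunk).hexdigest()
        p = int(fp[0], 16) % len(self.parts)   # prefix decides the partition
        with self.locks[p]:                    # per-partition lock, not a global one
            if fp in self.parts[p]:
                return False
            self.parts[p][fp] = True
            return True

def dedup_worker(index, chunks, stats):
    stats.append(sum(index.insert_if_new(c) for c in chunks))

index, stats = PrefixIndex(), []
data = [bytes([i % 7]) * 4096 for i in range(1000)]        # toy chunk stream, 7 unique chunks
threads = [threading.Thread(target=dedup_worker, args=(index, data[i::4], stats))
           for i in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print("unique chunks stored:", sum(stats))                 # 7 with this toy data
```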
Funding: the National Basic Research Program of China under Grant 2013CB338004, the Doctoral Program of Higher Education of China under Grant No. 20120073120034, the National Natural Science Foundation of China under Grants No. 61070204 and 61101108, and the National S&T Major Program under Grant No. 2011ZX03002-005-01
Abstract: Multidimensional data provides enormous opportunities in a variety of applications. Recent research has indicated the failure of existing sanitization techniques (e.g., k-anonymity) to provide rigorous privacy guarantees, and privacy-preserving multidimensional data publishing currently lacks a solid theoretical foundation. It is urgent to develop new techniques with provable privacy guarantees; ε-differential privacy is the only method that can provide such guarantees. In this paper, we propose a multidimensional data publishing scheme that ensures ε-differential privacy while providing accurate results for query processing. The proposed solution applies nonstandard wavelet transforms to the raw multidimensional data and adds noise to guarantee ε-differential privacy. The scheme then processes arbitrary queries directly on the noisy wavelet-coefficient synopses of relational tables and expands the noisy wavelet coefficients back into noisy relational tuples only to produce the final query result. Experimental results demonstrate the high accuracy and effectiveness of our approach.
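A hedged one-dimensional illustration of the noisy-wavelet idea follows: it Haar-transforms a toy count vector, perturbs the coefficients with Laplace noise calibrated by an assumed budget epsilon and an assumed per-coefficient sensitivity, and inverts the transform. The paper's nonstandard multidimensional transform and its exact noise calibration are not reproduced here.

```python
# 1-D Haar + Laplace noise sketch; epsilon and the per-coefficient sensitivity
# are assumed values for illustration only.
import numpy as np

def haar(x):
    out = np.asarray(x, dtype=float).copy()
    n = len(out)
    while n > 1:
        half = n // 2
        avg = (out[:n:2] + out[1:n:2]) / 2.0     # pairwise averages
        diff = (out[:n:2] - out[1:n:2]) / 2.0    # pairwise details
        out[:half], out[half:n] = avg, diff
        n = half
    return out

def inverse_haar(c):
    out = c.copy()
    n = 2
    while n <= len(out):
        half = n // 2
        avg, diff = out[:half].copy(), out[half:n].copy()
        out[:n:2], out[1:n:2] = avg + diff, avg - diff
        n *= 2
    return out

rng = np.random.default_rng(0)
counts = np.array([12, 7, 3, 9, 4, 4, 8, 1])        # toy histogram (power-of-two length)
eps, sensitivity = 1.0, 1.0                          # assumed budget and sensitivity
coeffs = haar(counts)
noisy = coeffs + rng.laplace(scale=sensitivity / eps, size=coeffs.shape)
print(np.round(inverse_haar(noisy), 2))              # noisy counts used to answer queries
```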
Funding: the National Natural Science Foundation of China (No. 60603044), the National Key Technologies Supporting Program of China during the 11th Five-Year Plan Period (No. 2006BAH02A03), and the Program for Changjiang Scholars and Innovative Research Team in University of China (No. IRT0652)
Abstract: Querying XML data is a computationally expensive process due to the complex nature of both the XML data and the XML queries. In this paper, we propose an approach to expedite XML query processing by caching the results of frequent queries. We discover frequent query patterns from user-issued queries using an efficient bottom-up mining approach called VBUXMiner. VBUXMiner consists of two main steps. First, all queries are merged into a summary structure named the "compressed global tree guide" (CGTG). Second, a bottom-up traversal scheme based on the CGTG is employed to generate frequent query patterns. We use the frequent query patterns in a cache mechanism to improve XML query performance. Experimental results show that our proposed mining approach outperforms previous mining algorithms for XML queries, such as XQPMinerTID and FastXMiner, and that by caching the results of frequent query patterns, XML query performance can be dramatically improved.
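The caching side of the approach can be pictured with the toy sketch below, where plain query strings stand in for XML query-pattern trees and a support threshold decides which results are worth keeping; the mining step (VBUXMiner) itself is not reproduced.

```python
# Toy result cache keyed by query pattern; `run_query` is a stand-in for a
# real XML query evaluator, and the support threshold is an assumed parameter.
from collections import Counter

class PatternCache:
    def __init__(self, min_support=3, capacity=100):
        self.freq = Counter()
        self.cache = {}
        self.min_support = min_support
        self.capacity = capacity

    def query(self, pattern, run_query):
        if pattern in self.cache:                 # hit: answer from the cached result
            return self.cache[pattern]
        self.freq[pattern] += 1
        result = run_query(pattern)
        # Keep only patterns that have become frequent enough.
        if self.freq[pattern] >= self.min_support and len(self.cache) < self.capacity:
            self.cache[pattern] = result
        return result

# Usage with a fake evaluator that just echoes the pattern.
cache = PatternCache(min_support=2)
for q in ["/book/title", "/book/title", "/book/author", "/book/title"]:
    cache.query(q, run_query=lambda p: f"results for {p}")
print(sorted(cache.cache))   # ['/book/title'] once it crosses the support threshold
```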
Abstract: Recent Cryptosporidium outbreaks have highlighted concerns about filter efficiency and, in particular, particle breakthrough. It is essential to ascertain the causes of breakthrough of Cryptosporidium-sized particles, because Cryptosporidium cannot be destroyed by conventional chlorine disinfection. This research investigated the influence of temperature, flow rate and chemical dosing on particle breakthrough during filtration. The results showed that higher temperatures and higher coagulant doses could reduce particle breakthrough, while increasing the filtration rate raised residual particle counts. There was an optimal coagulant dose for filtration, and it correlated well with ζ potential.
Abstract: The CFG pile has been widely applied as one of the common ground treatment techniques. As a concealed work, the construction quality of a pile foundation not only determines the success of the project but also concerns the interests of thousands of households. Only by strengthening supervision and management during construction and by strictly designing and specifying CFG piles can their construction quality be ensured. However, most research focuses on the operating mechanism and theoretical analysis, and there is little research on the construction of CFG piles. Actual CFG pile construction often lacks specified operating procedures and construction guidance, which not only causes serious problems and strongly affects the strength of the pile, but also makes the as-built pile body deviate greatly from the design requirements. Therefore, the study of CFG pile construction in this paper is of great significance.
Abstract: An algorithm is given for computing, in a very efficient way, the topology of two real algebraic plane curves defined implicitly. The authors perform a symbolic pre-processing that later allows all numerical computations to be executed in an accurate way.
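A rough sense of such symbolic pre-processing can be given with SymPy for the simpler case of a single implicit curve: the y-variable is eliminated with a resultant, so the critical x-coordinates become the roots of one exact polynomial that can then be refined numerically. This is only an illustration; the paper's algorithm treats two curves and handles more delicate cases.

```python
# Symbolic pre-processing sketch for one implicit curve f(x, y) = 0:
# eliminate y from {f = 0, df/dy = 0} via a resultant, then refine roots numerically.
import sympy as sp

x, y = sp.symbols("x y")
f = y**2 - x**3 + x                       # example curve y^2 = x^3 - x

crit_poly = sp.resultant(f, sp.diff(f, y), y)   # exact univariate polynomial in x
print(sp.factor(crit_poly))                      # x-values where the curve folds or crosses
print(sp.Poly(crit_poly, x).nroots())            # numerical refinement of the exact roots
```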
Abstract: Camouflage is ubiquitous in the natural world and benefits both predators and prey. Amongst the range of concealment strategies, disruptive coloration is thought to visually fragment an animal's outline, thereby reducing its rate of discovery. Here, I propose two non-mutually exclusive hypotheses for how disruptive camouflage functions and describe the visual mechanisms that might underlie them. (1) The local edge disruption hypothesis states that camouflage is achieved by breaking up edge information. (2) The global feature disruption hypothesis states that camouflage is achieved by breaking up the characteristic features of an animal (e.g., overall shape or facial features). Research clearly shows that putatively disruptive edge markings do increase concealment; however, few tests have been undertaken to determine whether this survival advantage is attributable to the distortion of features, so the global feature disruption hypothesis is understudied. In this review, the evidence for global feature disruption is evaluated. Further, I address whether object recognition processing provides a feasible mechanism for animals' features to influence concealment. This review concludes that additional studies are needed to test whether disruptive camouflage operates through global feature disruption, and proposes future research directions [Current Zoology 61 (4): 708-717, 2015].
Abstract: Superresolution is an image processing technique that estimates an original high-resolution image from its low-resolution and degraded observations. In superresolution tasks, there have been problems regarding the computational cost of estimating high-dimensional variables. These problems are now being overcome by the recent development of fast computers and of powerful computational techniques such as variational Bayesian approximation. This paper reviews a Bayesian treatment of the superresolution problem and presents its extensions based on hierarchical modeling by employing hidden variables.
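As a minimal point of reference for the Bayesian formulation, the following 1-D sketch computes a MAP superresolution estimate under an assumed Gaussian noise model and a Gaussian smoothness prior with a fixed regularization weight; the hierarchical and variational Bayesian extensions the review covers are not shown.

```python
# 1-D MAP superresolution sketch: y = D H x + noise, estimate x by solving
# argmin_x ||y - DHx||^2 + lam * ||Lx||^2 in closed form. All sizes, the blur
# kernel and the regularization weight are assumed values for illustration.
import numpy as np

n_hi, factor = 32, 4
n_lo = n_hi // factor
rng = np.random.default_rng(1)

# Forward model: local blur (moving average) followed by downsampling.
H = np.zeros((n_hi, n_hi))
for i in range(n_hi):
    lo, hi = max(0, i - 1), min(n_hi, i + 2)
    H[i, lo:hi] = 1.0 / (hi - lo)
D = np.zeros((n_lo, n_hi))
D[np.arange(n_lo), np.arange(n_lo) * factor] = 1.0
A = D @ H

x_true = np.sin(np.linspace(0, 3 * np.pi, n_hi))       # unknown high-resolution signal
y = A @ x_true + 0.01 * rng.standard_normal(n_lo)      # low-resolution, noisy observation

# Smoothness prior via first differences; lam balances data fit against the prior.
L = np.eye(n_hi) - np.eye(n_hi, k=1)
lam = 0.1
x_map = np.linalg.solve(A.T @ A + lam * L.T @ L, A.T @ y)
print("reconstruction RMSE:", round(float(np.sqrt(np.mean((x_map - x_true) ** 2))), 3))
```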