In this paper, we present a modular incremental statistical model for English full parsing. Unlike other full parsing approaches, in which the analysis of the sentence is a uniform process, our model separates full parsing into shallow parsing and sentence skeleton parsing. In shallow parsing, we perform POS tagging, base NP identification, prepositional phrase attachment, and subordinate clause identification. In skeleton parsing, we use a layered feature-oriented statistical method. Modularity has the advantage of solving different parsing problems with corresponding mechanisms. Feature-oriented rules can express complex linguistic phenomena at key points when needed. Evaluated on the Penn Treebank corpus, the model obtains 89.2% precision and 89.8% recall.
Defects in the electron transport layer, the perovskite layer, and their interface cause nonradiative carrier recombination losses. Poor buried interfacial contact is detrimental to charge extraction and device stability. Here, we report a bottom-up holistic carrier management strategy, induced synergistically by multiple chemical bonds, to minimize bulk and interfacial energy losses for high-performance perovskite photovoltaics. 4-Trifluoromethyl-benzamidine hydrochloride (TBHCl), which contains –CF₃, an amidine cation, and Cl⁻, is incorporated in advance into the SnO₂ colloid solution to realize bottom-up modification. The synergistic effect of the multiple functional groups and the multiple-bond-induced chemical interaction are revealed theoretically and experimentally. F and Cl⁻ can passivate oxygen vacancy and/or undercoordinated Sn⁴⁺ defects by coordinating with Sn⁴⁺. F can suppress cation migration and modulate crystallization via hydrogen bonds with FA⁺, and can passivate lead defects by coordinating with Pb²⁺. The –NH₂–C=NH₂⁺ group and Cl⁻ can passivate cation and anion vacancy defects, respectively, through ionic bonds with the perovskite. TBHCl modification is shown to suppress the agglomeration of SnO₂ nanoparticles, passivate bulk and interfacial defects, and release the tensile strain of the perovskite films, which raises the power conversion efficiency (PCE) from 21.28% to 23.40% and improves stability. With post-treatment, the efficiency is further improved to 23.63%.
AIM: To observe ocular surface changes after phacovitrectomy in patients with mild to moderate meibomian gland dysfunction (MGD)-type dry eye and to track clinical treatment response using a Keratograph 5M and a LipiView interferometer. METHODS: Forty cases were randomized into control group A and treatment group B; the latter received meibomian gland treatment 3 d before phacovitrectomy and sodium hyaluronate before and after surgery. The average non-invasive tear film break-up time (NITBUTav), first non-invasive tear film break-up time (NITBUTf), non-invasively measured tear meniscus height (NTMH), meibomian gland loss (MGL), lipid layer thickness (LLT), and partial blink rate (PBR) were measured preoperatively and at 1 wk, 1 mo, and 3 mo postoperatively. RESULTS: The NITBUTav values of group A at 1 wk (4.38±0.47), 1 mo (6.76±0.70), and 3 mo (7.25±0.68) were significantly lower than those of group B (7.45±0.78, 10.46±0.97, and 11.31±0.89; P=0.002, 0.004, and 0.001, respectively). The NTMH values of group B at 1 wk (0.20±0.01) and 1 mo (0.22±0.01) were markedly higher than those of group A (0.15±0.01 and 0.15±0.01; P=0.008 and P<0.001, respectively); however, there was no difference at 3 mo. The LLT of group B at 3 mo [91.5 (76.25-100.00)] significantly exceeded that of group A [65.00 (54.50-91.25), P=0.017]. No obvious intergroup difference was found in MGL or PBR (P>0.05). CONCLUSION: Mild to moderate MGD dry eye worsens in the short term after phacovitrectomy. Preoperative cleaning, hot compresses, and meibomian gland massage, as well as preoperative and postoperative sodium hyaluronate, promote the rapid recovery of tear film stability.
To obtain information or discover knowledge from system logs, the first step is log parsing, which transforms unstructured raw logs into a sequence of structured events. Although log parsing has been studied comprehensively in recent years, most studies assume that one event object corresponds to a single-line message. In a growing number of scenarios, however, one event object spans multiple lines in the log, and parsing methods designed for single-line events are not applicable. To address this problem, this paper proposes an automated log parsing method for multiline events (LPME). LPME finds multiline event objects via iterative scanning, driven by a set of heuristic rules derived from practice. Its advantage lies in a cohesion-based evaluation method for multiline events and a bottom-up search approach that avoids enumerating all combinations. We analyze the algorithmic complexity of LPME and validate it on four datasets from different backgrounds. Evaluations show that the actual time complexity of LPME parsing for multiline events is close to constant time, which enables it to handle large-scale sample inputs. On the experimental datasets, LPME achieves a recall of 1.0 and a precision that is generally higher than 0.9, which demonstrates its effectiveness.
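To illustrate why multiline events need different handling from single-line messages, the sketch below groups raw log lines into events with one simple heuristic: a line that starts with a timestamp opens a new event, and any other line continues the previous one. This is not the LPME algorithm, which relies on cohesion-based evaluation and a bottom-up search; the timestamp pattern and the name `group_multiline_events` are assumptions made for this example.

```python
import re

# Assumed log format: the first line of every event starts with
# "YYYY-MM-DD HH:MM:SS". Real logs may need a different pattern.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")

def group_multiline_events(lines):
    """Group raw log lines into events (hypothetical helper).

    A line matching TIMESTAMP starts a new event; every other line is
    treated as a continuation of the most recent event.
    """
    events = []
    for line in lines:
        if TIMESTAMP.match(line) or not events:
            events.append([line])
        else:
            events[-1].append(line)
    return events

raw = [
    "2024-01-01 10:00:00 ERROR request failed",
    "Traceback (most recent call last):",
    "  File \"app.py\", line 3, in handler",
    "2024-01-01 10:00:01 INFO retrying",
]
events = group_multiline_events(raw)
print(len(events))  # number of grouped events
```

A fixed start-of-event rule like this only works when the format is known in advance, which is the limitation that rule-driven iterative scanning is meant to overcome.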
Abstract: Extensible Markup Language (XML) files, widely used for storing and exchanging information on the web, require efficient parsing mechanisms to improve application performance. With existing Document Object Model (DOM) based parsing, performance degrades because of sequential processing and large memory requirements, so a more efficient XML parser is needed to mitigate these issues. In this paper, we propose a Parallel XML Tree Generator (PXTG) algorithm for accelerating the parsing of XML files and a Regression-based XML Parsing Framework (RXPF) that analyzes and predicts performance through profiling, regression, and code generation for efficient parsing. The PXTG algorithm divides the XML file into n parts and produces n trees in parallel. The profiling phase of the RXPF framework produces a dataset by measuring the performance of various parsing models, including StAX, SAX, DOM, JDOM, and PXTG, on different numbers of cores with multiple file sizes. The regression phase produces the prediction model, from which the code generation phase produces the final code for efficient parsing of XML files. The RXPF framework shows a performance improvement of 9.54% to 32.34% over the other existing models used for parsing XML files.
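The idea of dividing a file into n parts and building n trees in parallel can be sketched as follows. This is a minimal illustration and not the authors' PXTG implementation: it assumes a flat document whose records are non-nested, non-self-closing elements with a known tag, and the names `split_records`, `parse_parallel`, and the `<part>` wrapper element are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

def split_records(xml_text, tag, n):
    """Split a flat XML document into at most n well-formed chunks.

    Assumption: records look like <tag ...>...</tag> and are not nested
    inside each other. Each chunk is wrapped in a synthetic <part> root.
    """
    records = re.findall(rf"<{tag}\b.*?</{tag}>", xml_text, flags=re.DOTALL)
    if not records:
        return []
    size = -(-len(records) // n)  # ceiling division
    return ["<part>" + "".join(records[i:i + size]) + "</part>"
            for i in range(0, len(records), size)]

def parse_parallel(xml_text, tag, n):
    """Parse each chunk into its own tree, one task per chunk."""
    chunks = split_records(xml_text, tag, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(ET.fromstring, chunks))

doc = "<catalog>" + "".join(
    f"<item id='{i}'>v{i}</item>" for i in range(5)) + "</catalog>"
trees = parse_parallel(doc, "item", 2)
print([len(t) for t in trees])  # records per tree
```

In standard CPython builds, threads mainly show the structure of the approach; a process pool or a runtime without a global interpreter lock would be needed to see a real multi-core speedup.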
Abstract: The present work proposes a solution for automating updates (MAJ) of the radio parameters of the ATOLL database from the OSS NetAct using parsing. The solution is intended for the RAN (Radio Access Network) service of mobile operators, which is responsible for planning and optimizing network coverage. The overall objective of this study is to synchronize the physical data of sites deployed in the field with the ATOLL database, which contains all the coverage data of the operators' mobile networks. We have built an application that automates updates, with the following functionalities: import of radio parameters using the parsing method we have defined, visualization of the data, and export of the data to the template of the ATOLL database. Tests and validation of the application on a 4G network show that it performs updates correctly, with a constraint on the size of the data to be imported. Our solution is a reliable resource for updating the databases containing the network radio parameters at all mobile operators, subject to a limit on the volume of data to be imported.
Abstract: Because they lack long-range association and spatial location information, existing deep learning-based methods cannot always obtain fine details and accurate boundaries in complex clothing images. This paper presents a convolutional structure with multi-scale fusion to optimize clothing feature extraction, and a self-attention module to capture long-range association information. The structure enables the self-attention mechanism to participate directly in the information exchange through the down-scaling projection operation of the multi-scale framework. In addition, the improved self-attention module extracts 2-dimensional relative position information to compensate for its limited ability to extract spatial position features from clothing images. Experimental results on the colorful fashion parsing dataset (CFPD) show that the proposed network structure achieves a mean intersection over union (mIoU) of 53.68% and performs better on the clothing parsing task.
Funding: Project (61262035) supported by the National Natural Science Foundation of China; Projects (GJJ12271, GJJ12742) supported by the Science and Technology Foundation of Education Department of Jiangxi Province, China; Project (20122BAB201033) supported by the Natural Science Foundation of Jiangxi Province, China
Abstract: Head-driven statistical models for natural language parsing are the most representative lexicalized syntactic parsing models, but they use only the semantic dependency between words and do not incorporate other semantic information such as semantic collocation and semantic category. Some improvements to this parser are presented. First, "valency" is an essential semantic feature of words. Once the valency of a word is determined, the collocation of the word is clear, and the sentence structure can be derived directly. Thus, a syntactic parsing model combining valence structure with semantic dependency is proposed on the basis of head-driven statistical syntactic parsing models. Second, semantic role labeling (SRL) is necessary for deep natural language processing. An integrated parsing approach is proposed to integrate semantic parsing into the syntactic parsing process. Experiments are conducted with the refined statistical parser. The results show that 87.12% precision and 85.04% recall are obtained, and the F measure is improved by 5.68% compared with the head-driven parsing model introduced by Collins.
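For readers comparing the figures above, the F measure combines precision and recall into one number. The sketch below assumes the balanced F1 (harmonic mean), which the abstract does not specify; the function name `f_measure` and its `beta` parameter are illustrative.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    beta=1.0 gives the balanced F1; this choice is an assumption,
    since the abstract does not state which F measure was used.
    """
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Precision and recall as reported in the abstract (in percent).
f1 = f_measure(87.12, 85.04)
print(round(f1, 2))  # 86.07
```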
Abstract: Large amounts of information currently exist on Web sites and in various digital media. Most of it is in natural language, which is easy to browse but difficult for computers to understand. Chunk parsing and entity relation extraction are important for understanding the semantics of information in natural language processing. Chunk analysis is a shallow parsing method, and entity relation extraction is used to establish relationships between entities. Because full syntactic parsing is complex in Chinese text understanding, many researchers are more interested in chunk analysis and relation extraction. The conditional random fields (CRFs) model is a valid probabilistic model for segmenting and labeling sequence data. This paper models the chunk and entity relation problems in Chinese text. By transforming them into labeling problems, we can use CRFs to perform chunk analysis and entity relation extraction.
Funding: supported by the National Key Basic Research Program of China (No. 2014CB340600); partially supported by the National Natural Science Foundation of China (Grant Nos. 61332019, 61672531); partially supported by the National Social Science Foundation of China (Grant No. 14GJ003-152)
Abstract: Information content security is a branch of cyberspace security. How to effectively manage and use Weibo comment information has become a research focus in this field. The three main tasks involved are emotion sentence identification and classification, emotion tendency classification, and emotion expression extraction. Combined with the latent Dirichlet allocation (LDA) model, a Gibbs sampling implementation for inference in our algorithm is presented, which can be used to categorize emotion tendency automatically. To address the low recall of emotion expression extraction in Weibo, we use dependency parsing, divide the cases into two categories (subject and object), summarize six kinds of dependency models from evaluation objects and emotion words, and propose a merge algorithm for evaluation objects. The method was evaluated in a public bakeoff and ranked among the best methods in the shared sub-task of emotion expression extraction, indicating that it is not only innovative but also practical.
Funding: the National High Technology Research and Development Program of China (863 Program); the National Natural Science Foundation of China
Abstract: Natural language parsing is a task of great importance and extreme difficulty. In this paper, we present a full Chinese parsing system based on a two-stage approach. Rather than identifying all phrases with a uniform model, we use a divide-and-conquer strategy. We propose an effective and fast method based on a Markov model to identify the base phrases. We then make the first attempt to extend one of the best English parsing models, the head-driven model, to recognize Chinese complex phrases. Our two-stage approach is superior to the uniform approach in two respects. First, it creates synergy between the Markov model and the head-driven model. Second, it reduces the complexity of full Chinese parsing and makes the parsing system efficient in both space and time. We evaluate our approach with PARSEVAL measures on the open test set; the parsing system achieves 87.53% precision and 87.95% recall.
Funding: the National Natural Science Foundation of China (Nos. 60435020, 60575042, and 60503072).
Abstract: This paper proposes a new way to improve the performance of a dependency parser: subdividing verbs according to their grammatical functions and integrating the verb subclass information into a lexicalized parsing model. First, the scheme of verb subdivision is described. Second, a maximum entropy model is presented to distinguish verb subclasses. Finally, a statistical parser is developed to evaluate the verb subdivision. Experimental results indicate that the use of verb subclasses has a positive influence on parsing performance.
Funding: National Natural Science Foundation of China (No. 62006039); Shanghai Special Fund for Software and Integrated Circuit Industry Development, China (No. 180330).
Abstract: Clothing parsing, also known as clothing image segmentation, is the problem of assigning a clothing category label to each pixel in a clothing image. To address the lack of positional and global priors in existing clothing parsing algorithms, this paper proposes an enhanced positional attention module (EPAM) to collect positional information in the vertical direction of each pixel, and an efficient global prior module (GPM) to aggregate contextual information from different sub-regions. The EPAM and GPM based residual network (EG-ResNet) can effectively exploit the intrinsic features of clothing images while capturing information across different scales and sub-regions. Experimental results show that, compared with other state-of-the-art methods, the proposed EG-ResNet achieves promising performance in clothing parsing on the colorful fashion parsing dataset (CFPD): 51.12% mean intersection over union (mIoU) and 92.79% pixel-wise accuracy (PA).
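The two reported metrics, mIoU and pixel-wise accuracy, can both be derived from a confusion matrix over per-pixel labels. The sketch below is a plain-Python illustration under stated assumptions: labels are integer class indices in flat lists, classes that appear in neither the ground truth nor the prediction are skipped when averaging, and all function names are hypothetical. It is not the evaluation code used in the paper.

```python
def confusion_matrix(truth, pred, num_classes):
    """cm[t][p] counts pixels with true class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        cm[t][p] += 1
    return cm

def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class equals the true class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def mean_iou(cm):
    """Average over classes of TP / (TP + FP + FN).

    Classes with an empty union are skipped (an assumed convention).
    """
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(len(cm))) - tp
        fn = sum(cm[c]) - tp
        union = tp + fp + fn
        if union:
            ious.append(tp / union)
    return sum(ious) / len(ious)

# Toy example: six pixels, three classes.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(truth, pred, 3)
print(mean_iou(cm))  # 0.5
```

Because mIoU averages per class, rare classes weigh as much as frequent ones, which is why it is typically much lower than pixel accuracy on the same predictions.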
Funding: Supported by the National Natural Science Foundation of China (60805028, 60903146); Natural Science Foundation of Shandong Province of China (ZR2010FM027); SDUST Research Fund (2010KYTD101); China Postdoctoral Science Foundation (2012M521336)
Abstract: Video event recognition is a challenging task for high-level understanding of video sequences. Existing methods for event recognition have two major limitations. One is that no algorithm is available for recognizing events that happen alternately. The other is that the temporal relationship between atomic actions is not fully used. To address these problems, an algorithm based on an extended stochastic context-free grammar (SCFG) representation is proposed for event recognition. Events are modeled as a series of atomic actions and represented by an extended SCFG, which can express the hierarchical structure of the events and the temporal relationship between the atomic actions. In comparison with previous work, the main contributions of this paper are as follows: (1) Events, including alternating events, can be recognized by an improved stochastic parsing and shortest-path-finding algorithm. (2) The algorithm can disambiguate the detection results of atomic actions using event context. Experimental results show that the proposed algorithm recognizes events accurately and simultaneously corrects most atomic action detection errors.
Funding: Supported by the Science and Technology Innovation Plan of Beijing Institute of Technology (2013)
Abstract: A fast method for phrase structure grammar analysis is proposed based on conditional random fields (CRF). The method trains several CRF classifiers to recognize the phrase nodes at different levels, and uses a bottom-up approach to connect the recognized phrase nodes into a syntactic tree. On the basis of the Beijing forest studio Chinese tagged corpus, two experiments are designed to select the training parameters and verify the validity of the method. The results show that the method takes 78.98 ms to train on and 4.63 ms to test a Chinese sentence of 17.9 words. The method offers a new way to parse phrase structure grammar for Chinese, and has good generalization ability and fast speed.
Funding: Supported in part by the National Natural Science Foundation of China (Grant Nos.: 91953102 and 81872836), the Natural Science Foundation of Guangdong Province, China (Grant Nos.: 2019A1515011265 and 2022A1515010965), the Fundamental Research Funds for Sun Yat-sen University, China (Grant No.: 19ykzd26), and Open Project Funding of the State Key Laboratory of Crop Stress Adaptation and Improvement (Grant No.: 2020KF05). Huilin Li would like to thank the Pearl River Talent Recruitment Program for support.
Abstract: Ribosomes are abundant, large RNA-protein complexes that are the sites of all protein synthesis in cells. Defects in ribosomal proteins (RPs), including proteoforms arising from genetic variations, alternative splicing of RNA transcripts, post-translational modifications and alterations of protein expression level, have been linked to a diverse range of diseases, including cancer and aging. Comprehensive characterization of ribosomal proteoforms is challenging but important for the discovery of potential disease biomarkers or protein targets. In the present work, using E. coli 70S RPs as an example, we first developed a top-down proteomics approach on a Waters Synapt G2 Si mass spectrometry (MS) system, and then applied it to the HeLa 80S ribosome. The results were complemented by a bottom-up approach. In total, 50 out of 55 RPs were identified using the top-down approach. Among these, more than 30 RPs were found to have their N-terminal methionine removed. Additional modifications such as methylation, acetylation, and hydroxylation were also observed, and the modification sites were identified by bottom-up MS. In a HeLa 80S ribosomal sample, we identified 98 ribosomal proteoforms, among which multiple truncated 80S ribosomal proteoforms were observed, a type of information that is often overlooked by bottom-up experiments. Although their relevance to diseases is not yet known, the integration of top-down and bottom-up proteomics approaches paves the way for the discovery of proteoform-specific disease biomarkers or targets.
Abstract: In this paper, we present a modular incremental statistical model for English full parsing. Unlike other full parsing approaches, in which the analysis of the sentence is a uniform process, our model separates full parsing into shallow parsing and sentence skeleton parsing. In shallow parsing, we perform POS tagging, base NP identification, prepositional phrase attachment, and subordinate clause identification. In skeleton parsing, we use a layered feature-oriented statistical method. Modularity has the advantage of solving different parsing problems with corresponding mechanisms. Feature-oriented rules can express complex linguistic phenomena at key points when needed. Evaluated on the Penn Treebank corpus, the model obtains 89.2% precision and 89.8% recall.
Funding: Financially supported by the Support Plan for Overseas Students to Return to China for Entrepreneurship and Innovation (cx2020003), the Fundamental Research Funds for the Central Universities (2020CDJ-LHZZ-074), and the Natural Science Foundation of Chongqing (cstc2020jcyj-msxm X0629).
Abstract: Defects in the electron transport layer, the perovskite layer, and their interface result in nonradiative carrier recombination losses. Poor buried interfacial contact is detrimental to charge extraction and device stability. Here, we report a bottom-up holistic carrier management strategy, induced synergistically by multiple chemical bonds, to minimize bulk and interfacial energy losses for high-performance perovskite photovoltaics. 4-trifluoromethyl-benzamidine hydrochloride (TBHCl), containing –CF₃, an amidine cation and Cl⁻, is incorporated in advance into the SnO₂ colloid solution to realize bottom-up modification. The synergistic effect of multiple functional groups and the multiple-bond-induced chemical interaction are revealed theoretically and experimentally. F and Cl⁻ can passivate oxygen vacancy and/or undercoordinated Sn⁴⁺ defects by coordinating with Sn⁴⁺. F can suppress cation migration and modulate crystallization via hydrogen bonding with FA⁺, and can passivate lead defects by coordinating with Pb²⁺. The –NH₂–C=NH₂⁺ group and Cl⁻ can passivate cation and anion vacancy defects, respectively, through ionic bonds with the perovskite. With TBHCl modification, suppressed agglomeration of SnO₂ nanoparticles, bulk and interfacial defect passivation, and release of tensile strain in the perovskite films are demonstrated, which results in a PCE enhancement from 21.28% to 23.40% and improved stability. With post-treatment, the efficiency is further improved to 23.63%.
Funding: Supported by the Natural Science Foundation of Tianjin City (No. 20JCZXJC00040), the Tianjin Key Medical Discipline (Specialty) Construction Project (No. TJYXZDXK-037A), and the Science & Technology Development Fund of Tianjin Education Commission for Higher Education (No. 2022ZD058).
Abstract: AIM: To observe ocular surface changes after phacovitrectomy in patients with mild to moderate meibomian gland dysfunction (MGD)-type dry eye and track clinical treatment response using a Keratograph 5M and a LipiView interferometer. METHODS: Forty cases were randomized into control group A and treatment group B; the latter received meibomian gland treatment 3 d before phacovitrectomy and sodium hyaluronate before and after surgery. The average non-invasive tear film break-up time (NITBUTav), first non-invasive tear film break-up time (NITBUTf), non-invasive measured tear meniscus height (NTMH), meibomian gland loss (MGL), lipid layer thickness (LLT) and partial blink rate (PBR) were measured preoperatively and at 1 wk, 1 mo, and 3 mo postoperatively. RESULTS: The NITBUTav values of group A at 1 wk (4.38±0.47), 1 mo (6.76±0.70), and 3 mo (7.25±0.68) were significantly lower than those of group B (7.45±0.78, 10.46±0.97, and 11.31±0.89; P=0.002, 0.004, and 0.001, respectively). The NTMH values of group B at 1 wk (0.20±0.01) and 1 mo (0.22±0.01) were markedly higher than those of group A (0.15±0.01 and 0.15±0.01; P=0.008 and P<0.001, respectively); however, there was no difference at 3 mo. The LLT of group B at 3 mo [91.5 (76.25-100.00)] significantly exceeded that of group A [65.00 (54.50-91.25), P=0.017]. No obvious intergroup difference was found in MGL or PBR (P>0.05). CONCLUSION: Mild to moderate MGD dry eye worsens in the short term after phacovitrectomy. Preoperative cleaning, hot compresses, and meibomian gland massage, as well as preoperative and postoperative sodium hyaluronate, promote the rapid recovery of tear film stability.
Abstract: To obtain information or discover knowledge from system logs, the first step is to perform log parsing, whereby unstructured raw logs are transformed into a sequence of structured events. Although comprehensive studies on log parsing have been conducted in recent years, most assume that one event object corresponds to a single-line message. However, in a growing number of scenarios, one event object spans multiple lines in the log, and parsing methods designed for single-line events are not applicable. To address this problem, this paper proposes an automated log parsing method for multiline events (LPME). LPME finds multiline event objects via iterative scanning, driven by a set of heuristic rules derived from practice. The advantage of LPME is that it introduces a cohesion-based evaluation method for multiline events and a bottom-up search approach that avoids enumerating all combinations. We analyze the algorithmic complexity of LPME and validate it on four datasets from different backgrounds. Evaluations show that the actual time complexity of LPME parsing for multiline events is close to constant time, which enables it to handle large-scale sample inputs. On the experimental datasets, LPME achieves a recall of 1.0 and a precision generally higher than 0.9, which demonstrates the effectiveness of the proposed method.
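To make the single-line versus multiline distinction concrete, the sketch below groups raw log lines into events with one very simple heuristic. It is not LPME, whose iterative scanning and cohesion-based evaluation are not specified in this abstract. It assumes, purely for illustration, that every event starts with a `YYYY-MM-DD HH:MM:SS` timestamp and that any line without one continues the previous event; the function name and the sample log are hypothetical.

```python
import re

# Assumption: a new event always begins with a timestamp of this shape.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def group_multiline_events(lines):
    """Group raw log lines into events; each event is a list of lines."""
    events = []
    for line in lines:
        if TIMESTAMP.match(line) or not events:
            # Timestamped line (or very first line): start a new event.
            events.append([line])
        else:
            # Continuation line: attach it to the most recent event.
            events[-1].append(line)
    return events


raw_log = [
    "2024-01-01 10:00:00 ERROR request failed",
    "Traceback (most recent call last):",
    '  File "app.py", line 3, in handler',
    "2024-01-01 10:00:01 INFO request served",
]
for event in group_multiline_events(raw_log):
    print(len(event), event[0])
```

A fixed rule like this breaks as soon as continuation lines carry their own timestamps or events lack a common prefix, which is the kind of case that motivates learning event boundaries from the data rather than hard-coding them.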
Funding: This work is supported by the Natural Science Foundation of Henan (No. 0211021600, No. 0324220079) and the Fundamental Researches of Henan (No. 004061800)