Funding: Supported by the National Key R&D Program of China (Nos. 2017YFB0202600, 2016YFC1302500, 2016YFB0200400 and 2017YFB0202104), the National Natural Science Foundation of China (Nos. 61772543, U1435222, 61625202, 61272056 and 61771331), and the Guangdong Provincial Department of Science and Technology (No. 2016B090918122).
Abstract: Multiple sequence alignment (MSA) is the alignment of more than two molecular biological sequences and is a fundamental method for analyzing evolutionary events such as mutations, insertions, deletions, and rearrangements. In theory, a dynamic programming algorithm can produce the optimal MSA. However, its computing time and memory consumption grow explosively as the number of sequences increases (Taylor, 1990). MSA is still regarded as one of the most challenging problems in bioinformatics and computational biology (Chatzou et al., 2016).
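The explosive growth mentioned in this abstract can be made concrete with a rough count. The sketch below is not taken from the paper; it assumes the textbook exact dynamic programming formulation, in which k sequences are aligned over a k-dimensional table and each cell looks at every non-empty combination of "advance or gap" across the k sequences. The function name `msa_dp_cost` is hypothetical.

```python
from math import prod


def msa_dp_cost(lengths):
    """Estimate the size of an exact MSA dynamic programming table.

    lengths: list of sequence lengths, one entry per sequence.
    Returns (cells, predecessors_per_cell).
    """
    # One table axis per sequence, each with (length + 1) positions.
    cells = prod(n + 1 for n in lengths)
    # Each cell considers every non-empty subset of sequences to advance.
    predecessors = 2 ** len(lengths) - 1
    return cells, predecessors


if __name__ == "__main__":
    for k in (2, 3, 4, 5):
        cells, preds = msa_dp_cost([100] * k)
        print(f"{k} sequences of length 100: {cells} cells, {preds} predecessors each")
```

Under these assumptions, each additional sequence of length 100 multiplies the table size by 101 and roughly doubles the work per cell.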
Funding: National Natural Science Foundation of China (No. 62006039).
Abstract: Supervised models for event detection, especially neural models, usually require large-scale human-annotated training data. A data augmentation technique is proposed to improve the performance of event detection by generating paraphrase sentences that enrich the expressions of the original data. Specifically, based on an existing human-annotated event detection dataset, we first automatically build a paraphrase dataset and label it with a designed event annotation alignment algorithm. To mitigate possibly wrong labels in the generated paraphrase dataset, a multi-instance learning (MIL) method is adopted for joint training on both the gold human-annotated data and the generated paraphrase dataset. Experimental results on the widely used ACE2005 dataset show the effectiveness of our approach.
Funding: Supported by the National Natural Science Foundation of China (No. 20872095).
Abstract: Fentanyl is a highly selective μ-opioid receptor agonist with high analgesic activity. Three-dimensional pharmacophore models were built from a set of 50 fentanyl derivatives. These were employed to elucidate ligand-receptor interactions, using information derived only from the ligand structure, in order to identify new potential lead compounds. The present studies demonstrated that three hydrophobic regions, one positive ionizable region and two hydrogen bond acceptor sites located on the molecule seem to be essential for analgesic activity. The results of the comparative molecular field analysis model suggested that both steric and electrostatic interactions play important roles. The contributions of the steric and electrostatic fields to the model were 0.621 and 0.379, respectively. The pharmacophore model provides crucial information about how well the common features of a subject molecule overlap with the hypothesis model, which is very valuable for designing and optimizing new active structures.
Abstract: In the last ten years, high-performance and massively parallel computing technology has developed rapidly and has come into use in all fields. Cluster computer systems are also widely used because of their low cost and high performance. In bioinformatics research, solving a problem by computer usually takes hours or even days. To speed up research, high-performance cluster computers are considered a good platform. When moving to a new MPP (massively parallel processing) system, the original algorithm should be parallelized in a proper way. In this paper, a new method for parallelizing the useful sequence alignment algorithm Smith-Waterman is designed, based on an existing optimized version of the algorithm. The results are encouraging.
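For reference, the serial algorithm that this abstract sets out to parallelize can be sketched as follows. This is a minimal illustration of standard Smith-Waterman local alignment scoring with a linear gap penalty, not the paper's parallel method; the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) and the function name are assumptions chosen for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between strings a and b.

    Uses a linear gap penalty; the scoring parameters are illustrative.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap      # gap in b
            left = H[i][j - 1] + gap    # gap in a
            # The floor at 0 lets a local alignment restart anywhere.
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman("TTACGTT", "GGACGGG"))
```

Each cell depends only on its upper, left, and upper-left neighbours, so all cells on the same anti-diagonal (constant i + j) are independent of one another. That property is a common starting point for parallel versions; whether the paper uses it is not stated in the abstract.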
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11972013 and 12272145) and the Ministry of Science and Technology of China (Grant No. 2018YFF01014200).
Funding: Supported by the Funds of the National Key Research and Development Program of China (Grant No. 2020YFE0201100), the Funds of National Science of China (Grant Nos. 61973062, 61973068), and the Fundamental Research Funds for the Central Universities (Grant Nos. N2004010, N2104021, N182008004).
Abstract: False data injection attacks (FDIAs) can manipulate measurement data from the Supervisory Control and Data Acquisition (SCADA) system and threaten state estimation in smart grids. Blind FDIAs (BFDIAs) enhance traditional FDIAs by removing the need to know the measurement Jacobian matrix H in advance, but their attack performance degrades when the measurement data contain outliers. In this paper, improved BFDIAs are proposed. In the off-line phase, a low-dimensional measurement matrix without outliers, calculated by the Linear Local Tangent Space Alignment (LLTSA) algorithm, is fed into a Continuous Deep Belief Network (CDBN) as training data to learn its probability distribution. In the on-line phase, a real-time low-dimensional measurement matrix with outliers is fed into the trained model as input, and the output is reconstructed from the probability distribution learned in the off-line phase, which indirectly eliminates the influence of the outliers. Simulations on the PJM 5-bus and IEEE 14-bus systems verify the performance of the proposed strategy in comparison with PCA-based BFDIAs.
Abstract: A computer-aided method for designing a hybrid layout, tree-shaped planar flowlines, is presented. In this new type of flowshop layout, the common machines shared by several flowlines can be located together in functional sections. The approach combines traditional cell formation techniques with sequence alignment algorithms. First, a cell formation procedure based on sequence analysis is adopted; then the operation sequences of the parts are aligned to maximize machine adjacency in hyperedge representations; finally, a tree-shaped planar flowline is obtained for each part family. The algorithm is illustrated with a sample of operation sequences obtained from industry.
Funding: Project supported by the National Natural Science Foundation of China (No. 20872095).
Abstract: In order to discover novel anticonvulsant drugs, pharmacophore screening of anticonvulsant inhibitors was performed. Genetic Algorithm with Linear Assignment for Hypermolecular Alignment of Datasets (GALAHAD) and Comparative Molecular Field Analysis (CoMFA) studies were combined in this research. First, multiple models were generated using GALAHAD based on highly active molecules. Second, several of them were validated using the CoMFA study. Finally, one model simultaneously gave a good q² value on the training set and promising predictive power on the test set, and it was selected as the most reasonable pharmacophore model. The results of the CoMFA study based on model 1 suggested that both steric and electrostatic interactions played important roles.