Funding: Supported by Key Research and Development Program of Shaanxi, No. 2020GXLH-Y-019 and 2022KXJ-141; Innovation Capability Support Program of Shaanxi, No. 2019GHJD-14 and 2021TD-40; Science and Technology Talent Support Program of Shaanxi Provincial People's Hospital, No. 2021LJ-05; 2023 Natural Science Basic Research Foundation of Shaanxi Province, No. 2023-JC-YB-739.
Abstract: BACKGROUND: Surgical site infections (SSIs) are the commonest healthcare-associated infection. In addition to increasing mortality, they lengthen hospital stays and raise healthcare expenses. SSIs are challenging to predict, and most models have poor predictive power. Therefore, we developed a prediction model for SSI after elective abdominal surgery by identifying risk factors. AIM: To analyse the data on inpatients undergoing elective abdominal surgery to identify risk factors and develop predictive models that will help clinicians assess patients preoperatively. METHODS: We retrospectively analysed the inpatient records of Shaanxi Provincial People's Hospital from January 1, 2018 to January 1, 2021. We included the demographic data of the patients and their haematological test results in our analysis. The attending physicians provided the Nutritional Risk Screening 2002 (NRS 2002) scores. The surgeons and anaesthesiologists manually calculated the National Nosocomial Infections Surveillance (NNIS) scores. Inpatient SSI risk factors were evaluated using univariate analysis and multivariate logistic regression. Nomograms were used in the predictive models. Receiver operating characteristic curves and area under the curve values were used to measure the specificity and accuracy of the model. RESULTS: A total of 3018 patients met the inclusion criteria. The surgical sites included the uterus (42.2%), the liver (27.6%), the gastrointestinal tract (19.1%), the appendix (5.9%), the kidney (3.7%), and the groin area (1.4%). SSI occurred in 5% of the patients (n = 150). The risk factors associated with SSI were as follows: age; gender; marital status; place of residence; history of diabetes; surgical season; surgical site; NRS 2002 score; preoperative white blood cell, procalcitonin (PCT), albumin, and low-density lipoprotein cholesterol (LDL) levels; preoperative antibiotic use; anaesthesia method; incision grade; NNIS score; intraoperative blood loss; intraoperative drainage tube placement; and surgical operation items. Multivariate logistic regression revealed the following independent risk factors: a history of diabetes [odds ratio (OR) = 5.698, 95% confidence interval (CI): 3.305-9.825, P = 0.001], antibiotic use (OR = 14.977, 95% CI: 2.865-78.299, P = 0.001), an NRS 2002 score of ≥ 3 (OR = 2.426, 95% CI: 1.199-4.909, P = 0.014), general anaesthesia (OR = 3.334, 95% CI: 1.134-9.806, P = 0.029), an NNIS score of ≥ 2 (OR = 2.362, 95% CI: 1.019-5.476, P = 0.045), PCT ≥ 0.05 μg/L (OR = 1.687, 95% CI: 1.056-2.695, P = 0.029), LDL < 3.37 mmol/L (OR = 1.719, 95% CI: 1.039-2.842, P = 0.035), intraoperative blood loss ≥ 200 mL (OR = 29.026, 95% CI: 13.751-61.266, P < 0.001), surgical season (P < 0.05), surgical site (P < 0.05), and incision grade I or Ⅲ (P < 0.05). The overall area under the receiver operating characteristic curve of the predictive model was 0.926, which is significantly higher than that of the NNIS score (0.662). CONCLUSION: The patient's condition and haematological test indicators form the bases of our prediction model. It is a novel, efficient, and highly accurate predictive model for preventing postoperative SSI, thereby improving the prognosis in patients undergoing abdominal surgery.
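The odds ratios and confidence intervals reported above can be read back onto the log-odds scale. The sketch below assumes the intervals are Wald intervals of the form exp(beta ± 1.96·SE), which the abstract does not state; under that assumption the geometric mean of the two bounds should reproduce the reported OR, and the width of the interval gives an approximate standard error. The dictionary keys and the function name `log_scale_summary` are illustrative only.

```python
import math

# (odds ratio, lower 95% bound, upper 95% bound) copied from the abstract
reported = {
    "history of diabetes": (5.698, 3.305, 9.825),
    "antibiotic use": (14.977, 2.865, 78.299),
    "NRS 2002 score >= 3": (2.426, 1.199, 4.909),
    "blood loss >= 200 mL": (29.026, 13.751, 61.266),
}


def log_scale_summary(odds_ratio, lower, upper):
    """Return (beta, se, midpoint) assuming a Wald interval exp(beta +/- 1.96 * se).

    beta is the log odds ratio, se the implied standard error, and midpoint
    the geometric mean of the bounds, which should be close to the reported
    odds ratio if the Wald assumption holds.
    """
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    midpoint = math.sqrt(lower * upper)
    return beta, se, midpoint


for name, (odds_ratio, lower, upper) in reported.items():
    beta, se, midpoint = log_scale_summary(odds_ratio, lower, upper)
    print(f"{name}: beta={beta:.3f}, SE={se:.3f}, CI midpoint={midpoint:.3f}")
```

For the four factors listed, the geometric mean of the bounds agrees with the reported OR to within rounding, which is consistent with the Wald assumption. The very wide interval for antibiotic use corresponds to a large standard error on the log-odds scale.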
Funding: National Science Foundation of China under Grant No. 51578470.
Abstract: In this manuscript we present a nonlinear site amplification model for ground-motion prediction equations (GMPEs) in Japan, using a site period-based site class and a site impedance ratio as site parameters. We used a large number of shear-wave velocity profiles from the Kiban-Kyoshin network (KiK-net) and the Kyoshin network (K-NET) to construct the one-dimensional (1D) numerical models. The strong-motion records from rock sites in Japan with different earthquake categories and taken from the Pacific Earthquake Engineering Research Center dataset were used in this study. We fit a set of 1D site amplification models using the spectral amplification ratios derived from 1D equivalent linear analyses. Parameters of site impedance ratios for both linear and nonlinear site response were included in the 1D model. The 1D model could be implemented into GMPEs using a newly proposed adjustment method. The adjusted site amplification ratios retain the nonlinear characteristics of the 1D model for strong motions and match the linear amplification ratio in the GMPE for weak motions. The nonlinearity of the present site model is reasonably similar to that of the historical models, and the present site model could satisfactorily capture the nonlinear site response in empirical data.
Funding: Supported in part by the National Natural Science Foundation of China (62373348), the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2021D01D05), the Tianshan Talent Training Program (2023TSYCLJ0021), and the Pioneer Hundred Talents Program of Chinese Academy of Sciences.
Abstract: N6-methyladenosine (m6A) is an important RNA methylation modification involved in regulating diverse biological processes across multiple species. Hence, the identification of m6A modification sites provides valuable insight into the biological mechanisms of complex diseases at the post-transcriptional level. Although a variety of identification algorithms have been proposed recently, most of them capture the features of m6A modification sites by focusing on the sequential dependencies of nucleotides at different positions in RNA sequences, while ignoring the structural dependencies of nucleotides in their three-dimensional structures. To overcome this issue, we propose a cross-species end-to-end deep learning model, namely CR-NSSD, which conducts a cross-domain representation learning process integrating nucleotide structural and sequential dependencies for RNA m6A site identification. Specifically, CR-NSSD first obtains the pre-coded representations of RNA sequences by incorporating the position information into single-nucleotide states with chaos game representation theory. It then constructs a cross-domain reconstruction encoder to learn the sequential and structural dependencies between nucleotides. By minimizing the reconstruction and binary cross-entropy losses, CR-NSSD is trained to complete the task of m6A site identification. Extensive experiments have demonstrated the promising performance of CR-NSSD by comparing it with several state-of-the-art m6A identification algorithms. Moreover, the results of cross-species prediction indicate that the integration of sequential and structural dependencies allows CR-NSSD to capture general features of m6A modification sites among different species, thus improving the accuracy of cross-species identification.
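The pre-coding step mentioned above relies on chaos game representation (CGR), which maps each nucleotide to a point that depends on both its identity and everything before it in the sequence. The following is a minimal sketch of the classical CGR iteration only, not of CR-NSSD itself: the corner assignment in `CORNERS`, the starting point, and the function name are assumptions made for illustration, since the abstract does not specify them.

```python
# Corner assignment is a common convention and is assumed here;
# the abstract does not say which corners CR-NSSD uses.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "U": (1.0, 0.0),
}


def chaos_game_representation(seq, start=(0.5, 0.5)):
    """Return one (x, y) point per nucleotide using the classical CGR rule.

    Each new point is the midpoint between the previous point and the
    corner assigned to the current nucleotide, so a point encodes the
    current base together with the preceding sequence context.
    """
    x, y = start
    points = []
    for base in seq.upper().replace("T", "U"):
        corner_x, corner_y = CORNERS[base]  # raises KeyError on unknown symbols
        x, y = (x + corner_x) / 2, (y + corner_y) / 2
        points.append((x, y))
    return points


if __name__ == "__main__":
    for base, point in zip("GGACU", chaos_game_representation("GGACU")):
        print(base, point)
```

Because every step halves the distance to a corner, the most recent nucleotide determines the quadrant of the point and earlier nucleotides refine its position within that quadrant. This is one way position information can be folded into a single-nucleotide state.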
Abstract: Identification of the drug-binding residues on the surface of proteins is a vital step in drug discovery, and it is important for understanding protein function. Most previous studies are based on the structural information of proteins, but the structures of most proteins are not available. In this article, a sequence-based method is therefore proposed that combines support vector machine (SVM)-based ensemble learning with an improved position-specific scoring matrix (PSSM). To take the local environment information of a drug-binding site into account, an improved PSSM profile scaled by a sliding window and a smoothing window was used to improve the prediction result. In addition, a new SVM-based ensemble learning method was developed to deal with the imbalanced data classification problem that commonly exists in binding site prediction. When applied to a dataset of 985 drug-binding residues, the method achieved a very promising prediction result, with an area under the curve (AUC) of 0.9264. Furthermore, an independent dataset of 349 drug-binding residues was used to evaluate the prediction model, and the prediction accuracy was 84.68%. These results suggest that our method is effective for predicting the drug-binding sites in proteins. The code and all datasets used in this article are freely available at http://cic.scu.edu.cn/bioinformatics/Ensem_DBS.zip.
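The sliding-window and smoothing-window idea can be sketched as follows. This is a generic illustration of smoothed PSSM features, not the authors' implementation: the window sizes, the zero padding at the sequence ends, the use of a plain sum for smoothing, and the function names `smooth_pssm` and `window_features` are all assumptions, as the abstract gives none of these details.

```python
def smooth_pssm(pssm, smooth_window=3):
    """Replace each PSSM row by the sum of the rows in a centred window.

    Rows that would fall outside the sequence are treated as zeros.
    The window size is an assumed, illustrative value.
    """
    half = smooth_window // 2
    n_rows, width = len(pssm), len(pssm[0])
    smoothed = []
    for i in range(n_rows):
        row = [0.0] * width
        for j in range(max(0, i - half), min(n_rows, i + half + 1)):
            for k in range(width):
                row[k] += pssm[j][k]
        smoothed.append(row)
    return smoothed


def window_features(smoothed, index, slide_window=3):
    """Concatenate the smoothed rows around one residue into a feature vector.

    Positions beyond either end of the sequence are zero-padded so that
    every residue yields a vector of the same length.
    """
    half = slide_window // 2
    width = len(smoothed[0])
    features = []
    for j in range(index - half, index + half + 1):
        if 0 <= j < len(smoothed):
            features.extend(smoothed[j])
        else:
            features.extend([0.0] * width)
    return features


if __name__ == "__main__":
    # Toy 3-residue profile with 2 columns (a real PSSM has 20 columns).
    toy = [[1, 0], [0, 2], [3, 1]]
    smoothed = smooth_pssm(toy)
    print(smoothed)
    print(window_features(smoothed, 0))
```

Each residue's feature vector would then be passed to the classifier; smoothing lets a residue's row reflect its neighbours, and the sliding window adds their rows explicitly.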
Abstract: In this work, we study the prediction of the effect of non-synonymous SNPs on several cancers. We trained classifiers on both sequential and structural features extracted from the affected genes and assessed the predictions made by the trained classifiers using cross-validation. Specifically, we investigated how the prediction performance can be improved by connecting SNPs in the context of haplotype and interacting sites of proteins encoded by affected genes. We found that accuracy was consistently enhanced by combining sequential and structural features, with increases ranging from a few percentage points up to more than 20 percentage points. The results for putting SNPs in the context of interacting sites were less consistent. Compared to individual SNPs, those that appear together in a haplotype showed stronger correlation with one another and with the phenotype, and therefore led to significant improvement in prediction performance, with the ROC score increasing from 0.81 to 0.95. Although a similar effect was expected for connecting SNPs to interacting sites in proteins, the performance actually got worse. This decrease in prediction accuracy may be caused by the small data set used in the study, as many affected proteins in the study do not have known interacting sites.
Abstract: Protein phosphorylation is an important post-translational modification that enables activation of various enzymes and receptors involved in signaling pathways. To reduce the cost of identifying phosphorylation sites by laborious experiments, computational prediction of these sites has been actively studied. In this study, by adopting a new set of features and applying feature selection by Random Forest with grid search before training a Support Vector Machine, our method achieved better or comparable phosphorylation site prediction performance on two different data sets.
Abstract: AIM: To evaluate the effect of computed tomography (CT) attenuation values of ascites on gastrointestinal (GI) perforation site prediction. METHODS: The CT attenuation values of the ascites from 51 patients with GI perforations were measured by volume rendering to calculate the mean values. The effect of the CT attenuation values of the ascites on perforation site prediction and postoperative complications was evaluated. RESULTS: In the 24 patients with colorectal perforations, the CT attenuation values of ascites were significantly higher than those in patients with perforations at other sites [22.5 Hounsfield units (HU) vs 16.5 HU, respectively, P = 0.006]. Colorectal perforation was significantly associated with postoperative complications (P = 0.038). The prediction rate of colorectal perforation using attenuation values as an auxiliary diagnosis improved by 9.8 percentage points compared to that of CT findings alone (92.2% vs 82.4%). CONCLUSION: The CT attenuation values of ascites could facilitate the prediction of perforation sites and postoperative complications in GI perforations, particularly in cases in which the perforation sites are difficult to predict by CT findings alone.
Abstract: It is well established that different sites within a protein evolve at different rates according to their role within the protein; identification of these correlated mutations can aid in tasks such as ab initio protein structure prediction, structure-function analysis, or sequence alignment. Mutual information is a standard measure of coevolution between two sites, but its application is limited by the signal-to-noise ratio. In this work we report a preliminary study to investigate whether larger sequence sets could circumvent this problem by calculating mutual information arrays for two sets of drug-naive sequences from the HIV gp120 protein for the B and C subtypes. Our results suggest that while the larger sequence sets can improve the signal-to-noise ratio, the gain is offset by the high mutation rate of HIV, which makes it more difficult to achieve consistent alignments. Nevertheless, we were able to predict a number of coevolving sites that were supported by previous experimental studies, as well as a region close to the C-terminus of the protein that was highly variable in the C subtype but highly conserved in the B subtype.
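For readers unfamiliar with the measure, the mutual information between two alignment columns can be computed as in the sketch below. It is a plain textbook estimate from observed frequencies, with no finite-sample or phylogenetic correction and no special treatment of gap characters. The function name `mutual_information` and the toy columns are illustrative and are not taken from the study.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two aligned columns.

    col_a and col_b hold the residue observed at site A and site B in
    each sequence, in the same sequence order. Probabilities are
    estimated from raw counts.
    """
    n = len(col_a)
    if n == 0 or n != len(col_b):
        raise ValueError("columns must be non-empty and of equal length")
    counts_a = Counter(col_a)
    counts_b = Counter(col_b)
    counts_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in counts_ab.items():
        p_ab = count / n
        p_a = counts_a[a] / n
        p_b = counts_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    # Perfectly covarying sites versus statistically independent sites.
    print(mutual_information("KKEE", "DDRR"))
    print(mutual_information("KKEE", "DRDR"))
```

With only a handful of sequences, chance co-occurrence inflates this estimate, which is the signal-to-noise problem the abstract refers to; larger sequence sets reduce that bias only if the alignment remains reliable.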