Funding: Supported by the Project on Inter-Governmental International Scientific and Technological Innovation Cooperation in National Key Projects of Research and Development Plan (No. 2019YFE0106400) and the National Natural Science Foundation of China (No. 81771875).
Abstract: Objective: Patients with radioactive iodine-refractory differentiated thyroid cancer (RAIR-DTC) are often diagnosed late and have limited treatment options. The correlation between RAI refractoriness and the underlying genetic characteristics has not been extensively studied. Methods: Adult patients with distant metastatic DTC were enrolled and underwent next-generation sequencing with a customized 26-gene panel (Thyro Lead). Patients were classified into RAIR-DTC or non-RAIR groups to determine the differences in clinicopathological and molecular characteristics. Molecular risk stratification (MRS) was constructed based on the association between the molecular alterations identified and RAI refractoriness, and the results were classified as high, intermediate, or low MRS. Results: A total of 220 patients with distant metastases were included, 63.2% of whom were identified as RAIR-DTC. Genetic alterations were identified in 90% of all patients, with BRAF (59.7% vs. 17.3%), TERT promoter (43.9% vs. 7.4%), and TP53 mutations (11.5% vs. 3.7%) being more prevalent in the RAIR-DTC group than in the non-RAIR group, whereas RET fusions (15.8% vs. 39.5%) showed the opposite pattern. BRAF and TERT promoter mutations were independent predictors of RAIR-DTC, accounting for 67.6% of patients with RAIR-DTC. MRS was strongly associated with RAI refractoriness (P<0.001), with an odds ratio (OR) of high to low MRS of 7.52 [95% confidence interval (95% CI), 3.96-14.28; P<0.001] and an OR of intermediate to low MRS of 3.20 (95% CI, 1.01-10.14; P=0.041). Conclusions: Molecular alterations were associated with RAI refractoriness, with BRAF and TERT promoter mutations being the predominant contributors, followed by TP53 and DICER1 mutations. MRS might serve as a valuable tool for both prognosticating clinical outcomes and directing precision-based therapeutic interventions.
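The abstract reports odds ratios with 95% confidence intervals but not the underlying counts or the fitting procedure. As a minimal sketch of how an odds ratio and its interval can be derived from a 2x2 table, assuming a Wald interval on the log odds ratio (the study may well have used logistic regression instead), with a hypothetical function name and purely illustrative counts that are not taken from the study:

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for a Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Illustrative counts only (NOT from the study).
example_or, example_lower, example_upper = odds_ratio_wald_ci(20, 10, 10, 20)
```

Because the interval is symmetric on the log scale, the product of its two bounds equals the squared odds ratio, which is a quick sanity check on any reported OR and CI pair.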
Abstract: Text classification, by automatically categorizing texts, is one of the foundational elements of natural language processing applications. This study investigates how text classification performance can be improved through the integration of entity-relation information obtained from the Wikidata (Wikipedia database) database and BERT-based pre-trained Named Entity Recognition (NER) models. Focusing on a significant challenge in the field of natural language processing (NLP), the research evaluates the potential of using entity and relational information to extract deeper meaning from texts. The adopted methodology encompasses a comprehensive approach that includes text preprocessing, entity detection, and the integration of relational information. Experiments conducted on text datasets in both Turkish and English assess the performance of various classification algorithms, such as Support Vector Machine, Logistic Regression, Deep Neural Network, and Convolutional Neural Network. The results indicate that the integration of entity-relation information can significantly enhance algorithm performance in text classification tasks and offer new perspectives for information extraction and semantic analysis in NLP applications. Contributions of this work include the utilization of distantly supervised entity-relation information in Turkish text classification, the development of a Turkish relational text classification approach, and the creation of a relational database. By demonstrating potential performance improvements through the integration of distantly supervised entity-relation information into Turkish text classification, this research aims to support the effectiveness of text-based artificial intelligence (AI) tools. Additionally, it contributes to the development of multilingual text classification systems by adding deeper meaning to text content, thereby providing a valuable addition to current NLP studies and setting an important reference point for future research.
Funding: Supported by the State Administration of Traditional Chinese Medicine Base Construction Stomach Cancer Special Fund, No. Y2020CX57; and the Jiangsu Provincial Graduate Research and Practical Innovation Program Project, No. SJCX23-0799.
Abstract: BACKGROUND: Duodenal cancer is one of the most common subtypes of small intestinal cancer, and distant metastasis (DM) in this type of cancer still leads to poor prognosis. Although nomograms have recently been used in oncology, no studies have focused on the diagnostic and prognostic evaluation of DM in patients with primary duodenal cancer. AIM: To develop and evaluate nomograms for predicting the risk of DM and personalized prognosis in patients with duodenal cancer. METHODS: Data on duodenal cancer patients diagnosed between 2010 and 2019 were extracted from the Surveillance, Epidemiology, and End Results database. Univariate and multivariate logistic regression analyses were used to identify independent risk factors for DM in patients with duodenal cancer, and univariate and multivariate Cox proportional hazards regression analyses were used to determine independent prognostic factors in duodenal cancer patients with DM. Two novel nomograms were established, and the results were evaluated by receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA). RESULTS: A total of 2603 patients with duodenal cancer were included, of whom 457 (17.56%) had DM at the time of diagnosis. Logistic analysis revealed independent risk factors for DM in duodenal cancer patients, including gender, grade, tumor size, T stage, and N stage (P<0.05). Univariate and multivariate Cox analyses further identified independent prognostic factors for duodenal cancer patients with DM, including age, histological type, T stage, tumor grade, tumor size, bone metastasis, chemotherapy, and surgery (P<0.05). The accuracy of the nomograms was validated in the training set, validation set, and expanded testing set using ROC curves, calibration curves, and DCA curves. The results of Kaplan-Meier survival curves (P<0.001) indicated that both nomograms accurately predicted the occurrence and prognosis of DM in patients with duodenal cancer. CONCLUSION: The two nomograms are expected to be effective tools for predicting DM risk in duodenal cancer patients and offering personalized prognosis predictions for those with DM, potentially enhancing clinical decision-making.
Abstract: BACKGROUND: Development of distant metastasis (DM) is a major concern during treatment of nasopharyngeal carcinoma (NPC). However, studies have demonstrated improved distant control and survival in patients with advanced NPC with the addition of chemotherapy to concomitant chemoradiotherapy. Therefore, precise prediction of metastasis in patients with NPC is crucial. AIM: To develop a predictive model for metastasis in NPC using detailed magnetic resonance imaging (MRI) reports. METHODS: This retrospective study included 792 patients with non-distant metastatic NPC. A total of 469 imaging variables were obtained from detailed MRI reports. Data were stratified and randomly split into training (50%) and testing sets. Gradient boosting tree (GBT) models were built and used to select variables for predicting DM. A full model comprising all variables and a reduced model with the top-five variables were built. Model performance was assessed by area under the curve (AUC). RESULTS: Among the 792 patients, 94 developed DM during follow-up. The number of metastatic cervical nodes (30.9%), tumor invasion in the posterior half of the nasal cavity (9.7%), two sides of the pharyngeal recess (6.2%), tubal torus (3.3%), and single side of the parapharyngeal space (2.7%) were the top-five contributors for predicting DM, based on their relative importance in GBT models. The testing AUC of the full model was 0.75 (95% confidence interval [CI]: 0.69-0.82). The testing AUC of the reduced model was 0.75 (95% CI: 0.68-0.82). For the whole dataset, the full (AUC=0.76, 95% CI: 0.72-0.82) and reduced models (AUC=0.76, 95% CI: 0.71-0.81) outperformed the tumor node-staging system (AUC=0.67, 95% CI: 0.61-0.73). CONCLUSION: The GBT model outperformed the tumor node-staging system in predicting metastasis in NPC. The number of metastatic cervical nodes was identified as the principal contributing variable.
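The models in this abstract are compared by AUC. As a minimal sketch of what that number measures, the function below computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, counting ties as one half. The function name and the toy labels and scores are assumptions for illustration; the abstract does not say how the AUC or its confidence interval was computed.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 outcomes (1 = event, e.g. distant metastasis)
    scores: iterable of model scores, higher meaning higher predicted risk
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): three of the four positive/negative pairs are ordered correctly.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise form is quadratic in the number of cases, which is fine for a cohort of a few hundred patients; a rank-based formulation gives the same value more efficiently on larger data.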
Abstract: BACKGROUND: The prognosis of many patients with distant metastatic hepatocellular carcinoma (HCC) improved after they survived for several months. Compared with traditional survival analysis, conditional survival (CS), which takes into account changes in survival risk, could be used to describe dynamic survival probabilities. AIM: To evaluate CS of distant metastatic HCC patients. METHODS: Data on patients diagnosed with distant metastatic HCC between 2010 and 2015 were extracted from the Surveillance, Epidemiology, and End Results database. Univariate and multivariate Cox regression analyses were used to identify factors for overall survival (OS), while a competing risk model was used to identify risk factors for cancer-specific survival (CSS). Six-month CS was used to calculate the probability of survival for an additional 6 months at a specific time after initial diagnosis, and standardized difference (d) was used to evaluate the survival differences between subgroups. Nomograms were constructed to predict CS. RESULTS: Positive α-fetoprotein expression, higher T stage (T3 and T4), N1 stage, non-primary site surgery, non-chemotherapy, non-radiotherapy, and lung metastasis were independent risk factors for actual OS and CSS through univariate and multivariate analysis. Actual survival rates decreased over time, while CS rates gradually increased. As for the 6-month CS, the survival difference caused by chemotherapy and radiotherapy gradually disappeared over time, and the survival difference caused by lung metastasis reversed. Moreover, the influence of age and gender on survival gradually appeared. Nomograms were fitted for patients who had survived for 2, 4, and 6 months to predict 6-month conditional OS and CSS, respectively. The area under the curve (AUC) of nomograms for conditional OS decreased as time passed, and the AUC for conditional CSS gradually increased. CONCLUSION: CS for distant metastatic HCC patients substantially increased over time. With dynamic risk factors, nomograms constructed at a specific time could predict more accurate survival rates.
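Conditional survival is conventionally defined as CS(t | s) = S(s + t) / S(s), the probability of surviving a further t months given survival to month s. The sketch below applies that standard definition to a hypothetical survival curve; the function name and the survival probabilities are invented for illustration and are not the study's data.

```python
def conditional_survival(survival_by_month, survived_months, extra_months):
    """CS(extra | survived) = S(survived + extra) / S(survived).

    survival_by_month: dict mapping month -> overall survival probability S(month)
    """
    baseline = survival_by_month[survived_months]
    if baseline <= 0:
        raise ValueError("S(s) must be positive to condition on survival to s")
    return survival_by_month[survived_months + extra_months] / baseline


# Hypothetical survival probabilities (NOT from the study).
HYPOTHETICAL_SURVIVAL = {0: 1.0, 2: 0.60, 4: 0.45, 6: 0.36, 8: 0.30, 10: 0.27, 12: 0.25}

# Six-month conditional survival at diagnosis and after surviving 2, 4, and 6 months.
six_month_cs = {
    s: conditional_survival(HYPOTHETICAL_SURVIVAL, s, 6) for s in (0, 2, 4, 6)
}
```

With these made-up numbers the actual survival curve falls while the six-month conditional survival rises from 0.36 at diagnosis to about 0.69 at six months, the same qualitative pattern the abstract describes.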
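Abstract: To address incomplete semantic information in word vectors and polysemy during text feature extraction, a twice-attention weighting algorithm based on BERT (Bidirectional Encoder Representation from Transformer), named TARE, is proposed. First, in the word-vector encoding stage, a self-attention dynamic encoding algorithm built on Q, K, and V matrices captures the semantic information of the preceding and following words for the current word's vector. Second, after the model outputs sentence-level feature vectors, position markers are used to extract the corresponding parameters of the fully connected layer and construct a relation attention matrix. Finally, a sentence-level attention mechanism assigns a different attention score to each sentence-level feature vector, improving the noise resistance of the sentence-level features. Experimental results show that on the NYT-10m dataset, compared with the contrastive-learning-based CIL (Contrastive Instance Learning) algorithm, TARE improves F1 by 4.0 percentage points and improves the mean of Precision@N over the top 100, 200, and 300 items sorted by descending confidence (P@M) by 11.3 percentage points. On the NYT-10d dataset, compared with the attention-based PCNN-ATT (Piecewise Convolutional Neural Network algorithm based on ATTention mechanism) algorithm, the area under the precision-recall curve (AUC) improves by 4.8 percentage points and P@M improves by 2.1 percentage points. On the mainstream distantly supervised relation extraction (DSER) task, TARE effectively improves the model's ability to learn data features.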
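The P@M metric in this abstract is the mean of Precision@N at N = 100, 200, and 300 after sorting predictions by descending confidence. A minimal sketch of that calculation follows; the function names and the synthetic ranked list are assumptions for illustration and do not reproduce the paper's evaluation code or results.

```python
def rank_by_confidence(predictions):
    """Sort (confidence, is_correct) pairs by descending confidence; return correctness flags."""
    ordered = sorted(predictions, key=lambda pair: pair[0], reverse=True)
    return [is_correct for _, is_correct in ordered]


def precision_at_n(ranked_correct, n):
    """Fraction of correct predictions among the top n ranked items."""
    top = ranked_correct[:n]
    if n <= 0 or not top:
        raise ValueError("n must be positive and the ranking non-empty")
    return sum(top) / len(top)


def mean_precision(ranked_correct, cutoffs=(100, 200, 300)):
    """P@M: the average of Precision@N over the given cutoffs."""
    return sum(precision_at_n(ranked_correct, n) for n in cutoffs) / len(cutoffs)


# Synthetic ranking (hypothetical): 90 of the first 100 items are correct,
# 60 of the next 100, and 50 of the next 100.
synthetic_ranking = (
    [True] * 90 + [False] * 10 + [True] * 60 + [False] * 40 + [True] * 50 + [False] * 50
)
synthetic_p_at_m = mean_precision(synthetic_ranking)
```

Here Precision@100 is 0.90, Precision@200 is 0.75, and Precision@300 is about 0.667, so P@M is their average. Averaging over several cutoffs makes the metric less sensitive to any single choice of N.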