2 articles found
Audio-Text Multimodal Speech Recognition via Dual-Tower Architecture for Mandarin Air Traffic Control Communications
1
Authors: Shuting Ge, Jin Ren, Yihua Shi, Yujun Zhang, Shunzhi Yang, Jinfeng Yang. Computers, Materials & Continua (SCIE, EI), 2024, No. 3, pp. 3215-3245 (31 pages)
In air traffic control communications (ATCC), misunderstandings between pilots and controllers can result in fatal aviation accidents. Advanced automatic speech recognition technology has emerged as a promising means of preventing miscommunication and enhancing aviation safety. However, most existing speech recognition methods merely incorporate external language models on the decoder side, which leaves the speech and text modalities insufficiently aligned semantically during the encoding phase. Furthermore, because speech sequences are longer than their text counterparts, long-distance acoustic context dependencies are hard to model, especially for extended ATCC data. To address these issues, we propose a speech-text multimodal dual-tower architecture for speech recognition. It employs cross-modal interactions to achieve close semantic alignment during the encoding stage and to strengthen the modeling of long-distance acoustic context dependencies. In addition, a two-stage training strategy is devised to derive semantics-aware acoustic representations effectively. The first stage pre-trains the speech-text multimodal encoding module to enhance inter-modal semantic alignment and long-distance acoustic context modeling. The second stage fine-tunes the entire network to bridge the gap in input modality between the training and inference phases and to improve generalization. Extensive experiments demonstrate the effectiveness of the proposed speech-text multimodal speech recognition method on the ATCC and AISHELL-1 datasets. It reduces the character error rate to 6.54% and 8.73%, respectively, and exhibits substantial performance gains of 28.76% and 23.82% compared with the best baseline model. The case studies indicate that the obtained semantics-aware acoustic representations aid in accurately recognizing terms with similar pronunciations but distinct semantics. The research provides a novel modeling paradigm for semantics-aware speech recognition in air traffic control communications, which could contribute to the advancement of intelligent and efficient aviation safety management.
Keywords: speech-text multimodal; automatic speech recognition; semantic alignment; air traffic control communications; dual-tower architecture
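The character error rate (CER) quoted in the abstract above is conventionally computed as the character-level edit distance between a reference transcript and the recognizer output, divided by the reference length. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function names and the sample strings are hypothetical, and `relative_reduction` is only one common way to express a gain over a baseline (the abstract does not state how its percentage gains were derived).

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters yields the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance / number of reference characters."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline: float, proposed: float) -> float:
    """Relative error reduction of `proposed` with respect to `baseline`."""
    return (baseline - proposed) / baseline


if __name__ == "__main__":
    # Hypothetical Mandarin example: one character dropped from a six-character reference.
    print(character_error_rate("上升到八千米", "上升到八千"))
    # Hypothetical error rates, chosen only to illustrate the arithmetic.
    print(relative_reduction(10.0, 7.5))
```

Counting errors per character rather than per word suits Mandarin transcripts, which have no whitespace word boundaries.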
Analysis of the Outcomes of Laparoscopic Surgery after Neoadjuvant Chemotherapy for Pancreatic Cancer (cited by: 14)
2
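Authors: Zhuo Qifeng, Liu Mengqi, Li Zheng, Liu Wensheng, Shi Yihua, Xu Wenyan, Ji Shunrong, Xu Xiaowu, Yu Xianjun. Chinese Journal of Surgery (《中华外科杂志》; CAS, CSCD, Peking University Core), 2022, No. 2, pp. 134-139 (6 pages)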
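Objective: To investigate the clinical outcomes of laparoscopic surgery following neoadjuvant chemotherapy for pancreatic cancer. Methods: Clinical data were retrospectively analyzed for 8 patients with pancreatic cancer who underwent laparoscopic surgery after neoadjuvant chemotherapy in the Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, between September 2019 and June 2020. There were 5 men and 3 women, aged 47 to 72 years. All patients underwent preoperative contrast-enhanced abdominal CT and PET-CT to assess tumor stage accurately and to exclude distant metastasis. Results: All 8 patients received 2 to 6 cycles of neoadjuvant chemotherapy with the AG regimen (gemcitabine 1000 mg/m² plus albumin-bound paclitaxel 125 mg/m² on days 1, 8, and 15, one cycle every 4 weeks) before surgery, and all completed surgery successfully. Five patients underwent pancreaticoduodenectomy, 2 underwent radical antegrade modular pancreatosplenectomy, and 1 underwent total pancreatectomy; there was no conversion to open or laparoscopic-assisted surgery. Operative time was 240 to 450 min, intraoperative blood loss was 100 to 500 ml, and postoperative hospital stay was 10 to 16 days. With follow-up through December 31, 2020, 1 patient had a postoperative complication (grade B pancreatic fistula with abdominal infection), and there was no perioperative death. The number of lymph nodes dissected was 9 to 31. All patients achieved R0 resection. Follow-up lasted 4.5 to 9.5 months; 1 patient with cancer of the pancreatic body and tail developed liver metastasis 2 months after surgery, and the other 7 patients remained alive and tumor-free. Conclusion: At experienced centers for minimally invasive pancreatic surgery, laparoscopic surgery after neoadjuvant chemotherapy achieves favorable clinical outcomes in patients with pancreatic cancer.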
Keywords: pancreatic neoplasms; laparoscopy; neoadjuvant chemotherapy; R0 resection; complications