Abstract
Composition of weighted finite-state transducers (WFSTs) makes it possible to combine the models needed for speech recognition, so that the different knowledge sources can be applied jointly during recognition and recognition performance improves. The conventional composition algorithm, however, stores useless states and transitions as it runs. When composing models such as the dictionary and the language model, it needs 1 GB or more of memory just to hold this useless information, which directly accounts for the algorithm's high space complexity. To address this problem, this paper proposes the synchronized pruning composition algorithm (SPCA). SPCA improves on the conventional algorithm by analyzing and removing useless information while composition is in progress. Experiments show that, compared with the classical composition algorithm, SPCA reduces average memory usage by 14.99% and peak memory usage by 25.72%, effectively lowering the space complexity of composition.
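The memory problem described above can be illustrated with a small sketch. The code below is not the paper's SPCA, whose details the abstract does not give. It shows only the conventional two-step baseline: an epsilon-free composition over the tropical semiring, followed by a separate pass that removes states from which no final state can be reached. The dictionary-based transducer layout, the function names `compose` and `trim`, and the toy machines `lexicon` and `grammar` are all assumptions made for illustration.

```python
from collections import deque

def compose(a, b):
    """Epsilon-free composition: match a's output labels to b's input labels.

    Weights are added (tropical semiring). Every reachable state pair is
    stored, including pairs that can never reach a final state.
    """
    start = (a["start"], b["start"])
    arcs, finals = {}, {}
    queue, seen = deque([start]), {start}
    while queue:
        state = queue.popleft()
        qa, qb = state
        arcs[state] = []
        if qa in a["finals"] and qb in b["finals"]:
            finals[state] = a["finals"][qa] + b["finals"][qb]
        for ilab, mid, w1, na in a["arcs"].get(qa, []):
            for mid2, olab, w2, nb in b["arcs"].get(qb, []):
                if mid == mid2:
                    nxt = (na, nb)
                    arcs[state].append((ilab, olab, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return {"start": start, "finals": finals, "arcs": arcs}

def trim(t):
    """Remove states that cannot reach a final state, and arcs into them."""
    preds = {}
    for src, out in t["arcs"].items():
        for _, _, _, dst in out:
            preds.setdefault(dst, []).append(src)
    alive = set(t["finals"])
    stack = list(alive)
    while stack:  # backward search from the final states
        s = stack.pop()
        for p in preds.get(s, []):
            if p not in alive:
                alive.add(p)
                stack.append(p)
    arcs = {s: [arc for arc in out if arc[3] in alive]
            for s, out in t["arcs"].items() if s in alive}
    return {"start": t["start"], "finals": dict(t["finals"]), "arcs": arcs}

# Hypothetical toy machines; arcs are (input, output, weight, next_state).
lexicon = {"start": 0, "finals": {3: 0.0}, "arcs": {
    0: [("a", "x", 1.0, 1), ("b", "y", 2.0, 2)],
    1: [("c", "z", 0.5, 3)],
    2: [("d", "w", 0.5, 3)]}}
grammar = {"start": 0, "finals": {3: 0.0}, "arcs": {
    0: [("x", "X", 0.1, 1), ("y", "Y", 0.2, 2)],
    1: [("z", "Z", 0.3, 3)]}}  # state 2 has no arc accepting "w"

full = compose(lexicon, grammar)
trimmed = trim(full)
print(len(full["arcs"]), len(trimmed["arcs"]))  # 4 3
```

In this toy case the pair (2, 2) is a dead end: it is built and stored during composition and discarded only by the later `trim` pass. According to the abstract, SPCA avoids holding such states by detecting and removing them while composition is still running.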
Source
《计算机应用研究》 (Application Research of Computers)
CSCD
Peking University Core Journals (北大核心)
2011, No. 8, pp. 2931-2934 (4 pages)
Keywords
weighted finite-state transducer (WFST)
composition
directed graph
space complexity
speech recognition