Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 60503071), the 973 National Basic Research Program of China (Grant No. 2004CB318102), and the Postdoctoral Science Foundation of China (Grant No. 20070420275).
Abstract: The performance of the prosody model directly affects the naturalness of synthesized speech. To address the difficulty of generating the pitch contour within the prosody model, this paper studies two pitch models in depth: a corpus-based pitch model and a pitch pattern model. The key problems in the corpus-based model are calculating the distance and searching for the optimal path with a dynamic programming algorithm. The pitch pattern model describes the pitch contour with parameters such as pitch pattern, pitch average and pitch range, and six pitch patterns are presented. For pitch contour generation, the pitch pattern model is more flexible than the corpus-based model. Both models were integrated into a real text-to-speech (TTS) system, and the mean opinion score (MOS) results for synthesized Mandarin speech show that the pitch pattern model outperforms the corpus-based pitch model.
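The optimal-path search mentioned for the corpus-based model can be illustrated with a generic dynamic programming (Viterbi-style) sketch. The abstract does not define the distance measures, so everything here is an assumption made for illustration: `best_path`, `target_costs` (distance between each candidate and the target at a position) and `join_cost` (distance between adjacent candidates) are hypothetical names, not the paper's formulation.

```python
from typing import Callable, List, Sequence, Tuple


def best_path(
    target_costs: Sequence[Sequence[float]],
    join_cost: Callable[[int, int, int], float],
) -> Tuple[float, List[int]]:
    """Return (total cost, chosen candidate index per position).

    target_costs[t][j]: assumed distance between candidate j and the
        target at position t (hypothetical definition).
    join_cost(t, i, j): assumed distance between candidate i at
        position t - 1 and candidate j at position t.
    """
    if not target_costs:
        return 0.0, []

    # cost[j] is the cheapest total cost of any path ending in candidate j.
    cost = list(target_costs[0])
    back: List[List[int]] = []  # back[t - 1][j] = best predecessor of j at t

    for t in range(1, len(target_costs)):
        new_cost: List[float] = []
        pointers: List[int] = []
        for j, tc in enumerate(target_costs[t]):
            i_best = min(
                range(len(cost)),
                key=lambda i: cost[i] + join_cost(t, i, j),
            )
            new_cost.append(cost[i_best] + join_cost(t, i_best, j) + tc)
            pointers.append(i_best)
        cost = new_cost
        back.append(pointers)

    # Pick the cheapest final candidate, then follow the back-pointers.
    j = min(range(len(cost)), key=cost.__getitem__)
    total = cost[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return total, path
```

With N positions and K candidates per position, this search costs O(N·K²) distance evaluations instead of enumerating all K^N paths.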
Funding: Funded by the National Natural Science Foundation of China (Grant No. 51805146), the Fundamental Research Funds for the Central Universities (Grant No. B200202221), the Jiangsu Key R&D Program (Grant Nos. BE2018004-1, BE2018004), and the College Students' Innovative Entrepreneurial Training Plan Program (Grant No. 2020102941513).
Abstract: A 3D laser scanning strategy based on a cascaded deep neural network is proposed for a scanning system converted from a 2D Lidar with a pitching motion device. The strategy is aimed at detecting and monitoring moving targets. Taking the device characteristics into account, the strategy first proposes a cascaded deep neural network whose inputs are a 2D point cloud, a color image and the pitching angle, and whose outputs are target distance and speed classifications. The cross-entropy loss function of the network is modified using focal loss and a uniform distribution to improve recognition accuracy. A pitching range model and a pitching speed model are then proposed to determine the pitching motion parameters. Finally, adaptive scanning is realized with an integral-separation speed PID controller. The experimental results show that the accuracies of the improved network for the target detection box, distance classification and speed classification are 90.17%, 96.87% and 96.97%, respectively. The average speed error of the improved PID is 0.4239°/s, and the average strategy execution time is 0.1521 s. The range and speed models effectively reduce the collection of useless information and the deformation of the target point cloud. Experiments on the overall scanning strategy show that it improves the integrity and density of the target point cloud while ensuring that the target is captured.
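For readers unfamiliar with focal loss, the following minimal sketch shows the standard multi-class form, FL = -(1 - p_t)^γ · log(p_t), computed for a single sample. It is not the paper's loss: the abstract does not say how the uniform distribution enters the modification, so that part is left out, and the names `softmax`, `focal_loss` and the default `gamma=2.0` are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one sample's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits: Sequence[float], target: int, gamma: float = 2.0) -> float:
    """Standard focal loss for one sample (illustrative, not the paper's variant).

    p_t is the predicted probability of the true class. The factor
    (1 - p_t) ** gamma shrinks the loss of well-classified samples;
    gamma = 0 recovers ordinary cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)
```

Setting `gamma=0` gives plain cross-entropy, which makes it easy to compare how strongly confident predictions are down-weighted for larger `gamma`.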