A whispered speech enhancement method is proposed that uses an auditory masking model in a modified Mel domain together with the speech absence probability (SAP). In light of the phonation characteristics of whisper, we modify the Mel-frequency scaling model, and whispered speech is filtered by the proposed model. Meanwhile, the masking threshold for each frequency band is dynamically determined by the speech absence probability. Whispered speech enhancement is then conducted by adaptively adjusting the spectral subtraction coefficients according to the different masking threshold values. Results of objective and subjective tests on the enhanced whispered signal show that, compared with other methods, the proposed method enhances the whispered signal with better subjective auditory quality and less distortion by reducing the musical noise and background noise below the masking threshold.
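The adaptive step described above can be illustrated with a minimal sketch. It assumes a common form of masking-driven spectral subtraction in which the over-subtraction factor is interpolated linearly between two bounds according to the per-band masking threshold. The function names, the bounds `alpha_min` and `alpha_max`, and the spectral floor `beta` are illustrative assumptions, not values from the paper, and the SAP-based estimation of the thresholds is not modeled here (thresholds are taken as given).

```python
def adaptive_oversubtraction(thresholds, alpha_min=1.0, alpha_max=6.0):
    """Map per-band masking thresholds to over-subtraction factors.

    Assumed rule: a low threshold (noise is audible) gets a large factor,
    a high threshold (noise is masked) gets a small one.
    """
    t_min, t_max = min(thresholds), max(thresholds)
    if t_max == t_min:
        # Flat thresholds: fall back to the most aggressive factor (arbitrary choice).
        return [alpha_max for _ in thresholds]
    span = t_max - t_min
    return [alpha_max - (alpha_max - alpha_min) * (t - t_min) / span
            for t in thresholds]


def spectral_subtract(noisy_power, noise_power, thresholds, beta=0.01):
    """Subtract a scaled noise estimate from each band of a power spectrum.

    The result is clamped to a small spectral floor (beta * noise) so that
    no band becomes negative, which limits musical noise.
    """
    alphas = adaptive_oversubtraction(thresholds)
    enhanced = []
    for y, d, alpha in zip(noisy_power, noise_power, alphas):
        enhanced.append(max(y - alpha * d, beta * d))
    return enhanced


if __name__ == "__main__":
    # Hypothetical three-band frame: same noisy and noise power, rising threshold.
    print(spectral_subtract([10.0, 10.0, 10.0], [2.0, 2.0, 2.0], [0.0, 5.0, 10.0]))
```

Bands whose noise is already masked are subtracted gently, which is one way to trade residual noise against speech distortion.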
Purpose: Move recognition in scientific abstracts is an NLP task of classifying the sentences of an abstract into different types of language units. To improve the performance of move recognition in scientific abstracts, a novel model of move recognition is proposed that outperforms the BERT-based method. Design/methodology/approach: Prevalent BERT-based models for sentence classification often classify sentences without considering their context. In this paper, inspired by the BERT masked language model (MLM), we propose a novel model called the masked sentence model that integrates the content and contextual information of the sentences in move recognition. Experiments are conducted on the benchmark dataset PubMed 20K RCT in three steps. Then, we compare our model with HSLN-RNN, BERT-based and SciBERT models using the same dataset. Findings: Compared with the BERT-based and SciBERT models, the F1 score of our model is higher by 4.96% and 4.34%, respectively, which shows the feasibility and effectiveness of the novel model, and the result of our model comes closest to the current state-of-the-art result of HSLN-RNN. Research limitations: The sequential features of move labels are not considered, which might be one of the reasons why HSLN-RNN performs better. Our model is restricted to biomedical English literature because we fine-tune it on a dataset from PubMed, which is a typical biomedical database. Practical implications: The proposed model is better and simpler in identifying move structures in scientific abstracts and is worth trying in text classification experiments that need to capture contextual features of sentences. Originality/value: The study proposes a masked sentence model based on BERT that considers the contextual features of the sentences in abstracts in a new way. The performance of this classification model is significantly improved by rebuilding the input layer without changing the structure of the neural network.
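As a rough illustration of how content and context could be combined at the input layer, the sketch below pairs each sentence with the full abstract in which that sentence is replaced by a mask token. The abstract does not specify the exact input format, so the pairing scheme, the function name `build_masked_inputs`, and the use of the literal `[MASK]` token are assumptions made for illustration only.

```python
MASK_TOKEN = "[MASK]"  # assumed placeholder; the paper's actual token is not stated


def build_masked_inputs(sentences, mask_token=MASK_TOKEN):
    """Return one (target_sentence, masked_context) pair per sentence.

    The target carries the sentence content; the masked context carries the
    surrounding abstract with the target position hidden.
    """
    pairs = []
    for i, target in enumerate(sentences):
        context = " ".join(
            mask_token if j == i else other
            for j, other in enumerate(sentences)
        )
        pairs.append((target, context))
    return pairs


if __name__ == "__main__":
    # Hypothetical three-sentence abstract.
    abstract = [
        "We study move recognition.",
        "A masked sentence model is proposed.",
        "F1 improves over the baseline.",
    ]
    for target, context in build_masked_inputs(abstract):
        print(target, "||", context)
```

Each pair could then be fed to a sentence-pair classifier, which leaves the network structure unchanged and only alters how the input is assembled.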
In the upcoming large-scale Internet of Things (IoT), it is increasingly challenging to defend against malicious traffic, due to the heterogeneity of IoT devices and the diversity of IoT communication protocols. In this paper, we propose a semi-supervised learning-based approach to detect malicious traffic at the access side. It overcomes the resource-bottleneck problem of traditional malicious traffic defenders, which are deployed at the victim side, and does not depend on labeled traffic data for model training. Specifically, we design a coarse-grained behavior model of IoT devices by self-supervised learning with unlabeled traffic data. Then, we fine-tune this model to improve its accuracy in malicious traffic detection by adopting a transfer learning method using a small amount of labeled data. Experimental results show that our method achieves an accuracy of 99.52% and an F1-score of 99.52% with only 1% of the labeled training data on the CICDDoS2019 dataset. Moreover, our method outperforms state-of-the-art supervised learning-based methods in terms of accuracy, precision, recall and F1-score with 1% of the training data.
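For reference, the four reported metrics can be computed from confusion counts as in the sketch below. It assumes a binary malicious/benign setting; the function name and the example counts are hypothetical and are not taken from the paper's experiments.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives, with "positive" meaning malicious traffic.
    Undefined ratios (zero denominators) are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(binary_metrics(tp=90, fp=10, fn=10, tn=90))
```

Accuracy and F1 coincide when false positives and false negatives are balanced, as in the hypothetical counts above.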
Sign language is mainly used to communicate with people who have hearing disabilities. It is also used to communicate with people who have developmental impairments and limited or no interaction skills. Interaction via sign language is a fruitful means of communication for hearing- and speech-impaired persons. A hand gesture recognition system is helpful for deaf and speech-impaired people, making use of human-computer interface (HCI) and convolutional neural networks (CNN) to identify the static signs of Indian Sign Language (ISL). This study introduces a shark smell optimization with deep learning based automated sign language recognition (SSODL-ASLR) model for hearing- and speech-impaired people. The presented SSODL-ASLR technique concentrates mainly on the recognition and classification of sign language provided by deaf and speech-impaired people. The SSODL-ASLR model encompasses a two-stage process, namely sign language detection and sign language classification. In the first stage, the Mask Region-based Convolutional Neural Network (Mask RCNN) model is exploited for sign language recognition. In the second stage, the SSO algorithm with a soft margin support vector machine (SM-SVM) model is used for sign language classification. To verify the classification performance of the SSODL-ASLR model, a set of simulations was carried out. The results showed that the SSODL-ASLR model outperforms other techniques.
Funding (whispered speech enhancement study): Supported by the National Natural Science Foundation of China (61071215) and the University Natural Science Research Project of Jiangsu Province (05KJB510113).
Funding (move recognition study): Supported by the project "The demonstration system of rich semantic search application in scientific literature" (Grant No. 1734) from the Chinese Academy of Sciences.
Funding (IoT malicious traffic detection study): Supported in part by the National Key R&D Program of China under Grant 2018YFA0701601, in part by the National Natural Science Foundation of China (Grant Nos. U22A2002, 61941104, 62201605), and in part by the Tsinghua University-China Mobile Communications Group Co., Ltd. Joint Institute.