Abstract: In machine learning, sentiment analysis is a technique for finding and analyzing the sentiments hidden in text. Sentiment analysis requires annotated data, which is generally annotated manually. Manual annotation is a time-consuming, costly, and laborious process. To overcome these resource constraints, this research proposes a fully automated annotation technique for aspect-level sentiment analysis. The dataset is created from the reviews of the ten most popular songs on YouTube. Reviews of five aspects (voice, video, music, lyrics, and song) are extracted. An N-gram-based technique is proposed. The complete dataset consists of 369,436 reviews, which took 173.53 s to annotate using the proposed technique, whereas manual annotation would have taken approximately 2.07 million seconds (575 h). To validate the proposed technique, a sub-dataset, Voice, is annotated both manually and with the proposed technique. Cohen's kappa statistic is used to evaluate the degree of agreement between the two annotations. The high kappa value (0.9571) shows a high level of agreement between the two. This validates that the annotation quality of the proposed technique is as good as that of manual annotation, at far lower cost. This research also contributes by consolidating the guidelines for the manual annotation process.
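The agreement measure named in the abstract can be illustrated with a minimal sketch of Cohen's kappa, computed as (observed agreement - chance agreement) / (1 - chance agreement). The function name `cohens_kappa` and the two toy label lists are hypothetical and chosen only for illustration; they are not the paper's data or code.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both label lists must be non-empty and equally long")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two annotators'
    # marginal proportions, summed over all labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical toy annotations (not from the paper): the two lists
# disagree on exactly one of ten reviews.
manual = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "pos"]
automatic = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "pos", "pos", "pos"]

kappa = cohens_kappa(manual, automatic)
print(f"kappa = {kappa:.4f}")
```

Because kappa discounts agreement expected by chance, it is lower than raw percent agreement: here the annotators agree on 9 of 10 items, yet kappa is about 0.78. A value close to 1, such as the one reported for the Voice sub-dataset, indicates near-complete agreement beyond chance.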