Abstract
With the continued growth of the film market and of audience demand, the value of film content is becoming increasingly apparent. Extracting film content elements is an important step toward the quantitative analysis of film content. Existing methods for extracting film content labels are mostly used for film recommendation: they focus on how well labels match audience preferences, and therefore cannot represent or summarize the film content itself. Based on the film "micro-type" formula, this paper uses clustering algorithms from machine learning together with natural language processing algorithms to intelligently extract keywords from long film reviews, and combines the results with manual annotation to obtain elements that represent film content. Machine learning algorithms are then used to analyze the correlation between each element and box office, yielding the influence weight of each element on box office.
Authors
王萃
张海悦
WANG Cui; ZHANG Haiyue (China Research Institute of Film Science & Technology)
Source
《现代电影技术》 (Advanced Motion Picture Technology)
2019, No. 9, pp. 4-9 (6 pages)
Keywords
Film Microtype (电影微类型)
Film Content Elements (电影内容元素)
Machine Learning (机器学习)
Natural Language Processing (自然语言处理)
Intelligent Extraction (智能提取)
Relevance Analysis (关联性分析)