
Extraction of design factors of Beijing's central axis based on NLP
Abstract: To better mine the rich design elements embodied in Beijing's central axis, a design factor extraction method based on natural language processing (NLP) is proposed. First, a web crawler collects corpus data on the important landmarks along the central axis and its northern and southern extensions; jieba is then used to segment the crawled corpus, and stop words are removed. From the segmentation results, term frequency-inverse document frequency (TF-IDF) is used to extract the keywords of each landmark together with their weights, and the keywords with higher weights are selected as that landmark's theme words. The theme words of each landmark are clustered by word similarity, and semantic factors are extracted from the clustering results in three steps: participants, after thoroughly reading the relevant materials, vote to eliminate meaningless clusters; card sorting is used to merge the remaining clusters; and participants give each card-sorting group a fitting name drawn from perceptual vocabulary, which yields the semantic factors of each landmark. A semantic differential questionnaire is then compiled from these semantic factors, and participants are invited to rate the typical colors of each landmark in order to select the matching color factors. In total, 64 semantic factors and 22 color factors are extracted for 22 landmarks. The extracted semantic and color factors reflect well both the connotative semantics of each landmark and its extended stylistic characteristics, and provide design elements for future design work related to Beijing's central axis.
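The TF-IDF step described in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not state the exact TF-IDF variant, so the plain form tf × log(N / df) is assumed here. The stop-word list, the landmark names, and the token lists are hypothetical toy data, and the tokens are pre-split by hand where the paper uses jieba for segmentation.

```python
import math
from collections import Counter

# Hypothetical stop-word list (the paper's actual list is not given).
STOP_WORDS = {"the", "of", "a", "and", "is"}


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    # Drop stop words, mirroring the clean-up step after segmentation.
    cleaned = [[t for t in doc if t not in STOP_WORDS] for doc in docs]
    n_docs = len(cleaned)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in cleaned for term in set(doc))
    weights = []
    for doc in cleaned:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against an empty document
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_keywords(term_weights, k):
    """Pick the k highest-weighted terms; ties are broken alphabetically."""
    ranked = sorted(term_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Toy, already-segmented corpus: one token list per landmark (made up).
landmark_docs = {
    "Bell Tower": ["the", "bell", "tower", "time", "bell", "ancient"],
    "Olympic Park": ["the", "olympic", "park", "modern", "sport", "park"],
    "Qianmen": ["the", "gate", "ancient", "street", "commerce", "gate"],
}

names = list(landmark_docs)
all_weights = tf_idf([landmark_docs[name] for name in names])
theme_words = {
    name: top_keywords(w, 2) for name, w in zip(names, all_weights)
}
```

A term that occurs often in one landmark's text but rarely in the others receives a high weight, which is why such terms are natural candidates for theme words; a term shared by several landmarks (here "ancient") is weighted down.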
Authors: WANG Jinlong; SUN Wei (School of Digital Media & Design Arts, Beijing University of Posts and Telecommunications, Beijing 100876, China)
Source: Highlights of Sciencepaper Online (《中国科技论文在线精品论文》), 2022, No. 1, pp. 66-72 (7 pages)
Keywords: other subjects of civil and architectural engineering; Beijing's central axis; design factor; natural language processing; semantic differential method
