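A Bootstrapping Method for Open Geographic Entity Relation Extraction  Cited by: 26
Authors: 余丽, 陆锋, 刘希亮. 《测绘学报》 (EI, CSCD, PKU Core), 2016, No. 5, pp. 616-622 (7 pages)
Extracting spatial and semantic relations between geographic entities from web text demands high timeliness and strong robustness. This paper proposes an automatic method for extracting open geographic entity relations: using bootstrapping, it gathers statistics on the part-of-speech, position, and distance features of words to compute word weights in context, uses those weights to determine the keywords that describe geographic entity relations, and finally organizes them into structured instances. Experiments were carried out with Baidu Baike and Stanford CoreNLP. The results show that the method automatically mines some lexical features of natural language, requires neither domain expert knowledge nor a large annotated corpus, and suits information extraction tasks with unknown relation types. Compared with the classical frequency-based statistics Frequency, TFIDF, and PPMI, precision and recall improve by about 5% and 23%, respectively.
Keywords: text mining; geographic entity; relation extraction; quantitative evaluation; bootstrapping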
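A BootStrapping-Based Ensemble Classifier Method for Identifying Chinese Opinion Sentences  Cited by: 8
Authors: 吕云云, 李旸, 王素格. 《中文信息学报》 (CSCD, PKU Core), 2013, No. 5, pp. 84-92 (9 pages)
Large-scale, high-quality, domain-specific annotated training data is essential to classifier performance, yet annotating a training corpus is time-consuming and laborious. This paper proposes a method for identifying Chinese opinion sentences using a small annotated corpus. First, a Bootstrapping method expands the training corpus, and Bayesian, support vector machine, and maximum entropy classifiers are trained separately. Finally, an ensemble classifier is obtained by assigning weights to the three trained classifiers. Experimental results show that the ensemble outperforms each single classifier, and that with only part of the annotated training data the method achieves results close to those obtained with all of it.
Keywords: opinion sentence identification; bootstrapping; ensemble classifier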
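Extracting Sentiment Words with a Pattern-Based Bootstrapping Method  Cited by: 6
Authors: 王昌厚, 王菲. 《计算机工程与应用》 (CSCD), 2014, No. 1, pp. 127-129 (3 pages)
Sentiment lexicons play a very important role in sentiment analysis. In a web environment where new words appear frequently, it is necessary to identify new sentiment words and extend existing sentiment lexicons. A pattern-based Bootstrapping method is used to extract sentiment words from a microblog corpus. Experiments show that, while maintaining fairly good precision, the method extracts a considerable number of sentiment words not covered by traditional sentiment lexicons.
Keywords: sentiment word; pattern; bootstrapping method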
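Estimating the Distribution of a Statistic Using Saddlepoint Approximation and the Bootstrapping Method  Cited by: 2
Authors: 李述山, 王秀芬. 《山东理工大学学报(自然科学版)》 (CAS), 2004, No. 5, pp. 44-48 (5 pages)
Determining the distribution of a statistic is a key task in statistical inference. When the population distribution is known, the saddlepoint approximation gives a good approximation to the distribution of a statistic in many settings. After introducing the saddlepoint approximation, the paper presents a method that combines it with Bootstrapping to estimate the distribution of a statistic, which solves the problem of estimating the approximate distribution of a statistic when the population distribution is unknown. The distribution of the sample mean is discussed as an example.
Keywords: saddlepoint approximation; statistic; approximation method; sample mean; bootstrapping method; distribution estimation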
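The Driving Mechanism of Bootstrapping Entrepreneurial Resource Acquisition: A "Guanxi" Perspective  Cited by: 1
Authors: 杨林波, 朱兴婷. 《宁波大学学报(人文科学版)》, 2018, No. 3, pp. 93-99 (7 pages)
Bootstrapping resource acquisition helps entrepreneurs reduce their dependence on the external environment and carry out entrepreneurial activities smoothly. Based on guanxi theory, this study builds a model of the driving mechanism of Bootstrapping resource acquisition. In-depth theoretical analysis finds that an entrepreneur's human capital (formal education, business training, work experience, entrepreneurial experience) strengthens the use of Bootstrapping resource acquisition strategies by increasing the affective component (rather than the instrumental component) of "guanxi". Moreover, the higher the entrepreneur's core self-evaluation, the stronger the enhancing effect of human capital on "guanxi".
Keywords: bootstrapping resource acquisition; human capital; "guanxi"; core self-evaluation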
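Authors: 尹继豪, 樊孝忠, 刘士宁, 于江德. 《计算机研究与发展》 (EI, CSCD, PKU Core), 2007, No. z2, pp. 394-397 (4 pages)
A method for building a training corpus based on the Bootstrapping algorithm is proposed. The method randomly selects part of an automatically annotated corpus, manually corrects it to form a seed set, trains a class-based language model on that seed set, and then uses the model to automatically annotate the remaining corpus. Part of the remaining corpus is then selected and processed in the same way, and the cycle repeats until the annotation quality of the training corpus is satisfactory. Experimental results show that the method greatly reduces manual effort while keeping the annotation quality of the training corpus satisfactory.
Keywords: bootstrapping; named entity recognition; training corpus; class-based language model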
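Acquiring and Organizing Domain Entity Hypernym-Hyponym Relations by Combining Word Vectors and Bootstrapping  Cited by: 6
Authors: 马晓军, 郭剑毅, 线岩团, 毛存礼, 严馨, 余正涛. 《计算机科学》 (CSCD, PKU Core), 2018, No. 1, pp. 67-72 (6 pages)
Hypernym-hyponym relations between entities are an important semantic relation indispensable to building domain knowledge graphs, yet most traditional methods for extracting them do not consider how the relations are organized. A method combining word vectors and Bootstrapping is proposed to acquire and organize hypernym-hyponym relations among domain entities. First, a seed corpus is selected from the tourism domain. Then, a word-vector-based similarity measure is used to cluster the hypernym-hyponym relation patterns contained in the seed set; high-confidence patterns are selected and used to identify hypernym-hyponym relations in unannotated text, yielding candidate relation instances, and high-confidence instances are added to the seed set for the next iteration, until all relation instances are obtained. Finally, based on the vector offsets of hypernym-hyponym entity pairs and the characteristics of the domain entity hierarchy, a mapping-learning method is used to organize the domain entities into a hierarchy. Experimental results show that the F-score of the proposed method is nearly 10% higher than that of traditional methods.
Keywords: hypernym-hyponym relation; relation extraction; bootstrapping method; word vector; mapping learning; hierarchical relation organization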
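Authors: 赵乃刚, 邓景顺. 《山西大同大学学报(自然科学版)》, 2012, No. 6, pp. 3-6 (4 pages)
Addressing the shortcomings of <product feature, sentiment word> association pairs, the paper discusses the need to pair sentiment words with negation adverbs and proposes the <Pfeature,Flag,Sword> association triple, which more accurately expresses the sentiment orientation of review sentences toward product features. The triples are extracted in two steps: first, a trained maximum entropy model serves as the classifier and is combined with the Bootstrapping method to extract association pairs of product features and sentiment words; second, the negation adverb preceding the sentiment word is extracted, and the triple is assembled.
Keywords: maximum entropy; bootstrapping; association triple; sentiment orientation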
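A Bootstrapping-Based Method for Information Extraction from Genealogy Texts  Cited by: 3
Authors: 鲍宸洋, 任明. 《图书馆杂志》 (CSSCI, PKU Core), 2022, No. 2, pp. 93-102 (10 pages)
Automatic information extraction from genealogy texts is key to the in-depth development and use of genealogical resources. Deep learning has achieved good results in this task, but its dependence on annotated data remains one of its bottlenecks. Focusing on the lineage biographies in genealogies, this paper studies how to extract genealogical persons and relations from a small amount of annotated data. Specifically, following the idea of Bootstrapping, a small amount of annotated data serves as the initial seed set; a deep learning BiLSTM-CRF model automatically predicts label sequences for unannotated samples; samples with high confidence scores are selected and added to the annotated set, iteratively expanding it; and the finally trained model is used for named entity recognition and relation extraction. Experiments on a real dataset show that the Bootstrapping-enhanced BiLSTM-CRF model can extract genealogical information from a small amount of annotated data, making deep-learning-based genealogy information extraction more efficient. With a seed set of 250 samples, its prediction performance is close to that of a BiLSTM-CRF model trained on 1800 samples.
Keywords: genealogy text; information extraction; deep learning; bootstrapping; BiLSTM-CRF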
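The seed-set expansion loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a toy nearest-centroid classifier on 1-D numbers stands in for the BiLSTM-CRF model, and the names `train`, `predict`, `bootstrap`, and the `threshold` parameter are hypothetical.

```python
from statistics import mean


def train(labeled):
    """Fit a toy nearest-centroid model: one mean value per label."""
    by_label = {}
    for x, y in labeled:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}


def predict(model, x):
    """Return (label, confidence) with confidence in [0.5, 1].

    Assumes the model holds at least two labels.
    """
    dists = sorted((abs(x - c), y) for y, c in model.items())
    (d0, y0), (d1, _) = dists[0], dists[1]
    conf = 0.5 if d0 + d1 == 0 else d1 / (d0 + d1)
    return y0, conf


def bootstrap(seed, unlabeled, threshold=0.75, max_rounds=10):
    """Iteratively move confidently predicted samples into the labeled set."""
    labeled = list(seed)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)          # retrain on the current labeled set
        remaining, added = [], 0
        for x in pool:
            y, conf = predict(model, x)
            if conf >= threshold:       # keep only high-confidence predictions
                labeled.append((x, y))
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:                  # nothing new was accepted: stop
            break
    return train(labeled), labeled, pool


seed = [(0.0, "a"), (10.0, "b")]
model, labeled, pool = bootstrap(seed, [1.0, 2.0, 9.0, 8.0, 5.0])
```

The confidence threshold is the design choice that matters here: set too low, wrong labels enter the training set and compound over rounds; set too high, the labeled set barely grows.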
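Design and Implementation of a BootStrapping-Based Chinese Event Argument Extraction System  Cited by: 4
Authors: 赵江江, 秦兵. 《智能计算机与应用》, 2012, No. 1, pp. 16-17, 20 (3 pages)
A Chinese event argument extraction system is implemented using a BootStrapping-based method, with event argument extraction defined as a pattern-matching problem. An initial seed set is constructed first; the BootStrapping method is then introduced, as a novel step, to build the template set, and pattern matching is used to extract event arguments. For template construction, a BestMatch-based template generalization algorithm[1] is proposed: any two event instance templates[2] are matched, their matching cost is computed, and they are generalized, which improves template coverage. The system achieved good results in tests on the ACE 2005 corpus.
Keywords: event argument extraction; bootstrapping; pattern matching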
Bootstrapping Data Envelopment Analysis of Efficiency and Productivity of County Public Hospitals in Eastern, Central, and Western China after the Public Hospital Reform  Cited by: 5
Authors: 王曼丽, 方海清, 陶红兵, 程兆辉, 林小军, 蔡苗, 许昌, 蒋帅. 《Journal of Huazhong University of Science and Technology(Medical Sciences)》 (SCIE, CAS), 2017, No. 5, pp. 681-692 (12 pages)
China implemented the public hospital reform in 2012. This study used bootstrapping data envelopment analysis (DEA) to evaluate the technical efficiency (TE) and productivity of county public hospitals in Eastern, Central, and Western China after the 2012 public hospital reform. Data from 127 county public hospitals (39, 45, and 43 in Eastern, Central, and Western China, respectively) were collected for 2012–2015. Changes in TE and productivity over time were estimated by bootstrapping DEA and bootstrapping Malmquist. Disparities in TE and productivity among public hospitals in the three regions were compared by the Kruskal–Wallis H test and the Mann–Whitney U test. The average bias-corrected TE values for the four-year period were 0.6442, 0.5785, 0.6099, and 0.6094 in Eastern, Central, and Western China and the entire country, respectively, with average non-technical efficiency, low pure technical efficiency (PTE), and high scale efficiency found. Productivity increased by 8.12%, 0.25%, 12.11%, and 11.58% in China and its three regions during 2012–2015, and this increase resulted from progressive technological changes of 16.42%, 6.32%, 21.08%, and 21.42%, respectively. The TE and PTE of the county hospitals differed significantly among the three regions: Eastern and Western China showed significantly higher TE and PTE than Central China. More than 60% of county public hospitals in China and its three regions operated at decreasing returns to scale. There was considerable room for TE improvement in county hospitals in China and its three regions. During 2012–2015, the hospitals experienced progressive productivity; however, the PTE changed adversely. Moreover, Central China continuously achieved a significantly lower efficiency score than Eastern and Western China. Decision makers and administrators in China should identify the causes of the observed inefficiencies and take appropriate measures to increase the efficiency of county public hospitals in the three regions, especially in Central China.
Keywords: county public hospital; data envelopment analysis; technical efficiency; Malmquist productivity index; bootstrapping
Authors: Li Weigang, Liu Ting, Li Sheng. 《Journal of Electronics(China)》, 2008, No. 1, pp. 89-96 (8 pages)
A new approach to relation extraction is described in this paper. It adopts a bootstrapping model with a novel iteration strategy, which generates more precise examples of a specific relation. Compared with previous methods, the proposed method has three main advantages: first, it needs less manual intervention; second, more abundant and reasonable information is introduced to represent a relation pattern; third, it reduces the risk of circular dependency in bootstrapping. A scalable evaluation methodology and metrics are developed for the task, with comparable techniques, over the TianWang 100G corpus. The experimental results show that it achieves 90% precision and has excellent expansibility.
Keywords: relation extraction; bootstrapping; patterns; tuples
A Bootstrapping-based Method to Automatically Identify Data-usage Statements in Publications  Cited by: 2
Authors: Qiuzi Zhang, Qikai Cheng, Yong Huang, Wei Lu. 《Journal of Data and Information Science》, 2016, No. 1, pp. 69-85 (17 pages)
Purpose: Our study proposes a bootstrapping-based method to automatically extract data-usage statements from academic texts. Design/methodology/approach: The method for data-usage statement extraction starts with seed entities and iteratively learns patterns and data-usage statements from unlabeled text. In each iteration, new patterns are constructed and added to the pattern list based on their calculated score. Three seed-selection strategies are also proposed in this paper. Findings: The performance of the method is verified by experiments on real data collected from computer science journals. The results show that the method can achieve satisfactory performance regarding precision of extraction and extensibility of obtained patterns. Research limitations: While the triple representation of sentences is effective and efficient for extracting data-usage statements, it is unable to handle complex sentences. Additional features that can address complex sentences should thus be explored in the future. Practical implications: Data-usage statement extraction is beneficial for data-repository construction and facilitates research on data-usage tracking, dataset-based scholar search, and dataset evaluation. Originality/value: To the best of our knowledge, this paper is among the first to address the important task of automatically extracting data-usage statements from real data.
Keywords: data-usage statement extraction; information extraction; bootstrapping; unsupervised learning; academic text mining
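The seed-to-pattern-to-entity cycle summarized in this abstract can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's algorithm: a pattern is just the pair of words on either side of a known entity, its score is the number of distinct known entities it surrounds, and `contexts`, `bootstrap_patterns`, and `min_support` are hypothetical names.

```python
def contexts(tokens):
    """Yield (left_word, candidate, right_word) for every interior token."""
    for i in range(1, len(tokens) - 1):
        yield tokens[i - 1], tokens[i], tokens[i + 1]


def bootstrap_patterns(sentences, seeds, min_support=2, rounds=5):
    """Alternate between learning patterns from entities and entities from patterns."""
    tokenized = [s.split() for s in sentences]
    entities, patterns = set(seeds), set()
    for _ in range(rounds):
        size_before = (len(entities), len(patterns))
        # Step 1: score each (left, right) context by the distinct
        # known entities that appear inside it.
        support = {}
        for toks in tokenized:
            for left, mid, right in contexts(toks):
                if mid in entities:
                    support.setdefault((left, right), set()).add(mid)
        patterns |= {p for p, ents in support.items() if len(ents) >= min_support}
        # Step 2: apply the accepted patterns to harvest new entities.
        for toks in tokenized:
            for left, mid, right in contexts(toks):
                if (left, right) in patterns:
                    entities.add(mid)
        if (len(entities), len(patterns)) == size_before:
            break  # converged: no new pattern or entity this round
    return entities, patterns


sentences = [
    "we use the MNIST dataset for training",
    "we use the CIFAR dataset for testing",
    "we use the ImageNet dataset here",
    "results on the SVHN benchmark are strong",
    "results on the MNIST benchmark are weak",
]
entities, patterns = bootstrap_patterns(sentences, {"MNIST", "CIFAR"})
```

Requiring support from at least two known entities before accepting a pattern is one simple guard against drift; with `min_support=1` the weaker "the ... benchmark" context would also be accepted in this toy corpus.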
Analysis of Change Point in Surface Temperature Time Series Using Cumulative Sum Chart and Bootstrapping for Asansol Weather Observation Station, West Bengal, India  Cited by: 3
Authors: Ansar Khan, Soumendu Chatterjee, Dipak Bisai, Nilay Kanti Barman. 《American Journal of Climate Change》, 2014, No. 1, pp. 83-94 (12 pages)
This paper aims to detect short-term and long-term change points in the surface air temperature time series for the Asansol weather observation station, West Bengal, India. Temperature data for 1941 to 2010 from this observatory were collected from the Indian Meteorological Department, Kolkata. Variations and trends in the annual mean temperature, annual mean maximum temperature, and annual mean minimum temperature time series were examined. Cumulative sum (CUSUM) charts and bootstrapping were used to detect abrupt changes in the time series. Statistically significant abrupt changes and trends were detected. The major change point in the annual mean temperature occurred around 1986 (0.57°C) at the period of 25 years in the long-term regional scale. The annual mean maximum and annual mean minimum temperatures have distinct change points at level 1: abrupt changes in 1961 (confidence interval 1961, 1963) for the annual mean maximum and in 1994 (confidence interval 1993, 1996) for the annual mean minimum, at confidence levels of 100% and 98%, respectively. Before the change, the annual mean maximum and annual mean minimum temperatures were 30.90°C and 23.99°C, respectively; after the change, they were 33.93°C and 24.84°C. Over the entire period considered (1941-2010), 11 forward and backward changes were found in total: 3 (1961, 1986, and 2001) in annual mean temperature, 4 (1957, 1961, 1980, and 1994) in annual mean maximum temperature, and the remaining 4 (1968, 1981, 1994, and 2001) in annual mean minimum temperature.
Keywords: bootstrapping; change point; CUSUM; temperature time series
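A common way to combine a CUSUM chart with bootstrapping for a single change point is sketched below. It is an assumed, simplified formulation rather than the authors' exact procedure: the confidence level is taken as the share of random reorderings whose CUSUM range is smaller than the observed one, and `cusum`, `cusum_range`, and `change_point` are hypothetical names.

```python
import random
from statistics import mean


def cusum(xs):
    """Cumulative sums of deviations from the series mean."""
    m = mean(xs)
    total, out = 0.0, []
    for x in xs:
        total += x - m
        out.append(total)
    return out


def cusum_range(xs):
    """Range of the CUSUM path, with the path starting at zero."""
    path = [0.0] + cusum(xs)
    return max(path) - min(path)


def change_point(xs, n_boot=1000, seed=0):
    """Return (index of the last point before the change, confidence level)."""
    rng = random.Random(seed)
    observed = cusum_range(xs)
    shuffled = list(xs)
    below = 0
    for _ in range(n_boot):
        rng.shuffle(shuffled)  # reordering destroys any real shift in level
        if cusum_range(shuffled) < observed:
            below += 1
    path = cusum(xs)
    last_before = max(range(len(path)), key=lambda i: abs(path[i]))
    return last_before, below / n_boot


series = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 9.9,
          12.0, 12.2, 11.8, 12.1, 11.9, 12.0, 12.2, 11.8, 12.1, 11.9]
index, confidence = change_point(series)
```

Multiple change points, as reported in the abstract, are usually found by splitting the series at the detected point and repeating the analysis on each segment.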
Automatic determination method of optimal threshold based on the bootstrapping technology  Cited by: 1
Authors: Wang Jixin, Wang Yan, Zhai Xinting, Huang Yajun, Wang Zhenyu. 《Journal of Southeast University(English Edition)》 (EI, CAS), 2018, No. 2, pp. 208-212 (5 pages)
To predict the extreme load of mechanical components over their entire life, an automatic method based on the bootstrapping technology (BT) is proposed to determine the most suitable threshold. Based on all the turning points of the load history and a series of thresholds estimated in advance, the generalized Pareto distribution is established to fit the exceedances. The corresponding distribution parameters are estimated with the maximum likelihood method. Then, BT is employed to calculate the mean squared error (MSE) of each estimated threshold based on the exceedances and the specific distribution parameters. Finally, the threshold with the smallest MSE is taken as the optimal one. Compared with the kurtosis method and the mean excess function method, the average deviation of the probability density function of exceedances determined by BT is reduced by 38.52% and 29.25%, respectively. Moreover, the quantile-quantile plot of the exceedances determined by BT is closer to a straight line. The results suggest improved modeling flexibility and threshold precision. If the exceedances are insufficient, BT enlarges their number by resampling to solve the instability problem of the original distribution parameters.
Keywords: load spectrum; peak over threshold; threshold selection; bootstrapping technology; mean squared error
Maneuvering target tracking algorithm based on CDKF in observation bootstrapping strategy  Cited by: 1
Authors: 胡振涛, Zhang Jin, Fu Chunling, Li Xian. 《High Technology Letters》 (EI, CAS), 2017, No. 2, pp. 149-155 (7 pages)
The selection and optimization of model filters directly affect the precision of motion pattern identification and state estimation in maneuvering target tracking. To improve the performance of model filters, a novel maneuvering target tracking algorithm based on the central difference Kalman filter in an observation bootstrapping strategy is proposed. The framework of the interactive multiple model (IMM) is used to identify the motion pattern, and a central difference Kalman filter (CDKF) is selected as the model filter of IMM. Considering the advantage of the multi-sensor fusion method in improving the stability and reliability of observation information, the hardware cost of the observation system for multiple sensors is adopted; meanwhile, following the data assimilation technique in the ensemble Kalman filter (EnKF), a bootstrapping observation set is constructed by integrating the latest observation and the prior information of the observation noise. On that basis, these bootstrapping observations are used to optimize the filtering performance of CDKF by weighted fusion. The aim of the new algorithm is to improve the tracking precision of the observed target by the multi-sensor fusion method without increasing the number of physical sensors. Theoretical analysis and experimental results show the feasibility and efficiency of the proposed algorithm.
Keywords: maneuvering target tracking; interacting multiple model (IMM); central difference Kalman filter (CDKF); bootstrapping observation
Authors: 施万亚, 望育梅, 张琳, 邓辉. 《现代电信科技》, 2005, No. 11, pp. 14-18 (5 pages)
Keywords: Mobile IPv6; bootstrapping; AAA
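A Bootstrapping-Based Method for Extracting Event-Type Entity Relations from News  Cited by: 1
Authors: 宋卿, 戚成琳, 杨越. 《中国传媒大学学报(自然科学版)》, 2017, No. 4, pp. 46-50 (5 pages)
The core content of news is events. Existing Chinese entity relation extraction methods all target attribute-type relations and overlook event-type relations. News covers a wide range of domains, so a relation extraction method needs good domain extensibility; at the same time, manually annotating an open-domain training corpus is difficult. To address these problems, this paper proposes a Bootstrapping method that automatically generates the relation seed set and adds expansion and filtering rules during iteration, ultimately yielding entity relation extraction patterns with high accuracy and reusability. Experiments show that the proposed method performs well in extracting event-type entity relations.
Keywords: relation extraction; event-type relation; bootstrapping; open template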
A Bootstrapping Approach for Software Reliability Measurement Based on a Discretized NHPP Model
Authors: Shinji Inoue, Shigeru Yamada. 《Journal of Software Engineering and Applications》, 2013, No. 4, pp. 1-7 (7 pages)
Discrete software reliability measurement is well suited to describing a software reliability growth process that depends on a unit of the software fault-detection period, such as the number of test runs or the number of executed test cases. This paper discusses discrete software reliability measurement based on a discretized nonhomogeneous Poisson process (NHPP) model. In particular, we use a bootstrapping method in our discrete software reliability measurement to discuss statistical inference on the parameters and software reliability assessment measures of our model. Finally, we show numerical examples of interval estimation based on our bootstrapping method for several software reliability assessment measures, using actual data.
Keywords: software reliability measurement; discretized NHPP model; nonparametric bootstrapping method; regression analysis; bootstrap confidence intervals
Bootstrapping estimates of regression analysis for a non-random sample and its application in the research on anti-proliferation effects of triptolide
Authors: 曹阳, 李玫, 谢万军, 张罗漫, 吴宗贵. 《Journal of Medical Colleges of PLA(China)》 (CAS), 2003, No. 2, pp. 124-128 (5 pages)
Objective: To solve the problem of parameter estimation in the regression analysis of a non-random sample. Methods: Residuals were calculated from the regression function fitted to the original data, then modified and mean-corrected. The mean-corrected residuals were added to the original response and bootstrapped to obtain 1000 samples. Regression functions were fitted to the 1000 resamples, and the 2.5th and 97.5th percentiles of each coefficient were calculated. Results: The interval estimates derived from the bootstrap method had more statistical significance than those from the usual method. Conclusion: Bootstrapping a regression with residuals is a valid method for estimating parameters in regression analysis.
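The residual-bootstrap steps listed under Methods can be sketched for a simple linear regression as follows. This is a minimal illustration, not the authors' code: it uses the textbook form in which resampled, mean-centred residuals are added to the fitted values, it omits the residual "modification" step the abstract mentions, and `fit_line` and `residual_bootstrap_slope` are hypothetical names.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def residual_bootstrap_slope(xs, ys, n_boot=1000, seed=0):
    """Return the fitted slope and its 95% percentile bootstrap interval."""
    rng = random.Random(seed)
    a, b = fit_line(xs, ys)
    fitted = [a + b * x for x in xs]
    resid = [y - f for y, f in zip(ys, fitted)]
    r_mean = mean(resid)
    centred = [r - r_mean for r in resid]  # mean-corrected residuals
    slopes = []
    for _ in range(n_boot):
        # Build a synthetic response by resampling residuals with replacement.
        y_star = [f + rng.choice(centred) for f in fitted]
        slopes.append(fit_line(xs, y_star)[1])
    slopes.sort()
    lo = slopes[(25 * n_boot) // 1000]        # about the 2.5th percentile
    hi = slopes[(975 * n_boot) // 1000 - 1]   # about the 97.5th percentile
    return b, (lo, hi)


xs = [float(i) for i in range(10)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.0]
ys = [1.0 + 2.0 * x + e for x, e in zip(xs, noise)]
slope, (low, high) = residual_bootstrap_slope(xs, ys)
```

Because the design points `xs` are held fixed and only residuals are resampled, this variant suits data whose predictor values were chosen rather than randomly sampled, which matches the non-random-sample setting in the abstract.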