277 articles found
Enhancing Relational Triple Extraction in Specific Domains: Semantic Enhancement and Synergy of Large Language Models and Small Pre-Trained Language Models
1
Authors: Jiakai Li, Jianpeng Hu, Geng Zhang 《Computers, Materials & Continua》 SCIE, EI, 2024, No. 5, pp. 2481-2503, 23 pages
In the process of constructing domain-specific knowledge graphs, the task of relational triple extraction plays a critical role in transforming unstructured text into structured information. Existing relational triple extraction models face multiple challenges when processing domain-specific data, including insufficient utilization of semantic interaction information between entities and relations, difficulties in handling challenging samples, and the scarcity of domain-specific datasets. To address these issues, our study introduces three innovative components: relation semantic enhancement, data augmentation, and a voting strategy, all designed to significantly improve the model’s performance in tackling domain-specific relational triple extraction tasks. We first propose an innovative attention interaction module. This method significantly enhances the semantic interaction capabilities between entities and relations by integrating semantic information from relation labels. Second, we propose a voting strategy that effectively combines the strengths of large language models (LLMs) and fine-tuned small pre-trained language models (SLMs) to reevaluate challenging samples, thereby improving the model’s adaptability in specific domains. Additionally, we explore the use of LLMs for data augmentation, aiming to generate domain-specific datasets to alleviate the scarcity of domain data. Experiments conducted on three domain-specific datasets demonstrate that our model outperforms existing comparative models in several aspects, with F1 scores exceeding the state-of-the-art models by 2%, 1.6%, and 0.6%, respectively, validating the effectiveness and generalizability of our approach.
Keywords: relational triple extraction; semantic interaction; large language models; data augmentation; specific domains
A Semantic-Sensitive Approach to Indoor and Outdoor 3D Data Organization
2
Authors: Youchen Wei 《Journal of World Architecture》 2024, No. 1, pp. 1-6, 6 pages
Building model data organization is often programmed to solve a specific problem, resulting in the inability to organize indoor and outdoor 3D scenes in an integrated manner. In this paper, existing building spatial data models are studied, and the characteristics of building information modeling standards (IFC), city geographic modeling language (CityGML), indoor modeling language (IndoorGML), and other models are compared and analyzed. CityGML and IndoorGML models face challenges in satisfying diverse application scenarios and requirements due to limitations in their expression capabilities. It is proposed to combine the semantic information of the model objects to effectively partition and organize the indoor and outdoor spatial 3D model data and to construct the indoor and outdoor data organization mechanism of “chunk-layer-subobject-entrances-area-detail object.” This method is verified by proposing a 3D data organization method for indoor and outdoor space and constructing a 3D visualization system based on it.
Keywords: integrated data organization; indoor and outdoor 3D data models; semantic models; spatial segmentation
Constructing a raster-based spatio-temporal hierarchical data model for marine fisheries application (cited by: 2)
3
Authors: SU Fenzhen, ZHOU Chenhu, ZHANG Tianyu 《Acta Oceanologica Sinica》 SCIE, CAS, CSCD, 2006, No. 1, pp. 57-63, 7 pages
Marine information has been increasing quickly. Traditional database technologies have disadvantages in manipulating large amounts of marine information that relates to position in 3-D and to time. Recently, greater emphasis has been placed on GIS (geographical information system) to deal with marine information. GIS has shown great success for terrestrial applications in the last decades, but its use in marine fields has been far more restricted. One of the main reasons is that most GIS systems or their data models are designed for land applications. They cannot cope well with the nature of the marine environment or with marine information, and this has become a fundamental challenge to traditional GIS and its data structure. This work designed a data model, the raster-based spatio-temporal hierarchical data model (RSHDM), for the marine information system, or for knowledge discovery from spatio-temporal data, which bases itself on the nature of the marine data and overcomes the shortcomings of the current spatio-temporal models when they are used in the field. As an experiment, the marine fishery data warehouse (FDW) for marine fishery management was set up, based on the RSHDM. The experiment showed that the RSHDM can cope well with the data and can easily extract the aggregations that the management needs at different levels.
Keywords: marine geographical information system; spatio-temporal data model; knowledge discovery; fishery management; data warehouse
A Spatio-temporal Data Model for Road Network in Data Center Based on Incremental Updating in Vehicle Navigation System (cited by: 1)
4
Authors: WU Huisheng, LIU Zhaoli, ZHANG Shuwen, ZUO Xiuling 《Chinese Geographical Science》 SCIE, CSCD, 2011, No. 3, pp. 346-353, 8 pages
The technique of incremental updating, which can better guarantee that the navigational map stays up to date, is the direction in which navigational road network updating is developing. The data center of a vehicle navigation system is in charge of storing incremental data, and the spatio-temporal data model used to store that data affects how efficiently the data center responds to requests for incremental data from the vehicle terminal. Based on an analysis of the shortcomings of several typical spatio-temporal data models used in the data center, and building on the base map with overlay model, the reverse map with overlay model (RMOM) was put forward so that the data center can respond rapidly to incremental data requests. RMOM lets the data center store not only the current complete road network data, but also the overlays of incremental data from the time when each road network change occurred to the current moment. Moreover, the storage mechanism and index structure of the incremental data were designed, and the implementation algorithm of RMOM was developed. Taking the navigational road network of Guangzhou City as an example, a simulation test was conducted to validate the efficiency of RMOM. Results show that, with RMOM, the navigation database in the data center can respond to requests for incremental data with only one query and in less time. Compared with the base map with overlay model, the data center does not need to overlay incremental data temporarily, so the response time is significantly reduced. RMOM greatly improves the efficiency of response and provides strong support for keeping the navigational road network up to date.
Keywords: spatio-temporal data model; reverse map with overlay model; road network; incremental updating; vehicle navigation system; data center; vehicle terminal
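The storage idea in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `ReverseOverlayStore`, the dictionary representation of roads, and the use of `None` to mark a deleted road are all hypothetical. The point it shows is that each stored overlay is kept merged up to the current moment at update time, so a terminal's request is answered with a single lookup.

```python
class ReverseOverlayStore:
    """Hypothetical sketch of a reverse-map-with-overlay style store."""

    def __init__(self, base_map):
        self.current = dict(base_map)  # complete current road network
        self.version = 0               # number of increments applied so far
        self.overlays = {}             # version -> merged changes from that version to now

    def apply_increment(self, changes):
        """Apply one batch of changes: road_id -> attributes, or None for a deletion."""
        # Fold the new changes into every stored overlay so each stays current.
        for overlay in self.overlays.values():
            overlay.update(changes)
        self.overlays[self.version] = dict(changes)
        # Update the complete current map.
        for road_id, attrs in changes.items():
            if attrs is None:
                self.current.pop(road_id, None)
            else:
                self.current[road_id] = attrs
        self.version += 1

    def increment_since(self, client_version):
        """Return everything a terminal at client_version needs, in one lookup."""
        if client_version == self.version:
            return {}
        return self.overlays[client_version]


store = ReverseOverlayStore({"r1": "two-lane"})
store.apply_increment({"r2": "new road"})
store.apply_increment({"r1": None, "r2": "widened"})
update_for_old_terminal = store.increment_since(0)
```

The trade-off in this sketch is that every update touches all stored overlays, while every request is a single dictionary lookup; a base-map-with-overlay design would instead merge the individual increments at request time.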
Using Semantic Web Technologies to Improve the Extract Transform Load Model
5
Authors: Amena Mahmoud, Mahmoud Y. Shams, O. M. Elzeki, Nancy Awadallah Awad 《Computers, Materials & Continua》 SCIE, EI, 2021, No. 8, pp. 2711-2726, 16 pages
Semantic Web (SW) provides new opportunities for the study and application of big data, massive ranges of data sets in varied formats from multiple sources. Related studies focus on potential SW technologies for resolving big data problems, such as structurally and semantically heterogeneous data that result from the variety of data formats (structured, semi-structured, numeric, unstructured text data, email, video, audio, stock ticker). SW offers information semantically both for people and machines to retain the vast volume of data and provide a meaningful output of unstructured data. In the current research, we implement a new semantic Extract Transform Load (ETL) model that uses SW technologies for aggregating, integrating, and representing data as linked data. First, geospatial data resources are aggregated from the internet, and then a semantic ETL model is used to store the aggregated data in a semantic model after converting it to Resource Description Framework (RDF) format for successful integration and representation. The principal contribution of this research is the synthesis, aggregation, and semantic representation of geospatial data to solve problems. A case study of city data is used to illustrate the semantic ETL model’s functionalities. The results show that the proposed model solves the structural and semantic heterogeneity problems in diverse data sources for successful data aggregation, integration, and representation.
Keywords: semantic web; big data; ETL model; linked data; geospatial data
Spatio-temporal Changes and Associated Uncertainties of CENTURY-modelled SOC for Chinese Upland Soils, 1980-2010
6
Authors: LIU Xiaoyu, ZHAO Yongcun, SHI Xuezheng, WANG Shihang, FENG Xiang, YAN Fang 《Chinese Geographical Science》 SCIE, CSCD, 2021, No. 1, pp. 126-136, 11 pages
Detailed information on the spatio-temporal changes of cropland soil organic carbon (SOC) can significantly contribute to the improvement of soil fertility and mitigate climate change. Nonetheless, information and knowledge on the national-scale spatio-temporal changes and the corresponding uncertainties of SOC in Chinese upland soils remain limited. The CENTURY model was used to estimate the SOC storages and their changes in Chinese uplands from 1980 to 2010. With the Monte Carlo method, the uncertainties of CENTURY-modelled SOC dynamics associated with the spatially heterogeneous model inputs were quantified. Results revealed that the SOC storage in Chinese uplands increased from 3.03 (1.59 to 4.78) Pg C in 1980 to 3.40 (2.39 to 4.62) Pg C in 2010. The increment of SOC storage during this period was 370 Tg C, with an uncertainty interval of -440 to 1110 Tg C. The regional disparities of SOC changes reached a significant level, with considerable SOC accumulation in the Huang-Huai-Hai Plain of China and SOC loss in northeastern China. The SOC loss from Meadow soils, Black soils and Chernozems was most severe, whilst SOC accumulation in Fluvo-aquic soils, Cinnamon soils and Purplish soils was most significant. In modelling large-scale SOC dynamics, the initial soil properties were major sources of uncertainty. Hence, more detailed information concerning the soil properties must be collected. The SOC stock of Chinese uplands in 2010 was still relatively low, indicating that recommended agricultural management practices, in conjunction with effective economic and policy incentives to farmers for soil fertility improvement, are indispensable for future carbon sequestration in these regions.
Keywords: soil organic carbon (SOC); CENTURY model; uncertainty analysis; heterogeneous model input data; spatio-temporal change
Spatio-temporal model for soil characteristic of reclamation land
7
Authors: CHEN Qiu-ji (1, 2), HU Zhen-qi (1), FU Mei-chen (3), XIE Hong-quan (4), HAO Hai-fu (5) (1. China University of Mining and Technology (Beijing Campus), Beijing 100083, China; 2. Henan Polytechnic University, Jiaozuo 454000, China; 3. China University of Geosciences, Beijing 100083, China; 4. Hebei Polytechnic University, Tangshan 063009, China; 5. China Railway Shiqiju Group Corporation, Taiyuan 030600, China) 《中国有色金属学会会刊:英文版》 CSCD, 2005, No. S1, pp. 45-48, 4 pages
The development of spatio-temporal data models is introduced. According to the soil characteristics of reclamation land, we adopt the base state with amendments model of multi-layer raster to organize the spatio-temporal data, using a combined data structure of linear quadtree and linear octree for coding. The advantage of this model is that it can easily obtain the information of a certain layer and analyze the data in combination with other methods. Then, the methods for obtaining and analyzing the data are introduced. The method can provide a tool for research on soil characteristic change and spatial distribution in reclamation land.
Keywords: reclamation; soil; spatio-temporal data model; linear quadtree; linear octree
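The abstract above does not say how the linear quadtree keys are computed. A common choice, assumed here purely for illustration, is Morton (Z-order) coding, which interleaves the bits of a raster cell's row and column into one integer key; the function names below are hypothetical.

```python
def morton_encode(row, col, depth):
    """Interleave the bits of row and col into a single linear quadtree key.

    depth is the number of quadtree levels, so the raster is 2**depth cells wide.
    """
    key = 0
    for bit in range(depth):
        key |= ((col >> bit) & 1) << (2 * bit)      # column bits go to even positions
        key |= ((row >> bit) & 1) << (2 * bit + 1)  # row bits go to odd positions
    return key


def morton_decode(key, depth):
    """Recover (row, col) from a key produced by morton_encode."""
    row = col = 0
    for bit in range(depth):
        col |= ((key >> (2 * bit)) & 1) << bit
        row |= ((key >> (2 * bit + 1)) & 1) << bit
    return row, col


cell_key = morton_encode(2, 3, depth=2)
cell = morton_decode(cell_key, depth=2)
```

A linear octree key can be built the same way by interleaving three coordinates, for example row, column, and soil layer, three bits per level.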
A trajectory data warehouse solution for workforce management decision-making
8
Authors: Georgia Garani, Dimitrios Tolis, Ilias K. Savvas 《Data Science and Management》 2023, No. 2, pp. 88-97, 10 pages
In modern workforce management, the demand for new ways to maximize worker satisfaction, productivity, and security levels is endless. Workforce movement data, such as source data from an access control system, can support this ongoing process with subsequent analysis. In this study, a solution to attaining this goal is proposed, based on the design and implementation of a data mart as part of a dimensional trajectory data warehouse (TDW) that acts as a repository for the management of movement data. A novel methodological approach is proposed for modeling multiple spatial and temporal dimensions in a logical model. The case study presented in this paper for modeling and analyzing workforce movement data is intended to support human resource management decision-making, and the following discussion provides a representative example of the contribution of a TDW to information management and decision support systems. The entire process of exporting, cleaning, consolidating, and transforming data is implemented to achieve an appropriate format for final import. Structured query language (SQL) queries demonstrate the convenience of dimensional design for data analysis, and valuable information can be extracted from the movements of employees on company premises to manage the workforce efficiently and effectively. Visual analytics through data visualization support the analysis and facilitate decision-making and business intelligence.
Keywords: business intelligence; decision-making; workforce management; trajectory data warehouse (TDW); moving object; semantic modeling
An intelligent track operation and maintenance management method based on BIM and the Semantic Web
9
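Authors: 何庆, 荆传玉, 孙华坤, 姚力, 徐井芒, 王平 《图学学报》 CSCD, PKU Core, 2024, No. 3, pp. 601-612, 12 pages
Building information modeling (BIM) technology plays an important role in improving the efficiency of track operation and maintenance management. However, the data produced by different inspection and maintenance activities are highly heterogeneous and have complex spatio-temporal relationships, which hinders BIM from interpreting and integrating the data. To address this, a track operation and maintenance ontology (TOMO) based on the Industry Foundation Classes (IFC) and Semantic Web technologies was developed, with three functions: (1) simplifying BIM model information according to the application requirements of the track operation and maintenance life cycle; (2) introducing mapping rules and building a data extraction and transformation module to integrate multi-source heterogeneous data and to define the complex spatio-temporal relationships among the data in a structured way; (3) combining data-driven techniques to study methods for intelligent optimization of track fine adjustment, providing resilient decision support. Finally, static inspection data from a high-speed railway were used as an example to verify the effectiveness and practicality of the framework, which offers practical engineering guidance for promoting domain data interoperability, reducing the workload of operation and maintenance personnel, and making operation and maintenance management more intelligent.
Keywords: building information modeling; operation and maintenance management; Semantic Web technology; data-driven; resilient decision-making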
Research on a linguistic-knowledge-driven dataset for evaluating spatial semantic understanding
10
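Authors: 詹卫东, 孙春晖, 肖力铭 《语言战略研究》 CSSCI, PKU Core, 2024, No. 5, pp. 7-21, 15 pages
Over the past 20 years, deep learning has markedly improved machines' natural language processing capabilities, bringing them close to or even beyond human performance on many tasks. What machines learn from is no longer the results of human linguistic research (knowledge) but human language material (data). Now that large language models driven by data and computing power have all but completed the Tower of Babel, what is the value of the linguistic knowledge that linguists distill through in-depth study of language phenomena? This paper proposes a knowledge-to-data research approach and designs six tasks for spatial semantic understanding: judging whether spatial information is correct, identifying anomalous spatial information, restoring missing reference elements, spatial semantic role labeling, judging whether differently worded spatial expressions are synonymous, and reasoning about spatial orientation relations. Taking the construction of a Chinese evaluation dataset for spatial semantic understanding as an example, it describes the design ideas behind the SpaCE2021 to SpaCE2024 datasets, gives an overview of how the datasets were produced, and reports machine performance on spatial semantic understanding tasks. Overall, the large language models that took part in the SpaCE evaluations tend to score well on tasks that depend on surface distributional features (formal cues) and tend to perform poorly on tasks that depend on deep semantic understanding (cognitive ability). Therefore, at a time when the rapid development of artificial intelligence is marginalizing linguistic knowledge in computer information processing, the value of linguistic knowledge needs to be extended: it should guide the creation of small, refined, high-quality language data so as to improve the effectiveness and efficiency of machine learning. For computational applications, grammar research should pursue, beyond observational adequacy, descriptive adequacy, and explanatory adequacy, a more challenging goal: generative adequacy.
Keywords: artificial intelligence; large language models; linguistic knowledge; spatial semantic understanding; data synthesis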
Measuring street space quality based on multi-source data: a case study of the central urban area of Wuhu
11
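Authors: 宣蔚, 汪婷婷, 郑杰 《北京建筑大学学报》 2024, No. 1, pp. 37-44, 8 pages
After the 1980s, the development of urban economies and expressways brought drastic changes to urban structure, breaking up the traditional town form composed of streets. Street space is an important component of urban public space, and research on its spatial quality is of great significance for creating attractive streets, preserving traditional character, and integrating new elements of the times. The study finds that the overall street space quality of the central urban area of Wuhu shows a radial structure spreading from the center, and that the centers of gravity of the urban space quality measurement structure and of urban construction intensity also show a finger-like distribution that is high in the south and low in the north, and high in the interior and low on the periphery. The five types of streets in the central urban area of Wuhu are spatially rather dispersed, and the clusters of different street types differ markedly as location changes. Traffic-oriented streets tend to be urban arterial roads and expressways, but because the city's service businesses can hardly cover them, their service function is insufficient; life-oriented streets are mostly located in the core built-up area and need improved street greening and spatial openness.
Keywords: street space quality; multi-source data; spatial distribution characteristics; semantic segmentation model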
A data mining model based on an improved SPRINT classification algorithm
12
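Authors: 林敏, 王李杰 《信息技术》 2024, No. 3, pp. 170-174, 187, 6 pages
To address the long classification time and low mining accuracy of current data mining models, a data mining model based on an improved decision tree classification algorithm (SPRINT) is proposed. The original data are first linearly transformed with the min-max normalization formula; the improved SPRINT classification algorithm then classifies the input data according to their characteristics; collaborative filtering is used to generate attribute sets close to the data; the similarity of data attributes is computed; and a semantic rule set is generated, providing users with better data services. A company's marketing dataset was selected as the sample for comparative experiments. The results show that, compared with the baseline models, the proposed data mining model has a shorter classification time and higher mining accuracy, and can provide users with higher-quality data services.
Keywords: decision tree classification algorithm; collaborative filtering; semantic rule set; data mining model; neural network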
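The min-max normalization step mentioned in the abstract above is the standard linear rescaling x' = (x - min) / (max - min) * (new_max - new_min) + new_min. The sketch below illustrates only that formula; the function name, the sample figures, and the handling of a constant column are assumptions, and the improved SPRINT algorithm itself is not reproduced here.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale a list of numbers into the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every value to new_min (an assumption).
        return [new_min for _ in values]
    span = hi - lo
    return [(v - lo) / span * (new_max - new_min) + new_min for v in values]


# Hypothetical marketing figures, used only to exercise the function.
sales = [120.0, 80.0, 200.0, 140.0]
normalized = min_max_normalize(sales)
```

After this step every attribute lies on the same scale, so attributes with large raw values do not dominate later split or similarity computations.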
Semantic Web-enabled building information delivery and analysis of model data patterns (cited by: 1)
13
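Authors: 刘吉明, 段立平, 林思伟, 缪季, 赵金城 《土木与环境工程学报(中英文)》 CSCD, PKU Core, 2024, No. 1, pp. 244-253, 10 pages
The information-sharing mode that delivers information through building information models (BIM) relies on the Industry Foundation Classes (IFC) standard, which has insufficient industry applicability and is difficult to extend. To address this, the paper explores introducing the Semantic Web on top of IFC to integrate and share data from heterogeneous sources and to deliver information at the semantic level. First, the semantic modeling method is introduced through algorithm analysis and model conversion and is illustrated with a two-story steel-frame factory building. Then, data pattern analysis of the converted case is used to verify the accuracy of building information delivery and the transferability of building semantics. The case study demonstrates the feasibility of the semantic modeling method based on the IfcOWL ontology. By analyzing the data patterns of unit instances in the semantic model, the key factors that limit the method's ability to enable building information delivery are explored. For the problems facing the semantic modeling method, preliminary solutions are proposed: avoiding redundant information, developing domain ontologies, and lightweight semantic modeling. SPARQL query examples show that the parsed data patterns are effective for avoiding redundant information. The method therefore has advantages in sharing and integrating multi-source heterogeneous building information and can effectively make building information management more intelligent.
Keywords: building information delivery; Semantic Web; data pattern analysis; Industry Foundation Classes; building information modeling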
A keyword extraction model for scientific and technical literature based on data augmentation
14
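Authors: 程芮, 张海军 《情报杂志》 CSSCI, PKU Core, 2024, No. 1, pp. 135-141, 120, 8 pages
[Research purpose] Keyword extraction from scientific and technical literature is of significant value. Existing keyword extraction methods have large errors and can only extract keywords that appear in the text; they struggle to distill, from deep semantic information, words that better match the core theme of the text. This study addresses the limitations caused by insufficient mining of implicit contextual semantics in keyword extraction and the insufficient attention paid to key information. [Research method] A keyword extraction model based on data augmentation (GPT-2 BiLSTM Mul-Attention, GPBA) is proposed. It performs data augmentation with a language model and combines it with a BiLSTM+Mul-Attention extraction model to fuse and understand multi-feature semantic information. [Research conclusion] Experimental results show that the data-augmentation-based keyword extraction model GPBA outperforms the other baseline models overall and can condense and extract keywords from text more precisely.
Keywords: scientific and technical literature; keyword extraction model; data augmentation; semantic information; evaluation metrics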
Research on a new-generation business framework for electronic target compilation
15
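Authors: 李高云, 刘昕卓, 李福林, 周超, 吴腾亚 《中国电子科学研究院学报》 2024, No. 4, pp. 369-374, 6 pages
Electronic target compilation turns scattered, vague, and contradictory raw material into orderly, precise, and reliable intelligence by cross-checking and correlating various indications, discarding the dross and keeping the essence, and separating the false from the true. This paper analyzes the main problems and challenges facing traditional electronic target compilation: the rapidly growing load of processing text materials, the difficulty of extracting value from massive low-value-density data, and the lack of compilation and analysis of system-level intelligence. It proposes a new "Intelligent+" framework for the electronic target compilation business, discusses its typical core technologies, and finally presents some examples from engineering practice, aiming to provide a technical reference for research on how emerging technologies such as large models and big data can empower the electronic target compilation business model.
Keywords: electromagnetic big data; electronic target compilation; semantic extraction; large language models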
Privacy leakage in the joint training framework for semantic communication models
16
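Authors: 罗倩雯, 王碧舳, 卞志强, 许晓东, 韩书君, 张静璇 《移动通信》 2024, No. 2, pp. 111-116, 6 pages
To ensure both training efficiency and data privacy protection when terminal devices and edge servers collaboratively train semantic codec models, an end-edge collaborative training framework based on U-shaped splitting is a feasible approach. Such a framework can, to some extent, resolve the tension between training efficiency and data privacy protection. However, the intermediate feature values and feature gradients exchanged between the device and the edge may still leak the private data of the terminal device. To address this problem, a feature leakage attack algorithm targeting the collaborative training of U-shaped split semantic codec models is proposed; it reconstructs the terminal's private data by analyzing the intermediate feature values and feature gradients exchanged between the terminal device and the edge server during training. Simulation results show that, when the terminal data are inferred from the intermediate feature values of a single round, those values contain more semantic information about the data if the semantic codec model uses a shallow split point or has been trained for more rounds. In addition, when the attacker increases the number of local attack iterations and uses intermediate feature values and feature gradients from multiple rounds to infer the terminal data, the image structural similarity between the reconstructed terminal data and the real data can rise from 0.2759 to 0.4017.
Keywords: semantic communication; end-edge collaborative training; data reconstruction; privacy leakage; model splitting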
A few-shot automatic summarization method based on strengthened regularization
17
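Authors: 李清, 万卫兵 《电子科技》 2024, No. 7, pp. 16-24, 9 pages
Automatic text summarization aims to extract the main statements from text in order to compress information. Existing abstractive summarization methods cannot make full use of pre-trained models to learn the semantics of the source text, so the generated content tends to lose important information, and overfitting occurs easily on datasets with few samples. To solve these problems and obtain better fine-tuning performance, this paper uses the pre-trained model mT5 (multilingual T5) as the baseline, strengthens regularization during fine-tuning by incorporating R-drop (Regularized dropout) to improve the model's learning ability, and uses Sparse softmax to reduce ambiguity in the predicted output and ensure output accuracy. On the Chinese datasets LCSTS and CSL, the hyperparameters of the optimization methods were tested by computing BLEU (Bilingual Evaluation Understudy), and the datasets were evaluated at different orders of magnitude using Rouge as the evaluation metric. Experimental results show that the optimized pre-trained model learns the semantic representation of the source text better, maintains a good fit in few-shot settings, and generates results of high practical value.
Keywords: automatic text summarization; text generation; pre-trained model; few-shot data; strengthened regularization; sparse output; semantic representation learning; mT5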
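The R-drop regularizer named in the abstract above can be sketched for a single prediction as follows. R-drop runs the same input through the model twice with different dropout masks and adds a symmetric KL term that pulls the two output distributions together. The code works on plain lists of logits rather than on mT5, and the weight `alpha` and all function names are assumptions made for illustration; the abstract does not report the settings used.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with strictly positive entries in q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def r_drop_loss(logits_a, logits_b, target, alpha=1.0):
    """Negative log-likelihood of both passes plus alpha times the symmetric KL term."""
    p, q = softmax(logits_a), softmax(logits_b)
    nll = -(math.log(p[target]) + math.log(q[target]))
    symmetric_kl = 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    return nll + alpha * symmetric_kl


# Two hypothetical forward passes of the same input under different dropout masks.
pass_a = [1.0, 2.0, 0.5]
pass_b = [0.8, 2.3, 0.4]
loss = r_drop_loss(pass_a, pass_b, target=1, alpha=4.0)
```

When the two passes agree exactly, the KL term vanishes and the loss reduces to the ordinary likelihood term, so the regularizer only penalizes disagreement introduced by dropout.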
Knowledge Model for Electric Power Big Data Based on Ontology and Semantic Web (cited by: 19)
18
Authors: Yanhao Huang, Xiaoxin Zhou 《CSEE Journal of Power and Energy Systems》 SCIE, 2015, No. 1, pp. 19-27, 9 pages
It is very important for the development of electric power big data technology to use electric power knowledge. A new electric power knowledge theory model is proposed here to solve the problem of normalized modeling of electric power knowledge for the management and analysis of electric power big data. Current modeling techniques for electric power knowledge are viewed as inadequate because of the complexity and variety of the relationships among electric power system data. Ontology theory and semantic web technologies, used in electric power systems and in many other industry domains, provide a new kind of knowledge modeling method. Based on this, this paper proposes the structure, elements, basic calculations and multidimensional reasoning method of the new knowledge model. A modeling example of the regulations defined in an electric power system operation standard is demonstrated. Different forms of the model and related technologies are also introduced, including electric power system standard modeling, multi-type data management, unstructured data searching, knowledge display and data analysis based on semantic expansion and reduction. Research shows that the new model developed here is powerful and can adapt to various knowledge expression requirements of electric power big data. With the development of electric power big data technology, it is expected that the knowledge model will be improved and will be used in more applications.
Keywords: electric power big data; knowledge model; ontology; semantic web
STGI: a spatio-temporal grid index model for marine big data (cited by: 2)
19
Authors: Tengteng Qu, Lizhe Wang, Jian Yu, Jining Yan, Guilin Xu, Meng Li, Chengqi Cheng, Kaihua Hou, Bo Chen 《Big Earth Data》 EI, 2020, No. 4, pp. 435-450, 16 pages
Marine big data are characterized by a large amount and complex structures, which bring great challenges to data management and retrieval. Based on the GeoSOT Grid Code and the composite index structure of the MongoDB database, this paper proposes a spatio-temporal grid index model (STGI) for efficient optimized query of marine big data. A spatio-temporal secondary index is created on the spatial code and time code columns to build a composite index in the MongoDB database used for the storage of massive marine data. Multiple comparative experiments demonstrate that the retrieval efficiency of the STGI approach is increased by more than two to three times compared with other index models. Through theoretical analysis and experimental verification, it can be concluded that the STGI model is quite suitable for retrieving large-scale spatial data with low time frequency, such as marine big data.
Keywords: GeoSOT; spatio-temporal grid index model; marine big data; MongoDB
Categorical Database Generalization (cited by: 1)
20
Authors: LIU Yaolin, Martin Molenaar, AI Tinghua, LIU Yanfang 《Geo-Spatial Information Science》 2003, No. 4, pp. 1-9, 26, 10 pages
This paper focuses on the issues of categorical database generalization and emphasizes the roles of the supporting data model, integrated data model, spatial analysis and semantic analysis in database generalization. The framework contents of categorical database generalization transformation are defined. This paper presents an integrated spatial supporting data structure, a semantic supporting model and a similarity model for categorical database generalization. The concept of the transformation unit is proposed in generalization.
Keywords: categorical database generalization; data model; hierarchy; semantic evaluation model; transformation; transformation unit