3,237 articles found
1. Recommendation Algorithm Integrating CNN and Attention System in Data Extraction (Cited by: 1)
Authors: Yang Li, Fei Yin, Xianghui Hui. 《Computers, Materials & Continua》 SCIE EI, 2023, No. 5, pp. 4047-4063 (17 pages)
With the rapid development of the Internet globally since the 21st century, the amount of data information has increased exponentially. Data helps improve people’s livelihood and working conditions, as well as learning efficiency. Therefore, data extraction, analysis, and processing have become a hot issue for people from all walks of life. Traditional recommendation algorithms still have some problems, such as inaccuracy, low diversity, and low performance. To solve these problems and improve the accuracy and variety of recommendation algorithms, the research combines convolutional neural networks (CNN) and the attention model to design a recommendation algorithm based on the neural network framework. Through the text convolutional network, the input layer in the CNN is transformed into two channels: static and non-static. Meanwhile, the self-attention system focuses on the system so that data can be better processed and the accuracy of feature extraction becomes higher. The recommendation algorithm combines CNN and the attention system and divides the embedding layer into user information feature embedding and data name feature extraction embedding. It obtains data name features through a convolution kernel. Finally, the top pooling layer obtains the length vector. The attention system layer obtains the characteristics of the data type. Experimental results show that the proposed recommendation algorithm, which combines CNN and the attention system, performs better in data extraction than the traditional CNN algorithm and other currently popular recommendation algorithms. The proposed algorithm shows excellent accuracy and robustness.
Keywords: data extraction; recommendation algorithm; CNN algorithm; attention model
2. An Efficient Mechanism for Product Data Extraction from E-Commerce Websites
Authors: Malik Javed Akhtar, Zahur Ahmad, Rashid Amin, Sultan H. Almotiri, Mohammed A. Al Ghamdi, Hamza Aldabbas. 《Computers, Materials & Continua》 SCIE EI, 2020, No. 12, pp. 2639-2663 (25 pages)
A large amount of data is present on the web that can be used for purposes such as product recommendation, price comparison, and demand forecasting for a particular product. Websites are designed for human understanding and not for machines. Therefore, making data machine-readable requires techniques to grab data from web pages. Researchers have addressed the problem using two approaches, i.e., knowledge engineering and machine learning. State-of-the-art knowledge engineering approaches use the structure of documents, visual cues, clustering of attributes of data records, and text processing techniques to identify data records on a web page. Machine learning approaches use annotated pages to learn rules. These rules are used to extract data from unseen web pages. The structure of web documents is continuously evolving. Therefore, new techniques are needed to handle the emerging requirements of web data extraction. In this paper, we present a novel, simple, and efficient technique to extract data from web pages using visual styles and the structure of documents. The proposed technique detects the Rich Data Region (RDR) using the query and correlative words of the query. The RDR is then divided into data records using style similarity. Noisy elements are removed using a Common Tag Sequence (CTS) and formatting entropy. The system is implemented in Java and runs on a dataset of real-world working websites. The effectiveness of the results is evaluated using precision, recall, and F-measure and compared with five existing systems. A comparison of the proposed technique with existing systems has shown encouraging results.
Keywords: Document object model; rich data region; common tag sequence; web data extraction; deep web mining
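The abstract above reports evaluation by precision, recall, and F-measure. As a reading aid, here is a minimal sketch of how those three scores are commonly computed for an extraction task. It assumes extracted records and gold-standard records can be compared as set members; the function name `precision_recall_f1` and the sample records are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(extracted, expected):
    """Score a set of extracted records against a gold-standard set.

    Assumption: records are hashable and an exact match counts as correct.
    """
    extracted, expected = set(extracted), set(expected)
    true_positives = len(extracted & expected)

    # Precision: share of extracted records that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of gold-standard records that were found.
    recall = true_positives / len(expected) if expected else 0.0

    # F-measure (F1): harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative data only: four extracted records, three expected ones.
p, r, f = precision_recall_f1({"a", "b", "c", "d"}, {"a", "b", "e"})
print(p, r, f)
```

Exact-match scoring is the simplest choice; a real evaluation may instead allow partial matches on individual record fields.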
3. Automatic Data Extraction from Websites for Generating Aquatic Product Market Information
Authors: 袁红春, 陈莹, 孙越夫. 《Journal of Donghua University(English Edition)》 EI CAS, 2006, No. 6, pp. 15-19 (5 pages)
The massive web-based information resources have led to an increasing demand for effective automatic retrieval of target information for web applications. This paper introduces a web-based data extraction tool that deploys various algorithms to locate, extract, and filter tabular data from HTML pages and to transform them into new web-based representations. The tool has been applied in an aquaculture web application platform for extracting and generating aquatic product market information. Results show that this tool is very effective in extracting the required data from web pages.
Keywords: web data; table localization algorithm; distance algorithm; data filtering algorithm; data extraction tool
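The abstract above describes locating, extracting, and filtering tabular data from HTML pages. The sketch below illustrates that general idea with Python's standard `html.parser` module. It is not the paper's tool or its algorithms: the class `TableExtractor`, the function `extract_tables`, and the `min_columns` filter are hypothetical, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of every <table> as a list of rows."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per table found
        self._row = None   # row being built, or None when outside <tr>
        self._cell = None  # text pieces of the open cell, or None

    def _close_cell(self):
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None and self.tables:
            self.tables[-1].append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._close_row()
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._close_row()  # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing </td>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()


def extract_tables(html, min_columns=2):
    """Locate tables, extract rows, and drop rows with too few cells."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    filtered = []
    for table in parser.tables:
        rows = [row for row in table if len(row) >= min_columns]
        if rows:
            filtered.append(rows)
    return filtered


# Illustrative page fragment; the product and price are made up.
page = (
    "<table><tr><th>Product</th><th>Price</th></tr>"
    "<tr><td>Carp</td><td> 12.5 </td></tr>"
    "<tr><td>layout only</td></tr></table>"
)
print(extract_tables(page))
```

The column-count filter is a deliberately crude stand-in for a data filtering step: it discards single-cell rows that are often used for layout rather than data.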
4. Semi-structured Data Extraction and Schema Knowledge Mining
Authors: 陈恩红, WANG Xufa. 《High Technology Letters》 EI CAS, 2001, No. 1, pp. 1-5 (5 pages)
A semi-structured data extraction method is proposed to obtain the useful information embedded in a group of relevant web pages and store it with OEM (Object Exchange Model). A data mining method is then adopted to discover the schema knowledge implicit in the semi-structured data. This knowledge helps users understand the information structure of the web more deeply and thoroughly. At the same time, it provides an effective schema for querying web information.
Keywords: semi-structured data; schema; data extraction
5. Structured AJAX Data Extraction Based on Agricultural Ontology (Cited by: 6)
Authors: LI Chuan-xi, SU Ya-ru, WANG Ru-jing, WEI Yuan-yuan, HUANG He. 《Journal of Integrative Agriculture》 SCIE CAS CSCD, 2012, No. 5, pp. 784-791 (8 pages)
More and more web pages apply AJAX (Asynchronous JavaScript and XML) because of its rich interactivity and incremental communication. Observation shows that AJAX content, which traditional crawlers cannot see, is generally well structured and belongs to one specific domain. Extracting the structured data from AJAX content and annotating its semantics are very significant for further applications. In this paper, a structured AJAX data extraction method for the agricultural domain based on agricultural ontology is proposed. Firstly, Crawljax, an open AJAX crawling tool, was overridden to explore and retrieve the AJAX content; secondly, the retrieved content was partitioned into items and then classified by combining with the agricultural ontology. HTML tags and punctuation were used to segment the retrieved content into entity items. Finally, the entity items were clustered and semantic annotations were assigned to the clustering results according to the agricultural ontology. Experimental evaluation showed that the proposed approach is effective in resource exploration, entity extraction, and semantic annotation.
Keywords: information extraction; structured data; AJAX; agricultural ontology; semantic annotation
6. On Structure-based Web Data Extraction: The Model, Method and Application
Authors: 俞方桦, 戴玮, 陈家训. 《Journal of China Textile University(English Edition)》 EI CAS, 2000, No. 4, pp. 103-106 (4 pages)
Web data extraction obtains valuable data from the tremendous information resource of the World Wide Web according to a pre-defined pattern. It processes and classifies the data on the Web. A formalization of the Web data extraction procedure is presented, together with a description of the crawling and extraction algorithm. Based on the formalization, an XML-based page structure description language, TIDL, is introduced, including the object model, the HTML object reference model, and the definition of tags. Finally, a Web data gathering and querying application based on Internet agent technology, named Web Integration Services Kit (WISK), is described.
Keywords: World Wide Web; Web mining; data extraction; HTML; XML
7. L-Tree Match: A New Data Extraction Model and Algorithm for Huge Text Stream with Noises (Cited by: 4)
Authors: 邓绪斌, 朱扬勇. 《Journal of Computer Science & Technology》 SCIE EI CSCD, 2005, No. 6, pp. 763-773 (11 pages)
In this paper, a new method, named L-tree match, is presented for extracting data from complex data sources. Firstly, based on the data extraction logic presented in this work, a new data extraction model is constructed in which model components are structurally correlated via a generalized template. Secondly, a database-populating mechanism is built, along with some object-manipulating operations needed for flexible database design, to support data extraction from huge text streams. Thirdly, top-down and bottom-up strategies are combined to design a new extraction algorithm that can extract data from data sources with optional, unordered, nested, and/or noisy components. Lastly, this method is applied to extract accurate data from biological documents amounting to 100 GB for the first online integrated biological data warehouse of China.
Keywords: data extraction; data model; extraction algorithm; regular expression; wrapper
8. Intelligent and Adaptive Web Data Extraction System Using Convolutional and Long Short-Term Memory Deep Learning Networks (Cited by: 4)
Authors: Sudhir Kumar Patnaik, C. Narendra Babu, Mukul Bhave. 《Big Data Mining and Analytics》 EI, 2021, No. 4, pp. 279-297 (19 pages)
Data are crucial to the growth of e-commerce in today's world of highly demanding hyper-personalized consumer experiences, and they are collected using advanced web scraping technologies. However, core data extraction engines fail because they cannot adapt to the dynamic changes in website content. This study investigates an intelligent and adaptive web data extraction system with convolutional and Long Short-Term Memory (LSTM) networks to enable automated web page detection using the You Only Look Once (YOLO) algorithm and Tesseract LSTM to extract product details, which are detected as images from web pages. This state-of-the-art system does not need a core data extraction engine, and thus can adapt to dynamic changes in website layout. Experiments conducted on real-world retail cases demonstrate an image detection (precision) and character extraction accuracy (precision) of 97% and 99%, respectively. In addition, a mean average precision of 74%, with an input dataset of 45 objects or images, is obtained.
Keywords: adaptive web scraping; deep learning; Long Short-Term Memory (LSTM); Web data extraction; You Only Look Once (YOLO)
9. Big Data Bot with a Special Reference to Bioinformatics
Authors: Ahmad M. Al-Omari, Shefa M. Tawalbeh, Yazan H. Akkam, Mohammad Al-Tawalbeh, Shima’a Younis, Abdullah A. Mustafa, Jonathan Arnold. 《Computers, Materials & Continua》 SCIE EI, 2023, No. 5, pp. 4155-4173 (19 pages)
There are quintillions of data on deoxyribonucleic acid (DNA) and protein in publicly accessible data banks, and that number is expanding at an exponential rate. Many scientific fields, such as bioinformatics and drug discovery, rely on such data; nevertheless, gathering and extracting data from these resources is a tough undertaking. This data should go through several processes, including mining, data processing, analysis, and classification. This study proposes software that extracts data from big data repositories automatically and with the particular ability to repeat data extraction phases as many times as needed without human intervention. This software simulates the extraction of data from web-based (point-and-click) resources or graphical user interfaces that cannot be accessed using command-line tools. The software was evaluated by creating a novel database of 34 parameters for 1360 physicochemical properties of antimicrobial peptide (AMP) sequences (46240 hits) from various MARVIN software panels, which can later be used to develop novel AMPs. Furthermore, for machine learning research, the program was validated by extracting 10,000 protein tertiary structures from the Protein Data Bank. As a result, data collection from the web will become faster and less expensive, with no need for manual data extraction. The software is critical as a first step in preparing large datasets for subsequent stages of analysis, such as those using machine learning and deep learning applications.
Keywords: bioinformatics; big data; data extraction; bot; drug design
10. Research of Extracting Data from HTML Web Pages Automatically (Cited by: 1)
Authors: 王茹, 宋瀚涛, 陆玉昌. 《Journal of Beijing Institute of Technology》 EI CAS, 2003, No. S1, pp. 104-108 (5 pages)
To use the data available on the Internet, it is necessary to extract data from web pages. An HTT tree model representing HTML pages is presented. Based on the HTT model, a wrapper generation algorithm, AGW, is proposed. The AGW algorithm uses a comparing-and-correcting technique to generate the wrapper from the native characteristics of the HTT tree structure. The AGW algorithm can not only generate the wrapper automatically, but also rebuild the data schema easily and reduce the computational complexity.
Keywords: information extraction; data transformation; wrapper; HTML page
11. Extraction of Mineral Alteration Zone from ETM+ Data in Northwestern Yunnan, China
Authors: 赵志芳, 张玉君, 成秋明, 陈建平. 《Journal of China University of Geosciences》 SCIE CSCD, 2008, No. 4, pp. 416-420 (5 pages)
Alteration is regarded as significant information for mineral exploration. In this study, ETM+ remote sensing data are used for recognizing and extracting alteration zones in northwestern Yunnan (云南), China. Principal component analysis (PCA) of ETM+ bands 1, 4, 5, and 7 was employed to extract OH⁻ alterations. PCA of ETM+ bands 1, 3, 4, and 5 was used to extract Fe²⁺ (Fe³⁺) alterations. Interfering factors, such as vegetation, snow, and shadows, were masked. Alteration components were defined in the principal components (PCs) by the contributions of their diagnostic spectral bands. The alteration zones identified from remote sensing were analyzed in detail along with geological surveys and field verification. The results show that the OH⁻ alteration is a main indicator of K-feldspar, phyllic, and propylitized alterations. These alterations are closely related to porphyry copper deposits. The Fe²⁺ (Fe³⁺) alteration indicates pyritization, which is mainly related to hydrothermal or skarn-type polymetallic deposits.
Keywords: mineral alteration extraction from ETM+ data; PCA; OH⁻ alteration; Fe²⁺ (Fe³⁺) alteration; northwestern Yunnan, China
12. Algorithmic Foundation and Software Tools for Extracting Shoreline Features from Remote Sensing Imagery and LiDAR Data (Cited by: 8)
Authors: Hongxing Liu, Lei Wang, Douglas J. Sherman, Qiusheng Wu, Haibin Su. 《Journal of Geographic Information System》, 2011, No. 2, pp. 99-119 (21 pages)
This paper presents algorithmic components and corresponding software routines for extracting shoreline features from remote sensing imagery and LiDAR data. Conceptually, shoreline features are treated as boundary lines between land objects and water objects. Numerical algorithms have been identified and devised to segment and classify remote sensing imagery and LiDAR data into land and water pixels, to form and enhance land and water objects, and to trace and vectorize the boundaries between land and water objects as shoreline features. A contouring routine is developed as an alternative method for extracting shoreline features from LiDAR data. While most of the numerical algorithms are implemented in the C++ programming language, some algorithms use available functions of ArcObjects in ArcGIS. Based on VB .NET and ArcObjects programming, a graphical user interface has been developed to integrate and organize the shoreline extraction routines into a software package. This product represents the first comprehensive software tool dedicated to extracting shorelines from remotely sensed data. A Radarsat SAR image, a QuickBird multispectral image, and airborne LiDAR data have been used to demonstrate how these software routines can be used and combined to extract shoreline features from different types of input data sources: panchromatic or single-band imagery, color or multispectral imagery, and LiDAR elevation data. Our software package is freely available to the public through the internet.
Keywords: shoreline extraction; remote sensing imagery; LiDAR data; ArcGIS; ArcObjects; VB.NET
13. Rural Habitation Multistage Nature Boundary Extraction Based on Geographic Name Database
Authors: Binbin Hu, Hong Wang, Wei Zhang. 《Journal of Geoscience and Environment Protection》, 2016, No. 7, pp. 37-43 (7 pages)
To extract the boundaries of rural habitation from geographic name data and basic geographic information data, an extraction method that uses polygon aggregation is proposed. It can extract the boundaries of three levels of rural habitation: town, administrative village, and natural village. The method first extracts the boundary of each natural village by aggregating the resident polygons, then extracts the boundary of each administrative village by aggregating the natural village boundaries, and finally extracts the boundary of each town by aggregating the administrative village boundaries. The methods for extracting the boundaries at these three levels are described in detail in an experiment with basic geographic information data and geographic name data. Experimental results show that the method can serve as a reference for boundary extraction of rural habitation.
Keywords: rural habitation; geographic name data; basic geographic information data; boundary extraction; polygon aggregation
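14. Simulation of Potential Anomaly Identification for Big Data Terminals Based on Spark Computing (Cited by: 2)
Authors: 牛庆丽, 朱耀琴. 《计算机仿真》, 2024, No. 1, pp. 518-521, 526 (5 pages)
Terminal information leakage is a major problem in big data security, and the potential anomaly risk of large-scale, high-speed data streams directly affects the operating state of big data terminals. To address this, a method for identifying potential anomalies in big data terminals based on Spark computing is proposed. The degree of noise influence on potentially anomalous terminal data is analyzed, and a denoising algorithm is used to preprocess the raw terminal data. The denoised data are fed into a deep mining model for network big data to extract the features of the potentially anomalous data. A parallel classification model is built on Spark computing and an adaptive fast decision tree, and the extracted features are input to the model to identify potential anomalies in big data terminals. Simulation results show that the proposed method achieves high identification accuracy and efficiency and has greater fitness, indicating that the method is more stable.
Keywords: big data; feature extraction; potential anomaly identification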
15. The development of data acquisition and control system for extraction power supply of prototype RF ion source (Cited by: 1)
Authors: Meichu HUANG, Chundong HU, Yuanzhe ZHAO, Caichao JIANG, Yahong XIE, Shiyong CHEN, Qinglong CUI. 《Plasma Science and Technology》 SCIE EI CAS CSCD, 2018, No. 8, pp. 104-111 (8 pages)
A 16 kV/20 A power supply was developed for the extraction grid of the prototype radio frequency (RF) ion source of a neutral beam injector. To acquire the state signals of the extraction grid power supply (EGPS) and control its operation, a data acquisition and control system has been developed. This system mainly consists of an interlock protection circuit board, a photoelectric conversion circuit, optical fibers, an industrial compact peripheral component interconnect (CPCI) computer, and a host computer. The human-machine interface of the host computer delivers commands and data to the program on the CPCI computer, and offers a convenient client for setting parameters and displaying EGPS status. The CPCI computer acquires the status of the power supply. The system can turn off the EGPS quickly when EGPS faults occur. The system has been applied to the EGPS of the prototype RF ion source. Test results show that the data acquisition and control system for the EGPS meets the operating requirements of the prototype RF ion source.
Keywords: RF ion source; data acquisition; control system; TCP/IP protocol; beam extraction
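16. Research Progress and Prospects of Remote Sensing Exploration for Coalbed Methane Enrichment Areas
Authors: 秦其明, 吴自华, 叶昕, 王楠, 韩谷怀. 《自然资源遥感》 CSCD PKU Core, 2024, No. 3, pp. 1-12 (12 pages)
Coalbed methane is a self-generated, self-stored unconventional clean energy resource that occurs in coal seams and their surrounding rock. Traditional exploration methods are time-consuming and labor-intensive, whereas remote sensing offers a new route to rapid exploration of coalbed methane enrichment areas. The basic principle of remote sensing exploration for these areas is to compare the spectral characteristics of typical ground objects with the anomalous spectral characteristics of surface objects caused by hydrocarbon microseepage in enrichment areas, including mineral alteration, vegetation anomalies, and thermal anomalies; to combine these with data obtained from coalfield geology and from geophysical methods such as seismic and magnetotelluric surveys; and to carry out multi-source information extraction and integrated analysis, progressively determining the extent and gas-bearing properties of the enrichment areas. This article reviews the response mechanisms linking hydrocarbon seepage in enrichment areas to anomalies in surface rock-mineral and vegetation spectra, the various methods for inverting surface rock-mineral and vegetation spectral parameters, and the application of spectral anomaly inversion to exploring potential coalbed methane enrichment areas. It also describes the different explanations for surface thermal anomalies caused by coalbed-methane-bearing strata, and the main methods and applications for improving the accuracy of land surface temperature retrieval. In the future, combining remote sensing with coalfield geological data, seismic exploration, and magnetotelluric exploration to perform three-dimensional, multi-element information analysis and extraction will become the main route to low-cost, rapid exploration of coalbed methane enrichment areas.
Keywords: coalbed methane enrichment area; hydrocarbon microseepage; remote sensing exploration; multi-source information extraction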
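17. Research on a 3D Real-Scene Modeling Method for Roads Based on Vehicle-Borne Point Clouds (Cited by: 1)
Authors: 徐辛超, 丁雪. 《测绘与空间地理信息》, 2024, No. 2, pp. 17-20 (4 pages)
Traditional basic surveying and mapping suffers from rigid organization and management, outdated service models, and a narrow range of product forms; under the new basic surveying and mapping system, the full-element 3D real-scene model has emerged as a product. This paper explores a method for 3D real-scene modeling of urban roads based on vehicle-borne point clouds, using an urban arterial road as the test object. Point cloud data of the road and of roadside components are vectorized to obtain full-element topographic data of the road. With the component point cloud data as a reference, combined with dimensions from field annotation surveys, a template library of road components is built in 3ds Max, and each type of feature is modeled as an individual object using both the point cloud data and the vector data. Finally, the road model and the component models are merged. The results show that the full-element real-scene model of an urban road built from vehicle-borne point cloud data not only preserves the completeness and realism of the scene but also reduces working time and cost and achieves seamless integration between the different models; the accuracy of the finished model also meets the project's accuracy requirements.
Keywords: vehicle-borne point cloud; vector extraction; 3ds Max; road modeling; component modeling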
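18. Urban Built-up Area Extraction Based on Multi-source Data and the Random Forest Method: A Case Study of Zhengzhou (Cited by: 2)
Authors: 杨杰, 林敬娜, 程钢. 《测绘工程》, 2024, No. 2, pp. 8-17 (10 pages)
Threshold segmentation based on nighttime light data is widely used in research on extracting urban built-up areas, but the low resolution of nighttime light data, light overflow, and the inability of threshold segmentation to account for regional differences limit its extraction accuracy to some extent. Taking Zhengzhou as an example, this study uses LJ1-01 and NPP/VIIRS nighttime light imagery as the main data sources, combined with Landsat 8 medium-resolution remote sensing imagery, online urban point-of-interest (POI) data, and road network data, and applies random forest (RF) classification to extract Zhengzhou's built-up area in 2018. With land use data as a reference, comparative experiments and accuracy assessments are carried out between RF classification and threshold methods based on the NTL, VANUI, BANUI, PANUI, and RANUI indices, to evaluate the advantages of multi-source random forest classification for urban built-up area extraction. The experiments show that RF extracts built-up areas that are closer to the true built-up area, with higher accuracy than the threshold methods, and is more widely applicable; that extraction results and accuracy from LJ1-01 data are generally better than those from NPP/VIIRS data; and that, under RF classification, the importance of each type of feature differs considerably between the nighttime light data sources.
Keywords: built-up area extraction; multi-source data; random forest; threshold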
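The threshold segmentation baseline mentioned in the abstract above can be sketched in a few lines. This is only an illustration of the general technique, not the paper's procedure: the function names, the rule of picking the threshold whose extracted area is closest to a reference area, and the radiance values are all assumptions made for the example.

```python
def threshold_built_up(radiance, threshold):
    """Label a pixel 1 (built-up) when its nighttime radiance >= threshold."""
    return [[1 if value >= threshold else 0 for value in row]
            for row in radiance]


def built_up_area(radiance, threshold):
    """Count the pixels labelled built-up at a given threshold."""
    return sum(sum(row) for row in threshold_built_up(radiance, threshold))


def best_threshold(radiance, reference_area_px, candidates):
    """Pick the candidate threshold whose extracted area is closest to a
    reference built-up area (assumed here to be given in pixels)."""
    return min(candidates,
               key=lambda t: abs(built_up_area(radiance, t) - reference_area_px))


# Made-up 3x3 radiance grid, for illustration only.
grid = [
    [0.5, 3.0, 12.0],
    [1.0, 8.0, 20.0],
    [0.2, 0.9, 6.0],
]
print(threshold_built_up(grid, 5))
print(best_threshold(grid, reference_area_px=3, candidates=[1, 5, 7, 15]))
```

A single global threshold like this is exactly what cannot adapt to regional differences, which is the limitation the abstract cites as motivation for a multi-source classifier.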
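19. Research and Practice on the "Three-Full" Survey and Monitoring Technology System for Natural Resources (Cited by: 1)
Authors: 陈春晖, 王迎春, 张伟, 曲丽佳. 《测绘与空间地理信息》, 2024, No. 2, pp. 76-80 (5 pages)
To coordinate and integrate remote sensing data sources, carry out the extraction of land-class change information in a unified way, and streamline the workflows of the various survey, monitoring, and management tasks, this work explores a "three-full" survey and monitoring system: full-coverage land-class monitoring, full-process change tracking, and full-business data support. It integrates key technologies including collaborative multi-source data acquisition across the "space-air-ground-network", intelligent extraction of change-monitoring features, modeling and management of 3D spatiotemporal scenes, and an intelligent service platform for natural resources monitoring. The result is a unified natural resources survey and monitoring technology system with unified standards, intelligent methods, connected workflows, and advanced, practical capabilities, which more effectively supports ecological civilization construction and natural resources management.
Keywords: natural resources; survey and monitoring; data acquisition; intelligent extraction; 3D spatiotemporal scene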
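20. Knowledge Extraction and Knowledge Graph Construction Technology Based on an Ontology Library
Authors: 李津. 《科技创新与应用》, 2024, No. 11, pp. 37-40 (4 pages)
This paper first introduces the architecture of the domain ontology library, along with the main functions of basic data analysis and WordNet excerpting. It then proposes an ontology-library-based entity data extraction technique that establishes semantic relations between entities, laying the groundwork for knowledge extraction. During entity information extraction, the method first determines whether a web page belongs to the domain; if it does, the page content is divided according to specific tags, and valuable entity data are then extracted. The extracted entity data are stored in a Neo4j database, and the data in the knowledge graph are updated regularly. When data are needed, they can be retrieved from the knowledge graph, which integrates data resources and realizes the value of the data.
Keywords: ontology library; entity data extraction; Neo4j database; data retrieval; knowledge graph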