3,839 articles found
A Deep Web Data Integration System for Job Search (cited by 6)
1
Authors: LIU Wei LI Xian LING Yanyan ZHANG Xiaoyu MENG Xiaofeng 《Wuhan University Journal of Natural Sciences》 CAS 2006, No. 5, pp. 1197-1201 (5 pages)
With the rapid development of the Web, more and more Web databases are available for users to access. At the same time, job seekers often have difficulty first finding the right sources and then querying them, so an integrated job search system over Web databases has become a Web application in high demand. With this in mind, we build a deep Web data integration system that gives users unified access to multiple job Web sites, acting as a job meta-search engine. The paper first presents the architecture of the system and then introduces its key components.
Keywords: web database web data integration job website
Question classification in question answering based on real-world web data sets
2
Authors: 袁晓洁 于士涛 师建兴 陈秋双 《Journal of Southeast University(English Edition)》 EI CAS 2008, No. 3, pp. 272-275 (4 pages)
To improve question answering (QA) performance on real-world web data sets, a new set of question classes and a general answer re-ranking model are defined. Using a predefined dictionary and grammatical analysis, the question classifier feeds both semantic and grammatical information into information retrieval and machine learning methods in the form of various training features, including the question word, the main verb of the question, the dependency structure, the position of the main auxiliary verb, the main noun of the question, and the top hypernym of the main noun. The QA query results are then re-ranked using question class information. Experiments show that questions in real-world web data sets can be accurately classified by the classifier and that the QA results are clearly improved after re-ranking. This indicates that applications such as QA that are built on real-world web data sets perform better when both semantic and grammatical information are used.
Keywords: question classification question answering real-world web data sets question and answer web forums re-ranking model
Web Data Cube Construction in Multidimensional On-line Analytical Processing Environment
3
Author: 朱焱 《Journal of Southwest Jiaotong University(English Edition)》 2007, No. 1, pp. 1-7 (7 pages)
This paper investigates how to integrate Web data into a multidimensional data warehouse (cube) for comprehensive on-line analytical processing (OLAP) and decision making. An approach to Web data-based cube construction is proposed, which includes Web data modeling based on MIX (Metadata based Integration model for data X-change), the design of generic and specific mapping rules, and a transformation algorithm for mapping Web data to a multidimensional array. In addition, the structure and implementation of a prototype Web data-based cube are discussed.
Keywords: web data warehousing web data-based cube MOLAP
SCMR: a semantic-based coherence micro-cluster recognition algorithm for hybrid web data stream (cited by 2)
4
Authors: 王珉 Wang Yongbin Li Ying 《High Technology Letters》 EI CAS 2016, No. 2, pp. 224-232 (9 pages)
Data aggregation from various web sources is very significant for the web data analysis domain. In addition, the recognition of coherence micro-clusters is one of the most interesting issues in the field of data aggregation. Many algorithms have been proposed to address this issue; however, their shortcoming is that they cannot recognize the micro-cluster data stream accurately. A semantic-based coherent micro-cluster recognition algorithm for hybrid web data streams is proposed. Firstly, an objective function is proposed to recognize the coherence micro-cluster, and then the semantic-based coherence micro-cluster recognition algorithm for hybrid web data streams is presented. Fi-
Keywords: hybrid web data stream coherence micro-clustering entity unified object coherence semantic computing
Developing Web Applications with the Web Data Window DTC in PB8 (cited by 1)
5
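Author: 华铨平 《现代计算机》 2003, No. 7, pp. 73-75, 79 (4 pages)
Three-tier or multi-tier architectures consisting of a browser, a Web server plus application server, and a database server have become the mainstream of application development today. This paper focuses on the use of the Web Data Window DTC in PowerBuilder 8.0 and describes how thin-client technology is implemented.
Keywords: Web Data Window DTC database PowerBuilder 8.0 web data window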
A Framework of Web Data Integrated LBS Middleware
6
Authors: MENG Xiaofeng YIN Shaoyi XIAO Zhen 《Wuhan University Journal of Natural Sciences》 CAS 2006, No. 5, pp. 1187-1191 (5 pages)
In this paper, we propose a flexible location-based service (LBS) middleware framework that makes the development and deployment of new location-based applications much easier. Treating the World Wide Web as a huge source of location-related information, we integrate commonly used web data extraction techniques into the middleware framework and expose a unified web data interface to upper-layer applications to make them more attractive. The framework also addresses common LBS issues, including positioning, location modeling, location-dependent query processing, and privacy and security management.
Keywords: location-based service (LBS) middleware web data extraction
Web Data Aggregation in MOLAP:Approach,Language,and Implementation
7
Authors: 朱焱 唐慧佳 马永强 《Journal of Southwest Jiaotong University(English Edition)》 2007, No. 3, pp. 179-186 (8 pages)
This paper investigates Web data aggregation issues in multidimensional on-line analytical processing (MOLAP) and presents a rule-driven aggregation approach. The core of the approach is the definition of aggregate rules. To define the rules for reading warehouse data and computing aggregates, a rule definition language, the array aggregation language (AAL), is developed. This language treats an array as a function from indexes to values and provides syntax and semantics based on monads. External functions can be called in aggregation rules to specify array reading, writing, and aggregating. Based on the features of AAL, array operations are unified as function operations, which can be easily expressed and automatically evaluated. To implement the aggregation approach, a processor is built for computing aggregates over the base cube and materializing them in the data warehouse, and the component structure and working principle of the aggregation processor are introduced.
Keywords: web data aggregation aggregation language MOLAP aggregation processor
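The idea in the abstract above of treating an array as a function from indexes to values can be illustrated with a minimal Python sketch. This is not AAL and does not reproduce its syntax or monad-based semantics; the helper names `from_cells` and `aggregate`, and the small (region, month) cube, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Iterable, Tuple

Index = Tuple[int, ...]
Array = Callable[[Index], float]  # an array viewed as a function: index -> value


def from_cells(cells: Dict[Index, float], default: float = 0.0) -> Array:
    """Wrap stored cube cells as a function from indexes to values."""
    return lambda idx: cells.get(idx, default)


def aggregate(arr: Array, shape: Index, axis: int,
              combine: Callable[[Iterable[float]], float] = sum) -> Array:
    """Return a new array function in which `axis` is collapsed by `combine`."""
    def reduced(idx: Index) -> float:
        # Re-insert every position along the collapsed axis and combine the values.
        return combine(arr(idx[:axis] + (k,) + idx[axis:])
                       for k in range(shape[axis]))
    return reduced


# Hypothetical 2 x 3 base cube: (region, month) -> sales. Missing cells read as 0.0.
base = from_cells({(0, 0): 5.0, (0, 1): 7.0, (0, 2): 1.0,
                   (1, 0): 2.0, (1, 2): 4.0})
sales_by_region = aggregate(base, shape=(2, 3), axis=1)               # sum over months
peak_by_month = aggregate(base, shape=(2, 3), axis=0, combine=max)    # max over regions
```

Because both the base cube and every aggregate are plain functions, aggregates compose: the result of one `aggregate` call can be passed to another without materializing intermediate arrays.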
Web Database Query Interface Annotation Based on User Collaboration
8
Authors: LIU Wei LIN Can MENG Xiaofeng 《Wuhan University Journal of Natural Sciences》 CAS 2006, No. 5, pp. 1403-1406 (4 pages)
A vision-based query interface annotation method is used to relate attributes and form elements in form-based web query interfaces; this method reaches an accuracy of 82%. A user participation method is then used to tune the result: users can answer "yes" or "no" for existing annotations, or manually annotate form elements. Mass feedback is added to the annotation algorithm to produce more accurate results. With this approach, query interface annotation can reach perfect accuracy.
Keywords: web database data integration data extraction
Application of Web Data Mining in Distance Education
9
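Author: 白伟 《山西科技》 2009, No. 2, pp. 54-55 (2 pages)
Web data mining is used to analyze distance education. Based on individual differences among learners, the paper proposes a framework for a personalized distance learning system and a concept of personalized service. Relevant information is mined to build a distance education system that is both intelligent and personalized, so as to improve the current state of distance education services.
Keywords: web data mining distance education personalized learning personalized service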
On Structure-based Web Data Extraction: The Model, Method and Application
10
Authors: 俞方桦 戴玮 陈家训 《Journal of China Textile University(English Edition)》 EI CAS 2000, No. 4, pp. 103-106 (4 pages)
Web data extraction obtains valuable data from the tremendous information resources of the World Wide Web according to a predefined pattern, and processes and classifies the data on the Web. A formalization of the Web data extraction procedure is presented, together with a description of the crawling and extraction algorithm. Based on this formalization, an XML-based page structure description language, TIDL, is introduced, including the object model, the HTML object reference model, and the definition of tags. Finally, a Web data gathering and querying application based on Internet agent technology, named Web Integration Services Kit (WISK), is described.
Keywords: World Wide Web web mining data extraction HTML XML
The Optimization and Improvement of MapReduce in Web Data Mining
11
Authors: Jun Qu Chang-Qing Yin Shangwei Song 《Journal of Software Engineering and Applications》 2015, No. 8, pp. 395-406 (12 pages)
Extracting and mining social network information from massive Web data is of both theoretical and practical significance. However, a defining feature of this task is large-scale data processing, which remains a great challenge. MapReduce is a distributed programming model: by implementing just two functions, map and reduce, distributed tasks can be carried out. Nevertheless, this model does not directly support processing heterogeneous datasets, which are common on the Web. This article proposes a new framework, called Map-Reduce-Merge, that improves on the original MapReduce framework. It adds a merge phase that can efficiently solve the problems of heterogeneous data processing. In addition, some optimization and improvement work is done based on the features of Web data.
Keywords: cloud computing web data MapReduce Map-Reduce-Merge
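The map, reduce, and merge phases named in the abstract above can be sketched in a single Python process. The abstract does not specify the merge semantics, so this sketch assumes the merge phase is a key-based join of two reduced outputs; the function names and the two toy datasets are hypothetical, and nothing here is distributed or taken from the paper's implementation.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable


def map_reduce(records: Iterable, mapper: Callable, reducer: Callable) -> Dict:
    """Run a map phase, group the emitted values by key, then run a reduce phase."""
    groups: Dict = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def merge(left: Dict, right: Dict, merger: Callable) -> Dict:
    """Merge phase (assumed here to be a join on keys present in both outputs)."""
    return {key: merger(left[key], right[key])
            for key in sorted(left.keys() & right.keys())}


# Two heterogeneous, hypothetical Web datasets with different schemas.
posts = [("alice", "hello"), ("bob", "hi"), ("alice", "again")]     # (user, text)
follows = [("bob", "alice"), ("carol", "alice"), ("alice", "bob")]  # (follower, followee)

# Each dataset goes through its own map and reduce lineage.
post_counts = map_reduce(posts, lambda r: [(r[0], 1)], lambda k, vs: sum(vs))
follower_counts = map_reduce(follows, lambda r: [(r[1], 1)], lambda k, vs: sum(vs))

# The merge phase combines the two reduced outputs into one record per user.
profiles = merge(post_counts, follower_counts,
                 lambda n_posts, n_followers: {"posts": n_posts,
                                               "followers": n_followers})
```

With plain MapReduce, joining the two datasets would typically require tagging records by origin inside a single map and reduce pass; a separate merge step keeps each lineage's mapper and reducer simple.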
A DataPool-Based Method for Generating and Maintaining Web Test Data (cited by 2)
12
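Authors: 黄陇 李诺 金茂忠 刘超 《计算机科学》 CSCD PKU Core 2006, No. 10, pp. 272-274 (3 pages)
Based on the characteristics of test data for Web applications, this paper proposes a DataPool-based method for generating and maintaining Web application test data. Building on a formal definition of DataPool and a clear description of its semantics, the DataPool editing view provides test data generation modes that correspond to the different input field types on the browser side, along with various maintenance functions. The DataPool browsing view supports selecting test data individually or in batches.
Keywords: DataPool web testing test data data maintenance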
Intelligent and Adaptive Web Data Extraction System Using Convolutional and Long Short-Term Memory Deep Learning Networks (cited by 4)
13
Authors: Sudhir Kumar Patnaik C. Narendra Babu Mukul Bhave 《Big Data Mining and Analytics》 EI 2021, No. 4, pp. 279-297 (19 pages)
Data, which are collected using advanced web scraping technologies, are crucial to the growth of e-commerce in today's world of highly demanding, hyper-personalized consumer experiences. However, core data extraction engines fail because they cannot adapt to dynamic changes in website content. This study investigates an intelligent and adaptive web data extraction system with convolutional and Long Short-Term Memory (LSTM) networks that enables automated web page detection using the You only look once (Yolo) algorithm and uses Tesseract LSTM to extract product details, which are detected as images from web pages. This state-of-the-art system does not need a core data extraction engine and can thus adapt to dynamic changes in website layout. Experiments conducted on real-world retail cases demonstrate an image detection precision of 97% and a character extraction precision of 99%. In addition, a mean average precision of 74% is obtained with an input dataset of 45 objects or images.
Keywords: adaptive web scraping deep learning Long Short-Term Memory (LSTM) web data extraction You only look once (Yolo)
Integrating Multi-Source Web Records into Relational Database (cited by 1)
14
Authors: HUANG Jianbin JI Hongbing SUN Heli 《Wuhan University Journal of Natural Sciences》 CAS 2006, No. 5, pp. 1177-1181 (5 pages)
How to integrate heterogeneous semi-structured Web records into a relational database is an important and challenging research topic. An improved conditional random field model is presented that combines learning from labeled samples and unlabeled database records in order to reduce the dependence on tediously hand-labeled training data. The proposed model is used to solve the problem of schema matching between the data source schema and the database schema. Experimental results on a large number of Web pages from diverse domains show the effectiveness of the approach.
Keywords: web data integration schema matching conditional random fields
A Dynamic XML-NS View Based Approach for the Extensible Integration of Web Data Sources
15
Authors: WU Wei LU Zheng-ding LI Rui-xuan WANG Zhi-gang 《Wuhan University Journal of Natural Sciences》 EI CAS 2004, No. 5, pp. 647-651 (5 pages)
We propose a three-step technique for the extensible integration of Web data sources. First, we use a collection of XML namespaces organized into a hierarchical structure as a medium for expressing data semantics. Second, we define the format of the resource descriptor for the information source discovery scheme so that Web data sources can be registered and deregistered dynamically. Third, we employ an inverted-index mechanism to identify the subset of information sources that are relevant to a particular user query. We describe the design, architecture, and implementation of our approach, IWDS, and illustrate its use through case examples.
CLC number: TP 311.13. Foundation item: Supported by the National Key Technologies R&D Program of China (2002BA103A04). Biography: WU Wei (1975-), male, Ph.D candidate, research direction: information integration, distributed computing.
Keywords: integration heterogeneity web data source XML namespace
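The second and third steps in the abstract above (dynamic registration of sources and an inverted index for selecting relevant ones) can be sketched as follows. This is a minimal illustration, not the IWDS implementation: the class `SourceIndex`, the treatment of a resource descriptor as a flat set of namespace-qualified terms, the "any term matches" relevance rule, and the example source names are all assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


class SourceIndex:
    """Inverted index from descriptor terms to the Web data sources declaring them."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)
        self._terms_by_source: Dict[str, Set[str]] = {}

    def register(self, source: str, terms: Iterable[str]) -> None:
        """Register a source, replacing any descriptor it registered earlier."""
        self.deregister(source)
        normalized = {t.lower() for t in terms}
        self._terms_by_source[source] = normalized
        for term in normalized:
            self._postings[term].add(source)

    def deregister(self, source: str) -> None:
        """Remove a source and its postings; unknown sources are ignored."""
        for term in self._terms_by_source.pop(source, set()):
            self._postings[term].discard(source)

    def relevant(self, query_terms: Iterable[str]) -> Set[str]:
        """Return sources whose descriptor contains at least one query term."""
        hits: Set[str] = set()
        for term in query_terms:
            hits |= self._postings.get(term.lower(), set())
        return hits


# Hypothetical sources described by namespace-qualified terms.
index = SourceIndex()
index.register("books.example", ["book:title", "book:author"])
index.register("jobs.example", ["job:title", "job:salary"])
book_sources = index.relevant(["Book:Author"])
```

Keeping a per-source term set alongside the postings makes deregistration proportional to the size of one descriptor rather than to the size of the whole index.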
An Efficient Mechanism for Product Data Extraction from E-Commerce Websites
16
Authors: Malik Javed Akhtar Zahur Ahmad Rashid Amin Sultan H. Almotiri Mohammed A. Al Ghamdi Hamza Aldabbas 《Computers, Materials & Continua》 SCIE EI 2020, No. 12, pp. 2639-2663 (25 pages)
A large amount of data is present on the web that can be used for purposes such as product recommendation, price comparison, and demand forecasting for a particular product. Websites are designed for human understanding, not for machines. Therefore, techniques are required to grab data from web pages and make it machine-readable. Researchers have addressed the problem using two approaches: knowledge engineering and machine learning. State-of-the-art knowledge engineering approaches use the structure of documents, visual cues, clustering of attributes of data records, and text processing techniques to identify data records on a web page. Machine learning approaches use annotated pages to learn rules, which are then used to extract data from unseen web pages. The structure of web documents is continuously evolving, so new techniques are needed to handle the emerging requirements of web data extraction. In this paper, we present a novel, simple, and efficient technique to extract data from web pages using the visual styles and structure of documents. The proposed technique detects the Rich Data Region (RDR) using the query and correlative words of the query. The RDR is then divided into data records using style similarity. Noisy elements are removed using a Common Tag Sequence (CTS) and formatting entropy. The system is implemented in Java and run on a dataset of real-world working websites. The effectiveness of the results is evaluated using precision, recall, and F-measure and compared with five existing systems. The comparison shows encouraging results.
Keywords: document object model rich data region common tag sequence web data extraction deep web mining
Automatic Data Extraction from Websites for Generating Aquatic Product Market Information
17
Authors: 袁红春 陈莹 孙越夫 《Journal of Donghua University(English Edition)》 EI CAS 2006, No. 6, pp. 15-19 (5 pages)
Massive web-based information resources have led to an increasing demand for effective automatic retrieval of target information for web applications. This paper introduces a web-based data extraction tool that deploys various algorithms to locate, extract, and filter tabular data from HTML pages and to transform them into new web-based representations. The tool has been applied in an aquaculture web application platform for extracting and generating aquatic product market information. Results show that the tool is very effective in extracting the required data from web pages.
Keywords: web data table localization algorithm distance algorithm data filtering algorithm data extraction tool
Creating customized data services from web pages
18
Authors: 季光 Wang Guiling Han Yanbo 《High Technology Letters》 EI CAS 2013, No. 2, pp. 203-207 (5 pages)
To extract structured data from a web page according to customized requirements, a user labels some DOM elements on the page with attribute names. The common features of the labeled elements are used to guide the user through the labeling process, minimizing user effort, and also to retrieve attribute values. To turn the attribute values into a structured result, the attribute pattern needs to be induced. For this purpose, a space-optimized suffix tree called an attribute tree is built to transform the document object model (DOM) tree into a simpler form while preserving useful properties such as attribute sequence order. The pattern is induced bottom-up on the attribute tree and is then used to build the structured result. Experiments show that the approach performs well in terms of precision, recall, and structural correctness.
Keywords: web data extraction structured data user labeling customization data service
A Blockchain-Based Data Encryption Method for Web Servers
19
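Author: 吴静莉 《无线互联科技》 2024, No. 17, pp. 109-111 (3 pages)
To improve the security of Web server data, this paper proposes a blockchain-based data encryption method for Web servers. The Web server data are first deduplicated to reduce redundancy. A secure transmission protocol is then established, and encrypted data transmission is achieved through smart contracts and key generation and distribution, ensuring that data are not intercepted or tampered with by unauthorized third parties in transit. Blockchain technology is used to encrypt access control, safeguarding secure access to the data and completing the encryption. Experimental results show that the security coefficient of the method stays above 0.98 throughout, demonstrating the reliability of the encryption method.
Keywords: blockchain technology web server data encryption secure transmission protocol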
A Preliminary Study of UK2006 Datasets, a Large Real-World Web Reference Data Set
20
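Authors: 曾刚 李宏 《企业技术开发》 2009, No. 5, pp. 16-17, 31 (3 pages)
This article introduces the WEBSPAM-UK2006 data set, a large real-world collection of web data in which some spam behavior has been manually judged. It analyzes the composition of the data set in detail and performs initial preprocessing of the data set with Python, providing useful experience and reference for future research on algorithms and judgment methods for combating web spam.
Keywords: search engine spam web data set link analysis web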