To extract structured data from a web page with customized requirements, a user labels some DOM elements on the page with attribute names. The common features of the labeled elements are used to guide the user through the labeling process, minimizing user effort, and to retrieve attribute values. To turn the attribute values into a structured result, the attribute pattern needs to be induced. For this purpose, a space-optimized suffix tree called an attribute tree is built to transform the document object model (DOM) tree into a simpler form while preserving useful properties such as attribute sequence order. The pattern is induced bottom-up on the attribute tree and is then used to build the structured result. Experiments show that the approach performs well in terms of precision, recall, and structural correctness.
In the era of big data, national strategies and the rapid development of computer and storage technologies bring opportunities and challenges to libraries' data services. Based on a survey of the literature on scientific data services in university libraries in the United States, the development process of scientific data services is analyzed from three aspects: service types, service modes, and service contents. The author also discusses the opportunities and challenges from five aspects (policy support, strengthened publicity, self-learning, self-positioning, and reliance on embedded subject librarians) in order to promote the development of library scientific data services.
With the growing popularity of data-intensive services on the Internet, the traditional process-centric model for business processes faces challenges because it cannot describe data semantics and dependencies, which makes process design and implementation inflexible. This paper proposes a novel data-aware business process model that can describe both explicit control flow and implicit data flow. A data model with dependencies formulated in linear-time temporal logic (LTL) is presented, and the satisfiability of these dependencies is validated by an automaton-based model checking algorithm. Data dependencies are fully considered in the modeling phase, which helps to improve the efficiency and reliability of programming during the development phase. Finally, a prototype system for data-aware workflows, based on jBPM, is designed using this model and has been deployed in the Beijing Kingfore heating management system to validate the flexibility, efficacy, and convenience of the approach for massive coding and large-scale system management in practice.
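To make the notion of an LTL-formulated data dependency concrete, here is a minimal sketch. It is not the automaton-based model checking algorithm the abstract refers to: it only evaluates one response-style dependency, G(trigger -> F response), on a finite execution trace. The function name and the activity names in the usage note are hypothetical.

```python
def holds_response(trace, trigger, response):
    """Check G(trigger -> F response) on a finite trace.

    `trace` is a list of sets; each set holds the atomic propositions
    (here: data events) that are true at that step.
    """
    pending = False  # True while some trigger has not yet been answered
    for step in trace:
        if trigger in step:
            pending = True
        if response in step:
            # A response at this step answers every trigger seen so far,
            # including one raised in the same step.
            pending = False
    return not pending
```

For example, with the hypothetical trace `[{"order_created"}, {"order_validated"}, {"invoice_issued"}]`, the dependency "every `order_created` is eventually followed by `invoice_issued`" holds, while "every `invoice_issued` is eventually followed by `order_validated`" does not.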
Massive ocean data acquired by various observing platforms and sensors poses new challenges to data management and utilization. Typically, it is difficult to find the desired data efficiently and effectively among a large number of datasets. Most existing methods for data discovery are based on keyword retrieval or direct semantic reasoning, and they are either limited in data access rate or do not take the time cost into account. In this paper, we design and implement a novel system that alleviates the problem by introducing semantics with ontologies, referred to as Data Ontology and List-Based Publishing (DOLP). Specifically, we improve ocean data services in three aspects. First, we propose a unified semantic model called OEDO (Ocean Environmental Data Ontology) to represent heterogeneous ocean data by metadata so that they can be published as data services. Second, we propose an optimized quick service query list (QSQL) data structure that stores pre-inferred, semantically related services and reduces service querying time. Third, we propose two algorithms for optimizing QSQL hierarchically and horizontally, respectively, which aim to extend the semantic relationships of the data services and improve the data access rate. Experimental results show that DOLP outperforms the benchmark methods. First, our QSQL-based data discovery methods obtain a higher recall rate than the keyword-based method and are faster than the traditional semantic method based on direct reasoning. Second, DOLP can handle more complex semantic relationships than the existing methods.
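The idea of storing pre-inferred related services, as described for QSQL, can be illustrated with a minimal sketch. It assumes a toy concept hierarchy and invented service names; `SUBCONCEPTS`, `SERVICES`, `build_query_list`, and `find_services` are hypothetical and are not taken from DOLP. The point is only that the reasoning is done once, ahead of time, so that a query reduces to a dictionary lookup.

```python
# Hypothetical toy ontology: each concept maps to its direct sub-concepts.
SUBCONCEPTS = {
    "OceanData": ["Temperature", "Salinity"],
    "Temperature": ["SeaSurfaceTemperature"],
    "Salinity": [],
    "SeaSurfaceTemperature": [],
}

# Hypothetical data services, each annotated with one concept.
SERVICES = {
    "sst_daily": "SeaSurfaceTemperature",
    "temp_profile": "Temperature",
    "salinity_grid": "Salinity",
}


def descendants(concept):
    """Return the concept plus every concept reachable below it."""
    seen, stack = [], [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(SUBCONCEPTS.get(current, []))
    return seen


def build_query_list():
    """Pre-infer, per concept, the services annotated with it or a sub-concept."""
    table = {}
    for concept in SUBCONCEPTS:
        related = set(descendants(concept))
        table[concept] = sorted(
            name for name, annotated in SERVICES.items() if annotated in related
        )
    return table


QUERY_LIST = build_query_list()  # built once, before any query arrives


def find_services(concept):
    """Answer a query by lookup only; no reasoning happens at query time."""
    return QUERY_LIST.get(concept, [])
```

In this sketch a query for `"Temperature"` also returns the service annotated with the narrower concept `"SeaSurfaceTemperature"`, which a plain keyword match on the annotation would miss.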
As technology and the internet develop, more data are generated every day. These data are large in size, high in dimension, and complex in structure. The combination of these three features is "Big Data" [1]. Big data is revolutionizing all industries and having a colossal impact on them [2]. Many researchers have pointed out the huge impact that big data can have on our daily lives [3]. We can use the information we obtain to help us make decisions. In addition, the conclusions we draw from analyzed big data can serve as predictions for the future, helping us to make more accurate and beneficial decisions earlier than others. If we apply these techniques in finance, for example to stocks, we can obtain detailed information about them. Moreover, we can use the analyzed data to make predictions about certain stocks. This can help people decide whether or not to buy a stock by providing predicted data at a certain confidence level, helping to protect them from potential losses.
JCOMM has a strategy to establish a network of WMO-IOC Centres for Marine-meteorological and Oceanographic Climate Data (CMOCs) under the new Marine Climate Data System (MCDS) in 2012, in order to improve the quality and timeliness of the marine-meteorological and oceanographic data, metadata, and products available to end users. As a candidate, CMOC China was approved to run on a trial basis after the 4th Meeting of the Joint IOC/WMO Technical Commission for Oceanography and Marine Meteorology (JCOMM). This article sets out the development plans of CMOC China for the next few years through a brief introduction to its critical marine data, products, service system, and international cooperation projects.
In wastewater treatment processes (WWTPs), accurate, real-time monitoring of key variables is crucial for operational strategies. However, most existing methods have difficulty obtaining real-time values of some key process variables. To handle this issue, a data-driven intelligent monitoring system, using the soft sensor technique and a data distribution service, is developed to monitor the concentrations of effluent total phosphorus (TP) and ammonia nitrogen (NH4-N). In this intelligent monitoring system, a fuzzy neural network (FNN) is used to design the soft sensor model, and principal component analysis (PCA) is used to select the input variables of the soft sensor model. Moreover, data transfer software is used to integrate the soft sensor technique into the supervisory control and data acquisition (SCADA) system. Finally, the proposed intelligent monitoring system is tested in several real plants to demonstrate the reliability and effectiveness of its monitoring performance.
Ocean state monitor and analysis radars (OSMAR), developed by Wuhan University in China, have been installed at six stations along the coast of the East China Sea (ECS) to measure velocities (currents, waves, and winds) at the sea surface. Radar-observed surface current is taken as an example to illustrate the operational high-frequency (HF) radar observing and data service platform (OP), presenting an operational flow from data observation, transmission, processing, and visualization to end-user service. Three layers (systems) are introduced: the radar observing system (ROS), the data service system (DSS), and the visualization service system (VSS), together with the data flow within the platform. Surface velocities observed at the stations are synthesized at the radar data receiving and preprocessing center of the ROS and transmitted to the DSS, where data processing and quality control (QC) are conducted. Users can browse the processed data on the DSS portal and access the data files. The VSS aims to present the data products better by displaying the information on a visual globe. Using the OP, the surface currents in the East China Sea are monitored, and their hourly and seasonal variability is investigated.
Purpose: This paper concerns the definition of data quality procedures for knowledge organizations such as Higher Education Institutions. The main purpose is to present the flexible approach developed for monitoring the data quality of the European Tertiary Education Register (ETER) database, illustrating how it works and highlighting the main challenges that still have to be faced in this domain. Design/methodology/approach: The proposed data quality methodology is based on two kinds of checks, one to assess the consistency of cross-sectional data and the other to evaluate the stability of multiannual data. This methodology has an operational and empirical orientation: the proposed checks do not assume any theoretical distribution when determining the threshold parameters that identify potential outliers, inconsistencies, and errors in the data. Findings: We show that the proposed cross-sectional and multiannual checks help to identify outliers and extreme observations and to detect ontological inconsistencies not described in the available metadata. For this reason, they may be a useful complement to the processing of the available information. Research limitations: The coverage of the study is limited to European Higher Education Institutions. The cross-sectional and multiannual checks are not yet completely integrated. Practical implications: Considering the quality of the available data and information is important for enhancing data quality-aware empirical investigations, highlighting problems, and identifying areas in which to invest to improve the coverage and interoperability of data in future data collection initiatives. Originality/value: The data-driven quality checks proposed in this paper may be useful as a reference for building and monitoring the data quality of new databases, or of existing databases available for other countries or systems characterized by high heterogeneity and complexity of the units of analysis, without relying on pre-specified theoretical distributions.
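As a rough illustration of checks whose thresholds come from the data rather than from an assumed theoretical distribution, consider the sketch below. The concrete rules (quartile-based fences with `k=1.5`, and a 50% year-on-year change limit) and both function names are assumptions made for this example; the abstract does not state the rules actually applied to ETER.

```python
from statistics import quantiles


def cross_sectional_outliers(values, k=1.5):
    """Flag values outside fences built from the data's own quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


def unstable_years(series, max_rel_change=0.5):
    """Flag years whose value differs from the previous year's value
    by more than max_rel_change (relative to the previous year)."""
    flagged = []
    years = sorted(series)
    for prev, cur in zip(years, years[1:]):
        base = series[prev]
        # Skip the comparison when the previous value is zero.
        if base and abs(series[cur] - base) / abs(base) > max_rel_change:
            flagged.append(cur)
    return flagged
```

With hypothetical enrolment figures, `cross_sectional_outliers([100, 110, 120, 130, 140, 1000])` flags only `1000`, and `unstable_years({2015: 1000, 2016: 1050, 2017: 2400, 2018: 2500})` flags only `2017`. Flagged values are candidates for review, not confirmed errors.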
Currently, ocean data portals are being developed around the world based on Geographic Information Systems (GIS) as a source of ocean data and information. However, given the relatively high temporal frequency and the intrinsic spatial nature of ocean data and information, no current GIS software is adequate to deal effectively and efficiently with spatiotemporal data. Furthermore, while existing ocean data portals are generally designed to meet the basic needs of a broad range of users, they are sometimes very complicated for general audiences, especially for those without training in GIS. In this paper, a new technical architecture for an ocean data integration and service system is put forward that consists of four layers: the operation layer; the extract, transform, and load (ETL) layer; the data warehouse layer; and the presentation layer. The integration technology for the data warehouse layer, based on XML, ontology, and a spatiotemporal data organization scheme, is then discussed. In addition, the ocean observing data service technology realized in the presentation layer is discussed in detail, including the development of the web portal and the ocean data sharing platform. An application to the Taiwan Strait shows that the technology studied in this paper can facilitate the sharing, access, and use of ocean observation data. The paper is based on an ongoing research project to develop an ocean observing information system for the Taiwan Strait that will facilitate the prevention of ocean disasters.
In existing web services-based workflows, data exchange across web services is centralized: the workflow engine mediates each step of the application sequence. However, many grid applications, especially data-intensive scientific applications, require exchanging large amounts of data across grid services. Having a central workflow engine relay the data between the services would result in a bottleneck in these cases. This paper proposes a data exchange model for individual grid workflows and for multi-workflow composition, respectively. The model enables direct communication of large amounts of data between two grid services. To enable data exchange among multiple workflows, a bridge data service is used.
For China's telecom industry, 2009 is destined to be an extraordinary year because of the arrival of the long-awaited mobile 3G era, which will have a significant impact on current work and lifestyles. 2009 will also be a year full of opportunities and challenges, because the coming 3G era will bring limitless business opportunities and impose more challenges on Chinese telecom operators. The reshuffling of the Chinese telecom market has come to an end. The new China Unicom, China Mobile, and China Telecom all focus their strategies on broadband mobile data services in order to achieve a smooth transition from voice services to data services. Technologically, the various 3G technologies and their evolution have become major concerns of telecom operators, while in terms of services, the key to 3G systems is their data services. As a result, high-speed broadband data services are entering an era of rapid development.
This paper introduces the application of data mining technology in data services. Although data mining is a relatively new technology, it has been applied for a long time with clear results. Data mining technology can be used in enterprise customer service systems, where it can help enterprises find potential customers while retaining their most valuable ones. We point out the necessity of establishing a data mining and service intelligence analysis system based on the combination of intelligence analysis with service characteristics and processes, in which data mining methods and knowledge management ideas are applied to the intelligence analysis and service system.
The World Data Center (WDC) for Seismology, Beijing, has been developing in China for 20 years as of this year. A sustained and stable data sharing service system has already taken shape. This article gives an overview of the construction and development of the WDC for Seismology, Beijing. It outlines the history, facilities, and technical specifications of the center. It also describes the data service and the website, and gives a brief outlook.
Fengyun meteorological satellites have undergone a series of significant developments over the past 50 years. Two generations and four types of Fengyun satellites, 21 in total, have been developed and launched, with 9 currently operational in orbit. The data obtained from Fengyun satellites are used in a multitude of applications, including weather forecasting, meteorological disaster prevention and reduction, climate change, global environmental monitoring, and space weather. These data products and services are made available to the global community, resulting in tangible social and economic benefits. In 2023, two Fengyun meteorological satellites were successfully launched. This report presents an overview of the two recently launched Fengyun satellites and the Fengyun satellites currently in orbit, including an evaluation of their remote sensing instruments since 2022. Additionally, it addresses Fengyun satellite data archiving, data services, application services, international cooperation, and supporting activities. Finally, the development prospects are outlined.
In the Internet of Vehicles (IoV), security-threat information about various traffic elements can be exploited by hackers to attack vehicles, resulting in accidents and privacy leakage. Consequently, it is necessary to establish security-threat assessment architectures that evaluate the risks of traffic elements by managing and sharing security-threat information. Unfortunately, most assessment architectures process data in a centralized manner, causing delays in query services. To address this issue, this paper proposes a Hierarchical Blockchain-enabled Security threat Assessment Architecture (HBSAA), which uses edge chains and global chains to share data. In addition, data virtualization technology is introduced to manage multi-source heterogeneous data, and a metadata association model based on an attribute graph is designed to handle complex data relationships. To provide a high-speed query service, an ant colony optimization of key nodes is designed. An HBSAA prototype is also developed and its performance tested. Experimental results on large-scale vulnerability data gathered from NVD demonstrate that HBSAA not only shields data heterogeneity but also reduces service response time.
China began developing its meteorological satellite program in 1969. After 50 years of growth, 17 Fengyun (FY) meteorological satellites have been launched successfully. At present, seven of them are in orbit providing operational service, including three polar-orbiting meteorological satellites and four geostationary meteorological satellites. Since the last COSPAR report, no new Fengyun satellite has been launched. The information on the in-orbit FY-2, FY-3, and FY-4 series has been updated. The FY-3D and FY-2H satellites completed commissioning tests and transitioned into operation in 2018. The FY-2E satellite completed its service and was decommissioned in 2019. The number of web-based users and Direct Broadcasting (DB) users requesting Fengyun satellite data and products keeps growing worldwide. A new Mobile Application Service based on cloud technology was launched for Fengyun users in 2018. This report gives particular attention to the international and regional cooperation that supports the Fengyun user community. To strengthen data services in the Belt and Road countries, the Emergency Support Mechanism of Fengyun satellite (FY_ESM) was established in 2018. Meanwhile, a project to recalibrate 30 years of archived Fengyun satellite data was established in 2018. This project aims to generate the Fundamental Climate Data Record (FCDR) as a space agency response to the Global Climate Observing System (GCOS). Finally, the future Fengyun program up to 2025 is introduced.
China's efforts to develop Fengyun meteorological satellites have made major strides over the past 50 years, with the polar and geostationary meteorological satellite series achieving continuous, stable operation and providing data and product services globally. By the end of 2021, 19 Fengyun meteorological satellites developed by China had been launched successfully. Seven of them are in operation at present, and their data and products are widely applied to weather analysis, numerical weather forecasting, and climate prediction, as well as environment and disaster monitoring. Since the last COSPAR report, FY-4B, the first new-generation operational geostationary satellite, and FY-3E, the first early-morning orbit satellite in China's polar-orbiting meteorological satellite family, were launched in 2021. The characteristics of the two latest satellites and their onboard instruments are addressed in this report. The status of the current Fengyun satellites, product and data services, international cooperation, and supporting activities is introduced as well.
Purpose: The main objective of this work is to show the potential of recently developed approaches for automatic knowledge extraction directly from universities' websites. The automatically extracted information can potentially be updated more often than once per year and is safe from manipulation or misinterpretation. Moreover, this approach allows flexibility in collecting indicators about the efficiency of universities' websites and their effectiveness in disseminating key contents. These new indicators can complement traditional indicators of scientific research (e.g. number of articles and number of citations) and teaching (e.g. number of students and graduates) by introducing further dimensions that allow new insights for "profiling" the analyzed universities. Design/methodology/approach: Webometrics relies on web mining methods and techniques to perform quantitative analyses of the web. This study implements an advanced application of the webometric approach, exploiting all three categories of web mining: web content mining, web structure mining, and web usage mining. The information used to compute our indicators has been extracted from the universities' websites using web scraping and text mining techniques. The scraped information has been stored in a NoSQL DB in a semistructured form so that it can be retrieved efficiently by text mining techniques. This provides increased flexibility in the design of new indicators, opening the door to new types of analyses. Some data have also been collected by means of batch interrogations of search engines (Bing, www.bing.com) or from a leading provider of Web analytics (SimilarWeb, http://www.similarweb.com). The information extracted from the Web has been combined with the universities' structural information taken from the European Tertiary Education Register (https://eter.joanneum.at/#/home), a database collecting information on Higher Education Institutions (HEIs) at the European level. All of the above was used to cluster 79 Italian universities based on structural and digital indicators. Findings: The main findings of this study concern the evaluation of the digitalization potential of universities, in particular by presenting techniques for the automatic extraction of information from the web to build indicators of the quality and impact of universities' websites. These indicators can complement traditional indicators and can be used to identify groups of universities with common features by applying clustering techniques to them. Research limitations: The results reported in this study refer to Italian universities only, but the approach could be extended to other university systems abroad. Practical implications: The approach proposed in this study and its illustration on Italian universities show the usefulness of recently introduced automatic data extraction and web scraping approaches and their practical relevance for characterizing and profiling the activities of universities on the basis of their websites. The approach could be applied to other university systems. Originality/value: This work applies to university websites, for the first time, some recently introduced techniques for automatic knowledge extraction based on web scraping, optical character recognition, and nontrivial text mining operations (Bruni & Bianchi, 2020).
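Abstract: To improve fine-grained hospital management, promote the scientific development of smart hospitals in medical institutions, and address the year-on-year growth in data reporting required at the national, provincial, and municipal levels, this paper builds a smart medical data management platform suited to real-world scenarios. The platform uses Data Services technology to extract indicators computed in the HANA database into the platform database and is developed with the Java SSM framework. It enables automatic data reporting by each department and provides fine-grained management of reported data, including business processing, data verification, process management, and statistical analysis. With SAP Data Services as the tool, the platform computes and displays indicators automatically and streamlines workflows, supporting the construction of a data-driven, high-level smart hospital and thereby enhancing the hospital's core competitiveness.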
Funding (web data extraction paper): Supported by the National High Technology Research and Development Programme of China (No. 2009AA01Z141), the National Natural Science Foundation of China (No. 60573117), and the Beijing Natural Science Foundation (No. 4131001).
Funding (data-aware business process model paper): Supported by the National Natural Science Foundation of China (No. 61502043, No. 61132001), the Beijing Natural Science Foundation (No. 4162042), and the BeiJing Talents Fund (No. 2015000020124G082).
Funding (DOLP ocean data services paper): Supported by the National Key Research and Development Program of China under Grant No. 2018YFB0203801 and the National Natural Science Foundation of China under Grant Nos. 61702529 and 61802424.
Abstract: As technology and the internet develop, more data are generated every day. These data are large in size, high in dimension, and complex in structure. The combination of these three features is "Big Data" [1]. Big data is revolutionizing all industries and having a colossal impact on them [2]. Many researchers have pointed out the huge impact that big data can have on our daily lives [3]. We can use the information we obtain to help us make decisions. The conclusions drawn from the analyzed data can also be used to predict the future, helping us to make more accurate and beneficial decisions earlier than others. If we apply these techniques in finance, for example to stocks, we can obtain detailed information about them. Moreover, we can use the analyzed data to make predictions about particular stocks. This can help people decide whether to buy a stock by providing predictions at a certain confidence level, helping to protect them from potential losses.
Abstract: JCOMM has a strategy to establish a network of WMO-IOC Centres for Marine-meteorological and Oceanographic Climate Data (CMOCs) under the new Marine Climate Data System (MCDS) in 2012 to improve the quality and timeliness of the marine-meteorological and oceanographic data, metadata and products available to end users. China's candidate centre, CMOC China, was approved to run on a trial basis after the 4th Meeting of the Joint IOC/WMO Technical Commission for Oceanography and Marine Meteorology (JCOMM). This article sets out the development plans of CMOC China for the next few years through a brief introduction to its key marine data, products, service system, and international cooperation projects.
Funding: Supported by the National Natural Science Foundation of China (61622301, 61533002), the Beijing Natural Science Foundation (4172005), and the Major National Science and Technology Project (2017ZX07104).
Abstract: In the wastewater treatment process (WWTP), accurate and real-time monitoring of key variables is crucial for operational strategies. However, most existing methods have difficulty obtaining real-time values of some key variables in the process. To handle this issue, a data-driven intelligent monitoring system, using the soft sensor technique and a data distribution service, is developed to monitor the concentrations of effluent total phosphorus (TP) and ammonia nitrogen (NH4-N). In this intelligent monitoring system, a fuzzy neural network (FNN) is applied to design the soft sensor model, and a principal component analysis (PCA) method is used to select the input variables of the soft sensor model. Moreover, data transfer software is used to integrate the soft sensor technique into the supervisory control and data acquisition (SCADA) system. Finally, the proposed intelligent monitoring system is tested in several real plants to demonstrate the reliability and effectiveness of its monitoring performance.
Funding: The National Natural Science Foundation of China under contract No. 41206012.
Abstract: An ocean state monitor and analysis radar (OSMAR), developed by Wuhan University in China, has been installed at six stations along the coast of the East China Sea (ECS) to measure velocities (currents, waves and winds) at the sea surface. Radar-observed surface current is taken as an example to illustrate the operational high-frequency (HF) radar observing and data service platform (OP), presenting an operational flow from data observation, transmission, processing, and visualization to end-user service. Three layers (systems), namely the radar observing system (ROS), the data service system (DSS) and the visualization service system (VSS), as well as the data flow within the platform, are introduced. Surface velocities observed at the stations are synthesized at the radar data receiving and preprocessing center of the ROS and transmitted to the DSS, in which data processing and quality control (QC) are conducted. Users can browse the processed data on the portal of the DSS and access those data files. The VSS aims to better present the data products by displaying the information on a visual globe. By using the OP, the surface currents in the East China Sea are monitored, and their hourly and seasonal variability is investigated.
Funding: Supported by the European Commission ETER Project (No. 934533-2017-AO8-CH) and the H2020 RISIS 2 project (No. 824091).
Abstract: Purpose: This paper concerns the definition of data quality procedures for knowledge organizations such as Higher Education Institutions. The main purpose is to present the flexible approach developed for monitoring the data quality of the European Tertiary Education Register (ETER) database, illustrating how it works and highlighting the main challenges that still have to be faced in this domain. Design/methodology/approach: The proposed data quality methodology is based on two kinds of checks, one to assess the consistency of cross-sectional data and the other to evaluate the stability of multiannual data. This methodology has an operational and empirical orientation. This means that the proposed checks do not assume any theoretical distribution for the determination of the threshold parameters that identify potential outliers, inconsistencies, and errors in the data. Findings: We show that the proposed cross-sectional checks and multiannual checks are helpful for identifying outliers and extreme observations and for detecting ontological inconsistencies not described in the available metadata. For this reason, they may be a useful complement to the processing of the available information. Research limitations: The coverage of the study is limited to European Higher Education Institutions. The cross-sectional and multiannual checks are not yet completely integrated. Practical implications: Considering the quality of the available data and information is important for enhancing data quality-aware empirical investigations, highlighting problems, and identifying areas in which to invest to improve the coverage and interoperability of data in future data collection initiatives. Originality/value: The data-driven quality checks proposed in this paper may be useful as a reference for building and monitoring the data quality of new databases or of existing databases available for other countries or systems characterized by high heterogeneity and complexity of the units of analysis, without relying on pre-specified theoretical distributions.
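The two kinds of checks described above, cross-sectional consistency and multiannual stability, can be sketched as follows. This is an illustrative assumption, not the ETER procedure. The field names, the functions `cross_sectional_issue` and `multiannual_outliers`, and the `tolerance` and `max_ratio` parameters are all hypothetical. Thresholds here are compared against the observed median of each series and not against a fitted theoretical distribution.

```python
from statistics import median


def cross_sectional_issue(record, parts, total, tolerance=0.01):
    """Return True when the parts of a record do not add up to its reported total.

    `tolerance` is an assumed relative tolerance, not a value from the paper.
    """
    expected = sum(record[p] for p in parts)
    reported = record[total]
    if reported == 0:
        return expected != 0
    return abs(expected - reported) / reported > tolerance


def multiannual_outliers(series, max_ratio=2.0):
    """Return the years whose value departs from the series median by more than max_ratio.

    The comparison uses only the observed data (the median of the series), so no
    theoretical distribution is assumed. `max_ratio` is an assumed threshold.
    """
    centre = median(series.values())
    flagged = []
    for year, value in sorted(series.items()):
        if centre == 0 or value == 0:
            # A ratio is undefined here; flag any value that differs from the median.
            if value != centre:
                flagged.append(year)
            continue
        ratio = max(value / centre, centre / value)
        if ratio > max_ratio:
            flagged.append(year)
    return flagged


# Hypothetical institution data, for illustration only.
inconsistent = {"students_bachelor": 800, "students_master": 300, "students_total": 1000}
consistent = {"students_bachelor": 700, "students_master": 300, "students_total": 1000}
enrolment_by_year = {2015: 1000, 2016: 1040, 2017: 5200, 2018: 1010}
```

With this toy data, the first record is flagged because its parts sum to 1100 against a reported total of 1000, and the year 2017 is flagged because its value is about five times the series median.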
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (Nos. 2009AA12Z225, 2009AA12Z208) and the National Natural Science Foundation of China (No. 61074132).
Abstract: Currently, ocean data portals are being developed around the world based on Geographic Information Systems (GIS) as a source of ocean data and information. However, given the relatively high temporal frequency and the intrinsic spatial nature of ocean data and information, no current GIS software is adequate to deal effectively and efficiently with spatiotemporal data. Furthermore, while existing ocean data portals are generally designed to meet the basic needs of a broad range of users, they are sometimes very complicated for general audiences, especially for those without training in GIS. In this paper, a new technical architecture for an ocean data integration and service system is put forward that consists of four layers: the operation layer; the extract, transform, and load (ETL) layer; the data warehouse layer; and the presentation layer. The integration technology based on XML, ontology, and a spatiotemporal data organization scheme for the data warehouse layer is then discussed. In addition, the ocean observing data service technology realized in the presentation layer is discussed in detail, including the development of the web portal and the ocean data sharing platform. The application to the Taiwan Strait shows that the technology studied in this paper can facilitate sharing of, access to, and use of ocean observation data. The paper is based on an ongoing research project for the development of an ocean observing information system for the Taiwan Strait that will facilitate the prevention of ocean disasters.
Funding: Supported by the National Natural Science Foundation of China (60373072).
Abstract: In existing web services-based workflows, data exchange across the web services is centralized: the workflow engine acts as an intermediary at each step of the application sequence. However, many grid applications, especially data-intensive scientific applications, require exchanging large amounts of data across grid services. Having a central workflow engine relay the data between the services would result in a bottleneck in these cases. This paper proposes a data exchange model for individual grid workflows and for multi-workflow composition, respectively. The model enables direct communication of large amounts of data between two grid services. To enable data exchange among multiple workflows, a bridge data service is used.
Abstract: For China's telecom industry, 2009 is destined to be an extraordinary year owing to the arrival of the long-awaited mobile 3G era, which will have a significant impact on current work and lifestyles. 2009 will also be a year full of opportunities and challenges, because the coming 3G era will bring vast business opportunities and impose more challenges on Chinese telecom operators. The reshuffling of the Chinese telecom market has come to an end. The new China Unicom, China Mobile and China Telecom all focus their strategies on broadband mobile data services in order to achieve a smooth transition from voice services to data services. Technologically, the various 3G technologies and their evolution have become major concerns of telecom operators, while in terms of services, the key for 3G systems is their data services. As a result, high-speed broadband data services are entering an era of rapid development.
Abstract: This paper introduces the application of data mining technology in data services. Although data mining is a relatively new technology, it has been applied for a long time and its effects are evident. Data mining technology can be used in enterprise customer service systems, where it can help enterprises find potential customers while retaining the most valuable ones. Based on the characteristics and processes of intelligence analysis and services, we point out the necessity of establishing an intelligence analysis and service system built on data mining technology, and of applying data mining methods and knowledge management ideas to that system.
Abstract: The World Data Center (WDC) for Seismology, Beijing has been developing in China for 20 years as of this year. A sustained and stable data sharing service system has already taken shape. This article gives an overview of the construction and development of the WDC for Seismology, Beijing. It outlines the history, facilities and technical specifications of the center. It also describes the data service and the website, and gives a brief description of future prospects.
Funding: Supported by the National Natural Science Foundation of China (42274217).
Abstract: Fengyun meteorological satellites have undergone a series of significant developments over the past 50 years. Two generations and four types of Fengyun satellites, 21 in total, have been developed and launched, with 9 currently operational in orbit. The data obtained from Fengyun satellites are employed in a multitude of applications, including weather forecasting, meteorological disaster prevention and reduction, climate change, global environmental monitoring, and space weather. These data products and services are made available to the global community, resulting in tangible social and economic benefits. In 2023, two Fengyun meteorological satellites were successfully launched. This report presents an overview of the two recently launched Fengyun satellites and the Fengyun satellites currently in orbit, including an evaluation of their remote sensing instruments since 2022. Additionally, it addresses Fengyun satellite data archiving, data services, application services, international cooperation, and supporting activities. Furthermore, the development prospects are outlined.
Funding: Supported in part by the Science and Technology Project Program of Sichuan under Grant 2022YFG0022, in part by the Science and Technology Research Program of Chongqing Municipal Education Commission under Grant KJZD-K202000602, in part by the General Program of Natural Science Foundation of Chongqing under Grant cstc2020jcyj-msxmX1021, and in part by the Chongqing Natural Science Foundation of China under Grant cstc2020jcyj-msxmX0343.
Abstract: In the Internet of Vehicles (IoV), the security-threat information of various traffic elements can be exploited by hackers to attack vehicles, resulting in accidents and privacy leakage. Consequently, it is necessary to establish security-threat assessment architectures that evaluate the risks of traffic elements by managing and sharing security-threat information. Unfortunately, most assessment architectures process data in a centralized manner, causing delays in query services. To address this issue, this paper proposes a Hierarchical Blockchain-enabled Security threat Assessment Architecture (HBSAA), which uses edge chains and global chains to share data. In addition, data virtualization technology is introduced to manage multi-source heterogeneous data, and a metadata association model based on an attribute graph is designed to deal with complex data relationships. To provide a high-speed query service, an ant colony optimization of key nodes is designed, and an HBSAA prototype is developed and its performance tested. Experimental results on large-scale vulnerability data gathered from the NVD demonstrate that the HBSAA not only shields data heterogeneity but also reduces service response time.
Funding: Supported by the National Key Research and Development Program of China (2018YFB0504900, 2018YFB0504905).
Abstract: China began developing its meteorological satellite program in 1969. Over 50 years of development, 17 Fengyun (FY) meteorological satellites have been launched successfully. At present, seven of them are in orbit providing operational service, including three polar-orbiting meteorological satellites and four geostationary meteorological satellites. Since the last COSPAR report, no new Fengyun satellite has been launched. The information on the on-orbit FY-2 series, FY-3 series, and FY-4 series has been updated. The FY-3D and FY-2H satellites completed their commissioning tests and transitioned into operation in 2018. The FY-2E satellite completed its service and was decommissioned in 2019. The number of web-based users and Direct Broadcasting (DB) users requesting Fengyun satellite data and products keeps growing worldwide. A new Mobile Application Service based on cloud technology was launched for Fengyun users in 2018. This report gives particular attention to the international and regional cooperation that supports the Fengyun user community. To strengthen the data service in the Belt and Road countries, the Emergency Support Mechanism of Fengyun satellite (FY_ESM) has been in place since 2018. Meanwhile, a project to recalibrate 30 years of archived Fengyun satellite data has been under way since 2018. This project aims to generate the Fundamental Climate Data Record (FCDR) as a space agency response to the Global Climate Observation System (GCOS). Finally, the future Fengyun program up to 2025 is introduced.
Funding: Supported by the National Key Research and Development Program of China (2018YFB0504900, 2018YFB0504905) and the National Project on Fengyun Meteorological Satellite Development.
Abstract: China's efforts to develop Fengyun meteorological satellites have made major strides over the past 50 years, with the polar and geostationary meteorological satellite series achieving continuously stable operation and persistently providing data and product services globally. By the end of 2021, 19 Fengyun meteorological satellites developed by China had been launched successfully. Seven of them are in operation at present, and their data and products are widely applied to weather analysis, numerical weather forecasting and climate prediction, as well as environment and disaster monitoring. Since the last COSPAR report, FY-4B, the first new-generation operational geostationary satellite, and FY-3E, the first early-morning orbit satellite in China's polar-orbiting meteorological satellite family, were launched in 2021. The characteristics of the two latest satellites and the instruments onboard are addressed in this report. The status of the current Fengyun satellites, product and data services, international cooperation, and supporting activities is introduced as well.
Funding: This work is developed with the support of the H2020 RISIS 2 Project (No. 824091) and of the "Sapienza" Research Awards No. RM1161550376E40E of 2016 and RM11916B8853C925 of 2019. This article is a largely extended version of Bianchi et al. (2019) presented at the ISSI 2019 Conference held in Rome, 2–5 September 2019.
Abstract: Purpose: The main objective of this work is to show the potential of recently developed approaches for automatic knowledge extraction directly from universities' websites. The information automatically extracted can potentially be updated more often than once per year and be safe from manipulation or misinterpretation. Moreover, this approach allows flexibility in collecting indicators about the efficiency of universities' websites and their effectiveness in disseminating key contents. These new indicators can complement traditional indicators of scientific research (e.g. number of articles and number of citations) and teaching (e.g. number of students and graduates) by introducing further dimensions that allow new insights for "profiling" the analyzed universities. Design/methodology/approach: Webometrics relies on web mining methods and techniques to perform quantitative analyses of the web. This study implements an advanced application of the webometric approach, exploiting all three categories of web mining: web content mining, web structure mining, and web usage mining. The information used to compute our indicators has been extracted from the universities' websites using web scraping and text mining techniques. The scraped information has been stored in a NoSQL DB in a semi-structured form so that it can be retrieved efficiently by text mining techniques. This provides increased flexibility in the design of new indicators, opening the door to new types of analyses. Some data have also been collected by means of batch interrogations of search engines (Bing, www.bing.com) or from a leading provider of Web analytics (SimilarWeb, http://www.similarweb.com). The information extracted from the Web has been combined with the university structural information taken from the European Tertiary Education Register (https://eter.joanneum.at/#/home), a database collecting information on Higher Education Institutions (HEIs) at European level. All of the above was used to perform a clustering of 79 Italian universities based on structural and digital indicators. Findings: The main findings of this study concern the evaluation of the digitalization potential of universities, in particular by presenting techniques for the automatic extraction of information from the web to build indicators of the quality and impact of universities' websites. These indicators can complement traditional indicators and can be used to identify groups of universities with common features by applying clustering techniques to them. Research limitations: The results reported in this study refer to Italian universities only, but the approach could be extended to other university systems abroad. Practical implications: The approach proposed in this study and its illustration on Italian universities show the usefulness of recently introduced automatic data extraction and web scraping approaches and their practical relevance for characterizing and profiling the activities of universities on the basis of their websites. The approach could be applied to other university systems. Originality/value: This work applies to university websites, for the first time, some recently introduced techniques for automatic knowledge extraction based on web scraping, optical character recognition and nontrivial text mining operations (Bruni & Bianchi, 2020).