The Semantic Data Dictionary-An Approach for Describing and Annotating Data
1
Authors: Sabbir M. Rashid, James P. McCusker, Paulo Pinheiro, Marcello P. Bax, Henrique O. Santos, Jeanette A. Stingone, Amar K. Das, Deborah L. McGuinness. Data Intelligence, 2020, Issue 4, pp. 443-486 (44 pages)
It is common practice for data providers to include text descriptions for each column when publishing data sets in the form of data dictionaries. While these documents are useful in helping an end-user properly interpret the meaning of a column in a data set, existing data dictionaries typically are not machine-readable and do not follow a common specification standard. We introduce the Semantic Data Dictionary, a specification that formalizes the assignment of a semantic representation of data, enabling standardization and harmonization across diverse data sets. In this paper, we present our Semantic Data Dictionary work in the context of our work with biomedical data; however, the approach can and has been used in a wide range of domains. The rendition of data in this form helps promote improved discovery, interoperability, reuse, traceability, and reproducibility. We present the associated research and describe how the Semantic Data Dictionary can help address existing limitations in the related literature. We discuss our approach, present an example by annotating portions of the publicly available National Health and Nutrition Examination Survey data set, present modeling challenges, and describe the use of this approach in sponsored research, including our work on a large National Institutes of Health (NIH)-funded exposure and health data portal and in the RPI-IBM collaborative Health Empowerment by Analytics, Learning, and Semantics project.
Keywords: Semantic Data Dictionary; Dictionary mapping; Codebook; Knowledge modeling; Data integration; Data dictionary; Mapping language; Metadata standard; Semantic Web; Semantic ETL; FAIR data
Semantic Recognition of a Data Structure in Big-Data
2
Authors: Aicha Ben Salem, Faouzi Boufares, Sebastiao Correia. Journal of Computer and Communications, 2014, Issue 9, pp. 93-102 (10 pages)
Data governance is a subject that is becoming increasingly important in business and government. In fact, good data governance allows improved interactions between employees of one or more organizations. Data quality represents a great challenge because the cost of non-quality can be very high. Therefore, the use of data quality becomes an absolute necessity within an organization. To improve the data quality in a Big-Data source, our purpose in this paper is to add semantics to data and help users recognize the Big-Data schema. The originality of this approach lies in the semantic aspect it offers. It detects issues in data and proposes a data schema by applying semantic data profiling.
Keywords: Data quality; Big-Data; Semantic data profiling; Data dictionary; Regular expressions; Ontology
A Data-Semantic-Conflict-Based Multi-Truth Discovery Algorithm for a Programming Site (Cited by: 1)
3
Authors: Haitao Xu, Haiwang Zhang, Qianqian Li, Tao Qin, Zhen Zhang. Computers, Materials & Continua (SCIE, EI), 2021, Issue 8, pp. 2681-2691 (11 pages)
With the extensive application of software collaborative development technology, the processing of code data generated in programming scenes has become a research hotspot. In the collaborative programming process, different users can submit code in a distributed way. The consistency of code grammar can be achieved by syntax constraints. However, when different users work on the same code in semantic development programming practices, the development factors of different users will inevitably lead to the problem of data semantic conflict. In this paper, the characteristics of code segment data in a programming scene are considered. The code sequence can be obtained by disassembling the code segment using lexical analysis technology. Combined with a traditional solution of a data conflict problem, the code sequence can be taken as the declared value object in the data conflict resolution problem. Through the similarity analysis of code sequence objects, the concept of the deviation degree between the declared value object and the truth value object is proposed. A multi-truth discovery algorithm, called the multiple truth discovery algorithm based on deviation (MTDD), is proposed. The basic methods, such as Conflict Resolution on Heterogeneous Data, Voting-K, and MTRuths_Greedy, are compared to verify the performance and precision of the proposed MTDD algorithm.
Keywords: data semantic conflict; multi-truth discovery; programming site
Improving Archival Records and Service of Traditional Korean Performing Arts in a Semantic Web Environment (Cited by: 1)
4
Authors: Ziyoung Park, Hosin Lee, Seungchon Kim, Sungjae Park. Journal of Data and Information Science (CSCD), 2020, Issue 1, pp. 68-80 (13 pages)
Purpose: This research project aims to organize the archival information of traditional Korean performing arts in a semantic web environment. Key requirements, which the archival records manager should consider for publishing and distribution of gugak performing archival information in a semantic web environment, are presented in the perspective of linked data. Design/methodology/approach: This study analyzes the metadata provided by the National Gugak Center's Gugak Archive, the search and browse menus of Gugak Archive's website and K-PAAN, the performing arts portal site. Findings: The importance of consistency, continuity, and systematicity, which are crucial qualities in traditional record management practices, is undiminished in a semantic web environment. However, a semantic web environment also requires new tools such as web identifiers (URIs), data models (RDF), and link information (interlinking). Research limitations: The scope of this study does not include practical implementation strategies for the archival records management system and website services. The suggestions also do not discuss issues related to copyright or policy coordination between related organizations. Practical implications: The findings of this study can assist records managers in converting a traditional performing arts information archive into a semantic web environment-based online archival service and system. This can also be useful for collaboration with record managers who are unfamiliar with relational or triple database systems. Originality/value: This study analyzed the metadata of the Gugak Archive and its online services to present practical requirements for managing and disseminating gugak performing arts information in a semantic web environment. In the application of the semantic web services' principles and methods to the Gugak Archive, this study can contribute to the improvement of information organization and services in the field of Korean traditional music.
Keywords: Gugak archive; Korean traditional music; Performing arts archive; Linked semantic data; K-PAAN
Research Progress and Trends in Airborne LiDAR Point Cloud Classification
5
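Authors: Wang Jiannan (王建楠), Li Chuyu (李楚钰), Tang Tingyuan (唐廷元), Li Hankun (李瀚琨), Liang Peng (梁鹏), Rong Wei (荣伟). Beijing Surveying and Mapping (北京测绘), 2024, Issue 4, pp. 603-608 (6 pages)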
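Airborne LiDAR point cloud data provide framework-level, foundational technical support for many industry applications. Point cloud data are also important geospatial data for building smart cities and the Realistic 3D China (实景三维中国) initiative, and high-quality point cloud classification can greatly improve the 3D representation of entities in geospatial data. It is therefore important to distill and organize the research progress on airborne LiDAR point cloud classification techniques. This paper discusses point cloud classification methods based on crowdsourced maps, on features, on neural networks and deep learning, and on the use of multimodal data. It summarizes the technical strengths and potential problems of each method and analyzes development trends. For LiDAR point cloud classification in complex urban scenes, a direction that calls for in-depth research is the coupling of multimodal data through global inference: embedding optical imagery, fusing crowdsourced map annotations, and combining neural network and deep learning methods to classify airborne LiDAR point clouds with high efficiency, high precision, and high accuracy.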
Keywords: airborne LiDAR; point cloud classification; neural network; deep learning; multimodal data; point cloud semantization
GDM: A New Graph Based Data Model Using Functional Abstraction (Cited by: 1)
6
Authors: Sankhayan Choudhury, Nabendu Chaki, Swapan Bhattacharya. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2006, Issue 3, pp. 430-438 (9 pages)
In this paper, a Graph-based semantic Data Model (GDM) is proposed with the primary objective of bridging the gap between the human perception of an enterprise and the needs of computing infrastructure to organize information in some particular manner for efficient storage and retrieval. The Graph Data Model (GDM) has been proposed as an alternative data model to combine the advantages of the relational model with the positive features of semantic data models. The proposed GDM offers a structural representation for interacting with the designer, making it easy to comprehend the complex relations amongst basic data items. GDM allows an entire database to be viewed as a Graph (V, E) in a layered organization. Here, a graph is created in a bottom-up fashion where V represents the basic instances of data or a functionally abstracted module, called primary semantic group (PSG) and secondary semantic group (SSG). An edge in the model implies the relationship among the secondary semantic groups. The contents of the lowest layer are the semantically grouped data values in the form of primary semantic groups. The SSGs are higher-level abstractions, created by encapsulating various PSGs, SSGs, and basic data elements. This encapsulation continues generating secondary semantic groups until the designer considers the result sufficient to declare the actual problem domain. GDM thus uses standard abstractions available in a semantic data model with a structural representation in terms of a graph. The operations on the data model are formalized in the proposed graph algebra. A Graph Query Language (GQL) is also developed, maintaining similarity with the widely accepted user-friendly SQL. Finally, the paper also presents the methodology to make this GDM compatible with the distributed environment, and a corresponding query processing technique for distributed environment is also suggested for the sake of completeness.
Keywords: graph data model; semantic group; semantic data model; distributed database; fragmentation and allocation schema
FAIR Machine Learning Model Pipeline Implementation of COVID-19 Data
7
Authors: Sakinat Folorunso, Ezekiel Ogundepo, Mariam Basajja, Joseph Awotunde, Abdullahi Kawu, Francisca Oladipo, Abdullahi Ibrahim. Data Intelligence (EI), 2022, Issue 4, pp. 971-990, 1036 (21 pages)
Research and development are gradually becoming data-driven and the implementation of the FAIR Guidelines (that data should be Findable, Accessible, Interoperable, and Reusable) for scientific data administration and stewardship has the potential to remarkably enhance the framework for the reuse of research data. In this way, FAIR is aiding digital transformation. The 'FAIRification' of data increases the interoperability and (re)usability of data, so that new and robust analytical tools, such as machine learning (ML) models, can access the data to deduce meaningful insights, extract actionable information, and identify hidden patterns. This article aims to build a FAIR ML model pipeline using the generic FAIRification workflow to make the whole ML analytics process FAIR. Accordingly, FAIR input data was modelled using a FAIR ML model. The output data from the FAIR ML model was also made FAIR. For this, a hybrid hierarchical k-means (HHK) clustering ML algorithm was applied to group the data into homogeneous subgroups and ascertain the underlying structure of the data using a Nigerian-based FAIR dataset that contains data on economic factors, healthcare facilities, and coronavirus occurrences in all the 36 states of Nigeria. The model showed that research data and the ML pipeline can be FAIRified, shared, and reused by following the proposed FAIRification workflow and implementing technical architecture.
Keywords: FAIRification; Semantic data model; Cluster analysis; FAIR data; Metadata; Machine learning model
A Cloud Service Architecture for Analyzing Big Monitoring Data (Cited by: 3)
8
Authors: Samneet Singh, Yan Liu. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2016, Issue 1, pp. 55-70 (16 pages)
Cloud monitoring is a source of big data that are constantly produced from traces of infrastructures, platforms, and applications. Analysis of monitoring data delivers insights into the system's workload and usage pattern and ensures workloads are operating at optimum levels. The analysis process involves data query and extraction, data analysis, and result visualization. Since the volume of monitoring data is big, these operations require a scalable and reliable architecture to extract, aggregate, and analyze data in an arbitrary range of granularity. Ultimately, the results of analysis become the knowledge of the system and should be shared and communicated. This paper presents our cloud service architecture that explores a search cluster for data indexing and query. We develop REST APIs through which the data can be accessed by different analysis modules. This architecture enables extensions to integrate with software frameworks of both batch processing (such as Hadoop) and stream processing (such as Spark) of big data. The analysis results are structured in Semantic Media Wiki pages in the context of the monitoring data source and the analysis process. This cloud architecture is empirically assessed to evaluate its responsiveness when processing a large set of data records under node failures.
Keywords: cloud computing; REST API; big data; software architecture; semantic web
OPEN ONTOLOGY-BASED INTEGRATION PLATFORM FOR MODELING AND SIMULATION IN ENGINEERING (Cited by: 1)
9
Authors: Tommi Karhela, Antti Villberg, Hannu Niemisto. International Journal of Modeling, Simulation, and Scientific Computing (EI), 2012, Issue 2, pp. 131-166 (36 pages)
The benefits of the use of modeling and simulation in engineering are acknowledged widely. It has proven its advantages, e.g., in virtual prototyping (i.e., simulation-aided design and testing) as well as in training and R&D. It is recognized to be a tool for modern decision making. However, there are still reasons that slow down the wider utilization of modeling and simulation in companies. Modeling and simulation tools are separate and are not an integrated part of the other engineering information management in the company networks. They do not integrate well enough into the CAD, PLM/PDM, and control systems in use. The co-use of the simulation tools themselves is poor and the whole modeling process is often considered to be too laborious. In this article we introduce an integration solution for modeling and simulation based on the semantic data modeling approach. Semantic data modeling and ontology mapping techniques have been used in database system integration, but the novelty of this work is in utilizing these techniques in the domain of modeling and simulation. The benefits and drawbacks of the chosen approach are discussed. Furthermore, we describe real industrial project cases where this new approach has been applied.
Keywords: semantic data modeling; data-driven approach; information management