6 articles found
Enhancing AI System Privacy: An Automatic Tool for Achieving GDPR Compliance in NoSQL Databases
1
Authors: Yifei Zhao, Zhaohui Li, Siyi Lv. Source: Computers, Materials & Continua (SCIE, EI), 2024, Issue 7, pp. 217-234 (18 pages)
The EU’s Artificial Intelligence Act (AI Act) imposes requirements for the privacy compliance of AI systems. AI systems must comply with privacy laws such as the GDPR when providing services. These laws provide users with the right to issue a Data Subject Access Request (DSAR). Responding to such requests requires database administrators to identify information related to an individual accurately. However, manual compliance poses significant challenges and is error-prone. Database administrators need to write queries through time-consuming labor. The demand for large amounts of data by AI systems has driven the development of NoSQL databases. Due to the flexible schema of NoSQL databases, identifying personal information becomes even more challenging. This paper develops an automated tool to identify personal information that can help organizations respond to DSARs. Our tool employs a combination of various technologies, including schema extraction of NoSQL databases and relationship identification from query logs. We describe the algorithm used by our tool, detailing how it discovers and extracts implicit relationships from NoSQL databases and generates relationship graphs to help developers accurately identify personal data. We evaluate our tool on three datasets, covering different database designs, achieving an F1 score of 0.77 to 1. Experimental results demonstrate that our tool successfully identifies information relevant to the data subject. Our tool reduces manual effort and simplifies GDPR compliance, showing practical application value in enhancing the privacy performance of NoSQL databases and AI systems.
Keywords: GDPR compliance; NoSQL databases; AI system; privacy
A Systematic Review of Automated Classification for Simple and Complex Query SQL on NoSQL Database
2
Authors: Nurhadi, Rabiah Abdul Kadir, Ely Salwana Mat Surin, Mahidur R. Sarker. Source: Computer Systems Science & Engineering, 2024, Issue 6, pp. 1405-1435 (31 pages)
A data lake (DL) denotes a vast reservoir or repository of data. It accumulates substantial volumes of data and employs advanced analytics to correlate data from diverse origins containing various forms of semi-structured, structured, and unstructured information. These systems use a flat architecture and run different types of data analytics. NoSQL databases are nontabular and store data differently from relational tables. NoSQL databases come in various forms, including key-value pairs, documents, wide columns, and graphs, each based on its data model. They offer simpler scalability and generally outperform traditional relational databases. While NoSQL databases can store diverse data types, they lack full support for the atomicity, consistency, isolation, and durability features found in relational databases. Consequently, employing machine learning approaches becomes necessary to categorize complex structured query language (SQL) queries. Results indicate that the most frequently used automatic classification technique in processing SQL queries on NoSQL databases is machine learning-based classification. Overall, this study provides an overview of the automatic classification techniques used in processing SQL queries on NoSQL databases. Understanding these techniques can aid in the development of effective and efficient NoSQL database applications.
Keywords: NoSQL database; data lake; machine learning; ACID; complex query; smart city
A new fragment re-allocation strategy for NoSQL database systems (cited by: 3)
3
Authors: Zhikun CHEN, Shuqiang YANG, Shuang TAN, Li HE, Hong YIN, Ge ZHANG. Source: Frontiers of Computer Science (SCIE, EI, CSCD), 2015, Issue 1, pp. 111-127 (17 pages)
NoSQL databases are known for high scalability, high availability, and high fault tolerance, so they are used in many applications. The data partitioning strategy and the fragment allocation strategy directly affect the performance of NoSQL database systems. Large, global databases are partitioned horizontally, vertically, or by a combination of both. In general, the system scatters related fragments as widely as possible to improve the degree of parallelism of operations. But in some applications the operations are usually not very complicated, and an operation may access more than one fragment. At the same time, the fragments that an operation has to access may interact with each other. The general allocation strategies therefore increase the system's communication cost when operations execute across sites. To improve the performance of those applications and enable NoSQL database systems to work efficiently, the applications' fragments have to be allocated in a way that reduces the communication cost, i.e., minimizes the total volume of data transmitted when operations execute across sites. A hypergraph-based fragment clustering strategy is proposed, which places fragments that are accessed together in most operations into the same cluster. The method uses a weighted hypergraph to represent the operations' fragment access pattern, and a hypergraph partitioning algorithm to cluster the fragments. This reduces the number of sites that an operation has to span, and so reduces the communication cost across sites.
Experimental results confirm that the proposed technique contributes effectively to solving the fragment re-allocation problem in a specific application environment of a NoSQL database system.
Keywords: fragment allocation; NoSQL database; hypergraph partition; clustering fragments; fragment correlation
Big data storage technologies: a survey (cited by: 17)
4
Authors: Aisha SIDDIQA, Ahmad KARIM, Abdullah GANI. Source: Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2017, Issue 8, pp. 1040-1070 (31 pages)
There is a great thrust in industry toward the development of more feasible and viable tools for storing fast-growing volume, velocity, and diversity of data, termed 'big data'. The structural shift of the storage mechanism from traditional data management systems to NoSQL technology is due to the intention of fulfilling big data storage requirements. However, the available big data storage technologies are inefficient to provide consistent, scalable, and available solutions for continuously growing heterogeneous data. Storage is the preliminary process of big data analytics for real-world applications such as scientific experiments, healthcare, social networks, and e-business. So far, Amazon, Google, and Apache are some of the industry standards in providing big data storage solutions, yet the literature does not report an in-depth survey of storage technologies available for big data, investigating the performance and magnitude gains of these technologies. The primary objective of this paper is to conduct a comprehensive investigation of state-of-the-art storage technologies available for big data. A well-defined taxonomy of big data storage technologies is presented to assist data analysts and researchers in understanding and selecting a storage mechanism that better fits their needs. To evaluate the performance of different storage architectures, we compare and analyze the existing approaches using Brewer's CAP theorem. The significance and applications of storage technologies and support to other categories are discussed. Several future research challenges are highlighted with the intention to expedite the deployment of a reliable and scalable storage system.
Keywords: Big data; Big data storage; NoSQL databases; Distributed databases; CAP theorem; Scalability; Consistency-partition resilience; Availability-partition resilience
TIFAflow: Enhancing Traffic Archiving System with Flow Granularity for Forensic Analysis in Network Security (cited by: 3)
5
Authors: Zhen Chen, Linyun Ruan, Junwei Cao, Yifan Yu, Xin Jiang. Source: Tsinghua Science and Technology (SCIE, EI, CAS), 2013, Issue 4, pp. 406-417 (12 pages)
The archiving of Internet traffic is an essential function for retrospective network event analysis and forensic computer communication. The state-of-the-art approach for network monitoring and analysis involves storage and analysis of network flow statistics. However, this approach loses much valuable information within the Internet traffic. With the advancement of commodity hardware, in particular the volume of storage devices and the speed of interconnect technologies used in network adapter cards and multi-core processors, it is now possible to capture real-time network traffic at 10 Gbps and beyond using a commodity computer, for example with n2disk. With the advancement of distributed file systems (such as Hadoop and ZFS) and open cloud computing platforms (such as OpenStack, CloudStack, and Eucalyptus), it is also practical to store such large volumes of traffic data and analyze the communication inside them in depth within an acceptable latency. In this paper, based on the well-known TimeMachine, we present TIFAflow, the design and implementation of a novel system for archiving and querying network flows. Firstly, we enhance the traffic archiving system named TImemachine+FAstbit (TIFA) with flow granularity, i.e., supply the system with a flow table and a flow module. Secondly, based on real network traces, we conduct performance comparison experiments of TIFAflow with other implementations such as a common database solution, TimeMachine, and the TIFA system. Finally, based on the comparison results, we demonstrate that TIFAflow improves on TimeMachine and TIFA in storing and querying performance, in both time and space metrics.
Keywords: network security; traffic archival; forensic analysis; phishing attack; bitmap database; Hadoop distributed file system; cloud computing; NoSQL
Lom: Discovering Logic Flaws Within MongoDB-based Web Applications
6
Authors: Shuo Wen, Yuan Xue, Jing Xu, Li-Ying Yuan, Wen-Li Song, Hong-Ji Yang, Guan-Nan Si. Source: International Journal of Automation and Computing (EI, CSCD), 2017, Issue 1, pp. 106-118 (13 pages)
Logic flaws within web applications allow malicious operations to be triggered against the back-end database. Existing approaches to identifying logic flaws of database accesses are strongly tied to structured query language (SQL) statement construction and cannot be applied to the new generation of web applications that use not only structured query language (NoSQL) databases as the storage tier. In this paper, we present Lom, a black-box approach for discovering many categories of logic flaws within MongoDB-based web applications. Our approach introduces a MongoDB operation model to support new features of MongoDB and models the application logic as a Mealy finite state machine. During the testing phase, test inputs which emulate state violation attacks are constructed for identifying logic flaws at each application state. We apply Lom to several MongoDB-based web applications and demonstrate its effectiveness.
Keywords: Logic flaw; web application security; not only structured query language (NoSQL) database; black-box; MongoDB