Abstract: As more application systems related to big data are developed, NoSQL (Not Only SQL) database systems are becoming increasingly popular. Many scholars have tried different techniques to add transaction features to some NoSQL database systems. Unfortunately, the existing literature lacks research on Redis transactions. This paper proposes a transaction model for key-value NoSQL databases, including Redis, that allows users to access data in an ACID (Atomicity, Consistency, Isolation, and Durability) manner; the model is called the surfing concurrence transaction model. The architecture, important features, and implementation principle are described in detail. The key algorithms are given as pseudocode, and the performance is evaluated. With the proposed model, transactions on key-value NoSQL databases can be performed without locks and without MVCC (Multi-Version Concurrency Control). This work is the result of further research on the topic and fills a gap in the field, contributing to the further development of NoSQL technology.
Funding: Supported by the National Natural Science Foundation of China (No. 62302242) and the China Postdoctoral Science Foundation (No. 2023M731802).
Abstract: The EU's Artificial Intelligence Act (AI Act) imposes requirements for the privacy compliance of AI systems. AI systems must comply with privacy laws such as the GDPR when providing services. These laws give users the right to issue a Data Subject Access Request (DSAR). Responding to such requests requires database administrators to accurately identify the information related to an individual. However, manual compliance is challenging and error-prone, because database administrators must write queries by hand, which is time-consuming. The demand of AI systems for large amounts of data has driven the development of NoSQL databases, and their flexible schemas make identifying personal information even more challenging. This paper develops an automated tool for identifying personal information that can help organizations respond to DSARs. The tool combines several techniques, including schema extraction from NoSQL databases and relationship identification from query logs. We describe the algorithm used by the tool, detailing how it discovers and extracts implicit relationships from NoSQL databases and generates relationship graphs that help developers accurately identify personal data. We evaluate the tool on three datasets covering different database designs, achieving F1 scores between 0.77 and 1. Experimental results demonstrate that the tool successfully identifies information relevant to the data subject. The tool reduces manual effort and simplifies GDPR compliance, showing practical value in enhancing the privacy performance of NoSQL databases and AI systems.
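As a rough illustration of what schema extraction from a flexible-schema document store can involve, the Python sketch below walks a handful of JSON-like documents and records which value types appear under each field path. It is not the tool described in the abstract, whose algorithm is not given here; the function name `extract_schema` and the sample documents are hypothetical.

```python
from collections import defaultdict


def extract_schema(documents):
    """Map each dotted field path to the sorted type names observed for it."""
    schema = defaultdict(set)

    def walk(value, path):
        if isinstance(value, dict):
            # Descend into nested documents, extending the dotted path.
            for key, child in value.items():
                walk(child, f"{path}.{key}" if path else key)
        elif isinstance(value, list):
            # Array elements share one path, marked with "[]".
            for item in value:
                walk(item, f"{path}[]")
        else:
            # Leaf value: record its type under the current path.
            schema[path].add(type(value).__name__)

    for doc in documents:
        walk(doc, "")
    return {path: sorted(types) for path, types in schema.items()}


# Hypothetical documents with differing fields, as a schemaless store allows.
users = [
    {"_id": 1, "name": "Ada", "contact": {"email": "ada@example.com"}},
    {"_id": 2, "name": "Lin", "phones": ["555-0100"], "contact": {"email": None}},
]
schema = extract_schema(users)
for path, types in sorted(schema.items()):
    print(path, types)
```

Fields that appear in only some documents (here `phones[]`) or with more than one type (here `contact.email`) are the cases that make manual identification of personal data harder in such stores.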
Funding: Supported by the Student Scheme provided by Universiti Kebangsaan Malaysia under Code TAP-20558.
Abstract: A data lake (DL) is a vast repository of data. It accumulates substantial volumes of data and employs advanced analytics to correlate data from diverse origins, covering structured, semi-structured, and unstructured information. These systems use a flat architecture and run different types of data analytics. NoSQL databases are non-tabular and store data differently from relational tables. They come in various forms, including key-value, document, wide-column, and graph stores, each based on its own data model. They offer simpler scalability and generally outperform traditional relational databases. While NoSQL databases can store diverse data types, they lack full support for the atomicity, consistency, isolation, and durability features found in relational databases. Consequently, machine learning approaches become necessary to categorize complex Structured Query Language (SQL) queries. Results indicate that machine learning-based classification is the most frequently used automatic classification technique for processing SQL queries on NoSQL databases. Overall, this study provides an overview of the automatic classification techniques used in processing SQL queries on NoSQL databases. Understanding these techniques can aid in the development of effective and efficient NoSQL database applications.
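Abstract: To address the bottlenecks of traditional relational databases in storing and accessing massive data, an efficient storage method for geoscience big data based on non-relational (NoSQL: Not only SQL) databases is proposed. Taking MongoDB as a representative, test programs were written in C# to compare its performance with SQL Server in storing and querying geoscience big data. The results show that, compared with the traditional relational database, the NoSQL database takes markedly less time for insert, delete, and query operations, and its performance advantage is even more pronounced for massive unstructured and semi-structured data.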