This paper briefly introduces a multi-granularity locking (MGL) model for concurrency control in object-oriented database systems and presents the MGL model formally. The paper covers four lock-scheduling algorithms for MGL. The ideas of single queue scheduling (SQS) and dual queue scheduling (DQS), together with their algorithms and performance evaluations, have been presented in other papers. This paper describes a new scheduling idea for MGL, compatible requests first (CRF). Combining this idea with SQS and DQS, we propose two new scheduling algorithms, CRFS and CRFD. After describing the simulation model, the paper compares the performance of the four algorithms. The experiments show that DQS performs better than SQS, CRFD better than DQS, and CRFS better than SQS, and that CRFS is the best of the four scheduling algorithms.
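The abstract above gives neither its lock modes nor the details of CRF, so the following is only a minimal sketch of one plausible reading of the name "compatible requests first". It assumes the conventional multi-granularity modes (IS, IX, S, SIX, X) and their usual compatibility matrix. The function names `grant_fifo` and `grant_crf` are hypothetical and are not taken from the paper. A strict first-in-first-out single queue stops at the first incompatible request, whereas the CRF-style scan grants every waiting request that is compatible with the locks currently held.

```python
from collections import deque

# Conventional multi-granularity lock compatibility matrix.
# Assumption: the abstract does not list its modes; these are the textbook ones.
COMPATIBLE = {
    "IS": {"IS", "IX", "S", "SIX"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "SIX": {"IS"},
    "X": set(),
}


def compatible_with_all(mode, held):
    """True if `mode` is compatible with every lock mode in `held`."""
    return all(h in COMPATIBLE[mode] for h in held)


def grant_fifo(held, queue):
    """Strict single-queue FIFO: grant from the head, stop at the first conflict."""
    held = list(held)
    pending = deque(queue)
    granted = []
    while pending and compatible_with_all(pending[0][1], held):
        txn, mode = pending.popleft()
        held.append(mode)
        granted.append(txn)
    return granted, list(pending)


def grant_crf(held, queue):
    """Hypothetical CRF-style scan: grant every queued request that is compatible."""
    held = list(held)
    granted, waiting = [], []
    for txn, mode in queue:
        if compatible_with_all(mode, held):
            held.append(mode)
            granted.append(txn)
        else:
            waiting.append((txn, mode))
    return granted, waiting


if __name__ == "__main__":
    held = ["S"]
    queue = [("T1", "IS"), ("T2", "X"), ("T3", "S")]
    print(grant_fifo(held, queue))
    print(grant_crf(held, queue))
```

In this sketch T3 is granted ahead of T2 under the CRF-style scan, which raises concurrency but could starve exclusive requests unless some fairness rule is added. Whether and how the paper's algorithms handle that is not stated in the abstract.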
Focusing on the development of an electronic mart (e-mart) based on object-oriented databases (OODB), this paper introduces the concepts of an integrated electronic-commerce (e-commerce) environment and the e-mart, and describes the basic characteristics of the OODB Jasmine. In addition, the database schema and hierarchy of the e-mart are discussed in detail.
All-solid-state batteries (ASSBs) are safer and offer higher energy density than conventional devices, and solid-state electrolytes (SSEs) are their essential components. To date, the search for highly ion-conducting solid-state electrolytes has attracted broad attention. However, obtaining SSEs with high ionic conductivity is challenging because of the complex structural information and the less-explored structure-performance relationship. To address these challenges, developing a database of typical SSEs from available experimental reports offers a new avenue for understanding structure-performance relationships and deriving new design guidelines for suitable SSEs. Herein, a dynamic experimental database containing >600 materials was developed over a wide range of temperatures (132.40–1261.60 K), including mono- and divalent cations (e.g., Li⁺, Na⁺, K⁺, Ag⁺, Ca²⁺, Mg²⁺, and Zn²⁺) and various types of anions (e.g., halide, hydride, sulfide, and oxide). Data mining was conducted to explore the relationships among different variables (e.g., transport ion, composition, activation energy, and conductivity). Overall, we expect that this database can provide essential guidelines for the design and development of high-performance SSEs in ASSB applications. The database is dynamically updated and can be accessed via our open-source online system.
This paper presents a practical concurrency control mechanism, Object-Locking, in OODBMS. Object-Locking can schedule transactions, each of which can be considered as a sequence of high-level operations defined on classes. Using the properties of parallelism and commutativity between high-level operations, proper lock modes are designed for each operation and the compatibility matrix is constructed. With these lock modes, phantoms are kept out of databases and a high degree of concurrency is achieved.
The data model of WHYMX complicates transaction management. Traditional locking methods are not powerful enough to solve the new concurrency control problems of WHYMX transactions. This paper presents a number of concurrency control algorithms based on an extended locking method.
Analyzing the composition of polysorbate 20 (PS20) and the impact of each component on stability and safety is crucial because of formulation variations and individual tolerance. The similar structures and polarities of PS20 components make accurate separation, identification, and quantification challenging. In this work, a high-resolution quantitative method was developed using single-dimensional high-performance liquid chromatography (HPLC) with charged aerosol detection (CAD) to separate 18 key components with multiple esters. The separated components were characterized by ultra-high-performance liquid chromatography-quadrupole time-of-flight mass spectrometry (UHPLC-Q-TOF-MS) using the same gradient as the HPLC-CAD analysis. The polysorbate compound database and library were expanded more than sevenfold compared to the commercial database. The method was used to investigate differences among PS20 samples of various origins and grades for different dosage forms in order to evaluate the composition-process relationship. UHPLC-Q-TOF-MS identified 1329 to 1511 compounds in 4 batches of PS20 from different sources. The method was also used to observe the impact of 4 degradation conditions on peak components, identifying stable components and their tendencies to change. The HPLC-CAD and UHPLC-Q-TOF-MS results provided insights into fingerprint differences, distinguishing quasi products.
The EU's Artificial Intelligence Act (AI Act) imposes requirements for the privacy compliance of AI systems. AI systems must comply with privacy laws such as the GDPR when providing services. These laws give users the right to issue a Data Subject Access Request (DSAR). Responding to such requests requires database administrators to accurately identify information related to an individual. However, manual compliance poses significant challenges and is error-prone: database administrators must write queries by hand, which is time-consuming. The demand of AI systems for large amounts of data has driven the development of NoSQL databases, and the flexible schema of NoSQL databases makes identifying personal information even more challenging. This paper develops an automated tool for identifying personal information that can help organizations respond to DSARs. Our tool combines several technologies, including schema extraction from NoSQL databases and relationship identification from query logs. We describe the algorithm used by our tool, detailing how it discovers and extracts implicit relationships from NoSQL databases and generates relationship graphs to help developers accurately identify personal data. We evaluate our tool on three datasets covering different database designs, achieving F1 scores of 0.77 to 1. Experimental results demonstrate that our tool successfully identifies information relevant to the data subject. Our tool reduces manual effort and simplifies GDPR compliance, showing practical value in enhancing the privacy performance of NoSQL databases and AI systems.
Advanced glycation end-products (AGEs) are a group of heterogeneous compounds formed in heat-processed foods and are proven to be detrimental to human health. Currently, there is no comprehensive database for AGEs in foods that covers the entire range of food categories, which limits the accurate risk assessment of dietary AGEs in human diseases. In this study, we first established an isotope dilution UHPLC-QqQ-MS/MS-based method for simultaneous quantification of 10 major AGEs in foods. The contents of these AGEs were measured in 334 foods covering all main groups consumed in Western and Chinese populations. Nε-Carboxymethyllysine, methylglyoxal-derived hydroimidazolone isomers, and glyoxal-derived hydroimidazolone-1 are the predominant AGEs found in most foodstuffs. Total amounts of AGEs were high in processed nuts, bakery products, and certain types of cereals and meats (>150 mg/kg), and low in dairy products, vegetables, fruits, and beverages (<40 mg/kg). Assessment of estimated daily intake implied that the contribution of food groups to daily AGE intake varied considerably under different eating patterns, and that selecting high-AGE foods leads to up to a 2.7-fold higher intake of AGEs through daily meals. The presented AGE database allows accurate assessment of dietary exposure to these glycotoxins in order to explore their physiological impacts on human health.
This study examines the database search behaviors of individuals, focusing on gender differences and the impact of planning habits on information retrieval. Data were collected from a survey of 198 respondents, categorized by their discipline, schooling background, internet usage, and information retrieval preferences. Key findings indicate that females are more likely to plan their searches in advance and prefer structured methods of information retrieval, such as using library portals and leading university websites. Males, however, tend to use web search engines and self-archiving methods more frequently. This analysis provides valuable insights for educational institutions and libraries to optimize their resources and services based on user behavior patterns.
Discovering materials using a "bottom-up" or "top-down" approach is of great interest in materials science. Layered materials consisting of two-dimensional (2D) building blocks provide a good platform for exploring new materials in this respect. In van der Waals (vdW) layered materials, these building blocks are charge neutral and can be isolated from their bulk phase (top-down), but usually grow on a substrate. In ionic layered materials, they are charged and usually cannot exist independently, but can serve as motifs to construct new materials (bottom-up). In this paper, we introduce our recently constructed databases for the 2D material-substrate interface (2DMSI) and for 2D charged building blocks. For the 2DMSI database, we systematically build a workflow to predict appropriate substrates and their geometries at substrates, and construct the 2DMSI database. For the 2D charged building block database, 1208 entries are identified from a bulk material database. Information on crystal structure, valence state, source, dimension, and so on is provided for each entry in JSON format. We also show its application in designing and searching for new functional layered materials. The 2DMSI database, building block database, and designed layered materials are available in Science Data Bank at https://doi.org/10.57760/sciencedb.j00113.00188.
CALPHAD thermodynamic databases are very useful for analyzing the complex chemical reactions that occur in high-temperature material processes. The FactSage thermodynamic database can be used to calculate complex phase diagrams and equilibrium phases involving refractories in industrial processes. In this study, the FactSage thermodynamic database relevant to ZrO₂-based refractories is reviewed, and the application of the database to understanding the corrosion of continuous casting nozzle refractories in steelmaking is presented.
BACKGROUND: Elective cholecystectomy (CCY) is recommended for patients with gallstone-related acute cholangitis (AC) following endoscopic decompression to prevent recurrent biliary events. However, the optimal timing and implications of CCY remain unclear. AIM: To examine the impact of same-admission CCY compared to interval CCY on patients with gallstone-related AC using the National Readmission Database (NRD). METHODS: We queried the NRD to identify all gallstone-related AC hospitalizations in adult patients with and without same-admission CCY between 2016 and 2020. Our primary outcome was the all-cause 30-day readmission rate, and secondary outcomes included in-hospital mortality, length of stay (LOS), and hospitalization cost. RESULTS: Among the 124964 gallstone-related AC hospitalizations, only 14.67% underwent same-admission CCY. The all-cause 30-day readmission rate in the same-admission CCY group was almost half that of the non-CCY group (5.56% vs 11.50%). Patients in the same-admission CCY group had a longer mean LOS and higher hospitalization costs attributable to surgery. Although the most common reason for readmission was sepsis in both groups, the second most common reason was AC in the interval CCY group. CONCLUSION: Our study suggests that patients with gallstone-related AC who do not undergo same-admission CCY have twice the risk of readmission compared to those who undergo CCY during the same admission. These readmissions can potentially be prevented by performing same-admission CCY in appropriate patients, which may reduce subsequent hospitalization costs secondary to readmissions.
Structural development defects essentially refer to code structure that violates object-oriented design principles. They make program maintenance challenging and degrade software quality over time. Various detection approaches, ranging from traditional heuristic algorithms to machine learning methods, are used to identify these defects, and ensemble learning methods have strengthened their detection. However, existing approaches do not simultaneously exploit the ability of pre-trained models to extract relevant features and the performance of neural networks on the classification task. Our goal has therefore been to design a model that combines a pre-trained model, which extracts relevant features from code excerpts through transfer learning, with a bagging method whose base estimator is a dense neural network for defect classification. To achieve this, we drew multiple samples of the same size, with replacement, from the imbalanced MLCQ dataset. For all the samples, we used the CodeT5-small variant to extract features and trained a bagging method with the neural network Roberta Classification Head to classify defects based on these features. We then compared this model to RandomForest, one of the ensemble methods that yields good results. Our experiments showed that the number of base estimators to use for bagging depends on the defect to be detected. Next, we observed that it was not necessary to use a data balancing technique with our model when the imbalance rate was 23%. Finally, for Blob detection, RandomForest had a median MCC value of 0.36 compared to 0.12 for our method. However, our method was superior in Long Method detection, with a median MCC value of 0.53 compared to 0.42 for RandomForest. These results suggest that the performance of ensemble methods in detecting structural development defects depends on the specific defect.
A SOTER management system was developed through iterative analysis, design, programming, and testing based on the object-oriented method. The attribute database management function is inherited and expanded in the new system, and the integrity and security of the SOTER database are enhanced. The attribute database management, the spatial database management, and the model base are integrated into SOTER based on the component object model (COM), and a graphical user interface (GUI) for Windows is used to interact with clients. This makes the SOTER database easy to create and maintain and helps promote the quantification and automation of soil information applications.
Constraints and operations play an important role in database generalization: they guide and govern it. The constraints translate the required conditions, which should take into account not only the objects and the relationships among objects but also the spatial data schema (classification and aggregation hierarchy) associated with the final existing database. The operations perform the actions of generalization in support of data reduction in the database. Constraints in database generalization remain under-researched, and frameworks for expressing constraints and operations on the basis of an object-oriented data structure are still lacking. This paper focuses on frameworks for generalization operations and constraints based on an object-oriented data structure in database generalization. The constraints, as attributes of the object, and the operations, as methods of the object, can be encapsulated in classes, and they have the properties of inheritance and polymorphism. A framework of constraints and operations based on an object-oriented data structure can therefore be easily understood and implemented. Constraints and operations based on an object-oriented database are proposed, and frameworks for generalization operations, constraints, and relations among objects based on an object-oriented data structure are designed. The paper concentrates on categorical database generalization.
Based on the approach of implementing a deductive object-oriented database system on top of an underlying relational database, this paper presents an object reasoning language, O-Datalog, which extends Datalog in form and can handle object-oriented data. For any O-Datalog program, an equivalent Datalog program can be built to help evaluate the original program. This paper focuses on the syntax, semantics, and evaluation of O-Datalog.
The bone extracellular matrix (ECM) contains minerals deposited on highly crosslinked collagen fibrils and hundreds of noncollagenous proteins. Some of these proteins are key to the regulation of bone formation and regeneration via signaling pathways, and play important regulatory and structural roles. However, the complete list of bone ECM proteins, their roles, and the extent of individual and cross-species variations have not been fully captured in either humans or model organisms. Here, we introduce the most comprehensive resource of bone ECM proteins, which can be used in research fields such as bone regeneration, osteoporosis, and mechanobiology. The Phylobone database (available at https://phylobone.com) includes 255 proteins potentially expressed in the bone ECM of humans and 30 species of vertebrates. A bioinformatics pipeline was used to identify the evolutionary relationships of bone ECM proteins. The analysis facilitated the identification of potential model organisms for studying the molecular mechanisms of bone regeneration. A network analysis showed high connectivity of bone ECM proteins. A total of 214 functional protein domains were identified, including collagen and the domains involved in bone formation and resorption. Information from public drug repositories was used to identify potential repurposing of existing drugs. The Phylobone database provides a platform for studying bone regeneration and osteoporosis in light of (biological) evolution, and will substantially contribute to the identification of molecular mechanisms and drug targets.
Antibiotic resistance, which is encoded by antibiotic-resistance genes (ARGs), has proliferated to become a growing threat to public health around the world. With technical advances, especially the popularization of metagenomic sequencing, scientists have gained the ability to decipher the profiles of ARGs in diverse samples with high accuracy and at an accelerated speed. To analyze thousands of ARGs in a high-throughput way, standardized and integrated pipelines are needed. The new version (v3.0) of the widely used ARGs online analysis pipeline (ARGs-OAP) has made significant improvements to both the reference database, the structured ARG (SARG) database, and the integrated analysis pipeline. SARG has been enhanced with sequence curation to improve annotation reliability, incorporate emerging resistance genotypes, and determine rigorous mechanism classification. The database has been further organized and visualized online as a tree-like structure with a dictionary. It has also been divided into sub-databases for different application scenarios. In addition, the ARGs-OAP has been improved with adjusted quantification methods, simplified tool implementation, and multiple functions with user-defined reference databases. Moreover, the online platform now provides a diverse biostatistical analysis workflow with visualization packages for the efficient interpretation of ARG profiles. The ARGs-OAP v3.0, with its improved database and analysis pipeline, will benefit academia, governmental management, and consultation regarding risk assessment of the environmental prevalence of ARGs.
Background: Skin aging has recently gained significant attention in both society and skin care research. Understanding the biological processes of photoaging caused by long-term skin exposure to ultraviolet radiation is critical for preventing and treating skin aging. Therefore, it is important to identify genes related to skin photoaging and shed light on their functions. Methods: We used data from the Gene Expression Omnibus (GEO) database and conducted bioinformatics analyses to screen and extract microRNAs (miRNAs) and their downstream target genes related to skin photoaging, and to determine possible biological mechanisms of skin photoaging. Results: A total of 34 differentially expressed miRNAs and their downstream target genes potentially related to the biological process of skin photoaging were identified. Gene Ontology enrichment analysis and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analysis showed that these target genes were enriched in pathways related to human papillomavirus infection, extracellular matrix (ECM)-receptor signaling, estrogen receptor, skin development, epidermal development, epidermal cell differentiation, keratinocyte differentiation, structural components of the ECM, structural components of the skin epidermis, and others. Conclusion: Based on the GEO database-derived findings, we determined that target genes of two miRNAs, namely miR-4667-5P-KRT79 and miR-139-5P-FOS, play an important role in skin photoaging. These observations could provide theoretical support and guidance for further research on skin aging-related biological processes.
Funding (solid-state electrolyte database paper): Supported by the Ensemble Grant for Early Career Researchers 2022 and the 2023 Ensemble Continuation Grant of Tohoku University, the Hirose Foundation, the Iwatani Naoji Foundation, and the AIMR Fusion Research Grant; by JSPS KAKENHI Nos. JP23K13599, JP23K13703, JP22H01803, and JP18H05513; by the Center for Computational Materials Science, Institute for Materials Research, Tohoku University, for the use of MASAMUNE-IMR (Nos. 202212-SCKXX0204 and 202208-SCKXX-0212); by the Institute for Solid State Physics (ISSP) at the University of Tokyo for the use of their supercomputers; and by the China Scholarship Council (CSC) fund to pursue studies in Japan.
Funding (WHYMX concurrency control paper): This research is supported by the National Natural Science Foundation of China.
Funding (polysorbate 20 analysis paper): Financial support from the Science Research Program Project for Drug Regulation, Jiangsu Drug Administration, China (Grant No.: 202207); the National Drug Standards Revision Project, China (Grant No.: 2023Y41); the National Natural Science Foundation of China (Grant No.: 22276080); and the Foreign Expert Project, China (Grant No.: G2022014096L).
Funding (NoSQL personal-information identification paper): Supported by the National Natural Science Foundation of China (No. 62302242) and the China Postdoctoral Science Foundation (No. 2023M731802).
Funding: Financial support was received from the Natural Science Foundation of China (32202202 and 31871735).
Abstract: Advanced glycation end-products (AGEs) are a group of heterogeneous compounds formed in heat-processed foods and have been shown to be detrimental to human health. Currently, there is no comprehensive database for AGEs in foods that covers the entire range of food categories, which limits the accurate risk assessment of dietary AGEs in human diseases. In this study, we first established an isotope dilution UHPLC-QqQ-MS/MS-based method for simultaneous quantification of 10 major AGEs in foods. The contents of these AGEs were determined in 334 foods covering all main groups consumed in Western and Chinese populations. Nε-Carboxymethyllysine, methylglyoxal-derived hydroimidazolone isomers, and glyoxal-derived hydroimidazolone-1 are the predominant AGEs found in most foodstuffs. Total amounts of AGEs were high in processed nuts, bakery products, and certain types of cereals and meats (>150 mg/kg), and low in dairy products, vegetables, fruits, and beverages (<40 mg/kg). Assessment of estimated daily intake implied that the contribution of food groups to daily AGE intake varied considerably under different eating patterns, and that selecting high-AGE foods leads to an up to 2.7-fold higher intake of AGEs through daily meals. The presented AGE database allows accurate assessment of dietary exposure to these glycotoxins in order to explore their physiological impacts on human health.
Abstract: This study examines the database search behaviors of individuals, focusing on gender differences and the impact of planning habits on information retrieval. Data were collected from a survey of 198 respondents, categorized by discipline, schooling background, internet usage, and information retrieval preferences. Key findings indicate that females are more likely to plan their searches in advance and prefer structured methods of information retrieval, such as library portals and the websites of leading universities. Males, by contrast, use web search engines and self-archiving methods more frequently. This analysis provides insights that educational institutions and libraries can use to optimize their resources and services based on user behavior patterns.
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 61888102, 52272172, and 52102193), the Major Program of the National Natural Science Foundation of China (Grant No. 92163206), the National Key Research and Development Program of China (Grant Nos. 2021YFA1201501 and 2022YFA1204100), the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB30000000), and the Fundamental Research Funds for the Central Universities.
Abstract: The discovery of materials using a “bottom-up” or “top-down” approach is of great interest in materials science. Layered materials consisting of two-dimensional (2D) building blocks provide a good platform for exploring new materials in this respect. In van der Waals (vdW) layered materials, these building blocks are charge neutral and can be isolated from their bulk phase (top-down), but they usually grow on a substrate. In ionic layered materials, they are charged and usually cannot exist independently, but they can serve as motifs to construct new materials (bottom-up). In this paper, we introduce our recently constructed databases for the 2D material-substrate interface (2DMSI) and for 2D charged building blocks. For the 2DMSI database, we systematically built a workflow to predict appropriate substrates and the geometries of 2D materials on those substrates. For the 2D charged building block database, 1208 entries were identified from a bulk material database. Information on crystal structure, valence state, source, dimension, and so on is provided for each entry in JSON format. We also show its application in designing and searching for new functional layered materials. The 2DMSI database, the building block database, and the designed layered materials are available in Science Data Bank at https://doi.org/10.57760/sciencedb.j00113.00188.
Funding: Tata Steel Netherlands, Posco, Hyundai Steel, Nucor Steel, RioTinto, Nippon Steel Corp., JFE Steel, Voestalpine, RHi-Magnesita, Doosan Enerbility, Seah Besteel, Umicore, Vesuvius, and Schott AG are gratefully acknowledged.
Abstract: CALPHAD thermodynamic databases are very useful for analyzing the complex chemical reactions that occur in high-temperature materials processing. The FactSage thermodynamic database can be used to calculate complex phase diagrams and equilibrium phases involving refractories in industrial processes. In this study, the FactSage thermodynamic database relevant to ZrO_(2)-based refractories is reviewed, and its application to understanding the corrosion of continuous casting nozzle refractories in steelmaking is presented.
Abstract: BACKGROUND Elective cholecystectomy (CCY) is recommended for patients with gallstone-related acute cholangitis (AC) following endoscopic decompression to prevent recurrent biliary events. However, the optimal timing and implications of CCY remain unclear. AIM To examine the impact of same-admission CCY compared to interval CCY on patients with gallstone-related AC using the National Readmission Database (NRD). METHODS We queried the NRD to identify all gallstone-related AC hospitalizations in adult patients with and without same-admission CCY between 2016 and 2020. Our primary outcome was the all-cause 30-day readmission rate, and secondary outcomes included in-hospital mortality, length of stay (LOS), and hospitalization cost. RESULTS Among the 124964 gallstone-related AC hospitalizations, only 14.67% involved same-admission CCY. The all-cause 30-day readmission rate in the same-admission CCY group was almost half that of the non-CCY group (5.56% vs 11.50%). Patients in the same-admission CCY group had a longer mean LOS and higher hospitalization costs attributable to surgery. Although the most common reason for readmission was sepsis in both groups, the second most common reason was AC in the interval CCY group. CONCLUSION Our study suggests that patients with gallstone-related AC who do not undergo same-admission CCY have twice the risk of readmission compared to those who undergo CCY during the same admission. These readmissions can potentially be prevented by performing same-admission CCY in appropriate patients, which may reduce subsequent hospitalization costs secondary to readmissions.
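The "twice the risk" statement in the conclusion can be checked against the two readmission rates given in the results. The following is a minimal arithmetic sketch, not the authors' analysis: it uses only the figures reported in the abstract, the variable and function names are illustrative, and the crude ratio ignores any survey weighting or adjustment the study may have applied.

```python
# Figures taken from the abstract above; names are illustrative.
TOTAL_HOSPITALIZATIONS = 124964
SAME_ADMISSION_SHARE = 0.1467        # 14.67% had same-admission CCY
READMIT_RATE_SAME_ADMISSION = 0.0556  # 5.56% all-cause 30-day readmission
READMIT_RATE_NO_CCY = 0.1150          # 11.50% all-cause 30-day readmission


def crude_risk_ratio(rate_exposed: float, rate_reference: float) -> float:
    """Unadjusted ratio of two readmission rates."""
    return rate_exposed / rate_reference


# Approximate size of the same-admission CCY group implied by the share.
same_admission_count = round(TOTAL_HOSPITALIZATIONS * SAME_ADMISSION_SHARE)

# Readmission risk without same-admission CCY relative to with it.
ratio = crude_risk_ratio(READMIT_RATE_NO_CCY, READMIT_RATE_SAME_ADMISSION)

print(f"same-admission CCY hospitalizations: about {same_admission_count}")
print(f"crude readmission risk ratio: {ratio:.2f}")
```

The crude ratio comes out at a little over 2, which is consistent with the "almost half" and "twice the risk" wording in the abstract.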
Abstract: Structural development defects essentially refer to code structure that violates object-oriented design principles. They make program maintenance challenging and degrade software quality over time. Various detection approaches, ranging from traditional heuristic algorithms to machine learning methods, are used to identify these defects, and ensemble learning methods have strengthened their detection. However, existing approaches do not simultaneously exploit the ability of pre-trained models to extract relevant features and the performance of neural networks on the classification task. Our goal was therefore to design a model that combines a pre-trained model, which extracts relevant features from code excerpts through transfer learning, with a bagging method whose base estimator is a dense neural network for defect classification. To achieve this, we drew multiple samples of the same size, with replacement, from the imbalanced dataset MLCQ1. For all the samples, we used the CodeT5-small variant to extract features and trained a bagging method with the neural network Roberta Classification Head to classify defects based on these features. We then compared this model to RandomForest, an ensemble method that yields good results. Our experiments showed that the number of base estimators to use for bagging depends on the defect to be detected. We also observed that it was not necessary to use a data balancing technique with our model when the imbalance rate was 23%. Finally, for Blob detection, RandomForest had a median MCC value of 0.36 compared to 0.12 for our method. However, our method performed better in Long Method detection, with a median MCC value of 0.53 compared to 0.42 for RandomForest. These results suggest that the performance of ensemble methods in detecting structural development defects depends on the specific defect.
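The abstract relies on two generic ingredients: bagging (training estimators on samples drawn with replacement and combining their votes) and the Matthews correlation coefficient (MCC) used to score the results. The sketch below illustrates only those two ingredients in plain Python. It is not the authors' pipeline: the CodeT5 feature extraction and the neural base estimator are omitted, and all function names here are hypothetical.

```python
import math
import random
from collections import Counter


def bootstrap_samples(data, n_samples, size, seed=0):
    """Draw n_samples samples of the given size from data, with replacement."""
    rng = random.Random(seed)
    return [[rng.choice(data) for _ in range(size)] for _ in range(n_samples)]


def majority_vote(predictions):
    """Combine per-estimator label lists into one label per item.

    predictions is a list of equal-length label lists, one per estimator.
    On a tie, the label seen first among the tied ones is returned.
    """
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when the score is undefined.
    return 0.0 if denominator == 0 else (tp * tn - fp * fn) / denominator


# Toy usage: three hypothetical base estimators voting on four code excerpts.
votes = [[1, 0, 1, 0], [1, 1, 1, 0], [0, 0, 1, 0]]
combined = majority_vote(votes)
score = mcc([1, 0, 1, 0], combined)
```

MCC ranges from -1 to 1 and accounts for all four cells of the confusion matrix, which is why it is often preferred over accuracy on imbalanced datasets such as the one described.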
Funding: Project supported by the National Natural Science Foundation of China (No. 40271056) and the Hubei Provincial Natural Science Foundation of China (No. 99J123).
Abstract: A SOTER management system was developed using the object-oriented method, through iterative analysis, design, programming, and testing. The new system inherits and expands the attribute database management functions, and the integrity and security of the SOTER database are enhanced. The attribute database management, the spatial database management, and the model base are integrated into SOTER based on the component object model (COM), and a graphical user interface (GUI) for Windows is used to interact with clients. This makes the SOTER database easy to create and maintain, and makes it convenient to promote the quantification and automation of soil information applications.
Abstract: Constraints and operations play an important role in database generalization: they guide and govern it. The constraints translate the required conditions, which should take into account not only the objects and the relationships among objects but also the spatial data schema (classification and aggregation hierarchy) associated with the final existing database. The operations perform the actions of generalization in support of data reduction in the database. Constraints in database generalization remain under-researched, and frameworks for expressing constraints and operations on the basis of an object-oriented data structure are still lacking. This paper focuses on frameworks for generalization operations and constraints on the basis of an object-oriented data structure in database generalization. The constraints, as attributes of the object, and the operations, as methods of the object, can be encapsulated in classes, and they have the properties of inheritance and polymorphism. A framework of constraints and operations based on an object-oriented data structure can therefore be easily understood and implemented. Constraints and operations based on an object-oriented database are proposed, and frameworks for generalization operations, constraints, and relations among objects based on an object-oriented data structure are designed. The paper concentrates on categorical database generalization.
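The idea of holding constraints as attributes and generalization operations as methods, with inheritance and polymorphism, can be sketched as follows. This is a hypothetical illustration only: the class names, the minimum-area constraint, and the operation labels are assumptions chosen for the example and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class GeoObject:
    """Base class: the constraint is an attribute, the operation is a method."""
    name: str
    area: float
    min_area: float = 100.0  # constraint (assumed): smallest area to keep

    def satisfies_constraints(self) -> bool:
        return self.area >= self.min_area

    def generalize(self) -> str:
        """Default operation: drop objects that violate the constraint."""
        return "keep" if self.satisfies_constraints() else "eliminate"


@dataclass
class Building(GeoObject):
    """Subclass: inherits the constraint check, overrides value and operation."""
    min_area: float = 50.0  # a looser constraint for this class

    def generalize(self) -> str:
        # Polymorphism: small buildings are merged rather than dropped.
        return "keep" if self.satisfies_constraints() else "aggregate"


# The same call yields class-specific behavior.
objects = [GeoObject("pond", 30.0), Building("shed", 30.0), Building("hall", 80.0)]
decisions = [obj.generalize() for obj in objects]
```

Because each class carries its own constraint values and operation, adding a new object category only requires a new subclass, which is the maintainability argument the abstract makes for the object-oriented framework.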
Abstract: Based on the approach of implementing a deductive object-oriented database system on top of an underlying relational database, this paper presents an object reasoning language, O-Datalog, which extends Datalog in form and can deal with object-oriented data. For any O-Datalog program, an equivalent Datalog program can be built to help evaluate the original program. This paper focuses on the syntax, semantics, and evaluation of O-Datalog.
Funding: Supported by continuation funds from the Turku Collegium for Science, Medicine and Technology, the Japan Society for the Promotion of Science (#23K08670), and the Sigrid Jusélius Foundation (#230131). MF-R's internship at the University of Turku was funded by the Erasmus+ program.
Abstract: The bone extracellular matrix (ECM) contains minerals deposited on highly crosslinked collagen fibrils and hundreds of non-collagenous proteins. Some of these proteins are key to the regulation of bone formation and regeneration via signaling pathways, and play important regulatory and structural roles. However, the complete list of bone ECM proteins, their roles, and the extent of individual and cross-species variations have not been fully captured in either humans or model organisms. Here, we introduce the most comprehensive resource of bone ECM proteins, which can be used in research fields such as bone regeneration, osteoporosis, and mechanobiology. The Phylobone database (available at https://phylobone.com) includes 255 proteins potentially expressed in the bone ECM of humans and 30 species of vertebrates. A bioinformatics pipeline was used to identify the evolutionary relationships of bone ECM proteins. The analysis facilitated the identification of potential model organisms for studying the molecular mechanisms of bone regeneration. A network analysis showed high connectivity of bone ECM proteins. A total of 214 functional protein domains were identified, including collagen and the domains involved in bone formation and resorption. Information from public drug repositories was used to identify potential repurposing of existing drugs. The Phylobone database provides a platform for studying bone regeneration and osteoporosis in light of (biological) evolution, and will substantially contribute to the identification of molecular mechanisms and drug targets.
Funding: Supported by a Theme-based Research Scheme grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (T21-705/20-N).
Abstract: Antibiotic resistance, which is encoded by antibiotic-resistance genes (ARGs), has proliferated to become a growing threat to public health around the world. With technical advances, especially the popularization of metagenomic sequencing, scientists have gained the ability to decipher the profiles of ARGs in diverse samples with high accuracy and at an accelerated speed. To analyze thousands of ARGs in a high-throughput way, standardized and integrated pipelines are needed. The new version (v3.0) of the widely used ARGs online analysis pipeline (ARGs-OAP) makes significant improvements to both the reference database (the structured ARG database, SARG) and the integrated analysis pipeline. SARG has been enhanced with sequence curation to improve annotation reliability, incorporate emerging resistance genotypes, and determine rigorous mechanism classification. The database has been further organized and visualized online as a tree-like structure with a dictionary. It has also been divided into sub-databases for different application scenarios. In addition, the ARGs-OAP has been improved with adjusted quantification methods, simplified tool implementation, and multiple functions that support user-defined reference databases. Moreover, the online platform now provides a diverse biostatistical analysis workflow with visualization packages for the efficient interpretation of ARG profiles. The ARGs-OAP v3.0, with its improved database and analysis pipeline, will benefit academia, governmental management, and consultation regarding risk assessment of the environmental prevalence of ARGs.
Funding: Supported by the Zhejiang Provincial Natural Science Foundation of China (grant no. LQ22H150005).
Abstract: Background: Skin aging has recently gained significant attention in both society and skin care research. Understanding the biological processes of photoaging caused by long-term skin exposure to ultraviolet radiation is critical for preventing and treating skin aging. Therefore, it is important to identify genes related to skin photoaging and shed light on their functions. Methods: We used data from the Gene Expression Omnibus (GEO) database and conducted bioinformatics analyses to screen and extract microRNAs (miRNAs) and their downstream target genes related to skin photoaging, and to determine possible biological mechanisms of skin photoaging. Results: A total of 34 differentially expressed miRNAs and their downstream target genes potentially related to the biological process of skin photoaging were identified. Gene Ontology enrichment analysis and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analysis showed that these target genes were enriched in pathways related to human papillomavirus infection, extracellular matrix (ECM)-receptor signaling, estrogen receptor, skin development, epidermal development, epidermal cell differentiation, keratinocyte differentiation, structural components of the ECM, structural components of the skin epidermis, and others. Conclusion: Based on the GEO database-derived findings, we determined that two miRNA-target gene pairs, namely miR-4667-5P-KRT79 and miR-139-5P-FOS, play an important role in skin photoaging. These observations could provide theoretical support and guidance for further research on skin aging-related biological processes.