All-solid-state batteries (ASSBs) are a class of safer, higher-energy-density devices than conventional batteries, and solid-state electrolytes (SSEs) are their essential components. To date, the search for high-ion-conducting SSEs has attracted broad attention. However, obtaining SSEs with high ionic conductivity remains challenging owing to their complex structures and the under-explored structure-performance relationship. To address these challenges, a database of typical SSEs compiled from available experimental reports offers a new avenue for understanding structure-performance relationships and deriving design guidelines for promising SSEs. Herein, a dynamic experimental database containing >600 materials was developed over a wide temperature range (132.40–1261.60 K), covering mono- and divalent cations (e.g., Li^(+), Na^(+), K^(+), Ag^(+), Ca^(2+), Mg^(2+), and Zn^(2+)) and various types of anions (e.g., halide, hydride, sulfide, and oxide). Data mining was conducted to explore the relationships among different variables (e.g., transport ion, composition, activation energy, and conductivity). Overall, we expect this database to provide essential guidelines for the design and development of high-performance SSEs for ASSB applications. The database is dynamically updated and can be accessed via our open-source online system.
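The activation energy-conductivity relationship mined from such records is conventionally modeled by the Arrhenius law. A minimal sketch of how temperature, conductivity, and activation energy interrelate (the function names and numbers are illustrative, not taken from the database):

```python
import math

KB_EV = 8.617333262e-5  # Boltzmann constant, eV/K

def arrhenius_conductivity(sigma0, ea_ev, temp_k):
    """Ionic conductivity from the Arrhenius law:
    sigma(T) = (sigma0 / T) * exp(-Ea / (kB * T))."""
    return (sigma0 / temp_k) * math.exp(-ea_ev / (KB_EV * temp_k))

def activation_energy(t1, s1, t2, s2):
    """Recover Ea (eV) from two (T, sigma) points via the linearized form
    ln(sigma * T) = ln(sigma0) - Ea / (kB * T)."""
    return KB_EV * math.log((s1 * t1) / (s2 * t2)) / (1.0 / t2 - 1.0 / t1)
```

In practice one fits ln(σT) against 1/T over all of a material's temperature points rather than just two, which is how activation energies in such databases are usually reported.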
Analyzing polysorbate 20 (PS20) composition and the impact of each component on stability and safety is crucial because of formulation variations and differences in individual tolerance. The similar structures and polarities of PS20 components make accurate separation, identification, and quantification challenging. In this work, a high-resolution quantitative method was developed using one-dimensional high-performance liquid chromatography (HPLC) with charged aerosol detection (CAD) to separate 18 key components containing multiple esters. The separated components were characterized by ultra-high-performance liquid chromatography-quadrupole time-of-flight mass spectrometry (UHPLC-Q-TOF-MS) using a gradient identical to that of the HPLC-CAD analysis. The polysorbate compound database and library were expanded more than 7-fold compared with the commercial database. The method was applied to PS20 samples of various origins and grades intended for different dosage forms to evaluate the composition-process relationship. UHPLC-Q-TOF-MS identified 1329 to 1511 compounds in 4 batches of PS20 from different sources. The method also revealed the impact of 4 degradation conditions on peak components, identifying stable components and their tendencies to change. Together, the HPLC-CAD and UHPLC-Q-TOF-MS results provided insights into fingerprint differences and distinguished quasi products.
Discovery of materials using "bottom-up" or "top-down" approaches is of great interest in materials science. Layered materials consisting of two-dimensional (2D) building blocks provide a good platform for exploring new materials in this respect. In van der Waals (vdW) layered materials, these building blocks are charge-neutral and can be isolated from their bulk phase (top-down), but usually must be grown on a substrate. In ionic layered materials, the building blocks are charged and usually cannot exist independently, but they can serve as motifs for constructing new materials (bottom-up). In this paper, we introduce our recently constructed databases of 2D material-substrate interfaces (2DMSI) and 2D charged building blocks. For the 2DMSI database, we systematically built a workflow that predicts appropriate substrates and the geometries of 2D materials on them. For the 2D charged building-block database, 1208 entries were identified from a bulk-material database; information on crystal structure, valence state, source, dimension, and so on is provided for each entry in JSON format. We also demonstrate the databases' application in designing and searching for new functional layered materials. The 2DMSI database, the building-block database, and the designed layered materials are available in the Science Data Bank at https://doi.org/10.57760/sciencedb.j00113.00188.
Database systems have consistently been prime targets for cyber-attacks and threats because of the critical nature of the data they store. Despite the increasing reliance on database management systems, this field continues to face numerous cyber-attacks. Database management systems serve as the foundation of any information system or application, and any cyber-attack can result in significant damage to the database system and the loss of sensitive data. Consequently, cyber risk classification and assessment play a crucial role in risk management and establish an essential framework for identifying and responding to cyber threats. Risk assessment aids in understanding the impact of cyber threats and in developing appropriate security controls to mitigate risks. The primary objective of this study is to conduct a comprehensive analysis of cyber risks in database management systems, including the classification of threats, vulnerabilities, impacts, and countermeasures. This classification helps identify suitable security controls to mitigate cyber risks for each type of threat. Additionally, this research explores technical countermeasures to protect database systems from cyber threats. The study employs the content analysis method to collect, analyze, and classify data in terms of types of threats, vulnerabilities, and countermeasures. The results indicate that SQL injection attacks and denial-of-service (DoS) attacks were the most prevalent technical threats in database systems, each accounting for 9% of incidents. Vulnerable audit trails, intrusion attempts, and ransomware attacks were classified as the second tier of technical threats, comprising 7% and 5% of incidents, respectively. Furthermore, the findings reveal that insider threats were the most common non-technical threats in database systems, accounting for 5% of incidents. Moreover, the results indicate that weak authentication, unpatched databases, weak audit trails, and multiple usage of an account were the most common technical vulnerabilities, each accounting for 9% of vulnerabilities. Additionally, software bugs, insecure coding practices, weak security controls, insecure networks, password misuse, weak encryption practices, and weak data masking were classified as the second tier of security vulnerabilities, each accounting for 4% of vulnerabilities. The findings from this work can assist organizations in understanding the types of cyber threats and in developing robust strategies against cyber-attacks.
Advanced glycation end-products (AGEs) are a group of heterogeneous compounds formed in heat-processed foods and are proven to be detrimental to human health. Currently, there is no comprehensive database of AGEs in foods that covers the entire range of food categories, which limits the accurate risk assessment of dietary AGEs in human diseases. In this study, we first established an isotope-dilution UHPLC-QqQ-MS/MS-based method for the simultaneous quantification of 10 major AGEs in foods. The contents of these AGEs were determined in 334 foods covering all main groups consumed by Western and Chinese populations. Nε-Carboxymethyllysine, methylglyoxal-derived hydroimidazolone isomers, and glyoxal-derived hydroimidazolone-1 are the predominant AGEs found in most foodstuffs. Total amounts of AGEs were high in processed nuts, bakery products, and certain types of cereals and meats (>150 mg/kg), but low in dairy products, vegetables, fruits, and beverages (<40 mg/kg). Assessment of the estimated daily intake implied that the contribution of food groups to daily AGE intake varies considerably across eating patterns, and selecting high-AGE foods leads to an up to 2.7-fold higher intake of AGEs through daily meals. The presented AGE database allows accurate assessment of dietary exposure to these glycotoxins to explore their physiological impacts on human health.
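The estimated-daily-intake assessment described above reduces to summing AGE content times daily consumption over food groups. A minimal sketch; the food contents and portion sizes below are hypothetical illustrations (loosely in the mg/kg ranges quoted), not the study's data:

```python
def estimated_daily_intake(content_mg_per_kg, intake_kg_per_day):
    """Estimated daily AGE intake (mg/day): sum over food groups of
    AGE content (mg/kg) times daily consumption (kg/day)."""
    return sum(content_mg_per_kg[food] * intake_kg_per_day.get(food, 0.0)
               for food in content_mg_per_kg)

# Hypothetical AGE contents (mg/kg) and two hypothetical eating patterns (kg/day)
contents = {"processed nuts": 180.0, "bakery": 160.0, "vegetables": 30.0, "dairy": 20.0}
high_age_diet = {"processed nuts": 0.05, "bakery": 0.15, "vegetables": 0.1, "dairy": 0.2}
low_age_diet = {"bakery": 0.05, "vegetables": 0.3, "dairy": 0.3}
```

Comparing the two patterns shows how food selection alone drives the multi-fold intake differences the study reports.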
The EU's Artificial Intelligence Act (AI Act) imposes requirements for the privacy compliance of AI systems. AI systems must comply with privacy laws such as the GDPR when providing services, and these laws grant users the right to issue a Data Subject Access Request (DSAR). Responding to such requests requires database administrators to accurately identify information related to an individual. However, manual compliance poses significant challenges and is error-prone, as database administrators must write queries through time-consuming labor. The demand for large amounts of data by AI systems has driven the development of NoSQL databases, and the flexible schema of NoSQL databases makes identifying personal information even more challenging. This paper develops an automated tool to identify personal information that can help organizations respond to DSARs. Our tool combines several techniques, including schema extraction from NoSQL databases and relationship identification from query logs. We describe the algorithm used by the tool, detailing how it discovers and extracts implicit relationships from NoSQL databases and generates relationship graphs that help developers accurately identify personal data. We evaluate the tool on three datasets covering different database designs, achieving F1 scores of 0.77 to 1. Experimental results demonstrate that the tool successfully identifies information relevant to the data subject, reduces manual effort, and simplifies GDPR compliance, showing practical value in enhancing the privacy performance of NoSQL databases and AI systems.
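The two core steps the tool combines, schema extraction from flexible-schema documents and relationship identification from query logs, can be sketched roughly as follows. This is a simplified illustration under assumed data shapes, not the paper's actual algorithm, and all names are hypothetical:

```python
def extract_schema(documents):
    """Union of field paths across a collection's documents (in a document
    store, each document may carry a different subset of fields)."""
    fields = set()
    for doc in documents:
        stack = [("", doc)]
        while stack:
            prefix, node = stack.pop()
            for key, value in node.items():
                path = f"{prefix}.{key}" if prefix else key
                fields.add(path)
                if isinstance(value, dict):
                    stack.append((path, value))  # recurse into nested documents
    return fields

def relationship_graph(query_log):
    """Adjacency map built from (source_collection, join_field,
    target_collection) triples observed in query logs."""
    graph = {}
    for source, field, target in query_log:
        graph.setdefault(source, set()).add((field, target))
    return graph
```

Walking such a graph from collections that hold direct identifiers is one way a DSAR responder could enumerate all records tied to one person.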
A data lake (DL) denotes a vast repository of data. It accumulates substantial volumes of data and employs advanced analytics to correlate data from diverse origins containing various forms of semi-structured, structured, and unstructured information. Such systems use a flat architecture and run different types of data analytics. NoSQL databases are non-tabular and store data differently than relational tables. They come in various forms, including key-value pairs, documents, wide columns, and graphs, each based on its own data model. They offer simpler scalability and generally outperform traditional relational databases. While NoSQL databases can store diverse data types, they lack full support for the atomicity, consistency, isolation, and durability (ACID) features found in relational databases. Consequently, machine learning approaches become necessary to categorize complex structured query language (SQL) queries. The results indicate that the most frequently used automatic classification technique for processing SQL queries on NoSQL databases is machine learning-based classification. Overall, this study provides an overview of the automatic classification techniques used in processing SQL queries on NoSQL databases; understanding these techniques can aid the development of effective and efficient NoSQL database applications.
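As a concrete (and deliberately tiny) instance of the kind of query classification surveyed here, the sketch below featurizes SQL text as keyword counts and classifies with a nearest-centroid rule. This is a generic stand-in, not any specific technique from the surveyed literature:

```python
from collections import defaultdict

KEYWORDS = ["select", "insert", "update", "delete", "join", "group"]

def featurize(query):
    """Bag-of-keywords feature vector for a SQL query string."""
    tokens = query.lower().split()
    return [tokens.count(k) for k in KEYWORDS]

def nearest_centroid(train_vecs, labels, sample):
    """Classify by squared Euclidean distance to per-class mean vectors."""
    sums = defaultdict(lambda: [0.0] * len(sample))
    counts = defaultdict(int)
    for vec, lab in zip(train_vecs, labels):
        counts[lab] += 1
        for i, v in enumerate(vec):
            sums[lab][i] += v
    best, best_d = None, float("inf")
    for lab, s in sums.items():
        centroid = [v / counts[lab] for v in s]
        d = sum((a - b) ** 2 for a, b in zip(centroid, sample))
        if d < best_d:
            best, best_d = lab, d
    return best
```

Real systems use richer features (query structure, parse trees) and stronger models, but the featurize-then-classify shape is the same.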
BACKGROUND: Elective cholecystectomy (CCY) is recommended for patients with gallstone-related acute cholangitis (AC) following endoscopic decompression to prevent recurrent biliary events. However, the optimal timing and implications of CCY remain unclear. AIM: To examine the impact of same-admission CCY compared with interval CCY in patients with gallstone-related AC using the National Readmission Database (NRD). METHODS: We queried the NRD to identify all gallstone-related AC hospitalizations in adult patients with and without same-admission CCY between 2016 and 2020. Our primary outcome was the all-cause 30-day readmission rate; secondary outcomes included in-hospital mortality, length of stay (LOS), and hospitalization cost. RESULTS: Among the 124964 gallstone-related AC hospitalizations, only 14.67% underwent same-admission CCY. The all-cause 30-day readmission rate in the same-admission CCY group was almost half that of the non-CCY group (5.56% vs 11.50%). Patients in the same-admission CCY group had a longer mean LOS and higher hospitalization costs attributable to surgery. Although the most common reason for readmission was sepsis in both groups, the second most common reason was AC in the interval CCY group. CONCLUSION: Our study suggests that patients with gallstone-related AC who do not undergo same-admission CCY have twice the risk of readmission compared with those who undergo CCY during the same admission. These readmissions can potentially be prevented by performing same-admission CCY in appropriate patients, which may reduce subsequent hospitalization costs secondary to readmissions.
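The headline "twice the risk" conclusion follows directly from the two readmission rates reported above; a small sketch of the arithmetic (only the 5.56% and 11.50% figures come from the abstract):

```python
def readmission_rate_pct(readmitted, total):
    """All-cause 30-day readmission rate, as a percentage of hospitalizations."""
    return 100.0 * readmitted / total

def risk_ratio(rate_a, rate_b):
    """How many times riskier group A is than group B."""
    return rate_a / rate_b

# Rates reported in the abstract: 11.50% without same-admission CCY vs 5.56% with it
rr = risk_ratio(11.50, 5.56)  # ~2.07, i.e. roughly twice the risk
```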
The CALPHAD thermodynamic databases are very useful for analyzing the complex chemical reactions occurring in high-temperature materials processing. The FactSage thermodynamic database can be used to calculate complex phase diagrams and equilibrium phases involving refractories in industrial processes. In this study, the FactSage thermodynamic database relevant to ZrO_(2)-based refractories was reviewed, and the application of the database to understanding the corrosion of continuous-casting nozzle refractories in steelmaking was presented.
This study examines the database search behaviors of individuals, focusing on gender differences and the impact of planning habits on information retrieval. Data were collected from a survey of 198 respondents, categorized by their discipline, schooling background, internet usage, and information retrieval preferences. Key findings indicate that females are more likely to plan their searches in advance and prefer structured methods of information retrieval, such as using library portals and leading university websites. Males, however, tend to use web search engines and self-archiving methods more frequently. This analysis provides valuable insights for educational institutions and libraries to optimize their resources and services based on user behavior patterns.
The continuously updated database of failures and censored data for numerous products has become large, and information regarding the failure times is missing for some covariates. Because the dataset is large and has missing information, analysis becomes complicated and executing the programming code takes a long time. In such situations, the divide-and-recombine (D&R) approach, which offers practical computational performance for big-data analysis, can be applied. In this study, the D&R approach was applied to analyze real field data of an automobile component with incomplete covariate information using the Weibull regression model. Model parameters were estimated using the expectation-maximization algorithm. The results of the data analysis and simulation demonstrated that the D&R approach is applicable to such datasets. Further, the percentiles and reliability functions of the distribution under different covariate conditions were estimated to evaluate component performance under these covariates. The findings have managerial implications for design decisions and for the safety and reliability of automobile components.
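The D&R pattern itself (divide the data, fit each block independently, recombine the estimates) can be sketched as below. For brevity this uses a one-parameter exponential MLE as a stand-in for the paper's Weibull regression fitted via EM; everything here is an illustrative assumption:

```python
import random

def exp_rate_mle(times):
    """MLE of an exponential failure rate on one block (a one-parameter
    stand-in for a Weibull regression fit)."""
    return len(times) / sum(times)

def divide_and_recombine(data, n_blocks, fit):
    """D&R pattern: divide the data into blocks, fit each block independently
    (these fits could run in parallel), recombine by a size-weighted mean."""
    blocks = [data[i::n_blocks] for i in range(n_blocks)]
    fitted = [(len(b), fit(b)) for b in blocks if b]
    total = sum(n for n, _ in fitted)
    return sum(n * est for n, est in fitted) / total
```

The per-block fits never touch the full dataset at once, which is what makes the approach tractable for very large failure databases.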
With the rapid development of artificial intelligence, large language models (LLMs) have demonstrated remarkable capabilities in natural language understanding and generation. These models have great potential to enhance database query systems, enabling more intuitive and semantic query mechanisms. Our model leverages the LLM's deep-learning architecture to interpret natural language queries and translate them into accurate database queries. The system integrates an LLM-powered semantic parser that converts user input into structured queries the database management system can understand. First, the user query is pre-processed: the text is normalized and ambiguity is removed. Next comes semantic parsing, in which the LLM interprets the pre-processed text and identifies key entities and relationships. This is followed by query generation, which converts the parsed information into a structured query format tailored to the target database schema. Finally, the resulting query is executed on the database and the results are returned to the user. The system also provides feedback mechanisms to improve and optimize future query interpretations. By using advanced LLMs for model implementation and fine-tuning on diverse datasets, the experimental results show that the proposed method significantly improves the accuracy and usability of database queries, making data retrieval easy for users without specialized knowledge.
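The staged pipeline described (pre-processing, semantic parsing, query generation, execution) can be sketched as a toy. The rule-based parser below is a deterministic stand-in for the LLM call, and the schema and names are hypothetical:

```python
import re

def preprocess(text):
    """Normalize the user query: trim, lowercase, collapse whitespace."""
    return re.sub(r"\s+", " ", text.strip().lower())

def semantic_parse(text, schema):
    """Stand-in for the LLM parser: pick out a known table and column
    mentioned in the text (a real system would call an LLM here)."""
    table = next((t for t in schema if t in text), None)
    column = next((c for c in schema.get(table, []) if c in text), "*")
    return {"table": table, "column": column}

def generate_sql(parsed):
    """Render the parsed entities as a structured query for execution."""
    return f"SELECT {parsed['column']} FROM {parsed['table']}"
```

Swapping `semantic_parse` for an actual LLM call (with the schema in the prompt) is the step where the described system differs from this sketch.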
Objective: Red blood cell distribution width (RDW) has been utilized as a prognostic indicator for mortality risk assessment in cardiovascular and cerebrovascular patients. Nevertheless, the prognostic significance of RDW in critically ill patients with cerebral infarction has yet to be investigated. The objective of this study is to examine the association between RDW and the risk of all-cause mortality in cerebral infarction patients admitted to the intensive care unit (ICU). Method: A retrospective cohort study was conducted using the Medical Information Mart for Intensive Care IV 2.2 (MIMIC-IV) intensive care dataset. The main outcomes were all-cause mortality rates at 3 and 12 months of follow-up. Cumulative curves were plotted using the Kaplan-Meier method, and Cox proportional hazards analysis was used to examine the relationship between RDW and mortality in critically ill cerebral infarction patients. Results: The findings indicate that RDW serves as a significant prognostic factor for mortality risk in critically ill stroke patients at the 3- and 12-month follow-up periods. The observed correlation between increasing RDW levels and higher mortality among cerebral infarction patients further supports the potential utility of RDW as a predictive indicator. Conclusion: RDW emerges as an independent predictor of mortality risk during the 3- and 12-month follow-up periods for critically ill patients with cerebral infarction.
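The Kaplan-Meier cumulative curves used in such studies come from the product-limit estimator; a minimal self-contained sketch (not the MIMIC-IV analysis code, and the times in the test are made up):

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate: at each observed event time t,
    S(t) *= (1 - d/n), where d = deaths at t and n = subjects still at risk.
    `events[i]` is 1 for a death at times[i], 0 for censoring.
    Returns [(time, survival)] pairs at event times."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    survival, curve = 1.0, []
    i = 0
    while i < len(order):
        t = times[order[i]]
        deaths = censored = 0
        while i < len(order) and times[order[i]] == t:  # group ties at time t
            if events[order[i]]:
                deaths += 1
            else:
                censored += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= deaths + censored
    return curve
```

Comparing such curves across RDW strata, then adjusting with a Cox model, is the standard two-step pattern the abstract describes.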
The college innovation and entrepreneurship program is a powerful means of enhancing students' innovation and entrepreneurship skills. Evaluating the maturity of innovation and entrepreneurship projects can stimulate students' enthusiasm and initiative to participate. Utilizing computer database technology for maturity evaluation makes the process more efficient, accurate, and convenient, aligning with the needs of the information age. Exploring strategies for applying computer database technology to the maturity evaluation of innovation and entrepreneurship projects offers valuable insights and directions for developing these projects, while also providing strong support for enhancing students' innovation and entrepreneurship abilities.
With the continuous development of computer network technology, its applications in daily life and work have become increasingly widespread, greatly improving efficiency. However, certain security risks remain. To ensure the security of computer networks and databases, it is essential to enhance both through technological optimization, improved management practices, optimized data-processing methods, and comprehensive laws and regulations. This paper analyzes the current security risks in computer networks and databases and proposes corresponding solutions, offering reference points for relevant personnel.
In typical Wi-Fi-based indoor positioning systems employing the fingerprint model, plentiful fingerprints need to be collected by trained experts or technicians, which raises labor costs and restricts the systems' adoption. In this paper, a novel approach based on crowd paths is presented to solve this problem: it automatically collects and constructs fingerprint databases for anonymous buildings from ordinary crowd customers. However, accuracy may degrade because crowd customers are neither professionally trained nor specially equipped. Therefore, we define two concepts, fixed landmarks and hint landmarks, to rectify the fingerprint database in a practical system: common corridor crossing points serve as fixed landmarks, and crossing points among different crowd paths serve as hint landmarks. Machine-learning techniques are utilized for short-range approximation around fixed landmarks, and fuzzy-logic decision technology is applied to search for hint landmarks in the crowd-trace space. In addition, a particle filter algorithm is introduced to smooth the sample points in crowd paths. We implemented the approach on off-the-shelf smartphones and evaluated its performance. Experimental results indicate that the approach can effectively construct a Wi-Fi fingerprint database without reducing localization accuracy.
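The particle-filter smoothing step can be illustrated with a bootstrap filter over a one-dimensional track of noisy position fixes. The random-walk motion model and all parameters below are illustrative assumptions, not the paper's configuration:

```python
import math
import random

def particle_filter_smooth(observations, n_particles=500, motion_std=1.0, obs_std=2.0):
    """Bootstrap particle filter over a 1-D track: predict with a random-walk
    motion model, weight particles by Gaussian observation likelihood,
    estimate by the weighted mean, then resample."""
    random.seed(42)  # fixed seed so the sketch is reproducible
    particles = [observations[0] + random.gauss(0, obs_std) for _ in range(n_particles)]
    smoothed = []
    for z in observations:
        # Predict: propagate each particle through the motion model
        particles = [p + random.gauss(0, motion_std) for p in particles]
        # Update: weight by likelihood of the current observation z
        weights = [math.exp(-((z - p) ** 2) / (2 * obs_std ** 2)) for p in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Estimate, then resample proportionally to the weights
        smoothed.append(sum(w * p for w, p in zip(weights, particles)))
        particles = random.choices(particles, weights=weights, k=n_particles)
    return smoothed
```

In the described system the same filtering idea runs over 2-D crowd-path samples with a pedestrian motion model rather than this toy random walk.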
Antibiotic resistance, which is encoded by antibiotic-resistance genes (ARGs), has proliferated to become a growing threat to public health around the world. With technical advances, especially the popularization of metagenomic sequencing, scientists have gained the ability to decipher the profiles of ARGs in diverse samples with high accuracy at an accelerated speed. To analyze thousands of ARGs in a high-throughput way, standardized and integrated pipelines are needed. The new version (v3.0) of the widely used online ARG analysis pipeline (ARGs-OAP) brings significant improvements to both the reference database, the structured ARG (SARG) database, and the integrated analysis pipeline. SARG has been enhanced with sequence curation to improve annotation reliability, incorporate emerging resistance genotypes, and establish rigorous mechanism classification. The database has been further organized and visualized online as a tree-like structure with a dictionary, and it has been divided into sub-databases for different application scenarios. In addition, ARGs-OAP has been improved with adjusted quantification methods, simplified tool implementation, and multiple functions supporting user-defined reference databases. Moreover, the online platform now provides a diverse biostatistical analysis workflow with visualization packages for the efficient interpretation of ARG profiles. ARGs-OAP v3.0, with its improved database and analysis pipeline, will benefit academia, governmental management, and consultation regarding risk assessment of the environmental prevalence of ARGs.
The bone extracellular matrix (ECM) contains minerals deposited on highly crosslinked collagen fibrils and hundreds of noncollagenous proteins. Some of these proteins are key to the regulation of bone formation and regeneration via signaling pathways and play important regulatory and structural roles. However, the complete list of bone ECM proteins, their roles, and the extent of individual and cross-species variation have not been fully captured in humans or model organisms. Here, we introduce the most comprehensive resource of bone ECM proteins for use in research fields such as bone regeneration, osteoporosis, and mechanobiology. The Phylobone database (available at https://phylobone.com) includes 255 proteins potentially expressed in the bone ECM of humans and 30 vertebrate species. A bioinformatics pipeline was used to identify the evolutionary relationships of bone ECM proteins. The analysis facilitated the identification of potential model organisms for studying the molecular mechanisms of bone regeneration. A network analysis showed high connectivity among bone ECM proteins. A total of 214 functional protein domains were identified, including collagen and domains involved in bone formation and resorption. Information from public drug repositories was used to identify potential repurposing of existing drugs. The Phylobone database provides a platform to study bone regeneration and osteoporosis in light of (biological) evolution and will substantially contribute to the identification of molecular mechanisms and drug targets.
This paper compares the performance of CO_(2) conversion by plasma and plasma-assisted catalysis based on data collected from the literature in this field, organized in an open-access online database. The tool is open to all users to carry out their own analyses, and also to contributors who wish to add their data, improving the relevance of the comparisons made and, ultimately, the efficiency of CO_(2) conversion by plasma-catalysis. The creation of this database and its user interface is motivated by the fact that plasma-catalysis is a fast-growing field for all CO_(2) conversion processes, be it methanation, dry reforming of methane, methanolisation, or others. As a result of this rapid growth, there is a need for standard procedures to rigorously compare the performance of different systems. This is currently not possible, because the fundamental mechanisms of plasma-catalysis are still too poorly understood to define such procedures. Fortunately, however, the data accumulated within the CO_(2) plasma-catalysis community has become large enough to warrant the kind of "big data" studies more familiar in medicine and the social sciences. To enable comparisons between multiple data sets and make future research more effective, this work proposes the first database of CO_(2) conversion performance by plasma-catalysis open to the whole community. The database was initiated in the framework of a H2020 European project and is called the "PIONEER Data Base". It gathers a large amount of CO_(2) conversion performance data, such as conversion rate, energy efficiency, and selectivity, for numerous plasma sources coupled with or without a catalyst. Each data set is associated with metadata describing the gas mixture, the plasma source, the nature of the catalyst, and the form of coupling with the plasma. Beyond the database itself, a data extraction tool with direct visualization features and advanced filtering functionalities has been developed and is available online to the public. Simple and fast visualization of the state of the art puts new results into context, identifies gaps in the data, and consequently points toward promising research routes. More advanced data extraction illustrates the impact the database can have on the understanding of plasma-catalyst coupling. Lessons learned from reviewing a large amount of literature during the setup of the database lead to best-practice advice for increasing comparability between future CO_(2) plasma-catalytic studies. Finally, the community is strongly encouraged to contribute to the database, not only to increase the visibility of their data but also to improve the relevance of the comparisons this tool allows.
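The performance figures the database stores (conversion rate, energy efficiency) are commonly related through the specific energy input. A sketch using definitions standard in the field; the 283 kJ/mol enthalpy for CO2 → CO + ½O2 is a literature value, and the numbers in the test are illustrative:

```python
def conversion(co2_in, co2_out):
    """Fractional CO2 conversion from inlet and outlet molar flows."""
    return (co2_in - co2_out) / co2_in

def specific_energy_input(power_w, flow_mol_per_s):
    """Plasma energy deposited per mole of gas fed (J/mol)."""
    return power_w / flow_mol_per_s

def energy_efficiency(conv, sei_j_per_mol, dh_j_per_mol=283000.0):
    """Energy efficiency: chemical energy stored (conversion times the
    splitting enthalpy, ~283 kJ/mol for CO2 -> CO + 1/2 O2) over the
    energy invested per mole."""
    return conv * dh_j_per_mol / sei_j_per_mol
```

Reporting all three quantities together is exactly what makes cross-paper comparisons in such a database possible, since conversion alone says nothing about the energy cost.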
CHDTEPDB (URL: http://chdtepdb.com/) is a manually integrated database for congenital heart disease (CHD) that stores CHD expression profiling data derived from published papers, aiming to provide rich resources for investigating the deeper correlation between human CHD and aberrant transcriptome expression. The development of human diseases involves important regulatory roles of RNAs, and expression profiling data can reflect the underlying etiology of inherited diseases. Hence, collecting and compiling expression profiling data is of critical significance for a comprehensive understanding of the mechanisms and functions that underpin genetic diseases. CHDTEPDB stores the expression profiles of over 200 sets covering 7 types of CHD and provides users with convenient basic analytical functions. Because of differences in clinical indicators such as disease type, and unavoidable detection errors among datasets, users can customize their selection of data for personalized analysis. Moreover, we provide a submission page through which researchers can submit their own data, so that additional expression profiles and other histological data can be added to the database. CHDTEPDB offers a user-friendly interface that allows users to quickly browse, retrieve, download, and analyze their target samples. CHDTEPDB will significantly improve current knowledge of expression profiling data in CHD and has the potential to serve as an important tool for future research on the disease.
Funding: Supported by the Ensemble Grant for Early Career Researchers 2022 and the 2023 Ensemble Continuation Grant of Tohoku University, the Hirose Foundation, the Iwatani Naoji Foundation, and the AIMR Fusion Research Grant; supported by JSPS KAKENHI Nos. JP23K13599, JP23K13703, JP22H01803, and JP18H05513; the Center for Computational Materials Science, Institute for Materials Research, Tohoku University for the use of MASAMUNE-IMR (Nos. 202212-SCKXX-0204 and 202208-SCKXX-0212); the Institute for Solid State Physics (ISSP) at the University of Tokyo for the use of their supercomputers; and the China Scholarship Council (CSC) fund to pursue studies in Japan.
Abstract: All-solid-state batteries (ASSBs) are a class of safer, higher-energy-density devices than conventional batteries, in which solid-state electrolytes (SSEs) are the essential components. To date, the search for SSEs with high ionic conductivity has attracted broad attention. However, obtaining such SSEs remains challenging because of their complex structures and the under-explored structure-performance relationship. To address these challenges, a database of typical SSEs compiled from available experimental reports offers a new avenue for understanding structure-performance relationships and deriving design guidelines for rational SSEs. Herein, a dynamic experimental database containing >600 materials was developed over a wide temperature range (132.40–1261.60 K), covering mono- and divalent cations (e.g., Li^(+), Na^(+), K^(+), Ag^(+), Ca^(2+), Mg^(2+), and Zn^(2+)) and various types of anions (e.g., halide, hydride, sulfide, and oxide). Data mining was conducted to explore the relationships among different variables (e.g., transport ion, composition, activation energy, and conductivity). Overall, we expect that this database can provide essential guidelines for the design and development of high-performance SSEs for ASSB applications. The database is dynamically updated and can be accessed via our open-source online system.
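Conductivity-temperature pairs of the kind collected in such a database are conventionally related to the activation energy through the Arrhenius law, σT = A·exp(−Ea/kBT). As a rough illustration (not code from the paper), a least-squares line through (1/T, ln σT) recovers Ea:

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant, eV/K

def activation_energy(temps_k, sigmas):
    """Arrhenius fit: sigma*T = A*exp(-Ea/(kB*T)), so a least-squares
    line through (1/T, ln(sigma*T)) has slope -Ea/kB."""
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(s * t) for s, t in zip(sigmas, temps_k)]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope * K_B  # activation energy in eV

# Synthetic conductivities generated with Ea = 0.30 eV and A = 1e-4
temps = [300.0, 350.0, 400.0, 450.0]
sigmas = [1e-4 / t * math.exp(-0.30 / (K_B * t)) for t in temps]
ea = activation_energy(temps, sigmas)  # recovers ~0.30 eV
```

The same fit applied to measured (T, σ) rows of the database would yield the tabulated activation energies up to experimental scatter.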
Funding: Financial support from the Science Research Program Project for Drug Regulation, Jiangsu Drug Administration, China (Grant No.: 202207); the National Drug Standards Revision Project, China (Grant No.: 2023Y41); the National Natural Science Foundation of China (Grant No.: 22276080); and the Foreign Expert Project, China (Grant No.: G2022014096L).
Abstract: Analyzing the composition of polysorbate 20 (PS20) and the impact of each component on stability and safety is crucial because of formulation variations and differences in individual tolerance. The similar structures and polarities of PS20 components make accurate separation, identification, and quantification challenging. In this work, a high-resolution quantitative method was developed using one-dimensional high-performance liquid chromatography (HPLC) with charged aerosol detection (CAD) to separate 18 key components with multiple esters. The separated components were characterized by ultra-high-performance liquid chromatography-quadrupole time-of-flight mass spectrometry (UHPLC-Q-TOF-MS) using the same gradient as the HPLC-CAD analysis. The polysorbate compound database and library were expanded more than 7-fold compared with the commercial database. The method was used to investigate differences among PS20 samples of various origins and grades intended for different dosage forms, to evaluate the composition-process relationship. UHPLC-Q-TOF-MS identified 1329 to 1511 compounds in 4 batches of PS20 from different sources. The method also revealed the impact of 4 degradation conditions on the separated components, identifying stable components and the tendencies of others to change. Together, the HPLC-CAD and UHPLC-Q-TOF-MS results provided insights into fingerprint differences and distinguished quasi-products.
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 61888102, 52272172, and 52102193); the Major Program of the National Natural Science Foundation of China (Grant No. 92163206); the National Key Research and Development Program of China (Grant Nos. 2021YFA1201501 and 2022YFA1204100); the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB30000000); and the Fundamental Research Funds for the Central Universities.
Abstract: Discovery of materials using a “bottom-up” or “top-down” approach is of great interest in materials science. Layered materials consisting of two-dimensional (2D) building blocks provide a good platform for exploring new materials in this respect. In van der Waals (vdW) layered materials, these building blocks are charge neutral and can be isolated from their bulk phase (top-down), but they usually grow on substrates. In ionic layered materials, the blocks are charged and usually cannot exist independently, but they can serve as motifs to construct new materials (bottom-up). In this paper, we introduce our recently constructed databases for 2D material-substrate interfaces (2DMSI) and 2D charged building blocks. For the 2DMSI database, we systematically built a workflow to predict appropriate substrates and the geometries materials adopt on them, and constructed the database accordingly. For the 2D charged building block database, 1208 entries were identified from a bulk material database. Information on crystal structure, valence state, source, dimension, and so on is provided for each entry in JSON format. We also show its application in designing and searching for new functional layered materials. The 2DMSI database, building block database, and designed layered materials are available in the Science Data Bank at https://doi.org/10.57760/sciencedb.j00113.00188.
Funding: Supported by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia (Grant No. KFU242068).
Abstract: Database systems have consistently been prime targets for cyber-attacks and threats due to the critical nature of the data they store. Despite the increasing reliance on database management systems, this field continues to face numerous cyber-attacks. Database management systems serve as the foundation of any information system or application, and any cyber-attack can result in significant damage to the database system and the loss of sensitive data. Consequently, cyber risk classifications and assessments play a crucial role in risk management and establish an essential framework for identifying and responding to cyber threats. Risk assessment aids in understanding the impact of cyber threats and in developing appropriate security controls to mitigate risks. The primary objective of this study is to conduct a comprehensive analysis of cyber risks in database management systems, including the classification of threats, vulnerabilities, impacts, and countermeasures. This classification helps identify suitable security controls to mitigate cyber risks for each type of threat. Additionally, this research explores technical countermeasures to protect database systems from cyber threats. The study employs the content analysis method to collect, analyze, and classify data in terms of types of threats, vulnerabilities, and countermeasures. The results indicate that SQL injection attacks and Denial of Service (DoS) attacks were the most prevalent technical threats in database systems, each accounting for 9% of incidents. Vulnerable audit trails, intrusion attempts, and ransomware attacks were classified as the second level of technical threats, comprising 7% and 5% of incidents, respectively. Furthermore, the findings reveal that insider threats were the most common non-technical threats, accounting for 5% of incidents. Moreover, the results indicate that weak authentication, unpatched databases, weak audit trails, and multiple uses of a single account were the most common technical vulnerabilities, each accounting for 9% of vulnerabilities. Additionally, software bugs, insecure coding practices, weak security controls, insecure networks, password misuse, weak encryption practices, and weak data masking were classified as the second level of security vulnerabilities, each accounting for 4% of vulnerabilities. The findings from this work can assist organizations in understanding the types of cyber threats and in developing robust strategies against cyber-attacks.
Funding: Financial support received from the Natural Science Foundation of China (32202202 and 31871735).
Abstract: Advanced glycation end-products (AGEs) are a group of heterogeneous compounds formed in heat-processed foods and are proven to be detrimental to human health. Currently, there is no comprehensive database for AGEs in foods that covers the entire range of food categories, which limits accurate risk assessment of dietary AGEs in human diseases. In this study, we first established an isotope-dilution UHPLC-QqQ-MS/MS-based method for the simultaneous quantification of 10 major AGEs in foods. The contents of these AGEs were determined in 334 foods covering all main groups consumed in Western and Chinese populations. Nε-Carboxymethyllysine, methylglyoxal-derived hydroimidazolone isomers, and glyoxal-derived hydroimidazolone-1 are the predominant AGEs found in most foodstuffs. Total amounts of AGEs were high in processed nuts, bakery products, and certain types of cereals and meats (>150 mg/kg), and low in dairy products, vegetables, fruits, and beverages (<40 mg/kg). Assessment of the estimated daily intake implied that the contribution of food groups to daily AGE intake varies considerably across eating patterns, and selection of high-AGE foods leads to up to a 2.7-fold higher intake of AGEs through daily meals. The presented AGE database allows accurate assessment of dietary exposure to these glycotoxins to explore their physiological impacts on human health.
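The estimated-daily-intake assessment behind such a comparison is a content-times-consumption sum over the foods in a diet. A minimal sketch, with made-up AGE contents and portion sizes (the real values come from the 334-food database):

```python
def estimated_daily_intake(diet, age_content):
    """Estimated daily intake of AGEs (mg/day) for one eating pattern.
    diet: {food: grams consumed per day}
    age_content: {food: total AGE content, mg/kg} -- hypothetical values here."""
    return sum(grams / 1000.0 * age_content[food] for food, grams in diet.items())

# Illustrative numbers only, loosely following the reported ranges
age_mg_per_kg = {"roasted nuts": 160.0, "milk": 5.0, "bread": 150.0}
western_pattern = {"roasted nuts": 30, "milk": 250, "bread": 120}  # g/day

edi = estimated_daily_intake(western_pattern, age_mg_per_kg)  # mg/day
```

Swapping diets in and out of `estimated_daily_intake` is what makes pattern-to-pattern comparisons (e.g., the reported up to 2.7-fold spread) straightforward.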
Funding: Supported by the National Natural Science Foundation of China (No. 62302242) and the China Postdoctoral Science Foundation (No. 2023M731802).
Abstract: The EU’s Artificial Intelligence Act (AI Act) imposes requirements on the privacy compliance of AI systems. AI systems must comply with privacy laws such as the GDPR when providing services. These laws give users the right to issue a Data Subject Access Request (DSAR). Responding to such requests requires database administrators to accurately identify information related to an individual. However, manual compliance poses significant challenges and is error-prone, as administrators must write queries through time-consuming labor. The demand for large amounts of data by AI systems has driven the development of NoSQL databases, and the flexible schemas of NoSQL databases make identifying personal information even more challenging. This paper develops an automated tool that identifies personal information to help organizations respond to DSARs. Our tool combines several techniques, including schema extraction from NoSQL databases and relationship identification from query logs. We describe the algorithm used by the tool, detailing how it discovers and extracts implicit relationships from NoSQL databases and generates relationship graphs that help developers accurately identify personal data. We evaluate the tool on three datasets covering different database designs, achieving an F1 score of 0.77 to 1. Experimental results demonstrate that the tool successfully identifies information relevant to the data subject, reduces manual effort, and simplifies GDPR compliance, showing practical value in enhancing the privacy performance of NoSQL databases and AI systems.
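Schema extraction from schemaless documents and the resulting relationship graph can be sketched briefly. This is a simplification of the abstract's pipeline: field paths are flattened from sample documents, and collections sharing a field name are linked (the paper derives such links from query logs; the shared-name heuristic here is a stand-in):

```python
def extract_schema(docs, prefix=""):
    """Flatten the field paths present in a list of JSON-like documents."""
    fields = set()
    for doc in docs:
        for key, value in doc.items():
            path = f"{prefix}{key}"
            if isinstance(value, dict):
                fields |= extract_schema([value], prefix=path + ".")
            else:
                fields.add(path)
    return fields

def relationship_graph(collections):
    """Link collections whose schemas share a field name -- a crude proxy
    for the implicit relationships recovered from query logs."""
    schemas = {name: extract_schema(docs) for name, docs in collections.items()}
    edges = set()
    names = sorted(schemas)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if schemas[a] & schemas[b]:
                edges.add((a, b))
    return schemas, edges

# Hypothetical mini-database with an implicit users<->orders relationship
db = {
    "users": [{"user_id": 1, "profile": {"email": "a@x.com"}}],
    "orders": [{"order_id": 9, "user_id": 1, "total": 42.0}],
}
schemas, edges = relationship_graph(db)  # edge: ("orders", "users")
```

A DSAR responder could then walk the graph from any collection holding a matched identifier to collect all documents about one person.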
Funding: Supported by the Student Scheme provided by Universiti Kebangsaan Malaysia under Code TAP-20558.
Abstract: A data lake (DL) denotes a vast repository of data. It accumulates substantial volumes of data and employs advanced analytics to correlate data from diverse origins containing various forms of semi-structured, structured, and unstructured information. Such systems use a flat architecture and run different types of data analytics. NoSQL databases are non-tabular and store data differently than relational tables. NoSQL databases come in various forms, including key-value pairs, documents, wide columns, and graphs, each based on its own data model. They offer simpler scalability and generally outperform traditional relational databases. While NoSQL databases can store diverse data types, they lack full support for the atomicity, consistency, isolation, and durability features found in relational databases. Consequently, machine learning approaches become necessary to categorize complex structured query language (SQL) queries. Results indicate that the most frequently used automatic classification technique for processing SQL queries on NoSQL databases is machine learning-based classification. Overall, this study provides an overview of the automatic classification techniques used in processing SQL queries on NoSQL databases; understanding these techniques can aid the development of effective and efficient NoSQL database applications.
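One of the machine-learning classifiers of the kind surveyed can be sketched with bag-of-words centroids. This toy example (not from the study; labels and training queries are invented) sorts SQL statements into read and write classes:

```python
from collections import Counter

def tokens(sql):
    """Bag-of-words representation of a query."""
    return Counter(sql.lower().replace(",", " ").split())

def train_centroids(labeled_queries):
    """Accumulate token counts of the training queries per class."""
    centroids = {}
    for sql, label in labeled_queries:
        centroids.setdefault(label, Counter()).update(tokens(sql))
    return centroids

def classify(sql, centroids):
    """Assign the class whose centroid overlaps the query's tokens most."""
    vec = tokens(sql)
    overlap = lambda cent: sum(min(vec[w], cent[w]) for w in vec)
    return max(centroids, key=lambda label: overlap(centroids[label]))

training = [
    ("SELECT name FROM users", "read"),
    ("SELECT count(*) FROM orders", "read"),
    ("INSERT INTO users VALUES (1)", "write"),
    ("UPDATE users SET name = 'x'", "write"),
]
centroids = train_centroids(training)
label = classify("SELECT id FROM orders WHERE total > 10", centroids)
```

Real systems would use richer features and stronger models, but the train/overlap/argmax shape is the same.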
Abstract: BACKGROUND Elective cholecystectomy (CCY) is recommended for patients with gallstone-related acute cholangitis (AC) following endoscopic decompression to prevent recurrent biliary events. However, the optimal timing and implications of CCY remain unclear. AIM To examine the impact of same-admission CCY compared to interval CCY in patients with gallstone-related AC using the National Readmission Database (NRD). METHODS We queried the NRD to identify all gallstone-related AC hospitalizations in adult patients, with and without same-admission CCY, between 2016 and 2020. Our primary outcome was the all-cause 30-day readmission rate; secondary outcomes included in-hospital mortality, length of stay (LOS), and hospitalization cost. RESULTS Among the 124964 gallstone-related AC hospitalizations, only 14.67% underwent same-admission CCY. The all-cause 30-day readmission rate in the same-admission CCY group was almost half that of the non-CCY group (5.56% vs 11.50%). Patients in the same-admission CCY group had a longer mean LOS and higher hospitalization costs attributable to surgery. Although the most common reason for readmission was sepsis in both groups, the second most common reason was AC in the interval CCY group. CONCLUSION Our study suggests that patients with gallstone-related AC who do not undergo same-admission CCY have twice the risk of readmission compared to those who undergo CCY during the same admission. These readmissions can potentially be prevented by performing same-admission CCY in appropriate patients, which may reduce the hospitalization costs secondary to readmissions.
Funding: Tata Steel Netherlands, Posco, Hyundai Steel, Nucor Steel, Rio Tinto, Nippon Steel Corp., JFE Steel, Voestalpine, RHI Magnesita, Doosan Enerbility, SeAH Besteel, Umicore, Vesuvius, and Schott AG are gratefully acknowledged.
Abstract: CALPHAD thermodynamic databases are very useful for analyzing the complex chemical reactions that occur in high-temperature materials processing. The FactSage thermodynamic database can be used to calculate complex phase diagrams and equilibrium phases involving refractories in industrial processes. In this study, the FactSage thermodynamic database relevant to ZrO_(2)-based refractories is reviewed, and the application of the database to understanding the corrosion of continuous-casting nozzle refractories in steelmaking is presented.
Abstract: This study examines the database search behaviors of individuals, focusing on gender differences and the impact of planning habits on information retrieval. Data were collected from a survey of 198 respondents, categorized by discipline, schooling background, internet usage, and information retrieval preferences. Key findings indicate that females are more likely to plan their searches in advance and prefer structured methods of information retrieval, such as using library portals and leading university websites. Males, however, tend to use web search engines and self-archiving methods more frequently. This analysis provides valuable insights for educational institutions and libraries seeking to optimize their resources and services based on user behavior patterns.
Abstract: Continuously updated databases of failures and censored data for numerous products have become large, and information on some covariates of the failure times is missing from these databases. Because the dataset is large and has missing information, analysis becomes complicated and executing the programming code takes a long time. In such situations, the divide-and-recombine (D&R) approach, which offers practical computational performance for big-data analysis, can be applied. In this study, the D&R approach was applied to analyze real field data for an automobile component with incomplete covariate information using the Weibull regression model. Model parameters were estimated with the expectation-maximization algorithm. The results of the data analysis and simulation demonstrate that the D&R approach is applicable to such datasets. Further, the percentiles and reliability functions of the distribution under different covariate conditions were estimated to evaluate component performance under these covariates. The findings have managerial implications for design decisions and for the safety and reliability of automobile components.
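The divide-and-recombine idea itself is compact: estimate on each data block, then recombine the block estimates. The sketch below deliberately substitutes a closed-form exponential hazard estimate for the paper's Weibull regression with EM, to show only the D&R skeleton:

```python
def block_rate(times, events):
    """Exponential-hazard MLE on one block: failures / total time at risk.
    Censored observations (event flag 0) contribute time but no failure."""
    return sum(events) / sum(times)

def divide_and_recombine(times, events, n_blocks):
    """D&R: partition the data, estimate on each subset, recombine by averaging."""
    size = len(times) // n_blocks
    estimates = []
    for b in range(n_blocks):
        lo = b * size
        hi = (b + 1) * size if b < n_blocks - 1 else len(times)
        estimates.append(block_rate(times[lo:hi], events[lo:hi]))
    return sum(estimates) / len(estimates)

times = [1.0, 2.0, 3.0, 4.0, 2.0, 2.0]   # follow-up times (toy data)
events = [1, 1, 0, 1, 1, 0]              # 1 = failure observed, 0 = censored
rate = divide_and_recombine(times, events, n_blocks=2)
```

Because each block is processed independently, the per-block step parallelizes trivially, which is the source of the computational advantage on large field datasets.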
Abstract: With the rapid development of artificial intelligence, large language models (LLMs) have demonstrated remarkable capabilities in natural language understanding and generation. These models have great potential to enhance database query systems, enabling more intuitive and semantic query mechanisms. Our model leverages the LLM’s deep learning architecture to interpret natural language queries and translate them into accurate database queries. The system integrates an LLM-powered semantic parser that translates user input into structured queries understood by the database management system. First, the user query is pre-processed: the text is normalized and ambiguity is removed. This is followed by semantic parsing, where the LLM interprets the pre-processed text and identifies key entities and relationships. Next comes query generation, which converts the parsed information into a structured query format tailored to the target database schema. Finally, there is query execution and feedback, where the resulting query is executed on the database and the results are returned to the user; the system also provides feedback mechanisms to improve and optimize future query interpretations. By implementing the model with advanced LLMs and fine-tuning on diverse datasets, the experimental results show that the proposed method significantly improves the accuracy and usability of database queries, making data retrieval easy for users without specialized knowledge.
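The preprocess → parse → generate pipeline can be sketched end-to-end; here a small regular-expression parser stands in for the LLM, and the query shape, table, and column names are invented for illustration:

```python
import re

def preprocess(text):
    """Stage 1: normalize the raw user query."""
    return re.sub(r"\s+", " ", text.strip().lower())

def semantic_parse(text):
    """Stage 2: toy stand-in for the LLM parser; recognizes queries of the
    form 'show all <table> where <column> is over <value>'."""
    m = re.match(r"show all (\w+) where (\w+) is over (\d+)", text)
    if not m:
        raise ValueError("unsupported query")
    return {"table": m.group(1), "column": m.group(2), "value": int(m.group(3))}

def generate_sql(parsed):
    """Stage 3: render the structured form as SQL. Stage 4 (execution and
    feedback) would run this against the target database."""
    return f"SELECT * FROM {parsed['table']} WHERE {parsed['column']} > {parsed['value']}"

sql = generate_sql(semantic_parse(preprocess("  Show ALL employees where age is over 30 ")))
```

In the described system the brittle regular expression is replaced by an LLM that maps free-form text to the same structured intermediate form, which is what makes the approach robust to phrasing.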
Funding: Project of the Science and Technology Plan of Tianjin City (Grant No. 20ZYJDSY00020).
Abstract: Objective: Red blood cell distribution width (RDW) has been utilized as a prognostic indicator for mortality risk assessment in cardiovascular and cerebrovascular patients. Nevertheless, the prognostic significance of RDW in critically ill patients with cerebral infarction has yet to be investigated. The objective of this study is to examine the association between RDW and the risk of all-cause mortality in cerebral infarction patients admitted to the intensive care unit (ICU). Method: A retrospective cohort study was conducted using the Medical Information Mart for Intensive Care IV 2.2 (MIMIC-IV) intensive care dataset. The main outcomes were the all-cause mortality rates at 3 and 12 months of follow-up. Cumulative curves were plotted using the Kaplan-Meier method, and Cox proportional hazards analysis was used to examine the relationship between RDW and mortality in critically ill cerebral infarction patients. Results: The findings indicate that RDW serves as a significant prognostic factor for mortality risk in critically ill stroke patients at the 3- and 12-month follow-ups. The observed correlation between increasing RDW levels and higher mortality rates among cerebral infarction patients further supports the potential utility of RDW as a predictive indicator. Conclusion: RDW emerges as an independent predictor of mortality risk during the 3- and 12-month follow-up periods for critically ill patients with cerebral infarction.
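The Kaplan-Meier estimator behind the cumulative curves has a short closed form: at each event time, survival is multiplied by (1 − deaths/at-risk). A pure-Python sketch on toy follow-up data (not the MIMIC-IV cohort):

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve from follow-up times and event flags
    (1 = death observed, 0 = censored). Returns [(t, S(t)), ...]."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, curve = 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = at_t = 0
        while i < len(data) and data[i][0] == t:   # group ties at time t
            at_t += 1
            deaths += data[i][1]
            i += 1
        if deaths:
            surv *= 1.0 - deaths / n_at_risk       # step down at event times
            curve.append((t, surv))
        n_at_risk -= at_t                          # censored subjects leave too
    return curve

# Toy data: months of follow-up, with two censored subjects
curve = kaplan_meier([3, 5, 5, 8, 10], [1, 1, 0, 1, 0])
```

Cox proportional hazards, used in the study to adjust the RDW effect for covariates, builds on the same risk-set bookkeeping but requires iterative partial-likelihood maximization, so it is omitted here.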
Funding: “Undergraduate Teaching Research and Reform Project of the University of Shanghai for Science and Technology” (Project No. JGXM202351).
Abstract: College innovation and entrepreneurship programs are a powerful means of enhancing students’ innovation and entrepreneurship skills. Evaluating the maturity of innovation and entrepreneurship projects can stimulate students’ enthusiasm and initiative to participate. Utilizing computer database technology for maturity evaluation can make the process more efficient, accurate, and convenient, aligning with the needs of the information age. Exploring strategies for applying computer database technology to the maturity evaluation of innovation and entrepreneurship projects offers valuable insights and directions for developing these projects, while also providing strong support for enhancing students’ innovation and entrepreneurship abilities.
Abstract: With the continuous development of computer network technology, its applications in daily life and work have become increasingly widespread, greatly improving efficiency. However, certain security risks remain. To ensure the security of computer networks and databases, it is essential to enhance both through technological optimization. This includes improving management practices, optimizing data processing methods, and establishing comprehensive laws and regulations. This paper analyzes the current security risks in computer networks and databases and proposes corresponding solutions, offering reference points for relevant personnel.
Funding: Partially sponsored by the National Key Project of China (No. 2012ZX03001013-003).
Abstract: In typical Wi-Fi-based indoor positioning systems employing the fingerprint model, plentiful fingerprints must be collected and trained by experts or technicians, which raises labor costs and restricts adoption. In this paper, a novel approach based on crowd paths is presented to solve this problem: it automatically collects and constructs fingerprint databases for anonymous buildings from ordinary crowd customers. However, an accuracy-degradation problem may be introduced, as crowd customers are neither professionally trained nor equipped. Therefore, we define two concepts, fixed landmarks and hint landmarks, to rectify the fingerprint database in the practical system, in which common corridor crossing points serve as fixed landmarks and crossing points among different crowd paths serve as hint landmarks. Machine-learning techniques are utilized for short-range approximation around fixed landmarks, and fuzzy-logic decision technology is applied to search for hint landmarks in the crowd-trace space. In addition, a particle filter algorithm is introduced to smooth the sample points in crowd paths. We implemented the approach on off-the-shelf smartphones and evaluated its performance. Experimental results indicate that the approach can effectively construct a Wi-Fi fingerprint database without reducing localization accuracy.
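The particle-filter smoothing step can be sketched as a standard bootstrap filter over a 1-D trace; the random-walk motion model, Gaussian likelihood, and all parameter values below are generic placeholders, not the paper's:

```python
import math
import random

def particle_filter(observations, n_particles=500, step_sigma=1.0,
                    obs_sigma=2.0, seed=7):
    """Bootstrap particle filter smoothing a noisy 1-D trace:
    random-walk motion model, Gaussian likelihood, multinomial resampling."""
    rng = random.Random(seed)
    particles = [observations[0] + rng.gauss(0.0, obs_sigma)
                 for _ in range(n_particles)]
    smoothed = []
    for z in observations:
        # Predict: propagate each particle with the motion model
        particles = [p + rng.gauss(0.0, step_sigma) for p in particles]
        # Update: weight particles by how well they explain the observation
        weights = [math.exp(-((p - z) ** 2) / (2.0 * obs_sigma ** 2))
                   for p in particles]
        total = sum(weights)
        # Resample: draw a new particle set proportional to the weights
        particles = rng.choices(particles,
                                weights=[w / total for w in weights],
                                k=n_particles)
        smoothed.append(sum(particles) / n_particles)
    return smoothed

trace = [0.0, 1.0, 2.2, 2.9, 4.1]  # made-up noisy crowd-path samples
smoothed = particle_filter(trace)
```

In the described system the same predict/update/resample cycle runs over 2-D positions along a crowd path, with the landmark corrections constraining the particles.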
Funding: Supported by a Theme-based Research Scheme grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (T21-705/20-N).
Abstract: Antibiotic resistance, which is encoded by antibiotic-resistance genes (ARGs), has proliferated to become a growing threat to public health around the world. With technical advances, especially the popularization of metagenomic sequencing, scientists have gained the ability to decipher ARG profiles in diverse samples with high accuracy and at accelerated speed. To analyze thousands of ARGs in a high-throughput way, standardized and integrated pipelines are needed. The new version (v3.0) of the widely used online ARG analysis pipeline (ARGs-OAP) brings significant improvements to both the reference database, the structured ARG (SARG) database, and the integrated analysis pipeline. SARG has been enhanced with sequence curation to improve annotation reliability, incorporate emerging resistance genotypes, and establish a rigorous mechanism classification. The database has been further organized and visualized online as a tree-like structure with a dictionary, and it has been divided into sub-databases for different application scenarios. In addition, ARGs-OAP has been improved with adjusted quantification methods, simplified tool implementation, and multiple functions supporting user-defined reference databases. Moreover, the online platform now provides a diverse biostatistical analysis workflow with visualization packages for efficient interpretation of ARG profiles. ARGs-OAP v3.0, with its improved database and analysis pipeline, will benefit academia, governmental management, and consultation regarding risk assessment of the environmental prevalence of ARGs.
Funding: Supported by continuation funds from the Turku Collegium for Science, Medicine and Technology; the Japan Society for the Promotion of Science (#23K08670); and the Sigrid Jusélius Foundation (#230131). MF-R's internship at the University of Turku was funded by the Erasmus+ program.
Abstract: The bone extracellular matrix (ECM) contains minerals deposited on highly crosslinked collagen fibrils and hundreds of non-collagenous proteins. Some of these proteins are key to the regulation of bone formation and regeneration via signaling pathways, and they play important regulatory and structural roles. However, the complete list of bone ECM proteins, their roles, and the extent of individual and cross-species variation have not been fully captured in either humans or model organisms. Here, we introduce the most comprehensive resource of bone ECM proteins, usable in research fields such as bone regeneration, osteoporosis, and mechanobiology. The Phylobone database (available at https://phylobone.com) includes 255 proteins potentially expressed in the bone ECM of humans and 30 species of vertebrates. A bioinformatics pipeline was used to identify the evolutionary relationships of bone ECM proteins. The analysis facilitated the identification of potential model organisms for studying the molecular mechanisms of bone regeneration. A network analysis showed high connectivity among bone ECM proteins. A total of 214 functional protein domains were identified, including collagen and domains involved in bone formation and resorption. Information from public drug repositories was used to identify potential repurposing of existing drugs. The Phylobone database provides a platform to study bone regeneration and osteoporosis in light of (biological) evolution and will substantially contribute to the identification of molecular mechanisms and drug targets.
Funding: Funding from the European Union’s Horizon 2020 research and innovation programme under Marie Skłodowska-Curie grant agreement No. 813393; partially funded by the Portuguese FCT (Fundação para a Ciência e a Tecnologia) under projects UIDB/50010/2020, UIDP/50010/2020, and PTDC/FIS-PLA/1616/2021.
Abstract: This paper presents a comparison of CO_(2) conversion performances by plasma and plasma-assisted catalysis, based on data collected from the literature in this field and organised in an open-access online database. The tool is open to all users to carry out their own analyses, and also to contributors who wish to add their data to the database, improving the relevance of the comparisons and, ultimately, the efficiency of CO_(2) conversion by plasma-catalysis. The creation of this database and its user interface is motivated by the fact that plasma-catalysis is a fast-growing field for all CO_(2) conversion processes, be it methanation, dry reforming of methane, methanolisation, or others. As a result of this rapid growth, a set of standard procedures is needed to rigorously compare the performances of different systems. This is currently not possible because the fundamental mechanisms of plasma-catalysis are still too poorly understood to define such procedures. Fortunately, however, the accumulated data within the CO_(2) plasma-catalysis community has become large enough to warrant the kind of “big data” studies more familiar in medicine and the social sciences. To enable comparisons between multiple data sets and make future research more effective, this work proposes the first database of CO_(2) conversion performances by plasma-catalysis open to the whole community. The database was initiated in the framework of an H2020 European project and is called the “PIONEER Data Base”. It gathers a large amount of CO_(2) conversion performance data, such as conversion rate, energy efficiency, and selectivity, for numerous plasma sources coupled with or without a catalyst. Each data set is associated with metadata describing the gas mixture, the plasma source, the nature of the catalyst, and the form of coupling with the plasma. Beyond the database itself, a data extraction tool with direct visualisation features and advanced filtering functionalities has been developed and is available online to the public. Simple and fast visualisation of the state of the art puts new results into context, identifies gaps in the data, and consequently points towards promising research routes. More advanced data extraction illustrates the impact the database can have on the understanding of plasma-catalyst coupling. Lessons learned from reviewing a large amount of literature during the setup of the database lead to best-practice advice to increase comparability between future CO_(2) plasma-catalytic studies. Finally, the community is strongly encouraged to contribute to the database, not only to increase the visibility of their data but also the relevance of the comparisons enabled by this tool.
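The two headline metrics gathered in such comparisons are commonly computed as below; this sketch assumes the widely quoted CO2 → CO + 1/2 O2 dissociation enthalpy of about 2.93 eV per molecule and a specific energy input (SEI) expressed in the same units (conventions vary between papers, which is part of the comparability problem the database addresses):

```python
def co2_conversion(flow_in, flow_out):
    """Fractional CO2 conversion from inlet and outlet molar flows."""
    return (flow_in - flow_out) / flow_in

def energy_efficiency(conversion, sei_ev, delta_h_ev=2.93):
    """Energy efficiency = conversion * reaction enthalpy / specific energy
    input, with both energies in eV per CO2 molecule."""
    return conversion * delta_h_ev / sei_ev

conv = co2_conversion(1.0, 0.8)             # 20% of the CO2 is converted
eff = energy_efficiency(conv, sei_ev=5.86)  # ~10% energy efficiency
```

Reporting both numbers together matters because conversion and efficiency usually trade off against each other as SEI increases.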
Abstract: CHDTEPDB (URL: http://chdtepdb.com/) is a manually integrated database for congenital heart disease (CHD) that stores expression profiling data of CHD derived from published papers, aiming to provide rich resources for investigating the deeper correlation between human CHD and aberrant transcriptome expression. The development of human diseases involves important regulatory roles of RNAs, and expression profiling data can reflect the underlying etiology of inherited diseases. Hence, collecting and compiling expression profiling data is of critical significance for a comprehensive understanding of the mechanisms and functions that underpin genetic diseases. CHDTEPDB stores the expression profiles of over 200 datasets spanning 7 types of CHD and provides users with convenient basic analytical functions. Because datasets differ in clinical indicators such as disease type and contain unavoidable detection errors, users can customize their selection of the corresponding data for personalized analysis. Moreover, we provide a submission page through which researchers can submit their own data, so that additional expression profiles as well as other histological data can be added to the database. CHDTEPDB offers a user-friendly interface that allows users to quickly browse, retrieve, download, and analyze their target samples. CHDTEPDB will significantly improve current knowledge of expression profiling data in CHD and has the potential to serve as an important tool for future research on the disease.