In China, the vast majority of bibliographic databases are commercial, such as China National Knowledge Infrastructure (CNKI), Wanfang Database, Longyuan Journal Net, and CQVIP Company. There are also non-profit open access (OA) databases, such as the journal database jointly established by the Chinese Academy of Social Sciences (CASS) and the National Social Science Fund. Commercial bibliographic databases face many difficulties that restrict their development: intellectual property disputes, the distribution of benefits between print periodicals and the databases, the lack of quality assessment of the databases, the need to improve digital technology, and the lack of unified database regulation. This paper puts forward countermeasures from six perspectives: how to strengthen governmental management; how to protect intellectual property rights; how to improve the technical standards of commercial bibliographic databases; how to distribute benefits between print periodicals and commercial bibliographic databases; how to improve the quality of commercial bibliographic databases; and how to improve their industrial chain.
Background: Suicide among physicians is a serious public health issue and an extremely complex, multifactorial behavior. Aim: The aim of this study was to use the theme “Suicide among Physicians” to exemplify the analysis of methodological similarities between the scientific content available in the MEDLINE and BVS databases as scientific research tools. Methods: This is a systematic review with meta-analysis. The following combination of keywords was used to search the two databases: “suicide” AND “physicians” AND “public health”. Results: Three hundred and thirteen publications were identified, of which only 16 studies were selected. A strong association was found between the MEDLINE and BVS databases and the odds ratio regarding the theme “Suicide among physicians”. Conclusions: Considering the similarities found in the use of the two databases, it was possible to identify that suicide among physicians is associated with the exercise of an important professional role in society and in the workplace. With regard to scientific information, the obtained p-value (<0.05) appears to be statistically significant for the association between the suggested theme and the methodological similarities of the scientific information available in the analyzed databases. Thus, these open-access research tools are considered scientifically reliable.
The EU’s Artificial Intelligence Act (AI Act) imposes requirements for the privacy compliance of AI systems. AI systems must comply with privacy laws such as the GDPR when providing services. These laws give users the right to issue a Data Subject Access Request (DSAR). Responding to such requests requires database administrators to accurately identify the information related to an individual. However, manual compliance poses significant challenges and is error-prone, because database administrators must write queries by hand, which is time-consuming. The demand of AI systems for large amounts of data has driven the development of NoSQL databases, and the flexible schema of NoSQL databases makes identifying personal information even more challenging. This paper develops an automated tool that identifies personal information and can help organizations respond to DSARs. The tool combines several technologies, including schema extraction from NoSQL databases and relationship identification from query logs. We describe the algorithm used by the tool, detailing how it discovers and extracts implicit relationships from NoSQL databases and generates relationship graphs to help developers accurately identify personal data. We evaluate the tool on three datasets covering different database designs, achieving F1 scores from 0.77 to 1. Experimental results demonstrate that the tool successfully identifies information relevant to the data subject. It reduces manual effort and simplifies GDPR compliance, showing practical value in enhancing the privacy performance of NoSQL databases and AI systems.
Insight into the spatiotemporal distribution patterns of knowledge innovation is receiving increasing attention from policymakers and economic research organizations. Many studies use bibliometric data to analyze the popularity of certain research topics, well-adopted methodologies, influential authors, and the interrelationships among research disciplines. However, the visual exploration of the patterns of research topics, with an emphasis on their spatial and temporal distribution, remains challenging. This study combined a Space-Time Cube (STC) and a 3D glyph to represent complex multivariate bibliographic data. We further implemented the visual design by developing an interactive interface. The effectiveness, understandability, and engagement of ST-Map were evaluated by seven experts in geovisualization. The results suggest that three-dimensional visualization is a promising way to show both the overview and on-demand details on a single screen.
Purpose: We analyzed the structure of a community of authors working in the field of social network analysis (SNA) based on citation indicators: direct citation and bibliographic coupling metrics. We observed patterns at the micro, meso, and macro levels of analysis. Design/methodology/approach: We used bibliometric network analysis, including the “temporal quantities” approach proposed to study temporal networks. Using a two-mode network linking publications with authors and a one-mode network of citations between the works, we constructed and analyzed the networks of citation and bibliographic coupling among authors. We used an iterated saturation data collection approach. Findings: At the macro level, we observed the global structural features of citations between authors, showing that 80% of authors have no more than 15 citations from other works. At the meso level, we extracted the groups of authors who cite each other and who are similar to each other according to their citation patterns. We observed a division of authors in SNA into groups of social scientists and physicists, as well as into other groups of authors from different disciplines. We found some examples of brokerage between different groups that maintained the common identity of the field. At the micro level, we extracted authors with extremely high numbers of received citations, who can be considered the most prominent authors in the field. We examined the temporal properties of the most popular authors. Research limitations: The main challenge in this approach is the resolution of author names (synonyms and homonyms). We faced the author disambiguation, or “multiple personalities” (Harzing, 2015), problem. To remain consistent and comparable with our previously published articles, we used the same SNA data, collected up to 2018. The analysis and conclusions on the activity, productivity, and visibility of the authors are relative only to the field of SNA. Practical implications: The proposed approach can be used for similar objectives and for identifying key structures and characteristics in other disciplines. This may inspire the application of network approaches in other research areas and bring more authors to collaborate in the field of SNA. Originality/value: We identified and applied an innovative approach and methods to study the structure of scientific communities, which allowed us to obtain findings that go beyond those obtained with other methods. We used a new approach to temporal network analysis, which is an important addition to the analysis because it provides detailed information on different measures for authors and pairs of authors over time.
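The network construction described in the abstract above (a two-mode publication-author network combined with a one-mode citation network between works) can be sketched in a few lines. This is only an illustration under stated assumptions: the dictionaries `authorship` and `cites`, the function names, and the use of plain unnormalized counts are all hypothetical, and the paper's own normalizations and "temporal quantities" are not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def author_citations(authorship, cites):
    """Derive author-to-author citation counts.

    authorship: work -> set of authors (the two-mode network).
    cites: work -> set of cited works (the one-mode citation network).
    Returns {(citing_author, cited_author): number of work-level citations}.
    """
    counts = defaultdict(int)
    for work, refs in cites.items():
        for ref in refs:
            for a in authorship.get(work, ()):
                for b in authorship.get(ref, ()):
                    counts[(a, b)] += 1
    return dict(counts)


def work_coupling(cites):
    """Bibliographic coupling between works: number of shared references."""
    out = {}
    for w1, w2 in combinations(sorted(cites), 2):
        shared = len(cites[w1] & cites[w2])
        if shared:
            out[(w1, w2)] = shared
    return out


def author_coupling(authorship, cites):
    """Project work-level coupling onto unordered pairs of distinct authors."""
    counts = defaultdict(int)
    for (w1, w2), shared in work_coupling(cites).items():
        for a in authorship.get(w1, ()):
            for b in authorship.get(w2, ()):
                if a != b:
                    counts[tuple(sorted((a, b)))] += shared
    return dict(counts)


# Toy data (hypothetical): four works, four authors.
authorship = {"p1": {"A"}, "p2": {"B"}, "p3": {"A", "C"}, "p4": {"D"}}
cites = {"p1": {"p4"}, "p2": {"p3", "p4"}, "p3": {"p4"}}
print(author_citations(authorship, cites))
print(author_coupling(authorship, cites))
```

Raw counts favor authors with many works and long reference lists, so a real analysis would probably normalize them; that choice is left open in this sketch.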
The discovery of materials using “bottom-up” or “top-down” approaches is of great interest in materials science. Layered materials consisting of two-dimensional (2D) building blocks provide a good platform for exploring new materials in this respect. In van der Waals (vdW) layered materials, these building blocks are charge neutral and can be isolated from their bulk phase (top-down), but they usually grow on a substrate. In ionic layered materials, they are charged and usually cannot exist independently, but they can serve as motifs for constructing new materials (bottom-up). In this paper, we introduce our recently constructed databases for 2D material-substrate interfaces (2DMSI) and for 2D charged building blocks. For the 2DMSI database, we systematically build a workflow to predict appropriate substrates and the geometries of 2D materials on those substrates. For the 2D charged building block database, 1208 entries are identified from a bulk material database. Information on crystal structure, valence state, source, dimension, and so on is provided for each entry in JSON format. We also show its application in designing and searching for new functional layered materials. The 2DMSI database, the building block database, and the designed layered materials are available in Science Data Bank at https://doi.org/10.57760/sciencedb.j00113.00188.
Electronic patient data offers many advantages but also brings new difficulties. Deadlocks may delay procedures such as acquiring patient information. Distributed deadlock resolution solutions introduce uncertainty due to inaccurate transaction properties, and soft computing-based solutions have been developed to address this challenge. However, handling ambiguous, vague, incomplete, and inconsistent transaction attribute information within a single framework has received minimal attention. The work presented in this paper employs type-2 neutrosophic logic, an extension of type-1 neutrosophic logic, to handle uncertainty in real-time deadlock-resolving systems. The proposed method is structured to reflect multiple types of knowledge and relations among transaction features, including validation factor degree, slackness degree, and degree of deadline-missed transaction, based on the degree of membership of truth, the degree of membership of indeterminacy, and the degree of membership of falsity. Here, the footprint of uncertainty (FOU) for truth, indeterminacy, and falsity represents the level of uncertainty that exists in the value of a grade of membership. We used a distributed real-time transaction processing simulator (DRTTPS) to run the simulations and conducted experiments on the benchmark Pima Indians diabetes dataset (PIDD). The results show an increase in detection rate and a large drop in rollback rate when the new strategy is used. The type-2 neutrosophic-based resolution performs better than the type-1 neutrosophic-based approach on the execution ratio scale, with an improvement of 10% to 20% depending on the number of arrived transactions.
Objective: To investigate the variation, expression, and clinical significance of the E2F3 gene in melanoma. Methods: First, the cBioportal, Oncomine, and GEO databases were used to analyze the variation and expression level of the E2F3 gene in melanoma. The OSskcm and TISIDB databases were used to analyze the relationship between E2F3 and melanoma prognosis and tumor immune infiltrating cells. Then, the LinkedOmics database was used to identify the differential genes related to E2F3 expression in melanoma and to analyze their biological functions. Finally, small molecule compounds for the treatment of melanoma were screened through the CMap database. Results: The mutation rate of the E2F3 gene in melanoma is about 4%, and there are 21 mutation sites. Compared with normal skin tissues, the expression of the E2F3 gene in melanoma was significantly increased (P<0.01). The mutation and increased expression of the E2F3 gene were associated with shortened overall survival (OS) of melanoma patients (P<0.05). The CNA level of E2F3 was negatively correlated with the expression levels of lymphocytes such as pDC, Neutrophil, Act DC, and Th17, and negatively correlated with the expression levels of chemokines such as CXCL5, CCL13, and CCR1. The methylation level of E2F3 was positively correlated with the expression levels of Th1, Neutrophil, Act DC, and other lymphocytes, and positively correlated with the expression levels of CXCL16, CXCL12, CCR1, and other chemokines. The expression level of E2F3 was negatively correlated with the expression levels of lymphocytes such as Th17, Tcm CD4, and Th1, and negatively correlated with the expression levels of chemokines such as CXCL16, CCL22, and CCL2. The expression of 96 genes, such as UHRF1BP1, in melanoma was significantly correlated with the expression of E2F3 (|cor|0.5, P<0.05). These genes were mainly related to RNA transport, eukaryotic ribosome biogenesis, cell cycle, and other pathways. Among them, WDR12, WDR43, RBM28, UTP18, DKC1, PAK1IP1, DDX31, TEX10, TRUB1, and TRMT61B were the top 10 hub genes. YC-1, simvastatin, cytochalasin-d, Deforolimus, and cytochalasin-b may be five small molecule compounds for the treatment of melanoma. Conclusion: The mutation and increased expression level of the E2F3 gene are related to the poor prognosis of melanoma. E2F3 participates in the occurrence and development of melanoma by affecting the expression of different tumor immune infiltrating cell subtypes, and it may be a potential diagnostic marker and therapeutic target for melanoma.
Clustering, in data mining, is a useful technique for discovering interesting data distributions and patterns in the underlying data, and it has many application fields, such as statistical data analysis, pattern recognition, and image processing. We combine a sampling technique with the DBSCAN algorithm to cluster large spatial databases, and two sampling-based DBSCAN (SDBSCAN) algorithms are developed. One algorithm introduces the sampling technique inside DBSCAN, and the other applies the sampling procedure outside DBSCAN. Experimental results demonstrate that our algorithms are effective and efficient in clustering large-scale spatial databases.
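To make the "sampling outside DBSCAN" idea concrete, here is a minimal sketch, assuming one plausible reading of it: cluster a random sample with plain DBSCAN, then attach each unsampled point to the cluster of its nearest sampled core point if that point lies within `eps`. The function names, the assignment rule, and the quadratic neighbor search are illustrative assumptions, not the SDBSCAN algorithms from the paper.

```python
import math
import random

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain O(n^2) DBSCAN. Returns (labels, core_flags)."""
    labels = [None] * len(points)
    core = [False] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE
            continue
        core[i] = True
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                core[j] = True
                queue.extend(j_neighbors)
        cluster += 1
    return labels, core


def sampled_dbscan(points, eps, min_pts, sample_size, seed=0):
    """Cluster a random sample, then assign the remaining points."""
    rng = random.Random(seed)
    idx = sorted(rng.sample(range(len(points)), sample_size))
    sample = [points[i] for i in idx]
    s_labels, s_core = dbscan(sample, eps, min_pts)

    labels = [NOISE] * len(points)
    for pos, i in enumerate(idx):
        labels[i] = s_labels[pos]

    cores = [(sample[k], s_labels[k]) for k in range(len(sample)) if s_core[k]]
    sampled = set(idx)
    for i, p in enumerate(points):
        if i in sampled or not cores:
            continue
        # Nearest sampled core point decides the label, if it is close enough.
        dist, label = min((math.dist(p, c), lab) for c, lab in cores)
        if dist <= eps:
            labels[i] = label
    return labels
```

Because the sample is sparser than the full data, `min_pts` (or `eps`) would normally need to be adjusted to the sampling rate; this sketch leaves that tuning to the caller.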
Data acquisition and modeling are the two important, difficult, and costly aspects of a Cybercity project. 2D-GIS is mature and can manage large amounts of spatial data, so 3D-GIS should make the best use of the data and technology of 2D-GIS. Constructing a useful synthetic environment requires integrating multiple types of information, such as DEMs, texture images, and 3D representations of objects such as buildings. In this paper, a method for 3D city landscape data modeling and visualization based on integrated databases is presented. Since raster data volumes are very large, special strategies (for example, the pyramid gridded method) must be adopted to manage raster data efficiently. Three different methods of data acquisition, a suitable data structure, and a simple modeling method are presented as well. Finally, a pilot project of the Shanghai Cybercity is illustrated.
To solve the problems of sharing and reusing information in information systems, a rule-based approach to constructing ontologies from object-relational databases is proposed. A 3-tuple ontology construction model is proposed first. Then, four types of ontology construction rules, covering class, property, property characteristics, and property restrictions, are formalized according to the model. Experimental results described in the Web Ontology Language show that the proposed approach is feasible for the semantic objects project of the semantic computing laboratory at UC Irvine. Our approach reduces construction time by about twenty percent compared with ontology construction from relational databases.
Most knowledgeable people agree that networking and routing technologies have been around for about 25 years. Routing is simultaneously the most complicated function of a network and the most important. Likewise, more than 70% of computer application fields are MIS applications. The challenge in building and using an MIS over a network is therefore developing the means to find, access, and communicate with large databases or multi-database systems. Because general databases are not time-continuous, they cannot be streamed, so reliable and secure quality of service cannot be obtained by dropping some unimportant datagrams during database transmission. In this article, we discuss which kind of routing protocol is best suited to transmitting large databases or multi-database systems over networks.
The necessity and feasibility of introducing attribute weights into a digital fingerprinting system are discussed. A weighted algorithm for fingerprinting relational databases for traitor tracing is proposed. Higher weights are assigned to more significant attributes, so important attributes are fingerprinted more frequently than others. Finally, the robustness of the proposed algorithm, such as its performance against collusion attacks, is analyzed. Experimental results demonstrate the superiority of the algorithm.
A weighted algorithm for watermarking relational databases for copyright protection is presented. The probability of watermarking an attribute is assigned according to its weight, which is decided by the owner of the database. A one-way hash function and a secret key known only to the owner of the data are used to select the tuples and bits to mark. By assigning high weights to significant attributes, the scheme ensures that important attributes have a higher chance of being marked than less important ones. Experimental results show that the proposed scheme is robust against various forms of attack and has perfect immunity to the subset attack.
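The selection mechanism in this abstract (a keyed one-way hash picks the tuples, attributes, and bits to mark, with attribute choice biased by owner-assigned weights) can be illustrated with a small sketch. Everything specific here is an assumption for illustration: HMAC-SHA256 stands in for the paper's hash function, the primary-key column is assumed to be named `id`, all marked attributes are assumed to be non-negative integers, and `gamma` (roughly one in `gamma` tuples is marked) and `lsb_count` are hypothetical parameters.

```python
import hashlib
import hmac


def keyed_hash(secret: bytes, primary_key, salt: str) -> int:
    """Keyed one-way hash of a tuple's primary key (HMAC-SHA256 as a stand-in)."""
    msg = f"{salt}|{primary_key}".encode()
    return int.from_bytes(hmac.new(secret, msg, hashlib.sha256).digest(), "big")


def pick_weighted(value: int, weights: dict) -> str:
    """Map a hash value to an attribute, proportionally to positive integer weights."""
    r = value % sum(weights.values())
    for name in sorted(weights):
        if r < weights[name]:
            return name
        r -= weights[name]


def mark_position(row, secret, weights, gamma, lsb_count):
    """Return (attribute, bit index, mark bit) for a selected tuple, else None."""
    pk = row["id"]
    if keyed_hash(secret, pk, "tuple") % gamma != 0:
        return None
    attr = pick_weighted(keyed_hash(secret, pk, "attr"), weights)
    bit = keyed_hash(secret, pk, "bit") % lsb_count
    mark = keyed_hash(secret, pk, "value") % 2
    return attr, bit, mark


def embed(rows, secret, weights, gamma=2, lsb_count=2):
    """Return marked copies of the rows; only low-order bits are changed."""
    out = []
    for row in rows:
        new = dict(row)
        pos = mark_position(row, secret, weights, gamma, lsb_count)
        if pos:
            attr, bit, mark = pos
            new[attr] = (new[attr] & ~(1 << bit)) | (mark << bit)
        out.append(new)
    return out


def detect(rows, secret, weights, gamma=2, lsb_count=2):
    """Count how many selected positions carry the expected mark bit."""
    total = matches = 0
    for row in rows:
        pos = mark_position(row, secret, weights, gamma, lsb_count)
        if pos:
            attr, bit, mark = pos
            total += 1
            matches += ((row[attr] >> bit) & 1) == mark
    return matches, total
```

Because each tuple's mark depends only on its own primary key and the secret, detection on any subset of the marked rows still finds every selected position intact, which matches the intuition behind subset-attack resistance. A real detector would compare the match ratio against a statistical threshold rather than expect a perfect score.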
As typical peer-to-peer distributed networks, blockchain systems require each node to copy the complete transaction database so that new transactions can be verified independently. In a blockchain system (e.g., the Bitcoin system), a node does not rely on any central organization, and every node keeps an entire copy of the transaction database. However, this feature means that the size of the blockchain transaction database grows rapidly. As the system keeps operating, node memory must therefore also be expanded to keep the system running. In the big data era especially, increasing network traffic will lead to an even faster transaction growth rate. This paper analyzes blockchain transaction databases and proposes a storage optimization scheme. The proposed scheme divides the blockchain transaction database into a cold zone and a hot zone using an expiration recognition method based on the Least Recently Used (LRU) algorithm. It achieves storage optimization by moving unspent transaction outputs out of the in-memory transaction database. We present a theoretical analysis of the optimization method to validate its effectiveness. Extensive experiments show that the proposed method outperforms the current mechanism for blockchain transaction databases.
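The hot/cold split described above can be pictured with a small LRU-based store. This is a sketch under assumptions, not the scheme from the paper: the class name `UtxoStore`, the fixed `hot_capacity`, and the use of an in-process dictionary to stand in for on-disk cold storage are all hypothetical.

```python
from collections import OrderedDict


class UtxoStore:
    """Unspent outputs kept in an in-memory hot zone, with LRU demotion to a cold zone."""

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()  # least recently used entry first
        self.cold = {}            # stand-in for slower storage outside memory

    def _touch(self, outpoint, value):
        """Insert or refresh an entry in the hot zone, demoting the LRU overflow."""
        self.hot[outpoint] = value
        self.hot.move_to_end(outpoint)
        while len(self.hot) > self.hot_capacity:
            old_key, old_value = self.hot.popitem(last=False)
            self.cold[old_key] = old_value

    def add(self, outpoint, value):
        """Record a newly created unspent output."""
        self._touch(outpoint, value)

    def get(self, outpoint):
        """Look up an output; a cold hit is promoted back to the hot zone."""
        if outpoint in self.hot:
            self.hot.move_to_end(outpoint)
            return self.hot[outpoint]
        if outpoint in self.cold:
            value = self.cold.pop(outpoint)
            self._touch(outpoint, value)
            return value
        return None

    def spend(self, outpoint):
        """Remove an output from whichever zone holds it; None if unknown."""
        if outpoint in self.hot:
            return self.hot.pop(outpoint)
        return self.cold.pop(outpoint, None)


store = UtxoStore(hot_capacity=2)
for name, amount in [("tx1:0", 1), ("tx2:0", 2), ("tx3:0", 3)]:
    store.add(name, amount)
print(list(store.hot), list(store.cold))
```

The design choice worth noting is that a cold hit is promoted on `get`, so outputs that are looked up again regain their place in memory, while `spend` removes an entry from either zone without promoting it first.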
This is a period of information explosion. In spatial information science especially, information can be acquired in many ways, such as satellites, aircraft, lasers, and digital photogrammetry. Spatial data sources are usually distributed and heterogeneous. A federated database is the best solution for the sharing and interoperation of spatial databases. In this paper, the concepts of federated databases and interoperability are introduced. Three heterogeneous kinds of spatial data (vector, image, and DEM) are used to create an integrated database. A data model for federated spatial databases is given.
The technique of Knowledge Discovery in Databases (KDD) is introduced to learn valuable knowledge hidden in network alarm databases. To obtain such knowledge, we propose an efficient method based on sliding windows (named Slidwin) to discover different episode rules from time-sequential alarm data. The experimental results show that, given different threshold parameters, a large number of different rules can be discovered quickly.
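As a rough picture of window-based episode rule mining on alarm data, the sketch below slides a fixed-width window over time-stamped alarms, counts in how many windows an alarm `a` is followed by a different alarm `b`, and turns those counts into rules with frequency and confidence thresholds. It is a generic windowed counting scheme written under assumptions (integer timestamps, two-alarm serial episodes, hypothetical function names), not the Slidwin method itself.

```python
from collections import Counter


def window_counts(events, width):
    """Count windows containing each alarm and each ordered alarm pair.

    events: list of (integer timestamp, alarm type).
    A window covers [start, start + width) for every integer start between
    the first and last timestamps, inclusive.
    """
    events = sorted(events)
    first, last = events[0][0], events[-1][0]
    singles, pairs = Counter(), Counter()
    n_windows = 0
    for start in range(first, last + 1):
        window = [(t, a) for t, a in events if start <= t < start + width]
        n_windows += 1
        singles.update({a for _, a in window})
        seen_pairs = set()
        for i, (t1, a) in enumerate(window):
            for t2, b in window[i + 1:]:
                if t2 > t1 and a != b:
                    seen_pairs.add((a, b))  # a occurs strictly before b
        pairs.update(seen_pairs)
    return singles, pairs, n_windows


def episode_rules(events, width, min_freq, min_conf):
    """Rules 'a is followed by b' as sorted tuples (a, b, frequency, confidence)."""
    singles, pairs, n = window_counts(events, width)
    rules = []
    for (a, b), count in pairs.items():
        freq = count / n
        conf = count / singles[a]
        if freq >= min_freq and conf >= min_conf:
            rules.append((a, b, freq, conf))
    return sorted(rules)


alarms = [(0, "A"), (1, "B"), (5, "A"), (6, "B"), (10, "C")]
for rule in episode_rules(alarms, width=3, min_freq=0.2, min_conf=0.7):
    print(rule)
```

Lowering `min_freq` or `min_conf` admits more rules, which is the kind of threshold sensitivity the abstract refers to. Rescanning the event list for every window keeps the sketch short but would be too slow for a large alarm database.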
In this paper, the constrained K closest pairs query is introduced, which retrieves from two datasets the K closest pairs that satisfy a given spatial constraint. For datasets indexed by R-trees in spatial databases, three algorithms are presented for answering this kind of query. Among them, the two-phase Range+Join and Join+Range algorithms adopt a strategy that changes the execution order of the range and closest pairs queries, while the constrained heap-based algorithm uses extended distance functions to prune the search space and minimize the pruning distance. Experimental results show that the constrained heap-based algorithm has better applicability and performance than the two-phase algorithms.
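The two execution orders named in this abstract can be compared on a toy example. The sketch below is a brute-force illustration only: it assumes the spatial constraint is an axis-aligned rectangle that both points of a pair must fall inside, it uses no R-tree, and the function names are hypothetical. It does not cover the heap-based algorithm.

```python
import heapq
import math
from itertools import product


def in_rect(p, rect):
    """True if 2D point p lies inside rect = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect
    return xmin <= p[0] <= xmax and ymin <= p[1] <= ymax


def range_then_join(P, Q, rect, k):
    """Range+Join: filter both datasets first, then take the k closest pairs."""
    P_in = [p for p in P if in_rect(p, rect)]
    Q_in = [q for q in Q if in_rect(q, rect)]
    pairs = ((math.dist(p, q), p, q) for p, q in product(P_in, Q_in))
    return heapq.nsmallest(k, pairs)


def join_then_range(P, Q, rect, k):
    """Join+Range: walk all pairs by increasing distance, keep those in range."""
    result = []
    for d, p, q in sorted((math.dist(p, q), p, q) for p, q in product(P, Q)):
        if in_rect(p, rect) and in_rect(q, rect):
            result.append((d, p, q))
            if len(result) == k:
                break
    return result


P = [(0, 0), (2, 2), (10, 10)]
Q = [(0, 1), (3, 2), (10, 11)]
rect = (0, 0, 5, 5)
print(range_then_join(P, Q, rect, k=2))
print(join_then_range(P, Q, rect, k=2))
```

Both orders return the same pairs; they differ in how much work is done before the constraint prunes candidates, which is where an index-aware algorithm would gain its advantage.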
Almost all cellular processes in a living system are controlled by proteins: they regulate gene expression, catalyze chemical reactions, transport small molecules across membranes, and transmit signals across membranes. Even a viral infection is often initiated through virus-host protein interactions. Protein-protein interactions (PPIs) are the physical contacts between two or more proteins, and they represent complex biological functions. PPIs are now used to construct PPI networks for studying complex pathways and revealing the functions of unknown proteins. Scientists have used PPIs to find the molecular basis of certain diseases as well as potential drug targets. In this review, we discuss how PPI networks are essential to understanding the molecular basis of virus-host relationships, and we describe several databases dedicated to virus-host interaction studies. We present a short but comprehensive review of PPIs, including the experimental and computational methods for finding PPIs, the databases dedicated to virus-host PPIs, and various applications of protein interaction networks of some lethal viruses and their hosts.
GeoStar is the registered trademark of GIS software made by WTUSM in China. With GeoStar, multi-scale images, DEMs, graphics, and attributes can be integrated into very large seamless databases, and multi-dimensional dynamic visualization and information extraction are also available. This paper describes the fundamental characteristics of such huge integrated databases, for instance, the data models, database structures, and spatial index strategies. Finally, typical applications of GeoStar in a few pilot projects, such as the Shanghai CyberCity and the Guangdong provincial spatial data infrastructure (SDI), are illustrated, and several concluding remarks are given.
Funding (study on identifying personal information in NoSQL databases for DSAR responses): supported by the National Natural Science Foundation of China (No. 62302242) and the China Postdoctoral Science Foundation (No. 2023M731802).
Funding: Supported in part by the Slovenian Research Agency (VB, research program P1-0294; VB, research project J5-2557; VB, research project J5-4596) and COST EU (VB, COST action CA21163 (HiTEc)). Prepared within the framework of the HSE University Basic Research Program.
Abstract: Purpose: We analyzed the structure of a community of authors working in the field of social network analysis (SNA) based on citation indicators: direct citation and bibliographic coupling metrics. We observed patterns at the micro, meso, and macro levels of analysis. Design/methodology/approach: We used bibliometric network analysis, including the "temporal quantities" approach proposed for studying temporal networks. Using a two-mode network linking publications with authors and a one-mode network of citations between the works, we constructed and analyzed the networks of citation and bibliographic coupling among authors. We used an iterated saturation data collection approach. Findings: At the macro level, we observed the global structural features of citations between authors, showing that 80% of authors have no more than 15 citations from other works. At the meso level, we extracted groups of authors who cite each other and who are similar to each other in their citation patterns. We observed a division of SNA authors into groups of social scientists and physicists, as well as into other groups of authors from different disciplines. We found some examples of brokerage between different groups that maintained the common identity of the field. At the micro level, we extracted authors with extremely high numbers of received citations, who can be considered the most prominent authors in the field. We examined the temporal properties of the most popular authors. Research limitations: The main challenge in this approach is the resolution of authors' names (synonyms and homonyms). We faced the author disambiguation, or "multiple personalities" (Harzing, 2015), problem. To remain consistent and comparable with our previously published articles, we used the same SNA data collected up to 2018. The analysis and conclusions on the activity, productivity, and visibility of the authors apply only to the field of SNA. Practical implications: The proposed approach can be used for similar objectives and for identifying key structures and characteristics in other disciplines. This may inspire the application of network approaches in other research areas, creating more authors collaborating in the field of SNA. Originality/value: We identified and applied an innovative approach and methods to study the structure of scientific communities, which yielded findings beyond those obtained with other methods. We used a new approach to temporal network analysis, which is an important addition because it provides detailed information on different measures for authors and pairs of authors over time.
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 61888102, 52272172, and 52102193), the Major Program of the National Natural Science Foundation of China (Grant No. 92163206), the National Key Research and Development Program of China (Grant Nos. 2021YFA1201501 and 2022YFA1204100), the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB30000000), and the Fundamental Research Funds for the Central Universities.
Abstract: The discovery of materials using "bottom-up" or "top-down" approaches is of great interest in materials science. Layered materials consisting of two-dimensional (2D) building blocks provide a good platform for exploring new materials in this respect. In van der Waals (vdW) layered materials, these building blocks are charge neutral and can be isolated from their bulk phase (top-down), but they usually grow on a substrate. In ionic layered materials, they are charged and usually cannot exist independently, but they can serve as motifs for constructing new materials (bottom-up). In this paper, we introduce our recently constructed databases for 2D material-substrate interfaces (2DMSI) and 2D charged building blocks. For the 2DMSI database, we systematically build a workflow to predict appropriate substrates and the geometries of 2D materials on those substrates, and construct the database from the results. For the 2D charged building block database, 1208 entries are identified from a bulk material database. Information on crystal structure, valence state, source, dimension, and so on is provided for each entry in JSON format. We also show its application in designing and searching for new functional layered materials. The 2DMSI database, the building block database, and the designed layered materials are available in Science Data Bank at https://doi.org/10.57760/sciencedb.j00113.00188.
Abstract: Electronic patient data brings many advantages, but also new difficulties. Deadlocks may delay procedures such as acquiring patient information. Distributed deadlock resolution solutions introduce uncertainty due to inaccurate transaction properties. Soft computing-based solutions have been developed to address this challenge. However, handling ambiguous, vague, incomplete, and inconsistent transaction attribute information in a single framework has received minimal attention. The work presented in this paper employs type-2 neutrosophic logic, an extension of type-1 neutrosophic logic, to handle uncertainty in real-time deadlock-resolving systems. The proposed method is structured to reflect multiple types of knowledge and relations among transactions' features, including validation factor degree, slackness degree, and degree of deadline-missed transaction, based on the degree of membership of truthiness, the degree of membership of indeterminacy, and the degree of membership of falsity. Here, the footprint of uncertainty (FOU) for truth, indeterminacy, and falsity represents the level of uncertainty that exists in the value of a grade of membership. We employed a distributed real-time transaction processing simulator (DRTTPS) to conduct the simulations and ran experiments using the benchmark Pima Indians diabetes dataset (PIDD). The results showed an increase in detection rate and a large drop in rollback rate when the new strategy is used. The type-2 neutrosophic-based resolution performs better than the type-1 neutrosophic-based approach on the execution ratio scale. The improvement rate reaches 10% to 20%, depending on the number of arrived transactions.
Funding: National Natural Science Foundation of China (No. 82060503).
Abstract: Objective: To investigate the variation, expression, and clinical significance of the E2F3 gene in melanoma. Methods: First, the cBioportal, Oncomine, and GEO databases were used to analyze the variation and expression level of the E2F3 gene in melanoma. The OSskcm and TISIDB databases were used to analyze the relationship between E2F3 and melanoma prognosis and tumor immune infiltrating cells. Then, the LinkedOmics database was used to identify the differential genes related to E2F3 expression in melanoma and to analyze their biological functions. Finally, small molecule compounds for the treatment of melanoma were screened through the CMap database. Results: The mutation rate of the E2F3 gene in melanoma is about 4%, with 21 mutation sites. Compared with normal skin tissues, the expression of the E2F3 gene in melanoma was significantly increased (P<0.01). The mutation and increased expression of the E2F3 gene were associated with shortened overall survival (OS) of melanoma patients (P<0.05). The CNA level of E2F3 was negatively correlated with the expression levels of lymphocytes such as pDC, Neutrophil, Act DC, and Th17, and negatively correlated with the expression levels of chemokines such as CXCL5, CCL13, and CCR1. The methylation level of E2F3 was positively correlated with the expression levels of Th1, Neutrophil, Act DC, and other lymphocytes, and positively correlated with the expression levels of CXCL16, CXCL12, CCR1, and other chemokines. The expression level of E2F3 was negatively correlated with the expression levels of lymphocytes such as Th17, Tcm CD4, and Th1, and negatively correlated with the expression levels of chemokines such as CXCL16, CCL22, and CCL2. The expression of 96 genes, such as UHRF1BP1, in melanoma was significantly correlated with the expression of E2F3 (|cor|0.5, P<0.05). These genes were mainly related to RNA transport, eukaryotic ribosome biogenesis, the cell cycle, and other pathways. Among them, WDR12, WDR43, RBM28, UTP18, DKC1, PAK1IP1, DDX31, TEX10, TRUB1, and TRMT61B were the top 10 hub genes. YC-1, simvastatin, cytochalasin-d, Deforolimus, and cytochalasin-b may be five small molecule compounds for the treatment of melanoma. Conclusion: The mutation and increased expression level of the E2F3 gene are related to the poor prognosis of melanoma and participate in the occurrence and development of melanoma by affecting the expression of different tumor immune infiltrating cell subtypes; E2F3 may be a potential diagnostic marker and therapeutic target for melanoma.
Funding: Supported by the Open Research Fund Program of LIESMARS (WKL(00)0302).
Abstract: Clustering, in data mining, is a useful technique for discovering interesting data distributions and patterns in the underlying data, and has many application fields, such as statistical data analysis, pattern recognition, and image processing. We combine a sampling technique with the DBSCAN algorithm to cluster large spatial databases, and develop two sampling-based DBSCAN (SDBSCAN) algorithms. One algorithm introduces sampling inside DBSCAN, and the other applies a sampling procedure outside DBSCAN. Experimental results demonstrate that our algorithms are effective and efficient in clustering large-scale spatial databases.
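The "sampling outside DBSCAN" idea can be illustrated with a minimal sketch. This is not the SDBSCAN algorithm from the paper, whose details the abstract does not give. It assumes one plausible reading: run plain DBSCAN on a random sample, then give every point the label of its nearest clustered sample point within `eps`. The function names `dbscan` and `sampled_dbscan` and all parameters are hypothetical.

```python
import math
import random


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point; -1 means noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels


def sampled_dbscan(points, eps, min_pts, sample_size, seed=0):
    """Cluster a random sample, then label all points from the sample.

    Assumption: a point inherits the label of the nearest clustered
    sample point within eps, and is noise otherwise.
    """
    rng = random.Random(seed)
    sample = [points[i] for i in rng.sample(range(len(points)), sample_size)]
    sample_labels = dbscan(sample, eps, min_pts)
    labels = []
    for p in points:
        best, best_d = -1, eps
        for q, lab in zip(sample, sample_labels):
            d = math.dist(p, q)
            if lab != -1 and d <= best_d:
                best, best_d = lab, d
        labels.append(best)
    return labels
```

The trade-off in this sketch is that DBSCAN's neighborhood searches run only over the sample, at the cost of possibly mislabeling points in sparsely sampled regions; `min_pts` would normally be scaled down with the sampling rate.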
Abstract: Data acquisition and modeling are two important, difficult, and costly aspects of a Cybercity project. 2D-GIS is mature and can manage large amounts of spatial data, so 3D-GIS should make the best use of the data and technology of 2D-GIS. Constructing a useful synthetic environment requires integrating multiple types of information, such as DEMs, texture images, and 3D representations of objects such as buildings. In this paper, a method for 3D city landscape data modeling and visualization based on integrated databases is presented. Since raster data volumes are very large, special strategies (for example, the pyramid gridded method) must be adopted to manage raster data efficiently. Three different methods of data acquisition, the proper data structure, and a simple modeling method are presented as well. Finally, a pilot project of the Shanghai Cybercity is illustrated.
Funding: Supported by the National Natural Science Foundation of China (60471055) and the National "863" High Technology Research and Development Program of China (2007AA01Z443).
Abstract: To solve the problems of sharing and reusing information in information systems, a rules-based approach to constructing ontologies from object-relational databases is proposed. A 3-tuple ontology constructing model is proposed first. Then, four types of ontology constructing rules, covering classes, properties, property characteristics, and property restrictions, are formalized according to the model. Experimental results described in the Web Ontology Language show that the proposed approach is feasible for use in the semantic objects project of the semantic computing laboratory at UC Irvine. Our approach reduces construction time by about twenty percent compared with ontology construction from relational databases.
Funding: Supported by the National Natural Science Foundation of China (69873027).
Abstract: Most knowledgeable people agree that networking and routing technologies have been around for about 25 years. Routing is at once the most complicated function of a network and the most important. Similarly, more than 70% of computer application fields are MIS applications. The challenge in building and using an MIS over a network is therefore to develop the means to find, access, and communicate with large databases or multi-database systems. Because general databases are not time-continuous and cannot be streamed, we cannot obtain reliable and secure quality of service by dropping unimportant datagrams during database transmission. In this article, we discuss which kind of routing protocol is best suited to transmitting large databases or multi-database systems over networks.
Abstract: The necessity and feasibility of introducing attribute weights into a digital fingerprinting system are discussed. A weighted algorithm for fingerprinting relational databases for traitor tracing is proposed. Higher weights are assigned to more significant attributes, so important attributes are fingerprinted more frequently than others. Finally, the robustness of the proposed algorithm, such as its performance against collusion attacks, is analyzed. Experimental results demonstrate the superiority of the algorithm.
Funding: Supported by the Aeronautics Science Foundation of China (02F52033), the High-Technology Research Project of Jiangsu Province (BG2004005), and the Youth Research Foundation of Qufu Normal University (XJ02057).
Abstract: A weighted algorithm for watermarking relational databases for copyright protection is presented. The probability of watermarking an attribute is assigned according to its weight, which is decided by the owner of the database. A one-way hash function and a secret key known only to the owner of the data are used to select the tuples and bits to mark. By assigning high weights to significant attributes, the scheme ensures that important attributes are more likely to be marked than less important ones. Experimental results show that the proposed scheme is robust against various forms of attack and has perfect immunity to subset attack.
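The keyed selection of tuples, attributes, and bits described above can be sketched as follows. This is an illustrative reading and not the paper's algorithm. It assumes integer-valued attributes, a primary key column named `id`, positive integer weights, HMAC-SHA256 as the keyed one-way hash, and a marking rate of one tuple in `gamma`. All function names and parameters are hypothetical.

```python
import hashlib
import hmac


def keyed_hash(secret_key: bytes, *parts) -> int:
    """Keyed one-way hash of the given parts, returned as an integer."""
    msg = "|".join(str(p) for p in parts).encode()
    return int.from_bytes(hmac.new(secret_key, msg, hashlib.sha256).digest(), "big")


def pick_attribute(h: int, weights: dict) -> str:
    """Map hash h to an attribute with probability proportional to its weight."""
    slot = h % sum(weights.values())
    for name, weight in weights.items():
        if slot < weight:
            return name
        slot -= weight
    raise ValueError("weights must be positive integers")


def mark_plan(row_id, secret_key, weights, gamma, num_lsb):
    """Return (attribute, bit position, bit value), or None if the tuple is skipped."""
    if keyed_hash(secret_key, "select", row_id) % gamma != 0:
        return None
    attr = pick_attribute(keyed_hash(secret_key, "attr", row_id), weights)
    bit_pos = keyed_hash(secret_key, "bit", row_id) % num_lsb
    bit_val = keyed_hash(secret_key, "value", row_id) & 1
    return attr, bit_pos, bit_val


def embed(rows, secret_key, weights, gamma=2, num_lsb=2):
    """Return copies of rows with the planned low-order bits overwritten."""
    marked = []
    for row in rows:
        row = dict(row)
        plan = mark_plan(row["id"], secret_key, weights, gamma, num_lsb)
        if plan is not None:
            attr, bit_pos, bit_val = plan
            row[attr] = (row[attr] & ~(1 << bit_pos)) | (bit_val << bit_pos)
        marked.append(row)
    return marked


def detect(rows, secret_key, weights, gamma=2, num_lsb=2):
    """Count how many planned bits match; returns (matches, total)."""
    matches = total = 0
    for row in rows:
        plan = mark_plan(row["id"], secret_key, weights, gamma, num_lsb)
        if plan is not None:
            attr, bit_pos, bit_val = plan
            total += 1
            matches += int(((row[attr] >> bit_pos) & 1) == bit_val)
    return matches, total
```

Because every choice is derived from the secret key and the primary key, only the key holder can recompute which bits to check. In an unmarked table, about half of the planned bits would be expected to match by chance, so a match count close to `total` is the evidence of a watermark.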
Funding: Supported by the Researchers Supporting Project (No. RSP-2020/102), King Saud University, Riyadh, Saudi Arabia; the National Natural Science Foundation of China (Nos. 61802031, 61772454, 61811530332, 61811540410); the Natural Science Foundation of Hunan Province, China (No. 2019JGYB177); the Research Foundation of Education Bureau of Hunan Province, China (No. 18C0216); the "Practical Innovation and Entrepreneurial Ability Improvement Plan" for Professional Degree Graduate Students of Changsha University of Science and Technology (No. SJCX201971); and the Hunan Graduate Scientific Research Innovation Project, China (No. CX2019694). This work is also supported by the Programs of Transformation and Upgrading of Industries and Information Technologies of Jiangsu Province (No. JITC-1900AX2038/01).
Abstract: As typical peer-to-peer distributed networks, blockchain systems require each node to copy a complete transaction database, so that new transactions can be verified independently. In a blockchain system (e.g., the Bitcoin system), a node does not rely on any central organization, and every node keeps an entire copy of the transaction database. However, this feature means that the blockchain transaction database grows rapidly. As the system continues to operate, node memory must also be expanded to keep the system running. In the big data era especially, increasing network traffic leads to a faster transaction growth rate. This paper analyzes blockchain transaction databases and proposes a storage optimization scheme. The proposed scheme divides the blockchain transaction database into a cold zone and a hot zone using an expiration recognition method based on the Least Recently Used (LRU) algorithm. It achieves storage optimization by moving unspent transaction outputs outside the in-memory transaction databases. We present a theoretical analysis of the optimization method to validate its effectiveness. Extensive experiments show that the proposed method outperforms the current mechanism for blockchain transaction databases.
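The hot/cold split driven by LRU can be sketched in a few lines. This is a generic LRU illustration and not the paper's expiration recognition method, which the abstract does not specify. It assumes a fixed-capacity in-memory hot zone and a cold zone standing in for slower storage; the class name `HotColdStore` and its methods are hypothetical.

```python
from collections import OrderedDict


class HotColdStore:
    """Keep recently used records in a bounded hot zone; demote the rest."""

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()  # ordered from least to most recently used
        self.cold = {}            # stand-in for on-disk storage (assumption)

    def put(self, txid, record):
        """Insert or update a record; it becomes the most recently used."""
        self.cold.pop(txid, None)
        self.hot[txid] = record
        self.hot.move_to_end(txid)
        self._evict()

    def get(self, txid):
        """Fetch a record, promoting it to the hot zone if it was cold."""
        if txid in self.hot:
            self.hot.move_to_end(txid)
            return self.hot[txid]
        record = self.cold.pop(txid)  # raises KeyError for unknown ids
        self.hot[txid] = record
        self._evict()
        return record

    def _evict(self):
        """Demote least recently used records until the hot zone fits."""
        while len(self.hot) > self.hot_capacity:
            old_id, old_record = self.hot.popitem(last=False)
            self.cold[old_id] = old_record
```

With `hot_capacity=2`, storing `a` and `b`, reading `a`, and then storing `c` demotes `b`, because `b` is the least recently used entry at that point.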
Funding: Supported by the National Natural Science Foundation of China under "Outstanding Young Researchers" (49525101).
Abstract: This is a period of information explosion. In spatial information science especially, information can be acquired in many ways, such as man-made satellites, aeroplanes, lasers, and digital photogrammetry. Spatial data sources are usually distributed and heterogeneous. A federated database is the best solution for the sharing and interoperation of spatial databases. In this paper, the concepts of federated databases and interoperability are introduced. Three heterogeneous kinds of spatial data (vector, image, and DEM) are used to create an integrated database. A data model for federated spatial databases is given.
Funding: Supported by the National 863 High-Tech Project (863-306-Z705-02) and the National Natural Science Foundation of China (69896240).
Abstract: The technique of Knowledge Discovery in Databases (KDD) is introduced to learn valuable knowledge hidden in network alarm databases. To obtain such knowledge, we propose an efficient method based on sliding windows (named Slidwin) for discovering different episode rules from time-sequential alarm data. The experimental results show that, given different threshold parameters, a large number of different rules can be discovered quickly.
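To make the sliding-window idea concrete, here is a minimal sketch of window-based episode rule mining. It is not the Slidwin method, whose details are not given in the abstract. It assumes integer timestamps, windows of fixed width starting at every time step between the first and last event, and only two-alarm serial episodes ("A is followed by B within one window"). The function names and thresholds are hypothetical.

```python
from collections import Counter


def window_counts(events, width):
    """Count the windows containing each alarm type and each serial pair.

    events: list of (time, alarm_type) with integer times.
    A window starting at s covers times s <= t < s + width.
    """
    events = sorted(events)
    t_min, t_max = events[0][0], events[-1][0]
    singles, pairs = Counter(), Counter()
    n_windows = 0
    for start in range(t_min, t_max + 1):
        window = [(t, a) for t, a in events if start <= t < start + width]
        n_windows += 1
        seen_types, seen_pairs = set(), set()
        for i, (t1, a) in enumerate(window):
            seen_types.add(a)
            for t2, b in window[i + 1:]:
                if t1 < t2:  # serial episode: a strictly before b
                    seen_pairs.add((a, b))
        singles.update(seen_types)
        pairs.update(seen_pairs)
    return singles, pairs, n_windows


def episode_rules(events, width, min_freq, min_conf):
    """Return rules (a, b, frequency, confidence) meaning 'a is followed by b'."""
    singles, pairs, n_windows = window_counts(events, width)
    rules = []
    for (a, b), count in pairs.items():
        freq = count / n_windows   # share of windows containing a then b
        conf = count / singles[a]  # of the windows containing a, share with a then b
        if freq >= min_freq and conf >= min_conf:
            rules.append((a, b, freq, conf))
    return sorted(rules)
```

Lowering `min_freq` or `min_conf` admits more rules, which matches the abstract's observation that the number of discovered rules depends on the threshold parameters.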
Funding: Supported by the National Natural Science Foundation of China (60073045).
Abstract: In this paper, the constrained K closest pairs query is introduced, which retrieves the K closest pairs satisfying a given spatial constraint from two datasets. For datasets indexed by R-trees in spatial databases, three algorithms are presented for answering this kind of query. Among them, the two-phase Range+Join and Join+Range algorithms adopt a strategy that changes the execution order of the range and closest-pairs queries, while the constrained heap-based algorithm uses extended distance functions to prune the search space and minimize the pruning distance. Experimental results show that the constrained heap-based algorithm has better applicability and performance than the two-phase algorithms.
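The query itself can be stated precisely with a brute-force sketch in the Range+Join order: filter first, then pair. It does not use R-trees or the paper's pruning, so it only shows what the query returns, not how the paper computes it efficiently. It assumes the spatial constraint is an axis-aligned rectangle that both points of a pair must lie in; the function names are hypothetical.

```python
import heapq
import math
from itertools import product


def in_rect(point, rect):
    """True if the 2D point lies inside rect = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect
    return xmin <= point[0] <= xmax and ymin <= point[1] <= ymax


def constrained_k_closest_pairs(P, Q, rect, k):
    """Return the k closest (distance, p, q) with p in P, q in Q, both in rect."""
    # Range phase: keep only the points that satisfy the spatial constraint.
    p_in = [p for p in P if in_rect(p, rect)]
    q_in = [q for q in Q if in_rect(q, rect)]
    # Join phase: keep the k smallest distances over all remaining pairs.
    candidates = ((math.dist(p, q), p, q) for p, q in product(p_in, q_in))
    return heapq.nsmallest(k, candidates)
```

For example, with `rect = (0, 0, 10, 10)` a pair at distance 1 located around (20, 20) is ignored, even though an unconstrained closest-pairs query would rank it first.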
Funding: National Natural Science Foundation of China, No. 31971180 and No. 11474013.
Abstract: Almost all cellular processes in a living system are controlled by proteins: they regulate gene expression, catalyze chemical reactions, transport small molecules across membranes, and transmit signals across membranes. Even a viral infection is often initiated through virus-host protein interactions. Protein-protein interactions (PPIs) are the physical contacts between two or more proteins, and they underlie complex biological functions. PPIs are now used to construct PPI networks for studying complex pathways and revealing the functions of unknown proteins. Scientists have used PPIs to find the molecular basis of certain diseases as well as potential drug targets. In this review, we discuss how PPI networks are essential to understanding the molecular basis of virus-host relationships, and we describe several databases dedicated to virus-host interaction studies. We present a short but comprehensive review of PPIs, including the experimental and computational methods for finding PPIs, the databases dedicated to virus-host PPIs, and various applications in the protein interaction networks of some lethal viruses and their hosts.
Abstract: GeoStar is the registered trademark of GIS software made by WTUSM in China. With GeoStar, multi-scale images, DEMs, graphics, and attributes can be integrated into very large seamless databases, and multi-dimensional dynamic visualization and information extraction are also available. This paper describes the fundamental characteristics of such huge integrated databases, for instance the data models, database structures, and spatial index strategies. Finally, typical applications of GeoStar in a few pilot projects, such as the Shanghai CyberCity and the Guangdong provincial spatial data infrastructure (SDI), are illustrated, and several concluding remarks are given.