This paper describes the Papers in Chinese Seismological Journal (CSJP) database (English edition) comprehensively, including the retrieval system of the database, the database features, the documental indexing, and the documental record format, etc. It gives the block diagram of the retrieval system and the flow chart of the documental processing.
Objectives: To analyze the documental quality of 389 websites in Portuguese about physical activity, healthy lifestyles and sedentary lifestyles found on the Brazilian version of the general search engine Google. Methods: The documental quality of the 389 websites was estimated based upon the following parameters: 1) a combination of quality criteria from the Health Information Locator (LIS—OPS/BIREME) and those from Chile’s Pontifical Catholic University, organized into 17 variables; 2) uniformity of reference criteria (Vancouver); 3) association between the presence of authorship and a higher number of the quality criteria being fulfilled. We also studied the ranking of the results presented by Google in addition to attributes connected to the websites’ target audience, the types of content, their sponsors and country of origin. Results: Of the 389 websites studied, 111 links were not active (28.53%, 95% CI [24.05 - 33.02]) and none of the websites in the sample met all of the 17 quality variables. Authored websites displayed remarkable differences in quality when compared to those which did not identify their authors. Conclusions: Faced with the proliferation of websites with questionable-quality content, and the fact that the ranking of results interferes directly in the internal evaluation of content relevance, we propose that public-health research institutions cooperate with web-search developers to improve the website-positioning formula, in which the “identified authorship” criterion should play a major role in the ranking system.
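The interval reported for inactive links (111 of 389; 28.53%, 95% CI [24.05 - 33.02]) can be reproduced with a standard normal-approximation (Wald) confidence interval for a proportion; a minimal sketch:

```python
import math

def wald_ci(successes: int, n: int, z: float = 1.96):
    """Point estimate and Wald 95% confidence interval for a proportion."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width

p, lo, hi = wald_ci(111, 389)
print(f"{100*p:.2f}% CI95 [{100*lo:.2f} - {100*hi:.2f}]")  # 28.53% CI95 [24.05 - 33.02]
```

The computed bounds match the abstract's figures, suggesting the authors used this standard normal approximation.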
Seal authentication is an important task for verifying the authenticity of stamped seals used in various domains to protect legal documents from tampering and counterfeiting. Stamped seal inspection is commonly audited manually to ensure document authenticity. However, manual assessment of seal images is tedious and labor-intensive due to human error and to variations in seal placement and completeness. Traditional image recognition systems are inadequate for identifying seal types accurately, necessitating a neural network-based method for seal image recognition. However, neural network-based classification algorithms, such as Residual Networks (ResNet) and Visual Geometry Group with 16 layers (VGG16), yield suboptimal recognition rates on stamp datasets. Additionally, the fixed set of training categories makes handling new categories challenging. This paper proposes a multi-stage seal recognition algorithm based on a Siamese network to overcome these limitations. Firstly, the seal image is pre-processed by an image rotation correction module based on the Histogram of Oriented Gradients (HOG). Secondly, the similarity between input seal image pairs is measured by a similarity comparison module based on the Siamese network. Finally, we compare the results with pre-stored standard seal template images in the database to obtain the seal type. To evaluate the performance of the proposed method, we further create a new seal image dataset that contains two subsets with 210,000 valid labeled pairs in total. The proposed work has practical significance in industries where automatic seal authentication is essential, such as the legal, financial, and governmental sectors, where automatic seal recognition can enhance document security and streamline validation processes. Furthermore, the experimental results show that the proposed multi-stage method for seal image recognition outperforms state-of-the-art methods on the two established datasets.
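The final template-matching stage can be sketched as follows. This is an illustration only: the embedding vectors, labels, and threshold are placeholders, not the authors' Siamese network; the sketch merely shows how a query embedding would be compared against stored template embeddings to pick a seal type.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def match_seal(query_emb, templates, threshold=0.8):
    """Return the best-matching template label, or None if nothing is similar enough.

    `templates` maps seal-type labels to embeddings that a (hypothetical)
    Siamese branch would produce for the stored standard seal images."""
    best_label, best_sim = None, -1.0
    for label, emb in templates.items():
        sim = cosine(query_emb, emb)
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label if best_sim >= threshold else None

templates = {"official": [1.0, 0.1, 0.0], "contract": [0.0, 1.0, 0.2]}
print(match_seal([0.9, 0.15, 0.05], templates))  # official
```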
This video series is the first experimental psychology documentary made in China. It focuses on analyzing professional theories to raise people’s general understanding of basic psychology. By combining innovative audiovisual narrative with psychological experiments, it zooms in on real human nature through discussing social hotspots from the perspectives of social psychology, cognitive psychology, and personality psychology, in order to help people find answers for their current psychological difficulties.
In the information age, electronic documents (e-documents) have become a popular alternative to paper documents due to their lower costs, higher dissemination rates, and ease of knowledge sharing. However, digital copyright infringements occur frequently due to the ease of copying, which not only infringes on the rights of creators but also weakens their creative enthusiasm. Therefore, it is crucial to establish an e-document sharing system that enforces copyright protection. However, existing centralized systems have notable vulnerabilities, the plagiarism detection algorithms they use cannot fully account for the context, semantics, style, and other characteristics of the text, and digital watermarking serves only as a means of infringement tracing. This paper proposes a decentralized framework for e-document sharing based on a decentralized autonomous organization (DAO) and non-fungible tokens (NFTs) on a blockchain. The use of the blockchain as a distributed trust base resolves the vulnerabilities inherent in traditional centralized systems. The e-document evaluation and plagiarism detection mechanisms based on the DAO model effectively address the challenge of comprehensive text checks, thereby promoting higher e-document quality. The mechanism for protecting and circulating e-document copyrights using NFT technology ensures effective safeguarding of users’ e-document copyrights and facilitates e-document sharing. Moreover, recognizing the security issues within the DAO governance mechanism, we introduce an optimization solution and validate experimentally that it enhances security, reducing manipulation risks by up to 51%. Additionally, by using evolutionary game analysis to deduce the equilibrium strategies of the framework, we find that adjusting the reward and penalty parameters of the incentive mechanism motivates creators to produce higher-quality, unique e-documents, while evaluators become more likely to engage in assessments.
Purpose: Accurately assigning the document type of review articles in citation index databases like Web of Science (WoS) and Scopus is important. This study aims to investigate the document type assignment of review articles in Web of Science, Scopus, and publishers’ websites on a large scale. Design/methodology/approach: 27,616 papers from 160 journals in 10 review journal series indexed in SCI are analyzed. The document types of these papers as labeled on the journals’ websites and as assigned by WoS and Scopus are retrieved and compared to determine the assignment accuracy and identify possible reasons for incorrect assignment. For the document type labeled on the website, we further differentiate explicit reviews from implicit reviews based on whether the website directly indicates that the paper is a review. Findings: Overall, WoS and Scopus performed similarly, with an average precision of about 99% and recall of about 80%. However, there were some differences between WoS and Scopus across different journal series and within the same journal series. The assignment accuracy of WoS and Scopus for implicit reviews dropped significantly, especially for Scopus. Research limitations: The document types we used as the gold standard were based on the journal websites’ labeling and were not manually validated one by one. We only studied the labeling performance for review articles published during 2017-2018 in review journals; whether the conclusions extend to review articles published in non-review journals, or to the current situation, is unclear. Practical implications: This study provides a reference for the accuracy of document type assignment for review articles in WoS and Scopus, and the identified patterns for assigning implicit reviews may help improve labeling on websites, in WoS, and in Scopus. Originality/value: This study investigated the assignment accuracy of the document type of reviews and identified some patterns of incorrect assignment.
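The ~99% precision / ~80% recall figures follow the usual definitions over assigned versus true review labels. A minimal sketch with illustrative counts (not the study's actual tallies):

```python
def precision_recall(tp: int, fp: int, fn: int):
    """Precision = TP/(TP+FP); recall = TP/(TP+FN)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# Hypothetical counts: 800 true reviews correctly assigned as reviews,
# 8 non-reviews mislabeled as reviews, 200 true reviews missed.
p, r = precision_recall(tp=800, fp=8, fn=200)
print(f"precision={p:.1%} recall={r:.1%}")  # precision=99.0% recall=80.0%
```

High precision with much lower recall, as here, means the databases rarely mislabel a non-review as a review but miss a substantial share of true reviews.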
The Gannet Optimization Algorithm (GOA) and the Whale Optimization Algorithm (WOA) demonstrate strong performance; however, there remains room for improvement in convergence and practical applications. This study introduces a hybrid optimization algorithm, the adaptive inertia weight whale optimization algorithm and gannet optimization algorithm (AIWGOA), which addresses challenges in enhancing handwritten documents. The hybrid strategy integrates the strengths of both algorithms, significantly enhancing their capabilities, whereas the adaptive parameter strategy removes the need for manual parameter setting. By amalgamating the hybrid strategy and the parameter-adaptive approach, the Gannet Optimization Algorithm was refined to yield the AIWGOA. In a performance analysis on the CEC2013 benchmark, the AIWGOA demonstrates notable advantages across various metrics. Subsequently, an evaluation index was employed to assess the enhanced handwritten documents and images, affirming the superior practical applicability of the AIWGOA compared with other algorithms.
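The abstract does not give the AIWGOA's exact weight schedule; a common form of adaptive inertia weight, shown here purely as an assumed illustration, decreases linearly over the iteration budget so that early iterations explore and late iterations exploit:

```python
def inertia_weight(t: int, t_max: int, w_max: float = 0.9, w_min: float = 0.4) -> float:
    """Linearly decrease the inertia weight from w_max to w_min over t_max iterations.

    t: current iteration; t_max: total iterations. The bounds 0.9/0.4 are
    conventional defaults, not values taken from the paper."""
    return w_max - (w_max - w_min) * t / t_max

print(inertia_weight(0, 100))    # 0.9  (full exploration at the start)
print(inertia_weight(100, 100))  # 0.4  (exploitation at the end)
```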
The rapid expansion of online content and big data has precipitated an urgent need for efficient summarization techniques to swiftly comprehend vast textual documents without compromising their original integrity. Current approaches in Extractive Text Summarization (ETS) leverage the modeling of inter-sentence relationships, a task of paramount importance in producing coherent summaries. This study introduces a model that integrates Graph Attention Networks (GATs) with Bidirectional Encoder Representations from Transformers (BERT) and Latent Dirichlet Allocation (LDA), further enhanced by Term Frequency-Inverse Document Frequency (TF-IDF) values, to improve sentence selection by capturing comprehensive topical information. Our approach constructs a graph with nodes representing sentences, words, and topics, thereby elevating interconnectivity and enabling a more refined understanding of text structures. The model is extended from Single-Document Summarization to Multi-Document Summarization (MDS), offering significant improvements over existing models such as THGS-GMM and Topic-GraphSum, as demonstrated by empirical evaluations on benchmark news datasets such as Cable News Network (CNN)/Daily Mail (DM) and Multi-News. The results consistently demonstrate superior performance, showcasing the model’s robustness in handling complex summarization tasks across single- and multi-document contexts. This research not only advances the integration of BERT and LDA within a GAT framework but also emphasizes the model’s capacity to effectively manage global information and adapt to diverse summarization challenges.
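The TF-IDF values used to weight the graph can be computed with the standard scheme tf(t, d) × log(N / df(t)); the paper's exact variant is not given in the abstract, so this sketch shows only the textbook formulation:

```python
import math
from collections import Counter

def tfidf(docs):
    """Standard TF-IDF: weight(t, d) = tf(t, d) * log(N / df(t)),
    where N is the number of documents and df(t) counts documents containing t."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))          # count each term once per document
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return weights

docs = [["graph", "attention", "summary"], ["graph", "topic"], ["topic", "summary"]]
w = tfidf(docs)
# "graph" appears in 2 of 3 docs, so its idf is log(3/2); "attention" appears
# in only 1 doc, so it gets the larger weight log(3).
```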
As digital technologies have advanced, the number of paper documents converted into digital format has increased exponentially. To respond to the urgent need to categorize the growing number of digitized documents, the real-time classification of digitized documents is the primary goal of our study. Paper classification is the first stage in automating document control and efficient knowledge discovery with little or no human involvement. Artificial intelligence methods such as deep learning are now combined with segmentation to study and interpret document traits in ways that were not conceivable ten years ago. Deep learning aids in comprehending input patterns so that object classes may be predicted. The segmentation process divides the input image into separate segments for a more thorough image study. This study proposes a deep learning-enabled framework for automated document classification that can be implemented in higher education. To this end, a dataset was developed that includes seven categories: Diplomas, Personal documents, Journal of Accounting of higher education diplomas, Service letters, Orders, Production orders, and Student orders. Subsequently, a deep learning model based on Conv2D layers is proposed for the document classification process. In the final part of this research, the proposed model is evaluated and compared with other machine-learning techniques. The results demonstrate that the proposed deep learning model outperforms the other machine-learning models in document categorization, reaching 94.84%, 94.79%, 94.62%, 94.43%, and 94.07% in accuracy, precision, recall, F-score, and AUC-ROC, respectively. The achieved results show that the proposed deep model is suitable for practical use as an assistant to office workers.
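At the core of the Conv2D layers mentioned above is a sliding dot product between an image patch and a learned kernel. A minimal valid-mode implementation (technically cross-correlation, which is what deep-learning frameworks call "convolution"), independent of any framework:

```python
def conv2d_valid(image, kernel):
    """Valid-mode 2-D cross-correlation over nested lists (no padding, stride 1)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            # dot product of the kernel with the image patch at (i, j)
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
edge = [[1, -1]]  # simple horizontal-gradient kernel
print(conv2d_valid(image, edge))  # [[-1, -1], [-1, -1], [-1, -1]]
```

A Conv2D layer applies many such kernels in parallel and learns their values during training.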
Background Document images such as statistical reports and scientific journals are widely used in information technology. Accurate detection of table areas in document images is an essential prerequisite for tasks such as information extraction. However, because of the diversity in the shapes and sizes of tables, existing table detection methods adapted from general object detection algorithms have not yet achieved satisfactory results, and incorrect detection results might lead to the loss of critical information. Methods We therefore propose a novel end-to-end trainable deep network combined with a self-supervised pretrained transformer for feature extraction to minimize incorrect detections. To better deal with table areas of different shapes and sizes, we added a dual-branch context content attention module (DCCAM) to high-dimensional features to extract context content information, thereby enhancing the network's ability to learn shape features. For feature fusion at different scales, we replaced the original 3×3 convolution with a multilayer residual module, which carries enhanced gradient flow information to improve feature representation and extraction capability. Results We evaluated our method on public document datasets and compared it with previous methods; it achieved state-of-the-art results in terms of evaluation metrics such as recall and F1-score. Code: https://github.com/YongZ-Lee/TD-DCCAM
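Detection metrics such as the recall and F1-score cited above are conventionally computed after matching predicted boxes to ground-truth boxes by intersection-over-union (IoU); the abstract does not state the authors' matching threshold, so this is only the standard computation:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # overlap rectangle (empty if the boxes are disjoint)
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter)

# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1) = 1/7
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A prediction whose IoU with a ground-truth table exceeds a chosen threshold (commonly 0.5) counts as a true positive; recall and F1 then follow from the matched counts.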
BACKGROUND Imipenem is a highly effective carbapenem antibiotic that is widely used in the treatment of many serious bacterial infections. At the same time, it can also cause adverse reactions, among which psychiatric abnormalities are the central nervous system adverse reactions of greatest concern. Different patients respond differently to imipenem, and its effect on psychiatric disorders is unclear. Therefore, a meta-analysis summarizing the results of multiple previous studies can provide stronger evidence to support clinical guidelines and guide the rational clinical use of imipenem to minimize risks. After reviewing the literature published between 2003 and 2017, seven controlled trials with a total of 550 patients were included, with 273 and 277 patients in the control and experimental groups, respectively. The sample sizes of the individual studies ranged from 30 to 61 cases. Patients in the experimental group were treated with imipenem while the control group was treated with conventional drugs. Meta-analysis showed that the incidence of mental disorders in the experimental group was higher than that in the control group (odds ratio = 3.66, 95% confidence interval: 1.11-12.11, P = 0.030); however, there was no significant difference in the incidence of adverse reactions between the two groups (odds ratio = 0.05, 95% confidence interval: 0.00 to 0.10, P = 0.060). Funnel plots showed that the points for each study were symmetrical and distributed in an inverted funnel shape; therefore, there was no publication bias. CONCLUSION Imipenem can cause mental disorders in patients. However, the low quality of the included literature may have affected the final results. Therefore, a high-quality, multi-sample randomized controlled study is needed to further confirm the mechanism of imipenem-induced mental disorders and provide effective guidance for clinical treatment.
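An odds ratio with its 95% confidence interval, as reported above (OR = 3.66, 95% CI 1.11-12.11), is conventionally computed on the log scale from a 2×2 table. The counts below are purely hypothetical for illustration; they are not the study's data, and a pooled meta-analytic OR additionally weights the per-study estimates:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and 95% CI for a 2x2 table:
       a = events in experimental group,  b = non-events in experimental group,
       c = events in control group,       d = non-events in control group."""
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lo = math.exp(math.log(or_) - z * se_log)
    hi = math.exp(math.log(or_) + z * se_log)
    return or_, lo, hi

# Hypothetical single study: 12/277 events with imipenem vs 4/273 with control.
or_, lo, hi = odds_ratio_ci(12, 265, 4, 269)
print(f"OR={or_:.2f} CI95 [{lo:.2f}, {hi:.2f}]")
```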
Signature verification involves vague situations in which a signature could resemble many reference samples or might differ because of handwriting variance. By representing the features and similarity scores of signatures from the matching algorithm as fuzzy sets and capturing the degrees of membership, non-membership, and indeterminacy, a neutrosophic engine can significantly contribute to signature verification by addressing the inherent uncertainties and ambiguities present in signatures. But type-1 neutrosophic logic gives these membership functions fixed values, which cannot adequately capture the various degrees of uncertainty in the characteristics of signatures, and the type-1 neutrosophic representation is unable to adjust to varying degrees of uncertainty. The proposed work explores type-2 neutrosophic logic to enable additional flexibility and granularity in handling ambiguity, indeterminacy, and uncertainty, thereby improving the accuracy of signature verification systems. Because type-2 neutrosophic logic allows the assessment of many sources of ambiguity and conflicting information, decision-making is more flexible. The experimental results show the potential benefits of using a type-2 neutrosophic engine for signature verification by demonstrating its superior handling of uncertainty and variability over type-1, which ultimately yields more accurate False Rejection Rate (FRR) and False Acceptance Rate (FAR) verification results. In a comparative analysis using a benchmark dataset of handwritten signatures, the type-2 neutrosophic similarity measure yields an accuracy of 98%, compared with 95% for type-1.
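The FRR and FAR metrics used above have standard definitions: FRR is the fraction of genuine signatures rejected, and FAR is the fraction of forgeries accepted. A minimal sketch over illustrative similarity scores (the scores and threshold are placeholders, not the paper's neutrosophic measure):

```python
def frr_far(genuine_scores, forgery_scores, threshold):
    """False Rejection Rate and False Acceptance Rate at a given accept-threshold.
    Scores at or above the threshold are accepted as genuine."""
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    far = sum(s >= threshold for s in forgery_scores) / len(forgery_scores)
    return frr, far

genuine = [0.91, 0.85, 0.78, 0.96]   # illustrative similarity scores
forgery = [0.40, 0.55, 0.81, 0.30]
frr, far = frr_far(genuine, forgery, threshold=0.80)
print(frr, far)  # 0.25 0.25
```

Raising the threshold trades a higher FRR for a lower FAR, which is why both rates must be reported together when comparing verification engines.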
Research on the use of electronic health records (EHRs) presents contradictory results regarding the time spent documenting: some studies support the use of electronic records as a tool to speed documentation, while others found them time-consuming. The purpose of this quantitative retrospective before-after project was to measure the impact of using the laboratory value flowsheet within the EHR on documentation time. The research question was: “Does the use of a laboratory value flowsheet in the EHR impact documentation time by primary care providers (PCPs)?” The theoretical framework utilized in this project was the Donabedian Model. The population in this research was the two PCPs in a small primary care clinic in northwest Puerto Rico. The sample was composed of all the encounters during the months of October 2019 and December 2019. The data were obtained through data mining and analyzed using SPSS 27. The evaluative outcome of this project is that documentation time decreased after implementation of the laboratory value flowsheet in the EHR, while the number of patients seen per day, and hence per week and month, increased. The implications for clinical practice include the use of templates to improve workflow and documentation, decreasing documentation time while increasing the number of patients seen per day.
With the widespread use of Chinese globally, the number of Chinese learners has been increasing, leading to various grammatical errors among beginners. Additionally, as domestic efforts to develop industrial informatization grow, electronic documents have also proliferated. Among the many electronic documents and texts written by Chinese beginners, manually written texts often contain hidden grammatical errors, posing a significant challenge to traditional manual proofreading. Correcting these grammatical errors is crucial to ensure fluency and readability. Certain special types of grammatical or logical errors can have a considerable impact, and manually proofreading a large number of texts individually is clearly impractical. Consequently, research on text error correction techniques has garnered significant attention in recent years. The advent and advancement of deep learning have paved the way for sequence-to-sequence learning methods to be extensively applied to the task of text error correction. This paper presents a comprehensive analysis of Chinese grammatical error correction technology, elaborates on its current research status, discusses existing problems, proposes preliminary solutions, and conducts experiments using judicial documents as an example. The aim is to provide a feasible research approach for Chinese text error correction technology.
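Error-correction systems are commonly scored, and candidate corrections ranked, by edit distance between the erroneous text and a correction; this dynamic-programming sketch illustrates that general technique, not the paper's sequence-to-sequence model:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between strings a and b (insertions, deletions,
    substitutions all cost 1), computed row by row to use O(len(b)) memory."""
    prev = list(range(len(b) + 1))          # distance from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        cur = [i]                           # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                    # delete ca
                           cur[j - 1] + 1,                 # insert cb
                           prev[j - 1] + (ca != cb)))      # substitute ca -> cb
        prev = cur
    return prev[-1]

print(edit_distance("grammer", "grammar"))  # 1
```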
This paper explores how artificial intelligence (AI) can support social researchers in utilizing web-mediated documents for research purposes. It extends traditional documentary analysis to include digital artifacts such as blogs, forums, emails, and online archives. The discussion highlights the role of AI in different stages of the research process, including question generation, sample and design definition, ethical considerations, data analysis, and results dissemination, emphasizing how AI can automate complex tasks and enhance research design. The paper also reports on practical experiences using AI tools, specifically ChatGPT-4, in conducting web-mediated documentary analysis and shares some ideas for the integration of AI in social research.
Text classification is one of the key problems in information retrieval. Extracting more reliable negative examples and building accurate, efficient classifiers are two important problems in PU (positive and unlabeled) text classification. However, many existing methods extract only a small number of reliable negative examples, and the quality of the classifiers they build needs improvement. This paper proposes a clustering-based semi-supervised active classification method addressing these two steps. Unlike traditional negative-example extraction methods, it uses clustering and the observation that positive documents should share as few features as possible with negative documents to remove as many positive examples as possible from the unlabeled set, thereby obtaining more reliable negative examples. Combining SVM active learning with an improved Rocchio method to build the classifier, and using an improved TF-IDF (term frequency-inverse document frequency) scheme for feature extraction, significantly improves classification accuracy. Classification results were tested on three different datasets (RCV1, Reuters-21578, 20 Newsgroups). Experimental results show that finding reliable negatives via clustering obtains more reliable negative examples while keeping the error rate low, and that the introduction of active learning also significantly improves classification precision.
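The Rocchio component assigns a document to the class whose centroid it is closest to. A minimal sketch over generic feature vectors; the paper's improved Rocchio and improved TF-IDF variants are not reproduced here, only the basic centroid idea:

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def rocchio_classify(doc, positives, negatives):
    """Label doc by the class centroid it is more similar to (cosine similarity)."""
    pos_c, neg_c = centroid(positives), centroid(negatives)
    return "positive" if cosine(doc, pos_c) >= cosine(doc, neg_c) else "negative"

# Illustrative 3-feature vectors (e.g., TF-IDF weights of three terms).
positives = [[1.0, 0.9, 0.0], [0.8, 1.0, 0.1]]
negatives = [[0.0, 0.1, 1.0], [0.1, 0.0, 0.9]]
print(rocchio_classify([0.9, 0.8, 0.1], positives, negatives))  # positive
```

In the PU setting described above, the "negatives" fed to such a classifier are the reliable negative examples mined from the unlabeled set by clustering.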
文摘This paper describes the Papers in Chinese Seismological Journal (CSJP) database (English edition) comprehensively, including the retrieval system of the database, the database features, the documental indexing,and the documental record format etc. It gives the block diagram of the retrieval system and the flow chart ofthe documental processing.
文摘Objectives: To analyze the documental quality of 389 websites in Portuguese about physical activity, healthy lifestyles and sedentary lifestyles found on the Brazilian version of the general search engine Google. Methods: The documental quality of the 389 websites was estimated based upon the following parameters: 1) a combination of quality criteria from the Health Information Locator (LIS—OPS/BIREME) and those from Chile’s Pontifical Catholic University, organized into 17 variables;2) uniformity of reference criteria (Vancouver);3) association between the presence of authorship and a higher number of the quality criteria being fulfilled. We also studied the ranking of the results presented by Google in addition to attributes connected to the websites’ target audience, the types of content, their sponsors and country of origin. Results: Of the 389 websites studied, 111 links were not active (28.53% CI 95% [24.05 - 33.02]) and none of the websites in the sample met all of the 17 quality variables. Authored websites displayed remarkable differences in quality when compared to those which did not identify their authors. Conclusions: Faced with the issue of the proliferation of websites with questionable quality content, and the fact that the ranking of results interferes directly in the internal evaluation of content relevance, we propose that public-health research institutions cooperate with web-searching developers to improve the website-positioning formula, in which the “identified authorship” criterion should play a major role in the ranking system.
基金the National Natural Science Foundation of China(Grant No.62172132)Public Welfare Technology Research Project of Zhejiang Province(Grant No.LGF21F020014)the Opening Project of Key Laboratory of Public Security Information Application Based on Big-Data Architecture,Ministry of Public Security of Zhejiang Police College(Grant No.2021DSJSYS002).
文摘Seal authentication is an important task for verifying the authenticity of stamped seals used in various domains to protect legal documents from tampering and counterfeiting.Stamped seal inspection is commonly audited manually to ensure document authenticity.However,manual assessment of seal images is tedious and laborintensive due to human errors,inconsistent placement,and completeness of the seal.Traditional image recognition systems are inadequate enough to identify seal types accurately,necessitating a neural network-based method for seal image recognition.However,neural network-based classification algorithms,such as Residual Networks(ResNet)andVisualGeometryGroup with 16 layers(VGG16)yield suboptimal recognition rates on stamp datasets.Additionally,the fixed training data categories make handling new categories to be a challenging task.This paper proposes amulti-stage seal recognition algorithmbased on Siamese network to overcome these limitations.Firstly,the seal image is pre-processed by applying an image rotation correction module based on Histogram of Oriented Gradients(HOG).Secondly,the similarity between input seal image pairs is measured by utilizing a similarity comparison module based on the Siamese network.Finally,we compare the results with the pre-stored standard seal template images in the database to obtain the seal type.To evaluate the performance of the proposed method,we further create a new seal image dataset that contains two subsets with 210,000 valid labeled pairs in total.The proposed work has a practical significance in industries where automatic seal authentication is essential as in legal,financial,and governmental sectors,where automatic seal recognition can enhance document security and streamline validation processes.Furthermore,the experimental results show that the proposed multi-stage method for seal image recognition outperforms state-of-the-art methods on the two established datasets.
文摘This video series is the first experimental psychology documentary made in China.It focuses on analyzing professional theories to raise people’s general understanding of basic psychology.By combining innovative audiovisual narrative with psychological experiments,it zooms in on real human nature through discussing social hotspots from the perspectives of social psychology,cognitive psychology,and personality psychology,in order to help people find answers for their current psychological difficulties.
基金This work is supported by the National Key Research and Development Program(2022YFB2702300)National Natural Science Foundation of China(Grant No.62172115)+2 种基金Innovation Fund Program of the Engineering Research Center for Integration and Application of Digital Learning Technology of Ministry of Education under Grant Number No.1331005Guangdong Higher Education Innovation Group 2020KCXTD007Guangzhou Fundamental Research Plan of Municipal-School Jointly Funded Projects(No.202102010445).
Abstract: In the information age, electronic documents (e-documents) have become a popular alternative to paper documents due to their lower costs, higher dissemination rates, and ease of knowledge sharing. However, digital copyright infringements occur frequently because of the ease of copying, which not only infringes on the rights of creators but also weakens their creative enthusiasm. Therefore, it is crucial to establish an e-document sharing system that enforces copyright protection. However, existing centralized systems have outstanding vulnerabilities, the plagiarism detection algorithms used cannot fully account for the context, semantics, style, and other factors of the text, and digital watermark technology serves only as a means of infringement tracing. This paper proposes a decentralized framework for e-document sharing based on decentralized autonomous organization (DAO) and non-fungible token (NFT) technology in blockchain. Using blockchain as a distributed credit base resolves the vulnerabilities inherent in traditional centralized systems. The e-document evaluation and plagiarism detection mechanisms based on the DAO model effectively address the challenges of comprehensive text checking, thereby promoting improvements in e-document quality. The mechanism for protecting and circulating e-document copyrights using NFT technology safeguards users' e-document copyrights and facilitates e-document sharing. Moreover, recognizing the security issues within the DAO governance mechanism, we introduce an innovative optimization solution; through experimentation, we validate the enhanced security of the optimized governance mechanism, which reduces manipulation risks by up to 51%. Additionally, by using evolutionary game analysis to derive the framework's equilibrium strategies, we found that adjusting the reward and penalty parameters of the incentive mechanism motivates creators to generate higher-quality, unique e-documents, while evaluators are more likely to engage in assessments.
Abstract: Purpose: Accurately assigning the document type of review articles in citation index databases like Web of Science (WoS) and Scopus is important. This study investigates the document type assignment of review articles in WoS, Scopus, and publishers' websites on a large scale. Design/methodology/approach: 27,616 papers from 160 journals in 10 review journal series indexed in SCI are analyzed. The document types of these papers as labeled on the journals' websites and as assigned by WoS and Scopus are retrieved and compared to determine assignment accuracy and identify possible reasons for wrong assignments. For the document type labeled on the website, we further differentiate explicit reviews from implicit reviews based on whether the website directly indicates that the article is a review. Findings: Overall, WoS and Scopus performed similarly, with an average precision of about 99% and recall of about 80%. However, there were some differences between WoS and Scopus across journal series and within the same journal series. The assignment accuracy of WoS and Scopus for implicit reviews dropped significantly, especially for Scopus. Research limitations: The document types used as the gold standard were based on the journal websites' labeling, which was not manually validated item by item. We only studied labeling performance for review articles published during 2017-2018 in review journals; whether these conclusions extend to review articles published in non-review journals, or to the most current situation, is unclear. Practical implications: This study provides a reference for the document type assignment accuracy of review articles in WoS and Scopus, and the identified patterns for assigning implicit reviews may help improve labeling on websites, in WoS, and in Scopus. Originality/value: This study investigated the assignment accuracy of the document type of reviews and identified some patterns of wrong assignments.
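The precision and recall figures above follow the standard definitions against a gold standard (the website labels). A minimal sketch of how they could be computed, with made-up paper IDs rather than the study's data:

```python
# Illustrative only: precision/recall of a database's "review" label
# against website labels used as the gold standard.

def precision_recall(gold, predicted):
    """gold, predicted: sets of paper IDs labeled as reviews."""
    tp = len(gold & predicted)                       # correctly labeled reviews
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

gold = {"p1", "p2", "p3", "p4", "p5"}   # website says "review"
wos  = {"p1", "p2", "p3", "p4", "p9"}   # database says "review"

p, r = precision_recall(gold, wos)
print(p, r)  # 0.8 0.8
```

High precision with lower recall, as reported in the abstract, corresponds to the database rarely mislabeling non-reviews as reviews while missing a share of true (especially implicit) reviews.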
Abstract: The Gannet Optimization Algorithm (GOA) and the Whale Optimization Algorithm (WOA) demonstrate strong performance; however, there remains room for improvement in convergence and practical applications. This study introduces a hybrid optimization algorithm, named the adaptive inertia weight whale optimization algorithm and gannet optimization algorithm (AIWGOA), which addresses challenges in enhancing handwritten documents. The hybrid strategy integrates the strengths of both algorithms, significantly enhancing their capabilities, while the adaptive parameter strategy removes the need for manual parameter setting. By amalgamating the hybrid strategy and the parameter-adaptive approach, the Gannet Optimization Algorithm was refined to yield the AIWGOA. In a performance analysis on the CEC2013 benchmark, the AIWGOA demonstrates notable advantages across various metrics. Subsequently, an evaluation index was employed to assess the enhanced handwritten documents and images, affirming the superior practical applicability of the AIWGOA compared with other algorithms.
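The abstract does not give the AIWGOA's exact update rules, so the following is only a hedged sketch of the general idea: an inertia weight that adapts (here, decreases linearly) over iterations, combined with a simplified WOA-style move toward the current best solution. The schedule, coefficients, and function names are assumptions, not the paper's code:

```python
import random

def adaptive_inertia(t, t_max, w_max=0.9, w_min=0.4):
    """A common adaptive inertia schedule: decrease linearly from w_max to w_min."""
    return w_max - (w_max - w_min) * t / t_max

def woa_style_update(x, best, t, t_max):
    """One simplified whale-style move of candidate x toward the best solution."""
    w = adaptive_inertia(t, t_max)       # inertia keeps part of the old position
    a = 2.0 * (1.0 - t / t_max)          # exploration coefficient shrinks over time
    A = 2.0 * a * random.random() - a    # random step scale in [-a, a]
    return [w * xi + A * (bi - xi) for xi, bi in zip(x, best)]

# Early iterations keep high inertia (exploration), late ones low (exploitation).
x_new = woa_style_update([0.0, 0.0], [1.0, 1.0], t=10, t_max=100)
```

The point of making the weight adaptive, as the abstract notes, is to avoid hand-tuning it for each problem.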
Abstract: The rapid expansion of online content and big data has created an urgent need for efficient summarization techniques that allow vast textual documents to be comprehended swiftly without compromising their original integrity. Current approaches in Extractive Text Summarization (ETS) leverage the modeling of inter-sentence relationships, a task of paramount importance in producing coherent summaries. This study introduces a model that integrates Graph Attention Networks (GATs) with Bidirectional Encoder Representations from Transformers (BERT) and Latent Dirichlet Allocation (LDA), further enhanced by Term Frequency-Inverse Document Frequency (TF-IDF) values, to improve sentence selection by capturing comprehensive topical information. Our approach constructs a graph with nodes representing sentences, words, and topics, thereby elevating interconnectivity and enabling a more refined understanding of text structures. The model is extended from Single-Document Summarization to Multi-Document Summarization (MDS), offering significant improvements over existing models such as THGS-GMM and Topic-GraphSum, as demonstrated by empirical evaluations on benchmark news datasets such as Cable News Network (CNN)/Daily Mail (DM) and Multi-News. The results consistently demonstrate superior performance, showcasing the model's robustness in handling complex summarization tasks across single- and multi-document contexts. This research not only advances the integration of BERT and LDA within a GAT framework but also demonstrates the model's capacity to effectively manage global information and adapt to diverse summarization challenges.
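Of the components listed, TF-IDF is the simplest to make concrete. A stdlib-only sketch (illustrative, not the paper's implementation) of the kind of weights that could initialize word-node features in a sentence/word/topic graph:

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return a list of {word: tf-idf weight} dicts, one per document."""
    n = len(docs)
    df = Counter(w for d in docs for w in set(d.split()))  # document frequency
    weights = []
    for d in docs:
        tf = Counter(d.split())
        total = sum(tf.values())
        # term frequency scaled by inverse document frequency
        weights.append({w: (c / total) * math.log(n / df[w]) for w, c in tf.items()})
    return weights

docs = ["the cat sat", "the dog ran", "the cat ran"]
w = tf_idf(docs)
# "the" occurs in every document, so its idf term log(3/3) is zero,
# while rarer words like "cat" get positive weight.
```

In a graph like the one described, such weights down-rank ubiquitous words so attention concentrates on topically discriminative ones.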
Abstract: As digital technologies have advanced, the number of paper documents converted into digital format has increased exponentially. To respond to the urgent need to categorize the growing number of digitized documents, this study takes the real-time classification of digitized documents as its primary goal. Paper classification is the first stage in automating document control and efficient knowledge discovery with little or no human involvement. Artificial intelligence methods such as deep learning are now combined with segmentation to study and interpret traits that were not conceivable ten years ago. Deep learning aids in comprehending input patterns so that object classes may be predicted, while the segmentation process divides the input image into separate segments for a more thorough study. This study proposes a deep learning-enabled framework for automated document classification, which can be implemented in higher education. To further this goal, a dataset was developed that includes seven categories: Diplomas, Personal documents, Journal of Accounting of higher education diplomas, Service letters, Orders, Production orders, and Student orders. Subsequently, a deep learning model based on Conv2D layers is proposed for the document classification process. In the final part of this research, the proposed model is evaluated and compared with other machine-learning techniques. The results demonstrate that the proposed deep learning model outperforms the other machine learning models in document categorization, reaching 94.84%, 94.79%, 94.62%, 94.43%, and 94.07% in accuracy, precision, recall, F-score, and AUC-ROC, respectively. The achieved results show that the proposed model is acceptable for practical use as an assistant to an office worker.
Abstract: Background: Document images such as statistical reports and scientific journals are widely used in information technology. Accurate detection of table areas in document images is an essential prerequisite for tasks such as information extraction. However, because of the diversity in the shapes and sizes of tables, existing table detection methods adapted from general object detection algorithms have not yet achieved satisfactory results, and incorrect detections can lead to the loss of critical information. Methods: We therefore propose a novel end-to-end trainable deep network combined with a self-supervised pretrained transformer for feature extraction to minimize incorrect detections. To better handle table areas of different shapes and sizes, we add a dual-branch context content attention module (DCCAM) on high-dimensional features to extract context content information, thereby enhancing the network's ability to learn shape features. For feature fusion at different scales, we replace the original 3×3 convolution with a multilayer residual module, which carries enhanced gradient flow information to improve feature representation and extraction capability. Results: We evaluated our method on public document datasets and compared it with previous methods, achieving state-of-the-art results in terms of evaluation metrics such as recall and F1-score. The code is available at https://github.com/YongZ-Lee/TD-DCCAM.
Funding: Supported by the Education Research Program Project of Zhejiang Province, No. Y202043224.
Abstract: BACKGROUND Imipenem is a highly effective carbapenem antibiotic that is widely used to treat many serious bacterial infections. At the same time, it can cause adverse reactions, among which psychiatric abnormalities are the central nervous system adverse reactions of greatest concern. Different patients respond differently to imipenem, and its effect on psychiatric disorders is unclear. Therefore, a meta-analysis summarizing the results of multiple previous studies can provide stronger evidence to guide the rational clinical use of imipenem and minimize risks. METHODS After reviewing the literature published between 2003 and 2017, seven controlled trials with a total of 550 patients were included, with 273 and 277 patients in the control and experimental groups, respectively. The sample sizes of the included studies ranged from 30 to 61 cases. Patients in the experimental group were treated with imipenem, while the control group was treated with conventional drugs. RESULTS Meta-analysis showed that the incidence of mental disorders in the experimental group was higher than that in the control group (odds ratio = 3.66, 95% confidence interval: 1.11-12.11, P = 0.030); however, there was no significant difference in the incidence of adverse reactions between the two groups (odds ratio = 0.05, 95% confidence interval: 0.00-0.10, P = 0.060). Funnel plots showed that the points for each study were symmetrically distributed in an inverted funnel shape; therefore, there was no publication bias. CONCLUSION Imipenem can cause mental disorders in patients. However, the low quality of the included literature may have affected the final results. Therefore, a high-quality, multi-sample randomized controlled study is needed to further confirm the mechanism of imipenem-induced mental disorders and provide effective guidance for clinical treatment.
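The pooled statistic reported above is an odds ratio with a 95% confidence interval. A worked sketch of the standard computation from a single 2×2 table (the counts below are illustrative and are not the study's data):

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and 95% CI from a 2x2 table.
    a, b: events / non-events in the experimental group
    c, d: events / non-events in the control group
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)        # SE of log(OR), Woolf's method
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

# Illustrative counts only: 20/277 events on imipenem vs 6/273 on control.
or_, lo, hi = odds_ratio_ci(20, 257, 6, 267)
# An interval whose lower bound exceeds 1 indicates a significant excess risk,
# as with the abstract's reported OR = 3.66 (95% CI 1.11-12.11).
```

A meta-analysis pools such log-odds ratios across studies (e.g., inverse-variance weighting) rather than computing one table, but the per-study building block is the same.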
Abstract: Signature verification involves vague situations in which a signature could resemble many reference samples or might differ because of handwriting variance. By representing the features and similarity score of signatures from the matching algorithm as fuzzy sets and capturing the degrees of membership, non-membership, and indeterminacy, a neutrosophic engine can contribute significantly to signature verification by addressing the inherent uncertainties and ambiguities present in signatures. However, type-1 neutrosophic logic gives these membership functions fixed values, which cannot adequately capture the varying degrees of uncertainty in signature characteristics, and type-1 neutrosophic representation is unable to adjust to them. The proposed work explores type-2 neutrosophic logic to enable additional flexibility and granularity in handling ambiguity, indeterminacy, and uncertainty, hence improving the accuracy of signature verification systems. Because type-2 neutrosophic logic allows the assessment of many sources of ambiguity and conflicting information, decision-making is more flexible. The experimental results show the potential benefits of using a type-2 neutrosophic engine for signature verification by demonstrating its superior handling of uncertainty and variability over type-1, which ultimately yields more accurate False Rejection Rate (FRR) and False Acceptance Rate (FAR) results. In a comparison analysis using a benchmark dataset of handwritten signatures, the type-2 neutrosophic similarity measure achieves a better accuracy rate of 98%, compared with 95% for type-1.
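FAR and FRR, the metrics reported above, are computed from similarity scores at a decision threshold. A hedged sketch with made-up scores (not the benchmark's data):

```python
def far_frr(genuine, forgeries, threshold):
    """Accept a signature when its similarity score >= threshold.
    FRR: fraction of genuine signatures wrongly rejected.
    FAR: fraction of forgeries wrongly accepted.
    """
    frr = sum(s < threshold for s in genuine) / len(genuine)
    far = sum(s >= threshold for s in forgeries) / len(forgeries)
    return far, frr

genuine = [0.91, 0.85, 0.78, 0.88, 0.60]    # scores for genuine signatures
forgeries = [0.40, 0.55, 0.72, 0.30, 0.65]  # scores for forgeries

far, frr = far_frr(genuine, forgeries, 0.7)
print(far, frr)  # 0.2 0.2
```

Raising the threshold lowers FAR at the cost of FRR and vice versa; a richer similarity measure (such as the type-2 neutrosophic one described) aims to separate the two score distributions so both rates can drop together.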
Abstract: Research on EHR use reports contradictory results regarding the time spent documenting: some studies support electronic records as a tool that speeds documentation, while others find them time consuming. The purpose of this quantitative retrospective before-after project was to measure the impact of using the laboratory value flowsheet within the EHR on documentation time. The research question was: "Does the use of a laboratory value flowsheet in the EHR impact documentation time by primary care providers (PCPs)?" The theoretical framework utilized in this project was the Donabedian Model. The population in this research was the two PCPs in a small primary care clinic in the northwest of Puerto Rico. The sample was composed of all the encounters during the months of October 2019 and December 2019. The data were obtained through data mining and analyzed using SPSS 27. The evaluative outcome of this project is that documentation time decreased after implementation of the laboratory value flowsheet in the EHR. However, the number of patients seen per day increased, which in turn affects patient volume per day, week, and month. The implications for clinical practice include the use of templates to improve workflow and documentation, decreasing documentation time while also increasing the number of patients seen per day.
Abstract: With the widespread use of Chinese globally, the number of Chinese learners has been increasing, and beginners make various grammatical errors. Additionally, as domestic efforts to develop industrial informatization grow, electronic documents have proliferated. Among the many electronic documents and texts written by Chinese beginners, manually written texts often contain hidden grammatical errors, posing a significant challenge to traditional manual proofreading. Correcting these grammatical errors is crucial to ensure fluency and readability; certain special types of grammatical or logical errors can have an outsized impact, and manually proofreading a large number of texts individually is clearly impractical. Consequently, research on text error correction techniques has garnered significant attention in recent years. The advent and advancement of deep learning have paved the way for sequence-to-sequence learning methods to be extensively applied to the task of text error correction. This paper presents a comprehensive analysis of Chinese text grammar error correction technology, elaborates on its current research status, discusses existing problems, proposes preliminary solutions, and conducts experiments using judicial documents as an example. The aim is to provide a feasible research approach for Chinese text error correction technology.
Abstract: This paper explores how artificial intelligence (AI) can support social researchers in utilizing web-mediated documents for research purposes. It extends traditional documentary analysis to include digital artifacts such as blogs, forums, emails, and online archives. The discussion highlights the role of AI in different stages of the research process, including question generation, sample and design definition, ethical considerations, data analysis, and results dissemination, emphasizing how AI can automate complex tasks and enhance research design. The paper also reports on practical experiences using AI tools, specifically ChatGPT-4, in conducting web-mediated documentary analysis and shares some ideas for the integration of AI in social research.
Abstract: Text classification is one of the key problems in information retrieval. Extracting more reliable negative examples and building an accurate, efficient classifier are two important problems in PU (positive and unlabeled) text classification. However, many existing reliable-negative extraction methods yield relatively few reliable negatives, and the quality of the resulting classifiers needs improvement. This paper presents a clustering-based semi-supervised active classification method addressing both of these steps. Unlike traditional negative-extraction methods, it uses clustering together with the observation that positive documents should share as few feature terms as possible with negative documents, removing as many positives as possible from the unlabeled set and thereby obtaining more reliable negatives. Combining SVM active learning with an improved Rocchio method to build the classifier, and using an improved TFIDF (term frequency inverse document frequency) scheme for feature extraction, significantly improves classification accuracy. Classification results were tested on three different datasets (RCV1, Reuters-21578, 20 Newsgroups). Experimental results show that clustering-based extraction obtains more reliable negatives while keeping the error rate low, and that the introduction of active learning also significantly improves classification accuracy.
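The Rocchio step described above classifies by similarity to class centroids. A toy stdlib-only sketch of that idea (a stand-in for the paper's improved Rocchio + SVM active-learning pipeline; documents, term weighting, and function names are invented for illustration):

```python
import math
from collections import Counter

def vectorize(doc):
    """Raw term-frequency vector (the paper uses an improved TFIDF instead)."""
    return Counter(doc.split())

def centroid(docs):
    """Average term-frequency vector of a set of documents."""
    c = Counter()
    for d in docs:
        c.update(vectorize(d))
    return {w: v / len(docs) for w, v in c.items()}

def cosine(u, v):
    dot = sum(u.get(w, 0) * v.get(w, 0) for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rocchio_classify(doc, pos_centroid, neg_centroid):
    """Assign the document to the nearer centroid."""
    v = vectorize(doc)
    return "positive" if cosine(v, pos_centroid) >= cosine(v, neg_centroid) else "negative"

pos = centroid(["stock market rises", "market shares gain"])      # positive examples
neg = centroid(["football match tonight", "team wins match"])     # reliable negatives
print(rocchio_classify("shares rise in the market", pos, neg))    # positive
```

In the PU setting, the negative centroid is built from the reliable negatives extracted by clustering, and documents the two centroids disagree on weakly are natural candidates for SVM active-learning queries.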