In the information age, electronic documents (e-documents) have become a popular alternative to paper documents due to their lower costs, higher dissemination rates, and ease of knowledge sharing. However, digital copyright infringement occurs frequently because e-documents are easy to copy, which not only infringes on the rights of creators but also weakens their creative enthusiasm. It is therefore crucial to establish an e-document sharing system that enforces copyright protection. Existing centralized systems have significant vulnerabilities, the plagiarism detection algorithms they use cannot fully account for the context, semantics, style, and other characteristics of the text, and digital watermarking is used only as a means of infringement tracing. This paper proposes a decentralized framework for e-document sharing based on the decentralized autonomous organization (DAO) and non-fungible token (NFT) concepts in blockchain. Using blockchain as a distributed credit base resolves the vulnerabilities inherent in traditional centralized systems. The e-document evaluation and plagiarism detection mechanisms based on the DAO model effectively address the challenge of comprehensive text checks, thereby promoting higher e-document quality. The mechanism for protecting and circulating e-document copyrights using NFT technology safeguards users' e-document copyrights and facilitates e-document sharing. Moreover, recognizing the security issues within the DAO governance mechanism, we introduce an optimization that our experiments show reduces manipulation risk by up to 51%. Additionally, by using evolutionary game analysis to derive the equilibrium strategies of the framework, we find that adjusting the reward and penalty parameters of the incentive mechanism motivates creators to produce higher-quality, more original e-documents, while evaluators become more likely to participate in assessments.
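The evolutionary game analysis mentioned in the entry above can be illustrated with a small numerical sketch. The payoff structure, the reward/penalty values, and the two populations (creators choosing original vs. plagiarized content, evaluators choosing to assess or abstain) below are hypothetical assumptions, not the paper's actual parameters; the code only shows how replicator dynamics would expose the effect of changing reward and penalty settings.

```python
import numpy as np

# Hypothetical two-population replicator dynamics for the creator/evaluator game.
# x: fraction of creators producing original work; y: fraction of evaluators who assess.
# R (reward) and P (penalty) are incentive parameters assumed here for illustration.
def payoffs(x, y, R=3.0, P=2.0, cost_create=1.0, cost_eval=0.5):
    u_orig = R * y - cost_create            # creator payoff for "original"
    u_plag = R * (1 - y) - P * y            # creator payoff for "plagiarize"
    u_eval = R * (1 - x) + 0.5 * R * x - cost_eval   # evaluator payoff for "assess"
    u_idle = 0.0                            # evaluator payoff for "abstain"
    return u_orig, u_plag, u_eval, u_idle

def simulate(x=0.2, y=0.2, steps=500, dt=0.05, **kw):
    for _ in range(steps):
        u_o, u_p, u_e, u_i = payoffs(x, y, **kw)
        # Replicator update: strategies with above-average payoff grow.
        x += dt * x * (1 - x) * (u_o - u_p)
        y += dt * y * (1 - y) * (u_e - u_i)
        x, y = np.clip(x, 0, 1), np.clip(y, 0, 1)
    return x, y

print(simulate(R=3.0, P=2.0))   # equilibrium under one reward/penalty setting
print(simulate(R=3.0, P=5.0))   # a harsher penalty pushes creators toward originality
```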
Purpose: Accurately assigning the document type of review articles in citation index databases such as Web of Science (WoS) and Scopus is important. This study investigates the document type assignment of review articles in Web of Science, Scopus, and publishers' websites on a large scale. Design/methodology/approach: 27,616 papers from 160 journals in 10 review journal series indexed in SCI are analyzed. The document types of these papers as labeled on the journals' websites and as assigned by WoS and Scopus are retrieved and compared to determine assignment accuracy and to identify possible reasons for incorrect assignments. For the document type labeled on the website, we further differentiate between explicit reviews and implicit reviews based on whether the website directly indicates that the paper is a review. Findings: Overall, WoS and Scopus performed similarly, with an average precision of about 99% and recall of about 80%. However, there were some differences between WoS and Scopus across different journal series and within the same journal series. The assignment accuracy of WoS and Scopus for implicit reviews dropped significantly, especially for Scopus. Research limitations: The document types used as the gold standard were based on the journal websites' labeling and were not manually validated one by one. We only studied the labeling performance for review articles published during 2017-2018 in review journals; whether the conclusions extend to review articles published in non-review journals and to the most recent period is unclear. Practical implications: This study provides a reference for the accuracy of document type assignment of review articles in WoS and Scopus, and the identified patterns for assigning implicit reviews may help improve labeling on websites and in WoS and Scopus. Originality/value: This study investigated the assignment accuracy of the document type of reviews and identified some patterns of incorrect assignment.
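As a minimal sketch of how precision and recall figures like those above are computed, the snippet below compares database-assigned document types against website labels used as the gold standard; the label strings and the toy records are assumptions for illustration only.

```python
# Precision/recall of "Review" assignment against website gold-standard labels.
gold     = ["Review", "Review", "Article", "Review", "Article"]   # website labels (toy data)
assigned = ["Review", "Article", "Article", "Review", "Review"]   # database labels (toy data)

tp = sum(g == "Review" and a == "Review" for g, a in zip(gold, assigned))
fp = sum(g != "Review" and a == "Review" for g, a in zip(gold, assigned))
fn = sum(g == "Review" and a != "Review" for g, a in zip(gold, assigned))

precision = tp / (tp + fp)   # share of assigned reviews that really are reviews
recall    = tp / (tp + fn)   # share of true reviews that the database caught
print(f"precision={precision:.2f}, recall={recall:.2f}")
```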
The Gannet Optimization Algorithm (GOA) and the Whale Optimization Algorithm (WOA) demonstrate strong performance; however, there remains room for improvement in convergence and practical applications. This study introduces a hybrid optimization algorithm, the adaptive inertia weight whale optimization algorithm and gannet optimization algorithm (AIWGOA), which addresses challenges in enhancing handwritten documents. The hybrid strategy integrates the strengths of both algorithms, significantly enhancing their capabilities, while the adaptive parameter strategy removes the need for manual parameter setting. By combining the hybrid strategy with the parameter-adaptive approach, the Gannet Optimization Algorithm was refined to yield the AIWGOA. In a performance analysis on the CEC2013 benchmark, the AIWGOA shows notable advantages across various metrics. Subsequently, an evaluation index was employed to assess the enhanced handwritten documents and images, confirming the superior practical applicability of the AIWGOA compared with other algorithms.
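The abstract does not give the AIWGOA update rules, so the snippet below is only a hedged illustration of the general idea of an adaptive inertia weight inside a WOA-style position update; the weight schedule and the encircling step are standard ingredients assumed for illustration, not the authors' exact formulation.

```python
import numpy as np

def adaptive_inertia(t, t_max, w_max=0.9, w_min=0.4):
    # Linearly decreasing inertia weight: explore early, exploit late (illustrative schedule).
    return w_max - (w_max - w_min) * t / t_max

def woa_encircle_step(x, best, t, t_max, rng):
    # One WOA-style "encircling prey" update blended with an inertia-weighted previous position.
    a = 2.0 * (1 - t / t_max)                      # a decreases from 2 to 0 over the run
    r = rng.random(x.shape)
    A, C = 2 * a * r - a, 2 * rng.random(x.shape)
    w = adaptive_inertia(t, t_max)
    return w * x + (best - A * np.abs(C * best - x)) * (1 - w)

rng = np.random.default_rng(0)
x, best = rng.random(5), rng.random(5)
for t in range(100):
    x = woa_encircle_step(x, best, t, 100, rng)
print(x)
```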
As digital technologies have advanced, the number of paper documents converted into digital format has increased exponentially. To respond to the urgent need to categorize this growing number of digitized documents, real-time classification of digitized documents was identified as the primary goal of our study. Paper classification is the first stage in automating document control and enabling efficient knowledge discovery with little or no human involvement. Artificial intelligence methods such as deep learning are now combined with segmentation to study and interpret traits in ways that were not conceivable ten years ago: deep learning aids in comprehending input patterns so that object classes can be predicted, while segmentation divides the input image into separate parts for more thorough analysis. This study proposes a deep learning-enabled framework for automated document classification that can be implemented in higher education. To this end, a dataset was developed that includes seven categories: Diplomas, Personal documents, Journal of Accounting of higher education diplomas, Service letters, Orders, Production orders, and Student orders. A deep learning model based on Conv2D layers is then proposed for the document classification process. In the final part of this research, the proposed model is evaluated and compared with other machine-learning techniques. The results demonstrate that the proposed deep learning model outperforms the other machine learning models, reaching 94.84%, 94.79%, 94.62%, 94.43%, and 94.07% in accuracy, precision, recall, F-score, and AUC-ROC, respectively. These results show that the proposed model is suitable for practical use as an assistant to office workers.
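The abstract names only Conv2D layers and seven classes, so the following Keras sketch is a hedged guess at a minimal architecture of that kind; the input size, layer widths, and training settings are assumptions, not the authors' configuration.

```python
from tensorflow.keras import layers, models

NUM_CLASSES = 7                 # Diplomas, Personal documents, ..., Student orders
INPUT_SHAPE = (224, 224, 1)     # assumed grayscale page images

def build_model():
    # A small stack of Conv2D blocks followed by a dense classifier head.
    model = models.Sequential([
        layers.Input(shape=INPUT_SHAPE),
        layers.Conv2D(32, 3, activation="relu"), layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"), layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu"), layers.GlobalAveragePooling2D(),
        layers.Dense(128, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_model()
model.summary()
```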
Research on the use of electronic health records (EHRs) presents contradictory results regarding the time spent documenting: some studies support electronic records as a tool to speed up documentation, while others find them time consuming. The purpose of this quantitative retrospective before-after project was to measure the impact of using the laboratory value flowsheet within the EHR on documentation time. The research question was: "Does the use of a laboratory value flowsheet in the EHR impact documentation time by primary care providers (PCPs)?" The theoretical framework was the Donabedian Model. The population was the two PCPs in a small primary care clinic in northwest Puerto Rico, and the sample comprised all encounters during October 2019 and December 2019. The data were obtained through data mining and analyzed using SPSS 27. The evaluative outcome of this project is a decrease in documentation time after implementation of the laboratory value flowsheet in the EHR. However, the number of patients per day increased, which affects the number of patients seen per day, week, and month. The implications for clinical practice include the use of templates to improve workflow and documentation, decreasing documentation time while increasing the number of patients seen per day.
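The abstract does not state which statistical procedure was run in SPSS, so the snippet below is only a generic illustration of how a before-after comparison of documentation times could be tested; the sample values and the choice of Welch's t-test are assumptions.

```python
from scipy import stats

# Hypothetical documentation times (minutes per encounter) before and after
# introducing the laboratory value flowsheet.
before = [9.2, 8.7, 10.1, 9.5, 8.9, 9.8]
after  = [7.4, 7.9, 8.1, 7.2, 7.8, 8.0]

t, p = stats.ttest_ind(before, after, equal_var=False)  # Welch's t-test
print(f"t={t:.2f}, p={p:.4f}")
```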
Background: Document images such as statistical reports and scientific journals are widely used in information technology. Accurate detection of table areas in document images is an essential prerequisite for tasks such as information extraction. However, because of the diversity in the shapes and sizes of tables, existing table detection methods adapted from general object detection algorithms have not yet achieved satisfactory results, and incorrect detections can lead to the loss of critical information. Methods: We therefore propose a novel end-to-end trainable deep network combined with a self-supervised pretraining transformer for feature extraction to minimize incorrect detections. To better handle table areas of different shapes and sizes, we add a dual-branch context content attention module (DCCAM) to high-dimensional features to extract contextual information, thereby enhancing the network's ability to learn shape features. For feature fusion at different scales, we replace the original 3×3 convolution with a multilayer residual module, which carries enhanced gradient flow information to improve feature representation and extraction capability. Results: We evaluated our method on public document datasets and compared it with previous methods; it achieved state-of-the-art results in terms of evaluation metrics such as recall and F1-score. Code: https://github.com/YongZ-Lee/TD-DCCAM
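The abstract only says that a 3×3 fusion convolution is replaced by a multilayer residual module; the PyTorch block below is a hedged sketch of what such a drop-in replacement could look like, with layer counts and channel widths assumed for illustration rather than taken from the paper.

```python
import torch
import torch.nn as nn

class MultilayerResidualModule(nn.Module):
    """Illustrative drop-in replacement for a 3x3 fusion convolution (not the paper's module)."""
    def __init__(self, channels, layers=3):
        super().__init__()
        blocks = []
        for _ in range(layers):
            blocks += [nn.Conv2d(channels, channels, 3, padding=1),
                       nn.BatchNorm2d(channels),
                       nn.ReLU(inplace=True)]
        self.body = nn.Sequential(*blocks)

    def forward(self, x):
        # The residual connection keeps a direct gradient path through the fusion step.
        return x + self.body(x)

x = torch.randn(1, 256, 32, 32)
print(MultilayerResidualModule(256)(x).shape)   # torch.Size([1, 256, 32, 32])
```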
With the widespread use of Chinese globally, the number of Chinese learners has been increasing, leading to various grammatical errors among beginners. Additionally, as the domestic push for industrial informatization grows, electronic documents have proliferated. When dealing with large numbers of electronic documents and texts written by Chinese beginners, manually written texts often contain hidden grammatical errors, posing a significant challenge to traditional manual proofreading. Correcting these grammatical errors is crucial to ensure fluency and readability; certain types of grammatical or logical errors can have a large impact, and manually proofreading a large number of texts individually is clearly impractical. Consequently, research on text error correction techniques has attracted significant attention in recent years. The advent and advancement of deep learning have allowed sequence-to-sequence learning methods to be applied extensively to text error correction. This paper presents a comprehensive analysis of Chinese grammatical error correction technology, describes its current research status, discusses existing problems, proposes preliminary solutions, and conducts experiments using judicial documents as an example. The aim is to provide a feasible research approach for Chinese text error correction.
This paper explores how artificial intelligence (AI) can support social researchers in using web-mediated documents for research purposes. It extends traditional documentary analysis to include digital artifacts such as blogs, forums, emails, and online archives. The discussion highlights the role of AI in different stages of the research process, including question generation, sample and design definition, ethical considerations, data analysis, and dissemination of results, emphasizing how AI can automate complex tasks and enhance research design. The paper also reports on practical experience using AI tools, specifically ChatGPT-4, in conducting web-mediated documentary analysis and shares ideas for the integration of AI in social research.
In this paper, the research achievements and progress on Yunnan tea germplasm resources over the past sixty years are systematically reviewed from the following aspects: exploration, collection, conservation, protection, identification, evaluation, and shared utilization. The current problems and suggestions for the subsequent development of tea germplasm resources in Yunnan are also discussed, including the collection of superior and rare germplasm, tea genetic diversity research, the use of biotechnology in tea germplasm innovation, super gene exploration and function, the construction of a utilization platform, and the biological basis of species and population conservation.
A rough set based corner classification neural network, the Rough-CC4, is presented to solve document classification problems such as representing documents of different sizes, document feature selection, and document feature encoding. In the Rough-CC4, documents are described by the equivalence classes of approximate words. This reduces the number of dimensions representing the documents, which addresses the precision problems caused by differing document sizes and also blurs the differences caused by approximate words. The Rough-CC4 further introduces a binary encoding method through which the importance of a document relative to each equivalence class is encoded; this encoding greatly improves the precision of the Rough-CC4 and reduces its space complexity. The Rough-CC4 can be used for automatic classification of documents.
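The abstract does not define the encoding precisely, so the following is only a hedged sketch of the idea of mapping a document onto binary indicators over word equivalence classes; the classes, the threshold, and the toy document are assumptions for illustration.

```python
# Hypothetical equivalence classes of "approximate" (near-synonymous) words.
EQUIV_CLASSES = {
    "finance":  {"bank", "loan", "credit", "interest"},
    "sport":    {"match", "team", "goal", "score"},
    "politics": {"election", "vote", "party", "policy"},
}

def encode(document, threshold=1):
    """Binary vector: 1 if the document mentions a class at least `threshold` times."""
    tokens = document.lower().split()
    vector = []
    for words in EQUIV_CLASSES.values():
        hits = sum(tok in words for tok in tokens)
        vector.append(1 if hits >= threshold else 0)
    return vector

doc = "The bank raised the interest rate after the election"
print(encode(doc))   # [1, 0, 1] over (finance, sport, politics)
```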
The eXtensible Markup Language (XML) is a new meta-language intended to replace HTML and has many advantages. Traditional engineering documents have too many forms of expression to be managed conveniently and lack dynamic correlation functions. This paper introduces a method that uses XML to store and manage engineering documents, unifying the format of engineering documents and realizing dynamic correlations among them.
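As a minimal sketch of the idea, the following builds a small XML record for an engineering document and cross-references a related drawing; the element names and attributes are assumptions for illustration, not the paper's schema.

```python
import xml.etree.ElementTree as ET

# Build a simple XML record for an engineering document (illustrative schema).
doc = ET.Element("engineering_document", id="ED-0001", type="design_spec")
ET.SubElement(doc, "title").text = "Gearbox housing design specification"
ET.SubElement(doc, "revision", date="2024-01-15").text = "B"
# A dynamic correlation: point at a related drawing by its document id.
ET.SubElement(doc, "related", ref="ED-0042", relation="drawing")

ET.ElementTree(doc).write("ED-0001.xml", encoding="utf-8", xml_declaration=True)
print(ET.tostring(doc, encoding="unicode"))
```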
Background: Considerable controversy remains over the etiology, clinical presentation, diagnosis, and treatment of pigmented villonodular synovitis. A bibliometric and visualization study of pigmented villonodular synovitis can clarify how research in this field has developed and point the way for future work. Objective: To analyze the global research status, hotspots, and trends of pigmented villonodular synovitis. Methods: The Web of Science and CNKI databases were searched for all publications related to pigmented villonodular synovitis from 1995 to 2023, and CiteSpace and Bibliometrics software were used for cluster, co-occurrence, and burst-term analysis of all publications. The Web of Science Core Collection was searched using subject terms plus free terms, and CNKI was searched using subject terms. In total, 986 English-language and 599 Chinese-language publications were included. Results and conclusions: (1) The United States holds a clear leading position in this field, ranking first in total publications, H-index, and citations. China ranks 4th in total publications but 12th in H-index, so publication quality and international collaboration still need strengthening. (2) Cluster analysis of related studies in CNKI shows that the top five clusters are radiotherapy, soft tissue sarcoma, rheumatoid arthritis, magnetic resonance imaging, and diagnosis. (3) Keywords bursting through 2023 include colony-stimulating factor 1, tenosynovial giant cell tumor, case report, chromosomal translocation, radiotherapy, expression, and kinase. (4) Keyword and co-citation analysis indicates that current research hotspots are studies of the clinical characteristics of pigmented villonodular synovitis, development of new colony-stimulating factor 1 inhibitors, and the application of colony-stimulating factor 1 inhibitors in treatment. (5) Combining topic evolution with current hotspots, future work should focus, on the basis of clarified etiology, pathogenesis, and clinical characteristics, on improving diagnostic accuracy, increasing treatment precision, and reducing post-treatment recurrence of pigmented villonodular synovitis.
BACKGROUND: Acute aortic dissection (AoD) is a hypertensive emergency that often requires transferring patients to higher-level care hospitals; thus, documentation of clinical care and compliance with the Emergency Medical Treatment and Active Labor Act (EMTALA) are crucial. This study assessed emergency providers' (EP) documentation of clinical care and EMTALA compliance among interhospital-transferred AoD patients. METHODS: This retrospective study examined adult patients transferred directly from a referring emergency department (ED) to a quaternary academic center between January 1, 2011 and September 30, 2015. The primary outcome was the percentage of records with adequate documentation of clinical care (ADoCC). The secondary outcome was the percentage of records with adequate documentation of EMTALA compliance (ADoEMTALA). RESULTS: There were 563 electronically identified patients, with 287 included in the final analysis. One hundred and five (36.6%) patients had ADoCC, while 166 (57.8%) had ADoEMTALA. Patients with inadequate documentation of EMTALA (IDoEMTALA) were more likely to miss the American Heart Association (AHA) ED departure SBP guideline (OR 1.8, 95% CI 1.03-3.2, P=0.04). Male gender, handwritten documentation, and transport by air were associated with an increased risk of inadequate documentation of clinical care (IDoCC), while receiving a continuous infusion was associated with a higher risk of IDoEMTALA. CONCLUSION: Documentation of clinical care and EMTALA compliance by emergency providers is poor. Inadequate EMTALA documentation was associated with a higher likelihood of patients not meeting the AHA ED departure SBP guideline. Emergency providers should therefore thoroughly document clinical care and EMTALA compliance in this critically ill group before transfer.
Clustering is a challenging task because there is an ever-growing amount of data in scientific research and commercial applications. High-quality, fast document clustering algorithms are in great demand to deal with large volumes of data, and the computational requirements for bringing such growing amounts of data to a central site for clustering are substantial. The proposed algorithm uses optimal centroids for K-Means clustering based on Particle Swarm Optimization (PSO). PSO is used to exploit its global search ability to provide optimal centroids, which helps generate more compact clusters with improved accuracy. The proposed methodology uses the Hadoop MapReduce framework, which provides distributed storage and analysis to support data-intensive distributed applications. Experiments were performed on the Reuters and RCV1 document datasets and show an improvement in accuracy with reduced execution time.
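The abstract does not detail the PSO formulation, so the following is a hedged single-machine sketch of using PSO to choose K-Means centroids by minimizing the within-cluster sum of squared distances; the swarm size, coefficients, and toy data are assumptions, and the Hadoop/MapReduce distribution layer is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))            # toy documents in a 2-D feature space
K, D, SWARM, ITERS = 3, X.shape[1], 20, 100

def sse(centroids):
    # Within-cluster sum of squared distances: the fitness PSO minimizes.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return (d.min(axis=1) ** 2).sum()

# Each particle encodes K centroids flattened into one position vector.
pos = rng.uniform(X.min(), X.max(), size=(SWARM, K * D))
vel = np.zeros_like(pos)
pbest, pbest_fit = pos.copy(), np.array([sse(p.reshape(K, D)) for p in pos])
gbest = pbest[pbest_fit.argmin()].copy()

for _ in range(ITERS):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos += vel
    fit = np.array([sse(p.reshape(K, D)) for p in pos])
    improved = fit < pbest_fit
    pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
    gbest = pbest[pbest_fit.argmin()].copy()

print("best SSE:", pbest_fit.min())
print("optimized centroids:\n", gbest.reshape(K, D))
```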
Semantic segmentation is a crucial step in document understanding. In this paper, an NVIDIA Jetson Nano-based platform is used to implement semantic segmentation for teaching artificial intelligence concepts and programming. To extract semantic structures from document images, we present an end-to-end dilated convolution network architecture. Dilated convolutions have well-known advantages for extracting multi-scale context information without losing spatial resolution. Our model uses dilated convolutions with a residual network to represent image features and predict pixel labels. The convolutional part works as a feature extractor that obtains multidimensional and hierarchical image features, and consecutive deconvolutions produce the full-resolution segmentation prediction; the probability at each pixel determines its predefined semantic class label. To understand segmentation granularity, we compare performance at three different levels, from fine-grained classes to coarse classes, evaluating the proposed dilated convolution network architecture on three document datasets. The experimental results show that both semantic data distribution imbalance and network depth are important factors influencing document semantic segmentation performance. The research is aimed at offering an educational resource for teaching artificial intelligence concepts and techniques.
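The exact architecture is not given in the abstract, so the PyTorch block below only sketches the general pattern it describes: dilated convolutions over extracted features followed by a transposed-convolution (deconvolution) head that restores full resolution, with per-pixel class probabilities; the channel counts and dilation rates are assumptions.

```python
import torch
import torch.nn as nn

class DilatedSegHead(nn.Module):
    """Illustrative dilated-convolution segmentation head (not the paper's model)."""
    def __init__(self, in_ch=3, num_classes=4):
        super().__init__()
        self.features = nn.Sequential(                       # downsample once, then dilate
            nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=2, dilation=2),  nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=4, dilation=4),  nn.ReLU(inplace=True),
        )
        # "Deconvolution" restores full resolution; per-pixel logits give class labels.
        self.up = nn.ConvTranspose2d(64, num_classes, 4, stride=2, padding=1)

    def forward(self, x):
        return self.up(self.features(x))

x = torch.randn(1, 3, 256, 256)                  # a document image
logits = DilatedSegHead()(x)                     # (1, num_classes, 256, 256)
print(logits.shape, logits.argmax(dim=1).shape)  # predicted label per pixel
```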
Purpose: Detecting research fields or topics and understanding their dynamics helps the scientific community make decisions about establishing scientific fields and supports better collaboration with governments and businesses. This study investigates the development of research fields over time, framing it as a topic detection problem. Design/methodology/approach: We propose a modified deep clustering method to detect research trends from the abstracts and titles of academic documents. Document embedding approaches are used to transform documents into vector-based representations. The proposed method is evaluated by comparing it with combinations of different embedding and clustering approaches and with classical topic modeling algorithms (i.e., LDA) on a benchmark dataset. A case study is also conducted exploring the evolution of Artificial Intelligence (AI), detecting the research topics or sub-fields in related AI publications. Findings: Evaluation with clustering performance indicators shows that the proposed method outperforms similar approaches on the benchmark dataset. Using the proposed method, we also show how the topics have evolved over the past 30 years, taking advantage of a keyword extraction method for cluster tagging and labeling to demonstrate the context of the topics. Research limitations: We found that a single solution cannot be generalized to all downstream tasks; solutions need to be fine-tuned or optimized for each task and even each dataset. In addition, interpretation of cluster labels can be subjective and vary with the reader's opinion, and labeling techniques are very difficult to evaluate, which further limits explanation of the clusters. Practical implications: As demonstrated in the case study, the proposed method enables researchers and reviewers of academic research to detect, summarize, analyze, and visualize research topics from decades of academic documents, helping the scientific community and related organizations analyze fields quickly and effectively by establishing and explaining the topics. Originality/value: We introduce a modified and tuned deep embedding clustering method coupled with Doc2Vec representations for topic extraction, and we use a concept extraction method as a labeling approach. The effectiveness of the method is evaluated in a case study of AI publications, analyzing AI topics over the past three decades.
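A minimal sketch of the embed-then-cluster idea, assuming gensim's Doc2Vec for the document vectors and plain KMeans standing in for the paper's modified deep embedding clustering; the corpus, vector size, and cluster count are illustrative assumptions.

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.cluster import KMeans

# Toy corpus of abstracts (illustrative only).
abstracts = [
    "neural networks for image recognition",
    "deep learning improves speech recognition",
    "reinforcement learning for robotic control",
    "policy gradients in robot navigation",
]
corpus = [TaggedDocument(words=a.split(), tags=[i]) for i, a in enumerate(abstracts)]

# Learn vector representations of the documents.
model = Doc2Vec(corpus, vector_size=32, min_count=1, epochs=50, seed=1)
vectors = [model.dv[i] for i in range(len(abstracts))]

# Cluster the document vectors into topics (KMeans stands in for deep clustering).
labels = KMeans(n_clusters=2, n_init=10, random_state=1).fit_predict(vectors)
print(labels)   # documents grouped into two topic clusters
```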