2,931 articles found
A Method of Generating Semi-Experimental Biomedical Datasets
1
Authors: Jing Wang, Naike Du, Zi He, Xiuzhu Ye. Journal of Beijing Institute of Technology (EI, CAS), 2024, Issue 3, pp. 219-226 (8 pages)
This paper proposes a method for generating semi-experimental biomedical datasets based on full-wave simulation software. System noise such as antenna port coupling is fully accounted for in the proposed datasets, which makes them more realistic than synthetic datasets. Datasets containing different shapes are constructed based on the relative permittivities of human tissues. A back-propagation scheme is then used to obtain rough reconstructions, which are fed into a U-net convolutional neural network (CNN) to recover high-resolution images. Numerical results show that a network trained on datasets generated by the proposed method achieves satisfactory reconstruction results and is promising for real-time biomedical imaging.
Keywords: electromagnetic imaging dataset biomedical imaging
Performance evaluation of seven multi-label classification methods on real-world patent and publication datasets
2
Authors: Shuo Xu, Yuefu Zhang, Xin An, Sainan Pi. Journal of Data and Information Science (CSCD), 2024, Issue 2, pp. 81-103 (23 pages)
Purpose: Many science, technology and innovation (STI) resources carry several different labels. To automatically assign the appropriate labels to an instance of interest, many approaches that perform well on benchmark datasets have been proposed in the literature for the multi-label classification task, and several open-source tools implementing these approaches have been developed. However, the characteristics of real-world multi-label patent and publication datasets are not fully in line with those of benchmark datasets. The main purpose of this paper is therefore to comprehensively evaluate seven multi-label classification methods on real-world datasets. Research limitations: The three real-world datasets differ in statement, data quality, and purpose. In addition, open-source tools designed for multi-label classification differ intrinsically in their approaches to data processing and feature selection, which in turn affects the performance of a multi-label classification approach. In the near future, we will enhance experimental precision and reinforce the validity of the conclusions by controlling variables more rigorously through expanded parameter settings. Practical implications: The Macro F1 and Micro F1 scores observed on real-world datasets typically fall short of those achieved on benchmark datasets, underscoring the complexity of real-world multi-label classification tasks. Approaches leveraging deep learning techniques offer promising solutions by accommodating the hierarchical relationships and interdependencies among labels. With ongoing enhancements in deep learning algorithms and large-scale models, the efficacy of multi-label classification is expected to improve significantly and reach a level of practical utility in the foreseeable future. Originality/value: (1) Seven multi-label classification methods are comprehensively compared on three real-world datasets. (2) The TextCNN and TextRCNN models perform better on small-scale datasets with a more complex hierarchical label structure and a more balanced document-label distribution. (3) The MLkNN method works better on the larger-scale dataset with a more unbalanced document-label distribution.
Keywords: Multi-label classification Real-World datasets Hierarchical structure Classification system Label correlation Machine learning
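The Macro F1 and Micro F1 scores mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each document's labels are given as a Python set. The function names and toy data are hypothetical and are not taken from the paper or from any of the open-source tools it evaluates.

```python
from typing import List, Set, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true: List[Set[str]], y_pred: List[Set[str]]) -> Tuple[float, float]:
    """Return (micro F1, macro F1) for multi-label predictions."""
    labels = sorted(set().union(*y_true, *y_pred))
    counts = []
    for lab in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if lab in t and lab in p)
        fp = sum(1 for t, p in pairs if lab not in t and lab in p)
        fn = sum(1 for t, p in pairs if lab in t and lab not in p)
        counts.append((tp, fp, fn))
    # Macro: average the per-label F1 scores, so rare labels weigh as much as common ones.
    macro = sum(f1(*c) for c in counts) / len(counts) if counts else 0.0
    # Micro: pool the counts over all labels first, so frequent labels dominate.
    micro = f1(sum(c[0] for c in counts), sum(c[1] for c in counts), sum(c[2] for c in counts))
    return micro, macro


# Toy data (hypothetical): label "A" is frequent and easy, "B" and "C" are rare and missed.
y_true = [{"A", "B"}, {"A"}, {"C"}]
y_pred = [{"A"}, {"A", "B"}, set()]
micro, macro = micro_macro_f1(y_true, y_pred)
print(f"micro F1 = {micro:.4f}, macro F1 = {macro:.4f}")
```

On unbalanced label distributions the two scores can diverge sharply, which is one reason both are usually reported together.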
Performance Analysis of Support Vector Machine (SVM) on Challenging Datasets for Forest Fire Detection
3
Authors: Ankan Kar, Nirjhar Nath, Utpalraj Kemprai, Aman. International Journal of Communications, Network and System Sciences, 2024, Issue 2, pp. 11-29 (19 pages)
This article analyzes the performance and use of Support Vector Machines (SVMs) for the critical task of forest fire detection using image datasets. With the increasing threat of forest fires to ecosystems and human settlements, rapid and accurate detection systems are of utmost importance. SVMs, known for their strong classification capabilities, are proficient at recognizing patterns associated with fire within images. By training on labeled data, SVMs learn to identify distinctive attributes associated with fire, such as flames, smoke, or alterations in the visual characteristics of the forest area. The article examines the use of SVMs in detail, covering data preprocessing, feature extraction, and model training, and evaluates accuracy, efficiency, and practical applicability. The knowledge gained from this study aids the development of efficient forest fire detection systems, enabling prompt responses and improving disaster management. The correlation between SVM accuracy and the difficulties presented by high-dimensional datasets is also investigated and demonstrated through a case study, and the relationship between accuracy scores and the different resolutions used for resizing the training datasets is discussed. Together, these studies give an overview of the difficulties faced and the areas requiring further improvement and focus.
Keywords: Support Vector Machine Challenging datasets Forest Fire Detection CLASSIFICATION
A Comprehensive Analysis of Datasets for Automotive Intrusion Detection Systems
4
Authors: Seyoung Lee, Wonsuk Choi, Insup Kim, Ganggyu Lee, Dong Hoon Lee. Computers, Materials & Continua (SCIE, EI), 2023, Issue 9, pp. 3413-3442 (30 pages)
Recently, automotive intrusion detection systems (IDSs) have emerged as promising defense approaches to counter attacks on in-vehicle networks (IVNs). However, the effectiveness of IDSs relies heavily on the quality of the datasets used for training and evaluation. Despite the availability of several datasets for automotive IDSs, there has been a lack of comprehensive analysis focusing on assessing these datasets. This paper aims to address the need for dataset assessment in the context of automotive IDSs. It proposes qualitative and quantitative metrics, independent of specific automotive IDSs, to evaluate the quality of datasets. These metrics take into consideration aspects such as dataset description, collection environment, and attack complexity. The paper evaluates eight commonly used datasets for automotive IDSs using the proposed metrics. The evaluation reveals biases in the datasets, particularly limited contexts and a lack of diversity. It also shows that the attacks in the datasets were mostly injected without considering normal behaviors, which poses challenges for training and evaluating machine learning-based IDSs. The paper emphasizes the importance of addressing the identified limitations in existing datasets to improve the performance and adaptability of automotive IDSs. The proposed metrics can serve as guidelines for researchers and practitioners in selecting and constructing high-quality datasets for automotive security applications. Finally, the paper presents the requirements for high-quality datasets, including representativeness, diversity, and balance.
Keywords: Controller area network(CAN) intrusion detection system(IDS) automotive security machine learning(ML) dataset
Empirical Analysis of Neural Networks-Based Models for Phishing Website Classification Using Diverse Datasets
5
Authors: Shoaib Khan, Bilal Khan, Saifullah Jan, Subhan Ullah, Aiman. Journal of Cyber Security, 2023, Issue 1, pp. 47-66 (20 pages)
Phishing attacks pose a significant security threat by masquerading as trustworthy entities to steal sensitive information, a problem that persists despite user awareness. This study addresses phishing attacks on websites and assesses the performance of three prominent Machine Learning (ML) models, namely Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM), using authentic datasets sourced from the Kaggle and Mendeley repositories. Extensive experimentation and analysis reveal that the CNN model achieves the highest accuracy, 98%, while LSTM shows the lowest accuracy, 96%. These findings underscore the potential of ML techniques for enhancing phishing detection systems and strengthening cybersecurity measures against evolving phishing tactics.
Keywords: Artificial neural networks phishing websites network security machine learning phishing datasets CLASSIFICATION
Research on Enhanced Contraband Dataset ACXray Based on ETL
6
Authors: Xueping Song, Jianming Yang, Shuyu Zhang, Jicun Zhang. Computers, Materials & Continua (SCIE, EI), 2024, Issue 6, pp. 4551-4572 (22 pages)
To address the shortage of public datasets of customs X-ray images of contraband and the difficulties in deploying trained models in engineering applications, a method is proposed that uses the Extract-Transform-Load (ETL) approach to create an X-ray dataset of contraband items. First, X-ray scatter image data is collected and cleaned. Using Kafka message queues and the Elasticsearch (ES) distributed search engine, the data is transmitted in real time to cloud servers. Next, contraband data is annotated using a combination of neural networks and manual methods to improve annotation efficiency, and a mean hash algorithm is implemented for quick image retrieval. Integrating targets with backgrounds augments the X-ray contraband image data and increases the number of positive samples. Finally, an Airport Customs X-ray dataset (ACXray) compatible with customs business scenarios is constructed, featuring an increased number of positive contraband samples. In experiments, three datasets were used to train the Mask Region-based Convolutional Neural Network (Mask R-CNN) algorithm, which was then tested on 400 real customs images. The recognition accuracy of the algorithms trained on Security Inspection X-ray (SIXray) and Occluded Prohibited Items X-ray (OPIXray) decreased by 16.3% and 15.1%, respectively, while the accuracy of the algorithm trained on ACXray was almost unaffected. This indicates that the algorithm trained on ACXray generalizes well and is more suitable for customs detection scenarios.
Keywords: X-ray contraband ETL data enhancement dataset
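The mean hash step mentioned in the abstract above is not specified in this listing. The following is a minimal sketch of the commonly used average-hash idea, assuming a grayscale image given as a nested list of pixel values with at least 8 rows and 8 columns. All function names are hypothetical, and the paper's actual implementation may differ.

```python
from typing import List

Gray = List[List[float]]  # grayscale image as rows of pixel intensities


def downsample(img: Gray, size: int = 8) -> Gray:
    """Block-average the image down to size x size (assumes both dimensions >= size)."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(size):
        r0, r1 = i * h // size, (i + 1) * h // size
        row = []
        for j in range(size):
            c0, c1 = j * w // size, (j + 1) * w // size
            block = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def mean_hash(img: Gray, size: int = 8) -> int:
    """Pack one bit per downsampled pixel: 1 if brighter than the mean, else 0."""
    pixels = [p for row in downsample(img, size) for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits; a small distance suggests visually similar images."""
    return bin(a ^ b).count("1")


# Toy images (hypothetical): a 16x16 gradient, a uniformly brighter copy, and its negative.
img = [[r * 16 + c for c in range(16)] for r in range(16)]
brighter = [[v + 10 for v in row] for row in img]
negative = [[255 - v for v in row] for row in img]
print(hamming(mean_hash(img), mean_hash(brighter)))
print(hamming(mean_hash(img), mean_hash(negative)))
```

Because each bit depends only on whether a region is brighter than the image mean, the hash is unchanged by a uniform brightness shift, and candidate matches can be found by comparing short integers instead of full images.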
The accessible seismological dataset of a high-density 2D seismic array along Anninghe fault
7
Authors: Weifan Lu, Zeyan Zhao, Han Yue, Shiyong Zhou, Jianping Wu, Xiaodong Song. Earthquake Science, 2024, Issue 1, pp. 67-77 (11 pages)
The scientific goal of the Anninghe seismic array is to investigate the detailed geometry of the Anninghe fault and the velocity structure of the fault zone. This 2D seismic array is composed of 161 stations forming a sub-rectangular geometry along the Anninghe fault, covering 50 km and 150 km in the fault-normal and strike directions, respectively, with ~5 km intervals. The data were collected between June 2020 and June 2021, with some temporal gaps. Two types of instruments, QS-05A and SmartSolo, are used in this array. Data quality and examples of seismograms are provided in this paper. After the data protection period ends (expected in June 2024), researchers can request the dataset from the National Earthquake Science Data Center.
Keywords: Anninghe fault seismological dataset data share
Optimizing Enterprise Conversational AI: Accelerating Response Accuracy with Custom Dataset Fine-Tuning
8
Authors: Yash Kishore. Intelligent Information Management, 2024, Issue 2, pp. 65-76 (12 pages)
As enterprise-level conversational AI continues to evolve, it becomes evident that generalized Large Language Models (LLMs) like GPT-3.5 bring remarkable capabilities but also formidable challenges. These models, trained on vast and diverse datasets, have pushed the boundaries of natural language understanding and generation. However, they often stumble when faced with the intricate demands of nuanced enterprise applications. This research advocates a strategic shift, urging enterprises to adopt a fine-tuning approach to optimize conversational AI. The inability of generalized LLMs to cater to the specific needs of businesses across various industries poses a critical challenge. The shift involves enabling enterprises to integrate their own datasets into LLMs, a process that extends beyond linguistic enhancement. The core concept is customization: businesses fine-tune the AI's functionality to fit their own business landscapes. By immersing the LLM in industry-specific documents, customer interaction records, internal reports, and regulatory guidelines, the AI moves beyond its generic capabilities to become a conversational partner aligned with the intricacies of the enterprise's domain. This fine-tuning approach enables a transition from a universal AI solution to a highly customizable tool, and from a general-purpose language model to a contextually aware, industry-savvy assistant. As a result, the AI responds not only with linguistic accuracy but also with depth and relevance, improving user experience and operational efficiency. The subsequent sections of the paper examine fine-tuning in detail, exploring the challenges and opportunities it presents. They address the technical aspects of data integration, ethical considerations surrounding data usage, and the broader implications for the future of enterprise AI. The research has the potential to redefine the role of conversational AI in enterprises, making it a dynamic, relevant, and effective tool for businesses in an evolving digital landscape.
Keywords: Fine-Tuning dataset AI CONVERSATIONAL ENTERPRISE LLM
Fine-Grained Ship Recognition Based on Visible and Near-Infrared Multimodal Remote Sensing Images: Dataset, Methodology and Evaluation
9
Authors: Shiwen Song, Rui Zhang, Min Hu, Feiyao Huang. Computers, Materials & Continua (SCIE, EI), 2024, Issue 6, pp. 5243-5271 (29 pages)
Fine-grained recognition of ships based on remote sensing images is crucial to safeguarding maritime rights and interests and maintaining national security. With the emergence of massive high-resolution multi-modality images, using multi-modality images for fine-grained recognition has become a promising technology. Fine-grained recognition of multi-modality images imposes higher requirements on the dataset samples. The key problem is how to extract and fuse the complementary features of multi-modality images to obtain more discriminative fusion features. The attention mechanism helps the model pinpoint the key information in an image, significantly improving the model's performance. In this paper, a dataset for fine-grained recognition of ships based on visible and near-infrared multi-modality remote sensing images is first proposed, named the Dataset for Multimodal Fine-grained Recognition of Ships (DMFGRS). It includes 1,635 pairs of visible and near-infrared remote sensing images divided into 20 categories, collated from digital orthophoto models provided by commercial remote sensing satellites. DMFGRS provides two types of annotation format files, as well as segmentation mask images corresponding to the ship targets. Then, a Multimodal Information Cross-Enhancement Network (MICE-Net) that fuses features of visible and near-infrared remote sensing images is proposed. In the network, a dual-branch feature extraction and fusion module is designed to obtain more expressive features. The Feature Cross Enhancement Module (FCEM) achieves fusion enhancement of the two modal features by making channel attention and spatial attention work cross-functionally on the feature map. A benchmark is established by evaluating state-of-the-art object recognition algorithms on DMFGRS. In experiments on DMFGRS, MICE-Net reaches precision, recall, mAP@0.5, and mAP@0.5:0.95 of 87%, 77.1%, 83.8%, and 63.9%, respectively. Extensive experiments demonstrate that the proposed MICE-Net performs better on DMFGRS. Built on the lightweight YOLO network, the model generalizes well and thus has good potential for application in real-life scenarios.
Keywords: Multi-modality dataset ship recognition fine-grained recognition attention mechanism
SciCN:A Scientific Dataset for Chinese Named Entity Recognition
10
Authors: Jing Yang, Bin Ji, Shasha Li, Jun Ma, Jie Yu. Computers, Materials & Continua (SCIE, EI), 2024, Issue 3, pp. 4303-4315 (13 pages)
Named entity recognition (NER) is a fundamental task of information extraction (IE), and it has attracted considerable research attention in recent years. Abundant annotated English NER datasets have significantly promoted NER research for English. By contrast, much less effort has gone into Chinese NER research, especially in the scientific domain, due to the scarcity of Chinese NER datasets. To alleviate this problem, we present SciCN, a Chinese scientific NER dataset that contains entity annotations of titles and abstracts derived from 3,500 scientific papers. We manually annotate a total of 62,059 entities, which are classified into six types. Compared to English scientific NER datasets, SciCN is larger and more diverse, because it contains more paper abstracts and these abstracts are derived from more research fields. To investigate the properties of SciCN and provide baselines for future research, we adapt a number of previous state-of-the-art Chinese NER models to evaluate SciCN. Experimental results show that SciCN is more challenging than other Chinese NER datasets. In addition, previous studies have proven the effectiveness of using lexicons to enhance Chinese NER models. Motivated by this, we provide a scientific domain-specific lexicon. Validation results demonstrate that our lexicon delivers better performance gains than lexicons of other domains. We hope that the SciCN dataset and the lexicon will make it possible to benchmark the NER task in the Chinese scientific domain and support future research. The dataset and lexicon are available at: https://github.com/yangjingla/SciCN.git.
Keywords: Named entity recognition dataset scientific information extraction LEXICON
CNN Channel Attention Intrusion Detection System Using NSL-KDD Dataset
11
Authors: Fatma S. Alrayes, Mohammed Zakariah, Syed Umar Amin, Zafar Iqbal Khan, Jehad Saad Alqurni. Computers, Materials & Continua (SCIE, EI), 2024, Issue 6, pp. 4319-4347 (29 pages)
Intrusion detection systems (IDS) are essential in cybersecurity because they protect networks from a wide range of online threats. The goal of this research is to meet the urgent need for small-footprint, highly adaptable Network Intrusion Detection Systems (NIDS) that can identify anomalies. The study uses the NSL-KDD dataset, a sizable collection comprising 43 variables, including the labels "attack" and "level." It proposes a novel approach to intrusion detection based on the combination of channel attention and convolutional neural networks (CNN). The dataset makes it easier to conduct a thorough assessment of the suggested intrusion detection strategy. The primary goal of this work is to maintain operating efficiency while improving detection accuracy. Typical NIDS examine both risky and typical behavior using a variety of techniques. On the NSL-KDD dataset, the CNN-based approach achieves an accuracy of 99.728% when paired with channel attention. Compared to previous approaches such as ensemble learning, CNN, RBM (restricted Boltzmann machine), ANN, hybrid auto-encoders with CNN, MCNN, and adaptive algorithms, the solution significantly improves intrusion detection performance. The results highlight the effectiveness of the suggested method in improving intrusion detection precision. Subsequent efforts will focus on strengthening and expanding the approach to counteract growing cyberthreats and adjust to changing network circumstances.
Keywords: Intrusion detection system(IDS) NSL-KDD dataset deep-learning MACHINE-LEARNING CNN channel Attention network security
Rock mass quality prediction on tunnel faces with incomplete multi-source dataset via tree-augmented naive Bayesian network
12
Authors: Hongwei Huang, Chen Wu, Mingliang Zhou, Jiayao Chen, Tianze Han, Le Zhang. International Journal of Mining Science and Technology (SCIE, EI, CAS, CSCD), 2024, Issue 3, pp. 323-337 (15 pages)
Rock mass quality serves as a vital index for predicting the stability and safety status of rock tunnel faces. In tunneling practice, rock mass quality is often assessed via a combination of qualitative and quantitative parameters. However, due to harsh on-site construction conditions, it is difficult to obtain some of the evaluation parameters that are essential for rock mass quality prediction. In this study, a novel improved Swin Transformer is proposed to detect, segment, and quantify rock mass characteristic parameters such as water leakage, fractures, and weak interlayers. Site experiment results demonstrate that the improved Swin Transformer achieves optimal segmentation results, with accuracies of 92%, 81%, and 86% for water leakage, fractures, and weak interlayers, respectively. A multi-source rock tunnel face characteristic (RTFC) dataset comprising 11 parameters for predicting rock mass quality is established. Because incomplete evaluation parameters in this dataset limit predictive performance, a novel tree-augmented naive Bayesian network (BN) is proposed to address the challenge of the incomplete dataset; it achieves a prediction accuracy of 88%. Compared with other commonly used machine learning models, the proposed BN-based approach shows improved performance in predicting rock mass quality with the incomplete dataset. Using the established BN, a further sensitivity analysis is conducted to quantitatively evaluate the importance of the various parameters. The results indicate that the rock strength and fracture parameters exert the most significant influence on rock mass quality.
Keywords: Rock mass quality Tunnel faces Incomplete multi-source dataset Improved Swin Transformer Bayesian networks
A LiDAR Point Clouds Dataset of Ships in a Maritime Environment
13
Authors: Qiuyu Zhang, Lipeng Wang, Hao Meng, Wen Zhang, Genghua Huang. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, Issue 7, pp. 1681-1694 (14 pages)
This article introduces, for the first time, a LiDAR Point Clouds Dataset of Ships composed of both collected and simulated data, to address the scarcity of LiDAR data in maritime applications. The collected data are acquired using specialized maritime LiDAR sensors in both inland waterways and wide-open ocean environments. The simulated data are generated by placing a ship in the LiDAR coordinate system and scanning it with a redeveloped Blensor that emulates the operation of a LiDAR sensor equipped with various laser beams. Point clouds are also rendered for foggy and rainy weather conditions. To describe a realistic shipping environment, a dynamic tail wave is modeled by iterating the wave elevation of each point in a time series. Finally, networks designed for small objects are migrated to ship applications by training them on the dataset. The positive effect of simulated data is shown in object detection experiments, and the negative impact of tail waves as noise is verified in single-object tracking experiments. The dataset is available at https://github.com/zqy411470859/ship_dataset.
Keywords: 3D point clouds dataset dynamic tail wave fog simulation rainy simulation simulated data
Terrorism Attack Classification Using Machine Learning: The Effectiveness of Using Textual Features Extracted from GTD Dataset
14
Authors: Mohammed Abdalsalam, Chunlin Li, Abdelghani Dahou, Natalia Kryvinska. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, Issue 2, pp. 1427-1467 (41 pages)
One of the biggest dangers to society today is terrorism, where attacks have become one of the most significant risks to international peace and national security. Big data, information analysis, and artificial intelligence (AI) have become the basis for making strategic decisions in many sensitive areas, such as fraud detection, risk management, medical diagnosis, and counter-terrorism. However, there is still a need to assess how terrorist attacks are related, initiated, and detected. For this purpose, we propose a novel framework for classifying and predicting terrorist attacks. The proposed framework posits that neglected text attributes included in the Global Terrorism Database (GTD) can influence the accuracy of the model's classification of terrorist attacks, where each part of the data can provide vital information to enrich the classifier's ability to learn. Each data point in a multiclass taxonomy has one or more tags attached to it, referred to as "related tags." We applied machine learning classifiers to classify terrorist attack incidents obtained from the GTD. A transformer-based technique called DistilBERT extracts and learns contextual features from text attributes to acquire more information from text data. The extracted contextual features are combined with the "key features" of the dataset and used to perform the final classification. The study explored different experimental setups with various classifiers to evaluate the model's performance. The experimental results show that the proposed framework outperforms the latest techniques for classifying terrorist attacks, with an accuracy of 98.7% using a combined feature set and an extreme gradient boosting classifier.
Keywords: Artificial intelligence machine learning natural language processing data analytic DistilBERT feature extraction terrorism classification GTD dataset
KurdSet: A Kurdish Handwritten Characters Recognition Dataset Using Convolutional Neural Network
15
Authors: Sardar Hasen Ali, Maiwan Bahjat Abdulrazzaq. Computers, Materials & Continua (SCIE, EI), 2024, Issue 4, pp. 429-448 (20 pages)
Handwritten character recognition (HCR) involves identifying characters in images, documents, and various sources such as forms, surveys, questionnaires, and signatures, and transforming them into a machine-readable format for subsequent processing. Successfully recognizing complex and intricately shaped handwritten characters remains a significant obstacle. The use of convolutional neural networks (CNN) in recent developments has notably advanced HCR, leveraging the ability to extract discriminative features from extensive sets of raw data. Because of the absence of pre-existing datasets in the Kurdish language, we created a Kurdish handwritten dataset called KurdSet. The dataset consists of Kurdish characters, digits, texts, and symbols. It was collected from 1,560 participants and contains 45,240 characters. In this study, we used only the characters from the dataset for handwritten character recognition. The study uses several models, including InceptionV3, Xception, DenseNet121, and a custom CNN model. To show the performance of the KurdSet dataset, we compared it to the Arabic handwritten character recognition dataset (AHCD) by applying the models to both datasets. The performance of the models is evaluated using test accuracy, which measures the percentage of correctly classified characters in the evaluation phase. All models performed well in the training phase. DenseNet121 exhibited the highest accuracy among the models, achieving 99.80% on the Kurdish dataset, and the Xception model achieved 98.66% on the Arabic dataset.
Keywords: CNN models Kurdish handwritten recognition KurdSet dataset Arabic handwritten recognition DenseNet121 model InceptionV3 model Xception model
CREDIT-X1local:A reference dataset for machine learning seismology from ChinArray in Southwest China
16
作者 Lu Li Weitao Wang +1 位作者 Ziye Yu Yini Chen 《Earthquake Science》 2024年第2期139-157,共19页
High-quality datasets are critical for the development of advanced machine-learning algorithms in seismology. Here, we present an earthquake dataset based on the ChinArray Phase I records (X1). ChinArray Phase I was deployed in the southern north-south seismic zone (20°N-32°N, 95°E-110°E) in 2011-2013 using 355 portable broadband seismic stations. CREDIT-X1local, the first release of the ChinArray Reference Earthquake Dataset for Innovative Techniques (CREDIT), includes comprehensive information for the 105,455 local events that occurred in the southern north-south seismic zone during array observation, incorporating them into a single HDF5 file. Original 100-Hz sampled three-component waveforms are organized by event for stations within epicenter distances of 1,000 km, and records of ≥200 s are included for each waveform. Two types of phase labels are provided. The first includes manually picked labels for 5,999 events with magnitudes ≥2.0, providing 66,507 Pg, 42,310 Sg, 12,823 Pn, and 546 Sn phases. The second contains automatically labeled phases for 105,442 events with magnitudes of −1.6 to 7.6. These phases were picked using a recurrent neural network phase picker and screened using the corresponding travel time curves, resulting in 1,179,808 Pg, 884,281 Sg, 176,089 Pn, and 22,986 Sn phases. Additionally, first-motion polarities are included for 31,273 Pg phases. The event and station locations are provided, so that deep learning networks for both conventional phase picking and phase association can be trained and validated. The CREDIT-X1local dataset is the first million-scale dataset constructed from a dense seismic array, which is designed to support various multi-station deep-learning methods, high-precision focal mechanism inversion, and seismic tomography studies. Additionally, owing to the high seismicity in the southern north-south seismic zone in China, this dataset has great potential for future scientific discoveries.
Keywords: earthquake dataset; machine learning; Pg/Sg/Pn/Sn phase picking; P-wave first-motion polarity
Deep Learning Recognition for Arabic Alphabet Sign Language RGB Dataset
17
Authors: Rabie El Kharoua, Xiaoming Jiang. Journal of Computer and Communications, 2024, No. 3, pp. 32-51 (20 pages)
This paper introduces a Convolutional Neural Network (CNN) model for Arabic Alphabet Sign Language (AASL) recognition, using the AASL dataset. Recognizing the fundamental importance of communication for the hearing-impaired, especially within the Arabic-speaking deaf community, the study emphasizes the critical role of sign language recognition systems. The proposed CNN model achieves 99.9% accuracy on the training set and 97.4% accuracy on the validation set. This study not only establishes a high-accuracy AASL recognition model but also provides insights into effective dropout strategies. These accuracy rates position the proposed model as a significant advance in the field, holding promise for improved communication accessibility for the Arabic-speaking deaf community.
Keywords: Convolutional Neural Network (CNN); AASL dataset; Dropout; Deep Learning; Communication Technology
Numerical investigation on the flow and power of small-sized multi-bladed straight Darrieus wind turbine (Cited by: 10)
18
Authors: JIANG Zhi-chao, DOI Yasuaki, ZHANG Shu-you. Journal of Zhejiang University-Science A (Applied Physics & Engineering), SCIE EI CAS CSCD, 2007, No. 9, pp. 1414-1421 (8 pages)
The straight Darrieus wind turbine has attractive characteristics, such as the ability to accept wind from any direction and easy installation and maintenance. However, its aerodynamic performance is very complicated, especially because of dynamic stall. How to obtain better aerodynamic performance attracts considerable interest in the design of a straight Darrieus wind turbine. This paper mainly discusses the effects of the number of blades and the tip speed ratio. Based on the numerical investigation, an assumed asymmetric straight Darrieus wind turbine is proposed to improve the averaged power coefficient. As for the numerical method, the flow around the turbine is simulated by solving the 2D unsteady Navier-Stokes equations combined with the continuity equation. A time-marching method on a body-fitted coordinate system, based on the MAC (Marker-and-Cell) method, is used. An O-type grid is generated for the whole calculation domain. The characteristics of the tangential and normal forces are discussed in relation to the dynamic stall of the blade. The averaged power coefficient per period of rotation is calculated to evaluate the eligibility of the turbine.
Keywords: small-sized; Straight Darrieus wind turbine; Multi-bladed; Power coefficient
Performances of Seven Datasets in Presenting the Upper Ocean Heat Content in the South China Sea (Cited by: 2)
19
Authors: 陈晓, 严幼芳, 程旭华, 齐义泉. Advances in Atmospheric Sciences, SCIE CAS CSCD, 2013, No. 5, pp. 1331-1342 (12 pages)
In this study, the upper ocean heat content (OHC) variations in the South China Sea (SCS) during 1993-2006 were investigated by examining ocean temperatures in seven datasets, including World Ocean Atlas 2009 (WOA09) (climatology), Ishii datasets, Ocean General Circulation Model for the Earth Simulator (OFES), Simple Ocean Data Assimilation system (SODA), Global Ocean Data Assimilation System (GODAS), China Oceanic ReAnalysis system (CORA), and an ocean reanalysis dataset for the joining area of Asia and Indian-Pacific Ocean (AIPO1.0). Among these datasets, two were independent of any numerical model, four relied on data assimilation, and one was generated without any data assimilation. The annual cycles revealed by the seven datasets were similar, but the interannual variations were different. Vertical structures of temperatures along the 18°N, 12.75°N, and 120°E sections were compared with data collected during open cruises in 1998 and 2005-08. The results indicated that Ishii, OFES, CORA, and AIPO1.0 were more consistent with the observations. Through systematic comparisons, we found that each dataset had its own shortcomings and advantages in presenting the upper OHC in the SCS.
Keywords: South China Sea; ocean heat content; multiple datasets; interannual variability
The Assessment of Global Surface Temperature Change from 1850s: The C-LSAT2.0 Ensemble and the CMST-Interim Datasets (Cited by: 9)
20
Authors: Wenbin SUN, Qingxiang LI, Boyin HUANG, Jiayi CHENG, Zhaoyang SONG, Haiyan LI, Wenjie DONG, Panmao ZHAI, Phil JONES. Advances in Atmospheric Sciences, SCIE CAS CSCD, 2021, No. 5, pp. 875-888 (14 pages)
Based on C-LSAT2.0, using high- and low-frequency components reconstruction methods, combined with observation constraint masking, a reconstructed C-LSAT2.0 with 756 ensemble members from the 1850s to 2018 has been developed. These ensemble versions have been merged with the ERSSTv5 ensemble dataset, and an upgraded version of the CMST-Interim dataset with 5°×5° resolution has been developed. The CMST-Interim dataset has significantly improved the coverage rate of global surface temperature data. After reconstruction, the data coverage before 1950 increased from 78%−81% of the original CMST to 81%−89%. The total coverage after 1955 reached about 93%, including more than 98% in the Northern Hemisphere and 81%−89% in the Southern Hemisphere. Through the reconstruction ensemble experiments with different parameters, a good basis is provided for more systematic uncertainty assessment of C-LSAT2.0 and CMST-Interim. In comparison with the original CMST, the global mean surface temperatures are estimated to be cooler in the second half of the 19th century and warmer during the 21st century, which shows that the global warming trend is further amplified. The global warming trends are updated from 0.085±0.004℃ (10 yr)^(–1) and 0.128±0.006℃ (10 yr)^(–1) to 0.089±0.004℃ (10 yr)^(–1) and 0.137±0.007℃ (10 yr)^(–1), respectively, since the start and the second half of the 20th century.
Keywords: C-LSAT2.0 ensemble datasets; CMST-Interim; EOTs; high- and low-frequency components; reconstruction