Rock mass quality prediction on tunnel faces with incomplete multi-source dataset via tree-augmented naive Bayesian network
1
Authors: Hongwei Huang, Chen Wu, Mingliang Zhou, Jiayao Chen, Tianze Han, Le Zhang. International Journal of Mining Science and Technology, SCIE EI CAS CSCD, 2024, Issue 3, pp. 323-337 (15 pages)
Rock mass quality serves as a vital index for predicting the stability and safety status of rock tunnel faces. In tunneling practice, rock mass quality is often assessed via a combination of qualitative and quantitative parameters. However, due to the harsh on-site construction conditions, it is rather difficult to obtain some of the evaluation parameters that are essential for rock mass quality prediction. In this study, a novel improved Swin Transformer is proposed to detect, segment, and quantify rock mass characteristic parameters such as water leakage, fractures, and weak interlayers. The site experiment results demonstrate that the improved Swin Transformer achieves optimal segmentation results, with accuracies of 92%, 81%, and 86% for water leakage, fractures, and weak interlayers, respectively. A multi-source rock tunnel face characteristic (RTFC) dataset that includes 11 parameters for predicting rock mass quality is established. Considering the limited predictive performance caused by the incomplete evaluation parameters in this dataset, a novel tree-augmented naive Bayesian network (BN) is proposed to address the challenge of the incomplete dataset; it achieves a prediction accuracy of 88%. In comparison with other commonly used machine learning models, the proposed BN-based approach shows improved performance in predicting rock mass quality with the incomplete dataset. Using the established BN, a further sensitivity analysis is conducted to quantitatively evaluate the importance of the various parameters. The results indicate that the rock strength and fracture parameters exert the most significant influence on rock mass quality.
Keywords: Rock mass quality Tunnel faces Incomplete multi-source dataset Improved Swin Transformer Bayesian networks
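The abstract above turns on one idea: a Bayesian classifier can still predict when some evaluation parameters are missing. The sketch below illustrates that idea with a plain naive Bayes classifier, which is a deliberate simplification and not the tree-augmented network of the paper. The feature meanings, the toy records, and the function names `train` and `predict` are all hypothetical; a missing value is encoded as `None` and is simply left out of the likelihood product, which marginalises it out under the naive independence assumption.

```python
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class frequencies and per-feature value frequencies.

    Missing values (None) are skipped, so incomplete records still
    contribute the features they do have.
    """
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            if value is not None:
                value_counts[(i, label)][value] += 1
    return class_counts, value_counts


def predict(model, row, n_values=2):
    """Return normalised class posteriors for one (possibly incomplete) record.

    n_values is the assumed number of distinct values per feature,
    used for Laplace (add-one) smoothing.
    """
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # class prior
        for i, value in enumerate(row):
            if value is None:
                continue  # missing evidence: omit this factor
            seen = value_counts[(i, label)]
            score *= (seen[value] + 1) / (sum(seen.values()) + n_values)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


# Hypothetical toy records: (fractured?, water leakage?) -> rock mass quality
rows = [("yes", "yes"), ("yes", None), ("no", "no"), ("no", "no"), ("yes", "no")]
labels = ["poor", "poor", "good", "good", "good"]
model = train(rows, labels)
posterior = predict(model, ("yes", None))  # water leakage not observed
print(posterior)
```

A tree-augmented network would additionally let each feature depend on one other feature; the handling of missing evidence would then require summing over the unobserved parent rather than just dropping a factor.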
A Method of Generating Semi-Experimental Biomedical Datasets
2
Authors: Jing Wang, Naike Du, Zi He, Xiuzhu Ye. Journal of Beijing Institute of Technology, EI CAS, 2024, Issue 3, pp. 219-226 (8 pages)
This paper proposes a method to generate semi-experimental biomedical datasets based on full-wave simulation software. System noise such as antenna port coupling is fully considered in the proposed datasets, which makes them more realistic than synthetic datasets. In this paper, datasets containing different shapes are constructed based on the relative permittivities of human tissues. Then, a back-propagation scheme is used to obtain rough reconstructions, which are fed into a U-net convolutional neural network (CNN) to recover high-resolution images. Numerical results show that the network trained on the datasets generated by the proposed method can obtain satisfactory reconstruction results and is promising for real-time biomedical imaging.
Keywords: electromagnetic imaging dataset biomedical imaging
Performance evaluation of seven multi-label classification methods on real-world patent and publication datasets
3
Authors: Shuo Xu, Yuefu Zhang, Xin An, Sainan Pi. Journal of Data and Information Science, CSCD, 2024, Issue 2, pp. 81-103 (23 pages)
Purpose: Many science, technology and innovation (STI) resources are attached with several different labels. To automatically assign the resulting labels to an instance of interest, many approaches with good performance on benchmark datasets have been proposed for the multi-label classification task in the literature. Furthermore, several open-source tools implementing these approaches have also been developed. However, the characteristics of real-world multi-label patent and publication datasets are not completely in line with those of benchmark ones. Therefore, the main purpose of this paper is to comprehensively evaluate seven multi-label classification methods on real-world datasets. Research limitations: The three real-world datasets differ in the following aspects: statement, data quality, and purposes. Additionally, open-source tools designed for multi-label classification also have intrinsic differences in their approaches to data processing and feature selection, which in turn impacts the performance of a multi-label classification approach. In the near future, we will enhance experimental precision and reinforce the validity of conclusions by employing more rigorous control over variables through introducing expanded parameter settings. Practical implications: The observed Macro F1 and Micro F1 scores on real-world datasets typically fall short of those achieved on benchmark datasets, underscoring the complexity of real-world multi-label classification tasks. Approaches leveraging deep learning techniques offer promising solutions by accommodating the hierarchical relationships and interdependencies among labels. With ongoing enhancements in deep learning algorithms and large-scale models, it is expected that the efficacy of multi-label classification tasks will be significantly improved, reaching a level of practical utility in the foreseeable future. Originality/value: (1) Seven multi-label classification methods are comprehensively compared on three real-world datasets. (2) The TextCNN and TextRCNN models perform better on small-scale datasets with more complex hierarchical structure of labels and more balanced document-label distribution. (3) The MLkNN method works better on the larger-scale dataset with more unbalanced document-label distribution.
Keywords: Multi-label classification Real-World datasets Hierarchical structure Classification system Label correlation Machine learning
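The Macro F1 and Micro F1 scores mentioned in the abstract above behave differently on unbalanced label distributions, which is part of why both are reported. The following is a minimal sketch of the standard definitions in plain Python; the label names and the toy predictions are hypothetical and are not taken from the paper's datasets.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(y_true, y_pred, labels):
    """y_true and y_pred are lists of label sets, one set per document."""
    counts = {label: [0, 0, 0] for label in labels}  # tp, fp, fn per label
    for true, pred in zip(y_true, y_pred):
        for label in labels:
            if label in pred and label in true:
                counts[label][0] += 1
            elif label in pred:
                counts[label][1] += 1
            elif label in true:
                counts[label][2] += 1
    # Macro: average the per-label F1 scores, so rare labels weigh as much as common ones.
    macro = sum(f1(*c) for c in counts.values()) / len(labels)
    # Micro: pool the counts over all labels first, so frequent labels dominate.
    tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
    micro = f1(tp, fp, fn)
    return macro, micro


# Hypothetical toy example: label "A" is frequent, "B" and "C" are rare.
labels = ["A", "B", "C"]
y_true = [{"A", "B"}, {"A"}, {"A", "C"}, {"A"}]
y_pred = [{"A"}, {"A"}, {"A"}, {"A", "B"}]
macro, micro = macro_micro_f1(y_true, y_pred, labels)
print(f"macro={macro:.3f} micro={micro:.3f}")
```

In this toy case the classifier gets the frequent label right and the rare labels wrong, so the micro score is much higher than the macro score.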
A multi-source information fusion layer counting method for penetration fuze based on TCN-LSTM
4
Authors: Yili Wang, Changsheng Li, Xiaofeng Wang. Defence Technology (防务技术), SCIE EI CAS CSCD, 2024, Issue 3, pp. 463-474 (12 pages)
When employing penetration ammunition to strike multi-story buildings, detection methods using acceleration sensors suffer from signal aliasing, while magnetic detection methods are susceptible to interference from ferromagnetic materials, which makes it difficult to determine the number of layers accurately. To address this issue, this research proposes a layer counting method for penetration fuzes that incorporates multi-source information fusion, utilizing both the temporal convolutional network (TCN) and the long short-term memory (LSTM) recurrent network. By leveraging the strengths of these two network structures, the method extracts temporal and high-dimensional features from the multi-source physical field during the penetration process, establishing a relationship between the multi-source physical field and the distance between the fuze and the target plate. A simulation model is developed to simulate the overload and magnetic field of a projectile penetrating multiple layers of target plates, capturing the multi-source physical field signals and their patterns during the penetration process. The analysis reveals that the proposed multi-source fusion layer counting method reduces errors by 60% and 50% compared to single overload layer counting and single magnetic anomaly signal layer counting, respectively. The model's predictive performance is evaluated under various operating conditions, including different ratios of added noise to random sample positions, penetration speeds, and spacing between target plates. The maximum errors in fuze penetration time predicted by the three modes are 0.08 ms, 0.12 ms, and 0.16 ms, respectively, confirming the robustness of the proposed model. Moreover, the model's predictions indicate that the fitting degree for large interlayer spacings is superior to that for small interlayer spacings due to the influence of stress waves.
Keywords: Penetration fuze Temporal convolutional network(TCN) Long short-term memory(LSTM) Layer counting multi-source fusion
Performance Analysis of Support Vector Machine (SVM) on Challenging Datasets for Forest Fire Detection
5
Authors: Ankan Kar, Nirjhar Nath, Utpalraj Kemprai, Aman. International Journal of Communications, Network and System Sciences, 2024, Issue 2, pp. 11-29 (19 pages)
This article delves into the analysis of performance and utilization of Support Vector Machines (SVMs) for the critical task of forest fire detection using image datasets. With the increasing threat of forest fires to ecosystems and human settlements, the need for rapid and accurate detection systems is of utmost importance. SVMs, renowned for their strong classification capabilities, exhibit proficiency in recognizing patterns associated with fire within images. By training on labeled data, SVMs acquire the ability to identify distinctive attributes associated with fire, such as flames, smoke, or alterations in the visual characteristics of the forest area. The document thoroughly examines the use of SVMs, covering crucial elements like data preprocessing, feature extraction, and model training. It rigorously evaluates parameters such as accuracy, efficiency, and practical applicability. The knowledge gained from this study aids in the development of efficient forest fire detection systems, enabling prompt responses and improving disaster management. Moreover, the correlation between SVM accuracy and the difficulties presented by high-dimensional datasets is carefully investigated, demonstrated through a revealing case study. The relationship between accuracy scores and the different resolutions used for resizing the training datasets has also been discussed in this article. These comprehensive studies result in a definitive overview of the difficulties faced and the potential sectors requiring further improvement and focus.
Keywords: Support Vector Machine Challenging datasets Forest Fire Detection CLASSIFICATION
Multi-source heterogeneous data access management framework and key technologies for electric power Internet of Things
6
Authors: Pengtian Guo, Kai Xiao, Xiaohui Wang, Daoxing Li. Global Energy Interconnection, EI CSCD, 2024, Issue 1, pp. 94-105 (12 pages)
The power Internet of Things (IoT) is a significant trend in technology and a requirement for national strategic development. With the deepening digital transformation of the power grid, China's power system has initially built a power IoT architecture comprising a perception, network, and platform application layer. However, owing to the structural complexity of the power system, the construction of the power IoT continues to face problems such as complex access management of massive heterogeneous equipment, diverse IoT protocol access methods, high concurrency of network communications, and weak data security protection. To address these issues, this study optimizes the existing architecture of the power IoT and designs an integrated management framework for the access of multi-source heterogeneous data in the power IoT, comprising cloud, pipe, edge, and terminal parts. It further reviews and analyzes the key technologies involved in the power IoT, such as the unified management of the physical model, high concurrent access, multi-protocol access, multi-source heterogeneous data storage management, and data security control, to provide a more flexible, efficient, secure, and easy-to-use solution for multi-source heterogeneous data access in the power IoT.
Keywords: Power Internet of Things Object model High concurrency access Zero trust mechanism multi-source heterogeneous data
Runout prediction of potential landslides based on the multi-source data collaboration analysis on historical cases
7
Authors: Jun Sun, Yu Zhuang, Ai-guo Xing. China Geology, CAS CSCD, 2024, Issue 2, pp. 264-276 (13 pages)
Long runout landslides involve a massive amount of energy and can be extremely hazardous owing to their long movement distance, high mobility and strong destructive power. Numerical methods have been widely used to predict landslide runout, but a fundamental remaining problem is how to determine reliable numerical parameters. This study proposes a framework to predict the runout of potential landslides through multi-source data collaboration and numerical analysis of historical landslide events. Specifically, for the historical landslide cases, the landslide-induced seismic signal, geophysical surveys, and possible in-situ drone/phone videos (multi-source data collaboration) can validate the numerical results in terms of landslide dynamics and deposit features and help calibrate the numerical (rheological) parameters. Subsequently, the calibrated numerical parameters can be used to numerically predict the runout of potential landslides in a region with a geological setting similar to that of the recorded events. Application of the runout prediction approach to the 2020 Jiashanying landslide in Guizhou, China gives reasonable results in comparison to the field observations. The numerical parameters are determined from the multi-source data collaboration analysis of a historical case in the region (2019 Shuicheng landslide). The proposed framework for landslide runout prediction can be of great utility for landslide risk assessment and disaster reduction in mountainous regions worldwide.
Keywords: Landslide runout prediction Drone survey multi-source data collaboration DAN3D numerical modeling Jianshanying landslide Guizhou Province Geological hazards survey engineering
A Web-Based Approach for the Efficient Management of Massive Multi-source 3D Models
8
Authors: ZHAO Qiansheng, TANG Ruibing, PENG Mingjun, GUO Mingwu. Journal of Geodesy and Geoinformation Science, CSCD, 2024, Issue 3, pp. 24-41 (18 pages)
Effectively managing extensive, multi-source, and multi-level real-scene 3D models for responsive retrieval scheduling and rapid visualization in the Web environment is a significant challenge in the current development of real-scene 3D applications in China. In this paper, we address this challenge by reorganizing spatial and temporal information into a 3D geospatial grid. We introduce the Global 3D Geocoding System (G3DGS), leveraging neighborhood similarity and uniqueness for efficient storage, retrieval, updating, and scheduling of these models. A combination of G3DGS and non-relational databases is implemented, enhancing data storage scalability and flexibility. Additionally, a model detail management scheduling strategy (TLOD) based on G3DGS and an importance factor T is designed. Compared with mainstream commercial and open-source platforms, this method significantly enhances the loadable capacity of massive multi-source real-scene 3D models in the Web environment by 33%, improves browsing efficiency by 48%, and accelerates invocation speed by 40%.
Keywords: massive multi-source real-scene 3D model non-relational database global 3D geocoding system importance factor massive model management
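The abstract above does not spell out how G3DGS codes are built, so the following is only an illustration of the general idea of a 3D grid code with uniqueness and neighborhood locality. It uses a 3D Morton (Z-order) code, which is a swapped-in, well-known technique and an assumption here, not the paper's scheme. The function names and the `bits` parameter are hypothetical.

```python
def morton_encode(x, y, z, bits=10):
    """Interleave the low `bits` bits of x, y, z into one integer key.

    Each grid cell gets a unique key, and the eight children of one
    parent cell share the same key prefix (key >> 3).
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def morton_decode(code, bits=10):
    """Inverse of morton_encode: recover the (x, y, z) cell indices."""
    x = y = z = 0
    for i in range(bits):
        x |= ((code >> (3 * i)) & 1) << i
        y |= ((code >> (3 * i + 1)) & 1) << i
        z |= ((code >> (3 * i + 2)) & 1) << i
    return x, y, z


key = morton_encode(5, 9, 3)
print(key, morton_decode(key))
```

A key of this kind can serve as the document ID or sort key in a non-relational store, so that range scans over a key prefix return all models inside one coarser grid cell.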
Multi-source Data-driven Identification of Urban Functional Areas: A Case of Shenyang, China (cited by: 3)
9
Authors: XUE Bing, XIAO Xiao, LI Jingzhong, ZHAO Bingyu, FU Bo. Chinese Geographical Science, SCIE CSCD, 2023, Issue 1, pp. 21-35 (15 pages)
Urban functional area (UFA) is a core scientific issue affecting urban sustainability. The current knowledge gap is mainly reflected in the lack of multi-scale quantitative interpretation methods from the perspective of human-land interaction. In this paper, based on multi-source big data including 250 m×250 m resolution cell phone data, 1.81×10^5 Points of Interest (POI) data and administrative boundary data, we built a UFA identification method and demonstrated it empirically in Shenyang City, China. We argue that the method we built can effectively identify multi-scale, multi-type UFAs based on human activity and further reveal the spatial correlation between urban facilities and human activity. The empirical study suggests that the employment functional zones in Shenyang City are more concentrated in central cities than other single functional zones. There are more mixed functional areas in the central city areas, while the planned industrial new cities need to develop comprehensive functions in Shenyang. UFAs have scale effects and human-land interaction patterns. We suggest that city decision makers should apply multi-source big data to measure urban functional service in a more refined manner from a supply-demand perspective.
Keywords: human-land relationship multi-source big data urban functional area identification method Shenyang City
Multi-Source Data Privacy Protection Method Based on Homomorphic Encryption and Blockchain (cited by: 2)
10
Authors: Ze Xu, Sanxing Cao. Computer Modeling in Engineering & Sciences, SCIE EI, 2023, Issue 7, pp. 861-881 (21 pages)
Multi-source data plays an important role in the evolution of media convergence. Its fusion processing enables the further mining of data and utilization of data value and broadens the path for the sharing and dissemination of media data. However, it also faces serious problems in terms of protecting user and data privacy. Many privacy protection methods have been proposed to solve the problem of privacy leakage during the process of data sharing, but they suffer from two flaws: 1) the lack of algorithmic frameworks for specific scenarios such as dynamic datasets in the media domain; 2) the inability to solve the problem of the high computational complexity of ciphertext in multi-source data privacy protection, resulting in long encryption and decryption times. In this paper, we propose a multi-source data privacy protection method based on homomorphic encryption and blockchain technology, which solves the privacy protection problem of multi-source heterogeneous data in the dissemination of media and reduces ciphertext processing time. We deployed the proposed method on the Hyperledger platform for testing and compared it with privacy protection schemes based on k-anonymity and differential privacy. The experimental results show that the key generation, encryption, and decryption times of the proposed method are lower than those of data privacy protection methods based on k-anonymity technology and differential privacy technology. This significantly reduces the processing time of multi-source data, which gives it potential for use in many applications.
Keywords: Homomorphic encryption blockchain technology multi-source data data privacy protection privacy data processing
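The abstract above does not say which homomorphic scheme the authors use. As a minimal sketch of what "homomorphic" means in practice, the toy example below uses the Paillier cryptosystem, which is an assumption chosen for illustration: multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts. The primes are tiny and hard-coded, so this is for demonstration only and offers no security; the function names are hypothetical. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random


def keygen(p=293, q=433):
    """Toy Paillier key generation from two small primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                # common simple choice of generator
    mu = pow(lam, -1, n)     # valid because g = n + 1
    return (n, g), (lam, mu)


def encrypt(pub, m, rng=random):
    """Encrypt integer m (0 <= m < n) with fresh randomness r."""
    n, g = pub
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover the plaintext from ciphertext c."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    n, _ = pub
    return (c1 * c2) % (n * n)


pub, priv = keygen()
c1, c2 = encrypt(pub, 17), encrypt(pub, 25)
total = decrypt(pub, priv, add_encrypted(pub, c1, c2))
print(total)
```

The party that computes `add_encrypted` never sees 17 or 25, which is the property that makes such schemes attractive for aggregating data from several sources.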
Recent trends of machine learning applied to multi-source data of medicinal plants (cited by: 1)
11
Authors: Yanying Zhang, Yuanzhong Wang. Journal of Pharmaceutical Analysis, SCIE CAS CSCD, 2023, Issue 12, pp. 1388-1407 (20 pages)
In traditional medicine and ethnomedicine, medicinal plants have long been recognized as the basis for materials in therapeutic applications worldwide. In particular, the remarkable curative effect of traditional Chinese medicine during the coronavirus disease 2019 (COVID-19) pandemic has attracted extensive attention globally. Medicinal plants have, therefore, become increasingly popular among the public. However, with increasing demand for and profit from medicinal plants, commercial fraud such as adulteration or counterfeiting sometimes occurs, which poses a serious threat to clinical outcomes and the interests of consumers. With rapid advances in artificial intelligence, machine learning can be used to mine information on various medicinal plants to establish an ideal resource database. We herein present a review that mainly introduces common machine learning algorithms and discusses their application in multi-source data analysis of medicinal plants. The combination of machine learning algorithms and multi-source data analysis facilitates a comprehensive analysis and aids in the effective evaluation of the quality of medicinal plants. The findings of this review provide new possibilities for promoting the development and utilization of medicinal plants.
Keywords: Machine learning Medicinal plant multi-source data Data fusion Application
Threat Modeling and Application Research Based on Multi-Source Attack and Defense Knowledge
12
Authors: Shuqin Zhang, Xinyu Su, Peiyu Shi, Tianhui Du, Yunfei Han. Computers, Materials & Continua, SCIE EI, 2023, Issue 10, pp. 349-377 (29 pages)
Cyber Threat Intelligence (CTI) is a valuable resource for cybersecurity defense, but it also poses challenges due to its multi-source and heterogeneous nature. Security personnel may be unable to use CTI effectively to understand the condition and trend of a cyberattack and respond promptly. To address these challenges, we propose a novel approach that consists of three steps. First, we construct the attack and defense analysis of the cybersecurity ontology (ADACO) model by integrating multiple cybersecurity databases. Second, we develop the threat evolution prediction algorithm (TEPA), which can automatically detect threats at device nodes, correlate and map multi-source threat information, and dynamically infer the threat evolution process. TEPA leverages knowledge graphs to represent comprehensive threat scenarios and achieves better performance in simulated experiments by combining structural and textual features of entities. Third, we design the intelligent defense decision algorithm (IDDA), which can provide intelligent recommendations for security personnel regarding the most suitable defense techniques. IDDA outperforms the baseline methods in the comparative experiment.
Keywords: multi-source data fusion threat modeling threat propagation path knowledge graph intelligent defense decision-making
Risk Analysis Using Multi-Source Data for Distribution Networks Facing Extreme Natural Disasters
13
Authors: Jun Yang, Nannan Wang, Jiang Wang, Yashuai Luo. Energy Engineering, EI, 2023, Issue 9, pp. 2079-2096 (18 pages)
Distribution networks are important public infrastructure necessary for people's livelihoods. However, extreme natural disasters, such as earthquakes, typhoons, and mudslides, severely threaten the safe and stable operation of distribution networks and the power supplies needed for daily life. Therefore, considering the requirements for distribution network disaster prevention and mitigation, there is an urgent need for in-depth research on risk assessment methods for distribution networks under extreme natural disaster conditions. This paper accesses multi-source data, presents data quality improvement methods for distribution networks, and conducts data-driven active fault diagnosis and disaster damage analysis and evaluation. Furthermore, the paper realizes real-time, accurate access to distribution network disaster information. In a case study, the proposed approach performs an accurate and rapid assessment of cross-sectional risk, and the minimal average annual outage time in the ring network can be reduced to 3 h/a. The approach proposed in this paper can provide technical support for further improving the ability of distribution networks to cope with extreme natural disasters.
Keywords: Distribution network disaster damage analysis fault judgment multi-source data
Evaluation and Improvement Strategies for Slow Traffic Systems Based on Multi-source Big Data: A Case Study of Shijingshan District of Beijing City
14
Authors: LI Yiwen. Journal of Landscape Research, 2023, Issue 4, pp. 62-64, 68 (4 pages)
The slow traffic system is an important component of urban transportation, and establishing a good urban slow traffic system is a prerequisite for Beijing to continue promoting "green priority". Shijingshan District of Beijing City is taken as the research object. By analyzing and processing population distribution data, POI data, and shared bicycle data, the shortcomings and deficiencies of the current slow traffic system in Shijingshan District are explored, and corresponding solutions are proposed, in order to provide new ideas and methods for future urban planning from the perspective of data.
Keywords: multi-source data Slow traffic system Shijingshan District
Classification of Beijing Line 10 Subway Living Circle Based on Multi-source Big Data
15
Authors: SUN Shuai, LI Ziying. Journal of Landscape Research, 2023, Issue 3, pp. 53-58 (6 pages)
In first-tier cities, the subway has become an important carrier and focus of people's daily travel activities. By studying the distribution of POIs of public service facilities around Metro Line 10, using GIS to quantitatively analyze the business formats surrounding subway stations, and discussing the functional attributes of subway stations and the distribution of urban functions from a new perspective, this paper provides guidance and advice for the construction of service facilities.
Keywords: multi-source big data Subway living circle BEIJING GIS
A Comprehensive Analysis of Datasets for Automotive Intrusion Detection Systems
16
Authors: Seyoung Lee, Wonsuk Choi, Insup Kim, Ganggyu Lee, Dong Hoon Lee. Computers, Materials & Continua, SCIE EI, 2023, Issue 9, pp. 3413-3442 (30 pages)
Recently, automotive intrusion detection systems (IDSs) have emerged as promising defense approaches to counter attacks on in-vehicle networks (IVNs). However, the effectiveness of IDSs relies heavily on the quality of the datasets used for training and evaluation. Despite the availability of several datasets for automotive IDSs, there has been a lack of comprehensive analysis focusing on assessing these datasets. This paper aims to address the need for dataset assessment in the context of automotive IDSs. It proposes qualitative and quantitative metrics that are independent of specific automotive IDSs to evaluate the quality of datasets. These metrics take into consideration various aspects such as dataset description, collection environment, and attack complexity. This paper evaluates eight commonly used datasets for automotive IDSs using the proposed metrics. The evaluation reveals biases in the datasets, particularly in terms of limited contexts and lack of diversity. Additionally, it highlights that the attacks in the datasets were mostly injected without considering normal behaviors, which poses challenges for training and evaluating machine learning-based IDSs. This paper emphasizes the importance of addressing the identified limitations in existing datasets to improve the performance and adaptability of automotive IDSs. The proposed metrics can serve as valuable guidelines for researchers and practitioners in selecting and constructing high-quality datasets for automotive security applications. Finally, this paper presents the requirements for high-quality datasets, including the need for representativeness, diversity, and balance.
Keywords: Controller area network(CAN) intrusion detection system(IDS) automotive security machine learning(ML) dataset
Empirical Analysis of Neural Networks-Based Models for Phishing Website Classification Using Diverse Datasets
17
Authors: Shoaib Khan, Bilal Khan, Saifullah Jan, Subhan Ullah, Aiman. Journal of Cyber Security, 2023, Issue 1, pp. 47-66 (20 pages)
Phishing attacks pose a significant security threat by masquerading as trustworthy entities to steal sensitive information, a problem that persists despite user awareness. This study addresses the pressing issue of phishing attacks on websites and assesses the performance of three prominent Machine Learning (ML) models, namely Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM), using authentic datasets sourced from the Kaggle and Mendeley repositories. Extensive experimentation and analysis reveal that the CNN model achieves the best accuracy, 98%, while LSTM shows the lowest accuracy, 96%. These findings underscore the potential of ML techniques in enhancing phishing detection systems and bolstering cybersecurity measures against evolving phishing tactics, offering a promising avenue for safeguarding sensitive information and online security.
Keywords: Artificial neural networks phishing websites network security machine learning phishing datasets CLASSIFICATION
Research on Enhanced Contraband Dataset ACXray Based on ETL
18
作者 Xueping Song Jianming Yang +1 位作者 Shuyu Zhang Jicun Zhang 《Computers, Materials & Continua》 SCIE EI 2024年第6期4551-4572,共22页
To address the shortage of public datasets for customs X-ray images of contraband and the difficulties in deploying trained models in engineering applications,a method has been proposed that employs the Extract-Transf... To address the shortage of public datasets for customs X-ray images of contraband and the difficulties in deploying trained models in engineering applications,a method has been proposed that employs the Extract-Transform-Load(ETL)approach to create an X-ray dataset of contraband items.Initially,X-ray scatter image data is collected and cleaned.Using Kafka message queues and the Elasticsearch(ES)distributed search engine,the data is transmitted in real-time to cloud servers.Subsequently,contraband data is annotated using a combination of neural networks and manual methods to improve annotation efficiency and implemented mean hash algorithm for quick image retrieval.The method of integrating targets with backgrounds has enhanced the X-ray contraband image data,increasing the number of positive samples.Finally,an Airport Customs X-ray dataset(ACXray)compatible with customs business scenarios has been constructed,featuring an increased number of positive contraband samples.Experimental tests using three datasets to train the Mask Region-based Convolutional Neural Network(Mask R-CNN)algorithm and tested on 400 real customs images revealed that the recognition accuracy of algorithms trained with Security Inspection X-ray(SIXray)and Occluded Prohibited Items X-ray(OPIXray)decreased by 16.3%and 15.1%,respectively,while the ACXray dataset trained algorithm’s accuracy was almost unaffected.This indicates that the ACXray dataset-trained algorithm possesses strong generalization capabilities and is more suitable for customs detection scenarios. 展开更多
Keywords: X-ray contraband; ETL; data enhancement; dataset
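The abstract above mentions a mean hash algorithm for quick image retrieval but gives no implementation details. As a rough illustration only, the following is a minimal sketch of the commonly used average-hash (aHash) technique, assuming the image is already available as a 2D grid of grayscale values. The function names `downscale`, `average_hash`, and `hamming_distance` are hypothetical and are not taken from the paper.

```python
def downscale(pixels, size):
    """Block-average a 2D grayscale grid down to size x size cells."""
    rows, cols = len(pixels), len(pixels[0])
    out = []
    for i in range(size):
        r0 = i * rows // size
        r1 = max((i + 1) * rows // size, r0 + 1)  # at least one source row
        row = []
        for j in range(size):
            c0 = j * cols // size
            c1 = max((j + 1) * cols // size, c0 + 1)  # at least one source column
            block = [pixels[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(pixels, size=8):
    """Return a (size * size)-bit integer: one bit per cell, set if cell > mean."""
    small = downscale(pixels, size)
    flat = [value for row in small for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count differing bits; a small distance suggests visually similar images."""
    return bin(hash_a ^ hash_b).count("1")


if __name__ == "__main__":
    # Synthetic 16x16 image: dark left half, bright right half (assumed test data).
    image = [[0] * 8 + [255] * 8 for _ in range(16)]
    # Same layout with slightly shifted intensities.
    similar = [[10] * 8 + [245] * 8 for _ in range(16)]
    print(hex(average_hash(image)))
    print(hamming_distance(average_hash(image), average_hash(similar)))
```

In a retrieval setting, hashes would typically be computed once per stored image, and a query image would be matched by looking for stored hashes within a small Hamming distance. The distance threshold is a tuning choice that the abstract does not specify.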
The accessible seismological dataset of a high-density 2D seismic array along Anninghe fault
19
Authors: Weifan Lu, Zeyan Zhao, Han Yue, Shiyong Zhou, Jianping Wu, Xiaodong Song. Earthquake Science, 2024, Issue 1, pp. 67-77 (11 pages)
The scientific goal of the Anninghe seismic array is to investigate the detailed geometry of the Anninghe fault and the velocity structure of the fault zone. This 2D seismic array is composed of 161 stations forming a sub-rectangular geometry along the Anninghe fault, covering 50 km and 150 km in the fault-normal and strike directions, respectively, with ~5 km intervals. The data were collected between June 2020 and June 2021, with some temporal gaps. Two types of instruments, QS-05A and SmartSolo, are used in this array. Data quality and examples of seismograms are provided in this paper. After the data protection period ends (expected in June 2024), researchers can request the dataset from the National Earthquake Science Data Center.
Keywords: Anninghe fault; seismological dataset; data share
Optimizing Enterprise Conversational AI: Accelerating Response Accuracy with Custom Dataset Fine-Tuning
20
Author: Yash Kishore. Intelligent Information Management, 2024, Issue 2, pp. 65-76 (12 pages)
As the realm of enterprise-level conversational AI continues to evolve, it becomes evident that while generalized Large Language Models (LLMs) like GPT-3.5 bring remarkable capabilities, they also bring forth formidable challenges. These models, honed on vast and diverse datasets, have undoubtedly pushed the boundaries of natural language understanding and generation. However, they often stumble when faced with the intricate demands of nuanced enterprise applications. This research advocates for a strategic paradigm shift, urging enterprises to embrace a fine-tuning approach as a means to optimize conversational AI. While generalized LLMs are linguistic marvels, their inability to cater to the specific needs of businesses across various industries poses a critical challenge. This strategic shift involves empowering enterprises to seamlessly integrate their own datasets into LLMs, a process that extends beyond linguistic enhancement. The core concept of this approach centers on customization, enabling businesses to fine-tune the AI’s functionality to fit precisely within their unique business landscapes. By immersing the LLM in industry-specific documents, customer interaction records, internal reports, and regulatory guidelines, the AI transcends its generic capabilities to become a sophisticated conversational partner aligned with the intricacies of the enterprise’s domain. The transformative potential of this fine-tuning approach cannot be overstated. It enables a transition from a universal AI solution to a highly customizable tool. The AI evolves from being a linguistic powerhouse to a contextually aware, industry-savvy assistant.
As a result, it not only responds with linguistic accuracy but also with depth, relevance, and resonance, significantly elevating user experiences and operational efficiency. In the subsequent sections, this paper delves into the intricacies of fine-tuning, exploring the multifaceted challenges and abundant opportunities it presents. It addresses the technical intricacies of data integration, ethical considerations surrounding data usage, and the broader implications for the future of enterprise AI. The journey embarked upon in this research holds the potential to redefine the role of conversational AI in enterprises, ushering in an era where AI becomes a dynamic, deeply relevant, and highly effective tool, empowering businesses to excel in an ever-evolving digital landscape.
Keywords: Fine-tuning; dataset; AI; conversational; enterprise; LLM