2,207 articles found
Enhancing Relational Triple Extraction in Specific Domains: Semantic Enhancement and Synergy of Large Language Models and Small Pre-Trained Language Models
1
Authors: Jiakai Li, Jianpeng Hu, Geng Zhang. Computers, Materials & Continua (SCIE, EI), 2024, No. 5, pp. 2481-2503 (23 pages)
In the process of constructing domain-specific knowledge graphs, the task of relational triple extraction plays a critical role in transforming unstructured text into structured information. Existing relational triple extraction models face multiple challenges when processing domain-specific data, including insufficient utilization of semantic interaction information between entities and relations, difficulties in handling challenging samples, and the scarcity of domain-specific datasets. To address these issues, our study introduces three innovative components: relation semantic enhancement, data augmentation, and a voting strategy, all designed to significantly improve the model’s performance in tackling domain-specific relational triple extraction tasks. We first propose an innovative attention interaction module. This method significantly enhances the semantic interaction capabilities between entities and relations by integrating semantic information from relation labels. Second, we propose a voting strategy that effectively combines the strengths of large language models (LLMs) and fine-tuned small pre-trained language models (SLMs) to reevaluate challenging samples, thereby improving the model’s adaptability in specific domains. Additionally, we explore the use of LLMs for data augmentation, aiming to generate domain-specific datasets to alleviate the scarcity of domain data. Experiments conducted on three domain-specific datasets demonstrate that our model outperforms existing comparative models in several aspects, with F1 scores exceeding the state-of-the-art models by 2%, 1.6%, and 0.6%, respectively, validating the effectiveness and generalizability of our approach.
Keywords: relational triple extraction; semantic interaction; large language models; data augmentation; specific domains
Security Vulnerability Analyses of Large Language Models (LLMs) through Extension of the Common Vulnerability Scoring System (CVSS) Framework
2
Authors: Alicia Biju, Vishnupriya Ramesh, Vijay K. Madisetti. Journal of Software Engineering and Applications, 2024, No. 5, pp. 340-358 (19 pages)
Large Language Models (LLMs) have revolutionized Generative Artificial Intelligence (GenAI) tasks, becoming an integral part of various applications in society, including text generation, translation, summarization, and more. However, their widespread usage emphasizes the critical need to enhance their security posture to ensure the integrity and reliability of their outputs and minimize harmful effects. Prompt injections and training data poisoning attacks are two of the most prominent vulnerabilities in LLMs, which could potentially lead to unpredictable and undesirable behaviors, such as biased outputs, misinformation propagation, and even malicious content generation. The Common Vulnerability Scoring System (CVSS) framework provides a standardized approach to capturing the principal characteristics of vulnerabilities, facilitating a deeper understanding of their severity within the security and AI communities. By extending the current CVSS framework, we generate scores for these vulnerabilities such that organizations can prioritize mitigation efforts, allocate resources effectively, and implement targeted security measures to defend against potential risks.
Keywords: Common Vulnerability Scoring System (CVSS); Large Language Models (LLMs); DALL-E; prompt injections; training data poisoning; CVSS metrics
Extensible Markup Language Data Mining System Model
3
Authors: 李炜, 宋瀚涛. Journal of Beijing Institute of Technology (EI, CAS), 2003, No. 1, pp. 28-32 (5 pages)
Existing data mining methods mostly focus on relational databases and structured data, not on complex structured data such as extensible markup language (XML) data. By converting the XML document type description into relational semantics that record the relations in XML data, and by using an XML data mining language, the XML data mining system presents a strategy for mining information from XML.
Keywords: extensible markup language (XML); document type description (DTD); data mining; data mining language; relational schema
Web Data Aggregation in MOLAP: Approach, Language, and Implementation
4
Authors: 朱焱, 唐慧佳, 马永强. Journal of Southwest Jiaotong University (English Edition), 2007, No. 3, pp. 179-186 (8 pages)
This paper investigates the Web data aggregation issues in multidimensional on-line analytical processing (MOLAP) and presents a rule-driven aggregation approach. The core of the approach is defining aggregate rules. To define the rules for reading warehouse data and computing aggregates, a rule definition language, the array aggregation language (AAL), is developed. This language treats an array as a function from indexes to values and provides syntax and semantics based on monads. External functions can be called in aggregation rules to specify array reading, writing, and aggregating. Based on the features of AAL, array operations are unified as function operations, which can be easily expressed and automatically evaluated. To implement the aggregation approach, a processor for computing aggregates over the base cube and for materializing them in the data warehouse is built, and the component structure and working principle of the aggregation processor are introduced.
Keywords: Web data aggregation; aggregation language; MOLAP; aggregation processor
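The abstract above describes AAL as treating an array as a function from indexes to values, so that array operations become function operations. The following Python sketch illustrates only that general idea; it is not AAL, and every name in it (`from_cells`, `roll_up_last_axis`, `aggregate`, the sample `sales` cube) is hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Iterable, Tuple

Index = Tuple[int, ...]
Array = Callable[[Index], float]  # an array viewed as a function: index -> value


def from_cells(cells: Dict[Index, float], default: float = 0.0) -> Array:
    """Wrap stored cells as a function from indexes to values."""
    return lambda idx: cells.get(idx, default)


def roll_up_last_axis(arr: Array, axis_values: Iterable[int]) -> Array:
    """Return a new array-function with the last dimension summed out."""
    values = list(axis_values)
    return lambda idx: sum(arr(idx + (v,)) for v in values)


def aggregate(arr: Array, domain: Iterable[Index],
              combine: Callable[[float, float], float], init: float) -> float:
    """Fold a combining function over every index in the given domain."""
    acc = init
    for idx in domain:
        acc = combine(acc, arr(idx))
    return acc


# Hypothetical base cube indexed by (product, month).
sales = from_cells({(0, 0): 10.0, (0, 1): 5.0, (1, 0): 2.5})
by_product = roll_up_last_axis(sales, range(2))
total = aggregate(sales, product(range(2), range(2)), lambda a, b: a + b, 0.0)
```

Because the rolled-up result is itself a function, it can be aggregated again or materialized into a table by evaluating it over its index domain.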
A Shallow Parsing Approach to Natural Language Queries of a Database
5
Authors: Richard Skeggs, Stasha Lauria. Journal of Software Engineering and Applications, 2019, No. 9, pp. 365-382 (18 pages)
The performance and reliability of converting natural language into structured query language can be problematic in handling nuances that are prevalent in natural language. Relational databases are not designed to understand language nuance, so the question of why we must handle nuance has to be asked. This paper looks at an alternative solution for converting a natural language query into a Structured Query Language (SQL) statement that can be used to search a relational database. The process uses the natural language concept of part of speech to identify words that can be used to identify database tables and table columns. The use of Open NLP based grammar files, as well as additional configuration files, assists in the translation from natural language to query language. Once the tables and columns that contain the pertinent data have been identified, the next step is to create the SQL statement.
Keywords: NLIDB; natural language processing; database; query; data mining
Communication Mediated through Natural Language Generation in Big Data Environments: The Case of Nomao
6
Authors: Jean-Sébastien Vayre, Estelle Delpech, Aude Dufresne, Céline Lemercier. Journal of Computer and Communications, 2017, No. 6, pp. 125-148 (24 pages)
Along with the development of big data, various Natural Language Generation systems (NLGs) have recently been developed by different companies. The aim of this paper is to propose a better understanding of how these systems are designed and used. We propose to study in detail one of them, the NLGs developed by the company Nomao. First, we show that strong economic stakes underlie the development of this NLGs, since the business model of Nomao partly depends on it. Then, thanks to an eye movement analysis conducted with 28 participants, we show that the texts generated by Nomao’s NLGs contain syntactic and semantic structures that are easy to read but lack the socio-semantic coherence that would improve their understanding. From a scientific perspective, our research results highlight the importance of socio-semantic coherence in text-based communication produced by NLGs.
Keywords: big data; natural language generation; socio-semantic coherence; cognitive load; reading; eye tracking
Semantic Analysis of Natural Language Queries for an Object Oriented Database
7
Authors: Bentamar Hemerelain, Hafida Belbachir. Journal of Software Engineering and Applications, 2010, No. 11, pp. 1047-1053 (7 pages)
This paper presents the semantic analysis of queries written in natural language (French) and addressed to object oriented databases. The studied queries include one or two nominal groups (NG) articulated around a verb. An NG consists of one or several keywords (application-dependent nouns or values). Simple semantic filters are defined for identifying these keywords, which can have one of the following semantic values: class, simple attribute, composed attribute, key value or non-key value. Coherence rules and coherence constraints are introduced to check the validity of the co-occurrence of two consecutive nouns in a complex NG. If a query consists of a single NG, no further analysis is required. Otherwise, if a query covers two valid NGs, the semantic coherence of the verb and the two NGs attached to it is studied.
Keywords: query; nominal group; natural language; object oriented data base; semantic validation
Database Research Method for Researches on International Chinese Language Teaching in an Era of Big Data
8
Authors: SONG Fei, HAN Xiu-juan, XU Ming-hui. Journal of Literature and Art Studies, 2018, No. 1, pp. 124-136 (13 pages)
The database research method is a method that analyses, generalizes and deduces from the data of the subject under investigation using database techniques, quantitative statistics and mathematical models. As the big data age arrives with the data explosion in modern society, International Chinese Language Teaching (ICLT) shows signs of sizable data accumulation, remarkable economic property, strong modeling requirements and notable cross-research trends, which make this method necessary as a new and independent research method for research in this area. Theoretical bases, areas of application, available software and data resources, research program designs, and their advantages and disadvantages are set out in this paper. In the near future, the method will bring about a revolution in international Chinese language teaching.
Keywords: big data; International Chinese Language Teaching; database research method
On the Combination of “The Textual Research on Historical Documents” and “The Comparative Study of Historical Data” —— and a Discussion on “The Law of Quan-ma and Gui-mei” in Chinese Language Studies
9
Author: Lu Guoyao. Macrolinguistics (《宏观语言学》), 2007, No. 1, pp. 46-59 (14 pages)
In Chinese language studies, both “The Textual Research on Historical Documents” and “The Comparative Study of Historical Data” are traditional in methodology, and they both deserve to be treasured, passed on, and further developed. It will certainly do harm to the development of academic research if either of the two methods is given unreasonable priority. The author claims that the best, or one of the best, methodologies for the historical study of the Chinese language is the combination of the two, hence a new interpretation of “The Double-proof Method”. Meanwhile, this essay is also an attempt to put forward “The Law of Quan-ma and Gui-mei” in Chinese language studies, in which the author believes that it is not advisable to treat Gui-mei as Quan-ma or vice versa in linguistic research. It is crucial for us always to respect the language facts first, which is considered the very soul of linguistics.
Keywords: the history of Chinese language; methodology; The Textual Research on Historical Documents; The Comparative Study of Historical Data; Double-proof method; the Law of Quan-ma and Gui-mei
Research on Application of Data Mining in Virtual Community of Foreign Language Learning
10
Author: GAO Liuxin. International English Education Research, 2018, No. 1, pp. 7-9 (3 pages)
The construction of a virtual community for foreign language learning provides a comprehensive foreign language learning environment that integrates foreign language vocabulary database construction and vocabulary retrieval, and uses virtual reality technology to construct the language environment of foreign language learning. The virtual community of foreign language learning can improve the sense of language authenticity in foreign language learning and improve the quality of foreign language teaching. A method of building a virtual community for foreign language learning is proposed based on data mining technology. A data acquisition and feature preprocessing model for building the semantic vocabulary of foreign language learning is constructed, the linguistic environment characteristics of the semantic vocabulary data are analyzed, and the semantic noumenon structure model is obtained. A fuzzy clustering method is used for vocabulary clustering and comprehensive retrieval in the virtual community, which improves the performance of vocabulary classification in foreign language learning. An adaptive semantic information fusion method is used to realize vocabulary data mining in the virtual community, and information retrieval and access scheduling for the virtual community are realized based on the data mining results. The simulation results show that the accuracy of foreign language vocabulary retrieval is good and that the efficiency of foreign language learning is improved.
Keywords: data mining; foreign language learning; virtual community; language environment; fuzzy clustering
Classification of Big Data Security Based on Ontology Web Language
11
Authors: Alsadig Mohammed Adam Abdallah, Amir Mohamed Talib. Journal of Information Security, 2023, No. 1, pp. 76-91 (16 pages)
A vast amount of data (known as big data) may now be collected and stored from a variety of data sources, including event logs, the internet, smartphones, databases, sensors, cloud computing, and Internet of Things (IoT) devices. The term “big data security” refers to all the safeguards and instruments used to protect both the data and analytics processes against intrusions, theft, and other hostile actions that could endanger or adversely influence them. Beyond Big Data being a high-value and desirable target, protecting it poses particular difficulties. Big Data security does not fundamentally differ from conventional data security. Big Data security issues are caused by extraneous distinctions rather than fundamental ones. This study outlines the numerous security difficulties that Big Data analytics now faces and encourages additional joint research on reducing big data security challenges using Ontology Web Language (OWL). Although we focus on the security challenges of Big Data in this essay, we also briefly cover the broader challenges of Big Data. The proposed classification of Big Data security based on Ontology Web Language, produced with the Protégé software, has 32 classes and 45 subclasses.
Keywords: big data; big data security; information security; data security; Ontology Web Language; Protégé
Sign language synthesis of individuation based on data model
12
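Authors: JIANG Feng, YAO Hong-xun, WANG Xiao-yu. Journal of Communication and Computer (《通讯和计算机(中英文版)》), 2008, No. 11, pp. 32-36 (5 pages)
Keywords: signal; communication technology; computational methods; language analysis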
On the teaching content of Language and Culture
13
Author: LI Guang-mei. Sino-US English Teaching, 2008, No. 5, pp. 1-10 (10 pages)
To cultivate English majors' cultural awareness and improve their integrated English competence, this paper clarifies the reasons for teaching Language and Culture and the choice of its content. The results of the teaching and of the investigation conducted by the author show that the choice of the teaching materials and the topics for the class must be included in the teaching plan. In addition, this paper probes into the most difficult topic, the easiest topic, the topics that need more explanation and the topics that need less explanation in class; the similarities and differences of the teaching contents for students in different grades are also analyzed. Finally, suitable sources for comparing English and Chinese languages and cultures are identified in order to enlarge students' knowledge and improve their competence in using English.
Keywords: Language and Culture; materials; content; topics; data analyses
Application of Recursive Query on Structured Query Language Server
14
Authors: 荀雪莲, ABHIJIT Sen, 姚志强. Journal of Donghua University (English Edition) (CAS), 2023, No. 1, pp. 68-73 (6 pages)
The advantage of recursive programming is that it is very easy to write and requires very few lines of code if done correctly. Structured query language (SQL) is a database language used to manipulate data. In Microsoft SQL Server 2000, recursive queries are implemented to retrieve data presented in a hierarchical format, but this approach has its disadvantages. The common table expression (CTE) construct introduced in Microsoft SQL Server 2005 provides the significant advantage of being able to reference itself to create a recursive CTE. Hierarchical data structures, organizational charts and other parent-child table relationship reports can easily benefit from the use of recursive CTEs. The recursive query is illustrated and implemented on some simple hierarchical data. In addition, one business case study is presented and the solution using a recursive query based on a CTE is shown. At the same time, stored procedures are programmed to do the recursion in SQL. Test results show that recursive queries based on CTEs make it possible to create much more complex queries while retaining a much simpler syntax.
Keywords: structured query language (SQL) server; common table expression (CTE); recursive query; stored procedure; hierarchical data
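The abstract above describes a CTE that references itself to walk parent-child data. As a rough illustration of that technique only, the sketch below runs a recursive CTE through Python's standard `sqlite3` module, using SQLite as a stand-in for Microsoft SQL Server (SQL Server writes `WITH` without the `RECURSIVE` keyword). The `employee` table, its columns, and the sample rows are hypothetical and are not taken from the paper.

```python
import sqlite3


def org_chart(rows, root_id):
    """Return (name, depth) for root_id and everyone reporting up to it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER)"
    )
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)
    query = """
        WITH RECURSIVE chain(id, name, depth) AS (
            -- anchor member: the starting row
            SELECT id, name, 0 FROM employee WHERE id = ?
            UNION ALL
            -- recursive member: rows whose manager is already in the chain
            SELECT e.id, e.name, c.depth + 1
            FROM employee AS e
            JOIN chain AS c ON e.manager_id = c.id
        )
        SELECT name, depth FROM chain ORDER BY depth, id
    """
    result = conn.execute(query, (root_id,)).fetchall()
    conn.close()
    return result


# Hypothetical parent-child data: (id, name, manager_id).
ROWS = [(1, "Ana", None), (2, "Bo", 1), (3, "Cy", 1), (4, "Di", 2)]
hierarchy = org_chart(ROWS, 1)
```

The anchor member selects the starting row, and the recursive member repeatedly joins the table to the rows found so far, which is what lets a single query replace a hand-written loop or stored procedure for this kind of hierarchy.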
Terrorism Attack Classification Using Machine Learning: The Effectiveness of Using Textual Features Extracted from GTD Dataset
15
Authors: Mohammed Abdalsalam, Chunlin Li, Abdelghani Dahou, Natalia Kryvinska. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 2, pp. 1427-1467 (41 pages)
One of the biggest dangers to society today is terrorism, where attacks have become one of the most significant risks to international peace and national security. Big data, information analysis, and artificial intelligence (AI) have become the basis for making strategic decisions in many sensitive areas, such as fraud detection, risk management, medical diagnosis, and counter-terrorism. However, there is still a need to assess how terrorist attacks are related, initiated, and detected. For this purpose, we propose a novel framework for classifying and predicting terrorist attacks. The proposed framework posits that neglected text attributes included in the Global Terrorism Database (GTD) can influence the accuracy of the model’s classification of terrorist attacks, where each part of the data can provide vital information to enrich the ability of classifier learning. Each data point in a multiclass taxonomy has one or more tags attached to it, referred to as “related tags.” We applied machine learning classifiers to classify terrorist attack incidents obtained from the GTD. A transformer-based technique called DistilBERT extracts and learns contextual features from text attributes to acquire more information from text data. The extracted contextual features are combined with the “key features” of the dataset and used to perform the final classification. The study explored different experimental setups with various classifiers to evaluate the model’s performance. The experimental results show that the proposed framework outperforms the latest techniques for classifying terrorist attacks with an accuracy of 98.7% using a combined feature set and extreme gradient boosting classifier.
Keywords: artificial intelligence; machine learning; natural language processing; data analytic; DistilBERT; feature extraction; terrorism classification; GTD dataset
Comparison of R and Excel in the Field of Data Analysis
16
Author: Jue Wang. Journal of Electronic Research and Application, 2024, No. 3, pp. 178-184 (7 pages)
This research paper compares Excel and the R language for data analysis and concludes that R is more suitable for complex data analysis tasks. R's open-source nature makes it accessible to everyone, and its powerful data management and analysis tools make it suitable for handling complex data analysis tasks. It is also highly customizable, allowing users to create custom functions and packages to meet their specific needs. Additionally, R provides high reproducibility, making it easy to replicate and verify research results, and it has excellent collaboration capabilities, enabling multiple users to work on the same project simultaneously. These advantages make R a more suitable choice for complex data analysis tasks, particularly in scientific research and business applications. The findings of this study will help people understand that R is not just a language that can handle more data than Excel and demonstrate that R is essential to the field of data analysis. At the same time, they will help users and organizations make informed decisions regarding their data analysis needs and software preferences.
Keywords: Excel; R language; data analysis; open source; compare; data management; advantages; disadvantages; function
Semantic-based query processing for relational data integration (cited by: 1)
17
Authors: 苗壮, 张亚非, 王进鹏, 陆建江, 周波. Journal of Southeast University (English Edition) (EI, CAS), 2011, No. 1, pp. 22-25 (4 pages)
To solve the query processing correctness problem for semantic-based relational data integration, the semantics of SPARQL (simple protocol and RDF query language) queries is defined. In the course of query rewriting, all relevant tables are found and decomposed into minimal connectable units. Minimal connectable units are joined according to semantic queries to produce semantically correct query plans. Algorithms for query rewriting and transforming are presented, and their computational complexity is discussed. In the worst case, the query decomposing algorithm finishes in O(n^2) time and the query rewriting algorithm requires O(nm) time. The performance of the algorithms is verified by experiments, and the experimental results show that when the length of the query is less than 8, the query processing algorithms provide satisfactory performance.
Keywords: data integration; relational database; simple protocol and RDF query language (SPARQL); minimal connectable unit; query processing
FAIR Enough: Develop and Assess a FAIR-Compliant Dataset for Large Language Model Training?
18
Authors: Shaina Raza, Shardul Ghuge, Chen Ding, Elham Dolatabadi, Deval Pandya. Data Intelligence (EI), 2024, No. 2, pp. 559-585 (27 pages)
The rapid evolution of Large Language Models (LLMs) highlights the necessity for ethical considerations and data integrity in AI development, particularly emphasizing the role of FAIR (Findable, Accessible, Interoperable, Reusable) data principles. While these principles are crucial for ethical data stewardship, their specific application in the context of LLM training data remains an under-explored area. This research gap is the focus of our study, which begins with an examination of existing literature to underline the importance of FAIR principles in managing data for LLM training. Building upon this, we propose a novel framework designed to integrate FAIR principles into the LLM development lifecycle. A contribution of our work is the development of a comprehensive checklist intended to guide researchers and developers in applying FAIR data principles consistently across the model development process. The utility and effectiveness of our framework are validated through a case study on creating a FAIR-compliant dataset aimed at detecting and mitigating biases in LLMs. We present this framework to the community as a tool to foster the creation of technologically advanced, ethically grounded, and socially responsible AI models.
Keywords: responsible AI; large language models; FAIR data principles; ethical AI; biases
On the Use of E-Learning Software Data--with Speexx Foreign Language Learning System Being the Case
19
Author: 白秀敏. Overseas English (《海外英语》), 2021, No. 8, pp. 261-262, 264 (3 pages)
E-learning produces data on the learners’ utilization of the software, which helps the teacher to perceive the learners’ mental status and learning efficiency, so it is of great value to make full use of the data. Taking the Speexx foreign language learning system as a case, this paper introduces the function of such data and the ways of using them to facilitate blended teaching and learning.
Keywords: e-learning; software data; Speexx foreign language learning system; function
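Functions and Applications of JIG-DATA, a Finite Element Data File Processing Program
20
Authors: 蒲军平, 张石峰. Journal of Xinjiang Institute of Technology (《新疆工学院学报》), 1997, No. 2, pp. 142-144 (3 pages)
Using the WATFOR77 integrated editing and compiling environment, the authors developed JIG-DATA, a program that links the finite element preprocessing program MESHG with the main program system JIGFEX. The software automatically generates the data file DATA used by JIGFEX for its computations; the process is highly automated and markedly efficient.
Keywords: data file processing; JIGFEX; software; finite element; JIG-DATA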