Abstract
With the development of Internet technology, many applications now offer quantitative finance services to the general public. Most users lack professional knowledge of finance or computer science and expect to query data in natural language, so natural-language-to-SQL (NL2SQL) conversion is urgently needed. To address this problem, a Chinese financial NL2SQL algorithm based on a bidirectional long short-term memory model (BiLSTM) is proposed, consisting of an encoding stage and a decoding stage. In the encoding stage, BiLSTM and an attention mechanism generate feature vectors. In the decoding stage, SQL generation is decoupled into nine classification tasks according to SQL syntax rules; the tasks depend on one another and are learned jointly, after which complex SQL statements are generated. In addition to the model, a word-vector library containing financial vocabulary is trained and a financial-domain dataset is constructed. Experiments on this dataset show that the method achieves higher accuracy and effectively solves SQL generation in the financial domain; it has been implemented in a financial quantitative analysis system.
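The decoding idea in the abstract, treating SQL generation as a set of classification slots that are then assembled into a statement, can be illustrated with a minimal sketch. The abstract does not enumerate the nine tasks, so the slot names, operator vocabularies, table, and column names below are hypothetical and are not taken from the paper; the classifier outputs are assumed to be already available as integer indices.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical label vocabularies; the paper's actual label sets are not given.
AGG_OPS = ["", "AVG", "MAX", "MIN", "COUNT", "SUM"]
COND_OPS = [">", "<", "=", "!="]
CONN_OPS = ["", "AND", "OR"]


@dataclass
class SlotPredictions:
    """Outputs of the per-slot classifiers (assumed, illustrative slots)."""
    select_cols: List[int]   # indices of the selected columns
    select_aggs: List[int]   # aggregate index per selected column
    conn_op: int             # connector between WHERE conditions
    # each condition is (column index, operator index, value text)
    conditions: List[Tuple[int, int, str]] = field(default_factory=list)


def assemble_sql(table: str, columns: List[str], pred: SlotPredictions) -> str:
    """Fill a fixed SQL template from the predicted slot values."""
    select_parts = []
    for col, agg in zip(pred.select_cols, pred.select_aggs):
        name = columns[col]
        select_parts.append(f"{AGG_OPS[agg]}({name})" if AGG_OPS[agg] else name)
    sql = f"SELECT {', '.join(select_parts)} FROM {table}"
    if pred.conditions:
        # Fall back to AND when the connector slot predicts "none".
        joiner = f" {CONN_OPS[pred.conn_op] or 'AND'} "
        # Values are inlined for readability only; a real system should
        # use parameterized queries instead of string interpolation.
        where = joiner.join(
            f"{columns[c]} {COND_OPS[op]} '{value}'"
            for c, op, value in pred.conditions
        )
        sql += f" WHERE {where}"
    return sql
```

Because every slot is a small classification problem, the output always conforms to the template, which is the usual motivation for this kind of decoupled design over free-form sequence generation.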
Authors
Tai Weipeng (邰伟鹏)
Liu Yang (刘杨)
Wang Xiaolin (王小林)
Zheng Xiao (郑啸)
Zhong Liang (钟亮)
Tai Weipeng; Liu Yang; Wang Xiaolin; Zheng Xiao; Zhong Liang (School of Computer Science and Technology, Anhui University of Technology, Maanshan 243000, Anhui, China; Institute of Information Technology, Anhui University of Technology, Maanshan 243000, Anhui, China; Shanghai Measuring Mirror Space Information Technology Co., Ltd., Shanghai 200000, China)
Source
Computer Applications and Software (《计算机应用与软件》)
Peking University Core Journal (北大核心)
2024, Issue 3, pp. 34-40 (7 pages)
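Funding
Major Natural Science Research Project of Anhui Higher Education Institutions (KJ2019ZD09)
Anhui Provincial Key Research and Development Program (202004a07020028)
Anhui Provincial University Collaborative Innovation Project (Grant No. GXXT-2019-025).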