Abstract: With the continuous evolution and expanding applications of Large Language Models (LLMs), there has been a noticeable surge in the size of emerging models. Beyond the growth in model size, measured primarily by the number of parameters, comes an escalation in computational demands and in the hardware and software prerequisites for training, all culminating in substantial financial investment. In this paper, we present techniques such as supervision, parallelization, and scoring functions to get better results out of chains of smaller language models, rather than relying solely on scaling up model size. Firstly, we propose an approach to quantify the performance of a Smaller Language Model (SLM) by introducing a corresponding supervisor model that incrementally corrects the errors it encounters. Secondly, we propose an approach in which two smaller language models in a network perform the same task and the more relevant of the two outputs is retrieved, ensuring peak performance on a specific task. Experimental evaluations establish quantitative accuracy improvements on financial reasoning and arithmetic calculation tasks from techniques such as supervisor models (in a network-of-models scenario), threshold scoring, and parallel processing over a baseline study.
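The parallel, best-of-two idea above can be illustrated with a minimal sketch (not the paper's implementation): two small models answer the same prompt concurrently, a scoring function rates each answer, and the best one is kept only if it clears a threshold. The term-overlap `score` and the `reference_terms` parameter are illustrative assumptions.

```python
# Minimal sketch: query two small models in parallel, score both candidate
# answers, and keep the better one if it clears a quality threshold.
from concurrent.futures import ThreadPoolExecutor

def score(answer: str, reference_terms: set) -> float:
    """Hypothetical scoring function: fraction of expected terms present."""
    tokens = set(answer.lower().split())
    return len(tokens & reference_terms) / max(len(reference_terms), 1)

def best_of_two(model_a, model_b, prompt: str,
                reference_terms: set, threshold: float = 0.5):
    """Run both models concurrently and return the higher-scoring output,
    or None if neither clears the threshold (signalling a supervisor step)."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        answers = list(pool.map(lambda m: m(prompt), (model_a, model_b)))
    best = max(answers, key=lambda a: score(a, reference_terms))
    return best if score(best, reference_terms) >= threshold else None
```

Returning `None` below the threshold is one natural place to hand the task to a supervisor model for correction, matching the paper's incremental-correction idea.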
Abstract: To address the shortage of external information and the forgetting of transmitted information caused by the high entity density and long, complex sentence patterns of causal sentences, a complex causal relation extraction model, PE-BiGAT (Prompt Enhancement and Bi-Graph Attention Network), is proposed. First, the effect entities in a sentence are extracted and combined with a prompt-learning template to form prompt information, which is then enhanced through an external knowledge base. Second, the prompt information is fed into the BiGAT, which combines an attention layer with syntactic and semantic dependency graphs and uses a biaffine attention mechanism to alleviate feature overlap, strengthening the model's perception of relational features. Finally, a classifier iteratively predicts all causal entities in the sentence, and a scoring function analyses all causal pairs within it. Experimental results on the SemEval-2010 Task 8 and AltLex datasets show that, compared with RPA-GCN (Relationship Position and Attention-Graph Convolutional Network), the proposed model improves the F1 score by 1.65 percentage points, with gains of 2.16 and 4.77 percentage points on chain-causal and multi-causal sentences respectively, verifying its advantage in handling complex causal sentences.
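The biaffine attention mechanism mentioned above scores a candidate cause-effect pair from the two entity representations. A minimal numpy sketch of the standard biaffine form follows; the parameters `W`, `U`, and `b` are illustrative stand-ins, not the paper's trained weights.

```python
# Biaffine score between two entity vectors:
#   score = h_cause^T W h_effect + U^T [h_cause; h_effect] + b
import numpy as np

def biaffine_score(h_cause: np.ndarray, h_effect: np.ndarray,
                   W: np.ndarray, U: np.ndarray, b: float) -> float:
    bilinear = float(h_cause @ W @ h_effect)           # pairwise interaction term
    linear = float(U @ np.concatenate([h_cause, h_effect]))  # per-entity term
    return bilinear + linear + b
```

The bilinear term captures interactions between the two representations, which is what lets the scorer separate overlapping features of cause and effect mentions.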
Funding: supported by the National Natural Science Foundation of China (60974082, 61075055) and the Fundamental Research Funds for the Central Universities (K50510700004)
Abstract: Learning Bayesian network (BN) structure from data is an NP-hard problem and remains one of the most exciting challenges in machine learning. In this work, a novel algorithm is presented that combines ideas from local-learning, constraint-based, and search-and-score techniques in a principled and effective way. It first reconstructs the junction tree of a BN and then performs a K2-scoring greedy search to orient the local edges within the cliques of the junction tree. Theoretical and experimental results show that the proposed algorithm is capable of handling networks with a large number of variables. A comparison with the well-known K2 algorithm is also presented.
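The K2 score driving the greedy search has a closed form (Cooper and Herskovits): for a node with r states, the score over parent configurations j is the product of (r-1)!/(N_ij + r - 1)! times the product over states k of N_ijk!. A minimal log-space sketch for one node, assuming discrete variables and data given as rows of state indices:

```python
# Log K2 score of one node given a candidate parent set:
#   log prod_j [ (r-1)! / (N_ij + r - 1)! * prod_k N_ijk! ]
# computed via lgamma to avoid overflow.
from collections import Counter
from math import lgamma

def log_k2_score(data, node, parents, r):
    # N_ijk: count of (parent configuration j, node state k)
    counts = Counter((tuple(row[p] for p in parents), row[node]) for row in data)
    # N_ij: total count per parent configuration
    parent_totals = Counter()
    for (pcfg, _), n in counts.items():
        parent_totals[pcfg] += n
    score = 0.0
    for n_ij in parent_totals.values():
        score += lgamma(r) - lgamma(n_ij + r)     # log (r-1)! - log (N_ij+r-1)!
    for n_ijk in counts.values():
        score += lgamma(n_ijk + 1)                # log N_ijk!
    return score
```

A K2-style greedy search then adds, for each node, the parent whose inclusion most increases this score, stopping when no addition helps; here it would be applied only within the cliques of the reconstructed junction tree.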
Abstract: The LeNet-5 convolutional neural network achieves good recognition results on handwritten-digit datasets, but its recognition rate on facial expressions is low. This work improves LeNet-5 with a shallow convolutional structure that passes successively through 1×1 and 3×3 convolutional layers; after each convolution, Z-score normalisation is applied, and the better-performing ReLU activation function is used, which computes quickly and reduces the vanishing-gradient problem. The output layer uses a softmax function to produce the probability of each expression class. Simulation results show that on the JAFFE expression database, even with a small training set, the algorithm reaches a recognition rate of 79.81%, and recognising a single facial-expression image takes 0.353 s on average.
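The three ingredients the modified LeNet-5 adds after each convolution can be sketched as plain numpy functions; these are generic stand-ins for the operations named above, not the network itself.

```python
# Building blocks of the modified LeNet-5: Z-score normalisation after each
# convolution, ReLU activation, and a softmax output layer.
import numpy as np

def z_score(x, eps=1e-8):
    """Standardise activations to zero mean and unit variance."""
    return (x - x.mean()) / (x.std() + eps)

def relu(x):
    """max(0, x): cheap to compute and mitigates vanishing gradients."""
    return np.maximum(x, 0.0)

def softmax(logits):
    """Class probabilities for the output layer."""
    e = np.exp(logits - logits.max())   # shift by max for numerical stability
    return e / e.sum()
```

Z-score normalisation keeps activations in a well-scaled range between layers, which is what lets the shallow 1×1/3×3 stack train stably on a small dataset like JAFFE.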
Abstract: Principal component analysis is used to extract the personalised coefficients of the head-related transfer function (HRTF); the Laplacian scores of the anthropometric parameters affecting the HRTF are computed and combined with Pearson correlation coefficients to identify the key anthropometric parameters with a significant influence on the HRTF. A radial basis function (RBF) neural network is then constructed to learn the nonlinear mapping from the key anthropometric parameters to the personalised HRTF coefficients, so that a subject's personalised HRTF can be estimated from simple anthropometric measurements. Simulation experiments are compared against the partial least squares regression (PLSR) method.
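An RBF network of the kind described learns a nonlinear map by placing Gaussian basis functions at fixed centres and fitting only the output weights by least squares. A minimal sketch follows, assuming hypothetical inputs (key anthropometric parameters) and outputs (HRTF principal-component coefficients); the centre placement and width are illustrative choices, not the paper's.

```python
# Minimal Gaussian RBF network: fixed centres, output weights fitted by
# linear least squares, mapping body parameters -> HRTF PCA coefficients.
import numpy as np

class RBFNet:
    def __init__(self, centres: np.ndarray, width: float):
        self.centres, self.width = centres, width

    def _phi(self, X):
        # Pairwise squared distances to the centres, then Gaussian response.
        d2 = ((X[:, None, :] - self.centres[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * self.width ** 2))

    def fit(self, X, Y):
        # Hidden layer is fixed, so training reduces to one linear solve.
        self.W, *_ = np.linalg.lstsq(self._phi(X), Y, rcond=None)
        return self

    def predict(self, X):
        return self._phi(X) @ self.W
```

Because only the output layer is trained, fitting is a single least-squares solve, which suits the small per-subject datasets typical of anthropometric measurement.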
Abstract: To address the high Type I error rate when RBF (Radial Basis Function) neural networks are applied to credit scoring, an RBF neural network classification method based on the Linex loss is proposed, and test results on the German credit-scoring dataset from the UCI (University of California Irvine) repository are given. Experimental results show that the method effectively resolves the problems of traditional RBF neural network credit scoring.
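The Linex loss underlying this method is the standard asymmetric form L(d) = b(exp(a·d) − a·d − 1): for a > 0 it penalises errors in one direction exponentially and in the other only linearly, which is how training can be made to punish the costlier Type I misclassifications harder. A one-function sketch:

```python
# Linex loss: L(d) = b * (exp(a*d) - a*d - 1), with a != 0 and b > 0.
# For a > 0, positive errors d are penalised exponentially while negative
# errors grow roughly linearly, giving the asymmetry credit scoring needs.
import math

def linex(error: float, a: float = 1.0, b: float = 1.0) -> float:
    return b * (math.exp(a * error) - a * error - 1.0)
```

Replacing a symmetric squared-error criterion with this loss shifts the trained classifier's decision boundary away from the expensive error direction, at the cost of more Type II errors.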