Big data analytics in business intelligence do not provide effective data retrieval methods and job scheduling that will cause execution inefficiency and low system throughput.This paper aims to enhance the capability...Big data analytics in business intelligence do not provide effective data retrieval methods and job scheduling that will cause execution inefficiency and low system throughput.This paper aims to enhance the capability of data retrieval and job scheduling to speed up the operation of big data analytics to overcome inefficiency and low throughput problems.First,integrating stacked sparse autoencoder and Elasticsearch indexing explored fast data searching and distributed indexing,which reduces the search scope of the database and dramatically speeds up data searching.Next,exploiting a deep neural network to predict the approximate execution time of a job gives prioritized job scheduling based on the shortest job first,which reduces the average waiting time of job execution.As a result,the proposed data retrieval approach outperforms the previous method using a deep autoencoder and Solr indexing,significantly improving the speed of data retrieval up to 53%and increasing system throughput by 53%.On the other hand,the proposed job scheduling algorithmdefeats both first-in-first-out andmemory-sensitive heterogeneous early finish time scheduling algorithms,effectively shortening the average waiting time up to 5%and average weighted turnaround time by 19%,respectively.展开更多
Our previous work has introduced the newly generated program using the code transformation model GPT-2,verifying the generated programming codes through simhash(SH)and longest common subsequence(LCS)algo-rithms.Howeve...Our previous work has introduced the newly generated program using the code transformation model GPT-2,verifying the generated programming codes through simhash(SH)and longest common subsequence(LCS)algo-rithms.However,the entire code transformation process has encountered a time-consuming problem.Therefore,the objective of this study is to speed up the code transformation process signicantly.This paper has proposed deep learning approaches for modifying SH using a variational simhash(VSH)algorithm and replacing LCS with a piecewise longest common subsequence(PLCS)algorithm to faster the verication process in the test phase.Besides the code transformation model GPT-2,this study has also introduced MicrosoMASS and Facebook BART for a comparative analysis of their performance.Meanwhile,the explainable AI technique using local interpretable model-agnostic explanations(LIME)can also interpret the decision-making ofAImodels.The experimental results show that VSH can reduce the number of qualied programs by 22.11%,and PLCS can reduce the execution time of selected pocket programs by 32.39%.As a result,the proposed approaches can signicantly speed up the entire code transformation process by 1.38 times on average compared with our previous work.展开更多
This paper introduces a novel transform method to produce the newly generated programs through code transform model called the second generation of Generative Pre-trained Transformer(GPT-2)reasonably,improving the pro...This paper introduces a novel transform method to produce the newly generated programs through code transform model called the second generation of Generative Pre-trained Transformer(GPT-2)reasonably,improving the program execution performance significantly.Besides,a theoretical estimation in statistics has given the minimum number of generated programs as required,which guarantees to find the best one within them.The proposed approach can help the voice assistant machine resolve the problem of inefficient execution of application code.In addition to GPT-2,this study develops the variational Simhash algorithm to check the code similarity between sample program and newly generated program,and conceives the piecewise longest common subsequence algorithm to examine the execution’s conformity from the two programs mentioned above.The code similarity check deducts the redundant generated programs,and the output conformity check finds the best-performing generative program.In addition to texts,the proposed approach can also prove the other media,including images,sounds,and movies.As a result,the newly generated program outperforms the sample program significantly because the number of code lines reduces 27.21%,and the program execution time shortens 24.62%.展开更多
基金supported and granted by the Ministry of Science and Technology,Taiwan(MOST110-2622-E-390-001 and MOST109-2622-E-390-002-CC3).
文摘Big data analytics in business intelligence do not provide effective data retrieval methods and job scheduling that will cause execution inefficiency and low system throughput.This paper aims to enhance the capability of data retrieval and job scheduling to speed up the operation of big data analytics to overcome inefficiency and low throughput problems.First,integrating stacked sparse autoencoder and Elasticsearch indexing explored fast data searching and distributed indexing,which reduces the search scope of the database and dramatically speeds up data searching.Next,exploiting a deep neural network to predict the approximate execution time of a job gives prioritized job scheduling based on the shortest job first,which reduces the average waiting time of job execution.As a result,the proposed data retrieval approach outperforms the previous method using a deep autoencoder and Solr indexing,significantly improving the speed of data retrieval up to 53%and increasing system throughput by 53%.On the other hand,the proposed job scheduling algorithmdefeats both first-in-first-out andmemory-sensitive heterogeneous early finish time scheduling algorithms,effectively shortening the average waiting time up to 5%and average weighted turnaround time by 19%,respectively.
基金supported by the Ministry of Science and Technology,Taiwan,under Grant Nos.MOST 111-2221-E-390-012 and MOST 111-2622-E-390-001.
文摘Our previous work has introduced the newly generated program using the code transformation model GPT-2,verifying the generated programming codes through simhash(SH)and longest common subsequence(LCS)algo-rithms.However,the entire code transformation process has encountered a time-consuming problem.Therefore,the objective of this study is to speed up the code transformation process signicantly.This paper has proposed deep learning approaches for modifying SH using a variational simhash(VSH)algorithm and replacing LCS with a piecewise longest common subsequence(PLCS)algorithm to faster the verication process in the test phase.Besides the code transformation model GPT-2,this study has also introduced MicrosoMASS and Facebook BART for a comparative analysis of their performance.Meanwhile,the explainable AI technique using local interpretable model-agnostic explanations(LIME)can also interpret the decision-making ofAImodels.The experimental results show that VSH can reduce the number of qualied programs by 22.11%,and PLCS can reduce the execution time of selected pocket programs by 32.39%.As a result,the proposed approaches can signicantly speed up the entire code transformation process by 1.38 times on average compared with our previous work.
基金This work is fully supported by the Ministry of Science and Technology,Taiwan,Republic of China,under Grant Nos.MOST 110-2622-E-390-001 and MOST 109-2622-E-390-002-CC3.
文摘This paper introduces a novel transform method to produce the newly generated programs through code transform model called the second generation of Generative Pre-trained Transformer(GPT-2)reasonably,improving the program execution performance significantly.Besides,a theoretical estimation in statistics has given the minimum number of generated programs as required,which guarantees to find the best one within them.The proposed approach can help the voice assistant machine resolve the problem of inefficient execution of application code.In addition to GPT-2,this study develops the variational Simhash algorithm to check the code similarity between sample program and newly generated program,and conceives the piecewise longest common subsequence algorithm to examine the execution’s conformity from the two programs mentioned above.The code similarity check deducts the redundant generated programs,and the output conformity check finds the best-performing generative program.In addition to texts,the proposed approach can also prove the other media,including images,sounds,and movies.As a result,the newly generated program outperforms the sample program significantly because the number of code lines reduces 27.21%,and the program execution time shortens 24.62%.