Abstract: Stock index forecasting is regarded as a challenging task in financial time-series prediction. In this paper, the non-linear support vector regression (SVR) method was optimized for stock index prediction. The parameters (C, σ) of the SVR models were selected by three different methods: grid search (GRID), particle swarm optimization (PSO), and genetic algorithm (GA). The optimized parameters were used to predict the opening price of the test samples. The predictive results showed that the SVR model with GRID (GRID-SVR), the SVR model with PSO (PSO-SVR), and the SVR model with GA (GA-SVR) were able to fully capture the time-dependent trend of the stock index and achieved notable prediction accuracy. The minimum root mean square error (RMSE) of the GA-SVR model was 15.630, the minimum mean absolute percentage error (MAPE) was 0.39%, and the corresponding optimal parameters (C, σ) were identified as (45.422, 0.012). These modeling results provide a theoretical and technical reference for investors in devising better trading strategies.
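The two error measures and the grid-search step named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `rmse`, `mape`, and `grid_search` and the toy objective are assumptions, and in practice the objective would be the validation error of an SVR trained with each (C, σ) pair.

```python
import math
from itertools import product


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


def grid_search(objective, c_values, sigma_values):
    """Return the (C, sigma) pair from the grid that minimizes objective(C, sigma)."""
    return min(product(c_values, sigma_values), key=lambda pair: objective(*pair))


# Hypothetical stand-in for "validation error of an SVR trained with (C, sigma)".
def toy_objective(c, sigma):
    return (math.log10(c) - 1.0) ** 2 + (math.log10(sigma) + 2.0) ** 2


best_c, best_sigma = grid_search(toy_objective, [1, 10, 100], [0.001, 0.01, 0.1])
```

PSO and GA would replace only `grid_search`: they explore the same (C, σ) space but propose candidate pairs adaptively instead of enumerating a fixed grid.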
Funding: This work is supported by the National Natural Science Foundation of China under grants U1836110, U1836208, U1536206, 61602253, and 61672294; by the National Key R&D Program of China under grant 2018YFB1003205; by the China Postdoctoral Science Foundation (2017M610574); by the Jiangsu Basic Research Programs-Natural Science Foundation under grant number BK20181407; by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund; by the Major Program of the National Social Science Fund of China (17ZDA092); by the Qing Lan Project; and by the Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET) fund, China.
Abstract: Cloud computing is used more and more widely, and more and more people prefer to store their data on cloud servers, so encrypting the data efficiently is an important problem. The search efficiency of existing search schemes decreases as the index grows. To solve this problem, we build a two-level index. In addition, to enrich the semantic information, central word expansion is incorporated. The privacy-preserving content-aware search scheme based on the two-level index (CKESS) first performs matching using the extended central words, then calculates the similarity between the trapdoor and the secondary index, and finally returns the results in order. Experiments and analysis show that the proposed schemes can resist multiple threat models and are secure and efficient.
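The two-stage lookup described in this abstract (a first match on central words, then a similarity ranking against a secondary index) can be sketched in plaintext as below. This is only an illustration of the control flow under loud assumptions: it omits all encryption and trapdoor generation, the inner-product similarity is an assumed choice, and the names `build_two_level_index` and `search` are hypothetical rather than taken from the paper.

```python
def build_two_level_index(documents):
    """documents: {doc_id: (central_words, feature_vector)}."""
    first_level = {}   # central word -> set of doc ids
    second_level = {}  # doc id -> feature vector
    for doc_id, (central_words, vector) in documents.items():
        second_level[doc_id] = vector
        for word in central_words:
            first_level.setdefault(word, set()).add(doc_id)
    return first_level, second_level


def search(first_level, second_level, query_words, query_vector, top_k):
    # Stage 1: keep only documents sharing at least one central word with the query.
    candidates = set()
    for word in query_words:
        candidates |= first_level.get(word, set())

    # Stage 2: rank the surviving candidates by inner-product similarity
    # (assumed similarity measure; unencrypted here).
    def similarity(doc_id):
        return sum(q * d for q, d in zip(query_vector, second_level[doc_id]))

    ranked = sorted(candidates, key=lambda d: (-similarity(d), d))
    return ranked[:top_k]
```

The point of the first stage is that the similarity computation only runs over the candidate set, so its cost no longer scales with the full index.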
Abstract: In this paper, we propose a new index-based method to realize IR-style Chinese keyword search with ranking strategies in relational databases. The method creates an index from the related information of tuple words and presents a ranking strategy based on the nature of Chinese words. For a Chinese keyword query, the index is used to quickly match the query's search words against the tuple words in the index and to compute similarities between the query and tuples according to the ranking strategy; the set of identifiers of candidate tuples is then generated. We then retrieve the top-N results of the query using SQL selection statements and output the ranked answers according to the similarities. The experimental results show that our method is efficient and effective.
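The pipeline in this abstract (match query words against an index of tuple words, score candidate tuples, then fetch the top-N with a SQL selection) might look roughly like the sketch below. Everything specific here is an assumption for illustration: the `papers` table, whitespace-separated pre-segmented words, and a score equal to the number of matched query words, which stands in for the paper's ranking strategy.

```python
import sqlite3
from collections import defaultdict


def build_index(rows):
    """rows: iterable of (tuple_id, words). Returns word -> set of tuple ids."""
    index = defaultdict(set)
    for tuple_id, words in rows:
        for word in words:
            index[word].add(tuple_id)
    return index


def rank_candidates(index, query_words, n):
    """Score each candidate tuple by the number of query words it contains."""
    scores = defaultdict(int)
    for word in query_words:
        for tuple_id in index.get(word, ()):
            scores[tuple_id] += 1
    return sorted(scores, key=lambda t: (-scores[t], t))[:n]


def fetch_ranked(conn, ids):
    """Retrieve the candidate tuples with a SQL selection, preserving rank order."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    found = dict(conn.execute(
        f"SELECT id, title FROM papers WHERE id IN ({placeholders})", ids))
    return [(i, found[i]) for i in ids]


# Hypothetical data: words are already segmented and separated by spaces.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE papers (id INTEGER PRIMARY KEY, title TEXT)")
data = [(1, "数据库 索引"), (2, "索引 排序"), (3, "数据库 索引 排序")]
conn.executemany("INSERT INTO papers VALUES (?, ?)", data)
index = build_index((i, title.split()) for i, title in data)
top = fetch_ranked(conn, rank_candidates(index, ["数据库", "排序"], 2))
```

Only the identifiers of the top-N candidates reach the database, so the SQL statement stays a simple selection by key.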
Funding: Project (No. [2005]555) supported by the Hi-Tech Research and Development Program (863) of China
Abstract: Various index structures have recently been proposed to facilitate high-dimensional KNN queries, among which the techniques of approximate vector representation and one-dimensional (1D) transformation can break the curse of dimensionality. Based on these two techniques, a novel high-dimensional index is proposed, called the Bit-code and Distance based index (BD). BD is based on a special partitioning strategy that is optimized for high-dimensional data. Using the definitions of the bit code and the transformation function, a high-dimensional vector is first approximately represented and then transformed into a 1D vector, which is the key managed by a B+-tree. A new KNN search algorithm is also proposed that exploits the bit code and distance to prune the search space more effectively. Results of extensive experiments using both synthetic and real data demonstrate that BD outperforms existing index structures for KNN search in high-dimensional spaces.
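The abstract does not give the definitions of the bit code or the transformation function, so the following is only a hypothetical sketch of the general idea of mapping a high-dimensional vector to a single sortable key. The choices made here are assumptions: one bit per dimension relative to a reference point, Euclidean distance to that point, and a key of the form `code * stride + distance`.

```python
import math


def bit_code(vector, reference):
    """One bit per dimension: 1 if the coordinate is >= the reference coordinate."""
    code = 0
    for x, r in zip(vector, reference):
        code = (code << 1) | (1 if x >= r else 0)
    return code


def one_d_key(vector, reference, stride):
    """Combine bit code and distance into one sortable key.

    `stride` must exceed the largest possible distance so that keys from
    different bit codes never overlap.
    """
    dist = math.dist(vector, reference)
    if dist >= stride:
        raise ValueError("stride must be larger than any distance")
    return bit_code(vector, reference) * stride + dist
```

Keys built this way are plain floats, so they could be stored in any ordered structure such as a B+-tree; vectors sharing a bit code occupy one contiguous key range, which is what makes range-based pruning possible.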
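Abstract: The traditional affinity propagation clustering algorithm takes the similarity between pairs of data points as its input measure, and because the preference parameter p and the damping factor λ must be preset, its accuracy cannot be precisely controlled. To address this problem, a cross-iteration affinity propagation clustering method optimized by a jump-tracking sparrow search algorithm is proposed. First, to address the insufficient position updates of discoverers and joiners in the sparrow search algorithm, a jump-tracking optimization strategy is designed: discoverers are updated with large steps through a jump strategy that takes the preference damping factor into account, which increases the global exploration ability and optimization speed of the sparrow search algorithm; joiners update their positions by tracking the leading sparrow with dynamic small steps; and an adaptive population partitioning mechanism updates the proportion of discoverers and joiners, which increases the local exploitation ability and optimization speed of the algorithm in its later stages. Second, a Tent map based on a perturbation factor is designed, to which three parameters are added, enlarging the distribution range of the map and avoiding small-period points and unstable periodic points. Finally, the silhouette coefficient is introduced as the evaluation function; the jump-tracking sparrow search algorithm automatically searches for good values of p and λ in place of manually entered parameters and, combined with the perturbation-factor-based Tent map, optimizes the affinity propagation algorithm, with cross iteration determining the optimal number of clusters. Several algorithms were used to cluster 10 public datasets from the University of California Irvine datasets. Simulation results show that the proposed clustering algorithm achieves better search efficiency and clustering accuracy than the classical affinity propagation algorithm, the differential-based improved affinity propagation clustering algorithm, the sparrow-search-optimized affinity propagation clustering algorithm, and the evolutionary affinity propagation algorithm. A cluster analysis of national information data was also performed; the proposed method is more accurate, effective, and reasonable, and has good application value.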
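The silhouette coefficient used here as the evaluation function can be computed as in the sketch below, which scores a labeling so that a search procedure could compare candidate (p, λ) settings by the clusterings they produce. The function name `silhouette_score` and the convention of scoring singleton clusters as 0 are assumptions of this sketch; the sparrow search and affinity propagation steps themselves are not shown.

```python
def silhouette_score(points, labels, distance):
    """Mean silhouette coefficient of a clustering (higher is better, range [-1, 1])."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            continue  # singleton cluster: contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(distance(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(distance(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != label
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)
```

A parameter search would call this once per candidate clustering and keep the parameters whose labeling scores highest.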
Abstract: Meta-heuristics typically take a long time to search for optimal solutions over huge numbers of data samples in applications such as communication, medicine, and civil engineering. Therefore, parallelizing meta-heuristics to substantially reduce runtime is a hot topic in related research. In this paper, we propose MapReduce modified cuckoo search (MRMCS), an efficient implementation of modified cuckoo search (MCS) on a MapReduce architecture, Hadoop. MapReduce particle swarm optimization (MRPSO) from a previous work is also implemented for comparison. Four evaluation functions and two engineering design problems are used in the experiments. The results show that MRMCS converges better toward the optimum than MRPSO, with a two- to four-fold speed-up.
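The reason a population-based meta-heuristic fits MapReduce is that fitness evaluations are independent of each other. The sketch below shows only that pattern in plain Python, with no Hadoop involved: a map step scores every candidate and a reduce step keeps the best. The `sphere` objective and the function names are hypothetical, and the cuckoo search update rules of MRMCS are not reproduced.

```python
from functools import reduce


def sphere(x):
    """Hypothetical benchmark objective: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def map_phase(nests, fitness):
    """Map: evaluate each candidate solution independently (parallelizable)."""
    return [(fitness(nest), nest) for nest in nests]


def reduce_phase(scored):
    """Reduce: keep the candidate with the lowest fitness value."""
    return reduce(lambda best, item: item if item[0] < best[0] else best, scored)


nests = [[1.0, 2.0], [0.0, 0.5], [3.0, -1.0]]
best_fitness, best_nest = reduce_phase(map_phase(nests, sphere))
```

In a distributed setting each mapper would handle a slice of the population, and one such map/reduce round would run per iteration of the search.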