Chaos game representation (CGR) is an iterative mapping technique that processes sequences of units, such as nucleotides in a DNA sequence or amino acids in a protein, to determine the coordinates of their positions in a continuous space. The resulting distribution of positions has two properties: it is unique, and the source sequence can be recovered from the coordinates, so the distance between positions may serve as a measure of similarity between the corresponding sequences. A CGR-walk model based on the CGR coordinates is proposed for DNA sequences. The CGR coordinates are converted into a time series, and a long-memory ARFIMA(p, d, q) model, where ARFIMA stands for autoregressive fractionally integrated moving average, is introduced into DNA sequence analysis. The model is used to simulate real CGR-walk sequence data of ten genomic sequences. Remarkably long-range correlations are uncovered in the data, and the ARFIMA(p, d, q) model fits the data reasonably well.
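The iterative mapping and the recoverability property described above can be illustrated with a minimal sketch. It assumes the common convention of placing the four nucleotides at the corners of the unit square and starting from the centre; the corner assignment and the names `cgr_points` and `recover` are illustrative assumptions, not taken from the abstract, and the construction of the CGR-walk itself is not reproduced here.

```python
from fractions import Fraction

# Assumed corner assignment (a common convention, not stated in the abstract).
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}
HALF = Fraction(1, 2)


def cgr_points(seq):
    """Return the CGR coordinates of every prefix of seq.

    Each step moves halfway from the current point to the corner
    of the next nucleotide. Fractions keep the arithmetic exact.
    """
    x, y = HALF, HALF  # start at the centre of the unit square
    points = []
    for base in seq:
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def recover(point, length):
    """Rebuild the source sequence from its final CGR point.

    The quadrant of a point identifies the last nucleotide; undoing
    the halving step then exposes the previous point, and so on.
    """
    by_corner = {corner: base for base, corner in CORNERS.items()}
    x, y = point
    bases = []
    for _ in range(length):
        cx, cy = int(x > HALF), int(y > HALF)
        bases.append(by_corner[(cx, cy)])
        x, y = 2 * x - cx, 2 * y - cy
    return "".join(reversed(bases))
```

Exact rational arithmetic is used so that the inversion holds for any sequence length; with floating point the recovery would only be reliable for short sequences.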
A new chaos game representation of protein sequences based on the detailed hydrophobic-hydrophilic (HP) model has been proposed by Yu et al (Physica A 337(2004) 171). In the present paper, a CGR-walk model based on the new CGR coordinates is proposed for protein sequences from complete genomes. The new CGR coordinates based on the detailed HP model are converted into a time series, and a long-memory ARFIMA(p, d, q) model is introduced into protein sequence analysis. The model is used to simulate real CGR-walk sequence data of twelve protein sequences. Remarkably long-range correlations are uncovered in the data, and the results are reasonably consistent with those of the ARFIMA(p, d, q) model.
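The "fractionally integrated" part of an ARFIMA(p, d, q) model is the operator (1 - B)^d, which expands into a series of weights w_0 = 1, w_k = w_{k-1} (k - 1 - d) / k. The sketch below computes those weights and applies them to a series; the function names are illustrative, and truncating the expansion to the available history is an assumption made for the example, not a detail from the paper.

```python
def frac_diff_weights(d, n):
    """First n weights of the binomial expansion of (1 - B)**d."""
    weights = [1.0]
    for k in range(1, n):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(series, d):
    """Apply (1 - B)**d to series, truncated to the observed history."""
    w = frac_diff_weights(d, len(series))
    return [
        sum(w[k] * series[t - k] for k in range(t + 1))
        for t in range(len(series))
    ]
```

With d = 1 the weights reduce to 1, -1, 0, 0, ... and the operator becomes ordinary first differencing; for fractional d the weights decay slowly, which is what gives the model its long memory.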
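Using the chaos game representation (CGR) of DNA sequences, a method is proposed for converting the two-dimensional DNA map into a corresponding spectrum-like format. The method not only provides a good visual representation but also converts a DNA sequence into a time series. CGR coordinates are used to convert DNA sequences into CGR radian sequences, and a long-memory ARFIMA(p, d, q) model is introduced to fit such sequences. Significant long-range correlations are found in these sequences, and the fit is good.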
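Long-range correlation in a time series is commonly checked by looking at how slowly the sample autocorrelation decays with lag. The following is a generic diagnostic sketch, assuming the sequence has already been converted to a list of numbers; `sample_acf` is a hypothetical helper and is not the fitting procedure used in the paper.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelation of series for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(v * v for v in dev)  # lag-0 sum of squares
    return [
        sum(dev[t] * dev[t + lag] for t in range(n - lag)) / denom
        for lag in range(max_lag + 1)
    ]
```

For a short-memory series the values fall off quickly; a slow, roughly hyperbolic decay is the pattern that motivates a fractional d.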
Qinghai Province, located in western China, has rich grassland resources and distinctive conditions for livestock production, which give its meat output an important position nationally. Meat production not only directly affects the income of local herders but also bears on the province's economic development and social stability. An autoregressive integrated moving average (ARIMA) model is therefore built in RStudio to analyze historical data and forecast the future trend of meat production in Qinghai Province, with the optimal model order determined by the AIC criterion. Meat production data from 1997 to 2020 serve as the data source, and data from 2021 to 2023 are used for comparison to assess the difference between actual and predicted values. The actual and predicted values differ somewhat, but the difference is small, and the overall forecast accuracy is high.
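The two evaluation steps mentioned above, choosing the order by AIC and comparing forecasts with held-out years, can be sketched as follows. The sketch assumes the candidate models have already been fitted elsewhere and that their log-likelihoods are available; the helper names are hypothetical, and mean absolute percentage error is one possible accuracy measure, not necessarily the one used in the paper.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_order(fits):
    """Pick the order with the smallest AIC.

    fits maps an order (p, d, q) to (log_likelihood, n_params).
    """
    return min(fits, key=lambda order: aic(*fits[order]))


def mape(actual, predicted):
    """Mean absolute percentage error between held-out and forecast values."""
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100 * total / len(actual)
```

For instance, `best_order({(1, 1, 0): (-50.0, 2), (2, 1, 1): (-49.5, 4)})` selects `(1, 1, 0)`, because the small gain in likelihood of the larger model does not offset its two extra parameters; the numbers here are made up purely for illustration.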
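Influenza viruses are divided into three types: A, B, and C. Of these, influenza A is the most lethal and causes serious disease in humans. This paper establishes a new time series model for influenza A virus DNA sequences, namely the CGR (Chaos Game Representation) radian sequence. CGR coordinates are used to convert influenza A virus DNA sequences into CGR radian sequences, and a long-memory ARFIMA model is introduced to fit them. Ten randomly chosen H1N1 sequences and ten H3N2 sequences all show long-range correlation and are fitted well. The two kinds of sequences can also tentatively be identified by different ARFIMA models: H1N1 by an ARFIMA(0, d, 5) model and H3N2 by an ARFIMA(1, d, 1) model.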
Analyzing network traffic is very important in network control and management. In this paper, extreme value theory is first introduced, and a model with threshold methods is proposed to analyze the characteristics of network traffic. In this model, only traffic data greater than a threshold value are considered. The proposed model is then simulated on the trace using S-PLUS software. The modeling results show that the network traffic model constructed from extreme value theory fits the empirical distribution well. Finally, the extreme value model is compared with FARIMA(p, d, q) modeling. The analytical results illustrate that extreme value theory has good application prospects in the statistical analysis of network traffic. In addition, since only traffic data greater than the threshold are processed, the computation overhead is greatly reduced.
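The threshold idea can be sketched as a peaks-over-threshold step followed by a generalized Pareto fit to the excesses. This is a minimal illustration under stated assumptions: the paper used S-PLUS and does not say how the parameters were estimated, so the method-of-moments estimator here (valid only when the shape is below 0.5) and the helper names are assumptions.

```python
from statistics import mean, variance


def excesses_over(data, threshold):
    """Keep only observations above threshold, as amounts over it."""
    return [x - threshold for x in data if x > threshold]


def fit_gpd_moments(excesses):
    """Method-of-moments estimate of generalized Pareto (shape, scale).

    Uses mean = scale / (1 - shape) and
    variance = scale**2 / ((1 - shape)**2 * (1 - 2 * shape)).
    """
    m, v = mean(excesses), variance(excesses)
    ratio = m * m / v
    shape = 0.5 * (1 - ratio)
    scale = 0.5 * m * (ratio + 1)
    return shape, scale
```

Because only the values above the threshold reach the fitting step, the amount of data processed shrinks with the threshold, which is the source of the reduced computation noted above.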
Funding (CGR-walk model for DNA sequences): Project supported by the National Natural Science Foundation of China (Grant No 60575038) and the Natural Science Foundation of Jiangnan University, China (Grant No 20070365).
Funding (CGR-walk model for protein sequences): Project supported by the National Natural Science Foundation of China (Grant No 60575038), the Natural Science Foundation of Jiangnan University, China (Grant No 20070365), and the Program for Innovative Research Team of Jiangnan University, China.
Funding (CGR radian sequences of DNA): Supported by National Natural Science Grant No. 60575038, Jiangnan University Grant No. 20070365, and the Program for Innovative Research Team of Jiangnan University.