Abstract
This paper proposes an equal-probability segment sorting algorithm for data. Building on previous probabilistic-statistical sorting algorithms, the algorithm makes full use of the distribution information of the data, so that the data to be sorted are distributed as evenly as possible across different intervals; the data in each interval are then sorted separately, yielding an ordered sequence. A quantitative study of the effectiveness of the algorithm is presented, which theoretically quantifies and analyzes the choice of the number of segments m, the degree of approximation of the distribution type, and several factors that affect them; this quantification can guide sorting in practice. Some important conclusions are derived.
This paper proposes a subsection insertion-sorting algorithm based on equal-probability data segments. The algorithm combines traditional sorting algorithms with techniques from modern statistics to sort data drawn from a general distribution, making full use of the distribution information of the data. The approach consists mainly of the following steps. First, the distribution type of the data is determined experimentally. Second, the distribution parameters of the data are estimated. Third, the data are assigned evenly, on the whole, to different segments. Finally, the data in each segment are sorted with a traditional sorting algorithm. The time complexity of this algorithm is limited to O(n). The theory of non-parametric hypothesis testing is used to quantify the number of segments and the degree of approximation of the distribution type, and to derive the factors that affect them. Some important theoretical results are deduced. Let b denote the ratio of the number of data items to the number of segments. The main results are as follows: for large n, the algorithm is time-optimal when b is a constant, and the value of b required to guarantee O(n) time complexity depends on how closely the assumed distribution type approximates the true one. Experiments on these results show that the theoretical conclusions are consistent with practice.
Source
《计算机学报》
EI
CSCD
Peking University Core Journals
2003, No. 1, pp. 45-50 (6 pages)
Chinese Journal of Computers
Funding
Supported by the National Natural Science Foundation of China (69973016, 69733010)