文章/答案/技术大牛

发布

社区首页 >问答首页 >为给定特定列表的排序算法寻找声音定时测试

问为给定特定列表的排序算法寻找声音定时测试
EN

Stack Overflow用户

提问于 2015-06-18 09:12:32

回答 1查看 998关注 0票数 2

是否有客观的测试来为特定类型的列表找到最佳排序算法？我试过这样的测试，但不确定它是否可靠。关键可能是:能否设计出一个客观的测试来概括最优的列表类型，或者，这些决定是否需要经验证据？

问题

我试图为特定类型的列表找到最佳排序算法。它们包含2-202个具有唯一整数的项。我正试图找到最快的方法来对数百万这样的名单进行排序。

当我注意到C sorted(unsorted)中内置的test用于Python的速度仅略快于我在python中的简单测试simple_sort(unsorted_set, order)时，这个搜索就开始了。有趣的是，Python中的quick_sort并不总是比simple_sort更快。

>>> def simple_sort(unsorted_set, order):
...     sorted_list = []
...     for i in order:
...         if i in unsorted_set:
...             sorted_list.append(i)
...     return sorted_list
>>> unsorted = [1, 5, 2, 9]
>>> order = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> unsorted_set = {item for item in unsorted}
>>> print simple_sort(unsorted_set, order)
[1, 2, 5, 9]

在某个时候，我需要排序的算法一旦在C中足够熟悉，就会用C重写。

我把精力放在编写以下测试代码上，假设C中的simple_sort对特定类型的列表执行Tim排序。
我假设sorted(unsorted)是Tim的C实现。

排序列表切片的Restults

数数排序是最快的。
在我用Python测试的最快的算法中，Tim在C中是最慢的，包括我天真的解决方案simple_sort。
有趣的是，获胜算法(以下)聚为3组。
第一个测试错误地对预先排序的列表进行排序。我为未排序的列表添加了一个测试(如下)。
Excel文件

2-202个排序项的最快算法

2-202个排序项的最慢算法

排序未排序列表的切片的结果

提姆排序在C是最快的，我的名单短于203项。
对于超过475项的列表，simple_sort更快。
我为未排序的列表添加了本节，因为我的第一次测试使用了就地gnome_sort预排序输入列表。

2-202个未排序项的最快算法

2-1002个未排序项的最快算法

2-202个未排序项的最慢算法

2-1002个未排序项的最慢算法

测试代码

我已经链接了这里的完整排序测试代码，因为排序算法会产生过长的帖子。

# 'times' is how many repetitions each algorithm should run  
times = 100
# 'unique_items' is the number of non-redundant items in the unsorted list
unique_items = 1003
# 'order' is an ordered list
order = []
for number in range(unique_items):
    order.append(number)
# Generate the unsorted list
random_list = order[:]
random.shuffle(random_list)
# 'random_set' is used for simple_sort
random_simple = random_list[:]
random_set = {item for item in random_simple}

# A list of all sorted lists for each algorithm   
sorted_lists = [
    simple_sort(random_set, order),
    quick_sort(random_list[:]),
    merge_sort(random_list[:]),
    shell_sort(random_list[:]),
    bubble_sort(random_list[:]),
    heap_sort(random_list[:]),
    insertion_sort(random_list[:]),
    insertion_sort_bin(random_list[:]),
    circle_sort(random_list[:]),
    cocktail_sort(random_list[:]),
    counting_sort(random_list[:], 0, unique_items),
    cycle_sort(random_list[:]),
    gnome_sort(random_list[:]),
    pancake_sort(random_list[:]),
    patience_sort(random_list[:]),
    radix_sort(random_list[:], unique_items),
    selection_sort(random_list[:]),
    abstract_tree_sort(random_list[:], BinarySearchTree),
    sorted(random_list[:])
    ]

# A set of all sorted lists for each algorithm
sorted_set = {repr(item) for item in sorted_lists}
# If only one version of the sorted list exists, True is evaluated
print 'All algorithms sort identically', len(sorted_set) is 1

# Sort slices of an unsorted list and record the times in 'time_record'
time_record = defaultdict(list)
for length in range(2, unique_items, 10):
    unsorted = random_list[:length]
    # 'unsorted_set' is used for simple_sort
    simple_unsorted = unsorted[:]
    unsorted_set = {item for item in simple_unsorted}

    print '**********', length, '**********'    

    print 'simple'
    simple = timeit.timeit(lambda: simple_sort(unsorted_set, order), number=times)
    time_record['Simple Sort'].append(simple)    

    print 'quick'
    quick_unsorted = unsorted[:]
    quick = timeit.timeit(lambda: quick_sort(quick_unsorted), number=times)
    time_record['Quick Sort'].append(quick)

    print 'merge'
    merge_unsorted = unsorted[:]
    merged = timeit.timeit(lambda: merge_sort(merge_unsorted), number=times)
    time_record['Merge Sort'].append(merged)

    print 'shell'
    shell_unsorted = unsorted[:]
    shell = timeit.timeit(lambda: merge_sort(shell_unsorted), number=times)
    time_record['Shell Sort'].append(shell)

    print 'bubble'
    bubble_unsorted = unsorted[:]
    bubble = timeit.timeit(lambda: bubble_sort(bubble_unsorted), number=times)
    time_record['In Place Bubble Sort'].append(bubble)    

    print 'heap'
    heap_unsorted = unsorted[:]
    heap = timeit.timeit(lambda: heap_sort(heap_unsorted), number=times)
    time_record['In Place Heap Sort'].append(heap)

    print 'insertion'
    insertion_unsorted = unsorted[:]
    insertion = timeit.timeit(lambda: insertion_sort(insertion_unsorted), number=times)
    time_record['In Place Insertion Sort'].append(insertion)

    print 'insertion binary'
    insertion_bin_unsorted = unsorted[:]
    insertion_bin = timeit.timeit(lambda: insertion_sort_bin(insertion_bin_unsorted), number=times)
    time_record['In Place Insertion Sort Binary'].append(insertion_bin)

    print 'circle'
    circle_unsorted = unsorted[:]
    circle = timeit.timeit(lambda: circle_sort(circle_unsorted), number=times)
    time_record['In Place Circle Sort'].append(circle)

    print 'cocktail'
    cocktail_unsorted = unsorted[:]
    cocktail = timeit.timeit(lambda: cocktail_sort(cocktail_unsorted), number=times)   
    time_record['In Place Cocktail Sort'].append(cocktail)

    print 'counting'
    counting_unsorted = unsorted[:]
    counting = timeit.timeit(lambda: counting_sort(counting_unsorted, 0, length), number=times)
    time_record['Counting Sort'].append(counting)

    print 'cycle'
    cycle_unsorted = unsorted[:]
    cycle = timeit.timeit(lambda: cycle_sort(cycle_unsorted), number=times)
    time_record['In Place Cycle Sort'].append(cycle)

    print 'gnome'
    gnome_unsorted = unsorted[:]
    gnome = timeit.timeit(lambda: gnome_sort(gnome_unsorted), number=times)
    time_record['Gnome Sort'].append(gnome)

    print 'pancake'
    pancake_unsorted = unsorted[:]
    pancake = timeit.timeit(lambda: pancake_sort(pancake_unsorted), number=times)
    time_record['In Place Pancake Sort'].append(pancake)

    print 'patience'
    patience_unsorted = unsorted[:]
    patience = timeit.timeit(lambda: patience_sort(patience_unsorted), number=times)
    time_record['In Place Patience Sort'].append(patience)

    print 'radix'
    radix_unsorted = unsorted[:]
    radix = timeit.timeit(lambda: radix_sort(radix_unsorted, length), number=times)
    time_record['Radix Sort'].append(radix)

    print 'selection'
    selection_unsorted = unsorted[:]
    selection = timeit.timeit(lambda: selection_sort(selection_unsorted), number=times)
    time_record['Selection Sort'].append(selection)

    print 'tree'
    tree_unsorted = unsorted[:]
    tree_sorted = timeit.timeit(lambda: abstract_tree_sort(tree_unsorted, BinarySearchTree), number=times)
    time_record['Abstract Tree Sort'].append(tree_sorted)

    print 'tim in c'
    tim_unsorted = unsorted[:]
    tim = timeit.timeit(lambda: sorted(tim_unsorted), number=times)
    time_record['Tim in C'].append(tim)

python

algorithm

python-2.7

sorting

回答 1

Stack Overflow用户

回答已采纳

发布于 2015-06-18 15:02:17

最佳排序算法取决于各种因素，包括输入的属性(例如元素的大小)和对结果的要求(例如稳定性)。对于给定的输入集，Bubblesort在O(N)中可能异常快，而快速排序在O( N )中可能异常慢，而Mergesort总是在O(N )中。

在一般情况下，排序是在O( no )中进行的，也就是说，没有任何算法能够比这更快地排序任意集合。但是，对于某些输入特性，有一些排序算法相对于输入集的大小是线性的。很明显，你不能比这更快。

如果你对排序不太了解，你最好的选择就是简单地比较一些常见的排序算法。因为您的输入由“唯一整数”组成，所以您不必关心排序算法是否稳定。

对实际数据尝试以下算法并选择最快的算法：

梅吉索
布布莱
快速排序
基排序

如果可能输入的总体数量“很小”，您甚至可以跳过排序，只需预先计算所有可能的结果。

票数 1

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/30911168

复制

相似问题

问为给定特定列表的排序算法寻找声音定时测试
EN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问为给定特定列表的排序算法寻找声音定时测试EN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问为给定特定列表的排序算法寻找声音定时测试
EN