Q: How to quickly get all intersections of sets in Python

Stack Overflow user

Asked 2016-06-03 19:36:14

2 answers · 1.2K views · 0 followers · 4 votes

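I want to compute all (distinct) intersections of a collection of finite sets of integers in Python, implemented here as a list of lists (to avoid confusion, a formal definition is given at the end of the question):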

> A = [[0,1,2,3],[0,1,4],[1,2,4],[2,3,4],[0,3,4]]
> all_intersections(A) # desired output
[[], [0], [1], [2], [3], [4], [0, 1], [0, 3], [0, 4], [1, 2], [1, 4], [2, 3], [2, 4], [3, 4], [0, 1, 4], [0, 3, 4], [1, 2, 4], [2, 3, 4], [0, 1, 2, 3]]

I have an algorithm that does this iteratively, but it is rather slow (should I post it?). A test case would be

[[0, 1, 2, 3, 4, 9], [0, 1, 4, 5, 6, 10], [0, 2, 4, 5, 7, 11], [1, 3, 4, 6, 8, 12], [2, 3, 4, 7, 8, 13], [4, 5, 6, 7, 8, 14], [0, 1, 9, 10, 15, 16], [0, 2, 9, 11, 15, 17], [1, 3, 9, 12, 16, 18], [2, 3, 9, 13, 17, 18], [9, 15, 16, 17, 18, 19], [0, 5, 10, 11, 15, 20], [1, 6, 10, 12, 16, 21], [10, 15, 16, 19, 20, 21], [5, 6, 10, 14, 20, 21], [11, 15, 17, 19, 20, 22], [5, 7, 11, 14, 20, 22], [2, 7, 11, 13, 17, 22], [7, 8, 13, 14, 22, 23], [3, 8, 12, 13, 18, 23], [13, 17, 18, 19, 22, 23], [14, 19, 20, 21, 22, 23], [6, 8, 12, 14, 21, 23], [12, 16, 18, 19, 21, 23]]

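which takes me about 2.5 seconds to compute.

Any ideas how to do it faster?

Formal definition (actually hard to write without LaTeX mode): let A = {A1, ..., An} be a finite set of finite sets Ai of non-negative integers. The output should be the set { intersection of the sets in B : B a subset of A }.

So the formal algorithm would be to collect the intersections of all subsets of A, but that obviously takes forever.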
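For reference, the formal definition can be transcribed almost literally as a brute-force check. This is only a sketch for validating faster implementations on small inputs: the name `all_intersections_bruteforce` is made up for illustration, and it assumes that only non-empty subcollections B are considered, which matches the desired output above (the full set {0, ..., 4} does not appear there).

```python
from itertools import combinations

def all_intersections_bruteforce(lists):
    """Intersect every non-empty subcollection of the input sets.

    Exponential in len(lists); intended only as a reference for small inputs.
    """
    sets = [frozenset(s) for s in lists]
    found = set()
    for size in range(1, len(sets) + 1):
        for combo in combinations(sets, size):
            found.add(frozenset.intersection(*combo))
    # Canonical form: sorted sublists, ordered by length and then by contents.
    return sorted((sorted(s) for s in found), key=lambda x: (len(x), x))

A = [[0, 1, 2, 3], [0, 1, 4], [1, 2, 4], [2, 3, 4], [0, 3, 4]]
print(all_intersections_bruteforce(A))
```

With five input sets this visits only 31 subcollections, but the count doubles with every additional set, so it is unusable for the 24-set test case.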

Thanks a lot!

python

algorithm

2 Answers

Stack Overflow user

Answered 2016-06-03 20:31:23

Here is a recursive solution. It is almost instantaneous on your test example:

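def allIntersections(frozenSets):
    # Base case: an empty collection has no intersections to report.
    if len(frozenSets) == 0:
        return []
    else:
        head = frozenSets[0]
        tail = frozenSets[1:]
        # All distinct intersections that do not involve head.
        tailIntersections = allIntersections(tail)
        # head itself, everything found in the tail, and head intersected
        # with each of the tail results.
        newIntersections = [head]
        newIntersections.extend(tailIntersections)
        newIntersections.extend(head & s for s in tailIntersections)
        # Deduplicate before returning to the caller.
        return list(set(newIntersections))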

def all_intersections(lists):
    sets = allIntersections([frozenset(s) for s in lists])
    return [list(s) for s in sets]

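On Edit: here is a cleaner, non-recursive implementation of the same idea.

The problem is easiest if you define the intersection of an empty collection of sets to be the universal set; an adequate universal set can be obtained by taking the union of all the input sets. This is a standard move in lattice theory, and it is dual to defining the union of an empty collection of sets to be the empty set. You can always throw away this universal set if you don't want it: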

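def allIntersections(frozenSets):
    # Union of all inputs; frozenset() as the receiver also handles empty input.
    universalSet = frozenset().union(*frozenSets)
    # Seed with the intersection of the empty collection (the universal set).
    intersections = set([universalSet])
    for s in frozenSets:
        # Intersect s with every intersection found so far, then merge.
        moreIntersections = set(s & t for t in intersections)
        intersections.update(moreIntersections)
    return intersections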

def all_intersections(lists):
    sets = allIntersections([frozenset(s) for s in lists])
    return [list(s) for s in sets]
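If the universal set is not wanted, one way to drop it is sketched below. The function name `all_intersections_without_universe` is hypothetical, and the sketch relies on one observation: the union of the inputs can only arise as a genuine intersection when it is itself one of the input sets, so it is discarded only when it is not an input.

```python
def all_intersections_without_universe(lists):
    """Iterative approach seeded with the union, minus the artificial seed."""
    frozen_sets = [frozenset(s) for s in lists]
    universal_set = frozenset().union(*frozen_sets)
    intersections = {universal_set}
    for s in frozen_sets:
        # The right-hand side is fully built before the in-place update.
        intersections |= {s & t for t in intersections}
    # Keep the union only if it is a real input set, not just the seed.
    if universal_set not in frozen_sets:
        intersections.discard(universal_set)
    # Canonical form: sorted sublists, ordered by length and then by contents.
    return sorted((sorted(s) for s in intersections), key=lambda x: (len(x), x))

A = [[0, 1, 2, 3], [0, 1, 4], [1, 2, 4], [2, 3, 4], [0, 3, 4]]
print(all_intersections_without_universe(A))
```

On the small example from the question this reproduces the desired output, since {0, 1, 2, 3, 4} is not one of the inputs.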

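The reason the test example is so fast is that, even though the collection has 24 sets and therefore 2**24 (about 16.8 million) potential intersections, there are in fact only 242 distinct intersections (or 241 if you don't count the empty intersection). Thus the number of intersections handled in each pass through the loop is at most a few hundred.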

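It is possible to choose 24 sets so that all 2**24 possible intersections are in fact distinct, so it is easy to see that the worst-case behavior is exponential. But if, as in your test example, the number of intersections is small, this approach will let you compute them quickly.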
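One family that realizes this worst case (an illustrative construction, not taken from the answer) is the n sets obtained by deleting one element each from {0, ..., n-1}: intersecting any subcollection removes exactly the corresponding elements, so for n >= 2 all 2**n results differ. The helper names `worst_case_family` and `count_intersections` are made up for this sketch, and the count includes the universal set as the intersection of the empty subcollection.

```python
def worst_case_family(n):
    """Return the n sets {0, ..., n-1} minus {i}, for i in range(n)."""
    universe = frozenset(range(n))
    return [universe - {i} for i in range(n)]

def count_intersections(frozen_sets):
    """Count distinct intersections, including the universal seed."""
    universal_set = frozenset().union(*frozen_sets)
    intersections = {universal_set}
    for s in frozen_sets:
        intersections |= {s & t for t in intersections}
    return len(intersections)

for n in (4, 8, 12):
    # Each line prints n followed by 2**n: 16, 256, 4096.
    print(n, count_intersections(worst_case_family(n)))
```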

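A potential optimization might be to sort the sets by increasing size before looping over them. Processing the smaller sets first might cause more empty intersections to appear earlier, keeping the number of distinct intersections small until near the end of the loop.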
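Whether the ordering helps depends on the data, so it may be worth measuring before adopting it. The sketch below (the helper name `growth_profile` is an assumption for illustration) records how many distinct intersections exist after each input set is processed, so two orderings can be compared. The final count is the same either way; only the intermediate sizes can differ.

```python
def growth_profile(frozen_sets):
    """Return the number of distinct intersections after each input set."""
    universal_set = frozenset().union(*frozen_sets)
    intersections = {universal_set}
    sizes = []
    for s in frozen_sets:
        intersections |= {s & t for t in intersections}
        sizes.append(len(intersections))
    return sizes

A = [frozenset(s) for s in
     [[0, 1, 2, 3], [0, 1, 4], [1, 2, 4], [2, 3, 4], [0, 3, 4]]]

original_order = growth_profile(A)
smallest_first = growth_profile(sorted(A, key=len))
print(original_order)
print(smallest_first)
```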

7 votes

Stack Overflow user

Answered 2016-06-03 21:22:27

An iterative solution that takes about 3.5 ms on my machine for your large test input:

from itertools import starmap, product
from operator import and_

def all_intersections(sets):
    # Convert to set of frozensets for uniquification/type correctness
    last = new = sets = set(map(frozenset, sets))
    # Keep going until further intersections add nothing to results
    while new:
        # Compute intersection of old values with newly found values
        new = set(starmap(and_, product(last, new)))
        last = sets.copy()  # Save off prior state
        new -= last         # Determine truly newly added values
        sets |= new         # Accumulate newly added values in complete set
    # No more intersections being generated, convert results to canonical
    # form, list of lists, where each sublist is displayed in order, and
    # the top level list is ordered first by size of sublist, then by contents
    return sorted(map(sorted, sets), key=lambda x: (len(x), x))

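Basically, it just keeps intersecting the old result set with the newly found intersections until a round of intersecting changes nothing, at which point it is done.

Note: this is not actually the best solution. The recursive approach is enough better to win on the test data: John's solution, after adding sorting to the outer wrapper so that its output matches this format, takes about 0.94 ms, versus 3.5 ms for mine. I mostly posted it as an example of solving the problem another way.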
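Timings like these depend on the machine and the Python version, so they are best reproduced locally. A minimal sketch using the standard `timeit` module follows; the helper `best_time_ms` and the placeholder workload are assumptions for illustration, and either `all_intersections` implementation can be passed in its place.

```python
import timeit

def best_time_ms(func, data, number=100, repeat=5):
    """Best per-call time of func(data) in milliseconds."""
    timer = timeit.Timer(lambda: func(data))
    totals = timer.repeat(repeat=repeat, number=number)
    # Each total covers `number` calls; the minimum is the least noisy run.
    return min(totals) / number * 1000.0

small_input = [[0, 1, 2, 3], [0, 1, 4], [1, 2, 4], [2, 3, 4], [0, 3, 4]]
# Placeholder workload; substitute the function you want to measure.
elapsed = best_time_ms(lambda sets: [frozenset(s) for s in sets], small_input)
print(f"{elapsed:.4f} ms per call")
```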

2 votes

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/37622153
