首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >减少素筛内存消耗2

减少素筛内存消耗2
EN

Code Review用户
提问于 2014-05-08 23:27:44
回答 2查看 332关注 0票数 5

我制作了一个素数生成器(用于Euler项目)。它使用欧拉筛(一种改良的埃拉托斯提尼筛),有30步的模。我希望将内存消耗减少到当前的4/15,方法是只为可能的素余数30保留一个布尔数组。我不能让它开始工作,也担心这会减慢程序的速度。

我用过

代码语言:javascript
复制
n[f/30*8+[zeroes, except 1,2,3,4,5,6,7 at 7,11,13,17,19,23,29][f%30]]

似乎没有过滤掉任何合成数字。我如何使这个工作和其他优化(除了增加国防部),你可以建议?我需要达到20亿英镑的质数。

代码语言:javascript
复制
//pes30.cxx
#include <vector>
#include <algorithm>
#ifndef _pes30_cxx_
#define _pes30_cxx_

typedef unsigned long long big;

const int offsets[]={6,4,2,4,2,4,6,2};

//fill a given vector with all primes under some value:
void sievePrimes(big max, std::vector<big> &p){
    big multiple;
    bool n[max];//array of whether or not each number is prime (30/8 too big!)
    p={2,3,5};//because the sieve skips all multiples of 2,3, and 5, start with them.
    for(big i=0; i<max; ++i)//initialize the array
        n[i]=true;
    //for every number marked prime less than max, mark its multiples with
    //    every number still marked prime over it as composite.
    for(big i=7, step=1; i<=max; i+=offsets[step], ++step==8?step=0:0){
        if(!n[i])//if i is not prime
            continue;
        p.push_back(i);//add i to the list of primes
        //finds every multiple of i and a (still marked) prime greater than i
        for(big j=i, step2=step; j<=max/i; j+=offsets[step2], ++step2==8?step2=0:0){
            if(!n[j])//skip nonprimes
                continue;
            multiple=j*i;//begin at i^2
            do{
                n[multiple]=false;
            }while((multiple*=j) <= max);
        }
    }
}
//test if a number is prime by searching a given list of primes for it:
inline bool isPrime(big n, std::vector<big> p){
    return std::binary_search(p.begin(), p.end(), n);
}
#endif

请注意,这是一篇关于.正确的格式,虽然我欣赏对我的编码风格的批评,但我也希望得到一些关于改进数组的建议(只存储n个n个mod 30素数或1,但不是2,3,或5的数字n)。

EN

回答 2

Code Review用户

回答已采纳

发布于 2014-05-09 06:46:26

你发布了一些代码,显示了你让电脑做的事情。它甚至有一些评论。老实说,这比一般的问题好多了。

然后你会意识到我们不会理解这一团代码,所以你会使用像“我做的”和“它使用了欧拉的筛子”和“我需要的素数达到20亿”这样的短语。听着,这是非常有用的信息,我很高兴你告诉了我,但我会建议,将来你应该把程序的作者、算法的名称等等的信息放在程序中的注释中:

代码语言:javascript
复制
// pes30.cxx
// prime number generator for Project Euler
// Print all the primes up to 2x10^9.
// 2014-05-08: started by hacatu
// implementation of Euler's sieve (a modified Sieve of Eratosthenes)
// http://en.wikipedia.org/wiki/Euler%27s_sieve
// using wheel factorization with a wheel of size 30 to reduce memory requirements
// http://en.wikipedia.org/wiki/wheel_factorization

那么,有什么理由不使用http://primesieve.org/呢?你有没有让它在没有车轮因式分解优化的情况下工作?你试过只看奇怪的主要候选人(2码的轮子)还是6或12码的小轮?

我对这三行感到迷惑不解

代码语言:javascript
复制
bool n[max];//array of whether or not each number is prime (30/8 too big!)
for(big i=0; i<max; ++i)//initialize the array
    n[i]=true;

"max“是您想要检查的最大数字,因为在这个程序运行期间,它可能是最优的,对吗?所以这一行占用了一堆内存--从0到最大值,每个数字有一个bool,可能有200万个bool值。

我以为你会用轮式分解来减少内存需求?

我认为您希望用更类似于以下2行代码的内容来替换这3行代码:

代码语言:javascript
复制
big blocks_of_30_values = (max/30)+1;
// For example, when max is 38, max/30 gives 1, but we need 2 blocks: 0..29 and 30..59.
// In each block of 30 values under consideration
// (i.e, the block 30..59, the block 60..89, the block 90..119, etc.),
// after using the wheel factorization to eliminate multiples of 2 and 3 and 5,
// there are only 8 potential candidates left in each block.
// So store each block in a single unsigned char:
// and initialize the array to all 1 bits (1=candidate prime, 0=definitely composite)
std::vector<unsigned char> candidatePrime( blocks_of_30_values, 0xFF );

因此,当最大值为20亿时,我们只分配( max /30) =7000万个字符,即大约5.33亿比特。

如果您指定了函数来将数组中的位转换为它所代表的整数值,那么它可能会使程序更容易阅读,反之亦然:

代码语言:javascript
复制
// the bit at the n'th bit of the m'th byte of the candidatePrime array
// represents what value?
big the_value( big block_number, int bit_number ){
    const int offset[] = { 1, 7, 11, 13, 17, 19, 23, 29 };
    int this_offset = offset[bit_number];
    return( (30*block_number) + this_offset );
}
// the given integer value n corresponds to
// what (block_number, bit_number) of the candidatePrime array?
// assumes that n is *not* a multiple of 2, 3, or 5.
big block_number( big integer_value ){
    return (integer_value/30);
}
int bit_number( big integer_value ){
    // as long as the integer_value is not a multiple of 2, 3, or 5,
    // this function should never return '9'.
    // translate a value 0..29 into the corresponding bit number 0..7
    const int bit_offset[] = {
        9, 0, 9,  9, 9, 9,
        9, 1, 9,  9, 9, 2,
        9, 3, 9,  9, 9, 4,
        9, 5, 9,  9, 9, 6
        9, 7, 9,  9, 9, 7
    };
    return bit_offset[ integer_value % 30 ];
}
票数 4
EN

Code Review用户

发布于 2014-05-29 01:01:50

我将向量固定为使用三分之一的内存,使用的是偏移量,也可以使用vector<boolean>,这是为使用每一位进行优化的。这导致了50%的减速,但对于最高值(因为最大值现在可以是最大值的30倍),这是值得的,我相信我可以减少这一点。

我没有使用primesieve.org,因为我不知道它。比我的节目好得多。它在169秒内生成10亿以下的素数,我的程序需要18.539秒。

下面是我更新的程序,现在有了一个合理大小的向量(对于循环增量来说,它的工作量甚至更大):

代码语言:javascript
复制
// pes30.cxx
// prime number generator for Project Euler
// Print all the primes up to 2x10^9.
// 2014-05-08: started by hacatu
// implementation of Euler's sieve (a modified Sieve of Eratosthenes)
// http://en.wikipedia.org/wiki/Euler%27s_sieve
// using wheel factorization with a wheel of size 30 to reduce memory requirements
// http://en.wikipedia.org/wiki/wheel_factorization
#include <vector>
#include <algorithm>
#include <cstdint>
#include <cmath>
#ifndef _pes30_cxx_
#define _pes30_cxx_
using namespace std;
const int offsets[]={6,4,2,4,2,4,6,2};//steps in the wheel 1,7,11,13,17,19,23,29,1...
const int bool_offsets[]={//used to find a number's spot on the wheel, for accessing its primality in the vector
9, 0, 9, 9, 9, 9,//the nines should never be accessed
9, 1, 9, 9, 9, 2,
9, 3, 9, 9, 9, 4,
9, 5, 9, 9, 9, 6,
9, 9, 9, 9, 9, 7
};
//create a vector with all primes under some value:
inline vector<uint64_t> sievePrimes(uint64_t max){
    uint64_t multiple;
    vector<bool> n(max/30*8+8, true);//vector of whether every number relatively prime to 30 is still a candidate prime.
    //8 bits are needed for every 30 numbers (since 8 are relatively prime with 30), so max/30*8, plus 8 because max/30 rounds down.
    vector<uint64_t> primes={2,3,5};//because the sieve skips all multiples of 2,3, and 5, start with them.
    //for every number marked prime less than max, mark its multiples with
    //every number still marked prime over it as composite.
    int r=sqrt(max);
    for(uint64_t i=1, p=7, step=1; p <= max; ++i, p += offsets[step], ++step == 8 ? step=0:0){
        if(!n[i])//if p is not prime (using i for the index holding the primality of p, to avoid computing the index for every number)
            continue;
        primes.push_back(p);//add p to the list of primes
        //finds every multiple of i and a (still marked) prime greater than i
        for(uint64_t j=i, p2=p, step2=step; p2 <= max/p; ++j, p2 += offsets[step2], ++step2 == 8 ? step2=0:0){
            if(!n[j])//skip nonprimes
                continue;
            multiple=p*p2;
            do{
                n[multiple/30*8 + bool_offsets[ multiple%30 ]]=false;
            }while((multiple*=p) <= max);
        }
    }
    return primes;
}
//test if a number is prime by searching a given list of primes for it:
inline bool isPrime(uint64_t n, vector<uint64_t> primes){
    return std::binary_search(primes.begin(), primes.end(), n);
}
#endif
票数 2
EN
页面原文内容由Code Review提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://codereview.stackexchange.com/questions/49284

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档