Sunday, August 14, 2011

SSE tutorial for modular arithmetic?

I want to use SSE vectorized instructions to speed up a large number of integer multiplies and additions, with the catch that this arithmetic is performed under some fixed < 32 bit prime modulus. I feel like the code gcc -O2 generates could be improved by a factor of 10, but I'm not an assembly coder. Thanks,
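For what it's worth, here is a minimal sketch of one way this might look with SSE2 intrinsics in C, so no hand-written assembly is needed. It is not taken from any tutorial. It swaps in Montgomery reduction to avoid the division, and it assumes the prime p is odd and below 2^31, that all inputs are already reduced below p, and that the target is x86 with SSE2. The names mont_neg_inv, to_mont, mont_mul and mulmod_add_sse2 are made up for illustration. One operand of each multiply is expected in Montgomery form (b * 2^32 mod p), which makes the result come out as a plain residue.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: -p^{-1} mod 2^32 for odd p, via Newton iteration. */
static uint32_t mont_neg_inv(uint32_t p) {
    uint32_t inv = p;              /* p*p == 1 (mod 8): 3 correct bits */
    for (int i = 0; i < 5; i++)    /* each step doubles the correct bits */
        inv *= 2u - p * inv;
    return 0u - inv;
}

/* Hypothetical helper: x * 2^32 mod p (Montgomery form of x). */
static uint32_t to_mont(uint32_t x, uint32_t p) {
    return (uint32_t)(((uint64_t)x << 32) % p);
}

/* Scalar Montgomery product: a * b * 2^-32 mod p. Used for the loop tail. */
static uint32_t mont_mul(uint32_t a, uint32_t b, uint32_t p, uint32_t pinv) {
    uint64_t t = (uint64_t)a * b;
    uint32_t m = (uint32_t)t * pinv;
    uint32_t u = (uint32_t)((t + (uint64_t)m * p) >> 32);
    return u >= p ? u - p : u;
}

/* Lanes of x in [0, 2p) -> [0, p). Relies on p < 2^31 so x - p fits in int32. */
static inline __m128i reduce_once(__m128i x, __m128i pv) {
    __m128i d = _mm_sub_epi32(x, pv);
    __m128i neg = _mm_srai_epi32(d, 31);          /* all ones where d < 0 */
    return _mm_add_epi32(d, _mm_and_si128(neg, pv));
}

/* Two 64-bit products t -> t + m*p; the answer sits in the high 32 bits. */
static inline __m128i mont_hi(__m128i t, __m128i pv, __m128i pinvv) {
    __m128i m = _mm_mul_epu32(t, pinvv);          /* only the low 32 bits matter */
    return _mm_add_epi64(t, _mm_mul_epu32(m, pv));
}

/* out[i] = (a[i] * b[i] + c[i]) mod p, where bm[i] = to_mont(b[i], p). */
void mulmod_add_sse2(uint32_t *out, const uint32_t *a, const uint32_t *bm,
                     const uint32_t *c, size_t n, uint32_t p) {
    uint32_t pinv = mont_neg_inv(p);
    __m128i pv = _mm_set1_epi32((int)p);
    __m128i pinvv = _mm_set1_epi32((int)pinv);
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(bm + i));
        __m128i vc = _mm_loadu_si128((const __m128i *)(c + i));
        /* _mm_mul_epu32 multiplies lanes 0 and 2; shift to reach lanes 1 and 3. */
        __m128i te = _mm_mul_epu32(va, vb);
        __m128i to = _mm_mul_epu32(_mm_srli_epi64(va, 32), _mm_srli_epi64(vb, 32));
        __m128i ue = _mm_srli_epi64(mont_hi(te, pv, pinvv), 32);
        __m128i uo = _mm_slli_epi64(_mm_srli_epi64(mont_hi(to, pv, pinvv), 32), 32);
        __m128i u = reduce_once(_mm_or_si128(ue, uo), pv);
        _mm_storeu_si128((__m128i *)(out + i),
                         reduce_once(_mm_add_epi32(u, vc), pv));
    }
    for (; i < n; i++) {                          /* scalar tail */
        uint32_t s = mont_mul(a[i], bm[i], p, pinv) + c[i];
        out[i] = s >= p ? s - p : s;
    }
}
```

The conversion with to_mont costs a 64-bit division per element, so this only pays off if one operand is reused many times (fixed coefficients, for example) or if values stay in Montgomery form across a long chain of operations. Whether it gets anywhere near a factor of 10 over gcc -O2 would need measuring.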
