But it does for n = 10, 11, 12 :-) It's not that surprising to me: for small n, Mulders ends up doing three hardcoded multiplications, which are extremely fast, and for slightly larger n Karatsuba kicks in.
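For readers unfamiliar with the "three multiplications" structure mentioned above, here is a minimal sketch of a Mulders-style short product. It is a polynomial analogue (coefficients wrap mod 2^64, so there are no carries to track) and not FLINT's `flint_mpn_mulhigh_n`. All names, the cutoff, and the fixed split `k = n - n/4` are assumptions made for illustration; in the real code the split comes from the tuned table of k-values.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MULHIGH_CUTOFF 4 /* hypothetical threshold, not a FLINT value */

/* Accumulate into res[0..n-1] the coefficients of x^(n-1) .. x^(2n-2)
   of a*b, where a and b each have n coefficients. */
void mulhigh_basecase(uint64_t *res, const uint64_t *a,
                      const uint64_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = n - 1 - i; j < n; j++)
            res[i + j - (n - 1)] += a[i] * b[j];
}

/* Same contract as mulhigh_basecase, computed with one full k x k product
   and two recursive short products of size m = n - k. */
void mulhigh_mulders(uint64_t *res, const uint64_t *a,
                     const uint64_t *b, size_t n)
{
    if (n < MULHIGH_CUTOFF) { mulhigh_basecase(res, a, b, n); return; }

    size_t m = n / 4;  /* size of the two recursive short products */
    size_t k = n - m;  /* size of the full product, n/2 <= k < n */
    uint64_t *tmp = calloc(2 * k - 1, sizeof *tmp);
    if (tmp == NULL) { mulhigh_basecase(res, a, b, n); return; }

    /* (1) Full product of the top k coefficients of each operand.
       tmp[t] is the coefficient of x^(t + 2m); a real implementation
       would call a fast full multiplication (e.g. Karatsuba) here. */
    for (size_t i = 0; i < k; i++)
        for (size_t j = 0; j < k; j++)
            tmp[i + j] += a[m + i] * b[m + j];
    for (size_t t = 0; t < 2 * k - 1; t++)
        if (t + 2 * m >= n - 1)
            res[t + 2 * m - (n - 1)] += tmp[t];
    free(tmp);

    /* (2) Low m coefficients of a times top m coefficients of b. */
    mulhigh_mulders(res, a, b + k, m);
    /* (3) Top m coefficients of a times low m coefficients of b. */
    mulhigh_mulders(res, a + k, b, m);
}

/* Returns 1 if both versions agree for every n in 1..max_n (max_n <= 64). */
int mulhigh_selfcheck(size_t max_n)
{
    uint64_t a[64], b[64], r1[64], r2[64], s = 12345;
    if (max_n > 64) return 0;
    for (size_t n = 1; n <= max_n; n++) {
        for (size_t i = 0; i < n; i++) {
            /* simple LCG, only used to get varied test inputs */
            s = s * 6364136223846793005ULL + 1442695040888963407ULL; a[i] = s;
            s = s * 6364136223846793005ULL + 1442695040888963407ULL; b[i] = s;
        }
        memset(r1, 0, sizeof r1);
        memset(r2, 0, sizeof r2);
        mulhigh_basecase(r1, a, b, n);
        mulhigh_mulders(r2, a, b, n);
        if (memcmp(r1, r2, n * sizeof r1[0]) != 0) return 0;
    }
    return 1;
}
```

Both functions accumulate into `res`, so the caller zeroes it first. The three pieces are disjoint because a term with a low index on one side needs an index of at least k on the other to reach degree n - 1, which is why k must be at least n/2.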
Move related parameters to flint-mparam.h
5cdf22e to 83de090
I'm thinking that the binary version is sort of useless if the CPU has fast division. I don't really like the idea of having different versions of these sorts of functions.
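To make the trade-off concrete, here is a rough sketch of the two styles on plain 64-bit words. It uses ordinary gcd as a simpler stand-in for `n_xgcd` and `n_gcdinv`; the function names are hypothetical and this is not FLINT's code. The division-based loop does one hardware division per step, while the binary loop uses only shifts and subtractions but typically needs more iterations.

```c
#include <stdint.h>

/* Division-based (Euclidean) gcd: one remainder operation per step. */
uint64_t gcd_euclid(uint64_t a, uint64_t b)
{
    while (b != 0) {
        uint64_t r = a % b;
        a = b;
        b = r;
    }
    return a;
}

/* Binary gcd: only shifts, comparisons and subtractions. */
uint64_t gcd_binary(uint64_t a, uint64_t b)
{
    unsigned shift = 0;

    if (a == 0) return b;
    if (b == 0) return a;

    /* Factor out the common power of two. */
    while (((a | b) & 1) == 0) {
        a >>= 1;
        b >>= 1;
        shift++;
    }
    /* Make a odd; it stays odd from here on. */
    while ((a & 1) == 0)
        a >>= 1;

    while (b != 0) {
        while ((b & 1) == 0)
            b >>= 1;
        if (a > b) {
            uint64_t t = a;
            a = b;
            b = t;
        }
        b -= a; /* odd minus odd: even or zero */
    }
    return a << shift;
}
```

Which one wins depends on the division latency of the target CPU, which is the kind of question a tuning program can answer by measurement rather than by keeping both variants permanently.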
I think it is nice if tuning things are gathered nicely together. And I would argue that the tuning is part of the source code, but I'm okay with
Remove the `n_mod` part from #1991, and just focus on the tuning suite.
To keep this concrete, the goal of this PR is to lay the foundation for a somewhat modular tuning suite and to include tuners for `n_xgcd` and `n_gcdinv`, `flint_mpn_mulhigh_n` and `flint_mpn_sqrhigh`.
Btw, @fredrik-johansson do you know why the current table for k-values for Mulders' high multiplication never favors `_flint_mpn_mulhigh_basecase`? It feels like it should favor the basecase at least for n < 20.
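As an illustration of what a tuner for a basecase threshold might compute, here is a small sketch that derives a cutoff from two arrays of measured timings. The function name and the data layout are assumptions, not part of this PR or of FLINT; a real tuner would also have to produce the timings and cope with measurement noise.

```c
#include <stddef.h>

/* t_base[i] and t_alt[i] hold the measured time (any unit) of the basecase
   and of the alternative algorithm for size n = i + 1.
   Returns the smallest n from which the alternative is never slower again,
   or len + 1 if the alternative is slower at the largest measured size. */
size_t find_cutoff(const double *t_base, const double *t_alt, size_t len)
{
    size_t cutoff = len + 1;

    /* Walk down from the largest size while the alternative still wins. */
    for (size_t i = len; i-- > 0; ) {
        if (t_alt[i] <= t_base[i])
            cutoff = i + 1;
        else
            break;
    }
    return cutoff;
}
```

Scanning from the top means an isolated win at a small size does not lower the cutoff, so a table built this way would favor the basecase for every n below the returned value.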