Inside the source code on github, the function.ref file describes M_pow_mod, which is exactly what I'm looking for, but when I search the rest of the source code I can't find it mentioned anywhere else. This does look unusual, in that it uses the parameters as (exponent,base,modulus) instead of the normal (base,exponent,modulus)
It can be found in mapm_pwr3.c as m_apm_powmod(). It was not in the C++ wrapper either.
I have added support for it and added a $powmod() identifier that calls this function. Your script completes in half the time when using it.
I also implemented your $bf_modpow() identifier in C.
My development computer is about ten years old, so it is pretty slow. When I run your script with the following, the timings are:
$bf_modpow(): 2048 -> 7.4s, 4096 -> 57s
m_apm_powmod(): 2048 -> 3.7s, 4096 -> 29s
$bf_modpow() in C: 2048 -> 2.9s, 4096 -> 23s
All methods return the same results.
So the $bf_modpow() in C is faster than the m_apm_powmod() function. However, there seems to be more going on in m_apm_powmod() but I cannot tell which implementation is better in terms of edge cases, etc.
Let me know which one you want $powmod() to use.
Update: I am going to go with the faster implementation for the next beta and we can see how it works out.