Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-117999: use generic algorithm in complex_pow() if base has special components #123283

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

skirpichev
Copy link
Member

@skirpichev skirpichev commented Aug 24, 2024

@skirpichev
Copy link
Member Author

CC @picnixz

See also #60200

@picnixz
Copy link
Member

picnixz commented Aug 24, 2024

I had some thoughts on that matter. For small integer exponents $b = \Re(b) + \Im(b)\cdot\jmath$ with $\Im(b) = 0$ and $\Re(b) \in \mathbb{Z}$, we would now call _Py_c_pow in https://github.com/python/cpython/blob/main/Objects/complexobject.c:

vabs = hypot(a.real,a.imag);
len = pow(vabs,b.real);
at = atan2(a.imag, a.real);
phase = at*b.real;
/* if (b.imag != 0.0) {
    len /= exp(at*b.imag);
    phase += b.imag*log(vabs);
} */ // ignored since b.imag == 0
r.real = len*cos(phase);
r.imag = len*sin(phase);

Let $a = r(\cos \phi + \jmath \sin \phi)$ and $b = n\in\mathbb{Z}$. De Moivre's formula tells me that $a^{n} = r^{n}(\cos n\phi + \jmath \sin n\phi)$. So, we could technically have an alternative path for the case $n = 2k$ since we can directly compute vabs^2 instead of vabs. Since hypot treats special cases when inputs are quiet NaN and infinities, or subnormal numbers, we should only use this path when $a$ has finite components. We would save one square root in the computation of hypot and compute len = pow(vabs2, b.real // 2) instead.

I don't know whether the costly computation is due to hypot, atan2 or cos/sin computations. Maybe it won't change anything (and perhaps it would even be slower). Could you therefore check the timings of odd powers (your benchmarks only show even powers and they are twice slower with this patch)?

@skirpichev
Copy link
Member Author

skirpichev commented Aug 24, 2024

Could you therefore check the timings of odd powers (your benchmarks only show even powers and they are twice slower with this patch)?

Yeah, below are tests copied from #118000 (comment) with few more with odd exponents:

With specialized code (main):

$ python -m pyperf timeit -q -s 'z=1+1j;e=20' 'pow(z,e)'
Mean +- std dev: 237 ns +- 5 ns
$ python -m pyperf timeit -q -s 'z=1+1j;e=25' 'pow(z,e)'
Mean +- std dev: 252 ns +- 2 ns
$ python -m pyperf timeit -q -s 'z=1+1j;e=85' 'pow(z,e)'
Mean +- std dev: 277 ns +- 12 ns
$ python -m pyperf timeit -q -s 'z=1+1j;e=90' 'pow(z,e)'
Mean +- std dev: 259 ns +- 4 ns
$ python -m pyperf timeit -q -s 'z=1+1j;e=95' 'pow(z,e)'
Mean +- std dev: 283 ns +- 2 ns
$ python -m pyperf timeit -q -s 'z=1+1j;e=110' 'pow(z,e)'
Mean +- std dev: 579 ns +- 5 ns
$ python -m pyperf timeit -q -s 'z=1+1j;e=200' 'pow(z,e)'
Mean +- std dev: 571 ns +- 3 ns

Without:

$ python -m pyperf timeit -q -s 'z=1+1j;e=2' 'pow(z,e)'
Mean +- std dev: 562 ns +- 3 ns
$ python -m pyperf timeit -q -s 'z=1+1j;e=20' 'pow(z,e)'
Mean +- std dev: 576 ns +- 2 ns
$ python -m pyperf timeit -q -s 'z=1+1j;e=25' 'pow(z,e)'
Mean +- std dev: 584 ns +- 33 ns
$ python -m pyperf timeit -q -s 'z=1+1j;e=85' 'pow(z,e)'
Mean +- std dev: 583 ns +- 8 ns
$ python -m pyperf timeit -q -s 'z=1+1j;e=90' 'pow(z,e)'
Mean +- std dev: 576 ns +- 3 ns
$ python -m pyperf timeit -q -s 'z=1+1j;e=95' 'pow(z,e)'
Mean +- std dev: 578 ns +- 5 ns
$ python -m pyperf timeit -q -s 'z=1+1j;e=110' 'pow(z,e)'
Mean +- std dev: 578 ns +- 4 ns
$ python -m pyperf timeit -q -s 'z=1+1j;e=200' 'pow(z,e)'
Mean +- std dev: 573 ns +- 3 ns

@picnixz
Copy link
Member

picnixz commented Aug 24, 2024

Without:

Can I have the numbers for odd exponents without as well please?

@skirpichev
Copy link
Member Author

Sure, comment was updated. As you can see, results aren't biased to even/odd integers.

@picnixz
Copy link
Member

picnixz commented Aug 24, 2024

Do you think it's worth checking the even case for a fast path or do you think we should just live with those this loss of performance? (personally, in this case, I'm for correctness rather than speed).

@skirpichev
Copy link
Member Author

I think it not worth. Certainly, this can't restore speed for small integers (for even case).

picnixz
picnixz previously approved these changes Aug 24, 2024
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry but I approved without commenting ...

@skirpichev skirpichev requested a review from picnixz August 25, 2024 01:27
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said, I'm in favor of correctness rather than speed in this case. But since #60200 is still under discussion, I'll be interested in hearing from @mdickinson and @serhiy-storchaka for this one.

@vstinner
Copy link
Member

Can you update your PR? There is now a conflict (with the ERANGE fix).

@vstinner
Copy link
Member

So Python has its own cpow() implementation:

Py_complex
_Py_c_pow(Py_complex a, Py_complex b)
{
    Py_complex r;
    double vabs,len,at,phase;
    if (b.real == 0. && b.imag == 0.) {
        r.real = 1.;
        r.imag = 0.;
    }
    else if (a.real == 0. && a.imag == 0.) {
        if (b.imag != 0. || b.real < 0.)
            errno = EDOM;
        r.real = 0.;
        r.imag = 0.;
    }
    else {
        vabs = hypot(a.real,a.imag);
        len = pow(vabs,b.real);
        at = atan2(a.imag, a.real);
        phase = at*b.real;
        if (b.imag != 0.0) {
            len /= exp(at*b.imag);
            phase += b.imag*log(vabs);
        }
        r.real = len*cos(phase);
        r.imag = len*sin(phase);

        _Py_ADJUST_ERANGE2(r.real, r.imag);
    }
    return r;
}

Can we reuse the cpow() of the C library instead? A C11 compiler is now required to build CPython: https://docs.python.org/dev/using/configure.html#build-requirements

@vstinner vstinner changed the title gh-117999: drop special case for small integer exponents gh-117999: drop special case for small integer exponents in complex_pow() Sep 18, 2024
@skirpichev
Copy link
Member Author

Can you update your PR? There is now a conflict (with the ERANGE fix).

Done.

Can we reuse the cpow() of the C library instead? A C11 compiler is now required to build CPython: https://docs.python.org/dev/using/configure.html#build-requirements

Yet we don't require optional C11 features, cpow() - is from Annex G. Not even all compilers implement one (mvsc for example, though it has something similar).

And I'm not sure about quality of libm implementations for that part of the standard. After support for C complex in ctypes was merged, I did tests (same as in CMathTests.test_specific_values) of cmath_testcases.txt on top of Debian's 12 default gcc & glibc: to my surprise it was working (modulo two simple failures from changes in C17, now fixed). But perhaps I'm just lucky Linux user.

@vstinner
Copy link
Member

I'm not sure that I understood correctly the purpose of this change. Is it supposed to fix some corner cases and enhance correctness? The change only adds 2 tests which is small.

@serhiy-storchaka
Copy link
Member

Try complex(-1, 0)**1.

@vstinner
Copy link
Member

Try complex(-1, 0)**1.

Without the change, I get:

$ ./python
>>> complex(-1, 0)**1
(-1+0j)

With the change, I get:

>>> complex(-1, 0)**1
(-1+1.2246467991473532e-16j)

@skirpichev
Copy link
Member Author

Is it supposed to fix some corner cases and enhance correctness?

Well, yes. See examples in the issue description:

>>> z = complex(1,-0.0)
>>> z**2
(1+0j)
>>> z*z
(1-0j)
>>> z**2000  # large powers or non-integer - have own opinion
(1-0j)
>>> >>> z**2.1
(1-0j)
>>> _testcapi._py_c_pow(z, 2)[0]  # as well as _Py_c_pow() algorithm
(1-0j)

The change only adds 2 tests which is small.

Agreed. I'll work on this.

Try complex(-1, 0)**1

That's something, which now is more precise in the main, as an accident. But:

>>> _testcapi._py_c_pow(z, 1)[0]
(-1+1.2246467991473532e-16j)
>>> libm.cpow(z, 1)  # libm's cpow
(-1+1.2246467991473532e-16j)

@skirpichev skirpichev marked this pull request as draft September 19, 2024 10:45
@serhiy-storchaka
Copy link
Member

See also #124243.

@skirpichev skirpichev changed the title gh-117999: drop special case for small integer exponents in complex_pow() gh-117999: use generic algorithm in complex_pow() if base has special components Jan 20, 2025
@skirpichev skirpichev marked this pull request as ready for review February 4, 2025 00:58
@skirpichev
Copy link
Member Author

skirpichev commented Feb 4, 2025

ok, now it's ready for review again.

The fix now is restricted to arguments with special components in the base. That means, e.g. we can get wrong signs when special components appear in intermediate computations. An example from the issue thread:

>>> import _testcapi
>>> z = -1-1j
>>> _testcapi._py_c_pow(z, 6)[0]
(4.408728476930473e-15-8.000000000000004j)
>>> z**6
(-0-8j)
>>> z*z*z*z*z*z
-8j

The sign here is just an artifact of special ordering of multiplications, it's different for same base with other exponents:

>>> z**10
32j
>>> _testcapi._py_c_pow(z, 10)[0]
(-8.623494204034455e-14+32.00000000000002j)
>>> z**14
(-0-128j)
>>> _testcapi._py_c_pow(z, 14)[0]
(-6.278114563782784e-14-128.0000000000001j)

The only possible solution is removing special algorithm for integer exponents. Edit: BTW, I'm in favor of this option (was - previous variant) also on the ground of compatibility with C (which has only cpow() function). The special algorithm for integer exponents introduces mixed-mode arithmetic for powers, but inconsistently: only for small integer exponents. It's better to have here just z**real = exp(real*log(z)) (with multiplication taken component-wise).

CC @picnixz
CC @serhiy-storchaka

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants