Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type system implementation #2: Added magic/dunder method support for add operator on scalar types. #9669

Open
wants to merge 22 commits into
base: main
Choose a base branch
from

Conversation

kc611
Copy link
Contributor

@kc611 kc611 commented Jul 24, 2024

As titled,

This PR removes import guard and adds dunder method support for operator.add with scalar types.

Note: This PR builds on top of #9662

@kc611 kc611 marked this pull request as ready for review August 12, 2024 12:13
Copy link
Member

@sklam sklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First patch of review is for everything but the new_builtins.py.

Suggestions:

  • I don't think we really need to split pythonapi.py. On both the old and new version, the types are always referring to the C-level types. To minimize the diff, you can swap types.intp to self.py_ssize_t etc. The PythonAPI.__init__ can have all the type definition and the required switching logic for old/new type system.

numba/core/base.py Outdated Show resolved Hide resolved
numba/core/datamodel/new_models.py Show resolved Hide resolved
numba/core/lowering.py Outdated Show resolved Hide resolved
numba/core/lowering.py Outdated Show resolved Hide resolved
numba/core/new_pythonapi.py Outdated Show resolved Hide resolved
numba/tests/test_new_type_system.py Outdated Show resolved Hide resolved
numba/tests/test_new_type_system.py Outdated Show resolved Hide resolved
numba/tests/test_new_type_system.py Outdated Show resolved Hide resolved
numba/tests/test_new_type_system.py Outdated Show resolved Hide resolved
numba/tests/test_new_type_system.py Outdated Show resolved Hide resolved
Copy link
Member

@sklam sklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This review is for the __add__, __radd__ on bool, int, float, complex.

I did some digging into the behavior of the binops on the builtin types. They never calls the casting functions e.g. __int__, __float__ in userspace. They are hardcoded to call internal cast that cannot be overridden. For instance, complex ops use: https://github.com/python/cpython/blob/9621a7d0170bf1ec48bcfc35825007cdf75265ea/Objects/complexobject.c#L481-L501

Another example to demonstrate that the __<type>__ method is never called:

class MyOps:
    def __bool__(self):
        print("__bool__")
        return True

    def __int__(self):
        print("__int__")
        return 123

    def __float__(self):
        print("__float__")
        return 12.3

    def __complex__(self):
        print("__complex__")
        return 1   2.3j

class MyInt(MyOps, int):
    pass
class MyFloat(MyOps, float):
    pass
class MyComplex(MyOps, complex):
    pass


assert issubclass(MyInt, int)
print( bool(True)   MyInt(0) )
print( int(10)   MyInt(0) )
print( float(9.1)   MyInt(0) )
print( complex(1.3j)   MyInt(0) )
# Prints:
# 1
# 10
# 9.1
# 1.3j
assert issubclass(MyFloat, float)
print( bool(True)   MyFloat(0) )
print( int(10)   MyFloat(0) )
print( float(9.1)   MyFloat(0) )
print( complex(1.3j)   MyFloat(0) )
# Prints:
# 1.0
# 10.0
# 9.1
# 1.3j
assert issubclass(MyComplex, complex)
print( bool(True)   MyComplex(0) )
print( int(10)   MyComplex(0) )
print( float(9.1)   MyComplex(0) )
print( complex(1.3j)   MyComplex(0) )
# (1 0j)
# (10 0j)
# (9.1 0j)
# 1.3j

There is just no way for user to override these operation base by using the __<type>__() method.

numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
Copy link
Member

@sklam sklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This batch of review is from trying to extend __add__ to the Interval type. See https://gist.github.com/sklam/c60a989c927fefd901475a84516f7812 (main code starts at line 171). This is to make sure the protocol-based approach is truely extensible.

numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
Copy link
Member

@sklam sklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed everything up to and including 03c32a0

numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
numba/core/typing/new_builtins.py Show resolved Hide resolved
numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
@sklam
Copy link
Member

sklam commented Aug 14, 2024

There are many PEP8 errors introduced in this patch. Can you run the script in #9704 to find and fix these errors?

Copy link
Member

@sklam sklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following are violations from flake8:

Run flake8_diff by

git fetch origin pull/9704/head:pr/9704
git checkout pr/9704 maint/flake8_diff.py
python maint/flake8_diff.py

i excluded violations on changes that are clones of old files.

numba/core/new_boxing.py Outdated Show resolved Hide resolved
numba/core/types/misc.py Show resolved Hide resolved
numba/core/types/misc.py Outdated Show resolved Hide resolved
numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
numba/core/typing/new_builtins.py Outdated Show resolved Hide resolved
return float_add_float(self, asfloat)
if isinstance(other, float):
return float_add_float(self, other)
elif isinstance(other, (int, bool, np.float64)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python float should not know about numpy types. Handling of np.float64 should be done in NumPy binops. Also, the result type would be numpy types.

Copy link
Contributor Author

@kc611 kc611 Aug 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The primary issue is if we remove this the following behaviour happens:

import numba
import numpy as np

@numba.njit
def foo(v, w):
    return v.__add__(w)

x = 1.1
y = np.float64(5.5)
print(foo.py_func(x, y))
# 6.6
print(foo(x, y))
# NotImplemented

def impl(self, other):
if isinstance(other, complex):
return complex_add_complex(self, other)
elif isinstance(other, (float, int, bool, np.float64, np.complex128)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment about numpy types

Copy link
Member

@sklam sklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid <type>() or __<type>__(). Add ways to extract machine repr.

The overloads for __add__ can never call <type>() because they end up calling __<type>__---CPython semantic does not do that. Considering the extensibility of the builtin number types, I think Numba need to either:

  1. restrict these types to retain the datamodel of the base number type. For example, subclasses of int will be a i64 in LLVM. or
  2. implement intrinsics to extract the machine representation for each of the number types. They will be the equivalent of PyFloat_AS_DOUBLE for float.

Test failures

% NUMBA_USE_LEGACY_TYPE_SYSTEM=0 python runtests.py numba/tests/test_new_type_system.py -v
test_add (numba.tests.test_new_type_system.TestDunderMethods.test_add) ... ok
test_dunder_add (numba.tests.test_new_type_system.TestDunderMethods.test_dunder_add) ... FAIL
test_dunder_radd (numba.tests.test_new_type_system.TestDunderMethods.test_dunder_radd) ... FAIL
test_return_types (numba.tests.test_new_type_system.TestTypes.test_return_types) ... ok

======================================================================
FAIL: test_dunder_add (numba.tests.test_new_type_system.TestDunderMethods.test_dunder_add)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/dev/numba/numba/tests/test_new_type_system.py", line 77, in test_dunder_add
    self.assertEqual(res, py_res, (
AssertionError: NotImplemented != (5 5j) : Failed for (1 2j) and (4 3j); gave answer NotImplemented should be (5 5j)

======================================================================
FAIL: test_dunder_radd (numba.tests.test_new_type_system.TestDunderMethods.test_dunder_radd)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/dev/numba/numba/tests/test_new_type_system.py", line 93, in test_dunder_radd
    self.assertEqual(res, py_res, (
AssertionError: NotImplemented != (5 5j) : Failed for (1 2j) and (4 3j); gave answer NotImplemented should be (5 5j)

----------------------------------------------------------------------
Ran 4 tests in 60.690s

FAILED (failures=2)

@@ -372,9 360,9 @@ def impl(self):
def py_bool__add__(self, other):
def impl(self, other):
if isinstance(other, bool):
return bool_add_bool(self, other)
return bool_add_bool(self, bool(other))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does other need to be casted to bool?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no bool_add_bool in CPython

>>> True   True
2

It is just int_add_int.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, these overloads cannot call int(), bool(), float(), complex() ever. The CPython semantic do not do that. Calling this will allow user to override with __<type>__().

elif isinstance(other, int):
return int_add_int(int(self), other)
return int_add_int(int(self), int(other))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cannot call int()

return float_add_float(self, other)
elif isinstance(other, (int, bool, np.float64)):
if isinstance(other, (bool, int, float)):
# Cast is required in case the other is a NumPy float
return float_add_float(self, float(other))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cannot call float()

return complex_add_complex(self, other)
elif isinstance(other, (float, int, bool, np.float64, np.complex128)):
if isinstance(other, (bool, int, float, complex)):
# Cast is required in case the other is a NumPy complex
return complex_add_complex(self, complex(other))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cannot call complex()

Copy link
Member

@sklam sklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put my review into a diff to show what I mean: sklam@d1e4388

  • Avoid call to context.cast in the operator implementation. Put them into a specific cast intrinsic. We need to control what cast are allowed to match Python semantic.
  • The diff passes the test but I haven't looked into the changes in NumPy impl.

Added CPython based `__add__` semantics
@kc611
Copy link
Contributor Author

kc611 commented Aug 23, 2024

I added the diff into this PR as a whole.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants