feat: add support for specifying a data type "kind" in `astype` #848

lucascolley · 2024-10-24T14:18:27Z

This PR:

Addresses a comment made in "Arithmetic between real arrays and Python complex scalars (revisited)" #841 (comment) and should close "How to infer appropriate dtype from uint to int and float to complex?" #859 by adding data type "kind" support in `astype`.

src/array_api_stubs/_draft/data_type_functions.py

rgommers

Thanks for getting the ball rolling on this @lucascolley! Some comments inline.

rgommers · 2024-10-26T05:13:20Z

src/array_api_stubs/_draft/data_type_functions.py

+        For ``dtype_or_kind`` a data type, an array having the specified data type.
+        For ``dtype_or_kind`` a kind of data type:
+        -   If ``x.dtype`` is already of that kind, the data type is maintained.
+        -   Otherwise, an attempt is made to convert to the specified kind, according to the type promotion rules (see :ref:`type-promotion`).


Why "an attempt"? That seems ambiguous. We have to be clear about what must work. Which I think is:

- float to complex
- unsigned to signed integer

Anything else doesn't, I think? There's no point allowing 'bool' I think, since there is only one boolean dtype, so dtype=xp.bool will be cleaner.

For 'signed integer' and 'real floating-point' there are also no promotion rules to follow, so they can be left out - or do you see a use case?

lucascolley · Oct 27, 2024

I've reduced this down to just 'complex floating' (use-case: mixed float/complex to complex) and 'signed integer' (use-case: mixed signed/unsigned to signed).

I think "an attempt" would still be accurate for an implementation of this? xp.astype(some_int8_array, 'complex floating') would attempt a conversion, whose success will depend on the implementation-specific type promotion rules, right?

Unless you think that this function should always error when the type promotion is not defined by the standard?

rgommers · Oct 27, 2024

> I think "an attempt" would still be accurate for an implementation of this?

I think you have the right idea in mind here, it's just a "language we use to specify things" thing. We specify which behavior has to be supported - 'complex floating' has type promotion rules defined in the standard, so it's expected to always work for a compliant implementation. Then, if we expect other input types to raise, we specify that by "must raise ..." or "input type must be ...". In this case there's no reason to do that (implementors are free to support more types, it's just not standardized), so we then say "input type should be ...".

Your "attempt to ..." seems to be the same as "should be ...", it's just language we want to write in a uniform way.

lucascolley · Oct 28, 2024

How about the wording now?

kgryte · Jan 23, 2025 (edited)

Quick update: I took the liberty of updating the wording. I also made the call to broaden the list of data type kinds. I think there are reasonable arguments for providing any one of the numeric data type kinds (e.g., int32 and real floating => float64, etc.), and it is possible to delineate a set of clearly defined rules in terms of which data type should be returned. Leaving bool out seems somewhat arbitrary, especially when the semantics are clearly specified and all other kinds can be, IMO, reasonably provided (note: even including "numeric"; i.e., convert anything provided to me to numbers so I can compute the sum, etc.).
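For readers following the thread, the two conversions listed earlier as required (real floating-point to complex, unsigned to signed integer) can be approximated today by promoting against the smallest member of the target kind. The sketch below uses NumPy's `np.result_type` and `ndarray.astype`; the helper names `_promote_into`, `to_complex`, and `to_signed` are hypothetical and are not part of this PR or the standard. NumPy's promotion rules are broader than the standard's (for example, NumPy promotes `uint64` with signed integers to `float64`), so the sketch rejects results that leave the requested kind.

```python
import numpy as np


def _promote_into(x: np.ndarray, smallest: type) -> np.ndarray:
    """Promote x against the smallest dtype of the target kind (hypothetical helper)."""
    target = np.result_type(x.dtype, smallest)
    # NumPy defines promotions the standard leaves unspecified (e.g. uint64
    # with int8 gives float64); refuse results outside the requested kind.
    if target.kind != np.dtype(smallest).kind:
        raise TypeError(f"{x.dtype} does not promote into the requested kind")
    return x.astype(target, copy=False)


def to_complex(x: np.ndarray) -> np.ndarray:
    """float32 -> complex64, float64 -> complex128; complex input is unchanged."""
    return _promote_into(x, np.complex64)


def to_signed(x: np.ndarray) -> np.ndarray:
    """uint8 -> int16, uint16 -> int32, uint32 -> int64; signed input is unchanged."""
    return _promote_into(x, np.int8)


print(to_complex(np.ones(2, dtype=np.float32)).dtype)  # complex64
print(to_signed(np.ones(2, dtype=np.uint16)).dtype)    # int32
```

With a kind-aware `astype`, each helper would collapse to a single call, which is the convenience this PR is after.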

lucascolley · 2024-12-13T10:45:24Z

Gentle reminder of the 2024 milestone here!

lucascolley

Just one nit; otherwise this new wording LGTM, thanks @kgryte!

lucascolley · 2025-01-25T20:45:49Z

src/array_api_stubs/_draft/data_type_functions.py

-       When casting a boolean input array to a real-valued data type, a value of ``True`` must cast to a real-valued number equal to ``1``, and a value of ``False`` must cast to a real-valued number equal to ``0``.
+                -   When applying type promotion rules, the returned array must have the lowest-precision data type belonging to the specified data type kind to which the data type of ``x`` promotes (e.g., if ``x`` is ``float32`` and the data type kind is ``'complex floating'``, then the returned array must have the data type ``complex64``; if ``x`` is ``uint16`` and the data type kind is ``'signed integer'``, then the returned array must have the data type ``int32``).
+                -   When type promotion rules are not specified between the data type of ``x`` and the specified data type kind (e.g., ``int16`` and ``'real floating'``) and there exist one or more data types belonging to the specified data type kind in which the elements in ``x`` can be represented exactly (e.g., ``int32`` and ``float64``), the function must return an array having the smallest data type (in terms of range of values) capable of precisely representing the elements of ``x`` (e.g., if ``x`` is ``int16`` and the data type kind is ``'complex floating'``, then the returned array must have the data type ``complex64``; if ``x`` is ``bool`` and the data type kind is ``'integral'``, then the returned array must have the data type ``int8``).
+                -   When type promotion rules are not specified between the data type of ``x`` and the specified data type kind and there neither exists a data type belonging to the specified data type in which the elements of ``x`` can be represented exactly (e.g., ``uint64`` and ``'real floating'``) nor are there applicable casting rules documented below, behavior is unspecified and thus implementation-defined.


Suggested change

-                -   When type promotion rules are not specified between the data type of ``x`` and the specified data type kind and there neither exists a data type belonging to the specified data type in which the elements of ``x`` can be represented exactly (e.g., ``uint64`` and ``'real floating'``) nor are there applicable casting rules documented below, behavior is unspecified and thus implementation-defined.
+                -   When type promotion rules are not specified between the data type of ``x`` and there neither exists a data type belonging to the specified data type in which the elements of ``x`` can be represented exactly (e.g., ``uint64`` and ``'real floating'``) nor are there applicable casting rules documented below, behavior is unspecified and thus implementation-defined.

kgryte · Feb 6, 2025

I don't think this suggestion makes sense atm, as the removed phrase is part of

> "between the data type of x and the specified data type kind"

Removing "the specified data type kind" removes the y in "between x and y".
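One way to read the three bullets in the proposed docstring is as an ordered lookup: use the promotion rules when they exist, otherwise fall back to the smallest data type that represents every element exactly, otherwise treat the result as unspecified. The following is a minimal pure-Python sketch under that reading. The function `resolve_kind`, the tables, and the use of dtype names as plain strings are all hypothetical; only three kinds are covered; the "already of that kind" shortcut comes from the earlier draft wording quoted in this PR; and the "casting rules documented below" mentioned in the diff are not modeled.

```python
from typing import Optional

# Members of each data type kind, ordered from lowest to highest precision.
KIND_MEMBERS = {
    "signed integer": ("int8", "int16", "int32", "int64"),
    "real floating": ("float32", "float64"),
    "complex floating": ("complex64", "complex128"),
}

# Cross-kind results that follow from the standard's type promotion rules.
PROMOTES_TO = {
    ("uint8", "signed integer"): "int16",
    ("uint16", "signed integer"): "int32",
    ("uint32", "signed integer"): "int64",
    ("float32", "complex floating"): "complex64",
    ("float64", "complex floating"): "complex128",
}

# Magnitude bits needed to hold every value of a boolean or integer dtype.
REQUIRED_BITS = {
    "bool": 1, "int8": 7, "int16": 15, "int32": 31, "int64": 63,
    "uint8": 8, "uint16": 16, "uint32": 32, "uint64": 64,
}

# Magnitude bits of integers that each target dtype represents exactly.
EXACT_BITS = {
    "int8": 7, "int16": 15, "int32": 31, "int64": 63,
    "float32": 24, "float64": 53, "complex64": 24, "complex128": 53,
}


def resolve_kind(x_dtype: str, kind: str) -> Optional[str]:
    """Return the result dtype name, or None when behavior is unspecified."""
    members = KIND_MEMBERS[kind]
    if x_dtype in members:
        return x_dtype  # already of the requested kind: dtype is maintained
    promoted = PROMOTES_TO.get((x_dtype, kind))
    if promoted is not None:
        return promoted  # bullet 1: type promotion rules apply
    needed = REQUIRED_BITS.get(x_dtype)
    if needed is not None:
        for member in members:  # bullet 2: smallest exact representation
            if EXACT_BITS[member] >= needed:
                return member
    return None  # bullet 3: unspecified, implementation-defined


print(resolve_kind("float32", "complex floating"))  # complex64
print(resolve_kind("int32", "real floating"))       # float64
print(resolve_kind("uint64", "real floating"))      # None
```

The printed cases mirror the examples given in the diff and in the review comments; anything returning `None` is where implementations would be free to differ.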

kgryte · 2025-02-06T03:08:23Z

As discussed during the workgroup meeting on January 23, 2025, given that this is new behavior without any downstream implementations, we decided to do two things:

- Punt to v2025 in order to avoid rushing this into the spec.
- Incubate the API in array-api-compat in order to work out any design issues after having real-world implementations.

@ev-br Would you mind adding to array-api-compat and circling back here?

feat: astype: accept a kind of data type

2a82758

lucascolley commented Oct 24, 2024


lucascolley added 2 commits October 24, 2024 14:24

try to fix docs build

419041b

change signature

302cf1a

lucascolley marked this pull request as ready for review October 24, 2024 14:33

run pre-commit

3186995

rgommers added the "API change" label (Changes to existing functions or objects in the API.) Oct 26, 2024

rgommers reviewed Oct 26, 2024

lucascolley added 4 commits October 27, 2024 16:20

address review comments

90aa750

adjust wording

41880c0

add note on unspecified promotion

d4450b3

improve formatting

1b056ad

lucascolley mentioned this pull request Oct 28, 2024

ENH: real and complex dtype functions data-apis/array-api-extra#13

Open

kgryte added this to the v2024 milestone Oct 31, 2024

kgryte self-requested a review October 31, 2024 05:40

kgryte added the "Needs Review" label (Pull request which needs review.) Oct 31, 2024

lucascolley mentioned this pull request Nov 26, 2024

How to infer appropriate dtype from uint to int and float to complex? #859

Open

asmeurer reviewed Nov 27, 2024


asmeurer reviewed Nov 27, 2024


lucascolley mentioned this pull request Dec 9, 2024

ENH: signal.vectorstrength: add array API standard support scipy/scipy#22008

Merged

kgryte added 3 commits January 23, 2025 00:10

refactor: update guidance to generalize across data type kinds

169ea5e

fix: update copy

ecd0f59

docs: update copy

b61122a

lucascolley commented Jan 25, 2025

kgryte changed the title ~~feat: astype: accept a kind of data type~~ feat: add support for specifying a data type "kind" in astype Feb 6, 2025

kgryte modified the milestones: v2024, v2025 Feb 6, 2025
