New Rule: Prefer list comprehension over generator comprehensions to create tuples #11839

Avasam · 2024-06-11T17:32:40Z

I was recently working on some code where most of my data had to be "readonly" (so I'm using immutable types like frozen dataclasses, frozensets, tuples, etc.) while also using plenty of comprehensions. That made me wonder, since there's no "tuple comprehension" in Python, how I should be writing this code. I did a bit of performance testing, and here are the results:

import sys
from timeit import timeit

print(sys.version)
big_list = ["*"] * 99

def foo(value: str): return value

def test_list_comprehension():
    return [foo(value) for value in big_list]

def test_tuple_from_list_comprehension():
    return tuple([foo(value) for value in big_list])

def test_tuple_from_generator_comprehension():
    return tuple(foo(value) for value in big_list)

def test_unpack_generator_comprehension():
    return (*(foo(value) for value in big_list),)

print(
    "test_list_comprehension",
    timeit(test_list_comprehension),
)
print(
    "test_tuple_from_list_comprehension",
    timeit(test_tuple_from_list_comprehension),
)
print(
    "test_tuple_from_generator_comprehension",
    timeit(test_tuple_from_generator_comprehension),
)
print(
    "test_unpack_generator_comprehension",
    timeit(test_unpack_generator_comprehension),
)

3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)]
test_list_comprehension 6.4194597
test_tuple_from_list_comprehension 6.9672235
test_tuple_from_generator_comprehension 8.996260200000002
test_unpack_generator_comprehension 11.207814599999999

3.12.0 (tags/v3.12.0:0fb18b0, Oct  2 2023, 13:03:39) [MSC v.1935 64 bit (AMD64)]
test_list_comprehension 5.656617900000128
test_tuple_from_list_comprehension 6.026029500000277
test_tuple_from_generator_comprehension 9.207803900000272
test_unpack_generator_comprehension 10.375420500000018

Unsurprisingly, the difference is even greater in 3.12, where list comprehensions are inlined (PEP 709).

Because tuple() consumes the generator immediately, you get no benefit from its laziness. This is probably true for other stdlib collections that don't have a comprehension syntax; tuple is just the only one I can think of at the moment.

For this reason, I'm asking for a performance rule with an autofix that transforms code like this:

tuple(a for a in b)

into

tuple([a for a in b])

Which, unless I'm missing something, is free performance whilst staying readable and pythonic.

It seems this would fit well in the flake8-comprehensions or refurb family of rules.
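For illustration, the proposed transformation can be sketched with the standard-library `ast` module. This is only a rough approximation of what an autofix might do, not how Ruff implements rules: the class name `TupleGeneratorToList` and the helper `rewrite` are hypothetical, the sketch assumes `tuple` has not been shadowed, and `ast.unparse` (Python 3.9+) regenerates the source, so comments and formatting are lost.

```python
import ast


class TupleGeneratorToList(ast.NodeTransformer):
    """Hypothetical rewrite: tuple(<generator>) -> tuple([<list comprehension>])."""

    def visit_Call(self, node: ast.Call) -> ast.AST:
        # Rewrite nested calls first.
        self.generic_visit(node)
        # Assumption: the name `tuple` refers to the builtin (no shadowing check).
        is_tuple_call = isinstance(node.func, ast.Name) and node.func.id == "tuple"
        if (
            is_tuple_call
            and len(node.args) == 1
            and not node.keywords
            and isinstance(node.args[0], ast.GeneratorExp)
        ):
            gen = node.args[0]
            # Same element expression and for/if clauses, wrapped in a list comprehension.
            node.args[0] = ast.ListComp(elt=gen.elt, generators=gen.generators)
        return node


def rewrite(source: str) -> str:
    """Return `source` with every tuple(<generator>) call rewritten."""
    tree = TupleGeneratorToList().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(rewrite("tuple(a for a in b)"))
```

Calls to anything other than `tuple`, or `tuple` calls with a non-generator argument, are left untouched by this sketch.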


tdulcet · 2024-06-12T09:42:54Z

Using your script, I see less of a difference on Linux with CPython:

3.12.3 (main, Apr 10 2024, 05:33:47) [GCC 13.2.0]
test_list_comprehension 3.4234498779999853
test_tuple_from_list_comprehension 3.7473174160000156
test_tuple_from_generator_comprehension 4.684340659999975
test_unpack_generator_comprehension 4.938177772000017

3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
test_list_comprehension 5.881427110000001
test_tuple_from_list_comprehension 6.184221321999999
test_tuple_from_generator_comprehension 6.949574859000002
test_unpack_generator_comprehension 7.213964431000001

3.8.10 (default, May 26 2023, 14:05:08)
[GCC 9.4.0]
test_list_comprehension 5.159386299999994
test_tuple_from_list_comprehension 5.68906659999999
test_tuple_from_generator_comprehension 6.2835374
test_unpack_generator_comprehension 6.521585700000003

3.6.9 (default, Dec  8 2021, 21:08:43)
[GCC 8.4.0]
test_list_comprehension 5.325352799999997
test_tuple_from_list_comprehension 5.670514699999998
test_tuple_from_generator_comprehension 6.860152300000003
test_unpack_generator_comprehension 7.0944126999999995

2.7.17 (default, Feb 27 2021, 15:10:58)
[GCC 7.5.0]
('test_list_comprehension', 5.888335943222046)
('test_tuple_from_list_comprehension', 6.135804891586304)
('test_tuple_from_generator_comprehension', 6.965441942214966)

But much more of a difference with PyPy:

3.9.18 (7.3.15+dfsg-1build3, Apr 01 2024, 03:12:48)
[PyPy 7.3.15 with GCC 13.2.0]
test_list_comprehension 0.2822986920000403
test_tuple_from_list_comprehension 0.40187594900010026
test_tuple_from_generator_comprehension 0.9802658359999441
test_unpack_generator_comprehension 1.0730282659999375

ivanychev · 2024-06-26T00:54:35Z

I think the tuple-from-list-comprehension approach will lead to 2x higher peak memory consumption, won't it?
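One way to check the memory question empirically is with the standard-library `tracemalloc` module. The following is a minimal sketch, not a rigorous benchmark: the helper `peak_bytes` and the size `N` are hypothetical choices, and the numbers will vary by interpreter and version, so no expected output is shown.

```python
import tracemalloc


def peak_bytes(build):
    """Return (result, peak traced bytes) for a zero-argument callable."""
    tracemalloc.start()
    try:
        result = build()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        # Stopping clears all traces, so each measurement starts from zero.
        tracemalloc.stop()
    return result, peak


N = 100_000  # hypothetical size, chosen only for illustration
data = list(range(N))

# Elements are reused as-is, so mostly the container allocations are measured.
from_list, peak_list = peak_bytes(lambda: tuple([x for x in data]))
from_gen, peak_gen = peak_bytes(lambda: tuple(x for x in data))

print(f"tuple([list comprehension]) peak: {peak_list} bytes")
print(f"tuple(generator)            peak: {peak_gen} bytes")
```

Both spellings produce the same tuple; only the intermediate allocations differ.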

NeilGirdhar · 2025-01-01T23:40:34Z

It may be faster, but I think it's less legible since it introduces a useless list comprehension. I think it would be better to post this on discuss to see if the core developers can make comprehension creation faster.

NeilGirdhar · 2025-01-01T23:48:07Z

Also, I came here to post a related issue, but probably the discussion belongs here since it's close enough:

In [1]: %timeit x = [1, *[2 for _ in range(10)]]
156 ns ± 3.32 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [2]: %timeit x = [1, *(2 for _ in range(10))]
337 ns ± 9.72 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Ruff doesn't seem to have an opinion on whether an unpacked nested iterable should be written as a list or generator. I think Ruff should have an opinion, and from a legibility standpoint, I prefer generators even though they are slower.
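The unpacking comparison above can also be reproduced outside IPython with `timeit`. This is a small sketch under assumptions: the function names `unpack_list` and `unpack_generator` and the repetition count are hypothetical, and timings depend on the interpreter, so none are quoted here.

```python
from timeit import timeit


def unpack_list(n=10):
    # Unpack a nested list comprehension into the outer list.
    return [1, *[2 for _ in range(n)]]


def unpack_generator(n=10):
    # Unpack a nested generator expression into the outer list.
    return [1, *(2 for _ in range(n))]


# Both spellings build the same list; only the construction cost differs.
assert unpack_list() == unpack_generator()

for fn in (unpack_list, unpack_generator):
    print(fn.__name__, timeit(fn, number=100_000))
```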

Avasam changed the title ~~New Rule: Prefer list comprehension over generators to create tuples~~ New Rule: Prefer list comprehension over generator comprehensions to create tuples Jun 11, 2024

charliermarsh added the rule Implementing or modifying a lint rule label Jun 12, 2024

This was referenced Aug 12, 2024

Consider deprecating UP027 or improve its docs #12754

Closed

C409 now makes code slower #12912

Open

Avasam mentioned this issue Dec 31, 2024

Enforce ruff/refurb rules (FURB) pypa/setuptools#4386

Open

2 tasks
