Add free threading scaling microbenchmarks #125985

colesbury · 2024-10-25T16:25:20Z

Feature or enhancement

I've been using a simple script to help identify and track scaling bottlenecks in the free threading build. The benchmarks consist of patterns that ought to scale well but haven't in the past, typically due to reference count contention or lock contention.

I think this is generally useful for people working on free threading, and I would like to include it under `Tools`, perhaps as `Tools/ftscalingbench/ftscalingbench.py`.

Note that this is not intended to be a general multithreading benchmark suite, nor are the benchmarks intended to be representative of real-world workloads. The benchmarks are only intended to help identify and track scaling bottlenecks that occur in basic usage.

Here is the original script; I've since made some modifications:
https://github.com/colesbury/nogil-micro-benchmarks
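For readers who want a feel for what such a measurement looks like, here is a minimal sketch. It is not the linked script; the names `work`, `time_threads`, and `speedup` are hypothetical, and the workload is an arbitrary loop chosen only for illustration. The idea is to run the same function on one thread and then on several threads, and compare throughput.

```python
import threading
import time


def work(n=200_000):
    """Hypothetical workload: each thread only touches its own local state."""
    total = 0
    for i in range(n):
        total += i % 7
    return total


def time_threads(func, num_threads):
    """Run func once on each of num_threads threads; return wall-clock seconds."""
    # The barrier releases all workers together, so most of the thread
    # startup cost is excluded from the measurement.
    barrier = threading.Barrier(num_threads + 1)

    def worker():
        barrier.wait()
        func()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    start = time.perf_counter()
    barrier.wait()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def speedup(func, num_threads):
    """Throughput relative to one thread; num_threads would be ideal scaling."""
    single = time_threads(func, 1)
    multi = time_threads(func, num_threads)
    return num_threads * single / multi


if __name__ == "__main__":
    print(f"speedup with 4 threads: {speedup(work, 4):.2f}x")
```

On a build with the GIL enabled, a pure-Python workload like this would be expected to report a speedup near 1x. On the free threading build, a value well below the thread count points at a scaling bottleneck in whatever the workload exercises.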

Linked PRs


These consist of a number of short snippets that help identify scaling bottlenecks in the free-threaded interpreter. The current bottlenecks are in benchmarks that call functions (because `LOAD_ATTR` does not yet use deferred reference counting) and in benchmarks that access thread-local data.
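As a rough illustration of the two patterns named above, the snippets below sketch one workload that calls a function looked up through an attribute and one that reads thread-local data. Both function names are hypothetical and are not taken from the merged benchmark file; they only show the shape of the code that is being measured.

```python
import math
import threading

# Hypothetical shared thread-local object; each thread sees its own attributes.
_local = threading.local()


def call_via_attribute(n=10_000):
    """Call a function that is looked up as an attribute on every iteration."""
    total = 0.0
    for _ in range(n):
        # `math.sqrt` is fetched from the shared module object each time,
        # then called.
        total += math.sqrt(4.0)
    return total


def thread_local_reads(n=10_000):
    """Read an attribute of a threading.local() object in a tight loop."""
    _local.x = 1  # set in the calling thread, invisible to other threads
    total = 0
    for _ in range(n):
        total += _local.x
    return total
```

Each thread does independent work in both cases, so in principle they should scale with the number of threads. Any shortfall comes from contention on the shared objects involved in the lookup.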


Add a separate benchmark that measures the effect of `_PyObject_LookupSpecial()` on scaling. In the process of cleaning up the scaling benchmarks for inclusion, I unintentionally changed the "cmodule_function" benchmark to pass an `int` to `math.floor()` instead of a `float`, which causes it to use the `_PyObject_LookupSpecial()` code path. `_PyObject_LookupSpecial()` has its own scaling issues that we want to measure separately from calling a function on a C module.
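The difference between the two argument types can be sketched as follows. This is an illustration under the description above, not the code from the fix: `Tracked`, `floor_floats`, and `floor_ints` are hypothetical names. The `Tracked` class shows that `math.floor()` consults a `__floor__` special method for non-float arguments, which is the lookup that `int` arguments go through as well.

```python
import math


class Tracked:
    """Hypothetical class showing that math.floor() honors __floor__."""

    def __floor__(self):
        return 7


def floor_floats(n=1000):
    # float arguments: per the description above, these do not go through
    # the special method lookup.
    return sum(math.floor(i + 0.5) for i in range(n))


def floor_ints(n=1000):
    # int arguments: math.floor() defers to the type's __floor__ method,
    # which requires a special method lookup on each call.
    return sum(math.floor(i) for i in range(n))


if __name__ == "__main__":
    print(math.floor(Tracked()))
    print(floor_floats(4), floor_ints(4))
```

Both loops return the same values, so the change in argument type is easy to miss in review even though the two versions exercise different code paths in the interpreter.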



colesbury added the type-feature and topic-free-threading labels Oct 25, 2024

bedevere-app bot mentioned this issue Oct 25, 2024

gh-125985: Add free threading scaling micro benchmarks #125986

Merged

colesbury added a commit to colesbury/cpython that referenced this issue Oct 28, 2024

Merge branch 'main' into pythongh-125985-ftscalingbench

e225113

colesbury closed this as completed Oct 28, 2024

bedevere-app bot mentioned this issue Jan 3, 2025

gh-125985: Fix cmodule_function() scaling benchmark #128460

Merged
