3.12 regression: interpreter shouldn't shut down when there are non-daemon threads left #115533

intgr · 2024-02-15T17:58:12Z

Bug report

Bug description:

This is related to #113964, but trying to address the bigger problem of what "interpreter shutdown" is, and trying to make a stronger case for restoring the old behavior.

TL;DR: This was an undocumented change. The old behavior was useful. Documentation is contradictory. Users are struggling and are recommended to downgrade.

Undocumented change

In Python 3.11 and previous versions, daemon=False threads could spawn new threads even after the main thread has exited.

This changed in Python 3.12, which throws "RuntimeError: can't create new thread at interpreter shutdown" when a daemon=False thread tries to spawn new threads after the main thread has exited.

I could not find this change to be documented in 3.12 release notes, it appears to be an unintentional change.

Reproducer

"""Python 3.11: Successfully spawns 20 subthreads. Python 3.12: RuntimeError."""
from threading import Thread, current_thread
from time import sleep

def subthread():
    print(f"Thread {current_thread().name} spawned...")

def thread():
    for i in range(20):
        Thread(target=subthread, daemon=False).start()
        sleep(0.001)

Thread(target=thread, daemon=False).start()

Old behavior was useful

While not join()ing the launched threads is considered an antipattern, at the very least the old behavior was useful for throw-away scripts that launch a bunch of threads, and have Python exit automatically once all work was finished.

Documentation is contradictory

The documentation says this about daemon threads:

A thread can be flagged as a “daemon thread”. The significance of this flag is that the entire Python program exits when only daemon threads are left.

I think a useful interpretation of "exits when only daemon threads are left" also implies "the Python program starts shutting down when only daemon threads are left", and does not shut down before. Because what else could "exits" mean if not "shuts down"?

Although this is in conflict with the definition of "interpreter shutdown" that states:

The main reason for interpreter shutdown is that the __main__ module or the script being run has finished executing.

This interpetation is also not being respected by sys.is_finalizing() API. This is supposed to "Return True if the Python interpreter is shutting down", but it returns False in this situation even after the main thread has exited and thread spawning fails: #114570. That's another indication that this change was unintentional.

Users are struggling

Python has a reputation for breakage across version upgrades. I hope there is motivation among CPython developers to change that. I think such unintentional changes that break users, should be treated as regressions to be fixed.

There are three posts on StackOverflow [1] [2] [3], where downgrading Python is the dominant "solution". And when you have threads spawning more threads, the "proper fix" is often not a simple matter of adding a Thread.join() somewhere, but may require re-architecting the whole program to keep track of what all threads are doing.

Related issues: #113964, #114570.

CPython versions tested on:

3.12.2, 3.11.7

Operating systems tested on:

Linux, macOS

The text was updated successfully, but these errors were encountered:

gpshead · 2024-02-22T08:24:49Z

+cc: @ericsnowcurrently @serhiy-storchaka @vstinner as FYI about this 3.12 behavior change.

gpshead · 2024-02-22T10:05:25Z

A compatible workaround for application code running into this is to potentially add a call to a function something like https://gist.github.com/gpshead/48389eac0fd5c3fbc7ed8c5f4667be41 at the end of their main program.

Not pretty, but it avoids re-architecting. (EDIT: do not use this, see vstinner's much simpler thing below)

vstinner · 2024-02-22T10:26:57Z

I'm not sure of the purpose of this issue.

it appears to be an unintentional change.

The change is intentional. Spawning threads can cause crashes. It was changed to make Python safer and more reliable.

I could not find this change to be documented in 3.12 release notes,

Oh, that's bad. It should be documented.

The documentation says this about daemon threads:

Your example doesn't use daemon threads, so it's unrelated.

Python calls threading._shutdown() which waits for all threads: call their join() method. But that's unrelated to the change on disallowing to spawn new threads during the Python finalization.

The main reason for interpreter shutdown is that the main module or the script being run has finished executing.

In your example, Python is finalized after it executes the last line of your script. That's correct.

the "proper fix" is often not a simple matter of adding a Thread.join() somewhere, but may require re-architecting the whole program to keep track of what all threads are doing.

You can add these lines at the end of your program. It's generic and doesn't require to re-architecture anything:

import threading
for thread in threading.enumerate():
    if thread.daemon or thread is threading.current_thread():
        continue
    thread.join()

intgr · 2024-02-22T11:25:15Z

The change is intentional.

Where was this change discussed and decided?

Spawning threads can cause crashes. It was changed to make Python safer and more reliable.

Are you saying that it's impossible to safely and reliably delay interpreter shutdown until all non-daemon threads have exited? Doubt! gpshead above seems to have done it.

(Or are you saying that spawning threads in general can cause crashes? Then the change would be unrelated)

The documentation says this about daemon threads:

Your example doesn't use daemon threads, so it's unrelated.

It is very much related. It's the part of documentation that explains the difference between daemon threads and non-daemon threads. By documenting what happens with daemon=True, it also documents what doesn't happen with daemon=False. There isn't a third possible value to a boolean attribute.

In your example, Python is finalized after it executes the last line of your script. That's correct.

Are you basing your claim of "correctness" on the quote from documentation? It's possible for documentation to be incorrect, just as it's possible for code to be incorrect.

CPython didn't follow documentation for ages, until version 3.12. I think it's a fair argument that documentation should be adapted instead, particularly as it's in apparent conflict with the description of non-daemon threads.

gpshead · 2024-02-22T18:54:20Z

FWIW @vstinner "I'm not sure the purpose of this issue" reads as dismissive of a user reporting a bug even if I doubt you meant it that way. It is a valid bug, what worked before no longer works in 3.12 - without any notice or explanation from us being the problem if this behavior change is what we want.

zenbones · 2024-02-22T19:09:13Z

You may say that basing Python's thread model on other languages is not relevant, but the current documentation accepts the meaning of daemon threads current in other languages with thread control. I'd say that Python's thread model is borrowed from Java, as are many of its more recent and useful additions to the standard libraries. This is not a bad thing. Love or hate Java it has a great set of standard libraries. In Java, when I mark a thread as daemon, the JVM does not exit until all daemon threads have finished. And I can continue to spawn new threads so long as the JVM has not exited. This is neither dangerous nor impossibly difficult. Forcing the user to keep track of every possible thread with some top level hack, to fulfill the meaning of daemon threads, is a situation which is unsafe and unclear.

You could change the documentation to clearly state that, in Python, a daemon thread will not prevent the interpreter from shutting down. I think that would be wrong and a step backwards in Python's evolution as a useful language. But at least it would clear and correct. Or, you could change python to properly track daemon threads and not shut down until they've all completed. This does naturally create a race condition between allowing the interpreter to finalize and client code which might be marking a thread in the daemon state. Fortunately, there are mutex locks capable of resolving that race condition in a clear manner. Use one. The error should not be delivered on creation of any thread during the lifetime of a daemon thread, but rather on the attempt to create a thread or mark a thread as daemon after the interpreter has determined no demon threads are running and begun finalization. And it's the interpreter's responsibility to run the checking loop, possibly on an internal daemon thread...

vstinner · 2024-02-22T20:29:27Z

FWIW @vstinner "I'm not sure the purpose of this issue" reads as dismissive of a user reporting a bug even if I doubt you meant it that way.

Oh, I'm sorry if it can be read this way. It might be a translation issue, English is not my first language.

What I mean is that I'm not sure which actions should be taken to solve thie issue. We might have to clarify which tasks should be done to enhance the situation. Apparently, documenting the change is a first step.

vstinner · 2024-02-22T20:39:48Z

Where was this change discussed and decided?

The change was done in issue gh-104690 with the change commit ce558e6: PR gh-104826. It was discussed there.

Are you saying that it's impossible to safely and reliably delay interpreter shutdown until all non-daemon threads have exited?

See issue gh-104690 for a concrete example of crash caused by a thread spawned during the Python finalization.

"Python exit" or "Python finalization" starts by waiting until threads complete. If new threads are spawned while Python is waiting for the existing threads to complete, bad things can happen.

In Python 3.11 and older it was allowed. But it was decided to change this in Python 3.12.

I didn't say that it's impossible to support this case. Python is a project maintained by volunteers. It's all about trade-offs. With our limited resources, we decided to make the finalization simpler to make it more reliable.

Apparently, there is a way to still support your use case, but it requires some changes to your code.

gvanrossum · 2024-02-22T21:31:37Z

Closing as a duplicate of gh-113964.

(Also, is there a confusion above about the meaning of "daemon" threads vs. non-daemon threads? The intention, from Python's earliest days, was that at exit the interpreter waits until all non daemon threads have exited, while daemon threads are killed unceremoniously. There's got to be documentation about that. If it's different in Java, that's too bad -- while we did borrow the thread API from Java, we might have accidentally diverged, but it's at least two decades too late to change.)

gpshead · 2024-02-22T22:06:02Z

#113964

With the switch to python 3.12 the log started registering a RuntimeError due to a race condition. ``` systemd[1]: Stopping Fishtest Server port 6543... pserve[17032]: flush pserve[17032]: .....................done pserve[17032]: Exception in thread Thread-740: pserve[17032]: Traceback (most recent call last): pserve[17032]: File "/home/fishtest/.pyenv/versions/3.12.2/lib/python3.12/threading.py", line 1073, in _bootstrap_inner pserve[17032]: self.run() pserve[17032]: File "/home/fishtest/.pyenv/versions/3.12.2/lib/python3.12/threading.py", line 1431, in run pserve[17032]: self.function(*self.args, **self.kwargs) pserve[17032]: File "/home/fishtest/fishtest/server/fishtest/rundb.py", line 451, in flush_buffers pserve[17032]: self.start_timer() pserve[17032]: File "/home/fishtest/fishtest/server/fishtest/rundb.py", line 365, in start_timer pserve[17032]: self.timer.start() pserve[17032]: File "/home/fishtest/.pyenv/versions/3.12.2/lib/python3.12/threading.py", line 992, in start pserve[17032]: _start_new_thread(self._bootstrap, ()) pserve[17032]: RuntimeError: can't create new thread at interpreter shutdown systemd[1]: Stopped Fishtest Server port 6543. ``` This is the issue opened upstream: python/cpython#115533

we're getting an issue with the tests: ValueError: option names {'--shard-id'} already added This might be due to out-of-tree tests; after some debugging attempt I'm removing due to time constraints. The same tests were showing a "RuntimeError: can't create new thread at interpreter shutdown" on python 3.12 This is detailed here python/cpython#115533 From looking at the pytorch code it looks like there's still some polishing of support for 3.12 happening so will consider this an upstream error

intgr added the type-bug An unexpected behavior, bug, or error label Feb 15, 2024

ronaldoussoren added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Feb 16, 2024

damies13 mentioned this issue Feb 18, 2024

ModuleNotFoundError: When running RFSwarm-Agent on Python 3.12 damies13/rfswarm#177

Closed

gpshead added the 3.12 bugs and security fixes label Feb 22, 2024

gpshead added this to Threading issues 🧵 Feb 22, 2024

gpshead moved this to Todo in Threading issues 🧵 Feb 22, 2024

gpshead changed the title ~~Python 3.12 regression: interpreter shouldn't shut down when there are non-daemon threads left~~ 3.12 regression: interpreter shouldn't shut down when there are non-daemon threads left Feb 22, 2024

gpshead mentioned this issue Feb 22, 2024

3.12 regression: "can't create new thread at interpreter shutdown" from non-daemon threads or atexit handlers #113964

Open

gpshead closed this as not planned Won't fix, can't repro, duplicate, stale Feb 22, 2024

github-project-automation bot moved this from Todo to Done in Threading issues 🧵 Feb 22, 2024

ericsnowcurrently added the topic-subinterpreters label Feb 27, 2024

ppigazzini mentioned this issue Apr 4, 2024

Mitigate Python 3.12 Shutdown RuntimeError official-stockfish/fishtest#1925

Merged

exceptionfactory mentioned this issue Apr 12, 2024

NIFI-13042 Support Python 3.12 for Python Processors apache/nifi#8644

Closed

10 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

3.12 regression: interpreter shouldn't shut down when there are non-daemon threads left #115533

3.12 regression: interpreter shouldn't shut down when there are non-daemon threads left #115533

intgr commented Feb 15, 2024 •

edited

Loading

gpshead commented Feb 22, 2024

gpshead commented Feb 22, 2024 •

edited

Loading

vstinner commented Feb 22, 2024

intgr commented Feb 22, 2024 •

edited

Loading

gpshead commented Feb 22, 2024

zenbones commented Feb 22, 2024 •

edited

Loading

vstinner commented Feb 22, 2024

vstinner commented Feb 22, 2024

gvanrossum commented Feb 22, 2024

gpshead commented Feb 22, 2024

3.12 regression: interpreter shouldn't shut down when there are non-daemon threads left #115533

3.12 regression: interpreter shouldn't shut down when there are non-daemon threads left #115533

Comments

intgr commented Feb 15, 2024 • edited Loading

Bug report

Bug description:

Undocumented change

Reproducer

Old behavior was useful

Documentation is contradictory

Users are struggling

CPython versions tested on:

Operating systems tested on:

gpshead commented Feb 22, 2024

gpshead commented Feb 22, 2024 • edited Loading

vstinner commented Feb 22, 2024

intgr commented Feb 22, 2024 • edited Loading

gpshead commented Feb 22, 2024

zenbones commented Feb 22, 2024 • edited Loading

vstinner commented Feb 22, 2024

vstinner commented Feb 22, 2024

gvanrossum commented Feb 22, 2024

gpshead commented Feb 22, 2024

intgr commented Feb 15, 2024 •

edited

Loading

gpshead commented Feb 22, 2024 •

edited

Loading

intgr commented Feb 22, 2024 •

edited

Loading

zenbones commented Feb 22, 2024 •

edited

Loading