From 535c8b816ffee8c788bc3351264a0ff75fa0b24c Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 20 Jun 2025 15:59:42 -0600 Subject: [PATCH 01/23] Add a "Runtime Components" section to the execution model docs. --- Doc/reference/executionmodel.rst | 69 ++++++++++++++++++++++++++++++++ 1 file changed, 69 insertions(+) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index cb6c524dd97a30..0e8cdfa97117a6 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -398,6 +398,75 @@ See also the description of the :keyword:`try` statement in section :ref:`try` and :keyword:`raise` statement in section :ref:`raise`. +.. _execcomponents: + +Runtime Components +================== + +Python's execution model does not operate in a vacuum. It runs on a +computer. When a program runs, the conceptual layers of how it runs +on the computer look something like this:: + + host computer (or VM or container) + process + OS thread (runs machine code) + +.. (Sometimes there may even be an extra layer right after "thread" + for light-weight threads or coroutines.) + +While a program always starts with exactly one of each of those, it may +grow to include multiple of each. Hosts and processes are isolated and +independent from one another. However, threads are not. Each thread +does *run* independently, for the small segments of time it is +scheduled to execute its code on the CPU. Otherwise, all threads +in a process share all the process' resources, including memory. +This is exactly what can make threads a pain: two threads running +at the same arbitrary time on different CPU cores can accidentally +interfere with each other's use of some shared data. The initial +thread is known as the "main" thread. 
+ +The same layers apply to each Python program, with some extra layers +specific to Python:: + + host + process + Python runtime + interpreter + Python thread (runs bytecode) + +when a Python program starts, it looks exactly like that, with one +of each. The process has a single global runtime to manage global +resources. Each Python thread has all the state it needs to run +Python code (and use any supported C-API) in its OS thread. + +.. , including its stack of call frames. + +.. If the program uses coroutines (async) then the thread will end up + juggling multiple stacks. + +In between the global runtime and the threads lies the interpreter. +It encapsulates all of the non-global runtime state that the +interpreter's Python threads share. For example, all those threads +share :data:`sys.modules`. When a Python thread is created, it belongs +to an interpreter. + +If the runtime supports using multiple interpreters then each OS thread +will have at most one Python thread for each interpreter. However, +only one is active in the OS thread at a time. Switching between +interpreters means changing the active Python thread. +The initial interpreter is known as the "main" interpreter. + +.. (The interpreter is different from the "bytecode interpreter", + of which each thread has one to execute Python code.) + +Once a program is running, new Python threads can be created using the +:mod:`threading` module. Additional processes can be created using the +:mod:`multiprocessing` and :mod:`subprocess` modules. You can run +coroutines (async) in the main thread using :mod:`asyncio`. +Interpreters can be created using the :mod:`concurrent.interpreters` +module. + + .. rubric:: Footnotes .. [#] This limitation occurs because the code that is executed by these operations From 3f3d5ccaa463f1453df9137df9b07fdde7d94680 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 27 Jun 2025 11:35:02 -0600 Subject: [PATCH 02/23] Fix a typo. 
--- Doc/reference/executionmodel.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index 0e8cdfa97117a6..b57e4755e59585 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -434,7 +434,7 @@ specific to Python:: interpreter Python thread (runs bytecode) -when a Python program starts, it looks exactly like that, with one +When a Python program starts, it looks exactly like that, with one of each. The process has a single global runtime to manage global resources. Each Python thread has all the state it needs to run Python code (and use any supported C-API) in its OS thread. From b1d6ed7ae15a3402703d7861d2b3443fe1eb0c63 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 27 Jun 2025 11:58:46 -0600 Subject: [PATCH 03/23] Clarify about platform support for threads. --- Doc/reference/executionmodel.rst | 41 +++++++++++++++++++------------- 1 file changed, 24 insertions(+), 17 deletions(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index b57e4755e59585..4238d48432e728 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -416,14 +416,19 @@ on the computer look something like this:: While a program always starts with exactly one of each of those, it may grow to include multiple of each. Hosts and processes are isolated and -independent from one another. However, threads are not. Each thread -does *run* independently, for the small segments of time it is -scheduled to execute its code on the CPU. Otherwise, all threads +independent from one another. However, threads are not. + +Not all platforms support threads, though most do. For those that do, +each thread does *run* independently, for the small segments of time it +is scheduled to execute its code on the CPU. Otherwise, all threads in a process share all the process' resources, including memory. 
-This is exactly what can make threads a pain: two threads running -at the same arbitrary time on different CPU cores can accidentally -interfere with each other's use of some shared data. The initial -thread is known as the "main" thread. +The initial thread is known as the "main" thread. + +.. note:: + + The way they share resources is exactly what can make threads a pain: + two threads running at the same arbitrary time on different CPU cores + can accidentally interfere with each other's use of some shared data. The same layers apply to each Python program, with some extra layers specific to Python:: @@ -435,8 +440,8 @@ specific to Python:: Python thread (runs bytecode) When a Python program starts, it looks exactly like that, with one -of each. The process has a single global runtime to manage global -resources. Each Python thread has all the state it needs to run +of each. The process has a single global runtime to manage Python's +global resources. Each Python thread has all the state it needs to run Python code (and use any supported C-API) in its OS thread. .. , including its stack of call frames. @@ -444,11 +449,12 @@ Python code (and use any supported C-API) in its OS thread. .. If the program uses coroutines (async) then the thread will end up juggling multiple stacks. -In between the global runtime and the threads lies the interpreter. -It encapsulates all of the non-global runtime state that the -interpreter's Python threads share. For example, all those threads -share :data:`sys.modules`. When a Python thread is created, it belongs -to an interpreter. +In between the global runtime and the thread(s) lies the interpreter. +It completely encapsulates all of the non-global runtime state that the +interpreter's Python threads share. For example, all its threads share +:data:`sys.modules`. When a Python thread is created, it belongs +to an interpreter, and likewise when an OS thread is otherwise +associated with Python. 
If the runtime supports using multiple interpreters then each OS thread will have at most one Python thread for each interpreter. However, @@ -460,9 +466,10 @@ The initial interpreter is known as the "main" interpreter. of which each thread has one to execute Python code.) Once a program is running, new Python threads can be created using the -:mod:`threading` module. Additional processes can be created using the -:mod:`multiprocessing` and :mod:`subprocess` modules. You can run -coroutines (async) in the main thread using :mod:`asyncio`. +:mod:`threading` module (on platforms and Python implementations that +support threads). Additional processes can be created using the +:mod:`os`, :mod:`subprocess`, and :mod:`multiprocessing` modules. +You can run coroutines (async) in the main thread using :mod:`asyncio`. Interpreters can be created using the :mod:`concurrent.interpreters` module. From aeca87a57ec5ffa95a106ec08b6e77f20f13cdd1 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 27 Jun 2025 12:03:57 -0600 Subject: [PATCH 04/23] Drop a comment. --- Doc/reference/executionmodel.rst | 3 --- 1 file changed, 3 deletions(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index 4238d48432e728..497f9c3120750a 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -411,9 +411,6 @@ on the computer look something like this:: process OS thread (runs machine code) -.. (Sometimes there may even be an extra layer right after "thread" - for light-weight threads or coroutines.) - While a program always starts with exactly one of each of those, it may grow to include multiple of each. Hosts and processes are isolated and independent from one another. However, threads are not. From b12a02bc955ea5bd9a8bb0c926fbfd637771cda4 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 27 Jun 2025 12:06:07 -0600 Subject: [PATCH 05/23] Identify what might be thread-specific state. 
--- Doc/reference/executionmodel.rst | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index 497f9c3120750a..34b8ce5e0434eb 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -439,12 +439,9 @@ specific to Python:: When a Python program starts, it looks exactly like that, with one of each. The process has a single global runtime to manage Python's global resources. Each Python thread has all the state it needs to run -Python code (and use any supported C-API) in its OS thread. - -.. , including its stack of call frames. - -.. If the program uses coroutines (async) then the thread will end up - juggling multiple stacks. +Python code (and use any supported C-API) in its OS thread. Depending +on the implementation, this probably includes the current exception +and the Python call stack. In between the global runtime and the thread(s) lies the interpreter. It completely encapsulates all of the non-global runtime state that the From 17a2f341a8970c2ff1764eb83df70ad30ef6cab1 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 27 Jun 2025 12:15:11 -0600 Subject: [PATCH 06/23] Clarify about "interpreter". --- Doc/reference/executionmodel.rst | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index 34b8ce5e0434eb..a490a3903da5af 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -438,17 +438,22 @@ specific to Python:: When a Python program starts, it looks exactly like that, with one of each. The process has a single global runtime to manage Python's -global resources. Each Python thread has all the state it needs to run -Python code (and use any supported C-API) in its OS thread. Depending -on the implementation, this probably includes the current exception -and the Python call stack. 
+process-global resources. Each Python thread has all the state it needs +to run Python code (and use any supported C-API) in its OS thread. +Depending on the implementation, this probably includes the current +exception and the Python call stack. In between the global runtime and the thread(s) lies the interpreter. -It completely encapsulates all of the non-global runtime state that the -interpreter's Python threads share. For example, all its threads share -:data:`sys.modules`. When a Python thread is created, it belongs -to an interpreter, and likewise when an OS thread is otherwise -associated with Python. +It completely encapsulates all of the non-process-global runtime state +that the interpreter's Python threads share. For example, all its +threads share :data:`sys.modules`. When a Python thread is created, +it belongs to an interpreter, and likewise when an OS thread is +otherwise associated with Python. + +.. note:: + + The interpreter here is not the same as the "bytecode interpreter", + which is what runs in each thread, executing compiled Python code. If the runtime supports using multiple interpreters then each OS thread will have at most one Python thread for each interpreter. However, @@ -456,9 +461,6 @@ only one is active in the OS thread at a time. Switching between interpreters means changing the active Python thread. The initial interpreter is known as the "main" interpreter. -.. (The interpreter is different from the "bytecode interpreter", - of which each thread has one to execute Python code.) - Once a program is running, new Python threads can be created using the :mod:`threading` module (on platforms and Python implementations that support threads). Additional processes can be created using the From 9ac4b4a00345543504609687f570629745a58853 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 27 Jun 2025 14:11:45 -0600 Subject: [PATCH 07/23] Clarify the relationship between OS threads and Python threads and interpreters. 
--- Doc/reference/executionmodel.rst | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index a490a3903da5af..31b05216bfe80c 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -446,20 +446,24 @@ exception and the Python call stack. In between the global runtime and the thread(s) lies the interpreter. It completely encapsulates all of the non-process-global runtime state that the interpreter's Python threads share. For example, all its -threads share :data:`sys.modules`. When a Python thread is created, -it belongs to an interpreter, and likewise when an OS thread is -otherwise associated with Python. +threads share :data:`sys.modules`. Every Python thread belongs to a +single interpreter and runs using that shared state. The initial +interpreter is known as the "main" interpreter, and the initial thread, +where the runtime was initialized, is known as the "main" thread. .. note:: The interpreter here is not the same as the "bytecode interpreter", which is what runs in each thread, executing compiled Python code. -If the runtime supports using multiple interpreters then each OS thread -will have at most one Python thread for each interpreter. However, -only one is active in the OS thread at a time. Switching between -interpreters means changing the active Python thread. -The initial interpreter is known as the "main" interpreter. +Every Python thread is associated with a single OS thread, which is +where it runs. However, multiple Python threads can be associated with +the same OS thread. For example, an OS thread might run code with a +first interpreter and then with a second, each necessarily with its own +Python thread. Still, regardless of how many are *associated* with +an OS thread, only one Python thread can be actively *running* in +an OS thread at a time. 
Switching between interpreters means +changing the active Python thread. Once a program is running, new Python threads can be created using the :mod:`threading` module (on platforms and Python implementations that From f7cb965e3a626a3f0deb16a83e80d36b0fd70025 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Mon, 30 Jun 2025 11:23:12 -0600 Subject: [PATCH 08/23] Clarify about the host. --- Doc/reference/executionmodel.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index 31b05216bfe80c..529fd48b13298d 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -407,7 +407,7 @@ Python's execution model does not operate in a vacuum. It runs on a computer. When a program runs, the conceptual layers of how it runs on the computer look something like this:: - host computer (or VM or container) + host machine process OS thread (runs machine code) From e71394cf4720c9d97d6f97ae1c08a9d5ec0e4fd0 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Mon, 30 Jun 2025 11:25:07 -0600 Subject: [PATCH 09/23] Do not talk about distributed computing. --- Doc/reference/executionmodel.rst | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index 529fd48b13298d..b9fc39e9555336 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -411,15 +411,15 @@ on the computer look something like this:: process OS thread (runs machine code) -While a program always starts with exactly one of each of those, it may -grow to include multiple of each. Hosts and processes are isolated and -independent from one another. However, threads are not. - -Not all platforms support threads, though most do. For those that do, -each thread does *run* independently, for the small segments of time it -is scheduled to execute its code on the CPU. 
Otherwise, all threads -in a process share all the process' resources, including memory. -The initial thread is known as the "main" thread. +Hosts and processes are isolated and independent from one another. +However, threads are not. + +While a program always starts with exactly one thread, known as the +"main" thread, it may grow to run in multiple. Not all platforms +support threads, but most do. For those that do, each thread does *run* +independently, for the small segments of time it is scheduled to execute +its code on the CPU. Otherwise, all threads in a process share all the +process' resources, including memory. .. note:: From 8f454c4266687dbeadf9e335cf9117929f897055 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Mon, 30 Jun 2025 11:41:15 -0600 Subject: [PATCH 10/23] Note the operating system in the diagram. --- Doc/reference/executionmodel.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index b9fc39e9555336..734e6c2a4a0be3 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -407,7 +407,7 @@ Python's execution model does not operate in a vacuum. It runs on a computer. When a program runs, the conceptual layers of how it runs on the computer look something like this:: - host machine + host machine and operating system (OS) process OS thread (runs machine code) From cd0200c0d6f9e1882852ad7707e982c381a2168a Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Mon, 30 Jun 2025 12:02:48 -0600 Subject: [PATCH 11/23] Avoid talking about how threads are scheduled. 
--- Doc/reference/executionmodel.rst | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index 734e6c2a4a0be3..9bf9b608059af7 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -414,18 +414,22 @@ on the computer look something like this:: Hosts and processes are isolated and independent from one another. However, threads are not. -While a program always starts with exactly one thread, known as the -"main" thread, it may grow to run in multiple. Not all platforms -support threads, but most do. For those that do, each thread does *run* -independently, for the small segments of time it is scheduled to execute -its code on the CPU. Otherwise, all threads in a process share all the -process' resources, including memory. +A program always starts with exactly one thread, known as the "main" +thread, it may grow to run in multiple. Not all platforms support +threads, but most do. For those that do, all threads in a process +share all the process' resources, including memory. + +Each thread does *run* independently, at the same time as the others. +That may be only conceptually at the same time ("concurrently") or +physically ("in parallel"). Either way, the threads run at a +non-synchronized rate, which means global state isn't guaranteed +to stay consistent for any given thread. .. note:: The way they share resources is exactly what can make threads a pain: - two threads running at the same arbitrary time on different CPU cores - can accidentally interfere with each other's use of some shared data. + two threads running at the same time can accidentally interfere with + each other's use of some shared data. 
The same layers apply to each Python program, with some extra layers specific to Python:: From 9ccc743b2e4ee3d7f99ade423c16ad215c3c2c13 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Mon, 30 Jun 2025 12:45:27 -0600 Subject: [PATCH 12/23] Be more specific about the "pain" of threads. --- Doc/reference/executionmodel.rst | 27 +++++++++++++++++++-------- 1 file changed, 19 insertions(+), 8 deletions(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index 9bf9b608059af7..a9f6c8e916ef08 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -419,17 +419,28 @@ thread, it may grow to run in multiple. Not all platforms support threads, but most do. For those that do, all threads in a process share all the process' resources, including memory. -Each thread does *run* independently, at the same time as the others. -That may be only conceptually at the same time ("concurrently") or -physically ("in parallel"). Either way, the threads run at a -non-synchronized rate, which means global state isn't guaranteed -to stay consistent for any given thread. +The fundamental point of threads is that each thread does *run* +independently, at the same time as the others. That may be only +conceptually at the same time ("concurrently") or physically +("in parallel"). Either way, the threads effectively run +at a non-synchronized rate. .. note:: - The way they share resources is exactly what can make threads a pain: - two threads running at the same time can accidentally interfere with - each other's use of some shared data. + That non-synchronized rate means none of the global state is + guaranteed to stay consistent for the code running in any given + thread. Thus multi-threaded programs must take care to coordinate + access to intentionally shared resources. 
Likewise, they must take + care to be absolutely diligent about not accessing any *other* + resources in multiple threads; otherwise two threads running at the + same time might accidentally interfere with each other's use of some + shared data. All this is true for both Python programs and the + Python runtime. + + The cost of this broad, unstructured requirement is the tradeoff for + the concurrency and, especially, parallelism that threads provide. + The alternative generally means dealing with non-deterministic bugs + and data corruption. The same layers apply to each Python program, with some extra layers specific to Python:: From 4dce0fcdfcb855c8771daad83dc5f6923921c288 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Mon, 30 Jun 2025 16:35:51 -0600 Subject: [PATCH 13/23] Clarify about the relationship between OS threads, Python threads, and interpreters. --- Doc/reference/executionmodel.rst | 61 ++++++++++++++++++++------------ 1 file changed, 38 insertions(+), 23 deletions(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index a9f6c8e916ef08..002604efe86b4c 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -453,40 +453,55 @@ specific to Python:: When a Python program starts, it looks exactly like that, with one of each. The process has a single global runtime to manage Python's -process-global resources. Each Python thread has all the state it needs -to run Python code (and use any supported C-API) in its OS thread. -Depending on the implementation, this probably includes the current -exception and the Python call stack. - -In between the global runtime and the thread(s) lies the interpreter. -It completely encapsulates all of the non-process-global runtime state -that the interpreter's Python threads share. For example, all its -threads share :data:`sys.modules`. Every Python thread belongs to a -single interpreter and runs using that shared state. 
The initial -interpreter is known as the "main" interpreter, and the initial thread, -where the runtime was initialized, is known as the "main" thread. +process-global resources. The runtime may grow to include multiple +interpreters and each interpreter may grow to include multiple Python +threads. The initial interpreter is known as the "main" interpreter, +and the initial thread, where the runtime was initialized, is known +as the "main" thread. + +An interpreter completely encapsulates all of the non-process-global +runtime state that the interpreter's Python threads share. For example, +all its threads share :data:`sys.modules`, but each interpreter has its +own :data:`sys.modules`. .. note:: The interpreter here is not the same as the "bytecode interpreter", - which is what runs in each thread, executing compiled Python code. + which is what regularly runs in threads, executing compiled Python code. + +A Python thread represents the state necessary for the Python runtime +to *run* in an OS thread. It also represents the execution of Python +code (or any supported C-API) in that OS thread. Depending on the +implementation, this probably includes the current exception and +the Python call stack. The Python thread always identifies the +interpreter it belongs to, meaning the state it shares +with other threads. + +.. note:: + + Here "Python thread" does not necessarily refer to a thread created + using the :mod:`threading` module. + +Each Python thread is associated with a single OS thread, which is where +it can run. In the opposite direction, a single OS thread can have many +Python threads associated with it. However, only one of those Python +threads is "active" in the OS thread at time. The runtime will operate +in the OS thread relative to the active Python thread. -Every Python thread is associated with a single OS thread, which is -where it runs. However, multiple Python threads can be associated with -the same OS thread. 
For example, an OS thread might run code with a -first interpreter and then with a second, each necessarily with its own -Python thread. Still, regardless of how many are *associated* with -an OS thread, only one Python thread can be actively *running* in -an OS thread at a time. Switching between interpreters means -changing the active Python thread. +For an interpreter to be used in an OS thread, it must have a +corresponding active Python thread. Thus switching between interpreters +means changing the active Python thread. An interpreter can have Python +threads, active or inactive, for as many OS threads as it needs. It may +even have multiple Python threads for the same OS thread, though at most +one can be active at a time. Once a program is running, new Python threads can be created using the :mod:`threading` module (on platforms and Python implementations that support threads). Additional processes can be created using the :mod:`os`, :mod:`subprocess`, and :mod:`multiprocessing` modules. You can run coroutines (async) in the main thread using :mod:`asyncio`. -Interpreters can be created using the :mod:`concurrent.interpreters` -module. +Interpreters can be created and used with the +:mod:`concurrent.interpreters` module. .. rubric:: Footnotes From bf1f1a2467bf70963cefa521f37caca71dde87f1 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Thu, 25 Sep 2025 17:30:51 -0600 Subject: [PATCH 14/23] Various refactors, incl. drop mention of "OS". --- Doc/reference/executionmodel.rst | 237 ++++++++++++++++++++++--------- 1 file changed, 171 insertions(+), 66 deletions(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index 002604efe86b4c..abe0ed4e879ba9 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -403,23 +403,46 @@ and :keyword:`raise` statement in section :ref:`raise`. Runtime Components ================== -Python's execution model does not operate in a vacuum. It runs on a -computer. 
When a program runs, the conceptual layers of how it runs -on the computer look something like this:: - - host machine and operating system (OS) - process - OS thread (runs machine code) - -Hosts and processes are isolated and independent from one another. -However, threads are not. - -A program always starts with exactly one thread, known as the "main" -thread, it may grow to run in multiple. Not all platforms support -threads, but most do. For those that do, all threads in a process -share all the process' resources, including memory. - -The fundamental point of threads is that each thread does *run* +General Computing Model +----------------------- + +Python's execution model does not operate in a vacuum. It runs on +a host machine and through that host's runtime environment, including +its operating system (OS), if there is one. When a program runs, +the conceptual layers of how it runs on the host look something +like this:: + + **host machine** + **process** (global resources) + **thread** (runs machine code) + +Each process represents a program running on the host. Think of each +process itself as the data part of its program. Think of the process' +threads as the execution part of the program. This distinction will +be important to understand the conceptual Python runtime. + +The process, as the data part, is the execution context in which the +program runs. It mostly consists of the set of resources assigned to +the program by the host, including memory, signals, file handles, +sockets, and environment variables. + +Processes are isolated and independent from one another. (The same +is true for hosts.) The host manages the process' access to its +assigned resources, in addition to coordinating between processes. + +Each thread represents the actual execution of the program's machine +code, running relative to the resources assigned to the program's +process. It's strictly up to the host how and when that execution +takes place. 
+ +From the point of view of Python, a program always starts with exactly +one thread. However, the program may grow to run in multiple +simultaneous threads. Not all hosts support multiple threads per +process, but most do. Unlike processes, threads in a process are not +isolated and independent from one another. Specifically, all threads +in a process share all of the process' resources. + +The fundamental point of threads is that each one does *run* independently, at the same time as the others. That may be only conceptually at the same time ("concurrently") or physically ("in parallel"). Either way, the threads effectively run @@ -427,7 +450,7 @@ at a non-synchronized rate. .. note:: - That non-synchronized rate means none of the global state is + That non-synchronized rate means none of the process' memory is guaranteed to stay consistent for the code running in any given thread. Thus multi-threaded programs must take care to coordinate access to intentionally shared resources. Likewise, they must take @@ -438,62 +461,109 @@ at a non-synchronized rate. Python runtime. The cost of this broad, unstructured requirement is the tradeoff for - the concurrency and, especially, parallelism that threads provide. - The alternative generally means dealing with non-deterministic bugs - and data corruption. - -The same layers apply to each Python program, with some extra layers -specific to Python:: - - host - process - Python runtime - interpreter - Python thread (runs bytecode) - -When a Python program starts, it looks exactly like that, with one -of each. The process has a single global runtime to manage Python's -process-global resources. The runtime may grow to include multiple -interpreters and each interpreter may grow to include multiple Python -threads. The initial interpreter is known as the "main" interpreter, -and the initial thread, where the runtime was initialized, is known -as the "main" thread. 
- -An interpreter completely encapsulates all of the non-process-global -runtime state that the interpreter's Python threads share. For example, -all its threads share :data:`sys.modules`, but each interpreter has its -own :data:`sys.modules`. + the kind of raw concurrency that threads provide. The alternative + to the required discipline generally means dealing with + non-deterministic bugs and data corruption. + +Python Runtime Model +-------------------- + +The same conceptual layers apply to each Python program, with some +extra data layers specific to Python:: + + **host machine** + **process** (global resources) + globl runtime (*state*) + interpreter (*state*) + **thread** (runs "C-API" and Python bytecode) + thread *state* + +At the conceptual level: when a Python program starts, it looks exactly +like that diagram, with one of each. The runtime may grow to include +multiple interpreters, and each interpreter may grow to include +multiple thread states. .. note:: - The interpreter here is not the same as the "bytecode interpreter", - which is what regularly runs in threads, executing compiled Python code. + A Python implementation won't necessarily implement the runtime + layers distinctly or even concretely. The only exception is places + where distinct layers are directly specified or exposed to users, + like through the :mod:`threading` module. -A Python thread represents the state necessary for the Python runtime -to *run* in an OS thread. It also represents the execution of Python -code (or any supported C-API) in that OS thread. Depending on the -implementation, this probably includes the current exception and -the Python call stack. The Python thread always identifies the -interpreter it belongs to, meaning the state it shares -with other threads. +.. note:: + + The initial interpreter is typically called the "main" interpreter. + Some Python implementations, like CPython, assign special roles + to the main interpreter. 
+ + Likewise, the host thread where the runtime was initialized is known + as the "main" thread. It may be different from the process' initial + thread, though they are often the same. In some cases "main thread" + may be even more specific and refer to the initial thread state. + A Python runtime might assign specific responsibilities + to the main thread, such as handling signals. + +As a whole, the Python runtime consists of the global runtime state, +interpreters, and thread states. The runtime ensures all that state +stays consistent over its lifetime, particularly when used with +multiple host threads. The runtime also exposes a way for host threads +to "call into Python", which will be covered in the next subsection. + +The global runtime, at the conceptual level, is just a set of +interpreters. While they are otherwise isolated and independent from +one another, they may share some data or other resources. The runtime +is responsible for managing these global resources safely. The actual +nature and management of these resources is implementation-specific. +Ultimately, the external utility of the global runtime is limited +to managing interpreters. + +In contrast, an "interpreter" is conceptually what we would normally +think of as the (full-featured) "Python runtime". When machine code +executing in a host thread interacts with the Python runtime, it calls +into Python in the context of a specific interpreter. .. note:: - Here "Python thread" does not necessarily refer to a thread created - using the :mod:`threading` module. + The term "interpreter" here is not the same as the "bytecode + interpreter", which is what regularly runs in threads, executing + compiled Python code. + + In an ideal world, "Python runtime" would refer to what we currently + call "interpreter". However, it's been called "interpreter" at least + since introduced in 1997 (a027efa5b). 
+ +Each interpreter completely encapsulates all of the non-process-global, +non-thread-specific state needed for the Python runtime to work. +Notably, the interpreter's state persists between uses. It includes +fundamental data like :data:`sys.modules`. The runtime ensures +multiple threads using the same interpreter will safely +share it between them. + +A Python implementation may support using multiple interpreters at the +same time in the same process. They are independent and isolated from +one another. For example, each interpreter has its own +:data:`sys.modules`. + +For thread-specific runtime state, each interpreter has a set of thread +states, which it manages, in the same way the global runtime contains +a set of interpreters. It can have thread states for as many host +threads as it needs. It may even have multiple thread states for +the same host thread, though that isn't as common. + +Each thread state, conceptually, has all the thread-specific runtime +data an interpreter needs to operate in one host thread. The thread +state includes the current raised exception and the thread's Python +call stack. It may include other thread-specific resources. + +.. note:: -Each Python thread is associated with a single OS thread, which is where -it can run. In the opposite direction, a single OS thread can have many -Python threads associated with it. However, only one of those Python -threads is "active" in the OS thread at time. The runtime will operate -in the OS thread relative to the active Python thread. + The term "Python thread" can sometimes refer to a thread state, but + normally it means a thread created using the :mod:`threading` module. -For an interpreter to be used in an OS thread, it must have a -corresponding active Python thread. Thus switching between interpreters -means changing the active Python thread. An interpreter can have Python -threads, active or inactive, for as many OS threads as it needs. 
It may -even have multiple Python threads for the same OS thread, though at most -one can be active at a time. +Each thread state, over its lifetime, is always tied to exactly one +interpreter and exactly one host thread. It will only ever be used in +that thread. In the other direction, a host thread may have many +Python thread states tied to it, for different interpreters. Once a program is running, new Python threads can be created using the :mod:`threading` module (on platforms and Python implementations that @@ -501,7 +571,42 @@ support threads). Additional processes can be created using the :mod:`os`, :mod:`subprocess`, and :mod:`multiprocessing` modules. You can run coroutines (async) in the main thread using :mod:`asyncio`. Interpreters can be created and used with the -:mod:`concurrent.interpreters` module. +:mod:`~concurrent.interpreters` module. + +Calls into Python +----------------- + +A "call into Python" is an abstraction of "ask the Python runtime +to do something". It necessarily involves targeting a single runtime +context, whether global, interpreter, or thread. The layer depends +on the desired operation. Most operations require a thread state. + +When a running host thread calls into Python, the actual mechanism +is implementation-specific. For example, CPython provides a C-API and +the thread will literally call into Python through a C-API function. + +.. drop paragraph? + +Some thread-specific operations must only target a new thread state, +while others may target any thread state, including one with a Python +call already on its stack or a current exception set. + +A thread-specific call into Python can target only one thread state. +That means, when there are multiple Python thread states tied to the +current host thread, only one of them can be in use at a time. It +doesn't matter if the thread states belong to different interpreters +or the same interpreter. + +Calls into Python can be nested. 
Even if a thread has already called +into Python, that operation could be interrupted by another call into +Python targeting a different runtime context. For example, the +implementation of the outer call might make the inner call directly. +Alternately, the host or Python runtime might trigger some +asyncronous callback that calls into Python. + +Regardless, at the point of the inner call, the target is swapped. +When the inner call finishes, the target is swapped back and the outer +call resumes. .. rubric:: Footnotes From b58a95c283ba2ce1d127e25ac3be8960cd0ec954 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Thu, 25 Sep 2025 17:37:32 -0600 Subject: [PATCH 15/23] Drop "call into Python" discussion. --- Doc/reference/executionmodel.rst | 42 +++----------------------------- 1 file changed, 4 insertions(+), 38 deletions(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index abe0ed4e879ba9..3b1a208569429e 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -506,8 +506,7 @@ multiple thread states. As a whole, the Python runtime consists of the global runtime state, interpreters, and thread states. The runtime ensures all that state stays consistent over its lifetime, particularly when used with -multiple host threads. The runtime also exposes a way for host threads -to "call into Python", which will be covered in the next subsection. +multiple host threads. The global runtime, at the conceptual level, is just a set of interpreters. While they are otherwise isolated and independent from @@ -563,7 +562,9 @@ call stack. It may include other thread-specific resources. Each thread state, over its lifetime, is always tied to exactly one interpreter and exactly one host thread. It will only ever be used in that thread. In the other direction, a host thread may have many -Python thread states tied to it, for different interpreters. 
+Python thread states tied to it, for different interpreters or even the +same interpreter. However, for any given host thread, only one of the +thread states tied to it can be used by the thread at a time. Once a program is running, new Python threads can be created using the :mod:`threading` module (on platforms and Python implementations that @@ -573,41 +574,6 @@ You can run coroutines (async) in the main thread using :mod:`asyncio`. Interpreters can be created and used with the :mod:`~concurrent.interpreters` module. -Calls into Python ------------------ - -A "call into Python" is an abstraction of "ask the Python runtime -to do something". It necessarily involves targeting a single runtime -context, whether global, interpreter, or thread. The layer depends -on the desired operation. Most operations require a thread state. - -When a running host thread calls into Python, the actual mechanism -is implementation-specific. For example, CPython provides a C-API and -the thread will literally call into Python through a C-API function. - -.. drop paragraph? - -Some thread-specific operations must only target a new thread state, -while others may target any thread state, including one with a Python -call already on its stack or a current exception set. - -A thread-specific call into Python can target only one thread state. -That means, when there are multiple Python thread states tied to the -current host thread, only one of them can be in use at a time. It -doesn't matter if the thread states belong to different interpreters -or the same interpreter. - -Calls into Python can be nested. Even if a thread has already called -into Python, that operation could be interrupted by another call into -Python targeting a different runtime context. For example, the -implementation of the outer call might make the inner call directly. -Alternately, the host or Python runtime might trigger some -asyncronous callback that calls into Python. 
- -Regardless, at the point of the inner call, the target is swapped. -When the inner call finishes, the target is swapped back and the outer -call resumes. - .. rubric:: Footnotes From f05848ce337511b608f92c04da2b9fd06ae66d8b Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Mon, 29 Sep 2025 09:44:04 -0600 Subject: [PATCH 16/23] Fix literal block. --- Doc/reference/executionmodel.rst | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index 3b1a208569429e..4fb106a79df407 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -410,11 +410,11 @@ Python's execution model does not operate in a vacuum. It runs on a host machine and through that host's runtime environment, including its operating system (OS), if there is one. When a program runs, the conceptual layers of how it runs on the host look something -like this:: +like this: - **host machine** - **process** (global resources) - **thread** (runs machine code) + | **host machine** + | **process** (global resources) + | **thread** (runs machine code) Each process represents a program running on the host. Think of each process itself as the data part of its program. Think of the process' @@ -469,14 +469,14 @@ Python Runtime Model -------------------- The same conceptual layers apply to each Python program, with some -extra data layers specific to Python:: - - **host machine** - **process** (global resources) - globl runtime (*state*) - interpreter (*state*) - **thread** (runs "C-API" and Python bytecode) - thread *state* +extra data layers specific to Python: + + | **host machine** + | **process** (global resources) + | globl runtime (*state*) + | interpreter (*state*) + | **thread** (runs "C-API" and Python bytecode) + | thread *state* At the conceptual level: when a Python program starts, it looks exactly like that diagram, with one of each. 
The runtime may grow to include From 6304a23e74cad3eca316baf46a97f2b997e9e535 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Mon, 29 Sep 2025 09:50:28 -0600 Subject: [PATCH 17/23] Fix Python layers. --- Doc/reference/executionmodel.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index 4fb106a79df407..93f8e9d541ee16 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -473,10 +473,10 @@ extra data layers specific to Python: | **host machine** | **process** (global resources) - | globl runtime (*state*) - | interpreter (*state*) - | **thread** (runs "C-API" and Python bytecode) - | thread *state* + | Python global runtime (*state*) + | Python interpreter (*state*) + | **thread** (runs Python bytecode and "C-API") + | Python thread *state* At the conceptual level: when a Python program starts, it looks exactly like that diagram, with one of each. The runtime may grow to include From e9c946f8d2e46f93e3652b11a319d89567593237 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Mon, 29 Sep 2025 09:58:25 -0600 Subject: [PATCH 18/23] Fix an ambiguous sentence. --- Doc/reference/executionmodel.rst | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index 93f8e9d541ee16..2321091a31f775 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -509,12 +509,12 @@ stays consistent over its lifetime, particularly when used with multiple host threads. The global runtime, at the conceptual level, is just a set of -interpreters. While they are otherwise isolated and independent from -one another, they may share some data or other resources. The runtime -is responsible for managing these global resources safely. The actual -nature and management of these resources is implementation-specific.
-Ultimately, the external utility of the global runtime is limited -to managing interpreters. +interpreters. While those interpreters are otherwise isolated and +independent from one another, they may share some data or other +resources. The runtime is responsible for managing these global +resources safely. The actual nature and management of these resources +is implementation-specific. Ultimately, the external utility of the +global runtime is limited to managing interpreters. In contrast, an "interpreter" is conceptually what we would normally think of as the (full-featured) "Python runtime". When machine code From 8de5e0a285d97ec7e2d46db36936957172998cbc Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Mon, 29 Sep 2025 10:00:28 -0600 Subject: [PATCH 19/23] Fix an ambiguous commit hash. --- Doc/reference/executionmodel.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index 2321091a31f775..9a59cec001aa3d 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -529,7 +529,7 @@ into Python in the context of a specific interpreter. In an ideal world, "Python runtime" would refer to what we currently call "interpreter". However, it's been called "interpreter" at least - since introduced in 1997 (a027efa5b). + since introduced in 1997 (CPython:a027efa5b). Each interpreter completely encapsulates all of the non-process-global, non-thread-specific state needed for the Python runtime to work. From b81dbd26ce72e85fcea4f0e633a90e144b190c08 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Mon, 29 Sep 2025 10:15:32 -0600 Subject: [PATCH 20/23] Clarify about thread state independence. 
--- Doc/reference/executionmodel.rst | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index 9a59cec001aa3d..59ff8d0b5b9180 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -561,10 +561,16 @@ call stack. It may include other thread-specific resources. Each thread state, over its lifetime, is always tied to exactly one interpreter and exactly one host thread. It will only ever be used in -that thread. In the other direction, a host thread may have many -Python thread states tied to it, for different interpreters or even the -same interpreter. However, for any given host thread, only one of the -thread states tied to it can be used by the thread at a time. +that thread and with that interpreter. + +In the other direction, a host thread may have many Python thread states +tied to it, for different interpreters or even the same interpreter. +However, for any given host thread, only one of the thread states +tied to it can be used by the thread at a time. + +Thread states are isolated and independent from one another and don't +share any data, except for possibly sharing an interpreter and objects +or other resources belonging to that interpreter. Once a program is running, new Python threads can be created using the :mod:`threading` module (on platforms and Python implementations that From cd144def1252e56c861dd61565ec50bf8ba3bd55 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Mon, 29 Sep 2025 10:23:24 -0600 Subject: [PATCH 21/23] Clarify about multiple thread states per host thread. 
--- Doc/reference/executionmodel.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index 59ff8d0b5b9180..0130192e28d96a 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -563,10 +563,10 @@ Each thread state, over its lifetime, is always tied to exactly one interpreter and exactly one host thread. It will only ever be used in that thread and with that interpreter. -In the other direction, a host thread may have many Python thread states -tied to it, for different interpreters or even the same interpreter. -However, for any given host thread, only one of the thread states -tied to it can be used by the thread at a time. +Multiple thread states may be tied to the same host thread, whether for +different interpreters or even the same interpreter. However, for any +given host thread, only one of the thread states tied to it can be used +by the thread at a time. Thread states are isolated and independent from one another and don't share any data, except for possibly sharing an interpreter and objects From 582b9248b5478e3993e4cd98d9f329222d0ce76d Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Tue, 30 Sep 2025 10:33:30 -0600 Subject: [PATCH 22/23] Clarify about asyncio. --- Doc/reference/executionmodel.rst | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index 0130192e28d96a..ebc2601d843383 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -576,9 +576,10 @@ Once a program is running, new Python threads can be created using the :mod:`threading` module (on platforms and Python implementations that support threads). Additional processes can be created using the :mod:`os`, :mod:`subprocess`, and :mod:`multiprocessing` modules. -You can run coroutines (async) in the main thread using :mod:`asyncio`. 
Interpreters can be created and used with the -:mod:`~concurrent.interpreters` module. +:mod:`~concurrent.interpreters` module. Coroutines (async) can +be run using :mod:`asyncio` in each interpreter, typically only +in a single thread (often the main thread). .. rubric:: Footnotes From 78e4bbc6ced87ac77b3f2ca63ec91942c2485c21 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Tue, 30 Sep 2025 10:40:30 -0600 Subject: [PATCH 23/23] Add a link to the commit on github. --- Doc/reference/executionmodel.rst | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/Doc/reference/executionmodel.rst b/Doc/reference/executionmodel.rst index ebc2601d843383..639c232571edf3 100644 --- a/Doc/reference/executionmodel.rst +++ b/Doc/reference/executionmodel.rst @@ -529,7 +529,9 @@ into Python in the context of a specific interpreter. In an ideal world, "Python runtime" would refer to what we currently call "interpreter". However, it's been called "interpreter" at least - since introduced in 1997 (CPython:a027efa5b). + since introduced in 1997 (`CPython:a027efa5b`_). + + .. _CPython:a027efa5b: https://github.com/python/cpython/commit/a027efa5b Each interpreter completely encapsulates all of the non-process-global, non-thread-specific state needed for the Python runtime to work.