Commit 985dc7f
Initial implementation of parallel type checking (#20280)
Fixes #933
This is not very polished, but it is a fully functional implementation.
It gives a ~1.5x performance improvement for self-check. I think we can
keep this feature hidden while we iterate on it. At a very high level:
we start `n` workers, each of which loads the graph; the coordinator
process then submits SCCs one by one as they become unblocked by
dependencies. Workers use the regular cache to get information about
SCCs processed by other workers. There are more details in the docstring
for `worker.py`.
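
To illustrate the scheduling idea above, here is a minimal sketch of submitting SCCs as they become unblocked. It is not the code from this commit: `schedule_sccs`, the string SCC ids, and the synchronous "rounds" are all assumptions made for illustration. The real coordinator talks to worker processes over IPC and does not wait for a whole batch to finish.

```python
from collections import deque


def schedule_sccs(deps: dict[str, set[str]], n_workers: int) -> list[list[str]]:
    """Simulate dispatching SCCs to at most n_workers per round.

    deps maps a (hypothetical) SCC id to the ids it depends on. Every
    dependency is assumed to also appear as a key in deps.
    Returns the batches submitted in each simulated round.
    """
    remaining = {scc: set(d) for scc, d in deps.items()}
    # Reverse edges: which SCCs are waiting on a given SCC.
    dependents: dict[str, list[str]] = {scc: [] for scc in deps}
    for scc, scc_deps in deps.items():
        for dep in scc_deps:
            dependents[dep].append(scc)

    # SCCs with no dependencies can be submitted immediately.
    ready = deque(sorted(scc for scc, d in remaining.items() if not d))
    rounds: list[list[str]] = []
    processed = 0
    while ready:
        batch = [ready.popleft() for _ in range(min(n_workers, len(ready)))]
        rounds.append(batch)
        processed += len(batch)
        # Pretend the workers finished this batch; unblock dependents.
        unblocked = []
        for scc in batch:
            for dependent in dependents[scc]:
                remaining[dependent].discard(scc)
                if not remaining[dependent]:
                    unblocked.append(dependent)
        ready.extend(sorted(unblocked))
    if processed != len(deps):
        raise ValueError("dependency cycle between SCCs")
    return rounds


example = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
print(schedule_sccs(example, n_workers=2))
```

In this toy graph, `b` and `c` can be checked in parallel once `a` is done, while `d` has to wait for both.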
Some notes:
* I moved some code around, so that we can share as many things as
possible between the daemon IPC and the worker IPC.
* For now I use a hybrid binary-JSON format for messages, but this is
temporary. I am going to switch to a proper binary fixed format soon.
* Windows is not supported yet; the missing part is
`def ready_to_read(conns: list[IPCClient]) -> list[int]`.
* Right now workers use default stdout/stderr. This is easier for
debugging, but I think we may switch to writing to a log file at some
point (like the daemon does).
* I add a GC freeze trick for initial graph loading. It is very similar
to the GC freeze trick for warm runs. I don't see any visible increase in
memory use, and it gives an 8-10% speedup (even for single-process runs).
Note that I disable it in tests, since we run each test in the same
process.
* Testing in general was the trickiest part. There are various implicit
assumptions that don't work for parallel checking. I use environment
variables to "propagate" those assumptions.
* I add two CI jobs (regular and compiled) that run ~60% of all tests
with 4 parallel workers. For now I skip 15 tests in parallel mode (all
because of some incremental mode bugs):
- Two because of a crash on star imports in import cycles (similar to
#11025)
- One because of an inconsistency when re-exporting `__all__`
- A few tests because of inconsistent formatting of overloaded
constructors in error messages
- A few tests because of a problem with `foo defined here` notes, see
#4772
* We should support reports in parallel mode, probably by having workers
write reports after processing each SCC.
* We should probably switch `mypy/ipc.py` to using `librt.base64`.
This may not be critical now, but it will be important with the new
parser, when we will be sending larger chunks of data over the sockets.
* We need to figure out and decide what exactly will be the role of
parallel workers in fine-grained mode.
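
The GC freeze trick mentioned in the notes above can be sketched roughly as follows. This is an illustration under assumptions, not the code from this commit: `load_with_gc_freeze` and the `loader` callback are hypothetical names, and the exact sequence of `gc` calls used in mypy may differ.

```python
import gc
from typing import Callable, TypeVar

T = TypeVar("T")


def load_with_gc_freeze(loader: Callable[[], T]) -> T:
    """Run loader() with the cyclic GC paused, then freeze what it built.

    Pausing the collector avoids repeated GC passes while many long-lived
    objects are being allocated. gc.freeze() then moves all currently
    tracked objects to a permanent generation that later collections skip.
    """
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        result = loader()
        gc.freeze()
    finally:
        # Restore the previous collector state even if loading failed.
        if was_enabled:
            gc.enable()
    return result


# Hypothetical stand-in for loading a large module graph.
graph = load_with_gc_freeze(lambda: {i: [i] for i in range(1000)})
```

Because frozen objects stay frozen for the rest of the process, a wrapper like this is a poor fit when many independent runs share one process, which matches the note about disabling the trick in tests.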
I am going to address some of the above issues, and re-enable tests
gradually in follow-up PRs. Longer term, there are three main areas
for further improvements:
* Parallelizing parsing, not just the type-checking. It looks like this
is currently the main bottleneck.
* Improving SCCs packing, by splitting type-checking into public
interface phase and implementations phase. We can notify the coordinator
after the first phase.
* Switching to lazy-loading the cache. This will become important as we
address the other two bottlenecks and are able to use more workers.

1 parent ad0f41e commit 985dc7f
File tree
33 files changed (+759, -201 lines)
- .github/workflows
- mypy
- build_worker
- dmypy
- test
- test-data/unit
- lib-stub