gh-141858: Speed up Objects/tupleobject.c richcompare with early identity and length checks
#142027
    if (v == w) {
        Py_RETURN_RICHCOMPARE(0, 0, op);
    }
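To make the effect of the check above concrete, here is a minimal pure-Python model of a tuple comparison with an early identity exit and an early length exit. It is a sketch, not the code in `Objects/tupleobject.c`: the name `tuple_richcompare_model`, the `_OPS` table, and the assumption that the length exit applies only to `==` and `!=` are illustrative and are not taken from the PR.

```python
import operator

# Map each rich-comparison operator to the matching function, mirroring what
# Py_RETURN_RICHCOMPARE(a, b, op) does for two plain C integers.
_OPS = {
    "<": operator.lt, "<=": operator.le, "==": operator.eq,
    "!=": operator.ne, ">": operator.gt, ">=": operator.ge,
}


def tuple_richcompare_model(v, w, op):
    """Hypothetical model of tuple comparison with two early exits."""
    # Early identity exit: the same object compares like 0 versus 0.
    if v is w:
        return _OPS[op](0, 0)
    # Early length exit (assumed to cover only == and !=): tuples of
    # different sizes can never be equal, so no item needs to be inspected.
    if len(v) != len(w) and op in ("==", "!="):
        return op == "!="
    # Slow path: look for the first position where the items differ.
    # Items that are the same object count as equal, as in CPython.
    for a, b in zip(v, w):
        if a is b or a == b:
            continue
        if op == "==":
            return False
        if op == "!=":
            return True
        return _OPS[op](a, b)
    # No differing item in the common prefix: the lengths decide.
    return _OPS[op](len(v), len(w))


if __name__ == "__main__":
    t = (float("nan"), 1)
    for op in _OPS:
        print(op, tuple_richcompare_model(t, t, op))
```

On identical operands the model answers as if it had compared 0 with 0, which is what `Py_RETURN_RICHCOMPARE(0, 0, op)` expresses: true for `==`, `<=`, `>=` and false for `!=`, `<`, `>`. The slow path would reach the same answers, only after visiting every item.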
I'm aware of https://bugs.python.org/issue30907 but I think it was premature:
- pyperformance was much less mature then,
- it was hard to go beyond simple microbenchmarks,
- there were no quantifiable arguments,
- there was generally less focus on performance in the language.
That said, even 25d53eb, without the identity check, is great (9f2a34a is main):
The `pyperformance compare` output:
pyperformance.9f2a34af747.json
==============================
Performance version: 1.13.0
Report on Linux-6.12.57+deb13-amd64-x86_64-with-glibc2.41
Number of logical CPUs: 24
Start date: 2025-11-26 20:05:17.126618
End date: 2025-11-26 21:43:38.389462
pyperformance.25d53ebaea3.json
==============================
Performance version: 1.13.0
Report on Linux-6.12.57+deb13-amd64-x86_64-with-glibc2.41
Number of logical CPUs: 24
Start date: 2025-11-27 14:00:59.954246
End date: 2025-11-27 15:54:44.937969
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| Benchmark | pyperformance.9f2a34af747.json | pyperformance.25d53ebaea3.json | Change | Significance |
+==================================+================================+================================+==============+========================+
| 2to3 | 207 ms | 207 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_generators | 289 ms | 286 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_cpu_io_mixed | 394 ms | 392 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_cpu_io_mixed_tg | 395 ms | 394 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_eager | 83.3 ms | 83.3 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_eager_cpu_io_mixed | 336 ms | 333 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_eager_cpu_io_mixed_tg | 368 ms | 366 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_eager_io | 422 ms | 425 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_eager_io_tg | 419 ms | 421 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_eager_memoization | 162 ms | 161 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_eager_memoization_tg | 205 ms | 205 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_eager_tg | 156 ms | 155 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_io | 448 ms | 450 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_io_tg | 429 ms | 431 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_memoization | 222 ms | 221 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_memoization_tg | 231 ms | 232 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_none | 192 ms | 194 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_none_tg | 191 ms | 192 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| asyncio_tcp | 252 ms | 277 ms | 1.10x slower | Significant (t=-35.30) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| asyncio_tcp_ssl | 1.21 sec | 1.24 sec | 1.03x slower | Significant (t=-82.44) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| asyncio_websockets | 357 ms | 357 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| bench_mp_pool | 13.2 ms | 12.2 ms | 1.08x faster | Significant (t=9.85) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| bench_thread_pool | 750 us | 674 us | 1.11x faster | Significant (t=113.01) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| bpe_tokeniser | 3.58 sec | 3.64 sec | 1.02x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| chameleon | 11.9 ms | 11.8 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| chaos | 43.8 ms | 44.1 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| comprehensions | 13.0 us | 12.3 us | 1.06x faster | Significant (t=8.87) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| coroutines | 17.1 ms | 17.0 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| coverage | 58.5 ms | 57.9 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| create_gc_cycles | 2.09 ms | 2.03 ms | 1.03x faster | Significant (t=8.72) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| crypto_pyaes | 58.3 ms | 57.7 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| dask | 682 ms | 531 ms | 1.28x faster | Significant (t=299.38) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| deepcopy | 193 us | 196 us | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| deepcopy_memo | 19.1 us | 19.3 us | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| deepcopy_reduce | 2.13 us | 2.16 us | 1.02x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| deltablue | 2.42 ms | 2.42 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| django_template | 28.3 ms | 28.1 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| docutils | 2.11 sec | 2.12 sec | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| dulwich_log | 43.6 ms | 43.6 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| fannkuch | 273 ms | 268 ms | 1.02x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| float | 56.1 ms | 53.8 ms | 1.04x faster | Significant (t=8.11) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| gc_traversal | 4.15 ms | 4.04 ms | 1.03x faster | Significant (t=15.82) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| generators | 22.4 ms | 21.7 ms | 1.04x faster | Significant (t=21.28) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| genshi_text | 17.4 ms | 17.5 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| genshi_xml | 39.5 ms | 38.7 ms | 1.02x faster | Significant (t=10.11) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| go | 89.9 ms | 89.4 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| hexiom | 4.56 ms | 4.49 ms | 1.02x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| html5lib | 49.5 ms | 50.0 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| json_dumps | 7.49 ms | 7.29 ms | 1.03x faster | Significant (t=36.97) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| json_loads | 18.4 us | 18.8 us | 1.03x slower | Significant (t=-16.34) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| logging_format | 4.88 us | 4.87 us | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| logging_silent | 69.2 ns | 67.5 ns | 1.03x faster | Significant (t=26.85) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| logging_simple | 4.40 us | 4.37 us | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| mako | 7.96 ms | 7.95 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| many_optionals | 862 us | 875 us | 1.02x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| mdp | 939 ms | 954 ms | 1.02x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| meteor_contest | 94.4 ms | 95.1 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| nbody | 68.1 ms | 69.9 ms | 1.03x slower | Significant (t=-11.08) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| nqueens | 69.1 ms | 72.1 ms | 1.04x slower | Significant (t=-56.52) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pathlib | 10.0 ms | 9.94 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pickle | 13.7 us | 13.7 us | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pickle_dict | 25.0 us | 24.3 us | 1.03x faster | Significant (t=23.80) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pickle_list | 3.78 us | 3.89 us | 1.03x slower | Significant (t=-13.22) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pickle_pure_python | 241 us | 241 us | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pidigits | 184 ms | 184 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pprint_pformat | 1.12 sec | 1.16 sec | 1.04x slower | Significant (t=-31.84) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pprint_safe_repr | 547 ms | 570 ms | 1.04x slower | Significant (t=-31.75) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pyflate | 316 ms | 311 ms | 1.02x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| python_startup | 10.8 ms | 10.8 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| python_startup_no_site | 6.34 ms | 6.34 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| raytrace | 212 ms | 213 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| regex_compile | 98.5 ms | 99.4 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| regex_dna | 162 ms | 155 ms | 1.04x faster | Significant (t=58.02) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| regex_effbot | 2.26 ms | 2.20 ms | 1.03x faster | Significant (t=33.39) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| regex_v8 | 18.4 ms | 18.3 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| richards | 33.6 ms | 33.4 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| richards_super | 38.2 ms | 37.8 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| scimark_fft | 197 ms | 198 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| scimark_lu | 66.5 ms | 67.2 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| scimark_monte_carlo | 44.0 ms | 43.6 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| scimark_sor | 74.9 ms | 74.2 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| scimark_sparse_mat_mult | 2.99 ms | 3.10 ms | 1.04x slower | Significant (t=-41.64) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| spectral_norm | 63.7 ms | 63.6 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sphinx | 792 ms | 790 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sqlalchemy_declarative | 108 ms | 109 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sqlalchemy_imperative | 13.2 ms | 13.3 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sqlglot_v2_normalize | 83.2 ms | 83.1 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sqlglot_v2_optimize | 41.9 ms | 41.7 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sqlglot_v2_parse | 969 us | 987 us | 1.02x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sqlglot_v2_transpile | 1.25 ms | 1.26 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sqlite_synth | 1.98 us | 1.97 us | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| subparsers | 31.3 ms | 31.0 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sympy_expand | 360 ms | 361 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sympy_integrate | 16.0 ms | 16.2 ms | 1.02x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sympy_str | 210 ms | 212 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sympy_sum | 110 ms | 110 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| telco | 117 ms | 117 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| tomli_loads | 1.51 sec | 1.54 sec | 1.02x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| tornado_http | 76.7 ms | 76.7 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| typing_runtime_protocols | 121 us | 121 us | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| unpack_sequence | 31.7 ns | 34.0 ns | 1.07x slower | Significant (t=-54.16) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| unpickle | 10.6 us | 10.3 us | 1.03x faster | Significant (t=10.82) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| unpickle_list | 3.92 us | 3.84 us | 1.02x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| unpickle_pure_python | 156 us | 158 us | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| xdsl_constant_fold | 35.9 ms | 35.8 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| xml_etree_generate | 67.2 ms | 68.0 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| xml_etree_iterparse | 66.3 ms | 65.8 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| xml_etree_parse | 106 ms | 105 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| xml_etree_process | 47.3 ms | 46.6 ms | 1.02x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
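The Change column can be read as a plain ratio of the two timings in each row. The sketch below assumes exactly that (the slower time divided by the faster one, formatted to two decimals). `change_label` is a hypothetical helper, not a pyperformance function. Because the table shows rounded timings, rows that are nearly equal may not reproduce to the last digit.

```python
def change_label(base, changed):
    """Describe `changed` relative to `base`; both timings share one unit."""
    if changed < base:
        return f"{base / changed:.2f}x faster"
    if changed > base:
        return f"{changed / base:.2f}x slower"
    return "no change"


# Rounded timings copied from three rows of the table above: (base, changed).
rows = {
    "dask": (682, 531),               # ms
    "asyncio_tcp": (252, 277),        # ms
    "unpack_sequence": (31.7, 34.0),  # ns
}

if __name__ == "__main__":
    for name, (base, changed) in rows.items():
        print(f"{name}: {change_label(base, changed)}")
```

For these three rows the ratios round to 1.28x faster, 1.10x slower, and 1.07x slower, which agrees with the Change column.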
Particularly:
The `pyperformance compare` output:
pyperformance.25d53ebaea3.json
==============================
Performance version: 1.13.0
Report on Linux-6.12.57+deb13-amd64-x86_64-with-glibc2.41
Number of logical CPUs: 24
Start date: 2025-11-27 14:00:59.954246
End date: 2025-11-27 15:54:44.937969
pyperformance.9ba81106b1d.json
==============================
Performance version: 1.13.0
Report on Linux-6.12.57+deb13-amd64-x86_64-with-glibc2.41
Number of logical CPUs: 24
Start date: 2025-11-27 03:09:32.694677
End date: 2025-11-27 05:02:22.084653
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| Benchmark | pyperformance.25d53ebaea3.json | pyperformance.9ba81106b1d.json | Change | Significance |
+==================================+================================+================================+==============+========================+
| 2to3 | 207 ms | 206 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_generators | 286 ms | 290 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_cpu_io_mixed | 392 ms | 392 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_cpu_io_mixed_tg | 394 ms | 394 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_eager | 83.3 ms | 82.6 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_eager_cpu_io_mixed | 333 ms | 333 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_eager_cpu_io_mixed_tg | 366 ms | 366 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_eager_io | 425 ms | 423 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_eager_io_tg | 421 ms | 422 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_eager_memoization | 161 ms | 160 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_eager_memoization_tg | 205 ms | 206 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_eager_tg | 155 ms | 156 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_io | 450 ms | 453 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_io_tg | 431 ms | 432 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_memoization | 221 ms | 222 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_memoization_tg | 232 ms | 233 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_none | 194 ms | 193 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| async_tree_none_tg | 192 ms | 192 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| asyncio_tcp | 277 ms | 274 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| asyncio_tcp_ssl | 1.24 sec | 1.25 sec | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| asyncio_websockets | 357 ms | 358 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| bench_mp_pool | 12.2 ms | 12.4 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| bench_thread_pool | 674 us | 671 us | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| bpe_tokeniser | 3.64 sec | 3.58 sec | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| chameleon | 11.8 ms | 11.7 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| chaos | 44.1 ms | 44.1 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| comprehensions | 12.3 us | 12.3 us | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| connected_components | 359 ms | 358 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| coroutines | 17.0 ms | 17.0 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| coverage | 57.9 ms | 55.3 ms | 1.05x faster | Significant (t=35.65) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| create_gc_cycles | 2.03 ms | 2.03 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| crypto_pyaes | 57.7 ms | 58.2 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| dask | 531 ms | 526 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| deepcopy | 196 us | 193 us | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| deepcopy_memo | 19.3 us | 18.6 us | 1.03x faster | Significant (t=54.44) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| deepcopy_reduce | 2.16 us | 2.13 us | 1.02x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| deltablue | 2.42 ms | 2.42 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| django_template | 28.1 ms | 28.0 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| docutils | 2.12 sec | 2.12 sec | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| dulwich_log | 43.6 ms | 43.7 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| fannkuch | 268 ms | 278 ms | 1.04x slower | Significant (t=-9.68) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| float | 53.8 ms | 54.1 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| gc_traversal | 4.04 ms | 4.06 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| generators | 21.7 ms | 21.4 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| genshi_text | 17.5 ms | 16.8 ms | 1.04x faster | Significant (t=16.98) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| genshi_xml | 38.7 ms | 38.0 ms | 1.02x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| go | 89.4 ms | 91.2 ms | 1.02x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| hexiom | 4.49 ms | 4.47 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| html5lib | 50.0 ms | 49.0 ms | 1.02x faster | Significant (t=15.54) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| json_dumps | 7.29 ms | 7.40 ms | 1.02x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| json_loads | 18.8 us | 19.0 us | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| k_core | 1.58 sec | 1.57 sec | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| logging_format | 4.87 us | 4.86 us | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| logging_silent | 67.5 ns | 67.4 ns | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| logging_simple | 4.37 us | 4.34 us | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| mako | 7.95 ms | 7.67 ms | 1.04x faster | Significant (t=12.51) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| many_optionals | 875 us | 871 us | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| mdp | 954 ms | 941 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| meteor_contest | 95.1 ms | 94.2 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| nbody | 69.9 ms | 69.3 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| nqueens | 72.1 ms | 70.0 ms | 1.03x faster | Significant (t=34.08) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pathlib | 9.94 ms | 9.83 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pickle | 13.7 us | 13.8 us | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pickle_dict | 24.3 us | 23.6 us | 1.03x faster | Significant (t=34.28) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pickle_list | 3.89 us | 3.78 us | 1.03x faster | Significant (t=13.83) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pickle_pure_python | 241 us | 240 us | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pidigits | 184 ms | 184 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pprint_pformat | 1.16 sec | 1.14 sec | 1.02x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pprint_safe_repr | 570 ms | 557 ms | 1.02x faster | Significant (t=16.23) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| pyflate | 311 ms | 316 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| python_startup | 10.8 ms | 10.8 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| python_startup_no_site | 6.34 ms | 6.34 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| raytrace | 213 ms | 215 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| regex_compile | 99.4 ms | 98.0 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| regex_dna | 155 ms | 155 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| regex_effbot | 2.20 ms | 2.19 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| regex_v8 | 18.3 ms | 18.0 ms | 1.02x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| richards | 33.4 ms | 33.0 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| richards_super | 37.8 ms | 37.5 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| scimark_fft | 198 ms | 196 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| scimark_lu | 67.2 ms | 70.2 ms | 1.04x slower | Significant (t=-22.40) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| scimark_monte_carlo | 43.6 ms | 44.4 ms | 1.02x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| scimark_sor | 74.2 ms | 75.2 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| scimark_sparse_mat_mult | 3.10 ms | 3.03 ms | 1.02x faster | Significant (t=20.57) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| shortest_path | 369 ms | 370 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| spectral_norm | 63.6 ms | 62.5 ms | 1.02x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sphinx | 790 ms | 786 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sqlalchemy_declarative | 109 ms | 108 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sqlalchemy_imperative | 13.3 ms | 13.2 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sqlglot_v2_normalize | 83.1 ms | 81.6 ms | 1.02x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sqlglot_v2_optimize | 41.7 ms | 41.3 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sqlglot_v2_parse | 987 us | 962 us | 1.03x faster | Significant (t=10.79) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sqlglot_v2_transpile | 1.26 ms | 1.24 ms | 1.02x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sqlite_synth | 1.97 us | 1.99 us | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| subparsers | 31.0 ms | 31.1 ms | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sympy_expand | 361 ms | 353 ms | 1.02x faster | Significant (t=33.14) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sympy_integrate | 16.2 ms | 16.0 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sympy_str | 212 ms | 208 ms | 1.02x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| sympy_sum | 110 ms | 109 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| telco | 117 ms | 114 ms | 1.03x faster | Significant (t=15.66) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| tomli_loads | 1.54 sec | 1.50 sec | 1.03x faster | Significant (t=17.51) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| tornado_http | 76.7 ms | 76.5 ms | 1.00x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| typing_runtime_protocols | 121 us | 121 us | 1.00x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| unpack_sequence | 34.0 ns | 31.3 ns | 1.09x faster | Significant (t=44.25) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| unpickle | 10.3 us | 10.4 us | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| unpickle_list | 3.84 us | 4.01 us | 1.04x slower | Significant (t=-30.06) |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| unpickle_pure_python | 158 us | 156 us | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| xdsl_constant_fold | 35.8 ms | 35.4 ms | 1.01x faster | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| xml_etree_generate | 68.0 ms | 68.7 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| xml_etree_iterparse | 65.8 ms | 66.1 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| xml_etree_parse | 105 ms | 106 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
| xml_etree_process | 46.6 ms | 47.0 ms | 1.01x slower | Not significant |
+----------------------------------+--------------------------------+--------------------------------+--------------+------------------------+
vstinner
left a comment
Can you add comparison tests to Lib/test/seq_tests.py? Especially a test with float("nan").
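A minimal sketch of the comparisons such a test might pin down. The function name `nan_tuple_comparisons` is hypothetical and this is not the test that was added to `Lib/test/seq_tests.py`. It relies on the fact that tuple comparison treats identical items as equal, so a tuple holding a NaN still compares equal to itself.

```python
def nan_tuple_comparisons():
    """Collect tuple comparison results involving float("nan").

    Hypothetical helper for illustration only.
    """
    nan = float("nan")
    t = (1, nan)
    return {
        # The same tuple object compared with itself.
        "same_tuple_eq": t == t,
        "same_tuple_ne": t != t,
        "same_tuple_lt": t < t,
        "same_tuple_le": t <= t,
        # Two distinct tuples that share the same nan object:
        # items are identical, so they count as equal.
        "shared_nan_eq": (1, nan) == (1, nan),
        # Two distinct nan objects: nan != nan, so the tuples differ.
        "distinct_nan_eq": (1, float("nan")) == (1, float("nan")),
    }
```

An identity fast path should leave all of these results as they are, since each self-comparison already short-circuits on item identity.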
@vstinner Voilà! More than happy to add more.
See #141858 for a wider discussion.
@picnixz Thank you for spotting this. I've missed this issue. Do you think it makes sense to split this PR into two? Especially given that the identity check is not responsible for the major (>= 1.05x) improvement here.
vstinner
left a comment
LGTM
Co-authored-by: Victor Stinner <vstinner@python.org>
I would still want to have other core devs' opinions on this one. While I agree that pyperformance wasn't as mature then as it is now, note that we have some tests that are definitely slower, such as unpickle and json.loads (AFAICT from the pyperformance benchmarks). I would like first to reach a consensus on this optimization in the issue rather than pushing this forward. In addition, we need to be sure that every Tier-1 platform exhibits the same speed-up. On Windows, we had slowdowns if we naively added those fast paths to tuple, list, and set.
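For reference, the two fast paths under discussion can be modelled in Python roughly as follows. This is an illustrative sketch, not the C code from the PR: the name `tuple_richcompare_model` and the use of `operator` functions to stand in for the `op` argument are assumptions made for the example.

```python
import operator


def tuple_richcompare_model(v, w, op):
    """Python model of tuple rich comparison with two early exits.

    `op` is one of operator.lt, le, eq, ne, gt, ge (an assumption of
    this sketch; the C function receives an integer opcode instead).
    """
    # Fast path 1: identity. Same object, so compare 0 with 0 under `op`.
    if v is w:
        return op(0, 0)

    # Fast path 2: for == and !=, tuples of different lengths can
    # never be equal, so no item needs to be inspected.
    if (op is operator.eq or op is operator.ne) and len(v) != len(w):
        return op is operator.ne

    # General path: find the first pair of items that differ.
    # Identical items count as equal, which is why NaN is not special here.
    for a, b in zip(v, w):
        if not (a is b or a == b):
            if op is operator.eq:
                return False
            if op is operator.ne:
                return True
            return op(a, b)

    # No differing item in the common prefix: the lengths decide.
    return op(len(v), len(w))
```

Whether these extra branches pay off on every platform is what the benchmarks in this thread are trying to establish; the model only shows that the results are unchanged.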
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
This is interesting. Usually it is the other way around: not all benefits we see in microbenchmarks show up in pyperformance ...
Maybe @mdboom can fire up the faster pyperformance suite on this? Or could somebody run the one from Meta? These two are the de facto gold standards when it comes to benchmarking :)
`==` and `!=` when tuple lengths differ, to avoid a walk through the whole tuple
See below for more detailed benchmarks (9f2a34a v. c901539)
Significant (+ >= 1.05) improvements on pyperformance:
Significant regressions (+ <= 1.05) on pyperformance
I'm not exactly sure, but my hunch is that's the place hit in `asyncio_tcp`:
cpython/Lib/asyncio/events.py
Lines 182 to 188 in 5ec03cf
I couldn't reproduce this with a microbenchmark, though.
The `float("nan")` behaviour is not changed:
Benchmark
The script:
The results (with `--rigorous`, on 0813448 v. 9f2a34a):
The environment:
`sudo ./python -m pyperf system tune` ensured.
pyperformance
The results:
pyperformance (without identity check)
Significant improvements
Significant regressions
`UNPACK_SEQUENCE_TUPLE` and `_LIST` shouldn't call `tuple_richcompare`; the difference is just 2.3 ns, which might be caused by jitter or by branch predictor history)
The results: