Commit e7764c8

SQLAlchemy: Remove support for legacy session.bulk_save_objects

> This method is a legacy feature as of the 2.0 series of SQLAlchemy.
> For modern bulk INSERT and UPDATE, see the sections ORM Bulk INSERT
> Statements and ORM Bulk UPDATE by Primary Key [at the SQLAlchemy docs]
>
> -- https://docs.sqlalchemy.org/orm/session_api.html#sqlalchemy.orm.Session.bulk_save_objects

The new `insertmanyvalues` feature is the successor. Performance
optimizations from `bulk_save()` have been made inherently part of
`add_all()`.

> A list of parameter dictionaries sent to the `Session.execute.params`
> parameter, separate from the Insert object itself, will invoke bulk
> INSERT mode for the statement, which essentially means the operation
> will optimize as much as possible for many rows.

-- https://docs.sqlalchemy.org/orm/queryguide/dml.html#orm-queryguide-bulk-insert
-- sqlalchemy/sqlalchemy#6935 (comment)
-- https://docs.sqlalchemy.org/core/connections.html#engine-insertmanyvalues

1 parent cabb2c2 commit e7764c8

2 files changed: +99 -4 lines changed

CHANGES.txt

Lines changed: 8 additions & 2 deletions
@@ -5,14 +5,20 @@ Changes for crate
 Unreleased
 ==========

-- SQLAlchemy: Support ``INSERT...VALUES`` with multiple value sets by enabling
+- SQLAlchemy Core: Support ``INSERT...VALUES`` with multiple value sets by enabling
   ``supports_multivalues_insert`` on the CrateDB dialect, it is used by pandas'
   ``method="multi"`` option

-- SQLAlchemy: Enable the ``insertmanyvalues`` feature, which lets you control
+- SQLAlchemy Core: Enable the ``insertmanyvalues`` feature, which lets you control
   the batch size of ``INSERT`` operations using the ``insertmanyvalues_page_size``
   engine-, connection-, and statement-options.

+- SQLAlchemy ORM: Remove support for the legacy ``session.bulk_save_objects`` API
+  on SQLAlchemy 2.0, in favor of the new ``insertmanyvalues`` feature. Performance
+  optimizations from ``bulk_save()`` have been made inherently part of ``add_all()``.
+  Note: The legacy mode will still work on SQLAlchemy 1.x, while SQLAlchemy 2.x users
+  MUST switch to the new method now.
+

 2023/03/02 0.30.1
 =================
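
The changelog entry on ``insertmanyvalues_page_size`` describes splitting a large ``INSERT`` into batches of bounded size. The following is only a minimal sketch of that batching idea in plain Python; the ``paginate`` helper is hypothetical and is not how SQLAlchemy implements the feature internally.

```python
from itertools import islice


def paginate(rows, page_size):
    """Yield successive lists holding at most `page_size` rows each.

    Hypothetical helper for illustration only; SQLAlchemy performs its
    own batching when `insertmanyvalues_page_size` is configured.
    """
    if page_size < 1:
        raise ValueError("page_size must be positive")
    iterator = iter(rows)
    while True:
        page = list(islice(iterator, page_size))
        if not page:
            return
        yield page


# Seven rows with a page size of three yield pages of 3, 3 and 1 rows.
rows = [(f"name-{i}", i) for i in range(7)]
page_lengths = [len(page) for page in paginate(rows, page_size=3)]
```

Each page would then be sent as one ``INSERT`` statement, so a smaller page size trades fewer parameters per statement for more round trips.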

src/crate/client/sqlalchemy/tests/bulk_test.py

Lines changed: 91 additions & 2 deletions
@@ -19,11 +19,14 @@
 # with Crate these terms will supersede the license and you may use the
 # software solely pursuant to the terms of the relevant commercial agreement.

-from unittest import TestCase
+from unittest import TestCase, skipIf
 from unittest.mock import patch, MagicMock

 import sqlalchemy as sa
 from sqlalchemy.orm import Session
+
+from crate.client.sqlalchemy.sa_version import SA_VERSION, SA_2_0
+
 try:
     from sqlalchemy.orm import declarative_base
 except ImportError:
@@ -52,8 +55,35 @@ class Character(Base):
         self.character = Character
         self.session = Session(bind=self.engine)

+    @skipIf(SA_VERSION >= SA_2_0, "SQLAlchemy 2.x uses modern bulk INSERT mode")
     @patch('crate.client.connection.Cursor', FakeCursor)
-    def test_bulk_save(self):
+    def test_bulk_save_legacy(self):
+        """
+        Verify legacy SQLAlchemy bulk INSERT mode.
+
+        > bulk_save_objects: Perform a bulk save of the given list of objects.
+        > This method is a legacy feature as of the 2.0 series of SQLAlchemy. For modern
+        > bulk INSERT and UPDATE, see the sections ORM Bulk INSERT Statements and ORM Bulk
+        > UPDATE by Primary Key.
+        >
+        > -- https://docs.sqlalchemy.org/orm/session_api.html#sqlalchemy.orm.Session.bulk_save_objects
+
+        > The Session includes legacy methods for performing "bulk" INSERT and UPDATE
+        > statements. These methods share implementations with the SQLAlchemy 2.0
+        > versions of these features, described at ORM Bulk INSERT Statements and
+        > ORM Bulk UPDATE by Primary Key, however lack many features, namely RETURNING
+        > support as well as support for session-synchronization.
+        >
+        > -- https://docs.sqlalchemy.org/orm/queryguide/dml.html#legacy-session-bulk-insert-methods
+
+        > The 1.4 version of the "ORM bulk insert" methods are really not very efficient and
+        > don't grant that much of a performance bump vs. regular ORM `session.add()`, provided
+        > in both cases the objects you provide already have their primary key values assigned.
+        > SQLAlchemy 2.0 made a much more comprehensive change to how this all works as well so
+        > that all INSERT methods are essentially extremely fast now, relative to the 1.x series.
+        >
+        > -- https://github.com/sqlalchemy/sqlalchemy/discussions/6935#discussioncomment-4789701
+        """
         chars = [
             self.character(name='Arthur', age=35),
             self.character(name='Banshee', age=26),
@@ -79,3 +109,62 @@ def test_bulk_save(self):
             ('Callisto', 37)
         )
         self.assertSequenceEqual(expected_bulk_args, bulk_args)
+
+    @skipIf(SA_VERSION < SA_2_0, "SQLAlchemy 1.x uses legacy bulk INSERT mode")
+    @patch('crate.client.connection.Cursor', FakeCursor)
+    def test_bulk_save_modern(self):
+        """
+        Verify modern SQLAlchemy bulk INSERT mode.
+
+        > A list of parameter dictionaries sent to the `Session.execute.params` parameter,
+        > separate from the Insert object itself, will invoke *bulk INSERT mode* for the
+        > statement, which essentially means the operation will optimize as much as
+        > possible for many rows.
+        >
+        > -- https://docs.sqlalchemy.org/orm/queryguide/dml.html#orm-queryguide-bulk-insert
+
+        > We have been looking into getting performance optimizations
+        > from `bulk_save()` to be inherently part of `add_all()`.
+        >
+        > -- https://github.com/sqlalchemy/sqlalchemy/discussions/6935#discussioncomment-1233465
+
+        > The remaining performance limitation, that the `cursor.executemany()` DBAPI method
+        > does not allow for rows to be fetched, is resolved for most backends by *foregoing*
+        > the use of `executemany()` and instead restructuring individual INSERT statements
+        > to each accommodate a large number of rows in a single statement that is invoked
+        > using `cursor.execute()`. This approach originates from the `psycopg2` fast execution
+        > helpers feature of the `psycopg2` DBAPI, which SQLAlchemy incrementally added more
+        > and more support towards in recent release series.
+        >
+        > -- https://docs.sqlalchemy.org/core/connections.html#engine-insertmanyvalues
+        """
+
+        # Don't truncate unittest's diff output on `assertListEqual`.
+        self.maxDiff = None
+
+        chars = [
+            self.character(name='Arthur', age=35),
+            self.character(name='Banshee', age=26),
+            self.character(name='Callisto', age=37),
+        ]
+
+        fake_cursor.description = ()
+        fake_cursor.rowcount = len(chars)
+        fake_cursor.execute.return_value = [
+            {'rowcount': 1},
+            {'rowcount': 1},
+            {'rowcount': 1},
+        ]
+        self.session.add_all(chars)
+        self.session.commit()
+        (stmt, bulk_args), _ = fake_cursor.execute.call_args
+
+        expected_stmt = "INSERT INTO characters (name, age) VALUES (?, ?), (?, ?), (?, ?)"
+        self.assertEqual(expected_stmt, stmt)
+
+        expected_bulk_args = (
+            'Arthur', 35,
+            'Banshee', 26,
+            'Callisto', 37,
+        )
+        self.assertSequenceEqual(expected_bulk_args, bulk_args)
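
The statement shape that `test_bulk_save_modern` asserts (one `INSERT` carrying several value sets, with a flat argument tuple) can be reproduced outside SQLAlchemy. The sketch below is illustrative only: `build_multivalues_insert` is a hypothetical helper, and the standard library's `sqlite3` stands in for the CrateDB driver. It does not reflect how SQLAlchemy or the CrateDB dialect build the statement internally.

```python
import sqlite3


def build_multivalues_insert(table, columns, rows):
    """Return (statement, flat_args) for a single INSERT covering all rows.

    Hypothetical helper for illustration; identifiers are interpolated
    verbatim, so it must only be used with trusted table/column names.
    """
    if not rows:
        raise ValueError("at least one row is required")
    column_list = ", ".join(columns)
    # One "(?, ?)" group per row, each with one placeholder per column.
    row_placeholder = "(" + ", ".join("?" for _ in columns) + ")"
    values_clause = ", ".join(row_placeholder for _ in rows)
    stmt = f"INSERT INTO {table} ({column_list}) VALUES {values_clause}"
    # Flatten the rows into one positional argument tuple.
    flat_args = tuple(value for row in rows for value in row)
    return stmt, flat_args


rows = [("Arthur", 35), ("Banshee", 26), ("Callisto", 37)]
stmt, args = build_multivalues_insert("characters", ("name", "age"), rows)

# Execute the single statement once, instead of calling executemany().
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE characters (name TEXT PRIMARY KEY, age INTEGER)")
conn.execute(stmt, args)
inserted = conn.execute(
    "SELECT name, age FROM characters ORDER BY name"
).fetchall()
```

All three rows travel in one `cursor.execute()` call, which is the property the quoted `insertmanyvalues` documentation describes.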
