Commit 6a0082e

BUG: Inconsistent behavior with groupby and copy-on-write (#63232)
1 parent 92ef76e commit 6a0082e

3 files changed: 16 additions, 0 deletions

doc/source/whatsnew/v3.0.0.rst

Lines changed: 1 addition & 0 deletions
@@ -1321,6 +1321,7 @@ Groupby/resample/rolling
 - Bug in :meth:`Series.resample` raising error when resampling non-nanosecond resolutions out of bounds for nanosecond precision (:issue:`57427`)
 - Bug in :meth:`Series.rolling.var` and :meth:`Series.rolling.std` computing incorrect results due to numerical instability. (:issue:`47721`, :issue:`52407`, :issue:`54518`, :issue:`55343`)
 - Bug in :meth:`DataFrame.groupby` methods when operating on NumPy-nullable data failing when the NA mask was not C-contiguous (:issue:`61031`)
+- Bug in :meth:`DataFrame.groupby` when grouping by a Series and that Series was modified after calling :meth:`DataFrame.groupby` but prior to the groupby operation (:issue:`63219`)
 
 Reshaping
 ^^^^^^^^^

pandas/core/groupby/grouper.py

Lines changed: 2 additions & 0 deletions
@@ -460,6 +460,8 @@ def __init__(
         dropna: bool = True,
         uniques: ArrayLike | None = None,
     ) -> None:
+        if isinstance(grouper, Series):
+            grouper = grouper.copy(deep=False)
         self.level = level
         self._orig_grouper = grouper
         grouping_vector = _convert_grouper(index, grouper)
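
The effect of the two added lines can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not pandas internals: `LazyGroupedSum` and its `snapshot_keys` flag are hypothetical names invented for this example, and a list copy is eager, whereas `grouper.copy(deep=False)` under copy-on-write defers the real copy until one side is modified.

```python
from collections import defaultdict


class LazyGroupedSum:
    """Toy stand-in for a lazy groupby: keys are stored now, summed later.

    Hypothetical class for illustration only; not part of pandas.
    """

    def __init__(self, values, keys, snapshot_keys):
        self.values = values
        # With snapshot_keys=True we keep our own copy of the keys, which is
        # the role grouper.copy(deep=False) plays in the diff above (pandas
        # defers the actual copy via copy-on-write; a list copy is eager).
        self.keys = list(keys) if snapshot_keys else keys

    def sum(self):
        # Accumulate one total per distinct key, in first-seen order.
        totals = defaultdict(int)
        for key, value in zip(self.keys, self.values):
            totals[key] += value
        return dict(totals)


values = [1, 2, 3]

# Without a snapshot, a later mutation of the keys leaks into the result.
keys = [1, 2, 1]
leaky = LazyGroupedSum(values, keys, snapshot_keys=False)
keys[0] = 100
leaky_result = leaky.sum()

# With a snapshot, the result reflects the keys as they were at construction.
keys = [1, 2, 1]
stable = LazyGroupedSum(values, keys, snapshot_keys=True)
keys[0] = 100
stable_result = stable.sum()

print(leaky_result)
print(stable_result)
```

The snapshot case groups the values by the original keys `[1, 2, 1]`, which mirrors what the new test in this commit expects from `gb.sum()` after `ser.iloc[0] = 100`.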

pandas/tests/copy_view/test_methods.py

Lines changed: 13 additions & 0 deletions
@@ -226,6 +226,19 @@ def test_groupby_column_index_in_references():
     tm.assert_frame_equal(result, expected)
 
 
+def test_groupby_modify_series():
+    # https://github.com/pandas-dev/pandas/issues/63219
+    # Modifying a Series after using it to groupby should not impact
+    # the groupby operation.
+    ser = Series([1, 2, 1])
+    df = DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
+    gb = df.groupby(ser)
+    ser.iloc[0] = 100
+    result = gb.sum()
+    expected = DataFrame({"a": [4, 2], "b": [10, 5]}, index=[1, 2])
+    tm.assert_frame_equal(result, expected)
+
+
 def test_rename_columns():
     # Case: renaming columns returns a new dataframe
     # + afterwards modifying the result
