What’s new in 2.2.1 (February 22, 2024)
These are the changes in pandas 2.2.1. See Release notes for a full changelog including other versions of pandas.
Enhancements
Added a pyarrow pip extra so users can install pandas and pyarrow together with pip install pandas[pyarrow] (GH 54466)
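After installing with the extra, one way to confirm that PyArrow is visible to the current interpreter is sketched below. This is only an illustrative check using the standard library; the variable name has_pyarrow is an assumption for this example and is not part of pandas.

```python
import importlib.util

# find_spec returns None when the package cannot be located,
# so this check does not import pyarrow itself.
has_pyarrow = importlib.util.find_spec("pyarrow") is not None
print("pyarrow available:", has_pyarrow)
```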
Fixed regressions
Fixed memory leak in read_csv() (GH 57039)
Fixed performance regression in Series.combine_first() (GH 55845)
Fixed regression causing overflow for near-minimum timestamps (GH 57150)
Fixed regression in concat() changing long-standing behavior that always sorted the non-concatenation axis when the axis was a DatetimeIndex (GH 57006)
Fixed regression in merge_ordered() raising TypeError for fill_method="ffill" and how="left" (GH 57010)
Fixed regression in pandas.testing.assert_series_equal() defaulting to check_exact=True when checking the Index (GH 57067)
Fixed regression in read_json() where an Index would be returned instead of a RangeIndex (GH 57429)
Fixed regression in wide_to_long() raising an AttributeError for string columns (GH 57066)
Fixed regression in DataFrameGroupBy.idxmin(), DataFrameGroupBy.idxmax(), SeriesGroupBy.idxmin(), SeriesGroupBy.idxmax() ignoring the skipna argument (GH 57040)
Fixed regression in DataFrameGroupBy.idxmin(), DataFrameGroupBy.idxmax(), SeriesGroupBy.idxmin(), SeriesGroupBy.idxmax() where values containing the minimum or maximum value for the dtype could produce incorrect results (GH 57040)
Fixed regression in CategoricalIndex.difference() raising KeyError when other contains null values other than NaN (GH 57318)
Fixed regression in DataFrame.groupby() raising ValueError when grouping by a Series in some cases (GH 57276)
Fixed regression in DataFrame.loc() raising IndexError for non-unique, masked dtype indexes where result has more than 10,000 rows (GH 57027)
Fixed regression in DataFrame.loc() which was unnecessarily throwing “incompatible dtype warning” when expanding with partial row indexer and multiple columns (see PDEP6) (GH 56503)
Fixed regression in DataFrame.map() with na_action="ignore" not being respected for NumPy nullable and ArrowDtypes (GH 57316)
Fixed regression in DataFrame.merge() raising ValueError for certain types of 3rd-party extension arrays (GH 57316)
Fixed regression in DataFrame.query() with all NaT column with object dtype (GH 57068)
Fixed regression in DataFrame.shift() raising AssertionError for axis=1 and empty DataFrame (GH 57301)
Fixed regression in DataFrame.sort_index() not producing a stable sort for an index with duplicates (GH 57151)
Fixed regression in DataFrame.to_dict() with orient='list' and datetime or timedelta types returning integers (GH 54824)
Fixed regression in DataFrame.to_json() converting nullable integers to floats (GH 57224)
Fixed regression in DataFrame.to_sql() when method="multi" is passed and the dialect type is not Oracle (GH 57310)
Fixed regression in DataFrame.transpose() with nullable extension dtypes not having F-contiguous data potentially causing exceptions when used (GH 57315)
Fixed regression in DataFrame.update() emitting incorrect warnings about downcasting (GH 57124)
Fixed regression in ExtensionArray.to_numpy() raising for non-numeric masked dtypes (GH 56991)
Fixed regression in Index.join() raising TypeError when joining an empty index to a non-empty index containing mixed dtype values (GH 57048)
Fixed regression in Series.astype() introducing decimals when converting from integer with missing values to string dtype (GH 57418)
Fixed regression in Series.pct_change() raising a ValueError for an empty Series (GH 57056)
Fixed regression in Series.to_numpy() when dtype is given as float and the data contains NaNs (GH 57121)
Fixed regression in addition or subtraction of DateOffset objects with millisecond components to datetime64 Index, Series, or DataFrame (GH 57529)
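As an illustration of one of the entries above, the following sketch exercises merge_ordered() with fill_method="ffill" and how="left", the combination covered by GH 57010. The frames left and right and their column names are made up for this example; it is a minimal check of the call, not a reproduction of the original bug report.

```python
import pandas as pd

# Hypothetical example data: "e" has no match in `right`.
left = pd.DataFrame({"key": ["a", "c", "e"], "lvalue": [1, 2, 3]})
right = pd.DataFrame({"key": ["a", "b", "c"], "rvalue": [10, 20, 30]})

# how="left" keeps only the keys present in `left`;
# fill_method="ffill" forward-fills rows that have no match in `right`.
result = pd.merge_ordered(left, right, on="key", how="left", fill_method="ffill")
print(result)
```

With the fix in place the call should return one row per key in left rather than raising TypeError, with the unmatched key taking the preceding rvalue.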
Bug fixes
Fixed bug in pandas.api.interchange.from_dataframe() which was raising for nullable integers (GH 55069)
Fixed bug in pandas.api.interchange.from_dataframe() which was raising for empty inputs (GH 56700)
Fixed bug in pandas.api.interchange.from_dataframe() which wasn’t converting column names to strings (GH 55069)
Fixed bug in DataFrame.__getitem__() for empty DataFrame with Copy-on-Write enabled (GH 57130)
Fixed bug in PeriodIndex.asfreq() which was silently converting frequencies which are not supported as period frequencies instead of raising an error (GH 56945)
Other
Note
The DeprecationWarning that was raised when pandas was imported without PyArrow being
installed has been removed. This decision was made because the warning was too noisy for too
many users and a lot of feedback was collected about the decision to make PyArrow a required
dependency. pandas is currently considering whether PyArrow should be added as a hard
dependency in 3.0. Interested users can follow the discussion here.
Added the argument skipna to DataFrameGroupBy.first(), DataFrameGroupBy.last(), SeriesGroupBy.first(), and SeriesGroupBy.last(); achieving skipna=False used to be available via DataFrameGroupBy.nth(), but the behavior was changed in pandas 2.0.0 (GH 57019)
Added the argument skipna to Resampler.first(), Resampler.last() (GH 57019)
Contributors
A total of 14 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.
Albert Villanova del Moral
Luke Manley
Lumberbot (aka Jack)
Marco Edward Gorelli
Matthew Roeschke
Natalia Mokeeva
Pandas Development Team
Patrick Hoefler
Richard Shadrach
Robert Schmidtke
Samuel Chai
Thomas Li
William Ayd
dependabot[bot]