pyspark.pandas.window.Rolling.max

Rolling.max() → FrameLike

Calculate the rolling maximum.

Note

The current implementation of this API uses Spark’s Window without a partition specification. This moves all data into a single partition on a single machine and can cause serious performance degradation. Avoid using this method on very large datasets.

Returns

Series or DataFrame: Return type is determined by the caller.

See also

pyspark.pandas.Series.rolling: Series rolling.
pyspark.pandas.DataFrame.rolling: DataFrame rolling.
pyspark.pandas.Series.max: Similar method for Series.
pyspark.pandas.DataFrame.max: Similar method for DataFrame.

Examples

>>> s = ps.Series([4, 3, 5, 2, 6])
>>> s
0    4
1    3
2    5
3    2
4    6
dtype: int64

>>> s.rolling(2).max()
0    NaN
1    4.0
2    5.0
3    5.0
4    6.0
dtype: float64

>>> s.rolling(3).max()
0    NaN
1    NaN
2    5.0
3    5.0
4    6.0
dtype: float64
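
The Series results above can be reproduced with a short plain-Python sketch, which may help clarify what a trailing window computes. The helper name `rolling_max` is hypothetical and is not part of pyspark.pandas. The sketch assumes that `min_periods` defaults to the window size (which is why the leading positions are NaN) and that the input contains no missing values.

```python
import math


def rolling_max(values, window, min_periods=None):
    """Illustrative trailing rolling maximum over a list of numbers.

    Hypothetical helper for explanation only; not the pyspark.pandas
    implementation.
    """
    # Assumption: an integer window requires a full window by default.
    if min_periods is None:
        min_periods = window

    result = []
    for i in range(len(values)):
        # The window covers the current element and up to
        # (window - 1) elements before it.
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        if len(chunk) < min_periods:
            # Not enough observations yet, so the result is missing.
            result.append(math.nan)
        else:
            result.append(float(max(chunk)))
    return result


data = [4, 3, 5, 2, 6]
print(rolling_max(data, 2))  # [nan, 4.0, 5.0, 5.0, 6.0]
print(rolling_max(data, 3))  # [nan, nan, 5.0, 5.0, 6.0]
```

The results are floats because NaN, which marks positions with too few observations, cannot be represented in an integer column.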

For DataFrame, each rolling maximum is computed column-wise.

>>> df = ps.DataFrame({"A": s.to_numpy(), "B": s.to_numpy() ** 2})
>>> df
   A   B
0  4  16
1  3   9
2  5  25
3  2   4
4  6  36

>>> df.rolling(2).max()
     A     B
0  NaN   NaN
1  4.0  16.0
2  5.0  25.0
3  5.0  25.0
4  6.0  36.0

>>> df.rolling(3).max()
     A     B
0  NaN   NaN
1  NaN   NaN
2  5.0  25.0
3  5.0  25.0
4  6.0  36.0
