pyspark.pandas.groupby.GroupBy.diff
- GroupBy.diff(periods=1)
- First discrete difference of element.
- Calculates the difference of a DataFrame element compared with another element in the DataFrame group (default is the element in the same column of the previous row).
- Parameters
- periods : int, default 1
- Periods to shift for calculating difference, accepts negative values. 
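
To make the direction of `periods` concrete, here is a minimal plain-Python sketch of the arithmetic inside a single group. The helper `diff_within_group` is hypothetical and only models the documented behavior; it is not how pandas-on-Spark computes the result, and `None` stands in for NaN.

```python
def diff_within_group(values, periods=1):
    """Return values[i] - values[i - periods], or None where that row does not exist."""
    n = len(values)
    out = []
    for i in range(n):
        j = i - periods  # row to subtract; a negative `periods` looks ahead
        out.append(float(values[i] - values[j]) if 0 <= j < n else None)
    return out


group = [1, 4, 9, 16]
prev_diff = diff_within_group(group)              # compare with the previous row
next_diff = diff_within_group(group, periods=-1)  # compare with the next row
two_back = diff_within_group(group, periods=2)    # compare with the row two back
```

Under this model, a positive `periods` leaves the first rows of the group without a partner, while a negative `periods` leaves the last rows without one.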
 
- Returns
- diffed : DataFrame or Series
 
- Examples

>>> df = ps.DataFrame({'a': [1, 2, 3, 4, 5, 6],
...                    'b': [1, 1, 2, 3, 5, 8],
...                    'c': [1, 4, 9, 16, 25, 36]}, columns=['a', 'b', 'c'])
>>> df
   a  b   c
0  1  1   1
1  2  1   4
2  3  2   9
3  4  3  16
4  5  5  25
5  6  8  36

>>> df.groupby(['b']).diff().sort_index()
     a    c
0  NaN  NaN
1  1.0  3.0
2  NaN  NaN
3  NaN  NaN
4  NaN  NaN
5  NaN  NaN

Difference with the previous row in a group, for a single column.

>>> df.groupby(['b'])['a'].diff().sort_index()
0    NaN
1    1.0
2    NaN
3    NaN
4    NaN
5    NaN
Name: a, dtype: float64
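
The NaN pattern in the examples follows from the grouping: only the group `b == 1` has more than one row, so only row 1 has a previous row to compare against. The sketch below models this in plain Python using the same data. The helper `grouped_diff` is hypothetical, `None` stands in for NaN, and the code illustrates the semantics only, not the Spark-based implementation.

```python
from collections import defaultdict


def grouped_diff(keys, values, periods=1):
    """Model of a group-wise diff: each value minus the value `periods` rows earlier in its group."""
    # Collect the row positions that belong to each group, in row order.
    positions = defaultdict(list)
    for i, key in enumerate(keys):
        positions[key].append(i)

    result = [None] * len(values)  # None marks rows with no partner row
    for rows in positions.values():
        for j, row in enumerate(rows):
            k = j - periods  # position of the partner row within the group
            if 0 <= k < len(rows):
                result[row] = float(values[row] - values[rows[k]])
    return result


b = [1, 1, 2, 3, 5, 8]       # grouping column
a = [1, 2, 3, 4, 5, 6]
c = [1, 4, 9, 16, 25, 36]

diff_a = grouped_diff(b, a)
diff_c = grouped_diff(b, c)
diff_a_next = grouped_diff(b, a, periods=-1)
```

With `periods=-1` the comparison runs the other way, so in this model row 0 receives a value (1 - 2) and row 1 becomes the row without a partner.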