5.5. cpforager.utils.apply_functions_between_samples

cpforager.utils.apply_functions_between_samples(df, resolution, columns_functions, verbose=False)

Apply a chosen function (e.g. sum, mean, min, max) to all high-resolution elements between two consecutive subsamples defined by a given resolution.

Parameters:
  • df (pandas.DataFrame) – dataframe with a datetime column.

  • resolution (pandas.DataFrame(dtype=bool)) – boolean dataframe of the subsampling resolution.

  • columns_functions (dict) – dictionary mapping each specified column to the function to apply.

  • verbose (bool) – display progress if True.

Returns:

the dataframe with additional columns named “column_function”, filled with NaN values everywhere except at the rows of the subsampling resolution, where the function has been applied to all elements between two subsamples.

Return type:

pandas.DataFrame

This function is key to handling data recorded at different resolutions, such as high-resolution acceleration measurements and low-resolution position and pressure measurements. It produces a low-resolution version of the high-resolution data by summarising it with a function between subsamples. The table below lists all functions that can be applied.

Important

The output dataframe has the same size as the input dataframe, but only the indices corresponding to the subsampling resolution hold non-NaN values.
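The output shape described above can be illustrated with a minimal pandas sketch. It is not the cpforager implementation: the helper summarise_between_samples is hypothetical, and it assumes that each subsample row closes its own interval (the summary covers the rows after the previous subsample up to and including the current one). The output column name follows the “column_function” pattern mentioned in the Returns section.

```python
import numpy as np
import pandas as pd


def summarise_between_samples(df, resolution, column, func):
    """Hypothetical helper (not part of cpforager): summarise `column` between subsamples."""
    is_sample = np.asarray(resolution, dtype=bool)
    # Assumption: a group ends at a subsample row, so the group id
    # increases on the row right after each subsample.
    group_id = np.concatenate(([0], np.cumsum(is_sample)[:-1]))
    # Broadcast the per-group summary back onto every row of the group.
    summary = df[column].groupby(group_id).transform(func)
    out = df.copy()
    # Keep the summary only at subsample rows; every other row becomes NaN.
    out[f"{column}_{func}"] = summary.where(is_sample)
    return out


# Toy data: six high-resolution rows, subsampled at rows 2 and 5.
df = pd.DataFrame({
    "datetime": pd.date_range("2024-01-01", periods=6, freq="min"),
    "ax": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})
resolution = pd.Series([False, False, True, False, False, True])

result = summarise_between_samples(df, resolution, "ax", "sum")
print(result)
```

The result has as many rows as the input, and the new column is NaN except at the two subsample rows.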

function          description
sum               compute the sum of all elements between two subsamples
mean              compute the mean of all elements between two subsamples
min               keep the minimum value of all elements between two subsamples
max               keep the maximum value of all elements between two subsamples
len_unique_pos    compute the number of distinct positive values among all elements between two subsamples
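As a rough illustration of what len_unique_pos computes on the elements of a single interval, the sketch below counts distinct positive values with pandas. The helper count_unique_positive is a hypothetical stand-in rather than the library's code, and it assumes that "positive" means strictly greater than zero.

```python
import pandas as pd


def count_unique_positive(values):
    """Hypothetical stand-in for the behaviour described for len_unique_pos."""
    values = pd.Series(values)
    # Assumption: zero is not counted as positive; NaN is dropped by the comparison.
    return int(values[values > 0].nunique())


# Elements falling between two subsamples (toy values).
n_distinct = count_unique_positive([0, 3, 3, -1, 7, 0, 7])
print(n_distinct)
```

Here the positive values are 3 and 7, so the count is 2 regardless of how often each value repeats.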