Time Series Preprocessing
This module is for preprocessing time series data.
class seglearn.preprocessing.TargetRunLengthEncoder(min_length=200)
Takes a data set with a categorical target variable encoded as a time series and transforms it with run length encoding (RLE) of the target variable.
RLE finds contiguous runs of the same target value within the input data. The transformed data set is the amalgam of all contiguous runs of all target classes from all series in the input data.
This is useful for generating “pure” series with no mixing of target variables from datasets that encode the target variable as a series (e.g. MHEALTH and PAMAP2).
Note that seglearn can handle datasets with target variables encoded as a series natively (using SegmentXY), so this preprocessing is not required but may be helpful for some tasks. Effectively, it lets you use SegmentX on datasets that would otherwise require SegmentXY.
- Parameters
- min_length : integer > 1
minimum number of samples in a run for it to be included in the transformed data
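The run-finding step described above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the helper name `find_runs` and its return format are hypothetical, and only the `min_length` filter mirrors the documented parameter.

```python
from itertools import groupby


def find_runs(labels, min_length=1):
    """Hypothetical helper: list (start, length, value) for each contiguous
    run of equal labels that contains at least min_length samples."""
    runs = []
    start = 0
    for value, group in groupby(labels):
        length = sum(1 for _ in group)  # number of samples in this run
        if length >= min_length:
            runs.append((start, length, value))
        start += length  # advance to the first sample of the next run
    return runs


labels = [0, 0, 0, 1, 1, 2, 2, 2, 2, 0]
runs = find_runs(labels, min_length=2)
# runs == [(0, 3, 0), (3, 2, 1), (5, 4, 2)]
# The trailing single-sample run of class 0 is dropped by min_length=2.
```

Each retained run corresponds to one "pure" series in the transformed data, so a larger `min_length` yields fewer but longer series.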
Methods
fit(self, X[, y]): Fit the transform.
fit_transform(self, X, y[, sample_weight]): Fit the data and transform (required by sklearn API).
get_params(self[, deep]): Get parameters for this estimator.
set_params(self, **params): Set the parameters of this estimator.
transform(self, X, y[, sample_weight]): Transforms the time series data with run length encoding of the target variable. Note that this transformation changes the number of samples in the data. If sample_weight is provided, it is transformed to align to the new target encoding.
fit(self, X, y=None)
Fit the transform.
- Parameters
- X : array-like, shape [n_series, …]
Time series data and (optionally) contextual data
- y : None
There is no need for a target in a transformer, yet the pipeline API requires this parameter.
- Returns
- self : object
Returns self.
transform(self, X, y, sample_weight=None)
Transforms the time series data with run length encoding of the target variable. Note that this transformation changes the number of samples in the data. If sample_weight is provided, it is transformed to align to the new target encoding.
- Parameters
- X : array-like, shape [n_series, …]
Time series data and (optionally) contextual data
- y : array-like, shape [n_series, …]
target variable encoded as a time series
- sample_weight : array-like, shape [n_series], default = None
sample weights
- Returns
- Xt : array-like, shape [n_rle_series, ]
transformed time series data
- yt : array-like, shape [n_rle_series]
target values for each series
- sample_weight_new : array-like, shape [n_rle_series]
sample weights
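To make the change in the number of samples concrete, here is a hedged pure-Python sketch of the input and output shapes described above. The function `rle_transform` is hypothetical and stands in for the real method. It also assumes that each new series inherits the sample weight of the series it was cut from, which is one plausible reading of "transformed to align to the new target encoding" and is not confirmed by this page.

```python
from itertools import groupby


def rle_transform(X, y, sample_weight=None, min_length=1):
    """Hypothetical stand-in: split every series into runs of constant target."""
    Xt, yt, swt = [], [], []
    for i, (series, targets) in enumerate(zip(X, y)):
        start = 0
        for value, group in groupby(targets):
            length = sum(1 for _ in group)
            if length >= min_length:
                Xt.append(series[start:start + length])  # data for this run
                yt.append(value)                         # one target per run
                if sample_weight is not None:
                    # Assumption: a run inherits its source series' weight.
                    swt.append(sample_weight[i])
            start += length
    return Xt, yt, (swt if sample_weight is not None else None)


# Two input series (n_series = 2) with per-sample targets.
X = [[10, 11, 12, 13, 14], [20, 21, 22]]
y = [["a", "a", "b", "b", "b"], ["b", "a", "a"]]
Xt, yt, swt = rle_transform(X, y, sample_weight=[0.5, 2.0], min_length=2)
# Xt  == [[10, 11], [12, 13, 14], [21, 22]]   (n_rle_series = 3)
# yt  == ["a", "b", "a"]
# swt == [0.5, 0.5, 2.0]
```

In this toy input, two series become three, and the single-sample "b" run at the start of the second series is discarded because it is shorter than `min_length`.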