Time Series Preprocessing

This module is for preprocessing time series data.

class seglearn.preprocessing.TargetRunLengthEncoder(min_length=200)

Takes a data set with a categorical target variable encoded as a time series and transforms it with run length encoding (RLE) of the target variable.

RLE finds contiguous runs of the same target value within the input data. The transformed data set is the amalgam of all contiguous runs of all target classes from all series in the input data.
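As a rough illustration of what "contiguous runs" means here, the sketch below locates the runs in a single target series. It is not seglearn's implementation; `find_runs` is a hypothetical helper written with the standard library only.

```python
from itertools import groupby


def find_runs(y):
    """Return (value, start, stop) for each contiguous run in y.

    Hypothetical helper for illustration; `stop` is exclusive.
    """
    runs = []
    start = 0
    for value, group in groupby(y):
        length = sum(1 for _ in group)  # number of samples in this run
        runs.append((value, start, start + length))
        start += length
    return runs


# Target series with four runs: three 0s, two 1s, two 0s, three 2s
y = [0, 0, 0, 1, 1, 0, 0, 2, 2, 2]
runs = find_runs(y)
print(runs)  # [(0, 0, 3), (1, 3, 5), (0, 5, 7), (2, 7, 10)]
```

Each tuple marks a slice of the series over which the target is constant, which is the unit the transformed data set is built from.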

This is useful for generating “pure” series with no mixing of target variables from datasets that encode the target variable as a series (e.g., MHEALTH and PAMAP2).

Note that seglearn can handle datasets with target variables encoded as a series natively (using SegmentXY), so this preprocessing is not required but may be helpful for some tasks. Effectively, it lets you use SegmentX on datasets that would otherwise require SegmentXY.

Parameters

min_length (integer > 1): minimum number of samples in a run for it to be included in the transformed data
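The effect of min_length can be sketched as follows. This is an illustrative pure-Python version, not the library's code: `rle_split` is a hypothetical name, and it assumes that a run of exactly min_length samples is kept.

```python
from itertools import groupby


def rle_split(X, y, min_length):
    """Split each series into single-target runs, dropping short runs.

    Hypothetical sketch: X and y are lists of equal-length sequences.
    Assumes runs with length >= min_length are kept.
    """
    Xt, yt = [], []
    for series, targets in zip(X, y):
        start = 0
        for value, group in groupby(targets):
            length = sum(1 for _ in group)
            if length >= min_length:
                Xt.append(series[start:start + length])  # data for this run
                yt.append(value)                         # one target per run
            start += length
    return Xt, yt


X = [[10, 11, 12, 13, 14, 15], [20, 21, 22, 23]]
y = [[0, 0, 0, 1, 1, 1], [1, 0, 0, 0]]
Xt, yt = rle_split(X, y, min_length=3)
print(Xt)  # [[10, 11, 12], [13, 14, 15], [21, 22, 23]]
print(yt)  # [0, 1, 0]
```

The single-sample run at the start of the second series is discarded, so two input series become three output series, each with one target value.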

Methods

`fit`(self, X[, y])	Fit the transform
`fit_transform`(self, X, y[, sample_weight])	Fit the data and transform (required by sklearn API)
`get_params`(self[, deep])	Get parameters for this estimator.
`set_params`(self, **params)	Set the parameters of this estimator.
`transform`(self, X, y[, sample_weight])	Transforms the time series data with run length encoding of the target variable. Note that this transformation changes the number of samples in the data. If sample_weight is provided, it is transformed to align with the new target encoding.

fit(self, X, y=None)

Fit the transform

Parameters

X (array-like, shape [n_series, …]): Time series data and (optionally) contextual data
y (None): A transformer does not need a target, but the pipeline API requires this parameter.

Returns

self (object): Returns self.

transform(self, X, y, sample_weight=None)

Transforms the time series data with run length encoding of the target variable. Note that this transformation changes the number of samples in the data. If sample_weight is provided, it is transformed to align with the new target encoding.

Parameters

X (array-like, shape [n_series, …]): Time series data and (optionally) contextual data
y (array-like, shape [n_series, …]): target variable encoded as a time series
sample_weight (array-like, shape [n_series], default = None): sample weights

Returns

Xt (array-like, shape [n_rle_series, ]): transformed time series data
yt (array-like, shape [n_rle_series]): target values for each series
sample_weight_new (array-like, shape [n_rle_series]): sample weights
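One plausible reading of how the three outputs line up is sketched below. It assumes, without confirmation from the source, that each run inherits the weight of the series it was cut from; `rle_with_weights` is a hypothetical helper, not part of seglearn.

```python
from itertools import groupby


def rle_with_weights(X, y, sample_weight, min_length=2):
    """Run-length split that also carries per-series weights along.

    Hypothetical sketch. Assumption: every run copies the weight of its
    source series, so all three outputs have length n_rle_series.
    """
    Xt, yt, swt = [], [], []
    for series, targets, weight in zip(X, y, sample_weight):
        start = 0
        for value, group in groupby(targets):
            length = sum(1 for _ in group)
            if length >= min_length:
                Xt.append(series[start:start + length])
                yt.append(value)
                swt.append(weight)  # assumed: weight inherited from source series
            start += length
    return Xt, yt, swt


X = [[1, 2, 3, 4], [5, 6, 7, 8, 9]]
y = [[0, 0, 1, 1], [2, 2, 2, 3, 3]]
Xt, yt, swt = rle_with_weights(X, y, sample_weight=[0.5, 2.0])
print(len(X), len(Xt))  # 2 4
print(yt)               # [0, 1, 2, 3]
print(swt)              # [0.5, 0.5, 2.0, 2.0]
```

Two input series yield four output series here, which is why the returned shapes are stated in terms of n_rle_series rather than n_series.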