Segmented Time Series Pipeline

This module provides an sklearn-compatible pipeline for machine learning on time series and sequence data using sliding window segmentation.
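
As a rough illustration of sliding window segmentation, the sketch below cuts a single series into fixed-width, overlapping windows. The function name `sliding_windows` and its `width` and `overlap` arguments are chosen here for illustration; this is not the code used by the Segment transforms in seglearn.

```python
def sliding_windows(series, width, overlap=0.5):
    """Split one series (a list of samples) into fixed-width windows.

    `width` is the window length in samples; `overlap` is the fraction
    of each window shared with the next one (0 <= overlap < 1).
    Hypothetical helper for illustration only.
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    # Distance between the starts of consecutive windows, at least 1 sample.
    step = max(1, int(round(width * (1 - overlap))))
    last_start = len(series) - width
    # A series shorter than `width` produces no windows.
    return [series[i:i + width] for i in range(0, last_start + 1, step)]


series = list(range(10))
windows = sliding_windows(series, width=4, overlap=0.5)
print(len(series), "samples ->", len(windows), "windows")
```

One series of 10 samples becomes 4 windows here, which is why a pipeline built on segmentation has to cope with steps that change the number of samples.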

class seglearn.pipe.Pype(steps, scorer=None, memory=None)

This pipeline extends the sklearn Pipeline to support transformers that change X, y, sample_weight, and the number of samples.

It also adds some new options for setting hyper-parameters with callables and in reference to other parameters (see examples).

Parameters

steps : list
    List of (name, transform) tuples (implementing fit/transform) that are chained, in the order in which they are chained, with the last object an estimator.
scorer : sklearn scorer object
memory : currently not implemented

Examples

>>> from seglearn.transform import FeatureRep, SegmentX
>>> from seglearn.pipe import Pype
>>> from seglearn.datasets import load_watch
>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.preprocessing import StandardScaler
>>> data = load_watch()
>>> X = data['X']
>>> y = data['y']
>>> pipe = Pype([('segment', SegmentX()),
...              ('features', FeatureRep()),
...              ('scaler', StandardScaler()),
...              ('rf', RandomForestClassifier())])
>>> pipe.fit(X, y)
>>> print(pipe.score(X, y))

Attributes

N_train : number of training samples, available after calling the fit method
N_test : number of testing samples, available after calling the predict or score methods

Methods

`decision_function`(self, X)	Apply transforms, and decision_function of the final estimator
`fit`(self, X[, y])	Fit the model
`fit_predict`(self, X[, y])	Applies fit_predict of last step in pipeline after transforms.
`fit_transform`(self, X[, y])	Fit the model and transform with the final estimator. Fits all the transforms one after the other and transforms the data, then uses fit_transform on transformed data with the final estimator.
`get_params`(self[, deep])	Get parameters for this estimator.
`predict`(self, X)	Apply transforms to the data, and predict with the final estimator
`predict_as_series`(self, X)	Returns predictions in a list, grouping predictions based on the series they were derived from
`predict_log_proba`(self, X)	Apply transforms, and predict_log_proba of the final estimator
`predict_proba`(self, X)	Apply transforms, and predict_proba of the final estimator
`predict_unsegmented`(self, X[, …])	Generates predictions for each time series on the same sampling as the original series, by resampling a prediction performed with sliding window segmentation
`score`(self, X[, y, sample_weight])	Apply transforms, and score with the final estimator
`score_samples`(self, X)	Apply transforms, and score_samples of the final estimator.
`set_params`(self, **params)	Set the parameters of this estimator.
`transform`(self, X[, y])	Apply transforms, and transform with the final estimator. This also works where the final estimator is `None`: all prior transformations are applied.
`transform_predict`(self, X, y)	Apply transforms to the data, and predict with the final estimator.

decision_function(self, X)

Apply transforms, and decision_function of the final estimator

Parameters

X : iterable
    Data to predict on. Must fulfill input requirements of first step of the pipeline.

Returns

y_score : array-like, shape = [n_samples, n_classes]

fit(self, X, y=None, **fit_params)

Fit the model

Fit all the transforms one after the other and transform the data, then fit the transformed data using the final estimator.

Parameters

X : iterable
    Training data. Must fulfill input requirements of first step of the pipeline.
y : iterable, default=None
    Training targets. Must fulfill label requirements for all steps of the pipeline.
**fit_params : dict of string -> object
    Parameters passed to the fit method of each step, where each parameter name is prefixed such that parameter p for step s has key s__p.

Returns

self : Pipeline
    This estimator
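
The `s__p` key convention for fit parameters can be pictured as a simple routing step: everything before the first double underscore names the step, and the remainder names the parameter. The sketch below is a hypothetical helper (`route_fit_params` is not part of seglearn or sklearn), using the step names from the Examples section and a placeholder parameter `p`.

```python
def route_fit_params(step_names, fit_params):
    """Group `step__param` keys by step name (illustrative helper only)."""
    routed = {name: {} for name in step_names}
    for key, value in fit_params.items():
        # Split on the first double underscore: "rf__p" -> ("rf", "__", "p").
        step, sep, param = key.partition("__")
        if not sep or step not in routed:
            raise ValueError(f"cannot route fit parameter {key!r}")
        routed[step][param] = value
    return routed


routed = route_fit_params(
    ["segment", "features", "scaler", "rf"],
    {"rf__p": 1},  # parameter p for step "rf"; the name p is a placeholder
)
print(routed["rf"])
```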

fit_transform(self, X, y=None, **fit_params)

Fit the model and transform with the final estimator. Fits all the transforms one after the other and transforms the data, then uses fit_transform on transformed data with the final estimator.

Parameters

X : iterable
    Training data. Must fulfill input requirements of first step of the pipeline.
y : iterable, default=None
    Training targets. Must fulfill label requirements for all steps of the pipeline.
**fit_params : dict of string -> object
    Parameters passed to the fit method of each step, where each parameter name is prefixed such that parameter p for step s has key s__p.

Returns

Xt : array-like, shape = [n_samples, n_transformed_features]
    Transformed samples
yt : array-like, shape = [n_samples]
    Transformed target

predict(self, X)

Apply transforms to the data, and predict with the final estimator

Parameters

X : iterable
    Data to predict on. Must fulfill input requirements of first step of the pipeline.

Returns

yp : array-like
    Predicted transformed target

predict_as_series(self, X)

Returns predictions in a list, grouping predictions based on the series they were derived from

Parameters

X : iterable
    Data to predict on. Must fulfill input requirements of first step of the pipeline.

Returns

yp : list
    Predictions
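
Grouping predictions by source series can be sketched as follows. The sketch assumes the flat predictions are ordered series by series and that the number of segments produced from each series is known; `group_by_series` and `segments_per_series` are illustrative names, not seglearn API.

```python
def group_by_series(flat_predictions, segments_per_series):
    """Split a flat list of per-segment predictions into one list per series."""
    if sum(segments_per_series) != len(flat_predictions):
        raise ValueError("segment counts do not match the number of predictions")
    grouped, start = [], 0
    for n in segments_per_series:
        # The next n predictions belong to the current series.
        grouped.append(flat_predictions[start:start + n])
        start += n
    return grouped


# Two series: the first produced 2 segments, the second produced 3.
flat = ["walk", "walk", "run", "run", "walk"]
grouped = group_by_series(flat, [2, 3])
print(grouped)
```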

predict_log_proba(self, X)

Apply transforms, and predict_log_proba of the final estimator

Parameters

X : iterable
    Data to predict on. Must fulfill input requirements of first step of the pipeline.

Returns

y_score : array-like, shape = [n_samples, n_classes]

predict_proba(self, X)

Apply transforms, and predict_proba of the final estimator

Parameters

X : iterable
    Data to predict on. Must fulfill input requirements of first step of the pipeline.

Returns

y_proba : array-like, shape = [n_samples, n_classes]
    Predicted probability of each class

predict_unsegmented(self, X, categorical_target=False)

Generates predictions for each time series on the same sampling as the original series, by resampling a prediction performed with sliding window segmentation

Requires that one of the Segment transforms be part of the pipeline.

See the plot_feature_rep.py example.

Parameters

X : iterable
    Data to predict on. Must fulfill input requirements of first step of the pipeline.
categorical_target : boolean
    Set to True for classification problems and False for regression problems.

Returns

yp : iterable
    Time series predictions on the same sampling as X
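
One way to picture the resampling step: each segment yields one prediction at some time stamp, and those predictions are mapped back onto the original time stamps. The sketch below assumes nearest-neighbour lookup for categorical targets and linear interpolation for regression targets, holding the first and last values outside the covered range. These choices, the function name `resample_predictions`, and the arguments `t_seg`, `y_seg`, and `t_out` are assumptions for illustration, not a description of how `predict_unsegmented` is implemented.

```python
from bisect import bisect_left


def resample_predictions(t_seg, y_seg, t_out, categorical_target=False):
    """Resample per-segment predictions onto the time stamps in t_out.

    t_seg must be sorted in increasing order, with one time stamp per
    segment prediction in y_seg. Illustrative helper only.
    """
    out = []
    for t in t_out:
        j = bisect_left(t_seg, t)  # first index with t_seg[j] >= t
        if j == 0:
            out.append(y_seg[0])   # at or before the first segment
        elif j == len(t_seg):
            out.append(y_seg[-1])  # after the last segment
        else:
            t0, t1 = t_seg[j - 1], t_seg[j]
            if categorical_target:
                # Labels cannot be averaged, so take the nearer segment
                # (ties go to the earlier one).
                out.append(y_seg[j - 1] if t - t0 <= t1 - t else y_seg[j])
            else:
                # Linear interpolation between the two neighbouring segments.
                w = (t - t0) / (t1 - t0)
                out.append(y_seg[j - 1] + w * (y_seg[j] - y_seg[j - 1]))
    return out


t_out = [0, 2, 3, 4, 5, 6, 8]
regression = resample_predictions([2, 6], [0.0, 4.0], t_out)
labels = resample_predictions([2, 6], ["walk", "run"], t_out,
                              categorical_target=True)
print(regression)
print(labels)
```

The `categorical_target` flag matters because interpolating between class labels is meaningless, whereas a regression target can reasonably be interpolated between neighbouring segments.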

score(self, X, y=None, sample_weight=None)

Apply transforms, and score with the final estimator

Parameters

X : iterable
    Data to predict on. Must fulfill input requirements of first step of the pipeline.
y : iterable, default=None
    Targets used for scoring. Must fulfill label requirements for all steps of the pipeline.
sample_weight : array-like, default=None
    If not None, this argument is passed as sample_weight keyword argument to the score method of the final estimator.

Returns

score : float

set_params(self, **params)

Set the parameters of this estimator. Valid parameter keys can be listed with get_params().

Returns

self

transform(self, X, y=None)

Apply transforms, and transform with the final estimator. This also works where the final estimator is None: all prior transformations are applied.

Parameters

X : iterable
    Data to transform. Must fulfill input requirements of first step of the pipeline.
y : array-like
    Target

Returns

Xt : array-like, shape = [n_samples, n_transformed_features]
    Transformed data
yt : array-like, shape = [n_samples]
    Transformed target

transform_predict(self, X, y)

Apply transforms to the data, and predict with the final estimator. Unlike predict, this also returns the transformed target.

Parameters

X : iterable
    Data to predict on. Must fulfill input requirements of first step of the pipeline.
y : array-like
    Target

Returns

yt : array-like
    Transformed target
yp : array-like
    Predicted transformed target
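
Returning the transformed target alongside the predictions is useful because segmentation changes the number of samples, so the original y no longer lines up with the predictions. The sketch below assumes one label per series that is repeated for every segment cut from that series; `expand_targets`, `accuracy`, and `segments_per_series` are hypothetical names used only to show why yt, not y, is the right thing to compare against yp.

```python
def expand_targets(y, segments_per_series):
    """Repeat each series label once per segment (assumed behaviour)."""
    yt = []
    for label, n in zip(y, segments_per_series):
        yt.extend([label] * n)
    return yt


def accuracy(yt, yp):
    """Fraction of segments whose prediction matches the transformed target."""
    if len(yt) != len(yp):
        raise ValueError("yt and yp must have the same length")
    return sum(a == b for a, b in zip(yt, yp)) / len(yt)


y = ["walk", "run"]                          # one label per series
yt = expand_targets(y, [2, 3])               # one label per segment
yp = ["walk", "run", "run", "run", "run"]    # per-segment predictions
acc = accuracy(yt, yp)
print(acc)
```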