Transformations

Sklearn Transformation

class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.ComponentAnalysisImplementation(params)[source]

Bases: fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.DataOperationImplementation

Class for applying PCA and kernel PCA models from sklearn

Parameters

params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) – OperationParameters with the arguments

fit(input_data)[source]

The method trains the PCA model

Parameters

input_data (InputData) – data with features, target and ids for PCA training

Returns

trained PCA model (optional output)

Return type

sklearn.decomposition._pca.PCA

transform(input_data)[source]

Method for transforming tabular data using PCA

Parameters

input_data (InputData) – data with features, target and ids to which PCA is applied

Returns

data with transformed features attribute

Return type

OutputData

check_and_correct_params(is_ts_data=False)[source]

Checks whether the data contains enough features for the n_components parameter of PCA and, if it does not, corrects the parameter

Parameters

is_ts_data (bool) –
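
The correction described above can be illustrated with a minimal sketch that caps an integer n_components at the number of available features. The helper name and the exact rule are assumptions made for this example; the source does not state how FEDOT corrects the parameter or how is_ts_data affects the check.

```python
def correct_n_components(n_components, n_features):
    """Hypothetical helper: return a value of n_components that fits the data."""
    # Assumption: non-integer values (for example a share of variance to keep)
    # and None are left untouched
    if not isinstance(n_components, int):
        return n_components
    # An integer number of components cannot exceed the number of features
    return min(n_components, n_features)


corrected = correct_n_components(10, n_features=4)
```

Here a request for 10 components on a table with 4 features is reduced to 4, while a request that already fits is returned unchanged.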

static update_column_types(output_data)[source]

Update column types after applying PCA operations

Parameters

output_data (OutputData) –

Return type

OutputData

class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.PCAImplementation(params=None)[source]

Bases: fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.ComponentAnalysisImplementation

Class for applying PCA from sklearn

Parameters

params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) – OperationParameters with the hyperparameters

class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.KernelPCAImplementation(params)[source]

Bases: fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.ComponentAnalysisImplementation

Class for applying kernel PCA from sklearn

Parameters

params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) – OperationParameters with the hyperparameters

class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.FastICAImplementation(params)[source]

Bases: fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.ComponentAnalysisImplementation

Class for applying FastICA from sklearn

Parameters

params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) – OperationParameters with the hyperparameters

class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.PolyFeaturesImplementation(params)[source]

Bases: fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.EncodedInvariantImplementation

Class for applying the PolynomialFeatures operation to data. Only non-encoded features (those that were not produced from categorical features by one-hot encoding) are used

Parameters

params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) – OperationParameters with the arguments

fit(input_data)[source]

Method for fitting the polynomial features operation

Parameters

input_data (InputData) –

transform(input_data)[source]

First performs filtering of the columns

Parameters

input_data (InputData) –

Return type

OutputData

_update_column_types(source_features_shape, output_data)[source]

Update column types after the operation is applied. If new columns were added, types are defined for them

Parameters

output_data (OutputData) –

class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.ScalingImplementation(params)[source]

Bases: fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.EncodedInvariantImplementation

Class for applying the scaling operation to data. Only non-encoded features (those that were not produced from categorical features by one-hot encoding) are used

Parameters

params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) – OperationParameters with the arguments

class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.NormalizationImplementation(params)[source]

Bases: fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.EncodedInvariantImplementation

Class for applying MinMax normalization to data. Only non-encoded features (those that were not produced from categorical features by one-hot encoding) are used

Parameters

params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) – OperationParameters with the arguments
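
The idea of normalizing only non-encoded features can be sketched in plain Python as follows. This is an illustration under stated assumptions, not FEDOT's implementation: the function name and the encoded_ids argument are hypothetical, and the source does not say how encoded columns are detected or where they are placed in the output.

```python
def min_max_non_encoded(rows, encoded_ids):
    """Scale every column to [0, 1], except the columns listed in encoded_ids."""
    n_columns = len(rows[0])
    result = [list(row) for row in rows]
    for col in range(n_columns):
        if col in encoded_ids:
            continue  # one-hot encoded column: pass through unchanged
        values = [row[col] for row in rows]
        low, high = min(values), max(values)
        span = high - low
        for i, value in enumerate(values):
            # A constant column has zero span; map it to 0.0 to avoid dividing by zero
            result[i][col] = (value - low) / span if span else 0.0
    return result


table = [[1.0, 0, 10.0],
         [3.0, 1, 20.0],
         [5.0, 0, 30.0]]
scaled = min_max_non_encoded(table, encoded_ids={1})
```

Column 1 plays the role of a one-hot encoded feature and keeps its 0/1 values, while columns 0 and 2 are mapped onto the interval [0, 1].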

class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.ImputationImplementation(params=None)[source]

Bases: fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.DataOperationImplementation

Class for applying imputation on tabular data

Parameters

params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) – OperationParameters with the arguments

fit(input_data)[source]

The method trains SimpleImputer

Parameters

input_data (InputData) – data with features

transform(input_data)[source]

Method for transforming tabular data using SimpleImputer

Parameters

input_data (InputData) – data with features

Returns

data with transformed features attribute

Return type

OutputData

fit_transform(input_data)[source]

Method for training SimpleImputer and transforming tabular data with it

Parameters

input_data (InputData) – data with features

Returns

data with transformed features attribute

Return type

OutputData

_categorical_numerical_union(categorical_features, numerical_features)[source]

Merge numerical and categorical features in the right order (as they were in the source table)

Parameters
  • categorical_features (numpy.array) –

  • numerical_features (numpy.array) –

Return type

numpy.array
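
One way to restore the source column order is sketched below. The documented method takes only the two feature blocks, so the categorical_ids and numerical_ids arguments used here are hypothetical: they stand in for whatever bookkeeping of source column positions the class keeps internally.

```python
def union_in_source_order(categorical_features, numerical_features,
                          categorical_ids, numerical_ids):
    """Reassemble a table so that each column returns to its source position.

    categorical_ids and numerical_ids are hypothetical arguments holding the
    source column index of every column in the corresponding block.
    """
    n_rows = len(categorical_features)
    n_columns = len(categorical_ids) + len(numerical_ids)
    merged = [[None] * n_columns for _ in range(n_rows)]
    for block, ids in ((categorical_features, categorical_ids),
                       (numerical_features, numerical_ids)):
        for position, source_col in enumerate(ids):
            for row in range(n_rows):
                merged[row][source_col] = block[row][position]
    return merged


merged = union_in_source_order(
    categorical_features=[["a"], ["b"]],
    numerical_features=[[1.0, 2.0], [3.0, 4.0]],
    categorical_ids=[1],
    numerical_ids=[0, 2],
)
```

In this example the categorical column originally sat between the two numerical columns, and the merged table places it back at index 1.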

_find_binary_features(numerical_features)[source]

Find indices of features with only two unique values in column

Notes

All features in table are numerical

Parameters

numerical_features (numpy.array) –
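
A minimal sketch of such a search over a table stored as a list of rows is given below. Skipping NaN values when counting unique values is an assumption made for this example (the table is processed before imputation, so gaps are likely), and the function name is hypothetical.

```python
import math


def find_binary_features(rows):
    """Return indices of columns holding exactly two distinct non-missing values."""
    binary_ids = []
    for col in range(len(rows[0])):
        # Assumption: NaN marks a gap and does not count as a unique value
        unique = {row[col] for row in rows
                  if not (isinstance(row[col], float) and math.isnan(row[col]))}
        if len(unique) == 2:
            binary_ids.append(col)
    return binary_ids


table = [[0.0, 1.5, 1.0],
         [1.0, 2.5, float("nan")],
         [0.0, 3.5, 0.0]]
binary_ids = find_binary_features(table)
```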

_correct_binary_ids_features(filled_numerical_features)[source]

Correct filled features that were binary before imputation. Discretization is performed for the reconstructed values

Tip

[1, 1, 0.75, 0] will be transformed to [1, 1, 1, 0]

Parameters

filled_numerical_features (numpy.array) –

Return type

numpy.array
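
The discretization from the tip can be sketched as snapping each value to the nearer of the two binary levels. The function name, the explicit low and high arguments, and the rule that a value exactly at the midpoint goes to the higher level are assumptions for this example; the source shows only the input and output of the tip.

```python
def discretize_binary(values, low, high):
    """Snap every value to whichever of the two binary levels is nearer."""
    midpoint = (low + high) / 2
    # Assumption: a value exactly at the midpoint is assigned to the higher level
    return [high if value >= midpoint else low for value in values]


restored = discretize_binary([1, 1, 0.75, 0], low=0, high=1)
```

With levels 0 and 1 the imputed value 0.75 is closer to 1, which reproduces the result shown in the tip.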

get_params()[source]

Method returns the parameters that can be optimized for the particular operation

Return type

fedot.core.operations.operation_parameters.OperationParameters

Time Series Transformation

class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.LaggedImplementation(params)[source]

Bases: fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.DataOperationImplementation

Parameters

params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) –

fit(input_data)[source]

Class doesn’t support fit operation

Parameters

input_data – data with features, target and ids to process

transform(input_data)[source]

Method for transformation of time series to lagged form for predict stage

Parameters

input_data (InputData) – data with features, target and ids to process

Returns

output data with transformed features table

Return type

OutputData

transform_for_fit(input_data)[source]

Method for transformation of time series to lagged form for fit stage

Parameters

input_data (InputData) – data with features, target and ids to process

Returns

output data with transformed features table

Return type

OutputData

_check_and_correct_window_size(time_series, forecast_length)[source]

Method checks whether the time series is long enough for the lagged transformation

Parameters
  • time_series (numpy.ndarray) – time series for transformation

  • forecast_length (int) – forecast length

Returns:

_update_column_types(output_data)[source]

Update column types after lagged transformation. All features become float

Parameters

output_data (OutputData) –

_apply_transformation_for_fit(input_data, features, target, forecast_length, old_idx)[source]

Apply lagged transformation on each time series in the current dataset

Parameters
  • input_data (InputData) –

  • features (numpy.array) –

  • target (numpy.array) –

  • forecast_length (int) –

  • old_idx (numpy.array) –

stack_by_type_fit(input_data, all_features, all_target, all_idx, features, target, idx)[source]

Apply stack function for multi_ts and multivariable ts types on fit step

_stack_multi_variable(all_features, all_target, all_idx, features, target, idx)[source]

Horizontally stack tables: with multiple variables, the new table extends the features used for training

Parameters
  • all_features (numpy.array) – array with all features for adding new

  • all_target (numpy.array) – array with all target (does not change)

  • all_idx (numpy.array) – array with all indices (does not change)

  • features (numpy.array) – array with new features for adding

  • target (numpy.array) – array with new target for adding

  • idx (Union[list, numpy.array]) – array with new idx for adding

Returns

table

_stack_multi_ts(all_features, all_target, all_idx, features, target, idx)[source]

Vertically stack tables: with multi_ts data, the new table extends the training set as a combination of train and target

Parameters
  • all_features (numpy.array) – array with all features for adding new

  • all_target (numpy.array) – array with all target

  • all_idx (numpy.array) – array with all indices

  • features (numpy.array) – array with new features for adding

  • target (numpy.array) – array with new target for adding

  • idx (Union[list, numpy.array]) – array with new idx for adding

Returns

table
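
The difference between the horizontal stacking of _stack_multi_variable and the vertical stacking of _stack_multi_ts can be sketched with lists of rows. The function names are hypothetical and only the feature tables are shown; how target and idx are combined in each mode is not reproduced here.

```python
def stack_horizontally(all_features, features):
    """Multivariate case: append the new columns to every existing row."""
    return [row + new_row for row, new_row in zip(all_features, features)]


def stack_vertically(all_rows, rows):
    """multi_ts case: append the new rows below the existing ones."""
    return all_rows + rows


first = [[1, 2], [2, 3]]
second = [[10, 20], [20, 30]]
wide = stack_horizontally(first, second)
tall = stack_vertically(first, second)
```

Horizontal stacking keeps the number of rows and widens the table, while vertical stacking keeps the number of columns and adds more training rows.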

_current_target_for_each_ts(current_ts_id, target)[source]

Returns target for each time-series

_apply_transformation_for_predict(input_data)[source]

Apply lagged transformation for every column (time series) in the dataset

Parameters

input_data (InputData) –

stack_by_type_predict(input_data, all_features, part_to_add)[source]

Apply stack function for multi_ts and multivariable ts types on predict step

_update_features_for_sparse(time_series, idx)[source]

Make sparse matrix which will be used during forecasting

Parameters
  • time_series (numpy.array) –

  • idx (numpy.array) –

class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.SparseLaggedTransformationImplementation(params)[source]

Bases: fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.LaggedImplementation

Implementation of sparse lagged transformation for time series forecasting

Parameters

params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) –

class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.LaggedTransformationImplementation(params)[source]

Bases: fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.LaggedImplementation

Implementation of lagged transformation for time series forecasting

Parameters

params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) –

class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.TsSmoothingImplementation(params)[source]

Bases: fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.DataOperationImplementation

Parameters

params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) –

fit(input_data)[source]

Class doesn’t support fit operation

Parameters

input_data (InputData) – data with features, target and ids to process

transform(input_data)[source]

Method for smoothing time series

Parameters

input_data (InputData) – data with features, target and ids to process

Returns

output data with smoothed time series

Return type

OutputData

class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.ExogDataTransformationImplementation(params)[source]

Bases: fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.DataOperationImplementation

Parameters

params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) –

fit(input_data)[source]

Class doesn’t support fit operation

Parameters

input_data (InputData) – data with features, target and ids to process

transform(input_data)[source]

Method for representing time series as a column

Parameters

input_data (InputData) – data with features, target and ids to process

Returns

output data with features as columns

Return type

OutputData

transform_for_fit(input_data)[source]

Method for representing time series as a column for the fit stage

Parameters

input_data (InputData) – data with features, target and ids to process

Returns

output data with features as columns

Return type

OutputData

class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.GaussianFilterImplementation(params)[source]

Bases: fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.DataOperationImplementation

Parameters

params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) –

fit(input_data)[source]

Class doesn’t support fit operation

Parameters

input_data (InputData) – data with features, target and ids to process

transform(input_data)[source]

Method for smoothing time series for predict stage

Parameters

input_data (InputData) – data with features, target and ids to process

Returns

output data with smoothed time series

Return type

OutputData
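
As an illustration of Gaussian smoothing, the sketch below convolves a series with a truncated, normalized Gaussian kernel. It is a plain-Python approximation under stated assumptions: the sigma and radius parameters, the truncation at three sigma, and the renormalization of weights at the edges are choices made for this example, and the source does not say which filter routine or parameters the class uses.

```python
import math


def gaussian_smooth(series, sigma=1.0, radius=None):
    """Smooth a series with a truncated Gaussian kernel (sigma must be positive)."""
    if radius is None:
        radius = max(1, int(3 * sigma))  # assumption: truncate at about 3 sigma
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in offsets]
    smoothed = []
    n = len(series)
    for t in range(n):
        total = 0.0
        norm = 0.0
        for k, w in zip(offsets, weights):
            j = t + k
            if 0 <= j < n:  # near the edges only the available points are used
                total += w * series[j]
                norm += w
        smoothed.append(total / norm)
    return smoothed


spike = gaussian_smooth([0, 0, 0, 1, 0, 0, 0], sigma=1.0)
```

A single spike is spread over its neighbours, and a constant series is left unchanged because the weights are renormalized at every position.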

class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.NumericalDerivativeFilterImplementation(params)[source]

Bases: fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.DataOperationImplementation

Parameters

params (fedot.core.operations.operation_parameters.OperationParameters) –

fit(input_data)[source]

Class doesn’t support fit operation

Parameters

input_data (InputData) – data with features, target and ids to process

transform(input_data)[source]

Method for finding numerical derivative of time series for predict stage

Parameters

input_data (InputData) – data with features, target and ids to process

Returns

output data with the numerical derivative of the time series

Return type

OutputData

_differential_filter(ts)[source]

NumericalDerivative filter
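
For orientation, the simplest numerical derivative is a first-order forward difference, sketched below. This is an assumption chosen for illustration: the source does not describe which differencing scheme, order, or window the filter uses, and the function name and step parameter are hypothetical.

```python
def forward_difference(ts, step=1.0):
    """Approximate the derivative as (ts[t + 1] - ts[t]) / step."""
    return [(ts[t + 1] - ts[t]) / step for t in range(len(ts) - 1)]


derivative = forward_difference([1, 4, 9, 16])
```

The result is one element shorter than the input, because the last point has no successor to difference against.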

class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.CutImplementation(params)[source]

Bases: fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.DataOperationImplementation

Parameters

params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) –

fit(input_data)[source]

Class doesn’t support fit operation

Parameters

input_data (InputData) – data with features, target and ids to process

transform(input_data)[source]

Cut the first cut_part of the time series

new_len = len - int(self.cut_part * (input_values.shape[0]-horizon))

Parameters

input_data (InputData) – data with features, target and ids to process

Returns

output data with cut time series

Return type

OutputData
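
The length formula above can be turned into a small sketch. The function name is hypothetical, and values stands in for input_values from the formula; cut_part and horizon are used exactly as they appear there.

```python
def cut_first_part(values, cut_part, horizon):
    """Drop the first int(cut_part * (len(values) - horizon)) elements."""
    n_removed = int(cut_part * (len(values) - horizon))
    return values[n_removed:]


series = list(range(10))
shortened = cut_first_part(series, cut_part=0.5, horizon=2)
```

For a series of 10 values with cut_part = 0.5 and horizon = 2, int(0.5 * 8) = 4 values are removed, so new_len = 10 - 4 = 6.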

transform_for_fit(input_data)[source]

Cut the first cut_part of the time series for the fit stage

new_len = len - int(self.cut_part * (input_values.shape[0]-horizon))

Parameters

input_data (InputData) – data with features, target and ids to process

Returns

output data with cut time series

Return type

OutputData

fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.ts_to_table(idx, time_series, window_size, is_lag=False)[source]

Method converts a time series to lagged form.

Parameters
  • idx – the indices of the time series to convert

  • time_series (numpy.array) – source time series

  • window_size (int) – size of sliding window, which defines lag

  • is_lag (bool) – whether the function is used for the lagged transformation. False is needed to convert one-dimensional output to lagged form.

Returns

updated_idx -> clipped indices of time series

features_columns -> lagged time series feature table
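
The lagged form can be pictured as a table of sliding windows, as in the sketch below. It is a simplified illustration, not the FEDOT code: the function name is hypothetical, the is_lag switch is omitted, and labelling each row with the index of the last value in its window is an assumption about how the indices are clipped.

```python
def to_lagged_table(idx, time_series, window_size):
    """Build a table whose rows are consecutive windows of the series."""
    n_rows = len(time_series) - window_size + 1
    features_columns = [time_series[start:start + window_size]
                        for start in range(n_rows)]
    # Assumption: each row is labelled with the index of the last value in its window
    updated_idx = idx[window_size - 1:]
    return updated_idx, features_columns


updated_idx, table = to_lagged_table(idx=[0, 1, 2, 3, 4],
                                     time_series=[1, 2, 3, 4, 5],
                                     window_size=3)
```

A series of 5 values with a window of 3 yields 3 rows, and the index is clipped to the same length.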

fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations._sparse_matrix(logger, features_columns, n_components_perc=0.5, use_svd=False)[source]

Method converts the matrix to sparse form

Parameters
  • features_columns (numpy.array) – matrix to convert to sparse form

  • n_components_perc – initial approximation of percent of components to keep

  • use_svd – whether to use the SVD method for the conversion instead of the naive method

Returns

reduced dimension matrix

Notes

The shape of the returned matrix depends on the number of components kept, which is determined by the threshold on explained variance gain

fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations._get_svd(features_columns, n_components)[source]

Method converts the matrix to sparse form using SVD

Parameters
  • features_columns (numpy.array) – matrix to convert to sparse form

  • n_components (int) – number of components to keep

Returns

transformed sparse matrix

fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.prepare_target(all_idx, idx, features_columns, target, forecast_length)[source]

Method converts a time series to lagged form. The transformation is applied only to generate the target table (the time series is treated as a multi-target regression task)

Parameters
  • all_idx – all indices in data

  • idx – remaining indices after lagged feature table generation

  • features_columns (numpy.array) – lagged feature table

  • target – source time series

  • forecast_length (int) – forecast length

Returns

updated_idx, updated_features, updated_target

more information:
  • updated_idx -> clipped indices of time series

  • updated_features -> clipped lagged feature table

  • updated_target -> lagged target table
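
Treating forecasting as multi-target regression means that every window of past values is paired with the next forecast_length values. The sketch below builds both tables directly from the series to show the pairing and the clipping. It is a simplified illustration with a hypothetical function name; unlike prepare_target, it does not take a precomputed feature table or indices.

```python
def lagged_features_and_target(time_series, window_size, forecast_length):
    """Pair each window of past values with the values that follow it."""
    features, target = [], []
    # Windows that have no full target of forecast_length values are dropped
    last_start = len(time_series) - window_size - forecast_length
    for start in range(last_start + 1):
        split = start + window_size
        features.append(time_series[start:split])
        target.append(time_series[split:split + forecast_length])
    return features, target


features, target = lagged_features_and_target([1, 2, 3, 4, 5, 6],
                                              window_size=3,
                                              forecast_length=2)
```

Of the four possible windows of length 3, only the first two are kept, because the later ones are not followed by two known values.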

fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.transform_features_and_target_into_lagged(input_data, forecast_length, window_size)[source]

Perform the lagged transformation first on the features and then on the target array

Parameters
  • input_data (InputData) – dataclass with features

  • forecast_length (int) – forecast horizon

  • window_size (int) – window size for features transformation

Returns

new_idx, transformed_cols, new_target

more information:
  • new_idx ->

  • transformed_cols ->

  • new_target ->