Transformations

Sklearn Transformation

class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.ComponentAnalysisImplementation(params)[source]

Bases: DataOperationImplementation

Class for applying PCA and kernel PCA models from sklearn

Parameters: params (Optional[OperationParameters]) – OperationParameters with the arguments

fit(input_data)[source]

The method trains the PCA model

Parameters: input_data (InputData) – data with features, target and ids for PCA training
Returns: trained PCA model (optional output)
Return type: PCA

transform(input_data)[source]

Method for transforming tabular data using PCA

Parameters: input_data (InputData) – data with features, target and ids for applying PCA
Returns: data with transformed features attribute
Return type: OutputData

check_and_correct_params(is_ts_data=False)[source]

Method checks whether the data has enough features for the n_components parameter of PCA and corrects the parameter if it does not

Parameters: is_ts_data (bool) –
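
The correction can be pictured with a minimal sketch. The helper name correct_n_components is hypothetical and not part of FEDOT. It assumes an integer n_components is simply capped at the number of available features; the library may treat fractional values or time series data differently.

```python
def correct_n_components(n_components: int, n_features: int) -> int:
    """Cap an integer n_components at the number of features (hypothetical helper)."""
    if n_components < 1:
        raise ValueError("n_components must be a positive integer")
    # PCA cannot produce more components than there are input features
    return min(n_components, n_features)


requested = 10
available_features = 4
corrected = correct_n_components(requested, available_features)
print(corrected)
```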

static update_column_types(output_data)[source]

Update column types after applying PCA operations

Parameters: output_data (OutputData) –
Return type: OutputData

class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.PCAImplementation(params=None)[source]

Bases: ComponentAnalysisImplementation

Class for applying PCA from sklearn

Parameters: params (Optional[OperationParameters]) – OperationParameters with the hyperparameters

class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.DaskPCAImplementation(params=None)[source]

Bases: ComponentAnalysisImplementation

Class for applying PCA from sklearn

Parameters: params (Optional[OperationParameters]) – OperationParameters with the hyperparameters

class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.KernelPCAImplementation(params)[source]

Bases: ComponentAnalysisImplementation

Class for applying kernel PCA from sklearn

Parameters: params (Optional[OperationParameters]) – OperationParameters with the hyperparameters

class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.FastICAImplementation(params)[source]

Bases: ComponentAnalysisImplementation

Class for applying FastICA from sklearn

Parameters: params (Optional[OperationParameters]) – OperationParameters with the hyperparameters

class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.PolyFeaturesImplementation(params)[source]

Bases: EncodedInvariantImplementation

Class for applying the PolynomialFeatures operation to data. Only features that are not encoded (that is, not produced from categorical features by OneHot encoding) are used

Parameters: params (Optional[OperationParameters]) – OperationParameters with the arguments

fit(input_data)[source]

Method for fitting the Poly features operation

Parameters: input_data (InputData) –

transform(input_data)[source]

First, the columns are filtered

Parameters: input_data (InputData) –
Return type: OutputData

_update_column_types(source_features_shape, output_data)[source]

Update column types after applying operations. If new columns are added, types are defined for them

Parameters: output_data (OutputData) –

class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.ScalingImplementation(params)[source]

Bases: EncodedInvariantImplementation

Class for applying the Scaling operation to data. Only features that are not encoded (that is, not produced from categorical features by OneHot encoding) are used

Parameters: params (Optional[OperationParameters]) – OperationParameters with the arguments

class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.NormalizationImplementation(params)[source]

Bases: EncodedInvariantImplementation

Class for applying the MinMax normalization operation to data. Only features that are not encoded (that is, not produced from categorical features by OneHot encoding) are used

Parameters: params (Optional[OperationParameters]) – OperationParameters with the arguments
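
The idea of leaving encoded columns untouched can be illustrated with a short sketch. It is not FEDOT code: the function names are hypothetical, and it assumes that a column counts as OneHot-encoded when it contains only the values 0 and 1, which may not match the library's actual check.

```python
def is_encoded(column):
    """Treat a column as one-hot encoded if it holds only 0 and 1 (assumption)."""
    return set(column) <= {0, 1}


def min_max_skip_encoded(table):
    """Min-max scale every column except those that look one-hot encoded."""
    columns = [list(col) for col in zip(*table)]
    scaled = []
    for col in columns:
        low, high = min(col), max(col)
        if is_encoded(col) or high == low:
            # Encoded or constant columns are passed through unchanged
            scaled.append(col)
        else:
            scaled.append([(value - low) / (high - low) for value in col])
    # Convert back from column-major to row-major layout
    return [list(row) for row in zip(*scaled)]


rows = [[10.0, 1, 200.0],
        [20.0, 0, 400.0],
        [30.0, 1, 300.0]]
scaled_rows = min_max_skip_encoded(rows)
print(scaled_rows)
```

The middle column keeps its 0/1 values, while the first and third columns are mapped onto the range [0, 1].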

class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.ImputationImplementation(params=None)[source]

Bases: DataOperationImplementation

Class for applying imputation on tabular data

Parameters: params (Optional[OperationParameters]) – OperationParameters with the arguments

fit(input_data)[source]

The method trains SimpleImputer

Parameters: input_data (InputData) – data with features

transform(input_data)[source]

Method for transforming tabular data using SimpleImputer

Parameters: input_data (InputData) – data with features
Returns: data with transformed features attribute
Return type: OutputData

fit_transform(input_data)[source]

Method for training on and transforming tabular data using SimpleImputer

Parameters: input_data (InputData) – data with features
Returns: data with transformed features attribute
Return type: OutputData

_categorical_numerical_union(categorical_features, numerical_features)[source]

Merge numerical and categorical features in the right order (as they were in the source table)

Parameters

categorical_features (array) –
numerical_features (array) –

Return type

array

_find_binary_features(numerical_features)[source]

Find indices of features with only two unique values in column

Notes

All features in table are numerical

Parameters: numerical_features (array) –
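
A minimal sketch of such a search is given below. The function name find_binary_columns is hypothetical, the table is a plain list of rows, and ignoring NaN gaps when counting unique values is an assumption about how missing data should be treated before imputation.

```python
import math


def find_binary_columns(table):
    """Return indices of columns with exactly two distinct non-missing values."""
    binary_ids = []
    for col_id, column in enumerate(zip(*table)):
        # NaN marks a gap and is ignored when counting unique values (assumption)
        present = {value for value in column if not math.isnan(value)}
        if len(present) == 2:
            binary_ids.append(col_id)
    return binary_ids


nan = float("nan")
data = [[0.0, 3.5, 1.0],
        [1.0, nan, 1.0],
        [nan, 7.2, 0.0],
        [1.0, 1.1, 0.0]]
binary_ids = find_binary_columns(data)
print(binary_ids)
```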

_correct_binary_ids_features(filled_numerical_features)[source]

Correct filled features if they were previously binary. Discretization is performed for the reconstructed values

Tip

[1, 1, 0.75, 0] will be transformed to [1, 1, 1, 0]

Parameters: filled_numerical_features (array) –
Return type: array
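
The discretization in the tip can be reproduced with a small sketch. The helper snap_to_binary is hypothetical; it assumes each filled value is moved to the nearer of the two original levels, with ties going to the upper level, which is a choice made only for this illustration.

```python
def snap_to_binary(values, low=0, high=1):
    """Replace each value with whichever of the two original levels is closer."""
    midpoint = (low + high) / 2
    # Values exactly at the midpoint go to the upper level (arbitrary choice here)
    return [high if value >= midpoint else low for value in values]


filled = [1, 1, 0.75, 0]
corrected = snap_to_binary(filled)
print(corrected)
```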

get_params()[source]

Method returns the parameters that can be optimized for a particular operation

Return type: OperationParameters

Time Series Transformation

class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.LaggedImplementation(params)[source]

Bases: DataOperationImplementation

Parameters: params (Optional[OperationParameters]) –

fit(input_data)[source]

Class doesn’t support fit operation

Parameters: input_data – data with features, target and ids to process

transform(input_data)[source]

Method for transformation of time series to lagged form for predict stage

Parameters: input_data (InputData) – data with features, target and ids to process
Returns: output data with transformed features table
Return type: OutputData

transform_for_fit(input_data)[source]

Method for transformation of time series to lagged form for fit stage

Parameters: input_data (InputData) – data with features, target and ids to process
Returns: output data with transformed features table
Return type: OutputData

_check_and_correct_window_size(time_series, forecast_length)[source]

Method checks whether the time series is too short for the lagged transformation

Parameters

time_series (ndarray) – time series for transformation
forecast_length (int) – forecast length

Returns:

_update_column_types(output_data)[source]

Update column types after lagged transformation. All features become float

Parameters: output_data (OutputData) –

_apply_transformation_for_fit(input_data, features, target, forecast_length, old_idx)[source]

Apply lagged transformation on each time series in the current dataset

Parameters

input_data (InputData) –
features (array) –
target (array) –
forecast_length (int) –
old_idx (array) –

stack_by_type_fit(input_data, all_features, all_target, all_idx, features, target, idx)[source]

Apply stack function for multi_ts and multivariable ts types on fit step

_stack_multi_variable(all_features, all_target, all_idx, features, target, idx)[source]

Horizontally stack tables. For multiple variables, this extends the features used for training

Parameters

all_features (array) – array with all features for adding new
all_target (array) – array with all target (does not change)
all_idx (array) – array with all indices (does not change)
features (array) – array with new features for adding
target (array) – array with new target for adding
idx (Union[list, array]) – array with new idx for adding

Returns

table

_stack_multi_ts(all_features, all_target, all_idx, features, target, idx)[source]

Vertically stack tables. For multi_ts data, this extends the training set as a combination of train and target

Parameters

all_features (array) – array with all features for adding new
all_target (array) – array with all target
all_idx (array) – array with all indices
features (array) – array with new features for adding
target (array) – array with new target for adding
idx (Union[list, array]) – array with new idx for adding

Returns

table
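
The difference between the two stacking modes can be shown with plain lists. This is a simplified sketch with hypothetical function names, not the library code: indices are left out, and tables are lists of rows rather than arrays.

```python
def stack_horizontally(all_features, features):
    """Multivariate case: append the new lagged columns to every existing row."""
    if len(all_features) != len(features):
        raise ValueError("tables must have the same number of rows")
    return [old_row + new_row for old_row, new_row in zip(all_features, features)]


def stack_vertically(all_features, all_target, features, target):
    """multi_ts case: append the new rows to both the feature and target tables."""
    return all_features + features, all_target + target


first_features = [[1, 2], [2, 3]]
first_target = [[3], [4]]
second_features = [[10, 20], [20, 30]]
second_target = [[30], [40]]

# More columns, same number of rows; the target table is not touched
wide = stack_horizontally(first_features, second_features)

# More rows, same number of columns; the target table grows as well
tall_features, tall_target = stack_vertically(
    first_features, first_target, second_features, second_target
)
print(wide)
print(tall_features, tall_target)
```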

_current_target_for_each_ts(current_ts_id, target)[source]

Returns target for each time series

_apply_transformation_for_predict(input_data)[source]

Apply lagged transformation for every column (time series) in the dataset

Parameters: input_data (InputData) –

stack_by_type_predict(input_data, all_features, part_to_add)[source]

Apply stack function for multi_ts and multivariable ts types on predict step

_update_features_for_sparse(time_series, idx)[source]

Make sparse matrix which will be used during forecasting

Parameters

time_series (array) –
idx (array) –

class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.SparseLaggedTransformationImplementation(params)[source]

Bases: LaggedImplementation

Implementation of sparse lagged transformation for time series forecasting

Parameters: params (Optional[OperationParameters]) –

class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.LaggedTransformationImplementation(params)[source]

Bases: LaggedImplementation

Implementation of lagged transformation for time series forecasting

Parameters: params (Optional[OperationParameters]) –

class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.TsSmoothingImplementation(params)[source]

Bases: DataOperationImplementation

Parameters: params (Optional[OperationParameters]) –

fit(input_data)[source]

Class doesn’t support fit operation

Parameters: input_data (InputData) – data with features, target and ids to process

transform(input_data)[source]

Method for smoothing time series

Parameters: input_data (InputData) – data with features, target and ids to process
Returns: output data with smoothed time series
Return type: OutputData
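
As an illustration of what smoothing a series can look like, here is a trailing moving average. This is an assumption made for the sketch: the reference above does not state which smoothing method or parameters the class uses, and the name moving_average is hypothetical.

```python
def moving_average(series, window_size):
    """Trailing moving average; the first window_size - 1 points are kept as is."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    # Points without a full window behind them are copied unchanged
    smoothed = list(series[:window_size - 1])
    for end in range(window_size, len(series) + 1):
        window = series[end - window_size:end]
        smoothed.append(sum(window) / window_size)
    return smoothed


noisy = [1.0, 3.0, 2.0, 6.0, 4.0]
smooth = moving_average(noisy, window_size=2)
print(smooth)
```

The output has the same length as the input, so the indices of the series do not need to change.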

class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.ExogDataTransformationImplementation(params)[source]

Bases: DataOperationImplementation

Parameters: params (Optional[OperationParameters]) –

fit(input_data)[source]

Class doesn’t support fit operation

Parameters: input_data (InputData) – data with features, target and ids to process

transform(input_data)[source]

Method for representing time series as column

Parameters: input_data (InputData) – data with features, target and ids to process
Returns: output data with features as columns
Return type: OutputData

transform_for_fit(input_data)[source]

Method for representing time series as column for fit stage

Parameters: input_data (InputData) – data with features, target and ids to process
Returns: output data with features as columns
Return type: OutputData

class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.GaussianFilterImplementation(params)[source]

Bases: DataOperationImplementation

Parameters: params (Optional[OperationParameters]) –

fit(input_data)[source]

Class doesn’t support fit operation

Parameters: input_data (InputData) – data with features, target and ids to process

transform(input_data)[source]

Method for smoothing time series for predict stage

Parameters: input_data (InputData) – data with features, target and ids to process
Returns: output data with smoothed time series
Return type: OutputData

class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.NumericalDerivativeFilterImplementation(params)[source]

Bases: DataOperationImplementation

Parameters: params (OperationParameters) –

fit(input_data)[source]

Class doesn’t support fit operation

Parameters: input_data (InputData) – data with features, target and ids to process

transform(input_data)[source]

Method for finding numerical derivative of time series for predict stage

Parameters: input_data (InputData) – data with features, target and ids to process
Returns: output data with the numerical derivative of the time series
Return type: OutputData

_differential_filter(ts)[source]

NumericalDerivative filter

class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.CutImplementation(params)[source]

Bases: DataOperationImplementation

Parameters: params (Optional[OperationParameters]) –

fit(input_data)[source]

Class doesn’t support fit operation

Parameters: input_data (InputData) – data with features, target and ids to process

transform(input_data)[source]

Cut the first cut_part share of the time series

new_len = len - int(self.cut_part * (input_values.shape[0]-horizon))

Parameters: input_data (InputData) – data with features, target and ids to process
Returns: output data with cut time series
Return type: OutputData
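
The length formula above can be checked with a short sketch. The function cut_series is hypothetical; it assumes cut_part is a share in [0, 1) and horizon is the forecast length, and it works on a plain list instead of InputData.

```python
def cut_series(values, cut_part, horizon):
    """Drop the oldest share of a series; the last `horizon` points are not counted."""
    if not 0 <= cut_part < 1:
        raise ValueError("cut_part must be in [0, 1)")
    # Number of leading elements to remove, as in the formula above
    removed = int(cut_part * (len(values) - horizon))
    return values[removed:]


series = list(range(100))
shortened = cut_series(series, cut_part=0.5, horizon=10)
print(len(shortened), shortened[0])
```

With 100 points, cut_part = 0.5 and horizon = 10, int(0.5 * 90) = 45 elements are removed, leaving 55.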

transform_for_fit(input_data)[source]

Cut the first cut_part share of the time series for the fit stage: new_len = len - int(self.cut_part * (input_values.shape[0]-horizon))

Parameters: input_data (InputData) – data with features, target and ids to process
Returns: output data with cut time series
Return type: OutputData

fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.ts_to_table(idx, time_series, window_size, is_lag=False)[source]

Method converts a time series to lagged form.

Parameters

idx – the indices of the time series to convert
time_series (array) – source time series
window_size (int) – size of sliding window, which defines lag
is_lag (bool) – whether the function is used for the lagged transformation. False is needed to convert a one-dimensional output to lagged form.

Returns

updated_idx -> clipped indices of time series

features_columns -> lagged time series feature table
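
The lagged form itself is a sliding-window table. The sketch below shows only that part: the name sliding_windows is hypothetical, and the clipping of indices (updated_idx) and the is_lag switch are deliberately left out because their exact behavior is not described here.

```python
def sliding_windows(time_series, window_size):
    """Return every run of `window_size` consecutive values as one table row."""
    if not 1 <= window_size <= len(time_series):
        raise ValueError("window_size must be between 1 and the series length")
    n_rows = len(time_series) - window_size + 1
    return [list(time_series[start:start + window_size]) for start in range(n_rows)]


series = [10, 11, 12, 13, 14]
table = sliding_windows(series, window_size=3)
for row in table:
    print(row)
```

A series of length n yields n - window_size + 1 rows, each shifted by one step relative to the previous row.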

fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations._sparse_matrix(logger, features_columns, n_components_perc=0.5, use_svd=False)[source]

Method converts the matrix to sparse form

Parameters

features_columns (array) – matrix to convert to sparse form
n_components_perc – initial approximation of the percentage of components to keep
use_svd – whether to use the SVD method for the conversion instead of the naive method

Returns

reduced dimension matrix

Notes

The shape of the returned matrix depends on the number of components, which is governed by the threshold of explained variance gain

fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations._get_svd(features_columns, n_components)[source]

Method converts the matrix to SVD sparse form

Parameters

features_columns (array) – matrix to convert to sparse form
n_components (int) – number of components to keep

Returns

transformed sparse matrix

fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.prepare_target(all_idx, idx, features_columns, target, forecast_length)[source]

Method converts a time series to lagged form. The transformation is applied only to generate the target table (the time series is treated as a multi-target regression task)

Parameters

all_idx – all indices in data
idx – remaining indices after lagged feature table generation
features_columns (array) – lagged feature table
target – source time series
forecast_length (int) – forecast length

Returns

updated_idx, updated_features, updated_target

more information:

updated_idx -> clipped indices of time series
updated_features -> clipped lagged feature table
updated_target -> lagged target table
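
Treating forecasting as multi-target regression means each lagged feature row is paired with the next forecast_length values. The following is a simplified sketch with a hypothetical function name; it builds both tables in one pass from a plain list and does not track indices, unlike the function documented above.

```python
def lagged_features_and_target(time_series, window_size, forecast_length):
    """Pair each window of past values with the values that follow it."""
    features, target = [], []
    # Last start position that still leaves room for a full target row
    last_start = len(time_series) - window_size - forecast_length
    for start in range(last_start + 1):
        split = start + window_size
        features.append(list(time_series[start:split]))
        target.append(list(time_series[split:split + forecast_length]))
    return features, target


series = [1, 2, 3, 4, 5, 6]
features, target = lagged_features_and_target(series, window_size=3, forecast_length=2)
for feature_row, target_row in zip(features, target):
    print(feature_row, "->", target_row)
```

Windows that have no complete target row after them are dropped, which is why the feature table is clipped.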

fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.transform_features_and_target_into_lagged(input_data, forecast_length, window_size)[source]

Perform the lagged transformation first on the features and then on the target array

Parameters

input_data (InputData) – dataclass with features
forecast_length (int) – forecast horizon
window_size (int) – window size for features transformation

Returns

new_idx, transformed_cols, new_target

more information:

new_idx ->
transformed_cols ->
new_target ->