Transformations
Sklearn Transformation
- class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.ComponentAnalysisImplementation(params)[source]
Bases:
fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.DataOperationImplementation
Class for applying PCA and kernel PCA models from sklearn
- Parameters
params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) – OperationParameters with the arguments
- fit(input_data)[source]
The method trains the PCA model
- Parameters
input_data (InputData) – data with features, target and ids for PCA training
- Returns
trained PCA model (optional output)
- Return type
sklearn.decomposition._pca.PCA
- transform(input_data)[source]
Method for transforming tabular data using PCA
- Parameters
input_data (InputData) – data with features, target and ids for applying PCA
- Returns
data with transformed features attribute
- Return type
- check_and_correct_params(is_ts_data=False)[source]
Checks whether the number of features in the data is sufficient for the n_components parameter of PCA and corrects the parameter if it is not
- Parameters
is_ts_data (bool) –
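The description above does not state how the parameter is corrected. As a minimal sketch, assuming an integer n_components that is simply capped at the number of available features (the helper name clamp_n_components is hypothetical and not part of FEDOT):

```python
def clamp_n_components(n_components: int, n_features: int) -> int:
    """Hypothetical helper: keep n_components within [1, n_features].

    This only illustrates the idea of correcting an oversized value;
    the real check_and_correct_params may apply a different rule.
    """
    if n_features < 1:
        raise ValueError("data must contain at least one feature")
    # Cap at the feature count, but never go below one component
    return max(1, min(n_components, n_features))


# Requesting 10 components from a 4-column table is reduced to 4
corrected = clamp_n_components(10, 4)
unchanged = clamp_n_components(3, 4)
```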
- static update_column_types(output_data)[source]
Update column types after applying PCA operations
- Parameters
output_data (OutputData) –
- Return type
- class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.PCAImplementation(params=None)[source]
Class for applying PCA from sklearn
- Parameters
params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) – OperationParameters with the hyperparameters
- class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.KernelPCAImplementation(params)[source]
Class for applying kernel PCA from sklearn
- Parameters
params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) – OperationParameters with the hyperparameters
- class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.FastICAImplementation(params)[source]
Class for applying FastICA from sklearn
- Parameters
params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) – OperationParameters with the hyperparameters
- class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.PolyFeaturesImplementation(params)[source]
Bases:
fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.EncodedInvariantImplementation
Class for applying the PolynomialFeatures operation to data, where only non-encoded features (those not converted from categorical ones using one-hot encoding) are used
- Parameters
params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) – OperationParameters with the arguments
- transform(input_data)[source]
First performs column filtering
- Parameters
input_data (InputData) –
- Return type
- _update_column_types(source_features_shape, output_data)[source]
Update column types after applying operations. If new columns are added, types are defined for them
- Parameters
output_data (OutputData) –
- class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.ScalingImplementation(params)[source]
Bases:
fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.EncodedInvariantImplementation
Class for applying the Scaling operation to data, where only non-encoded features (those not converted from categorical ones using one-hot encoding) are used
- Parameters
params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) – OperationParameters with the arguments
- class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.NormalizationImplementation(params)[source]
Bases:
fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.EncodedInvariantImplementation
Class for applying the MinMax normalization operation to data, where only non-encoded features (those not converted from categorical ones using one-hot encoding) are used
- Parameters
params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) – OperationParameters with the arguments
- class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.ImputationImplementation(params=None)[source]
Bases:
fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.DataOperationImplementation
Class for applying imputation on tabular data
- Parameters
params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) – OperationParameters with the arguments
- fit(input_data)[source]
The method trains SimpleImputer
- Parameters
input_data (InputData) – data with features
- transform(input_data)[source]
Method for transforming tabular data using SimpleImputer
- Parameters
input_data (InputData) – data with features
- Returns
data with transformed features attribute
- Return type
- fit_transform(input_data)[source]
Method for training and transforming tabular data using SimpleImputer
- Parameters
input_data (InputData) – data with features
- Returns
data with transformed features attribute
- Return type
- _categorical_numerical_union(categorical_features, numerical_features)[source]
Merge numerical and categorical features in the right order (as in the source table)
- Parameters
categorical_features (numpy.array) –
numerical_features (numpy.array) –
- Return type
numpy.array
- _find_binary_features(numerical_features)[source]
Find indices of features with only two unique values in column
Notes
All features in table are numerical
- Parameters
numerical_features (numpy.array) –
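The search for binary columns described above can be illustrated on a plain list-of-rows table. This is a minimal sketch, not FEDOT's code: the function name find_binary_columns is hypothetical, and skipping NaN values before counting is an assumption made here because the table is expected to contain gaps at the imputation stage.

```python
def find_binary_columns(table):
    """Hypothetical helper: indices of columns with exactly two distinct values.

    `table` is a list of equally long rows of numbers. NaN entries are
    ignored (assumption), using the fact that NaN is not equal to itself.
    """
    n_cols = len(table[0]) if table else 0
    binary_ids = []
    for col in range(n_cols):
        distinct = {row[col] for row in table if row[col] == row[col]}
        if len(distinct) == 2:
            binary_ids.append(col)
    return binary_ids


nan = float("nan")
example = [
    [0, 1.5, 1],
    [1, 2.5, 1],
    [0, 3.5, 1],
    [nan, 4.0, 1],
]
# Column 0 holds only 0 and 1; column 1 has four values; column 2 is constant
binary_columns = find_binary_columns(example)
```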
- _correct_binary_ids_features(filled_numerical_features)[source]
Correct filled features that were previously binary. The reconstructed values are discretized
Tip
[1, 1, 0.75, 0] will be transformed to [1, 1, 1, 0]
- Parameters
filled_numerical_features (numpy.array) –
- Return type
numpy.array
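The tip above can be reproduced with a small sketch. The exact discretization rule is not given in this reference, so snapping each filled value to the nearer of the two original values is an assumption, and snap_to_binary is a hypothetical name.

```python
def snap_to_binary(values, low, high):
    """Hypothetical helper: replace each value with the nearer of `low` and `high`.

    Values exactly halfway between the two are assigned to `high` (assumption).
    """
    return [low if abs(v - low) < abs(v - high) else high for v in values]


# An imputer filled a gap in a 0/1 column with 0.75; snap it back to 1
restored = snap_to_binary([1, 1, 0.75, 0], low=0, high=1)
```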
Time Series Transformation
- fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.random() -> x in the interval [0, 1).
- class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.LaggedImplementation(params)[source]
Bases:
fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.DataOperationImplementation
- Parameters
params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) –
- fit(input_data)[source]
Class doesn’t support fit operation
- Parameters
input_data – data with features, target and ids to process
- transform(input_data)[source]
Method for transformation of time series to lagged form for predict stage
- Parameters
input_data (InputData) – data with features, target and ids to process
- Returns
output data with transformed features table
- Return type
- transform_for_fit(input_data)[source]
Method for transformation of time series to lagged form for fit stage
- Parameters
input_data (InputData) – data with features, target and ids to process
- Returns
output data with transformed features table
- Return type
- _check_and_correct_window_size(time_series, forecast_length)[source]
Method checks whether the time series is too short for the lagged transformation
- Parameters
time_series (numpy.ndarray) – time series for transformation
forecast_length (int) – forecast length
Returns:
- _update_column_types(output_data)[source]
Update column types after lagged transformation. All features become float
- Parameters
output_data (OutputData) –
- _apply_transformation_for_fit(input_data, features, target, forecast_length, old_idx)[source]
Apply lagged transformation on each time series in the current dataset
- Parameters
input_data (InputData) –
features (numpy.array) –
target (numpy.array) –
forecast_length (int) –
old_idx (numpy.array) –
- stack_by_type_fit(input_data, all_features, all_target, all_idx, features, target, idx)[source]
Apply stack function for multi_ts and multivariable ts types on fit step
- _stack_multi_variable(all_features, all_target, all_idx, features, target, idx)[source]
Horizontally stack tables as multiple variables, extending the features for training
- Parameters
all_features (numpy.array) – array with all features, to which new ones are added
all_target (numpy.array) – array with all target (does not change)
all_idx (numpy.array) – array with all indices (does not change)
features (numpy.array) – array with new features for adding
target (numpy.array) – array with new target for adding
idx (Union[list, numpy.array]) – array with new idx for adding
- Returns
table
- _stack_multi_ts(all_features, all_target, all_idx, features, target, idx)[source]
Vertically stack tables as multi_ts data, extending the training set as a combination of train and target
- Parameters
all_features (numpy.array) – array with all features, to which new ones are added
all_target (numpy.array) – array with all target
all_idx (numpy.array) – array with all indices
features (numpy.array) – array with new features for adding
target (numpy.array) – array with new target for adding
idx (Union[list, numpy.array]) – array with new idx for adding
- Returns
table
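The difference between the horizontal and vertical stacking described for _stack_multi_variable and _stack_multi_ts can be shown on nested lists. This is only an illustration of the two directions of stacking; the helper names are hypothetical and the real methods operate on numpy arrays and also handle targets and indices.

```python
def stack_horizontally(all_features, features):
    """Hypothetical helper: append new columns to every existing row.

    The number of rows stays the same, so targets and indices need no change.
    """
    if len(all_features) != len(features):
        raise ValueError("both tables must have the same number of rows")
    return [old_row + new_row for old_row, new_row in zip(all_features, features)]


def stack_vertically(all_features, features):
    """Hypothetical helper: append new rows below the existing ones.

    The number of rows grows, so targets and indices must be extended too.
    """
    return all_features + features


wide = stack_horizontally([[1, 2], [3, 4]], [[5], [6]])
tall = stack_vertically([[1, 2]], [[3, 4]])
```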
- _apply_transformation_for_predict(input_data)[source]
Apply lagged transformation for every column (time series) in the dataset
- Parameters
input_data (InputData) –
- class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.SparseLaggedTransformationImplementation(params)[source]
Implementation of sparse lagged transformation for time series forecasting
- Parameters
params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) –
- class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.LaggedTransformationImplementation(params)[source]
Implementation of lagged transformation for time series forecasting
- Parameters
params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) –
- class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.TsSmoothingImplementation(params)[source]
Bases:
fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.DataOperationImplementation
- Parameters
params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) –
- fit(input_data)[source]
Class doesn’t support fit operation
- Parameters
input_data (InputData) – data with features, target and ids to process
- class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.ExogDataTransformationImplementation(params)[source]
Bases:
fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.DataOperationImplementation
- Parameters
params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) –
- fit(input_data)[source]
Class doesn’t support fit operation
- Parameters
input_data (InputData) – data with features, target and ids to process
- transform(input_data)[source]
Method for representing time series as column
- Parameters
input_data (InputData) – data with features, target and ids to process
- Returns
output data with features as columns
- Return type
- class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.GaussianFilterImplementation(params)[source]
Bases:
fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.DataOperationImplementation
- Parameters
params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) –
- fit(input_data)[source]
Class doesn’t support fit operation
- Parameters
input_data (InputData) – data with features, target and ids to process
- class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.NumericalDerivativeFilterImplementation(params)[source]
Bases:
fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.DataOperationImplementation
- Parameters
params (fedot.core.operations.operation_parameters.OperationParameters) –
- fit(input_data)[source]
Class doesn’t support fit operation
- Parameters
input_data (InputData) – data with features, target and ids to process
- class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.CutImplementation(params)[source]
Bases:
fedot.core.operations.evaluation.operation_implementations.implementation_interfaces.DataOperationImplementation
- Parameters
params (Optional[fedot.core.operations.operation_parameters.OperationParameters]) –
- fit(input_data)[source]
Class doesn’t support fit operation
- Parameters
input_data (InputData) – data with features, target and ids to process
- transform(input_data)[source]
Cut the first cut_part from the time series
new_len = len - int(self.cut_part * (input_values.shape[0]-horizon))
- Parameters
input_data (InputData) – data with features, target and ids to process
- Returns
output data with cut time series
- Return type
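The length formula given for CutImplementation.transform can be checked with a short sketch. It assumes that cut_part is a fraction between 0 and 1, that horizon is the forecast length, and that the removed values are taken from the start of the series; the function name cut_series is hypothetical.

```python
def cut_series(series, cut_part, horizon):
    """Hypothetical helper: drop the first part of a series.

    The number of removed values follows the documented formula:
    new_len = len - int(cut_part * (len - horizon))
    """
    n_removed = int(cut_part * (len(series) - horizon))
    return series[n_removed:]


series = list(range(10))
# int(0.5 * (10 - 2)) = 4 values are removed, 6 remain
shortened = cut_series(series, cut_part=0.5, horizon=2)
```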
- fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.ts_to_table(idx, time_series, window_size, is_lag=False)[source]
Method converts time series to lagged form.
- Parameters
idx – the indices of the time series to convert
time_series (numpy.array) – source time series
window_size (int) – size of sliding window, which defines lag
is_lag (bool) – whether the function is used for lagged transformation. False is needed to convert one-dimensional output to lagged form.
- Returns
updated_idx -> clipped indices of time series
features_columns -> lagged time series feature table
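The idea of a lagged table can be sketched with a sliding window over a plain list. This is not the ts_to_table implementation: the helper name is hypothetical, and labelling each window with the index of its last element is an assumption about how the indices are clipped.

```python
def series_to_lagged_table(idx, series, window_size):
    """Hypothetical helper: build a table whose rows are consecutive windows.

    Returns the clipped indices and the table of windows. Each row is
    labelled with the index of the last value it contains (assumption).
    """
    n_rows = len(series) - window_size + 1
    table = [series[start:start + window_size] for start in range(n_rows)]
    clipped_idx = idx[window_size - 1:]
    return clipped_idx, table


lag_idx, lag_table = series_to_lagged_table(
    idx=[0, 1, 2, 3, 4, 5],
    series=[10, 11, 12, 13, 14, 15],
    window_size=3,
)
```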
- fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations._sparse_matrix(logger, features_columns, n_components_perc=0.5, use_svd=False)[source]
Method converts the matrix to sparse form
- Parameters
features_columns (numpy.array) – matrix to sparse
n_components_perc – initial approximation of percent of components to keep
use_svd – whether to use the SVD method for sparsification or the naive method
- Returns
reduced dimension matrix
Notes
The shape of the returned matrix depends on the number of components, which is determined by the threshold of explained variance gain
- fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations._get_svd(features_columns, n_components)[source]
Method converts the matrix to svd sparse form
- Parameters
features_columns (numpy.array) – matrix to sparse
n_components (int) – number of components to keep
- Returns
transformed sparse matrix
- fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.prepare_target(all_idx, idx, features_columns, target, forecast_length)[source]
Method converts time series to lagged form. The transformation is applied only to generate the target table (the time series is treated as a multi-target regression task)
- Parameters
all_idx – all indices in data
idx – remaining indices after lagged feature table generation
features_columns (numpy.array) – lagged feature table
target – source time series
forecast_length (int) – forecast length
- Returns
updated_idx, updated_features, updated_target
more information:
updated_idx -> clipped indices of time series
updated_features -> clipped lagged feature table
updated_target -> lagged target table
- fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.transform_features_and_target_into_lagged(input_data, forecast_length, window_size)[source]
Perform lagged transformation first on the features and then on the target array
- Parameters
input_data (InputData) – dataclass with features
forecast_length (int) – forecast horizon
window_size (int) – window size for features transformation
- Returns
new_idx, transformed_cols, new_target
more information:
new_idx ->
transformed_cols ->
new_target ->
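How a feature table and a multi-target table fit together can be sketched as follows. This is an illustration under stated assumptions, not the FEDOT function: the name build_lagged_dataset is hypothetical, indices are omitted, and each target row is assumed to hold the forecast_length values that immediately follow the corresponding feature window.

```python
def build_lagged_dataset(series, window_size, forecast_length):
    """Hypothetical helper: pair each feature window with the values that follow it.

    Windows whose target would run past the end of the series are dropped,
    so both tables have the same number of rows.
    """
    features, targets = [], []
    last_start = len(series) - window_size - forecast_length
    for start in range(last_start + 1):
        split = start + window_size
        features.append(series[start:split])
        targets.append(series[split:split + forecast_length])
    return features, targets


lagged_features, lagged_targets = build_lagged_dataset(
    series=[1, 2, 3, 4, 5, 6, 7],
    window_size=3,
    forecast_length=2,
)
```

Each row of lagged_targets then acts as one multi-target regression label for the matching row of lagged_features.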