Transformations
Sklearn Transformation
- class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.ComponentAnalysisImplementation(params)[source]
Bases: DataOperationImplementation
Class for applying PCA and kernel PCA models from sklearn
- Parameters
params (Optional[OperationParameters]) – OperationParameters with the arguments
- fit(input_data)[source]
The method trains the PCA model
- Parameters
input_data (InputData) – data with features, target and ids for PCA training
- Returns
trained PCA model (optional output)
- Return type
PCA
- transform(input_data)[source]
Method for transforming tabular data using PCA
- Parameters
input_data (InputData) – data with features, target and ids for applying PCA
- Returns
data with transformed features attribute
- Return type
- check_and_correct_params(is_ts_data=False)[source]
Checks whether the number of features in the data is sufficient for the n_components parameter of PCA and, if it is not, corrects the parameter
- Parameters
is_ts_data (bool) –
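The exact correction rule is not spelled out in this reference. As a minimal sketch, assuming the correction simply caps n_components at the number of available features, the check could look like the following. The helper name clamp_n_components is hypothetical and is not part of the library.

```python
def clamp_n_components(n_components: int, n_features: int) -> int:
    """Return an n_components value that does not exceed the feature count.

    Hypothetical helper: capping at n_features is an assumption,
    not the library's documented behaviour.
    """
    if n_features < 1:
        raise ValueError("at least one feature is required")
    return min(n_components, n_features)


# Requesting 10 components on a table with only 4 features
corrected = clamp_n_components(10, n_features=4)
print(corrected)
```

A value that already fits the data passes through unchanged, so the check is safe to run before every fit.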
- static update_column_types(output_data)[source]
Update column types after applying PCA operations
- Parameters
output_data (OutputData) –
- Return type
- class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.PCAImplementation(params=None)[source]
Bases: ComponentAnalysisImplementation
Class for applying PCA from sklearn
- Parameters
params (Optional[OperationParameters]) – OperationParameters with the hyperparameters
- class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.DaskPCAImplementation(params=None)[source]
Bases: ComponentAnalysisImplementation
Class for applying PCA from sklearn
- Parameters
params (Optional[OperationParameters]) – OperationParameters with the hyperparameters
- class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.KernelPCAImplementation(params)[source]
Bases: ComponentAnalysisImplementation
Class for applying kernel PCA from sklearn
- Parameters
params (Optional[OperationParameters]) – OperationParameters with the hyperparameters
- class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.FastICAImplementation(params)[source]
Bases: ComponentAnalysisImplementation
Class for applying FastICA from sklearn
- Parameters
params (Optional[OperationParameters]) – OperationParameters with the hyperparameters
- class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.PolyFeaturesImplementation(params)[source]
Bases: EncodedInvariantImplementation
Class for applying the PolynomialFeatures operation to data, where only non-encoded features (those not converted from categorical using OneHot encoding) are used
- Parameters
params (Optional[OperationParameters]) – OperationParameters with the arguments
- transform(input_data)[source]
First performs filtration of columns
- Parameters
input_data (InputData) –
- Return type
- _update_column_types(source_features_shape, output_data)[source]
Update column types after applying operations. If new columns are added, types are defined for them
- Parameters
output_data (OutputData) –
- class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.ScalingImplementation(params)[source]
Bases: EncodedInvariantImplementation
Class for applying the Scaling operation to data, where only non-encoded features (those not converted from categorical using OneHot encoding) are used
- Parameters
params (Optional[OperationParameters]) – OperationParameters with the arguments
- class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.NormalizationImplementation(params)[source]
Bases: EncodedInvariantImplementation
Class for applying the MinMax normalization operation to data, where only non-encoded features (those not converted from categorical using OneHot encoding) are used
- Parameters
params (Optional[OperationParameters]) – OperationParameters with the arguments
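To illustrate the idea of transforming only the non-encoded features, here is a minimal sketch of min-max normalization that leaves one-hot columns untouched. It assumes the indices of the encoded columns are already known. The function name min_max_scale_columns and the skip_columns argument are hypothetical, and plain lists are used to keep the example dependency-free.

```python
def min_max_scale_columns(table, skip_columns):
    """Min-max scale every column of `table` except those in `skip_columns`."""
    n_cols = len(table[0])
    scaled = [list(row) for row in table]  # copy so the input stays intact
    for col in range(n_cols):
        if col in skip_columns:
            continue  # encoded column: leave its values as they are
        values = [row[col] for row in table]
        low, high = min(values), max(values)
        span = high - low
        for row in scaled:
            # A constant column maps to 0.0 to avoid division by zero
            row[col] = (row[col] - low) / span if span else 0.0
    return scaled


# Column 0 is numerical; columns 1 and 2 came from one-hot encoding
table = [[10.0, 1, 0], [20.0, 0, 1], [30.0, 1, 0]]
result = min_max_scale_columns(table, skip_columns={1, 2})
print(result)
```

Only the first column is rescaled to the [0, 1] range, while the two indicator columns keep their original 0/1 values.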
- class fedot.core.operations.evaluation.operation_implementations.data_operations.sklearn_transformations.ImputationImplementation(params=None)[source]
Bases: DataOperationImplementation
Class for applying imputation on tabular data
- Parameters
params (Optional[OperationParameters]) – OperationParameters with the arguments
- fit(input_data)[source]
The method trains SimpleImputer
- Parameters
input_data (InputData) – data with features
- transform(input_data)[source]
Method for transforming tabular data using SimpleImputer
- Parameters
input_data (InputData) – data with features
- Returns
data with transformed features attribute
- Return type
- fit_transform(input_data)[source]
Method for training on and transforming tabular data using SimpleImputer
- Parameters
input_data (InputData) – data with features
- Returns
data with transformed features attribute
- Return type
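The relationship between fit, transform and fit_transform can be sketched with a tiny mean imputer. This is a simplified stand-in written in plain Python, not the sklearn SimpleImputer and not the library's implementation. The class name MeanImputerSketch is hypothetical, and the sketch assumes every column has at least one known value.

```python
import math


class MeanImputerSketch:
    """Toy imputer: learns column means on fit, fills NaN on transform."""

    def fit(self, table):
        n_cols = len(table[0])
        self.means_ = []
        for col in range(n_cols):
            # Assumes each column contains at least one non-NaN value
            known = [row[col] for row in table if not math.isnan(row[col])]
            self.means_.append(sum(known) / len(known))
        return self

    def transform(self, table):
        # Replace every NaN with the mean learned for its column
        return [
            [self.means_[col] if math.isnan(value) else value
             for col, value in enumerate(row)]
            for row in table
        ]

    def fit_transform(self, table):
        return self.fit(table).transform(table)


nan = float("nan")
imputer = MeanImputerSketch()
train_filled = imputer.fit_transform([[1.0, nan], [3.0, 4.0]])
test_filled = imputer.transform([[nan, nan]])
print(train_filled, test_filled)
```

The statistics come from the training table only, so new data passed to transform is filled with values learned during fit.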
- _categorical_numerical_union(categorical_features, numerical_features)[source]
Merge numerical and categorical features in the right order (as they were in the source table)
- Parameters
categorical_features (array) –
numerical_features (array) –
- Return type
array
- _find_binary_features(numerical_features)[source]
Find indices of features with only two unique values in column
Notes
All features in table are numerical
- Parameters
numerical_features (array) –
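A minimal sketch of the "two unique values" check, assuming a fully numerical table without missing values. The function name find_binary_columns is hypothetical, and plain lists stand in for arrays.

```python
def find_binary_columns(table):
    """Return indices of columns that contain exactly two unique values."""
    n_cols = len(table[0])
    return [
        col for col in range(n_cols)
        if len({row[col] for row in table}) == 2
    ]


table = [
    [0, 5.1, 1],
    [1, 3.2, 1],
    [0, 4.8, 1],
]
# Column 0 has values {0, 1}; column 1 has three values; column 2 is constant
binary_columns = find_binary_columns(table)
print(binary_columns)
```

A constant column is not reported, because it has one unique value rather than two.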
Time Series Transformation
- fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.random()
random() -> x in the interval [0, 1).
- class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.LaggedImplementation(params)[source]
Bases: DataOperationImplementation
- Parameters
params (Optional[OperationParameters]) –
- fit(input_data)[source]
Class doesn’t support fit operation
- Parameters
input_data – data with features, target and ids to process
- transform(input_data)[source]
Method for transformation of time series to lagged form for predict stage
- Parameters
input_data (InputData) – data with features, target and ids to process
- Returns
output data with transformed features table
- Return type
- transform_for_fit(input_data)[source]
Method for transformation of time series to lagged form for fit stage
- Parameters
input_data (InputData) – data with features, target and ids to process
- Returns
output data with transformed features table
- Return type
- _check_and_correct_window_size(time_series, forecast_length)[source]
Method checks whether the length of the time series is insufficient for the lagged transformation
- Parameters
time_series (ndarray) – time series for transformation
forecast_length (int) – forecast length
Returns:
- _update_column_types(output_data)[source]
Update column types after lagged transformation. All features become float
- Parameters
output_data (OutputData) –
- _apply_transformation_for_fit(input_data, features, target, forecast_length, old_idx)[source]
Apply lagged transformation on each time series in the current dataset
- Parameters
input_data (InputData) –
features (array) –
target (array) –
forecast_length (int) –
old_idx (array) –
- stack_by_type_fit(input_data, all_features, all_target, all_idx, features, target, idx)[source]
Apply stack function for multi_ts and multivariable ts types on fit step
- _stack_multi_variable(all_features, all_target, all_idx, features, target, idx)[source]
Horizontally stack tables, since multiple variables extend the features for training
- Parameters
all_features (array) – array with all features, to which new ones are added
all_target (array) – array with all target (does not change)
all_idx (array) – array with all indices (does not change)
features (array) – array with new features for adding
target (array) – array with new target for adding
idx (Union[list, array]) – array with new idx for adding
- Returns
table
- _stack_multi_ts(all_features, all_target, all_idx, features, target, idx)[source]
Vertically stack tables, since multi_ts data extends the training set as a combination of train and target
- Parameters
all_features (array) – array with all features, to which new ones are added
all_target (array) – array with all target
all_idx (array) – array with all indices
features (array) – array with new features for adding
target (array) – array with new target for adding
idx (Union[list, array]) – array with new idx for adding
- Returns
table
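The difference between the two stacking modes can be sketched as follows. Lists of rows stand in for arrays, and both function names are hypothetical. Horizontal stacking adds columns and keeps the number of rows, which matches the multi-variable case where target and indices do not change. Vertical stacking adds rows, which matches the multi_ts case where the training set grows.

```python
def stack_horizontally(all_features, new_features):
    """Append the columns of new_features to each row of all_features."""
    if len(all_features) != len(new_features):
        raise ValueError("both tables must have the same number of rows")
    return [old + new for old, new in zip(all_features, new_features)]


def stack_vertically(all_rows, new_rows):
    """Append the rows of new_rows below all_rows."""
    return all_rows + new_rows


first = [[1, 2], [2, 3]]
second = [[10, 20], [20, 30]]

wide = stack_horizontally(first, second)   # more columns, same rows
tall = stack_vertically(first, second)     # more rows, same columns
print(wide)
print(tall)
```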
- _apply_transformation_for_predict(input_data)[source]
Apply lagged transformation for every column (time series) in the dataset
- Parameters
input_data (InputData) –
- class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.SparseLaggedTransformationImplementation(params)[source]
Bases: LaggedImplementation
Implementation of sparse lagged transformation for time series forecasting
- Parameters
params (Optional[OperationParameters]) –
- class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.LaggedTransformationImplementation(params)[source]
Bases: LaggedImplementation
Implementation of lagged transformation for time series forecasting
- Parameters
params (Optional[OperationParameters]) –
- class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.TsSmoothingImplementation(params)[source]
Bases: DataOperationImplementation
- Parameters
params (Optional[OperationParameters]) –
- fit(input_data)[source]
Class doesn’t support fit operation
- Parameters
input_data (InputData) – data with features, target and ids to process
- class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.ExogDataTransformationImplementation(params)[source]
Bases: DataOperationImplementation
- Parameters
params (Optional[OperationParameters]) –
- fit(input_data)[source]
Class doesn’t support fit operation
- Parameters
input_data (InputData) – data with features, target and ids to process
- transform(input_data)[source]
Method for representing time series as column
- Parameters
input_data (InputData) – data with features, target and ids to process
- Returns
output data with features as columns
- Return type
- class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.GaussianFilterImplementation(params)[source]
Bases: DataOperationImplementation
- Parameters
params (Optional[OperationParameters]) –
- fit(input_data)[source]
Class doesn’t support fit operation
- Parameters
input_data (InputData) – data with features, target and ids to process
- class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.NumericalDerivativeFilterImplementation(params)[source]
Bases: DataOperationImplementation
- Parameters
params (OperationParameters) –
- fit(input_data)[source]
Class doesn’t support fit operation
- Parameters
input_data (InputData) – data with features, target and ids to process
- class fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.CutImplementation(params)[source]
Bases: DataOperationImplementation
- Parameters
params (Optional[OperationParameters]) –
- fit(input_data)[source]
Class doesn’t support fit operation
- Parameters
input_data (InputData) – data with features, target and ids to process
- transform(input_data)[source]
Cut first cut_part from time series
new_len = len - int(self.cut_part * (input_values.shape[0]-horizon))
- Parameters
input_data (InputData) – data with features, target and ids to process
- Returns
output data with the cut time series
- Return type
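The length formula above can be checked with a small worked example. This is a sketch that assumes the cut removes elements from the start of the series, as the description "cut first cut_part" suggests. The function name cut_first_part is hypothetical.

```python
def cut_first_part(series, cut_part, horizon=0):
    """Drop the first int(cut_part * (len(series) - horizon)) elements."""
    cut_len = int(cut_part * (len(series) - horizon))
    return series[cut_len:]


series = list(range(10))          # len = 10
shortened = cut_first_part(series, cut_part=0.5, horizon=2)
# cut_len = int(0.5 * (10 - 2)) = 4, so new_len = 10 - 4 = 6
print(len(shortened), shortened)
```

Subtracting the horizon before applying cut_part means the last horizon elements never count towards the part that is removed.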
- fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.ts_to_table(idx, time_series, window_size, is_lag=False)[source]
Method converts a time series to lagged form.
- Parameters
idx – the indices of the time series to convert
time_series (array) – source time series
window_size (int) – size of sliding window, which defines lag
is_lag (bool) – whether the function is used for the lagged transformation. False is needed to convert one-dimensional output to lagged form.
- Returns
updated_idx -> clipped indices of time series
features_columns -> lagged time series feature table
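The lagged form is a sliding-window table: each row holds window_size consecutive values of the series. A minimal sketch in plain Python follows. The function name sliding_windows is hypothetical, and the clipping of indices is omitted because its exact convention is not described here.

```python
def sliding_windows(series, window_size):
    """Return a table whose rows are consecutive windows of the series."""
    if not 1 <= window_size <= len(series):
        raise ValueError("window_size must be between 1 and len(series)")
    n_rows = len(series) - window_size + 1
    return [series[start:start + window_size] for start in range(n_rows)]


table = sliding_windows([1, 2, 3, 4, 5], window_size=3)
for row in table:
    print(row)
```

A series of length n yields n - window_size + 1 rows, which is why the indices have to be clipped to match the table.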
- fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations._sparse_matrix(logger, features_columns, n_components_perc=0.5, use_svd=False)[source]
Method converts the matrix to sparse form
- Parameters
features_columns (array) – matrix to sparse
n_components_perc – initial approximation of percent of components to keep
use_svd – whether to use the SVD method to make the matrix sparse, or the naive method
- Returns
reduced dimension matrix
Notes
The shape of the returned matrix depends on the number of components, which takes into account the threshold of explained variance gain
- fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations._get_svd(features_columns, n_components)[source]
Method converts the matrix to svd sparse form
- Parameters
features_columns (array) – matrix to sparse
n_components (int) – number of components to keep
- Returns
transformed sparse matrix
- fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.prepare_target(all_idx, idx, features_columns, target, forecast_length)[source]
Method converts a time series to lagged form. The transformation is applied only to generate the target table (the time series is treated as a multi-target regression task)
- Parameters
all_idx – all indices in data
idx – remaining indices after lagged feature table generation
features_columns (array) – lagged feature table
target – source time series
forecast_length (int) – forecast length
- Returns
updated_idx, updated_features, updated_target
More information:
updated_idx -> clipped indices of time series
updated_features -> clipped lagged feature table
updated_target -> lagged target table
- fedot.core.operations.evaluation.operation_implementations.data_operations.ts_transformations.transform_features_and_target_into_lagged(input_data, forecast_length, window_size)[source]
Perform the lagged transformation first on the features and then on the target array
- Parameters
input_data (InputData) – dataclass with features
forecast_length (int) – forecast horizon
window_size (int) – window size for features transformation
- Returns
new_idx, transformed_cols, new_target
More information:
new_idx ->
transformed_cols ->
new_target ->
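To show how a feature window and a forecast horizon fit together, here is a minimal sketch that builds both tables from one series. It assumes the common layout in which each feature row of window_size values is paired with the next forecast_length values as a multi-target row. The function name lagged_features_and_target is hypothetical, and index handling is left out.

```python
def lagged_features_and_target(series, window_size, forecast_length):
    """Build a lagged feature table and the matching multi-step target table."""
    n_rows = len(series) - window_size - forecast_length + 1
    if n_rows < 1:
        raise ValueError("series is too short for this window and horizon")
    features = [series[i:i + window_size] for i in range(n_rows)]
    target = [
        series[i + window_size:i + window_size + forecast_length]
        for i in range(n_rows)
    ]
    return features, target


features, target = lagged_features_and_target(
    [1, 2, 3, 4, 5, 6], window_size=3, forecast_length=2
)
for x_row, y_row in zip(features, target):
    print(x_row, "->", y_row)
```

Rows near the end of the series are dropped from the feature table because no complete target of forecast_length values exists for them.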