FEDOT API
- class fedot.api.main.Fedot(problem, timeout=5.0, task_params=None, seed=None, logging_level=40, safe_mode=False, n_jobs=-1, **composer_tuner_params)
Bases: object
The main class for FEDOT AutoML API.
Alternatively, it may be initialized using the FedotBuilder class, where all the optional AutoML parameters are documented and separated by meaning.
- Parameters
problem (str) – name of the modelling problem to solve. Possible options:
classification -> for classification task
regression -> for regression task
ts_forecasting -> for time series forecasting task
timeout (Optional[float]) – time for model design (in minutes): None or -1 means infinite time.
task_params (TaskParams) – additional parameters of the task.
seed (Optional[int]) – value for a fixed random seed.
logging_level (int) – logging levels are the same as in the built-in logging library. Possible options:
50 -> critical
40 -> error
30 -> warning
20 -> info
10 -> debug
0 -> notset
safe_mode (bool) – if set to True, large datasets are cut to prevent memory overflow, and a label encoder is used instead of a one-hot encoder when the total cardinality of categorical features is high. Defaults to False.
n_jobs (int) – number of jobs for parallelization (set to -1 to use all CPUs). Defaults to -1.
composer_tuner_params – additional optional parameters. See their documentation at the methods of FedotBuilder.
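The constructor arguments listed above can be collected as in the following minimal sketch. The helper names `make_fedot_kwargs` and `build_model`, as well as the seed value, are illustrative assumptions and not part of FEDOT. The library is imported lazily so that the helper itself runs without FEDOT installed.

```python
import logging

# Problem names documented for the `problem` parameter.
SUPPORTED_PROBLEMS = ("classification", "regression", "ts_forecasting")


def make_fedot_kwargs(problem: str, timeout_minutes: float = 5.0) -> dict:
    """Collect constructor arguments for Fedot (hypothetical helper)."""
    if problem not in SUPPORTED_PROBLEMS:
        raise ValueError(f"unsupported problem: {problem!r}")
    return {
        "problem": problem,
        "timeout": timeout_minutes,      # minutes; None or -1 means no limit
        "seed": 42,                      # illustrative fixed random seed
        "logging_level": logging.ERROR,  # numerically 40, the documented default
        "safe_mode": False,
        "n_jobs": -1,                    # use all CPUs
    }


def build_model(problem: str = "classification"):
    """Create a Fedot instance; requires the fedot package to be installed."""
    from fedot.api.main import Fedot  # imported lazily on purpose
    return Fedot(**make_fedot_kwargs(problem))
```

Passing `logging.ERROR` instead of the literal `40` keeps the intent readable, since the levels match the built-in logging library.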
- fit(features, target='target', predefined_model=None)
Composes and fits a new pipeline, or fits a predefined one.
- Parameters
features (Union[str, os.PathLike, numpy.ndarray, pandas.core.frame.DataFrame, InputData, fedot.core.data.multi_modal.MultiModalData, dict, tuple]) – train data feature values in one of the supported features formats.
target (Union[str, os.PathLike, numpy.ndarray, pandas.core.series.Series, dict]) – train data target values in one of the supported target formats.
predefined_model (Optional[Union[str, fedot.core.pipelines.pipeline.Pipeline]]) – the name of a single model, a Pipeline instance, or auto. With any value specified, the method does not perform composing and tuning. In case of auto, the method generates a single initial assumption and then fits the created pipeline.
- Returns
Pipeline object.
- Return type
fedot.core.pipelines.pipeline.Pipeline
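The two ways of calling fit described above might look like the sketch below. The wrapper function names are hypothetical; `model` is assumed to be a Fedot instance, and only the documented fit arguments are used.

```python
def fit_with_composing(model, features, target="target"):
    """Run the full AutoML search (predefined_model is left as None)."""
    return model.fit(features=features, target=target)


def fit_without_composing(model, features, target, predefined_model="auto"):
    """Fit a predefined model, skipping composing and tuning.

    `predefined_model` may be the name of a single model, a Pipeline
    instance, or "auto" (a single initial assumption is generated).
    """
    return model.fit(features=features, target=target,
                     predefined_model=predefined_model)
```

Both wrappers return whatever fit returns, which according to the reference is the fitted Pipeline object.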
- tune(input_data=None, metric_name=None, iterations=100000, timeout=None, cv_folds=None, n_jobs=None, show_progress=False)
Tunes the hyperparameters of the current pipeline.
- Parameters
input_data (Optional[InputData]) – data for tuning pipeline.
metric_name (Optional[Union[str, QualityMetricCallable, ComplexityMetricCallable]]) – name of metric for quality tuning.
iterations (int) – number of tuning iterations.
timeout (Optional[float]) – time for tuning (in minutes). If None or -1, tuning continues until the maximum number of iterations is reached.
cv_folds (Optional[int]) – number of folds on data for cross-validation.
n_jobs (Optional[int]) – number of jobs for parallelization (-1 to use all CPUs).
show_progress (bool) – shows progress of tuning if True.
- Returns
Pipeline object.
- Return type
fedot.core.pipelines.pipeline.Pipeline
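A bounded tuning run could be requested as in this sketch. The function name `tune_with_budget` and the default values are assumptions chosen for illustration; `model` is assumed to be a Fedot instance that has already been fitted.

```python
def tune_with_budget(model, timeout_minutes=1.0, iterations=50, cv_folds=3):
    """Tune the current pipeline under both a time and an iteration limit."""
    return model.tune(
        timeout=timeout_minutes,  # minutes; None or -1 removes the time limit
        iterations=iterations,    # upper bound on tuning iterations
        cv_folds=cv_folds,        # folds used for cross-validation
        show_progress=True,
    )
```

With the time limit removed, tuning stops only when the iteration limit is reached, so it is worth lowering `iterations` from its large default in that case.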
- predict(features, save_predictions=False, in_sample=True, validation_blocks=None)
Predicts new target using already fitted model.
For time series, performs a forecast with depth forecast_length if in_sample=False. If in_sample=True, performs an in-sample forecast using features as sample.
- Parameters
features (Union[str, os.PathLike, numpy.ndarray, pandas.core.frame.DataFrame, InputData, fedot.core.data.multi_modal.MultiModalData, dict, tuple]) – an array with features of test data.
save_predictions (bool) – if True, saves predictions as a csv file in the working directory.
in_sample (bool) – used for time series prediction. If in_sample=True, performs an in-sample forecast using features, with the number of iterations specified in validation_blocks.
validation_blocks (Optional[int]) – number of validation blocks for in-sample forecast.
- Returns
An array with prediction values.
- Return type
numpy.ndarray
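For time series, the two prediction modes described above can be compared with a sketch like the following. The function name is hypothetical, and `model` is assumed to be a Fedot instance fitted with problem set to ts_forecasting.

```python
def compare_forecast_modes(model, features, validation_blocks=2):
    """Return (out_of_sample, in_sample) predictions from a fitted model."""
    # in_sample=False: a forecast of depth forecast_length.
    out_of_sample = model.predict(features=features, in_sample=False)
    # in_sample=True: in-sample forecast over the given validation blocks.
    in_sample = model.predict(features=features, in_sample=True,
                              validation_blocks=validation_blocks)
    return out_of_sample, in_sample
```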
- predict_proba(features, save_predictions=False, probs_for_all_classes=False)
Predicts the probability of new target using an already fitted classification model.
- Parameters
features (Union[str, os.PathLike, numpy.ndarray, pandas.core.frame.DataFrame, InputData, fedot.core.data.multi_modal.MultiModalData, dict, tuple]) – an array with features of test data.
save_predictions (bool) – if True, saves predictions as a .csv file in the working directory.
probs_for_all_classes (bool) – if True, returns the probability for each class even for binary classification.
- Returns
An array with prediction values.
- Return type
numpy.ndarray
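As a rough illustration of probs_for_all_classes, the sketch below requests one probability per class and picks the most probable column in each row. The function name is hypothetical. The returned indices are column positions in the probability array; how those positions map to class labels is not stated here and is left as an assumption.

```python
def most_probable_columns(model, features):
    """Return, for each sample, the index of the highest-probability column."""
    probs = model.predict_proba(features=features, probs_for_all_classes=True)
    # Each row holds one probability per class, so a plain argmax suffices.
    return [max(range(len(row)), key=lambda i: row[i]) for row in probs]
```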
- forecast(pre_history=None, horizon=None, save_predictions=False)
Forecasts the new values of time series. If horizon is bigger than the forecast length of the fitted model, an out-of-sample forecast is applied (not supported for multi-modal data).
- Parameters
pre_history (Optional[Union[str, Tuple[numpy.ndarray, numpy.ndarray], InputData, dict]]) – an array with features for pre-history of the forecast.
horizon (Optional[int]) – number of steps to forecast.
save_predictions (bool) – if True, saves predictions as a csv file in the working directory.
- Returns
An array with prediction values.
- Return type
numpy.ndarray
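The relationship between horizon and the fitted forecast length can be sketched as follows. All function names are illustrative. The `TsForecastingParams` class and its import path are assumptions not stated in this reference, so check them against the installed FEDOT version.

```python
def needs_out_of_sample(horizon: int, forecast_length: int) -> bool:
    """True when the requested horizon exceeds the fitted forecast length."""
    return horizon > forecast_length


def build_ts_model(forecast_length: int):
    """Create a Fedot instance for forecasting; requires fedot installed."""
    from fedot.api.main import Fedot
    # Assumed import path for the time series task parameters.
    from fedot.core.repository.tasks import TsForecastingParams
    return Fedot(problem="ts_forecasting",
                 task_params=TsForecastingParams(forecast_length=forecast_length))


def forecast_ahead(model, horizon: int, pre_history=None):
    """Forecast `horizon` steps ahead with an already fitted model."""
    return model.forecast(pre_history=pre_history, horizon=horizon)
```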
- plot_prediction(in_sample=None, target=None)
Plots prediction obtained from a graph.
- Parameters
in_sample (Optional[bool]) – if current prediction is in_sample (for time-series forecasting), plots predictions as future values.
target (Optional[Any]) – user-specified name of target variable for MultiModalData.
- get_metrics(target=None, metric_names=None, in_sample=None, validation_blocks=None, rounding_order=3)
Gets quality metrics for a fitted graph.
- Parameters
target (Optional[Union[numpy.ndarray, pandas.core.series.Series]]) – an array with target values of test data. If None, target specified for fit is used.
metric_names (Optional[Union[str, List[str]]]) – names of required metrics.
in_sample (Optional[bool]) – used for time series forecasting. If True, prediction will be obtained as .predict(..., in_sample=True).
validation_blocks (Optional[int]) – number of validation blocks for time series in-sample forecast.
rounding_order (int) – number of decimal places for metrics.
- Returns
Values of quality metrics.
- Return type
dict
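Because get_metrics returns a dict, the result is easy to format. The sketch below uses a hypothetical helper, `report_metrics`; the metric names passed to it are up to the caller and must be names that FEDOT recognizes.

```python
def report_metrics(model, metric_names, target=None, rounding_order=3):
    """Fetch metrics from a fitted model and format them one per line."""
    # Accept either a single name or a collection of names.
    names = [metric_names] if isinstance(metric_names, str) else list(metric_names)
    metrics = model.get_metrics(target=target, metric_names=names,
                                rounding_order=rounding_order)
    return "\n".join(f"{name}: {value}" for name, value in sorted(metrics.items()))
```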
- save_predict(predicted_data)
Saves pipeline forecasts in a csv file.
- Parameters
predicted_data (OutputData) –
- explain(features=None, method='surrogate_dt', visualization=True, **kwargs)
Creates an explanation for current_pipeline according to the selected method. An Explainer instance is returned.
- Parameters
features (Optional[Union[str, os.PathLike, numpy.ndarray, pandas.core.frame.DataFrame, InputData, fedot.core.data.multi_modal.MultiModalData, dict, tuple]]) – samples to be explained. If None, train_data from last fit will be used.
method (str) – explanation method, defaults to surrogate_dt.
visualization (bool) – print and plot the explanation simultaneously, defaults to True.
- Return type
fedot.explainability.explainer_template.Explainer
Notes
An explanation can be retrieved later by executing Explainer.visualize().
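The deferred visualization mentioned in the note might be used as in this sketch. The function name `explain_later` is hypothetical, and `model` is assumed to be a fitted Fedot instance.

```python
def explain_later(model, features=None):
    """Build an explanation without printing or plotting it immediately."""
    explainer = model.explain(features=features, method="surrogate_dt",
                              visualization=False)
    # The caller can invoke explainer.visualize() when the plot is needed.
    return explainer
```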
- return_report()
Returns a report on time consumption.
The following steps are presented in this report:
  - ‘Data Definition (fit)’: Time spent on data definition in fit().
  - ‘Data Preprocessing’: Total time spent on preprocessing data, including the fitting and predicting stages.
  - ‘Fitting (summary)’: Total time spent on Composing, Tuning and Train Inference.
  - ‘Composing’: Time spent on searching for the best pipeline.
  - ‘Train Inference’: Time spent on training the pipeline found during composing.
  - ‘Tuning (composing)’: Time spent on hyperparameters tuning in the whole fitting, if with_tune is True.
  - ‘Tuning (after)’: Time spent on .tune() (hyperparameters tuning) after composing.
  - ‘Data Definition (predict)’: Time spent on data definition in predict().
  - ‘Predicting’: Time spent on predicting (inference).
- Return type
pandas.core.frame.DataFrame