This function computes various (instance and dataset level) model explanations and produces a customisable dashboard, which consists of multiple panels for plots with their short descriptions. Easily save the dashboard and share it with others. Tools for Explanatory Model Analysis unite with tools for Exploratory Data Analysis to give a broad overview of the model behavior.
The extensive documentation covers:
Function parameters description - perks and features
Framework and model compatibility - R & Python examples
Theoretical introduction to the plots - Explanatory Model Analysis: Explore, Explain and Examine Predictive Models
The displayed variable can be changed by clicking on the bars of the plots or with the first dropdown list,
and the observation can be changed with the second dropdown list.
The dashboard gathers useful, but not sensitive, information about how it is being used (e.g. computation length,
package version, dashboard dimensions). This is for development purposes only and can be blocked
by setting telemetry to FALSE.
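As a minimal sketch of blocking telemetry, assuming the DALEX and modelStudio packages are installed: the model and explainer below are illustrative only (any explainer would do), and the smaller N and B values are chosen just to shorten the computation.

```r
library("DALEX")
library("modelStudio")

# illustrative model and explainer; 'titanic_imputed' ships with DALEX
model <- glm(survived ~ ., data = titanic_imputed, family = "binomial")
explainer <- explain(model,
                     data = titanic_imputed,
                     y = titanic_imputed$survived,
                     verbose = FALSE)

# telemetry = FALSE blocks the usage information described above
ms <- modelStudio(explainer,
                  N = 200, B = 5,      # smaller values, faster computation
                  show_info = FALSE,   # no progress output on the console
                  telemetry = FALSE)
```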
modelStudio(explainer, ...)

# S3 method for explainer
modelStudio(
  explainer,
  new_observation = NULL,
  new_observation_y = NULL,
  facet_dim = c(2, 2),
  time = 500,
  max_features = 10,
  N = 300,
  B = 10,
  eda = TRUE,
  show_info = TRUE,
  parallel = FALSE,
  options = ms_options(),
  viewer = "external",
  widget_id = NULL,
  telemetry = TRUE,
  max_vars = NULL,
  ...
)
explainer  An explainer object created with DALEX::explain().
...  Other parameters. 
new_observation  New observations with columns that correspond to variables used in the model. 
new_observation_y  True label for new_observation.
facet_dim  Dimensions of the grid. Default is c(2, 2).
time  Animation length in milliseconds. Default is 500.
max_features  Maximum number of features to be included in the Break Down (BD) and Shapley Values (SV) plots. Default is 10.
N  Number of observations used for the calculation of PD and AD. Default is 300.
B  Number of permutation rounds used for the calculation of SV and FI. Default is 10.
eda  Compute EDA plots and the Residuals vs Feature plot, which adds the data to the dashboard. Default is TRUE.
show_info  Print progress to the console. Default is TRUE.
parallel  Speed up the computation using parallel processing. Default is FALSE.
options  Customization options for the dashboard. Default is ms_options().
viewer  Default is "external".
widget_id  Use an explicit element ID for the widget (rather than an automatically generated one).
Useful e.g. when using 
telemetry  The dashboard gathers useful, but not sensitive, information about how it is being used (e.g. computation length,
package version, dashboard dimensions). This is for development purposes only and can be blocked by setting telemetry to FALSE. Default is TRUE.
max_vars  An alias for max_features. Default is NULL.
An object of the r2d3, htmlwidget, modelStudio class.
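Because the returned object is an r2d3 htmlwidget, one way to save the dashboard for sharing is r2d3::save_d3_html(). The sketch below assumes DALEX, modelStudio and r2d3 are installed and that pandoc is available (writing a self-contained file relies on it). The model, the explainer and the output path are illustrative only.

```r
library("DALEX")
library("modelStudio")

# illustrative model and explainer; 'titanic_imputed' ships with DALEX
model <- glm(survived ~ ., data = titanic_imputed, family = "binomial")
explainer <- explain(model,
                     data = titanic_imputed,
                     y = titanic_imputed$survived,
                     verbose = FALSE)

ms <- modelStudio(explainer, N = 200, B = 5, show_info = FALSE)

# hypothetical output location; any writable path would do
out_file <- file.path(tempdir(), "modelStudio.html")

# write the dashboard to a standalone HTML file that can be shared
r2d3::save_d3_html(ms, file = out_file)
```

The resulting HTML file can be opened in a web browser without an R session.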
The input object is implemented in DALEX
Feature Importance, Ceteris Paribus, Partial Dependence and Accumulated Dependence explanations are implemented in ingredients
Break Down and Shapley Values explanations are implemented in iBreakDown
library("DALEX")
library("modelStudio")

#:# ex1 classification on 'titanic' data

# fit a model
model_titanic <- glm(survived ~ ., data = titanic_imputed, family = "binomial")

# create an explainer for the model
explainer_titanic <- explain(model_titanic,
                             data = titanic_imputed,
                             y = titanic_imputed$survived,
                             label = "Titanic GLM")
#> Preparation of a new explainer is initiated
#>   -> model label       : Titanic GLM
#>   -> data              : 2207 rows 8 cols
#>   -> target variable   : 2207 values
#>   -> predict function  : yhat.glm will be used ( default )
#>   -> predicted values  : numerical, min = 0.008128381 , mean = 0.3221568 , max = 0.9731431
#>   -> model_info        : package stats , ver. 4.0.2 , task classification ( default )
#>   -> residual function : difference between y and yhat ( default )
#>   -> residuals         : numerical, min = -0.9628583 , mean = 2.569729e-10 , max = 0.9663346
#> A new explainer has been created!

# pick observations
new_observations <- titanic_imputed[1:2, ]
rownames(new_observations) <- c("Lucas", "James")

# make a studio for the model
modelStudio(explainer_titanic, new_observations,
            N = 200, B = 5) # faster example

# \donttest{

#:# ex2 regression on 'apartments' data
library("ranger")

model_apartments <- ranger(m2.price ~ ., data = apartments)

explainer_apartments <- explain(model_apartments,
                                data = apartments,
                                y = apartments$m2.price)
#> Preparation of a new explainer is initiated
#>   -> model label       : ranger ( default )
#>   -> data              : 1000 rows 6 cols
#>   -> target variable   : 1000 values
#>   -> predict function  : yhat.ranger will be used ( default )
#>   -> predicted values  : numerical, min = 1852.093 , mean = 3489.113 , max = 6141.653
#>   -> model_info        : package ranger , ver. 0.12.1 , task regression ( default )
#>   -> residual function : difference between y and yhat ( default )
#>   -> residuals         : numerical, min = -399.2476 , mean = 2.094116 , max = 587.1217
#> A new explainer has been created!

new_apartments <- apartments[1:2, ]
rownames(new_apartments) <- c("ap1", "ap2")

# change dashboard dimensions and animation length
modelStudio(explainer_apartments, new_apartments,
            facet_dim = c(2, 3), time = 800)

# add information about true labels
modelStudio(explainer_apartments, new_apartments,
            new_observation_y = new_apartments$m2.price)

# don't compute EDA plots
modelStudio(explainer_apartments, eda = FALSE)

#:# ex3 xgboost model on 'HR' dataset
library("xgboost")

HR_matrix <- model.matrix(status == "fired" ~ . - 1, HR)

# fit a model
xgb_matrix <- xgb.DMatrix(HR_matrix, label = HR$status == "fired")
params <- list(max_depth = 3, objective = "binary:logistic", eval_metric = "auc")
model_HR <- xgb.train(params, xgb_matrix, nrounds = 300)

# create an explainer for the model
explainer_HR <- explain(model_HR,
                        data = HR_matrix,
                        y = HR$status == "fired",
                        label = "xgboost")
#> Preparation of a new explainer is initiated
#>   -> model label       : xgboost
#>   -> data              : 7847 rows 6 cols
#>   -> target variable   : 7847 values
#>   -> predict function  : yhat.default will be used ( default )
#>   -> predicted values  : numerical, min = 1.158531e-06 , mean = 0.36388 , max = 0.9997931
#>   -> model_info        : package Model of class: xgb.Booster package unrecognized , ver. Unknown , task regression ( default )
#>   -> model_info        : Model info detected regression task but 'y' is a logical . ( WARNING )
#>   -> model_info        : By deafult regressions tasks supports only numercical 'y' parameter.
#>   -> model_info        : Consider changing to numerical vector.
#>   -> model_info        : Otherwise I will not be able to calculate residuals or loss function.
#>   -> residual function : difference between y and yhat ( default )
#>   -> residuals         : numerical, min = -0.9830738 , mean = 4.673149e-05 , max = 0.9703595
#> A new explainer has been created!

# pick observations
new_observation <- HR_matrix[1:2, , drop = FALSE]
rownames(new_observation) <- c("id1", "id2")

# make a studio for the model
modelStudio(explainer_HR, new_observation)
#> Warning: Coercing LHS to a list

# }