This function computes various (instance and dataset level) model explanations and produces a customisable dashboard, which consists of multiple panels for plots with their short descriptions. Easily save the dashboard and share it with others. Tools for Explanatory Model Analysis unite with tools for Exploratory Data Analysis to give a broad overview of the model behavior.

Extensive documentation is available in the package vignettes.

The displayed variable can be changed by clicking on the bars of the plots or with the first dropdown list, and the observation can be changed with the second dropdown list. The dashboard gathers useful, but not sensitive, information about how it is being used (e.g. computation length, package version, dashboard dimensions). This is for development purposes only and can be blocked by setting telemetry = FALSE.
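A minimal sketch of switching telemetry off, assuming DALEX and modelStudio are installed. The model and explainer reuse the Titanic data from the Examples section; the small N and B values are chosen here only to keep the run short.

```r
# Sketch: build a dashboard with usage telemetry disabled.
# The guard lets the snippet run (as a no-op) when the packages are missing.
if (requireNamespace("DALEX", quietly = TRUE) &&
    requireNamespace("modelStudio", quietly = TRUE)) {

  model <- glm(survived ~ ., data = DALEX::titanic_imputed, family = "binomial")

  explainer <- DALEX::explain(model,
                              data = DALEX::titanic_imputed,
                              y = DALEX::titanic_imputed$survived,
                              label = "Titanic GLM",
                              verbose = FALSE)

  ms <- modelStudio::modelStudio(explainer,
                                 N = 100, B = 5,      # illustrative, keeps it fast
                                 show_info = FALSE,   # no progress output
                                 telemetry = FALSE)   # block usage statistics
}
```

Printing ms (or calling modelStudio() without assignment, as in the Examples) displays the dashboard.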

## Usage

modelStudio(explainer, ...)

# S3 method for explainer
modelStudio(
  explainer,
  new_observation = NULL,
  new_observation_y = NULL,
  new_observation_n = 3,
  facet_dim = c(2, 2),
  time = 500,
  max_features = 10,
  N = 300,
  N_fi = N * 10,
  B = 10,
  B_fi = B,
  eda = TRUE,
  show_info = TRUE,
  parallel = FALSE,
  options = ms_options(),
  viewer = "external",
  widget_id = NULL,
  telemetry = TRUE,
  max_vars = NULL,
  ...
)
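
In the signature, N_fi defaults to N * 10 and B_fi defaults to B. R evaluates default arguments lazily, so these follow whatever N and B the caller passes unless they are set explicitly. The sketch below illustrates this with a hypothetical helper, resolve_sizes(), which is not part of modelStudio and only mirrors the default expressions shown above.

```r
# Hypothetical helper, NOT part of modelStudio: it copies the default
# expressions from the signature to show how R resolves them.
resolve_sizes <- function(N = 300, N_fi = N * 10, B = 10, B_fi = B) {
  c(N = N, N_fi = N_fi, B = B, B_fi = B_fi)
}

resolve_sizes()                     # all defaults from the signature
resolve_sizes(N = 200, B = 5)       # N_fi and B_fi follow N and B
resolve_sizes(N = 200, N_fi = 500)  # an explicit N_fi is kept as given
```

A practical consequence is that lowering N to speed up a run also lowers the sample used for feature importance, unless N_fi is passed separately.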

## Arguments

| Argument | Description |
|---|---|
| explainer | An explainer created with DALEX::explain(). |
| ... | Other parameters. |
| new_observation | New observations with columns that correspond to variables used in the model. |
| new_observation_y | True label for new_observation (optional). |
| new_observation_n | Number of observations to be taken from explainer$data if new_observation = NULL. See the vignette. |
| facet_dim | Dimensions of the grid. Default is c(2, 2). |
| time | Animation length in ms. Default is 500. |
| max_features | Maximum number of features to be included in BD and SV plots. Default is 10. |
| N | Number of observations used for the calculation of PD and AD. Default is 300. See the vignette. |
| N_fi | Number of observations used for the calculation of FI. Default is 10 * N. |
| B | Number of permutation rounds used for the calculation of SV. Default is 10. See the vignette. |
| B_fi | Number of permutation rounds used for the calculation of FI. Default is B. |
| eda | Compute EDA plots and the Residuals vs Feature plot, which adds the data to the dashboard. Default is TRUE. |
| show_info | Print progress to the console. Default is TRUE. |
| parallel | Speed up the computation using parallelMap::parallelMap(). See the vignette. This might interfere with showing progress using show_info. |
| options | Customize modelStudio. See ms_options and the vignette. |
| viewer | Default is "external" to display in an external RStudio window. Use "browser" to display in an external browser or "internal" to use the RStudio internal viewer pane for output. |
| widget_id | Use an explicit element ID for the widget (rather than an automatically generated one). Useful e.g. when using modelStudio with Shiny. See the vignette. |
| license | Path to the file containing the license (con parameter passed to readLines()). It can be used e.g. to include the license for explainer$data as a comment in the source of the .html output file. |
| telemetry | The dashboard gathers useful, but not sensitive, information about how it is being used (e.g. computation length, package version, dashboard dimensions). This is for development purposes only and can be blocked by setting telemetry = FALSE. |
| max_vars | An alias for max_features. If provided, it will override the value. |
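
The following sketch combines several of the arguments above: a custom options list, a smaller feature limit, and the browser viewer. It assumes DALEX and modelStudio are installed. The option names show_subtitle and bd_subtitle are assumptions taken from the package vignette as recalled here, so check ?ms_options for the names supported by the installed version.

```r
# Sketch: customise the dashboard through `options` and related arguments.
if (requireNamespace("DALEX", quietly = TRUE) &&
    requireNamespace("modelStudio", quietly = TRUE)) {

  model <- glm(survived ~ ., data = DALEX::titanic_imputed, family = "binomial")
  explainer <- DALEX::explain(model,
                              data = DALEX::titanic_imputed,
                              y = DALEX::titanic_imputed$survived,
                              label = "Titanic GLM",
                              verbose = FALSE)

  # Option names below are assumptions; see ?ms_options for the full list.
  custom_options <- modelStudio::ms_options(
    show_subtitle = TRUE,
    bd_subtitle = "Hello World"
  )

  ms <- modelStudio::modelStudio(explainer,
                                 new_observation_n = 2,  # sample 2 rows from explainer$data
                                 max_features = 5,       # fewer bars in BD and SV plots
                                 N = 100, B = 5,         # illustrative, keeps it fast
                                 options = custom_options,
                                 viewer = "browser",     # open in an external browser
                                 show_info = FALSE)
}
```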

## Value

An object of the r2d3, htmlwidget, modelStudio class.
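
Because the returned object is an r2d3 htmlwidget, it can be checked with inherits() and written to a standalone HTML file for sharing. The sketch below assumes DALEX, modelStudio and r2d3 are installed, and uses r2d3::save_d3_html() with a temporary file path chosen here for illustration. Self-contained output relies on pandoc being available, as with other htmlwidgets.

```r
# Sketch: inspect the returned object and save it as an HTML file.
if (requireNamespace("DALEX", quietly = TRUE) &&
    requireNamespace("modelStudio", quietly = TRUE) &&
    requireNamespace("r2d3", quietly = TRUE)) {

  model <- glm(survived ~ ., data = DALEX::titanic_imputed, family = "binomial")
  explainer <- DALEX::explain(model,
                              data = DALEX::titanic_imputed,
                              y = DALEX::titanic_imputed$survived,
                              label = "Titanic GLM",
                              verbose = FALSE)

  ms <- modelStudio::modelStudio(explainer, N = 100, B = 5, show_info = FALSE)

  # The object carries all three classes listed above.
  stopifnot(inherits(ms, "r2d3"),
            inherits(ms, "htmlwidget"),
            inherits(ms, "modelStudio"))

  # Illustrative output path; replace with any writable location.
  out_file <- file.path(tempdir(), "dashboard.html")
  r2d3::save_d3_html(ms, file = out_file)
}
```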

## References

• The input object is implemented in DALEX

• Feature Importance, Ceteris Paribus, Partial Dependence and Accumulated Dependence explanations are implemented in ingredients

• Break Down and Shapley Values explanations are implemented in iBreakDown

## Examples

library("DALEX")
#> Welcome to DALEX (version: 2.2.0).
#> Find examples and detailed introduction at: http://ema.drwhy.ai/
#> Additional features will be available after installation of: ggpubr.
#> Use 'install_dependencies()' to get all suggested dependencies
library("modelStudio")

#:# ex1 classification on 'titanic' data

# fit a model
model_titanic <- glm(survived ~., data = titanic_imputed, family = "binomial")

# create an explainer for the model
explainer_titanic <- explain(model_titanic,
                             data = titanic_imputed,
                             y = titanic_imputed$survived,
                             label = "Titanic GLM")
#> Preparation of a new explainer is initiated
#>   -> model label       :  Titanic GLM
#>   -> data              :  2207  rows  8  cols
#>   -> target variable   :  2207  values
#>   -> predict function  :  yhat.glm will be used (  default  )
#>   -> predicted values  :  No value for predict function target column. (  default  )
#>   -> model_info        :  package stats , ver. 4.0.5 , task classification (  default  )
#>   -> predicted values  :  numerical, min =  0.008128381 , mean =  0.3221568 , max =  0.9731431
#>   -> residual function :  difference between y and yhat (  default  )
#>   -> residuals         :  numerical, min =  -0.9628583 , mean =  -2.569729e-10 , max =  0.9663346
#>   A new explainer has been created!

# pick observations
new_observations <- titanic_imputed[1:2,]
rownames(new_observations) <- c("Lucas","James")

# make a studio for the model
modelStudio(explainer_titanic,
            new_observations,
            N = 200, B = 5) # faster example

# \donttest{

#:# ex2 regression on 'apartments' data
if (requireNamespace("ranger", quietly=TRUE)) {
  library("ranger")
  model_apartments <- ranger(m2.price ~. ,data = apartments)
  explainer_apartments <- explain(model_apartments,
                                  data = apartments,
                                  y = apartments$m2.price)

  new_apartments <- apartments[1:2,]
  rownames(new_apartments) <- c("ap1","ap2")

  # change dashboard dimensions and animation length
  modelStudio(explainer_apartments,
              new_apartments,
              facet_dim = c(2, 3),
              time = 800)

  # add information about true labels
  modelStudio(explainer_apartments,
              new_apartments,
              new_observation_y = new_apartments$m2.price)

  # don't compute EDA plots
  modelStudio(explainer_apartments,
              eda = FALSE)
}
#> Preparation of a new explainer is initiated
#>   -> model label       :  ranger (  default  )
#>   -> data              :  1000  rows  6  cols
#>   -> target variable   :  1000  values
#>   -> predict function  :  yhat.ranger will be used (  default  )
#>   -> predicted values  :  No value for predict function target column. (  default  )
#>   -> model_info        :  package ranger , ver. 0.12.1 , task regression (  default  )
#>   -> predicted values  :  numerical, min =  1852.093 , mean =  3489.113 , max =  6141.653
#>   -> residual function :  difference between y and yhat (  default  )
#>   -> residuals         :  numerical, min =  -399.2476 , mean =  -2.094116 , max =  587.1217
#>   A new explainer has been created!
#> new_observation argument is NULL. new_observation_n observations needed to calculate local explanations are taken from the data.

#:# ex3 xgboost model on 'HR' dataset
if (requireNamespace("xgboost", quietly=TRUE)) {
  library("xgboost")

  HR_matrix <- model.matrix(status == "fired" ~ . -1, HR)

  # fit a model
  xgb_matrix <- xgb.DMatrix(HR_matrix, label = HR$status == "fired")
  params <- list(max_depth = 3, objective = "binary:logistic", eval_metric = "auc")
  model_HR <- xgb.train(params, xgb_matrix, nrounds = 300)

  # create an explainer for the model
  explainer_HR <- explain(model_HR,
                          data = HR_matrix,
                          y = HR$status == "fired",
                          label = "xgboost")

  # pick observations
  new_observation <- HR_matrix[1:2, , drop=FALSE]
  rownames(new_observation) <- c("id1", "id2")

  # make a studio for the model
  modelStudio(explainer_HR,
              new_observation)
}
#> Preparation of a new explainer is initiated
#>   -> model label       :  xgboost
#>   -> data              :  7847  rows  6  cols
#>   -> target variable   :  7847  values
#>   -> predict function  :  yhat.default will be used (  default  )
#>   -> predicted values  :  No value for predict function target column. (  default  )
#>   -> model_info        :  package Model of class: xgb.Booster package unrecognized , ver. Unknown , task regression (  default  )
#>   -> model_info        :  Model info detected regression task but 'y' is a logical .  (  WARNING  )
#>   -> model_info        :  By deafult regressions tasks supports only numercical 'y' parameter.
#>   -> model_info        :  Consider changing to numerical vector.
#>   -> model_info        :  Otherwise I will not be able to calculate residuals or loss function.
#>   -> predicted values  :  numerical, min =  1.158531e-06 , mean =  0.36388 , max =  0.9997931
#>   -> residual function :  difference between y and yhat (  default  )
#>   -> residuals         :  numerical, min =  -0.9830738 , mean =  -4.673149e-05 , max =  0.9703595
#>   A new explainer has been created!
# }