This function computes various (instance and dataset level) model explanations and produces a customisable dashboard, which consists of multiple panels for plots with their short descriptions. Easily save the dashboard and share it with others. Tools for Explanatory Model Analysis unite with tools for Exploratory Data Analysis to give a broad overview of the model behavior.

The extensive documentation covers:

- Function parameters description - **perks and features**
- Framework and model compatibility - **R & Python examples**
- Theoretical introduction to the plots - Explanatory Model Analysis: Explore, Explain, and Examine Predictive Models

The displayed variable can be changed by clicking on the bars of the plots or with the first dropdown list, and the observation can be changed with the second dropdown list.

The dashboard gathers useful, but not sensitive, information about how it is being used (e.g. computation length, package version, dashboard dimensions). This is for development purposes only and can be blocked by setting `telemetry` to `FALSE`.
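A minimal sketch of switching telemetry off, assuming the DALEX and modelStudio packages are installed. The model, the object names (`fit`, `expl`, `ms`) and the reduced `N` and `B` values are illustrative choices, not recommendations.

```r
# Build the dashboard only when both packages are available.
ms <- NULL
if (requireNamespace("DALEX", quietly = TRUE) &&
    requireNamespace("modelStudio", quietly = TRUE)) {
  # Illustrative model on the titanic_imputed data shipped with DALEX
  fit <- glm(survived ~ ., data = DALEX::titanic_imputed, family = "binomial")
  expl <- DALEX::explain(fit,
                         data = DALEX::titanic_imputed,
                         y = DALEX::titanic_imputed$survived,
                         verbose = FALSE)
  # telemetry = FALSE blocks the usage statistics described above;
  # small N and B only keep the sketch fast
  ms <- modelStudio::modelStudio(expl,
                                 N = 100, B = 2,
                                 telemetry = FALSE,
                                 show_info = FALSE)
}
```

Printing `ms` in an interactive session opens the dashboard.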

```
modelStudio(explainer, ...)

# S3 method for explainer
modelStudio(
  explainer,
  new_observation = NULL,
  new_observation_y = NULL,
  new_observation_n = 3,
  facet_dim = c(2, 2),
  time = 500,
  max_features = 10,
  max_features_fi = NULL,
  N = 300,
  N_fi = N * 10,
  N_sv = N * 3,
  B = 10,
  B_fi = B,
  eda = TRUE,
  show_info = TRUE,
  parallel = FALSE,
  options = ms_options(),
  viewer = "external",
  widget_id = NULL,
  license = NULL,
  telemetry = TRUE,
  max_vars = NULL,
  verbose = NULL,
  ...
)
```
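Some defaults in the signature above are derived from other arguments (`N_fi = N * 10`, `N_sv = N * 3`, `B_fi = B`). R evaluates default arguments lazily, so changing `N` or `B` also changes the derived values unless they are set explicitly. The helper `derived_sizes()` below is hypothetical: it only mirrors those defaults to show the effect and is not part of modelStudio.

```r
# Hypothetical helper that copies the relevant defaults from the signature
derived_sizes <- function(N = 300, N_fi = N * 10, N_sv = N * 3,
                          B = 10, B_fi = B) {
  c(N = N, N_fi = N_fi, N_sv = N_sv, B = B, B_fi = B_fi)
}

defaults <- derived_sizes()                    # N_fi = 3000, N_sv = 900, B_fi = 10
faster   <- derived_sizes(N = 200, B = 5)      # N_fi = 2000, N_sv = 600, B_fi = 5
pinned   <- derived_sizes(N = 200, N_fi = 500) # explicit N_fi wins over N * 10
```

Lowering `N` and `B` therefore also shrinks the sample sizes and permutation rounds used for the derived quantities, unless they are pinned as in the last call.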

- `explainer`: An `explainer` created with `DALEX::explain()`.
- `...`: Other parameters.
- `new_observation`: New observations with columns that correspond to variables used in the model.
- `new_observation_y`: True label for `new_observation` (optional).
- `new_observation_n`: Number of observations to be taken from `explainer$data` if `new_observation = NULL`. See the **vignette**.
- `facet_dim`: Dimensions of the grid. Default is `c(2,2)`.
- `time`: Animation length in ms. Default is `500`.
- `max_features`: Maximum number of features to be included in Break Down (BD), Shapley Values (SV), and Feature Importance (FI) plots. Default is `10`.
- `max_features_fi`: Maximum number of features to be included in the FI plot. Default is `max_features`.
- `N`: Number of observations used for the calculation of Partial Dependence (PD) and Accumulated Dependence (AD). Default is `300`. See the **vignette**.
- `N_fi`: Number of observations used for the calculation of FI. Default is `10*N`.
- `N_sv`: Number of observations used for the calculation of SV. Default is `3*N`.
- `B`: Number of permutation rounds used for the calculation of SV. Default is `10`. See the **vignette**.
- `B_fi`: Number of permutation rounds used for the calculation of FI. Default is `B`.
- `eda`: Compute EDA plots and the Residuals vs Feature plot, which adds the data to the dashboard. Default is `TRUE`.
- `show_info`: Print progress to the console. Default is `TRUE`.
- `parallel`: Speed up the computation using `parallelMap::parallelMap()`. See the **vignette**. This might interfere with showing progress using `show_info`.
- `options`: Customize `modelStudio`. See `ms_options` and the **vignette**.
- `viewer`: Default is `external`, which displays the dashboard in an external RStudio window. Use `browser` to display in an external browser or `internal` to use the RStudio internal viewer pane for output.
- `widget_id`: Use an explicit element ID for the widget (rather than an automatically generated one). Useful e.g. when using `modelStudio` with Shiny. See the **vignette**.
- `license`: Path to the file containing the license (the `con` parameter passed to `readLines()`). It can be used e.g. to include the license for `explainer$data` as a comment in the source of the `.html` output file.
- `telemetry`: The dashboard gathers useful, but not sensitive, information about how it is being used (e.g. computation length, package version, dashboard dimensions). This is for development purposes only and can be blocked by setting `telemetry` to `FALSE`.
- `max_vars`: An alias for `max_features`. If provided, it will override the value.
- `verbose`: An alias for `show_info`. If provided, it will override the value.
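As a sketch of the `license` argument: the path is handed to `readLines()`, so any plain-text file works. The license text and the temporary file location below are made up for illustration, and the modelStudio call only runs if DALEX and modelStudio are installed.

```r
# Write an illustrative license file (the text is a placeholder)
license_text <- c("Example data license (illustrative text only)",
                  "Free to use for demonstration purposes.")
license_path <- tempfile(fileext = ".txt")
writeLines(license_text, license_path)

ms_licensed <- NULL
if (requireNamespace("DALEX", quietly = TRUE) &&
    requireNamespace("modelStudio", quietly = TRUE)) {
  # Illustrative linear model on the apartments data shipped with DALEX
  fit <- lm(m2.price ~ ., data = DALEX::apartments)
  expl <- DALEX::explain(fit,
                         data = DALEX::apartments,
                         y = DALEX::apartments$m2.price,
                         verbose = FALSE)
  # Pass the path, not the text; small N and B only keep the sketch fast
  ms_licensed <- modelStudio::modelStudio(expl,
                                          license = license_path,
                                          N = 100, B = 2,
                                          telemetry = FALSE,
                                          show_info = FALSE)
}
```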

Returns an object of class `r2d3`, `htmlwidget`, `modelStudio`.
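Because the returned object is an `r2d3` htmlwidget, one way to save the dashboard for sharing is `r2d3::save_d3_html()`. This is a minimal sketch, assuming DALEX, modelStudio and r2d3 are installed; the file name is arbitrary. The default `selfcontained = TRUE` produces a single file but typically requires pandoc, so the sketch uses `selfcontained = FALSE`, which writes the supporting files to a directory next to the `.html` file.

```r
out_file <- file.path(tempdir(), "modelStudio-demo.html")
saved <- FALSE

if (requireNamespace("DALEX", quietly = TRUE) &&
    requireNamespace("modelStudio", quietly = TRUE) &&
    requireNamespace("r2d3", quietly = TRUE)) {
  # Illustrative model on the titanic_imputed data shipped with DALEX
  fit <- glm(survived ~ ., data = DALEX::titanic_imputed, family = "binomial")
  expl <- DALEX::explain(fit,
                         data = DALEX::titanic_imputed,
                         y = DALEX::titanic_imputed$survived,
                         verbose = FALSE)
  ms <- modelStudio::modelStudio(expl,
                                 N = 100, B = 2,
                                 telemetry = FALSE,
                                 show_info = FALSE)
  # Write the dashboard to disk without opening a viewer
  r2d3::save_d3_html(ms, file = out_file, selfcontained = FALSE)
  saved <- file.exists(out_file)
}
```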

- The input object is implemented in **DALEX**
- Feature Importance, Ceteris Paribus, Partial Dependence and Accumulated Dependence explanations are implemented in **ingredients**
- Break Down and Shapley Values explanations are implemented in **iBreakDown**

```
library("DALEX")
#> Welcome to DALEX (version: 2.4.1).
#> Find examples and detailed introduction at: http://ema.drwhy.ai/
#> Additional features will be available after installation of: ggpubr.
#> Use 'install_dependencies()' to get all suggested dependencies
library("modelStudio")
#:# ex1 classification on 'titanic' data
# fit a model
model_titanic <- glm(survived ~., data = titanic_imputed, family = "binomial")
# create an explainer for the model
explainer_titanic <- explain(model_titanic,
                             data = titanic_imputed,
                             y = titanic_imputed$survived,
                             label = "Titanic GLM")
#> Preparation of a new explainer is initiated
#> -> model label : Titanic GLM
#> -> data : 2207 rows 8 cols
#> -> target variable : 2207 values
#> -> predict function : yhat.glm will be used ( default )
#> -> predicted values : No value for predict function target column. ( default )
#> -> model_info : package stats , ver. 4.2.0 , task classification ( default )
#> -> predicted values : numerical, min = 0.008128381 , mean = 0.3221568 , max = 0.9731431
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -0.9628583 , mean = -2.569729e-10 , max = 0.9663346
#> A new explainer has been created!
# pick observations
new_observations <- titanic_imputed[1:2,]
rownames(new_observations) <- c("Lucas","James")
# make a studio for the model
modelStudio(explainer_titanic,
            new_observations,
            N = 200, B = 5) # faster example
#:# ex2 regression on 'apartments' data
if (requireNamespace("ranger", quietly = TRUE)) {
  library("ranger")
  model_apartments <- ranger(m2.price ~ ., data = apartments)
  explainer_apartments <- explain(model_apartments,
                                  data = apartments,
                                  y = apartments$m2.price)
  new_apartments <- apartments[1:2, ]
  rownames(new_apartments) <- c("ap1", "ap2")
  # change dashboard dimensions and animation length
  modelStudio(explainer_apartments,
              new_apartments,
              facet_dim = c(2, 3),
              time = 800)
  # add information about true labels
  modelStudio(explainer_apartments,
              new_apartments,
              new_observation_y = new_apartments$m2.price)
  # don't compute EDA plots
  modelStudio(explainer_apartments,
              eda = FALSE)
}
#> Preparation of a new explainer is initiated
#> -> model label : ranger ( default )
#> -> data : 1000 rows 6 cols
#> -> target variable : 1000 values
#> -> predict function : yhat.ranger will be used ( default )
#> -> predicted values : No value for predict function target column. ( default )
#> -> model_info : package ranger , ver. 0.13.1 , task regression ( default )
#> -> predicted values : numerical, min = 1847.292 , mean = 3489.451 , max = 6139.825
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -411.5567 , mean = -2.431569 , max = 617.052
#> A new explainer has been created!
#> `new_observation` argument is NULL. `new_observation_n` observations needed to calculate local explanations are taken from the data.
#:# ex3 xgboost model on 'HR' dataset
if (requireNamespace("xgboost", quietly = TRUE)) {
  library("xgboost")
  HR_matrix <- model.matrix(status == "fired" ~ . -1, HR)
  # fit a model
  xgb_matrix <- xgb.DMatrix(HR_matrix, label = HR$status == "fired")
  params <- list(max_depth = 3, objective = "binary:logistic", eval_metric = "auc")
  model_HR <- xgb.train(params, xgb_matrix, nrounds = 300)
  # create an explainer for the model
  explainer_HR <- explain(model_HR,
                          data = HR_matrix,
                          y = HR$status == "fired",
                          type = "classification",
                          label = "xgboost")
  # pick observations
  new_observation <- HR_matrix[1:2, , drop = FALSE]
  rownames(new_observation) <- c("id1", "id2")
  # make a studio for the model
  modelStudio(explainer_HR,
              new_observation)
}
#> Preparation of a new explainer is initiated
#> -> model label : xgboost
#> -> data : 7847 rows 6 cols
#> -> target variable : 7847 values
#> -> predict function : yhat.default will be used ( default )
#> -> predicted values : No value for predict function target column. ( default )
#> -> model_info : package Model of class: xgb.Booster package unrecognized , ver. Unknown , task regression ( default )
#> -> model_info : type set to classification
#> -> model_info : Model info detected classification task but 'y' is a logical . Converted to numeric. ( NOTE )
#> -> predicted values : numerical, min = 1.158531e-06 , mean = 0.36388 , max = 0.9997931
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -0.9830738 , mean = -4.673149e-05 , max = 0.9703595
#> A new explainer has been created!
```