Hello everyone! In this new series I will talk about and test some libraries, code and blogs about R and their application to Machine Learning, Deep Learning and Data Science. You can read the Python version here.

1. paletteer – Collection of most color palettes in a single R package

Data visualization is crucial in Data Science. It is how we explain our findings to the business; it also helps us understand the data we are analyzing and condense weeks of work into a single picture.
R is a great language for visualization. The goal of this great package paletteer is to be a comprehensive collection of color palettes in R using a common interface. Think of it as the "caret of palettes".
The package is not yet on CRAN, but if you want the development version you can install it directly from GitHub:
# install.packages("devtools")
devtools::install_github("EmilHvitfeldt/paletteer")
Palettes
The palettes are divided into two groups: discrete and continuous. For discrete palettes you can choose between fixed width palettes and dynamic palettes. The more common of the two are the fixed width palettes, which have a set number of colors that does not change when the number of colors requested varies, like the following palettes:

On the other hand, we have the dynamic palettes, where the colors of the palette depend on the number of colors you need, like the green.pal palette from the cartography package:

Lastly, we have the continuous palettes, which provide as many colors as you need for a smooth transition of color:
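To make the difference between a fixed width palette and a continuous one concrete, here is a minimal sketch in base R. It does not use paletteer at all: the hex colors are only illustrative, and colorRampPalette from grDevices stands in for a continuous palette.

```r
# Fixed width (discrete) palette: a set number of hand-picked colors.
# These hex values are illustrative only.
fixed_palette <- c("#BF616A", "#D08770", "#EBCB8B", "#A3BE8C", "#B48EAD")

# Continuous palette: interpolate between a few anchor colors,
# so you can ask for any number of colors.
continuous_palette <- grDevices::colorRampPalette(c("#440154", "#21908C", "#FDE725"))

length(fixed_palette)            # the size never changes
colors_3  <- continuous_palette(3)
colors_12 <- continuous_palette(12)

# Show the 12 interpolated colors as a strip
image(matrix(seq_along(colors_12)), col = colors_12, axes = FALSE)
```

Asking the continuous palette for more colors simply gives a finer gradient between the same anchors, while the fixed palette always has the same five entries.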

This package includes 958 palettes from 28 different packages. Information about them can be found in the following data.frames: palettes_c_names, palettes_d_names and palettes_dynamic_names.
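A quick way to explore those data.frames might look like the sketch below. It assumes paletteer is installed and that palettes_d_names has a package column, which may differ between versions of the package.

```r
library(paletteer)

# Each data.frame lists the available palettes of one kind
head(palettes_d_names)        # discrete, fixed width
head(palettes_c_names)        # continuous
head(palettes_dynamic_names)  # dynamic

# Count discrete palettes per source package
# (assumes a column named `package`)
palettes_per_package <- sort(table(palettes_d_names$package), decreasing = TRUE)
head(palettes_per_package)
```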
The package also includes scales for ggplot2 using the same standard interface:
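library(paletteer)
library(ggplot2)

# Older development versions take package and palette as separate arguments;
# newer paletteer releases expect a single string, e.g. "nord::aurora".
ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point() +
  scale_color_paletteer_d(nord, aurora)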

Very easy and useful. Remember to visit the GitHub repo (EmilHvitfeldt/paletteer) and star it ;).
2. DALEX – Descriptive Machine Learning EXplanations

Explaining machine learning models is not always easy, yet so important for some business applications. There are great libraries that help us in this task, for example:
BTW, sometimes a simple visualization with ggplot can help you explain a model. For more on this check this awesome article by Matthew Mayo:
In many applications we need to know, understand or prove how input variables are used in the model and what impact they have on the final model prediction. DALEX is a set of tools that helps to understand how complex models work.
To install from CRAN just run:
install.packages("DALEX")
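As a first taste, here is a minimal sketch of the basic workflow: fit a model, wrap it in an explainer, and ask for a performance summary. It assumes the apartments example dataset that ships with DALEX; the object names are just illustrative.

```r
library(DALEX)

# apartments is an example dataset bundled with DALEX;
# m2.price is the price per square metre
apartments_lm <- lm(m2.price ~ construction.year + surface + floor + no.rooms + district,
                    data = apartments)

# Wrap the fitted model together with the data and the observed response
explainer_lm <- explain(apartments_lm,
                        data = apartments,
                        y = apartments$m2.price,
                        label = "lm")

# Performance summary for the wrapped model, plus its default plot
performance_lm <- model_performance(explainer_lm)
plot(performance_lm)
```

The explainer is the common object that the rest of the DALEX functions work on, so the same pattern should carry over to models from other ML packages.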
They have amazing documentation on how to use DALEX with different ML packages:
- How to use DALEX with caret
- How to use DALEX with mlr
- How to use DALEX with H2O
- How to use DALEX with xgboost package
- How to use DALEX for teaching. Part 1
- How to use DALEX for teaching. Part 2
- breakDown vs lime vs shapleyR
Great cheatsheets:


An interactive notebook where you can learn more about the package:
And finally a book style documentation where they talk about DALEX, machine learning and explainability:
Check it out in the original repository and remember to star it :).
3. modelDown – Generate a website with HTML summaries for predictive models

modelDown generates a website with HTML summaries for predictive models. It uses DALEX (see above) explainers to compute and plot summaries of how given models behave. We can see exactly how scores for predictions were calculated (Prediction BreakDown), how much each variable contributes to predictions (Variable Response), which variables are the most important for a given model (Variable Importance) and how well our models behave (Model Performance).
You can install it right now from GitHub:
devtools::install_github("MI2DataLab/modelDown")
Once the package is successfully installed, you need to create DALEX explainers for your models. Here is a simple example (from the authors):
# assuming you have two models: glm_model and ranger_model for HR_data
explainer_glm <- DALEX::explain(glm_model, data=HR_data, y=HR_data$left)
explainer_ranger <- DALEX::explain(ranger_model, data=HR_data, y=HR_data$left)
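The snippet above assumes glm_model and ranger_model already exist. One way they might be fit is sketched below, assuming HR_data is the dataset from the breakDown package (with a 0/1 column called left) and that the ranger package is installed. The number of trees is an arbitrary choice.

```r
library(ranger)

# Assumption: HR_data comes from the breakDown package
HR_data <- breakDown::HR_data

# Logistic regression predicting whether an employee left
glm_model <- glm(left ~ ., data = HR_data, family = "binomial")

# Random forest that returns class probabilities
ranger_model <- ranger(as.factor(left) ~ ., data = HR_data,
                       num.trees = 500, probability = TRUE)
```

Depending on your DALEX version, you may also need to pass a custom predict_function to DALEX::explain for the ranger model, so that the explainer receives a numeric vector of predictions.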
Next, just pass all the created explainers to the modelDown function. For example:
modelDown::modelDown(explainer_ranger, explainer_glm)
That's it! Now you should have your HTML page generated with the default options.
You’ll have pages like:
Index page

The index page presents basic information about the data provided in the explainers. You can also see the types of all explainers given as parameters. Additionally, summary statistics are available for numerical variables. For categorical variables, tables with frequencies of factor levels are presented.
Model Performance

This module shows the result of the model_performance function.
Variable Importance

The output of the variable_importance function is presented as a plot as well as a table.
And much more. Here is a live example of a page generated with the package:
https://mi2datalab.github.io/modelDown_example/
And that's it for today :). Soon you'll get more information, and posts with R too.
Thanks for reading this. I hope you found something interesting here 🙂
If you have questions just follow me on Twitter and LinkedIn.
See you there 🙂