The happiness_train and happiness_test datasets contain real data from the World Happiness Reports. Each country's happiness score is explained by factors such as economic production and social support. happiness_train combines the data from the years 2015-2018, while happiness_test holds the data from 2019, so evaluating on it imitates out-of-time validation.

data(happiness_train); data(happiness_test)


happiness_train: a data frame with 625 rows and 7 columns; happiness_test: a data frame with 156 rows and 7 columns


Source: World Happiness Report at

The columns GDP per Capita, Social Support, Life Expectancy, Freedom, Generosity, and Corruption describe the extent to which each factor contributes to the happiness evaluation of each country. Variables:

  • score - target variable, continuous value between 0 and 10 (regression)

  • gdp_per_capita

  • social_support

  • healthy_life_expectancy

  • freedom_life_choices

  • generosity

  • perceptions_of_corruption
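
As a minimal sketch of the out-of-time validation mentioned in the description, a model can be fitted on the 2015-2018 data and scored on the 2019 data. The helper out_of_time_rmse below is hypothetical and not part of the package; the choice of a plain linear model and of RMSE as the metric are assumptions made only for illustration.

```r
# Hypothetical helper (an illustration, not part of the package): fit a
# linear model on the training years and report RMSE on the held-out year.
out_of_time_rmse <- function(train, test, target = "score") {
  # Regress the target on all remaining columns of the training data.
  model <- lm(reformulate(".", response = target), data = train)
  # Predict the held-out year and compare with the observed values.
  predicted <- predict(model, newdata = test)
  sqrt(mean((test[[target]] - predicted)^2))
}

# Runs only when both data frames are available in the session,
# e.g. after data(happiness_train); data(happiness_test).
if (exists("happiness_train") && exists("happiness_test")) {
  print(out_of_time_rmse(happiness_train, happiness_test))
}
```

Because the test set comes from a later year than the training set, the resulting error reflects how well the model carries over to new data in time, rather than to a random hold-out of the same years.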