Boosting in Machine Learning using XGBoost

XGBoost is short for eXtreme Gradient Boosting. Before diving deep into XGBoost, let us first understand Gradient Boosting and, before that, plain Boosting.

Boosting is a machine learning ensemble meta-algorithm used primarily to reduce bias (and also variance) in supervised learning, and a family of machine learning algorithms that convert weak learners into strong ones. Algorithms that achieve this kind of hypothesis boosting quickly became known simply as “boosting”.

Gradient Boosting is a machine learning technique for regression and classification problems, which produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees.

XGBoost is an algorithm that has recently been dominating applied machine learning for structured or tabular data. It is an implementation of gradient boosted decision trees designed for speed and performance, distributed as an open-source software library.

Understanding XGBoost the simple way:

The use of machine learning continues to grow in industry, and with it the need to understand and explain what machine learning models do. For machine learning classification problems that are not of the deep learning type, XGBoost is one of the most popular choices. XGBoost can be particularly useful in a commercial setting because it scales well to large datasets and supports many languages. For example, it is easy to train models in Python and deploy them in a Java production environment.
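To make this concrete, here is a minimal sketch of the typical workflow with the scikit-learn-style API of the xgboost Python package; the dataset and hyperparameter values are illustrative only.

```python
# Minimal sketch: train an XGBoost classifier and evaluate it on held-out data.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)

print("test accuracy:", model.score(X_test, y_test))
```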

This algorithm goes by lots of different names such as gradient boosting, multiple additive regression trees, stochastic gradient boosting or gradient boosting machines.

Boosting is an ensemble technique where new models are added to correct the errors made by existing models. Models are added sequentially until no further improvements can be made. A popular example is the AdaBoost algorithm, which weights data points that are hard to predict.
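As an illustration of that re-weighting idea, here is a small sketch using scikit-learn's AdaBoostClassifier, whose default weak learner is a depth-1 decision tree; the synthetic dataset is just a placeholder.

```python
# Sketch: AdaBoost re-weights hard-to-predict samples so each new weak learner focuses on them.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=1000, random_state=0)
ada = AdaBoostClassifier(n_estimators=100)  # default weak learner: a decision stump
ada.fit(X, y)
print("training accuracy:", ada.score(X, y))
```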

Gradient Boosting is an approach where new models are created that predict the residuals or errors of prior models and then added together to make the final prediction. It is called gradient boosting because it uses a gradient descent algorithm to minimize the loss when adding new models.
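The following is a bare-bones sketch of that idea for squared-error regression (the function and variable names are made up for illustration): each new tree is fit to the residuals, which are the negative gradients of the squared-error loss, and its predictions are added to the ensemble with a small learning rate.

```python
# Sketch of gradient boosting for squared-error regression.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost(X, y, n_trees=50, learning_rate=0.1):
    base = y.mean()                            # start from a constant model
    prediction = np.full(len(y), base)
    trees = []
    for _ in range(n_trees):
        residuals = y - prediction             # negative gradient of 0.5 * (y - pred)^2
        tree = DecisionTreeRegressor(max_depth=3).fit(X, residuals)
        prediction += learning_rate * tree.predict(X)
        trees.append(tree)
    return base, trees                         # ensemble = constant + sum of scaled trees
```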

Advantages that XGBoost provides:

Regularization:

  • A standard GBM implementation has no regularization, whereas XGBoost adds regularization terms to its objective, which helps reduce overfitting (see the sketch after this list).
  • In fact, XGBoost is also known as a ‘regularized boosting’ technique.
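The regularization strength is exposed directly as parameters in the Python API; the values below are illustrative only.

```python
# Sketch: XGBoost's objective includes explicit penalty terms that a plain GBM typically lacks.
from xgboost import XGBRegressor

model = XGBRegressor(
    reg_lambda=1.0,  # L2 penalty on leaf weights
    reg_alpha=0.1,   # L1 penalty on leaf weights
    gamma=0.5,       # minimum loss reduction required to make a further split
)
```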

Parallel Processing:

  • XGBoost implements parallel processing and is blazingly fast compared to GBM.
  • We know that boosting is a sequential process: each tree can be built only after the previous one, so the trees themselves cannot be built in parallel. Instead, XGBoost parallelizes the construction of each individual tree, for example the split search across features, using all available cores (see the sketch after this list).
  • It also supports implementation on Hadoop.
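In the scikit-learn-style API, the number of threads used for tree construction is controlled with n_jobs; this minimal sketch uses -1, which simply means all available cores.

```python
# Sketch: within-tree work such as split finding is parallelized across CPU cores.
from xgboost import XGBClassifier

model = XGBClassifier(n_jobs=-1)  # use all available cores
```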

High Flexibility

  • XGBoost allows users to define custom optimization objectives and evaluation criteria (a sketch follows below).
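Here is a sketch using the native training API: a custom objective returns the gradient and Hessian of the loss, and a custom metric returns a name/value pair. The exact keyword names (obj and feval here; newer releases use custom_metric) vary across xgboost versions, so treat this as illustrative.

```python
# Sketch: custom squared-error objective and custom MAE metric with the native API.
import numpy as np
import xgboost as xgb

def squared_error_obj(preds, dtrain):
    labels = dtrain.get_label()
    grad = preds - labels          # first derivative of 0.5 * (pred - label)^2
    hess = np.ones_like(preds)     # second derivative
    return grad, hess

def mae_metric(preds, dtrain):
    labels = dtrain.get_label()
    return "mae", float(np.mean(np.abs(preds - labels)))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] + rng.normal(size=200)

dtrain = xgb.DMatrix(X, label=y)
booster = xgb.train({"max_depth": 3}, dtrain, num_boost_round=50,
                    evals=[(dtrain, "train")],
                    obj=squared_error_obj, feval=mae_metric)
```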

Handling Missing Values

  • It has an in-built routine to handle missing values.
  • By default, a NaN entry is treated as missing; the user can also supply a different sentinel value that does not occur among the real observations and pass it as a parameter (see the sketch after this list).
  • At each node, training tries both branches for the missing values it encounters and learns which path to take for missing values in the future.
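A minimal sketch of marking missing values with the native API; here -999.0 is an arbitrary illustrative sentinel (np.nan is the default).

```python
# Sketch: telling XGBoost which value marks a missing entry.
import numpy as np
import xgboost as xgb

X = np.array([[1.0, -999.0],
              [2.0, 3.0],
              [-999.0, 4.0]])
y = np.array([0, 1, 1])

dtrain = xgb.DMatrix(X, label=y, missing=-999.0)
booster = xgb.train({"objective": "binary:logistic", "max_depth": 2},
                    dtrain, num_boost_round=5)
# During training, each split learns a default direction for missing values.
```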

Tree Pruning

  • A GBM stops splitting a node as soon as it encounters a split with negative loss reduction, which makes it a greedy algorithm.
  • XGBoost, on the other hand, makes splits up to the specified maximum depth and then prunes the tree backwards, removing splits beyond which there is no positive gain (see the sketch after this list).
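The relevant knobs in the Python API are max_depth and gamma (also called min_split_loss); the values here are illustrative.

```python
# Sketch: grow trees to max_depth first, then prune splits whose gain is below gamma.
from xgboost import XGBRegressor

model = XGBRegressor(
    max_depth=6,  # grow to this depth before pruning
    gamma=1.0,    # minimum loss reduction a split must achieve to be kept
)
```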

Built-in Cross-Validation

  • XGBoost allows the user to run cross-validation at each iteration of the boosting process, which makes it easy to obtain the optimal number of boosting iterations in a single run (see the sketch after this list).
  • This is unlike GBM, where we have to run a grid search and only a limited number of values can be tested.
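A minimal sketch with xgb.cv on synthetic data; with early_stopping_rounds the returned table is truncated at the best round, so its length gives the number of rounds to use.

```python
# Sketch: built-in cross-validation to find the number of boosting rounds in one run.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = (X[:, 0] + rng.normal(size=500) > 0).astype(int)

dtrain = xgb.DMatrix(X, label=y)
cv_results = xgb.cv(
    params={"max_depth": 4, "objective": "binary:logistic"},
    dtrain=dtrain,
    num_boost_round=500,
    nfold=5,
    metrics="logloss",
    early_stopping_rounds=20,
)
print("best number of boosting rounds:", len(cv_results))
```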

Continue on Existing Model

  • The user can resume training an XGBoost model from the last iteration of a previous run, which can be a significant advantage in certain applications (see the sketch below).
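A sketch of resuming training with the native API's xgb_model argument (the scikit-learn wrapper's fit() accepts a similar argument); the data and round counts are illustrative.

```python
# Sketch: continue boosting on top of an existing model instead of starting over.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
y = (X[:, 0] > 0).astype(int)
dtrain = xgb.DMatrix(X, label=y)

params = {"objective": "binary:logistic", "max_depth": 3}
first = xgb.train(params, dtrain, num_boost_round=50)
# Later: add 50 more rounds starting from the existing booster.
resumed = xgb.train(params, dtrain, num_boost_round=50, xgb_model=first)
```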