Before we understand what the Ensemble technique in Machine Learning is, let us first understand a few challenges associated with building an efficient and accurate Machine Learning model:
1. Bias - If our model makes strong simplifying assumptions or is skewed towards some data points, it cannot capture the true relationship in the data. This is a type of error in which the model's weights do not properly represent the underlying pattern, leading to skewed, less accurate results and more analytical errors.
"The higher the Bias, the less accurate our model will be."
2. Variance - The difference between a model's accuracy on the training data and its accuracy on the test data is called 'Variance'. A model with high variance follows the training data too closely, so it performs well on the training set but poorly on unseen test data, which is the Overfitting scenario.
"The higher the Variance, the less accurate our model will be."
3. Overfitting - When we train our model on a lot of data, there is a chance that it learns from the noise and inaccurate data points in our dataset. The model then fails to categorize new data correctly because it has memorized too much noise and detail.
"Overfitting occurs when High Variance and Low Bias are present in a model."
4. Underfitting - In this situation our model fails to identify the underlying trend at all, destroying the accuracy of our Machine Learning model. This usually happens when we have not trained our model with sufficient data points, or when the model is too simple for the data, like trying to fit a Linear model to non-linear data points.
"Underfitting occurs when Low Variance and High Bias are present in a model."
To deal with these problems we need a model with "Low Variance and Low Bias", which is considered an ideal good-fit model for making better predictions and achieving the best insights from our dataset. The point where both of these factors, Variance and Bias, are low is called the "sweet spot between a simple model and a complex model", and we can find it using Regularization, Bagging, Boosting and Stacking.
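The bias-variance trade-off above can be seen in a small experiment. The sketch below (a hypothetical setup using NumPy, with a made-up sine-shaped dataset) fits polynomials of increasing degree: a too-simple model underfits (high train and test error), a too-complex one overfits (tiny train error, larger test error), and a middle degree sits near the sweet spot.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical noisy samples from a non-linear relationship.
x = np.linspace(0, 1, 30)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

# A fresh draw of noise at the same points acts as "unseen" test data.
y_test = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

def mse(degree):
    # Fit a polynomial of the given degree to the training data,
    # then measure mean squared error on train and test targets.
    coeffs = np.polyfit(x, y, degree)
    train_err = np.mean((np.polyval(coeffs, x) - y) ** 2)
    test_err = np.mean((np.polyval(coeffs, x) - y_test) ** 2)
    return train_err, test_err

for d in (1, 3, 15):
    tr, te = mse(d)
    print(f"degree {d:2d}: train MSE {tr:.3f}, test MSE {te:.3f}")
```

Degree 1 shows high bias, degree 15 shows high variance, and degree 3 is close to the sweet spot between the two.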
Ensemble Machine Learning is a technique of combining the predictions of multiple Machine Learning models (also known as classifiers) trained on the same dataset to achieve better accuracy. It is one of the most efficient ways of building a Machine Learning model.
1. Strong Classifiers: models whose predictions perform really well on the given regression or classification task.
2. Weak Classifiers: models whose predictions are only slightly better than random chance. We can use a single weak learner or combine several weak learners.
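The idea that combined weak classifiers can beat a single one is easy to demonstrate. The sketch below (a made-up simulation, assuming the weak classifiers make independent mistakes) gives each weak classifier only 60% accuracy on a binary label and takes a majority vote over 25 of them:

```python
import random

random.seed(42)

# Hypothetical weak classifier: labels a binary example correctly
# with probability 0.6 -- only slightly better than a coin flip.
def weak_prediction(truth, accuracy=0.6):
    return truth if random.random() < accuracy else 1 - truth

# Ensemble: 25 independent weak classifiers vote; majority wins.
def majority_vote(truth, n_classifiers=25):
    votes = sum(weak_prediction(truth) for _ in range(n_classifiers))
    return 1 if votes > n_classifiers / 2 else 0

trials = 2000
truths = [random.randint(0, 1) for _ in range(trials)]

single_acc = sum(weak_prediction(t) == t for t in truths) / trials
ensemble_acc = sum(majority_vote(t) == t for t in truths) / trials

print(f"single weak classifier : {single_acc:.2f}")
print(f"25-classifier ensemble : {ensemble_acc:.2f}")
```

The independence assumption is doing the real work here: in practice ensemble methods like Bagging deliberately train each learner on different data samples so their errors are less correlated.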
We can divide Ensemble learning techniques into Simple and Advanced Ensemble learning techniques. Let's start with a closely related concept:
Regularization:
Let's begin by understanding the Regularization techniques used to improve the accuracy of the model and to control overfitting scenarios (basically controlling high Variance). Though regularization does not directly improve the model's performance on the training data, it has the advantage of improving the model's generalization to new and unseen data.
The three main Regularization techniques are: -
1. L2 penalty/L2 Norm - Ridge Regression method.
2. L1 penalty/L1 Norm - Lasso Regression.
3. Dropout.
We can use Ridge and Lasso for any algorithm involving weighted parameters, including Neural Networks, whereas Dropout is used primarily for Neural Networks such as ANNs, CNNs, DNNs or RNNs to moderate the learning.
Fig: - Ridge Regression Formulation
Ridge Regression penalizes the sum of squared coefficients. Here we add the Ridge Regression penalty (Lambda * slope^2) to the Least Squares Line (Regression Line) cost, so that the Ridge Regression Line still fits most of the data points while keeping the slope small. We tune the value of 'Lambda' from 0 upwards through positive values using the Cross Validation method.
If Lambda = 0, then the Ridge Regression Line is the same as the ordinary Regression Line. The larger the Lambda value, the less steep the slope of our fitted line becomes.
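The shrinking effect of Lambda can be shown with the closed-form ridge solution. The sketch below (a minimal NumPy example on a made-up 1-D dataset with a true slope of about 2; by the usual convention the intercept is left unpenalized) fits the same data with increasing Lambda:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 1-D dataset: y = 2x + 1 plus noise.
x = rng.uniform(0, 5, size=40)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=x.size)

X = np.column_stack([np.ones_like(x), x])  # intercept column + slope column

def ridge_fit(X, y, lam):
    # Closed-form ridge solution: (X^T X + lambda * P)^-1 X^T y,
    # where P zeroes out the penalty on the intercept term.
    penalty = lam * np.diag([0.0, 1.0])
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

for lam in (0.0, 10.0, 100.0):
    intercept, slope = ridge_fit(X, y, lam)
    print(f"Lambda = {lam:6.1f} -> slope {slope:.3f}")
```

With Lambda = 0 this reduces to ordinary least squares; as Lambda grows, the slope is pulled towards zero, exactly the "less steep" behaviour described above.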
Now let us discuss some of the advanced Ensemble Machine Learning Techniques: -
References:
For more references on Ensemble Learning you can visit - https://machinelearningmastery.com/ensemble-machine-learning-with-python-7-day-mini-course/
Also do visit - https://towardsdatascience.com/ensemble-methods-bagging-boosting-and-stacking-c9214a10a205 for further references.
You can connect with me on -
Linkedin - https://www.linkedin.com/in/harish-singh-166b63118
Twitter - @harisshh_singh
Gmail - hs02863@gmail.com
End notes:
Hope this was useful for beginners in the field of Data Science.
See you guys until next time.



















