 What is Unsupervised Machine Learning?

Unsupervised machine learning is used when we have unlabelled data but need to find the hidden patterns in the given dataset, group the data points based on similarities, patterns or differences, and represent the dataset in a compressed format. The model tries to find hidden patterns and insights on its own, without being trained on known answers, as our dataset contains only input data and no output data. For this reason we also cannot directly apply unsupervised machine learning to regression or classification problems.

Fig: - Unsupervised Machine Learning


Steps involved in Unsupervised Machine Learning: -

Fig: - Steps involved in Unsupervised Machine Learning


Types of Unsupervised Machine Learning: -

1. Clustering: - grouping items so that those with the most similarities fall into the same group and have little or no similarity to the items in other groups. Clustering can be of type Hierarchical (Agglomerative), Exclusive (Partitioning), Overlapping or Probabilistic.

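Some of the most common algorithms mentioned alongside clustering are: -

    a. Hierarchical Clustering

    b. K-Means Clustering

    c. Principal Component Analysis (strictly a dimensionality reduction technique, not a clustering algorithm)

    d. Singular value decomposition (also dimensionality reduction)

    e. Independent Component Analysis (also dimensionality reduction)

    f. KNN (Nearest neighbor) (strictly a supervised algorithm, often confused with K-Means)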
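
To make the idea of clustering concrete, here is a minimal sketch of K-Means in plain Python. The data points and the function name `kmeans` are made up for illustration, and the starting centroids are passed in by hand only to keep the run reproducible (real implementations usually pick them at random or with a smarter scheme). Treat it as a rough illustration of the assign-then-update loop, not a production implementation.

```python
def kmeans(points, centroids, iterations=10):
    """Group 2-D points around the given starting centroids (Lloyd's algorithm)."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, clusters

# Made-up, unlabelled data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]

# Assumption for this sketch: start from one point in each region.
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(clusters)
```

Notice that no labels are passed in anywhere: the groups come purely from the distances between the points.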


2. Association: - allows us to establish associations among the data objects in a large database, such as people who purchase item X also tending to buy item Y (grocery/market basket recommendations), movie recommendations, web usage mining, etc. Associations are evaluated using measures like Support, Confidence and Lift.

Most common examples of association rule algorithms are: -

    a. Apriori algorithm

    b. Eclat algorithm

    c. FP-Growth algorithm
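
The Support, Confidence and Lift measures mentioned above can be worked out by hand on a small example. The sketch below uses a made-up list of shopping baskets and hypothetical helper names (`support`, `confidence`, `lift`); it only computes the three measures for one rule and is not an implementation of Apriori, Eclat or FP-Growth.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, x, y):
    """Among the transactions containing X, the share that also contain Y."""
    return support(transactions, set(x) | set(y)) / support(transactions, x)

def lift(transactions, x, y):
    """Confidence of X -> Y compared with how common Y is overall."""
    return confidence(transactions, x, y) / support(transactions, y)

# Made-up market baskets.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
    {"bread", "butter", "jam"},
]

rule_support = support(transactions, {"bread", "butter"})          # 3 of 5 baskets
rule_confidence = confidence(transactions, {"bread"}, {"butter"})  # about 0.75
rule_lift = lift(transactions, {"bread"}, {"butter"})              # about 1.25
print(rule_support, rule_confidence, rule_lift)
```

A lift above 1 suggests that buying bread makes butter more likely than it is on average, which is the kind of rule a market basket recommendation would use.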


Advantages and Disadvantages of Unsupervised Machine Learning are: -



You can connect with me on -


Linkedin - https://www.linkedin.com/in/harish-singh-166b63118


Twitter - @harisshh_singh


Gmail - hs02863@gmail.com


End notes: -

Hope this was useful for beginners in the field of Data Science. 

See you guys next time.

What is Supervised machine learning?

Supervised machine learning is when a model is trained on a labelled dataset, meaning a dataset in which each input is already tagged with its correct output, so that the model can predict the correct output when new data is passed to it.

In a supervised ML model both the training and validation datasets are labelled; the training data teaches the machine to make correct predictions. The dataset is commonly split in an 80:20 ratio: 80% of the data (inputs together with their outputs) is used to train the model. To test the model, we pass it the inputs from the remaining 20%, which it has never seen, and compare its predicted outputs with the actual outputs we already have in order to measure the accuracy and precision of the model.
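
The 80:20 workflow described above can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the data is generated on the spot, `train_test_split` and `accuracy` are hypothetical helpers written for this sketch (not a library API), and the "model" is a deliberately trivial threshold rule.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle labelled rows and hold out a fraction of them for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]

def accuracy(predicted, actual):
    """Share of predictions that match the known outputs."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Made-up labelled data: (input, output) pairs, output is 1 when input >= 50.
data = [(x, int(x >= 50)) for x in range(100)]
train, test = train_test_split(data)  # 80 rows for training, 20 for testing

# "Training": learn one threshold, the midpoint between the largest input
# labelled 0 and the smallest input labelled 1 in the training rows.
largest_zero = max(x for x, y in train if y == 0)
smallest_one = min(x for x, y in train if y == 1)
threshold = (largest_zero + smallest_one) / 2

# "Testing": predict on inputs the model has never seen, then compare
# with the outputs we already have.
predicted = [int(x >= threshold) for x, _ in test]
actual = [y for _, y in test]
test_accuracy = accuracy(predicted, actual)
print(len(train), len(test), test_accuracy)
```

The important part is that the test rows play no role in choosing the threshold, so the accuracy reflects how the model behaves on unseen data.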

The goal of a supervised machine learning model is to find a mapping function that maps the input variable (x) to the output variable (y).


Fig: - Supervised Machine Learning

Steps involved in a Supervised Machine Learning: -

Fig: - Steps involved in a Supervised Machine Learning

Types of Supervised machine learning: -

1. Regression: - It is mostly used for prediction when the output variable takes continuous values, for example forecasting a future value. A regression algorithm uses the relationship between the input variables and the output variable to predict a value as close as possible to the actual output. The smaller the error, the more accurate our regression model is.

Some of the most common regression techniques are: -

    a. Linear Regression

    b. Regression Trees

    c. Non-Linear Regression

                 i. KNN 
                 ii. Support vector regression 
                 iii. CART 
                 iv. Random forests 
                 v. Gradient boosting algorithm         
                 vi. Extreme gradient boosting algorithm 
                 vii. Light gradient boosting algorithm 
                 viii. CatBoost 
                 ix. Neural networks.

    d. Bayesian Linear Regression

    e. Polynomial Regression
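
As a small illustration of regression, the sketch below fits a straight line to a handful of made-up points using ordinary least squares and then measures the error. The helper names `fit_line` and `mean_squared_error` and the numbers are assumptions for this example only; it covers simple linear regression, not the other techniques in the list.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mean_squared_error(actual, predicted):
    """Average squared gap between actual and predicted outputs."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Made-up continuous data that roughly follows y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)  # about 1.99 and 0.05
predictions = [slope * x + intercept for x in xs]
mse = mean_squared_error(ys, predictions)
print(slope, intercept, mse)
```

The lower the mean squared error, the closer the predicted values are to the actual outputs.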


2. Classification: - Mostly used when the output variable has categorical/discrete values, which means there are two or more classes.
Ex. of binary classification - 0/1, Y/N, Male/Female, True/False, etc.
Ex. of multiclass classification - Gmail filtering (social, updates, spam, primary, promotions).

Some of the most common Classification algorithms are: -

    a. Logistic Regression

    b. Random Forest

    c. Decision Trees

    d. Naive Bayes - Gaussian NB and Multinomial NB

    e. Support vector machines
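
For a feel of binary classification, here is a minimal sketch of logistic regression with one input feature, trained by plain gradient descent. The data (hours studied vs. pass/fail), the function names and the learning rate and epoch values are all assumptions chosen for illustration; a real project would normally rely on a tested library instead.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def predict(x, w, b):
    """Return class 1 when the predicted probability is at least 0.5."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0

# Made-up labelled data: hours studied -> failed (0) or passed (1).
hours = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(hours, passed)
print(predict(1.0, w, b), predict(4.0, w, b))
```

Unlike regression, the output here is a class (0 or 1) rather than a continuous number.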


Advantages and Disadvantages of Supervised Machine Learning methods: -

