If you are a software engineer or a programmer, you have probably used StackOverflow at least once. But have you ever wondered how StackOverflow predicts the tags for a given question? In this blog, I will discuss the StackOverflow tag predictor case study.


  1. Overview of Stack…


  1. Overview of Dataset.
  2. Data Preprocessing.
  3. Train-Test split.
  4. Text Featurization using Bag of Words.
  5. Hyper Parameter Tuning.
  6. Model Building using the Naive Bayes algorithm.
  7. Performance Metrics.
  8. Model deployment into Web app using Flask API.
  9. Production of the model by Heroku platform.
  10. Results.
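
To make the Bag of Words step above a little more concrete, here is a minimal sketch in plain Python. The function names (`build_vocabulary`, `bag_of_words`) and the toy reviews are made up for illustration and are not taken from the case study. The vocabulary is built from the training reviews only, so that nothing from the test set leaks into the features.

```python
from collections import Counter

def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}

def bag_of_words(document, vocabulary):
    """Return a count vector for one document; tokens outside the vocabulary are ignored."""
    counts = Counter(document.lower().split())
    vector = [0] * len(vocabulary)
    for token, n in counts.items():
        if token in vocabulary:
            vector[vocabulary[token]] = n
    return vector

# Hypothetical toy reviews, not taken from the dataset.
train_reviews = ["good tasty food", "bad food", "good good snack"]
vocab = build_vocabulary(train_reviews)
train_vectors = [bag_of_words(review, vocab) for review in train_reviews]

# A test review is encoded with the training vocabulary; "unknown" is dropped.
test_vector = bag_of_words("good unknown food", vocab)
```

A real pipeline would normally add the preprocessing step first (removing punctuation, stop words, and so on) and would use a sparse representation, since most counts are zero.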

This dataset consists of reviews of fine foods from…

In this blog, we’ll try to understand one of the most important algorithms in machine learning, the Random Forest algorithm. We will look at what makes Random Forest so special and implement it on a real-world dataset.


  1. What Are Ensembles?
  2. Types of Ensemble…

Decision trees are a popular supervised learning method for a variety of reasons. They can be used for both regression and classification, they are easy to interpret, and they don’t require feature scaling. They also have several flaws, including a tendency to overfit.


  1. What…

A Support Vector Machine (SVM) is a supervised machine learning algorithm that is used in many classification and regression problems. It remains one of the most widely used robust prediction methods and can be applied to many use cases involving classification.


  1. Geometric Intuition Of Support Vector Machines.
  2. Mathematical Formulation of Support Vector Machines.

Logistic regression is a classification algorithm used to assign observations to a discrete set of classes. There are many kinds of classification problems, and logistic regression is a common and useful method for solving binary classification problems.
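
As a rough illustration of how logistic regression assigns an observation to one of two classes, the sketch below passes a weighted sum of the features through the sigmoid function and thresholds the resulting probability. The weights, bias, and the 0.5 threshold are hypothetical values chosen for the example, and no training is shown.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function, mapping any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_class(features, weights, bias, threshold=0.5):
    """Return (label, probability) for a single observation."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    probability = sigmoid(z)
    label = 1 if probability >= threshold else 0
    return label, probability

# Hypothetical weights and bias; a real model would learn these from data.
weights = [1.0, -1.0]
bias = 0.0
label_a, prob_a = predict_class([2.0, 1.0], weights, bias)  # z = 1
label_b, prob_b = predict_class([0.0, 3.0], weights, bias)  # z = -3
```

The first observation lands on the positive side of the decision boundary and the second on the negative side, which is all the threshold is checking.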


An optimization problem can be solved with several different methods. A common approach is to pick an initial point on the function's curve or surface and move along it until you reach the optimal or critical point, which can also be seen on a plot of the function.
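
One way to picture "start from an initial point and move to the critical point" is gradient descent on a function of a single variable. This is only a sketch of one possible method. The function f(x) = (x - 3)^2, the learning rate, and the number of steps are all assumptions made for the example.

```python
def gradient_descent(derivative, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the derivative, starting from `start`."""
    x = start
    for _ in range(steps):
        x -= learning_rate * derivative(x)
    return x

# Hypothetical example: f(x) = (x - 3)^2 has derivative 2 * (x - 3)
# and a single minimum at x = 3.
def f_prime(x):
    return 2 * (x - 3)

minimum_x = gradient_descent(f_prime, start=0.0)
```

With these settings the distance to the minimum shrinks by a constant factor at every step, so `minimum_x` ends up very close to 3. A learning rate that is too large would make the iterates overshoot instead.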


  1. Single Value Differentiation
  2. Minima and Maxima


  1. Geometric Intuition for Linear Regression
  2. Linear Regression using Loss-Minimization
  3. Assumptions of Linear Regression
  4. Implementation of the Linear Regression using Python

What is Regression?

Regression analysis is a predictive modeling technique that investigates the relationship between a dependent variable and one or more independent variables.
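
For the simplest case of one independent variable, that relationship can be estimated with an ordinary least-squares line. The sketch below uses the closed-form slope and intercept. The helper name `fit_line` and the data points are invented for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one independent variable."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted = slope * 5 + intercept  # prediction for a new x = 5
```

Because the toy points sit exactly on a line, the fit recovers that line. With noisy data the same formulas give the line that minimizes the sum of squared errors.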

Geometric Intuition for Linear Regression

Linear regression is perhaps one of the most…

Naive Bayes is a statistical classification technique based on Bayes' theorem. It is one of the simplest supervised learning algorithms, and Naive Bayes classifiers are fast and often accurate, even on large datasets.
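
To give a feel for how this works on text, here is a small sketch of a multinomial Naive Bayes classifier with Laplace smoothing, written in plain Python. The choice of the multinomial variant, the smoothing value `alpha=1.0`, the function names, and the toy reviews are all assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(documents, labels, alpha=1.0):
    """Collect class priors and per-class token counts from labeled text."""
    class_doc_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    vocabulary = set()
    for doc, label in zip(documents, labels):
        tokens = doc.lower().split()
        token_counts[label].update(tokens)
        vocabulary.update(tokens)
    return {
        "priors": {c: n / len(labels) for c, n in class_doc_counts.items()},
        "token_counts": dict(token_counts),
        "totals": {c: sum(token_counts[c].values()) for c in class_doc_counts},
        "vocabulary": vocabulary,
        "alpha": alpha,
    }

def predict(model, document):
    """Return the class with the highest log posterior (up to a constant)."""
    vocab_size = len(model["vocabulary"])
    alpha = model["alpha"]
    scores = {}
    for c, prior in model["priors"].items():
        score = math.log(prior)
        for tok in document.lower().split():
            if tok not in model["vocabulary"]:
                continue  # tokens never seen in training carry no evidence
            count = model["token_counts"][c][tok]
            score += math.log((count + alpha) / (model["totals"][c] + alpha * vocab_size))
        scores[c] = score
    return max(scores, key=scores.get)

# Hypothetical toy reviews and labels.
docs = ["great tasty snack", "tasty and fresh", "stale and bad", "bad taste"]
labels = ["positive", "positive", "negative", "negative"]
model = train_naive_bayes(docs, labels)
```

Calling `predict(model, "tasty snack")` picks the positive class and `predict(model, "bad stale")` picks the negative one. The "naive" part is the assumption that tokens are independent given the class, which is why the per-token log probabilities are simply added.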

To understand the Naive Bayes algorithm first we want…

First, we want to know: what is Amazon Fine Food Review Analysis?

This dataset consists of reviews of fine foods from Amazon. The data span a period of more than 10 years, including all ~500,000 reviews up to October 2012. Reviews include product and user information, ratings, and a plaintext…

Sachin D N

Trained on Data Science and Machine Learning at @6benches
