
Naive Bayes

Naive Bayes is a classification model.

Bayes' theorem underlies a family of models for both classification and regression.

Naive Bayes is the classification variant: it predicts the target class from the probabilities of the individual variables.

For most models we check for multicollinearity, but for Naive Bayes we don't need to: even if two variables are highly correlated, Naive Bayes takes both into consideration, because it combines the probability of every single variable to predict the target value.

Naive Bayes has the power of iterative learning: it can update its probabilities as new data arrives, without retraining from scratch.

Example: Gmail spam

When you open an email and report it as spam, the Naive Bayes model running behind the scenes not only moves that email into spam but also learns to filter the same kind of email into spam next time.
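As a rough sketch of this incremental behaviour, scikit-learn's MultinomialNB exposes partial_fit, which updates the model with new examples without refitting on everything; the toy emails and labels below are made-up assumptions, not real spam data.

```python
# A minimal sketch of iterative learning with scikit-learn's MultinomialNB;
# the emails and labels here are illustrative assumptions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

emails = ["win a free prize now", "meeting agenda for monday",
          "claim your free reward", "project status update"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)

model = MultinomialNB()
model.partial_fit(X, labels, classes=[0, 1])  # first call must list the classes

# A user reports a new email as spam: update the model in place,
# without retraining on everything seen so far.
reported = vectorizer.transform(["claim your free prize now"])
model.partial_fit(reported, [1])

print(model.predict(vectorizer.transform(["free prize for you"])))
```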

Naive Bayes Formula

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

For a target class $y$ and independent variables $x_1, \dots, x_n$, the naive (independence) assumption gives:

$$P(y \mid x_1, \dots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y)$$

For Classification

It calculates the probability of Yes across all independent variables, and likewise the probability of No across all independent variables.

Finally, if the probability of Yes is higher than the probability of No, it assigns Yes to that particular target value.
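A minimal hand-rolled sketch of this Yes/No comparison; the weather-style records and variable names below are illustrative assumptions, not from the text:

```python
# Each record is (Outlook, Windy, Play); Play is the target.
data = [
    ("sunny", "no",  "Yes"), ("sunny", "yes", "No"),
    ("rainy", "yes", "No"),  ("rainy", "no",  "Yes"),
    ("sunny", "no",  "Yes"), ("rainy", "yes", "No"),
]

def score(outlook, windy, target):
    rows = [r for r in data if r[2] == target]
    prior = len(rows) / len(data)                               # P(target)
    p_outlook = sum(r[0] == outlook for r in rows) / len(rows)  # P(outlook | target)
    p_windy   = sum(r[1] == windy   for r in rows) / len(rows)  # P(windy | target)
    return prior * p_outlook * p_windy  # proportional to P(target | outlook, windy)

p_yes = score("sunny", "no", "Yes")
p_no  = score("sunny", "no", "No")
print("Yes" if p_yes > p_no else "No")  # assign the class with the higher score
```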

Naive Bayes works well for categorical variables, and for continuous variables provided they follow a normal distribution.

Formula for Naive Bayes with a continuous variable:

$$P(x_i \mid y) = \frac{1}{\sqrt{2\pi\sigma_y^2}} \exp\!\left(-\frac{(x_i - \mu_y)^2}{2\sigma_y^2}\right)$$

where $\mu_y$ and $\sigma_y$ are the mean and standard deviation of $x_i$ for class $y$.
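A small sketch of evaluating this Gaussian likelihood in code; the mean, standard deviation, and example value are illustrative assumptions (in practice they are estimated per class from the training data):

```python
import math

def gaussian_likelihood(x, mu, sigma):
    """P(x | class) under a normal distribution with mean mu and std sigma."""
    coeff = 1.0 / math.sqrt(2 * math.pi * sigma ** 2)
    return coeff * math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

# e.g. likelihood of temperature = 72 for a class with mean 70 and std 5
print(gaussian_likelihood(72, mu=70, sigma=5))
```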

  • First, Naive Bayes builds a contingency (frequency) table for every single variable against the target, as in the sketch below.
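A minimal sketch of such a table using pandas.crosstab; the Outlook/Play data is a made-up assumption:

```python
import pandas as pd

# Made-up training data: one categorical variable vs. the target.
df = pd.DataFrame({
    "Outlook": ["sunny", "sunny", "rainy", "rainy", "sunny", "rainy"],
    "Play":    ["Yes",   "No",    "No",    "Yes",   "Yes",   "No"],
})

# Contingency (frequency) table of Outlook against Play.
table = pd.crosstab(df["Outlook"], df["Play"])
print(table)

# Dividing each column by its total turns counts into P(Outlook | Play).
print(table / table.sum(axis=0))
```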

When your test data contains a value that never appeared in the training data, its conditional probability is zero, which makes the final product zero. This situation is called the zero-frequency problem.

To solve this issue we use smoothing techniques; Laplace estimation (add-one smoothing) is one of them.
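A small sketch of add-one smoothing; the counts and number of categories below are illustrative assumptions:

```python
def smoothed_probability(count, total, n_categories, alpha=1):
    """P(value | class) with add-alpha (Laplace) smoothing."""
    return (count + alpha) / (total + alpha * n_categories)

# An unseen value has count 0; without smoothing its probability would be 0
# and would zero out the whole product of probabilities.
print(smoothed_probability(count=0, total=10, n_categories=3))  # 1/13, not 0
```

In scikit-learn, the alpha parameter of MultinomialNB applies the same correction.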

It is always best practice to convert continuous variables into categorical values before feeding them into Naive Bayes.
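One simple way to do that is binning; the sketch below uses pandas.cut with made-up ages and bin edges:

```python
import pandas as pd

# Made-up continuous values and bin edges, purely for illustration.
ages = pd.Series([22, 35, 47, 58, 63, 29, 41])
age_groups = pd.cut(ages, bins=[0, 30, 45, 60, 100],
                    labels=["young", "mid", "senior", "elder"])
print(age_groups.tolist())
```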

We use this type of Bayes algorithm mainly in mining applications, e.g. text mining.