How Does Logistic Regression Classify?

Aditya Tiwari
4 min readMay 27, 2020

Topics to be covered in this blog:

  1. What is Logistic Regression?
  2. How it works over the data points

Logistic Regression is a statistical, supervised learning technique that is widely used for classification, under the assumption that the data is almost ‘linearly separable’.

By ‘linearly separable’, we mean that the two classes of points (with binary labels such as Yes/No or True/False) can be separated by a line or plane. Therefore the objective of Logistic Regression is to find the plane/line that correctly separates the maximum number of those data points.

Now let us proceed to the mathematical side of Logistic Regression. Our objective is to find the line/plane that best fits our data points. To derive this, we need the equation of a line:

Y = MX + C

Let's change the notation a bit:

f(x) = WT.X + b

where X is our data point,

W is the normal to the plane (WT is the transpose of W), and

b is the bias/constant/intercept.

Assuming the line passes through the origin, b = 0, so:

f(x) = WT.X
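The score f(x) = WT.X is just a dot product. A minimal sketch in plain Python (the weight vector and data point below are made-up example values, not from the blog):

```python
# Sketch of the linear score f(x) = WT.X, with bias b = 0.
def f(w, x):
    """Dot product of weight vector w and data point x."""
    return sum(wi * xi for wi, xi in zip(w, x))

w = [2.0, 1.0]   # assumed normal to the separating line
x = [1.0, 3.0]   # assumed example data point

print(f(w, x))   # 2*1 + 1*3 = 5.0
```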

As shown in the figure, d1 is the distance of the data point X1 from the plane, where W is the normal to the plane:

d1 = WT.X1 / |W| (where |W| is the norm of W)

If W is a unit vector, |W| = 1, so d1 = WT.X1
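The distance formula above can be sketched directly (the vectors here are illustrative values chosen so that the arithmetic is easy to check; they are not from the blog):

```python
import math

# Signed distance of a point from the plane WT.x = 0:
#   d = WT.x / |W|; if |W| = 1 this reduces to d = WT.x.
def signed_distance(w, x):
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(wi * wi for wi in w))
    return dot / norm

w = [3.0, 4.0]    # |W| = 5, deliberately not a unit vector
x1 = [2.0, 1.0]   # hypothetical data point X1

print(signed_distance(w, x1))   # (3*2 + 4*1) / 5 = 2.0
```

The sign of the result tells us which side of the plane the point lies on, which is exactly what the four cases below rely on.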

Now let's understand how logistic regression works.

Consider this example: there are 4 points X1, X2, X3, X4, where blue points have Y value = 1 and red points have Y value = -1. Let us see how the model classifies these points.

Points above the line are positive and those below the line are negative.

There are 4 cases of how a point can be classified:

Case 1: When X is positive and Y is positive

Data point X2 lies on the positive side, so WT.X2 is positive, which means d2 is positive. Then we compute Y2.d2, where Y2 is the corresponding value (label) for X2 in the data set; in this case it is +1 (check the legend on the above diagram: blue points have Y value = 1 and red points have Y value = -1). So Y2.d2 is positive, which means the model has correctly classified the point.

Case 2: When X is positive and Y is negative

Data point X1 lies on the positive side, so WT.X1 is positive, which means d1 is positive. Then we compute Y1.d1, where Y1 is the corresponding value (label) for X1 in the data set; in this case it is -1. So Y1.d1 is negative, which means the model has incorrectly classified the point.

Case 3: When X is negative and Y is positive

Data point X3 lies on the negative side, so WT.X3 is negative, which means d3 is negative. Then we compute Y3.d3, where Y3 is the corresponding value (label) for X3 in the data set; in this case it is +1. So Y3.d3 is negative, which means the model has incorrectly classified the point.

Case 4: When X is negative and Y is negative

Data point X4 lies on the negative side, so WT.X4 is negative, which means d4 is negative. Then we compute Y4.d4, where Y4 is the corresponding value (label) for X4 in the data set; in this case it is -1. So Y4.d4 is positive, which means the model has correctly classified the point.
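The four cases reduce to one rule: a point is correctly classified exactly when Y.d is positive. A sketch with made-up points and labels chosen to mirror X1–X4 above:

```python
# Correct classification <=> y * d > 0, where d = WT.x (assuming |W| = 1).
def score(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

w = [1.0, 1.0]  # assumed normal to the separating line
points = {
    "X2": ([ 2.0,  1.0], +1),  # Case 1: d > 0, Y = +1 -> correct
    "X1": ([ 1.0,  2.0], -1),  # Case 2: d > 0, Y = -1 -> incorrect
    "X3": ([-2.0, -1.0], +1),  # Case 3: d < 0, Y = +1 -> incorrect
    "X4": ([-1.0, -2.0], -1),  # Case 4: d < 0, Y = -1 -> correct
}

for name, (x, y) in points.items():
    d = score(w, x)
    verdict = "correct" if y * d > 0 else "incorrect"
    print(name, d, verdict)
```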

Note: that is how the model classifies the data. However, there is one more important concept.

That concept is sigmoid function squashing. For now, keep in mind that the sigmoid function is used to bring very large values of WT.X down into a bounded range, as such values can cause serious issues in finding our best line/plane for the given data. This issue will be addressed in a separate blog.
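As a quick preview, the sigmoid function sigma(z) = 1 / (1 + e^(-z)) squashes any real-valued score into the interval (0, 1):

```python
import math

# Sigmoid squashing: maps any real score z = WT.x into (0, 1).
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

for z in [-100.0, -1.0, 0.0, 1.0, 100.0]:
    print(z, sigmoid(z))
# sigmoid(0) = 0.5; large positive z -> close to 1; large negative z -> close to 0
```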
