Logistic Regression

Logistic regression is classification's answer to the straight line. Take the familiar linear score z = \vec{w}\cdot\vec{x} + b, then pass it through the sigmoid \sigma(z) = 1 / (1 + e^{-z}), which squashes any real number into the interval (0, 1), to turn it into a probability:

h(\vec{x}) = \sigma(\vec{w}\cdot\vec{x} + b) = P(\text{class } 1 \mid \vec{x}).

Despite the "regression" in its name, it's used as a classifier. It outputs the estimated probability of the positive class, and you turn that into a decision with a threshold, usually 0.5: above it, predict class 1; below it, class 0.
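The score-then-threshold recipe can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function names, the weights, and the two input rows are all made up for the example, and sending a probability of exactly 0.5 to class 1 is an assumption, since the text leaves that case open.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(X, w, b):
    """Estimated P(class 1) for each row of X."""
    return sigmoid(X @ w + b)

def predict(X, w, b, threshold=0.5):
    """Hard 0/1 labels. Ties at the threshold go to class 1 (an assumption)."""
    return (predict_proba(X, w, b) >= threshold).astype(int)

# Hypothetical weights and inputs, chosen only for illustration.
w = np.array([2.0, -1.0])
b = -0.5
X = np.array([[1.0, 0.5],    # score z = 2.0 - 0.5 - 0.5 = 1.0
              [0.0, 1.0]])   # score z = -1.0 - 0.5 = -1.5

probs = predict_proba(X, w, b)   # roughly [0.73, 0.18]
labels = predict(X, w, b)        # [1, 0]
```

Raising the threshold above 0.5 makes the model more reluctant to predict class 1, which can be useful when false positives are the costlier mistake.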

A probability curve through the data

The points sit at height 0 (class 0, on the left) or 1 (class 1, on the right). The S-curve is the model's predicted probability. Adjust its steepness (set by w) and position (set by b) so it rises through the gap between the classes. The point where it crosses 0.5 is the decision boundary: there the score z is zero, so with a single feature it sits at x = -b/w.
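For a single feature, the point where the curve crosses 0.5 can be computed directly, since the sigmoid equals 0.5 exactly when its input is zero. The sketch below uses made-up values of w and b to show this.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical 1-D parameters: w sets the steepness, b shifts the curve.
w, b = 4.0, -6.0

# sigmoid(z) = 0.5 exactly when z = 0, so the curve crosses 0.5 at x = -b / w.
boundary = -b / w

p_at_boundary = sigmoid(w * boundary + b)
p_left = sigmoid(w * (boundary - 1.0) + b)    # one unit left of the crossing
p_right = sigmoid(w * (boundary + 1.0) + b)   # one unit right of the crossing
```

With these values the crossing is at x = 1.5. A larger |w| makes the curve steeper around that point, while changing b alone slides the crossing left or right.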

How it learns

Logistic regression is trained much like its linear cousin: define a cost, then run gradient descent. The right cost here isn't squared error, though. Pushed through the sigmoid, squared error gives a non-convex cost that gradient descent can get stuck on. Cross-entropy is convex in \vec{w} and b, costs almost nothing for confident correct answers, and savagely punishes confident wrong ones. With it, logistic regression is a reliable, widely used baseline classifier.
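As a rough sketch of that training loop, the code below runs batch gradient descent on the mean cross-entropy cost,

J = -\frac{1}{m} \sum_{i=1}^{m} \left[ y_i \log h(\vec{x}_i) + (1 - y_i) \log(1 - h(\vec{x}_i)) \right],

where m (the number of examples) and y_i (the 0/1 label of example i) are notation introduced here. The toy data, learning rate, and step count are assumptions picked for illustration, not tuned values.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(y, p, eps=1e-12):
    """Mean binary cross-entropy; eps guards against log(0)."""
    p = np.clip(p, eps, 1.0 - eps)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def fit(X, y, lr=0.1, n_steps=5000):
    """Batch gradient descent on the mean cross-entropy cost."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_steps):
        p = sigmoid(X @ w + b)
        error = p - y                          # gradient of each example's loss w.r.t. its score z
        w -= lr * (X.T @ error) / n_samples    # gradient of the mean cost w.r.t. w
        b -= lr * np.mean(error)               # gradient of the mean cost w.r.t. b
    return w, b

# Toy 1-D data (made up): class 0 on the left, class 1 on the right.
X = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

w, b = fit(X, y)

initial_cost = cross_entropy(y, sigmoid(X @ np.zeros(1)))   # every prediction is 0.5
final_cost = cross_entropy(y, sigmoid(X @ w + b))
labels = (sigmoid(X @ w + b) >= 0.5).astype(int)
boundary = -b / w[0]   # where the fitted curve crosses 0.5
```

The update uses the fact that, for the sigmoid paired with cross-entropy, the gradient with respect to the score z is simply the prediction minus the label. Because this toy data is perfectly separable, the weights keep growing the longer training runs; in practice a regularization term or early stopping is commonly added to keep them bounded.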