Logistic Regression
Logistic regression is classification's answer to the straight line. Take the
familiar linear score z = \vec{w}\cdot\vec{x} + b, then pass it through the sigmoid,
\sigma(z) = 1 / (1 + e^{-z}), to turn it into a probability:
h(\vec{x}) = \sigma(\vec{w}\cdot\vec{x} + b) = P(\text{class } 1 \mid \vec{x}).
Despite the name, it's a classifier, not a regressor. It outputs the probability of the
positive class, and you turn that into a decision with a threshold, usually 0.5: above it,
predict class 1; below it, class 0.
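To make the score-to-probability-to-decision chain concrete, here is a minimal sketch in plain Python. The function names and the example weights are made up for illustration, and treating a probability of exactly 0.5 as class 1 is an assumption, since the rule above leaves ties open.

```python
import math

def sigmoid(z):
    """Squash any real score z into a value between 0 and 1."""
    # Two branches keep math.exp from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, b, x):
    """P(class 1) for one feature vector x, given weights w and bias b."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b  # linear score w.x + b
    return sigmoid(z)

def predict(w, b, x, threshold=0.5):
    """Return 1 if the probability reaches the threshold, else 0 (tie rule assumed)."""
    return 1 if predict_proba(w, b, x) >= threshold else 0

# Hypothetical parameters, chosen only for illustration.
w, b = [2.0, -1.0], 0.5
print(predict_proba(w, b, [1.0, 1.0]))  # score z = 1.5, so this is sigmoid(1.5)
print(predict(w, b, [1.0, 1.0]))        # positive score, so class 1
print(predict(w, b, [-1.0, 1.0]))       # score z = -2.5, so class 0
```

Because the sigmoid crosses 0.5 exactly at z = 0, thresholding the probability at 0.5 is the same as checking the sign of the linear score.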
A probability curve through the data
The points sit at height 0 (class 0, on the left) or 1 (class 1, on the right). The
S-curve is the model's predicted probability. Adjust its steepness and position so that it
rises through the gap between the classes; the x-value where it crosses 0.5 is the decision
boundary.
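For a single feature, the curve's two controls map directly onto the model's parameters: steepness is the weight w, and the curve crosses 0.5 where w x + b = 0, that is, at x = -b / w. The sketch below shows that mapping; `curve`, `decision_boundary`, and the numbers used are illustrative names and values, not part of the original page.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def curve(x, steepness, midpoint):
    """1-D logistic curve described by its steepness and the x where it crosses 0.5."""
    w = steepness
    b = -steepness * midpoint  # chosen so that w * midpoint + b == 0
    return sigmoid(w * x + b)

def decision_boundary(w, b):
    """x-value at which sigmoid(w * x + b) equals 0.5 (requires w != 0)."""
    return -b / w

# Illustrative values: a curve centred at x = 3.
print(curve(3.0, steepness=2.0, midpoint=3.0))  # 0.5, exactly at the midpoint
print(curve(4.0, steepness=1.0, midpoint=3.0))  # gentle rise to the right of the midpoint
print(curve(4.0, steepness=5.0, midpoint=3.0))  # steeper curve, closer to 1 at the same x
print(decision_boundary(2.0, -6.0))             # 3.0
```

A larger weight makes the transition sharper without moving the crossing point, while changing the bias alone shifts the curve left or right.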
How it learns
Logistic regression is trained much like its linear cousin: define a cost, then run
gradient descent. The right cost here isn't squared error, though, because squared error
combined with the sigmoid gives a non-convex cost that gradient descent can get stuck on.
It's cross-entropy, which rewards confident correct answers and savagely punishes confident
wrong ones. Trained this way, logistic regression is one of the most reliable baseline
classifiers in machine learning.
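As a rough illustration of that training loop, the sketch below fits a one-feature model by batch gradient descent on the mean cross-entropy, J = -\frac{1}{n}\sum_i [y_i \log p_i + (1 - y_i)\log(1 - p_i)] with p_i = \sigma(w x_i + b). The toy data, learning rate, and step count are assumptions picked for the example, not values from the text.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def cross_entropy(w, b, xs, ys, eps=1e-12):
    """Mean binary cross-entropy of the model sigmoid(w * x + b) on (xs, ys)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(xs)

def train(xs, ys, lr=0.1, steps=5000):
    """Batch gradient descent; lr and steps are illustrative choices."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # For cross-entropy with a sigmoid, the gradient reduces to (p - y) terms.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Made-up toy data: class 0 on the left, class 1 on the right.
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 1, 1]

w, b = train(xs, ys)
print(cross_entropy(0.0, 0.0, xs, ys))  # cost before training: log(2), every p is 0.5
print(cross_entropy(w, b, xs, ys))      # cost after training, lower than before
print(-b / w)                           # x where the fitted curve crosses 0.5
```

The gradient has the same "prediction minus label, times input" form as in linear regression, which is why the two training loops look so similar. On perfectly separable data like this toy set, the weight keeps growing the longer you train, so real uses typically add regularization or stop early.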