Knowledge representation and logic agents
A reflex agent reacts to what it senses right now. But some of the smartest behaviour comes from an
agent that knows things — a stack of facts about its world — and can reason
from those facts to conclusions it was never told directly. A doctor who has never seen your X-ray can
still deduce, from a handful of symptoms and a body of medical knowledge, what is probably wrong. That
leap — from what you were told to what must therefore be true — is inference, and it
is the heart of a knowledge-based agent.
This page is about knowledge representation: how an agent stores what it knows in a
formal language, and how it squeezes new conclusions out of that store by pure logic. We'll meet the
knowledge base, the two operations TELL and ASK,
the crucial idea of entailment, and the difference between what follows and
what a program can actually derive. Get this frame right and you have the machinery behind
theorem provers, planners, and the classic Wumpus-world agent that deduces where the
monster hides without ever seeing it.
The knowledge base: TELL and ASK
A knowledge base (KB) is, quite literally, a set of sentences written in a
formal knowledge-representation language. Not English sentences — precise, unambiguous
ones, so that a machine can manipulate them without guessing what you meant. Each sentence asserts
something the agent takes to be true about its world.
A knowledge-based agent interacts with its KB through exactly two operations:
- TELL — add a new sentence (a fact or a rule) to the KB;
- ASK — pose a query and get back whatever the KB, plus reasoning,
entails.
The magic is in that word "reasoning". When you ASK whether some sentence
\alpha holds, the agent does not merely look it up — it may have never been
told \alpha at all. Instead it runs an inference procedure
that decides whether \alpha must be true given everything in the
KB. The diagram below traces that flow.
Propositional logic as a representation
The simplest useful representation language is propositional logic. Its atoms are
propositional symbols — P, Q,
R, each standing for a fact that is either true or false ("it is raining",
"the alarm is on"). Symbols combine through connectives:
\lnot (not), \land (and),
\lor (or), \rightarrow (implies), and
\leftrightarrow (if and only if).
The meaning of a sentence is fixed by a model: an assignment of
true/false to every symbol. With
n symbols there are exactly 2^n possible models —
the 2^n rows of the sentence's truth table. In each model, a sentence is
simply true or false, computed from the connectives. A model in which a sentence is true is said to
satisfy it; we call it a model of that sentence.
Say we TELL the KB two sentences:
P \rightarrow Q, \qquad P.
We were never told Q. Yet in every model where both of those
sentences are true, Q comes out true as well — check the rows for yourself.
So the KB entails Q, and ASK-ing for
Q should return "yes". That is exactly the intuition we now make precise.
Entailment: the gold standard for "follows from"
A knowledge base KB entails a sentence
\alpha, written KB \models \alpha, when:
- \alpha is true in every model in which
KB is true.
- Equivalently: the set of models that satisfy KB is a
subset of the models that satisfy \alpha — wherever
KB holds, \alpha is dragged along with it.
Entailment is a semantic notion: it is defined purely in terms of truth in models,
with no mention of any algorithm. It is the gold standard for what it means for a conclusion
to genuinely follow from what you know. If KB \models \alpha, then anyone
who accepts every sentence in KB is logically committed to
\alpha — there is no consistent way to believe the KB and deny
\alpha.
And this gives us an immediate, if brutal, algorithm for answering ASK: enumerate all
the models and check.
Deciding entailment by model checking
Model checking takes the definition of entailment absolutely literally. To decide
whether KB \models \alpha, enumerate all 2^n
assignments of the n symbols; for every assignment that makes the KB true,
check that \alpha is true there too. If even one KB-model falsifies
\alpha, entailment fails; if none do, it holds.
Let's actually build it. We represent a tiny KB and a query each as a boolean function over the symbols
A, B, C, loop over all 2^3 = 8 models, and apply
the definition. Notice we test one entailed query and one non-entailed query — press
Run:
type Model = { A: boolean; B: boolean; C: boolean };
// KB, in words: (A implies B) and (B implies C) and A are all true.
// (P implies Q) is logically (!P || Q).
const KB = (m: Model): boolean =>
(!m.A || m.B) && (!m.B || m.C) && m.A;
// Does KB entail `query`? True iff query holds in EVERY model where KB holds.
function entails(query: (m: Model) => boolean): boolean {
for (const A of [false, true])
for (const B of [false, true])
for (const C of [false, true]) {
const m = { A, B, C };
if (KB(m) && !query(m)) return false; // a counter-model: KB true, query false
}
return true; // no counter-model found
}
// C should follow (A -> B -> C, and A holds), a chain of modus ponens.
console.log("KB entails C? ", entails((m) => m.C));
// But nothing forces A and C to differ, so "B is false" does NOT follow.
console.log("KB entails !B?", entails((m) => !m.B));
Model checking is sound and complete — it gives exactly the right
answer for every query — but it pays for that with 2^n work. Ten symbols is
a thousand models; sixty symbols is more models than there are seconds since the Big Bang. For large
knowledge bases we need something cleverer than enumerating the whole universe of possibilities.
Proof: deriving instead of enumerating
The alternative to checking models is to manipulate sentences directly with
inference rules — syntactic patterns that turn sentences you have into new ones,
never touching a truth table. The best-known is Modus Ponens:
\dfrac{\;\alpha \rightarrow \beta, \quad \alpha\;}{\beta}
"Given \alpha \rightarrow \beta and \alpha,
conclude \beta." A more powerful single rule, resolution,
is enough on its own to prove anything entailed in propositional logic (combined with a proof by
contradiction). When a procedure derives \alpha from
KB by such rules, we write KB \vdash \alpha —
note the different turnstile.
So we now have two arrows and they must be kept apart:
- KB \models \alpha — semantic: \alpha
is true in all models of the KB. What genuinely follows.
- KB \vdash \alpha — syntactic: \alpha
can be derived by a particular inference procedure. What a program actually reaches.
We want these two to line up. Two properties tie them together:
- An inference procedure is sound (truth-preserving) when everything it derives is
entailed: KB \vdash \alpha implies
KB \models \alpha. It never invents a falsehood.
- An inference procedure is complete when it can derive everything that is
entailed: KB \models \alpha implies
KB \vdash \alpha. It never misses a truth.
A procedure that is both sound and complete derives exactly the entailed
sentences — for it, \vdash and \models coincide.
Model checking and resolution are both sound and complete for propositional logic, which is the good
news that lets us build reliable logic agents.
Two traps live in this card. First, do not conflate
\vdash with \models. Entailment
(\models) is about truth in every model — it exists whether or not
any algorithm ever finds it. Derivability (\vdash) is about a
specific procedure actually reaching the conclusion. They are guaranteed to agree
only when your procedure is both sound and complete; a weak procedure can fail to
derive something that is nonetheless entailed.
Second, beware the fallacy of affirming the consequent. From
P \rightarrow Q and Q you may
not conclude P. ("If it rains the grass is wet; the grass
is wet — so it rained." No: a sprinkler would do it.) There is a model where
Q is true, P is false, and
P \rightarrow Q still holds — so P is not
entailed. Modus Ponens runs the arrow one way only.
First-order logic: objects, relations and quantifiers
Propositional logic has a ceiling. Each symbol is an atomic, opaque fact; there is no way to say
"every square next to a pit is breezy" in one stroke — you would need a separate symbol for
every single square. First-order logic (FOL) breaks through that ceiling by
representing the world as objects, relations among them, and
functions:
- Objects — the things: squares, people, numbers, the Wumpus.
- Predicates / relations — properties and relationships:
\text{Breezy}(x), \text{Adjacent}(x, y).
- Functions — mappings from objects to objects:
\text{North}(x), \text{Father}(x).
- Quantifiers — \forall ("for all") and
\exists ("there exists"), which range over objects.
Quantifiers are the leap in power. One FOL axiom
\forall x \;\; \text{Pit}(x) \rightarrow \big(\forall y \;\; \text{Adjacent}(x, y) \rightarrow \text{Breezy}(y)\big)
captures a rule about infinitely many objects at once — every pit, every neighbour, for all
time. Propositional logic would need one clause per (square, neighbour) pair. This expressiveness is
why FOL is the workhorse of serious knowledge representation, and why the classic logic-agent testbed,
the Wumpus-world, is usually posed in it: an agent groping through a dark cave feels
breezes and stenches, and must deduce from general rules exactly
which squares hold pits and where the Wumpus lurks — reasoning its way to safety without ever seeing
the danger.
You could, in principle, "compile" a first-order rule down to propositions — one symbol per fact —
but only if the world is finite and small. Say "all humans are mortal" over a billion
people and you need a billion separate implications; say it over the natural numbers and you need
infinitely many. A single quantified sentence, \forall x\; \text{Human}(x)
\rightarrow \text{Mortal}(x), replaces that unbounded pile with one line — and stays
true even as new objects are discovered. Quantifiers are compression, and compression is what lets a
finite agent reason about an unbounded world.
Classical logic is monotonic: TELL-ing the KB more facts can only add
conclusions, never retract old ones. If KB \models \alpha, then
KB \cup \{\beta\} \models \alpha for any
\beta — the models only shrink, so everything true before stays true. That
is wonderful for provable certainty, but real life isn't like that: told "Tweety is a bird" you
conclude Tweety flies, then learn "Tweety is a penguin" and want to take it back. Plain
logic can't. This limitation — the need to reason with defaults, exceptions, and degrees of belief —
is exactly what pushes us toward reasoning under uncertainty and probability on the
pages ahead.