Minimax with Α/Β Pruning

See also principal variation search.

This page used to begin at What search trees look like. If someone directed you to this page, you may want to begin at that section.

Sections Notation Values returned by search Kinds of nodes What search trees look like Predicting the kind of a successor node

# Notation

We’ll notate the successors of a node p with ps₀, ps₁, and so on. We’ll notate the initial value of alpha (the parameter of the search of p) with pα₀, the value of alpha before the n-th successor is searched with pα_n, and the value of alpha after the n-th successor is searched with pα_n+1, so that ps₀α₀ = −pβ and ps₀β = −pα₀, and more generally, ps_nα₀ = −pβ and ps_nβ = −pα_n.

When we’re discussing only one node and its successors — where p is obvious from context — we’ll omit p and simply write these as s_nα₀ = −β and s_nβ = −α_n.

We’ll write pr for the return value of the search of p and write pc_n for the maximum of −ps_kr for k in 0, ..., n (the “best so far”), so that pr = pc_L = max(−ps_kr for k in 0, ..., L) where L is the index of the last successor of p that was searched.

Minimax with α/β pruning is then

def search(p, α₀, β) return evaluation(p) if p has no successors r ← unknown for n in 0, ..., | successors(p) | − 1 s_nr = search(s_n, −β, −α_n) c_n = (max(c_n−1, −s_nr) unless n = 0 then −s_nr) r ← c_n break if β * r α_n+1 = max(α_n, −s_nr) end return r end

where * is either ≤ or < as will be discussed below.

Note that we break before alpha is updated (in other words, we never define α_n+1 when the n-th successor causes a cutoff) in order to maintain the invariant that values of alpha are always less than or equal to beta. This is fairly irrelevant, as α_L+1 is not used in any case.

We’ll notate the score of p as pe. We’ll always use “score” to refer to an evaluation (for terminal nodes) or the theoretical result of a minimax search without α/β pruning. We will not use it a synonym for the return value of a search with pruning.

# Values returned by search

α₀ < r < β

When the value returned by a search is strictly between the initial values of alpha and beta, the value is exact (the return value is the score of the position).

β < r

When the value returned by a search is strictly above beta, the value is a lower bound of the score (but the score is unknown). That is, the score is the return value or greater. An example is depicted below, without and with α/β pruning. Scores are labelled with e, return values are labelled with r, and maxima of the negated return values of successors are labelled with c.

              ╥ e +9                          β +2 ╥ r +7
              ║                               α₀−4 ║
              ║                                 ╭──╨──╮
              ║                                 │ Cut │
              ║                                 ╰──╥──╯
      ╔═══════╩═══════╗                    ╔═══════╩═══════╗
      ║ c₀+7          ║ c₁+9          β +2 ║ c₀+7          ║
      ║               ║               α₀−4 ║               ║
╴╴╴╴╴╴║╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴║╴╴╴╴╴╴╴      ╴╴╴╴╴╴╴║╴╴╴╴╴╴╴╴╴╴╴╴╴╴╶╨╴╴╴╴╴╴
      ║ e −7          ║ e −9          β +4 ║ r −7
      ║               ║               α₀−2 ║
   ╭──╨──╮         ╭──╨──╮              ╭──╨──╮
   │ −7  │         │ −9  │              │ −7  │
   ╰─────╯         ╰─────╯              ╰─────╯

Here the search with α/β pruning returned +7, but the score is actually +9.

r < α₀

Similarly, when the value returned by a search is strictly below the initial value of alpha, the value is an upper bound of the score (and the score is unknown). That is, the score is the return value or less. An example is depicted below, without and with α/β pruning.

              ╥ e −9                          β +4 ╥ r −7
              ║                               α₀−2 ║
              ║                                 ╭──╨──╮
              ║                                 │ All │
              ║                                 ╰──╥──╯
              ║ c₀−9                          β +4 ║ c₀−7
              ║                               α₀−2 ║
╴╴╴╴╴╴╴╴╴╴╴╴╴╴║╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴      ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴║╴╴╴╴╴╴╴╴╴╴╴╴╴╴
              ║ e +9                          β +2 ║ r +7
              ║                               α₀−4 ║
              ║                                 ╭──╨──╮
              ║                                 │ Cut │
              ║                                 ╰──╥──╯
      ╔═══════╩═══════╗                    ╔═══════╩═══════╗
      ║ c₀+7          ║ c₁+9          β +2 ║ c₀+7          ║
      ║               ║               α₀−4 ║               ║
╴╴╴╴╴╴║╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴║╴╴╴╴╴╴╴      ╴╴╴╴╴╴╴║╴╴╴╴╴╴╴╴╴╴╴╴╴╴╶╨╴╴╴╴╴╴
      ║ e −7          ║ e −9          β +4 ║ r −7
      ║               ║               α₀−2 ║
   ╭──╨──╮         ╭──╨──╮              ╭──╨──╮
   │ −7  │         │ −9  │              │ −7  │
   ╰─────╯         ╰─────╯              ╰─────╯

Here the search with α/β pruning returned −7, but the score is actually −9.

r = α₀ or r = β

When the value returned by a search is equal to alpha or beta, whether the value is exact or a bound depends on the criterion for cutoff.

When the criterion for cutoff is β < r , a return value equal to α₀ or β is exact. A position is unreachable when β < r.

For theoretical matters, this is the convention I prefer. For example, it implies that when α = β we know the score E exactly, as one would intuitively expect: after all, if α ≤ E and E ≤ β and a = β, then E = α = β. But this is not necessarily true if the criterion for cutoff is β ≤ r, because the position may be unreachable, and E ≤ β does not hold for unreachable positions.

When the criterion for cutoff is β ≤ r , whether a return value equal to α₀ or β is exact or a bound is unknown. A position is unreachable when β < r and may be unreachable when β = r.

However, whether the return value is exact or a bound can be determined if an additional field is also returned.

Implementation

def invert(t) return upper bound if t = lower bound return lower bound if t = upper bound return exact end
def join(t_x, t_y) return lower bound if t_x = lower bound or t_y = lower bound return exact if t_x = exact or t_y = exact return upper bound end
def search(p, α₀, β) if p has no successors return ⟨if p is stalemate then 0 else −∞, exact⟩ end r ← unknown t ← unknown for n in 0, ..., | successors(p) | − 1 s_nr, s_nt = search(s_n, −β, −α_n) let nr , nt = −s_nr , invert(s_nt) if n = 0 or nr ≥ r r ← nr t ← if nr = r then join(t, nt) else nt end break if β ≤ r α_n+1 = max(α_n, nr) end if β ≤ r and not all successors of p searched t ← lower bound end return ⟨r, t⟩ end

# Kinds of nodes

A pv node has a return value that is exact.
The name is somewhat unfortunate, because a pv node is not necessarily part of the principal variation of the root or even the pv of its predecessor; only when move ordering is perfect is every pv node part of the pv of its predecessor. And when move ordering is perfect, every pv node is part of the pv of the root.

                           ╥
                           ║
   ╔═══════════════╦═══════╩═══════╦═══╶╶╶
   ║        before ║  after        ║
   ║         β +7  ║  β +7         ║
   ║         α −2  ║  α +5         ║
   ║              +5=              ║
╭╴╴╨╶╶╮    ╭╴╴╴╴╴╴╴╨╶╶╶╶╶╶╶╮    ╭╴╴╨╶╶╮
╷     ╷    ╷ β +2    r −5= ╷    ╷     ╷
╵     ╵    ╵ α −7    P−V   ╵    ╵     ╵
╰╴╴─╶╶╯    ╰╴╴╴╴╴╴╴─╶╶╶╶╶╶╶╯    ╰╴╴─╶╶╯

A Cut node has a score

greater than beta, from its own perspective
(because one of its successors caused a cutoff), and
less than alpha, from the perspective of its predecessor
(so it fails to raise alpha).

                           ╥
                           ║
   ╔═══════════════╦═══════╩═══════╦═══╶╶╶
   ║        before ║  after        ║
   ║         β +7  ║  β +7         ║
   ║         α −2  ║  α −2         ║
   ║              −4↓              ║
╭╴╴╨╶╶╮    ╭╴╴╴╴╴╴╴╨╶╶╶╶╶╶╶╮    ╭╴╴╨╶╶╮
╷     ╷    ╷ β +2    r +4↑ ╷    ╷     ╷
╵     ╵    ╵ α −7    Cut   ╵    ╵     ╵
╰╴╴─╶╶╯    ╰╴╴╴╴╴╴╴─╶╶╶╶╶╶╶╯    ╰╴╴─╶╶╯

The score of a Cut node is unknown; the returned value is a bound. Specifically, the score of a Cut node is

greater than the value returned, from it own perspective, and
less than the value returned, from the perspective of its predecessor.

An All node has a score

less than alpha, from its own perspective
(because none of its successors raised alpha), and
greater than beta, from the perspective of its predecessor
(so it causes a cutoff).

                           ╥
                           ║
   ╔═══════════════╦═══════╝
   ║        before ║
   ║         β +7  ║
   ║         α −2  ║
   ║              +9↑
╭╴╴╨╶╶╮    ╭╴╴╴╴╴╴╴╨╶╶╶╶╶╶╶╮
╷     ╷    ╷ β +2    r −9↓ ╷
╵     ╵    ╵ α −7    All   ╵
╰╴╴─╶╶╯    ╰╴╴╴╴╴╴╴─╶╶╶╶╶╶╶╯

The score of an All node is unknown; the returned value is a bound. Specifically, the score of an All node is

less than the value returned, from it own perspective, and
greater than the value returned, from the perspective of its predecessor.

# What search trees look like

With perfect move ordering, search nodes have this structure:

       P−V           Cut           All
   ┌────┼────┐        │        ┌────┼────┐
  P−V  Cut  ...      All      Cut  ...  ...

where moves are searched left-to-right and an ellipsis indicates zero or more nodes of the preceding type.
(You should be able to horizontally scroll the diagrams if you have a narrow display and the diagrams are being clipped.)

With imperfect move ordering, they have this structure:

           P−V                              Cut                All
     ┌──────┼────┬────┬────┐          ┌──────┼────┐        ┌────┼────┐
  P−V/Cut  ...  P−V  Cut  ...      P−V/Cut  ...  All      Cut  ...  ...
                 ↑
             best move

A tree with perfect move ordering looks like:

                                          P−V
                      ┌───────────┬────────┼────────┐
                     P−V         Cut      Cut      Cut
          ┌──────┬────┼────┐      │        │        │
         P−V    Cut  Cut  Cut    All      All      All
   ┌───┬──┼──┐   │    │    │   ┌──┼──┐  ┌──┼──┐  ┌──┼──┐
  P−V  C  C  C  All  All  All  C  C  C  C  C  C  C  C  C
  ┌┼┐  │  │  │  ┌┼┐  ┌┼┐  ┌┼┐  │  │  │  │  │  │  │  │  │

Actual search trees are quite good: 90+% of Cut nodes have only a single successor and in the majority of pv nodes the first successor is the best move. This is partially explained by iterative deepening with a transposition table: the tree is continually being reëvaluated, and at each iteration the previously-calculated best responses are now ordered first. And since deeper searches are more likely to be accurate, the closer a node is to the root, the more likely it is for its first move to be ordered correctly. So the interior of the tree looks nearly ideal, but the leaves might be garbage (over the next few steps they will be massively refined, but no longer be leaf nodes). What's curious, though, is that the overall statistics are still good even given the fact that more than half of all nodes are leaf nodes.

# Predicting the kind of a successor node

If you believe you are an All node, you should predict your successors are Cut nodes.

If you believe you are a Cut node, you should predict your successors are All nodes.

If you believe you are a pv node, you should predict your first successor is a pv node and every subsequent successor is a Cut node. There are conditions under which you might revise your beliefs:

If you strongly believe your move ordering is perfect and the first successor fails to raise alpha (meaning it was a Cut node), then you should consider yourself an All node. (This is probably not the case.)
If you suspect your move ordering is fallible (or you strongly believe you are in fact a pv node) and the first successor fails to raise alpha, then you should predict the next successor is a pv node (and not a Cut node) and that every successor thereafter is a Cut node (and so on until a successor raises alpha).

# Why so many nodes get pruned

This section needs to be written!