Minimax with Α/Β Pruning

See also principal variation search.

This page used to begin at What search trees look like. If someone directed you to this page, you may want to begin at that section.

Sections Notation Values returned by search Kinds of nodes What search trees look like Predicting the kind of a successor node

# Notation

We’ll notate the successors of a node p with ps0, ps1, and so on. We’ll notate the initial value of alpha (the parameter of the search of p) with pα0, the value of alpha before the n-th successor is searched with pαn, and the value of alpha after the n-th successor is searched with pαn+1, so that ps0α0 = −pβ and ps0β = −pα0, and more generally, psnα0 = −pβ and psnβ = −pαn.

When we’re discussing only one node and its successors where p is obvious from context we’ll omit p and simply write these as snα0 = −β and snβ = −αn.

We’ll write pr for the return value of the search of p so that pαn+1 = max(pαn, −psnr), where L is the index of the last successor of p, and we’ll write pcn for the maximum of −pskr for k in 0, ..., n (the “best so far”), so that pr = pc L = max(pskr for k in 0, ..., L) if all successors are searched.

Minimax with α/β pruning is then

def search(p, α0, β) return evaluation(p) if p has no successors runknown for n in 0, ..., L break if β * αn snr = search(sn, −β, −αn) cn = (max(cn1, −snr) unless n = 0 thensnr) rcn αn+1 = max(αn, −snr) end return r end

where * is either ≤ or < as will be discussed below.

We’ll notate the score of p as pe. We’ll always use “score” to refer to an evaluation (for terminal nodes) or the theoretical result of a minimax search without α/β pruning. We will not use it a synonym for the return value of a search with pruning.

# Values returned by search

α0 < r < β

When the value returned by a search is strictly between the initial values of alpha and beta, the value is exact (the return value is the score of the position).

β < r

When the value returned by a search is strictly above beta, the value is a lower bound of the score (but the score is unknown). That is, the score is the return value or greater. An example is depicted below, without and with α/β pruning. Scores are labelled with e, return values are labelled with r, and maxima of the negated return values of successors are labelled with c.

e +9                          β +2r +7α₀−4 ║
              ║                                 ╭──╨──╮
              ║                                 │ Cut │
              ║                                 ╰──╥──╯
      ╔═══════╩═══════╗                    ╔═══════╩═══════╗
      ║ c₀+7c₁+9          β +2c₀+7     β +2 ║
      ║               ║               α₀−4α₁+7 ║
╴╴╴╴╴╴║╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴║╴╴╴╴╴╴╴      ╴╴╴╴╴╴╴║╴╴╴╴╴╴╴╴╴╴╴╴╴╴╶╨╴╴╴╴╴╴
      ║ e −7e −9          β +4r −7
      ║               ║               α₀−2 ║
   ╭──╨──╮         ╭──╨──╮              ╭──╨──╮
   │ −7  │         │ −9  │              │ −7  │
   ╰─────╯         ╰─────╯              ╰─────╯

Here the search with α/β pruning returned +7, but the score is actually +9.

r < α0

Similarly, when the value returned by a search is strictly below the initial value of alpha, the value is an upper bound of the score (and the score is unknown). That is, the score is the return value or less. An example is depicted below, without and with α/β pruning.

e −9                          β +4r −7α₀−2 ║
              ║                                 ╭──╨──╮
              ║                                 │ All │
              ║                                 ╰──╥──╯
              ║ c₀−9                          β +4c₀−7α₀−2 ║
╴╴╴╴╴╴╴╴╴╴╴╴╴╴║╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴      ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴║╴╴╴╴╴╴╴╴╴╴╴╴╴╴
              ║ e +9                          β +2r +7α₀−4 ║
              ║                                 ╭──╨──╮
              ║                                 │ Cut │
              ║                                 ╰──╥──╯
      ╔═══════╩═══════╗                    ╔═══════╩═══════╗
      ║ c₀+7c₁+9          β +2c₀+7     β +2 ║
      ║               ║               α₀−4α₁+7 ║
╴╴╴╴╴╴║╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴║╴╴╴╴╴╴╴      ╴╴╴╴╴╴╴║╴╴╴╴╴╴╴╴╴╴╴╴╴╴╶╨╴╴╴╴╴╴
      ║ e −7e −9          β +4r −7
      ║               ║               α₀−2 ║
   ╭──╨──╮         ╭──╨──╮              ╭──╨──╮
   │ −7  │         │ −9  │              │ −7  │
   ╰─────╯         ╰─────╯              ╰─────╯

Here the search with α/β pruning returned −7, but the score is actually −9.

r = α0 or r = β

When the value returned by a search is equal to alpha or beta, whether the value is exact or a bound depends on the criterion for cutoff.

When the criterion for cutoff is β < αn , a return value equal to α0 or β is exact. A position is unreachable when β < r.

For theoretical matters, this is the convention I prefer. For example, it implies that when α = β we know the score E exactly, as one would intuitively expect: after all, if αE and Eβ and a = β, then E = α = β. But this is not necessarily true if the criterion for cutoff is βαn, because the position may be unreachable, and Eβ does not hold for unreachable positions.

When the criterion for cutoff is βαn , whether a return value equal to α0 or β is exact or a bound is unknown. A position is unreachable when β < r and may be unreachable when β = r.

However, whether the return value is exact or a bound can be determined if an additional field is also returned.

Implementation
def invert(t) return upper bound if t = lower bound return lower bound if t = upper bound return exact end
def join(tx, ty) return lower bound if tx = lower bound or ty = lower bound return exact if tx = exact or ty = exact return upper bound end
def search' (p, α0, β) if p has no successors return if p is stalemate then 0 else −∞, exact end runknown tunknown for n in 0, ..., L break if βαn snr, snt = search' (sn, −β, −αn) let nr , nt = −snr , invert(snt) if n = 0 or nrr rnr tif nr = r then join(t, nt) else nt end αn+1 = max(αn, nr) end if βr and not all successors of p searched tlower bound end return r, t end

# Kinds of nodes

A pv node has a return value that is exact.
The name is somewhat unfortunate, because a pv node is not necessarily part of the principal variation of the root or even the pv of its predecessor; only when move ordering is perfect is every pv node part of the pv of its predecessor. And when move ordering is perfect, every pv node is part of the pv of the root.

                           ╥
                           ║
   ╔═══════════════╦═══════╩═══════╦═══╶╶╶
   ║        beforeafter        ║
   ║         β +7β +7         ║
   ║         α −2α +5         ║
   ║              +5=              ║
╭╴╴╨╶╶╮    ╭╴╴╴╴╴╴╴╨╶╶╶╶╶╶╶╮    ╭╴╴╨╶╶╮
╷     ╷    ╷ β +2    r −5= ╷    ╷     ╷
╵     ╵    ╵ α −7    P−V   ╵    ╵     ╵
╰╴╴─╶╶╯    ╰╴╴╴╴╴╴╴─╶╶╶╶╶╶╶╯    ╰╴╴─╶╶╯

A Cut node has a score

                           ╥
                           ║
   ╔═══════════════╦═══════╩═══════╦═══╶╶╶
   ║        beforeafter        ║
   ║         β +7β +7         ║
   ║         α −2α −2         ║
   ║              −4↓              ║
╭╴╴╨╶╶╮    ╭╴╴╴╴╴╴╴╨╶╶╶╶╶╶╶╮    ╭╴╴╨╶╶╮
╷     ╷    ╷ β +2    r +4↑ ╷    ╷     ╷
╵     ╵    ╵ α −7    Cut   ╵    ╵     ╵
╰╴╴─╶╶╯    ╰╴╴╴╴╴╴╴─╶╶╶╶╶╶╶╯    ╰╴╴─╶╶╯

The score of a Cut node is unknown; the returned value is a bound. Specifically, the score of a Cut node is

An All node has a score

                           ╥
                           ║
   ╔═══════════════╦═══════╝
   ║        beforeafterβ +7β +7α −2α +9+9↑
╭╴╴╨╶╶╮    ╭╴╴╴╴╴╴╴╨╶╶╶╶╶╶╶╮
╷     ╷    ╷ β +2    r −9↓ ╷
╵     ╵    ╵ α −7    All   ╵
╰╴╴─╶╶╯    ╰╴╴╴╴╴╴╴─╶╶╶╶╶╶╶╯

The score of an All node is unknown; the returned value is a bound. Specifically, the score of an All node is

# What search trees look like

With perfect move ordering, search nodes have this structure:

       P−V           Cut           All
   ┌────┼────┐        │        ┌────┼────┐
  P−V  Cut  ...      All      Cut  ...  ...

where moves are searched left-to-right and an ellipsis indicates zero or more nodes of the preceding type.
(You should be able to horizontally scroll the diagrams if you have a narrow display and the diagrams are being clipped.)

With imperfect move ordering, they have this structure:

           P−V                              Cut                All
     ┌──────┼────┬────┬────┐          ┌──────┼────┐        ┌────┼────┐
  P−V/Cut  ...  P−V  Cut  ...      P−V/Cut  ...  All      Cut  ...  ...
                 ↑
             best move

A tree with perfect move ordering looks like:

                                          P−V
                      ┌───────────┬────────┼────────┐
                     P−V         Cut      Cut      Cut
          ┌──────┬────┼────┐      │        │        │
         P−V    Cut  Cut  Cut    All      All      All
   ┌───┬──┼──┐   │    │    │   ┌──┼──┐  ┌──┼──┐  ┌──┼──┐
  P−V  C  C  C  All  All  All  C  C  C  C  C  C  C  C  C
  ┌┼┐  │  │  │  ┌┼┐  ┌┼┐  ┌┼┐  │  │  │  │  │  │  │  │  │

Actual search trees are quite good: 90+% of Cut nodes have only a single successor and in the majority of pv nodes the first successor is the best move. This is partially explained by iterative deepening with a transposition table: the tree is continually being reëvaluated, and at each iteration the previously-calculated best responses are now ordered first. And since deeper searches are more likely to be accurate, the closer a node is to the root, the more likely it is for its first move to be ordered correctly. So the interior of the tree looks nearly ideal, but the leaves might be garbage (over the next few steps they will be massively refined, but no longer be leaf nodes). What's curious, though, is that the overall statistics are still good even given the fact that more than half of all nodes are leaf nodes.

# Predicting the kind of a successor node

If you believe you are an All node, you should predict your successors are Cut nodes.

If you believe you are a Cut node, you should predict your successors are All nodes.

If you believe you are a pv node, you should predict your first successor is a pv node and every subsequent successor is a Cut node. There are conditions under which you might revise your beliefs: