Principal Variation Search

See also minimax with α/β pruning.

Sections Description Proving a search will fail low When to attempt a proof Example Terminology Implementation Improvement Appendix

# Description

The term “principal variation search” is something of a misnomer because pvs is not a distinct search algorithm but α/β minimax with an additional pruning rule:

Before searching a successor, attempt to prove that the search will fail low (return a value beneath alpha), and if successful, skip the successor.

Let’s call the value of alpha before the n-th successor is searched αn, the value of alpha after the n-th successor is searched αn+1, and give the search of the n-th successor itself the name Fn. For the value returned by Fn, let’s write ret(Fn).
The α/β minimax update rule is then αn+1 = max(αn, −ret(Fn)).
Let’s refer to the parameters of a search Sn as Sαn and Sβn, so Fαn = −β and Fβn = −αn.

# Proving a search will fail low

The proof that Fn will fail low takes the form of a zero-window search, Zn, which has the property of failing low exactly when Fn will fail low. That is, −ret(Zn) < αn implies −ret(Fn) < αn. And −ret(Fn) < αn implies αn+1 = αn, and αn+1 is all we really need to determine.
The search Fn is referred to as the full-window search.

Zn is an α/β minimax search with β lowered to αn (so the parameters of Zn are Zαn = −αn and Zβn = −αn). However, it’s common for the cutoff criterion to be βαn rather than β < αn, in which case β is lowered to αn+1 rather than αn, and so you will almost exclusively see Zαn = −αn1 and Zβn = −αn in implementations.

Typically, the same routine is used to perform Fn and Zn, and this only requires a single bit of bookkeeping: a flag that indicates which tree is being searched.
(This is necessary because you can’t tell just from looking at α and β, because they may differ by 0 or 1 only incidentally.)

# When to attempt a proof

Zero-window pruning isn’t applicable in a zero-window tree (that is, within Zn) because full-window and zero-window searches are equivalent in zero-window trees.

Typically, zero-window pruning is not attempted for the first successor of a node because the first successor is expected to meet or exceed alpha, in which case the proof will fail and Fn will have to be evaluated anyway. Since Zn is expected to be useless, implementations simply begin Fn right away.

More correctly, the first successor meets or exceeds alpha in pv and Cut nodes with perfect ordering, but not in All nodes. It’d be reasonable, then, to attempt zero-window pruning for every successor of an expected-All node (including the first), but as far as I’m aware, no one has tried this. (We might expect it to be fairly useless since there shouldn’t be many All nodes in the search tree from the root if move ordering is accurate – they’d only appear in zero-window trees, as we’ll see below.)

If the first successor of an expected-pv node fails to meet alpha, the next successor is predicted to meet or raise alpha instead. Then Z1 is expected to be useless (just as Z0 was expected to be useless), which suggests an implementation oughtn’t attempt zero-window pruning yet but instead simply begin F1 right away (and so on until a successor meets alpha). As far as I’m aware, no one has tried this.

Illustration
                                        P−V
                   ┏━━━━━━━━━━━━━━━━━━━━━┛────┬──────────┬─────┐
                   F                          Z          Z     Z
                  P−V                        Skp        Skp   Skp
   ┌───────────────┗━━━┓───────────┐
   F                   *           Z
  Cut                 P−V         Skp

Typically, zero-window pruning would be attempted for the node marked with the asterisk, and in this example the * would then be Z+F, but since the first successor failed to meet alpha, we would be justified in searching the node with a full-window right away, and the * would be F.

This is an incorrect guess only when the expected-pv node is ultimately found to be an All node. (The reasoning for attempting zero-window pruning after the first successor fails low might be, “we tried the only reasonable successor and failed, so this position probably isn’t viable”).

# Example

Note these depict common practice, where the first successor is always searched with a full-window (even within All nodes) and zero-window pruning is attempted for every successor after the first (even if the first successor failed to meet alpha).

Here is a minimax search tree with α/β pruning:

                                        P−V
                   ┏━━━━━━━━━━━━━━━━━━━━━┛─────┬─────────┬─────┐
                   ┃                           │         │     │
                  P−V                         Cut       Cut   Cut
   ┌───────────────┗━━━┓───────────┐       ┌───┴───┐     │   ┌─┴─┐
   │                   ┃           │       │       │     │   │   │
  Cut                 P−V         Cut     P−V     All   All Cut All
   │             ┏━━━━━┛─────┐     │     ┌─┴─┐   ┌─┴─┐
   │             ┃           │     │     │   │   │   │
  All           P−V         Cut   All   P−V Cut Cut Cut
 ┌─┴─┐   ┌───┬───┃───┬───┐   │   ┌─┴─┐
 │   │   │   │   ┃   │   │   │   │   │
Cut Cut P−V Cut P−V Cut Cut All Cut Cut

This is not a complete tree; the leaf nodes are omitted due to limited space.

Below is a minimax search tree of the same position with α/β and zero-window pruning (and the same move ordering). Nodes where a zero-window search was performed are marked with a Z or Z+F.

                                        P−V
                   ┏━━━━━━━━━━━━━━━━━━━━━┛────┬──────────┬─────┐
                   F                          Z          Z     Z
                  P−V                        Skp        Skp   Skp
   ┌───────────────┗━━━┓───────────┐
   F                  Z+F          Z
  Cut                 P−V         Skp
   │             ┏━━━━━┛─────┐
   F             F           Z
  All           P−V         Skp
 ┌─┴─┐   ┌───┬───┃───┬───┐
 F   Z   F   Z  Z+F  Z   Z
Cut Skp P−V Skp P−V Skp Skp

This tree has significantly fewer nodes than the first. However, there are also zero-window trees. These are the trees of the nine successful proofs:

                                              Cut       Cut   Cut
                                           ┌───┘         │   ┌─┘
                                           │             │   │
                                  Cut     All           All All
                                   │     ┌─┴─┐
                                   │     │   │
                            Cut   All   Cut Cut
                             │   ┌─┴─┐
                             │   │   │
    Cut     Cut     Cut Cut All Cut Cut

And one of the two failed proofs:

                      All
                 ┌─────┴─────┐
                 │           │
                Cut         Cut
         ┌───┬───┤           │
         │   │   │           │
        Cut Cut All         All

And the other failed proof:

                All

(These last two are rooted at the positions marked Z+F.)

The trees of the successful proofs usually have fewer nodes than their corre­sponding subtrees in the minimax search without zero-window pruning, so if the number of failed proofs is sufficiently low, zero-window pruning can reduce the number of nodes that need to be examined overall.

# Terminology

The term “pv node” refers to nodes whose return values are exact. (Whether a node is a pv node or not is only known after searching, but we can predict that a node will be a pv node, in which case we might call it an “expected-pv node”.)

Confusingly, the term “pv node” is also commonly used to refer to a node in the search tree of the root that we do not attempt to prune with a zero-window search.
(In the example above, these are the nodes marked with an F only.)
Being a “pv node” in this sense is correlated with being an expected-pv node, but the two properties are distinct.

I am rather unhappy about this usage because it frequently leads to people talking past each other, and so I’d encourage you to avoid using “pv node” to mean anything other than “node whose return value is exact”.

# Implementation

The mark indicates differences between cutoff conventions.

Split search functions

define search(p, alpha0, beta)
  return eval(p) if terminal(p)
  mutable alpha = alpha0
  mutable best = none
  for s in successors(p) with-index n
    break if beta < alpha               ※ beta ≤ alpha
    if n > 0
      retval = zws(s, alpha, alpha)  ※ (s, −alpha−1, −alpha)
      next if retval < alpha
    end
    retval = search(s, beta, alpha)
    best  <- (max(best,  retval) unless n = 0 then retval)
    alpha <-  max(alpha, retval)
  end
  return (best unless best = none then alpha0)
end

define zws(p, alpha0, beta)
  return eval(p) if terminal(p)
  mutable alpha = alpha0
  mutable best = unknown
  for s in successors(p) with-index n
    break if beta < alpha               ※ beta ≤ alpha
    retval = zws(s, beta, alpha)
    best  <- (max(best,  retval) unless n = 0 then retval)
    alpha <-  max(alpha, retval)
  end
  return best
end

Merged search functions

define search'(p, alpha0, beta, zero-window)
  return eval(p) if terminal(p)
  mutable alpha = alpha0
  mutable best = none
  for s in successors(p) with-index n
    break if beta < alpha                         ※ beta ≤ alpha
    if (not zero-window) and n > 0
      retval = search'(s, alpha, alpha, true)  ※ (s, −alpha−1, −alpha)
      next if retval < alpha
    end
    retval = search'(s, beta, alpha, zero-window)
    best  <- (max(best,  retval) unless n = 0 then retval)
    alpha <-  max(alpha, retval)
  end
  return (best unless best = none then alpha0)
end

define search(p, alpha0, beta, zero-window)
  return search'(p, alpha0, beta, false)
end

# Improvement

There’s a lovely improvement we can make: we can not only skip a successor when −ret(Zn) < αn, but also avoid evaluating Fn when −ret(Zn) = αn; that is, Fn is obviated when −ret(Zn) ≤ αn.

This works regardless of the cutoff criterion.

Merged search functions with β < αn as the cutoff criterion

define search'(p, alpha0, beta, zero-window)
  return eval(p) if terminal(p)
  mutable alpha = alpha0
  mutable best = none
  for s in successors(p) with-index n
    break if beta < alpha
    if (not zero-window) and n > 0
      retval = search'(s, alpha, alpha, true)
      next if retval < alpha
      if retval > alpha
        retval = search'(s, beta, alpha, false)
      end
    else
      retval = search'(s, beta, alpha, zero-window)
    end
    best  <- (max(best,  retval) unless n = 0 then retval)
    alpha <-  max(alpha, retval)
  end
  return (best unless best = none then alpha0)
end

define search(p, alpha0, beta, zero-window)
  return search'(p, alpha0, beta, false)
end

Merged search functions with βαn as the cutoff criterion

define search'(p, alpha0, beta, zero-window)
  return eval(p) if terminal(p)
  mutable alpha = alpha0
  mutable best = none
  for s in successors(p) with-index n
    break if beta  alpha
    if (not zero-window) and n > 0
      retval = search'(s, alpha1, alpha, true)
      next if retval  alpha
    end
    retval = search'(s, beta, alpha, zero-window)
    best  <- (max(best,  retval) unless n = 0 then retval)
    alpha <-  max(alpha, retval)
  end
  return (best unless best = none then alpha0)
end

define search(p, alpha0, beta, zero-window)
  return search'(p, alpha0, beta, false)
end

# Appendix

Here we present a proof that −ret(Zn) < αn implies −ret(Fn) < αn by proving two more general propositions of which the implication is a corollary.

This section uses the notation described in the note on minimax with α/β pruning.

Prop 1. For any node p, the predicate pβ < pr is independent of pα0 if pα0pβ.
Prop 2. For any node p, the predicate pr < pα0 is independent of pβ if pα0pβ.

Proof of Prop 1. For each successor tn of p that is a terminal node (leaf), the return value tnr is its evaluation, which does not depend on α0, and therefore β < −tnr does not depend on α0.

For each successor sn of p that is not a terminal node, we first note (*) that snα0 = −β and snβ = −αn. Crucially, we know (**) that αnβ because

From (*) and (**) we then have snα0snβ, and can apply Prop 2 to assert that snr < snα0 is independent of snβ, that is, β < −snr is independent of αn. And although αn = −snβ may depend on α0, the parameter snα0 = −β does not, and these are the only two parameters of the search of sn, so if β < −snr is indepen­dent of αn it is also true that β < −snr is indepen­dent of α0.

Since β < −tnr or β < −snr is independent of α0 for all n, we conclude that β < r is independent of α0.

Proof of Prop 2. This proceeds in an exactly analagous manner, so we omit it here.

The proofs of these two propositions are mutually recursive, and so they are only valid if they are well-founded. To finish, we require an additional but fairly reason­able assumption: that search trees are finite. If all search trees are finite, along any descending chain, there will be a node with only terminal nodes as successors, for which a proof does not make use of a further proof.