Principal Variation Search

See also minimax with α/β pruning.

Sections: Description, Proving a search will fail low, When to attempt a proof, Example, Terminology, Implementation, Improvement, Appendix

# Description

The term “principal variation search” is something of a misnomer because pvs is not a distinct search algorithm but α/β minimax with an additional pruning rule:

Before searching a successor, attempt to prove that the search will fail low (return a value beneath alpha), and if successful, skip the successor.

Let’s call the value of alpha before the n-th successor is searched α_n and the value of alpha after it is searched α_(n+1), and give the search of the n-th successor itself the name F_n. For the value returned by F_n, let’s write ret(F_n). Let’s refer to the parameters of a search S_n as Sα_n and Sβ_n, so Fα_n = −β and Fβ_n = −α_n.

# Proving a search will fail low

The proof that F_n will fail low takes the form of a zero-window search, Z_n, which fails low exactly when F_n would fail low. In particular, −ret(Z_n) < α_n implies −ret(F_n) < α_n. And −ret(F_n) < α_n implies α_(n+1) = α_n, and α_(n+1) is all we really need to determine.
The search F_n is referred to as the full-window search.

Z_n is an α/β minimax search with β lowered to α_n (so the parameters of Z_n are Zα_n = −α_n and Zβ_n = −α_n). However, it’s common for the cutoff criterion to be β ≤ r rather than β < r, in which case β is lowered to α_n + 1 (alpha plus one, for integer-valued scores) rather than α_n, and so you will almost exclusively see Zα_n = −α_n − 1 and Zβ_n = −α_n in implementations.

Typically, the same routine is used to perform F_n and Z_n, and this only requires a single bit of bookkeeping: a flag that indicates which tree is being searched.
(This is necessary because you can’t tell just from looking at α and β, because they may differ by 0 or 1 only incidentally.)
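The claim that Z_n and F_n reach the same fail-low verdict can be checked by brute force on small trees. The sketch below is only an illustration under stated assumptions: a position is a nested Python list, a terminal node is an int holding its evaluation, and the names alphabeta and fails_low_agree are hypothetical helpers, not part of any engine. It uses the cutoff criterion β < r.

```python
import itertools


def alphabeta(p, alpha0, beta):
    """Fail-soft negamax alpha/beta with the cutoff criterion beta < r."""
    if isinstance(p, int):            # terminal node: the int is its evaluation
        return p
    alpha, best = alpha0, None
    for s in p:
        retval = -alphabeta(s, -beta, -alpha)
        best = retval if best is None else max(best, retval)
        if beta < best:               # cutoff
            break
        alpha = max(alpha, retval)
    return best


def fails_low_agree(s, alpha_n, beta):
    """True when Z_n and F_n give the same fail-low verdict for successor s."""
    z = -alphabeta(s, -alpha_n, -alpha_n)   # zero-window search Z_n
    f = -alphabeta(s, -beta, -alpha_n)      # full-window search F_n
    return (z < alpha_n) == (f < alpha_n)


# Every successor of shape [[a, b], [c, d]] with leaf values in 0..3,
# checked against every window with alpha_n <= beta in -4..4.
SUCCESSORS = [[[a, b], [c, d]]
              for a, b, c, d in itertools.product(range(4), repeat=4)]
WINDOWS = [(a, b) for a in range(-4, 5) for b in range(a, 5)]
ALL_AGREE = all(fails_low_agree(s, a, b)
                for s in SUCCESSORS for a, b in WINDOWS)
```

ALL_AGREE being true on these trees is evidence for the property, not a proof; the argument itself is in the appendix.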

# When to attempt a proof

Zero-window pruning isn’t applicable in a zero-window tree (that is, within Z_n) because full-window and zero-window searches are equivalent in zero-window trees.

Typically, zero-window pruning is not attempted for the first successor of a node because the first successor is expected to meet or exceed alpha, in which case the proof will fail and F_n will have to be evaluated anyway. Since Z_n is expected to be useless, implementations simply begin F_n right away.

More correctly, the first successor meets or exceeds alpha in pv and Cut nodes with perfect ordering, but not in All nodes. It’d be reasonable, then, to attempt zero-window pruning for every successor of an expected-All node (including the first), but as far as I’m aware, no one has tried this. (We might expect it to be fairly useless since there shouldn’t be many All nodes in the search tree from the root if move ordering is accurate – they’d only appear in zero-window trees, as we’ll see below.)

If the first successor of an expected-pv node fails to meet alpha, the next successor is predicted to meet or raise alpha instead. Then Z₁ is expected to be useless (just as Z₀ was expected to be useless), which suggests an implementation oughtn’t attempt zero-window pruning yet but instead simply begin F₁ right away (and so on until a successor meets alpha). As far as I’m aware, no one has tried this.

Illustration

                                        P−V
                   ┏━━━━━━━━━━━━━━━━━━━━━┛────┬──────────┬─────┐
                   F                          Z          Z     Z
                  P−V                        Skp        Skp   Skp
   ┌───────────────┗━━━┓───────────┐
   F                   *           Z
  Cut                 P−V         Skp

Typically, zero-window pruning would be attempted for the node marked with the asterisk, and in this example the * would then be Z+F. But since the first successor failed to meet alpha, we would be justified in searching the node with a full window right away, and the * would be F.

This is an incorrect guess only when the expected-pv node is ultimately found to be an All node. (The reasoning for attempting zero-window pruning after the first successor fails low might be, “we tried the only reasonable successor and failed, so this position probably isn’t viable”).
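A minimal sketch of this untried variant, as I read the suggestion: within a full-window node, keep searching successors with a full window until one of them meets alpha, and only then start attempting proofs. The names search_delayed and met_alpha and the nested-list tree representation (an int is a terminal node's evaluation) are assumptions made for illustration. The cutoff criterion is β < r.

```python
INF = float("inf")


def minimax(p):
    """Plain negamax without pruning, used only as a reference value."""
    if isinstance(p, int):
        return p
    return max(-minimax(s) for s in p)


def search_delayed(p, alpha0, beta, zero_window=False):
    """Zero-window pruning that waits until some successor has met alpha."""
    if isinstance(p, int):
        return p
    alpha, best = alpha0, None
    met_alpha = False                 # has any successor met alpha yet?
    for s in p:
        if not zero_window and met_alpha:
            retval = -search_delayed(s, -alpha, -alpha, True)   # proof Z_n
            if retval < alpha:
                continue              # proof succeeded: skip this successor
        retval = -search_delayed(s, -beta, -alpha, zero_window)  # F_n
        best = retval if best is None else max(best, retval)
        if beta < best:
            break
        if retval >= alpha:           # compared against alpha before the update
            met_alpha = True
        alpha = max(alpha, retval)
    return best


tree = [[3, 5], [2, 9], [4, 1]]
value = search_delayed(tree, -INF, INF)
```

With an infinite window at the root the first successor always meets alpha, so there the variant behaves like common practice; the difference only shows up in nodes entered with a finite alpha that the early successors fail to reach.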

# Example

Note that these trees depict common practice, where the first successor is always searched with a full window (even within All nodes) and zero-window pruning is attempted for every successor after the first (even if the first successor failed to meet alpha).

Here is a minimax search tree with α/β pruning:

                                        P−V
                   ┏━━━━━━━━━━━━━━━━━━━━━┛─────┬─────────┬─────┐
                   ┃                           │         │     │
                  P−V                         Cut       Cut   Cut
   ┌───────────────┗━━━┓───────────┐       ┌───┴───┐     │   ┌─┴─┐
   │                   ┃           │       │       │     │   │   │
  Cut                 P−V         Cut     P−V     All   All Cut All
   │             ┏━━━━━┛─────┐     │     ┌─┴─┐   ┌─┴─┐
   │             ┃           │     │     │   │   │   │
  All           P−V         Cut   All   P−V Cut Cut Cut
 ┌─┴─┐   ┌───┬───┃───┬───┐   │   ┌─┴─┐
 │   │   │   │   ┃   │   │   │   │   │
Cut Cut P−V Cut P−V Cut Cut All Cut Cut

This is not a complete tree; the leaf nodes are omitted due to limited space.

Below is a minimax search tree of the same position with α/β and zero-window pruning (and the same move ordering). Nodes where a zero-window search was performed are marked with a Z or Z+F.

                                        P−V
                   ┏━━━━━━━━━━━━━━━━━━━━━┛────┬──────────┬─────┐
                   F                          Z          Z     Z
                  P−V                        Skp        Skp   Skp
   ┌───────────────┗━━━┓───────────┐
   F                  Z+F          Z
  Cut                 P−V         Skp
   │             ┏━━━━━┛─────┐
   F             F           Z
  All           P−V         Skp
 ┌─┴─┐   ┌───┬───┃───┬───┐
 F   Z   F   Z  Z+F  Z   Z
Cut Skp P−V Skp P−V Skp Skp

This tree has significantly fewer nodes than the first. However, there are also zero-window trees. These are the trees of the nine successful proofs:

                                              Cut       Cut   Cut
                                           ┌───┘         │   ┌─┘
                                           │             │   │
                                  Cut     All           All All
                                   │     ┌─┴─┐
                                   │     │   │
                            Cut   All   Cut Cut
                             │   ┌─┴─┐
                             │   │   │
    Cut     Cut     Cut Cut All Cut Cut

And one of the two failed proofs:

                      All
                 ┌─────┴─────┐
                 │           │
                Cut         Cut
         ┌───┬───┤           │
         │   │   │           │
        Cut Cut All         All

And the other failed proof:

All

(These last two are rooted at the positions marked Z+F.)

The trees of the successful proofs usually have fewer nodes than their corresponding subtrees in the minimax search without zero-window pruning, so if the number of failed proofs is sufficiently low, zero-window pruning can reduce the number of nodes that need to be examined overall.

# Terminology

The term “pv node” refers to nodes whose return values are exact. (Whether a node is a pv node or not is only known after searching, but we can predict that a node will be a pv node, in which case we might call it an “expected-pv node”.)

Confusingly, the term “pv node” is also commonly used to refer to a node in the search tree of the root that we do not attempt to prune with a zero-window search.
(In the example above, these are the nodes marked with an F only.)
Being a “pv node” in this sense is correlated with being an expected-pv node, but the two properties are distinct.
In fact, a “pv node” in this sense can be a pv node, Cut node, or All node!

I am rather unhappy about this usage because it frequently leads to people talking past each other, and so I’d encourage you to avoid using “pv node” to mean anything other than “node whose return value is exact”.

# Implementation

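The code below uses β < r as the cutoff criterion. Where a line is followed by a ※ mark, the text after the mark is what replaces that part of the line when the criterion is β ≤ r.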

Split search functions

define search(p, alpha0, beta)
  return eval(p) if terminal(p)
  mutable alpha = alpha0
  mutable best = none
  for s in successors(p) with-index n
    if n > 0
      retval = −zws(s, −alpha, −alpha)  ※ (s, −alpha−1, −alpha)
      next if retval < alpha
    end
    retval = −search(s, −beta, −alpha)
    best <- (max(best,  retval) unless n = 0 then retval)
    break if beta < best                ※ beta ≤ best
    alpha <- max(alpha, retval)
  end
  return (best unless best = none then alpha0 − 1)
end

define zws(p, alpha0, beta)
  return eval(p) if terminal(p)
  mutable alpha = alpha0
  mutable best = none
  for s in successors(p) with-index n
    retval = −zws(s, −beta, −alpha)
    best <- (max(best,  retval) unless n = 0 then retval)
    break if beta < best                ※ beta ≤ best
    alpha <- max(alpha, retval)
  end
  return best
end

Merged search functions

define search'(p, alpha0, beta, zero-window)
  return eval(p) if terminal(p)
  mutable alpha = alpha0
  mutable best = none
  for s in successors(p) with-index n
    if (not zero-window) and n > 0
      retval = −search'(s, −alpha, −alpha, true)  ※ (s, −alpha−1, −alpha)
      next if retval < alpha
    end
    retval = −search'(s, −beta, −alpha, zero-window)
    best <- (max(best,  retval) unless n = 0 then retval)
    break if beta < best                          ※ beta ≤ best
    alpha <- max(alpha, retval)
  end
  return (best unless best = none then alpha0 − 1)
end

define search(p, alpha0, beta)
  return search'(p, alpha0, beta, false)
end
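For readers who want something runnable, here is a rough Python rendering of the merged routine with the cutoff criterion β < r. It is a sketch under assumptions, not a reference implementation: positions are nested lists, a terminal node is an int holding its evaluation, and search_prime and minimax are names chosen here for illustration.

```python
INF = float("inf")


def minimax(p):
    """Plain negamax without pruning, used only as a reference value."""
    if isinstance(p, int):
        return p
    return max(-minimax(s) for s in p)


def search_prime(p, alpha0, beta, zero_window=False):
    """Alpha/beta with zero-window pruning; cutoff criterion beta < r."""
    if isinstance(p, int):
        return p
    alpha, best = alpha0, None
    for n, s in enumerate(p):
        if not zero_window and n > 0:
            # Attempt the proof Z_n that s fails low.
            retval = -search_prime(s, -alpha, -alpha, True)
            if retval < alpha:
                continue              # proved: skip s
        # Full-window search F_n (or the ordinary search inside a proof).
        retval = -search_prime(s, -beta, -alpha, zero_window)
        best = retval if best is None else max(best, retval)
        if beta < best:
            break
        alpha = max(alpha, retval)
    # best is None only if p had no successors at all.
    return best if best is not None else alpha0 - 1


tree = [[3, 5], [2, 9], [4, 1]]
value = search_prime(tree, -INF, INF)
```

Called with an infinite window at the root, the result should agree with minimax(tree); the zero_window flag plays the role of the single bit of bookkeeping described earlier.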

# Improvement

There’s a lovely improvement we can make: we can not only skip a successor when −ret(Z_n) < α_n, but also avoid evaluating F_n when −ret(Z_n) = α_n; that is, F_n is obviated when −ret(Z_n) ≤ α_n.

This works regardless of the cutoff criterion.

If the cutoff criterion is β < r: When −ret(Z_n) = α_n, the return value is exact; the score is known, so F_n is not needed.
If the cutoff criterion is β ≤ r: When −ret(Z_n) = α_n, whether the score is equal to α_n or less than α_n is unknown. But in the former case, F_n is not needed, and in the latter case, the successor should be skipped, and so we do not evaluate F_n in either case.

Merged search functions with β < r as the cutoff criterion

define search'(p, alpha0, beta, zero-window)
  return eval(p) if terminal(p)
  mutable alpha = alpha0
  mutable best = none
  for s in successors(p) with-index n
    if (not zero-window) and n > 0
      retval = −search'(s, −alpha, −alpha, true)
      next if retval < alpha
      if retval > alpha
        retval = −search'(s, −beta, −alpha, false)
      end
    else
      retval = −search'(s, −beta, −alpha, zero-window)
    end
    best <- (max(best,  retval) unless n = 0 then retval)
    break if beta < best
    alpha <- max(alpha, retval)
  end
  return (best unless best = none then alpha0 − 1)
end

define search(p, alpha0, beta)
  return search'(p, alpha0, beta, false)
end
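The β < r variant above can be exercised with a small Python sketch. As before, the nested-list tree representation (an int is a terminal node's evaluation) and the names search_improved and stats are assumptions made for illustration; the stats dictionary exists only to count how often a proof's value is reused in place of F_n.

```python
INF = float("inf")


def minimax(p):
    """Plain negamax without pruning, used only as a reference value."""
    if isinstance(p, int):
        return p
    return max(-minimax(s) for s in p)


def search_improved(p, alpha0, beta, zero_window=False, stats=None):
    """Cutoff criterion beta < r; reuses Z_n's value when it equals alpha."""
    if isinstance(p, int):
        return p
    alpha, best = alpha0, None
    for n, s in enumerate(p):
        if not zero_window and n > 0:
            retval = -search_improved(s, -alpha, -alpha, True, stats)
            if retval < alpha:
                continue              # proved: s fails low, skip it
            if retval > alpha:
                # Proof failed: the full-window search F_n is needed.
                retval = -search_improved(s, -beta, -alpha, False, stats)
            elif stats is not None:
                stats["reused"] += 1  # retval == alpha is exact, F_n avoided
        else:
            retval = -search_improved(s, -beta, -alpha, zero_window, stats)
        best = retval if best is None else max(best, retval)
        if beta < best:
            break
        alpha = max(alpha, retval)
    return best if best is not None else alpha0 - 1


# Both successors of the root are worth 3, so the second proof returns
# exactly alpha and its value is used directly.
stats = {"reused": 0}
tree = [[3, 5], [3, 7]]
value = search_improved(tree, -INF, INF, False, stats)
```

In this example the second root successor ties with the first, which is exactly the case where the proof's return value can stand in for the full-window search.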

Merged search functions with β ≤ r as the cutoff criterion

define search'(p, alpha0, beta, zero-window)
  return eval(p) if terminal(p)
  mutable alpha = alpha0
  mutable best = none
  for s in successors(p) with-index n
    if (not zero-window) and n > 0
      retval = −search'(s, −alpha−1, −alpha, true)
      next if retval ≤ alpha
    end
    retval = −search'(s, −beta, −alpha, zero-window)
    best <- (max(best,  retval) unless n = 0 then retval)
    break if beta ≤ best
    alpha <- max(alpha, retval)
  end
  return (best unless best = none then alpha0)
end

define search(p, alpha0, beta)
  return search'(p, alpha0, beta, false)
end

# Appendix

Here we present a proof that −ret(Z_n) < α_n implies −ret(F_n) < α_n by proving two more general propositions of which the implication is a corollary.

This section uses the notation described in the note on minimax with α/β pruning.

Prop 1. For any node p, the predicate pβ < pr is independent of pα₀.
Prop 2. For any node p, the predicate pr < pα₀ is independent of pβ.

Proof of Prop 1. For each successor t_n of p that is a terminal node (leaf), the return value t_nr is its evaluation, which does not depend on α₀, and therefore β < −t_nr does not depend on α₀.

For each successor s_n of p that is not a terminal node, we first note that s_nα₀ = −β and s_nβ = −α_n. We can apply Prop 2 to assert that s_nr < s_nα₀ is independent of s_nβ, that is, β < −s_nr is independent of α_n. And although α_n = −s_nβ may depend on α₀, the parameter s_nα₀ = −β does not, and these are the only two parameters of the search of s_n, so if β < −s_nr is independent of α_n it is also true that β < −s_nr is independent of α₀.

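Since r is the largest of the values −t_nr and −s_nr among the successors searched, β < r holds exactly when β < −t_nr or β < −s_nr holds for some n. Each of these predicates is independent of α₀ for all n, so we conclude that β < r is independent of α₀.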

Proof of Prop 2. This proceeds in an exactly analogous manner, so we omit it here.

The proofs of these two propositions are mutually recursive, and so they are only valid if they are well-founded. To finish, we require an additional but fairly reasonable assumption: that search trees are finite. If all search trees are finite, along any descending chain, there will be a node with only terminal nodes as successors, for which a proof does not make use of a further proof.
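For completeness, here is one way to spell out how the implication follows from Prop 1. This is my reading of the argument rather than something stated in the text: it writes s_n for the node that both Z_n and F_n search, and it assumes the cutoff criterion β < r.

```latex
% Z_n and F_n search the same node s_n with the same beta parameter
% and differ only in alpha_0:
%   Z_n:  s_n\alpha_0 = -\alpha_n,  s_n\beta = -\alpha_n
%   F_n:  s_n\alpha_0 = -\beta,     s_n\beta = -\alpha_n
\begin{align*}
  -\mathrm{ret}(Z_n) < \alpha_n
    &\iff s_n\beta < \mathrm{ret}(Z_n)
       && \text{since } s_n\beta = -\alpha_n \\
    &\iff s_n\beta < \mathrm{ret}(F_n)
       && \text{Prop 1: independent of } s_n\alpha_0 \\
    &\iff -\mathrm{ret}(F_n) < \alpha_n
       && \text{since } s_n\beta = -\alpha_n
\end{align*}
```

Read this way, Prop 1 gives the equivalence in both directions, which matches the earlier statement that Z_n fails low exactly when F_n does.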