Combinatorial Enumeration (aka Counting Stuff)

This week, we will learn how to count! Counting problem are a regular occurence in programming competition problems. Here, we will cover all of the basic tools needed to count.

Modular arithmetic

In school, you probably learned how to count to 10, then 100, and then even higher. Most of the time, counting problems in programming competitions will require you to count much much higher. So high in fact that the answers often can not be stored in our usual 32- and 64-bit integers! To overcome this limitation, counting problems frequently ask that the answer be given modulo some number. This means that we report the remainder of the answer when divided by the given number, usually (but not always) a large prime number. Since the intermediate calculations that we make to get the answer will themselves probably not fit in a 64-bit integer either, it is important that we do all intermediate calcuations in modular arithmetic as well.

Moduar exponentiation

Perhaps one of the most common tools needed for counting problems are large powers computations. You may have seen the binary exponentiation or exponentiation by squaring technique before, which is the usual way to compute large powers quickly. Since these numbers will very quickly exceed the capacity of a 64-bit integer, we usually want them done modulo. Here is some pseudocode that does this. The following function computes \(a^b \mod m\).

// Computes a^b mod m
function expmod(a, b, m) {
  res = 1 % m;
  a = a % m;
  while (b > 0) {
    if (b % 2 == 1) res = res*a % m
    a = a*a % m
    b = b / 2
  }
  return res;
}

Note that the intermediate calculations may be up to \(m^2\) in size, so make sure that number fits in the datatype that you are using. To be safe, it is often good to use 64-bit integers even if \(m\) fits in a 32-bit integer, just in case \(m^2 \) does not.

Here's a recursive C++ implementation:


typedef long long ll;  // 64 bit type

ll expmod (ll a, ll b, ll m) {
  if (b == 0) return 1;
  ll x = expmod (a, b>>1, m);
  ll xx = (x * x) % m;
  if ((b & 1) == 0) return xx; else return (a*xx) % m;
}

Modular inverses

While addition, multipliciation and subtraction are all easy in modular arithmetic, division is not. The reason for this is because although \( (a + b) \mod m = ((a \mod m) + (b \mod m)) \mod m \), and the same for multipliciation and subtraction (being careful with negatives), the equivalent is not true for division. In fact, depending on the operands and the modulus, the inverse is not even guaranteed to exist!

Formally, the inverse of an integer \( x \) modulo \( m \) is an integer \( y \) such that \( x \times y \mod m = 1 \). Thankfully, the inverse is guaranteed to exist if \( x \) and \( m \) are co-prime. This will always be true if \( m \) is a prime number, which it usually is in programming competitions. Problems where the modulus is not prime are often significantly harder for this reason, but we won't cover those for now. One way to compute modular inverses is to use Euler's theorem. Don't worry about understanding the math for now, just know that if \( m \) is prime, then we can compute the inverse of \( x \) as \( x^{m-2} \mod m \). This can be done using the modular exponentiation algorithm given above.

// Computes the modular inverse of x, modulo m
// Assumes that m is a prime number
function inverse(x, m) {
  return expmod(x, m-2, m)
}

If \( m \) is not prime, an alternative is to use the Extended Euclidean Algorithm.

Counting tools

Let's talk about the fundamental tools used in counting problems. One of the main tools that we will use are selection probems. In a selection problem, we want to count the number of ways in which we can choose \( k \) items from a set of \( n \) items. There are four main variants of the problem: We must consider whether the order of the items chosen matters (ordered versus unordered selections), and we must decide whether we can select the same item multiple times (with replacement versus without replacement).

Powers -- Counting ordered selections with replacement

Powers allow us to count ordered selections with replacement. Suppose there are \( n \) items to choose from, and you need to select \( k \) of them. If you are allowed to select the same item multiple times, and the order of your choices matters, then the number of possible ways that you can act is \( n^k \).

Factorials -- Counting permutations

A permutation of a sequence is another sequence containing the same elements but in a potentially different order. A frequent necessity in counting problems is to count the number of distinct permutations of a sequence. If the sequence contains \( n \) distinct items, then the number of permutations is \( n! \) (\( n \) factorial), which is defined as $$ n! = n \times (n - 1) \times (n - 2) \times ... \times 2 \times 1. $$

In programming competition problems, it is common that you will need to compute the factorials of many different numbers. To save time, it is very often lucrative to precompute them so that they can be looked up in constant time

// Precompute all factorials up to max_n, modulo m
function precompute(max_n, m) {
  fact = array of size (max_n+1)
  fact[0] = fact[1] = 1
  for (n = 2; n <= max_n; n++) {
    fact[n] = (fact[n-1] * n) % m
  }
}

// Return n!, modulo m (as it was precomputed)
function factorial(n) {
  return fact[n]
}

If a sequence has duplicate elements then \( n! \) will overcount the number of permutations since some of them will look the same. To account for this, simply divide the answer (using modular inverses) by the factorials of the number of occurrences of each element.

Subfactorials -- Counting derangements

A derangement of a sequence of distinct elements is a permutation such that no element ends up in its original position. The number of derangements of a sequence of length \( n \) is called the subfactorial of \( n \), written \( !n \). Derrangements can be computed using the following recurrence relation $$ !n = (n-1)(!(n-1)+!(n-2)) $$ Like factorials, it is often useful to precompute subfactorials before using them in order to save on redundant computation time.

Binomial coefficients -- Counting combinations (unordered selections without replacement)

Binomial coefficients tell us the number of ways that we can select \( k \) elements from \( n \) choices without picking the same element twice, and without the order of the selections being important. $$ \binom{n}{k} $$ There are many ways to compute binomial coefficients. Let's look at the two most common. The first way is to use the following formula $$ \binom{n}{k} = \frac{n!}{(n-k)! k!} $$ which we can evaluate using our precomputed factorials from earlier:

// Compute Binomial(n,k) modulo m
function binomial(n, k, m) {
  return ((factorial(n) * inverse(factorial(n-k), m) % m) * inverse(factorial(k), m)) % m
}

This could be sped up by also precomputing the inverse factorials, enabling binomial coefficients to be computed in constant time. Computing the inverses each time might be slow if lots of them are needed.

The second method is to evaluate the following recurrence relation and store the results in a table, like in dynamic programming: $$ \binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1} $$ The base cases are \( \binom{n}{0} = \binom{n}{n} = 1 \).

// Compute Binomial(n,k) mod m for all n <= max_n and k <= max_k
function precompute(max_n, max_k, m) {
  binom = table of size (max_n+1) * (max_k+1)
  for (n = 0; n <= max_n; n++) binom[n][0] = 1
  for (n = 1; n <= max_n; n++) {
    for (k = 1; k <= n && k <= max_k; k++) {
      binom[n][k] = (binom[n-1][k] + binom[n-1][k-1]) % m
    }
  }
}

Catalan numbers -- counting balanced bracket sequences and lots of other things

The Catalan numbers are a famous number sequence because they count an extraordinary number of different combinatorial objects (at least 66 according to a book by Richard P. Stanley), so it is worth knowing about them. One of the many interesting applications of the Catalan numbers is counting the number of balanced bracket sequences consisting of \(n\) pairs of brackets. A sequence of \(n\) pairs of brackets (left or right) is balanced if every prefix of the sequence has at least as many opening brackets as closing ones. Catalan numbers can be computed in terms of binomial coefficients with the formula $$ C_n = \frac{1}{n+1}\binom{2n}{n}. $$ Be sure to use modular inverses for computing this formula!

Some useful counting formulas

Counting arrangements (ordered selections without replacement)

When counting combinations (with binomial coefficients), the order of the items selected did not matter. Ordered combinations, i.e. ordered selections without replacement, are sometimes called arrangements. The number of arrangements of \( k \) items selected from \( n \) items can be computed as $$ \binom{n}{k} k! = \frac{n!}{(n-k)!} $$

Balls and bins -- Counting unordered selections with replacement

The final of the four selection problems is unordered selections with replacement. This problem is often called the balls and bins problem. Suppose you have \( n \) indistinguishable balls and you wish to place each of them into one of \( k \) labelled bins. How many different ways can I place the balls? Using a technique called stars and bars, we can derive the following formula: $$\binom{n+k-1}{k-1}.$$

Balls and bins with no empty bins

For a related problem, where we require that every bin gets at least one ball, the answer is $$\binom{n-1}{k-1}.$$

The \(PQ^{-1}\) Representaton for Rational Numbers

In problems where you're required to output a probability (or in general a rational number), it's fairly common to see something like this as the output requirement for a problem:

If the answer is an irreducible fraction \(P/Q\), print the value of \(PQ^{-1}\) in the prime field of integers modulo 1000000007 (\(10^9+7\)). It is guaranteed that this modulo does not divide \(Q\), thus the number to be printed is well-defined.

Although this at first seems pretty cryptic, what it's saying is that if you, throughout your computation, use \(PQ^{-1}\) to represent the rational \(P/Q\) then you can can substitute multiplying modulo \(p\) for multiplying two rationals, and substitute addition modulo \(p\) for adding two rationals.

Let's be a bit more formal. Fix a prime \(p\). Suppose we define \(F()\) as follows $$ F(a,b) = ((a \bmod p) * (b^{-1} \bmod p)) \bmod p $$ Note that \(F(a,b)\) is only defined when \(b\) is not divisible by \(p\). So that is a restriction that holds throughout.

Now you can prove the following properties of this function:

(1) If \({a \over b} = {c \over d}\) then \(F(a,b) = F(c,d)\)

(2) \(F({a \over b} + {c \over d}) = (F(a,b) + F(c,d)) \bmod p\)

(3) \(F({a \over b} * {c \over d}) = (F(a,b) * F(c,d)) \bmod p\)

I will not prove these facts here. You can work it out just by applying the definition of \(F()\) and rules for adding and multiplying fractions, and properties of modular arithmetic.

The upshot is that, as long as you know that \(p\) never divides any of the denominators of the fractions that occur, you can substitute adding and multiplying these rationals with adding and multiplying the \(F()\) representations of them modulo \(p\).


Danny Sleator
Last modified: Thu Mar 25 10:43:13 2021