This week, we will learn how to count! Counting problems are a regular occurrence in programming competition problems. Here, we will cover all of the basic tools needed to count.

In school, you probably learned how to count to 10, then 100, and then even higher.
Most of the time, counting problems in programming competitions will require you
to count much, much higher. So high, in fact, that the answers often cannot be stored
in our usual 32- and 64-bit integers! To overcome this limitation, counting problems
frequently ask that the answer be given *modulo* some number. This means that we
report the remainder of the answer when divided by the given number, usually (but
not always) a large prime number. Since the intermediate calculations that we make
to get the answer will themselves probably not fit in a 64-bit integer either, it
is important that we do all intermediate calculations in modular arithmetic as well.

Perhaps one of the most common tools needed for counting problems is the computation
of large powers. You may have seen the *binary exponentiation* or *exponentiation by squaring*
technique before, which is the usual way to compute large powers quickly. Since these
numbers will very quickly exceed the capacity of a 64-bit integer, we usually
compute them modulo \(m\). Here is some pseudocode that does this. The following function
computes \(a^b \mod m\).

```
// Computes a^b mod m
function expmod(a, b, m) {
    res = 1 % m
    a = a % m
    while (b > 0) {
        if (b % 2 == 1) res = res*a % m
        a = a*a % m
        b = b / 2
    }
    return res
}
```

Note that the intermediate calculations may be up to \(m^2\) in size, so make sure that number fits in the datatype that you are using. To be safe, it is often good to use 64-bit integers even if \(m\) fits in a 32-bit integer, just in case \(m^2\) does not.

Here's a recursive C++ implementation:

```
typedef long long ll; // 64-bit type
ll expmod(ll a, ll b, ll m) {
    a %= m;                 // reduce a so that a*xx below stays under m^2
    if (b == 0) return 1 % m;
    ll x = expmod(a, b >> 1, m);
    ll xx = (x * x) % m;
    if ((b & 1) == 0) return xx;
    else return (a * xx) % m;
}
```

While addition, multiplication and subtraction are all easy in modular arithmetic, division is not. The reason is that although \( (a + b) \mod m = ((a \mod m) + (b \mod m)) \mod m \), and similarly for multiplication and subtraction (being careful with negatives), the equivalent is not true for division. In fact, depending on the operands and the modulus, the inverse is not even guaranteed to exist!

Formally, the inverse of an integer \( x \) modulo \( m \) is an integer \( y \) such that \( x \times y \mod m = 1 \). Thankfully, the inverse is guaranteed to exist if \( x \) and \( m \) are co-prime. This will always be true if \( m \) is a prime number (and \( x \) is not a multiple of it), which it usually is in programming competitions. Problems where the modulus is not prime are often significantly harder for this reason, but we won't cover those for now. One way to compute modular inverses is to use Fermat's little theorem (a special case of Euler's theorem). Don't worry about understanding the math for now; just know that if \( m \) is prime, then we can compute the inverse of \( x \) as \( x^{m-2} \mod m \). This can be done using the modular exponentiation algorithm given above.

```
// Computes the modular inverse of x, modulo m
// Assumes that m is a prime number
function inverse(x, m) {
    return expmod(x, m-2, m)
}
```

If \( m \) is not prime, an alternative is to use the Extended Euclidean Algorithm.

Let's talk about the fundamental tools used in counting problems. One of the main tools that we will
use are *selection problems*. In a selection problem, we want to count the number of ways in which
we can choose \( k \) items from a set of \( n \) items. There are four main variants of the problem:
We must consider whether the order of the items chosen matters (ordered versus unordered selections),
and we must decide whether we can select the same item multiple times (with replacement versus without
replacement).

Powers allow us to count ordered selections with replacement. Suppose there are \( n \) items to choose from, and you need to select \( k \) of them. If you are allowed to select the same item multiple times, and the order of your choices matters, then the number of possible ways that you can act is \( n^k \).

A *permutation* of a sequence is another sequence containing the same elements but
in a potentially different order. A frequent necessity in counting problems is to
count the number of distinct permutations of a sequence. If the sequence contains
\( n \) distinct items, then the number of permutations is \( n! \) (\( n \) *factorial*),
which is defined as
$$
n! = n \times (n - 1) \times (n - 2) \times ... \times 2 \times 1.
$$

In programming competition problems, it is common that you will need to compute the factorials of many different numbers. To save time, it is very often lucrative to precompute them so that they can be looked up in constant time.

```
// Precompute all factorials up to max_n, modulo m
function precompute(max_n, m) {
    fact = array of size (max_n+1)
    fact[0] = fact[1] = 1
    for (n = 2; n <= max_n; n++) {
        fact[n] = (fact[n-1] * n) % m
    }
}

// Return n!, modulo m (as it was precomputed)
function factorial(n) {
    return fact[n]
}
```

If a sequence has duplicate elements then \( n! \) will overcount the number of permutations since some of them will look the same. To account for this, simply divide the answer (using modular inverses) by the factorials of the number of occurrences of each element.

A derangement of a sequence of distinct elements is a permutation such that no element ends up in its
original position. The number of derangements of a sequence of length \( n \) is called the
*subfactorial* of \( n \), written \( !n \). Derangements can be computed using the
following recurrence relation
$$
!n = (n-1)(!(n-1)+!(n-2)),
$$
with base cases \( !0 = 1 \) and \( !1 = 0 \).
Like factorials, it is often useful to precompute subfactorials before using them in order to save on
redundant computation time.

Binomial coefficients tell us the number of ways that we can select \( k \) elements from \( n \) choices without picking the same element twice, and without the order of the selections being important. This quantity is denoted $$ \binom{n}{k}. $$ There are many ways to compute binomial coefficients. Let's look at the two most common. The first way is to use the following formula $$ \binom{n}{k} = \frac{n!}{(n-k)! k!} $$ which we can evaluate using our precomputed factorials from earlier:

```
// Compute Binomial(n,k) modulo m
function binomial(n, k, m) {
    return ((factorial(n) * inverse(factorial(n-k), m) % m) * inverse(factorial(k), m)) % m
}
```

Computing the inverses each time might be slow if many of them are needed. This can be sped up by also precomputing the inverse factorials, enabling binomial coefficients to be computed in constant time.

The second method is to evaluate the following recurrence relation and store the results in a table, like in dynamic programming: $$ \binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1} $$ The base cases are \( \binom{n}{0} = \binom{n}{n} = 1 \).

```
// Compute Binomial(n,k) mod m for all n <= max_n and k <= max_k
function precompute(max_n, max_k, m) {
    binom = table of size (max_n+1) * (max_k+1)
    for (n = 0; n <= max_n; n++) binom[n][0] = 1
    for (n = 1; n <= max_n; n++) {
        for (k = 1; k <= n && k <= max_k; k++) {
            binom[n][k] = (binom[n-1][k] + binom[n-1][k-1]) % m
        }
    }
}
```

The Catalan numbers are a famous number sequence because they count an extraordinary number of different combinatorial objects (at least 66 according to a book by Richard P. Stanley), so it is worth knowing about them. One of the many interesting applications of the Catalan numbers is counting the number of balanced bracket sequences consisting of \(n\) pairs of brackets. A sequence of \(n\) pairs of brackets (left or right) is balanced if every prefix of the sequence has at least as many opening brackets as closing ones. Catalan numbers can be computed in terms of binomial coefficients with the formula $$ C_n = \frac{1}{n+1}\binom{2n}{n}. $$ Be sure to use modular inverses for computing this formula!

When counting combinations (with binomial coefficients), the order of the items
selected did not matter. Ordered combinations, i.e. ordered selections without
replacement, are sometimes called *arrangements*. The number of arrangements of \( k \) items
selected from \( n \) items can be computed as
$$
\binom{n}{k} k! = \frac{n!}{(n-k)!}
$$

The last of the four selection problems is unordered selections with replacement. This problem is often called the balls and bins problem. Suppose you have \( n \) indistinguishable balls and you wish to place each of them into one of \( k \) labelled bins. How many different ways can you place the balls? Using a technique called stars and bars, we can derive the following formula: $$\binom{n+k-1}{k-1}.$$

For a related problem, where we require that every bin gets at least one ball, the answer is $$\binom{n-1}{k-1}.$$

In problems where you're required to output a probability (or in general a rational number), it's fairly common to see something like this as the output requirement for a problem:

If the answer is an irreducible fraction \(P/Q\), print the value of \(PQ^{-1}\) in the prime field of integers modulo 1000000007 (\(10^9+7\)). It is guaranteed that this modulo does not divide \(Q\), thus the number to be printed is well-defined.

Although this at first seems pretty cryptic, what it's saying is that if you, throughout your computation, use \(PQ^{-1}\) to represent the rational \(P/Q\), then you can substitute multiplication modulo \(p\) for multiplying two rationals, and substitute addition modulo \(p\) for adding two rationals.

Let's be a bit more formal. Fix a prime \(p\), and define \(F\) as follows: $$ F(a,b) = ((a \bmod p) * (b^{-1} \bmod p)) \bmod p. $$ Note that \(F(a,b)\) is only defined when \(b\) is not divisible by \(p\), so that restriction holds throughout.

Now you can prove the following properties of this function:

(1) If \({a \over b} = {c \over d}\) then \(F(a,b) = F(c,d)\)

(2) If \({a \over b} + {c \over d} = {e \over f}\) then \(F(e,f) = (F(a,b) + F(c,d)) \bmod p\)

(3) If \({a \over b} * {c \over d} = {e \over f}\) then \(F(e,f) = (F(a,b) * F(c,d)) \bmod p\)

I will not prove these facts here. You can work them out just by applying the definition of \(F\), the rules for adding and multiplying fractions, and the properties of modular arithmetic.

The upshot is that, as long as you know that \(p\) never divides any of the denominators of the fractions that occur, you can substitute adding and multiplying these rationals with adding and multiplying the \(F()\) representations of them modulo \(p\).

Danny Sleator Last modified: Thu Mar 25 10:43:13 2021