A Simple Combo Problem

As is quite evident from the sparse content of this blog, I have a tendency to jump around from subject to subject. As of late I have been making a concerted effort to become a better coder, an effort that takes literally all of my energy. It is disappointing to let all the ideas I had for this site fall by the wayside, but perhaps new ideas can take their places. I ran across a fun problem while practicing the other day, and although it is an easy problem for the experienced coder I was happy to solve it, so here it is.

The problem is taken directly from Codeforces and can be found here, and if you click on the link to the “tutorial” on the lower right side of the page you can find their solution. I didn’t read it too carefully but it looks like more or less the same thing. But you might as well read mine, right?

If you didn’t click the link, here’s the problem statement (sans their creative background story).

Exercise 1 Given two integers {n} and {k} with {1 \leq n,k \leq 2000}. How many sequences of integers {b_1, b_2, \cdots b_k} are there such that {1 \geq b_1 \geq b_2 \geq \cdots b_k} and {b_i \mid b_{i+1}} for {i = 1,2,\cdots,k-1}?

 I stared at this problem for a while with no helpful ideas. I knew there should be a recurrence of some sort, and the idea that removing any element from a sequence of length {k} generates a sequence of length {k-1} sort of supported this idea. Ultimately though I realized that none of this was making use of the key property of the sequences, the divisor requirement. This requires that all elements of the sequence be divisors of the last element, {b_k}. This made me think of the sequence as a process of accumulating prime factors of {b_k}, and since the sequences are easily partitioned by their last element, I figured this should lead somewhere.

Slightly more formally, let {B(n,k)} be the number of valid sequences in the question and let {A(n,k)} be the number of valid sequences of length {k} where {b_k = n} (end with {n}). Clearly {B(n,k) = \sum_{i=1}^{n}{A(i,k)}}, so if we can compute {A(i,k)} in reasonable time we should be good. I actually am terrible at estimating the time it will take code to run, it is something I definitely need to learn. For now we will steer clear of the discussion of complexity with the confidence that the algorithm “looks probably good enough.”

So now our challenge is to compute {A(n,k)}, and as we mentioned we are going to need the prime factorization. Actually it turns out we only need the exponents of the distinct prime factors, but I realized this far too late and didn’t really feel like modifying the code. Computing prime factorization is probably a well-studied topic, but for our purposes the simple process of dividing out all prime factors works fine. Here’s the code.

def prime_factor(n):
    pf = []
    d = 2
    while d**2 <= n:
        exp = 0
        while n % d == 0:
            n /= d 
            exp += 1
        if exp > 0:    
            pf.append((d,exp)) #The prime doesn't actually matter, but whatever
        d += 1
    if n > 1:
        pf.append((n,1))
    return pf

So now we’ve got our prime factorization, but how do we use it? As I mentioned before I like to think of the sequence as the process of accumulating prime factors, so in the “gap” between each element of the sequence we throw in some new prime factors. We also create an artificial {1} at the beginning of the sequence since we can start from any divisor. Since numbers can repeat we aren’t required to put any prime factors in a given gap, and since the last number has to get to {n} we have to use all the prime factors eventually. Splitting them up by primes, say {n = p_1^{r_1}p_2^{r_2}\cdots p_s^{r_s}}, we can also just do the assignments of {p_1, p_2, \cdots, p_n} independently. There are {k} slots, one preceding each element of the sequence, we are permitted to have empty slots, and we must eventually assign all {r_i} {p_i}s. We recognize this as the number of nonnegative solutions to {x_1 + x_2 + \cdots + x_k = r_i} which is just {\binom{r_i+k-1}{k-1}}. Since the distinct prime factors can be dealt with independently this gives us our expression:

Proposition 1 If the prime factorization of {n} is {p_1^{r_1}p_2^{r_2}\cdots p_s^{r_s}}, then {A(n,k) = \displaystyle\prod_{i=1}^{s}{\binom{r_i+k-1}{k-1}}}.

The corresponding code shouldn’t be interesting, but apparently Python’s math library doesn’t feel like including a function to compute {\binom{n}{k}}. Writing one isn’t hard but I had to show off my impressive Python knowledge. Unfortunately doing so exposed the fact that I don’t know how to call the basic operators as functions, so it became slightly less than impressive.

def binom(n,k): #Because importing itertools or scipy seems overkill
    return reduce(mult, xrange(k+1,n+1)) / reduce(mult, xrange(1,n-k+1))

def mult(x,y): #To compensate for not knowing Python
    return x*y

The code for computing {B(n,k)} isn’t interesting at all, but I suppose it serves as a reasonable summary. Again it was hardly necessary to record the actual prime factors since we only need their exponents, but Python handled it nicely enough for me to not really care.

def B(n,k):
    s = 0
    for i in xrange(1,n+1):
        s += A(i,k)
    return s

def A(n,k): #Another function may have been unnecessary...
    pf = prime_factor(n)
    prod = 1
    for (p, exp) in pf:
        prod *= binom(exp+k-1, k-1)
    return prod

All in all there are a ton of improvements I could probably make. My binom function is mostly a joke, I could probably cache prime factorizations, etc. I was pretty happy with the solution though and it ran in well under the time limit on Codeforces, so futher optimization seemed a little unnecessary. I hope I can solve some more interesting problems as I improve as a coder – I say my goal is to improve coding but in reality my ability to design algorithms also lags far behind where I would like it. At the risk of being terribly mistaken once again, expect more of these types of posts in the future.