Next permutation: When C++ gets it right

来源：互联网发布：户外运动人群数据分析编辑：程序博客网时间：2024/05/22 04:38

The Next Number Problem

Suppose you have a fixed list of digits chosen from the range 1..9. What numbers can you make with them? You’re allowed as many zeros as you want. Write the numbers in increasing order.

Exactly this puzzle came up in the recent Google Code Jam programming contest:

You are writing out a list of numbers. Your list contains all numbers with exactly D_i digits in its decimal representation which are equal to i, for each i between 1 and 9, inclusive. You are writing them out in ascending order.
For example, you might be writing every number with two ‘1’s and one ‘5’. Your list would begin 115, 151, 511, 1015, 1051.
Given N, the last number you wrote, compute what the next number in the list will be.

The competition has closed now, but if you’d like to give it a go sample input files can be found on the website, where you can also upload your results and have them checked.

Here’s a short section from a trial I ran on my computer. Input numbers are in the left-hand column: the corresponding output numbers are in the right-hand column.

50110812884911623516 → 5011081288491162356182454322474161687049 → 8245432247416168709482040229261723155710 → 8204022926172315701543888989554234187388 → 4388898955423418783876080994872481480636 → 7608099487248148066331000989133449480678 → 3100098913344948068720347716554681051891 → 20347716554681051918

Choice of Algorithm

Like many of the code jam challenges, you’ll need to write a program which runs fast enough; but choosing the right algorithm is more important than choosing the right language. Typically a high-level interpreted language like Python allows me to code and test a solution far more quickly than using a low-level language like C or C++.

In this particular case, though, like most successful candidates, I used C++. Here’s why.

Next_permutation transforms the range of elements [first, last) into the lexicographically next greater permutation of the elements. […] If such a permutation exists, next_permutationtransforms [first, last) into that permutation and returns true. Otherwise it transforms[first, last) into the lexicographically smallest permutation and returns false.

Although the next number problem appears to be about numbers and lexicographical ordering appears to be about words, std::next_permutation is exactly what’s needed here.

Lexicographical Ordering

Lexicographical order

A dictionary provides the canonical example of lexicographical ordering. Words are built from characters, which can be alphabetically ordered A, B, C, … , so in the dictionary words which begin with A appear before words which begin with B, which themselves come in front of words beginning with C, etc. If two words start with the same letter, pop that letter from the head of the word and compare their tails, which puts AARDVARK before ANIMAL, and — applying this rule recursively — after AARDMAN. Imagine there’s an empty word marking position zero, before A, right at the front of the dictionary, and our recursive definition is complete.

Next permutation in action

Here’s a simple program which shows next_permutation() in action.

#include <algorithm>
#include <cstdio>

int main()
{
    char xs[] = "123";
    do
    {
        std::puts(xs);
    }
    while (std::next_permutation(xs, xs + sizeof(xs) - 1));
    return 0;
}

This program outputs lexicographically ordered permutations of 1, 2 and 3. When the main function returns, the array xs will have cycled round to hold the lexicographically smallest arrangement of its elements, which is "123". Note that we never convert the characters '1', '2', '3' into the numbers 1,2, 3. The values of both sets of data types appear in the same order, so all works as expected.

123132213231312321

If we tweak and rerun the same program with xs initialised to "AAADKRRV" we get rather more output.

AAADKRRVAAADKRVRAAADKVRR...AARDVARK...VRRKAADAVRRKADAAVRRKDAAA

The sequence doesn’t start by repeating "AAADKRRV" 6 times, once for every permutation of the 3 A’s. Only strictly increasing permutations are included. And although the repeated calls to next_permutationgenerate a series of permutations, the algorithm holds no state. Each function call works on its input range afresh.

This second run of the program yields 3360 lines of output, even though there are 8! = 40320 possible permutations of 8 characters. Each unique permutation corresponds to 3! × 2! = 12 actual permutations of the 8 characters (because there are 3 A’s and 2 R’s), and 40320 ÷ 12 is 3360.

Snail sort’s revenge

...and in last place. By Tim Norris

As you can see, next_permutation sorts an input range, one step at a time. When next_permutationeventually returns false, the range will be perfectly ordered. Hence we have snail_sort(), hailed by the SGI STL documentation as the worst known deterministic sorting algorithm.

template <class Iter> 
void snail_sort(Iter first, Iter last)
{
    while (next_permutation(first, last)) {}
}

Very witty, and evidence that code can be both elegant and inefficient.

In two important edge cases, though, snail_sort performs on a par with super-charged quicksort!

I snail sorted an array filled with 100000000 zeros in 0.502 seconds. Running quicksort on the same array took 5.504 seconds.
Starting with an array of the same size filled with the values 99999999, 99999998, 99999997, … 1, 0 snail sort’s 0.500 seconds trounced quicksort’s 4.08 seconds.

The Next Number, Solved

Here’s an outline solution to the next number problem. (I’ve glossed over the exact input and output file formats for clarity.) It reads numbers from standard input and writes next numbers to standard output. Next_permutation does the hard work, and there’s a bit of fiddling when we have to increase the number of digits by adding a zero.^[1]

#include <algorithm>
#include <iostream>

/*
 Given a string of digits, shift any leading '0's
 past the first non-zero digit and insert an extra zero.
 
 Examples:
  
 123 -> 1023
 008 -> 8000
 034 -> 3004
*/
void insert_a_zero(std::string & number)
{
    size_t nzeros = number.find_first_not_of('0');
    number = number.substr(nzeros);
    number.insert(1, nzeros + 1, '0');
}

/*
 Outline solution to the 2009 code jam Next Number problem.
 
 Given a string representing a decimal number, find the next
 number which can be formed from the same set of digits. Add
 another zero if necessary. Repeat for all such strings read
 from standard input.
*/
int main()
{
    std::string number;
    while (std::cin >> number)
    {
        if (!next_permutation(number.begin(), number.end()))
        {
            insert_a_zero(number);
        }
        std::cout << number << '\n';
    }
    return 0;
}

Implementation

Having used the C++ standard library to solve the puzzle, let’s take a look at how it works. Next permutation is a clever algorithm which shuffles a collection in place. My system implements it like this^[2].

template<typename Iter>
bool next_permutation(Iter first, Iter last)
{
    if (first == last)
        return false;
    Iter i = first;
    ++i;
    if (i == last)
        return false;
    i = last;
    --i;
        
    for(;;)
    {
        Iter ii = i;
        --i;
        if (*i < *ii)
        {
            Iter j = last;
            while (!(*i < *--j))
            {}
            std::iter_swap(i, j);
            std::reverse(ii, last);
            return true;
        }
        if (i == first)
        {
            std::reverse(first, last);
            return false;
        }
    }
}

We start with a range delimited by a pair of bi-directional iterators, [first, last). If the range contains one item or fewer, there can be no next permutation, so leave the range as is and returnfalse. Otherwise, enter the for loop with an iterator i pointing at the final item in the range.

At each pass through the body of this for loop we decrement i by one, stepping towards the first item in the range. We are looking for one of two conditions:

the value pointed to by i is smaller than the one it pointed to previously
i reaches into the first item in the range

Put another way, we divide the range into a head and tail, where the tail is the longest possible decreasing tail of the range.

If this tail is the whole range (the second condition listed above) then the whole range is in reverse order, and we have the lexicographical maximum formed from its elements. Reversing the range returns it to its lexicographical minimum, and we can return false.

If this tail is not the whole range, then the final item in the head of the range, the item i points to, this item is smaller than at least one of the items in the tail of the range, and we can certainly generate a greater permutation by moving the item towards the end of the range. To find the next permutation, we reverse iterate from the end of the range until we find an item *j bigger than *i — that’s what the while loop does. Swapping the items pointed to by i and j ensures the head of the range is bigger than it was, and the tail of the range remains in reverse order. Finally, we reverse the tail of the range, leaving us with a permutation exactly one beyond the input permutation, and we return true.

What’s happening here?

It’s clear from this paper analysis that the algorithm is of linear complexity. Essentially, it walks up and down the tail of the list, comparing and swapping. But why does it work?

Let xs be the range (first, last). As described above, divide this range into prefix and suffix subranges, head and tail, where tail is the longest monotonically decreasing tail of the range.

If the head of the range is empty, then the range xs is clearly at its lexicographical maximum.

Otherwise, tail is a lexicographical maximum of the elements it contains, and xs is therefore the largest permutation which starts with the subrange head. What will the head of the next permutation be? We have to swap the final item in head with the smallest item of tail which exceeds it: the definition of tail guarantees at least one such item exists. Now we want to permute the new tail to be at a its lexicographical minimum, which is a matter of sorting it from low to high.

Since tail is in reverse order, finding the smallest item larger than head[-1] is a matter of walking back from the end of the range to find the first such items; and once we’ve swapped these items,tail remains in reverse order, so a simple reversed will sort it.

As an example consider finding the next permutation of:

8342666411

The longest monotonically decreasing tail is 666411, and the corresponding head is 8342.

8342 666411

666411 is, by definition, reverse-ordered, and cannot be increased by permuting its elements. To find the next permutation, we must increase the head; a matter of finding the smallest tail element larger than the head’s final 2.

8342 666411

Walking back from the end of tail, the first element greater than 2 is 4.

8342 666411

Swap the 2 and the 4

8344 666211

Since head has increased, we now have a greater permutation. To reduce to the next permutation, we reverse tail, putting it into increasing order.

8344 112666

Join the head and tail back together. The permutation one greater than 8342666411 is 8344112666.

0 0