USACO Training Section 1.3: Greedy Algorithm

Source: Internet | Published by: linux云计算培训 | Editor: 程序博客网 | Time: 2024/05/19 13:24

Greedy Algorithm

Sample Problem: Barn Repair (1999 USACO Spring Open)

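We have a row of stalls, some of which need repair (that is, they need to be covered with a board). You may use at most N (1 ≤ N ≤ 50) boards, each of which can cover any number of consecutive stalls. Your task is to cover every stall that needs repair with these boards while covering as few of the already-sound stalls as possible.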

The Idea

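The most basic idea behind a greedy algorithm is to start from the smaller parts of a large problem and work step by step until the whole problem is solved. (Translator's note: my understanding is that for a problem of size N, we first find the optimal solution for size 1, then build on it by adding the second part, and so on until all N parts have been handled.) Unlike other methods, however, a greedy algorithm adopts at each step only the option that is best for that step, without surveying the whole problem. Taking the sample problem as an example, to compute the solution for N = 5 a greedy algorithm takes the optimal solution for N = 4 and builds the N = 5 solution on top of it. No other solution for N = 4 is ever considered (they are all discarded for good).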

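Greedy algorithms are very fast, with time complexity generally between O(n) and O(n^2), and they need very little extra memory. (With so many advantages there has to be a price... as Sima Yi said (not really): what you borrow, you eventually repay!) Unfortunately, the results a greedy algorithm produces are often wrong. But once it has been proven correct for a particular problem, its advantages of simple implementation and fast execution are on full display.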

Problems

There are two basic questions about greedy algorithms: how do you build one, and does it work?

How to Build

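How do we get from a small problem to the whole one? In general, this depends on the problem. For the sample problem, the most obvious way to go from four boards to five is to pick a board and remove a section from it, turning one board into two. You should remove the largest section, over all boards, that covers only stalls which do not need repair (this minimizes the number of sound stalls that get covered).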

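To remove such a wasted section, split the board that spans it into two: one covering the stalls before the run of sound stalls, the other covering the stalls after it.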

Does It Work?

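For the programmer, the biggest problem with greedy algorithms is that they do not always work. A greedy solution may work on the sample data, on random data, and on pile after pile of cases you can think of, but if there is a single input on which it fails, that input will show up among the judges' test data (the judges are devious, no explanation needed...), and in that case the greedy approach cannot be used.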

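For the sample problem, let us see whether the greedy approach described above holds up. (As Tan Xiaojiang puts it: we need a nice space of real numbers to guarantee that limits behave well~)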

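Suppose the optimal answer does not remove the largest gap that the greedy algorithm removed, but removes a smaller one instead. Join the boards on either side of the smaller gap into one, and split the board across the largest gap into two (leaving the largest run of sound stalls uncovered). We use the same number of boards but cover fewer sound stalls. One sees immediately (as Qiu Weisheng would say) that this new answer is better than the original, so the assumption is false and we should always split at the largest run of sound stalls.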

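If the optimal answer does not split at this particular largest gap but at another gap of the same length, the same transformation yields an answer that uses the same number of boards and covers the same number of unneeded stalls. The new answer is exactly as good as the old one, so we may choose either.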

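Therefore, there is an optimal answer that splits at the largest run of sound stalls. So at every step there is always an optimal answer that is a superset of the current state, and in the end we are guaranteed to reach a globally optimal answer.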

Lab time is up... I'll keep translating back at the dorm...

Here is the original English text:

Greedy Algorithm

Sample Problem: Barn Repair [1999 USACO Spring Open]

There is a long list of stalls, some of which need to be covered with boards. You can use up to N (1 <= N <= 50) boards, each of which may cover any number of consecutive stalls. Cover all the necessary stalls, while covering as few total stalls as possible.

The Idea

The basic idea behind greedy algorithms is to build large solutions up from smaller ones. Unlike other approaches, however, greedy algorithms keep only the best solution they find as they go along. Thus, for the sample problem, to build the answer for N = 5, they find the best solution for N = 4, and then alter it to get a solution for N = 5. No other solution for N = 4 is ever considered.

Greedy algorithms are fast, generally linear to quadratic and require little extra memory. Unfortunately, they usually aren't correct. But when they do work, they are often easy to implement and fast enough to execute.

Problems

There are two basic problems to greedy algorithms.

How to Build

How does one create larger solutions from smaller ones? In general, this is a function of the problem. For the sample problem, the most obvious way to go from four boards to five boards is to pick a board and remove a section, thus creating two boards from one. You should choose to remove the largest section from any board which covers only stalls which don't need covering (so as to minimize the total number of stalls covered).

To remove a section of covered stalls, take the board which spans those stalls and make it into two boards: one which covers the stalls before the section, and one which covers the stalls after the section.
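The gap-removal step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is taken to be a list of the stall numbers that need covering, and the names `min_stalls_covered`, `max_boards`, and `occupied` are hypothetical, not part of the problem statement.

```python
def min_stalls_covered(max_boards, occupied):
    """Fewest stalls covered when at most max_boards (>= 1) boards are used.

    occupied: stall numbers that need covering (an assumed input format).
    """
    stalls = sorted(set(occupied))
    if not stalls:
        return 0
    # Start with one board spanning the first to the last stall that needs cover.
    covered = stalls[-1] - stalls[0] + 1
    # A gap is a run of stalls that need no cover between two stalls that do.
    gaps = sorted(
        (b - a - 1 for a, b in zip(stalls, stalls[1:]) if b - a > 1),
        reverse=True,
    )
    # Each additional board lets us remove one gap; remove the largest first.
    for gap in gaps[:max_boards - 1]:
        covered -= gap
    return covered


stalls = [3, 4, 6, 8, 14, 15, 16, 17, 21, 25, 26, 27, 30, 31, 40, 41, 42, 43]
print(min_stalls_covered(4, stalls))  # prints 25
```

With four boards the three largest gaps (8, 5, and 3 stalls) are removed from the initial span of 41 stalls, leaving 25.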

Does it work?

The real challenge for the programmer lies in the fact that greedy solutions don't always work. Even if they seem to work for the sample input, random input, and all the cases you can think of, if there's a case where it won't work, at least one (if not more!) of the judges' test cases will be of that form.

For the sample problem, to see that the greedy algorithm described above works, consider the following:

Assume that the answer doesn't contain the large gap which the algorithm removed, but does contain a gap which is smaller. By combining the two boards at the ends of the smaller gap and splitting the board across the larger gap, an answer is obtained which uses as many boards as the original solution but which covers fewer stalls. This new answer is better, so the assumption is wrong and we should always choose to remove the largest gap.

If the answer doesn't contain this particular gap but does contain another gap which is just as large, doing the same transformation yields an answer which uses as many boards and covers as many stalls as the other answer. This new answer is just as good as the original solution but no better, so we may choose either.

Thus, there exists an optimal answer which contains the large gap, so at each step, there is always an optimal answer which is a superset of the current state. Thus, the final answer is optimal.
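An exchange argument like this one can also be sanity-checked empirically by comparing the greedy answer against exhaustive search on small inputs. The sketch below is one possible way to do that, not something the original text prescribes; `greedy_cover`, `brute_force_cover`, and `find_mismatch` are hypothetical names, and both functions assume a non-empty list of stalls.

```python
import random
from itertools import combinations


def greedy_cover(max_boards, stalls):
    """Greedy answer: span everything, then drop the largest gaps."""
    stalls = sorted(set(stalls))
    gaps = sorted((b - a - 1 for a, b in zip(stalls, stalls[1:])), reverse=True)
    return stalls[-1] - stalls[0] + 1 - sum(gaps[:max_boards - 1])


def brute_force_cover(max_boards, stalls):
    """Try every way of cutting the sorted stalls into at most max_boards groups."""
    stalls = sorted(set(stalls))
    n = len(stalls)
    best = None
    for cuts in range(min(max_boards, n)):  # k cuts give k + 1 boards
        for cut_set in combinations(range(1, n), cuts):
            bounds = [0, *cut_set, n]
            # Each group [lo, hi) is covered by one board from its first to last stall.
            total = sum(
                stalls[hi - 1] - stalls[lo] + 1
                for lo, hi in zip(bounds, bounds[1:])
            )
            if best is None or total < best:
                best = total
    return best


def find_mismatch(trials=200, seed=0):
    """Return a (boards, stalls) input where the two disagree, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        stalls = rng.sample(range(1, 21), rng.randint(1, 8))
        boards = rng.randint(1, 5)
        if greedy_cover(boards, stalls) != brute_force_cover(boards, stalls):
            return boards, sorted(stalls)
    return None


print(find_mismatch())  # prints None
```

Agreement on random cases is evidence, not proof; the argument above is what actually establishes correctness.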

Conclusions

If a greedy solution exists, use it. They are easy to code, easy to debug, run quickly, and use little memory, basically defining a good algorithm in contest terms. The only missing element from that list is correctness. If the greedy algorithm finds the correct answer, go for it, but don't get suckered into thinking the greedy solution will work for all problems.

Sample Problems

Sorting a three-valued sequence [IOI 1996]

You are given a three-valued (1, 2, or 3) sequence of length up to 1000. Find a minimum set of exchanges to put the sequence in sorted order.

Algorithm: The sequence has three parts: the part which will be 1 when in sorted order, 2 when in sorted order, and 3 when in sorted order. The greedy algorithm swaps as many as possible of the 1's in the 2 part with 2's in the 1 part, as many as possible of the 1's in the 3 part with 3's in the 1 part, and as many as possible of the 2's in the 3 part with 3's in the 2 part. Once none of these types remains, the remaining elements out of place need to be rotated one way or the other in sets of 3. You can optimally sort these by swapping all the 1's into place and then all the 2's into place.

Analysis: Obviously, a swap can put at most two elements in place, so all the swaps of the first type are optimal. Also, it is clear that they use different types of elements, so there is no "interference" between those types. This means the order does not matter. Once those swaps have been performed, the best you can do is two swaps for every three elements not in the correct location, which is what the second part will achieve (for example, all the 1's are put in place but no others; then all that remains are 2's in the 3's place and vice versa, which can be swapped).
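Because the analysis only depends on how many elements of each value sit in each part, the number of exchanges can be counted without performing any swaps. The following is a minimal sketch of that counting, with `min_exchanges` as a hypothetical name and the sequence assumed to be a Python list of 1's, 2's, and 3's.

```python
def min_exchanges(seq):
    """Count the exchanges the greedy strategy needs to sort a 1/2/3 sequence."""
    counts = [seq.count(v) for v in (1, 2, 3)]
    # target[i] is the value that belongs at position i once sorted.
    target = [1] * counts[0] + [2] * counts[1] + [3] * counts[2]
    # misplaced[(a, b)] = how many positions in part a currently hold value b.
    misplaced = {(a, b): 0 for a in (1, 2, 3) for b in (1, 2, 3) if a != b}
    for want, have in zip(target, seq):
        if want != have:
            misplaced[(want, have)] += 1

    swaps = 0
    leftover = 0
    for a, b in ((1, 2), (1, 3), (2, 3)):
        # One swap fixes a b in part a together with an a in part b.
        direct = min(misplaced[(a, b)], misplaced[(b, a)])
        swaps += direct
        leftover += misplaced[(a, b)] + misplaced[(b, a)] - 2 * direct
    # What remains forms cycles of three elements; each cycle costs two swaps.
    swaps += 2 * (leftover // 3)
    return swaps


print(min_exchanges([2, 2, 1, 3, 3, 3, 2, 3, 1]))  # prints 4
```

In the example, two direct swaps fix four elements and the remaining three misplaced elements form one cycle, which costs two more swaps.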

Friendly Coins - A Counterexample [abridged]

Given the denominations of coins for a newly founded country, the Dairy Republic, and some monetary amount, find the smallest set of coins that sums to that amount. The Dairy Republic is guaranteed to have a 1 cent coin.

Algorithm: Take the largest coin value that isn't more than the goal and iterate on the total minus this value.

(Faulty) Analysis: Obviously, you'd never want to take a smaller coin value, as that would mean you'd have to take more coins to make up the difference, so this algorithm works.

Maybe not: Okay, the algorithm usually works. In fact, for the U.S. coin system {1, 5, 10, 25}, it always yields the optimal set. However, for other sets, like {1, 5, 8, 10} and a goal of 13, this greedy algorithm would take one 10, and then three 1's, for a total of four coins, when the two coin solution {5, 8} also exists.
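The counterexample is easy to reproduce. The sketch below implements the greedy rule from the text and, for comparison, a standard dynamic-programming search over all amounts; the dynamic-programming check is an addition for illustration and is not part of the original text. The names `greedy_coins` and `fewest_coins` are hypothetical.

```python
def greedy_coins(denominations, amount):
    """Repeatedly take the largest coin that does not exceed what is left."""
    coins = []
    for coin in sorted(denominations, reverse=True):
        while amount >= coin:
            coins.append(coin)
            amount -= coin
    return coins


def fewest_coins(denominations, amount):
    """Dynamic programming: best[a] is a shortest list of coins summing to a."""
    best = [[]] + [None] * amount
    for a in range(1, amount + 1):
        for coin in denominations:
            if coin <= a and best[a - coin] is not None:
                candidate = best[a - coin] + [coin]
                if best[a] is None or len(candidate) < len(best[a]):
                    best[a] = candidate
    return best[amount]


print(greedy_coins([1, 5, 8, 10], 13))          # prints [10, 1, 1, 1]
print(sorted(fewest_coins([1, 5, 8, 10], 13)))  # prints [5, 8]
```

Running both functions over a range of amounts is a quick way to see whether a given coin system is one where the greedy rule happens to be safe.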

Topological Sort

Given a collection of objects, along with some ordering constraints, such as "A must be before B," find an order of the objects such that all the ordering constraints hold.

Algorithm: Create a directed graph over the objects, where there is an arc from A to B if "A must be before B." Make a pass through the objects in arbitrary order. Each time you find an object with in-degree of 0, greedily place it on the end of the current ordering, delete all of its out-arcs, and recurse on its (former) children, performing the same check. If this algorithm gets through all the objects without putting every object in the ordering, there is no ordering which satisfies the constraints.
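A minimal sketch of this procedure follows. It uses an explicit stack in place of recursion to avoid Python's recursion limit, which is a small departure from the wording above; the name `topological_order` and the representation of constraints as `(a, b)` pairs meaning "a must be before b" are assumptions made for the example.

```python
def topological_order(objects, constraints):
    """Return an ordering satisfying all constraints, or None if none exists.

    objects: list of hashable items.
    constraints: iterable of (a, b) pairs meaning "a must be before b".
    """
    children = {obj: [] for obj in objects}
    in_degree = {obj: 0 for obj in objects}
    for a, b in constraints:
        children[a].append(b)
        in_degree[b] += 1

    order = []
    placed = set()
    for start in objects:  # one pass through the objects in arbitrary order
        stack = [start]
        while stack:
            obj = stack.pop()
            if obj in placed or in_degree[obj] != 0:
                continue  # already placed, or still has a predecessor
            placed.add(obj)
            order.append(obj)
            for child in children[obj]:
                in_degree[child] -= 1  # delete the out-arc obj -> child
                stack.append(child)    # re-check the former child
    # Objects left unplaced lie on, or behind, a cycle of constraints.
    return order if len(order) == len(objects) else None


order = topological_order(list("ABCD"), [("A", "B"), ("B", "C"), ("A", "D")])
print(order)
print(topological_order(["A", "B"], [("A", "B"), ("B", "A")]))  # prints None
```

Several valid orderings usually exist, so the first call may print any order in which A precedes B and D, and B precedes C.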